NAME

        URI::Sequin - Extract information from the URLs of Search-Engines


SYNOPSIS

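        use URI::Sequin qw/se_extract key_extract log_extract %log_types/;

        # Pull the referring URL out of a log line in a built-in format
        $url = log_extract($line_from_log_file, 'NCSA');

        # Register a custom format; the parentheses capture the referring URL
        $log_types{'MyLogType'} = '^(.+?) -> .+$';
        $url = log_extract($line_from_log_file, 'MyLogType');

        # Keywords used in the search, or undef if none were found
        $keyword_string = key_extract($url);

        # se_extract returns an array reference: [name, url]
        ($search_engine_name, $search_engine_url) = @{ se_extract($url) };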


DESCRIPTION

This module provides three tools to aid people trying to analyse Search-Engine URLs. It’s meant mainly for those who want to analyse referrer logs and pick out key information about site visitors, such as which Search-Engine and keywords they used to find the site.

The functions and globals provided (and exported by default) from this module are:

log_extract($log_line, 'Type')

This picks out the referring URL from a line of a logfile. The ‘type’ can be one of the built-in types or a user-created one. For more information, see %log_types below. This subroutine accepts two scalars (the log line and the type name) and returns a scalar.

key_extract($url)

This tries to determine the keywords used in $url. It accepts a scalar and returns a scalar. Should nothing be found, it returns an undefined value.

se_extract($url)

This tries to determine the name of the Search-Engine used and its URL. It accepts a scalar and returns a reference to an array containing firstly the Search-Engine’s name and secondly the Search-Engine’s URL. Should the URL appear not to be from a Search Query, it returns a reference to an empty array.
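Because se_extract hands back its result by reference (the Synopsis dereferences it with @{...}), callers will probably want to test for the empty case before unpacking. The sketch below shows one way to do that. So that it runs without the module installed, it uses a stand-in called stub_se_extract, which is not part of URI::Sequin; its matching rule, the search.example.com host and the returned values are all invented for illustration.

```perl
use strict;
use warnings;

# Stand-in for se_extract, used only so this sketch runs without the module.
# It mimics the documented return shape: a reference to [name, url], or a
# reference to an empty array when the URL is not a search query.
# The matching rule and the values are invented for illustration.
sub stub_se_extract {
    my ($url) = @_;
    return $url =~ m{^http://search\.example\.com/}
        ? [ 'Example Search', 'http://search.example.com/' ]
        : [];
}

for my $url ('http://search.example.com/find?q=perl',
             'http://www.example.org/page.html') {
    my $result = stub_se_extract($url);

    # An empty array is false in boolean context, so this separates the cases.
    if (@$result) {
        my ($name, $engine_url) = @$result;
        print "$url came from $name ($engine_url)\n";
    }
    else {
        print "$url is not a search-engine referrer\n";
    }
}
```

Unpacking an empty result directly would leave both variables undefined, so the check comes first.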

%log_types

There are five built-in logfile types already in this hash. They are:

It’s easy to add another one. Simply add a key to the hash, with a value that is a regex. Parenthesise the part that matches the referring URL, as the module uses $1 to obtain the URL. See the example in the Synopsis section.
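To check a custom pattern before adding it to %log_types, it may help to try it against a sample line in plain Perl. The following is a minimal sketch of the capture-group idea only: extract_referrer is a hypothetical helper, not a function of this module, and the log line is a made-up example in the 'MyLogType' format from the Synopsis.

```perl
use strict;
use warnings;

# Made-up log line in the 'MyLogType' format: "<referrer> -> <page>"
my $line = 'http://www.example.com/search?q=perl+modules -> /index.html';

# Same pattern as in the Synopsis; the parentheses capture the referring URL.
my $pattern = '^(.+?) -> .+$';

# Hypothetical helper (not part of URI::Sequin):
# returns the first capture group on a match, undef otherwise.
sub extract_referrer {
    my ($log_line, $regex) = @_;
    return $log_line =~ /$regex/ ? $1 : undef;
}

my $referrer = extract_referrer($line, $pattern);
print defined $referrer ? "$referrer\n" : "no match\n";
# prints "http://www.example.com/search?q=perl+modules"
```

If the pattern has more than one set of parentheses, only the first is the referring URL here, so any other grouping is best written with non-capturing (?:...) groups.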


AUTHOR

Peter Sergeant <pete_sergeant@hotmail.com>


COPYRIGHT

Copyright 2000 Peter Sergeant.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.