ActiveState!

ActivePerl Documentation
Table of Contents

(Usage Statistics)
(about this ver)


* Getting Started
    * Welcome To ActivePerl
    * Release Notes
    * Readme
    * ActivePerl Change Log
* Install Notes
    * Linux
    * Solaris
    * Windows
* ActivePerl Components
    * Overview
    * PPM
    * Windows Specifics
       * OLE Browser
       * PerlScript
       * Perl for ISAPI
       * PerlEZ
* ActivePerl FAQ
    * Introduction
    * Availability & Install
    * Using PPM
    * Docs & Support
    * Windows Specifics
       * Perl for ISAPI
       * Windows 9X/NT/2000
       * Quirks
       * Web Server Config
       * Web programming
       * Programming
       * Modules & Samples
       * Embedding & Extending
       * Using OLE with Perl
* Windows Scripting
    * Active Server Pages
    * Windows Script Host
    * Windows Script Components

Core Perl Documentation


* perl
* perlfaq
* perltoc
* perlbook

* perlsyn
* perldata
* perlop
* perlreftut
* perldsc
* perllol

* perllexwarn
* perldebug

* perlrun
* perlfunc
* perlopentut
* perlvar
* perlsub
* perlmod
* perlpod

* perlstyle
* perlmodlib
* perlmodinstall
* perltrap
* perlport
* perlsec

* perlref
* perlre
* perlform
* perllocale
* perlunicode

* perlboot
* perltoot
* perltootc
* perlobj
* perlbot
* perltie

* perlipc
* perlnumber
* perlfork
* perlthrtut

* perldiag
* perlfaq1
* perlfaq2
* perlfaq3
* perlfaq4
* perlfaq5
* perlfaq6
* perlfaq7
* perlfaq8
* perlfaq9

* perlcompile

* perlembed
* perlxstut
* perlxs
* perlguts
* perlcall
* perlfilter
* perldbmfilter
* perlapi
* perlintern
* perlapio
* perltodo
* perlhack

* perlhist
* perldelta
* perl5005delta
* perl5004delta

* perlamiga
* perlcygwin
* perldos
* perlhpux
* perlmachten
* perlos2
* perlos390
* perlvms
* perlwin32

Pragmas


* attributes
* attrs
* autouse
* base
* blib
* bytes
* charnames
* constant
* diagnostics
* fields
* filetest
* integer
* less
* lib
* locale
* lwpcook
* open
* ops
* overload
* perllocal
* re
* sigtrap
* strict
* subs
* utf8
* vars
* warnings

Libraries


* ActivePerl
    * DocTools
        * TOC
            * RDF
* AnyDBM_File
* Archive
    * Tar
* AutoLoader
* AutoSplit
* B
    * Asmdata
    * Assembler
    * Bblock
    * Bytecode
    * C
    * CC
    * Debug
    * Deparse
    * Disassembler
    * Lint
    * Showlex
    * Stackobj
    * Terse
    * Xref
* Benchmark
* Bundle
    * LWP
* ByteLoader
* Carp
    * Heavy
* CGI
    * Apache
    * Carp
    * Cookie
    * Fast
    * Pretty
    * Push
    * Switch
* Class
    * Struct
* Compress
    * Zlib
* Config
* CPAN
    * FirstTime
    * Nox
* Cwd
* Data
    * Dumper
* DB
* Devel
    * DProf
    * Peek
    * SelfStubber
* Digest
    * HMAC
    * HMAC_MD5
    * HMAC_SHA1
    * MD2
    * MD5
    * SHA1
* DirHandle
* Dumpvalue
* DynaLoader
* English
* Env
* Errno
* Exporter
    * Heavy
* ExtUtils
    * Command
    * Embed
    * Install
    * Installed
    * Liblist
    * MakeMaker
    * Manifest
    * Miniperl
    * Mkbootstrap
    * Mksymlists
    * MM_Cygwin
    * MM_OS2
    * MM_Unix
    * MM_VMS
    * MM_Win32
    * Packlist
    * testlib
* Fatal
* Fcntl
* File
    * Basename
    * CheckTree
    * Compare
    * Copy
    * CounterFile
    * DosGlob
    * Find
    * Glob
    * Listing
    * Path
    * Spec
        * Functions
        * Mac
        * OS2
        * Unix
        * VMS
        * Win32
    * stat
* FileCache
* FileHandle
* FindBin
* Font
    * AFM
* Getopt
    * Long
    * Std
* HTML
    * AsSubs
    * Element
    * Entities
    * Filter
    * Form
    * FormatPS
    * Formatter
    * FormatText
    * HeadParser
    * LinkExtor
    * Parse
    * Parser
    * TokeParser
    * TreeBuilder
* HTTP
    * Cookies
    * Daemon
    * Date
    * Headers
        * Util
    * Message
    * Negotiate
    * Request
        * Common
    * Response
    * Status
* I18N
    * Collate
* IO
    * Dir
    * File
    * Handle
    * Pipe
    * Poll
    * Seekable
    * Select
    * Socket
        * INET
        * UNIX
* IPC
    * Msg
    * Open2
    * Open3
    * Semaphore
    * SysV
* LWP
    * Debug
    * MediaTypes
    * MemberMixin
    * Protocol
    * RobotUA
    * Simple
    * UserAgent
* Math
    * BigFloat
    * BigInt
    * Complex
    * Trig
* MD5
* MIME
    * Base64
    * QuotedPrint
* NDBM_File
* Net
    * Cmd
    * Config
    * Domain
    * DummyInetd
    * FTP
    * hostent
    * libnetFAQ
    * netent
    * Netrc
    * NNTP
    * PH
    * Ping
    * POP3
    * protoent
    * servent
    * SMTP
    * SNPP
    * Time
* O
* ODBM_File
* Opcode
* Pod
    * Checker
    * Find
    * Html
    * InputObjects
    * Man
    * Parser
    * ParseUtils
    * Plainer
    * Select
    * Text
        * Color
        * Termcap
    * Usage
* POSIX
* PPM
    * SOAPClient
    * SOAPServer
* Safe
* SDBM_File
* Search
    * Dict
* SelectSaver
* SelfLoader
* SHA
* Shell
* SOAP
    * Defs
    * Envelope
    * EnvelopeMaker
    * GenericHashSerializer
    * GenericInputStream
    * GenericScalarSerializer
    * Lite
    * OutputStream
    * Packager
    * Parser
    * Transport
        * HTTP
            * Apache
            * CGI
            * Client
            * Server
        * LOCAL
        * MAILTO
        * POP3
        * TCP
    * TypeMapper
* Socket
* Symbol
* Sys
    * Hostname
    * Syslog
* Term
    * ANSIColor
    * Cap
    * Complete
    * ReadLine
* Test
    * Harness
* Text
    * Abbrev
    * ParseWords
    * Soundex
    * Tabs
    * Wrap
* Thread
    * Queue
    * Semaphore
    * Signal
    * Specific
* Tie
    * Array
    * Handle
    * Hash
    * RefHash
    * Scalar
    * SubstrHash
* Time
    * gmtime
    * Local
    * localtime
    * tm
* UDDI
    * Lite
* UNIVERSAL
* URI
    * data
    * Escape
    * file
    * Heuristic
    * ldap
    * URL
    * WithBase
* User
    * grent
    * pwent
* Win32
    * AuthenticateUser
    * ChangeNotify
    * Clipboard
    * Console
    * Event
    * EventLog
    * File
    * FileSecurity
    * Internet
    * IPC
    * Mutex
    * NetAdmin
    * NetResource
    * ODBC
    * OLE
        * Const
        * Enum
        * NEWS
        * NLS
        * TPJ
        * Variant
    * PerfLib
    * Pipe
    * Process
    * Registry
    * Semaphore
    * Service
    * Sound
    * TieRegistry
* Win32API
    * File
    * Net
    * Registry
* WWW
    * RobotRules
        * AnyDBM_File
* XML
    * Element
    * Parser
        * Expat
    * PPD
    * PPMConfig
    * ValidatingElement
* XSLoader

 WWW::Search::WebCrawler - class for searching WebCrawler


NAME

WWW::Search::WebCrawler - class for searching WebCrawler


SUPPORTED PLATFORMS

  • Windows
This module is not included with the standard ActivePerl distribution. It is available as a separate download using PPM.

SYNOPSIS

  use WWW::Search;
  my $oSearch = new WWW::Search('WebCrawler');
  my $sQuery = WWW::Search::escape_query("+sushi restaurant +Columbus Ohio");
  $oSearch->native_query($sQuery);
  while (my $oResult = $oSearch->next_result())
    print $oResult->url, "\n";


DESCRIPTION

This class is a WebCrawler specialization of WWW::Search. It handles making and interpreting WebCrawler searches http://www.WebCrawler.com.

This class exports no public interface; all interaction should be done through the WWW::Search manpage objects.


SEE ALSO

To make new back-ends, see the WWW::Search manpage.


HOW DOES IT WORK?

native_setup_search is called (from WWW::Search::setup_search) before we do anything. It initializes our private variables (which all begin with underscore) and sets up a URL to the first results page in {_next_url}.

native_retrieve_some is called (from WWW::Search::retrieve_some) whenever more hits are needed. It calls WWW::Search::http_request to fetch the page specified by {_next_url}. It then parses this page, appending any search hits it finds to {cache}. If it finds a ``next'' button in the text, it sets {_next_url} to point to the page for the next set of results, otherwise it sets it to undef to indicate we''re done.


BUGS

Please tell the author if you find any!


TESTING

This module adheres to the WWW::Search test suite mechanism. See $TEST_CASES below.


AUTHOR

As of 1998-03-16, WWW::Search::WebCrawler is maintained by Martin Thurn (MartinThurn@iname.com)

WWW::Search::WebCrawler was originally written by Martin Thurn based on WWW::Search::HotBot.


LEGALESE

THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.


VERSION HISTORY

If it's not listed here, then it wasn't a meaningful or released version.

2.02, 1999-10-05

now uses hash_to_cgi_string()

2.01, 1999-07-13

1.13, 1999-03-29

Remove extraneous HTML from description (thanks to Jim Smyser jsmyser@bigfoot.com)

1.11, 1998-10-09

Now uses split_lines function

1.9

1998-08-20: New format of www.webcrawler.com output.

1.7

\n changed to \012 for MacPerl compatibility

1.5

1998-05-29: New format of www.webcrawler.com output.

1.3

First publicly-released version.

 WWW::Search::WebCrawler - class for searching WebCrawler