ActiveState!

ActivePerl Documentation
Table of Contents

(Usage Statistics)
(about this ver)


* Getting Started
    * Welcome To ActivePerl
    * Release Notes
    * Readme
    * ActivePerl Change Log
* Install Notes
    * Linux
    * Solaris
    * Windows
* ActivePerl Components
    * Overview
    * PPM
    * Windows Specifics
       * OLE Browser
       * PerlScript
       * Perl for ISAPI
       * PerlEZ
* ActivePerl FAQ
    * Introduction
    * Availability & Install
    * Using PPM
    * Docs & Support
    * Windows Specifics
       * Perl for ISAPI
       * Windows 9X/NT/2000
       * Quirks
       * Web Server Config
       * Web programming
       * Programming
       * Modules & Samples
       * Embedding & Extending
       * Using OLE with Perl
* Windows Scripting
    * Active Server Pages
    * Windows Script Host
    * Windows Script Components

Core Perl Documentation


* perl
* perlfaq
* perltoc
* perlbook

* perlsyn
* perldata
* perlop
* perlreftut
* perldsc
* perllol

* perllexwarn
* perldebug

* perlrun
* perlfunc
* perlopentut
* perlvar
* perlsub
* perlmod
* perlpod

* perlstyle
* perlmodlib
* perlmodinstall
* perltrap
* perlport
* perlsec

* perlref
* perlre
* perlform
* perllocale
* perlunicode

* perlboot
* perltoot
* perltootc
* perlobj
* perlbot
* perltie

* perlipc
* perlnumber
* perlfork
* perlthrtut

* perldiag
* perlfaq1
* perlfaq2
* perlfaq3
* perlfaq4
* perlfaq5
* perlfaq6
* perlfaq7
* perlfaq8
* perlfaq9

* perlcompile

* perlembed
* perlxstut
* perlxs
* perlguts
* perlcall
* perlfilter
* perldbmfilter
* perlapi
* perlintern
* perlapio
* perltodo
* perlhack

* perlhist
* perldelta
* perl5005delta
* perl5004delta

* perlamiga
* perlcygwin
* perldos
* perlhpux
* perlmachten
* perlos2
* perlos390
* perlvms
* perlwin32

Pragmas


* attributes
* attrs
* autouse
* base
* blib
* bytes
* charnames
* constant
* diagnostics
* fields
* filetest
* integer
* less
* lib
* locale
* lwpcook
* open
* ops
* overload
* perllocal
* re
* sigtrap
* strict
* subs
* utf8
* vars
* warnings

Libraries


* ActivePerl
    * DocTools
        * TOC
            * RDF
* AnyDBM_File
* Archive
    * Tar
* AutoLoader
* AutoSplit
* B
    * Asmdata
    * Assembler
    * Bblock
    * Bytecode
    * C
    * CC
    * Debug
    * Deparse
    * Disassembler
    * Lint
    * Showlex
    * Stackobj
    * Terse
    * Xref
* Benchmark
* Bundle
    * LWP
* ByteLoader
* Carp
    * Heavy
* CGI
    * Apache
    * Carp
    * Cookie
    * Fast
    * Pretty
    * Push
    * Switch
* Class
    * Struct
* Compress
    * Zlib
* Config
* CPAN
    * FirstTime
    * Nox
* Cwd
* Data
    * Dumper
* DB
* Devel
    * DProf
    * Peek
    * SelfStubber
* Digest
    * HMAC
    * HMAC_MD5
    * HMAC_SHA1
    * MD2
    * MD5
    * SHA1
* DirHandle
* Dumpvalue
* DynaLoader
* English
* Env
* Errno
* Exporter
    * Heavy
* ExtUtils
    * Command
    * Embed
    * Install
    * Installed
    * Liblist
    * MakeMaker
    * Manifest
    * Miniperl
    * Mkbootstrap
    * Mksymlists
    * MM_Cygwin
    * MM_OS2
    * MM_Unix
    * MM_VMS
    * MM_Win32
    * Packlist
    * testlib
* Fatal
* Fcntl
* File
    * Basename
    * CheckTree
    * Compare
    * Copy
    * CounterFile
    * DosGlob
    * Find
    * Glob
    * Listing
    * Path
    * Spec
        * Functions
        * Mac
        * OS2
        * Unix
        * VMS
        * Win32
    * stat
* FileCache
* FileHandle
* FindBin
* Font
    * AFM
* Getopt
    * Long
    * Std
* HTML
    * AsSubs
    * Element
    * Entities
    * Filter
    * Form
    * FormatPS
    * Formatter
    * FormatText
    * HeadParser
    * LinkExtor
    * Parse
    * Parser
    * TokeParser
    * TreeBuilder
* HTTP
    * Cookies
    * Daemon
    * Date
    * Headers
        * Util
    * Message
    * Negotiate
    * Request
        * Common
    * Response
    * Status
* I18N
    * Collate
* IO
    * Dir
    * File
    * Handle
    * Pipe
    * Poll
    * Seekable
    * Select
    * Socket
        * INET
        * UNIX
* IPC
    * Msg
    * Open2
    * Open3
    * Semaphore
    * SysV
* LWP
    * Debug
    * MediaTypes
    * MemberMixin
    * Protocol
    * RobotUA
    * Simple
    * UserAgent
* Math
    * BigFloat
    * BigInt
    * Complex
    * Trig
* MD5
* MIME
    * Base64
    * QuotedPrint
* NDBM_File
* Net
    * Cmd
    * Config
    * Domain
    * DummyInetd
    * FTP
    * hostent
    * libnetFAQ
    * netent
    * Netrc
    * NNTP
    * PH
    * Ping
    * POP3
    * protoent
    * servent
    * SMTP
    * SNPP
    * Time
* O
* ODBM_File
* Opcode
* Pod
    * Checker
    * Find
    * Html
    * InputObjects
    * Man
    * Parser
    * ParseUtils
    * Plainer
    * Select
    * Text
        * Color
        * Termcap
    * Usage
* POSIX
* PPM
    * SOAPClient
    * SOAPServer
* Safe
* SDBM_File
* Search
    * Dict
* SelectSaver
* SelfLoader
* SHA
* Shell
* SOAP
    * Defs
    * Envelope
    * EnvelopeMaker
    * GenericHashSerializer
    * GenericInputStream
    * GenericScalarSerializer
    * Lite
    * OutputStream
    * Packager
    * Parser
    * Transport
        * HTTP
            * Apache
            * CGI
            * Client
            * Server
        * LOCAL
        * MAILTO
        * POP3
        * TCP
    * TypeMapper
* Socket
* Symbol
* Sys
    * Hostname
    * Syslog
* Term
    * ANSIColor
    * Cap
    * Complete
    * ReadLine
* Test
    * Harness
* Text
    * Abbrev
    * ParseWords
    * Soundex
    * Tabs
    * Wrap
* Thread
    * Queue
    * Semaphore
    * Signal
    * Specific
* Tie
    * Array
    * Handle
    * Hash
    * RefHash
    * Scalar
    * SubstrHash
* Time
    * gmtime
    * Local
    * localtime
    * tm
* UDDI
    * Lite
* UNIVERSAL
* URI
    * data
    * Escape
    * file
    * Heuristic
    * ldap
    * URL
    * WithBase
* User
    * grent
    * pwent
* Win32
    * AuthenticateUser
    * ChangeNotify
    * Clipboard
    * Console
    * Event
    * EventLog
    * File
    * FileSecurity
    * Internet
    * IPC
    * Mutex
    * NetAdmin
    * NetResource
    * ODBC
    * OLE
        * Const
        * Enum
        * NEWS
        * NLS
        * TPJ
        * Variant
    * PerfLib
    * Pipe
    * Process
    * Registry
    * Semaphore
    * Service
    * Sound
    * TieRegistry
* Win32API
    * File
    * Net
    * Registry
* WWW
    * RobotRules
        * AnyDBM_File
* XML
    * Element
    * Parser
        * Expat
    * PPD
    * PPMConfig
    * ValidatingElement
* XSLoader

 WWW::Search::Crawler - class for searching Crawler


NAME

WWW::Search::Crawler - class for searching Crawler


SUPPORTED PLATFORMS

  • Windows
This module is not included with the standard ActivePerl distribution. It is available as a separate download using PPM.

SYNOPSIS

    require WWW::Search;
    $search = new WWW::Search('Crawler');


DESCRIPTION

This class is an Crawler specialization of WWW::Search. It handles making and interpreting Fireball searches http://www.crawler.de.

This class exports no public interface; all interaction should be done through WWW::Search objects.


SEE ALSO

To make new back-ends, see the WWW::Search manpage.


HOW DOES IT WORK?

native_setup_search is called before we do anything. It initializes our private variables (which all begin with underscores) and sets up a URL to the first results page in {_next_url}.

native_retrieve_some is called (from WWW::Search::retrieve_some) whenever more hits are needed. It calls the LWP library to fetch the page specified by {_next_url}. It parses this page, appending any search hits it finds to {cache}. If it finds a ``next'' button in the text, it sets {_next_url} to point to the page for the next set of results, otherwise it sets it to undef to indicate we're done.


AUTHOR

WWW::Search::Crawler has been shamelessly copied by Andreas Borchert, <borchert@mathematik.uni-ulm.de> from WWW::Search::AltaVista by John Heidemann, <johnh@isi.edu>.


COPYRIGHT

The original parts from John Heidemann are subject to following copyright notice:

Copyright (c) 1996-1998 University of Southern California. All rights reserved.


Redistribution and use in source and binary forms are permitted
provided that the above copyright notice and this paragraph are
duplicated in all such forms and that any documentation, advertising
materials, and other materials related to such distribution and use
acknowledge that the software was developed by the University of
Southern California, Information Sciences Institute.  The name of the
University may not be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

 WWW::Search::Crawler - class for searching Crawler