Limiting the crawler to n levels or directories

A common feature request is the ability to limit the crawler to a certain number of levels, or directories, when crawling a site. For example, a crawler limited to 2 levels would index "http://xav.com/1/2.html" but not "http://xav.com/1/2/3.html".

This can be done with Filter Rules. Go to "Admin Page" => "Filter Rules" and choose "Create New Rule". Use the following parameters:

The final integer in the regex (the n in {0,n}) is the maximum level allowed. Examples:

Depth          Substring                       Examples
First-level    ^http://[^/]+(/[^/]*){0,1}$     http://xav.com/
                                               http://xav.com/contact.html
Second-level   ^http://[^/]+(/[^/]*){0,2}$     http://xav.com/
                                               http://xav.com/contact.html
                                               http://xav.com/scripts/
Third-level    ^http://[^/]+(/[^/]*){0,3}$     http://xav.com/
                                               http://xav.com/scripts/
                                               http://xav.com/scripts/search/index.html
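
To check a pattern before saving it as a rule, it can be tried against sample URLs outside the search engine. The following is a minimal sketch in Python, not part of the product. The helper names depth_pattern and within_depth are made up for illustration, and it assumes Python's regex engine treats these patterns the same way the search engine's does. It only tests whether a URL matches; how the Filter Rule itself acts on a match depends on the rule parameters.

```python
import re

def depth_pattern(max_depth: int) -> re.Pattern:
    """Build the depth-limiting pattern from the table for a given maximum depth.

    Hypothetical helper: only the final integer of the pattern changes.
    """
    return re.compile(r"^http://[^/]+(/[^/]*){0,%d}$" % max_depth)

def within_depth(url: str, max_depth: int) -> bool:
    """Return True if the URL has at most max_depth path segments after the host."""
    return depth_pattern(max_depth).search(url) is not None

if __name__ == "__main__":
    samples = [
        "http://xav.com/",
        "http://xav.com/contact.html",
        "http://xav.com/scripts/",
        "http://xav.com/scripts/search/index.html",
    ]
    for depth in (1, 2, 3):
        for url in samples:
            status = "match" if within_depth(url, depth) else "no match"
            print(f"depth {depth}: {url} -> {status}")
```

Note that the patterns start with a literal "http://", so an "https://" URL does not match at any depth. Each "/" after the host counts as one level, which is why "http://xav.com/scripts/" counts as second-level rather than first-level.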

See also: Filter Rules: Creating a new Filter Rule


    "Limiting the crawler to n levels or directories"
    http://www.xav.com/scripts/search/help/1031.html