Filter Rules: Using the "follow,noindex" filter rule

The filter rule "Follow links, but do not index document", also known as "follow,noindex", is designed to mimic the behavior of the "follow,noindex" robots meta tag. A document affected by this rule will not be indexed at all, but any links found in the document will be stored and may be followed.

A typical scenario for this rule is indexing a "seed page" such as http://xav.com/seed.html, which contains links to many files on a server that are not otherwise interlinked. When building a website realm, the FDSE web crawler is directed to the seed file, from which it learns about all of the files that it should index or "follow". Because the seed file itself is boring and shouldn't be in the index, it is covered by a "follow,noindex" rule. Generally it is better to use a robots meta tag for this, but in some cases it is not possible or desirable to add that meta tag to the seed file, and so the filter rule is the next best approach.
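
The seed-page behavior can be illustrated with a small Python sketch. This is not FDSE's own code (FDSE is a separate product with its own crawler); the names LinkCollector, process_page and follow_noindex_urls are hypothetical, and the rule check is simplified to an exact URL match. It only shows the idea: links on the page are collected and queued either way, but a page covered by a "follow,noindex" rule is left out of the index.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def process_page(url, html, follow_noindex_urls, index, queue):
    """Queue the page's links; index the page only if no rule covers it."""
    collector = LinkCollector(url)
    collector.feed(html)
    queue.extend(collector.links)        # links are stored in both cases
    if url not in follow_noindex_urls:   # hypothetical exact-match rule check
        index[url] = html


seed_url = "http://xav.com/seed.html"
seed_html = '<a href="/a.html">A</a> <a href="docs/b.html">B</a>'

index, queue = {}, []
process_page(seed_url, seed_html, {seed_url}, index, queue)
```

After this runs, the index is still empty while the queue holds the two linked URLs, which is the outcome the "follow,noindex" rule is meant to produce for a seed file.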

To create a "follow,noindex" filter rule, follow these steps:

  1. Log in to the "Admin Page".

  2. Choose the "Filter Rules" link from the navigation menu. From that page, choose Filter Rules - Create New Rule.

  3. You will be taken to the Create or Update Rule page. Enter these values:

    Name: My Rule
    Enabled: [x] (checked)
    Action: (*) Follow links, but do not index document
    Analyze: (*) URL
    Minimum Occurrences: 1
    (*) Apply rule only...
    Strings: (enter URLs or hostnames here)
    Scope: (*) Apply to all realms

  4. Click the "Save Data" button to save your new rule.

  5. Test the rule by going to the "Admin Page" and adding a URL in the "Add New URL" form. Make sure that this URL is listed in the "Strings" section of the filter rule that was just created, and that the page contains plenty of links. After indexing, FDSE should respond with "Error: document will not be included in the index because it is denied by a "noindex,follow" filter rule", but at the same time all of the page's links should appear under the "Embedded Links" section.
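
As a rough mental model of the rule configured above, the following Python sketch treats a rule as matching when the listed strings occur in the analyzed URL at least "Minimum Occurrences" times. The FilterRule class and its matching logic are assumptions made for illustration; FDSE's actual matching (for example, case handling or pattern syntax) may differ.

```python
from dataclasses import dataclass


@dataclass
class FilterRule:
    """Hypothetical model of an 'Analyze: URL' filter rule."""
    name: str
    strings: list
    min_occurrences: int = 1
    enabled: bool = True

    def matches(self, url):
        # A disabled rule never applies.
        if not self.enabled:
            return False
        # Count every occurrence of every listed string in the URL.
        hits = sum(url.count(s) for s in self.strings if s)
        return hits >= self.min_occurrences


rule = FilterRule(name="My Rule", strings=["xav.com/seed.html"])

covered = rule.matches("http://xav.com/seed.html")    # seed page: rule applies
skipped = rule.matches("http://xav.com/other.html")   # other page: rule does not apply
```

Under this model, only URLs containing one of the listed strings are handled as "follow,noindex"; every other document is indexed normally.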

History: this type of filter rule was added with FDSE version
