How and when to use Filtered Realms

When to Use Filtered Realms

The Filtered Realm is a special type of realm that tries to index as much as possible.

Other realm types have strict built-in limits to control how much they will index. Open realms only index individual pages that have been submitted. Website realms only index documents on a single site. File-fed realms only index documents linked from the starter file. Filtered realms, on the other hand, do not have these restrictions.

In practice, no FDSE realm should try to index the entire web because FDSE can only index about 10,000 documents before it slows down. Some limits must be in place. The Filtered Realm is useful because the administrator can set up versatile FDSE Filter Rules to control the indexing behavior.

The usefulness of Filtered Realms can be illustrated with some examples.

Example - Indexing the Intranet

An intranet might consist of many websites whose URLs all match a certain pattern. On Windows intranets, website hostnames typically consist of a single word, as in "http://msw/" or "http://payroll/".

To index the entire intranet with a single FDSE realm, you would set up a Filtered Realm named "Intranet". Then, under "Admin Page" => "Filter Rules", you would create a new rule named "Intranet - limit" with the following parameters:

       Name: Intranet - limit
     Action: Deny
    Analyze: Hostname
(*) Apply rule only if the required number of strings...
( ) Always apply rule, unless the...
    Strings: .
      Scope: (*) Apply only to: [x] Intranet

This Filter Rule means that the "Intranet" realm should not contain any URL whose Hostname portion contains a dot ".". Thus, the realm will happily index all of the "http://msw/" website, follow links to the "http://payroll/" website, and from there follow links to the "http://security/" site. When it encounters a link to the outside Internet, like "http://www.yahoo.com/", it will recognize the "." in the hostname "www.yahoo.com" and will not follow that link.
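
The hostname test behind this rule can be sketched in a few lines of Python. This is only an illustration of the logic described above, not FDSE's own code; the function name `intranet_rule_allows` is hypothetical.

```python
from urllib.parse import urlsplit


def intranet_rule_allows(url: str) -> bool:
    """Mimic the "Intranet - limit" rule: deny any URL whose hostname has a dot.

    Hypothetical helper for illustration; FDSE applies the rule internally.
    """
    hostname = urlsplit(url).hostname or ""
    # Action: Deny, Analyze: Hostname, Strings: "."
    return "." not in hostname


for url in ("http://msw/", "http://payroll/", "http://www.yahoo.com/"):
    verdict = "allowed" if intranet_rule_allows(url) else "denied"
    print(url, verdict)
```

Only the hostname is examined, so a dot elsewhere in the URL (for example in a file name such as "index.html") does not trigger the rule.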

Without filtered realms, the administrator would have to create a separate website realm for each intranet server.

Example - Indexing utexas

A university's web presence might consist of hundreds of web servers, all organized under the ".utexas.edu" domain name. All of these servers could be indexed from a single Filtered Realm, when guided by a Filter Rule with the following parameters:

       Name: University - limit
     Action: Deny
    Analyze: Hostname
( ) Apply rule only if the required number of strings...
(*) Always apply rule, unless the...
    Strings: utexas.edu
      Scope: (*) Apply only to: [x] University

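This Filter Rule denies every URL unless its Hostname portion contains the string "utexas.edu". Links that lead off the university's servers are therefore not followed.

Without filtered realms, the administrator would have to create a separate website realm for each university server.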
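
As a rough sketch, the "always apply, unless" form of the rule behaves like the Python function below. It assumes that the rule is skipped whenever the hostname contains the listed string as a plain substring; the function name `university_rule_allows` and the sample hostnames are hypothetical, and this is not FDSE's own code.

```python
from urllib.parse import urlsplit


def university_rule_allows(url: str, required: str = "utexas.edu") -> bool:
    """Mimic the "University - limit" rule: deny unless the hostname contains `required`.

    Assumption: matching is a simple case-insensitive substring test.
    """
    hostname = (urlsplit(url).hostname or "").lower()
    # Action: Deny, applied always, unless the hostname contains the string.
    return required in hostname


for url in ("http://www.utexas.edu/", "http://www.cs.utexas.edu/", "http://www.yahoo.com/"):
    verdict = "allowed" if university_rule_allows(url) else "denied"
    print(url, verdict)
```

Under the substring assumption, a hostname that merely contains "utexas.edu" somewhere inside it would also pass, so a more specific string may be preferable if that matters for your site.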

How to Use Filtered Realms

To create a Filtered Realm:

  1. Make sure you have FDSE version 2.0.0.0054 or newer.

  2. Make sure that you are running in either "Trial" or "Registered" mode. This realm type is not available in "Freeware" mode.

    You can check which mode is selected by going to "Admin Page" => "Update License".

  3. Go to "Admin Page" => "Manage Realms".

    Towards the bottom of that page will be the setting "Allow Filtered Realms". Edit that setting and set it to 1 (checked).

    (In FDSE version 2.0.0.0056 and newer, that setting has been renamed to "Show Advanced Commands".)

  4. After checking "Allow Filtered Realms", return to "Admin Page" => "Manage Realms" and follow the "Create New Realm" link.

    You will be able to choose the realm type by selecting the radio button under each type name. Select the radio button under "Filtered Realms" and submit the form.

  5. Now go to "Admin Page" => "Filter Rules". Create a new Filter Rule, with a "Scope" that ties this new filter rule to your new filtered realm. In the new Filter Rule, enter parameters that will limit the indexing to the desired set of documents.

  6. You will now have an empty Filtered Realm, and you have set up some limits for it.

    The empty realm cannot do anything because it does not have any starting content. Return to the top-level "Admin Page". The first form on that page, "Add New URL", will allow you to add starter documents to the realm.

    (If there is a Realm select box, make sure that you've selected the name of your new Filtered Realm. If there is only a single applicable realm, then the Realm select box will not appear.)

    After adding your starter URL, you will see the success/failure page. If there is a failure, you may need to adjust your starting URL or filter rules appropriately. Otherwise, when you see the success message, simply return to the main admin page.

    You will now have a Filtered Realm which contains a successful starter document. Simply find your Filtered Realm in the realms list on the main "Admin Page", and click the corresponding "Rebuild" action link. The system will begin the indexing process and continue until finished.
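
Conceptually, a rebuild starts from the starter document, follows links, and skips any URL that a Filter Rule denies. The Python sketch below illustrates that idea on a small in-memory link graph. It is an assumption about the general behavior, not FDSE's implementation; the names `LINKS`, `allowed`, `rebuild`, and `max_docs` are hypothetical, and the cap only echoes the rough 10,000-document guideline mentioned earlier.

```python
from collections import deque
from urllib.parse import urlsplit

# Hypothetical link graph standing in for real pages and their outgoing links.
LINKS = {
    "http://msw/": ["http://payroll/", "http://www.yahoo.com/"],
    "http://payroll/": ["http://security/", "http://msw/"],
    "http://security/": [],
    "http://www.yahoo.com/": ["http://msw/"],
}


def allowed(url):
    """Intranet-style filter: deny hostnames that contain a dot."""
    return "." not in (urlsplit(url).hostname or "")


def rebuild(starter, links, is_allowed, max_docs=10_000):
    """Breadth-first walk from the starter URL, indexing only allowed URLs."""
    indexed = []
    seen = {starter}
    queue = deque([starter])
    while queue and len(indexed) < max_docs:
        url = queue.popleft()
        if not is_allowed(url):
            continue  # denied by the filter: not indexed, links not followed
        indexed.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return indexed


print(rebuild("http://msw/", LINKS, allowed))
```

In this sketch the starter URL itself must pass the filter, which mirrors the advice above to adjust the starting URL or the filter rules if adding the starter document fails.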


    "How and when to use Filtered Realms"
    http://www.xav.com/scripts/search/help/1133.html