Following Symbolic Links while Indexing Files
APPLIES TO
Searching local realms, with either indexed searches or runtime searches
SYMPTOMS
Queries may time out without producing any output
Some files are not included in the index, particularly files in sub-folders with common names
DESCRIPTION OF PROBLEM
When the search engine is indexing local files, it searches through the base folder, indexes all files in it, then searches all sub-folders. It continues until all files and folders have been searched.
Some file systems support symbolic links, which allow users to create objects which behave like folders, but actually point to a folder outside of the immediate tree. For example, below is a folder structure where the "home" folder is a symbolic link pointing to the folder two levels higher:
    ~bob/.
    ~bob/public_html
    ~bob/public_html/images
    ~bob/public_html/home -> ~bob
When the search engine indexes the public_html folder, it correctly sees the "images" and "home" folders. When it searches for sub-folders of "home", it is transparently moved two directories higher, into ~bob, so it sees "public_html" as a sub-folder of "home" and starts over from there. The resulting folder list looks like this:
    public_html
    public_html/images
    public_html/home
    public_html/home/public_html
    public_html/home/public_html/images
    public_html/home/public_html/home
    public_html/home/public_html/home/public_html
    public_html/home/public_html/home/public_html/images
    public_html/home/public_html/home/public_html/home
    [infinity]
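The runaway listing above can be reproduced with a short sketch. This is an illustration in Python, not code from search.pl: the function names and the `max_depth` cap are hypothetical, and the cap exists only so the demonstration terminates. It assumes a file system and account that allow creating symbolic links.

```python
import os
import tempfile


def build_example_tree(root):
    """Create bob/public_html/images plus the link public_html/home -> bob."""
    bob = os.path.join(root, "bob")
    public_html = os.path.join(bob, "public_html")
    os.makedirs(os.path.join(public_html, "images"))
    # The link target is the "bob" folder itself, as in the example above.
    os.symlink(bob, os.path.join(public_html, "home"), target_is_directory=True)
    return bob


def naive_walk(folder, label, max_depth, found):
    """Record a folder, then visit every sub-folder, following links blindly."""
    found.append(label)
    if max_depth == 0:
        return  # hypothetical safety cap; without it the recursion keeps going
    with os.scandir(folder) as it:
        # Real folders first, then links, to match the order of the listing.
        entries = sorted(it, key=lambda e: (e.is_symlink(), e.name))
    for entry in entries:
        if entry.is_dir():  # is_dir() follows symbolic links by default
            naive_walk(entry.path, label + "/" + entry.name, max_depth - 1, found)


def demo(max_depth=4):
    """Build the example tree in a temporary folder and list what is visited."""
    with tempfile.TemporaryDirectory() as root:
        bob = build_example_tree(root)
        found = []
        naive_walk(os.path.join(bob, "public_html"), "public_html", max_depth, found)
        return found


if __name__ == "__main__":
    for path in demo():
        print(path)
```

Raising `max_depth` only makes the list longer. Every pass through "home" adds another copy of public_html and its contents.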
WORKAROUND
Two variables control link-following behavior: $AllowSymbolicLinks and $TrustSymbolicLinks. They are defined near lines 127 and 128 of search.pl. The behavior for each combination of values is described below:
$AllowSymbolicLinks = 1;
$TrustSymbolicLinks = 0;
(default configuration)
In this situation, the file indexing process will follow each symbolic link name once. Each time a link is followed, its name is saved in a table. If another link with the same name is found, it is ignored. In the above example:
    public_html
    public_html/images
    public_html/home                      # symlink name "home" is saved
    public_html/home/public_html
    public_html/home/public_html/images
    public_html/home/public_html/home     # symlink "home" encountered again; stops
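As this example shows, there is still some redundant indexing due to the folder/link structure, but the process does not become infinite. The trade-off is that any other symbolic link named "home" elsewhere in the tree is also skipped, even if it points somewhere different, which is why files in sub-folders with common names can be missing from the index.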
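The name-table behavior of the default configuration can be sketched as follows. This is an illustration in Python, not the Perl code in search.pl. `FOLDERS`, `LINKS`, `walk`, and `index_folders` are hypothetical names, and the tree is modeled in memory so the example does not depend on the local file system.

```python
# Hypothetical in-memory model of the example tree: each real folder maps to
# the names of its children, and LINKS maps a link's path to its target folder.
FOLDERS = {
    "bob": ["public_html"],
    "bob/public_html": ["images", "home"],
    "bob/public_html/images": [],
}
LINKS = {"bob/public_html/home": "bob"}


def walk(real_path, label, followed_names, found):
    """Record label, then descend; follow each symbolic link name only once."""
    found.append(label)
    for name in FOLDERS[real_path]:
        child = real_path + "/" + name
        child_label = label + "/" + name
        if child in LINKS:
            if name in followed_names:
                # A link with this name was already followed: list it, stop here.
                found.append(child_label)
                continue
            followed_names.add(name)  # save the link name in the table
            walk(LINKS[child], child_label, followed_names, found)
        else:
            walk(child, child_label, followed_names, found)


def index_folders():
    """Walk the example tree starting at public_html with an empty name table."""
    found = []
    walk("bob/public_html", "public_html", set(), found)
    return found


if __name__ == "__main__":
    for path in index_folders():
        print(path)
```

Because the table is keyed on the link name alone rather than on the link's target, the check is cheap, but it cannot tell a genuine loop from an unrelated link that happens to share the name.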
$AllowSymbolicLinks = 0;
$TrustSymbolicLinks = 0; (or any other value; it is not consulted when links are not allowed)
In this situation, no folder links will be followed.
$AllowSymbolicLinks = 1;
$TrustSymbolicLinks = 1;
In this situation, all folder links will be followed, even if they cause an infinite loop. If the script finds itself in an infinite loop, it will get no actual work done and will eventually time out. This configuration is useful for those who have several valid symbolic links with the same name that are known not to cause loops.
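The three configurations can be summarized as one decision made each time a symbolic link is encountered. The sketch below is a Python illustration of the behavior described in this article, not the code in search.pl. `should_follow` and `links_followed` are hypothetical names, and the two parameters stand in for $AllowSymbolicLinks and $TrustSymbolicLinks.

```python
def should_follow(link_name, allow_symbolic_links, trust_symbolic_links, followed_names):
    """Decide whether to descend into a symbolic link with the given name."""
    if not allow_symbolic_links:
        return False  # links are never followed; the trust setting is irrelevant
    if trust_symbolic_links:
        return True  # every link is followed, loops included
    if link_name in followed_names:
        return False  # default: a name already in the table is skipped
    followed_names.add(link_name)  # default: remember the name, follow it once
    return True


def links_followed(link_names, allow_symbolic_links, trust_symbolic_links):
    """Return the links, in encounter order, that the settings would follow."""
    followed_names = set()
    return [
        name
        for name in link_names
        if should_follow(name, allow_symbolic_links, trust_symbolic_links, followed_names)
    ]


if __name__ == "__main__":
    # Two unrelated links named "docs" in different folders, plus "archive".
    encountered = ["docs", "docs", "archive"]
    print(links_followed(encountered, 1, 0))
    print(links_followed(encountered, 0, 0))
    print(links_followed(encountered, 1, 1))
```

With the default settings, the second "docs" link is skipped even though it may point somewhere new. Only the trusting configuration follows both, which is the case that setting is meant for.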
"Following Symbolic Links while Indexing Files"
http://www.xav.com/scripts/search/help/1057.html