Following Symbolic Links while Indexing Files

APPLIES TO

Searching local realms, with either indexed searches or runtime searches

SYMPTOMS

Queries may time out without producing any output

Some files are not included in the index, particularly files in sub-folders with common names

DESCRIPTION OF PROBLEM

When the search engine is indexing local files, it searches through the base folder, indexes all files in it, then searches all sub-folders. It continues until all files and folders have been searched.
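The traversal order described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Perl code in search.pl; the function name index_folder is hypothetical, and a real indexer would read each file's contents rather than only collecting paths.

```python
import os

def index_folder(base):
    """Return file paths under base: a folder's own files first, then its sub-folders."""
    files, subfolders = [], []
    with os.scandir(base) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_dir():        # is_dir() follows symbolic links by default
                subfolders.append(entry.path)
            elif entry.is_file():
                files.append(entry.path)
    # Only after the files in this folder are handled do we descend further.
    for folder in subfolders:
        files.extend(index_folder(folder))
    return files
```

Because is_dir() follows symbolic links, a sketch like this inherits the same weakness the rest of this article discusses: a link that points back up the tree would make the recursion never finish.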

Some file systems support symbolic links, which allow users to create objects which behave like folders, but actually point to a folder outside of the immediate tree. For example, below is a folder structure where the "home" folder is a symbolic link pointing to the folder two levels higher:

	~bob/.
	~bob/public_html
	~bob/public_html/images
	~bob/public_html/home -> ~bob

When the search engine tries to index the public_html folder, it will correctly see the images and home folders. When searching for subfolders of "home", it will transparently be moved two directories higher, to ~bob, so it will then see the subfolders of ~bob, including "public_html" (and any others, such as "mail", that are not shown in the listing above). The resulting folder list will appear like:

	public_html
	public_html/images
	public_html/home
	public_html/home/public_html
	public_html/home/public_html/images
	public_html/home/public_html/home
	public_html/home/public_html/home/public_html
	public_html/home/public_html/home/public_html/images
	public_html/home/public_html/home/public_html/home
	[infinity]
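The runaway listing above can be reproduced with a small in-memory model. This is a Python sketch under stated assumptions: the CHILDREN and SYMLINKS tables are a hypothetical stand-in for the example tree (no real file system is touched), and the max_depth cap exists only so the demonstration terminates. Without it, the walk would recurse until the interpreter gave up.

```python
# Hypothetical in-memory model of the example tree; not a real file system.
CHILDREN = {
    "~bob": ["public_html"],
    "~bob/public_html": ["images", "home"],
    "~bob/public_html/images": [],
}
SYMLINKS = {"~bob/public_html/home": "~bob"}  # home -> ~bob

def naive_walk(real, shown, max_depth):
    """List folders while blindly following every link, down to max_depth levels."""
    listed = [shown]
    if max_depth == 0:
        return listed
    for name in CHILDREN[real]:
        child = real + "/" + name
        target = SYMLINKS.get(child, child)   # a link silently redirects the walk
        listed.extend(naive_walk(target, shown + "/" + name, max_depth - 1))
    return listed

if __name__ == "__main__":
    for path in naive_walk("~bob/public_html", "public_html", 4):
        print(path)
```

Raising max_depth adds one more public_html/home layer for every two levels, which is the pattern in the listing above.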

WORKAROUND

Two variables have been created to control link-following behavior: $AllowSymbolicLinks and $TrustSymbolicLinks. These variables are defined near lines 127 and 128 of search.pl. Behavior for different values is below:

$AllowSymbolicLinks = 1;
$TrustSymbolicLinks = 0;
(default configuration)
In this situation, the file indexing process will try each symbolic link name once. Each time a link is followed, its name is saved in a table. If another link with the same name is found later, it is ignored. In the above example:

public_html
public_html/images
public_html/home # symlink name "home" is saved
public_html/home/public_html
public_html/home/public_html/images
public_html/home/public_html/home # symlink "home" encountered again; stops

As this example shows, there is still some redundant indexing due to the folder/link structure, but the process does not become infinite.
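The name-table approach can be sketched as below. This is a Python illustration, not the actual search.pl code; the CHILDREN and SYMLINKS tables are a hypothetical in-memory model of the example tree, and the choice to list a repeated link without descending into it is an assumption made to match the listing above.

```python
# Hypothetical in-memory model of the example tree; not a real file system.
CHILDREN = {
    "~bob": ["public_html"],
    "~bob/public_html": ["images", "home"],
    "~bob/public_html/images": [],
}
SYMLINKS = {"~bob/public_html/home": "~bob"}  # home -> ~bob

def walk_once_per_link_name(real, shown, seen_names=None):
    """List folders, following a symbolic link only if its name has not been seen."""
    if seen_names is None:
        seen_names = set()
    listed = [shown]
    for name in CHILDREN[real]:
        child = real + "/" + name
        if child in SYMLINKS:
            if name in seen_names:
                # Same link name encountered again: list it, but do not follow it.
                listed.append(shown + "/" + name)
                continue
            seen_names.add(name)      # first time: save the name, then follow
            child = SYMLINKS[child]
        listed.extend(walk_once_per_link_name(child, shown + "/" + name, seen_names))
    return listed
```

Keying the table on the link name rather than on the resolved target is what makes two unrelated links that share a name collide, which is the case the $TrustSymbolicLinks setting addresses.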

$AllowSymbolicLinks = 0;
$TrustSymbolicLinks = any;
In this situation, no folder links will be followed, regardless of the value of $TrustSymbolicLinks.

$AllowSymbolicLinks = 1;
$TrustSymbolicLinks = 1;
In this situation, all folder links will be followed, even if they cause an infinite loop. If the script finds itself in an infinite loop, it will get no actual work done and will eventually time out. This configuration is useful for sites that have several valid symbolic links sharing the same name, which the default configuration would skip, and which are known not to cause loops.
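The three configurations can be summarized as a single decision, sketched here in Python. The function name should_follow_link and the seen_names set are hypothetical; the two flag parameters mirror $AllowSymbolicLinks and $TrustSymbolicLinks, but this is only a reading of the behavior described above, not the logic as written in search.pl.

```python
def should_follow_link(name, allow_symbolic_links, trust_symbolic_links, seen_names):
    """Decide whether to descend into a symbolic link with the given name."""
    if not allow_symbolic_links:
        return False              # links are never followed; trust flag is irrelevant
    if trust_symbolic_links:
        return True               # always follow, even if this loops forever
    if name in seen_names:
        return False              # default: a link name is followed only once
    seen_names.add(name)
    return True

if __name__ == "__main__":
    seen = set()
    print(should_follow_link("home", 1, 0, seen))   # first "home" link
    print(should_follow_link("home", 1, 0, seen))   # second "home" link
```

Under the default settings the first call follows the link and records its name, and the second call declines.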


    "Following Symbolic Links while Indexing Files"
    http://www.xav.com/scripts/search/help/1057.html