
Script is indexing my custom 404 Not Found page

On web servers that allow custom "Not Found" error messages, the crawler is often unable to distinguish between a legitimate page and an expired, Not Found page.

According to the HTTP standard, a web server is supposed to return the status code "404 Not Found" when a page doesn't exist, and the server can then follow that with whatever HTML text is desired. However, many web servers will redirect the visitor to a separate document to handle Not Found errors. To accomplish that, the web server returns a "302 Found" redirect response, followed by a "200 OK" response for the final document. At the HTTP level, where the crawler operates, this is indistinguishable from a legitimate redirect to a valid document. Microsoft IIS has this limitation with "URL"-type custom messages, but not with Text, Default, or File; Apache has this limitation when custom error messages are located on remote servers, but not for custom messages hosted on the same server.
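The difference can be reproduced with a small local experiment. The following is only a sketch, not the crawler's own code: it starts a throwaway Python web server with two hypothetical styles of Not Found handling (the paths /redirecting/, /inline/ and /notfound.html are invented for this illustration) and reports the status that a redirect-following client ends up with.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server with two styles of custom Not Found handling."""

    def _send(self, status, body=b"", location=None):
        self.send_response(status)
        if location:
            self.send_header("Location", location)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/notfound.html":
            # The shared error document itself is served as a normal page.
            self._send(200, b"<html><body>Page not found.</body></html>")
        elif self.path.startswith("/redirecting/"):
            # Redirect-style handling: missing pages bounce to the error document.
            self._send(302, location="/notfound.html")
        else:
            # Inline handling: custom HTML delivered with the 404 status intact.
            self._send(404, b"<html><body>Page not found.</body></html>")

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_status(url):
    """Return the status code a redirect-following client ends up with."""
    # An empty ProxyHandler keeps the request local even if proxies are configured.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        code = err.code
        err.close()
        return code


def run_demo():
    """Start the demo server on a free port and fetch one missing page per style."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_address[1]
    try:
        return {
            "redirecting": final_status(base + "/redirecting/missing.html"),
            "inline": final_status(base + "/inline/missing.html"),
        }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the missing page under /redirecting/ ends with status 200, exactly like a valid page reached through a redirect, while the missing page under /inline/ ends with status 404 even though it returns the same custom HTML.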

One workaround is to add the following robots meta tag:

<meta name="robots" content="noindex" />

to your custom Not Found message. This will prevent the crawler from indexing it regardless of the headers involved. In addition, you can use a local file for your custom message ("File"-type on IIS, local server on Apache), which preserves the 404 status header. Then, if you need to redirect to a separate file, you can trigger the redirect with JavaScript or an HTML meta refresh.
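To illustrate why the meta tag works regardless of the status code, here is a minimal sketch of how an indexer could check a fetched page for the noindex directive. It uses Python's standard html.parser module; the names RobotsMetaParser and has_noindex are made up for this example and are not taken from the search script itself.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a page carries <meta name="robots" content="...noindex...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; values keep their case.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attributes.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML text asks robots not to index the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex" /></head></html>'
    print(has_noindex(page))
```

Because the check looks only at the page content, it gives the same answer whether the page arrived with a 404 status or at the end of a 302 redirect.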


    "Script is indexing my custom 404 Not Found page"
    http://www.xav.com/scripts/search/help/1030.html