Few webmasters take the time to create a robots.txt file for their website. Search engine spiders read robots.txt to learn which directories they may crawl, so the file helps keep them indexing your actual pages rather than unrelated material, such as your stats pages!
The robots.txt file keeps spiders out of folders and files in your hosting directory that are unrelated to your actual website content. You can keep spiders out of areas containing programming that search engines cannot parse properly, as well as the web stats portion of your site.
Many search engines cannot properly view dynamically generated content, which is typically produced by programming languages such as PHP or ASP. If you have an online store in your hosting account and it sits in a separate directory, you would be wise to block spiders from that directory so they find only relevant information.
The robots.txt file should be placed in the directory where the main files for your site are located. Create a blank text file, save it as robots.txt, and upload it to the same directory on your web host where your index.htm file is located.
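Spiders request the file from the top level of your domain, which is why its location matters. As a rough illustration, the short Python sketch below works out where a spider would look for robots.txt given any page on a site. The helper name robots_url and the www.example.com address are made up for this example.

```python
from urllib.parse import urljoin


def robots_url(page_url):
    """Return the address where a spider would look for robots.txt.

    The leading slash in "/robots.txt" makes urljoin drop whatever
    path page_url has, leaving only the top level of the site.
    """
    return urljoin(page_url, "/robots.txt")


# Hypothetical page inside a store subdirectory.
print(robots_url("http://www.example.com/store/index.htm"))
```

Whichever page you start from, the result points at the top-level directory, so a robots.txt saved inside a subdirectory such as /store/ would not be found there.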
Here are some examples of how to use the robots.txt file:
Disallow lines belong under a User-agent line that names the spider they apply to; User-agent: * covers all spiders. To block a directory, such as a subdirectory for your online store called /store/, add the following line: Disallow: /store/
Another example, to block your stats directory: Disallow: /stats/
You may also want to disallow individual files that you do not want searched by the search engines. For example, say you don't want search.php to be parsed by the search engines. To do this, assuming the file sits in your main directory, type the following on its own line: Disallow: /search.php
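If you want to check your rules before uploading the file, Python's standard urllib.robotparser module can read them and report whether a path is blocked. The sketch below is only one way to do this. It assumes the rules apply to all spiders (User-agent: *) and that search.php sits in the main directory; the is_allowed helper and the sample paths are invented for this example.

```python
from urllib.robotparser import RobotFileParser

# The rules discussed above, gathered into one file.
# Assumption: "User-agent: *" applies them to every spider.
ROBOTS_TXT = """\
User-agent: *
Disallow: /store/
Disallow: /stats/
Disallow: /search.php
"""


def is_allowed(path, agent="*"):
    """Return True if the rules let the given spider fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


# Hypothetical paths to try against the rules.
for path in ["/index.htm", "/store/cart.php", "/stats/", "/search.php"]:
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Each rule matches any path that begins with the text after Disallow:, so /store/ blocks everything inside that directory but leaves a page such as /storefront.htm alone.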
By following these rules when you create your robots.txt file, you will keep search engine spiders out of unwanted files and directories while letting them go through the important files to see what your website is all about!