Creating a robots.txt
If you have ever heard of a robots.txt file, you have probably wondered whether it really helps search engine ranking. The robots.txt file does not help search engine ranking, but that doesn’t mean you don’t need it.
A robots.txt file can also be set up to specify which robots have
permission to crawl your files. For instance, you may choose to disallow
known email address harvesters from your files.
A good reason to use robots.txt is to disallow crawling of a particular file:
- Private Files - You probably don’t need or want your site logs and
similar files to be indexed by the search engines.
- Test Files - Many webmasters will upload a page for a client to preview, or
a test page to work through coding challenges, and these pages should not be
indexed either.
Why should you consider getting a robots.txt file?
- When a robot crawls your site, it looks for the robots.txt file. If it
doesn’t find one, it automatically assumes that it may crawl and index the
entire site. Not having a robots.txt file can also create unnecessary 404
errors in your server logs, making it more difficult to track “real” 404
errors.
- Assuming you want your entire site indexed and only want to stop the
unnecessary 404 errors from occurring, you have a couple of options.
* Upload a blank robots.txt file to the root directory of your domain.
* Upload a simple robots.txt file to the root directory of your domain.
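Since robots look for the file in the root directory of the domain, its address can be worked out from any page on the site. Here is a minimal Python sketch of that idea. The helper name robots_url and the www.example.com address are placeholders made up for this illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt address for the host that serves page_url.

    robots_url is a hypothetical helper name used only for this example.
    """
    parts = urlsplit(page_url)
    # Keep the scheme and host, point the path at the root-level file,
    # and drop any query string or fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# www.example.com is a placeholder domain.
print(robots_url("http://www.example.com/blog/2008/01/post.html"))
```

A file uploaded anywhere else, such as inside a subfolder, would not be at the address this returns.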
What is a simple robots.txt file?
This code allows all robots to crawl all files.
User-agent: *
Disallow:
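If you would like to check a file like this before uploading it, Python's standard urllib.robotparser module can read the rules and answer "may this robot fetch this path?". The sketch below compares the simple file with a blank one. The helper name may_crawl, the robot name AnyBot and the test path are assumptions made up for the example, and Python's parser is only one interpretation of the rules, so real robots may behave differently.

```python
from urllib.robotparser import RobotFileParser


def may_crawl(robots_txt, user_agent, path):
    """Report whether robots_txt lets user_agent fetch path.

    may_crawl is a hypothetical helper name used only for this example.
    """
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


SIMPLE = "User-agent: *\nDisallow:\n"  # the simple file shown above
BLANK = ""                             # a blank robots.txt file

# "AnyBot" and the path are placeholders for illustration.
for name, text in (("simple", SIMPLE), ("blank", BLANK)):
    print(name, may_crawl(text, "AnyBot", "/private/page.html"))
```

Under this parser both files should be treated the same way: nothing is disallowed.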
What if I don’t want a particular file crawled?
Please note: Disallowing a file keeps well-behaved robots from crawling it,
so its contents will not show up in the search engines. The address itself
can still be listed if other pages link to it.
This allows all robots to crawl all files except those in the images directory.
User-agent: *
Disallow: /images/
This allows all robots to crawl all files except those in the images
directory and the stats directory.
User-agent: *
Disallow: /images/
Disallow: /stats/
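To see which addresses a set of Disallow lines actually blocks, you can feed the rules and a few paths to Python's standard urllib.robotparser module. This is only a sketch: the helper name blocked_paths, the robot name AnyBot and the sample paths are invented for the example, not taken from a real site.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /images/
Disallow: /stats/
"""

# Sample paths are placeholders for illustration.
PATHS = ["/index.html", "/images/logo.png", "/stats/january.html"]


def blocked_paths(robots_txt, paths, user_agent="AnyBot"):
    """Return the paths that robots_txt disallows for user_agent.

    blocked_paths and the default "AnyBot" name are hypothetical.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


print(blocked_paths(RULES, PATHS))
```

Each Disallow line is matched against the start of the path, which is why everything under /images/ and /stats/ is caught while /index.html is not.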
What if I want to disallow a particular robot?
- Occasionally you may want to disallow specific robots from crawling your
site, or limit which files they may have access to.
(Please note that most of the so-called “bad robots” will simply disregard
your robots.txt file.)
* This denies Googlebot-Image access to any files in your domain.
User-agent: Googlebot-Image
Disallow: /
* This specifically denies Googlebot-Image access to your images directory.
User-agent: Googlebot-Image
Disallow: /images/
- For a current database of robot names and information, visit:
http://www.robotstxt.org/wc/active/html/index.html
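Rules written for one robot should leave every other robot alone, and that is easy to confirm with Python's standard urllib.robotparser module. The sketch below uses the Googlebot-Image rule for the images directory. The helper name allowed and the sample paths are assumptions for illustration, and the check only shows how Python's parser reads the file, not how Google's own crawler behaves.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: Googlebot-Image
Disallow: /images/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(agent, path):
    """Report whether the rules above let agent fetch path.

    allowed is a hypothetical helper name used only for this example.
    """
    return parser.can_fetch(agent, path)


# Sample paths are placeholders for illustration.
for agent in ("Googlebot-Image", "Googlebot"):
    for path in ("/images/photo.jpg", "/index.html"):
        print(agent, path, allowed(agent, path))
```

Only the combination of Googlebot-Image and a path under /images/ should come back as not allowed.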
How do I create a robots.txt file?
- Simply create a text document and save the new document as robots.txt. Do
not use an HTML editor to create the file unless it has the ability to create
a plain text document (ASCII). Most computers will allow you to create a
text document using Notepad.
- Create robots.txt file
- Right click on your desktop
- Choose new
- Choose text document
- Open the document you just created
- Insert instructions to robots
- Click on save as
- Save document as robots.txt
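If you would rather not do this by hand, the same plain text file can be produced by a short script. This is a minimal Python sketch, not the only way to do it: the helper name write_robots_txt is made up, and a temporary folder stands in for wherever you keep the files you upload.

```python
import tempfile
from pathlib import Path


def write_robots_txt(directory, disallowed=()):
    """Write a plain ASCII robots.txt into directory and return its path.

    write_robots_txt is a hypothetical helper name used for this example.
    """
    lines = ["User-agent: *"]
    if disallowed:
        lines += [f"Disallow: {path}" for path in disallowed]
    else:
        # An empty Disallow line allows all robots to crawl all files.
        lines.append("Disallow:")
    target = Path(directory) / "robots.txt"
    # encoding="ascii" raises an error rather than writing non-ASCII text.
    target.write_text("\n".join(lines) + "\n", encoding="ascii")
    return target


# A temporary folder is used here as a placeholder location.
with tempfile.TemporaryDirectory() as folder:
    saved = write_robots_txt(folder, ["/images/", "/stats/"])
    print(saved.read_text(encoding="ascii"))
```

The file still has to be uploaded to the root directory of your domain afterwards.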
That’s all you will ever need to know about robots.txt.

Posted by Alexander Batista on January 24th, 2008






