Kaula Posted March 4, 2006 I'm really confused about robots.txt. I've read a few things about it, but I don't really know what it's for. Google, Yahoo's Slurp, and something called Majestic12 keep trying to access it. Can someone give me the easy explanation: do I need it, and why might it be important to me? Probably a stupid question.
TCH-Andy Posted March 4, 2006 Hi Kaula, It's a plain text file where you can leave instructions for all "robots", i.e. search engine spiders such as Google, Yahoo, etc. They will all try to read that file first. In it you can tell them which areas they should not go into. For example, if you don't want Google to index your family pages, you can tell it not to. If you're happy for Google (or any other search engine) to index your whole site, you don't need to have this file. If you want to stop the error in your log caused by all the bots looking for this file, just create an empty file called robots.txt and upload it to your public_html folder.
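To make the "family pages" example concrete, here is a minimal sketch of what such a file might contain, checked with Python's standard urllib.robotparser module. The /family/ path and the example.com URLs are assumptions made up for illustration, not anything from Kaula's actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: keep Googlebot out of /family/,
# and let every other robot go anywhere (an empty Disallow allows all).
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /family/

User-agent: *
Disallow:
"""


def make_parser(text):
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = make_parser(ROBOTS_TXT)
# Googlebot is asked to stay out of the family pages only.
google_family = rules.can_fetch("Googlebot", "http://example.com/family/photos.html")
google_home = rules.can_fetch("Googlebot", "http://example.com/index.html")
# Other robots, such as Yahoo's Slurp, fall under the "*" entry.
slurp_family = rules.can_fetch("Slurp", "http://example.com/family/photos.html")

# An empty robots.txt places no restrictions on anyone.
empty_rules = make_parser("")
empty_allows = empty_rules.can_fetch("Googlebot", "http://example.com/family/photos.html")

print(google_family, google_home, slurp_family, empty_allows)
```

With these assumed rules, only Googlebot's request for the /family/ page is refused. The empty file behaves the same as having no restrictions at all, which is why it is enough to stop the log errors.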
Kaula Posted March 4, 2006 Thanks for the reply. Is its only use to prevent robots from indexing certain pages/areas?
TCH-Rob Posted March 4, 2006 That is, provided it is a nice robot and actually reads the file first. Not all of them bother to look for that file.
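The point about "nice" robots can be sketched in a few lines: the file is only a request, and it is the robot's own code that decides whether to honour it. The function name urls_to_fetch, the polite flag, the "NiceBot" name, and the example URLs below are all hypothetical, used only to illustrate the idea.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ask every robot to stay out of /private/.
ROBOTS_LINES = ["User-agent: *", "Disallow: /private/"]

# Hypothetical pages a robot has discovered on the site.
URLS = [
    "http://example.com/index.html",
    "http://example.com/private/diary.html",
]


def urls_to_fetch(urls, robots_lines, agent, polite=True):
    """Return the URLs a robot would go on to fetch.

    A polite robot filters the list through robots.txt first.
    A rude one skips the check, and nothing on the server stops it.
    """
    if not polite:
        return list(urls)
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return [url for url in urls if parser.can_fetch(agent, url)]


nice = urls_to_fetch(URLS, ROBOTS_LINES, "NiceBot")
rude = urls_to_fetch(URLS, ROBOTS_LINES, "NiceBot", polite=False)
print(nice)
print(rude)
```

So robots.txt keeps well-behaved spiders away from an area, but it is not access control. Anything that must stay private needs real protection, such as a password.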
TCH-Don Posted March 4, 2006 You can find out more about robots.txt at robotstxt.org, and web master tool central has a robots.txt generator.
TCH-JimE Posted March 6, 2006 Hi, You can see how other people do it too: http://www.google.co.uk/robots.txt http://www.microsoft.com/robots.txt http://www.nasa.gov/robots.txt Many sites have them, so have a look! JimE