queenpictoria Posted June 30, 2005
I have been roaming around the forums here and in other places, reading about this new subject of robots.txt. I know robots are crawlers of a sort, but I feel like I am in Spain and all I can do is say "agua" to ask for a drink of water. Is there already a robots.txt file on the web domain, or do I need to place robots entries in meta tags on each webpage? Some suggested I set up a robots.txt file on the domain. I didn't see one through my cPanel. This is uncharted territory for me. It will be clearer in a couple of months. Right now I am concerned with protecting my website. I want search engines and portals to have access to my site. So when does one need to set up the robots.txt controls? Thanks for helping a newbie!
Deverill Posted June 30, 2005
There is no automatic robots.txt file, but web crawlers (spiders) automatically assume everything is OK to look at. If you have no robots.txt file, they will crawl all over your site and index it as well as possible. You only need a robots.txt file to ask them to skip over certain pages or sections.
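To make the "skip over certain sections" idea concrete, here is a minimal sketch using Python's standard `urllib.robotparser` to read a small robots.txt the way a well-behaved crawler would. The `/cart/` path and the `example.com` URLs are assumptions made up for illustration, not anything from this site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The /cart/ section is an assumed example
# of something a site owner might ask crawlers to skip.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Pages outside /cart/ stay crawlable; pages under /cart/ are asked to be skipped.
home_ok = is_allowed(ROBOTS_TXT, "http://example.com/index.html")
cart_ok = is_allowed(ROBOTS_TXT, "http://example.com/cart/checkout.html")
print(home_ok, cart_ok)
```

Keep in mind that robots.txt is only a request to crawlers. It does not password-protect anything, so truly private information needs real access control.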
queenpictoria Posted June 30, 2005
Would a shopping cart, or something with proprietary information, be an example of when to use a robots.txt file? When do people use robots meta tags or robots.txt files? Thanks so much for your reply. I have made a first attempt at the tutorials on robots, and some of them are very technical and use words that, up to this point, I have not needed to learn. I'll keep working at it. I have no need for secrecy. However, I don't know if I need privacy for certain aspects of my site. If you have any suggested reading, that'd be cool. Thanks for your help.
TCH-Bruce Posted June 30, 2005
The purpose of the robots.txt file is to direct search engines. Not all search engines honor your robots.txt file, but most do. Here is a good tutorial and interesting FAQ on the robots.txt file.
TCH-Dick Posted June 30, 2005
Just wanted to add a side note: it's a good idea to have a robots.txt file even if it is blank. Bots will look for this file, and if it's not there the server will give a "file not found" error. If your site is crawled frequently, this can make a real mess of your logs.
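As a quick check that a blank file does not block anything, the sketch below feeds an empty robots.txt, and an explicit "allow everything" version, to Python's standard `urllib.robotparser`. The `example.com` URL is a placeholder, and real crawlers may differ in details, so treat this as an illustration of the convention rather than a guarantee about any particular search engine.

```python
from urllib.robotparser import RobotFileParser


def allowed_with(robots_txt: str, url: str) -> bool:
    """Return True if a generic crawler ("*") may fetch `url` under this robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


PAGE = "http://example.com/any/page.html"  # placeholder URL for illustration

# A completely blank file: no rules, so nothing is excluded.
blank_ok = allowed_with("", PAGE)

# The explicit form: an empty Disallow value means "nothing is disallowed".
explicit_ok = allowed_with("User-agent: *\nDisallow:\n", PAGE)

print(blank_ok, explicit_ok)
```

Either version gives the bots a file to find, which is what keeps the "file not found" entries out of the logs.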
queenpictoria Posted June 30, 2005
Where should I put the robots.txt file? Would it work to put a robots.txt meta tag on each page?
queenpictoria Posted June 30, 2005
Hello Dick, Bruce gave me a tutorial to read that may answer my questions. I'll get back to you if I still don't understand. Thanks.
abinidi Posted June 30, 2005
Oh, please let us know what the answer was! We all want to benefit from the knowledge that you gain.
TCH-Bruce Posted June 30, 2005
"Where should I put the robots.txt file? Would it work to put a robots.txt meta tag on each page?"
One should go into your public_html folder. And if you are running subdomains, you should also put one in the root folder of each subdomain.
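The reason each subdomain needs its own copy is that crawlers ask for `/robots.txt` at the root of whatever hostname they are visiting. A small sketch of that lookup, using only Python's standard `urllib.parse`; the `example.com` and `shop.example.com` hostnames are made-up placeholders, and the helper name `robots_url` is just for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Build the robots.txt address a crawler would request for this page's host."""
    parts = urlsplit(page_url)
    # Same scheme and hostname, but always the fixed path /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Main domain: served from the site's web root folder.
main = robots_url("http://example.com/gallery/photo1.html")

# Subdomain: a different hostname, so a different robots.txt is requested.
sub = robots_url("http://shop.example.com/items/list.html")

print(main)
print(sub)
```

A robots.txt placed in a subfolder such as `/gallery/robots.txt` would not be looked for under this convention.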
queenpictoria Posted July 2, 2005
Thanks for all your good help. I put a robots.txt file inside my public_html folder. One more thing... Is this the correct meta tag to put in the head of the webpage when I don't want to block any robots? <meta name="robots" content="index,follow"> I think it is right, but I just need a yes or no confirmation. Thanks,
queenpictoria Posted July 2, 2005
"Yes, that is the correct meta tag."
Thank you!
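For anyone who wants to double-check what a page's robots meta tag actually says, here is a rough sketch that pulls the directives out of a page's HTML with Python's standard `html.parser`. The class and function names are hypothetical helpers, and the sample page is a stripped-down stand-in for a real one.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from a <meta name="robots"> tag, if one is present."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "index,follow".
            self.directives = [d.strip().lower() for d in content.split(",") if d.strip()]


def robots_directives(html: str) -> list:
    """Return the list of robots meta directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Minimal stand-in page using the tag from the post above.
page = '<html><head><meta name="robots" content="index,follow"></head><body></body></html>'
found = robots_directives(page)
print(found)
```

Note that the meta tag and the robots.txt file are separate mechanisms: the tag lives in each page's head, while robots.txt is a single file at the site root. As far as I know, "index,follow" is also what the major crawlers do by default, so the tag is harmless but optional.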