SEO Posted June 5, 2003
Recently asked: I noticed one of your meta tags that was new to me. It said something about Robots... can you tell me more about its function?
><meta name="robots" content="ALL,FOLLOW">
The robots meta tag is directed specifically at (some) spiders. Content values include:
ALL - index the page and follow its links (the same as INDEX,FOLLOW)
INDEX - index the current page
FOLLOW - follow the links on the page
NOINDEX - do not index the current page
NOFOLLOW - do not follow the links on the page
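For anyone who wants to see how a spider might read that tag, here is a minimal sketch in Python. It is not how any particular search engine does it; the class and function names (`RobotsMetaParser`, `robots_policy`) are made up for illustration, and it assumes that a page with no robots tag may be both indexed and followed.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Values are comma separated and not case sensitive.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_policy(html):
    """Return (may_index, may_follow); both default to True (an assumption)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    # NONE is shorthand for NOINDEX,NOFOLLOW.
    may_index = not ({"noindex", "none"} & found)
    may_follow = not ({"nofollow", "none"} & found)
    return may_index, may_follow


print(robots_policy('<meta name="robots" content="ALL,FOLLOW">'))      # (True, True)
print(robots_policy('<meta name="robots" content="NOINDEX,FOLLOW">'))  # (False, True)
```

Note that ALL and FOLLOW only restate the default, so only the NO... values actually change what the sketch returns.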
surefire Posted June 5, 2003
Dsdemmin, what is your feeling on having a robots.txt file? (I think I got that right.) Second question, does your post mean that you suggest having this Meta tag on pages?
SEO (Author) Posted June 5, 2003
"Dsdemmin, what is your feeling on having a robots.txt file?"
It does not hurt. The following allows all robots to visit all files:
>User-agent: *
>Disallow:
The file is primarily used, though, to keep robots away from certain files or directories. This would ban all robots from the cgi-bin directory:
>User-agent: *
>Disallow: /cgi-bin/
Remember: this must be a plain text file, it needs to be in your root directory, and it must be named robots.txt in all lowercase (the name is case sensitive).
"Second question, does your post mean that you suggest having this Meta tag on pages?"
It does not hurt.
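If you want to try rules out before uploading the file, Python's standard library has a robots.txt parser (`urllib.robotparser`). The sketch below feeds it the two examples above; the helper name `allowed` and the paths being tested (`/cgi-bin/form.pl`, `/index.html`) are hypothetical, and real spiders may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, path, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# An empty Disallow value means nothing is off limits.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# Rules are path prefixes: everything under /cgi-bin/ is blocked.
BLOCK_CGI = "User-agent: *\nDisallow: /cgi-bin/\n"

print(allowed(ALLOW_ALL, "/cgi-bin/form.pl"))  # True
print(allowed(BLOCK_CGI, "/cgi-bin/form.pl"))  # False
print(allowed(BLOCK_CGI, "/index.html"))       # True
```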
natimage Posted June 5, 2003
If you disallow a robot from visiting a certain directory, how is that different from cloaking a folder? And if it's not different, I thought cloaking was frowned upon by the search engines? Thanks!
surefire Posted June 5, 2003
Disallowing a spider from visiting is very different from cloaking. Cloaking is serving different content to spiders than the content you show to human visitors. The idea is that you design the site that humans see, and then show a different, optimized version to spiders so your rank is higher than it normally would be. Search engines don't like being fooled. From what I have read, most tricks should be avoided. I say 'most' because some would argue over what is considered a trick. If you are trying to 'fool' the search engines, be aware that there is a good chance your site will be banned if you are caught.
SEO (Author) Posted June 5, 2003
natimage: Cloaking and disallow (or noindex) are indeed very different, as Jack explained. Actually, the search engines like disallow and noindex, in the sense that they help them avoid pages that do not need to be spidered. Why would anyone ever want a page not spidered? Well, there are many reasons: private images, private content. I never have spiders index form pages; it is not necessary. In addition, I may have a directory full of scripts with html files for documentation purposes, and I do not want those indexed.
natimage Posted June 6, 2003
Thanks for the clarification. I definitely don't have and won't have any aspirations to "fool" the search engines. That's why I was asking in the first place... and just because I didn't understand the difference. Just a little uneducated newbie here trying to learn her way! Thanks again for the clarification.
natimage Posted June 10, 2003
I just loaded my robots.txt file. The code is below, but I wanted to ask if this is the correct way to disallow the robots from multiple directories?
>User-agent: *
>Disallow: /cg-bin/
>Disallow: /Images/
>Disallow: /Misc/
>Disallow: /MyJunk/
And I also just put the robots meta tag in place. I noticed in my raw log files that Googlebot had looked for robots.txt on my site. Does it always look for that file? Thanks, Tracy
boxturt Posted June 10, 2003
You missed the 'i' in cgi-bin. Otherwise that looks correct. I believe Google does look for and respect the robots meta tag as well as the robots.txt file. Not all spiders do. Evil little buggers that they are... ty
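A typo like that is easy to miss, because a misspelled Disallow rule is not an error; it simply never matches anything. One rough way to catch it is to compare each rule against the directories that really exist on the site. The sketch below assumes you supply that directory list yourself (`SITE_DIRS` here is a hypothetical example), and the function names are made up for illustration.

```python
def disallowed_paths(robots_txt):
    """Extract the non-empty Disallow values from robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


def unmatched_rules(robots_txt, site_dirs):
    """Return Disallow paths that are not a prefix of any known directory."""
    return [
        rule
        for rule in disallowed_paths(robots_txt)
        if not any(directory.startswith(rule) for directory in site_dirs)
    ]


ROBOTS_TXT = """User-agent: *
Disallow: /cg-bin/
Disallow: /Images/
Disallow: /Misc/
Disallow: /MyJunk/
"""

# Hypothetical list of the directories actually on the site.
SITE_DIRS = ["/cgi-bin/", "/Images/", "/Misc/", "/MyJunk/"]

print(unmatched_rules(ROBOTS_TXT, SITE_DIRS))  # ['/cg-bin/']
```

The comparison is case sensitive on purpose, since paths in robots.txt rules are matched case sensitively: `/images/` would be flagged here just like `/cg-bin/`.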
SEO (Author) Posted June 10, 2003
I confirm that it looks good. Googlebot checks robots.txt first when entering a site and then checks for the robots meta tag on each page. Why? The big reason is to conserve resources. Right now, the big issue for Google is resources (time). Therefore, the fewer pages (or directories, or sites) they need to spider, the better. Thus, what they are looking for is disallow and noindex, since both are used to avoid spidering. They will ignore 'cute' things like "spider after 10 days", etc.
theladybug Posted June 23, 2003
Do I need to add the "Meta tags" on all of my pages?
SEO (Author) Posted June 23, 2003
It does not hurt. It may not help much, but it definitely will not hurt.