TCH-Thomas Posted September 19, 2003
Hi everyone,
We are all doing our best to get visitors to our sites, so this might be a slightly odd question. Is there any way I can prevent search engines (like Google) from indexing my site, other than not publishing it? I am going to set up a "members area" for a friend, but we do not want the individual pages to turn up in search engines, as I have found they do with my own site.
-Thomas
greatfolios sysop Posted September 19, 2003
I can't speak too loudly on this, but one of the key concepts is not to have links to those pages anywhere. I have a family site (billmccorddotcom) that I made one reference to here at the forum, and now it has a PR 3 from Google. Your friends must not reference their pages in other forums, guestbooks, etc. The other thing that I think you can do is add a nospider line to your .htaccess file (someone else will have to tell you how, I am not an expert on this one).
Good luck!
Mr. Bill
surefire Posted September 19, 2003 (edited)
You change the robots.txt file. I found this:

If you want to keep it a little more private (without having to password protect it), add this to your robots.txt file:

User-agent: *
Disallow: /hidden.html

(Boxturt on TCH forum)

Obviously, replace /hidden.html with the page(s) and dir(s) not to index. A page is listed without a trailing slash; a directory is listed with one (for example, a made-up /members/). Also, I think you could password protect the section and the spiders wouldn't get in.

If you haven't put a robots.txt file on your site then you probably don't have one. It's just a text file put in the public_html directory of your site. I would advise reading a tutorial on it, because if you goof it up you might send all spiders away from all pages.
Edited September 19, 2003 by surefire
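Before uploading a robots.txt it can help to test the rules locally. The following is only a minimal sketch, assuming Python's standard `urllib.robotparser` module as a stand-in for a search engine spider (real spiders may interpret rules slightly differently). The paths `/hidden.html` and `/members/` are placeholders, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules: swap in the pages and directories you want to hide.
ROBOTS_TXT = """\
User-agent: *
Disallow: /hidden.html
Disallow: /members/
"""


def is_allowed(robots_txt, path, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/hidden.html", "/members/profile.html"):
        print(path, is_allowed(ROBOTS_TXT, path))
```

With this parser, a rule only blocks paths that start with the rule text. A rule written as `Disallow: /hidden.html/` (with a trailing slash) would therefore leave the page `/hidden.html` itself crawlable.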
Wilexa Posted September 19, 2003
Here are a couple of good links on robots.txt:
A tutorial: www.searchengineworld.com/robots/robots_tutorial.htm
A validator (to make sure everything is OK): www.searchengineworld.com/cgi-bin/robotcheck.cgi
Just remember: 1) robots.txt MUST be in your public_html folder, and 2) you must have a carriage return (new line) after your last "Disallow" line.
Make sure that you set up your robots.txt as soon as you publish a site. That way YOU control which parts of your site will get spidered. Once a page is in Google, it takes a long time for it to disappear!
Good luck,
Dave
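The newline point above can be checked on your own machine before uploading. This is a rough sketch only and not a replacement for an online validator; `check_robots_text` is a hypothetical helper, and the two checks it performs (trailing newline, rules appearing only after a User-agent line) are assumptions about what is worth flagging.

```python
def check_robots_text(text):
    """Return a list of human-readable problems found in robots.txt text."""
    problems = []
    # The file should end with a newline after the last rule.
    if not text.endswith("\n"):
        problems.append("file does not end with a newline")

    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, sep, _value = line.partition(":")
        if not sep:
            problems.append(f"line {number}: missing ':' separator")
            continue
        name = field.strip().lower()
        if name == "user-agent":
            seen_user_agent = True
        elif name in ("disallow", "allow") and not seen_user_agent:
            problems.append(
                f"line {number}: rule appears before any User-agent line"
            )
    return problems
```

An empty list means neither of the two problems was found; it does not prove that every spider will read the file the way you intend.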
TCH-Thomas Posted September 20, 2003 Author
Does this robots.txt work on all servers? I don't know yet which server/host my friend will choose, and it's not my decision. But I will do my best to recommend TCH.
-Thomas
surefire Posted September 20, 2003
Yes. Search engine 'spiders' check for the existence of this file first. It has nothing to do with the server.
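To illustrate why the server software does not matter: a spider simply requests `/robots.txt` at the root of the host, whatever page it was about to visit. A small sketch of that lookup, assuming Python's standard `urllib.parse`; `robots_url` is a hypothetical helper and `www.example.com` is a placeholder host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt URL a spider would check for the given page."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://www.example.com/members/page.html?id=7"))
```

This is also why the file has to sit in public_html (the site root): a robots.txt placed in a subfolder is never requested.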