Lex, posted October 16, 2003:

Hello, I'm new here and new to web hosting in general. I was wondering if there is anything special you need to do to keep your site from being "crawled" and recorded by the search engines until it is completed. I think I read somewhere a while back that you have a harder time getting ranked if you start with just a simple one-page website (an "under construction" or "coming soon" type of thing). I can't remember where I read this, though. I seem to remember there were some things you could do to prevent this from happening.

Thanks,
Lex
shakes, posted October 16, 2003:

you can do this with META tags ... i just read something about this recently .. lemme see if i can find it again...

ok ... found some stuff. there are two ways: META tags and a robots.txt file.

for meta tags, put this tag in the <head> of each page:

    <meta name="robots" content="noindex,nofollow">

this tells search engines not to index the page and not to follow its links to index other pages. good info on that is here.

this robots.txt thing I'm not too familiar with ... from what I understand you place a file named robots.txt in your webserver root and search engines will check it for instructions on what to index and what not to index. little bit of info here.

hth
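If you want to double-check that a page really carries the robots meta tag before relying on it, something like the following might do. This is only a rough Python sketch using the standard library's `html.parser`; the `RobotsMetaParser` class, the `robots_directives` helper, and the sample page are made up for illustration and are not from any of the posts.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # content="noindex,nofollow" becomes {"noindex", "nofollow"}.
        for part in attr_map.get("content", "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html_text):
    """Hypothetical helper: return the set of robots directives in a page."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Made-up "coming soon" page that uses the tag from the post above.
page = (
    '<html><head><meta name="robots" content="noindex,nofollow"></head>'
    "<body>Coming soon</body></html>"
)
print(sorted(robots_directives(page)))
```

An empty result means no robots meta tag was found, so crawlers would fall back to their default behaviour for that page.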
shakes, posted October 16, 2003:

ok it was easier than I thought ... basically in your robots.txt you make an entry for each robot you want to filter for and list the areas it is not allowed to scan... so

    User-agent: *
    Disallow: /

would tell all robots to stay out of your entire site.

    User-agent: *
    Disallow: /inc
    Disallow: /secretfolder

would tell all robots to stay out of the 2 listed paths. pretty cool ... think I'll have to try that.
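One way to see how a well-behaved crawler would read the two robots.txt examples is to feed them to Python's standard `urllib.robotparser` module. This is just a sketch under a few assumptions: `build_parser` is a made-up helper, "AnyBot" is a made-up crawler name, and `example.com` is a placeholder domain.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt):
    """Hypothetical helper: parse robots.txt text held in a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The two rule sets from the post above.
block_everything = build_parser("User-agent: *\nDisallow: /\n")
block_two_paths = build_parser(
    "User-agent: *\nDisallow: /inc\nDisallow: /secretfolder\n"
)

# "AnyBot" and example.com are placeholders, not real names from the thread.
print(block_everything.can_fetch("AnyBot", "http://example.com/index.html"))
print(block_two_paths.can_fetch("AnyBot", "http://example.com/inc/header.html"))
print(block_two_paths.can_fetch("AnyBot", "http://example.com/index.html"))
```

The first two checks should come back as not allowed and the last one as allowed. Note that Disallow values are matched as path prefixes, so `Disallow: /inc` also covers a page such as `/include.html`; writing `/inc/` limits the rule to that folder.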
Wilexa, posted October 16, 2003:

The robots.txt and meta tags keep "nice" crawlers out of your site. For a "belt and suspenders" approach, it can be a good idea to put a blank index.html at the top level of your site (i.e. in the public_html directory). Then just use a name like test1.html as your real index page while testing. Security by obscurity! That way, even if there is a link to yoursitename.tld (from a whois directory, for example), the rogue crawlers will be stumped! It also hides things from curious humans.

...dave
Lex (author), posted October 17, 2003:

Thanks Guys! That was exactly what I was looking for. I think I'm gonna like it here!

Lex
TCH-Don, posted October 17, 2003:

Welcome to the family, Lex! Hope you will come back often, to get help or give help. You are always welcome!