I'm new here and new to web hosting in general. I was wondering if there is anything special you need to do to keep your site from being "crawled" and recorded by the search engines until it is completed.


I think I read somewhere a while back that you have a harder time getting ranked if you start with just a simple one-page website (an "under construction" or "coming soon" type of thing).


I can't remember where I read this, though. I seem to remember there were some things you could do to prevent this from happening.




You can do this with META tags ... I just read something about this recently ... let me see if I can find it again...


ok ... found some stuff. there are two ways ... META tags and robots.txt file


for meta tags ... put this tag inside the <head> section of each page you want kept out:


<meta name="robots" content="noindex,nofollow">


this tells search engines not to index the page and not to follow the links on it to reach other pages.


good info on that is here.
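
If you want to double-check that a page really carries the tag before you upload it, something like the following could work. It is only a rough sketch using Python's standard html.parser module; the class name RobotsMetaParser, the helper robots_directives, and the sample page are all made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values may be None if omitted.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for part in content.split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def robots_directives(html_text):
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.directives


# Hypothetical sample page, not taken from any real site.
page = (
    "<html><head>"
    '<meta name="robots" content="noindex,nofollow">'
    "</head><body>Coming soon</body></html>"
)
print(robots_directives(page))
```

A page is marked as "do not index" when "noindex" shows up in the returned set.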


this robots.txt thing I'm not too familiar with ... from what I understand you place a file named robots.txt in your web server's root and search engines will check it for instructions on what to index and what not to index.


little bit of info here.
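
To make the "web server root" part concrete: crawlers look for the file at the top of the host, no matter which page they started from. A small Python sketch of that lookup, where the function name robots_txt_url and the example.com address are just placeholders:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt location for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("http://example.com/blog/post.html?x=1"))
```

So a robots.txt placed in a subfolder would not be picked up; it has to sit next to your top-level index page.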



ok it was easier than I thought ... basically in your robots.txt make entries for each robot you want to filter for and include what areas they are or are not allowed to scan...




User-agent: *
Disallow: /


would stop robots from scanning your entire site.


User-agent: *
Disallow: /inc
Disallow: /secretfolder


would stop all robots from crawling anything under the 2 listed paths. Rules match by prefix, so /inc also covers a path like /include.
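
You can try rules like these out locally before uploading anything. Here is a minimal sketch using Python's standard urllib.robotparser module; the bot name "AnyBot", the example.com address, and the helper build_parser are assumptions for illustration only, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def build_parser(robots_txt):
    """Parse robots.txt text (no network access) and return the parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# The two rule sets from the posts above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

BLOCK_SOME = """\
User-agent: *
Disallow: /inc
Disallow: /secretfolder
"""

site = "http://example.com"  # placeholder host
block_all = build_parser(BLOCK_ALL)
block_some = build_parser(BLOCK_SOME)

# can_fetch(useragent, url) reports whether the rules allow the request.
print(block_all.can_fetch("AnyBot", site + "/index.html"))
print(block_some.can_fetch("AnyBot", site + "/index.html"))
print(block_some.can_fetch("AnyBot", site + "/inc/header.html"))
print(block_some.can_fetch("AnyBot", site + "/secretfolder/a.html"))
```

With the first rule set every path is refused; with the second, only paths that begin with /inc or /secretfolder are refused and the rest of the site stays open.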


pretty cool ... think I'll have to try that.

The robots.txt and meta tags keep "nice" crawlers out of your site.


For a "belt and suspenders" approach, it can be a good idea to put a blank index.html at the top level of your site (i.e., in the public_html directory). Then just use a name like test1.html as your real index page while testing.


Security by obscurity! :D


That way, even if there is a link to yoursitename.tld (from a whois directory, for example), the rogue crawlers will be stumped! Also hides things from curious humans.



