Prevent Search Engine Crawling..?


Faliol

Hello,

 

How do you prevent search engines from crawling certain parts of your site? For example, we are a web design company. If we set up a test server on a subdomain called Acorn (the address would be acorn.******), how would you prevent outside people from viewing the test site? If I recall correctly, you have to use a robots.txt file.
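For reference, a minimal robots.txt that blocks everything on a test subdomain could look like the sketch below (the hostname in the comment is a made-up placeholder, since the real one is starred out above):

```
# robots.txt placed at the root of the test subdomain,
# e.g. http://acorn.example.com/robots.txt (placeholder hostname)
User-agent: *
Disallow: /
```

The file must be served from the root of that subdomain; rules elsewhere are ignored by crawlers.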

 

Thanks


Remember, though, that the robots.txt file only works on well-behaved crawlers. If it's something you really don't want to get out, make sure you password-protect it.
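On Apache, for instance, password protection can be done with HTTP basic auth; a rough sketch, assuming the server allows `.htaccess` overrides for auth settings (the paths and realm name below are made up):

```
# .htaccess in the test site's document root
# (requires AllowOverride AuthConfig in the server config)
AuthType Basic
AuthName "Test Site"
# Password file created beforehand with: htpasswd -c /path/to/.htpasswd someuser
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Unlike robots.txt, this actually refuses the request, so it works regardless of how well-behaved the visitor is.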


  • 3 months later...

The robotstxt.org link first says:

 

when a Robot visits a Web site, say http://www.foobar.com/, it first checks for http://www.foobar.com/robots.txt.

 

But here: http://www.robotstxt.org/wc/exclusion-user.html

it says:

 

If you rent space for your HTML files on the server of your Internet Service Provider, or another third party, you are usually not allowed to install or modify files in the top-level of the server's document space.

 

This means that to use the Robots Exclusion Protocol, you have to liaise with the server administrator, and get him/her to add the rules to the "/robots.txt", using the Web Server Administrator's Guide to the Robots Exclusion Protocol.

 

These seem to conflict. Can I just put the robots.txt in the root of my sites?
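For what it's worth, a well-behaved crawler always resolves "/robots.txt" against the site root, however the file got there. Python's standard urllib.robotparser can sketch how such a crawler applies the rules (the hostname is a placeholder standing in for the starred-out one from the first post):

```python
from urllib.robotparser import RobotFileParser

# Rules equivalent to a robots.txt that blocks the whole site
rules = [
    "User-agent: *",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(rules)

# A well-behaved crawler asks this before fetching any page
# (acorn.example.com is a made-up placeholder hostname)
print(parser.can_fetch("*", "http://acorn.example.com/index.html"))  # prints False
```

With `Disallow: /` every path is off limits, so `can_fetch` returns False for any page on the site.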

