Posted

Does anyone use any cloaking software to keep spiders from grabbing all the email addresses off your website? :rolleyes:

 

I imagine if they have created software to pull them off, they've created software to FOOL them...so anyone in the TCH Family have recommendations or advice? :dance:

 

Thanks in advance! woooot

Posted

I am not an expert and I am positive there are smarter ways to do it - someone will let us know for sure .... but in the meantime, this works just fine:

 

<script type="text/javascript">
<!--
// Build the link from pieces so the full address never appears in the page source.
document.write('<a href="mailto:' + 'anything' + '@' + 'domain.tld">Email Me</a>');
//-->
</script>

 

The concept is that the spiders are looking for the pattern *@*.* in the page source. If your code never contains that pattern, they don't pick it up.

 

It doesn't affect the way people see your site or how the mailto: link works.
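For anyone wondering how to reuse that trick for more than one link, here is a rough sketch of the same idea wrapped in a function. The name buildMailtoLink and its parameters are made up for illustration, not part of the snippet above. As far as I can tell, each mailto link needs its own call, since the script only writes a link at the spot where it runs.

```javascript
// Hypothetical helper: assembles a mailto link from separate pieces so the
// full address never appears as one literal string in the page source.
function buildMailtoLink(user, domain, label) {
  const address = user + '@' + domain; // joined only when the script runs
  return '<a href="mailto:' + address + '">' + label + '</a>';
}

// In a page you could write the result where the link should appear, e.g.
//   document.write(buildMailtoLink('anything', 'domain.tld', 'Email Me'));
console.log(buildMailtoLink('anything', 'domain.tld', 'Email Me'));
```

Visitors with JavaScript turned off would not see the link at all, so a plain-text fallback may be worth considering.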

Posted

Many of us are pretty new to this codewriting stuff.... so please forgive what is probably a simplistic question...

 

You guys are so great and generous to suggest and share helpful code, but I'm not exactly sure what to do with it! Do I simply copy and paste it into the html anywhere on the site? Do I need it everywhere I have a mailto link or is once (say on the index page) enough? Is it a universal command or specific to a single link?

 

I'd love to be able to use some of the codes I've seen on this board, but... honestly I'm a bit afraid of messing up my whole site. I'm a wysiwyg kind of gal and if I can't see it, I tend to run away from it. :) But this spam thing would be great to have so if someone can tell me how to use it, I'll stop running and give it a try...

Posted

Using the above link, you might try the "simpler method" link: type in your address, copy the HTML code, and use it to replace the code you are using for your "contact me" link.

Or just copy the converted part. For name@site.com that is

%6E%61%6D%65%40%73%69%74%65%2E%63%6F%6D

and insert it with your program the same way you created the email link: just paste the result when you get to the part where you would type in your address.
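If you would rather produce the converted form yourself, the conversion appears to be plain percent-encoding of every character. A minimal sketch, assuming an ASCII-only address; the function name percentEncodeAll is my own and not taken from the linked tool:

```javascript
// Hypothetical helper: percent-encodes EVERY character of an address,
// including letters that normally would not need escaping.
// Assumption: the address contains only ASCII characters.
function percentEncodeAll(address) {
  return Array.from(address)
    .map((ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

const encoded = percentEncodeAll('name@site.com');
console.log(encoded);
// The browser reverses the encoding when the link is clicked:
console.log(decodeURIComponent(encoded));
```

Because decoding is a single standard call, a harvester that bothers to decode URLs can recover the address just as easily.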

Posted

"You can fool some of the spiders some of the time but not all of the spiders all of the time."

 

Collecting email addresses is a business, thus there are those that spend time figuring out ways to bypass email hiding techniques. I use the javascript technique on my pages (see http://www.jsnmp.com/contactus.html) and the associated JavaScript file.

 

Note that the latest scheme (I found this out on a job interview at McAfee) is to simply blast a domain with common user names and take out all of those that bounce. Thus, email names such as support, postmaster, webmaster, hostmaster, sales, joe.blow, jblow, etc. will always get spam.

 

jim

Posted

This is one of those questions that has a million and one ways around it, and of course I use the easiest and most inconvenient way. Say you have your "e-mail me!" link on your site.

 

><a href="mailto:mitch@totalchoicehosting.com">e-mail me!</a>

 

You could just place in there:

 

><a href="mailto:mitchNOSPAM@totalchoicehosting.com">e-mail me!</a>

 

Of course anybody who clicks the link would have to edit out the "NOSPAM" part. Now I didn't say this is foolproof or the perfect way of doing things, but it is one more way to get this task done for the most part.

Posted

Remember that if you use a java or javascript solution the person viewing your page may have those turned off in their browser and may see nothing at all. This is pretty rare but if you have a commercial site where sales would be at stake then it's worth noting.

 

As "the other Jim" says, e-mail harvesting is a business and they get smarter every time. That's why I personally think things like SpamAssassin and Bayesian filters (like in Mozilla's Thunderbird program) are our best bet.

 

For what it's worth, I used a java trick on one of my sites and the spams dropped from about 50 per day to about 4 within a week so it definitely helps!

 

Best wishes and if you have any problems just ask!

  • 1 month later...
Posted (edited)

Wow!

I have marked that one; it seems to be the best so far.

I have found that spammers can read the ones encoded like

email@&#109;ysite&#46;com

So this is a big improvement.

Thanks caffeine Thumbs Up
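To show why the entity-encoded form is easy for a harvester to read, here is a small sketch. Both function names are hypothetical, and it only handles decimal entities like the ones in the example above:

```javascript
// Hypothetical helpers illustrating decimal HTML entities such as &#109; ("m").
function toNumericEntities(text) {
  return Array.from(text)
    .map((ch) => '&#' + ch.codePointAt(0) + ';')
    .join('');
}

// Undoing the encoding takes a single regular-expression replace.
function fromNumericEntities(text) {
  return text.replace(/&#(\d+);/g, (match, code) => String.fromCodePoint(Number(code)));
}

console.log(toNumericEntities('m.'));
console.log(fromNumericEntities('email@&#109;ysite&#46;com'));
```

Since the decoding step is that short, it seems reasonable to assume any serious harvester includes it.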

Edited by TCH-Don
Posted

Another alternative is to block known e-mail harvesters via .htaccess

 

>RewriteEngine on

#The next lines check for Email Spammers Robots and redirect them to a fake page
RewriteCond %{HTTP_USER_AGENT} ^Alexibot                [OR]
RewriteCond %{HTTP_USER_AGENT} ^asterias                [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackDoorBot             [OR]
RewriteCond %{HTTP_USER_AGENT} ^Black.Hole              [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow              [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlowFish                [OR]
RewriteCond %{HTTP_USER_AGENT} ^BotALot                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^BuiltBotTough           [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bullseye                [OR]
RewriteCond %{HTTP_USER_AGENT} ^BunnySlippers           [OR]
RewriteCond %{HTTP_USER_AGENT} ^Cegbfeieh               [OR]
RewriteCond %{HTTP_USER_AGENT} ^CheeseBot               [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker            [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw               [OR]
RewriteCond %{HTTP_USER_AGENT} ^CopyRightCheck          [OR]
RewriteCond %{HTTP_USER_AGENT} ^cosmos                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent                [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^DittoSpyder             [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon         [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber              [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector          [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon             [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf               [OR]
RewriteCond %{HTTP_USER_AGENT} ^EroCrawler              [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures    [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro            [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE                [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet                [OR]
RewriteCond %{HTTP_USER_AGENT} ^Foobot                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^FrontPage               [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight                [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb!                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It         [OR]
RewriteCond %{HTTP_USER_AGENT} ^Googlebot-Image         [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla                [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Harvest                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^hloader                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^httplib                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack                 [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^humanlinks              [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver             [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper         [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker           [OR]
RewriteCond %{HTTP_USER_AGENT} ^Indy\ Library           [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InfoNaviRobot           [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET                [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja         [OR]
RewriteCond %{HTTP_USER_AGENT} ^JennyBot                [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider        [OR]
RewriteCond %{HTTP_USER_AGENT} ^Kenjin.Spider           [OR]
RewriteCond %{HTTP_USER_AGENT} ^Keyword.Density         [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP                [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^libWeb/clsHTTP          [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkextractorPro        [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkScan/8.1a.Unix      [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker              [OR]
RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial             [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader        [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mata.Hari               [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL           [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool            [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIIxpc                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister.PiX              [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX             [OR]
RewriteCond %{HTTP_USER_AGENT} ^moget                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/2               [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01  [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT           [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite                [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic             [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider               [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire            [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO               [OR]
RewriteCond %{HTTP_USER_AGENT} ^NPBot                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline.Explorer        [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer       [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator      [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind                [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber             [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto              [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser               [OR]
RewriteCond %{HTTP_USER_AGENT} ^ProPowerBot/2.14        [OR]
RewriteCond %{HTTP_USER_AGENT} ^ProWebWalker            [OR]
RewriteCond %{HTTP_USER_AGENT} ^QueryN.Metasearch       [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^RepoMonkey              [OR]
RewriteCond %{HTTP_USER_AGENT} ^RMA                     [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger             [OR]
RewriteCond %{HTTP_USER_AGENT} ^SlySearch               [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload           [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpankBot                [OR]
RewriteCond %{HTTP_USER_AGENT} ^spanner                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot                [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP               [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^suzuran                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Szukacz/1.4             [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport                [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro           [OR]
RewriteCond %{HTTP_USER_AGENT} ^Telesoft                [OR]
RewriteCond %{HTTP_USER_AGENT} ^The.Intraformant        [OR]
RewriteCond %{HTTP_USER_AGENT} ^TheNomad                [OR]
RewriteCond %{HTTP_USER_AGENT} ^TightTwatBot            [OR]
RewriteCond %{HTTP_USER_AGENT} ^Titan                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^toCrawl/UrlDispatcher   [OR]
RewriteCond %{HTTP_USER_AGENT} ^True_Robot              [OR]
RewriteCond %{HTTP_USER_AGENT} ^turingos                [OR]
RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot/1.5         [OR]
RewriteCond %{HTTP_USER_AGENT} ^URLy.Warning            [OR]
RewriteCond %{HTTP_USER_AGENT} ^VCI                     [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto                 [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebBandit               [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier               [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.*        [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEnhancer             [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch                [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS               [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.Image.Collector     [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector   [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher              [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebmasterWorldForumBot  [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper               [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger               [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor      [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website.Quester         [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester        [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster.Pro             [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper             [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker             [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker              [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZip                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget                    [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow                   [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit         [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWW-Collector-E         [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE                [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider       [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu's                  [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ emailsforyou.php  [L]
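For those not used to reading rewrite rules, the logic of that list can be sketched in a few lines of JavaScript. This is only an illustration of the matching, not a replacement for the .htaccess file. The array holds a handful of the patterns from the list, and the name isBlockedAgent is made up:

```javascript
// A few patterns copied from the list above. "^" anchors the match to the
// start of the User-Agent string; the "i" flag plays the role of [NC].
const blockedAgents = [
  /^EmailCollector/,
  /^EmailSiphon/,
  /^EmailWolf/,
  /^WebEMailExtrac.*/,
  /^FrontPage/i,
];

// Hypothetical helper: true when any pattern matches, which is what the
// chain of [OR] conditions amounts to.
function isBlockedAgent(userAgent) {
  return blockedAgents.some((pattern) => pattern.test(userAgent));
}

console.log(isBlockedAgent('EmailSiphon'));
console.log(isBlockedAgent('Mozilla/5.0 (Windows)'));
```

Keep in mind that a robot can send any User-Agent string it likes, so this only stops harvesters that identify themselves honestly.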

  • 2 months later...
Posted

I realize this post is a few months old, but I want to add my two cents in case it's of interest to anyone.

 

The way I've gotten around the problem of spam bots finding my email address on a web page is by using a PHP form mail program. There are many out there; you just have to find one that you like.

 

My link may look something like this:

><a href="mailpage.htm">Contact Us</a>

 

When users click the "Contact Us" link and arrive at mailpage.htm, they see a form with input fields for name, email address, subject, and a large textarea for entering their message. When they click the "submit" button, my PHP form mail script handles the processing of the email and sends it off to me at the email address I specified within the form mail script.

 

By using a form and a PHP program for processing the email, I have avoided posting my email address in the actual text of the HTML pages so the spam bots can't find it.
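The core of the form mail idea is that the recipient address lives only in server-side code. Here is a rough sketch of that part, written in JavaScript rather than PHP just to keep the examples in this thread in one language. The names RECIPIENT and buildMessage are made up, and the actual sending step is left out:

```javascript
// Assumption: this runs on the server, so the address below is never sent
// to the visitor's browser (or to a spam bot).
const RECIPIENT = 'youraddress@yoursite.com';

// Hypothetical helper: turns submitted form fields into a message object.
function buildMessage(form) {
  // Collapse line breaks in single-line fields so a visitor cannot
  // smuggle extra mail headers into them.
  const oneLine = (value) => String(value || '').replace(/[\r\n]+/g, ' ').trim();
  return {
    to: RECIPIENT,
    replyTo: oneLine(form.email),
    subject: oneLine(form.subject),
    text: 'From: ' + oneLine(form.name) + '\n\n' + String(form.message || ''),
  };
}

console.log(buildMessage({
  name: 'Visitor',
  email: 'visitor@example.com',
  subject: 'Hello',
  message: 'Nice site!',
}));
```

The line-break stripping is my own addition and not something described in the post; most ready-made form mail scripts probably handle that in their own way.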

 

I suppose if you wanted to make sure that users could actually "see" your email address, you could encode the email address inside a PHP function, like this...

>You may contact us via email at <a href="mailpage.htm"><?php include("displayEmailAddress.php") ?></a>.

 

where the code for displayEmailAddress.php was this:

><?php echo("youraddress@yoursite.com"); ?>

 

This still allows users to see your email address, but because the email address won't be visible until the page loads and the call to displayEmailAddress.php is executed, the spam bots can't see it.

 

I hope this little tidbit was helpful to someone -- it may not be the best way, but it's been working for me so far! Also, if spam bots ever get smart enough to be able to read email addresses from inside scripts that are directly coded onto a web page, like this ...

>You may contact us via email at <?php echo("youraddress@yoursite.com") ?>.

my email address is still hidden away in a separate file that I can control access to.

 

;) Kasey

Posted

I want to preface this by saying, I don't know everything there is to know about php... and this is in no way an attack... just my differing viewpoint.

 

I have serious doubts as to whether the second part of Kasey's strategy would work. PHP does things on the server and then sends html to the visitor's browser. That being the case, a spam bot browsing through web pages is going to see the email address just as surely as your human visitors will. There is no difference.

 

By the time anything is sent to the browser, php has already done everything it's going to do (actually, the server does it).

 

If someone wants to point out the fault in my logic, I will gladly chalk this up as a new lesson learned. Until then, I would stick to the first part of Kasey's advice, which is very sound.
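To make the point concrete, here is a rough sketch of what a naive harvester might do. The regular expression and the function name extractEmails are my own guesses, not taken from any real bot. The first string stands for the HTML a server would send after PHP has run; the second is the source of the JavaScript trick from earlier in the thread.

```javascript
// Hypothetical harvester: pulls anything shaped like an address out of text.
function extractEmails(text) {
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  return text.match(pattern) || [];
}

// What the browser (or a bot) receives once PHP has echoed the address.
const servedByPhp = '<a href="mailpage.htm">youraddress@yoursite.com</a>';

// What it receives when the address is split up inside a script.
const splitInScript =
  "document.write('<a href=\"mailto:' + 'anything' + '@' + 'domain.tld\">Email Me</a>');";

console.log(extractEmails(servedByPhp));
console.log(extractEmails(splitInScript));
```

Under these assumptions the PHP output gives the address away while the split-up script does not, although a harvester that actually runs the JavaScript would presumably still find it.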

Posted

You are probably right, TCH-Jack.

 

Thanks for correcting me. I'm certainly no expert on PHP, either, and I don't know the exact details of how pages are served to a browser from the server itself, so perhaps I shouldn't have spoken so soon. What I have been told (perhaps erroneously) is that spam bots currently do not have the "brains" required to harvest data from within scripts, which is why I thought that coding it within a PHP script would prevent it from being seen by the bots. I was not aware that the page would be served up to the bot in the same way it's served to the web browser. That makes a world of difference because it means that when the bot requests to see the web page, the server would execute the PHP and offer the page to the bot as if it were serving it to a browser, so the email address would be out there just as plain as day.

 

I certainly appreciate your clarification of my assumption. It's much better to be correctly informed than to go around operating on assumptions and information that are only partially true or only work in certain situations.

 

This is one of the reasons I'm so happy with these forums and with TCH. My knowledge base is expanding almost exponentially because I have so many "mentors" that freely share their expertise with us all!

 

So, thanks again for correcting me, and thanks for educating me!

 

;) Kasey
