Cheryl Posted October 18, 2003

Does anyone use cloaking software to keep spiders from grabbing all the email addresses off your website? I imagine that if someone has created software to pull them off, someone has also created software to FOOL it... so does anyone in the TCH Family have recommendations or advice? Thanks in advance!
Aeroknight Posted October 18, 2003

I am not an expert, and I am positive there are smarter ways to do it (someone will let us know for sure), but in the meantime this works just fine:

<script language="javascript">
<!--
document.write('<a href=mailto:'+'anything'+'@'+'domain.tld>Email Me</a>')
//-->
</script>

The concept is that the spiders are looking for the syntax *@*.* in the page source. If you don't obey that syntax in your coding, they don't pick it up. It doesn't affect the way people see your site or how the mailto: link functions.
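For anyone who wants to see the idea spelled out, here is a minimal sketch of the same split-string trick wrapped in a function. The function name buildMailtoLink and its parameters are made up for illustration and are not part of the original post; the only assumption is that harvesters read the raw page source and do not run scripts.

```javascript
// Sketch of the split-string idea: the address never appears whole in the page source.
// buildMailtoLink and its parameters are hypothetical names, not from the original post.
function buildMailtoLink(user, domain, label) {
  // The parts are joined only at run time, in the visitor's browser.
  var address = user + "@" + domain;
  return '<a href="mailto:' + address + '">' + label + "</a>";
}

// In a browser, write the link into the page at the spot where the script tag sits.
if (typeof document !== "undefined") {
  document.write(buildMailtoLink("anything", "domain.tld", "Email Me"));
}
```

Each mailto link needs its own call; the script is not a site-wide switch. A harvester that actually executes JavaScript would still see the finished link.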
Virtual Imager Posted October 18, 2003

Many of us are pretty new to this codewriting stuff, so please forgive what is probably a simplistic question. You guys are so great and generous to suggest and share helpful code, but I'm not exactly sure what to do with it!

Do I simply copy and paste it into the HTML anywhere on the site? Do I need it everywhere I have a mailto link, or is once (say on the index page) enough? Is it a universal command or specific to a single link?

I'd love to be able to use some of the code I've seen on this board, but honestly I'm a bit afraid of messing up my whole site. I'm a WYSIWYG kind of gal, and if I can't see it, I tend to run away from it. But this spam thing would be great to have, so if someone can tell me how to use it, I'll stop running and give it a try...
TCH-Don Posted October 18, 2003

Using the above link, you might try the simpler method: type in your address, copy the HTML code, and replace the code you are using for your contact-me link.

Or just copy the converted name@site.com part,

%6E%61%6D%65%40%73%69%74%65%2E%63%6F%6D

and insert it with your program as you did to create the email link: just paste the result when you get to the part where you would type in your address.
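In case it helps to see where a string like that comes from, here is a small sketch of the conversion: every character is replaced by a percent sign and its two-digit hex code. The function name percentEncodeAll is hypothetical, and the sketch assumes a plain ASCII address.

```javascript
// Percent-encode every character of an ASCII string as %XX (uppercase hex).
// percentEncodeAll is a hypothetical helper name used only for this illustration.
function percentEncodeAll(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    var hex = text.charCodeAt(i).toString(16).toUpperCase();
    // Pad single-digit codes so every character becomes exactly %XX.
    out += "%" + (hex.length < 2 ? "0" + hex : hex);
  }
  return out;
}

var encoded = percentEncodeAll("name@site.com");
// encoded is "%6E%61%6D%65%40%73%69%74%65%2E%63%6F%6D"
```

The encoded text goes after mailto: in the link's href. Decoding is a single standard call (decodeURIComponent), so this only stops harvesters that look for a literal @ sign.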
jpickeri Posted October 18, 2003

"You can fool some of the spiders some of the time but not all of the spiders all of the time."

Collecting email addresses is a business, so there are people who spend their time figuring out ways to bypass email hiding techniques. I use the JavaScript technique on my pages (see http://www.jsnmp.com/contactus.html and the associated JavaScript file).

Note that the latest scheme (I found this out on a job interview at McAfee) is to simply blast a domain with common user names and take out all of those that bounce. Thus, email names such as support, postmaster, webmaster, hostmaster, sales, joe.blow, jblow, etc. will always get spam.

jim
TCH-Sales Posted October 18, 2003

This is one of those questions that has a million and one ways around it, and of course I use the easiest and most inconvenient way.

Say you have your "e-mail me!" link on your site:

<a href="mailto:mitch@totalchoicehosting.com">e-mail me!</a>

You could just place this in there instead:

<a href="mailto:mitchNOSPAM@totalchoicehosting.com">e-mail me!</a>

Of course, anybody who clicks the link would have to edit out the "NOSPAM" part. I didn't say this is foolproof or the perfect way of doing things, but it is one more way to get the task done for the most part.
dreamcloudwolf Posted October 18, 2003

Some people spell out the whole email address in words. Example: myname at yahoo dot com

For those who use PHP, there's a PHP email script encoder that I like. It can be found at: http://embimedia.com/resources/labs/php-em...il-encoder.html
Deverill Posted October 18, 2003

Remember that if you use a Java or JavaScript solution, the person viewing your page may have those turned off in their browser and may see nothing at all. This is pretty rare, but if you have a commercial site where sales would be at stake, it's worth noting.

As "the other Jim" says, e-mail harvesting is a business and the harvesters get smarter every time. That's why I personally think things like SpamAssassin and Bayesian filters (like the one in Mozilla's Thunderbird program) are our best bet.

For what it's worth, I used a java trick on one of my sites and the spam dropped from about 50 per day to about 4 within a week, so it definitely helps!

Best wishes, and if you have any problems just ask!
TCH-Rob Posted November 23, 2003

I like this way: alicorna.com/obfuscator.html

Edited November 23, 2003 by TCH-Rob
llama_thumper Posted November 25, 2003

Aeroknight (or anyone), how can you set the font size, etc., of the link that JavaScript writes? I've been able to set the font size, but the link remains underlined and white in colour instead of the colour I want...
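One possible approach, offered as a sketch rather than a tested answer for any particular page: since the script writes an ordinary <a> tag, an inline style attribute on that tag can set the colour, size and underline. The function name and the specific colour and size values below are placeholders, not anything from the original post.

```javascript
// Same split-string link as earlier in the thread, with an inline style attribute added.
// buildStyledMailtoLink and the style values are hypothetical; substitute your own.
function buildStyledMailtoLink(user, domain, label) {
  var style = "color:#336699; font-size:12px; text-decoration:none;";
  return '<a href="mailto:' + user + "@" + domain + '" style="' + style + '">' + label + "</a>";
}

// In a browser, write the styled link into the page.
if (typeof document !== "undefined") {
  document.write(buildStyledMailtoLink("anything", "domain.tld", "Email Me"));
}
```

If the link still shows up white, a stylesheet rule for links elsewhere on the page may be involved, although an inline style normally takes precedence over such rules.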
caffeine Posted November 25, 2003

I think one of the best methods for doing this comes from Dan Benjamin: http://www.hiveware.com/enkoder_form.php
TCH-Don Posted November 25, 2003

Wow! I have bookmarked that one; it seems to be the best so far. I have found that spammers can read the ones encoded like email@&#109;ysite&#46;com, so this is a big improvement. Thanks, caffeine.

Edited November 25, 2003 by TCH-Don
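To illustrate why entity-encoded addresses are easy to read back, here is a rough sketch of both directions. The helper names are invented for this example, and the decoder only handles decimal entities of the &#NNN; form, which is an assumption about how the address was encoded.

```javascript
// Encode every character as a decimal HTML entity, e.g. "m" becomes "&#109;".
// Both helper names are hypothetical, used only for this illustration.
function toNumericEntities(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    out += "&#" + text.charCodeAt(i) + ";";
  }
  return out;
}

// What a harvester has to do to undo it: one regular-expression replace.
function fromNumericEntities(html) {
  return html.replace(/&#(\d+);/g, function (match, code) {
    return String.fromCharCode(Number(code));
  });
}

var recovered = fromNumericEntities("email@&#109;ysite&#46;com");
// recovered is "email@mysite.com"
```

Because the decoding step is a one-liner, entity encoding mostly filters out the simplest harvesters.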
TCH-Dick Posted November 25, 2003

Another alternative is to block known e-mail harvesters via .htaccess:

RewriteEngine on
#The next lines check for Email Spammers Robots and redirect them to a fake page
RewriteCond %{HTTP_USER_AGENT} ^Alexibot [OR]
RewriteCond %{HTTP_USER_AGENT} ^asterias [OR]
RewriteCond %{HTTP_USER_AGENT} ^BackDoorBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Black.Hole [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
RewriteCond %{HTTP_USER_AGENT} ^BlowFish [OR]
RewriteCond %{HTTP_USER_AGENT} ^BotALot [OR]
RewriteCond %{HTTP_USER_AGENT} ^BuiltBotTough [OR]
RewriteCond %{HTTP_USER_AGENT} ^Bullseye [OR]
RewriteCond %{HTTP_USER_AGENT} ^BunnySlippers [OR]
RewriteCond %{HTTP_USER_AGENT} ^Cegbfeieh [OR]
RewriteCond %{HTTP_USER_AGENT} ^CheeseBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
RewriteCond %{HTTP_USER_AGENT} ^CopyRightCheck [OR]
RewriteCond %{HTTP_USER_AGENT} ^cosmos [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
RewriteCond %{HTTP_USER_AGENT} ^DittoSpyder [OR]
RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^EroCrawler [OR]
RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Foobot [OR]
RewriteCond %{HTTP_USER_AGENT} ^FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
RewriteCond %{HTTP_USER_AGENT} ^Googlebot-Image [OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
RewriteCond %{HTTP_USER_AGENT} ^Harvest [OR]
RewriteCond %{HTTP_USER_AGENT} ^hloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
RewriteCond %{HTTP_USER_AGENT} ^httplib [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^humanlinks [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^Indy\ Library [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InfoNaviRobot [OR]
RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
RewriteCond %{HTTP_USER_AGENT} ^JennyBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Kenjin.Spider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Keyword.Density [OR]
RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LexiBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^libWeb/clsHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkextractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkScan/8.1a.Unix [OR]
RewriteCond %{HTTP_USER_AGENT} ^LinkWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mata.Hari [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
RewriteCond %{HTTP_USER_AGENT} ^MIIxpc [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister.PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
RewriteCond %{HTTP_USER_AGENT} ^moget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/2 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01 [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetMechanic [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^NPBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline.Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
RewriteCond %{HTTP_USER_AGENT} ^Openfind [OR]
RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
RewriteCond %{HTTP_USER_AGENT} ^ProPowerBot/2.14 [OR]
RewriteCond %{HTTP_USER_AGENT} ^ProWebWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^ProWebWalker [OR]
RewriteCond %{HTTP_USER_AGENT} ^QueryN.Metasearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
RewriteCond %{HTTP_USER_AGENT} ^RepoMonkey [OR]
RewriteCond %{HTTP_USER_AGENT} ^RMA [OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
RewriteCond %{HTTP_USER_AGENT} ^SlySearch [OR]
RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
RewriteCond %{HTTP_USER_AGENT} ^SpankBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^spanner [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
RewriteCond %{HTTP_USER_AGENT} ^suzuran [OR]
RewriteCond %{HTTP_USER_AGENT} ^Szukacz/1.4 [OR]
RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
RewriteCond %{HTTP_USER_AGENT} ^The.Intraformant [OR]
RewriteCond %{HTTP_USER_AGENT} ^TheNomad [OR]
RewriteCond %{HTTP_USER_AGENT} ^TightTwatBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Titan [OR]
RewriteCond %{HTTP_USER_AGENT} ^toCrawl/UrlDispatcher [OR]
RewriteCond %{HTTP_USER_AGENT} ^toCrawl/UrlDispatcher [OR]
RewriteCond %{HTTP_USER_AGENT} ^True_Robot [OR]
RewriteCond %{HTTP_USER_AGENT} ^turingos [OR]
RewriteCond %{HTTP_USER_AGENT} ^TurnitinBot/1.5 [OR]
RewriteCond %{HTTP_USER_AGENT} ^URLy.Warning [OR]
RewriteCond %{HTTP_USER_AGENT} ^VCI [OR]
RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebBandit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEnhancer [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web.Image.Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebmasterWorldForumBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website.Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
RewriteCond %{HTTP_USER_AGENT} ^Webster.Pro [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZip [OR]
RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWW-Collector-E [OR]
RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Xenu's [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus
RewriteRule ^.*$ emailsforyou.php [L]
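For readers who are not used to mod_rewrite, the logic of those rules can be sketched in a few lines of JavaScript: each RewriteCond asks whether the visitor's User-Agent header starts with a given name, the [OR] flags chain the checks together, and a match sends the request to the fake page. The short prefix list below is copied from the rules above; the function name is hypothetical, and this is only an illustration of the matching, not a replacement for the .htaccess file.

```javascript
// A handful of prefixes taken from the rules above; the real list is much longer.
var blockedPrefixes = ["EmailCollector", "EmailSiphon", "EmailWolf", "ExtractorPro", "WebBandit"];

// isBlockedUserAgent is a hypothetical name used only for this illustration.
function isBlockedUserAgent(userAgent) {
  for (var i = 0; i < blockedPrefixes.length; i++) {
    // "^Name" in a RewriteCond means "starts with Name" (case-sensitive unless [NC] is set).
    if (userAgent.indexOf(blockedPrefixes[i]) === 0) {
      return true;
    }
  }
  return false;
}

var blocked = isBlockedUserAgent("EmailSiphon/1.0");      // true
var allowed = isBlockedUserAgent("Mozilla/4.0 (compatible)"); // false
```

The second call shows the limit of the approach: the User-Agent header is whatever the client chooses to send, so a harvester that identifies itself as an ordinary browser is not caught.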
KevinW Posted November 25, 2003

Questions about hiding one's email address come up often enough that I have posted the information in this message thread, along with some examples, to the TCH Help Web site. Click here to view!

-kw
kaseytraeger Posted February 10, 2004

I realize this post is a few months old, but I want to add my two cents in case it's of interest to anyone.

The way I've gotten around the problem of spam bots finding my email address on a web page is by using a PHP form mail program. There are many out there; you just have to find one that you like. My link may look something like this:

<a href="mailpage.htm">Contact Us</a>

When users click the "Contact Us" link and arrive at mailpage.htm, they see a form with input fields for name, email address, and subject, and a large textarea for entering their message. When they click the "submit" button, my PHP form mail script handles the processing of the email and sends it off to me at the email address I specified within the form mail script. By using a form and a PHP program for processing the email, I have avoided posting my email address in the actual text of the HTML pages, so the spam bots can't find it.

I suppose if you wanted to make sure that users could actually "see" your email address, you could encode the email address inside a PHP function, like this...

You may contact us via email at <a href="mailpage.htm"><?php include("displayEmailAddress.php") ?></a>.

where the code for displayEmailAddress.php was this:

<?php echo("youraddress@yoursite.com"); ?>

This still allows users to see your email address, but because the email address won't be visible until the page loads and the call to displayEmailAddress.php is executed, the spam bots can't see it.

I hope this little tidbit was helpful to someone. It may not be the best way, but it's been working for me so far!

Also, if spam bots ever get smart enough to be able to read email addresses from inside scripts that are directly coded onto a web page, like this...

You may contact us via email at <?php echo("youraddress@yoursite.com") ?>.

my email address is still hidden away in a separate file that I can control access to.

Kasey
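Kasey's form mail script itself is PHP and isn't shown in the post, so the following is only a rough sketch of the first idea (the form mail approach) in JavaScript, as it might look on a server that runs JavaScript. Everything here is an assumption for illustration: the function name, the field names, and the shape of the returned message. The point it shows is that the recipient address lives only in server-side code and never appears in the HTML sent to visitors.

```javascript
// Hypothetical recipient, kept only in server-side code and never sent to the browser.
var RECIPIENT = "youraddress@yoursite.com";

// Turn submitted form fields into a message object; returns null if a required field is empty.
// buildContactMessage and the field names are assumptions, not Kasey's actual script.
function buildContactMessage(form) {
  // Strip CR/LF from single-line fields so a visitor cannot inject extra mail headers.
  function oneLine(value) {
    return String(value || "").replace(/[\r\n]+/g, " ").trim();
  }
  var name = oneLine(form.name);
  var email = oneLine(form.email);
  var subject = oneLine(form.subject);
  var body = String(form.message || "").trim();
  if (!name || !email || !body) {
    return null; // a required field is missing
  }
  return {
    to: RECIPIENT,
    replyTo: email,
    subject: subject || "(no subject)",
    text: "From: " + name + " <" + email + ">\n\n" + body
  };
}
```

Actually sending the message is left out on purpose, since that depends on whatever mail facility the host provides.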
surefire Posted February 10, 2004

I want to preface this by saying that I don't know everything there is to know about PHP, and this is in no way an attack, just my differing viewpoint.

I have serious doubts as to whether the second part of Kasey's strategy would work. PHP does things on the server and then sends HTML to the visitor's browser. That being the case, a spam bot browsing through web pages is going to see the email address just as surely as your human visitors will. There is no difference. By the time anything is sent to the browser, PHP has already done everything it's going to do (actually, the server does it).

If someone wants to point out the fault in my logic, I will gladly chalk this up as a new lesson learned. Until then, I would stick to the first part of Kasey's advice, which is very sound.
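The point about server-side processing can be demonstrated with a small simulation. This is a sketch in JavaScript rather than PHP, and both functions are stand-ins invented for the example: one plays the server filling in the page before sending it, the other plays a very simple harvester scanning whatever HTML it receives.

```javascript
// Stand-in for what PHP does: the page is filled in on the server before anything is sent.
function renderPageOnServer() {
  var address = "youraddress@yoursite.com"; // would come from displayEmailAddress.php
  return "<p>You may contact us via email at " + address + ".</p>";
}

// Stand-in for a very simple harvester: scan the received HTML for address-like text.
function harvestAddresses(html) {
  return html.match(/[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+/g) || [];
}

var found = harvestAddresses(renderPageOnServer());
// found is ["youraddress@yoursite.com"]
```

The harvester never sees the PHP source, only the finished HTML, and the finished HTML contains the address in plain text.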
kaseytraeger Posted February 10, 2004

You are probably right, TCH-Jack. Thanks for correcting me. I'm certainly no expert on PHP, either, and I don't know the exact details of how pages are served to a browser from the server itself, so perhaps I shouldn't have spoken so soon.

What I have been told (perhaps erroneously) is that spam bots currently do not have the "brains" required to harvest data from within scripts, which is why I thought that coding it within a PHP script would prevent it from being seen by the bots. I was not aware that the page would be served up to the bot in the same way it's served to the web browser. That makes a world of difference, because it means that when the bot requests the web page, the server executes the PHP and offers the page to the bot as if it were serving it to a browser, so the email address would be out there just as plain as day.

I certainly appreciate your clarification of my assumption. It's much better to be correctly informed than to go around operating on assumptions and information that are only partially true or only work in certain situations. This is one of the reasons I'm so happy with these forums and with TCH. My knowledge base is expanding almost exponentially because I have so many "mentors" who freely share their expertise with us all!

So, thanks again for correcting me, and thanks for educating me!

Kasey