Contact Form Spam


TheCanadian
Hi all!

 

I have contact forms on several pages that I use in lieu of mailto: links (so as not to get on spam lists), but I still get a fair amount of spam sent through the contact forms. Most of it I can filter out, and as far as I know, I'm protected up the ying-yang (spambot traps, email header injection checks, spam filtering, etc.), but I still get a lot of nonsensical messages sent through my forms. In most cases, they aren't even spam-like -- just random junk.

 

Here's a sample of what most of the incoming messages look like:

 

From: fdagg@hotmail.com (Meteor)
Subject: Greetings from your website...
Message-Id: <E1GuTEt-0005TU-AG@server329.tchmachines.com>
Date: Wed, 13 Dec 2006 07:29:35 -0500

adult
D

---------------------------------------------------------------------------
Security Question:
Is the moon made of cheese? No
---------------------------------------------------------------------------
REMOTE_USER: 
REMOTE_IDENT: 
REMOTE_HOST: 221x114x194x12.ap221.ftth.ucom.ne.jp
REMOTE_ADDR: 221.114.194.12
HTTP_USER_AGENT: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)
REQUEST_METHOD: POST

 

I removed the "To:" line because it's my address. I use the variables at the bottom to track who sent the message. My email is hardcoded into the script for the To field, and so is the subject. All other header fields that originate from the form (only name & email) are filtered to remove anything that shouldn't be there. For the name, anything that isn't alphanumeric or _ - . & # ! , is removed (and logged, so I can see if an injection attack was attempted), and for the from email, anything not alphanumeric or _ ~ - . @ is removed and logged. That security question is part of a series of bot-catcher questions I devised. It's randomly loaded from an array of multiple choice questions where there is always exactly one right (and stupidly obvious) answer. If whoever (or whatever) is filling out the form answers wrong, they are immediately banned from using any CGI on my site (it's my personal page, so even if it's a real person who picked wrong, I don't need to talk to folks that dumb :) ).
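
For anyone who wants to try the bot-catcher question idea, here is a minimal Perl sketch of how it could work. It is not the actual script: the question bank, the sub names (pick_question, answer_ok) and the idea of passing the question index in a hidden field are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical question bank: each entry has the prompt, the choices,
# and the index of the single correct choice.
my @questions = (
    { ask => 'Is the moon made of cheese?',    choices => ['Yes', 'No'],      right => 1 },
    { ask => 'How many legs does a cat have?', choices => ['Four', 'Ninety'], right => 0 },
);

# Pick a question at random when the form page is generated.
# The returned index would travel with the form (assumed: a hidden field).
sub pick_question {
    my $id = int rand @questions;
    return ($id, $questions[$id]);
}

# Check the submitted answer. Anything missing or malformed counts as wrong.
sub answer_ok {
    my ($id, $choice) = @_;
    return 0 unless defined $id && defined $choice;
    return 0 unless $id =~ /^\d+$/ && $choice =~ /^\d+$/;
    return 0 unless $id < @questions;
    return $choice == $questions[$id]{right} ? 1 : 0;
}
```

One thing worth keeping in mind with this approach: if a question offers only two choices, a bot that picks at random gets it right about half the time, so some junk may still get through by luck alone.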

 

So with that protection (and more, but that could take hours to explain)... what gives? I get several of these gibberish contact form messages a day. Some have full sentences of stupidity, others just pointless words like that. But there are no URLs in these, no injection attacks that I'm aware of, and some living person seems to have answered the question successfully. Is it just spambots getting lucky and randomly answering the questions right, or real people with way too much spare time? But where's the payload? What is the point? Is there a security hole I could be missing, other than those I've plugged? I've tried searching the web, but all I find is tons of articles from 2005 about injection attacks. I haven't found anything recent about these seemingly nonsense messages.

 

Oh - one thing I seem to notice. They almost always have a single letter at the end of the message that doesn't seem to belong. Some message content examples:

 

Hello people! Nice site!o
Very+interesting+website.+Keep+up+the+outstanding+work+and+thank+you...g
information

 

e

I you all love!o

Yeah, and the more popular the site, the more of that comes in. I guess I'm getting more popular :thumbup:

 

This stuff is just annoying though. At least if it had spam-like substance, I could block it. But this gibberish is just confusing. I was actually hoping there was a secret agenda behind it so it would make sense. :)

Oh well!

If anyone wants to use the topic to share spam-blocking code they've programmed into their form-to-mail scripts, I'm game. Apart from a blacklist of IPs, I have a lot of different methods of keeping spam at bay. Here are the most effective ones...

 

I filter all variables passed to my mail-processing script with regular expressions for certain keywords and phrases:

 

# Patterns (case-insensitive regular expressions) that mark a submission as spam.
@banwords = ('rteasddaws\@jksdhfue\.com','texas(\s|\W)?holdem','guym[ae]n','mugu','free(\s|\W)?poker','poker(\s|\W)?site','poker\.\w','(texas|online|SPAMMER-BEWARE|party|poker|craps|roulette|hold(\s)?em|free|black(\s)?jack)(\s|\W)?(texas|online|SPAMMER-BEWARE|party|poker|craps|roulette|hold(\s)?em|free|black(\s)?jack)','(buy|cheap|generic|cialis|online)(\s|\W)?(buy|cheap|generic|cialis|online)','facial(\s|\W)?cream','phentermine','bextra','dirfor\.com','dorank\.com','(http:\/\/(.|\n)*){3,}?','(href(.|\n)*){2,}?','free\S*ringtone','\[url.*\]','cheap(\s|\W)?cigarette','End \^\) See you','href(.|\n)*(\/url|atspace\.com)','svotyt.*google\.com','^\s(\cM\n|\n)(\s|\w)?(\cM\n|\n|$)','jestak\.com','I you all love','Xenical');

# Reject the submission as soon as a field value matches any banned pattern.
foreach my $banword (@banwords) {
  if ($value=~/$banword/i) { &error('banned',$banword); }
}

 

I also check the from "name" and "email" variables sent to ensure they contain nothing more than they should:

 

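# $name is the form field name; $value is what was submitted for that field.
if ($name eq 'Your Name') {
  # Log the raw value if it has disallowed characters or is too long...
  if ($value=~/[^a-zA-Z0-9_\-\. \&\#\!\,]/ or length($value) > 60) { $warnings{$name}=$value; }
  # ...then strip those characters and cap the length at 60.
  $value=~s/[^a-zA-Z0-9_\-\. \&\#\!\,]//g; $value=substr($value,0,60);
} elsif ($name eq 'Your Email') {
  # Same treatment for the email field, with its own allowed character set.
  if ($value=~/[^a-zA-Z0-9_\~\-\.\@]/ or length($value) > 60) { $warnings{$name}=$value; }
  $value=~s/[^a-zA-Z0-9_\~\-\.\@]//g; $value=substr($value,0,60);
}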

 

Those two checks catch about 90% of the spambots and all of the attempted email injection attacks (AFAIK). The above code is Perl, but can easily be adapted to PHP.


I use both client-side and web mail viewers. I haven't enabled any filters for web mail, but for my mail clients (Eudora and Thunderbird mainly - I check mail on several machines, depending on where I am) I rely heavily on SpamAssassin's suggestions, and filter marked mail. For my business accounts I do no filtering beyond that because I don't want to accidentally trash new customers' inquiries. For my personal accounts, I trash any message from someone who isn't in my address book if its body contains src="cid: because I find that the majority of spam has embedded images. In the last several months, the only unwanted mail I've had to sort through (for my personal accounts) has pretty much been these nonsense form submissions.

 

That K9 sounds interesting. I might try that out with some of the business accounts and see what happens.

Edited by TheCanadian

  • 3 months later...

We have a bunch of clients all using variations of a contact form we wrote. It's similar in function to yours (the "to:" address is hardcoded to me, the form values are all escaped, etc.) One of the sites was getting hammered with formbot spam recently so I tried various non-CAPTCHA techniques to catch it. Things I have discovered:

- some formbots have harvested the forms earlier and periodically just do a POST with their form spam. So adding a new field to the form and checking to see if it is there catches them.

- formbots seem to be clever enough to not mess with hidden fields. I set one to the time the page was generated and then look at the time when it is POSTed. If the gap is too short (I have it at 5 seconds now--maybe there's someone who is a really fast typist ;) ), I don't send the form email. That catches relatively few of these.

- I also mess with some of the viewable text fields and make sure they contain data I seed them with. If not, I don't send the form email.
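
The three checks above could be sketched in Perl roughly like this. The field names (form_version, generated_at, comments_hint), the seed text and the sub name reject_reason are made up for the example, and it assumes the POSTed fields have already been parsed into a hash.

```perl
use strict;
use warnings;

# Hypothetical field names and threshold; adjust to the real form.
my $MIN_SECONDS = 5;                          # fastest plausible human submission
my $NEW_FIELD   = 'form_version';             # field added after bots harvested the old form
my $SEED_FIELD  = 'comments_hint';            # visible field pre-filled by the server
my $SEED_VALUE  = 'Type your message below';  # what that field was seeded with

# Returns an empty string if the submission passes all three checks,
# otherwise a short reason for rejecting it.
sub reject_reason {
    my ($form, $now) = @_;    # $form: hashref of POSTed fields, $now: epoch seconds

    # 1. Forms harvested earlier will not carry the newly added field.
    return 'missing new field' unless defined $form->{$NEW_FIELD};

    # 2. The hidden timestamp must be numeric and old enough.
    my $made = $form->{generated_at};
    return 'bad timestamp' unless defined $made && $made =~ /^\d+$/;
    return 'too fast' if $now - $made < $MIN_SECONDS;

    # 3. The seeded visible field must still hold the seeded value.
    return 'seed changed'
        unless defined $form->{$SEED_FIELD} && $form->{$SEED_FIELD} eq $SEED_VALUE;

    return '';
}
```

In a CGI script this would be called with time() as the second argument, and the form email would only be sent when the returned reason is empty.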

 

Bill

Edited by wmadill

Hmm... that time idea is an excellent one! I think I'm going to implement that too. In the last couple of months I've noticed that spambots seem to be posting from a cached form (like you mentioned), so I added a server-side cookie to my form that is created when the form page is loaded, contains the IP and browser type, and is verified by the CGI before it does anything. If the cookie has timed out or the data doesn't match, then it's obviously a cached form and not loaded directly from my page, so it isn't processed. That has worked wonders so far, eliminating 99% of the random garbage messages, and it even allowed me to remove many of the "banwords" from my script.
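
A rough Perl sketch of that kind of check, for anyone curious. This is one possible reading of "server-side cookie", not the real script: the in-memory %issued hash stands in for whatever storage the server actually uses, the 30-minute lifetime and the sub names are invented, and the single-use behaviour is an extra assumption of this sketch.

```perl
use strict;
use warnings;

my $MAX_AGE = 30 * 60;   # hypothetical lifetime of a served form, in seconds
my %issued;              # token => { ip, agent, time }; stand-in for real storage

# Called when the form page is served; the token goes out as a cookie.
# Note: rand() is fine for a sketch but is not a cryptographic generator.
sub issue_token {
    my ($ip, $agent, $now) = @_;
    my $token = join '', map { sprintf '%02x', int rand 256 } 1 .. 16;
    $issued{$token} = { ip => $ip, agent => $agent, time => $now };
    return $token;
}

# Called by the mail script before it does anything else.
# In this sketch a token is deleted on first use, so replays fail.
sub token_ok {
    my ($token, $ip, $agent, $now) = @_;
    return 0 unless defined $token;
    my $rec = delete $issued{$token} or return 0;   # unknown or already used
    return 0 if $now - $rec->{time} > $MAX_AGE;     # form has timed out
    return ($rec->{ip} eq $ip && $rec->{agent} eq $agent) ? 1 : 0;
}
```

A bot replaying a harvested form has no valid token on record (or an expired one), so token_ok returns 0 and the script can stop before mailing anything.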


Actually I implemented something similar a while back into the guestbook script I write. The admin can specify a time delay before a post will be accepted and a time limit for which the form is valid. I place a mildly encrypted timestamp into a hidden input to show when the form was generated.
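
Here is a small Perl sketch of a protected timestamp with both a delay and a validity limit. It is not the guestbook script's code: instead of "mild encryption" it swaps in an HMAC signature (via the core Digest::SHA module), so the timestamp stays readable but cannot be altered without the server's secret. The secret, both limits and the sub names are placeholders.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical settings; the secret must stay on the server.
my $SECRET    = 'change-me';
my $MIN_DELAY = 5;          # seconds before a post will be accepted
my $MAX_AGE   = 60 * 60;    # seconds for which the form stays valid

# Value for the hidden input: "timestamp:signature".
sub stamp_for {
    my ($now) = @_;
    return $now . ':' . hmac_sha256_hex($now, $SECRET);
}

# True only if the stamp is untampered and inside the allowed window.
sub stamp_ok {
    my ($stamp, $now) = @_;
    return 0 unless defined $stamp && $stamp =~ /^(\d+):([0-9a-f]{64})$/;
    my ($made, $sig) = ($1, $2);
    return 0 unless $sig eq hmac_sha256_hex($made, $SECRET);   # forged or altered
    my $age = $now - $made;
    return ($age >= $MIN_DELAY && $age <= $MAX_AGE) ? 1 : 0;
}
```

The form page would print stamp_for(time()) into the hidden input, and the handler would call stamp_ok on the submitted value with time() before accepting the post.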

