Keeping comment spam off your site and away from users

Friday, September 26, 2008

So, you've set up a forum on your site for the first time, or enabled comments on your blog. You carefully craft a post or two, click the submit button, and wait with bated breath for comments to come in.

And they do come in. Perhaps you get a friendly note from a fellow blogger, a pressing update from an MMORPG guild member, or a reminder from your Aunt Millie about dinner on Thursday. But then you get something else. Something... disturbing. Offers for deals that are too good to be true, bizarre logorrhean gibberish, and explicit images you certainly don't want Aunt Millie to see. You are now buried in a deluge of dreaded comment spam.

Comment spam is bad stuff all around. It's bad for you, because it adds to your workload. It's bad for your users, who want to find information on your site and certainly aren't interested in dodgy links and unrelated content. It's bad for the web as a whole, since it discourages people from opening up their sites for user-contributed content and joining conversations on existing forums.

So what can you, as a webmaster, do about it?

A quick disclaimer: the list below is a good start, but not exhaustive. There are so many different blog, forum, and bulletin board systems out there that we can't possibly provide detailed instructions for each, so the points below are general enough to make sense on most systems.

Make sure your commenters are real people

  • Add a CAPTCHA. CAPTCHAs require users to read a bit of obfuscated text and type it back in to prove they're a human being and not an automated script. If your blog or forum system doesn't have CAPTCHAs built in you may be able to find a plugin like Recaptcha, a project which also helps digitize old books. CAPTCHAs are not foolproof but they make life a little more difficult for spammers. You can read more about the many different types of CAPTCHAS, but keep in mind that just adding a simple one can be fairly effective.
  • Block suspicious behavior. Many forums allow you to set time limits between posts, and you can often find plugins to look for excessive traffic from individual IP addresses or proxies and other activity more common to bots than human beings.

Use automatic filtering systems

  • Block obviously inappropriate comments by adding words to a blocklist. Spammers obfuscate words in their comments so this isn't a very scalable solution, but it can keep blatant spam at bay.
  • Use built-in features or plugins that delete or mark comments as spam for you. Spammers use automated methods to besmirch your site, so why not use an automated system to defend yourself? Comprehensive systems like