As part of my role as Lifehack’s manager, I am responsible for moderating the comments queue. Lifehack’s back-end has a “Pending” queue for comments that our spam-catching software thinks might be spam, a “Spam” queue for comments labeled “spam” either by the software or by me, and an “Approved” queue for comments that have been approved, again either by the software or by me. As a general rule, I check the “Pending” queue several times a day, the “Approved” queue every day or so, and the “Spam” queue every week or so.

I’ve been doing this for two years, and I’ve gotten pretty proficient at figuring out what is and is not spam – a tough call to make sometimes, since spammers get more and more sophisticated in lockstep with those of us charged with blocking them. I present my “formula” here for two reasons: one, to give less experienced bloggers and webmasters an idea of how to catch spam on their own sites, and two, to give commenters an idea of the kind of thing to avoid so their comments don’t get accidentally thrown in the “Spam” bin.

I should say that a big part of catching spam is a “feel” – intuiting that some comment just isn’t right. I’m not sure I can capture exactly what goes into that feel. Andy Warhol once said that to recognize a great painting, first you have to look at a thousand paintings, and catching spam is a bit like that – the experience of having looked at thousands of spam messages cannot be easily encapsulated. But I’ll try as well as I can.

What is spam?

What makes a message spam is relative and subjective. In a sense, spam is like a weed – a weed is not any particular kind of plant, but a plant that isn’t wanted where it is. (See, for example, Wikipedia’s definition of a weed as “a plant that is considered by the user of the term to be a nuisance.”) For instance, corn is delicious, but if it’s growing in your soybean field, it’s a weed. A message that, say, plugs a word processor might be perfectly welcome on a post that asks for product recommendations for writers, while on a post that just happens to mention writing, the same message could be considered spam.

Some messages are clearly spam; for example, anything delivered by a spambot programmed to leave its message wherever it can find an open form to submit through. But a message can be left by a living person, custom-written for the particular content it’s posted to, and still be spam.

This list starts with the most obvious signs and moves to vaguer, harder-to-interpret ones. My guess is that a lot of people run into the ones further down the list because they post without thinking very clearly, so pay attention.

A comment is spam if it:

Anyone else have advice for would-be spam-catchers? Or for commenters who might be finding their comments relegated to the spam-heaps of history? Leave a thoughtful, non-spammy comment below!
