CAPTCHA is Dead, Long Live CAPTCHA!

March 4, 2008

In November 2007 I called these three CAPTCHA implementations "unbreakable":

Google (unbreakable): captcha-decoder-7.png
Hotmail (unbreakable): captcha-decoder-8.png
Yahoo (unbreakable): captcha-decoder-9.png

2008 is shaping up to be a very bad year indeed for CAPTCHAs:

Which means I am now 0 for 3. Understand that I am no fan of CAPTCHA. I view them as a necessary and important evil, one of precious few things separating average internet users from a torrential deluge of email, comment, and forum spam.

So reading that the three best CAPTCHA implementations have been defeated sort of breaks my heart. Even what I consider to be the strongest, Google's implementation, fell hard:

On average, only 1 in every 5 CAPTCHA breaking requests are successfully including both algorithms used by the bot, approximating a success rate of 20%.

A twenty percent success rate doesn't sound like much, but these spammers are harnessing networks of compromised PCs to send out thousands upon thousands of simultaneous sign-up requests to GMail, Hotmail, and Yahoo Mail from computers all over the world. Even a five percent success rate against a particular email service CAPTCHA would be cause for serious concern; with a twenty percent success rate you might as well put a fork in that thing-- it's done.

In the meantime, CAPTCHA still serves a useful purpose-- speed bumps that prevent evil bots and the nefarious people who run them from completely overrunning the internet, as Gunter Ollman notes:

CAPTCHAs were a good idea, but frankly, in today's profit-motivated attack environment they have largely become irrelevant as a protection technology. Yes, the CAPTCHAs can be made stronger, but they are already too advanced for a large percentage of Internet users. Personally, I don't think it's really worth strengthening the algorithms used to create more complex CAPTCHAs – instead, just deploy them as a small "speed-bump" to stop the script-kiddies and their unsophisticated automated attack tools. CAPTCHAs aren't the right tool for stopping today's commercially minded attackers.

There's simply too much money to be made in email spam for the commercial CAPTCHA algorithms, regardless of how good they may be, to survive forever. How old is Google's CAPTCHA now? Two to three years old? In the short term, perhaps proliferation and evolution of many different CAPTCHA techniques is the most effective prevention. You should emulate the techniques from the most effective and human-readable industrial grade commercial CAPTCHA, but avoid copying them outright. Otherwise, when they're inevitably broken, you're broken too. CAPTCHA defeating tools are tailored to very specific inputs; if there's little to no monetary incentive, odds are nobody will bother to customize one for yours. My ridiculously simple "orange" comment form protection is ample evidence of that.

Beyond diversification, the deeper question remains: how do we tell automated bots from people-- without alienating our users in the process? How can we build a next generation CAPTCHA that's less vulnerable to attack?

Here's some food for thought:

At some point, unfortunately, CAPTCHA devolves from a simple human reading test into an intelligence test or an acuity test. Depending on how invasive you want to be, you'll eventually be forced to move to two-factor authentication, like sending a text message to someone's cell phone with a temporary key.

I don't have all the answers, but one thing is for sure: I hate spammers. As fellow spam-hating internet users we all have a vested interest in seeing CAPTCHA techniques evolve to defeat spammers.

Posted by Jeff Atwood
173 Comments

Well. CAPTCHAs are often hard for older people to read.
And since they are generated by computers, it seems likely that a way to reverse them with a computer exists.

I really prefer clever, hand-made CAPTCHAs:
"What is the first name of Jeff Johnson?"
"In what year did the Second World War (1939-1945) end?"
And so on... If you create silly hand-made questions, then all the spammers will be definitively blocked. There is no software that is able to understand a question.

Coren on March 5, 2008 1:07 AM

Somewhere I've read that using hidden inputs as bot traps can be effective. If something was entered into the hidden fields, it must be a bot. The bot isn't going to render the page to determine if a textbox is hidden. You'd probably have to constantly randomize the field names on high profile sites though.

Gary on March 5, 2008 1:18 AM
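
The hidden-input trap Gary describes can be sketched on the server side in a few lines. This is only a rough illustration: the field name `website`, the function `looks_like_bot`, and the idea that posted fields arrive as a plain dict are all assumptions, not anything from a particular framework.

```python
# Hypothetical names: HONEYPOT_FIELD and looks_like_bot are illustrative only.
# The trap field is rendered in the form but hidden from people with CSS,
# so a human never types into it.
HONEYPOT_FIELD = "website"


def looks_like_bot(form: dict) -> bool:
    """Return True when the hidden trap field came back with text in it."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())


human_post = {"name": "Gary", "comment": "Nice post", "website": ""}
bot_post = {"name": "x", "comment": "buy pills", "website": "http://spam.example"}

print(looks_like_bot(human_post))  # False
print(looks_like_bot(bot_post))    # True
```

A bot that renders the page, or one tuned to this specific form, walks straight past the trap, which is why the field name would need to change on a high-profile site.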

What about language barriers?

Also your site uses CAPTCHA! :)

Jesus DeLaTorre on March 5, 2008 1:26 AM

What about using some CJK characters?
There are tens of thousands of characters.
Perhaps it would take longer to break.

Of course, the human would have to be Chinese, Japanese, or Korean.

hito on March 5, 2008 1:32 AM

On .Net Rocks, I heard them talking about "invisible CAPTCHA", which was something to the effect of your trivia questions. The whole thing had to do with having an invisible div with a small math problem that would only be answered by JavaScript-enabled browsers, which would root out all bots, or something to that effect.

charles graham on March 5, 2008 1:35 AM

Another solution : stop using these damn registration pages and use OpenID. Of course there will be openid spam server but it's easier to control and ban them.

acemtp on March 5, 2008 1:39 AM

When will we actually hit the spammers where it hurts? And by hitting, I mean prosecution. Yes, they are in various countries that do not necessarily care, but maybe, just maybe, we can make them care? I would think the WTO is for that kind of thing...

bahbar on March 5, 2008 1:42 AM

And it is even Web 2.0 enabled... Isn't that great?

Here is the link:

http://hotcaptcha.com/

I think Bots will have a hard time breaking that one

Heiko Hatzfeld on March 5, 2008 1:43 AM

P.s.:

And here is the link to the article where I found it... I know it's "old" but I found it quite interesting...

http://radar.oreilly.com/archives/2006/07/another-captcha-but-i-failed-p.html

Heiko Hatzfeld on March 5, 2008 1:48 AM

The text on the Google CAPTCHA breaking page suggests that they pay humans to solve the CAPTCHAs. I'm not sure if this is true or not.

===================

If you are unable to recognize a picture or she is not loaded (picture appears black, empty picture), just press Enter.

In no case do not enter random characters!

If there is delay in downloading images, exit from your account, refresh the page and go again.

The system tested in browsers:
Internet Explorer
Mozilla Firefox

Before each payment deemed by pictures checked Admin. We pay only correctly recognized pictures!

Gigi on March 5, 2008 1:49 AM

So what if the CAPTCHA turns into an intelligence test? Let's not have dumb people make comments either :)

Oh, damn. I can't spell "orange."

Matt Gibson on March 5, 2008 1:51 AM

My Freakonomics thing tells me "CipherTrust has analyzed the effectiveness of various kinds of spam. It turns out that pornography is far and away the most effective spam, with a click-through rate of 5.6 percent. The next-best click-through rate is pharmaceuticals, at 0.02 percent."

The only way to solve spam forever is to stop people opening spam messages.


Best dumb-butted responses so far:

Pay $1 for all the (stupid) websites that ask for your info.
- The reason I give fake email addresses is because I don't want spam, and I don't want dodgy websites having my credit card either. Gee, let me pay to comment on forums? I'm already annoyed that I have to sign up in the first place.

Give up your identity, SSN, credit card, etc
- Why should you really know who I am? Spammers will still have fake ID's, while honest people pay the price.
- Do I trust you with that information. Do I trust your security and data retention policies?

Limit the number of emails for new accounts.
- Sure, but for how long. Spammers will then create accounts, have their fake accounts send a few 'real' messages, and after a period of time resume full-time spamming. All you did is introduce a temporary delay.

All internet advertising must be PAID advertising
- I'm sure someone paid the spammer, so ipso facto stupido. Do you think spammers are doing it gratis?

Universal ID
- Who manages this, and can you be sure that their captcha works? You're just pushing the problem up a level, and making one large target instead of many small targets.
- Conveniently you also get universal tracking of habits and selling my information to ... spammers. Thanks! Where's my tinfoil hat?

Charge people per email:
- Great idea. Maybe I already do pay you retard. I pay for hosting. I pay for internet service. I pay for bandwidth.

PS Banks don't use captcha. They have secure offline processes in place to set up your internet banking so that even their employees can't fake the system out. Multi-level authentication isn't captcha.

Clbuttic on March 5, 2008 1:52 AM

I read the Websense report on Google's CAPTCHA last week. I was under the impression that it wasn't broken in the sense that machines were solving the CAPTCHAs automatically (via machine vision or whatever), but by duping humans to solve them (unknowingly, on a different site) in order to make money or get access to free porn (http://www.boingboing.net/2004/01/27/solving-and-creating.html)

As I understand it, the hard part about breaking Google's CAPTCHA was the bot getting the image to human eyes, and getting a response back to Google before the process timed out.

If this is the case, changing the CAPTCHA from a reading test to an intelligence test probably won't make much difference. The hard part is surely making the authentication process robust against this kind of attack?

Paul on March 5, 2008 1:55 AM

Asirra has no chance of success.
Users are usually dumb, and willing to be even dumber.
If you start forcing them to use their brains, they will sooner look for the 'X' button than click on the photos of cats.

However, ASCII art CAPTCHA is available on Drupal CMS and I started using it some time ago.
Seems to be fine for now.

arty on March 5, 2008 2:03 AM

Also

Another good thing is to use JavaScript along with the captcha, even a simple onmouseover effect above the captcha image (like: display the captcha image when the mouse is above a 'fake' captcha image).
Bots usually don't do that.
Or use split captcha images with different z-index values, animated GIFs (or just backgrounds).

Just use your imagination

arty on March 5, 2008 2:09 AM

Well, as you say: "perhaps proliferation and evolution of many different CAPTCHA techniques is the most effective prevention".

However, many of these CAPTCHA alternatives you mention are broken much easier than your average "type the characters from the picture" CAPTCHA. So, how about just sticking with the image CAPTCHAs, but using much more randomness in your rendering - i.e. there's no need to distort the picture heavily, you just need to have a bunch of different not-so-distorted, easily readable CAPTCHA variants?

If you have a bunch of different algorithms (each requiring a different cracking approach), and switch them randomly (requiring the bot to be able to distinguish between them), bots will not get far.

Of course, coming up with continuous variations in your CAPTCHA rendering can be a part-time job on its own, but is only necessary if you're a high-profile target - for most websites in existence, changing a broken CAPTCHA algorithm for a different one is going to be enough to solve your problems for a long while... Unless you have a cracker who's REALLY keen on spamming your site and your site only, enough to change his cracking approach every time you change the protection, even if it will never pay off (and as we know, most spammers are in it for money).

Let's face it: if you're Google, or Microsoft, or Yahoo - any of those "alternative" methods will be broken much faster than a new CAPTCHA rendering algorithm. Something to think about...

dave on March 5, 2008 2:10 AM

I was under the impression that it wasn't broken in the sense that machines were solving the CAPTCHAs automatically (via machine vision or whatever), but by duping humans to solve them (unknowingly, on a different site) in order to make money or get access to free porn

If that's the case, then Google's CAPTCHA generation algorithm isn't broken after all. These human farms would work against ANY turing test.

Does anyone know for sure?

Jeff Atwood on March 5, 2008 2:14 AM

That is excellent food for thought. Distinguish a type of animal, bloody brilliant! At least then the captcha would be fun!

Ryan Allen on March 5, 2008 2:17 AM

As with all anti-abuse measures, CAPTCHAs have to evolve to keep up; this is the nature of adversarial systems like anti-spam and anti-virus. They'll be broken eventually, by a sufficiently-determined attacker.

also:

'Of course there will be openid spam server but it's easier to control and ban them.'

Great hand-waving assertion there, acemtp ;) Same way it's easier to control and ban mail servers originating spam in SMTP-land?

Justin Mason on March 5, 2008 2:24 AM

Has anyone seen efforts at captchas that reveal the letters in an animation? Something that is easy to solve by the human eye watching the letters reveal or morph, but really hard for OCR techniques to solve since there are too many frames to tie together to make sense of the word.

Erki Esken on March 5, 2008 2:29 AM

Hehe:

http://www.ubersite.com/m/113411

Is this the future (possibly NSFW due to two swear words)

Bryan Childers on March 5, 2008 2:39 AM

There are already human farms, alright. Some involve unwitting users solving CAPTCHAs for access to porn, and others involve low-paid workers overseas solving CAPTCHAs for money a la the "gold farming" model.

Here are observed cases of "CAPTCHAs for porn":

http://www.linuxworld.com/community/?q=node/2400
http://www.theregister.co.uk/2007/10/31/captcha-busting_trojan/

And there are some cases of CAPTCHA farming:

http://ha.ckers.org/blog/20070427/solving-captchas-for-cash/ (be sure to read the comments for several farmers offering their services)

By the way, here is the source for all these recent "Google CAPTCHA broken" stories -- one Websense blog post:

http://www.websense.com/securitylabs/blog/blog.php?BlogID=174

To be honest I suspect this is blown out of proportion. It looks a lot like another CAPTCHA-solving farm behind a web service API. (Observe the timestamps in the logs -- 30 seconds to decode a CAPTCHA sounds like a human, not an algorithm, if you ask me.)

Justin Mason on March 5, 2008 2:51 AM

Another solution : stop using these damn registration pages and use
OpenID.

Hint: this problem is *not* a nail. Your hammer is of no use.

Peter on March 5, 2008 2:53 AM

Make the captcha too hard, and you'll lock out many of your human readers. Life's too short to spend it straining my eyes at distorted text.

Captcha: apple? banana? grape? peach? I KNOW! it's cherry, right? No?
kiwi? guava? strawberry? secret? password? Oh please, just post my comment already.

Watermelon? Pineapple? Persimmon? Sweet potato? Lime? Lemon? Tangerine? Pomegranate? Olive? Nectarine? Pumpkin? Cantelope?

Izzy on March 5, 2008 2:57 AM

What about developing a system that uses VOIP to call a number and give the code, so as not to alienate users without cell phones? You could also sell ad space on the calls to make it generate some money.

It is probably like trying to kill a house fly with a bazooka, and not totally foolproof, but at least it makes some money too.

FireCracker37 on March 5, 2008 2:58 AM

The next step in defense against spammers is probably external ID authentication (Google, Passport, or OpenID).
The spammers' next step is therefore ID theft.

ISPs are very eager to fight a grandma who downloads an illegal song, but they seem not very interested in fighting spammers.

The only solution would be to apply ARIN/RIPE policy strictly, but it would kill business since most firms are not very careful about where their business mail comes from...

jul on March 5, 2008 3:02 AM

I run a website/forum for a World of Warcraft guild that I'm in. We used to get a lot of forum spam. What worked for us was to add a question to the registration form - a trivia question. In our case I asked a question that anyone who has leveled a character to 70 would know the answer to, but no one in a captcha-breaking sweatshop would be able to answer. A lot of topic-specific websites could use similar techniques to filter out spambots. You just need to tailor the questions for your audience.

After making this change we haven't seen any spam posts.

DancesWithLysol on March 5, 2008 3:05 AM

In contexts where people come together around a specific interest, you have a better point of cleavage -- not between people and machines, but between members of the in-group and everyone else, including people who wouldn't be interested in what you are about as well as computers. As an example, an associate of mine has left a phpBB installation with just such a captcha replacement out on the 'net for almost a year now, and despite it being at the default location in the domain, no spam sign-ups have been recorded.

If you're one of "us" for the purposes for which this was written, then signing up here

http://www.obsessivemathsfreak.org/phpbb/

should be trivial. Without some significant AI this isn't going to admit a bot, and if you just play the captcha out of context in an unwitting mechanical turk attack (e.g. as part of a porn site login), you're not going to get very many false positives.


Steve on March 5, 2008 3:19 AM

Forums / Blogs,

Seriously, any form of validation that requires a user to enter anything but their blog comment or forum message is useless. It may be partially effective against automated means, but a human farm can break any of these ideas EXCEPT Bayesian filtering.

Once you train your Bayesian filter by marking actual spam as spam, and good posts as good, only a very small percentage of spam makes it through. The ones that do make it through you mark as spam manually, which further 'trains' your filter. Simple.

Email providers,
Bayesian filtering won't help you prevent people from creating spam-sending accounts.

Michael Lang on March 5, 2008 3:31 AM
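
The train-then-correct loop Michael describes can be sketched as a tiny naive Bayes classifier. This is a toy under stated assumptions, not what any production filter does: the class name `NaiveBayesFilter`, whitespace tokenizing, and add-one smoothing are all choices made here for illustration, and the training messages are invented.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Toy word-count naive Bayes; train at least one spam and one ham first."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        # Marking a post as spam or ham is all the "training" there is.
        self.words[label].update(text.lower().split())
        self.docs[label] += 1

    def _log_score(self, tokens, label, vocab_size) -> float:
        counts = self.words[label]
        total = sum(counts.values())
        # Class prior plus add-one smoothed word likelihoods, in log space.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for token in tokens:
            score += math.log((counts[token] + 1) / (total + vocab_size))
        return score

    def is_spam(self, text: str) -> bool:
        tokens = text.lower().split()
        vocab_size = len(set(self.words["spam"]) | set(self.words["ham"]))
        spam_score = self._log_score(tokens, "spam", vocab_size)
        ham_score = self._log_score(tokens, "ham", vocab_size)
        return spam_score > ham_score


spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("buy cheap watches online", "spam")
spam_filter.train("great article about captcha design", "ham")
spam_filter.train("I disagree with the article about spam", "ham")

print(spam_filter.is_spam("buy cheap pills"))              # True
print(spam_filter.is_spam("great article about design"))   # False
```

Any message that slips through gets passed to `train` with the right label, which is the manual correction step from the comment.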


If you're one of "us" for the purposes for which this was written, then signing up here

http://www.obsessivemathsfreak.org/phpbb/

should be trivial. Without some significant AI this isn't going to admit a bot, and if you just play the captcha out of context in an unwitting mechanical turk attack (e.g. as part of a porn site login), you're not going to get very many false positives.


Yeah, right. I wasn't able to answer a single one out of ten. :D Obviously, I don't belong to the targeted audience.

Vinzent Hoefler on March 5, 2008 4:03 AM

@KG

No, desktop SMTP is slowly being replaced by web based SMTP clients. SMTP is still there.

Alex G on March 5, 2008 4:17 AM

How about instead of testing if a human is filling out the form, just make sure to map an account to something pretty unique

SUCH AS A CELLPHONE.

Problem solved :)

Greg Magarshak on March 5, 2008 4:21 AM

Thanks for the heads up regarding Asirra (the "click on all of the cat pictures" CAPTCHA). It's definitely way less tedious (and more fun) than the standard text CAPTCHAs...

Erik Novales on March 5, 2008 4:29 AM

I've had good luck using form-morphing techniques to prevent spam: http://nedbatchelder.com/text/stopbots.html. It won't stop a human, but what will?

Ned Batchelder on March 5, 2008 4:46 AM

Considering most internet sites are niche sites (company sites, tailoring to some kind of group etc), you should *always* write your auth according to that group. If you have a webmaster forum, just ask questions webmasters *should be able to* answer; if you have a tattoo forum, ask them about that kind of thing. This can be made more difficult by warping the text of the question differently every time, so they have a 1 in 5 success rate with OCR *and* have to know the answer and use pictures.

As a previous poster already said; if I run a forum/site about X, I want people *with* a brain to comment on things, not a moron, so I don't care about people who 'fail' the test. They cannot join. Mala Suerte.

Another method is to save up the comments/forum posts by new members and auto-check them against a bunch of heuristics; I have quite a bit of success with that. I simply grep out all http addresses in posts (using heuristics to 'fix' URLs that are broken up etc) and submit them to Google. If I find too many of them on unrelated forums (you can use Google queries to do that) I auto-flag the post and send myself a message. Spammers have a goal with their spamming and they don't, currently, have infinite resources to prevent me from finding it and blocking it. I have a *very* high success rate using this technique.

Basically my (long-winded) point is that you should tailor your protection to the site you are protecting and you won't have many spam problems.

frank on March 5, 2008 4:55 AM

I believe CAPTCHA has been broken, and if you believe this post:

http://www.mperfect.net/aiCaptcha/

It has been broken for a very long time. With a little time and effort you could recognize any letter that has been distorted, especially if you analyze the average pattern of the letter.

Even if it hasn't been broken, services like the Mechanical Turk make it much harder to tell a human with good intentions from one with bad intentions.

http://www.mturk.com/mturk/welcome

And the CAPTCHA definitely isn't going away. I was just asked to create one for the ASP.NET MVC Framework for a project that I am working on.

http://www.coderjournal.com/2008/03/aspnet-mvc-captcha/

Solutions are only going to get harder and harder. One growing method that I have seen to prevent bots is exploiting their weakness when it comes to JavaScript. Basically you add an AJAX authentication string to the POST, and the authentication string is only grabbed from the server moments before it is submitted. But that doesn't really solve the problem, because if AJAX can get it so can a bot.

It is a no win situation with current stateless web.

Nick Berardi on March 5, 2008 4:55 AM
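
The "authentication string grabbed moments before submit" idea in Nick's comment could look something like this on the server. It is a minimal sketch assuming an HMAC-signed timestamp tied to a session ID; `SECRET`, `issue_token`, `check_token`, and the 60-second lifetime are all hypothetical names and values.

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical server-side secret, never sent to the client


def _sign(session_id: str, issued: int) -> str:
    message = f"{session_id}:{issued}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def issue_token(session_id: str, now: Optional[float] = None) -> str:
    """Token the page fetches via AJAX just before the form is posted."""
    issued = int(time.time() if now is None else now)
    return f"{issued}:{_sign(session_id, issued)}"


def check_token(session_id: str, token: str, max_age: int = 60,
                now: Optional[float] = None) -> bool:
    """Accept only a fresh token that was signed for this session."""
    current = time.time() if now is None else now
    try:
        issued_text, signature = token.split(":", 1)
        issued = int(issued_text)
    except ValueError:
        return False
    fresh = 0 <= current - issued <= max_age
    expected = _sign(session_id, issued)
    return fresh and hmac.compare_digest(signature.encode(), expected.encode())


token = issue_token("session-1", now=1000)
print(check_token("session-1", token, now=1030))  # True: fresh, right session
print(check_token("session-1", token, now=2000))  # False: expired
print(check_token("session-2", token, now=1030))  # False: wrong session
```

As the comment itself points out, this only raises the bar: a bot that executes JavaScript can request the token exactly as a browser would.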

Sorry about the double submit; there was a hiccup in the form that said the permission was denied for copying the HTML file.

Nick Berardi on March 5, 2008 4:57 AM

I wonder if the ever-increasing battle between spammers and their victims (us) will result in the first true AI systems? Will they unintentionally end up making a computer that thinks like a human?

Jim Cook on March 5, 2008 5:08 AM

There has to be a way using Flash or some sort of randomly generated animated field that a human can distinguish that a bot can't.

The simple ability to use ActionScript to randomly create a timeline for the flashed letterforms, and the ability to use the same scripting to create an endless combination of noise or disfigurement, would have to at least set the absolute bots back a bit.

Of course the real deal spam houses forcing actual people to do their dirty work may never be stopped, but the macro/bot programmers would have a hell of a time with a CAPTCHA that had its own timeline and actually moved.

If you were really clever you could also have the algorithms get MORE strict each time the person pressed the "show me a new one" button based on their session ID or cookies.

Perhaps also getting away from letterforms entirely, using a gradient slice of colors with a corresponding universal sound, upon mouseover, by universal I mean things anyone would recognize. Flowing water, a cheering crowd. For the backend, have hundreds of dummy sounds, use a scripted backend to randomize the filename of the embedded sound clip, always make sure its position is as random as you can get it.

I think any system can be broken, and perhaps none of my ideas would work since the obvious weakness is it must be executed client side and would therefore be susceptible to reverse engineering.

I find this topic particularly interesting, even the first time you brought it up, because it's a callback to the most simple rule in our electronic age:

Anything that can be made, can be unmade.

And frankly, there's something almost comforting about that, as crazy as that sounds.

Mike on March 5, 2008 5:10 AM

Quote
So what if the CAPTCHA turns into an intelligence test? Let's not have dumb people make comments either :)

Oh, damn. I can't spell "orange."
Matt Gibson on March 5, 2008 01:51 AM
/Quote

Better yet, let them enter a word that rhymes with orange.

Thijs on March 5, 2008 5:25 AM

I can't help but wonder whether adding a second captcha would significantly reduce the success rate. Currently, if an automated process gets 20% correct, adding a second captcha will cut that by an additional 80%, leaving it at just around 4% success. Or at least that's my thought on it. I hate captchas, by the way, but they do keep tons and tons of spam from getting to our inboxes.

Seth Braunstein on March 5, 2008 5:27 AM
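
Seth's arithmetic is easy to check, with one caveat: it assumes the two captchas are solved independently. The 20% bot rate comes from the post; the 90% human pass rate below is a made-up figure, included only to show that humans pay for the second captcha as well.

```python
def chained_success(rate: float, rounds: int) -> float:
    """Chance of passing every one of `rounds` independent challenges."""
    return rate ** rounds


bot_rate = chained_success(0.20, 2)     # bot solves each captcha 20% of the time
human_rate = chained_success(0.90, 2)   # hypothetical: humans pass 90% of the time

print(f"bot:   {bot_rate:.0%}")    # bot:   4%
print(f"human: {human_rate:.0%}")  # human: 81%
```

Against a botnet making thousands of attempts, 4% still yields plenty of accounts, while roughly one legitimate user in five now fails at least once.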

Thinking aloud: How about a CAPTCHA that depends on -errors- that humans will make (reliably!)?

MattF on March 5, 2008 5:29 AM

I really don't understand how spammers get so much money to make spamming worth the effort...

Nicolas on March 5, 2008 5:33 AM

The "distinguish pictures of dogs from cats" page just informed me that I am a bot.

It asked me to choose all of the pictures of cats.

I did so, including one that contained both a dog and a cat.

I guess that was supposedly a picture of a dog.

blah on March 5, 2008 5:51 AM

I had read a while back about a way to make spamming prohibitively expensive.
It was Cringely IIRC, that proposed to make the sender of an email perform a small calculation sent to it by the email server. A small enough calculation to not be noticed by the average email user, but large enough that when a spammer tries to send huge amounts of mail at once, the computation becomes too time consuming to be worth it.
This was over two years ago. Has anyone heard of this concept being used?
Now, I understand that this would unjustly penalize businesses that legitimately send bulk emails. But, do any legitimate bulk emailers send as much as a spammer?

KevinM on March 5, 2008 5:51 AM
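
The scheme KevinM remembers resembles what is usually called proof of work, with Hashcash as the best known example for email. Here is a hashcash-style sketch, not the real Hashcash stamp format: it uses SHA-256 and a made-up `challenge:nonce` layout, and the names `mint` and `verify` are hypothetical.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(challenge: str, nonce: int) -> bytes:
    return hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()


def mint(challenge: str, difficulty: int) -> int:
    """Sender's side: brute-force a nonce; cost doubles with each extra bit."""
    for nonce in count():
        if leading_zero_bits(_digest(challenge, nonce)) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Receiver's side: a single hash confirms the work was done."""
    return leading_zero_bits(_digest(challenge, nonce)) >= difficulty


# Hypothetical challenge string: recipient address plus date.
challenge = "someone@example.com|2008-03-05"
nonce = mint(challenge, difficulty=12)     # about 4096 hashes on average
print(verify(challenge, nonce, difficulty=12))  # True
```

The asymmetry is the point: one message costs a sender a fraction of a second, a million messages cost a million times that, and checking is nearly free. A botnet of stolen machines weakens the argument, since the spammer is not paying for the CPU time.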

I prefer technical solutions.

For web form spam, you can easily filter keywords and links (spammers can't obfuscate them, they want links to be machine-readable - google-readable to be exact).

For other kinds of abuse, it's more tricky, but it might be possible to use trending, i.e. don't check individual requests; observe how the "user" behaves, how many registrations he makes, how soon and how many e-mails he sends, etc. This should reliably pick up bots until spammers learn to emulate human behaviour better.

kL on March 5, 2008 5:52 AM
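
One very simple form of the "trending" kL suggests is counting events per user inside a sliding time window. This is only a sketch: `SlidingWindowCounter`, the limit of 3, and the 60-second window are hypothetical, and real behavioural analysis would look at far more than raw counts.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Allow at most `limit` events per key within `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.events = defaultdict(deque)  # key -> timestamps of recent events

    def allow(self, key: str, now: float) -> bool:
        recent = self.events[key]
        # Forget events that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowCounter(limit=3, window_seconds=60)
for second in (0, 1, 2, 3):
    print(second, limiter.allow("203.0.113.7", now=second))
# 0 True
# 1 True
# 2 True
# 3 False
```

The key could be an IP address, an account, or a session; a distributed botnet spreads its requests across many keys, which is the usual limit of this approach.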

Why bother with Captcha? Just let them spam all they want and ignore them silently. This is what services like Akismet and Defensio are for. They will take care of watching over the evolution of spam messages and adjust the filtering techniques.

Defensio advertises an efficiency of 99.77%. Considering Akismet (no numbers) is at least 99.5%, you can combine both and get roughly 99.999% accuracy, assuming the two rarely miss the same messages. Who needs a CAPTCHA?

Louis-Philippe Huberdeau on March 5, 2008 5:53 AM
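
The combined figure depends on how often both filters miss the same message. A quick check, assuming the misses are fully independent (an optimistic assumption, since both services see similar spam) and using the 99.77% and 99.5% figures from the comment:

```python
def combined_catch_rate(*rates: float) -> float:
    """Catch rate when spam gets through only if every filter misses it."""
    miss = 1.0
    for rate in rates:
        miss *= 1.0 - rate
    return 1.0 - miss


combined = combined_catch_rate(0.9977, 0.995)
print(f"{combined:.3%}")  # 99.999%
```

If the filters tend to fail on the same messages, the real number sits closer to the better of the two. Running two filters also combines their false positives, so more legitimate comments get silently dropped.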

Actually, I believe dynamically generated CAPTCHA fields would do the job. Your Server sends you a session ID and retains the properties of the generated CAPTCHA field.

A Bot will not be able to find the CAPTCHA field so, it won't be able to insert text. Human CAPTCHA solving is no option, as the field name will be different every time.

Also, you should map the other fields as well, so the whole page changes in an unpredictable way for each session.

This could also be done with the Javascripts within each page, changing function names throughout to make it even more difficult for Bots to analyze it.

GUI Junkie on March 5, 2008 5:59 AM
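
Per-session field names along the lines GUI Junkie describes could be derived rather than stored, for example by hashing the session ID with a server secret. This is a sketch under that assumption; `SECRET`, `LOGICAL_FIELDS`, `field_names`, and `decode_form` are hypothetical names.

```python
import hashlib
import hmac

SECRET = b"change-me"  # hypothetical server-side secret
LOGICAL_FIELDS = ("name", "email", "comment")


def field_names(session_id: str) -> dict:
    """Map each logical field to an unpredictable name for this session."""
    mapping = {}
    for logical in LOGICAL_FIELDS:
        message = f"{session_id}:{logical}".encode()
        digest = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
        mapping[logical] = "f_" + digest[:12]
    return mapping


def decode_form(session_id: str, posted: dict) -> dict:
    """Translate a posted form back to logical names, ignoring anything else."""
    names = field_names(session_id)
    return {logical: posted[rendered]
            for logical, rendered in names.items()
            if rendered in posted}


names = field_names("session-a")
posted = {names["comment"]: "hello", "comment": "blind bot guess"}
print(decode_form("session-a", posted))  # {'comment': 'hello'}
```

This stops scripts that post to fixed field names. It does not stop a bot that fetches the form first and reads the names out of the HTML, and it does nothing against humans solving the CAPTCHA, who simply see the rendered page.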

Why don't we skip the CAPTCHA and move to a pay per email type system? Just like we do with stamps. No gmail account for you until you provide a credit card number.

Akira on March 5, 2008 6:10 AM

I read it that humans were being used (either wittingly or unwittingly) to read Google's CAPTCHAs. Either way, it's a bit over the top to suggest CAPTCHA implementations are broken for everyone. Google/Yahoo/MS need to do something new because the monetary reward for breaking their CAPTCHAs is high enough to make it worth paying people to do it. This is not so for the average blog or even small web application.

Tom Clancy on March 5, 2008 6:16 AM

Wouldn't it be much easier for Google/Yahoo/Hotmail/etc. to limit outgoing emails for new accounts? Instead of a single Turing Test, CAPTCHA, these services could pose a series of tests for the new users to complete over the course of weeks and months before email restrictions were lifted.

aikimark on March 5, 2008 6:19 AM

This sounds like the Matrix - Human farms of unknowing subjects breaking the CAPTCHA algorithm for the machines.

Roberto on March 5, 2008 6:19 AM

Mandatory 1 year prison sentences for all convicted spammers. 6-figure fines for all ISPs who knowingly distribute spam. Nincompoops who attempt to respond to spam should be warned once, then have their internet connections disabled for 30 days if they do it again.

All internet advertising must be PAID advertising and belongs on commercial web pages and pop-up ads only. Everything else should be punishable by law.

Desperate measures for desperate times.

PaulG. on March 5, 2008 6:26 AM

I like the JavaScript solution that was suggested earlier. You could just have JavaScript populate a hidden field with a value that will be read when the form is submitted. If a bot is visiting the page the JS won't be executed and the bot will be defeated. That would unburden the user also.

Brian K on March 5, 2008 6:33 AM

Some of the options that you offer as alternatives are no better. If they offer a multiple choice then the probability of breaking the CAPTCHA becomes 1 in the number of options offered. The number of choices needs to approach a really, really big number (I originally wrote infinity) to make the approach effective.

I'm just saying, is all...

prairiedog2k on March 5, 2008 6:34 AM
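
The guessing odds prairiedog2k is worried about can be put in numbers. The sketch below assumes a bot that guesses uniformly at random and choices that are independent of each other; the 12-picture cat-or-dog example is a hypothetical configuration, used only to show how stacking binary choices changes the odds.

```python
from fractions import Fraction


def guess_probability(options: int, rounds: int = 1) -> Fraction:
    """Chance that blind guessing gets every one of `rounds` choices right."""
    return Fraction(1, options) ** rounds


print(guess_probability(4))      # 1/4
print(guess_probability(2, 12))  # 1/4096
```

A single four-way question is useless against a bot that can retry, but a dozen two-way choices already push blind guessing to about 0.02%. A classifier that beats a coin flip on each picture does much better than that, so the number of choices is only part of the story.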

To me the long term solution is to figure out and define exactly what spamming is, and automatically detect that behavior.

Maybe this requires some kind of machine learning. It may require shared databases of information about current spammers too -- that has the capability to stay ahead of the spammers.

This would have to be combined also with some more sophisticated and fine-grained access control. (E.g. to beat the case where a spammer takes your captcha image and uses it to give other users access to a fake porn site, only serve captcha images to clients that you can be sure have already visited your site in the past N ms.)
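One way to sketch the "visited in the past N ms" check is a signed timestamp that the server hands out with the form page and verifies before serving the CAPTCHA image. The Python below is an assumption-laden illustration: the secret, the window length, and the function names are all hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: known only to the server
MAX_AGE_MS = 60_000  # the "N ms" window; value chosen arbitrarily

def issue_visit_token(now_ms: int) -> str:
    """Token handed out with the form page: timestamp plus HMAC signature."""
    ts = str(now_ms)
    sig = hmac.new(SECRET_KEY, ts.encode(), hashlib.sha256).hexdigest()
    return f"{ts}.{sig}"

def token_is_fresh(token: str, now_ms: int) -> bool:
    """Serve the CAPTCHA image only if the token is authentic and recent."""
    ts, _, sig = token.partition(".")
    expected = hmac.new(SECRET_KEY, ts.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False  # forged or malformed token
    return 0 <= now_ms - int(ts) <= MAX_AGE_MS
```

This only narrows the window for relaying an image to another site; a relay that fetches the page and forwards the image quickly enough would still pass.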

A combination of countermeasures that are not uniform from site to site or even request-to-request would also be best. I.e. imagine your captcha incorporated all kinds of variation [note how similar the example captchas above are to each other for each of Google, Hotmail, Yahoo]. If in order to beat your captcha, a spammer had to run several recognition passes tuned for different kinds of captcha distortions, it makes it that much more expensive and time consuming.

We can also come up with more sophisticated ways of defining exactly what the characteristics of a "high quality" blog comment are, score comments accordingly, and send the lower-scoring ones into human moderation.

The community of people who don't like spam is much larger than spammers and people who don't care. We also have the advantage that the characteristic that unites us is that we hate spam, and want to fight it. Our disadvantage is that most of us who hate spam are just average users, and have a certain threshold of what hoops they're willing to jump through to get their actual work done.

So to me the best thing to do is to make our websites smarter, rather than forcing users to do too much work; and when we do have a task for the user to do (log in, captcha, whatever), make sure it's as streamlined and easy to deal with as possible.

Reed Hedges on March 5, 2008 6:41 AM

I write some bots myself (though not spam bots!). Just some bots to simplify certain internet tasks and I use WebKit which actually loads javascript. So to the people who are saying that using an invisible div with a javascript math problem solves CAPTCHA... it does not...

Mitchell Hashimoto on March 5, 2008 6:44 AM

I had to receive a text message in order to sign up for gmail. Don't they still do that?

Joe Beam on March 5, 2008 6:46 AM

The economics are heavily in favour of the spammers, aren't they? The spammers have an ongoing financial incentive to break big systems so they'll keep working on it. Which basically means entering into an arms race with spammers.

Better would be to hit the opposite side and prosecute anyone who uses spam to sell something. If they aren't legally accessible, just block name/IP of the mail/web server that does the trade until fixed. I imagine that would reduce spam by an order of magnitude more-or-less instantly.

Jim on March 5, 2008 6:53 AM

The anti-bot method I like requires no script and no effort from the real user. A text input styled for display:none within the submission form, possibly with a name of "zipcode" or something similar. Most bots will attempt to populate it with "convincing" data.

When you process the form, reject any submission with data in that box.

I didn't come up with this, but it seems to work pretty well.
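A minimal sketch of the server-side check in Python, assuming the submitted form arrives as a plain dictionary and the decoy field is named "zipcode" as suggested above. The function name is hypothetical.

```python
HONEYPOT_FIELD = "zipcode"  # decoy input hidden from humans with display:none

def looks_like_bot(form: dict) -> bool:
    """Reject the submission if the hidden decoy field carries any data."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())

# Usage: a human leaves the hidden field empty, a naive bot fills it in.
print(looks_like_bot({"comment": "Nice post", "zipcode": ""}))
print(looks_like_bot({"comment": "Buy pills", "zipcode": "90210"}))
```

This stops only bots that blindly fill every field; a bot written for the specific form can simply leave the decoy empty.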

Brother Erryn on March 5, 2008 6:56 AM

I think the ascii-art captchas are as weak as image based captchas. it's a kind of security by obscurity if you ask me. if google would decide to use them, it'll take a day and they're solved :)

Greg on March 5, 2008 7:01 AM

@monsur reCAPTCHA has an audio-based captcha

ka2 on March 5, 2008 7:03 AM

Please enter your Social Security number, mother's maiden name, date of birth and driver's license ID.

(It works for the bank...)

I am starting to think that some kind of global internet ID system that relates back to real world credentials is the only way to go. I know, it removes our anonymity, but it solves the problem.

The ID could be constructed in such a way that websites could not access the private information, just the fact that this ID is from a real person. Of course, regulating whoever has that information would be the challenge.

Jeff Davis on March 5, 2008 7:12 AM

Almost all the great suggestions on this "thread" are security by obscurity. Putting in JavaScript ? Invisible fields named "zipcode" ? Those things will be circumvented 30 seconds after they have been implemented. Remember, we are talking about Google and Hotmail here, not some private blog. On a private blog, even something as silly as Jeff's "orange" is enough.

J. Stoever on March 5, 2008 7:13 AM

Brother Erryn, that's rather easy to break - naturally it will only stop bots that are not expecting your mechanism, but any sophisticated attacker of Google/Yahoo/Microsoft is going to spend some time studying the page to determine minor obstacles such as those.

Javascript/css tricks are easily broken.

Bobby on March 5, 2008 7:20 AM

Jeff,

From a recent post on the Joel on Software forums:

"Chenette said organized attackers are using automated tools to sign up for Gmail and other Web-mail accounts. When the CAPTCHA image appears, it's automatically sent off to a large and low-paid workforce, typically in another country, where a worker enters the code and sends it back so the account can be created."

http://www.theregister.co.uk/2008/02/08/microsoft_captcha_buster/

http://www.enterprise-security-today.com/story.xhtml?story_id=58602

How do you stop spammers from using low paid Humans to beat CAPTCHAs? Is the CAPTCHAs days numbered?"

So it appears that the Google CAPTCHA algo hasn't been broken at all, but simply circumvented by those willing to pay people to get them through.

KenW on March 5, 2008 7:21 AM

Oops! Forgot the link to the post at JOS:

http://discuss.joelonsoftware.com/default.asp?joel.3.600679.21

KenW on March 5, 2008 7:24 AM

i think the next step in captcha is to require a valid answer, not just repeat the letters.

here are some ideas. to be successful, you would have to have a bank of X000 simple, first grade question/answers.

what color is this?
what is 1+3
what is this year
what is the first
what is 10/2
what day is after Monday
how many hours are in the day
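A question bank of this kind could be sketched in Python as follows. The four entries and the accepted spellings are illustrative assumptions; as noted above, a usable bank would need thousands of entries.

```python
import random

# Tiny illustrative bank mapping each question to its accepted answers.
QUESTION_BANK = {
    "What is 1+3?": {"4", "four"},
    "What is 10/2?": {"5", "five"},
    "What day is after Monday?": {"tuesday"},
    "How many hours are in the day?": {"24", "twenty-four", "twenty four"},
}

def pick_question(rng: random.Random) -> str:
    """Choose a question at random for this form load."""
    return rng.choice(sorted(QUESTION_BANK))

def answer_is_valid(question: str, answer: str) -> bool:
    """Compare the normalized answer against the accepted set."""
    return answer.strip().lower() in QUESTION_BANK.get(question, set())
```

With only a handful of questions a bot can memorize every answer, which is why the size of the bank matters more than the difficulty of any single question.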

scott cate on March 5, 2008 7:25 AM

How exactly are spammers any different than traditional marketing houses that send bulk mail advertising to your mailbox? Guess what.. the difference is strictly due to the public's perception.

If you really want to stem the tide, it needs to be legitimized and regulated. Once that is done, the various governments would have a financial incentive to really punish rogue spammers. After all, a rogue spammer would be cutting into their own profit. Further, the traditional marketing companies would push the smaller guys out of the market.

Here's how I see it: all ISPs pool their email address list into a giant database. A spammer would buy the right to send x number of messages to x number of addresses on that list. Say 40% goes to the government of the country of the ISP of the recipient, the rest goes to the ISP. If the spammer sends a message to someone on the Do Not Email list they are fined something like $100 per instance, lack of paying the fine = jail time for whoever the government can capture. Maybe it costs something like $0.05 per address per message, which is pretty close to bulk mailing rates.

There's financial incentive for: 1. ISPs to join the list; 2. Pretty much any government to enforce regulation, which is something they like doing anyway; and, 3. Spammers to register and follow the rules.

After all, from a spammer's perspective it's much more cost effective to broadcast a message to a known good list of recipients than it is to try and harvest those addresses in the first place.

Chris Lively on March 5, 2008 7:32 AM

One solution I've seen (and only in one place - in a free 2chan-esque image board software package) is a 'spam trap' - basically invisible form fields that are only filled out by spambots. These fields are then tested and if they have any value, the input is discounted as spam.

Phil on March 5, 2008 7:32 AM

I'm a fan of the reCAPTCHA project. But lately I've hit a lot of words on reCAPTCHA that I can't decipher! I love the idea of CAPTCHA using a picture instead of words; it'd be easier to internationalize such a system.

monsur on March 5, 2008 7:33 AM

How about this:
Once your CAPTCHA algorithm is broken, you obtain the solution and incorporate it into your own CAPTCHA generator:

1) Generate the image.
2) Use the solution.
3) See if the "solution" matches the actual answer.
4) If it does, discard it and do steps 1-4 again. If it doesn't match, then your CAPTCHA is safe!
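The loop above can be sketched in Python. Everything here is a stand-in: the generator returns plain text where a real one would render a distorted image, and the solver is whatever known attack has been obtained, passed in as a function.

```python
import random
import string
from typing import Callable

def generate_challenge(rng: random.Random) -> str:
    """Step 1 (stand-in): a real generator would render this text as an image."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(6))

def unsolved_challenge(solver: Callable[[str], str],
                       rng: random.Random,
                       max_tries: int = 100) -> str:
    """Keep generating until the known solver fails on a challenge."""
    for _ in range(max_tries):
        answer = generate_challenge(rng)
        guess = solver(answer)   # step 2: run the known attack
        if guess != answer:      # steps 3-4: keep it only if the attack failed
            return answer
    raise RuntimeError("known solver beat every generated challenge")

# Toy solver (assumption): it only cracks challenges containing the letter "a".
toy_solver = lambda image: image if "a" in image else ""
print(unsolved_challenge(toy_solver, random.Random(0)))
```

The filter only protects against the specific solver it was tested with; a different or improved attack is not covered.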

The only problem would be obtaining the solution. $$$ :P

saintpretz59 on March 5, 2008 7:35 AM

Here's a question: could improvements of CAPTCHA-defeating technology be used to make super-reliable OCR?

Shmork on March 5, 2008 7:41 AM

As http://en.wikipedia.org/wiki/Captcha clearly states:

[quote]
A CAPTCHA system is a means of automatically generating new challenges which:

...
- Does not rely on the type of CAPTCHA being new to the attacker. Although a checkbox "check here if you are not a bot" might serve to distinguish between humans and computers, it is not a CAPTCHA because it relies on the fact that an attacker has not spent effort to break that specific form.
[/quote]

This point seems to be missed by just about everyone, and it's something worth considering. Just think "what would Bruce Schneier say?" and you'll get it right ;)

dave on March 5, 2008 7:44 AM

they just need flash based animated + audio captchas

netduke on March 5, 2008 7:53 AM

An intelligence test like:

"You have a bucket that holds two gallons and one that holds three gallons. How many buckets do you have?" (smirk)

Mad Prophet on March 5, 2008 7:55 AM

BTW, I've expanded my comment into a post: http://taint.org/2008/03/05/122732a.html

Justin Mason on March 5, 2008 7:56 AM

I think it's high time that we stop trying to address the symptoms and start addressing the root cause of these sorts of problems.

Spammers should be legal hunting targets, plain and simple. I know I'd pay a hefty license and tag fee to be able to hunt spammers. This ought to be reality TV as well. Think of "Running Man" with Dog the Bounty Hunter as the host.

Enough is enough already. If spammers can't use the internet for good, then they should lose the privilege to use it (or live, I'd prefer it that way). They're a waste of humanity. They're also utterly stupid if they can't realize I'm not going to buy their stupid pills after the 200th e-mail...

Chris Holmes on March 5, 2008 8:00 AM

i hate them

so many times they forced me to try again... and again...

nowadays, i don't want to bother. i look how difficult the captcha is.

if it is too difficult, i don't even bother to write anything and leave the site quickly.

they may be necessary, but as far as I am concerned, they make MY life harder. So i will not give in to them at all.

The web should be for people, not against people.

shev on March 5, 2008 8:01 AM

Has anyone tried bayesian filtering?

http://en.wikipedia.org/wiki/Bayesian_spam_filtering
http://en.wikipedia.org/wiki/Naive_Bayes_classifier
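For readers unfamiliar with the technique, here is a minimal sketch of a naive Bayes text classifier in Python with add-one smoothing. The class name and the whitespace tokenizer are simplifying assumptions; a production filter would need far more training data and better tokenization.

```python
import math
from collections import Counter

class NaiveBayesSpamFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        """Record the words of one labeled message ("spam" or "ham")."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_log_odds(self, text: str) -> float:
        """Positive means more likely spam, negative more likely ham."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(vocab)  # add-one smoothing
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in text.lower().split():
                if word in vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores["spam"] - scores["ham"]
```

A filter like this judges the content of a message after it is written, so it complements a sign-up CAPTCHA rather than replacing it.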

Michael Lang on March 5, 2008 8:02 AM

Hmm, perhaps this is why I've been getting a lot of spam from gmail users as of late...

Thic Ric on March 5, 2008 8:04 AM

Do you know what a PITA CAPTCHA is on a site like JK on the Run? It's so blurry and grey that sometimes it takes me 3-4 tries to enter the correct combination.

I don't understand the need for this speedbump with people who use OpenID. WordPress has embraced OpenID and I can use it to comment at many places without having to compile a long list of usernames and passwords.

Why can't there be a method for sites to compile lists of valid OpenIDs so people like me can skip the CAPTCHA Hell?

Mike Cane on March 5, 2008 8:09 AM

The choose cat and dog problem would probably work, but of course, you'd have to do it either iteratively (which people would get tired of), or have something like a 5x5 grid, and choose which pictures had cats. A 5x5 grid, with a 50% chance of any given picture containing a cat, would result in about a 1 in 33.5 million (2^25) chance of a bot getting it right by random guessing. And the server can have a very large number of pictures stored for the purpose (each picture could conceivably be less than 10 kB in size). The CAPTCHA would have to apply random modifications to the pictures to prevent an attacker from just storing which picture corresponds with a given answer, however.
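The guessing odds for a grid like this are easy to check: with 25 independent cat-or-not cells there are 2^25 possible answer patterns, which is about 33.5 million. A short Python sketch (the function name is made up for illustration):

```python
def answer_patterns(cells: int, choices_per_cell: int = 2) -> int:
    """Number of equally likely answer patterns a blind guesser faces."""
    return choices_per_cell ** cells

print(answer_patterns(25))  # 5x5 grid of cat / not-cat -> 33554432
print(answer_patterns(1, choices_per_cell=4))  # one 4-way multiple choice -> 4
```

This assumes each picture is an independent 50/50 choice; a bot that recognizes even some of the pictures does much better than blind guessing.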

Matthew Hui on March 5, 2008 8:10 AM

I don't like CAPTCHAs, but I see a major problem (from a web design perspective) with most of the new methods as well: most of them rely on Javascript.

Now, don't get me wrong, lacking support for non-javascript browsers isn't a show-stopper, but it does pose a problem for people who browse with javascript disabled. This includes users of the NoScript extension for Firefox, and people with text-based browsers like Lynx.

Jacob on March 5, 2008 8:12 AM

The website describing the GMail captcha crack was confusing, but it seemed to me that far from inventing a brilliant captcha-reading algorithm, they were just employing people to type in the captchas as they come in. No human-vs-machine principle can beat that.

Zack on March 5, 2008 8:17 AM

About the human farms, including a watermark/some kind of branding in the captcha image would at least generate some suspicion.

lmjabreu on March 5, 2008 8:30 AM

My own blog at http://smokinn.com/blog does something similar to the cats vs dogs thing. I make people pick out between fluffy/not fluffy.

I fully expect this to be the new wave in captchas. It's MUCH more user-friendly and there are so many implementation tricks you can use (mine is very naive but I can already think of 3 improvements I'll make if ever a spam bot gets through) that it can be very solid.

Guillaume Theoret on March 5, 2008 8:30 AM

captcha has a grave usability problem. an alternative to this is to ask simple questions, e.g. is water liquid or solid? answer: liquid, etc. here is such a plugin for wordpress blogs: http://www.adesblog.com/2008/03/01/wp-plugin-captcha-alternative-quesion/

Ades on March 5, 2008 8:49 AM

My knowledge of how CAPTCHA works is very limited, but I want to know why CAPTCHAs are always static. Would bots be able to break CAPTCHAs that use kinetic typography? I would think that moving, morphing text/images would be much more difficult to analyze and break.

http://www.cs.cmu.edu/~johnny/kt/ ..the demos are really neat.

chillings on March 5, 2008 8:51 AM

The major web-based email providers - GMail, Hotmail, etc... - should require the user's browser to perform some calculation in JavaScript for every email that is sent. The time required to perform this calculation would be minimal for normal users, but prohibitive if you're sending bulk spam.
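This is essentially a proof-of-work scheme. A rough sketch, written in Python rather than browser JavaScript and assuming a hash-based puzzle: the sender must find a nonce whose SHA-256 hash of the message falls below a target, while the server verifies it with a single hash. The difficulty value and function names are illustrative assumptions.

```python
import hashlib
from itertools import count

DIFFICULTY_BITS = 16  # assumption: each extra bit doubles the expected work

def _hash_value(message: str, nonce: int) -> int:
    digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")

def solve(message: str, bits: int = DIFFICULTY_BITS) -> int:
    """Client side: brute-force the first nonce that meets the target."""
    target = 1 << (256 - bits)
    for nonce in count():
        if _hash_value(message, nonce) < target:
            return nonce

def verify(message: str, nonce: int, bits: int = DIFFICULTY_BITS) -> bool:
    """Server side: one hash is enough to check the work."""
    return _hash_value(message, nonce) < (1 << (256 - bits))
```

The cost is asymmetric by design, but spammers running on networks of compromised PCs are spending someone else's CPU time, which weakens the deterrent.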

@JP on March 5, 2008 02:37 AM
Paying $1 for every web registration is a terrible idea. First, I don't want to give out my credit-card number to everyone. I even feel paranoid about Amazon.com trying to store it. Second, lots of people don't have credit cards (e.g., kids). Lastly, it would discourage me from posting on almost any discussion board because I'm too cheap to pay for the privilege of providing help or asking questions myself.

KG on March 5, 2008 8:53 AM

Logging into HSBC's personal internet banking account requires 3 things: user name, password you type in and last - another password where you have to use your mouse to point and click on an on-screen keyboard in order to enter the information.

If it is required to point and click an on-screen keyboard in order to enter information - would that help stop bots?

JR on March 5, 2008 9:10 AM

Actually you forgot a category: Social synchronization. If you expect people to pick a word that best represents multiple pictures you are expecting them to think the same way. I frequently fail this type of social sync test; I just don't seem to think the same way as most people for some reason. Crossword puzzles are perhaps the best example of this.

With bots cracking it 20% of the time, I would be interested in the failure rate for flesh and blood. I know that I don't hit 100%. Makes me wonder what the average is.

At a thought, put instructions into an image to complete a task and have the result of that task be the key. Of course, that would exclude the simple and the visually impaired and those of a different language.

I suspect that ultimately we will have to fall back on a third party that could be used to verify our identity and provide websites a semi-anonymous yea or nay without passing on any of the personal details we provided to the third party to verify our identity.

Wait, I think that's been done and it failed the paranoia test...

We are our own worst enemies at times.

Xepol on March 5, 2008 9:15 AM

Here's a thought... use the captcha to do a little research as well.

While we're asking users a question... maybe we can make the answers they give relevant and useful to us

Jim on March 5, 2008 9:17 AM

another thought... you could make the instructions an image... so the algorithms would need to get the OCR right... then interpret the directions... then figure out how to follow them.

seems pretty bulletproof to me (at the moment)

Jim on March 5, 2008 9:19 AM

Any kind of Captcha is useless for determined spammers. If I were a spammer, I would hire a low cost laborer who can, in an hour, manually open tens of new email accounts. Why waste time developing and launching anti captcha bots?

Abdu on March 5, 2008 9:22 AM

Spammers are like child molesters. They only stop when you put bullets through their heads. I continue to hope that one day society will decide to step up to the plate, instead of endlessly and pointlessly playing Spy-vs-Spy. But (sigh) just like the "War on Terror", the real point is how much money MegaCorpGov can make out of it, and that is ultimately determined by how long it can be dragged out.

Ed Tuonine on March 5, 2008 9:57 AM
