unCaptcha: A Low-resource Defeat of reCaptcha's Audio Challenge


Update (12/28/2018): After we informed Google about unCaptcha, they updated their audio challenges to issue phrases instead of digits. unCaptcha v2 now breaks these new challenges with even higher accuracy (around 90%) than before. (unCaptcha v2 code)


CAPTCHAs are the Internet's first line of defense against automated account creation and service abuse. Google's reCaptcha, one of the most popular captcha systems, is currently used by hundreds of thousands of websites to protect against automated attackers by testing whether a user is truly human.

We present unCaptcha, an automated system that can solve reCaptcha's most difficult auditory challenges with a high success rate. We have evaluated unCaptcha using over 450 reCaptcha challenges from live websites, and shown that it can solve them with 85.15% accuracy in 5.42 seconds on average: less time than it takes to even play the audio challenge!

unCaptcha combines free, public, online speech-to-text engines with a novel phonetic mapping technique, demonstrating that mounting a successful large-scale attack on the reCaptcha system requires minimal resources.

To put it simply: talk is cheap!

WHAT IS reCaptcha?

reCaptcha is a service offered by Google to infer whether the user of a website is truly human. Such a "captcha" is a defense system designed to protect against automated account creation. The security of captcha systems is paramount to protecting services on the Internet from attacks, such as the Sybil attack.

reCaptcha works primarily by observing many different pieces of evidence that might indicate a human user, such as how the user types, moves their mouse, and so on. Nonetheless, sometimes reCaptcha cannot tell whether the user is human. When that happens, reCaptcha presents users with a grid of pictures, like the one below:

However, visual captchas are not solvable by all users. To support visually impaired users, reCaptcha allows users to request audio captchas by clicking on the headphones icon in the bottom left of the above picture. These audio challenges consist of a sequence of recorded voices saying numbers, such as "seven... three... two...". Users are simply asked to type in the digits they hear. This is what we attack.

HOW DOES unCaptcha WORK?

The key insight behind unCaptcha is that today's speech-to-text services are highly capable. Even Google's own free speech-to-text services could be used against the very defense mechanism they offer!

[Figure: unCaptcha design overview]

Briefly, unCaptcha works in the following steps:
  1. Download the audio captcha
  2. Segment the audio into individual digit audio clips
  3. Upload each segment to multiple online speech-to-text services
  4. Convert these services' responses to digits:
    • Exact homophones: If the response is a digit word ("one", "two", etc.), then guess that number
    • Near homophones: If the response sounds like a digit, as "true" sounds like "two", then guess the digit it sounds like
  5. Ensemble the multiple services together by taking a weighted vote based on confidence
  6. And finally upload the answer
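
Steps 4 and 5 above can be illustrated with a minimal Python sketch. It is not the code from the paper or the repository: the homophone tables are small illustrative samples, the function names (to_digit, vote, solve) are hypothetical, and summing the confidences per digit is only one plausible reading of "a weighted vote based on confidence". Each speech-to-text response is assumed to arrive as a (transcript, confidence) pair.

```python
from collections import defaultdict

# Illustrative tables only (assumption): the real phonetic mapping is larger.
EXACT = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
         "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
NEAR = {"true": "2", "to": "2", "too": "2", "won": "1", "tree": "3",
        "for": "4", "ate": "8", "sex": "6"}
DIGITS = set(EXACT.values())


def to_digit(transcript):
    """Map one transcript to a digit character, or None if nothing matches."""
    word = transcript.strip().lower()
    if word in DIGITS:           # the service already returned "7"
        return word
    if word in EXACT:            # exact homophone: "seven" -> "7"
        return EXACT[word]
    return NEAR.get(word)        # near homophone: "true" -> "2", else None


def vote(responses):
    """Pick a digit for one audio segment.

    responses: list of (transcript, confidence) pairs, one per service.
    Each recognized digit accumulates the confidence of the responses
    that map to it; the digit with the highest total wins.
    """
    scores = defaultdict(float)
    for transcript, confidence in responses:
        digit = to_digit(transcript)
        if digit is not None:
            scores[digit] += confidence
    if not scores:
        return None              # no service produced anything digit-like
    return max(scores, key=scores.get)


def solve(segments):
    """Combine per-segment votes into an answer; '?' marks an unknown digit."""
    return "".join(vote(responses) or "?" for responses in segments)


if __name__ == "__main__":
    # Hypothetical responses from two services for three audio segments.
    example = [
        [("seven", 0.9), ("heaven", 0.4)],
        [("tree", 0.5), ("three", 0.7)],
        [("true", 0.4), ("blue", 0.9)],
    ]
    print(solve(example))
```

In this sketch a response that maps to no digit simply contributes nothing to the vote, so one service mishearing a segment does not block an answer as long as another service returns something digit-like.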

For detailed information, please view our WOOT 2017 paper.


All code and data from the WOOT'17 paper are available on the unCaptcha GitHub repository.

The updated attack is available on the unCaptcha v2 GitHub repository.


(pdf) unCaptcha: A Low-Resource Defeat of reCaptcha's Audio Challenge
Kevin Bock, Daven Patel, George Hughey, Dave Levin
WOOT 2017 (USENIX Workshop on Offensive Technologies)


The following people have contributed to this project:
  • Kevin Bock (University of Maryland)
  • Daven Patel (University of Maryland)
  • George Hughey (University of Maryland)
  • Dave Levin (University of Maryland)

