ABSTRACT
Although several attacks have been proposed, text-based captchas are still widely used as a security mechanism. One reason for their pervasive use is that many prior attacks are scheme-specific and require a labor-intensive, time-consuming process to construct. As a result, a change in a captcha's security features, such as a noisier background, can easily invalidate an earlier attack. This paper presents a generic yet effective text captcha solver based on a generative adversarial network. Unlike prior machine-learning-based approaches, which need a large volume of manually labeled real captchas to learn an effective solver, our approach requires significantly fewer real captchas yet yields much better performance. This is achieved by first learning a captcha synthesizer that automatically generates synthetic captchas for training a base solver, and then fine-tuning the base solver on a small set of real captchas using transfer learning. We evaluate our approach by applying it to 33 captcha schemes, including 11 schemes that are currently used by 32 of the top-50 popular websites, among them Microsoft, Wikipedia, eBay and Google. Our approach is the most capable attack on text captchas seen to date. It outperforms four state-of-the-art text-captcha solvers, not only by delivering significantly higher accuracy on all tested schemes, but also by successfully attacking schemes where the others have zero chance. We show that our approach is highly efficient, as it can solve a captcha within 0.05 seconds using a desktop GPU. We demonstrate that our attack is generally applicable because it can bypass the advanced security features employed by most modern text captcha schemes. We hope the results of our work will encourage the community to revisit the design and practical use of text captchas.
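The two-stage training idea in the abstract (train a base solver on plentiful synthetic data, then fine-tune it on a small real set) can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: the "captchas" are 2-D points rather than images, the solver is a logistic regression rather than a deep network, and the GAN-based synthesizer is replaced by a plain random generator (`make_samples`, a hypothetical helper). It is not the authors' implementation; it only shows fine-tuning starting from pretrained weights.

```python
import math
import random

def make_samples(n, shift, rng):
    """Toy stand-in for captcha images: 2-D points with a binary label.
    `shift` moves the class centers to mimic the synthetic/real gap."""
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = (1.0 if label else -1.0) + shift
        features = [rng.gauss(center, 0.5), rng.gauss(center, 0.5)]
        samples.append((features, label))
    return samples

def predict(weights, features):
    """Logistic-regression probability; weights[-1] is the bias."""
    z = weights[-1] + sum(w * x for w, x in zip(weights, features))
    z = max(-30.0, min(30.0, z))  # clamp so exp() stays well behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(weights, samples, epochs, lr):
    """Plain SGD on log-loss, starting from a copy of the given weights."""
    weights = list(weights)
    for _ in range(epochs):
        for features, label in samples:
            err = predict(weights, features) - label
            for i, x in enumerate(features):
                weights[i] -= lr * err * x
            weights[-1] -= lr * err
    return weights

def accuracy(weights, samples):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum((predict(weights, f) >= 0.5) == bool(y) for f, y in samples)
    return hits / len(samples)

rng = random.Random(0)
synthetic = make_samples(2000, shift=0.0, rng=rng)  # cheap, auto-labeled
real_train = make_samples(20, shift=0.6, rng=rng)   # small labeled real set
real_test = make_samples(500, shift=0.6, rng=rng)   # held-out real data

# Stage 1: base solver trained only on synthetic data.
base = train([0.0, 0.0, 0.0], synthetic, epochs=3, lr=0.1)
# Stage 2: fine-tuning starts from the base weights, not from scratch.
tuned = train(base, real_train, epochs=20, lr=0.05)

print("base solver on real data:", accuracy(base, real_test))
print("fine-tuned on real data: ", accuracy(tuned, real_test))
```

The design point is that `train` takes its starting weights as an argument, so the second call reuses what was learned from synthetic data and only needs the 20 real samples to adjust for the distribution shift.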
Index Terms
- Yet Another Text Captcha Solver: A Generative Adversarial Network Based Approach