DOI: 10.1145/3128572.3140441
research-article

Generating Look-alike Names For Security Challenges

Published: 03 November 2017

ABSTRACT

Motivated by the need to automatically generate behavior-based security challenges to improve user authentication for web services, we consider the problem of large-scale construction of realistic-looking names to serve as aliases for real individuals. We aim to use these names to construct security challenges, where users are asked to identify their real contacts among a presented pool of names. We seek these look-alike names to preserve name characteristics like gender, ethnicity, and popularity, while being unlinkable back to the source individual, thereby making the real contacts not easily guessable by attackers.

To achieve this, we introduce the technique of distributed name embeddings, representing names in a high-dimensional space such that the distance between name components reflects the degree of cultural similarity between these strings. We present different approaches to constructing name embeddings from contact lists observed at a large web-mail provider, and evaluate their cultural coherence. We demonstrate that name embeddings strongly encode gender and ethnicity, as well as name popularity. We applied this technique to generate imitation names for an email contact-list challenge. Our controlled user study verified that the proposed technique reduced the attacker's success rate to 26.08%, indistinguishable from random guessing, compared to a success rate of 62.16% for previous name generation algorithms.
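The idea that names appearing in the same contact lists end up close together can be illustrated with a minimal sketch. It is not the authors' method: raw co-occurrence counts stand in for a learned embedding, and the contact lists, function names, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import permutations

# Hypothetical toy contact lists; each inner list is one user's contacts.
CONTACT_LISTS = [
    ["maria", "jose", "carlos"],
    ["maria", "carlos", "lucia"],
    ["jose", "lucia", "carlos"],
    ["wei", "li", "chen"],
    ["wei", "chen", "ming"],
    ["li", "ming", "chen"],
]


def build_vectors(contact_lists):
    """Map each name to a sparse vector of co-occurrence counts."""
    counts = defaultdict(lambda: defaultdict(int))
    for contacts in contact_lists:
        # Count every ordered pair of distinct names in the same list.
        for a, b in permutations(set(contacts), 2):
            counts[a][b] += 1
    # Return plain dicts so lookups never create new keys.
    return {name: dict(vec) for name, vec in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def look_alikes(name, vectors, k=2):
    """Return the k names most similar to `name`, excluding `name` itself."""
    scored = [
        (cosine(vectors[name], vectors[other]), other)
        for other in vectors
        if other != name
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


vectors = build_vectors(CONTACT_LISTS)
print(look_alikes("maria", vectors))
```

In this toy data the two groups of names never share a contact list, so their similarity is zero and a look-alike for a name in one group is always drawn from the same group.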

Finally, we use these embeddings to produce an open synthetic name resource of 1 million names for security applications, constructed to respect both cultural coherence and U.S. census name frequencies.
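One way to picture a resource that respects both name frequency and cultural coherence is frequency-weighted sampling inside a cultural group. The sketch below is only an illustration under stated assumptions: the discrete group labels stand in for proximity in the embedding space, the counts are invented rather than taken from census data, and the function names are hypothetical.

```python
import random

# Hypothetical counts per group; these are NOT real census figures.
FIRST_NAMES = {
    "group_a": {"maria": 50, "jose": 40},
    "group_b": {"wei": 30, "ming": 10},
}
LAST_NAMES = {
    "group_a": {"garcia": 60, "lopez": 30},
    "group_b": {"chen": 45, "wang": 45},
}


def sample_weighted(counts, rng):
    """Draw one key with probability proportional to its count."""
    keys = list(counts)
    weights = [counts[key] for key in keys]
    return rng.choices(keys, weights=weights, k=1)[0]


def synthetic_name(rng):
    """Pick a group by total first-name frequency, then a coherent full name."""
    group_totals = {g: sum(names.values()) for g, names in FIRST_NAMES.items()}
    group = sample_weighted(group_totals, rng)
    first = sample_weighted(FIRST_NAMES[group], rng)
    last = sample_weighted(LAST_NAMES[group], rng)
    return first, last


rng = random.Random(0)  # seeded so repeated runs give the same names
for _ in range(3):
    print(" ".join(synthetic_name(rng)))
```

Drawing the first and last name from the same group keeps each generated name coherent, while the weights keep common names more frequent than rare ones.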


Published in

AISec '17: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security
November 2017
140 pages
ISBN: 9781450352024
DOI: 10.1145/3128572

Copyright © 2017 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

AISec '17 paper acceptance rate: 11 of 36 submissions (31%). Overall acceptance rate: 94 of 231 submissions (41%).
