DOI: 10.1145/3442188.3445928
research-article
Public Access

Building and Auditing Fair Algorithms: A Case Study in Candidate Screening

Published: 01 March 2021

ABSTRACT

Academics, activists, and regulators are increasingly urging companies to develop and deploy sociotechnical systems that are fair and unbiased. Achieving this goal, however, is complex: the developer must (1) deeply engage with social and legal facets of "fairness" in a given context, (2) develop software that concretizes these values, and (3) undergo an independent algorithm audit to ensure technical correctness and social accountability of their algorithms. To date, there are few examples of companies that have transparently undertaken all three steps.

In this paper we outline a framework for algorithmic auditing by way of a case study of pymetrics, a startup that uses machine learning to recommend job candidates to its clients. We discuss how pymetrics approaches the question of fairness given the constraints of ethical, regulatory, and client demands, and how pymetrics' software implements adverse impact testing. We also present the results of an independent audit of pymetrics' candidate screening tool.
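To make "adverse impact testing" concrete, the sketch below applies the four-fifths rule of thumb from the EEOC Uniform Guidelines: each group's selection rate is compared with the highest group's rate, and a ratio below 0.8 is flagged. This is an illustration only, not pymetrics' implementation and not the audit-ai API; the function names, the record format, and the sample counts are all assumptions made for this example.

```python
from collections import Counter


def selection_rates(records):
    """Compute the selection rate per group.

    `records` is an iterable of (group, selected) pairs, where `selected`
    is True if the candidate passed the screen. This record format is a
    hypothetical one chosen for the example.
    """
    totals = Counter()
    passed = Counter()
    for group, selected in records:
        totals[group] += 1
        if selected:
            passed[group] += 1
    return {group: passed[group] / totals[group] for group in totals}


def adverse_impact_ratios(rates):
    """Divide each group's selection rate by the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has any selected candidates")
    return {group: rate / best for group, rate in rates.items()}


def flags_adverse_impact(rates, threshold=0.8):
    """Flag groups whose impact ratio falls below the four-fifths threshold."""
    ratios = adverse_impact_ratios(rates)
    return {group: ratio < threshold for group, ratio in ratios.items()}


if __name__ == "__main__":
    # Made-up counts: 50 of 100 candidates pass in group A, 30 of 100 in group B.
    sample = (
        [("A", True)] * 50 + [("A", False)] * 50
        + [("B", True)] * 30 + [("B", False)] * 70
    )
    rates = selection_rates(sample)
    print(adverse_impact_ratios(rates))
    print(flags_adverse_impact(rates))
```

With these made-up counts, group B's rate is three-fifths of group A's, so group B is flagged. A ratio check of this kind is a screening heuristic; in practice it is usually paired with a statistical significance test, which this sketch leaves out.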

We conclude with recommendations on how to structure audits to be practical, independent, and constructive, so that companies have a stronger incentive to participate in third-party audits and watchdog groups are better prepared to investigate companies.


Published in

FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency
March 2021
899 pages
ISBN: 9781450383097
DOI: 10.1145/3442188

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States
