Research article
DOI: 10.1145/3593013.3593972

How to Explain and Justify Almost Any Decision: Potential Pitfalls for Accountability in AI Decision-Making

Published: 12 June 2023

ABSTRACT

Discussion of the “right to an explanation” has become increasingly relevant because of its potential utility for auditing automated decision systems, as well as for contesting such decisions. However, most existing work on explanations focuses on collaborative environments, where designers are motivated to implement good-faith explanations that reveal potential weaknesses of a decision system. This motivation may not hold in an auditing environment. We therefore ask: to what extent could explanations be used maliciously to defend a decision system? In this paper, we demonstrate how a black-box explanation system built to defend a black-box decision system could manipulate decision recipients or auditors into accepting an intentionally discriminatory decision model. In a case-by-case scenario, where decision recipients are unable to share their cases and explanations, we find that most individual decision recipients could receive a verifiable justification even if the decision system is intentionally discriminatory. In a system-wide scenario, where every decision is shared, we find that while justifications frequently contradict each other, there is no intuitive threshold for determining whether these contradictions stem from malicious justifications or from the simplicity requirements of those justifications conflicting with model behavior. We end with a discussion of how system-wide metrics may be more useful than explanation systems for evaluating overall decision fairness, while explanations could be useful outside of fairness auditing.
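To make the case-by-case scenario concrete, the following is a minimal, hypothetical Python sketch (not the paper's actual system; all names, features, and numbers are illustrative assumptions). Here, decide() stands in for an intentionally discriminatory black-box model, and justify() plays the role of a malicious explanation system that cites only innocuous features, so that each recipient who sees only their own case receives a justification they can verify but cannot easily refute.

```python
# Hypothetical sketch of a malicious "justification" generator defending a
# discriminatory decision model. Everything here is an illustrative assumption.
import random

random.seed(0)

PROTECTED = "group"                       # attribute the model secretly uses
INNOCUOUS = ["income", "debt", "tenure"]  # features a justification may cite


def decide(person):
    """Intentionally discriminatory black-box decision: approve only group A."""
    return "approve" if person[PROTECTED] == "A" else "reject"


def justify(person, decision, shared_cases):
    """Pick an innocuous threshold rule that reproduces this person's decision,
    preferring the rule that also agrees with the model on the most shared
    cases, so it is harder for an auditor to contradict."""
    best = None
    for feat in INNOCUOUS:
        # Choose the cutoff so the rule matches this person's outcome by construction.
        cutoff = person[feat] if decision == "approve" else person[feat] + 1
        rule = lambda p, f=feat, c=cutoff: "approve" if p[f] >= c else "reject"
        agreement = sum(rule(p) == decide(p) for p in shared_cases)
        if best is None or agreement > best[2]:
            best = (feat, cutoff, agreement)
    feat, cutoff, _ = best
    if decision == "approve":
        return f"Approved: your {feat} of {person[feat]} meets the cutoff of {cutoff}."
    return f"Rejected: your {feat} of {person[feat]} is below the cutoff of {cutoff}."


population = [
    {"group": random.choice("AB"),
     "income": random.randint(20, 100),
     "debt": random.randint(0, 50),
     "tenure": random.randint(0, 30)}
    for _ in range(200)
]

# Seen case by case, every recipient receives a plausible, individually
# verifiable justification, even though the decision depends only on "group".
for person in population[:4]:
    print(justify(person, decide(person), population))
```

In the system-wide scenario the abstract describes, collecting these per-case justifications would reveal mutually contradictory cutoffs, but, as the paper notes, contradictions alone do not cleanly separate malicious justification from honest simplification.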


Published in

FAccT '23: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency
June 2023, 1929 pages
ISBN: 9798400701924
DOI: 10.1145/3593013

Copyright © 2023 ACM. Publication rights licensed to ACM.

Publisher: Association for Computing Machinery, New York, NY, United States
