Explainable AI is Dead, Long Live Explainable AI!: Hypothesis-driven Decision Support using Evaluative AI

ABSTRACT
In this paper, we argue for a paradigm shift away from the current model of explainable artificial intelligence (XAI), which may be counterproductive to better human decision making. Early decision support systems assumed that we could give people recommendations, that they would consider them, and that they would follow them when appropriate. However, research shows that people often ignore recommendations because they do not trust them; or, perhaps worse, follow them blindly even when the recommendations are wrong. Explainable artificial intelligence mitigates this by helping people understand how and why models give certain recommendations. Yet recent research shows that people do not always engage with explainability tools deeply enough to improve their decision making. The assumption that people will engage with recommendations and explanations has proven unfounded. We argue this is because we have failed to account for two things. First, recommendations (and their explanations) take control away from human decision makers, limiting their agency. Second, giving recommendations and explanations does not align with the cognitive processes that people employ when making decisions. This position paper proposes a new conceptual framework, called Evaluative AI, for explainable decision support. This is a machine-in-the-loop paradigm in which decision support tools provide evidence for and against decisions made by people, rather than recommendations to accept or reject. We argue that this mitigates issues of over- and under-reliance on decision support tools, and better leverages human expertise in decision making.
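The contrast the abstract draws can be made concrete with a minimal sketch. The code below is illustrative only and is not from the paper: all names, data structures, and weights are hypothetical. A conventional recommendation-driven tool returns a single option for the user to accept or reject, whereas an evaluative tool takes a hypothesis the *human* proposes and returns the evidence for and against it.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One piece of evidence: a positive weight supports the
    hypothesis, a negative weight counts against it."""
    feature: str
    weight: float


def recommend(scores: dict[str, float]) -> str:
    """Recommendation-driven support: return only the top-scoring
    option, inviting acceptance or rejection."""
    return max(scores, key=scores.get)


def evaluate(hypothesis: str,
             evidence: dict[str, list[Evidence]]
             ) -> tuple[list[Evidence], list[Evidence]]:
    """Evaluative AI sketch: for a decision proposed by the human,
    surface evidence for and against it, without recommending."""
    items = evidence.get(hypothesis, [])
    pro = [e for e in items if e.weight > 0]
    con = [e for e in items if e.weight <= 0]
    return pro, con


# Hypothetical example: a clinician proposes "diagnosis_a" and asks
# the tool to weigh the evidence rather than pick for them.
evidence = {"diagnosis_a": [Evidence("fever", 1.2),
                            Evidence("no_rash", -0.4)]}
pro, con = evaluate("diagnosis_a", evidence)
```

The design point is that `evaluate` keeps the human in control of which hypotheses are considered; the machine's role is limited to marshalling evidence, which is the machine-in-the-loop arrangement the abstract argues for.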