Explainable AI is Dead, Long Live Explainable AI!: Hypothesis-driven Decision Support using Evaluative AI

ABSTRACT
In this paper, we argue for a paradigm shift away from the current model of explainable artificial intelligence (XAI), which may be counterproductive to better human decision making. Early decision support systems assumed that we could give people recommendations, that they would consider them, and that they would follow them when appropriate. However, research shows that people often ignore recommendations because they do not trust them; or, perhaps worse, follow them blindly even when the recommendations are wrong. Explainable artificial intelligence mitigates this by helping people understand how and why models give certain recommendations. Yet recent research shows that people do not always engage with explainability tools deeply enough to improve their decision making. The assumption that people will engage with recommendations and explanations has proven unfounded. We argue this is because we have failed to account for two things. First, recommendations (and their explanations) take control away from human decision makers, limiting their agency. Second, giving recommendations and explanations does not align with the cognitive processes that people employ when making decisions. This position paper proposes a new conceptual framework, called Evaluative AI, for explainable decision support. This is a machine-in-the-loop paradigm in which decision support tools provide evidence for and against decisions made by people, rather than recommendations to accept or reject. We argue that this mitigates issues of over- and under-reliance on decision support tools, and better leverages human expertise in decision making.
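The contrast the abstract draws can be made concrete with a minimal sketch. The code below is illustrative only and is not from the paper: all names, data structures, and weights are hypothetical. A conventional recommendation-driven tool returns a single option for the user to accept or reject, whereas an evaluative tool takes a hypothesis the *human* proposes and returns the evidence for and against it.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """One piece of evidence: a positive weight supports the
    hypothesis, a negative weight counts against it."""
    feature: str
    weight: float


def recommend(scores: dict[str, float]) -> str:
    """Recommendation-driven support: return only the top-scoring
    option, inviting acceptance or rejection."""
    return max(scores, key=scores.get)


def evaluate(hypothesis: str,
             evidence: dict[str, list[Evidence]]
             ) -> tuple[list[Evidence], list[Evidence]]:
    """Evaluative AI sketch: for a decision proposed by the human,
    surface evidence for and against it, without recommending."""
    items = evidence.get(hypothesis, [])
    pro = [e for e in items if e.weight > 0]
    con = [e for e in items if e.weight <= 0]
    return pro, con


# Hypothetical example: a clinician proposes "diagnosis_a" and asks
# the tool to weigh the evidence rather than pick for them.
evidence = {"diagnosis_a": [Evidence("fever", 1.2),
                            Evidence("no_rash", -0.4)]}
pro, con = evaluate("diagnosis_a", evidence)
```

The design point is that `evaluate` keeps the human in control of which hypotheses are considered; the machine's role is limited to marshalling evidence, which is the machine-in-the-loop arrangement the abstract argues for.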