Abstract
It is frequently demanded that AI-based Decision Support Tools (AI-DSTs) ought to be both explainable to, and trusted by, those who use them. The joint pursuit of these two principles is ordinarily believed to be uncontroversial. In fact, a common view is that AI systems should be made explainable so that they can be trusted, and in turn, accepted by decision-makers. However, the moral scope of these two principles extends far beyond this particular instrumental connection. This paper argues that if we were to account for the rich and diverse moral reasons that ground the call for explainable AI, and fully consider what it means to “trust” AI in a descriptively rich sense of the term, we would uncover a deep and persistent tension between the two principles. For explainable AI to usefully serve the pursuit of normatively desirable goals, decision-makers must carefully monitor and critically reflect on the content of an AI-DST’s explanation. This entails a deliberative attitude. Conversely, calls for trust in AI-DSTs imply the disposition to put questions about their reliability out of mind. This entails an unquestioning attitude. As such, the joint pursuit of explainable and trusted AI calls on decision-makers to simultaneously adopt incompatible attitudes towards their AI-DST, which leads to an intractable implementation gap. We analyze this gap and explore its broader implications, suggesting that we may need alternative theoretical conceptualizations of what explainability and trust entail, and/or alternative decision-making arrangements that assign the requirements for trust and deliberation to different parties.
Notes
We use this particular term in order to make explicit the connection to similar ‘process models’ developed in the fields of information systems and organizational behavior (e.g., the ‘Technology Acceptance Model’) – where the pursuits of trust, ease of use, result demonstrability, etc. are justified exclusively in terms of their instrumental utility in obtaining the clearly defined end-goal of ‘acceptance’. We discuss these process models and further elaborate on our choice of this term in Sect. 3.
Although we use terms like “AI system” and “AI-DST” throughout our argument, it is important to note that worries about opacity are most salient when AI models are trained using certain machine learning (ML) techniques, since ML models tend to be architecturally more complex, and hence less intelligible to humans (Gunning et al., 2017). However, partly because even simple regression models can be opaque in certain interesting ways (see, for example, Lipton’s (2016) discussion of opacity arising from feature selection and preprocessing), and partly because ML techniques are increasingly commonplace in modern AI deployments, we decided that it would be appropriate to use the more general terms (AI/AI-DST) throughout.
Ferrario et al. (2022) provide an especially detailed and careful account of how explainability contributes to trust in AI. Specifically, they argue that explainability fosters trust if and only if it (a) provides justification for a belief about the trustworthiness of the AI system, and (b) causally contributes to reliance on the AI in the absence of monitoring. Interestingly, in developing their arguments, they employ a concept of “paradigmatic trust”, one component of which is an “anti-monitoring” stance towards the trustee. As we will subsequently discuss, this view relates very closely to how trust is conceptualized in the present paper (i.e., as comprising an ‘unquestioning attitude’). For this reason, one may reasonably see our paper as a natural extension of Ferrario et al.’s arguments – while they focus on the attitudes, beliefs and dispositions that make explanations useful for fostering trust in AI, we focus on what happens (at the attitudinal level) after this trust is fostered. Specifically, they ask: when does (and doesn’t) explainability foster trust? And we ask: can explanations still serve their intended normative purpose after trust is fostered? Our thanks to an anonymous reviewer for pointing out this interesting connection.
Explanations from AI, of course, are not only useful for decision-makers to improve their decision-making. There are a number of important moral reasons for pursuing explainability that center on the value of explanations to decision-subjects – to contest and seek recourse for unfair decisions, to give informed consent to AI-driven decision-making processes, etc. We will return to discuss these in Sect. 5 [Implications].
Although we happen to find Ryan’s view generally persuasive, the arguments in this paper do not require a firm stance on any particular position about whether AI systems can be trustworthy and/or trusted in the paradigmatic interpersonal senses of these terms. If the reader is persuaded by Ryan’s (and/or other similar) arguments about trust in AI being a form of category error, they might still accept without contradiction that (a) people can adopt similar attitudes and dispositions towards an AI system as they would towards a human they trust, and (b) sometimes this ‘trust’ (in a descriptively rich, but non-normative sense of the term) is desirable. Conversely, if the reader believes that full-blooded normative trust in AI is possible, our arguments would be readily compatible with their view.
Baier ultimately defends a morally-loaded view of trust: she believes that a trustor’s assumption of goodwill on part of their trustee is what generates betrayal during failures, and in turn, is what separates trust from mere reliance. As mentioned before, we wish to keep the arguments in this paper neutral between specific conceptualizations of trust – since our focus is on characterizing the attitudes that the typical trustor adopts towards a trustee. As such, we are taking Baier’s distinction between trust (betrayal) and reliance (disappointment) as an attitudinal marker to separate the two concepts in terms of the differing attitudes they evoke in trustors (or reliers). This usage of Baier’s distinction is not especially idiosyncratic – Holton (1994) is a notable example of an attempt to theorize trust using Baier’s trust-reliance distinction as an attitudinal marker.
This is especially noteworthy, since, in Simon’s (2010) view, the “ascription of intentionality is crucial for the feeling of betrayal” (p. 347). If some artifact fails at some task it was ‘trusted’ to perform, and one ascribes intentionality in this failure (i.e., that the artifact ‘chose’ to fail its task), they might reasonably feel betrayed. As Simon further argues, “whether [one] feels betrayed or disappointed resides in [one’s] perception of the reasons for failure” (p. 347). These perceptions may be mistaken, and perhaps one ought not to ascribe intentionality to non-agential artifacts: but at a descriptive level, the attitudes and affections of ‘trust’ and ‘betrayal’ look the same.
For an instructive discussion of this distinction between descriptive and normative views of trust in non-agential artifacts, and how they are both distinct from reliance, but not equivalent to one another, see Tallant’s (2019) provocatively-titled paper: “You can trust that ladder, but you shouldn’t”.
It is important to point out that Nguyen’s account, as well as other similar views (e.g. Ferrario et al.’s (2022) discussion of simple trust and the ‘anti-monitoring’ stance), do not exhaustively characterize what it means to trust technological artifacts, but simply describe – as fully as possible – the attitudes and dispositions that a trustor has towards their trustee. This is necessary, but perhaps insufficient, for fully describing trust in AI. A more complete conceptualization might need to also describe other beliefs, normative expectations, or assumptions that the typical trustor and trustee must have in relation to one another. Ferrario et al.’s (2022) “incremental model of trust” is one such careful and detailed attempt to do so. However, since the arguments in this paper are primarily centered on the attitudinal tensions between explainability and trust in AI, it is most important for us to clearly and completely characterize the attitudes that decision-makers have towards their AI-DSTs. For this, in our view, Nguyen’s “unquestioning attitude” view is especially instructive.
Importantly, for the purposes of our paper, Nguyen’s view allows us to remain neutral about the object(s) of trust. One may adopt an unquestioning attitude towards the AI-DST itself, or towards the sociotechnical system surrounding the AI-DST (including, for instance, the people and organizations that work to develop, deploy and manage the system) – but the disposition to put questions about the trustee’s reliability out of mind remains, at a descriptive level, the same.
On Nguyen’s view, therefore, agential integration between the trustor and trustee is what makes ‘betrayal’ (rather than, say, anger or shock) the appropriate affective response when the trustee fails to do what they are trusted to do. It falls outside the scope of this paper to fully explain and evaluate the particulars of Nguyen’s argument for this position, but we encourage interested readers to engage with his view in full, especially in Sect. 5 ‘The Integrative Stance’ and 6 ‘Gullibility and Agential Outsourcing’ (p. 24 onwards).
For an AI-DST to be deployed in consequential decision-making processes, the developmental milestones that must be met often involve negotiating expertise between domain knowledge and developer knowledge. As recent ethnographic work in organizational studies indicates, these processes can be long and drawn-out, involving both significant model compromise and extension (Kim & Mehrizi, 2022; Kim et al., 2022). It is, in our view, entirely feasible for individuals like Zoe to be involved in the development process through the translation of her clinical experience and expertise into model parameters. As a result, if Zoe believes that the resulting AI-DST is well-tailored to her needs and concerns, she is more likely to retain an unquestioning attitude in her interactions with this AI-DST that she has had significant involvement in training and deploying.
This is not to say, of course, that explanations are morally irrelevant at this point. For example, in the case of harmful system malfunctions, explainability can be a valuable tool for retrospectively tracking how these malfunctions occurred, and how similar future malfunctions might be avoided. As such, even in cases where decision-makers are no longer effectively scrutinizing their AI-DST’s explanations (and are, in turn, failing to meaningfully perform their roles as ‘humans-in-the-loop’), these explanations can still be morally useful in other ways. Our thanks to an anonymous reviewer for pointing out the need for this important clarification.
As we mentioned earlier, a large amount of empirical literature on this topic holds ‘trust’ as conceptually equivalent to ‘appropriate reliance’. Those who wish to, as we suggest, press a strict normative distinction between trust and reliance would need to mount an effective challenge against those who argue for ‘trust as appropriate reliance’. Mark Ryan’s (2020) paper is one notable attempt to do so, and his careful and detailed arguments may be instructive to those who wish to pursue this strategy.
It is useful to note that, in some cases, AI-DSTs that impose significant liability burdens on individual decision-makers might represent a good target for regulatory interventions. This is especially the case in domains – such as clinical decision-making – that fall clearly under the purview of a powerful and well-resourced regulator, who might be able to intervene by reviewing and validating AI-DSTs, and by properly calibrating and demarcating the professional liability of the decision-makers who use these systems. Our thanks to an anonymous reviewer for pointing out that Zoe’s AI-DST for clinical prescriptions, introduced earlier in Sect. 4.3, would be a good regulatory target; if so, it is possible that Zoe might not be held personally liable for harms ensuing from the AI-DST’s potential errors.
References
Alufaisan, Y., Marusich, L. R., Bakdash, J. Z., Zhou, Y., & Kantarcioglu, M. (2020). Does Explainable Artificial Intelligence Improve Human Decision-Making? arXiv preprint arXiv:2006.11194.
Ananny, M., & Crawford, K. (2018). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 20(3), 979–989. https://doi.org/10.1177/1461444816676645.
Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., ... & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115.
Baier, A. (1986). Trust and Antitrust. Ethics, 96(2), 231–260. https://doi.org/10.1086/292745.
Bansal, G., Wu, T., Zhou, J., Fok, R., Nushi, B., Kamar, E., Ribeiro, M. T., & Weld, D. (2021). Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (pp. 1–16). Association for Computing Machinery. https://doi.org/10.1145/3411764.3445717
Benbya, H., Davenport, T. H., & Pachidi, S. (2020). Artificial intelligence in organizations: Current state and future opportunities. MIS Quarterly Executive, 19(4).
Bigman, Y. E., Waytz, A., Alterovitz, R., & Gray, K. (2019). Holding robots responsible: the elements of machine morality. Trends in cognitive sciences, 23(5), 365–368.
Brown, S., Davidovic, J., & Hasan, A. (2021). The algorithm audit: scoring the algorithms that score us. Big Data & Society, 8(1), 2053951720983865.
Burrell, J. (2016). How the machine ‘thinks’: understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 205395171562251. https://doi.org/10.1177/2053951715622512.
Cappelen, H., & Dever, J. (2021). Making AI Intelligible: philosophical foundations. Oxford University Press.
Chatila, R., Dignum, V., Fisher, M., Giannotti, F., Morik, K., Russell, S., & Yeung, K. (2021). Trustworthy AI. In B. Braunschweig & M. Ghallab (Eds.), Reflections on Artificial Intelligence for Humanity (pp. 13–39). Springer International Publishing. https://doi.org/10.1007/978-3-030-69128-8_2
Clark, J., McLoughlin, I., Rose, H., Jon Clark, D., & King, R. (1988). The process of technological change: New technology and social choice in the workplace (Issue 11). CUP Archive.
Coeckelbergh, M. (2020). Artificial intelligence, responsibility attribution, and a relational justification of explainability. Science and Engineering Ethics, 26(4), 2051–2068.
Cummings, M. L. (2017). Automation bias in intelligent time critical decision support systems. Decision making in aviation (pp. 289–294). Routledge.
Danaher, J. (2020). Robot Betrayal: a guide to the ethics of robotic deception. Ethics and Information Technology, 22(2), 117–128.
Darling, K., Nandy, P., & Breazeal, C. (2015, August). Empathic concern and the effect of stories in human-robot interaction. In 2015 24th IEEE international symposium on robot and human interactive communication (RO-MAN) (pp. 770–775). IEEE.
Davis, F. D. (1989). Perceived usefulness, perceived ease of Use, and user Acceptance of Information Technology. MIS Quarterly, 13(3), 319. https://doi.org/10.2307/249008.
Deloitte (2021). Ethical technology and trust. Retrieved 29 April from https://www2.deloitte.com/us/en/insights/focus/tech-trends/2020/ethical-technology-and-brand-trust.html
Deloitte (2021). Thriving in the era of pervasive AI. Deloitte Insights. Retrieved 05 April 2022, from https://www2.deloitte.com/us/en/insights/focus/cognitive-technologies/state-of-ai-and-intelligent-automation-in-business-survey.html
Dietvorst, B. J., Simmons, J. P., & Massey, C. (2015). Algorithm aversion: people erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114.
DNI (2020). INTEL - Artificial Intelligence Ethics Framework for the Intelligence Community. INTEL.Gov. https://www.intelligence.gov/artificial-intelligence-ethics-framework-for-the-intelligence-community
Durán, J. M., & Formanek, N. (2018). Grounds for Trust: essential epistemic opacity and computational reliabilism. Minds and Machines, 28(4), 645–666. https://doi.org/10.1007/s11023-018-9481-6.
Edwards, L., & Veale, M. (2017). Slave to the algorithm: why a right to an explanation is probably not the remedy you are looking for. Duke L & Tech Rev, 16, 18.
Ehsan, U., Liao, Q. V., Muller, M., Riedl, M. O., & Weisz, J. D. (2021). Expanding Explainability: Towards Social Transparency in AI systems. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, 1–19. https://doi.org/10.1145/3411764.3445188
European Commission. (2019). Policy and investment recommendations for trustworthy AI. High Level Expert Group on Artificial Intelligence, European Commission.
Ferrario, A., & Loi, M. (2022). How Explainability Contributes to Trust in AI. In Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (pp. 1457–1466). https://doi.org/10.1145/3531146.3533202
Floridi, L. (2019). Establishing the rules for building trustworthy AI. Nature Machine Intelligence, 1(6), 261–262.
Floridi, L., Cowls, J., Beltrametti, M., Chatila, R., Chazerand, P., Dignum, V., Luetge, C., Madelin, R., Pagallo, U., Rossi, F., Schafer, B., Valcke, P., & Vayena, E. (2018). AI4People-An ethical Framework for a good AI society: Opportunities, Risks, Principles, and recommendations. Minds and Machines, 28(4), 689–707. https://doi.org/10.1007/s11023-018-9482-5.
Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., & Bouchachia, A. (2014). A survey on concept drift adaptation. ACM Computing Surveys, 46(4), 44:1–44:37. https://doi.org/10.1145/2523813.
Gillespie, N., Curtis, C., Bianchi, R., Akbari, A., & van Vlissingen, F. R. (2020). Achieving trustworthy AI: a model for trustworthy Artificial Intelligence. The University of Queensland and KPMG. https://doi.org/10.14264/ca0819d.
Glikson, E., & Woolley, A. W. (2020). Human trust in artificial intelligence: review of empirical research. Academy of Management Annals, 14(2), 627–660. https://doi.org/10.5465/annals.2018.0057.
Goodman, B., & Flaxman, S. (2017). European Union regulations on algorithmic decision-making and a “right to explanation”. AI Magazine, 38(3), 50–57.
Google (2021). People + AI Guidebook. Retrieved 05 April 2022, from https://pair.withgoogle.com/guidebook/
Grint, K., & Woolgar, S. (2013). The machine at work: technology, work and organization. John Wiley & Sons.
Gunning, D. (2017). Explainable Artificial Intelligence (XAI). DARPA/I20 Project.
Hagendorff, T. (2020). The ethics of AI ethics: an evaluation of guidelines. Minds and Machines, 30(1), 99–120.
Hao, K. (2021). Worried about your firm’s AI ethics? These startups are here to help MIT Technology Review. Retrieved 05 April 2022, from https://www.technologyreview.com/2021/01/15/1016183/ai-ethics-startups/
Hoff, K. A., & Bashir, M. (2015). Trust in automation: integrating empirical evidence on factors that influence trust. Human Factors, 57(3), 407–434. https://doi.org/10.1177/0018720814547570.
Hollanek, T. (2020). AI transparency: A matter of reconciling design with critique. AI & Society, 1–9. https://doi.org/10.1007/s00146-020-01110-y
Humphreys, P. (2004). Extending ourselves: computational science, empiricism, and scientific method. Oxford University Press.
Infocomm Media Development Authority (2021). Singapore Model AI Governance Framework Second Edition. Retrieved 05 April 2022, from https://www.sgpc.gov.sg/sgpcmedia/media_releases/imda/press_release/P-20200122-2/attachment/Singapore%20Model%20AI%20Governance%20Framework%20Second%20Edition%20-%20Framework.pdf
ISO/IEC (2020). ISO/IEC TR 24028:2020(en), Information technology—Artificial intelligence—Overview of trustworthiness in artificial intelligence. https://www.iso.org/obp/ui/#iso:std:iso-iec:tr:24028:ed-1:v1:en
Jacovi, A., Marasović, A., Miller, T., & Goldberg, Y. (2021). Formalizing trust in artificial intelligence: Prerequisites, causes and goals of human trust in ai. Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 624–635.
Janssen, M., Hartog, M., Matheus, R., Ding, A. Y., & Kuk, G. (2020). Will Algorithms Blind People? The effect of explainable AI and decision-makers’ experience on AI-supported decision-making in Government. Social Science Computer Review, 0894439320980118. https://doi.org/10.1177/0894439320980118.
Jones, K. (1996). Trust as an affective attitude. Ethics, 107(1), 4–25. https://doi.org/10.1086/233694.
Kaminski, M. E. (2021). The right to explanation, explained. In S. Sandeen, C. Rademacher, & A. Ohly (Eds.) (p. 22). Edward Elgar Publishing.
Kaur, H., Nori, H., Jenkins, S., Caruana, R., Wallach, H., & Vaughan, J. W. (2020). Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (pp. 1–14).
Killock, D. (2020). AI outperforms radiologists in mammographic screening. Nature Reviews Clinical Oncology, 17(3), 134.
Kim, T. W., & Routledge, B. R. (2021). Why a right to an explanation of algorithmic decision-making should exist: A Trust-Based Approach. Business Ethics Quarterly, 1–28. https://doi.org/10.2139/ssrn.3716519.
Kim, B., & Mehrizi, M. H. R. (2022). Generating Knowledge Around the Unknowable Algorithm. Academy of Management Proceedings. https://doi.org/10.5465/AMBPP.2022.31
Kim, B., Mehrizi, M.H.R., & Huysman, M. (2022). Developing Algorithms in the Dark: Coping with an Autonomous and Inscrutable Algorithm. In 38th EGOS Colloquium-2022-Sub-theme 44: New Approaches to Organizing Collaborative Knowledge Creation.
Koshiyama, A., Kazim, E., Treleaven, P., Rai, P., Szpruch, L., Pavey, G., Ahamat, G., Leutner, F., Goebel, R., Knight, A., Adams, J., Hitrova, C., Barnett, J., Nachev, P., Barber, D., Chamorro-Premuzic, T., Klemmer, K., Gregorovic, M., Khan, S., & Lomas, E. (2021). Towards Algorithm auditing: a Survey on managing legal, ethical and Technological Risks of AI, ML and Associated Algorithms. SSRN Scholarly Paper ID 3778998. https://doi.org/10.2139/ssrn.3778998.
Lee, J. D., & See, K. A. (2004). Trust in Automation: Designing for Appropriate Reliance. Human Factors, 46(1), 50–80.
Lipton, Z. (2019). The Mythos of Model Interpretability. ACMQueue, 16(3). Retrieved 05 April 2022, from https://queue.acm.org/detail.cfm?id=3241340
Long, B. (2020). The Ethics of Deep Learning AI and the Epistemic Opacity Dilemma. Blog of the APA. Retrieved 05 April 2022, from https://blog.apaonline.org/2020/08/13/the-ethics-of-deep-learning-ai-and-the-epistemic-opacity-dilemma/
Mandrake, L., Doran, G., Goel, A., Ono, H., Amini, R., Feather, M. S., & Kaufman, J. (2022, March). Space Applications of a Trusted AI Framework: Experiences and Lessons Learned. In 2022 IEEE Aerospace Conference (AERO) (pp. 1–20). IEEE.
Margalit, A. (2017). On betrayal. Cambridge: Harvard University Press.
Mayer, R. C., Davis, J. H., & Schoorman, F. D. (1995). An integrative model of organizational trust. Academy of Management Review, 20(3), 709–734.
Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., Raji, I. D., & Gebru, T. (2019). Model Cards for Model Reporting. Proceedings of the Conference on Fairness, Accountability, and Transparency, 220–229. https://doi.org/10.1145/3287560.3287596
Mollen, J., Putten, P. V. D., & Darling, K. (2022). Bonding with a Couchsurfing Robot: The Impact of Common Locus on Human-Robot Bonding In-the-wild. ACM Transactions on Human-Robot Interaction.
Mueller, S. T., Hoffman, R. R., Clancey, W., Emrey, A., & Klein, G. (2019). Explanation in Human-AI Systems: A Literature Meta-Review, Synopsis of Key Ideas and Publications, and Bibliography for Explainable AI. http://arxiv.org/abs/1902.01876v1
Nguyen, A., Yosinski, J., & Clune, J. (2015). Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 427–436).
Nguyen, C. T. (2020). Trust as an unquestioning attitude. Oxford Studies in Epistemology (Vol. 48).
Nickel, P. J., Franssen, M., & Kroes, P. (2010). Can we make sense of the notion of Trustworthy Technology? Knowledge Technology & Policy, 23(3–4), 429–444. https://doi.org/10.1007/s12130-010-9124-6.
Papenmeier, A., Englebienne, G., & Seifert, C. (2019). How model accuracy and explanation fidelity influence user trust. arXiv preprint arXiv:1907.12652.
Pasquale, F. (2015). The black box society. Harvard University Press.
Pieters, W. (2011). Explanation and trust: what to tell the user in security and AI? Ethics and Information Technology, 13(1), 53–64. https://doi.org/10.1007/s10676-010-9253-3.
Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). ‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144. https://doi.org/10.1145/2939672.2939778
Robbins, S. (2019). A misdirected Principle with a catch: explicability for AI. Minds and Machines, 29(4), 495–514. https://doi.org/10.1007/s11023-019-09509-3.
Rudin, C. (2019). Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 206–215.
Ryan, M. (2020). In AI We Trust: Ethics, Artificial Intelligence, and Reliability. Science and Engineering Ethics, 26(5), 2749–2767.
Sætra, H. S. (2021). Social robot deception and the culture of trust. Paladyn Journal of Behavioral Robotics, 12(1), 276–286.
Shin, D. (2021). The effects of explainability and causability on perception, trust, and acceptance: implications for explainable AI. International Journal of Human-Computer Studies, 146, 102551. https://doi.org/10.1016/j.ijhcs.2020.102551.
Simon, J. (2010). The entanglement of trust and knowledge on the web. Ethics and Information Technology, 12(4), 343–355.
Slingerland, P., Perry, L., Kaufman, J., Bycroft, B., Linstead, E., Mandrake, L., & Amini, R. (2022, March). Adapting a trusted AI framework to space mission autonomy. In 2022 IEEE Aerospace Conference (AERO) (pp. 1–20). IEEE.
Sloane, M., Moss, E., & Chowdhury, R. (2021). A Silicon Valley Love Triangle: Hiring Algorithms, Pseudo-Science, and the Quest for Auditability. arXiv preprint arXiv:2106.12403.
Sonboli, N., Smith, J. J., Berenfus, F. C., Burke, R., & Fiesler, C. (2021). Fairness and Transparency in Recommendation: The Users’ Perspective. arXiv:2103.08786 [cs]. https://doi.org/10.1145/3450613.3456835
Shrestha, Y. R., Ben-Menahem, S. M., & Von Krogh, G. (2019). Organizational decision-making structures in the age of artificial intelligence. California Management Review, 61(4), 66–83.
Stanton, B., & Jensen, T. (2021). Trust and Artificial Intelligence [Preprint]. https://doi.org/10.6028/NIST.IR.8332-draft
Sung, J. Y., Guo, L., Grinter, R. E., & Christensen, H. I. (2007, September). “My Roomba is Rambo”: intimate home appliances. In International conference on ubiquitous computing (pp. 145–162). Springer, Berlin, Heidelberg.
Taddeo, M. (2017). Trusting Digital Technologies correctly. Minds and Machines, 27(4), 565–568. https://doi.org/10.1007/s11023-017-9450-5.
Tallant, J. (2019). You can trust the ladder, but you shouldn’t. Theoria, 85(2), 102–118.
Tsymbal, A. (2004). The problem of concept drift: definitions and related work. Computer Science Department Trinity College Dublin, 106(2), 58.
UK Information Commissioner’s Office (2019). An overview of the Auditing Framework for Artificial Intelligence and its core components. ICO. https://ico.org.uk/about-the-ico/news-and-events/ai-blog-an-overview-of-the-auditing-framework-for-artificial-intelligence-and-its-core-components/
U.S. Department of Defense (2020). DOD Adopts Ethical Principles for Artificial Intelligence. https://www.defense.gov/News/Releases/Release/Article/2091996/dod-adopts-ethical-principles-for-artificial-intelligence/
Ustun, B., Spangher, A., & Liu, Y. (2019). Actionable recourse in linear classification. Proceedings of the Conference on Fairness, Accountability, and Transparency. https://doi.org/10.1145/3287560.3287566.
Venkatesh, V., & Davis, F. D. (2000). A theoretical extension of the Technology Acceptance Model: four Longitudinal Field Studies. Management Science, 46(2), 186–204. https://doi.org/10.1287/mnsc.46.2.186.11926.
von Eschenbach, W. J. (2021). Transparency and the Black Box Problem: why we do not trust AI. Philosophy & Technology, 34(4), 1607–1622. https://doi.org/10.1007/s13347-021-00477-0.
Vredenburgh, K. (2019). Explanation and Social Scientific Modeling. Doctoral Dissertation, Harvard University, Graduate School of Arts & Sciences, 134.
Wachter, S., Mittelstadt, B., & Russell, C. (2017). Counterfactual Explanations without opening the Black Box: automated decisions and the GDPR. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3063289.
Weitz, K., Schiller, D., Schlagowski, R., Huber, T., & André, E. (2019). ‘ Do you trust me?’ Increasing user-trust by integrating virtual agents in explainable AI interaction design. Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, 7–9.
Wiese, E., Metta, G., & Wykowska, A. (2017). Robots as intentional agents: using neuroscientific methods to make robots appear more social. Frontiers in psychology, 8, 1663.
Wu, K., Zhao, Y., Zhu, Q., Tan, X., & Zheng, H. (2011). A meta-analysis of the impact of trust on technology acceptance model: investigation of moderating influence of subject and context type. International Journal of Information Management, 31(6), 572–581. https://doi.org/10.1016/j.ijinfomgt.2011.03.004.
Yang, F., Huang, Z., Scholtz, J., & Arendt, D. L. (2017). How Do Visual Explanations Foster End Users’ Appropriate Trust in Machine Learning? 13.
Zanzotto, F. M. (2019). Viewpoint: human-in-the-loop Artificial Intelligence. Journal of Artificial Intelligence Research, 64, 243–252. https://doi.org/10.1613/jair.1.11345.
Acknowledgements
The authors would like to thank Neiladri Sinhababu, Diane Bailey, and the New Media and Society Working Group at Cornell University for their helpful advice, insights, and feedback on earlier drafts. The authors are also grateful for the valuable feedback received from anonymous reviewers.
Funding
The first author of this article was supported by a research project grant from the NUS Centre for Trusted Internet and Community (Grant Number: CTIC-RP-20-06) awarded to Prof. David De Cremer (NUS Business School).
Author information
Authors and Affiliations
Contributions
The two authors contributed equally to the conceptualization, drafting and editing of this article.
Corresponding author
Ethics declarations
Conflict of Interest
The authors have no competing interests to declare that are relevant to the content of this article.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Narayanan, D., Tan, Z.M. Attitudinal Tensions in the Joint Pursuit of Explainable and Trusted AI. Minds & Machines 33, 55–82 (2023). https://doi.org/10.1007/s11023-023-09628-y