Research article
DOI: 10.1145/3576840.3578315

Understanding the Cognitive Influences of Interpretability Features on How Users Scrutinize Machine-Predicted Categories

Published: 20 March 2023

ABSTRACT

The goal of interpretable machine learning (ML) is to design tools and visualizations to help users scrutinize a system’s predictions. Prior studies have mostly employed quantitative methods to investigate the effects of specific tools/visualizations on outcomes related to objective performance—a human’s ability to correctly agree or disagree with the system—and subjective perceptions of the system. Few studies have employed qualitative methods to investigate how and why specific tools/visualizations influence performance, perceptions, and behaviors. We report on a lab study (N = 30) that investigated the influences of two interpretability features: confidence values and sentence highlighting. Participants judged whether medical articles belong to a predicted medical topic and were exposed to two interface conditions—one with and one without interpretability features. We investigate the effects of our interpretability features on participants’ performance and perceptions. Additionally, we report on a qualitative analysis of participants’ responses during an exit interview. Specifically, we report on how our interpretability features impacted different cognitive activities that participants engaged with during the task—reading, learning, and decision making. We also describe ways in which the interpretability features introduced challenges and sometimes led participants to make mistakes. Insights gained from our results point to future directions for interpretable ML research.
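The abstract describes the two interpretability features only at a high level. As a rough illustration (not the authors' implementation), the sketch below shows one way a confidence value and per-sentence highlighting scores could be produced from a simple bag-of-words topic classifier; the training data, scoring scheme, and thresholds are assumptions made purely for illustration.

```python
# Hypothetical sketch of the two interpretability features named in the
# abstract: a confidence value for the predicted topic and per-sentence
# highlighting scores. Not the authors' system; data and scoring are toy
# assumptions for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data: article texts labeled 1 if they belong to the topic.
train_texts = [
    "randomized trial of cardiac arrhythmia treatment in older patients",
    "survey of renal dialysis outcomes across regional clinics",
]
train_labels = [1, 0]

vectorizer = TfidfVectorizer()
clf = LogisticRegression().fit(vectorizer.fit_transform(train_texts), train_labels)

def explain(sentences):
    """Return a confidence value and one highlighting score per sentence."""
    # Confidence value: predicted probability that the whole article fits the topic.
    confidence = clf.predict_proba(vectorizer.transform([" ".join(sentences)]))[0, 1]
    # Sentence highlighting: weight each sentence by how strongly its terms
    # push the classifier toward the topic; high-scoring sentences would be highlighted.
    scores = vectorizer.transform(sentences) @ clf.coef_[0]
    return confidence, scores

confidence, scores = explain([
    "Patients received treatment for cardiac arrhythmia.",
    "Follow-up visits were scheduled every six months.",
])
print(f"confidence = {confidence:.2f}")
for sentence_score in scores:
    print(f"sentence score = {sentence_score:+.2f}")
```

In an interface like the one studied, scores of this kind would presumably be thresholded to choose which sentences to highlight, while the predicted probability would be shown to participants as the system's confidence in the predicted category.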


Published in

CHIIR '23: Proceedings of the 2023 Conference on Human Information Interaction and Retrieval
March 2023, 520 pages
ISBN: 9798400700354
DOI: 10.1145/3576840
Editors: Jacek Gwizdka, Soo Young Rieh

          Copyright © 2023 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 20 March 2023


          Qualifiers

          • research-article
          • Research
          • Refereed limited

Acceptance Rates

Overall acceptance rate: 55 of 163 submissions, 34%
