research-article
Open Access

AIMEE: An Exploratory Study of How Rules Support AI Developers to Explain and Edit Models

Published: 04 October 2023

Abstract

In real-world applications, deploying Machine Learning (ML) models begins with close analysis of the model's results and behavior by a data scientist. Once trained, however, models may need to be retrained with new data or updated to adhere to new rules or regulations. This presents two challenges: first, how to communicate how a model makes its decisions before and after retraining, and second, how to support model editing that takes new requirements into account. To address these needs, we built AIMEE (AI Model Explorer and Editor), a tool that provides interactive methods to explain, visualize, and modify model decision boundaries using rules. Rules should benefit model builders by providing a layer of abstraction for understanding and manipulating the model, reducing the need to modify individual rows of data directly. To evaluate whether this was the case, we conducted a pair of user studies totaling 23 participants to evaluate AIMEE's rules-based approach to model explainability and editing. We found that participants correctly interpreted rules, and we report on their perspectives of how rules are (and are not) beneficial, ways that rules could support collaboration, and a usability evaluation of the tool.
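To make the abstract's idea concrete, the sketch below shows what "rules as a layer of abstraction over a model" can look like: a classifier summarized as human-readable conjunctive rules, which a developer can then edit to satisfy a new policy without touching individual data rows. This is a minimal illustration, not AIMEE's actual API; the `Rule` and `predict` names, the loan-approval features, and the first-match semantics are all assumptions made for the example.

```python
# Hypothetical sketch of rule-based model explanation and editing.
# Rule/predict and the loan features are illustrative, not AIMEE's API.
from dataclasses import dataclass


@dataclass
class Rule:
    """A conjunctive rule: every condition must hold for the label to fire."""
    conditions: list  # (feature, op, threshold) triples
    label: str

    def applies(self, row: dict) -> bool:
        ops = {"<=": lambda a, b: a <= b, ">": lambda a, b: a > b}
        return all(ops[op](row[f], t) for f, op, t in self.conditions)


def predict(rules: list, row: dict, default: str = "deny") -> str:
    # First matching rule wins; otherwise fall back to the default label.
    for rule in rules:
        if rule.applies(row):
            return rule.label
    return default


# A toy rule set summarizing a loan model's decision boundary.
rules = [Rule([("income", ">", 50000), ("debt_ratio", "<=", 0.4)], "approve")]

applicant = {"income": 60000, "debt_ratio": 0.3, "age": 19}
print(predict(rules, applicant))  # approve

# "Editing" the model at the rule level: a new policy requires age > 21.
rules[0].conditions.append(("age", ">", 21))
print(predict(rules, applicant))  # deny
```

The point of the abstraction is visible in the last two lines: the policy change is a one-line edit to a rule, rather than relabeling or resampling the rows of training data that fall on the affected side of the boundary.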



• Published in

  Proceedings of the ACM on Human-Computer Interaction, Volume 7, Issue CSCW2 (CSCW), October 2023, 4055 pages
  EISSN: 2573-0142
  DOI: 10.1145/3626953

      Copyright © 2023 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery, New York, NY, United States

