DOI: 10.1145/3640543.3645205 (IUI conference proceedings)
research-article
Open Access

Benefits of Human-AI Interaction for Expert Users Interacting with Prediction Models: a Study on Marathon Running

Published: 05 April 2024

ABSTRACT

Users with extensive domain knowledge can be reluctant to use prediction models. This also holds in sports, where running coaches rarely rely on marathon prediction tools when advising runners on a race plan for their next marathon. This paper studies the effect of adding interactivity to such prediction models so that they incorporate and acknowledge users’ domain knowledge. In think-aloud sessions and an online study, we tested an interactive machine learning tool that allowed coaches to indicate the importance of the earlier races feeding into the model. Our results show that coaches deploy rich knowledge when working with the model on runners familiar to them, and that their adaptations improved model accuracy. Coaches who could interact with the model displayed more trust in, and acceptance of, the resulting predictions.
