DOI: 10.1145/3610661.3617151
short-paper
Open Access

Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors

Published: 09 October 2023

ABSTRACT

This paper tackles the challenging task of evaluating socially situated conversational robots and presents a novel objective evaluation approach that relies on multimodal user behaviors. Our primary evaluation metric in this study is the human-likeness of the robot. Whereas previous research has often relied on subjective evaluations from users, our approach aims to evaluate the robot’s human-likeness indirectly, from observable user behaviors, thus enhancing objectivity and reproducibility. We first created a dataset annotated with human-likeness scores, using the user behaviors found in an attentive listening dialogue corpus. We then analyzed the correlation between multimodal user behaviors and human-likeness scores, demonstrating the feasibility of the proposed behavior-based evaluation method.
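The correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation statistic or which behavior features were used, so the Pearson coefficient, the feature names (`utterance_count`, `filler_count`), and the toy numbers below are all assumptions made for illustration, not the authors' method or data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlate_behaviors(features, scores):
    """Map each behavior feature name to its correlation with the scores."""
    return {name: pearson(values, scores) for name, values in features.items()}


# Hypothetical toy data, one value per dialogue session (NOT from the paper).
sessions = {
    "utterance_count": [12, 18, 9, 25, 15],
    "filler_count": [5, 3, 6, 1, 4],
}
# Hypothetical annotated human-likeness score for each session.
human_likeness = [3.0, 4.0, 2.0, 5.0, 3.5]

for name, r in correlate_behaviors(sessions, human_likeness).items():
    print(f"{name}: r = {r:+.2f}")
```

A feature whose coefficient is far from zero would be a candidate indicator of perceived human-likeness. A rank-based statistic such as Spearman's may be a better fit if the annotated scores are ordinal.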

