Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors

Authors:
Koji Inoue, Graduate School of Informatics, Kyoto University, Japan (ORCID: 0000-0002-2929-2559)
Divesh Lala, Graduate School of Informatics, Kyoto University, Japan (ORCID: 0009-0005-3181-6410)
Keiko Ochi, Graduate School of Informatics, Kyoto University, Japan (ORCID: 0000-0002-7026-8518)
Tatsuya Kawahara, Graduate School of Informatics, Kyoto University, Japan (ORCID: 0000-0002-2686-2296)
Gabriel Skantze, KTH, Sweden (ORCID: 0000-0002-8579-1790)

ICMI '23 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction, October 2023, Pages 86–90. https://doi.org/10.1145/3610661.3617151

Published: 09 October 2023

ABSTRACT

This paper tackles the challenging task of evaluating socially situated conversational robots and presents a novel objective evaluation approach based on multimodal user behaviors. Our primary evaluation metric is the human-likeness of the robot. Whereas previous research has often relied on subjective evaluations from users, our approach aims to evaluate the robot’s human-likeness indirectly, from observable user behaviors, thereby improving objectivity and reproducibility. We first created a dataset annotated with human-likeness scores, based on user behaviors in an attentive listening dialogue corpus. We then analyzed the correlation between multimodal user behaviors and the human-likeness scores, demonstrating the feasibility of the proposed behavior-based evaluation method.
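The correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which behavior features or which correlation measure were used, so everything here is an assumption for illustration: the feature names (`backchannels_per_min`, `gaze_at_robot_ratio`), the choice of Pearson correlation, and the per-session values, which are made up and are not data from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlate_features(sessions, target="human_likeness"):
    """Correlate every behavior feature with the annotated target score."""
    scores = [s[target] for s in sessions]
    features = [name for name in sessions[0] if name != target]
    return {f: pearson([s[f] for s in sessions], scores) for f in features}


# Hypothetical per-session records: values are invented for illustration only.
sessions = [
    {"backchannels_per_min": 1.2, "gaze_at_robot_ratio": 0.41, "human_likeness": 2.0},
    {"backchannels_per_min": 2.5, "gaze_at_robot_ratio": 0.55, "human_likeness": 3.5},
    {"backchannels_per_min": 3.1, "gaze_at_robot_ratio": 0.48, "human_likeness": 4.0},
    {"backchannels_per_min": 0.8, "gaze_at_robot_ratio": 0.62, "human_likeness": 2.5},
    {"backchannels_per_min": 4.0, "gaze_at_robot_ratio": 0.70, "human_likeness": 5.0},
]

result = correlate_features(sessions)
for feature, r in result.items():
    print(f"{feature}: r = {r:+.2f}")
```

Under these assumptions, a feature whose coefficient is far from zero would be a candidate indicator of perceived human-likeness. A rank-based measure such as Spearman correlation may suit ordinal annotation scores better.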

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)
    1. HCI design and evaluation methods

Published in
ICMI '23 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction
October 2023
434 pages
ISBN:9798400703218
DOI:10.1145/3610661
Editors: Elisabeth André (University of Augsburg), Mohamed Chetouani (Sorbonne University), Dominique Vaufreydaz (Univ. Grenoble Alpes), Gale Lucas (USC Institute for Creative Technologies), Tanja Schultz (University of Bremen), Louis-Philippe Morency (Carnegie Mellon University), Alessandro Vinciarelli (University of Glasgow)
Copyright © 2023 Owner/Author
This work is licensed under a Creative Commons Attribution International 4.0 License.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Conversational Robot
Evaluation Method
Multimodal Behavior
Spoken Dialogue System
Qualifiers
- short-paper
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 453 of 1,080 submissions, 42%