ABSTRACT
In this paper, we present a dialog system that was exhibited at the Swedish National Museum of Science and Technology. Two visitors at a time could play a collaborative card-sorting game with the robot head Furhat, with the three players discussing the solution together. The cards are shown on a touch table between the players, thus constituting a target for joint attention. We describe how the system was implemented to manage turn-taking and attention to users and objects in the shared physical space. We also discuss how multi-modal redundancy (from speech, card movements and head pose) is exploited to maintain meaningful discussions, given that the system has to process conversational speech from both children and adults in a noisy environment. Finally, we present an analysis of 373 interactions, in which we investigate the robustness of the system, the extent to which the system's attention can shape the users' turn-taking behaviour, and how the system can produce multi-modal turn-taking signals (filled pauses, facial gestures, breath and gaze) to deal with processing delays in the system.
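The abstract notes that the system emits multi-modal turn-taking signals (filled pauses, facial gestures, breath and gaze) to bridge processing delays. The sketch below illustrates that general idea only; it is not the paper's implementation, and the class and method names are invented for illustration. When the user has stopped speaking but the response is not yet ready, the controller cycles through cheap "turn-holding" cues instead of staying silent, so the users do not reclaim the turn:

```python
class TurnHoldController:
    """Hypothetical sketch: hold the conversational floor with
    multi-modal cues while the system's response is still being
    computed (names invented; not the system described in the paper)."""

    def __init__(self, hold_cues=("filled_pause", "gaze_aversion", "breath")):
        self.hold_cues = hold_cues
        self._cue_index = 0  # which hold cue to emit next

    def on_user_silence(self, response_ready: bool) -> str:
        """Called when end-of-user-speech is detected.
        Returns the action the robot should perform next."""
        if response_ready:
            # Response is available: take the turn and reset the cue cycle.
            self._cue_index = 0
            return "speak_response"
        # Response still pending: emit the next turn-holding cue.
        cue = self.hold_cues[self._cue_index % len(self.hold_cues)]
        self._cue_index += 1
        return cue
```

Under this sketch, a delayed response yields `filled_pause`, then `gaze_aversion`, and so on, until `response_ready` becomes true and the system speaks.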
Title: Exploring Turn-taking Cues in Multi-party Human-robot Discussions about Objects