ABSTRACT
In this paper, we present a dialog system that was exhibited at the Swedish National Museum of Science and Technology. Two visitors at a time could play a collaborative card-sorting game with the robot head Furhat, with the three players discussing the solution together. The cards are shown on a touch table between the players, thus constituting a target for joint attention. We describe how the system was implemented to manage turn-taking and attention to users and objects in the shared physical space. We also discuss how multi-modal redundancy (from speech, card movements and head pose) is exploited to maintain meaningful discussions, given that the system has to process conversational speech from both children and adults in a noisy environment. Finally, we present an analysis of 373 interactions, in which we investigate the robustness of the system, the extent to which the system's attention can shape the users' turn-taking behaviour, and how the system can produce multi-modal turn-taking signals (filled pauses, facial gestures, breath and gaze) to deal with processing delays in the system.
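The abstract notes that the system emits multi-modal turn-taking signals (filled pauses, facial gestures, breath and gaze) to bridge processing delays. The sketch below illustrates that general idea only; it is not the paper's implementation, and the class and method names are invented for illustration. When the user has stopped speaking but the response is not yet ready, the controller cycles through cheap "turn-holding" cues instead of staying silent, so the users do not reclaim the turn:

```python
class TurnHoldController:
    """Hypothetical sketch: hold the conversational floor with
    multi-modal cues while the system's response is still being
    computed (names invented; not the system described in the paper)."""

    def __init__(self, hold_cues=("filled_pause", "gaze_aversion", "breath")):
        self.hold_cues = hold_cues
        self._cue_index = 0  # which hold cue to emit next

    def on_user_silence(self, response_ready: bool) -> str:
        """Called when end-of-user-speech is detected.
        Returns the action the robot should perform next."""
        if response_ready:
            # Response is available: take the turn and reset the cue cycle.
            self._cue_index = 0
            return "speak_response"
        # Response still pending: emit the next turn-holding cue.
        cue = self.hold_cues[self._cue_index % len(self.hold_cues)]
        self._cue_index += 1
        return cue
```

Under this sketch, a delayed response yields `filled_pause`, then `gaze_aversion`, and so on, until `response_ready` becomes true and the system speaks.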
Title: Exploring Turn-taking Cues in Multi-party Human-robot Discussions about Objects