Abstract
Speech entrainment is the tendency of interlocutors to become similar to each other during spoken interaction. Entrainment is a natural component of the cognitive system underlying communication, and the alignment of cognitive (para)linguistic representations between interlocutors is one way of conceptualizing it. Speech entrainment also plays an important social role, since humans perceive people who entrain to their speaking style as more socially attractive and likeable, more competent and intimate, and conversations with such partners as more successful. Furthermore, dis-entrainment might signal an increase in social distance and a negative attitude towards the interlocutor. Importantly for social robotics, humans also entrain to computer systems, and implementing this idea has brought improvements in several domains of human–machine interaction. This paper provides a targeted overview of advances in speech entrainment and argues that entrainment should be exploited in applications in which communication between humans and robots uses speech, as it opens up possibilities for developing and controlling social relations such as likeability and dominance and makes the applications more efficient.
Notes
More natural dynamic and bi-directional aspects of entrainment in HRI will be briefly discussed in section “Discussion and Conclusion”.
Acknowledgments
This work results from the implementation of the project Technology research for the management of business processes in heterogeneous distributed systems in real time with the support of multi-modal communication (ITMS 26240220060), supported by the Research and Development Operational Programme funded by the ERDF, and was also supported in part by VEGA grant 1/0547/14. The author is indebted to Julia Hirschberg, Agustin Gravano, and Rivka Levitan. All mistakes are the sole responsibility of the author.
Cite this article
Beňuš, Š. Social Aspects of Entrainment in Spoken Interaction. Cogn Comput 6, 802–813 (2014). https://doi.org/10.1007/s12559-014-9261-4
DOI: https://doi.org/10.1007/s12559-014-9261-4