Abstract
The interaction of humans and robots in everyday contexts is no longer a vision of the future. This is illustrated, for example, by the increasing use of service robots, e.g., household robots or social robots such as Pepper from the company SoftBank Robotics. The prerequisite for social interaction is the robot’s ability to perceive its counterpart on a social level and, based on this, to produce an appropriate reaction in the form of speech, gestures, or facial expressions. In this paper, we first present the state of the art for multi-modal emotion recognition and for dialog system architectures that utilize emotion recognition. The methods are then discussed in terms of their applicability and robustness. Starting points for improvements are identified, and subsequently an architecture for the use of multi-modal emotion recognition techniques in further research is proposed.
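To make the notion of combining modalities concrete, the following is a minimal, hypothetical sketch of decision-level ("late") fusion, one common approach in multi-modal emotion recognition: each modality classifier (speech prosody, facial expression, transcript sentiment) emits a probability distribution over emotion labels, and a weighted average combines them into a single prediction. The labels, weights, and numbers are illustrative assumptions, not taken from the paper.

```python
# Hypothetical late-fusion sketch for multi-modal emotion recognition.
# Each modality contributes a probability distribution over the same
# emotion labels; a weighted average yields the fused distribution.

EMOTIONS = ["anger", "happiness", "sadness", "neutral"]

def late_fusion(modality_probs, weights):
    """Combine per-modality emotion distributions into one distribution.

    modality_probs: dict mapping modality name -> list of probabilities
                    aligned with EMOTIONS.
    weights:        dict mapping modality name -> non-negative weight.
    """
    total = sum(weights[m] for m in modality_probs)
    fused = [
        sum(weights[m] * modality_probs[m][i] for m in modality_probs) / total
        for i in range(len(EMOTIONS))
    ]
    return dict(zip(EMOTIONS, fused))

# Illustrative per-modality outputs (made-up values):
probs = {
    "speech": [0.10, 0.60, 0.10, 0.20],   # e.g. prosody-based classifier
    "face":   [0.05, 0.70, 0.05, 0.20],   # e.g. facial-expression classifier
    "text":   [0.20, 0.40, 0.10, 0.30],   # e.g. sentiment from transcript
}
weights = {"speech": 1.0, "face": 1.5, "text": 0.5}

fused = late_fusion(probs, weights)
prediction = max(fused, key=fused.get)  # highest-probability emotion
```

A real system would of course use trained classifiers per modality and could learn the fusion weights; the sketch only shows the combination step that a dialog manager could then condition its response on.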
Acknowledgements
The authors acknowledge the financial support by the Federal Ministry of Education and Research of Germany in the framework of FH-Kooperativ 2-2019 (project number 13FH504KX9).
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Schiffmann, M., Thoma, A., Richert, A. (2021). Multi-modal Emotion Recognition for User Adaptation in Social Robots. In: Zallio, M., Raymundo Ibañez, C., Hernandez, J.H. (eds) Advances in Human Factors in Robots, Unmanned Systems and Cybersecurity. AHFE 2021. Lecture Notes in Networks and Systems, vol 268. Springer, Cham. https://doi.org/10.1007/978-3-030-79997-7_16
DOI: https://doi.org/10.1007/978-3-030-79997-7_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79996-0
Online ISBN: 978-3-030-79997-7
eBook Packages: Intelligent Technologies and Robotics (R0)