Abstract
Emotion plays a central role in human-computer interaction and in the design of humanoid robots. By incorporating emotion understanding, intelligent software systems become more effective and intuitive at resembling human-human interaction. Emotion is typically conveyed through a combination of cues such as intonation (speech), facial expression (visual modality), and word content (text), and all relevant multimodal combinations must be considered to process emotions appropriately. Among these modalities, vocal cues are often weighted more heavily than facial expressions for emotion processing. Accurate classification increasingly requires analysing large volumes of real-time data, so Machine Learning (ML) models that operate in a distributed fashion are crucial given the size and complexity of the problem under study. In this respect, we propose a distributed ensemble model for vocal cue-based emotion classification, built from three base ML models trained in a distributed manner. The results show that the proposed ensemble distinguishes the seven fundamental emotions with reasonable accuracy: it outperformed existing ML models on TESS, SAVEE, and RAVDESS, achieving 86% accuracy on the unified dataset.
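The abstract does not name the three base learners or the voting scheme, so the following is only a minimal sketch of the general approach: a soft-voting ensemble of three assumed base models (random forest, SVM, logistic regression) over stand-in vocal-cue features, with `n_jobs=-1` as a single-machine stand-in for distributed training.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for vocal-cue feature vectors (e.g. 40 MFCCs per utterance);
# in practice these would be extracted from TESS/SAVEE/RAVDESS audio.
rng = np.random.default_rng(0)
n_samples, n_features, n_emotions = 700, 40, 7
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, n_emotions, size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Three base learners combined by soft (probability-averaging) voting.
# n_jobs=-1 fits the base estimators in parallel across cores, a local
# analogue of the distributed training the paper describes.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", make_pipeline(StandardScaler(),
                              SVC(probability=True, random_state=0))),
        ("lr", make_pipeline(StandardScaler(),
                             LogisticRegression(max_iter=1000))),
    ],
    voting="soft",
    n_jobs=-1,
)
ensemble.fit(X_train, y_train)
pred = ensemble.predict(X_test)
print("test accuracy:", (pred == y_test).mean())
```

With random features the accuracy is near chance (~1/7); the point of the sketch is only the wiring of three parallel base models behind one voting classifier, not the reported 86% result.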
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vijayan, B., Soman, G., Vivek, M.V., Judy, M.V. (2022). A Distributed Ensemble Machine Learning Technique for Emotion Classification from Vocal Cues. In: Roy, P.P., Agarwal, A., Li, T., Krishna Reddy, P., Uday Kiran, R. (eds) Big Data Analytics. BDA 2022. Lecture Notes in Computer Science, vol 13773. Springer, Cham. https://doi.org/10.1007/978-3-031-24094-2_9
DOI: https://doi.org/10.1007/978-3-031-24094-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24093-5
Online ISBN: 978-3-031-24094-2