Abstract
Emotion plays a central role in human-computer interaction and in the design of humanoid robots. By incorporating emotion understanding, intelligent software systems become more effective and intuitive at resembling human-human interaction. Emotion is typically conveyed through a combination of cues such as intonation (speech), facial expression (visual modality), and word content (text), and all relevant multimodal combinations must be considered to process emotions appropriately. Among these modalities, vocal cues are often weighted more heavily than facial expressions for emotion processing. Accurate classification increasingly requires analysing large volumes of real-time data, so Machine Learning (ML) models that operate in a distributed fashion are crucial given the size and complexity of the problem under study. In this respect, we propose a distributed ensemble model for vocal cue-based emotion classification, built from three base ML models trained in a distributed manner. The results show that the proposed ensemble distinguishes the seven fundamental emotions with reasonable accuracy: it outperformed existing ML models on TESS, SAVEE, and RAVDESS, achieving 86% accuracy on the unified dataset.
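The abstract does not name the three base learners or the voting scheme, so the following is only a minimal sketch of the general approach: a soft-voting ensemble of three assumed base models (random forest, SVM, logistic regression) over stand-in vocal-cue features, with `n_jobs=-1` as a single-machine stand-in for distributed training.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for vocal-cue feature vectors (e.g. 40 MFCCs per utterance);
# in practice these would be extracted from TESS/SAVEE/RAVDESS audio.
rng = np.random.default_rng(0)
n_samples, n_features, n_emotions = 700, 40, 7
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, n_emotions, size=n_samples)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Three base learners combined by soft (probability-averaging) voting.
# n_jobs=-1 fits the base estimators in parallel across cores, a local
# analogue of the distributed training the paper describes.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", make_pipeline(StandardScaler(),
                              SVC(probability=True, random_state=0))),
        ("lr", make_pipeline(StandardScaler(),
                             LogisticRegression(max_iter=1000))),
    ],
    voting="soft",
    n_jobs=-1,
)
ensemble.fit(X_train, y_train)
pred = ensemble.predict(X_test)
print("test accuracy:", (pred == y_test).mean())
```

With random features the accuracy is near chance (~1/7); the point of the sketch is only the wiring of three parallel base models behind one voting classifier, not the reported 86% result.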
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Vijayan, B., Soman, G., Vivek, M.V., Judy, M.V. (2022). A Distributed Ensemble Machine Learning Technique for Emotion Classification from Vocal Cues. In: Roy, P.P., Agarwal, A., Li, T., Krishna Reddy, P., Uday Kiran, R. (eds) Big Data Analytics. BDA 2022. Lecture Notes in Computer Science, vol 13773. Springer, Cham. https://doi.org/10.1007/978-3-031-24094-2_9
DOI: https://doi.org/10.1007/978-3-031-24094-2_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24093-5
Online ISBN: 978-3-031-24094-2