Abstract
Emotion recognition has become a rapidly growing research area in recent years, yet unlike humans, machines have no innate aptitude for recognizing emotions. This study aims to recognize five emotions, i.e., fear, anger, happiness, neutral, and sadness, from a speech signal; the German EmoDB dataset is used because it contains all five of these emotions. Prosodic features such as energy, voiced and unvoiced frame duration, and silence are extracted. When an individual expresses an emotion strongly, the amplitude and frequency fluctuations of the voice become large; such fluctuations can indicate an emotion, and they are quantified by the jitter and shimmer prosodic features, which are extracted from the voiced frames of a speech signal. Spectral tilt and zero-crossing rate are computed, and the frames of a speech segment are classified as voiced, unvoiced, or silence using Hamming windows; each frame comprises 960 speech samples. In addition, this study extracts a new feature set from the Fast Fourier transform (FFT): instead of using the full FFT of 960 samples, only the mean, minimum, maximum, variance, and standard deviation of each frame's FFT are used. We name these features FFTF. Thirteen Mel-frequency cepstral coefficient (MFCC) features are also extracted. A random forest classifier is used. In the first experiment, the effect of different numbers of trees is observed; with 100 trees, an emotion recognition accuracy of 78.6885% is attained. In the second experiment, different features of voiced and unvoiced frames are compared, and the FFT features likewise attain an accuracy of 78.6885%. In the third experiment, the number of features is reduced using PCA, which improves the accuracy to 83.60%.
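The FFTF idea described above (summarizing a 960-sample frame's FFT by five statistics) and the zero-crossing rate used for voiced/unvoiced classification can be sketched as follows. This is a minimal illustration assuming NumPy, not the authors' implementation; the function names and the Hamming weighting are our own choices, and the paper's exact thresholds for frame classification are not given in the abstract.

```python
import numpy as np

FRAME_LEN = 960  # samples per frame, as stated in the abstract


def fftf_features(frame):
    """Five summary statistics of a frame's FFT magnitude spectrum.

    Instead of keeping all FFT values of a 960-sample frame, only the
    mean, minimum, maximum, variance, and standard deviation are kept
    (the features the paper names FFTF).
    """
    windowed = frame * np.hamming(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    return np.array([spectrum.mean(), spectrum.min(), spectrum.max(),
                     spectrum.var(), spectrum.std()])


def zero_crossing_rate(frame):
    """Fraction of adjacent-sample sign changes in the frame.

    A high rate is typical of unvoiced frames, a low rate of voiced
    frames; combined with energy/spectral tilt it supports the
    voiced/unvoiced/silence classification mentioned in the abstract.
    """
    signs = np.sign(frame)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return float(np.mean(signs[:-1] != signs[1:]))
```

The five statistics reduce each frame's spectrum to a fixed-length vector, which keeps the per-frame feature count independent of the FFT size.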
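The third experiment's pipeline (feature reduction with PCA feeding a 100-tree random forest) can be sketched as below. This is a hedged stand-in assuming scikit-learn: the feature matrix here is synthetic random data, the number of retained components is an arbitrary illustration, and no accuracy figure from the paper should be expected from it.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the extracted feature matrix
# (rows = speech segments, columns = prosodic + FFTF + MFCC features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))
y = rng.integers(0, 5, size=200)  # five emotion classes

# Scale, project onto principal components, then classify with a
# 100-tree random forest, mirroring the paper's third experiment.
clf = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),  # illustrative choice of retained components
    RandomForestClassifier(n_estimators=100, random_state=0),
)
clf.fit(X, y)
```

Scaling before PCA matters here because the prosodic, FFTF, and MFCC features live on very different numeric ranges, and PCA is variance-driven.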
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Patil, S., Kharate, G.K. (2022). PCA-Based Random Forest Classifier for Speech Emotion Recognition Using FFTF Features, Jitter, and Shimmer. In: Mekhilef, S., Shaw, R.N., Siano, P. (eds) Innovations in Electrical and Electronic Engineering. ICEEE 2022. Lecture Notes in Electrical Engineering, vol 894. Springer, Singapore. https://doi.org/10.1007/978-981-19-1677-9_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-1676-2
Online ISBN: 978-981-19-1677-9
eBook Packages: Energy (R0)