
PCA-Based Random Forest Classifier for Speech Emotion Recognition Using FFTF Features, Jitter, and Shimmer

  • Conference paper
Innovations in Electrical and Electronic Engineering (ICEEE 2022)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 894)

Abstract

In recent years, emotion recognition has become a rapidly growing research area. Unlike humans, machines do not inherently possess the ability to recognize emotions. This study aims to recognize five emotions, i.e., fear, anger, happiness, neutral, and sadness, from a speech signal. The German EmoDB dataset is used because it comprises these five emotions. Prosodic features such as energy, voiced and unvoiced frame duration, and silence are extracted. When individuals express their emotions strongly, the amplitude and frequency fluctuations of their voice become high. Such fluctuations can indicate an emotion, and they can be measured through the jitter and shimmer prosodic features, which are extracted from the voiced frames of a speech signal. Spectrum tilt and zero-crossing rate are obtained, and the frames of a speech segment are classified as voiced, unvoiced, or silence using Hamming windows. Each frame comprises 960 speech samples. Moreover, in our study, a new set of features derived from the fast Fourier transform (FFT) is extracted. Instead of using the full FFT of 960 samples, only the mean, minimum, maximum, variance, and standard deviation of each FFT are used. We name these features FFTF. Thirteen Mel-frequency cepstral coefficient (MFCC) features are also extracted. A random forest classifier is used. In the first experiment, the effect of different numbers of trees is observed; with 100 trees, an emotion recognition accuracy of 78.6885% is attained. In the second experiment, different features of voiced and unvoiced frames are used, and it is observed that the FFT features attain an accuracy of 78.6885%. In the third experiment, the number of features is reduced using PCA, and the accuracy improves to 83.60%.
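The pipeline the abstract describes (per-frame FFT summary statistics, then PCA-based dimensionality reduction feeding a 100-tree random forest) can be sketched as follows. This is a minimal illustration assuming scikit-learn and NumPy; the synthetic random frames, the label assignment, and the choice of three principal components are stand-ins for exposition, not the paper's actual EmoDB data or configuration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline


def fftf_features(frame):
    """FFTF features of one frame: the mean, minimum, maximum,
    variance, and standard deviation of the FFT magnitude,
    as described in the abstract."""
    mag = np.abs(np.fft.rfft(frame))
    return np.array([mag.mean(), mag.min(), mag.max(), mag.var(), mag.std()])


# Hypothetical stand-in data: 200 frames of 960 samples each with
# random labels for the five emotions, since the EmoDB audio itself
# is not reproduced here.
rng = np.random.default_rng(0)
frames = rng.standard_normal((200, 960))
labels = rng.integers(0, 5, size=200)

# Each 960-sample frame collapses to a 5-dimensional FFTF vector.
X = np.vstack([fftf_features(f) for f in frames])

# PCA reduction followed by a 100-tree random forest, mirroring the
# classifier configuration reported in the abstract (the component
# count here is illustrative).
clf = make_pipeline(
    PCA(n_components=3),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
clf.fit(X, labels)
preds = clf.predict(X)
```

In practice the FFTF vector would be concatenated with the jitter, shimmer, and MFCC features before PCA; reducing the 960-point spectrum to five statistics is what keeps the feature space small enough for PCA and the forest to work with.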



Author information

Correspondence to Sheetal Patil.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Patil, S., Kharate, G.K. (2022). PCA-Based Random Forest Classifier for Speech Emotion Recognition Using FFTF Features, Jitter, and Shimmer. In: Mekhilef, S., Shaw, R.N., Siano, P. (eds) Innovations in Electrical and Electronic Engineering. ICEEE 2022. Lecture Notes in Electrical Engineering, vol 894. Springer, Singapore. https://doi.org/10.1007/978-981-19-1677-9_17


  • DOI: https://doi.org/10.1007/978-981-19-1677-9_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-1676-2

  • Online ISBN: 978-981-19-1677-9

  • eBook Packages: Energy, Energy (R0)
