
Anchor Model Fusion for Emotion Recognition in Speech

Conference paper
Biometric ID Management and Multimodal Communication (BioID 2009)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5707)


Abstract

In this work, a novel method for system fusion in emotion recognition from speech is presented. The proposed approach, Anchor Model Fusion (AMF), exploits the characteristic behaviour of the scores of a speech utterance across different emotion models by mapping them to a back-end anchor-model feature space, followed by an SVM classifier. Experiments are presented on three different databases: Ahumada III, with speech obtained from real forensic cases, and SUSAS Actual and SUSAS Simulated. Results comparing AMF with a simple sum-fusion scheme after normalization show a significant performance improvement for the proposed technique in two of the three experimental set-ups, without degrading performance in the third.
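
The abstract describes the AMF back-end only at a high level: per-utterance scores against the different emotion (anchor) models are stacked into a feature vector, and an SVM is trained on top. The following is a minimal sketch of that idea, assuming pre-computed subsystem scores; the function names, the use of scikit-learn, and all hyper-parameters are illustrative assumptions, not the authors' implementation.

    # Sketch of an anchor-model fusion back-end (hypothetical names, not the paper's code).
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def anchor_vector(scores_per_system):
        """Map one utterance to the anchor-model feature space.

        scores_per_system: list of 1-D arrays, one per subsystem, each holding
        that subsystem's scores against every emotion (anchor) model.
        The anchor-model feature vector is simply their concatenation.
        """
        return np.concatenate([np.asarray(s, dtype=float) for s in scores_per_system])

    def train_amf_backend(train_scores, train_labels):
        """Fit an SVM on anchor-model feature vectors (assumed linear kernel)."""
        X = np.vstack([anchor_vector(s) for s in train_scores])
        backend = make_pipeline(StandardScaler(), SVC(kernel="linear"))
        backend.fit(X, train_labels)
        return backend

    # Toy usage: two subsystems, four emotion models each -> 8-D anchor vectors.
    rng = np.random.default_rng(0)
    train_scores = [[rng.normal(size=4), rng.normal(size=4)] for _ in range(40)]
    train_labels = rng.integers(0, 4, size=40)          # emotion class per utterance
    backend = train_amf_backend(train_scores, train_labels)
    test_utterance = [rng.normal(size=4), rng.normal(size=4)]
    print(backend.predict([anchor_vector(test_utterance)]))  # predicted emotion class

A simple sum-fusion baseline, by contrast, would just add the normalized subsystem scores per emotion model and pick the maximum; the sketch above differs in that the whole score pattern across anchor models is used as input to a discriminative classifier.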


References

  1. Ververidis, D., Kotropoulos, C.: Emotional speech recognition: Resources, features, and methods. Speech Communication 48(9), 1162–1181 (2006)

  2. Picard, R.W.: Affective Computing. MIT Press, Cambridge (1997)

  3. Ramabadran, T., Meunier, J., Jasiuk, M., Kushner, B.: Enhancing distributed speech recognition with back-end speech reconstruction. In: Proceedings of Eurospeech 2001, pp. 1859–1862 (2001)

  4. Collet, M., Mami, Y., Charlet, D., Bimbot, F.: Probabilistic anchor models approach for speaker verification, pp. 2005–2008 (2005)

  5. Ramos, D., Gonzalez-Rodriguez, J., Gonzalez-Dominguez, J., Lucena-Molina, J.J.: Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-case database in Spanish. In: Proceedings of Interspeech 2008, pp. 1493–1496 (September 2008)

  6. Hansen, J., Sahar, E.: Getting started with SUSAS: a speech under simulated and actual stress database. In: Proceedings of Eurospeech 1997, pp. 1743–1746 (1997)

  7. Lopez-Moreno, I., Ramos, D., Gonzalez-Rodriguez, J., Toledano, D.T.: Anchor-model fusion for language recognition. In: Proceedings of Interspeech 2008 (September 2008)

  8. Hansen, J., Patil, S.: Speech under stress: Analysis, modeling and recognition. In: Müller, C. (ed.) Speaker Classification 2007. LNCS (LNAI), vol. 4343, pp. 108–137. Springer, Heidelberg (2007)

  9. Boersma, P., Weenink, D.: Praat: doing phonetics by computer (version 5.1.04) [computer program] (April 2009), http://www.praat.org/

  10. Hu, H., Xu, M.X., Wu, W.: GMM supervector based SVM with spectral features for speech emotion recognition. In: IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2007, vol. 4, pp. IV-413–IV-416 (2007)

  11. Kwon, O.W., Chan, K., Hao, J., Lee, T.W.: Emotion recognition by speech signals. In: EUROSPEECH 2003, pp. 125–128 (2003)



Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ortego-Resa, C., Lopez-Moreno, I., Ramos, D., Gonzalez-Rodriguez, J. (2009). Anchor Model Fusion for Emotion Recognition in Speech. In: Fierrez, J., Ortega-Garcia, J., Esposito, A., Drygajlo, A., Faundez-Zanuy, M. (eds) Biometric ID Management and Multimodal Communication. BioID 2009. Lecture Notes in Computer Science, vol 5707. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04391-8_7


  • DOI: https://doi.org/10.1007/978-3-642-04391-8_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04390-1

  • Online ISBN: 978-3-642-04391-8

  • eBook Packages: Computer Science, Computer Science (R0)
