Abstract
This work presents a novel method for system fusion in emotion recognition from speech. The proposed approach, Anchor Model Fusion (AMF), exploits the characteristic behaviour of the scores of a speech utterance across different emotion models by mapping them to a back-end anchor-model feature space and classifying with an SVM. Experiments are presented on three databases: Ahumada III, containing speech from real forensic cases; SUSAS Actual; and SUSAS Simulated. Compared with a simple sum-fusion scheme after normalization, AMF yields a significant performance improvement on two of the three experimental set-ups without degrading performance on the third.
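The abstract's pipeline, mapping each utterance's scores against a set of emotion models into an anchor-model feature vector and then classifying that vector with an SVM, can be sketched as follows. This is a minimal illustration with synthetic data, not the authors' implementation: the array names, the choice of two subsystems, and the linear kernel are all assumptions for the example.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_emotions, n_utts = 4, 40

# Hypothetical per-utterance scores against every emotion model, from two
# front-end subsystems (names and shapes are illustrative assumptions).
scores_sys_a = rng.normal(size=(n_utts, n_emotions))
scores_sys_b = rng.normal(size=(n_utts, n_emotions))

# Anchor-model feature space: the concatenated score vectors, so the SVM
# back-end can exploit the pattern of scores across models and subsystems.
anchor_features = np.hstack([scores_sys_a, scores_sys_b])
labels = rng.integers(0, n_emotions, size=n_utts)

# Back-end classifier over the anchor-model features.
clf = SVC(kernel="linear").fit(anchor_features, labels)
pred = clf.predict(anchor_features)
```

In practice the SVM would be trained on held-out data and the scores would come from real emotion models; the point here is only the shape of the mapping from per-model scores to a single back-end feature vector per utterance.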
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Ortego-Resa, C., Lopez-Moreno, I., Ramos, D., Gonzalez-Rodriguez, J. (2009). Anchor Model Fusion for Emotion Recognition in Speech. In: Fierrez, J., Ortega-Garcia, J., Esposito, A., Drygajlo, A., Faundez-Zanuy, M. (eds) Biometric ID Management and Multimodal Communication. BioID 2009. Lecture Notes in Computer Science, vol 5707. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04391-8_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04390-1
Online ISBN: 978-3-642-04391-8