Abstract
This paper addresses the problem of building a good speech recognizer when only a small amount of training data is available. The acoustic models can be improved by interpolating them with the well-trained models of a second recognizer from a different application scenario. In our case, we interpolate a children’s speech recognizer with a recognizer for adults’ speech. Each hidden Markov model has its own set of interpolation partners; experiments were conducted with up to 50 partners. The interpolation weights are estimated automatically on a validation set using the EM algorithm. The word accuracy of the children’s speech recognizer improved from 74.6 % to 81.5 %, an absolute gain of 6.9 percentage points and a relative improvement of about 9 %.
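The abstract does not spell out the interpolation formula, so the following is only a minimal sketch of how interpolation weights can be estimated with EM on held-out data. It assumes a linear interpolation of the partners' output likelihoods, p(x) = sum_k w_k p_k(x), with the partner models held fixed. The function name `estimate_interpolation_weights` and the toy likelihood values are hypothetical and do not come from the paper.

```python
import numpy as np


def estimate_interpolation_weights(likelihoods, n_iter=100, tol=1e-10):
    """Estimate linear interpolation weights with EM (illustrative sketch).

    likelihoods[t, k] is the likelihood of validation frame t under
    interpolation partner k. The partner models themselves stay fixed;
    only the weights w_k (non-negative, summing to 1) are re-estimated.
    """
    lik = np.asarray(likelihoods, dtype=float)
    n_frames, n_partners = lik.shape
    w = np.full(n_partners, 1.0 / n_partners)  # start from uniform weights
    prev_ll = -np.inf
    for _ in range(n_iter):
        mix = lik @ w                      # interpolated likelihood per frame
        ll = np.log(mix).sum()             # validation log-likelihood
        post = lik * w / mix[:, None]      # E-step: partner posteriors per frame
        w = post.mean(axis=0)              # M-step: average posterior per partner
        if ll - prev_ll < tol:             # stop once the likelihood stalls
            break
        prev_ll = ll
    return w


# Toy validation data: 4 frames, 2 partners (assumed values).
# Partner 0 could stand for the children's model, partner 1 for an adults' model.
toy_likelihoods = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.7, 0.3],
    [0.2, 0.8],
])
weights = estimate_interpolation_weights(toy_likelihoods)
print(weights)
```

Estimating the weights on a validation set rather than on the training data matters here: on the training data the model being improved would always look best, and its partners would receive weights close to zero.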
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Steidl, S., Stemmer, G., Hacker, C., Nöth, E., Niemann, H. (2003). Improving Children’s Speech Recognition by HMM Interpolation with an Adults’ Speech Recognizer. In: Michaelis, B., Krell, G. (eds) Pattern Recognition. DAGM 2003. Lecture Notes in Computer Science, vol 2781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45243-0_76
DOI: https://doi.org/10.1007/978-3-540-45243-0_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40861-1
Online ISBN: 978-3-540-45243-0
eBook Packages: Springer Book Archive