Improving Children’s Speech Recognition by HMM Interpolation with an Adults’ Speech Recognizer

  • Conference paper
Pattern Recognition (DAGM 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2781)

Abstract

This paper addresses the problem of building a good speech recognizer when only a small amount of training data is available. The acoustic models can be improved by interpolating them with the well-trained models of a second recognizer from a different application scenario. In our case, we interpolate a children’s speech recognizer with a recognizer for adults’ speech. Each hidden Markov model has its own set of interpolation partners; experiments were conducted with up to 50 partners. The interpolation weights are estimated automatically on a validation set using the EM algorithm. The word accuracy of the children’s speech recognizer improved from 74.6 % to 81.5 %, a relative improvement of about 9 % (6.9 percentage points absolute).
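The abstract does not spell out the estimation procedure, so the following is only a minimal sketch of how interpolation weights for one model and its partners might be estimated with EM on held-out data. It assumes the standard mixture-weight update known from interpolated estimation; it is not the authors' implementation. The function names and the `likelihoods` input (one row per validation frame, one column per interpolation partner) are hypothetical.

```python
def estimate_interpolation_weights(likelihoods, n_iter=50):
    """Estimate mixture weights for K interpolation partners with EM.

    likelihoods[t][k] is assumed to hold the likelihood that partner
    model k assigns to validation frame t (hypothetical input format).
    """
    n_models = len(likelihoods[0])
    weights = [1.0 / n_models] * n_models  # start from uniform weights

    for _ in range(n_iter):
        totals = [0.0] * n_models
        for row in likelihoods:
            # Likelihood of this frame under the current interpolated model.
            mix = sum(w * p for w, p in zip(weights, row))
            if mix <= 0.0:
                continue  # no partner explains this frame; skip it
            # E-step: accumulate each partner's posterior responsibility.
            for k in range(n_models):
                totals[k] += weights[k] * row[k] / mix
        norm = sum(totals)
        if norm == 0.0:
            break  # nothing to re-estimate from
        # M-step: new weights are the normalized responsibilities.
        weights = [t / norm for t in totals]
    return weights


def interpolate(partner_values, weights):
    """Linear interpolation of the partners' output densities for one frame."""
    return sum(w * v for w, v in zip(weights, partner_values))


if __name__ == "__main__":
    # Toy validation data: partner 0 fits every frame better than partner 1.
    toy = [[0.9, 0.1]] * 10
    print(estimate_interpolation_weights(toy))
```

Estimating the weights on a validation set rather than on the training data matters here: on the training data, the model trained on that data would always be favored, and the well-trained partner models would receive almost no weight.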

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Steidl, S., Stemmer, G., Hacker, C., Nöth, E., Niemann, H. (2003). Improving Children’s Speech Recognition by HMM Interpolation with an Adults’ Speech Recognizer. In: Michaelis, B., Krell, G. (eds) Pattern Recognition. DAGM 2003. Lecture Notes in Computer Science, vol 2781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45243-0_76

  • DOI: https://doi.org/10.1007/978-3-540-45243-0_76

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40861-1

  • Online ISBN: 978-3-540-45243-0

  • eBook Packages: Springer Book Archive
