Speech Production Model and Automatic Recognition

  • Chapter
Nature, Cognition and System I

Part of the book series: Theory and Decision Library (TDLD, volume 2)

Abstract

This paper discusses the validity of feature extraction in the articulatory domain for speech recognition. First, a method is described for estimating articulatory movements from speech waves on the basis of a speech production model. Second, the validity of the estimated articulatory parameters for speaker adaptation is tested. Vowel recognition experiments with unspecified speakers show that adapting the model with the estimated mean vocal tract length effectively normalizes speaker differences. Third, the effectiveness of the approach for continuous speech recognition is considered. Motor commands that move the articulatory organs are estimated by taking articulatory dynamics into account, and continuous vowels are recognized using the estimated commands. The command estimation is found to remove a considerable part of the coarticulation effects. Finally, some characteristics of phonemes are investigated in the articulatory domain. It is found that the phonemic characteristics can be represented by particular parameters according to the manner of articulation.
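
The idea of normalizing speakers by vocal tract length can be illustrated with a deliberately simplified sketch. It is not the chapter's articulatory model: it assumes a uniform lossless tube, closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / (4L). The function names, the speed of sound value, and the 17.5 cm reference length are all assumptions chosen for illustration.

```python
# Illustrative only: uniform-tube approximation, not the chapter's method.
SPEED_OF_SOUND_CM_S = 35000.0  # assumed approximate value for the vocal tract


def estimate_tract_length(formants_hz):
    """Estimate tract length (cm) from formants F1, F2, ... in Hz.

    Each formant gives one estimate L = (2n - 1) * c / (4 * F_n);
    the estimates are averaged.
    """
    estimates = [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


def normalize_formants(formants_hz, reference_length_cm=17.5):
    """Rescale formants to what a tract of the reference length would produce."""
    scale = estimate_tract_length(formants_hz) / reference_length_cm
    return [f * scale for f in formants_hz]


if __name__ == "__main__":
    short_tract = [625.0, 1875.0, 3125.0]  # hypothetical formants of a shorter tract
    print(estimate_tract_length(short_tract))
    print(normalize_formants(short_tract))
```

Under these assumptions, formants from a shorter tract are scaled down toward those of the reference tract, which is the sense in which a length estimate can reduce speaker differences before vowel classification.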

T. Kobayashi is now with Hosei University.

Copyright information

© 1988 Kluwer Academic Publishers

About this chapter

Cite this chapter

Shirai, K., Kobayashi, T. (1988). Speech Production Model and Automatic Recognition. In: Carvallo, M.E. (ed.) Nature, Cognition and System I. Theory and Decision Library, vol 2. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-2991-3_1

  • DOI: https://doi.org/10.1007/978-94-009-2991-3_1

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-010-7844-3

  • Online ISBN: 978-94-009-2991-3

  • eBook Packages: Springer Book Archive
