Speech Production Model and Automatic Recognition

Shirai, K.; Kobayashi, T.

doi:10.1007/978-94-009-2991-3_1

Part of the book series: Theory and Decision Library (TDLD, volume 2)

Abstract

This paper discusses the validity of a feature extraction method for speech recognition in the articulatory domain. Firstly, a method for estimating articulatory movements from speech waves on the basis of a speech production model is described. Secondly, the validity of the estimated articulatory parameters for speaker adaptation is tested. The results of vowel recognition experiments with unspecified speakers show that adapting the model by the estimated mean vocal tract length is effective in normalizing speaker differences. Thirdly, the effectiveness of the approach for continuous speech recognition is considered. Motor commands that move the articulatory organs are estimated by taking articulatory dynamics into account, and continuous vowels are recognized using the estimated commands. It is found that a considerable part of the coarticulation effects can be removed by the command estimation. Finally, some characteristics of phonemes are investigated in the articulatory domain. It is found that the characteristics of each phoneme can be represented by a particular parameter according to its manner of articulation.
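
The idea of normalizing speaker differences by vocal tract length can be illustrated with a minimal sketch. It does not reproduce the chapter's method, which adapts an articulatory model. Instead it assumes a uniform lossless tube, closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / (4L), and rescales formant frequencies to a reference length. The function names, the speed of sound value, and the 17.5 cm reference length are assumptions chosen for illustration.

```python
# Illustrative sketch only: a uniform-tube approximation, not the
# articulatory-model adaptation described in the chapter.

SPEED_OF_SOUND_CM_S = 35000.0  # assumed approximate value for warm, humid air


def estimate_tract_length(formants_hz):
    """Estimate vocal tract length in cm from formant frequencies.

    Assumes a uniform tube, so the n-th formant satisfies
    F_n = (2n - 1) * c / (4 * L). Each formant gives one length
    estimate; the estimates are averaged.
    """
    estimates = [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4.0 * f)
        for n, f in enumerate(formants_hz, start=1)
    ]
    return sum(estimates) / len(estimates)


def normalize_formants(formants_hz, length_cm, reference_length_cm=17.5):
    """Rescale formants to those of a tract of the reference length.

    Under the uniform-tube assumption, formants are inversely
    proportional to tract length, so the scale factor is L / L_ref.
    """
    scale = length_cm / reference_length_cm
    return [f * scale for f in formants_hz]


# Hypothetical speaker with a shorter tract (higher formants).
speaker_formants = [625.0, 1875.0, 3125.0]
length = estimate_tract_length(speaker_formants)
normalized = normalize_formants(speaker_formants, length)
```

The uniform-tube relation holds reasonably only for a neutral vowel, which is one reason a model-based estimate of the mean vocal tract length, as in the chapter, is preferable for real speech.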

T. Kobayashi is now with Hosei University.

Author information

Authors and Affiliations

Department of Electrical Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku Tokyo, 160, Japan
K. Shirai & T. Kobayashi

Editor information

Editors and Affiliations

State University of Groningen, The Netherlands
Marc E. Carvallo

About this chapter

Cite this chapter

Shirai, K., Kobayashi, T. (1988). Speech Production Model and Automatic Recognition. In: Carvallo, M.E. (eds) Nature, Cognition and System I. Theory and Decision Library, vol 2. Springer, Dordrecht. https://doi.org/10.1007/978-94-009-2991-3_1

DOI: https://doi.org/10.1007/978-94-009-2991-3_1
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-010-7844-3
Online ISBN: 978-94-009-2991-3
eBook Packages: Springer Book Archive
