Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bourlard, H., Morgan, N. (1998). Hybrid HMM/ANN systems for speech recognition: Overview and new research directions. In: Giles, C.L., Gori, M. (eds) Adaptive Processing of Sequences and Data Structures. NN 1997. Lecture Notes in Computer Science, vol 1387. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0054006
DOI: https://doi.org/10.1007/BFb0054006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64341-8
Online ISBN: 978-3-540-69752-7
eBook Packages: Springer Book Archive