A new scheme to overcome the independence assumption in standard hidden Markov model (HMM) formulations is presented within the framework of a hybrid system that uses a discriminatively trained multilayer perceptron (MLP) to compute a correlated emission probability. The scheme takes advantage of the MLP's ability to model correlations across multiple frames, which allows a long multiframe vector history to condition the emission probability. The required number of parameters is the same as in the standard hybrid HMM/MLP formulation. Results on a large-vocabulary continuous speech recognition task show that, although performance has so far not improved over the standard approach, the acoustic and language model probabilities are better balanced with the new scheme than with the standard one.
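To make the conditioning concrete, here is a sketch in common hybrid-system notation. It is not the paper's own derivation, which the abstract does not give. The symbols are assumptions introduced for illustration: $x_t$ is the acoustic vector at frame $t$, $q_j$ is an HMM state, and $h_t = (x_{t-1}, \dots, x_{t-k})$ is a hypothetical history of $k$ preceding frames. The first line is the scaled likelihood usually associated with the standard hybrid HMM/MLP formulation. The second is the corresponding Bayes-rule identity when the emission probability is also conditioned on the history.

```latex
\begin{align}
  % Standard hybrid: state posterior divided by state prior
  \frac{p(x_t \mid q_j)}{p(x_t)}
    &= \frac{P(q_j \mid x_t)}{P(q_j)}, \\
  % History-conditioned emission (h_t is an assumed k-frame history)
  \frac{p(x_t \mid q_j, h_t)}{p(x_t \mid h_t)}
    &= \frac{P(q_j \mid x_t, h_t)}{P(q_j \mid h_t)}.
\end{align}
```

In both lines the numerator on the right is a state posterior, the kind of quantity a discriminatively trained MLP can estimate. How the paper obtains the history-dependent denominator $P(q_j \mid h_t)$ is not stated in the abstract.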
Cite as: Franco, H., Digalakis, V. (1995) Temporal correlation modeling in a hybrid neural network/hidden Markov model speech recognizer. Proc. 4th European Conference on Speech Communication and Technology (Eurospeech 1995), 1681-1684, doi: 10.21437/Eurospeech.1995-406
@inproceedings{franco95_eurospeech,
  author={Horatio Franco and Vassilios Digalakis},
  title={{Temporal correlation modeling in a hybrid neural network/hidden Markov model speech recognizer}},
  year=1995,
  booktitle={Proc. 4th European Conference on Speech Communication and Technology (Eurospeech 1995)},
  pages={1681--1684},
  doi={10.21437/Eurospeech.1995-406}
}