Language model estimations and representations for real-time continuous speech recognition

Antoniol, Giuliano; Brugnara, Fabio; Cettolo, Mauro; Federico, Marcello

doi:10.21437/ICSLP.1994-229

This paper compares different ways of estimating bigram language models and of representing them in a finite-state network used by a beam-search-based, speaker-independent, continuous-speech HMM recognizer. Attention is focused on the n-gram interpolation scheme, for which seven models are considered. Among them, the Stacked estimated linear interpolated model compares favourably with the best-known ones. Further, two static representations of the search space are investigated: "linear" and "tree-based". Results show that the latter topology is better suited to the beam-search algorithm. Moreover, this representation can be reduced by a network optimization technique, which decreases the dynamic size of the recognition process by 60%. Extensive recognition experiments on a 10,000-word dictation task with four speakers are described, in which an average word accuracy of 93% is achieved with real-time response.
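
As a rough illustration of what a linearly interpolated bigram model looks like, the sketch below mixes a bigram relative frequency with a unigram relative frequency. It is not the paper's method: the abstract does not describe the seven estimation schemes, so the fixed weight `lam`, the function name `train_interpolated_bigram`, and the toy corpus are all assumptions made for illustration. A real system would estimate the interpolation weights from data.

```python
from collections import Counter

def train_interpolated_bigram(tokens, lam=0.7):
    """Return prob(prev, word) for a linearly interpolated bigram model.

    lam is a fixed, hypothetical interpolation weight; the paper's
    estimation schemes for this weight are not reproduced here.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Number of times each word occurs as the history of a bigram.
    history = Counter(tokens[:-1])
    total = len(tokens)

    def prob(prev, word):
        p_uni = unigrams[word] / total
        if history[prev] == 0:
            # Unseen history: fall back entirely on the unigram estimate.
            return p_uni
        p_bi = bigrams[(prev, word)] / history[prev]
        return lam * p_bi + (1.0 - lam) * p_uni

    return prob

tokens = "the cat sat on the mat the cat ran".split()
prob = train_interpolated_bigram(tokens)
vocab = sorted(set(tokens))
# For a seen history, the probabilities over the vocabulary sum to one.
total_mass = sum(prob("the", w) for w in vocab)
```

Because the unigram term is always positive for in-vocabulary words, a bigram never seen in training (such as "the ran" in the toy corpus) still receives non-zero probability, which is the purpose of interpolation.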

Cite as: Antoniol, G., Brugnara, F., Cettolo, M., Federico, M. (1994) Language model estimations and representations for real-time continuous speech recognition. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 859-862, doi: 10.21437/ICSLP.1994-229

@inproceedings{antoniol94_icslp,
  author={Giuliano Antoniol and Fabio Brugnara and Mauro Cettolo and Marcello Federico},
  title={{Language model estimations and representations for real-time continuous speech recognition}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={859--862},
  doi={10.21437/ICSLP.1994-229}
}