ISCA Archive ICSLP 1998

Source-extended language model for large vocabulary continuous speech recognition

Tetsunori Kobayashi, Yosuke Wada, Norihiko Kobayashi

Information source extension is utilized to improve the language model for large vocabulary continuous speech recognition (LVCSR). McMillan's theory, which states that source extension brings the model entropy close to the true source entropy, implies that a better language model can be obtained by source extension, i.e., by forming new units through word concatenation and using these units for language modeling. In this paper, we examined the effectiveness of this source extension, testing two methods: frequency-based extension and entropy-based extension. We evaluated the effect in terms of perplexity and recognition accuracy using Mainichi newspaper articles and the JNAS speech corpus. As a result, the bigram perplexity improved from 98.6 to 70.8 and the trigram perplexity from 41.9 to 26.4. The bigram-based recognition accuracy improved from 79.8% to 85.3%.
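The core idea of frequency-based source extension — concatenating frequently co-occurring words into new compound units before n-gram modeling — can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's exact algorithm: the function name, the greedy merge loop, and the `"_"` joining convention are all assumptions, and the paper's actual selection criteria and stopping conditions may differ.

```python
from collections import Counter

def frequency_based_extension(corpus, num_merges):
    """Greedily merge the most frequent adjacent word pair into a new
    compound unit, extending the modeling vocabulary (hypothetical sketch
    of frequency-based extension; cf. byte-pair-encoding-style merging)."""
    corpus = [list(sent) for sent in corpus]  # work on mutable copies
    merges = []
    for _ in range(num_merges):
        # Count all adjacent word pairs across the corpus.
        pair_counts = Counter()
        for sent in corpus:
            for a, b in zip(sent, sent[1:]):
                pair_counts[(a, b)] += 1
        if not pair_counts:
            break
        # Pick the most frequent pair and fuse it into one unit.
        (a, b), _ = pair_counts.most_common(1)[0]
        merged = a + "_" + b
        merges.append((a, b))
        for sent in corpus:
            i = 0
            while i < len(sent) - 1:
                if sent[i] == a and sent[i + 1] == b:
                    sent[i:i + 2] = [merged]
                else:
                    i += 1
    return corpus, merges
```

An n-gram model trained over the merged units then effectively conditions on longer word histories at the same model order, which is how source extension can lower perplexity without raising the n-gram order.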


doi: 10.21437/ICSLP.1998-649

Cite as: Kobayashi, T., Wada, Y., Kobayashi, N. (1998) Source-extended language model for large vocabulary continuous speech recognition. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0708, doi: 10.21437/ICSLP.1998-649

@inproceedings{kobayashi98_icslp,
  author={Tetsunori Kobayashi and Yosuke Wada and Norihiko Kobayashi},
  title={{Source-extended language model for large vocabulary continuous speech recognition}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0708},
  doi={10.21437/ICSLP.1998-649}
}