This paper proposes a word-class-based Chinese language model for Mandarin speech recognition with a very large vocabulary. The word classes are derived from the special structure of Chinese words. We have also developed several improved techniques. The ambiguous syllable filter deletes many confusable syllables and significantly increases accuracy. The short-term cache memory helps the language model adapt to the current application domain, and the learning module significantly reduces the number of zero values in the language model.
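As a rough illustration of the general idea, the sketch below combines a textbook class-based bigram, P(w | w_prev) ≈ P(c | c_prev) · P(w | c), with a short-term cache of recently seen words. It is not the formulation used in the paper, which the abstract does not give. The class name `ClassBigramWithCache`, the toy English vocabulary, the hand-assigned classes, and the parameters `cache_size` and `cache_weight` are all assumptions made for illustration.

```python
from collections import Counter, deque


class ClassBigramWithCache:
    """Toy class-based bigram with a linearly interpolated word cache.

    Hypothetical sketch: the class inventory, cache size, and
    interpolation weight are assumptions, not values from the paper.
    """

    def __init__(self, word_to_class, cache_size=200, cache_weight=0.2):
        self.word_to_class = word_to_class
        self.class_bigrams = Counter()   # count(c_prev, c)
        self.class_history = Counter()   # count(c_prev) as a bigram history
        self.class_counts = Counter()    # count(c)
        self.word_counts = Counter()     # count(w)
        self.cache = deque(maxlen=cache_size)  # most recent words only
        self.cache_weight = cache_weight

    def train(self, sentences):
        for sentence in sentences:
            classes = [self.word_to_class[w] for w in sentence]
            for word, cls in zip(sentence, classes):
                self.word_counts[word] += 1
                self.class_counts[cls] += 1
            for prev_cls, cls in zip(classes, classes[1:]):
                self.class_bigrams[(prev_cls, cls)] += 1
                self.class_history[prev_cls] += 1

    def static_prob(self, prev_word, word):
        """P(c | c_prev) * P(w | c), estimated by relative frequency."""
        prev_cls = self.word_to_class[prev_word]
        cls = self.word_to_class[word]
        if self.class_history[prev_cls] == 0 or self.class_counts[cls] == 0:
            return 0.0
        p_class = self.class_bigrams[(prev_cls, cls)] / self.class_history[prev_cls]
        p_word = self.word_counts[word] / self.class_counts[cls]
        return p_class * p_word

    def observe(self, word):
        """Record a recognized word so later estimates favor recent usage."""
        self.cache.append(word)

    def prob(self, prev_word, word):
        """Interpolate the static estimate with the cache's unigram estimate."""
        static = self.static_prob(prev_word, word)
        if not self.cache:
            return static
        cache_p = self.cache.count(word) / len(self.cache)
        return (1 - self.cache_weight) * static + self.cache_weight * cache_p


# Toy usage with hand-assigned classes (illustrative only).
classes = {"I": "PRON", "you": "PRON", "eat": "VERB",
           "see": "VERB", "rice": "NOUN", "tea": "NOUN"}
model = ClassBigramWithCache(classes)
model.train([["I", "eat", "rice"], ["you", "see", "tea"]])
print(model.prob("eat", "tea"))   # word pair unseen in training
model.observe("tea")
print(model.prob("eat", "tea"))   # estimate after "tea" enters the cache
```

In this toy setup the word pair "eat tea" never occurs in the training data, yet it receives a nonzero probability because statistics are shared at the class level, and the estimate rises once "tea" has been observed recently.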
Cite as: Yang, Y.-J., Lin, S.-C., Chien, L.-F., Chen, K.-J., Lee, L.-S. (1994) An intelligent and efficient word-class-based Chinese language model for Mandarin speech recognition with very large vocabulary. Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994), 1371-1374, doi: 10.21437/ICSLP.1994-357
@inproceedings{yang94b_icslp,
  author={Yen-Ju Yang and Sung-Chien Lin and Lee-Feng Chien and Keh-Jiann Chen and Lin-Shan Lee},
  title={{An intelligent and efficient word-class-based Chinese language model for Mandarin speech recognition with very large vocabulary}},
  year=1994,
  booktitle={Proc. 3rd International Conference on Spoken Language Processing (ICSLP 1994)},
  pages={1371--1374},
  doi={10.21437/ICSLP.1994-357}
}