In this paper, we present the Philips large-vocabulary continuous Mandarin speech recognition system developed for the 2000 Taiwan Speech Input Technology Assessment. We systematically integrated key Mandarin-specific components with up-to-date Western-language techniques to build a state-of-the-art Mandarin speech recognition system. These technologies include robust pitch extraction/tone modeling, context-dependent preme/core-final units, a Chinese phrase/syllable trigram language model, linear discriminant analysis (LDA), cross-syllable modeling/decoding, speaker clustering, and maximum likelihood linear regression (MLLR) adaptation. Among them, the major breakthroughs were our robust pitch extraction/tone modeling technology and the treatment of coarticulation across syllable boundaries. On the development set, we dramatically reduced last year's best error rates by a relative 44.8% to 67.8% in all three categories in which we participated. Moreover, on the evaluation set, we achieved the lowest unit error rates in all three categories.
Cite as: Liao, Y.-F., Wang, N., Huang, M., Huang, H., Seide, F. (2000) Improvements of the Philips 2000 Taiwan Mandarin benchmark system. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 4, 298-301, doi: 10.21437/ICSLP.2000-810
@inproceedings{liao00_icslp,
  author={Yuan-Fu Liao and Nick Wang and Max Huang and Hank Huang and Frank Seide},
  title={{Improvements of the Philips 2000 Taiwan Mandarin benchmark system}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 4, 298-301},
  doi={10.21437/ICSLP.2000-810}
}