ISCA Archive Interspeech 2004

Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition

Takaaki Hori, Chiori Hori, Yasuhiro Minami

This paper proposes a new on-the-fly composition algorithm for Weighted Finite-State Transducers (WFSTs) in large-vocabulary continuous speech recognition. In general on-the-fly composition, two transducers are composed during decoding, and a Viterbi search is performed over the composed search space. In the new method, the Viterbi search is performed over the first of the two transducers only. The second transducer is used only to rescore the hypotheses generated during the search. Since this rescoring is very efficient, the total amount of computation is almost the same as when using only the first transducer. In a 30k-word vocabulary spontaneous speech transcription task, the proposed method significantly outperformed the general on-the-fly algorithm. Furthermore, the method required little memory, and it was slightly faster than decoding with a single fully composed and optimized WFST. Finally, we achieved one-pass real-time speech recognition with an extremely large vocabulary of 1.8 million words.
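The idea of searching over the first transducer and consulting the second only for rescoring can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not give the data structures, so the arc format, the `decode` function, the `EPS` marker, and the toy transducers `ARCS_A` and `ARCS_B` below are all assumptions made for illustration. Weights are costs that are added along a path and minimized (tropical semiring), and the second transducer is simplified to a deterministic acceptor over words.

```python
import math

EPS = None  # hypothetical marker: the arc of A emits no word


def decode(inputs, arcs_a, start_a, finals_a, arcs_b, start_b):
    """Viterbi search over transducer A; B only rescores emitted words.

    arcs_a: {a_state: [(input_symbol, output_word_or_EPS, cost, next_a_state)]}
    arcs_b: {(b_state, word): (cost, next_b_state)}
    Returns (best_cost, word_tuple), or (inf, ()) if nothing reaches a final state.
    """
    # Hypotheses are organized by A state; each one remembers its B state.
    # hyps[a_state][b_state] = (total_cost, words_so_far)
    hyps = {start_a: {start_b: (0.0, ())}}
    for sym in inputs:
        new_hyps = {}
        for a_state, by_b in hyps.items():
            for in_sym, out_word, w_a, a_next in arcs_a.get(a_state, []):
                if in_sym != sym:
                    continue
                for b_state, (cost, words) in by_b.items():
                    new_cost, b_next, new_words = cost + w_a, b_state, words
                    if out_word is not EPS:
                        # Rescoring step: B is looked up only when A emits a word.
                        if (b_state, out_word) not in arcs_b:
                            continue  # B does not accept this word here
                        w_b, b_next = arcs_b[(b_state, out_word)]
                        new_cost += w_b
                        new_words = words + (out_word,)
                    slot = new_hyps.setdefault(a_next, {})
                    # Keep only the cheapest hypothesis per (A state, B state).
                    if b_next not in slot or new_cost < slot[b_next][0]:
                        slot[b_next] = (new_cost, new_words)
        hyps = new_hyps
    best = (math.inf, ())
    for a_state, by_b in hyps.items():
        if a_state in finals_a:
            for cand in by_b.values():
                if cand[0] < best[0]:
                    best = cand
    return best


# Toy first transducer A: input "a" starts word "x" (cost 1.0) or "y" (cost 0.5).
ARCS_A = {
    0: [("a", "x", 1.0, 1), ("a", "y", 0.5, 2)],
    1: [("b", EPS, 0.0, 0)],
    2: [("b", EPS, 0.0, 0)],
}
# Toy second transducer B: a one-state word model that prefers "x".
ARCS_B = {(0, "x"): (0.1, 0), (0, "y"): (2.0, 0)}

best_cost, best_words = decode(["a", "b"], ARCS_A, 0, {0}, ARCS_B, 0)
print(best_cost, best_words)
```

In this toy, A alone would prefer "y" (cost 0.5 against 1.0), while adding the costs from B makes "x" the cheaper path overall. The sketch omits beam pruning, epsilon input arcs, and any optimization of the hypothesis lists, all of which a real decoder would need.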


doi: 10.21437/Interspeech.2004-140

Cite as: Hori, T., Hori, C., Minami, Y. (2004) Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition. Proc. Interspeech 2004, 289-292, doi: 10.21437/Interspeech.2004-140

@inproceedings{hori04_interspeech,
  author={Takaaki Hori and Chiori Hori and Yasuhiro Minami},
  title={{Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition}},
  year=2004,
  booktitle={Proc. Interspeech 2004},
  pages={289--292},
  doi={10.21437/Interspeech.2004-140}
}