Painless WFST cascade construction for LVCSR - transducersaurus

Novak, Josef R.; Minematsu, Nobuaki; Hirose, Keikichi

doi:10.21437/Interspeech.2011-324

This paper introduces the Transducersaurus toolkit, which provides a set of classes for generating each of the fundamental components of a typical WFST ASR cascade: a context-dependency transducer, a lexicon, a stochastic language model, and an optional silence class model. The toolkit also implements a simple scripting language that facilitates the construction of cascades using a variety of popular combination and optimization methods. It provides integrated support for the T3 and Juicer WFST decoders and for both Sphinx- and HTK-format acoustic models. New results for two standard WSJ tasks are also provided, comparing a variety of cascade construction and optimization algorithms. These results illustrate the flexibility of the toolkit as well as the tradeoffs inherent in the various build algorithms.
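
As a rough illustration of what combining cascade components involves, the sketch below composes a toy lexicon with a toy language model in the tropical semiring, where weights add along a path. It is not the Transducersaurus API: the dictionary layout, the `compose` function, and the symbols `a`, `b`, `A`, `B` are all assumptions made for this example, and the sketch handles only epsilon-free transducers, whereas real lexicons normally contain epsilon labels and need a composition filter.

```python
from collections import deque


def compose(a, b):
    """Compose two epsilon-free weighted transducers (tropical semiring).

    Hypothetical representation, chosen only for this sketch:
      "start":  start state
      "finals": {state: final_weight}
      "arcs":   {state: [(in_label, out_label, weight, next_state), ...]}
    """
    start = (a["start"], b["start"])
    result = {"start": start, "finals": {}, "arcs": {}}
    queue = deque([start])
    seen = {start}
    while queue:
        qa, qb = queue.popleft()
        # A pair state is final only if both component states are final.
        if qa in a["finals"] and qb in b["finals"]:
            result["finals"][(qa, qb)] = a["finals"][qa] + b["finals"][qb]
        for in_a, out_a, w_a, next_a in a["arcs"].get(qa, []):
            for in_b, out_b, w_b, next_b in b["arcs"].get(qb, []):
                # Arcs match when the output of `a` equals the input of `b`.
                if out_a == in_b:
                    nxt = (next_a, next_b)
                    arc = (in_a, out_b, w_a + w_b, nxt)
                    result["arcs"].setdefault((qa, qb), []).append(arc)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return result


# Toy lexicon L: maps each "phone" to a one-phone "word" at no cost.
L = {
    "start": 0,
    "finals": {0: 0.0},
    "arcs": {0: [("a", "A", 0.0, 0), ("b", "B", 0.0, 0)]},
}

# Toy language model G: accepts only the word sequence "A B".
G = {
    "start": 0,
    "finals": {2: 0.0},
    "arcs": {0: [("A", "A", 0.5, 1)], 1: [("B", "B", 1.5, 2)]},
}

LG = compose(L, G)
```

The result `LG` maps the phone sequence `a b` to the word sequence `A B` with total weight 2.0. A production cascade would typically also apply optimizations such as determinization and minimization, which this sketch omits.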

Cite as: Novak, J.R., Minematsu, N., Hirose, K. (2011) Painless WFST cascade construction for LVCSR - transducersaurus. Proc. Interspeech 2011, 1537-1540, doi: 10.21437/Interspeech.2011-324

@inproceedings{novak11_interspeech,
  author={Josef R. Novak and Nobuaki Minematsu and Keikichi Hirose},
  title={{Painless WFST cascade construction for LVCSR - transducersaurus}},
  year=2011,
  booktitle={Proc. Interspeech 2011},
  pages={1537--1540},
  doi={10.21437/Interspeech.2011-324}
}