Statistical language modeling using the CMU-Cambridge toolkit

Clarkson, Philip; Rosenfeld, Ronald

doi:10.21437/Eurospeech.1997-683

The CMU Statistical Language Modeling toolkit was released in 1994 in order to facilitate the construction and testing of bigram and trigram language models. It is currently in use in over 40 academic, government and industrial laboratories in over 12 countries. This paper presents a new version of the toolkit. We outline the conventional language modeling technology, as implemented in the toolkit, and describe the extra efficiency and functionality that the new toolkit provides as compared to previous software for this task. Finally, we give an example of the use of the toolkit in constructing and testing a simple language model.

Cite as: Clarkson, P., Rosenfeld, R. (1997) Statistical language modeling using the CMU-Cambridge toolkit. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2707-2710, doi: 10.21437/Eurospeech.1997-683

@inproceedings{clarkson97_eurospeech,
  author={Philip Clarkson and Ronald Rosenfeld},
  title={{Statistical language modeling using the CMU-Cambridge toolkit}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2707--2710},
  doi={10.21437/Eurospeech.1997-683}
}