Impact of bucketing on performance of linearly interpolated language models

Visweswariah, K.; Printz, H.; Picheny, M.

doi:10.21437/ICSLP.2000-44

N-gram models are used to model language in various applications. For large vocabularies, even a very large corpus is insufficient to estimate a raw ratio-of-counts trigram model. One common way to overcome this problem is by linear interpolation of the trigram model with lower order models. The interpolation weights can be varied as a function of the current history, to reflect the confidence we have in the estimates of various orders. Since the number of histories is large, we cannot hope to estimate a set of weights for each history. Thus sets of histories are tied together, and the same weights are used for all histories within the set. In this paper we study the effect of the algorithm used to tie together the various histories. We report word error rate (WER) results on a large-vocabulary speech recognition task.
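
The idea of tying histories into buckets that share interpolation weights can be illustrated with a minimal Python sketch. It is not the authors' implementation. The count-threshold bucketing rule (`bucket_of`), the bucket boundaries, the uniform floor term, and the hand-set weights are all assumptions made for illustration. In practice the weights would be estimated on held-out data, and the abstract does not say which tying algorithms the paper compares.

```python
from collections import Counter


def train_counts(tokens):
    """Collect unigram, bigram and trigram counts plus history counts."""
    return {
        "uni": Counter(tokens),
        "bi": Counter(zip(tokens, tokens[1:])),
        "tri": Counter(zip(tokens, tokens[1:], tokens[2:])),
        # History counts only include positions that are followed by a word,
        # so each conditional estimate sums to one over the vocabulary.
        "hist1": Counter(tokens[:-1]),
        "hist2": Counter(zip(tokens[:-2], tokens[1:-1])),
        "total": len(tokens),
    }


def bucket_of(history_count, boundaries=(0, 2, 10)):
    """Hypothetical tying rule: histories with similar counts share a bucket."""
    bucket = 0
    for i, edge in enumerate(boundaries):
        if history_count > edge:
            bucket = i + 1
    return bucket


def interpolated_prob(w, u, v, counts, weights, vocab_size):
    """P(w | u, v) as a bucket-weighted mix of uniform, 1-, 2- and 3-gram estimates."""
    p0 = 1.0 / vocab_size
    p1 = counts["uni"][w] / counts["total"]
    # Fall back to the lower-order estimate when a history was never seen,
    # which keeps the mixture normalized.
    h1 = counts["hist1"][v]
    p2 = counts["bi"][(v, w)] / h1 if h1 else p1
    h2 = counts["hist2"][(u, v)]
    p3 = counts["tri"][(u, v, w)] / h2 if h2 else p2
    # All histories in the same bucket use the same weight vector.
    l0, l1, l2, l3 = weights[bucket_of(h2)]
    return l0 * p0 + l1 * p1 + l2 * p2 + l3 * p3


# Toy data and hand-set weights (assumed, not estimated): rarer histories
# put less weight on the trigram estimate.
tokens = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(tokens))
counts = train_counts(tokens)
weights = {
    0: (0.1, 0.6, 0.3, 0.0),
    1: (0.05, 0.25, 0.4, 0.3),
    2: (0.02, 0.13, 0.35, 0.5),
    3: (0.01, 0.09, 0.3, 0.6),
}

if __name__ == "__main__":
    for w in vocab:
        print(w, interpolated_prob(w, "the", "cat", counts, weights, len(vocab)))
```

With only a handful of buckets, each weight vector is shared by many histories, so it can be estimated from far less data than one weight vector per history would need. How histories are assigned to buckets is the design choice the paper studies.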

Cite as: Visweswariah, K., Printz, H., Picheny, M. (2000) Impact of bucketing on performance of linearly interpolated language models. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 1, 178-181, doi: 10.21437/ICSLP.2000-44

@inproceedings{visweswariah00_icslp,
  author={K. Visweswariah and H. Printz and M. Picheny},
  title={{Impact of bucketing on performance of linearly interpolated language models}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 1, 178-181},
  doi={10.21437/ICSLP.2000-44}
}