Sparse non-negative matrix language modeling for skip-grams

Shazeer, Noam; Pelemans, Joris; Chelba, Ciprian

doi:10.21437/Interspeech.2015-342

We present a novel family of language model (LM) estimation techniques named Sparse Non-negative Matrix (SNM) estimation. A first set of experiments empirically evaluating these techniques on the One Billion Word Benchmark [3] shows that with skip-gram features SNMLMs are able to match the state-of-the-art recurrent neural network (RNN) LMs; combining the two modeling techniques yields the best known result on the benchmark. The computational advantages of SNM over both maximum entropy and RNNLM estimation are probably its main strength, promising an approach that has the same flexibility in combining arbitrary features effectively and yet should scale to very large amounts of data as gracefully as n-gram LMs do.

Cite as: Shazeer, N., Pelemans, J., Chelba, C. (2015) Sparse non-negative matrix language modeling for skip-grams. Proc. Interspeech 2015, 1428-1432, doi: 10.21437/Interspeech.2015-342

@inproceedings{shazeer15_interspeech,
  author={Noam Shazeer and Joris Pelemans and Ciprian Chelba},
  title={{Sparse non-negative matrix language modeling for skip-grams}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={1428--1432},
  doi={10.21437/Interspeech.2015-342}
}