In this work we build language models using three different training methods: n-gram, class-based and maximum entropy models. The main focus is the use of stem information to cope with the very large number of distinct word forms in an inflectional language such as Greek. We compare the three models in terms of both perplexity and word error rate, and we examine in detail the perplexity differences of the three models on specific subsets of words.
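Perplexity, the intrinsic metric used to compare the three models, is the exponentiated average negative log-probability the model assigns to the test words. A minimal sketch of the computation (the toy unigram model and corpus here are illustrative assumptions, not the paper's models):

```python
import math
from collections import Counter

def perplexity(log_probs):
    """PPL = exp(-(1/N) * sum_i log p(w_i | history)),
    given per-word natural-log probabilities."""
    n = len(log_probs)
    return math.exp(-sum(log_probs) / n)

# Toy unigram model estimated on (and scored against) a tiny corpus.
corpus = "the cat sat on the mat".split()
counts = Counter(corpus)
total = sum(counts.values())
log_probs = [math.log(counts[w] / total) for w in corpus]
ppl = perplexity(log_probs)
```

A lower perplexity means the model spreads less probability mass over unlikely continuations; for inflectional languages, stem-based models aim to reduce perplexity by sharing statistics across the many surface forms of each stem.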
Cite as: Oikonomidis, D., Digalakis, V. (2003) Stem-based maximum entropy language models for inflectional languages. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 2285-2288, doi: 10.21437/Eurospeech.2003-638
@inproceedings{oikonomidis03_eurospeech,
  author={Dimitrios Oikonomidis and Vassilios Digalakis},
  title={{Stem-based maximum entropy language models for inflectional languages}},
  year=2003,
  booktitle={Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages={2285--2288},
  doi={10.21437/Eurospeech.2003-638}
}