In this work we build language models using three different training methods: n-gram, class-based and maximum entropy models. The main focus is the use of stem information to cope with the very large number of distinct word forms in an inflectional language such as Greek. We compare the three models in terms of both perplexity and word error rate, and we examine in detail the perplexity differences of the three models on specific subsets of words.
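Perplexity, the intrinsic metric used to compare the three models, is the exponentiated average negative log-probability the model assigns to the test words. A minimal sketch of the computation (the toy unigram model and corpus here are illustrative assumptions, not the paper's models):

```python
import math
from collections import Counter

def perplexity(log_probs):
    """PPL = exp(-(1/N) * sum_i log p(w_i | history)),
    given per-word natural-log probabilities."""
    n = len(log_probs)
    return math.exp(-sum(log_probs) / n)

# Toy unigram model estimated on (and scored against) a tiny corpus.
corpus = "the cat sat on the mat".split()
counts = Counter(corpus)
total = sum(counts.values())
log_probs = [math.log(counts[w] / total) for w in corpus]
ppl = perplexity(log_probs)
```

A lower perplexity means the model spreads less probability mass over unlikely continuations; for inflectional languages, stem-based models aim to reduce perplexity by sharing statistics across the many surface forms of each stem.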
Cite as: Oikonomidis, D., Digalakis, V. (2003) Stem-based maximum entropy language models for inflectional languages. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 2285-2288, doi: 10.21437/Eurospeech.2003-638
@inproceedings{oikonomidis03_eurospeech,
  author={Dimitrios Oikonomidis and Vassilios Digalakis},
  title={{Stem-based maximum entropy language models for inflectional languages}},
  year=2003,
  booktitle={Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages={2285--2288},
  doi={10.21437/Eurospeech.2003-638}
}