Lexicon optimization for Dutch speech recognition in spoken document retrieval

Ordelman, Roeland; Hessen, Arjan van; Jong, Franciska de

doi:10.21437/Eurospeech.2001-234

This paper describes ongoing work on language modelling and lexicon optimization for a Dutch speech recognition system for Spoken Document Retrieval: the collection and normalization of a training data set and the optimization of our recognition lexicon. We discuss how lexical coverage is affected by the amount of training data, by decompounding compound words, and by different selection methods for proper names and acronyms.
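
To make the notion of lexical coverage concrete, the following is a minimal sketch of how coverage might be measured for a frequency-selected lexicon. It is an illustration only, not the procedure used in the paper: the function names `build_lexicon` and `lexical_coverage`, the toy Dutch sentences, and the choice of selecting the most frequent word types are all assumptions made for this example.

```python
from collections import Counter


def build_lexicon(training_tokens, size):
    """Return the `size` most frequent word types in the training tokens.

    Hypothetical helper: frequency-based selection is an assumption here,
    not necessarily the selection method used in the paper.
    """
    counts = Counter(training_tokens)
    return {word for word, _ in counts.most_common(size)}


def lexical_coverage(lexicon, test_tokens):
    """Fraction of test tokens found in the lexicon (1 minus the OOV rate)."""
    if not test_tokens:
        raise ValueError("test_tokens must not be empty")
    covered = sum(1 for token in test_tokens if token in lexicon)
    return covered / len(test_tokens)


# Toy data, invented for illustration only.
training = "de kat zit op de mat en de hond zit op de bank".split()
test = "de hond zit op de stoel".split()

lexicon = build_lexicon(training, size=3)
coverage = lexical_coverage(lexicon, test)
print(f"lexicon: {sorted(lexicon)}")
print(f"coverage: {coverage:.3f}")
```

Repeating the measurement with lexicons built from increasing amounts of training text, or after splitting compound words into their parts, would give the kind of coverage comparison the abstract refers to.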

Cite as: Ordelman, R., Hessen, A.v., Jong, F.d. (2001) Lexicon optimization for Dutch speech recognition in spoken document retrieval. Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001), 1085-1088, doi: 10.21437/Eurospeech.2001-234

@inproceedings{ordelman01_eurospeech,
  author={Roeland Ordelman and Arjan van Hessen and Franciska de Jong},
  title={{Lexicon optimization for Dutch speech recognition in spoken document retrieval}},
  year=2001,
  booktitle={Proc. 7th European Conference on Speech Communication and Technology (Eurospeech 2001)},
  pages={1085--1088},
  doi={10.21437/Eurospeech.2001-234}
}