In this work we describe several approaches to determining an effective set of subword units for modeling spoken Greek. We aimed to form a concrete set of basic units capable of giving a unique phonetic transcription for every input utterance. An extensive set of experiments showed that using units longer than phonemes can lead to a significant improvement in system performance. Three sets of subword units were finally formed, differing in the way we combined the 42 phonemes of the Greek language. All three approaches gave better results than the baseline phoneme-based system; the most effective was the second approach, in which we used two-phoneme combinations of the types non-vowel/vowel and non-vowel/non-vowel. In the best case, the phoneme recognition rate of the system increased by almost 9% over the baseline, reaching 78.65%.
Cite as: Tsopanoglou, A., Fakotakis, N. (1997) Selection of the most effective set of subword units for an HMM-based speech recognition system. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 1231-1234, doi: 10.21437/Eurospeech.1997-31
@inproceedings{tsopanoglou97_eurospeech,
  author={Anastasios Tsopanoglou and Nikos Fakotakis},
  title={{Selection of the most effective set of subword units for an HMM-based speech recognition system}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={1231--1234},
  doi={10.21437/Eurospeech.1997-31}
}