ISCA Archive Eurospeech 1993

Duration of phones as function of utterance length and its use in automatic speech recognition

Yifan Gong, William C. Treurniet

Duration probability of phonemes is widely used as a constraint in phoneme-based continuous speech recognizers, and is known to improve recognition accuracy. Usually, models of phoneme duration are extracted from continuous utterances of sentences in a training database. However, tokens obtained from shorter utterances may not be well represented by models created from longer utterances. We designed an experiment to compute observed average phoneme duration as a function of the number of phonemes per utterance. We observed that the average duration consistently decreases as the number of phonemes per utterance increases. The experiment showed that the average duration of phonemes in words spoken in isolation may be as much as 50% longer than the average duration of phonemes in continuously spoken sentences. The variation of phoneme duration as a function of utterance duration was modeled in both the phoneme probability estimation stage and the utterance search stage of a recognition system. As a result, a 47% reduction in word recognition errors was obtained.
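The measurement described in the abstract, average phoneme duration as a function of the number of phonemes per utterance, can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `mean_duration_by_utterance_length`, the input layout (one list of per-phoneme durations in milliseconds per utterance), and the toy values are all assumptions made for illustration only.

```python
from collections import defaultdict


def mean_duration_by_utterance_length(utterances):
    """Group utterances by phoneme count and average the phoneme durations.

    `utterances` is an iterable of lists; each list holds the durations
    (here in milliseconds) of the phonemes in one utterance. This input
    layout is an assumption, not something specified in the paper.
    Returns a dict mapping phoneme count to mean phoneme duration.
    """
    total_duration = defaultdict(float)  # summed duration per utterance length
    phoneme_count = defaultdict(int)     # number of phonemes per utterance length

    for durations in utterances:
        n = len(durations)
        if n == 0:
            continue  # skip empty utterances
        total_duration[n] += sum(durations)
        phoneme_count[n] += n

    return {n: total_duration[n] / phoneme_count[n] for n in sorted(phoneme_count)}


# Invented toy data, chosen only to show the shape of the computation.
toy_utterances = [
    [120, 150, 120],               # a short utterance of 3 phonemes
    [80, 100, 90, 70, 110, 90],    # a longer utterance of 6 phonemes
]

if __name__ == "__main__":
    for n, mean_ms in mean_duration_by_utterance_length(toy_utterances).items():
        print(f"{n} phonemes per utterance: mean duration {mean_ms:.1f} ms")
```

With real forced-alignment output in place of the toy lists, the resulting table could be inspected for the trend the abstract reports between isolated words and continuously spoken sentences.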


doi: 10.21437/Eurospeech.1993-98

Cite as: Gong, Y., Treurniet, W.C. (1993) Duration of phones as function of utterance length and its use in automatic speech recognition. Proc. 3rd European Conference on Speech Communication and Technology (Eurospeech 1993), 315-318, doi: 10.21437/Eurospeech.1993-98

@inproceedings{gong93_eurospeech,
  author={Yifan Gong and William C. Treurniet},
  title={{Duration of phones as function of utterance length and its use in automatic speech recognition}},
  year=1993,
  booktitle={Proc. 3rd European Conference on Speech Communication and Technology (Eurospeech 1993)},
  pages={315--318},
  doi={10.21437/Eurospeech.1993-98}
}