ISCA Archive Interspeech 2007

A comparative study of speech rate estimation techniques

Tomas Dekens, Mike Demol, Werner Verhelst, Piet Verhoeve

In this paper we evaluate the performance of eight speech rate estimators [1, 2, 3, 4, 5] previously described in the literature by applying them to a multilingual test database [6]. All the estimators underestimate high speech rates, and some also overestimate low speech rates. Overall, the tested methods obtain high correlation coefficients with the reference speech rate. The Temporal Correlation and Selected Sub-band Correlation method (tcssbc), which uses sub-band and time-domain correlation to detect the number of vowels or diphthongs present in the speech signal, yields small errors and appears to be the most appropriate overall technique for speech rate estimation.
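
As a rough illustration of how a high correlation coefficient can coexist with rate-dependent bias, the sketch below compares estimated and reference rates. It is not the paper's evaluation code: the function names, the numbers, the syllables-per-second unit, and the 4.0 threshold separating slow from fast speech are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_error(estimated, reference):
    """Mean signed error; negative values indicate underestimation."""
    return sum(e - r for e, r in zip(estimated, reference)) / len(reference)


# Hypothetical rates (assumed unit: syllables per second), not data from the paper.
reference = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
estimated = [2.4, 3.2, 4.0, 4.8, 5.6, 6.4]

# Assumed threshold separating slow from fast utterances.
THRESHOLD = 4.0
slow = [(e, r) for e, r in zip(estimated, reference) if r <= THRESHOLD]
fast = [(e, r) for e, r in zip(estimated, reference) if r > THRESHOLD]

correlation = pearson(reference, estimated)
slow_bias = mean_error([e for e, _ in slow], [r for _, r in slow])
fast_bias = mean_error([e for e, _ in fast], [r for _, r in fast])

print(f"correlation: {correlation:.3f}")
print(f"mean error at low rates:  {slow_bias:+.2f}")
print(f"mean error at high rates: {fast_bias:+.2f}")
```

In this made-up example the estimates track the reference almost perfectly in the correlation sense, yet the signed error is positive for slow speech and negative for fast speech, which is why reporting the error per rate range alongside the correlation is informative.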


doi: 10.21437/Interspeech.2007-237

Cite as: Dekens, T., Demol, M., Verhelst, W., Verhoeve, P. (2007) A comparative study of speech rate estimation techniques. Proc. Interspeech 2007, 510-513, doi: 10.21437/Interspeech.2007-237

@inproceedings{dekens07_interspeech,
  author={Tomas Dekens and Mike Demol and Werner Verhelst and Piet Verhoeve},
  title={{A comparative study of speech rate estimation techniques}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={510--513},
  doi={10.21437/Interspeech.2007-237}
}