In this paper we present a speech rate estimator based on so-called rhythmicity features derived from a modified version of the short-time energy envelope. To evaluate the new method, we compare it with a traditional speech rate estimator based on semi-automatic segmentation. Speech material from the Alcohol Language Corpus (ALC), which covers intoxicated and sober speech in different speech styles, provides a statistically sound basis for testing. The proposed measure clearly correlates with the semi-automatically determined speech rate and seems to be robust across speech styles and speaker states.
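As a rough illustration of the general idea of reading a rate cue off the energy envelope, the following minimal sketch computes a plain short-time energy envelope and picks the strongest modulation frequency in a syllable-rate band. It is not the authors' method: the abstract does not specify how the envelope is modified or how the rhythmicity parameters are defined, so the function names, the 25 ms / 10 ms framing, and the 2 to 10 Hz search band are all assumptions made for this example.

```python
import numpy as np


def short_time_energy(signal, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Return the short-time energy envelope and its frame rate in Hz.

    Frame and hop lengths are illustrative defaults, not values from the paper.
    """
    frame = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    n_frames = 1 + (len(signal) - frame) // hop
    window = np.hanning(frame)
    energy = np.empty(n_frames)
    for i in range(n_frames):
        segment = signal[i * hop:i * hop + frame] * window
        energy[i] = np.sum(segment ** 2)
    return energy, sample_rate / hop


def dominant_envelope_rate(envelope, frame_rate, f_min=2.0, f_max=10.0):
    """Return the strongest envelope modulation frequency (Hz) in [f_min, f_max].

    The search band is an assumed syllable-rate range, used here as a
    stand-in for a rhythmicity feature.
    """
    centered = envelope - envelope.mean()  # remove the DC component
    spectrum = np.abs(np.fft.rfft(centered * np.hanning(len(centered))))
    freqs = np.fft.rfftfreq(len(centered), d=1.0 / frame_rate)
    band = (freqs >= f_min) & (freqs <= f_max)
    return freqs[band][np.argmax(spectrum[band])]


# Synthetic check: a 200 Hz tone whose amplitude is modulated 5 times per second.
sample_rate = 16000
t = np.arange(4 * sample_rate) / sample_rate
test_signal = (1.0 + 0.8 * np.sin(2 * np.pi * 5.0 * t)) * np.sin(2 * np.pi * 200.0 * t)

envelope, frame_rate = short_time_energy(test_signal, sample_rate)
estimated_rate = dominant_envelope_rate(envelope, frame_rate)
print(f"dominant envelope modulation: {estimated_rate:.2f} Hz")
```

On real recordings, such a per-utterance value could be compared against a segmentation-based speech rate with a correlation coefficient (for example via `np.corrcoef`), which mirrors the kind of evaluation the abstract describes without reproducing its actual features.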
Cite as: Heinrich, C., Schiel, F. (2011) Estimating speaking rate by means of rhythmicity parameters. Proc. Interspeech 2011, 1873-1876, doi: 10.21437/Interspeech.2011-509
@inproceedings{heinrich11_interspeech,
  author={Christian Heinrich and Florian Schiel},
  title={{Estimating speaking rate by means of rhythmicity parameters}},
  year=2011,
  booktitle={Proc. Interspeech 2011},
  pages={1873--1876},
  doi={10.21437/Interspeech.2011-509}
}