Abstract
Automatic prosodic analysis has been used successfully to evaluate speech intelligibility in voice rehabilitation. This paper examines the influence of reading errors and of restricting the computation of prosodic features to selected words (nouns only, nouns and verbs, the beginning of each sentence, or the beginnings of sentences and subclauses). A total of 73 hoarse patients (48.3 ± 16.8 years) read the German version of the text “The North Wind and the Sun”. Five trained experts rated their intelligibility perceptually on a 5-point scale. Eight prosodic features showed human-machine correlations of r ≥ 0.4. The most robust features were the normalized energy in a word-pause-word interval, computed from all words (r = 0.69 for the full speaker set), the mean jitter in nouns and verbs (r = 0.67), and the pause duration before a word (r = 0.66). However, reading errors can significantly influence these results.
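The human-machine correlation reported above can be illustrated with a minimal sketch. It assumes that the perceptual reference is the mean of the expert ratings per speaker and that r is Pearson's correlation coefficient; the abstract states neither, so both are assumptions. The function names, the use of the absolute value for the 0.4 threshold, and the toy numbers are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_rating_per_speaker(ratings):
    """Average the expert scores of each speaker (assumed reference value)."""
    return [mean(speaker_scores) for speaker_scores in ratings]


def select_features(features, reference, threshold=0.4):
    """Keep features whose |r| against the reference reaches the threshold.

    Using the absolute value is an assumption: the sign of r depends on
    the direction of the rating scale.
    """
    selected = {}
    for name, values in features.items():
        r = pearson_r(values, reference)
        if abs(r) >= threshold:
            selected[name] = r
    return selected


if __name__ == "__main__":
    # Hypothetical toy data: 4 speakers, 5 raters each, 5-point scale.
    ratings = [[1, 2, 1, 1, 2], [3, 3, 2, 3, 3], [4, 4, 5, 4, 4], [5, 5, 4, 5, 5]]
    reference = mean_rating_per_speaker(ratings)
    # Hypothetical per-speaker feature values (not real measurements).
    features = {
        "pause_duration_before_word": [0.12, 0.25, 0.41, 0.55],
        "unrelated_feature": [0.9, 0.1, 0.8, 0.2],
    }
    for name, r in select_features(features, reference).items():
        print(f"{name}: r = {r:.2f}")
```

Such a filter only ranks single features against the averaged ratings; it does not show how robust a feature is to reading errors, which is the question the paper addresses.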
Acknowledgments
Dr. Döllinger’s contribution was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft; DFG), grant no. DO1247/8-1.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Haderlein, T., Schützenberger, A., Döllinger, M., Nöth, E. (2017). Robust Automatic Evaluation of Intelligibility in Voice Rehabilitation Using Prosodic Analysis. In: Ekštein, K., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2017. Lecture Notes in Computer Science, vol 10415. Springer, Cham. https://doi.org/10.1007/978-3-319-64206-2_2
DOI: https://doi.org/10.1007/978-3-319-64206-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64205-5
Online ISBN: 978-3-319-64206-2
eBook Packages: Computer Science, Computer Science (R0)