Abstract
The changes that occur in the human voice due to ageing have been well documented. The impact of these changes on speaker verification is less clear. In this work, we examine the effect of long-term vocal ageing on a speaker verification system. Using a conventional GMM-UBM system on a cohort of 13 adult speakers, we carry out longitudinal testing of each speaker across a time span of 30-40 years. We uncover a progressive degradation in verification score as the time span between the training and test material increases. Adding temporal information to the features increases the rate of degradation. No significant difference was found between MFCC and PLP features. Subsequent experiments show that the effect of short-term ageing (<5 years) is not significant compared with normal inter-session variability. Beyond this time span, however, ageing has a detrimental effect on verification. Finally, we show that the age of the speaker at the time of training influences the rate at which the verification scores degrade. Our results suggest that the verification score drop-off accelerates for speakers over the age of 60. The results presented are the first of their kind to quantify the effect of long-term vocal ageing on speaker verification.
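For readers unfamiliar with GMM-UBM scoring, the verification score referred to above is commonly computed as an average per-frame log-likelihood ratio between a speaker model and a universal background model (UBM). The following is a minimal sketch of that general technique, not the authors' implementation: feature extraction (MFCC or PLP) and MAP adaptation of the speaker model are omitted, the function names are hypothetical, and the toy two-dimensional model parameters are invented purely for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def llr_score(frames, speaker_gmm, ubm):
    """Average per-frame log-likelihood ratio: speaker model vs. UBM."""
    total = sum(log_gmm(x, speaker_gmm) - log_gmm(x, ubm) for x in frames)
    return total / len(frames)

# Toy 2-D models (hypothetical values, for illustration only).
ubm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])]
speaker = [(0.5, [0.5, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.5], [1.0, 1.0])]

near = [[0.5, 0.0], [3.0, 3.5]]    # frames close to the speaker model
drift = [[-0.5, 0.0], [3.0, 2.5]]  # frames shifted away from the speaker model

print(llr_score(near, speaker, ubm))   # positive: favours the speaker model
print(llr_score(drift, speaker, ubm))  # negative: favours the UBM
```

In this toy setting, frames that move away from the enrolled speaker model receive a lower score. That is the kind of score drop the abstract reports as the gap between training and test material grows, although the synthetic shift here is not a model of vocal ageing.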
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kelly, F., Harte, N. (2011). Effects of Long-Term Ageing on Speaker Verification. In: Vielhauer, C., Dittmann, J., Drygajlo, A., Juul, N.C., Fairhurst, M.C. (eds) Biometrics and ID Management. BioID 2011. Lecture Notes in Computer Science, vol 6583. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19530-3_11
DOI: https://doi.org/10.1007/978-3-642-19530-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19529-7
Online ISBN: 978-3-642-19530-3
eBook Packages: Computer Science (R0)