Score Fusion in Text-Dependent Speaker Recognition Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6800)

Abstract

Owing to several significant advantages, text-dependent speaker recognition is still widely used in biometric systems. Compared with text-independent systems, text-dependent systems are more accurate and more resistant to replay attacks. There are many approaches to text-dependent recognition. This paper introduces a combination of classifiers based on fractional distances, the biometric dispersion matcher, and dynamic time warping. The first two classifiers are based on a voice imprint. They have low memory requirements and a fast recognition procedure, which is especially advantageous in low-cost, battery-powered biometric systems. It is shown that, using trained score fusion, it is possible to reach a successful detection rate of 98.98%, or 92.19% in the case of microphone mismatch. During verification, the system reached an equal error rate of 2.55%, or 6.77% under microphone mismatch. The system was tested on a Catalan database consisting of 48 speakers (three 3 s training samples per speaker).
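For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of a fractional distance, a dynamic time warping (DTW) distance, and a weighted-sum score fusion. It is an illustration under stated assumptions, not the authors' implementation: the function names, the exponent p = 0.5, the Euclidean local cost in DTW, and the fusion weights are all hypothetical. In particular, the abstract does not specify how the fusion is trained, so the weights below are placeholders rather than learned values.

```python
import numpy as np


def fractional_distance(x, y, p=0.5):
    """Minkowski-form distance with a fractional exponent 0 < p < 1.

    The value p = 0.5 is an assumed default, not taken from the paper.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))


def _as_frames(seq):
    """Return seq as a (frames, dims) array; 1-D input becomes (frames, 1)."""
    arr = np.asarray(seq, dtype=float)
    return arr[:, None] if arr.ndim == 1 else arr


def dtw_distance(a, b):
    """Accumulated cost of the best DTW alignment of two frame sequences.

    Uses the Euclidean distance between frames as the local cost and the
    basic step pattern (insertion, deletion, match), both assumptions.
    """
    a, b = _as_frames(a), _as_frames(b)
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            acc[i, j] = local + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return float(acc[n, m])


def fuse_scores(scores, weights):
    """Weighted-sum fusion of per-classifier scores.

    Assumes the scores are already normalized to a common scale and
    orientation. The weights are hypothetical and normalized to sum to 1.
    """
    s = np.asarray(scores, dtype=float)
    w = np.asarray(weights, dtype=float)
    return float(np.dot(s, w / w.sum()))
```

For example, `dtw_distance([0, 1, 2], [0, 1, 1, 2])` is zero because the repeated middle frame can be absorbed by the warping path, whereas a fixed frame-by-frame comparison would not even be defined for sequences of different lengths. In a trained fusion, the weights passed to `fuse_scores` would be fitted on held-out scores rather than set by hand.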



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mekyska, J., Faundez-Zanuy, M., Smékal, Z., Fàbregas, J. (2011). Score Fusion in Text-Dependent Speaker Recognition Systems. In: Esposito, A., Vinciarelli, A., Vicsi, K., Pelachaud, C., Nijholt, A. (eds) Analysis of Verbal and Nonverbal Communication and Enactment. The Processing Issues. Lecture Notes in Computer Science, vol 6800. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25775-9_12

  • DOI: https://doi.org/10.1007/978-3-642-25775-9_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-25774-2

  • Online ISBN: 978-3-642-25775-9

  • eBook Packages: Computer Science, Computer Science (R0)
