Speaker Recognition: Introduction

Robustness-Related Issues in Speaker Recognition

Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSSIGNAL)

Abstract

In ancient wartime, officers and soldiers could tell friend from foe by means of predetermined passwords. In daily life, we humans get in and out of a house using keys or e-cards. When surfing the Internet, a user logs in to websites or mail servers with his or her account and password.

Copyright information

© 2017 The Author(s)

About this chapter

Cite this chapter

Zheng, T.F., Li, L. (2017). Speaker Recognition: Introduction. In: Robustness-Related Issues in Speaker Recognition. SpringerBriefs in Electrical and Computer Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-10-3238-7_1

  • DOI: https://doi.org/10.1007/978-981-10-3238-7_1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3237-0

  • Online ISBN: 978-981-10-3238-7

  • eBook Packages: Engineering, Engineering (R0)
