Transformation-Based GMM with Improved Cluster Algorithm for Speaker Identification

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Abstract

Embedding a linear transformation in a Gaussian mixture model (GMM) is a popular technique for speaker recognition: it integrates the feature transformation and a diagonal-covariance Gaussian mixture into a unified framework to improve recognition performance. However, the number of mixture components must be specified before the model is trained. The cluster expectation-maximization (EM) algorithm is a well-known technique that instead treats the mixture number as a parameter to be estimated. This paper presents a new model that integrates an improved cluster algorithm into the estimation of a GMM with an embedded transformation. In this approach, the transformation matrix, the mixture number, and the other traditional model parameters are estimated simultaneously under a maximum likelihood criterion. The proposed method is evaluated on a database comprising three data sessions for text-independent speaker identification. The experiments show that it outperforms the traditional GMM trained with the cluster EM algorithm.
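
The idea of treating the mixture number as an estimated quantity can be illustrated with a minimal sketch. It is not the authors' algorithm, and several things in it are assumptions made only for illustration: the transformation matrix `A` is fixed rather than estimated jointly, plain EM is rerun for each candidate mixture number instead of using the cluster algorithm, and candidates are compared with a minimum description length (MDL) penalty. The function names `diag_gmm_loglik` and `mdl_score` and the synthetic data are hypothetical.

```python
import numpy as np

def diag_gmm_loglik(X, K, n_iter=60, seed=0):
    """Fit a K-component diagonal-covariance GMM with plain EM; return the log-likelihood."""
    rng = np.random.default_rng(seed)
    N, M = X.shape
    means = X[rng.choice(N, size=K, replace=False)]  # initialise means at random samples
    variances = np.tile(X.var(axis=0), (K, 1))       # start from the global variance
    weights = np.full(K, 1.0 / K)

    def joint_log_prob():
        # log w_k + log N(x_n; mu_k, diag(var_k)) for all n, k -> shape (N, K)
        diff = X[:, None, :] - means[None, :, :]
        log_gauss = -0.5 * (np.sum(diff**2 / variances, axis=2)
                            + np.sum(np.log(2.0 * np.pi * variances), axis=1))
        return np.log(weights) + log_gauss

    for _ in range(n_iter):
        # E-step: responsibilities, computed stably in the log domain
        log_p = joint_log_prob()
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: weights, means, and floored diagonal variances
        Nk = resp.sum(axis=0) + 1e-10
        weights = Nk / Nk.sum()
        means = (resp.T @ X) / Nk[:, None]
        variances = np.maximum((resp.T @ X**2) / Nk[:, None] - means**2, 1e-6)
    return float(np.logaddexp.reduce(joint_log_prob(), axis=1).sum())

def mdl_score(X, K, A=None, n_restarts=3):
    """MDL-style score (lower is better) for a K-component GMM on features A @ x."""
    N, M = X.shape
    A = np.eye(M) if A is None else A       # assumption: A is fixed, not estimated
    Z = X @ A.T                             # transformed features
    _, logabsdet = np.linalg.slogdet(A)
    loglik = max(diag_gmm_loglik(Z, K, seed=s) for s in range(n_restarts))
    loglik += N * logabsdet                 # Jacobian term: likelihood in the original space
    n_params = K * (1 + 2 * M) - 1          # weights, means, diagonal variances
    return -loglik + 0.5 * n_params * np.log(N * M)

# Synthetic stand-in for speaker features: three well-separated 2-D clusters.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(size=(200, 2)) for c in centers])

scores = {K: mdl_score(X, K) for K in range(1, 6)}
best_K = min(scores, key=scores.get)
print("selected mixture number:", best_K)
```

The `N * logabsdet` term is what makes likelihoods comparable across different choices of `A`: without it, shrinking the features would inflate the likelihood for free. A joint estimator of the kind the abstract describes would also update `A` inside the loop, which this sketch leaves out.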

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Xu, L., Tang, Z., He, K., Qian, B. (2007). Transformation-Based GMM with Improved Cluster Algorithm for Speaker Identification. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_113

  • DOI: https://doi.org/10.1007/978-3-540-71701-0_113

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-71700-3

  • Online ISBN: 978-3-540-71701-0

  • eBook Packages: Computer Science, Computer Science (R0)
