Transformation-Based GMM with Improved Cluster Algorithm for Speaker Identification

Xu, Limin; Tang, Zhenmin; He, Keke; Qian, Bo

doi:10.1007/978-3-540-71701-0_113

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4426)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

The embedded linear transformation is a popular technique that integrates a feature transformation and a diagonal-covariance Gaussian mixture model (GMM) into a unified framework to improve speaker recognition performance. However, the number of mixture components of the GMM must be specified before model training. The cluster expectation-maximization (EM) algorithm is a well-known technique that instead treats the mixture number as a parameter to be estimated. This paper presents a new model that integrates an improved cluster algorithm into the estimation of a GMM with the embedded transformation. In this approach, the transformation matrix, the mixture number, and the other traditional model parameters are estimated simultaneously under a maximum likelihood criterion. The proposed method is evaluated on a database of three data sessions for text-independent speaker identification. The experiments show that it outperforms the traditional GMM trained with the cluster EM algorithm.
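The idea of treating the mixture number as an estimated quantity can be illustrated with a small sketch. This is not the method of the paper: it omits the embedded transformation, works on one-dimensional toy data, and simply scans candidate mixture numbers instead of merging clusters. It assumes a minimum description length (MDL) style penalty of half the number of free parameters times the log of the sample size; the function names `fit_gmm_1d` and `mdl_score` and the synthetic data are hypothetical.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def fit_gmm_1d(data, k, iters=100):
    """Run EM for a k-component 1-D GMM.

    Returns the log-likelihood computed in the last E-step.
    """
    n = len(data)
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    var_floor = 1e-3 * grand_var  # guards against collapsing components
    ordered = sorted(data)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    loglik = float("-inf")
    for _ in range(iters):
        # E-step: responsibilities, using log-sum-exp for stability.
        resp = []
        loglik = 0.0
        for x in data:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            top = max(logs)
            log_total = top + math.log(sum(math.exp(v - top) for v in logs))
            loglik += log_total
            resp.append([math.exp(v - log_total) for v in logs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-10)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)
    return loglik


def mdl_score(data, k):
    """Negative log-likelihood plus an MDL-style complexity penalty (assumed form)."""
    n_params = 3 * k - 1  # (k - 1) weights + k means + k variances
    return -fit_gmm_1d(data, k) + 0.5 * n_params * math.log(len(data))


# Synthetic toy data: two well-separated clusters (an assumption for illustration).
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
        + [rng.gauss(10.0, 1.0) for _ in range(150)])

# Score each candidate mixture number and keep the one with the lowest MDL.
scores = {k: mdl_score(data, k) for k in range(1, 5)}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

The penalty term discourages extra components unless they raise the likelihood enough, which is the general mechanism that lets the mixture number be chosen from the data rather than fixed in advance.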

Author information

Authors and Affiliations

School of Computer Science, Nanjing University of Science and Technology
Limin Xu, Zhenmin Tang, Keke He & Bo Qian

Editor information

Zhi-Hua Zhou, Hang Li, Qiang Yang

Cite this paper

Xu, L., Tang, Z., He, K., Qian, B. (2007). Transformation-Based GMM with Improved Cluster Algorithm for Speaker Identification. In: Zhou, ZH., Li, H., Yang, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-71701-0_113

DOI: https://doi.org/10.1007/978-3-540-71701-0_113
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-71700-3
Online ISBN: 978-3-540-71701-0
eBook Packages: Computer Science, Computer Science (R0)