The first motivation for using Gaussian mixture models for text-independent speaker identification is the observation that a linear combination of Gaussian basis functions can represent a large class of sample distributions. While this technique generally gives good results, little is known about which specific part of a speech signal best identifies a speaker. This contribution suggests a procedure, based on the Jensen divergence measure, to automatically extract from the input speech signal the part that contributes most to identifying a speaker. The results obtained show that this technique can significantly increase the performance of a speaker recognition system.
Cite as: Vergin, R., O'Shaughnessy, D. (1997) A double Gaussian mixture modeling approach to speaker recognition. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 2287-2290, doi: 10.21437/Eurospeech.1997-602
@inproceedings{vergin97_eurospeech,
  author={Rivarol Vergin and Douglas O'Shaughnessy},
  title={{A double Gaussian mixture modeling approach to speaker recognition}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={2287--2290},
  doi={10.21437/Eurospeech.1997-602}
}