We compare several neural network architectures for measuring the degree of similarity among speakers. For each speaker in a reference set, Multilayer Perceptrons and Radial Basis Functions are trained to perform a non-linear principal component analysis of acoustic vectors, and Self-Organized Feature Maps are used to construct Vector Quantizers. As a first, simple step, we use non-discriminant training to characterize speakers, and the result is then applied to combine speaker-dependent speech recognition models. In a second phase, discriminant training over speaker models is carried out, and the speaker verification and identification performance of these networks is evaluated.
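As a rough illustration of the vector-quantizer idea mentioned in the abstract, the sketch below trains a small one-dimensional self-organizing map per speaker and treats its weight vectors as a codebook. The abstract does not specify how similarity is computed from the quantizers, so the symmetrised cross-distortion used here is an assumption, as are all function names, the map size, the learning-rate and neighbourhood schedules, and the synthetic two-dimensional "acoustic vectors".

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_som_codebook(vectors, n_units=4, epochs=20, seed=0):
    """Train a 1-D self-organizing map; its weights serve as a VQ codebook.

    Hypothetical helper: the schedules below are illustrative choices,
    not taken from the paper.
    """
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen training vector.
    codebook = [list(v) for v in rng.sample(vectors, n_units)]
    n_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(vectors)
        rng.shuffle(order)
        for v in order:
            frac = step / n_steps
            lr = 0.5 * (1.0 - frac)                   # learning rate decays linearly
            radius = (n_units / 2.0) * (1.0 - frac)   # neighbourhood shrinks over time
            # Best-matching unit for this input vector.
            winner = min(range(n_units), key=lambda i: sq_dist(codebook[i], v))
            for i in range(n_units):
                d = abs(i - winner)
                if d <= radius:
                    h = lr / (1 + d)  # weaker update for units farther from the winner
                    codebook[i] = [w + h * (x - w) for w, x in zip(codebook[i], v)]
            step += 1
    return codebook


def distortion(vectors, codebook):
    """Mean squared quantization error of `vectors` under `codebook`."""
    return sum(min(sq_dist(c, v) for c in codebook) for v in vectors) / len(vectors)


def dissimilarity(vectors_a, codebook_a, vectors_b, codebook_b):
    """Assumed measure: symmetrised cross-distortion (lower means more similar)."""
    return 0.5 * (distortion(vectors_a, codebook_b) + distortion(vectors_b, codebook_a))


def synthetic_speaker(center, n=200, seed=0):
    """Hypothetical stand-in for a speaker's acoustic vectors (2-D Gaussian cloud)."""
    rng = random.Random(seed)
    return [[rng.gauss(c, 0.5) for c in center] for _ in range(n)]
```

With this measure, quantizing one speaker's vectors with another speaker's codebook gives a low distortion when the two occupy similar regions of the acoustic space, and a high one otherwise. Real acoustic vectors (for example cepstral coefficients) would replace the synthetic clouds.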
Keywords: Speech recognition, speaker recognition, neural networks, similarity measures, models.
Cite as: Hernandez-Mendez, J.A., Figueiras-Vidal, A.R. (1993) Measuring similarities among speakers by means of neural networks. Proc. 3rd European Conference on Speech Communication and Technology (Eurospeech 1993), 643-646, doi: 10.21437/Eurospeech.1993-156
@inproceedings{hernandezmendez93_eurospeech,
  author={J. A. Hernandez-Mendez and Anibal R. Figueiras-Vidal},
  title={{Measuring similarities among speakers by means of neural networks}},
  year=1993,
  booktitle={Proc. 3rd European Conference on Speech Communication and Technology (Eurospeech 1993)},
  pages={643--646},
  doi={10.21437/Eurospeech.1993-156}
}