ISCA Archive Eurospeech 1991

Speaker clustering for dialectic robustness in speaker independent recognition

Dirk Van Compernolle, J. Smolders, P. Jaspers, T. Hellemans

In this paper, methods for creating multiple baseforms in an HMM speaker-independent speech recognition system are compared and analyzed. The multiple baseforms are used to better model differing speaker characteristics, such as sex or regional accent. The required speaker classes are obtained either from known categorical differences (sex, address) or through an adaptive clustering procedure. Both methods are compared in a Dutch/Flemish digit recognizer. Telephone recordings from 600 Dutch and 600 Flemish speakers were used. A two-baseform system based on regional subdivision leads to a 3% improvement in recognition performance and yields results comparable to within-class performance using a single model. Furthermore, division on the basis of accent is significantly more advantageous than division based on sex. Iterative clustering procedures generally do not work well, as the different models tend to overlap more with every iteration step. Ultimately, it was found that subdividing speakers into classes only helps if the number of speakers per class remains truly large (typically > 200).


doi: 10.21437/Eurospeech.1991-190

Cite as: Compernolle, D.V., Smolders, J., Jaspers, P., Hellemans, T. (1991) Speaker clustering for dialectic robustness in speaker independent recognition. Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991), 723-726, doi: 10.21437/Eurospeech.1991-190

@inproceedings{compernolle91_eurospeech,
  author={Dirk Van Compernolle and J. Smolders and P. Jaspers and T. Hellemans},
  title={{Speaker clustering for dialectic robustness in speaker independent recognition}},
  year=1991,
  booktitle={Proc. 2nd European Conference on Speech Communication and Technology (Eurospeech 1991)},
  pages={723--726},
  doi={10.21437/Eurospeech.1991-190}
}