Multi-state predictive neural networks for text-independent speaker recognition

Artieres, T.; Gallinari, Patrick

doi:10.21437/Eurospeech.1995-106

Both hidden Markov models (HMMs) and neural networks have been used as production systems for speaker identification and verification. Recently, [9] showed that ergodic multi-state HMMs do not outperform one-state "hidden" Markov models, i.e. Gaussian mixture models, for speaker recognition. That work demonstrated that the important characteristic of these models is the total number of mixture components, not the number of states. These HMMs are thus unable to exploit temporal information for speaker recognition. On the other hand, recent experiments have shown that, for neural predictive systems, modeling non-stationarity significantly improves performance [6]. We are interested here in the development of such models, which we refer to as multi-state predictive neural networks (MSPNNs). We study the ability of these systems to perform speaker identification and discuss the superiority of multi-state over one-state models. We provide results on 15 talkers from the TIMIT database.

Cite as: Artieres, T., Gallinari, P. (1995) Multi-state predictive neural networks for text-independent speaker recognition. Proc. 4th European Conference on Speech Communication and Technology (Eurospeech 1995), 633-636, doi: 10.21437/Eurospeech.1995-106

@inproceedings{artieres95_eurospeech,
  author={T. Artieres and Patrick Gallinari},
  title={{Multi-state predictive neural networks for text-independent speaker recognition}},
  year=1995,
  booktitle={Proc. 4th European Conference on Speech Communication and Technology (Eurospeech 1995)},
  pages={633--636},
  doi={10.21437/Eurospeech.1995-106}
}