DNN senone MAP multinomial i-vectors for phonotactic language recognition

McCree, Alan; Garcia-Romero, Daniel

doi:10.21437/Interspeech.2015-162

Deep neural networks have recently shown great promise for language recognition. In particular, the expected counts of clustered context-dependent phone states (senones) can serve as a simple but effective phonotactic system. This paper introduces multinomial i-vectors applied to senone counts and shows that they work better than current PCA approaches. In addition, we show that a new approach using a standard normal prior and MAP multinomial i-vector estimation further improves performance, particularly for shorter test durations. Finally, we present a reduced-complexity version of Newton's method to greatly accelerate multinomial i-vector extraction. Experimental results on the NIST LRE11 task show that this approach performs significantly better than top-performing acoustic and phonotactic systems from that evaluation.
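
As a rough illustration of the MAP multinomial i-vector estimation mentioned above, the sketch below assumes the common subspace multinomial parameterization, in which senone probabilities are softmax(m + T w) and the i-vector w has a standard normal prior. The names `m`, `T`, `log_posterior`, and `map_multinomial_ivector` are hypothetical, and the solver is a plain full Newton iteration with step halving, not the reduced-complexity variant the paper proposes.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(z - z.max())
    return e / e.sum()

def log_posterior(w, counts, m, T):
    # Multinomial log-likelihood (up to a constant) plus the N(0, I) log-prior.
    z = m + T @ w
    log_p = z - (z.max() + np.log(np.exp(z - z.max()).sum()))
    return counts @ log_p - 0.5 * (w @ w)

def map_multinomial_ivector(counts, m, T, n_iter=50, tol=1e-8):
    """MAP estimate of w for counts ~ Multinomial(softmax(m + T w)), w ~ N(0, I).

    counts: (K,) expected senone counts; m: (K,) offset; T: (K, R) subspace.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    dim = T.shape[1]
    w = np.zeros(dim)
    for _ in range(n_iter):
        p = softmax(m + T @ w)
        # Gradient of the log-posterior with respect to w.
        grad = T.T @ (counts - n * p) - w
        if np.linalg.norm(grad) < tol:
            break
        # Negative Hessian: n * T^T (diag(p) - p p^T) T + I (positive definite).
        Tp = T.T @ p
        neg_hess = n * ((T.T * p) @ T - np.outer(Tp, Tp)) + np.eye(dim)
        step = np.linalg.solve(neg_hess, grad)
        # Halve the step until the objective does not decrease.
        f_cur = log_posterior(w, counts, m, T)
        t = 1.0
        while log_posterior(w + t * step, counts, m, T) < f_cur and t > 1e-8:
            t *= 0.5
        w = w + t * step
    return w
```

Because the identity term from the prior dominates the Hessian when the total count is small, the estimate is pulled toward zero for short utterances, which is consistent with the abstract's remark that the MAP approach helps most at shorter test durations.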

Cite as: McCree, A., Garcia-Romero, D. (2015) DNN senone MAP multinomial i-vectors for phonotactic language recognition. Proc. Interspeech 2015, 394-397, doi: 10.21437/Interspeech.2015-162

@inproceedings{mccree15_interspeech,
  author={Alan McCree and Daniel Garcia-Romero},
  title={{DNN senone MAP multinomial i-vectors for phonotactic language recognition}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={394--397},
  doi={10.21437/Interspeech.2015-162}
}