A noise-robust system for NIST 2012 speaker recognition evaluation

Ferrer, Luciana; McLaren, Mitchell; Scheffer, Nicolas; Lei, Yun; Graciarena, Martin; Mitra, Vikramjit

doi:10.21437/Interspeech.2013-471

The National Institute of Standards and Technology (NIST) 2012 speaker recognition evaluation posed several new challenges, including noisy data, varying test-sample lengths, varying numbers of enrollment samples, and a new metric. Target speakers were known during system development and could be used for model training and score normalization. For the evaluation, SRI International (SRI) submitted a system consisting of six subsystems that use different low- and high-level features, some specifically designed for noise robustness, fused at the score and iVector levels. This paper presents SRI's submission along with a careful analysis of the approaches that provided gains for this challenging evaluation, including a multiclass voice-activity detection system, the use of noisy data in system training, and the fusion of subsystems using acoustic characterization metadata.

Cite as: Ferrer, L., McLaren, M., Scheffer, N., Lei, Y., Graciarena, M., Mitra, V. (2013) A noise-robust system for NIST 2012 speaker recognition evaluation. Proc. Interspeech 2013, 1981-1985, doi: 10.21437/Interspeech.2013-471

@inproceedings{ferrer13_interspeech,
  author={Luciana Ferrer and Mitchell McLaren and Nicolas Scheffer and Yun Lei and Martin Graciarena and Vikramjit Mitra},
  title={{A noise-robust system for NIST 2012 speaker recognition evaluation}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={1981--1985},
  doi={10.21437/Interspeech.2013-471}
}