ISCA Archive Interspeech 2006

Speaker identification under noisy environments by using harmonic structure extraction and reliable frame weighting

Hiromasa Fujihara, Tetsuro Kitahara, Masataka Goto, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno

We present methods for automatic speaker identification in noisy environments. To improve the noise robustness of speaker identification, we developed two methods: harmonic structure extraction and reliable frame weighting. The harmonic structure extraction method allows the speaker of an input speech signal to be identified after environmental noise has been reduced. It first extracts the harmonic components of the speech from the sound mixture and then resynthesizes a clean speech signal with a sinusoidal model driven by those harmonic components. The reliable frame weighting method then estimates how reliable each frame of the resynthesized speech is (i.e., how little it is influenced by environmental noise) by using two Gaussian mixture models, one for speech and one for noise. The speaker can then be identified robustly by giving greater weight to the reliable frames. Experimental results with thirty speakers showed that our method reduced the influence of environmental noise and achieved an error rate of 10.7%, compared with 18.9% for a conventional method.
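
The reliable frame weighting idea can be illustrated with a minimal sketch. The abstract does not state how the outputs of the two Gaussian mixture models are combined, so the weighting rule here (a logistic function of the per-frame speech-versus-noise log-likelihood ratio) is an assumption, as are the function names `gmm_frame_loglik`, `frame_reliability`, and `identify_speaker`, the diagonal-covariance model form, and the toy two-dimensional features. It is not the authors' implementation.

```python
import numpy as np

def gmm_frame_loglik(frames, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means and variances: (K, D).
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]               # (T, K, D)
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_comp = log_comp + np.log(weights)                       # (T, K)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_comp.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_comp - peak).sum(axis=1, keepdims=True)))[:, 0]

def frame_reliability(frames, speech_gmm, noise_gmm):
    """Reliability in (0, 1) per frame.

    Assumption: logistic squashing of the speech/noise log-likelihood ratio.
    """
    llr = gmm_frame_loglik(frames, *speech_gmm) - gmm_frame_loglik(frames, *noise_gmm)
    return 1.0 / (1.0 + np.exp(-np.clip(llr, -50.0, 50.0)))

def identify_speaker(frames, speaker_gmms, speech_gmm, noise_gmm):
    """Pick the speaker whose GMM maximises the reliability-weighted log-likelihood."""
    w = frame_reliability(frames, speech_gmm, noise_gmm)
    w = w / w.sum()
    scores = {name: float(np.sum(w * gmm_frame_loglik(frames, *gmm)))
              for name, gmm in speaker_gmms.items()}
    return max(scores, key=scores.get), scores

# Toy data (hypothetical): 20 clean frames near speaker "A", 30 noise-dominated frames.
rng = np.random.default_rng(0)
frames = np.vstack([rng.normal(-1.0, 0.5, (20, 2)), rng.normal(10.0, 0.5, (30, 2))])
speech_gmm = ([1.0], [[0.0, 0.0]], [[4.0, 4.0]])      # (weights, means, variances)
noise_gmm = ([1.0], [[10.0, 10.0]], [[4.0, 4.0]])
speaker_gmms = {
    "A": ([1.0], [[-1.0, -1.0]], [[1.0, 1.0]]),
    "B": ([1.0], [[1.0, 1.0]], [[1.0, 1.0]]),
}
best, scores = identify_speaker(frames, speaker_gmms, speech_gmm, noise_gmm)
print(best, scores)
```

In this toy setup the noise-dominated frames lie closer to speaker "B" than to speaker "A", so an unweighted sum of frame log-likelihoods is pulled toward the wrong speaker, while down-weighting the unreliable frames lets the clean frames decide.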


doi: 10.21437/Interspeech.2006-180

Cite as: Fujihara, H., Kitahara, T., Goto, M., Komatani, K., Ogata, T., Okuno, H.G. (2006) Speaker identification under noisy environments by using harmonic structure extraction and reliable frame weighting. Proc. Interspeech 2006, paper 1525-Wed1A1O.2, doi: 10.21437/Interspeech.2006-180

@inproceedings{fujihara06_interspeech,
  author={Hiromasa Fujihara and Tetsuro Kitahara and Masataka Goto and Kazunori Komatani and Tetsuya Ogata and Hiroshi G. Okuno},
  title={{Speaker identification under noisy environments by using harmonic structure extraction and reliable frame weighting}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1525-Wed1A1O.2},
  doi={10.21437/Interspeech.2006-180}
}