Target speech GMM-based spectral compensation for noise robust speech recognition

Shinozaki, Takahiro; Furui, Sadaoki

doi:10.21437/Interspeech.2009-361

To improve speech recognition performance in adverse conditions, a noise compensation method is proposed that applies a transformation in the spectral domain, with parameters optimized based on the likelihood of a speech GMM modeled in the feature domain. The idea is that additive and convolutional noises have mathematically simple expressions in the spectral domain, while speech characteristics are better modeled in a feature domain such as MFCC. The proposed method works as a feature extraction front-end that is independent of the decoding engine, and it can compensate for non-stationary additive and convolutional noises with a short time delay. It includes spectral subtraction as a special case in which no parameter optimization is performed. Experiments were performed using the AURORA-2J database. The proposed method achieved significantly higher recognition performance than spectral subtraction.
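The general idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the parametrization (a scalar subtraction weight `alpha` and channel `gain`), the grid search, the log-spectrum-plus-DCT features standing in for MFCC, and all function names are assumptions made for illustration only. It shows a spectral-domain transform whose parameter is chosen by the likelihood of a feature-domain GMM, with plain spectral subtraction recovered at `alpha = 1`, `gain = 1`.

```python
import numpy as np

def compensate(power_spec, noise_est, alpha=1.0, gain=1.0, floor=1e-3):
    """Hypothetical spectral-domain transform (an assumption, not the paper's form).

    With alpha=1 and gain=1 this is ordinary spectral subtraction with flooring.
    """
    cleaned = gain * power_spec - alpha * noise_est
    return np.maximum(cleaned, floor * power_spec)

def log_spectral_features(power_spec):
    """Stand-in for MFCC: log spectrum followed by a DCT-II (no mel filterbank)."""
    n = power_spec.shape[-1]
    k = np.arange(n)
    basis = np.cos(np.pi * (k[:, None] + 0.5) * k[None, :] / n)  # (bin, coeff)
    return np.log(power_spec) @ basis

def gmm_log_likelihood(feats, weights, means, variances):
    """Total log likelihood of all frames under a diagonal-covariance GMM."""
    diff = feats[:, None, :] - means[None, :, :]            # (frame, comp, dim)
    log_comp = (np.log(weights)
                - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
                - 0.5 * np.sum(diff ** 2 / variances, axis=2))
    m = log_comp.max(axis=1, keepdims=True)                 # log-sum-exp trick
    return float(np.sum(m[:, 0] + np.log(np.sum(np.exp(log_comp - m), axis=1))))

def optimize_alpha(power_spec, noise_est, gmm, alphas=np.arange(0, 21) / 10.0):
    """Grid search (assumed optimizer): pick alpha maximizing the GMM likelihood."""
    scores = [
        gmm_log_likelihood(
            log_spectral_features(compensate(power_spec, noise_est, a)), *gmm)
        for a in alphas
    ]
    best = int(np.argmax(scores))
    return float(alphas[best]), scores[best]

if __name__ == "__main__":
    # Synthetic data only; no real speech or AURORA-2J material is used here.
    rng = np.random.default_rng(0)
    clean = rng.uniform(1.0, 2.0, size=(50, 8))
    noise = np.full((50, 8), 0.5)
    feats = log_spectral_features(clean)
    gmm = (np.array([1.0]),
           feats.mean(axis=0, keepdims=True),
           feats.var(axis=0, keepdims=True) + 1e-3)
    # Deliberately overestimate the noise so that alpha=1 is not the best choice.
    alpha, score = optimize_alpha(clean + noise, 1.5 * noise, gmm)
    print("selected alpha:", alpha, "log likelihood:", score)
```

Because the grid contains `alpha = 1`, the selected setting can never score worse under the GMM than unoptimized spectral subtraction in this sketch; whether that translates into recognition gains is an empirical question that the sketch does not address.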

Cite as: Shinozaki, T., Furui, S. (2009) Target speech GMM-based spectral compensation for noise robust speech recognition. Proc. Interspeech 2009, 1255-1258, doi: 10.21437/Interspeech.2009-361

@inproceedings{shinozaki09_interspeech,
  author={Takahiro Shinozaki and Sadaoki Furui},
  title={{Target speech GMM-based spectral compensation for noise robust speech recognition}},
  year=2009,
  booktitle={Proc. Interspeech 2009},
  pages={1255--1258},
  doi={10.21437/Interspeech.2009-361}
}