Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction

Abe, Akihiro; Yamamoto, Kazumasa; Nakagawa, Seiichi

doi:10.21437/Interspeech.2015-599

Recently, acoustic models based on deep neural networks (DNNs) have been introduced and have shown dramatic improvements over acoustic models based on Gaussian mixture models (GMMs) in a variety of tasks. In this paper, we consider how to improve the noise robustness of DNNs. Inspired by missing feature theory and static noise-aware training, we propose an approach that feeds a noise-suppressed acoustic feature together with estimated noise information into the DNN. We use simple spectral subtraction for noise suppression, and we estimate the noise either per utterance or per frame. In noisy speech recognition experiments, the proposed method outperformed the other methods we compared it with. With per-utterance noise estimation and log Mel filterbank features, we obtained a 28.6% word error rate reduction compared with multi-condition training and a 5.9% reduction compared with noise adaptive training.
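As a rough illustration of the input described in the abstract, the sketch below applies spectral subtraction to a noisy power spectrum and appends a per-utterance noise estimate to every frame. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the over-subtraction factor `alpha`, the spectral floor `beta`, and the choice to estimate noise from the first `n_lead` frames are all hypothetical, and plain log power spectra stand in for the log Mel filterbank features used in the paper.

```python
import numpy as np


def estimate_noise_per_utterance(power_spec, n_lead=10):
    """Average the first n_lead frames as the noise power estimate.

    Assumption (not from the paper): the leading frames contain noise only.
    """
    return power_spec[:n_lead].mean(axis=0)


def spectral_subtraction(power_spec, noise_est, alpha=1.0, beta=0.01):
    """Subtract the noise estimate from each frame and apply a spectral floor.

    power_spec: (frames, bins) noisy power spectrum
    noise_est:  (bins,) noise power estimate, broadcast over frames
    alpha:      over-subtraction factor (hypothetical default)
    beta:       floor as a fraction of the noisy power (hypothetical default)
    """
    cleaned = power_spec - alpha * noise_est
    floor = beta * power_spec
    # The floor keeps every bin positive so the log below is well defined.
    return np.maximum(cleaned, floor)


def build_dnn_input(power_spec, n_lead=10, eps=1e-10):
    """Concatenate noise-suppressed features with the noise estimate per frame."""
    noise = estimate_noise_per_utterance(power_spec, n_lead)
    cleaned = spectral_subtraction(power_spec, noise)
    log_clean = np.log(cleaned + eps)
    log_noise = np.log(noise + eps)
    # The same utterance-level noise vector is appended to every frame.
    noise_tiled = np.tile(log_noise, (power_spec.shape[0], 1))
    return np.concatenate([log_clean, noise_tiled], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a noisy power spectrum: 50 frames, 40 bins.
    noisy = rng.random((50, 40)) + 0.1
    features = build_dnn_input(noisy)
    print(features.shape)
```

Per-frame noise estimation, the other variant mentioned in the abstract, would replace the single utterance-level vector with a `(frames, bins)` estimate; the subtraction and concatenation steps would otherwise stay the same.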

Cite as: Abe, A., Yamamoto, K., Nakagawa, S. (2015) Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction. Proc. Interspeech 2015, 2849-2853, doi: 10.21437/Interspeech.2015-599

@inproceedings{abe15_interspeech,
  author={Akihiro Abe and Kazumasa Yamamoto and Seiichi Nakagawa},
  title={{Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={2849--2853},
  doi={10.21437/Interspeech.2015-599}
}