ISCA Archive Interspeech 2011

Improvements of a dual-input DBN for noise robust ASR

Yang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves

In previous work we have shown that an ASR system consisting of a dual-input Dynamic Bayesian Network (DBN), which simultaneously observes MFCC acoustic features and an exemplar-based Sparse Classification (SC) phoneme predictor stream, can achieve better word recognition accuracies in noise than a system that observes only one input stream. This paper explores three modifications of the SC input to further improve the noise robustness of the dual-input DBN system: 1) using state likelihoods instead of phonemes, 2) integrating more contextual information, and 3) using the complete likelihood distribution. Experiments on AURORA-2 reveal that the combination of the first two approaches significantly improves the recognition results, achieving an absolute accuracy gain of up to 29% at an SNR of -5 dB. In the dual-input system, using the full likelihood vector does not outperform using only the best state prediction.
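The difference between the "best state prediction" and the "full likelihood vector" mentioned above can be illustrated with a minimal sketch. The abstract does not describe how the SC stream is actually encoded, so the function names, the three-state toy frames, and the sum-to-one normalization below are all hypothetical and are not the authors' implementation.

```python
from typing import List


def best_state(likelihoods: List[float]) -> int:
    """Hard prediction: index of the highest-scoring state in one frame."""
    return max(range(len(likelihoods)), key=lambda i: likelihoods[i])


def full_vector(likelihoods: List[float]) -> List[float]:
    """Soft prediction: all state scores, scaled to sum to 1 (assumed normalization)."""
    total = sum(likelihoods)
    if total == 0:
        # No evidence for any state: fall back to a uniform distribution.
        return [1.0 / len(likelihoods)] * len(likelihoods)
    return [x / total for x in likelihoods]


# Toy per-frame state scores (hypothetical values, three states per frame).
frames = [[0.2, 1.4, 0.4], [0.3, 0.3, 0.6], [0.6, 0.2, 0.2]]

hard_stream = [best_state(f) for f in frames]   # one label per frame
soft_stream = [full_vector(f) for f in frames]  # one distribution per frame
```

The hard stream keeps a single label per frame, while the soft stream keeps every state's relative score; the abstract reports that, in the dual-input system, the latter did not outperform the former.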


doi: 10.21437/Interspeech.2011-215

Cite as: Sun, Y., Gemmeke, J.F., Cranen, B., Bosch, L.t., Boves, L. (2011) Improvements of a dual-input DBN for noise robust ASR. Proc. Interspeech 2011, 1669-1672, doi: 10.21437/Interspeech.2011-215

@inproceedings{sun11c_interspeech,
  author={Yang Sun and Jort F. Gemmeke and Bert Cranen and Louis ten Bosch and Lou Boves},
  title={{Improvements of a dual-input DBN for noise robust ASR}},
  year=2011,
  booktitle={Proc. Interspeech 2011},
  pages={1669--1672},
  doi={10.21437/Interspeech.2011-215}
}