Exploiting polynomial-fit histogram equalization and temporal average for robust speech recognition

Lin, Shih-Hsiang; Yeh, Yao-Ming; Chen, Berlin

doi:10.21437/Interspeech.2006-632

The performance of current automatic speech recognition (ASR) systems deteriorates sharply when the input speech is corrupted by various kinds of noise. Quite a few techniques have been proposed in the past several years to improve ASR robustness. Histogram equalization (HEQ) is one of the most efficient techniques used to compensate for nonlinear distortion. In this paper, we explored the use of a data-fitting scheme to efficiently approximate the inverse of the cumulative distribution function of the training speech for HEQ, in contrast to the conventional table-lookup or quantile-based approaches. Moreover, a temporal average operation was applied to the feature vector components to alleviate the influence of sharp peaks and valleys caused by non-stationary noise. Finally, we also investigated the possibility of combining our approaches with other feature discrimination and decorrelation methods. All experiments were carried out on the Aurora-2 database and task, and the initial results were encouraging.
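The two operations described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the polynomial order, the rank-based estimate of the test CDF, the moving-average span, and all function names (`fit_inverse_cdf`, `equalize`, `temporal_average`) are assumptions chosen for illustration, and the sketch handles a single feature dimension.

```python
import numpy as np

def fit_inverse_cdf(train_values, order=7):
    """Fit a polynomial that maps a CDF value in (0, 1) to a feature value.

    The empirical CDF of the training data is taken from the sorted order
    (an assumption); `order` is a hypothetical default.
    """
    x = np.sort(np.asarray(train_values, dtype=float))
    n = x.size
    cdf = (np.arange(1, n + 1) - 0.5) / n  # empirical CDF at each sorted sample
    return np.polynomial.Polynomial.fit(cdf, x, deg=order)

def equalize(test_values, inv_cdf_poly):
    """Map each test value through the fitted inverse CDF.

    The CDF of the test data is estimated from the rank of each value
    within the utterance (assumed here; other estimators are possible).
    """
    v = np.asarray(test_values, dtype=float)
    ranks = np.argsort(np.argsort(v))      # rank 0..n-1 of each frame
    cdf = (ranks + 0.5) / v.size
    return inv_cdf_poly(cdf)

def temporal_average(values, span=1):
    """Smooth a feature trajectory with a centered moving average.

    Edges are handled by repeating the first and last frame (assumption).
    """
    v = np.asarray(values, dtype=float)
    padded = np.pad(v, span, mode="edge")
    kernel = np.ones(2 * span + 1) / (2 * span + 1)
    return np.convolve(padded, kernel, mode="valid")

# Usage on synthetic data standing in for one cepstral coefficient.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, 5000)         # "clean" training values
test = rng.normal(3.0, 2.0, 400)           # shifted and scaled "noisy" values
poly = fit_inverse_cdf(train)
equalized = temporal_average(equalize(test, poly), span=1)
```

Storing only the polynomial coefficients per feature dimension replaces the lookup table or quantile list that a conventional HEQ implementation would keep, which is the efficiency argument the abstract makes.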

Cite as: Lin, S.-H., Yeh, Y.-M., Chen, B. (2006) Exploiting polynomial-fit histogram equalization and temporal average for robust speech recognition. Proc. Interspeech 2006, paper 1195-Thu2CaP.1, doi: 10.21437/Interspeech.2006-632

@inproceedings{lin06e_interspeech,
  author={Shih-Hsiang Lin and Yao-Ming Yeh and Berlin Chen},
  title={{Exploiting polynomial-fit histogram equalization and temporal average for robust speech recognition}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 1195-Thu2CaP.1},
  doi={10.21437/Interspeech.2006-632}
}