Eigen-MLLR environment/speaker compensation for robust speech recognition

Liao, Yuan-Fu; Fang, Hung-Hsiang; Hsu, Chi-Hui

doi:10.21437/Interspeech.2008-300

In this paper, an eigen-maximum likelihood linear regression (Eigen-MLLR) method is proposed that uses a set of a priori knowledge about noisy environments and speakers to compensate online for the characteristics of an unknown test environment/speaker. The idea is straightforward, but it is motivated by our recent finding that the characteristics of different kinds of noisy environments and of different speakers can be organized well, and simultaneously, in a PCA-constructed Eigen-MLLR subspace. In particular, the first three dimensions of the constructed Eigen-MLLR subspace are highly related to the SNR value, the gender of the speaker, and the type of noise. The proposed Eigen-MLLR method was evaluated on the Aurora 2 multi-condition training task. Experimental results showed that an average word error rate (WER) of 6.14% was achieved. Moreover, Eigen-MLLR outperformed not only the multi-condition training baseline (Multi-Con., 13.72%) but also the blind ETSI advanced DSR front-end (ETSI-Adv., 8.65%), histogram equalization (HEQ, 8.66%), and the non-blind reference model weighting (RMW, 7.29%) approaches.
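
As a rough illustration of a PCA-constructed subspace of MLLR transforms, the following minimal sketch stacks one MLLR matrix per training environment/speaker condition, runs PCA over the vectorized matrices, and expresses a new transform by its coordinates in the leading dimensions. The abstract does not give implementation details, so every name here (`build_eigen_mllr_subspace`, `project`, `reconstruct`), the matrix layout `d x (d+1)`, and the use of a least-squares projection in place of a likelihood-based online estimate of the coordinates are assumptions made for illustration only.

```python
import numpy as np

def build_eigen_mllr_subspace(transforms, k):
    """PCA over vectorized MLLR matrices (hypothetical helper).

    transforms: array of shape (N, d, d+1), one MLLR matrix per
                a priori training condition (assumed layout).
    k:          number of leading principal directions to keep.
    Returns the mean vector and a (k, d*(d+1)) orthonormal basis.
    """
    X = transforms.reshape(len(transforms), -1)   # one row per condition
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def project(transform, mean, basis):
    """Coordinates of one transform in the subspace.

    Assumption: plain least-squares projection, used here as a
    stand-in for estimating the coordinates from test data.
    """
    return basis @ (transform.ravel() - mean)

def reconstruct(coords, mean, basis, shape):
    """Rebuild an MLLR matrix from its subspace coordinates."""
    return (mean + coords @ basis).reshape(shape)
```

With `k = 3`, the returned coordinates would correspond to the three leading dimensions that the abstract relates to SNR, gender, and noise type; whether a given data set reproduces that ordering is not something this sketch can show.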

Cite as: Liao, Y.-F., Fang, H.-H., Hsu, C.-H. (2008) Eigen-MLLR environment/speaker compensation for robust speech recognition. Proc. Interspeech 2008, 1249-1252, doi: 10.21437/Interspeech.2008-300

@inproceedings{liao08b_interspeech,
  author={Yuan-Fu Liao and Hung-Hsiang Fang and Chi-Hui Hsu},
  title={{Eigen-MLLR environment/speaker compensation for robust speech recognition}},
  year=2008,
  booktitle={Proc. Interspeech 2008},
  pages={1249--1252},
  doi={10.21437/Interspeech.2008-300}
}