Maximum a posteriori (MAP) adaptation and its discriminative variants, such as MMI-MAP (maximum mutual information MAP) and MPE-MAP (minimum phone error MAP), have been widely applied to acoustic model adaptation. This paper introduces a new adaptation approach, fMPE-MAP, an extension of the original fMPE (feature minimum phone error) algorithm with an enhanced ability to port Gaussian models and fMPE transforms to a new domain. We applied this approach to the SRI-ICSI 2007 NIST meeting recognition system, for which we ported our conversational telephone speech (CTS) and broadcast news (BN) models to the meeting domain. Experiments showed that the proposed fMPE-MAP approach performs comparably to or better than simply training the fMPE transform on combined data, in addition to its obvious speed advantage. In combination with MPE-MAP, we obtained about 20% relative word error rate reduction on a lecture meeting evaluation test set, compared with models trained with the standard MAP approach.
Cite as: Zheng, J., Stolcke, A. (2007) fMPE-MAP: improved discriminative adaptation for modeling new domains. Proc. Interspeech 2007, 1573-1576, doi: 10.21437/Interspeech.2007-128
@inproceedings{zheng07_interspeech,
  author={Jing Zheng and Andreas Stolcke},
  title={{fMPE-MAP: improved discriminative adaptation for modeling new domains}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1573--1576},
  doi={10.21437/Interspeech.2007-128}
}