This paper describes a novel approach based on unsupervised training of the MAP adaptation rate using Q-learning. Q-learning is a reinforcement learning technique in which an agent learns to maximize the total reward it receives while interacting with a complex, uncertain environment. The proposed method defines the likelihood of the adapted model as a reward and learns a weight factor that governs the relative balance between the initial model and the adaptation data, without the need for supervised data. We conducted recognition experiments on lecture speech from the Corpus of Spontaneous Japanese. We were able to estimate the optimal weight factor in advance using Q-learning. MAP adaptation using the weight factor estimated with the proposed method achieved recognition accuracy equivalent to that of MAP adaptation using an experimentally determined weight factor.
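The following is a minimal illustrative sketch, not the authors' implementation: it treats Q-learning over a discrete set of candidate MAP weight factors as a one-state (bandit-style) problem, with the adapted model's likelihood on unlabeled adaptation data as the reward. The toy single-Gaussian model, the candidate weight factors, and the learning hyperparameters are all assumptions introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed): a single-Gaussian "initial model" and unlabeled adaptation data.
mu0, var = 0.0, 1.0                      # prior (initial-model) mean and fixed variance
adapt_data = rng.normal(0.8, 1.0, 200)   # simulated unsupervised adaptation samples

def map_adapt_mean(tau, data):
    """Standard MAP update of the Gaussian mean with weight factor tau."""
    n, xbar = len(data), data.mean()
    return (tau * mu0 + n * xbar) / (tau + n)

def log_likelihood(mu, data):
    """Reward: log-likelihood of the adaptation data under the adapted model."""
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - (data - mu) ** 2 / (2 * var)))

# Actions = candidate weight factors; a single abstract state reduces the Q-table to one row.
taus = np.array([1.0, 5.0, 10.0, 20.0, 50.0])
Q = np.zeros(len(taus))
alpha, gamma, eps = 0.1, 0.0, 0.2        # assumed hyperparameters; gamma=0 gives one-step episodes

for episode in range(500):
    # epsilon-greedy selection of a weight factor
    a = rng.integers(len(taus)) if rng.random() < eps else int(np.argmax(Q))
    batch = rng.choice(adapt_data, size=50, replace=False)   # simulate a new adaptation batch
    reward = log_likelihood(map_adapt_mean(taus[a], batch), batch)
    Q[a] += alpha * (reward + gamma * Q.max() - Q[a])        # Q-learning update

print("estimated best weight factor:", taus[np.argmax(Q)])
```

In this sketch the learned Q-values identify the weight factor whose adapted model best explains the adaptation data, mirroring the paper's idea of selecting the adaptation rate from an unsupervised reward rather than from a labeled development set.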
Cite as: Nishida, M., Horiuchi, Y., Ichikawa, A. (2007) Unsupervised training of adaptation rate using q-learning in large vocabulary continuous speech recognition. Proc. Interspeech 2007, 278-281, doi: 10.21437/Interspeech.2007-121
@inproceedings{nishida07_interspeech,
  author={Masafumi Nishida and Yasuo Horiuchi and Akira Ichikawa},
  title={{Unsupervised training of adaptation rate using q-learning in large vocabulary continuous speech recognition}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={278--281},
  doi={10.21437/Interspeech.2007-121}
}