This paper describes a novel approach based on unsupervised training of the MAP adaptation rate using Q-learning. Q-learning is a reinforcement learning technique in which an agent learns to maximize the total reward it receives while interacting with a complex, uncertain environment. The proposed method defines the likelihood of the adapted model as a reward and learns a weight factor that governs the relative balance between the initial model and the adaptation data, without the need for supervised data. We conducted recognition experiments on lecture speech from the Corpus of Spontaneous Japanese. We were able to estimate the optimal weight factor in advance using Q-learning. MAP adaptation using the weight factor estimated with the proposed method achieved recognition accuracy equivalent to that of MAP adaptation using an experimentally determined weight factor.
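The following is a minimal illustrative sketch, not the authors' implementation: it treats Q-learning over a discrete set of candidate MAP weight factors as a one-state (bandit-style) problem, with the adapted model's likelihood on unlabeled adaptation data as the reward. The toy single-Gaussian model, the candidate weight factors, and the learning hyperparameters are all assumptions introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed): a single-Gaussian "initial model" and unlabeled adaptation data.
mu0, var = 0.0, 1.0                      # prior (initial-model) mean and fixed variance
adapt_data = rng.normal(0.8, 1.0, 200)   # simulated unsupervised adaptation samples

def map_adapt_mean(tau, data):
    """Standard MAP update of the Gaussian mean with weight factor tau."""
    n, xbar = len(data), data.mean()
    return (tau * mu0 + n * xbar) / (tau + n)

def log_likelihood(mu, data):
    """Reward: log-likelihood of the adaptation data under the adapted model."""
    return float(np.sum(-0.5 * np.log(2 * np.pi * var) - (data - mu) ** 2 / (2 * var)))

# Actions = candidate weight factors; a single abstract state reduces the Q-table to one row.
taus = np.array([1.0, 5.0, 10.0, 20.0, 50.0])
Q = np.zeros(len(taus))
alpha, gamma, eps = 0.1, 0.0, 0.2        # assumed hyperparameters; gamma=0 gives one-step episodes

for episode in range(500):
    # epsilon-greedy selection of a weight factor
    a = rng.integers(len(taus)) if rng.random() < eps else int(np.argmax(Q))
    batch = rng.choice(adapt_data, size=50, replace=False)   # simulate a new adaptation batch
    reward = log_likelihood(map_adapt_mean(taus[a], batch), batch)
    Q[a] += alpha * (reward + gamma * Q.max() - Q[a])        # Q-learning update

print("estimated best weight factor:", taus[np.argmax(Q)])
```

In this sketch the learned Q-values identify the weight factor whose adapted model best explains the adaptation data, mirroring the paper's idea of selecting the adaptation rate from an unsupervised reward rather than from a labeled development set.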
Cite as: Nishida, M., Horiuchi, Y., Ichikawa, A. (2007) Unsupervised training of adaptation rate using q-learning in large vocabulary continuous speech recognition. Proc. Interspeech 2007, 278-281, doi: 10.21437/Interspeech.2007-121
@inproceedings{nishida07_interspeech,
  author={Masafumi Nishida and Yasuo Horiuchi and Akira Ichikawa},
  title={{Unsupervised training of adaptation rate using q-learning in large vocabulary continuous speech recognition}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={278--281},
  doi={10.21437/Interspeech.2007-121}
}