Dynamic language model adaptation using presentation slides for lecture speech recognition

Yamazaki, Hiroki; Iwano, Koji; Shinoda, Koichi; Furui, Sadaoki; Yokota, Haruo

doi:10.21437/Interspeech.2007-265

We propose a dynamic language model adaptation method for lecture speech recognition that uses temporal information from lecture slides. The proposed method consists of two steps. First, the language model is adapted with the text extracted from all the slides of a given lecture. Next, the text of a given slide is extracted based on temporal information and used for local adaptation. Hence, the language model used to recognize the speech associated with a given slide changes dynamically from one slide to the next. We evaluated the proposed method on speech data from four Japanese lecture courses. Our experiments show that the proposed method is effective, especially for keyword detection: the F-measure error rate for lecture keywords was reduced by 2.4%.
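The two-step adaptation described above can be illustrated with a minimal sketch. The abstract does not state which adaptation technique or model order the authors used, so this example assumes unigram models combined by linear interpolation. The function names (`unigram`, `interpolate`, `adapt_per_slide`), the interpolation weights, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def unigram(text):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(base, adapt, weight):
    """Linear interpolation: (1 - weight) * base + weight * adapt.

    Assumed adaptation technique; the paper may use a different one.
    """
    if not adapt:  # nothing to adapt with (e.g. a slide without text)
        return dict(base)
    vocab = set(base) | set(adapt)
    return {
        word: (1 - weight) * base.get(word, 0.0) + weight * adapt.get(word, 0.0)
        for word in vocab
    }


def adapt_per_slide(baseline, slides, global_weight=0.3, local_weight=0.3):
    """Return one adapted model per slide (weights are illustrative)."""
    # Step 1: global adaptation with the text of all slides in the lecture.
    lecture_lm = interpolate(baseline, unigram(" ".join(slides)), global_weight)
    # Step 2: local adaptation with the text of each individual slide.
    return [interpolate(lecture_lm, unigram(slide), local_weight) for slide in slides]


# Toy data, invented for illustration only.
baseline = {"the": 0.5, "speech": 0.25, "model": 0.25}
slides = ["language model adaptation", "keyword detection"]
models = adapt_per_slide(baseline, slides)
```

In an actual recognizer, the slide whose model is active at a given moment would be chosen from the slide-change timestamps, which is the temporal information the abstract refers to. Here the list index simply stands in for that alignment.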

Cite as: Yamazaki, H., Iwano, K., Shinoda, K., Furui, S., Yokota, H. (2007) Dynamic language model adaptation using presentation slides for lecture speech recognition. Proc. Interspeech 2007, 2349-2352, doi: 10.21437/Interspeech.2007-265

@inproceedings{yamazaki07_interspeech,
  author={Hiroki Yamazaki and Koji Iwano and Koichi Shinoda and Sadaoki Furui and Haruo Yokota},
  title={{Dynamic language model adaptation using presentation slides for lecture speech recognition}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={2349--2352},
  doi={10.21437/Interspeech.2007-265}
}