ISCA Archive ICSLP 2002

Speech recognition with a re-speak method for subtitling live broadcasts

Toru Imai, Atsushi Matsui, Shinichi Homma, Takeshi Kobayakawa, Kazuo Onoe, Shoei Sato, Akio Ando

This paper describes a "re-speak" method for subtitling live TV broadcasts using a speech recognition system. Original on-location speech in live sports or music programs contains background noise, spontaneous or emotional speech, and the voices of speakers unknown to the recognition system, all of which degrade recognition performance. However, if a different speaker, to whom the system has been adapted, carefully rephrases the original utterances in a studio, these problems can be largely overcome. Recognition experiments showed that rephrasing the commentary was more effective in reducing perplexities and word error rates than simply repeating it. Speech recognition using the re-speak method was applied in practice to a music-based variety show and the 2002 Winter Olympic Games to automatically produce simultaneous subtitles for hearing-impaired viewers. A word error rate below 5% and a subtitle display delay below three seconds were achieved.
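
For readers unfamiliar with the metric, the word error rate quoted above is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and a reference transcript, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the abstract does not state how the authors tokenized or scored their transcripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are separated by whitespace. This is a generic
    illustration, not the scoring procedure used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a rate below 5% corresponds to fewer than one word-level error per twenty reference words.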


doi: 10.21437/ICSLP.2002-523

Cite as: Imai, T., Matsui, A., Homma, S., Kobayakawa, T., Onoe, K., Sato, S., Ando, A. (2002) Speech recognition with a re-speak method for subtitling live broadcasts. Proc. 7th International Conference on Spoken Language Processing (ICSLP 2002), 1757-1760, doi: 10.21437/ICSLP.2002-523

@inproceedings{imai02_icslp,
  author={Toru Imai and Atsushi Matsui and Shinichi Homma and Takeshi Kobayakawa and Kazuo Onoe and Shoei Sato and Akio Ando},
  title={{Speech recognition with a re-speak method for subtitling live broadcasts}},
  year=2002,
  booktitle={Proc. 7th International Conference on Spoken Language Processing (ICSLP 2002)},
  pages={1757--1760},
  doi={10.21437/ICSLP.2002-523}
}