Automatic speech recognition framework for multilingual audio contents

Nanjo, Hiroaki; Oku, Yuichi; Yoshimi, Takehiko

doi:10.21437/Interspeech.2007-420

Automatic speech recognition (ASR) for multilingual audio content, such as international conference recordings and broadcast news, is addressed. To handle such content efficiently, simultaneous ASR across languages is promising. Conventionally, ASR has been performed independently, language by language, even though multilingual speech, which consists of utterances in several languages conveying the same meaning, is available. In this paper, we discuss a bilingual speech recognition framework based on statistical ASR and machine translation (MT), in which ASR for the two languages is performed simultaneously and complementarily. Then, through Japanese speech recognition that uses the corresponding English text and MT, we show that the framework works well.

Cite as: Nanjo, H., Oku, Y., Yoshimi, T. (2007) Automatic speech recognition framework for multilingual audio contents. Proc. Interspeech 2007, 1445-1448, doi: 10.21437/Interspeech.2007-420

@inproceedings{nanjo07_interspeech,
  author={Hiroaki Nanjo and Yuichi Oku and Takehiko Yoshimi},
  title={{Automatic speech recognition framework for multilingual audio contents}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={1445--1448},
  doi={10.21437/Interspeech.2007-420}
}