Optimizing sentence segmentation for spoken language translation

Rao, Sharath; Lane, Ian; Schultz, Tanja

doi:10.21437/Interspeech.2007-655

The conventional approach in text-based machine translation (MT) is to translate complete sentences, which are conveniently indicated by sentence boundary markers. However, since such boundary markers are not available for speech, new methods are required that define an optimal unit for translation. Our experimental results show that with a segment length optimized for a particular MT system, intra-sentence segmentation can improve translation performance (measured in BLEU) by up to 11% for Arabic Broadcast Conversation (BC) and 6% for Arabic Broadcast News (BN). We show that acoustic segmentation that minimizes Word Error Rate (WER) may not give the best translation performance. We improve upon it by automatically resegmenting the ASR output in a way that is optimized for translation and argue that it might be necessary for different stages of a Spoken Language Translation (SLT) system to define their own optimal units.
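The idea of tuning segment length for a given MT system can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not spell out. It assumes the simplest possible resegmentation, a fixed maximum number of words per segment, and a caller-supplied `score` function standing in for a translation-quality measure such as BLEU on a development set. The names `resegment`, `best_segment_length`, and `score` are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence


def resegment(words: Sequence[str], max_len: int) -> List[List[str]]:
    """Split a word stream into consecutive segments of at most max_len words.

    Assumption: a naive fixed-length split, used only for illustration.
    """
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [list(words[i:i + max_len]) for i in range(0, len(words), max_len)]


def best_segment_length(
    words: Sequence[str],
    candidate_lengths: Iterable[int],
    score: Callable[[List[List[str]]], float],
) -> int:
    """Return the candidate length whose segmentation scores highest.

    `score` is a hypothetical stand-in for translating the segments with a
    particular MT system and measuring quality on held-out data.
    """
    return max(candidate_lengths, key=lambda n: score(resegment(words, n)))
```

In practice `score` would wrap a full translate-and-evaluate loop, so the selected length is specific to the MT system being tuned, which is the point the abstract makes about each stage defining its own optimal unit.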

Cite as: Rao, S., Lane, I., Schultz, T. (2007) Optimizing sentence segmentation for spoken language translation. Proc. Interspeech 2007, 2845-2848, doi: 10.21437/Interspeech.2007-655

@inproceedings{rao07_interspeech,
  author={Sharath Rao and Ian Lane and Tanja Schultz},
  title={{Optimizing sentence segmentation for spoken language translation}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={2845--2848},
  doi={10.21437/Interspeech.2007-655}
}