Improved acoustic modeling for transcribing Arabic broadcast data

Lamel, Lori; Messaoudi, Abdel.; Gauvain, Jean-Luc

doi:10.21437/Interspeech.2007-562

This paper summarizes our recent progress in improving the automatic transcription of Arabic broadcast audio data, along with some efforts to address the challenges of broadcast conversational speech. Our efforts aim to improve the acoustic, pronunciation, and language models, taking into account specificities of the Arabic language. In previous work, we demonstrated that explicit modeling of short vowels improved recognition performance, even when producing non-vocalized hypotheses. In addition to short vowels, consonant gemination and nunation are now explicitly modeled, alternative pronunciations have been introduced to better represent dialectal variants, and a duration model has been integrated. To facilitate training on Arabic audio data with non-vocalized transcripts, a generic vowel model has been introduced. Compared with the previous system (used in the 2006 GALE evaluation), the word error rate has been reduced by over 10% relative.
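The closing figure is a relative reduction in word error rate (WER), which is measured against the baseline WER and is not a difference in percentage points. A minimal sketch of that arithmetic follows; the function name and the two WER values are hypothetical, chosen only for illustration, and are not results from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (in percent), for illustration only;
# these are NOT figures reported in the paper.
baseline = 20.0
improved = 17.8

absolute = baseline - improved                         # drop in percentage points
relative = relative_wer_reduction(baseline, improved)  # fraction of the baseline

print(f"absolute: {absolute:.1f} points, relative: {relative:.1%}")
```

With these assumed values, a drop of 2.2 points from a 20% baseline corresponds to an 11% relative reduction, which is the kind of figure "over 10% relative" describes.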

Cite as: Lamel, L., Messaoudi, A., Gauvain, J.-L. (2007) Improved acoustic modeling for transcribing Arabic broadcast data. Proc. Interspeech 2007, 2077-2080, doi: 10.21437/Interspeech.2007-562

@inproceedings{lamel07_interspeech,
  author={Lori Lamel and Abdel. Messaoudi and Jean-Luc Gauvain},
  title={{Improved acoustic modeling for transcribing Arabic broadcast data}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={2077--2080},
  doi={10.21437/Interspeech.2007-562}
}