Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition

Lee, Kyong-Nim; Chung, Minhwa

doi:10.21437/Eurospeech.2003-114

In this paper, we describe a cross-morpheme pronunciation variation model that is especially useful for constructing a morpheme-based pronunciation lexicon for Korean LVCSR. Many pronunciation variations occur at morpheme boundaries in continuous speech. Since phonemic context, together with morphological category and morpheme boundary information, affects Korean pronunciation variations, we distinguish pronunciation variation rules according to where they apply: within a morpheme, across a morpheme boundary in a compound noun, across a morpheme boundary in an eojeol, and across an eojeol boundary. In a 33K-morpheme Korean CSR experiment, modeling cross-morpheme pronunciation variations with a context-dependent multiple pronunciation lexicon yields an absolute WER improvement of 1.16% over the baseline of 23.17% WER.
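The idea of conditioning pronunciation rules on the boundary type can be illustrated with a minimal sketch. It is not the authors' implementation: the `Boundary` enum, the `Rule` record, the `left_variants` helper, the romanized phoneme symbols, and the single rule in `RULES` are all hypothetical names chosen for illustration. The example rule is obstruent nasalization (a final /k/ surfacing as a nasal before /m/), a well-known Korean phonological process; the set of boundary types assigned to it here is a placeholder, not a claim about the paper's rule set.

```python
from dataclasses import dataclass
from enum import Enum


class Boundary(Enum):
    """The four rule locations named in the abstract."""
    WITHIN_MORPHEME = "within_morpheme"
    COMPOUND_NOUN = "compound_noun"
    WITHIN_EOJEOL = "within_eojeol"
    ACROSS_EOJEOL = "across_eojeol"


@dataclass(frozen=True)
class Rule:
    left: str              # final phoneme of the left unit
    right: str             # initial phoneme of the right unit
    new_left: str          # surface form of the left unit's final phoneme
    boundaries: frozenset  # boundary types at which the rule may apply


# Hypothetical rule table: the boundary set is an assumption for illustration.
RULES = [
    Rule("k", "m", "ng", frozenset({
        Boundary.WITHIN_MORPHEME,
        Boundary.COMPOUND_NOUN,
        Boundary.WITHIN_EOJEOL,
    })),
]


def left_variants(left_pron, right_pron, boundary, rules=RULES):
    """Return the canonical pronunciation of the left unit, followed by any
    variants that the rules license for this right context and boundary type."""
    variants = [tuple(left_pron)]
    for rule in rules:
        applies = (
            boundary in rule.boundaries
            and left_pron[-1] == rule.left
            and right_pron[0] == rule.right
        )
        if applies:
            variant = tuple(left_pron[:-1]) + (rule.new_left,)
            if variant not in variants:
                variants.append(variant)
    return variants


# "guk" + "mul" joined inside a compound noun: canonical form plus nasalized variant.
compound = left_variants(["k", "u", "k"], ["m", "u", "l"], Boundary.COMPOUND_NOUN)
# Same phonemes across an eojeol boundary: only the canonical form in this toy table.
across = left_variants(["k", "u", "k"], ["m", "u", "l"], Boundary.ACROSS_EOJEOL)
```

Under these assumptions, each lexicon entry would list the canonical pronunciation together with the variants licensed by its neighboring context, which is one way to read "context-dependent multiple pronunciation lexicon".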

Cite as: Lee, K.-N., Chung, M. (2003) Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 261-264, doi: 10.21437/Eurospeech.2003-114

@inproceedings{lee03c_eurospeech,
  author={Kyong-Nim Lee and Minhwa Chung},
  title={{Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition}},
  year=2003,
  booktitle={Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages={261--264},
  doi={10.21437/Eurospeech.2003-114}
}