Duration modeling for Arabic text to speech synthesis

Hifny, Yasser; Rashwan, Mohsen

doi:10.21437/ICSLP.2002-527

Duration modeling is a fundamental task in prosody generation for text-to-speech (TTS) systems. Its objective is to predict the duration of a speech unit from its phonological representation. Duration modeling strongly influences both the intelligibility and the naturalness of synthesized speech. This paper presents a neural network (NN) based approach to predicting the duration of Arabic phonemes. The developed model uses neural networks to learn the mapping from phonological features to duration values.
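
As a rough illustration of the general idea of mapping phonological features to durations with a neural network, the following minimal sketch trains a small one-hidden-layer regressor in plain Python. It is not the authors' model: the feature set (`encode`), the network size, the training settings, and the toy duration values (in units of 100 ms) are all assumptions invented for this example.

```python
import math
import random

# Hypothetical feature encoding: one-hot phoneme class plus two context flags.
# These features are assumptions for illustration, not the paper's feature set.
CLASSES = ["vowel", "long_vowel", "consonant"]

def encode(phoneme_class, stressed, phrase_final):
    vec = [1.0 if phoneme_class == c else 0.0 for c in CLASSES]
    vec += [1.0 if stressed else 0.0, 1.0 if phrase_final else 0.0]
    return vec

class DurationMLP:
    """One hidden tanh layer, linear output, trained by per-sample gradient descent."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.05):
        h, y = self.forward(x)
        err = y - target  # derivative of 0.5 * (y - target)**2 with respect to y
        for j, hj in enumerate(h):
            # Back-propagate through tanh using the output weight before updating it.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err

# Invented toy targets in units of 100 ms; not measurements from any corpus.
TOY_DATA = [
    (encode("vowel", False, False), 0.8),
    (encode("vowel", True, False), 1.0),
    (encode("long_vowel", False, False), 1.6),
    (encode("long_vowel", True, True), 2.2),
    (encode("consonant", False, False), 0.6),
    (encode("consonant", False, True), 0.9),
]

def train(model, data, epochs=2000):
    for _ in range(epochs):
        for x, target in data:
            model.train_step(x, target)
    # Mean squared error on the training set after the final epoch.
    return sum((model.predict(x) - t) ** 2 for x, t in data) / len(data)

model = DurationMLP(n_in=5, n_hidden=4)
mse = train(model, TOY_DATA)
```

After training, `model.predict(encode("long_vowel", False, False))` returns a predicted duration in the same invented units. A real system would replace the toy data with durations measured from a labeled speech corpus and a much richer feature set.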

Cite as: Hifny, Y., Rashwan, M. (2002) Duration modeling for Arabic text to speech synthesis. Proc. 7th International Conference on Spoken Language Processing (ICSLP 2002), 1773-1776, doi: 10.21437/ICSLP.2002-527

@inproceedings{hifny02_icslp,
  author={Yasser Hifny and Mohsen Rashwan},
  title={{Duration modeling for Arabic text to speech synthesis}},
  year=2002,
  booktitle={Proc. 7th International Conference on Spoken Language Processing (ICSLP 2002)},
  pages={1773--1776},
  doi={10.21437/ICSLP.2002-527}
}