Affective story teller: a TTS system for emotional expressivity

Shaikh, Mostafa Al Masum; Rebordão, Antonio Rui Ferreira; Hirose, Keikichi

doi:10.21437/Interspeech.2010-212

Mostafa Al Masum Shaikh, Antonio Rui Ferreira Rebordão, Keikichi Hirose

This paper describes the Affective Story Teller (AST), an example of an emotionally expressive speech synthesizer. Our technique uses several linguistic resources to recognize emotions in the input text according to its emotional affinity, and assigns appropriate prosodic parameters and pitch accents through XML-based tagging to generate a synthesized speech sample. The synthesized sample is then re-synthesized through TD-PSOLA-based pitch manipulation in accordance with its emotional connotation. The system employs the MARY TTS system to read out a folk tale. Preliminary perceptual test results are encouraging: human judges listening to the re-synthesized speech samples of AST perceived happiness, sadness, and fear much better than when they listened to non-affective synthesized speech.
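The tagging step described in the abstract (recognize an emotion in the text, then attach prosodic parameters through XML markup) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy word lists in `EMOTION_WORDS`, the percentage values in `PROSODY`, the function names, and the use of an SSML-style `<prosody>` element. The abstract does not state which linguistic resources, parameter values, or tag vocabulary AST actually uses, and the TD-PSOLA re-synthesis stage is not covered.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical toy lexicon; stands in for the paper's linguistic resources,
# which are not described in the abstract.
EMOTION_WORDS = {
    "happy": {"joy", "laughed", "delighted", "smiled"},
    "sad": {"wept", "alone", "lost", "cried"},
    "fear": {"dark", "trembled", "afraid", "wolf"},
}

# Hypothetical prosody settings per emotion (illustrative values only).
PROSODY = {
    "happy": {"pitch": "+15%", "rate": "+10%"},
    "sad": {"pitch": "-10%", "rate": "-15%"},
    "fear": {"pitch": "+20%", "rate": "+20%"},
}


def detect_emotion(sentence):
    """Return the emotion whose word list overlaps the sentence most, or 'neutral'."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    scores = {emo: len(words & vocab) for emo, vocab in EMOTION_WORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


def tag_sentence(sentence):
    """Wrap a sentence in an SSML-style <prosody> element for its detected emotion."""
    emotion = detect_emotion(sentence)
    text = escape(sentence)  # escape &, <, > so the output stays well-formed XML
    if emotion == "neutral":
        return text
    attrs = " ".join(f"{k}={quoteattr(v)}" for k, v in PROSODY[emotion].items())
    return f"<prosody {attrs}>{text}</prosody>"


if __name__ == "__main__":
    for line in ["The princess laughed with joy.", "The sun rose."]:
        print(tag_sentence(line))
```

In a pipeline like the one described, the tagged text would be passed to the synthesizer, and the resulting audio would then go through a separate pitch-manipulation stage.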

Cite as: Shaikh, M.A.M., Rebordão, A.R.F., Hirose, K. (2010) Affective story teller: a TTS system for emotional expressivity. Proc. Interspeech 2010, 518-521, doi: 10.21437/Interspeech.2010-212

@inproceedings{shaikh10_interspeech,
  author={Mostafa Al Masum Shaikh and Antonio Rui Ferreira Rebordão and Keikichi Hirose},
  title={{Affective story teller: a TTS system for emotional expressivity}},
  year=2010,
  booktitle={Proc. Interspeech 2010},
  pages={518--521},
  doi={10.21437/Interspeech.2010-212}
}