This paper focuses on the estimation of the Tilt intonation model [1]. Tilt events are usually detected from an initial estimate that is then refined using gradient descent techniques. To speed up the search, we propose a closed-form expression for some of the Tilt parameters. Gradient descent is used only for the time-related parameters, for which no closed-form expression can be found. Furthermore, the original Tilt proposal estimates the Tilt events sentence by sentence. Here we propose to estimate the events of the whole training corpus at the same time, using what we call the JEMA methodology. This approach increases the consistency of the estimation and yields better intonation models. It has been tested on two languages, Slovenian and Spanish. The experimental results indicate that the Tilt model is appropriate for these languages and that the JEMA methodology produces better prosodic models.
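To illustrate why some parameters admit a closed-form solution, the following is a minimal sketch and not the paper's actual derivation. It assumes a single event whose rise and fall each follow a piecewise-quadratic S-curve, a known baseline, and fixed timing parameters. Under those assumptions the contour is linear in the two amplitudes, so they can be obtained by ordinary least squares. The names `unit_shape`, `basis`, `synthesize`, and `fit_amplitudes` are hypothetical.

```python
def unit_shape(x):
    """Piecewise-quadratic S-curve: 0 at x <= 0, 1 at x >= 1 (assumed shape)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x < 0.5:
        return 2.0 * x * x
    return 1.0 - 2.0 * (1.0 - x) ** 2


def basis(t, start, d_rise, d_fall):
    """Rise and fall basis values at time t for fixed timing parameters."""
    rise = unit_shape((t - start) / d_rise)
    fall = unit_shape((t - start - d_rise) / d_fall)
    return rise, fall


def synthesize(times, start, d_rise, d_fall, base, a_rise, a_fall):
    """F0 contour: base + a_rise * rise(t) + a_fall * fall(t)."""
    contour = []
    for t in times:
        r, f = basis(t, start, d_rise, d_fall)
        contour.append(base + a_rise * r + a_fall * f)
    return contour


def fit_amplitudes(times, f0, start, d_rise, d_fall, base):
    """Closed-form least-squares amplitudes for fixed timing and baseline.

    Solves the 2x2 normal equations with Cramer's rule.
    """
    s_rr = s_rf = s_ff = s_ry = s_fy = 0.0
    for t, y in zip(times, f0):
        r, f = basis(t, start, d_rise, d_fall)
        resid = y - base
        s_rr += r * r
        s_rf += r * f
        s_ff += f * f
        s_ry += r * resid
        s_fy += f * resid
    det = s_rr * s_ff - s_rf * s_rf
    if abs(det) < 1e-12:
        raise ValueError("rise and fall bases are not independent on these samples")
    a_rise = (s_ry * s_ff - s_rf * s_fy) / det
    a_fall = (s_rr * s_fy - s_rf * s_ry) / det
    return a_rise, a_fall


# Usage: recover the amplitudes of a synthetic event (values are illustrative).
times = [i * 0.01 for i in range(80)]
f0 = synthesize(times, start=0.1, d_rise=0.2, d_fall=0.3,
                base=100.0, a_rise=30.0, a_fall=-20.0)
est_rise, est_fall = fit_amplitudes(times, f0, start=0.1, d_rise=0.2,
                                    d_fall=0.3, base=100.0)
```

In this sketch an outer search would adjust only `start`, `d_rise`, and `d_fall`, re-solving for the amplitudes at each step; a joint estimation over a corpus would accumulate the same sums over all sentences that share parameters rather than fitting each sentence separately.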
Cite as: Rojc, M., Aguero, P.D., Bonafonte, A., Kacic, Z. (2005) Training the tilt intonation model using the JEMA methodology. Proc. Interspeech 2005, 3273-3276, doi: 10.21437/Interspeech.2005-570
@inproceedings{rojc05_interspeech,
  author={Matej Rojc and Pablo Daniel Aguero and Antonio Bonafonte and Zdravko Kacic},
  title={{Training the tilt intonation model using the JEMA methodology}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={3273--3276},
  doi={10.21437/Interspeech.2005-570}
}