ISCA Archive Interspeech 2007

Unsupervised HMM classification of F0 curves

Damien Lolive, Nelly Barbot, Olivier Boeffard

This article describes a new unsupervised methodology for learning F0 classes with HMMs on a syllable basis. An F0 class is represented by an HMM with three emitting states. The clustering algorithm relies on an iterative Gaussian splitting and EM retraining process. First, a single class is learnt on a training corpus (8000 syllables); it is then divided by perturbing the Gaussian means at successive levels. At each step, the mean RMS error is evaluated on a validation corpus (3000 syllables). The algorithm stops automatically when the error becomes stable or increases. The syllable is the reference level we have taken for F0 modelling, although the methodology can be applied to other structures. Clustering quality is evaluated by cross-validation, using the mean RMS error between F0 contours on a test corpus and the estimated HMM trajectories. The results show that the classes are of good quality (mean RMS error around 4 Hz).
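The split-and-retrain loop with a validation-based stopping rule can be illustrated with a minimal sketch. It is not the authors' implementation: each class is reduced here to a single mean contour instead of a three-state HMM, hard nearest-prototype reassignment stands in for EM retraining, and the contours are synthetic, fixed-length vectors. All function names, the perturbation `eps`, the tolerance `tol`, and the data generator are assumptions made for illustration only.

```python
import math
import random

def rms(a, b):
    """RMS error (Hz) between two equal-length F0 contours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def mean_contour(contours):
    """Point-wise mean of a non-empty list of contours."""
    n = len(contours)
    return [sum(c[i] for c in contours) / n for i in range(len(contours[0]))]

def assign(contours, prototypes):
    """Index of the closest prototype (smallest RMS error) for each contour."""
    return [min(range(len(prototypes)), key=lambda k: rms(c, prototypes[k]))
            for c in contours]

def retrain(contours, prototypes, n_iter=10):
    """Hard-assignment re-estimation; a simplified stand-in for EM retraining."""
    for _ in range(n_iter):
        labels = assign(contours, prototypes)
        updated = []
        for k, proto in enumerate(prototypes):
            members = [c for c, lab in zip(contours, labels) if lab == k]
            # An empty class keeps its previous prototype.
            updated.append(mean_contour(members) if members else proto)
        prototypes = updated
    return prototypes

def mean_rms(contours, prototypes):
    """Mean RMS error between each contour and its closest prototype."""
    labels = assign(contours, prototypes)
    return sum(rms(c, prototypes[lab])
               for c, lab in zip(contours, labels)) / len(contours)

def split(prototypes, eps=1.0):
    """Double the number of classes by perturbing every mean by +/- eps."""
    doubled = []
    for p in prototypes:
        doubled.append([v + eps for v in p])
        doubled.append([v - eps for v in p])
    return doubled

def cluster(train, valid, eps=1.0, tol=0.2, max_levels=6):
    """Split and retrain until the validation error stabilises or increases."""
    prototypes = [mean_contour(train)]           # level 0: a single class
    best_err = mean_rms(valid, prototypes)
    for _ in range(max_levels):
        candidate = retrain(train, split(prototypes, eps))
        err = mean_rms(valid, candidate)
        if best_err - err < tol:                 # stable or worse: stop
            break
        prototypes, best_err = candidate, err
    return prototypes, best_err

def make_contours(n, rng):
    """Synthetic rising and falling 5-point contours around 120 Hz (toy data)."""
    data = []
    for i in range(n):
        slope = 10.0 if i % 2 == 0 else -10.0
        data.append([120.0 + slope * t / 4 + rng.gauss(0.0, 1.0) for t in range(5)])
    return data

rng = random.Random(0)
train = make_contours(200, rng)
valid = make_contours(80, rng)
single_err = mean_rms(valid, [mean_contour(train)])
prototypes, err = cluster(train, valid)
print(len(prototypes), "classes; validation mean RMS error:", round(err, 2), "Hz")
```

In this toy setting the tolerance `tol` plays the role of the "error becomes stable" criterion; the value used here is arbitrary and would need tuning on real F0 data.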


doi: 10.21437/Interspeech.2007-223

Cite as: Lolive, D., Barbot, N., Boeffard, O. (2007) Unsupervised HMM classification of F0 curves. Proc. Interspeech 2007, 478-481, doi: 10.21437/Interspeech.2007-223

@inproceedings{lolive07_interspeech,
  author={Damien Lolive and Nelly Barbot and Olivier Boeffard},
  title={{Unsupervised HMM classification of F0 curves}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={478--481},
  doi={10.21437/Interspeech.2007-223}
}