ISCA Archive Eurospeech 2003

Tone pattern discrimination combining parametric modeling and maximum likelihood estimation

Jinfu Ni, Hisashi Kawai

This paper presents a novel method for tone pattern discrimination that combines a functional fundamental frequency (F_0) model for feature extraction with vector quantization and maximum likelihood estimation. Tone patterns are represented in a parametric form based on the F_0 model and clustered using the LBG algorithm. The mapping between lexical tones and acoustic patterns is modeled statistically and decoded by maximum likelihood estimation. Evaluation experiments are conducted on 469 Mandarin utterances (1.4 hours of read speech from a female native speaker) under varied analysis conditions of codebook size and tone context. Experimental results indicate that the method is effective both in discriminating tones and in detecting inconsistencies between a lexical tone and its F_0 pattern. The method is suitable for the prosodic labeling of large-scale speech corpora.
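
As a rough illustration of the clustering and decoding steps named in the abstract, the Python sketch below grows a codebook with an LBG-style splitting procedure and then picks the tone whose estimated likelihood of the observed codeword is highest. It is not the authors' implementation. The two-dimensional vectors merely stand in for F_0 model parameters, the function names (`lbg`, `train_likelihoods`, `decode`), the additive split offset `eps`, and the toy data are all hypothetical, and the relative-frequency likelihood model is an assumption that ignores the tone contexts mentioned in the abstract.

```python
from collections import Counter, defaultdict


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(v, codebook):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)), key=lambda i: sq_dist(v, codebook[i]))


def lbg(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by repeated splitting (size assumed to be a power of two)."""
    codebook = [mean(vectors)]
    while len(codebook) < size:
        # Split every codeword into two copies offset by +eps and -eps.
        codebook = [[c + s * eps for c in cw] for cw in codebook for s in (1, -1)]
        # Refine with a fixed number of nearest-neighbour / centroid updates.
        for _ in range(iters):
            cells = [[] for _ in codebook]
            for v in vectors:
                cells[nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [mean(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook


def train_likelihoods(codes, tones):
    """Relative-frequency estimate of P(code | tone) from labeled pairs."""
    counts = defaultdict(Counter)
    for code, tone in zip(codes, tones):
        counts[tone][code] += 1
    return {tone: {code: n / sum(c.values()) for code, n in c.items()}
            for tone, c in counts.items()}


def decode(code, likelihoods):
    """Tone with the highest estimated P(code | tone)."""
    return max(likelihoods, key=lambda tone: likelihoods[tone].get(code, 0.0))


# Toy data (invented for illustration): two well-separated clusters,
# labeled with hypothetical tone classes 1 and 4.
vectors = [[0, 0], [0, 1], [1, 0], [1, 1],
           [10, 10], [10, 11], [11, 10], [11, 11]]
tones = [1, 1, 1, 1, 4, 4, 4, 4]

codebook = lbg(vectors, size=2)
codes = [nearest(v, codebook) for v in vectors]
likelihoods = train_likelihoods(codes, tones)
guess = decode(nearest([0.2, 0.3], codebook), likelihoods)
```

In a sketch like this, a syllable whose decoded tone differs from its lexical label would be a candidate for the kind of tone/F_0 inconsistency the abstract mentions, although the criterion the paper actually uses is not stated here.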


doi: 10.21437/Eurospeech.2003-171

Cite as: Ni, J., Kawai, H. (2003) Tone pattern discrimination combining parametric modeling and maximum likelihood estimation. Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003), 465-468, doi: 10.21437/Eurospeech.2003-171

@inproceedings{ni03_eurospeech,
  author={Jinfu Ni and Hisashi Kawai},
  title={{Tone pattern discrimination combining parametric modeling and maximum likelihood estimation}},
  year=2003,
  booktitle={Proc. 8th European Conference on Speech Communication and Technology (Eurospeech 2003)},
  pages={465--468},
  doi={10.21437/Eurospeech.2003-171}
}