High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling

Zhang, Shi-Xiong; Mak, Man-Wai; Meng, Helen

doi:10.21437/Interspeech.2007-143

Although articulatory feature-based conditional pronunciation models (AFCPMs) can capture the pronunciation characteristics of speakers, they require one discrete density function for each phoneme, which may lead to inaccurate models when the amount of training data is limited. This paper proposes a phonetic-class based AFCPM in which the density functions in speaker models are conditioned on phonetic classes instead of phonemes. Phonemes are mapped to phonetic classes by (1) vector quantizing the phoneme-dependent universal background models, (2) grouping phonemes according to the classical phoneme tree, and (3) combining (1) and (2). A new scoring method that uses an SVM to combine the scores of phonetic-class models is also proposed. Evaluations based on the 2000 NIST SRE show that the proposed approach can effectively solve the data sparseness problem encountered in conventional AFCPM.

Cite as: Zhang, S.-X., Mak, M.-W., Meng, H. (2007) High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling. Proc. Interspeech 2007, 762-765, doi: 10.21437/Interspeech.2007-143

@inproceedings{zhang07_interspeech,
  author={Shi-Xiong Zhang and Man-Wai Mak and Helen Meng},
  title={{High-level feature-based speaker verification via articulatory phonetic-class pronunciation modeling}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={762--765},
  doi={10.21437/Interspeech.2007-143}
}