ISCA Archive Interspeech 2010

Can tongue be recovered from face? the answer of data-driven statistical models

Atef Ben Youssef, Pierre Badin, Gérard Bailly

This study revisits the face-to-tongue articulatory inversion problem in speech. We compare Multi Linear Regression (MLR) with two more sophisticated methods based on Hidden Markov Models (HMMs) and Gaussian Mixture Models (GMMs), using the same French corpus of articulatory data acquired by ElectroMagnetoGraphy. Overall, GMMs give better results than HMMs, while MLR performs poorly. GMMs and HMMs maintain the original phonetic class distribution, though with some centralisation effects; these effects are much stronger with MLR. A detailed analysis shows that, although the jaw / lips / tongue tip synergy helps recover front high vowels and coronal consonants, the velars are not recovered at all. It is therefore not possible to reliably recover the tongue from the face.


doi: 10.21437/Interspeech.2010-567

Cite as: Youssef, A.B., Badin, P., Bailly, G. (2010) Can tongue be recovered from face? the answer of data-driven statistical models. Proc. Interspeech 2010, 2002-2005, doi: 10.21437/Interspeech.2010-567

@inproceedings{youssef10_interspeech,
  author={Atef Ben Youssef and Pierre Badin and Gérard Bailly},
  title={{Can tongue be recovered from face? the answer of data-driven statistical models}},
  year=2010,
  booktitle={Proc. Interspeech 2010},
  pages={2002--2005},
  doi={10.21437/Interspeech.2010-567}
}