ISCA Archive Interspeech 2010

A procedure for estimating gestural scores from natural speech

Hosung Nam, Vikramjit Mitra, Mark Tiede, Elliot Saltzman, Louis Goldstein, Carol Espy-Wilson, Mark Hasegawa-Johnson

Speech can be represented as a constellation of constricting events, called gestures, which are defined over vocal tract variables and organized in the form of a gestural score. Gestures and their output trajectories, the tract variables, are available only for synthetic speech and have recently been shown to improve ASR performance. We introduce a landmark-based time-warping procedure for annotating gestures in a natural speech database. For a given utterance, the Haskins Laboratories TADA model is used to generate a gestural score and acoustic output, and an optimal gestural score is then estimated through an iterative time-warping process based on landmark (phone) comparison.
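As a rough illustration of the landmark-based warping idea, the sketch below maps gesture intervals from a synthetic timeline onto a natural one by piecewise-linear interpolation between matched phone boundaries. It is a single-pass simplification, not the authors' iterative procedure and not part of TADA; the function names, the gesture labels, and all time values are hypothetical.

```python
from bisect import bisect_right

def warp_time(t, src_marks, dst_marks):
    """Map time t from the synthetic timeline to the natural one by
    piecewise-linear interpolation between matched landmarks."""
    if len(src_marks) != len(dst_marks) or len(src_marks) < 2:
        raise ValueError("need two or more matched landmarks")
    # Pick the landmark segment containing t, clamping to the first/last segment.
    i = min(max(bisect_right(src_marks, t) - 1, 0), len(src_marks) - 2)
    s0, s1 = src_marks[i], src_marks[i + 1]
    d0, d1 = dst_marks[i], dst_marks[i + 1]
    return d0 + (t - s0) * (d1 - d0) / (s1 - s0)

def warp_score(score, src_marks, dst_marks):
    """Warp each (gesture label, onset, offset) triple onto the natural timeline."""
    return [
        (name, warp_time(on, src_marks, dst_marks), warp_time(off, src_marks, dst_marks))
        for name, on, off in score
    ]

# Hypothetical phone boundaries (seconds) in the synthetic and natural utterances.
synthetic_marks = [0.0, 0.10, 0.25, 0.40]
natural_marks = [0.0, 0.15, 0.30, 0.50]

# Hypothetical gestural score on the synthetic timeline: (label, onset, offset).
score = [("lip closure", 0.05, 0.10), ("tongue-tip closure", 0.25, 0.40)]
warped = warp_score(score, synthetic_marks, natural_marks)
```

Here each gesture keeps its relative position within the phone segment it falls in, while the segment itself is stretched or compressed to match the natural boundaries. An iterative scheme, as the abstract describes, would presumably re-synthesize from the warped score and repeat the comparison until the landmarks agree.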


doi: 10.21437/Interspeech.2010-4

Cite as: Nam, H., Mitra, V., Tiede, M., Saltzman, E., Goldstein, L., Espy-Wilson, C., Hasegawa-Johnson, M. (2010) A procedure for estimating gestural scores from natural speech. Proc. Interspeech 2010, 30-33, doi: 10.21437/Interspeech.2010-4

@inproceedings{nam10_interspeech,
  author={Hosung Nam and Vikramjit Mitra and Mark Tiede and Elliot Saltzman and Louis Goldstein and Carol Espy-Wilson and Mark Hasegawa-Johnson},
  title={{A procedure for estimating gestural scores from natural speech}},
  year=2010,
  booktitle={Proc. Interspeech 2010},
  pages={30--33},
  doi={10.21437/Interspeech.2010-4}
}