Speaker adaptation using relevance vector regression for HMM-based expressive TTS

Hong, Doo Hwa; Lee, Joun Yeop; Jang, Se Young; Kim, Nam Soo

doi:10.21437/Interspeech.2015-307

The conventional maximum likelihood linear regression (MLLR)-based adaptation algorithm applied to acoustic hidden Markov models (HMMs) is restricted to linear regression and therefore cannot represent the details of the mapping characteristics. To overcome this problem, we propose a relevance vector regression (RVR)-based model parameter adaptation technique. In this framework, the conventional technique is extended to use many more basis functions. The weights forming the transform matrix are obtained by sparse Bayesian learning, in which most of the weights become zero because of the prior defined with precision hyper-parameters. Furthermore, by using appropriate kernel functions, RVR can combine the advantages of linear and nonlinear regression. In the experiments, an emotional speech database is used for adaptation to evaluate the proposed method against the conventional constrained MLLR. From the experimental results, we conclude that the RVR adaptation method performs better than the conventional method.
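
As a rough illustration of the sparse Bayesian learning idea described in the abstract, the following is a minimal sketch of generic relevance vector regression on toy one-dimensional data. It is not the authors' adaptation algorithm and does not involve HMM parameters. The function names (`design_matrix`, `fit_rvr`), the Gaussian kernel, the toy data, and all constants (`width`, `alpha_max`, iteration count) are assumptions chosen for illustration. The design matrix mixes a bias and a linear column with kernel columns, which mirrors the idea of combining linear and nonlinear regression, and the precision hyper-parameters `alpha` drive most weights to zero.

```python
import numpy as np

def design_matrix(x, centers, width):
    """Columns: bias, linear term, then one Gaussian kernel per center (assumed basis)."""
    kernels = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)
    return np.column_stack([np.ones_like(x), x, kernels])

def fit_rvr(phi, t, n_iter=300, alpha_max=1e6):
    """Sparse Bayesian regression with one precision hyper-parameter per weight."""
    n, m = phi.shape
    alpha = np.ones(m)            # per-weight prior precisions
    beta = 10.0 / np.var(t)       # initial noise precision (heuristic assumption)
    keep = np.arange(m)           # indices of basis functions not yet pruned
    for _ in range(n_iter):
        p, a = phi[:, keep], alpha[keep]
        # Gaussian posterior over the active weights
        sigma = np.linalg.inv(np.diag(a) + beta * p.T @ p)
        mu = beta * sigma @ p.T @ t
        # gamma measures how well each weight is determined by the data
        gamma = np.clip(1.0 - a * np.diag(sigma), 0.0, 1.0)
        a_new = np.maximum(gamma / (mu ** 2 + 1e-30), 1e-8)
        resid = t - p @ mu
        beta = (n - gamma.sum()) / (resid @ resid + 1e-12)
        alpha[keep] = a_new
        # a very large precision pins the weight at zero, so drop that basis function
        mask = a_new < alpha_max
        if not mask.any():
            break
        keep = keep[mask]
    # final posterior mean over the surviving basis functions
    p, a = phi[:, keep], alpha[keep]
    sigma = np.linalg.inv(np.diag(a) + beta * p.T @ p)
    weights = np.zeros(m)
    weights[keep] = beta * sigma @ p.T @ t
    return weights

# Toy data (assumed): a linear trend plus a nonlinear component and noise
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 60)
t = 0.5 * x + np.sin(2.0 * x) + 0.1 * rng.standard_normal(x.size)

phi = design_matrix(x, centers=x[::3], width=0.7)
weights = fit_rvr(phi, t)
rmse = np.sqrt(np.mean((phi @ weights - t) ** 2))
print("non-zero weights:", np.count_nonzero(weights), "of", weights.size)
print("training RMSE:", rmse)
```

In this sketch, the basis functions whose weights remain non-zero play the role of the relevance vectors; in the paper's setting the regression targets would be model parameters rather than a scalar curve.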

Cite as: Hong, D.H., Lee, J.Y., Jang, S.Y., Kim, N.S. (2015) Speaker adaptation using relevance vector regression for HMM-based expressive TTS. Proc. Interspeech 2015, 1216-1220, doi: 10.21437/Interspeech.2015-307

@inproceedings{hong15b_interspeech,
  author={Doo Hwa Hong and Joun Yeop Lee and Se Young Jang and Nam Soo Kim},
  title={{Speaker adaptation using relevance vector regression for HMM-based expressive TTS}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={1216--1220},
  doi={10.21437/Interspeech.2015-307}
}