This paper studies the inclusion of glottal source characteristics in voice conversion (VC) systems. We use source/filter decomposition to parametrize the vocal tract with line spectral frequencies (LSF), the glottal source with the Liljencrants-Fant (LF) model, and the aspiration noise with amplitude-modulated, high-pass filtered additive white Gaussian noise (AWGN). To evaluate the impact of this new parametrization on VC, we use a reference conversion system that estimates a linear transformation function using a joint target/source model obtained with CART and GMM. The reference system is based on the LPC model: it uses LSF to represent the vocal tract and a selection technique for the residual. We use the reference algorithm to build a VC system for each of the three parameter sets. We compare the two parametrizations in an intra-lingual voice conversion task in Spanish. The results show that the new source/filter representation clearly improves overall performance, both in speaker identity transformation and in voice quality.
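As an illustration of the aspiration noise component described in the abstract, the following is a minimal sketch of amplitude-modulated, high-pass filtered white Gaussian noise. It is not the authors' implementation: the function name `aspiration_noise`, the first-order FIR high-pass filter, the raised-cosine envelope synchronized to the fundamental frequency, and all parameter values (`fs`, `hp_coeff`, `mod_depth`) are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def aspiration_noise(f0_hz, duration_s, fs=16000, hp_coeff=0.95,
                     mod_depth=0.5, seed=0):
    """Hypothetical sketch: pitch-synchronous modulated high-pass noise.

    All parameter names and defaults are illustrative assumptions,
    not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = int(round(duration_s * fs))

    # White Gaussian noise source.
    white = rng.standard_normal(n)

    # Assumed high-pass filter: first-order FIR difference,
    # y[n] = x[n] - hp_coeff * x[n-1].
    highpassed = np.empty(n)
    highpassed[0] = white[0]
    highpassed[1:] = white[1:] - hp_coeff * white[:-1]

    # Assumed amplitude envelope: raised cosine with one cycle per
    # pitch period, ranging from (1 - mod_depth) up to 1.
    t = np.arange(n) / fs
    envelope = 1.0 - mod_depth * 0.5 * (1.0 + np.cos(2.0 * np.pi * f0_hz * t))

    return highpassed * envelope, envelope


if __name__ == "__main__":
    noise, env = aspiration_noise(f0_hz=100.0, duration_s=0.1)
    print(len(noise), float(env.min()), float(env.max()))
```

In a full system, the envelope shape and the filter would be tied to the estimated glottal parameters rather than fixed constants; the fixed values here only keep the sketch self-contained.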
Cite as: Pérez, J., Bonafonte, A. (2011) Adding glottal source information to intra-lingual voice conversion. Proc. Interspeech 2011, 2773-2776, doi: 10.21437/Interspeech.2011-694
@inproceedings{perez11_interspeech, author={Javier Pérez and Antonio Bonafonte}, title={{Adding glottal source information to intra-lingual voice conversion}}, year=2011, booktitle={Proc. Interspeech 2011}, pages={2773--2776}, doi={10.21437/Interspeech.2011-694} }