ISCA Archive Eurospeech 1997

Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum

Levent M. Arslan, David Talkin

This paper presents a new scheme for developing a voice conversion system that modifies the utterance of a source speaker to sound like speech from a target speaker. We refer to the method as Speaker Transformation Algorithm using Segmental Codebooks (STASC). Two new methods are described to perform the transformation of vocal tract and glottal excitation characteristics across speakers. In addition, the source speaker's general prosodic characteristics are modified using time-scale and pitch-scale modification algorithms. Informal listening tests suggest that convincing voice conversion is achieved while maintaining high speech quality. The performance of the proposed system is also evaluated on a standard Gaussian mixture model-based speaker identification system, and the results show that the transformed speech is assigned a higher likelihood by the target speaker model than by the source speaker model.


doi: 10.21437/Eurospeech.1997-383

Cite as: Arslan, L.M., Talkin, D. (1997) Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum. Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997), 1347-1350, doi: 10.21437/Eurospeech.1997-383

@inproceedings{arslan97_eurospeech,
  author={Levent M. Arslan and David Talkin},
  title={{Voice conversion by codebook mapping of line spectral frequencies and excitation spectrum}},
  year=1997,
  booktitle={Proc. 5th European Conference on Speech Communication and Technology (Eurospeech 1997)},
  pages={1347--1350},
  doi={10.21437/Eurospeech.1997-383}
}