Zero-Shot Text-to-Speech Synthesis Conditioned Using Self-Supervised Speech Representation Model

IEEE Conference Publication, IEEE Xplore