ISCA Archive ICSLP 1990

Neural network based concatenation method of synthesis units for synthesis by rule

Yasushi Ishikawa, Kunio Nakajima

In this paper, we describe a neural network based method for concatenating synthesis units in synthesis by rule. The proposed method uses two types of multilayer perceptrons: one network for phoneme recognition and another for spectrum production. The recognition network maps a spectrum to a vector whose elements indicate similarity to each phoneme (a phonetic vector), and the spectral production network performs the inverse transformation of the recognition network. At the boundary between synthesis units, two phonetic vectors are calculated using the recognition network and interpolated, and the spectra of the interpolation segment are then generated by the spectral production network. We provide multiple sets of neural networks for vowels and consonants, all trained with the back-propagation algorithm. Using the proposed method, we obtained satisfactory results in realizing coarticulation, and the synthetic speech sounds very natural.
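
The boundary procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `interpolate` and `bridge_units`, the use of linear interpolation, and the number of interpolated frames are all assumptions made for illustration, and the two trained networks are represented only as caller-supplied functions `recognize` (spectrum to phonetic vector) and `produce` (phonetic vector to spectrum).

```python
from typing import Callable, List, Sequence

Vector = List[float]


def interpolate(p_left: Sequence[float], p_right: Sequence[float],
                n_frames: int) -> List[Vector]:
    """Return n_frames phonetic vectors strictly between p_left and p_right.

    Linear interpolation is an assumption; the abstract does not state
    which interpolation scheme is used.
    """
    if len(p_left) != len(p_right):
        raise ValueError("phonetic vectors must have the same length")
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    frames = []
    for k in range(1, n_frames + 1):
        t = k / (n_frames + 1)  # interpolation weight, strictly between 0 and 1
        frames.append([(1.0 - t) * a + t * b for a, b in zip(p_left, p_right)])
    return frames


def bridge_units(left_spectrum: Sequence[float],
                 right_spectrum: Sequence[float],
                 recognize: Callable[[Sequence[float]], Vector],
                 produce: Callable[[Sequence[float]], Vector],
                 n_frames: int) -> List[Vector]:
    """Generate spectra for the segment between two synthesis units.

    recognize and produce are hypothetical stand-ins for the trained
    recognition and spectral production networks.
    """
    p_left = recognize(left_spectrum)    # phonetic vector at the end of the left unit
    p_right = recognize(right_spectrum)  # phonetic vector at the start of the right unit
    return [produce(p) for p in interpolate(p_left, p_right, n_frames)]
```

With identity functions passed for both networks, `bridge_units([0.0, 1.0], [1.0, 0.0], list, list, 3)` simply returns the three interpolated vectors, which makes the interpolation step easy to inspect before real networks are plugged in.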


doi: 10.21437/ICSLP.1990-96

Cite as: Ishikawa, Y., Nakajima, K. (1990) Neural network based concatenation method of synthesis units for synthesis by rule. Proc. First International Conference on Spoken Language Processing (ICSLP 1990), 793-796, doi: 10.21437/ICSLP.1990-96

@inproceedings{ishikawa90_icslp,
  author={Yasushi Ishikawa and Kunio Nakajima},
  title={{Neural network based concatenation method of synthesis units for synthesis by rule}},
  year=1990,
  booktitle={Proc. First International Conference on Spoken Language Processing (ICSLP 1990)},
  pages={793--796},
  doi={10.21437/ICSLP.1990-96},
  issn={2958-1796}
}