Fujisaki Model Based Intonation Modeling for Korean TTS System

  • Conference paper
Ubiquitous Computing and Multimedia Applications (UCMA 2010)

Abstract

One of the enduring problems in developing a high-quality text-to-speech (TTS) system is pitch contour generation. Drawing on language-specific knowledge, we introduce an adjusted Fujisaki model for a Korean TTS system, along with refined machine learning features. Quantitative and qualitative evaluations support the validity of our system: the accuracy of phrase command prediction is 0.8928; the correlations of the predicted amplitudes of phrase commands and accent commands are 0.6644 and 0.6002, respectively; and the generated F0 curves achieved a “fair” level of naturalness (3.6) on a MOS scale.
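For readers unfamiliar with the terms "phrase command" and "accent command", the following is a minimal sketch of the standard Fujisaki model, in which log F0 is a baseline plus the responses to phrase impulses and accent steps. It is not the adjusted Korean variant this paper proposes, whose details are not given in the abstract. The function names, the default constants (alpha = 3.0, beta = 20.0, gamma = 0.9, values commonly quoted for the standard model) and the example command values are assumptions chosen for illustration.

```python
import math


def phrase_response(t, alpha=3.0):
    """Response to a phrase command (impulse): Gp(t) = alpha^2 * t * exp(-alpha * t), t >= 0."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)


def accent_response(t, beta=20.0, gamma=0.9):
    """Response to an accent command (step): Ga(t) = min(1 - (1 + beta*t) * exp(-beta*t), gamma), t >= 0."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, base_hz, phrase_cmds, accent_cmds,
                alpha=3.0, beta=20.0, gamma=0.9):
    """F0 in Hz at time t (seconds) under the standard Fujisaki model.

    phrase_cmds: list of (onset_time, amplitude) pairs.
    accent_cmds: list of (onset_time, offset_time, amplitude) triples.
    """
    log_f0 = math.log(base_hz)
    for onset, amp in phrase_cmds:
        log_f0 += amp * phrase_response(t - onset, alpha)
    for onset, offset, amp in accent_cmds:
        log_f0 += amp * (accent_response(t - onset, beta, gamma)
                         - accent_response(t - offset, beta, gamma))
    return math.exp(log_f0)


if __name__ == "__main__":
    # Hypothetical commands: one phrase command at 0 s, one accent from 0.2 s to 0.5 s.
    phrases = [(0.0, 0.5)]
    accents = [(0.2, 0.5, 0.4)]
    for i in range(11):
        t = i * 0.1
        print(f"{t:.1f} s  {fujisaki_f0(t, 120.0, phrases, accents):.1f} Hz")
```

In this formulation, predicting an intonation contour reduces to predicting the timing and amplitude of each command, which is the kind of quantity the accuracy and correlation figures above refer to.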

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kim, B., Lee, J., Lee, G.G. (2010). Fujisaki Model Based Intonation Modeling for Korean TTS System. In: Tomar, G.S., Grosky, W.I., Kim, Th., Mohammed, S., Saha, S.K. (eds) Ubiquitous Computing and Multimedia Applications. UCMA 2010. Communications in Computer and Information Science, vol 75. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13467-8_10

  • DOI: https://doi.org/10.1007/978-3-642-13467-8_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13466-1

  • Online ISBN: 978-3-642-13467-8

  • eBook Packages: Computer Science, Computer Science (R0)
