Evolutionary-Based Design of a Brazilian Portuguese Recording Script for a Concatenative Synthesis System

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5190)

Abstract

Modifying prosodic parameters in concatenative synthesis systems may degrade speech quality, especially when large pitch changes are applied. To avoid large changes in the speech signal parameters, the speech corpus should contain segments whose phonetic and prosodic features are close to the predicted ones. This condition is more often met by a speech corpus specially designed to be both phonetically and prosodically rich. The design of such a corpus depends strongly on the script chosen for recording. To this end, a procedure for selecting the recording script of a TTS system is proposed for Brazilian Portuguese. The selected sentences include declarative, exclamatory, and interrogative ones. Phonetic and prosodic information is first represented as a set of feature vectors. Next, the number of distinct feature vectors is used as the fitness value for a genetic-based sentence selection. Experimental results indicate a considerable improvement in script variability for speech synthesis applications.
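
The selection idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract states only that feature vectors represent phonetic and prosodic information and that the count of distinct vectors serves as the fitness value. Everything else here is an assumption made for illustration, including the (phone, stress, sentence type) tuple layout, the fixed script size `k`, the tournament selection, the union-based crossover, the swap mutation, the elitism, and all function names and default parameters.

```python
import random


def fitness(selection, features):
    """Count the distinct feature vectors covered by the selected sentences."""
    covered = set()
    for idx in selection:
        covered.update(features[idx])
    return len(covered)


def crossover(a, b, k, rng):
    """Child draws k distinct sentence indices from the union of both parents."""
    pool = sorted(set(a) | set(b))
    return tuple(sorted(rng.sample(pool, k)))


def mutate(selection, n_sentences, rate, rng):
    """With probability `rate`, swap one selected sentence for an unselected one."""
    if rng.random() >= rate:
        return selection
    unused = [i for i in range(n_sentences) if i not in selection]
    if not unused:
        return selection
    chosen = list(selection)
    chosen[rng.randrange(len(chosen))] = rng.choice(unused)
    return tuple(sorted(chosen))


def select_script(features, k, pop_size=30, generations=60, rate=0.3, seed=0):
    """Search for k sentences that maximize the number of distinct feature vectors."""
    rng = random.Random(seed)
    n = len(features)

    def score(selection):
        return fitness(selection, features)

    # Each individual is a sorted tuple of k distinct sentence indices.
    population = [tuple(sorted(rng.sample(range(n), k))) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = population[:2]  # elitism: carry the two best scripts forward
        while len(next_gen) < pop_size:
            # Tournament selection of size 3 for each parent (an assumed choice).
            p1 = max(rng.sample(population, 3), key=score)
            p2 = max(rng.sample(population, 3), key=score)
            next_gen.append(mutate(crossover(p1, p2, k, rng), n, rate, rng))
        population = next_gen
    return max(population, key=score)


if __name__ == "__main__":
    # Hypothetical toy corpus: one set of (phone, stress, sentence type) tuples per sentence.
    toy_features = [
        {("a", 1, "decl"), ("b", 0, "decl")},
        {("a", 1, "decl")},
        {("c", 1, "int")},
        {("d", 0, "excl"), ("c", 1, "int")},
    ]
    best = select_script(toy_features, k=2)
    print(best, fitness(best, toy_features))
```

Representing each candidate script as a fixed-size tuple of sentence indices keeps the fitness a pure set-union count, so a script gains nothing from sentences whose feature vectors are already covered.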

This work was partially supported by the Brazilian National Council for Scientific and Technological Development (CNPq), Studies and Projects Funding Body (FINEP), and Dígitro Tecnologia Ltda.

Editor information

António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira, Paulo Quaresma

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nicodem, M.V., Seara, I.C., dos Anjos, D., Seara, R., Seara, R. (2008). Evolutionary-Based Design of a Brazilian Portuguese Recording Script for a Concatenative Synthesis System. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_9

  • DOI: https://doi.org/10.1007/978-3-540-85980-2_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-85979-6

  • Online ISBN: 978-3-540-85980-2

  • eBook Packages: Computer Science (R0)
