Evolutionary-Based Design of a Brazilian Portuguese Recording Script for a Concatenative Synthesis System

Nicodem, Monique Vitório; Seara, Izabel Christine; dos Anjos, Daiana; Seara Jr., Rui; Seara, Rui

doi:10.1007/978-3-540-85980-2_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5190)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

Modifying prosodic parameters in concatenative synthesis systems can degrade speech quality, especially when large pitch changes are applied. To avoid large changes in the speech signal parameters, the speech corpus should contain segments whose phonetic and prosodic features are close to the predicted ones. This condition is more often met by a speech corpus specially designed to be both phonetically and prosodically rich. The design of such a corpus depends strongly on the script chosen for recording. To that end, a procedure for selecting the recording script of a text-to-speech (TTS) system is proposed for Brazilian Portuguese. The selected sentences include declarative, exclamatory, and interrogative ones. Phonetic and prosodic information is first represented as a set of feature vectors. Next, the number of distinct feature vectors is used as the fitness value for a genetic-based sentence selection. Experimental results indicate a considerable improvement in script variability for speech synthesis applications.
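The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that the number of distinct feature vectors serves as the fitness value of a genetic-based selection. Everything else here is an assumption made for illustration, including the feature-vector layout, the function names (fitness, crossover, mutate, select_script), the operators (truncation selection with elitism, union crossover, swap mutation), and all parameter values.

```python
import random
from typing import List, Sequence, Set, Tuple

# Hypothetical feature vector, e.g. (phone, stress, sentence type).
FeatureVector = Tuple[str, ...]


def fitness(script: Sequence[int], features: Sequence[Set[FeatureVector]]) -> int:
    """Count the distinct feature vectors covered by the selected sentences."""
    covered: Set[FeatureVector] = set()
    for idx in script:
        covered.update(features[idx])
    return len(covered)


def crossover(a: Sequence[int], b: Sequence[int], size: int,
              rng: random.Random) -> List[int]:
    """Build a child by sampling `size` sentences from the union of both parents."""
    candidates = sorted(set(a) | set(b))
    return rng.sample(candidates, size)


def mutate(script: Sequence[int], pool_size: int, rate: float,
           rng: random.Random) -> List[int]:
    """With probability `rate` per position, swap in a sentence not yet selected."""
    result = list(script)
    chosen = set(result)
    for pos in range(len(result)):
        if rng.random() < rate:
            unused = [i for i in range(pool_size) if i not in chosen]
            if unused:
                new = rng.choice(unused)
                chosen.discard(result[pos])
                chosen.add(new)
                result[pos] = new
    return result


def select_script(features: Sequence[Set[FeatureVector]], size: int,
                  generations: int = 50, pop_size: int = 20,
                  mutation_rate: float = 0.1, seed: int = 0) -> List[int]:
    """Return indices of `size` sentences chosen by a simple genetic search."""
    rng = random.Random(seed)
    pool_size = len(features)
    population = [rng.sample(range(pool_size), size) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), refill with mutated children.
        population.sort(key=lambda s: fitness(s, features), reverse=True)
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            child = crossover(parent_a, parent_b, size, rng)
            children.append(mutate(child, pool_size, mutation_rate, rng))
        population = elite + children
    return max(population, key=lambda s: fitness(s, features))
```

Each candidate script is a list of sentence indices into a pool, and features[i] holds the set of feature vectors extracted from sentence i. Because the fitness counts distinct vectors rather than total occurrences, a sentence that only repeats already covered phonetic and prosodic contexts adds nothing, which pushes the search toward variability.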

This work was partially supported by the Brazilian National Council for Scientific and Technological Development (CNPq), Studies and Projects Funding Body (FINEP), and Dígitro Tecnologia Ltda.

Author information

Authors and Affiliations

LINSE – Circuits and Signal Processing Laboratory Department of Electrical Engineering, Federal University of Santa Catarina, Brazil
Monique Vitório Nicodem, Izabel Christine Seara, Daiana dos Anjos, Rui Seara Jr. & Rui Seara

Editor information

António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira, Paulo Quaresma

Cite this paper

Nicodem, M.V., Seara, I.C., dos Anjos, D., Seara Jr., R., Seara, R. (2008). Evolutionary-Based Design of a Brazilian Portuguese Recording Script for a Concatenative Synthesis System. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_9

DOI: https://doi.org/10.1007/978-3-540-85980-2_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85979-6
Online ISBN: 978-3-540-85980-2
eBook Packages: Computer Science, Computer Science (R0)
