Abstract
This paper presents a contribution to the design of an expressive speech synthesis system for Arabic. The system uses diphone concatenation to synthesize 10 phonetically balanced Arabic sentences. The rules for orthographic-to-phonetic transcription are detailed, along with the methodology used to record the diphone database. The sentences were synthesized with both “neutral” and “sadness” expressions and rated by 10 listeners; the results of this listening test are reported.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Demri, L., Falek, L., Teffahi, H. (2015). Contribution to the Design of an Expressive Speech Synthesis System for the Arabic Language. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_22
DOI: https://doi.org/10.1007/978-3-319-23132-7_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science, Computer Science (R0)