Abstract
This paper discusses how to collect phonetically rich and balanced verses as a speech corpus for a Quranic recognition system. Quranic phonology was analyzed based on the qira’a of ‘Asim in the riwaya of Hafs in order to transform the Arabic text of the Holy Quran into alphabetical symbols (QScript) that represent all possible sounds produced when the Holy Quran is read. All verses of the Holy Quran were checked to select a verse set that meets the criteria of a phonetically rich and balanced corpus. The selected set contains 180 of the 6236 verses in the Quran. The statistical similarity between the phoneme distribution of the selected verses and that of the whole Quran was 0.9998. To determine the effect of using this corpus, an early-stage speaker-dependent Quranic recognition system based on CMU Sphinx was developed. MFCC was used for feature extraction. The system used triphone-based HMMs with 3 emitting states. For the language model, the system used a word-based N-gram. The system was trained using recitations from 3 speakers and obtained a recognition accuracy of 97.47%.
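The selection idea described in the abstract can be illustrated with a small sketch. The abstract does not state which selection algorithm or similarity measure was used, so the code below assumes a simple greedy strategy and cosine similarity between phoneme frequency distributions. The function names and the toy phoneme symbols are hypothetical placeholders, not the paper's QScript.

```python
from collections import Counter
from math import sqrt


def phoneme_distribution(verses):
    """Relative frequency of each phoneme over a list of phoneme sequences."""
    counts = Counter(p for verse in verses for p in verse)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def cosine_similarity(dist_a, dist_b):
    """Cosine similarity between two sparse frequency distributions."""
    keys = set(dist_a) | set(dist_b)
    dot = sum(dist_a.get(k, 0.0) * dist_b.get(k, 0.0) for k in keys)
    norm_a = sqrt(sum(v * v for v in dist_a.values()))
    norm_b = sqrt(sum(v * v for v in dist_b.values()))
    return dot / (norm_a * norm_b)


def select_verses(verses, max_verses):
    """Greedy selection (an assumed strategy, not the paper's).

    Phase 1 ("rich"): add verses until every phoneme is covered.
    Phase 2 ("balanced"): add verses while similarity to the full corpus improves.
    Returns the indices of the selected verses.
    """
    target = phoneme_distribution(verses)
    all_phonemes = set(target)
    selected, covered = [], set()
    remaining = list(range(len(verses)))

    while covered != all_phonemes and len(selected) < max_verses:
        best = max(remaining, key=lambda i: len(set(verses[i]) - covered))
        selected.append(best)
        remaining.remove(best)
        covered |= set(verses[best])

    def score(indices):
        return cosine_similarity(
            phoneme_distribution([verses[i] for i in indices]), target
        )

    while remaining and len(selected) < max_verses:
        best = max(remaining, key=lambda i: score(selected + [i]))
        if score(selected + [best]) <= score(selected):
            break  # no remaining verse improves the balance
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy corpus: each verse is a list of placeholder phoneme symbols.
toy_verses = [
    ["b", "i", "s", "m"],
    ["a", "l", "l", "a", "h"],
    ["r", "a", "h", "m", "a", "n"],
    ["r", "a", "h", "i", "m"],
    ["m", "a", "l", "i", "k"],
]
chosen = select_verses(toy_verses, max_verses=4)
similarity = cosine_similarity(
    phoneme_distribution([toy_verses[i] for i in chosen]),
    phoneme_distribution(toy_verses),
)
print(chosen, round(similarity, 4))
```

On a real corpus the inputs would be the phoneme transcriptions of all 6236 verses, and the resulting similarity would be compared against the reported value of 0.9998 only if the same measure was used, which the abstract does not confirm.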
Copyright information
© 2016 Springer Science+Business Media Singapore
About this paper
Cite this paper
Yuwan, R., Lestari, D.P. (2016). Automatic Extraction Phonetically Rich and Balanced Verses for Speaker-Dependent Quranic Speech Recognition System. In: Hasida, K., Purwarianti, A. (eds) Computational Linguistics. PACLING 2015. Communications in Computer and Information Science, vol 593. Springer, Singapore. https://doi.org/10.1007/978-981-10-0515-2_5
DOI: https://doi.org/10.1007/978-981-10-0515-2_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0514-5
Online ISBN: 978-981-10-0515-2
eBook Packages: Computer Science (R0)