Automatic Extraction Phonetically Rich and Balanced Verses for Speaker-Dependent Quranic Speech Recognition System

Yuwan, Rahmi; Lestari, Dessi Puji

doi:10.1007/978-981-10-0515-2_5


Part of the book series: Communications in Computer and Information Science (CCIS, volume 593)

Included in the following conference series:

Conference of the Pacific Association for Computational Linguistics


Abstract

This paper discusses how to collect phonetically rich and balanced verses as a speech corpus for a Quranic recognition system. Quranic phonology was analyzed based on the qira’a of ‘Asim in the riwaya of Hafs to transform the Arabic text of the Holy Quran into alphabetical symbols (QScript) that represent all possible sounds produced when the Holy Quran is recited. All verses of the Holy Quran were checked to select a verse set that met the criteria of a phonetically rich and balanced corpus. The selected set contained 180 of the 6236 verses in the Quran. The statistical similarity between the phoneme distribution of the selected verses and the phoneme distribution of the whole Quran was 0.9998. To determine the effect of using this corpus, an early-stage speaker-dependent Quranic recognition system based on CMU Sphinx was developed. MFCC was used for feature extraction. The system used tri-phone-based HMMs with three emitting states. For the language model, the system used word-based N-grams. The system was trained using recitations from 3 speakers and obtained a recognition accuracy of 97.47%.
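
The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not state which similarity measure or search strategy was used, so cosine similarity and a two-phase greedy search are assumptions here, and the function names (`phoneme_distribution`, `cosine_similarity`, `select_verses`) are hypothetical. Each verse is assumed to be already transcribed into a list of phoneme symbols, standing in for QScript output.

```python
from collections import Counter
from math import sqrt


def phoneme_distribution(verses):
    """Relative frequency of each phoneme symbol across a list of verses.

    Each verse is a non-empty list of phoneme symbols (a stand-in for QScript).
    """
    counts = Counter(p for verse in verses for p in verse)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def cosine_similarity(dist_a, dist_b):
    """Cosine similarity of two distributions (an assumed measure, see lead-in)."""
    keys = set(dist_a) | set(dist_b)
    dot = sum(dist_a.get(k, 0.0) * dist_b.get(k, 0.0) for k in keys)
    norm_a = sqrt(sum(v * v for v in dist_a.values()))
    norm_b = sqrt(sum(v * v for v in dist_b.values()))
    return dot / (norm_a * norm_b)


def select_verses(verses, target_size):
    """Greedy sketch: cover every phoneme first, then balance the distribution."""
    full = phoneme_distribution(verses)
    selected, remaining = [], list(range(len(verses)))
    covered = set()

    # Phase 1 ("rich"): repeatedly take the verse adding the most unseen phonemes.
    # This phase may select more than target_size verses if coverage requires it.
    while covered != set(full) and remaining:
        best = max(remaining, key=lambda i: len(set(verses[i]) - covered))
        selected.append(best)
        remaining.remove(best)
        covered |= set(verses[best])

    # Phase 2 ("balanced"): add the verse that keeps the subset's phoneme
    # distribution closest to the distribution of the whole corpus.
    while len(selected) < target_size and remaining:
        best = max(
            remaining,
            key=lambda i: cosine_similarity(
                phoneme_distribution([verses[j] for j in selected + [i]]), full
            ),
        )
        selected.append(best)
        remaining.remove(best)

    return selected
```

Calling `select_verses(verses, 180)` on a full phoneme-level transcription would return the indices of a candidate subset, and `cosine_similarity` applied to the subset and full-corpus distributions gives a figure comparable in spirit to the 0.9998 reported above, under the stated assumptions.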



Author information

Authors and Affiliations

School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
Rahmi Yuwan & Dessi Puji Lestari


Corresponding author

Correspondence to Rahmi Yuwan.

Editor information

Editors and Affiliations

Graduate School of Information Science, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
Kôiti Hasida
School of Electrical Eng and Informatics, Bandung Institute of Technology, Bandung, Indonesia
Ayu Purwarianti


About this paper

Cite this paper

Yuwan, R., Lestari, D.P. (2016). Automatic Extraction Phonetically Rich and Balanced Verses for Speaker-Dependent Quranic Speech Recognition System. In: Hasida, K., Purwarianti, A. (eds) Computational Linguistics. PACLING 2015. Communications in Computer and Information Science, vol 593. Springer, Singapore. https://doi.org/10.1007/978-981-10-0515-2_5


DOI: https://doi.org/10.1007/978-981-10-0515-2_5
Published: 20 February 2016
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0514-5
Online ISBN: 978-981-10-0515-2
eBook Packages: Computer Science; Computer Science (R0)
