Enhancing Amazigh Speech Recognition System with MFDWC-SVM

Abakarim, Fadwa; Abenaou, Abdenbi

doi:10.1007/978-3-031-36805-9_31

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13956)

Included in the following conference series:

International Conference on Computational Science and Its Applications

Abstract

This research introduces a new Automatic Speech Recognition (ASR) system for the Amazigh language, specifically Tachelhit, which is spoken in the High Atlas and the South-West of Morocco. Implementing such a system is challenging because few Amazigh speech databases are available, so the first step is to create a database and the second is to develop the system. Our database contains 33 letters and 10 digits (0 to 9). A total of 120 speakers participated in the recording of the speech corpus: 80 adults (40 males and 40 females) and 40 children (20 boys and 20 girls). Of these, 60 are native speakers and the other 60 were speaking the language for the first time. After creating the database, we used Short-Time Energy (STE) to remove silence and shorten the input speech signals. Next, features are extracted by combining two well-known approaches, the Mel-Frequency Cepstral Coefficients (MFCC) and the Discrete Wavelet Transform (DWT), which yields the Mel-Frequency Discrete Wavelet Coefficients (MFDWC) method. Finally, classification is performed with a Support Vector Machine (SVM) model. The experimental results show that the system performs well, achieving an accuracy of 94.42% for the recognition of the 10 digits, 90.82% for the recognition of the 33 letters, and 92.22% for the recognition of the 33 letters and 10 digits combined.
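
The abstract does not give the details of the silence-removal step, so the following is only a minimal sketch of how Short-Time Energy can be used to drop silent frames. The function names, the frame length of 200 samples, and the relative threshold of 5% of the peak frame energy are all hypothetical choices for illustration and are not taken from the paper.

```python
import math


def short_time_energy(signal, frame_len):
    """Energy (sum of squared samples) of each non-overlapping frame.

    A trailing partial frame, if any, is ignored.
    """
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def remove_silence(signal, frame_len=200, rel_threshold=0.05):
    """Keep only the frames whose energy exceeds a fraction of the peak.

    frame_len and rel_threshold are assumed values, not the paper's settings.
    """
    energies = short_time_energy(signal, frame_len)
    if not energies:
        return list(signal)
    threshold = rel_threshold * max(energies)
    kept = []
    for i, energy in enumerate(energies):
        if energy > threshold:
            kept.extend(signal[i * frame_len:(i + 1) * frame_len])
    return kept


# Synthetic example: 400 silent samples, 800 samples of a 440 Hz tone
# sampled at 8 kHz, then 400 more silent samples.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(800)]
signal = [0.0] * 400 + tone + [0.0] * 400
trimmed = remove_silence(signal)
print(len(signal), len(trimmed))
```

In this synthetic case the leading and trailing silent frames have zero energy and are discarded, so only the tone remains. Real recordings contain background noise, so the threshold would need to be tuned on the actual corpus.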

Author information

Authors and Affiliations

Research Team of Applied Mathematics and Intelligent Systems Engineering, National School of Applied Sciences, Ibn Zohr University, 80000, Agadir, Morocco
Fadwa Abakarim & Abdenbi Abenaou

Corresponding author

Correspondence to Fadwa Abakarim.

Editor information

Editors and Affiliations

University of Perugia, Perugia, Italy
Osvaldo Gervasi
University of Basilicata, Potenza, Italy
Beniamino Murgante
Monash University, Clayton, VIC, Australia
David Taniar
Kyushu Sangyo University, Fukuoka, Japan
Bernady O. Apduhan
University of Minho, Braga, Portugal
Ana Cristina Braga
University of Cagliari, Cagliari, Italy
Chiara Garau
National Technical University of Athens, Athens, Greece
Anastasia Stratigea

About this paper

Cite this paper

Abakarim, F., Abenaou, A. (2023). Enhancing Amazigh Speech Recognition System with MFDWC-SVM. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2023. ICCSA 2023. Lecture Notes in Computer Science, vol 13956. Springer, Cham. https://doi.org/10.1007/978-3-031-36805-9_31

DOI: https://doi.org/10.1007/978-3-031-36805-9_31
Published: 30 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36804-2
Online ISBN: 978-3-031-36805-9
eBook Packages: Computer Science, Computer Science (R0)
