Abstract
This work introduces a new Automatic Speech Recognition (ASR) system for the Amazigh language, specifically Tachelhit, which is spoken in the High Atlas and the south-west of Morocco. Building such a system is challenging because few Amazigh speech databases are available, so we first create a database and then develop the system. Our database covers 33 letters and 10 digits (0 to 9). The speech corpus was recorded by 120 speakers: 80 adults (40 males and 40 females) and 40 children (20 boys and 20 girls). Of these, 60 are native speakers and the other 60 were speaking the language for the first time. After creating the database, we use Short-Time Energy (STE) to remove silence and shorten the input speech signals. Features are then extracted by combining two well-known approaches, Mel-Frequency Cepstral Coefficients (MFCC) and the Discrete Wavelet Transform (DWT), which yields the Mel-Frequency Discrete Wavelet Coefficients (MFDWC) method. Finally, classification is performed with a Support Vector Machine (SVM) model. The experimental results show that the system performs well, achieving an accuracy of 94.42% for the recognition of the 10 digits, 90.82% for the 33 letters, and 92.22% for the 33 letters and 10 digits combined.
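As an illustration of the silence-removal step only, the following is a minimal sketch of STE-based trimming, not the authors' implementation. The abstract does not state the frame length, the threshold rule, or whether frames overlap, so the non-overlapping 200-sample frames, the relative threshold of 5% of the peak frame energy, and the function names `short_time_energy` and `remove_silence` are all assumptions made for this example.

```python
import math


def short_time_energy(signal, frame_len, hop):
    """Return the energy (sum of squared samples) of each full frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def remove_silence(signal, frame_len=200, rel_threshold=0.05):
    """Keep only the frames whose energy exceeds a fraction of the peak energy.

    Assumptions (not from the paper): frames do not overlap, the threshold
    is relative to the loudest frame, and a trailing partial frame is dropped.
    """
    energies = short_time_energy(signal, frame_len, hop=frame_len)
    if not energies:
        # Signal shorter than one frame: nothing to decide, return it as is.
        return list(signal)
    threshold = rel_threshold * max(energies)
    kept = []
    for i, energy in enumerate(energies):
        if energy > threshold:
            kept.extend(signal[i * frame_len:(i + 1) * frame_len])
    return kept


# Synthetic test signal: 0.2 s of silence, 0.2 s of a 440 Hz tone, 0.2 s of silence.
sample_rate = 8000
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
recording = silence + tone + silence

trimmed = remove_silence(recording)
print(len(recording), len(trimmed))
```

On this synthetic recording the silent frames have zero energy and are discarded, so only the tone segment survives. Real recordings contain background noise, so the threshold would need to be tuned on the actual corpus.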
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Abakarim, F., Abenaou, A. (2023). Enhancing Amazigh Speech Recognition System with MFDWC-SVM. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2023. ICCSA 2023. Lecture Notes in Computer Science, vol 13956. Springer, Cham. https://doi.org/10.1007/978-3-031-36805-9_31
DOI: https://doi.org/10.1007/978-3-031-36805-9_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36804-2
Online ISBN: 978-3-031-36805-9
eBook Packages: Computer Science (R0)