Isolated Bangla Spoken Digit and Word Recognition Using MFCC and DTW

  • Chapter
Engineering Mathematics and Computing

Part of the book series: Studies in Computational Intelligence (SCI, volume 1042)

  • 562 Accesses

Abstract

Spoken digit recognition is an attractive research topic, and strong results have already been reported for widely spoken languages such as English and Chinese. However, very few works address digit recognition for regional languages. Moreover, pronunciation in a regional language varies considerably from one area to another, and the performance of the recognition systems proposed so far is comparatively low. In this paper, we work on digit recognition for a regional language, Bengali, also referred to as Bangla. For the proposed isolated digit and word recognition system, we created a small speech database containing the ten Bangla digits zero to nine (pronounced ‘sunno’ to ‘noi’) and four Bangla words (English equivalents: Right, Left, Above, and Below), with 100 samples for each class. A pre-processing phase is followed by extraction of a 39-dimensional feature vector consisting of Mel Frequency Cepstral Coefficients (MFCC), \(\Delta\)MFCC, and \(\Delta\Delta\)MFCC. Finally, we use Dynamic Time Warping (DTW) to classify the test samples. The system achieves a highest accuracy of 93% for the classification of Bangla words and digits, which is satisfactory. A comparative analysis with existing methods is given in Sect. 5.
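The pipeline summarised in the abstract (static features, their first and second differences, then DTW matching) can be illustrated with a minimal sketch. It is not the chapter's implementation: the 13 + 13 + 13 split of the 39 dimensions, the simple central-difference deltas, the Euclidean frame distance, and the nearest-template decision rule are all assumptions chosen for illustration, and the function names (delta, stack_features, dtw_distance, classify) are hypothetical. MFCC extraction itself is left out; each utterance is assumed to arrive as a list of per-frame feature vectors.

```python
import math


def delta(frames):
    """Central-difference delta of a frame sequence (assumed, simplified form)."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]      # clamp at the first frame
        nxt = frames[min(t + 1, n - 1)]   # clamp at the last frame
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


def stack_features(static):
    """Concatenate static, delta and delta-delta vectors frame by frame.

    With 13 static coefficients per frame (an assumption), the result
    has 39 dimensions per frame.
    """
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


def dtw_distance(x, y):
    """Classic DTW cost between two sequences of feature vectors.

    Local cost is the Euclidean distance between frames; allowed steps
    are insertion, deletion and match.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # x advances, y repeats
                                 cost[i][j - 1],      # y advances, x repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a list of (label, sequence) pairs.
    """
    return min(templates, key=lambda t: dtw_distance(query, t[1]))[0]


# Toy usage with one-dimensional "features" standing in for real MFCC frames.
rising = [[0.0], [1.0], [2.0]]
falling = [[2.0], [1.0], [0.0]]
slow_rising = [[0.0], [0.0], [1.0], [2.0], [2.0]]  # same shape, spoken slower
print(classify(slow_rising, [("sunno", rising), ("noi", falling)]))
```

In the toy usage, the slower utterance still matches the template with the same shape because DTW lets one frame align with several frames of the other sequence, which is the property that makes it suitable for isolated words spoken at different speeds.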

Author information

Corresponding author

Correspondence to Bachchu Paul.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Paul, B., Paul, R., Bera, S., Phadikar, S. (2023). Isolated Bangla Spoken Digit and Word Recognition Using MFCC and DTW. In: Gyei-Kark, P., Jana, D.K., Panja, P., Abd Wahab, M.H. (eds) Engineering Mathematics and Computing. Studies in Computational Intelligence, vol 1042. Springer, Singapore. https://doi.org/10.1007/978-981-19-2300-5_16

  • DOI: https://doi.org/10.1007/978-981-19-2300-5_16

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-19-2299-2

  • Online ISBN: 978-981-19-2300-5

  • eBook Packages: Engineering, Engineering (R0)
