Fostering inclusivity through effective communication: Real-time sign language to speech conversion system for the deaf and hard-of-hearing community

Multimedia Tools and Applications

Abstract

Effective communication is central to the empowerment of the deaf and hard-of-hearing community, for whom sign language is a primary means of expression. However, few people in the general population are proficient in sign language, which motivates automated sign language translators. This study presents a real-time sign language to speech conversion system built on a pre-trained InceptionResNetV2 deep learning model. To improve accuracy, a custom American Sign Language dataset is used, in which 21 hand keypoints and the corresponding sign images are captured using Python libraries. The training dataset comprises 7200 images in 24 alphabet classes ('J' and 'Z' are excluded). The model is trained for 20 epochs with a batch size of 16 and reaches training and validation accuracies of 98.23% and 97.07%, respectively. The system combines deep learning with the Jetson Nano platform to make real-time communication more accessible. Future work includes extending the system to translate complete sentences and to support other sign languages, with the aim of offering a broader set of sign language communication tools.
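The training setup described in the abstract can be made concrete with a short sketch. The Python below is illustrative only and is not the authors' code. The figures (7200 images, 24 classes, 20 epochs, batch size 16) come from the abstract. Everything else is an assumption: balanced classes, the helper names `index_to_letter` and `build_model`, the 299×299 input size, the frozen base network, the Adam optimizer, and the use of `tf.keras`. TensorFlow is imported only inside `build_model`, so the bookkeeping part runs without it.

```python
import string

# Figures stated in the abstract.
NUM_IMAGES = 7200
NUM_EPOCHS = 20
BATCH_SIZE = 16

# 24 fingerspelling classes: the alphabet without 'J' and 'Z'.
CLASS_LABELS = [c for c in string.ascii_uppercase if c not in ("J", "Z")]
NUM_CLASSES = len(CLASS_LABELS)

# Assumption: classes are balanced (the abstract does not say so).
IMAGES_PER_CLASS = NUM_IMAGES // NUM_CLASSES

# Optimizer updates per epoch and in total for the 7200 training images.
STEPS_PER_EPOCH = -(-NUM_IMAGES // BATCH_SIZE)  # ceiling division
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def index_to_letter(class_index: int) -> str:
    """Map a predicted class index (0..23) to the letter passed on to speech output."""
    return CLASS_LABELS[class_index]


def build_model(weights="imagenet", input_shape=(299, 299, 3)):
    """Hypothetical classifier head on a pre-trained InceptionResNetV2.

    Assumptions, none confirmed by the abstract: 299x299 RGB input, a frozen
    base network, one softmax layer, Adam, and integer class labels.
    """
    import tensorflow as tf  # lazy import so the constants above need no TensorFlow

    base = tf.keras.applications.InceptionResNetV2(
        include_top=False, weights=weights, input_shape=input_shape, pooling="avg"
    )
    base.trainable = False  # assumption: train only the new classification head
    outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
    model = tf.keras.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    print(NUM_CLASSES, IMAGES_PER_CLASS, STEPS_PER_EPOCH, TOTAL_STEPS)
    print(index_to_letter(9))
```

Under these assumptions the dataset holds 300 images per class, and training performs 450 updates per epoch, or 9000 in total. Because 'J' is skipped, class index 9 maps to 'K' rather than 'J', which is easy to get wrong when converting predictions to speech.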

Data availability

On request, the dataset used to support the findings of this study can be obtained from the corresponding author.

Funding

This research received no funding from any commercial, public, or non-profit organization.

Author information

Corresponding author

Correspondence to Rishika Anand.

Ethics declarations

Conflict of interest

The authors have no potential conflicts of interest to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kaur, B., Chaudhary, A., Bano, S. et al. Fostering inclusivity through effective communication: Real-time sign language to speech conversion system for the deaf and hard-of-hearing community. Multimed Tools Appl 83, 45859–45880 (2024). https://doi.org/10.1007/s11042-023-17372-9

  • DOI: https://doi.org/10.1007/s11042-023-17372-9
