Abstract
Effective communication is central to the empowerment of the deaf and hard-of-hearing community, for whom sign language is a primary means of expression. Because few people in the general population are proficient in sign language, automated sign language translators are needed. This study introduces a real-time sign-language-to-speech conversion system built on a pre-trained InceptionResNetV2 deep learning model. To improve accuracy, a custom American Sign Language dataset is used, in which 21 hand keypoints and the sign images are captured with Python libraries. The training dataset comprises 7200 images in 24 alphabet classes (excluding 'J' and 'Z'). The model is trained for 20 epochs with a batch size of 16, reaching training and validation accuracies of 98.23% and 97.07%, respectively. The system, which combines deep learning with the Jetson Nano platform, is a step toward robust communication accessibility. Future work includes extending the system to complete-sentence translation and to other sign languages, so as to offer a broader set of sign language communication solutions.
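As a rough illustration of the training setup summarized above, the following sketch builds the 24-class label set and derives the bookkeeping implied by the stated figures. Only the numbers (7200 images, 24 classes, batch size 16, 20 epochs) and the exclusion of 'J' and 'Z' come from the abstract. The alphabetical label order, the assumption of balanced classes, the assumption that all 7200 images are used for training, and every name in the code are hypothetical and are not the authors' implementation.

```python
import string

# Figures taken from the abstract.
NUM_IMAGES = 7200
BATCH_SIZE = 16
EPOCHS = 20
EXCLUDED = {"J", "Z"}

# Hypothetical label mapping: alphabetical ordering is an assumption,
# the abstract does not say how classes are indexed.
CLASS_NAMES = [c for c in string.ascii_uppercase if c not in EXCLUDED]
LABEL_TO_INDEX = {name: i for i, name in enumerate(CLASS_NAMES)}


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of batches needed to see every image once (ceiling division)."""
    return -(-num_images // batch_size)


# Assumes the classes are balanced, which the abstract does not state.
images_per_class = NUM_IMAGES // len(CLASS_NAMES)

# Assumes all 7200 images are used for training in each epoch.
steps = steps_per_epoch(NUM_IMAGES, BATCH_SIZE)
total_updates = steps * EPOCHS

if __name__ == "__main__":
    print(f"classes: {len(CLASS_NAMES)}")
    print(f"images per class (if balanced): {images_per_class}")
    print(f"steps per epoch: {steps}")
    print(f"total weight updates: {total_updates}")
```

Under these assumptions, the figures in the abstract imply 300 images per class and 450 batches per epoch.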
Data availability
On request, the dataset used to support the findings of this study can be obtained from the corresponding author.
Funding
This research received no funding from any commercial, public, or non-profit organization.
Ethics declarations
Conflict of interest
There are no potential conflicts of interest for the authors to disclose.
About this article
Cite this article
Kaur, B., Chaudhary, A., Bano, S. et al. Fostering inclusivity through effective communication: Real-time sign language to speech conversion system for the deaf and hard-of-hearing community. Multimed Tools Appl 83, 45859–45880 (2024). https://doi.org/10.1007/s11042-023-17372-9
DOI: https://doi.org/10.1007/s11042-023-17372-9