SignSpeaker: A Real-time, High-Precision SmartWatch-based Sign Language Translator

Authors:

Jiahui Hou
University of Science and Technology of China & Illinois Institute of Technology, Chicago, IL, USA

Xiang-Yang Li
University of Science and Technology of China, Hefei, China

Peide Zhu
University of Science and Technology of China, Hefei, China

Zefan Wang
University of Science and Technology of China, Hefei, China

Yu Wang
University of North Carolina at Charlotte, Charlotte, NC, USA

Jianwei Qian
Illinois Institute of Technology, Chicago, IL, USA

Panlong Yang
University of Science and Technology of China, Hefei, China

MobiCom '19: The 25th Annual International Conference on Mobile Computing and Networking, August 2019, Article No.: 24, Pages 1–15, https://doi.org/10.1145/3300061.3300117

Published: 05 August 2019

ABSTRACT

Sign language is a natural and fully-formed communication method for deaf or hearing-impaired people. Unfortunately, most state-of-the-art sign recognition technologies are limited by either high energy consumption or expensive device costs, and they struggle to provide real-time service in daily-life environments. Inspired by previous work on motion detection with wearable devices, we propose SignSpeaker, a real-time, robust, and user-friendly American sign language recognition (ASLR) system built on affordable and portable commodity mobile devices. SignSpeaker is deployed on a smartwatch paired with a smartphone; the smartwatch collects the sign signals and the smartphone outputs the translation through its built-in loudspeaker. We implement a prototype system and run a series of experiments that demonstrate its promising performance. For example, the average translation time is approximately 1.1 seconds for a sentence with eleven words. The average detection ratio and reliability of sign recognition are 99.2% and 99.5%, respectively. The average word error rate of continuous sentence recognition is 1.04%.
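
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, insertions, and deletions) between the recognized sentence and the reference sentence, divided by the number of reference words. The sketch below assumes that conventional definition; the abstract does not say how the authors computed their figure, and the function name and example sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes the conventional WER definition; not taken from the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of five reference words is dropped.
wer = word_error_rate("i want to drink water", "i want drink water")
print(f"WER = {wer:.2%}")
```

Under this definition, a rate of 1.04% corresponds to roughly one word-level error per hundred reference words.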

Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Neural networks
2. Human-centered computing
  1. Ubiquitous and mobile computing

Published in
MobiCom '19: The 25th Annual International Conference on Mobile Computing and Networking
August 2019
1017 pages
ISBN: 9781450361699
DOI: 10.1145/3300061
General Chairs: Sharad Agarwal (Microsoft), Ben Greenstein (Google), Aruna Balasubramanian (Stony Brook University)
Program Chairs: Shyam Gollakota (University of Washington), Xinyu Zhang (University of California, San Diego)
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
applications of machine learning
mobile computing
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 412 of 2,765 submissions, 15%