
Motion does matter: an examination of speech-based text entry on the move

  • LONG PAPER
  • Published in Universal Access in the Information Society

Abstract

Desktop interaction solutions are often inappropriate for mobile devices due to small screen sizes and portability needs. Speech recognition can improve interactions by providing a relatively hands-free solution that can be used in various situations. While mobile systems are designed to be transportable, few studies have examined the effects of motion on mobile interactions. This paper investigates the effect of motion on automatic speech recognition (ASR) input for mobile devices. Recognition error rates (RER) were examined with participants either walking or seated while performing text entry tasks, along with the effect of ASR enrollment conditions on RER. The results suggest changes to user training of ASR systems for mobile versus seated usage.
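The abstract does not define how RER was computed. As a sketch only, assuming RER is measured like the standard word error rate used in ASR evaluation, it can be derived from the word-level edit distance between the recognized text and the reference transcript; the function name and example strings below are illustrative, not from the paper.

```python
# Hedged sketch: a word-error-rate-style recognition error rate, computed as
# (substitutions + deletions + insertions) / number of reference words,
# via dynamic-programming (Levenshtein) alignment at the word level.

def recognition_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level error rate between a reference transcript and ASR output."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # all deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # all insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[len(ref)][len(hyp)] / len(ref)

print(recognition_error_rate("the quick brown fox", "the quick brown fox"))   # 0.0
print(recognition_error_rate("enter the meeting time", "enter a meeting tyme"))  # 0.5
```

Comparing error rates computed this way across walking and seated conditions, and across enrollment conditions, is one plausible reading of the analysis the abstract describes.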


Figures 1–7 accompany the full article (captions not included in this preview).


Abbreviations

ASR: Automatic speech recognition
RER: Recognition error rate
PDA: Personal digital assistant
SIID: Situationally induced impairments and disabilities
NASA TLX: NASA Task Load Index
ISRC: Interactive Systems Research Center
MME: Multiple metaphor environments
NSF: National Science Foundation


Acknowledgements

This material is based upon work supported by the National Science Foundation (NSF) under Grant Nos. IIS-0121570 and IIS-0328391. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF. Numerous colleagues at the ISRC were instrumental in the completion of this research, including Liwei Dai who performed analysis and coding of speech data. We would also like to thank the anonymous reviewers for their thoughtful feedback, which led to several improvements in this paper.

Author information


Correspondence to Kathleen J. Price.


About this article

Cite this article

Price, K.J., Lin, M., Feng, J. et al. Motion does matter: an examination of speech-based text entry on the move. Univ Access Inf Soc 4, 246–257 (2006). https://doi.org/10.1007/s10209-005-0006-8
