Improving the Utility of Speech Recognition Through Error Detection

Journal of Digital Imaging

Abstract

Despite its potential to dominate radiology reporting, speech recognition technology has thus far been a weak and inconsistent alternative to traditional human transcription. This is attributable to poor accuracy rates, despite vendor claims, and to the resources wasted in correcting erroneous reports. One solution to this problem is post-speech-recognition error detection, which helps the radiologist proofread more efficiently. In this paper, we present a statistical method for error detection that can be applied after transcription. The results are encouraging, showing an error detection rate as high as 96% in some cases.

Acknowledgments

We wish to thank the CDC for their continued support of this research endeavor, as well as Simon Fraser University.

Author information

Corresponding author

Correspondence to Kimberly Voll, Ph.D.

About this article

Cite this article

Voll, K., Atkins, S. & Forster, B. Improving the Utility of Speech Recognition Through Error Detection. J Digit Imaging 21, 371–377 (2008). https://doi.org/10.1007/s10278-007-9034-7

  • DOI: https://doi.org/10.1007/s10278-007-9034-7
