Using Machine Learning to Identify Health Outcomes from Electronic Health Record Data

Wong, Jenna; Murray Horwitz, Mara; Zhou, Li; Toh, Sengwee

doi:10.1007/s40471-018-0165-9

Pharmacoepidemiology (S Toh, Section Editor)
Published: 20 September 2018

Current Epidemiology Reports, Volume 5, pages 331–342 (2018)

Jenna Wong¹, Mara Murray Horwitz¹, Li Zhou²,³ & Sengwee Toh¹

Abstract

Purpose of Review

Electronic health records (EHRs) contain valuable data for identifying health outcomes, but these data also present numerous challenges when creating computable phenotyping algorithms. Machine learning methods could help with some of these challenges. In this review, we discuss four common scenarios that researchers may find helpful for thinking critically about when and for what tasks machine learning may be used to identify health outcomes from EHR data.

Recent Findings

We first consider the conditions in which machine learning may be especially useful with respect to two dimensions of a health outcome: (1) the characteristics of its diagnostic criteria and (2) the format in which its diagnostic data are usually stored within EHR systems. In the first dimension, we propose that for health outcomes with diagnostic criteria involving many clinical factors, vague definitions, or subjective interpretations, machine learning may be useful for modeling the complex diagnostic decision-making process from a vector of clinical inputs to identify individuals with the health outcome. In the second dimension, we propose that for health outcomes where diagnostic information is largely stored in unstructured formats such as free text or images, machine learning may be useful for extracting and structuring this information as part of a natural language processing system or an image recognition task. We then consider these two dimensions jointly to define four common scenarios of health outcomes. For each scenario, we discuss the potential uses for machine learning—first assuming accurate and complete EHR data and then relaxing these assumptions to accommodate the limitations of real-world EHR systems. We illustrate these four scenarios using concrete examples and describe how recent studies have used machine learning to identify these health outcomes from EHR data.
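As a reading aid, the two dimensions above can be sketched as a small lookup that yields the four scenarios. This is only an illustrative sketch, assuming accurate and complete EHR data: the class name, field names, and role descriptions below are hypothetical labels chosen for this example and do not come from the article.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class OutcomeProfile:
    """Hypothetical description of a health outcome along the two dimensions."""
    complex_criteria: bool   # many clinical factors, vague definitions, or subjective interpretation
    unstructured_data: bool  # diagnostic data mostly stored as free text or images


def suggested_ml_roles(outcome: OutcomeProfile) -> list[str]:
    """Return the potential uses of machine learning implied by each dimension."""
    roles = []
    if outcome.unstructured_data:
        # Second dimension: information must first be extracted and structured.
        roles.append("extract and structure diagnostic information (NLP or image recognition)")
    if outcome.complex_criteria:
        # First dimension: the diagnostic decision itself is hard to write as rules.
        roles.append("model the diagnostic decision from a vector of clinical inputs")
    return roles


# Enumerate the four scenarios formed by crossing the two dimensions.
scenarios = {
    (complex_criteria, unstructured): suggested_ml_roles(
        OutcomeProfile(complex_criteria, unstructured)
    )
    for complex_criteria, unstructured in product([False, True], repeat=2)
}

for (complex_criteria, unstructured), roles in scenarios.items():
    print(f"complex criteria={complex_criteria}, unstructured data={unstructured}: "
          f"{roles or ['no obvious need for machine learning under ideal data']}")
```

Under this simplified reading, an outcome with simple criteria and structured data needs neither role, while an outcome with complex criteria and unstructured data may call for both. The sketch does not cover the relaxed, real-world data assumptions discussed in the review.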

Summary

Machine learning has great potential to improve the accuracy and efficiency of health outcome identification from EHR systems, especially under certain conditions. To promote the use of machine learning in EHR-based phenotyping tasks, future work should prioritize efforts to increase the transportability of machine learning algorithms for use in multi-site settings.

Acknowledgments

We thank Dr. Lisa Herrinton from the Kaiser Permanente Division of Research, Northern California for reviewing this paper and providing valuable feedback. We also thank Jacqueline Cellini from the Countway Library of Medicine for her help in identifying references for this review paper. Dr. Wong and Dr. Murray Horwitz are supported by the Thomas O. Pyle Fellowship from Harvard Medical School & Harvard Pilgrim Health Care Institute. Dr. Zhou is partially supported by the Agency for Healthcare Research and Quality (R01HS022728, R01HS025375, and R01HS024264). Dr. Toh is partially supported by the National Institute of Biomedical Imaging and Bioengineering (U01EB023683).

Author information

Authors and Affiliations

Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, 401 Park Drive, Suite 401 East, Boston, MA, 02215, USA
Jenna Wong, Mara Murray Horwitz & Sengwee Toh
Division of General Internal Medicine and Primary Care, Brigham and Women’s Hospital, Boston, MA, USA
Li Zhou
Harvard Medical School, Boston, MA, USA
Li Zhou

Corresponding author

Correspondence to Jenna Wong.

Ethics declarations

Conflict of Interest

Mara Murray Horwitz reports other from Harvard Medical School and Harvard Pilgrim Health Care Institute, during the conduct of the study. Sengwee Toh reports grants from National Institute of Biomedical Imaging and Bioengineering, during the conduct of the study. Jenna Wong reports other from Harvard Medical School and Harvard Pilgrim Health Care Institute, during the conduct of the study. Li Zhou reports grants from Agency for Healthcare Research and Quality, during the conduct of the study.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by the authors.

Additional information

This article is part of the Topical Collection on Pharmacoepidemiology

About this article

Cite this article

Wong, J., Murray Horwitz, M., Zhou, L. et al. Using Machine Learning to Identify Health Outcomes from Electronic Health Record Data. Curr Epidemiol Rep 5, 331–342 (2018). https://doi.org/10.1007/s40471-018-0165-9

Published: 20 September 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s40471-018-0165-9
