Clinical Information Retrieval: A Literature Review

  • Research Article
Journal of Healthcare Informatics Research

Abstract

Clinical information retrieval (IR) plays a vital role in modern healthcare by giving clinicians and researchers efficient access to medical literature and supporting its analysis. This scoping review offers a comprehensive overview of the current state of clinical IR research and identifies gaps and opportunities for future studies in the field. The main objective was to assess the existing literature on clinical IR, focusing on the methods, techniques, and tools used to retrieve and analyze medical information. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we conducted an extensive search across databases such as Ovid Embase, Ovid Medline, Scopus, ACM Digital Library, IEEE Xplore, and Web of Science, covering publications from January 1, 2010, to January 4, 2023. Screening led to the inclusion of 184 papers in our review. Our findings provide a detailed analysis of the clinical IR research landscape, covering publication trends, data sources, methodologies, evaluation metrics, and applications. The review identifies key research gaps in clinical IR methods such as indexing, ranking, and query expansion, and can therefore serve as a guiding framework for future research in this rapidly evolving field. The study also underscores the need for research on advanced clinical IR systems capable of fast semantic vector search, and for the adoption of neural IR techniques to retrieve information effectively from unstructured electronic health records (EHRs).

Data Availability

Data and materials are available in the supplemental files.

Acknowledgements

The authors would like to acknowledge the support from the National Center for Advancing Translational Sciences (NCATS) U24TR004111, the National Library of Medicine (NLM) R01LM014306, the University of Pittsburgh Clinical and Translational Science Institute (CTSI) Pilot Award, the University of Pittsburgh Momentum Funds, and the School of Health and Rehabilitation Sciences Dean’s Research and Development Award. Authors KR, WH, and HL would like to acknowledge the support from the NLM R01LM011934.

Funding

National Center for Advancing Translational Sciences (NCATS) U24TR004111.

University of Pittsburgh Clinical and Translational Science Institute (CTSI) Pilot Award.

University of Pittsburgh Momentum Funds.

School of Health and Rehabilitation Sciences Dean’s Research and Development Award.

National Library of Medicine R01LM014306, R01LM011934.

Author information

Contributions

SS: conceptualized the study, wrote the manuscript; HAM: conducted data analysis; DO: conducted data analysis, edited the manuscript; KR: edited the manuscript; WH: edited the manuscript; HL: edited the manuscript; DH: edited the manuscript; SV: edited the manuscript; YW: conceptualized the study, wrote the manuscript.

Corresponding author

Correspondence to Yanshan Wang.

Ethics declarations

Ethical Approval

Not applicable.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (XLSX 32 KB)

Supplementary file2 (DOCX 50 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sivarajkumar, S., Mohammad, H.A., Oniani, D. et al. Clinical Information Retrieval: A Literature Review. J Healthc Inform Res 8, 313–352 (2024). https://doi.org/10.1007/s41666-024-00159-4

  • DOI: https://doi.org/10.1007/s41666-024-00159-4
