RadTex: Learning Efficient Radiograph Representations from Text Reports

Quigley, Keegan; Cha, Miriam; Liao, Ruizhi; Chauhan, Geeticka; Horng, Steven; Berkowitz, Seth; Golland, Polina

doi:10.1007/978-3-031-16876-5_3

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13543)

Included in the following conference series:

MICCAI Workshop on Resource-Efficient Medical Image Analysis

Abstract

Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high performance – often an obstacle to medical domain adaptation. In this paper, we build a data-efficient learning framework that utilizes radiology reports to improve medical image classification performance with limited labeled data (fewer than 1000 examples). Specifically, we examine image-captioning pretraining to learn high-quality medical image representations that train on fewer examples. Following joint pretraining of a convolutional encoder and transformer decoder, we transfer the learned encoder to various classification tasks. Averaged over 9 pathologies, we find that our model achieves higher classification performance than ImageNet-supervised and in-domain supervised pretraining when labeled training data is limited.
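The comparison "averaged over 9 pathologies" can be made concrete with a small sketch. The abstract does not name the metric, so the code below assumes per-pathology AUROC with an unweighted (macro) mean. The function names, pathology names, and toy numbers are illustrative only and are not taken from the paper.

```python
from typing import Dict, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via the pairwise (Mann-Whitney) identity.

    Counts the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def macro_average_auroc(
    labels: Dict[str, Sequence[int]],
    scores: Dict[str, Sequence[float]],
) -> float:
    """Unweighted mean of the per-pathology AUROC values."""
    per_pathology = [auroc(labels[name], scores[name]) for name in labels]
    return sum(per_pathology) / len(per_pathology)


# Toy, made-up labels and classifier scores for two hypothetical pathologies.
toy_labels = {"edema": [0, 0, 1, 1], "pneumothorax": [0, 1, 0, 1]}
toy_scores = {
    "edema": [0.1, 0.4, 0.35, 0.8],
    "pneumothorax": [0.2, 0.9, 0.3, 0.7],
}
mean_auc = macro_average_auroc(toy_labels, toy_scores)
```

Under these assumptions, one would compute this average for each pretraining strategy at each labeled-set size and compare the resulting values.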

Acknowledgements

This work was supported in part by MIT Lincoln Laboratory, US Air Force, NIH NIBIB NAC P41EB015902, Wistron, IBM Watson, MIT Deshpande Center, and MIT J-Clinic.

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited. This material is based upon work supported by the Old Program 1 under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Old Program 1. ©Massachusetts Institute of Technology. Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

Author information

Authors and Affiliations

MIT Lincoln Laboratory, Lexington, MA, USA
Keegan Quigley & Miriam Cha
CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA
Ruizhi Liao, Geeticka Chauhan & Polina Golland
Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Steven Horng & Seth Berkowitz

Corresponding author

Correspondence to Keegan Quigley.

Editor information

Editors and Affiliations

Institute of High Performance Computing, Singapore, Singapore
Xinxing Xu
Hong Kong University of Science and Technology, Hong Kong, China
Xiaomeng Li
Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates
Dwarikanath Mahapatra
University of Alberta, Edmonton, AB, Canada
Li Cheng
University of Rouen, Rouen, France
Caroline Petitjean
Institute of High Performance Computing, Singapore, Singapore
Huazhu Fu

About this paper

Cite this paper

Quigley, K. et al. (2022). RadTex: Learning Efficient Radiograph Representations from Text Reports. In: Xu, X., Li, X., Mahapatra, D., Cheng, L., Petitjean, C., Fu, H. (eds) Resource-Efficient Medical Image Analysis. REMIA 2022. Lecture Notes in Computer Science, vol 13543. Springer, Cham. https://doi.org/10.1007/978-3-031-16876-5_3

DOI: https://doi.org/10.1007/978-3-031-16876-5_3
Published: 15 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16875-8
Online ISBN: 978-3-031-16876-5
eBook Packages: Computer Science, Computer Science (R0)
