Abstract
In this paper, we propose a novel approach (called GPT4MIA) that utilizes a Generative Pre-trained Transformer (GPT) as a plug-and-play transductive inference tool for medical image analysis (MIA). We provide a theoretical analysis of why a large pre-trained language model such as GPT-3 can serve as a plug-and-play transductive inference model for MIA. At the methodological level, we develop several technical treatments to improve the efficiency and effectiveness of GPT4MIA, including better prompt structure design, sample selection, and prompt ordering of representative samples/features. We present two concrete use cases (with workflow) of GPT4MIA: (1) detecting prediction errors and (2) improving prediction accuracy, working in conjunction with well-established vision-based models for image classification (e.g., ResNet). Experiments validate that our proposed method is effective for these two tasks. We further discuss the opportunities and challenges in utilizing Transformer-based large language models for broader MIA applications.
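The full paper is behind the access wall, but the core idea described in the abstract, serializing labeled reference samples into a text prompt and asking a GPT model to classify a query sample, can be sketched as follows. The helper names, prompt wording, and number formatting here are illustrative assumptions, not the authors' actual prompt design.

```python
# Hedged sketch: build a transductive-inference prompt from labeled
# reference samples, in the spirit of GPT4MIA. Prompt wording and
# feature rounding are illustrative assumptions.

def format_sample(features, label=None):
    """Render one feature vector (optionally with its class label) as text."""
    feats = ", ".join(f"{x:.2f}" for x in features)
    suffix = "?" if label is None else str(label)
    return f"Features: [{feats}] -> Label: {suffix}"

def build_prompt(references, query):
    """references: list of (feature_vector, label) pairs; query: feature vector."""
    lines = ["Classify the query sample given the labeled examples."]
    lines += [format_sample(f, l) for f, l in references]
    lines.append(format_sample(query))
    return "\n".join(lines)

refs = [([0.10, 0.92], "benign"), ([0.88, 0.15], "malignant")]
prompt = build_prompt(refs, [0.85, 0.20])
print(prompt)
```

In the paper's workflow, a prompt of this kind would be sent to the GPT-3 completion API, and the returned label used either to flag a disagreement with a vision model's prediction (use case 1) or to refine that prediction (use case 2); the sample selection and ordering treatments mentioned in the abstract would govern which references appear in the prompt and in what order.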
Notes
- 1.
Nearest neighbor classifiers are a typical type of transductive methods for prediction problems.
- 2.
LR, MLP, SVM, and KNN are implemented using the scikit-learn library at https://scikit-learn.org/, and UbKNN uses our own implementation.
- 3.
The model weights are obtained from https://github.com/MedMNIST/experiments.
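Note 1 cites nearest-neighbor classifiers as a typical transductive method: prediction consults the labeled reference set directly, with no separately fitted model. A minimal stdlib-only sketch of this baseline (not the under-bagging UbKNN variant from Note 2):

```python
# Minimal k-nearest-neighbor classification by majority vote.
# Transductive in the sense that prediction uses the labeled
# reference set directly, with no trained model parameters.
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs; returns the
    majority label among the k references nearest to `query`."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train = [([0.0, 0.0], "A"), ([0.1, 0.2], "A"),
         ([1.0, 1.0], "B"), ([0.9, 1.1], "B")]
print(knn_predict(train, [0.2, 0.1]))  # → A
```

In the paper's setting, the feature vectors would come from a trained vision model (e.g., ResNet features on MedMNIST images) rather than raw coordinates.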
References
OpenAI API. https://openai.com/api/
Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)
ChatGPT. https://openai.com/blog/chatgpt/
Brown, T., et al.: Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020)
Hang, H., Cai, Y., Yang, H., Lin, Z.: Under-bagging nearest neighbors for imbalanced classification. J. Mach. Learn. Res. 23(118), 1–63 (2022)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
Joachims, T., et al.: Transductive inference for text classification using support vector machines. In: ICML, vol. 99, pp. 200–209 (1999)
Peterson, L.E.: K-nearest neighbor. Scholarpedia 4(2), 1883 (2009)
Vapnik, V.: Statistical Learning Theory. Wiley (1998)
Weisberg, S.: Applied Linear Regression, vol. 528. Wiley (2005)
Yang, J., et al.: MedMNIST v2 - a large-scale lightweight benchmark for 2D and 3D biomedical image classification. Sci. Data 10(1), 41 (2023)
Acknowledgement
This work was supported in part by National Natural Science Foundation of China (62201263) and Natural Science Foundation of Jiangsu Province (BK20220949).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhang, Y., Chen, D.Z. (2023). GPT4MIA: Utilizing Generative Pre-trained Transformer (GPT-3) as a Plug-and-Play Transductive Model for Medical Image Analysis. In: Celebi, M.E., et al. (eds.) Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 Workshops. Lecture Notes in Computer Science, vol. 14393. Springer, Cham. https://doi.org/10.1007/978-3-031-47401-9_15
Print ISBN: 978-3-031-47400-2
Online ISBN: 978-3-031-47401-9