Abstract
In this paper, we propose a novel approach (called GPT4MIA) that utilizes a Generative Pre-trained Transformer (GPT) as a plug-and-play transductive inference tool for medical image analysis (MIA). We provide a theoretical analysis of why a large pre-trained language model such as GPT-3 can serve as a plug-and-play transductive inference model for MIA. At the methodological level, we develop several technical treatments to improve the efficiency and effectiveness of GPT4MIA, including better prompt structure design, sample selection, and prompt ordering of representative samples/features. We present two concrete use cases (with workflow) of GPT4MIA: (1) detecting prediction errors and (2) improving prediction accuracy, working in conjunction with well-established vision-based models for image classification (e.g., ResNet). Experiments validate that our proposed method is effective for these two tasks. We further discuss the opportunities and challenges in utilizing Transformer-based large language models for broader MIA applications.
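The full paper is behind the access wall, but the core idea described in the abstract, serializing labeled reference samples into a text prompt and asking a GPT model to classify a query sample, can be sketched as follows. The helper names, prompt wording, and number formatting here are illustrative assumptions, not the authors' actual prompt design.

```python
# Hedged sketch: build a transductive-inference prompt from labeled
# reference samples, in the spirit of GPT4MIA. Prompt wording and
# feature rounding are illustrative assumptions.

def format_sample(features, label=None):
    """Render one feature vector (optionally with its class label) as text."""
    feats = ", ".join(f"{x:.2f}" for x in features)
    suffix = "?" if label is None else str(label)
    return f"Features: [{feats}] -> Label: {suffix}"

def build_prompt(references, query):
    """references: list of (feature_vector, label) pairs; query: feature vector."""
    lines = ["Classify the query sample given the labeled examples."]
    lines += [format_sample(f, l) for f, l in references]
    lines.append(format_sample(query))
    return "\n".join(lines)

refs = [([0.10, 0.92], "benign"), ([0.88, 0.15], "malignant")]
prompt = build_prompt(refs, [0.85, 0.20])
print(prompt)
```

In the paper's workflow, a prompt of this kind would be sent to the GPT-3 completion API, and the returned label used either to flag a disagreement with a vision model's prediction (use case 1) or to refine that prediction (use case 2); the sample selection and ordering treatments mentioned in the abstract would govern which references appear in the prompt and in what order.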
Notes
- 1.
Nearest neighbor classifiers are a typical type of transductive methods for prediction problems.
- 2.
LR, MLP, SVM, and KNN are implemented using the scikit-learn library at https://scikit-learn.org/, and UbKNN uses our own implementation.
- 3.
The model weights are obtained from https://github.com/MedMNIST/experiments.
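Note 1 cites nearest-neighbor classifiers as a typical transductive method: prediction consults the labeled reference set directly, with no separately fitted model. A minimal stdlib-only sketch of this baseline (not the under-bagging UbKNN variant from Note 2):

```python
# Minimal k-nearest-neighbor classification by majority vote.
# Transductive in the sense that prediction uses the labeled
# reference set directly, with no trained model parameters.
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs; returns the
    majority label among the k references nearest to `query`."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train = [([0.0, 0.0], "A"), ([0.1, 0.2], "A"),
         ([1.0, 1.0], "B"), ([0.9, 1.1], "B")]
print(knn_predict(train, [0.2, 0.1]))  # → A
```

In the paper's setting, the feature vectors would come from a trained vision model (e.g., ResNet features on MedMNIST images) rather than raw coordinates.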
References
OpenAI API. https://openai.com/api/
Chang, C.-C., Lin, C.-J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 1–27 (2011)
ChatGPT. https://openai.com/blog/chatgpt/
Brown, T., et al.: Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020)
Hang, H., Cai, Y., Yang, H., Lin, Z.: Under-bagging nearest neighbors for imbalanced classification. J. Mach. Learn. Res. 23(118), 1–63 (2022)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
Joachims, T., et al.: Transductive inference for text classification using support vector machines. In: ICML, vol. 99, pp. 200–209 (1999)
Peterson, L.E.: K-nearest neighbor. Scholarpedia 4(2), 1883 (2009)
Vapnik, V.: Statistical Learning Theory. Wiley (1998)
Weisberg, S.: Applied Linear Regression, vol. 528. Wiley (2005)
Yang, J., et al.: MedMNIST v2 - a large-scale lightweight benchmark for 2D and 3D biomedical image classification. Sci. Data 10(1), 41 (2023)
Acknowledgement
This work was supported in part by National Natural Science Foundation of China (62201263) and Natural Science Foundation of Jiangsu Province (BK20220949).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Zhang, Y., Chen, D.Z. (2023). GPT4MIA: Utilizing Generative Pre-trained Transformer (GPT-3) as a Plug-and-Play Transductive Model for Medical Image Analysis. In: Celebi, M.E., et al. (eds.) Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 Workshops. Lecture Notes in Computer Science, vol. 14393. Springer, Cham. https://doi.org/10.1007/978-3-031-47401-9_15
Print ISBN: 978-3-031-47400-2
Online ISBN: 978-3-031-47401-9