Algorithmic bias, generalist models, and clinical medicine

  • Original Research
  • Published in AI and Ethics

Abstract

The technical landscape of clinical machine learning is shifting in ways that destabilize pervasive assumptions about the nature and causes of algorithmic bias. On one hand, the dominant paradigm in clinical machine learning is narrow in the sense that models are trained on biomedical data sets for particular clinical tasks, such as diagnosis and treatment recommendation. On the other hand, the emerging paradigm is generalist in the sense that general-purpose language models such as Google’s BERT and PaLM are increasingly being adapted for clinical use cases via prompting or fine-tuning on biomedical data sets. Many of these next-generation models provide substantial performance gains over prior clinical models, but at the same time introduce novel kinds of algorithmic bias and complicate the explanatory relationship between algorithmic biases and biases in training data. This paper articulates how and in what respects biases in generalist models differ from biases in prior clinical models, and draws out practical recommendations for algorithmic bias mitigation. The basic methodological approach is that of philosophical ethics in that the focus is on conceptual clarification of the different kinds of biases presented by generalist clinical models and their bioethical significance.


Notes

  1. For an introduction to machine learning in medicine, see Rajkomar et al. [61]. For further reading, see Shamout et al. [65], Secinaro et al. [64], and Lee and Lee [43].

  2. Performance biases can also arise, among other causes, from misrepresentative training data, such as data sets that employ proxy variables or data labels that systematically distort the circumstances of a disadvantaged group [38, 52]. See Section 2.3 for discussion.

  3. For an excellent discussion of the nature and causes of algorithmic bias simpliciter, that is, outside the specific context of clinical medicine, see [16].

  4. The problem is underscored by the fact that, absent equal base rates across subpopulations or perfect predictive performance, binary classifiers cannot equalize precision and false positive/negative rates [7]. See Kleinberg et al. [41] for an analogous result for continuous risk scores. For discussion of the significance of the fairness impossibility theorems for healthcare, see Grote and Keeling [28] and Grote and Keeling [29].
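
     A minimal numerical sketch of this point (illustrative only; the function, group labels, and numbers below are hypothetical and not drawn from [7]) shows that if two groups receive identical true and false positive rates but have different base rates, the implied precision necessarily differs:

```python
# Illustrative sketch: with equal TPR and FPR across two groups but unequal
# base rates, positive predictive value (precision) cannot also be equal
# unless prediction is perfect (cf. Chouldechova [7]).

def precision(prevalence: float, tpr: float, fpr: float) -> float:
    """Positive predictive value implied by prevalence, TPR, and FPR."""
    true_positives = tpr * prevalence
    false_positives = fpr * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

tpr, fpr = 0.8, 0.2                    # identical error profile for both groups
prevalence_a, prevalence_b = 0.3, 0.1  # hypothetical disease base rates

print(round(precision(prevalence_a, tpr, fpr), 2))  # 0.63
print(round(precision(prevalence_b, tpr, fpr), 2))  # 0.31
```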

  5. The implication here is not that all performance biases are explained by biases in training data, as biases can arise at every stage of the machine learning pipeline [61]. Rather, the claim is that, in a broad class of cases (at least in healthcare, where demographic data biases are widespread and pervasive), performance biases arise due to biases in data sets, such as under-representative or misrepresentative training data. Such is the extent of data biases that it makes sense for organizations like the FDA to orient their general bias mitigation advice around data representativeness [cf. 17].

  6. This formulation is rough because, strictly speaking, tokens rather than words are input to the function; that is, input sequences of text are first tokenized [see 26].
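
     As a concrete illustration of the word/token distinction (an aside assuming the Hugging Face `transformers` library and the public `bert-base-uncased` checkpoint, neither of which is discussed in the paper), a subword tokenizer converts raw text into token ids before the model sees it:

```python
# Illustrative sketch: the model consumes token ids, not words; less common
# clinical terms are typically split into several subword tokens.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

text = "Patient denies dyspnoea or angina."
tokens = tokenizer.tokenize(text)                    # e.g. ['patient', 'denies', 'dys', ...] with '##'-prefixed subwords
token_ids = tokenizer.convert_tokens_to_ids(tokens)  # the integers actually input to the model
print(tokens)
print(token_ids)
```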

  7. A third case, bracketed here, is sequence-to-sequence models such as Google’s T5, which include both an encoder and a decoder [58].

  8. These examples are illustrative and are not intended as an exhaustive taxonomy.

  9. Sun et al. [71, p. 206] note that ‘the use of negative descriptors might not necessarily reflect bias among individual providers; rather, it may reflect a broader systemic acceptability of using negative patient descriptors as a surrogate for identifying structural barriers’.

  10. Note that this issue is not unique to LLMs. Similar considerations hold for models and research studies that rely on discrete EHR data, which can also encode biases [cf. 63].

  11. Ethical fine-tuning can also be achieved via non-supervised approaches such as Reinforcement Learning from Human Feedback [2, 53]. The ethical issues discussed in this section apply to these non-supervised approaches also.

References

  1. Abid, A., Farooqi, M., Zou, J.: Persistent anti-Muslim bias in large language models. In: Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, pp. 298–306 (2021)

  2. Bai, Y., Jones, A., Ndousse, K., Askell, A., Chen, A., DasSarma, N., Drain, D., Fort, S., Ganguli, D., Henighan, T., et al.: Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv preprint arXiv:2204.05862 (2022)

  3. Bender, E. M., Gebru, T., McMillan-Major, A., Shmitchell, S.: On the dangers of stochastic parrots: Can language models be too big? In: Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, pages 610–623 (2021)

  4. Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A., Brunskill, E., et al.: On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021)

  5. Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. Adv. Neural. Inf. Process. Syst. 33, 1877–1901 (2020)

  6. Challen, R., Denny, J., Pitt, M., Gompels, L., Edwards, T., Tsaneva-Atanasova, K.: Artificial intelligence, bias and clinical safety. BMJ Qual. Saf. 28(3), 231–237 (2019)

  7. Chouldechova, A.: Fair prediction with disparate impact: a study of bias in recidivism prediction instruments. Big Data 5(2), 153–163 (2017)

  8. Chowdhury, A., Rosenthal, J., Waring, J., Umeton, R.: Applying self-supervised learning to medicine: review of the state of the art and medical implementations. Informatics 8(3), 59 (2021). (MDPI)

  9. Cirillo, D., Catuara-Solarz, S., Morey, C., Guney, E., Subirats, L., Mellino, S., Gigante, A., Valencia, A., Rementeria, M.J., Chadha, A.S., et al.: Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare. NPJ Digital Med. 3(1), 81 (2020)

  10. Daneshjou, R., Vodrahalli, K., Novoa, R.A., Jenkins, M., Liang, W., Rotemberg, V., Ko, J., Swetter, S.M., Bailey, E.E., Gevaert, O., et al.: Disparities in dermatology AI performance on a diverse, curated clinical image set. Sci. Adv. 8(31), eabq6147 (2022)

  11. Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

  12. Department of Health and Human Services. Artificial intelligence (AI) strategy (2022)

  13. Department of Health and Social Care. £21 million to roll out artificial intelligence across the NHS (2023)

  14. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  15. Dieterich, W., Mendoza, C., Brennan, T.: COMPAS risk scales: demonstrating accuracy equity and predictive parity. Northpointe Inc. 7(7.4), 1 (2016)

  16. Fazelpour, S., Danks, D.: Algorithmic bias: senses, sources, solutions. Philos. Compass 16(8), e12760 (2021)

  17. Food and Drug Administration. Artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) action plan. Food Drug Admin., Silver Spring, MD, USA, Tech. Rep. 1 (2021a)

  18. Food and Drug Administration. Good machine learning practice for medical device development: Guiding principles, (2021b)

  19. Frosch, D.L., May, S.G., Rendle, K.A., Tietbohl, C., Elwyn, G.: Authoritarian physicians and patients’ fear of being labeled ‘difficult’ among key obstacles to shared decision making. Health Aff. 31(5), 1030–1038 (2012)

  20. Ganguli, D., Lovitt, L., Kernion, J., Askell, A., Bai, Y., Kadavath, S., Mann, B., Perez, E., Schiefer, N., Ndousse, K., et al.: Red teaming language models to reduce harms: methods, scaling behaviors, and lessons learned. arXiv preprint arXiv:2209.07858 (2022)

  21. García-Méndez, S., De Arriba-Pérez, F., González-Castaño, F.J., Regueiro-Janeiro, J.A., Gil-Castiñeira, F.: Entertainment chatbot for the digital inclusion of elderly people without abstraction capabilities. IEEE Access 9, 75878–75891 (2021)

  22. Genin, K., Grote, T.: Randomized controlled trials in medical AI: a methodological critique. Philos. Med. 2(1), 1–15 (2021)

  23. Gianattasio, K.Z., Prather, C., Glymour, M.M., Ciarleglio, A., Power, M.C.: Racial disparities and temporal trends in dementia misdiagnosis risk in the United States. Alzheimer’s Dement: Transl. Res. Clin. Interventions 5, 891–898 (2019)

  24. Gianfrancesco, M.A., Tamang, S., Yazdany, J., Schmajuk, G.: Potential biases in machine learning algorithms using electronic health record data. JAMA Intern. Med. 178(11), 1544–1547 (2018)

  25. Gramling, R., Stanek, S., Ladwig, S., Gajary-Coots, E., Cimino, J., Anderson, W., Norton, S.A., Aslakson, R.A., Ast, K., Elk, R., et al.: Feeling heard and understood: a patient-reported quality measure for the inpatient palliative care setting. J. Pain Symptom Manage. 51(2), 150–154 (2016)

  26. Grefenstette, G.: Tokenization. Syntactic Wordclass Tagging, pages 117–133, (1999)

  27. Groh, M., Harris, C., Soenksen, L., Lau, F., Han, R., Kim, A., Koochek, A., Badri, O.: Evaluating deep neural networks trained on clinical images in dermatology with the Fitzpatrick 17k dataset. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1820–1828 (2021)

  28. Grote, T., Keeling, G.: On algorithmic fairness in medical practice. Camb. Q. Healthc. Ethics 31(1), 83–94 (2022)

  29. Grote, T., Keeling, G.: Enabling fairness in healthcare through machine learning. Ethics Inf. Technol. 24(3), 39 (2022)

  30. Hall, W.J., Chapman, M.V., Lee, K.M., Merino, Y.M., Thomas, T.W., Payne, B.K., Eng, E., Day, S.H., Coyne-Beasley, T.: Implicit racial/ethnic bias among health care professionals and its influence on health care outcomes: a systematic review. Am. J. Public Health 105(12), e60–e76 (2015)

  31. Halpern, S.D., Loewenstein, G., Volpp, K.G., Cooney, E., Vranas, K., Quill, C.M., McKenzie, M.S., Harhay, M.O., Gabler, N.B., Silva, T., et al.: Default options in advance directives influence how patients set goals for end-of-life care. Health Aff. 32(2), 408–417 (2013)

  32. Hasan, O., Meltzer, D.O., Shaykevich, S.A., Bell, C.M., Kaboli, P.J., Auerbach, A.D., Wetterneck, T.B., Arora, V.M., Zhang, J., Schnipper, J.L.: Hospital readmission in general medicine patients: a prediction model. J. Gen. Intern. Med. 25, 211–219 (2010)

  33. Haug, C.J., Drazen, J.M.: Artificial intelligence and machine learning in clinical medicine, 2023. N. Engl. J. Med. 388(13), 1201–1208 (2023)

  34. Hedden, B.: On statistical criteria of algorithmic fairness. Philos. Public Aff. 49(2), 209–231 (2021)

  35. Hellström, T., Dignum, V., Bensch, S.: Bias in machine learning–what is it good for? arXiv preprint arXiv:2004.00686, (2020)

  36. Huang, K., Altosaar, J., Ranganath, R.: ClinicalBERT: modeling clinical notes and predicting hospital readmission. arXiv preprint arXiv:1904.05342 (2019)

  37. Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y.J., Madotto, A., Fung, P.: Survey of hallucination in natural language generation. ACM Comput. Surv. 55(12), 1–38 (2023)

  38. Jiang, H., Nachum, O.: Identifying and correcting label bias in machine learning. In: International Conference on Artificial Intelligence and Statistics, pages 702–712. PMLR, (2020)

  39. Karystianis, G., Cabral, R.C., Han, S.C., Poon, J., Butler, T.: Utilizing text mining, data linkage and deep learning in police and health records to predict future offenses in family and domestic violence. Front. Digital Health 3, 602683 (2021)

  40. Kelly, B.S., Judge, C., Bollard, S.M., Clifford, S.M., Healy, G.M., Aziz, A., Mathur, P., Islam, S., Yeom, K.W., Lawlor, A., et al.: Radiology artificial intelligence: a systematic review and evaluation of methods (RAISE). Eur. Radiol. 32(11), 7998–8007 (2022)

  41. Kleinberg, J., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair determination of risk scores. arXiv preprint arXiv:1609.05807 (2016)

  42. Laurençon, H., Saulnier, L., Wang, T., Akiki, C., Villanova del Moral, A., Le Scao, T., Von Werra, L., Mou, C., González Ponferrada, E., Nguyen, H., et al.: The BigScience ROOTS corpus: a 1.6 TB composite multilingual dataset. Adv. Neural. Inf. Process. Syst. 35, 31809–31826 (2022)

  43. Lee, C.S., Lee, A.Y.: Clinical applications of continual learning machine learning. Lancet Digital Health 2(6), e279–e281 (2020)

  44. Lee, J., Yoon, W., Kim, S., Kim, D., Kim, S., So, C.H., Kang, J.: BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36(4), 1234–1240 (2020)

  45. Lin, C., Bethard, S., Dligach, D., Sadeque, F., Savova, G., Miller, T.A.: Does BERT need domain adaptation for clinical negation detection? J. Am. Med. Inform. Assoc. 27(4), 584–591 (2020)

  46. Liu, Q., Kusner, M. J., Blunsom, P.: A survey on contextual embeddings. arXiv preprint arXiv:2003.07278, (2020)

  47. McNeil, B.J., Pauker, S.G., Sox, H.C., Jr., Tversky, A.: On the elicitation of preferences for alternative therapies. N. Engl. J. Med. 306(21), 1259–1262 (1982)

  48. Mitsios, J.P., Ekinci, E.I., Mitsios, G.P., Churilov, L., Thijs, V.: Relationship between glycated hemoglobin and stroke risk: a systematic review and meta-analysis. J. Am. Heart Assoc. 7(11), e007858 (2018)

  49. Mosteiro, P., Rijcken, E., Zervanou, K., Kaymak, U., Scheepers, F., Spruit, M.: Machine learning for violence risk assessment using Dutch clinical notes. arXiv preprint arXiv:2204.13535 (2022)

  50. Norori, N., Hu, Q., Aellen, F.M., Faraci, F.D., Tzovara, A.: Addressing bias in big data and AI for health care: a call for open science. Patterns 2(10), 100347 (2021)

  51. Norton, S.A., Tilden, V.P., Tolle, S.W., Nelson, C.A., Eggman, S.T.: Life support withdrawal: communication and conflict. Am. J. Crit. Care 12(6), 548–555 (2003)

  52. Obermeyer, Z., Powers, B., Vogeli, C., Mullainathan, S.: Dissecting racial bias in an algorithm used to manage the health of populations. Science 366(6464), 447–453 (2019)

  53. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., et al.: Training language models to follow instructions with human feedback. Adv. Neural. Inf. Process. Syst. 35, 27730–27744 (2022)

  54. Panch, T., Mattie, H., Atun, R.: Artificial intelligence and algorithmic bias: implications for health systems. J. Global Health 149 (2019)

  55. Peng, Y., Yan, S., Lu, Z.: Transfer learning in biomedical natural language processing: an evaluation of BERT and ELMo on ten benchmarking datasets. arXiv preprint arXiv:1906.05474 (2019)

  56. Perez, E., Huang, S., Song, F., Cai, T., Ring, R., Aslanides, J., Glaese, A., McAleese, N., Irving, G.: Red teaming language models with language models. arXiv preprint arXiv:2202.03286, (2022)

  57. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)

  58. Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research 21(1), 5485–5551 (2020)

  59. Rahimi, S., Oktay, O., Alvarez-Valle, J., Bharadwaj, S.: Addressing the exorbitant cost of labeling medical images with active learning. In: International Conference on Machine Learning in Medical Imaging and Analysis, page 1, (2021)

  60. Rajkomar, A., Hardt, M., Howell, M.D., Corrado, G., Chin, M.H.: Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 169(12), 866–872 (2018)

  61. Rajkomar, A., Dean, J., Kohane, I.: Machine learning in medicine. N. Engl. J. Med. 380(14), 1347–1358 (2019)

  62. Rasmy, L., Xiang, Y., Xie, Z., Tao, C., Zhi, D.: Med-BERT: pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digital Med. 4(1), 86 (2021)

  63. Ross, A.B., Kalia, V., Chan, B.Y., Li, G.: The influence of patient race on the use of diagnostic imaging in United States emergency departments: data from the National Hospital Ambulatory Medical Care Survey. BMC Health Serv. Res. 20(1), 1–10 (2020)

  64. Secinaro, S., Calandra, D., Secinaro, A., Muthurangu, V., Biancone, P.: The role of artificial intelligence in healthcare: a structured literature review. BMC Med. Inform. Decis. Mak. 21, 1–23 (2021)

  65. Shamout, F., Zhu, T., Clifton, D.A.: Machine learning for clinical outcome prediction. IEEE Rev. Biomed. Eng. 14, 116–126 (2020)

  66. Shang, J., Ma, T., Xiao, C., Sun, J.: Pre-training of graph augmented transformers for medication recommendation. arXiv preprint arXiv:1906.00346, (2019)

  67. Sheng, E., Chang, K.-W., Natarajan, P., Peng, N.: The woman worked as a babysitter: on biases in language generation. arXiv preprint arXiv:1909.01326 (2019)

  68. Singhal, K., Azizi, S., Tu, T., Mahdavi, S. S., Wei, J., Chung, H. W., Scales, N., Tanwani, A., Cole-Lewis, H., Pfohl, S., et al.: Large language models encode clinical knowledge. arXiv preprint arXiv:2212.13138 (2022)

  69. Sirrianni, J., Sezgin, E., Claman, D., Linwood, S.L.: Medical text prediction and suggestion using generative pretrained transformer models with dental medical notes. Methods Inf. Med. 61(05/06), 195–200 (2022)

  70. Stephenson, J.: Racial barriers may hamper diagnosis, care of patients with Alzheimer disease. JAMA 286(7), 779–780 (2001)

  71. Sun, M., Oliwa, T., Peek, M.E., Tung, E.L.: Negative patient descriptors: documenting racial bias in the electronic health record. Health Aff. 41(2), 203–211 (2022)

  72. Thoppilan, R., De Freitas, D., Hall, J., Shazeer, N., Kulshreshtha, A., Cheng, H.-T., Jin, A., Bos, T., Baker, L., Du, Y., et al.: LaMDA: language models for dialog applications. arXiv preprint arXiv:2201.08239 (2022)

  73. Tschandl, P., Rosendahl, C., Akay, B.N., Argenziano, G., Blum, A., Braun, R.P., Cabo, H., Gourhant, J.-Y., Kreusch, J., Lallas, A., et al.: Expert-level diagnosis of nonpigmented skin cancer by combined convolutional neural networks. JAMA Dermatol. 155(1), 58–65 (2019)

  74. Uthoff, J., Nagpal, P., Sanchez, R., Gross, T.J., Lee, C., Sieren, J.C.: Differentiation of non-small cell lung cancer and histoplasmosis pulmonary nodules: insights from radiomics model performance compared with clinician observers. Translational Lung Cancer Res. 8(6), 979 (2019)

  75. van Wezel, M. M., Croes, E. A., Antheunis, M. L.: “I’m here for you”: can social chatbots truly support their users? A literature review. In: Chatbot Research and Design: 4th International Workshop, CONVERSATIONS 2020, Virtual Event, November 23–24, 2020, Revised Selected Papers 4, pp. 96–113. Springer (2021)

  76. Wang, L., Mujib, M. I., Williams, J., Demiris, G., Huh-Yoo, J.: An evaluation of generative pre-training model-based therapy chatbot for caregivers. arXiv preprint arXiv:2107.13115, (2021)

  77. Ware, O.R., Dawson, J.E., Shinohara, M.M., Taylor, S.C.: Racial limitations of Fitzpatrick skin type. Cutis 105(2), 77–80 (2020)

  78. Weidinger, L., Mellor, J., Rauh, M., Griffin, C., Uesato, J., Huang, P.-S., Cheng, M., Glaese, M., Balle, B., Kasirzadeh, A., et al.: Ethical and social risks of harm from language models. arXiv preprint arXiv:2112.04359, (2021)

  79. Weiss, K., Khoshgoftaar, T.M., Wang, D.: A survey of transfer learning. J. Big data 3(1), 1–40 (2016)

  80. Willemink, M.J., Koszek, W.A., Hardell, C., Wu, J., Fleischmann, D., Harvey, H., Folio, L.R., Summers, R.M., Rubin, D.L., Lungren, M.P.: Preparing medical imaging data for machine learning. Radiology 295(1), 4–15 (2020)

  81. Zhang, S., Roller, S., Goyal, N., Artetxe, M., Chen, M., Chen, S., Dewan, C., Diab, M., Li, X., Lin, X. V., et al.: OPT: open pre-trained transformer language models. arXiv preprint arXiv:2205.01068 (2022)

  82. Zhao, W., Katzmarzyk, P.T., Horswell, R., Wang, Y., Johnson, J., Hu, G.: Sex differences in the risk of stroke and HbA1c among diabetic patients. Diabetologia 57, 918–926 (2014)

  83. Zhou, K., Ethayarajh, K., Jurafsky, D.: Frequency-based distortions in contextualized word embeddings. arXiv preprint arXiv:2104.08465, (2021)

Acknowledgement

The author is grateful to Michael Howell, Heather Cole-Lewis, Lisa Lehmann, Diane Korngiebel, Thomas Douglas, Bakul Patel, Kate Weber, and Rachel Gruner for helpful comments, alongside participants at the Workshop on the Ethics of Influence at the Uehiro Centre for Practical Ethics at the University of Oxford.

Funding

This study was funded by Google LLC and/or a subsidiary thereof (‘Google’).

Author information

Corresponding author

Correspondence to Geoff Keeling.

Ethics declarations

Conflict of interest

The author(s) are current or former employees of Google LLC and own stock as part of the standard compensation package.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Keeling, G. Algorithmic bias, generalist models, and clinical medicine. AI Ethics (2023). https://doi.org/10.1007/s43681-023-00329-x
