
Artificial intelligence bias in medical system designs: a systematic review

Published in: Multimedia Tools and Applications

Abstract

Inherent bias in an artificial intelligence (AI) model introduces inaccuracies and variability when the model is deployed clinically. Recognizing the source of bias in an AI model is challenging because of variations across datasets and the black-box nature of system design, and there is no established process for identifying the potential sources of bias. To the best of our knowledge, this is the first review of its kind to address bias in AI models by categorizing 48 studies into three classes: point-based, image-based, and hybrid-based AI models. A PRISMA selection strategy was adopted to select 72 crucial AI studies for identifying bias in AI models. Using the three classes, bias was identified in these studies on the basis of 44 critical AI attributes. Bias in the AI models was computed by analytical, butterfly, and ranking-based bias models; these bias models were evaluated by two experts and compared using variability analysis. In all three classes, AI studies lacking sufficient AI attributes were more prone to risk of bias (RoB), and studies with high RoB lose fins in the butterfly model. The analysis shows that the majority of healthcare studies suffer from data bias and algorithmic bias, owing to incomplete specifications in the design protocol and weak AI designs used for prediction.
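The abstract describes scoring each study against 44 critical AI attributes and binning the result into risk-of-bias levels. The paper's analytical bias model is not reproduced here; as a rough illustration only (the function names, the linear score, and the grade cutoffs below are all hypothetical assumptions, not the authors' published method), attribute-based RoB scoring might be sketched as:

```python
# Hypothetical sketch of attribute-based risk-of-bias (RoB) scoring.
# The review evaluates each study against 44 critical AI attributes; the
# exact analytical model is not specified in this excerpt, so the score
# and cutoffs below are illustrative assumptions, not published values.

TOTAL_ATTRIBUTES = 44

def rob_score(satisfied_attributes: int, total: int = TOTAL_ATTRIBUTES) -> float:
    """Fraction of critical AI attributes a study fails to satisfy (0..1)."""
    if not 0 <= satisfied_attributes <= total:
        raise ValueError("satisfied_attributes out of range")
    return 1.0 - satisfied_attributes / total

def rob_grade(score: float) -> str:
    """Bin a score into low / moderate / high RoB (hypothetical cutoffs)."""
    if score < 0.25:
        return "low"
    if score < 0.5:
        return "moderate"
    return "high"

# Example: a study reporting only 18 of the 44 attributes.
study = {"name": "example-study", "satisfied": 18}
score = rob_score(study["satisfied"])
print(study["name"], round(score, 2), rob_grade(score))
```

On this toy scale, a study that omits more attributes scores higher and lands in a worse RoB bin, which mirrors the abstract's observation that studies lacking sufficient AI attributes are more prone to RoB.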


[Figures 1–11 appear in the full article.]


Data availability

Data will be made available by the corresponding author on reasonable request.

Abbreviations

ACC: Accuracy

ANOVA: Analysis of variance

ANCOVA: Analysis of covariance

AI: Artificial intelligence

AUC: Area under the curve

AUPRC: Area under the precision-recall curve

BMI: Body mass index

BI: Brain imaging

CMR: Cardiac magnetic resonance

CVD: Cardiovascular disease

CKD: Chronic kidney disease

CXR: Chest X-ray

CT: Computed tomography

CAD: Computer-aided diagnosis

CI: Confidence index

COI: Coronary imaging

COVID-19: Coronavirus disease 2019

COMPAS: Correctional Offender Management Profiling for Alternative Sanctions

CHARMS: Critical appraisal and data extraction for systematic reviews of prediction modelling studies

DL: Deep learning

EHR: Electronic health record

ECG: Electrocardiography

GDPR: General Data Protection Regulation

HDL: Hybrid deep learning

HD-ai: Hybrid data-based AI model

ID-ai: Image data-based AI model

LVEF: Left ventricular ejection fraction

ML: Machine learning

MRI: Magnetic resonance imaging

MCC: Matthews correlation coefficient

MSE: Mean squared error

PD-ai: Point data-based AI model

PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

RT-PCR: Reverse transcription polymerase chain reaction

RoB: Risk of bias

ROBINS-I: Risk Of Bias In Non-randomized Studies of Interventions

SEN: Sensitivity

SI: Stroke imaging

SPE: Specificity

USS: Ultrasound

WHO: World Health Organization

References

  1. Abbasi-Sureshjani S, Raumanns R, Michels BE, Schouten G, Cheplygina V (2020) Risk of training diagnostic algorithms on data with demographic bias. Interpretable and Annotation-Efficient Learning for Medical Image Computing. Springer, pp 183–192

    Google Scholar 

  2. Acharya UR, Faust O, Alvin A, Krishnamurthi G, Seabra JC, Sanches J, Suri JS (2013) Understanding symptomatology of atherosclerotic plaque by image-based tissue characterization. Comput Methods Programs Biomed 110(1):66–75

    Google Scholar 

  3. Acharya UR, Mookiah MRK, Sree SV, Yanti R, Martis R, Saba L, Suri JS (2014) Evolutionary algorithm-based classifier parameter tuning for automatic ovarian cancer tissue characterization and classification. Ultraschall in der Medizin-European Journal of Ultrasound 35(03):237–245

    Google Scholar 

  4. Agarwal M, Agarwal S, Saba L, Chabert GL, Gupta S, Carriero A, Pasche A, Danna P, Mehmedovic A, Faa G, Shrivastav S, Jain K, Jain H, Jujaray T, Singh MI, Turk M, Chadha SP, Johri MA, Khanna NN, Suri JS (2022) Eight pruning deep learning models for low storage and high-speed COVID-19 computed tomography lung segmentation and heatmap-based lesion localization: A multicenter study using COVLIAS 2.0. Comput Biol Med 146:105571

  5. Ahmed S, Athyaab SA, Muqtadeer SA (2021) Attenuation of human Bias in artificial intelligence: An exploratory approach. In: 2021 6th international conference on inventive computation technologies (ICICT). IEEE, pp 557–563

    Google Scholar 

  6. Ahn E, Kumar A, Fulham M, Feng D, Kim J (2020) Unsupervised domain adaptation to classify medical images using zero-bias convolutional auto-encoders and context-based feature augmentation. IEEE Trans Med Imaging 39(7):2385–2394

    Google Scholar 

  7. Akter S, McCarthy G, Sajib S, Michael K, Dwivedi YK, D’Ambra J, Shen K (2021) Algorithmic bias in data-driven innovation in the age of AI. Int J Inf Manage 60:102387

    Google Scholar 

  8. Álvarez-Rodríguez L, Moura Jd, Novo J, Ortega M (2022) Does imbalance in chest X-ray datasets produce biased deep learning approaches for COVID-19 screening? BMC Medical Research Methodology 22(1):1–17

    Google Scholar 

  9. Araki T, Ikeda N, Shukla D, Jain PK, Londhe ND, Shrivastava VK, Shafique S (2016) PCA-based polling strategy in machine learning framework for coronary artery disease risk assessment in intravascular ultrasound: A link between carotid and coronary grayscale plaque morphology. Computer methods and programs in biomedicine 128:137–158

    Google Scholar 

  10. Ashraf A, Khan S, Bhagwat N, Chakravarty M, Taati B (2018) Learning to unlearn: Building immunity to dataset bias in medical imaging studies. arXiv preprint arXiv:181201716

  11. Assayag D, Morisset J, Johannson KA, Wells AU, Walsh SL (2020) Patient gender bias on the diagnosis of idiopathic pulmonary fibrosis. Thorax 75(5):407–412

    Google Scholar 

  12. Banchhor SK, Londhe ND, Araki T, Saba L, Radeva P, Khanna NN, Suri JS (2018) Calcium detection, its quantification, and grayscale morphology-based risk stratification using machine learning in multimodality big data coronary and carotid scans: A review. Comput Biol Med 101:184–198

    Google Scholar 

  13. Baylor D, Breck E, Cheng HT, Fiedel N, Foo CY, Haque Z, Zinkevich M (2017) Tfx: A tensorflow-based production-scale machine learning platform. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1387–1395

    Google Scholar 

  14. Beesley LJ, Mukherjee B (2022) Statistical inference for association studies using electronic health records: Handling both selection bias and outcome misclassification. Biometrics 78(1):214–226

    MathSciNet  Google Scholar 

  15. Belenguer L (2022) AI bias: Exploring discriminatory algorithmic decision-making models and the application of possible machine-centric solutions adapted from the pharmaceutical industry. AI and Ethics 2(4):771–787

    Google Scholar 

  16. Bellamy RK, Dey K, Hind M, Hoffman SC, Houde S, Kannan K, Zhang Y (2019) AI fairness 360: An extensible toolkit for detecting and mitigating algorithmic bias. IBM J Res Dev 63(4/5):4–1

    Google Scholar 

  17. Benjamin M, Gagnon P, Rostamzadeh N, Pal C, Bengio Y, Shee A (2019) Towards standardization of data licenses: the Montreal data license. arXiv preprint arXiv:190312262

  18. Berg T, Burg V, Gombović A, Puri M (2020) On the rise of fintechs: Credit scoring using digital footprints. The Review of Financial Studies 33(7):2845–2897

    Google Scholar 

  19. Byrne MD (2021) Reducing bias in healthcare artificial intelligence. J Perianesth Nurs 36(3):313–316

    Google Scholar 

  20. Calderon-Ramirez S, Yang S, Moemeni A, Colreavy-Donnelly S, Elizondo DA, Oala L, Molina-Cabello MA (2021) Improving uncertainty estimation with semi-supervised deep learning for covid-19 detection using chest x-ray images. Ieee Access 9:85442–85454

    Google Scholar 

  21. Calmon F, Wei D, Vinzamuri B, Natesan Ramamurthy K, Varshney KR (2017) Optimized preprocessing for discrimination prevention. Adv Neural Inf Proces Syst 30

  22. Cardenas S, Vallejo-Cardenas SF (2019) Continuing the conversation on how structural racial and ethnic inequalities affect AI biases. In: 2019 IEEE international symposium on technology and society (ISTAS). IEEE, pp 1–7

    Google Scholar 

  23. Catalá ODT, Igual IS, Pérez-Benito FJ, Escrivá DM, Castelló VO, Llobet R, Peréz-Cortés J-C (2021) Bias analysis on public X-ray image datasets of pneumonia and COVID-19 patients. IEEE Access 9:42370–42383

    Google Scholar 

  24. Cau R, Bassareo P, Caredda G, Suri JS, Esposito A, Saba L (2022a) Atrial strain by feature-tracking cardiac magnetic resonance imaging in takotsubo cardiomyopathy. Features, feasibility, and reproducibility. Can Assoc Radiol J 73(3):573–580

    Google Scholar 

  25. Cau R, Bassareo P, Suri JS, Pontone G, Saba L (2022b) The emerging role of atrial strain assessed by cardiac MRI in different cardiovascular settings: An up-to-date review. Eur Radiol 32(7):4384–4394

    Google Scholar 

  26. Celi LA, Cellini J, Charpignon M-L, Dee EC, Dernoncourt F, Eber R, Situ J (2022) Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review. PLOS Digital Health 1(3):e0000022

    Google Scholar 

  27. Challen R, Denny J, Pitt M, Gompels L, Edwards T, Tsaneva-Atanasova K (2019) Artificial intelligence, bias and clinical safety. BMJ Qual Saf 28(3):231–237

    Google Scholar 

  28. Chung H, Park C, Kang WS, Lee J (2021) Gender Bias in artificial intelligence: Severity prediction at an early stage of COVID-19. Front Physiol 2104

  29. Cirillo D, Catuara-Solarz S, Morey C, Guney E, Subirats L, Mellino S, Chadha AS (2020) Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare. NPJ digital medicine 3(1):1–11

    Google Scholar 

  30. Corrias G, Mazzotta A, Melis M, Cademartiri F, Yang Q, Suri JS, Saba L (2021) Emerging role of artificial intelligence in stroke imaging. Expert Rev Neurother 21(7):745–754

    Google Scholar 

  31. d’Alessandro B, O’Neil C, LaGatta T (2017) Conscientious classification: A data scientist’s guide to discrimination-aware classification. Big data 5(2):120–134

    Google Scholar 

  32. Danks D, London AJ (2017) Algorithmic Bias in autonomous systems. IJCAI 17(2017):4691–4697

    Google Scholar 

  33. Dankwa-Mullan I, Weeraratne D (2022) Artificial Intelligence and Machine Learning Technologies in Cancer Care: Addressing Disparities, Bias, and Data Diversity. Cancer Discov 12(6):1423–1427

    Google Scholar 

  34. Das, S., G. Nayak, L. Saba, M. Kalra, J.S. Suri, and S. Saxena, (2022) An artificial intelligence framework and its bias for brain tumor segmentation: A narrative review. Computers in Biology and Medicine 105273

  35. Dastin J (2018) Amazon scraps secret AI recruiting tool that showed bias against women. In: Ethics of data and analytics. Auerbach Publications, pp 296–299

    Google Scholar 

  36. Kumar A, Jain R, Gupta M, Islam SM, (Eds.). (2023a) 6G-enabled IoT and AI for smart healthcare: Challenges, impact, and analysis. CRC Press

    Google Scholar 

  37. El-Baz A, Jiang X, Suri JS (Eds.) (2016) Biomedical image segmentation: Advances and trends

  38. El-Baz A, Suri JS (eds) (2019) Big data in multimodal medical imaging. CRC Press

    Google Scholar 

  39. Esteva A, Chou K, Yeung S, Naik N, Madani A, Mottaghi A, Socher R (2021) Deep learning-enabled medical computer vision. NPJ digital medicine 4(1):1–9

    Google Scholar 

  40. Estiri H, Strasser ZH, Rashidian S, Klann JG, Wagholikar KB, McCoy TH Jr, Murphy SN (2022) An objective framework for evaluating unrecognized bias in medical AI models predicting COVID-19 outcomes. J Am Med Inform Assoc 29(8):1334–1341

    Google Scholar 

  41. Ezzeddine H, Awad M, Abi Ghanem AS, Mourani B (2021) On data Bias and the usability of deep learning algorithms in classifying COVID-19 based on chest X-ray. In: 2021 IEEE 3rd International Multidisciplinary Conference on Engineering Technology (IMCET). IEEE, pp 136–143

    Google Scholar 

  42. Falco G (2019) Participatory AI: Reducing AI bias and developing socially responsible AI in smart cities. In: 2019 IEEE international conference on computational science and engineering (CSE) and IEEE international conference on embedded and ubiquitous computing (EUC). IEEE, pp 154–158

    Google Scholar 

  43. Fernandez-Quilez A (2023) Deep learning in radiology: Ethics of data and on the value of algorithm transparency, interpretability and explainability. AI and Ethics 3(1):257–265

    Google Scholar 

  44. Ferrer X, van Nuenen T, Such JM, Coté M, Criado N (2021) Bias and Discrimination in AI: A cross-disciplinary perspective. IEEE Technol Soc Mag 40(2):72–80

    Google Scholar 

  45. Gallifant J, Zhang J, M.d.P.A. Lopez, T. Zhu, L. Camporota, L.A. Celi, and F. Formenti, (2022) Artificial intelligence for mechanical ventilation: Systematic review of design, reporting standards, and bias. British Journal of Anaesthesia 128(2):343–351

    Google Scholar 

  46. Gebru T, Morgenstern J, Vecchione B, Vaughan JW, Wallach H, Iii HD, Crawford K (2021) Datasheets for datasets. Commun ACM 64(12):86–92

    Google Scholar 

  47. Geis JR, Brady AP, Wu CC, Spencer J, Ranschaert E, Jaremko JL, Shields WF (2019) Ethics of artificial intelligence in radiology: Summary of the joint European and North American multisociety statement. Canadian Association of Radiologists Journal 70(4):329–334

    Google Scholar 

  48. Georgopoulos M, Panagakis Y, Pantic M (2020) Investigating bias in deep face analysis: The kanface dataset and empirical study. Image Vis Comput 102:103954

    Google Scholar 

  49. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G (2018) Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med 178(11):1544–1547

    Google Scholar 

  50. Greshake Tzovaras B, Tzovara A (2019) The personal data is political. The ethics of medical data donation:133–140

  51. Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D (2019) A survey of methods for explaining black box models. ACM computing surveys (CSUR) 51(5):1–42

    Google Scholar 

  52. Gurupur V, Wan TT (2020) Inherent bias in artificial intelligence-based decision support systems for healthcare. Medicina 56(3):141

    Google Scholar 

  53. De Hert P, Czerniawski M (2016) Expanding the European data protection scope beyond territory: Article 3 of the General Data Protection Regulation in its wider context. International Data Privacy Law 6(3):230–243

  54. Iosifidis V, Ntoutsi E (2019) Adafair: Cumulative fairness adaptive boosting. In: Proceedings of the 28th ACM international conference on information and knowledge management, pp 781–790

    Google Scholar 

  55. Jain R, Kumar A, Nayyar A, Dewan K, Garg R, Raman S, Ganguly S (2023) Explaining sentiment analysis results on social media texts through visualization. Multimed Tools Appl:1–17

  56. Jain PK, Sharma N, Kalra MK, Viskovic K, Saba L, Suri JS (2022) Four Types of Multiclass Frameworks for Pneumonia Classification and Its Validation in X-ray Scans Using Seven Types of Deep Learning Artificial Intelligence Models. Diagnostics 12(3):652

    Google Scholar 

  57. Jamthikar A, Gupta D, Johri AM, Mantella LE, Saba L, Suri JS (2022) A machine learning framework for risk prediction of multi-label cardiovascular events based on focused carotid plaque B-Mode ultrasound: A Canadian study. Comput Biol Med 140:105102

    Google Scholar 

  58. Jamthikar AD, Gupta D, Mantella LE, Saba L, Johri AM, Suri JS (2021) Ensemble Machine Learning and its Validation for Prediction of Coronary Artery Disease and Acute Coronary Syndrome using Focused Carotid Ultrasound. IEEE Trans Instrum Meas 71:1–10

    Google Scholar 

  59. Kallus N, Zhou A (2019) The fairness of risk scores beyond classification: Bipartite ranking and the xauc metric. Adv Neural Inf Proces Syst 32

  60. Kamishima T, Akaho S, Asoh H, Sakuma J (2012) Fairness-aware classifier with prejudice remover regularizer. In: Machine learning and knowledge discovery in databases: European conference, ECML PKDD 2012, Bristol, UK, September 24–28, 2012. Proceedings, part II 23. Springer, Berlin Heidelberg, pp 35–50

    Google Scholar 

  61. Kim DW, Jang HY, Kim KW, Shin Y, Park SH (2019) Design characteristics of studies reporting the performance of artificial intelligence algorithms for diagnostic analysis of medical images: Results from recently published papers. Korean J Radiol 20(3):405–410

    Google Scholar 

  62. Klann JG, Estiri H, Weber GM, Moal B, Avillach P, Hong C, Maulhardt T (2021) Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data. Journal of the American Medical Informatics Association 28(7):1411–1420

    Google Scholar 

  63. Krasanakis E, Spyromitros-Xioufis E, Papadopoulos S, Kompatsiaris Y (2018) Adaptive sensitive reweighting to mitigate bias in fairness-aware classification. In: Proceedings of the 2018 world wide web conference, pp 853–862

    Google Scholar 

  64. Krawczyk B (2016) Learning from imbalanced data: Open challenges and future directions. Progress in Artificial Intelligence 5(4):221–232

    Google Scholar 

  65. Kumar A, Ahluwalia R (2021) Breast Cancer detection using machine learning and its classification. In: Cancer prediction for industrial IoT 4.0: A machine learning perspective. Chapman and Hall/CRC, pp 65–78

    Google Scholar 

  66. Kumar A, Jain R (2021) Behavioral prediction of Cancer using machine learning. In: Cancer prediction for industrial IoT 4.0: A machine learning perspective. Chapman and Hall/CRC, pp 91–105

    Google Scholar 

  67. Kumar, A., R.S. Rai, and M. Gheisari, (2021) Prediction of Cervical Cancer Using Machine Learning, in Cancer Prediction for Industrial IoT 4.0: A Machine Learning Perspective Chapman and Hall/CRC 107–117

  68. Kumar A, Walia GS, Sharma K (2020) Recent trends in multicue based visual tracking: A review. Expert Syst Appl 162:113711

    Google Scholar 

  69. Kuppili V, Biswas M, Sreekumar A, Suri HS, Saba L, Edla DR, Suri JS (2017) Extreme learning machine framework for risk stratification of fatty liver disease using ultrasound tissue characterization. Journal of medical systems 41(10):1–20

    Google Scholar 

  70. Lambrecht A, Tucker C (2019) Algorithmic bias? An empirical study of apparent gender-based discrimination in the display of STEM career ads. Manage Sci 65(7):2966–2981

    Google Scholar 

  71. Landers RN, Behrend TS (2022) Auditing the AI auditors: A framework for evaluating fairness and bias in high stakes AI predictive models. American Psychologist

    Google Scholar 

  72. Lapuschkin S, Wäldchen S, Binder A, Montavon G, Samek W, Müller K-R (2019) Unmasking Clever Hans predictors and assessing what machines really learn. Nat Commun 10(1):1–8

    Google Scholar 

  73. Larrazabal AJ, Nieto N, Peterson V, Milone DH, Ferrante E (2020) Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc Natl Acad Sci 117(23):12592–12594

    Google Scholar 

  74. Le E, Wang Y, Huang Y, Hickman S, Gilbert F (2019) Artificial intelligence in breast imaging. Clin Radiol 74(5):357–366

    Google Scholar 

  75. Leavy S (2018) Gender bias in artificial intelligence: The need for diversity and gender theory in machine learning. In: Proceedings of the 1st international workshop on gender equality in software engineering, pp 14–16

    Google Scholar 

  76. Lee NT, Resnick P, Barton G (2019) Algorithmic bias detection and mitigation: Best practices and policies to reduce consumer harms. Brookings Institute, Washington, DC, USA

    Google Scholar 

  77. Lerman K, Hogg T (2014) Leveraging position bias to improve peer recommendation. PLoS ONE 9(6):e98914

    Google Scholar 

  78. Li Y, Chen J, Dong W, Zhu Y, Wu J, Xiong J, Qian T (2022) Mix-and-Interpolate: A Training Strategy to Deal with Source-biased Medical Data. IEEE Journal of Biomedical and Health Informatics 26(1):172–182

    Google Scholar 

  79. Li M, Hsu W, Xie X, Cong J, Gao W (2020) SACNN: Self-attention convolutional neural network for low-dose CT denoising with self-supervised perceptual loss network. IEEE Trans Med Imaging 39(7):2289–2301

    Google Scholar 

  80. Loftus JR, Russell C, Kusner MJ, Silva R (2018) Causal reasoning for algorithmic fairness. arXiv preprint arXiv:180505859

  81. Luo L, Xu D, Chen H, Wong TT, Heng PA (2022) Pseudo bias-balanced learning for debiased chest X-ray classification. In Medical image computing and computer assisted intervention–MICCAI 2022: 25th international conference, Singapore, September 18–22, 2022, proceedings, part VIII (pp. 621–631). Cham: Springer Nature Switzerland

  82. Maier-Hein L, Reinke A, Kozubek M, Martel AL, Arbel T, Eisenmann M, Landman BA (2020) BIAS: Transparent reporting of biomedical image analysis challenges. Med Image Anal 66:101796

    Google Scholar 

  83. Biswas M, Kuppili V, Saba L, Edla DR, Suri HS, Cuadrado-Godia E, Suri JS (2019) State-of the-art review on deep learning in medical imaging. Front Biosci-Landmark 24(3):380–406

  84. Marshall IJ, Kuiper J, Wallace BC (2015) Automating risk of bias assessment for clinical trials. IEEE J Biomed Health Inform 19(4):1406–1412

    Google Scholar 

  85. McDuff D, Cheng R, Kapoor A (2018) Identifying bias in AI using simulation. arXiv preprint arXiv:181000471

  86. McGregor C, Dewey C, Luan R (2021) Big data and artificial intelligence in healthcare: Ethical and social implications of neonatology. In: 2021 IEEE international symposium on technology and society (ISTAS). IEEE, pp 1–1

    Google Scholar 

  87. Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A (2021) A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR) 54(6):1–35

    Google Scholar 

  88. Meng C, Trinh L, Xu N, Enouen J, Liu Y (2022) Interpretability and fairness evaluation of deep learning models on MIMIC-IV dataset. Sci Rep 12(1):7166

    Google Scholar 

  89. Meng C, Trinh L, Xu N, Enouen J, Liu Y (2022) Interpretability and fairness evaluation of deep learning models on MIMIC-IV dataset. Sci Rep 12(1):1–28

    Google Scholar 

  90. Mia MR, Hoque ASML, Khan SI, Ahamed SI (2022) A privacy-preserving national clinical data warehouse: Architecture and analysis. Smart Health 23:100238

    Google Scholar 

  91. Mikołajczyk A, Grochowski M (2018) Data augmentation for improving deep learning in image classification problem. In: 2018 international interdisciplinary PhD workshop (IIPhDW). IEEE, pp 117–122

    Google Scholar 

  92. Minot JR, Cheney N, Maier M, Elbers DC, Danforth CM, Dodds PS (2022) Interpretable bias mitigation for textual data: Reducing genderization in patient notes while maintaining classification performance. ACM Trans Comput Healthc 3(4):1–41

    Google Scholar 

  93. Mitchell S, Potash E, Barocas S, D’Amour A, Lum K (2021) Algorithmic fairness: Choices, assumptions, and definitions. Annual Review of Statistics and Its Application 8:141–163

    MathSciNet  Google Scholar 

  94. Nagare M, Melnyk R, Rahman O, Sauer KD, Bouman CA (2021) A Bias-reducing loss function for CT image Denoising. In: ICASSP 2021–2021 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 1175–1179

    Google Scholar 

  95. Noriega M (2020) The application of artificial intelligence in police interrogations: An analysis addressing the proposed effect AI has on racial and gender bias, cooperation, and false confessions. Futures 117:102510

    Google Scholar 

  96. Norori N, Hu Q, Aellen FM, Faraci FD, Tzovara A (2021) Addressing bias in big data and AI for health care: A call for open science. Patterns 2(10):100347

    Google Scholar 

  97. Noseworthy PA, Attia ZI, Brewer LC, Hayes SN, Yao X, Kapa S, Lopez-Jimenez F (2020) Assessing and mitigating bias in medical artificial intelligence: The effects of race and ethnicity on a deep learning model for ECG analysis. Circ Arrhythm Electrophysiol 13(3):e007988

    Google Scholar 

  98. Ntoutsi E, Fafalios P, Gadiraju U, Iosifidis V, Nejdl W, Vidal ME, Staab S (2020) Bias in datadriven artificial intelligence systems—an introductory survey. Wiley Interdiscip Rev Data Min Knowl Discov 10(3):e1356

    Google Scholar 

  99. Oala L, Fehr J, Gilli L, Balachandran P, Leite AW, Calderon-Ramirez S, Wiegand T (2020) Ml4h auditing: From paper to practice. In: Machine learning for health. PMLR, pp 280–317

    Google Scholar 

  100. Oala L, Murchison AG, Balachandran P, Choudhary S, Fehr J, Leite AW, Nakasi R (2021) Machine Learning for Health: Algorithm Auditing & Quality Control. Journal of medical systems 45(12):1–8

    Google Scholar 

  101. Obermeyer Z, Mullainathan S (2019) Dissecting racial bias in an algorithm that guides health decisions for 70 million people. In: Proceedings of the conference on fairness, accountability, and transparency, pp 89–89

    Google Scholar 

  102. Obermeyer Z, Powers B, Vogeli C, Mullainathan S (2019) Dissecting racial bias in an algorithm used to manage the health of populations. Science 366(6464):447–453

    Google Scholar 

  103. Olteanu A, Castillo C, Diaz F, Kıcıman E (2019) Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data 2:13

    Google Scholar 

  104. Palatnik de Sousa I, Vellasco MM, Costa da Silva E (2021) Explainable artificial intelligence for bias detection in covid ct-scan classifiers. Sensors 21(16):5657

    Google Scholar 

  105. Panch T, Mattie H, Atun R (2019) Artificial intelligence and algorithmic bias: Implications for health systems. J Glob Health 9(2)

  106. Panman JL, To YY, van der Ende EL, Poos JM, Jiskoot LC, Meeter LH, Hafkemeijer A (2019) Bias introduced by multiple head coils in MRI research: An 8 channel and 32 channel coil comparison. Front Neurosci 13:729

    Google Scholar 

  107. Parikh RB, Teeple S, Navathe AS (2019) Addressing bias in artificial intelligence in health care. JAMA 322(24):2377–2378

    Google Scholar 

  108. Parra CM, Gupta M, Dennehy D (2021) Likelihood of questioning ai-based recommendations due to perceived racial/gender bias. IEEE Transactions on Technology and Society 3(1):41–45

    Google Scholar 

  109. Paul S, Maindarkar M, Saxena S, Saba L, Turk M, Kalra M, Suri JS (2022) Bias Investigation in Artificial Intelligence Systems for Early Detection of Parkinson’s Disease: A Narrative Review. Diagnostics 12(1):166

    Google Scholar 

  110. Pfau J, Young AT, Wei ML, Keiser MJ (2019) Global saliency: Aggregating saliency maps to assess dataset artefact bias. arXiv preprint arXiv:191007604

  111. Pfohl SR, Foryciarz A, Shah NH (2021) An empirical characterization of fair machine learning for clinical risk prediction. J Biomed Inform 113:103621

    Google Scholar 

  112. Puyol-Antón E, Ruijsink B, Mariscal Harana J, Piechnik SK, Neubauer S, Petersen SE, King AP (2022) Fairness in cardiac magnetic resonance imaging: Assessing sex and racial bias in deep learning-based segmentation. Front Cardiovasc Med 664

  113. Qian Z (2021) Applications, risks and countermeasures of artificial intelligence in education. In: In 2021 2nd international conference on artificial intelligence and education (ICAIE). IEEE, pp 89–92

    Google Scholar 

  114. Kumar A, Gupta N, Bhasin P, Chauhan S, Bachri I (2023b) Security and privacy issues in smart healthcare using machine-learning perspectives. In: 6G-enabled IoT and AI for smart healthcare. CRC Press, pp 41–56

    Google Scholar 

  115. Rajotte JF, Mukherjee S, Robinson C, Ortiz A, West C, Ferres JML, Ng RT (2021) Reducing bias and increasing utility by federated generative modeling of medical images using a centralized adversary. In: Proceedings of the conference on information Technology for Social Good, pp 79–84

    Google Scholar 

  116. Renner A, Rausch I, Cal Gonzalez J, Laistler E, Moser E, Jochimsen T, Figl M (2022) A PET/MR coil with an integrated, orbiting 511 keV transmission source for PET/MR imaging validated in an animal study. Medical Physics 49(4):2366–2372

    Google Scholar 

  117. Ribeiro MT, Singh S, Guestrin C (2016) "Why should i trust you?" explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1135–1144

    Google Scholar 

  118. Roselli D, Matthews J, Talagala N (2019) Managing bias in AI. In: Companion proceedings of the 2019 world wide web conference, pp 539–544

    Google Scholar 

  119. Rueckel J, Huemmer C, Fieselmann A, Ghesu F-C, Mansoor A, Schachtner B, Ricke J (2021) Pneumothorax detection in chest radiographs: Optimizing artificial intelligence system for accuracy and confounding bias reduction using in-image annotations in algorithm training. European radiology 31(10):7888–7900

    Google Scholar 

  120. Saba L, Biswas M, Kuppili V, Godia EC, Suri HS, Edla DR, Mavrogeni S (2019) The present and future of deep learning in radiology. European journal of radiology 114:14–24

    Google Scholar 

  121. Saleiro P, Kuester B, Hinkson L, London J, Stevens A, Anisfeld A, Ghani R (2018) Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:181105577

  122. Kumar A, Sareen P, Arora A (2023c) Healthcare engineering using AI and distributed technologies. In: Smart distributed embedded Systems for Healthcare Applications. CRC Press, pp 1–14

    Google Scholar 

  123. Santa Cruz BG, Bossa MN, Sölter J, Husch AD (2021) Public covid-19 x-ray datasets and their impact on model bias–a systematic review of a significant problem. Med Image Anal 74:102225

    Google Scholar 

  124. Saxena S, Jena B, Gupta N, Das S, Sarmah D, Bhattacharya P, Kalra M (2022) Role of Artificial Intelligence in Radiogenomics for Cancers in the Era of Precision Medicine. Cancers 14(12):2860

    Google Scholar 

  125. Seyyed-Kalantari L, Liu G, McDermott M, Chen IY, Ghassemi M (2020) CheXclusion: Fairness gaps in deep chest X-ray classifiers. In: BIOCOMPUTING 2021: Proceedings of the Pacific symposium, pp 232–243

    Google Scholar 

  126. Seyyed-Kalantari L, Liu G, McDermott M, Chen I, Ghassemi M (2021) Medical imaging algorithms exacerbate biases in underdiagnosis

    Google Scholar 

  127. Seyyed-Kalantari L, Zhang H, McDermott M, Chen IY, Ghassemi M (2021) Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat Med 27(12):2176–2182

    Google Scholar 

  128. Kumar A, Vohra R (2023) Impact of deep learning models for technology sustainability in tourism using big data analytics. In: Deep learning Technologies for the Sustainable Development Goals: Issues and solutions in the post-COVID era. Singapore, Springer Nature Singapore, pp 83–96

    Google Scholar 

  129. Shimron E, Tamir JI, Wang K, Lustig M (2022) Implicit data crimes: Machine learning bias arising from misuse of public data. Proc Natl Acad Sci 119(13):e2117203119

    MathSciNet  Google Scholar 

  130. Shrivastava VK, Londhe ND, Sonawane RS, Suri JS (2015) Reliable and accurate psoriasis disease classification in dermatology images using comprehensive feature space in machine learning paradigm. Expert Syst Appl 42(15–16):6184–6195

  131. Sikstrom L, Maslej MM, Hui K, Findlay Z, Buchman DZ, Hill SL (2022) Conceptualising fairness: Three pillars for medical algorithms and health equity. BMJ Health Care Inform 29(1)

  132. Sounderajah V, Ashrafian H, Rose S, Shah NH, Ghassemi M, Golub R, Mateen B (2021) A quality assessment tool for artificial intelligence-centered diagnostic test accuracy studies: QUADAS-AI. Nat Med 27(10):1663–1665

  133. Srivastava B, Rossi F (2019) Rating AI systems for bias to promote trustable applications. IBM J Res Dev 63(4/5):5–1

  134. Srivastava SK, Singh SK, Suri JS (2020) A healthcare text classification system and its performance evaluation: A source of better intelligence by characterizing healthcare text. Cognitive informatics, computer modelling, and cognitive science. Elsevier, pp 319–369

  135. Stanley A, Kucera J (2021) Smart healthcare devices and applications, machine learning-based automated diagnostic systems, and real-time medical data analytics in COVID-19 screening, testing, and treatment. Am J Med Res 8(2):105–117

  136. Sterne JA, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, Higgins JP (2016) ROBINS-I: A tool for assessing risk of bias in non-randomised studies of interventions. BMJ 355

  137. Straw I (2020) The automation of bias in medical Artificial Intelligence (AI): Decoding the past to create a better future. Artif Intell Med 110:101965

  138. Sugiyama M, Ogawa H (2000) Incremental active learning with bias reduction. In: Proceedings of the IEEE-INNS-ENNS international joint conference on neural networks. IJCNN 2000. Neural computing: New challenges and perspectives for the new millennium, vol 1. IEEE, pp 15–20

  139. Sun TY, Walk OJ IV, Chen JL, Nieva HR, Elhadad N (2020) Exploring gender disparities in time to diagnosis. arXiv preprint arXiv:2011.06100

  140. Suresh H, Guttag JV (2019) A framework for understanding unintended consequences of machine learning. arXiv preprint arXiv:1901.10002

  141. Suri JS, Agarwal S, Gupta SK, Puvvula A, Viskovic K, Suri N, Naidu DS (2021) Systematic review of artificial intelligence in acute respiratory distress syndrome for COVID-19 lung patients: A biomedical imaging perspective. IEEE J Biomed Health Inform 25(11):4128–4139

  142. Suri JS, Agarwal S, Jena B, Saxena S, El-Baz A, Agarwal V, Naidu S (2022a) Five strategies for bias estimation in artificial intelligence-based hybrid deep learning for acute respiratory distress syndrome COVID-19 lung infected patients using AP (AI) Bias 2.0: A systematic review. IEEE Trans Instrum Meas

  143. Suri JS, Bhagawati M, Paul S, Protogeron A, Sfikakis PP, Kitas GD, Kalra M (2022b) Understanding the bias in machine learning systems for cardiovascular disease risk assessment: The first of its kind review. Comput Biol Med 105204

  144. Suri JS, Bhagawati M, Paul S, Protogerou AD, Sfikakis PP, Kitas GD, Saxena S (2022) A powerful paradigm for cardiovascular risk stratification using multiclass, multi-label, and ensemble-based machine learning paradigms: A narrative review. Diagnostics 12(3):722

  145. Suri JS, Puvvula A, Biswas M, Majhail M, Saba L, Faa G, Naidu S (2020) COVID-19 pathways for brain and heart injury in comorbidity patients: A role of medical imaging and artificial intelligence-based COVID severity classification: A review. Comput Biol Med 124:103960

  146. Suri JS, Rangayyan RM (2006) Recent advances in breast imaging, mammography, and computer-aided diagnosis of breast cancer. SPIE, Bellingham, WA, USA

  147. Tandel GS, Balestrieri A, Jujaray T, Khanna NN, Saba L, Suri JS (2020) Multiclass magnetic resonance imaging brain tumor classification using artificial intelligence paradigm. Comput Biol Med 122:103804

  148. Tasci E, Zhuge Y, Camphausen K, Krauze AV (2022) Bias and Class Imbalance in Oncologic Data—Towards Inclusive and Transferrable AI in Large Scale Oncology Data Sets. Cancers 14(12):2897

  149. Teoh KH, Ismail RC, Naziri SZM, Hussin R, Isa MNM, Basir MSSM (2021) Face recognition and identification using deep learning approach. In: Journal of Physics: Conference Series, vol 1755, p 012006. IOP Publishing

  150. Tommasi T, Patricia N, Caputo B, Tuytelaars T (2017) A deeper look at dataset bias. Domain adaptation in computer vision applications. Springer, pp 37–55

  151. Trueblood JS, Eichbaum Q, Seegmiller AC, Stratton C, O’Daniels P, Holmes WR (2021) Disentangling prevalence induced biases in medical image decision-making. Cognition 212:104713

  152. Vollmer S, Mateen BA, Bohner G, Király FJ, Ghani R, Jonsson P, Hemingway H (2020) Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness. BMJ 368

  153. Wachinger C, Becker BG, Rieckmann A, Pölsterl S (2019) Quantifying confounding bias in neuroimaging datasets with causal inference. In: Medical image computing and computer assisted intervention–MICCAI 2019: 22nd international conference, Shenzhen, China, October 13–17, 2019, proceedings, part IV 22. Springer International Publishing, pp 484–492

  154. Wachinger C, Rieckmann A, Pölsterl S, Alzheimer’s Disease Neuroimaging Initiative. (2021) Detect and correct bias in multi-site neuroimaging datasets. Med Image Anal 67:101879

  155. Wachter S, Mittelstadt B, Russell C (2021) Why fairness cannot be automated: Bridging the gap between EU non-discrimination law and AI. Comput Law Secur Rev 41:105567

  156. Weber C (2019) Engineering bias in AI. IEEE Pulse 10(1):15–17

  157. Wolff RF, Moons KG, Riley RD, Whiting PF, Westwood M, Collins GS, PROBAST Group† (2019) PROBAST: A tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med 170(1):51–58

  158. Wu Y, Zhang L, Wu X (2018) Fairness-aware classification: Criterion, convexity, and bounds. arXiv preprint arXiv:1809.04737

  159. Xu Y, Hosny A, Zeleznik R, Parmar C, Coroller T, Franco I, Aerts HJ (2019) Deep learning predicts lung cancer treatment response from serial medical imaging. Clin Cancer Res 25(11):3266–3275

  160. Zafar MB, Valera I, Gomez Rodriguez M, Gummadi KP (2017) Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In: Proceedings of the 26th international conference on world wide web, pp 1171–1180

  161. Zhang BH, Lemoine B, Mitchell M (2018) Mitigating unwanted biases with adversarial learning. In: Proceedings of the 2018 AAAI/ACM conference on AI, ethics, and society, pp 335–340

  162. Zhang H, Lu AX, Abdalla M, McDermott M, Ghassemi M (2020) Hurtful words: Quantifying biases in clinical contextual word embeddings. In: Proceedings of the ACM conference on health, inference, and learning, pp 110–120

  163. Zhang L, Wu X (2017) Anti-discrimination learning: A causal modeling-based framework. Int J Data Sci Anal 4(1):1–16

  164. Zhou N, Zhang Z, Nair VN, Singhal H, Chen J, Sudjianto A (2021) Bias, fairness, and accountability with AI and ML algorithms. arXiv preprint arXiv:2105.06558

Author information

Corresponding author

Correspondence to Jasjit S. Suri.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Tables 6, 7, 8 and 9

Table 6 Bias attributes details of the representative work under PD-AI model
Table 7 Bias attributes details of the representative work under ID-AI model
Table 8 Bias attributes details of the representative work under HD-AI model
Table 9 Grading scheme for evaluation of the AI attributes

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kumar, A., Aelgani, V., Vohra, R. et al. Artificial intelligence bias in medical system designs: a systematic review. Multimed Tools Appl 83, 18005–18057 (2024). https://doi.org/10.1007/s11042-023-16029-x
