Abstract
Inherent bias in an artificial intelligence (AI) model introduces inaccuracy and variability when the model is deployed clinically. The source of bias is difficult to recognize because datasets vary and the system design is a black box, and there is no distinct process for identifying potential sources of bias in an AI model. To the best of our knowledge, this is the first review of its kind to address bias in AI models by categorizing 48 studies into three classes: point-based, image-based, and hybrid-based AI models. A PRISMA-based selection strategy is adopted to select the 72 crucial AI studies for identifying bias in AI models. Within the three classes, bias is identified in these studies using 44 critical AI attributes. Bias in the AI models is computed by analytical, butterfly, and ranking-based bias models. These bias models were evaluated by two experts and compared using variability analysis. AI studies that lack sufficient AI attributes are more prone to risk of bias (RoB) in all three classes. Studies with high RoB lose fins in the butterfly model. The analysis indicates that the majority of healthcare studies suffer from data bias and algorithmic bias, owing to incomplete specifications in the design protocol and weak AI design used for prediction.
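The attribute-based scoring summarized in the abstract can be illustrated with a minimal sketch. It is not the review's actual procedure: the abstract does not give the scoring rules, so the four-attribute checklist, the cutoffs (0.75 and 0.5), the study labels, and all function names below are assumptions chosen only for illustration. The idea shown is that a study reporting fewer AI attributes receives a lower score and therefore a higher RoB class, and that two experts' scores can be compared with a simple variability measure.

```python
from statistics import mean


def attribute_score(present):
    """Fraction of checklist attributes a study reports (0.0 to 1.0)."""
    return sum(present) / len(present)


def rob_class(score, low_cut=0.75, high_cut=0.5):
    """Map a score to a risk-of-bias class (cutoffs are hypothetical)."""
    if score >= low_cut:
        return "low"
    if score >= high_cut:
        return "moderate"
    return "high"


def mean_abs_difference(a, b):
    """Average absolute disagreement between two experts' scores."""
    return mean(abs(x - y) for x, y in zip(a, b))


# Hypothetical gradings: True means the expert found the attribute reported.
expert1 = {
    "S1": [True, True, True, True],
    "S2": [True, False, True, False],
    "S3": [False, False, True, False],
}
expert2 = {
    "S1": [True, True, True, False],
    "S2": [True, False, True, False],
    "S3": [False, False, False, False],
}

scores1 = {s: attribute_score(v) for s, v in expert1.items()}
scores2 = {s: attribute_score(v) for s, v in expert2.items()}

# Combine the two experts by averaging, then rank from best to worst.
combined = {s: (scores1[s] + scores2[s]) / 2 for s in scores1}
ranking = sorted(combined, key=combined.get, reverse=True)
classes = {s: rob_class(combined[s]) for s in combined}

# Inter-expert variability over the same study order.
variability = mean_abs_difference(
    [scores1[s] for s in ranking], [scores2[s] for s in ranking]
)

for study in ranking:
    print(study, combined[study], classes[study])
print("mean absolute difference between experts:", round(variability, 3))
```

In this toy data the study that reports the fewest attributes ends up last in the ranking and in the high-RoB class, which mirrors the abstract's observation that studies lacking sufficient AI attributes are more prone to RoB.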
Data availability
Data will be made available by the corresponding author on reasonable request.
Abbreviations
- ACC: Accuracy
- ANOVA: Analysis of variance
- ANCOVA: Analysis of covariance
- AI: Artificial intelligence
- AUC: Area under the curve
- AUPRC: Area under the precision-recall curve
- BMI: Body mass index
- BI: Brain imaging
- CMR: Cardiac magnetic resonance
- CVD: Cardiovascular disease
- CKD: Chronic kidney disease
- CXR: Chest X-ray
- CT: Computed tomography
- CAD: Computer-aided diagnosis
- CI: Confidence Index
- COI: Coronary imaging
- COVID-19: Coronavirus disease 2019
- COMPAS: Correctional Offender Management Profiling for Alternative Sanctions
- CHARMS: Critical appraisal and data extraction for systematic reviews of prediction modelling studies
- DL: Deep learning
- EHR: Electronic health record
- ECG: Electrocardiography
- GDPR: General Data Protection Regulation
- HDL: Hybrid deep learning
- HD-ai: Hybrid data-based AI model
- ID-ai: Image data-based AI model
- LVEF: Left ventricular ejection fraction
- ML: Machine learning
- MRI: Magnetic resonance imaging
- MCC: Matthews correlation coefficient
- MSE: Mean squared error
- PD-ai: Point data-based AI model
- PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses
- RT-PCR: Reverse transcription polymerase chain reaction
- RoB: Risk of bias
- ROBINS-I: Risk Of Bias In Non-randomized Studies of Interventions
- SEN: Sensitivity
- SI: Stroke imaging
- SPE: Specificity
- USS: Ultrasound
- WHO: World Health Organization
References
Abbasi-Sureshjani S, Raumanns R, Michels BE, Schouten G, Cheplygina V (2020) Risk of training diagnostic algorithms on data with demographic bias. Interpretable and Annotation-Efficient Learning for Medical Image Computing. Springer, pp 183–192
Acharya UR, Faust O, Alvin A, Krishnamurthi G, Seabra JC, Sanches J, Suri JS (2013) Understanding symptomatology of atherosclerotic plaque by image-based tissue characterization. Comput Methods Programs Biomed 110(1):66–75
Acharya UR, Mookiah MRK, Sree SV, Yanti R, Martis R, Saba L, Suri JS (2014) Evolutionary algorithm-based classifier parameter tuning for automatic ovarian cancer tissue characterization and classification. Ultraschall in der Medizin-European Journal of Ultrasound 35(03):237–245
Agarwal M, Agarwal S, Saba L, Chabert GL, Gupta S, Carriero A, Pasche A, Danna P, Mehmedovic A, Faa G, Shrivastav S, Jain K, Jain H, Jujaray T, Singh MI, Turk M, Chadha SP, Johri MA, Khanna NN, Suri JS (2022) Eight pruning deep learning models for low storage and high-speed COVID-19 computed tomography lung segmentation and heatmap-based lesion localization: A multicenter study using COVLIAS 2.0. Comput Biol Med 146:105571
Ahmed S, Athyaab SA, Muqtadeer SA (2021) Attenuation of human Bias in artificial intelligence: An exploratory approach. In: 2021 6th international conference on inventive computation technologies (ICICT). IEEE, pp 557–563
Ahn E, Kumar A, Fulham M, Feng D, Kim J (2020) Unsupervised domain adaptation to classify medical images using zero-bias convolutional auto-encoders and context-based feature augmentation. IEEE Trans Med Imaging 39(7):2385–2394
Akter S, McCarthy G, Sajib S, Michael K, Dwivedi YK, D’Ambra J, Shen K (2021) Algorithmic bias in data-driven innovation in the age of AI. Int J Inf Manage 60:102387
Álvarez-Rodríguez L, Moura Jd, Novo J, Ortega M (2022) Does imbalance in chest X-ray datasets produce biased deep learning approaches for COVID-19 screening? BMC Medical Research Methodology 22(1):1–17
Araki T, Ikeda N, Shukla D, Jain PK, Londhe ND, Shrivastava VK, Shafique S (2016) PCA-based polling strategy in machine learning framework for coronary artery disease risk assessment in intravascular ultrasound: A link between carotid and coronary grayscale plaque morphology. Computer methods and programs in biomedicine 128:137–158
Ashraf A, Khan S, Bhagwat N, Chakravarty M, Taati B (2018) Learning to unlearn: Building immunity to dataset bias in medical imaging studies. arXiv preprint arXiv:181201716
Assayag D, Morisset J, Johannson KA, Wells AU, Walsh SL (2020) Patient gender bias on the diagnosis of idiopathic pulmonary fibrosis. Thorax 75(5):407–412
Banchhor SK, Londhe ND, Araki T, Saba L, Radeva P, Khanna NN, Suri JS (2018) Calcium detection, its quantification, and grayscale morphology-based risk stratification using machine learning in multimodality big data coronary and carotid scans: A review. Comput Biol Med 101:184–198
Baylor D, Breck E, Cheng HT, Fiedel N, Foo CY, Haque Z, Zinkevich M (2017) Tfx: A tensorflow-based production-scale machine learning platform. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1387–1395
Beesley LJ, Mukherjee B (2022) Statistical inference for association studies using electronic health records: Handling both selection bias and outcome misclassification. Biometrics 78(1):214–226
Belenguer L (2022) AI bias: Exploring discriminatory algorithmic decision-making models and the application of possible machine-centric solutions adapted from the pharmaceutical industry. AI and Ethics 2(4):771–787
Bellamy RK, Dey K, Hind M, Hoffman SC, Houde S, Kannan K, Zhang Y (2019) AI fairness 360: An extensible toolkit for detecting and mitigating algorithmic bias. IBM J Res Dev 63(4/5):4–1
Benjamin M, Gagnon P, Rostamzadeh N, Pal C, Bengio Y, Shee A (2019) Towards standardization of data licenses: the Montreal data license. arXiv preprint arXiv:190312262
Berg T, Burg V, Gombović A, Puri M (2020) On the rise of fintechs: Credit scoring using digital footprints. The Review of Financial Studies 33(7):2845–2897
Byrne MD (2021) Reducing bias in healthcare artificial intelligence. J Perianesth Nurs 36(3):313–316
Calderon-Ramirez S, Yang S, Moemeni A, Colreavy-Donnelly S, Elizondo DA, Oala L, Molina-Cabello MA (2021) Improving uncertainty estimation with semi-supervised deep learning for covid-19 detection using chest x-ray images. Ieee Access 9:85442–85454
Calmon F, Wei D, Vinzamuri B, Natesan Ramamurthy K, Varshney KR (2017) Optimized preprocessing for discrimination prevention. Adv Neural Inf Proces Syst 30
Cardenas S, Vallejo-Cardenas SF (2019) Continuing the conversation on how structural racial and ethnic inequalities affect AI biases. In: 2019 IEEE international symposium on technology and society (ISTAS). IEEE, pp 1–7
Catalá ODT, Igual IS, Pérez-Benito FJ, Escrivá DM, Castelló VO, Llobet R, Peréz-Cortés J-C (2021) Bias analysis on public X-ray image datasets of pneumonia and COVID-19 patients. IEEE Access 9:42370–42383
Cau R, Bassareo P, Caredda G, Suri JS, Esposito A, Saba L (2022a) Atrial strain by feature-tracking cardiac magnetic resonance imaging in takotsubo cardiomyopathy. Features, feasibility, and reproducibility. Can Assoc Radiol J 73(3):573–580
Cau R, Bassareo P, Suri JS, Pontone G, Saba L (2022b) The emerging role of atrial strain assessed by cardiac MRI in different cardiovascular settings: An up-to-date review. Eur Radiol 32(7):4384–4394
Celi LA, Cellini J, Charpignon M-L, Dee EC, Dernoncourt F, Eber R, Situ J (2022) Sources of bias in artificial intelligence that perpetuate healthcare disparities—A global review. PLOS Digital Health 1(3):e0000022
Challen R, Denny J, Pitt M, Gompels L, Edwards T, Tsaneva-Atanasova K (2019) Artificial intelligence, bias and clinical safety. BMJ Qual Saf 28(3):231–237
Chung H, Park C, Kang WS, Lee J (2021) Gender Bias in artificial intelligence: Severity prediction at an early stage of COVID-19. Front Physiol 2104
Cirillo D, Catuara-Solarz S, Morey C, Guney E, Subirats L, Mellino S, Chadha AS (2020) Sex and gender differences and biases in artificial intelligence for biomedicine and healthcare. NPJ digital medicine 3(1):1–11
Corrias G, Mazzotta A, Melis M, Cademartiri F, Yang Q, Suri JS, Saba L (2021) Emerging role of artificial intelligence in stroke imaging. Expert Rev Neurother 21(7):745–754
d’Alessandro B, O’Neil C, LaGatta T (2017) Conscientious classification: A data scientist’s guide to discrimination-aware classification. Big data 5(2):120–134
Danks D, London AJ (2017) Algorithmic Bias in autonomous systems. IJCAI 17(2017):4691–4697
Dankwa-Mullan I, Weeraratne D (2022) Artificial Intelligence and Machine Learning Technologies in Cancer Care: Addressing Disparities, Bias, and Data Diversity. Cancer Discov 12(6):1423–1427
Das, S., G. Nayak, L. Saba, M. Kalra, J.S. Suri, and S. Saxena, (2022) An artificial intelligence framework and its bias for brain tumor segmentation: A narrative review. Computers in Biology and Medicine 105273
Dastin J (2018) Amazon scraps secret AI recruiting tool that showed bias against women. In: Ethics of data and analytics. Auerbach Publications, pp 296–299
Kumar A, Jain R, Gupta M, Islam SM, (Eds.). (2023a) 6G-enabled IoT and AI for smart healthcare: Challenges, impact, and analysis. CRC Press
El-Baz A, Jiang X, Suri JS (Eds.) (2016) Biomedical image segmentation: Advances and trends
El-Baz A, Suri JS (eds) (2019) Big data in multimodal medical imaging. CRC Press
Esteva A, Chou K, Yeung S, Naik N, Madani A, Mottaghi A, Socher R (2021) Deep learning-enabled medical computer vision. NPJ digital medicine 4(1):1–9
Estiri H, Strasser ZH, Rashidian S, Klann JG, Wagholikar KB, McCoy TH Jr, Murphy SN (2022) An objective framework for evaluating unrecognized bias in medical AI models predicting COVID-19 outcomes. J Am Med Inform Assoc 29(8):1334–1341
Ezzeddine H, Awad M, Abi Ghanem AS, Mourani B (2021) On data Bias and the usability of deep learning algorithms in classifying COVID-19 based on chest X-ray. In: 2021 IEEE 3rd International Multidisciplinary Conference on Engineering Technology (IMCET). IEEE, pp 136–143
Falco G (2019) Participatory AI: Reducing AI bias and developing socially responsible AI in smart cities. In: 2019 IEEE international conference on computational science and engineering (CSE) and IEEE international conference on embedded and ubiquitous computing (EUC). IEEE, pp 154–158
Fernandez-Quilez A (2023) Deep learning in radiology: Ethics of data and on the value of algorithm transparency, interpretability and explainability. AI and Ethics 3(1):257–265
Ferrer X, van Nuenen T, Such JM, Coté M, Criado N (2021) Bias and Discrimination in AI: A cross-disciplinary perspective. IEEE Technol Soc Mag 40(2):72–80
Gallifant J, Zhang J, M.d.P.A. Lopez, T. Zhu, L. Camporota, L.A. Celi, and F. Formenti, (2022) Artificial intelligence for mechanical ventilation: Systematic review of design, reporting standards, and bias. British Journal of Anaesthesia 128(2):343–351
Gebru T, Morgenstern J, Vecchione B, Vaughan JW, Wallach H, Iii HD, Crawford K (2021) Datasheets for datasets. Commun ACM 64(12):86–92
Geis JR, Brady AP, Wu CC, Spencer J, Ranschaert E, Jaremko JL, Shields WF (2019) Ethics of artificial intelligence in radiology: Summary of the joint European and North American multisociety statement. Canadian Association of Radiologists Journal 70(4):329–334
Georgopoulos M, Panagakis Y, Pantic M (2020) Investigating bias in deep face analysis: The kanface dataset and empirical study. Image Vis Comput 102:103954
Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G (2018) Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med 178(11):1544–1547
Greshake Tzovaras B, Tzovara A (2019) The personal data is political. The ethics of medical data donation:133–140
Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D (2019) A survey of methods for explaining black box models. ACM computing surveys (CSUR) 51(5):1–42
Gurupur V, Wan TT (2020) Inherent bias in artificial intelligence-based decision support systems for healthcare. Medicina 56(3):141
De Hert P, Czerniawski M (2016) Expanding the European data protection scope beyond territory: Article 3 of the General Data Protection Regulation in its wider context. International Data Privacy Law 6(3):230–243
Iosifidis V, Ntoutsi E (2019) Adafair: Cumulative fairness adaptive boosting. In: Proceedings of the 28th ACM international conference on information and knowledge management, pp 781–790
Jain R, Kumar A, Nayyar A, Dewan K, Garg R, Raman S, Ganguly S (2023) Explaining sentiment analysis results on social media texts through visualization. Multimed Tools Appl:1–17
Jain PK, Sharma N, Kalra MK, Viskovic K, Saba L, Suri JS (2022) Four Types of Multiclass Frameworks for Pneumonia Classification and Its Validation in X-ray Scans Using Seven Types of Deep Learning Artificial Intelligence Models. Diagnostics 12(3):652
Jamthikar A, Gupta D, Johri AM, Mantella LE, Saba L, Suri JS (2022) A machine learning framework for risk prediction of multi-label cardiovascular events based on focused carotid plaque B-Mode ultrasound: A Canadian study. Comput Biol Med 140:105102
Jamthikar AD, Gupta D, Mantella LE, Saba L, Johri AM, Suri JS (2021) Ensemble Machine Learning and its Validation for Prediction of Coronary Artery Disease and Acute Coronary Syndrome using Focused Carotid Ultrasound. IEEE Trans Instrum Meas 71:1–10
Kallus N, Zhou A (2019) The fairness of risk scores beyond classification: Bipartite ranking and the xauc metric. Adv Neural Inf Proces Syst 32
Kamishima T, Akaho S, Asoh H, Sakuma J (2012) Fairness-aware classifier with prejudice remover regularizer. In: Machine learning and knowledge discovery in databases: European conference, ECML PKDD 2012, Bristol, UK, September 24–28, 2012. Proceedings, part II 23. Springer, Berlin Heidelberg, pp 35–50
Kim DW, Jang HY, Kim KW, Shin Y, Park SH (2019) Design characteristics of studies reporting the performance of artificial intelligence algorithms for diagnostic analysis of medical images: Results from recently published papers. Korean J Radiol 20(3):405–410
Klann JG, Estiri H, Weber GM, Moal B, Avillach P, Hong C, Maulhardt T (2021) Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data. Journal of the American Medical Informatics Association 28(7):1411–1420
Krasanakis E, Spyromitros-Xioufis E, Papadopoulos S, Kompatsiaris Y (2018) Adaptive sensitive reweighting to mitigate bias in fairness-aware classification. In: Proceedings of the 2018 world wide web conference, pp 853–862
Krawczyk B (2016) Learning from imbalanced data: Open challenges and future directions. Progress in Artificial Intelligence 5(4):221–232
Kumar A, Ahluwalia R (2021) Breast Cancer detection using machine learning and its classification. In: Cancer prediction for industrial IoT 4.0: A machine learning perspective. Chapman and Hall/CRC, pp 65–78
Kumar A, Jain R (2021) Behavioral prediction of Cancer using machine learning. In: Cancer prediction for industrial IoT 4.0: A machine learning perspective. Chapman and Hall/CRC, pp 91–105
Kumar, A., R.S. Rai, and M. Gheisari, (2021) Prediction of Cervical Cancer Using Machine Learning, in Cancer Prediction for Industrial IoT 4.0: A Machine Learning Perspective Chapman and Hall/CRC 107–117
Kumar A, Walia GS, Sharma K (2020) Recent trends in multicue based visual tracking: A review. Expert Syst Appl 162:113711
Kuppili V, Biswas M, Sreekumar A, Suri HS, Saba L, Edla DR, Suri JS (2017) Extreme learning machine framework for risk stratification of fatty liver disease using ultrasound tissue characterization. Journal of medical systems 41(10):1–20
Lambrecht A, Tucker C (2019) Algorithmic bias? An empirical study of apparent gender-based discrimination in the display of STEM career ads. Manage Sci 65(7):2966–2981
Landers RN, Behrend TS (2022) Auditing the AI auditors: A framework for evaluating fairness and bias in high stakes AI predictive models. American Psychologist
Lapuschkin S, Wäldchen S, Binder A, Montavon G, Samek W, Müller K-R (2019) Unmasking Clever Hans predictors and assessing what machines really learn. Nat Commun 10(1):1–8
Larrazabal AJ, Nieto N, Peterson V, Milone DH, Ferrante E (2020) Gender imbalance in medical imaging datasets produces biased classifiers for computer-aided diagnosis. Proc Natl Acad Sci 117(23):12592–12594
Le E, Wang Y, Huang Y, Hickman S, Gilbert F (2019) Artificial intelligence in breast imaging. Clin Radiol 74(5):357–366
Leavy S (2018) Gender bias in artificial intelligence: The need for diversity and gender theory in machine learning. In: Proceedings of the 1st international workshop on gender equality in software engineering, pp 14–16
Lee NT, Resnick P, Barton G (2019) Algorithmic bias detection and mitigation: Best practices and policies to reduce consumer harms. Brookings Institute, Washington, DC, USA
Lerman K, Hogg T (2014) Leveraging position bias to improve peer recommendation. PLoS ONE 9(6):e98914
Li Y, Chen J, Dong W, Zhu Y, Wu J, Xiong J, Qian T (2022) Mix-and-Interpolate: A Training Strategy to Deal with Source-biased Medical Data. IEEE Journal of Biomedical and Health Informatics 26(1):172–182
Li M, Hsu W, Xie X, Cong J, Gao W (2020) SACNN: Self-attention convolutional neural network for low-dose CT denoising with self-supervised perceptual loss network. IEEE Trans Med Imaging 39(7):2289–2301
Loftus JR, Russell C, Kusner MJ, Silva R (2018) Causal reasoning for algorithmic fairness. arXiv preprint arXiv:180505859
Luo L, Xu D, Chen H, Wong TT, Heng PA (2022) Pseudo bias-balanced learning for debiased chest X-ray classification. In Medical image computing and computer assisted intervention–MICCAI 2022: 25th international conference, Singapore, September 18–22, 2022, proceedings, part VIII (pp. 621–631). Cham: Springer Nature Switzerland
Maier-Hein L, Reinke A, Kozubek M, Martel AL, Arbel T, Eisenmann M, Landman BA (2020) BIAS: Transparent reporting of biomedical image analysis challenges. Med Image Anal 66:101796
Biswas M, Kuppili V, Saba L, Edla DR, Suri HS, Cuadrado-Godia E, Suri JS (2019) State-of the-art review on deep learning in medical imaging. Front Biosci-Landmark 24(3):380–406
Marshall IJ, Kuiper J, Wallace BC (2015) Automating risk of bias assessment for clinical trials. IEEE J Biomed Health Inform 19(4):1406–1412
McDuff D, Cheng R, Kapoor A (2018) Identifying bias in AI using simulation. arXiv preprint arXiv:181000471
McGregor C, Dewey C, Luan R (2021) Big data and artificial intelligence in healthcare: Ethical and social implications of neonatology. In: 2021 IEEE international symposium on technology and society (ISTAS). IEEE, pp 1–1
Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A (2021) A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR) 54(6):1–35
Meng C, Trinh L, Xu N, Enouen J, Liu Y (2022) Interpretability and fairness evaluation of deep learning models on MIMIC-IV dataset. Sci Rep 12(1):7166
Meng C, Trinh L, Xu N, Enouen J, Liu Y (2022) Interpretability and fairness evaluation of deep learning models on MIMIC-IV dataset. Sci Rep 12(1):1–28
Mia MR, Hoque ASML, Khan SI, Ahamed SI (2022) A privacy-preserving national clinical data warehouse: Architecture and analysis. Smart Health 23:100238
Mikołajczyk A, Grochowski M (2018) Data augmentation for improving deep learning in image classification problem. In: 2018 international interdisciplinary PhD workshop (IIPhDW). IEEE, pp 117–122
Minot JR, Cheney N, Maier M, Elbers DC, Danforth CM, Dodds PS (2022) Interpretable bias mitigation for textual data: Reducing genderization in patient notes while maintaining classification performance. ACM Trans Comput Healthc 3(4):1–41
Mitchell S, Potash E, Barocas S, D’Amour A, Lum K (2021) Algorithmic fairness: Choices, assumptions, and definitions. Annual Review of Statistics and Its Application 8:141–163
Nagare M, Melnyk R, Rahman O, Sauer KD, Bouman CA (2021) A Bias-reducing loss function for CT image Denoising. In: ICASSP 2021–2021 IEEE international conference on acoustics, speech and signal processing (ICASSP). IEEE, pp 1175–1179
Noriega M (2020) The application of artificial intelligence in police interrogations: An analysis addressing the proposed effect AI has on racial and gender bias, cooperation, and false confessions. Futures 117:102510
Norori N, Hu Q, Aellen FM, Faraci FD, Tzovara A (2021) Addressing bias in big data and AI for health care: A call for open science. Patterns 2(10):100347
Noseworthy PA, Attia ZI, Brewer LC, Hayes SN, Yao X, Kapa S, Lopez-Jimenez F (2020) Assessing and mitigating bias in medical artificial intelligence: The effects of race and ethnicity on a deep learning model for ECG analysis. Circ Arrhythm Electrophysiol 13(3):e007988
Ntoutsi E, Fafalios P, Gadiraju U, Iosifidis V, Nejdl W, Vidal ME, Staab S (2020) Bias in datadriven artificial intelligence systems—an introductory survey. Wiley Interdiscip Rev Data Min Knowl Discov 10(3):e1356
Oala L, Fehr J, Gilli L, Balachandran P, Leite AW, Calderon-Ramirez S, Wiegand T (2020) Ml4h auditing: From paper to practice. In: Machine learning for health. PMLR, pp 280–317
Oala L, Murchison AG, Balachandran P, Choudhary S, Fehr J, Leite AW, Nakasi R (2021) Machine Learning for Health: Algorithm Auditing & Quality Control. Journal of medical systems 45(12):1–8
Obermeyer Z, Mullainathan S (2019) Dissecting racial bias in an algorithm that guides health decisions for 70 million people. In: Proceedings of the conference on fairness, accountability, and transparency, pp 89–89
Obermeyer Z, Powers B, Vogeli C, Mullainathan S (2019) Dissecting racial bias in an algorithm used to manage the health of populations. Science 366(6464):447–453
Olteanu A, Castillo C, Diaz F, Kıcıman E (2019) Social data: Biases, methodological pitfalls, and ethical boundaries. Frontiers in Big Data 2:13
Palatnik de Sousa I, Vellasco MM, Costa da Silva E (2021) Explainable artificial intelligence for bias detection in covid ct-scan classifiers. Sensors 21(16):5657
Panch T, Mattie H, Atun R (2019) Artificial intelligence and algorithmic bias: Implications for health systems. J Glob Health 9(2)
Panman JL, To YY, van der Ende EL, Poos JM, Jiskoot LC, Meeter LH, Hafkemeijer A (2019) Bias introduced by multiple head coils in MRI research: An 8 channel and 32 channel coil comparison. Front Neurosci 13:729
Parikh RB, Teeple S, Navathe AS (2019) Addressing bias in artificial intelligence in health care. JAMA 322(24):2377–2378
Parra CM, Gupta M, Dennehy D (2021) Likelihood of questioning ai-based recommendations due to perceived racial/gender bias. IEEE Transactions on Technology and Society 3(1):41–45
Paul S, Maindarkar M, Saxena S, Saba L, Turk M, Kalra M, Suri JS (2022) Bias Investigation in Artificial Intelligence Systems for Early Detection of Parkinson’s Disease: A Narrative Review. Diagnostics 12(1):166
Pfau J, Young AT, Wei ML, Keiser MJ (2019) Global saliency: Aggregating saliency maps to assess dataset artefact bias. arXiv preprint arXiv:191007604
Pfohl SR, Foryciarz A, Shah NH (2021) An empirical characterization of fair machine learning for clinical risk prediction. J Biomed Inform 113:103621
Puyol-Antón E, Ruijsink B, Mariscal Harana J, Piechnik SK, Neubauer S, Petersen SE, King AP (2022) Fairness in cardiac magnetic resonance imaging: Assessing sex and racial bias in deep learning-based segmentation. Front Cardiovasc Med 664
Qian Z (2021) Applications, risks and countermeasures of artificial intelligence in education. In: In 2021 2nd international conference on artificial intelligence and education (ICAIE). IEEE, pp 89–92
Kumar A, Gupta N, Bhasin P, Chauhan S, Bachri I (2023b) Security and privacy issues in smart healthcare using machine-learning perspectives. In: 6G-enabled IoT and AI for smart healthcare. CRC Press, pp 41–56
Rajotte JF, Mukherjee S, Robinson C, Ortiz A, West C, Ferres JML, Ng RT (2021) Reducing bias and increasing utility by federated generative modeling of medical images using a centralized adversary. In: Proceedings of the conference on information Technology for Social Good, pp 79–84
Renner A, Rausch I, Cal Gonzalez J, Laistler E, Moser E, Jochimsen T, Figl M (2022) A PET/MR coil with an integrated, orbiting 511 keV transmission source for PET/MR imaging validated in an animal study. Medical Physics 49(4):2366–2372
Ribeiro MT, Singh S, Guestrin C (2016) "Why should i trust you?" explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 1135–1144
Roselli D, Matthews J, Talagala N (2019) Managing bias in AI. In: Companion proceedings of the 2019 world wide web conference, pp 539–544
Rueckel J, Huemmer C, Fieselmann A, Ghesu F-C, Mansoor A, Schachtner B, Ricke J (2021) Pneumothorax detection in chest radiographs: Optimizing artificial intelligence system for accuracy and confounding bias reduction using in-image annotations in algorithm training. European radiology 31(10):7888–7900
Saba L, Biswas M, Kuppili V, Godia EC, Suri HS, Edla DR, Mavrogeni S (2019) The present and future of deep learning in radiology. European journal of radiology 114:14–24
Saleiro P, Kuester B, Hinkson L, London J, Stevens A, Anisfeld A, Ghani R (2018) Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:181105577
Kumar A, Sareen P, Arora A (2023c) Healthcare engineering using AI and distributed technologies. In: Smart distributed embedded Systems for Healthcare Applications. CRC Press, pp 1–14
Santa Cruz BG, Bossa MN, Sölter J, Husch AD (2021) Public covid-19 x-ray datasets and their impact on model bias–a systematic review of a significant problem. Med Image Anal 74:102225
Saxena S, Jena B, Gupta N, Das S, Sarmah D, Bhattacharya P, Kalra M (2022) Role of Artificial Intelligence in Radiogenomics for Cancers in the Era of Precision Medicine. Cancers 14(12):2860
Seyyed-Kalantari L, Liu G, McDermott M, Chen IY, Ghassemi M (2020) CheXclusion: Fairness gaps in deep chest X-ray classifiers. In: BIOCOMPUTING 2021: Proceedings of the Pacific symposium, pp 232–243
Seyyed-Kalantari L, Liu G, McDermott M, Chen I, Ghassemi M (2021) Medical imaging algorithms exacerbate biases in underdiagnosis
Seyyed-Kalantari L, Zhang H, McDermott M, Chen IY, Ghassemi M (2021) Underdiagnosis bias of artificial intelligence algorithms applied to chest radiographs in under-served patient populations. Nat Med 27(12):2176–2182
Kumar A, Vohra R (2023) Impact of deep learning models for technology sustainability in tourism using big data analytics. In: Deep learning Technologies for the Sustainable Development Goals: Issues and solutions in the post-COVID era. Singapore, Springer Nature Singapore, pp 83–96
Shimron E, Tamir JI, Wang K, Lustig M (2022) Implicit data crimes: Machine learning bias arising from misuse of public data. Proc Natl Acad Sci 119(13):e2117203119
Shrivastava VK, Londhe ND, Sonawane RS, Suri JS (2015) Reliable and accurate psoriasis disease classification in dermatology images using comprehensive feature space in machine learning paradigm. Expert Syst Appl 42(15–16):6184–6195
Sikstrom L, Maslej MM, Hui K, Findlay Z, Buchman DZ, Hill SL (2022) Conceptualising fairness: Three pillars for medical algorithms and health equity. BMJ Health Care Inform 29(1)
Sounderajah V, Ashrafian H, Rose S, Shah NH, Ghassemi M, Golub R, Mateen B (2021) A quality assessment tool for artificial intelligence-centered diagnostic test accuracy studies: QUADAS-AI. Nature medicine 27(10):1663–1665
Srivastava B, Rossi F (2019) Rating AI systems for bias to promote trustable applications. IBM J Res Dev 63(4/5):5–1
Srivastava SK, Singh SK, Suri JS (2020) A healthcare text classification system and its performance evaluation: A source of better intelligence by characterizing healthcare text. Cognitive informatics, computer modelling, and cognitive science. Elsevier, pp 319–369
Stanley A, Kucera J (2021) Smart healthcare devices and applications, machine learning-based automated diagnostic systems, and real-time medical data analytics in COVID-19 screening, testing, and treatment. American Journal of Medical Research 8(2):105–117
Sterne JA, Hernán MA, Reeves BC, Savović J, Berkman ND, Viswanathan M, Higgins JP (2016) ROBINS-I: A tool for assessing risk of bias in non-randomised studies of interventions. Bmj 355
Straw I (2020) The automation of bias in medical Artificial Intelligence (AI): Decoding the past to create a better future. Artif Intell Med 110:101965
Sugiyama M, Ogawa H (2000) Incremental active learning with bias reduction. In: Proceedings of the IEEE-INNS-ENNS international joint conference on neural networks. IJCNN 2000. Neural computing: New challenges and perspectives for the new millennium, vol 1. IEEE, pp 15–20
Sun TY, Walk OJ IV, Chen JL, Nieva HR, Elhadad N (2020) Exploring gender disparities in time to diagnosis. arXiv preprint arXiv:201106100
Suresh H, Guttag JV (2019) A framework for understanding unintended consequences of machine learning. arXiv preprint arXiv:190110002 2(8)
Suri JS, Agarwal S, Gupta SK, Puvvula A, Viskovic K, Suri N, Naidu DS (2021) Systematic review of artificial intelligence in acute respiratory distress syndrome for COVID-19 lung patients: A biomedical imaging perspective. IEEE J Biomed Health Inform 25(11):4128–4139
Suri JS, Agarwal S, Jena B, Saxena S, El-Baz A, Agarwal V, Naidu S (2022a) Five strategies for bias estimation in artificial intelligence-based hybrid deep learning for acute respiratory distress syndrome COVID-19 lung infected patients using AP (AI) Bias 2.0: A systematic review. IEEE Trans Instrum Meas
Suri JS, Bhagawati M, Paul S, Protogeron A, Sfikakis PP, Kitas GD, Kalra M (2022b) Understanding the bias in machine learning systems for cardiovascular disease risk assessment: The first of its kind review. Comput Biol Med 105204
Suri JS, Bhagawati M, Paul S, Protogerou AD, Sfikakis PP, Kitas GD, Saxena S (2022) A powerful paradigm for cardiovascular risk stratification using multiclass, multi-label, and ensemble-based machine learning paradigms: A narrative review. Diagnostics 12(3):722
Suri JS, Puvvula A, Biswas M, Majhail M, Saba L, Faa G, Naidu S (2020) COVID-19 pathways for brain and heart injury in comorbidity patients: A role of medical imaging and artificial intelligence-based COVID severity classification: A review. Comput Biol Med 124:103960
Suri JS, Rangayyan RM, Imaging B, Mammography, and Computer-Aided Diagnosis of Breast Cancer. (2006) SPIE: Bellingham. WA, USA
Tandel GS, Balestrieri A, Jujaray T, Khanna NN, Saba L, Suri JS (2020) Multiclass magnetic resonance imaging brain tumor classification using artificial intelligence paradigm. Comput Biol Med 122:103804
Tasci E, Zhuge Y, Camphausen K, Krauze AV (2022) Bias and Class Imbalance in Oncologic Data—Towards Inclusive and Transferrable AI in Large Scale Oncology Data Sets. Cancers 14(12):2897
Teoh KH, Ismail RC, Naziri SZM, Hussin R, Isa MNM, Basir MSSM (2021) Face recognition and identification using deep learning approach. J Phys Conf Ser 1755(1):012006
Tommasi T, Patricia N, Caputo B, Tuytelaars T (2017) A deeper look at dataset bias. Domain adaptation in computer vision applications. Springer, pp 37–55
Trueblood JS, Eichbaum Q, Seegmiller AC, Stratton C, O’Daniels P, Holmes WR (2021) Disentangling prevalence induced biases in medical image decision-making. Cognition 212:104713
Vollmer S, Mateen BA, Bohner G, Király FJ, Ghani R, Jonsson P, Hemingway H (2020) Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness. BMJ 368
Wachinger C, Becker BG, Rieckmann A, Pölsterl S (2019) Quantifying confounding bias in neuroimaging datasets with causal inference. In: Medical image computing and computer assisted intervention–MICCAI 2019: 22nd international conference, Shenzhen, China, October 13–17, 2019, proceedings, part IV 22. Springer International Publishing, pp 484–492
Wachinger C, Rieckmann A, Pölsterl S, Alzheimer’s Disease Neuroimaging Initiative (2021) Detect and correct bias in multi-site neuroimaging datasets. Med Image Anal 67:101879
Wachter S, Mittelstadt B, Russell C (2021) Why fairness cannot be automated: Bridging the gap between EU non-discrimination law and AI. Comput Law Secur Rev 41:105567
Weber C (2019) Engineering bias in AI. IEEE Pulse 10(1):15–17
Wolff RF, Moons KG, Riley RD, Whiting PF, Westwood M, Collins GS, PROBAST Group (2019) PROBAST: A tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med 170(1):51–58
Wu Y, Zhang L, Wu X (2018) Fairness-aware classification: Criterion, convexity, and bounds. arXiv preprint arXiv:1809.04737
Xu Y, Hosny A, Zeleznik R, Parmar C, Coroller T, Franco I, Aerts HJ (2019) Deep learning predicts lung cancer treatment response from serial medical imaging. Clin Cancer Res 25(11):3266–3275
Zafar MB, Valera I, Gomez Rodriguez M, Gummadi KP (2017) Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In: Proceedings of the 26th international conference on world wide web, pp 1171–1180
Zhang BH, Lemoine B, Mitchell M (2018) Mitigating unwanted biases with adversarial learning. In: Proceedings of the 2018 AAAI/ACM conference on AI, ethics, and society, pp 335–340
Zhang H, Lu AX, Abdalla M, McDermott M, Ghassemi M (2020) Hurtful words: Quantifying biases in clinical contextual word embeddings. In: Proceedings of the ACM conference on health, inference, and learning, pp 110–120
Zhang L, Wu X (2017) Anti-discrimination learning: A causal modeling-based framework. Int J Data Sci Anal 4(1):1–16
Zhou N, Zhang Z, Nair VN, Singhal H, Chen J, Sudjianto A (2021) Bias, fairness, and accountability with AI and ML algorithms. arXiv preprint arXiv:2105.06558
Ethics declarations
Conflicts of interest
The authors declare that they have no conflicts of interest.
About this article
Cite this article
Kumar, A., Aelgani, V., Vohra, R. et al. Artificial intelligence bias in medical system designs: a systematic review. Multimed Tools Appl 83, 18005–18057 (2024). https://doi.org/10.1007/s11042-023-16029-x
DOI: https://doi.org/10.1007/s11042-023-16029-x