Abstract
One of the hindrances to the widespread acceptance of deep learning–based decision support systems in healthcare is bias. Bias in its many forms occurs in the datasets used to train and test deep learning models and is amplified when these models are deployed in the real world, leading to challenges such as model drift. Recent advances in deep learning have led to the deployment of automated healthcare diagnosis decision support systems in hospitals as well as in telemedicine through IoT devices. Research has focused primarily on the development and improvement of these systems, leaving a gap in the analysis of their fairness. The domain of FAccT ML (fairness, accountability, and transparency in machine learning) covers the analysis of such deployable machine learning systems. In this work, we present a framework for bias analysis in healthcare time series (BAHT) signals such as the electrocardiogram (ECG) and electroencephalogram (EEG). BAHT provides a graphical interpretive analysis of bias in the training and testing datasets in terms of protected variables, as well as an analysis of bias amplification by the trained supervised learning model, for time series healthcare decision support systems. We thoroughly investigate three prominent time series ECG and EEG healthcare datasets used for model training and research. We show that the extensive presence of bias in these datasets leads to potentially biased or unfair machine learning models. Our experiments also demonstrate amplification of the identified bias, with an observed maximum of 66.66%. We investigate the effect of model drift due to unanalyzed bias in datasets and algorithms. Bias mitigation, though prudent, is a nascent area of research. We present experiments on, and analyze, the most widely accepted bias mitigation strategies: under-sampling, oversampling, and the use of synthetic data to balance the dataset through augmentation.
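The dataset-level bias analysis described above amounts to measuring how a protected variable (for example, patient sex in ECG metadata) is distributed across a dataset. As a minimal sketch of that idea — the function names, metadata values, and the imbalance-ratio summary are illustrative assumptions, not the paper's actual BAHT implementation — one could compute per-group proportions and a simple imbalance ratio:

```python
from collections import Counter

def group_distribution(labels):
    """Fraction of samples per protected-group value (e.g., patient sex)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: c / total for group, c in counts.items()}

def imbalance_ratio(dist):
    """Ratio of the most- to least-represented group; 1.0 means balanced."""
    fractions = list(dist.values())
    return max(fractions) / min(fractions)

# Hypothetical metadata: patient sex recorded per ECG record
sex = ["M", "M", "M", "F", "M", "F", "M", "M"]
dist = group_distribution(sex)   # {"M": 0.75, "F": 0.25}
ratio = imbalance_ratio(dist)    # 3.0 — males three times as represented
```

The same summary can be computed separately for the training and testing splits, which is how a split-level disparity in a protected variable would surface before any model is trained.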
It is important that healthcare models, datasets, and bias mitigation strategies be properly analyzed to ensure fair, unbiased delivery of service.
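The under-sampling and oversampling mitigation strategies evaluated in the abstract can be sketched in a few lines. This is a hedged illustration of random resampling over protected groups — the function names and the toy data are assumptions for illustration, not the paper's experimental code:

```python
import random

def oversample(samples_by_group, seed=0):
    """Random oversampling: duplicate minority-group samples until every
    protected group matches the size of the largest group."""
    rng = random.Random(seed)
    target = max(len(s) for s in samples_by_group.values())
    return {
        group: samples + [rng.choice(samples) for _ in range(target - len(samples))]
        for group, samples in samples_by_group.items()
    }

def undersample(samples_by_group, seed=0):
    """Random under-sampling: drop majority-group samples down to the
    size of the smallest group."""
    rng = random.Random(seed)
    target = min(len(s) for s in samples_by_group.values())
    return {g: rng.sample(s, target) for g, s in samples_by_group.items()}

# Hypothetical ECG record IDs keyed by patient sex
data = {"M": ["m1", "m2", "m3", "m4", "m5", "m6"], "F": ["f1", "f2"]}
over = oversample(data)    # both groups now have 6 records
under = undersample(data)  # both groups now have 2 records
```

Oversampling preserves all majority-group data at the cost of duplicating minority records, while under-sampling discards data; synthetic augmentation, the third strategy the abstract evaluates, replaces the duplication step with generated signals.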
Data Availability
All the data used are open source and can be accessed through the citation links. We will also make the code for our framework available upon acceptance.
Author information
Authors and Affiliations
Contributions
Sagnik Dakshit (primary author) was responsible for designing the framework and experiments, writing the manuscript, conducting experiments, and creating figures. Sristi Dakshit (second author) and Ninad Khargonkar (third author) were responsible for conducting experiments, creating figures, and reviewing the paper. Dr. Balakrishnan Prabhakaran (fourth author) was responsible for reviewing the paper and helping design the framework and structure the experiments and manuscript.
Corresponding author
Ethics declarations
Ethical Approval
Ethical approval is not applicable for this work on the BAHT framework.
Consent to Participate
All the authors consent to participate.
Consent for Publication
All the authors consent to publish.
Competing Interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Dakshit, S., Dakshit, S., Khargonkar, N. et al. Bias Analysis in Healthcare Time Series (BAHT) Decision Support Systems from Meta Data. J Healthc Inform Res 7, 225–253 (2023). https://doi.org/10.1007/s41666-023-00133-6
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s41666-023-00133-6