Performance comparison of multi-label learning algorithms on clinical data for chronic diseases
Introduction
Chronic diseases, also called noncommunicable diseases (NCDs) [1], are characterized by a long duration and generally slow progression. Widespread chronic diseases include cardiovascular diseases, chronic respiratory diseases and diabetes. Chronic conditions are a major concern for governmental public health programs, particularly because of their contribution to the continuous growth of medical care costs [2]. Chronic obstructive pulmonary disease (COPD) is an incurable illness, mainly caused by tobacco smoking, for which treatment merely slows the progression of the condition. The World Health Organization (WHO) estimated that 64 million people had COPD worldwide in 2004 [3]. Diabetes, another major chronic disease, affected 347 million people worldwide in 2008 [4], and WHO projects that it will be the 7th leading cause of death in 2030 [5]. Type 2 diabetes accounts for 90% of people with diabetes and is mostly the consequence of excess body weight and physical inactivity [6].
Despite technical progress in medicine that allows patients to be monitored more continuously [7], the treatment of chronically ill patients, who can develop several comorbidities, remains complex for the physician. Continuous monitoring generates larger quantities of data. These measurements are often heterogeneous, such as laboratory tests, physiological values or electrocardiograms. On the one hand, physicians who want to make optimal decisions have to aggregate the information contained in these data. On the other hand, such aggregation will become (or already is) unmanageable for humans. In addition, physicians are frequently in charge of hundreds of patients, as reported in [8]. Therefore, there is a need for state-of-the-art data mining and machine learning tools that assist physicians by providing aggregated information about their patients. Indeed, as reported in [9], medical doctors would use tools that improve their understanding of an illness even if these involve more cognitive effort than standard practice. Several challenges appear in the design of such tools. Chronically ill patients, such as diabetic patients, frequently suffer from several comorbidities related to the main disease. Multi-Label Learning (MLL), which has received substantial contributions from the machine learning community in the last few years [10], [11], [12], is therefore a good candidate for modeling the profile of a patient affected by several comorbidities. Another challenge concerns the characteristics of medical signals. Clinical data consist of multivariate time series that are often irregular: the number of records varies from one patient to another, and the values can be nonuniformly sampled. Processing data with these characteristics is challenging, and feature extraction techniques are needed.
One approach relies on quantization methods, such as k-means clustering and Bag-of-Words (BoW), which have proven successful in several medical data processing tasks [13]. Another approach is to extract summary statistics for the different types of sequential clinical data [14].
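The summary-statistics idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes each measurement is a list of `(time, value)` pairs; the function name `summary_features`, the chosen statistics and the sample readings are all hypothetical. The point is that series of different lengths and sampling rates map to feature vectors of the same size.

```python
from statistics import mean, pstdev

def summary_features(series):
    """Map one irregular series [(time, value), ...] to a fixed-length feature dict."""
    if not series:
        # No record for this measurement: keep the same keys with missing markers.
        return {"count": 0, "min": None, "max": None,
                "mean": None, "std": None, "slope": None}
    ordered = sorted(series)                  # sort by time stamp
    times = [t for t, _ in ordered]
    values = [v for _, v in ordered]
    span = times[-1] - times[0]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "std": pstdev(values),                # population standard deviation
        # Crude trend estimate: change between first and last record per time unit.
        "slope": (values[-1] - values[0]) / span if span > 0 else 0.0,
    }

# Hypothetical patients with different numbers of nonuniformly sampled readings.
patient_a = [(0.0, 110.0), (1.5, 130.0), (7.0, 120.0)]
patient_b = [(0.0, 95.0), (4.0, 105.0)]
features_a = summary_features(patient_a)
features_b = summary_features(patient_b)
```

Both patients end up with the same feature keys, so the resulting vectors can be fed to any standard learner regardless of how many records each patient has.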
MLL differs from classical machine learning by tackling the learning problem from a different perspective. In contrast to classical classification tasks, where each observation belongs to exactly one of several mutually exclusive classes, in MLL the decision areas of labels (i.e. classes) overlap. Observations are therefore annotated (rather than classified) with zero, one or several labels. In addition, instead of expressing the presence or absence of a label as a binary variable, it is possible to express the confidence in the presence of a label through a score or a probability. This formulation is natural for many real-life problems, such as the detection of emotions in music [15], [16], semantic scene classification [17] or the classification of text into topics [18].
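A minimal sketch of the difference between scores and label sets follows. The label names, the score values and the 0.5 threshold are hypothetical, and the scores are assumed to come from some already trained multi-label model.

```python
def annotate(scores, threshold=0.5):
    """Return the set of labels whose confidence score reaches the threshold."""
    return {label for label, score in scores.items() if score >= threshold}

def rank(scores):
    """Order labels from most to least confident."""
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-label confidence scores for a single patient.
scores = {"diabetes": 0.91, "hypertension": 0.62, "thyroid": 0.08}

labels = annotate(scores)              # several labels can be relevant at once
no_labels = annotate(scores, 0.95)     # the empty set is also a valid annotation
ranking = rank(scores)
```

Keeping the raw scores makes it possible to evaluate a model with both set-based metrics (computed on the thresholded labels) and ranking-based metrics (computed on the ordering).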
Several research works have applied such approaches in the medical domain. In genomics, Barutcuoglu et al. proposed a Bayesian framework for the prediction of gene function [19]. A Support Vector Machine (SVM) is trained independently for each gene function, then a Bayesian network is built to combine the results of the multiple classifiers. The graph structure of the network is based on a hierarchical gene taxonomy. The aim of this network is to avoid inconsistent sets of predictions, where for a given gene a specific label is predicted as relevant while its inclusive parent label is predicted as irrelevant. In biology, Xiao et al. developed the iLoc-Virus predictor [20] for predicting the subcellular locations of proteins from their sequence information. Their work focuses on viral proteins, i.e. those generated by viruses. Being able to predict the locations of viral proteins in a virus-infected cell is important for improving antiviral drugs. As a viral protein can have more than one location, MLL methods are well suited to the task, and the ML-kNN [21] algorithm was chosen for their predictor. The following works focus on chronic diseases, although they are based not on MLL but on related techniques. Huang et al. proposed a system for the prognosis and diagnosis of chronic diseases based on data mining and case-based reasoning [22]. Data mining techniques are used to discover patterns in health examination data. More precisely, a decision tree induction algorithm is applied to find rules that are then used to classify new cases of chronic diseases. Afterwards, case-based reasoning, which consists in analyzing old cases to provide a solution for a new case, supports physicians in the diagnosis and treatment of chronic diseases.
Regarding the evaluation, the experimental data were collected from a professional health examination center, and a feasibility test was performed with 12 real discharged cases. Amaral et al. developed a clinical decision support system to assess patients affected by chronic obstructive pulmonary disease (COPD) based on the forced oscillation technique (FOT) [23]. FOT is a noninvasive method for assessing breathing mechanics that applies small-amplitude pressure oscillations to the respiratory system and evaluates the flow response. Several machine learning classifiers were evaluated, such as naive Bayes (NB), k-nearest neighbors (KNN), decision trees (DT), artificial neural networks (ANN) and support vector machines (SVM). On a dataset of 50 volunteers (25 of whom have COPD), non-linear classifiers such as ANN and SVM and the lazy learning KNN classifier reached an accuracy adequate for COPD clinical diagnosis (sensitivity , specificity ).
We are motivated by the problem of studying multi-label learning techniques for the analysis of clinical data in order to identify patients who may be affected by chronic diseases. We use the MIMIC-II clinical database [24], in which 19,773 patients of various intensive care units (ICUs) are diagnosed with one or several chronic diseases according to the coding scheme of the International Classification of Diseases, revision 9 (ICD-9). Being able to characterize patients based on their clinical data opens up several applications, such as the identification of patient cohorts for comparative effectiveness studies or clinical decision support systems [14]. In a previous paper [25], Bromuri et al. reported a new classifier that combines BoW and supervised dimensionality reduction algorithms to perform multi-label classification on the health records of chronically ill patients. In the course of that research, we identified the following new challenges. Although the quantization method used (BoW) is convenient for feature extraction from irregular time series, we think that a finer feature extraction approach based on summary statistics [14] will improve the results while making it easier to identify the influential characteristics.
In addition, evaluating a new MLL technique for the classification of chronic diseases from clinical data is made difficult by the absence of studies providing a large experimental comparison of state-of-the-art MLL algorithms on such data. The main contribution of this work is a large experimental review of multi-label learning approaches for the analysis of clinical data of chronically ill patients. We provide an extended description of the properties of the dataset, of the way features are extracted using summary statistics and of how the evaluation is conducted.
The rest of this document is organized as follows: Section 2 presents background on evaluation metrics and methods for multi-label learning; Section 3 describes the MIMIC-II database and its properties; Section 4 defines the methodology for building models; Section 5 presents the results for the multi-label algorithms considered in this study; finally, Section 6 concludes the paper and outlines future work.
Background
This section begins with the formal definition of an MLL problem and its related evaluation metrics. Then, the state of the art in MLL techniques is described.
With L for the finite set of labels, and with X for the domain of observation, the training set T is defined as . Based on these definitions, a multi-label classifier h is defined as . In addition, some evaluation metrics are based on the output of a real-valued scoring function f
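For reference, a formulation that is common in the MLL literature, and that may differ in notation from the one used in this paper, is sketched below. The symbols m (number of training examples), x_i (an observation) and Y_i (its label set) are assumptions introduced here for illustration.

```latex
% Assumed notation: m training examples, x_i an observation, Y_i its set of labels
\[ T = \{ (x_i, Y_i) \mid x_i \in X,\ Y_i \subseteq L,\ 1 \le i \le m \} \]
% A multi-label classifier maps an observation to a subset of the labels
\[ h : X \rightarrow 2^{L} \]
% A real-valued scoring function assigns a confidence to each (observation, label) pair
\[ f : X \times L \rightarrow \mathbb{R} \]
```

Under this reading, h returns a (possibly empty) set of labels, while f supports ranking-based evaluation metrics by ordering the labels for a given observation.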
Materials
In this section, we describe the characteristics of the MIMIC-II clinical database [24]. We also explain how we use these data for our study related to chronic diseases.
The data were gathered during a seven-year period, beginning in 2001, from the intensive care units (ICUs) of Boston's Beth Israel Deaconess Medical Center (BIDMC). The MIMIC-II clinical database [24] is publicly and freely available after registration. The latest release of the database contains around 33,000 patients. We choose to
Methods
In this section we describe the feature extraction and the standardization that we apply to the data, then we describe the multi-label learning algorithms considered in this study.
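The exact standardization is not specified in this excerpt. As an illustration only, the sketch below assumes a z-score transformation (zero mean, unit variance per feature) whose parameters are estimated on the training data and reused on unseen data; the function names and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def fit_standardizer(column):
    """Estimate the mean and standard deviation of one feature on training values."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        sigma = 1.0   # constant feature: avoid division by zero
    return mu, sigma

def standardize(column, mu, sigma):
    """Apply the z-score transformation with previously estimated parameters."""
    return [(value - mu) / sigma for value in column]

# Hypothetical values of one extracted feature.
train = [2.0, 4.0, 6.0]
mu, sigma = fit_standardizer(train)
train_z = standardize(train, mu, sigma)
test_z = standardize([8.0], mu, sigma)   # reuse training parameters on unseen data
```

Estimating the parameters on the training split only keeps information from the test split out of the model, which matters when results are compared across algorithms.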
Experiments
In this section we describe how the experiments were conducted and we discuss the results.
Regarding the software environment, all the multi-label learning algorithms and evaluation metrics were implemented in the Java programming language. The following Java libraries were used: Mulan (version 1.4) and Weka (version 3.7.6). The operating system was Ubuntu Linux 12.04 LTS (64-bit). Regarding the hardware environment, we used a workstation
Conclusion
In this contribution we presented an evaluation of multi-label learning algorithms on patients affected by chronic diseases. The emphasis of the work is on modeling the relationship between different chronic illnesses by means of the multi-label paradigm. In this study we worked with the MIMIC-II dataset, which contains a large number of patient records. This leads to scalability problems with classifiers where multiple parameters need to be optimized, such as the use of a
Conflict of interest statement
This work was partially supported by the EU FP7 287841 COMMODITY12 project.
Acknowledgment
This work was partially supported by the EU FP7 287841 COMMODITY12 project.
References (53)
- et al. A comparative risk assessment of burden of disease and injury attributable to 67 risk factors and risk factor clusters in 21 regions, 1990–2010: a systematic analysis for the global burden of disease study 2010. Lancet (2013)
- et al. National, regional, and global trends in fasting plasma glucose and diabetes prevalence since 1980: systematic analysis of health examination surveys and epidemiological studies with 370 country-years and 2.7 million participants. Lancet (2011)
- et al. Understanding the nature of information seeking behavior in critical care: implications for the design of health information technology. Artif. Intell. Med. (2013)
- et al. An extensive experimental comparison of methods for multi-label learning. Pattern Recognit. (2012)
- et al. Bag-of-words representation for biomedical time series classification. Biomed. Signal Process. Control (2013)
- et al. Learning multi-label scene classification. Pattern Recognit. (2004)
- et al. iLoc-Virus: a multi-label learning classifier for identifying the subcellular localization of virus proteins with both single and multiple sites. J. Theor. Biol. (2011)
- et al. ML-KNN: a lazy learning approach to multi-label learning. Pattern Recognit. (2007)
- et al. Integrating data mining with case-based reasoning for chronic diseases prognosis and diagnosis. Expert Syst. Appl. (2007)
- et al. Machine learning algorithms and forced oscillation measurements applied to the automatic identification of chronic obstructive pulmonary disease. Comput. Methods Progr. Biomed. (2012)
- Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms. J. Biomed. Inf.
- National health spending by medical condition, 1996–2005. Health Aff.
- Commodity 12: a smart e-health environment for diabetes management. J. Ambient Intell. Smart Environ.
- Endocrinology in crisis? South. Med. J.
- Mining multi-label data.
- A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng.
- Supervised patient similarity measure of heterogeneous patient records. SIGKDD Explor. Newsl.
- Multi-label classification of emotions in music.
- Hierarchical multi-label prediction of gene function. Bioinformatics
- Multiparameter intelligent monitoring in intensive care II (MIMIC-II): a public-access intensive care unit database. Crit. Care Med.
- BoosTexter: a boosting-based system for text categorization. Mach. Learn.