Elsevier

Computers in Biology and Medicine

Volume 65, 1 October 2015, Pages 34-43
Performance comparison of multi-label learning algorithms on clinical data for chronic diseases

https://doi.org/10.1016/j.compbiomed.2015.07.017

Highlights

  • We evaluate multi-label learning algorithms for the analysis of clinical data.

  • We focus on patients affected by multiple chronic diseases.

  • We use a summary statistics approach to extract features on medical time series.

Abstract

We are motivated by the issue of classifying diseases of chronically ill patients to assist physicians in their everyday work. Our goal is to provide a performance comparison of state-of-the-art multi-label learning algorithms for the analysis of multivariate sequential clinical data from medical records of patients affected by chronic diseases. The multi-label learning approach appears to be a good candidate for modeling the overlapping medical conditions that are typical of chronically ill patients. The availability of such a comparison study should make it easier to evaluate new algorithms.

Regarding the method, we choose a summary statistics approach for the processing of the sequential clinical data, so that the extracted features maintain an interpretable link to their corresponding medical records. The publicly available MIMIC-II dataset, which contains more than 19,000 patients with chronic diseases, is used in this study. For the comparison, we selected the following multi-label algorithms: ML-kNN, AdaBoostMH, binary relevance, classifier chains, HOMER and RAkEL.

Regarding the results, binary relevance approaches, despite their elementary design and their independence assumption concerning the chronic illnesses, perform optimally in most scenarios, in particular for the detection of relevant diseases. In addition, binary relevance approaches scale up to large datasets and are easy to learn. However, the RAkEL algorithm, despite its scalability problems when confronted with large datasets, performs well in the scenario that consists of ranking the labels according to the dominant disease of the patient.

Introduction

Chronic diseases, also called noncommunicable diseases (NCDs) [1], are characterized by a long duration and generally a slow progression. Widespread chronic diseases include cardiovascular diseases, chronic respiratory diseases and diabetes. Chronic conditions are a major concern for the public health programs of governments, particularly because of their contribution to the continuous growth of medical care costs [2]. Chronic obstructive pulmonary disease (COPD) is an incurable illness, mainly due to tobacco smoking, for which treatment merely slows the progress of the condition. The World Health Organization (WHO) estimates that 64 million people had COPD worldwide in 2004 [3]. Diabetes, another major chronic disease, affected 347 million people worldwide in 2008 [4]. WHO projects that diabetes will be the 7th leading cause of death in 2030 [5]. Type 2 diabetes accounts for 90% of people with diabetes, and is mostly the consequence of excess body weight and physical inactivity [6].

Despite technical progress in the medical field, which allows patients to be monitored more continuously [7], the treatment of chronically ill patients, who can develop several comorbidities, remains complex for the physician. Continuous monitoring generates larger quantities of data. These measurements are often heterogeneous, such as laboratory tests, physiological values or electrocardiograms. On the one hand, physicians wishing to make optimal decisions have to aggregate the information contained in these data. On the other hand, such aggregation will become (or already is) unmanageable for humans. In addition, physicians are frequently in charge of hundreds of patients, as reported in [8]. Therefore, there is a need for state-of-the-art data mining and machine learning tools to assist physicians by providing aggregated information about their patients. Indeed, as reported in [9], medical doctors would use tools that improve their understanding of an illness even if these involve more cognitive effort than standard practice. Several challenges appear during the design of such tools. Chronically ill patients, such as diabetic patients, frequently suffer from several comorbidities related to the main disease. Recent approaches in the machine learning field, such as Multi-Label Learning (MLL), which has received substantial contributions from the machine learning community in the last few years [10], [11], [12], are therefore good candidates for modeling the profile of a patient affected by several comorbidities. Another challenge concerns the characteristics of medical signals. Clinical data consist of multivariate time series that are often irregular: the number of records varies from one patient to another, and the values can be nonuniformly sampled. Processing data with these characteristics is challenging, and feature extraction techniques are needed. One approach relies on quantization methods, such as k-means clustering and Bag-of-Words (BoW), which have proven successful in several medical data processing tasks [13]. Another approach is to extract summary statistics for the different types of sequential clinical data [14].

MLL differs from classical machine learning by tackling the learning problem from a different perspective. In contrast to classical classification tasks, where each observation belongs to exactly one of several mutually exclusive classes, in MLL the decision areas of the labels (i.e. classes) overlap. Observations are therefore annotated (rather than classified) with zero, one or several labels. In addition, instead of expressing the presence or absence of a label as a binary variable, it is possible to express the confidence in the presence of a label through a score or a probability. This formulation is natural for many real-life problems, such as the detection of emotions in music [15], [16], semantic scene classification [17] or the classification of text into topics [18].
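To make the difference from single-class classification concrete, the following minimal Python sketch encodes label sets as 0/1 indicator vectors and turns per-label confidence scores into an annotation by thresholding. The label names, the scores and the 0.5 threshold are illustrative assumptions and are not taken from the study.

```python
from typing import Dict, List, Set

# Hypothetical label space; the disease names are illustrative only.
LABELS: List[str] = ["diabetes", "hypertension", "copd"]


def to_indicator(label_set: Set[str]) -> List[int]:
    """Encode a set of relevant labels as a 0/1 vector over LABELS."""
    return [1 if label in label_set else 0 for label in LABELS]


def annotate(scores: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Turn per-label confidence scores into a label set by thresholding.

    The threshold value is an assumption chosen for illustration.
    """
    return {label for label, score in scores.items() if score >= threshold}


# An observation may carry one label, several labels, or none at all.
single = to_indicator({"copd"})
several = to_indicator({"diabetes", "hypertension"})
none = to_indicator(set())

# Made-up confidence scores for one patient.
predicted = annotate({"diabetes": 0.91, "hypertension": 0.62, "copd": 0.08})
```

In a single-class setting exactly one entry of the indicator vector would be set; here any subset of entries, including the empty one, is a valid annotation.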

Regarding the application of such approaches in the medical domain, several research works can be mentioned. In genomics, Barutcuoglu et al. proposed a Bayesian framework for the prediction of gene function [19]. A Support Vector Machine (SVM) is trained independently for each gene function, then a Bayesian network is built to combine the results of the multiple classifiers. The graph structure of the network is based on a hierarchical gene taxonomy. The aim of this network is to avoid inconsistent sets of predictions, where for a given gene a specific label is predicted relevant while its inclusive parent label is predicted irrelevant. In biology, Xiao et al. developed the iLoc-Virus predictor [20] for predicting the subcellular locations of proteins from their sequence information. Their work focuses on viral proteins, i.e. those generated by viruses. Being able to predict the locations of viral proteins in a virus-infected cell is important for improving antiviral drugs. As a viral protein can have more than one location, MLL methods are well suited, and the ML-kNN [21] algorithm was chosen for their predictor. The following works focus on chronic diseases, although they are based not on MLL but on related techniques. Huang et al. proposed a system for the prognosis and diagnosis of chronic diseases based on data mining and case-based reasoning [22]. Data mining techniques are used to discover patterns in health examination data. More precisely, a decision tree induction algorithm is applied to find rules that serve to classify the chronic diseases of new cases. Afterwards, case-based reasoning, which consists in analyzing old cases to provide a solution for a new case, supports physicians in the diagnosis and treatment of chronic diseases. Regarding the evaluation, the experimental data were collected from a professional health examination center, and a feasibility test was performed with 12 real discharged cases. Amaral et al. developed a clinical decision support system to assess patients affected by chronic obstructive pulmonary disease (COPD) based on the forced oscillation technique (FOT) [23]. FOT is a noninvasive method for assessing breathing mechanics, which uses small-amplitude pressure oscillations to stimulate the respiratory system in order to evaluate the flow response. Several machine learning classifiers were tested, such as naive Bayes (NB), k-nearest neighbors (KNN), decision trees (DT), artificial neural networks (ANN) and support vector machines (SVM). Based on a dataset of 50 volunteers (25 of whom have COPD), non-linear classifiers such as ANN and SVM and the lazy-learning KNN classifier reached an adequate accuracy for COPD clinical diagnosis (sensitivity >87%, specificity >94%).

We are motivated by the problem of studying multi-label learning techniques for the analysis of clinical data in order to identify patients that may be affected by chronic diseases. We use the MIMIC-II clinical database [24], in which 19,773 patients from various intensive care units (ICUs) are diagnosed with one or several chronic diseases according to the coding scheme of the International Classification of Diseases, revision 9 (ICD-9). Being able to characterize patients based on their clinical data opens up several applications, such as the identification of patient cohorts in the context of comparative effectiveness studies or clinical decision support systems [14]. In a previous paper [25], Bromuri et al. report on a new classifier which combines BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. In the framework of that research, we identified the following new challenges. Although the quantization method (BoW) used is convenient for feature extraction when dealing with irregular time series, we think that a finer feature extraction approach based on summary statistics [14] will improve the results while making it easier to identify the influential characteristics.

In addition, the evaluation of a new MLL technique for the classification of chronic diseases based on the analysis of clinical data is made difficult by the fact that no study provides a large experimental comparison of state-of-the-art MLL algorithms on such data. The main contribution of this work is a large experimental review of multi-label learning approaches for the analysis of clinical data of chronically ill patients. We provide an extended description of the properties of the dataset, of the way features are extracted using summary statistics and of how the evaluation is conducted.

The rest of this document is organized as follows: Section 2 presents a background on evaluation metrics and methods for multi-label learning; Section 3 describes the MIMIC-II database and its properties; Section 4 defines the methodology for building models; Section 5 presents the results for the multi-label algorithms considered in this study; finally, Section 6 concludes this paper and draws the lines for future work.

Section snippets

Background

This section begins with the formal definition of an MLL problem and its related evaluation metrics. Then, the state of the art of existing MLL techniques is described.

With $L$ denoting the finite set of labels and $X$ the domain of observations, the training set $T$ is defined as $T=\{(x_1,Y_1),(x_2,Y_2),\ldots,(x_n,Y_n)\}$ $(x_i \in X,\ Y_i \subseteq L)$. Based on these definitions, a multi-label classifier $h$ is defined as $h: X \rightarrow 2^L$. In addition, some evaluation metrics are based on the output of a real-valued scoring function $f$
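The definitions above can be illustrated with a small Python sketch. It represents a training set of (observation, label set) pairs, a toy classifier that maps an observation to a subset of the labels, and a toy scoring function used to rank labels. The signature of the scoring function (one real value per observation and label), the choice of Hamming loss as the example metric, and all data values are assumptions made for illustration, since the snippet is truncated before the metrics are defined.

```python
from typing import Callable, FrozenSet, List, Tuple

Observation = Tuple[float, ...]
LabelSet = FrozenSet[str]

# Hypothetical finite label set L.
L: LabelSet = frozenset({"a", "b", "c"})

# Training set T = {(x_i, Y_i)} with x_i in X and Y_i a subset of L (toy values).
T: List[Tuple[Observation, LabelSet]] = [
    ((0.2, 1.0), frozenset({"a"})),
    ((0.9, 0.1), frozenset({"a", "b"})),
    ((0.5, 0.5), frozenset()),
]


def hamming_loss(h: Callable[[Observation], LabelSet],
                 data: List[Tuple[Observation, LabelSet]],
                 labels: LabelSet) -> float:
    """Average fraction of labels on which h(x) and the true set Y disagree."""
    errors = sum(len(h(x) ^ y) for x, y in data)  # ^ is the symmetric difference
    return errors / (len(data) * len(labels))


def rank_labels(f: Callable[[Observation, str], float],
                x: Observation, labels: LabelSet) -> List[str]:
    """Order labels by decreasing score f(x, label); ties broken by name."""
    return sorted(labels, key=lambda label: (-f(x, label), label))


def h_toy(x: Observation) -> LabelSet:
    """Toy classifier h: X -> 2^L that always predicts {"a"}."""
    return frozenset({"a"})


def f_toy(x: Observation, label: str) -> float:
    """Toy scoring function that reads the score directly from x."""
    return {"a": x[0], "b": x[1], "c": 0.0}[label]


loss = hamming_loss(h_toy, T, L)
ranking = rank_labels(f_toy, (0.2, 1.0), L)
```

Set-based metrics such as the one sketched here only need the output of the classifier, whereas ranking-based metrics need the scores.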

Materials

In this section, we describe the characteristics of the MIMIC-II clinical database [24]. We also explain how we use these data for our study related to chronic diseases.

The data were gathered during a seven-year period, beginning in 2001, from the Intensive Care Unit (ICU) of Boston's Beth Israel Deaconess Medical Center (BIDMC). The MIMIC-II clinical database [24] is publicly and freely available after registration. The last release of the database contains around 33,000 patients. We choose to

Methods

In this section we describe the feature extraction and the standardization that we apply to the data, then we describe the multi-label learning algorithms considered in this study.
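As a rough illustration of these two preprocessing steps, the Python sketch below reduces variable-length measurement series to a fixed-length vector of summary statistics and then applies a column-wise z-score. The particular statistics (mean, standard deviation, minimum, maximum, count), the z-score form of standardization and the sample values are assumptions chosen for illustration; the snippet does not state which ones the study uses.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def summarize(values: Sequence[float]) -> List[float]:
    """Fixed-length summary of a variable-length series.

    Returns [mean, population std, min, max, count]; the sampling times
    are ignored, so irregular sampling does not matter here.
    """
    return [mean(values), pstdev(values), min(values), max(values),
            float(len(values))]


def standardize(rows: List[List[float]]) -> List[List[float]]:
    """Column-wise z-score; constant columns are mapped to 0.0."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]
    return [
        [(v - mu) / sd if sd > 0 else 0.0
         for v, mu, sd in zip(row, mus, sigmas)]
        for row in rows
    ]


# Hypothetical readings of one variable; patients have different record counts.
series: Dict[str, List[float]] = {
    "p1": [5.0, 7.0, 9.0],
    "p2": [6.0, 6.0],
    "p3": [4.0, 8.0, 6.0, 10.0],
}

features = [summarize(values) for values in series.values()]
scaled = standardize(features)
```

Each feature keeps a direct reading in terms of the original records (for example, the maximum of a given measurement), which is the interpretability argument made for the summary statistics approach.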

Experiments

In this section we describe how the experiments were conducted and we discuss the results.

Regarding the software environment, all the multi-label learning algorithms and evaluation metrics have been implemented in the Java programming language. The following Java libraries have been used: Mulan (version 1.4) and Weka (version 3.7.6). The operating system is Ubuntu Linux 12.04 LTS (64-bit). Regarding the hardware environment, we used a workstation

Conclusion

In this contribution we presented an evaluation of multi-label learning algorithms on patients affected by chronic diseases. The emphasis of the work is on trying to model the relationship between different chronic illnesses by means of the multi-label paradigm. In this study we have been faced with the MIMIC-II dataset which contains a large number of patient records. This aspect leads to scalability problems with classifiers where multiple parameters need to be optimized, such as the use of a

Conflict of interest statement

This work was partially supported by the EU FP7 287841 COMMODITY12 project.

Acknowledgment

This work was partially supported by the EU FP7 287841 COMMODITY12 project.

References (53)

  • S. Bromuri et al.

    Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms

    J. Biomed. Inf.

    (2014)
  • C. Roehrig et al.

    National health spending by medical condition, 1996–2005

    Health Aff.

    (2009)
  • C. Mathers, D.M. Fat, J. Boerma, The Global Burden of Disease: 2004 Update, World Health Organization,...
  • A. Alwan, et al., Global Status Report on Noncommunicable Diseases 2010, World Health Organization,...
  • K.G.M.M. Alberti, et al., Definition, Diagnosis and Classification of Diabetes Mellitus and its Complications. Part 1:...
  • Ö. Kafalı et al.

    COMMODITY12: a smart e-health environment for diabetes management

    J. Ambient Intell. Smart Environ.

    (2013)
  • J. Ghably et al.

    Endocrinology in crisis?

    South. Med. J.

    (2013)
  • G. Tsoumakas et al.

    Mining multi-label data

  • M.-L. Zhang et al.

    A review on multi-label learning algorithms

    IEEE Trans. Knowl. Data Eng.

    (2014)
  • J. Sun et al.

    Supervised patient similarity measure of heterogeneous patient records

    SIGKDD Explor. Newsl.

    (2012)
  • A. Wieczorkowska et al.

    Multi-label classification of emotions in music

  • K. Trohidis, G. Tsoumakas, G. Kalliris, I.P. Vlahavas, Multi-label classification of music into emotions, in: ISMIR,...
  • A. McCallum, Multi-label text classification with a mixture model trained by EM, in: AAAI'99 Workshop on Text Learning,...
  • Z. Barutcuoglu et al.

    Hierarchical multi-label prediction of gene function

    Bioinformatics

    (2006)
  • M. Saeed et al.

    Multiparameter intelligent monitoring in intensive care II (MIMIC-II): a public-access intensive care unit database

    Crit. Care Med.

    (2011)
  • R. Schapire et al.

    BoosTexter: a boosting-based system for text categorization

    Mach. Learn.

    (2000)