Structural MRI-based detection of Alzheimer's disease using feature ranking and classification error

https://doi.org/10.1016/j.cmpb.2016.09.019

Highlights

  • Feature selection based on feature ranking is introduced for high-dimensional pattern recognition.

  • The raw feature vectors are built from the voxels within statistically derived 3D masks.

  • The dimensionality of the ranked features is reduced by determining the optimal number of top-ranked features.

  • Classification error is used to determine the optimal number of top-ranked features.

Abstract

Background and objective

This paper presents an automatic computer-aided diagnosis (CAD) system based on feature ranking for detection of Alzheimer's disease (AD) using structural magnetic resonance imaging (sMRI) data.

Methods

The proposed CAD system is composed of four systematic stages. First, global and local differences in the gray matter (GM) of AD patients compared to the GM of healthy controls (HCs) are analyzed using a voxel-based morphometry technique. The aim is to identify significant local differences in GM volume as volumes of interest (VOIs). Second, the voxel intensity values of the VOIs are extracted as raw features. Third, the raw features are ranked using seven feature-ranking methods, namely, statistical dependency (SD), mutual information (MI), information gain (IG), Pearson's correlation coefficient (PCC), t-test score (TS), Fisher's criterion (FC), and the Gini index (GI). Features with higher scores are more discriminative. To determine the number of top features, the classification error is estimated on a training set made up of the AD and HC groups, and the feature-vector size that minimizes this error is selected as the set of top discriminative features. Fourth, classification is performed using a support vector machine (SVM). In addition, a data fusion approach among the feature-ranking methods is introduced to improve classification performance.
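The ranking stage can be illustrated with a minimal sketch for two of the seven scores. The formulas below are the common textbook forms of the t-test score and the two-class Fisher criterion, which is an assumption, since the abstract does not give the paper's exact definitions. The function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def t_score(a, b):
    """Absolute two-sample t statistic (Welch form) for one feature."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def fisher_score(a, b):
    """Two-class Fisher criterion: squared mean gap over summed variances."""
    spread = variance(a) + variance(b)
    return (mean(a) - mean(b)) ** 2 / spread if spread > 0 else 0.0


def rank_features(X, y, score=t_score):
    """Return feature indices sorted from most to least discriminative.

    X is a list of samples (each a list of feature values); y holds 0/1 labels.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, label in zip(X, y) if label == 1]
        b = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(score(a, b))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data (not from the paper): feature 1 separates the groups,
# features 0 and 2 have the same mean in both groups.
X = [[0.5, 0.9, 0.2], [0.4, 0.8, 0.6], [0.6, 1.0, 0.4],
     [0.5, 0.1, 0.3], [0.4, 0.2, 0.5], [0.6, 0.0, 0.4]]
y = [1, 1, 1, 0, 0, 0]
ranking_t = rank_features(X, y, t_score)
ranking_f = rank_features(X, y, fisher_score)
```

Both scores place feature 1 first on this toy data. In a real pipeline the scores would be computed on the training portion of each fold only, so that the test subjects do not influence the ranking.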

Results

The proposed method is evaluated using a data set from ADNI (130 AD and 130 HC) with 10-fold cross-validation. The classification accuracy of the proposed automatic system for the diagnosis of AD is up to 92.48% using the sMRI data.

Conclusions

An automatic CAD system for the classification of AD based on feature-ranking methods and classification errors is proposed. In this regard, seven feature-ranking methods (i.e., SD, MI, IG, PCC, TS, FC, and GI) are evaluated. The optimal number of top discriminative features is determined by classification error estimation in the training phase. The experimental results indicate that the performance of the proposed system is comparable to that of state-of-the-art classification models.

Introduction

Alzheimer's disease (AD), a progressive irreversible neurodegenerative disorder, occurs most frequently in older adults and gradually destroys regions of the brain that are responsible for memory, thinking, learning, and behavior [1]. An estimated 5.3 million Americans of all ages had AD in 2015 [2]. Among the top 10 causes of death among Americans, AD is the only disease that cannot be cured, prevented, or slowed [2]. Although there is no cure for AD, early detection may shed light on AD mechanisms and improve the responses of AD patients to drug therapy and their quality of life. In recent years, the analysis of neuroimaging data, such as structural magnetic resonance imaging (sMRI) [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], functional MRI [13], [14], [15], and diffusion tensor imaging [16], [17], [18], in addition to positron emission tomography (PET) and single photon emission computed tomography (SPECT) [19], [20], [21], [22], [23], [24], has attracted much interest, with recent improvements in the accurate detection of AD. In this paper, we focus only on the use of sMRI data in the classification of AD. Recently, sMRI brain data have been widely used to design computer-aided diagnosis (CAD) systems for the classification of AD [4], [9], [25], [26], because of the noninvasiveness, excellent spatial resolution, and good tissue contrast of sMRI, and because it requires no injection of radioactive pharmaceuticals, as PET and SPECT do [19], [20], [21], [22]. Many researchers have studied advanced pattern analysis and classification approaches for extracting complex spatial patterns of brain structure [14], [27], [28], [29], [30].

This paper describes the application of an automatic CAD system, which uses statistical feature-ranking methods as part of a novel feature-selection process, followed by estimation of the classification error in AD and healthy control (HC) groups to determine the optimum number of highest-ranking features to be selected. In the training set, resubstitution and cross-validation error estimators were used as classification errors to measure the quality of a classifier. We used these classification error metrics as stopping criteria among the ranked features to estimate the optimal number of features with the most discriminative information in the classification process. We evaluated seven feature-ranking methods in the proposed CAD system, namely, statistical dependency (SD), mutual information (MI), information gain (IG), Pearson's correlation coefficient (PCC), the t-test score (TS), Fisher's criterion (FC), and the Gini index (GI). In the proposed approach, the high-dimensional feature space was reduced to a lower-dimensional space by using the minimized classification error as the dimensionality selection criterion in an iterative process of incrementing the number of ranked features. The proposed feature-selection method was applied to gray matter (GM) atrophy clusters of voxels, which corresponded to the volumes of interest (VOIs) of the sMRI data obtained through the voxel-based morphometry (VBM) analysis during preprocessing. VBM is an advanced method used to assess the whole-brain structure using voxel-by-voxel comparisons [8], [31], [32], [33], [34], [35], [36]. It is one of the best methods for feature extraction from sMRI in AD [9]. In the proposed system, we used only sMRI data.

The proposed CAD system was applied in four stages in a systematic manner. In the first stage, the VBM technique was employed, in addition to diffeomorphic anatomical registration using the exponentiated Lie algebra (DARTEL) [33]. This approach was used to perform group-wise comparisons between cross-sectional structural MRI scans in order to detect the MRI voxels that best discriminated between the AD group and HCs [8], [31], [32], [33]. Using the VBM plus DARTEL approach at the global brain scale, together with regional structural GM alterations, regions with significant GM atrophy in AD patients were identified. In the second stage, the specified VOIs were used as 3D masks for extracting voxel intensity values from GM atrophy regions to generate raw feature vectors. These feature vectors were subjected to further data-selection processes before they were used by the classifier. In the third stage, the extracted features were ranked based on the statistical scores (i.e., SD, MI, IG, PCC, TS, FC, and GI) of the AD and HC groups in the training set. The ranking scores can be considered an indicator of the level of separation/discrimination between the AD and HC groups in the training set. Feature ranking has been used successfully in a number of pattern-recognition studies [37], [38], [39], [40], [41], [42]. In addition, an automatic approach based on classification error estimation was used to determine the number of top features using the AD and HC groups in the training set. This approach adaptively determines the optimum number of top features and identifies a discriminative subset of high-performance features based on the training data in each fold instead of using a fixed number of features. In the fourth stage, the performance of the proposed feature-selection technique was evaluated using a support vector machine (SVM) classifier. In this work, the SVM classifier with a linear kernel was trained to discriminate between the classes. In addition, instead of using a single feature-ranking method, the results of multiple individual feature-ranking methods were combined through the proposed data fusion technique for improved classification performance.
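The error-driven choice of the number of top features can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: a nearest-centroid classifier stands in for the linear SVM so that the example needs only the standard library, leave-one-out is used as the cross-validation error estimator, and the ranking is taken as given. All function names and the toy data are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Assign x to the class whose mean vector is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_error(X, y, features):
    """Leave-one-out error on the training set, using only `features`."""
    Xs = [[row[j] for j in features] for row in X]
    wrong = 0
    for i in range(len(Xs)):
        rest_X = Xs[:i] + Xs[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        wrong += nearest_centroid_predict(rest_X, rest_y, Xs[i]) != y[i]
    return wrong / len(Xs)


def select_top_k(X, y, ranking):
    """Grow the feature set along the ranking; keep the size with lowest error."""
    best_k, best_err = 1, float("inf")
    for k in range(1, len(ranking) + 1):
        err = loo_error(X, y, ranking[:k])
        if err < best_err:  # strict '<' keeps the smallest k among ties
            best_k, best_err = k, err
    return best_k, best_err


# Toy training set (not from the paper): feature 1 is informative,
# features 0 and 2 are noise.
X_train = [[0.1, 0.9, 0.7], [0.8, 0.8, 0.1], [0.3, 1.0, 0.9],
           [0.9, 0.1, 0.8], [0.2, 0.2, 0.2], [0.7, 0.0, 0.6]]
y_train = [1, 1, 1, 0, 0, 0]
ranking = [1, 0, 2]  # assumed output of a feature-ranking step
k, err = select_top_k(X_train, y_train, ranking)
```

On this toy set the single top-ranked feature already gives zero leave-one-out error, so the loop settles on one feature. Replacing `loo_error` with an error computed on the same samples used for fitting would give the resubstitution estimator instead.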

In summary, the aim of this study was to design an automatic CAD system based on statistical feature ranking and classification errors as part of a novel feature-selection method. The proposed system utilizes feature ranking based on statistical scores, followed by the determination of resubstitution and cross-validation error estimators to identify the number of ranked features that minimizes the error in the training set. This process identifies a discriminative subset of high-performance features, giving a lower-dimensional feature vector space to represent the sMRI images. In addition, a data fusion technique was proposed to improve the AD classification performance across different feature-ranking methods. The performance of the proposed system was assessed using a data set from the Alzheimer's Disease Neuroimaging Initiative (ADNI) containing 260 subjects (130 AD patients and 130 HCs) using 10-fold cross-validation. The experimental results showed that the accuracy (ACC) (92.48%), sensitivity (SEN) (91.07%), specificity (SPE) (93.89%), and area under the curve (AUC) (0.963) of the proposed system compared well with results obtained with state-of-the-art techniques for AD classification.
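For readers less familiar with these measures, the short sketch below computes ACC, SEN, and SPE from predicted and true labels, treating AD as the positive class. The function name and the toy labels are hypothetical. With equally sized groups, accuracy is the mean of sensitivity and specificity, and the reported figures are consistent with that: (91.07 + 93.89) / 2 = 92.48.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {
        "ACC": (tp + tn) / len(pairs),
        "SEN": tp / (tp + fn),  # fraction of positives (AD) detected
        "SPE": tn / (tn + fp),  # fraction of negatives (HC) detected
    }


# Toy labels (not from the paper): 4 AD (1) and 4 HC (0), one AD case missed.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```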

The rest of the paper is organized as follows: Section 2 details the statistical data in the study. Section 3 describes the proposed methodology to design an automatic CAD system based on feature ranking and classification error. Section 4 presents the experimental results, discussion, and analysis of the proposed system. Finally, Section 5 presents the conclusions.

Section snippets

MRI acquisition

The MR images and data used in this study were obtained from the ADNI database (footnote 1). All participants initially underwent a number of neuropsychological examinations, yielding several clinical indicators, including the Mini-Mental State Examination (MMSE) score and the Clinical Dementia Rating (CDR) score. The MRI scans were 3 Tesla, T1-weighted images acquired on a Siemens scanner with Acquisition Plane = SAGITTAL, Acquisition Type = 3D, Coil = PA, Flip

Proposed CAD classification system

In this section, an automatic CAD system, which is based on feature ranking, followed by optimal selection of a number of top features using a classification error for high-performance AD classification, is introduced. An outline of the proposed ranking-based CAD system is illustrated in Fig. 1. First, the VBM and DARTEL approach was employed to preprocess 3D T1-weighted MRI data. Second, voxel-based feature extraction was performed. Third, the extracted features were ranked based on the score
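The voxel-based feature extraction step, in which a VOI acts as a 3D mask over a GM map, can be sketched as below. This is a minimal illustration with nested lists, assuming the GM volume and the binary mask share the same grid. The function name and the tiny 2 x 2 x 3 arrays are hypothetical, and a real pipeline would read registered image volumes instead.

```python
def extract_voi_features(volume, mask):
    """Flatten the voxels of `volume` where `mask` is nonzero, in scan order."""
    return [
        voxel
        for vol_slice, mask_slice in zip(volume, mask)
        for vol_row, mask_row in zip(vol_slice, mask_slice)
        for voxel, keep in zip(vol_row, mask_row)
        if keep
    ]


# Toy 2 x 2 x 3 GM map and binary VOI mask (values are illustrative only).
gm = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
      [[0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]]
voi_mask = [[[0, 1, 0], [0, 0, 1]],
            [[1, 0, 0], [0, 0, 0]]]
features = extract_voi_features(gm, voi_mask)
```

Applying the same mask to every subject yields raw feature vectors of equal length, which is what the ranking stage needs.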

Experimental results and discussion

In this section, we consider the experimental results obtained in the preprocessing phase using VBM plus DARTEL analysis of 3D T1-weighted MR images, which indicate the significance of the decreased gray matter volumes in AD patients that define the VOIs. The experimental data consisted of 260 samples from an ADNI data set. A 10-fold cross-validation was employed throughout the performance analysis, with 234 (90%) samples in the training sample and 26 (10%) samples in the testing processes
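The 234/26 split follows directly from dividing 260 samples into 10 folds. The sketch below shows one way to generate such folds. It is a plain shuffled split, which is an assumption: the snippet does not say whether the folds were stratified by diagnosis. The function name and the seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every n_folds-th shuffled index goes to the same fold.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx


splits = list(k_fold_indices(260))
sizes = [(len(train), len(test)) for train, test in splits]
```

Each of the 10 folds holds out 26 samples and trains on the remaining 234, and every sample is held out exactly once.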

Conclusion

This paper proposed an automatic CAD system for the classification of AD based on seven feature-ranking methods (i.e., SD, MI, IG, PCC, TS, FC, and GI) and classification errors (i.e., resubstitution and cross-validation errors). The optimal size of the selected features was determined by classification error estimation, which minimized the classification error in the training phase. This approach was applied to extracted raw features obtained from GM atrophy clusters of VOIs, which were

Conflict of interest

All authors declare no conflict of interest.

Ethical approval

All authors confirm that the data used in our research involving human participants were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (www.loni.ucla.edu/ADNI).

Acknowledgments

This work was partly carried out under the Brain Mapping by Integrated Neurotechnologies for Disease Studies (Brain/MINDS) project (grant number 16dm0207017h0003), funded by the Japan Agency for Medical Research and Development (AMED). This work has also been partially supported by project grants from the Beijing Nova Program (xx2016120), the National Natural Science Foundation of China (81101107, 31640035), the Natural Science Foundation of Beijing (4162008) and program for top young innovative talents of

References (86)

  • D. Chyzhyk et al. Hybrid dendritic computing with kernel-LICA applied to Alzheimer's disease detection in MRI. Neurocomputing (2012)
  • A.H. Andersen et al. Partial least squares for discrimination in fMRI data. Magn. Reson. Imaging (2012)
  • Y. Fan et al. Structural and functional biomarkers of prodromal Alzheimer's disease: a high-dimensional pattern classification study. Neuroimage (2008)
  • M. Graña et al. Computer Aided Diagnosis system for Alzheimer Disease using brain Diffusion Tensor Imaging features selected by Pearson's correlation. Neurosci. Lett. (2011)
  • W. Lee et al. Classification of diffusion tensor images for the early detection of Alzheimer's disease. Comput. Biol. Med. (2013)
  • H. Hanyu et al. The progression of cognitive deterioration and regional cerebral blood flow patterns in Alzheimer's disease: a longitudinal SPECT study. J. Neurol. Sci. (2010)
  • K.R. Gray et al. Multi-region analysis of longitudinal FDG-PET for the classification of Alzheimer's disease. Neuroimage (2012)
  • Y.J. Chen et al. A semi-quantitative method for correlating brain disease groups with normal controls using SPECT: Alzheimer's disease versus vascular dementia. Comput. Med. Imaging Graph. (2013)
  • J.M. Górriz et al. GMM based SPECT image classification for the diagnosis of Alzheimer's disease. Appl. Soft Comput. (2011)
  • F. Segovia et al. Early diagnosis of Alzheimer's disease based on partial least squares and support vector machine. Expert Syst. Appl. (2013)
  • J. Ramírez et al. Computer aided diagnosis system for the Alzheimer's disease based on partial least squares and random forest SPECT image classification. Neurosci. Lett. (2010)
  • I. Beheshti et al. Probability distribution function-based classification of structural MRI for the detection of Alzheimer's disease. Comput. Biol. Med. (2015)
  • A. Ortiz et al. LVQ-SVM based CAD tool applied to structural MRI for the diagnosis of the Alzheimer's disease. Pattern Recognit. Lett. (2013)
  • J. Ashburner et al. Voxel-based morphometry – the methods. Neuroimage (2000)
  • X. Guo et al. Voxel-based assessment of gray and white matter volumes in Alzheimer's disease. Neurosci. Lett. (2010)
  • Y. Meng et al. Anatomical deficits in adult posttraumatic stress disorder: a meta-analysis of voxel-based morphometry studies. Behav. Brain Res. (2014)
  • J. Jovicich et al. Reliability in multi-site structural MRI studies: effects of gradient non-linearity correction on phantom and human data. Neuroimage (2006)
  • Y. Hirata et al. Voxel-based morphometry to discriminate early Alzheimer's disease from controls. Neurosci. Lett. (2005)
  • J.H. Son et al. Correlation between gray matter volume in the temporal lobe and depressive symptoms in patients with Alzheimer's disease. Neurosci. Lett. (2013)
  • J. Cousijn et al. Grey matter alterations associated with cannabis use: results of a VBM study in heavy cannabis users and healthy controls. Neuroimage (2012)
  • D. Chyzhyk et al. Computer aided diagnosis of schizophrenia on resting state fMRI data by ensembles of ELM. Neural Netw. (2015)
  • J. Pohjalainen et al. Feature selection methods and their combinations in high-dimensional classification of speaker likability, intelligibility and personality traits. Comput. Speech Lang. (2015)
  • N. Zhou et al. A modified T-test feature selection method and its application on the HapMap genotype data. Genomics Proteomics Bioinformatics (2007)
  • C. Cabral et al. Predicting conversion from MCI to AD with FDG-PET brain images at different prodromal stages. Comput. Biol. Med. (2015)
  • Q. Gao et al. Enhanced fisher discriminant criterion for image recognition. Pattern Recognit. (2012)
  • I. Dimitrovski et al. Improved medical image modality classification using a combination of visual and textual features. Comput. Med. Imaging Graph. (2015)
  • T. Xue et al. Neural specificity of acupuncture stimulation from support vector machine classification analysis. Magn. Reson. Imaging (2011)
  • X. Song et al. A SVM-based quantitative fMRI method for resting-state functional network detection. Magn. Reson. Imaging (2014)
  • C. Hinrichs et al. Predictive markers for AD in a multi-modality framework: an analysis of MCI progression in the ADNI population. Neuroimage (2011)
  • A. Farzan et al. Boosting diagnosis accuracy of Alzheimer's disease using high dimensional recognition of longitudinal brain atrophy patterns. Behav. Brain Res. (2015)
  • L. Wang et al. Multi-task feature selection via supervised canonical graph matching for diagnosis of autism spectrum disorder. Brain Imaging Behav. (2015)
  • R. Genuer et al. Variable selection using random forests. Pattern Recognit. Lett. (2010)
  • Z. Lao et al. Morphological classification of brains via high-dimensional shape transformations and machine learning methods. Neuroimage (2004)

1. Data used in this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (www.loni.ucla.edu/ADNI). ADNI investigators other than those listed above contributed to study design, implementation or data provision but did not participate in the analyses or writing of this report. The complete listing of ADNI investigators is available at http://www.loni.ucla.edu/ADNI/Data/ADNI_Authorship_List.pdf.
