Alzheimer’s Disease Diagnosis by Using Dimensionality Reduction Based on KNN Classifier

Data mining is a fast-developing technology with a wide range of applications, and medical data mining is one of its essential areas. The healthcare industry holds a huge amount of information, much of it sensitive, which must be handled carefully and without misuse. Healthcare offers a wealth of data, yet effective analysis tools for discovering hidden relationships in that data are lacking. Numerous data mining methods have been applied in the healthcare industry, but the performance of the various classification techniques still needs investigation. This paper proposes a novel dimensionality reduction based KNN classification algorithm for analyzing and classifying Alzheimer’s disease and Mild Cognitive Impairment in the datasets. The National Alzheimer’s Coordinating Center (NACC) provides the Researcher’s Data Dictionary - Uniform Data Set (RDD-UDS), a dataset that researchers use to analyze clinical and statistical information. This research work yields higher accuracy, sensitivity, and specificity percentages and thus provides a better result.


INTRODUCTION
Data mining is a dominant technology with high potential to support the discovery of hidden predictive information in huge datasets. The extracted data are loaded into a data warehouse, where they are stored and managed in multidimensional databases. Nowadays, applications of data mining play a vital role in healthcare systems, because the health sector contains rich information, and data mining has become an essential technology there. It is used to extract information from the database whenever that information is needed for processing. Understanding these techniques improves the efficiency and quality of disease analysis in the given datasets 1,2. Dementia is a syndrome comprising a varied set of symptoms, including degradation of spatial sense, memory loss, and impaired decision making, manipulation ability, abstract thinking, and attention. Patients may behave in complex ways because of internal stimuli, identity changes, misunderstandings, or hallucinations. The seriousness of such manifestations may affect patients' relationships and capacity to work 3.
Alzheimer's Disease (AD) is the most prominent type of dementia; its common symptoms are memory loss and changes in thinking capacity, behavior, mood, and other cognitive abilities that are severe enough to affect day-to-day life. Memory problems are usually among the first signs of Alzheimer's disease, but different persons can show different symptoms initially. The disease has several stages. In the preclinical stage there are no symptoms, but toxic changes take place in the brain. In the early stage the symptoms are slight changes that are barely noticeable. In the middle stage there is memory loss and confusion, and people may have difficulty recognizing family and friends. The final stage brings more severe problems, with loss of the ability to communicate, sleeping problems, and further weight loss. Mild cognitive impairment (MCI) causes slight changes in cognitive abilities that are noticed by the individuals experiencing them or by other people. These changes do not affect the individual's daily life.
An LC-kNN classifier has been used to detect Alzheimer's disease (AD). This classifier distinguishes the disease using representations of high compactness (low dimension) and discriminability. The INs representations, together with the adaptive nature of the distance metric used, are assumed to be the key elements of the LC-kNN classifier. It relies on a component that reduces the dimension of the data vectors by taking the classes into account 4. The objective of this work is to detect and classify Alzheimer's disease and Mild Cognitive Impairment using the dimensionality reduction based KNN classification algorithm.
Sensitive techniques based on cerebrospinal fluid (CSF) biomarkers and brain imaging have been used to detect the disease at its initial stage and so improve its prognosis. However, there is a lack of blood-based biomarkers that have been replicated across a large number of studies and that are clinically useful for detecting the difficulties individuals face at the onset of AD. A few serum markers have been described that may arise from inflammatory activity in the central nervous system in the early course of AD.
The remainder of this paper is structured as follows. Section II discusses related work on the detection of Alzheimer's disease in the data mining domain. Section III defines the execution process of the novel dimensionality reduction based KNN classification algorithm. Section IV presents the performance analysis of KNN against existing approaches. Finally, Section V gives the conclusions on the diagnosis of AD using the KNN algorithm.

Related Works
This section describes the detection of Alzheimer's disease and Mild cognitive impairment (MCI) from the datasets. Lu, et al. 5 discovered high-quality rules from a given set of data. While substantial work on using neural networks for classification has been reported, none of it can generate rules of a quality comparable to those generated by Neuro Rule. The remaining issue was to reduce the training time of neural networks. Gironi, et al. 6 applied Artificial neural networks (ANNs), which are used to build simulated models inspired by the central nervous system. These computational models are capable of learning and of detecting patterns. Their work aimed to reveal the correlation between immunological and oxidative stress markers in AD and MCI through the application of ANNs.
Al-nuaimi, et al. 7 presented an approach in which Tsallis entropy, an information-theoretic measure, is used to quantify changes in the EEG. The EEG provides useful insight into brain function and, used as a decision-support tool, plays an important role in the early detection and diagnosis of dementia. It has high temporal resolution, is noninvasive, and is low in cost. Dementia damages brain cells, and this alters the features of the EEG. Information-theoretic methods have emerged as a possible way to compute these changes in the EEG as biomarkers of dementia. Ortiz, et al. 8 showed that the neurodegenerative process produces changes in brain network connectivity as well as physical changes in brain activation patterns. The Sparse Inverse Covariance Estimation (SICE) method was used to derive an experimental estimate of the brain network with the help of Gaussian graphical models. This made it possible to build a graphical view of brain connectivity that exhibits the inter-regional brain network. Covariance-based methods, by contrast, cannot evaluate conditional independence between variables, because they do not factor out the influence of other regions.
Savio and Graña 9 identified significant morphological variations in brain anatomy that are relevant to the brain atrophy of AD. They proposed a Computer Aided Diagnosis (CAD) method whose features are chosen from deformation fields and modulated GM obtained from a non-linear registration process. The localization of the discriminant sites was consistent with the literature on imaging biomarkers of AD. The differences in classification results among the deformation measures were not significant. Bhat, et al. 4 stated that the diagnosis of AD is still a challenging task and therefore presented a survey of recent research on automated electroencephalography (EEG) based diagnosis of AD. Neuroprotective and symptomatic treatments, such as anti-oxidants and neurotransmitters, have proven effective in treating AD symptoms and in delaying the development of the disease. The limitation noted is the lack of accuracy in EEG-based diagnosis of AD. Kapur, et al. 10 described how data produced in daily life is of use only when it is extracted at the time it is needed for processing. They used data to gauge students' potential from various indicators, such as previous performance and, in other cases, background, and gave a comparative account of which method was best for achieving that end. They compared six algorithms, including Random Forest, Naive Bayes, Naive Bayes Multinomial, K-star, and IBk. In that study, the accuracy of all the algorithms was too low for them to be implemented on a large scale.
Walker, et al. 11 discussed the difficulty of handling microarray data that come from two known classes, Alzheimer and normal. They suggested three different techniques for identifying the genes associated with Alzheimer's disease (AD). The main process used gene expression data for disease classification and diagnosis. The work considered only the data for that process. Seixas, et al. 12 described how early diagnosis of this kind of disease allows treatment to be taken in advance and supports quicker recovery. To this end they developed a Bayesian network decision model to assist in the diagnosis of dementia, AD, and Mild Cognitive Impairment (MCI). Bayesian networks are suitable for representing the uncertainty and causality that exist in medical domains. Candás, et al. 13 held that detecting disorders and diseases from abnormal human activities under free-living conditions is a reliable process. They proposed an automatic data mining method based on physical activity measurements to evaluate human activities. However, they did not consider real-time analysis of the operation to obtain effective performance. Bang, et al. 14 noted that several research works had been carried out to improve dementia identification in the area of computer-aided diagnosis (CAD). Their work implemented a quad-phased data mining model consisting of four segments. In the proposed module, the analytical measures that are significant for effective detection were selected.
Trambaiolli, et al. 15 aimed to assess the significance of feature selection (FS) methodologies applied to EEG-based analysis of Alzheimer's Disease (AD) and to compare the selected features with previous medical findings. This was assessed through the leave-one-subject-out accuracy of Support Vector Machine classifiers built from the datasets defined by the selected features. Zhang, et al. 16 developed a hybrid classification system for distinguishing NC, MCI, and AD based on structural MRI images. They used MRI data, demographics, clinical analysis, and the resulting anatomic volumes as the training data. The CDR was used as the target data. They used PCA to reduce the dimensionality of the feature vectors of the MRI data, and the resulting principal components retained the significant information. Bull, et al. 17 proposed the open-source SAMS framework for data collection, text collection, and analysis. It faced a number of challenges, the primary one being to deploy it on real users' home computers and to gather data as they used their computers for everyday tasks. However, resources did not allow SAMS product configuration lines to be established for every type, brand, and version of desktop computer, operating system, web browser, and desktop application. Adeniyi, et al. 18 applied a K-NN classification method with the Euclidean distance, which produced suitable and reasonably good classifications and recommendations for the customer at any time. The system classifies users on simulated active sessions extracted from testing sessions by collecting the active users' click stream and matching it with a similar class in the data mart. This is used to produce a set of recommendations for the customer in real time. However, it has low efficiency.

Proposed work
This section presents the implementation details of the proposed novel dimensionality reduction based KNN classification algorithm for classifying Alzheimer's disease. Handling a large number of data samples while achieving better classification depends on the following models of the proposed work, as shown in Fig. 2. The aim of this research work is to detect Alzheimer's Disease (AD) in the datasets using the novel dimensionality reduction based KNN classification algorithm. The input datasets contain a large amount of data covering all types of information, including Normal, Mild Cognitive Impairment (MCI), and Alzheimer's disease (AD) cases. Data are collected from the datasets for testing and training. A feature extraction method is used for selection; it extracts the data required for further processing. A logarithm is then applied to the data and the mean value is computed. The data also undergo a preprocessing step that filters unwanted data out of the datasets.
The extracted mean values are divided by the size of the log features, and the feature reduction operation is then executed. The same steps are repeated for the training process. Finally, the outcomes of the training and testing processes are classified into Normal, MCI, and AD, based on the analysis of the information in the datasets.
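The final classification step can be illustrated with a plain k-nearest-neighbour vote over log-transformed features. The following is a minimal sketch, not the authors' implementation: the toy feature values, the choice of `k=3`, the use of `log(1 + x)` as the logarithm, and the function names `log_features` and `knn_predict` are all illustrative assumptions, and the mean-based reduction step described above is omitted.

```python
import math
from collections import Counter

def log_features(row):
    """Apply log(1 + x) to every feature value (assumes non-negative inputs)."""
    return [math.log1p(x) for x in row]

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_rows, train_labels, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    distances = sorted(
        (euclidean(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy training data (hypothetical values, two features per subject).
raw_train = [[1, 2], [2, 1], [10, 12], [12, 10], [40, 45], [45, 40]]
labels = ["Normal", "Normal", "MCI", "MCI", "AD", "AD"]

train = [log_features(r) for r in raw_train]
prediction = knn_predict(train, labels, log_features([11, 11]), k=3)
print(prediction)
```

The same transformation is applied to training rows and to the query, mirroring the statement that the steps are repeated for the training and testing processes.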

Data Preprocessing
Preprocessing ensures that disease analysis operates effectively. The main purpose of data preprocessing is to reduce the number of features used by the classification method. Data preprocessing involves the following steps:
• Data collection
• Processing the information
• Removing unwanted basic information such as DOB, sex, education, etc.
• Selecting health-oriented information
All the data in the datasets are processed to separate out the relevant information. The unwanted information is then eliminated from the dataset; the remaining features retain sufficient information for analyzing a person's health condition as Normal, Mild Cognitive Impairment (MCI), or Alzheimer's disease (AD). The full feature set contains all types of data, both relevant and irrelevant to disease analysis. Here, the irrelevant features are removed from the overall feature set, which yields a new set of features for further processing.
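The preprocessing steps above amount to dropping basic demographic fields and keeping the health-oriented ones. A minimal sketch follows, assuming each subject is a dictionary; the field names (`DOB`, `Sex`, `Education`, `memory_score`, `orientation_score`) and the helper `select_health_features` are hypothetical and are not taken from the RDD-UDS dictionary.

```python
# Hypothetical names of the basic fields to discard (compared case-insensitively).
DEMOGRAPHIC_FIELDS = {"dob", "sex", "education"}

def select_health_features(record):
    """Return a copy of record without the basic demographic fields."""
    return {
        key: value
        for key, value in record.items()
        if key.lower() not in DEMOGRAPHIC_FIELDS
    }

# Toy records with made-up values.
records = [
    {"DOB": "1940-05-01", "Sex": "F", "Education": 12,
     "memory_score": 21, "orientation_score": 8},
    {"DOB": "1935-11-23", "Sex": "M", "Education": 16,
     "memory_score": 14, "orientation_score": 5},
]

cleaned = [select_health_features(r) for r in records]
print(cleaned[0])
```

Filtering by an explicit set of field names keeps the original records untouched and makes the list of discarded attributes easy to audit.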

Dimensionality reduction based Classification
Dimensionality reduction techniques are used to handle challenging tasks in data mining. Because of the complexity and ordering of data in real-world applications, constructing a general-purpose dimensionality reduction technique is not a simple task. Dimensionality can be reduced in two ways: by decreasing the number of samples or by decreasing the number of features. Reliable methods for reducing the number of features fall into two kinds of models: feature extraction and feature subset selection. Feature extraction refers to creating new features through a linear or nonlinear transformation from the original input space to a feature space, while feature subset selection discovers certain informative features from the original set.
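The difference between the two kinds of feature reduction can be shown on a toy matrix. This is a sketch under stated assumptions: ranking features by variance is only one possible selection criterion, the projection weights are arbitrary, and neither is claimed to be the criterion used in the proposed algorithm. The names `select_by_variance` and `extract_linear` are illustrative.

```python
from statistics import pvariance

def select_by_variance(rows, m):
    """Feature subset selection: keep the m original columns with highest variance."""
    n_features = len(rows[0])
    variances = [pvariance([row[j] for row in rows]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:m])  # restore original column order
    return keep, [[row[j] for j in keep] for row in rows]

def extract_linear(rows, weights):
    """Feature extraction: each new feature is a linear combination of all originals."""
    return [
        [sum(w * x for w, x in zip(w_row, row)) for w_row in weights]
        for row in rows
    ]

# Toy data: the second column is constant and carries no information.
rows = [[1.0, 5.0, 0.0], [2.0, 5.0, 4.0], [3.0, 5.0, 8.0]]

kept, selected = select_by_variance(rows, m=2)
projected = extract_linear(rows, weights=[[0.5, 0.0, 0.5]])
print(kept, selected, projected)
```

Selection returns a subset of the original columns, so the retained features keep their meaning, whereas extraction produces new values that mix several original features.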

Fig. 1: Workflow of proposed system

Zhang, et al. 19 addressed the early diagnosis or identification of Alzheimer's disease (AD) relative to normal elderly controls. They presented a computer-aided diagnosis (CAD) technique for MR brain images based on eigenbrains and machine learning for the accurate detection of AD and AD-related brain diseases. Here, there was a lack of classification accuracy during the process. Khan, et al. 20 considered the classification of spatial data streams, in which the training dataset changes frequently: newly arriving training data are added to the training set. They proposed a k-nearest neighbor (KNN) classification that finds the k nearest neighbors under a chosen distance metric by computing the distance of the target data point from the points of the training dataset. The problem associated with KNN classifiers is that they significantly increase the classification time. Ferreira, et al. 21 selected the most relevant neuropsychological and demographic variables for prognostic prediction with the help of a genetic algorithm. This selection model was able to predict conversion to dementia with 88 percent sensitivity, which helps to improve the treatment process. Suk, et al. 22 introduced a multitask learning method for feature selection in computer-aided diagnosis of Alzheimer's disease (AD) or Mild Cognitive Impairment (MCI). The distribution of neuroimaging data has multiple peaks or modes. To capture these multi-peak distributional characteristics, they defined subclasses based on the outcomes of a clustering method.
The models of the proposed work are explained as follows.
• Data Pre-Processing
• Dimensionality reduction based classification