A Survey of Data Mining Methods for Automated Diagnosis of Cardiac Autonomic Neuropathy Progression

Cardiac autonomic neuropathy (CAN) is a disease that occurs as a result of nerve damage causing abnormal control of heart rate. CAN is often associated with diabetes and is clinically important because it can lead to increased morbidity and mortality. The detection and management of CAN matter since early intervention can prevent further complications that may lead to sudden death from myocardial infarction or rhythm disturbance. This paper reviews work on developing data mining techniques for automated detection of CAN. A number of different categorizations of CAN progression have been considered in the literature, which makes it more difficult to compare the results obtained in various papers. This is the first review proposing a comprehensive survey of all categorizations of CAN progression considered in the literature and grouping the results obtained according to the categorization being dealt with. This novel, thorough and systematic overview of all categorizations of CAN progression will facilitate comparison of previous results and will help to guide future work.


Introduction
Clinical applications of data mining techniques have been actively investigated. For preliminaries and background information on this broad area, we refer readers to the monographs [1][2][3]. In particular, for successful treatment of various conditions, it is important to find attributes that help in the early detection of signs and symptoms of disease and thereby facilitate prevention, early diagnosis and treatment (cf. [4][5][6][7][8][9]). Likewise, automated computer-based diagnosis plays an important role in eHealth and mobile applications (cf. [10,11]). The present review article deals with recent contributions to this broad research area for the special case of cardiovascular autonomic neuropathy (CAN), which is a well-known complication associated with diabetes (cf. [12][13][14]). Cardiovascular disease (CVD) complications associated with diabetes account for 65% of all diabetic deaths [15]. The large impact of CVD associated with diabetes mellitus Type 1 and Type 2 has brought about the recommendation that people with diabetes should be regularly screened for the presence of comorbidities, including autonomic nervous system dysfunction, with the aim of decreasing the incidence of cardiovascular related morbidity and mortality [16][17][18]. People with diabetes and autonomic neuropathy have an increased mortality rate (29%) compared to people with diabetes but no autonomic neuropathy (6%) [19,20]. As many as 22% of people with type 2 diabetes suffer from CAN, with prevalence increasing as duration of diabetes increases [21,22]. CAN leads to impaired regulation of blood pressure, heart rate and heart rate variability (HRV). The increased risk of cardiac mortality due to arrhythmias makes screening of people with diabetes for autonomic neuropathy vital so that early detection, intervention and monitoring can occur [23]. Autonomic neuropathy is also associated with non-response hypoglycemia and a reduction in counter-regulation of the hypoglycemic events [24,25]. Silent ischemia is significantly more frequent in patients with CAN than in those without CAN [26,27], and significantly more people with diabetes die from cardiovascular disease such as heart attack and stroke, which can be attributed to CAN [28]. Early subclinical detection of CAN and intervention are of prime importance for risk stratification in preventing the potentially serious consequences of CAN [29].
Data mining methods are an important adjunct to medical research in identifying disease markers that allow early detection, prevention or treatment of disease. Electronic patient records and large healthcare databases combined with data mining provide a means to improve the level of health by identifying latent features not identified previously that are strong indicators of disease [30]. Data mining methods have been used extensively in health care research to build prediction models that provide additional information for improving health care outcomes [31][32][33].

Tests of the Ewing Battery
Autonomic neuropathy in diabetics has been traditionally identified by performing the Ewing battery of tests, which was recommended by the American Diabetes Association and the American Academy of Neurology. These tests evaluate heart rate (HR) and blood pressure (BP) responses to various activities [34][35][36]. The five tests in the Ewing battery are shown in Table 1 following [35].
Results of these tests provide a good assessment of diabetic autonomic neuropathy and aid in objective diagnosis instead of relying on self-reported clinical signs such as gustatory sweating, reflux, and incontinence. For the current study, patient results were only included if participants in the study were free of medication and comorbidities affecting their heart rate. The response of subjects to each of the Ewing tests is defined as normal, borderline or abnormal, as shown in Table 1, where HR is measured in beats per minute (BPM) and BP is measured in mmHg. We refer to [35,37,38] and [13] for more explanations and details on conducting and interpreting the tests included in the Ewing battery. Based on this grading, CAN risk assessment distinguishes a normal category (no CAN evident) and four CAN categories: early, definite, severe and atypical, as shown in Table 2. The categorization given in Table 2 is for diagnosing CAN categories as shown in [37]. The paper [35] compared these rules for determining the categories of CAN with two alternative scoring systems. The first one gave 0 for a normal result, ½ for a borderline outcome, and 1 for an abnormal outcome, resulting in a combined total score ranging from 0 to 5 for each participant. The second set of rules counted the number of outcomes that were abnormal, which again produced a total score in the range from 0 to 5 for each person. The paper [35] demonstrated that these scoring systems give roughly equivalent categorizations and neither seems to carry a real advantage over the other.
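The two alternative scoring systems described above can be illustrated with a minimal sketch. The grade strings and the example participant are hypothetical, and the sketch assumes that all five tests were completed; it is not taken from [35].

```python
from fractions import Fraction

# Per-test grades: each Ewing test is graded normal, borderline or abnormal.
# The string labels are illustrative names chosen for this sketch.
HALF_POINT_SCORE = {
    "normal": Fraction(0),
    "borderline": Fraction(1, 2),
    "abnormal": Fraction(1),
}


def half_point_total(grades):
    """First scoring system: 0, 1/2 or 1 per test, summed over the battery."""
    return sum(HALF_POINT_SCORE[g] for g in grades)


def abnormal_count(grades):
    """Second scoring system: the number of abnormal outcomes."""
    return sum(1 for g in grades if g == "abnormal")


# Hypothetical participant with five graded tests.
example = ["normal", "borderline", "abnormal", "abnormal", "borderline"]
print(half_point_total(example))  # 3
print(abnormal_count(example))    # 2
```

Both totals lie between 0 and 5 for a five-test battery, which matches the ranges stated above.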
It is not always possible for patients to perform all of the Ewing tests. For instance, the hand grip test may be difficult due to arthritis. The lying to standing tests often cannot be included in the test battery results due to mobility challenges of patients. Likewise, some patients have conditions where the forceful breathing required for the Valsalva maneuver is contra-indicated. These issues result in CAN risk assessments being made in practice on the basis of only a subset of the Ewing battery.
This review groups the results obtained in the literature into several sections according to the categorization being studied. If a paper considers several classifications simultaneously, we include a more detailed summary of the new methods proposed in that paper in the section devoted to the categorization with the smallest number of classes, since it is usually easier to handle and the best outcomes are achieved in this case. In the sections devoted to categorizations with a larger number of classes we only include an indication of the best values obtained for the corresponding categorization in the relevant paper.
The following categorizations of CAN progression have been considered in the literature:
• CAN2: the presence or absence of CAN (2 classes);
• CAN3: absence of CAN, early CAN, and definite CAN (3 classes);
• CAN4: normal, early, definite and severe CAN (4 classes);
• CAN5: normal, early, definite, severe, and atypical CAN (5 classes);
• CAN-early: early CAN and the absence of early CAN (2 classes);
• CAN-severe: severe CAN and the absence of severe CAN (2 classes).
Here the absence of early CAN is the union of the normal, definite CAN and severe CAN classes.
Likewise, by the absence of severe CAN we mean the union of the normal, early CAN, and definite CAN classes.
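As a small illustration of how the two derived binary categorizations relate to the four-class labels, the following sketch relabels a CAN4 class as CAN-early or CAN-severe using the unions stated above. The label strings and function names are hypothetical.

```python
# Four-class labels of CAN4; the strings are illustrative names for this sketch.
CAN4_LABELS = ("normal", "early", "definite", "severe")


def to_can_early(label):
    """CAN-early: early CAN versus the union of normal, definite and severe."""
    if label not in CAN4_LABELS:
        raise ValueError(f"unknown CAN4 label: {label!r}")
    return "early" if label == "early" else "not early"


def to_can_severe(label):
    """CAN-severe: severe CAN versus the union of normal, early and definite."""
    if label not in CAN4_LABELS:
        raise ValueError(f"unknown CAN4 label: {label!r}")
    return "severe" if label == "severe" else "not severe"


for label in CAN4_LABELS:
    print(label, "->", to_can_early(label), "/", to_can_severe(label))
```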

DiabHealth Database
Many articles have used the large database of health-related parameters and tests amalgamated in the Diabetes Screening Complications Research Initiative (DiabHealth) [39] organized by Charles Sturt University in Australia. The collection and analysis of data in the project was approved by the Ethics in Human Research Committee of the university. The participants were instructed not to smoke and to refrain from consuming caffeine-containing drinks and alcohol for 24 hours preceding the tests, as well as to fast from midnight of the previous day until the tests were complete. The measurements were conducted from 9:00 am until 12 midday and were recorded in the DiabHealth database along with various other clinical data including age, sex and diabetes status, blood pressure (BP), body-mass index (BMI), blood glucose level (BGL), and cholesterol profile. Reported incidents of a heart attack, atrial fibrillation and palpitations were also recorded. DiabHealth has made it possible to collect a large database with over 2500 entries and several hundred features.

Heart Rate Variability for the Automated Diagnostics of CAN
The Ewing battery is commonly used for detecting CAN but is often not conclusive, and therefore more sensitive and accurate tests are required. This section deals with one of the most important special classes of attributes that have been applied for the automated detection of CAN in previous publications.
Heart Rate Variability (HRV) as a clinical tool using ECG recordings has been shown to be a sensitive marker for risk of future arrhythmias or CAN and is easier to use clinically compared to the Ewing battery [40].[42].
Nonlinear HRV measures have become popular in recent times as they are more robust against nonstationarity and nonlinearity characteristics of the RR tachogram and are able to detect how aging and pathological conditions affect interbeat variation [43,44]. Nonlinear HRV features such as detrended fluctuation analysis (DFA) estimate the complexity inherent in the signal. The correlation dimension ($D_2$) can also be applied [45]. Table 3 summarizes various HRV analysis methods. In particular, it uses the notion of normal to normal beat intervals, also called NN intervals; see [44,46,47] for more explanations.
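To make the measures built on NN intervals more concrete, the following minimal sketch computes RMSSD (the square root of the mean squared difference of successive NN intervals) and the Poincaré plot descriptors SD1 and SD2 (the standard deviations perpendicular to and along the line of identity). The function names, the use of the population standard deviation and the sample values are assumptions made for illustration only.

```python
import math
import statistics


def rmssd(nn):
    """Square root of the mean squared difference of successive NN intervals."""
    diffs = [b - a for a, b in zip(nn, nn[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def poincare_sd(nn):
    """SD1 and SD2 of the Poincaré plot of (NN[n], NN[n+1]) pairs.

    The plot is rotated by 45 degrees, so that SD1 measures the spread
    perpendicular to the line of identity and SD2 the spread along it.
    The population standard deviation is an assumption of this sketch.
    """
    pairs = list(zip(nn[:-1], nn[1:]))
    sd1 = statistics.pstdev([(b - a) / math.sqrt(2) for a, b in pairs])
    sd2 = statistics.pstdev([(b + a) / math.sqrt(2) for a, b in pairs])
    return sd1, sd2


# Hypothetical NN intervals in milliseconds.
nn_ms = [800, 810, 790, 805, 795, 815]
print(rmssd(nn_ms))
print(poincare_sd(nn_ms))
```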
Several entropy measures have been proposed, such as approximate entropy, sample entropy and tone-entropy [48,49]. These measures have subsequently led to multiscale entropy measures, including the multiscale Rényi entropy, which is a generalization of the Shannon entropy [50]. The Rényi entropy H is defined as

$$H(\alpha) = \frac{1}{1-\alpha} \log \left( \sum_{i} p_i^{\alpha} \right),$$

where α is the order of the Rényi entropy and $p_i$ stands for the probability of X being equal to a particular value. The value of Rényi entropy for given π and α is denoted by H(π, α).
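A minimal numerical sketch of the Rényi entropy of order α is given below. It assumes base-2 logarithms, treats α = 1 as the Shannon limit, and estimates the probabilities from a simple histogram of RR intervals with a hypothetical bin width; none of these choices is prescribed by the text above.

```python
import math
from collections import Counter


def renyi_entropy(probs, alpha):
    """Rényi entropy of order alpha (in bits) of a discrete distribution."""
    probs = [p for p in probs if p > 0]
    if math.isclose(alpha, 1.0):
        # Limit alpha -> 1 gives the Shannon entropy.
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


def histogram_probabilities(rr_ms, bin_width_ms=10.0):
    """Estimate probabilities by binning RR intervals (assumed bin width)."""
    counts = Counter(math.floor(rr / bin_width_ms) for rr in rr_ms)
    total = len(rr_ms)
    return [c / total for c in counts.values()]


# Hypothetical RR intervals in milliseconds.
rr = [800, 812, 795, 830, 801, 799, 845, 808]
p = histogram_probabilities(rr)
for alpha in (-2, 1, 2):
    print(alpha, renyi_entropy(p, alpha))
```

For a uniform distribution the entropy equals the logarithm of the number of outcomes for every order α, which gives a quick sanity check of an implementation.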

Data Mining Methodology
The following standard measures of the effectiveness or performance of classifiers have been considered in the literature devoted to the diagnostics of CAN: accuracy, precision, recall, F-measure, sensitivity, specificity and Area Under Curve (AUC), also known as the Receiver Operating Characteristic or ROC area. Detailed explanations of these standard measures can be found in the monograph [51].
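For a binary categorization, most of the measures listed above follow directly from the four cells of the confusion matrix. The sketch below uses the usual textbook definitions; the counts are hypothetical and all denominators are assumed to be non-zero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard performance measures from confusion-matrix counts.

    tp, fp, tn, fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # recall is the same quantity as sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts for a CAN versus no-CAN classifier.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```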
Several articles devoted to CAN divide the data set into a training set and a validation set to assess the effectiveness of the classifiers being designed [52]. On the other hand, 10-fold cross validation, a standard technique for organizing experiments so as to prevent overfitting machine learning models to data, has also been used. It is implemented in WEKA and is invoked by default as stratified 10-fold cross validation, see [51]. It divides the data into ten stratified folds and, in ten consecutive tests, holds out each fold in turn for testing while training on the remaining nine. Another method designed in the literature to prevent overfitting is the 5 × 2 cross-validation introduced and recommended in [53] for comparison of classifiers. This method carries out five iterations of twofold cross-validation. The results of cross-validation implemented in WEKA are included in the output of all classifiers automatically, which makes it easy to apply cross validation in experiments concerning classifiers implemented in WEKA [54,55].
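The splitting schemes described above can be sketched in a few lines of plain Python. This is an illustrative implementation of stratified k-fold and 5 × 2 splitting, not a reproduction of the WEKA code; the function names, the seeding scheme and the example labels are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, rng):
    """Assign sample indices to k folds, keeping class proportions similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin over the folds.
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds


def cross_validation_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) for stratified k-fold CV."""
    folds = stratified_folds(labels, k, random.Random(seed))
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


def five_by_two_splits(labels, seed=0):
    """Yield the ten splits of 5 x 2 cross-validation (five reshuffled 2-fold runs)."""
    for repetition in range(5):
        yield from cross_validation_splits(labels, 2, seed=seed + repetition)


# Hypothetical data set: 20 participants with CAN and 30 without.
labels = ["CAN"] * 20 + ["normal"] * 30
for train, test in cross_validation_splits(labels, 10):
    print(len(train), len(test))
```

Each sample appears in exactly one hold-out fold, so every participant is used for testing once and for training in the remaining rounds.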

Binary Classification CAN2
The paper [38]

In order to reduce the cost of performing the medical tests required to collect the attributes while maintaining diagnostic accuracy, it is essential to optimize the features used for classification and to keep the number of features as small as possible. Feature selection methods of this kind are outlined in this section. The binary classification CAN2 was also studied in [52]. Instead of concentrating on the role of attributes of a particular type, the article applied data mining feature selection methods to

approaches was defined and was used to reduce the number of features necessary for optimal classification. The combined heuristic MR-ANNIGMA exploits the complementary advantages of both the filter and wrapper heuristics to find significant features [52].
The best accuracy obtained by applying this method for CAN2 was 80.66%. The feature selection approach applied in [52], however, identified an effective set of ECG components associated with CAN2, which have clinical relevance. More information on the relation of ECG features, CAN and hypertension has been established in [56,57].
The paper [60] dealt with the binary classification CAN2 for diabetes patients only. It carried out a comprehensive study of the effectiveness of several decision trees including ADTree, J48, NBTree, RandomTree, REPTree, and SimpleCart and various ensembles of decision trees generated by applying AdaBoost, Bagging, Dagging, Decorate, Grading, MultiBoost, Stacking, and two multilevel combinations of AdaBoost and MultiBoost with Bagging. The best classifier designed in [60] achieved classification results for CAN2 with the ROC area equal to 0.947.
In [41], visualization methods for determining the categories of CAN2 were studied. The authors concentrate on visualization using only data derived from HRV. A variety of measures were extracted from the sequence of interbeat time intervals (RR intervals). The multiscale Rényi entropy was calculated using −5 < α < +5, where α = 1 gives the Shannon entropy and α = 2 produces the squared entropy. Sample entropy was also calculated in order to provide a comparison. All features calculated from HRV were visualized using a spider diagram. The results show that this visualization technique not only captures the binary classification CAN2, but also provides additional insights by displaying a comprehensive picture of the complexity of the disease. In this connection, it would also be interesting to investigate the applications of conceptual graphs for the visualization of the diagnostics of CAN progression, since conceptual graphs are well-known effective tools for formal visual reasoning in the medical domain [61,62].

Ternary Classification CAN3 and Quaternary Classification CAN4
Ternary classification, CAN3, has been considered simultaneously with the quaternary classification, CAN4, in previous papers. These papers also included CAN2. The paper [65] considered CAN2, CAN3, and CAN4. The paper only used complete data without addressing the problem of missing values and applied feature selection methods incorporated in the implementation of Random Forest in R [66,67] to select relevant features. Multilevel classifiers were investigated in [65].
The best classifier produced AUC values of 0.997 for CAN2, 0.994 for CAN3, and 0.990 for CAN4.
It is essential to note that the results of [65] cannot be applied to handle missing values, since all tests used a large set of features and a complete dataset.
The Ewing battery of tests is still commonly used, but the question of whether any single test in the battery performs as well as the full five-test battery, and if so which one, had not been investigated previously. Stranieri and colleagues [29] handled all three classifications CAN2, CAN3, and CAN4 to address this question. An optimal order of the Ewing tests was determined using the Optimal Decision Path Finder procedure proposed in [68]. In addition, visual aids were developed in [29] to simplify the selection of the next Ewing test during applications of this procedure in practice. Only simple basic decision trees were used and the best accuracy achieved was equal to 94.14% for CAN2 [29].
The paper [69] introduced a new parameter, the beat-to-beat TQ-RR ratio derived from ECG

Binary Classification of Early CAN
Binary classification of early CAN was considered simultaneously with CAN2 and the ternary classification, CAN3, in [74]. It investigated the problem of determining these categorizations based only on HRV. A variety of measures may be extracted from HRV, including time domain, frequency domain, and more complex nonlinear measures. Among the latter, Rényi entropy has been proposed as a suitable measure that can be used to discriminate CAN from normal healthy patients.
There are several different ways to calculate variants of the Rényi entropy, which depend on a number of parameters. The paper [74] compares nine different methods to calculate Rényi entropy by applying several variations of the histogram method and a density method based on sequences of RR intervals. The effectiveness of the nine methods in achieving the best separation of the different categories of CAN3 is then compared. The results obtained showed that the histogram method using single RR intervals yields an entropy measure that is either incapable of discriminating CAN from controls, or provides little information that could not be gained from the standard deviation (SD) of the RR intervals. In contrast, probabilities calculated using a density method based on sequences of RR intervals yielded an entropy measure that provided good separation between groups of participants and provided information not available from the SD.
This showed that different approaches to calculating probability for determining the Rényi entropy may affect the success of detecting CAN3 categories. Thus, the results of [74] bring clarity to the question of how best to calculate the Rényi entropy for the successful detection of CAN3 categories.

Binary Classification Severe CAN
The paper [75] applied the multiscale Allen factor (MAF) to ECG recordings to obtain a marker for cardiac neuropathy, used as features for machine learning methods and automated detection. It introduced the Graph-Based Machine Learning System (GBMLS). This method is intended to enhance the effectiveness of the diagnosis of severe diabetic neuropathy. It was applied to the MAF features as a collection of attributes determined from the recorded ECG biosignals. These attributes can be collected as a result of routine ECG investigation of patients regardless of the presenting medical problems. The experiments compare the sensitivity and specificity of the automated detection produced by GBMLS with analogous outcomes achieved by various other machine learning approaches. To this end the authors used a comprehensive collection of important

Table 3. HRV analysis methods.
Time domain
  ... of the normal to normal beat intervals
  RMSSD: the square root of the mean squared difference of the NN intervals
Frequency domain
  Total Power: variance of NN intervals over the temporal segment (freq < 0.4)
  VLF: power in the very low frequency range (freq < 0.04)
  LF: power in the low frequency range (freq 0.04 to 0.15)
  HF: power in the high frequency range (freq 0.15 to 0.4)
Nonlinear
  SD1, SD2: the standard deviations perpendicular to and along the line-of-identity of the Poincaré plot
  α1: short-term fluctuation slope
  α2: long-term fluctuation slope

derive a set of features to be used for the automated detection of CAN. The experiments undertaken divided the data set into a training set and a validation set. A hybrid of Maximum Relevance filter (MR) and Artificial Neural Net Input Gain Measurement Approximation (ANNIGMA) wrapper

Karmakar and colleagues [63] undertook a multi-lag Tone-Entropy (T-E) analysis of HRV data for CAN2. A total of 41 ECG recordings from DiabHealth, with definite CAN and without CAN, were utilized. T-E values of each patient were calculated for different beat sequence lengths (denoted by len and ranging from 50 to 900) and lags (denoted by m and ranging from 1 to 8). For all values of the len and m parameters, it was found that the group of normal patients has a lower mean tone value than the definite CAN patients, whereas the mean entropy value was higher in normal patients than in patients with definite CAN. Leave-one-out cross-validation tests using a quadratic discriminant (QD) classifier were applied to investigate the performance of multi-lag T-E features. This produced 100% accuracy for T-E with len = 250 and m = {2, 3} settings, which is better than the performance of the T-E technique based on m = 1. The results demonstrated the usefulness of multi-lag T-E analysis over single-lag analysis for the diagnosis of the CAN2 categorization.
The paper [64] solved the problem of minimizing data transfer between different data centers of the cloud during the diagnosis of CAN2 by classifiers deployed in the cloud. A new model of clustering-based multi-layer distributed ensembles (CBMLDE) was introduced. It was designed to eliminate the need to transfer data between different data centers for training of the classifiers. Ten-fold cross validation and a dataset derived from DiabHealth were used in order to determine the best combinations of options for setting up CBMLDE classifiers. The results demonstrated that CBMLDE classifiers not only completely eliminate the need for patient data transfer, but also significantly outperformed all base classifiers and simpler counterpart models in all cloud frameworks.
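The multi-lag tone-entropy features discussed above for [63] can be illustrated with a minimal sketch. It assumes one common formulation of tone-entropy: the percentage index PI(n) = 100 · (RR(n) − RR(n+m)) / RR(n) for lag m, tone as the mean of PI, and entropy as the Shannon entropy (in bits) of PI binned to unit width. These definitions, the bin width and the function names are assumptions of this sketch and are not taken from [63].

```python
import math
from collections import Counter


def percentage_index(rr, lag=1):
    """Percentage change between RR intervals that are `lag` beats apart."""
    return [(rr[n] - rr[n + lag]) * 100.0 / rr[n] for n in range(len(rr) - lag)]


def tone_entropy(rr, lag=1):
    """Return (tone, entropy) of an RR interval sequence for a given lag.

    Tone is the mean percentage index; entropy is the Shannon entropy of the
    percentage index distribution with unit-width bins (assumed bin width).
    """
    pi = percentage_index(rr, lag)
    tone = sum(pi) / len(pi)
    counts = Counter(math.floor(v) for v in pi)
    total = len(pi)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return tone, entropy


# Hypothetical RR intervals in milliseconds, evaluated for several lags.
rr_ms = [1000, 900, 1000, 900, 1000, 950, 980, 940]
for m in (1, 2, 3):
    print(m, tone_entropy(rr_ms, lag=m))
```

Such (tone, entropy) pairs for several lags could then serve as input features for a classifier evaluated with leave-one-out cross-validation.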
recordings and was investigated in conjunction with the systolic-diastolic interval interaction (SDI) parameter. The performance of both QT-TQ and TQ-RR based SDI measures was explored to diagnose the categories of CAN3. ECG recordings of 72 diabetic subjects without CAN, 55 subjects with early CAN and 15 subjects with definite CAN from the DiabHealth study were utilized. The outcomes obtained demonstrated that the variability of the TQ-RR based SDI measure can distinguish all three categories of CAN3 with p < 0.001. In contrast, the variability of the QT-TQ based SDI measures showed a significant difference only between the normal subjects and the definite CAN category. This demonstrates that the TQ-RR based SDI parameter is more sensitive in the detection of CAN3 categories than the QT-TQ based measures.
The paper [13] used ten-fold cross validation to compare the effectiveness of applications of decision trees, ensemble classifiers and multi-level ensemble classifiers for neurological diagnostics of CAN. It investigated and compared the effectiveness of AdaBoost, Bagging, MultiBoost, Stacking, Decorate, Dagging, and Grading in their ability to enhance the performance of decision trees (ADTree, J48, NBTree, RandomTree, REPTree, SimpleCart) as well as several other base classifiers: Decision

Table 2. Categorization of CAN based on Ewing tests.
The ability to use only HRV for accurate identification of CAN and CAN progression provides alternative test results to the physician in addition to invasive testing such as cholesterol, BGL and HbA1c results.
[41] attributes can serve as a safeguard measure detecting CAN from short heart rate recordings during a patient health review. Several articles have applied HRV features to the task of automated detection of CAN. The motivation to use HRV data is that it is more often available and easier to obtain in clinical practice than the Ewing battery features. HRV measures also provide many more variables compared to the five attributes in the Ewing battery. HRV analysis involves determining the interbeat intervals (RR intervals) between successive QRS complexes on an ECG or directly from heart rate recordings. HRV information can include as many as 20-30 measures sensitive to different characteristics of the ECG time series that can be divided into time, frequency, and nonlinear measures [41]. ECGs are routinely assessed in clinical practice and although they do not directly indicate CAN, HRV can be determined from the interbeat interval tachogram or from a continuous heart rate recording

carried out a study demonstrating the usefulness of HRV and complexity analyses based on short term ECG recordings as a screening tool for CAN2 categorization.

Table, FURIA, J48, NBTree, Random Forest and SMO. In addition, Jelinek et al. blood biochemistry attributes can be used to supplement the Ewing battery for the detection of CAN in the absence of a complete set of Ewing tests. The results show that AIME provided higher accuracy as a multitier CAN5 classification system. For CAN5 categorization, the best accuracy of 99.57% was obtained by the AIME that combined Decorate as the top layer with Bagging on the middle layer applied to Random Forest as a base classifier.
A new machine learning algorithm for the diagnosis of CAN progression based on HRV attributes was proposed in [14]. The Multi-Layer Attribute Selection and Classification (MLASC)

AIMS Medical Science Volume 3, Issue 2, 217-233.