Missing Value Imputation Using Stratified Supervised Learning for Cardiovascular Data

Legacy (and current) medical datasets are a rich source of information and knowledge. However, the use of most legacy medical datasets is beset with problems. One of the most frequently encountered is missing data, often caused by oversights in data capture or data entry procedures. Algorithms commonly used in the analysis of data often depend on a complete data set. Missing value imputation offers a solution to this problem. This may result in the generation of synthetic data, with artificially induced missing values, but simply removing the incomplete data records often produces the best classifier results. With legacy data, simply removing the records from the original datasets can significantly reduce the data volume and often affect the class balance of the dataset. A suitable method for missing value imputation is therefore needed to produce good quality datasets for better analysis of data resulting from clinical trials. This paper proposes a framework for missing value imputation using stratified machine learning methods. We explore machine learning techniques to predict missing values for incomplete clinical (cardiovascular) data, with experiments comparing these with other standard methods. Two machine learning (classifier) algorithms, the fuzzy unordered rule induction algorithm and decision tree, plus other machine learning algorithms (for comparison purposes), are trained on complete data and subsequently used to predict missing values for incomplete data. The complete datasets are classified using decision tree, neural network, K-NN and K-Means clustering. The classification performances are evaluated using sensitivity, specificity, accuracy, positive predictive value and negative predictive value. The results show that final classifier performance can be significantly improved for all class labels when stratification is used with the fuzzy unordered rule induction algorithm to predict missing attribute values.


Introduction
Legacy medical datasets are a rich source of information and knowledge, and there is a growing trend with research funders expecting the data resulting from clinical trials to be used beyond the originating study. However, real-life data sets are often found to be incomplete. This is true for both legacy and current, in-use datasets. Causes for values to be missing vary, ranging from oversights in data capture or data entry procedures to systematic flaws in the studies that led to the data being generated. Often missing values arise because legacy data has been extended with further trials in which the information profile being captured has changed. Missing attribute values have already been identified as an important issue in data mining and analytics [1]. In medical data mining and analysis missing values have become a challenging issue, predominantly as legacy data can be a valuable source of information and knowledge. In many clinical trials, the medical report pro-forma allows some attributes to be left blank, because they are inappropriate for some cases or the person providing the information feels that it is not appropriate to record the values of some attributes [2].
According to Roderick and Donald [3] missing data can be classified in two ways. Data is termed missing completely at random (MCAR) when the response indicator variables R are independent of the data variables X and the latent variables Z. The MCAR condition can be briefly expressed by $P(R \mid X, Z, \mu) = P(R \mid \mu)$. The second category of missing data is called missing at random, or MAR. The MAR condition is often written as $P(R = r \mid X = x, Z = z, \mu) = P(R = r \mid X^{o} = x^{o})$ for all $x^{m}$, $z$ and $\mu$, where the superscripts $o$ and $m$ denote the observed and missing parts of the data [4].
Generally, methods to handle missing values belong either to sequential methods, such as listwise deletion, assigning the most common value for categorical attributes, or the arithmetic mean or median for numeric attributes, or to parallel methods, where algorithms are used to predict missing attribute values [5]. There are some reasons for which listwise deletion is considered to be a good method [3], but a number of works [2,3,6] have shown that the application of these methods to incomplete data can distort the interpretation of the data and mislead the subsequent analysis through the introduction of bias.
Several techniques for missing value imputation have been proposed by researchers; most of these are single imputation approaches [7]. The most commonly used missing value imputation techniques are deleting cases, mean value imputation and other statistical methods [7]. In recent years, research has explored machine learning techniques as a method for missing value imputation; artificial neural networks (ANN), self-organising maps (SOM), decision trees and k-nearest neighbours (K-NN) have been used as missing value imputation methods in many different domains [6,8-15]. In many cases machine learning methods like ANN, SOM, K-NN and decision trees have been found to perform better than the traditional statistical methods [6,16].
Machine learning methods can be used for predicting missing values, for example by using a rule induction algorithm in which rules are induced from the original complete data set, with missing attribute values ignored. A decision tree can be produced by splitting cases with missing attribute values into fractions and adding these fractions to new case subsets [5]. Other methods of handling missing attribute values were presented in [17]. Jerez et al. [6] presented comparison results of missing data imputation using statistical and machine learning methods in a real breast cancer problem. They used imputation methods based on statistical techniques, e.g., mean, hot-decking and multiple imputation, and machine learning techniques, e.g., multi-layer perceptron (MLP), SOM and K-NN, and applied them to the cancer data. The results were then compared to those obtained from the listwise deletion (LD) method. K-NN has been used by many researchers for imputing missing values [18,19]. Every time a missing value is found in a current instance, K-NN computes the K nearest neighbours and a value from them is imputed. For categorical values, the most common value among all K neighbours is taken, and for numerical values, the average value is used [19]. Gajawada and Toshniwal [18] proposed a modified version of imputing missing values with K-NN. Here, the dataset is divided into two sets: records with missing values and records without missing values. K-Means clustering is applied to the complete instances to obtain clusters of complete instances. These clusters are then used to impute the missing values in the incomplete dataset.
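The K-NN imputation rule described above (most common value for categorical attributes, average for numerical attributes) can be sketched as follows. This is a minimal illustration rather than the implementation used in [18,19]: the function name `knn_impute`, the use of `None` as the missing marker, and the unscaled mixed distance are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_impute(complete_rows, row, k=3):
    """Fill the None entries of `row` using its k nearest complete rows."""
    # Positions of the attributes that are actually observed in `row`.
    observed = [i for i, v in enumerate(row) if v is not None]

    def distance(other):
        total = 0.0
        for i in observed:
            if isinstance(row[i], (int, float)):
                total += (row[i] - other[i]) ** 2      # numeric: squared difference (unscaled)
            else:
                total += 0.0 if row[i] == other[i] else 1.0  # categorical: mismatch
        return math.sqrt(total)

    neighbours = sorted(complete_rows, key=distance)[:k]
    filled = list(row)
    for i, value in enumerate(row):
        if value is None:
            column = [n[i] for n in neighbours]
            if all(isinstance(c, (int, float)) for c in column):
                filled[i] = sum(column) / len(column)              # numeric: average
            else:
                filled[i] = Counter(column).most_common(1)[0][0]   # categorical: most common
    return filled


# Hypothetical records: (age, smoker, blood loss)
complete = [[60, "yes", 1.0], [62, "yes", 2.0], [30, "no", 9.0], [61, "no", 3.0]]
print(knn_impute(complete, [61, None, None], k=3))
```

In practice numeric attributes would normally be scaled before distances are computed, so that no single attribute dominates the neighbour search.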
In most cases highlighted above, machine learning based missing value imputation was found to be better than conventional statistical methods. However, none of the research considered the class label as a factor that might affect learning from the pattern of the complete dataset. Our contention is that the data pattern of one class is not similar to that of records with another class label, and so stratified learning may give better results.
In this paper we examine stratified supervised learning for predicting missing values. In our proposed approach we use FURIA, the fuzzy unordered rule induction algorithm [20], with stratification as a missing value imputation method for real-life incomplete cardiovascular datasets. The results are compared with other non-stratified machine learning based missing value imputation methods using decision tree, SVM and K-NN, and with the conventional statistical mean-mode imputation method.

Overview of FURIA
Fuzzy Unordered Rule Induction Algorithm (FURIA) is a novel rule-based classification method, which is a modification and extension of the state-of-the-art RIPPER rule learner algorithm. The main difference between FURIA and RIPPER is that FURIA learns fuzzy rules and unordered rule sets instead of conventional rules and rule lists. Moreover, FURIA uses a rule stretching method to deal with uncovered examples [20]. A fuzzy interval of that kind is specified by four parameters and will be written $I^{F} = (\phi^{s,L}, \phi^{c,L}, \phi^{c,U}, \phi^{s,U})$, where $\phi^{c,L}$ and $\phi^{c,U}$ are, respectively, the lower and upper bound of the core (elements with membership 1) of the fuzzy set; likewise, $\phi^{s,L}$ and $\phi^{s,U}$ are, respectively, the lower and upper bound of the support (elements with membership >0).
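To make the core and support bounds concrete, the sketch below evaluates the membership degree of a value in a fuzzy interval, assuming the trapezoidal shape used by FURIA in [20] (membership 1 on the core, linear slopes between support and core, 0 outside the support). The function names and the example rule are illustrative assumptions, not part of the FURIA implementation.

```python
def fuzzy_interval_membership(v, s_lower, c_lower, c_upper, s_upper):
    """Membership degree of v in the fuzzy interval (s_lower, c_lower, c_upper, s_upper)."""
    if c_lower <= v <= c_upper:
        return 1.0                                    # inside the core
    if s_lower < v < c_lower:
        return (v - s_lower) / (c_lower - s_lower)    # rising edge of the trapezoid
    if c_upper < v < s_upper:
        return (s_upper - v) / (s_upper - c_upper)    # falling edge of the trapezoid
    return 0.0                                        # outside the support


def rule_membership(instance, antecedents):
    """Degree to which a fuzzy rule covers an instance: product over its antecedents."""
    degree = 1.0
    for attribute, params in antecedents.items():
        degree *= fuzzy_interval_membership(instance[attribute], *params)
    return degree


# Hypothetical rule with two antecedents on made-up attributes.
rule = {"age": (40, 50, 60, 70), "duration": (1.0, 2.0, 3.0, 4.0)}
print(rule_membership({"age": 45, "duration": 2.5}, rule))
```

An open-ended interval can be expressed by setting the corresponding support and core bounds to `float("-inf")` or `float("inf")`, in which case the first branch returns full membership on that side.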
For an instance $x = (x_1, \ldots, x_n)$ the degree of fuzzy membership in a rule with $k$ antecedents can be found using the formula [20]:

$$\mu_{r^{F}}(x) = \prod_{i=1}^{k} I_i^{F}(x_i)$$

For fuzzification of a single antecedent only the relevant training data $D_T^{i}$ is considered; the data are partitioned into two subsets, $D_T^{i+}$ and $D_T^{i-}$, and rule purity is used to measure the quality of the fuzzification [20]:

$$\mathrm{pur} = \frac{p_i}{p_i + n_i}$$

where

$$p_i = \sum_{x \in D_T^{i+}} \mu_{A_i}(x), \qquad n_i = \sum_{x \in D_T^{i-}} \mu_{A_i}(x)$$

Once the fuzzy rules $r_1^{(j)}, \ldots, r_k^{(j)}$ have been learned for the class $\lambda_j$, the support of this class is defined by [20]:

$$s_j(x) = \sum_{i=1}^{k} \mu_{r_i^{(j)}}(x) \cdot CF\left(r_i^{(j)}\right)$$

where the certainty factor of the rule $r_i^{(j)}$ is defined as

$$CF\left(r_i^{(j)}\right) = \frac{2\,\frac{|D_T^{(j)}|}{|D_T|} + \sum_{x \in D_T^{(j)}} \mu_{r_i^{(j)}}(x)}{2 + \sum_{x \in D_T} \mu_{r_i^{(j)}}(x)}$$

Fuzzy rules are generated by FURIA in two steps:
(1) For every single class λc a rule set is learnt, using a one-versus-all decomposition. The RIPPER algorithm is used, which consists of two fundamental steps (the building and the optimization phase) described in Sun and Xu [21].
(2) The rules from the above step are fuzzified to obtain fuzzy rules. Each rule is fuzzified retaining the same structure as the non-fuzzified rule, just replacing the original intervals in the antecedent with fuzzy intervals (the complete procedure is described in Hühn and Hüllermeier [20]).
More uses of FURIA in different areas of data mining can be found in [20,22,23].

Stratified machine learning based missing value imputation
This research presents a new way of imputing missing values using machine learning methods. The original data set is first stratified using the intended class label. It is then partitioned into groups of missing and non-missing; the records having missing values in their attributes are in one group and the records without any missing values are placed in a separate group. Figure 1 depicts the flow for the imputation process. Below we explain this process in terms of using the FURIA fuzzy rule based classifier to find suitable values for imputation. The process is very similar when using other classifiers, the difference being that the flow in the right center of Figure 1 is modified according to the classifier used. The other classifiers used in the experiments are briefly described in section 6.
The fuzzy rule based classifier FURIA is trained with the complete data sets and optimum fuzzy rules are obtained. The rules are later applied to the incomplete data for predicting the missing attribute values. The process is repeated for the entire set of attributes that have missing values. At the end of training, this training dataset and the missing value imputed datasets are combined to make the complete data. The final dataset is then fed to the selected classifier for classification on the true outcome.
The stratified fuzzy rule based imputation scheme developed in this study can be described as follows:
(1) Given an incomplete data set X, stratify the data based on the class label (for a two-class problem, xa and xb).
(2) For all data records of each class do the following:
a. Separate the input vectors that do not contain any missing data from the ones that have missing values.
b. Train the FURIA classifier with the complete data (having no missing values). Select the output as the attribute whose value needs to be predicted by the classifier for imputation and build up the model with the classifier's best accuracy. Obtain optimum fuzzy rules.
c. For each incomplete pattern apply the fuzzy rules to predict the unknown value of the missing fields.
d. Repeat for all attributes with missing values.
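The scheme above can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the authors' implementation: FURIA itself is not reimplemented, so a simple nearest-neighbour predictor (`nearest_neighbour_predict`, a hypothetical stand-in) plays the role of the trained classifier, records are dictionaries, and `None` marks a missing value.

```python
from collections import defaultdict

MISSING = None  # marker for a missing attribute value (an assumption of this sketch)


def nearest_neighbour_predict(train_rows, query, target):
    """Stand-in for the trained classifier: copy `target` from the most similar complete row."""
    def distance(row):
        # Count mismatches on the attributes observed in the query, excluding the target.
        return sum(
            1
            for key, value in query.items()
            if key != target and value is not MISSING and row[key] != value
        )
    return min(train_rows, key=distance)[target]


def stratified_impute(records, class_key, predict=nearest_neighbour_predict):
    """Impute missing values separately within each class stratum."""
    # Step (1): stratify the records on the class label.
    strata = defaultdict(list)
    for rec in records:
        strata[rec[class_key]].append(rec)

    completed = []
    for rows in strata.values():
        # Step (2a): separate complete records from incomplete ones.
        complete = [r for r in rows if MISSING not in r.values()]
        incomplete = [r for r in rows if MISSING in r.values()]
        completed.extend(dict(r) for r in complete)
        for rec in incomplete:
            filled = dict(rec)
            # Steps (2b)-(2d): predict every missing attribute from the complete records
            # of the same class; values stay missing if the stratum has no complete record.
            for attr, value in rec.items():
                if value is MISSING and complete:
                    filled[attr] = predict(complete, rec, attr)
            completed.append(filled)
    return completed


# Hypothetical toy records, not taken from the cardiovascular data.
records = [
    {"smoking": "yes", "hypertension": "yes", "risk": "high"},
    {"smoking": "no", "hypertension": "no", "risk": "low"},
    {"smoking": "yes", "hypertension": None, "risk": "high"},
    {"smoking": "yes", "hypertension": None, "risk": "low"},
]
for row in stratified_impute(records, "risk"):
    print(row)
```

In the toy data the low-risk record with a missing value is filled only from low-risk records, even though a high-risk record matches it better on smoking; this is the effect stratification is intended to have.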

Cardiovascular Data
Two data sources for cardiovascular patients are used: the Hull site of 498 patients and the Dundee site of 341 patients. The patients in the Hull site are described by 98 attributes. The patients in the Dundee site are described by 57 attributes. As a dataset, a combination from both sites is used. This gives a group of 823 instances (cardiovascular patients) classified into two levels of risk and described by 22 attributes. After the combination, 18 out of 22 attributes have missing values from 1% to 30%; and 613 out of 839 instances have 4% to 56% missing values in their describing attributes. All instances having 20% or more missing values and relating to live patients 30 days after an operation are removed. The data is described in full in Nguyen [24].

Data description
The description of instances and their summary is given in Table 1, showing the percentage of missing values for each attribute.This data is symptomatic of much legacy clinical data, in that it is flawed in data capture, with patient records coming from multiple trials and each data record cannot be replicated (for obvious reasons).
ASA grade is used to classify the patient into categorical values one, two, three or four according to the American Society of Anesthesiologists classification [25].

Classifier evaluation
K-fold cross validation is used to minimize the bias associated with random sampling of training and test data samples in comparing the predictive accuracy of two or more methods [31]. Here the whole data set is randomly split into k (in our case k=10) mutually exclusive subsets of approximately equal size. The classification model is trained and tested k times. The classification performance is evaluated by accuracy (ACC), sensitivity (Sen) and specificity (Spec) rates, and the positive predictive value (PPV) and negative predictive value (NPV), based on values residing in a confusion matrix (see Table 2).
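The splitting step can be sketched as below. This is a minimal illustration of dividing the data into k mutually exclusive folds of approximately equal size; the function name `k_fold_indices` and the fixed random seed are assumptions for the example, and the split is not stratified by class.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k mutually exclusive folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random assignment of records to folds
    folds = [indices[i::k] for i in range(k)]   # fold sizes differ by at most one
    for i, test in enumerate(folds):
        # Every fold except the i-th is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for train, test in k_fold_indices(23, k=10):
    print(len(train), len(test))
```

Each record appears in exactly one test fold, so the model is trained and tested k times and every record is used for testing once.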
Assume that the cardiovascular classifier output set includes two typical risk prediction classes: "High risk" and "Low risk". Each pattern xi (i=1, 2, ..., n) is allocated to one element from the set (P, N) (positive or negative) of the risk prediction classes.
Hence, each input pattern might be mapped into one of four possible outcomes: true positive, or true high risk (TP), when the outcome is correctly predicted as High risk; true negative, or true low risk (TN), when the outcome is correctly predicted as Low risk; false negative, or false low risk (FN), when the outcome is incorrectly predicted as Low risk when it is High risk (positive); or false positive, or false high risk (FP), when the outcome is incorrectly predicted as High risk when it is Low risk (negative). The set of (P, N) and the predicted risk set can be built as a confusion matrix. The accuracy of a classifier is calculated by:

$$ACC = \frac{TP + TN}{TP + TN + FP + FN}$$

The sensitivity is the rate of correctly predicted "High risk" over the total number of correctly predicted "High risk" and incorrectly predicted "Low risk". It is given by:

$$Sen = \frac{TP}{TP + FN}$$

The specificity is the rate of correctly predicted "Low risk" over the total number of expected/actual "Low risk". It is given by:

$$Spec = \frac{TN}{TN + FP}$$

Higher accuracy does not always reflect a good classification outcome. For clinical data analysis it is important to evaluate the classifier based on how well it predicts the "High risk" patients. In many cases the classification outcome shows good accuracy because the classifier predicts the low risk patients (the majority class) well, but it fails to predict the high risk patients (the minority class). For completeness, we also show positive predictive value (PPV) and negative predictive value (NPV), where

$$PPV = \frac{TP}{TP + FP}, \qquad NPV = \frac{TN}{TN + FN}$$
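The five evaluation measures can be computed directly from the four confusion matrix counts. The sketch below uses standard definitions; the function names and the example counts are hypothetical and are not results from this study.

```python
def confusion_counts(actual, predicted, positive="High risk"):
    """Count TP, FP, TN and FN for a two-class problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def evaluate(tp, fp, tn, fn):
    """Return ACC, Sen, Spec, PPV and NPV; a ratio with a zero denominator is NaN."""
    def ratio(num, den):
        return num / den if den else float("nan")
    return {
        "ACC": ratio(tp + tn, tp + tn + fp + fn),
        "Sen": ratio(tp, tp + fn),
        "Spec": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }


# Hypothetical imbalanced outcome: 10 high-risk and 90 low-risk patients.
metrics = evaluate(tp=2, fp=5, tn=85, fn=8)
print(metrics)
```

With these made-up counts the accuracy is 0.87 while the sensitivity is only 0.2, which illustrates why accuracy alone can hide poor prediction of the minority class.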

Classification Algorithms

Decision tree
Decision tree learners are algorithms that automatically construct a decision tree from a given data set. The algorithm aims to generate an optimal decision tree by minimizing the generalization error. A decision tree is articulated as a recursive partition of the instance space. It consists of a directed tree with a "root" node with no incoming edges, while all the other nodes have exactly one incoming edge [5]. Decision tree models are mostly used in data mining to examine the data and generate decision rules describing that data. The induced tree and its associated rules are used to make predictions [32]. Ross Quinlan introduced a decision tree algorithm known as Iterative Dichotomiser (ID3) in 1979. C4.5, as a successor of ID3, is the most widely used decision tree algorithm. The major advantage of decision trees is their human-readable, class-focused visualization of data. This visualization is useful in that it allows users to easily understand the overall structure of the data and the decision rules.

K-nearest neighbor algorithm (K-NN)
The K-nearest neighbor (K-NN) method has become a topic of interest in data science and has proven to be a powerful algorithm for classification. K-NN is a technique for classifying objects based on the closest training examples in the feature space. K-NN is a type of lazy learning or instance-based learning [33], where the function is only approximated locally and all computation is deferred until classification.
K-NN is one of the simplest machine learning algorithms: an object is classified by a majority vote of its neighbours, with the object being allocated to the class most common amongst its k nearest neighbours (k is a positive integer, typically small).
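The majority vote can be written in a few lines. This is a minimal sketch assuming numeric feature vectors and Euclidean distance; the function name `knn_classify` and the sample points are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by a majority vote of its k nearest training points."""
    # All computation happens at query time, which is why K-NN is called a lazy learner.
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),   # Euclidean distance
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8)]
labels = ["Low risk", "Low risk", "Low risk", "High risk", "High risk"]
print(knn_classify(points, labels, (8.5, 8), k=3))
```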

Experiments
The data as described in section 4 was prepared using the procedure outlined in section 3. This is compared to previously published results [34,35]. Missing values were replaced using the standard Mean/Mode imputation as the basis for comparison. Five classifiers, decision tree (J48), K-NN, Fuzzy Unordered Rule Induction Algorithm (FURIA), SVM and Ripple-down rules (Ridor) [36], were used for predicting missing values. Alternative datasets were prepared by using all the classifiers and later classified using Decision Tree, K-NN, Neural Networks, Fuzzy Unordered Rule Induction Algorithm (FURIA) and K-Means clustering.
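The Mean/Mode baseline replaces each missing numeric value with the attribute mean and each missing categorical value with the attribute mode. A minimal sketch is given below; the function name `mean_mode_impute`, the list-of-rows layout and the use of `None` as the missing marker are assumptions for illustration, and every column is assumed to contain at least one observed value.

```python
from collections import Counter
from statistics import mean


def mean_mode_impute(rows):
    """Replace None with the column mean (numeric) or column mode (categorical)."""
    fill = []
    for c in range(len(rows[0])):
        observed = [r[c] for r in rows if r[c] is not None]
        if all(isinstance(v, (int, float)) for v in observed):
            fill.append(mean(observed))                          # numeric attribute
        else:
            fill.append(Counter(observed).most_common(1)[0][0])  # categorical attribute
    # Every missing entry in a column receives the same fill value.
    return [[fill[c] if v is None else v for c, v in enumerate(r)] for r in rows]


# Hypothetical records: (age, aspirin)
rows = [[60, "yes"], [None, "no"], [70, None], [65, "yes"]]
print(mean_mode_impute(rows))
```

Because the fill value is computed once per attribute, all records with a missing entry in that attribute receive an identical value regardless of their other attributes or their class label.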

Classification outcome using standard imputation methods
This experiment was designed to compare classification outcomes and establish a baseline classification for the data. For this, Decision Tree, Ripple-down rules (Ridor), K-NN, FURIA and Neural Network (Support Vector Machine and Multi-Layer Perceptron) classifiers were used. For this experiment the missing values were replaced using the standard Mean/Mode missing value imputation technique. No class label balancing technique (see [37]) or any other data pre-processing was used. The purpose of these experiments was to set a baseline classification outcome for the data set discussed in section 4.1.
The results are presented in Table 3 and later compared with the results from other experiments.
Most of the classifiers show reasonable accuracy for this data (72% to 80%) but very poor sensitivity (11% to 23%). Considering the sensitivity rate, the classification outcome of the imbalanced data is very poor because the classifiers give the same attention to the majority class (Low Risk) and the minority class (High Risk). When the imbalance level is large, it is hard to build a good classifier using conventional learning algorithms, as they aim to optimize the overall accuracy without considering the relative distribution of each class. This class imbalance problem has been addressed in our previous research [37]. For all the classifiers used in this experiment the results show that it is hardly possible to achieve an acceptable prediction rate for high-risk patients, as they are a minority set in this data. The highest value of sensitivity (23%) is found with the classifier FURIA, which is still very poor.

Classification outcome of the dataset prepared using machine learning based imputation methods
We have exhaustively tested the combinations of machine learning imputation and subsequent classification. Rather than present all these results, we will show the results from several combinations (highlighting the best and worst) and then provide a summary table and figure. Table 4 presents the Decision Tree (J48) classification outcome of the datasets prepared by different missing value imputation methods. It can be observed that the Decision Tree (J48) classification accuracies of the datasets prepared by the different missing value imputation methods are close to each other (78% to 80%), while there is a large gap in sensitivity among the imputation methods. The highest sensitivity (23%) was found with the use of Decision Tree (J48) as the imputation method.
Table 5 presents the K-NN classification outcome of all the datasets prepared by different missing value imputation methods. The K-NN classification accuracies of the datasets prepared by the different missing value imputation methods range from 71% to 81%; the highest sensitivity (24%) was found with the use of K-NN as the imputation method, and the lowest with Decision Tree (J48) (20%). The use of K-NN for missing value imputation outperformed all the other methods, with the highest sensitivity (24%), specificity (91%) and accuracy (81%). The statistical method of missing value imputation (mean-mode) has slightly better sensitivity and accuracy than Decision Tree (J48) and SVM as missing value imputation methods.
Table 6 presents the FURIA classification outcome of all the datasets prepared by different missing value imputation methods. The first column of the table is the classifier used for training the model with the complete datasets and later used for predicting the missing fields of the incomplete dataset. The last row of the table is the classification outcome of the dataset prepared by the standard Mean/Mode missing value imputation method. Again, different machine learning algorithms were applied to the dataset to predict the missing values. The classification results in Table 6 show that the use of Decision Tree (J48) for missing value imputation outperformed all the other methods, with the highest sensitivity (40%). Although SVM has high specificity (83%), it shows very poor sensitivity (18%) compared to all the other imputation methods. Fuzzy Unordered Rule Induction Algorithm and K-NN have the same sensitivity of 30%. For the Fuzzy Unordered Rule Induction Algorithm (FURIA) classifier, the Decision Tree (J48) imputation method performs best for predicting the high risk patients.
If we measure the perpendicular distance of the points from the random classification line, the combinations L and M are found to have the highest (best) distance from the random line. Some of the classification outcomes of classifiers where Mean/Mode was used to impute the missing values also show better than random results. However, most of them are very low compared to the combinations where machine learning was used for missing value imputation. Of the classifications where Mean/Mode was used for missing value imputation, the combination K (Mean/Mode-K-NN) was found to be best.
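The perpendicular distance of a point from the random classification line can be computed from its sensitivity and specificity. In a plot of sensitivity against (1-specificity) the random line is the diagonal, so the signed distance is (sensitivity - (1 - specificity)) / sqrt(2). The sketch below is an illustration of that geometry; the function name is hypothetical, and the two example pairs are the sensitivity and specificity figures quoted in the text for K-NN imputation with K-NN classification and for SVM imputation with FURIA classification.

```python
import math


def distance_from_random_line(sensitivity, specificity):
    """Signed perpendicular distance from the diagonal; positive means better than random."""
    false_positive_rate = 1.0 - specificity
    return (sensitivity - false_positive_rate) / math.sqrt(2)


print(distance_from_random_line(0.24, 0.91))  # K-NN imputation, K-NN classification
print(distance_from_random_line(0.18, 0.83))  # SVM imputation, FURIA classification
```

A point on the diagonal has distance zero, and larger positive values indicate a combination that is further above random classification.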
Table 7 presents the highest sensitivity found for each classifier used for missing value imputation. The first column of the table is the name of the classifier used for missing value imputation and the last column is the name of the classifier used to classify the final complete datasets. From Table 7 we can conclude that, if the research aim is to achieve high sensitivity, it is recommended to use FURIA as the missing value imputation method for unsupervised learning and decision tree as the missing value imputation method for supervised learning.
The results show that with the data prepared using mean-mode missing value imputation we can get a maximum of 29% sensitivity with 63% accuracy for the K-Means classification. On the other hand, we can get 40%-43% sensitivity if we use machine learning methods to predict the missing values. It is observed that in most cases, if the same classifier is used for predicting the missing values and as the final classifier, the performances are better than in the other cases. This is likely because the bias of the classifier in imputing missing values later benefits that classifier on the complete data. However, this is not always the case. Some other combinations of imputation classifier and classification classifier can also produce good results. Some combinations produce better sensitivity while others produce better specificity. The appropriate selection of the classifier is an issue for this approach to missing value imputation. It is expected that the selection will depend on the data and the interests of the research. Preparing the data using machine learning algorithm X and achieving the best results on that prepared data using the same machine learning algorithm X is also to be expected. With Mean/Mode, a single value is imputed for every missing entry of an attribute, although the true missing values are unlikely to be identical. It is a big challenge to find the right value for a missing field. The proposed method uses a pattern recognition technique to predict the value for the missing field by learning the pattern from the complete dataset. The experiments show that this method gives an improved way of finding the best possible value for the missing fields.
Finally, we show the effect of stratification on this. The results (Table 8) are shown without K-NN, as stratification had no effect for it and the results given above were not improved on. Datasets are prepared using the stratified machine learning based missing value imputation method discussed in section 3, and are then classified using Decision Tree, K-NN, FURIA and Neural Network. Standard mean/mode imputation and the non-stratified machine learning based missing value imputation method have also been used for comparison. The summary of the results, presented in Table 8 and Figure 3, shows that the proposed stratified machine learning based missing value imputation method outperforms the other methods discussed in this paper. Apart from K-NN classification (which is omitted from Figure 3), all the other classification performances have significantly improved using the proposed method for missing value imputation.

Conclusion
Like many other real-life data sets, medical data are usually found to be incomplete, which causes many problems in analytics and knowledge discovery. This work proposed a missing value imputation framework using stratified machine learning techniques. The results are compared with non-stratified machine learning based missing value imputation and statistical (mean/mode) imputation. Experimental results show that the proposed stratified machine learning methods outperformed the statistical method (Mean/Mode) and other non-stratified machine learning methods.
The proposed method might be computationally expensive for big datasets having large numbers of attributes with missing fields. However, data cleaning is part of the data pre-processing task and is a one-off process. With this extra effort we can achieve good quality data for better knowledge discovery and decision support.
In agreement with other recent research [38] and the findings of this experiment, we can infer that machine learning techniques may be the best approach to imputing missing values for better classification outcomes. However, which combination of machine learning algorithms is best for missing value imputation and final classification remains an open question. Unlike [38-43], we found that K-NN is not an optimal strategy to follow when using stratified imputation. The results shown here and in other work [35] suggest that the data domain and the label used in the classification problem have a bearing on this question. We can confidently say that stratified machine learning imputation does improve final classification results in the datasets tested. Furthermore, the machine learning algorithm used for missing value imputation is not necessarily the best for final classification, which counters the argument that the method produces a data bias for the given classifier.

Figure 1 :
Figure 1: Flow diagram for the imputation process.

Figure 2
Figure 2 shows the ROC of different combinations of the non-stratified machine learning algorithms used for imputing missing values and classifying the final complete data. A random classification line was also drawn to see how much better the classification outcomes are than random. From the figure it can be seen that, apart from the combinations B and F, all the combinations where machine learning algorithms were used perform better than a random classifier. The combination A (FURIA-K-Means), where FURIA was used to predict and impute the missing values and K-Means was used to classify the final complete data, has the highest sensitivity.

Figure 2 :
Figure 2: Sensitivity versus (1-Specificity) for all imputation methods. The data points A to R can be interpreted via the key, which lists the (Imputation Method-Classifier) pairings.

Figure 3 :
Figure 3: Summary of best results for sensitivity versus (1-Specificity) across all imputation methods. The data points A to I can be interpreted via the key (different imputation methods with classifier). (K-NN is omitted due to no improvement on imputation via stratification.)

Table 1 :
Description of the cardiovascular dataset showing missing value percentages for each attribute.

Age represents the age of the patient. Attribute Angina pectoris indicates if a particular angina pectoris is present. The value is set as none if there is no angina pectoris; other possible values are stable, controlled, uncontrolled. Attribute Arrhythmia indicates if a large and heterogeneous group of conditions in which there is abnormal electrical activity in the heart exists [27]; the possible values for this attribute are none, a-fib ≥ 90, other, where a-fib ≥ 90 means atrial fibrillation is present for greater than 90 days. For ASA grade, value one means the patient is fit and well for her/his age. Value two means the patient's cardiovascular disease is mild, i.e. it does not hamper enjoyment of daily activities. Value three means the patient's cardiovascular disease is severe, i.e. it restricts the patient's daily activities. Value four means the patient's cardiovascular disease is life-threatening [25]. Aspirin indicates if the patient takes aspirin. Blood loss represents the blood loss in surgery in millilitres. Coronary artery bypass surgery indicates if coronary artery bypass surgery is present. Carotid status indicates a patient's health status related to carotid arteries. Congestive cardiac failure indicates if heart failure has occurred and when it occurred. Diabetes indicates if and what kind of diabetes is present. Value impaired glucose tolerance means the patient is in a prediabetic state of dysglycemia that is associated with insulin resistance and increased risk of cardiovascular pathology [26]. Value Diet Rx pill indicates the patient takes Diet Rx pills. Duration is the duration of surgery in hours.

ECG describes electrocardiography, i.e. a transthoracic (across the thorax or chest) interpretation of the electrical activity of the heart over a period of time. Several categorical values are used: normal, q waves, st-t waves, a-fib 60-90, a-fib ≥ 90, five ectopic, other abnormal rhythm, other. Value normal means there are no abnormalities in electrocardiography. Value q waves means Q wave abnormalities are present. Value st-t waves means ST-T wave [28] abnormalities are present. Values a-fib 60-90 and a-fib ≥ 90 are related to atrial fibrillation [29]. Value five ectopic means the patient has five or more ectopic heart beats per minute. Value other abnormal rhythm means some other abnormal rhythm. Value other represents all other abnormalities. Hypertension indicates if high blood pressure is present. Myocardial infarct indicates if a heart attack has occurred or when it occurred. Patch indicates which material is used for bypass patching in the patient's surgery. The values arm vein/leg vein/other vein indicate the different patient body part sources used, while the values dacron and ptfe express the use of synthetic material, either Dacron or polytetrafluoroethylene. Value stent means a stent is inserted into the patient's body. Value none shows there has not been any bypass patching for the patient. Value other means something else is used.

Renal failure indicates if renal insufficiency is present. Respiratory problem indicates problems with breathing; possible values are mild COAD (chronic obstructive airway disease), moderate COAD and severe COAD. Sex represents the gender of the patient. Shunt indicates if a shunt is present. Attribute Side holds the side of surgery. Smoking relates to the smoking habits of the patient. Attribute Warfarin indicates if the patient takes warfarin. Class attribute Risk is used to classify instances into two possible categorical class values, high and low risk. The values of the class attribute are generated according to the following heuristic model [30]: an instance (cardiovascular patient) is classified as "high" if the patient's death or a severe cardiovascular event (e.g. stroke, myocardial relapse or cardiovascular arrest) appears within 30 days after an operation.

Table 3 :
Baseline classification using mean-mode imputation.

Table 4 :
Different missing imputation methods with decision tree classification.

Table 4 presents the Decision Tree (J48) classification outcome of the datasets prepared by different missing value imputation methods.

Table 5 :
Different missing imputation methods with K-NN classification.

Table 6 :
Different missing imputation methods with FURIA classification.

Table 7 :
The highest sensitivity values of different missing imputation methods without stratification.

Table 8 :
Experimental results for alternative missing value imputation methods and strategies.