FUZZY-ROUGH CLASSIFICATION FOR BRAINPRINT AUTHENTICATION

The electroencephalogram (EEG) signal is used as a biometric modality because it has been shown to be unique, universal and collectable. This work assesses the performance of fuzzy-based techniques for brainprint authentication modelling. We benchmark the Fuzzy-Rough Nearest Neighbour (FRNN) technique against the Discernibility Nearest Neighbour (D-kNN) and Fuzzy Lattice Reasoning (FLR) techniques using selected samples of brainwave data from the original UCI EEG dataset. All three classifiers are available in the fuzzy-rough version of the WEKA implementation tool. Nine selected EEG channels located at the midline and lateral regions were used in the experimentation. The coherence, mean of amplitudes and cross-correlation feature extraction methods were used to extract features from the EEG signals. The area under the ROC curve (AUC) of FRNN was promising compared with the D-kNN and FLR techniques: the FRNN model achieved the best AUC of 0.904, whereas the D-kNN and FLR models recorded 0.770 and 0.563, respectively. However, the classification accuracies of the three classifiers show no significant difference. The results confirm that the classification accuracies of the D-kNN and FLR techniques are not reliable, because they are dominated by true negative cases. Hence, we conclude that the FRNN model is less biased by the imbalanced data problem than the D-kNN and FLR models. Future work should focus on optimizing EEG channel and feature selection in order to obtain a better data representation of the biometric brainprint for more efficient authentication under imbalanced data.


INTRODUCTION
The aim of brainprint authentication is to accept or reject the identity claimed by an individual. There are numerous types of person authentication methods, such as knowledge-based, token-based and biometric methods. The commonly used Personal Identification Number (PIN) and password are examples of knowledge-based authentication, while the signature is an example of token-based authentication. However, passwords and signatures are considered the weakest authentication models, because a password can be stolen and a signature can be forged easily. Biometric systems such as fingerprint, iris, face, voice and hand geometry authentication were introduced to overcome the security weaknesses of the traditional authentication methods. Among these, fingerprint and face recognition are the most common modalities in today's biometric authentication systems. The fingerprint scheme [1] is widely used, but is still prone to forgery: the technology recognizes only the ridge arrangement on the finger surface, and intruders can easily replicate a fingerprint using silicone or gelatine to infringe security systems. Facial recognition is also less promising, because facial structure changes as a person ages. The above-mentioned limitations can be overcome using a more secure biometric modality: the human brainprint. The brainprint extracted through electroencephalogram (EEG) signals is a highly secure biometric modality for person authentication, and over recent years EEG-based person authentication has attracted much research attention [2]-[3].
Various types of soft computing techniques have been applied to EEG signal classification. Artificial neural networks (ANNs), fuzzy logic, K-Nearest Neighbour (kNN), linear discriminant analysis (LDA) and support vector machines (SVMs) are examples of soft computing techniques for EEG signal classification. Gui et al. [4] investigated visual evoked potential (VEP) data collection using a low-cost sensor system; ANNs were used for EEG-based biometric authentication and the classification accuracy achieved was around 90%. Back-propagation neural networks, SVM and LDA have also been used to classify EEG signals.

RELATED WORK
Berger was the first to record EEG signals, in 1929 [7]. EEG is defined as the electrical activity recorded from the scalp surface [8]; EEG signals are the electromagnetic waves emitted by the human brain's neurons. EEG is the most practical capturing method for biometrics due to the advances in its hardware devices. EEG recording is a completely non-invasive procedure that can be repeatedly applied to normal adults, patients and children with virtually no risk or limitation [8]. The main advantages of using brain electromagnetic waves are the uniqueness and liveness of the EEG signals; in addition, the recorded brain responses cannot be replicated and the individual's identity cannot be stolen. A research work [9] showed that EEG signals vary from one individual to another, even when individuals perform a similar task or thought. Conditions such as stress, anxiety, fatigue, medication, drowsiness and environment can increase the difficulty of reproducing a similar pattern of EEG signals [10]. For example, a person under the influence of stress will generate different EEG signals compared to his/her normal state.
EEG recording electrodes and their function are critical for obtaining high-quality data for interpretation [8]. One important problem of EEG signal recording is artifacts. Examples of artifacts occurring in EEG recording are eye blinking, head movements, muscle activity and electrocardiogram (ECG) interference. Due to the very low amplitude of EEG signals, artifacts often contaminate the recordings, hindering analysis and interpretation. Therefore, the subject's position during EEG recording should be very comfortable to avoid unnecessary activity; a lying position reduces some artifacts caused by slight body motion. One of the ideas that combined EEG signals with authentication systems was proposed by Thorpe et al. [11]: their authentication system was designed around a "pass-thought", which is reliable due to the uniqueness of EEG signals. Apart from that, a consumer-grade EEG headset was used by Ashby et al. [12] for authentication purposes.
Marcel and Millán [13] achieved a high authentication performance of 93.4% in terms of accuracy. A total of 9 normal subjects were asked to perform 3 tasks (i.e., left-hand movement, right-hand movement and generation of words beginning with the same random letter) during 12 non-feedback sessions over 3 days, i.e., 4 sessions per day. The classification accuracy reached around 80% in the research work [14], which analyzed 8-channel EEG signals from a group of 40 volunteers who performed a simple experiment (i.e., relaxing with open and closed eyes). In addition, the research work by Jian-Feng [15] used the BCI Competition 2003 EEG dataset, recorded from a total of 64 channels and sampled at 250 Hz; the authentication classification results ranged from 75% to 85%. Biometric authentication based on EEG signals conducted in [16] covered three tasks, with classification accuracies of 97.3% (reading task), 94.4% (relaxation task) and 97.5% (multiplication task). The research work in [17] combined EEG headsets with a smartphone for EEG-based person authentication. Besides, an EEG-based biometric authentication system was developed in [18]: the EMOTIV EPOC+ EEG headset was used to collect the EEG signals and the classification accuracy achieved was 96.97%.
Mean corresponds to the centre of a set of values. It is a time-domain feature, calculated over the reconstructed EEG signal amplitude and time duration. Mean was used in [19] as one of the features of the filtered signals; the extracted signals were then classified as normal or epileptic using an artificial neural network. In addition, time-domain features such as mean, median, mode, standard deviation, minimum and maximum were used in [20] to analyse EEG signals for detecting brain abnormalities.

Correlation is a mathematical operation very similar to convolution, and cross-correlation measures the extent of the similarity relationship between two signals. It is able to detect non-stationarity and is widely used for the analysis of EEG time series. Cross-correlation has been combined with a Support Vector Machine (SVM) classifier for EEG signal classification [21]: a set of five features was extracted from the cross-correlation and used to train the SVM, with signals acquired from a healthy subject serving as the reference for comparison. With the aid of cross-correlation, the SVM performed better in pattern recognition, achieving an accuracy of 94.5%. Hence, cross-correlation is a very useful feature extraction technique for gaining insight into EEG signals.

Coherence is one of the feature extraction methods widely used for EEG signal analysis. Coherence is a linear correlation measure between two signals at different frequencies. It was first used as a feature in [22] for measuring the mean coupling between signals recorded from an electrode and its neighbours. In addition, mutual information, coherence and cross-correlation were used in [23] for an EEG biometric system; the features extracted from the EEG signals proved unique enough among subjects for biometric applications. That work used an unobtrusive authentication method with only 2 frontal electrodes and 1 reference electrode placed at the left ear lobe.
Due to the low signal-to-noise ratio and non-stationarity of EEG signals, uncertainty modelling tools, such as fuzzy sets and rough sets, are needed to handle the related problems. Fuzzy set theory [24] and rough set theory [25] are good solutions for handling uncertainty and manipulating incomplete data. Fuzzy-rough sets provide a higher degree of flexibility in dealing with the imprecision and vagueness existing in real-world data [26]-[30]. The Fuzzy-Rough Nearest Neighbour (FRNN) model introduced by Jensen and Cornelis [31] hybridizes the strengths of fuzzy-rough sets and the Fuzzy Nearest Neighbour (FNN) approach so that they complement each other. The constructed fuzzy lower and upper approximations are used to avoid the use of fuzzy logical connectives altogether. Moreover, fuzzy-rough sets allow an element to belong to more than one class. The FNN model extends the kNN algorithm to fuzzy set theory and has been shown to outperform the standard nearest neighbour model [32]. The FNN model allows partial membership of an object in different classes and takes into account the closeness of each neighbour with respect to the test instance. Unfortunately, the FNN algorithm has difficulty dealing with imperfect data. Therefore, the hybridization of fuzzy-rough sets and the FNN algorithm into the fuzzy-rough nearest neighbour (FRNN) algorithm allows both to complement each other in order to achieve good performance.
The FRNN algorithm uses nearest neighbours to compute fuzzy lower and upper approximations in order to predict the class of test objects [31]. Owing to these fuzzy approximations, the FRNN algorithm outperforms other nearest neighbour approaches and Naïve Bayes prediction models in classification problems, as shown in the experiment by Sarkar [33]. Three nearest neighbour approaches, namely the conventional kNN algorithm, the FNN algorithm and the FRNN algorithm, were used to classify the Wisconsin Breast Cancer problem [34]. The dataset consisted of 699 samples, each providing ten numerical attributes; a total of 16 samples with missing attributes were removed. In this experiment, the FRNN algorithm achieved the highest classification performance among the three algorithms. Moreover, the time complexity of the FRNN algorithm is the same as those of the conventional kNN algorithm and the FNN algorithm. Furthermore, the FRNN algorithm was applied to predicting distressed companies on the China stock market [34]. The FRNN algorithm is able to use unbalanced and unmatched training and testing datasets in prediction; the prediction accuracy achieved was 78.37%, which is better than that of the FNN classification approach. That study concluded that the FRNN approach performs better than the conventional kNN approach and the FNN approach: FRNN not only can deal with unbalanced data, but also performs well when dealing with incomplete data.

The D-kNN approach is an extension of the kNN algorithm based on the concept of discernibility. D-kNN computes the discernibility of the neighbours and their distances from the test objects. The main benefit of the D-kNN approach is that it does not allow the classes of a dataset to overlap, since the algorithm considers the structural properties of the neighbours. A comparison of three nearest-neighbour classifiers, the conventional kNN algorithm, the Weighted kNN (W-kNN) algorithm and the D-kNN algorithm, was carried out using the Bupa Liver dataset [35]. Among these classifiers, D-kNN yielded the best classification accuracy and net reliability, although its processing time is slightly longer than that of the conventional kNN and W-kNN algorithms. W-kNN yielded the worst accuracy, because the dataset contains only six features, all of equal importance; W-kNN performs well when the dataset contains a larger number of features.
FLR is a rule-based classifier. The term "fuzzy lattice" was introduced by Naseem in 1994 [36] based on the concept of a fuzzy partial-order relation. The benefit of lattice theory is its capability of handling uncertain information and missing data [37]. Fuzzy lattices can be used in classification and clustering algorithms and have been successfully applied to real-world problems, such as pattern recognition [38], air quality assessment and ambient ozone estimation [39]. The FLR classifier was successfully applied to ambient ozone estimation [39], where the results with and without missing values were compared: the FLR classifier attained similar accuracies on the dataset with missing values (84.6%) and without missing values (83.23%). Furthermore, the FLR classifier required the least time for training and testing, using only around 1.5 seconds, while back-propagation neural networks took between 3 and 25 minutes. In recent years, the FLR classifier has been used for image recognition, such as human facial expressions [40]-[41]. However, there is still a lack of research on EEG signal classification using FLR.
Data pre-processing and feature extraction are important steps in performing FLR classification. Seven facial expressions, namely neutral, angry, disgust, fear, happy, sad and surprised, were recorded. The dataset was divided into 75% training data and 25% testing data. In this experiment, the FLR classifier performed better than the conventional kNN algorithm [40].

CLASSIFICATION
In this study, the FRNN, D-kNN and FLR techniques were used to accomplish brainprint authentication modelling. Brainprint authentication modelling consists of only 2 classes: client and impostor. The FRNN, D-kNN and FLR techniques are available in the fuzzy-rough version of the Waikato Environment for Knowledge Analysis (WEKA), which can be freely downloaded from http://users.aber.ac.uk/rkj/book/wekafull.jar.

Fuzzy-Rough Nearest Neighbour (FRNN)
Fuzzy-Rough Nearest Neighbour (FRNN) was introduced by Jensen and Cornelis [31] in 2011. It is a hybrid model combining fuzzy sets, rough sets and nearest neighbour classification. In the FRNN algorithm, the lower and upper approximations are constructed from the nearest neighbours to allocate a decision class to the test object; the details can be found in Algorithm 1 [31]. The FRNN algorithm calculates the similarity between two objects and classifies the test object into the most plausible decision class. FRNN classifies the test object based on the single nearest neighbour with the highest similarity measure; therefore, the value of k does not affect the classification performance. The FRNN technique captures uncertainty using fuzzy-rough approximations. The fuzzy upper and lower approximations are constructed so as to avoid the use of fuzzy logical connectives altogether; these connectives are key to developing fuzzy-rough set theory.
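For illustration, the following minimal Python sketch reproduces the FRNN decision rule in the spirit of [31]. The similarity function, the pairing of the minimum t-norm with the Kleene-Dienes implicator, and all identifiers are assumptions made for exposition, not the exact WEKA implementation.

    import numpy as np

    def frnn_predict(X_train, y_train, x_test, classes, k=10):
        """Sketch of the FRNN decision rule (after Jensen and Cornelis [31]).
        Connective and similarity choices are illustrative assumptions."""
        # Fuzzy similarity: 1 - normalized Euclidean distance (assumption).
        d = np.linalg.norm(X_train - x_test, axis=1)
        sim = 1.0 - d / (d.max() + 1e-12)
        nn = np.argsort(d)[:k]                         # k nearest neighbours
        best_class, best_score = None, -np.inf
        for c in classes:
            member = (y_train[nn] == c).astype(float)  # crisp class membership
            # Lower approximation: infimum of I(a, b) = max(1 - a, b).
            lower = np.min(np.maximum(1.0 - sim[nn], member))
            # Upper approximation: supremum of T(a, b) = min(a, b).
            upper = np.max(np.minimum(sim[nn], member))
            score = (lower + upper) / 2.0              # average of the two
            if score > best_score:
                best_class, best_score = c, score
        return best_class

The test object is assigned to the class whose lower and upper approximation memberships are, on average, the largest.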

Discernibility Nearest Neighbour (D-kNN)
The Discernibility Nearest Neighbour (D-kNN) classifier can handle overlapping classes in a dataset, unlike the original kNN. The discernibility of the neighbours is first calculated, followed by their distances from the test objects; the algorithm of D-kNN is shown in Algorithm 2 [35]. The structural properties of the neighbours play an important role in D-kNN prediction [35]. The discernibility ratio or distance is computed for each neighbour, and the average of the ratios is taken for each class. D-kNN thus classifies test elements based not only on the concept of nearest neighbours, but also on discernibility scores. A discernibility score is produced for each object to be classified; the average of the discernibility scores of the neighbouring objects and their distances from the test object are then calculated for each of the possible classes. Next, the classification score $S_j$ is calculated for each class $j$. Finally, the classification scores of the different classes are compared in order to classify the test object: the higher the classification score, the more likely that class is the output of the classification.
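Since the exact discernibility measure of [35] is not reproduced above, the following Python sketch only illustrates the overall scoring scheme; the stand-in discernibility (a neighbour's local class purity) and all names are hypothetical.

    import numpy as np

    def dknn_predict(X_train, y_train, x_test, classes, k=10, eps=1e-12):
        """Hypothetical sketch of D-kNN scoring; the discernibility of a
        neighbour is approximated by its local class purity (assumption)."""
        d = np.linalg.norm(X_train - x_test, axis=1)
        nn = np.argsort(d)[:k]                         # k nearest neighbours

        def discernibility(i):
            # Fraction of neighbour i's own k neighbours sharing its class.
            di = np.linalg.norm(X_train - X_train[i], axis=1)
            local = np.argsort(di)[1:k + 1]            # skip the point itself
            return np.mean(y_train[local] == y_train[i])

        scores = {}
        for c in classes:
            members = [i for i in nn if y_train[i] == c]
            # S_j: average discernibility weighted by closeness to the test object.
            scores[c] = (np.mean([discernibility(i) / (d[i] + eps)
                                  for i in members]) if members else 0.0)
        return max(scores, key=scores.get)             # highest score wins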

Fuzzy Lattice Reasoning (FLR)
Fuzzy Lattice Reasoning (FLR) is a classifier that extracts rules from the input data based on fuzzy lattices; the order in which the input data are presented is important. FLR can deal with different types of data, for example fuzzy sets, real vectors, images, symbols, graphs and waves, and it can deal with both points and intervals. Apart from that, FLR has the ability of knowledge representation and is capable of extracting implicit features beyond the data, representing the data as rules. Furthermore, FLR can combine different types of data, handle missing data and cope with both complete and incomplete lattices.
$(a_i, C_i)$ is the representation of an input datum to the FLR model, where $C_i$ is the class label of datum $a_i$; it can be interpreted as the rule "if $a_i$ then $C_i$". An input datum $(a_0, C_0)$ is presented to the network in the learning phase. The degrees of inclusion between the input and the $L$ rules stored in the rule base RB are calculated as $k(a_0 \le A_1), \ldots, k(a_0 \le A_L)$. FLR chooses the rule $A_J$ with $J = \arg\max_{l \in \{1,\ldots,L\}} k(a_0 \le A_l)$ as the winner rule. If the winner rule $A_J$ and the input datum $a_0$ have the same class label and the size of $a_0 \vee A_J$ is less than a user-defined threshold, then the winner rule is updated to $a_0 \vee A_J$. There is only one parameter that can be tuned in FLR, namely the threshold size $D_{crit}$, which indicates the maximum size of a hyperbox to be learned. Larger values of $D_{crit}$ result in more generalized rules, while smaller values result in more specific rules.
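This learning step can be sketched in Python as follows, assuming inputs normalized to $[0,1]^n$, rules stored as (hyperbox, label) pairs and a simple positive valuation $v([lo, hi]) = \sum(1 - lo) + \sum(hi)$; these choices are illustrative assumptions, not the exact formulation of the cited works.

    import numpy as np

    def v(box):
        # Positive valuation of a hyperbox [lo, hi] (illustrative choice).
        lo, hi = box
        return float(np.sum(1.0 - lo) + np.sum(hi))

    def join(a, b):
        # Lattice join: the smallest hyperbox containing both operands.
        return np.minimum(a[0], b[0]), np.maximum(a[1], b[1])

    def inclusion(x, u):
        # Fuzzy inclusion measure k(x <= u) = v(u) / v(x v u).
        return v(u) / v(join(x, u))

    def flr_learn_step(rules, x, label, d_crit=0.1):
        """One simplified FLR learning step for a point x in [0,1]^n."""
        x_box = (x, x)                                 # a point is a trivial box
        if rules:
            j = max(range(len(rules)),
                    key=lambda i: inclusion(x_box, rules[i][0]))
            box_j, label_j = rules[j]                  # the winner rule A_J
            joined = join(box_j, x_box)
            if label_j == label and np.sum(joined[1] - joined[0]) < d_crit:
                rules[j] = (joined, label)             # widen the winner rule
                return rules
        rules.append((x_box, label))                   # otherwise add a new rule
        return rules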

EXPERIMENTATION
EEG signal classification is a difficult task because EEG signals are non-stationary, high-dimensional and have a low signal-to-noise ratio (SNR). Thus, the data pre-processing and data preparation steps are important.

Data Pre-processing and Data Preparation
In this study, a free EEG dataset from the UCI Machine Learning Repository [42] is used. The online dataset comes in three versions, the small, large and full datasets, containing 1, 10 and 122 individuals, respectively. The UCI EEG dataset was recorded from both alcoholic and non-alcoholic persons. Since this study focuses on person authentication modelling, only the non-alcoholic data are used; alcoholic data are not suitable, because data collected from alcoholic persons might be less accurate due to their brains having been affected by alcohol. The large dataset is used in this study, but one of its individuals is replaced by an individual from the full dataset, because that individual has many redundant trials which would affect the results. Each individual completed 60 trials. The EEG dataset consists of measurements from 64 electrodes (61 active electrodes + 3 reference electrodes) placed on the scalp, sampled at 256 Hz.
The stimuli were composed of 90 images chosen from the 260-image black-and-white Snodgrass and Vanderwart picture set [43]. The subjects were requested to recognize each image as soon as it was displayed on the computer screen, which was placed 1 meter from the subject's eyes. Each image remained on the screen for 300 ms, and the inter-stimulus interval (ISI) was set to 3200 ms. The visual stimulus presentation is illustrated in Figure 1. In general machine learning model building, common suggestions for train/test splits are 60/40, 70/30, 80/20 or even 90/10 if the dataset is relatively large [34]. A higher proportion of training data tends to produce a better model, but sacrifices the objectivity of the test results due to the low number of test instances; hence, the larger the dataset, the higher the train/test proportion that may be applied. However, machine learning experiments seldom use the 90/10 proportion unless the dataset is extremely large. We used an 80/20 train/test proportion in this study, where 480 instances were used for model building and 120 instances for model testing, as sketched below.
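A minimal sketch of this split follows, assuming the 600 extracted feature vectors are held in arrays X and y; the feature dimensionality below is an arbitrary placeholder.

    import numpy as np
    from sklearn.model_selection import train_test_split

    # Stand-in for the 600 extracted feature vectors (10 subjects x 60 trials).
    X = np.random.default_rng(0).standard_normal((600, 27))
    y = np.repeat(np.arange(10), 60)

    # 80/20 split: 480 instances for model building, 120 for testing;
    # stratifying preserves each subject's share of trials in both sets.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=42)
    print(len(X_train), len(X_test))                   # 480 120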
At this stage, we checked the trials in order to avoid redundant trials between the train and test sets. The dataset has an equivalent distribution of trials among the S1 object, S2 match and S2 not-match conditions for both the train and test sets. In this study, we selected only 100 data points, corresponding to approximately 300 milliseconds (ms), because the VEP normally occurs within the first 300 ms. Besides, the EEG signals of S2 differ from those of S1, because S2 involves brain information about the match/not-match analysis. Only the electrodes located at the midline and lateral sides were considered, because midline and lateral electrodes give stronger electrical responses to visual stimuli [44]. The lateral electrodes are PO7, PO8, O1, O2 and OZ, while the midline electrodes are FPZ, FZ, CZ, PZ and OZ.

Feature Extraction
A set of feature vectors is retrieved from the raw EEG dataset. The extracted feature vectors act as a different observation for the purpose of classification. Besides, feature extraction reduces the dimensionality of the input attributes compared to the raw EEG dataset. In this study, coherence, cross-correlation and mean of amplitudes were selected based on the literature. The three feature extraction methods are described as follows:

a) Coherence:
Coherence is used to compute the degree of linear correlation between two signals. The correlation between two signals at different operating frequencies can be revealed by coherence [17]. EEG-based coherence analysis has been shown to be suitable for use in biometrics [45]. Coherence ranges from 0 to 1, where a value of 0 indicates that the two signals are independent, while a value of 1 indicates that the two signals are completely linearly dependent. The coherence is calculated as:

$$C_{xy}(f) = \frac{|P_{xy}(f)|^2}{P_{xx}(f)\,P_{yy}(f)}$$

where $P_{xx}(f)$ and $P_{yy}(f)$ are the power spectral densities of signals $x$ and $y$, and $P_{xy}(f)$ is the cross-power spectral density of $x$ and $y$.
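For instance, the magnitude-squared coherence of two channels can be estimated with SciPy; the synthetic signals below are placeholders for real EEG channels.

    import numpy as np
    from scipy.signal import coherence

    fs = 256                                   # UCI EEG sampling rate (Hz)
    rng = np.random.default_rng(0)
    x = rng.standard_normal(100)               # stand-ins for two EEG channels
    y = 0.7 * x + 0.3 * rng.standard_normal(100)

    # C_xy(f) = |P_xy(f)|^2 / (P_xx(f) P_yy(f)), estimated via Welch's method.
    f, Cxy = coherence(x, y, fs=fs, nperseg=64)
    print(Cxy.min(), Cxy.max())                # all values lie in [0, 1]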

b) Cross-correlation:
Cross-correlation, also known as a sliding dot product, is used to compute the similarity between two signals. It is frequently used to detect the presence of a known signal sequence within an unknown one. It is a function of the relative delay between the signals and has applications in pattern recognition. Three cross-correlation sequences are calculated from the two input signals: channel 1 with itself ($R_{11}$), channel 2 with itself ($R_{22}$) and channel 1 with channel 2 ($R_{12}$). The correlation between two random variables $X$ and $Y$ with expected values $\mu_X$ and $\mu_Y$ and standard deviations $\sigma_X$ and $\sigma_Y$ is given as:

$$\rho_{XY} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$

where $E(\cdot)$ is the expectation operator and $\operatorname{cov}(\cdot)$ is the covariance operator.
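Both quantities can be computed directly with NumPy; synthetic channels again stand in for real EEG data.

    import numpy as np

    rng = np.random.default_rng(0)
    ch1 = rng.standard_normal(100)             # stand-ins for two EEG channels
    ch2 = 0.5 * ch1 + 0.5 * rng.standard_normal(100)

    # Pearson correlation: rho = cov(X, Y) / (sigma_X * sigma_Y).
    rho = np.cov(ch1, ch2)[0, 1] / (np.std(ch1, ddof=1) * np.std(ch2, ddof=1))

    # Cross-correlation sequences R_11, R_22 and R_12 over all lags.
    r11 = np.correlate(ch1, ch1, mode='full')  # channel 1 with itself
    r22 = np.correlate(ch2, ch2, mode='full')  # channel 2 with itself
    r12 = np.correlate(ch1, ch2, mode='full')  # channel 1 with channel 2
    print(rho, r12.argmax() - (len(ch1) - 1))  # correlation and best-match lag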

c) Mean of Amplitudes:
Mean, also known as average, is the sum of all EEG potential values divided by the number of data points. The mean is calculated as:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

where $N$ is the number of data points and $x_i$ is the value of the $i$-th data point.
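Computed per channel over the selected data points, this yields one feature per channel; a short sketch with a placeholder epoch array:

    import numpy as np

    # Placeholder epoch: 9 selected channels x 100 selected data points.
    epoch = np.random.default_rng(0).standard_normal((9, 100))

    # Mean of amplitudes per channel: (1/N) * sum_i x_i.
    mean_amplitude = epoch.mean(axis=1)
    print(mean_amplitude.shape)                # one feature per channel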

Experimental Setting
In the FRNN algorithm, fuzzy logical connectives are crucial for developing fuzzy-rough set theory. A triangular norm (t-norm) $T$ is any increasing, associative and commutative mapping $T: [0,1]^2 \to [0,1]$ satisfying $T(1, x) = x$ for all $x$ in $[0,1]$. Following [31], the Kleene-Dienes implicator, $I(x, y) = \max(1 - x, y)$ for $x, y$ in $[0,1]$, was implemented. In addition, the experimental setting for D-kNN was the same as for the FRNN classifier: Kleene-Dienes was chosen for both the t-norm and the implicator. Moreover, there is only one parameter that can be tuned in the FLR algorithm, namely the threshold size $D_{crit}$; we set $D_{crit} = 0.1$ [38] in our experiment.
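For reference, the connectives can be written in a few lines of Python; the pairing of the minimum t-norm with the Kleene-Dienes implicator shown here is one common choice and is an assumption, since the WEKA tool encapsulates these internally.

    def t_norm_min(a, b):
        return min(a, b)            # minimum t-norm: T(a, b) = min(a, b)

    def impl_kleene_dienes(a, b):
        return max(1.0 - a, b)      # Kleene-Dienes implicator: I(a, b) = max(1 - a, b)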

Performance Measures and Statistical Test
The experimental results are analyzed based on the accuracy and the area under the receiver operating characteristic (ROC) curve (AUC). The AUC measure is used as one of the performance measures in this study because it is more reliable and statistically consistent than the accuracy measure [46]. The accuracy and AUC of FRNN are compared with the results obtained from D-kNN and FLR, in order to test whether FRNN can perform better than these other classification algorithms.
Beforehand, the normality of the data distribution is verified using the Anderson-Darling test. The Anderson-Darling test [47] is a modification of the Kolmogorov-Smirnov (K-S) test. Compared to the K-S test, the Anderson-Darling test gives more weight to the distribution tails, and its critical values are calculated for the specific distribution under test. The Anderson-Darling statistic is calculated as:

$$A^2 = n \int_{-\infty}^{\infty} \left[F_n(x) - F(x)\right]^2 w(x)\, dF(x)$$

where $F_n$ is the empirical distribution function, $F$ is the hypothesized distribution and $w(x)$ is a non-negative weight function defined as:

$$w(x) = \frac{1}{F(x)\left[1 - F(x)\right]}$$

The normality of the data distribution must be determined before performing a statistical test. A statistical test is performed in order to determine the confidence level at which conclusions can be drawn. A parametric test is chosen when the data are normally distributed, while a non-parametric test is chosen when they are not. Parametric tests, such as the Z-test, paired-sample t-test or F-test, yield higher accuracy when the data are normally distributed; conversely, if the data are normally distributed and a non-parametric test is performed, the results will not be as accurate as with a parametric test.
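In practice the test is available off the shelf; a minimal sketch with SciPy, using synthetic placeholder scores:

    import numpy as np
    from scipy.stats import anderson

    rng = np.random.default_rng(0)
    acc_frnn = rng.normal(0.90, 0.02, 30)      # stand-in accuracy scores

    result = anderson(acc_frnn, dist='norm')
    # Normality is rejected when the statistic exceeds a critical value.
    print(result.statistic, result.critical_values, result.significance_level)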
From the Anderson-Darling normality test, the accuracy of FRNN, the accuracy of D-kNN and the AUC of D-kNN are normally distributed, while the accuracy of FLR, the AUC of FRNN and the AUC of FLR are not. Therefore, a paired-sample t-test is performed between the accuracy of FRNN and the accuracy of D-kNN, whereas the Wilcoxon signed-rank test is performed when the results are not normally distributed.
A paired-sample t-test is performed to compare the difference of means between paired observations, using IBM SPSS Statistics 22. The paired-sample t-test is a statistical validation method used to compare the means from different sources in a dataset [48]; its purpose is to investigate the significance of differences between two groups. The null hypothesis of the paired-sample t-test states that the mean difference between the paired observations is zero:

$$H_0: \mu_d = 0$$

On the other hand, the Wilcoxon signed-rank test is frequently used for non-parametric testing as an alternative to the paired-sample t-test. The Wilcoxon signed-rank test evaluates the difference of medians between paired data and is a relatively powerful test for distinguishing differences between two samples [49]. Nominal data cannot be analyzed with the Wilcoxon signed-rank test, because the difference between nominal data points has no meaningful value.
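Both tests are also available in SciPy; a minimal sketch on synthetic paired scores:

    import numpy as np
    from scipy.stats import ttest_rel, wilcoxon

    rng = np.random.default_rng(0)
    acc_frnn = rng.normal(0.90, 0.02, 30)      # stand-in paired scores
    acc_dknn = rng.normal(0.88, 0.02, 30)

    t_stat, p_t = ttest_rel(acc_frnn, acc_dknn)   # parametric (normal data)
    w_stat, p_w = wilcoxon(acc_frnn, acc_dknn)    # non-parametric alternative
    print(p_t < 0.05, p_w < 0.05)                 # reject H0 at the 5% level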
In a statistical test, the null hypothesis is rejected if and only if the p-value is less than 0.05, meaning that there is a statistically significant difference between the two samples. Conversely, the null hypothesis is retained if the p-value is larger than 0.05, meaning that there is no statistically significant difference between the two samples. A statistical test was performed for each pairwise comparison of the classifiers at the 95% confidence level. As will be seen in the results, the AUC of the FLR model is the lowest; a possible reason is the parameter setting of the FLR model. As previously described, there is only one tunable parameter, the threshold size, and larger threshold values result in more generalized rules [39]. Since the threshold used in this work is small ($D_{crit} = 0.1$ [38]), the extracted rules are more specific, and this affects the AUC, whose perspective differs from that of accuracy. Thus, the AUC of the FLR model is lower than those of the FRNN and D-kNN models.

EXPERIMENTAL RESULTS AND DISCUSSION
Observing the overall classification results, FRNN achieved good performance in terms of accuracy and AUC compared to the D-kNN and FLR models. As previously described, the FRNN algorithm is a fusion model that combines the strengths of fuzzy-rough sets and the FNN approach. The decision class is determined by using the fuzzy lower and upper approximations to compute the membership value of a test object [31]; these approximations play a crucial role in dealing with noisy data such as EEG signals. Hence, FRNN is able to perform better. Table 2 shows the statistical tests for the comparison of accuracy and AUC among the FRNN, D-kNN and FLR models. The paired-sample t-test was used only for the comparison between the accuracy of the FRNN model and that of the D-kNN model. The p-value of this comparison is 0.071, which is greater than 0.05; thus, the accuracies of the FRNN and D-kNN models are not significantly different. The Wilcoxon signed-rank test was used for the remaining comparisons. The p-value for the comparison between the accuracy of the FRNN model and that of the FLR model was 0.767, which is greater than 0.05; hence, there is also no significant difference between the FRNN and FLR models. In summary, the accuracy comparisons showed no significant differences among the models.
In comparison, the statistical test between the AUC of the FRNN model and that of the D-kNN model showed a significant difference, with a p-value of 0.004. From the mean values in Table 2, the AUC of the FRNN model is clearly higher than that of the D-kNN model; thus, the FRNN model performed better than the D-kNN model. Furthermore, a statistical test was carried out between the AUC of the FRNN model and that of the FLR model. The p-value was 0.006, indicating a significant difference in the paired set. The AUC of the FRNN model reached 0.904, while the AUC of the FLR model was only 0.563, which is a poor result. In other words, the FRNN model performed significantly better than both the D-kNN and FLR models.

CONCLUSIONS
Among the fuzzy set and rough set approaches, the FRNN model proved significantly better, in terms of AUC, than the D-kNN and FLR models in EEG signal classification for brainprint authentication modelling.
The AUC of the FRNN model is 0.904, which is considered an excellent classification result. However, further work should be done on the FRNN model to improve both accuracy and AUC, since a good authentication system should approach perfect classification. The classification results of the FRNN model are more stable and consistent than those of the D-kNN and FLR models. This study demonstrated the importance and capability of fuzzy-rough approximations in handling uncertain and non-stationary signals.