Similarity Analysis of EEG Data Based on Self Organizing Map Neural Network

Electroencephalography (EEG) is the recording of electrical activity along the scalp. The recorded data are very complex. EEG plays an important role in several applications, such as the diagnosis of human brain diseases and epilepsy. EEG signals can also be used to control an external device by thought via a Brain Computer Interface (BCI). Many algorithms exist for analysing recorded EEG data, but the analysis still remains a major challenge. In this article, we extend our previously proposed method. The extended method uses a Self-Organizing Map (SOM) as an EEG data classifier. The proposed method can be divided into the following steps: capturing raw EEG data from the sensors; filtering the data, keeping frequencies in the range from 0.5 Hz to 60 Hz; smoothing the data with 15th-order polynomial curve fitting; converting the filtered data into text using Turtle Graphics; measuring the similarity between two EEG data trials with the Lempel-Ziv complexity; and using a Self-Organizing Map neural network as the final classifier. The experimental results show that our model detects about 96 % of finger movements correctly on average.


Introduction
Using EEG signals to communicate between the human brain and an external device has become one of the major current challenges in this research field. At first glance, the EEG data of different mental tasks seem identical, but they differ in detail and contain different information. We therefore need an efficient method or algorithm that detects these differences and distinguishes between the mental tasks. Once we are able to distinguish between two or more mental tasks with a satisfactory success rate, we can map every mental task to a control command of an external device, such as a prosthesis or a wheelchair. EEG signal classification has been presented by several researchers using various techniques, for example Non-negative Matrix Factorization (NMF) [1], one of the efficient methods for recognizing human mental tasks.

Related Works
Many papers in this field focus on EEG data processing. In this section, we present a brief overview of methods related to our article. Zhang et al. applied Polynomial Curve Fitting (PCF) to improve image quality in Electrical Impedance Tomography (EIT). Experiments on a 2D model confirmed the improved quality of the reconstructed image; PCF can also be used to improve the reconstructed image quality in 3D EIT [2]. Tarade and Katti compared the Auto Regressive Integrated Moving Average (ARIMA), an Artificial Neural Network (ANN) and polynomial curve fitting (PCF) for wind speed prediction. Their results showed that ARIMA performs better than the other methods [3]. Kang and Lee presented an algorithm for compensating network delays in a smart actuator based on Lagrange polynomial curve fitting. Their experimental results showed that this method can be used effectively to compensate the message delay of a smart actuator [4]. Zhang et al. proposed a method based on the polynomial curve fitting algorithm to process flight testing data; their results showed that the proposed method can efficiently and automatically eliminate outlier errors [5]. Jishui et al. proposed a multi-dimensional fuzzy reasoning algorithm to optimize the calculation process and improve the speed and accuracy of curve fitting for NC machining graphics. Their results showed that this method has a short computation time, improves the fitting precision, and is suitable for curve fitting on NC machines [6].

Introduction to EEG
Electroencephalography (EEG) measures and records the voltage differences between two sites on the scalp over time. The first recording of the electrical activity of the human brain was made by Berger (Berger, 1929), who described his method for measuring the electrical activity of the human brain on the scalp. The EEG signal commonly has an amplitude from a few microvolts up to 100 µV and a frequency in the range from 0.5 to 40 Hz [18]. The EEG signal can be recorded between two active electrodes (bipolar recording) or between one active electrode and a reference electrode (monopolar recording) [19]. EEG is generally used in the diagnosis of brain diseases and epilepsy and in research, owing to the valuable information conveyed by the EEG signal [18].

10-20 International System
EEG recording is made by placing a set of sensors on the scalp according to the 10-20 International System, as shown in Fig. 1. The 10-20 system is the international standard for determining the locations of EEG electrodes on the human skull. It comprises 21 EEG electrodes, not counting the earlobe electrodes A1, connected to the left earlobe, and A2, connected to the right earlobe. These earlobe electrodes are normally used as reference electrodes [20]. The letters F, T, C, P, and O stand for Frontal, Temporal, Central, Parietal and Occipital [21]. Electrodes with even numbers (2, 4, 6) are placed on the right side of the skull, electrodes with odd numbers (1, 3, 5) on the left side, and electrodes marked Z (zero) on the midline of the skull [20]. The numbers 10 and 20 refer to the distance between adjacent electrodes, which is either 10 % or 20 % of the total distance from the right side of the skull to the left side, or from the front to the back of the skull. Some applications need more EEG electrodes. In this case, additional electrodes can be placed between the original electrodes of the 10-20 system, as in Fig. 1 [20].
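The naming convention described above can be illustrated with a small decoder. This is only a sketch: the function name `describe_electrode` and the `REGIONS` table are hypothetical helpers, and only the letters listed in the text (plus A for the earlobe references) are handled; other prefixes used in practice are rejected.

```python
# Region letters as listed in the text; "A" (earlobe reference) is taken from
# the A1/A2 description. Any other prefix is deliberately not handled here.
REGIONS = {
    "F": "frontal",
    "T": "temporal",
    "C": "central",
    "P": "parietal",
    "O": "occipital",
    "A": "earlobe (reference)",
}


def describe_electrode(label: str) -> tuple[str, str]:
    """Return (region, side) for a 10-20 label such as 'C3', 'Cz' or 'O2'."""
    letter, suffix = label[:1].upper(), label[1:]
    if letter not in REGIONS:
        raise ValueError(f"unknown region letter in {label!r}")
    if suffix.lower() in ("z", "0"):
        side = "midline"  # Z or zero marks the midline
    elif suffix.isdigit():
        # Even numbers are on the right side, odd numbers on the left.
        side = "right" if int(suffix) % 2 == 0 else "left"
    else:
        raise ValueError(f"unsupported suffix in {label!r}")
    return REGIONS[letter], side


if __name__ == "__main__":
    for name in ("C3", "Cz", "O2", "A1"):
        print(name, describe_electrode(name))
```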

EEG Artifacts
EEG data are very sensitive and complicated. Therefore, the EEG data must be cleared of surrounding interference to obtain good and reliable results.
The EEG signal normally contains noise and different kinds of interfering signals (artifacts). These are either internal signals produced by the subject, such as the electrical activity of the heart, pulse, body movement, perspiration, eye blinking, eye movement and muscle activity, or external signals produced outside the subject, for example 50/60 Hz interference from the electrical power supply, EEG devices and electronic elements [20], [21]. In the EEG signal processing field, removing this noise and these artifacts from the EEG signal is an important topic [18]. For example, Fig. 2 shows an EEG signal contaminated by power line interference, while Fig. 3 shows an EEG signal contaminated by an eye blinking artifact. The EEG must be filtered to obtain a clean signal, without interference and artifacts, so that the data are ready for further analysis. Noise and unwanted signals must be eliminated or minimized without losing the significant information embedded in the EEG, to ensure an accurate analysis and diagnosis.
There are several techniques for filtering the EEG signal, such as conventional filters and adaptive filters. Adaptive filters are more efficient than conventional filters at eliminating artifacts from the EEG, because the EEG signal and the artifacts have overlapping spectra [23].

Turtle Graphics
Turtle graphics (TG) is a term in computer graphics for a method of programming vector graphics using a relative cursor position (the "turtle") on a Cartesian plane. In TG, we have a turtle with a drawing pen on a computer screen. This turtle responds to a sequence of commands. The turtle is controlled with these basic commands: the forward command moves the turtle ahead by a given number of units, and the right command rotates the turtle clockwise by a given number of degrees. These commands can be extended with other, more complicated commands.
The back and left commands cause the same movement as the forward and right commands, but in the opposite direction.
The number that determines how far to move or turn is called the input of the command; its value depends on the application. When the turtle moves according to the commands, it leaves a trace, and this trace represents the desired object [24], as in the simple example in Fig. 4. In this way we can represent and draw objects, from simple to complex.
Using TG, we converted EEG data from numeric values into text data and processed them as text. This conversion helps us to compare two EEG data trials of two mental tasks, such as finger movement [25].
Every EEG trial is represented by a sequence of commands: move forward and turn left or right.
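As an illustration of how a sampled signal could be turned into such a command sequence, the following sketch encodes each segment between consecutive samples as a quantized turn followed by a quantized forward move. The encoding itself is an assumption made for this example (the text does not specify the exact scheme), and the function `signal_to_turtle` and its parameters `dt`, `angle_step` and `length_step` are hypothetical.

```python
import math


def signal_to_turtle(samples, dt=1.0, angle_step=15, length_step=1.0):
    """Encode a sampled signal as a list of turtle commands (illustrative scheme).

    Each segment between consecutive samples becomes an optional turn
    (L = counterclockwise, R = clockwise, quantized to `angle_step` degrees
    and relative to the current heading) followed by a forward move
    (F, quantized to multiples of `length_step`).
    """
    commands = []
    heading = 0  # degrees; 0 points along the time axis
    for prev, cur in zip(samples, samples[1:]):
        dy = cur - prev
        angle = math.degrees(math.atan2(dy, dt))
        target = angle_step * round(angle / angle_step)
        turn = target - heading
        if turn > 0:
            commands.append(f"L{turn}")
        elif turn < 0:
            commands.append(f"R{-turn}")
        heading = target
        steps = max(1, round(math.hypot(dt, dy) / length_step))
        commands.append(f"F{steps}")
    return commands


if __name__ == "__main__":
    # The command list can be joined into plain text for further processing.
    print(" ".join(signal_to_turtle([0, 1, 1, 0])))
```

Coarser values of `angle_step` and `length_step` give a smaller command alphabet, which makes repeated patterns more likely; the values used here are arbitrary.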

Comparing Data with the LZ Complexity
The Lempel-Ziv (LZ) complexity for sequences of finite length was suggested by Lempel and Ziv [26]. It is a non-parametric, simple-to-calculate measure of complexity in one-dimensional data. The LZ complexity is related to the number of distinct substrings and the rate of their recurrence along the given sequence [27].
Larger values correspond to more complexity in the data.
Comparing two TG command lists is the main task of this article. The lists are compared with each other. The main property for the comparison is the number of sequences common to both lists. These sequences are obtained by applying the LZ complexity to the TG command lists. This number enters the similarity metric SM between two turtle command lists, Eq. (1),
where sc is the count of LZ sequences common to both command lists, and c_1, c_2 are the counts of LZ sequences in the first and second command list.
The SM gives a result in the range between 0 and 1. A result of 0 means that the two compared TG command lists have nothing in common, i.e. they have the highest difference. If the result is equal to 1, the two compared TG command lists are the same.
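A minimal sketch of such a comparison is given below. Two points are assumptions made for illustration and are not confirmed by the text: the phrases are obtained with a simple LZ78-style dictionary parse (a simplification swapped in for the original LZ complexity parsing), and the similarity is normalized as sc / min(c_1, c_2), which is one form that satisfies the stated range and end points. The function names are hypothetical.

```python
def lz_phrases(commands):
    """Split a command sequence into phrases with an LZ78-style parse.

    Scanning left to right, each new phrase is the shortest run of commands
    that has not been produced as a phrase before. A trailing run that only
    repeats an earlier phrase adds nothing new to the set.
    """
    phrases = set()
    current = ()
    for command in commands:
        current += (command,)
        if current not in phrases:
            phrases.add(current)
            current = ()
    return phrases


def similarity(commands_a, commands_b):
    """Similarity in [0, 1]: shared phrases over the smaller phrase count.

    The normalization by min(c1, c2) is an assumption of this sketch.
    """
    a, b = lz_phrases(commands_a), lz_phrases(commands_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


if __name__ == "__main__":
    first = ["F1", "L45", "F1", "R45"]
    second = ["F1", "L45", "F2"]
    print(similarity(first, first), similarity(first, second))
```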

Interpolation of the EEG Data
After recording and filtering the EEG data, we apply polynomial curve fitting for data smoothing. The fitting removes noise and interference from the data and follows the data trend.
Consider the general form of a polynomial fitting curve of order j with coefficients a_0, ..., a_j:
$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_j x^j$$
We minimize the total error of the polynomial fitting curve with the least squares approach. The general expression for the error using the least squares approach is:
$$err = \sum_{i=1}^{n} \left( y_i - \left( a_0 + a_1 x_i + a_2 x_i^2 + \dots + a_j x_i^j \right) \right)^2$$
where n is the count of data points in one move, i is the index of the current data point being summed, and j is the polynomial order.
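The least squares polynomial smoothing described above can be sketched with NumPy as follows. The helper name `smooth_trial` and the synthetic test signal are assumptions made for this example; the order of 15 is the value used in the article. `Polynomial.fit` is used because it rescales the sample positions internally, which helps with the conditioning of a high-order fit.

```python
import numpy as np
from numpy.polynomial import Polynomial


def smooth_trial(samples, order=15):
    """Least-squares polynomial fit of one trial, evaluated at the sample positions."""
    y = np.asarray(samples, dtype=float)
    x = np.arange(y.size)
    # Polynomial.fit maps x to [-1, 1] internally, which keeps a
    # high-order fit better conditioned than raw powers of x.
    fitted = Polynomial.fit(x, y, deg=order)
    return fitted(x)


if __name__ == "__main__":
    # Synthetic example: a slow oscillation with additive noise.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 256)
    clean = np.sin(2 * np.pi * 2 * t)
    noisy = clean + 0.3 * rng.standard_normal(t.size)
    smooth = smooth_trial(noisy)
    print("RMS error before:", np.sqrt(np.mean((noisy - clean) ** 2)))
    print("RMS error after: ", np.sqrt(np.mean((smooth - clean) ** 2)))
```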

Self-Organizing Map (SOM)
The Self-Organizing Map (SOM) is an unsupervised learning neural network. The SOM is most commonly used for the clustering and visualization of complex data. The SOM reduces the data dimension by producing a map, usually in one or two dimensions, that places similar data close together, as in Fig. 5. The SOM is trained over many iterations in the training phase until the output map becomes stable. This map is generated in the training phase and used in the testing phase to estimate the group to which a test input belongs. In contrast, other network types, such as backpropagation networks, use the target output to train the network [28].

SOM Algorithm
The SOM learning can be divided into the following steps:
• Initialize the weight vectors with small random values.
• Choose a random vector from the training set and present it to the network.
• Find the winning neuron, which has the minimum distance from the data input according to a specific criterion; for the SOM, the Euclidean distance between the data input and the neurons is usually used, as in Eq. (5). The winning neuron is called the Best Matching Unit (BMU).
• Calculate the radius of the neighborhood of the BMU using Eq. (6), where r(t) is the radius of the neighborhood, r_0 is the radius of the map, T is the time constant, and t is the current iteration.
• Update all nodes found within the radius of the BMU, that is, move the BMU and its neighboring nodes toward the data input as in Fig. 6, using Eq. (7).
• Repeat steps 2 to 5 for many iterations until the output map becomes stable [28].
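The steps above can be sketched in plain Python as follows. The concrete formulas are assumptions based on a common SOM formulation, not necessarily the authors' Eqs. (5) to (7): the radius decays as r(t) = r_0 exp(-t / T), the learning rate decays exponentially from a hypothetical start value `lr0`, and the update is weighted by a Gaussian of the grid distance to the BMU. The choice T = n_iter / ln(r_0) is also an assumption and requires r_0 > 1.

```python
import math
import random


def find_bmu(weights, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def radius(t, r0, time_const):
    """Neighborhood radius at iteration t (assumed exponential decay)."""
    return r0 * math.exp(-t / time_const)


def train_som(data, rows=5, cols=5, n_iter=500, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns a dict {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 1: initialize the weight vectors with small random values.
    weights = {(r, c): [rng.uniform(0.0, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    r0 = max(rows, cols) / 2
    time_const = n_iter / math.log(r0)  # assumes r0 > 1
    for t in range(n_iter):
        x = rng.choice(data)                    # step 2: random training vector
        bmu = find_bmu(weights, x)              # step 3: Euclidean BMU search
        r = radius(t, r0, time_const)           # step 4: current radius
        lr = lr0 * math.exp(-t / n_iter)        # assumed learning-rate decay
        for node, w in weights.items():         # step 5: update the neighborhood
            d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            if d2 <= r * r:
                influence = math.exp(-d2 / (2 * r * r))
                for k in range(dim):
                    w[k] += lr * influence * (x[k] - w[k])
    return weights


if __name__ == "__main__":
    data = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]]
    som = train_som(data)
    print(find_bmu(som, [0.0, 0.0]), find_bmu(som, [1.0, 1.0]))
```

In the testing phase, a new vector would be assigned to the group of its BMU on the trained map, which is what `find_bmu` returns here.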

Proposed Method
The proposed method uses an unsupervised learning neural network to classify EEG data. Our model was tested on EEG data to detect index finger movement. The method proceeds as follows: we filter and smooth the EEG data (training data set) using 15th-order polynomial curve fitting, and then convert the smoothed EEG data into text form using turtle graphics. We use the LZ complexity to compare two TG command lists and assign the type of movement to the processed data trial [29]. This is done for every sensor of the processed trial. We build a vector V of dimension 8, consisting of 7 channels and one data type class. This vector V is used to train the Self-Organizing Map (SOM) neural network with a dimension of 5 × 5 nodes to produce the map. When the training is finished, the output map is stable. In the testing phase, we use other EEG data (testing data set) to test the network, as depicted in the experiment scheme in Fig. 7.

EEG Data
The EEG data used in this experiment were recorded in our laboratory. We used seven EEG channels, which were selected by our Biomedical Department. These seven channels are able to capture most finger movement data. The recorded signals contain movements of one index finger. We recorded EEG data from four different subjects. Each of them pressed a button with the left index finger. We used 320 recorded finger movements and 320 recorded trials without finger movement. In each experiment, we used 576 trials for the training set (288 trials with movement and 288 without movement) and 64 trials for the testing set (32 trials with movement and 32 without movement). The sampling rate was set to 256 Hz, and the band-pass filter was set to 0.5 Hz to 60 Hz to remove unwanted frequencies and noise.
While extracting the task data from the captured EEG data, we added a time interval of 0.3 second before and after every task.
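The training and testing set sizes quoted above are consistent with splitting the 640 trials into ten equal, class-balanced folds, as in the 10-fold cross-validation used in the experiments. The sketch below reproduces only this arithmetic; the function `stratified_folds` and the label names are hypothetical, and the shuffling or ordering of trials is not specified in the text and is left out.

```python
def stratified_folds(n_per_class=320, k=10):
    """Yield (train, test) lists of (label, index) pairs for each of k folds.

    Each fold takes an equal share of both classes for testing and uses
    the remaining trials for training.
    """
    fold_size = n_per_class // k
    trials = {label: [(label, i) for i in range(n_per_class)]
              for label in ("movement", "no_movement")}
    for fold in range(k):
        lo, hi = fold * fold_size, (fold + 1) * fold_size
        train, test = [], []
        for items in trials.values():
            test += items[lo:hi]
            train += items[:lo] + items[hi:]
        yield train, test


if __name__ == "__main__":
    for number, (train, test) in enumerate(stratified_folds(), start=1):
        print(number, len(train), len(test))
```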

Experiment Results
To train and test our model, we used the k-fold cross-validation technique with k = 10. The EEG data set is divided into 10 subsets, or folds, and the experiment is repeated 10 times. The recognition results for finger movements are listed in Tab. 1, and the results for trials without finger movement are in Tab. 2.
The proposed model detects index finger movement with a success rate between 90.6 % and 100.00 %. The detection rate for trials without finger movement also varies between 90.6 % and 100.00 %.
Tab. 3 shows the percentage of correctly identified and misidentified trials in our experiment. On average, the proposed model identifies 96.250 % of the trials correctly. The average share of misidentified trials is about 3.750 %. The average final quantization error is 0.6556, and the average final topographic error is 0.007.

Conclusion
This experiment shows that different mental tasks can be found and recognized in EEG data. This helps us to understand the valuable information hidden in the EEG data. Our approach is able to decide between two tasks: a button pressed with the index finger and a released button. We used only seven selected electrodes. This number of electrodes is enough to capture good EEG data for movement. As a first step, we used a band-pass filter to keep the frequencies that are useful for finger movement detection. Our suggested approach uses high-order polynomial curve fitting for noise and interference elimination, turtle graphics to convert the filtered data from numbers into text, the Lempel-Ziv complexity to compare two data trials, and a Self-Organizing Map as a classifier.
The data trials were cut 0.3 second before the mental task began and 0.3 second after it ended. In our experiment, we smoothed the data with polynomial fitting of order 15. This order is enough to fit the data trend and remove unwanted noise and interference from the surrounding environment. As a classifier, we chose a SOM with a map dimension of 5 × 5 neurons. The testing vector was assigned to a cluster using the BMU.
Our model detected finger movement with an average rate of about 96.56 %; the lowest rate was 90.625 % and the highest was 100.00 %. For trials without finger movement, the average success rate is about 95.93 %, the lowest rate is 90.625 %, and the highest is 100.00 %. The average over both trial types is about 96 %. In future work, we will test other EEG data and modify our model to improve the recognition results and increase its speed.

Fig. 1: 10-20 International System of EEG electrode placement. The nasion is the point between the forehead and the nose; the inion is the bony protrusion at the back of the skull [20].

Fig. 6: Update of the winner neuron (BMU) and its neighbors, which moves them toward the data input indicated with X. The solid and dotted lines correspond to the state before and after the update, respectively.

Fig. 7: Schematic diagram of the proposed method. Black and blue lines represent the training phase. Red dotted lines represent the testing phase.

Fig. 9: SOM of fold 1. Red nodes represent movement trials and green nodes represent trials without movement.

Fig. 11: SOM of fold 10. Red nodes represent movement trials and green nodes represent trials without movement.
followed and finished his Ph.D. studies in Technical Cybernetics in 2000. Since 2002 he has been the guarantor of the M.Sc. specialization Measurement and Control in Biomedicine. Throughout his career he has published more than 100 original research articles, including over 30 peer-reviewed journal papers. He is the author or coauthor of more than ten books.
Another work applied curve fitting to a phase calibration algorithm using error voltage data from satellite tracking. The result showed that this method can be used in monopulse tracking; it does not need to build the source and uses only the error voltage [7]. Jiang et al. proposed a method for fault location detection in electrical cables based on flat coefficient computation. Cable fault location analysis is combined with the wavelet transform and a curve fitting technique. The paper showed that the proposed method reduces the deviation of singularity detection and improves the fault location precision [8]. Yixu Song et al. proposed a new method based on a curve fitting technique combined with a clustering algorithm to store the data stream. The experimental results of this method show the best compression ratio and fitting accuracy [9]. Zhang and Liu applied a curve fitting technique to detect dislocation defects in polysilicon slices. They compared two methods of curve fitting, quadratic curve fitting and Gaussian curve fitting. Their results showed ...

Tab. 2: Results for trials without finger movement.
Tab. 3: Evaluation of the results.