Hjorth Parameter-based Seizure Diagnosis using Cluster Analysis

Health-related issues have increased considerably over the past few years, and hence the need for effective and advanced health-care systems and aids is expanding. New methodologies and instruments must be developed to aid doctors in the intelligent care of patients. Biomedical signals are a rich source of information, but they are not easy for untrained people to interpret. Extraction and analysis of biomedical signals can therefore make the correct information accessible to everyone. The signals generated by the brain reflect the state of the mind and control the actions of the whole body. Epilepsy is a disease that affects around 50 million people worldwide. Epileptic seizures are the phenomenon of abnormal synchronisation of neural activity, with symptoms such as convulsions. An advanced seizure diagnosis system will help in the detection and diagnosis of epileptic seizures. In this paper, clustering algorithms are applied to electroencephalogram (EEG) data to classify it as normal or epileptic seizure using the Hjorth parameters. After the Hjorth parameters are extracted from the EEG signals, the k-means, basic sequential algorithmic scheme (BSAS), partitioning around medoids (PAM), fuzzy c-means (FCM), and valley-seeking clustering algorithms are applied to group the signals into normal and seizure classes. On the dataset used, the valley-seeking clustering algorithm gives the best performance, with an accuracy of about 87%.


Introduction and related works
Neurological disorders and brain function can be studied widely through the rich information carried by the EEG [1]. A tremendous amount of data is generated by EEG systems, and its regular analysis and monitoring are not possible manually [2]. Hence, distinguishing epileptic from non-epileptic activity by manually analysing long-range electroencephalogram signals is a tedious task for doctors. Therefore, many seizure diagnosis systems have come into existence. In this work, a method is developed to detect seizures. First, features are extracted from the EEG signals by calculating the Hjorth parameters, and then different clustering algorithms are applied to diagnose seizure activity in the brain [3]. The clustering algorithms are compared on classification accuracy and execution time. The application of digital signal processing and soft computing techniques makes it easier for doctors to diagnose seizure activity accurately.
Going through several studies, it is found that few works have applied clustering with Hjorth parameters to seizure diagnosis from EEG signals. Therefore, some relevant results of recent years are discussed below. In 2016, J. Mohylova et al. used fuzzy c-means to classify epileptic patterns and to create homogeneous, compact classes of EEG segments [4]. A study on seizure detection by analysing EEG signals using power spectral density, entropy, and Teager energy was published by N. Sriram et al. in 2018. Descriptive analysis and the Wilcoxon rank-sum test were used to confirm the suitability of the features. Finally, multilayer perceptron neural networks were used to classify the signals [5]. In 2018, Hemant Choubey and Alpana Pandey introduced a masking- and checking-based feature extraction technique to draw out the attributes from EEG signals and then used k-means and KNN as classification algorithms to separate normal and abnormal signals. They compared their results with some previous works to show the improvement in outcome [6].
In 2019, Ashok Sharmila and Purshotaman Geetanjali surveyed pattern-discovery methods for epileptic seizure diagnosis from EEG signals. They found that seizure detection methods vary across different EEG signals and conditions, and that EEGs taken in various states have different characteristics [7]. A probability-based study on epileptic seizure detection using a hidden Markov model was presented by Deba Prasad Das and Mahesh Kumar H. Kolekar. Discriminant correlation analysis is used to differentiate the signals together with fuzzy c-means clustering. They achieved a highest accuracy of 98.57% [8].
In 2020, Qianyi Zhan and Wei Hu presented epilepsy detection using multi-view clustering on deep features. Deep CNNs are used to extract the features, and FCM is used to perform the clustering with a three-step method. This work showed satisfactory results in the analysis of seizures [9]. Ozlem Karabiber Cura et al., in 2020, presented a classification method for epileptic seizures using empirical mode decomposition and its derivatives. They used four different feature groups, namely energy, power spectral distance, statistical significance, and correlation measures, with various supervised learning algorithms to classify the signals and obtained a highest accuracy of 97% [10].
Based on this study, it is observed that few works have been done on Hjorth parameter-based clustering of EEG signals. Therefore, five prominent unsupervised clustering algorithms, namely k-means, FCM, PAM, BSAS, and valley seeking, are used and their results in EEG signal classification are analysed. The rest of the paper is divided into five parts. Section 2 contains the dataset description in detail. Section 3 describes the research methodology used to classify the signals. In section 4, the results are discussed, and conclusions are drawn in section 5, with the references given at the end.

Dataset Description:
The publicly available data described in Andrzejak et al. (2001) [11] is used; the dataset link is given in the reference section below [12]. This dataset has five separate sets (A-E), each containing 100 single-channel electroencephalogram signals with a duration of 23.6 s. The single-channel segments were extracted from multichannel recordings after visual inspection for artefacts such as eye or muscle movement. Surface EEG recordings of sets A and B were taken from five healthy volunteers using a standardised electrode placement scheme. Sets C, D, and E were obtained from pre-surgical diagnosis recordings of five patients who had achieved complete seizure control after resection of the hippocampal formation, which was therefore diagnosed as the epileptogenic zone. Signals in set C were recorded from the hippocampal formation, while signals in set D were recorded from within the epileptogenic zone. Sets C and D contain seizure-free intervals, while set E contains seizure activity. Only set A and set E are considered in this study, as shown in figure 1.

Methodology:
In the approach shown in figure 2, the EEG signals (data) are first collected. Only the two sets A and E, corresponding to the normal and seizure data and consisting of 100 samples each, are taken. Next, the Hjorth parameters of both sets are extracted. There are three parameters, activity, complexity, and mobility, of which activity and mobility are chosen for clustering purposes. The Hjorth parameters are calculated in the time domain. The extracted parameters are collected into a data matrix of size 2×200 containing the two selected Hjorth features of the EEG signals. After collecting the data, the k-means, FCM, PAM, BSAS, and valley-seeking clustering algorithms are applied. The performance of each clustering algorithm is analysed in terms of accuracy and execution time [13].

Figure 2. Methodology
The Hjorth parameter analysis and the clustering algorithms are explained in detail in the following subsections:

Hjorth Parameters
Hjorth parameters are statistical indicators of signal properties commonly used in time-domain signal processing [14]. The three parameters, defined below for a signal x(t), are as follows:

Activity.
The activity parameter represents the signal power and is defined as the variance of the signal, as given in equation (1):

Activity = var(x(t)) (1)

Mobility.
The mobility parameter represents the mean frequency, or the proportion of the standard deviation of the power spectrum [15]. Mobility is defined as the square root of the ratio of the variance of the first derivative of the signal x(t) to the variance of the signal x(t), as given in equation (2):

Mobility = sqrt( var(dx(t)/dt) / var(x(t)) ) (2)

Complexity.
The complexity parameter represents the variation in frequency. It is the ratio of the mobility of the first derivative of the signal to the mobility of the signal itself, as given in equation (3):

Complexity = Mobility(dx(t)/dt) / Mobility(x(t)) (3)

These are the normalised slope descriptor (NSD) parameters, which have their application in EEG analysis.
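The three definitions above can be sketched in NumPy. This is an illustrative implementation, not the authors' code; the function name is our own, and the first derivative is approximated by the first difference of the sampled signal.

```python
import numpy as np

def hjorth_parameters(x):
    """Compute the Hjorth activity, mobility, and complexity of a 1-D signal.

    The derivative dx/dt is approximated by the first difference np.diff(x),
    so the absolute mobility value depends on the sampling rate, but the
    (dimensionless) complexity does not.
    """
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                      # approximate first derivative
    ddx = np.diff(dx)                    # approximate second derivative

    activity = np.var(x)                                        # eq. (1)
    mobility = np.sqrt(np.var(dx) / np.var(x))                  # eq. (2)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility   # eq. (3)
    return activity, mobility, complexity

# A pure sine wave: its derivative is a sinusoid of the same frequency,
# so the complexity should be close to 1.
t = np.linspace(0, 1, 1000)
act, mob, comp = hjorth_parameters(np.sin(2 * np.pi * 5 * t))
```

For real EEG segments the parameters would be computed per channel and per window, then stacked into the feature matrix described in the methodology.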

K-means clustering algorithm
K-means is the most often used vector quantisation approach among clustering algorithms for signal processing. Its goal is to divide n instances into k clusters, with each instance belonging to the cluster with the nearest mean (cluster centroid) [16]. The assignment rule of the k-means algorithm can be written as in equation (4):

v_ij = 1 if ||x_i - θ_j|| = min_{k=1,...,m} ||x_i - θ_k||, and v_ij = 0 otherwise (4)

K-means stops when there is no further movement of the cluster centres; to this end, it minimises the intra-cluster distance, and in this sense it can be considered a variant of the Weber problem. The k-means algorithm minimises the cost function given in equation (5):

J(θ, U) = Σ_{i=1}^{N} Σ_{j=1}^{m} v_ij ||x_i - θ_j||^2 (5)

where θ = [θ_1^T, ..., θ_m^T]^T, ||·|| is the Euclidean distance, and v_ij = 1 if x_i lies nearest to θ_j and 0 otherwise. In other words, this results in minimising the intra-cluster distance, that is, the distance between each data vector and its cluster representative [17]. Therefore, when the data vectors of X form m compact groups (with no substantial differences in size), when m is known, and when each θ_j is (nearly) placed at the centre of a cluster, J is expected to be minimised. This does not necessarily hold when: (a) compact clusters are not formed by the data vectors; (b) the cluster sizes differ drastically; (c) the number of clusters m has not been given correctly. K-means works on the principle of moving each point to the nearest cluster centre. Convergence of k-means can be made faster with the use of other distance calculation methods. Spherical k-means and PAM are modifications of the k-means algorithm.
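The alternation between the assignment rule of equation (4) and the centroid update can be sketched as follows. This is a minimal Lloyd-style iteration written for illustration (function name and initialisation strategy are our own choices), not the implementation used in the experiments.

```python
import numpy as np

def kmeans(X, m, n_iter=100, seed=0):
    """Minimal k-means (Lloyd's algorithm): alternate the nearest-centre
    assignment with the centroid update until the centres stop moving."""
    rng = np.random.default_rng(seed)
    theta = X[rng.choice(len(X), m, replace=False)]   # init from data points
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # assignment step: v_ij = 1 for the nearest centre (eq. 4)
        d = np.linalg.norm(X[:, None, :] - theta[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: move each centre to the mean of its cluster
        new_theta = np.array([X[labels == j].mean(axis=0)
                              if np.any(labels == j) else theta[j]
                              for j in range(m)])
        if np.allclose(new_theta, theta):             # centres stopped moving
            break
        theta = new_theta
    return labels, theta

# Two well-separated blobs should be recovered as two clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])
labels, centres = kmeans(X, 2)
```

Each iteration can only decrease the cost J of equation (5), which is why the procedure terminates.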

Partitioning Around Medoids
The Partitioning Around Medoids (PAM) algorithm, also called the k-medoids algorithm, is a classical machine learning method for dividing n data instances into k clusters, where k must be known in advance. Like k-means, it is a clustering algorithm, but whereas k-means aims to reduce the total squared error, k-medoids aims to reduce the sum of dissimilarities between the observations labelled as belonging to a cluster and a point designated as the centre of that cluster. Unlike k-means, PAM selects actual data points as centres (medoids). The set Θ of vectors from X that describe the clustering (called medoids) is obtained by minimising the cost function J(Θ), computed over all data vectors as the distance from each vector to its nearest medoid [18]. The k-medoids approach is realised by the PAM algorithm, which can be understood from the following steps:
1. Step 1 (Initialisation): k medoids are initialised randomly from the n data points.
2. Step 2 (Assignment): Each data point is allocated to its nearest medoid.
3. Step 3 (Updating): For each medoid and each associated non-medoid data point, trial-swap the medoid with the data point and calculate the cost, that is, the average dissimilarity between all the data points and their associated medoids. Keep the configuration with the lowest cost.
Steps 2 and 3 are repeated until there is no change, so the algorithm is iterative. The runtime complexity of the original PAM algorithm is O(k(n-k)^2) per iteration, which can be reduced by computing only the change in cost [19].
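The swap-based search in steps 1-3 can be sketched as below. This is an illustrative greedy version (our own function name and structure) that recomputes the full cost for each trial swap rather than only the change in cost, so it is simple but not the fast variant cited above.

```python
import numpy as np

def pam(X, k, seed=0):
    """Sketch of Partitioning Around Medoids: repeatedly try swapping a
    medoid with a non-medoid point and keep the swap if it lowers the
    total distance from each point to its nearest medoid."""
    rng = np.random.default_rng(seed)
    n = len(X)
    medoids = list(rng.choice(n, k, replace=False))   # step 1: random init

    def cost(meds):
        # total distance from every point to its nearest medoid
        d = np.linalg.norm(X[:, None, :] - X[meds][None, :, :], axis=2)
        return d.min(axis=1).sum()

    best = cost(medoids)
    improved = True
    while improved:                                   # repeat steps 2-3
        improved = False
        for j in range(k):
            for h in range(n):
                if h in medoids:
                    continue
                trial = medoids.copy()
                trial[j] = h                          # step 3: trial swap
                c = cost(trial)
                if c < best:
                    best, medoids, improved = c, trial, True
    d = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
    return d.argmin(axis=1), X[medoids]

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(4, 0.3, (30, 2))])
labels, medoids = pam(X, 2)
```

Because the medoids are actual data points, PAM is less sensitive to outliers than k-means, whose centroids can be dragged away from the data.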

Fuzzy c-means Algorithms
In the fuzzy c-means (FCM) algorithm, each (compact) cluster is represented by a parameter vector θ_j, where j ranges from 1 to m. Furthermore, it is assumed that a vector x_i belonging to the data set X does not necessarily belong to a single cluster C_j; instead, it can, to some extent, belong to several clusters at the same time. The variable u_ij quantifies x_i's "grade of membership" in the cluster C_j, such that u_ij ∈ [0, 1] and, for all x_i,

Σ_{j=1}^{m} u_ij = 1 (6)

Again, the number of clusters, m, is assumed to be known. FCM seeks to direct each of the m available parameter vectors (representatives) θ_j, j = 1, ..., m, towards the data-intensive parts of the data space. Finally, the algorithm includes an additional parameter q (> 1), called the fuzzifier [20]. FCM is a well-known iterative algorithm: starting from initial estimates θ_1(0), ..., θ_m(0) for θ_1, ..., θ_m, respectively, at each iteration t:
• The (squared Euclidean) distances between x_i and all θ_j's, j = 1, ..., m, are used to compute the grade of membership, u_ij(t-1), of the data vector x_i in the cluster C_j, i = 1, ..., N, j = 1, ..., m.
• Each representative θ_j is updated as the weighted mean of all data vectors, where each data vector x_i is weighted by u_ij^q(t-1).
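The two alternating updates can be sketched as follows. This is an illustrative implementation under the common choice q = 2 (function name and initialisation are our own); the membership formula used is the standard inverse-distance update that satisfies the constraint of equation (6).

```python
import numpy as np

def fuzzy_c_means(X, m, q=2.0, n_iter=100, seed=0):
    """Minimal FCM: alternate the membership update (from squared Euclidean
    distances) with the representative update (means weighted by u**q)."""
    rng = np.random.default_rng(seed)
    theta = X[rng.choice(len(X), m, replace=False)]   # initial representatives
    u = np.full((len(X), m), 1.0 / m)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - theta[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)                    # avoid division by zero
        # membership update: u_ij proportional to d_ij^(-2/(q-1)),
        # normalised so each row sums to 1 (eq. 6)
        inv = d2 ** (-1.0 / (q - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # representative update: weighted mean with weights u_ij**q
        w = u ** q
        theta = (w.T @ X) / w.sum(axis=0)[:, None]
    return u, theta

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.3, (40, 2)), rng.normal(4, 0.3, (40, 2))])
u, theta = fuzzy_c_means(X, 2)
labels = u.argmax(axis=1)                             # harden the memberships
```

Taking the argmax of each membership row turns the soft partition into the hard normal/seizure labels compared against the ground truth in the experiments.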

Basic Sequential Algorithmic Scheme
The BSAS algorithm operates in a single pass over the given data set, and each cluster is represented by the mean of the vectors assigned to it [21]. The BSAS working procedure is outlined below. For each new vector x supplied to the algorithm, the distance from the already created clusters is determined first. A new cluster containing x is created if both of the following criteria are met:
1. The distances are greater than a (user-defined) dissimilarity threshold, Θ.
2. The maximum allowable number of clusters, q, has not been reached.
Otherwise, x is assigned to the cluster nearest to it, and the representative of that cluster is updated. When all data vectors have been considered once, the algorithm ends. BSAS is suitable for unravelling compact clusters [22]. Furthermore, BSAS is fast because it requires only one pass through the data set X; thus, it is a strong option for processing large data sets. However, the resulting clustering may in several cases be of poor quality, and improvements can be achieved with a refinement step.
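The single-pass procedure above can be sketched as below. This is an illustrative version (our own function name); each cluster representative is maintained as a running mean of the vectors assigned to it.

```python
import numpy as np

def bsas(X, theta, q):
    """Basic Sequential Algorithmic Scheme: one pass over the data.
    theta -- dissimilarity threshold; q -- maximum number of clusters.
    Each cluster is represented by the mean of its assigned vectors."""
    reps = [X[0].astype(float).copy()]                # first vector: cluster 0
    counts = [1]
    labels = [0]
    for x in X[1:]:
        d = [np.linalg.norm(x - r) for r in reps]
        j = int(np.argmin(d))
        if d[j] > theta and len(reps) < q:            # both criteria met:
            reps.append(x.astype(float).copy())       # create a new cluster
            counts.append(1)
            labels.append(len(reps) - 1)
        else:                                         # assign to nearest cluster
            counts[j] += 1
            reps[j] += (x - reps[j]) / counts[j]      # running-mean update
            labels.append(j)
    return np.array(labels), np.array(reps)

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(5, 0.3, (30, 2))])
labels, reps = bsas(X, theta=2.0, q=5)
```

Note that the result depends on the presentation order of the vectors and on the threshold Θ, which is one reason a refinement step can improve the clustering.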

Valley-Seeking Clustering Algorithm
In this method (also called VS), clusters are identified with the peaks of the probability density function p(x) underlying X, separated by valleys. The method does not use representatives; instead, it uses the local region V(x) around each data vector x ∈ X, defined as the set of vectors in X (excluding x) lying at a distance less than a from x, where a is a user-defined parameter. The squared Euclidean distance can be used for the distance measurement (other distances are also applicable). In addition, VS requires an (overestimated) number of clusters, m. The algorithm is sensitive to the initial allocation of the data vectors to clusters and to the choice of a. Iterative in nature, the algorithm begins with an initial assignment of the vectors of X to m clusters. Then, all data vectors are presented exactly once in each epoch (N successive iterations). During the t-th epoch, the region V(x_i) is determined for each x_i in X, i = 1, ..., N, and the cluster to which the majority of the data vectors in V(x_i) belong is identified and stored. When all the data vectors have been presented (during the t-th epoch), re-clustering takes place, and each x_i is now assigned to the cluster with the most significant number of points in V(x_i). The algorithm finishes once no re-clustering occurs between two successive epochs [23, 24].
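The epoch-wise majority reassignment can be sketched as below. This is an illustrative version with our own function name; it uses the ordinary Euclidean distance to build the neighbourhoods V(x) and a synchronous re-clustering at the end of each epoch, as described above.

```python
import numpy as np

def valley_seeking(X, a, m, n_epochs=50, seed=0):
    """Sketch of the valley-seeking scheme: each vector is repeatedly
    reassigned to the cluster holding the majority of the points in its
    local region V(x) (all other points closer than the parameter a)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    labels = rng.integers(0, m, n)                    # initial random assignment
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # V(x_i): indices of all points (excluding i) within distance a of x_i
    neigh = [np.where((d[i] < a) & (np.arange(n) != i))[0] for i in range(n)]
    for _ in range(n_epochs):
        new = labels.copy()
        for i in range(n):
            if len(neigh[i]):
                # majority vote among the neighbours in V(x_i)
                new[i] = np.bincount(labels[neigh[i]], minlength=m).argmax()
        if np.array_equal(new, labels):               # no re-clustering: stop
            break
        labels = new
    return labels

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.3, (30, 2)), rng.normal(5, 0.3, (30, 2))])
labels = valley_seeking(X, a=1.5, m=4)
```

Because m is deliberately overestimated, some of the m cluster labels may simply end up empty; the clusters that survive correspond to the density peaks.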

Experimental Results
The methodology starts with collecting the EEG data, which is then used to determine the Hjorth parameters, activity, complexity, and mobility, of sets A and E. Table 1 shows a sample of the features from each class (A and E). After collecting the features from set A and set E, we apply the clustering algorithms one by one and record the classification accuracy and time. Table 2 indicates the classification accuracy of all the clustering algorithms used in the paper. The results in table 2 show the performance of the different clustering algorithms. The k-means and PAM clustering algorithms have the same clustering capabilities: their accuracy and execution time are identical. Only one mismatch was found in the set A cluster, and 43 mismatches were found in the set E cluster. Therefore, the overall accuracy of both algorithms is 78%, and the time required is the same, 0.0470 seconds.
The FCM clustering algorithm is better in terms of accuracy than k-means, PAM, and BSAS, with an accuracy of 80%, one mismatch in class A, and 39 mismatches in class E, but it takes 0.1041 s of execution time.
The BSAS clustering algorithm has one mismatch in class A and 41 mismatches in class E; the overall accuracy is about 79%, with an execution time of 0.1892 s. The valley-seeking clustering algorithm gives the best classification accuracy when applied to the data set, with only one mismatch in class A and 25 mismatches in class E, for an overall accuracy of 87%; its execution time is 0.3060 s. Figure 4 shows the cluster structure after executing the different clustering algorithms on the same data set.

Conclusion
Diagnosis of seizure activity is an integral part of its treatment. Therefore, the correct diagnosis of seizure activity without human intervention, with the highest degree of accuracy in the least time, is the motive of this paper. In this paper, various clustering algorithms are applied to Hjorth parameters of EEG