Motor Imagery Recognition Based on GMM-JCSFE Model

Features from EEG microstate models, such as time-domain statistical features and state transition probabilities, are typically manually selected based on experience. However, traditional microstate models assume abrupt transitions between states, and the classification features can vary among individuals due to personal differences. To date, both empirical and theoretical classification results of EEG microstate features have not been entirely satisfactory. Here, we introduce an enhanced feature extraction method that combines Joint label-Common and label-Specific Feature Exploration (JCSFE) with Gaussian Mixture Models (GMM) to explore microstate features. First, GMMs are employed to represent the smooth transitions of EEG spatiotemporal features within microstate models. Second, category-common and category-specific features are identified by applying regularization constraints to linear classifiers. Third, a graph regularizer is used to extract subject-invariant microstate features. Experimental results on publicly available datasets demonstrate that the proposed model effectively encodes microstate features and improves the accuracy of motor imagery recognition across subjects. The primary code is accessible for download from the website: https://github.com/liaoliao3450/GMM-JCSFE.


Chuncheng Liao, Shiyu Zhao, and Jiacai Zhang

Index Terms - EEG, microstate, GMM, JCSFE, motor imagery.

I. INTRODUCTION
BRAIN-COMPUTER interfaces (BCIs), also known as brain-machine interfaces (BMIs), are communication systems that measure central nervous system (CNS) activity and convert it into artificial output that replaces, restores, enhances, supplements, or improves natural CNS output, thereby changing the ongoing interactions between the CNS and its external or internal environments [1]. In recent years, the integration of artificial intelligence into brain-computer interface (BCI) systems has attracted increasing attention. These AI-driven BCI systems are designed to extend the applications of BCI systems and enhance their interaction experience [2], [3], [4], [5].
Motor imagery brain-computer interface (MI-BCI) is one of the most widely investigated paradigms in active BCIs [6]. Motor imagery (MI) tasks can generally be categorized into two main types: Kinesthetic Motor Imagery (KMI) and Visual Motor Imagery (VMI). KMI involves the mental simulation of a movement with a focus on the physical sensations associated with the action, such as the force or effort required. On the other hand, VMI entails the visualization of a movement from a third-person perspective, concentrating on the visual cues and observable changes in body position [7]. Cognitive neuroscience studies have shown that motor imagery involves neural activity in specific brain regions. The EEG activation patterns among scalp electrodes have been successfully detected and used to decipher the direction of motor imagery [8], [9]. However, there are some challenges in EEG feature extraction and intention recognition of motor imagery.
The primary challenge posed by EEG signals is their nonstationarity. It is widely recognized that motor imagery (MI) induces internal modifications in the brain's nervous system, which subsequently interact with the external environment. These internal and external modifications exert unpredictable influences on EEG signals. Commonly employed techniques for MI EEG include independent component analysis (ICA) and sliding window analysis. However, these methods are frequently applied on coarse timescales, resulting in the loss of the high temporal resolution and oscillatory phase information inherent in EEG data.
Microstate models are predicated on two primary assumptions. The first is the one-hot hypothesis, which posits that the topological structure of the EEG signal at any given time point is exclusively indicative of a single microstate. Furthermore, this topological structure is asserted to be dominant and to remain stable for a duration of 80-120 milliseconds. The second assumption is the discrete hypothesis, which suggests that the EEG transitions from one microstate to another without any transitional period. This implies that throughout the entirety of a state's cycle, a single microstate is dominant, followed by a rapid transition to another stable microstate. Mishra et al. used principal component analysis (PCA) and time-varying signal power analysis to demonstrate that classifying the brain's topography over an extended period into discrete one-hot states may not be justifiable [18]. In the current study, a hybrid representation of the temporal and spatial dynamics of MI EEG is proposed, utilizing a Gaussian mixture model. The multi-channel EEG data is decomposed into linear combinations of multiple Gaussian component models.
One challenge in MI EEG decoding is the inter-individual variability in EEG microstates.There are significant differences in EEG microstate features among individuals, and the classification accuracy of these features can also vary considerably from one subject to another.
In the realm of microstate characteristics, Liu et al. [19] identified four distinct microstates (labeled A, B, C, and D) through EEG microstate analysis. They computed and assessed various EEG microstate features, encompassing conventional metrics such as global explained variance, mean duration, coverage, occurrence, and transition probability, as well as the Hurst exponent, and temporal dynamic features including autocorrelation and partial autocorrelation functions. Their findings indicate that EEG microstate features exhibit reliable uniqueness within a single subject and demonstrate considerable inter-individual variability. As an electrophysiological indicator, EEG has the potential to decode individuality [20], [21], [22].
This research investigates the feature extraction of motor imagery and its associated classification algorithms.By identifying common spatiotemporal dynamic features across participants (features exhibiting consistent discriminative power across various classification tasks) and unique spatiotemporal dynamic features (features with varying discriminative power in different classification tasks), we aim to minimize individual differences and enhance recognition accuracy.

A. Experimental Data
1) The BCI Competition IV Dataset 2a: This dataset was collected from 9 subjects [23]. Each subject completed four different motor imagery tasks: left hand movement (category 1), right hand movement (category 2), feet movement (category 3), and tongue movement (category 4). Each subject underwent a total of 288 trials, divided into 6 sessions. There was a short break midway through each session. Each session comprised 48 trials, with 12 trials per task category.
2) The BCI Competition IV Dataset 2b: This dataset was collected from 9 subjects [24]. Each subject completed two different motor imagery (MI) tasks: left hand movement (category 1) and right hand movement (category 2). Each subject underwent a total of 5 sessions, receiving online smiley feedback in the last three sessions. In the first two sessions, each subject performed 120 trials per session, and executed another 160 trials per session in the last three sessions, completing a total of 720 trials.
3) The BCI Competition III Dataset IVa: This dataset was collected from five subjects [25]. Each subject completed two different motor imagery (MI) tasks: right hand movement (category 1) and right foot movement (category 2). Each subject underwent a total of 280 trials. In this experiment, the training sets for subjects aa, al, av, aw, and ay consisted of 168, 224, 84, 56, and 28 trials respectively, with the remaining trials forming the testing set. Additionally, the sampling rate was downsampled to 250 Hz to maintain consistency with the two datasets mentioned above.
During the experiments (using dataset 2a as an example), subjects sat in front of a computer screen. At the beginning of each trial (t = 0), there was a sound stimulus reminder followed by a cross in the middle of the screen for 2 s, requiring the subject to focus on the center of the cross. Then, a cue in the form of an arrow appeared on the screen. The direction of the arrow could be left, right, up, or down, corresponding to the motor imagery tasks of imagining movement with the left hand, right hand, feet, and tongue respectively. The arrow lasted 1.25 s, prompting the subjects to perform mental imagery. During this period, the subjects kept looking at the cross until t = 6 s, when the cross disappeared from the screen. After that, the subjects took a short break.
Data augmentation was achieved by adopting the strategy proposed by Milanes Hermosilla et al. [26]. For each trial, EEG signals within the 3-6 s time window were segmented for further processing. We discarded the 2-3 s data in order to eliminate effects induced by visual stimuli that are irrelevant to ERD/ERS. The 3-6 s EEG fragments were further segmented into three samples by a 1-s sliding window without overlap. Thus, the number of samples was 3 times that of the original dataset. 5-fold cross-validation was adopted in this study. During dataset division, samples from the same trial were kept within the same partition; that is, the 3 samples were all in the training dataset or all in the testing dataset.
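The segmentation and trial-grouped splitting described above can be sketched as follows. This is a minimal illustration, assuming 250 Hz sampling as in the datasets above; the function name and array shapes are illustrative, not the paper's actual implementation:

```python
import numpy as np

def segment_trials(trials, sfreq=250, t_start=3.0, t_end=6.0, win_s=1.0):
    """Cut each trial's 3-6 s interval into non-overlapping 1-s samples.

    trials: array of shape (n_trials, n_channels, n_times), with t = 0 at trial onset.
    Returns samples of shape (n_trials * 3, n_channels, win) plus the trial
    index of each sample, so whole trials can be kept in one CV partition.
    """
    a, b = int(t_start * sfreq), int(t_end * sfreq)
    win = int(win_s * sfreq)
    samples, trial_ids = [], []
    for i, tr in enumerate(trials):
        seg = tr[:, a:b]                      # keep only the 3-6 s window
        for k in range(seg.shape[1] // win):  # three 1-s windows, no overlap
            samples.append(seg[:, k * win:(k + 1) * win])
            trial_ids.append(i)
    return np.stack(samples), np.array(trial_ids)
```

The returned trial indices can be passed as the `groups` argument of scikit-learn's `GroupKFold` so that the 3 samples of a trial never straddle the train/test split.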
As shown in Fig. 1, EEG signals were recorded for each trial during the period from the appearance of the arrow prompt to the end of the cross fixation point. During this period, the subjects received the command for the motor imagery task and imagined it. However, only the data from 3 to 6 s were used as the segmented EEG data.
Data preprocessing: The EEG dataset is pre-processed using Python 3.8.8 and the MNE 0.23.4 toolkit.
Baseline Correction: A common approach was used to correct the baseline. For all datasets, including the training and testing sets, EEG trials with baseline noise were rejected by visual inspection. Then, a high-pass filter was applied to both the training and testing sets; that is, the filters for the training and testing datasets shared the same parameters.
Principal Component Analysis (PCA): To reduce the dimensionality of our data and extract uncorrelated features, the projection space of PCA was learned from the training set.We then projected the testing samples into the same space learned from the training set.
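This train-only fitting can be illustrated with scikit-learn's PCA; the dimensions here are arbitrary placeholders, not the paper's actual feature sizes:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 64))  # e.g. flattened training features
X_test = rng.normal(size=(50, 64))

pca = PCA(n_components=16)
pca.fit(X_train)                 # projection space learned from the training set only
Z_train = pca.transform(X_train)
Z_test = pca.transform(X_test)   # test samples projected into the same space
```

Fitting on the training set alone prevents information from the testing set from leaking into the learned projection.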
Artifact Rejection: First, the multi-channel EEG signals were decomposed into independent components using Independent Component Analysis (ICA). Second, correlation coefficients between the time courses of each ICA component and three electrooculogram channels were calculated. Then, threshold processing based on an adaptive Z-score was performed.
Components above the threshold (threshold = 3.0) were masked, and the Z-score was recalculated until no components exceeded the threshold. Lastly, the scalp EEG was reconstructed from the remaining independent components, thereby automatically excluding the effect of the electrooculogram signal. The algorithm was first trained on the training set using a set of known artifacts and then applied to the testing set.
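The adaptive Z-score loop can be sketched as follows. This is a simplified stand-in for the pipeline above, not its actual implementation; `flag_eog_components` and its max-correlation scoring are illustrative assumptions:

```python
import numpy as np

def flag_eog_components(ic_ts, eog_ts, z_thresh=3.0):
    """Flag ICA components whose time courses correlate with EOG channels.

    ic_ts:  (n_components, n_times) ICA component time courses
    eog_ts: (n_eog, n_times) EOG channel time courses
    Iteratively z-scores the max |correlation| of each component over the
    still-unmasked set, masks components above z_thresh, and repeats until
    no component exceeds the threshold (the adaptive Z-score scheme).
    """
    # max |Pearson r| of each component against any EOG channel
    scores = np.array([max(abs(np.corrcoef(c, e)[0, 1]) for e in eog_ts)
                       for c in ic_ts])
    bad = np.zeros(len(scores), dtype=bool)
    while True:
        keep = ~bad
        mu, sd = scores[keep].mean(), scores[keep].std()
        z = (scores - mu) / (sd + 1e-12)   # z-score over remaining components
        new_bad = keep & (z > z_thresh)
        if not new_bad.any():
            return bad                     # mask of EOG-related components
        bad |= new_bad
```

The scalp EEG would then be reconstructed from the components where the returned mask is False.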

B. GMM-JCSFE Model Overview
Fig. 2 shows the flow of the GMM-JCSFE model. The raw EEG signals were preprocessed and sent to the GMM module, which extracted the microstate features of different clusters. Next, these microstate features were passed through the JCSFE module, which labeled them as common features and specific features in order to better complete the four classification tasks.

C. Microstate Features Of Gaussian Mixture Model
Based on the Gaussian mixture model, this paper decomposes the EEG microstate into a mixed representation instead of the one-hot representation, and explores the classification ability of the microstate hybrid model under the MI task.
Taking four Gaussian component models as an example, Fig. 2.a shows the basic flow of the GMM microstate decomposition. The pre-processed EEG data X ∈ R^(N×T) is taken as the input of the model, where N is the number of sensors of the EEG device and T is the number of sampling points of the sample. Z ∈ R^(K×T) represents the output of the model, where K is the number of sub-models in the Gaussian mixture model, consistent with the number of microstates in the k-medoids initialization; here K = 10. The model initialization algorithm is shown in Table I, and the E-M algorithm optimizes the model parameters by iterating the E and M steps alternately, as shown in Table II.
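A minimal sketch of this soft decomposition, using scikit-learn's built-in E-M as a stand-in for the algorithms in Tables I and II (the k-medoids initialization is not reproduced, and diagonal covariances are an assumption made for brevity):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_microstates(X, K=10, seed=0):
    """Soft microstate decomposition of one EEG sample.

    X: (N, T) preprocessed EEG, N sensors by T time points. Each time point
    (an N-dimensional topography) is one observation for the mixture model.
    Returns Z: (K, T), the posterior probability of each Gaussian sub-model
    at every time point (columns sum to 1) - the smooth counterpart of the
    one-hot microstate labelling.
    """
    gmm = GaussianMixture(n_components=K, covariance_type="diag",
                          random_state=seed, max_iter=200)
    gmm.fit(X.T)                   # (T, N): one row per time point
    Z = gmm.predict_proba(X.T).T   # responsibilities, shape (K, T)
    return Z
```

The multi-channel signal is thus represented as a probability-weighted combination of the K sub-models rather than a hard state sequence.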
Finally, the multi-channel EEG data is decomposed into linear combinations of multiple Gaussian sub-models. Analogous to the one-hot representation model of MI EEG, the GMM hybrid representation model extracts dynamic statistical features according to the probability of each sub-model, including the three features shown in Fig. 3: duration, occurrence frequency, and coverage rate.
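One way the three dynamic features could be computed from the posterior probabilities Z is to take the dominant sub-model at each time point and measure its runs. This is a simplified reading of Fig. 3, not necessarily the paper's exact definitions:

```python
import numpy as np

def microstate_stats(Z, sfreq=250):
    """Duration, occurrence, and coverage per sub-model from posteriors Z (K, T).

    The dominant sub-model at each time point (argmax of Z) defines runs;
    duration = mean run length in seconds, occurrence = runs per second,
    coverage = fraction of time points covered by the sub-model.
    """
    K, T = Z.shape
    labels = Z.argmax(axis=0)
    # split the label sequence into runs of constant state
    change = np.flatnonzero(np.diff(labels)) + 1
    starts = np.r_[0, change]
    ends = np.r_[change, T]
    total_s = T / sfreq
    stats = {}
    for k in range(K):
        runs = [e - s for s, e in zip(starts, ends) if labels[s] == k]
        stats[k] = {
            "duration": (np.mean(runs) / sfreq) if runs else 0.0,
            "occurrence": len(runs) / total_s,
            "coverage": sum(runs) / T,
        }
    return stats
```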

D. Mining Model Of Class Common and Specific Features
The dynamic features extracted by microstate analysis, such as duration, occurrence frequency, and coverage rate, are time-domain statistical features that are mainly defined by experience and lack deep mining of the information in the microstate spatiotemporal series. This raises the need to explore whether there are common features that can discriminate all MI tasks and whether there are category-specific features that are biased towards specific MI tasks. To address these issues, a Joint Label-Common and Label-Specific Feature Exploration (JCSFE) model is introduced [27]. The model framework is shown in Fig. 2b.
1) Weight Optimization for Class-Common and Class-Specific Features: The EEG microstate mixed-representation dataset of N samples is defined as X = [x_1, x_2, ..., x_N] ∈ R^(D×N), with a label matrix Y ∈ R^(N×C) whose entries are Y_ij = 1 if the ith sample belongs to the jth category and Y_ij = 0 otherwise.
Fig. 4 illustrates the meaning of category-common and category-specific features. Assume that the data matrix contains two instances X = [x_1, x_2] with a 5-dimensional feature space f_1, f_2, f_3, f_4, f_5, and that the corresponding category vector is Y = [y_1; y_2]. The two elements in each category vector represent the probabilities that the corresponding instance belongs to the two categories. Fitting (X, Y) by the projection matrix W, we obtain a possible solution for W. The specific features of each class can be obtained from the nonzero values of the two columns of the projection matrix W, namely W_1 and W_2. Specifically, W_1 = [1, 1, 1, 0, 0]^T indicates that the first class is determined by the features f_1, f_2, f_3; W_2 = [0, 0, 1, 1, 1]^T means that the features f_3, f_4, f_5 determine the second class; and the feature f_3 is a common feature of the two classes.
Category-common features have the same discrimination ability for all motor imagery tasks. For the ith feature, the normalized l2 norm of the ith row of the projection matrix W measures the importance of the feature shared across classes, denoted θ_i. The larger the value of θ_i, the more discriminative the ith feature is in the classification of motor imagery. Therefore, the l2,1 norm is imposed on the projection matrix W to mine the category-common features, which is essentially a row-sparsification method. In addition to class-shared features, each motor imagery task may be additionally determined by several specific features of its own, so class-specific features are selected using l1 norm regularization, which forces the projection matrix W to be element-wise sparse. The objective function is

min_W ||X^T W − Y||_F^2 + α ||W||_{2,1} + β ||W||_1,  (2)

where the projection matrix W = [w_1, w_2, ..., w_C] ∈ R^(D×C) and w_j is the jth column of W. The coefficient vector w_j = [w_1j, w_2j, ..., w_Dj]^T, where w_ij represents the discrimination ability of the ith feature for the jth motor imagery task. w_ij ≠ 0 indicates that the ith feature is discriminative for recognizing the jth motor imagery task, so the ith feature is considered a class-specific feature for that task. Conversely, w_ij = 0 means that the ith feature is not useful for recognizing the jth motor imagery task. In the objective function defined by formula (2), the nonnegative regularization parameters α and β balance the effects of row sparsity and element sparsity of the projection matrix W in mining the class-common and class-specific features, respectively.
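A small numeric sketch of the objective above (fitting term plus the l2,1 and l1 penalties) and of the common-feature importance θ_i. Function names are illustrative, and the actual optimization procedure of Table III is not reproduced here:

```python
import numpy as np

def jcsfe_objective(X, Y, W, alpha, beta):
    """Value of the JCSFE fitting term plus the two sparsity penalties.

    X: (D, N) features by samples, Y: (N, C) label matrix, W: (D, C).
    The l2,1 norm (sum of row l2 norms) drives row sparsity, selecting
    class-common features; the l1 norm drives element-wise sparsity,
    selecting class-specific features.
    """
    fit = np.linalg.norm(X.T @ W - Y, "fro") ** 2
    l21 = np.linalg.norm(W, axis=1).sum()  # sum over rows of ||w^i||_2
    l1 = np.abs(W).sum()
    return fit + alpha * l21 + beta * l1

def common_feature_importance(W):
    """theta_i: normalized row l2 norm of W, the class-common discriminability."""
    theta = np.linalg.norm(W, axis=1)
    return theta / (theta.sum() + 1e-12)
```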
2) Classification Model Based on Joint Category-Common and Category-Specific Features With Graph Regularization: Peng et al. [27] proposed that learning performance can be greatly improved if data manifolds are explored and utilized. To measure the correlation between two samples under the microstate mixed representation, the KNN method is adopted to establish a similarity matrix S ∈ R^(N×N), which describes the similarity between two samples after projection. S_ij represents the similarity between samples x_i and x_j; using a 0-1 weighting scheme, the similarity matrix is defined as

S_ij = 1 if x_i ∈ N(x_j) or x_j ∈ N(x_i), and S_ij = 0 otherwise,  (3)

where N(x_i) denotes the K nearest neighbors of the sample x_i under the Euclidean distance metric. Data local invariance requires that if two samples x_i and x_j are similar in the original sample space, their representations in the projection space should also be similar, which can be achieved by minimizing

(1/2) Σ_ij S_ij ||f_i − f_j||^2 = tr(F^T L F),  (4)

where F = X^T W is an intermediate variable introduced to simplify notation (f_i is its ith row), the Laplacian matrix L is calculated by D − S, and D is a diagonal matrix whose ith diagonal element is D_ii = Σ_j S_ij. By incorporating formula (4) into formula (2) as a regularizer, the final JCSFE model objective function is

min_W ||X^T W − Y||_F^2 + α ||W||_{2,1} + β ||W||_1 + γ tr(W^T X L X^T W),  (5)

where γ is a nonnegative regularization parameter for the graph term. During the training process, formula (5) is fitted using the given microstate mixed-representation samples, and the motor imagery task category corresponding to an unlabeled sample can be directly obtained through Y_U. Additionally, based on the learned projection matrix W, the category-common and category-specific features can be explored by analyzing the spatiotemporal information of their respective microstate sample spaces in the motor imagery task. The training algorithm of the JCSFE model is shown in Table III, and the testing algorithm is shown in Table IV.
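The KNN 0-1 similarity matrix and the Laplacian regularizer described above can be sketched as follows (a naive O(N^2) construction for illustration; the function names are assumptions):

```python
import numpy as np

def knn_similarity(X, k=5):
    """0-1 KNN similarity matrix S for samples in the rows of X.

    S_ij = 1 if x_i is among the k nearest neighbors of x_j or vice versa
    (the symmetrization implements the 'or' in the 0-1 weighting scheme).
    """
    n = X.shape[0]
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # Euclidean distances
    np.fill_diagonal(d, np.inf)                          # a sample is not its own neighbor
    S = np.zeros((n, n))
    for i in range(n):
        S[i, np.argsort(d[i])[:k]] = 1.0
    return np.maximum(S, S.T)

def graph_regularizer(F, S):
    """tr(F^T L F) with L = D - S; small when similar samples stay close."""
    L = np.diag(S.sum(axis=1)) - S
    return np.trace(F.T @ L @ F)
```

The regularizer equals half the similarity-weighted sum of squared distances between projected samples, so minimizing it enforces the local-invariance property.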

A. Model Classification Result
The experiment was conducted on a single subject, and the cluster number of the Gaussian mixture model was set to 10. Among all samples, the ratio of training set to test set was 4:1. The JCSFE model is first used to obtain the feature coefficient matrix on the training set; the product of the test data and this coefficient matrix then yields a prediction matrix, in which each row represents a test sample and each value represents the probability that the sample belongs to a certain class. The column corresponding to the maximum value of each row is selected as the predicted class, and the algorithm model is denoted GMM-JCSFE.
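The prediction step above amounts to a matrix product followed by a row-wise argmax; a minimal sketch, with the function name and shapes as illustrative assumptions:

```python
import numpy as np

def predict_classes(X_test, W):
    """Predict MI classes from test features and the learned coefficient matrix.

    X_test: (D, n_test) test features; W: (D, C) coefficient matrix.
    Each row of X_test^T W scores one test sample across the C classes;
    the column with the maximum score is the predicted class.
    """
    scores = X_test.T @ W      # (n_test, C) prediction matrix
    return scores.argmax(axis=1)
```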
1) Results on the 2a Dataset: In this experiment, five different methods were compared as follows. FBCSP: this baseline method is also the No. 1 algorithm in the competition [28]. ACSP-CNN: the augmented CSP (ACSP) algorithm is combined with a three-layer CNN for feature extraction and classification [29]. WT-2D CNN: the wavelet transform is selected to generate time-frequency images, which are combined with a 2D-CNN for classification [30]. TSGL-EEGNet: the EEGNet algorithm is combined with a temporally constrained sparse group lasso (TCSGL) [31]. ATCNet: this approach employs a multi-head self-attention mechanism to emphasize the most valuable features in EEG data, utilizes a temporal convolutional network to extract high-level temporal features, and uses a convolution-based sliding window for efficient augmentation of the EEG data [32].
The experimental results are shown in Table V. The average classification accuracy of the baseline method across all subjects was 67.75% ± 13%, and for the ACSP-CNN method it was 69.27% ± 14.08%. The WT-2DCNN method achieved an accuracy of 85.59% ± 5.67%, the TSGL-EEGNet method resulted in an accuracy of 81.34% ± 9.08%, and the ATCNet method reached an accuracy of 85.40% ± 9.10%. The proposed GMM-JCSFE method yielded an accuracy of 85.89% ± 1.11%, which not only had a higher average classification accuracy than all comparative methods but also exhibited a lower standard deviation.
The experimental results are shown in Table VII. The average classification accuracy of the Winner method across all subjects was 96.00% ± 7.16%, and for the CSP-SVM method it was 85.01% ± 10.92%. The TDA-CNN method achieved an accuracy of 86.02% ± 11.21%, and the SCN method resulted in an accuracy of 88.91% ± 7.51%. The average accuracy of GMM-JCSFE was 90.67% ± 3.83%. While its average classification accuracy across all subjects was not the highest among the comparative methods, its standard deviation was lower than that of all of them.

B. Analysis of Class Common Features and Class Specific Features on Dataset 2a
At the Gaussian sub-model level, the feature coefficient matrix of each sub-model is a 1000 × 4 matrix, where 1000 is the number of sampling points of the EEG motor imagery data and four is the number of motor imagery tasks. For the feature coefficient matrix of each Gaussian distribution model, formula (6) is used to calculate and analyze the class-common features of the Gaussian distribution model, where θ is the row l2 norm of the feature coefficient matrix and T represents the number of sampling points.
Fig. 5 shows the category-common feature contribution of each Gaussian component model for single subjects on dataset 2a. With the exception of Subjects 4 and 7, every subject has one Gaussian component model with a significant contribution. As a category-common feature, this Gaussian component model has the ability to recognize all four types of motor imagery tasks. Subjects 4 and 7 have multiple common features with similar contributions, which may be related to the relatively low classification accuracy of these two subjects across all classification algorithms.
The class-specific features can be analyzed from the perspective of the columns of the feature coefficient matrix of the Gaussian distribution model, and are calculated and analyzed using formula (7).
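One reading of this column-wise analysis is to normalize the absolute coefficients within each column, giving the contribution of each feature to each task (the quantity visualized as a heatmap in Fig. 6). The function name and the exact normalization are assumptions, not necessarily formula (7) itself:

```python
import numpy as np

def specific_feature_contribution(W):
    """Column-wise view of the coefficient matrix W (D features x C tasks).

    Normalizing |w_ij| within each column j gives the class-specific
    contribution of feature i to motor imagery task j; larger values mean
    the feature is more discriminative for that particular task.
    """
    A = np.abs(W)
    return A / (A.sum(axis=0, keepdims=True) + 1e-12)
```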
Fig. 6 shows the category-specific feature contribution of each Gaussian component model for a single subject. The darker color blocks in the figure indicate that the Gaussian component model has a higher recognition ability for the classification task corresponding to that column. Subject 1 demonstrated good recognition ability for motor imagery intentions in each component model. According to the heatmap of Subject 1 in Fig. 6, Gaussian component model 10 has good classification ability for the left-hand imagery task, Gaussian component model 8 has good recognition ability for the right-hand imagery task, Gaussian component model 4 has a higher contribution to the feet imagery task, and Gaussian component model 9 has a higher discrimination ability for the tongue imagery task. Each of these Gaussian models corresponds to a class-specific feature.

IV. DISCUSSION

This study designed a model that recognizes various MI tasks from EEG signals at the individual level. While further research and improvements are still needed, our work has strengths and weaknesses in the following aspects. Here, we discuss possible directions for future improvements:

1) The GMM-JCSFE model was developed for the extraction and classification of microstate features from EEG data during motor imagery. This model diverges from conventional microstate feature extraction methods by not relying on the one-hot encoding assumption and is capable of effectively extracting various state features from sub-models. Additionally, the model distinguishes between common and specific features based on the weight contributions of these state features across different types, thereby enhancing the interpretability of the classification capabilities of each feature.
Experimental results demonstrate that the proposed model significantly outperforms literature methods in enhancing classification accuracy and reducing variance across three datasets, thereby exhibiting superior stability [26], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38]. However, testing has so far been confined to individual subjects without validation across multiple subjects, and the classification performance lags behind that reported in literature [45]. This disparity might stem from the impact of denoising techniques on data integrity, or could indicate that multi-subject learning is better equipped to discern shared characteristics advantageous for classification. Future research should prioritize multi-subject learning and investigate specific factors influencing unique and common microstate features relevant to classification tasks, such as their relationship with state transitions or duration, to further improve classification precision.
2) To examine the influence of shared features across categories on the effectiveness of Gaussian Mixture Model (GMM) in single-subject classification tasks, experiments using dataset 2a revealed that subjects 4 and 7 had comparatively lower classification accuracy.This can be attributed to these subjects sharing multiple features with similar contributions across categories.The lack of distinct individual mixture model features due to their common characteristics leads to reduced discriminability.
3) To explore the impact of private features within categories in the Gaussian Mixture Model (GMM) for single-subject classification tasks, experiments from dataset 2a indicated that in subject 1, mixture model 10 demonstrated a higher classification contribution for the left-hand motor imagery task, while mixture model 8 showed superior performance for the right-hand task.Additionally, mixture model 4 was prominent for the feet movement imagery, and mixture model 9 exhibited a higher contribution for the tongue movement imagery.This phenomenon is likely attributed to the presence of private characteristics among the tasks, which manifest as variations in the weights of the respective mixture models.
4) Signal denoising is an important step in EEG signal preprocessing. This study employed simple baseline correction by visual inspection, which is simple and easy to understand. Such simple methods perform well in some studies, but are limited by their dependence on intensive manual work and subjective judgement. In recent years, numerous advanced denoising methods have been widely utilized to improve EEG denoising performance, such as Independent Component Analysis, Multi-Scale Principal Component Analysis (MSPCA), and diffusion models [45], [46], [47]. Compared with the traditional methods, these advanced methods remove noise from signals automatically. Another advantage of advanced denoising methods is that their results are reproducible. However, a disadvantage of the current EOG trial removal steps is the use of third-party tools (such as MNE), which restricts our methods to off-line EEG analysis. Furthermore, the advent of deep learning has led to exceptional classification performance in EEG classification using Convolutional Neural Networks (CNNs) or Graph Neural Networks (GNNs) across various applications [48], [49], [50]. Additionally, graphical features represent a novel approach to identifying underlying patterns based on the graphical representation of EEG data. The fusion of EEG features derived from Gaussian Mixture Models (GMMs) with graphical features, in conjunction with CNN and GNN models, presents promising avenues for future research in EEG classification.
5) The code runtime analysis was conducted on the 'subject4' data from dataset 2a. Running the entire code on all 864 samples for subject4 includes data splitting (3.26844 s), Gaussian mixture model feature extraction (1.12317 s), data transformation (1.87973 s), the main function (132.8579 s), and classification (0.12878 s). The total time taken was 139.25805 s, with an average time of 161.17839 ms per-sample test, which is below the 250 ms threshold. This indicates that the method is suitable for implementing a real-time detection system.

V. CONCLUSION
In this paper, we introduce the GMM-JCSFE model to address the under-utilization of microstate features and the reliance on manual experience. The model extracts microstate mixed-representation features using the Gaussian mixture model, subsequently obtains the feature coefficient matrix for subjects in the JCSFE model training set, and evaluates its performance on the test set. Experimental results indicate that our proposed model outperforms other frequency-domain and spatial-domain methods in terms of motor imagery recognition accuracy and demonstrates effectiveness in handling individual differences.

Fig. 1 .
Fig. 1. Scheme of the experimental paradigm. The 3-6 s interval is the experimental data section, during which the subject imagines movements.

Fig. 2 .
Fig. 2. The flow of the GMM-JCSFE model. Data preprocessing mainly includes filtering (4-38 Hz), segmentation and baseline correction, artifact removal, and referencing. 2.a: With the number of sub-models set to 10, GMM processes the preprocessed EEG data using the Gaussian mixture model decomposition algorithm to obtain the parameters and probabilities of each Gaussian sub-model. The probabilities of the sub-models constitute the microstate features. 2.b: The JCSFE module differentiates between specific and common features of the microstate characteristics and performs the classification tasks. The classification results include four kinds of motor imagery: left hand, right hand, feet, and tongue.

Fig. 3 .
Fig. 3. Feature extraction of micro-state hybrid model based on GMM.

Fig. 5 .
Fig. 5. Contribution of the category-common features of each Gaussian component model for single subjects on dataset 2a.

Fig. 6. Fig. 7.
Fig. 6. Contribution of the category-specific features of the single-subject Gaussian component model on dataset 2a. (The Y axis represents the microstate number, and the X axis represents the categories.)

TABLE V
K-FOLD CROSS-VALIDATION PERFORMANCE IN TERMS OF ACCURACY MEAN ON DATASET 2a

TABLE VI
K-FOLD CROSS-VALIDATION PERFORMANCE IN TERMS OF ACCURACY MEAN ON DATASET 2b