Integrating Optimized Multiscale Entropy Model with Machine Learning for the Localization of Epileptogenic Hemisphere in Temporal Lobe Epilepsy Using Resting-State fMRI

The bottleneck associated with the validation of the parameters of the entropy model has limited the application of this model to modern functional imaging technologies such as the resting-state functional magnetic resonance imaging (rfMRI). In this study, an optimization algorithm that could choose the parameters of the multiscale entropy (MSE) model was developed, while the optimized effectiveness for localizing the epileptogenic hemisphere was validated through the classification rate with a supervised machine learning method. The rfMRI data of 20 mesial temporal lobe epilepsy patients with positive indicators (the indicators of epileptogenic hemisphere in clinic) in the hippocampal formation on either left or right hemisphere (equally divided into two groups) on the structural MRI were collected and preprocessed. Then, three parameters in the MSE model were statistically optimized by both receiver operating characteristic (ROC) curve and the area under the ROC curve value in the sensitivity analysis, and the intergroup significance of optimized entropy values was utilized to confirm the biomarked brain areas sensitive to the epileptogenic hemisphere. Finally, the optimized entropy values of these biomarked brain areas were regarded as the feature vectors input for a support vector machine to classify the epileptogenic hemisphere, and the classification effectiveness was cross-validated. Nine biomarked brain areas were confirmed by the optimized entropy values, including medial superior frontal gyrus and superior parietal gyrus (p < .01). The mean classification accuracy was greater than 90%. It can be concluded that combination of the optimized MSE model with the machine learning model can accurately confirm the epileptogenic hemisphere by rfMRI. With the powerful information interaction capabilities of 5G communication, the epilepsy side-fixing algorithm that requires computing power can be integrated into a cloud platform. The demand side only needs to upload patient data to the service platform to realize the preoperative assessment of epilepsy.


Introduction
Epilepsy is a chronic disease caused by abnormal neural discharges that could result in abnormal functions in the central nervous system. is disease could adversely impact the patient's quality of life and may even lead to death. e positive hippocampus on the structural magnetic resonance imaging (sMRI) is the most common pathological marker in the medial temporal lobe epilepsy (mTLE). It is also an indicator of epileptogenic hemisphere. e paroxysmal discharges might propagate to the large-scale brain networks and lead to metabolic dysfunction or structural changes in the distal brain regions connected with the primary lesion. Surgical treatment is a preferred method for intractable epilepsy, and more than 70% of focal epilepsy patients had experienced favorable postsurgical seizure control and exhibited the most positive therapeutic effect [1,2]. e precise assessments such as epileptogenic hemisphere, epileptogenic zone, and propagation pathway are important prior to the surgery [3]. ey can not only help remove abnormal brain tissue, but also avoid damage to other functional areas such as language and memory as much as possible. In particular, the confirmation of the epileptogenic hemisphere is a basis. Integrating machine learning to locate epilepsy hemispheres has become a current research hotspot. Jin et al. [4] used magnetoencephalograms (MEG) to find the biomarker of functional connectivity (FC) to predict the epileptic hemisphere by support vector machine (SVM) with a classification rate of 76.2%. Barron et al. [5] had developed an automatic computer-aided diagnosis tool, which can determine the epileptogenic hemisphere by Positron Emission Computed Tomography (PET), with an accuracy rate of 82%.
Resting-state functional magnetic resonance imaging (rfMRI), a modern technology, has gained prominence for the task of revealing the pathological functions in the different regions in the brain, with some advantages such as noninvasive, objective, quick (in less than 15 min), conformable, and highly adaptive. It is therefore of great clinical value for pre-and postoperative monitoring and treatment of neurological diseases [6][7][8]. Now, based on rfMRI data, Zheng et al. could locate the the epileptogenic hemisphere with an accuracy of 83.8% by the combination of FC and SVM [9]. erefore, the classification rate needs to be improved. As the neurological diseases affect brain functions on multiple temporal and spatial scales, Costa et al. [10] proposed the multiscale entropy (MSE) model that combined the sample entropy with the time scale to jointly uncover the complex characteristics under different physiological states. It is now extensively adopted for investigating neural signals [11][12][13][14][15][16][17], such as electroencephalography (EEG) [18][19][20][21], MRI [22,23], and magnetoencephalography (MEG) [24]. With regard to complexity, the MSE algorithm can reveal the physiological, pathological, and functional changes in the brain. e nonlinearity of the rfMRI signal of an epileptic patient is higher than corresponding signal for a healthy individual [25,26], and thus the entropy model has the potential to evaluate the functions in an epileptic brain.
However, the selection of parameter dimension m and similarity r of MSE model depends on experiences and lacks an objective basis from the past to the present. e values of these parameters in different studies in the pertinent literature were m � 1, r � 0.35 [27], m � 1, r � 0.3 [28], m � 1, r � 0.35 [29], and m � 2, r � 0.6 [30]. Evidently, there is a problem of no uniform parameter standard or specification for MSE to process biomedical signals, leading to subjectivity in the entropy calculation. In this study, for the task of localizing the epileptogenic hemisphere, we developed an optimization algorithm for the parameters of the MSE model based on the rfMRI data scanned on the mTLE patients with a positive indicator in the hippocampal formation on sMRI. e hemisphere of these patients was considered epileptogenic in a clinic. e proposed approach for selecting the parameters of the entropy model in the epileptic rfMRI is demonstrated to be objected.
We investigated the extant literature on parameter optimization, with the help of the sensitivity analysis indexesreceiver operating characteristic (ROC) curves and the area under the ROC curve (AUC). After the parameters of epileptic entropy model were optimized, the entropy model was employed to extract the sensitive functional image markers, whose entropy values were considered as feature vectors and input to the support vector machine (SVM) to assess the feasibility and accuracy of optimal entropy model. Since the SVM is a supervised machine learning model and needs to be run on correctly labeled samples, the rfMRI data of mTLE patients with the positive hippocampus of sMRI were analyzed using the SVM. e study flowchart is shown in Figure 1.

Data Acquisition.
e rfMRI data of 20 mTLE were collected from the Department of Medical Imaging, General Hospital of Eastern eater of PLA, China, including nine males and 11 females, aged 19−33 (25.7 ± 3.7) years, who underwent a routine preoperative evaluation to localize the epileptogenic discharge areas. ere were 10 patients (left group) with the positive indicator on the left hippocampus and the other 10 patients (right group) on the right hippocampus on sMRI.
When scanning, participants were asked to keep their eyes closed, stay awake, and think nothing. All rfMRI data were collected using a 3.0 T Magnetom Vision plus MRscanner (Siemens, Erlangen Germany) and blood-oxygenlevel-dependent (BOLD)-sensitive echo plane imaging sequence. e parameters were set as follows: TE/TR � 30/ 2000 ms, FA � 90°, FOV � 240 mm, voxel size � 3 × 3 × 3 mm3, and slice thickness � 4 mm [31]. is study was approved by the Medical Ethics Committee of General Hospital of Eastern eater of PLA in China, and the patients were provided with the requisite information and signed the informed consent form.

Data
Processing. Data processing was accomplished using both pre-and postprocessing techniques.
All rfMRI data were preprocessed using FMRIB software [32,33]. Preprocessing procedure and quality inspection (framewise displacement (FD) < 0.2 mm) were conducted at Martinos Center for Medical Imaging, Harvard Medical School, USA [34,35]. e processing station used was Intel Xeon Sliver 4112 × 16 with the operating system of Centos 7.6, and the preprocessing time was about 15 hours for each subject. e different steps for data preprocessing can be enumerated as follows: (1) removal of the first four acquisition time series to stabilize the signal, (2) slice timing correction (SPM2, Wellcome Department of Cognitive Neurology, London, UK), (3) rigid body correction for head motion with the FMRIB software library package (http://fsl.fmrib. ox.ac.uk/fsl) [36], (4) normalization for global mean signal intensity across runs and registration of the signal to the standard space of Montreal neurological institute, and (5) band-pass temporal filtering (0.01Hz−0.08 Hz).
During postprocessing, data were projected to the brain region AAL1 (Anatomical Automatic Labeling, Version 1) using DPABI of MATLAB R2018b in the Windows V.10 operating system. e whole brain of each subject was divided into 116 brain regions, and each brain region has 246 time points in a 490 s time length. e overall postprocessing time was about 14 hours. Among 116 brain regions, 90 cortical cortex regions were examined in this study.

MSE Model.
For one-dimensional N-length discrete time series x 1 , x 2 , x 3 , . . . , x N , the length of each coarsegrained time series y τ j is equal to the length of original time series divided by the time scale τ in the following formula: where 1 ≤ j ≤ N/τ, τ is the scale factor, and the length of For each i value, its distance from other value j is calculated, that is, the distance between Ym(i) and Ym(j) is shown in the following formula: Setting the tolerance threshold (i.e., similarity factor) r (r ＞ 0), the number Bm(i) of d[Ym(i), Ym(j)] < r is calculated for each i value, and the ratio to the total distance can be obtainable by C m τ (r) � B m (i)/L -m. And then, the average of C m τ (r) is shown as follows: Likewise, for m + 1 dimension, the average can be derived as Given that the sequence length L is a finite value, the entropy with L can be estimated, denoted as MSE using the followig formula: e size of entropy value is related to the repeatability of time series. e larger the entropy value, the greater the complexity, and vice versa.
As the three parameters interacted with each other, they needed to participate in the optimization of each parameter. e dimension m was optimized at first, followed by the optimization of similarity r and time scale τ.
In general, the accuracy of entropy evaluation improves with the increase in the number of vector matches in dimension m and m + 1. Using BOLD signal, the entropy can be accurately estimated in the time length of 10 m−20 m [37]. On the basis of the time point length of data (246 time points), the dimension m could be estimated in a range of 1−2; that is, the optimization of dimension m would be conducted by m � 1 or 2.

Journal of Healthcare Engineering
According to previous analysis, the range of similarity r could be selected from 0.05 to 0.6 [27][28][29][30]38,39]. However, there were some invalid values in the entropy calculation at r � 0.05-0.3. Accounting for a short length of time series in this study, the similarity r could not be too small. us, the similarity r could be within the range of 0.3-0.6. We conducted the entropy calculation with a step size of 0.02. As usual, the time scale τ could be optimized in a range of 1−5 with a step size of 1.
Here, both ROC curve and AUC value were used to assess the optimization effect. Both these values converged during the evaluation. at is, the more the ROC curve was above the reference line; the greater the AUC value was, and the better the classification effect was. However, ROC curves below the reference signified the classification insignificance. When the ROC curve of a brain area was above the reference line, this brain area could be considered sensitive to the epileptogenic hemisphere. Conversely, when the ROC curve of a brain area was around the reference line, this brain area could be considered insensitive to the epileptogenic hemisphere. e optimal procedures were individually conducted for all the 90 brain areas. For the sake of explanation, the specific brain regions were taken as examples, such as left superior frontal gyrus, medial (SFGmed.L), right superior parietal gyrus (SPG.R), left cuneus (CUN.L), and left thalamus (THA.L). e brain area with a significant difference (p < .05) of the optimized entropy values between the two groups was considered to be the functional biomarked sensitive to epileptogenic hemisphere. e statistics were performed by t-test through software of Statistical Package for the Social Science (SPSS) (IBM SPSS Statistics 21; USA). e ROC curve and the AUC value were used to verify the biomarkers. BrainNet Viewer was employed to visualize the biomarkers (http://www.nitrc.org/projects/bnv/) [40].

SVM.
SVM-based machine learning model is not easily influenced by the dimension of data and the limitation of the sample size. In particular, it could simultaneously minimize the empirical classification error and maximize the geometric margin. It is thus better than other conventional machine learning models. e optimized entropy values of functional biomarked areas were considered to be the feature vectors input into the SVM model. e mTLE patients in the left group were marked as ''1,'' and those in the right group were marked as "0". e optimized MSE values of 20 subjects were taken as the input to the SVM model. e data from 8 subjects were used as the training set and from 12 subjects as the testing set. In the SVM model, the radial basis function (RBF) was considered as the kernel function, and the kernel parameter σ was replaced with the proportional parameter g � 1/2σ 2 to form a set of parameter pairs (C, g). Ranges of both these parameters were set as [−10,10] with a step size of 0.2. e optimal values of C and g were confirmed by the grid search method and then used to build the training model and the test sets. e leave-one-out cross-validation (LOOCV) is often used to test the accuracy of the algorithm model, and its sample utilization is very high; thus, it is highly suitable for small samples analysis. If there are N sample data, then N − 1 samples constitute the training data, and the remainder constitutes the testing data. erefore, the mean accuracy of N times (N � 20) can be used to evaluate the classification accuracy.

Parametric Optimization of MSE Model.
After preprocessing, the length of BOLD signal was 246 time points in the mTLE. us, the value of m can be 1 or 2. According to the previous studies, the optimized parametric spaces were confirmed in a range of r � 0.3-0.6 (step size of 0.02) and τ � 1−5 (step size of 1). Figure 2, the dimension m was optimized by the number of significant brain regions between two groups (p < .05) when the time scale τ � 1−5 with a step size of 1 and the similarity r � 0.3-0.6 with a step size of 0.02. It was found that the number of significant brain regions of m � 1 was generally greater than that of m � 2. erefore, m � 1 was selected as the optimization parameter. Moreover, the difference is significant at τ � 2 and τ � 3, so the optimization parameter of τ might be derived from τ � 2−3.

Optimization of Dimension m and Similarity r. In
Additionally, it was found that the number of significant brain regions at r � 0.54-0.6 were generally greater than other r values. at is, the optimization value of similarity r could be derived from 0.54-0.6.
By setting m � 1 and r � 0.54-0.6 (step size of 0.02) fixed, the ROC curves of left superior frontal gyrus, SFGmed.L, SPG.R, CUN.L, and THA.L at different r values were taken as examples in Figure 3. e results of τ � 2 and τ � 3 are similar; therefore, only the result of τ � 3 is shown here; it was found that the ROC curves of SFGmed.L and SPG.R were all above the reference lines, indicating that SFGmed.L and SPG.R could be considered sensitive to the epileptogenic hemisphere. Conversely, the ROC curves of CUN.L and THA.L were around the reference line and thus insensitive to the epileptogenic hemisphere. e AUC value of each brain region and the corresponding similarity factor (r) are presented in Table 1. Evidently, the AUC values of SFGmed.L and SPG.R were the largest when r � 0.56 (yellow line in Figures 3(a) and 3(b)), which indicates that r � 0.56 was the optimized value.  Figure 4. e AUC value of each brain region by time scale τ is presented in Table 2, and it was found that the AUC values of INS.L and ROL.L were the largest at τ � 3 (green line in Figure 4), suggesting that τ � 3 was the optimized value.
In summary, the optimized parameters of MSE model derived from the epileptic rfMRI data were m � 1, r � 0.56, and τ � 3.

Feature Vector.
With the optimized parameters m � 1, r � 0.56, and τ � 3, a total of nine brain regions sensitive to the epileptogenic hemisphere were obtained by the intergroup entropy values (p < .01), that is, left

Leave-One-Out Cross-Validation.
e MSE values of nine biomarked brain regions were used as feature vectors input for SVM-based classification. All classification rates (CR) of 20 times are shown in Table 3. e validity of MSE model is verified using leave-one-out cross-validation (LOOCV), and the mean classification accuracy of LOOCV was found to be 95%.

MSE Model to Image the Brain. From a theoretical
perspective, the MSE model can comprehensively analyze the dynamical changes of signals by calculating the complexity of time series [19]. In addition, because epilepsy is a dynamical disease, the MSE model has the potential to be utilized in the assessment of dysfunction in epilepsy [41]. Finally, since the functional connectivity of Pearson correlation (FC) is a conventional index, its relationship between FC and entropy would provide support for the optimized MSE to image the functional brain. Wang et al. [42] affirmed that the relationship between MSE value of the rfMRI time series and FC depended on the length of time series and the size of time scale τ. By the MSE model of the rfMRI data, Yang et al. [27] found that the people with different cognitive scores can be fairly distinguished using a MSE-based model. William et al. [43] examined if the MSE model of EEG data can distinguish the absence epilepsy patients from healthy control with high accuracy (>95%).

Functional Biomarker of TLE.
At present, the mTLE lesions are identified by certain biomarked brain regions including the insula, bilateral cingulate gyrus, and precuneus [4,9,44]. By considering the healthy people as a control, Zhou et al. [45] found that the left middle temporal gyrus, right middle frontal gyrus, and the left anterior central gyrus were significantly different in temporal lobe epilepsy by the entropy values of the rfMRI data. In addition, by the anatomical connectivity between the left and right hemispheres in the temporal lobe epilepsy, a different connectivity pattern was observed in the cortical-limbic network and cerebellum [46]. ese biomarked brain regions have a great intersection with the current study.

Conclusion
rough the optimized MSE model, a total of nine landmarks sensitive to the epileptogenic hemisphere of temporal lobe epilepsy could be estimated in the left Rolandic operculum, right Rolandic operculum, left superior frontal gyrus, medial, left insula, and so on. e mean CRs attained through LOOCV were 95%, respectively.
is work underlines the importance of an optimized entropy model and presents an automated detection algorithm for rfMRI to aid the preoperative assessment of epilepsy.

Data Availability
Due to the privacy protection of patient data, the availability of original data cannot be provided.    T  T  T  T  T  T  T  T  T  T  ID  11 12 13 14 15 16 17 18 19 20  Result  F  T  T  T  T  T  T  T  T  T T: prediction is converge to diagnose; F: prediction is reversal to diagnose. 8 Journal of Healthcare Engineering