A Feature Selection Approach Based on Memory Space Computation Genetic Algorithm Applied in Bearing Fault Diagnosis Model

The main objective of this study is to propose a motor fault diagnosis model based on machine learning. Compared with the traditional motor fault diagnosis model, the proposed model reduces the computation time. The model is divided into three steps: feature extraction, feature selection, and classification. In the feature extraction step, features are extracted from the original signal by the Hilbert-Huang transform (HHT), envelope analysis (EA), and variational mode decomposition (VMD). A feature selection method based on the memory space computation genetic algorithm (MSCGA) is proposed and applied in the feature selection step. The advantage of MSCGA is that it eliminates the need to compute the fitness values of duplicate chromosomes repeatedly, saving unnecessary computation time. The classifiers are k-nearest neighbor (KNN) and support vector machines (SVM). To verify the stability and efficiency of the model, the University of California Irvine (UCI) benchmark datasets, the current-signal motor fault datasets, and the Case Western Reserve University (CWRU) dataset were used. The UCI datasets are used to test the efficiency and computation time of the feature selection method. The other datasets are used for comparison with traditional motor fault diagnosis models. The simulation results demonstrate that the proposed model reduces the computation time without degrading the results compared with the traditional motor fault diagnosis model. Furthermore, the performance of MSCGA is shown to be better than that of the other algorithms compared.


I. INTRODUCTION
The associate editor coordinating the review of this manuscript and approving it for publication was Baoping Cai.
In an industrialized society, electrical rotating machinery is widely used in manufacturing industries and plants. The rolling bearing is the most important core component of a rotating machine [1], and among the components of a rotating machine it has the highest and most common failure ratio [2], [3]. The failure of rotating machinery may damage the manufacturing equipment and, in serious cases, may even injure the operator [4]. How to detect the failure of electrical machinery is therefore a problem that needs a solution, and this study's contribution is finding a feature selection method that can be more efficient. The diagnosis model of bearing failure is divided into three steps [5]: the first step is feature extraction [6], the second step is feature selection [7], and the last step is classification. A motor in operation produces a large number of signals mixed with noise [8]. The measured current and vibration signals are the initial signals. The main function of feature extraction is to find the important features in the initial signals and extract them [9]. The important features often include the maximum value, the average value, and the root mean square value. The quality of the features affects the classification accuracy [10]. There are several feature extraction methods [11]; the better-known ones are envelope analysis (EA) [12], the wavelet transform (WT) [13], the Hilbert-Huang transform (HHT) [14], variational mode decomposition (VMD) [15], and the fast Fourier transform (FFT) [16]. HHT was proposed by Norden E. Huang et al. at Academia Sinica, Taiwan [17], to decompose the analyzed data into intrinsic mode functions (IMFs) [18], a process called empirical mode decomposition (EMD) [19].
The Hilbert transform is applied to each IMF to obtain the instantaneous frequency of the processed data [20]. VMD is one of the more recent feature extraction strategies. Similar to EMD, it acquires the decomposed components by iteratively searching for the optimal solution of a variational model, which determines the center frequency and bandwidth of each component, so that the frequency-domain segmentation of the signal and the effective separation of the components are achieved adaptively. Of these methods, HHT, VMD, and EA are applied in this study. After the feature extraction step, the feature set has been initially formed. However, the extracted dataset still contains many redundant features, especially in higher-dimensional datasets [21]. These redundant features decrease the accuracy of the final classification. Therefore, a feature selection step is needed as an intermediate step between feature extraction and classification. Its function is to pre-process the dataset before it is sent to the classifier so that its features are further selected. Feature selection has developed rapidly, and metaheuristic algorithms have received attention for solving dataset optimization problems. The classical algorithms include binary grey wolf optimization (BGWO) [22], ant colony optimization (ACO) [23], binary particle swarm optimization (BPSO) [24], the genetic algorithm (GA) [25], and binary differential evolution (BDE) [26]. GA was proposed by John Holland of the University of Michigan and is commonly used as a search algorithm for solving optimization problems. It was originally developed by drawing on a number of phenomena in evolutionary biology [27], including inheritance, mutation, natural selection, and crossover [28]. Compared with conventional GA, the method proposed in this study, MSCGA, has the advantage of reducing the computation time while maintaining the original search capability.
Its main purpose is to avoid becoming trapped in local optima during feature selection by enlarging the search area, so that the best choice over the whole domain can be found and the spatial search capability is improved. At the same time, it can shorten the computation time by more than half. The last step in a machine learning-based fault diagnosis model is classification. In this part, the classifier classifies the samples using the best features in the selected feature subset and identifies the fault signals, and the classification accuracy is used to judge the final result. In machine learning, there are several common classifiers, such as k-nearest neighbor (KNN) [29], discriminant analysis (DA) [30], support vector machines (SVM) [31], decision tree (DT) [32], naive Bayes (NB) [33], and random forest (RF) [34]. KNN is an instance-based, or lazy, learning method that approximates the target function locally and postpones all computation until classification. KNN is one of the simplest of all machine learning algorithms.
SVM is a supervised learning model with associated learning algorithms that analyze data for classification and regression. Given a set of training instances, each labeled as belonging to one of two classes, the basic SVM is a non-probabilistic binary linear classifier. KNN and SVM are among the most frequently used classifiers and are considered highly credible. Therefore, this study uses KNN and SVM as the final classifiers for comparison. The proposed model combines the feature extraction methods, the proposed feature selection method (MSCGA), and the final classification step. This paper's main objective is to develop an effective method of bearing diagnosis. Based on this purpose, the main contributions are as follows:
1) Regarding feature extraction, the extracted dataset comprises the original data and the sub-data extracted by HHT, EA, and VMD, respectively, so that features of more kinds can be extracted.
2) In feature selection, MSCGA is applied, which can effectively shorten the computation time.
3) The proposed feature selection strategy is validated on the UCI datasets by comparing its accuracy with that of other classical feature selection methods. Furthermore, to verify its stability, the proposed bearing diagnosis model is compared with existing diagnosis models using the current-signal motor fault datasets and the CWRU dataset.
The rest of the paper is organized as follows. The process of feature extraction is discussed in Section II. Section III gives a detailed introduction to GA, the GA-based feature selection method, and the features and roles of MSCGA. Section IV describes the bearing diagnosis model used in this paper. Section V describes the experimental results, using the UCI datasets, the CWRU dataset, and the bearing current signal datasets to test the model capability. Finally, the conclusions are given in Section VI.

II. THE FEATURE EXTRACTION METHOD
In this study, three feature extraction methods were used, namely HHT, EA, and VMD. The feature extraction flow chart is shown in Fig. 1. Feature extraction means extracting different features from the original signal as a basis for analysis. The extracted features include the maximum value (max), mean value (mean), minimum value (min), mean squared error (mse), root mean square (rms), sum value (sum), and standard deviation (std). The mathematical meanings of the features are shown in Table 1, and the mathematical expressions for every feature are shown in Table 2.
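The statistical features listed above can be illustrated with a short sketch. It uses only the Python standard library and is not the authors' MATLAB implementation. The function name `basic_features` is hypothetical, the standard deviation is assumed to be the population form (dividing by the number of samples), and mse is left out because its exact expression is given in Table 2, which is not reproduced here.

```python
import math

def basic_features(signal):
    """Return common statistical features of one signal segment as a dict."""
    n = len(signal)
    mean = sum(signal) / n
    return {
        "max": max(signal),
        "min": min(signal),
        "mean": mean,
        "sum": sum(signal),
        # Root mean square: square root of the mean of the squared samples.
        "rms": math.sqrt(sum(v * v for v in signal) / n),
        # Assumption: population standard deviation (divides by n, not n - 1).
        "std": math.sqrt(sum((v - mean) ** 2 for v in signal) / n),
    }

# Toy segment used only for illustration.
features = basic_features([1.0, -1.0, 3.0, -3.0])
```

Applying such a function to each decomposed component yields one feature vector per signal segment.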
A. HILBERT-HUANG TRANSFORM
HHT consists of two parts: empirical mode decomposition (EMD) and Hilbert spectral analysis (HSA) [35]. This method is potentially viable for nonlinear and nonstationary data analysis, especially for time-frequency-energy representations [36].
VOLUME 11, 2023
1) Empirical mode decomposition (EMD)
The process of decomposing data into intrinsic mode functions (IMFs) is called the empirical mode decomposition (EMD) method. The steps for decomposing the signal $x(t)$ via EMD are listed below:
a. Find all the local maxima and local minima of the signal $x(t)$.
b. Connect the local maxima and the local minima by cubic spline lines to form the upper and lower envelopes, respectively.
c. Denote the mean of the upper and lower envelopes by $m_{ik}$; the intermediate component $h_{ik}$ is calculated as $h_{ik} = x(t) - m_{ik}$ [37]. To eliminate noise in the signal and meet the IMF conditions, repeat this process $k$ times (Step a to Step c), treating the intermediate component as the input signal.
d. During the iteration, if an intermediate component $h_{ik}$ satisfies the IMF conditions, it is considered the first IMF, $c_i(t) = h_{ik}(t)$. The residue is calculated as $r_i(t) = x(t) - c_i(t)$. The iterative loop (Step a to Step d) is then continued with the residue $r_i(t)$ taken as the new input signal $x(t)$ to find the next IMF.
e. Stop when the termination criteria are met. The stopping criterion can be either of the following:
• The residual standard deviation is less than the specified value.
• No additional IMF can be extracted from the residual signal.
2) Hilbert Transform (HT)
HT is a specific linear operator that acts on a function $c_i(t)$ of a continuous signal and produces another function of a real variable. The Hilbert spectrum matrix of the signal is formed by analyzing the IMFs with the HT [38]. The Hilbert spectrum can be used to analyze how the spectral content of nonlinear and non-stationary signals changes over time [39]. Therefore, a spectral matrix is applied to extract local features [40]. Based on the amplitude distribution in the time-frequency domain of the spectrum matrix, the feature extraction process is as follows: in the time domain, five curves are calculated by five indicators: maximum (T-max), mean (T-mean), mean square error (T-mse), root mean square (T-rms), and energy (T-sum). Afterward, five statistical values are extracted from each indicator curve: maximum (max), mean, mean square error (mse), root mean square (rms), and standard deviation (std).
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

B. ENVELOPE ANALYSIS
This method is mainly used to extract periodic mechanical vibration excitation signals [41], which can reveal defects in various motor parts. The flow chart of EA is shown in Fig. 2. There are N samples, and $x(t)$ is the original vibration signal. $X(f)$ is the FFT of the original signal, $x(n)$ is the recombined signal after the bandpass filter, $x_{env}(n)$ is the envelope signal to be analyzed, and $X_{env}(k)$ is the envelope spectrum. The main advantage of EA is reliable wide-frequency fault detection in certain frequency bands [42].
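The chain from $x(t)$ to $X_{env}(k)$ described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The envelope is assumed to be the magnitude of the analytic signal (the usual Hilbert-transform approach), the bandpass filter is assumed to be an ideal one applied by zeroing frequency bins, and a naive DFT is used so that the sketch needs only the standard library. The function names, band edges, and test signal are hypothetical.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2), adequate for short examples)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def envelope_spectrum(x, fs, f_lo, f_hi):
    """Return (frequencies, magnitudes) of the envelope spectrum of x."""
    n = len(x)
    spectrum = dft(x)                                  # X(f)
    one_sided = [0j] * n
    for k in range(1, n // 2):                         # skip DC and Nyquist
        if f_lo <= k * fs / n <= f_hi:                 # ideal bandpass filter
            one_sided[k] = 2 * spectrum[k]             # one-sided spectrum
    analytic = idft(one_sided)                         # real part is x(n)
    envelope = [abs(v) for v in analytic]              # x_env(n)
    mean_env = sum(envelope) / n
    env_spec = dft([v - mean_env for v in envelope])   # X_env(k), DC removed
    freqs = [k * fs / n for k in range(n // 2)]
    mags = [abs(env_spec[k]) for k in range(n // 2)]
    return freqs, mags

# Toy signal: a 16 Hz carrier whose amplitude is modulated at 2 Hz.
fs = 64
x = [(1 + 0.5 * math.cos(2 * math.pi * 2 * i / fs)) * math.cos(2 * math.pi * 16 * i / fs)
     for i in range(64)]
freqs, mags = envelope_spectrum(x, fs, 10, 22)
peak_hz = freqs[mags.index(max(mags))]
```

In this toy case the envelope spectrum peaks at the modulation frequency rather than at the carrier, which is the property that makes the envelope useful for periodic fault impacts.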

C. VARIATIONAL MODE DECOMPOSITION
This method is based on the traditional Hilbert transform [15]. It finds the optimal solution of the variational model through iteration and determines each mode's bandwidth and center frequency, implementing adaptive decomposition in a variational framework [43].
The modes are updated in the frequency domain and finally inverse Fourier transformed to the time domain. This completes the adaptive segmentation of the signal band and effectively avoids mode mixing. The K modes correspond to different periods [44]. The VMD function calculates the IMFs in the frequency domain, reconstructing the signal spectrum in terms of the mode spectra. The algorithm extends the signal by mirroring half its length on either side to remove edge effects. The Lagrange multiplier introduced in the optimization (Signal Processing Toolbox) has the Fourier transform $\Lambda(f)$. The length of the Lagrange multiplier vector is the length of the extended signal. For the $(n + 1)$-th iteration, the algorithm performs these steps:
1) Iterate over the K modes of the signal to compute:
a. The frequency-domain waveform for each mode.
2) Update the Lagrange multiplier, where $\tau$ is the update rate of the Lagrange multiplier.
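For reference, the per-iteration updates in the standard VMD formulation can be written as follows. This is the commonly documented form and not necessarily the exact expressions used by the authors. The symbols assumed here are $X(f)$ for the Fourier transform of the extended signal, $U_k(f)$ for the spectrum of the $k$-th mode, $f_k$ for its center frequency, $\alpha$ for the penalty factor, and $i$ as a running mode index.

```latex
% Mode update for the k-th mode at iteration n+1
U_k^{n+1}(f) =
  \frac{X(f) - \sum_{i<k} U_i^{n+1}(f) - \sum_{i>k} U_i^{n}(f) + \dfrac{\Lambda^{n}(f)}{2}}
       {1 + 2\alpha \left\{ 2\pi \left( f - f_k^{n} \right) \right\}^{2}}

% Center-frequency update: power-weighted mean frequency of the mode
f_k^{n+1} =
  \frac{\int_0^{\infty} \left| U_k^{n+1}(f) \right|^{2} f \, df}
       {\int_0^{\infty} \left| U_k^{n+1}(f) \right|^{2} \, df}

% Lagrange multiplier update with update rate tau
\Lambda^{n+1}(f) = \Lambda^{n}(f) + \tau \left( X(f) - \sum_{k} U_k^{n+1}(f) \right)
```

The iteration is usually stopped when the modes change by less than a tolerance between iterations or when a maximum number of iterations is reached.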

III. THE FEATURE SELECTION METHOD
The purpose of feature selection is to retain the features that effectively improve the final accuracy on the dataset obtained after feature extraction and to delete the features that are useless for classification. This section describes the genetic algorithm (GA) and the proposed method, MSCGA.

A. GENETIC ALGORITHM
GA is a population-based heuristic search and optimization technique that imitates the process of natural evolution. Its principle is an iterative process that manipulates a population of chromosomes (candidate solutions) to generate new individuals through genetic operators such as selection, crossover, and mutation, similar to Charles Darwin's theory of biological evolution: inheritance and recombination of chromosomes, and survival of the fittest [45]. GA is built on three genetic operations: selection, crossover, and mutation. The steps are as follows:
step1 Encoding
Encode each solution as a gene string. Each gene of a chromosome is either explicit (1) or implicit (0) [46].
step2 Creating a group
Create an initial population that contains a certain number of sequences.
step3 Implement GA's principle
Implement the following steps until the termination conditions are reached:
a. Calculate the fitness value of every solution in the group. The fitness function determines how close a given solution is to the best solution of the problem, that is, the suitability of the solution [47].
b. In GA, each chromosome is represented as a sequence of binary codes. These binary codes are called solutions; the solutions are tested, and the best solution to the problem is proposed. A score is determined for each solution, indicating how close it is to satisfying the desired solution. The score is calculated by applying the fitness function to the tested solution.
c. Each problem has a suitable fitness function, and the choice depends on the nature of the problem itself. When using GA, deciding which fitness function is appropriate is the most challenging part. The computational parameters and basic functions of the domain in which the problem is formulated can be used as fitness functions for optimization problems.
d. In this model, KNN is applied, and accuracy is the score that determines whether a model is good enough. Accuracy is defined as follows:
$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}} \tag{15}$$
In this study, the metric used for the fitness function is the error rate, $\text{Error rate} = 1 - \text{Accuracy}$.
e. Create new solutions by the following three genetic operations. Selection: make choices in the group based on the solutions' fitness values, and choose two solutions as parents. The schematic of the roulette wheel is shown in Fig. 3. Consider a population of size $n$ (the number of solutions is $n$), $P = \{a_1, a_2, a_3, \ldots, a_n\}$; each solution $a_i$ has a fitness value $f(a_i)$, which determines its chance of being selected.
f. Crossover: the two parents create two new solutions by recombination at a crossover point. Select one of the nodes, and swap the chromosome segments after it. The schematic diagram of the single-node crossover is shown in Fig. 4.
g. Mutation: by probabilistic selection, mutate an existing solution. Select several genes of the chromosome and flip them. The schematic diagram of single-point mutation is shown in Fig. 5.
step4 Termination
When the termination condition is reached, the feature subset with the optimal fitness value found by the run is obtained. The flow chart of GA is shown in Fig. 6. The fitness function (objective function) evaluates each candidate (chromosome), and the value it returns is used to rank the chromosomes by fitness. The appropriate fitness function depends on the application problem.
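The three genetic operations described above can be sketched in a few lines. This is an illustrative sketch, not the authors' code. Because the fitness used in this study is an error rate (lower is better), the roulette wheel below assumes a weight of `1 - error_rate` for each chromosome; the exact selection probability used in the paper is not reproduced here. All function names are hypothetical.

```python
import random

def roulette_select(population, error_rates, rng):
    """Pick one chromosome with probability proportional to (1 - error rate)."""
    weights = [1.0 - e for e in error_rates]   # assumption: lower error, larger slice
    pick = rng.uniform(0.0, sum(weights))
    running = 0.0
    for chrom, weight in zip(population, weights):
        running += weight
        if pick <= running:
            return chrom
    return population[-1]                      # guard against round-off

def single_point_crossover(parent_a, parent_b, point):
    """Swap the gene segments after `point` to create two children."""
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def mutate(chrom, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chrom]

rng = random.Random(0)
parents = [[1, 1, 1, 1], [0, 0, 0, 0]]
children = single_point_crossover(parents[0], parents[1], 2)
```

Each chromosome is a list of 0/1 genes, where a 1 marks a feature that is kept for the classifier.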

B. THE PROPOSED METHOD (MSCGA)
GA is widely applied in many research fields, but it still has some disadvantages. In traditional GA, the fitness value is recalculated at each iteration, including for duplicate chromosomes. If the number of generations and the population size are large, this recalculation takes a lot of time and is where most of the time is spent. In addition, the randomization of chromosomes is so uncertain that it is sometimes difficult to find the optimal fitness value by random search. The search can also get stuck at a local optimum, which means that GA does not guarantee that every run reaches the optimum.
The main idea of this study is to find a way to reduce the computation time efficiently without affecting the original GA computation, and to further expand the original population in the hope of solving the local optimum problem. The advantage of MSCGA is that it eliminates the need to compute the fitness values of duplicate chromosomes repeatedly, saving unnecessary computation time. Therefore, the computation time can be effectively reduced. The traditional GA selection method uses the roulette wheel. In MSCGA, the roulette wheel selection of one of the two parents is replaced by a method called the ranking roulette wheel, in which the chromosomes are ranked by fitness value from the largest to the smallest so that those ranked highest have an increased chance of being selected. Therefore, the selection phase of MSCGA uses a mixed roulette wheel selection.
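The time saving comes from never evaluating a chromosome that has already been seen. A minimal sketch of that bookkeeping is given below, assuming the memory is a set of chromosomes stored as tuples. The function name `remove_repeat_offspring` is hypothetical, and the sketch also treats a second copy of a chromosome within the same batch as a duplicate, which is an assumption rather than something the text specifies.

```python
def remove_repeat_offspring(offspring, memory):
    """Drop offspring already recorded in memory and count how many were dropped."""
    kept = []
    deleted = 0
    for chrom in offspring:
        key = tuple(chrom)            # lists are not hashable, tuples are
        if key in memory:
            deleted += 1              # duplicate: its fitness is not recomputed
        else:
            memory.add(key)           # record the new chromosome
            kept.append(chrom)
    return kept, deleted

memory = set()
first_kept, first_deleted = remove_repeat_offspring([[1, 0], [1, 0], [0, 1]], memory)
second_kept, second_deleted = remove_repeat_offspring([[0, 1], [1, 1]], memory)
```

The count of deleted offspring tells the algorithm how many replacement chromosomes must be generated so that the population size stays constant.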
The check rate parameter has been added in the selection stage. Its purpose is to check the similarity of the two parents: if the similarity is too high, the selection is considered meaningless and is skipped. The check rate is set to 50%, which means that if the similarity of the two parents is more than 50%, the selection is skipped.
step4 Crossover
The two parents create two new solutions by recombination at a crossover point. Select one of the nodes and swap the chromosome segments after it.
Because the check rate added to the selection stage can already cancel a selection due to high similarity, the cross rate (in traditional GA, the probability that the selected parents are mated) is eliminated in the crossover stage. This prevents the program from stalling because both steps are canceled.
step5 Mutation
In the mutation step, random interval (RI) mutation is proposed, since traditional GA suffers from local optima due to excessive convergence. RI selects an interval of the chromosome and mutates the rest of the chromosome. The maximum mutation is thus performed while retaining some of the original chromosome characteristics, as shown in Fig. 7.
step6 Delete the repeat offspring
A memory space is added in this step. Its purpose is to record in memory the population chromosomes of each iteration until the termination condition is reached. The new offspring of each iteration are compared with the memory. If a chromosome is found to duplicate one in the existing population, the chromosome is deleted. This means that the fitness value of that chromosome does not need to be calculated again in the current iteration, which further reduces the computation time.
step7 Compare if any chromosomes have been deleted
In the previous step, chromosomes that duplicate the existing population are deleted, which means that there may not be enough offspring in the current iteration to move on to the next iteration. This step detects whether any offspring have been deleted and counts them. The same number of new offspring are then added in the next step, the additional crossover.
step8 Additional crossover
In each iteration, duplicate chromosomes are deleted, which means that the population may not be large enough. The deleted chromosomes are replenished in this step. The memory space is divided into five ranks according to fitness value, from high to low. Each rank is given a weight: the highest rank has a weight of 1, the next rank has a weight of 1/2, ..., and the lowest rank has a weight of 1/5. Whenever a chromosome is deleted, two ranks are selected from memory by roulette wheel selection. One parent is then selected from each of the two ranks, again by roulette wheel selection, and the two parents are crossed over to generate a new offspring. This method maximizes population diversity while maintaining population fitness values.
step9 If the offspring meet the original number
The additional crossover step is repeated until the number of chromosomes is brought back up to the original number before deletion. Once this step confirms the number of chromosomes, the next step, the again mutation, is performed.
step10 Again mutation
When the number of chromosomes has reached the number of offspring that the original iteration would have produced, the chromosome diversity can be increased further by performing an again mutation. The again mutation selects the chromosomes whose fitness values are in the top 20% of memory for mutation (the original mutation rate is 0.01, while the again mutation rate is 0.05). The new offspring are mutated and added to the existing population. When the iteration number reaches 100, the calculation stops.
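Three of the MSCGA-specific pieces in the steps above (the check rate, the random interval mutation, and the rank weights of the additional crossover) can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' implementation. Similarity is assumed to be the fraction of matching gene positions, "mutating" the genes outside the interval is assumed to mean flipping every one of them, and the intermediate rank weights are assumed to be 1/3 and 1/4. All names are hypothetical.

```python
import random

def similarity(parent_a, parent_b):
    """Fraction of gene positions at which the two chromosomes agree."""
    same = sum(1 for a, b in zip(parent_a, parent_b) if a == b)
    return same / len(parent_a)

def passes_check_rate(parent_a, parent_b, check_rate=0.5):
    """Return False when the parents are more similar than the check rate."""
    return similarity(parent_a, parent_b) <= check_rate

def random_interval_mutation(chrom, rng):
    """Keep one random interval unchanged and flip every gene outside it."""
    start = rng.randrange(len(chrom))
    end = rng.randrange(start, len(chrom)) + 1      # kept interval is chrom[start:end]
    return [g if start <= i < end else 1 - g for i, g in enumerate(chrom)]

# Weights for the five fitness ranks of the memory space, best rank first.
RANK_WEIGHTS = [1.0, 1 / 2, 1 / 3, 1 / 4, 1 / 5]

def pick_rank(rng):
    """Roulette wheel over the five ranks (index 0 is the best rank)."""
    return rng.choices(range(5), weights=RANK_WEIGHTS, k=1)[0]

rng = random.Random(1)
mutated = random_interval_mutation([0] * 8, rng)
rank = pick_rank(rng)
```

With these weights the best rank is drawn with probability 1 / (1 + 1/2 + 1/3 + 1/4 + 1/5), about 0.44, so good chromosomes are favored while lower ranks still contribute diversity.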
MSCGA is the method proposed in this research. Compared with the traditional GA, MSCGA is more powerful in feature selection and effectively reduces the calculation time. The flowchart of the proposed method is shown in Fig. 8. The bearing fault diagnosis model has three parts: feature extraction, feature selection, and classification. In the feature extraction part, HHT, EA, and VMD are used to extract features from the initial signal: HHT extracts 50 features, EA extracts 10 features, and VMD extracts 10 features, for a total of 70 features. In the feature selection part, MSCGA is applied to generate the optimal feature subset. Support vector machine (SVM) and k-nearest neighbor (KNN) are used in the classification part to distinguish healthy motor bearings from faulty motor bearings with different degrees of damage.

IV. THE BEARING FAULT DIAGNOSIS MODEL
The bearing fault diagnosis model consists of three stages: feature extraction, feature selection, and classification. Fig. 9 shows the flowchart of the bearing fault diagnosis model. During the feature extraction stage, three feature extraction methods (HHT, EA, and VMD) are used; MSCGA is applied in the feature selection stage; and two classifiers are used in the classification stage: SVM and KNN. The training dataset is set to 70% of the data and the testing dataset to 30%. SVM is a widely used classifier that is convenient for solving small-sample and nonlinear problems. The basic concept of SVM is to treat each input sample as a point in a high-dimensional space and to use a hyperplane of one lower dimension to separate the points, which is a so-called linear classifier. KNN takes an input sample i, finds which class is most common among the k training samples closest to i, and predicts that class for i. Its advantage is high precision. Both classifiers can complete the data classification efficiently.

V. EXPERIMENTAL VERIFICATION AND RESULTS
In the study, a personal computer (4 CPUs, Intel(R) Xeon(R) CPU E3-1230 v3, 32 GB of RAM) was used. The experimental equipment includes a three-phase squirrel-cage induction motor, a three-phase power supply, a three-purpose meter, and an oscilloscope. The software used for the experiments is MATLAB R2017a. Three different datasets are applied to verify the robustness of motor diagnosis with the proposed method and with traditional methods.

A. PARAMETER SETTING
MSCGA is a wrapper feature selection method, so the model requires a classifier to compute the fitness values. The value computed by the classifier evaluates the classification accuracy obtained on the dataset. In this model, KNN and SVM are used to evaluate the fitness on the dataset. The parameters of the two classifiers are set as follows. In KNN, k is set to 3 and the number of cross-validation folds (kfold) is set to 10. In SVM, kfold is set to 10 and the kernel is the radial basis function. The feature selection methods compared with the proposed method are GA, ACO, and BDE. The parameter settings are shown in Table 3.
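How a wrapper method scores one candidate feature subset can be sketched with a small KNN and a k-fold loop. This is a simplified standard-library illustration, not the MATLAB code used in the experiments. The fold assignment is assumed to be interleaved and unshuffled, an empty feature subset is assumed to score the worst error of 1.0, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cv_error_rate(x, y, mask, k=3, folds=10):
    """Cross-validated KNN error rate using only the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 1.0                                   # assumption: worst score
    reduced = [[row[j] for j in cols] for row in x]
    errors = 0
    for fold in range(folds):
        test_idx = set(range(fold, len(x), folds))   # interleaved folds
        train_x = [r for i, r in enumerate(reduced) if i not in test_idx]
        train_y = [lab for i, lab in enumerate(y) if i not in test_idx]
        for i in test_idx:
            if knn_predict(train_x, train_y, reduced[i], k) != y[i]:
                errors += 1
    return errors / len(x)

# Toy data: feature 0 separates the two classes, feature 1 does not.
x = [[i * 0.1, (i * 7) % 5] for i in range(10)] + \
    [[10 + i * 0.1, (i * 7) % 5] for i in range(10)]
y = [0] * 10 + [1] * 10
error_informative = cv_error_rate(x, y, [1, 0])
error_empty = cv_error_rate(x, y, [0, 0])
```

The returned error rate plays the role of the fitness value, so the feature selection method searches for the mask that minimizes it.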

B. CASE STUDY 1: UCI BENCHMARK DATASETS
1) The UCI benchmark was adopted to test the feature selection ability of the proposed method. Three traditional methods, GA, ACO, and BDE, are compared with the proposed feature selection method.
2) In this case, a total of 11 datasets are used, namely BreastEW, CongressEW, Exactly, HeartEW, Ionosphere, KrVsKpEW, Sonar, SpectEW, Vote, Waveform, and Wine. The data structure information is listed in Table 4. In this phase, three characteristics are compared: the average fitness error, the best fitness performance, and the computation time. The average error and lowest error are presented in Table 5. With the proposed method, eight datasets have the best Average Error compared with GA, ACO, and BDE: BreastEW, CongressEW, Exactly, Ionosphere, KrVsKpEW, Sonar, Waveform, and Wine. Seven datasets have the best Lowest Error: BreastEW, Exactly, Ionosphere, KrVsKpEW, Sonar, SpectEW, and Waveform. Fig. 10 compares the computation times of 100 iterations, in seconds. For all 11 datasets, MSCGA has the shortest computation time. In some datasets, removing repeated offspring even reduces the computation time by 50% compared with GA. These results support the conclusion that the proposed feature selection method performs better than the traditional methods.

C. CASE STUDY 2: CURRENT SIGNAL OF MOTOR FAULT DATASETS MEASURED BY INDUCTION MOTOR
1) The motor fault dataset was adopted to test the classification ability of the bearing fault diagnosis model. The dataset was measured on a three-phase squirrel-cage induction motor with the following specifications: rated voltage 220 V/380 V, rated power 2 hp, rated current 5.58 A/3.23 A, efficiency 83.5%, rated speed 1715 RPM, and 4 poles. The signal is measured under four conditions: healthy, bearing damage, stator fault, and rotor bar damage. A data acquisition device (NI PXI-1033) is used to collect the current data. The detailed status is shown in Table 6. The measurement conditions are as follows: load rate 0%, torque 0 N·m; load rate 25%, torque 2 N·m; load rate 50%, torque 4 N·m; load rate 75%, torque 6 N·m; and load rate 100%, torque 8 N·m. The measurement equipment is shown in Fig. 11, Fig. 12, and Fig. 13. The original signal is measured from the experimental induction motor in the form of a current signal. In this case, the proposed feature selection method is compared with the same three feature selection methods as in case study 1.
2) In this case, the test data are divided into eight categories: one healthy condition, three bearing damage conditions, two stator short-circuit conditions, and two rotor bar bore damage conditions. Table 6 shows the detailed information.
3) In this experiment, the original dataset was combined into a dataset of 70 features by extracting 50, 10, and 10 features with HHT, EA, and VMD, respectively. GA, ACO, BDE, and MSCGA are used for feature selection. Each experiment was repeated 30 times, and the average computation time is shown in Fig. 14. It can be seen that the average time for GA is the longest and that, among the four feature selection methods, the average time for MSCGA is the shortest.
4) The average classification results of the 30 experiments are recorded in Fig. 15. Although the best performance cannot be obtained by averaging 30 experiments, the best results can be obtained with each classifier.
5) The original 70 features were selected by different feature selection methods, as shown in Table 7, with 35 features selected by GA, 32 features selected by ACO, 43 features selected by BDE, and 38 features selected by MSCGA. BDE retained the most features, but its final accuracy was not the highest, which suggests that BDE has the weakest filtering ability among the four feature selection methods used and has difficulty retaining only the valid features. BDE is therefore not a good feature selection method for this dataset.
… precision. Recall is defined as follows:
$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \tag{19}$$
Recall is calculated as the ratio of correctly predicted positive samples to all samples that are actually positive.
… record the vibration signal of load 0 ∼ 3 hp (1720 ∼ 1797 RPM). The CWRU dataset was adopted to test the classification ability of the bearing fault diagnosis model. In this case, the proposed feature selection method is also compared with the three feature selection methods in case study 1.
2) In this case, the test data were divided into ten categories: one healthy condition, three inner race faults, three outer race faults, and three ball faults. The inner race, outer race, and ball data were each classified by fault diameter: 0.0007 inches, 0.0014 inches, and 0.0028 inches.
3) Same as case study 2. The features selected by each feature selection method from the original dataset of 70 features, and their numbers, are organized in the … Fig. 18.
5) Table 8 … Fig. 19.
Finally, the proposed model selects features more effectively, which improves the final accuracy. In addition, the computation time is significantly reduced while the classification results improve, which in turn improves working efficiency. The improvement in precision and recall values also indicates that MSCGA is more capable of feature selection than GA.
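The precision and recall values referred to above can be computed from true and predicted labels as in the following sketch. It uses the standard definitions, with precision taken as true positives divided by all predicted positives, which is assumed to match the paper's definition. The function name and the example labels are hypothetical, and for a multi-class problem such as the fault categories here the function would be applied once per class.

```python
def precision_recall(y_true, y_pred, positive):
    """Return (precision, recall) for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0   # no predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0      # no actual positives
    return precision, recall

# Toy labels: 1 marks a faulty bearing, 0 a healthy one.
precision, recall = precision_recall([1, 1, 1, 0, 0], [1, 1, 0, 1, 0], positive=1)
```

In this toy case there are two true positives, one false positive, and one false negative, so both values equal 2/3.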