A Hybrid Fault Diagnosis Approach for Rotating Machinery with the Fusion of Entropy-Based Feature Extraction and SVM Optimized by a Chaos Quantum Sine Cosine Algorithm

Rotating machinery is crucial equipment in industrial manufacturing, and its health status affects production efficiency and device safety. Hence, it is of great significance to diagnose rotating machinery faults, which helps to guarantee running stability and to plan maintenance, thus promoting production efficiency and economic benefits. For this purpose, a hybrid fault diagnosis model with entropy-based feature extraction and SVM optimized by a chaos quantum sine cosine algorithm (CQSCA) is developed in this research. Firstly, the state-of-the-art variational mode decomposition (VMD) is utilized to decompose the vibration signals into sets of components, during which the preset parameter K is determined with the central frequency observation method. Subsequently, the permutation entropy values of all components are computed to constitute the feature vectors corresponding to the different kinds of signals. Later, the newly developed sine cosine algorithm (SCA) is improved with chaotic initialization by a Duffing system and with a quantum technique, and is employed to optimize the support vector machine (SVM) model with which the fault pattern is recognized. The effectiveness of the SVM optimized with CQSCA was demonstrated in pattern recognition experiments. Finally, the proposed hybrid fault diagnosis approach was employed for engineering applications as well as contrastive analysis. The comparative results show that the proposed method achieved the best training accuracy of 99.5% and the best testing accuracy of 97.89%. Furthermore, the boxplots of the different diagnosis methods show that the stability and precision of the proposed method are superior to those of the others.


Introduction
Rotating machinery plays a significant role in modern industrial fields, and its health status greatly influences the production efficiency and product quality. Besides, once an unexpected or sudden fault occurs, it could result in large economic losses. Hence, it is of great practical significance to diagnose rotating machine faults [1]. During the running process of various kinds of rotating machines, rolling element bearings are the most widely used parts. However, owing to their structural properties and recognition ability when the number of samples is large. However, all three methods mentioned above are based on empirical risk minimization, i.e., abundant samples are needed to achieve high accuracy. In contrast, SVM, proposed by Vapnik [35] based on structural risk minimization, has certain advantages in dealing with small samples and linearly inseparable problems. However, the pattern recognition performance of SVM is influenced by the parameters. To solve the problem, different optimization methods, such as particle swarm optimization [36], antlion algorithm [37], fruit fly algorithm [38] and ant colony algorithm [39] were proposed and employed to choose the best parameters for SVM.
Sine cosine algorithm (SCA) is a newly developed optimization algorithm proposed by Mirjalili [40] that has shown good performance in many studies [41,42]. To achieve accurate fault diagnosis for rotating machinery, a hybrid fault diagnosis model with entropy-based feature extraction and SVM optimized by chaos quantum sine cosine algorithm (CQSCA) is developed in this research. Firstly, the adaptive VMD is employed to decompose the vibration signals into a set of components, during which stage the preset parameter K of VMD is ascertained with central frequency observation method. Then, the permutation entropy values of all the sub-signals are calculated, thus to construct the feature vector of the given fault sample. Subsequently, an improved SVM model with full fusion of chaotic initialization, quantum technique and SCA for parameter selection, whose effectiveness has been proved in pattern recognition experiments, is presented to classify different fault types. Finally, the superiority of the proposed method was confirmed through engineering applications as well as comparative analysis.
The remainder of this paper is organized as follows: Section 2 presents the base theory of VMD and permutation entropy. Section 3 introduces the improved pattern recognition method based on SVM optimized with chaos quantum sine cosine algorithm and validates the effectiveness with pattern recognition experiment. Section 4 delineates the procedures of the proposed hybrid fault diagnosis model with entropy-based feature extraction and SVM optimized by CQSCA. Section 5 illustrates the superiority of the proposed method with engineering application and comparative analysis. The conclusions are summarized in Section 6.

Variational Mode Decomposition
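VMD [12] is a newly developed time-frequency signal processing technique, which is adaptive and quasi-orthogonal. With VMD, a given signal $f(t)$ can be decomposed into a set of intrinsic mode functions (IMFs) which are all band-limited. The decomposition process can be realized by solving a constrained variational problem formulated as follows [12]:
$$\min_{\{m_k\},\{w_k\}} \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * m_k(t) \right] e^{-j w_k t} \right\|_2^2 \quad \text{s.t.} \quad \sum_{k=1}^{K} m_k(t) = f(t) \tag{1}$$
where $K$ is the total number of IMFs, $m_k$ is the time domain signal of the $k$-th IMF, $w_k$ is the center pulsation of the $k$-th IMF, $\delta(t)$ is the Dirac distribution and $*$ denotes convolution.
To obtain the solution of Equation (1), a quadratic penalty term and a Lagrangian multiplier are introduced, and the augmented problem is provided as follows:
$$L(\{m_k\}, \{w_k\}, \beta) = \alpha \sum_{k} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * m_k(t) \right] e^{-j w_k t} \right\|_2^2 + \left\| f(t) - \sum_{k} m_k(t) \right\|_2^2 + \left\langle \beta(t), f(t) - \sum_{k} m_k(t) \right\rangle \tag{2}$$
where $\alpha$ and $\beta(t)$ are respectively the balancing parameter and the Lagrange multiplier.
Then the alternate direction method of multipliers (ADMM, [43]), which is based on the ideas of Lagrange theory and dual decomposition, is applied to deduce the solution of Equation (2) by optimizing $m_k$, $w_k$ and $\beta$ alternately. The optimization problems of $m_k$ and $w_k$ are respectively presented as Equations (3) and (4):
$$m_k^{n+1} = \arg\min_{m_k} \left\{ \alpha \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * m_k(t) \right] e^{-j w_k t} \right\|_2^2 + \left\| f(t) - \sum_{i} m_i(t) + \frac{\beta(t)}{2} \right\|_2^2 \right\} \tag{3}$$
$$w_k^{n+1} = \arg\min_{w_k} \left\{ \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * m_k(t) \right] e^{-j w_k t} \right\|_2^2 \right\} \tag{4}$$
The iterative formulas of problems (3) and (4) are inferred in the frequency domain as follows, where a hat denotes the Fourier transform:
$$\hat{m}_k^{n+1}(w) = \frac{\hat{f}(w) - \sum_{i \neq k} \hat{m}_i(w) + \hat{\beta}(w)/2}{1 + 2\alpha (w - w_k)^2} \tag{5}$$
$$w_k^{n+1} = \frac{\int_0^{\infty} w \left| \hat{m}_k^{n+1}(w) \right|^2 dw}{\int_0^{\infty} \left| \hat{m}_k^{n+1}(w) \right|^2 dw} \tag{6}$$
The Lagrangian multiplier is renewed according to Equation (7):
$$\hat{\beta}^{n+1}(w) = \hat{\beta}^{n}(w) + \tau \left( \hat{f}(w) - \sum_{k} \hat{m}_k^{n+1}(w) \right) \tag{7}$$
where $\tau$ is the update parameter. The prime procedures of VMD can be summarized as follows:
Step 1: Initialize $m_k^1$, $w_k^1$, $\beta^1$ and set $n = 1$;
Step 2: Update $m_k$ and $w_k$ based on Equations (5) and (6);
Step 3: Update $\beta$ based on Equation (7) and set $n = n + 1$;
Step 4: If $\sum_k \left\| m_k^{n+1} - m_k^{n} \right\|_2^2 > \varepsilon$, return to Step 2; otherwise stop iterating.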

Permutation Entropy
Permutation entropy (PE) does not consider the specific values of the data. Instead, it is based on the comparison and reconstruction of adjacent data, which makes it simple to compute and robust to interference [44]. Given a time series $X = [x_1, x_2, \ldots, x_N]$, the phase space is first reconstructed within the PE algorithm as follows:
$$X_i = \left[ x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau} \right], \quad i = 1, 2, \ldots, N-(m-1)\tau$$
Gathering the above vectors, an overall matrix can be obtained:
$$\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_{N-(m-1)\tau} \end{bmatrix} = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(m-1)\tau} \\ x_2 & x_{2+\tau} & \cdots & x_{2+(m-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_{N-(m-1)\tau} & x_{N-(m-2)\tau} & \cdots & x_N \end{bmatrix}$$
where $m$ is the embedded dimension, an integer in the range [3, 7], while $\tau$ is the time delay, generally set to 1. The elements of each $X_i$ can be sorted in ascending order:
$$x_{i+(j_1-1)\tau} \le x_{i+(j_2-1)\tau} \le \cdots \le x_{i+(j_m-1)\tau}$$
where $j_1, j_2, \ldots, j_m$ are the column indexes of the elements before ordering. If two elements are equal, i.e., $x_{i+(j_p-1)\tau} = x_{i+(j_q-1)\tau}$, they are sorted by their original order, which means that for $j_p < j_q$:
$$x_{i+(j_p-1)\tau} \le x_{i+(j_q-1)\tau}$$
Therefore, a symbol sequence can be obtained for any $X_i$:
$$S(l) = [j_1, j_2, \ldots, j_m], \quad l = 1, 2, \ldots, k, \quad k \le m!$$
The $m$-dimensional phase space maps to a total of $m!$ different symbol sequences $[j_1, j_2, \ldots, j_m]$, of which the symbol sequence $S(l)$ is one. Accordingly, the probability $p_l$ of each symbol sequence can be calculated, subject to $\sum_{l=1}^{m!} p_l = 1$:
$$p_l = \frac{n_l}{N-(m-1)\tau}$$
where $n_l$ is the number of occurrences of the symbol sequence $S(l)$. According to the above formula, the PE of the time series $X$ can be defined in the form of information entropy:
$$H_P(m) = -\sum_{l=1}^{m!} p_l \ln p_l$$
When $p_l = 1/m!$, $H_P(m)$ reaches its maximum value $\ln(m!)$. For convenience, $H_P(m)$ is usually normalized by $\ln(m!)$:
$$H_P = \frac{H_P(m)}{\ln(m!)}$$
where $H_P$ is in the range [0, 1] and represents the randomness of the time series $X$, i.e., the larger $H_P$ is, the more random the time series is.
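The PE computation described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' implementation; the function name `permutation_entropy` and its arguments are chosen here for illustration. Ties are resolved by original order, matching the tie rule stated above.

```python
import math
from collections import Counter


def permutation_entropy(x, m=3, tau=1, normalize=True):
    """Permutation entropy of sequence x (embedding dimension m, delay tau)."""
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen m and tau")
    counts = Counter()
    for i in range(n_vectors):
        window = [x[i + j * tau] for j in range(m)]
        # Python's sort is stable, so equal values keep their original order.
        pattern = tuple(sorted(range(m), key=lambda j: window[j]))
        counts[pattern] += 1
    # Shannon entropy of the ordinal-pattern distribution (natural log).
    h = -sum((c / n_vectors) * math.log(c / n_vectors) for c in counts.values())
    return h / math.log(math.factorial(m)) if normalize else h
```

With `m=3` and `tau=1`, the settings used later in the paper, calling this function on each IMF of a sample would yield one entry of that sample's feature vector.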

Support Vector Machine
SVM is a data mining method based on structural risk minimization and statistical learning theory [35], whose core idea is to map the sample space to a high-dimensional feature space through a nonlinear kernel transformation. By constructing the optimal hyper-plane in the high-dimensional feature space, the nonlinear classification in sample space is realized by solving a linear classification problem in feature space, which enables SVM to deal successfully with nonlinear pattern recognition problems. Compared with traditional learning machines, SVM adapts well to limited samples and is not sensitive to data dimension.
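Given a data set $\{(x_i, y_i), i = 1, 2, \ldots, n\}$ from two classes, the most important task for achieving pattern recognition with SVM is the construction of a classification hyper-plane. The hyper-plane can be formulated as:
$$w \cdot x + b = 0 \tag{15}$$
where $w$ and $b$ represent the weight vector and bias term, respectively, while $w \cdot x$ is the inner product of $w$ and $x$. For a binary classification issue with labels −1 and 1, all the samples should meet the condition defined in Equation (16), so that the two types of samples can be completely separated:
$$y_i (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, n \tag{16}$$
To solve linearly non-separable problems, slack variables $\xi_i$ and a penalty factor $C$ are introduced, and the generalized hyper-plane can be deduced by solving the following optimization problem:
$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0 \tag{17}$$
When mapping the samples into a higher-dimensional space, the radial basis kernel function is commonly employed, which is defined as:
$$K(x_i, x_j) = \exp\left( -g \|x_i - x_j\|^2 \right) \tag{18}$$
where $g$ is the kernel parameter.
In accordance with Lagrange theory and the duality principle, and with inner products replaced by the kernel function, the dual form of optimization problem (17) can be reformulated as:
$$\max_{\mu} \; \sum_{i=1}^{n} \mu_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \mu_i \mu_j y_i y_j K(x_i, x_j) \quad \text{s.t.} \quad \sum_{i=1}^{n} \mu_i y_i = 0, \; 0 \le \mu_i \le C \tag{19}$$
where $\mu_i$ are Lagrange multipliers.
With the Lagrange multipliers acquired from the solution of the above dual problem, the decision function of the original problem can be ascertained:
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} \mu_i y_i K(x_i, x) + b \right) \tag{20}$$

Chaos Quantum Sine Cosine Algorithm (CQSCA)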

Sine Cosine Algorithm
The optimization procedure of SCA includes two phases [40]: exploration and exploitation. During the exploration phase, the algorithm is firstly initialized with a collection of random solutions to start the optimization process. With the stochastic searching, SCA can locate feasible solutions quickly in the searching space. Meanwhile, in the exploitation phase, the random solutions change gradually and the changing rate is obviously lower than that during the exploration phase, which contributes to a better searching in current space.
The positions of $m$ individuals are randomly generated in the initialization phase of SCA. Suppose that each solution of the optimization problem corresponds to an individual's position in the searching space, and that the position of the $i$-th ($i = 1, 2, \ldots, m$) individual is represented by $X_i = (X_{i1}, X_{i2}, \ldots, X_{iD})^T$, where $D$ is the individual's dimension. The best value of individual $i$ is $P_i = (P_{i1}, P_{i2}, \ldots, P_{iD})^T$. The position of individual $i$ is updated in each iteration by the following formulas [40]:
$$X_i^{k+1} = X_i^{k} + r_1 \sin(r_2) \left| r_3 P_i^{k} - X_i^{k} \right|, \qquad X_i^{k+1} = X_i^{k} + r_1 \cos(r_2) \left| r_3 P_i^{k} - X_i^{k} \right| \tag{21}$$
where $X_i^k$ is the position of individual $i$ in the $k$-th iteration. The above equations can be combined as follows:
$$X_i^{k+1} = \begin{cases} X_i^{k} + r_1 \sin(r_2) \left| r_3 P_i^{k} - X_i^{k} \right|, & r_4 < 0.5 \\ X_i^{k} + r_1 \cos(r_2) \left| r_3 P_i^{k} - X_i^{k} \right|, & r_4 \ge 0.5 \end{cases} \tag{22}$$
As shown in the above equations, the update involves four main parameters [40]: $r_1$, $r_2$, $r_3$ and $r_4$. The parameter $r_1$ dictates the movement direction of individual $i$ for the next iteration. The parameter $r_2$ is a random number in $[0, 2\pi]$, which defines how far the movement should be towards or outwards from the destination. The parameter $r_3$ is a random weight in [0, 2] that stochastically emphasizes ($r_3 > 1$) or deemphasizes ($r_3 < 1$) the effect of the individual's best value during the movement. Lastly, the parameter $r_4$ is a random number in [0, 1] that switches with equal probability between the two components: when $r_4 < 0.5$, the position of individual $i$ is updated by the sine component; otherwise the update switches to the cosine component.
During the searching process, SCA should balance the exploration and exploitation phases and finally find the global optimum in the searching space. Accordingly, the amplitudes of the sine and cosine functions are adaptively changed by adjusting $r_1$ in the updating equation [40]:
$$r_1 = a - t \frac{a}{T} \tag{23}$$
where $T$, $t$ and $a$ are respectively the maximum number of iterations, the current number of iterations and a constant.
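To make the update rule concrete, the following is a minimal sketch of SCA for minimizing a function over box bounds, using only the Python standard library. It rests on several assumptions that are not taken from the paper: the destination point is a single best-so-far solution shared by all individuals (as in the original SCA formulation), positions are clipped to the bounds, and the name `sca_minimize` and its defaults are illustrative.

```python
import math
import random


def sca_minimize(f, bounds, n_agents=30, n_iter=100, a=2.0, seed=0):
    """Minimize f over box bounds with a basic sine cosine algorithm."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_agents)]
    best = min(pop, key=f)[:]          # assumed shared destination point
    best_val = f(best)
    for t in range(n_iter):
        r1 = a - t * a / n_iter        # amplitude decreases linearly with t
        for x in pop:
            for j, (lo, hi) in enumerate(bounds):
                r2 = rng.uniform(0.0, 2.0 * math.pi)
                r3 = rng.uniform(0.0, 2.0)
                r4 = rng.random()
                wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
                x[j] += r1 * wave * abs(r3 * best[j] - x[j])
                x[j] = min(max(x[j], lo), hi)   # keep inside the search range
            val = f(x)
            if val < best_val:
                best, best_val = x[:], val
    return best, best_val
```

For example, `sca_minimize(lambda v: sum(c * c for c in v), [(-5.0, 5.0)] * 2)` searches for the minimum of a two-dimensional sphere function. For SVM tuning the objective would instead be the negative cross-validation accuracy.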

Quantum Sine Cosine Algorithm
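QSCA is the improved version of SCA with quantum evolution [45]. In quantum description, the smallest unit of information is a qubit, and any state of a qubit can be represented as a linear combination of the basic states, called a superposition $|\varphi\rangle$. The qubit can also be expressed by its probability amplitude $|\varphi\rangle = [\cos(\theta), \sin(\theta)]^T$, where $\theta$ is the phase of the qubit. In QSCA, the probability amplitude of the qubit is directly used as the encoding of the solution vector to avoid the randomness of the transformation [45]. The coding pattern is:
$$p_i = \begin{bmatrix} \cos(\theta_{i1}) & \cos(\theta_{i2}) & \cdots & \cos(\theta_{iD}) \\ \sin(\theta_{i1}) & \sin(\theta_{i2}) & \cdots & \sin(\theta_{iD}) \end{bmatrix} \tag{24}$$
where $\theta_{ij} = 2\pi \times rand$, $rand$ is a random number in [0, 1], $i = 1, 2, \ldots, m$, $m$ is the population size, $j = 1, 2, \ldots, D$, and $D$ is the spatial dimension. As shown in formula (24), each individual occupies two positions in the space:
$$p_{ic} = \left( \cos(\theta_{i1}), \cos(\theta_{i2}), \ldots, \cos(\theta_{iD}) \right), \qquad p_{is} = \left( \sin(\theta_{i1}), \sin(\theta_{i2}), \ldots, \sin(\theta_{iD}) \right) \tag{25}$$
For convenience, $p_{ic}$ is called the cosine position, while $p_{is}$ is called the sine position. Since the individual's traversal scope is [−1, 1] in every dimension, the two positions occupied by each individual need to be mapped to the solution space of the corresponding optimization problem. Each probability amplitude of an individual qubit corresponds to an optimization variable in the solution space. With the $j$-th qubit of individual $i$ being $[\cos(\theta_{ij}), \sin(\theta_{ij})]^T$ and the range of the $j$-th optimization variable denoted by $[a_j, b_j]$, the corresponding solution space variables are [45]:
$$X_{ic}^{j} = \frac{1}{2} \left[ b_j \left( 1 + \cos(\theta_{ij}) \right) + a_j \left( 1 - \cos(\theta_{ij}) \right) \right], \qquad X_{is}^{j} = \frac{1}{2} \left[ b_j \left( 1 + \sin(\theta_{ij}) \right) + a_j \left( 1 - \sin(\theta_{ij}) \right) \right] \tag{26}$$
During the status updating stage for all individuals, the movement of an individual's position is implemented by a quantum rotation gate. The individual's position moves according to the following rules:
(1) The phase increment of each qubit of individual $i$, denoted $\Delta\theta_{ij}(t+1)$, is updated according to Equation (27).
(2) The probability amplitude of each qubit of individual $i$ is updated with the rotation gate:
$$\begin{bmatrix} \cos(\theta_{ij}(t+1)) \\ \sin(\theta_{ij}(t+1)) \end{bmatrix} = \begin{bmatrix} \cos(\Delta\theta_{ij}(t+1)) & -\sin(\Delta\theta_{ij}(t+1)) \\ \sin(\Delta\theta_{ij}(t+1)) & \cos(\Delta\theta_{ij}(t+1)) \end{bmatrix} \begin{bmatrix} \cos(\theta_{ij}(t)) \\ \sin(\theta_{ij}(t)) \end{bmatrix} = \begin{bmatrix} \cos(\theta_{ij}(t) + \Delta\theta_{ij}(t+1)) \\ \sin(\theta_{ij}(t) + \Delta\theta_{ij}(t+1)) \end{bmatrix} \tag{28}$$
After the above two updating processes, the two new positions are formulated as:
$$p_{ic}(t+1) = \left( \cos(\theta_{i1}(t) + \Delta\theta_{i1}(t+1)), \ldots, \cos(\theta_{iD}(t) + \Delta\theta_{iD}(t+1)) \right), \qquad p_{is}(t+1) = \left( \sin(\theta_{i1}(t) + \Delta\theta_{i1}(t+1)), \ldots, \sin(\theta_{iD}(t) + \Delta\theta_{iD}(t+1)) \right) \tag{29}$$
To increase the diversity of the population and avoid local optima, a mutation operator with the quantum NOT gate is introduced in Reference [45]. Firstly, a random number $rand_i$ within (0, 1) is created for each individual and compared with the given mutation probability $p_m$. Then, a total of $0.5D$ qubits are randomly selected from each individual, whose probability amplitudes are exchanged by the quantum NOT gate if $rand_i < p_m$; otherwise, the phases remain unchanged:
$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \cos(\theta_{ij}) \\ \sin(\theta_{ij}) \end{bmatrix} = \begin{bmatrix} \sin(\theta_{ij}) \\ \cos(\theta_{ij}) \end{bmatrix} = \begin{bmatrix} \cos(\pi/2 - \theta_{ij}) \\ \sin(\pi/2 - \theta_{ij}) \end{bmatrix} \tag{30}$$
The procedures of QSCA are detailed as follows [45]:
Step 1: Initialize the population and set relevant parameters according to Equation (24);
Step 2: Transform unit space to solution space on the basis of Equation (26), so as to calculate the fitness of each individual;
Step 3: Update each individual's status with Equations (27) and (28);
Step 4: Implement the mutation process based on the given mutation probability according to Equation (30);
Step 5: Loop Steps 2-4 until the convergence condition is met or the maximum number of iterations is reached.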
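The qubit encoding, the mapping from unit space to solution space, and the NOT-gate mutation can be illustrated with a short standard-library Python sketch. It is written under stated assumptions rather than taken from [45]: the linear map sends an amplitude of −1 to the lower bound and +1 to the upper bound of each variable, the number of mutated qubits is taken as `D // 2`, and the function names are hypothetical. The phase-increment update itself is not shown.

```python
import math
import random


def init_phases(m, dim, rng):
    """Random qubit phases theta_ij = 2*pi*rand for m individuals."""
    return [[2.0 * math.pi * rng.random() for _ in range(dim)] for _ in range(m)]


def decode(theta, bounds):
    """Map cosine and sine amplitudes in [-1, 1] linearly onto [lo, hi]."""
    cos_pos = [0.5 * (hi * (1 + math.cos(t)) + lo * (1 - math.cos(t)))
               for t, (lo, hi) in zip(theta, bounds)]
    sin_pos = [0.5 * (hi * (1 + math.sin(t)) + lo * (1 - math.sin(t)))
               for t, (lo, hi) in zip(theta, bounds)]
    return cos_pos, sin_pos


def not_gate_mutation(theta, p_m, rng):
    """With probability p_m, swap cos/sin amplitudes of D // 2 random qubits."""
    mutated = list(theta)
    if rng.random() < p_m:
        for j in rng.sample(range(len(theta)), len(theta) // 2):
            # NOT gate: (cos t, sin t) -> (sin t, cos t), i.e. t -> pi/2 - t
            mutated[j] = math.pi / 2 - theta[j]
    return mutated
```

Because each individual decodes to two candidate solutions (the cosine and the sine position), a fitness evaluation would typically score both and keep the better one; that choice is left open here.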

Chaos Quantum Sine Cosine Algorithm
Chaos is a seemingly irregular and random phenomenon that arises within nonlinear systems from deterministic rules. It appears to be disorganized but follows certain laws of motion, reflecting the complexity, randomness, and disorder within the systems. Chaotic variables have the features of pseudo-randomness and ergodicity, i.e., they traverse all points in a certain scope of the solution space without repetition. The basic idea of searching with chaotic variables is to make full use of this ergodicity: chaotic variables are created with a chaotic map and transformed to the range of the variables to be optimized, and then the optimal parameters are searched [46]. With chaotic variables, it is more likely that the global optimum will be found. To promote the searching performance of QSCA, a Duffing system [47] is employed to create the chaotic variables. The dynamical equation of the Duffing system is given by:
$$\ddot{x} + \gamma \dot{x} + \alpha x + \beta x^3 = A \cos(\omega t) \tag{31}$$
where the coefficient $\gamma$ is the damping degree, $\alpha$ is the toughness degree, $\beta$ is the nonlinearity of power, $A$ is the amplitude of the driving force, and $\omega$ is the circular frequency of the driving force. The differential form of Equation (31) can be obtained by transformation:
$$\dot{x} = y, \qquad \dot{y} = -\gamma y - \alpha x - \beta x^3 + A \cos(\omega t) \tag{32}$$
The coefficients of the Duffing system except the driving force $A$ are chosen as $\gamma = 0.1$, $\alpha = 1$, $\beta = 0.25$, $\omega = 2$. Given the initial values $x(0)$ and $y(0)$, the system's status evolves gradually as the value of the driving force $A$ changes. When the dynamic behavior of the Duffing system is chaotic, the chaotic variables $x$ and $y$ traverse the points in a certain scope. Then, some of the traversed points are selected at a certain interval, after which a linear transformation from the chaotic variable space to the solution space is executed, so as to produce the initial solutions $X_i$, $i = 1, 2, \ldots, m$ of QSCA.
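The initialization idea can be sketched as follows: integrate the first-order form of a driven Duffing oscillator, sample the trajectory at a fixed interval, and rescale the samples linearly to [0, 1]. This standard-library Python sketch makes several assumptions that the text does not fix: the sign convention of the restoring force (all terms on the left-hand side taken as positive), the driving amplitude `A`, the initial state, the fourth-order Runge-Kutta step size, the number of discarded transient steps and the sampling stride. Whether the motion is actually chaotic for a particular `A` would need to be checked separately, for example from a phase portrait.

```python
import math


def duffing_init(n_points, A=1.0, gamma=0.1, alpha=1.0, beta=0.25, omega=2.0,
                 x0=0.0, y0=0.0, dt=0.01, n_skip=2000, stride=50):
    """Sample x(t) of a driven Duffing oscillator and rescale it to [0, 1]."""

    def deriv(t, x, y):
        # dx/dt = y, dy/dt = -gamma*y - alpha*x - beta*x^3 + A*cos(omega*t)
        return y, -gamma * y - alpha * x - beta * x ** 3 + A * math.cos(omega * t)

    t, x, y = 0.0, x0, y0
    samples, step = [], 0
    while len(samples) < n_points:
        # One classical fourth-order Runge-Kutta step.
        k1x, k1y = deriv(t, x, y)
        k2x, k2y = deriv(t + dt / 2, x + dt / 2 * k1x, y + dt / 2 * k1y)
        k3x, k3y = deriv(t + dt / 2, x + dt / 2 * k2x, y + dt / 2 * k2y)
        k4x, k4y = deriv(t + dt, x + dt * k3x, y + dt * k3y)
        x += dt / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        y += dt / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        t += dt
        step += 1
        # Discard the first n_skip steps, then keep every stride-th point.
        if step > n_skip and (step - n_skip) % stride == 0:
            samples.append(x)
    lo, hi = min(samples), max(samples)
    if hi == lo:
        raise ValueError("trajectory is constant; cannot rescale")
    return [(s - lo) / (hi - lo) for s in samples]
```

The values returned for `n_points = m * D` could then be reshaped into `m` initial individuals of dimension `D`; the same procedure applies to the variable `y` if both chaotic variables are used.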

SVM Optimized by CQSCA
The main procedures of the SVM optimized with the proposed chaos quantum sine cosine algorithm (CQSCA) are as follows:
Step 1: Create chaotic variables by a Duffing system based on Equation (31) and transform them to the range of [0, 1];
Step 2: Initialize the population with the processed chaotic variables;
Step 3: Encode the qubits and transform unit space to solution space;
Step 4: Calculate the fitness of each individual, i.e., the cross-validation accuracy of SVM;
Step 5: Update each individual's status with Equations (27) and (28);
Step 6: Implement the mutation process based on the given probability according to Equation (30);
Step 7: Loop Steps 3-6 until the convergence condition is met or the maximum number of iterations is reached;
Step 8: Choose the C and g that yield the maximal cross-validation accuracy as the optimal parameters;
Step 9: Train the optimal SVM model with the training set;
Step 10: Recognize the testing set.
The flowchart of SVM optimized by CQSCA is shown in Figure 1.

Pattern Recognition Experiments
To estimate the performance of the proposed method, three standard UCI datasets [48], namely wine, iris and heart, were selected for the pattern recognition experiment. The basic information of the datasets is shown in Table 1. All attributes of the datasets were normalized to the range [0, 1]. Five-fold cross-validation was utilized to search for the optimal parameters C and g, which means that each of the three datasets was randomly divided into five subsets, and each time one subset was selected as testing data while the other four served as training data. The searching ranges of C and g were both $[2^{-10}, 2^{10}]$. The numbers of individuals and iterations were set as 30 and 100, respectively. The constant a for changing the amplitudes of the sine and cosine functions was set as 2. The mutation probability $p_m$ was set as 0.04.
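The five-fold cross-validation fitness and the parameter range can be sketched in standard-library Python. Two points are assumptions rather than details from the paper: the search variable in [0, 1] is mapped to $[2^{-10}, 2^{10}]$ on a base-2 logarithmic scale, and the classifier is passed in as a generic callback (`fit_predict`, a hypothetical name) so that any SVM implementation could be plugged in.

```python
import random


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for a shuffled k-fold partition."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def cv_accuracy(fit_predict, X, y, k=5, seed=0):
    """Pooled k-fold accuracy; fit_predict(X_tr, y_tr, X_te) returns labels."""
    correct = 0
    for train, test in k_fold_splits(len(X), k, seed):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(X)


def decode_log2(u, lo_exp=-10, hi_exp=10):
    """Map u in [0, 1] to 2**lo_exp .. 2**hi_exp on a log scale (assumed)."""
    return 2.0 ** (lo_exp + (hi_exp - lo_exp) * u)
```

In an optimizer loop, the fitness of a candidate `(u_C, u_g)` would be `cv_accuracy` evaluated with an RBF-kernel SVM using `C = decode_log2(u_C)` and `g = decode_log2(u_g)`.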
For comparison with the proposed method (SVM-CQSCA), SVM optimized by PSO (PSO-SVM) and SVM optimized by SCA (SCA-SVM) were employed. The searching ranges of the parameters C and g for SCA-SVM and PSO-SVM were the same as those configured for SVM-CQSCA. The constant a in SCA-SVM was set as 2. To measure the performance of all methods reliably, the experiment was repeated ten times. In each run, the best C and g were determined based on the maximal cross-validation accuracy, and then the SVM model was trained and applied to classify all the data.
The experimental results are shown in Table 2, where the cross-validation accuracy and classification accuracy both denote the average over all runs. The parameters C and g correspond to the best cross-validation accuracy. Meanwhile, the deviation range about the mean value is employed for error analysis. Additionally, boxplots are employed in Figure 2 to show the performance of the different methods visually. As the results show, the proposed method achieves better classification performance than the other methods by introducing a Duffing system for chaotic initialization and a quantum technique for improving the optimization efficiency.

Hybrid Fault Diagnosis Based on VMD and SVM Optimized by CQSCA
The procedures of the proposed hybrid fault diagnosis approach with entropy-based feature extraction and SVM optimized by chaos quantum sine cosine algorithm (CQSCA) are detailed as follows:
Step 1: Collect the vibration signals;
Step 2: Select the mode number K of VMD through the center frequency observation method;
Step 3: Decompose all fault samples into sets of IMFs with VMD;
Step 4: Calculate the PEs of all IMFs;
Step 5: Construct the fault feature vectors with the PEs for all fault samples;
Step 6: Search the optimal parameters C and g for SVM with the proposed CQSCA;
Step 7: Train SVM with the optimal parameters C and g to obtain the optimized SVM model;
Step 8: Apply the optimal SVM model to recognize different types of faults.
The flowchart of the proposed hybrid fault diagnosis approach is shown in Figure 3.

Data Collection
The experimental data gathered from Bearings Data Center of Case Western Reserve University [49] were employed to validate the capability of the proposed method in this paper. As shown in Figure 4, the experiment device mainly consists of a motor, an accelerometer and a torque sensor/encoder. The bearing is a SKF deep groove ball bearing model 6205-2RS. Accelerometers were placed at the end of the motor housing for data acquisition. The bearing data was collected from the drive end (DE). The inner, outer and ball element diameters of the bearing were 0.9843, 2.0472 and 0.3126 inches respectively, and the number of ball elements is nine. Single point faults were introduced to the test bearings by using electro-discharge machining, simulating the four working states of the rolling bearing: normal state, inner race fault, outer race fault and ball element fault. The fault diameters were 0.007 inches and 0.021 inches with the depth of 0.011 inches. In the experiment, the rotation speed was 1797 rpm under the load of 0 hp and the sample frequency was 12,000 Hz. The samples used in this paper include 7 types, namely normal state, outer race fault, inner race fault and ball fault with diameters of 0.007 inches and 0.021 inches (i.e., each of the three types of faults has two defect sizes). In addition, all data were partitioned into 59 segments containing 1024 sampling points for each type of signals. Details of the experimental data are listed in Table 3.
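The partitioning of each recorded signal into 59 segments of 1024 points can be sketched as below. The sketch assumes consecutive, non-overlapping segments taken from the start of the record, which the text does not state explicitly; the function name `segment_signal` is illustrative.

```python
def segment_signal(signal, seg_len=1024, n_segments=59):
    """Cut a 1-D signal into n_segments consecutive windows of seg_len points."""
    needed = seg_len * n_segments
    if len(signal) < needed:
        raise ValueError(f"need at least {needed} points, got {len(signal)}")
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]
```

At a sampling frequency of 12,000 Hz, each 1024-point segment covers roughly 85 ms, and 59 segments require 60,416 points per signal type.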

Engineering Application
To verify the effectiveness of the proposed VMD-PE-CQSCA-SVM method, EMD and EEMD were used for comparison in the signal decomposition phase. Likewise, PSO and SCA were employed for comparison when optimizing the parameters C and g of SVM. In other words, eight other methods were applied for the contrastive analysis: EMD-PE-PSO-SVM, EMD-PE-SCA-SVM, EMD-PE-CQSCA-SVM, EEMD-PE-PSO-SVM, EEMD-PE-SCA-SVM, EEMD-PE-CQSCA-SVM, VMD-PE-PSO-SVM and VMD-PE-SCA-SVM.
When decomposing the fault samples with VMD, the mode number K needs to be preset. If the value of K is too small, the reduction of non-stationarity of the original signal is limited. On the contrary, when the value of K is too large, the center frequencies of adjacent components will be close to each other, resulting in mode mixing. In our application, a signal measured under an inner race fault with a diameter of 0.007 inches was used to determine the parameter K. The normalized center frequencies of all IMFs for different K are listed in Table 4. As can be seen from Table 4, similar normalized center frequencies appeared when K was set to 5, i.e., excessive decomposition occurred. Hence, the total number of modes was set to 4. The decomposition results of signals from the different working states are shown in Figure 5, from which it can be seen that the original non-stationary signals were decomposed by VMD into four components with different frequency bands. The time domain waveforms in Figure 5 show apparent differences among the decomposition results of the different working states. After decomposing the vibration signals, the PEs were calculated for each component to constitute the fault feature vector. During the calculation of PEs, the embedded dimension m and the time delay τ were set as 3 and 1, respectively. The PEs of five samples from the different types of signals (L0-L6) are listed in Table 5.
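The center frequency observation step is carried out by inspecting a table, but the underlying rule (stop increasing K once two modes have nearly the same center frequency) can be expressed as a small check. The sketch below is a hypothetical automation of that inspection, not the procedure used in the paper: the threshold `min_gap` and the example values are made up for illustration and are not the entries of Table 4.

```python
def select_mode_number(center_freqs_by_k, min_gap=0.05):
    """Return the largest K whose sorted center frequencies stay separated.

    center_freqs_by_k maps each candidate K to the normalized center
    frequencies of its K modes. Scanning stops at the first K for which
    two adjacent frequencies are closer than min_gap (over-decomposition).
    """
    chosen = None
    for k in sorted(center_freqs_by_k):
        freqs = sorted(center_freqs_by_k[k])
        gaps = [b - a for a, b in zip(freqs, freqs[1:])]
        if gaps and min(gaps) < min_gap:
            break
        chosen = k
    return chosen


# Made-up frequencies for illustration only.
example = {
    2: [0.10, 0.30],
    3: [0.10, 0.25, 0.40],
    4: [0.05, 0.20, 0.30, 0.45],
    5: [0.05, 0.20, 0.29, 0.31, 0.45],
}
chosen_k = select_mode_number(example)
```

In this illustrative table the two modes at 0.29 and 0.31 for K = 5 are closer than the threshold, so the check keeps K = 4.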
Among the 59 feature vectors from each operational condition, 40 were randomly selected as training samples, while the other 19 were used for testing. The penalty parameter C and the kernel parameter g of SVM were optimized by the proposed CQSCA with 30 individuals and 100 iterations, where the searching ranges of C and g were both $[2^{-10}, 2^{10}]$. During the optimization process, the fitness value was measured with the five-fold cross-validation accuracy. Then, the SVM model with the selected optimal parameters C and g was trained and employed to recognize the testing samples. To further verify the effectiveness of the proposed method, the experiment was repeated ten times with the training samples selected randomly each time, after which the average accuracy and the corresponding deviation range were calculated to evaluate the performance in both the training and testing phases. Furthermore, the reported optimal (C, g) corresponds to the best training accuracy.
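The per-class random split can be sketched as follows. With seven signal types and 59 feature vectors each, selecting 40 per class gives 7 × 40 = 280 training samples and 7 × 19 = 133 testing samples. The function name and the dictionary layout are illustrative assumptions; a different seed would be used for each of the ten repetitions.

```python
import random


def split_per_class(features_by_class, n_train=40, seed=0):
    """Randomly pick n_train samples per class for training, rest for testing."""
    rng = random.Random(seed)
    train, test = [], []
    for label, feats in features_by_class.items():
        idx = list(range(len(feats)))
        rng.shuffle(idx)
        train += [(feats[i], label) for i in idx[:n_train]]
        test += [(feats[i], label) for i in idx[n_train:]]
    return train, test
```

Each element of the returned lists is a `(feature_vector, label)` pair, so the feature vectors and labels can be separated again before training.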
In the comparative experiments, all components decomposed by EMD and EEMD were employed to calculate the PEs, with the same PE parameter settings as in the proposed method. The optimal parameters C and g in all eight contrastive methods were determined in the same way as in the proposed method, i.e., with 30 individuals and 100 iterations, while the search ranges of C and g were both $[2^{-10}, 2^{10}]$. In addition, the performance was evaluated in the same way as for the proposed method.
The fault diagnosis results and the comparison of accuracies among the different methods are presented in Table 6 and Figure 6. Table 6 shows that the proposed VMD-PE-CQSCA-SVM method achieved the best accuracy in both the training and testing phases, i.e., 99.50% and 97.89%, respectively. Specifically, the comparison of EMD-PE-CQSCA-SVM, EEMD-PE-CQSCA-SVM and VMD-PE-CQSCA-SVM shows that the testing accuracy of the proposed method is 22.70% and 14.88% higher than that of the other two methods, respectively, which indicates that VMD, as a non-stationary signal processing method, can promote the fault representation ability of PE. Furthermore, the contrastive analysis among VMD-PE-PSO-SVM, VMD-PE-SCA-SVM and VMD-PE-CQSCA-SVM shows that the CQSCA-optimized SVM improved the accuracy by 0.52% and 1.57% over the PSO-optimized SVM and the SCA-optimized SVM, respectively, indicating the effectiveness of the proposed CQSCA optimizing strategy. Additionally, boxplots are employed in Figure 7 to show the performance of the different diagnosis methods visually, from which it can be seen that the proposed VMD-PE-CQSCA-SVM method achieves better precision and stability than the other contrastive methods.

Conclusions
In order to enhance the fault diagnosis precision for rotating machinery, a hybrid approach with the fusion of entropy-based feature extraction and SVM optimized by a chaos quantum sine cosine algorithm is proposed in this paper. Firstly, the preset parameter K of VMD is chosen using the central frequency observation method, after which the signals collected under different states are decomposed into series of intrinsic mode functions (IMFs). Subsequently, the permutation entropy values of all IMFs are calculated to assemble the feature vectors of the different fault samples. Finally, an optimized SVM model based on chaotic initialization, a quantum technique and SCA (CQSCA) for parameter selection, whose effectiveness has been confirmed in a recognition experiment, is proposed to achieve pattern recognition for the different kinds of faults. In the engineering application, the proposed VMD-PE-CQSCA-SVM method was successfully employed to recognize different fault samples and was compared with other relevant methods, including EMD-PE-PSO-SVM, EMD-PE-SCA-SVM, EMD-PE-CQSCA-SVM, EEMD-PE-PSO-SVM, EEMD-PE-SCA-SVM, EEMD-PE-CQSCA-SVM, VMD-PE-PSO-SVM and VMD-PE-SCA-SVM. The application results indicate that the proposed method achieves the best performance in both the training stage and the testing stage in terms of the average accuracy over ten randomized experiments. In particular, the testing accuracy of the proposed method is 22.70% and 14.88% higher than that of EMD-PE-CQSCA-SVM and EEMD-PE-CQSCA-SVM, and also 0.52% and 1.57% higher than that of VMD-PE-PSO-SVM and VMD-PE-SCA-SVM. Furthermore, the boxplots of the different diagnosis methods show that the stability and precision of the proposed method are superior to those of the other methods. Thus, the proposed method is a reliable and effective tool for fault diagnosis of rotating machinery.
Author Contributions: W.F. designed the research, J.T. performed the experiments and contributed to paper writing, C.L. provided guidance for this research and participated in the discussion, Z.Z. provided recommendations for this research, Q.L. and T.C. participated in revision process.