Compound Fault Diagnosis of Gearbox Based on RLMD and SSA-PNN

To address the difficulty of classifying gearbox compound faults, a gearbox fault diagnosis method based on a probabilistic neural network (PNN) improved by the sparrow search algorithm (SSA) is proposed. Firstly, the gearbox fault signal is decomposed into a series of product functions (PFs) by robust local mean decomposition (RLMD). Then, the permutation entropy of the PFs that contain most of the fault information is calculated to construct the feature vector, which is input into the SSA-PNN model. The experimental results show that, compared with the traditional fault diagnosis methods based on EMD-BP and EEMD-PNN, the gearbox fault diagnosis method based on RLMD and SSA-PNN has higher diagnosis accuracy.


Introduction
The gearbox is a core component of mechanical equipment, and its running state is closely related to the safe operation of the equipment. In practical engineering applications, gearbox faults often occur as compound (multiple simultaneous) faults, which may cause abnormal operation of the equipment, significantly reduce its service life, and lead to property loss. Therefore, compound fault diagnosis of the gearbox plays an important role in the safe maintenance of mechanical systems. Vibration signal analysis is one of the common methods of gearbox fault diagnosis [1]. Vibration signals can be obtained through a contact sensor installed on the machine shell or base or through an airborne acoustic array sensor. However, under actual working conditions, the background noise is strong, and the fault impact characteristics of the vibration signal are submerged in the noise, making it difficult to identify fault information directly from the original signal. Signal decomposition is one of the effective methods for processing vibration signals. The ensemble empirical mode decomposition (EEMD) method [2] is widely used for feature extraction from fault signals. EEMD adds Gaussian white noise to the signal before applying empirical mode decomposition (EMD) and exploits the dyadic filter bank behavior of EMD to populate the whole time-frequency space, which reduces mode mixing. However, the added noise may not be completely eliminated and causes signal reconstruction error [3]. To alleviate mode mixing and the end effect, Liu proposed the robust local mean decomposition (RLMD) method [4]. Yan [5] reconstructed the product functions (PFs) obtained by applying RLMD to the signal and used the K-means++ clustering method to cluster the fault features. The effectiveness of this method was verified by simulation and experiments.
Among current fault diagnosis methods [6], the BP neural network is the most widely used, but it has several shortcomings, such as a tendency to fall into local extrema because of its dependence on the initial network weights and slow convergence. Compared with the BP neural network, the probabilistic neural network (PNN) [7] converges faster and has higher diagnosis accuracy. Wang [8] used multiscale entropy (MSE) to extract fault features from signals and then input them into a PNN. The results showed that the MSE-PNN model has good fault diagnosis ability. Di [9] used EEMD to decompose signals into multiple intrinsic mode functions (IMFs), took their energy as the feature vector, and input it into a PNN. This method was shown to have high recognition accuracy. However, the smoothing factor of the PNN is usually selected by experience, without a fixed method. Among swarm intelligence optimization algorithms [10][11][12], the sparrow search algorithm (SSA) [13] offers strong search ability and fast convergence [14]. It can therefore be used to select the smoothing factor adaptively so that it reflects the characteristics of the samples as fully as possible. Accordingly, a gearbox fault diagnosis method based on RLMD and a PNN optimized by SSA is proposed in this paper.

Local Mean Decomposition.
The local mean decomposition (LMD) method [15] is an adaptive time-frequency analysis method. Through iterative operations, it decomposes a signal into a series of product functions (PFs), each of which is the product of a frequency-modulated (FM) signal and an envelope signal. For a given original signal $x(t)$, the LMD algorithm proceeds as follows.
Step 1. All local maxima and minima of the signal $x(t)$ are found. The extreme points are denoted by $e_w$, and the corresponding extreme values are denoted by $x(e_w)$, with $w = 1, 2, 3, \ldots$
Step 2. The preliminary local mean $m_0(t)$ and local amplitude $a_0(t)$ are computed according to formulas (1) and (2), and a smoothing algorithm is then applied to $m_0(t)$ and $a_0(t)$ to obtain the smoothed local mean $m(t)$ and local amplitude $a(t)$:
$$m_0(t) = \frac{x(e_w) + x(e_{w+1})}{2}, \quad t \in [e_w, e_{w+1}), \tag{1}$$
$$a_0(t) = \frac{\left|x(e_w) - x(e_{w+1})\right|}{2}, \quad t \in [e_w, e_{w+1}). \tag{2}$$
Step 3. The initial local mean $m_{11}(t)$ is removed from the original signal $x(t)$, and the estimated zero-mean signal $h_{11}(t)$ is obtained as
$$h_{11}(t) = x(t) - m_{11}(t).$$
Step 4. The estimated FM signal $s_{11}(t)$ is obtained by dividing $h_{11}(t)$ by $a_{11}(t)$, that is,
$$s_{11}(t) = \frac{h_{11}(t)}{a_{11}(t)}.$$
If $s_{11}(t)$ does not meet the requirements, that is, it is not a pure FM signal, it is treated as a new signal and Steps 1 to 4 are repeated $p$ times until the condition in formula (5) is satisfied:
$$\lim_{p \to \infty} a_{1p}(t) = 1. \tag{5}$$
Step 5. When formula (5) holds, the FM signal $s_1(t)$ that meets the requirements is given by formula (6), and the envelope signal $a_1(t)$ is given by formula (7). The first product function $PF_1(t)$ is obtained by multiplying $s_1(t)$ by $a_1(t)$:
$$s_1(t) = s_{1p}(t), \tag{6}$$
$$a_1(t) = \prod_{q=1}^{p} a_{1q}(t), \tag{7}$$
$$PF_1(t) = a_1(t)\, s_1(t).$$
Step 6. The residual signal $u_1(t)$ is obtained by subtracting $PF_1(t)$ from the original signal, $u_1(t) = x(t) - PF_1(t)$. Steps 1 to 5 are repeated $Q$ times on the residual until $u_Q(t)$ is a constant or a nonoscillatory function. The original signal can then be represented as the sum of the product functions and the residual component:
$$x(t) = \sum_{k=1}^{Q} PF_k(t) + u_Q(t).$$
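The iteration in Steps 1 to 5 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this paper: the function names, the moving-average smoother, its window heuristic (one third of the longest extremum spacing), the tolerance `tol`, and the treatment of the two endpoints as extrema are all assumptions made for the example.

```python
import numpy as np

def moving_average(x, width):
    """Centered moving average with an odd window and reflection padding."""
    width = max(3, int(width) | 1)            # force an odd window of at least 3
    pad = width // 2
    padded = np.pad(x, pad, mode="reflect")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def local_mean_and_envelope(x):
    """Steps 1-2: piecewise local mean/amplitude between extrema, then smoothing."""
    d = np.diff(x)
    ext = np.where(d[:-1] * d[1:] < 0)[0] + 1          # slope sign changes
    ext = np.concatenate(([0], ext, [len(x) - 1]))     # endpoints as extrema (assumption)
    m0 = np.empty(len(x), dtype=float)
    a0 = np.empty(len(x), dtype=float)
    for lo, hi in zip(ext[:-1], ext[1:]):
        m0[lo:hi + 1] = 0.5 * (x[lo] + x[hi])          # mean of successive extrema
        a0[lo:hi + 1] = 0.5 * abs(x[lo] - x[hi])       # half their difference
    width = np.max(np.diff(ext)) / 3                   # heuristic window (assumption)
    return moving_average(m0, width), moving_average(a0, width)

def extract_pf(x, max_iter=20, tol=1e-2):
    """Steps 3-5: iterate until the envelope estimate is close to 1 everywhere."""
    s = np.asarray(x, dtype=float).copy()
    envelope = np.ones_like(s)
    for _ in range(max_iter):
        m, a = local_mean_and_envelope(s)
        a = np.maximum(a, 1e-12)                       # guard against division by zero
        s = (s - m) / a                                # remove local mean, then demodulate
        envelope *= a                                  # accumulate the product of envelopes
        if np.max(np.abs(a - 1.0)) < tol:
            break
    return envelope * s, envelope, s                   # PF, envelope signal, FM signal
```

Subtracting the returned PF from the input and calling `extract_pf` again on the residual corresponds to Step 6.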

Robust Local Mean Decomposition.
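To overcome the mode mixing and end effect of LMD, Liu proposed RLMD [4]. The improvements are as follows.
Step 1. Boundary condition optimization: the mirror extension algorithm [16] is used to find the points symmetric to the signal with respect to the endpoints at both ends.
Step 2. Envelope estimation: the window size $\lambda^{*}$ of the smoothing algorithm, which otherwise has to be chosen by experience, is determined by a rule based on statistical theory:
$$\lambda^{*} = \mathrm{odd}(\mu_s + 3\delta_s),$$
where $\mathrm{odd}(\cdot)$ returns the odd integer nearest to its input, $\mu_s$ is the center of the step size, and $\delta_s$ is the standard deviation of the step size.
Step 3. Stopping criterion of the iteration: the objective function $f$ is minimized:
$$f = \mathrm{RMS}(z(t)) + \mathrm{EK}(z(t)),$$
where $z(t) = a(t) - 1$ is the zero-baseline envelope signal, and $\mathrm{RMS}(\cdot)$ and $\mathrm{EK}(\cdot)$ are given by formulas (11) and (12):
$$\mathrm{RMS} = \sqrt{\frac{1}{N}\sum_{n=1}^{N} z(n)^2}, \tag{11}$$
$$\mathrm{EK} = \frac{\frac{1}{N}\sum_{n=1}^{N}\left(z(n) - \bar{z}\right)^4}{\left(\frac{1}{N}\sum_{n=1}^{N}\left(z(n) - \bar{z}\right)^2\right)^2} - 3, \tag{12}$$
where $\bar{z}$ is the average of $z(n)$ and $N$ is the number of samples.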
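The two data-driven quantities in Steps 2 and 3 can be illustrated as follows. The sketch assumes that the window size is the nearest odd integer to $\mu_s + 3\delta_s$ and that the objective is the sum of the RMS and the excess kurtosis of $z(t)$, as in the published RLMD formulation. As a simplification, $\mu_s$ and $\delta_s$ are taken here as the plain sample mean and standard deviation of the spacing between successive extrema. The function names are hypothetical.

```python
import numpy as np

def nearest_odd(value):
    """Odd integer closest to value (ties resolve upward)."""
    return int(2 * np.floor(value / 2.0) + 1)

def smoothing_window(extrema_idx):
    """Window size from the statistics of the step size between extrema."""
    steps = np.diff(extrema_idx)
    return nearest_odd(steps.mean() + 3.0 * steps.std())

def stopping_objective(a):
    """RMS plus excess kurtosis of the zero-baseline envelope z = a - 1.

    Assumes a non-constant envelope; a constant one makes the kurtosis undefined.
    """
    z = np.asarray(a, dtype=float) - 1.0
    rms = np.sqrt(np.mean(z ** 2))
    centered = z - z.mean()
    ek = np.mean(centered ** 4) / np.mean(centered ** 2) ** 2 - 3.0
    return rms + ek

# Envelope that oscillates by 10% around 1 over five full periods.
a = 1.0 + 0.1 * np.sin(2 * np.pi * 5 * np.arange(1000) / 1000)
print(smoothing_window([0, 10, 20, 30]), stopping_objective(a))
```

In an iteration loop, the objective would be evaluated after each pass, and the pass with the smallest value would be kept.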

Simulation Experiment and Analysis.
In order to verify the superiority of RLMD, a composite signal is simulated. The sampling frequency is 20480 Hz, the number of sampling points is 4096, and the time domain waveforms of each component and of the composite signal are shown in Figure 1.
The first four PFs and IMFs obtained by applying RLMD and EMD, respectively, to the signal are shown in Figures 2 and 3. Figure 2 shows that PF1 mainly corresponds to the component $x_2(t)$ and PF2 mainly corresponds to the component $x_1(t)$, whereas in Figure 3 all IMFs are affected by mode mixing, so their physical meaning is unclear. This demonstrates the advantage of RLMD over EMD for this signal.

Sparrow Search Algorithm.
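The sparrow search algorithm is a swarm intelligence optimization algorithm that simulates the foraging behavior of a sparrow population. In this algorithm, the population is divided into discoverers and participants.
The population $X$ of $n$ sparrows is expressed by formula (13):
$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix}, \tag{13}$$
where $d$ is the dimension of the parameter to be optimized and $n$ is the number of sparrows. The fitness values of all sparrows can then be expressed as formula (14):
$$F_X = \begin{bmatrix} f([x_{1,1}\ x_{1,2}\ \cdots\ x_{1,d}]) \\ f([x_{2,1}\ x_{2,2}\ \cdots\ x_{2,d}]) \\ \vdots \\ f([x_{n,1}\ x_{n,2}\ \cdots\ x_{n,d}]) \end{bmatrix}. \tag{14}$$
The iterative formula of the discoverer's position is expressed as formula (15):
$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\left(\dfrac{-i}{\alpha \cdot iter_{max}}\right), & R_2 < ST, \\ X_{i,j}^{t} + Q \cdot L, & R_2 \ge ST, \end{cases} \tag{15}$$
where $t$ is the iteration number, $iter_{max}$ is the maximum number of iterations, $\alpha$ is a random number in $(0, 1]$, and $Q$ is a random number that obeys the normal distribution. The index $j$ runs from 1 to the dimension $d$ of the parameter to be optimized; $L$ is a row matrix of length $d$ whose elements are all 1; $R_2$ and $ST$ represent the early warning value and the safety value, respectively. The iterative formula of the participant's position is expressed as formula (16):
$$X_{i,j}^{t+1} = \begin{cases} Q \cdot \exp\left(\dfrac{X_{worst}^{t} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2, \\ X_{P}^{t+1} + \left|X_{i,j}^{t} - X_{P}^{t+1}\right| \cdot A^{+} \cdot L, & \text{otherwise}, \end{cases} \tag{16}$$
where $X_P$ and $X_{worst}$ represent the global best location and the global worst location found by the discoverers so far, respectively. $A$ is a row matrix of length $d$ whose elements are randomly set to 1 or -1, and $A^{+} = A^{T}(AA^{T})^{-1}$.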
In this paper, the sum of the classification error rate of the training set and the classification error rate of the test set is used as the fitness value.
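A compact version of the discoverer and participant updates described above might look like the following. It is a sketch under stated assumptions, not the code used in this paper: it follows the commonly published SSA update rules, and the function name `sparrow_search`, the discoverer ratio of 0.2, the safety value `st=0.8`, and the clipping of positions to the search bounds are illustrative choices. The best solution found so far is tracked separately, so the recorded best fitness never increases.

```python
import numpy as np

def sparrow_search(fitness, lower, upper, n=20, iter_max=50,
                   discoverer_ratio=0.2, st=0.8, seed=0):
    """Minimise `fitness` over the box [lower, upper] with a simplified SSA."""
    rng = np.random.default_rng(seed)
    lower = np.atleast_1d(np.asarray(lower, dtype=float))
    upper = np.atleast_1d(np.asarray(upper, dtype=float))
    d = lower.size
    n_disc = max(1, int(discoverer_ratio * n))
    ones = np.ones(d)                                   # the row matrix L

    X = rng.uniform(lower, upper, size=(n, d))          # one sparrow per row
    F = np.array([fitness(x) for x in X])
    best_x, best_f = X[np.argmin(F)].copy(), float(F.min())
    history = [best_f]

    for _ in range(iter_max):
        order = np.argsort(F)                           # rank sparrows, best first
        X, F = X[order], F[order]
        worst = X[-1].copy()
        r2 = rng.random()                               # early warning value R2

        for i in range(n_disc):                         # discoverer update
            if r2 < st:
                alpha = 1.0 - rng.random()              # random number in (0, 1]
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iter_max))
            else:
                X[i] = X[i] + rng.normal() * ones
        X[:n_disc] = np.clip(X[:n_disc], lower, upper)
        disc_f = np.array([fitness(x) for x in X[:n_disc]])
        xp = X[np.argmin(disc_f)].copy()                # best discoverer position

        for i in range(n_disc, n):                      # participant update
            if i + 1 > n / 2:
                X[i] = rng.normal() * np.exp((worst - X[i]) / (i + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], size=d)
                a_plus = A / d                          # A^T (A A^T)^-1, since A A^T = d
                X[i] = xp + (np.abs(X[i] - xp) @ a_plus) * ones
        X = np.clip(X, lower, upper)
        F = np.array([fitness(x) for x in X])

        if F.min() < best_f:
            best_x, best_f = X[np.argmin(F)].copy(), float(F.min())
        history.append(best_f)
    return best_x, best_f, history

def sphere(x):
    return float(np.sum(x ** 2))

best_x, best_f, history = sparrow_search(sphere, [-5.0, -5.0], [5.0, 5.0])
```

For the smoothing factor of a PNN, the search is one-dimensional (`d = 1`), `lower` and `upper` give the allowed range of the smoothing factor, and `fitness` returns the sum of the training and test error rates.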

Probabilistic Neural Network.
A probabilistic neural network [17] is a kind of neural network that can be used for pattern classification, and its essence is a parallel algorithm based on the Bayesian minimum risk criterion. It has the characteristics of a simple learning process, fast training speed, more accurate classification, good fault tolerance, and so on.
Probabilistic neural networks are generally divided into four layers, namely, the input layer, pattern layer, summation layer, and output layer. The input layer receives the high-dimensional fault feature matrix for analysis, and its number of neurons is determined by the dimension of the fault feature vector. The pattern layer is connected to the input layer through connection weights. It calculates the matching degree, that is, the similarity between the input feature vector and each pattern in the training set, and passes it through the activation function. The result is the output of the pattern layer. The summation layer combines the outputs of the pattern layer units belonging to each class. The number of neurons in this layer equals the number of fault types. The output layer determines the fault type by selecting the class with the maximum value in the summation layer. The basic model of PNN is shown in Figure 4.
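The four layers map directly onto a few array operations. The sketch below assumes a Gaussian activation function with a single smoothing factor `sigma` shared by all pattern units, which is the usual PNN form. The function names and the toy data are hypothetical, and `smoothing_fitness` mirrors the fitness described for the SSA (training error rate plus test error rate).

```python
import numpy as np

def pnn_predict(train_x, train_y, query_x, sigma):
    """Classify query_x with a PNN whose pattern units are the training samples."""
    classes = np.unique(train_y)
    # Pattern layer: squared distance of every query to every stored pattern,
    # passed through a Gaussian activation controlled by the smoothing factor.
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    activation = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation layer: average activation of the pattern units of each class.
    scores = np.stack(
        [activation[:, train_y == c].mean(axis=1) for c in classes], axis=1
    )
    # Output layer: the class with the largest summed activation.
    return classes[np.argmax(scores, axis=1)]

def smoothing_fitness(sigma, train_x, train_y, test_x, test_y):
    """Sum of the training and test error rates for a given smoothing factor."""
    err_train = np.mean(pnn_predict(train_x, train_y, train_x, sigma) != train_y)
    err_test = np.mean(pnn_predict(train_x, train_y, test_x, sigma) != test_y)
    return float(err_train + err_test)

# Toy example with two well-separated classes (illustrative data only).
train_x = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
train_y = np.array([1, 1, 2, 2])
test_x = np.array([[0.2, 0.3], [5.1, 5.4]])
test_y = np.array([1, 2])
print(pnn_predict(train_x, train_y, test_x, sigma=1.0))
```

Because the network stores the training patterns directly, the only quantity left to tune is `sigma`, which is why its selection matters so much for classification accuracy.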

Compound Fault Diagnosis of the Gearbox by RLMD-SSA-PNN
In order to solve the problems of compound fault diagnosis of the gearbox, an improved PNN algorithm based on RLMD and SSA is proposed in this paper, and the specific steps are as follows.
Step 1. The signal is decomposed by RLMD into a series of PFs. The PFs that carry most of the fault information are identified by their correlation coefficients with the original signal, and the permutation entropy of each effective PF is calculated. The high-dimensional fault characteristic matrix is then constructed from these entropy values.
Step 2. The fault characteristic matrix is divided into a training set and a test set, which are then labeled.
Step 3. The parameters of the sparrow search algorithm are set, such as the population size, the maximum number of iterations, and the upper and lower boundaries of the search (the range of the smoothing factor).
Step 4. The training set and its labels are input into the SSA-PNN model for training so that the model finds the optimal value of the smoothing factor, and the test set and its labels are then input for testing.

Data Acquisition.
This paper used the gearbox test platform of the Ministry of Education Key Laboratory at Beijing Information Science and Technology University to collect data. The test bench includes an acceleration sensor, a conditioning circuit, an acquisition instrument, a planetary gearbox test platform, and a computer. The speed can be adjusted from 1140 to 2220 rpm in steps of 120 rpm. Five types of data are included, namely, the normal planetary gear, planetary gear tooth fracture, planetary gear tooth surface wear, planetary gear tooth fracture plus tooth surface wear, and planetary gear tooth fracture plus rolling body missing. Each type of data is collected three times, with a sampling frequency of 20480 Hz and a sampling time of 3 s per collection, so each collected signal contains 61440 points. The data used in this paper were recorded at a speed of 1500 rpm, and the time domain waveform of each type is shown in Figure 5.

Construction of High-Dimensional Fault Characteristic Matrix.
Considering the computational cost and efficiency, each collected signal is divided into 30 short samples of length 2048, giving a total of 90 short samples for each type of data. Sixty of them are randomly selected as the training set, and the remaining 30 are used as the test set. Since there are 5 data types in the experiment, the training set contains 300 short samples and the test set contains 150. The labels of the normal type, broken tooth type, wear type, wear plus broken tooth type, and broken tooth plus rolling body missing type are 1, 2, 3, 4, and 5, respectively.
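The sample counts above follow from simple bookkeeping, which can be checked with a short script. The helper names `segment` and `split_indices` are hypothetical, and a zero array stands in for a recorded signal. Only the numbers (20480 Hz, 3 s, segment length 2048, three collections, five classes, 60 training samples per class) come from the text.

```python
import numpy as np

FS_HZ = 20480            # sampling frequency
DURATION_S = 3           # length of one collection
SEGMENT_LEN = 2048       # length of one short sample
RECORDS_PER_CLASS = 3    # collections per fault type
N_CLASSES = 5
TRAIN_PER_CLASS = 60

def segment(record, seg_len=SEGMENT_LEN):
    """Cut a 1-D record into non-overlapping rows of seg_len points."""
    n_seg = record.size // seg_len
    return record[: n_seg * seg_len].reshape(n_seg, seg_len)

def split_indices(n_samples, n_train, rng):
    """Random split of sample indices into training and test parts."""
    perm = rng.permutation(n_samples)
    return perm[:n_train], perm[n_train:]

record = np.zeros(FS_HZ * DURATION_S)                  # placeholder for one collected signal
segments = segment(record)
per_class = segments.shape[0] * RECORDS_PER_CLASS      # short samples per fault type

rng = np.random.default_rng(0)
train_idx, test_idx = split_indices(per_class, TRAIN_PER_CLASS, rng)
n_train_total = train_idx.size * N_CLASSES
n_test_total = test_idx.size * N_CLASSES
print(segments.shape, per_class, n_train_total, n_test_total)
```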
Taking the tooth fracture fault signal of the planetary gear as an example, a short sample of length 2048 is decomposed by RLMD into multiple PFs, as shown in Figure 6. Six PF components are obtained for this sample. However, some of the PF components obtained by RLMD carry little meaningful information. Using them as feature elements would introduce interference and reduce the recognition accuracy of the classification algorithm. Therefore, the correlation coefficient method is used to filter out the unimportant PF components. The correlation coefficients of the PF components of the five types of data decomposed by RLMD are shown in Figure 7. The correlation coefficients of the PFs after PF4 are all lower than 0.1, which indicates that these PFs hardly reflect the fault characteristics of the original signal, and therefore the first four PFs are retained as effective components. Then, the permutation entropy of these four PFs is calculated, and a fault characteristic matrix of size 450 × 4 is obtained.
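The selection and feature extraction steps can be sketched as follows. The 0.1 correlation threshold is taken from the text. The embedding order of 3 and delay of 1 for the permutation entropy are assumptions, since the paper does not state them, and taking the absolute value of the correlation coefficient is likewise an illustrative choice. Three sinusoids stand in for the PFs produced by RLMD.

```python
import numpy as np
from math import factorial

def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Permutation entropy of a 1-D series (order and delay are assumed values)."""
    x = np.asarray(x, dtype=float)
    n = x.size - (order - 1) * delay
    # Embed the series: each row holds `order` samples spaced by `delay`.
    idx = np.arange(order) * delay + np.arange(n)[:, None]
    patterns = np.argsort(x[idx], axis=1)               # ordinal pattern of each row
    codes = (patterns * order ** np.arange(order)).sum(axis=1)
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    h = -np.sum(p * np.log(p))
    return h / np.log(factorial(order)) if normalize else h

def select_pfs(signal, pfs, threshold=0.1):
    """Indices of PFs whose |correlation| with the signal reaches the threshold."""
    corr = np.array([abs(np.corrcoef(signal, pf)[0, 1]) for pf in pfs])
    return np.where(corr >= threshold)[0], corr

# Stand-in components: two that make up the signal and one unrelated tone.
t = np.arange(2048) / 2048.0
pfs = [np.sin(2 * np.pi * 5 * t),
       0.5 * np.sin(2 * np.pi * 20 * t),
       np.sin(2 * np.pi * 50 * t)]
signal = pfs[0] + pfs[1]

keep, corr = select_pfs(signal, pfs)
features = [permutation_entropy(pfs[k]) for k in keep]   # one row of the feature matrix
print(keep, np.round(corr, 3), np.round(features, 3))
```

Applied to the four retained PFs of each of the 450 short samples, the same two functions would yield the 450 × 4 fault characteristic matrix.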

Experimental Results and Analysis.
The fault characteristic matrix is input into the SSA-PNN model, and the result is shown in Figure 8. Among the 150 short samples of the test set, only 4 are misclassified: one normal sample is diagnosed as wear plus broken teeth, one broken teeth sample as wear plus broken teeth, one broken teeth sample as rolling body missing plus broken teeth, and one wear sample as the normal type. The overall classification accuracy reaches 97.33%.
In order to verify the superiority of the proposed method, it is compared with other methods. First, EMD is used to decompose the signal, the first four IMFs are taken as sensitive components, and permutation entropy is again used as the index to construct the feature matrix, which is input into the BP model. The classification results of the EMD-BP model are shown in Figure 9. There are 38 classification errors among the short samples of the test set, and the classification accuracy is only 74.67%. Similarly, the EEMD-PNN model is used, and its fault classification results are shown in Figure 10. There are 13 classification errors among the short samples of the test set, and the classification accuracy is 91.33%.
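The reported accuracies follow directly from the number of misclassified samples in the 150-sample test set, as the short check below shows. The function name is hypothetical, and the error counts are those quoted above.

```python
def accuracy_percent(n_errors, n_test=150):
    """Classification accuracy in percent, rounded to two decimals."""
    return round(100.0 * (n_test - n_errors) / n_test, 2)

# Misclassified test samples reported for each method.
errors = {"RLMD-SSA-PNN": 4, "EMD-BP": 38, "EEMD-PNN": 13}
accuracies = {name: accuracy_percent(e) for name, e in errors.items()}
for name, acc in accuracies.items():
    print(f"{name}: {acc}%")
```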

Conclusions
In this paper, a gearbox fault diagnosis method based on robust local mean decomposition and a PNN model improved by SSA is proposed. This method adaptively selects the smoothing factor of the PNN model and thereby achieves a good classification effect. In the experiments, it achieves higher diagnosis accuracy than the EMD-BP and EEMD-PNN methods and is suitable for fault diagnosis of the gearbox.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.