Planetary-Gearbox Fault Classification by Convolutional Neural Network and Recurrence Plot

Recurrence-plot (RP) analysis is a graphical tool for visualizing and analyzing the recurrence behavior of nonlinear dynamic systems. By combining the advantages of the RP and a convolutional neural network (CNN), a fault-classification scheme for planetary gear sets is proposed in this paper. In the proposed approach, a vibration signal is first picked up from the planetary-gear test rig and converted into an angular-domain quasistationary signal through computed order tracking to eliminate the frequency blur caused by speed fluctuations. Then, the angular-domain signal is divided into several segments, and each segment is processed by the RP to constitute a training sample. Moreover, a two-dimensional CNN model is developed to adaptively extract fault features. Experiments were carried out on a planetary-gear test rig with four conditions under three operating speeds. The results on the measured vibration demonstrated the validity of combining the CNN with recurrence-plot analysis for the fault classification of planetary-gear sets.


Introduction
Due to their power capability in a compact, lightweight form and variable transmission ratio, planetary-gear trains are key units of rotating machinery that are widely used in helicopters, wind turbines, and other mechanical transmission systems. However, harsh operating conditions make components in planetary gearboxes, such as gears and bearings, prone to failure. Therefore, it is urgent to develop effective diagnostic approaches to detect the operating conditions of a mechanical transmission system.
Compared with conventional fixed-axis gear sets, the structures of planetary gearboxes are more complicated and are generally composed of a planet carrier, a sun gear, a ring gear, and several planets. Planet gears not only spin around their own shafts, but also revolve around the center of the sun gear. Thus, the vibration measured from a planetary gear set exhibits heavy amplitude, frequency, and phase modulation caused by gear or bearing faults, gear-meshing motion, and time-varying transfer paths.
Currently, popular approaches for extracting fault features from planetary gear sets are vibration-based methods, which include the vibration-separation technique [1] and amplitude and frequency demodulation [2,3]. However, these conventional signal-processing approaches have some disadvantages. First, applying them requires the operational characteristics and failure mechanisms of planetary-gear trains to be taken into account. Second, it is difficult to recognize fault features from various faults.
On the other hand, artificial-intelligence (AI) techniques have recently been used in the field of condition monitoring and fault diagnosis. AI techniques can be employed in the condition monitoring and fault diagnosis of rotating machinery by taking expert knowledge into consideration. In fact, several convolutional-neural-network (CNN) studies have reported on the fault identification of complex machinery transmission systems. For example, the spectrum of bearing-housing vibration obtained by the fast Fourier transform was used to identify faults with a CNN [4]. Time- and frequency-domain indicators were combined with a CNN for the fault classification of a fixed-shaft gearbox in [5,6].
The main deep-learning schemes generally have the advantage of taking 2D images as inputs. The perceived vibration is converted into time-frequency [7] or time-scale images by continuous wavelet transform (CWT), short-time Fourier transform (STFT), and Fourier transform (FT), which aim to be combined with the CNN for intelligent fault diagnosis [8]. Recurrence-plot (RP) analysis [9] is a powerful tool to locate nonstationary structural changes and transitions occurring in the dynamic system. It can be employed to study the response of a planetary-gear set and visualize the recurrence behavior of a faulty system [10,11]. RP results are 2D images that make the RP and CNN combination convenient. In this paper, a combination scheme of a CNN and angular-domain recurrence-plot analysis is proposed. The validity of the proposed scheme is experimentally demonstrated on a planetary-gear test rig.
This paper is arranged as follows. Angular-domain recurrence-plot analysis is introduced in Section 2. Subsequently, the CNN principle is briefly described in Section 3. The proposed approach is discussed in Section 4. The experiment on the planetary-gear test rig and the corresponding analysis results are presented in Section 5. Finally, conclusions are drawn in Section 6.

Angular-Domain Recurrence-Plot Analysis
Because the recurrence plots of the raw vibration can be distorted by speed fluctuations, it is reasonable to remove these distortions before carrying out the RP in order to improve the accuracy of fault classification. Therefore, the well-known equiangle resampling scheme of computed order tracking (COT) was employed to convert the observed time series into an angular-domain quasistationary one, which avoids the frequency blur caused by speed fluctuations. More details can be found in [12,13].
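As an illustration only, equiangle resampling can be sketched in Python as follows. The sketch assumes one tacho pulse per revolution and constant speed within each revolution, and it uses plain linear interpolation; the function names `interp` and `equiangle_resample` are hypothetical and do not come from the original study, whose COT implementation is described in [12,13].

```python
from bisect import bisect_right


def interp(x, xp, fp):
    """Linearly interpolate the points (xp, fp) at x; xp must be increasing."""
    i = min(max(bisect_right(xp, x) - 1, 0), len(xp) - 2)
    frac = (x - xp[i]) / (xp[i + 1] - xp[i])
    return fp[i] + frac * (fp[i + 1] - fp[i])


def equiangle_resample(signal, fs, pulse_times, points_per_rev):
    """Resample a uniformly sampled signal at equal shaft-angle increments."""
    # Assumption: pulse k marks the completion of k full revolutions.
    revolutions = list(range(len(pulse_times)))
    sample_times = [k / fs for k in range(len(signal))]
    resampled = []
    for k in range((len(pulse_times) - 1) * points_per_rev):
        angle = k / points_per_rev                     # shaft angle in revolutions
        t = interp(angle, revolutions, pulse_times)    # time at which it is reached
        resampled.append(interp(t, sample_times, signal))
    return resampled
```

When the shaft slows down, the pulse spacing grows and the resampling instants spread out in time accordingly, so every revolution still contributes the same number of points.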

Phase-Space Reconstruction in Angular Domain
The possible states of a dynamic system can be represented by phase-space reconstruction, which maps the one-dimensional time series into a high-dimensional phase space to unfold the attractor and reflect the characteristics of the system. Several methods of phase-space reconstruction have been proposed; the time-delay method was employed in this paper. In the same way, for resampled angular-domain observations y(θ), the angle-delay vector y(θ, δ) in d-dimensional space can be reconstructed as in [14] by:

$$\mathbf{y}(\theta, \delta) = \left[\, y(\theta),\; y(\theta + \delta),\; \ldots,\; y(\theta + (d-1)\delta) \,\right] \qquad (1)$$

where δ is the angle delay and d represents the embedding dimension. It is easy to see from Equation (1) that both the angle delay δ and the embedding dimension d are critical to obtaining a good reconstruction.
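The time-delay construction described above can be sketched in a few lines of Python. This is a minimal illustration in which the angle delay is expressed as an integer number of resampled points; the function name `delay_embed` is hypothetical.

```python
def delay_embed(y, d, delta):
    """Return the angle-delay vectors [y[i], y[i+delta], ..., y[i+(d-1)*delta]]."""
    n_vectors = len(y) - (d - 1) * delta
    if n_vectors <= 0:
        raise ValueError("series too short for this d and delta")
    return [[y[i + k * delta] for k in range(d)] for i in range(n_vectors)]
```

Note that a series of N points yields only N - (d - 1)δ vectors, because the last points lack delayed partners.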

Determination of Embedding Dimension
To determine a suitable embedding dimension in phase-space reconstruction, several methods have been proposed, such as computing some invariant on the attractor, singular-value decomposition, and false nearest neighbors (FNN) [15]. In this paper, the FNN method was used to analyze the angular-domain data series, which in theory can be treated like a discrete-time series. In d-dimensional phase space, the square of the Euclidean distance between point y(θ, δ) and its rth nearest neighbor, $R_d^2(\theta, r)$, can be expressed as in [14] by:

$$R_d^2(\theta, r) = \sum_{k=0}^{d-1} \left[ y(\theta + k\delta) - y^{(r)}(\theta + k\delta) \right]^2 \qquad (2)$$

where $y^{(r)}$ denotes the rth nearest neighbor. From the dth to the (d + 1)th dimension, the corresponding Euclidean distance $R_{d+1}^2(\theta, r)$ becomes

$$R_{d+1}^2(\theta, r) = R_d^2(\theta, r) + \left[ y(\theta + d\delta) - y^{(r)}(\theta + d\delta) \right]^2 \qquad (3)$$

In engineering applications, the observed vibration may have limited length and a low signal-to-noise ratio (SNR). In this situation, only the nearest neighbor is considered [14], i.e., $R_d(\theta, r) \equiv R_d(\theta, r = 1)$, and the criterion to identify false neighbors in the angular domain follows [14], with $A_{tol}$ as the threshold (without loss of generality, it is set to 2). A suitable d is taken as the value at which the FNN fraction reaches its first minimum [9].
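To make the FNN idea concrete, the following Python sketch counts how many nearest neighbors in d dimensions move far apart when the (d + 1)th coordinate is added. It is an assumption-laden illustration: the ratio test used here (growth in the added coordinate divided by the d-dimensional distance, compared with `tol`) is one commonly used form and may differ from the exact criterion of [14]; the brute-force neighbor search and the name `fnn_fraction` are likewise hypothetical.

```python
import math


def fnn_fraction(y, d, delta, tol=2.0):
    """Fraction of nearest neighbours in dimension d that are false in d + 1."""
    n = len(y) - d * delta  # points that also have a (d + 1)-th coordinate
    vecs = [[y[i + k * delta] for k in range(d)] for i in range(n)]
    false_count = counted = 0
    for i in range(n):
        # Brute-force search for the nearest neighbour (r = 1) in d dimensions.
        j_best, r_d = -1, math.inf
        for j in range(n):
            if j != i:
                dist = math.dist(vecs[i], vecs[j])
                if dist < r_d:
                    j_best, r_d = j, dist
        if j_best < 0 or r_d == 0.0:
            continue  # ratio undefined for coincident points
        # Distance added by the (d + 1)-th coordinate.
        extra = abs(y[i + d * delta] - y[j_best + d * delta])
        counted += 1
        if extra / r_d > tol:  # assumed ratio criterion, see lead-in
            false_count += 1
    return false_count / counted if counted else 0.0
```

Evaluating `fnn_fraction` for d = 1, 2, 3, ... and picking the first minimum mirrors the selection rule stated above; for long records a tree-based neighbor search would replace the quadratic loop.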

Determination of Angle Delay
The angle delay is also a critical parameter in phase-space reconstruction. If δ is too small, the trajectories in the phase space are too close to each other, which makes it impossible to unfold the attractor. If δ is too large, the coordinate components become statistically independent, and even the trajectory projections of chaotic attractors are uncorrelated.
To obtain a proper δ, several approaches have been proposed, including average displacement, the autocorrelation function, and mutual information [16]; mutual information was employed in this study. We considered a nonlinear system (S, Q) composed of the angular-domain discrete data series y(θ) and the delayed series y(θ + δ). Assume that S was measured. Mutual information I(Q, S) can be computed as in [17] by:

$$I(Q, S) = \sum_{i} \sum_{j} P_{sq}(s_i, q_j) \log \frac{P_{sq}(s_i, q_j)}{P(s_i)\, P(q_j)} \qquad (5)$$

where $P(s_i)$ and $P(q_j)$ represent the probability of $s_i$ in S and of $q_j$ in Q, respectively, and $P_{sq}(s_i, q_j)$ denotes the joint probability of $s_i$ and $q_j$. A suitable angle delay was selected at the first minimum of the mutual-information function. Details about mutual information can be found in [16,17].
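A simple way to estimate the mutual information between y(θ) and y(θ + δ) is to bin both series into a histogram. The sketch below is illustrative only: the number of bins, the base-2 logarithm, and the names `mutual_information` and `first_minimum` are assumptions, not parameters reported in the paper.

```python
import math
from collections import Counter


def mutual_information(y, delta, bins=16):
    """Histogram estimate (in bits) of I(Q, S) between y(theta) and y(theta + delta)."""
    s, q = y[:len(y) - delta], y[delta:]
    lo, hi = min(y), max(y)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_index(value):
        return min(int((value - lo) / width), bins - 1)

    s_bins = [bin_index(v) for v in s]
    q_bins = [bin_index(v) for v in q]
    n = len(s_bins)
    p_s, p_q = Counter(s_bins), Counter(q_bins)
    p_sq = Counter(zip(s_bins, q_bins))
    # Sum P_sq * log(P_sq / (P_s * P_q)) over the occupied joint bins.
    return sum(
        (c / n) * math.log2((c / n) / ((p_s[i] / n) * (p_q[j] / n)))
        for (i, j), c in p_sq.items()
    )


def first_minimum(values):
    """Index of the first local minimum of a sequence, or None if absent."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k
    return None
```

Calling `first_minimum([mutual_information(y, k) for k in range(max_delay)])` then returns a candidate angle delay in resampled points, where `max_delay` is a user-chosen search limit.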

Briefs on Recurrence Plot
Recurrence-plot analysis measures the recurrence behavior of a trajectory in phase space and visualizes the structural changes and transitions of a dynamic system as an image. In this research, the angular-domain data series were used as the input of the recurrence-plot analysis, and the recurrence matrix $R_{i,j}$ at resampling angles i and j can be expressed as in [9] by:

$$R_{i,j} = H\left( \varepsilon - \left\| \mathbf{y}_i - \mathbf{y}_j \right\| \right), \quad i, j = 1, \ldots, N \qquad (6)$$

where ε is a threshold distance, H(·) is the Heaviside function, N is the number of points of the angular-domain data series, and ‖·‖ is the Euclidean norm. A discussion on the threshold can be found in [18], and more details about recurrence-plot analysis and parameter settings can be found in [9,18].
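The recurrence matrix can be computed directly from the reconstructed vectors, as in the minimal Python sketch below. It assumes the convention H(0) = 1 (points exactly at distance ε count as recurrent), and the name `recurrence_matrix` is hypothetical.

```python
import math


def recurrence_matrix(vectors, eps):
    """Binary matrix with 1 wherever two phase-space points are within eps."""
    n = len(vectors)
    return [
        [1 if math.dist(vectors[i], vectors[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]
```

The result is symmetric with a main diagonal of ones; rendering it as a black-and-white image gives the recurrence plot.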

Typical Configuration of Convolutional Neural Network
A CNN generally contains five types of layers: convolutional, activation, pooling, fully connected, and softmax layers. Convolutional layers are composed of several convolutional kernels that abstract and adaptively learn features from the input to obtain feature maps. The first feature map is computed by convolving the input images with the first convolutional kernel, and the next feature map is obtained from the previous output by multiplying by the shared weights and adding the bias. Then, a nonlinear activation function, typically a sigmoid, tanh, or rectified-linear-unit (ReLU) function [19], is applied to the convolutional results. The ReLU is widely applied after the output of every convolutional and fully connected layer.
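The convolution, activation, and pooling operations described above can be illustrated with a dependency-free Python sketch. It handles a single channel with no padding and unit stride, and, as in most CNN libraries, the "convolution" is implemented as a cross-correlation; the function names are hypothetical, and a real model would use a deep-learning framework instead.

```python
def conv2d_valid(image, kernel, bias=0.0):
    """Slide the kernel over the image (no padding, stride 1) and add the bias."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            bias + sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Apply the rectified linear unit element-wise."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in cols
        ]
        for r in rows
    ]
```

Chaining `max_pool(relu(conv2d_valid(image, kernel)))` reproduces one convolution-activation-pooling stage on a small binary image such as a recurrence plot.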
In the classification stage of the CNN model, there is generally a fully connected layer and a classifier that integrate and classify the information from the previous layer. A softmax regression classifier is typically placed at the last layer to estimate the probability of each fault class. Cross-entropy is generally used as the loss function to represent the error between the true and estimated values. The weight matrix and bias term are optimized by the adaptive-moment-estimation method (ADAM). ADAM dynamically adjusts the learning rate of each parameter through first-order (mean) and second-order (variance) moment estimates of the gradient [20]. Thus, it is appropriate for nonstationary objectives and engineering vibration data with a low SNR. Moreover, in applications, this method is computationally efficient and needs little memory, which makes it suitable for processing long data records.
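For readers unfamiliar with these building blocks, the softmax classifier, the cross-entropy loss, and a single ADAM update for one scalar parameter can be sketched as follows. The default hyperparameters shown (`lr`, `beta1`, `beta2`, `eps`) are the values commonly quoted for ADAM and are an assumption here; the settings actually used in this study are those of Table 1.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, true_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_index])


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update of a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad           # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad * grad    # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v
```

The loss tends to zero as the probability assigned to the true class approaches 1, and the per-parameter scaling by the second-moment estimate is what makes the effective step size adaptive.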

Evaluation Indicators of CNN Model
To quantify the performance of the different methods, four evaluation indicators were employed: accuracy, precision, recall, and F1-score. The calculation of these indicators is given in [4]. In engineering applications, a condition-monitoring system should ideally trigger an alarm for every real fault while avoiding false and missed alarms. Therefore, a good classification method should score highly on all of these evaluation indicators. More details about CNN parameter settings and configurations can be found in [4,5,8].
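The four indicators can be derived from a confusion matrix, as in the Python sketch below. Macro-averaging over the classes is an assumption made for this illustration (the paper defers the exact definitions to [4]), and the name `classification_metrics` is hypothetical.

```python
def classification_metrics(confusion):
    """Metrics from confusion[i][j] = samples of true class i predicted as j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n_classes)) / total
    precisions, recalls, f1_scores = [], [], []
    for k in range(n_classes):
        true_pos = confusion[k][k]
        predicted = sum(confusion[i][k] for i in range(n_classes))
        actual = sum(confusion[k])
        p = true_pos / predicted if predicted else 0.0
        r = true_pos / actual if actual else 0.0
        precisions.append(p)
        recalls.append(r)
        f1_scores.append(2 * p * r / (p + r) if p + r else 0.0)
    # Macro average: every class contributes equally (assumed convention).
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1_scores) / n_classes,
    }
```

Low precision corresponds to false alarms and low recall to missed faults, which is why all four values are reported together.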

CNN Combined with Angular-Domain RP for Fault Classification of Planetary Gearboxes
In this study, a combination of a CNN and angular-domain RP for the fault classification of planetary gearboxes is proposed. The corresponding schematic is shown in Figure 1. The main steps of the proposed scheme are as follows:
1. Vibration-signal acquisition and processing. The raw vibration of the planetary gearbox is picked up synchronously with the tacho pulse trains of the reference shaft. Then, equiangle resampling is performed to convert the data series into the angular domain.
2. Construction of training dataset. Vibration data series are first divided into segments. For a planetary gearbox with a fixed annulus gear, the transmission ratio between the planet carrier and the sun gear can be expressed by

$$i_{cs} = \frac{N_s}{N_s + N_r} \qquad (7)$$

where $N_s$ and $N_r$ are the tooth numbers of the sun and ring gear, respectively. For the minimal common complete integral period of the sun gear and planet carrier, the sun gear needs to rotate $N_s + N_r$ revolutions, and the carrier rotates $N_s$ revolutions. Therefore, to ensure that each segment covers the meshing vibration of all teeth, the corresponding number of sun-gear revolutions should not be less than $N_s + N_r$.
We assumed that the rotating speed of the sun gear was $n_s$ (in r/min) and that the total duration of the observed signal was T (in s). To save computation time, the length L of the resampled data in the angular domain can be expressed by:

$$L = \frac{n_s T}{60} N_{resample} \qquad (8)$$

where $N_{resample}$ is the number of resampling points per revolution. Thus, each segment contains $(N_s + N_r) N_{resample}$ points. Then, for each condition under the three operating speeds, segments were randomly selected to constitute the training samples.
3. Building the CNN model and setting parameters. A 2D CNN configuration was used. Several key parameters, such as the learning rate, size of convolutional kernels, number of iterations, and nodes of fully connected layers, were set and are discussed in the next section.
4. Identification results and evaluation of the CNN model.
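The dataset-construction step can be sketched as drawing fixed-length windows at random positions from the angular-domain series. This is only an assumed reading of "randomly selected": the sketch allows windows to overlap, uses a fixed seed for repeatability, and the names `segment_length` and `random_segments` are hypothetical.

```python
import random


def segment_length(n_sun, n_ring, points_per_rev):
    """Points needed to cover one full meshing period of the sun gear."""
    return (n_sun + n_ring) * points_per_rev


def random_segments(series, seg_len, n_segments, seed=0):
    """Draw contiguous windows of seg_len points at random start positions."""
    last_start = len(series) - seg_len
    if last_start < 0:
        raise ValueError("series shorter than one segment")
    rng = random.Random(seed)  # seeded so that the selection is repeatable
    starts = [rng.randint(0, last_start) for _ in range(n_segments)]
    return [series[s:s + seg_len] for s in starts]
```

Each returned window would then be embedded and converted into one recurrence plot to serve as a training image.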

Planetary-Gearbox Test Rig
To validate the proposed approach, an experiment with four conditions under three operating speeds was carried out on a planetary-gear test rig (with one 28-tooth sun gear, three 20-tooth planet gears, a 71-tooth fixed ring gear, and a carrier; the gear module was 2.25), shown in Figure 2a, where the output shaft was connected to the sun-gear shaft by a coupling, and the planet carrier was the input. The kinematic scheme of the experimental planetary gearbox can be found in Figure 3. An eddy-current probe was fixed by a magnetic base to measure the tacho pulse trains of the output shaft, and three accelerometers were mounted on the planetary-gearbox housing, as shown in Figure 2a. The data were acquired by an NI USB 9234 card with a sampling rate of 51.2 kHz.
In this study, the vibration picked up by accelerometer II, close to the ring gear, was used. The four conditions of the planetary gearbox, that is, normal, planet gear with a tooth-root crack, sun gear with a tooth-root crack, and planet carrier with a crack, are shown in Figure 2b-d. The tooth-root-crack lengths of the faulty planet and sun gears were 4 and 3.7 mm, respectively. The carrier crack was 21 mm long, 5 mm deep, and 0.18 mm wide.

Data Preprocessing
First, taking the sun gear as the reference shaft, 600 s of raw vibration was converted into the angular domain by COT, which eliminates the effect of speed fluctuations. The number of sampling points per revolution in the angular domain was set to 1024.
Subsequently, the data series in the angular domain were divided into several segments. According to Equation (7), $i_{cs}$ = 28/99, so the data length of a segment corresponds to 99 revolutions of the sun gear, and each segment was divided accordingly. To reduce computation time, 200 resampling points were selected per revolution of the sun gear. Thus, the resampled data series contained 1.6 × 10⁶ points, and each segment contained 1.98 × 10⁴ points when the rotating speed of the sun gear was 800 r/min. In the time domain, the duration of a complete period can be calculated for each rotating speed. For example, when the rotating speeds of the output shaft were 800, 700, and 600 r/min, the durations of the data series in the time domain were 7.425, 8.486, and 9.9 s, respectively, when the resampling rate was 2000 points per second. Thus, each segment was randomly selected as 7.5, 8.5, and 10 s of the total 600 s of observed vibration.
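The figures quoted above follow from the tooth numbers, the shaft speed, and the number of resampling points per revolution. The short Python check below recomputes them; the helper names are hypothetical, and a constant shaft speed is assumed.

```python
def resampled_points(speed_rpm, duration_s, points_per_rev):
    """Number of angular-domain points recorded at a constant shaft speed."""
    return speed_rpm / 60 * duration_s * points_per_rev


def segment_duration(speed_rpm, revolutions):
    """Time in seconds that the shaft needs for the given number of revolutions."""
    return revolutions * 60 / speed_rpm


sun_teeth, ring_teeth, points_per_rev = 28, 71, 200
revolutions = sun_teeth + ring_teeth             # 99 sun-gear revolutions
segment_points = revolutions * points_per_rev    # points in one segment
total_points = resampled_points(800, 600, points_per_rev)
durations = [segment_duration(rpm, revolutions) for rpm in (800, 700, 600)]
```

Rounding the three durations up gives the 7.5, 8.5, and 10 s windows used for segment selection.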
Then, all segments were processed by recurrence-plot analysis to obtain 4 × 3 × 500 2D images as the training samples. The angular-domain recurrence plots of the four conditions with $n_s$ = 800 r/min are presented in Figure 4. To show the differences between the recurrence plots of the fault categories, only 8000 of the 19,760 points are shown. Embedding dimension d and angle delay δ were determined by the FNN method and the mutual-information function, respectively. Threshold ε was determined by the standard deviation of the observed angular-domain signal [11]. The vertical distance between the lines visible in Figure 4 indicates the signals' intermittent behavior, with a characteristic angle scale [21] of the components caused by faults. Small fluctuations could also be seen for all considered conditions, but these differences were slight and difficult to distinguish quantitatively. Moreover, plots of the same condition may differ slightly, so it was hard to determine the planetary-gearbox failure type on the basis of recurrence plots alone.

Model Design
The configuration of the two-dimensional CNN consisted of two convolutional layers, two pooling layers, three fully connected layers, and a softmax regression. Cross-entropy was employed as the loss function, and several fundamental parameters of the CNN model are listed in Table 1.

Results and Discussion
Three comparative methods were utilized in this study. The time-domain RP, the continuous wavelet transform (cmor4-4), and the first six maximal peak values of the intrinsic mode functions (IMFs) generated by the empirical mode decomposition (EMD) of the experiment data [22] were employed to assess the performance of the proposed method. The comparative results of the four approaches are shown in Figure 5. Figure 5a shows the validation loss, which measures the extent of the prediction error. The closer the predicted probability of the true class is to 1, the closer the loss value is to zero. For the time-domain recurrence-analysis (green line) and EMD (blue line) methods, loss values were significantly higher than those of the other methods (whose loss stabilized at 0), with more obvious fluctuations. Figure 5b shows the validation accuracy, which is defined as the ratio of the number of correctly classified samples to the total number of samples. The higher the accuracy, the better the classification of the model. According to Figure 5b, the validation accuracy of the proposed method (red line) and CWT (black line) reached 100%. For the same CNN model, the validation loss and accuracy of the IMF method fluctuated, which indicates that this method failed to extract fault features effectively. Comparing the two recurrence analyses, the proposed method performed better than the time-domain one because the rotational fluctuations were eliminated. Although both angular-domain recurrence analysis and CWT obtained effective results, their convergence speeds differed. The validation accuracy of the proposed method reached 100% and its validation loss was approximately 0 at epoch 15, which demonstrates that it converged faster than the CWT.
To reduce the effect of random error, each approach was executed 10 times. The results of the four methods are listed in Table 2, and their confusion-matrix plots are shown in Figure 6. The evaluation indicators demonstrated that the proposed approach and the CWT could effectively identify the faults of planetary-gear sets, with all evaluation indicators reaching 100%. As can be seen, the standard deviation was about 0%, which shows that performance was robust. Comparing the confusion matrix of the time-domain RP with that of the artificially selected IMF plots, the number of correct classifications of the former was higher, which suggests that adaptive feature extraction outperformed handcrafted features. Therefore, the proposed approach is effective for the fault classification of a planetary gearbox under various rotating speeds. In Table 2, results are given as the average of each evaluation indicator (%), with σ denoting the standard deviation (%).

Conclusions
In this paper, we proposed a combination of a CNN with angular-domain recurrence-plot analysis for fault classification in planetary gearboxes. The proposed method was compared with the time-domain RP, CWT (cmor4-4), and IMF plots. Even though the CWT could also obtain high validation accuracy, it needed more epochs. Moreover, due to speed fluctuations and the limitations of handcrafted features, the validation accuracy of the two other methods was lower than that of the CWT and the proposed method. Finally, the experimental results showed that the proposed method is superior to conventional methods for the fault classification of planetary gearboxes.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: