Research on Degradation State Recognition of Planetary Gear Based on Multiscale Information Dimension of SSD and CNN

Planetary gear is the key part of the transmission system of large complex electromechanical equipment, and over its full life cycle it generally undergoes a series of degradation states that evolve into a local fatal fault. It is therefore of great significance to recognize the degradation state of planetary gear for the purposes of maintenance and repair, predicting the development trend, and avoiding sudden faults. This paper proposes a degradation state recognition method for planetary gear based on the multiscale information dimension of singular spectrum decomposition (SSD) and a convolutional neural network (CNN). SSD automatically realizes embedding dimension selection and component grouping, so the nonlinear and nonstationary original vibration signal can be adaptively decomposed into a series of singular spectrum decomposition components (SSDCs). Then, the multiscale information dimension, which combines multiscale analysis and the fractal information dimension, is proposed for quantifying and extracting the feature information contained in each SSDC. Finally, CNN is used to recognize the degradation state of planetary gear. The experimental results show that the proposed method can accurately recognize the degradation state of planetary gear: the overall recognition rate is 97.2%, and the recognition rate for normal planetary gear reaches 100%.


Introduction
Planetary gear is the key part of the transmission system of large complex electromechanical equipment, and it usually operates under extremely harsh conditions for long periods. Local faults of planetary gear occur easily, and the gear undergoes a series of degradation states before a fatal fault, which directly affects the operational reliability of the electromechanical equipment [1]. Therefore, it is of great significance to accurately recognize the current degradation state of planetary gear for the purposes of maintenance and repair, predicting the development trend, and avoiding sudden faults.
The degradation state recognition of planetary gear is still a problem of pattern classification and fault diagnosis, but it is more difficult than the general fault diagnosis problem [2]. The main reasons are as follows: (1) In actual engineering, the operating condition of planetary gear is relatively harsh. Meanwhile, due to the particularity and complexity of the planetary gear structure, its vibration signals are more nonlinear and nonstationary than those of other transmission mechanisms. (2) The different degradation states of planetary gear belong to the same fault type and differ only in degree, so the differences among their fault features are small.
The effective extraction of fault features of planetary gear is the key to degradation state recognition [3]. Traditional fault features include time domain features and frequency domain features, but they only have global statistical significance and are not suitable for analyzing nonlinear and nonstationary signals [4]. With the development of signal processing technology, signal decomposition combined with feature quantification has become the mainstream approach. The widely used signal decomposition methods include empirical mode decomposition (EMD) [5] and wavelet transform, and their improved versions, such as ensemble empirical mode decomposition (EEMD) [6], complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) [7], and dual-tree complex wavelet transform (DTCWT) [8], are also applied to fault feature extraction. However, these methods still have shortcomings. The methods developed from EMD easily suffer from modal aliasing when processing intermittent vibration signals, and the resulting intrinsic mode functions (IMFs) lack physical meaning. The methods developed from wavelet transform are restricted by the selection of the wavelet basis function and the number of decomposition layers. Singular spectrum analysis (SSA) is a nonparametric spectral estimation method based on principal component analysis; it captures high harmonic oscillation shapes in a data-adaptive way, which makes it suitable for nonlinear and nonstationary vibration signals [9,10]. Traditional SSA is generally divided into four steps: trajectory matrix construction, singular value decomposition (SVD), component grouping, and diagonal averaging. Although SSA has been studied by some scholars, determining the appropriate embedding dimension and component grouping criterion remains the key to its effectiveness. The existing methods need manual selection of the window length and embedding dimension and cannot automatically divide the decomposed signal into frequency bands, so the decomposition result suffers from a lack of physical meaning and from modal aliasing. Therefore, Bonizzi [11] proposed a new adaptive signal processing method, singular spectrum decomposition (SSD). This fully adaptive method determines the embedding dimension and component grouping criterion in a data-driven way, and the singular spectrum decomposition components (SSDCs) are reconstructed adaptively from high frequency to low frequency.
The original vibration signal with nonlinear and nonstationary characteristics is decomposed by SSD into a series of SSDCs, which contain rich feature information that reflects the degradation state of planetary gear. The quantitative extraction of fault features from the SSDCs is therefore of vital importance. Analyzing the vibration signal at different time scales yields a multidimensional expression of the signal information, so multiscale analysis is often applied to signal processing [12]. It highlights the signal features at different scales more comprehensively, reflecting not only the global information but also the detail information of the vibration signal. The commonly used feature quantification methods include feature frequency, time domain features, frequency domain features, entropy [13], and fractal dimension [14]. Fractal dimension can extract detailed signal features and reflects the self-similarity of fine structure in a statistical sense; it mainly includes the Hausdorff dimension, similarity dimension, box dimension, and information dimension [15,16]. The information dimension describes the complexity and sparsity of the signal geometry from a probabilistic perspective, and it can be used to quantify the fault features contained in nonlinear and nonstationary vibration signals. Therefore, combining multiscale analysis and the information dimension makes it possible to quantitatively extract the complexity and sparsity of the vibration signal at different scales.
The multiscale information dimension extracted from the SSDCs is regarded as the fault feature, and the key next step is to recognize the degradation state of planetary gear. Traditional pattern recognition methods, such as support vector machine (SVM), back-propagation (BP) neural network, and fuzzy clustering [17,18], have been applied to the state recognition of mechanical equipment. However, these methods still have shortcomings, mainly the inability to process multidimensional data, poor recognition with small training sets, and a tendency to fall into local optima and to overfit. CNN is a multilayer perceptron designed to recognize two-dimensional feature maps; it is a deep learning network model with multiple hidden layers, and it helps preserve the relationships among multidimensional data [19]. CNN not only has the strong fault tolerance, self-adaptive ability, and self-learning ability of traditional neural networks but also transfers features layer by layer, so the feature patterns in the training samples can be learned and expressed implicitly. Compared with BP neural network and SVM, CNN can avoid the training falling into a local extremum, has a stronger ability to learn and express complex features, and computes faster.
The structure of this paper is as follows. In the second section, the mathematical model of degradation state recognition of planetary gear based on the multiscale information dimension of SSD and CNN is established. In the third section, the experimental device used in this paper is introduced. In the fourth section, the vibration signals of different degradation states of planetary gear collected by the experimental device are processed by the proposed method, which demonstrates its effectiveness for recognizing the degradation state of planetary gear. In the last section, the paper ends with some conclusions.

Model Building
2.1. Singular Spectrum Analysis. SSA is a nonparametric spectrum estimation method for analyzing time domain signals, and it is generally divided into four steps: trajectory matrix construction, SVD, component grouping, and diagonal averaging [20,21].
2.1.1. Trajectory Matrix Construction. For a nonzero time series $\{x_1, x_2, \ldots, x_N\}$ of length $N$ and an embedding dimension $m$ ($1 < m < N$), the one-dimensional time series can be converted into a multidimensional signal space $\{X_1, X_2, \ldots, X_n\}$, where each lag vector is $X_i = (x_i, x_{i+1}, \ldots, x_{i+m-1})^{T}$ and $n = N - m + 1$. The obtained trajectory matrix can be expressed as follows:

$$X = [X_1, X_2, \ldots, X_n] = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \\ x_2 & x_3 & \cdots & x_{n+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_m & x_{m+1} & \cdots & x_N \end{bmatrix},$$

where the trajectory matrix $X$ is a Hankel matrix that contains trend, oscillation, and noise information. The embedding dimension is therefore an important parameter of trajectory matrix construction, and it affects the result of SVD.

2.1.2. Singular Value Decomposition (SVD). SVD is carried out on the obtained trajectory matrix $X_{(m \times n)}$. The SVD of trajectory matrix $X$ can be expressed as $X = U \Sigma V^{T}$, where $U$ and $V$ are orthonormal matrices and $\Sigma$ is the diagonal matrix constructed from the obtained singular values $\sigma_i$. According to the theory of SVD, the singular values $\sigma_i$ can be obtained by eigenvalue decomposition of the covariance matrix of trajectory matrix $X$. The covariance matrix can be expressed as $C = (1/n) X X^{T}$, where $n$ is the number of columns of $X$. The eigenvalues of the covariance matrix $C$ can be obtained and sorted from large to small: $\lambda = \{\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0\}$.

2.1.4. Diagonal Averaging. This process is the last step of SSA: diagonal averaging is carried out on the new grouping matrix $X^{*}$, and the final reconstruction components, which have the same number of data points as the original signal, are obtained. This process is called diagonal averaging or Hankelization. The correspondence between the final reconstruction component obtained by diagonal averaging and the data points of the new grouping matrix $X^{*}$ is as follows:

$$\tilde{y}_k = \frac{1}{|A_k|} \sum_{(i,j) \in A_k} x^{*}_{i,j}, \qquad A_k = \{(i,j) : i + j = k + 1,\ 1 \le i \le m,\ 1 \le j \le n\}, \qquad k = 1, 2, \ldots, N,$$

where $\tilde{y}_k$ represents the $k$-th data point of the reconstruction component, $x^{*}_{i,j}$ represents the data point of the new grouping matrix $X^{*}$ obtained by SVD and component grouping, and $n = N - m + 1$.
The above steps are the main process of SSA; the critical parameters that determine the quality of the decomposition components are the embedding dimension $m$ and the component grouping criterion.
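As a minimal sketch of the first and last SSA steps described above, the following Python code builds the Hankel trajectory matrix and applies diagonal averaging. The function names are illustrative rather than taken from the paper, and the SVD and component grouping steps are deliberately omitted, so the example only shows that Hankelization inverts the embedding.

```python
def trajectory_matrix(series, m):
    """Build the m x n Hankel trajectory matrix, with n = N - m + 1."""
    N = len(series)
    if not 1 < m < N:
        raise ValueError("embedding dimension must satisfy 1 < m < N")
    n = N - m + 1
    # Entry (i, j) holds series[i + j], so every anti-diagonal is constant.
    return [[series[i + j] for j in range(n)] for i in range(m)]


def diagonal_averaging(matrix):
    """Average each anti-diagonal of an m x n matrix (Hankelization)."""
    m, n = len(matrix), len(matrix[0])
    sums = [0.0] * (m + n - 1)
    counts = [0] * (m + n - 1)
    for i in range(m):
        for j in range(n):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
X = trajectory_matrix(x, m=3)      # 3 x 4 Hankel matrix
recovered = diagonal_averaging(X)  # equals x because X is exactly Hankel
```

In a full SSA implementation, `diagonal_averaging` would be applied to each grouped matrix produced after SVD, which in general is no longer exactly Hankel.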
2.2. Singular Spectrum Decomposition. To overcome the difficulty of selecting the embedding dimension and the component grouping in SSA, a new adaptive signal processing method named SSD was proposed. It automatically selects the embedding dimension in the iterative process, and the frequency bands of the obtained SSDCs are segmented adaptively by automatic component grouping. It is a completely data-driven decomposition method [11,22].

2.2.1. Adaptive Selection of Embedding Dimension.
The selection of the embedding dimension is important for constructing the trajectory matrix in SSD, so an adaptive selection criterion is formulated. It selects the embedding dimension in each iteration in a data-driven way when calculating the final SSDCs. The criterion is as follows:
(1) According to SSD theory [22], one SSDC is obtained in each iteration. Let the residual component in the $j$-th iteration be $v^{(j)}(n)$, and calculate its power spectral density (PSD). The frequency $f_{\max}$ corresponding to the maximum peak of the PSD is then obtained.
(2) In the first iteration ($j = 1$), if $f_{\max}$ is small (that is, if $f_{\max}/F_s < 0.01$, where $F_s$ is the sampling frequency), the residual signal is regarded as a trend signal, and the embedding dimension $m$ is set to $N/3$.
(3) In other cases, including every iteration with $j > 1$, $m$ is set to $\delta \cdot (F_s / f_{\max})$, where $\delta$ is the ratio factor that relates the window length to the average period of the desired signal; in general, $\delta$ is set to 1.2.
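The selection rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, the PSD is approximated by a naive periodogram, the DC bin is skipped so that $f_{\max}$ is never zero, and the result is rounded to an integer. All of these are assumptions made here.

```python
import cmath
import math


def dominant_frequency(residual, fs):
    """Frequency of the largest periodogram peak, ignoring the DC bin."""
    N = len(residual)
    best_k, best_power = 1, -1.0
    for k in range(1, N // 2 + 1):
        # Naive DFT coefficient: O(N^2) overall, but dependency-free.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / N)
                    for t, x in enumerate(residual))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return best_k * fs / N


def select_embedding_dimension(residual, fs, iteration,
                               ratio=1.2, trend_threshold=0.01):
    """Embedding dimension for one SSD iteration (rules (1)-(3))."""
    N = len(residual)
    f_max = dominant_frequency(residual, fs)
    if iteration == 1 and f_max / fs < trend_threshold:
        return N // 3                  # rule (2): trend-like residual
    return round(ratio * fs / f_max)   # rule (3): 1.2 * Fs / f_max
```

For a 50 Hz sinusoid sampled at 1000 Hz, rule (3) gives a window of 1.2 times 20 samples, that is, 24 samples, which is slightly longer than one period of the dominant oscillation.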
2.2.2. Grouping and Reconstruction of the $j$-th SSDC. SSDCs are reconstructed from high frequency to low frequency by a series of iterations in the SSD process. In the first iteration ($j = 1$), if the detected component is a trend signal, only the first left and right eigenvectors are used to reconstruct the SSDC $g^{(1)}(n)$: $X_1 = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^{T}$ and $g^{(1)}(n) = D(X_1)$, where $D(\cdot)$ represents the calculation process of diagonal averaging. In addition, for the $j$-th iteration, the SSDC $g^{(j)}(n)$ must describe a time scale with clear physical meaning. In this sense, the following rule is defined. The main frequency components of the analyzed signal are concentrated in the frequency band $[f_{\max} - \Delta f, f_{\max} + \Delta f]$, where $\Delta f$ represents the half bandwidth of the main peak in the PSD of the residual signal. Therefore, a subset $I_j = \{i_1, i_2, \ldots, i_p\}$ is created from the set of all eigenvalues according to the following principle: all eigenvalues whose left eigenvectors have a prominent dominant frequency in the band $[f_{\max} - \Delta f, f_{\max} + \Delta f]$, together with the eigenvalue that contributes most to the main peak energy of the analyzed signal, are selected as one group. The SSDC corresponding to subset $I_j$ is then reconstructed by diagonal averaging of the matrix $X_{I_j} = X_{i_1} + X_{i_2} + \cdots + X_{i_p}$.

The half bandwidth $\Delta f$ of the main peak in the PSD is related to the average time span of the oscillating signal of the SSDC. To better estimate $\Delta f$, a spectrum model built from superposed Gaussian functions is constructed to describe the distribution of the PSD. This model is defined as the sum of three Gaussian functions, each of which represents a spectrum peak:

$$\gamma(f, \theta) = \sum_{i=1}^{3} A_i \exp\left(-\frac{(f - \mu_i)^2}{2\sigma_i^2}\right),$$

where $A_i$ is the amplitude of the $i$-th Gaussian function, $\mu_i$ is its location, and $\sigma_i$ is its width; $\theta = [A\ \sigma]^{T}$ is the parameter vector, with $A = [A_1\ A_2\ A_3]$ and $\sigma = [\sigma_1\ \sigma_2\ \sigma_3]$. The first Gaussian function is centered at the frequency $f_{\max}$ of the main spectrum peak, the second at the frequency $f_2$ of the subspectrum peak, and the third between the main spectrum peak and the subspectrum peak, so that it captures any peak lying between them. Therefore, the following can be obtained:

$$\mu_1 = f_{\max}, \qquad \mu_2 = f_2, \qquad \mu_3 = \frac{f_{\max} + f_2}{2}.$$

The model parameter $\theta$ can be obtained by the weighted least squares method from suitable initial parameter values, and its optimal value is determined by the Levenberg-Marquardt method. Given the estimated value of $\sigma_1$, the bandwidth of the main spectrum peak is obtained as $\Delta f = 2.5\sigma_1$. Choosing $I_j$ in this way identifies the main eigenvalues while limiting the influence of noise. In the $j$-th iteration, signal components at other scales that do not match the frequency band $[f_{\max} - \Delta f, f_{\max} + \Delta f]$ are automatically discarded from this iteration and are decomposed in subsequent iterations. Furthermore, when the $j$-th SSDC is reconstructed in the $j$-th iteration, a scale factor $\hat{a}$ is used to adjust the difference between $g^{(j)}(n)$ and the residual signal $v^{(j)}(n)$:

$$\hat{a} = \arg\min_{a} \left\| v^{(j)} - a\, g^{(j)} \right\|_2^2,$$

where $\hat{a} = \dfrac{g^{T} v}{g^{T} g}$ and $\tilde{g}^{(j)}(n) = \hat{a}\, g^{(j)}(n)$.
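The scale factor described above is an ordinary least-squares projection coefficient, which a short Python sketch makes concrete. The function names are hypothetical, and the component and residual are assumed to be plain lists of equal length.

```python
def scale_factor(component, residual):
    """Least-squares a that minimises ||residual - a * component||^2."""
    numerator = sum(g * v for g, v in zip(component, residual))
    denominator = sum(g * g for g in component)
    if denominator == 0:
        raise ValueError("component must not be identically zero")
    return numerator / denominator


def scaled_component(component, residual):
    """Rescale the reconstructed component before it is subtracted."""
    a_hat = scale_factor(component, residual)
    return [a_hat * g for g in component]


g = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]          # residual is exactly twice the component
a_hat = scale_factor(g, v)   # 28 / 14 = 2.0
```

When the residual is an exact multiple of the component, as in this toy case, the rescaled component reproduces the residual and nothing is left for later iterations.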
2.2.3. Stopping Criterion for Iteration. The SSDC $\tilde{g}^{(j)}(n)$ obtained in each iteration is separated from the signal, and the residual signal is $v^{(j+1)}(n) = v^{(j)}(n) - \tilde{g}^{(j)}(n)$. The normalized mean square error (NMSE) between the residual signal and the original vibration signal is calculated as follows:

$$\mathrm{NMSE}^{(j)} = \frac{\sum_{i=1}^{N} \left(v^{(j+1)}(i)\right)^2}{\sum_{i=1}^{N} \left(x(i)\right)^2}.$$

The stopping criterion for iteration is that the NMSE falls below a threshold value, which can be set to 1%. If the NMSE is less than the threshold value, the iterative decomposition is terminated. If the NMSE is greater than the threshold value, the residual signal is taken as the input signal and the above iteration is repeated until the stopping criterion is satisfied. After the stopping criterion is satisfied, the final decomposition is as follows:

$$x(n) = \sum_{k=1}^{J} \tilde{g}^{(k)}(n) + v^{(J+1)}(n),$$

where $J$ is the number of decomposed SSDCs and $v^{(J+1)}(n)$ is the residual component when the iterative decomposition is terminated.

2.3. Multiscale Fractal Information Dimension. The original vibration signal with nonlinear and nonstationary characteristics is decomposed into a series of SSDCs by SSD, and combining multiscale analysis with the fractal information dimension makes it possible to quantitatively extract the complexity and sparsity of the vibration signal at different scales [12].
2.3.1. Multiscale Analysis. In multiscale analysis, a new data point is obtained by averaging $\tau$ adjacent data points, and the new data points form a new time domain signal at scale $\tau$. For a time domain signal $\{x_1, x_2, \ldots, x_N\}$ and a chosen scale factor $\tau$, a new time domain signal $\{y_j^{(\tau)}\}$ at scale $\tau$ is obtained; the specific process is shown in (10) and Figure 1. By changing the scale factor $\tau$, the expression of the original signal at different scales can be obtained:

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \qquad 1 \le j \le \lfloor N/\tau \rfloor. \tag{10}$$

2.3.2. Fractal Information Dimension. The fractal information dimension describes the complexity and sparsity of the signal geometry from a probabilistic perspective. The specific calculation process is as follows. For the signal sequence $\{x\} = \{x_1, x_2, \ldots, x_N\}$, in order to reduce the influence of noise, the difference of adjacent data points within the effective signal length is used as the reconstruction signal, that is, $x_0(i) = x(i+1) - x(i)$, $i = 1, 2, \ldots, N-1$. Assume that the space of the reconstruction signal is covered by a series of boxes of length $\varepsilon$ and that all boxes are numbered. If the probability that the data points of the reconstruction signal fall into the $i$-th box is $P_i$, the information entropy can be expressed as

$$I(\varepsilon) = -\sum_{i=1}^{N(\varepsilon)} P_i \lg P_i,$$

where $N(\varepsilon)$ is the number of boxes. If the information entropy satisfies $I(\varepsilon) \sim \lg \varepsilon^{-D_I}$, the information dimension of the signal can be defined as follows:

$$D_I = -\lim_{\varepsilon \to 0} \frac{I(\varepsilon)}{\lg \varepsilon} = \lim_{\varepsilon \to 0} \frac{\sum_{i=1}^{N(\varepsilon)} P_i \lg P_i}{\lg \varepsilon}.$$

2.4. Convolutional Neural Network. CNN is generally composed of an input layer, convolution layers, pooling layers, a full connection layer, and a classifier layer, and its basic structure is shown in Figure 2 [23-25].
2.4.1. Convolution Layer. The convolution layer contains a set of equally sized convolution kernels learned from the data. For different feature densities, the convolution kernels are convolved with the input feature matrix at a fixed stride, and the convolution feature map, which represents the response to the input features, is formed after adding a bias and applying a nonlinear activation function. The calculation process of the convolution feature map can be expressed as follows:

$$x_j^{l} = f\left(\sum_{i} x_i^{l-1} * k_{i,j}^{l} + b_j^{l}\right),$$

where $x_j^{l}$ and $x_i^{l-1}$ are the $j$-th feature map of the $l$-th layer and the $i$-th feature map of the $(l-1)$-th layer, respectively, the sum runs over the input feature maps, $k_{i,j}^{l}$ is the convolution kernel between the two feature maps, $b_j^{l}$ is the bias, and $f(\cdot)$ is the nonlinear activation function. Here, the Sigmoid function is used:

$$f(x) = \frac{1}{1 + e^{-x}}.$$

2.4.2. Pooling Layer. The pooling layer is usually cascaded with the convolution layer, and its function is to reduce the dimension of the convolution feature map. The pooling feature map is formed by downsampling the convolution feature map, and the calculation is governed by the template size of downsampling and the template weight. According to the sampling method, the pooling methods include maximum pooling, average pooling, and random pooling. Pooling computes aggregate statistics of the features in a contiguous region, and the regional feature is represented by, for example, its maximum value or average value. In this paper, maximum pooling is used.
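The two layer types described above can be illustrated with a dependency-free Python sketch. It is a simplification under stated assumptions: a single input feature map and a single kernel are used (the paper sums over several input maps), no padding is applied, the sliding-window product is written in the cross-correlation form common to CNN libraries, and all function names are illustrative.

```python
import math


def sigmoid(z):
    """Sigmoid activation, f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def conv2d_valid(feature_map, kernel, bias=0.0, stride=1):
    """Slide one kernel over one feature map, add bias, apply Sigmoid."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(feature_map) - kh) // stride + 1
    out_w = (len(feature_map[0]) - kw) // stride + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            z = bias
            for i in range(kh):
                for j in range(kw):
                    z += feature_map[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(sigmoid(z))
        out.append(row)
    return out


def max_pool(feature_map, pool_h, pool_w):
    """Non-overlapping maximum pooling with a pool_h x pool_w template."""
    out_h = len(feature_map) // pool_h
    out_w = len(feature_map[0]) // pool_w
    return [[max(feature_map[r * pool_h + i][c * pool_w + j]
                 for i in range(pool_h) for j in range(pool_w))
             for c in range(out_w)]
            for r in range(out_h)]
```

Because the Sigmoid output lies strictly between 0 and 1, every entry of a convolution feature map produced this way is bounded, and maximum pooling keeps only the strongest response in each template.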
2.4.3. Full Connection Layer. The full connection layer adopts the full connection mode and applies a vector transformation to the pooling feature map. The two-dimensional feature matrix of this layer is stretched into a one-dimensional feature vector, which is convenient for the subsequent output layer calculation. The full connection layer is still equivalent

In the testing stage of CNN, the output probability of each testing sample is calculated, and the state of the testing sample is determined according to the criterion of maximum probability. At present, the most commonly used classification methods in the classifier layer are the logistic method and the Softmax method. Because the degradation state recognition of planetary gear is a multipattern classification problem, the Softmax method is used in this paper.
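The maximum-probability criterion with a Softmax classifier can be sketched as follows. The function names and the example scores are illustrative assumptions, and subtracting the largest score before exponentiation is a standard numerical safeguard rather than something stated in the paper.

```python
import math


def softmax(scores):
    """Convert raw output-layer scores into class probabilities."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_state(scores):
    """Index of the class with the largest Softmax probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


example_scores = [0.1, 2.0, -1.0]   # hypothetical outputs for 3 classes
predicted = predict_state(example_scores)
```

Since Softmax is monotonic in the scores, the predicted class is simply the one with the largest raw score; the probabilities matter mainly during training and when a confidence value is needed.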

Experiment Introduction
The simulation experiment of the degradation state of planetary gear is carried out on a comprehensive mechanical fault simulation test bench, whose basic structure is shown in Figure 3.
In this experiment, the degradation states of two types of planetary gear faults are simulated: the degradation states of broken planetary gear and the degradation states of pitting planetary gear. Normal planetary gear and the different degradation states are shown in Figure 4. Based on the basic parameters of the planetary gearbox of the test bench and a preliminary analysis of its vibration signal, the feature frequency and its side frequencies lie in the 20 Hz-640 Hz band, and the highest and most prominent natural frequencies of the planetary gearbox lie in the 2800 Hz-3200 Hz band. According to the Nyquist sampling principle, the sampling frequency should cover the fault information contained in the feature frequency band while also taking the natural frequencies of the planetary gearbox into account. In addition, because an excessively high sampling frequency increases the calculation amount and reduces the calculation efficiency, the sampling frequency of the experiment is set to 6400 Hz, which gives a Nyquist frequency of 3200 Hz, the upper edge of the natural-frequency band. The basic parameter settings of the experiment are shown in Table 1.

Experiment Analysis
The simulation experiment of the degradation state of planetary gear is carried out, and the obtained vibration signals of normal planetary gear and the different degradation states are shown in Figure 5. Because the vibration signal of planetary gear is nonlinear and nonstationary and the differences among the vibration signals of the various degradation states are tiny, there is no obvious feature difference among the vibration signals in the time domain, and the planetary gear state cannot be recognized from Figure 5 alone. Next, the vibration signals in Figure 5 are processed to verify the validity of the proposed degradation state recognition method. First, the vibration signals are processed by SSD; to save space, the SSD of the vibration signal of breakage level 2 is taken as an example. In the decomposition process, SSD automatically realizes the embedding dimension selection, and the frequency bands of the decomposition result are segmented adaptively by automatic component grouping. The obtained SSDCs of the vibration signal of breakage level 2 are shown in Figure 6.
It can be seen from Figure 6 that the nonlinear and nonstationary vibration signal of breakage level 2 is decomposed into 9 SSDCs by a series of iterations (for convenience of expression, the final residual component is denoted SSDC9), and SSDC1-SSDC9 are segmented from low frequency to high frequency. SSDC1-SSDC4 have a distinct periodicity, indicating that SSD can separate the periodic components hidden in the original vibration signal, and the high frequency information contained in the original vibration signal is separated into SSDC5-SSDC9. After SSD, each SSDC contains a large amount of feature information that is beneficial for recognizing the degradation state of planetary gear. Next, the multiscale information dimension is used to extract and quantify the feature information of each SSDC at different time scales.
In the calculation of the multiscale information dimension, the maximum scale factor $\tau$ is set to 50. For each SSDC, new time series at 50 time scales are obtained, and the information dimension feature is then extracted from each new time series. This realizes a multidimensional feature expression that ranges from global signal information to detail signal information. The multiscale information dimensions of the SSDCs of normal planetary gear and the different degradation states are shown in Figure 7.
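The feature extraction described above can be sketched in Python as follows. This is an illustrative sketch under several assumptions that are not stated in the paper: the differenced signal is normalised to [0, 1] before boxes are laid out, the limit in the information dimension is replaced by a least-squares slope over a few fixed box sizes, and all function names and default box sizes are hypothetical.

```python
import math


def coarse_grain(signal, scale):
    """Average consecutive, non-overlapping windows of `scale` points."""
    n_points = len(signal) // scale
    return [sum(signal[j * scale:(j + 1) * scale]) / scale
            for j in range(n_points)]


def information_entropy(values, eps):
    """I(eps) = -sum(P_i * lg P_i) over boxes of length eps on [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # every point falls into the same box
    n_boxes = math.ceil(1.0 / eps)
    counts = {}
    for v in values:
        idx = min(int((v - lo) / (hi - lo) / eps), n_boxes - 1)
        counts[idx] = counts.get(idx, 0) + 1
    total = len(values)
    return -sum(c / total * math.log10(c / total) for c in counts.values())


def information_dimension(signal, box_sizes=(0.5, 0.25, 0.125, 0.0625)):
    """Slope of I(eps) against lg(1/eps), fitted by least squares."""
    if len(signal) < 3:
        raise ValueError("signal is too short to difference")
    diffs = [b - a for a, b in zip(signal, signal[1:])]
    xs = [math.log10(1.0 / eps) for eps in box_sizes]
    ys = [information_entropy(diffs, eps) for eps in box_sizes]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def multiscale_information_dimension(signal, max_scale=50):
    """One information dimension per time scale 1..max_scale."""
    return [information_dimension(coarse_grain(signal, tau))
            for tau in range(1, max_scale + 1)]
```

Applying `multiscale_information_dimension` to each of the 9 SSDCs and stacking the results row by row would give a 9 × 50 feature matrix of the kind used as the CNN input.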
As shown in Figure 7, for the same SSDC, the information dimensions extracted at different time scales differ to some extent; the values of information dimension show an upward trend from SSDC1 to SSDC9 and a downward trend as the time scale increases. The vibration signal of normal planetary gear is relatively regular and simple, whereas when a fault occurs, the stiffness of planetary gear undergoes a local nonlinear variation that makes the vibration signal nonlinear and nonstationary. As a result, the multiscale information dimension of normal planetary gear is smaller than that of the faulty degradation states. In addition, Figure 7 shows that the multiscale information dimensions of each SSDC also differ among the degradation states of planetary gear faults, and these differences carry the information used to distinguish the degradation states. For example, for breakage level 1, the information dimensions of SSDC7-SSDC9 are obviously smaller than those of the other degradation states when the time scale is relatively large (greater than 35). For breakage level 3, the information dimensions of SSDC1, SSDC4, and SSDC6 increase significantly when the time scale is near 17. For pitting level 1, the information dimensions of SSDC7 at time scales 5-10 are obviously different from those of the other degradation states. Next, on the basis of the multiscale information dimension of each SSDC, CNN, which can fully consider the internal relationship among the information dimensions of adjacent SSDCs and adjacent scales, is used to recognize the degradation state of planetary gear.
The structure of CNN is built as follows. The feature matrix constructed from the multiscale information dimension of each SSDC is defined as the input of CNN, so the feature matrix size of the input layer is 9 × 50. The hidden layers consist of two convolution layers and two pooling layers arranged alternately. In convolution layer 1, the number of convolution kernels is 6, the kernel size is 3 × 3, the stride is 1, and the Sigmoid function is selected as the activation function. In pooling layer 1, the size of the pooling area is 1 × 6, and the pooling areas do not overlap. In convolution layer 2, the number of convolution kernels is 12, and the other parameters are the same as in convolution layer 1. In pooling layer 2, the size of the pooling area is 1 × 3, and the pooling areas do not overlap. In addition, the feature dimension of the full connection layer is set to 120. Because 9 types of planetary gear states need to be recognized by CNN in this experiment, the number of neurons in the output layer is set to 9, and the Softmax classifier is used. The basic parameters of CNN are shown in Table 2.
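The layer sizes above can be checked with a short shape calculation. The sketch assumes convolution without padding, which the text does not state explicitly but which is consistent with the reported full connection dimension of 120; the helper names are illustrative.

```python
def conv_output(shape, kernel=(3, 3), stride=1):
    """Output height and width of an unpadded convolution."""
    h, w = shape
    return ((h - kernel[0]) // stride + 1, (w - kernel[1]) // stride + 1)


def pool_output(shape, pool):
    """Output height and width of non-overlapping pooling."""
    h, w = shape
    return (h // pool[0], w // pool[1])


input_shape = (9, 50)                    # 9 SSDCs x 50 time scales
conv1 = conv_output(input_shape)         # 6 maps of 7 x 48
pool1 = pool_output(conv1, (1, 6))       # 6 maps of 7 x 8
conv2 = conv_output(pool1)               # 12 maps of 5 x 6
pool2 = pool_output(conv2, (1, 3))       # 12 maps of 5 x 2
flattened = 12 * pool2[0] * pool2[1]     # 12 * 5 * 2 = 120
```

The flattened size of 120 matches the feature dimension of the full connection layer, which suggests that the pooling templates were chosen to shrink only the time-scale axis while leaving the SSDC axis to the convolutions.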
For the training of CNN, 200 training samples are randomly selected for each planetary gear state, giving 1800 training samples in all. The training samples are processed by SSD and the multiscale information dimension, and the obtained feature matrices are used as the input of CNN for training. There are 9 neurons in the output layer of CNN, and the output vectors correspond to the 9 types of planetary gear states. The learning rate is set to 1, and the number of iterations is set to 100. The training process of CNN is shown in Figure 8. It can be seen from Figure 8 that the mean square error of the training samples becomes stable after 65 iterations, at which point the training of CNN is complete. Next, the ability of CNN to recognize the degradation state of planetary gear is verified: 100 testing samples are randomly selected for each planetary gear state, a total of 900 testing samples, and they are recognized by the trained CNN. The recognition result is shown in Figure 9. Meanwhile, to illustrate the advantage of CNN, two other recognition methods, BP neural network [26] and SVM [27], are used for comparison. The same training samples and testing samples as those used for CNN are used to train and test BP neural network and SVM. The recognition results of CNN, BP neural network, and SVM for the testing samples are shown in Table 3.
It can be seen from Figure 9 and Table 3 that the proposed degradation state recognition method, which combines the multiscale information dimension of SSD with CNN, obtains better recognition results. The overall recognition rate of CNN is 97.2%, and for normal planetary gear the recognition rate of CNN is 100%. The planetary gear state with the lowest CNN recognition rate is pitting level 1, which still reaches 94%. In contrast, when BP neural network and SVM are used to recognize the degradation state of planetary gear, their recognition rates are significantly lower than that of CNN. The overall recognition rate of BP neural network is only 88.3%, and its highest recognition rates, for breakage level 2 and pitting level 3, reach only 91%. With SVM, the recognition rates of the various planetary gear states improve to some extent and the overall recognition rate reaches 93.4%, but a gap remains compared with CNN. The experimental results show that the proposed method is a reliable, accurate, and effective method for the degradation state recognition of planetary gear.
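For reference, the overall and per-state recognition rates discussed above are simple ratios of correctly recognized samples to tested samples. The following sketch shows the calculation; the function name and the toy labels are hypothetical and are not data from the experiment.

```python
def recognition_rates(true_labels, predicted_labels):
    """Return (overall rate, per-class rates) from two label sequences."""
    totals, correct = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        totals[t] = totals.get(t, 0) + 1
        if t == p:
            correct[t] = correct.get(t, 0) + 1
    per_class = {c: correct.get(c, 0) / n for c, n in totals.items()}
    overall = sum(correct.values()) / len(true_labels)
    return overall, per_class


# Hypothetical toy labels, for illustration only.
truth = ["normal", "normal", "pitting 1", "pitting 1"]
guess = ["normal", "normal", "pitting 1", "normal"]
overall, per_class = recognition_rates(truth, guess)
```

With equal numbers of testing samples per state, as in this experiment, the overall rate equals the mean of the per-state rates.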

Conclusions
A new degradation state recognition method for planetary gear based on the multiscale information dimension of SSD and CNN is proposed in this paper. SSD, which was developed from SSA, is suitable for processing the nonlinear and nonstationary vibration signal generated by planetary gear, and it determines the embedding dimension and the component grouping automatically. The original vibration signal is converted into a series of SSDCs that are easy to analyze, and the feature information of planetary gear is contained in each SSDC. For these SSDCs, the multiscale information dimension, a quantitative feature extraction method that combines multiscale analysis and the fractal information dimension, is studied, and it realizes a multidimensional feature expression of each SSDC from global signal information to detail signal information. The feature matrix composed of the multiscale information dimension of each SSDC is defined as the input of CNN, and training samples covering the different planetary gear states are used to train CNN, so that the trained CNN can effectively recognize the degradation state of planetary gear. The experimental results show that the proposed method is suitable for processing and analyzing the vibration signal of planetary gear, and the overall recognition rate over the various planetary gear states is 97.2%. The method is effective for feature extraction and degradation state recognition of planetary gear.

Figure 2: The basic structure of CNN.

Figure 4: Normal planetary gear and different degradation states.

Figure 6: The obtained SSDCs of the vibration signal of degradation state of breakage level 2.

Figure 8: The training process of CNN.

Figure 9: The recognition result of CNN for testing samples.

Table 1: Parameter settings of the experiment process.

Table 2: Basic parameters of CNN.

Table 3: The recognition results of CNN, BP neural network, and SVM.