Compound Fault Diagnosis of Rolling Bearing Based on ALIF-KELM

To address the difficulty of classifying rolling bearing compound faults and the resulting low recognition accuracy, a compound fault diagnosis method for rolling bearings that combines ALIF and KELM is proposed. First, the basic concepts of ALIF and KELM are introduced. ALIF is then used to decompose the vibration signal samples of different bearing states into several IMFs per sample; the top K IMFs containing the main fault information are selected from each sample, the energy feature and sample entropy of each selected IMF are calculated, and a fault feature vector of dimension 2K is constructed. Finally, the feature vectors of the training set and the test set are input into the KELM model for fault classification. Experimental results show that, compared with the EMD-KELM, ALIF-ELM, ALIF-BP, and IFD-KELM models, the ALIF-KELM method achieves higher classification accuracy for rolling bearing compound faults.


Introduction
Rolling bearings are one of the basic components and play an important role in various types of industrial equipment. Rolling bearings have been widely used in many engineering fields. However, the actual working environments of rolling bearings are very harsh. After an extended period of operation, these components are prone to failure. In addition to a single failure, the failure types can also easily present as composite failure formed due to simultaneous occurrences of multiple types of failures [1]. Statistical analysis [2] indicates that approximately 30% of all rotating machinery equipment failures are caused by failure of rolling bearings. Consequently, effective monitoring of the integrity health status of rolling bearings and timely elimination of hidden issues play an important role in ensuring safe and reliable equipment operation, reduction in economic and capital losses, and avoiding accidents.
In view of the above situation, most methods currently proposed for composite fault diagnosis of rolling bearings are based on vibration signal processing, in which signal decomposition is one of the effective tools. In 1998, Huang et al. [3,4] proposed the empirical mode decomposition (EMD) algorithm. Ma Xinna and others combined EMD with an adaptive notch filter to realize the adaptive separation and diagnosis of rolling bearing composite faults. However, EMD lacks a strict mathematical derivation, and singular points in the signal easily lead to mode aliasing. In addition, its cubic spline interpolation is prone to underfitting or overfitting and is unstable under noise interference. To address these problems, researchers have proposed many adaptive mode decomposition methods inspired by EMD, including local mean decomposition (LMD), empirical wavelet transform (EWT), and variational mode decomposition (VMD) [5][6][7]. Huang et al. [8] extended local mean decomposition to a complex local mean decomposition and successfully applied it to the composite fault diagnosis of rolling bearings. Zhu et al. [9] proposed a parameterized local eigenscale decomposition method to address the discontinuity of the first derivative in local eigenscale decomposition, applied it to composite fault simulation signals and bearing experimental signals, and demonstrated its effectiveness and superiority through comparative analysis. Hu et al. [10] optimized several important parameters of variational mode decomposition to improve the decomposition performance. They also used the 1.5-dimensional spectrum to suppress noise and enhance the impact signal, and combined the two to achieve effective separation of composite faults in rolling bearings.
To improve the stability and convergence of the mean function of the upper and lower envelopes under disturbances, Lin et al. [11] proposed the iterative filter (IF) algorithm, which follows the same framework as EMD but uses low-pass filtering to obtain the mean function of the signal envelope. In 2016, Cicone et al. [12] extended the IF algorithm by using the fundamental solution system of the Fokker-Planck (FP) differential equations as the filter function, and proposed the Adaptive Local Iterative Filter (ALIF) algorithm. ALIF can effectively analyse and process nonlinear and nonstationary signals, and it has been increasingly applied to rotating machinery fault diagnosis. Chen et al. [13] combined ALIF with energy operator demodulation to diagnose the fault characteristic frequencies of rolling bearings. Zhang et al. [14] proposed a method based on ALIF and high-order energy operator demodulation and successfully identified weak fault components during the early fault stages of rolling bearings; the method proved superior to low-order energy operator demodulation.
In recent years, machine learning has enabled more successful intelligent fault diagnosis algorithms [15][16][17][18]. Scholars worldwide have continued to study intelligent recognition algorithms based on the BP neural network, which have been widely applied to rolling bearing fault diagnosis with relatively sound results [19][20][21]. However, the BP neural network requires iterative calculations during learning and sometimes falls into a local minimum, which makes the algorithm time intensive and limits the generalization ability of the network [22].
To address the above problems, Huang et al. [23] proposed the extreme learning machine (ELM), which is based on the single-hidden layer feedforward network (SLFN). Owing to its performance, the algorithm has gradually attracted attention from scholars in diverse fields and is significant for the development of intelligent diagnosis technology for rolling bearing faults. Because bearing vibration signals are nonstationary, researchers usually combine nonstationary signal processing and analysis methods with the ELM algorithm to study the intelligent diagnosis of rolling bearing faults. Xu and Ma [24] combined the empirical wavelet transform with ELM for the intelligent diagnosis of rolling bearing faults and used bearing experimental data to demonstrate the feasibility of the method. When the intelligent diagnosis model remains unchanged, the construction of the fault feature vector has an important influence on the diagnosis result. KELM is an improved algorithm based on ELM, proposed by Huang et al. [25,26]. The original algorithm is first optimized, and a kernel function then replaces the activation function of the hidden layer to make the model more stable and general. The KELM algorithm has improved generalization ability and is better suited to multiclass classification problems. This paper proposes a composite fault diagnosis method for rolling bearings that combines the Adaptive Local Iterative Filter (ALIF) and KELM.

Adaptive Local Iterative Filter Algorithm
Adaptive Local Iterative Filter (ALIF) is a new type of adaptive mode decomposition method that improves on the iterative filtering (IF) algorithm. ALIF constructs a filter function with adaptive characteristics from the fundamental solution system of the Fokker-Planck differential equations. It is therefore necessary to introduce the IF algorithm before presenting the principle of the ALIF algorithm.

Iterative Filter.
IF is similar to the EMD algorithm in that it iteratively filters out each intrinsic mode function (IMF) component. The method convolves a filter function with the signal to be decomposed to obtain the sliding operator, which replaces the envelope-mean fitting step of the EMD algorithm. IF consists of two processes: an inner loop and an outer loop.
Given the preprocessed signal X(t) and the filter function f(t), the sliding operator Γ(X(t)) is obtained by convolving X(t) with f(t) (formula (1)), where f(t) is a fixed low-pass filter function and h(z) is the filter interval. The filter interval is calculated by formula (2), where N is the signal length of X(t), λ is a preset value, and a is the number of extreme points of X(t). The fluctuation operator K(X(t)) is then calculated as the difference between the preprocessed signal X(t) and the sliding operator Γ(X(t)) (formula (3)). Finally, the fluctuation operator K(X(t)) is checked against the conditions for an IMF component; only a fluctuation operator that meets these conditions is extracted as an IMF component. Otherwise, the fluctuation operator is screened further, as follows:
(1) Calculate the filtering interval h(z) of the preprocessed signal according to formula (2).
(2) Solve the sliding operator Γ(X(t)) according to formula (1).
(3) Calculate the fluctuation operator K(X(t)) according to formula (3), applying it repeatedly during the screening process.
Let IMF(t) denote the limit of the repeatedly screened fluctuation operator as the number of screening iterations n tends to infinity. When IMF(t) meets the IMF component conditions, the extraction of the IMF component is complete; otherwise, the steps above are repeated until the conditions are met. In practice n cannot approach infinity, so a termination condition is set for the screening: when σ is less than a specified threshold, the screening stops and IMF(t) is the filtered IMF component.
The above is the inner loop, whose main purpose is to extract qualified IMF components; the function of the outer loop is to stop the inner loop. The remainder after all effective IMF components of the preprocessed signal have been extracted is defined as the residual signal, denoted c(t). When the residual signal c(t) shows an obvious trend, that is, it has at most one extreme point, the entire iterative filtering process is halted. Otherwise, c(t) is used as a fresh preprocessed signal from which further qualified IMF components are extracted.
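The inner and outer loops described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the triangular low-pass window, the value λ = 1.6, the tolerance, the iteration caps, the reflective padding at the signal ends, and all function names are choices made here for the example.

```python
import numpy as np

def count_extrema(x):
    """Count interior local extrema via sign changes of the first difference."""
    d = np.diff(x)
    return int(np.sum(d[:-1] * d[1:] < 0))

def mask_half_width(x, lam=1.6):
    """Half of the filter interval, assumed here to be 2*floor(lam*N/a)."""
    a = max(count_extrema(x), 1)          # number of extreme points
    half = int(np.floor(lam * len(x) / a))
    return max(1, min(half, (len(x) - 1) // 2))

def sliding_operator(x, half):
    """Convolve x with a normalised triangular low-pass window (assumed filter)."""
    w = np.bartlett(2 * half + 3)[1:-1]   # 2*half+1 strictly positive weights
    w = w / w.sum()
    padded = np.pad(x, half, mode="reflect")
    return np.convolve(padded, w, mode="valid")

def extract_imf(x, lam=1.6, tol=1e-3, max_iter=100):
    """Inner loop: subtract the sliding operator until the change is small."""
    half = mask_half_width(x, lam)
    k = x.copy()
    for _ in range(max_iter):
        k_next = k - sliding_operator(k, half)   # fluctuation operator
        sigma = np.sum((k_next - k) ** 2) / (np.sum(k ** 2) + 1e-12)
        k = k_next
        if sigma < tol:                          # screening termination
            break
    return k

def iterative_filter(x, max_imfs=6, **kwargs):
    """Outer loop: peel off IMFs until the residual has at most one extremum."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    while len(imfs) < max_imfs and count_extrema(residual) > 1:
        imf = extract_imf(residual, **kwargs)
        imfs.append(imf)
        residual = residual - imf
    return np.array(imfs), residual
```

By construction the extracted components and the residual sum back to the input signal, which is a convenient sanity check when experimenting with other filter shapes.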

Adaptive Local Iterative Filter.
In IF, the filter function is generally set in advance to reduce the negative impact of noise, but for some complex signals this fixed choice lacks adaptability and may also distort the component waveforms. To analyse nonlinear and nonstationary signals more effectively and overcome the shortcomings of the IF algorithm, Cicone et al. were inspired by the diffusion process of partial differential equations and used the solution of the Fokker-Planck equation to construct the filter function. The resulting filter is compactly supported in the time domain, its length can be changed flexibly, and its adaptability is enhanced. Moreover, it avoids false components in the iterative filtering process [14]. This allows ALIF to suppress the noise sensitivity and mode aliasing of the IF algorithm.
For an interval (a, b), let p(x) and q(x) be two differentiable functions that satisfy two conditions on that interval. The Fokker-Planck equation is written in terms of p(x) and q(x) and, to simplify it, is converted to the corresponding differential equation, where α and β are called steady-state coefficients and α, β ∈ (0, 1).
The term (p(x)g)_x in equation (9) produces an aggregation effect, which moves the solution g(x) from the two endpoints of the interval [a, b] towards its center. At the same time, the term (q^2(x)g)_xx produces a diffusion effect, which spreads the solution g(x) from the center of [a, b] towards the two endpoints. When the two effects are balanced, the differential equation has a nonzero solution that satisfies the required conditions. The solution g(x) of the Fokker-Planck equation is the filter function f(t) used in iterative filtering. For different intervals [a, b], the solution, and hence the expression of the filter function f(t), differs, which allows the ALIF algorithm to determine the filter function adaptively.
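For reference, the Fokker-Planck formulation commonly used to derive the ALIF filter can be written as follows. This is a sketch of the standard form, assuming that p, q, g, α, and β play the roles described above; u(x, t) is a name introduced here for the time-dependent solution, and the paper's own equation numbering is not reproduced.

```latex
\begin{align*}
  % Conditions assumed on p and q over the interval (a, b)
  & q(a) = q(b) = 0, \quad q(x) > 0 \ \text{for } x \in (a, b),
    \qquad p(a) < 0 < p(b) \\[4pt]
  % Fokker-Planck equation: aggregation term and diffusion term
  & u_t = -\alpha \,\bigl(p(x)\,u\bigr)_x + \beta \,\bigl(q^2(x)\,u\bigr)_{xx} \\[4pt]
  % Steady state: the two effects balance
  & -\alpha \,\bigl(p(x)\,g\bigr)_x + \beta \,\bigl(q^2(x)\,g\bigr)_{xx} = 0 \\[4pt]
  % Nonzero solution supported on the interval, used as the filter
  & g(x) \ge 0 \ \text{for } x \in (a, b), \qquad g(x) = 0 \ \text{for } x \notin (a, b)
\end{align*}
```

Read this way, the first-order term pulls mass towards the interior of the interval and the second-order term spreads it out, so the steady-state solution g is a smooth bump supported on [a, b] whose width follows the chosen interval.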

Kernel-Based Extreme Learning Machine Algorithm

Extreme Learning Machine.
The ELM network structure is shown in Figure 1. The network consists of three layers: an input layer, a hidden layer, and an output layer [27]. The ELM model requires the number of hidden layer nodes and the type of activation function to be specified, while the input weights and hidden layer thresholds are randomly generated and remain unchanged throughout learning. The output weights are then solved by the least squares method so that the training error is minimized.
Assume there are N data samples. The output of an SLFN with L hidden layer nodes is expressed in terms of the following quantities: β_i is the connection weight between the hidden layer and the output layer; ω_i is the connection weight between the input layer and the hidden layer; x_j is the input vector, that is, the feature vector of the j-th sample; b_i is the hidden layer threshold; g(x) is the hidden layer activation function; and t_j is the output vector, that is, the class label of the j-th sample. Assuming that the activation function g(x) is infinitely differentiable, the goal of ELM learning is to minimize the output error, that is, to make it approach 0. There then exist β_i, ω_i, and b_i for which the network output equals the target for every sample (formula (14)). Formula (14) can be abbreviated in matrix form, where H is the hidden layer output matrix and T is the expected output matrix. Since the input parameters of the ELM algorithm are randomly generated and remain unchanged, they need no adjustment during training. The output weight β connecting the hidden layer and the output layer is obtained at minimum error by applying H^+, the Moore-Penrose generalized inverse of H, to T.
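The ELM training procedure described above (random, fixed input weights followed by a least squares solve for the output weights) can be sketched as follows. This is an illustrative example only: the sigmoid activation, the uniform weight range, the number of hidden nodes, and the function names are assumptions made here, not details taken from the paper.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden layer output matrix H for a sigmoid activation (assumed)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_train(X, T, n_hidden=50, seed=0):
    """Draw random input weights once, then solve the output weights via H^+ T."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden thresholds
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ T     # Moore-Penrose generalized inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for new samples; argmax over columns gives the class."""
    return hidden_output(X, W, b) @ beta
```

Here T is a one-hot label matrix with one row per sample. Because W and b depend on the random seed, repeated runs generally give different output weights, which is the source of the run-to-run variation of ELM results.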

Kernel-Based Extreme Learning Machine.
The kernel extreme learning machine (KELM) builds on the extreme learning machine for single-hidden layer feedforward neural networks. By introducing kernel function mapping and regularization theory to optimize the network, it improves accuracy and generalization ability while reducing the complexity and randomness of the network. The extreme learning machine can be formulated as an optimization problem in which C is the penalty coefficient, ξ is the training error, and h(x) is the output row vector of the hidden layer. Solving this optimization problem gives the improved output function of ELM, expressed in terms of the hidden layer matrix H. Regarding h(x_i) as the nonlinear mapping of each sample, HH^T takes the form of inner products of h(x_i). Using kernel function theory, the kernel matrix Ω_ELM is defined to replace HH^T, which overcomes the fluctuation in the final result of the ELM algorithm caused by randomly generated inputs. After rearranging formulas (20)-(22) and substituting them into formula (19), the new output function of KELM is obtained, where T is the label matrix of the data set, Ω_ELM is a symmetric matrix with N rows and N columns, and K(x_i, x_j) is the kernel function. This paper uses the Gaussian radial basis kernel function, in which λ is the kernel coefficient.
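A compact sketch of KELM in its usual regularized-kernel form is given below. It assumes the common closed form in which the output weights are (I/C + Ω_ELM)^(-1) T, and it assumes the Gaussian kernel is exp(-||x_i - x_j||^2 / λ); the exact scaling of λ in the paper's kernel may differ, and the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lam=0.5):
    """Gaussian RBF kernel matrix; the form exp(-d^2 / lam) is an assumption."""
    d2 = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / lam)

def kelm_train(X, T, C=1.0, lam=0.5):
    """Solve (I/C + Omega) alpha = T, where Omega is the N x N kernel matrix."""
    omega = rbf_kernel(X, X, lam)
    return np.linalg.solve(np.eye(len(X)) / C + omega, T)

def kelm_predict(X_new, X_train, alpha, lam=0.5):
    """Output function: kernel row vector of each new sample times alpha."""
    return rbf_kernel(X_new, X_train, lam) @ alpha
```

No random weights appear anywhere in this formulation, so for fixed C and λ the result is deterministic, which matches the stability argument made above for replacing HH^T with a kernel matrix.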

Classification Algorithm Based on the Combination of ALIF and KELM
As introduced above, ALIF can effectively decompose nonlinear and nonstationary vibration signals, and the IMF components obtained from the decomposition can be analysed further to extract the local characteristics of the fault signal. Compared with traditional neural network algorithms, KELM has strong generalization ability together with high efficiency and stability. This paper therefore combines the two and proposes a diagnostic method for rolling bearing composite faults based on ALIF and KELM. The specific steps are as follows: (1) The ALIF decomposition is applied to the vibration signal samples of the different bearing states, so that each sample yields several IMF components and a residual component.

Experimental Data Processing.
The data used for analysis and verification in this study come from the Xi'an Jiaotong University rolling bearing accelerated life test. The experimental setup is shown in Figure 3 [28]. The sampling frequency is 25.6 kHz, the sampling interval is 1 min, and each sampling lasts 1.28 s, so each record in the data set contains 32768 sampling points. The vibration signals collected in the experiment cover the full life of the rolling bearings from normal operation to failure and comprise 15 data sets under 3 working conditions. In the following sections, four data sets are used to analyse and verify the method proposed in this paper. The data description is shown in Table 1. The bearing data used in the subsequent analysis are the failure data intercepted from the whole life cycle. From the data sets listed in Table 1, 102400 sampling points are obtained for each of five states: outer ring fault, cage fault, inner ring and outer ring composite fault, inner ring fault, and normal state. The data of each state are divided into 50 samples of 2048 sampling points each, which gives 250 samples in total for the five states. One sample from each of the five bearing states is plotted as a time-domain waveform in Figure 4.
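The sample counts quoted above follow from simple arithmetic, which the short sketch below reproduces. The function name and the non-overlapping segmentation are assumptions made for the example; the paper does not state whether the segments overlap.

```python
import numpy as np

def segment_signal(signal, n_samples=50, sample_len=2048):
    """Cut a 1-D record into n_samples non-overlapping samples (assumed layout)."""
    needed = n_samples * sample_len
    signal = np.asarray(signal)
    if signal.size < needed:
        raise ValueError(f"need at least {needed} points, got {signal.size}")
    return signal[:needed].reshape(n_samples, sample_len)

# 25.6 kHz sampled for 1.28 s gives the points per recorded file.
points_per_record = round(25600 * 1.28)

# 50 samples of 2048 points use 102400 points per bearing state.
samples = segment_signal(np.zeros(102400))
```

With five bearing states this yields the 250 samples of 2048 points each that are used in the rest of the experiments.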

Fault Feature Analysis.
Fault feature vectors must be constructed before KELM can perform intelligent fault diagnosis, and selecting appropriate features helps improve the diagnosis accuracy. Therefore, before the method is verified, the fault features used in this paper are analysed briefly.
Table 1: Data description.
Data set    Failure type                                    Speed (r/min)
Bearing1    Outer ring failure                              2100
Bearing2    Compound failure of inner ring and outer ring   2100
Bearing3    Inner ring failure                              2250
Bearing4    Cage failure                                    2250

One sample is selected from each of the five states of the rolling bearing, the same parameters are set for all five samples, and ALIF decomposition is applied to obtain five IMF components and one residual component. The energy features and sample entropy of the first 4 IMF components are calculated; the results are shown in Figures 5(a) and 5(b). [...] bearing state features in the IMF4 component basically overlap, which is an obstacle to identifying the type of bearing fault. When both the energy feature and the sample entropy are used in the fault feature vector, the bearing states whose energy features are aliased can be distinguished effectively by the sample entropy.
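The two features discussed here can be computed per IMF as sketched below. The energy is taken as the sum of squared amplitudes, and the sample entropy uses the conventional defaults m = 2 and r = 0.2 times the standard deviation; these parameter values, the unnormalized energy, and the function names are assumptions for illustration, since the paper's settings are not given in this section.

```python
import numpy as np

def imf_energy(imf):
    """Energy feature of one IMF: sum of squared amplitudes (assumed definition)."""
    return float(np.sum(np.square(imf)))

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy with Chebyshev distance; m and r_factor are assumed defaults."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = r_factor * np.std(x)

    def count_matches(length):
        # Use the same n - m template start points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= r))
        return count

    b = count_matches(m)        # matches of length m
    a = count_matches(m + 1)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")
    return float(-np.log(a / b))

def feature_vector(imfs, k=4):
    """Concatenate energies and sample entropies of the first k IMFs (2k values)."""
    top = imfs[:k]
    return np.array([imf_energy(c) for c in top]
                    + [sample_entropy(c) for c in top])
```

A regular component such as a pure tone gives a low sample entropy and an irregular, noise-like component gives a high one, which is why the entropy can separate states whose IMF energies are similar.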

Experimental Results and Analysis.
In order to intuitively distinguish the different operating states of rolling bearings in the subsequent analysis, the sample data of the 4 types of faults and normal states are divided into 5 categories, and the specific category labels are given as shown in Table 2.
First, the ALIF algorithm is used to decompose the vibration signals of all samples into 5 IMF components and 1 residual component each. The energy features and sample entropy of the first 4 IMF components are calculated, which gives a fault feature matrix of size 250 × 8. Then 30 samples of each state are selected as the training set and the remaining 20 as the test set, and the fault feature data set is normalized so that the features are on a comparable scale.
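The split and normalization step can be sketched as follows. The choice of the first 30 samples of each class for training, the min-max scaling to [0, 1], and the fitting of the scaling on the training set only are assumptions made here; the paper does not specify the selection rule or the normalization formula.

```python
import numpy as np

def split_per_class(labels, n_train=30):
    """Return train/test indices, taking the first n_train samples of each class."""
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return np.array(train_idx), np.array(test_idx)

def min_max_scale(train, test):
    """Scale each feature to [0, 1] using the training minimum and maximum."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant features
    return (train - lo) / span, (test - lo) / span
```

For a 250 × 8 feature matrix with 50 samples per state, this produces 150 training rows and 100 test rows, matching the counts used in the experiment.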
Finally, the Gaussian radial basis kernel function in equation (25) is used as the kernel function of KELM, with the kernel coefficient λ = 0.5 and the penalty coefficient C = 1. This completes the initialization of the KELM intelligent diagnosis model. The 150 training samples are input into the KELM model for training, and the 100 test samples are then used for testing; the result is shown in Figure 6. The abscissa in Figure 6 represents the 100 test samples, with 20 test samples for each of the 5 bearing states; the vertical axis shows the category labels of the bearing states, corresponding to Table 2. Only one of the 100 test samples was misdiagnosed: a composite fault of the inner ring and the outer ring was misdiagnosed as a cage failure, while the remaining 99 test samples were diagnosed correctly. The overall fault diagnosis accuracy is therefore 99%, with a single fault diagnosis accuracy of 100% and a compound fault diagnosis accuracy of 95%. These results support the effectiveness of the proposed intelligent diagnosis method for rolling bearing composite faults based on ALIF and KELM.
However, in practical engineering applications, the sample data available for equipment is usually limited. The next step is therefore to study whether the proposed method can still achieve high diagnosis accuracy on the test samples when fewer training samples are available. The number of training samples for each failure type is set to 30, 25, 20, 15, 10, 5, and 1 in turn, so that the corresponding number of test samples is 20, 25, 30, 35, 40, 45, and 49; this gives 7 different results for ALIF-KELM's diagnosis of rolling bearing faults. The relationship between the number of training samples and the diagnosis accuracy is shown in Figure 7. Even when there is only 1 training sample per failure type, the overall accuracy of ALIF-KELM still reaches 81.63%. When the number of training samples is 10, the diagnosis accuracy is still as high as 99%; the detailed diagnosis is shown in Figure 8. In this case, of the 200 test samples, only 2 inner and outer ring compound fault samples were misdiagnosed as cage faults, so the compound fault accuracy is still 95%, the same as with 30 training samples. Compared with the IFD-KELM method proposed in [29], with a sample size of 10, the accuracy of the method proposed in this paper is higher by 6.5%. This further shows that the intelligent diagnosis method for rolling bearing composite faults based on ALIF and KELM proposed in this paper remains effective with less sample data.
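The accuracy figures quoted above can be checked with a few lines of arithmetic. The helper names are hypothetical, and the count of 45 misclassified samples for the single-training-sample case is inferred here from the reported 81.63% of 245 test samples; it is not stated in the text.

```python
def n_test_samples(n_train_per_class, n_classes=5, per_class=50):
    """Total test samples when each class keeps n_train_per_class for training."""
    return n_classes * (per_class - n_train_per_class)

def accuracy_pct(n_test, n_wrong):
    """Percentage of correctly diagnosed test samples."""
    return 100.0 * (n_test - n_wrong) / n_test

# Test-set sizes for the seven training-set sizes considered.
sizes = {n: n_test_samples(n) for n in (30, 25, 20, 15, 10, 5, 1)}

overall_30 = accuracy_pct(sizes[30], 1)           # 1 error in 100 samples
overall_10 = accuracy_pct(sizes[10], 2)           # 2 errors in 200 samples
compound_10 = accuracy_pct(40, 2)                 # 2 of 40 compound samples
overall_1 = round(accuracy_pct(sizes[1], 45), 2)  # inferred 45 errors in 245
```

The same two errors cost 5 percentage points on the 40 compound-fault samples but only 1 point on the 200-sample total, which is why the compound fault accuracy is reported separately from the overall figure.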

Comparative Analysis of Experiments.
To verify that the energy features and sample entropy of the first four IMF components obtained by ALIF decomposition reflect the different fault characteristics of rolling bearings more effectively, ALIF is replaced with the traditional decomposition method EMD for comparison. There are 10 training samples and 40 test samples for each bearing state, and the KELM parameters remain the same as before. A total of 50 training samples from the five bearing states are input into the KELM model for training, and 200 test samples are then used for testing; the result is shown in Figure 9. The figure shows diagnostic errors in 5 test samples in total. In 3 of them the outer ring fault was misdiagnosed as a cage fault or an inner ring fault, and in 2 of them the cage fault was misdiagnosed as a composite fault of the inner ring and the outer ring. The composite fault test samples are all identified correctly, but the overall fault diagnosis accuracy is 97.5%. Figures 7 and 8 show that the overall diagnosis accuracy of ALIF-KELM is 99% with 10 training samples. ALIF-KELM therefore has higher accuracy than the EMD-KELM intelligent fault diagnosis method, which indicates that ALIF decomposition is more effective than EMD decomposition here. To verify the superiority of the KELM fault diagnosis model, KELM is then replaced with two traditional diagnosis models, ELM and the BP neural network.
The feature extraction method and the construction of the feature vectors remain unchanged. The number of training samples for each bearing state is still 10, and the number of test samples is 40. The results of applying the two fault diagnosis models are shown in Figures 10(a) and 10(b). Figure 10(a) shows that the accuracy with ELM as the diagnostic model is 78.5%. The accuracy with BP as the diagnostic model is lower; Figure 10(b) shows only 53%. Compared with Figure 8, the fault diagnosis accuracy of the two traditional models is clearly much worse, with many misdiagnoses among the 200 test samples.
Since the initial weights of the two diagnostic models ELM and BP are randomly generated, the results obtained from each test usually differ. To reduce the impact of random fluctuations on the final comparison, the three diagnostic models KELM, ELM, and BP were each retrained and tested 10 times, and the fault diagnosis accuracy was recorded. The results are shown in Figure 11. To study the efficiency of each model, the training time and test time consumed by each diagnostic model were also recorded; the results are shown in Figures 12(a) and 12(b). Figure 11 shows that using KELM as the intelligent diagnosis model gives not only the highest test accuracy but also the best stability: it maintains an accuracy of 99% in all 10 tests. Between ELM and BP, the accuracy and stability of ELM are significantly better than those of BP. Figures 12(a) and 12(b) show that KELM has the lowest training time and a test time that is close to the lowest and not much different from that of ELM, so the overall efficiency of KELM is the highest. The efficiency of the BP model is clearly lower than that of KELM and ELM. In short, ALIF-KELM has higher accuracy and efficiency than the two fault diagnosis methods ALIF-ELM and ALIF-BP.
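A repeated-trial comparison of this kind can be organised with a small harness such as the one below, which records accuracy, training time, and test time per run. The harness and the nearest-centroid stand-in classifier are hypothetical examples written for this sketch; they are not the KELM, ELM, or BP models used in the paper, and any model exposing the same two callables could be substituted.

```python
import time
import numpy as np

def repeat_trials(train_fn, predict_fn, X_tr, y_tr, X_te, y_te, n_runs=10):
    """Retrain and test n_runs times; return accuracy (%) and timings (s)."""
    acc, t_train, t_test = [], [], []
    for run in range(n_runs):
        t0 = time.perf_counter()
        model = train_fn(X_tr, y_tr, seed=run)   # seed varies the random init
        t1 = time.perf_counter()
        pred = predict_fn(model, X_te)
        t2 = time.perf_counter()
        acc.append(100.0 * np.mean(pred == y_te))
        t_train.append(t1 - t0)
        t_test.append(t2 - t1)
    return np.array(acc), np.array(t_train), np.array(t_test)

def train_centroid(X, y, seed=0):
    """Stand-in model: class centroids (deterministic, so seed is unused)."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def predict_centroid(model, X):
    """Assign each sample to the class with the nearest centroid."""
    classes, centroids = model
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]
```

The standard deviation of the returned accuracy array is a simple measure of stability: a deterministic model gives zero spread across runs, whereas a model with random initial weights generally does not.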

Conclusions
To realize the composite fault diagnosis of rolling bearings, this paper proposes a diagnosis method that combines ALIF and KELM. In the small-sample case, the proposed method is compared with EMD-KELM, ALIF-ELM, ALIF-BP, and the IFD-KELM method in [29]. The results show that both the ALIF and KELM algorithms have advantages when samples are few, and that the proposed method has high diagnostic accuracy and is suitable for the diagnosis of composite faults of rolling bearings.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this study.
Mathematical Problems in Engineering