A Feature Extraction Method of Wheelset-Bearing Fault Based on Wavelet Sparse Representation with Adaptive Local Iterative Filtering

The feature extraction of wheelset-bearing faults is important for the safe operation of high-speed trains. In recent years, sparse representation has gradually been applied to the fault diagnosis of wheelset-bearings. However, it is difficult for traditional sparse representation to extract fault features well when strong interference components are imposed on the signal. Therefore, this paper proposes a novel feature extraction method for wheelset-bearing faults based on wavelet sparse representation with adaptive local iterative filtering. In this method, the adaptive local iterative filtering effectively reduces the impact of interference components and contributes to the extraction of sparse impulses. The wavelet sparse representation, which adopts L1-regularized optimization for a globally optimal solution in sparse coding, extracts the intrinsic features of the fault in the wavelet domain. To validate the effectiveness of the proposed method, both simulated and experimental signals are analyzed. The results show that the fault features of the wheelset-bearing are sufficiently extracted by the proposed method.


Introduction
As the core system of high-speed trains, the bogie frame plays a crucial role in operation. Among the components of the bogie frame, the wheelset-bearing is the core component connecting the wheelset and the frame. When a high-speed train operates on a rail, the wheelset-bearing plays an important role in power transmission. Compared with common bearings in stationary mechanical equipment, the service conditions of the wheelset-bearing are quite different. The wheelset-bearing bears not only the static load of the high-speed train but also the unstable dynamic load caused by radial acceleration during operation. A higher speed naturally causes greater vibration and dynamic force on the wheelset-bearing. In addition, the wheelset-bearing bears a large axial force when the train passes through a curve.
When a high-speed train operates on a rail, various operating characteristics can cause wheelset-bearing faults. Since the distance between adjacent stations is usually short for high-speed trains, acceleration and braking occur frequently. This causes the dynamic load on the wheelset-bearing to change greatly and frequently. In addition, owing to the strong impacts from the track-vehicle system, which are caused by polygonal wear of wheels, track irregularities, and irregular turnouts, various faults (such as peeling and flaws) may appear on the wheelset-bearing. Once these faults appear under high-speed rotation, the service conditions of the wheelset-bearing deteriorate rapidly, which eventually affects the safety of the high-speed train. Therefore, it is of great significance to detect wheelset-bearing faults [1]. Traditional fault detection of the wheelset-bearing cannot be conducted until the wheelset-bearing is disassembled during the level-three maintenance of a high-speed train. This means that faults cannot be detected before the level-three maintenance. Therefore, fault diagnosis based on vibration signals becomes a feasible technique for detecting wheelset-bearing faults at an early stage.
In general, when a bearing fault occurs, periodic impulses are generated. Therefore, vibration signals are collected to determine whether a fault exists on the bearing. During the operation of a high-speed train, defects inevitably appear on the wheel-tread. Compared with a traditional bearing, the energy at the rotation frequency of the wheelset, generated by the defect on the wheel-tread, is larger because of the interaction between wheel-tread and rail. Therefore, the rotation frequency of the wheelset and its harmonics are also evident in the collected vibration signals, which makes the frequency content of the vibration signals more complex [2]. This is the most important difference between the vibration signals of a traditional bearing and those of a wheelset-bearing. If the rotation frequency information of the wheelset is not handled appropriately in the analysis, mistaken fault features might be extracted. In addition, the collected vibration signals usually contain a nonstationary component. They are also corrupted by noise, which makes fault detection extraordinarily challenging [3,4].
For fault detection in a general bearing, many diagnosis methods have been proposed, including empirical mode decomposition and its variants [5,6], empirical wavelet decomposition [7,8], variational mode decomposition [9,10], minimum entropy deconvolution [11,12], local mean decomposition [13,14], deep learning [15,16], and sparse representation [17,18]. Among these techniques, sparse representation is a promising method for feature extraction of bearing faults. Recently, many scholars have applied sparse representation methods to bearing fault detection. Chen et al. [17] proposed a method named SpaEIAD for fault extraction, which showed good performance in signal denoising. In [19], feature-sign search was adopted for sparse representation, which also obtained a desirable extraction result. Sun et al. [20] designed a parametric impulsive dictionary and improved the stopping criteria of OMP for sparse representation. Ding [21] proposed a shock response convolutional sparse coding technique and extracted the shock response based on time location coefficients. Qin [22] proposed a new sparse representation method based on a family of model-based impulsive wavelets, which was able to accurately represent the bearing fault impulses. In [23], a bearing fault extraction method based on the adaptive OMP algorithm and improved K-SVD with an adaptive transient dictionary was proposed, and it performed well in both fault detection and computation speed.
In traditional sparse representation, the power levels of different features affect the extracted results [24]. When the energy of the interference, usually expressed as the nonstationary component, is stronger than the energy of the fault features, the sparse representation will detect and extract the interference component instead of the fault features. Considering this condition, Qin successfully separated the harmonics and modulated components from the vibration signal of a gearbox bearing with the improved OMP and a Fourier dictionary in [23]. This method achieved outstanding performance in fault extraction in two experiments. However, the service conditions of the wheelset-bearing are more complex than those of the gearbox bearing in [23]. As analyzed above, apart from the influence of the nonstationary component, the rotation frequency component of the wheelset should also be removed. For this reason, the separation algorithm in [23] would have to be executed additionally, which increases the complexity of the algorithm. In addition, the Fourier dictionary might not be suitable for separating the rotation frequency component of the wheelset. Therefore, Qin's algorithm might not be fully suitable for fault extraction of the wheelset-bearing. According to the analysis in [2], the two resonance frequencies excited by the wheel-tread defect and by the wheelset-bearing fault are more likely to lie at low frequency and high frequency, respectively. Additionally, the nonstationary component, which is distributed throughout the signal, also lies at low frequency relative to the resonance frequency excited by the wheelset-bearing fault. In order to remove the impact of these two components conveniently and effectively, adaptive local iterative filtering (ALIF) [25] is introduced. ALIF is suitable for separating components that belong to different frequency bands. Therefore, ALIF can reduce or eliminate the influence of the aforementioned two components.
When sparse representation is applied solely in the time domain, the feature extraction is inadequate; that is, the fault features cannot be extracted thoroughly. The wavelet domain is another scale representation of the signal [26]. The most useful information of the time-domain signal can be compressed and represented in the wavelet domain without losing local information. During wavelet decomposition, the noise component can be partially separated, which highlights the local features of the signal. In addition, the initial atoms of the constructed dictionary, obtained in the wavelet domain, fit the signal better. Therefore, the wavelet sparse representation, in which the sparse representation is applied in the wavelet domain, is able to extract the intrinsic features of signals.
To diagnose wheelset-bearing faults more effectively, a novel feature extraction method, namely, ALIF-SBAKW, is proposed in this paper. It is based on the wavelet sparse representation (Split Bregman for sparse coding and approximate K-SVD for dictionary learning) with adaptive local iterative filtering (ALIF). The paper is organized as follows. Section 2 elaborates the details of the wavelet sparse representation. Section 3 describes the main principle of ALIF-SBAKW for feature extraction. The proposed ALIF-SBAKW is verified by simulations and experiments in Sections 4 and 5, respectively. Section 6 concludes the paper.

Sparse Representation.
According to the sparsity of fault impulses, the observed signal can be represented sparsely by combining a dictionary with a sparse coefficient vector, as shown in (1):

$$y_o = DZ + e, \qquad (1)$$

where y_o ∈ R^n denotes the observed signal, e ∈ R^n is the measurement noise, D ∈ R^{n×m} denotes the dictionary matrix, and Z ∈ R^m denotes the sparse coefficient vector. The extracted signal can be constructed by multiplying the dictionary D by the sparse coefficient Z.
It can be observed that (1) is an underdetermined equation, which means that (1) has infinitely many solutions. To solve this kind of problem, it can be turned into an L0-regularized or an L1-regularized optimization problem. Since the L0-regularized optimization is NP-hard, the L1-regularized optimization, a convex relaxation, is more tractable [19]. Therefore, L1-regularized optimization is adopted. The L1-regularized optimization is given in (2), where T = cσ and c denotes the adjusted gain of the standard deviation σ of the noise.
To solve this kind of optimization problem, a penalty factor ρ can be introduced to relax the constraint, which gives a new objective function (3). In practice, the signal length n is usually large, which means that solving the problem consumes a very large amount of computing resources. In order to reduce the computational burden, the observed signal can be segmented into a series of truncated-signals y_i with a certain overlap; as shown in Figure 1, a dataset Y composed of the truncated-signals is obtained. Equation (3) can then be solved in matrix form on Y. Assuming that the dataset after segmentation is Y = [y_1, y_2, ..., y_l] ∈ R^{b×l}, the dictionary matrix and the sparse coefficient matrix become D ∈ R^{b×k} and Z ∈ R^{k×l}, respectively, and the objective function takes the form of (4). To solve the optimization problem in (4), two steps are executed: sparse coding and dictionary learning. In sparse coding, a sparse approximation is used to find the sparse coefficients with a fixed dictionary. The dictionary learning step then updates the dictionary with the obtained sparse coefficients.
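The segmentation step can be illustrated with a minimal sketch. The function name `segment_signal` and the `step` parameter are illustrative assumptions, not taken from the paper; a step of one sample corresponds to the maximal overlap mentioned later in the method.

```python
def segment_signal(y, b, step=1):
    """Slice a 1-D sequence into overlapping windows of length b.

    Returns the dataset Y as b rows by l columns, so that column i holds
    the truncated-signal y_i. step=1 gives the maximal overlap
    (b - 1 shared samples between neighbouring columns).
    """
    if b <= 0 or step <= 0:
        raise ValueError("b and step must be positive")
    if len(y) < b:
        raise ValueError("signal is shorter than the window length b")
    starts = range(0, len(y) - b + 1, step)
    columns = [list(y[s:s + b]) for s in starts]
    # Transpose so that rows index the position inside a window.
    return [list(row) for row in zip(*columns)]


# Toy example: six samples, window length b = 3, maximal overlap.
Y = segment_signal([0, 1, 2, 3, 4, 5], b=3)
```

With a step of one sample, a signal of length n yields l = n - b + 1 columns, so the dataset grows with the overlap while each column stays short.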

Wavelet Sparse Representation.
In order to better extract the intrinsic features of signals, wavelet decomposition can be adopted in sparse representation. For a one-dimensional signal, there are mainly two kinds of coefficients after wavelet decomposition: approximation coefficients (CAs) and detail coefficients (CDs). The CAs contain the main information of the original signal, whereas the CDs contain its subordinate components. The wavelet decomposition of the signal is given in Figure 2. It can be observed that the number of decomposed subbands is related to the decomposition level q. These subbands consist of one CA band and q CD bands, so the number of subbands is q + 1. The fault features of the bearing are mainly hidden in the CAs after the wavelet decomposition of the signal. Accordingly, they can be called impulse wavelet coefficients (IWCs). In addition, the CDs mainly contain the noise component of the signal. Based on these properties of wavelet decomposition, the sparse representation can be carried out on the obtained IWCs. The objective function can be transformed into (5), where (y_i)_I denotes the i-th column of the dataset (Y)_I, which is composed of the segmented IWCs in the wavelet domain. Equation (5) suggests that the dictionary matrix (D)_I and the sparse coefficient matrix (Z)_I can be obtained after the wavelet sparse representation. The CDs can be directly set to zero because they mainly contain the noise component. It should be noted that the level of wavelet decomposition is of great significance for the IWCs. The wavelet decomposition of a one-dimensional signal is a downsampling process. If too many levels are used, the IWCs will be distorted, causing the fault features hidden in the IWCs to weaken or disappear. Therefore, the choice of the wavelet decomposition level is extraordinarily important.
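The effect of keeping the approximation band and zeroing the detail band can be sketched with a one-level transform. For brevity this sketch swaps in the Haar wavelet, which is an assumption made here for illustration; the paper itself selects a Daubechies 8-tap wavelet. The function names are hypothetical.

```python
import math


def haar_dwt(x):
    """One-level orthonormal Haar DWT of an even-length sequence.

    Returns (ca, cd): approximation and detail coefficients, each half
    as long as x, which illustrates the downsampling per level.
    """
    if len(x) % 2:
        raise ValueError("length must be even")
    r = math.sqrt(2.0)
    ca = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    cd = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    return ca, cd


def haar_idwt(ca, cd):
    """Inverse of haar_dwt."""
    r = math.sqrt(2.0)
    x = []
    for a, d in zip(ca, cd):
        x.extend([(a + d) / r, (a - d) / r])
    return x


signal = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
ca, cd = haar_dwt(signal)
# Keep the approximation band (the IWCs in the paper's terms) and
# set the detail band to zero before reconstructing.
smoothed = haar_idwt(ca, [0.0] * len(cd))
```

Each additional level would halve the approximation band again, which is why a deep decomposition of a short signal leaves few coefficients to carry the impulses.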
At the beginning of the wavelet sparse representation, k columns of the dataset Y are randomly selected as the initial dictionary (k < l). During the calculation, the number of rows b of the designed dictionary is always much smaller than its number of columns k, which makes the dictionary redundant. Thanks to this redundancy, the dictionary is better suited to expressing a highly diversified signal and reconstructing its local features. Furthermore, noise is generally not considered to be sparse. Therefore, the fault signal, which needs to be extracted from the noise, becomes much sparser and more stable when represented with a redundant dictionary.

Split Bregman for Sparse Coding.
Split Bregman (SB) iteration is an effective method for L1-regularized optimization. Owing to its ability to solve a very wide class of L1-regularized problems by alternating iteration, SB has been widely used in the field of image processing [27,28]. It is also suitable for sparse coding. In this paper, the sparse coding problem is solved in the form of (6) and (7), based on the principle of SB, where (D)_I ∈ R^{b×k} and (y_i)_I ∈ R^b are given. An elegant form of iteration, based on the principle of SB, can be obtained by simplifying (7), as seen in (8) and (9).
In (8), the term F((z_i)_I, (f_i)_I) contains L1 and L2 components with two different independent variables. When one of the variables is updated, the other can be treated as a constant. Owing to this alternating characteristic, (8) can be split into two steps. When solving (10), the solution (z_i)_I^{k+1} is obtained by differentiating with respect to the corresponding independent variable and setting the result to zero. In addition, (f_i)_I^{k+1} is obtained by applying a shrinkage operator in the second step, as shown in (13) and (14), where Θ denotes elementwise multiplication.
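The shrinkage operator used in the second step is commonly the elementwise soft-thresholding function, sign(x) Θ max(|x| − γ, 0). A minimal sketch follows; the threshold name `gamma` is a placeholder chosen here, since the exact threshold used in (13) and (14) is not reproduced in this text.

```python
def shrink(x, gamma):
    """Elementwise soft thresholding: sign(x_i) * max(|x_i| - gamma, 0)."""
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    out = []
    for v in x:
        magnitude = max(abs(v) - gamma, 0.0)
        if magnitude == 0.0:
            out.append(0.0)  # entries within the threshold are set to zero
        else:
            out.append(magnitude if v > 0 else -magnitude)
    return out


shrunk = shrink([3.0, -0.5, 1.0, -2.0], gamma=1.0)
```

Entries whose magnitude does not exceed the threshold become exactly zero, which is what makes the resulting coefficient vector sparse.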
2.4. Approximate K-SVD for Dictionary Learning.
After the sparse coefficient matrix (Z)_I is calculated, the dictionary (D)_I is updated. K-SVD is an available method for dictionary learning, which can effectively enhance the sparsity of the sparse coefficient matrix corresponding to the dictionary [23]. The objective function can be modified as given in (15).
Figure 2: The wavelet decomposition of signal.

Shock and Vibration
In (15), (z_j^T)_I ∈ R^{1×l} denotes the j-th row of (Z)_I. Based on the principle of K-SVD, the optimization of (15) is equivalent to optimizing the nonzero elements in (z_i^T)_I, as in (16), where the matrix Φ_i has size l × n_0(φ_i), with ones at the positions (φ_i(α), α) and zeros elsewhere (α = 1, 2, ..., n_0(φ_i)), and φ_i denotes the group of indices of the nonzero elements in (z_i^T)_I. After a singular value decomposition with factors U, S, and V, the atom and its nonzero coefficients are updated as the first column of U and the first column of V multiplied by S(1, 1), respectively.
Based on K-SVD, approximate K-SVD (AK-SVD) is introduced to improve the dictionary iteratively while reducing the computational burden [29]. In this case, two steps are iterated alternately to obtain an approximate solution.
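A common way to avoid the full SVD is to alternate between an atom update and a coefficient update on the restricted error matrix. The sketch below follows that widely used approximate K-SVD recipe (d = E g / ||E g||, then g = Eᵀ d); whether it coincides term by term with the paper's two steps is an assumption, and the names `E`, `g`, and `aksvd_atom_update` are illustrative.

```python
import math


def aksvd_atom_update(E, g):
    """One alternating pass: d = E g / ||E g||_2, then g = E^T d.

    E is the restricted error matrix (b rows, one column per signal that
    uses the atom); g holds the current coefficients of that atom.
    """
    d = [sum(e * c for e, c in zip(row, g)) for row in E]
    norm = math.sqrt(sum(v * v for v in d))
    if norm == 0.0:
        raise ValueError("E g is zero; the atom cannot be normalised")
    d = [v / norm for v in d]
    g_new = [sum(E[r][c] * d[r] for r in range(len(E))) for c in range(len(g))]
    return d, g_new


# Rank-one toy example: E = u v^T with u = [3, 4] and v = [1, 2].
E = [[3.0, 6.0], [4.0, 8.0]]
d, g = aksvd_atom_update(E, [1.0, 1.0])
```

For a rank-one error matrix, a single pass already recovers the unit-norm atom and the coefficients whose outer product reproduces E; for general matrices the pass is repeated or simply accepted as an approximation of the leading singular pair.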

Proposed ALIF-SBAKW
3.1. The ALIF Algorithm.
In order to reduce the impact of the nonstationary component and the rotation frequency component of the wheelset, adaptive local iterative filtering (ALIF) is used to process the signals. ALIF is a novel time-frequency analysis algorithm inspired by EMD [25]. The flowchart of ALIF is shown in Figure 3.
In this algorithm, the operator L_n(s)(x) denotes the moving average of the signal s(x), as shown in (19). In (19), ω_n(x, t) denotes the low-pass filter constructed from the solution of the Fokker-Planck (FP) equations with a mask length of 2l_n(x). The main idea of computing the mask length is to take a multiple of the distance between subsequent local minima and maxima of s(x). A continuously varying and smooth l_n(x) can be obtained by interpolating the distances between subsequent local extrema of s(x) and subtracting the high-frequency part from the interpolated line. It can be observed in Figure 3 that ALIF consists of two loops: the outer loop and the inner loop. The outer loop mainly derives the IMFs captured by the inner loop and uses the number of extrema to determine whether the decomposition can stop. The inner loop captures a single IMF component with a stopping criterion. The stopping criterion in the inner loop formally requires n ⟶ ∞, which is difficult to apply in practice. Therefore, a fixed threshold ε can be set as the stopping criterion, as shown in (20), where H_{i,n} is the n-th step of the i-th inner loop shown in (21). ALIF performs much better when separating the aforementioned two components. ALIF follows the iterative framework of the EMD algorithm. The moving average in ALIF is the convolution between the signal and the low-pass filter. The low-pass filter, constructed from the solution of the FP equation, is compactly supported and tends to zero smoothly at both ends, which ensures that no artificial oscillations are introduced. In addition, the filter length l_n(x) in ALIF adapts to the signal, which ensures that nonstationary changes in the signal are captured more effectively. Therefore, ALIF is more stable under perturbation.
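The inner loop can be sketched in a heavily simplified form. The sketch below is plain iterative filtering, not ALIF proper: it assumes a fixed mask half-length instead of the locally adaptive l_n(x), a triangular low-pass filter standing in for the FP filter, and a stopping rule based on the relative norm of the removed moving average, which may differ from (20). All function names are hypothetical.

```python
import math


def moving_average(s, half_len):
    """Convolve s with a normalised triangular low-pass filter.

    The filter spans 2 * half_len + 1 samples; indices outside the
    signal are mirrored about the end samples (requires half_len < len(s)).
    """
    n = len(s)
    offsets = range(-half_len, half_len + 1)
    weights = [half_len + 1 - abs(k) for k in offsets]
    total = float(sum(weights))
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, weights):
            j = i + k
            if j < 0:
                j = -j
            elif j >= n:
                j = 2 * (n - 1) - j
            acc += w * s[j]
        out.append(acc / total)
    return out


def extract_imf(s, half_len, eps=1e-3, max_iter=200):
    """Repeatedly subtract the moving average until it is negligible."""
    h = [float(v) for v in s]
    for _ in range(max_iter):
        avg = moving_average(h, half_len)
        removed = math.sqrt(sum(a * a for a in avg))
        energy = math.sqrt(sum(v * v for v in h))
        h = [v - a for v, a in zip(h, avg)]
        if energy == 0.0 or removed / energy < eps:
            break
    return h


# Toy signal: a fast cosine (period 4 samples) plus a slow one (period 200).
n = 401
high = [math.cos(2 * math.pi * 0.25 * i) for i in range(n)]
low = [math.cos(2 * math.pi * i / 200.0) for i in range(n)]
imf1 = extract_imf([a + b for a, b in zip(high, low)], half_len=3)
```

In this toy case the first extracted component is essentially the fast cosine, which mirrors the ordering of IMFs from high to low frequency; the toy signal is chosen so that mirroring at the ends continues it exactly, whereas real signals would show some edge effects.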

The Choice of Impulse-IMF.
As presented in Section 1, in order to facilitate fault extraction, the nonstationary component and the rotation frequency component of the wheelset, which are caused by the service conditions of the wheelset-bearing, should be removed first. After that, an appropriate signal component, which mainly contains the fault information, should be chosen as the impulse-IMF for the wavelet sparse representation. With ALIF, a series of IMFs at different frequencies is obtained. These IMFs are arranged in order of frequency from high to low. According to the different frequencies of the signal components, IMF1 (fault information at high frequency) is empirically chosen as the impulse-IMF. This effectively reduces or eliminates the influence of the aforementioned two low-frequency components, which benefits the feature extraction of the wheelset-bearing fault.

Proposed ALIF-SBAKW.
A novel feature extraction method for wheelset-bearing faults, ALIF-SBAKW, is proposed in this paper. The flowchart of ALIF-SBAKW is shown in Figure 4. ALIF-SBAKW mainly consists of the following steps:
Step 1: the collected vibration signal is decomposed into a series of IMFs by ALIF. The impulse-IMF, which contains the impulses with noise, is selected from the IMFs.
Step 2: the IWCs are obtained by the wavelet decomposition of the impulse-IMF. Generally, the number of decomposition levels can be set to 1 or 2 for a short signal. The noise level σ of the IWCs is estimated, and the adjusted gain c is given. The IWCs are segmented, with the maximal overlap, into a series of truncated-signals (y_i)_I of length b. Let (Y)_I = [(y_1)_I, (y_2)_I, ..., (y_l)_I] ∈ R^{b×l}, and let k columns (k < l) of (Y)_I be randomly selected as an initial dictionary.
Step 3: sparse representation is applied to the given dataset (Y)_I.
Step 4: in order to strengthen the IWCs and enhance the impulse response in the original signal, the IWCs are convolved with a typical impulse. The extracted impulses of the vibration signal are then reconstructed by wavelet reconstruction from the convolved IWCs and the zeroed CDs.

Simulation Validation
In order to illustrate and verify the effect of the proposed ALIF-SBAKW, a simulation validation is designed in this section. Following the analysis in Section 1, a simulated signal generated by bearing rotation can be constructed as

$$x(t) = I(t) + x_0(t) + \eta(t), \qquad (22)$$

where I(t) denotes the impulse component generated by the bearing fault. It can be expressed by (23):

$$I(t) = \sum_{j=1}^{J} Am_j \, e^{-B(t - jT_g)} \cos\left(2\pi f_r (t - jT_g)\right) u(t - jT_g), \qquad (23)$$

where u(t) is the unit step function, J is the total number of generated impulses, Am_j is the amplitude of the j-th impulse, B denotes the structural damping coefficient, f_r denotes the resonance frequency, and T_g, whose reciprocal is the fault characteristic frequency (f_g = T_g^{-1}), is the time interval between two adjacent impulses. The simulated parameters of (23) are shown in Table 1. The time-domain waveform of I(t) is shown in Figure 5(a).
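The impulse component can be generated with a short sketch of the damped-cosine model described above: each impulse is Am_j exp(−B(t − jT_g)) cos(2πf_r(t − jT_g)) switched on by the unit step at t = jT_g. The numerical values below are hypothetical placeholders (Table 1 is not reproduced here), except for the 10000 Hz sampling frequency, which is the one stated for the simulation; starting the impulse index at j = 1 is also an assumption.

```python
import math


def impulse_train(t, amplitudes, B, f_r, T_g):
    """Sum of damped cosines, the j-th one starting at t = j * T_g."""
    total = 0.0
    for j, am in enumerate(amplitudes, start=1):
        tau = t - j * T_g
        if tau >= 0.0:  # unit step u(tau)
            total += am * math.exp(-B * tau) * math.cos(2.0 * math.pi * f_r * tau)
    return total


fs = 10000.0  # sampling frequency in Hz
# Hypothetical parameters for illustration only (not the values of Table 1).
params = dict(amplitudes=[1.0] * 9, B=800.0, f_r=2000.0, T_g=0.01)
impulses = [impulse_train(i / fs, **params) for i in range(1000)]
```

With these placeholder values the impulses repeat every 0.01 s, so the fault characteristic frequency of the toy signal is 1/T_g = 100 Hz.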
The simulated x_0(t) denotes unknown interferences with a nonstationary component, which are generated during the measurement process. It can be described by a cosine function and its corresponding modulation function. Here, x_0(t) is written out directly, as shown in (24):

$$x_0(t) = \ldots \times \sin\left((20\pi + 0.2\cos(40\pi t))\,t\right) + 3.6\cos(160\pi t). \qquad (24)$$

In addition, η(t) denotes the noise component of x(t), which is described by Gaussian random noise with standard deviation σ_0 set to 1. The power ratio between the impulse component I(t) and the total noisy simulated signal x(t) is −18.7008 dB. When the sampling frequency and the sampling time are set to 10000 Hz and 1 s, respectively, the simulated signal x(t) shown in Figure 5(b) is obtained. Figure 5(c) is the Hilbert envelope spectrum of x(t). It can be observed from Figure 5(b) that the impulses are completely covered by the interference components, so the features of the bearing fault cannot be identified in the time domain. In addition, the fault characteristic frequency f_g = T_g^{-1} cannot be detected in the corresponding Hilbert envelope spectrum either.

The Setting Rule of Adjusted Gain c.
In order to obtain good fault extraction performance with ALIF-SBAKW, the selection of a suitable adjusted gain c is crucial. If c is set too small, more false impulses and noise will appear in the extracted results. Conversely, if c is set too large, too few fault impulses will be extracted. In bearing fault diagnosis, how well the fault characteristic frequency and its harmonics are extracted in the Hilbert envelope spectrum is an important evaluation criterion. Envelope spectrum kurtosis (ESK) is an effective index for measuring the extracted fault features in the Hilbert envelope spectrum. A higher ESK implies clearer fault features with larger amplitudes in the envelope spectrum and a larger number of harmonics of the fault characteristic frequency [2]. Therefore, in order to select c more reasonably, the ESK of the signals extracted with different values of c in a given range is calculated. The adjusted gain c with the highest ESK is selected as the most suitable parameter.
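The selection rule can be sketched as a search over candidate gains. In this sketch ESK is taken as the plain fourth standardized moment of the envelope-spectrum amplitudes, which is an assumption (the cited work may normalise differently); the naive DFT keeps the example dependency-free and is only practical for short signals. The names `esk`, `select_gain`, and the `extract` callback are hypothetical.

```python
import cmath
import math


def dft(x, sign=-1):
    """Naive O(n^2) DFT; sign=+1 gives the unscaled inverse transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * m * k / n)
                for k in range(n)) for m in range(n)]


def envelope_spectrum(x):
    """Amplitude spectrum of the mean-removed Hilbert envelope (bins 1..n/2-1)."""
    n = len(x)
    spectrum = dft(x)
    # Analytic signal: keep DC (and Nyquist), double positive bins, drop the rest.
    gain = [0.0] * n
    gain[0] = 1.0
    for m in range(1, (n + 1) // 2):
        gain[m] = 2.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    analytic = [v / n for v in dft([s * g for s, g in zip(spectrum, gain)], sign=+1)]
    envelope = [abs(v) for v in analytic]
    mean = sum(envelope) / n
    return [abs(v) / n for v in dft([e - mean for e in envelope])[1:n // 2]]


def kurtosis(values):
    """Fourth standardized moment; returns 0.0 for a constant sequence."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) if m2 > 0.0 else 0.0


def esk(x):
    return kurtosis(envelope_spectrum(x))


def select_gain(candidates, extract):
    """Return the candidate c whose extracted signal has the highest ESK."""
    return max(candidates, key=lambda c: esk(extract(c)))


# Toy check: a carrier at bin 32 amplitude-modulated at bin 4 (n = 128).
n = 128
am = [(1 + 0.5 * math.cos(2 * math.pi * 4 * k / n)) * math.cos(2 * math.pi * 32 * k / n)
      for k in range(n)]
es = envelope_spectrum(am)
```

Here `extract` stands for whatever routine produces the extracted signal for a given c; in the toy check the envelope spectrum shows a single line at the modulation frequency, which is the situation a high ESK is meant to reward.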

The Choice of Wavelet Base.
It should be noted that, besides the adjusted gain c, the wavelet base should also be chosen reasonably to obtain good wavelet decomposition and reconstruction performance. The Daubechies wavelet is compactly supported and has good regularity. It not only retains the peak feature of the impulse but also yields a smooth bearing fault signal. Therefore, it is very suitable for the wavelet transform of bearing fault signals [30]. In addition, the support length of the wavelet also affects the analytical results. To avoid the boundary problem caused by an overly long support and the low-order vanishing moments caused by an overly short support, the filter length 8 (i.e., the Daubechies 8-tap wavelet) is selected for wavelet decomposition in this paper.

Simulation Results.
According to the flowchart of ALIF-SBAKW, a series of key parameters is set. The threshold ε of the stopping criterion in ALIF is set to 0.001. The level of wavelet decomposition is 1. The chosen wavelet basis is the Daubechies 8-tap wavelet. The adjusted gain c of the noise standard deviation is determined as 4.2. The size of the dictionary is set to 10 × 40, which means that the length of each segmented signal is 10. The parameters ρ and λ in SB are set to 100 and 1000, respectively. The chosen impulse-IMF is IMF1. Finally, the simulated signal of (22) is analyzed following the procedure of the ALIF-SBAKW method. The IMFs of (22) decomposed by ALIF are shown in Figure 6. The result of the bearing fault feature extraction is shown in Figure 7(a), and the corresponding Hilbert envelope spectrum is shown in Figure 7(b). It can be observed that the fault features are clearly detected in Figure 7(a), in which the interference components are eliminated. The fault characteristic frequency f_g and its harmonics are clearly visible in Figure 7(b). The simulated results show that the fault features are sufficiently extracted by the proposed method, both in the time domain and in the Hilbert envelope spectrum.

Performance Comparison.
In order to further illustrate the advantage of the proposed method, two comparative methods are applied to the same simulated signal. Firstly, the wavelet sparse representation alone, i.e., SBAKW, is used to process the simulated signal directly. The adjusted gain c of the noise standard deviation is set to 24, and the other parameters are as above. The result of SBAKW is shown in Figure 8(a), and its Hilbert envelope spectrum is shown in Figure 8(b). Secondly, the OMP-KSVD algorithm, a well-known sparse representation algorithm, is applied to the IMFs decomposed by ALIF, i.e., ALIF-OMPK. In general, there are two different stopping criteria in the OMP algorithm: target sparsity and error goal. The size of the dictionary is set to 64 × 256 in ALIF-OMPK. The number of iterations is set to 10, and the chosen impulse-IMF is again IMF1. When the target sparsity is adopted as the stopping criterion and set to 2, the result is shown in Figure 9(a) and its Hilbert envelope spectrum in Figure 9(b). When the error goal is adopted as the stopping criterion and set to 54.7, the result is shown in Figure 10(a) and its Hilbert envelope spectrum in Figure 10(b). Comparing SBAKW with ALIF-SBAKW, the features extracted by SBAKW in Figure 8(a) are those of the interferences. This is because sparse representation is more likely to detect the components with stronger energy. In this simulated case, the energy of the nonstationary component is much stronger than that of the fault features, so the extracted features are clearly not fault features. The fault characteristic frequency f_g cannot be identified in the Hilbert envelope spectrum either. Therefore, SBAKW cannot detect the fault features of the bearing directly. The results extracted by ALIF-OMPK under the two stopping criteria can also identify the fault characteristic information, as shown in Figures 9 and 10. However, these results are less satisfactory than those extracted by ALIF-SBAKW. The ALIF-SBAKW method removes significantly more of the noise between adjacent impulses and thus denoises better than ALIF-OMPK. Additionally, although the fault characteristic frequency f_g and its harmonics can also be identified in the Hilbert envelope spectrum with ALIF-OMPK, the amplitude of f_g is weaker than that extracted by ALIF-SBAKW, and ALIF-SBAKW extracts many more harmonics. To quantify the comparison, the ESK of each result is calculated. The ESK values in Figures 7(b), 9(b), and 10(b) are 511.7, 336.4, and 477.8, respectively. This implies that the results extracted by ALIF-SBAKW are superior. Therefore, compared with ALIF-OMPK, the proposed ALIF-SBAKW method extracts richer fault information both in the time domain and in the Hilbert envelope spectrum.

Experimental Validation
In order to further validate the effect of the proposed ALIF-SBAKW, experimental data of wheelset-bearing faults were obtained on the testing rig shown in Figure 11(a). The experiment was conducted as part of a project between Southwest Jiaotong University and CRRC Corporation. The axle box bearing running on the testing rig is from a China Railway High-speed (CRH) vehicle, whose axle box adopts a double-row tapered roller bearing. The testing rig consists of a motor, a loading device, a pair of driving wheels, a testing wheelset, and an axle box. The testing wheelset, which is driven by the driving wheels at the bottom, is supported by the axle box bearing. The driving power is delivered by the motor and conveyed to the driving wheels through rubber belts. The accelerometer for collecting vibration signals is mounted on the axle box, as shown in Figure 11(b). As shown in Figures 11(c) and 11(d), two typical bearing faults are introduced into the experiment: an outer-race fault and a roller fault. The fault characteristic frequencies of these two faults can be calculated using (25) and (26), respectively. The bearing parameters d_B, d_P, N_Z, and ϕ denote the roller diameter, the pitch diameter, the number of rollers, and the contact angle, respectively. The values of the bearing parameters are listed in Table 2; f_ro denotes the rotation frequency. When the speed is 100 km/h, the corresponding rotation frequency is 10.3 Hz. Therefore, the fault characteristic frequencies of the outer-race fault, f_BPFO, and of the roller fault, f_BSF, are 83.23 Hz and 33.69 Hz, respectively. The fault signal is collected at a sampling rate of 10 kHz. In this section, the proposed ALIF-SBAKW method is used to analyze the collected vibration signal.
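The two characteristic frequencies can be computed with the standard kinematic formulas for a bearing with a stationary outer race, f_BPFO = (N_Z / 2) f_ro (1 − (d_B / d_P) cos ϕ) and f_BSF = (d_P / (2 d_B)) f_ro (1 − ((d_B / d_P) cos ϕ)²). The sketch below assumes that (25) and (26) take this standard form; the geometry in the example call is a hypothetical placeholder and not the content of Table 2, so its output is not expected to reproduce the values quoted above.

```python
import math


def bpfo(f_ro, n_z, d_b, d_p, phi):
    """Ball-pass frequency of the outer race (phi in radians)."""
    return 0.5 * n_z * f_ro * (1.0 - (d_b / d_p) * math.cos(phi))


def bsf(f_ro, d_b, d_p, phi):
    """Rolling-element spin frequency (phi in radians)."""
    ratio = (d_b / d_p) * math.cos(phi)
    return 0.5 * (d_p / d_b) * f_ro * (1.0 - ratio ** 2)


# Hypothetical geometry for illustration only (not the values of Table 2).
f_ro = 10.3  # rotation frequency in Hz at 100 km/h, as stated in the text
example_bpfo = bpfo(f_ro, n_z=19, d_b=26.0, d_p=180.0, phi=math.radians(10.0))
example_bsf = bsf(f_ro, d_b=26.0, d_p=180.0, phi=math.radians(10.0))
```

Both frequencies scale linearly with the rotation frequency, so the same functions apply at other train speeds once f_ro is known.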
In order to further validate the effect of the proposed ALIF-SBAKW and highlight its superiority, four comparative methods, namely, SBAKW, ALIF-OMPK (using target sparsity), EWT, and fast kurtogram [7], are also applied to the analyzed signal.

Outer-Race Fault Experiment.
In the outer-race fault experiment, the collected signal is shown in Figure 12(a) and its Hilbert envelope spectrum is shown in Figure 12(b). It can be seen that the fault characteristic frequency f_BPFO and its harmonics cannot be identified clearly because of power-line interference and noise. The proposed ALIF-SBAKW method is used to analyze the collected signal in Figure 12 [...] The characteristic frequency of the outer-race fault f_BPFO and its harmonics are clearly displayed in the Hilbert envelope spectrum, which indicates that a fault certainly exists on the surface of the outer race. In order to further validate the effect of the proposed ALIF-SBAKW, four comparative methods are applied. In SBAKW, the adjusted gain c of the noise standard deviation is set to 5.5, and the other parameters are identical to those set in ALIF-SBAKW. The results extracted by SBAKW are shown in Figure 14. In ALIF-OMPK, the size of the dictionary is set to 128 × 512, the sparsity is 2, and the number of iterations is set to 10. The signal extracted by ALIF-OMPK is shown in Figure 15(a), and its Hilbert envelope spectrum is shown in Figure 15(b). In EWT, the maximum number of segmented bands is set to 14. In the results obtained by EWT, most subband signals contain no useful information because the segmented frequency bands are narrow. Among all bands, only the 13th subband signal yields some feature information. The analytical results are shown in Figure 16. In the fast kurtogram, the highest decomposition level is set to 6. The obtained kurtogram is shown in Figure 17(a). The centre frequency and the optimal level of the filter are 3945.3125 Hz and 6, respectively. The extracted results are shown in Figures 17(b) and 17(c).
Comparing SBAKW with ALIF-SBAKW, although some vibration features can be extracted by SBAKW, they mainly derive from the [...] The characteristic frequency of power-line interference f PLI can be detected in the Hilbert envelope spectrum, whereas f BPFO is almost submerged in other frequency components and its harmonics cannot be extracted. This shows that the interference seriously affects the extraction of the correct impulse features. With ALIF-OMPK, EWT, and fast kurtogram, the extracted signals are not purified, and the fault impulses are nearly drowned in the noise component in the time domain. Furthermore, the overall amplitude of the signals extracted by EWT and fast kurtogram is greatly reduced compared with the signal extracted by ALIF-SBAKW, which means that the extraction of the fault is affected. In the Hilbert envelope spectra, the most significant difference is that the amplitudes of f BPFO and its harmonics extracted by ALIF-OMPK, EWT, and fast kurtogram are generally weaker than those extracted by ALIF-SBAKW. For ALIF-OMPK, although the number of harmonics increases, f BPFO itself is inconspicuous. For EWT, the number of harmonics of f BPFO is far smaller than that in Figure 13(f). For fast kurtogram, f BPFO and its harmonics cannot be discovered directly in Figure 17(c). Therefore, the proposed ALIF-SBAKW can effectively extract the outer-race fault, and it performs better than the four comparative methods.
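The Hilbert envelope spectrum on which this comparison relies can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sampling rate, and the synthetic amplitude-modulated test signal are all assumptions made for illustration, and the 16 Hz modulation merely stands in for a fault characteristic frequency such as f BPFO. The sketch uses only the Python standard library; in practice, `scipy.signal.hilbert` and `numpy.fft` would normally do the same job.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum):
    """Inverse FFT computed through the conjugate trick."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def envelope_spectrum(x, fs):
    """Return (freqs, amps): one-sided amplitude spectrum of the Hilbert envelope."""
    n = len(x)
    spectrum = fft(x)
    # Analytic signal: keep DC and Nyquist, double positive bins, zero negative bins.
    weights = [0.0] * n
    weights[0] = 1.0
    weights[n // 2] = 1.0
    for k in range(1, n // 2):
        weights[k] = 2.0
    analytic = ifft([spectrum[k] * weights[k] for k in range(n)])
    envelope = [abs(v) for v in analytic]
    # Remove the mean so the DC bin does not dominate the envelope spectrum.
    mean = sum(envelope) / n
    env_fft = fft([e - mean for e in envelope])
    freqs = [k * fs / n for k in range(n // 2)]
    amps = [2.0 * abs(env_fft[k]) / n for k in range(n // 2)]
    return freqs, amps


# Hypothetical test signal: a 200 Hz carrier (resonance) modulated at 16 Hz.
fs = 1024.0
n = 1024
signal = [
    (1.0 + 0.8 * math.cos(2 * math.pi * 16.0 * i / fs))
    * math.cos(2 * math.pi * 200.0 * i / fs)
    for i in range(n)
]
freqs, amps = envelope_spectrum(signal, fs)
peak = max(range(1, len(amps)), key=lambda k: amps[k])
print(freqs[peak], amps[peak])  # the peak is expected at the modulation frequency
```

The carrier itself does not appear in the envelope spectrum; only the modulation does, which is why fault characteristic frequencies are read from the envelope spectrum rather than from the raw spectrum.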

Roller Fault Experiment.
In the roller fault experiment, the collected signal is shown in Figure 18(a) and its Hilbert envelope spectrum is shown in Figure 18(b). It should be noted that the even harmonics of f BSF are often dominant in the Hilbert envelope spectrum. This is because a defect on the rolling element surface strikes both the inner race and the outer race, which excites two impulses and results in two shocks per basic period [10,31]. Therefore, 2f BSF and its harmonics should be used as the diagnostic indices for roller fault. It can be observed from Figure 18 [...] Figure 19(e), and the corresponding Hilbert envelope spectrum is shown in Figure 19(f). It can be observed that the impulses caused by the roller fault are directly extracted. The double fault characteristic frequency of the roller, 2f BSF, and its harmonics are clearly visible in the Hilbert envelope spectrum.
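As a hedged illustration of these diagnostic indices, the sketch below evaluates the textbook kinematic formulas for f BPFO and f BSF (stationary outer race, rotating inner race, no slip) and lists 2f BSF with its harmonics. The function names and the geometry and speed values are hypothetical; they are not the parameters of the tested wheelset-bearing.

```python
import math


def bearing_frequencies(fr, n_rollers, d_roller, d_pitch, contact_angle_deg=0.0):
    """Textbook kinematic fault frequencies for a stationary outer race.

    fr: shaft rotation frequency in Hz; d_roller and d_pitch share one length unit.
    Returns (f_bpfo, f_bsf) in Hz.
    """
    ratio = d_roller / d_pitch * math.cos(math.radians(contact_angle_deg))
    f_bpfo = 0.5 * n_rollers * fr * (1.0 - ratio)
    f_bsf = 0.5 * (d_pitch / d_roller) * fr * (1.0 - ratio ** 2)
    return f_bpfo, f_bsf


def roller_fault_indices(f_bsf, n_harmonics=3):
    """2*f_BSF and its harmonics: two shocks per roller spin period."""
    return [2.0 * k * f_bsf for k in range(1, n_harmonics + 1)]


# Hypothetical geometry, chosen only to make the arithmetic easy to follow.
f_bpfo, f_bsf = bearing_frequencies(
    fr=10.0, n_rollers=20, d_roller=25.0, d_pitch=200.0
)
indices = roller_fault_indices(f_bsf)
print(f_bpfo, f_bsf, indices)
```

With these assumed values, the roller spin frequency is 39.375 Hz, so the envelope spectrum would be inspected at 78.75 Hz, 157.5 Hz, and 236.25 Hz rather than at f BSF itself.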
Similarly, four comparative methods are used to process the collected signal. In SBAKW, the adjusted gain c of the noise standard deviation is set to 6, and the other parameters remain the same as those set in ALIF-SBAKW. The analytical results are shown in Figure 20. In ALIF-OMPK, the parameters are consistent with the ALIF-OMPK settings in Section 5.1. The extracted results are shown in Figure 21. In EWT, the detected boundaries of the Fourier spectrum are shown in Figure 22(a). According to the results, only the 17th subband signal extracts some feature information, as shown in Figures 22(b) and 22(c). In fast kurtogram, the centre frequency and the optimal level of the filter are set to 1875 and 5, respectively. The analytical results are shown in Figure 23. The double fault characteristic frequency 2f BSF can be detected in the Hilbert envelope spectrum of the original collected signal because the energy of the fault impulses in the roller fault experimental data is extraordinarily strong. However, the results extracted by the four comparative methods are still not as good as those extracted by ALIF-SBAKW. With SBAKW, although some vibration features can be extracted, other, unrelated features are extracted simultaneously, as a comparison with the signal in Figure 19(e) shows. This is due to power-line interference and noise, which make 2f BSF inconspicuous in Figure 20(b). For the remaining three methods, only the peaks of the impulses can be observed in the extracted signals, and noise remains in the intervals between adjacent impulses, in contrast to the signal in Figure 19(e).
In the Hilbert envelope spectra, it can be observed from Figures 20(b) and 22(c) that the harmonics of 2f BSF can hardly be detected directly in the results extracted by SBAKW and EWT. In Figures 21(b) and 23(c), although the amplitudes of some harmonics extracted by ALIF-OMPK and fast kurtogram are not the most prominent, a certain number of harmonics can still be identified. To further highlight the superiority of ALIF-SBAKW, some feature indicators are introduced and calculated to make a more direct comparison among ALIF-SBAKW, ALIF-OMPK, and fast kurtogram. The introduced feature indicators include crest factor (CF), impulse factor (IF), kurtosis [32], and envelope spectrum kurtosis (ESK). The related parameters and calculated values are shown in Table 3. Theoretically, the higher the values of CF, IF, and kurtosis, the stronger the impulse features extracted in the time domain. In addition, the ESK mainly reflects the richness of fault information in the Hilbert envelope spectrum. As shown in Table 3, every indicator calculated from the results of ALIF-SBAKW is higher than the corresponding value calculated from the results of the other two methods. Consequently, the comparative results indicate that ALIF-SBAKW is still superior to ALIF-OMPK and fast kurtogram. Overall, it can be concluded that, compared with SBAKW, ALIF-OMPK, EWT, and fast kurtogram, the proposed ALIF-SBAKW performs better to some degree.
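The time-domain indicators can be sketched as follows, assuming their common textbook definitions: CF as peak over root mean square, IF as peak over mean absolute value, and kurtosis as the normalized fourth central moment (non-excess form). The exact definitions used for Table 3 are not reproduced here, and ESK is omitted because it requires the envelope spectrum. The function names and the two synthetic signals are hypothetical.

```python
import math


def crest_factor(x):
    """Peak absolute value divided by the root mean square."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms


def impulse_factor(x):
    """Peak absolute value divided by the mean absolute value."""
    mean_abs = sum(abs(v) for v in x) / len(x)
    return max(abs(v) for v in x) / mean_abs


def kurtosis(x):
    """Fourth central moment over squared variance (equals 3 for Gaussian data)."""
    n = len(x)
    mu = sum(x) / n
    m2 = sum((v - mu) ** 2 for v in x) / n
    m4 = sum((v - mu) ** 4 for v in x) / n
    return m4 / (m2 * m2)


# Hypothetical signals: a smooth sine versus a sparse periodic impulse train.
smooth = [math.sin(2 * math.pi * k / 50) for k in range(1000)]
impulsive = [1.0 if k % 50 == 0 else 0.0 for k in range(1000)]

for name, sig in (("smooth", smooth), ("impulsive", impulsive)):
    print(name, crest_factor(sig), impulse_factor(sig), kurtosis(sig))
```

All three indicators come out markedly larger for the impulse train than for the sine, which matches the reading above that higher values indicate stronger impulse features in the time domain.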

Conclusions
The fault diagnosis of the wheelset-bearing has great significance for the safety of high-speed trains. Sparse representation is an advanced method for bearing fault extraction. However, it is hard for traditional sparse representation to extract faults under severe service conditions, especially in a complicated track-vehicle system. If the energy of the interference is stronger than the energy of the fault features, the interference component, instead of the fault features, will be detected and extracted. Therefore, the ALIF-SBAKW is proposed in this paper. There are two reasons why this new method can solve this problem. On the one hand, the ALIF can effectively reduce or eliminate the nonstationary components and the wheelset's rotation frequency components (caused by the severe service conditions of the high-speed train), which is conducive to fault extraction. On the other hand, the wavelet sparse representation can uncover the intrinsic features of the signal and extract the wheelset-bearing fault. The ALIF-SBAKW method is validated by simulated and experimental signals. The results show that the ALIF-SBAKW method is highly suitable for the fault feature extraction of wheelset-bearing signals, especially compared with SBAKW, ALIF-OMPK, EWT, and fast kurtogram in [...] Finally, although the ALIF-SBAKW method can effectively extract the fault feature, it cannot yet be effectively applied to the separation of multiple faults. Therefore, further research is needed to solve this problem. In addition, the fault extraction method proposed by Qin et al. in [23] is an excellent technique. In future research on the separation of multiple faults, a more comprehensive comparison with the method in [23] will be conducted in terms of component separation and fault extraction.