A fault feature extraction method for rotating shaft with multiple weak faults based on underdetermined blind source signal

To improve fault feature extraction for a rotating shaft, an efficient method for underdetermined blind source separation of signals with weak faults is proposed, based on Hankel matrix-based singular value decomposition (SVD) and blind source separation (BSS). First, a Hankel matrix is constructed from the single-channel vibration signal, and SVD is then used to estimate the number of fault signals. Finally, the fault features are extracted by BSS. An analysis of a simulated signal that mixes a misalignment fault and an imbalance fault shows that the proposed method performs better than wavelets, variational mode decomposition and ensemble empirical mode decomposition + BSS. Furthermore, an experiment verifies the effectiveness of this method. The results demonstrate that the proposed method is efficient for feature extraction from a single-channel vibration signal of a rotating shaft with multiple weak faults.


Introduction
The rotating shaft is the most important part of rotating machinery: its faults affect accuracy and can sometimes be catastrophic for the whole machine. To acquire more fault information from vibration signals and improve diagnostic accuracy, many feature extraction methods have been proposed, such as the fast Fourier transform (FFT) [1,2], wavelets [3,4], ensemble empirical mode decomposition (EEMD) [5,6] and variational mode decomposition (VMD) [7,8]. However, the features extracted by these methods are sometimes not obvious, and they cannot be extracted at a low signal-to-noise ratio (SNR). For this reason, blind source separation (BSS) is increasingly applied in the field of rotating machinery fault feature extraction [9,10].
The goal of BSS is to recover latent signals from mixed signals without any knowledge of the mixing process. This challenging problem has attracted much research interest because of its very wide area of applicability, such as speech signal separation, image processing, computer vision and bioinformatics [11][12][13][14]. In the usual blind separation model, it is often required that the number of sensors (which determines the dimension of the signal) should be no less than the number of sources (which represent fault signals in this paper). However, due to cost issues and limitations in the monitoring environment, it may be impossible to install multiple sensors on rotating machinery, so single-channel monitoring is sometimes the only option; this is the so-called underdetermined blind source problem. Therefore, the problem of underdetermined blind source separation has gradually become a hot area in the research of blind signals in recent years [15][16][17].
Measurement Science and Technology. Hongchun Sun 1,2, Liang Fang 1,2 and Jingzheng Guo 1.
Keywords: rotating shaft, underdetermined blind source separation, Hankel matrix, singular value decomposition, fault feature extraction
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Before BSS, estimating the number of sources accurately and effectively is an important prerequisite for effective separation of the blind sources. The single-channel signal can be transformed into many kinds of matrices, such as a Toeplitz matrix [18] or a Hankel matrix [19]. Singular value decomposition (SVD) [20, 21] is a non-parametric technique which has been used extensively in noise reduction, feature extraction, source number estimation and fault diagnosis. For a single-channel signal with multiple faults, using SVD directly cannot extract the eigenvalues of each fault signal. Therefore, a Hankel matrix-based SVD method is proposed in this paper. The ratios of neighboring singular values (NSVR), taken in descending order from the Hankel matrix-based SVD, are used to estimate the number of fault signals in a single-channel vibration signal.
In this paper, a fault feature extraction method which combines Hankel matrix-based SVD and blind source separation is proposed. Its advantage is that the number of signals with weak faults can be estimated, and the features can be clearly extracted. For feature extraction from a single-channel vibration signal, the results of comparative experiments show that this method performs significantly better than wavelets, VMD and EEMD+BSS.
The structure of this paper is as follows. The theory of Hankel matrices, SVD and BSS is introduced in section 2. Section 3 gives a detailed procedure of the proposed model for fault feature extraction. The performance of the proposed method is evaluated by an analysis of a simulated signal of a rotating shaft in section 4. In section 5, an artificial fault experiment demonstrates the effectiveness of the proposed method. The conclusion of this paper is presented in section 6.

Hankel matrix theory
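The single-channel signal $X = (x(1), x(2), \ldots, x(N))$ can be constructed as a Hankel matrix as follows:

$$A = \begin{pmatrix} x(1) & x(2) & \cdots & x(n) \\ x(2) & x(3) & \cdots & x(n+1) \\ \vdots & \vdots & & \vdots \\ x(N-n+1) & x(N-n+2) & \cdots & x(N) \end{pmatrix} \tag{1}$$

where $1 < n < N$. Let $m = N - n + 1$; then $A \in R^{m \times n}$. The matrix $A$ can be written as a sum of products of the vector $u_i$, the vector $v_i$ and the singular value $\sigma_i$ as follows:

$$A = \sum_{i=1}^{q} \sigma_i u_i v_i^T, \quad u_i \in R^{m \times 1}, \; v_i \in R^{n \times 1}, \; q = \min(m, n)$$

It is assumed that $A_i = \sigma_i u_i v_i^T$. Let the first row of $A_i$ be the row vector $P_{i,1}$, and let $H_{i,n}$ be the last column vector of $A_i$ without the element of the first row. According to the Hankel matrix construction process, $P_{i,1}$ and $H_{i,n}^T$ can be connected end-to-end to form a component signal $P_i$ as follows:

$$P_i = (P_{i,1}, H_{i,n}^T), \quad P_{i,1} \in R^{1 \times n}, \; H_{i,n} \in R^{(m-1) \times 1}$$

The matrix $A_i$ can be represented by the row vectors $P_{i,1}, P_{i,2}, \ldots, P_{i,m}$. The original Hankel matrix $A$ can be represented by the row vectors $X_1, X_2, \ldots, X_m$, $X_m \in R^{1 \times n}$. Therefore, $X_1$ can be expressed as

$$X_1 = P_{1,1} + P_{2,1} + \cdots + P_{q,1} \tag{4}$$

The matrix $A$ without the elements of the first row can be represented by the column vector $I_n$, $I_n \in R^{(m-1) \times 1}$. Therefore, $I_n^T$ can be expressed as follows:

$$I_n^T = H_{1,n}^T + H_{2,n}^T + \cdots + H_{q,n}^T \tag{5}$$

According to the Hankel matrix construction process, the original signal $X$ can be expressed as $X = (X_1, I_n^T)$. The component signal $P_i$ can be expressed as $P_i = (P_{i,1}, H_{i,n}^T)$. Therefore,

$$P_1 + P_2 + \cdots + P_q = (P_{1,1} + \cdots + P_{q,1}, \; H_{1,n}^T + \cdots + H_{q,n}^T) \tag{6}$$

Substituting equations (4) and (5) into equation (6),

$$P_1 + P_2 + \cdots + P_q = (X_1, I_n^T) = X$$

The essence of Hankel matrix theory is the decomposition of the original signal into a simple linear superposition of the component signals $P_i$. Its advantage is that each separated component signal is obtained by simple subtraction from the original signal, so the separated component signals keep the original phase; that is, the decomposition has a zero-phase-shift characteristic.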
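The construction above can be illustrated with a minimal NumPy sketch. The function names `hankel_matrix` and `component_signals` are hypothetical, and the test signal (two sinusoids at roughly 33.3 Hz and 66.6 Hz) is an assumption chosen only for illustration. The sketch builds the Hankel matrix, forms each rank-one term from the SVD, and reads off each component signal as the first row joined to the remainder of the last column.

```python
import numpy as np

def hankel_matrix(x, n):
    """Build the m-by-n Hankel matrix A with A[i, j] = x[i + j], m = N - n + 1."""
    x = np.asarray(x, dtype=float)
    N = x.size
    if not 1 < n < N:
        raise ValueError("n must satisfy 1 < n < N")
    m = N - n + 1
    idx = np.arange(m)[:, None] + np.arange(n)[None, :]
    return x[idx]

def component_signals(x, n):
    """Return the singular values and the component signals P_i (one per row)."""
    A = hankel_matrix(x, n)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    comps = []
    for i in range(s.size):
        A_i = s[i] * np.outer(U[:, i], Vt[i, :])  # rank-one term sigma_i u_i v_i^T
        # First row of A_i, then its last column without the first element.
        comps.append(np.concatenate([A_i[0, :], A_i[1:, -1]]))
    return s, np.array(comps)

# Illustrative signal (an assumption, not the data used in the paper).
t = np.arange(200) / 1000.0
x = np.sin(2 * np.pi * 33.3 * t) + 0.5 * np.sin(2 * np.pi * 66.6 * t)
s, P = component_signals(x, n=100)
```

Summing the rows of `P` should reproduce `x` up to floating-point error, which is the superposition property stated above.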

Singular value decomposition theory
Let $S(t)$ denote the vector of source signals. The observed signal $X(t) = [x_1(t), x_2(t), \ldots, x_m(t)]^T$ is assumed to be a linear superposition of the source signals and the noise signal $N(t)$. The observed signal $X(t)$ can be described as follows:

$$X(t) = A S(t) + N(t)$$

where $A$ is the mixing matrix. The matrix $R_X$ is the covariance matrix of the observed signal. The eigenvalue decomposition of $R_X$ is performed as follows:

$$R_X = Q \Lambda Q^H$$

where $\Lambda$ is a diagonal matrix composed of the eigenvalues $\{\lambda_1, \lambda_2, \ldots, \lambda_m\}$ of $R_X$. Every column vector of the eigenvector matrix $Q$ is a unit eigenvector corresponding to an eigenvalue, and the unit vectors are orthogonal to each other. It can be deduced that the eigenvalues of $R_X R_X^H$ are $\{\lambda_1^2, \lambda_2^2, \ldots, \lambda_m^2\}$. That is,

$$R_X R_X^H = Q \Lambda Q^H (Q \Lambda Q^H)^H = Q \Lambda^2 Q^H$$

Therefore, the singular values of $R_X$ are $\{|\lambda_1|, |\lambda_2|, \ldots, |\lambda_m|\}$, because each singular value of the covariance matrix is the absolute value of the corresponding eigenvalue. Consequently, the number of non-zero singular values is equal to the number of non-zero eigenvalues. It is assumed that $R_X$ is the covariance matrix of the mixed signal with noise. By the mixed system model $X = AS + N$, and for white noise that is uncorrelated with the sources, the following can be calculated:

$$R_X = \frac{1}{L} X X^H = A R_S A^H + \sigma^2 I$$

where $L$ is the number of sampling points, $R_S = \frac{1}{L} S S^H$ is the covariance matrix of the source signals, and $\sigma^2$ is the power of the noise.
Therefore, in the case of a high SNR, the number of dominant eigenvalues of the covariance matrix is equal to the number of signal sources.
Let the eigenvalues of $R_X$ be arranged in descending order, that is, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m$. If $\lambda_k / \lambda_{k+1}$ is the maximum ratio of neighboring singular values, the number of sources will be $k$. SVD is simple and can be realized easily.
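The neighboring-ratio rule can be sketched in a few lines of Python. The function name `estimate_source_count`, the `floor` guard against division by zero, and the example singular values are all assumptions made for illustration; they are not taken from the paper.

```python
def estimate_source_count(singular_values, floor=1e-12):
    """Estimate the number of sources as the index k of the largest
    ratio between neighboring singular values sorted in descending order."""
    vals = sorted((abs(v) for v in singular_values), reverse=True)
    if len(vals) < 2:
        raise ValueError("at least two singular values are required")
    # ratios[k - 1] = lambda_k / lambda_(k + 1), with 1-based k as in the text.
    # `floor` (an assumption) avoids division by an exactly zero value.
    ratios = [vals[k] / max(vals[k + 1], floor) for k in range(len(vals) - 1)]
    k_best = max(range(len(ratios)), key=ratios.__getitem__) + 1
    return k_best, ratios

# Made-up singular values: two dominant values followed by a noise floor.
k, ratios = estimate_source_count([9.0, 7.5, 0.4, 0.35, 0.3])
```

Here the largest jump lies between the second and third values (7.5 / 0.4), so the estimate is two sources. With real data the gap may be less pronounced, and the result depends on the noise level.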

Blind source separation theory
The fast independent component analysis (FastICA) algorithm [22, 23] based on negentropy is one of the best-known BSS algorithms. It combines the batch processing method with the adaptive method and has a faster processing speed. The negentropy is defined as follows:

$$N_g(Y) = H(Y_g) - H(Y)$$

where $Y_g$ is a Gaussian random variable with the same variance as $Y$. The differential entropy $H(\cdot)$ of a random variable is defined as follows:

$$H(Y) = -\int p_Y(\xi) \lg p_Y(\xi) \, d\xi$$

According to information theory, among random variables with the same variance, a Gaussian random variable has the maximum differential entropy. When $Y$ obeys a Gaussian distribution, $N_g(Y)$ is equal to zero. The stronger the non-Gaussianity of $Y$, the smaller the differential entropy and the larger $N_g(Y)$. However, the calculation of differential entropy requires the probability distribution function of $Y$. For an actually acquired signal, the probability distribution function cannot be known accurately. Therefore, an approximate calculation formula is defined as follows:

$$N_g(Y) \approx C \left[ E\{G(Y)\} - E\{G(Y_g)\} \right]^2$$

where $E\{\cdot\}$ is the mean operation, $G(\cdot)$ is a nonquadratic function, and $C$ is a constant. A specific form of $G(\cdot)$ is selected. The FastICA learning rule is to find the separation matrix $W$ such that $W^T X$ (that is, $Y = W^T X$) has the largest non-Gaussianity. The non-Gaussianity is measured by the approximation of $N_g(W^T X)$.
The procedure of the FastICA algorithm is as follows. The maximum of the approximation of $N_g(W^T X)$ can be obtained by optimizing $E\{G(W^T X)\}$. According to the Kuhn-Tucker condition, under the constraint $E\{(W^T X)^2\} = \|W\|^2 = 1$, the optimal value of $E\{G(W^T X)\}$ is obtained at a point satisfying equation (17):

$$E\{X g(W^T X)\} - \beta W = 0 \tag{17}$$

where $\beta$ is a constant, $\beta \approx E\{W^T X g(W^T X)\}$, and $g(\cdot)$ is the derivative of $G(\cdot)$. The Newton iterative method can be used to solve the equation, and, after simplification, the iterative formula of the FastICA algorithm can be expressed as

$$W^{+} = E\{X g(W^T X)\} - E\{g'(W^T X)\} W, \quad W = \frac{W^{+}}{\|W^{+}\|}$$

where $g'(\cdot)$ is the derivative of $g(\cdot)$. When $n$ independent components have been estimated, the separation matrix $W$ is obtained.
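The fixed-point iteration can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nonlinearity $g = \tanh$ (corresponding to $G = \log\cosh$) is assumed because the specific $G$ is not given here, and the whitening step and the deflation-style orthogonalization are standard FastICA ingredients that the text does not spell out. The names `whiten` and `fastica` are hypothetical.

```python
import numpy as np

def whiten(X):
    """Center and whiten X (channels x samples) so that its covariance is I."""
    Xc = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(Xc))
    # E diag(d^-1/2) E^T applied to the centered data (assumes full rank).
    return (E / np.sqrt(d)) @ E.T @ Xc

def fastica(X, n_components, max_iter=200, tol=1e-8, seed=0):
    """Deflation FastICA with g = tanh (an assumed choice of nonlinearity)."""
    rng = np.random.default_rng(seed)
    Z = whiten(X)
    W = np.zeros((n_components, Z.shape[0]))
    for p in range(n_components):
        w = rng.standard_normal(Z.shape[0])
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            y = w @ Z
            g = np.tanh(y)
            g_prime = 1.0 - g ** 2
            # W+ = E{X g(W^T X)} - E{g'(W^T X)} W
            w_new = (Z * g).mean(axis=1) - g_prime.mean() * w
            # Deflation: remove projections onto rows already found.
            w_new -= W[:p].T @ (W[:p] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        W[p] = w
    return W @ Z, W
```

A call such as `Y, W = fastica(X, 2)` returns the estimated components as rows of `Y`. As with any ICA method, the order and sign of the recovered components are arbitrary.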

Detailed procedure of the proposed method
In this paper, a fault feature extraction method for a rotating shaft devised using Hankel matrix-based SVD and the FastICA algorithm is proposed. Its advantage is that the number of signals can be estimated, and the features can be clearly extracted. The procedure of the proposed method is summarized in figure 1. The detailed procedure has the following steps.
Step 1: A Hankel matrix is constructed for the collected single-channel signal X(t), and the component signals P 1 , P 2 , . . . , P N can be obtained.

Simulation analysis
To show that this method can extract the features of the misalignment fault and imbalance fault of the rotating shaft, simulated signals of a rotating shaft are analyzed in this section. The selected motor speed is 2000 rpm, that is, the rotation frequency is $f_r = 33.3$ Hz. The feature of the imbalance fault is that the frequency $f_r$ is the main frequency in the frequency domain. The feature of the misalignment fault is that the frequency $2f_r = 66.6$ Hz is the main frequency. The sampling frequency is 1000 Hz and the number of sampling points is 1000. The one-dimensional linear-mixed signal is

$$x = a_1 s_1 + a_2 s_2 + n$$

where $s_1$ is the simulated signal of the misalignment fault, $s_2$ is the simulated signal of the imbalance fault, $a_1$, $a_2$ are superposition coefficients, and $n$ is white Gaussian noise. To simulate weak faults, the selected superposition coefficients are (0.12, 0.15) and the SNR is 0.3 dB. The formula for the calculation of SNR is as follows:

$$\mathrm{SNR} = 10 \log_{10} \frac{P_S}{P_N}$$

where $P_S$ is the effective power of the signal and $P_N$ is the effective power of the noise. In this way, a mixed signal with weak faults is simulated. The time-domain waveform of $x$ is shown in figure 2 and the amplitude spectrum of $x$ obtained by FFT is shown in figure 3.
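A mixed signal of this kind can be generated with a short NumPy sketch. The waveforms of $s_1$ and $s_2$ are not given in the text, so pure sinusoids at $2f_r$ and $f_r$ are assumed here; the helper name `add_noise_for_snr` and the random seed are also hypothetical. Power is taken as the mean square value, and the noise is rescaled so that the requested SNR holds exactly for the generated record.

```python
import numpy as np

def add_noise_for_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled so that 10*log10(P_S / P_N) = snr_db."""
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10)
    noise = rng.standard_normal(signal.size)
    noise *= np.sqrt(p_noise / np.mean(noise ** 2))  # match the target power
    return signal + noise, noise

fs = 1000.0            # sampling frequency in Hz
n_samples = 1000       # number of sampling points
f_r = 2000 / 60        # 2000 rpm expressed in Hz (about 33.3 Hz)
t = np.arange(n_samples) / fs

s1 = np.sin(2 * np.pi * 2 * f_r * t)  # misalignment: assumed tone at 2*f_r
s2 = np.sin(2 * np.pi * f_r * t)      # imbalance: assumed tone at f_r
clean = 0.12 * s1 + 0.15 * s2         # superposition coefficients from the text

x, noise = add_noise_for_snr(clean, 0.3, np.random.default_rng(0))
snr_check = 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
```

The value `snr_check` recomputes the SNR from the generated signal and noise, and it should equal the requested 0.3 dB up to rounding.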
No periodic component can be seen in the time-domain waveform in figure 2. In figure 3, the signal energy is almost evenly distributed from low to high frequency, which shows that the fault features are completely obscured by noise.
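An amplitude spectrum of the kind referred to here can be computed with NumPy's real FFT. The sketch below is illustrative only: the function names `amplitude_spectrum` and `dominant_frequency` and the optional zero-padding length `n_fft` are assumptions. It also makes the frequency resolution explicit, since the bin spacing is the sampling frequency divided by the transform length.

```python
import numpy as np

def amplitude_spectrum(x, fs, n_fft=None):
    """One-sided amplitude spectrum of x sampled at fs (Hz).

    n_fft (an assumed option) zero-pads the record when it exceeds len(x);
    the bin spacing is fs / n_fft.
    """
    x = np.asarray(x, dtype=float)
    n_fft = x.size if n_fft is None else n_fft
    spec = np.fft.rfft(x, n=n_fft)
    amp = 2.0 * np.abs(spec) / x.size  # scale by the number of real samples
    amp[0] /= 2.0                      # the DC bin is not doubled
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    return freqs, amp

def dominant_frequency(x, fs, n_fft=None):
    """Frequency of the largest spectral peak, ignoring the DC bin."""
    freqs, amp = amplitude_spectrum(x, fs, n_fft)
    return freqs[1:][np.argmax(amp[1:])]
```

Because a peak can only be reported at a bin frequency, a tone at $f_r$ is generally read off at the nearest bin rather than at exactly 33.3 Hz. For instance, a 1024-point transform at 1000 Hz has bins about 0.977 Hz apart; whether the spectra in this paper use that length is not stated.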
The performance of wavelets, VMD, EEMD+BSS and the proposed method is compared as follows.

Wavelets
In this method, the mixed signal x is first decomposed by four-layer wavelet decomposition, and then the amplitude spectra of the decomposed signals are obtained by an FFT. The result is shown in figure 4. In figure 4(c), a frequency of 64.45 Hz is extracted, but the error between it and 2f r is very large. Moreover, the feature of the imbalance fault is not extracted. Therefore, wavelets cannot fully extract the features of the weak faults of a rotating shaft.

VMD
In this method, the mixed signal x is first decomposed by VMD, and then the amplitude spectrum of each component signal is obtained by an FFT. The result is shown in figure 5. In figure 5(a), the two fault signals are still in a mixed state. Furthermore, the noise energy is very high.

EEMD+BSS
In this method, first the mixed signal x is decomposed by EEMD and each intrinsic mode function (IMF) can be obtained. Then, the number of fault signals can be estimated by SVD. Finally, the fault signals can be separated by the FastICA BSS.
According to the ratios of neighboring singular values in descending order, the number of fault signals is 2. IMFs with high correlation coefficients were selected to reconstruct the new observed signal. The amplitude spectra of the fault signals separated by FastICA are shown in figure 6.
A frequency of 66.41 Hz can be seen in figures 6(a) and (b); it is very close to 2f r . A frequency of 33.2 Hz can be seen in figure 6(b), which is very close to f r . However, the amplitude of other frequencies is also high, and some frequencies can even affect the result of feature extraction. Therefore, the features of the misalignment fault and imbalance fault remain unclear.

The proposed method
In this method, the first step is to construct a Hankel matrix of the one-dimensional mixed signal x, from which the component signals P 1 , P 2 , . . . , P N are obtained. Through SVD, it is estimated that the number of fault signals is 2. Finally, the reconstructed signal (P 1 , P 2 ) is separated by the FastICA BSS. The amplitude spectra of the separated fault signals are shown in figure 7.
In figure 7(a), a frequency of 67.38 Hz, which is very close to 2f r , is clearly extracted. In figure 7(b), a frequency of 33.2 Hz is obvious. In figure 7, the misalignment fault and imbalance fault can be clearly identified.
Compared with the other three methods, the proposed method performs significantly better for fault feature extraction of a rotating shaft. The fault frequency is more obvious and the noise energy is clearly reduced, which is significant for fault diagnosis. The simulation analysis shows that the proposed method can estimate the number of fault signals and is efficient at extracting the features of a rotating shaft with weak faults.

Experimental verification
To verify the practical applicability of the proposed method, it is applied in an experiment to extract the features of a rotating shaft with three mixed faults. A schematic diagram of the whole test system is presented in figure 8, and the test-bed and signal acquisition system are shown in figure 9. The test system includes a test-bed, speed controller, dynamic signal acquisition instrument, computer and analysis software. The eddy current displacement sensor sends the radial vibration signal of the rotating shaft to the dynamic signal acquisition instrument, which converts the analog signal to a digital signal. Finally, the digital signal is uploaded to the analysis software installed on the computer, where it is processed as required by the user.
In the experiment, an artificial misalignment fault, imbalance fault and rub-impact fault are created. Only one eddy current displacement sensor is used to collect the mixed vibration signal of the rotating shaft. The sensor position is shown in figure 10. The sampling frequency is 1000 Hz and the number of sampling points is 5000. The motor speed is 2000 rpm, that is, the rotation frequency is f r = 33.3 Hz. The feature of the rub-impact fault appears at fractional frequencies f r /n, where n is equal to 2, 3 or 4.
The time-domain waveform and amplitude spectra of the healthy signal and the fault signal are shown in figure 11.
In figure 11, the amplitudes at f r and 2f r in the fault signal are much higher than those in the healthy signal. Therefore, it can be confirmed that there is a fault in the rotating shaft. To simulate weak faults, white Gaussian noise is added to the collected signal; the SNR is 0.3 dB. The time-domain waveform and amplitude spectrum of the fault signal after adding noise are shown in figure 12.
In figure 12(b), the fault frequencies cannot be clearly identified. In the proposed method, a Hankel matrix is first constructed for the collected signal. Then, the number of fault signals can be estimated by SVD. The ratios of neighboring singular values are shown in table 1.
As can be seen from the data in table 1, the maximum NSVR occurs when the number of fault signals is 3.
Finally, the component signals P 1 , P 2 , P 3 are selected to reconstruct a new observation signal, and then the fault signals are separated by FastICA; the amplitude spectra of the separated signals are shown in figure 13. A frequency of 16.6 Hz, which is 1/2 of f r , can be found in both figures 13(a) and (c), and can be considered as the feature of the rub-impact fault. In figure 13(a), a frequency of 66.65 Hz is the main frequency, which is the feature of the misalignment fault. In figure 13(b), a frequency of 33.33 Hz is the main frequency, which is the feature of the imbalance fault. The features of all three faults are easily identified. This demonstrates that the proposed method performs well in fault feature extraction for a rotating shaft with mixed misalignment, imbalance and rub-impact faults.

Conclusion
This paper has proposed a method to extract the weak fault features of a rotating shaft, with which the fault features of a single-channel vibration signal can be clearly extracted. Analysis of a simulated signal shows that, compared with wavelets, VMD and EEMD+BSS, the proposed method makes the fault frequency sharper and more visible. The noise energy is clearly reduced in the amplitude spectra, which improves the efficiency of fault feature extraction. Furthermore, the experiment demonstrates the effectiveness of this method.
Finally, the proposed method should be evaluated on real faults in large industrial equipment. Its effectiveness also needs to be verified under non-linear conditions in future research.