Bearing Fault Diagnosis Approach under Data Quality Issues

Abstract: In rotary machinery, bearings are susceptible to different types of mechanical faults, including ball, inner race, and outer race faults. In condition-based monitoring (CBM), several techniques have been proposed for fault diagnostics based on vibration measurements. In this paper, we studied the fractal characteristics of non-stationary vibration signals collected from bearings under different health conditions. Using detrended fluctuation analysis (DFA), we proposed a novel method to diagnose bearing faults based on the scaling exponent (α₁) of the vibration signal at the short-time scale. For vibration data with a high sampling rate, our results showed that the scaling exponent provides an accurate identification of the health state of the bearing. Finally, we evaluated the performance of the proposed method under two data quality issues: data loss and induced noise.


Introduction
Machine health monitoring (MHM) of rotary machinery plays a major role in reducing shutdown time and keeping machinery assets in good working condition. MHM records and analyzes the raw vibration signals gathered by accelerometers. It then extracts useful features from the measurements and converts them into actions applied to the machinery through condition-based maintenance.
Fault diagnostic techniques can be classified into signal-based, model-based, and hybrid methods. The signal-based methods depend mainly on the raw data extracted from the machinery by sensors, so machine faults are detected by applying different analysis techniques to the extracted data. The main advantage of these methods is that no mathematical model of the physical setup is needed and classical signal processing techniques can be exploited for noise reduction and feature extraction. They include several diagnostic techniques in the time and frequency domains, such as kernel density estimation (KDE), root mean square (RMS) [1], crest factor (CF) [2], kurtosis [3,4], fast Fourier transform [5], wavelet transform [6], and wavelet packet transform (WPT) [7]. The main challenge is preprocessing the raw vibration measurements to attenuate noise and recover the original signal.
The model-based methods, in contrast, rely on a mathematical model that simulates the behavior of the mechanical system under study. Applying this approach to any system depends directly on the real system model and its outputs. In this approach, the dynamic response is collected from the system under faulty and normal operating conditions [8-10]. A comparison between the output response parameters and the simulated data then helps in assessing the condition of the mechanical component. The main drawback of this approach is the difficulty of building accurate models under the uncertainties associated with any system running in a noisy environment.
Due to the constraints associated with these two approaches, a mixture of them was proposed as a hybrid method. The hybrid methods are known to be reliable and robust compared to other diagnostic techniques. In these methods, two or more diagnostic techniques are integrated to leverage the strengths of each [11]. Examples of hybrid diagnostic methods are neuro-fuzzy [12], artificial neural network and support vector machines [13], and modified empirical mode decomposition [14].
The classical statistical methods are considered lengthy processes as they are required to perform feature extraction along with classification. In addition, some of the bearing fault features may be hidden in the data and they cannot be identified using the classical methods. To expedite the fault diagnosis process and unravel the hidden features, several machine learning (ML) approaches have been proposed successfully. These methods include artificial neural networks (ANN) [15], k-nearest neighbors (KNNs) [16], fuzzy cognitive networks (FCNs) [17], multi-agent system (MAS) approach using intelligent classifiers [18], and support vector machine (SVM) [19]. By analyzing and learning from the bearing data, ML methods can apply what they learned to make informed decisions about bearing health. Most ML methods can identify and classify the bearing with a high accuracy rate. However, some challenges, such as rolling element sliding, frequency interplay, external vibration, and feature sensitivity, may result in inaccurate feature extraction and classification [20].
Recently, deep learning (DL) has been used extensively in bearing fault diagnosis. Based on neural networks (NN) and artificial intelligence (AI), the DL methods offer better performance [21] and automatic feature extraction [22] compared to other ML methods. A convolutional neural network (CNN) can detect a defect by learning optimal filters and extracting the best features from the raw vibration signals. It outperforms the classical ML methods in accuracy and speed of feature extraction [23]. Moreover, the auto-encoder approach is an unsupervised DL method that has better anti-noise characteristics in feature extraction [24,25]. Other DL methods include the deep belief network (DBN) [26], recurrent neural network (RNN) [27], and generative adversarial network (GAN) [28].
The detection of early stages of bearing degradation is feasible by tracking the changes in the characteristics of the bearing measurements, including vibration, temperature, and lubrication parameters. Evidently, these distinct changes could be symptoms of different bearing faults (ball, inner race, and outer race). As part of fault diagnostics in bearings, it is of great importance to extract the unique features associated with each of these faults. These features could serve as fault classifiers based on the bearing measurements.
In the time domain, studying the statistical characteristics of the vibration measurements is one of the reliable fault diagnostic techniques. Under different fault conditions, the vibration signals possess different statistical parameters (root mean square, kurtosis, skewness, etc.). In the literature, monitoring of these parameters has been exploited extensively to identify bearing faults, with varying success rates.
The statistical and fractal characteristics of the vibration signals can be inferred by calculating the scaling (Hurst) exponent of these signals. The scaling exponent serves as a measure of the level of long-range correlations in the time series. In general, bearing vibration signals are known to be non-stationary in nature [29]. Therefore, we used detrended fluctuation analysis (DFA) [30] to estimate the scaling exponents of vibration signals with sampling rates of 12,000 and 48,000 samples/s. DFA is one of the reliable methods for estimating scaling exponents in non-stationary time series. Based on the estimated scaling exponents, we proposed a novel diagnostic approach for the different bearing faults. Our results showed that the proposed method can classify healthy and faulted bearings at high accuracy rates using vibration signals sampled at 48,000 samples/s. On the other hand, it is not suitable for classifying bearing faults in typical data acquisition systems with low sampling rates (12,000 samples/s) because the scaling exponent cannot be estimated accurately.

Case Western Reserve University (CWRU) Testing Rig
The vibration data sets were provided by the Case Western Reserve University (CWRU) bearing data center [31]. In the experiment, a testing rig was used to extract the vibration signals from the bearings, as shown in Figure 1. The apparatus consisted of an electric motor, a torque transducer, a dynamometer, and the tested bearings. A 1.491 kW Reliance Electric motor drives a shaft that is connected by a coupling. On the driving shaft, a torque transducer and encoder were mounted to measure speed and horsepower data. The vibration signals were gathered using accelerometers attached to the motor housing with a magnetic base. The accelerometers were located at the 12 o'clock position at both the drive end and the fan end of the motor housing.
In this experiment, a dynamometer with an embedded electronic control unit was used to apply different motor loads (0-2.237 kW). This type is known as an absorbing dynamometer, which acts as a load driven by the motor. The amount of applied torque was proportional to the motor load, which was managed via a variable frequency drive through the control unit. Therefore, an increase in motor load resulted in an increase in torque and a reduction in shaft speed. The 0 kW motor load represented the no-load condition, in which the shaft speed was only slightly below the synchronous speed. Under the three loading conditions (0.746-2.237 kW), the shaft speed dropped further below the synchronous speed. The tested bearings were steel, sealed, deep-groove ball bearings (SKF 6205-2RS JEM) installed at the drive end of the motor. The ball bearing and its dimensions are shown in Table 1. Each bearing had a limiting speed of 18,000 rpm and a basic dynamic load rating of 14,800 N. Artificial fatigue cracks were seeded on the bearings using electro-discharge machining (EDM), as shown in Figure 2. Faults with diameters of 0.18 mm and 0.53 mm were introduced on different bearings at the rolling element (ball), the inner race, and the outer race. After the faults were seeded, the bearings were installed into the motor to test them and record the vibration data sets.

CWRU Vibration Data Sets
The vibration data sets were collected from 28 bearings under normal and faulty conditions, as shown in Table 2. Four data sets were collected from bearings under normal condition. The remaining 24 data sets were collected from bearings with minor faults (0.18 mm) and severe faults (0.53 mm). As explained earlier, the tested bearing faults were ball, inner race, and outer race faults. In the experiment, the data sets were measured using accelerometers attached to the motor drive end with a magnetic base. The sampling rate of the data sets was 48,000 samples/s.
In Table 2, the provided rotational speed is the average speed measured during each test. The speed was recorded by a tachometer installed on the shaft. Using the dynamometer and electronic control system, several motor loads (0-2.237 kW) were used to apply different torque loads on the shaft. The control system kept the motor load approximately constant throughout each test. Moreover, the small changes in motor load did not affect the radial load and had a minimal effect on shaft speed [33].

DFA Method
The DFA method was introduced to study the long-range correlation in DNA nucleotides [30]. It is a powerful method for estimating power-law scaling, especially in non-stationary time series. Because of the detrending step in the DFA method, the non-stationarity present in the series can be removed prior to estimating the scaling exponent (α). The method has been used to assess correlation properties in various research fields.
The scaling exponent (α) provides a reliable measure of the stationarity and the degree of long-range correlation in any time series. The time series is stationary when the scaling exponent (α) is below 1. Moreover, a stationary time series can be anticorrelated (0 < α < 0.5), uncorrelated (α = 0.5), or long-range correlated (0.5 < α < 1). In an anticorrelated series, increases are likely to be followed by decreases, and vice versa. In a long-range correlated series, on the other hand, increases are expected to be followed by further increases, and decreases by further decreases.
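A non-stationary time series has a scaling exponent higher than 1. For α > 1, the series has long-range correlation that does not follow a power law. The steps to compute the scaling exponent (α) of any time series, x(t), of length N are summarized below:
(1) Compute the integrated time series, y(t), as shown in Equation (1), where \bar{x} is the mean of x(t).
$$ y(t) = \sum_{i=1}^{t} \left[ x(i) - \bar{x} \right], \quad t = 1, \dots, N \tag{1} $$
(2) Divide the integrated series into equally spaced boxes of size n. Inside each box, detrend the series segment by subtracting its best linear fit, y_n(t).
(3) Compute the root-mean-square fluctuation function, F(n), of the detrended series, as shown in Equation (2).
$$ F(n) = \sqrt{ \frac{1}{N} \sum_{t=1}^{N} \left[ y(t) - y_n(t) \right]^2 } \tag{2} $$
(4) Repeat steps (2) and (3) over a range of box sizes. The scaling exponent (α) is the slope of the best linear fit of the fluctuation function, F(n), versus n on the log-log scale.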
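The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `dfa_fluctuation` and `scaling_exponent`, the choice of box sizes, and the decision to discard the incomplete trailing box (so the average runs over the samples that fit into whole boxes) are all assumptions made for the example.

```python
import numpy as np

def dfa_fluctuation(x, box_sizes):
    """Return the DFA fluctuation F(n) for each box size n (linear detrending)."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())              # step 1: integrated series y(t)
    fluct = []
    for n in box_sizes:
        n_boxes = len(y) // n                # assumption: drop the incomplete last box
        boxes = y[: n_boxes * n].reshape(n_boxes, n)
        t = np.arange(n)
        # step 2: least-squares line y_n(t) fitted inside every box at once
        coeffs = np.polyfit(t, boxes.T, 1)   # shape (2, n_boxes): slopes, intercepts
        trend = np.outer(coeffs[0], t) + coeffs[1][:, None]
        # step 3: root-mean-square of the detrended series
        fluct.append(np.sqrt(np.mean((boxes - trend) ** 2)))
    return np.array(fluct)

def scaling_exponent(box_sizes, fluct):
    """Step 4: slope of log F(n) versus log n."""
    slope, _intercept = np.polyfit(np.log10(box_sizes), np.log10(fluct), 1)
    return slope

# Sanity check on synthetic data (not bearing data): white noise should give
# an exponent near 0.5 and its running sum (a random walk) one near 1.5.
rng = np.random.default_rng(0)
white = rng.standard_normal(2**15)
scales = np.array([16, 32, 64, 128, 256, 512])
alpha_white = scaling_exponent(scales, dfa_fluctuation(white, scales))
alpha_walk = scaling_exponent(scales, dfa_fluctuation(np.cumsum(white), scales))
print(alpha_white, alpha_walk)
```

To estimate an exponent over a specific scaling region, the same `scaling_exponent` call can be restricted to the box sizes that fall inside that region.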

Crossover Phenomenon
The existence of two linear scaling regions in the log-log plot of the fluctuation function is called a crossover. This phenomenon was first discovered in heartbeat time series [34]. Subsequently, the crossover phenomenon was confirmed in several real-world time series in diverse research fields. In [35], the authors showed that a crossover in the DFA method could result from a periodic trend in the time series.
In a correlated time series with an additive periodic sinusoidal trend of period T, the fluctuation function, F(n), can be written as the superposition of the fluctuation function of the time series, F_S(n), and the fluctuation function of the sinusoidal trend, F_T(n), as shown in Equation (3).
$$ \left[ F(n) \right]^2 = \left[ F_S(n) \right]^2 + \left[ F_T(n) \right]^2 \tag{3} $$
The fluctuation function, F(n), exhibits a crossover phenomenon with two linear regions. It can be shown that the crossover occurs at the period of the sinusoidal trend (n = T). As shown in Figure 3, at the short-time scale (n < T), the effect of the periodic trend is dominant compared to the correlated series. At the long-time scale (n > T), the fluctuation function of the correlated series is dominant because the fluctuation function of the periodic sinusoidal trend remains bounded at higher time scales.

Fractal Characteristics of Vibration Signals
Using the DFA, the fractal characteristics of vibration signals can be inferred by calculating the scaling exponent. The bearing vibration data were collected as part of an experiment conducted by Case Western Reserve University. At a sampling rate of 48,000 samples/s, Figure 4 shows 1-second vibration measurements from bearings under four different health conditions: no fault (normal), rolling element (ball), inner race, and outer race faults.
Based on the measurements, the vibration signals in faulted bearings had higher deviation compared to the normal ones. In the normal bearing, the vibration amplitude varied between −0.2 m/s² and 0.2 m/s². However, the outer race-faulted bearing had more severe deviations, with vibration amplitudes up to 6 m/s². Among the faulted bearings, the ball fault had lower vibration amplitude than the other two faults (inner race and outer race).
We applied the DFA method to the vibration signals from 28 bearings under three health conditions: no fault (normal), minor fault (0.18 mm), and severe fault (0.53 mm). In Figure 5, all the fluctuation functions show two scaling regions resulting from the crossover phenomenon. At the short-time scale (3 < n < 17), the average scaling exponents of vibration signals from bearings under normal state, minor faults, and severe faults were 1.39, 1.76, and 1.85, respectively. Therefore, the calculated scaling exponents (α₁) indicated non-stationarity (α > 1) and long-range correlation (α > 0.5) at the short-time scale. The scaling exponents (α₁) fell between 1 and 2 and increased from normal toward severely faulted bearings.
On the other hand, the second linear region was present at the long-time scale, for window sizes (n) between 7000 and 200,000. At the long-time scale, the average scaling exponents of vibration signals from bearings under normal state, minor faults, and severe faults were 1.07, 1.01, and 1.02, respectively. The scaling exponents (α₂) of the vibration signals were distributed approximately around 1 for all the bearings under the different health conditions. That means the vibration signals possessed long-range correlation and behaved similarly to 1/f (pink) noise, whose scaling exponent is 1.
The fractal characteristics of the vibration signals could unravel the hidden features associated with the different bearing faults. By studying these characteristics, we provided a novel method to diagnose the health condition of the bearing based on its scaling exponents. As shown in Figure 6, the scaling exponents (α₁ and α₂) of the bearings were plotted on the x- and y-axes. The range of the scaling exponents (α₂) at the long-time scale was the same among normal and faulted bearings. That means the minor and severe faults had no effect on the vibration dynamics at the long-time scale.
On the other hand, the faulted bearings possessed higher scaling exponents (α₁) at the short-time scale compared to the normal ones. The normal bearings had smaller scaling exponents (α₁) distributed between 1.2 and 1.6. Moreover, the scaling exponents increased gradually as the severity of the fault increased. At the short-time scale, the scaling exponents showed a shift from anticorrelated increments (α₁ < 1.5) in normal bearings to long-range correlated increments (α₁ > 1.5) in faulted bearings. A promising application of our proposed method is the detection of different bearing faults from the vibration signals by estimating the scaling exponents (α₁) at the short-time scale.
In the literature, DL methods have been used increasingly in fault detection and diagnosis due to their high efficiency and accuracy. Specifically, several DL methods have been proposed to diagnose bearing faults based on the CWRU data set [36]. The most common methods include auto-encoders, convolutional neural networks (CNN), deep belief networks (DBN), and generative adversarial networks (GAN). As shown in Table 3, depending on the selected classifier, these methods can diagnose bearing faults at different accuracy rates: auto-encoders (83-99.8%), CNN (96.8-100%), DBN (84-99.6%), and GAN (96-99.9%).
In this paper, the scaling exponent (α₁) was proposed as the input of two classification approaches. In the first one, we aimed to classify the bearings as either healthy or faulty based on scaling exponents from several 1000-sample vibration signals. In the second one, we classified the severity of the fault as either minor or severe. The performance of the scaling exponent (α₁) as a binary classifier can be assessed using the confusion matrix and the receiver operating characteristic (ROC) curve. Specifically, we chose the overall accuracy, as shown in Equation (4), to evaluate the performance of the proposed method.
The accuracy quantifies the performance as the ratio of correctly predicted samples to the total number of samples.
$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4} $$
TP and TN are the numbers of correctly predicted positive and negative samples, respectively. FP is the number of negative samples that were incorrectly classified as positive. In addition, FN denotes the number of positive samples that were incorrectly classified as negative.
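As a rough sketch of how the overall accuracy can be evaluated for a threshold on α₁, the snippet below counts the four confusion-matrix entries and scans candidate cutoffs. The function names, the decision rule (a signal is called faulty when its exponent exceeds the cutoff, in line with faulted bearings showing higher α₁), and the toy exponents are assumptions for illustration only; they are not values from the CWRU data.

```python
import numpy as np

def confusion_counts(alpha, is_faulty, cutoff):
    """Count TP, TN, FP, FN when 'faulty' is predicted for alpha > cutoff."""
    alpha = np.asarray(alpha, dtype=float)
    is_faulty = np.asarray(is_faulty, dtype=bool)
    predicted_faulty = alpha > cutoff        # assumed decision rule
    tp = int(np.sum(predicted_faulty & is_faulty))
    tn = int(np.sum(~predicted_faulty & ~is_faulty))
    fp = int(np.sum(predicted_faulty & ~is_faulty))
    fn = int(np.sum(~predicted_faulty & is_faulty))
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Overall accuracy: correctly predicted samples over all samples."""
    return (tp + tn) / (tp + tn + fp + fn)

def best_cutoff(alpha, is_faulty):
    """Scan midpoints between sorted distinct exponents; needs >= 2 distinct values."""
    values = np.unique(alpha)                # sorted distinct exponents
    candidates = (values[:-1] + values[1:]) / 2
    scores = [accuracy(*confusion_counts(alpha, is_faulty, c)) for c in candidates]
    best = int(np.argmax(scores))            # first cutoff reaching the maximum
    return float(candidates[best]), float(scores[best])

# Made-up exponents, purely illustrative.
toy_alpha = [1.3, 1.4, 1.5, 1.7, 1.8, 1.9]
toy_faulty = [False, False, False, True, True, True]
cutoff, acc = best_cutoff(toy_alpha, toy_faulty)
print(cutoff, acc)
```

Scanning only the midpoints between observed exponents is sufficient here, because the accuracy can change only when the cutoff crosses one of the observed values.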
In the first approach, we calculated the scaling exponents (α₁) from the vibration signals collected from healthy and faulted bearings. Each scaling exponent was calculated from a vibration signal of 1000 samples. The scaling exponent distributions of all the vibration signals are shown in Figure 7a. Using the ROC curve, we evaluated the performance of our classification method by calculating the area under the curve (AUC). In general, the AUC values range from 0.5 (random prediction) to 1 (perfect prediction). As shown in Figure 7b, the AUC was equal to 1, which indicates that our method can identify healthy and faulted bearings correctly without any misclassifications. Figure 7c shows the overall accuracy of the proposed method at different cutoff values. The highest accuracy, 100%, was achieved at a scaling exponent cutoff of 1.63.
In the second approach, we focused on the classification of bearings with minor or severe faults based on the scaling exponent (α₁). Similarly, the scaling exponents were calculated over several 1000-sample vibration signals. As shown in Figure 8, we computed the scaling exponent distributions of the faulted bearings with minor or severe faults. Based on the ROC and accuracy curves, the method had high prediction capability in identifying fault severity, with an AUC of 0.9. Moreover, the overall accuracy of this approach was 85%, achieved at a cutoff of α₁ = 1.91.
In this paper, we focused our analysis on vibration signals with a high sampling rate of 48,000 samples/s. The high sampling rate provided better visibility of the vibration signal dynamics. Therefore, our approach is capable of diagnosing bearing faults with high accuracy rates, 85% and 100%, as shown in Table 4. The other CWRU data sets, recorded for the three bearing faults, had a lower sampling rate of 12,000 samples/s.
As a result of the low sampling rate, the fluctuation functions had smaller linear regions at the short- and long-time scales, as shown in Figure 9. Therefore, it was not possible to estimate the scaling exponents (α₁ and α₂) accurately. Moreover, it is clear from Figure 10 that the classification of different bearing faults was more challenging and less accurate.
In Figure 11, we evaluate the diagnostic performance of the proposed method by calculating the scaling exponent (α₁) from vibration signals at a sampling rate of 12,000 samples/s. The ROC and accuracy curves show that the accuracy of diagnosing minor and severe faults was 65% at a scaling exponent (α₁) cutoff of 1.33.
Table 4. Diagnostic performance of the proposed method using detrended fluctuation analysis (DFA).
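The AUC used in both classification approaches can be computed without drawing the ROC curve, because it equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties counted as one half). The sketch below uses that rank-based identity; the function name `roc_auc` and the toy scores are hypothetical, and the paper does not state how its AUC values were computed.

```python
import numpy as np

def roc_auc(scores, is_positive):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    scores = np.asarray(scores, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    pos = scores[is_positive]
    neg = scores[~is_positive]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    diff = pos[:, None] - neg[None, :]       # every positive versus every negative
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

# Made-up exponents: perfectly separated classes give an AUC of 1.
auc_separated = roc_auc([1.3, 1.4, 1.7, 1.9], [False, False, True, True])
# One overlapping pair lowers the AUC below 1.
auc_overlap = roc_auc([1.3, 1.8, 1.7, 1.9], [False, False, True, True])
print(auc_separated, auc_overlap)
```

The pairwise comparison costs memory proportional to the product of the two class sizes, which is acceptable for a few thousand exponents but would call for a sort-based variant on larger sets.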

Effect of Data Anomalies on the Fractal Characteristics of Vibration Signals
In a real-world environment, vibration measurements are susceptible to data quality issues during the data acquisition stage. This section therefore studies the effect of these issues on the estimated fractal characteristics of vibration measurements. Here, we focused on the impact of induced noise and data loss on the estimated scaling exponents of the vibration signals under different health conditions.

Vibration Signals with Induced Noise
The vibration data may contain measurement errors and noise as a result of experimental measurements. The level of induced noise depends on the application and the measuring device used. In vibration measurements, the additive noise is assumed to follow a Gaussian distribution [43,44]. To study the effect of the noise on the performance of the DFA and the proposed method, we considered an additive zero-mean white Gaussian noise with a variable standard deviation (σ) to simulate different noise levels. For each noise level, we calculated the signal-to-noise ratio (SNR) as the ratio of the vibration signal power to the power of the noise.
In this paper, the scaling exponent (α₁) was proposed as a new diagnostic parameter of the bearing health, so we focused here on the effect of the induced noise on the estimated scaling exponent. At each SNR, we calculated the scaling exponents (α₁) of the noise-contaminated vibration signals from bearings under the three health conditions (normal, minor fault, and severe fault). Subsequently, we calculated the mean relative errors in the scaling exponent versus the corresponding SNR, as shown in Figure 12. In normal bearings, the relative errors in the scaling exponent remained below 10% for all SNRs higher than 9 dB. The faulted bearings, however, had mean relative errors in the scaling exponent below 10% only for SNR ≥ 12 dB. From Figure 12, it is clear that the white Gaussian noise had a negligible effect on the estimated scaling exponents for SNR ≥ 18 dB.
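The noise injection described above can be sketched as follows. The snippet chooses the noise standard deviation σ so that the signal-to-noise power ratio matches a target value in decibels. The function names, the use of the mean square as signal power, and the 10·log10 conversion to dB are assumptions of this example; the sine wave merely stands in for a vibration record.

```python
import numpy as np

def add_white_gaussian_noise(signal, target_snr_db, rng=None):
    """Return signal plus zero-mean white Gaussian noise at the requested SNR."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)              # assumed definition of power
    noise_power = signal_power / 10 ** (target_snr_db / 10)
    sigma = np.sqrt(noise_power)                     # noise standard deviation
    noise = rng.normal(0.0, sigma, size=signal.shape)
    return signal + noise

def measured_snr_db(clean, noisy):
    """SNR in dB from the clean signal and its noisy version."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noisy, dtype=float) - clean
    return 10 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

# Synthetic stand-in for a vibration record (not CWRU data).
rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 50 * np.arange(100_000) / 48_000)
noisy = add_white_gaussian_noise(clean, target_snr_db=12, rng=rng)
print(measured_snr_db(clean, noisy))
```

Repeating this for several target SNR values and re-estimating α₁ on each noisy signal gives the kind of error-versus-SNR curve discussed in this section.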

Vibration Signals under Data Loss
As part of the condition-based monitoring of mechanical systems, incomplete or missing vibration measurements may occur as a result of sensor failure, communication failure, or storage size restrictions [45,46]. To assess how data loss limits the proposed method, we evaluated the effect of different percentages of data loss on the estimated scaling exponents (α₁). The data loss was simulated by randomly removing data samples from the original vibration signal.
At each data loss percentage, we calculated the relative errors in the scaling exponent (α₁) of vibration signals from the different bearings. In Figure 13, we plot the mean relative errors in the scaling exponent versus the corresponding data loss percentage. The relative errors increased gradually as the data loss percentage increased. For data loss percentages up to 15%, the mean relative errors for all the bearings were below 10%. We concluded that the proposed diagnosis method is robust against low data loss percentages of up to 15%.
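One way to simulate the random data loss described above is shown below. It is only a sketch under stated assumptions: the lost samples are deleted and the remaining ones are concatenated in their original order, and the relative error is expressed as a percentage of the exponent estimated from the complete signal. Both function names are hypothetical.

```python
import numpy as np

def drop_random_samples(signal, loss_fraction, rng=None):
    """Remove a random fraction of samples, keeping the rest in time order."""
    if not 0 <= loss_fraction < 1:
        raise ValueError("loss_fraction must be in [0, 1)")
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal)
    n_lost = int(round(loss_fraction * len(signal)))
    keep = rng.choice(len(signal), size=len(signal) - n_lost, replace=False)
    return signal[np.sort(keep)]             # sort indices to preserve ordering

def relative_error_percent(alpha_reference, alpha_estimate):
    """Relative error of an estimated exponent against the loss-free estimate."""
    return 100 * abs(alpha_estimate - alpha_reference) / abs(alpha_reference)

# Index ramp used as a stand-in signal so the preserved ordering is easy to see.
rng = np.random.default_rng(3)
ramp = np.arange(1000)
lossy = drop_random_samples(ramp, 0.15, rng=rng)
print(len(lossy), relative_error_percent(1.5, 1.65))
```

In an experiment of this kind, the exponent would be re-estimated on each lossy signal and compared with the estimate from the complete record using `relative_error_percent`.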

Conclusions
In this paper, we studied the fractal characteristics of bearing vibration signals from the CWRU data set using the DFA method. Our results showed that vibration signals from faulted bearings had higher scaling exponents (α₁) at the short-time scale compared to normal bearings. Based on the estimated scaling exponent, we proposed a novel method to detect bearing faults and classify their severity. The proposed method can identify faulted bearings with an accuracy rate of 100% using vibration data with a high sampling rate. We believe that the scaling exponent (α₁) can be adopted as a novel classifier of different bearing faults. One limitation of the proposed method is its dependence on the sampling rate of the vibration data: it is not suitable for typical data acquisition systems with low sampling rates.
The proposed method was also shown to be resilient against two common data quality issues in real measurement environments, induced noise and data loss. The relative error in the scaling exponent remained below 10% for acceptable SNR and data loss levels. In the future, we intend to apply the proposed diagnostic approach to other benchmark vibration data sets to test its capability in identifying bearing faults under different operating conditions.