An optimized ensemble local mean decomposition method for fault detection of mechanical components

Mechanical transmission systems have been widely adopted in most industrial applications, and issues related to the maintenance of these systems have attracted considerable attention in the past few decades. The recently developed ensemble local mean decomposition (ELMD) method shows satisfactory performance in fault detection of mechanical components for preventing catastrophic failures and reducing maintenance costs. However, the performance of ELMD often depends heavily on proper selection of its model parameters. To this end, this paper proposes an optimized ensemble local mean decomposition (OELMD) method to determine an optimum set of ELMD parameters for vibration signal analysis. In OELMD, an error index termed the relative root-mean-square error (Relative RMSE) is used to evaluate the decomposition performance of ELMD with a certain amplitude of the added white noise. Once a maximum Relative RMSE, corresponding to an optimal noise amplitude, is determined, OELMD then identifies optimal noise bandwidth and ensemble number based on the Relative RMSE and signal-to-noise ratio (SNR), respectively. Thus, all three critical parameters of ELMD (i.e. noise amplitude and bandwidth, and ensemble number) are optimized by OELMD. The effectiveness of OELMD was evaluated using experimental vibration signals measured from three different mechanical components (i.e. the rolling bearing, gear and diesel engine) under faulty operation conditions.


Introduction
Condition monitoring and fault diagnosis (CMFD) is an important research field in reliability analysis of mechanical components [1][2][3]. Many CMFD techniques have been developed to analyze the vibration signals acquired from these mechanical components [4][5][6][7]. Recently, time-frequency-domain analysis techniques, such as the Wigner-Ville distribution [8] and the wavelet transform [9][10][11], have attracted considerable attention. These techniques have been shown to be successful in various applications, but they are non-adaptive [12]. For example, wavelet analysis can extract local features in multiple scales, and hence enables the identification of singular components of a vibration signal for fault diagnosis of rotating machinery [8][9][10][11]. However, wavelet analysis has a critical limitation, i.e. different mother wavelets need to be determined or predefined for different signal processing applications. Therefore, these techniques are not self-adaptive [12].
Unlike the aforementioned analysis methods, empirical mode decomposition (EMD) can adaptively decompose any given complex signal into a series of intrinsic mode functions (IMFs) [13,14]. However, one of the major disadvantages of EMD is its susceptibility to the mode mixing problem. Mode mixing is defined as either a single IMF consisting of multiple widely disparate scales, or a signal residing in multiple IMF components. Although ensemble empirical mode decomposition (EEMD) represents a significant improvement over EMD for reducing mode mixing [15][16][17][18], the performance of EEMD largely depends on proper determination of its model parameters (i.e. the ensemble number and the ratio of the added noise) [19]. In addition, the computation complexity is also a concern for EMD and EEMD [20,21].
Recently, the local mean decomposition (LMD) [22] was introduced to solve the mode mixing problem in EMD. Instead of the Hilbert transform used with EMD, LMD uses smoothed local means to extract intrinsic modes from a signal. Hence, the information loss caused by the Hilbert transform can be minimized [23]. The advantages of LMD over EMD were originally discussed in their applications to electroencephalogram (EEG) signals [22]. In mechanical fault detection, Wang et al [24,25] used LMD for fault diagnosis of rotating machinery and demonstrated that LMD provided better performance than EMD. It was reported in [26] that the features extracted by LMD provided satisfactory fault detection performance on a helical gearbox. Kidar et al [27] compared LMD and EMD in gear fault diagnosis. Their analysis results suggested that LMD improved the detection effectiveness over EMD for early gear defects. Feng et al [28] adopted LMD to detect early faults in planetary gearboxes. Chen et al [29] applied LMD to detect both gear and bearing faults and showed that LMD was more effective than EMD in fault detection. Liu et al [30] combined the wavelet transform and LMD to analyze field data obtained from a locomotive rolling bearing. Han and Pan [31] integrated LMD with entropy/energy ratio to detect rolling bearing faults. Wang et al [32] incorporated LMD into morphology analysis for pump fault diagnosis. The above-mentioned studies demonstrate superior performance of LMD over EMD in mechanical fault diagnosis [24][25][26][27][28][29][30][31][32].
To further enhance the capability of LMD to mitigate mode mixing, Sun et al [33] recently presented the application of an ensemble LMD (ELMD) to gas leak detection. Similar to EEMD, ELMD adds white noise to a raw vibration signal when decomposing the signal into characteristic modes. Mode mixing caused by the uniform time-frequency distribution of different-scale components in the raw signal can be significantly reduced since the added white noise can assist in tuning the time-frequency distribution [34]. As a result, ELMD can often obtain more reliable smoothed local means than LMD [33,34], and thus produce better performance in mechanical fault detection [34,35]. Moreover, ELMD showed better performance than EEMD in terms of fault detection rate [33][34][35]. Because LMD is computationally more efficient than EMD [22], ELMD imposes less computational burden than EEMD.
The effectiveness of ELMD in mode-mixing reduction is often highly influenced by the selection of its parameters, i.e. the amplitude and bandwidth of the added noise and the number of ensemble trials. Very limited work has been done to optimize these ELMD parameters, and no prior research has been found in mechanical fault diagnosis using parameter-optimized ELMD. Hence, it is worth investigating how the ELMD parameters affect the performance in fault diagnosis of mechanical components and how to optimize these parameters to achieve better performance. This paper proposes an optimized ensemble local mean decomposition (OELMD) method that optimizes three ELMD parameters (i.e. amplitude and bandwidth of the added noise, and number of the ensemble trials) to ensure satisfactory decomposition performance. Through the parameter optimization, the parameter dependency of the decomposition performance of ELMD can be accounted for when designing ELMD for a specific application. The resulting OELMD is expected to achieve a lesser degree of mode mixing than an ELMD with subjectively chosen parameters. This work was inspired by the previous studies in [16,18,36]. In [18], Guo et al proposed a method based on the relative root-mean-square error (Relative RMSE) to optimize the noise level for EEMD. Lei et al [16] then presented the use of Relative RMSE to select both the sifting number and noise amplitude in EEMD. More recently, the Relative RMSE-based method was proposed in [36] for the optimization of the noise bandwidth in EEMD. To the best of the authors' knowledge, no prior studies have investigated the use of Relative RMSE as a criterion to optimize ELMD parameters. The contributions of this work include: (a) the development of a new procedure based on Relative RMSE to optimize the noise amplitude and bandwidth for ELMD; and (b) the adoption of signal-to-noise ratio (SNR) to select an appropriate ensemble number for ELMD.
The effectiveness of OELMD was evaluated and compared with that of ELMD using experimental vibration data measured from three different mechanical components (i.e. the rolling bearing, gear and diesel engine).

Fundamentals of the ELMD method
The ELMD method was recently developed to reduce mode mixing caused by the uniform time-frequency distribution of a signal. This method repeatedly applies LMD by adding white noise to the signal throughout the signal decomposition process [33]. If there are sufficient trials in the decomposition process, the added white noise can be canceled out in the final ensemble mean [34].
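The ensemble averaging idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `decompose` is a hypothetical callable standing in for an LMD routine, the noise is plain (not band-limited) Gaussian noise, and every trial is assumed to return the same number of components.

```python
import random

def ensemble_decompose(x, decompose, noise_amp, n_ensemble, seed=0):
    """Average the components returned by `decompose` over noisy copies of x.

    `decompose` is a hypothetical callable (for example an LMD routine) that
    maps a list of samples to a list of components, each as long as x.
    """
    rng = random.Random(seed)
    sums = None
    for _ in range(n_ensemble):
        # Add an independent white-noise realisation to the raw signal.
        noisy = [v + noise_amp * rng.gauss(0.0, 1.0) for v in x]
        comps = decompose(noisy)
        if sums is None:
            sums = [list(c) for c in comps]
        else:
            # Assumption: every trial yields the same number of components.
            for acc, c in zip(sums, comps):
                for k, v in enumerate(c):
                    acc[k] += v
    # Ensemble mean of each component.
    return [[v / n_ensemble for v in acc] for acc in sums]
```

With a trivial decomposer that returns the noisy signal unchanged, the residual noise in the ensemble mean shrinks roughly as one over the square root of the number of trials, which is the cancellation effect the text refers to.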
Assume that a multi-component nonlinear and/or nonstationary signal, x(t), can be decomposed into a series of production functions (PFs), $\mathrm{PF}_j(t)$, and a monotonic function $r_N(t)$ [22]:
$$x(t) = \sum_{j=1}^{N} \mathrm{PF}_j(t) + r_N(t) \qquad (1)$$
where j is the index of the production function (PF), and N is the total number of the PFs. Essentially, each PF is the product of a purely frequency modulated signal and an envelope signal [22]. Furthermore, the corresponding complete time-frequency distribution can be obtained by assembling the instantaneous amplitudes and instantaneous frequencies of all PF components. This process is demonstrated using a simulated signal x(t) consisting of three components in equation (2).
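The structure of this decomposition (each PF being an envelope multiplied by a frequency-modulated carrier, and the signal being the sum of the PFs plus a monotonic residual) can be illustrated with a toy signal. The sampling rate, envelopes, carrier frequencies, and residual below are made-up values chosen only for illustration; they are not the simulated signal used in this paper.

```python
import math

def product_function(envelope, phase):
    """Build one PF sample by sample as envelope * cos(phase)."""
    return [a * math.cos(p) for a, p in zip(envelope, phase)]

fs = 1000.0                                  # assumed sampling rate in Hz
t = [k / fs for k in range(1000)]

# PF 1: slowly varying envelope times a 100 Hz carrier (illustrative values).
env1 = [1.0 + 0.5 * math.cos(2 * math.pi * 2 * tk) for tk in t]
ph1 = [2 * math.pi * 100 * tk for tk in t]

# PF 2: constant envelope times a 10 Hz carrier (illustrative values).
env2 = [0.3] * len(t)
ph2 = [2 * math.pi * 10 * tk for tk in t]

residual = [0.1 * tk for tk in t]            # monotonic trend r_N(t)

pfs = [product_function(env1, ph1), product_function(env2, ph2)]

# Signal model: sum of all PFs plus the monotonic residual.
x = [sum(pf[k] for pf in pfs) + residual[k] for k in range(len(t))]
```

A decomposition method such as LMD works in the opposite direction: it starts from x and tries to recover the envelopes and carriers.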
$$x(t) = x_1(t) + x_2(t) + x_3(t) \qquad (2)$$
where
$$x_1(t) = 1.5\,e^{[\ldots]}\sin(2\pi \cdot 5000\,t) \qquad (3)$$
$$x_2(t) = 0.2\,[1 + \cos(2\pi \cdot 100\,t)]\cos(2\pi\,[\ldots]\,t)$$
and $x_3(t)$ is an additive Gaussian white noise with a bandwidth from 2 to 4 kHz. In equation (3), t′ is a periodic function of time with a fundamental period of 1/160 s. This frequency range was selected based on the fact that, compared to low-frequency noises, high-frequency noises generally contribute more to the changes of the extremum distribution of the original signal [24]. The time domain waveforms of x(t) and its three components are shown in figure 1, where the sampling frequency is 40 kHz.
The decomposition results of x(t) by ELMD are shown in figure 2, where it can be observed that the four PFs (i.e. PF1-PF4) with different characteristic modes do not match the actual modes shown in figure 1 well. The reason is probably that improper ELMD parameters were used in this analysis (the amplitude and bandwidth of the added noise and the ensemble number were subjectively chosen, as is typically done in the original ELMD method). Hence, properly determining the ELMD parameters is expected to improve the decomposition performance.
In the ELMD decomposition process, if the amplitude of the added noise is much larger than that of the original signal, it may result in redundant PFs that consume extra computational time. In contrast, if the amplitude of the added noise is too small, the noise may not affect the extrema that the LMD method relies on [35]. Thus, the effectiveness in mode mixing reduction may dramatically decrease. In addition, high-frequency noises have more impact on the changes of the extremum distribution of the original signal than low-frequency noises [34]. However, the computational complexity increases with the noise bandwidth. Finally, if the number of ensemble trials is too large, the computational cost may be prohibitively high. Conversely, a very small number may not be sufficient to cancel out the noise remaining in each PF [33]. Therefore, it is important to be able to determine optimum ELMD parameters that ensure effective mitigation of mode mixing and minimize the computational cost. It is worth noting that most of the existing studies on ELMD designed ELMD with subjectively selected parameters that are often suboptimal. Hence, there is a need for a systematic and effective approach to selecting suitable ELMD parameters.

Optimized ensemble local mean decomposition (OELMD)
Optimization of the added white noise's amplitude ($L_N$) and bandwidth ($f_b$) and the ensemble number ($N_E$) is essential to reduce mode mixing and can lead to a reduced computational cost as well. Here, OELMD is proposed to accomplish parameter optimization.

Optimization of noise amplitude
Initially, the effects of the added noise's amplitude on ELMD were investigated. For x(t) described in equation (2), figures 3-5 illustrate the decomposition results using ELMD with noise amplitudes of 0.1, 1.5, and 0.6, respectively. The number of ensemble trials was subjectively set to 100 in these simulations.
As can be seen in figures 3-5, the degree of mode mixing varies with the amplitude of the added noise. When the amplitude of the added noise is 0.1 (figure 3), the components $x_1(t)$ and $x_2(t)$ cannot be separated by the first two PFs because of mode mixing. When the amplitude is 1.5, the mode mixing effect in figure 4 is worse than that in figure 3. In comparison, when the amplitude of the added noise is 0.6, the mode mixing effect in figure 5 is less than those in figures 3 and 4. The PFs in figure 5 correspond to the original x(t) components (i.e. PF1, PF2 and PF3 correspond to $x_1(t)$, $x_2(t)$ and $x_3(t)$, respectively). These observations are consistent with [33], where the noise amplitude was suggested to be neither too small nor too large.
In order to choose a suitable noise amplitude value, inspired by [16][17][18], an error index, termed relative root-mean-square error (Relative RMSE), is introduced in this work. Relative RMSE is defined as the ratio of the root-mean-square of the decomposition error to the root-mean-square of the original signal x(t), where the decomposition error is the difference between x(t) and a specific PF, $c_{\max}(t)$, i.e. the PF component with the highest correlation with the original signal [17]. Mathematically, the Relative RMSE can be expressed as [17,18]:
$$\text{Relative RMSE} = \sqrt{\frac{\sum_{k}\left(x(k) - c_{\max}(k)\right)^{2}}{\sum_{k}\left(x(k) - \bar{x}\right)^{2}}}$$
where the sums run over the samples k of the signal, and $\bar{x}$ is the sample mean of x(t). On the one hand, if the Relative RMSE is close to zero, it indicates that the specific PF, $c_{\max}(t)$, is similar to the original signal. This suggests that $c_{\max}(t)$ contains not only the main component in the original signal but also noise and/or other weakly correlated or irrelevant signal components [12]. In other words, mode mixing occurs in this PF. On the other hand, an optimum value of noise amplitude that maximizes the Relative RMSE may exist. At this optimum value, the PF is close to an intrinsic mode/component of x(t). To this end, an optimization method was proposed to determine the appropriate noise amplitude by maximizing the Relative RMSE.
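The Relative RMSE described above can be computed directly from the signal and a list of decomposed PFs. The following is a minimal sketch using only the standard library; the function names are illustrative, and the PFs are assumed to be plain lists of the same length as the signal.

```python
import math

def _mean(v):
    return sum(v) / len(v)

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = _mean(a), _mean(b)
    num = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    den = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return num / den if den else 0.0

def relative_rmse(x, pfs):
    """Relative RMSE between x and the PF most correlated with it (c_max)."""
    c_max = max(pfs, key=lambda pf: correlation(x, pf))
    mx = _mean(x)
    err = sum((xv - cv) ** 2 for xv, cv in zip(x, c_max))   # decomposition error
    ref = sum((xv - mx) ** 2 for xv in x)                   # spread of the signal
    return math.sqrt(err / ref)
```

If one PF reproduces the whole signal (complete mode mixing), the function returns zero; if the most correlated PF captures only one component, the value grows with the energy of the remaining components.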
The relationship between the Relative RMSE and the added noise amplitude $L_N$ for x(t) is given in figure 6, where $L_N$ was initially set to a maximum value $l_{\max}$ = 2, and then decreased to a minimum value $l_{\min}$ in steps ($\Delta l$) of 0.05. As can be seen in the figure, the maximum Relative RMSE corresponds to a noise amplitude of 0.6. In figures 3-5, the decomposition performance of ELMD at the noise amplitude of 0.6 is better than at 0.1 or 1.5. Hence, the proposed optimization approach based on the maximum Relative RMSE is effective in finding a suitable noise amplitude value. Figure 6 also indicates that the maximum Relative RMSE differs for different ensemble numbers $N_E$, but the optimal noise amplitude remains the same. Therefore, the ensemble number hardly affects the determination of the noise amplitude.
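The amplitude sweep described above amounts to a one-dimensional grid search. The sketch below assumes a hypothetical callable `score` that runs ELMD at a given noise amplitude and returns the Relative RMSE; the default lower bound of 0.05 is an assumption, since only the upper bound (2) and the step (0.05) are stated in the text.

```python
def sweep_noise_amplitude(score, l_max=2.0, l_min=0.05, step=0.05):
    """Return (best_amplitude, best_score) over a descending amplitude grid.

    `score` is a hypothetical callable mapping a noise amplitude to the
    Relative RMSE of the resulting ELMD decomposition.
    """
    n_steps = int(round((l_max - l_min) / step))
    best_amp, best_val = None, float("-inf")
    for i in range(n_steps + 1):
        amp = round(l_max - i * step, 10)   # avoid floating-point drift
        val = score(amp)
        if val > best_val:                  # keep the maximum Relative RMSE
            best_amp, best_val = amp, val
    return best_amp, best_val
```

Because each evaluation of `score` requires a full ELMD run, the cost of the sweep grows linearly with the number of grid points.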

Optimization of noise bandwidth
The noise bandwidth is another factor that influences the decomposition performance of ELMD. As with the selection of noise amplitude, the noise bandwidth is optimized by maximizing the Relative RMSE [17]. The detailed procedure is as follows:
First, define a pool of candidate upper bandwidths, $f_b$, of the added noise for the signal x(t) in equation (2).
Second, obtain the interpolated signal x′(t) of x(t) by cubic spline interpolation for each candidate $f_b$. The data length of […]
Third, perform ELMD on x′(t) for each candidate $f_b$ with the optimal noise amplitude determined in section 3.1. The upper bandwidth of the added white noise is $f_b$ in the ELMD process. Then, calculate the Relative RMSE values for each candidate $f_b$.
Finally, determine the optimal $f_b$ as the candidate with the maximum Relative RMSE. Figure 7 shows the simulation results of ELMD on x(t) using different $f_b$ values. As can be seen in this figure, the maximum Relative RMSE is located at 40 kHz, which is selected as the optimal noise bandwidth. As was the case in figure 6, the location of the maximum Relative RMSE is not influenced by $N_E$.
In order to verify the optimization result in figure 7, figure 8 illustrates the decomposition performance of ELMD on x(t) with different noise bandwidths. Figure 8(a) shows that the PFs match the original x(t) components well because the optimal noise bandwidth of 40 kHz is used. In figure 8(b), significant mode mixing is observed when the noise bandwidth is 80 kHz, which coincides with a local minimum of the Relative RMSE in figure 7. In figure 8(c), when the noise bandwidth is 120 kHz, a redundant mode PF3 appears, indicating mode mixing. Similar results are observed in figure 8(d) when the upper bandwidth of the added noise is 200 kHz. Figures 8(c) and (d) indicate that the decomposition using 120 kHz provides better performance than that using 200 kHz. This is because the Relative RMSE at 120 kHz is larger than that at 200 kHz. Since a 40 kHz bandwidth produces the largest Relative RMSE in figure 7, the decomposition results in figure 8(a) are superior to those of the other three bandwidths. Consequently, figure 8 demonstrates the effectiveness of the optimization method.

Optimization of number of ensemble trials
The number of ensemble trials needed is proportional to the amount of white noise added to the original signal [15]. After determining the optimal noise amplitude and bandwidth, the next task is to choose an appropriate number of ensemble trials. If the number is too large, it may lead to a prohibitively high computational cost [33]. Conversely, a small number may not be enough to cancel out the added noise from the decomposed PFs. In this work, the signal-to-noise ratio (SNR) was adopted to determine the appropriate ensemble number.
For each ensemble number in ELMD, the signal decomposition stops once the selected PF is obtained. Here, the first PF was used to calculate the SNR value and find the appropriate ensemble number:
$$\mathrm{SNR} = 10\log_{10}(p_1/p_2)$$
Here, $p_1$ is the power of the first PF, and $p_2$ is the power of the noise, which is equal to the power of the original signal minus $p_1$. The relationship between the SNR and the ensemble number, as shown in figure 9, was obtained by analyzing x(t) using ELMD with the optimized noise amplitude of 0.6 and bandwidth of 40 kHz. As can be seen in figure 9, when the ensemble number is smaller than 100, an increase in the ensemble number leads to a significant increase in the SNR value. However, after 100 ensemble trials, the SNR value remains constant. Therefore, the optimal ensemble number for x(t) when the optimized noise amplitude of 0.6 and bandwidth of 40 kHz are used is 100.
Figure 10 compares the decomposition performances of ELMD for both 20 and 100 ensemble trials. The original x(t) components of the decomposed PFs are recovered correctly in both cases. In order to compare the results of different ensemble trials quantitatively, two evaluation indexes, namely the correlation coefficient r and the energy error E, are introduced. Coefficient r refers to the correlation between the original x(t) components and the PFs decomposed by ELMD. The energy error, E, is defined in terms of $E_s(\cdot)$, the energy of a signal, $x_i(t)$, the ith original component of x(t), and $I_i$, the corresponding PF of $x_i(t)$. In this simulation, i = {1, 2, 3}. A smaller E indicates a better decomposition performance of ELMD. Table 1 lists r and E for different numbers of ensemble trials with the optimized noise amplitude of 0.6 and bandwidth of 40 kHz. The quantitative indexes r and E are shown for $x_1(t)$ and $x_2(t)$. As the number of ensemble trials increases, r increases, indicating improved ELMD performance.
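The SNR criterion for the ensemble number can be sketched as follows. The SNR function follows the definition given in the text (power of the first PF against the signal power minus that power). The plateau rule in `pick_ensemble_number` and its tolerance `tol_db` are assumptions made for illustration; the text only states that the ensemble number is chosen where the SNR stops increasing.

```python
import math

def snr_db(x, pf1):
    """SNR of the first PF: p1 is the PF1 power, p2 is signal power minus p1."""
    p_signal = sum(v * v for v in x) / len(x)
    p1 = sum(v * v for v in pf1) / len(pf1)
    p2 = p_signal - p1
    if p2 <= 0:
        raise ValueError("PF1 power must be smaller than the signal power")
    return 10.0 * math.log10(p1 / p2)

def pick_ensemble_number(snr_by_ne, tol_db=0.1):
    """Smallest ensemble number after which the SNR changes by less than tol_db.

    `snr_by_ne` maps each tested ensemble number to its SNR in dB.
    """
    numbers = sorted(snr_by_ne)
    for i, ne in enumerate(numbers[:-1]):
        later = [snr_by_ne[m] for m in numbers[i + 1:]]
        if all(abs(s - snr_by_ne[ne]) < tol_db for s in later):
            return ne
    return numbers[-1]
```

A tighter tolerance pushes the selection toward larger ensemble numbers and therefore a higher computational cost.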
The correlation coefficient of ELMD for 100 ensemble trials is larger than for 20 trials, suggesting better decomposition performance in that case. However, as the number of ensemble trials increases over 100, the gain in ELMD performance is very limited (i.e. less than 0.0012). Similar results are also observable for E. Because the computation effort needed for 200 ensemble trials is twice that needed for 100 trials, it is reasonable to choose 100 as the optimal number of trials for this specific case. Therefore, the optimized ensemble number in figure 9 is verified, and SNR is shown to be an effective criterion for determining the ensemble number.

Procedure of OELMD
Figure 11 shows a flowchart of the proposed OELMD method. For a given signal x(t), the decomposition process for the proposed OELMD method can be summarized as follows:
Step 1-Determine the optimal noise amplitude and bandwidth using maximum Relative-RMSE based optimization (see sections 3.1 and 3.2).
Step 2-Calculate the appropriate number of ensemble trials with the SNR-optimization approach using the optimal noise amplitude and bandwidth (see section 3.3).
Step 3-Perform ELMD with the optimized parameters (i.e. optimal noise amplitude, bandwidth, and ensemble number).
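Steps 1 to 3 can be tied together in a short driver routine. This is a sketch under stated assumptions rather than the authors' code: `elmd`, `score`, and `snr` are hypothetical callables (an ELMD routine returning a list of PFs, a Relative RMSE function, and an SNR function), `bw_init` and `ne_init` are assumed starting values used while the noise amplitude is being swept, and the plateau tolerance `tol_db` is also an assumption.

```python
def oelmd(x, elmd, score, snr, amplitudes, bandwidths, ensemble_numbers,
          bw_init, ne_init=100, tol_db=0.1):
    """Sequential parameter selection followed by a final ELMD run.

    elmd(x, amp, bw, n_e) -> list of PFs   (hypothetical ELMD routine)
    score(x, pfs)         -> Relative RMSE (hypothetical)
    snr(x, pf1)           -> SNR in dB     (hypothetical)
    """
    # Step 1a: noise amplitude that maximises the Relative RMSE.
    amp = max(amplitudes, key=lambda a: score(x, elmd(x, a, bw_init, ne_init)))
    # Step 1b: noise bandwidth, with the amplitude held at its optimum.
    bw = max(bandwidths, key=lambda b: score(x, elmd(x, amp, b, ne_init)))
    # Step 2: smallest ensemble number after which the SNR of PF1 is flat.
    numbers = sorted(ensemble_numbers)
    snrs = [snr(x, elmd(x, amp, bw, n)[0]) for n in numbers]
    ne = numbers[-1]
    for i in range(len(numbers) - 1):
        if all(abs(s - snrs[i]) < tol_db for s in snrs[i + 1:]):
            ne = numbers[i]
            break
    # Step 3: final decomposition with the optimised parameters.
    return (amp, bw, ne), elmd(x, amp, bw, ne)
```

The routine calls `elmd` once per candidate value, so the total cost of the parameter search is the sum of the three candidate-list lengths plus the final run.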
It is worth noting that Step 1 determines the optimal noise amplitude and bandwidth sequentially and separately. An alternative to the sequential optimization is to jointly optimize both parameters by solving the following optimization problem:
$$(L_N^{*}, f_b^{*}) = \arg\max_{L_N,\, f_b} \text{Relative RMSE}(L_N, f_b)$$
One way to solve this problem is to use an iterative two-step optimization algorithm that iterates between the optimization of $L_N$ (keeping $f_b$ fixed) and the optimization of $f_b$ (keeping $L_N$ fixed) until a convergence criterion is met. A detailed discussion of the joint optimization is beyond the scope of this work, and will be considered in our future work.
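The iterative two-step scheme mentioned above is a form of coordinate ascent. The sketch below shows one possible version over finite candidate grids; `score(amp, bw)` is a hypothetical callable returning the Relative RMSE, and the stopping rule (no change in either parameter after a full sweep) is an assumed convergence criterion, since the paper leaves the details to future work.

```python
def alternating_maximise(score, amplitudes, bandwidths, amp0, bw0, max_iter=20):
    """Coordinate ascent over two finite candidate grids.

    `score(amp, bw)` is a hypothetical callable returning the Relative RMSE.
    Stops when one full sweep changes neither parameter, or after max_iter sweeps.
    """
    amp, bw = amp0, bw0
    for _ in range(max_iter):
        # Optimise the noise amplitude with the bandwidth fixed.
        new_amp = max(amplitudes, key=lambda a: score(a, bw))
        # Optimise the noise bandwidth with the new amplitude fixed.
        new_bw = max(bandwidths, key=lambda b: score(new_amp, b))
        if (new_amp, new_bw) == (amp, bw):
            break
        amp, bw = new_amp, new_bw
    return amp, bw
```

Coordinate ascent of this kind can stop at a point that is optimal along each axis but not jointly, so the result may depend on the starting values.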
After the three parameters of the white noise were optimized, a comparison of the ELMD ($L_N$ = 0.2, $f_b$ = 80 kHz and $N_E$ = 200) and OELMD ($L_N$ = 0.6, $f_b$ = 40 kHz and $N_E$ = 100) methods was conducted. The decomposition results of the same original signal x(t) derived by ELMD and OELMD are presented in figure 12. It can be seen that OELMD performs well in the elimination of mode mixing. For further comparison, the envelope spectra of the decomposed components are presented in figure 13. In figure 13, the envelope spectra of the first two components, obtained by ELMD and OELMD, are similar. In the OELMD results, the spectral lines of the high frequency (156 Hz) and low frequency (98 Hz) components appear distinct. However, the amplitudes of these two components (98 Hz and 156 Hz) in the ELMD results are smaller than in the OELMD results, and the frequency multiplication components in the ELMD results are less distinguishable than in the OELMD results. Overall, the spectral lines in the OELMD results are more clearly visible than in the ELMD results, where a serious frequency interference problem exists. Consequently, OELMD was superior to ELMD in terms of reducing mode mixing in this simulation.

Experimental results and discussion
To verify the effectiveness of the proposed OELMD method in mechanical fault diagnosis, experimental data collected from three different mechanical components was analyzed. The performances of OELMD and ELMD were compared in the experimental analysis.

Case study 1: bearing fault detection
The first test case was detection of an inner race fault of a rolling bearing. The bearing vibration signal was collected from a gearbox test rig (figure 14), which consisted of one planetary and one parallel gearbox, a 3 hp driving motor, and a magnetic brake. A local pitting defect was introduced into the inner race of the bearing. Figure 15 illustrates the raw vibration signal of the faulty bearing. Obvious bearing impulsive and gear modulation components were observed in the raw vibration (figure 15). The OELMD was then applied to the raw vibration signal.
The parameter optimization results are shown in figure 16. As can be seen in the figure, the optimal noise amplitude in figure 16(a) is 0.3, the optimal noise bandwidth in figure 16(b) is 12 kHz, and the optimal ensemble number in figure 16(c) is 100. Figures 17 and 18 depict the time and envelope waveforms of the PFs decomposed by OELMD. Four dominant frequency components located at the harmonics of BPI (i.e. $f_i$, $2f_i$, $4f_i$, and $5f_i$) can be observed in the first PF in figure 18. A pure dominant frequency at BPI is observed in PF3, but in PF4, the only dominant frequency is at $3f_r$. As a result, the vibration modes of the input shaft rotation and the inner race were decomposed into different PFs according to figure 18. Although the two modes ($2f_r$ and $4f_i$) mix in PF2 in figure 18, a comparison of the amplitudes of PF1 and PF2 shows that the peak values of the dominant frequency components in PF1 are much larger than those in PF2. PF1 is the most informative mode of the three. According to Randall [37], for a bearing with an inner race defect, the main vibration pattern in the envelope spectrum of the bearing vibration will be a series of peaks at BPI harmonics. Therefore, OELMD analysis correctly identified the inner race fault (figure 18).
A comparative study was performed on the OELMD and ELMD methods. The time and envelope waveforms of the PFs decomposed by ELMD are presented in figures 19 and 20.
The ELMD parameters were selected as follows: the noise amplitude was 0.1, the noise bandwidth was 20 kHz, and the ensemble number was 100. ELMD (figure 20) suffered much more severe mode mixing than OELMD (figure 18) did. For ELMD, the two modes ($2f_r$ and $4f_i$) mixed in PF1 and PF3. In addition, PF2 and PF4 were useless modes since they did not provide any useful information on the bearing vibration. Moreover, the largest peak value of PF1 in figure 20 was located at $2f_r$, while in figure 18, the first PF obtained by OELMD contained only BPI harmonics. Overall, this comparison suggests that (a) because of unsuitable parameters, the ELMD method is subject to severe frequency interference, and (b) the fault detection performance of OELMD is better than that of ELMD.

Case study 2: gear fault detection
The second test case was detection of gear wear faults in the parallel gearbox (figure 14). Figure 21 shows the structure diagram of the gearbox. The wear fault was introduced to the driving gear Z2. The motor driving speed was 980 rpm, and the sampling frequency was 12 000 Hz. The transmission ratio of the planetary gearbox was 12.64, and the fault characteristic frequency and meshing frequency of Z2 were $f_r$ = 4.46 Hz and $f_z$ = 129.21 Hz, respectively. Figure 22 shows the time waveform and frequency spectrum of the raw gear vibration signal. In figure 22(b), frequency peaks appear at 129.67, 389.01, and 907.69 Hz (i.e. $f_z$, $3f_z$ and $7f_z$), indicating that the meshing frequency of the Z1-Z2 gear pair dominates the vibration response of the parallel gearbox. OELMD was then applied to the raw vibration signal. Figure 23 shows the parameter optimization results. As indicated in the figure,

Case study 3: valve fault detection of a diesel engine
A four-stroke, four-cylinder diesel engine (Model 4135D) was tested to evaluate the proposed OELMD method. The engine operating rotating speed was 1500 rpm, and its operating power was 50 kW. In the experiment, the inlet valve clearance of the first cylinder was set to a large value to simulate a valve fault condition. Four piezoelectric accelerometers were installed on the right-side engine bodies of the four cylinders to measure their vibration signals (figure 26). A sampling frequency of 30 kHz was used. Cylinder vibration was measured under normal and valve fault conditions.
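For orientation, the stated operating speed and sampling frequency fix the length of one engine working cycle in samples, which is useful when locating periodic impulses in the measured waveforms. The short calculation below assumes the standard four-stroke relationship of one working cycle per two crankshaft revolutions; the function name is illustrative.

```python
def samples_per_working_cycle(rpm, fs_hz, revs_per_cycle=2):
    """Samples in one working cycle (two revolutions for a four-stroke engine)."""
    rev_per_s = rpm / 60.0                 # crankshaft revolutions per second
    cycle_s = revs_per_cycle / rev_per_s   # duration of one working cycle in s
    return cycle_s * fs_hz

# 1500 rpm and 30 kHz sampling, as in this experiment:
# 25 rev/s, 0.08 s per working cycle, about 2400 samples per cycle.
n_samples = samples_per_working_cycle(1500, 30_000)
```

An impulse caused by a single valve would therefore be expected to repeat roughly every 2400 samples at this speed.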
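The measured raw vibration of the first cylinder was used for the analysis. Figure 27 shows the time waveforms and frequency spectra of the vibration under the valve fault and normal conditions. Severe vibration impulses appear in the time waveform under the valve fault condition (figure 27(a)). In the corresponding period of the normal waveform in figure 27(b), there was no such severe vibration. It is reasonable to conclude that the valve fault may have generated these vibration impulses.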
In the frequency spectra in figures 27(c) and (d), the dominant frequency peaks for both normal and valve fault conditions appear at 3545 Hz. This 3545 Hz vibration pattern is probably the first order harmonic of the diesel engine's natural vibration mode. The dominant peak is much higher under the valve fault condition than under normal conditions, which is consistent with the time waveforms in figures 27(a) and (b). The frequency spectra also demonstrate that under the valve fault condition, the diesel engine's main vibration components come from the frequency range of 3000 Hz-4000 Hz. The vibration contributions under normal conditions exist in a broader frequency range (2000 Hz-4000 Hz) than those under the valve fault condition.
OELMD was then applied to decompose the raw vibration signal into seven PFs. In the analysis, optimal values for noise amplitude (0.36), noise bandwidth (59 kHz), and ensemble number (150) were used. Figures 28 and 29 show the OELMD analysis results. For ELMD, noise amplitude of 0.2, bandwidth of 50 kHz, and ensemble number of 150 were used.
OELMD and ELMD provide almost the same decomposition result on the first PF (figure 28). In the range of 3000 Hz-4000 Hz, the frequency spectra of PF1 for both OELMD and ELMD (figures 29(a) and (b)) and the raw fault vibration signal (figure 27(c)) are similar. Since the dominant peaks of the two PF1 modes in figures 29(a) and (b) are located at 3545 Hz, according to the results of the analysis in figure 27, it is reasonable to conclude that PF1 best represents the overall tendency of the diesel engine vibration.
However, in figure 28, the second PF decomposed by OELMD differs from the one decomposed by ELMD. OELMD provides clear impulsive components in PF2, while ELMD does not show many individual impulses. The frequency spectra of these second PFs reveal that the PF2 decomposed by OELMD has a single dominant frequency at 3640 Hz (figure 28). In contrast, the ELMD-decomposed PF2 has two dominant frequency components: one at 3640 Hz and the other in a mixing mode. The third PF obtained by OELMD also contains one dominant frequency at 3640 Hz (figure 29(a)), although a disturbance frequency component is present as well. However, the dominant frequency component of PF3 obtained by ELMD is located at 1384 Hz (figure 29(b)), which does not belong to the primary vibration frequency range (3000 Hz-4000 Hz) for the diesel engine with a valve fault. As a result, the PF3 obtained by OELMD provides more useful information about the valve fault than the one obtained by ELMD decomposition. Hence, figures 28 and 29 suggest that OELMD decomposes the valve fault vibration signal more effectively than ELMD.

Discussion
The three case studies discussed in this paper demonstrate better performance of OELMD than ELMD due to the determination of suitable critical parameters. Improper ELMD parameters either increase the computation cost, decrease the fault detection performance, or both. Moreover, a suitable number of ensemble trials can be obtained with the OELMD method. The computation cost can be significantly reduced by using the optimized ensemble number while the mode decomposition effectiveness is maintained. This is because, in most cases, a suitable ensemble number is unknown and a large ensemble number is selected subjectively for ELMD analysis for the purpose of reliable mode decomposition. For example, in case study 2, the optimized ensemble number was 50. However, when the original ELMD is used to analyze such a signal, an ensemble number of 100 or more is often chosen to ensure adequate decomposition performance. Nevertheless, these results do not necessarily mean that OELMD is computationally more efficient than ELMD, because obtaining the optimized parameters also consumes some computational time, which should be considered as part of the computational time required by OELMD. The results only suggest that, after the suitable ELMD parameters are obtained, the fault detection performance can be improved.
Although an adaptive strategy was proposed to select suitable parameters for EEMD in [16], previous publications indicate better performance of ELMD than EEMD in mechanical fault diagnosis [33][34][35]. Consequently, in this work, the fault detection performance of OELMD is only compared against ELMD. However, comparisons between OELMD, ELMD, and EEMD have not been conducted on real-world vibration signals. This is an area with practical applications that is worth investigating in future research. In addition, OELMD and ELMD utilize noise to improve the decomposition performance and obtain more distinct fault vibration modes. Another signal processing technique, namely stochastic resonance (SR), also adopts the noise information to enhance fault characteristics [38], and demonstrates promising performance in fault detection of gearboxes [39]. Hence, our future work will also consider investigating SR-based fault diagnosis on real-world vibration signals and comparing its performance with that of OELMD.

Conclusion
In this paper, an ELMD method improved by optimizing three critical parameters, namely OELMD, was proposed. A new procedure based on Relative RMSE was introduced to optimize noise amplitude and bandwidth, and an SNR-optimization approach was developed to obtain a suitable ensemble number. When these optimized parameters are applied to the intrinsic mode decomposition process, OELMD can effectively reduce mode mixing and enhance the fault detection performance. Three case studies were conducted to verify the effectiveness of the proposed method. The results demonstrate that (a) suitable ELMD parameters can be obtained by OELMD; (b) OELMD represents an improvement over ELMD in mode mixing reduction, and is therefore expected to increase the fault detection rate; and (c) improvement in computational efficiency is achieved by minimizing the ensemble number. The proposed method has significant practical applications in mechanical fault diagnosis. Our future work will investigate the effectiveness of OELMD with real-world engineering vibration data measured from gearboxes of wind turbines and mining machines, and compare the fault detection performance of OELMD with those of both ELMD and EEMD. SR-based fault diagnosis with the real-world data will also be included in our future research.