Rolling bearing fault diagnosis based on improved VMD-adaptive wavelet threshold joint noise reduction

Because the fault vibration signal of a rolling bearing is heavily contaminated by background noise, fault features are easily submerged, which lowers the fault diagnosis accuracy. This paper proposes a rolling bearing fault diagnosis method based on joint noise reduction with improved VMD and an adaptive wavelet threshold. First, the modal components are obtained by VMD decomposition. Second, a dual determination criterion combining sample entropy and the correlation coefficient is constructed to screen the components. Third, an adaptive wavelet thresholding function is proposed and applied as secondary noise reduction to the mixed IMFs, after which the components are reconstructed to achieve joint noise reduction. Finally, features of the denoised signals are extracted with traditional machine learning and deep learning diagnosis methods to realize fault diagnosis. Verification on simulated and measured signals shows that the method eliminates noise components, enhances the expression of fault characteristics, and improves the accuracy of fault diagnosis.


Introduction
Rolling bearings are key components supporting the operation of rotating machinery, and relevant data show that about 30% of rotating machinery failures are caused by rolling bearings; 1 their operating condition directly affects the performance and safety of the equipment. Effective diagnosis of rolling bearing failure is therefore of great significance for improving the service life and working performance of mechanical equipment. 2 In the early stage of a rolling bearing failure, the vibration signal contains a large amount of noise because of the sensor layout and the surrounding environment, which makes the fault characteristic signal faint and difficult to detect. 3 An effective noise reduction method for early fault signals is therefore essential for bearing fault diagnosis.
In view of the non-stationary, nonlinear, and coupled modulation characteristics of rolling bearing fault vibration signals, 4 scholars have applied signal decomposition methods to noise reduction research, such as empirical mode decomposition (EMD), 5 ensemble empirical mode decomposition (EEMD), 6 complete ensemble empirical mode decomposition (CEEMD), 7 and so on. Compared with these methods, variational mode decomposition (VMD) 8 effectively avoids mode mixing and endpoint effects while also suppressing noise interference. Wavelet threshold noise reduction 9 is a time-scale analysis method with the advantages of multi-resolution analysis and a simple noise reduction principle. It is usually combined with the VMD algorithm to perform secondary noise reduction on the intrinsic mode functions (IMFs), after which the components are reconstructed to achieve joint noise reduction. Liu et al. 10 combined VMD decomposition and improved wavelet thresholding for noise reduction of vibration information of planetary wheel wear faults and crack faults. Chen et al. 11 proposed a rolling bearing fault feature extraction method with optimized VMD and improved threshold noise reduction, and verified that the fault features were more obvious after noise reduction by this method. Chen et al. 12 proposed a method based on VMD and alignment entropy combined with wavelet noise reduction, which was applied to the noise reduction of wind turbine vibration signals in a strong noise background. Liu et al. 13 proposed a noise reduction method combining VMD with the soft threshold wavelet for rolling bearing fault vibration information.
Author affiliations: 1 The College of Mechanical Engineering, Chongqing University of Technology, Chongqing, China; 2 State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing, China.
These results show that VMD combined with wavelet threshold noise reduction can effectively remove noise components from the signal and extract the weak features of the rolling bearing vibration signal at the early stage of degradation in a strong noise background. However, the noise is distributed among the components after VMD decomposition, so the IMF components must be screened and each type must then receive the most suitable subsequent noise reduction treatment. Therefore, the selection of IMF components and the optimization of the wavelet threshold function are worthy of further study.
Regarding the selection of IMF components, Cui et al. 14 used the kurtosis criterion to identify IMFs with prominent fault information to reconstruct the signal. Li et al. 15 selected IMFs with rich fault information based on frequency band entropy. Jin et al. 16 selected the best IMF component based on the Pearson correlation. Yan et al. 17 used the fault feature ratio to select IMFs rich in fault information as the main component for the next step of spectrum analysis. Cao et al. 18 measured the noise content of each IMF component based on permutation entropy and classified each IMF component according to the noise content. Akhenia et al. 19 selected the IMF components with maximum energy and minimum Shannon entropy as valid components to generate spectrograms, based on the ratio of maximum energy to Shannon entropy criterion. Kumar et al. 20 measured the similarity between each IMF and the original signal based on the dynamic time warping criterion to select the valid components. Liang et al. 21 screened the two IMF components with maximum fault information for reconstruction based on the improved kurtosis criterion combined with the Holder coefficient criterion. Chen et al. 22 used sample entropy to classify the components into noise IMF components, mixed IMF components, and useful IMF components. In these studies, the selection of IMFs is often based on a single indicator, or only distinguishes between valid and invalid components (a two-class split), which can make the discrimination of component types less accurate. As a result, valid information may be eliminated while irrelevant information is retained.
Regarding the optimization of the wavelet thresholding function, Xie et al. 23 considered the effect of the number of wavelet decomposition layers and improved the thresholding function to reduce the bias due to inaccurate thresholding. Chen and Zhang 24 improved the wavelet thresholding function and combined it with median filtering to set different thresholds for each level of wavelet details. Li et al. 25 proposed a threshold function that is continuously differentiable at the threshold, where the coefficients below the threshold are adjusted by a power function instead of being set to zero, to prevent the loss of useful information. Chegini et al. 26 proposed an improved threshold function that moves between the soft and hard functions according to manually set parameter values. Li et al. 27 used the attenuation property of the exponential function to retain part of the wavelet coefficients close to the wavelet threshold, to prevent excessive noise reduction. Dalei et al. 28 improved the wavelet semi-soft threshold function and determined the value of its adjustment parameter $\alpha$. In these studies, the adjustment parameters of the optimized wavelet threshold function cannot be set adaptively according to the noise content of the signal, so robustness is poor.
To address the noise reduction problem of rolling bearing vibration signals, this paper studies IMF component selection and wavelet threshold function optimization, and proposes a joint noise reduction method based on improved VMD and adaptive wavelet thresholding. First, the IMF components after VMD decomposition are selected based on the dual determination criterion of sample entropy and correlation coefficient. Second, an adaptive wavelet threshold function is proposed that adjusts its noise reduction form according to the noise content of each noise-containing component; it is used to perform secondary noise reduction on the mixed IMFs, and the components are then reconstructed to realize joint noise reduction.
Finally, the traditional machine learning and deep learning diagnosis methods are adopted respectively to realize the bearing fault diagnosis. At the same time, the improvement of the diagnosis accuracy by the noise reduction method in this paper is verified. The analysis results of simulated and measured signals show that the joint noise reduction method proposed in this paper has better noise reduction effect and self-adaptability, which effectively enhances the expression ability of features and improves the accuracy of diagnosis.
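
Improved VMD-adaptive wavelet threshold joint noise reduction

VMD
VMD is a non-recursive, adaptive signal decomposition algorithm that decomposes the original signal into a sum of intrinsic mode functions (IMFs), each with finite bandwidth and its own center frequency. Each IMF is defined as an FM-AM signal. Following the standard VMD formulation, 8 the formula is:

$$u_k(t) = A_k(t)\cos\big(\varphi_k(t)\big)$$

where $A_k(t)$ is the instantaneous amplitude and $\varphi_k(t)$ is the instantaneous phase. A constrained variational model is constructed by estimating the bandwidth of each IMF using the demodulation method:

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)$$

where $\{u_k\} = \{u_1, \ldots, u_K\}$ are the $K$ decomposed IMFs, $\{\omega_k\} = \{\omega_1, \ldots, \omega_K\}$ are the corresponding center frequencies, and $f(t)$ denotes the original signal.
To solve the above model, the Lagrangian multiplier $\lambda$ and the quadratic penalty factor $\alpha$ are introduced, and the constrained variational problem is transformed into an unconstrained one with the augmented Lagrangian:

$$L(\{u_k\},\{\omega_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle$$

The optimal solution of the constrained variational model is obtained by alternately updating $u_k^{n+1}$, $\omega_k^{n+1}$, and $\lambda^{n+1}$ using the alternating direction method of multipliers (ADMM). In the standard formulation, the update equations are:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_k)^2}$$

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega\, |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}{\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2\, d\omega}$$

$$\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega) + \tau\left(\hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega)\right)$$

where the hat denotes the Fourier transform and $\tau$ is the update step of the multiplier. The termination condition is:

$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon$$

where $\varepsilon$ denotes the convergence accuracy.
The noise content generally differs from one decomposed IMF to another, so sample entropy and the correlation coefficient are introduced to quantify the noise content of each IMF and to give each one the most suitable treatment.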
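The role of the penalty factor in VMD can be illustrated with a small sketch. It assumes the standard VMD formulation, in which each mode update acts as a Wiener-type filter with gain 1 / (1 + 2α(ω - ω_k)²) around the current center frequency, and the center frequency is the power-weighted mean frequency of the mode. The function names `mode_filter_gain` and `center_frequency` are hypothetical and chosen only for illustration.

```python
def mode_filter_gain(omega, omega_k, alpha):
    """Gain of the standard VMD mode update at frequency omega.

    The gain is 1 at the center frequency omega_k and falls off faster
    for a larger penalty factor alpha (narrower mode bandwidth).
    """
    return 1.0 / (1.0 + 2.0 * alpha * (omega - omega_k) ** 2)


def center_frequency(omegas, mode_spectrum):
    """Power-weighted mean frequency of one mode.

    omegas: non-negative frequency grid; mode_spectrum: spectrum values
    of the mode on that grid (real or complex).
    """
    power = [abs(u) ** 2 for u in mode_spectrum]
    return sum(w * p for w, p in zip(omegas, power)) / sum(power)
```

A larger penalty factor therefore yields narrower-band modes, which is one reason the value of this parameter matters when VMD is used as a denoising front end.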

Improved VMD based on the dual determination criterion of sample entropy-correlation coefficient
The calculation of sample entropy (SE) does not depend on the length of the data and is not affected by the intrinsic characteristics of the signal, so it has high accuracy. When a signal is disturbed by noise, its complexity and disorder increase and its sample entropy becomes larger. 29 Let a one-dimensional time series of length $N$ be $X = \{x_1, x_2, x_3, \ldots, x_N\}$. Its sample entropy is defined as:

$$\mathrm{SampEn}(m, r, N) = -\ln\frac{A^m(r)}{B^m(r)}$$

where $m$ is the embedding dimension; $r$ is the similarity tolerance; $A^m(r)$ is the number of pairs of $(m+1)$-dimensional template sequences whose distance is less than $r$; and $B^m(r)$ is the number of pairs of $m$-dimensional template sequences whose distance is less than $r$. Here the embedding dimension $m$ is 2, and the similarity tolerance $r$ is one-fifth of the standard deviation of the time series $X$.
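The sample entropy definition can be sketched in plain Python as follows. This is a minimal, unoptimized illustration that assumes the Chebyshev (maximum) distance between template sequences and the population standard deviation for the tolerance; neither choice is stated in the text, and the function name `sample_entropy` is hypothetical.

```python
import math
import statistics


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy with embedding dimension m and tolerance r_factor * std(x)."""
    n = len(x)
    r = r_factor * statistics.pstdev(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths
        # so that the two counts are directly comparable.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates (assumption).
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist < r:
                    count += 1
        return count

    b = count_matches(m)        # matches in m dimensions
    a = count_matches(m + 1)    # matches in m + 1 dimensions
    if a == 0 or b == 0:
        return float("inf")     # undefined when no matches are found
    return -math.log(a / b)
```

On a regular signal such as a pure sinusoid the value is small, while on white noise it is markedly larger, which is the property the screening criterion relies on.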
Calculate the sample entropy $S_X$ of the original signal and the sample entropy $S_i$ of each IMF. The screening criteria are as follows: components with $S_i > 1.1 S_X$ are noisy IMFs, which have a very large noise content and almost no useful content; components with $S_i \le 0.3 S_X$ are useful IMFs, which are composed almost entirely of useful signal; components with $0.3 S_X < S_i \le 1.1 S_X$ are residual IMFs, which contain both noise and useful content. 22 Because noise, and the noise components caused by endpoint oscillations, correlate poorly with the original signal, noisy IMFs can be identified by the correlation coefficient (Corr), which is expressed as:

$$\mathrm{Corr}(X, Y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}$$

where $X$ and $Y$ are one-dimensional time series of length $N$, and $\bar{x}$ and $\bar{y}$ are their mean values.
The correlation coefficient of each residual component with the original signal is then calculated, defining: components with $P_i \le 1$ are noisy IMFs; components with $P_i > 1$ are mixed IMFs, 30 which are subject to further noise reduction.
Because rolling bearing vibration signals are rich in content, screening IMFs with a single indicator may reject valid components and retain irrelevant ones. 31 Therefore, this paper constructs a dual determination criterion of sample entropy and correlation coefficient: sample entropy is first used to analyze the noise content of each component and screen it, and the correlation coefficient is then used to measure the degree of correlation between the residual IMFs and the original signal, to ensure the accurate classification of mixed IMFs and noisy IMFs. In the measured signal, this stably selects the components in which the periodic fault pulses correlate most strongly with the original signal, effectively reducing noise interference.
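The two-stage decision can be written as a short sketch. It uses the sample entropy thresholds 1.1 and 0.3 given in the text and takes the correlation-based index $P_i$ as an input, because its defining expression is not available here; the function name `classify_imf` is hypothetical.

```python
def classify_imf(s_i, s_x, p_i=None):
    """Label one IMF as 'noisy', 'useful', or 'mixed' with the dual criterion.

    s_i: sample entropy of the IMF
    s_x: sample entropy of the original signal
    p_i: correlation-based index of the IMF (only needed for residual IMFs)
    """
    # Stage 1: sample entropy screening.
    if s_i > 1.1 * s_x:
        return "noisy"
    if s_i <= 0.3 * s_x:
        return "useful"
    # Stage 2: residual IMFs are resolved with the correlation index.
    if p_i is None:
        raise ValueError("a residual IMF needs the correlation index p_i")
    return "mixed" if p_i > 1 else "noisy"
```

Only IMFs that fall in the residual band reach the second stage, so the correlation test refines the sample entropy decision rather than replacing it.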

Adaptive wavelet threshold function noise reduction
A wavelet thresholding function is applied to the mixed IMFs for secondary noise reduction. However, the conventional hard threshold function is discontinuous, with jump points in the wavelet domain that make the reconstructed signal oscillate. The soft threshold function is continuous, but it leaves a constant error between the reconstructed signal and the real signal. Moreover, both threshold functions have a single, fixed noise reduction form and lack robustness. Therefore, building on the traditional wavelet threshold, this paper proposes an adaptive continuous wavelet threshold function based on sample entropy. The expression is as follows: In the formula, $d$ is the wavelet coefficient; $d^*$ is the wavelet coefficient after noise reduction; $T$ is the threshold; $\alpha$ is an adjustable parameter ($0 < \alpha < 1$) that lets the wavelet coefficients between the two thresholds $\alpha T$ and $T$ be denoised with an exponential decay characteristic, avoiding the excessive noise reduction caused by setting them directly to zero, and takes the value 1/1.16 28 ; $\beta$ is an adaptive parameter ($0 \le \beta \le 1$) that measures the noise level of the mixed IMFs, so that the threshold function can automatically adjust its noise reduction form according to the noise condition. The expression is as follows: In the formula, $S_X$ is the sample entropy of the original signal; $0.3 S_X$ is the threshold value that distinguishes useful IMFs from residual IMFs 22 ; $S_i^*$ is the sample entropy of the mixed IMFs. When $\beta$ tends to 0, a harder threshold function is used for noise reduction; when $\beta$ tends to 1, a softer threshold function is used.
When $\alpha$ is fixed at 0.8 and $\beta$ is 0.8, 0.5, and 0.2, the function curves are as shown in Figure 1.
First, the bias of the adaptive wavelet threshold function is tested.
Second, the continuity of the adaptive wavelet threshold function is tested as $d \to T^+$ and $d \to T^-$, and as $d \to \alpha T^+$ and $d \to \alpha T^-$.
In summary, the function is continuous at $d = T$ and $d = \alpha T$; at the same time, because the parameter $\beta$ is adjusted adaptively, the deviation is adjusted accordingly. Therefore, the adaptive wavelet threshold function in this paper not only avoids the discontinuity of the hard threshold function, but also reduces the constant error of the soft threshold function. It adapts to the noise level of the signal, which makes its application more flexible.
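For reference, the two conventional threshold functions that motivate the adaptive design can be sketched as below. This shows only the standard hard and soft rules, not the adaptive function proposed in this paper, whose expression is not reproduced in this sketch; the function names are hypothetical.

```python
def hard_threshold(d, t):
    """Keep coefficients whose magnitude reaches t; set the rest to zero.

    The output jumps from 0 to t at |d| = t, which is the discontinuity
    that can make the reconstructed signal oscillate.
    """
    return d if abs(d) >= t else 0.0


def soft_threshold(d, t):
    """Zero small coefficients and shrink the others toward zero by t.

    The output is continuous, but every surviving coefficient carries a
    constant offset of t relative to its original value.
    """
    if abs(d) < t:
        return 0.0
    sign = 1.0 if d > 0 else -1.0
    return sign * (abs(d) - t)
```

With a threshold of 1, a coefficient of 5 is kept at 5 by the hard rule but reduced to 4 by the soft rule, which is the constant error referred to in the text.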

Improved VMD-adaptive wavelet threshold joint noise reduction
Based on the above algorithms and theories, this paper proposes an improved VMD-adaptive wavelet threshold function joint denoising method. The implementation steps are shown in Figure 2.
(1) Perform VMD decomposition on the signal, and screen the IMFs based on the dual determination criterion of sample entropy-correlation coefficient.
(2) Discard noisy IMFs, retain useful IMFs, and perform secondary noise reduction on mixed IMFs using the adaptive wavelet thresholding function.
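The handling of the three IMF types can be sketched as follows. The sketch assumes that the IMFs have already been decomposed and labeled, and that the denoised signal is rebuilt by summing the retained components; `denoise_mixed` is a hypothetical placeholder for the adaptive wavelet threshold step, and `joint_denoise` is an illustrative name rather than the authors' implementation.

```python
def joint_denoise(imfs, labels, denoise_mixed):
    """Rebuild a signal from labeled IMFs.

    imfs: list of equal-length sequences
    labels: one of 'noisy', 'useful', 'mixed' for each IMF
    denoise_mixed: callable applied to every mixed IMF (placeholder for
        the secondary wavelet threshold noise reduction)
    """
    if not imfs:
        return []
    out = [0.0] * len(imfs[0])
    for imf, label in zip(imfs, labels):
        if label == "noisy":
            continue  # noisy IMFs are discarded
        kept = denoise_mixed(imf) if label == "mixed" else imf
        out = [o + v for o, v in zip(out, kept)]
    return out
```

Passing the denoiser as a callable keeps the screening logic independent of the particular threshold function, so soft, hard, or adaptive thresholding can be compared on the same labeled components.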

Simulation signal test and analysis
During vibration signal acquisition, the real signal is disturbed by a large amount of high-frequency noise from the mechanical equipment and the surrounding environment, while weak fault features are usually distributed in the low frequency band below 1 kHz. 3 To verify the effectiveness of the noise reduction pre-processing algorithm proposed in this paper, a set of non-stationary, nonlinear, periodic amplitude-modulated and frequency-modulated signals is constructed to simulate the periodic fault signal, and Gaussian white noise and colored noise are added to simulate the interference noise in the high frequency band.
Here, $t \in [0, 1]$, the sampling frequency is 1000 Hz, and $m(t)$ is the noise signal. The time and frequency domain waveforms before and after the addition of Gaussian white noise with SNR = 1 and colored noise with SNR = -2.0 are shown in Figure 3. The useful frequency components of the signal are 5, 40, and 100 Hz. The Gaussian white noise is uniformly distributed over the whole frequency domain, whereas the colored noise is mainly distributed in the high frequency band and has a higher amplitude than the white noise, so the two noise signals show different characteristics in the frequency domain.
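Adding white noise at a prescribed signal-to-noise ratio can be sketched as below. Two assumptions are made and should be read as such: the SNR is taken to be in decibels, and a plain sum of 5, 40, and 100 Hz sinusoids stands in for the amplitude- and frequency-modulated test signal, whose exact expression is not available here. The helper name `add_white_noise` is hypothetical.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=0):
    """Return signal plus Gaussian white noise scaled to the requested SNR (dB)."""
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in signal]
    p_signal = sum(s * s for s in signal) / len(signal)
    p_raw = sum(v * v for v in raw) / len(raw)
    # Target noise power follows from SNR_dB = 10 * log10(P_signal / P_noise).
    p_target = p_signal / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(p_target / p_raw)
    return [s + scale * v for s, v in zip(signal, raw)]


fs = 1000                                  # sampling frequency in Hz
t = [i / fs for i in range(fs)]            # one second of samples
# Stand-in test signal (assumption): three sinusoids at 5, 40, and 100 Hz.
clean = [
    math.sin(2 * math.pi * 5 * ti)
    + math.sin(2 * math.pi * 40 * ti)
    + math.sin(2 * math.pi * 100 * ti)
    for ti in t
]
noisy = add_white_noise(clean, snr_db=1.0)
```

Because the noise is rescaled by its measured power, the realized SNR matches the requested value for the generated sample rather than only in expectation.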

VMD and IMFs selection
To verify the screening effect of the dual determination criterion on IMF components in this paper, the components were screened by the sample entropy criterion, 22 the ratio of maximum energy to Shannon entropy criterion (Ratio criterion), 19 the dynamic time warping criterion (DTW criterion), 20 and the criterion in this paper. As an example, the parameters of VMD were determined by genetic algorithm for a signal containing white noise with SNR = 15, 13 and the calculation results and screening results based on each criterion are shown in Table 1.
The ratio values and DTW values are normalized in order to quantify the importance of the IMFs when selecting components with the ratio criterion and the DTW criterion. The rule of the ratio criterion is as follows: components whose normalized ratio value is greater than 0.9 and less than or equal to 1 are regarded as useful IMFs; otherwise, they are regarded as mixed IMFs and require secondary noise reduction. The rule of the DTW criterion is as follows: components whose normalized DTW value is greater than or equal to 0 and less than 0.1 are regarded as useful IMFs; otherwise, they are regarded as mixed IMFs.
From Table 1, IMF2 and IMF3 were determined as useful IMFs by the above four criteria; IMF4 was determined as mixed IMFs by the DTW criterion and useful IMFs by the other three criteria. IMF1 was determined as noisy IMFs by the criteria of this paper and mixed IMFs by the other three criteria.
To check the merits of the screening results of each criterion, the spectral analysis of each component is shown in Figure 4. IMF4, IMF3, and IMF2 contain the 5, 40, and 100 Hz frequency components, respectively, and their amplitudes at high frequencies are 0, so they are useful IMFs; IMF1 contains no useful frequency components, so it is a noisy IMF. These results are consistent with the screening results of the criterion in this paper.
To compare the screening effect of each criterion, after screening with each criterion a hard threshold function is used for secondary noise reduction and reconstruction. The signal-to-noise ratio and correlation coefficient are used as the evaluation indexes, and the noise reduction effect is shown in Table 2. After screening by the dual determination criterion and secondary noise reduction, the signal-to-noise ratio was 19.44 and the correlation coefficient was 0.994, the best of the four criteria.
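The two evaluation indexes can be computed as in the sketch below. It assumes the usual definitions: the output SNR in decibels is the ratio of clean-signal energy to residual-error energy, and the correlation coefficient is the Pearson coefficient between the clean and denoised signals. The function names are hypothetical.

```python
import math


def snr_db(clean, estimate):
    """Output SNR in dB: clean-signal energy over residual-error energy."""
    signal_energy = sum(c * c for c in clean)
    error_energy = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_energy / error_energy)


def pearson_corr(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

A higher SNR means less residual error energy, and a correlation coefficient closer to 1 means the denoised waveform follows the clean one more closely; the two are reported together because a method can raise one without the other.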
In summary, the ratio criterion and the DTW criterion, both based on a single indicator, only distinguish between useful IMFs and mixed IMFs, which leads to the exclusion of effective components and the retention of irrelevant components, thus affecting the effect of secondary noise reduction. The sample entropy criterion distinguishes between useful IMFs, mixed IMFs, and noisy IMFs, but based on a single indicator, it is easy to discriminate some component types inaccurately. The criterion in this paper combines the dual screening of sample entropy and correlation coefficient, which quantifies the noise content of each IMF and measures the correlation between each IMF and the original signal, ensuring that the three component types are accurately classified and have better screening effects.

Joint noise reduction
To verify the robustness and noise reduction effect of the adaptive wavelet thresholding function in this paper, and building on the dual determination criterion, VMD + soft thresholding (VMD + soft), VMD + hard thresholding (VMD + hard), VMD + semi-soft thresholding (VMD + semi-soft), and the method in this paper are used to denoise the signal containing white noise with SNR = 1 and the signal containing colored noise with SNR = -2.0.
Considering the large number of IMFs, the process of determining the optimal wavelet decomposition parameters for each component individually is tedious and there is no quantitative indicator to measure the advantages and disadvantages of each parameter. Therefore, the best parameters are determined by wavelet decomposition of the original signal as the optimal parameters for wavelet decomposition of each IMF component.
Taking the simulated signal containing white noise with SNR = 1 as an example, the rules for selecting the wavelet decomposition parameters are as follows. The wavelet basis and the threshold rule are each determined by exhaustive search.
① Number of decomposition layers j: after wavelet decomposition of the original signal, the ratio of the wavelet entropy of the second layer to the entropy of the original signal is 0.052, which is close to 5%, 32 so the optimal number of decomposition layers is j = 2.
② Determination of the wavelet basis: the number of decomposition layers j is taken as 2, and the threshold rule is tentatively set to the sqtwolog threshold. The wavelet bases of the dbN, coifN, and symN series match mechanical fault vibration signals well and have advantages such as highlighting fault characteristics. 13 Wavelet bases from these series are selected in turn, the original signal is denoised with the soft threshold, and the optimal wavelet basis is selected according to the signal-to-noise ratio of the denoised signal; the results are shown in Table 3. Of the three series, the coifN series gives a good noise reduction effect, and the wavelet basis coif5, which has the highest evaluation index, is selected.
③ Determination of the threshold rule: take the number of decomposition layers j as 2 and the wavelet basis as coif5, and denoise the original signal with the sqtwolog, heursure, minimaxi, and rigrsure thresholds in turn; the optimal threshold rule is again selected according to the signal-to-noise ratio, and the results are shown in Table 4. The noise reduction effect of the threshold rules is approximately the same. Because the minimaxi and rigrsure thresholds set only part of the coefficients below the threshold to zero, they avoid the excessive noise reduction caused by setting all of them to zero, 33 and the minimaxi threshold is finally chosen.
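For orientation, two of the named threshold rules have simple closed forms, sketched below as they are commonly defined (for example, in MATLAB's `thselect`): sqtwolog is the universal threshold sqrt(2 ln N), and minimaxi is 0.3936 + 0.1829 log2(N) for N > 32 and 0 otherwise. Both are expressed in units of the noise standard deviation, which is commonly estimated from the detail coefficients as median(|d|) / 0.6745. The Python function names are hypothetical, and the data-dependent heursure and rigrsure rules are not covered.

```python
import math
import statistics


def noise_sigma(detail_coeffs):
    """Robust noise estimate: median absolute detail coefficient / 0.6745."""
    return statistics.median([abs(d) for d in detail_coeffs]) / 0.6745


def sqtwolog(n):
    """Universal threshold for n coefficients, in units of the noise sigma."""
    return math.sqrt(2.0 * math.log(n))


def minimaxi(n):
    """Minimax threshold for n coefficients, in units of the noise sigma."""
    if n > 32:
        return 0.3936 + 0.1829 * math.log2(n)
    return 0.0
```

For 1000 coefficients these give roughly 3.72 and 2.22, so the minimax rule is the more conservative of the two and removes fewer coefficients.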
After the wavelet decomposition parameters are determined, the above joint methods are used to denoise the signal containing white noise with SNR = 1. The time domain and frequency domain waveforms after noise reduction are shown in Figure 5. Noise frequency components remain around 200 Hz for VMD + soft, VMD + hard, and VMD + semi-soft, and noise components with high amplitude remain at 150 Hz after noise reduction by VMD + hard and VMD + semi-soft. After noise reduction by the method in this paper, the frequency components at 5, 40, and 100 Hz are prominent and the amplitude above 200 Hz is zero, so the ability of the signal to express features is enhanced.
The signal-to-noise ratio and correlation coefficient of the denoised signal are used as the evaluation indexes of the noise reduction effect, and the results are shown in Table 5. For the white noise signal with SNR = 1, the signal-to-noise ratio is 6.88 and the correlation coefficient is 0.907 after noise reduction by the method in this paper, both higher than with the other methods. For the colored noise signal with SNR = -2.0, the signal-to-noise ratio is 13.02 and the correlation coefficient is 0.975 after noise reduction by this method, again both higher than with the other methods. In summary, the method shows better robustness and self-adaptability across different SNRs and different noise types.

Rolling bearing fault diagnosis test
Fault diagnosis distinguishes faults based on the differences between their characteristics. In theory, if noise reduction enhances the ability of the signal to express features, the performance of fault diagnosis improves. 34 To verify whether the improved VMD-adaptive wavelet threshold joint noise reduction method proposed in this paper can improve diagnosis accuracy, rolling bearing fault diagnosis tests were carried out with two approaches: the traditional machine learning approach, which combines signal analysis with classifier identification, and the deep learning approach, which extracts features adaptively. The fault diagnosis process is shown in Figure 6.
The data for this test were obtained from the bearing failure test bench at the University of Paderborn, Germany. 35 The vibration signal of an FAG6203 rolling bearing was taken, collected at a speed of 1500 rpm, a torque of 0.7 N·m, a load of 1000 N, and a sampling frequency of 64 kHz. Outer ring and inner ring failures were simulated by EDM and electric engraving machine processing; together with the normal state, this gives three states in total. Combined with the structural parameters, the theoretical characteristic frequencies are calculated as follows: outer ring fault characteristic frequency $f_o = 76.4$ Hz and rotation frequency $f_r = 25$ Hz.
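The two characteristic frequencies can be checked with the standard outer-race formula $f_o = \frac{n}{2} f_r \left(1 - \frac{d}{D}\cos\theta\right)$. The bearing geometry used below (8 rolling elements, 6.75 mm element diameter, 28.55 mm pitch diameter, zero contact angle) is an assumption based on values commonly quoted for the 6203 bearing of the Paderborn dataset; it is not given in the text. The function names are hypothetical.

```python
import math


def rotation_frequency(rpm):
    """Shaft rotation frequency in Hz."""
    return rpm / 60.0


def outer_race_frequency(f_r, n_balls, ball_d, pitch_d, contact_deg=0.0):
    """Ball pass frequency of the outer race (standard kinematic formula)."""
    ratio = ball_d / pitch_d * math.cos(math.radians(contact_deg))
    return 0.5 * n_balls * f_r * (1.0 - ratio)


f_r = rotation_frequency(1500)
# Geometry values are assumptions (not stated in the text).
f_o = outer_race_frequency(f_r, n_balls=8, ball_d=6.75, pitch_d=28.55)
```

Under these assumed dimensions the formula returns a value within about 0.1 Hz of the 76.4 Hz quoted above, and 1500 rpm corresponds to the 25 Hz rotation frequency.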
To verify the effectiveness of the noise reduction pre-processing method in this paper, the signal is divided into segments of 4096 data points, each segment is denoised separately, and the denoised segments are then spliced back into a whole signal. Gaussian white noise with SNR = -10 is added to the original signal. Taking a sample in the outer ring fault state as an example, the time domain waveform and local envelope spectrum before and after adding noise are shown in Figure 7. After adding noise, the amplitude of the time domain waveform increases further. Owing to factors such as parameter errors of the inner and outer rings of the bearing, there may be a small error between the theoretical and the real fault characteristic frequency, and the outer ring fault characteristic frequency can be identified from the envelope spectrum.

Bearing outer ring failure data noise reduction preprocessing
EMD + soft threshold (EMD + soft), EEMD + soft threshold (EEMD + soft), VMD + soft threshold (VMD + soft), VMD + semi-soft threshold (VMD + semi-soft), and the method in this paper are each used to denoise one segment of the signal in the outer ring fault state. The algorithm parameters are selected in the same way as in Section ''Simulation signal test and analysis'': the number of VMD decomposition layers is 4, the penalty factor is 2110, the number of wavelet decomposition layers is 8, the threshold rule is the minimaxi threshold, and the wavelet basis is sym10. Envelope spectrum analysis is performed on the denoised signal for each method, and the results are shown in Figure 8.
A comparative analysis leads to the following. After EMD + soft, there is a peak at 78.1 Hz ≈ $f_o$, but it is drowned by the noise frequency component at 46.8 Hz, and the harmonic components above 500 Hz are not obvious. After EEMD + soft, the first to fifth harmonics of the fault frequency are prominent, but, compared with the VMD + wavelet threshold methods, noise frequency components with high amplitude appear at 109 and 640 Hz. The poor noise reduction effect of these two methods is due to problems such as mode mixing in EMD and EEMD, which in turn affect the secondary noise reduction. After noise reduction by VMD + soft and VMD + semi-soft, both retain noise frequency components at 46.9 and 640 Hz compared with the method in this paper. Although VMD alleviates problems such as modal aliasing, ... Table 6. The signal-to-noise ratio of 2.37 and the correlation coefficient of 0.75 after noise reduction by this method are greater than those of the other methods, so this method has a better noise reduction effect and wider adaptability, and can effectively enhance the expression of fault characteristics.

Fault diagnosis based on traditional machine learning
In the traditional machine learning approach to fault diagnosis, features are first extracted from the vibration signal with traditional signal processing and analysis techniques, and a machine learning classifier is then designed for fault identification. The diagnostic test steps are as follows.
(1) Data set division: The vibration signals before and after noise reduction are divided into samples of 4096 data points, with 248 samples for each class of state and 744 samples for the three classes in total.
The KNN and DT models were selected as classifiers. To avoid biased diagnostic results due to improper data set splitting, the k-fold cross-validation method 37 is used, where k is taken as 5. The specific steps of training are as follows.
In step ① above, the random number seed for dividing the samples is determined as follows: an integer in the range [1, 60] is randomly selected as the seed used to divide the training set and the test set. When the test sets before and after noise reduction both achieve good diagnostic accuracy, this integer is taken as the seed value; once determined, it is not changed when the classifier model is replaced.
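A seeded fivefold split of the 744 samples can be sketched in plain Python as follows. This is a simplified illustration that shuffles the sample indices once with a fixed seed and deals them into five folds; it does not stratify by class, and it is not the authors' implementation. The function name `k_fold_indices` is hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=1):
    """Return k (train_indices, test_indices) pairs from one seeded shuffle."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # fixed seed gives a repeatable split
    folds = [indices[i::k] for i in range(k)]  # deal indices into k folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(744, k=5, seed=1)
```

Keeping the seed fixed across classifiers, as described above, means every model is trained and tested on the same five partitions, so differences in accuracy reflect the model and the preprocessing rather than the split.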
Following the scikit-learn documentation for Python, the hyperparameters of each classifier are set as shown in Table 7. In the table, k is the number of neighbors; weights is the weighting rule, set to uniform so that all neighbors are weighted equally; max_depth is the maximum depth of the tree; and max_leaf_nodes is the maximum number of leaf nodes, set to None so that the number of leaf nodes is not limited.
The results of the diagnostic tests are shown in Table 8. Before denoising, each of the two classifier models had one or two of the five cross-validation runs with a diagnostic accuracy below 90%, and the accuracy of the models was below 50% in the fourth run (the random number seed was not changed). After noise reduction by the method in this paper, the diagnostic accuracy of all five runs exceeded 90%. The average diagnostic accuracy of KNN improved from 85.4% to 97.1%, and that of DT improved from 81.3% to 99.3%. Therefore, the noise reduction preprocessing method in this paper clearly improves the diagnosis accuracy of traditional machine learning.

Fault diagnosis based on deep learning
Unlike traditional machine learning fault diagnosis, deep learning adaptively extracts potential features of the signal and establishes a nonlinear mapping between features and fault types to achieve end-to-end fault diagnosis. The diagnostic test steps are as follows.
(1) Data set division: similarly, samples of 4096 data points are used, giving 744 samples in total.
① The samples are randomly divided into a training set (632 samples) and a test set (112 samples) at a ratio of 8.5:1.5.
② The fivefold cross-validation method is again used: the training set is divided equally into five parts, and in each training session one part serves as the validation set while the other four parts serve as the training set, keeping the data distribution consistent across parts.
③ The total number of iterations per training session is set to 50, which is the termination condition of each session. After each iteration, the validation set is fed to the current model to compute its accuracy and loss, so that the iterative process can be observed. After 50 iterations, the accuracy of the model is tested on the test set. Five training sessions yield five test-set accuracies, and their mean is taken as the final diagnostic accuracy of the model.
(2) Deep learning models: the one-dimensional convolutional neural network Wdcnn 38 model and the two-dimensional convolutional network ResNet34 39 model were each used for fault diagnosis experiments. The hyperparameters of each model are shown in Table 9.
In the Wdcnn-based diagnostic test, the time series of the vibration signal is used directly as the network input, and the diagnostic accuracy results on the test set are shown in Table 10. Before noise reduction, the diagnostic accuracy of the model fluctuated around 97%, with an average of 97.7%; after noise reduction by the method in this paper, the diagnostic accuracy stabilized above 99%, and the average improved to 99.6%.
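The data set division in step (1) can be sketched as follows. This is a minimal sketch under stated assumptions: stratified splitting is used here as one way to keep the class distribution consistent, the seed 0 is arbitrary, and the three classes of 248 samples are represented only by index and label arrays.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

# 744 samples: three classes with 248 samples each (labels only).
labels = np.repeat(np.arange(3), 248)
indices = np.arange(labels.size)

# Hold out 112 samples (about 15%) as the test set; 632 remain for training.
train_idx, test_idx = train_test_split(
    indices, test_size=112, stratify=labels, random_state=0
)

# Fivefold split of the training set: one part validates, four parts train.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_sizes = [
    (len(tr), len(va)) for tr, va in skf.split(train_idx, labels[train_idx])
]

print(len(train_idx), len(test_idx))
print(fold_sizes)
```

Since 632 is not divisible by 5, the validation parts cannot be exactly equal; they differ by at most one sample.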
Taking the first cross-validation test as an example, the accuracy of the validation set before and after noise reduction is shown in Figure 9. In the initial iterations, the network converges slowly and the accuracy of the validation set is low; the network then gradually converges as the weight parameters are updated. Before noise reduction, the diagnostic accuracy of the validation set increases slowly after the 15th iteration and reaches 100% at the 30th iteration; after noise reduction by the proposed method, it increases gradually after the 10th iteration and stabilizes at 100% at the 20th iteration. Therefore, the noise reduction preprocessing method in this paper can effectively enhance the expression of fault features, reduce the time required for model training, and improve diagnostic efficiency and accuracy.
In the ResNet34-based diagnostic test, the vibration signal is first transformed into a time-frequency diagram using the Morlet wavelet basis, 40 as shown in Figure 10. Under noise interference, the signal contains many noise frequency components, and the color shading and distribution of the time-frequency diagram vary irregularly, so the diagram cannot accurately express the time-frequency information of the fault. After noise reduction by the method in this paper, the noise frequency components are removed, leaving the regularly distributed time-frequency information of the fault characteristics.
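The conversion of a one-dimensional signal into a time-frequency image can be illustrated with a small NumPy sketch of a Morlet continuous wavelet transform. It is not the implementation used in this paper: the complex (analytic) Morlet wavelet with w0 = 6, the FFT-based evaluation, the frequency grid, and the synthetic test signal are all assumptions made for illustration, and the function name `morlet_scalogram` is hypothetical.

```python
import numpy as np

def morlet_scalogram(signal, freqs, w0=6.0):
    """Return |CWT| of `signal` at the given frequencies (cycles/sample)."""
    n = len(signal)
    sig_fft = np.fft.fft(signal)
    omega = 2.0 * np.pi * np.fft.fftfreq(n)          # rad/sample
    # Scale whose wavelet centre frequency equals each requested frequency.
    scales = w0 / (2.0 * np.pi * np.asarray(freqs))
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Fourier transform of the scaled analytic Morlet wavelet.
        psi_hat = (np.pi ** -0.25 * np.sqrt(s)
                   * np.exp(-0.5 * (s * omega - w0) ** 2) * (omega > 0))
        out[i] = np.abs(np.fft.ifft(sig_fft * psi_hat))
    return out

# Synthetic 4096-point sample: one tone at f0 buried in white noise.
n, f0 = 4096, 0.05
t = np.arange(n)
rng = np.random.default_rng(0)
x = np.sin(2.0 * np.pi * f0 * t) + 0.5 * rng.normal(size=n)

freqs = np.linspace(0.01, 0.2, 39)                   # cycles/sample
tf_image = morlet_scalogram(x, freqs)                # rows: frequency, cols: time
peak_freq = freqs[np.argmax(tf_image.mean(axis=1))]
print(tf_image.shape, peak_freq)
```

The resulting two-dimensional array can be rendered as an image and resized to the input size expected by a two-dimensional network; noise shows up as irregular energy spread across the rows, while a clean periodic component concentrates in a narrow band.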
The diagnostic accuracy results on the test set are shown in Table 11. Before noise reduction, the diagnostic accuracy of the model fluctuated around 97%, with an average of 97.3%; after noise reduction by the method in this paper, the diagnostic accuracy stabilized above 99.1%, and the average improved to 99.8%.
Taking the first cross-validation test as an example, the accuracy of the validation set over the iterations before and after noise reduction is shown in Figure 11. As the number of iterations increases, the diagnostic accuracy of the validation set before noise reduction fluctuates continuously around 97%, whereas after noise reduction it is essentially stable at 100%. Therefore, the noise reduction preprocessing method in this paper can effectively improve the diagnostic accuracy and stability of the model.
In summary, the noise reduction preprocessing method in this paper can enhance the expression of fault information in the signal, thus improving the diagnostic accuracy of traditional machine learning and deep learning-based fault diagnosis methods. At the same time, it is also valuable for improving the diagnostic efficiency and stability of deep learning fault diagnosis methods.

Conclusion
To address the problem that the fault vibration signal of a rolling bearing is weak and difficult to extract under noise interference from mechanical equipment and the surrounding environment, a rolling bearing fault diagnosis method based on improved VMD-adaptive wavelet threshold joint noise reduction is proposed. The main conclusions are as follows.
(1) A dual determination criterion of sample entropy and correlation coefficient is constructed to screen the modal components of the decomposition. It effectively removes the noise components and avoids the one-sidedness of selecting IMFs with a single indicator.
(2) An adaptive wavelet threshold function is proposed. It adjusts the form of noise reduction according to the noise content of each component, which addresses the excessive or insufficient noise reduction of traditional wavelet thresholding, semi-soft thresholding, and other fixed-form noise reduction algorithms.
(3) Simulation experiments and fault diagnosis experiments show that the proposed noise reduction preprocessing method can effectively eliminate the noise components mixed in the signal and enhance the expression of fault features. It has good robustness and improves the diagnostic accuracy of both traditional machine learning and deep learning-based diagnosis methods. Therefore, the proposed noise reduction preprocessing method has potential and value for research on rolling bearing fault diagnosis.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by National Natural