Fault diagnosis method for rotating machinery based on SEDenseNet and Gramian Angular Field

▪ The GAF coding method enables the representation of low-dimensional signal features within high-dimensional nonlinear data.
▪ Incorporating the SE attention mechanism into the DenseNet model enhances feature transfer and reuse.
▪ The diagnostic accuracies achieved on the three datasets were 100%, 100%, and 99.85%, respectively.
Fault diagnosis of rotating machinery is crucial for ensuring the safe and dependable operation of intricate mechanical systems. Traditional deep learning approaches are limited in encoding long time sequences and in generalization capability. This study uses the Gramian Angular Field (GAF) and the Squeeze-and-Excitation (SE) attention mechanism to alleviate these constraints. GAF enhances feature extraction by emphasizing the angular relationships among adjacent signal points to uncover latent fault characteristics. By integrating SE with the DenseNet architecture, the network facilitates global information exchange and improves multi-scale fusion, thereby enhancing the precise identification of fault type and location within the signal. Experiments conducted on two datasets achieved accuracies of 100% and 99.85%, respectively, outperforming other methods and models and validating the effectiveness of this study.


Introduction
Rotating machinery is widely used in the industrial sector. However, these machines often fail in engineering practice, which can result in significant damage, severe injury, or even loss of life.

Gramian Angular Field
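The GAF image encoding method maintains the time dependence of the signal. The Gramian Angular Summation Field (GASF) and the Gramian Angular Difference Field (GADF) are defined as follows:
$$\mathrm{GASF} = [\cos(\phi_i + \phi_j)] = \tilde{X}^{\top}\tilde{X} - \sqrt{I - \tilde{X}^{2}}^{\top}\sqrt{I - \tilde{X}^{2}}$$
$$\mathrm{GADF} = [\sin(\phi_i - \phi_j)] = \sqrt{I - \tilde{X}^{2}}^{\top}\tilde{X} - \tilde{X}^{\top}\sqrt{I - \tilde{X}^{2}}$$
where I is the unit row vector [1, 1, ..., 1]. Because GASF and GADF compute the angle between neighboring points of the temporal signal in distinct manners, the resulting 2D images differ.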
Scale ($F_{\mathrm{scale}}$): Assign the weight vector s generated in the previous step to the feature map U to obtain the feature map $\tilde{X}$, whose size is the same as that of U. The SE attention module does not change the size of the feature map.
$$\tilde{x}_c = F_{\mathrm{scale}}(u_c, s_c) = s_c \cdot u_c \qquad (9)$$
The generated weight vector s (1×1×C) is multiplied with the feature map U (H×W×C); that is, the H×W values of each channel of U are multiplied by the weight of the corresponding channel of s, so that each channel is recalibrated.
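The Scale step of Eq. (9) can be illustrated with a minimal sketch in plain Python. The function name `se_scale` and the nested-list layout (height, then width, then channel) are assumptions made for illustration; a real model would apply the same operation to framework tensors, and the weights s would come from the preceding squeeze and excitation steps.

```python
def se_scale(feature_map, weights):
    """Channel-wise recalibration: x~[h][w][c] = s[c] * u[h][w][c].

    feature_map: nested lists of shape H x W x C (assumed layout).
    weights:     list of C per-channel weights s.
    """
    channels = len(weights)
    scaled = []
    for row in feature_map:
        new_row = []
        for pixel in row:
            if len(pixel) != channels:
                raise ValueError("channel count of U and s must match")
            # Every value of channel c is multiplied by the same weight s[c].
            new_row.append([s * u for s, u in zip(weights, pixel)])
        scaled.append(new_row)
    return scaled


# A 2 x 2 feature map with 2 channels and one weight per channel.
u = [[[1.0, 2.0], [3.0, 4.0]],
     [[5.0, 6.0], [7.0, 8.0]]]
s = [0.5, 2.0]
x_tilde = se_scale(u, s)
```

The output has the same H×W×C shape as the input, which matches the statement that the SE module does not change the size of the feature map.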

Experimental data and model
During the experiment, it was found that the diagnostic accuracy obtained with the two types of images differed little, so GASF, which requires less storage space, was selected as the input of SEDenseNet.

B Case 2: Wind Turbine Drivetrain Simulator Dataset
The experimental data for Case 2 include gearbox health states such as gear tooth crack (GTC), as shown in Table 3.

C Experimental model
Cross-entropy is used to calculate the model loss. The initial learning rate is set to 0.001, and a fixed-step decay method is used to update the learning rate, which is reduced by a factor of 0.7 after every five iterations.
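The fixed-step schedule can be sketched as a small function. This assumes that "reduced by a factor of 0.7" means the learning rate is multiplied by 0.7 at each step, and that "iterations" are counted from zero; the function name `step_lr` is hypothetical.

```python
def step_lr(iteration, initial_lr=0.001, step_size=5, gamma=0.7):
    """Learning rate after `iteration` completed iterations.

    Assumption: the rate is multiplied by `gamma` once every
    `step_size` iterations, starting from `initial_lr`.
    """
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    return initial_lr * gamma ** (iteration // step_size)


# Learning rate at the start of each block of five iterations.
schedule = [step_lr(i) for i in range(0, 20, 5)]
```

In PyTorch, a schedule of this shape is commonly obtained with `torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.7)`; the paper does not state which scheduler was actually used.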

Experimental results and comparative analysis
This section further validates the rotating machinery fault diagnosis method proposed in this paper by diagnosing the fault classes in the two datasets and comparing the results with those of other network models and signal preprocessing methods.

A Fault diagnosis in Case 1
As can be observed from Fig. 14, the network model proposed in this paper consistently achieves 100% accuracy in recognizing bearing health information across all 10 validations.
Although the other network models also achieve recognition rates above 95%, they do not guarantee correct recognition of certain samples, and a gap remains between their performance and that of the proposed network model.
To evaluate the effectiveness of the proposed GAF coding
To ensure that the fault diagnosis model does not produce false alarms, precision is used to measure the performance of the model:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
To ensure that the fault diagnosis model does not miss faults (underreporting), recall is used to measure the performance of the model:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
Because different types of fault diagnosis place different emphasis on these two measures, the F1 score is calculated as the harmonic mean of precision and recall:
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
The most satisfactory results of the three sets of data were chosen to assess the fault diagnosis model's performance, as illustrated in Table 7, which displays the average values across ten trials.
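As an illustration of how precision, recall, and F1 are obtained for a multi-class health-state classifier, the sketch below computes them per class from a confusion matrix. The matrix values are made up for the example and are not results from this paper; the function name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(confusion):
    """Per-class precision, recall, and F1 from a confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly other
        fn = sum(confusion[k]) - tp                       # truly k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


# Illustrative two-class matrix (hypothetical counts).
example = [[9, 1],
           [2, 8]]
scores = per_class_metrics(example)
```

Averaging the per-class values (macro averaging) is one common way to report a single figure per metric; the paper does not state which averaging scheme it uses.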
As the specificity increases, the optimal classification results are obtained when the sensitivity reaches 1. In other words, the larger the area under the seven health state curves, the better the experimental results. The ROC curves for Case 1, depicted in Fig. 24, show that the optimal classification results are achieved, since the area under the ROC curve equals 1 for each category.
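The relationship between the ROC curve and its area can be sketched for a single health state treated one-vs-rest (an assumption; the paper does not describe how the multi-class curves were produced). The labels and scores below are illustrative only, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """(FPR, TPR) points obtained by sweeping a threshold over the scores.

    labels: 1 for the class of interest, 0 for all other classes.
    scores: predicted probability (or score) for the class of interest.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Perfectly separated scores give an area of 1; one swapped pair lowers it.
perfect_auc = area_under_curve(roc_points([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))
imperfect_auc = area_under_curve(roc_points([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

An area of exactly 1 means every sample of the class is scored above every sample of the other classes, which is the condition described for each category in Case 1.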
Many faults exhibit numerous ambiguous features. Therefore, research on feature extraction and analysis methods for fault diagnosis of rotating machinery is of great importance [1]. Traditional fault diagnosis methods based on expert systems rely heavily on the expert's empirical knowledge, leading to high manual labor intensity and low accuracy in real engineering environments [2]. These methods are based on mechanical fault mechanisms, signal characteristics, or feature extraction. Because they depend on experts' practical experience and professional knowledge of rotating machinery, efficient fault diagnosis is difficult to achieve [3].
Machine learning-based fault diagnosis is a data-driven method that uses machine learning techniques to automate the diagnosis of faults in equipment or systems [4]. The model is trained with historical data and applied to real-time data to determine the type and location of faults. Ye et al. [5] used variational modal decomposition to decompose the original mechanical vibration signal into multiple intrinsic modal functions, constructed multidimensional feature vectors of the signal through multiscale permutation entropy, and fed these vectors into a particle swarm optimization-based support vector machine classification model to diagnose rolling bearing faults. Liu et al. [6] proposed a method based on twin prototype networks with noisy label self-correction to address fault diagnosis for wind turbine gearboxes with noisy labels. Guo et al. [7] proposed cyclic kurtosis entropy, a measure of the intensity of periodic pulses in the signal, to address the failure of fault diagnosis techniques for rotating machinery in the presence of strong noise disturbances or composite fault coupling. The algorithm uses the entropy value to calculate all delayed periodic kurtosis, overcoming the poor adaptive ability of kurtosis and obtaining a more stable value, which enhances fault feature extraction for rotating machinery under stationary operating conditions. Su et al. [8] proposed a novel intelligent fault diagnosis model based on singular value manifold features, optimized support vector machines, and multi-sensor information fusion. Li et al. [9] tackled the frequency distortion caused by strong noise or dense frequencies in rotating machinery by mapping the signal into a new pseudo-temporal domain to eliminate non-smoothness on the basis of the rotating machinery mechanism. They also addressed the impact of velocity variations or acceleration fluctuations at complex and dense frequencies on fault diagnosis. Li et al. [10] proposed an interpretable wavelet packet kernel-constrained convolutional network for noise-robust fault diagnosis, which demonstrated superior robustness and noise immunity compared with other models. Aasi et al. [11] established a setup specifically designed for condition monitoring of angular contact bearings using acoustic emission sensors. During the experiments, acoustic emission signals were collected from bearings containing defects, with a focus on analyzing and recording defects on the inner or outer rings of the bearings. Screening and detailed analysis of several common time-domain features revealed the applicability of these features in diagnosing the condition of the bearings. Notably, the clearance factor was highly effective in detecting small defects, and the sixth-order central moment was particularly adept at identifying larger defects. This research provides an efficient technical method for the industrial application of rolling bearing condition monitoring and fault detection.
Machine learning has enabled certain advances in the field of fault diagnosis [12]. However, due to its shallow architecture, machine learning often struggles to obtain high-dimensional and invisible characterizations [13]. In the realm of big data, deep learning-based algorithms have emerged as a prominent area of focus for fault diagnosis. These algorithms can learn fault features from raw fault signals without relying on expert experience or manual extraction, making deep learning-based fault diagnosis a popular research direction in recent years. Zhang et al.
[14] proposed a Bayesian augmented convolutional neural network, which focuses on capturing the time dependence of non-stationary signals collected from raw fault signals and achieves impressive results in the fault diagnosis of large-scale low-speed wind turbine bearings. Wang et al. [15] proposed an adaptive denoising convolutional neural network, which integrates an adaptive denoising unit to remove noise while retaining sensitive fault features, eliminating the need to set a denoising function manually and achieving superior fault diagnosis performance in noisy environments. Wang et al. [16] proposed a fast fault diagnosis method for gear transmission systems based on the swarm decomposition algorithm, the improved multi-scale reverse discrete entropy algorithm, and a bidirectional long short-term memory network, achieving high accuracy and stability in fault signal classification. Chen et al. [17] proposed a nonlinear system identification strategy based on a deep transfer learning domain adversarial neural network, achieving high identification accuracy in 16 operating conditions of helicopter transmission systems. Cheng et al. [18] proposed a wavelet transform-local two-stage convolutional neural network fault diagnosis method, in which vibration signals from bearings are passed through a wavelet transform and then separately input to a local two-cross convolutional neural network to diagnose rotating machinery. Since machinery failures do not exhibit only one characteristic, methods that fuse multiple source signals into a neural network for fault diagnosis have been proposed. Wang et al. [19] proposed a multi-modal sensor fusion method that uses acceleration sensors and acoustic sensors to collect vibration and acoustic signals, respectively, feeds the collected signals into a binary convolutional neural network to carry out fault diagnosis, and verified its superiority over existing algorithms.
It is widely recognized that Convolutional Neural Networks (CNNs) have shown excellent performance in fault diagnosis and prediction tasks. CNNs are primarily used for 2D image processing, but they have also been successfully applied to equipment fault diagnosis and prediction because of their strong feature learning ability and fault tolerance in complex environments with confusing knowledge rules. There are two main strategies for using CNNs for rotating machinery fault diagnosis: one is to use a 1D-CNN model and input one-dimensional mechanical signals into the network; the other is to convert one-dimensional mechanical signals into two-dimensional images and input them into the network. Xia et al. [21] combined the raw data collected by multiple sensors into a 2D matrix and input it into the CNN to achieve end-to-end feature learning. Chen et al. [22] proposed a method that combines cyclic spectral coherence (CSCoh) 2D maps with CNNs, using a double Fourier transform to estimate the 2D CSCoh maps of vibration signals and reveal valuable health state information. Xiong et al. [23] proposed a fault diagnosis method that preprocesses vibration signals with a mutual dimensionless similarity Gram matrix and feeds the processed feature maps into a CNN for diagnosis. Bai et al. [24] proposed a new spectral Markov transition field algorithm, which constructs a first-order Markov transition matrix of frequency domain signals to represent the spectral characteristics of vibration signals as an image and is effective in comprehensively characterizing composite fault features. He et al. [25] proposed a multi-sensor data fusion method that integrates multivariate data into pixel matrices and feeds these matrices into a two-scale residual network for fault diagnosis. Fan et al. [26] proposed a graph generation method that generates correlation diagrams from table data and feeds these diagrams into neural networks for HVAC system fault diagnosis. Chang et al. [27] first transformed the battery voltage signal into a time-frequency representation through a continuous wavelet transform following variational modal decomposition, and then used image entropy to determine the parameters of the battery malfunction for fault diagnosis.
The use of attention mechanisms in intelligent fault diagnosis improves classification accuracy by selectively focusing on relevant features and suppressing irrelevant information, thereby greatly improving efficiency. Wang et al. [28] applied Empirical Mode Decomposition to preprocess the original signal, incorporated a multi-head attention mechanism into a graph neural network, and achieved notable performance in bearing diagnosis. Tong et al. [29] proposed a CNN model using multi-sensor information fusion and coordinated attention, resulting in a lightweight convolutional neural network with an integrated attention mechanism, known as ACNN. Xia et al.
[30] introduced a hierarchical attention-based method for multi-source data fusion fault diagnosis in autonomous underwater vehicles. The method employs an encoder-decoder network, a fusion network stacked with encoders, and an attention mechanism to diagnose faults. The method was validated using monitoring data from the "Submarine Dragon II" underwater robot during sea trials in the South China Sea, which demonstrated its effectiveness.
Despite the advances made by the aforementioned methods, mechanical fault diagnosis still faces persistent challenges in real-world operational scenarios. Rotating mechanical equipment typically operates under fluctuating load conditions, resulting in dynamic changes in fault information. Moreover, the vibration signals captured by sensors often exhibit high levels of complexity, coupling, and uncertainty. Conventional deep learning models lack the adaptability to extract fault features dynamically according to specific requirements, which results in inadequate generalization in practical engineering applications. To address the limitations of traditional deep learning methods in long time series encoding and generalization capability, to facilitate global information exchange in fault diagnosis models, and to enhance multi-scale feature fusion, this study designs a fault diagnosis model based on GAF encoding and SEDenseNet. This approach aims to achieve precise identification of fault types and locations within signals. The method is used to detect common fault conditions of bearings and gears in rotating machinery, thereby preventing losses in engineering projects.
Eksploatacja i Niezawodność - Maintenance and Reliability Vol. 26, No. 4, 2024
The design concept is to merge deep learning and signal processing methods to improve the effectiveness of fault detection. This is achieved by using the GAF coding technique to capture the fault characteristics of signals and integrating it into the deep learning framework. The fault characteristics embedded in vibration signals, which encompass multivariate time-domain information, are extracted using the angular parsing mechanism of GAF coding, thereby enhancing the feature representation. The resulting feature maps are then processed by the SE attention module and the DenseNet module to recalibrate them. Using learnable weight parameters, the model dynamically selects and accentuates pivotal information within the feature maps, which strengthens its ability to discern fault features. The proposed method is validated on three datasets, confirming its robust performance.
GAF encoding preserves the temporal dependence of the signal [31]: the time sequence information increases as the position moves from the upper left corner to the lower right corner, so that no information within the signal is lost. The main process for constructing a GAF is as follows. First, given the time series X = {x1, x2, ..., xn} of n values of a vibration signal, rescale X so that all signal values fall within the interval [0, 1], and normalize X to obtain $\tilde{X}$. Then encode the signal values as angular cosines and the timestamps as radii, using the following equation, to represent the rescaled time series $\tilde{X}$ in polar coordinates:
$$\begin{cases} \phi_i = \arccos(\tilde{x}_i), & -1 \le \tilde{x}_i \le 1,\ \tilde{x}_i \in \tilde{X} \\ r_i = \dfrac{t_i}{N}, & t_i \in \mathbb{N} \end{cases}$$
In the above equation, $t_i$ is the timestamp and N is the constant used to regularize the span of the polar coordinate system. Since cos(ϕ) is monotonic when ϕ ∈ [0, π], the above mapping is bijective. From this angular perspective, the trigonometric sum or difference between each pair of points is considered, which identifies temporal correlations across different time intervals.
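The construction described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: min-max scaling is assumed for the rescaling to [0, 1], the radii use timestamps 1..n with N = n, and the matrix entries use the standard GAF definitions cos(ϕi + ϕj) for GASF and sin(ϕi − ϕj) for GADF. All function names are hypothetical.

```python
import math


def rescale_unit(series):
    """Min-max rescale a sequence so that every value lies in [0, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    return [(x - lo) / (hi - lo) for x in series]


def polar_encode(series):
    """Return (angles, radii) with phi_i = arccos(x~_i) and r_i = t_i / N."""
    scaled = rescale_unit(series)
    n = len(scaled)
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    angles = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    # Assumption: timestamps t_i = 1..n and N = n, so radii lie in (0, 1].
    radii = [(i + 1) / n for i in range(n)]
    return angles, radii


def gasf(series):
    """Gramian Angular Summation Field: entry (i, j) is cos(phi_i + phi_j)."""
    angles, _ = polar_encode(series)
    return [[math.cos(a + b) for b in angles] for a in angles]


def gadf(series):
    """Gramian Angular Difference Field: entry (i, j) is sin(phi_i - phi_j)."""
    angles, _ = polar_encode(series)
    return [[math.sin(a - b) for b in angles] for a in angles]


signal = [0.0, 1.0, 2.0, 1.0]
summation = gasf(signal)
difference = gadf(signal)
```

The summation field is symmetric and the difference field is antisymmetric with a zero diagonal, which is one reason the two images look different for the same signal.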
Here I = [1, 1, ..., 1] is the unit row vector. A one-dimensional time series can be transformed into a quasi-Gramian matrix by the above transformation. Because the n × n Gramian matrix is large, Piecewise Aggregate Approximation [32] is used to smooth the time series while maintaining the trend of the signal. The GAF generation process is shown in Fig. 1. Every pixel in the GAF 2D image corresponds to the magnitude of the value at the corresponding position of the Gramian matrix.
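Piecewise Aggregate Approximation can be sketched as follows: the series is split into equal-width frames and each frame is replaced by its mean. The sketch assumes the series length is divisible by the number of frames; the function name `paa` is hypothetical.

```python
def paa(series, segments):
    """Piecewise Aggregate Approximation: mean of each equal-width frame.

    Assumption: len(series) is an exact multiple of `segments`.
    """
    n = len(series)
    if segments <= 0 or n % segments != 0:
        raise ValueError("series length must be a positive multiple of segments")
    width = n // segments
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(segments)]


# Halving a 1000-point sample gives 500 points, so the Gramian matrix
# shrinks from 1000 x 1000 to 500 x 500.
reduced = paa(list(range(1000)), 500)
```

A reduction of this kind is consistent with 1000-point samples yielding 500×500 GAF images, although the exact PAA settings are not given in the text.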
ResNet allows for deeper training by establishing "shortcuts" (skip connections) between front and back layers. DenseNet, by contrast, establishes dense connections between each layer and all layers that precede it. DenseNet alleviates the vanishing gradient problem in deep CNNs and enhances feature transfer and reuse [33]. The regularizing effect of this dense connectivity can reduce overfitting on small datasets to some extent. The dense connection mechanism, illustrated in Fig. 2, is the core of DenseNet.
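In the notation commonly used for DenseNet (the symbols below are not taken from this paper), the dense connection can be written as follows, where $[\cdot]$ denotes channel-wise concatenation of the feature maps of all preceding layers and $H_\ell$ is the composite transformation of layer $\ell$:

```latex
x_\ell = H_\ell\big([x_0, x_1, \ldots, x_{\ell-1}]\big)
```

If each layer contributes $k$ new feature maps (the growth rate) and the block input has $k_0$ channels, layer $\ell$ receives $k_0 + k(\ell - 1)$ input channels, which is how earlier features are reused rather than relearned.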

This section presents two types of experimental data and describes the experimental model. The experiments use the Case Western Reserve University (CWRU) motor bearing fault diagnosis dataset, as well as the bearing fault data and gearbox fault data collected on the Wind Turbine Drivetrain Simulator Fault Diagnosis Comprehensive Experimental Platform. These datasets are used to validate the proposed rotating machinery fault diagnosis method based on SEDenseNet and the Gramian Angular Field.

A Case 1: Case Western Reserve University Bearing Dataset
The experimental data for Case 1 were obtained from the motor bearing fault test bed at Case Western Reserve University [35]. The platform shown in Fig. 5, which includes a 2-hp electric motor, a torque transducer, a power test meter, and an electronic controller (not shown), serves as the experimental setup. The data were acquired at the drive end from an SKF6205 bearing, with a sampling frequency of 12 kHz, a bearing speed of 1797 r/min, and no load.

Fig. 5. CWRU experimental setup.
The bearing dataset comprises four health conditions: normal state (N), inner race fault (IR), outer race fault (OR), and ball fault (Ball). The data were collected from the drive-end bearing and are presented in Table 1. To improve the network model's learning and generalization capabilities, this paper uses data augmentation to expand the dataset. Specifically, the one-dimensional bearing vibration signals are segmented by overlapping sampling, so that consecutive samples overlap. The sampling method is shown in Fig. 6.

Fig. 6. Signal sample interception.
To make each sample longer than the signal produced by two revolutions of the bearing (12000×60/1797), the length of each sample was set to 1000 and the step size to 500.
The experimental data for Case 2 were obtained from the Wind Turbine Drivetrain Simulator (WTDS) Fault Diagnosis Comprehensive Experimental Platform, which comprises a motor controller, a drive motor, a speed sensor at the drive end, a bearing housing, a parallel gearbox, a planetary gearbox, a magnetic powder brake, and an output speed sensor. The sampling frequency of the fault acceleration signals of the bearings and gearboxes acquired on the WTDS is 20480 Hz, and the faulty bearing type is ER-16K.
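The overlapping sampling used for the CWRU signals (sample length 1000, step size 500) can be sketched as a sliding window. At 12 kHz and 1797 r/min, one revolution spans 12000×60/1797 ≈ 401 points, so two revolutions span about 801 points, which a 1000-point window covers. The function name `segment_signal` is hypothetical, and the input below is synthetic.

```python
def segment_signal(signal, length=1000, step=500):
    """Cut a 1D signal into overlapping windows of `length` points.

    Consecutive windows start `step` points apart, so with
    length=1000 and step=500 each window shares half of its points
    with the next one. Trailing points that do not fill a window
    are dropped.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [signal[start:start + length]
            for start in range(0, len(signal) - length + 1, step)]


# Points per shaft revolution for the CWRU settings in the text.
points_per_revolution = 12000 * 60 / 1797

# Synthetic 3000-point signal standing in for a vibration record.
samples = segment_signal(list(range(3000)))
```

With a step smaller than the window length, a record of L points yields (L − 1000) // 500 + 1 samples instead of L // 1000, which is the augmentation effect the overlap provides.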

Fig. 9. WTDS experimental setup.
The bearing dataset includes the following four health types: normal condition (N), inner race fault (IR), outer race fault (OR), and ball fault (Ball), as shown in Table 2. As with the division of the CWRU bearing dataset, the length of each sample signal was selected to exceed that of the signal produced by two revolutions of the bearing and gear (20480 × 60/1500).

Fig. 10. Proposed fault diagnosis method for rotating machinery.
The segmented signal samples are first transformed into 500×500 two-dimensional images using the GAF coding technique. Each image is then scaled to 224×224 before being fed into the network model. After passing through the network layers, the GAF images containing the health status information of the rotating machinery are classified, yielding the loss and the accuracy. The loss is backpropagated to optimize the network model and minimize the loss, and the network is trained for 100 iterations to achieve optimal classification results.

D Experimental platform setup
The fault diagnosis model for rotating machinery is developed in Python 3.8 using the PyTorch deep learning framework. Development tools such as PyCharm are employed, and the code is executed in a computing environment with CUDA v12.1 and an NVIDIA GeForce RTX 3060 graphics card. The batch

Fig. 11. Test set accuracy and loss under Case 1.
As can be observed from the figure, the model achieves
Fig. 15.
The comparison revealed that although the MTF coding technique failed to reach 100% accuracy in diagnosing the health status of bearings in the Case 1 dataset, it still attained 98.91% accuracy when using the network model

Fig. 16. Gearbox dataset accuracy and loss in Case 2.
Fig. 17. Bearing dataset accuracy and loss in Case 2.
As can be seen from the figures, training on the gearbox and bearing datasets of the WTDS experimental platform gradually converges after around 20 rounds. Although accuracy and loss fluctuate slightly, the magnitude of these changes is not significant. After approximately 40 rounds of training, the test set accuracy for both datasets remains consistently high, with the gearbox dataset reaching almost 100% and the bearing dataset almost 99.86%. Both datasets also have nearly zero loss after convergence.

Fig. 23. Accuracy of 10 validations for different signal preprocessing methods in Case 2.
The 10 validation results of the gearbox dataset in Case 2 all achieved 100% diagnostic accuracy, while the 10 validation results of the bearing dataset comprised 9 diagnostic accuracies of 99.86% and 1 diagnostic accuracy of 99.72%. This corresponded to 1 incorrect classification in the other categories in the test set for bearings and 2 for gearboxes. The average accuracy over ten validations of the network model proposed in this paper, compared with the five other network models and the MTF coding technique, is shown in Tables 5 and 6.
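The reported bearing accuracies can be checked against the 99.85% average quoted for this dataset with a short calculation. The list below simply restates the nine 99.86% runs and the one 99.72% run from the text.

```python
# Ten validation accuracies (%) for the Case 2 bearing dataset, as reported.
bearing_accuracies = [99.86] * 9 + [99.72]

# Arithmetic mean: (9 * 99.86 + 99.72) / 10 = 99.846, i.e. about 99.85%.
mean_accuracy = sum(bearing_accuracies) / len(bearing_accuracies)
rounded_accuracy = round(mean_accuracy, 2)
```

The mean of 99.846% rounds to the 99.85% figure stated for the bearing dataset.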

Fault diagnosis model performance metrics
The potential ramifications of failing to identify or accurately report fault types during diagnosis are significant, so diagnostic accuracy alone cannot be used to assess the effectiveness of the fault diagnosis model. To evaluate the model's performance comprehensively, four key indicators are considered: accuracy, precision, recall, and F1 value. In the rotating machinery health state classification model proposed in this paper, the true categories of the samples and the model's predicted categories are classified into four distinct groups: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN).

Table 1. Details of the experimental dataset selected from the CWRU dataset.

Table 2. Details of the experimental dataset selected from the WTDS bearing dataset.

Table 3. Details of the experimental dataset selected from the WTDS gearbox dataset.

Table 4. Comparison of average accuracy of different network
In this section, the health state data of bearings and gearboxes collected by the WTDS experimental platform are recognized and diagnosed. A series of evaluations of the diagnostic performance is conducted, including the recognition accuracy and loss for the WTDS bearing and gearbox datasets shown in Figs. 16 and 17.

Table 5. Comparison of average accuracy of different network

Table 6. Comparison of average accuracy of different network

Table 7. Model metric average results.
The ROC curve represents the perceptibility of a given signal stimulus. The ROC curve displays the true positive rate (TPR, sensitivity) on the vertical axis and the false positive rate (FPR, 1 − specificity) on the horizontal axis. The ROC curve is as follows: