Fault diagnosis of shield machine based on RFE and ELM


The shield machine is a complex, large-scale piece of tunneling equipment with multiple systems and driving sources. To improve the accuracy and efficiency of shield machine fault diagnosis, a method combining reverse feature elimination (RFE) and the extreme learning machine (ELM) is proposed. Because shield machine operation data are high-dimensional and large in quantity, RFE is introduced to reduce the dimension of the data, eliminate redundant dimensions and remove the correlation between features. Because traditional neural networks are slow and inefficient in fault diagnosis, an ELM neural network classifier is built, based on the extreme learning mechanism, for fault diagnosis of the shield machine. Simulation results based on field construction data show that this method significantly improves the accuracy and efficiency of shield machine fault diagnosis and has good engineering application value.


Introduction
The shield machine is a comprehensive, large-scale piece of construction machinery that integrates mechanical, hydraulic, electrical and automatic control systems; its structure is shown in Fig. 1. It is widely used in subway tunnel construction, mining engineering and mountain-crossing tunnel construction because of its high quality, high speed and safety. At present, the shield machine is developing towards high power and intelligence [1]. However, owing to the complexity of its structure and its relatively closed working environment, a variety of failures can easily occur during operation. Fault diagnosis is therefore one of the key technologies for safe, efficient and intelligent shield construction, and a method is needed that can locate a fault as soon as it occurs, or even predict it beforehand, so as to improve work efficiency and reduce economic losses.

Fig. 1 Structure of shield machine
Many scholars have studied this problem. In reference [2], an expert system for fault diagnosis of the shield machine is proposed, in which a multi-agent system and a fuzzy reasoning mechanism are introduced to diagnose multiple faults. However, an expert system often depends on the experience and knowledge of professionals rather than on the model. In reference [3], the self-learning and self-organizing functions and the parallel processing mechanism of the BP neural network are used for information fusion to classify and diagnose common faults of the shield machine, such as gushing, hob wear and shield shell sticking. In reference [4], differential evolution (DE) and the back propagation neural network (BPNN) are combined to diagnose faults in the hydraulic system of shield machine propulsion. The results show that the BP neural network is time-consuming and requires high data quality. In reference [5], rough set theory is used to reduce the dimension of the data, a BP neural network is then used to predict the fault, and the least squares method is used to reflect the future operation of the shield machine. However, rough sets cannot deal with continuous numerical variables, which limits the applicability of this method.
Compared with the traditional support vector machine (SVM), neural networks and other methods, the extreme learning machine (ELM) has the advantages of fast learning speed, high accuracy and simple parameter adjustment [6][7], and it has achieved good results in the fault diagnosis of large-scale equipment such as aeroengine components and marine diesel engines [8][9][10][11].
Therefore, a fault diagnosis method for the shield machine based on RFE and the extreme learning machine is proposed in this paper. RFE is used to eliminate redundant columns as far as possible and gradually reduce the dimension of the diagnosis sample data. The fault is then classified by ELM. Simulation test results show that the model has good data dimensionality reduction ability, short fault diagnosis time and high accuracy.

Extreme learning machine
The extreme learning machine is a new algorithm for single hidden layer feedforward neural networks. To address the shortcomings of traditional feedforward neural networks, such as slow training speed, a tendency to fall into local minima and sensitivity to the choice of learning rate, the ELM algorithm randomly generates the connection weights between the input layer and the hidden layer and the threshold values of the hidden layer neurons. These parameters need no adjustment during training; only the number of hidden layer neurons has to be set, and the unique optimal solution can then be obtained. Compared with traditional training methods, ELM has the advantages of fast learning speed and good generalization performance. After the neuron parameters in the hidden layer have been randomly generated from any continuous sampling distribution and the training samples are given, the output matrix of the hidden layer is known and remains unchanged. Eq. (1) is thereby transformed into the problem of finding the least-norm least-squares solution of the linear system $H\beta = T$:

$$\hat{\beta} = H^{\dagger} T \tag{4}$$

where $H$ is the output matrix of the hidden layer, $\beta$ is the output weight matrix, $T$ is the target output matrix, and $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$.
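The training procedure described above can be illustrated with a minimal sketch. It is not the authors' MATLAB implementation: the function names `elm_train` and `elm_predict`, the uniform weight range, the number of hidden nodes and the toy two-class data are all assumptions made for illustration. The sketch only shows the two ideas named in the text, namely that the hidden layer parameters are drawn at random and left fixed, and that the output weights are obtained in one step from the Moore-Penrose generalized inverse of the hidden layer output matrix.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def elm_train(X, T, n_hidden, rng):
    """Fit a single-hidden-layer ELM; only the output weights are solved for."""
    n_features = X.shape[1]
    # Input weights and hidden thresholds are drawn once at random and never tuned
    # (the uniform range [-1, 1] is an assumption of this sketch).
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(X @ W + b)            # hidden layer output matrix
    beta = np.linalg.pinv(H) @ T      # least-norm least-squares output weights
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Return the index of the largest output node for each sample."""
    return np.argmax(sigmoid(X @ W + b) @ beta, axis=1)


# Toy data (hypothetical): two well-separated classes in 4 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(20, 4)),
               rng.normal(2.0, 0.5, size=(20, 4))])
y = np.array([0] * 20 + [1] * 20)
T = np.eye(2)[y]                      # one-hot target matrix

W, b, beta = elm_train(X, T, n_hidden=10, rng=rng)
train_accuracy = float(np.mean(elm_predict(X, W, b, beta) == y))
print(train_accuracy)
```

Because no iterative weight update is involved, training cost is dominated by a single pseudoinverse computation, which is the source of the speed advantage claimed for ELM.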

Idea of RFE
RFE is a dimension reduction method applied to data with a large number of redundant dimensions. First, the accuracy obtained with all of the original data dimensions is calculated, and then the accuracy is calculated again with the first dimension eliminated. If the accuracy becomes higher after the first dimension is eliminated, the first dimension is considered redundant, is eliminated, and the calculation moves on to the second dimension; if the accuracy does not become higher, the first dimension is a useful dimension, is retained, and the calculation moves on to the second dimension. This procedure is iterated until all dimensions have been traversed.
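The elimination loop described above can be sketched as follows. The function name `reverse_feature_elimination` and the scoring function `toy_accuracy` are hypothetical; in the paper the score would be the diagnosis accuracy of the classifier, whereas here a simple stand-in is used so that the example runs on its own. The sketch also assumes that the reference accuracy is updated each time a dimension is eliminated, which the text suggests but does not state explicitly.

```python
from typing import Callable, List, Sequence, Tuple


def reverse_feature_elimination(
    n_dims: int,
    accuracy_of: Callable[[Sequence[int]], float],
) -> Tuple[List[int], float]:
    """Try dropping each dimension once; keep the drop only if accuracy rises."""
    kept = list(range(n_dims))
    best = accuracy_of(kept)              # accuracy with all original dimensions
    for dim in range(n_dims):
        candidate = [d for d in kept if d != dim]
        if not candidate:                 # never drop the last remaining dimension
            break
        score = accuracy_of(candidate)
        if score > best:                  # accuracy rose: the dimension is redundant
            kept, best = candidate, score
        # otherwise the dimension is useful and stays in `kept`
    return kept, best


# Hypothetical stand-in for a classifier's accuracy: dimensions 0 and 2 carry
# information, and every other dimension that is present costs a little accuracy.
def toy_accuracy(dims: Sequence[int]) -> float:
    useful = {0, 2}
    hits = len(useful.intersection(dims))
    noise = len([d for d in dims if d not in useful])
    return 0.5 + 0.2 * hits - 0.05 * noise


kept_dims, final_accuracy = reverse_feature_elimination(5, toy_accuracy)
print(kept_dims, round(final_accuracy, 2))
```

Each dimension is tested exactly once, so the number of classifier evaluations grows linearly with the number of dimensions rather than with the number of possible subsets.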

Fault diagnosis of shield machine based on RFE-ELM
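According to actual construction experience with the shield machine, three main failure types are determined: hob wear caused by excessive torque of the cutter head motors; shield shell sticking caused by excessive speed of the jacks of the propulsion system; and blocking of the grouting pipeline caused by too low a grouting fluid flow. Each main fault is divided into two secondary faults, and the fault classification is shown in Table 1.

Table 1 Fault classification of shield machine
Main fault | Secondary fault
Hob wear | Torque of cutter head motors No. 1, 3, 5 and 7 is too high
Hob wear | Torque of cutter head motors No. 2, 4, 6 and 8 is too high
Shield shell stuck | Speed of jacks No. 2 and 4 is too high
Shield shell stuck | Speed of jacks No. 1 and 3 is too high
Grouting pipe blocked | Grouting A liquid injection flow is too low
Grouting pipe blocked | Grouting B liquid injection flow is too low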
Based on the above fault types, the ELM fault classifier of the shield machine is constructed in combination with the reverse feature elimination method. The process of fault diagnosis is shown in Fig. 2. The dimension of the fault data is determined according to the fault types of the shield machine, and the sample data for training and testing are constructed from the actual monitoring data of the shield machine. The specific steps of the ELM fault diagnosis algorithm for the shield machine based on reverse feature elimination are as follows.
Step 1: Select the operation data of the shield machine as sample data and generate the fault data; use one part of the data as training samples and the other part as test samples.
Step 2: Feed the fault data with all of its original dimensions into ELM for fault diagnosis, and obtain the accuracy for the original dimensions, denoted $A_0$.
Step 3: Eliminate the first dimension of the fault data, run ELM on the remaining dimensions, and obtain the accuracy after eliminating the first dimension, denoted $A_1$.
Step 4: Compare $A_1$ with $A_0$. If $A_1 > A_0$, the accuracy has become higher, so the first dimension can be considered redundant and is eliminated, and the elimination test moves on to the second dimension. If $A_1 \le A_0$, the accuracy has not become higher, so the first dimension is a useful dimension and is retained, and the elimination test moves on to the second dimension.
Step 5: Taking the accuracy as the criterion, repeat steps 3 and 4 for each remaining dimension until all dimensions have been traversed. The result is the ELM fault diagnosis accuracy for the shield machine data with most redundant dimensions eliminated.
Throughout this process, the fault diagnosis accuracy is the optimization goal, and it is continuously improved during the iterative calculation.
In conclusion, the fault diagnosis process of the shield machine based on reverse feature elimination is shown in Fig. 3.

It is difficult to obtain a large amount of actual fault data for the shield machine, because faulty components are few and faults are infrequent, tunneling conditions are complex, and many factors affect the driving process. Therefore, using data from Beijing Metro Line 10, 140-dimensional vector data composed of 140 monitoring parameters under normal conditions are obtained in this paper, and the fault data are generated according to the selected fault type and deviation degree. The three main fault types and their secondary fault types involve 11 of the 140 dimensions. A total of 120 groups of data are selected, each containing 140 attributes, forming an original matrix of 120 rows and 140 columns; 90 rows are selected as training data and the remaining 30 rows as test data.
Because of the randomness of neural network operation, each fault diagnosis accuracy is obtained by repeating the training 100 times and averaging the results, which gives a reasonable and stable test accuracy but also increases the calculation time of fault diagnosis. In the actual application of shield construction, the number of training runs can be appropriately reduced to improve calculation efficiency.
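The trade-off between the number of repeated runs and the stability of the averaged accuracy can be illustrated with a small sketch. Everything in it is hypothetical: `noisy_accuracy` is a stand-in for one training and test run of a randomly initialised network, and the per-sample success probability of 0.9 is an arbitrary illustrative value, not a result from the paper. Only the size of the test set (30 rows) and the run counts (10 and 100) are taken from the text.

```python
import random
import statistics


def noisy_accuracy(rng: random.Random) -> float:
    """Hypothetical stand-in for one train/test run; returns accuracy on 30 test rows."""
    correct = sum(rng.random() < 0.9 for _ in range(30))
    return correct / 30


def mean_accuracy(n_runs: int, seed: int) -> float:
    """Average the accuracy over n_runs repeated runs."""
    rng = random.Random(seed)
    return statistics.mean(noisy_accuracy(rng) for _ in range(n_runs))


# Spread of the averaged result when the whole procedure is repeated 50 times.
spread_10 = statistics.pstdev(mean_accuracy(10, s) for s in range(50))
spread_100 = statistics.pstdev(mean_accuracy(100, s) for s in range(50))
print(spread_10, spread_100)
```

Under these assumptions the average over 100 runs fluctuates less than the average over 10 runs, at roughly ten times the computational cost, which is the trade-off the paragraph above describes.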
All the following experiments are simulated in MATLAB on a computer with an Intel Core i5-7200U processor (2.50 GHz) and 8.00 GB of memory.
Algorithm parameter setting: sigmoid functions with the same parameters are selected as the activation functions of the ELM algorithm, and the same number of hidden layers and nodes is used throughout. The ELM neural network structure consists of an input layer, a hidden layer and an output layer. The input is each group of fault data, and the output is the fault classification category corresponding to that group of fault data. The topology of the ELM neural network is shown in Fig. 4.

Fig. 4 ELM neural network topology
The RFE-ELM algorithm in this paper is simulated and verified, and the results of data dimensionality reduction and fault diagnosis accuracy are shown in Fig. 5. The relationship between the number of eliminated columns and the fault diagnosis accuracy is shown in Fig. 6. It can be seen that the ELM diagnostic accuracy rises as more columns are eliminated, and that it is highest after 26 redundant columns have been eliminated. The data remaining after elimination of these redundant dimensions are used to verify shield machine fault classification based on RFE-ELM, and the results are compared with those of the BP neural network method with reverse feature elimination (RFE-BP). The diagnosis results are visualized in Fig. 7. In Fig. 7, the fault classification categories 1 to 6 are as follows: (1) the torque of cutter head motors No. 1, 3, 5 and 7 is too high; (2) the torque of cutter head motors No. 2, 4, 6 and 8 is too high; (3) the speed of jacks No. 2 and 4 is too high; (4) the speed of jacks No. 1 and 3 is too high; (5) the grouting A liquid injection flow is too low; (6) the grouting B liquid injection flow is too low. It can be seen from Fig. 7 that the fault diagnosis result of RFE-ELM is basically consistent with the expected output, while the diagnosis result of RFE-BP differs considerably from the expected output. Moreover, the accuracy of RFE-ELM is 93.33%, while the accuracy of RFE-BP is 56.67%. Therefore, it can be concluded that the RFE-ELM method in this paper achieves higher accuracy and a better effect in fault diagnosis.
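As a quick consistency check on the two accuracies quoted above, the short sketch below converts counts of correctly classified test samples into percentages. It assumes that both figures are measured on the 30-row test set described earlier; the counts 28 and 17 are inferred from the percentages and are not stated in the paper, and the helper name `accuracy_percent` is hypothetical.

```python
def accuracy_percent(correct: int, total: int) -> str:
    """Format the share of correctly classified samples as a percentage."""
    return f"{100 * correct / total:.2f}%"


n_test = 30  # test rows, as described in the data setup
# Inferred counts (assumption): 28 correct for RFE-ELM, 17 correct for RFE-BP.
for name, correct in [("RFE-ELM", 28), ("RFE-BP", 17)]:
    print(name, accuracy_percent(correct, n_test))
```

Under this assumption, a single misclassified sample changes the accuracy by about 3.33 percentage points, which indicates the resolution of results reported on a test set of this size.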

Comparison and analysis of the effectiveness of the algorithms
To verify the effectiveness of this method, the RFE-ELM method is compared with the RFE-BP neural network, ELM and BP neural network diagnosis methods. After extensive experiments, the structure of the BP neural network is determined to consist of an input layer, two hidden layers and an output layer. The input is each group of fault data, and the output is the classification category corresponding to that group of fault data. Each hidden layer has 15 nodes, and the topology is similar to that shown in Fig. 4. The initial learning rate of the BP neural network is 0.3, the training target is 0.001, and the number of training iterations is 200. All BP networks use sigmoid functions with the same parameters and the same number of hidden layers and nodes.
The simulation results of the four methods are shown in Fig. 8. It can be seen from Fig. 8 that the fault diagnosis accuracy of RFE-ELM is significantly higher than that of the other methods, and that, as redundant dimensions are eliminated, the highest diagnostic accuracy reaches 94.2%, which shows that this method has better dimension reduction characteristics and fault classification accuracy. In contrast, the accuracy of RFE-BP does not increase after 100 columns have been eliminated, and it is unstable; its highest diagnostic accuracy is 58%. Because the plain ELM and BP algorithms do not eliminate columns or reduce the dimension by RFE, but take all 140 dimensions of the original data as the input for fault diagnosis, each of their results appears as a single point in Fig. 8: the ELM accuracy on the original data and the BP accuracy on the original data, respectively. It is worth noting that these points are not the diagnosis result of column 140, but the diagnosis result for the 140-dimensional input. The diagnostic accuracy of the ELM algorithm is 53.76% and that of the BP algorithm is 47.67%, both of which are significantly lower than that of the RFE-ELM algorithm in this paper.
The quantitative comparison of the overall performance of the four methods is shown in Table 2. All the above results are calculated after the algorithm has been run 100 times. From Table 2, the performance of the four algorithms compares as follows: (1) Time consumption: After the RFE dimension reduction algorithm is introduced, the training time of RFE-ELM and RFE-BP is obviously longer than that of the plain ELM and BP algorithms, because a large amount of time is consumed in the process of data dimensionality reduction. However, the time for fault diagnosis and prediction is almost the same.
(2) Elimination of columns: When data are trained, different columns are related to each other, so the columns eliminated by RFE-BP are roughly the same as those eliminated by RFE-ELM, but there are also subtle differences. The

Conclusion
To address the accuracy and efficiency of shield machine fault diagnosis, a fault diagnosis method using the extreme learning machine based on RFE is proposed in this paper. RFE is used to reduce the dimension of the data, and the extreme learning machine is used to improve the efficiency of calculation, so that faults can be diagnosed in a timely manner. Simulation experiments based on field construction data are carried out to verify the effectiveness of the method, and the following conclusions are obtained: (1) Compared with RFE-BP, ELM and BP, the accuracy of RFE-ELM is significantly higher, with a highest accuracy of 94.2%, and the prediction time is shorter. Most of the faults of the shield machine can be detected automatically, which is of great significance to shield construction.
(2) To reduce the fluctuation of the accuracy of the algorithm, all simulation results are calculated by running the algorithm 100 times, which increases the training time. In the actual application of shield construction, the running time can be reduced, at the cost of some program stability, to improve the timeliness of the algorithm. Further tests are also carried out in the experiment. When the number of runs is reduced from 100 to 10, the highest accuracy is 95%, the redundant dimensions are eliminated to 17 columns, and the training time can be compressed to 359.73 s, but the fluctuation between the results of individual runs increases. Therefore, how to reduce the fluctuation of the results and ensure the accuracy of the calculation with fewer runs is the work we need to study in the future.