Theory and Numerical Analysis of Extreme Learning Machine and Its Application for Different Degrees of Defect Recognition of Hoisting Wire Rope



Introduction
Extreme learning machine (ELM) was proposed based on the single-hidden layer feed-forward neural network (SLFNN) [1]. Unlike conventional network learning algorithms, which must see the training samples before generating the parameters of the hidden nodes, ELM can randomly generate the hidden node parameters before seeing the training samples. ELM is characterized by simpler parameter selection rules, faster convergence, less human intervention, and so on. However, because the hidden nodes in ELM are generated randomly, some problems still urgently need to be addressed. The network structure is crucial to the learning results and generalization ability of ELM.
The network structure of ELM is determined by the number of hidden nodes. In recent years, the growth mechanism of hidden nodes has been extensively studied by many researchers. To obtain better generalization, the performance of ELM should be optimized, and many improved ELM methods have recently been proposed. The incremental extreme learning machine (I-ELM) was proposed by Huang et al. [1]; it randomly adds hidden nodes one by one until the convergence requirement is reached. However, I-ELM does not recalculate the output weights of the existing nodes when a new node is added. To overcome the disadvantages of I-ELM, the convex incremental extreme learning machine (CI-ELM) [2] and its improved method (ICI-ELM) [3] were proposed. To decrease the calculation time of ELM, two different growth structures of hidden nodes (an increased structure and a decreased structure) were designed. The increased structure of hidden nodes includes the enhanced random search based on I-ELM (EI-ELM) [4], EM-ELM [5], and so on. The decreased structure of hidden nodes includes P-ELM [6], OP-ELM [7], EM-ELM [8], and so on. The error-minimized extreme learning machine for single-hidden layer feed-forward neural networks was proposed for the problem of simultaneous learning. The optimum values of these parameters and the number of hidden neurons of ELM have been obtained by using a genetic algorithm (GA), wavelets, or particle swarm optimization (PSO). In addition, some new adaptive growth methods of hidden nodes were proposed, including AG-ELM [9] and D-ELM [10]. Apart from these optimization studies, ELM has a wide range of applications in data classification [11], nonlinear dynamic systems identification [12], pattern recognition [13][14][15], expert diagnosis [16], medical diagnosis [17], modelling permeability prediction [18], expert target recognition [19], human face recognition [20], and prediction interval estimation of electricity markets [21]. However, there are still some problems that need to be studied, and these have resulted in a contradiction between efficiency and accuracy. This paper builds on an in-depth study of the improved ELM methods and proposes a new growth network structure of the ELM algorithm to gain better generalization. Because the structure of hidden nodes is dynamically adjusted with a variable step length during the updating process, the method is referred to as the variable step incremental extreme learning machine (VSI-ELM). VSI-ELM is characterized by a compact network structure, fast running speed, and better generalization ability.
Wire rope, the key component of a mine hoist, is widely used in coal mines and is characterized by high strength, light weight, favorable flexibility, high reliability, good bending performance, and so on [22]. Wire rope is therefore playing an increasingly important role in coal mining. Under alternating load, fatigue, wear, and corrosion of wire rope tend to occur and can even result in the serious damage of a broken rope [23]. Such events put hoisted persons at risk. Broken wire is not only the beginning of the serious damage of a broken rope but is also difficult to find in advance, and it cumulatively decreases the strength of the wire rope or even leads to its fracture [24,25]. Therefore, it is important to study nondestructive testing techniques for wire rope.
The rest of this paper is organized as follows: Section 2 gives the ELM algorithm theory and its improved VSI-ELM model. Section 3 gives the data analysis of ELM, I-ELM, and VSI-ELM and the performance analysis of ELM by using the UCI data sets. Section 4 introduces an automatic MFL detection system; in that section, VSI-ELM is applied to the diagnosis of different broken wires. Section 5 concludes the paper, indicating the major achievements and future scope of this work.

Traditional SLFNN
Theory. Extreme learning machine (ELM) was proposed based on the single-hidden layer feed-forward neural network (SLFNN). ELM is characterized by simpler parameter selection rules, faster convergence, less human intervention, and so on.
The ELM algorithm has been widely used in many areas, such as image processing, machine vision, pattern recognition, and decision and control. A typical SLFNN is mainly composed of an input layer, a hidden layer, and an output layer. ELM is a unified SLFNN with randomly generated input weights, biases, and hidden nodes. For any N given independent samples $(x_j, t_j)$, assume that the input layer of the SLFNN has n nodes, the hidden layer has L nodes, and the output layer has m nodes. A typical SLFNN model can be represented by
$$\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \ldots, N, \tag{1}$$
where $w_i$ is the connection weight between the input layer nodes and the ith hidden node and $b_i$ is the bias of that node. The two parameters $w_i$ and $b_i$ are independent not only of the training sample set but also of each other. $\beta_i$ is the connecting weight between the ith hidden node and the output nodes, $g(x)$ is the activation function of the hidden nodes, and $o_j$ is the output vector for the jth sample. Unlike traditional gradient-descent-based learning algorithms, which only work for differentiable activation functions, the ELM algorithm also works for all bounded nonconstant piecewise continuous activation functions. The hidden nodes of ELM include additive or RBF-type nodes, fully complex nodes, and wavelet nodes. The common activation functions of the hidden layer are shown in Table 1. For the traditional hidden layer activation functions, the activation function parameters a and b are both 1; different values will affect the performance of the ELM algorithm.
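As an illustration of the hidden node types mentioned above, the following Python sketch evaluates an additive sigmoid node $g(w \cdot x + b)$ and a Gaussian RBF node of the form $G(a, b, x) = \exp(-b\|x - a\|^2)$ listed in Table 1. The function names and the sample values are hypothetical and are not taken from the paper; only the two formulas come from the text.

```python
import math

def additive_sigmoid_node(w, b, x):
    """Additive hidden node: g(w . x + b) with a sigmoid activation g."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def gaussian_rbf_node(a, b, x):
    """RBF hidden node: G(a, b, x) = exp(-b * ||x - a||^2)."""
    sq_dist = sum((xi - ai) ** 2 for xi, ai in zip(x, a))
    return math.exp(-b * sq_dist)

# Illustrative values only (not from the paper).
x = [0.5, -1.0]
print(additive_sigmoid_node([1.0, 0.5], 0.0, x))  # 0.5, since w . x + b = 0
print(gaussian_rbf_node([0.5, -1.0], 1.0, x))     # 1.0, since x equals the center a
```

In the RBF node, a plays the role of the center and b the impact factor, so changing either value changes how sharply the node responds to an input.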
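For a given standard set of training samples $(x_j, t_j)$, $j = 1, \ldots, N$, if the outputs of the network are equal to the targets, we get
$$\sum_{j=1}^{N} \| o_j - t_j \| = 0, \tag{2}$$
that is, $\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = t_j$ for $j = 1, \ldots, N$. Equation (2) can be written compactly as
$$H\beta = T, \tag{3}$$
where
$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}, \quad \beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_L^T \end{bmatrix}_{L \times m}, \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}. \tag{4}$$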

Shock and Vibration
Here, H is called the output matrix of the hidden layer in ELM; the ith column of H is the output vector of the ith hidden node with respect to the inputs, and $\beta^T$ is the transpose of $\beta$.
In practical applications, the number of training samples is much larger than the number of hidden nodes (N ≫ L). To reduce the computational cost of ELM, the number of hidden nodes is generally chosen to be smaller than the number of training samples N.
For any given small value ε > 0, ELM has the universal approximation capability, as represented by the following equation:
$$\| H\beta - T \| < \varepsilon. \tag{5}$$
Under the minimum-norm least-squares constraint, the weights between the hidden nodes and the output nodes can be calculated as
$$\min_{\beta} \| H\beta - T \|, \qquad \hat{\beta} = H^{+} T, \tag{6}$$
where $H^{+}$ is the Moore-Penrose generalized inverse of the hidden layer output matrix H. The ELM learning model has three steps, summarized below. Given a training sample set $(x_i, t_i)$ and the hidden node activation function G(a, b, x):
Step 1: randomly assign the input weights $w_i$ and the biases $b_i$ of the L hidden layer nodes.
Step 2: calculate the output matrix of the hidden layer H.
Step 3: calculate the output weight $\beta = H^{+} T$.
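The three steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid activation, the uniform range [-2, 2] for the random parameters, and all function names are hypothetical choices, and $H^{+}T$ is computed with the explicit full-rank formulas $(H^T H)^{-1} H^T T$ (for L ≤ N) or $H^T (H H^T)^{-1} T$ (for L > N) instead of a general pseudoinverse routine.

```python
import math
import random

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b (a square, b a matrix) by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [[v / aug[i][i] for v in aug[i][n:]] for i in range(n)]

def hidden_output(X, W, b):
    """H[j][i] = g(w_i . x_j + b_i) with a sigmoid g (assumed activation)."""
    return [[1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(wi, xj)) + bi)))
             for wi, bi in zip(W, b)] for xj in X]

def elm_train(X, T, L, rng):
    n = len(X[0])
    # Step 1: random input weights and biases (range is an assumption).
    W = [[rng.uniform(-2, 2) for _ in range(n)] for _ in range(L)]
    b = [rng.uniform(-2, 2) for _ in range(L)]
    # Step 2: hidden layer output matrix H (N x L).
    H = hidden_output(X, W, b)
    Ht = transpose(H)
    # Step 3: beta = H^+ T, assuming H has full rank.
    if L <= len(X):
        beta = solve(matmul(Ht, H), matmul(Ht, T))
    else:
        beta = matmul(Ht, solve(matmul(H, Ht), T))
    return W, b, beta

def elm_predict(X, W, b, beta):
    return matmul(hidden_output(X, W, b), beta)

# Toy usage: XOR with more hidden nodes than samples, so the fit is exact.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
T = [[0], [1], [1], [0]]
W, b, beta = elm_train(X, T, L=10, rng=random.Random(0))
print([round(p[0]) for p in elm_predict(X, W, b, beta)])
```

The toy example deliberately uses L > N so that the least-squares solution interpolates the four samples; in the setting described in the text (N ≫ L), the first branch applies and the fit is only approximate.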

VSI-ELM Algorithm.
Building on an in-depth study of the improved ELM methods, a new growth network structure of the ELM algorithm is proposed to gain better generalization. Because the structure of hidden nodes is dynamically adjusted with a variable step length during the updating process, the method is referred to as the variable step incremental extreme learning machine (VSI-ELM). VSI-ELM is characterized by a compact network structure, fast running speed, and better generalization ability.
However, because there is no standard for selecting the number of hidden nodes, the initial value of hidden nodes $L_0$ is particularly important. If $L_0$ is far greater than the optimal number of hidden nodes, the training time increases and the generalization ability decreases. If the number of hidden nodes is too small, fault tolerance is lacking and the training error increases. According to the relationship between the number of hidden nodes and the problem to be solved by ELM, together with the experience of selecting hidden nodes in other neural networks, the initial value of hidden nodes $L_0$ in ELM is set from the numbers of input and output layer nodes, and the number of hidden nodes is then adjusted by a variable step length. Here, $L_0$ is the initial number of hidden nodes, n is the number of input layer nodes, m is the number of output layer nodes, q is the variable step length function, q ∈ Z, and k is the number of iterations. When q = 0, the number of hidden nodes equals its initial value $L_0$.
Next, the number of hidden nodes is updated from $L_0$ according to (8). When the number of hidden nodes is close to the target, ELM uses a smaller step to increase or decrease the number of hidden nodes. When the number of hidden nodes is far from the target, ELM uses a larger step to increase or decrease the number of hidden nodes. VSI-ELM reduces the computational complexity by only updating the output weights incrementally each time. The output weight β is calculated by the least-squares criterion. The computing process of VSI-ELM is as follows.
Given a set of training samples $\{(x_i, t_i)\}_{i=1}^{N}$, the expected learning accuracy ε > 0, and the maximum number of iterations r, the VSI-ELM algorithm proceeds in three phases.
Phase 1: the initialization phase:
(1) Initialize the parameters of the SLFNN by randomly generating $w_i$ and $b_i$ and choosing the activation function $g(x_j)$. Let k be an integer iteration counter. The initial number of hidden nodes is $L_0$, and its error is $E_{1k}$, when k = 0.
(2) Calculate the output matrix of the hidden layer $H_1$.
(3) Calculate the corresponding output error $E_{1k}$.
There are two search directions in the growing mechanism of the hidden nodes of ELM, namely, the increasing growth and the decreasing growth of the network structure, as represented by formula (10) and formula (11).
Table 1 (recovered row). Type of activation function: Gaussian function; formula of activation function: $G(a, b, x) = \exp(-b\|x - a\|^2)$.

The total number of hidden nodes is updated to the value $L_k$. This means adding $(-2^{k-1})$ and $(+2^{k-1})$ hidden nodes to the existing SLFNN, respectively. Calculate the corresponding output errors $E_{2k}$ and $E_{3k}$.
(a) For the negative growth of the network structure, the number of hidden nodes is
$$L_k = L_{k-1} - 2^{k-1}.$$
(b) For the positive growth of the network structure, the number of hidden nodes is
$$L_k = L_{k-1} + 2^{k-1}.$$
Phase 3: if the corresponding output error ‖E‖ < ε or k > r, the growing procedure is finished.
End while.
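The variable-step idea described above can be sketched in Python as follows. This is a simplified reading of the procedure, not the authors' exact algorithm: the function name, the `error_of` callback (which in practice would train an ELM with L hidden nodes and return its error), the lower bound `L_min`, and the toy error curve are all hypothetical. The sketch picks a growth direction by comparing a one-node decrease with a one-node increase, doubles the step ($2^{k-1}$) while the error keeps falling, and restarts from a step of one when the error rises.

```python
def variable_step_search(error_of, L0, eps, max_iter, L_min=1):
    """Search for a hidden-node count whose error is below eps.

    error_of: callable mapping a node count L to an error value
              (hypothetical stand-in for training an ELM with L nodes).
    """
    L, err = L0, error_of(L0)
    iterations = 0
    while err >= eps and iterations < max_iter:
        # Compare negative and positive growth by one node (step 2**0).
        down_err = error_of(max(L_min, L - 1))
        up_err = error_of(L + 1)
        direction = -1 if down_err < up_err else +1
        k, improved = 1, False
        while iterations < max_iter:
            iterations += 1
            trial = max(L_min, L + direction * 2 ** (k - 1))
            trial_err = error_of(trial)
            if trial_err >= err:
                break          # step too large: restart with step 2**0
            L, err, improved = trial, trial_err, True
            k += 1             # error decreased: double the step
            if err < eps:
                break
        if not improved:
            break              # neither direction helps: keep the current L
    return L, err

# Toy error curve with its minimum at 37 nodes (illustrative only).
toy_error = lambda L: abs(L - 37) / 100.0
print(variable_step_search(toy_error, L0=10, eps=0.005, max_iter=50))  # (37, 0.0)
```

In this toy run the node count moves 10, 11, 13, 17, 25, 41, overshoots at 73, then reverses with small steps (40, 38, 37). A real implementation would likely cache the errors already computed rather than re-evaluate them.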

The Performance Analysis of ELM.
Through the above analysis of ELM theory and its improved methods, it is not difficult to find that the performance of ELM is directly related to its algorithm structure. The input weight matrix of ELM is generated randomly once the number of input neurons and the number of hidden layer neurons are fixed. The number of input neurons is determined by the size of the sample matrix (training or testing), while the number of hidden layer neurons is set manually. Therefore, the size of the sample matrix also affects the performance of ELM. However, it is very important to adjust the hidden layer neuron nodes to improve the performance of ELM without changing the sample size. On this basis, the accuracy of ELM in regression or classification will improve if the randomly generated input weight matrix is the best match for the training samples. Therefore, it is of great significance to study the number of hidden layer neurons in ELM and to optimize the parameters of the input weight matrix.
To investigate the effect of the number of hidden layer neuron nodes on the performance of ELM, a performance test of the ELM algorithm was conducted using sample sets provided by the University of California, Irvine (UCI). The regression and classification sets selected from the UCI data sets are shown in Tables 2 and 3, respectively. The determination coefficient R and the root-mean-square error (RMSE) are selected as evaluation indexes. The smaller the RMSE, the better the performance of the algorithm model. The determination coefficient R lies within the range [0, 1]; the closer it is to 1, the better the performance of the algorithm model, and the closer it is to 0, the worse the performance.
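The two indicators are calculated from the predicted and true values of the samples; in particular, the root-mean-square error is
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (\hat{y}_i - y_i)^2},$$
where $\hat{y}_i$ is the predicted value of the ith sample, $y_i$ is the true value of the ith sample, and n is the number of samples.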
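As a small illustration of the two evaluation indexes, the following Python sketch computes the RMSE and a determination coefficient. The RMSE follows the standard definition; the exact formula used for R is not recoverable from the text, so the squared Pearson correlation is used here as an assumption because it lies in [0, 1] as stated. The function names are hypothetical.

```python
import math

def rmse(y_pred, y_true):
    """Root-mean-square error between predicted and true values."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n)

def determination_coefficient(y_pred, y_true):
    """Squared Pearson correlation (assumed form of R; lies in [0, 1]).

    Undefined (division by zero) if either sequence is constant.
    """
    n = len(y_true)
    s_p, s_t = sum(y_pred), sum(y_true)
    s_pt = sum(p * t for p, t in zip(y_pred, y_true))
    s_pp = sum(p * p for p in y_pred)
    s_tt = sum(t * t for t in y_true)
    num = (n * s_pt - s_p * s_t) ** 2
    den = (n * s_pp - s_p ** 2) * (n * s_tt - s_t ** 2)
    return num / den

print(rmse([1.0, 1.0], [0.0, 2.0]))                                 # 1.0
print(determination_coefficient([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0
```

Under this assumed form, R measures how well predictions track the true values linearly, so it is best read together with the RMSE rather than on its own.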
To verify the impact of the number of hidden layer nodes of ELM on regression prediction, four regression data sets from UCI (spectra set, concrete set, fertility set, and sinc set), listed in Table 2, were tested. The changes of the root-mean-square error (RMSE) and the determination coefficient R with the number of hidden layer nodes are shown in Figure 1.
The analysis shows that whether the sample features can effectively describe the characteristics of the samples is the prerequisite for the regression prediction of the samples; a failure to do so is the main reason for the unsatisfactory performance shown in Figure 1(b). For both the training samples and the testing samples, the closer the RMSE is to 0, the closer the determination coefficient is to 1, and vice versa.
For simply characterized samples such as the sinc set, as the number of hidden layer nodes of ELM increases, the RMSE approaches 0 and the determination coefficient R approaches 1, and the regression prediction accuracy of the testing samples is higher than that of the training samples. At the same time, the RMSE and the determination coefficient show no abnormal fluctuation as the number of hidden layer nodes increases. However, for the spectra set, concrete set, and fertility set, abnormal fluctuations of the RMSE and the determination coefficient R appear as the number of hidden layer nodes increases; these fluctuations occur at 30, 50, and 50 hidden layer nodes, respectively. Therefore, it is not appropriate to improve the fitting accuracy of ELM regression only by adding hidden layer nodes without considering the overlearning problem of ELM. In addition, for the spectra set, concrete set, and sinc set, the results no longer change once the number of hidden layer nodes reaches 90, 100, and 10, respectively; beyond that point, adding more hidden layer nodes only increases the computing time of ELM, as shown in Figure 1.

To verify the effect of the number of hidden layer nodes of ELM on classification, 11 classification data sets (abalone, statlog (heart), diabetes, parkinsons, wdbc, iris, wine, breast tissue, glass, seeds, and waveform (version 2)) are used to test the classification predictions of ELM. The variation of the classification accuracy with the number of hidden layer nodes is shown in Figure 2.
Through the analysis and comparison of the above data sets, the conclusions are as follows:
(1) As the number of hidden layer neuron nodes increases, the classification accuracy of the training samples first rises sharply and then approaches the target value relatively slowly. If the number of hidden layer neurons continues to increase, the classification accuracy of the training samples can reach 100%.
(2) As the number of hidden layer neurons increases, the classification accuracy of the testing samples also has a sharp increase phase, but afterwards it does not slowly approach the target value as it does for the training samples. Instead, several different behaviors are possible.
The main reason for this situation is that the ELM input weights are generated randomly. However, the root cause is that every time the hidden layer nodes of ELM are updated, the input weight matrix is generated again, which causes ELM to lose its self-optimizing ability and greatly increases the time needed to search for the optimal ELM structure. This is also the reason why the I-ELM algorithm adds hidden layer neuron nodes one by one.

The Different Growth Structure of Hidden Layer Nodes of ELM.
There are two different ways of growing hidden layer nodes in this article. I-ELM algorithm 1 needs to recalculate all the input weights according to the updated number of hidden layer neurons, whereas I-ELM algorithm 2 only needs to calculate the connection weights between the newly added hidden layer neurons and the original input and output neurons. I-ELM algorithm 2 makes full use of the previously calculated input weight matrix to reduce its calculation time. I-ELM algorithm 2 improves the algorithm structure only by adding hidden layer nodes, as shown in Figure 3.
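The difference between the two growth strategies can be sketched as follows. This is an illustrative Python sketch only: the function names, the uniform range [-1, 1], and the example sizes are hypothetical. `grow_regenerate` mirrors the idea of algorithm 1 (all input weights are drawn again), while `grow_append` mirrors algorithm 2 (existing input weights are kept and only the new nodes are drawn).

```python
import random

def new_nodes(count, n_inputs, rng):
    """Draw random input weights and biases for `count` hidden nodes."""
    W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(count)]
    b = [rng.uniform(-1, 1) for _ in range(count)]
    return W, b

def grow_regenerate(W, b, extra, rng):
    """Algorithm-1 style: discard the old nodes and redraw all of them."""
    return new_nodes(len(W) + extra, len(W[0]), rng)

def grow_append(W, b, extra, rng):
    """Algorithm-2 style: keep the existing nodes and draw only the new ones."""
    W_new, b_new = new_nodes(extra, len(W[0]), rng)
    return W + W_new, b + b_new

rng = random.Random(0)
W, b = new_nodes(3, n_inputs=4, rng=rng)
W2, b2 = grow_append(W, b, extra=2, rng=rng)
print(len(W2), W2[:3] == W)  # 5 True
```

Because the appended variant leaves the first rows of the weight matrix untouched, the corresponding columns of the hidden layer output matrix do not need to be recomputed, which is where the time saving described in the text comes from.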
To verify the difference between the two methods, the number of hidden layer neurons was updated with and without recalculation of the input weights. The two methods (I-ELM algorithm 1 and I-ELM algorithm 2) were each tested on the iris set. The results are shown in Figure 4. As can be seen in Figure 4, I-ELM algorithm 2 converges significantly faster than I-ELM algorithm 1, making the ELM structure more compact and avoiding unnecessary training time.

The Performance Analysis of the VSI-ELM Algorithm.
To compare the performance of the VSI-ELM algorithm and the I-ELM algorithm, the UCI classification data sets (statlog (heart), diabetes, parkinsons, and iris) listed in Table 3 were used to test the two algorithms. The update curves of the sample classification accuracy are shown in Figure 5.
The detailed comparison of the training time and the number of hidden layer neuron nodes for the VSI-ELM and I-ELM algorithms is shown in Table 4. For iris, the original algorithm (I-ELM) takes twice the training time of the modified algorithm (VSI-ELM). The VSI-ELM algorithm therefore trains faster on multiclass samples.
Numerical analysis shows that, compared with the I-ELM algorithm, the VSI-ELM algorithm can find the optimal number of hidden layer neurons and converge faster, making the ELM network structure more compact and its generalization ability stronger.

Practical Application Based on the VSI-ELM Algorithm
4.1. Research Background of Broken Wire Detection. Mine hoisting wire rope is one of the most critical components of the coal mine transportation system. It is responsible for the transportation of personnel, coal, and equipment, and its working condition is directly related to the safe and orderly production of the coal mine. Because mine hoisting wire rope is subjected to long-term friction, humidity, corrosion, and other harsh production conditions and bears repeated tensile and bending loads, broken wires, abrasion, corrosion, and other structural damage inevitably appear, which reduces the strength of the wire rope and endangers its safe operation. With the increasing depth of mining, the demand for wire rope that can withstand high-speed, long-term, and heavy-load conditions has become urgent. However, the complex structure of wire rope and the uncertainty of damage type and location pose many technical problems for nondestructive detection of wire rope, especially with the magnetic flux leakage method. In that case, the relation between the magnetic field change and the structure, movement mode, and stress change of the wire rope becomes more complex, which also complicates magnetic flux leakage signal detection. Magnetic flux leakage (MFL) is an efficient nondestructive testing technique for defective wire rope and plays an important role in the dynamic monitoring of wire rope [26][27][28]. Because of the intricate structure of wire rope, there is a complicated relation between the diverse damages and the MFL signals. Permanent magnets are characterized by small volume, low cost, light weight, high magnetic field, no power requirement, and ease of placement and installation. The MFL signals are gathered by arrays of Hall effect sensors arranged around the circumference, close to the outer surface of the wire rope [29]. The MFL signals are therefore influenced by the lift-off distance, velocity effect, shaking, and various properties of the defects. In a multistage ring MFL detection device, the MFL signal of each channel differs from that of the other channels [30]. All of these influencing factors are important for the design of the subsequent signal processing. In recent years, many defect detection methods have achieved good results in wire rope monitoring; meanwhile, some issues remain to be resolved. Therefore, finding the simplest and fastest method for fault feature extraction of broken wires in wire rope is an important and urgent research task. The effect of variable tensile stress on the MFL signal response of defective wire ropes has been analyzed and dealt with as needed [31]. A filtering system consisting of the Hilbert-Huang transform and compressed sensing has been used to obtain the defect RMF image characteristics of wire rope, and the characteristics were extracted as the input of a radial basis function neural network to identify the defects of wire rope [32].
To remove the effects of channel-to-channel mismatch, an adaptive method for MFL channel equalization based on PCA and ELM has been proposed [33]. Neural networks are very popular methods for classifying the MFL signals of different broken wires. The BP neural network was employed for the quantitative identification of broken wires [34]. The improved radial basis function neural network was applied for the quantitative identification of defective wire rope [35]. The wavelet neural network was used for the prediction and diagnosis of hoisting wire rope [36].
Therefore, the choice of a wire rope breakage identification method remains a problem to be solved.

Experimental Study of the Classification of Different Degrees of Broken Wires.
In this paper, a new MFL detection device is used to obtain the MFL signal. The MFL detection device is shown in Figure 6. Twenty-four Hall sensors are distributed around the MFL detection device, and every three Hall sensors form a group. The acquisition board includes Hall sensor arrays in three different directions. Each direction consists of 8 channels of Hall sensors, which are uniformly arranged on the annular circuit board. The 24 Hall sensors measure the magnetic flux leakage of the defective wire rope, and the MFL signal is recorded after the necessary amplification and filtering. The multichannel MFL signals are transmitted to the acquisition system. The time-domain and time-frequency-domain characteristics of the MFL signals of the different wire ropes are analyzed. To train the VSI-ELM algorithm, some normal samples are also needed in this experiment. The mixed-feature vector can be used as the effective characteristic input for quantitative identification when broken wires appear in the wire rope. To keep the training sample set from becoming too large, the length of each sample is fixed at 2048 data points. Table 5 shows the characteristic samples of MFL signals of broken wires, where n is the sequence number, P is the peak of the MFL wave, W is the width of the MFL wave, S is the area under the MFL wave, R is the diameter of the wire rope, d is the lift-off distance, and k is the damage type. In this section, VSI-ELM was used to identify the characteristics of MFL signals of different broken wires.
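To make the waveform features P, W, and S in Table 5 concrete, the following Python sketch computes them for one defect segment of a sampled signal. The paper does not state how the width or the area is measured, so the definitions used here are assumptions: P is the maximum amplitude, W is the extent of the region at or above half of P, and S is the trapezoidal-rule area. The function name, the sampling interval `dx`, and the sample values are hypothetical.

```python
def mfl_features(signal, dx=1.0):
    """Return (P, W, S) for one defect segment of an MFL signal.

    Assumed definitions (not from the paper): P is the maximum amplitude,
    W is the extent of the region at or above half of P, and S is the
    trapezoidal-rule area under the curve.
    """
    peak = max(signal)
    above = [i for i, v in enumerate(signal) if v >= peak / 2.0]
    width = (above[-1] - above[0]) * dx
    area = sum((signal[i] + signal[i + 1]) / 2.0 * dx
               for i in range(len(signal) - 1))
    return peak, width, area

# Illustrative triangular pulse, not measured data.
print(mfl_features([0.0, 1.0, 2.0, 1.0, 0.0]))  # (2.0, 2.0, 4.0)
```

A feature vector for the classifier could then combine these three values with the rope diameter R and the lift-off distance d, with the damage type k as the label.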
For MFL signals, the characteristic samples include training samples and testing samples. The number of training samples is 80, and the number of testing samples is 80. With VSI-ELM, the classification accuracy for broken wires reaches a best value of 97.5%. Compared to the I-ELM algorithm, VSI-ELM not only obtains the optimal number of hidden nodes but also converges quickly. The experimental results show that the VSI-ELM algorithm offers faster classification speed and higher classification accuracy for different broken wires.

Conclusions
In this paper, the theory of ELM based on the single-hidden layer feed-forward neural network is reanalyzed. The classification model of ELM is theoretically deduced, and the existing improved ELM methods are compared. The number of hidden layer neurons of ELM is analyzed in particular, so the key is the hidden layer neuron growth strategy. This article focuses on the analysis of the influence of the number of hidden layer nodes on the performance of ELM.
Numerical simulation on the UCI data sets is used to test the effect of the number of hidden layer neuron nodes of ELM. Comparative analysis shows that I-ELM algorithm 2 has better performance. It is verified that I-ELM algorithm 2 is more conducive to completing sample training by stacking hidden layer neurons.

(a) After the classification accuracy increases sharply and reaches the maximum value, it decreases gradually, as shown in Figures 2(a), 2(e), and 2(k).
(b) After the classification accuracy increases sharply and reaches the maximum value, it first decreases and then stabilizes, as shown in Figures 2(b), 2(c), 2(d), and 2(i).
(c) After the classification accuracy increases sharply and reaches the maximum value, it first decreases, then rises, and approaches a new stationary value, as shown in Figures 2(f), 2(h), and 2(j).
(3) There is a big difference in the classification accuracy of ELM even when the numbers of hidden layer neuron nodes are not much different; the difference in classification accuracy even exceeds 20%. As shown in Figure 2(g), the classification accuracy of wine presents a banded distribution.

Figure 4: Comparison between I-ELM 1 and I-ELM 2 about classification accuracy updating curves of training samples for the iris data set.

Table 1: The common hidden layer activation functions of ELM.

Firstly, compare $E_{11}$, $E_{21}$, and $E_{31}$; the structure with the smallest of these errors determines the number of hidden nodes. Suppose k = 0 and the SLFNN has $L_0$ hidden nodes; if the corresponding output error satisfies $E_{11} \le \varepsilon$, $E_{11} \le E_{21}$, and $E_{11} \le E_{31}$, the growing procedure is finished. If $E_{21}$ is the minimum of $E_{11}$, $E_{21}$, and $E_{31}$, VSI-ELM chooses the negative growth of hidden nodes. If $E_{31}$ is the minimum of $E_{11}$, $E_{21}$, and $E_{31}$, VSI-ELM chooses the positive growth of hidden nodes. For example, if $E_{31}$ is the minimum, the next update of the number of hidden nodes is $L_0 + (2^{1-1}) + (2^{2-1})$ and the corresponding output error is $E_{32}$. Secondly, compare $E_{32}$ and $E_{31}$; if $E_{32} < E_{31}$ and ‖E‖ > ε, the next update of the number of hidden nodes is $L_0 + (2^{1-1}) + (2^{2-1}) + (2^{3-1})$ and the corresponding output error is $E_{33}$. In addition, if $E_{36} > E_{35}$ and ‖E‖ > ε, the number of hidden nodes stops growing with the doubled step, and it is updated to $L_0 + (2^{1-1}) + \cdots + (2^{5-1}) + (2^{1-1})$. Using this method, the best number of hidden nodes can be found until ‖E‖ < ε.
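As a worked illustration of the positive growth steps described above, assume a hypothetical initial value $L_0 = 10$ (this value is chosen only for the example and is not taken from the paper) and suppose the error keeps decreasing up to $E_{35}$ and then rises at $E_{36}$:

```latex
\begin{align*}
E_{31}:&\quad L = 10 + 2^{1-1} = 11,\\
E_{32}:&\quad L = 11 + 2^{2-1} = 13,\\
E_{33}:&\quad L = 13 + 2^{3-1} = 17,\\
E_{34}:&\quad L = 17 + 2^{4-1} = 25,\\
E_{35}:&\quad L = 25 + 2^{5-1} = 41,\\
E_{36} > E_{35}:&\quad L = 41 + 2^{6-1} = 73 \text{ is rejected},\\
\text{restart}:&\quad L = 41 + 2^{1-1} = 42.
\end{align*}
```

The step doubles while the error falls, so a distant target is reached in a few updates, and the step returns to one node as soon as an update overshoots.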

Table 2: Data set description of the selected UCI regression problems.

Table 3: Data set description of the selected UCI classification problems.

Table 4: Performance comparison of VSI-ELM and I-ELM in the UCI set classification problem.

Table 5: The characteristic samples of MFL signals of broken wires.

Based on the above analysis, a novel adjustment strategy for the hidden layer neuron nodes of ELM (the VSI-ELM algorithm) is proposed in this paper. The feasibility of VSI-ELM is verified on the UCI classification data sets (statlog (heart), diabetes, parkinsons, and iris). The ratio of the training time of I-ELM to that of VSI-ELM for statlog (heart), diabetes, parkinsons, and iris is 50.64, 18.08, 28.89, and 2.86, respectively. The experimental results show that the VSI-ELM algorithm can find the best number of hidden layer neuron nodes faster than the I-ELM algorithm. Finally, the VSI-ELM algorithm is applied to identify the characteristics of the MFL signals of different broken wires. With VSI-ELM, the classification accuracy for broken wires reaches 97.5%.