Advanced Computational Methods for Mitigating Shock and Vibration Hazards in Deep Mines

Gas Outburst Prediction Using SVM Optimized by Grey Relational Analysis and APSO Algorithm



Introduction
In recent years, with the extension of coal mining to deeper levels and the increase of mining depth, gas outburst has had a growing impact on the safe production of coal mines. Gas outburst threatens the lives of coal miners and also causes huge economic losses to the country [1][2][3]. The prediction of gas outburst has long been a hot topic for researchers. Currently, the commonly used prediction methods for gas outbursts mainly include the single index method, the comprehensive index method [4], gas geology unit methods [5], and geophysical methods [6,7]. However, gas outburst is a nonlinear dynamic process coupled from a variety of factors, and the prediction accuracy is not satisfactory when relying only on a single indicator or a simple combination of a few indicators. Therefore, using nonlinear artificial intelligence techniques to predict gas outburst has become a research hotspot. Some researchers have already made good progress in this direction. Yingjie Li et al. established a coupled forecasting model and achieved good forecasting results by combining Dempster-Shafer theory with traditional forecasting methods [8]. Dong Chunyou et al. proposed G-K evaluation and a rough set model to analyze and classify gas outburst grades [9]. Cheng Xia and Xu Manguan analyzed the factors affecting gas outburst based on grey system theory and selected the main control factors to simplify the prediction process and increase the prediction speed [10,11]. Zhou Xihua [12] used principal component analysis to reduce the dimensionality of the input factors of a BP neural network for gas outburst prediction, which improved the accuracy of the prediction results compared with traditional methods. However, the performance of a BP neural network depends on the selection of initial values, and it easily falls into a local optimal solution during gradient descent.
Liu Yijun, Hao Wenli, et al. [13] established a GA-BP hybrid model to predict gas concentration; the model uses an improved genetic algorithm to optimize the weights and thresholds of the BP neural network, which avoids local optimal solutions and achieves more accurate predictions than a conventional BP network. Compared with the BP neural network, SVM performs better on small-sample data [14] and has been widely used in many fields, such as fault diagnosis [15], image recognition [16], and medical diagnosis [17]. Qu Fang used an SVM model to predict gas outburst, and his work showed the effectiveness of SVM in processing small-sample data [18]. Qiao Meiying used LS-SVR to process time-series data for gas concentration prediction and achieved good results [19]. The purpose of the SVM algorithm is to find support vectors [20]. The optimization goal is to minimize the structural risk rather than the empirical risk, and the performance of an SVM model depends largely on the selection of the kernel function parameter and the penalty factor. For a specific sample set, selecting the most suitable kernel function parameter σ and penalty factor c gives the SVM model both learning ability and generalization ability. Although conventional methods such as cross-validation and grid search are relatively simple to implement, they are computationally intensive, and the resulting classification accuracy is often not good enough. Therefore, how to obtain the best parameters for a given sample set is a problem to be solved. To address it, this paper proposes an algorithm based on grey relational analysis and an SVM optimized by adaptive particle swarm optimization (APSO-SVM). The sample data of the experiment were taken from the 31004 driving face of Xinyuan Coal Mine, Yangquan City, Shanxi Province, China.
We first used the grey relational analysis method to analyze the correlation degree of the index parameters used for gas outburst prediction. The 5 parameters with lower correlation degrees were eliminated, and the 4 most influential parameters were selected as input parameters to improve the stability and accuracy of the model. Then, the APSO method was used to optimize the penalty factor and kernel parameter of the support vector machine. The established APSO-SVM model was applied to the gas outburst prediction of the test samples and compared with the SVM model and the PSO-SVM model. We further introduce four indicators (accuracy, precision, recall, and F2-score) to evaluate the prediction results.
The results show that the accuracy of the APSO-SVM model is 98.38%, the precision and recall are both 100%, and the F2-score is 1. The APSO-SVM model outperforms both the SVM model and the PSO-SVM model.

Site and Data
The experimental samples in this paper are taken from the #3 coal seam 31004 driving face of Xinyuan Coal Mine, Yangquan City, Shanxi Province, China. Xinyuan Coal Mine is classified as a gassy mine. Its geographic location is shown in Figure 1(a). The #3 coal seam is currently being mined, and its risk of gas outburst is high. The comprehensive histogram of the #3 coal seam is shown in Figure 1(b). Data collected from this coal seam are therefore representative for experimental analysis.
According to the analysis of the mechanism of gas outburst, the main factors affecting outburst are hydrogeological conditions, gas storage conditions, and coal structural parameters. According to the actual construction conditions of the #3 coal seam, we collected 162 sets of sample data containing 9 parameters (geological structure zone distance, coal seam gas content, gas release initial velocity, gas desorption index K1, drill cuttings volume, coal seam depth, coal seam thickness, coal destruction type, and coal firmness coefficient) through a combination of on-site collection and review of the Xinyuan Coal Mine files. Since it is difficult to obtain samples of gas outbursts that actually occur on site, blowout holes, top drills, and other dynamic phenomena that represent outbursts during drilling are used as the basis for judging outburst danger. According to the actual situation on site, if no measures need to be taken, it is considered that there is no gas outburst risk, indicated by the number 1. Drilling 12-15 groups of pressure relief holes means that the outburst risk is general, indicated by the number 2. Drilling 20 groups of pressure relief holes means that the outburst risk is serious, indicated by the number 3. Detailed data are shown in Table 1.

Using Grey Relational Analysis to Identify Prediction Parameters

For SVM models, too many input parameters may cause the model to overfit, and convergence will also be affected. This paper collected data for 9 indexes as input parameters. There may be correlations among the input parameters, and the correlations between the input parameters and the experimental results differ. Therefore, it is necessary to find and strengthen the input parameters that are highly related to the experimental results and to weaken or eliminate those with low correlation, so as to improve the convergence speed and prediction accuracy of the model. Grey relational analysis can effectively analyze the correlation between the input parameters and the experimental results [21]. The specific analysis process is as follows:

(1) Set the reference sequence, denoted as X_0 = (x_0(1), x_0(2), ..., x_0(n)).

(2) Set the comparison sequences, denoted as X_i = (x_i(1), x_i(2), ..., x_i(n))^T, i = 1, 2, ..., m, where m is the number of evaluation objects and n is the number of evaluation parameters.

(3) Nondimensionalize the data. The physical meaning and dimension of each gas outburst prediction parameter are different, so the parameters are hard to compare directly, or comparison yields misleading conclusions. Therefore, when performing grey relational analysis, nondimensional data processing is generally required. Commonly used methods include the extreme value method, the average value method, and the initial value method. The extreme value method is

    x_i'(k) = (x_i(k) - x_min) / (x_max - x_min),

where i = 1, 2, ..., m, k = 1, 2, ..., n, and x_max and x_min are the maximum and minimum values of the sequence. The initial value method is

    x_i'(k) = x_i(k) / x_i(1),

and the averaging method is

    x_i'(k) = x_i(k) / x̄_i,

where x̄_i is the mean of sequence X_i.

(4) Calculate the grey correlation coefficient between the corresponding elements of each comparison sequence and the reference sequence:

    ξ_i(k) = (min_i min_k |x_0(k) - x_i(k)| + ρ max_i max_k |x_0(k) - x_i(k)|) / (|x_0(k) - x_i(k)| + ρ max_i max_k |x_0(k) - x_i(k)|).

Here, ξ_i(k) is the grey correlation coefficient between the i-th evaluation object and the k-th parameter, and ρ (0 < ρ < 1) is the resolution coefficient; the smaller ρ is, the larger the differences between correlation coefficients and the stronger the distinguishing ability. Generally, ρ = 0.5.

(5) Calculate the degree of grey correlation. The correlation coefficient measures the correlation between each element of a comparison sequence and the corresponding element of the reference sequence; the coefficients of each comparison sequence are averaged as a measure of the degree of correlation between that comparison sequence and the reference sequence:

    r_i = (1/n) Σ_{k=1}^{n} ξ_i(k).

(6) Sort the grey correlation degrees. Sort the calculated grey correlation degrees r_i. If r_1 > r_2, the correlation between comparison sequence X_1 and the reference sequence X_0 is greater, and the parameter of sequence X_1 plays a greater role in gas outburst than that of X_2. If r_i is the largest value, comparison sequence X_i occupies a dominant position in the entire system.

Note (Table 1): X1 is coal destruction type; X2 is gas release initial velocity; X3 is coal firmness coefficient; X4 is coal seam gas content; X5 is gas desorption index K1; X6 is drill cuttings volume; X7 is geological structure zone distance; X8 is coal seam depth; X9 is coal seam thickness; "Risk" is the gas outburst risk.
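The six steps above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' code; the function name is ours, and we use the average-value normalization (the method the paper itself adopts for its data).

```python
import numpy as np

def grey_relational_degree(reference, comparisons, rho=0.5):
    """Grey relational degree of each comparison sequence vs. a reference.

    reference:   shape (n,)   -- e.g. the "Risk" column
    comparisons: shape (m, n) -- one row per candidate index parameter
    rho:         resolution coefficient, conventionally 0.5
    """
    # Step (3): nondimensionalize by the average-value method
    ref = reference / reference.mean()
    comp = comparisons / comparisons.mean(axis=1, keepdims=True)

    # Step (4): absolute differences against the reference sequence
    delta = np.abs(comp - ref)            # shape (m, n)
    d_min, d_max = delta.min(), delta.max()

    # Grey relational coefficients, then step (5): average over k
    xi = (d_min + rho * d_max) / (delta + rho * d_max)
    return xi.mean(axis=1)                # shape (m,)
```

Sorting the returned degrees with `np.argsort` then gives the ranking of step (6).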

Support Vector Machine Model

The support vector machine is based on statistical learning theory [22] and is currently one of the most popular machine learning methods. Its core is structural risk minimization: overall generalization ability is improved by controlling the empirical risk and the confidence range. The method maps samples that cannot be classified in the low-dimensional space into a high-dimensional space through a nonlinear transformation so that the samples become linearly separable. Given a sample set Z = {(x_i, y_i)}, i = 1, 2, ..., L, where L is the number of samples and n is the input sample dimension, construct a classification hyperplane that maximizes the margin between the classes. The hyperplane can be expressed as

    w · x + b = 0.

To solve for this hyperplane, the problem is transformed into

    min (1/2)||w||^2 + c Σ_{i=1}^{L} ξ_i,  subject to  y_i(w · x_i + b) ≥ 1 - ξ_i,  ξ_i ≥ 0,

where c is the penalty coefficient and ξ_i is a slack variable. This is a quadratic optimization problem; introducing Lagrangian multipliers, it can be transformed into the dual problem

    max Σ_{i=1}^{L} α_i - (1/2) Σ_{i=1}^{L} Σ_{j=1}^{L} α_i α_j y_i y_j K(x_i, x_j),  subject to  Σ_{i=1}^{L} α_i y_i = 0,  0 ≤ α_i ≤ c,

where K(x_i, x_j) is the kernel function. The solution α* of this problem can be obtained using the SMO algorithm, and w* and b* then follow. After the above derivation, the decision function is

    f(x) = sgn( Σ_{i=1}^{L} α_i* y_i K(x_i, x) + b* ).

The RBF kernel function has a wide convergence range and can adapt to various samples. Therefore, this paper chooses the RBF kernel, whose expression is

    K(x_i, x_j) = exp( -||x_i - x_j||^2 / (2σ^2) ).
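The formulation above can be exercised with scikit-learn's `SVC`, whose RBF kernel is parameterized by gamma rather than σ (gamma = 1/(2σ²)). This is a minimal sketch: the toy data below are purely illustrative stand-ins for the mine samples, not the paper's data.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data standing in for the mine samples: rows are samples,
# columns are index parameters, labels mimic the paper's risk levels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, 2)

sigma, c = 1.0, 10.0                 # kernel width and penalty factor
gamma = 1.0 / (2.0 * sigma ** 2)     # scikit-learn's RBF uses gamma = 1/(2*sigma^2)

model = SVC(C=c, kernel="rbf", gamma=gamma)
model.fit(X, y)
acc = model.score(X, y)              # training accuracy
```

Changing `c` and `sigma` here is exactly the knob the later optimization sections turn.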

SVM Optimization Based on Adaptive Weighted Particle Swarm Optimization

The performance of the SVM model depends on the kernel function parameter σ and the penalty factor C. The parameter σ controls the width of the Gaussian distribution around each sample. If σ is much smaller than the minimum distance between training samples, all samples become support vectors, which results in overfitting: for new samples, the SVM model has poor classification ability. If σ is much larger than the maximum distance between training samples, all samples are classified into one category, and the SVM model has no learning ability. The penalty factor C is the penalty coefficient for the deviation of misclassified samples; it adjusts the ratio between the empirical risk and the confidence range of the SVM model to improve generalization. When C becomes larger, the constraints tend to be satisfied and the fit to the data becomes tighter, so the classification effect on the training data is better, but the generalization ability decreases. When C becomes smaller, the penalty for empirical error is smaller and the complexity of the model decreases, but the empirical risk increases.
Therefore, the SVM model can obtain better learning and generalization ability by choosing appropriate parameters σ and C. Conventional kernel parameter optimization methods have low efficiency and low accuracy. Compared with conventional methods, swarm intelligence algorithms have achieved good results in SVM kernel parameter optimization [23]. Swarm intelligence algorithms commonly used in this field include the genetic algorithm [24], the particle swarm algorithm [25], the ant colony algorithm [26], and the leapfrog algorithm [27]. Optimizing SVM with the genetic algorithm can avoid local optimal solutions, but the result is greatly affected by the initial values. The ant colony algorithm offers higher accuracy, but it converges slowly and has poor robustness. The frog-leap algorithm has high solution accuracy and robustness, but its convergence is slow and its implementation is complicated. The particle swarm optimization algorithm is currently the most widely used swarm intelligence algorithm for SVM kernel parameter optimization; it has fast convergence speed and strong global optimization performance, but it easily falls into local optimal solutions. This paper uses an improved particle swarm optimization algorithm to optimize the SVM. Assume that the number of particles in the population is m and that the search space is D-dimensional; the population can be expressed as X = (X_1, X_2, ..., X_m). The position of the i-th particle (1 ≤ i ≤ m) is X_i = (x_i1, x_i2, ..., x_iD), and its flying speed is V_i = (v_i1, v_i2, ..., v_iD). The speed and position update formulas are

    v_id(t+1) = w v_id(t) + c_1 r_1 (p_id(t) - x_id(t)) + c_2 r_2 (p_gd(t) - x_id(t)),
    x_id(t+1) = x_id(t) + v_id(t+1),

where t is the number of iterations.
d is the search dimension; since the target parameters optimized in this paper are σ and C, the dimension is d = 2. i = 1, 2, ..., m indexes the particles.
c_1 is the individual learning factor and c_2 is the social learning factor; r_1 and r_2 are random numbers uniformly distributed in the interval [0, 1]. p_id(t) is the historical best position of the i-th particle at the t-th iteration, and p_gd(t) is the historical best position of the population at the t-th iteration. v_id(t) is the flight speed of the i-th particle in the d-th dimension, and x_id is its position in the d-th dimension. w is the inertia weight. The flying speed and position of the particles are adjusted through an adaptive inertia weight; a commonly used fitness-based form (for a fitness f_i to be maximized, with population average f_avg and maximum f_max) is

    w_i = w_min + (w_max - w_min)(f_max - f_i)/(f_max - f_avg)  if f_i ≥ f_avg;  w_i = w_max  otherwise.

When the target values of the particles tend to be the same or approach a local optimum, the inertia weight is increased; when the target values are relatively scattered, the inertia weight is reduced. At the same time, for a particle whose objective value is better than the average, the corresponding inertia weight is reduced, thereby retaining that particle; conversely, for a particle whose objective value is worse than the average, the inertia weight is increased to move the particle toward a better search region. The position x_id obtained at each update is used as the two parameters of the SVM for sample training, and the training accuracy is used as the optimization objective of the particle swarm algorithm.
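The update rules above can be sketched as a small adaptive-weight PSO loop. This is a hedged sketch under stated assumptions: the function name, the bounds handling, and the w_min/w_max values are our choices, and the inertia-weight rule is the commonly used fitness-based adaptation consistent with the description, not necessarily the paper's exact formula. The `fitness` callable would, in the paper's setting, train an SVM at position (c, σ) and return training accuracy.

```python
import numpy as np

def apso_optimize(fitness, bounds, n_particles=20, n_iter=200,
                  c1=1.5, c2=1.7, w_min=0.4, w_max=0.9, seed=0):
    """Adaptive-weight PSO over a box-bounded search space.

    fitness: maps a position (e.g. [c, sigma]) to a score to MAXIMIZE.
    bounds:  (low, high) per dimension, e.g. [(0.1, 100), (0.01, 1000)].
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    d = len(bounds)
    x = rng.uniform(lo, hi, size=(n_particles, d))
    v = np.zeros_like(x)
    f = np.array([fitness(p) for p in x])
    pbest, pbest_f = x.copy(), f.copy()
    g = pbest[pbest_f.argmax()].copy()

    for _ in range(n_iter):
        f_avg, f_max = f.mean(), f.max()
        # Particles better than average exploit (smaller w); worse ones explore.
        w = np.where(
            f >= f_avg,
            w_min + (w_max - w_min) * (f_max - f) / (f_max - f_avg + 1e-12),
            w_max,
        )
        r1, r2 = rng.random((2, n_particles, d))
        v = w[:, None] * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([fitness(p) for p in x])
        improved = f > pbest_f
        pbest[improved], pbest_f[improved] = x[improved], f[improved]
        g = pbest[pbest_f.argmax()].copy()
    return g, pbest_f.max()
```

With `fitness` set to SVM training accuracy and `bounds = [(0.1, 100), (0.01, 1000)]`, this mirrors the optimization loop the paper describes.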

Model Evaluation Index
This paper introduces four indicators to evaluate the prediction effect of the model:

    Accuracy = (TP + TN) / (TP + TN + FP + FN),
    Precision = TP / (TP + FP),
    Recall = TP / (TP + FN),
    F_β-score = (1 + β²) · Precision · Recall / (β² · Precision + Recall).

The gas outburst risk is divided into 3 levels: serious, general, and nonhazardous. We define the serious level as the positive class and the general and nonhazardous levels as the negative class. TP is the number of correctly classified positive samples, FP is the number of negative samples incorrectly classified as positive, FN is the number of positive samples incorrectly classified as negative, and TN is the number of correctly classified negative samples. Accuracy reflects the overall classification performance of the model. Precision is the proportion of true positive samples among the samples predicted as positive; it reflects the cost of misjudging a negative sample. Recall, also called sensitivity, is the proportion of all positive samples that are correctly predicted; it reflects the model's ability to recognize positive samples. On large-scale data, precision and recall often constrain each other, so a trade-off must be made in practical evaluation. The F_β-score is the weighted harmonic mean of precision and recall: when β = 1, precision and recall are considered equally important; if precision matters more, β is set below 1, and if recall matters more, β is set above 1. If a serious level is misjudged as general or nonhazardous, it will cause a major safety hazard. Therefore, we want recall to be as large as possible while precision remains acceptable: recall is more important, and β is set to 2 in this paper.

The Results of Identifying Prediction Parameters

The data in Table 1 are processed according to the method described in Section 3.1. The "Risk" sequence is set as the reference sequence X_0. The remaining columns are used as comparison sequences, including coal destruction type X_1, gas release initial velocity X_2, coal firmness coefficient X_3, coal seam gas content X_4, gas desorption index K1 X_5, drill cuttings volume X_6, geological structure zone distance X_7, coal seam depth X_8, and coal seam thickness X_9. The average value method is adopted for nondimensional data processing.

The grey correlation degree between each comparison sequence and the reference sequence is obtained by calculation, and the degrees are sorted. The calculation results are shown in Table 2.
According to the calculation results in Table 2, four groups of experiments are set up to investigate the prediction performance of the SVM under different indicator combinations: the comparison sequences with a correlation degree greater than 0.65, those greater than 0.6, those greater than 0.55, and all comparison sequences, respectively. Accuracy is used to compare the prediction results of the four groups of experiments. The grouping and prediction results are shown in Table 4.
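Forming the four threshold groups is mechanical once the correlation degrees are known. In this sketch the degree values are hypothetical stand-ins for Table 2 (the text only establishes that X2, X4, X5, and X6 exceed 0.65); the helper function is ours.

```python
# Hypothetical correlation degrees standing in for Table 2 (illustrative only)
degrees = {"X1": 0.52, "X2": 0.68, "X3": 0.58, "X4": 0.71,
           "X5": 0.66, "X6": 0.67, "X7": 0.61, "X8": 0.56, "X9": 0.63}

def select_parameters(degrees, threshold):
    """Keep the comparison sequences whose grey relational degree exceeds threshold."""
    return sorted(k for k, v in degrees.items() if v > threshold)

# The four experiment groups: > 0.65, > 0.6, > 0.55, and all sequences
groups = {t: select_parameters(degrees, t) for t in (0.65, 0.60, 0.55, 0.0)}
```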
It can be seen from the results in Table 4 that the first group has the highest accuracy: although the number of parameters in the first group is the smallest, its predictions are still the closest to the true values. Reducing the input parameter dimension reduces the complexity of the model, thereby reducing the calculation time and improving the efficiency of the entire system. Based on the results of the grey correlation analysis experiments, the first set of comparison sequences is finally selected as the main input parameters of the model: coal seam gas content X_4, gas release initial velocity X_2, gas desorption index K1 X_5, and drill cuttings volume X_6.

The Results of the Gas Outburst Prediction
APSO-SVM is used to predict coal and gas outburst; the prediction process is shown in Figure 2. To evaluate the predictive performance of the APSO-SVM model, the SVM model, the PSO-SVM model, and the APSO-SVM model were applied to the same set of sample data. There are 162 sets of sample data: the first 100 sets are used as training data, and the last 62 sets are used as test data. The parameters of the SVM, PSO, and APSO algorithms are initialized as follows. For the SVM model, the grid search method is used to optimize the parameters; the minimum value of the c and g exponents is -8, the maximum is 8, and the step is 0.8 (c and g are searched as powers of 2, consistent with the optimal values found below). For the PSO-SVM and APSO-SVM models, the variation range of the penalty factor c is set to (0.1, 100), the variation range of the kernel function parameter g is set to (0.01, 1000), c_1 is 1.5, c_2 is 1.7, the maximum number of generations is 200, and the population size is 20.
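The grid-search baseline above can be sketched with scikit-learn. This is an assumption-laden sketch: the powers-of-two convention is inferred from the reported optimum (c = 16 = 2^4, g = 0.0625 = 2^-4), and the training data here are random stand-ins for the 100 real training sets.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Exponent grid from -8 to 8 with step 0.8, following the libsvm convention
# of searching c and g as powers of two (an inference, not stated verbatim).
exponents = np.arange(-8, 8 + 1e-9, 0.8)
param_grid = {"C": 2.0 ** exponents, "gamma": 2.0 ** exponents}

rng = np.random.default_rng(1)            # toy stand-in for the 100 training sets
X_train = rng.normal(size=(100, 4))
y_train = (X_train[:, 0] > 0).astype(int) + 1

search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X_train, y_train)
best_c, best_g = search.best_params_["C"], search.best_params_["gamma"]
```

This exhaustive sweep (21 × 21 candidates, 5-fold cross-validated) illustrates why the paper calls grid search computationally intensive compared with PSO/APSO.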
The model training results on the first 100 sets are used to analyze the parameter optimization effects of the three algorithm models; the calculation results are shown in Figure 3. It can be seen from Figure 3(a) that the grid search method yields c = 16 and g = 0.0625, with an accuracy of 89%. Figure 3(b) shows that the PSO algorithm obtains its best parameters (c = 62.0514 and g = 0.0265) when the iteration reaches the 10th generation, with an accuracy of 91%. Figure 3(c) shows that the APSO algorithm reaches a local optimal solution at the 10th generation, with an accuracy of 93%. When the average fitness gradually approaches 93%, the algorithm adaptively adjusts the update weights so that the particles enlarge their search range and jump out of the local extreme point, and the search continues. The model obtains a better solution than in the first stage at the 39th generation, with an accuracy of 95%. The previous steps repeat: when the average fitness is close to 95%, the algorithm adjusts the update weights again and the particles' search range is increased. At the 79th generation, a solution better than that of the previous stage is obtained, with an accuracy of 97%. Within the 200-iteration limit, the best accuracy is 97%; therefore, this solution is taken as the final result.

[Figure 2 flowchart: algorithm parameter initialization; calculate each particle's fitness value; calculate and update each particle's individual best fitness; calculate and update the swarm's global best fitness; update particle velocity and position according to formulas (14), (15), and (16); check the termination condition; output the optimal SVM parameters.]

The corresponding optimal parameters are c = 8.6060 and g = 0.1336. According to the above analysis and calculation results, the accuracy of the APSO optimization algorithm is 97%, which is the best. The PSO algorithm is second, with an accuracy of 91%, while the grid search method is relatively inefficient, with an accuracy of 89%. Figure 4 shows the prediction results on the 62 test sets using the three algorithm models. Table 3 lists, for each of the three models, the number of samples predicted with serious risk, general risk, and no outburst risk, and the misjudgments at each level.
It can be seen from Table 3 that the number of samples predicted as serious risk by the SVM model is 4. Compared with the real sample data, these 4 samples are indeed serious risk, with no misjudgment, but the number of real serious-risk samples is 6, so 2 samples are missed. It can be seen from Figure 4(a) that these two serious-risk samples were misjudged as general risk. The number of samples predicted as general risk by the SVM model is 28; two of them are misjudgments, so 26 are correct predictions. The number of samples predicted as no outburst risk by the SVM model is 30.
There are 29 truly non-outburst-risk samples, so 29 of the predicted samples are correct and 1 is wrong.
In actual engineering applications, serious gas outburst risk accounts for a small number of samples, so we classify serious-risk samples as positive and general-risk and no-outburst-risk samples as negative. Based on the above analysis, the accuracy, precision, recall, and F2-score of each model can be calculated, as shown in Table 5. From the perspective of accuracy, the APSO-SVM model performs best, with an accuracy of 98.38%, followed by the PSO-SVM model with 96.77%; the SVM model performs relatively poorly. The precision of all three models is 100%, which means none of them misjudges negative samples as positive; in practice, none of the three models would raise false alarms. From the perspective of recall, the APSO-SVM model again performs best, with a recall of 100%, followed by PSO-SVM with 83.33%; the SVM model performs worst. These results show that APSO-SVM is sufficiently sensitive to serious-risk samples, which is very important for ensuring safe production. In practical engineering applications, all samples with serious risk must be identified, so the importance of recall is emphasized and the F2-score is calculated. From the calculation results, the APSO-SVM model scores 1, followed by PSO-SVM with 0.8620, while SVM performs worst with 0.7443. Based on the above analysis, the APSO-SVM model performs the best.
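The four indicators defined in the Model Evaluation Index section reduce to a few arithmetic lines. This sketch uses hypothetical confusion counts for a 62-sample test set (illustrative only, not the paper's exact Table 5 figures); the function name is ours.

```python
def evaluate(tp, fp, fn, tn, beta=2.0):
    """Accuracy, precision, recall, and F-beta as defined in the paper.

    beta = 2 weights recall more heavily than precision, matching the
    safety-driven choice of the F2-score for serious-risk detection.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    f_beta = ((1 + b2) * precision * recall / (b2 * precision + recall)
              if precision + recall else 0.0)
    return accuracy, precision, recall, f_beta

# Hypothetical counts: 4 serious-risk hits, 2 misses, no false alarms, 62 samples total
acc, prec, rec, f2 = evaluate(tp=4, fp=0, fn=2, tn=56)
```

Note how fp = 0 forces precision to 1 while the two missed positives drag the recall-weighted F2 well below 1, which is exactly the asymmetry the paper exploits.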

Conclusions
This paper proposed an algorithm based on grey relational analysis and SVM with adaptive particle swarm optimization (APSO-SVM) for gas outburst prediction. The main conclusions are as follows:

(1) The input parameters of the APSO-SVM model are selected by the grey correlation analysis method, which avoids the subjectivity of weight determination. Five parameters are eliminated from the original data by grey relational analysis, and experiments show that the accuracy of the model improves after eliminating the parameters with poor correlation. As sample data continuously grow in practical engineering applications, too many prediction parameters make model training very difficult, so this method is meaningful.

(2) The APSO algorithm is used to optimize the penalty factor and kernel function parameter of the SVM model, which improves the global search ability of the entire system. Accuracy, precision, recall, and F2-score are introduced to analyze the performance of the model. Compared with the SVM model optimized by the grid search method and by the PSO method, the APSO-SVM model established in this paper achieves higher accuracy in gas outburst prediction.

(3) The APSO-SVM model established in this paper is applied to the 31004 tunneling heading face of Xinyuan Coal Mine in Yangquan City, Shanxi Province, China. It should be mentioned that although the proposed prediction model has been shown to be more accurate, gas outburst is a complex nonlinear dynamic process affected by many factors. Some important issues remain to be studied, such as the mechanism of gas outburst and the application of newly developed methods. In addition, it is of great significance to apply the APSO-SVM model to other disaster predictions in coal mines, such as mine pressure prediction.
Therefore, in future work, we will expand the data range to further test the ability of our model in the accurate and efficient prediction of gas outburst and other disasters.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.