A Decision Tree for Rockburst Conditions Prediction

Abstract: This paper presents an alternative approach to predicting rockburst using Machine Learning (ML) algorithms. The study used the Decision Tree (DT) algorithm and implemented two approaches: (1) a separate DT model for each rock type (DT-RT), and (2) a single DT model (Unique-DT) for all rock types. A dataset containing 210 records was collected. Training and testing were performed on this dataset with 5 input variables: Rock Type, Depth, Brittle Index (BI), Stress Index (SI), and Elastic Energy Index (EEI). Other ML algorithms, namely Random Forest (RF), Support Vector Machine (SVM), Artificial Neural Network (ANN), K-Nearest Neighbor (KNN), and boosting (AdaboostM1), were implemented for comparison with the DT models developed. Evaluation metrics and the relative importance of the input variables were used to examine the characteristics of the DT methods. The Unique-DT model showed promising results, with an average F1 of 0.65 in rockburst condition prediction. Although RF and AdaboostM1 (F1 = 0.66) performed slightly better, Unique-DT is recommended for predicting rockburst conditions because of its simplicity and effectiveness.


Introduction
Underground activities, such as mining and railway and road construction, are complex geotechnical works. Their complexity can be attributed to the limited understanding of the subsurface and to the variability of geology, which make underground construction difficult. Although considerable effort has gone into improving underground construction, and operations at 2000 m below the surface have become common, uncertainties still arise that can lead to the waste of resources (time, money, and property) and, sometimes, loss of life. One such uncertainty is the instability of the rock mass, as stated by Askaripour et al. and Aydan et al. [1,2]. Meng et al. [3] and Zhu et al. [4] explained that the instability of a rock mass in deep excavations depends on the inherent properties of the rock mass, such as the type of rock, its strength, and its brittleness, and on external conditions, such as the magnitude of in situ stresses, geological structures, dynamic perturbations, and excavation sequences during underground operations. Of these factors, the magnitude of in situ stresses and the rock mass quality have the greatest impact on rock mass instability [1,5].
Rockburst, a dynamic phenomenon, is considered a type of rock mass failure that occurs around deep excavations in hard and brittle rocks in a high-stress environment [1,3,6,7]. Rockburst occurs as a result of overstressing the rock mass or intact brittle rock, when the stresses exceed the compressive strength of the material [1,8-12]. Rockburst can also be defined as a sudden and intense movement, accompanied by rock failure, in underground spaces under high-stress conditions [1,13-16]. Given the fact that [...]

Table 1. Standard classification of rockburst intensities.

| Rockburst Condition/Intensity | Failure Characteristics |
|---|---|
| None | No sound of rockburst and no rockburst activity. |
| Light | The surrounding rock is deformed, cracked, or rib spalled; there is a weak sound and no ejection phenomenon. |
| Moderate | The surrounding rock is deformed and fractured, and there is a considerable number of rock chip ejections and loose, sudden destruction, accompanied by crisp cracking, often occurring in local caverns of the surrounding rock. |
| Strong | The surrounding rock bursts severely and is suddenly thrown out or ejected into the tunnel, accompanied by a strong burst and roaring sound, air spray, and storm phenomena, with continuity that rapidly expands deep into the surrounding rock. |

Rockburst mechanisms and their prediction have been the subject of serious research over the years, and have yielded thoughtful and profound results, as shown by the works of several authors. Among the rockburst forecasting methods (laboratory, numerical, analytical, and empirical), the empirical approach is the most commonly used, due to its low cost, fast procedure, and simplicity [22].
Recent studies focusing on Machine Learning (ML) algorithms are reviewed in Table 2. The results suggest that different methods perform differently, and that some methods predict rockburst occurrence more accurately than others [23]. Improving the accuracy of rockburst prediction is therefore essential for mitigating risks and enhancing mining safety during the preliminary design phase [24]. These models have operated successfully on sets of input and output data from historical rockburst cases, affirming their capability in this context [24].

Table 2. Recent studies applying ML algorithms to rockburst prediction.

| Author | No. of cases | Input variables | Method(s) | Results |
|---|---|---|---|---|
| [...] | | | | The results, based on AUC and error rates, indicate that LRC is effective in predicting rockburst intensity |
| Xu et al. [34] | 60 | | | The results revealed minimum error rate and a very high prediction for rockburst intensity |
| Pu et al. [35] | 108 | | | The results show that moderate rockburst intensity has the best agreement with the actual circumstances |
| Faradonbeh and Taheri [36] | 134 | σt, σc, σθ, EEI | ENN, GEP, DT | The results showed the high accuracy and applicability of all three new models. However, the GA-ENN and the GEP methods outperformed the C4.5 method |
| Afraei et al. [37] | 188 | D (m), σt, σc, σθ, σθ/σc, σc/σt, EEI | NB, DT, SVM, ANN, KNN | The developed models show a high performance compared to the previous application of the empirical criteria |
| Pu et al. [38] | 246 | D (m), σt, σc, σθ, σθ/σc, σc/σt, EEI | Support Vector Classifier (SVC) | Promising results in forecasting the rockburst intensity at the Kimberlite mine in Canada were achieved |
| Kadkhodaei and Ghasemi [39] | 174 | σθ/σc, σc/σt, EEI | DT | The results show the significantly high performance of the models |
| J. Zhou et al. [21] | 196 | σt, σc, σθ, σθ/σc, σc/σt, EEI | FA, ANN, and FA-ANN | The results show a significantly high performance for all three models, based on RMSE and R² |
| J. Zhou et al. [24] | 102 | σt, σc, σθ, σθ/σc, σc/σt, EEI | CART, Boosting, and Bagging | The results, based on accuracy, indicated that the ensemble techniques proved better for the prediction, especially bagging |

NB.: σθ is the maximum tangential stress of the surrounding rock (MPa), σc is the uniaxial compressive strength of the surrounding rock (MPa), σt is the tensile strength of the rock (MPa), and EEI is the Elastic Energy Index.

[...]ing rate and the risk of getting trapped in local minima [24,25,40]; ANFIS can be time-consuming, due to the need for tuning optimal functions and rules [24,26,41,42]; SVM classifiers require extensive computations and storage, while the KNN algorithm can be computationally intensive [24,35,43,44]. Despite the existence of numerous methods for predicting rockburst conditions and their respective accuracies, developing a reliable and precise method for rockburst-prone zones remains a challenge [24].

This paper explores the use of ML algorithms, especially decision trees (DT), to predict rockburst conditions in different rock types. The goal is to develop a unique model that can effectively predict rockburst conditions, regardless of the rock type. Two distinct approaches employing decision trees are developed, and their performance metrics are compared.
Other ML algorithms, such as ANN, SVM, RF, and AdaboostM1, are also utilized. The study is based on a rockburst dataset and a predefined classification scale of four levels (Table 1). The paper is divided into Data Characterization (the database used for training and testing the models), Methodology (a brief description of the models and their evaluation), Results, and Discussion (a summary of the main results and observations).

Data Characterization
The rockburst database was compiled from studies by many authors [20,21,28-30,33,35,36,39,45]. Several empirical methods have been introduced to evaluate rockburst phenomena [1,20], and the input variables were selected on the basis of these methods [20,24,30,46,47]. The raw variables are depth, uniaxial compressive strength (σc), tensile strength (σt), maximum tangential stress (σθ), and the elastic energy index (EEI). These are converted to four variables for the prediction of rockburst condition: depth, EEI, the Brittle Index BI = σc/σt, and the Stress Index SI = σθ/σc. Depth was chosen because it influences the magnitude, distribution, and direction of the in situ stress. σc, σt, and EEI [15] reflect the properties of the surrounding rock, while σθ reflects the virgin geostatic stress condition and the influence of the shape and dimensions of the underground space on rockburst. BI was selected because rockburst is intense when BI is small and light when BI is large, and SI was selected because it relates directly to BI through Ks [20].
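The conversion from the measured quantities to the model inputs can be sketched as follows. This is a minimal illustration only: the `RockSample` class, its field names, and the numeric values are assumptions made for the example and do not come from the dataset.

```python
from dataclasses import dataclass


@dataclass
class RockSample:
    """One rockburst record; the field names are illustrative, not from the paper."""
    depth_m: float      # depth of the excavation, m
    sigma_c: float      # uniaxial compressive strength, MPa
    sigma_t: float      # tensile strength, MPa
    sigma_theta: float  # maximum tangential stress, MPa
    eei: float          # elastic energy index (dimensionless)


def brittle_index(s: RockSample) -> float:
    """BI = sigma_c / sigma_t."""
    return s.sigma_c / s.sigma_t


def stress_index(s: RockSample) -> float:
    """SI = sigma_theta / sigma_c."""
    return s.sigma_theta / s.sigma_c


def to_features(s: RockSample) -> dict:
    """Return the four numeric inputs described in the text."""
    return {
        "Depth": s.depth_m,
        "EEI": s.eei,
        "BI": brittle_index(s),
        "SI": stress_index(s),
    }


# Made-up values, chosen only to keep the arithmetic easy to follow.
sample = RockSample(depth_m=500.0, sigma_c=120.0, sigma_t=8.0,
                    sigma_theta=60.0, eei=5.0)
features = to_features(sample)
print(features)
```

Both indices are dimensionless ratios of stresses in MPa, so records from sites with very different strength levels become directly comparable.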
This paper uses a rockburst dataset with 210 records and predefined classes (Table 1) to predict rockburst using a classification approach. Table 3 and Figures 1 and 2 provide the input variables and their statistics. In summary, the statistical analysis of the dataset revealed high variance in the input variables, indicating a wide spread of the data, which may be caused by outliers.

Methods
This paper discusses the use of data mining (DM) techniques in predicting rockburst conditions. The focus is on decision trees (J48), using a nominal classification approach. The R statistical environment and the RWeka [48] package are used to execute various DM algorithms, including J48 [23,49,50], ANN [21,51], SVM [37,51], RF [30,51], and AdaboostM1 [30,51]. The modelling approach and hyperparameters used for the algorithms are detailed in Table 4, and 10-fold cross-validation (KFOLD = 10) is used for validation. Below is a brief overview of the modelling approaches used in this paper. Interested readers are encouraged to consult the references for a more comprehensive understanding of ML algorithms.

The decision tree is a popular ML algorithm used for both classification and regression tasks. It is a tree-like model in which each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class label or a numeric value. The algorithm works by recursively splitting the data on the most informative attribute at each level of the tree, thus forming a decision path from the root to a leaf node.
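J48 is Weka's implementation of C4.5, which ranks candidate splits by gain ratio. The sketch below shows that criterion for a single numeric attribute. It is a simplified illustration rather than the Weka code: the midpoint thresholds, the function names, and the toy EEI values are assumptions, and the additional corrections and pruning applied by C4.5 are omitted.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a degenerate split carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Score midpoints between consecutive distinct values; return (threshold, gain ratio).

    Assumes the attribute takes at least two distinct values.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    scored = [(t, gain_ratio(values, labels, t)) for t in candidates]
    return max(scored, key=lambda pair: pair[1])


# Toy data (not from the paper): low EEI -> "None", high EEI -> "Strong".
eei = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
condition = ["None", "None", "None", "Strong", "Strong", "Strong"]
threshold, score = best_threshold(eei, condition)
print(threshold, score)
```

A tree builder would repeat this search over every attribute at each node, split on the winner, and recurse on the two subsets until a stopping rule is met.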
Two approaches based on decision trees are described. The DT-RT approach builds a separate DT (J48) model to predict rockburst conditions for each rock type, i.e., Igneous (103 cases), Metamorphic (58 cases), and Sedimentary (49 cases). The Unique-DT approach uses the whole dataset (210 cases) to predict rockburst conditions, with the aim of reducing analysis work and improving accuracy. Both approaches aim to improve the effectiveness of the algorithm in predicting rockburst conditions.
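The difference between the two data arrangements, together with the 10-fold validation split, can be sketched as follows. The records here are placeholders that only reproduce the reported case counts; the field name `rock_type`, the helper names, and the random seed are assumptions, and the classifier itself is left out.

```python
import random
from collections import defaultdict


def split_by_rock_type(records):
    """Group records by their 'rock_type' field (DT-RT trains one tree per group)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["rock_type"]].append(rec)
    return dict(groups)


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Placeholder records that only reproduce the case counts given in the text.
counts = {"Igneous": 103, "Metamorphic": 58, "Sedimentary": 49}
records = [{"rock_type": rt} for rt, n in counts.items() for _ in range(n)]

per_type = split_by_rock_type(records)      # DT-RT: three separate training sets
folds = k_fold_indices(len(records), k=10)  # Unique-DT: one pooled set, 10 folds

for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [j for j in range(len(records)) if j not in held_out]
    # A classifier would be fitted on train_idx and scored on test_idx here.
```

With 210 pooled records each fold holds 21 cases, whereas the smallest DT-RT subset (49 sedimentary cases) leaves only 4 or 5 cases per fold, which is one reason per-rock-type estimates can be noisier.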

Multiple Algorithm Approach
This approach employs ANN, SVM, KNN, RF, and AdaboostM1 to predict rockburst conditions, using a dataset of 210 records. The goal is to compare the results based on performance metrics with other approaches.

Data Evaluation
A key problem with ML algorithms is their complexity, so their predictions must be assessed with objective measures. Here, the models are assessed with metrics derived from the confusion matrix, a table that cross-tabulates the actual classes against the predicted classes and from which metrics such as recall, precision, and F1 are computed. The performance of the data-driven models is evaluated and compared using seven metrics: Recall, Precision, F1, Accuracy (ACC), Specificity (SP), Matthews Correlation Coefficient (MCC), and Area Under the Receiver Operating Characteristic (ROC) Curve (AUC). Additionally, the interpretability of the models is assessed through sensitivity analysis, that is, the relative importance of each input variable to the output variable (rockburst condition). The equations and definitions of the metrics are as follows.
For a given class, let TP, FP, FN, and TN denote the numbers of true positives, false positives, false negatives, and true negatives, respectively. Recall measures the proportion of cases of a certain class that are correctly identified by the model:

$$\text{Recall} = \frac{TP}{TP + FN}$$

Precision, on the other hand, measures the correctness of the model when it predicts a certain class:

$$\text{Precision} = \frac{TP}{TP + FP}$$

The F1 score (F1) is the harmonic mean of precision and recall. It is also a measure of the model's classification ability, and is often considered a better indicator of the classifier's performance than the regular accuracy measure:

$$F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

Accuracy is the proportion of cases that are classified correctly. The closer the value is to one, the higher the accuracy and reliability:

$$ACC = \frac{TP + TN}{TP + TN + FP + FN}$$

Specificity is the algorithm's ability to correctly identify the negatives of each category. It is also known as the True Negative Rate (TNR):

$$SP = \frac{TN}{TN + FP}$$
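For a four-class problem such as the rockburst scale, these metrics are usually computed per class in a one-vs-rest fashion and then averaged. The sketch below assumes that convention; the confusion matrix is invented for illustration and is not taken from the paper's results, and classes with no predictions or no actual cases (which would divide by zero) are not handled.

```python
def per_class_counts(matrix, k):
    """One-vs-rest TP, FP, FN, TN for class k (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # actual k, predicted as something else
    fp = sum(row[k] for row in matrix) - tp  # predicted k, actually something else
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def class_metrics(matrix, k):
    """Recall, precision, F1, accuracy, and specificity for class k."""
    tp, fp, fn, tn = per_class_counts(matrix, k)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "recall": recall,
        "precision": precision,
        "f1": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "specificity": tn / (tn + fp),
    }


# Invented confusion matrix; class order: None, Light, Moderate, Strong.
matrix = [
    [8, 2, 0, 0],
    [1, 6, 3, 0],
    [0, 2, 7, 1],
    [0, 0, 2, 8],
]

none_metrics = class_metrics(matrix, 0)
macro_f1 = sum(class_metrics(matrix, k)["f1"] for k in range(len(matrix))) / len(matrix)
print(none_metrics, macro_f1)
```

Averaging the per-class F1 values gives a single summary score; whether the paper's reported averages are macro or weighted is not stated, so the macro average here is only one possible choice.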
AUC is the area under the ROC curve. The higher the AUC, the better the classifier, and the AUC of a perfect classifier is 1. MCC is computed from the same counts as the other metrics:

$$MCC = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}$$

MCC lies in the interval [−1, +1]. The extreme values −1 and +1 are reached in the case of perfect misclassification and perfect classification, respectively, while MCC = 0 is the expected value for a coin-tossing classifier. For all the selected metrics, the higher the value, the better the prediction.
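A minimal sketch of the last two metrics is given below. MCC is computed from one-vs-rest counts, and AUC uses its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function names and example scores are assumptions for illustration.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient for one-vs-rest counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the 2x2 table is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented scores for one class versus the rest.
scores = [0.9, 0.4, 0.6, 0.1]
labels = [True, True, False, False]
print(mcc(5, 0, 0, 5), mcc(5, 5, 5, 5), auc(scores, labels))
```

The pairwise form of AUC is quadratic in the number of cases, which is acceptable for a dataset of a few hundred records.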

Discussion
This section summarizes the main results achieved in rockburst condition prediction through the application of ML techniques. The algorithms are assessed using the evaluation metrics, and the results are organized by the modelling approach adopted in this study.

DT-RT
The performance metrics and DT representations of the DT-RT approach are shown in Table 5 and Figures 3-5. The models showed promising results for each rock type, based on the evaluation metrics. The DT model for metamorphic rocks performed best in terms of F1 score (0.74), indicating better classifier performance. The sedimentary model performed worst, which may be due to the complexity of the geological material involved and the limited amount of data.

Unique-DT
This section describes the results of using the Unique-DT approach for predicting rockburst conditions. Table 6 and Figure 6 present the performance metrics and DT diagram, respectively. The DT-RT model for metamorphic rocks leads to the best results, with an F1 score of 0.74 and an accuracy of 0.86. However, this model is restricted to a single rock type, whereas the Unique-DT model includes all rock types and showed a promising performance, with an F1 score of 0.65 and an accuracy of 0.82. The other performance metrics indicated also showed high performance. It should be stressed that the two other DT-RT models showed poorer performances than Unique-DT.

Multiple ML Algorithms
The evaluation results of several ML algorithms (ANN, KNN, RF, SVM, AdaboostM1, and Unique-DT) were compared and are presented in Table 7. RF, AdaboostM1, Unique-DT, ANN, and KNN were found to perform well in predicting rockburst conditions, while SVM showed poorer performance. The F1 and accuracy metrics for RF, Unique-DT, and AdaboostM1 were similar, with RF having a slight advantage. The study suggests that the Unique-DT model is a good alternative to other ML algorithms for predicting rockburst conditions. In addition, the ML algorithms used in this study are compared with some from other studies (Table 8).

Table 8. Results reported for ML algorithms in other studies.

| Algorithm | Reference | Reported result |
|---|---|---|
| [...] | [36] | 85.16% |
| GBM | [30] | 61.22% |
| DT | [36] | 81.48% |
| NB | [30] | 53.9-67.2% |
| Cloud | [29] | 71.05% |
| DT | [35] | 73-93% |
| GSM-SVM | [20] | 66.67-88.9% |
| LRC | [33] | 80.2-90.9% |
| GA-SVM | [20] | 66.67-80% |
| BN | [31] | 91.75% |
| PSO-SVM | [20] | 66.67-90% |
| ENN | [36] | 85.19% |
| ANFIS | [26] | 66.5-95.6% |

This section highlights the importance of interpretability in explaining ML algorithms, particularly through sensitivity analysis of the DT algorithms. The aim was to identify the relevant input variables that contribute to the prediction of rockburst conditions. Figure 7 was generated to help understand what was learned by the algorithms and to compare it with empirical knowledge.

Each rock type has distinct variables that contribute to determining rockburst conditions. In the Unique-DT model, the elastic energy index is the most influential variable (EEI = 36.73%), followed by Depth (25.84%), Brittle Index (BI = 16.13%), and Stress Index (SI = 13.85%). The three DT-RT models assign varying importance to their input variables, which indicates that the influence of a variable on rockburst depends on the rock type. For the sedimentary (SD) DT-RT model, EEI (43.01%) is the most relevant factor, followed by Depth (21.19%), BI (18.71%), and SI (17.09%). For the igneous (IG) and metamorphic (MT) models, BI (27.37%) and SI (32.21%) are the most relevant factors, respectively, and EEI is the second-most important variable in both.
These findings align with previous studies [21,30,31,33] on rockburst condition prediction, emphasizing the importance of accurately identifying underlying factors through closer examination of input data. It should be noted that all models assign high importance to the elastic energy index (EEI), which aligns with Xu and Yu [16] who presented a new prediction method for rockburst based on this index.
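The importance shares quoted for the Unique-DT and sedimentary DT-RT models can be tabulated and checked with a few lines. The sketch below uses only the percentages stated in the text; the helper names are illustrative. The four sedimentary shares sum to 100%, whereas the four Unique-DT shares leave 7.45% unaccounted for. That remainder plausibly belongs to Rock Type, the fifth input of the Unique-DT model, but this attribution is an assumption and is not stated in the text.

```python
# Relative importance (%) as quoted in the text for two of the models.
unique_dt = {"EEI": 36.73, "Depth": 25.84, "BI": 16.13, "SI": 13.85}
dt_rt_sedimentary = {"EEI": 43.01, "Depth": 21.19, "BI": 18.71, "SI": 17.09}


def ranked(importance):
    """Variables sorted from most to least important."""
    return sorted(importance, key=importance.get, reverse=True)


def unreported_share(importance):
    """Percentage not accounted for by the listed variables (rounded to 2 decimals)."""
    return abs(round(100.0 - sum(importance.values()), 2))


for name, model in [("Unique-DT", unique_dt), ("DT-RT (SD)", dt_rt_sedimentary)]:
    print(name, ranked(model), unreported_share(model))
```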

Limitations
Despite the satisfactory results obtained in predicting rockburst conditions, this work has certain limitations:

•
Due to the relatively small size of the dataset (210), it may not fully capture the variability of the rockburst conditions across different geological settings. Therefore, obtaining a larger dataset could provide more robust results.

•
The study only considers five input variables, which may not capture all the relevant factors that contribute to rockburst occurrence. Including more variables could improve the accuracy of the results.

•
The study only uses DT, and a few other ML algorithms, for predicting rockburst conditions. Other approaches, such as physics-based models or hybrid models that combine data-driven and physics-based approaches, could provide complementary insights and improve the overall prediction performance.

•
The study only uses a nominal classification approach for DT, which may not be optimal for handling continuous or ordinal variables. Using other classification approaches, such as binary or multi-class classification, could provide more flexibility and accuracy in modelling the rockburst conditions.

Conclusions
Predicting the rockburst condition plays a vital role in the safety, economy, performance, and efficiency of deep underground projects. In this research, a Decision Tree (DT), which is a simple, efficient, and accurate technique, was utilized to predict rockburst conditions for different rock types, such as igneous, metamorphic, and sedimentary, both alone and together. A new model was developed by combining datasets for each rock type and modelling it, using the DT algorithm, to predict rockburst conditions. Other Machine Learning (ML) algorithms, such as Random Forest (RF), K-Nearest Neighbor (KNN), Artificial Neural Network (ANN), Support Vector Machine (SVM), and AdaboostM1 were also utilized in predicting rockburst condition. Training and testing of the models were performed on a representative dataset of 210 records containing 5 input variables, required for forecasting rockburst conditions. The dataset contains 103 igneous rockburst cases, 58 metamorphic, and 49 sedimentary.
The approach that uses a different DT model for each rock type is very restrictive in comparison with the Unique-DT model, which has a wider domain of application. Among the DT models, the one for metamorphic rocks provided the best results. However, it can only be applied to one type of rock. Based on the evaluated metrics, the Unique-DT algorithm (F1 = 0.65) showed a very promising performance. Among the other ML algorithms compared with Unique-DT, RF (F1 = 0.66) and AdaboostM1 (F1 = 0.66) performed slightly better. Taking into account its simplicity and effectiveness, the Unique-DT model is suggested for predicting rockburst conditions.
In future work, more focus should be placed on ensemble methods, such as RF, and on boosting algorithms, such as AdaboostM1, as they have demonstrated strong performance in classification tasks. Moreover, to improve prediction accuracy, the authors intend to incorporate additional data, namely further rockburst cases and results obtained from different input variables and ML methods.
In summary, the ML algorithms have shown that they can be used for predicting rockburst conditions if data are available. The use of DT in rockburst prediction based on depth, elastic energy index, and rock strength parameters has also proved effective, and has demonstrated its merit in addressing the complex phenomenon of rockburst.

Data Availability Statement:
The data used in the present study are under privacy issues and cannot be shared.

Conflicts of Interest:
The authors declare no conflict of interest.