Ensemble Learning Improves the Efficiency of Microseismic Signal Classification in Landslide Seismic Monitoring

A deep-seated landslide can release numerous microseismic signals during creep-slip movement, including rock-soil slip on the slope surface and rock-soil shear rupture in the subsurface. Machine learning can effectively enhance the classification of microseismic signals in landslide seismic monitoring and help interpret the mechanical processes of landslide motion. In this paper, eight sets of triaxial seismic sensors were deployed inside the deep-seated Jiuxianping landslide, China, and a large number of microseismic signals related to the slope movement were obtained through one year of continuous monitoring. All the data were passed through a seismic event identification model based on the ratio of the short-time average to the long-time average (STA/LTA). We selected 11 days of data, manually classified 4131 events into eight categories, and created a microseismic event database. Classical machine learning algorithms and ensemble learning algorithms were tested. To evaluate the seismic event classification performance of each algorithmic model, we assessed the proposed algorithms in terms of the accuracy, precision, and recall of each model. The validation results demonstrated that the decision tree, the best-performing classical machine learning algorithm, had an accuracy of 88.75%, while the ensemble algorithms, including random forest, Gradient Boosting Trees, Extreme Gradient Boosting, and Light Gradient Boosting Machine, achieved accuracies ranging from 93.5% to 94.2% and also performed better in the combined evaluation of precision, recall, and F1 score. The classification tests for each individual microseismic event category showed the same pattern. These results suggest that ensemble learning algorithms outperform classical machine learning algorithms for this task.


Introduction
A landslide is a kind of geologic hazard that arises when the stability of a slope is compromised, primarily in mountainous areas. It poses a significant peril to both the natural environment and human life [1]. According to the National Bureau of Statistics of China, an annual average of 8961 geological disasters, 6148 landslides, and 1807 avalanches occurred from 2012 to 2021 in China. During that decade, geological disasters caused 4550 fatalities and direct economic losses of 4,446,191,000 CNY. Landslides predominantly occur in mountainous terrains, inclines within hilly regions, riverbanks, artificial embankments, and excavations, among other vulnerable areas [2]. They pose a threat to engineering and construction projects. Minor events can impact building construction; in more severe cases, they can cause significant damage to buildings, disrupt traffic flow, and hinder normal road transport. Moreover, large-scale landslides can lead to river blockages, the destruction of roads and infrastructure facilities such as factories and mines, and the burial of villages. The economic repercussions of these events are challenging to quantify accurately. To enhance the safety of individuals residing near disaster-prone sites, monitoring systems and early warning mechanisms for hazardous slopes have been proposed, with increasingly stringent requirements.
Landslides and rock slides exhibit brittle, destructive behavior, releasing energy during their deformation and destruction processes. This phenomenon generates elastic waves that propagate through solid media, with seismometers capable of capturing the location of their generation and the magnitude of the released energy [3]. Microseismic monitoring is a technique specifically designed to capture micro-rupture signals [4]. In recent years, microseismic monitoring technology has played an increasingly significant role in the development of landslide monitoring. In the early warning detection of geologic hazards in the slope environment, a variety of equipment, such as geo-radar, inclinometers, and seismometers, can be utilized for the real-time monitoring of landslides and collapses. Variations in microseismic signals are analyzed to reflect geohazard phenomena [5]. Signal monitoring is typically accomplished using instruments that ensure uninterrupted 24 h surveillance, thereby providing ample data for subsequent processing. The available microseismic instruments not only capture heterogeneous signals generated naturally but also record signals originating from anthropogenic activities. Given that man-made sources can introduce interference when analyzing source activity, it becomes imperative to classify the sources of microseismic signals as an initial step in signal processing analysis.
Waveform recognition and automatic processing can be roughly divided into two main steps: (1) microseismic event arrival-time detection and (2) event classification. There are numerous research methods both domestically and internationally for arrival-time detection: (1) The short-term average (STA)/long-term average (LTA) method [6] reflects variations in the signal amplitude and frequency. When an abnormal event occurs, the value of the STA/LTA changes abruptly. By observing the ratio, it is possible to determine when microseismic events occur and to distinguish them from background noise. However, this method heavily relies on selecting an appropriate threshold value, making it crucial to choose a suitable threshold carefully in practical applications. (2) In intercorrelation [7], a signal is convolved with another signal incorporating a specific time delay, resulting in an intercorrelation function. The magnitude of this mutual correlation function directly corresponds to the level of similarity between the two signals. This technique is employed to compare consecutive signal sequences and identify anomalous variations, specifically pinpointing the temporal occurrence of microseismic events. Other commonly utilized methodologies include template matching, the wavelet transform, and the Akaike information criterion (AIC) [8], among others.
Feng et al. [1] conducted a comparative analysis of the STA/LTA method and the intercorrelation method for detecting microseismic event arrivals. They applied these two methods to a short signal trajectory consisting of two strong events and three weak events while also discussing the appropriate detection thresholds. Their experimental results revealed that the intercorrelation method is susceptible to environmental noise; it is therefore more suitable for stable monitoring environments, for filtered signals with high signal-to-noise ratios, and for single-event detection. The template matching method [9] relies on collecting enough templates of microseismic events, and its recognition accuracy depends on the number of templates used in the training process. Because templates are tied to their signal sources, different environments may affect the trained model differently, limiting the robustness and portability of the model. The microseismic signals discussed in this paper are derived from the natural state in the field, where human activities and the meteorological and geological environments are subject to a certain degree of variability, which places higher requirements on the detection methods. Therefore, this paper adopts a detection model based on the STA/LTA method.
The use of machine learning algorithms in geoscience data processing has increased with the development of artificial intelligence technology [10][11][12][13]. Their powerful generalization ability has been demonstrated in several practical applications. Currently, experts and scholars worldwide have conducted extensive research on landslide interpretation, landslide susceptibility evaluation, and other fields using machine learning. Integrating remote sensing technology, the big data cloud platform, and engineering application processing technology enables the implementation of a machine learning algorithm to identify potential landslide hazards. This approach establishes a high-precision intelligent identification model, which is of great practical significance for disaster prevention and mitigation.
Meanwhile, scholars have conducted research on classifying microseismic events using machine learning. Long et al. [14] analyzed the differences in the waveform parameter characteristics of typical signals acquired by the microseismic system at the Asher copper mine. They proposed a method of identifying rock rupture signals based on the decision tree (DT) classification algorithm and carried out a comparative analysis of its identification accuracy. The research results reveal that the distribution ranges of the parameters overlap to differing extents; therefore, rock rupture signals cannot be effectively identified using a single parameter. To eliminate the influence of noise signals, a DT classification model was used to construct a rock body rupture signal recognition model. This model effectively eliminates the influence of noise signals, resulting in a recognition accuracy rate of 97.8%, significantly higher than the 73.9% of the Support Vector Machine model. Provost et al. [15] used a random forest (RF) classifier based on seismic attributes, proposing 71 effective seismic attributes that can represent microseismic events. Models were developed to detect four types of seismic sources on a seismic network of eight sensors located on the Super-Sauze clay-rich landslide in the Southern Alps of France. The model achieved a sensitivity of up to 93% when compared to the manually interpreted catalog used as a reference. Based on this, Langet [16] utilized a convolutional neural network (CNN) to automatically classify 15 years of seismic signals recorded by a network of eight geophones installed around the steep rock slopes of Aknes, Norway. The classifier's performance was estimated to be close to 80%. Malfante et al. [17] investigated 109,434 volcanic seismic events acquired at the Ubinas volcano, constructed a new model based on the Support Vector Machine (SVM), and achieved a correct classification rate of 92%. Maggi et al. [18] distinguished eight categories of signals based on microseismic data acquired by the Piton de la Fournaise Volcanic Observatory: summit and deep volcanic tectonic events, local, regional, teleseismic, T-phase, rockfall, and sonic. They constructed a classifier using an RF classifier and achieved a classification accuracy of 96%. The study did not include subjective evaluations. Peng et al. [19] used ten machine learning algorithms to discriminate between microseismic events and blasts, including DT, RF, logistic regression (LR), SVM, K-nearest neighbor (KNN), Gradient Boosting Trees (GB), Naive Bayes (NB), Bagging, AdaBoost, and Multi-layer Perceptron classification (MLP). The results showed that LR had the best performance in parameter identification, with a cross-validation accuracy above 0.95.
This paper compared different machine learning models on the microseismic event dataset created from the Jiuxianping landslide in Yunyang, Chongqing. The aim was to identify the models with the highest classification accuracy and efficiency. Firstly, a category dataset was established to evaluate the strengths and weaknesses of the algorithms. The dataset contained feature parameters such as the maximum amplitude, maximum frequency, mean-to-peak ratio, duration, and center frequency. At the same time, the microseismic events were classified into categories according to specific rules. Secondly, machine learning models were built to classify the microseismic events. Finally, the performance of each model was analyzed based on the experimental results, and their advantages and disadvantages were summarized.

Dataset
The dataset used in this experiment is derived from the Jiuxianping deep-seated landslide in Chongqing, China. The landslide is an ancient landslide, located in Yunyang County, on the left bank of the Yangtze River. The landslide plane has a "golden bell" shape; the longitudinal length is about 1200 m and the average width is about 1200 m. The total plane area is about 1.44 km², the average thickness is about 40 m, and the volume is about 5700 × 10⁴ m³ (see Figure 1). We deployed a network of eight microseismic monitoring stations across the landslide area, each equipped with three channels to capture comprehensive seismic data. Utilizing these channels, we recorded the east-west, north-south, and vertical components of the microseismic signals, enabling thorough monitoring of the seismic activity within the landslide zone. Subsequently, we extracted the characteristic parameters from these microseismic signals, which were then employed as inputs for the machine learning models tasked with classification. Furthermore, we conducted statistical analyses on the microseismic events classified within the same category, thereby facilitating inference regarding the underlying nature of these events.
The dataset employed for evaluating each model encompassed a cumulative count of 4131 microseismic events spanning from January 2021 to April 2022. The STA/LTA method was used for event detection. In this paper, we set the short-time window (stw) to 0.4 s and the long-time window (ltw) to 14 s, taking into account the minimum error. Threshold 1 for event detection was set to 4 and was used to identify the start time of an event; Threshold 2 for checking the start time of an event was set to 2, a lower threshold that helps to pick the start time of an event accurately. The minimum event duration (MINevent) was 0.4 s, the same as the length of the stw. The minimum interval (MINinterval) was 14 s, the same as the length of the ltw; this parameter was used to separate consecutive events.
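The detection scheme above can be sketched directly. The snippet below is a minimal illustration of an STA/LTA trigger using the window lengths and thresholds stated in this section (0.4 s, 14 s, thresholds 4 and 2), applied to an invented synthetic trace rather than the monitoring system's actual data; the event-merging step based on MINinterval is omitted for brevity.

```python
import numpy as np

def sta_lta(signal, fs, stw=0.4, ltw=14.0):
    """Compute the STA/LTA ratio of a signal's absolute amplitude.
    stw/ltw are the short- and long-window lengths in seconds."""
    ns, nl = int(stw * fs), int(ltw * fs)
    env = np.abs(signal)
    # moving averages via cumulative sums
    csum = np.concatenate(([0.0], np.cumsum(env)))
    sta = (csum[ns:] - csum[:-ns]) / ns
    lta = (csum[nl:] - csum[:-nl]) / nl
    n = min(len(sta), len(lta))
    # align both window series at their common end samples
    return sta[-n:] / np.maximum(lta[-n:], 1e-12)

def detect_events(ratio, fs, thr_on=4.0, thr_off=2.0, min_dur=0.4):
    """Trigger where the ratio exceeds thr_on, extend the event in both
    directions while the ratio stays above thr_off, and keep events
    longer than min_dur seconds (MINevent)."""
    events, i, n = [], 0, len(ratio)
    while i < n:
        if ratio[i] >= thr_on:
            start = i
            while start > 0 and ratio[start - 1] >= thr_off:
                start -= 1
            end = i
            while end + 1 < n and ratio[end + 1] >= thr_off:
                end += 1
            if (end - start + 1) / fs >= min_dur:
                events.append((start, end))
            i = end + 1
        else:
            i += 1
    return events

# synthetic trace: flat background with a 1 s burst starting at t = 20 s
fs = 100
sig = np.full(30 * fs, 0.1)
sig[2000:2100] = 5.0
ratio = sta_lta(sig, fs)
events = detect_events(ratio, fs)
```

On this toy trace the ratio sits near 1 in the background and jumps well above the trigger threshold during the burst, yielding a single detected event.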

Features
Effective feature extraction methods can highlight the differences between different types of microseismic signals. To obtain good-quality seismic event classification, the choice of seismic features is critical. Cracks in rocks release energy in the form of microseismic waves, and the source parameters of microseismic events differ depending on the fracture mode of the rocks. Based on these characteristics, the characteristic parameters of the received microseismic events can be used as a criterion to distinguish signals [19]. Throughout its installation, the microseismic monitoring station continuously monitored the surface rock rupture and surrounding signals of the landslide in Yunyang County, Chongqing. The signals can be quantitatively represented by features, and the expressive power of those features directly affects the accuracy of microseismic event recognition.
Many features have already proven effective in this field, such as features based on signal waveform attributes and characteristic parameters based on spectrum attributes, polarity, etc. These reflect the characteristics of the signal from a certain perspective, allowing the classifier to obtain a better classification result.
In this paper, 60 features are calculated to describe the signal across the different dimensions of the time domain, frequency domain, time-frequency domain, and multi-station network. These features were recommended by the referenced literature: Hibert et al. [20], Provost et al. [15], and Wenner et al. [21]. They were selected after a comprehensive analysis of the actual signal conditions and the specific characteristics of the research site. All the features are automatically calculated from the original signal without further manual intervention; only the frequency bands used for filtering in the kurtosis attribute need to be set (based on the actual situation of the signals studied in this paper, the bands 5-10 Hz, 10-50 Hz, 5-70 Hz, 50-100 Hz, and 5-100 Hz were used).
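A handful of these attributes can be computed directly from the raw trace. The NumPy sketch below shows four illustrative examples (maximum amplitude, kurtosis, dominant frequency, spectral centroid); it is not the full 60-feature extractor used in the study, and the test signal is synthetic.

```python
import numpy as np

def extract_features(sig, fs):
    """Compute a few illustrative waveform/spectral attributes of a trace."""
    feats = {}
    # time-domain attributes
    feats["max_amplitude"] = np.max(np.abs(sig))
    x = sig - sig.mean()
    m2 = np.mean(x**2)
    # excess kurtosis: a measure of how impulsive the signal is
    feats["kurtosis"] = np.mean(x**4) / (m2**2 + 1e-20) - 3.0
    # frequency-domain attributes from the amplitude spectrum
    spec = np.abs(np.fft.rfft(sig))
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fs)
    feats["dominant_frequency"] = freqs[np.argmax(spec)]
    feats["spectral_centroid"] = np.sum(freqs * spec) / (np.sum(spec) + 1e-20)
    return feats

# a 2 s pure 20 Hz sine sampled at 200 Hz, for illustration
fs = 200
t = np.arange(2 * fs) / fs
feats = extract_features(np.sin(2 * np.pi * 20 * t), fs)
```

For this pure tone the dominant frequency lands on 20 Hz and the excess kurtosis is negative, as expected for a sinusoid.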

Manual Classification Standard
The manual classification dataset was meticulously curated to serve as a foundational resource for both the algorithmic model training and subsequent accuracy validation. Drawing inspiration from the classification criteria proposed by Langet et al. [16] in their work on automatic seismic signal classification recorded on the Åknes rock slope in Western Norway, our dataset delineates distinct seismic event classes.
The primary class of seismic events, termed slopequakes, encompasses phenomena associated with slope fracturing or sliding. This class is further stratified, based on criteria such as the frequency range, duration, and peak intensity, into subcategories including high-frequency slopequakes (HF), low-frequency slopequakes (LF), successions of high-frequency events (HFS), successions of low-frequency events (LFS), and two specialized types denoted as dual-frequency center events (HLF and HLFS). Examples of seismic signals for the different types of events are shown in Figure 2. The HLF and HLFS types were identified during the manual classification process. As can be seen in Figure 2, such a microseismic event exhibits a double frequency center, between 10-20 Hz and 20-40 Hz, respectively, and the maximum frequency is generally greater than 30 Hz. The two types are distinguished by duration: events lasting less than 5 s are labeled HLF, while events lasting more than 5 s are labeled HLFS. Additionally, the dataset captures surface processes triggered by slope steepness and instability, such as rockfalls. Moreover, it includes a category for environmental noise (N), encompassing various sources of ambient noise. Notably, seismic events attributed to earthquakes were absent during the temporal scope of the dataset, thus warranting their exclusion from consideration. The manual classification standard can be found in Appendix Table A2. We analyzed the waveforms and spectrograms corresponding to the microseismic events and classified them according to our manual classification criteria. The microseismic event dataset includes a total of 4131 microseismic events from January 2021 to April 2022. To uphold the dataset's randomness and representativeness amidst seasonal and weather fluctuations, the dates and time intervals were randomly sampled.
The dataset encompasses the categories N, LF, HLF, HF, HFS, LFS, HLFS, and rockfall. The distribution of samples within each category is outlined in Table 1 and Figure 3, alongside the corresponding count of microseismic events per category. Because different microseismic event types occur with different probabilities, the number of events obtained through manual processing varies, and consequently the number of events per type in the dataset differs. For example, rockfall events have a lower probability of occurrence in real situations and are therefore rarer, resulting in less data for this event type and a smaller proportion of the dataset.

Methods
Typically, human scrutiny of event data is indispensable for acquiring microseismic event classification outcomes. The training dataset, grounded in manual classification, furnishes details regarding each event alongside corresponding feature parameters. Leveraging machine learning techniques, patterns embedded within numerous parameters can be discerned, thereby facilitating the automated derivation of microseismic event classifications while curtailing human involvement. In this study, a spectrum of classification methods was employed, encompassing conventional machine learning algorithms as well as those integrating ensemble learning principles. Subsequently, we elucidate the operational mechanisms of select algorithms.

Naive Bayes
NB is grounded in Bayes' Rule: the training phase estimates the prior probability of each category and the conditional probability of observing the features given that category, while the testing phase computes the posterior probability for each potential category. The category with the highest posterior probability is then designated as the final classification outcome. The Naive Bayes algorithm is adept at handling high-dimensional data and exhibits good classification performance; however, it simplifies probability calculations by assuming feature independence within each category [22,23].
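The rule can be written compactly. The sketch below assumes Gaussian per-feature likelihoods (as in Gaussian Naive Bayes) and uses invented class statistics; it is illustrative, not the study's implementation.

```python
import numpy as np

def nb_predict(x, priors, means, stds):
    """Pick the class maximizing log P(c) + sum_j log N(x_j; mean_cj, std_cj).
    Working in log space avoids numerical underflow of the product."""
    log_post = np.log(priors).copy()
    for c in range(len(priors)):
        log_post[c] += np.sum(
            -0.5 * ((x - means[c]) / stds[c]) ** 2 - np.log(stds[c])
        )
    return int(np.argmax(log_post))

# two illustrative classes with unit-variance features
priors = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
stds = np.array([[1.0, 1.0], [1.0, 1.0]])
label = nb_predict(np.array([4.5, 5.2]), priors, means, stds)
```

A sample near the second class centroid is assigned to class 1; a sample near the origin would fall to class 0.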

Logistic Regression
LR is based on the logistic function, which estimates the probability of an event occurring by applying a linear combination of the input features and model parameters to the logistic function [24]. This function maps continuous values into the range between 0 and 1, signifying the likelihood of event occurrence. Logistic regression addresses multi-category problems through a one-vs-rest strategy, designating one category as the "positive category" and the remaining K − 1 categories as "negative categories". Each binary classifier is trained using a logistic regression model to distinguish between the positive and negative categories. During prediction, all K binary classifiers are applied to a new event, each returning a probability score indicating the likelihood of the data point belonging to its positive category. The category with the highest probability score is then selected as the final classification outcome.
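The one-vs-rest prediction step can be sketched as follows, with illustrative (not fitted) weights for three hypothetical categories.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real score into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def ovr_predict(x, weights, biases):
    """One-vs-rest: each of the K binary classifiers scores the sample
    with a sigmoid; the class with the highest probability wins."""
    scores = sigmoid(weights @ x + biases)  # shape (K,)
    return int(np.argmax(scores))

# K = 3 toy classifiers; rows are the per-class weight vectors
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
b = np.zeros(3)
pred = ovr_predict(np.array([0.2, 2.0]), W, b)
```

Because the sigmoid is monotonic, taking the argmax of the probabilities is equivalent to taking the argmax of the raw linear scores.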

Support Vector Machine
The primary objective of the SVM is to determine the optimal classification hyperplane for the K feature classes in the feature space, maximizing the classification margin while ensuring classification accuracy; feature data are mapped into this space to achieve category classification [25][26][27]. The classification margin is the distance between the samples closest to the classification hyperplane and the hyperplane itself. The construction of the optimal hyperplane is therefore translated into an optimization problem whose solution selects the most suitable classification hyperplane. The algorithm seeks an extreme-value solution that is a global optimum rather than a local minimum, a characteristic that enhances the SVM's generalization ability to unknown samples [24].

Linear Discriminant Analysis
Linear Discriminant Analysis (LDA) is frequently employed for dimensionality reduction and feature extraction in classification problems, linearly projecting high-dimensional data onto a lower-dimensional space [28,29]. LDA requires that, after projection, sample points belonging to the same category lie close together while sample points from distinct categories are positioned farther apart. This is achieved by maximizing the ratio of inter-category variance to intra-category variance.

Perceptron
The Perceptron (PCT) is similar in principle to the Support Vector Machine: the aim is to find a hyperplane that separates data points of different categories. However, it considers only whether the classification is correct and ignores the size of the margin. Originally designed for binary classification, the Perceptron is extended to multi-category classification through a one-vs-rest strategy. Its important limitation is that it only works with linearly separable data, i.e., data for which there exists a hyperplane that perfectly separates the categories.

Decision Tree
DT operates on a tree structure to classify microseismic events into distinct categories [30,31]. The DT model comprises nodes and edges, with each internal node denoting a feature or attribute and each leaf node representing a category or value. Through recursive splitting, the decision tree partitions the values of each feature into smaller subsets based on the characteristics of the training set, subsequently constructing a tree that classifies all the events into specific categories.

Random Forest
RF, an ensemble learning method built upon decision trees, aggregates multiple weak decision trees through a majority-vote mechanism to determine classifications. Renowned for its scalability and ease of use, the RF algorithm not only simplifies the classification process but also ranks the importance of the features contributing to accurate classification. Moreover, RF has shown promising results in microseismic signal classification. Wenner et al. [21] applied the RF algorithm to detect and distinguish between slope instability, noise, and seismic signals, demonstrating its robust recognition capabilities even with limited training data. Similarly, Provost et al. [15] applied the RF algorithm on clay slopes, achieving up to 93% recognition accuracy relative to a manually classified catalog, further highlighting RF's efficacy in microseismic event classification.
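The feature-importance ranking mentioned above can be demonstrated with scikit-learn on a small synthetic dataset (not the landslide data); the informative feature should dominate the ranking.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 400
# feature 0 carries the class signal; feature 1 is pure noise
y = rng.integers(0, 2, n)
X = np.column_stack([y + 0.3 * rng.standard_normal(n),
                     rng.standard_normal(n)])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = clf.feature_importances_  # normalized to sum to 1
```

Inspecting `importances` shows nearly all the importance assigned to the signal-carrying feature, which is how RF can rank the 60 attributes of a real event catalog.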

Gradient Boosting Trees
Similar to RF, GB also relies on decision trees, facilitating regression or classification tasks by constructing a sequence of decision tree models. Each tree is built based on the residuals of the preceding tree. GB incrementally enhances the model by iteratively adding trees, with each new tree optimized on top of all the preceding trees, thereby progressively enhancing the model accuracy [32]. The key advantage of GB lies in its robust predictive performance and its ability to handle large-scale datasets and effectively model complex nonlinear relationships. Consequently, it finds extensive utility across diverse domains.

Extreme Gradient Boosting
Extreme Gradient Boosting (XGBoost) is an adaptable and portable boosting decision tree algorithm pioneered by Chen [33]. The parallel tree boosting methodology offered by XGBoost facilitates the rapid and precise resolution of various data science challenges. XGBoost represents an enhancement over GB, boasting high flexibility and scalability to accommodate diverse data types. Moreover, it supports customized loss functions and evaluation metrics, offering abundant parameter configurations for extensive model customization.
Enabled by optimized algorithms and parallel computing, XGBoost enables swift and efficient training and prediction processes. It excels in managing large-scale datasets and high-dimensional feature spaces, demonstrating outstanding performance in numerous machine learning competitions and real-world applications.

Light Gradient Boosting Machine
Light Gradient Boosting Machine (LGBM) is an efficient and rapid machine learning framework founded on Gradient Boosting Trees, renowned for its outstanding performance in numerous machine learning competitions.
Utilizing histogram-based techniques, LGBM effectively diminishes memory consumption and computational overhead, consequently enhancing the training speed. LGBM provides support for both column-wise and row-wise storage formats, allowing users to select the most appropriate storage scheme based on the number of features and samples, thereby further reducing memory usage.

Classification Performance
Based on the dataset constructed in this paper, microseismic events were classified using several algorithms commonly used in machine learning for multi-classification problems. These models were implemented with the scikit-learn library in Python, encompassing ten distinct machine learning algorithms. To compare the performance of the various algorithmic models, uniform parameter configurations were employed to train each model on the identical dataset.
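A comparison of this kind can be sketched with scikit-learn as below; the dataset is a synthetic stand-in for the 60-feature event catalog, and the subset of models shown is illustrative rather than the study's full configuration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# synthetic multi-class stand-in for the microseismic feature table
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "NB": GaussianNB(),
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
    "GB": GradientBoostingClassifier(n_estimators=50, random_state=0),
}
# identical split and default-style settings for every model, then
# test-set accuracy per model
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
```

Printing `scores` gives one accuracy per algorithm on the shared held-out split, mirroring the comparison protocol described above.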

Evaluation Methodology
In this investigation, model performance was comprehensively assessed using several metrics: accuracy, precision, recall, and F1 score. Accuracy denotes the ratio of samples correctly classified by the model to the total sample count. Precision signifies the proportion of samples predicted as a positive category that are indeed positive, serving as an indicator of the reliability of the model's positive predictions. Recall denotes the proportion of actual positive samples correctly identified as positive by the classifier. The F1 score combines precision and recall as their harmonic mean [24].
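These four metrics can be computed with scikit-learn; the labels below are toy values for illustration, and macro averaging is shown as one common multi-class choice (the study's exact averaging scheme is not restated here).

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

# toy predictions over three classes, for illustration only
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 2, 1, 1]

acc = accuracy_score(y_true, y_pred)                      # 6/8 correct
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
```

Here the accuracy is 0.75, while the macro precision, recall, and F1 average the per-class values so that rare categories (such as rockfall in this study) weigh equally with common ones.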

Split Ratio of the Dataset
To investigate the influence of the training set size on the efficacy of the machine learning algorithms, the train_test_split function from the scikit-learn package in Python was employed to partition the original dataset into training and test subsets proportionally. To assess the impact of varying training set sizes on model accuracy, the test-set proportion was varied in increments of 10%, from 10% to 90% of the total dataset. Accuracy was selected as the evaluation metric to discern the advantages and drawbacks of the different training set configurations. Figure 4 shows the performance of the various machine learning models at different split sizes, with the accuracy scores plotted on the y-axis and the split sizes on the x-axis. The findings, depicted in Figure 4, reveal that the classification accuracy of the NB model is notably sensitive to changes in the training set size. LR, SVM, LDA, PCT, DT, RF, GB, XGB, and LGBM maintain stable accuracy (0.85-0.95) with minimal fluctuation. RF, GB, XGB, and LGBM consistently perform best, achieving accuracies near or above 0.9. Most models achieve better accuracy when the partition ratio is 0.2, 0.3, or 0.5. To allow more data to be involved in training, and in conjunction with the actual situation, a test-set ratio of 0.2 was selected, meaning that the training set constitutes 80% of the total dataset, with the remaining 20% reserved for testing.
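The split-size sweep can be sketched as follows; the data are synthetic and the random forest settings are illustrative, not the study's configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# synthetic stand-in for the microseismic feature table
X, y = make_classification(n_samples=500, n_features=10, n_classes=3,
                           n_informative=5, random_state=0)

accuracies = {}
for test_size in np.arange(0.1, 1.0, 0.1):
    ts = round(float(test_size), 1)  # 0.1, 0.2, ..., 0.9
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=ts, random_state=0, stratify=y)
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    accuracies[ts] = clf.fit(X_tr, y_tr).score(X_te, y_te)
```

Plotting `accuracies` against the test-size keys reproduces the kind of curve shown in Figure 4, from which a fixed split (here 0.2) is then chosen.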

Validation Accuracies
Following the determination of the training set proportions, we conducted fivefold cross-validation [34] to randomly partition the training dataset into five subsamples. This entailed training the model on four subsets while validating on the remaining subset. This process was iterated five times, with a different subset designated as the validation set in each iteration. To assess algorithm stability, we performed 10 repetitions of fivefold cross-validation for each algorithmic model. Each run entailed the random resampling of the training data, resulting in a total of 50 validation outcomes (see Figure 5).
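This validation scheme (5 folds × 10 repetitions = 50 scores) maps directly onto scikit-learn's RepeatedStratifiedKFold; the sketch below uses synthetic data and an illustrative model rather than the study's tuned classifiers.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.ensemble import RandomForestClassifier

# synthetic stand-in for the training portion of the event dataset
X, y = make_classification(n_samples=400, n_features=10, n_classes=3,
                           n_informative=5, random_state=0)

# 5 folds x 10 repetitions = 50 validation accuracies
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(
    RandomForestClassifier(n_estimators=50, random_state=0),
    X, y, cv=cv, scoring="accuracy")

mean_acc, var_acc = scores.mean(), scores.var()
```

The mean and variance of the 50 scores are exactly the stability statistics summarized in Table 2.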
Algorithm stability was evaluated by computing the mean and variance across the 50 validation results (refer to Table 2). Comparing the mean accuracy and variance of the different algorithms shows that GB and LGBM perform best on this dataset, with the highest mean accuracy (94.20%) and the lowest variance (0.49 and 0.52, respectively). RF and XGB follow closely, also showing high accuracy and low variance. In contrast, NB performs worst, with the lowest accuracy and the highest variance. Overall, the ensemble learning algorithms (GB, LGBM, RF, and XGB) outperform the traditional machine learning algorithms on this dataset. In the evaluation of machine learning models, multiple metrics are commonly used to comprehensively measure model performance; common evaluation metrics for classification models include accuracy, precision, and recall. To visualize the combined performance of these metrics, a radar chart can be used. The average values of the precision, recall, and F1 score for each model over the 50 cross-validations are shown in Figure 6. Each evaluation metric is represented as a dimension, forming a polygon. The area of the polygon intuitively reflects the model's overall performance across all metrics: the larger the area, the better the model performs on the evaluation metrics.
In terms of comprehensive performance, XGB and LGBM achieve the highest F1 score of 0.973, as well as the highest precision (0.965) and recall (0.982). RF and GB follow closely, with F1 scores of 0.956 and 0.965, respectively. Although slightly behind XGB and LGBM, both models exhibit relatively high precision and recall. Among the other models, SVM and LR also show strong performance, particularly in accuracy and F1 score. In contrast, NB and LDA demonstrate relatively lower performance: while they excel in recall, their combined precision and F1 score do not match those of the higher-performing models, especially the F1 score. These results underscore the superior performance metrics of the ensemble learning models compared to the non-ensemble models. The polygonal representation in Figure 6 provides a clear visual comparison, further highlighting the overall effectiveness of the ensemble methods in classification tasks.

Category Performance
The performance of the models varies across the different event types. For this evaluation, the microseismic events were categorized into eight types: HF, HFS, HLF, HLFS, LF, LFS, N, and rockfall. For each event type, assessments were conducted using three evaluation metrics: precision, F1 score, and recall (refer to Figure 7).
Across the spectrum of event types, the ensemble learning algorithms show superior performance. As can be seen from the figure, of the eight types, HF, HFS, LF, and N yield similar results across the various algorithms and are classified well by all of them. Conversely, for the event types HLF, HLFS, and LFS, the ensemble learning algorithms demonstrate enhanced precision, F1 score, and recall compared to the other algorithms, showcasing their effectiveness in classifying these types. Notably, the rockfall event type may not fully reflect the classification effectiveness during testing due to its limited event count.
Meanwhile, Figure 8 presents the confusion matrices for the four ensemble learning models: (a) RF, (b) GB, (c) XGB, and (d) LGBM. These confusion matrices show the specific performance of each model in the classification task, including the numbers of correct and incorrect classifications, further demonstrating their effectiveness in real-world applications.
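Confusion matrices such as those in Figure 8 can be produced as sketched below; the data and the four-class labels are placeholders for the eight real event types.

```python
# Sketch: fit one ensemble model on an 80/20 split and compute its confusion
# matrix. Synthetic data replaces the real eight-class event dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
cm = confusion_matrix(y_te, model.predict(X_te))
print(cm)  # rows: true class, columns: predicted class
```

The diagonal holds the correctly classified events per class; off-diagonal entries reveal which classes are confused with each other, which is precisely the information Figure 8 conveys for HLF, HLFS, and LFS.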

Model Optimization
Grid search and cross-validation were used to optimize the parameters of the models [35]. Grid search systematically explores all possible combinations of parameters, evaluating each combination through cross-validation to select the best one. First, we listed the hyperparameters of each model and their possible tuning ranges. Subsequently, we generated all possible parameter combinations. Each combination was evaluated using cross-validation: the dataset was split into multiple subsets, and the model was trained and validated multiple times to assess the stability and performance of that combination. Finally, based on the cross-validation results, we selected the parameter combination with the best performance. Selecting the optimal parameter values for the random forest algorithm resulted in a 0.73% improvement in model accuracy. In XGB, the accuracy improved by 0.12% following the parameter adjustments. Similarly, for LGBM, the optimization involved tuning model parameters such as n_estimators, learning_rate, max_depth, num_leaves, and min_child_samples, alongside regularization parameters, namely class_weight, reg_alpha, and reg_lambda, aimed at mitigating overfitting. Regularization parameters help control the complexity of the model by enhancing the influence of key features while effectively reducing the weights of less relevant features. As a result, the model's accuracy increased from 93.95% to 94.70%. The specific parameter values and tuning ranges are detailed in Table 3.
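The tuning procedure corresponds to scikit-learn's GridSearchCV, sketched here for the random forest with a small illustrative grid; the actual parameter values and ranges are those of Table 3, and LightGBM-specific parameters such as num_leaves would be tuned the same way through its LGBMClassifier.

```python
# Sketch of grid search with cross-validation: every parameter combination is
# scored by 5-fold CV and the best one is retained. The grid below is
# illustrative only, not the paper's Table 3 ranges.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=400, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [None, 10, 20],
    "min_samples_leaf": [1, 3],
}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print(f"best CV accuracy: {search.best_score_:.4f}")
```

After fitting, search.best_estimator_ is the model refit on the full data with the winning combination, ready for the application stage.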

Application
The monitoring station continuously observed the landslide site from 1 May 2021 to 30 June 2022. The data collected during this interval were classified using the previously established classification model. The relative fluctuations in the event counts are depicted in Figure 9, which shows the relative change R in the number of events N in each category over successive 10-day periods.
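The per-class binning behind Figure 9 can be sketched as below. The exact formula for R is not given in the text, so R = (N − N_ref)/N_ref with the first chunk as reference is an assumption here, and the event catalogue is a small hypothetical example.

```python
# Sketch: count events per class in 10-day chunks and express each chunk
# relative to the first one. Formula R = (N - N_ref) / N_ref is an assumption;
# classes absent from the reference chunk would need separate handling.
import pandas as pd

# Hypothetical event catalogue: timestamp and predicted class per event.
events = pd.DataFrame({
    "time": pd.to_datetime([
        "2021-05-02", "2021-05-03", "2021-05-08",   # first 10-day chunk
        "2021-05-14", "2021-05-15", "2021-05-16",
        "2021-05-21", "2021-05-21",                 # second 10-day chunk
    ]),
    "event_class": ["LF", "HF", "LF",
                    "HF", "HF", "LF", "LF", "LF"],
})

# Event counts per class in 10-day chunks, then normalised by the first chunk.
counts = (events.groupby([pd.Grouper(key="time", freq="10D"), "event_class"])
                .size().unstack(fill_value=0))
reference = counts.iloc[0]
R = (counts - reference) / reference
print(R)
```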
As depicted in the figure, the HF events exhibit greater variability during the initial placement of the instruments in the summer months, followed by a decrease in occurrences in the subsequent year. The HFS events demonstrate consistent characteristics, with a notable surge in the event count during the first ten days of June, followed by minimal variation throughout the year, maintaining levels similar to those observed during the initial period. Both the HLF and HLFS events exhibit elevated event counts during June and August, followed by a decrease in July and the subsequent autumn, winter, and spring seasons. The LF events peak in mid-June, decline sharply over the following month, and remain relatively consistent throughout the rest of the year. The LFS events likewise peak in mid-June, show higher counts during the summer months, and remain stable thereafter. The occurrence of N events is comparatively low and consistent throughout the year. The rockfall events display heightened occurrences in the late summer and early autumn months. Notably, the LF, HFS, and LFS event types all exhibit a sudden surge during mid-June, potentially influenced by the prevailing weather conditions at that time.

Conclusions
This paper employs ten machine learning methodologies to classify microseismic events based on a purpose-built microseismic event dataset. With default parameters, DT exhibits the highest performance among the non-ensemble algorithms, achieving an accuracy of 88.75%. The models based on ensemble learning (RF, GB, XGB, and LGBM) demonstrate superior performance, with accuracies between 93.5% and 94.2%. Taking into account the precision, recall, and F1 score, XGB and LGBM are the better choices and are suitable for tasks that require a balance between precision and recall. GB and RF also perform well, although slightly less well than XGB and LGBM. These experimental outcomes underscore the efficacy of ensemble learning in enhancing model robustness by amalgamating predictions from multiple base models, thereby mitigating over-reliance on individual models and diminishing the risk of overfitting.
Furthermore, specific classification experiments were conducted for each microseismic event category. The results reveal comparable performance between the non-ensemble and ensemble learning models for categories such as HF, HFS, HLF, LF, and N. However, a more pronounced disparity between the two groups of models is observed in the classification of the categories HLFS and LFS, suggesting inherent difficulties in distinguishing these categories from each other.
The ratio of the training set to the test set impacts model performance, with optimal results achieved at an 8:2 ratio. Additionally, parameter tuning substantially influences model training outcomes; employing a grid search methodology facilitates the identification of optimal model parameters compared to empirical settings. In light of the rapid advancements in artificial intelligence, our future endeavors will focus on exploring more efficient classification models to continually enhance the accuracy of microseismic event classification.

Ratio between 45 and 47 - Provost et al. [15]
52 Mean ratio between the maximum and the median of all DFTs Mean(max(SPEC)/Median(SPEC)) Provost et al. [15]
53 Ratio between 48 and 49 - Provost et al. [15]
54 Mean distance between the curves of the temporal evolution of the DFT maximum frequency and mean frequency - Provost et al. [15]
55 Mean distance between the curves of the temporal evolution of the DFT maximum frequency and median frequency - Provost et al. [15]
56 Mean distance between the 1st quartile and the median of all DFTs as a function of time - Provost et al. [15]
57 Mean distance between the 3rd quartile and the median of all DFTs as a function of time - Provost et al. [15]
58 Mean distance between the 3rd quartile and the 1st quartile of all DFTs as a function of time - Provost et al. [15]
59 The ratio of 2 between two different seismic stations - Feng et al. [1]
60 The ratio of 1 between two different seismic stations - Feng et al. [1]

Figure 2 .
Figure 2. Seismic signal examples for different types of events.

Figure 3 .
Figure 3. Pie diagrams showing the distribution of events within the different classes (a) in the training set and (b) in the test set, both expressed in terms of the number of events and percentages.

Figure 4 .
Figure 4. Classification accuracy with different sample sizes.

Figure 5 .
Figure 5. The accuracies obtained in 50 validations by different machine learning methods.

Figure 6 .
Figure 6. Comparison of different classifiers on each evaluation metric.

Figure 7 .
Figure 7. Performance of classification models implemented by different AI algorithms on different types of datasets. Black, gray, and white represent the accuracy, F1 score, and recall evaluation metrics, respectively. (a-h) represent the classification results of each model on the categories HF, HFS, HLF, HLFS, LF, LFS, N, and Rockfall, respectively.

Figure 8 .
Figure 8. Confusion matrices for the four models of ensemble learning.

Figure 9 .
Figure 9. Relative variation in the number of events N per class over chunks of 10 days. The reference value is taken from the first chunk.

∫_0^T y_f(t) dt, with y_f: filtered signal in the frequency range [f_1 − f_2] (Provost et al. [15])

Table 1 .
The number of events in the training and test sets.

Table 2 .
Mean accuracy and variance of the different machine learning algorithms.

Table 3 .
Table of model parameter tuning.
Author Contributions: Conceptualization, L.F. and B.X.; methodology, B.X.; software, B.X.; validation, B.X., Z.H. and S.H.; formal analysis, B.X.; investigation, B.X.; resources, L.F.; data curation, B.X.; writing-original draft preparation, B.X.; writing-review and editing, L.F., B.X. and S.H.; visualization, B.X. All the authors have read and agreed to the published version of this manuscript.

Funding: This research was funded in part by the National Natural Science Foundation of China, grant number 42107182, and in part by the Jiangxi Provincial Natural Science Foundation Office, grant number 20224BAB203040.