Landslide Susceptibility Zoning in Yunnan Province Based on SBAS-InSAR Technology and a Random Forest Model

Abstract: Yunnan Province, China, has complex topography and geomorphology, many ravines and valleys, and frequent landslide geological disasters, so assessing regional landslide hazards there is of great significance for disaster prevention and mitigation. In this study, Yunnan Province was selected as the research area, and the average annual deformation rate along the radar line of sight over the four years from 2018 to 2021 was obtained with SBAS-InSAR technology and used as one of the index factors for the susceptibility evaluation. The deformation rate reflects the slow movement of the land surface. In addition, elevation, slope, aspect, lithological classification, geological structure, rainfall, distance from roads, distance from rivers, topographic undulation, and NDVI were selected as evaluation index factors and combined with the annual mean deformation rate. A random forest model was used to evaluate landslide geological disasters in Yunnan Province and to analyze the accuracy of the evaluation. The results showed that adding the annual mean deformation rate to the random forest model as an index factor improves the prediction accuracy. The area with high susceptibility accounted for 10% of the entire province, and the landslides in that area accounted for 68% of the landslides in the province. Additionally, the susceptibility zoning results were highly correlated with the landslide distribution. The accuracy of the random forest model prediction was 0.80, and the AUC value was 0.87, indicating that the random forest model was a highly accurate and reliable evaluation method for studying landslide geological


Introduction
China's Yunnan Province has a complex environment, many ravines and valleys, and large terrain fluctuations, making it one of the regions of the country with a high incidence of geological disasters. Landslides are the most frequent geological disasters in the province and pose a great threat to people's lives and property [1].
As one of the most common disasters in Yunnan Province, landslides have been studied many times. Research on landslide susceptibility has mainly focused on two aspects: landslide susceptibility evaluation systems and landslide susceptibility methods. At present, landslide susceptibility methods mainly include empirical models (fuzzy logic, analytic hierarchy process, etc.) [2,3], statistical models (information quantity, deterministic coefficients, etc.) [4,5], and machine learning models (neural networks, decision trees, support vector machines, etc.) [6,7]. When large areas at the provincial and municipal levels are analyzed and predicted, the accuracy and applicability of empirical models and statistical analysis models have been found to be low. Machine learning models, in turn, can suffer from weak interpretability and overfitting of prediction results. With the rapid development of machine learning, the random forest model is now commonly used by scholars because of its particular advantages: like other machine learning models, it can handle the nonlinear characteristics of landslides, and it additionally introduces randomness, which helps to avoid overfitting and improve prediction accuracy [8]. Many studies have shown that the random forest model has a good prediction effect and applicability in landslide geological hazard susceptibility assessment [8][9][10][11]. Therefore, in this study, the random forest model was used to evaluate the susceptibility of landslide geological disasters in Yunnan Province, China, and to obtain more accurate landslide susceptibility zoning.
Regarding landslide susceptibility evaluation factors, in 2018, Reichenbach et al. [12] found that 596 landslide susceptibility index factors had been identified in the previous research literature. However, topography, landform, hydroclimate, and other landslide development conditions differ between study areas, so many researchers select only some of these conditions as index factors according to the actual situation in the study area [13][14][15]. For example, in 2022, for the investigation and zoning of geological disasters in Helong City, Wang X.D. et al. [16] selected 13 disaster factors: elevation, slope, aspect, curvature, lithology, distance from faults, rainfall, distance from water systems, NDVI, soil texture, hydraulic erosion, population density, and distance from roads. They evaluated the susceptibility of the study area based on the investigation and zoning of geological hazards, combined with the distribution law and influencing factors of geological disasters, and comprehensively considered five groups of factors: topography, geology, meteorological hydrology, soil and vegetation, and human engineering activities. However, the impact of land surface deformation on landslide disasters is rarely considered in the above index factors. The land surface deformation rate can reflect the development stage of a landslide. It can also reveal hidden landslide dangers that cannot be found using optical remote sensing images, allowing landslide susceptibility to be evaluated more comprehensively, so it is very useful to add the annual mean deformation rate as an index factor to the susceptibility evaluation.
Minor surface deformation is often observable before landslides occur, and this deformation information can be obtained with time-series InSAR (Interferometric Synthetic Aperture Radar) technology. InSAR technology can measure small deformations over a large area and monitors deforming areas well. This method has been applied fruitfully in landslide monitoring [17][18][19][20]. InSAR technology is not affected by weather and has wide coverage, which makes it very suitable for long-term observations and geological disaster investigations over large areas [20]. Yunnan Province is an alpine and hilly area. SBAS-InSAR (Small Baseline Subset-Interferometric Synthetic Aperture Radar) technology uses small baseline sets to obtain the deformation rate, which can effectively mitigate temporal and spatial decorrelation, and it is an effective method for monitoring surface deformation in mountainous areas. To date, few scholars have used annual mean deformation rate results obtained with InSAR technology as landslide susceptibility factors, and no researchers have applied annual average deformation rate results as influencing factors in random forest models for susceptibility evaluation. To fill these gaps, in the present study, the annual mean deformation rate obtained using time-series InSAR technology is introduced into a random forest model as an influencing factor to evaluate the susceptibility of landslide geological disasters in Yunnan Province.
The SBAS-InSAR method forms small baseline sets by selecting image pairs with short temporal and spatial baselines for differential interference processing. The small spatial and temporal baselines of the SAR image pairs in each small set that meet the relevant conditions improve the coherence of the image pairs and increase the number of differential interferograms.
Then, the phase changes in each baseline set are calculated using the least squares method to obtain the desired surface deformation information. Since the SBAS method uses multiple images as main images when forming the differential interferograms, matrix B is prone to rank deficiency. Therefore, the minimum-norm solution must be obtained with the SVD (Singular Value Decomposition) method. Then, by integrating the deformation velocity over the different time periods, the deformation accumulated in each period can be obtained [21][22][23][24].
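The inversion step described above can be summarized in a few lines of notation. This is a sketch that follows the common SBAS formulation rather than equations given in this paper: apart from matrix B, the symbols are assumptions introduced here, with \delta\phi the vector of M unwrapped differential phases, v the vector of mean phase velocities between adjacent acquisitions, t_0, ..., t_N the acquisition times, and \lambda the radar wavelength.

```latex
% Linear system relating interval velocities to the observed interferometric phases
B\,v = \delta\phi

% B may be rank deficient, so the minimum-norm least-squares solution is taken via the SVD
B = U \Sigma V^{T}, \qquad \hat{v} = V \Sigma^{+} U^{T}\,\delta\phi

% Integrating the velocities over time gives the cumulative phase at acquisition t_n
\phi(t_n) = \sum_{k=1}^{n} \hat{v}_k\,(t_k - t_{k-1}), \qquad n = 1, \dots, N

% Conversion to line-of-sight displacement (the sign depends on the chosen convention)
|d(t_n)| = \frac{\lambda}{4\pi}\,|\phi(t_n)|
```

Here \Sigma^{+} denotes the pseudoinverse of \Sigma, obtained by inverting the nonzero singular values and leaving the zero ones at zero.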

Deformation Result Acquisition
In this paper, SBAS-InSAR technology is used to obtain the annual average deformation rate in Yunnan Province, and the results are used as an index factor for the susceptibility evaluation. Sentinel-1A ascending-track data covering four years, from January 2018 to December 2021, were selected according to the extent of Yunnan Province. Sentinel-1A data are freely available; with a short revisit period and a C-band sensor, they can capture small deformations over wide areas. Due to the large area of Yunnan Province, the Sentinel-1A data comprised four tracks, and the image data on these tracks basically covered the area of Yunnan Province. Table 1 lists the number of tracks, the number of images, and the number of baselines selected for data processing. The main steps of the Sentinel-1A data processing in Yunnan Province were as follows:
(1) The first scene of each track is selected as the main image, and the other images are used as auxiliary images. Because each track contains a large number of images, baseline selection is carried out using a triangulation-based method; if pairs were instead selected using temporal and spatial baseline thresholds, unsatisfactory baselines in areas with poor coherence would have to be eliminated, which is very difficult to complete. Finally, M image pairs are obtained for each track.
(2) The auxiliary images on each track are registered with the corresponding main image; that is, all the auxiliary image data are resampled to the main image space, and the M image pairs are then interfered according to the baseline map to obtain the required interferograms. Next, an 8:2 multilook ratio is used to suppress the speckle noise in the interferograms and improve the pixel signal-to-noise ratio. The terrain phase in Yunnan Province is simulated using the external SRTM (Shuttle Radar Topography Mission) DEM (Digital Elevation Model) with a resolution of 30 m. Differential processing is then applied to the interferograms to remove the terrain phase, yielding the differential interferograms. A Gaussian filter is applied to the differential interferograms to further reduce the noise. After a coherence threshold is set, the pixels whose coherence is higher than the threshold are selected for phase unwrapping.
(3) A control point with high coherence is selected, the residual phase in the initial unwrapped phase is estimated from the phase information at the selected control point, and the residual terrain phase is removed.
(4) The singular value decomposition method is used to bring the differential interference results to a common reference, and then high-pass filtering in the time domain and low-pass filtering in the space domain are used to remove the influence of the atmosphere.
According to the deformation rate results for Yunnan Province and the analysis of the deformation rate for hidden landslide dangers, the deformation rate results are manually divided into the following seven levels in the ArcGIS software: −0.256~−0.050, −0.050~−0.010, −0.010~−0.005, −0.005~0.005, 0.005~0.010, 0.010~0.050, and 0.050~0.180 m.
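To make the seven-level grading concrete, the following minimal sketch assigns a deformation rate value to one of the levels listed above. It only illustrates the binning, not the ArcGIS workflow used in the paper; the function name `deformation_level` and the choice to place a value lying exactly on a boundary in the upper level are assumptions.

```python
from bisect import bisect_right

# Upper edges of levels 1-6, taken from the seven intervals in the text;
# level 7 is everything above the last edge (up to 0.180 in the data).
LEVEL_EDGES = [-0.050, -0.010, -0.005, 0.005, 0.010, 0.050]


def deformation_level(rate):
    """Return the level (1-7) of a deformation rate value.

    Assumption: a value exactly on a boundary falls into the upper level.
    """
    return bisect_right(LEVEL_EDGES, rate) + 1


# Hypothetical rate values, one per line, with the level they fall into.
for rate in (-0.100, -0.020, 0.000, 0.007, 0.100):
    print(rate, deformation_level(rate))
```

Using a sorted list of edges with `bisect_right` keeps the lookup independent of the number of levels, so the same function works if the intervals are revised.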

Random Forest Model Method
In 2001, Breiman proposed the random forest (RF) model, which assembles many classification trees [25]. The principle of the random forest model is to draw k sample sets from the total sample as training samples and to train one decision tree on each of them, thereby generating k decision trees that form the random forest. Each decision tree acts as a classifier that produces a classification result, and the final classification prediction is made by voting [12]. For random forest models, the training samples generally comprise two-thirds of the total number of samples, and the remaining one-third is used as test samples to test model performance. In recent years, many experimental and theoretical studies have reported that this model outperforms other susceptibility evaluation models because it not only improves prediction accuracy but also tolerates errors such as noise well; RF is therefore regarded as a well-suited model for susceptibility evaluation [13]. The prediction flow in the random forest model is shown in Figure 2.
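The principle described above (k bootstrap sample sets, one tree per set, majority voting) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the model used in the paper: each "tree" is a one-split stump on a single randomly chosen feature, standing in for a full decision tree, and the six samples (slope in degrees, deformation magnitude) are hypothetical.

```python
import random
from collections import Counter


def bootstrap(samples, rng):
    """Draw len(samples) items with replacement (one bootstrap sample set)."""
    return [rng.choice(samples) for _ in samples]


def fit_stump(samples, feature):
    """Toy stand-in for a decision tree: one split on one feature.

    The threshold is the midpoint between the class means of that feature.
    """
    pos = [x[feature] for x, y in samples if y == 1]
    neg = [x[feature] for x, y in samples if y == 0]
    if not pos or not neg:
        # Degenerate bootstrap sample: only one class was drawn.
        label = 1 if pos else 0
        return lambda x: label
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    high_label = 1 if mean_pos >= mean_neg else 0
    return lambda x: high_label if x[feature] >= threshold else 1 - high_label


def fit_forest(samples, k, seed=0):
    """Train k stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    return [fit_stump(bootstrap(samples, rng), rng.randrange(n_features))
            for _ in range(k)]


def predict(forest, x):
    """Final class = majority vote over all trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical samples: ((slope, deformation magnitude), label)
data = [((35, 0.040), 1), ((40, 0.060), 1), ((30, 0.050), 1),
        ((5, 0.001), 0), ((8, 0.002), 0), ((12, 0.003), 0)]
forest = fit_forest(data, k=25)
print(predict(forest, (38, 0.050)), predict(forest, (6, 0.001)))
```

An odd k avoids tied votes in the two-class case. In practice a library implementation with full decision trees would be used instead of stumps.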

Figure 2. Prediction flow of the random forest model.

Model Training
In this paper, landslides in Yunnan Province are identified by visual interpretation, and 1385 hidden landslide danger points with obvious deformation characteristics are delineated. These points are used as the positive data set, and the same number of random points are created with the ArcGIS software as the negative data set. A total of 2770 points are used in the susceptibility assessment. Thirty percent of the entire sample is selected as the model training set, and the rest of the data are selected as the model test sample. The training set for the random forest model measures the fitting ability of the model, while the test set reflects its generalization ability. Thus, the performance of the model should be evaluated using both the training set and the test set.
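As a small illustration of the sample construction described above, the sketch below builds a balanced set of 1385 positive and 1385 negative points and splits it using the 30% training fraction stated in the text. The tuple layout, the function name `split_samples`, and the fixed random seed are assumptions made for the example; the actual points come from visual interpretation and ArcGIS.

```python
import random


def split_samples(points, train_fraction, seed=0):
    """Shuffle the points and split them into training and test sets."""
    rng = random.Random(seed)
    shuffled = points[:]          # copy so the input order is preserved
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder records: (source, index, label); label 1 = landslide point.
positives = [("landslide", i, 1) for i in range(1385)]
negatives = [("random", i, 0) for i in range(1385)]

train, test = split_samples(positives + negatives, train_fraction=0.30)
print(len(train), len(test))
```

With 2770 points, this split gives 831 training points and 1939 test points.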

Study Area
Yunnan Province is located in the hilly area of the Qinghai-Tibet Plateau, between 21°8′32″ and 29°15′8″ north latitude and 97°31′39″ and 106°11′47″ east longitude. The elevation gradually decreases from north to south, the southeast region is lower than the northwest region, and the highest point is the main peak of Meili Snow Mountain. The province mainly contains basins, mountains, and plateaus, of which basins occupy a minimal area, while mountains and plateaus, including canyons and deep mountains, account for 94% of the area of the province. Yunnan Province features many ravines with large height differences, and the terrain is more undulating than that in other regions. Yunnan Province is located at the junction of the Eurasian plate and the Indian plate, and the new and old structural geology is quite complex and dominated by faults. Fault zones are distributed throughout the province, providing essential materials and energy for the development of geological disasters. River systems passing through the province mainly include the Nanpan River, the Nu River, the Irrawaddy River, the Red River, the Lancang River, and the Jinsha River. Most are typical mountain rivers, characterized by rapid water flow and large drops. The rivers are mainly distributed along valleys whose slopes on both sides are steep, thereby providing the main breeding area for mountain disasters [1]. Figure 3 shows the distribution map of historical landslide points in Yunnan Province.

Data Sources
The occurrence of landslide disasters is closely related to topography, hydrometeorology, human activities, and other factors and is affected by both the external environment of the landslide and the slope's own conditions; the formation mechanism is complex [26]. On the basis of previous research [27,28], combined with the development characteristics and geological environmental conditions of landslide geological disasters in Yunnan Province, as well as the difficulty and operability of obtaining influencing factors, the following influencing factors were selected for the risk assessment of landslide geological disasters in Yunnan Province: slope, aspect, elevation, and topographic undulation from topography and geomorphology; fault and lithology from geological structure; river and rainfall from meteorological hydrology; road from human activity; NDVI from vegetation cover; and, in addition, the annual average deformation rate. An index system for evaluating the risk of landslide geological disasters in Yunnan Province was then established. The landslide disaster data used in this paper consist of geological disaster data with obvious landslide signs on optical images, and the data sources of the other index factors are shown in Table 2.

Indicator Factor Data Processing
This paper uses several methods to classify and grade the index factors. Five factors (elevation, slope, annual average rainfall, annual mean deformation rate, and NDVI) were graded manually according to their characteristics, so that the landslide distribution across grades was more concentrated and reasonable. The aspect factor was graded by direction. For the fault, road, and river index factors, the buffer tool of the GIS software was used for grading with a buffer step size of 400 m. The different types of rocks were divided according to the qualitative division table of the hardness of rocks [29]. According to the degree of topographic undulation in the study area, the terrain undulation was divided into seven common types.

DEM
Landslide disasters in Yunnan Province are affected by altitude, and many other influencing factors are closely related to elevation. There are great differences in altitude among the various areas of Yunnan Province, and the difference between the north and the south reaches 6000 m. Altitude affects the potential energy of the slope body, which indirectly affects the severity of a landslide. Therefore, elevation is regarded as one of the most important evaluation factors in the process of landslide susceptibility evaluation [30].
Overall, the northwest region of Yunnan Province is higher than the southeast region. Here, the manual grading method was used to classify the 30 m resolution NASA DEM, and the elevation was statistically analyzed in five levels: 0-1000, 1000-2000, 2000-3000, 3000-4000, and 4000-6700 m. Table 3 shows the number and proportion of landslides at each elevation level in Yunnan Province. Figure 4 shows the distribution of landslides in Yunnan Province. As shown in Table 3, landslide disasters mainly occur at elevations of less than 3000 m.

Slope
Slope refers to the steepness of the slope body, which has a direct impact on the stability of the slope body. When a landslide occurs, it requires a certain effective landing surface, and there is also a large correlation between the slope and landing surface [31].
Using ArcGIS software to analyze the slope of the DEM with a resolution of 30 m in Yunnan Province, the slope distribution range in Yunnan Province was obtained. The statistics showed that the probability of landslides occurring on relatively gentle slopes of less than 10° and on slopes greater than 50° was very low, so the slopes were divided into six intervals in steps of 10°: <10°, 10-20°, 20-30°, 30-40°, 40-50°, and >50°. Table 4 shows the number and proportion of landslides for all slope grades in Yunnan Province. Figure 5 shows the distribution of landslides in Yunnan Province. According to Table 4, landslides mainly occur on slope bodies with slopes of less than 50°.

Aspect
Aspect is the basic element of terrain composition, and different aspects will cause differences in sunshine, vegetation growth, surface water flow, and surface temperature. This is indirectly related to the safety of the slope. Therefore, aspect is regarded as one of the most important evaluation factors in the process of landslide susceptibility evaluation.
In this paper, ArcGIS software is used to analyze the aspect of the DEM in Yunnan Province, and the aspect is graded by direction to obtain the aspect distribution in Yunnan Province. The aspect in the study area was divided into nine grades: plane (−1), north (337.5°-22.5°), northeast (22.5°-67.5°), east (67.5°-112.5°), southeast (112.5°-157.5°), south (157.5°-202.5°), southwest (202.5°-247.5°), west (247.5°-292.5°), and northwest (292.5°-337.5°). Table 5 presents the number and proportion of landslides for each aspect grade in Yunnan Province, while Figure 6 shows the distribution of landslides in Yunnan Province. As shown in Table 5, landslides are not likely to occur on a flat surface, and landslides are most commonly distributed in the southeast direction.
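The nine aspect grades above can be expressed as a short lookup. The sketch below is only an illustration of the grading rule: the function name `aspect_grade` is hypothetical, negative values are treated as the plane grade (−1 in the text), and a value exactly on a sector boundary is assigned to the next sector clockwise, which is an assumption.

```python
ASPECT_NAMES = ["north", "northeast", "east", "southeast",
                "south", "southwest", "west", "northwest"]


def aspect_grade(aspect_deg):
    """Map an aspect angle in degrees (clockwise from north) to its grade."""
    if aspect_deg < 0:
        return "plane"  # flat cells carry the value -1
    # Shift by half a sector (22.5 degrees) so that north spans 337.5-22.5,
    # then count 45-degree sectors and wrap around at 360 degrees.
    sector = int(((aspect_deg % 360) + 22.5) // 45) % 8
    return ASPECT_NAMES[sector]


# Hypothetical aspect values, one per line, with their grade.
for angle in (-1, 0, 135, 350):
    print(angle, aspect_grade(angle))
```

The half-sector shift is what handles the north grade, whose interval wraps around 0°.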

Terrain Undulation
Terrain undulation is the geomorphological characteristic of the difference between the altitude of the highest point and the lowest point in a certain area. Terrain undulation is related to the stress distribution and size inside the slope body, and as its value increases, the stability of the slope becomes worse. Therefore, terrain undulation is regarded as one of the evaluation factors in the process of landslide susceptibility evaluation.
According to the degree of terrain undulation, we divide the topographic undulations of Yunnan Province into seven categories: plains, terraces, hills, small undulating mountains, medium undulating mountains, large undulating mountains, and extremely large undulating mountains. Topographic undulation can be used as an important reference index for dividing geomorphological morphology. Table 6 presents the number and proportion of landslides for each level of topographic undulation in Yunnan Province, and Figure 7 shows the distribution of landslides in Yunnan Province. As shown in Table 6, the number of landslides is high on medium undulating mountains.
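Since terrain undulation is defined above as the difference between the highest and lowest elevation within a certain area, it can be computed from a DEM with a moving window. The sketch below is a minimal illustration on a tiny hypothetical grid; the 3 x 3 window (`radius=1`) and the handling of edge cells by clipping the window are assumptions, and the class boundaries of the seven categories are not reproduced because the text does not state them.

```python
def terrain_relief(dem, radius=1):
    """Return, for every cell, max minus min elevation in the surrounding window.

    dem is a list of rows of elevations; the window is clipped at the edges.
    """
    rows, cols = len(dem), len(dem[0])
    relief = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [dem[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            relief[r][c] = max(window) - min(window)
    return relief


# Hypothetical 3 x 3 DEM in metres.
dem = [[100, 120, 130],
       [110, 150, 170],
       [90, 140, 200]]
relief = terrain_relief(dem)
print(relief[1][1])  # window covers the whole grid: 200 - 90
```

The window size controls the scale of the relief measure, so it should be chosen consistently with the DEM resolution and the classification scheme in use.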

Distance from the Fault
Faults have a direct impact on the regional characteristics of landslides. Faults affect surface infiltration, internal structure, and the stress of rock and soil. Therefore, faults are regarded as one of the evaluation factors in the process of landslide susceptibility evaluation.
In this study, we used the buffer analysis tool in ArcGIS software to buffer the faults at a step of 400 m, and the distance from faults was divided into seven grades: 0-400, 400-800, 800-1200, 1200-1600, 1600-2000, 2000-2400, and more than 2400 m. However, because the extent of Yunnan Province is very large, most landslides are located more than 2400 m from a fault. Table 7 presents the distribution and proportion of landslides at each distance from the fault in Yunnan Province, and Figure 8 shows the distribution of landslides in Yunnan Province based on fault distances.
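The 400 m buffer grading and the per-grade statistics reported in the tables can be illustrated as follows. This is a sketch under stated assumptions rather than the ArcGIS procedure: the function names are hypothetical, a distance exactly on a boundary is placed in the farther grade, and the four distances in the example are invented.

```python
from collections import Counter

STEP_M = 400    # buffer step size from the text
N_GRADES = 7    # grade 7 collects everything beyond 2400 m


def distance_grade(distance_m):
    """Return the buffer grade (1-7) for a distance in metres."""
    return min(int(distance_m // STEP_M), N_GRADES - 1) + 1


def grade_proportions(distances_m):
    """Return {grade: (count, proportion)} for a list of landslide distances."""
    counts = Counter(distance_grade(d) for d in distances_m)
    total = len(distances_m)
    return {g: (counts[g], counts[g] / total) for g in range(1, N_GRADES + 1)}


# Hypothetical distances (m) from four landslide points to the nearest fault.
stats = grade_proportions([150, 950, 2600, 5000])
for grade, (count, share) in stats.items():
    print(grade, count, share)
```

The same two functions apply unchanged to the road and river factors, which use the same buffer step.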

Lithological Classification
The hardness, type, structure, etc., of rocks all determine the stability and integrity of the slope. Lithology plays a decisive role in the resistance of the slope to erosion, weathering, and fragmentation. Therefore, lithology is regarded as one of the evaluation factors in the process of landslide susceptibility evaluation.
Here, we divided the different types of rocks according to the qualitative division table of the hardness of rocks [29]. The geological rock group was divided into ten categories: harder rock sandwiching soft rock, harder rock, harder rock sandwiching softer rock, loose hard rock sandwiching softer rock, hard rock, hard rock sandwiching soft rock, softer rock, loose body, water body, and soft rock. Table 8 shows the number and proportion of landslides for each grade of the lithology classification in Yunnan Province, and Figure 9 shows the distribution of landslides based on the lithology classification of Yunnan Province. Table 8 shows that landslides are most concentrated in areas of harder rock sandwiching softer rock.

Distance from Rivers
River incision depth and river erosion in Yunnan Province play a role in promoting the formation of landslides. River erosion can enlarge the effective free face of a slope, which in turn reduces the stability of the rock and soil mass. Therefore, rivers are regarded as one of the evaluation factors in the process of landslide susceptibility evaluation.
Using the buffer analysis tool in ArcGIS software, the rivers in the study area were buffered at intervals of 400 m, which gave seven grades: 0-400, 400-800, 800-1200, 1200-1600, 1600-2000, 2000-2400, and more than 2400 m. However, Yunnan Province covers a very large area, and most landslides are located more than 2400 m from a river. Table 9 presents the distribution and proportion of landslides at each level of distance from rivers in Yunnan Province, and Figure 10 shows the distribution of landslides in Yunnan Province based on river distances.

Average Annual Rainfall
An increase in rainfall increases the weight of a slope itself, weakens its shear resistance, and is the trigger factor that causes rock and soil mass to lose balance. Therefore, rainfall is regarded as one of the important evaluation factors in the process of landslide susceptibility evaluation.
In this paper, the average annual rainfall in the study area was graded manually and divided into the following six levels: 1.3-2.0, 2.0-3.0, 3.0-4.0, 4.0-5.0, and >5.0 m. Table 10 presents the distribution and proportion of landslides for each grade of average annual rainfall in Yunnan Province, and Figure 11 shows the distribution of landslides in Yunnan Province based on average annual rainfall. As shown in Table 10, landslides are mainly distributed in areas with an average annual rainfall of 2-5 m.

Figure 11. Distribution of landslides in Yunnan Province based on average annual rainfall.

Human Activity
Distance from Roads
The construction of a road directly affects the balance of a slope, destroys the stability of the slope body, and makes the effective free face steeper. Therefore, roads are regarded as one of the evaluation factors in the process of landslide susceptibility evaluation.
Using the buffer analysis tool in ArcGIS software, the roads in the study area were buffered at intervals of 400 m, which gave seven grades: 0-400, 400-800, 800-1200, 1200-1600, 1600-2000, 2000-2400, and more than 2400 m. However, Yunnan Province covers a very large area, and most landslides are located more than 2400 m from a road. Table 11 presents the number and proportion of landslides at each level of distance from roads in Yunnan Province, and Figure 12 shows the distribution of landslides in Yunnan Province based on road distance.
Table 11. The number and proportion of landslide distribution at each grade based on road distance.

The Degree of Vegetation Cover
The root system of vegetation is directly related to the ability of rock and soil to retain water. Groundwater recharge and transpiration by vegetation play an important role in the shear resistance of slopes. Therefore, NDVI is regarded as one of the evaluation factors in the process of landslide susceptibility evaluation.
According to the characteristics of vegetation cover, the NDVI in the study area was divided into the following seven categories: −0.06-0.3, 0.3-0.4, 0.4-0.5, 0.5-0.6, 0.6-0.7, 0.7-0.8, and 0.8-0.9. Table 12 presents the number and proportion of landslides at each grade of NDVI in Yunnan Province, and Figure 13 shows the distribution of landslides in Yunnan Province based on the NDVI. As shown in Table 12, landslides are mainly distributed in the NDVI range of 0.5-0.8.

Model Parameter Settings and Accuracy Verification
In this paper, 70% of the samples were used to build and train the model, and the remaining samples were used to verify the accuracy of the classification results. Compared to other prediction models, the random forest model has relatively few parameters. We mainly adjusted the following parameters: n_estimators, max_features, max_depth, min_samples_split, and min_samples_leaf. GridSearchCV was used for the search, and the parameter setting process is shown in Table 13. Table 13 shows that the accuracy indicators are better when the grid search is used to set only the n_estimators, max_features, and max_depth parameters, so the following parameters are used in the rest of this paper: n_estimators = 86, max_features = 3, and max_depth = 10.
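The split-and-search procedure described above could look roughly like the following scikit-learn sketch. It is illustrative only: the data are synthetic stand-ins for the eleven index factors, the candidate values in `param_grid` are hypothetical (the searched ranges are those of Table 13, which is not reproduced here), and the three-fold cross-validation and fixed random seeds are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the 11 index factors (X) and the
# landslide / non-landslide labels (y); not the paper's data.
X, y = make_classification(
    n_samples=400, n_features=11, n_informative=6, random_state=0
)

# 70% of the samples for training, the remaining 30% for verification.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical candidate values for the three parameters the paper retains.
param_grid = {
    "n_estimators": [50, 86],
    "max_features": [3, 5],
    "max_depth": [5, 10],
}

search = GridSearchCV(
    RandomForestClassifier(oob_score=True, random_state=0),
    param_grid,
    scoring="accuracy",
    cv=3,  # assumed; the paper does not state the number of folds
)
search.fit(X_train, y_train)

# GridSearchCV refits the best configuration on the full training set.
best_model = search.best_estimator_
test_accuracy = best_model.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("OOB score:", round(best_model.oob_score_, 3))
print("test accuracy:", round(test_accuracy, 3))
```

Leaving min_samples_split and min_samples_leaf at their defaults mirrors the paper's choice to tune only three parameters.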
To validate the random forest model, this paper uses the out-of-bag (OOB) score. Because each tree is trained on a bootstrap sample drawn with replacement, roughly one-third of the samples are not selected for a given tree. These unsampled (OOB) data serve as a validation set for the model fitted to the training set. Here, oob_score = 0.783 and acc = 0.90 for the training set, indicating that the random forest model offers good verification accuracy and is suitable for the susceptibility evaluation of the study area in this paper.
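The size of the out-of-bag set follows from a short calculation. Assuming each tree draws a bootstrap sample of size n with replacement from n training samples (the usual random forest default, taken here as an assumption about the authors' setup), the probability that a given sample is never drawn is:

```latex
\[
P(\text{sample } i \text{ is out of bag})
  = \left(1 - \frac{1}{n}\right)^{n}
  \;\xrightarrow[\;n \to \infty\;]{}\; e^{-1} \approx 0.368 .
\]
```

So, for large n, a little over one-third of the training samples are expected to be left out of each tree and are available to score it.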
To assess the performance of the random forest model, we used a confusion matrix together with three indicators: MSE (mean square error), acc (accuracy), and AUC (area under the curve). The first two are statistical indicators that measure the goodness of fit of the model and the proportion of samples that are classified correctly, respectively. MSE is the mean of the squared errors between the predicted and original data at corresponding points and reflects how well the model fits the data. Here, MSE = 0.2, which indicates that the model fits well. Accuracy is the simplest and most intuitive evaluation index for classification problems and gives the proportion of correct predictions. Here, the acc values for the training set and the test set are 0.90 and 0.80, respectively; both are at least 0.8, indicating that the model is reasonable and accurate for evaluating landslide susceptibility in Yunnan Province. AUC is the area under the ROC (receiver operating characteristic) curve, which is constructed from the sensitivity and specificity obtained at a series of classification thresholds. The ROC curve is an effective tool for evaluating a binary classification algorithm because it reflects the relationship between the predicted values and the sample labels. For a useful classifier, the AUC lies between 0.5 and 1, and the closer it is to 1, the higher the prediction accuracy of the model. Here, AUC = 0.87 indicates that the random forest model predicts landslides in Yunnan Province with good accuracy. These three indicators are commonly used, representative measures. Figure 14 and Table 14 show the ROC curve and the confusion matrix, respectively.
The test results indicate that the random forest model offers high accuracy in the spatial prediction of landslide geological hazard susceptibility in Yunnan Province, good prediction performance, and no overfitting problems.
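For readers who want to see how the indicators above are computed, here is a minimal pure-Python sketch. The labels and predicted probabilities are hypothetical, as is the 0.5 decision threshold. Note that when MSE is computed on hard 0/1 predictions it equals 1 − acc, which would be consistent with the reported MSE = 0.20 and acc = 0.80; whether the authors computed MSE this way is an assumption.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary 0/1 labels."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true, y_pred):
    """Proportion of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean of the squared differences between predictions and labels."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the probability that a random positive scores above a random negative,
    with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test labels (1 = landslide) and predicted probabilities.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.65, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(confusion_matrix(y_true, y_pred))  # (tn, fp, fn, tp)
print(accuracy(y_true, y_pred), mse(y_true, y_pred), auc(y_true, scores))
```

Accuracy and MSE depend on the chosen threshold, whereas AUC is computed from the ranking of the scores alone, which is why the two kinds of indicator are reported together.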

Susceptibility Assessment Results
In this paper, the random forest model was used to predict the spatial susceptibility of landslide disasters in the study area from the characteristics of landslide disasters and geological environment data in Yunnan Province. Following previous research [32,33], we used the natural breaks method to divide landslide susceptibility in Yunnan Province into classes. The natural breaks method is a classification and grading method in GIS software that is based on numerical statistics and maximizes the differences between classes. Using this method, susceptibility was divided into the following five levels: low-susceptibility areas (0-0.162), lower-susceptibility areas (0.162-0.32), medium-susceptibility areas (0.32-0.498), higher-susceptibility areas (0.498-0.691), and high-susceptibility areas (0.691-0.984). The landslide susceptibility grades and the landslide susceptibility zone map for Yunnan Province are presented in Table 15 and Figure 15, respectively. With the natural breaks classification, the low- and lower-susceptibility areas contain few landslides while covering a large area, and most of the landslides fall into higher- and high-susceptibility areas. The density of disaster points is the ratio of the number of disaster points in each class to the area of that class. This density intuitively reflects the difference between the number of landslides and the area in each susceptibility class of the model. According to the statistics, the density of landslide hazard points increases with the susceptibility class and reaches its maximum in high-susceptibility areas; that is, the landslide points are densely distributed in the areas that the random forest model rates as highly susceptible.
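The grading and density ideas above can be illustrated with a short sketch. The class breaks are the ones listed in the text; the grouped area and landslide shares are the ones reported in the Conclusions (57% and 4% for low and lower, 23% and 87% for higher and high), with the medium class obtained by subtraction. The boundary convention, the function names, and the use of a share ratio as a relative density are assumptions made for illustration.

```python
from bisect import bisect_right

# Class breaks listed in the text for the five susceptibility levels.
BREAKS = [0.162, 0.32, 0.498, 0.691]
LEVELS = ["low", "lower", "medium", "higher", "high"]


def susceptibility_level(probability: float) -> str:
    """Map a predicted probability to one of the five levels.

    A value equal to a break is assigned to the upper class; this
    convention is an assumption.
    """
    return LEVELS[bisect_right(BREAKS, probability)]


def relative_density(area_share: float, landslide_share: float) -> float:
    """Share of landslides divided by share of area.

    Values above 1 mean the class holds more landslides than its area
    alone would suggest.
    """
    return landslide_share / area_share


# (area share, landslide share) for grouped classes. The medium row is
# derived by subtraction, assuming the three groups cover the province.
groups = {
    "low + lower": (0.57, 0.04),
    "medium": (0.20, 0.09),
    "higher + high": (0.23, 0.87),
}

densities = {name: relative_density(a, l) for name, (a, l) in groups.items()}
for name, value in densities.items():
    print(f"{name:>14}: {value:.2f}")
```

Under these assumptions the relative density rises from well below 1 in the low and lower classes to well above 1 in the higher and high classes, which is the monotonic pattern the text describes.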
Overall, the model results are consistent with the actual distribution of disaster points and offer high accuracy. The zoning map shows that the high- and higher-susceptibility areas of landslide disasters are mainly distributed in linear clusters on both sides of valleys and near mountainous areas, on both sides of rivers and roads, and in other areas with large slope differences. The terrain in the high- and higher-susceptibility areas is complex, with complex geomorphological conditions and crisscrossing ravines. Additionally, the erosion capacity of the rivers is strong, and the vegetation coverage is low, resulting in serious soil erosion. Steep river valleys also provide an effective free face for landslide development. The moderate- and lower-susceptibility areas of landslides are mainly distributed in areas of thick vegetation cover and valley-to-valley junctions with a gentle regional transition. The low-susceptibility areas for landslides are mainly located in the middle region of Yunnan Province, which belongs to the alpine plain. Here, the slope is small, and the low-susceptibility areas form large, continuous blocks. These conditions are not favorable for the development and occurrence of landslides, which is consistent with the characteristics of landslide development.

Impact Factor Importance Analysis
The importance of the landslide susceptibility factors used in this paper was ranked based on Gini impurity, as shown in Figure 16. Figure 16 shows that aspect, lithology, slope, and the annual mean deformation rate are the key variables leading to landslide occurrence and the most important factors affecting the development of landslides. This result also shows that the annual mean deformation rate index factor added in this paper has an important impact on the prediction of landslide susceptibility and that adding this index factor can improve the prediction accuracy of landslide susceptibility in Yunnan Province to a certain extent.
Figure 16. Ranking of impact factor importance.
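A ranking based on Gini impurity rests on how much each factor's splits reduce impurity in the trees. The sketch below shows that quantity for a single binary split; the labels and the two candidate splits are hypothetical, and a full importance score would additionally sum these decreases over all splits on a factor and average over the trees, which is not shown here.

```python
def gini(labels):
    """Gini impurity of a node holding binary 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def impurity_decrease(parent, left, right):
    """Reduction in Gini impurity when `parent` is split into two children,
    with each child weighted by its share of the parent's samples."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Hypothetical node: 4 landslide (1) and 4 non-landslide (0) samples.
parent = [1, 1, 1, 1, 0, 0, 0, 0]

# A weakly informative split and a perfectly separating split.
weak = impurity_decrease(parent, [1, 1, 1, 0], [1, 0, 0, 0])
perfect = impurity_decrease(parent, [1, 1, 1, 1], [0, 0, 0, 0])
print(weak, perfect)
```

A factor whose splits resemble the second case far more often than the first accumulates a larger total decrease and therefore ranks higher.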

Conclusions
In this paper, the land surface deformation across Yunnan Province was obtained using SBAS-InSAR technology. The annual mean deformation rate was added as an index factor to the landslide susceptibility evaluation of Yunnan Province, and its importance as an index factor was explored. We found that the annual mean deformation rate ranked as the most important index factor after aspect, lithology, and slope, indicating that it is an important indicator in the evaluation of landslide susceptibility. Thus, the addition of the annual mean deformation rate index factor improved the prediction accuracy of landslide susceptibility in Yunnan Province to a certain extent.
When studying the application of the random forest model to landslide susceptibility evaluation in Yunnan Province, we calculated MSE = 0.20, acc = 0.80, and AUC = 0.87. The obtained landslide susceptibility results also coincided with the actual landslide distribution, which further demonstrates the reliability of the results in this paper. Therefore, the landslide susceptibility evaluation results obtained in this paper can provide a basis for disaster prevention and mitigation in Yunnan Province.
In the study area, the low- and lower-susceptibility areas accounted for 57% of the area and 4% of the landslides. The higher- and high-susceptibility areas accounted for 23% of the area and 87% of the landslides. The statistical results showed that most of the landslide hazard points were located in higher- and high-susceptibility areas, which is consistent with the concentration of potential landslide hazards in high- and higher-susceptibility areas.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.