Performance Evaluation of GIS-Based Artificial Intelligence Approaches for Landslide Susceptibility Modeling and Spatial Patterns Analysis

The main purpose of this study was to apply the novel bivariate weights-of-evidence-based SysFor (SF) for landslide susceptibility mapping, and two machine learning techniques, namely the naïve Bayes (NB) and Radial basis function networks (RBFNetwork), as benchmark models. Firstly, by using aerial photos and geological field surveys, the 263 landslide locations in the study area were obtained. Next, the identified landslides were randomly classified according to the ratio of 70/30 to construct training data and validation models, respectively. Secondly, based on the landslide inventory map, combined with the geological and geomorphological characteristics of the study area, 14 affecting factors of the landslide were determined. The predictive ability of the selected factors was evaluated using the LSVM model. Using the WoE model, the relationship between landslides and affecting factors was analyzed by positive and negative correlation methods. The above three hybrid models were then used to map landslide susceptibility. Thirdly, the ROC curve and various statistical data (SE, 95% CI and MAE) were used to verify and compare the predictive power of the model. Compared with the other two models, the Sysfor model had a larger area under the curve (AUC) of 0.876 (training dataset) and 0.783 (validation dataset). Finally, by quantitatively comparing the susceptibility values of each pixel, the differences in spatial morphology of landslide susceptibility maps were compared, and the model was found to have limitations and effectiveness. The landslide susceptibility maps obtained by the three models are reasonable, and the landslide susceptibility maps generated by the SysFor model have the highest comprehensive performance. The results obtained in this paper can help local governments in land use planning, disaster reduction and environmental protection.


Introduction
Landslide is a sliding geological phenomenon of slope rock and soil mass moving along a shear failure surface. Landslides are natural disasters and existed before humans appeared. Now it has become an important global issue that threatens the safety of human life and property. There are many different scales of landslide occurrence in China every year. From January to July 2018, 1843 geological disasters were reported nationwide, including 973 landslides. The disasters caused the deaths of 74 people and a direct economic loss of CNY 1.04 billion. With the acceleration of urbanization

Methodology
The research method used in this paper can be divided into the following steps, as shown in Figure 2: (i) data collection; (ii) preparing the landslide conditioning factors and determining the landslide location; (iii) correlation analysis between landslide and affecting factors (WoE); (iv) selection of landslide affecting factors using the Linear Support Vector Machine (LSVM); (v) RBFN, NB and SF models are applied to assess landslide susceptibility in the study area; (vi) model validation and comparison; (vii) generating landslide susceptibility maps.

Methodology
The research method used in this paper can be divided into the following steps, as shown in Figure 2: (i) data collection; (ii) preparing the landslide conditioning factors and determining the landslide location; (iii) correlation analysis between landslide and affecting factors (WoE); (iv) selection of landslide affecting factors using the Linear Support Vector Machine (LSVM); (v) RBFN, NB and SF models are applied to assess landslide susceptibility in the study area; (vi) model validation and comparison; (vii) generating landslide susceptibility maps. ISPRS Int. J. Geo-Inf. 2020, 9,443 4 of 20 ISPRS Int. J. Geo-Inf. 2020, 9,

Preparation of Training and Validation Datasets
Landslide inventory and mapping is the foundation of landslide susceptibility mapping. In this study, the landslide survey data come from the following three methods: consulting the geological report of the study area; conducting a field survey with the help of the global positioning system (GPS); analyzing remote sensing images [38]. Finally, 263 landslides were identified in the study area, as shown in Figure 1. Rainfall and human engineering activities (engineering construction, urban construction and construction of highways and railways, etc.) are the main triggers of landslides [39], and loess landslide is the dominant type of landslide in the study area. Based on the analysis under GIS environment, the largest landslide in the study area is greater than 1×10 7 m 3 , and the smallest landslide is close to 120 m 3 . Each landslide polygon in the data set was expressed using the centroid. Furthermore, it is necessary to randomly select the same number of non-landslide points in the study area-defining 263 landslide pixels as "1", and then defining 263 non-landslide pixels as "0". In this study, the original data were randomly divided into training data set and validation data set with the ratio of 70/30, because today there is no uniform specification for the division of the data set in LSM [1,40].

Landslide Affecting Factors
When the landslide inventory is compiled, the next step is to select landslide-affecting factors. The principles to be considered in the selection of conditioning factors are the characteristics of the geological environment and the data availability in the area [41][42][43]. According to the relevant literature [44][45][46], 14 landslide affecting factors were selected, including altitude, profile curvature, plan curvature, aspect, slope, SPI, TWI, STI, distance to rivers, distance to roads, lithology, soil, landuse and NDVI, as shown in Supplementary Figure 1. The lithology map was extracted from geological maps with a scale of 1:200,000, the land use map was extracted from land-use maps with a scale of 1:100,000, the soil map was extracted from soil type maps with a scale of 1:1,000,000 and NDVI was produced using the Landsat 8 operational land imager. The remaining 11 factors were

Preparation of Training and Validation Datasets
Landslide inventory and mapping is the foundation of landslide susceptibility mapping. In this study, the landslide survey data come from the following three methods: consulting the geological report of the study area; conducting a field survey with the help of the global positioning system (GPS); analyzing remote sensing images [38]. Finally, 263 landslides were identified in the study area, as shown in Figure 1. Rainfall and human engineering activities (engineering construction, urban construction and construction of highways and railways, etc.) are the main triggers of landslides [39], and loess landslide is the dominant type of landslide in the study area. Based on the analysis under GIS environment, the largest landslide in the study area is greater than 1 × 10 7 m 3 , and the smallest landslide is close to 120 m 3 . Each landslide polygon in the data set was expressed using the centroid. Furthermore, it is necessary to randomly select the same number of non-landslide points in the study area-defining 263 landslide pixels as "1", and then defining 263 non-landslide pixels as "0". In this study, the original data were randomly divided into training data set and validation data set with the ratio of 70/30, because today there is no uniform specification for the division of the data set in LSM [1,40].

Landslide Affecting Factors
When the landslide inventory is compiled, the next step is to select landslide-affecting factors. The principles to be considered in the selection of conditioning factors are the characteristics of the geological environment and the data availability in the area [41][42][43]. According to the relevant literature [44][45][46], 14 landslide affecting factors were selected, including altitude, profile curvature, plan curvature, aspect, slope, SPI, TWI, STI, distance to rivers, distance to roads, lithology, soil, land-use and NDVI, as shown in Figure S1. The lithology map was extracted from geological maps with a scale of 1:200,000, the land use map was extracted from land-use maps with a scale of 1:100,000, the soil map was extracted from soil type maps with a scale of 1:1,000,000 and NDVI was produced using the Landsat 8 operational land imager. The remaining 11 factors were derived from ASTER GDEM with 30m resolution. Finally, all factors are eventually converted to the same resolution (30 × 30 m).
The establishment of the altitude maps is the first step in conducting research. In the study area, it is divided into seven levels according to the interval of 100 m: (1) 933-1000; (2) 1000-1100; (3) 1100-1200; (4) 1200-1300; (5) 1300-1400; (6) 1400-1500; (7) 1500-1574. The profile curvature reflects the rate of change of the surface elevation at the maximum slope of the surface. This study calculated the profile curvatures in ArcGIS software and divided them into five categories: The stream power index (SPI) refers to the erosion of flowing water, the thickness of the soil layer and the content of sand and silt. SPI has been widely used in the study of landslide susceptibility [52]. These factors will affect the occurrence of slope damage events [53]. SPI has the following formula: where As represents a specific catchment area and B represents a local slope [53]. According to the calculated value, the SPI is grouped into five categories: (1) < 10; (2) where ∂ is a local uphill contribution area representing the amount of water flowing to a particular location and tan B is a local slope. The TWI map of the study area consisted of the following six categories: (1) 1.11-2; (2) 2-3; (3) 3-4; (4) 4-5; (5) > 5. The sediment transport index (STI) value represents the degree of land current erosion and sedimentation [55]. According to the value of STI, it was divided into five categories: (1) 0-10; (2) 10-20; (3) 20-30; (4) 30-40; (5) > 40.
The distance to rivers map was divided with five different buffer ranges including: (1) 0-200; (2) 200-400; (3) 400-600; (4) 600-800; (5) > 800. The relationship between the river and the landslide can be established through these buffer zones [55][56][57]. The distance to roads was used as a landslide-affecting factor because the road repaired near the slopes reduces the load on the terrain and the heel of the slope [51]. We calculated the pavement buffer every 100 m in the study area: (1) 0-100; (2) 100 -200; (3) 200-300; (4) 300-400; (5) > 400. Different lithology units have a different magnetic susceptibility [58,59]. A total of five lithologic units are in the study area: Quaternary, Tertiary and Cretaceous, Jurassic and Triassic. The soil in the soil map can be divided into five categories: alluvial soils; red clay soils; lakes and reservoirs; cultivated loess soils; all other values.
The main land-use of the study area is farmland, forestland, grassland and water bodies, residential areas and others. The NDVI can accurately reflect the surface vegetation coverage. The data come from satellite remote sensing images. In this study, this can be classified as: (1)

Weights of Evidence
The weights-of-evidence model is built from the logarithmic form of Bayes' rule. The weighted values for the classes of landslide-affected factors can be written as in Equations (3) and (4) [60]: where P1 and P2, respectively, represent the number of pixels that appear and do not appear in the same given factor category. P3 represents the number of pixels that have no landslides in a particular factor category. When there is neither a landslide nor a landslide pixel in a particular factor category, P4 is used [61].
The positive weight (W +) not only indicates that the affecting factors leading to landslides exist, but also that the value reflects the positive correlation between the factors and landslides. The negative weight (W-) reflects the degree of negative correlation between them [60]. The difference between the two weights can be expressed as Equation (5): where C is the weight contrast. According to the value of C, the relationship between predictor variables and landslides can be reflected more comprehensively.

Naïve Bayes
The model can be built by following a few steps: (i) collect examples; (ii) estimate the prior probability of each class; (iii) estimate the mean of the class; (iv) construct a covariance matrix and find the inverse and determinant for each class; (v) for each class formation, create a discriminant function [62]. The unique point of this model is that only a small amount of training data is required for evaluation and analysis to obtain the parameters required for classification [62]. More importantly, building a model is simple [63]. The NB method is widely and successfully used in landslide susceptibility evaluation research [12,32,64].

Radial Basis Function Network
The radial basis function (RBF) originates from solving the multivariate interpolation problem. Later RBF networks were used in other programs [65,66]. The RBF network model is based on an algorithm from K-means clustering [67]. The essence of the RBF network is a radial function. It provides convenience in solving non-linear problems [68]. First, the data are imported into the input layer without calculation and then the nonlinear problem of the hidden layer neuron is processed and sent to the linear output layer. There is only one hidden layer in the RBF network itself, but there is no hidden layer in the model [69]. The hidden layer can be activated by a function, f : Rn→R, if properly trained. Typically, the basis function commonly used in RBF networks is the Gauss function.
For all the possible choices for f, the Gaussian function can be written as: ISPRS Int. J. Geo-Inf. 2020, 9, 443 7 of 20 where C i ∈ Rn indicates the center of the basis function. There are n hidden layer nodes. Let f i d i ∈ R be the radius of the first hidden layer node. Through the weight matrix (W ∈ Rn i × n) to connect the hidden layer nodes and network output, f p is the hidden node vector.

SysFor
SysFor is built for a great many good quality decision trees by both low-dimensional and high-dimensional datasets and is applied to many applications [70]. A SysFor can be built by following a few steps: (i) follow the user's instructions to find a set of "good properties" and corresponding split points; (ii) build the number of trees based on good attributes and user definitions; (iii) take the level 1 node of the tree built from step 2 and select an optional good attribute from the set of good attribute; (iv) return all the trees built in steps 2 and 3 as Sysfor [71].
This technique can be used to study a data set and master the mode of recording for the purpose of classifying a record as one of "class values". Importantly, the model can predict the "class value" of an unknown tag. The model functions similar to decision trees and decision forests [72].

Support Vector Machine
The quality of the selected model and input data will affect the quality of landslide susceptibility assessment [73]. Using this method to test the affecting factor can be expressed as (Equation (8)): where w T represents an inverse matrix, the weight matrix of each landslide affecting factor in this study, a = (a1, a2, . . . , a14) is the input vector (14 landslide-affecting factors) and b is the offset from the origin of the hyperplane. The closer the weight w i is to 0, the less important the landslide-affecting factors are to the prediction of landslides [74].

Correlation Analysis of Affecting Factors
In this paper, the WoE model is used to analyze the correlation between landslide occurrence and affecting factors, as shown in Figure 3. The overall spatial correlation between the predictable variable and the landslide can be reflected by C. WoE results analysis, which showed that the correlation with the landslide decreased with the increase in elevation. An increase in altitude is accompanied by an increase in precipitation, a decrease in temperature, and accelerated weathering, thereby promoting the occurrence of landslides [75]. Therefore, an altitude >1500m had the highest correlation compared with other elevation ranges. In terms of profile curvature, it was only positively correlated with the occurrence of landslides at (−0.46 to 0.58) and (0.58-1.97), and 42% and 29% of landslides occurred at the two profile curvature points. The plan curvature is positively correlated with the occurrence of landslides in the range from (−0.54) to 0.38 and from 1.44 to 7.56, and negatively correlated in other ranges. Both the profile curvature and the plan curvature control the speed of flowing water affect the degree of slope erosion [76]. If the slope direction is different, then the slope surface is different from the sun exposure and rainfall erosion, resulting in different slope materials [50]. In the southern part, 15% of landslides occur, and the correlation is highest in this orientation with landslides. When the slope is <10 • or >50 • , the occurrence of the landslide is randomly distributed; the correlation with the landslide is the highest in the range of 10-20, and 30% of the landslides occur in this range, meaning that this affecting factor has a high incidence of landslides in this range. The SPI value has the highest correlation, with an occurrence of landslides in the range of 10-20.  TWI is positively correlated with the occurrence of landslides only in the 2-3 range. The WoE results show that the correlation of this variable with landslides decreases with increasing distance to rivers, and the same results are also shown in the distance to road. The closer to the river, the stronger the erosion at the bottom of the slope, which affects the slope stability [51]. At the same time, building roads changes the original stability of the terrain [15]. Different lithologic units have different compositions and structures, etc., and thus produce different capabilities regarding resistance to landslide occurrence [77]. The WoE analysis of the lithology in the study area revealed that Tertiary, Jurassic and Triassic were positively correlated with the occurrence of landslides. The relationship between land use and slope stability is that different types of land-use will produce different vegetation, and the roots of different vegetation will help maintain the stability of the slope differently [78]. When the soil is alluvial soil and red clay soils, it is positively correlated with the occurrence of landslides. In land-use, 54% of landslides occur in Group C and are positively correlated with landslides in this region. However, the highest correlation is in Group E. Finally, WoE finds that the NDVI value has the highest positive correlation with the occurrence of a landslide in the range of 0.07-0.09 and the highest negative correlation in the range of 0.01-0.04.

Constructing Landslide Susceptibility Maps
Generating the landslide susceptibility map generally requires the following two steps [38]: first, obtain the landslide susceptibility index (LSI) of all the evaluation units; then reorganize the LSI obtained in the first step. The specific operation is to calculate the LSI of each evaluation unit by using the probability distribution function of the evaluation model and then re-divide the obtained LSI by the natural discontinuous point grading method in the ArcGIS software. In this study, NB, RBFNetwork and Sysfor models generated three landslide susceptibility maps, as shown in Figure 5. Figure 5(a) is the landslide susceptibility mapping generated by the NB model. The LSI is divided into five ranges of 0-0.133, 0.133-0.322, 0.322-0.537, 0.537-0.769, and 0.769-1. Figure 6 shows that the

Constructing Landslide Susceptibility Maps
Generating the landslide susceptibility map generally requires the following two steps [38]: first, obtain the landslide susceptibility index (LSI) of all the evaluation units; then reorganize the LSI obtained in the first step. The specific operation is to calculate the LSI of each evaluation unit by using the probability distribution function of the evaluation model and then re-divide the obtained LSI by the natural discontinuous point grading method in the ArcGIS software. In this study, NB, RBFNetwork and Sysfor models generated three landslide susceptibility maps, as shown in Figure 5. Figure 5a is the landslide susceptibility mapping generated by the NB model. The LSI is divided into five ranges of 0-0.133, 0.133-0.322, 0.322-0.537, 0.537-0.769, and 0.769-1. Figure 6 shows that the very low level has the largest area.  Figure 6 shows that the very low level has the largest area.

Models Validation and Comparison
The landslide susceptibility map is the most direct and authoritative evidence to test the performance of the landslide susceptibility model [79]. Without model validation, the establishment of landslide susceptibility maps has no practical significance [24]. Therefore, the receiver operating characteristic (ROC) curve and the area under the curves (AUC) [80,81] are used to evaluate the predictive power of each model. Figure 8(a) and Table 2 show the ROC curve and its parameters under the training data set, respectively. Under the training data set, the areas under the NB and Sysfor model curves were 0.792 (79.2%), 0.777 (77.7%) and 0.876 (87.6%), respectively. Figure 8(b) and Table 3 Figure 6. In this study, the structure of the network consists of three layers: an input layer of 14 neurons; a hidden layer (called an RBF unit); an output layer containing a neuron, as shown in Figure 7. The learning process of the RBFNetwork can be divided into two parts: (a) using the K-means algorithm to calculate the number of clusters (hidden neurons); (b) the optimal estimation of kernel parameters [1].

Models Validation and Comparison
The landslide susceptibility map is the most direct and authoritative evidence to test the performance of the landslide susceptibility model [79]. Without model validation, the establishment of landslide susceptibility maps has no practical significance [24]. Therefore, the receiver operating characteristic (ROC) curve and the area under the curves (AUC) [80,81] are used to evaluate the predictive power of each model. Figure 8(a) and Table 2   The SysFor model should be optimized before generating a landslide susceptibility map. The best parameters for the SysFor model were found through testing-the confidence was 0.1 and the numberTrees value was 60. Figure 5c is the landslide susceptibility mapping generated by the SysFor model, and the LSI is also divided into five categories. The LSI is divided into five ranges of 0-0.114, 0.114-0.314, 0.314-0.549, 0.549-0.78 and 0.78-1. Figure 6 shows that the very low level has the largest area.

Models Validation and Comparison
The landslide susceptibility map is the most direct and authoritative evidence to test the performance of the landslide susceptibility model [79]. Without model validation, the establishment of landslide susceptibility maps has no practical significance [24]. Therefore, the receiver operating characteristic (ROC) curve and the area under the curves (AUC) [80,81] are used to evaluate the predictive power of each model. Figure 8a and Table 2 show the ROC curve and its parameters under the training data set, respectively. Under the training data set, the areas under the NB and Sysfor model curves were 0.792 (79.2%), 0.777 (77.7%) and 0.876 (87.6%), respectively. Figure 8b and Table 3

Comparison of Landslide Susceptibility Maps
In this study, SF model was selected as the benchmark model, and the susceptibility maps generated by the other two models were matched with the susceptibility maps generated by the SF model using the method developed by Xiao et al. [82]. The difference between them is defined by subtraction in ArcGIS, as shown in Supplementary Figure 2. The raster values of the three susceptibility maps are between 0 and 1, so the value range of the comparison maps is (-1, 1). The three levels of underestimation, approximation and overestimation are based on the values obtained from the comparison map. Supplementary Table 1 shows the percentages of different levels in the total area. The values of both comparison maps break at -0.50 and 0.50. To obtain the key factors that cause differences between susceptibility maps, it is necessary to associate overestimated and underestimated pixels with all the adjustment factors used in the susceptibility analysis. The Wilcoxon signed-rank test was used to determine the significance of the differences between the susceptibility models, where the main determinants were p and z. The test is used to determine whether there is a significant difference in the susceptibility model-whether the p value is less than the significance level (0.05), and whether the z value is within the range of the critical value (−1.96, + 1.96) [1]. It can be seen from the analysis in Table 4 that the performance difference between the SF model and the other two models (RBFN and NB model) is statistically significant. In contrast, for the NB and RBFNetwork model, the performance difference was not statistically significant. Comparing the performance rankings of the models obtained under the training data set and the validation data set, it can be found that the performances of the models under the two sets of data have the same ordering-the Sysfor model has the highest accuracy (0.783), followed by the NB model (0.764) and RBFNetwork model (0.729). Other statistical methods show that the Sysfor model has a minimum standard error (SE) and a 95% confidence interval (CI).

Comparison of Landslide Susceptibility Maps
In this study, SF model was selected as the benchmark model, and the susceptibility maps generated by the other two models were matched with the susceptibility maps generated by the SF model using the method developed by Xiao et al. [82]. The difference between them is defined by subtraction in ArcGIS, as shown in Figure S2. The raster values of the three susceptibility maps are between 0 and 1, so the value range of the comparison maps is (−1, 1). The three levels of underestimation, approximation and overestimation are based on the values obtained from the comparison map. Table S1 shows the percentages of different levels in the total area. The values of both comparison maps break at −0.50 and 0.50. To obtain the key factors that cause differences between susceptibility maps, it is necessary to associate overestimated and underestimated pixels with all the adjustment factors used in the susceptibility analysis. Underestimations and overestimations of each category of factors are counted, as shown in Tables S2 and S3. For each class, the ratio of each class to the total area is defined as A. The ratio of the underestimated pixels in the class to the total underestimated pixels is defined as B. "B-A", as the ratio of the differences between the two maps, can be used to identify the key classes of underestimated anomalous clustering. The value of "B-A" can be used to determine the class with the highest degree of imbalance, as shown in Table 5. Figure S3 shows underestimations of "SF-NB" and "SF-RBFN" driven by red clay soils, underestimations of "SF-RBFN" driven by SPI. Figure S4 shows overestimations of "SF-NB" driven by distance to rivers, and overestimations of "SF-RBFN" driven by distance to rivers.

Discussion
The analysis of the weight values shows that natural factors are not the only factors affecting the occurrence of landslides, and the intensification of human activities and engineering construction may also affect the geological environments and occurrence of landslide disasters [83]. Landslide occurrence in Zichang County is the highest at the slope angle of 40-50, and when the slope angle is >50, the values of W+ and C are 0, supporting that prediction. Through the WoE analysis of the distance to roads, it is found that the closer the distance to roads, the larger the positive weight of W+, indicating that closeness to a road means that land is greatly affected by human activities, thus promoting the occurrence of landslides [84,85]. In general, the growth density of vegetation is inversely proportional to the possibility of a landslide, and land-use is of high importance for landslide susceptibility assessment [86,87]. Group C (Grassland) has a higher vegetation coverage than Group B (Forestland), so the correlation is higher than Group B. Each landslide-affecting factor contributes to the landslide susceptibility assessment, but each has its differences. Table 1 shows that 60% of the landslides occurred in the lithology type 1 (Quaternary loess), but the five groups in the study area lithology WoE analysis found in the region of the lithology for Type 1 had a poor correlation with the occurrence of landslides.
The three landslide susceptibility maps show similar landslide susceptible areas-especially the gullies on the study area. The study area is located in the middle of the Loess Plateau in northern Shaanxi. The cultivated loess soils cover almost the whole area. The special topography and geotechnical condition define the deformation and failure modes of the slopes and at the same time, they control the developmental characteristics of the landslide disasters, which determines the research. The area belongs to a high-incidence area of landslide disasters, which further reflects the importance of landslide susceptibility assessments in the study area [88].
After statistics, it is found that the number of overestimations and underestimations is limited (the overall discrepancies range from 0.04% of the total area in SF-NB to 0.07% in SF-RBFN). Through visual analysis, it is found that the distribution of overestimation and underestimation is not random, and a certain spatial pattern can be found on the comparison map. The existence of this spatial pattern may be due to a systematic error in the susceptibility analysis [82]. Such systematic errors indicate the finiteness and effectiveness of the model. Comparing the SF model as the benchmark model with the NB model and the RBFN model, respectively, it is found that there are extreme values of overestimation and underestimation. The analysis found that the occurrence of overestimation and underestimation had some relationship with the geomorphology of the study area and formed a spatial pattern. Regarding the underestimated spatial distribution, two comparisons showed similar trends, both caused by the third type of soil (red clay soil). However, the red clay soil has a large clay content and is a good water barrier. Under the action of earthquakes or rainfall, it is easy for the overlying slope to slide slowly along the bedding plane [89]. The comparison of "SF-RBFN" is clearly driven by SPI (10-20) ("B−A" = 38%). Regarding the overestimated spatial analysis, two comparisons found that both are driven by the distance to rivers (0-200m) and have a great spatial overlap with the overestimated pixels. The distance to rivers (0-200m) in the "SF-RBFN" comparison includes all overestimated pixels. In general, compared with the other two models, the SF model can better utilize the geographic information of the distance to rivers in a susceptibility assessment. However, it is easy to ignore the two categories of soil (red clay soil) and SPI (10)(11)(12)(13)(14)(15)(16)(17)(18)(19)(20) that affect landslide occurrence in landslide susceptibility analysis.
All three models contributed to the assessment of landslide susceptibility in the study area, and the performance of Sysfor is better than those of NB and RBFN models. The NB method is one of the classic machine learning algorithms, with high efficiency, easy implementation, and being suitable for multi-classification tasks. However, it not only has certain sensitivity to the expression of input data, but also has unstable classification performance. Because the NB method is based on the assumption of the conditional independence of input parameters, obviously this assumption is not realistic. The results obtained in this study are consistent with the previous research results [90,91]-the prediction performance of the NB model is poor, especially in the case of landslide problems. The SysFor builds a forest (a set of decision trees) and can explore more logical rules from a data set, and this work cannot be completed by a single tree [71]. At the same time, related literature shows that SysFor has higher prediction performance than other commonly used prediction methods [92]. The Sysfor method is relatively new, so it is not widely used in landslide susceptibility mapping [93]. In this study, the performance of the NB model is better than the RBFN model. However, there is a phenomenon that the RBFN model is superior to NB model in the research of other scholars [94]. Therefore, the performance of the landslide susceptibility model is related to the geological environment characteristics of the study area. Furthermore, the popular machine learning methods (e.g., MultiBoosting, credal decision tree (CDT), Random forest (RF), etc.) in recent years, and their hybrid models, have performed well in LSM [4,95,96]. Therefore, there is still a lot of research space on the evaluation methods and models of landslide susceptibility in the study area.
In this paper, the methods used to verify and compare the performance differences among the three models are the ROC curve, SE and 95% CI. By observing the parameters of the ROC curve, it can be found that the AUC value of the SF model is the best. The AUC values of the training and validation are 0.876 and 0.783, respectively, and the average is 0.83. Secondly, for the NB and RBFNetwork models, the AUC values under the validation data set are 0.764 and 0.729. Furthermore, the Wilcoxon signed-rank test was used to perform a two-sided test on the three models. By comparison, it is found that the performance of the SysFor model differs from those of the NB and RBFN models. The three models have performed well in this study, and the results obtained in this study can provide a tool to reduce landslides in the area, thereby reducing the losses caused by landslide disasters.

Conclusions
Since landslides pose a serious hazard to human life and property, governments and relevant agencies in various countries are working to assess the susceptibility and risk of landslides, and by mapping, landslide susceptibility maps can help solve this problem. In this study, three landslide susceptibility maps were generated using RBFNetwork, NB, and Sysfor models. The LSVM model was used to evaluate the occurrence of landslides and 14 landslide-affecting factors. The validation results show that the areas under the NB, RBFNetwork, and SysFor model curves were 0.764 (76.4%), 0.729 (72.9%) and 0.783 (78.3%), respectively. At the same time, the SysFor model performs well under various statistical data. Therefore, the landslide susceptibility mapping generated by the SysFor model shows the strongest performance, followed by the NB model, with the RBFN model ranking last. After comparing the differences in the morphology of the landslide susceptibility map, it was found that systematic errors may occur in the susceptibility analysis, which proves that the susceptibility model has limitations and effectiveness. Finally, these landslide susceptibility maps may help local governments in landslide control and land-use planning, and they may also be used in other landslide-prone areas around the world.

Conflicts of Interest:
The authors declare no conflict of interest.