Integrating landslide magnitude in the susceptibility assessment of the City of Doboj, using machine learning and heuristic approach

ABSTRACT In this work, landslide assessment of the Doboj City area was modeled by combining machine learning and heuristic tools. The machine learning part was used to map the Morphometric factor. i.e. probability of landslides based on relation between the magnitude of events and morphometric parameters: elevation, distance to streams, slope, profile curvature, and aspect. The Random Forest and Support Vector Machines algorithms were implemented in the learning protocol, which included several strategies: balancing of the training/testing set size, algorithm optimization via cross-validation, and cross-scaling. The best performing Morphometric factor ap was created by learning on 50 m and testing on 25 m dataset. The heuristic part was used for modeling of Lithological factor and Land Cover factor maps, by expert-driven scoring of their units, within 0-1 range of values. The final Susceptibility map was obtained by multiplying all three factor maps resulting in a high-performing model with AUC=0.97 and acc=92%.


Introduction
Landslide inventorying and Landslide Susceptibility Assessment (LSA) has become a conventional planning tool in the past couple of decades, with increasing scientific and practical interest (Thiery et al., 2020).The LSA is of particular importance as it develops at an accelerating rate (Reichenbach et al., 2018).Even though it is not standardized, the LSA methodology has been widely accepted based on recommendations of the leading research groups and other distinguished publications (Cascini, 2008;Corominas et al., 2014;Varnes, 1984).Although there is a lack of uniform understanding of susceptibility and hazard assessment (Thiery et al., 2020), it is commonly perceived that susceptibility considers only spatial probability of the landslide occurrence deducted from terrain's intrinsic properties, while hazard also includes temporal probability and probability of specified intensity of future events.Hazard assessment thus contains three components: susceptibility, frequency and magnitude.
There are several common LSA approaches: heuristic; deterministic; statistic and/or machine learning (ML).The latter has additionally inspired researchers in the past few years (Merghadi et al., 2020;Pourghasemi et al., 2018) bridging the computational and geoscientific field of research.Quantity of research examples is commonly entailing the increase of quality, but the impression is that many researchers simply reapply well-known ML techniques to new case studies, which opens two considerations: the first one suggesting that it might be the time for standardizing the ML approach in LSA; and the second, suggesting that research should be further directed towards innovative solutions.The latter is followed in this work.In addition, we used somewhat unconventional concept of susceptibility and combined not only the spatial probability based on the inventoried landslide locations, but also their additional characteristics.Namely, since we were in possession of a detailed single-event (2014) landslide inventory, we decided to utilize additional relations between landslide features and terrain properties.The resulting map is not pure susceptibility, nor it is purely a hazard map, but rather something in betweena susceptibility map with integrated landslide magnitude.
This paper is not attempting to present yet another landslide zonation map (Main Map), leaving its applicability or its modeling issues unanswered.Relying on the authors' previous work, such as Marjanović (2013); Krušić et al. (2017); Peshevski et al. (2019); Marjanović et al. (2019); Đurić et al. (2019), this paper attempts to address some of the common issues while providing an insight in a successfully implemented ML in LSA mapping, with a heuristic touch, as well as innovative data preprocessing.
These issues include matching the optimal ML technique against the case study at hand, working with limited data, experimenting with different sampling strategies and working across different resolutions (cross-scaling), addressing the validation problem and finally, combining of heuristic and ML approaches.In the present paper in particular, we have implemented the following strategies: crossscaling, numeric data normalizing, non-landslide area masking and balancing between non-landslide and landslide sample size.
Another common issue in LSA is the visual and cartographic quality of the output maps.They are commonly generic, with a standard red-green (traffic light) color scale, while appearance and visibility of details, which are scale-depended, are sometimes neglected.Examples of good practice are given in Bernat Gazibara et al. ( 2019), Borrelli et al. (2018), Segoni et al. (2016), wherein landslide inventorying or LSA modeling is supplied with high-quality cartographic output map in large format.

Case study area
The Doboj City belongs to the Republic of Srpska (Bosnia and Herzegovina).It is located in its central part and covers an area of about 655 km2 (Figure 1).In accordance with the latest Census 2013, it consists of 83 settlements with a population of 77,224 in total.
The landscape is dominated by the wide valley of the Bosnia River, which stretches meridianally, predisposed by N-S trending faults, but curving around the city itself, due to transection of N-S trending faults by E-W trending ones.Surrounding the valley, hilly to mountainous areas (up to 915 m a.s.l.) are ascending toward the E and SE.Drainage pattern is relatively dense, with many tributaries also predisposed by regional structural features.Lithological constituents are primarily split into two belts, Paleogene flysch to the north and ophiolitic mélange to the south, while various Cretaceous deposits and Neogene basins are squeezed between them.Subdued, there are limited intrusions of acidic volcanic rock, patches of Triassic massive limestones and Paleozoic schists.Landslides are typically hosted in the ophiolitic mélange and Paleocene-Eocene clastites and flysch, as well as in the cretaceous weathered limestone and marl in the central part (Sandić et al., 2017).

Sources and inputs
The choice of input data may vary considerably depending on availability, but more importantly distribution of their values and classes.In this case study, some important nominal conditioning factors, such as Lithologic units and Land Cover were inapplicable for ML protocol due to inadequate distribution.In particular, landslides were too densely concentrated in very few Lithologic and land cover units, while being completely absent in most of others.For this reason, the input data set for ML protocol included only numeric conditioning factors of morphometric type (Table 1).
The above-listed conditioning factors are among the most common proxies in LSA and no further details are in this work dedicated to elaborate reasons and GIS techniques needed for including them in the analysis.It is sufficient to note that all of them were of numerical type, and all are introduced as 25 m raster files.On the other hand, nominal conditioning factors were quantified heuristically, by assigning expertdefined score to their units on 0-1 scale, according to their influence on landsliding (Table 2).The scores were based on experience of local geological engineers working on the landslide inventorying in wider areas around the Doboj City, and the logic is consistent (Table 2) with the common practice (weathered, tectonized, clay-rich and younger rock formations are more prone to landsliding, as well as arable land and poorly vegetated areas).Two output maps were produced, i.e.Lithological factor and Land Cover factor map, with 0-1 scale of values, and 25 m resolution (Figure 2).
Landslide inventory compiled for the specified area was a result of a detailed field campaign conducted in 2016-2018 targeted at registering effects of massive landsliding in 2014.At each reported landslide, inventory form was filled, containing information on location, date of acquisition, date of activation, landslide type and material, water content, landslide velocity, activity and trend, triggering mechanism, as well as metric descriptors (slope angle, length, width, estimated depth, scarp height, landslide shape) and damage assessment (land use class, affected objects, inflicted damage level, recommendations).Each landslide was outlined as a polygonal feature in GIS environment.The metric information (dimensions; velocity, which was scored into 7°) (Hungr et al., 2014;IUGS-International Working Group, 1995); and activity, which was scored into 6 different degrees according to Varnes classification (Hungr et al., 2014) were used for defining the magnitude of landslides.The magnitude was appended to each of the 79 landslide polygons and normalized to relative 0-1 scale by multiplication of three main proxies (Volume × Velocity × Activity).Quantification of magnitude enabled regression ML task as will be explained later.
Predominant landslide type (76 landslides) was shallow translational earth slide, while there were two landslides of debris flow type, and one of earth flow type, which were excluded from further analyses due to drastically different mechanisms, possibly unrelated to considered conditioning factors (Dikau, 1996;Hungr et al., 2014).Landslides velocities were estimated to range between slow or m/year (85%), to moderate or m/month (15%).Most of the landslides were dormant (52%), while there were also examples of the first-time activated (28%), conditionally stable (15%) and suspended slides (5%).All slides were primarily triggered by excessive rainfall (in 2014) and inflicted damage to housing objects and local roads, subordinately agriculture and infrastructure.Average landslide size was around 5000 m 2 (70 × 70 m) with relatively shallow depths (2.5 m on average).

Mapping methodology
The LSA mapping approach in the present work is somewhat specific.It considers a combination of three different factors: Morphometric, Lithologic and Land Cover.The final Susceptibility map is obtained by their simple combination, i.e. multiplication (Figure 3).The latter two were routinely derived by using heuristic scoring, as described before, while the first one, the Morphometric factor, was derived by using two well-known ML techniques: Support Vector Machines (SVM) and Random Forest (RF).Detailed overview of LSA applications of these two   techniques is beyond the scope of this work, but it is sufficient to note that they are increasingly popular lately (e.g. more than 300 hits on www.sciencedirect.

SVM algorithm
The SVM is a linear regression tool which separates linearly non-separable examples (e.g.landslide and non-landslide instances) in high-dimensional (hyper-)space of coordinates (i.e.conditioning factors).The kernel functions are used to transform the regular coordinate space, where the classes at hand are nonseparable, into a hyper-dimensional space where these classes become separable.To do so, the algorithm needs to optimize the linear regression line (Ax + B), by maximizing the misclassification margin size.The boundary instances contained by the largest margin are called Support Vectors, and the Ax + B regression line that sits in the middle of the margin becomes the separation rule.Once determined, the rule can be transformed back into the original space as a complex function (Ax + B → f ).As ML protocol involves separate, non-overlapping, training, testing and validating sets of instances, the rule f, learned over the training set is thus applied to the 'unseen' testing and validating instances.Validation set is used for fine-tuning of the algorithm parameters or for modeling performance evaluation.For a more formulated explanation of the SVM implementation in LSA the reader is directed to Marjanović, Kovačević, Bajat, and Voženílek (2011).The SVM algorithm is used for regression and classification tasks in ML context.Both are applicable in LSA, with commonly high accuracy, expressed through Random Operating Characteristics Area Under Curve (ROC AUC) (Merghadi et al., 2020).Since based on Support Vectors, it commonly needs fewer data instances to establish accurate models than other ML techniques.

RF algorithm
The RF is actually an ensemble learning algorithm which is comprised of a number of Random Tree (RT) algorithms.RT is an algorithm which resembles an upside-down tree, starting with a root node, branching into further nodes and ending with the leaves (end-nodes).Each node is populated by a coordinate (i.e.conditioning factor) which is randomly selected (using a subset of all available conditioning factors without repetition), then parsed into a subset of values, and forwarded to the next node.Parsing is performed by introducing if/else conditions (with appropriate algebraic or logical query in respect to the values of the chosen conditioning factor) and replicated down the tree structure until the leaves are encountered where no further parsing is possible.By aggregating all if/else conditions from the root to leaf, a set of rules that separates parsed instances is generated (e.g.landslide vs. non-landslide).Some conditions can be redundant, and the set can be simplified (pruning of the tree) for optimal performance.One can imagine a number of such

Modeling protocol
The ML technique is implemented in a regression task (numeric dependent variable, i.e. landslide magnitude).
A very specific environment that is characterized by a clustered, unevenly distributed landslide inventory is indicative for applying ML techniques, wherein training on limited or scarce samples are not necessarily an obstacle in the modeling process.It is however limiting in respect to the input data, as some types of conditioning factors which are common in LSA were not evenly distributed throughout the area, especially nominal data types, such as Lithologic or Land Cover maps.Therefore, the latter two were introduced as entirely separate factors, after the ML protocol was completed and produced a complex Morphometric factor.Given the average landslide size of 5000 m 2 , the 100 × 100 m pixel resolution (10,000 m 2 ) was not representative (too coarse).Therefore, two resolution variants were considered for input data: 50 × 50 m and 25 × 25 m (the original resolution).The entire ML dataset (both landslide inventory and all numeric conditioning factors) was first cloned into two variants: 25 m resolution and 50 m resolution.Namely, the cross-scaling concept (Đurić et al., 2019) implies training at coarser resolution and validating against higher resolution data.Thereby, the algorithms are not biased by intricate details and too detailed relations that can occur in finer resolution case.The '25 m' dataset was the original one, as all input data are supplied with that resolution.The '50 m' dataset was generated by resampling all ML inputs, i.e. all numeric conditioning factors, from 25 to 50 m raster cell size.The resampling method was the Nearest Neighbor, which is known for smoothing effects in raster processing.
The ML procedure (Figure 3) can be described by the following steps: − Step 1: All inventory landslides are appended with the magnitude value.

−
Step 2: 25% of landslide polygons from the inventory are chosen for validation and 75% for training (Figure 4

Results and discussion
Since the most complex task was derivation of Morphometric factor based on ML-based regression, the following parameters were calibrated in a trial-anderror procedure over the training dataset for respective ML algorithms: − SVM included combinations of the following parameters: .ε regression tolerance or a margin within which most of the instances should fall.Larger values are decreasing the accuracy of the regression, whereas smaller values are decreasing its generalization power.Small tolerance equaling 0.001 is herein kept constant, as generalization is considered via factor C and cross-scaling step. .C is the trade-off penalty for instances outside the regression tolerance.Higher values of C lead to potential overfit.Values 2, 10, 100 were considered. .γ is the Radial Basis Function kernel exponent which corresponds to higher dimension of the hyperspace, meaning that lower values are desired in terms of computational cost, but higher values are desired in terms of successful regression.The values 0.1 and 1 were considered.− RF included only one combination of parameters: . s is the number of trees in the forest, which can be computationally demanding if unreasonably increased.Herein, 100 trees were considered. .K (K = log 2n + 1) is the number of conditioning factors (n in total) randomly selected among the five available conditioning factors without repetition.It is calculated by default and equals 2.
All parameter combinations were repeated for both training protocols (25 m on 25 m, and 50 m on 25 m), totaling 12 SVM and 2 RF models.Their performance was assessed by Root Mean Square Error (RMSE) as well as Area Under Curve (AUC) metrics (Table 3).
It is apparent that cross-scaling had positive effect on the performance for all model variants, up to several percent (8-10%).In respect to SVM, it is apparent that lower dimensionality and penalty factor result in more accurate models.The best-performing model according to the RMSE is RF_s100_K2_50vs25 (Random Forest model using 100 trees fed by two random conditioning factors, trained on 50 m dataset and extrapolated over 25 m dataset), while the modellabeled SVM_C2_g0.1_50vs25(SVM model using penalty factor 2, kernel factor 0.1, tolerance of 0.001, trained on 50 m dataset and extrapolated over 25 m dataset) was the best in respect of AUC.Given that in LSA the AUC parameter is more commonly used than RMSE, and that RMSE is more appropriate for the goodness of the fit than assessing false positives  and negatives which are more important in LSA, slight leverage is given to the AUC results, and therefore, SVM_C2_g0.1_50vs25was selected as the model of choice.It is important to highlight that all these models involved only morphometric and derived parameters as proxies of the landsliding process.
The ML model of choice was labeled as Morphometric factor (Mf), which highlighted areas prone to landslides regardless of their Lithologic composition and Land Cover conditions.The final model was therefore, supplied with these two respective factors -Lithological factor (Lf) and Land Cover factor (LCf).All three factors were normalized to 0-1 scale of values, entailing that relative relationships apply (not absolute magnitudes).A simple multiplication of all three factors (Mf × Lf × LCf) resulted in the final landslide susceptibility model (Figure 3), labeled SVM_C2_g0.1_50vs25_LithoLand,and the according map is presented in the attachment in the main frame of the layout (also in Figure 5).The Mf model labeled SVM_C2_g0.1_25vs25(for comparison with its cross-scaled counterpart), the Mf model of choice labeled SVM_C2_g0.1_50vs25(for comparison of the influence of Lithologic and Land Cover factors) and the final Susceptibility model, labeled SVM_C2_g0.1_50vs25_LithoLandare given in Figure 5 for comparison.Natural breaks technique was used to separate the respective landslide susceptibility classes in all models, from Very Low (green) to Very High (red).It is visually apparent that the final Mf model was more successful in following general morphology of valleys, hollows, gullies (according to the morphometric inputs it was trained on), unlike RF variants for instance (pixilation, discontinuation of slopes).The Mf model of choice seems to overestimate the Moderate and High susceptibility class.Given that the Mf models are based on morphometric inputs only, the heuristic intervention which includes non-morphometric influences was needed to produce more realistic final Susceptibility model.Even though the AUC performance did not increase (it remained 0.97 as in the case of Mf) it is visually apparent that geological and environmental features are better exampled on this map (Main Map).One of the reasons for the same AUC performance perhaps lies in the fact that the landslide inventory available for the validation was relatively small and localized.The validation of the final Susceptibility model SVM_C2_g0.1_50vs25_LithoLandwas performed by considering only its VH Susceptibility class (VH class = 1, everything else = 0) against the rasterized landslide inventory (landslide = 1, everything else = 0).The contingency table (Table 4), shows very high accuracy (acc = 92%), but again, it is probably the effect of a small sample available for validation.The overestimation of the VH and H class is apparent.
Distribution of susceptibility classes is relatively evenly spread (Main Map).Very Low susceptibility (indicating that there is low likelihood of landslide of any kind to appear) occupies most of the area, i.e. 33%, primarily limited to flat plains along river valleys.Very similar distribution is present in Low, Moderate and High susceptibility, occupying 20%, 22% and 19%, respectively.These classes occupy gentle to steep slopes in various Lithologic and Land Cover domains, but in respect to potential landslide occurrence can be interpreted very differently.For instance, moderate class can include several medium-sized and moderately fast landslides reoccurring with medium likelihood within outlined zone, but also, a few large but very slow landslides, or many very small but rapid ones.Of smallest extent is expectedly, the Very High susceptibility class (wherein a few large or many small landslides of moderate to high velocity, with currently active or reactivated stage can occur), occupying only 6% of the total area.Inconveniently, this zone is located in vicinity of principal urban areas and infrastructural corridors.Unlike in other ML models, where 'salt and pepper' pixilation of resulting maps is present, Very High susceptibility class in the final model (SVM_C2_g0.1_50vs25_LithoLand) is logically distributed along the valley sides on steeper slopes and across Lithologic units that are known for hosting High and Very High susceptibility class, primarily flysch (Unit 16) and ophiolitic mélange (Unit 17) (Figure 6(a)), whereas forest (Unit 2) and agricultural areas (Unit 6) host most of the High and Very High susceptibility class.

Conclusion
This work presents the reader with a know-how for implementing ML regression task in LSA, with all relevant steps in respect of the preparation of data, sampling strategy (balanced number of landslide and non-landslide training data, cross-scaling, etc.).The proposed approach was based on combining Morphometric, Lithologic and Land Cover factors as landslide susceptibility proxies.The Morphometric factor was calculated using ML, wherein the landslide magnitude range (extracted from the available inventory) was utilized to enable the ML regression task.Lithologic and Land Cover factors were obtained heuristically, by expert-driven scoring.The principal finding of this work is that heuristically defined factors successfully redeem shortcomings of the ML regression.On one hand, the absence of landslides over the entire area implies the use of ML, but on the other hand, the learning is limited to numerical conditioning factors which have full areal coverage.Neither of them could stand alone in this case study, but their combination was highly appropriate.Another important finding is that cross-scaling has been proven useful in all ML models and seemed to influence performance more than SVM or RF parameter optimization.The generalization is thus, a desirable effect, even though the improvements are slight in terms of performance metrics (8-10%), but considerable in terms of spatial distribution of susceptible zones (logical shape of susceptibility zones in respect to slope units and landslide inventory).
The area of the Doboj City can be considered as prone to landslides, with significant portion of the area zoned as Very Highly susceptible (6%) and Highly susceptible (19%).The fact that urban and infrastructural area is affected is in favor of such consideration.The unaffected areas, having Low to Very Low susceptibility are located in the lowland, flat plains and gentle slopes along the river valleys, which, on the other hand, might not be immune to other types of hazard, such as floods and torrents.

Software
GIS processing was performed in ArcGIS 10, including input data preparation, heuristic scoring, multiplication of all three factor raster models, and visualization of the outputs.Training, validation and testing set were subsampled using the Excel spreadsheet, once the landslide and non-landslide data were exported from GIS environment in a table format.All ML-related steps were performed in Weka 3.9.Conventional desktop computer (16 GB RAM, 8-core at 2.6 GHz under 64-bit operating system) was sufficient to host all indicated processing and modeling, within a reasonable time.Final map was constructed in Corel Draw X7.

Figure 1 .
Figure 1.The wider study area location (left) and extent with the altitude information and landslide inventory outlined in black (right).

Figure 2 .
Figure 2. Lithologic map (a) and Land Cover map (b) with respective factor maps after heuristic scoring in 0-1 range: Lithologic factor (c) and Land Cover factor (d).
com for the SVM in LSA with linear rising trend from 2016 to 2022; and about 600 papers on RF in LSA, with exponential rising trend from 2016 to 2022).For detailed overviews, the reader is directed toPourghasemi et al. (2018),Merghadi et al. (2020) andShano et al. (2020).

Figure 3 .
Figure 3. Modeling procedure (elements are color-coded as follows: dark blueinput data, light blue -GIS modeling orangespreadsheet data processing, greencalibration, graymachine learning protocol; software logos are placed next to related elements; specific steps are indicated in brackets).
(a)).− Step 3: Landslide polygons are first converted into raster grids (25 and 50 m resolution), subsequently converted to instance points (with corresponding resolution).− Step 4: To counterbalance the landslide instances, an equal number of non-landslide instances, with magnitude of zero, were generated randomly outside the landslide polygons, by using a mask for areas very unlikely to host any landslides, i.e. flatland (<5°), ridge lines and solid rock areas (Figure 4(a)) and appended to landslide instances generated at Step 3. − Step 5: Calibration was performed within the training set using standard 10-fold cross-validation technique where several combinations for SVM and RF parameters were prompted by trial-anderror tests.The optimal parameters with best-performing metrics were adopted for all further modeling variants.− Step 6: Using the calibrated parameters, both SVM and RF regression algorithms were trained as follows: .Training on '25 m' and testing on '25 m' dataset. .Training on '50 m' and testing on '25 m' dataset resulting in a series of intermediate models.− Step 7: Visualizing the resulting maps in GIS environment as rasters, for both visual and parametric evaluation of Morphometric factor (ML) models.− Step 8: Heuristic scoring of Lithological units, using local expert experience and landslide inventory as a reference.− Step 9: Heuristic scoring of Land Cover units, using local expert experience and landslide inventory as a reference.− Step 10: Multiplying the best-performing morphometric factor model of Step 6 with a heuristically derived factors (Lithologic and Land Cover).

Figure 4 .
Figure 4. Training and cross-validation landslides with indicated non-landslide area (a); Training dataset including subsamples of the landslide and non-landslide instances colored and sized in respect to landslide magnitude (b).

Figure 5 .
Figure 5.Comparison of the best-performing Morphometric factor model Mf (a) and its non-cross-scaled variant (b), and the final landslide Susceptibility model (Main Map) SVM_C2_g0.1_50vs25_LithoLand(c); landslides are outlined in black for all three models.

Figure 6 .
Figure 6.Distribution of susceptibility classes against Lithologic units (a), and Land cover classes (b) (refer toTable 2 for class legend).
likely to host shallow landslides in contrast to forested and vegetated areas with considerable root cohesion; urban areas are prone due to engineering interventions and excesses on pipelines.Rationale for scoring Lithological units: solid rocks of magmatic or metamorphic origin, massive carbonates, cohesionless soil on flat areas are unlikely to host landslides; tectonized ophiolite, weathered flysch and poorly cemented deposits are likely to host landslides.

Table 1 .
Input data for ML protocolmorphometric and derived conditioning factors.
RTs put together, which compose a larger collection called Random Forest.The RF algorithm only controls the number of individual RTs involved, as well as the subset size (number of conditioning factors included, but commonly log 2n + 1, where n is the total number of con-

Table 3 .
Performance of the models against the validation dataset (Figure4(a)).

Table 4 .
Contingency table of Very High susceptibility class of the final model (SVM_C2_g0.1_50vs25_LithoLand)against the available landslide inventory (number of matching pixels).
Table 2 for class legend).