Assessing multi-hazard susceptibility to cryospheric hazards: Lesson learnt from an Alaskan example

Classifying a given landscape on the basis of its susceptibility to surface processes is a standard procedure in low to mid-latitudes. Conversely, these procedures have hardly been explored in periglacial regions. However, global warming is radically changing this situation and will change it even more in the future. For this reason, understanding the spatial and temporal dynamics of geomorphological processes in peri-arctic environments can be crucial to make informed decisions in such unstable environments and shed light on what changes may follow at lower latitudes. For this reason, here we explored the use of data-driven models capable of recognizing locations prone to develop retrogressive thaw slumps (RTSs) and/or active layer detachments (ALDs). These are cryo-spheric hazards induced by permafrost degradation


Introduction
Techniques aimed at estimating locations prone to hydrogeomorphic hazards have seen significant development since the inception of the susceptibility concept.In its most modern definition, susceptibility refers to the probability of a given process occurring at a certain location (Reichenbach et al., 2018).This definition has been applied in studying a number of geomorphological processes, spanning from landslides (Atkinson and Massari, 1998;Frattini et al., 2010), to water-based (Conforti et al., 2011;Titti et al., 2022a) and wind-based (Borrelli et al., 2014(Borrelli et al., , 2016) ) soil erosion and floods (Choubin et al., 2019;Wang et al., 2022a).A similar progress has characterized the modeling development each of these phenomena.For instance, Brabb et al. (1972) pioneered expert-based landslide mapping solutions, where geoscientists recognized susceptible areas on the basis of their experience.Similarly, Verstappen (1983) authored the guidelines for geomorphological mapping.With the advent of Geographic Information Systems (Longley et al., 2005), more numerically-oriented solutions were proposed, starting from heuristic weighting (Leoni et al., 2009) to a number of bivariate statistical tools such as certainty factors (Juliev et al., 2019), weight of evidence (Regmi et al., 2010).However, these tools all suffered being unable to provide rigorous probabilistic outputs (Lombardo and Mai, 2018).This is one of the reasons why most of the geoscientific community initially moved towards multivariate statistics, largely in the form of Generalized Linear Models (Atkinson and Massari, 1998;Quesada-Rom'an et al., 2019).
In a subsequent phase, machine learning tools have occupied most of the geoscientific literature.For instance, Liong and Sivapragasam (2002) explored using support vector machines for flood susceptibility mapping.This topic was further experimented on by Khosravi et al. (2018) who tested the use of decision trees also for flooding.Abedi et al. (2021) further extended the flood susceptibility modeling towards more modern extensions of decision trees, including XGBoost, random forest and boosted regression trees.Ghosh and Maiti (2021) shared a similar experimental design but in the context of soil erosion.Moving past flooding, machine learning solutions have also constituted the foundations for wildfire susceptibility assessment, with neural networks being initially one of the many options (Leuenberger et al., 2018).With time, their use covered an increasingly wider range of natural hazard modeling due to their flexible alternative architectures.For instance, Chen et al. (2021) used deep learning to predict locations likely to develop gully head-cuts.Ma and Mei (2021) tested analogous techniques to predict the deformation of volcanic edifices and Aguilera et al. (2022) in the context of coseismic slope failures.One common element can be found across all these examples.In fact, they have all been created to seek the best susceptibility performance in a data-driven context.
However, results are achieved at the expense of interpretability, the multivariate statistical framework ensures throughout its process.For this reason, albeit in lesser numbers, the geoscientific community has branched out to Generalized Additive Models (Brenning, 2008;Steger et al., 2021), a model archetype capable of producing high performance while keeping the interpretation clear (Goetz et al., 2011(Goetz et al., , 2015)).This is the general situation regarding the modeling aspects when it comes to natural hazards.As for the regions where such hazards were studied, most of the geoscientific literature gravitated around surface processes typical of mid-latitudes.Conversely, natural hazards typical of arctic environments have received much less attention.This is mostly due to the fact that periglacial regions host a drastically smaller human population and therefore, the need for understanding and modeling cryospheric hazards has historically been less prominent than elsewhere (Nicu and Fatori'c, 2023;Nicu et al., 2023).However, global warming is rapidly changing this situation (Hinzman et al., 2005).In fact, periglacial areas are undergoing temperature changes at a much faster rate than what happens at mid-latitudes.Rantanen et al. (2022) even shows a nearly four-time increase in arctic temperatures compared to the rest of the globe since 1979.In turn, this implies that cryospheric hazards have become more spatially and temporally common, with numerous observations reported by (Ding et al., 2021) across all high mountain ranges and peri-arctic regions (see Table 1, in their review).These aspects involve for instance the destabilization of human infrastructures (Nicu et al., 2020), the modification of sediment budgets along the river networks (Crosby, 2009), and the release of greenhouse gases (Abbott and Jones, 2015).The first issue relates to the potential damage and loss of human structures, both for their current (Hjort et al., 2022) and heritage (Nicu et al., 2021) values.As for the second, river banks usually held together by ice can fail once the ice thaws, thus introducing additional sediments along the channels, which are transported and deposited far away from their otherwise stable source (Tananaev and Lotsari, 2022).Greenhouse gases also constitute a byproduct of the periglacial changes we are experiencing in recent years.The mechanism involves freeing carbon dioxide (Turetsky et al., 2020) and/or methane (Klapstein et al., 2014) into the atmosphere.These fluids were originally sealed within frozen porous materials, which once thawed, have the potential of releasing large volumes of gases (Knoblauch et al., 2013).All these phenomena share a common root cause, this being usually referred to as permafrost degradation (Streletskiy et al., 2015).This degradation and consequent thawing of the ice geomorphologically leads to specific landforms whose evolution is considered a natural hazard in itself.Specifically, permafrost degradation commonly gives rise to retrogressive thaw slumps (RTSs), active layer detachments (ALDs) and thermogully erosional features.RTS are slope failures characterized by rounded or even horse-shoe shapes, whose evolution moves backwards (therefore the term retrogressive; Lacelle et al., 2010) over several seasons.ALDs are processes of similar origin whose failure occurs much more impulsively, leading to mass movements that can transport unconsolidated materials hundreds of meters away (Kokelj and Jorgenson, 2013).As for thermo-gullies, these are also cryospheric hazards but their evolution is strongly linear and usually occurs along terrain incisions (Kokelj et al., 2017).
Cryospheric hazards have historically received much lesser attention compared to their mid-latitude counterparts (Nicu and Fatori'c, 2023).This is clearly reflected in the amount of data accessible to the geoscientific community.In turn, this limits the ability to build data-driven models to predict where cryospheric hazards may develop in the future.Currently, very few experiments exist, these being mostly carried out in Alaska (e.g., Blais-Stevens et al., 2015;Behnia and Blais-Stevens, 2018).
Aside from the data availability issues and estimating whether RTSs and ALDs can develop in arctic environments could be crucial for several reasons.The most important of these is developing an understanding of cryospheric dynamics.In fact, by using the uncharted arctic territory to build up experience in data-driven models for periglacial processes, one could transfer their prediction to other areas where RTSs and ALDs may not currently exist, but their genesis will take place and increase in the years to come.For instance, the Alpine ranges (Sattler et al., 2011) and High Mountain Asia (Huang et al., 2020;Schmidt et al., 2020;Shugar et al., 2021) are experiencing similar hazard.The experience gained from the arctic context, where significant temperature changes are constantly observed (Rantanen et al., 2022), could be particularly relevant to developing mitigation strategies across these mountainous regions.Moreover, within the same arctic context, understanding susceptible areas to RTS and ALDs can help quantify potential changes in sediment budgets as well as greenhouse gas releases.
With these overall aims in mind, here we tested a Generalized Additive Model (GAM) to classify the northern Alaskan landscape into locations prone to experience RTS and ALD.Also, following the idea of developing an understanding of the dynamics in periglacial areas, we selected a (GAM) framework to ensure a suitable prediction together with a reliable interpretation typical of statistical approaches (Beck and Jackman, 1998).To do so, we exploited the large breadth of environmental information available in Google Earth Engine.Specifically, our modeling protocol can access this cloud repository, organize, preprocess and download the necessary information to locally build GAM-based predictive models for RTS and ALD.In short, the primary purpose of the present manuscript is to classify the territory under consideration with respect to two distinct processes, whose combined probabilistic patterns will be combined to produce a multi-cryospheric hazard map for Northern Alaska.The underlying assumption is that a combination of landscape and climatic characteristics can capture the variability in the respective RTS and ALD distributions.In doing so, we also aim to explore the drivers of these two cryospheric hazards.

Study area and cryospheric hazard inventories
The study area is located in the Far North Arctic Alaska, in the northernmost region of the United States, located above the Yukon river.According to the K¨oppen-Geiger climate classification, the area falls in the subarctic and tundra environments (Peel et al., 2007).Swanson (2021) recently provided an inventory of RTS and ALD for the northern Alaskan landscape and also described the state of the region in the last few decades (c. 1980-2019), highlighting an overview of local climatic conditions and their recent evolution.Specifically, in the last forty years, the area exhibited a mean annual air temperature between − 5 • C and − 8 • C. Conversely, the mean annual ground temperature from 2000 to 2009, ranged between − 3 • C and − 8 • C with local exceptions above − 3 • C.However, temperatures underwent a significant increase in Alaska with time.Stafford et al. (2000) observed a 2.2 • C air temperature increase during winter between 1949and 1998. Similarly, from 1950to 2017, Wendler et al. (2017) reported a mean annual air temperature increase of 2.1 • C.These patterns are also reflected in the soil column, with an increase ranging between 1 and 2 • C in the Brook range, (Osterkamp, 2005), and in the range of 0.5-1.5 • C slight eastward of our study area (Osterkamp and Romanovsky, 1999).For this reason, this Alaskan sector has been observed (Jorgenson et al., 2001(Jorgenson et al., , 2006) ) and modeled (Ling and Zhang, 2003;Jafarov et al., 2012) to be particularly prone to permafrost degradation.In turn, permafrost degradation is responsible for the number of cryospheric hazard presented by Swanson (2021).Specifically, the inventory accounts for 1295 RTSs and 5508 ALDs just within ~50,000 km 2 (Fig. 1).
Notably, with the aim to test a susceptibility model both for RTS and ALD, we also selected an additional area to be used for model transferability purposes (Fig. 1c and d; see, Rudy et al., 2016;Cama et al., 2017).

Mapping units
A fundamental requirement of any susceptibility model is the choice of a suitable mapping unit.These units are the basic spatial object upon which a given study area is partitioned and also represent the object to which the probability will be ultimately assigned.The choice usually falls on either regular or irregular polygonal partitions.The former corresponds to squared grids.For instance, these are commonly employed for wildfire (see, Leuenberger et al., 2018) or gully erosion (see, Cama et al., 2020) susceptibility mapping, or in a number of lava (e.g., Crisci et al., 2004) and debris (e.g., Avolio et al., 2013) flow modeling applications.An alternative to these regular objects can be found in slope units (see, Carrara et al., 1991), catchments (e.g., Bertrand et al., 2013) or administrative units (Gunther et al., 2013).Each one of these options influences the use of the susceptibility, with detailed mapping units often being useful for local master plans and coarser ones being required to support regional or national-scale territorial management practices.
In our case, we could not opt for a slope unit partition for most of the RTSs and ALDs occur also in relatively gentle slopes (where the automatic slope unit generation fails).Similarly, we did not use catchments and administrative units to avoid unnecessary generalizations of the results.Therefore, we ultimately chose a squared lattice, whose size we constrained to a 225 × 225 m 2 for two reasons.First, the areal extent of the study area implies that any unit size smaller than 225 × 225 m would have led to a drastic increase in computational burden.Second, a 225 m side is the same resolution as the DEM accessible through Google Earth Engine (Danielson and Gesch, 2011).Therefore, by choosing a standard reference, all subsequent operations for predictors' generation also became straightforward.Notably, any mapping unit choice is arbitrary and the main requirement to be satisfied is for a mapping unit to reflect the environmental characteristics responsible for the genesis of the process under consideration.In this sense, a 225 m side grid is close enough to represent the size distribution of the RTS (mean length = m, std.length = 111 m, max length = 1117 m) and ALD (mean length = 54 m, std.length = 79 m, max length = 957 m) polygons mapped by (Swanson, 2021).

Predictors
Another fundamental requirement for any susceptibility model is the selection of a predictor set capable of explaining the distribution of presence/absence data while respecting the physical understanding of the process at hand.In the case of cryospheric hazards induced by permafrost degradation, the predictor set has to include terrain, geological and climate-related characteristics.Here we chose a total of 11 covariates, these being listed in Table 1.
The DEM derivatives were computed in Google Earth Engine by using the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010; Danielson and Gesch, 2011).Among the covariates, the slope steepness is meant to convey the direction along which gravitational pull would act (Ramage et al., 2017).As for the slope exposition, we chose this property both as a proxy for strata attitude as well as for carrying the sunlight exposition signal in the northern hemisphere (Lacelle et al., 2015).The two curvatures are often used to indicate landscape concavity or convexity, shapes that control the acceleration of overland water flows along preferential directions (Ohlmacher, 2007).Geology is instead a proxy for the above soil column type, where RTSs and ALDs may develop (Blais-Stevens et al., 2015).The original document produced by the USGS reports the following: "this map and associated digital databases are the result of compilation and interpretation of published and unpublished 1:250,000-scale and limited 1:500,000-to 1:63,360-scale maps".As for NDVI, this conveys the presence of vegetation disturbance (Huang et al., 2020).Ultimately, mean annual total precipitation (Balser et al., 2014), thawing degree days (Lantz and Kokelj, 2008), mean annual July temperature (Jones et al., 2019), mean annual snow albedo (Cassidy et al., 2017), and mean annual snow cover (Kokelj et al., 2009) holistically describe climatic characteristics that can lead to RTS and ALS formation and their development.The last covariates were evaluated via Google Earth Engine, using the datasets ERA5 for total precipitation and MODIS for the other variables (NDVI, temperature ones, as well as snow cover and albedo).
We would like to highlight that computing the large set of covariates has historically been challenging.However, cloud computing solutions, such as Google Earth Engine, have made accessing, processing and downloading large data volumes a relatively easy task.To accomplish this task, we have created a Python script that essentially returns the data matrix necessary for the subsequent RTS and ALD modeling.The code is accessible at CryoS, and we made it open to replicate similar analyses or can be implemented in other regions.Notably, statistical models require removing any redundant covariate to avoid multicollinearity issues (Alin, 2010).Here, we show some preliminary analyses where we tested the pairwise correlation among the ten covariates we chose (excluding the geology; Fig. S1).

Generalized Additive Model (GAM)
We utilized a Generalized Additive Model (GAM) framework to map the Northern Alaskan landscape susceptibility to RTS or ALD, with two separate models built for each of these cryospheric hazards.GAMs are a type of semi-parametric models that combines the flexibility of nonparametric ones together with the interpretability typical of simpler linear models (Wood, 2006).The semi-parametric nature of GAMs comes from the fact that they use a linear model as the foundation and then apply smoothing functions (i.e.non-parametric relationships, such as splines) to the predictor variables (Hastie, 2017).These smoothing functions allow the model to capture non-linear relationships between the predictor variables (covariates) and the response variable (RTS or ALD occurrence) without making strong assumptions about the functional form of these relationships (Wood, 2017).GAMs have been used in various susceptibility studies, ranging from regional to local scales (Yalcin et al., 2011;Petschko et al., 2012;Titti et al., 2021).
More generally, a GAM can be used to explain data distributed according to several exponential family distributions (gamma, Gaussian, etc.; Wood, 2006).In our context, the response variable is represented by a binary dataset, indicating the absence or presence of cryospheric hazards at specific locations.For this reason, the ideal framework to model presences/absences of RTSs or ALDs corresponds to the binomial case, which assumes the two separate RTS and ALD dichotomous data to behave according to a Bernoulli probability distribution (Bryce et al., 2022).A binomial GAM can be denoted as follows: where η is the logit function, π is the probability that cryospheric hazards are present at a given location, β 0 is the global intercept and f i is the nonlinear function estimated for each of the covariate x i in the model.The output of a binomial GAM (Eq.( 1)) is expressed as a continuous spectrum of values that reflect the probability of RTS or ALD occurrence.To evaluate the performance of binary classifiers, various metrics can be considered and grouped into two main categories: cut-off dependent (Rahmati et al., 2019) and independent (Mende and Koschke, 2010) metrics.Cut-off dependent metrics involve the selection of a specific threshold value to reclassify the probability spectrum into a binary dataset, which can be matched against the initial observations.This leads to the computation of confusion matrices, from which accuracy, precision, recall, and F 1 -score (i.e. the harmonic mean between precision and recall) can be derived (Bertolini, 2021).In the remainder of this manuscript, we will use the Youden Index (for a detailed description, see Fluss et al., 2005) to estimate the best probability cutoff.In contrast, cutoff independent metrics rely on multiple probability thresholds to compute true positives and negatives, as well as false positives and negatives.These metrics include the receiver operating characteristic (ROC; Hosmer and Lemeshow, 2000) or the precision-recall curves (PR; Loche et al., 2022) and their respective area under the curve (AUC; Boyd et al., 2013;Hajian-Tilaki, 2013).
Binary classifications can be used both for explanatory (Lombardo and Mai, 2018) and predictive purposes (Lima et al., 2021).Explanatory assessments involve interpreting the functional relations estimated from multi-variate regressions of the presence/absence vector with respect to the covariate set; i.e. the model seeks to explain why and where these hazards take place, by identifying key factors and variables that influence their occurrence and distribution (Steger et al., 2021).This can be done using the full available information, as in our work, fitting 100 % of the grid cells in our study area.However, the estimated results cannot be directly interpreted for predictive purposes.Prediction is here intended as a probabilistic estimation over unknown areas (spatially) to a given classifier that has been trained elsewhere.The aim, in this case, is to estimate areas where the processes may currently be absent, but their terrain and environmental characteristics imply that they could manifest in the future (temporally).To pursue this goal, two common approaches are used instead.The most natural approach consists of measuring the prediction skill with subsequent hazard occurrences (Lombardo and Tanyas, 2020).However, this is rarely done due to the scarcity of multitemporal hazard inventories Guzzetti et al. (2012).Therefore, when only spatial data is available, a common routine for estimating predictive performance involves splitting the data into a portion used for calibration and another for validation.This assumes that spatial replicates mimic the behavior of temporal ones.Also, the training and test splits can be done in different ways.The simplest approach is pure random split, leading to the so-called random cross-validation (RCV; Roberts et al., 2017).However, this usually leaves the data structure unchanged, resulting in similar performances to the calibration ones.Another approach is commonly referred to as spatial cross-validation (SCV; Brenning, 2012), which uses a spatially constrained subset of the data and allows for the assessment of how well the model performs in specific sectors of the study area.SCV can reveal localized model performance, which RCV cannot detect.
This study makes use of all elements described above.The two cryospheric hazard data sets are used to generate separate presence/ absence instances, whose entire information is fitted to the covariate sets computed for the Far North Alaska landscape.As a result, the model output is suitable for interpretation, allowing for the exploration of each covariate effect.As for assessing the models' predictive skills, we perform two cross-validations (a 10-fold RCV and a 10-fold SCV), for both RTSs and ALDs.Finally, we tested the fitted model on a small test area eastward of the study area, as an additional mean to evaluate the model generalization (Fig. 1).
We highlight here that, the binomial GAM protocol we developed as part of this research is implemented in Python (Serv'en and Brummitt, 2018), using the pyGAM package (Serv'en and Brummitt, 2018).Through Python, the CryoS code we share at https://github.com/zincoblenda/CryoS allows one to access cloud-based data on Google Earth Engine, processes it and then generate susceptibility estimates.

Stepwise GAM
In the literature, several solutions are available to perform variable selection, including stepwise procedures (Atkinson and Massari, 1998;Beguer'ıa, 2006;Meusburger and Alewell, 2009), LASSO (Castro Camilo et al., 2017;Amato et al., 2019;Deng et al., 2021) or penalization.Here we implemented a stepwise forward selection routine as part of the best GAM model selection.Stepwise forward selection (SFS) is an iterative approach that aims at identifying the optimal set of variables that strikes a balance between performance and simplicity, reducing overfitting and improving the generalizability of the model Khan et al. (2007).This method boils down to building one model at a time, starting from the best single variable, then moving to the best couple, triple and so on, sorted according to the Akaike Information Criterion (AIC; Hu, 2007).The algorithm continues to add variables one by one until there is no significant improvement is achieved by adding further covariate information.At this point, the algorithm stops and returns the final set of predictor variables that provide the best predictive power while keeping the model simple and parsimonious.
In other programming environments such as R, stepwise GAM functions are available (see step.GAM in Hastie and Hastie, 2015).However, in Python this is not the case.For this reason, we implemented our own local "step.GAM" routine in Python and also shared it as part of the code accessible at (CryoS).

Variable selection
Fig. 2 shows the results of the SFS for both RTS and ALD binomial GAM models.In the ALD model, all variables were retained (Fig. 2b), whereas in the RTS model (Fig. 2a), the last three variables (namely, SNOWC, ASP and TDD) were excluded as they did not contribute to the model's performance, i.e. the AIC does not exhibit any significant decrease.Interestingly, while storing AIC values at each stepwise iteration, we also stored the AUC goodness-of-fit values, which are also reported in Fig. 2. In both RTS and ALD cases, the AIC and AUC curves show almost perfectly inverted patterns.For instance, in the case of RTS, even the AUC curve reaches an asymptote at JT, justifying the exclusion of SNOWC, ASP and TDD.Similarly, in the case of ALD, the AUC continuously increases up to the last covariate insertion.

Susceptibility modeling performance
We calculated both the goodness-of-fit and predictive skills.To do so, we used ROC and AUC values, both for the reference fitting procedure and two types of cross-validation, namely a 10-fold Random Cross-Validation (RCV) and a 10-fold Spatial Cross-Validation (SCV).Regarding the latter procedure, the spatial subdivision utilized for both models was generated by performing a k-means clustering of the coordinates of the study area's pixels.Fig. S2 provides a visualization of the resulting subdivision where each area marks the leave-one-out procedure used for validation.
The results (Fig. 3) show that the performance of the model falls within the "excellent" category according to the AUC classification proposed by (Hosmer and Lemeshow, 2000).However, upon closer inspection, the fit and RCV results are better, falling almost within the "outstanding" category (with means above 0.8 and below 0.9).Conversely, SCV returned lower performance.This is not surprising because an SCV procedure is meant to break down any spatial autocorrelation in the data (Brenning, 2012).Therefore, the SCV precisely serves as an important indicator of the prediction skill under a blind test where the model cannot rely on its native spatial structure.
In other words, a spatial cross-validation usually returns the worstcase scenario performance in any spatial model.This is also particularly evident when examining the uncertainty across bootstrap replicates and cross-validation types.In fact, the variability for RCV is particularly low since the random selection does not disentangle local   spatial dependence in the data.As for SCV, where the spatial dependence was perturbed due to the constrained local selection, the variability is much higher, although still within an acceptable range (acceptable AUC threshold = 0.7; Hosmer and Lemeshow, 2000).Specifically, for the ALD case, 10 out 10 replicates exceed the 0.7 AUC mark.Conversely, for the RTS case, 7 out 10 replicates do the same.
Another way to elaborate on model performance is to look at confusion plots (see, Amato et al., 2021), where the model accuracy is decoupled for presence and absence data.These are shown in Fig. 4, for both RTS and ALD, as well as for the results obtained from the fit, RCV and SCV.At a first glance, the figure illustrates that the variation between the fitted model and the two cross-validation tests appears to be small for the ALD case.This can be inspected by looking at the distance between the fit results (the only square symbols), comparing their position to the white symbols, which constitute the mean behavior of the two cross-validations.The same consideration is valid when looking at the uncertainty bounds.These are measured with a single standard deviation width from the mean, showing much narrower intervals for the ALD as compared to RTS.Aside from the relative assessment, the two models still appear to perform well also in absolute values, with even the RTS results being associated with accurate estimates, with very few exceptions.
The last attempt to showcase our model is shown in Fig. S3.There, the susceptibility patterns are obtained by locally solving the prediction function fitted over another study area.This procedure is commonly referred to as model transferability (Steger et al., 2022) or as validation with independent spatial data (Roberts et al., 2017) and it is often assumed to return worse performance as compared to tests that are run within the same study area where a given model is calibrated.This is confirmed even in this case, with barely acceptable transferred performances down to 0.7 of AUC in both cases.

Covariates' effects
To evaluate the covariates' effects on the final susceptibility estimates, we generated partial dependence plots for each covariate.These results provide a visual representation of the relationship between the predictor variables and the response variable, allowing us to assess the Fig. 5. Marginal plots of the covariates' effects estimated for the RTS susceptibility model.Notably, the y-axes are directly expressed at the response scale (in probability rather than at the scale of the regression coefficients).
impact of each covariate on RTSs and ALDs susceptibility, separately.The partial dependence plots for each term of the model are shown in Figs. 5 and 6.Notably, to improve readability, we have opted to plot the y-axis directly in the response scale (as probabilities) rather than in the linear predictor scale (as regression coefficients).
The two figures reveal that there are some notable similarities in the way certain covariates are influencing RTSs and ALDs occurrences.However, there are also marked differences between the two, suggesting that the covariates may be playing distinct roles in each process.Below we will present our interpretation for each covariate and for each of the two processes under consideration.

Geology (GEO)
The association of RTSs and ALDs with bedrock lithology can be difficult to analyze in permafrost regions due to several factors.In particular, these processes typically occur on the surface sediment of the active layer, rather than directly at the bedrock level.Our RTS (Fig. 5 and ALD 6) models showed that the lithologies belonging to the Endicott group (denoted with the numbers 6, 10, 13, 23; Appendix B) underlay areas prone to RTSs and ALDs occurrence (i.e., contributing with marginal probabilities consistently above 0.8 for each lithotype).The Endicott group is a type of clastic sequence consisting mainly of shale, sandstone, and conglomerate (Tailleur et al., 1967).From an interpretative standpoint, the constant positive contribution of such materials may reflect the potential instability of unconsolidated materials, as opposed to more massive and cohesive ones distributed over the study area.Likely, when the permafrost is healthy or in normal conditions, these materials are held together by the ice structure.However, when the permafrost starts to degrade, this clastic sequence is the first one in the area that experiences instability, something that both models statistically picked up, irrespective of the cryospheric hazard under consideration, The last lithotype worth to be mentioned corresponds to North Alaska Sedimentary rocks (denoted with number 24 in the figures and in Appendix B).Interestingly, this appears to promote RTSs (marginal probability = 0.85) and oppose ALDs (marginal probability = 0.26).One of the possible interpretations is that these sedimentary rocks are reported by Dillon et al. (1986) to have been mapped as part of the same formation, although they internally exhibit a significant degree of anisotropy due to the different nature of the constitutive material and relative granulometry.It is possible that the same anisotropy may favor one cryospheric hazard rather than the other, as a function of the respective failure mechanisms.

Slope (SLP)
These partial dependence plots (Fig. 5 for RTS and Fig. 6 for ALD) show that two cryospheric hazards generally behave with a similar probability decay at increasing slope gradient.However, some differences arise when looking at the covariate contribution in the first part of the slope range.Specifically, the probabilistic occurrence of the two processes increases with SLP up to 10 • .RTSs become much more unlikely after this threshold, while ALDs continue to show high occurrence probabilities (>0.8) up to 30 • .This indicates that RTSs may tend to form in relatively flat areas.
At the same time, ALDs can occur along steeper morphologies, presumably because of the higher shear stress provided by this terrain morphology (Balser et al., 2014).

Horizontal curvature (HC) and Vertical curvature (VC)
Areas with concave HC (i.e., negative values) are both prone to develop RTSs and ALDs (Figs. 5 and 6, respectively).These morphologies indicate terrain where water fluxes converge.This may lead to erosion along the central track (Ohlmacher, 2007), a phenomenon that can start the development of cryospheric hazards.As for the transition from linear to convex landscape curvatures, the probabilities drastically drop to zero.
The partial dependence plots of VC are similar to the HC ones.
Negative values of VC correspond to convex morphologies along the topographic profile.Profile convexity is responsible for vertical overland flow accelerations and therefore, similarly to the previous interpretation, the associated erosion (Ohlmacher, 2007) could lead to the formation and development of the two processes under consideration.The transition to positive VC values here indicates upwardly concave shapes, where the probability of both RTSs and ALDs become much lower.

Snow albedo (ALB)
Snow albedo (ALB) is an important parameter for determining the energy budget in high-latitude regions in winter (Li et al., 2023).The albedo effect generated by snow is generally much higher than that of other land cover types (Chapin III, 1993), and it is significantly associated with the solar radiation between the snow and atmosphere (Randall et al., 1994).ALB depends on many factors, including snow depth, snow age, vegetation coverage, vegetation canopy height, snow grain size, and internal mixing, but generally, its value varies between ~0.6 and ~0.9.This range corresponds to old and new snow, respectively.Values lower than 0.6 are typical of a less dense snow cover.Partial dependence plots (Figs. 5 and 6) showed that ALB has a similar impact on RTSs and ALDs susceptibility models.In both cases, the maximum marginal probabilities are reached between 0.6 and 0.8 mean annual ALB, suggesting that areas covered by snow for most of the year are more likely to produce RTSs and ALDs.This is potentially the case because of the specific information carries in the range 0.6 < ALB <0.8.Values below 0.6 may indicate locations where little to no snow is available through the year and, therefore, where permafrost is not available to behind with.As for values above 0.8, we enter the domain of a very dense snow mantle, which may persist for most of the year.Conversely, the inbetween range may favor cryospheric hazards because.For instance, snow-packs melting can increase the amount of free water.This gives rise to pore water pressure increase which is translated into a reduction of effective stresses at the thaw front (Lewkowicz, 2007).This mechanism is documented in several articles (e.g., Lewkowicz, 2007), together with the resulting presence of sliding events, whose manifestation is due to the combined action of the snow cover melt and the rapid thaw of the ice-rich transient layer (e.g., Lamoureux and Lafreni'ere, 2009).
In any case, a general interpretation is that the albedo reduction occurring after snow melt leads to a positive change in the energy budget at the Earth's surface.This may contribute to the deepening of the active layer (Streletskiy et al., 2015;Zheng et al., 2019), thus promoting the formation of RTSs and ALDs.

July temperature (JT)
Summer temperatures directly play a critical role in the formation of RTSs and ALDs, as higher temperatures can lead to more extensive permafrost thaw and ground surface instability.Several studies have shown a positive correlation between summer temperatures and the occurrence of RTSs and ALDs (e.g., Shiklomanov et al., 2010;Liljedahl et al., 2016).Looking at both the marginal plots, the JT contribution to RTSs and ALDs occurrences also reflects the same physical assumption mentioned above.In fact, high marginal probabilities are reached between 18 • and 22 • (see Figs. 5 and 6).Beyond this temperature range, a slight decrease can be noted in the probability of RTSs and ALDs, which may suggest that these regions, characterized by higher mean JT values, are less likely to be covered by permafrost or may have more sporadic permafrost coverage.

Thawing degree days (TDD)
This work defines thawing degree-days (TDD) as the number of days in a year in which surface air temperature is above zero.Therefore, TDD are used as a proxy measure of the amount of heat accumulated over a certain period and above a specific temperature threshold.This threshold is set at the level required to thaw frozen ground or ice.As a result, TDD are used to estimate the timing and duration of the spring thaw, which can affect soil moisture and water availability, hence the L. Elia et al.Fig. 6.Marginal plots of the covariates' effects estimated for the ALD susceptibility model.Notably, the y-axes are directly expressed at the response scale (in probability rather than at the scale of the regression coefficients).impact of climate change on permafrost and cryospheric hazards.
The partial dependence plots show a correlation between TDD and ALDs occurrence, highest and most meaningful for TDD between 40 and 60 days in a year (see Fig. 6).Above the 70 days' mark, the marginal probabilities are characterized by a slight drop, which may be related to the absence or poor permafrost coverage in regions that experience warmer temperatures during the year.

Snow cover (SNOWC)
Snow cover (SNOWC) impact on ALDs can be interpreted in a similar way as TDD and ALB.Regions characterized by low (<20) or high (>60) SNOWC are less prone to generate ALDs: on the one hand, permafrost is either absent or has a limited extent; hence there is not sufficient material to generate ALDs.Conversely, a high snow cover could prevent thawing and freezing cycles, thus avoiding the generation of ALDs.

Susceptibility mapping
This section translates the model results into map form.These are shown in Fig. 7, where the first element to be addressed is the different probability range reached by the models, for RTS and ALD respectively.In the first case the probability reaches a maximum of 0.02, whereas, in the second, the maximum is 0.07.We recall here that we used all the information in the study area.Therefore, we kept the natural proportion of cryospheric hazards' presences/absences towards unbalanced data sets.The greater number of ALD occurrences in the database has repercussions in the estimation of the global intercept, which is greater compared to the one estimated for the RTS.This in turn leads to a larger maximum between the two susceptibilities.Aside from these technical aspects, one of the most important considerations that arise from the two maps' observation is the way the two probability patterns appear.In fact, despite the two cryospheric hazards sharing the same genetic process in the form of permafrost degradation, they do not occupy the same landscape niches.The two susceptibility maps do not present analogous spatial patterns and further considerations on these aspects will be provided in the multi-hazard overview.For now, another model characteristic to be highlighted links back to performance considerations.The two separate models seem again to work extremely well, with the confusion matrix showing very high counts of true positives with respect to the total, both for RTS (TPRTS / [TPRTS + FNRTS] = 88 %) and ALD (TPALD / [TPALD + FNALD] = 87 %).This attests to the model's capacity to recognize susceptible locations.The complementary information is shown in the very low false positive counts for both.Focusing on the stable locations, these appear to be associated with high numbers of true negatives but also with high numbers of false negatives.The latter represents the most important information retrieved in this study.In fact, if we have shown that our respective models are able to accurately recognize susceptible locations to RTS and ALD, this implies that the high numbers of false positives may constitute locations where cryospheric hazards have not developed yet.In other words, the locations labelled as false positives are the ones that may generate RTS and ALD in the future.

Multi-hazard susceptibility mapping
The last part of the analyses is dedicated to the combination of the two susceptibilities into a single multi-hazard prediction map.This is a tool that offers the added value of presenting where two or more processes are more likely to occur (Lombardo et al., 2020).For this to be done though, the continuous spectra of RTS and ALD susceptibility need to be binned into a few classes.Here we chose the Fischer-Jenks method (Jenks, 1967).This procedure only requires the user to define the number of classes.Then an iterative procedure will select the thresholds that would lead to the minimum internal variance across bins (for more details, see Chen et al., 2013;Aguilera et al., 2022).We opted for four classes, whose combination returned the 16 multi-hazard levels shown in Fig. 8. There, in the western sector of the study area, neither RTS nor ALD are likely to develop.However, the situation rapidly transitions to the central sector, where the landscape appears to be susceptible to both and becomes much more scattered to the east.This type of visualization maximizes the available information and can support decision-makers in prioritizing risk reduction investments (Nicu et al., 2023).

Model strengths
The number of susceptibility studies dedicated to cryospheric hazards and their impact is becoming more frequent (Nicu and Fatori'c, 2023), although their number is still much lower as compared to scientific applications at mid-latitudes, for other types of geomorphological processes (Reichenbach et al., 2018).Our work attempts to bridge the gap between the two worlds, testing state-of-the-art data-driven solutions in peri-arctic conditions.The Alaskan territory is one of the most studied areas in relation to RTS (e.g., Swanson and Nolan, 2018) and ALD occurrences (Blais-Stevens et al., 2014).However, a comprehensive cryospheric hazard assessment of Northern Alaska still needs to be explored, especially considering multihazard aspects.This gives our experiment an additional value, although we mainly focused on methodological aspects.In fact, our work presents a protocol where the whole analysis can be essentially run in a single computing environment.Unfortunately, this is rarely the case (see, Titti et al., 2022b).In fact, even with our current technology, most of the published research in geospatial hazard modeling relies on different platforms to perform different steps of any analytical procedure.
Beyond computational considerations, our protocol offers both interpretation and high performance.This is because of our GAM choice, a particularly suitable modeling framework to explore and study covariate influences on spatial processes such as RTS and ALD.Specifically, the marginal plots offer a unique opportunity to understand how landscape and environmental characteristics may be responsible, at least probabilistically, for the two cryospheric hazard occurrences.As for the performance, both processes have been separately classified with excellent classification results across fit, RCV and SCV.The only moment where our models really suffered corresponds to the external validation performed by transferring the prediction in an area to the east.There, the results barely reached acceptable performances.However, this is mostly due to different outcropping lithologies in the area.This aspect of the model transferability is rarely accounted for in susceptibility studies, with only a few valid exceptions to this rule (Wang et al., 2022b).Here we highlight the model prediction skills both through external validation as well as through a spatial-cross validation routine.This is also something that methodologically is usually underreported or even entirely neglected in susceptibility modeling (Goetz et al., 2015).However, it provides a unique perspective on model performance.In fact, when the cross-validation of choice falls under the traditional random option, the model essentially stays the same, thus returning analogous performance to the fit.In other words, the perturbation the cross-validation applies to the data, compared to its original structure, is not enough to disrupt spatial autocorrelation effects from one replicate to another.This is not the case for spatial cross-validations, where entire chunks of spatial data are removed.The difference between types of cross-validations raises an important question, regarding which one of the two measures one should trust the most.In the context of geohazard modeling, one often seeks and calibrates decisions based on the worstcase scenario.For this reason, we believe the SCV to be the procedure that mimics the most how bad a model can perform and the extent to which one could rely on it.Similar considerations and concerns can already be found in Brenning (2012), although most of the research on data-driven approaches mostly disregards them.

Model weaknesses
We consider the notion of model transferability to be of particular relevance in the context of this experiment and for cryospheric hazards in general.In fact, we have observed that the model performance substantially decreases a few tens of kilometers away from the main area where we have built our model.Therefore, the question arises of how generalizable our model is if used to target the whole Alaskan territory.Our expectation is that it would likely worsen, even beyond the acceptability limit.To test this hypothesis, one would need rigorous RTS and ALD mapping standards, and public repositories to promote datadriven research.For instance, coseismic landslides (Tanyas ¸ et al., 2019;Lombardo et al., 2021) and their rainfall-induced (Stanley and Kirschbaum, 2017;Wang et al., 2022b) counterparts have some global susceptibility solutions.However, such standards or at least such global repositories do not exist for processes generated by permafrostdegradation.To provide the right foundations to create global models or even better-constrained regional ones, datasharing initiatives like the one promoted by Swanson (2021) should become commonplace.Unfortunately, without them, even efforts to employ state-of-the-art solutions to cryospheric hazard prediction will be very limited spatially.
Aside from the spatial aspects, another limitation of our model and in general of the majority of RTS and ALD studies is that if data is geographically scarce, when it comes to the temporal dimension it becomes almost non-existing.Extremely few exceptions do exist (see Balser et al., 2014), but they are confined to site scales.However, new developments in automated mapping may constitute the solution.Very recently, deep-learning routines have been develop to map RTS (see, Nitze et al., 2021;Yang et al., 2023), although most of the applications have been placed in Tibet (Huang et al., 2020(Huang et al., , 2021) ) and only a few are available in high-arctic regions (Witharana et al., 2022).In the case of ALD, their occurrence has been mapped through change-detection (Rudy et al., 2013).Irrespective of the cryospheric hazard type, these routines are yet to be consistently used to produce multi-temporal cryospheric hazard inventories.For other hazards such as floods (James et al., 2021), landslides (Amatya et al., 2021) or fires (Anderson-Bell et al., 2021) this is already the case.The generation of RTS, ALD but also thermo-gully inventories annotated with their spatial and temporal occurrence information could unlock space-time modeling applications.In the current state, we use the term predictive model to address our GAM.However, this is only correct from a strict technical perspective.In a data-driven context, prediction is a term dedicated to a model that estimates occurrences for data that it was never trained with.However, the common definition of prediction also includes, if not even exclusively, temporal aspects such as when or how frequently a given phenomenon manifests.
For this reason, we already envision possible extensions to our spatial GAM towards their space-time counterpart (e.g., Wang et al., 2022a).This could also unlock the use of the same routine for simulation purposes, moving away from the spatio-temporal domain under consideration and opening up predictions for targeted climate scenarios.

Conclusions
We propose a modeling protocol to estimate locations prone to develop RTS and ALD, and summarise this information in a multi-hazard susceptibility map for Northern Alaska.The binomial GAMs we test here follow the state-of-the-art in susceptibility modeling.These models are known to ensure high performance and a clear interpretation.This is also shown in this work, where the influence of environmental conditions is dissected to understand the influence of terrain and climatic characteristics in the landscape response.In this sense, what stands out the most for RTSs is the role of unconsolidated materials draped over upwardly convex and laterally concave terrains subjected to high July temperatures.ALDs share a similar starting point but further require long periods with temperatures above 0 • C combined with the presence of snow cover.Interestingly, these data-driven considerations align well with the physical understanding of the two processes at hand, and it provides a foundation for the prediction to be transferred elsewhere.However, when we tested our model specifically for transferability purposes, the performance we observed significantly dropped.Such an observation calls for action.In fact, the difference between the area where we trained our model and the area where we simulated the susceptibility mainly boils to unrepresented soil materials.As for the remaining predictors, these can be obtained from cloud repositories.For this reason, we prescribe more and more efforts in mapping cryospheric hazards, whose distribution in space could support better-trained datadriven models.Analogous considerations are also valid when looking at the temporal dimension.The use of automated mapping tools could unlock multi-temporal RTS and ALD inventories, from which one could build susceptibility models showing actual probability patterns may vary also in time.We conclude by stressing once more another innovative element in this work, specifically its multi-hazard focus.This aspect is particularly relevant because cryospheric hazards do occupy the same environmental niche and therefore understanding their dynamics and where their negative impact may overlap could be the key to tailoring local remedial actions, if possible.

CRediT authorship contribution statement
1. Letizia Elia has designed the experiment, conducted the analyses, produced graphical illustrations and written the manuscript 2. Silvia Castellaro has provided feedback on the manuscript 3. Ashok Dahal has contributed to the scripting aspects 4. Luigi Lombardo has designed and supervised the experiment as well as written the manuscript

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 1 .
Fig. 1.Panel a, shows the general geographic context of the study area; panel b, shows the actual details of the study are and of the locations where RTSs and ALDs have been mapped; Panels c and d offer two incremental zooms to locate the area where we extracted the external validation datasets and the distribution of the two cryospheric hazards in the region.

Fig. 2 .
Fig. 2. Variable selection for RTS a and ALD b binomial GAM models.The dark and light curves show the behavior of the AIC and the AUC in the SFS, respectively.

Fig. 3 .
Fig. 3. Modeling performance overview.First row (panels a, b and c) indicates the results for RTSs, whereas the second row (panels d, e and f) reports the ALDs.The thick lines for the two cross-validation schemes represent the mean ROC curve, whereas the filled area show the variability in the cross-validation scheme via a single standard deviation.

Fig. 4 .
Fig. 4. Confusion plots for RTS (a) and ALD (b).The square symbols indicate the results obtained from the fit.The circles refer to the RCV whereas the triangles refer to the SCV.The colored bands indicate the variability in the ten cross-validated replicates.As for the white symbols, they represent the mean behavior obtained for the two cross-validations.

Fig. 7 .
Fig. 7. Susceptibility maps and associated descriptive statistics for RTS (first row) and ALD (second row).The confusion matrices shown in the barplots are obtained using the Youden Index shown in the violin plots.

Fig. 8 .
Fig. 8. Multi-hazard susceptibility map.The RTS and ALD classes (four each), are defined using the Fischer-Jenks method, whose results are shown in the respective density plots.A two-dimensional barplot presents the distribution of the 16 multi-hazard classes over the study area.

Table 2
List of geology (GEO) categories and corresponding names.