Derivation and evaluation of landslide triggering thresholds by a Monte Carlo approach

Introduction Conclusions References


Introduction
Rainfall thresholds indicating landslide triggering are useful for the development of early warning systems in prone areas (cf., e.g., Keefer et al., 1987;Fathani et al., 2009;Takara and Apip Bagiawan, 2009;Baum and Godt, 2010;Capparelli and Versace, 2011).Commonly, such thresholds are derived by the analysis of historical rainfall and landslide data, and identified by drawing a lower-bound envelope curve of the triggering event characteristics (e.g., Campbell, 1975;Caine, 1980;Cancelli and Nova, 1985;Cannon and Ellen, 1985;Aleotti, 2004;Wieczorek et al., 2000;Guzzetti et al., 2007) or by enhanced methods which consider curves associated with a given frequency of non-exceedance by triggering events (cf.Brunetti et al., 2010;Peruccacci et al., 2012).A review by Guzzetti et al. (2007) indicated the prevailing use in literature of so-called power-law rainfall intensityduration (I -D) thresholds, which are of the form I = a 1 D a 2 , where D is rain duration to triggering and I is rain intensity I = W/D, W being rainfall accumulated over duration D. The a 1 and a 2 parameters have been derived by different researchers, for specific sites, regions or the whole globe.
Many factors of uncertainty affect the reliability of empirical thresholds, such as rainfall temporal and spatial variability, uncertain knowledge of the triggering instants, simplicity of threshold equation that does not include all control variables and statistical issues as well (Peruccacci et al., 2012).Nonetheless, it can be argued that most of the uncertainty stems from the availability and quality of the data used to derive the thresholds (Glade et al., 2000;Berti et al., 2012).
In fact, adequate historical data on landslides and simultaneous rainfall are in most cases available for a relatively short period, which may not be sufficiently significant from a statistical point of view.Moreover, the identification of the triggering instant is in many cases significantly uncertain and landslide archives are seldom complete (i.e., all landslide events occurred in the historical period are not known).This has a direct consequence on threshold derivation, because critical (where critical here means corresponding to landslide triggering) duration D, assumed as the time interval from rainfall event start and the triggering instant, cannot be computed accurately.Another key factor is the criterion used for rainfall identification, and in particular how the beginning of a rainfall event is identified.Many authors either do not specify the criteria used for rainfall identification or apply qualitative criteria, and indeed only few works in literature (Aleotti, 2004;Brunetti et al., 2010;Tiranti and Rabuffetti, 2010;Berti et al., 2012) explicitly addressed this problem.This makes thresholds subjective and impairs comparisons of results obtained by different researchers, as in analyzing the data the criterion may have been modified from one rainfall event to another.Another point is that in many countries automatic rain gauge networks have been installed only quite recently, and one has to rely on analysis based on rainfall records at the daily aggregation timescale (cf.Guzzetti et al., 2007, and references therein).Since many landslides, especially the most devastating shallow rapidly moving ones, may be triggered by rainfall events of a few hours (cf., e.g., Highland and Bobrowsky, 2008), use of daily rainfall for threshold derivation in these cases is quite questionable.
Apart from data quality issues, it can be pointed out that use of characteristic variables for the representation of rainfall events, and in particular of their intensity and duration, introduces an intrinsic uncertainty factor, because these variables may not be adequate to represent all the rainfall characteristics that affect landslide triggering.In fact, rainfall events represented by the same values of duration and intensity may correspond to totally different event time histories (hyetographs) that thus may or may not result in triggering.Sirangelo and Versace (1996) proposed an empirical method based on the use of convolution between rainfall time series and a filter function, which attempts to overcome this uncertainty.Also, use of solely the duration and intensity pair (D, I ) in threshold formulation implies that the effect of initial wetness on triggering rainfall is neglected.Regarding this issue, several authors have added to D and I antecedent rainfall as a control parameter, though the empirical analyses have not yet provided unequivocal indications on the role of antecedent rainfall and different researchers used diverse temporal horizons for its computation (Guzzetti et al., 2007).
Another important point is that many thresholds have been derived by analyzing triggering events only and thus neglecting the non-triggering ones.This may lead to an underestimation of the triggering conditions, i.e., to thresholds that implemented in a early warning system tend to produce an unacceptable degree of false alarms, causing populations to no longer rely on early warnings (so-called cry-wolf effect; cf., e.g., Barnes et al., 2007).In fact, thresholds always should be provided with a measure of their reliability.To this end, Berti et al. (2012) proposed Bayesian probabilistic analysis to evaluate landslide-triggering thresholds in the presence of uncertainty.Receiver operating characteristics (ROC) analysis (cf., e.g., Wilks, 2011), based on the analysis of correct and wrong predictions, may be advantageously applied as well (cf., e.g., Staley et al., 2013).
Alternatively to empirical models, physically based models that couple hydrological and slope stability analysis (Montgomery and Dietrich, 1994;Wu and Sidle, 1995;Iverson, 2000;Baum et al., 2002;D'Odorico et al., 2005;Rosso et al., 2006;Baum et al., 2008) have been proposed to assess landslide triggering by rainfall, with the advantage that they take explicitly into account the meteorological, hydrological and geomechanical processes and variables that determine landslide triggering.From such models physically based thresholds may be derived (cf., e.g., Rosso et al., 2006;Salciarini et al., 2008).Such thresholds generally deviate from a straight line in the log(D)-log(I ) plane; this casts some doubts on the use of the power-law as a proper functional form for deriving rainfall thresholds.In other words, because such thresholds were derived from a physically based model, this may be interpreted as an evidence that the use of the power-law form is not supported from a physically based standpoint.Nevertheless, in such studies the meteorological aspects were analyzed in a simplistic way, because the thresholds do not consider variability of rainfall intensity during events and the initial conditions are not computed as a function of rainfall time history preceding the current event.
In spite of the limitations that we have put into evidence above, I -D rainfall thresholds are widely applied for landslide early warning systems.Perhaps their success is due to the fact that simple forms of the threshold are more easily understood by stakeholders and decision makers than the more complex, albeit more accurate, physically based models.
In this paper, a Monte Carlo-based methodology to derive and evaluate rainfall landslide-triggering thresholds is proposed, which makes use of an existing body of stochastic and physically based models.The approach combines research findings in the fields of rainfall and landslide hydrological modeling to provide an output that can be eas-ily implemented in a early warning system, i.e., a landslidetriggering threshold, based on rainfall monitoring, of the same type that is commonly derived by direct empirical analysis of observed rainfall and landslide data.In particular, from the Monte Carlo simulations synthetic rainfall series are generated by a stochastic model and corresponding triggering/non-triggering conditions are identified by an hydrological and slope stability model.The generated data set is then analyzed to derive and evaluate I -D thresholds that take into account the variability of both rainfall intensity within events and initial conditions determined by past rainfall, as well as triggering/non-triggering events to measure uncertainty by ROC analysis.Furthermore, the derived stochasticinput physically based thresholds are compared with the constant-intensity physically based thresholds, which result from the simplistic assumption mentioned above (uniform hyetographs and prefixed initial conditions) in order to assess the effect on landslide triggering of rain intensity variability during events and variable initial conditions, computed as dependent by past rainfall time history.This analysis is related to the one by D' Odorico et al. (2005), in which the effect of rain intensity variability within events is studied by considering beta-shaped hyetographs inputs to the model of Iverson (2000) for derivation of hillslope response.Nevertheless, in their work, the variability of initial conditions as dependent from antecedent rainfall has not been considered because the steady-state asymptotic solution of Montgomery and Dietrich (1994) is utilized for computation of initial conditions.From their study they conclude that beta-shaped non-uniform hyetographs have a stronger destabilizing effect than uniform hyetographs of the same volume, since the associated return period of slope instability resulted higher in this last case.In this study we instead use hyetographs generated by a Neyman-Scott rectangular pulses (NSRP) stochastic model (Neyman and Scott, 1958;Kavvas and Delleur, 1975;Cox and Isham, 1980;Rodriguez-Iturbe et al., 1987a) and account for variability of initial conditions using a water table recession model to derive the initial water table height from the response to rainfall events preceding the current one, based on a linear reservoir mass-conservation equation with similar assumptions adopted by Rosso et al. (2006).The transient response to rainfall events is computed by a model based on the transient rainfall infiltration and grid-based regional slope-stability (TRIGRS) program (Baum et al., 2008).An application of the proposed methodology is carried out to the highly landslide-prone area of the Peloritani mountains, north-eastern Sicily, Italy.A sensitivity analysis on some of the most important control variables is carried out to analyze their effect on landslide-triggering thresholds and the associated uncertainty.

Monte Carlo synthetic data generation
The Monte Carlo simulation procedure for synthetic rainfalllandslide data generation consists of the following steps: 1.A stochastic rainfall model, calibrated on observations at a selected site, is used to generate a 1000-year long hourly rainfall time series.In particular we use a NSRP model (see Appendix A) 2. The synthetic rainfall time series is pre-processed in order to identify rainfall events and their inter-arrival durations.In particular, when two wet spells are separated by a dry time interval less than t min , these are considered to belong to the same rainfall event; otherwise two separate rainfall events are considered.Details on the choice of the t min simulation parameter are given at the end of this section 3. Some of the generated rainfall events are removed from the analysis because, according to the hydrological model, they will produce no significant variation of pressure head distribution, as their instantaneous (hourly) intensity is too low.In particular the events having maximum intensity less than i min are removed from the analysis.We assume i min equal to the leakage flux limit, given by c d K S (1 − cos 2 δ), c d being the vertical leakage ratio, K S the saturated hydraulic conductivity and δ the slope of the hillslope (see Appendix B) 4. Application of previous steps leads to the generation of N RE individual rainfall events 5. An initial value of the water table height is fixed to start simulations of the hydrological response for the whole rainfall time series.For the analyzed case-study area and many similar cases, it may be assumed that at the beginning of each hydrological year the water table is at the basal boundary, because an almost totally dry season had come prior to it (this may be a slightly conservative assumption, since pressure head at the soil-bedrock interface may assume negative values after a long dry season).As this is valid also for the first year, simulation for first event is conducted considering the water table at the soil-bedrock interface 6.The response to the generated rainfall events is simulated by the models described in Appendix B and the following procedure to be applied for i = 1, 2, . .., N RE : a. Response in terms of pressure head ψ within rainfall events is computed using the TRIGRS model (Baum et al., 2008(Baum et al., , 2010) )  Regarding the choice of the inter-event time t min -an issue that is the focus of some works in literature (e.g., Restrepo-Posada and Eagleson, 1982;Bonta and Rao, 1988) -we have followed an approach analogous to that used by Balistrocchi et al. (2009) and Balistrocchi and Bacchi (2011) -for which the inter-event time may be assumed as the minimum time needed to avoid overlapping of the response produced by two subsequent rainfall events.To this end we considered that the temporal peak of pressure head due to an individual rainfall event may be reached, as mentioned above, at an instant significantly after rainfall ceases.Hence, a criterion for selecting the inter-event time has been that of choosing a value that approximates the dry time interval that contains the peak pressure head response relatively to all the N RE simulated rainfall events.In our case, from preliminary simulations a t min = 24 h appeared suitable for the hydraulic and geotechnical soil properties which are considered in this work (see Sect. 3).
Figure 1 summarizes the main steps of the described Monte Carlo methodology.

Triggering and non-triggering rainfall identification
For a hillslope of given properties, Monte Carlo simulations lead to a series of time instants at which the factor of safety drops below the value of 1 (a FS = 1 down crossing).
A triggering rainfall may be associated with each down crossing, though it is noteworthy to point out that some uncertainty is present in the link between the actual failure of the slope and its theoretical instability.Nevertheless, following several works in literature (e.g., Iverson, 2000;Rosso et al., 2006;Baum et al., 2010) this uncertainty has been not taken into account here, though it may affect at a certain degree the way that rainfall events are classified as triggering and non-triggering and the subsequent ROC-based analysis (Sect.3.2).
We investigate thresholds that are based on rainfall intensity I and duration D. Various procedures have been used to identify and compute I and D, as discussed in the "Introduction" section of this paper and by Berti et al. (2012).From a general standpoint, this procedure may be disconnected from the way event separation has been performed to compute the triggering instants with the methods described in Sect. 2. Nevertheless, for consistency with the event separation criterion that is considered in the Monte Carlo simulation methodology, it is preferable to base the procedure for identification of triggering and non-triggering events on the same inter-event time t min used in Monte Carlo simulations.
Based on the considerations above, we adopt the following procedure for triggering and non-triggering rainfall identification.First, rainfall events are separated when their dry inter-arrival is longer than t min .Rainfall events then have a total duration D tot and mean intensity I tot = W tot /D tot , where W tot is the total event cumulative rainfall.For a triggering event, triggering may occur before or after the end of the rainfall event.In the first case, the critical duration D CR is the time interval that starts at the beginning of the rainfall event and finishes at the triggering instant, and critical intensity is given by I CR = W CR /D CR , where W CR is rainfall accumulated over duration D CR .In the second case it is instead characterized with D tot and I tot .Moreover, the P 0 events that have at their beginning a water table height h i ≥ d LZ ζ CR , ζ CR being the critical wetness ratio (corresponding to FS ≤ 1; see Eq. B6), are removed from the analysis, as the triggering is due to the preceding events, which have already been included in the set of triggering points.Non-triggering events are represented by D tot and I tot .
In our case (see Sect. 2) t min = 24 h.It is worthwhile to write that with this choice the procedure of triggering rainfall identification happens to be equal to the one that was applied by Brunetti et al. (2010) in analyzing empirical data (observed landslides instead of simulated).

Uncertainty and ROC-based evaluation and optimization of thresholds
The analysis of the Monte Carlo simulations produces two sets: the set of positives P , i.e., of triggering events, and the set of negatives N, i.e., of non-triggering events.These sets may be represented as scatterplots in a double-logarithmic (D, I ) plot, and in general there is a region where both sets are present -lets say, an intersection region P ∩ N. In our framework this is due to two separate factors: - is the mean intensity I = W/D (rain-intensity variability within events) -To a given (D, I ) pair there may be corresponding diverse initial conditions (variability of initial conditions, due to variability of rainfall before the current event).
The first uncertainty factor is analyzed by letting the initial water table height h i = 0 for i = 1, . ..N RE in performing the Monte Carlo experiments (indicated in the ensuing text as ψ 0 = 0 case; see also Appendix B).To investigate the second uncertainty factor, those experiments are compared with the complete ones, where the effect of initial conditions depending on past rainfall time history is taken into account by Eq. (B2), and considering different levels of memory, by varying the parameters that appear in the water table recession constant τ M (again see Eq. B2).
Moreover, we compare the results with thresholds derived from the model by assuming uniform hyetographs as input and a prefixed initial condition (constant-intensity physically based thresholds) (cf., e.g., Rosso et al., 2006;Salciarini et al., 2008;Tarolli et al., 2011).In this case a univocal triggering threshold exists I = f (D), for given hillslope properties, and the two factors of uncertainty illustrated above are not taken into account.Due to the analytical complexity of the TRIGRS (see Sect.B2) unsaturated model it is possible to determine these thresholds only numerically (not in closed form).Hence, we have derived these thresholds by simulation of infiltration and slope stability using constant-intensity hyetographs in the (D, I ) domain discretized at a sufficient level, and searching the triggering curve by interpolation of the results.In doing this we have assumed an initial water table height at the soil-bedrock interface in order to properly compare results with the stochastic-input physically based thresholds of the ψ 0 = 0 case.
As a consequence of the presence of the region P ∩ N , when a triggering rainfall threshold is fixed -e.g., a powerlaw one I = a 1 D a 2 -the four cases of true positives, true negatives (correct predictions), false positives and false negatives (wrong predictions) can occur, as illustrated in Table 1.In general, to each pair of parameters a 1 and a 2 corresponds a prediction performance that may be measured by indices based on the number of occurrences in the four cases, denoted respectively as TP (true positive), FN (false negative), TN (true negative) and FP (false positive) (or ROC-based indices).In order to derive optimal thresholds one may maximize an objective function based on these quantities.Several indexes do exist and their advantages and drawbacks have been discussed by different researchers (cf.Murphy, 1996;Stephenson, 2000;Frattini et al., 2010).
Among the various possibilities, we consider for threshold evaluation the use of the true skill statistic (TSS) (also known as Hanssen-Kuipers discriminant; Hanssen and Kuipers, 1965), which was introduced by Peirce (1884), and is given by the difference between the true positive rate TPR = TP P = TP TP+FN (also known as sensitivity or hit rate, or recall or probability of detection) and the false positive rate FPR = FP N = FP TN+FP (also known as probability of false detection or 1specificity): It is TSS = 0 for TPR = FPR (random guess) and TSS = 1 for a perfect prediction (TPR = 1 and FPR = 0).In fact this index TSS is bounded in the interval [−1, 1], but negative values are fictitious as an inversion of the triggering threshold use brings TSS to its absolute value, which is always in the interval [0, 1] (i.e., saying that values below the threshold trigger landslides and, vice versa, values above the threshold do not trigger landslides).Different weights may be given to the TP, TN, FP and FN, as pointed out by Peirce (1884), in order to account for the fact that a FN is more harmful than a FP (see also Peres andCancelliere, 2012, 2013).Since data on the possible weights to assume are usually scarce, in this paper we prefer to proceed in a more simple and standard manner, where this different weighting is not considered.We estimate the best performing power-law threshold I = a 1 D a 2 as the one that gives the maximum value of TSS = TSS(a 1 , a 2 ).
At the same time the simulation-optimization methodology enables one to evaluate the use of I -D power-law thresholds, as the value of the objective function is a measure of the maximum performances that can be expected from the adopted functional form for the threshold, and thus a measure of its validity.
It is noteworthy to highlight that the real uncertainty associated with this threshold generally yields different -likely worse -performances of that assessed here, since uncertainty factors are more than the ones related to the stochastic nature of rainfall listed at the beginning of this section.

Geological setting and soil properties
An application of the described methodology is carried out to the Peloritani mountains near the Ionian coastal area, in north-eastern Sicily, Italy (Fig. 2).The mountain ridge extends longitudinally for about 50 km, with a SW-NE orientation, resulting in peaks higher than 1200 m.This area can be subdivided into seven basins: (1) minor basins between Al- Spatial variability of each of the soil properties could be included in our model simulations.Nonetheless, detailed information on how the properties are distributed spatially is unavailable.Hence we preferred to carry out a sensitivity analysis, by varying the hydraulic conductivity K S , the leakage ratio c d and the soil depth d LZ according to Table 4, and the critical wetness ratio in the range 0 < ζ CR < 1.To proceed in this way enables one to better analyze the way model results are influenced by these variables rather than assuming that they are distributed spatially with interpolating laws of difficult validation.Since slope mainly affects slope stability (Eq.B5) rather than the infiltration process, variation of slope is indirectly taken into account by variation of ζ CR .It is noteworthy to mention that an alternative approach may be to consider model parameters generated according to a probability distribution, as proposed by the TRIGRS-P modification of the TRIGRS code, developed by Raia et al. (2014).

Rainfall data and NSRP model calibration
Climate in the Peloritani area is Mediterranean with with hot and dry summers, and precipitation -mainly convectivefalling mostly in the period from October to January.
For calibration of the NSRP model, the rainfall series measured at the Fiumedinisi rain gauge from 21 February 2002 to 9 February 2011 (almost 9 years) has been used (see Fig. 2).Based on a preliminary analysis of monthly statistics, six homogeneous rainfall seasons have been identified: (1) September and October, (2) November, (3) December, (4) January-March, (5) April and (6) May-August (see Fig. 4 and its caption).Separate sets of parameters of the NSRP model have been determined for each one of the four rainy seasons (in total 5 • 4 = 20 parameter values), while the last two seasons have been considered to have negligible rainfall.The Weibull shape parameter b has been fixed to 0.6 for all seasons, based on different trials.Parameters obtained from calibration are shown in Table 2.  From the assumed inter-event time t min = 24 h and soil properties of Table 3 the resulting number of rainfall events is N RE = 19 826 (in average 19.83 events per year).This number derives from the initial 28 751 events from which the events with hourly intensities below i min = c d K s (1 − cos 2 δ) = 2.975 mm h −1 were cut.These values are statistically comparable to the ones on the observed series (19.18 events/year from 28.91 events/year before the cut of underleakage events).

Derivation and evaluation of rainfall thresholds
In Fig. 5 the scatterplots of triggering and non-triggering events in the log(D)-log(I ) plane, derived from analysis of Monte Carlo simulations, are shown for the ψ 0 = 0 case and for specific upslope contributing areas A/B = 10, 20 m (τ M = 2.75, 5.49 days).Related results are also shown in Table 5.In the figure, red points represent triggering rainfall events, or the set of positives P , while green points represent the non-triggering ones, or the set of negatives N.
Optimal thresholds have been derived by maximization of the TSS index (see Eq. 1), preliminarily by considering both the power-law coefficient a 1 and exponent a 2 as variable parameters.Inspection of the results revealed minimal changes of the exponent a 2 with changing ratio A/B, and so a second optimization has been carried out only with reference to the a 1 parameter, fixing the exponent a 2 to its mean value of a 2 = −0.8.Fixing the exponents forces the different thresholds corresponding to different A/B ratios to be parallel and therefore to not intersect each other; this is somehow consistent with the fact that as the A/B increases, landslides are generally more likely to occur for less severe rainfall events because of increased past-rainfall memory.
From the case of ψ 0 = 0, i.e., of an initial water table at the soil-bedrock interface for all events (Fig. 5a), where simulated uncertainty of triggering is due only to the variation of rain intensity within events, it is seen that in this case the region in which triggering and non-triggering events coexist is quite narrow; moreover, a power-law relation between I and D dichotomizes well between triggering and nontriggering conditions.In fact, the optimal power-law threshold in this case has a reliability of TSS = 0.991, practically 1 × 10 −5 (36 mm h −1 ) 2.5 × 10 −5 0.1 1, 1.5, 2 5.5 2 × 10 −5 (72 mm h −1 ) 5 × 10 −5 0.05, 0.1, 0.2 1, 1.5, 2 2.7 3 × 10 −5 (108 mm h −1 ) 7.5 × 10 −5 0.1 1, 1.5, 2 1.8 equal to the ideal value of 1.Additional insights on the effect of variability of rainfall intensity within events may be derived comparing the scatterplots for this ψ 0 = 0 case with the constant-intensity physically based threshold (also determined considering ψ 0 = 0) represented in the plots of Fig. 5 as a dashed black line.Figure 5a reveals that the deterministic threshold approximates the lower envelope curve of critical events (red dots) for short durations -that, for the analyzed data, corresponds to durations less than about 12 h.For higher durations, this is no longer true and variable-intensity hyetographs start to have a higher destabilizing effect than the constant-intensity ones of same rainfall volume.The variability of rainfall intensity within events leads to a deviation from the deterministic line of the triggering NSRP rainfall-event points, making the scatter of triggering points more similar to a straight line than to the curved deterministic threshold.This behavior is essentially due to the presence of the leakage term q l = min{c d K s (1−cos 2 δ), q(d u , t)}, whose effect is stronger for uniform hyetographs than for variable ones, since in the former there are no peaks of intensity.In particular, a uniform hyetograph produces no water table rise if intensity is below a rate slightly greater than c d K s (1 − cos 2 δ), because all infiltrating water, after percolating through the unsaturated-zone, goes to basal loss.The same does not generally occur for a variable intensity hyetograph of the same volume, because instantaneous intensity may be significantly higher than the event mean intensity W tot /D tot , and consequently a water table rise is produced.The opposite behavior for short durations is due to the fact that in this case variable hyetographs may have peaks of intensity higher than infiltration capacity, and thus not all rainfall infiltrates into the soil.Due to these reasons, the model deterministic threshold results in a poor erformance (TSS = 0.642).
The above results lead to conclude that it is important to account for variability of intensity during events and that landslide occurrence related to the transient part of the response to rainfall events can be represented with good approximation by a I -D power-law equation.This provides a physically based justification for such a widely used threshold form, which turns out to be valid when landslide occurrence is mostly due to the transient part of the hillslope response to rainfall.
For the A/B = 10 m case (Fig. 5b), which may represent prevalent conditions for the Peloritani mountains area (see Sect. 4.1), scattering of the red dots increases due to the introduced variability of initial conditions.Consequently, performances of predictions based only on intensity and duration of rainfall events become worse.Simulations for larger values of the specific catchment area (e.g., A/B = 20 m, Fig. 5c) confirm this conclusion.
Based on these results, it may be stated that, for a given climatic input, performances of thresholds which do not account for past rainfall time history (antecedent rainfall) are expected to decrease as the water table recession time constant τ M increases.Rainfall time history occurring before single rainfall events generally influences landslide triggering, determining whether a threshold based only on rainfall intensity and duration may be sufficient or the I -D threshold needs to be improved by the introduction of antecedent rainfall variables.
Finally, the threshold may be a reasonable choice for the Peloritani mountains area since performances are still high, since TSS = 0.862.
Figure 4. Moments for each month for Fiumedinisi SIAS hourly rainfall data.In particular µ denotes the mean, γ the variance, ρ(1, 1) the linear autocorrelation coefficient at lag = 1, φ the probability of a dry interval, φ DD the probability that a given interval dry after another dry one, φ WW the probability that a given interval is wet after another wet one.These moments have been used in calibration of the NSRP model via the method of moments.It can be seen that there are low differences of most of the moments within the following groups of months: Sept-Oct, Nov, Dec, Jan-Mar, Apr, May-Aug.A separate set of NSRP model parameters was calibrated for each of the first four of these seasons, while the period April-August has been neglected from the successive analyses because precipitation rates are so low that it is very unlikely that a triggering event occurs in such period.

Validation of the threshold using observed data and comparison with other thresholds
The Monte Carlo simulation technique provides a framework that is useful for exploiting the information contained in the observed rainfall series and the physics of the modeled phenomenon.Nonetheless, it remains important to validate the results against observed data, to check if the models are capable of reproducing characteristics of interest which are not directly taken into account in model calibration and development.
Here we perform a global validation by comparing the derived threshold (Eq.2) with the triggering and non-triggering observed rainfall events.
In particular we have derived from the series the rainfall events with the same criterion adopted in Monte Carlo simulations.Yet the events in the months neglected there (April-August) and the events with intensities below the leakage flow c d K S (1 − cos 2 (δ)) were not removed here in the observed record, for the test to be unbiased to this preprocessing  of data.This resulted in 190 events, whose temporal evolution of accumulated intensities I (D) = W (D)/D has been compared with the derived threshold, as shown in Fig. 6.The figure indicates positive validation of the methodology, as the events in the I -D plane that exceed the threshold are all and only the four events that have triggered landslides in the considered period (red-line time histories).This is the best result one can obtain from this test, but it is perhaps noteworthy to clarify that it is expected that in the long period the same test will not perform without errors, consistently with the Monte Carlo simulations and the way that the threshold was derived.
Comparison with other thresholds may also help in understanding how reliable the performed analysis is.Gariano et al. (2013) proposed for Sicily the threshold E = 10.4D 0.22 , where E = I ×D is cumulative event rainfall, and hence threshold is equivalent to I = 10.4D−0.78 .This threshold has been derived considering only observed triggering events and it is corresponding to an exceedance frequency of 1%.It is firstly interesting to notice that the exponent is practically equal to the one that results from our analyses (a 2 = −0.8).Furthermore, as can be seen from Fig. 5b this threshold exceeds one triggering event of the Monte Carlo simulated data, which constitutes the 1% of the triggering-rainfall data set (see Table 5: 0.01×(TP+FN) = 0.01×(104+11) = 1.15).This result is a further support to the validity of the performed Monte Carlo analysis and highlights the importance to take into account non-triggering rainfall in assessing threshold performance.

Sensitivity analysis
A sensitivity analysis has been conducted with respect to the following variables (Sect.4 and Table 4): the hydraulic conductivity K S , the leakage ratio c d , the soil depth d LZ and the critical wetness ratio ζ CR .Plots similar to those of Fig. 5 can be derived for each set of values of such variables.For brevity those plots are not shown here and the analysis is performed considering the plots of the optimal threshold coefficient a 1 (again the exponent has been fixed to a 2 = −0.80)and the maximum value of the objective function TSS as functions of ζ CR .
The results of this analysis are shown in Figs.7-8, which can be commented on as follows: -Sensitivity to hydraulic conductivity: in the ψ 0 = 0 case the variation of K S induces relevant changes neither in the threshold nor in the performance TSS.It can however be hypothesized that considering more low values of K S , infiltrating rates more strongly depend on how rainfall is distributed within the event and thus uncertainty increases.Conversely, to an higher K S variation neither of the threshold nor of the performance may be observed, since infiltration capacity will always be higher than rainfall intensity, which then infiltrates totally.In the A B = 10 m case the variations of a 1 and TSS are relevant and due to increased memory with decreasing K S ; the threshold decreases with K S and the associated uncertainty increases (lower TSS).
-Sensitivity to leakage ratio: both in the ψ 0 = 0 and the A B = 10 m cases, the increase of the c d ratio induces an increase of the threshold and in the relative uncertainty.Such a variation is of comparable magnitude in the two cases.This happens because the variation of c d affects only pressure head response during rainfall events, but does not affect significantly memory due to antecedent rainfall.Sensitivity to c d increases with soil depth, because the increased water absorption in the unsaturated strata and the consequent increased damping and smoothing effect induces an increase of the portion of infiltrating water that goes to leakage.Indeed this affects more the threshold (a 1 ) than the relative performance (TSS).memory is relatively low, soil thickness is not too shallow and hillslope is naturally not close to instability (ζ CR is relatively high).In fact I -D power-law thresholds resulted in good performance (TSS > 0.8) when τ M ≤ 3 days, ζ CR > 0.5 and d LZ ≥ 1.5 m.The constant-rainfall physically based thresholds always perform poorly.This confirms that variability of intensity during rainfall events influences significantly rainfall intensity and duration associated with landslide triggering.

Conclusions
In this work it has been shown how stochastic rainfall models and hydrological and slope stability physically based models can be advantageously combined in a Monte Carlo simulation framework to derive and evaluate landslide-triggering thresholds.The approach synthesizes research findings in the fields of rainfall and landslide hydrological modeling to provide an output that is easily implemented in a early warning system, i.e., a landslide-triggering threshold, based on rainfall monitoring, of the same type that is commonly derived by direct empirical analysis of observed rainfall and landslide data.The advantages of the approach consist in a better exploitation of the information contained in observed rainfall series and measurements of hydraulic, geotechnical soil properties and geomorphological analysis.Because both triggering and non-triggering rainfall events are taken into account, the approach enables a more correct derivation and evaluation of thresholds, for which well-known predictionskill receiver operating characteristic (ROC) analysis may be advantageously used to reduce subjectivity in the identification of thresholds and to estimate the convenience of the use of the threshold within a landslide early warning system.
Furthermore, the specific modeling framework implemented in this work enabled to analyze some general issues on landslide-triggering phenomena regarding its controlling factors and uncertainty related to variability of rainfall intensity within events and past rainfall (antecedent rainfall), with a particular focus on the widely used power-law rainfall intensity-duration threshold form.In particular, from the application to the Peloritani mountains area in north-eastern Sicily (Italy) and the conducted sensitivity analysis on various controlling parameters, the following conclusions can be drawn: (1) variability of intensity during rainfall events significantly influences rainfall intensity and duration associated with landslide triggering.In particular constant-intensity input thresholds perform conservatively only for low rainfall durations, while the opposite occurs for events of longer duration.On the other hand, when a time-variable rainfallrate event is considered, the simulated triggering points may be separated with a very good approximation (i.e., true skill statistic is close to 1) from the non-triggering ones by a I -D power-law equation.This indicates that this widely used model is adequate to represent the triggering part due to tran-sient infiltration produced by rainfall events.Thus, this gives a physically based justification for such a widely used threshold form, which turns out to be valid when landslide occurrence is mostly due to that part.This depends, for a given rainfall climate, mostly on the timing of recession of the saturated zone occurring during dry inter-event intervals (in our model represented by the constant τ M ), but also on the other soil hydraulic and geotechnical parameters, and in particular on soil depth d LZ , which must not be too shallow, and critical wetness ratio ζ CR , that must be not too low.For instance, for the case-study area, the I -D power-law threshold performs with a TSS > 0.80 when it is τ M ≤ 3 days and d LZ ≥ 1.5 m and ζ CR > 0.50.(2) In general, rainfall time history occurring before single rainfall events influences landslide triggering, determining whether a threshold based only on rainfall intensity and duration may be sufficient or the I -D threshold needs to be improved by the introduction of antecedent rainfall variables.(3) Sensitivity analysis indicated that in general threshold performance is affected by the depth of the basal boundary and the critical wetness ratio that represents the natural degree of stability of the hillslope.In particular, uncertainty of landslide-triggering prediction increases as the soil depth or the critical wetness decrease.Hence, it is more difficult to predict landslides the more an hillslope is shallow and the more it is naturally close to instability.A decrease of performance is obtained as basal drainage (leakage) increases as well.Hence the I -D power-law may not be performing adequately in the case that the bedrock is significantly fractured and the soil cover is very shallow.Results also indicate that hydraulic conductivity mainly influences memory of past rainfall and only slightly affects the uncertainty related to variability of rainfall intensity within events.
Further ongoing research is oriented to introduce additional information in the derivation of the thresholds, such as antecedent precipitation as well as indexes representative of the shape of the hyetograph.

Appendix A: Stochastic rainfall model
Stochastic rainfall point models are aimed at the generation realistic synthetic time series of (virtually) unlimited length, by calibration based on a observed rainfall series.The uninitiated reader is invited to read Neyman and Scott (1958), Kavvas and Delleur (1975), Waymire and Gupta (1981a, b, c), Rodriguez-Iturbe et al. (1987a, b), Salas (1993) and Cowpertwait et al. (1996).Here we give some specific details on the NSRP model, for it being the one used in this work to model rainfall at a site.
NSRP process belongs to the so-called class of cluster models (cf., e.g., Salas, 1993).The NSRP cluster model consists in a two-level mechanism process.This process is related to a conceptualization of meteorological processes that originate rainfall events (Foufoula-Georgiou and Guttorp, 1989), and it is obtained by the following steps (see -Each cell has origin at time τ i,j with j = 1, 2, . .., c i measured from t i , according to an exponential random variable of parameter β -A rectangular pulse of duration d i,j and intensity x i,j is associated with each rain cell.Pulses have duration exponentially distributed with parameter η while intensities X are extracted from a Weibull distribution (cf.Cowpertwait et al., 1996), which has cumulative distribution function -Finally, the total intensity at any point in time is given by the sum of the intensities of all active cells at that point.
We have calibrated the NSRP model by the method of moments, i.e., using the properties of the aggregated NSRP process Y (τ ) at different timescales of aggregation τ (cf., e.g., Rodriguez-Iturbe et al., 1987a, b;Cowpertwait et al., 1996;Calenda and Napolitano, 1999).According to this method, model parameters, i.e., λ, ν, β, η and ξ (b is typically fixed, in the range 0.6 ≤ b ≤ 0.9; see Cowpertwait et al., 1996) are estimated using at least as many moments as the parameters of the model, considering different statistics (moments)  Cowpertwait et al., 1996).at various time aggregations, and solving the related equation system, where the theoretical expressions, containing the parameters, are equated to the sample moments.Theoretical moments of Y (τ ), such as the mean µ(τ ), variance γ (τ ) and autocorrelation at lag k, ρ(τ, k), are given by formulas derived by Rodriguez-Iturbe et al. (1987a).Transition probabilities were derived as well, by Cowpertwait (1991), and have been included in the calibration process.We have solved the nonlinear equation system by numerical minimization of an objective function S(λ, ν, β, η, ξ ), that measures the global relative error between theoretical and sample moments (cf.Cowpertwait et al., 1996).
Though calibration is conducted taking into account seasonality, by calibrating the model separately for the various homogeneous seasons within the year (Sect.4.2), it is noteworthy to point out that the generated series is globally stationary, and consequently eventual annual non-stationarity due to climate change is not taken into account.In other words, the generated series represents possible realizations of the rainfall events under current climate conditions, the final aim being of deriving thresholds suitable under the present climate and not to assess how climate change may affect them.The transient part is due to infiltration of event rainfall, while the initial part depends on rainfall time history before the current rainfall event.As pointed out by Iverson (2000), for soils that are relatively shallow, i.e., when the ratio between soil depth and the square root of the upslope contributing area is small, ε = d LZ /

√
A 1, the prevailing process that determines ψ 1 is 1-D vertical infiltration, while in the dry periods in between events, the prevailing process is sub-horizontal drainage.
Based on these considerations, we use a vertical infiltration model for computing ψ 1 , the TRIGRS unsaturated model (Baum et al., 2008(Baum et al., , 2010)), and a linear reservoir subhorizontal drainage model to compute the initial conditions from the water table height at the end of the rainfall event preceding the current one (which in turn depends on past rainfall time history).This latter model is derived from a massconservation equation of soil water coupled with the Darcy's law used to describe seepage flow, where for simplicity the soil volumetric strain is neglected (the variation of porosity with pressure head is assumed null).A similar conceptualization is the basis of the model proposed by Rosso et al. (2006).
In order to understand the controlling factors of landslide triggering, it is useful to separate the analysis of the response to rainfall in terms of the transient part only.This may be done by performing the simulations assuming ψ 0 = 0 at the beginning of events.
From the pressure head response, the factor of safety FS for slope stability is computed, using a infinite slope model.
The description of these model components follows.

B1 Initial conditions model
The initial condition to rainfall event i is computed from the response at the end of rainfall event i − 1, using a water table height h recession model between storms based on the following mass-conservation equation (Rosso et al., 2006): where A is the contributing area draining across the contour length B of the lower boundary of the hillslope, δ is the inclination of the hillslope, K s is the saturated hydraulic conductivity, and θ s − θ r is soil porosity, θ s and θ r being the saturated and residual soil water contents, respectively.The ratio A/B, which can be computed based on a digital terrain model (DTM), is the well-known specific upslope contributing area, an important variable on which the topographic control on shallow landslide triggering depends (Montgomery and Dietrich, 1994).For instance, it is A/B = B N d , where N d is the number of cells draining into the local one, if one determines flow paths via the non-dispersive single direction (D8) method (O'Callaghan and Mark, 1984).Other methods consider multiple flow directions (cf.Holmgren, 1994).
The solution to Eq. ( B1) is used to compute the water table height at the beginning of rainfall event i: where ψ(d LZ , t f,i−1 ) and the inter-arrival time t i are defined in Sect. 2. The time constant τ M regulates the pressure head memory from one event to another.The initial pressure head distribution above the water table is computed accordingly with assumptions of the transient vertical infiltration model (see next Sect.B2), letting the steady (initial) surface flux I ZLT = 0, which yields the following equation (see Baum et al., 2010;Baum and Godt, 2013): for depths Z ≤ d u , d u being the depth to the top of the capillary fringe and α the parameter of Gardner's (1958) exponential soil-water characteristic curve (cf.Fig. B1).

B2 Transient infiltration model
Reference scheme for a simulated hillslope is shown in Fig. B1.Infiltration in the unsaturated zone is modeled through Richards' (1931) vertical-infiltration equation for a sloping surface particularized for the Gardner's (1958) exponential soil-water characteristic curve K where K s is the saturated hydraulic conductivity, α is the SWCC parameter, ψ cf = −1/α is the pressure head at the top of the capillary fringe, θ r is the residual water content, θ s is the water content at saturation and α 1 = αcos 2 δ.
A closed-form solution to this equation for δ = 0 has been provided by Srivastava and Yeh (1991) and extended to a sloping surface by Savage et al. (2004), and used in the TRIGRS unsaturated model (Baum et al., 2008(Baum et al., , 2010)).
The solution to Richards' equation provides the pore pressure profile in the unsaturated zone, and a flux to the saturated zone q(d u , t).Because of the partial absorption of water within the unsaturated zone, this flux turns out to be damped and smoothed with respect to the infiltrating flux at the ground surface (cf.Fig. 8 of Baum et al., 2008).The TRIGRS model then computes water table rise using q(d u , t) and subtracting from it a leakage flow rate given by q l = min{c d K s (1 − cos 2 δ), q(d u , t)} (vertical drainage at the basal boundary, which is not assumed perfectly impervious), where c d represents the ratio between saturated hydraulic conductivities of the basal boundary layer and of the regolith surficial layer.In the case that no specific information on c d ratio is available, a reasonable value may be c d = 0.1 (cf.Baum et al., 2010), which means that hydraulic conductivity of the layer below depth Z = d LZ is of one order of magnitude less than the regolith surficial layer.
The resulting water table rise is computed by comparing this excess flux accumulating at the top of the capillary fringe to the available pore space directly above it.
Pressure head rise is assumed transient in the saturated zone as well, and computed by formulas adapted from analogous heat-flow problems (see Baum et al., 2008).

B3 Slope stability model
For analysis of hillslope stability we assume an infinite slope scheme, and compute minimum factor of safety FS with the following formula (Taylor, 1948): where c is soil cohesion for effective stress, φ is the soil friction angle for effective stress, γ w is the unit weight of groundwater, γ s is the soil unit weight and δ is the slope angle.In this scheme the failure occurs at the basal boundary Z = d LZ , because the pressure head results maximum at that depth.
It is useful to consider the critical wetness ratio, derived from Eq. (B5) letting FS = 1, which is a parameter that for a given hillslope (given slope δ and soil depth d LZ ) depends only on the geotechnical characteristics of the soil: The ζ CR varies from 0 to 1, respectively, for an unconditionally unstable and a unconditionally stable hillslope (Montgomery and Dietrich, 1994), and hence it indicates the natural degree of stability of the hillslope.

HydrolFigure 1 .
Figure 1.Scheme of the Monte Carlo methodology for derivation of landslide-triggering thresholds.Text in red indicates briefly the input required, while text in black indicates models or modeling phases.

Table 1 .
Confusion matrix for the possible success and failure cases of a warning process based on a landslide-triggering threshold I = f (D).Actual Landslide (P ) No landslide (N ) Predicted Landslide: I ≥ f (D) true positive, TP false positive, FP No landslide: I < f (D) false negative, FN true negative, TN

Figure 2 .
Figure 2. Map showing the location of landslide-prone study area of Peloritani mountains, Italy.The area may be subdivided into seven catchments as highlighted in the map.Relief is shown by a 5 m×5 m resolution digital terrain model based on lidar measurements in the year 2007.Location of Fiumedinisi raingauge is shown as well.The inner box contains the area surrounding the Giampilieri municipality, where a widespread rapidly moving landslide event killed 37 people on 1 October 2009 (see Fig. 3).

Figure 5 .
Figure 5. Derivation of thresholds from ROC optimization of Monte Carlo simulations.Red points represent triggering simulated rainfall, while green ones represent the non-triggering.The best power-law stochastic-input physically based thresholds (S) are derived by maximization of the TSS ROC-based index.The constantintensity input physically based threshold (C) is determined considering the response to uniform hyetographs and water table initially at the basal boundary.(a) zero memory case ψ 0 = 0, (b) A/B = 10 m (threshold derived by Gariano et al. (2013) for Sicily is shown as well, thin dotted line, for comparison with the one derived) and (c) A/B = 20 m.

Figure 6 .
Figure 6.Validation of threshold-derivation procedure with observed rainfall events.Red lines indicate mean intensity I (D) = W (D)/D time histories that exceed the derived threshold.Green lines represent observed events that do not exceed the threshold.Threshold is exceeded for all and only the four dates in which landslides occurred in the Peloritani area: (I) 15 September 2006, (II) 25 October 2007, (III) 24 September 2009 and (IV) 1 October 2009.

Figure 7 .
Figure 7. Sensitivity of triggering thresholds and of the relative performances to hydraulic conductivity K S (cf.Table 4).Plots show, for the two cases of zero memory (ψ 0 = 0) and A/B = 10 m, how the a 1 power-law coefficient of optimized stochastic-input physically based thresholds and relative TSS performance index vary with the critical wetness ratio ζ CR .Performances of the constantintensity physically based thresholds are shown as well (in green).Different soil depths are considered: (a) d LZ = 1 m, (b) d LZ = 1.5 m and (c) d LZ = 2 m.

Figure 8 .
Figure 8. Sensitivity of triggering thresholds and of the relative performances to the leakage ratio c d (fractured bedrock) (cf.Table 4).Plots show, for the two cases of zero memory (ψ 0 = 0) and A/B = 10 m, how the a 1 power-law coefficient of optimized stochastic-input physically based thresholds and relative TSS performance index vary with the critical wetness ratio ζ CR .Performances of the constant-intensity physically based thresholds are shown as well (in green).Different soil depths are considered: (a) d LZ = 1 m, (b) d LZ = 1.5 m and (c) d LZ = 2 m.

-
Variation of ζ CR and of soil depth d LZ : with increasing ζ CR the threshold increases, while the associated uncertainty decreases.The threshold and relative performances decrease with soil depth.This indicates that landslides become less predictable as soil depth d LZ and ζ CR diminish.Generally, antecedent rainfall has to be taken into account to improve performance of landslide-triggering thresholds based on rainfall.Nonetheless, the use of only the I and D variables may still lead to good performing thresholds when Hydrol.Earth Syst.Sci., 18, 4913-4931, 2014 www.hydrol-earth-syst-sci.net/18/4913/2014/

-
Fig. A1): -First, clusters -also known as storm-generating systems, or simply storms -arrive governed by a Poisson process of parameter λt (this is the first-level mechanism) For each cluster origin, rectangular pulses (rain cells) are generated (this is the second-level mechanism).The number of pulses C associated with each storm is extracted from another separate Poisson distribution.In order to have realizations of C not less than one, it is assumed that C = C − 1, with c = 0, 1, 2, . . .(which implies c = 1, 2, 3, . ..), is Poisson distributed with mean ν − 1

Figure B1 .
Figure B1.Soil 1-D vertical scheme used to model infiltration and slope stability based on the TRIGRS unsaturated model (adapted from Baum et al., 2008).
. The instant t f,i = max(t end,i , t max,i ) is looked for, where t max being the time instant at which maximum transient pressure head occurs.It follows that the final response to rainfall event i, in terms of water table height, is ψ(d LZ , t f,i )/β, where β = cos 2 δ (slope parallel flow is assumed), and d LZ is the soil depth.Moreover, the time interval t i+1 = t b

Table 2 .
Parameters of the NSRP rainfall model resulting from calibration on Fiumedinisi rainfall data, for the four homogeneous rainy seasons (the Weibull shape parameter has been fixed to b = 0.6).The assumption of a leaking basal boundary, characterized by a given c d ratio (see Sect.B2) is realistic, given the fractured bedrock strata.We assume that the hydraulic and geotechnical properties of Table3may represent the natural heterogeneity within the study area.

Table 3 .
Soil properties considered for application of the model to the Peloritani mountains case-study area.

Table 4 .
Varied soil properties considered for sensitivity analysis.

Table 5 .
ROC-based indices for the derived best power-law stochastic-input physically based thresholds (S) and comparison with constantintensity input physically based ones (C).