Determination of Rainfall Thresholds for Landslide Prediction Using an Algorithm-Based Approach: Case Study in the Darjeeling Himalayas, India

Landslides are one of the most devastating and commonly recurring natural hazards in the Indian Himalayas. They contribute to infrastructure damage, land loss and human casualties. Most of the landslides are rainfall-induced, and this relationship is well established; it has commonly been defined using empirical models, which use statistical approaches to determine the parameters of a power-law equation. One of the main drawbacks of the traditional empirical methods is that they fail to reduce the uncertainties associated with threshold calculation. The present study addresses these limitations by identifying the precipitation conditions responsible for landslide occurrence using an algorithm-based model. The methodology involves the use of an automated tool which determines cumulated event rainfall–rainfall duration thresholds at various exceedance probabilities and the associated uncertainties. The analysis has been carried out for the Kalimpong Region of the Darjeeling Himalayas using rainfall and landslide data for the period 2010–2016. The results indicate that a rainfall event of 48 h with a cumulated event rainfall of 36.7 mm can cause landslides in the study area. Such a study is the first to be conducted for the Indian Himalayas and can be considered a first step in determining more reliable thresholds which can be used as part of an operational early-warning system.


Introduction
Landslides are one of the most destructive natural disasters, causing massive loss of human lives and property [1]. Globally, the triggering factor for most landslides is rainfall [1]. The global dataset of landslide events between 2004 and 2016 showed that 75% of landslides occurred in Asia, with a considerable incidence in the Himalayan arc [2], which accounts for more than 70% of landslide-related fatalities across the globe [2]. One of the most affected regions in this area is the Darjeeling-Sikkim Himalayas, which accounts for 40% of the land area in India susceptible to landsliding [3]. Landslides in this region cause havoc during the monsoon season, incurring substantial human and financial losses and disrupting the livelihood of local people. The problem is likely to worsen due to over-exploitation of natural resources, deforestation and the growth in infrastructure development driven by an increasing population [2]. One of the most affected areas in this region is Kalimpong, where a high number of landslides are initiated by monsoonal rainfall (Geological Survey of India Report, 2016).
Geosciences 2019, 9, 302
The most common method to analyze the relationship between rainfall conditions and landslide incidents is statistical analysis [4][5][6][7][8][9]. The statistical methods determine a threshold in either the intensity-duration (ID) or cumulated event-duration (ED) plane above which landslides are likely to happen [5,9,10]. However, there are certain drawbacks with their use: (i) the availability and quality of the input data, which may lead to an increase in uncertainty [6,11,12]; and (ii) the identification and definition of the rainfall event [13]. The authors of [13] also highlighted that a common shortcoming of traditional rainfall threshold approaches is the subjectivity of the analysis, which makes it non-replicable, together with the estimation of the real rainfall event, i.e., the actual amount of rain responsible for triggering the landslide. To overcome such drawbacks, researchers have developed techniques for an objective definition of empirical rainfall thresholds [14][15][16] and have introduced several new methods for establishing thresholds to be used in early warning systems [17]. An automated model has been developed to reconstruct the rainfall conditions responsible for landslide events and determine the thresholds using both rain-gauge and satellite data [18]. Cluster analysis has been used to calculate rainfall thresholds and determine the safety factor for developing an early warning system. Recently, machine learning and artificial intelligence techniques have also been used to forecast landslides [19][20][21].
Research on threshold estimation in the Indian Himalayan region is minimal and is confined to defining intensity-duration (ID) thresholds (Sikkim [22]; Uttarakhand [23]; and Kalimpong [3]). Threshold estimation for the Indian Himalayan region has so far not used an objective definition of thresholds, which is crucial for better predictability of landslides. For the present study, rainfall thresholds have been defined for the Kalimpong Region using a semi-automatic empirical approach, namely the Calculation of Thresholds for Rainfall-Induced Landslides-Tool (CTRL-T), with rainfall data and landslide records from 2010-2016 [15,24]. The tool uses an algorithm to reconstruct rainfall triggering conditions and uses statistical methods to determine cumulated event rainfall-rainfall duration thresholds and quantify their uncertainty [15,24].

Study Area
The study region is Kalimpong (26.07°-28.07° N, 87.47°-89.47° E), enclosed by the Teesta and Relli rivers in the east and west respectively (Figure 1). The region has an elevation ranging from 400 to 1665 m above sea level (a.s.l.) and is drained by numerous mountain streams, which have further aggravated the landslide problem in the area [3,25]. Figure 1 depicts the hydrogeological map of Kalimpong, where 82% of the region lies between 804 and 1440 m. The geology of the region comprises rock types varying from banded gneiss, schist, and sandstone with shale to younger alluvium, with rock age ranging from Archean to Quaternary [3,26]. The rock most susceptible to landsliding in the region is mica schist (GSI Report, 2016).
Landslides in Kalimpong are mostly shallow in nature, with depths varying from a few centimeters to meters. The landslide types, in terms of material and movement, are rockfall, debris slide, and debris flow, as identified by the Geological Survey of India [27]. Slide activity has generally been on an increasing trend with every passing monsoon, which can be attributed to several factors such as rainfall, improper drainage, toe cutting by the Teesta, and an increase in construction activities [3]. The landslides in the region predominantly occur due to heavy monsoonal rainfall and surface drainage. The water from the mountain streams flows down the hill and serves as a water source for the people. During monsoons these streams overflow and, due to the improper drainage system, they often get built up, leading to extensive scouring and erosion [3]. The increase in construction activities has further intensified the issue of landslides, as it blocks the passage of water flow. A field survey after the monsoons of 2016 and 2017 revealed that most of the landslides occurred along road cuttings, which is indicative of both construction practices and improper drainage systems. The major roads connecting the city to Kalimpong often get disrupted during monsoons due to landslides, affecting both tourists and locals alike. Figure 2 shows the damage caused by rainfall in various years.

Data
The input files required for the model are the rainfall data series with the rain gauge location, along with the landslide occurrence dates and coordinates for events between 2010 and 2016 [24,28]. The daily rainfall data were collected from a single rain gauge maintained by the Save The Hills organization. During this period, the average annual and monsoonal (June-September) rainfall was 1850.5 mm and 1637.7 mm respectively, which indicates that 88.5% of yearly rainfall occurred in the monsoon season (Figure 3). The average rainfall during the monsoon season is 13.3 mm/day. For the investigated period, the maximum and minimum monsoonal rainfall occurred in 2014 and 2012, respectively. The variation in the minimum and maximum values of rainfall for various days is significant and is on a rising trend for every year.
The total number of landslide events identified for the same time period was 99, of which 61 landslides were triggered by rainfall (Figure 1b). During this period, 18 landslide events occurred in 2010-2012, with the majority in 2010. Most of the landslides (87%) occurred during the monsoon season, with the highest number in July. For the analysis, landslide events are counted per day, i.e., multiple landslide occurrences on the same day are counted as a single event. Figure 3b depicts the average daily monsoonal rainfall along with landslide occurrences during the study period.
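The one-event-per-day rule can be illustrated with a short sketch. The record layout (date, latitude, longitude) and the sample values are hypothetical and are not taken from the study's landslide inventory.

```python
from collections import defaultdict
from datetime import date

def group_by_day(records):
    """Collapse landslide records into one event per calendar day.

    `records` is an iterable of (day, lat, lon) tuples (a hypothetical
    layout). Returns a dict mapping each day to the locations reported
    on that day, so that each key counts as a single landslide event.
    """
    events = defaultdict(list)
    for day, lat, lon in records:
        events[day].append((lat, lon))
    return dict(events)

# Made-up sample: two slides on the same day, one on a different day.
sample = [
    (date(2015, 7, 1), 27.06, 88.47),
    (date(2015, 7, 1), 27.08, 88.45),
    (date(2015, 7, 3), 27.05, 88.49),
]
events = group_by_day(sample)
print(len(events))  # number of days with at least one landslide
```

Keeping the individual locations under each day preserves the information needed later to measure the distance between a landslide and the rain gauge.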

Methodology
The reconstruction of rainfall thresholds is performed by means of the Calculation of Thresholds for Rainfall-Induced Landslides-Tool (CTRL-T) proposed in [15,24]. The tool uses an algorithm to automatically extract rainfall events from daily rainfall series, reconstruct the triggering rainfall conditions responsible for landslide occurrences, and calculate rainfall thresholds at various exceedance probabilities. The inputs required by the tool are the locations and dates of occurrence of the landslides, the coordinates of the rain gauge and the hourly rainfall series. The tool considers the effect of spatial variability by drawing a circular buffer centered on each landslide location [24,28]. The authors of [24] suggested a buffer radius of 5 km, although studies in Kalimpong, as well as in nearby locations, have used a radius of 15 km, since a large search radius is needed when the rain gauge network is sparse (i.e., low rain gauge density); moreover, a small radius would leave many landslide events without a reference rain gauge [28,29]. The tool works in three steps: (i) the input is received as a continuous rainfall series, and distinct rainfall events are reconstructed, determining for each the duration (D) in hours and the cumulated event rainfall (E) in mm; (ii) to minimize the effect of the spatial variability of precipitation, a rain gauge is selected for each landslide by reconstructing the single or multiple rainfall conditions (MRC) most likely to have resulted in failure and assigning a weight to them; (iii) finally, the tool reconstructs the MRC within the selected rainfall event and assigns each a weight (w) proportional to the cumulated rainfall and the mean intensity of the MRC and to the inverse square of the distance between the rain gauge and the landslide.
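The three steps can be sketched roughly as follows. This is a simplified illustration and not the CTRL-T code. The dry-spell criterion `min_dry_days`, the 0.1 km distance floor, the unit proportionality constant in the weight, all function names and all sample values are assumptions made here. Each reconstructed event is also treated directly as a candidate rainfall condition, which is a simplification.

```python
import math

def extract_events(daily_mm, min_dry_days=2):
    """Step (i): split a daily series into (D_hours, E_mm) events.

    An event ends after `min_dry_days` consecutive dry days (a hypothetical
    criterion); shorter dry gaps stay inside the event.
    """
    events, wet, dry = [], [], 0
    for mm in daily_mm:
        if mm > 0:
            if wet:
                wet.extend([0.0] * dry)  # keep the short dry gap in the event
            wet.append(mm)
            dry = 0
        else:
            dry += 1
            if wet and dry >= min_dry_days:
                events.append((24 * len(wet), sum(wet)))
                wet = []
    if wet:
        events.append((24 * len(wet), sum(wet)))
    return events

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = phi2 - phi1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def mrc_weight(e_mm, d_h, dist_km):
    """Step (iii): w = E * I / d**2, with mean intensity I = E / D."""
    return e_mm * (e_mm / d_h) / dist_km ** 2

def select_mprc(slide, gauges, radius_km=15.0):
    """Step (ii): return (w, gauge, E_mm, D_h) with the highest weight, or None."""
    best = None
    for g in gauges:
        dist = haversine_km(slide[0], slide[1], g["lat"], g["lon"])
        if dist > radius_km:
            continue  # gauge lies outside the circular buffer
        dist = max(dist, 0.1)  # assumed floor to avoid division by zero
        for d_h, e_mm in g["conditions"]:
            w = mrc_weight(e_mm, d_h, dist)
            if best is None or w > best[0]:
                best = (w, g["name"], e_mm, d_h)
    return best

# Made-up gauges: one about 10 km from the slide, one far outside the buffer.
gauges = [
    {"name": "near", "lat": 27.06, "lon": 88.57,
     "conditions": extract_events([0, 10, 5, 0, 20, 0, 0, 0, 7])},
    {"name": "far", "lat": 27.50, "lon": 88.47, "conditions": [(48, 300.0)]},
]
best = select_mprc((27.06, 88.47), gauges)
print(best)
```

With a single rain gauge, as in this study, the gauge selection is trivial and the weight only ranks the candidate rainfall conditions for each landslide.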
For each landslide, the highest w is used to identify the representative rain gauge and to determine the maximum probability rainfall conditions (MPRC). Finally, using only the MRC with the maximum weight (MPRC) for each failure, the rainfall thresholds at several exceedance probabilities and their associated uncertainties are calculated [15,24,28]. The thresholds take the form of a power law relating cumulated event rainfall E (mm) to rainfall duration D (h) and are determined using the frequentist method proposed in [4]:
$$E = (\alpha \pm \Delta\alpha)\, D^{(\gamma \pm \Delta\gamma)}$$
where α is the scaling parameter, which describes the intercept, and γ is the shape parameter, which defines the slope of the power law; ∆α and ∆γ are the uncertainties associated with the two parameters. The tool determines the mean values of the parameters and their uncertainties using a nonparametric bootstrap technique, calculating them over 5000 synthetic series of rainfall conditions [15,28].
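The fitting and bootstrap procedure can be sketched as below. This is a simplified stand-in and not the tool's implementation: the line is fitted by ordinary least squares in log-log space, and the intercept is shifted by a Gaussian quantile of the residual standard deviation. The function names, the synthetic data and the random seeds are all assumptions.

```python
import math
import random
import statistics

def fit_threshold(points, exceedance=0.05):
    """Fit E = alpha * D**gamma at a given exceedance probability.

    `points` is a list of (D_hours, E_mm) pairs. Returns (alpha, gamma).
    """
    xs = [math.log10(d) for d, _ in points]
    ys = [math.log10(e) for _, e in points]
    if len(set(xs)) < 2:
        raise ValueError("need at least two distinct durations")
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    gamma = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - gamma * mx
    residuals = [y - (intercept + gamma * x) for x, y in zip(xs, ys)]
    sigma = statistics.pstdev(residuals)
    z = statistics.NormalDist().inv_cdf(exceedance)  # negative for p < 0.5
    return 10 ** (intercept + z * sigma), gamma

def bootstrap_threshold(points, exceedance=0.05, n_boot=5000, seed=0):
    """Resample with replacement; return (alpha, d_alpha, gamma, d_gamma)."""
    if len({d for d, _ in points}) < 2:
        raise ValueError("need at least two distinct durations")
    rng = random.Random(seed)
    alphas, gammas = [], []
    while len(alphas) < n_boot:
        sample = rng.choices(points, k=len(points))
        if len({d for d, _ in sample}) < 2:
            continue  # degenerate resample: a slope cannot be fitted
        a, g = fit_threshold(sample, exceedance)
        alphas.append(a)
        gammas.append(g)
    return (statistics.fmean(alphas), statistics.stdev(alphas),
            statistics.fmean(gammas), statistics.stdev(gammas))

# Synthetic (D, E) conditions scattered around an arbitrary power law.
rng = random.Random(42)
synthetic = [(d, 8.0 * d ** 0.55 * 10 ** rng.gauss(0, 0.15))
             for d in (24, 48, 72, 96, 120, 144, 192, 240, 288)
             for _ in range(3)]
alpha, d_alpha, gamma, d_gamma = bootstrap_threshold(synthetic)
print(f"E = ({alpha:.1f} +/- {d_alpha:.1f}) D^({gamma:.2f} +/- {d_gamma:.2f})")
```

The standard deviations across the resampled fits play the role of ∆α and ∆γ, which is why a small sample of rainfall conditions tends to produce wide uncertainty bands.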

Results and Discussions
The tool (CTRL-T) reconstructed 130 rainfall events for the simulated time period (2010-2016). It is often difficult to link a specific landslide occurrence with a rainfall event, owing to irregularities in rain gauge measurements, large distances between landslide locations and rain gauges, and the occurrence of several landslides at the same location or in close proximity [28]. Considering the above-mentioned factors, 61 landslides were identified as induced by rainfall. Thereafter, the landslide conditions were further analyzed by examining the maximum daily rainfall over the three days prior to the landslide event, including the day of the slide occurrence. The distribution is quite varied, with a median of 79.2 mm, a mean of 83.5 mm, and a range from 4.6 mm to 361 mm. Such an analysis is critical, as it helps to reject landslide events which are not exclusively associated with rainfall, especially incidences associated with a daily rainfall lower than 27.4 mm (25th percentile) [28]. Considering these conditions along with the effect of spatial variation in precipitation measurements, the number of landslides was reduced to 36 (Figure 3b). The tool discarded a further eight landslides, leaving 28 landslide events for the analysis. The reconstructed rainfall and landslide data were used to determine the cumulated event rainfall-duration (ED) threshold, $E = (4.2 \pm 1.3)\, D^{(0.56 \pm 0.05)}$. Figure 4 shows the ED threshold in log-log coordinates, with the 28 rainfall conditions which led to landslides (marked as points), the threshold at an exceedance probability of 5% (shown as a line) and the associated uncertainty region (shaded area). The rainfall conditions that resulted in landslides span durations from 24 to 288 h (24 ≤ D ≤ 288 h), which is the range of validity for the threshold, with cumulated rainfall in the range 10 ≤ E ≤ 226 mm.
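The percentile screening described above can be sketched as follows. The window is assumed here to cover the slide day and the three preceding days, the quartile uses the default method of Python's `statistics.quantiles`, and the rainfall record and landslide dates are made up. None of this reproduces the study's data or its exact procedure.

```python
import statistics
from datetime import date, timedelta

def window_max(daily_mm, slide_day, days_before=3):
    """Largest daily rainfall on the slide day or the `days_before` days before it."""
    days = (slide_day - timedelta(days=k) for k in range(days_before + 1))
    return max(daily_mm.get(day, 0.0) for day in days)

def low_rainfall_slides(daily_mm, slide_days, days_before=3):
    """Return (q25, flagged): the 25th percentile of the window maxima and
    the slide days whose window maximum falls below it."""
    maxima = {day: window_max(daily_mm, day, days_before) for day in slide_days}
    q25 = statistics.quantiles(list(maxima.values()), n=4)[0]
    flagged = sorted(day for day, mm in maxima.items() if mm < q25)
    return q25, flagged

# Made-up rainfall record (mm/day) and landslide dates.
rain = {
    date(2015, 6, 10): 10.0,
    date(2015, 6, 20): 50.0,
    date(2015, 7, 1): 80.0,
    date(2015, 7, 14): 100.0,
    date(2015, 8, 2): 200.0,
}
slides = [date(2015, 6, 11), date(2015, 6, 20), date(2015, 7, 3),
          date(2015, 7, 14), date(2015, 8, 2)]
q25, flagged = low_rainfall_slides(rain, slides)
print(q25, flagged)
```

Flagged events are candidates for rejection rather than automatic exclusions, since a low daily total does not by itself rule out a rainfall trigger.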
Table 1 shows that, as the duration increases, the minimum, mean, and maximum values of event rainfall also increase. Table 2 lists the threshold parameters α, ∆α, γ, and ∆γ for various exceedance probabilities. To achieve reliable threshold values, the relative uncertainty ∆γ/γ should be less than 10%; in this case, it was found to be 8.9% [28]. However, the relative uncertainty in α is slightly higher than 30%, with the maximum value at the 50% exceedance probability, which can also be attributed to the daily temporal resolution of the rainfall data used, as observed in [28]. The high uncertainty is a result of the small number of sample points used to determine the thresholds and can be reduced by adding landslide events and using hourly rainfall data.
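As a worked check, using the parameters of the 5% threshold quoted in the text (α = 4.2, ∆α = 1.3, γ = 0.56, ∆γ = 0.05), the relative uncertainties are:

```latex
\frac{\Delta\gamma}{\gamma} = \frac{0.05}{0.56} \approx 0.089 \;(8.9\%),
\qquad
\frac{\Delta\alpha}{\alpha} = \frac{1.3}{4.2} \approx 0.31 \;(31\%)
```

The first value is below the 10% criterion, while the second is just above 30%, in line with the figures reported in the text.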
The threshold equation indicates that a rainfall event of 48 h with a cumulated rainfall of 36.7 mm is enough for landslides to occur. The authors of [3] calculated the thresholds for the same region using the Bayesian inference technique for 61 landslides; their results indicated that a rainfall event of 12.9 mm over 24 h was required for landslides to occur. For similar conditions, the probability of landslides calculated using Bayes' theorem was highest for an event rainfall of 30 mm over a duration of three days [26]. The thresholds calculated using the various techniques differ considerably. Therefore, the thresholds need to be validated before they can be used in a warning system. However, there have been no recorded landslides in 2017 and 2018, except for the small movement of unstable slopes in Southwestern Kalimpong [25,29].
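The 48 h figure follows directly from the threshold equation. The sketch below assumes the mean parameters of the 5% threshold reported in the text (α = 4.2, γ = 0.56) and its stated validity range of 24 to 288 h; the function names are illustrative only.

```python
def ed_threshold(duration_h, alpha=4.2, gamma=0.56):
    """Cumulated event rainfall (mm) on the threshold for a given duration (h)."""
    if not 24 <= duration_h <= 288:
        raise ValueError("duration outside the 24-288 h range of validity")
    return alpha * duration_h ** gamma

def exceeds_threshold(event_mm, duration_h):
    """True if an event of `event_mm` over `duration_h` lies on or above the threshold."""
    return event_mm >= ed_threshold(duration_h)

print(round(ed_threshold(48), 1))   # 36.7
print(exceeds_threshold(40.0, 48))  # True
print(exceeds_threshold(30.0, 48))  # False
```

The check uses only the mean parameter values; an operational system would also have to decide how to treat events that fall inside the uncertainty band around the threshold.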

Conclusions
The Darjeeling Himalayan region is one of the most rugged mountain regions in the world, with very high rainfall levels contributing to the majority of the devastating landslides in India. Landslides cause havoc during the monsoons in these hilly terrains, leading to substantial human and financial losses. To understand the relationship between rainfall and landslide incidences, several techniques have been applied and thresholds generated. The thresholds determined usually lacked critical information on the selection of rain gauges and on the reconstruction of the rainfall events responsible for landslide incidences. A recently proposed tool (CTRL-T) for reconstructing rainfall events and determining thresholds overcomes these limitations of the traditional threshold estimation methods. The tool was applied to the Kalimpong Region of the Darjeeling Himalayas using rainfall and landslide data for 2010-2016. The calculated threshold indicates that a cumulated event rainfall of 36.7 mm over 48 h is required for landslide occurrence. However, to obtain reliable thresholds, further triggering rainfall conditions and landslide events, along with hourly rainfall data, are necessary to reduce the associated uncertainties. The crucial part of threshold analysis is the validation of the determined values, which can be achieved using an independent dataset other than the one used for simulation; such a dataset is not available for this region. The present work can be understood as a first step towards determining rainfall thresholds for Indian landslide regions using an algorithm-based approach, which should be viewed as preferable to conventional methods.