Interactive comment on “ Investigation of variable threshold level approaches for hydrological drought identification ” by B

Abstract. Threshold level approaches are widely used to identify drought events in time series of hydrometeorological variables. However, the method used for calculating the threshold level can influence the quantification of drought events or even introduce artefact drought events. In this study, four methods of variable threshold calculation have been tested on catchment scale, namely (1) moving average of monthly quantile (M_MA), (2) moving average of daily quantile (D_MA), (3) thirty days moving window quantile (30D) and (4) fast Fourier transform of daily quantile (D_FF). The levels obtained by these methods were applied to hydrometeorological variables that were simulated with a semi-distributed conceptual rainfall-runoff model (HBV) for five European catchments with contrasting catchment properties and climate conditions. There are no physical arguments to prefer one method over the other for drought identification. The only way to investigate this is by applying the methods and visually inspecting the results. Therefore, drought statistics (i.e. number of droughts, mean duration, mean deficit) and time series plots were studied to compare drought propagation patterns determined by different threshold calculation methods. We found that all four approaches are sufficiently suitable to quantify drought propagation in contrasting catchments. Only the D_FF approach showed lower performance in two catchments. The 30D approach seems to be optimal in snow-dominated catchments, because it follows fast changes in discharge caused by snow melt more accurately. The proposed approaches can be successfully applied by water managers in regions where drought quantification and prediction are essential.


Introduction
Drought is a hazardous natural event that is associated with below-average water availability in the hydrological cycle due to climate variability. Unlike other natural hazards (e.g. floods), drought has a very complex development pattern (onset, impacted area, severity, recovery) that cannot be easily understood. Drought is often detected only after it has already developed well (Wilhite, 2000; Tallaksen and Van Lanen, 2004; Mishra and Singh, 2010). Many regions across the world are vulnerable to drought, leading to immense socio-economic and environmental impacts. In some areas, even fatalities are reported because of drought-related impacts. For example, the 2011 drought in the Horn of Africa resulted in famine and thousands of lost lives (Zarocostas, 2011; Hillier and Dempsey, 2012). As reported by Wood (2008, 2012), Wanders et al. (2010), Orlowsky and Seneviratne (2013), Prudhomme et al. (2013), Forzieri et al. (2014), and Van Huijgevoort et al. (2014), drought severity will likely increase in multiple regions across the globe. They also refer to a large spread in projections, because of uncertainties in emission scenarios, climate models and in particular large-scale hydrological models. Despite these uncertainties, current and projected impacts urge societies in many regions to explore water futures and solutions through increasing drought vulnerability (e.g. Fischer et al., 2011; Cosgrove and Cosgrove, 2012; Gallopín, 2012). Adaptive management strategies (e.g. Holling et al., 1978) are anticipated to frame operational and long-term drought management, including identification of promising measures, and water-related policy making.
The impacts of past and future drought are also uncertain because of definitional issues (e.g. Seneviratne et al., 2012), which hamper vulnerability and adaptation studies. Different drought types need to be distinguished, because characteristics (e.g. frequency, duration, deficit volumes) substantially differ between meteorological drought (precipitation deficit), soil moisture drought and hydrological drought (below-normal groundwater or river flow) due to drought propagation through the subsurface part of the water cycle (e.g. Peters et al., 2003; Van Loon and Van Lanen, 2012). Different identification methods that are used for a specific drought type are another source of uncertainty. Two main groups of identification methods are usually applied, which have in common that long time series of hydrometeorological data are required (preferably 30 years or longer). The first group is based on the probability of an observed hydrometeorological variable occurring over a given prior period. It provides the deviation from normal (drought severity) in terms of standard deviations. The most well-known is the Standardized Precipitation Index, SPI (McKee et al., 1993). Others have been developed for soil moisture (SMA; Sheffield et al., 2004), groundwater (GRI; Bloomfield and Marchant, 2013), and river flow (SRI; Shukla and Wood, 2008). The second widely applied group is the threshold approach: a drought occurs when the hydrometeorological variable is below a predefined threshold. The threshold method was introduced by Yevjevich (1967). Hisdal et al. (2004), Fleig et al. (2006), and Mishra and Singh (2010) provide overviews of the application of this approach to drought analysis.
The choices made in the implementation of the threshold method, including the selection of the threshold level, are crucial. Ideally, the threshold level should be defined by drought-impacted sectors, e.g. irrigated agriculture, cooling water for energy plants, drinking water supply, reservoir operation levels, navigation depth, or environmental flows to support stream ecology (Tallaksen and Van Lanen, 2004; Mishra and Singh, 2010). Either a fixed or a variable (seasonal, monthly or daily) threshold can be used (Hisdal et al., 2004). A fixed threshold, for example, is relevant to study ecological minimum flows. A variable threshold is more appropriate when seasonal patterns need to be taken into account; e.g. anomalies in groundwater recharge during the wet season are more important for groundwater resource management than a focus on the dry season, when recharge under normal conditions is already low or non-existent. A variable threshold approach has been used in many hydrological drought studies, e.g. Stahl (2001).

A number of studies use a variable threshold method that is based on post-processing of long-term average monthly flow, which was introduced by Van Loon et al. (2010). When applying the variable threshold method, Van Loon and Van Lanen (2012) found artefact drought events in some catchments, i.e. short-lived events with usually a high water deficit. These artefact events were not caused by weather anomalies (precipitation, temperature), but likely by the way the variable threshold had been implemented. The artefact events appeared when the flow increased very quickly (e.g. at the transition between the winter low-flow period and the snow melt peak) in connection with a gradually increasing threshold level. This also might explain the short-lasting, but substantial increase in the global area in drought around March-April (which is the snow melt season in the Northern Hemisphere), as reported by Corzo Perez et al. (2011). The identified artefact drought events are of little or no relevance for possible drought-impacted sectors, because of their short duration and the high flows that by definition follow them. This indicates that the current implementation of the threshold based on post-processed smoothed monthly values does not seem to be the most suitable one all around the world.
The aim of this paper is to systematically analyse the performance of four different methods for implementation of the variable threshold to identify hydrological droughts in different geoclimatic conditions. The paper starts with presenting the main characteristics of five contrasting catchments in Europe (Sect. 2) that were used to test the methods, followed by a description of the basics of the four methods to implement the variable threshold method (Sect. 3). The results are presented in the form of general drought characteristics, which are complemented with selected drought events to illustrate similarities and differences for the different methods and catchments (Sect. 4). The results are discussed in light of the drought identification in different geoclimatic settings at different scales (Sect. 5). Finally, the conclusions are given in Sect. 6.

Study area
The study areas of this research are five European catchments that are headwaters of basins with contrasting catchment characteristics and climate conditions (Fig. 1). The catchments can be considered as representative of different climatic zones and diverse environmental conditions in Europe; from subarctic climate with very high inter-annual temperature and snow-cover variation to semiarid climate with greater potential evapotranspiration and extended groundwater system (Van Lanen et al., 2008; Van Loon, 2013). Therefore, the results of investigating the variable threshold levels could be applicable to drought analysis in other catchments around the world, where observed and/or simulated hydrometeorological data are available. Van Loon and Van Lanen (2012) simulated the hydrometeorological variables from observations using the conceptual, semi-distributed rainfall-runoff model HBV (Seibert, 2000). They took the observed precipitation and temperature from stations inside and around the catchment, calculated catchment average values using the Thiessen polygon method, and corrected for elevation. In addition, they calculated potential evapotranspiration using the adapted Penman-Monteith method (Doorenbos and Pruitt, 1975; Allen et al., 1998).

Daily local forcing data, i.e. precipitation, potential evapotranspiration, and temperature, were used as input for the HBV model to simulate daily soil moisture, groundwater storage and discharge. Van Loon and Van Lanen (2012) used the Nash-Sutcliffe efficiency (Nash and Sutcliffe, 1970) based on the logarithm of discharge as the criterion to verify how well the model simulates the observed discharge. The model performance for these five catchments was between 0.63 and 0.9, which was generally taken as satisfactory or above (Van Loon and Van Lanen, 2012). The model outputs were used in this research.
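As an illustration of the performance criterion mentioned above, the following minimal sketch computes a Nash-Sutcliffe efficiency on log-transformed discharge. The function name `log_nse` and the use of the natural logarithm are assumptions made for this example; the sketch also assumes strictly positive discharge values and is not taken from the HBV set-up of Van Loon and Van Lanen (2012).

```python
import numpy as np


def log_nse(observed, simulated):
    """Nash-Sutcliffe efficiency on log-transformed discharge.

    Hypothetical helper for illustration. Assumes all discharge values
    are strictly positive so that the logarithm is defined.
    """
    log_obs = np.log(np.asarray(observed, dtype=float))
    log_sim = np.log(np.asarray(simulated, dtype=float))
    # Sum of squared errors between simulated and observed log discharge
    residual = np.sum((log_obs - log_sim) ** 2)
    # Variance-like term: spread of the observations around their mean
    spread = np.sum((log_obs - log_obs.mean()) ** 2)
    return 1.0 - residual / spread
```

A value of 1 indicates a perfect fit, and a value of 0 means the simulation is no better than the mean of the log-transformed observations.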

Methodology
The variable threshold level should represent the low flow regime of a catchment. Therefore, the optimum calculation method is a daily quantile based on very long time series to average out inter-annual variation. Often such long time series are not available and threshold levels have to be calculated from shorter time series, which introduces variability in the regime curve and, therefore, in the threshold level.
There are several possibilities to create a smooth threshold level when time series are not long enough. One option is smoothing the daily threshold levels. Another approach is the use of monthly data for calculation of the threshold. These two approaches are based on two consecutive steps, namely a basic threshold level calculation and a smoothing procedure. The smoothing can be done in various ways, subdivided into local and global methods. A local method, like a moving average, takes into account the data close to the data point under consideration, whereas a global method, like a Fourier transform, takes into account the entire dataset. The third approach combines the two steps of basic threshold calculation and smoothing into one procedure.
In this study, thresholds have been calculated for the hydrometeorological variables precipitation, soil moisture storage, groundwater storage and discharge. We applied the threshold calculation and the smoothing techniques to these variables as discussed below. The variables are denoted as Q_{i_j} for quantile series and Thr_{i_j} for the calculated threshold level, where i stands for the method of threshold calculation, i.e. daily (D), monthly (M), or moving window of 30 days (30D), and j stands for the subsequent smoothing technique, i.e. moving average (MA) or fast Fourier (FF) transform.

Moving average of monthly quantile (M_MA)
In this approach, the basic calculation of the threshold is done based on the cumulative distribution of long-term monthly data. The threshold level is calculated as the 80th percentile of the flow duration curve of this distribution, which gives Q_{M_MA}(n), the exceedance threshold level of the nth month of the calendar year.

The calculated exceedance threshold is assigned as the threshold level for each day of the month. This results in a fixed threshold level for this predefined month. The annual curve of threshold levels is, therefore, produced from 12 blocks of monthly threshold levels. When confronting time series of daily data with monthly threshold levels, jumps between two consecutive months result in unrealistic drought behaviour around the beginning and end of each month. This is because of the difference between the slowly changing actual time series and the sudden jumps in the threshold level at the interface of two months. A smoothing technique is therefore required to get a reliable threshold level that avoids such unrealistic drought behaviour. We applied a 30-day centred moving average to these discrete monthly thresholds, which gives Thr_{M_MA}(m), the threshold level of the mth day of the calendar year calculated from the moving average of 30 consecutive days of monthly quantiles (Fig. 2).
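The two steps of the M_MA approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the input is assumed to be an array of shape (years, 12), a non-leap calendar of 365 days is used, the moving average wraps around the year boundary, and the function names are hypothetical. Because a 30-day window has an even length, the window used here covers days m-15 to m+14, which is only approximately centred.

```python
import numpy as np

# Assumption: non-leap calendar year of 365 days
DAYS_PER_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def monthly_threshold(monthly_values, exceedance=80.0):
    """Value exceeded `exceedance` % of the time, per calendar month.

    monthly_values: array of shape (n_years, 12).
    """
    monthly_values = np.asarray(monthly_values, dtype=float)
    # A value exceeded 80 % of the time is the 20th percentile of the sample
    return np.percentile(monthly_values, 100.0 - exceedance, axis=0)


def centred_moving_average(annual_curve, window=30):
    """Moving average over an annual curve, wrapping around the year."""
    annual_curve = np.asarray(annual_curve, dtype=float)
    half = window // 2
    # Pad with values from the other end of the year (circular padding)
    padded = np.pad(annual_curve, (half, window - half - 1), mode="wrap")
    return np.convolve(padded, np.ones(window) / window, mode="valid")


def threshold_m_ma(monthly_values, window=30, exceedance=80.0):
    """Monthly quantiles, repeated per day, then smoothed."""
    q_month = monthly_threshold(monthly_values, exceedance)
    staircase = np.repeat(q_month, DAYS_PER_MONTH)  # 12 blocks -> 365 days
    return centred_moving_average(staircase, window)
```

The intermediate `staircase` array is the block-shaped annual curve whose jumps at month boundaries motivate the smoothing step.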

Moving average of daily quantile (D_MA)
The first step in this approach is to compute daily quantiles from the cumulative distribution of hydrometeorological data over the entire observation period. Therefore, we created 365 flow duration curves from which 365 threshold levels were determined. We calculated the 80th percentile as the exceedance threshold from the calendar-day cumulative distribution, which gives Q_{D_MA}(m), the daily quantile of the mth day of the calendar year.

However, the time series of the daily thresholds gives a fluctuating threshold level that can lead to frequent and short-lived deficit periods that cannot be identified as drought (Fig. 2). Therefore, we applied a centred moving average of 30 days as the smoothing technique (similar to the previous threshold level method), which gives Thr_{D_MA}(m), the threshold level of the mth day of the calendar year calculated using the D_MA threshold method.
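A minimal sketch of the D_MA approach is given below. It assumes that the daily data have been arranged as an array of shape (years, 365) with leap days removed, and that the moving average wraps around the year boundary; both are assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np


def daily_quantile(daily_values, exceedance=80.0):
    """Value exceeded `exceedance` % of the time, per calendar day.

    daily_values: array of shape (n_years, 365); leap days are assumed
    to have been removed beforehand.
    """
    daily_values = np.asarray(daily_values, dtype=float)
    # One flow duration curve per calendar day, evaluated at one quantile
    return np.percentile(daily_values, 100.0 - exceedance, axis=0)


def threshold_d_ma(daily_values, window=30, exceedance=80.0):
    """Daily quantiles smoothed with a wrap-around moving average."""
    q_day = daily_quantile(daily_values, exceedance)
    half = window // 2
    # Circular padding so that early January uses late December values
    padded = np.pad(q_day, (half, window - half - 1), mode="wrap")
    return np.convolve(padded, np.ones(window) / window, mode="valid")
```

With only a few decades of data, `daily_quantile` alone returns a jagged curve, and the moving average reduces that day-to-day fluctuation.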

Thirty-days moving window quantile (30D)
In this approach, daily threshold levels are calculated based on quantiles from the flow duration curve over a monthly time window that moves through the time series. The distribution is therefore made on a monthly basis, but without taking calendar months as a starting point. This is done until annual curves of daily thresholds are attained, which gives a threshold level that does not necessarily require additional smoothing (Fig. 2). The result is Thr_{30D}(m), the threshold level of the mth day of the year calculated using the 30D threshold level method.
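The moving-window quantile can be sketched as follows. The window placement (days m-15 to m+14 around each calendar day), the wrap-around within each year at the year boundary, and the function name `threshold_30d` are assumptions made for this illustration; the text does not specify these details.

```python
import numpy as np


def threshold_30d(daily_values, window=30, exceedance=80.0):
    """Quantile over a moving window of days, pooled across all years.

    daily_values: array of shape (n_years, n_days), e.g. (n_years, 365).
    Simplification: the window wraps around within each year instead of
    continuing into the neighbouring year.
    """
    daily_values = np.asarray(daily_values, dtype=float)
    n_days = daily_values.shape[1]
    half = window // 2
    offsets = np.arange(-half, window - half)  # e.g. -15 ... 14
    threshold = np.empty(n_days)
    for m in range(n_days):
        columns = (m + offsets) % n_days  # calendar days inside the window
        # Pool every year's values inside the window into one sample
        sample = daily_values[:, columns]
        threshold[m] = np.percentile(sample, 100.0 - exceedance)
    return threshold
```

Each daily threshold is based on `window` times as many values as in the D_MA quantile step, which is why the resulting curve is already fairly smooth.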

Fast Fourier transform approach (D_FF)
In this approach, we used the annual curve of the daily thresholds determined using the basic calculation method applied in the second threshold level method (Eq. 3). This gives Q_{D_FF}(m), the quantile of the mth day of the calendar year used as input to the D_FF threshold level method. The fast Fourier transform assumes that these data contain a set of repeating daily measurements. The time series of hydrometeorological variables is, therefore, converted to a frequency series, which is then modified by removing Fourier components with frequencies higher than a cutoff frequency. The cutoff frequency is optimized in such a way that the inverse of the modified frequency series best fits the time series of the threshold level determined by the 30D threshold level method. Here, FFT is the fast Fourier transform algorithm applied to the mth day quantile (Q_{D_FF}), and Thr_{D_FF}(m) is the corresponding daily threshold level determined using the D_FF threshold level method.

The results of the threshold calculation methods applied to the Narsjø catchment (Norway) are displayed in Fig. 2. When we systematically analyse the behaviour of the threshold level approaches through each hydrological regime, it seems that the methods perform differently during the high-flow period (from May to July). For example, the M_MA threshold is well below the discharge curve during the high-flow period. The D_FF threshold, however, seems to be very close to the actual discharge curve.
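The low-pass filtering step described for the D_FF approach can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the cutoff is expressed as a number of retained harmonics, the fit to the reference curve (for example a 30D threshold) is measured with the root mean square error, and the function names and the `max_harmonics` search limit are hypothetical.

```python
import numpy as np


def fourier_lowpass(annual_curve, n_harmonics):
    """Keep the mean and the first `n_harmonics` annual harmonics."""
    annual_curve = np.asarray(annual_curve, dtype=float)
    spectrum = np.fft.rfft(annual_curve)
    # Remove all components above the cutoff
    spectrum[n_harmonics + 1:] = 0.0
    return np.fft.irfft(spectrum, n=annual_curve.size)


def threshold_d_ff(daily_quantile, reference, max_harmonics=20):
    """Choose the cutoff whose smoothed curve best fits `reference`.

    Assumption: goodness of fit is the root mean square error.
    Returns the smoothed curve and the number of harmonics retained.
    """
    reference = np.asarray(reference, dtype=float)
    best_rmse, best_k, best_curve = None, None, None
    for k in range(max_harmonics + 1):
        smooth = fourier_lowpass(daily_quantile, k)
        rmse = np.sqrt(np.mean((smooth - reference) ** 2))
        if best_rmse is None or rmse < best_rmse:
            best_rmse, best_k, best_curve = rmse, k, smooth
    return best_curve, best_k
```

Note that retaining zero harmonics returns the mean of the curve, i.e. a constant level throughout the year.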

Computation of drought characteristics
The calculated threshold levels were applied to the entire time series of all catchments. The magnitudes of drought characteristics were computed based on the difference between the actual time series and the threshold levels. The use of a threshold level at daily temporal resolution introduces minor drought events and possible dependency between two or more consecutive drought events (Hisdal, 2002; Van Loon and Van Lanen, 2012). To remove minor drought events, we excluded events that persisted for less than 15 days, as suggested by Hisdal et al. (2004), Fleig et al. (2006), and Van Loon et al. (2011) (Fig. 3). To eliminate dependencies, we applied a pooling procedure based on an inter-event period of 10 days (Tallaksen et al., 1997; Fleig et al., 2006). With that procedure, two consecutive drought events with drought durations (D_i and D_{i+1}) and deficit volumes (V_i and V_{i+1}) and with an inter-event period (t_c) of less than 10 days were pooled together to generate the jth drought event, where D_pooled(j) is the drought duration of the jth drought event and V_pooled(j) is its deficit volume. For state variables the maximum deviation from the threshold level was used as a severity measure (H). For these variables, deficit volume (V_pooled) was replaced by H_pooled.

General drought statistics
It is hypothesized that drought numbers should decrease, mean drought duration should increase and drought severity should decrease moving from meteorological drought through soil moisture drought to hydrological drought. Comparing the threshold level approaches, the M_MA threshold method gave the lowest number of droughts in most catchments (Table 1). Except for precipitation and groundwater droughts in the Narsjø catchment, the method produced fewer or a comparable number of droughts in all catchments. For example, fewer discharge droughts are identified in the Narsjø catchment, because the calculated threshold level is well above the daily quantiles during periods when an abrupt increase in the actual data is confronted by a slow rise in the threshold series. In these periods, this higher threshold level merges two or more droughts that would otherwise be identified as separate droughts when using the other three methods. Therefore, the method generates a longer mean drought duration. This effect is also noticeable in slowly responding catchments such as Upper Metuje and Upper Guadiana. For example, the M_MA threshold level approach applied to the Upper Guadiana catchment provided an average groundwater drought duration of 130 days, which is longer than the mean duration computed using the other three threshold approaches. This resulted in a standard deviation of 70 days in duration among the four threshold methods (Table 2). This effect could be accompanied by the slow response to meteorological droughts in these two catchments caused by an extended aquifer system. Time series of discharge of catchments with extended aquifer systems are much smoother than those of precipitation. Therefore, applying the M_MA smoothing technique to the already smooth time series results in longer drought durations than one would expect.

The threshold levels calculated with the D_FF method reduced to a fixed threshold level for discharge in the Upper Metuje catchment and for precipitation and discharge in the Upper Sázava catchment. As a result, the computed mean drought duration for these hydrometeorological variables is much longer than those computed with other methods for the rest of the catchments. For example, for the drought event in 1976, the duration of the discharge drought is calculated to be 56 days (from 22 July 1976 to 16 September 1976) when using the M_MA and 30D threshold level methods and 60 days (from 18 July 1976 to 16 September 1976) when using the D_MA threshold level method. However, the same drought is found to last for 129 days (from 8 July 1976 to 14 September 1976) when applying the D_FF threshold level method. Mean calculated deficit volume is often higher when using the D_FF and D_MA threshold methods than when using the M_MA and 30D threshold methods. However, no substantial difference between approaches is found in calculating the magnitudes of discharge deficit for the Guadiana catchment, groundwater deficit in the Nedožery catchment and soil moisture deficit in the Narsjø catchment. Among the drought characteristics, deficit volume is calculated more consistently across the methods than the number of droughts and mean drought duration.
Despite considerable differences in magnitudes of the drought characteristics, the drought propagation patterns determined with all methods meet our expectations. In all threshold approaches used in this study, a larger number of short precipitation droughts propagated into fewer, but longer and more severe soil moisture and hydrological droughts (Table 1). To see why the magnitudes differ so much, we need to study drought propagation in more detail by a visual investigation of time series. In this section, we identify and present examples of the most apparent differences and similarities based on the associated drought identification and typology proposed by Van Loon and Van Lanen (2012).

Selected drought events
The most important element is the development of some artefact events that are exclusively caused by the chosen method. For example, the M_MA and D_FF threshold level methods produced an artefact drought event in discharge for the Narsjø catchment during December 1984 to June 1985 without any meteorological drought in the preceding period (Fig. 4). In this particular example, the artefact event that persisted for 48 days when using the M_MA threshold level method did not appear when we used the 30D threshold method. Such artefact events are successfully removed by the 30D threshold approach because it follows the regime more closely (Fig. 2).
The other difference between the threshold level approaches is that the D_FF threshold is, in some cases, reduced to a fixed threshold. This significantly impacts the magnitude and severity of some droughts, particularly during periods of classical rainfall deficit drought (Fig. 5) and warm snow season drought (Fig. 6). In such circumstances, the D_FF threshold method gives intense and long-lasting droughts that may not be equivalently reproduced by other methods. For the rest, all threshold level methods performed equivalently in terms of drought propagation patterns. The most pronounced similarity is shown in the example of a wet-to-dry-season drought in the Upper Guadiana catchment (Fig. 7). In such circumstances, the impact of the threshold level approaches on the drought propagation pattern is limited to only small changes in duration and deficit volume of these drought events.
Similarly, the four threshold level approaches applied to a rain-to-snow-season drought event in the Narsjø catchment (Fig. 8) generated drought propagation patterns that only differ in magnitude. In such circumstances, the deviation in the time series of discharge anomalies plays a typical role in the choice for a suitable threshold level approach. In this example, the discharge anomaly persisted for 308 days from 7 March 1976 using the M_MA threshold level method. Similarly, with the 30D approach the anomaly started on the same date but ceased only 3 days earlier. However, the total deficit volume during this period differs from 69 mm (with M_MA) to 58 mm (with the 30D threshold level method). In catchments like the Narsjø catchment, the 30D approach seems to be more reliable than the other three approaches.

Discussion
A variable threshold has been used in many drought studies. The most straightforward application of a variable threshold is the use of a monthly threshold on data with a monthly resolution (e.g., Mathier et al., 1992; Lehner et al., 2006; Weiß et al., 2007; Van Huijgevoort et al., 2012; Wada et al., 2013). The drought event definition by Yevjevich et al. (1983) was originally developed for analysing time series with a time resolution of one month or longer. Because droughts develop slowly and are a so-called "creeping disaster" (Wilhite, 2000), a monthly time resolution might be sufficient to quantify drought characteristics. The disadvantage, however, is that calendar months are an arbitrary subdivision of the year and the timing of a discharge peak strongly influences whether a month is classified as dry or wet. Therefore, a daily resolution is also advised for drought studies. The threshold level method of Yevjevich et al. (1983) was successfully tested on daily hydrographs (Zelenhasić and Salvai, 1987; Tallaksen et al., 1997; Kjeldsen et al., 2000; Tate and Freeman, 2000; Hisdal et al., 2001).
The question arises how to determine the variable threshold level for hydrological drought analyses on daily time scale. A monthly threshold confronted with daily data introduces problems with the "staircase" pattern of the threshold. Therefore, smoothing

Another option is using a daily threshold. Zaidman et al. (2002) use standardised daily anomalies, which are comparable to a daily threshold, and Fleig et al. (2011) use a daily threshold. In both studies, the daily values were not smoothed to produce reliable threshold levels. This is not a problem for observations or simulations with a very long period of record, in which extreme daily values are averaged out. For short periods of record, however, it leads to a threshold level in which extreme daily values have a big influence, because not enough observations are available to create a smooth duration curve. Smoothing of the daily threshold with a moving average (D_MA in this study) has, to the best of our knowledge, never been used before.
Smoothing the daily threshold can also be done by a Fourier transform (in this study D_FF). The advantage is that it is a global method, which takes into account the total pattern instead of only the values just before and after the target value. To our knowledge, a variable threshold level calculated with the use of a Fourier transform has never been applied before.

Instead of performing a smoothing afterwards (like in M_MA, D_MA, and D_FF), smoothing can also be incorporated in the calculation of the threshold itself. In that case, the threshold is based neither on calendar months nor on daily values, but on a moving window of a number of days. In this study, we used a moving window of 30 days (30D), while other studies used a moving window of 11 days (Stahl, 2001). Stahl (2001) investigated the sensitivity to the length of the moving window and concluded that differences start to level off around the 10 day window.
In summary, various methods for calculating variable thresholds are available and applied in drought studies. This study is the first to compare different approaches and quantify the differences for a number of contrasting catchments in Europe. The positive conclusion of this study is that all approaches can be used in drought propagation analysis; in general, the same drought propagation patterns are found. This contradicts the common expectation that the choice of the threshold level is extremely important for the outcomes of a drought study (Lehner et al., 2006). It is true that the type (fixed or variable) and magnitude (based on the 70th or 90th percentile) of a threshold change the values of the drought characteristics, but drought propagation processes (or changes of drought in the future) are expected to be less influenced, because in those cases relative differences are compared. An exception is future regime changes that are evaluated with a variable threshold. For example, an expected shift of the snow melt peak in the future will result in high flow during the historical low flow period (winter) and drought during the historical high flow period (spring) (Van Huijgevoort et al., 2014). Wanders et al. (2014), therefore, propose a changing threshold for the future.
We also found some discrepancies between the results of the threshold level methods. The largest differences were found in catchments and variables for which the Fourier transform (D_FF) could not characterise the low-flow regime correctly and reduced to a fixed threshold. Additionally, differences were found in climates with an abrupt change in discharge, e.g. due to snow melt. The 30D and D_FF threshold approaches seemed to capture this fast transition best. As such an abrupt change in discharge might also occur in other climates, for example monsoon climates, the 30D and D_FF level methods seem to be most suitable for global scale drought analysis.

Conclusions
In this research, we proposed variable threshold level approaches for hydrological drought identification; namely moving average of monthly quantile (M_MA), moving average of daily quantiles (D_MA), thirty days moving window quantile (30D) and fast Fourier transform of daily quantile (D_FF). We used the threshold levels determined with these methods to analyse hydrological drought on a daily basis.
We found that the proposed threshold level approaches are good alternatives for drought propagation analysis and classification. However, the 30D threshold level approach is preferable in most catchments, particularly in snow-dominated catchments. This threshold level approach eliminates artefact events that are solely caused by a sharp increase in daily discharge due to sudden snow melt in combination with a gradual increase of the threshold level.