An assessment of the ability of Bartlett – Lewis type of rainfall models to reproduce drought statistics

Of all natural disasters, the economic and environmental consequences of droughts are among the highest because of their longevity and widespread spatial extent. Because of their extreme behaviour, studying droughts generally requires long time series of historical climate data. Rainfall is a very important variable for calculating drought statistics, for quantifying historical droughts or for assessing the impact on other hydrological (e.g. water stage in rivers) or agricultural (e.g. irrigation requirements) variables. Unfortunately, time series of historical observations are often too short for such assessments. To circumvent this, one may rely on the synthetic rainfall time series from stochastic point process rainfall models, such as Bartlett–Lewis models. The present study investigates whether drought statistics are preserved when simulating rainfall with Bartlett–Lewis models. Therefore, a 105 yr 10 min rainfall time series obtained at Uccle, Belgium is used as a test case. First, drought events were identified on the basis of the Effective Drought Index (EDI), and each event was characterized by two variables, i.e. drought duration ( D) and drought severity ( S). As both parameters are interdependent, a multivariate distribution function, which makes use of a copula, was fitted. Based on the copula, four types of drought return periods are calculated for observed as well as simulated droughts and are used to evaluate the ability of the rainfall models to simulate drought events with the appropriate characteristics. Overall, all Bartlett–Lewis model types studied fail to preserve extreme drought statistics, which is attributed to the model structure and to the model stationarity caused by maintaining the same parameter set during the whole simulation period.


Introduction
Drought impacts on the environment and on the economy are among the highest of all natural disasters due to their longterm and extensive scale (Wilby and Wigley, 2000).It is often stated that drought is one of the most complex natural hazards, and that it affects more people than any other hazard (Wilhite et al., 2007).Understanding drought statistics, therefore, is essential for planning and management of water resources.
There are two main challenges with respect to the statistical analysis of droughts.Unlike extreme rainfall or flood problems, drought may last from several months to years; therefore, the first challenge consists of retrieving a historical climate data set which is sufficiently long for analysis.Precipitation data, being the most important variable for drought investigation, is not always available from observations for such a long period.In the latter case, one may consider the use of stochastic point process rainfall models, which allow for generating extremely long rainfall time series with similar statistics to what was observed (Verhoest et al., 2010).
The second challenge concerns the characterization of the dependence between the different variables that define a drought.Droughts are generally characterized by multiple attributes (Shiau and Modarres, 2009;Wong et al., 2010), of which drought duration and severity are the two most important variables in the majority of the reported drought research.Generally, traditional univariate analysis does not account for any correlation between these variables.However, the description of the extremity of an event depends upon the combination of duration as well as severity.A Published by Copernicus Publications on behalf of the European Geosciences Union.
multivariate frequency analysis of droughts can be explored using copulas (Shiau and Shen, 2001;Shiau, 2003Shiau, , 2006;;Kim et al., 2006a,b;Shiau and Modarres, 2009;Wong et al., 2010).Copulas, proposed by Sklar (1959), are multivariate distribution functions that allow for the description of the dependence structure between random variables independent of their marginal behaviours (Genest and Favre, 2007;Salvadori and De Michele, 2010).These functions have already been widely used to investigate dependence structures between hydrological variables, albeit mostly in the field of rainfall and flood problems (Shiau and Modarres, 2009).
This study aims at investigating whether rainfall series simulated by the Bartlett-Lewis (BL) models can be used for drought analysis, as it can be questioned whether those models are able to reproduce drought statistics.These BL models have already extensively been validated for general statistics, such as mean, variance, auto-covariance and zero depth probability as well as for extreme precipitation events (Cameron et al., 2000(Cameron et al., , 2001;;Kaczmarska, 2011;Vandenberghe et al., 2011;Verhoest et al., 1997;Wheater et al., 2005).However, an assessment on whether the statistics of the alternative extreme behaviour (i.e.low precipitation), to the authors' knowledge, has not been reported in literature.Therefore, a time series of 105 yr  of 10 min rainfall observed at Uccle, located near Brussels, Belgium, is used for comparison.First, a drought index (i.e. the EDI index; Byun and Wilhite, 1999) is calculated for the observed time series.Then drought events are selected based on an investigation of the EDI, and are characterized by two variables, namely drought duration (D) and severity (S), whose dependence is modelled using a copula.A similar analysis is performed on simulated time series obtained from different types of Bartlett-Lewis models.Finally, four types of copulabased return periods for drought are calculated for both the observed and simulated time series.Through a comparative analysis of the results, the ability of the Bartlett-Lewis models for preserving drought statistics is assessed.
The following gives an overview of the structure of the paper.Section 2 introduces the observed rainfall time series and the five Bartlett-Lewis models used in this research.Section 3 describes the drought index used.Section 4 briefly introduces copulas and explains the calculation method of bivariate copula-based return period for drought.Section 5 presents results of marginal distribution fitting, copula selection and comparison of four types of drought return periods between observed and simulated data.Finally, conclusions and recommendations for further study are given in Sect.6.

Observed and modelled data
In order to evaluate whether the selected BL rainfall models can reproduce the drought statistics, this study conducts a comparison between an observed and several synthetic time series for Uccle, Belgium, simulated by the rectangular process rainfall models.The observed time series is a precipitation record with a time resolution of 10 min from 1 January 1898 to 31 December 2002 measured by a Hellmann-Fuess pluviograph in the climatological park of the Royal Meteorological Institute at Uccle, near Brussels, Belgium (Démarée, 2003).Analysing this series with the adaptive Kolmogorov-Zurbenko (KZA) filter, De Jongh et al. (2006) detected droughts around 1920 and during the mid-1970s and drier-than-normal conditions at the beginning of the twentieth century.Five time series of 105 yr of 10 min rainfall were simulated by five versions of the Bartlett-Lewis rectangular pulses model.In these models, clusters of possibly overlapping rectangular pulses having a random length (duration) and height (intensity) are generated.Finally, a time series of rainfall is obtained through discretizing time into intervals of a given length (in this study, 10 min), and cumulating the rainfall volumes of the rectangles that fall within each interval.The applied versions of the BL models are the original Bartlett-Lewis (OBL) model (Rodriguez-Iturbe et al., 1987a), the modified Bartlett-Lewis (MBL) model (Rodriguez-Iturbe et al., 1988), the modified Bartlett-Lewis Gamma (MBLG) model (Onof and Wheater, 1994), the truncated Bartlett-Lewis (TBL) model (Onof et al., 2013) and the truncated Bartlett-Lewis Gamma (TBLG) model (Onof et al., 2013).
The OBL model was first proposed by Rodriguez-Iturbe et al. (1987a), in which storm events are generated randomly according to a Poisson process with parameter λ.Each storm origin is followed by a sequence of cell origins, modelled by a second Poisson process characterized by parameter β.Cells can be generated during a time interval having a duration which is exponentially distributed with parameter γ .For each cell origin, a rectangular cell is generated with random depth and duration, both exponentially distributed, respectively with parameters 1/µ x and 1/η.Rodriguez-Iturbe et al. (1987a) introduced dimensionless parameters κ = β/η and φ = γ /η to ensure that the number of cell origins associated with one storm arrival follows a geometrical distribution with mean µ x = 1 + κ/φ.As such, the OBL model has five parameters (λ, β, γ , µ x , η).The OBL model, however, showed some shortcomings with respect to the preservation of the zero-depth probabilities (Rodriguez-Iturbe et al., 1987a,b).To solve this, the modified Bartlett-Lewis (MBL) model was introduced (Rodriguez-Iturbe et al., 1988) allowing the average cell duration to vary between storms by modifying the exponentially distributed cell duration through randomizing the parameter η according to a Gamma distribution with shape parameter α and scale parameter ν.The mean cell inter-arrival time, β −1 , and the mean storm duration time, γ −1 , can be varied randomly by keeping κ and φ constant while varying η.The MBL model thus has six parameters (λ, µ x , α, ν, κ, φ).Since the MBL model poorly reproduced the extreme behaviour of rainfall, Onof and Wheater (1994) (λ, α, ν, κ, φ, p, δ).
A problem that remained unnoticed by many users of the aforementioned models is that very unrealistic events of excessively large cells are occasionally generated when sampling a large number of mean cell duration values during long simulations (typically, when simulation is much longer than the size of the data set) (Verhoest et al., 2010).To surpass this issue, the Gamma distribution for the sampling of η is truncated to inhibit the sampling of extremely large mean cell durations (Onof et al., 2013).The truncation parameter ε can be handled as an extra parameter during calibration.As with the MBL and MBLG models, the truncated model can use either an exponential or gamma distribution to represent rainfall depth; these will be referred to as the TBL and TBLG models, respectively.The TBL and TBLG models respectively have seven (λ, µ x , α, ν, κ, φ, ε) and eight parameters (λ, α, ν, κ, φ, p, δ, ε).
Different rainfall characteristics can be represented in terms of the model parameters.In case of the OBL model, the mean storm inter-arrival time (h) and the mean cell interarrival time in a storm (h) are given by 1/λ and 1/β, respectively; the mean storm duration (h) equals 1/γ ; cells have a mean duration (h) of 1/η and a mean intensity (mm h −1 ) of µ x .The same physical interpretation of the parameters can be given for the other models.For the MBL and MBLG models, η can be calculated as α/ν.For the TBL and TBLG models, the calculation of η is more involved since it is sampled from a truncated Gamma distribution; a detailed calculation of η can be found in Onof et al. (2013).For the MBL, MBLG, TBL and TBLG models, β and γ can be calculated as κ • η and φ • η, respectively.In the MBLG and TBLG models, µ x equals p/δ.
The models are calibrated using the Generalised Method of Moments (GMM) in which model parameters are chosen to minimize the difference between the model values calculated with the available analytical expressions and the empirical values of these statistics obtained from observed data.The fitting procedure is subject to a certain level of subjectivity.There are no general guidelines about the choice of moments or aggregation levels in the objective function, nor is there a general consensus about the weights used during fitting (Vanhaute et al., 2012).Several approaches exist, each exhibiting certain advantages and disadvantages, which make it hard to select one particular method for calibration.For example, attributing greater weight to the mean rainfall intensity during fitting will lead to a better reproduction of mean rainfall intensity, whereas other properties may be reproduced less accurately, as a result.Evidently, this extends to the choice of the included moments and their aggregation levels during fitting.Several authors have attempted to address some of these issues empirically which has led to differing conclusions, making it particularly hard to assess the merit of a certain method, since the inferred conclusions are obviously influenced by the chosen evaluation criteria (Burton et al., 2008;Cowpertwait et al., 2007;Vanhaute et al., 2012).The chosen fitting properties in the current work include the hourly mean, and the variance, lag-1 auto-covariance and proportion of dry intervals or Zero Depth Probability (ZDP) at timescales of 10 min, 1 and 24 h.The lag-1 autocovariance is the autocovariance of rainfall intensity with lag-1 time step where the time step equals the aggregation time.ZDP is the proportion of dry intervals within the time series calculated for a given aggregation level.These chosen fitting properties are similar to those by Cowpertwait et al. (2007).The corresponding empirical variances (which can be obtained by treating yearly observed statistics as repetitions of a given rainfall statistic) are used as weights in the objective function (Jesus and Chandler, 2011).The aforementioned calibrations are realised using the Shuffled Complex Evolution algorithm (Duan et al., 1994).This method has shown to be a reliable and easy-to-use method, when compared to other heuristic optimization algorithms (Vanhaute et al., 2012).The parameter sets for 5 BL models are presented in the Appendix A.

Drought index
Different forms of drought can be identified, including meteorological drought, which is defined on the basis of the severity and the duration of the dry spell, agricultural drought, which accounts for the agricultural impact of the drought event, and hydrological drought, in which the impact of reduced precipitation on surface or subsurface water supply is taken into account.Given the fact that agricultural and hydrological droughts also include effects of land use, soil management, hydrological characteristics of a catchment and water management, we do not focus on these types of drought as comparing their drought statistics may not merely be attributed to shortcomings in the rainfall time series.As such, we focus on meteorological drought indices that are solely based on the rainfall time series.
Several types of meteorological drought indices have been proposed in literature, including the Rainfall Anomaly Index (RAI) (Rooy, 1965), the Bhalme and Mooly Drought Index (BMDI) (Bhalme and Mooley, 1980), the Standardized Precipitation Index (SPI) (McKee et al., 1993), the National Rainfall Index (NRI) (Gommes and Petrassi, 1994), the Effective Drought Index (EDI) (Byun and Wilhite, 1999), and the Drought Frequency Index (DFI) (González and Valdés, 2006).In this study, we opt for the EDI proposed by Byun and Wilhite (1999), as this index can be calculated on a daily time basis (Morid et al., 2006), whereas most of the other drought indices are calculated at a monthly scale, which renders them less interesting for assessing the temporal behaviour of BL models.
Basically, the EDI is an index that expresses the standardized deficit or surplus of precipitation with respect to a mean calculated over a user-defined time period.The first step in the calculation procedure of EDI is to calculate the Effective Precipitation (EP).The EP refers to the available precipitation accumulated over a period.In calculation, EP is presented as the cumulative daily precipitation with a timedependent reduction function (Kim et al., 2009); in other words, the EP for any day j is a weighted sum of the precipitation of the l previous days with decreasing weights (Morid et al., 2006) where j is the number of the days since the beginning of the time series.For values j > l, EP is calculated as where P (j − m) is the precipitation at mth day before day j .The duration l is usually chosen as 365 days (Dogan et al., 2012;Byun et al., 2008;Kim et al., 2009;Morid et al., 2006Morid et al., , 2007;;Pandey et al., 2008;Yu-Won and Hi-Ryong, 2006), as a representative value of the total water resources stored for a long period (Morid et al., 2006) or the most common precipitation cycle (Kim et al., 2009).The same value of l is also taken in this study.Since l is chosen as 365 days, the first year of all data sets is used to calculated the EP for the second year, therefore the EDI values are just available for the final 104 yr data.
Once the EP is obtained for each day, its deviation, DEP, with respect to a mean EP (i.e.MEP) is calculated: where MEP(j ) is the mean value of the EP values of the days j ≡ j (mod 365) (e.g. 15 May) of all years in a "standard period".The ideal standard period should represent average climatological variables and therefore it should be long enough, for example, 30 yr or more (Byun and Wilhite, 1999;Kim et al., 2009;Morid et al., 2006;Pandey et al., 2008).In this study, the standard period of 30 yr from 1971 to 2000 is applied for Uccle observations and from year 71 to 100 for BL simulations.
Finally, the EDI for each day is calculated: in which SD (DEP(j )) is the standard deviation of DEP of the days j ≡ j (mod 365) of all years over the standard period.
The classification of the drought severity by the EDI is presented in Table 1.For easier calculation of the drought index, all years are considered to have 365 days; for the leap years, 29 February is excluded and the rainfall on 28 February is recalculated as the average rainfall observed on 28 and 29 February.For a more detailed explanation of the EDI calculation procedure, we refer to Byun and Wilhite (1999) and Kim et al. (2009).
In this research, based on the classification propsed by Morid et al. (2006) (see Table 1), a drought event is defined as an extremely dry to moderately dry period during which the EDI index is continuously less than −0.70.Each drought is characterized by two dependent attributes: duration, D, and severity, S, where the latter is the cumulative value of the absolute value of the EDI within the drought event, that is, where T is the value of j at the onset of a drought event (i.e. the day at which the EDI value becomes less than −0.70).

Fitting drought duration and severity with copulas
In order to perform a copula-based frequency analysis, a bivariate distribution function of the drought duration D and drought severity S needs to be characterized.According to the theorem of Sklar (Sklar, 1959), if F DS (d, s) is a twodimensional distribution function of D and S with marginal distributions F D (d) and F S (s), then there exists a copula C such that Conversely, for any univariate distribution F D (d) and F S (s) and any copula C, the function F D,S (d, s) defined above is a two-dimensional distribution function with marginal distributions F D (d) and F S (s).The second equality in Eq. ( 6) describes a transformation based on the invariance property of copulas (Genest and Rivest, 2001), in which the marginal distribution functions F D (d) and F S (s) of the variables D Table 2. Selected copulas and their domain of dependence parameter θ and Kendall's tau.
and S are transformed into values U and V in the unit interval I = [0, 1]: In order to obtain (u i , v i ) for each couple (d i , s i ) in the data set, theoretical or empirical cumulative distribution functions of D and S can be used (Vandenberghe et al., 2011).The later approach is preferred in this study as it allows for the use of an empirical distribution function for the random variables D and S. The values for U and V are calculated as follows: where n is the number of drought events, and R i and T i are the ranks of d i and s i among drought events.
To model the dependence structure, we restricted the copulas to the most common one-parameter families, such as Clayton, Gumbel, Frank, Ali-Mikhail-Haq (AMH), A12 and A14 (Table 2).A12 and A14 are the unofficial names of two nameless copula families introduced by Nelsen (2006).
Several techniques can be used for estimating the copula parameter θ, such as semi-parametrical rank-based methods, parametrical methods, and kernel techniques (Genest and Favre, 2007;Salvadori and De Michele, 2007).Here, we use a rank-based method based on Kendall's tau τ K , in which the copula parameter θ is calculated as a function of Kendall's tau.We opted for this method as it has proven to be robust in describing variable correlation and outlier effects (Li et al., 2012).Furthermore, this methodology is easy to implement.Table 2 presents the copula functions, domain of the dependence parameter θ, relationship function between θ and Kendall's tau and the range of Kendall's tau for the different copulas tested.
For the copula selection, the root mean square error (RMSE) and two goodness-of-fit test statistics proposed by Genest et al. (2006), namely S n and T n , are calculated.
The RMSE is calculated as follows: where n is the number of data points, C the fitted copula based on parameters estimated via Kendall's tau, and C n the empirical copula.
The two goodness-of-fit test statistics, S n and T n , are calculated based on Kendall's tau.The smaller their values the better the accuracy achieved.S n and T n are defined as follows: where K n is the Kendall process (Genest et al., 2006).Mathematical details of the calculation of S n and T n are provided by Genest et al. (2006).The p values associated with S n and T n are calculated by means of a bootstrap method (Genest et al., 2006).

Copula-based drought frequency analysis
Bivariate hydrologic events can be categorized as joint and conditional events (Shiau, 2003).The joint drought events can be defined in two cases: AND {D > d and S > s} and OR {D > d or S > s} (Vandenberghe et al., 2011).The return period definitions T AND and T OR , respectively, for AND and OR events are defined in term of u and v as where E(L) is the expected drought inter-arrival time, which can be estimated from observed droughts which have been identified based on the EDI.One may also be interested in two types of conditional drought situations which are referred to as COND 1 {S|D > d} and COND 2 {S > s|D ≤ d} (Salvadori et al., 2007;Vandenberghe et al., 2011).The return periods T COND1 and T COND2 , respectively for COND 1 and COND 2 are defined as follows: Mathematical details and discussions of these return period calculation functions are provided by Salvadori (2004), Salvadori andDe Michele (2004, 2007), Salvadori et al. (2007), Vandenberghe et al. (2011), andGräler et al. (2013).

General rainfall characteristic evaluation of BL models
To assess the general performance of the Bartlett-Lewis models, the models' ability to reproduce general historical rainfall characteristics is first considered.Table 3 compares general historical rainfall characteristics to simulated rainfall, at different levels of aggregation.For the purpose of comparison, the percentage deviation of the simulated values from the observations is also listed in Table 3.Several differences between the models can be discovered from the table.Generally, the mean is reproduced quite well by all models.However, the TBL model shows a slightly higher deviation from the observations than the other models.All models underestimate the variance at the 10 min level of aggregation.They are more accurate at the hourly and daily level.The TBL model seems to produce poorer results than the other models at all levels of aggregation.The autocovariance at the sub-hourly level, however, is reproduced very well by the truncated models (TBL and TBLG), while at the hourly and daily level, this is less the case.The reproduction of ZDP is comparable for all models.However, it can be seen that the OBL model fails to reproduce this property at the daily level.The modest analysis above shows that, in general, certain differences exist between the models.However, it is not possible to conclude that one model performs better than the other, based on these general characteristics.It can be concluded that each of the parameterized BL models are well calibrated and the performances of the considered variants of the BL models are comparable.

Analysis of EDI values
In order to unveil any patterns in drought occurrence behaviour in the observed and simulated data, EDI values are plotted in function of the year (x axis) and the day of the year (y axis) (Fig. 1).This way, a qualitative assessment of any seasonal patterns as well as the inter-annual variability can be made.From the plots, it can be seen that droughts seem to occur more often and more severely in the Uccle observations than in the BL simulations, except for the MBL simulation.This figure reveals that drier (low EDI) and wetter (high EDI) periods seem to span over multiple years.The evidence of some very dry years and certain remarkably wet years can be found in the EDI figures of Uccle and the MBL and TBL models.No clear seasonal trends can be witnessed for both observed and simulated data.It can also be seen that the OBL, MBLG and TBLG simulations produce less extreme events than the other models; therefore, based solely on these plots, it seems that these three models do not represent well long-term dry conditions in a realistic manner.A comparison between frequency distributions of EDI values of observed and simulated data (Fig. 2) shows that the OBL and MBLG models simulate less lower EDI values and slightly overestimate higher EDI values.In other words, these models seem to produce more wetter long-term spells than the other models and the Uccle observations.For further investigation, Table 4 shows the intra-and inter-annual variance of the EDI for all data sets.This table confirms that except for the MBL and MBLG models, all models produced more or less the same intra-annual variability of EDI in comparison with those calculated from the Uccle observations.The intra-annual variances for the MBL and MBLG models are higher than those for Uccle which means that, on average, the EDI values obtained from MBLand MBLG-modelled time series fluctuate more throughout the year than those derived from observed rainfall series.In case of the inter-annual variability, the value for the OBL  model is more or less similar to that for Uccle, while there is an overestimation for the MBL and MBLG models, and an underestimation for the other models.
From a practical point of view, we only consider droughts with a duration of at least 7 days in the frequency analyses.Droughts with a duration smaller than 7 days are not easily detected in reality and may not cause any serious effects.The threshold duration of 7 days, which is much smaller than the minimum drought duration identified by other drought indices (usually a month), still allows for investigating the temporal behaviour of BL models with respect to predicting minor drought events.This choice also helps to remove a problem of "ties" in the data.This problem refers to the presence of events with identical values for both D and S which may cause difficulties in distribution fitting and copula-based statistical analysis.For more information about this problem, we refer to Salvadori andDe Michele (2006, 2007).Table 5 gives some basic statistics for all observed and simulated drought events from which it can be seen that all models overestimate the numbers of drought events while generally they underestimate the drought duration.

Analysis of YAEDI 365 values
In this analysis, we considered a year which had a YAEDI 365 less than −1.5 as a "seriously dry year" based on the classification in Table 1. Figure 3 presents YAEDI 365 of all data during 104 yr; for observed data, the years are numbered from 1899 to 2002, while they range between 2 and 105 for synthetic data.For Uccle, we may identify three seriously dry years, being 1921, 1949 and 1976; this agrees with the findings by De Jongh et al. (2006).From these data, one may infer that a seriously dry year occurs every 27 to 28 yr, however, this statistic should be treated with care given the limited length of the time series.All models, except the MBL model, seem to underestimate dry conditions and fail to simulate the extreme events.Two seriously dry years were detected for the MBL model in years 42 and 64.Only one seriously dry year was observed for the OBL and TBLG models, while there is not a single one observed for the MBLG and TBL models.YAEDI 365 values simulated by the OBL, MBLG and TBL models seem to be smaller and less variable than those by the other models.The underestimations of all BL models, except the MBL model, are also confirmed in Fig. 4 in which the empirical cumulative distribution functions for YAEDI 365 are presented for all data sets.

Probability distributions for D and S
The marginal cumulative distribution functions of D and S need to be modelled separately as these are needed when  conducting a copula-based frequency analysis in order to transform these values from R 2 to I 2 or vice versa.Different commonly used parametric models such as Generalized Pareto (GP), Generalized Extreme Value (GEV), exponential, Weibull and gamma distribution functions, and a nonparametric Kernel model are considered in this fitting test.The reason for also conducting a non-parametric model fit is that such model may avoid the typical problems of underor overestimating extreme events when fitting a parametric model (Vandenberghe, 2012).
In order to assess the significance of the fit, the statistics of Anderson-Darling AD n (Anderson and Darling, 1954) is calculated and used for identifying the appropriate distribution for D and S. Table 6 lists the values of the AD n statistics for the five parametric and the one non-parametric distribution functions fitted to the duration D and the severity S of the Uccle observed, OBL-, MBL-, MBLG-, TBL-and TBLGmodelled droughts.Note that smaller AD n values express a better distribution fit.Table 7 presents the p values for these AD n tests.As can be seen from Table 6, the best fits for both D and S are provided by the GEV distribution.The second best fits for D and S are obtained with the Kernel distribution.These results are different from some previous studies in which the distribution function for D is generally considered to be a geometric distribution (Kendall and Dracup, 1992;Mathier et al., 1992) or an exponential distribution (Shiau, 2006;Zelenhasi and Salvai, 1987), and the distribution function for S is considered to be a gamma distribution (Mathier et al., 1992;Shiau and Shen, 2001;Shiau, 2006;Zelenhasi and Salvai, 1987).p values in Table 7 indicate that only for the Kernel distribution, an appropriate fit for all data sets is obtained.In contrast, all the parametric distributions are clearly rejected.To avoid the problems of under-or overestimation of marginal distribution fitting for D and S, the distributions should be investigated by a graphical presentation.Figures 5 and 6 display the GEV and Kernel cumulative distribution functions of D and S for all data sets, respectively.As can be seen for all cases, the Kernel distribution  (red line) is better in simulating the extreme values while the GEV (black line) overestimates the extremes.
Based on the above analysis, the Kernel distribution for both D and S is selected in further analysis.Figure 7 shows the comparison of the Kernel cumulative distribution functions of D and S for all data sets; it is clear that the cumulative probability of D or S calculated from the observed Uccle data (black line) is always smaller than what is found for the different BL models, which means that all BL models generally underestimate D and S.

Identification of the appropriate copula
The copula parameter estimation method for D and S makes use of the estimation of Kendall's tau (τ K ).As Kendall's tau values of all data sets (Table 8) are out of range for the AMH copula (    Table 9 presents the RMSE values obtained for different copulas tested for the observed and simulated data.As can be seen, similar results are obtained for all copulas, therefore it is difficult to decide which copula is the best for all data sets if only the RMSE is used. The results of S n , T n and their respective p values are presented in Tables 10 and 11.In case of S n , the p values indicate that only the Frank copula is found to be an appropriate copula at the 5 % significance level for all data sets.The p values for T n from Table 11 also result in the same conclusion.Based on these statistics, the Frank copula is considered for the frequency analysis for all data sets.Comparison between empirical copulas (red dotted line) and fitted Frank copulas via Kendall's tau (full line) for all data sets is shown in Fig. 8.In this figure, the plotted variables U and V are the normalized ranks of the variables D and S, respectively.It is clear that the copula fits for data of the MBL and MBLG models are slightly worse than for the others; however, the fits are still considered to be acceptable.

Copula-based frequency analysis
In this section, four types of return period will be investigated; we will focus on drought events with a return period of 5 yr (Fig. 9) and 10 yr (Fig. 10).In case of AND and OR return periods, for both 5 and 10 yr drought return periods, it is clear that all models underestimate the magnitude of drought properties.In other words, an observed drought event having a return period of 5 or 10 yr will have a lower frequency of occurrence if it were modelled by the five BL models.The underestimations seem to become more pronounced for more extreme events.However drought statistics from the MBL simulation still remain closest to those by the Uccle data in all cases, followed by the TBLG and TBL simulations.For COND 1 type, the MBL, TBL and TBLG models slightly overestimate the magnitude of 5 yr drought events, but slightly underestimate in cases of 10 yr drought events.Remarkable underestimations are witnessed for the OBL and MBLG models for both 5 and 10 yr droughts.The MBL model produces the closest 5 yr drought statistics compared to those of Uccle observations, while the TBLG model best represents the 10 yr drought statistics.For the COND 2 type, truncated models (TBL and TBLG) show the best performance in both 5 and 10 yr drought return periods.There is a slight overestimation witnessed for the MBL model, and  p AD n Uccle 0.000 0.018 0.000 0.000 0.000 0.402 0.000 0.058 0.000 0.000 0.000 0.424 OBL 0.000 0.007 0.000 0.000 0.000 0.347 0.000 0.267 0.000 0.000 0.000 0.283 MBL 0.000 0.013 0.000 0.000 0.000 0.400 0.000 0.007 0.000 0.000 0.000 0.297 MBLG 0.000 0.007 0.000 0.000 0.000 0.390 0.000 0.203 0.000 0.000 0.000 0.350 TBL 0.000 0.007 0.000 0.000 0.000 0.493 0.000 0.007 0.000 0.000 0.000 0.307 TBLG 0.000 0.003 0.000 0.000 0.000 0.360 0.000 0.013 0.000 0.000 0.000 0.310 It should be noted from the figure that lines presenting results for all Bartlett-Lewis models are shorter than those of observed data in case of OR and COND 2 , which should be attributed to the lack of severe drought events (see Table 5 and Fig. 7); the interpolated results by copula are therefore limited to a certain scale.Overall, it is difficult to conclude which model has the best performance.The MBL model seems to be the best in simulating droughts with a high frequency for the AND, OR and COND 1 types.The truncated models show the best results for the COND 2 type for both 5 and 10 yr return periods.The OBL and MBLG models fail in almost all cases.The shortcomings of all rainfall models in simulating extreme drought events can be partly explained by their overestimation of the cumulative value of marginal distribution functions for D and S. We can thus conclude that the BL models seem to simulate longer and more severe drought events at a too low pace.This can be attributed to the fact that the model itself does not foresee any non-stationarity and maintains the same parameters throughout the simulation period.The shortcomings also can be explained by the problems of inducing overclustering by model structure (Vandenberghe et al., 2011) which may result in generating more and shorter dry periods.

Conclusion and recommendation
In this study, some basic statistics are compared and a copula-based bivariate frequency analysis is performed in order to assess whether rainfall series simulated by Bartlett-Lewis (BL) models are able to reproduce drought statistics.
A record of the 105 yr period of 1898-2002 of 10 min rainfall for Uccle in Belgium is used as observed data.For drought identification, the EDI is chosen and drought events were defined as an extremely dry to moderately dry period during which the EDI is continuously less than −0.7.Each drought event is characterized by two variables, i.e. drought duration (D) and severity (S).Through quantitatively analysing daily EDI time series, it was clear that droughts seem to occur more often and more severely in the observations and for the MBL simulation than for the other BL models.However, no clear seasonal trends can be witnessed for both observed and simulated data.
It was demonstrated that D and S could be modelled by a GEV distribution in contrast to what is generally considered that the distribution for D should be a geometric or an exponential distribution and for S should be a gamma distribution (Mathier et al., 1992;Shiau and Shen, 2001;Shiau, 2006;Zelenhasi and Salvai, 1987).However, in this research context the non-parametric Kernel distribution was selected as it allowed better representation of the upper tail of the distribution.The analysis of marginal distribution functions showed that in general all models underestimate D and S.This may lead to the underestimation of the probability of extreme events by all models.The application of the yearly accumulated negative EDI, YAEDI 365 , also allows for identifying dry conditions in the time series for all data; for Uccle, three seriously dry years are witnessed within the 105 yr time series.All BL models tested, except the MBL model, seem to underestimate these dry conditions and fail in simulating similar extreme events.YAEDI 365 values simulated by the OBL, MBLG and TBL models seem to be smaller than those by the other models.
A frequency analysis was performed using bivariate copula-based return periods of droughts, expressed in term of D and S. The Frank copula was selected based on results of RMSE, S n and T n .Four types of copula-based drought return periods are conducted for all data sets.The comparison of four types of drought return periods indicated that all BL models seem to underestimate the drought severity compared with those observed in Uccle in almost cases and it is therefore difficult to conclude which model best reproduces drought statistics.However the MBL model produces the drought statistics that are closest to those of the Uccle observations in case of 5 yr event for AND, OR and COND 1 types and of 10 yr event for AND and OR types.The TBL and TBLG models perform very good in case of COND 2 type for both 5 and 10 yr droughts.The OBL and MBLG models show disappointing results in most cases.The shortcomings of all BL models in simulating extreme drought events can be partly explained by the fact that the BL models simulate longer and more severe drought events with a too low frequency which can be attributed to several reasons.First, as the models use the same parameter sets throughout the simulation period, these models thus cannot foresee any non-stationarity.This could explain why the variability in, for example, YAEDI 365 values calculated from BL simulations is too small compared to those by the observed time series.Furthermore, the temporal variability assured through the stochastic process within the BL models is insufficient to allow for generating the extreme drought events.The simulating process in all BL models also does not assume any temporal autocorrelation between successive storms which may be needed to model longer drought periods.Finally, the model problem of over-clustering may have greater impacts during severe drought periods than during the remaining simulation period.One way to solve this problem is to investigate whether temporally changing parameter sets would allow to better reproduce the droughts while still ensuring the other characteristics of rainfall (such as moments, extreme rainfall or zero depth probabilities).This could be obtained by including a dependence between models parameters or cluster variables through, for example, introducing copulas in the BL structure.The limitations of the BL models could also be attributed to the fact that a Poisson process might be insufficient to simulate the storm arrivals; in that case modifying the structure of the BL model by replacing the exponential distribution modeling the inter-storm arrival times with a longertail distribution, could be investigated.Finally, even though the model is well calibrated, it is undeniable that stochasticity in the generated time series may still have an impact on the final results.Therefore, it is advisable to validate the performance of a model through the use of several model simulations.

Fig. 1 .
Fig. 1.Evolution of the EDI index of observed and simulated rainfall records for 104 yr.The y axis corresponds to the day of the year (DOY), while the x axis displays the year.

Fig. 5 .
Fig. 5. GEV and Kernel cumulative distribution functions of drought duration D.

Fig. 6 .
Fig. 6.GEV and Kernel cumulative distribution functions of drought severity S.

Fig. 7 .
Fig. 7. Comparison of Kernel cumulative distribution functions of D and S of observed and simulated data.

Fig. 8 .
Fig. 8. Fitted Frank copulas (full line) and empirical copulas (dotted red line) for all data set.

Table 3 .
Comparison of general historical rainfall characteristics with simulation results.Values between brackets are percentage deviations of the simulated characteristic with respect to the observation.

Table 4 .
Intra-and inter-annual variance of EDI for Uccle and BL models.

Table 2 )
, it is no longer considered in the study.

Table 5 .
Basic drought statistics of observed and simulated data.

Table 6 .
Values of the AD n statistic test for some distribution functions fitted to drought duration D and drought severity S.

Table 7 .
p values for the AD n statistic test (p values larger than 0.05 indicating an appropriate fit of the distribution are displayed in bold).

Table 8 .
Kendall's tau τ K for couple of D and S.