Uncertainty of Intensity–Duration–Frequency (IDF) curves due to varied climate baseline periods

Abstract Storm water management systems depend on Intensity–Duration–Frequency (IDF) curves as a standard design tool. However, due to climate change, the extreme precipitation quantiles represented by IDF curves will change over time. Currently, a common approach is to adopt a single benchmark period for bias correction, which is inadequate for deriving reliable future IDF curves. This study assesses the expected changes between the IDF curves of the current climate and those of a projected future climate, and the uncertainties associated with such curves. To provide future IDF curves, daily precipitation data simulated by a 1-km regional climate model were temporally bias corrected by using eight reference periods, each with a fixed length of 30 years and shifted by a moving window of 5 years, within the period 1950–2014. The bias-corrected data were then disaggregated into an ensemble of 5-min series by using an algorithm that combines the Nonparametric Prediction (NPRED) model and the method of fragments (MoF) framework. The algorithm uses radar data to resample the disaggregated future rainfall fragments conditioned on the daily rainfall and temperature data. The disaggregated data were then aggregated into different durations based on concentration time. The results suggest that the uncertainty in the percentage change of the projected rainfall relative to the rainfall in the current climate varies significantly depending on which of the eight reference periods is used for the bias correction. Both the maximum projection of rainfall intensity and the maximum change in future projections are affected by using different reference periods for different frequencies and durations. This important issue has been largely ignored by the engineering community, and this study shows the importance of including the uncertainty of benchmarking periods when bias-correcting future climate projections.


Introduction
The design of hydrosystems is commonly developed with the help of Intensity-Duration-Frequency (IDF) curves that represent the frequency and the intensity of maximum rainfall events for different durations. In different parts of the world, an upward trend in the maximum daily and sub-daily precipitation values has been observed, and these values are comparable to the amounts shown by the IDF curves (Al Mamoon et al., 2016; Rodríguez et al., 2014; Mirhosseini et al., 2014; Arnbjerg-Nielsen, 2012; Denault et al., 2002; Waters et al., 2003). However, non-stationarity causes variation over time in the return period of a specific rainfall event (i.e., storm) (Mailhot and Duchesne, 2010). It has been predicted that by the end of the 21st century there will be a substantial reduction in the return period of an annual maximum precipitation amount, with more frequent occurrence of extreme rainfall events (Intergovernmental Panel on Climate Change (IPCC), 2012). Urban storm water collection systems could be adversely affected by such changes (Willems, 2013). In many cases, the design of such collection systems is based on historical IDF curves, but these curves may need to be modified to account for the possible effects of climate change (Watt and Marsalek, 2013). Therefore, urgent action is needed to examine the accuracy and uncertainties of the IDF curves that are currently used for the design of urban storm water collection systems, taking into account projections of future short-duration rainfall (hourly or sub-hourly) under the impact of climate change.
Reliable modelling of the hydrological response of urban watersheds, whether for the current or future climate, requires hourly or even sub-hourly precipitation data (Segond et al., 2006; Watt et al., 2003). However, observed rainfall with fine temporal resolution is not often available; in many parts of the world, precipitation is generally recorded on a daily basis, and hourly records are available only in limited regions. In addition, most climate model data have a daily temporal resolution. Hence, to assess the robustness and sensitivity of urban storm water drainage systems, it is necessary to disaggregate precipitation for the current climate (in case fine resolution is not available) and future climate into finer temporal resolutions. Moreover, the creation of future IDF curves that depend on fine-resolution records of precipitation will be affected by different sources and levels of uncertainty. Such uncertainty casts considerable doubt on the outcome of the entire process, especially from an engineering and practical perspective. Sources of uncertainty include the climate change scenarios, the adopted global climate models (GCMs) and regional climate models (RCMs), natural internal weather variability, methods of downscaling and disaggregation, and techniques of bias correction. Several authors suggest that uncertainty in the results might be mitigated by adopting an ensemble approach where consideration is given to more than one climate model, IPCC emission scenario, and statistical downscaling method (Van Der Linden and Mitchell, 2009; Taylor et al., 2012). In this way, it should be possible to assess the extent of the uncertainty associated with each of these approaches.
Much of the recent literature on future climate projections in general (Sarr et al., 2015; Sunyer et al., 2015) and future IDF curves specifically (Alam and Elshorbagy, 2015; Kuo et al., 2014; Rodríguez et al., 2014; Mirhosseini et al., 2013) has adopted the aforementioned suggestion and has used an ensemble of climate models and IPCC emission scenarios to cover the uncertainty resulting from each of these two sources. However, with regard to the uncertainty caused by the bias correction of a climate model, we have noticed that most of the recent studies have carried out the bias correction of GCMs and RCMs statistically by depending upon one reference period (Sarr et al., 2015; Sunyer et al., 2015; Kuo et al., 2014; Mirhosseini et al., 2013). The problem with the bias correction studies so far lies in the assumption used for the correction, namely that the bias for the future period is identical to the bias in the control period. This may not always be true, and it may affect the results of future bias-corrected data. This is confirmed by Boberg and Christensen (2012) and Sunyer et al. (2014), who have shown that the bias of a climate variable (temperature or rainfall) depends on the value of that climate variable. Although some studies do account for a change in bias, they are either for coarse spatial resolutions, especially for GCMs (Li et al., 2010; Miao et al., 2016), or rely on subjective decisions that depend upon expert knowledge to define the range of bias change between the current and future climate (Buser et al., 2009, 2010).
The second drawback of many bias correction studies is related to the reference period used for the bias correction. Li et al. (2010) have shown that bias correction results are sensitive to the choice of reference period. The authors argue that care should be taken when adopting a specific reference period for bias correction.
A trend analysis of the rainfall process and its extremes shows that extreme precipitation exhibits multidecadal timescale fluctuations (Ntegeka and Willems, 2008; Willems, 2013). The precipitation oscillation peaks in different periods depending on the season and the region (Willems, 2013). Thus, choosing a reference period within an oscillation period of lower extremes could produce a different result for future climate compared with that based on another period. In addition, Willems (2013) shows that multidecadal oscillations occur with irregular periodicities in the range 30-60 years for central-western Europe. Thus, fixing the length of the reference period at 30 years in a bias correction might not reflect the true risk of precipitation in the future climate. However, as most of the regions lack long records of precipitation data for the study of the trend in rainfall extremes, researchers tend to adopt the results of Willems (2013) and fix the length of the reference period at 30 years for their climate studies (Buser et al., 2009, 2010; Sunyer et al., 2015; Kim et al., 2015, 2016).
As most of the recent studies on future climate projections adopt the above-mentioned conventional assumption for bias correction and fix the length of the reference period at 30 years, we have adopted the same assumptions in our study. However, we use different reference periods to correct the future RCM data bias and build future IDF curves by using only one RCM and one method for bias correction. By doing this, the extent of the uncertainty in future IDF curves can be investigated, and its source can be attributed to the reference period used for the bias correction of the RCM under the conventional assumption of correction.
Most of the previous studies have adopted the period 1961-1990 as the reference period for the bias correction or the downscaling of future GCMs and RCMs (Yang et al., 2010; Dosio et al., 2012; Kim et al., 2015, 2016). However, it would be logical to assume that the most recent period is more likely to resemble future projections because it has experienced more warming (Li et al., 2010). Thus, we intend to ascertain which of the periods (e.g., the commonly used reference period, the most recent period, or another specific reference period) produces the most extreme rainfall prediction. This is of importance for designing a reliable sewer system. The reference period with the highest extremes may produce the worst consequences for the sewer system and thus should be considered in the decision-making process. Although such a case may not be adopted for the design of the sewer system, due to the performance deterioration of any solution for the flood risk problem over time (Ashley et al., 2008), it is helpful to know what other flexible and sustainable solutions should be taken into account in flooding mitigation measures (Willems et al., 2012; Willems, 2013).
Thus, the objectives of this study are to (i) generate a continuous record of 5-min precipitation for the period 2069-2098 and construct future IDF curves; (ii) identify the change between the current and future climate; (iii) quantify the uncertainty associated with the constructed future IDF curves that may be caused by the reference period; and (iv) determine whether there is a specific reference period that, when used for the bias correction, produces more extreme values than the other reference periods, i.e., the worst case that the designer of a sewer system needs to know.

Rainfall data
The study area is located in West Yorkshire, Northern England and comprises an area of approximately 12 km × 5 km. The observed rainfall dataset used in this study is the gridded precipitation product created by the Centre for Ecology & Hydrology, Gridded Estimates of Areal Rainfall (CEH_GEAR), for the period 1890-2014 (Keller et al., 2015). This gridded data set has a spatial resolution of 1 km × 1 km and is based on different station densities for different periods. Station density peaked at around 6250 stations in 1974 (Eden, 2009), while for the period 1961-2000 there was an average of one rainfall station per 49 km² (4400 stations) (Perry and Hollis, 2005). For this study, the CEH rainfall data that cover our study area for the period 1950-2014 were adopted as the observed data.
The composite radar data covering the study area were provided by the UK Met Office radar network through the British Atmospheric Data Centre (BADC) with spatial and temporal resolutions of 1 km and 5 min, respectively. A 60-km² area of radar grids covers the study area. The catchment is within the coverage of three single-polarisation C-band weather radars at Hameldon Hill, High Moorsley, and Ingham, which are located 30 km, 95 km, and 90 km away from the study area, respectively (UK Met Office, 2009). Quality control and corrections of the main sources of error related to the radar rainfall data were implemented by the UK Met Office Nimrod System (Harrison et al., 2009). Further checking and post-processing of the radar data were also performed; the methods used are briefly explained in Fadhel et al. (2016). The 5 min radar data for the period January 2006-May 2016 were used to disaggregate the future daily RCM data down to 5-min durations.
The gridded daily temperature data provided by the Climate hydrology and ecology research support system meteorology dataset (CHESS-met) at 1 km spatial resolution are used in this study. Robinson et al. (2015) derived the CHESS-met air temperature for a reference height of 1.2 m by using the bicubic spline method to interpolate the MORECS air temperature from 40 km to 1 km resolution. The interpolated data at each 1 km grid cell were then adjusted for elevation using the Integrated Hydrological Digital Terrain Model. The CHESS-met temperature data for the period 1961-2015 were adopted in this study as the observed dataset. They were used to bias-correct the climate temperature variable (Section 3.1) and for rainfall disaggregation (Section 3.2).

Climate data
In this study, we have used the climate data of the Met Office Hadley Centre HadRM3 dataset. The Met Office Hadley Centre's RCM HadRM3 uses the global climate model HadCM3 to project future climate conditions at regional level (Murphy et al., 2009). The RCM data consist of an 11-member ensemble of one unperturbed member and ten members with various perturbations to atmospheric parametrisations, all of which rely on the same historical emissions scenario, SRES A1B (Murphy et al., 2009). Time series of climate data for the period 1950-2100 can be obtained from the HadRM3 Perturbed Physics Experiment Dataset (HadRM3-PPE-UK) at a resolution of a 25-km grid in space and daily in time.
The study area is located in the middle of two 25-km grids of the climate grids. Thus the future modelled temperature data, which is for a reference height of 1.5 m, for the two grids and the period 2069-2098 were bias-corrected depending on the period 1985-2014.
However, the precipitation provided by the HadRM3 dataset was modified and downscaled in the Future Flow Climate data from the Centre for Ecology and Hydrology because of systematic discrepancies from the observations for the historical period pre-2000. Such discrepancies are frequently produced by RCM outputs because accurate reproduction of small-scale atmospheric processes is hindered by their coarse spatial resolution. This could have significant effects on the results if the data are used to model river flow and groundwater levels, so the HadRM3-PPE-UK daily outputs were adjusted by applying a statistical method to ensure compatibility between their statistical properties and those of the observations for the identical periods. Furthermore, the lack of spatial variability in precipitation within the 25-km grid was dealt with through spatial downscaling to 1 km × 1 km (Newton et al., 2012).
Although the modelled precipitation data were bias-corrected and downscaled by Newton et al. (2012), the data still contain some discrepancies compared with the real climate. Thus, the climate precipitation data with 1 km spatial resolution for the period 1950-2014 were adopted to correct the bias of the future precipitation data for the period 2069-2098, as explained in the next section.

Statistical bias correction method
Bias correction can be undertaken by using various approaches; this study uses the distribution-based scaling (DBS) approach to correct the bias of daily RCM data for two climate variables, temperature and precipitation (Yang et al., 2010). For precipitation, the gamma distribution is used to map the quantiles of the observed and simulated data for every monthly segment of the calendar year. More specifically, we used the double gamma distribution by separating the precipitation distribution into two segments divided by the 95th percentile, which helps to capture the major features of both normal and extreme precipitation.
To bias-correct the climate temperature data, the spatial resolution of the observed and modelled temperature data sets must first be matched. Thus, the observed temperature data were spatially upscaled from 1 km to 25 km by using a simple averaging method. The upscaling procedure yields two grids that match the RCM temperature grids covering our study area. The modelled temperature data were then bias-corrected against the upscaled observed data using the DBS approach based on the Gaussian distribution. As in Yang et al. (2010), to take into account the seasonal variation, a 15-day moving window was used to smooth the mean and standard deviation of daily temperature, which were further smoothed using Fourier series with five harmonics. Since Olsson et al. (2015) have shown that the bias of modelled temperature data could be corrected with and without dependence between temperature and precipitation (i.e. wet-dry day separation), we have corrected the climate temperature data without wet-dry day separation.
To bias-correct the model future projection, the cumulative distribution function (CDF) of the model in the reference period is used to identify the percentile that corresponds to each value of the future period. The observation CDF is then used to find the climate variable value for the same cumulative probability, which represents the bias-corrected future value. The drawback of this procedure is the assumption of constant bias between the reference and future periods. However, we accept this assumption, as most of the recent studies did (Sarr et al., 2015; Sunyer et al., 2015; Kim et al., 2015, 2016), and we investigate the effect of the reference period on bias-corrected future projections.
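The two CDF look-ups described above can be illustrated with a minimal sketch. It uses empirical CDFs with linear interpolation rather than the fitted gamma and Gaussian distributions of the DBS approach, so it is a simplified stand-in and not the procedure applied in this study. The function names and the toy numbers are hypothetical, and future values outside the range of the reference-period model data are simply clamped to the ends of the observed distribution.

```python
from bisect import bisect_right


def model_probability(sorted_model, x):
    """Cumulative probability of x in the reference-period model sample.

    Uses the interpolated rank of x among the sorted values, scaled to [0, 1].
    Values outside the sample range are clamped to 0 or 1.
    """
    n = len(sorted_model)
    if x <= sorted_model[0]:
        return 0.0
    if x >= sorted_model[-1]:
        return 1.0
    i = bisect_right(sorted_model, x) - 1  # sorted_model[i] <= x < sorted_model[i + 1]
    frac = (x - sorted_model[i]) / (sorted_model[i + 1] - sorted_model[i])
    return (i + frac) / (n - 1)


def observed_quantile(sorted_obs, p):
    """Inverse empirical CDF of the observations (linear interpolation)."""
    pos = p * (len(sorted_obs) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_obs) - 1)
    weight = pos - lower
    return (1.0 - weight) * sorted_obs[lower] + weight * sorted_obs[upper]


def quantile_map(future_model, reference_model, reference_obs):
    """Bias-correct future model values under the constant-bias assumption."""
    model_sorted = sorted(reference_model)
    obs_sorted = sorted(reference_obs)
    return [
        observed_quantile(obs_sorted, model_probability(model_sorted, value))
        for value in future_model
    ]


# Toy example (hypothetical numbers): the model overestimates the observations
# by a factor of two in the reference period.
reference_model = [2.0, 4.0, 6.0, 8.0, 10.0]
reference_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(quantile_map([6.0, 7.0, 12.0], reference_model, reference_obs))
# -> [3.0, 3.5, 5.0]
```

Because the mapping is built entirely from the reference period, changing that period changes both look-up tables, which is the sensitivity examined in the remainder of this paper.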
As mentioned earlier, most of the previous studies have adopted one reference period to bias-correct an ensemble of GCMs and RCMs. However, this study takes a different approach, which is to bias-correct one RCM by using an ensemble of reference periods. For this analysis, the full time series of the observed and modelled rainfall for the period 1950-2014 were divided into eight sub-periods, each with a fixed length of 30 years, moving from the first to the last sub-period with a moving window of 5 years. Each of these eight sub-periods represents a reference period that is used to bias-correct the future RCM (the intervals of the reference periods are shown in Table 3). Thus, each ensemble member of the future RCM for the period 2069-2098 is bias-corrected eight times based on the eight reference periods.
However, since the focus of the study is to assess the uncertainty of the constructed future IDF curves that may result from the reference period used to bias-correct the climate rainfall data, the bias of the future climate temperature data for the period 2069-2098 was corrected by fixing the reference period to 1985-2014.
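The eight sub-periods follow directly from the stated window length and shift, as the short sketch below shows. The function name and its parameters are illustrative only.

```python
def reference_periods(first_year, last_year, length=30, step=5):
    """List (start, end) years of all windows of `length` years, shifted by `step`."""
    periods = []
    start = first_year
    while start + length - 1 <= last_year:
        periods.append((start, start + length - 1))
        start += step
    return periods


for start, end in reference_periods(1950, 2014):
    print(f"{start}-{end}")
# 1950-1979, 1955-1984, 1960-1989, 1965-1994,
# 1970-1999, 1975-2004, 1980-2009, 1985-2014
```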

Disaggregation model
It is necessary to be able to disaggregate precipitation data collected on one timescale (e.g., daily) to a different, shorter timescale (e.g., hourly). The algorithm for rainfall disaggregation used in this study is very similar to the one used in Westra et al. (2013), and combines the Nonparametric Prediction (NPRED) model with the method of fragments (MoF) framework. The difference is that this study uses the NPRED model instead of the generalised additive model (GAM) used by Westra et al. (2013).
The NPRED model adopts the logic of Partial Informational Correlation (PIC) to identify the system predictors, and uses a k-nearest-neighbour regression formulation based on Partial Weights (PW) to predict the response depending on the weighted Euclidean distance. The NPRED model was recently released within the open source NPRED R-package; for more information about the model, the reader is referred to Sharma et al. (2016) and Sharma and Mehrotra (2014). Since the NPRED model can be used only for prediction, the MoF (developed by Westra et al. (2012) and Mehrotra et al. (2012) under the historical climate assumptions) is used to find the full temporal distribution of sub-daily rainfall (Westra et al., 2013). The MoF resamples the sub-daily fragments from the historical observations, conditioned on daily rainfall and other atmospheric covariates.
In this study we are interested in disaggregating the current daily observed and future modelled rainfall data for the periods 1985-2014 and 2069-2098, respectively, to a sub-hourly scale (more precisely, to a 5-min scale). However, due to climate change, future rainfall patterns are expected to resemble the patterns of the warmer days in the historical rainfall data (Westra et al., 2013). Thus, using historical rainfall alone as a predictor for future rainfall disaggregation may not reflect the true temporal pattern of future rainfall.
Scaling, which is the relationship between temperature and rainfall, was investigated by Blenkinsop et al. (2015) for the UK. The authors found that the scaling magnitude for extreme hourly rainfall is more pronounced in summer than in other seasons and is centred around 6.9% per °C. However, since we are interested in disaggregating the rainfall data to a sub-hourly scale, it is better to assess the rainfall-temperature relationship for storm burst fractions, which can help to find how the rainfall temporal pattern will change with temperature. For this analysis we adopt the four gauges closest to our study area instead of the radar data because the temporal coverage of the gauges is longer than that of the radar. The gauges, with a 15 min temporal resolution, were quality checked using the procedure explained in Fadhel et al. (2016). Table 1 shows the record length for each gauge and the distance between the gauges and the centre of the study area. It is clear from the table that three of the gauges share the period 1997-2015, while the last gauge covers only 15 years of data. Thus we tested the significance of the scaling magnitude for each gauge individually using the whole length of the data, and for the three gauges using the shared 19 years. The procedure of Wasko and Sharma (2015) for scaling the total storm volume and storm burst fractions with temperature was adopted in this study by using all the data (i.e. without seasonal separation). The scaling results for the hourly storm volume show a statistically significant positive scaling of 2.4% per °C to 3.2% per °C in three out of the four gauges.
Regarding the scaling magnitude for each fraction, three gauges show a statistically significant positive scaling with temperature for the first two fractions, whereas a statistically significant negative scaling for the last fraction was shown by two gauges (Table 1). For the third fraction, however, the results show a very small increase with temperature, and the result was significant for only one gauge. Since the study area is not very large, the above results from only four gauges with varied significance confirm that the rainfall temporal pattern changes with temperature. The scaling magnitude was assessed for different durations; however, we show only the hourly results since this subject is not the main focus of the paper.
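To make the unit "% per °C" concrete, the sketch below fits an exponential relationship between storm volume and temperature by ordinary least squares on log-transformed volumes. This is only an illustration of how a scaling rate can be expressed. It is not claimed to reproduce the procedure of Wasko and Sharma (2015), it includes no significance test, and the function name and data are hypothetical.

```python
import math


def scaling_rate(temperatures, volumes):
    """Fractional change in volume per degree, from ln(volume) = a + b * T.

    Returns exp(b) - 1, so a value of 0.03 means 3% per degree Celsius.
    """
    n = len(temperatures)
    if n != len(volumes) or n < 2:
        raise ValueError("need two equally long samples with at least two points")
    logs = [math.log(v) for v in volumes]
    t_mean = sum(temperatures) / n
    log_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in temperatures)
    sxy = sum((t - t_mean) * (y - log_mean) for t, y in zip(temperatures, logs))
    slope = sxy / sxx
    return math.exp(slope) - 1.0


# Hypothetical storm volumes that grow by exactly 3% per degree.
temps = [5.0, 10.0, 15.0, 20.0]
vols = [10.0 * 1.03 ** t for t in temps]
print(f"{100.0 * scaling_rate(temps, vols):.1f}% per degree")
```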
From the above results it can be seen that it is of importance to use temperature as a second predictor for future rainfall disaggregation, especially the disaggregation into fine temporal resolution. For this reason, the NPRED tool is used to predict the fraction of rainfall occurring in the maximum 5 min storm burst using the daily rainfall and temperature as predictors.
Intuitively, dividing the data into seasonal segments and using them for the prediction should be better than using all the data without seasonal separation. However, the PIC theory in the NPRED model shows that seasonal separation results in the rainfall predictor alone being used for the predictions, rather than both rainfall and temperature. This is the case for all seasons except winter. To further explore the importance of seasonal separation, a multiple linear regression (MLR) with equal weights was fitted between the predictor variables and the response. The MLR model was fitted to the data with and without seasonal separation, and the statistics of the leave-one-out cross-validated results were compared with the corresponding cross-validated results using PW in the NPRED tool for all the data without seasonal separation. The results in Table 2 show that the PIC_PW algorithm using all the data performs better than the linear regression with equal weights, both when the latter uses all the data and when it is fitted separately for the two seasons (summer and autumn). Thus, the NPRED tool was used for the prediction of the maximum 5 min storm burst without seasonal separation of the data.
The section below is a brief description of the algorithm used for the rainfall disaggregation procedure.
1. Find the 5 min fraction, which is the ratio of the maximum 5 min storm burst on a specific day to the total rainfall amount for that day, by using the historical sub-daily rainfall data (the radar data in our study);
2. Fit a model with daily rainfall (i.e. the sum of the 5 min radar data) and temperature as the predictor variables and the 5 min fraction from step 1 as the response, using the NPRED tool;
3. Apply the model from step 2 to predict the future 5 min fraction using future climate data as predictors;
4. Use the MoF framework to search the historical data for days whose atmospheric state is analogous to the future state projected in step 3, and sample the historical rainfall temporal pattern from one of these days (see Eq. (2) in Westra et al., 2012).
The optimal window size used in the MoF is 15, as in Sharif and Burn (2007).
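The four steps above can be sketched in a few lines. This is a heavily simplified illustration under stated assumptions, not the NPRED_MoF algorithm itself: the predictor weights are fixed by hand instead of being derived from PIC, the 1/rank neighbour kernel is an illustrative choice, analogue days are selected only by closeness of the observed fraction to the predicted one, and the MoF window of 15 is not modelled. All function names and the toy data are hypothetical.

```python
import math
import random


def max_burst_fraction(series):
    """Step 1: share of the daily total that falls in the wettest 5 min step."""
    total = sum(series)
    if total <= 0:
        raise ValueError("series must contain rainfall")
    return max(series) / total


def knn_predict_fraction(history, rain, temp, k=3, weights=(1.0, 1.0)):
    """Steps 2-3: k-nearest-neighbour estimate of the 5 min fraction.

    history holds (daily_rain, temperature, fraction) tuples. Neighbours are
    ranked by a weighted Euclidean distance and averaged with 1/rank weights.
    """
    w_rain, w_temp = weights
    ranked = sorted(
        history,
        key=lambda h: math.hypot(w_rain * (h[0] - rain), w_temp * (h[1] - temp)),
    )[:k]
    kernel = [1.0 / rank for rank in range(1, len(ranked) + 1)]
    return sum(kw * h[2] for kw, h in zip(kernel, ranked)) / sum(kernel)


def disaggregate_day(days, rain, temp, k=3, rng=None):
    """Step 4: resample a historical temporal pattern and rescale it.

    days holds (temperature, series) tuples, where series is the list of
    sub-daily depths observed on that day. The returned series keeps the
    sampled pattern but sums to the future daily total `rain`.
    """
    rng = rng or random.Random()
    history = [(sum(s), t, max_burst_fraction(s)) for t, s in days]
    target = knn_predict_fraction(history, rain, temp, k)
    # Analogue days: the k days whose observed fraction is closest to the target.
    analogues = sorted(days, key=lambda d: abs(max_burst_fraction(d[1]) - target))[:k]
    _, pattern = rng.choice(analogues)
    total = sum(pattern)
    return [rain * depth / total for depth in pattern]


# Toy history with four time steps per day instead of 288 (hypothetical data).
days = [
    (5.0, [0.0, 1.0, 3.0, 0.0]),
    (10.0, [2.0, 2.0, 2.0, 2.0]),
    (15.0, [0.0, 0.0, 8.0, 0.0]),
    (12.0, [1.0, 5.0, 1.0, 1.0]),
]
future = disaggregate_day(days, rain=12.0, temp=14.0, k=2, rng=random.Random(0))
print(future, sum(future))
```

Rescaling the sampled fragments by the future daily total preserves the daily depth, so the disaggregation changes only the temporal distribution within the day.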

Creation of IDF curves
An Extreme Value (EV) distribution, also known as the Gumbel distribution, was selected as the best probability distribution for our study area based on three different goodness-of-fit tests (Anderson-Darling, Kolmogorov-Smirnov, and Chi-Squared). The derivation process is not shown here because it followed the standard procedure (Millington et al., 2011). The EV distribution for the annual maximum series was used in this study to create the IDF curves. The EV parameters were estimated by the method of maximum likelihood. The process of creating future IDF curves consists of the following eight steps:
1. Bias-correct the future RCM rainfall data for the period 2069-2098 using a specific reference period;
2. Disaggregate the daily future bias-corrected RCM data to 5 min using the NPRED_MoF algorithm;
3. Take the areal average of the 5-min disaggregated rainfall data for the study area;
4. Aggregate the 5-min rainfall intensity to four different durations based on concentration time, which was found to be 15 min for the study area following the equations proposed by Desbordes (1974). The aggregation can be done by applying a moving average operation, with a time step equal to the concentration time, to the full 5-min disaggregated time series to aggregate the rainfall intensity into four rainfall durations (15 min, 1, 6, and 24 h);
5. Obtain the annual maximum series of precipitation intensity for each duration;
6. Use the Gumbel EV distribution to find precipitation depths for different return periods (2, 5, 10, 25, 50, and 100 years);
7. Repeat steps 5 and 6 for each duration;
8. Repeat steps 1 to 7 for different reference periods.
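Steps 4 to 6 can be sketched as follows. The sketch assumes the moving average is taken over a window equal to the target duration (3, 12, 72, and 288 steps of 5 min for 15 min, 1 h, 6 h, and 24 h), which is one reading of step 4, and it solves the Gumbel maximum likelihood equations by bisection. The function names are hypothetical and the code is an illustration rather than the implementation used in this study.

```python
import math


def rolling_mean_max(series, window):
    """Largest moving-average value over `window` consecutive time steps."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    current = sum(series[:window])
    best = current
    for i in range(window, len(series)):
        current += series[i] - series[i - window]
        best = max(best, current)
    return best / window


def annual_maxima(series_by_year, window):
    """Step 5: one maximum moving-average intensity per year."""
    return [rolling_mean_max(series, window) for series in series_by_year]


def fit_gumbel_mle(sample, tol=1e-10):
    """Maximum likelihood estimates (location mu, scale beta) of a Gumbel law.

    The scale solves beta = mean(x) - sum(x * w) / sum(w) with
    w = exp(-x / beta); the root is bracketed and found by bisection.
    """
    n = len(sample)
    x_min, x_max = min(sample), max(sample)
    if n < 2 or x_max == x_min:
        raise ValueError("need at least two distinct values")
    mean = sum(sample) / n

    def score(beta):
        # Shift by x_min to keep the exponentials in a safe numerical range.
        w = [math.exp(-(x - x_min) / beta) for x in sample]
        return beta - mean + sum(x * wi for x, wi in zip(sample, w)) / sum(w)

    lo, hi = 1e-9 * (x_max - x_min), 2.0 * (x_max - x_min)
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    mean_w = sum(math.exp(-(x - x_min) / beta) for x in sample) / n
    mu = x_min - beta * math.log(mean_w)
    return mu, beta


def gumbel_return_level(mu, beta, return_period):
    """Step 6: value exceeded on average once every `return_period` years."""
    return mu - beta * math.log(-math.log(1.0 - 1.0 / return_period))
```

For a fitted pair `(mu, beta)`, calling `gumbel_return_level(mu, beta, T)` for T in (2, 5, 10, 25, 50, 100) gives one point per return period; repeating this for each duration yields one IDF curve set per reference period.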

Bias correction
This study adopted the distribution-based scaling method for bias correction (Yang et al., 2010). As mentioned earlier, the conventional bias correction method assumes that the bias for the future and benchmark periods is the same. An example of the effect of this assumption on future bias-corrected rainfall data is presented in Fig. 1(a-c), which illustrates three cases using one RCM ensemble member and three different reference periods for the month of January. Fig. 1a shows a case in which the modelled rainfall for the first RCM ensemble member overestimates the observation in the reference period. Hence the bias in the figure, which in this study is the relative difference between the observed and modelled rainfall, is negative. Thus, using the same bias to correct the future RCM will result in corrected data that are lower than the original uncorrected future RCM data. The bias correction in this case tends to shift the CDFs of the RCM for both current and future periods downwards so that the RCM in the current period matches the observation.
Fig. 1. Effect of reference period on future bias-corrected rainfall using the conventional assumption of bias correction for three reference periods.
In contrast, when the reference period 1985-2014 is used for the bias correction, it is clear from Fig. 1b that the RCM underestimates the observation for this period. Thus, the future bias-corrected RCM data obtained by applying the same bias as in the control period are larger than the corresponding raw future RCM data because the corrected CDFs for the current and future periods are shifted upwards.
In addition, there is another case in which two reference periods have the same bias direction (both under- or overestimate the observation) but different magnitudes. Fig. 1c shows that the bias magnitude for the period 1965-1994 is less than the corresponding magnitude for the period 1985-2014 (Fig. 1b), although the two periods have biases in the same direction (both underestimate the observation). In this case, the future bias-corrected data based on the former reference period are lower than the corresponding bias-corrected data based on the latter reference period.
To make it easier to visualise how the bias can change over time in direction, magnitude, or both, Fig. 2 shows the monthly magnitude and direction of the bias between the areal averages of the observed and modelled rainfall for the eight reference periods and for extreme rainfall. By referring to these changes in bias over time, one can determine whether the future bias-corrected data would be larger or smaller than the original uncorrected data.
From the above discussion, it is clear that the accuracy of future bias-corrected RCM rainfall depends significantly on the reference period, as confirmed by Newton et al. (2012), because the direction and magnitude of the bias between the modelled and observed rainfall differ between reference periods and therefore produce different future bias-corrected data. In addition, as mentioned in the introduction, due to multidecadal oscillations, a reference period may have fewer extremes and/or those extremes may have lower values than those in another period (Willems, 2013). Another explanation is that fixing the length of the reference period to 30 years may not be sufficient to represent the total length of an oscillation (Willems et al., 2012). Consequently, these two issues may affect the results of future bias-corrected rainfall data and later the IDF curves.
Fig. 2. Monthly magnitude and direction of the biases between the modelled (box plot of the 11 ensemble members) and the observed (solid line) extreme rainfall for eight reference periods.
Table 3. The percentage of change between the future and current climate for the mean of the 11 RCM ensemble members, five rainfall durations, 5-year return period, and eight reference periods.

In an attempt to address this issue, we have adopted the conventional assumption to bias correct the future RCM rainfall but for eight reference periods. In our method, the future biascorrected data are further disaggregated to 5-min precipitation values using the NPRED_MoF algorithm. Then the future IDF curves for the disaggregated data can be produced. By doing this, the Fig. 3. IDF curves for the current and future climate for the first and last reference periods. Each subplot is for a specific return period and for five rainfall durations. extent of the uncertainty, which is the range of variation of the results between the eight cases, in the future IDF curves can be investigated. We assume that such uncertainty may originate from the reference period used for the bias correction.
It is worth mentioning that this study does not focus on the reasons for the bias between the modelled and observed data, since this subject has already been covered comprehensively in the previous literature (Baker and Peter, 2008; Hawkins and Sutton, 2009; Willems et al., 2012). Instead, we focus on how the bias changes over time and how the selection of a reference period can affect the results of future bias-corrected RCM rainfall and IDF curves.
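As an illustration only, the eight reference periods can be enumerated from the description given in this paper (30-year windows shifted in 5-year steps within 1950-2014). The function name and its parameters are hypothetical and are not part of the study's code.

```python
# Enumerate 30-year reference periods shifted by 5 years within 1950-2014.
# The function name and default arguments are illustrative assumptions.
def reference_periods(first_year=1950, last_year=2014, length=30, step=5):
    """Return inclusive (start, end) year pairs for each moving window."""
    periods = []
    start = first_year
    # Keep only windows that fit entirely inside the available record.
    while start + length - 1 <= last_year:
        periods.append((start, start + length - 1))
        start += step
    return periods


periods = reference_periods()
for number, (start, end) in enumerate(periods, start=1):
    print(f"Reference period {number}: {start}-{end}")
```

Under these assumptions the third window is 1960-1989 and the seventh is 1980-2009, which matches the periods named explicitly in the text.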

IDF curves
In this study, the IDF curves for a small urban area in West Yorkshire were created for the future climate by using the following: 11 RCM ensemble members; five rainfall durations; six return periods; and eight reference periods for the bias correction of the RCM. The IDF curves were then used to find the percentage of relative change between the current and future climate. In addition, we investigated the uncertainty (i.e. range) of the results for the future climate by comparing the eight cases of future IDF curves that were created from the eight reference periods adopted for the bias correction of the future RCM. Fig. 3(a & b) shows an example of the IDF curves for both the current and future climate and for the first and last reference periods. The plots demonstrate that the IDF curves in the future climate differ from the currently employed IDF curves and that the extent of the difference varies depending on the reference period employed. It is clear from Fig. 3(a & b) (and from the figures for the rest of the reference periods, which are not shown here) that, for all the reference periods except the first one, the projected rainfall intensity for most of the climate ensemble members tends to increase for all frequencies. Only a few climate ensemble members project a decrease in rainfall intensity, and these decreases were found to be much smaller than the corresponding increases projected by the other members. However, the first reference period differs from the above results for two durations (5 min and 1 h). For the 5 min duration, the projected rainfall intensity decreases for all frequencies except the first one, while for the 1 h duration, the future rainfall intensity declines for the last three return periods. This is the case for most of the climate ensemble members.
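The aggregation of a 5-min series into longer durations, which precedes the construction of the IDF curves, can be sketched as a moving-window sum. This is a minimal sketch under stated assumptions: the function name and the sample depths are hypothetical, and the study's own aggregation procedure may differ in detail.

```python
# Maximum mean intensity over any window of a given duration, computed
# from a series of fixed-step rainfall depths. All names and values here
# are illustrative assumptions, not data from the study.
def max_intensity(depths_mm, step_min, duration_min):
    """Return the maximum mean intensity (mm/h) for the given duration."""
    n = duration_min // step_min  # number of time steps in one window
    if n < 1 or n > len(depths_mm):
        raise ValueError("duration must fit within the series")
    window = sum(depths_mm[:n])  # depth in the first window
    best = window
    for i in range(n, len(depths_mm)):
        # Slide the window forward by one step.
        window += depths_mm[i] - depths_mm[i - n]
        best = max(best, window)
    return best * 60.0 / duration_min  # convert depth per window to mm/h


# Hypothetical one-hour event sampled at 5-min steps (depths in mm).
series = [0.0, 0.2, 1.0, 2.5, 0.5, 0.0, 0.0, 0.3, 0.1, 0.0, 0.0, 0.0]
for duration in (5, 15, 60):
    print(duration, "min:", round(max_intensity(series, 5, duration), 2), "mm/h")
```

Applying such a function to each year of a disaggregated series would give one annual maximum per duration, which is the usual input to a frequency analysis.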
It is worth mentioning that the significance of the results at the 95% level was tested using the maximum likelihood estimates of the parameters used for developing the IDF curves (results not shown). Table 3 shows a sample of the results for the percentage of change between the future and current climate for the mean of the 11 RCM ensemble members, five rainfall durations, eight reference periods, and a 5-year return period. The results from Table 3, together with the results not shown, are translated into a schematic plot in Fig. 4, which shows how the uncertainty in the percentage of change between the current and future climate relates to the return period and the rainfall duration. It was found that the uncertainty in the change between the current and future climate varies according to the reference period and tends to be more pronounced as the return period increases for each rainfall duration (Plot 1 in Fig. 4). However, for each return period, the uncertainty in the change between the current climate and the future climate projection becomes smaller as the rainfall duration increases over the eight reference periods (Plot 2 in Fig. 4). This is true for all return periods except the first one, where the uncertainty increases for the last three durations and declines for the first two.
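The two quantities discussed here, the percentage of change and its uncertainty (range) across reference periods, can be written out as a short sketch. The intensities below are hypothetical placeholders, not values from Table 3, and the variable names are assumptions.

```python
# Percentage of change between future and current intensity, and the
# range of that change across reference periods (the "uncertainty").
# All numeric values are hypothetical placeholders.
def percent_change(future, current):
    """Relative change of the future value with respect to the current one."""
    return 100.0 * (future - current) / current


current_intensity = 20.0  # mm/h, hypothetical current-climate value
future_intensity = {      # mm/h, hypothetical value per reference period
    "1950-1979": 19.0,
    "1960-1989": 21.0,
    "1980-2009": 23.0,
}

changes = {
    period: percent_change(value, current_intensity)
    for period, value in future_intensity.items()
}
uncertainty_range = max(changes.values()) - min(changes.values())

print(changes)
print("Range across reference periods:", uncertainty_range, "percentage points")
```

Repeating this calculation for every duration and return period would give the kind of comparison summarised schematically in Fig. 4.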
The results for the percentage of change between the future and current climate show that the reference period with the maximum change varies according to rainfall duration and return period. The most recent period dominates the other seven reference periods and shows the maximum change in rainfall for the first three durations for all the return periods except the first one. The same reference period shows the maximum change in rainfall intensity for the duration of 24 h, but only for the last three return periods. For the first three return periods, the sixth reference period shows the maximum future rainfall change compared with the current climate. The sixth reference period also shows the maximum increase in future rainfall for the duration of 6 h over the six return periods, and for all durations for the first return period.
The reference period 1960-1989, which is comparable to the period 1961-1990 used for bias correction by most previous studies, shows the second highest increase in future rainfall intensity, but only for the first duration. This is the case for all the frequencies except the first one. For the other durations, the percentage of increase in future rainfall intensity based on the reference period 1960-1989 is lower than the corresponding percentages for most of the other reference periods, although it is not the smallest. This is especially the case as the return period increases.
It is clear from the above discussion that no specific reference period produces the maximum change in future rainfall compared with the current climate for all frequencies and durations. This is because the annual extremes for each duration depend upon both the concentration time and the data that have been bias-corrected based on a specific reference period. In addition, the multidecadal oscillation (with a varied number and magnitude of extremes) and the fixed 30-year length of the reference period (which may not reflect the total length of an oscillation) might be another explanation for the variation in the results. In this study the sources of uncertainty are diverse, starting with the choice of the RCM and ending with the creation of the future IDF curves. Hence, the magnitude of uncertainty accumulates through the whole procedure. However, we believe that a significant part of the uncertainty in the future IDF curves arises from the reference period used for the bias correction, which should not be ignored.
As mentioned earlier, the projection results based on the selected eight reference periods are not consistent for longer return periods and shorter durations. To further explore the results for longer return periods, the rainfall intensities for 24 h of rainfall in the future and current climate were plotted for different return periods and for the eight reference periods. The results are presented in Fig. 5a. If a given rainfall intensity under the current climate occurs once every 50 years, the probability of that rainfall happening in any year is p = 2%. The return period for the same rainfall intensity under future climate conditions was found to vary between 11.4 and 18.97 years depending upon the reference period (taking the mean of the 11 RCM ensemble members). The results for the second and the seventh reference periods suggest that this rainfall intensity is expected to happen around once every 12 years (p = 8.33%). However, when the reference period widely adopted in the previous literature is used, the same rainfall intensity is expected to happen once every 16.2 years (p = 6.17%), which is almost the same result as for the fourth and fifth reference periods. The first reference period shows that the same rainfall intensity is expected to occur once every 18.97 years (p = 5.27%), which is less than half of the current return period. The shortest future return period for the current rainfall intensity was shown by the sixth and the most recent reference periods, at about once every 11.4 years (p = 8.77%). The significance of the above results at the 95% level is shown in a table within Fig. 5a.
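The probabilities quoted above follow from the standard relation between a return period T (in years) and the annual exceedance probability, p = 1/T. A minimal check of the figures in the text, with a hypothetical function name:

```python
# Annual exceedance probability from a return period, p = 1 / T.
# The function name is an illustrative assumption.
def annual_exceedance_probability(return_period_years):
    """Probability that the event is equalled or exceeded in any one year."""
    return 1.0 / return_period_years


# Return periods quoted in the text for the 24 h rainfall example.
for T in (50, 18.97, 16.2, 12, 11.4):
    p = annual_exceedance_probability(T)
    print(f"T = {T} years -> p = {100 * p:.2f}%")
```

With the values from the text, an event with a 50-year return period under the current climate (p = 2%) becomes roughly 2.6 to 4.4 times more likely in any given year, depending on the reference period.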
The above analysis was repeated for other durations and return periods. Fig. 5b is an example that is similar to Fig. 5a but for a rainfall duration of 15 min. The results show that the uncertainty seen in the above analysis is much higher for shorter durations and longer return periods. In addition, the reference period that shows the shortest future return period for a current rainfall intensity (i.e. the sixth reference period in the above example) differs between durations and return periods. Another interesting point is that when two or more reference periods give almost the same future return period for a current rainfall intensity, the same reference periods do not necessarily produce similar projections for other durations and return periods.
It is clear from these results that the extent of uncertainty for a given rainfall intensity, which is expected to appear in the future but with a shorter return period, varies according to the reference period. Such uncertainty is considerable for shorter durations and longer return periods.
The reason for using 11 RCM ensemble members is to mitigate the uncertainty of future rainfall projections. In the above analysis, we focused on the mean of the 11 RCM ensemble members to make it easier to illustrate how the uncertainty associated with the reference period arises. In the following, we focus on the uncertainty of future projections for the wettest ensemble member, the driest ensemble member, and the mean of the 11 RCM ensemble members. Table 4 shows a sample of the results for the future rainfall intensity for a 5-year return period, five rainfall durations, eight reference periods, and for the wettest ensemble member, the driest ensemble member, and the mean of the 11 RCM ensemble members. The wettest ensemble member is also known as the "pessimistic" climate scenario, as defined by Willems (2013), because it shows the highest impact among the climate ensemble members. For this member, it was found that the uncertainty of future rainfall intensity over the eight reference periods increases as the return period lengthens for a specific rainfall duration (Plot 1 in Fig. 4).
This uncertainty is significant for short rainfall durations (less than 1 h), but it tends to be less significant for rainfall durations of more than 1 h as the return period lengthens. However, for a specific return period the uncertainty of future rainfall intensity tends to decline as the rainfall duration lengthens over the eight reference periods (Plot 2 in Fig. 4). Likewise, the uncertainty for the driest ensemble member and the mean of the 11 RCM ensemble members grows as the rainfall duration shortens and the return period lengthens. However, this uncertainty is much smaller than that for the wettest ensemble member.
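The comparison of the wettest member, the driest member, and the ensemble mean across reference periods can be sketched as follows. The intensities are hypothetical placeholders for 11 ensemble members, not values from Table 4, and all names are assumptions.

```python
from statistics import mean

# Summarise an ensemble by its wettest member, driest member, and mean,
# then measure how each summary varies across reference periods.
# All intensities (mm/h) are hypothetical placeholders.
def summarise(member_intensities):
    """Return the wettest, driest, and mean intensity of an ensemble."""
    return {
        "wettest": max(member_intensities),
        "driest": min(member_intensities),
        "mean": mean(member_intensities),
    }


members = {
    "1960-1989": [18.0, 19.5, 20.0, 21.0, 22.5, 23.0, 24.0, 25.5, 26.0, 27.0, 28.5],
    "1980-2009": [16.0, 19.0, 20.5, 22.0, 23.0, 24.5, 25.0, 27.0, 29.0, 31.0, 33.0],
}

summary = {period: summarise(values) for period, values in members.items()}

# Spread of each statistic across the reference periods.
spread = {
    stat: max(s[stat] for s in summary.values()) - min(s[stat] for s in summary.values())
    for stat in ("wettest", "driest", "mean")
}
print(summary)
print(spread)
```

In this made-up example one reference period yields both the highest wettest-member value and the lowest driest-member value, which is the kind of behaviour that a single summary statistic would hide.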
For the wettest ensemble member, the maximum rainfall intensity for all rainfall durations is shown by the seventh reference period for all frequencies except the first one. For the first frequency, the same period shows the maximum rainfall intensity for the first three durations, while the sixth and fifth reference periods produce the highest rainfall intensity for the durations of 6 h and 24 h respectively.
For the mean of the 11 RCM ensemble members, the reference periods with the maximum rainfall intensity are similar to those that show the maximum percentage of change between the future and current climate, as explained earlier. Thus, they are not repeated here.
Regarding the driest ensemble member, it was interesting to find that the reference period which shows the maximum rainfall intensity for the wettest ensemble member (i.e. 1980-2009) also shows the driest projections. This is the case for durations of less than 24 h and frequencies higher than 5 years, and for durations of less than 1 h and the 2-year frequency. For all the frequencies, the most recent reference period shows the lowest projected rainfall intensity for the last duration (24 h), and the fifth reference period produces the driest projection for a 2-year return period and 6 h duration.
As can be clearly seen, there is a large degree of uncertainty in the projected rainfall intensity of these eight reference periods for short durations and for large return periods for the wettest ensemble member, driest ensemble member, and the mean of the 11 RCM ensemble members. Whilst the reference period that shows the wettest and driest projections is almost the same for specific durations and return periods, it is not the same one for the mean climate scenario. Neither the most recent period nor the most common period adopted by the previous literature shows the maximum values for the wettest and driest ensemble members during all durations and frequencies.
As mentioned earlier, the maximum rainfall projections, especially those produced by the pessimistic climate scenario, may not be used for sewer system design, but they should be considered in the decision-making process when the consequences of such a scenario are high (Willems, 2013).

Conclusions
A set of IDF curves for future climate scenarios was developed and compared with the IDF curves for the current climate. Eight reference periods with a fixed length of 30 years and a moving window of 5 years from the first to the last period were used to bias-correct the 11 ensemble members of the future RCM rainfall data provided by the Met Office Hadley Centre.
The results of the climate model projections that were based on these eight reference periods suggest that the uncertainty in the percentage of change in the projected rainfall intensity compared with that of the current climate varies significantly across the eight reference periods. For each rainfall duration, this uncertainty increases as the return period lengthens, while for each return period, the uncertainty in the change of future projections declines as the rainfall duration increases. However, all the return periods show an increase in rainfall intensity for all durations over seven out of the eight reference periods. In addition, the reference period that shows the maximum change between the future and current climate varies depending on the rainfall duration and the return period. However, the period commonly used for bias correction does not show the most extreme future rainfall intensities compared with the other reference periods.

Table 4. The rainfall intensity for future projections for a 5-year return period, five rainfall durations, eight reference periods, and for the wettest ensemble member, driest ensemble member, and the mean of the 11 RCM ensemble members.
A specific current rainfall intensity is expected to appear in the future, but with a shorter return period. The uncertainty in this return period is considerable, depending on which reference period is used for the mean of the 11 RCM ensemble members, and is much higher for shorter rainfall durations and longer return periods. In addition, the projected rainfall intensity for the wettest ensemble member, the driest ensemble member, and the mean of the 11 RCM ensemble members varies over the eight reference periods. This variation is accompanied by a large uncertainty for short durations and long return periods. Although the reference period that shows the wettest and driest projections is the same for most durations and frequencies, it is not the same for the mean climate scenario.
Overall, our study clearly shows that uncertainty in the future IDF curves results from the use of different reference periods to bias-correct the RCM, and that the effect of the reference period on future climate projections is significant. This is the case when adopting the conventional assumption for bias correction. There is no specific reference period that produces the most extreme projections, because such projections depend upon both the concentration time of the catchment and the bias-corrected extreme rainfall in the reference period. A reference period that falls within a specific multidecadal oscillation may not contain the most extreme values of precipitation. In addition, a fixed length of 30 years for the reference period may not reflect the complete length of an oscillation. Consequently, a reference period may show lower values for the bias-corrected climate projection than other periods.
Therefore, in addition to the recommendation of using an ensemble approach, we recommend the adoption of an ensemble of reference periods to cover the uncertainty that can result from the constant bias assumption during the bias correction depending on just one reference period.
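The recommendation above can be illustrated with a deliberately simple stand-in for bias correction: a multiplicative scaling factor computed separately for each reference period and then applied unchanged to a future value, which mirrors the constant bias assumption. This is not the bias-correction method used in this study. The synthetic series, the future value, and all names are assumptions made purely for illustration.

```python
from statistics import mean

# Simple linear-scaling stand-in for bias correction (NOT the study's
# method): factor = mean(observed) / mean(modelled) over a reference period.
def scaling_factor(observed, modelled, start, end):
    """Ratio of observed to modelled mean over the years start..end."""
    years = range(start, end + 1)
    return mean(observed[y] for y in years) / mean(modelled[y] for y in years)


# Synthetic, purely illustrative annual values (mm) for 1950-2014.
observed = {year: 30.0 + (year % 7) for year in range(1950, 2015)}
modelled = {year: 25.0 + (year % 5) for year in range(1950, 2015)}
future_value = 40.0  # hypothetical uncorrected future model value (mm)

# Eight 30-year reference periods shifted by 5 years.
periods = [(start, start + 29) for start in range(1950, 1986, 5)]

# Apply each period's factor to the same future value (constant bias).
corrected = {
    period: future_value * scaling_factor(observed, modelled, *period)
    for period in periods
}
low, high = min(corrected.values()), max(corrected.values())
print(f"Corrected future value ranges from {low:.2f} to {high:.2f} mm")
```

Reporting the envelope from low to high, rather than a single corrected value, is one way an ensemble of reference periods could be carried through to the IDF curves.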
This study shows that the future IDF curves are highly affected by the choice of the reference period used for the bias correction. Such an important issue has been largely ignored by the engineering community, and this study has shown the importance of including the uncertainty of benchmarking periods in bias-correcting future climate projections. Further research is required to examine the open questions. For example, what will the results be if the change in bias is accounted for in the bias correction method, and if the temperature used for the disaggregation is bias-corrected using different reference periods? When different sources of uncertainty that affect IDF curves are examined, another interesting question is which source produces the most significant variation in the IDF results compared with the other sources. Even more interestingly, how will the results differ for other study areas? It is hoped that this study will stimulate the community to explore such questions further.