Advances in Geosciences Statistical bias correction of global climate projections – consequences for large scale modeling of flood flows

General circulation models (GCMs) project an increasing frequency and intensity of heavy rainfall events due to global climate change. This rather holds true for regions that are even expected to experience an overall decrease in average annual precipitation. Consequently, this may be attended by an increasing frequency and magnitude of flood events. However, time series of GCMs show a bias in simulating 20th century precipitation and temperature fields and, therefore, cannot directly be used to force hydrological models in order to assess the impact of the projected climate change on certain components of the hydrological cycle. For a posteriori correction, the so-called delta change approach is widely-used which adds the 30-year monthly differences for temperature or ratios for precipitation of the GCM data to each month of a historic climate data set. As the variability of the climate variables in the scenario period is not transferred, this approach is especially questionable if discharge extremes are to be analyzed. In order to preserve the variability given by the GCM, methods of statistical bias correction are applied. This study aims to investigate the impact of two methods of bias correction, the delta change approach and a statistical bias correction, on the large scale modeling of flood discharges, using the example of 25 macroscale catchments in Europe. The discharge simulation is carried out with the global integrated model WaterGAP3 (Water – Global Assessment and Prognosis). Results show that the two bias correction methods lead to distinctively different trends in future flood flows.


Introduction
In the past 100 years, a rise of global mean near-surface temperature of 0.74 • C ± 0.18 • C was observed which can be attributed to the anthropogenic emission of green house gases.For the 21st century, a further increase of 1.8-4.0• C, depending on the considered emission scenario, is expected (IPCC, 2007).As the saturation vapor pressure of water vapor is positivly correlated with temperature, the temperature rise will lead to an increased water content of the atmosphere.Consequently, based on projections of general circulation models (GCMs), an intensification of the global hydrological cycle accombined by an increasing frequency and intensity of heavy rainfall events is assumed.This rather holds true for regions that are even expected to experience an overall decrease in average annual precipitation (Meehl et al., 2007;Allen and Ingram, 2002;Hegerl et al., 2007).Furthermore, this may be attended by an increasing frequency and magnitude of flood events.
However, time series of GCMs show a bias in simulating 20th century precipitation and temperature fields.Thus, in order to assess the impact of the projected climate change on certain components of the hydrological cycle, they cannot directly be used to force large scale hydrological models or land surface models but a posteriori bias correction has to be performend.In recent years, several methods of bias correction have been developed.The most straight forward and widely used delta change approach (e.g., Lehner et al., 2006;Lettenmaier et al., 1999;Graham, 2004) adds the mean monthly change signal between GCM scenario and baseline period to an observed climate record.As the variability of the climate variables in the scenario period, as given by the GCM, is not transferred, this approach is especially questionable if discharge extremes, both in magnitude and return period, are to be analyzed.In contrast, methods of statistical bias correction aim to correct for all statistical moments of the intensity distribution (Piani et al., 2008(Piani et al., , 2010)).
The objective of this study is to investigate how the abovementioned bias correction methods impact the simulation of future flood flows.
Published by Copernicus Publications on behalf of the European Geosciences Union. 2 Data and methods

WaterGAP3 model
The WaterGAP model (Water -Global Assessment and Prognosis) has been developed at the Center for Environmental Systems Research (CESR) with the aim of providing a basis both for an assessment of the current state of water resources and water use, and for gaining an integrated perspective of impacts of global change on the water sector (Alcamo et al., 2003;Döll et al., 2003).WaterGAP consists of two main components: a global water use model and a global hydrology model.The aim of the hydrological model is to simulate the characteristic macro-scale behaviour of the terrestrial water cycle in order to estimate water availability.Based on the time series of climatic data, the hydrological model calculates the daily water balance for each grid cell, taking into account physiographic characteristics like soil type, vegetation, slope and aquifer type.Runoff generated on the grid cells is routed to the catchment outlet on the basis of a global drainage direction map (Lehner et al., 2008), taking into account the extent and hydrological influence of lakes, reservoirs, dams, and wetlands.The model is calibrated by adjusting one free parameter, γ , which controls the fraction of total runoff from effective precipitation in order to minimize the error in simulated longterm annual discharge.
For the current version, WaterGAP3, the spatial resolution has been enhanced, from 30 by 30 arc min (longitude and latitude) to 5 by 5 arc min gridded scale (approx.6 × 9 km in Central Europe).Partially enabled by the enhanced spatial resolution, the process representations of runoff formation and runoff concentration in the hydrological model have been substantially improved: (1) the snow routine has been revised by modelling snow dynamics on sub-grid scale (approx.0.4 × 0.4 km) (Schulze et al., 2005); (2) a module has been added for a dynamic representation of permafrost occurrence, which directly influences groundwater recharge (Aus der Beek and Teichert, 2008); (3) in order to distinguish between mountainous rivers with steep river bed slopes and rivers in lower regions a variable flow velocity algorithm has been implemented (Schulze and Döll, 2004); (4) the river length has been enhanced by applying an individual meandering factor for each grid cell derived from a high-resolution drainage direction map; (5) an approach has been developed and applied which utilizes Köppen regions to estimate potential evapotranspiration and ground water recharge (Weiß, 2009).As a last point, dams from the Global Reservoir and Dam Database (GRanD) have been implemented into Wa-terGAP3 in order to consider anthropogenic flow regulation.Thereby, all dams with a storage capacity higher than 0.1 km 3 have been taken into account and a management scheme  2006) has been applied (Döll and Fiedler, 2009).
The aforementioned model revisions have been a prerequisite for the application of WaterGAP3 to analyze, besides long-term water availability, discharge extremes.The model's general ability to simulate flood discharges has been evaluated by Verzano (2009).

Study catchments
The analysis was performed for 25 gauging stations listed in the Global Runoff Data Centre (GRDC) station catalog (cf.Table 1) which were chosen based on the following criteria: (1) The routing area had to be greater than 20 000 km 2 , (2) the station had to feature a continuous daily discharge record for the reference period 1971-2000, and (3) in the case of more than one gauging station within a basin, the routing area between two subsequent stations had to be greater than 10 000 km 2 .Although an even distribution of the study basins over the whole European continent was intented, Fig. 1 clearly shows that most of the catchments are concentrated in Central and Northern Europe.Catchments situated in Southern or Eastern Europe did not fulfil the aforementioned selection criteria for two main reasons: (1) especially in Southern Europe, daily discharge time series provided by the GRDC data base were distinctively shorter than 30 years or uncontinuous, and (2) the time series of most of the Eastern European gauging stations end around 1990.

Baseline climate
In the baseline period 1971-2000, the WaterGAP3 hydrological model was forced with the WATCH Forcing Data (WFD) (Weedon et al., 2010).This global data set (land areas only) features nine key near-surface meteorological variables for the period 1958-2001 at a sub-daily resolution (3-6 h).The WFD were derived from the ERA-40 reanalysis product by interpolation to 0.5 • grid scale and subsequent elevation correction.A bias-correction was performed based on monthly observational data from the CRU TS2.1 (Mitchell and Jones, 2005) and the GPCCv4 (Rudolf and Schneider, 2005) data sets.
Out of nine variables available from the data set, five were employed as input for the WaterGAP3 model: 2 m air temperature, downwards long-wave radiation flux, downwards short-wave radiation flux, rainfall rate and snowfall rate (summed up to total precipitation as the fractions of rain and snow are calculated internally).All variables were aggregated to daily time steps and downscaled to 5 arc min spatial resolution.

Climate projections
In the scenario period 2041-2070, WaterGAP3 was forced with climate projections of three state-of-the-art GCMs: IPSL-CM4 (Institut Pierre Simon Laplace, France), ECHAM5 (Max Planck Institute for Meteorology, Hamburg) and CNRM-CM3 (Centre National de Recherches Meteorologiques, France).For each GCM, projections for two of the IPCC AR4 emission scenarios, B1 and A2, were available.For each combination of GCM and emission scenario, the variables precipitation and temperature were subject to two methods of bias correction, a delta change approach and a statistical bias correction procedure, which will be described in the following sections.

Delta Change time series
The Delta Change approach is based upon transferring the mean monthly change signal between GCM control and GCM scenario period to an observed time series.In the present study, two 30-year windows were used, 1971-2000 as baseline and 2041-2070 as scenario period, according to the 30-year definition of the climate normal.It has to be highlighted that the change signals between GCM baseline and scenario period were derived from mean monthly values but were applied to the daily WFD time series.We chose this particular approach in order to avoid considerable variability in the day-to-day change signals which would have occured when using daily change factors.

Statistical bias corrected time series
The statistical bias corrected GCM time series were generated within the EU-funded project WATCH -WATer and Global CHange.The applied method of bias correction, also referred to as quantile mapping or histogram equilization, was specified and tested by Piani et al. (2010Piani et al. ( , 2008)).It is based upon associating the modelled variable x mod of the GCM control run with the observed variable x obs through their cumulative distribution functions (CDFs) such that CDF mod (x mod ) = CDF obs (x obs ).A transfer function between x obs and x mod is fitted which subsequently can be applied to the future GCM time series.

Methodological approach
WaterGAP3 simulations were carried out with similar model setup except for the climate time series used to force the model.As the basis for any further analysis, four model runs were performed for the baseline period 1971-2000: one using the WFD and three employing the bias corrected GCM control run time series.Subsequently, 12 simulations were carried out for the scenario period 2041-2070 according to all possible combinations of GCM, emission scenario and method of bias correction (3 GCMs × 2 emission scenarios × 2 methods of bias correction).
From the resultant 30-year time series of daily discharges, four flood regime indices, the mean annual flood (Q mean ), the median annual flood (Q med ), the 25-year flood, and the 50-year flood, were derived by applying the following procedure: 1. Determination of the annual maximum series (AMS) In a first step, the flood indices derived from the simulated hydrographs of the reference period were validated against the results from the observed discharge series.In order to analyze changes in the flood indices between baseline and scenario, we employed relativ change factors (CFs) which will be explained using the example of Q med , but were calculated for all aforementioned flood indices.The CF in Q med is given by the ratio between the simulated Q med of the scenario period and the simulated Q med of the baseline period.Values greater unity indicate an increase in Q med in the scenario period whereas values smaller unity indicate a decrease in Q med compared to the baseline period.For all delta change time series, the calculation of CFs was based on the results from the WFD simulation.In the case of the statistical bias corrected time series, the CFs were derived from the results of the GCM scenario simulation and the respective GCM control simulation.

Model validation
In order to evaluate the general ability of the WaterGAP3 hydrology model to reproduce the flood regime of the study catchments, the flood indices derived from the simulated daily discharge series of the baseline period were validated against the results calculated from the observed discharge series.Fig. 2 shows that an overall good agreement between the modelled and the observed values could be achieved for all four flood indices considered in this study.This conclusion is confirmed by the coefficients of determination which range between 0.971 and 0.985.However, the gradients of the regression lines for Q med and Q mean of 1.13 and 1.10, respectively, indicate that the model tends to overestimate the more frequent flood events.In contrast, for the events with higher return periods, the 25-year and the 50-year flood, no systematic over-or underestimation can be determined, which is confirmed by gradients of 1.04 and 1.03, respectively.In consideration of the fact that WaterGAP3 was calibrated with only one free parameter against the error in longterm annual discharge, i.e. no flood calibration scheme was applied, the model performend reasonably well in simulating flood flows.

Future flood flows
In Fig. 3 the change factors derived from the delta change time series are plotted against the change factors resulting from the statistical bias corrected time series, seperately diagrammed for each GCM and flood index.
The delta change approach leads for both CNRM-CM3 and IPSL-CM4 to constant or decreasing discharges for all four flood indices and study catchments.In particular ISPL-CM4 is characterized by significantly decreasing discharges for the more frequent events Q mean and Q med .The CFs derived from the statistical bias corrected time series, on the other hand, show a wider spread with both increasing and decreasing flood discharges for all four indices.Considerable increases can be observed in particular for the 25-year and the 50-year flood discharges in some basins.
In the case of ECHAM5, the statistical bias correction results in constant or increasing flood discharges for all four indices.Especially Q med is characterized by significant increases for all study catchments.Unlike IPSL-CM4 and CNRM-CM3, for ECHAM5 a marked difference between the two emission scenarios is visible.While the delta change approach results in constant to decreasing flood discharges for the B1 scenario, constant to increasing values were found for the A2 scenario.Furthermore, the ECHAM5-A2 deltachange series shows for a group of catchments considerable increases in the 25-year and 50-year flood levels exceeding the CFs derived from the statistical bias-corrected time series.This may be attributed to the scaling procedure applied to generate the delta change precipitation time series which affects not only the mean but also the variance of the distribution.As each value of the observed time series is multiplied by a scaling factor (cf. Eq. 2), the variance of the scenario time series will equate to the variance of the observed record multiplied by the squared scaling factor.Given an increase in monthly precipitation, extreme values present in the observations are further amplified.
As already indicated by Fig. 3, besides the method of bias correction, considerable uncertainty is introduced by the respective GCM chosen to force the hydrological model.
Figure 4 shows the range of the Q mean and the 50-year flood level CFs between the emission scenarios of each GCM.For most of the study catchments the CF range between the emission scenarios, B1 and A2, is very small within one GCM.On the other hand, the CFs show considerable variability between the different GCMs.Furthermore, in a lot of cases an opposed direction of change can be observed.The ECHAM5 results suggest an increase of Q mean , exceeding 10 %, for 24 out of 25 study catchments.In contrast, the IPSL-CM4 simulations suggest constant (±10 %) or decreasing Q mean levels for 17 catchments based on the A2 simulation and 22 catchments in the B1 simulation.In all study catchments an increase in Q mean can be observed for at least one GCM/emission scenario combination.At the same time 10 catchments show a decrease in Q mean of more than 10 % for at least one GCM/emission scenario combination.
Compared to the Q mean results, the 50-year flood CFs show a larger range within the single GCMs.Yet, as already pointed out for Q mean , in most basins the CFs differ more significantly between the different GCMs than between the two emission scenarios.

Summary and conclusions
This paper investigated the impact of two bias correction methods for GCM time series, the delta change approach and a statistical bias correction, on the simulation of flood discharges.In summary, the results showed that these two methods of bias correction lead to distinctively different trends in future flood discharges.Especially for CNRM-CM3 and IPSL-CM4 the change factors derived from the delta change time series indicated declining values for all analyzed flood indices, following the general precipitation trend in the scenario period.These results highlight a major drawback of the delta change approach: as mean monthly changes between GCM scenario period and baseline period are added to an observed time series, the method does not account for increases or decreases in the variability of the climate variables in the scenario period as given by the GCM.At the same time the results from the ECHAM5-A2 simulations suggest that the delta change approach may also generate an enhanced variability that is not given by the GCM.In addition, the study demonstrated that the selection of a particular GCM to force the hydrological model poses a major source of uncertainty in assessing future trends in flood flows.The analyzed flood indices differed more significantly between the applied GCMs than concerning the respective emission scenarios.For the majority of the investigated basins the different GCMs resulted in opposed directions of change in future flood discharges.

Fig. 1 .
Fig. 1.Geographical distribution of the GRDC gauging stations used for analysis, the corresponding catchment areas are highlighted in yellow.
The scenario daily temperature (T scen d,m ) was derived by adding the absolute monthly change signals to the observed time series, here the WFD 1971-2000.In the case of precipitation, the observed data were scaled with the relative change signals given by the GCM: d,m and P obs d,m are observed daily temperature and precipitation, T GCMcon m and P GCMcon m are mean monthly GCM temperature and precipitation of the control period, and T GCMscen m and P GCMscen m are mean monthly GCM temperature and precipitation of the scenario period.

2.
Calculation of Q mean and Q med from the AMS 3. Fitting of a Pearson type III extreme value distribution according to the guidelines of the US Water Resources Council (1981), as described in Maniak (2005) 4. Estimation of the 25-year and the 50-year flood level from the fitted distribution

Fig. 3 .
Fig. 3. Scatter plots between the CFs derived from the statistical bias corrected time series (x-axis) and the CFs resulting from the delta change time series (y-axis), separately for all indices and GCMs.In each plot, the data points of the A2 emission scenario are depicted in red and the data points of the B1 emission scenario are depicted in blue.The solid red line represents the 1:1 line.

Fig. 4 .
Fig. 4. Range between the two emission scenarios for the Q mean CFs (top) and the 50-year flood level CFs (bottom) for the individual study catchments.

Table 1 .
GRDC gauging stations used for analysis.