Are regional climate models relevant for crop yield prediction in West Africa?

This study assesses the accuracy of state-of-the-art regional climate models for agriculture applications in West Africa. A set of nine regional configurations with eight regional models from the ENSEMBLES project is evaluated. Although they are all based on similar large-scale conditions, the performances of regional models in reproducing the most crucial variables for crop production are extremely variable. This therefore leads to a large dispersion in crop yield prediction when using regional models in a climate/crop modelling system. This dispersion comes from the different physics in each regional model and also the choice of parametrizations for a single regional model. Indeed, two configurations of the same regional model are sometimes more distinct than two different regional models. Promising results are obtained when applying a bias correction technique to climate model outputs. Simulated yields with bias corrected climate variables show much more realistic means and standard deviations. However, such a bias correction technique is not able to improve the reproduction of the year-to-year variations of simulated yields. This study confirms the importance of the multi-model approach for quantifying uncertainties for impact studies and also stresses the benefits of combining both regional and statistical downscaling techniques. Finally, it indicates the urgent need to address the main uncertainties in atmospheric processes controlling the monsoon system and to contribute to the evaluation and improvement of climate and weather forecasting models in that respect.


Introduction
Climate has a strong influence on agriculture which is considered the most weather-dependent of all human activities (Oram 1989). Improved climate prediction therefore offers interesting potential benefits to agriculture. On the one hand, numerous studies have tried to link seasonal prediction outputs from global climate models (GCMs) to crop models, thus translating climate forecasts into seasonal crop predictions (Hansen et al 2006). On the other hand, combining GCMs and crop models also provides a tool to assess the impacts of future climate change on crop production (Jones and Thornton 2003). This is particularly important for Sub-Saharan Africa where climate variability and drought threaten food security.
However, translating GCM outputs into crop yields is difficult because GCM grid boxes are of larger scale than the processes governing yield, involving partitioning of rain among runoff, evaporation, transpiration, drainage and storage at plot scale (Baron et al 2005). Integrated climate-crop modelling systems, therefore, need to handle appropriately the loss of variability caused by the difference between scales. This can potentially be achieved by scaling down GCM outputs by various dynamic, empirical or statistic-dynamic methods (Von Storch 1995). Since impact studies ultimately rely on the accuracy of climate input data (Berg et al 2010), it is therefore crucial to quantify the errors inevitably propagated by such downscaling techniques through the combined climate/crop modelling.
This study aims to assess the impact of such errors on the performance of yield prediction. We take West Africa as a case study because it illustrates well the dependence of crop production on climate variability and because it benefits from a unique multi-model exercise of regional downscaling performed throughout the ENSEMBLES project (van der Linden and Mitchell 2009). We first assess the ability of a set of nine regional model simulations to reproduce the key climate factors for crop production by comparing selected regional model outputs with observations. We then use regional model outputs to drive a crop model and assess the accuracy of yield prediction compared to crop model simulations driven by climate observations. Finally, a simple bias correction (Michelangeli et al 2009) is applied to regional climate model outputs to improve the accuracy of yield prediction.

Study area and climate data
Senegal is located in West Africa, mainly in the drought-prone Sahel region where livelihoods are heavily dependent on traditional rainfed agriculture. The main primary food crops grown in Senegal are millet, rice, corn, and sorghum. Three climate datasets are used in this study.
• A set of 12 meteorological stations uniformly distributed across the country (figure 1) compiled by AGRHYMET Regional Centre is used as the local ground truth. These stations record rainfall and several meteorological parameters at 2 m above the ground such as solar radiation, insolation, surface wind speed, humidity and temperature. To facilitate intercomparisons, we perform a bilinear interpolation of ERA-I data to obtain interpolated values at each of the 12 synoptic stations.

ENSEMBLES regional simulations
A coherent multi-model experiment has been performed throughout the ENSEMBLES project (van der Linden and Mitchell 2009). A set of ten regional configurations with eight regional models is provided over West Africa using the ERA-I reanalysis as lateral boundary conditions over the 1990-2005 period. The use of the same boundary conditions makes it possible to evaluate each regional configuration. Regional climate outputs are freely available at daily timescale at 50 km for the AMMA region (Christensen et al 2009). For our purpose, we use nine regional configurations: CLM-GKSS, HadRM3P-HC, HIRHAM-DMI, HIRHAM-METNO, PROMES-UCLM, RACMO-KNMI, RCA-SMHI, RegCM-ICTP and REMO-MPI.
We adopt an agronomic point of view to assess the accuracy of these regional configurations. Four climate variables are selected for their crucial role in crop production: (i) rainfall, which represents the only water input for rainfed crops, (ii) temperature, which modulates the duration of crop growth cycles, (iii) solar radiation, which limits biomass and crop yields, and (iv) the crop potential evapotranspiration (PET), which refers to the evaporative demand from crops under optimal conditions. The realism of these four variables in the regional configurations is assessed by comparing simulations with observed data at both seasonal and interannual timescales. For these comparisons, we perform a bilinear interpolation of each gridded simulation to obtain interpolated values at each of the 12 synoptic stations. Daily minimum and maximum air temperature, mean wind speed, mean air relative humidity and solar radiation at 2 m are used to compute PET from the Penman-Monteith equation (Allen et al 1998). These interpolated simulated climate variables are then used to drive a crop model.
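As an illustration of the interpolation step, the following minimal sketch shows how a gridded value can be bilinearly interpolated to a station location. It assumes a regular latitude-longitude grid with ascending coordinates; the function name `bilinear_interpolate`, the toy grid and the rainfall values are hypothetical and are not taken from the ENSEMBLES data or from the processing chain actually used in the study.

```python
from bisect import bisect_right


def bilinear_interpolate(lons, lats, field, lon, lat):
    """Interpolate a gridded field to one point.

    lons, lats: ascending 1-D grid coordinates (degrees).
    field[j][i]: value at (lats[j], lons[i]).
    """
    if not (lons[0] <= lon <= lons[-1] and lats[0] <= lat <= lats[-1]):
        raise ValueError("station lies outside the grid")
    # Lower-left corner of the enclosing cell, clamped so i+1 and j+1 exist.
    i = min(bisect_right(lons, lon) - 1, len(lons) - 2)
    j = min(bisect_right(lats, lat) - 1, len(lats) - 2)
    # Fractional position of the station inside the cell (between 0 and 1).
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    # Interpolate along longitude on both latitude rows, then along latitude.
    south = (1 - tx) * field[j][i] + tx * field[j][i + 1]
    north = (1 - tx) * field[j + 1][i] + tx * field[j + 1][i + 1]
    return (1 - ty) * south + ty * north


# Toy grid with illustrative values only (not model output).
grid_lons = [-17.0, -16.5, -16.0]
grid_lats = [14.0, 14.5, 15.0]
rain = [[1.0, 2.0, 3.0],
        [2.0, 3.0, 4.0],
        [3.0, 4.0, 5.0]]
station_value = bilinear_interpolate(grid_lons, grid_lats, rain, -16.75, 14.25)
```

In practice the same call would be repeated for each station, each day and each variable before computing PET or driving the crop model.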

Correcting regional climate biases
In order to correct the statistical distributions of the RCM simulations and to make them as close as possible to those of the observations, the cumulative distribution function-transform (CDF-t) method (Michelangeli et al 2009) has been applied. This type of approach, initially developed in a statistical downscaling context, aims at correcting the cumulative distribution function (CDF) G of a random variable X (here, temperature or precipitation, etc) given at a relatively low resolution (e.g., from a GCM, or, in our case, from RCM simulations), into the CDF F of the equivalent variable at a much smaller scale (i.e., higher resolution) through a mathematical transformation T: T(G(x)) = F(x), for any realization x of X. Once F, the local-scale CDF, is defined, a 'quantile-mapping' approach (Haddad and Rosenfeld 1997) can be applied between large- and local-scale CDFs to generate time series. Because it works directly on statistical distributions, CDF-t can also be used as a 'bias correction' method. In the present application, the required CDFs are estimated nonparametrically without any season discrimination, and a cross-validation technique is used to generate simulations over the 11-year period 1990-2000 that are relatively independent in time from the calibration data. CDF-t is calibrated on ten years and applied to the 11th year to provide simulations. The 10-year and 1-year periods are rotated in order to cover the whole 1990-2000 period. CDF-t is applied to each regional climate output used to drive the crop model described below.
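The calibration and cross-validation logic can be sketched as follows. This is a simplified stand-in and not the CDF-t implementation of Michelangeli et al (2009): it applies plain empirical quantile mapping, F^-1(G(x)), which corresponds to the CDF-t idea only under the assumption that the model CDF is the same in the calibration and application periods. The function names and the toy values are hypothetical.

```python
from bisect import bisect_right


def quantile_map(model_cal, obs_cal, values):
    """Map model values onto the observed distribution: x -> F^-1(G(x)).

    G is the empirical CDF of the model over the calibration period and
    F the empirical CDF of the observations over the same period.
    """
    model_sorted = sorted(model_cal)
    obs_sorted = sorted(obs_cal)
    n_mod, n_obs = len(model_sorted), len(obs_sorted)
    corrected = []
    for x in values:
        # G(x) = count / n_mod: share of calibration model values <= x.
        count = bisect_right(model_sorted, x)
        # Index of the observed quantile at the same probability level,
        # using integer ceiling division to avoid floating-point rounding.
        k = (count * n_obs + n_mod - 1) // n_mod - 1
        corrected.append(obs_sorted[max(0, min(n_obs - 1, k))])
    return corrected


def leave_one_year_out(model_by_year, obs_by_year):
    """Correct each year using CDFs calibrated on all the other years."""
    corrected = {}
    for target in model_by_year:
        model_cal = [v for y, vals in model_by_year.items()
                     if y != target for v in vals]
        obs_cal = [v for y, vals in obs_by_year.items()
                   if y != target for v in vals]
        corrected[target] = quantile_map(model_cal, obs_cal,
                                         model_by_year[target])
    return corrected


# Toy example: a model that simulates half of the observed values.
obs_by_year = {1990: [1, 2, 3], 1991: [2, 3, 4], 1992: [3, 4, 5]}
model_by_year = {y: [0.5 * v for v in vals] for y, vals in obs_by_year.items()}
corrected_by_year = leave_one_year_out(model_by_year, obs_by_year)
```

Because the mapping only rescales values according to their rank in the model distribution, it leaves the ordering of simulated values, and therefore the timing of wet and dry years, unchanged.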

Crop model SARRA-H
The crop model used in this study is SARRA-H version V.2, which is particularly suited for the analysis of climate impacts on cereal growth and yield in dry tropical environments (Baron et al 2005) and is based on experimental field data in Mali and parametrization procedures. The photoperiod sensitivity and the duration of the crop's various growth phases are key to the agroecological adaptation of dryland crops to semi-arid environments. Compared to fixed-duration cultivars, photoperiod-sensitive cultivars flowering at the right time have the advantage of a flexible crop cycle length that can be adapted to the length of the rainy season of any given environment. The parametrization of photoperiod sensitivity in the SARRA-H crop model makes it possible to modulate the crop cycle length from North Senegal (short rainy season and short crop cycle) to South Senegal (long rainy season and long crop cycle).
Two kinds of simulations are performed:
• First, we use the outputs of the nine regional configurations to drive the crop model at each of the 12 stations in Senegal over the 1990-2000 period. The simulated yield is then spatially averaged to get one value per year over Senegal. Yield predictions are compared to those obtained from a control simulation where the crop model is driven by observed climate data over the same period.
• Second, we design a simulation protocol able to quantify the error in the yield prediction induced by each of the individual climate inputs used to drive the crop model. We run the crop model with observed climate data except for rainfall, which comes from the regional models. The comparison between the yield prediction from this simulation and the one obtained from the control simulation with only observed climate data illustrates the error induced by the representation of rainfall in the regional models. A similar protocol is applied in turn to solar radiation, temperature and PET.
These simulations are performed both with raw and bias corrected climate model outputs. Note that the objective of the simulation protocol is not to estimate realistically past observed variations of crop yield but to assess the sensitivity of simulated crop yield to the regional climate models biases.
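The second protocol, in which one climate input at a time is taken from a regional model, can be sketched as below. The sketch is only illustrative: `toy_crop_model` is a hypothetical linear placeholder and not SARRA-H, the variable names are assumptions, and scalar seasonal values are used where the actual experiment uses daily series at each station.

```python
VARIABLES = ("rainfall", "solar_radiation", "temperature", "pet")


def substitution_experiments(observed, modelled):
    """Build one driver set per variable: observations for every input
    except the single variable taken from the regional model."""
    experiments = {}
    for name in VARIABLES:
        drivers = dict(observed)        # start from the control drivers
        drivers[name] = modelled[name]  # swap in one modelled variable
        experiments[name] = drivers
    return experiments


def yield_error_by_variable(crop_model, observed, modelled):
    """Yield difference from the control run for each substituted variable."""
    control = crop_model(observed)
    experiments = substitution_experiments(observed, modelled)
    return {name: crop_model(drivers) - control
            for name, drivers in experiments.items()}


def toy_crop_model(drivers):
    # Hypothetical placeholder response, NOT the SARRA-H crop model.
    return (1.0 * drivers["rainfall"]
            + 10.0 * drivers["solar_radiation"]
            - 2.0 * drivers["pet"])


observed = {"rainfall": 500.0, "solar_radiation": 20.0,
            "temperature": 28.0, "pet": 6.0}
modelled = {"rainfall": 400.0, "solar_radiation": 22.0,
            "temperature": 27.0, "pet": 7.0}
errors = yield_error_by_variable(toy_crop_model, observed, modelled)
```

Each entry of `errors` isolates the contribution of one modelled variable, which is how the per-variable yield biases discussed in the results can be read.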

Results
3.1. Regional simulations assessment
3.1.1. Spatial pattern and seasonal cycle.
Figures 1 and 2 compare, respectively, the mean and the seasonal cycle of the four crucial climate variables for crop production in observations and in regional simulations. Results in figure 2 are averaged over the whole of Senegal and over the 1990-2000 period. Although there is a large dispersion between regional models, the seasonal cycle of the West African monsoon is generally well reproduced by regional simulations, with the rainy season in summer characterized by high values of rainfall and a minimum in mean temperature and PET. Rainfall reproduced by ERA-I tends to be too weak in summer, as it is for five of the nine regional simulations (CLM-GKSS, PROMES-UCLM, RCA-SMHI, HadRM3P-HC and HIRHAM-METNO). Two regional configurations (RACMO-KNMI, RegCM-ICTP) correct this dry bias and two others (HIRHAM-DMI, REMO-MPI) over-estimate summer rainfall. The same regional model, HIRHAM, gives opposite rainfall responses in the DMI configuration (too much rainfall) and in the METNO configuration (not enough rainfall), which points out the importance of parametrization in regional modelling. The bias is particularly strong in the wettest part of the country in South Senegal (figure 1). Solar radiation shows the highest dispersion between regional simulations, which differ from observations by more than one standard deviation (grey area in figure 2). It tends to be over-estimated in the early season (May-June) in all simulations and for the rest of the year in all models except REMO-MPI and CLM-GKSS. The biases are particularly strong in northern Senegal where the radiation is the highest (figure 1). Mean temperature is accurately simulated by regional models except for RACMO-KNMI and RegCM-ICTP, which show a strong cold bias in summer. All regional models as well as ERA-I over-estimate PET, likely due to the positive bias of solar radiation (figure 1).
As shown by Baron et al (2005), it is very likely that important biases in monthly rainfall and/or in solar radiation can strongly affect crop production. In addition, the observed over-estimation of PET increases the amount of water that could be evaporated and transpired by the crop.

3.1.2. Interannual variability.
Figure 3 displays a Taylor diagram (Taylor 2001) for each of the four crucial variables for crop production. This diagram provides the ratio of standard deviations as a radial distance and the correlation with observations as an angle in the polar plot for the nine regional simulations as well as the ERA-I data, averaged over Senegal and from 1 May to 30 November over the 1990-2000 period. Note that correlations are computed after removing linear trends from each time series. Mean temperature (figure 3(c)) seems to be the best feature of all regional simulations with significant correlations for four of the nine regional configurations (RACMO-KNMI, HIRHAM-DMI, HIRHAM-METNO, RCA-SMHI). Rainfall variability is over-estimated in most regional simulations and only ERA-I shows significant correlation with observed data (figure 3(a)). The accuracy of regional simulations is the weakest for solar radiation and PET (figures 3(b) and (d)). This contrast is likely due to the importance of horizontal convergence and of the meridional temperature gradient, which strongly constrain rainfall in West Africa and which are generally well represented by climate models, whereas this kind of constraint does not exist for radiative fluxes. Indeed, only one of the nine regional configurations shows significant correlations for solar radiation (REMO-MPI) and none does for PET. The standard deviation of solar radiation tends to be under-estimated while most regional simulations over-estimate the PET variability.
From this assessment, it is difficult to identify the most accurate regional simulation. Indeed, no regional simulation performs well for all four variables. For instance, the regional model which best simulates the interannual variability of solar radiation (REMO-MPI) is also the one which performs the worst for rainfall variability. Finally, figure 2 shows that ERA-I performs far better than regional models with realistic amplitude of standard deviation for the four climate variables and significant correlation with observed rainfall, mean temperature and PET.
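The two quantities displayed in a Taylor diagram can be computed as in the minimal sketch below. It assumes evenly spaced annual values, removes a least-squares linear trend before computing the correlation (as stated in the text) and, as an assumption of this sketch, computes the standard deviation ratio on the original series. Function names and data are hypothetical, and no significance test is included.

```python
import math


def detrend(series):
    """Remove the least-squares linear trend from an evenly spaced series."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def std(series):
    """Population standard deviation."""
    mean = sum(series) / len(series)
    return math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))


def taylor_statistics(simulated, observed):
    """Return (standard deviation ratio, correlation of detrended series)."""
    ratio = std(simulated) / std(observed)
    sim, obs = detrend(simulated), detrend(observed)
    # Detrended residuals have zero mean, so the covariance is the mean product.
    cov = sum(s * o for s, o in zip(sim, obs)) / len(sim)
    return ratio, cov / (std(sim) * std(obs))


# Toy annual series (illustrative values only).
obs_series = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
sim_series = [2.0 * v for v in obs_series]
ratio, correlation = taylor_statistics(sim_series, obs_series)
```

A simulation plotted at a ratio of 1 and a correlation of 1 would coincide with the observations; here the toy simulation has the right timing but twice the observed variability.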

Crop simulations
3.2.1. Predicted yield using raw regional climate simulations.
Figure 4 compares simulated yields in Senegal using climate data and regional climate simulations. A large dispersion is observed when using regional models to drive the crop model, with simulated yield ranging from near 0 kg/ha (CLM-GKSS) to almost 630 kg/ha (RegCM-ICTP) for the 1990-2000 average, while the control simulation with observed climate data gives 547 kg/ha (figures 4(a) and (c)). The largest biases are observed in the southern part of the country (figure 4(b)), likely due to the rainfall and solar radiation deficits in regional simulations (figure 1). We can divide the regional configurations into two groups: a group of four which accurately simulates the average yield (figure 4(c)) with a bias of less than 100 kg/ha (RACMO-KNMI, RegCM-ICTP, HadRM3P-HC and HIRHAM-DMI, figure 4(d)) and a second group of five (CLM-GKSS, HIRHAM-METNO, RCA-SMHI, PROMES-UCLM, REMO-MPI) which under-estimates, sometimes strongly, the mean yield. The same model, HIRHAM, run by DMI and METNO falls into both categories. The interannual variability of crop yield simulated from observed climate data and from regional models differs strongly although all regional models have the same ERA-I boundary conditions. The simulated yield using ERA-I appears to be more realistic than most of those obtained with regional models, both for interannual variability and mean yield.
Figure 2. Seasonal cycles of rainfall (a), solar radiation (b), mean temperature (c) and potential evapotranspiration (d) for observations, ERA-I and regional configurations. Seasonal cycles are computed over the 1990-2000 period. The grey area is ± standard deviation computed from observations.
Major biases in simulated mean yield come from the representation of rainfall and solar radiation in regional simulations (figure 5(a)). For some models, the bias induced by one of these variables is greater than twice the standard deviation of the control simulation. In some cases, biases can be additive: the very low yields of CLM-GKSS (figure 4(a)) are related to an abnormally dry rainy season and a deficit of solar radiation (figures 2(a) and (b)) which both induce a yield loss of more than 250 kg/ha (figure 4(a)). Conversely, errors in the representation of these two variables can compensate for each other: for instance, the dry bias of ERA-I (figure 2(a)) is slightly compensated by an over-estimation of solar radiation. The systematic over-estimation of PET in regional simulations induces a yield loss whose amplitude is lower than 100 kg/ha for most models. The representation of temperature in the regional models has a low impact on yield prediction. Figure 5(b) shows that the interannual variability of yield prediction is essentially driven by rainfall. The correlation between crop yields simulated with observed rainfall and those simulated with modelled rainfall can vary from near zero (REMO-MPI) to 0.8 (RegCM-ICTP) according to the regional configuration. This correlation is less sensitive to other variables, except for solar radiation in the CLM-GKSS model.
Figure 3. ... (1990-2000 period) with model outputs averaged over Senegal and from 1 May-30 November for rainfall (a), solar radiation (b), mean temperature (c) and potential evapotranspiration (d).

3.2.2. Predicted yield using bias corrected regional climate simulations.
Figure 6 is similar to figure 4 but for crop simulations using bias corrected regional climate drivers. The application of CDF-t largely improves the accuracy of crop yield predictions. The mean crop yields over the 1990-2000 period obtained with regional simulations are very close to each other and are now close to the mean yield simulated with observed climate data (figures 6(a) and (c)). The mean biases are all weaker than 100 kg/ha (figure 6(d)). The coefficient of variation of simulated yields is also much more realistic when applying the bias correction technique to regional climate simulations (figure 6(e)). Figure 7(a) shows that CDF-t has reduced the biases induced by all four climate variables, but the representation of solar radiation and rainfall remains the major source of biases in crop simulations. Although CDF-t has successfully reduced the errors in mean and variance of regional climate outputs, the correlation between crop yields simulated with observed data and those simulated with regional models has not been improved. Indeed, figure 7(b) is rather similar to figure 5(b), with very low correlations induced by an inaccurate representation of interannual variability of rainfall and solar radiation.

Conclusions
This study assesses the accuracy of state-of-the-art regional models for agriculture applications in West Africa. Although very few studies on climate impact document the error induced by downscaling, we found that for similar large-scale conditions, e.g., ERA-I reanalyses, regional modelling can introduce a strong dispersion in crop yield simulations. This dispersion comes from the different physics in each regional model but also from the choice of parametrizations for a single regional model. Indeed, we found that two configurations of the same regional model (HIRHAM-DMI and HIRHAM-METNO) are sometimes more distinct than two different regional models.
Even if some important climate features such as temperatures and rainfall are well represented by regional models, which sometimes correct biases of the ERA-I large-scale boundary conditions, biases in the representation of rainfall and of solar radiation, both crucial for crop development, lead to unrealistic yields in half of the regional configurations. Four regional configurations perform relatively well in simulating the mean yield (RACMO-KNMI, RegCM-ICTP, HadRM3P-HC and HIRHAM-DMI) but have difficulty accurately simulating the interannual variability of predicted yields. The latter problem limits their potential use in the context of downscaling GCM seasonal forecasts. The application of the bias correction technique CDF-t shows promising results in drastically reducing the biases in the mean simulated yield but does not improve the prediction of year-to-year variability of crop yields.
This study confirms the importance of the multi-model approach in quantifying uncertainties for impact studies adopted throughout the ENSEMBLES project and in the ongoing coordinated regional climate downscaling experiment (CORDEX) which fosters an international coordinated effort to produce improved multi-model high resolution climate change information over regions worldwide for input to impact/adaptation work and to the IPCC AR5. Results also stress the benefits of combining both regional and statistical downscaling techniques. Finally, this study points out the urgent need to address the main uncertainties in atmospheric processes controlling the monsoon system and to contribute to the evaluation and improvement of climate and weather forecast models in that respect. The African Monsoon Multidisciplinary Analysis-Models Intercomparison Project (AMMA-MIP) provides a highly relevant framework to address those issues (Hourdin et al 2009).