A Statistical Bias Correction Tool for Generating Climate Change Scenarios in Indonesia based on CMIP5 Datasets

Information about future climate scenarios is essential to climate change studies, where it serves as a basis for adaptation and mitigation work. Delivering future climate scenarios for a specific region requires baseline and projection data from the outputs of global climate models (GCMs). Because of their coarse resolution, however, these data have to be downscaled and bias-corrected to obtain scenario data with better spatial resolution that match the characteristics of the observed data. Generating such downscaled data is difficult for scientists who lack the specific background, experience and skill needed to handle complex GCM outputs. It is therefore necessary to develop a tool that simplifies the downscaling process and helps scientists, especially in Indonesia, generate future climate scenario data for their climate change-related studies. In this paper, we introduce a tool called “Statistical Bias Correction for Climate Scenarios (SiBiaS)”. The tool is specially designed to facilitate the use of CMIP5 GCM outputs and to perform their statistical bias correction relative to reference data from observations. It was prepared to support capacity building in climate modeling in Indonesia as part of the Indonesia 3rd National Communication (TNC) project activities.


Introduction
Developing climate change scenarios requires baseline and projection data from the outputs of global climate models (GCMs). The data are gridded and usually stored in specific file formats, such as netCDF or GRIB, with which few people are familiar. This can be a considerable constraint for researchers who would like to use the data for climate change-related studies but lack the experience or skill to handle them. The outputs of GCMs from the Coupled Model Intercomparison Project Phase 5 (CMIP5) are the main data prepared for climate change studies, particularly for the assessment of future climate scenarios. The Intergovernmental Panel on Climate Change (IPCC) used these datasets in the Fifth Assessment Report (AR5) published in 2013 [1]. The data are not only complex in terms of their huge file size and uncommon file format, but also commonly contain biases. These biases need to be statistically corrected before the data can be used in further climate change-related studies. Statistical bias correction is one of many methods available for the statistical downscaling of GCM outputs.
Processing climate data to generate climate scenarios for a selected study region is complicated and requires special skills. It is therefore necessary to develop a tool that simplifies the downscaling process, from reading the complex data, interpolating and performing bias correction to preparing downscaled results that can easily be used in further analysis. In this study, we develop a software tool for preparing climate change scenarios in Indonesia based on the outputs of CMIP5 GCMs. The tool is called "Statistical Bias Correction for Climate Scenarios (SiBiaS)" and has now been developed to version 1.1. It was prepared to support capacity building in climate modeling in Indonesia as part of the Indonesia 3rd National Communication (TNC) project activities. This paper provides a general description of SiBiaS as a tool for generating downscaled future climate scenarios in Indonesia. It also describes potential uses of the tool in supporting climate change vulnerability and impact studies over different domains in Indonesia.

Data
SiBiaS is specially designed to facilitate the use of CMIP5 GCM outputs and to perform their statistical bias correction relative to reference data from observations. Outputs from a number of GCMs are available in the Coupled Model Intercomparison Project Phase 5 (CMIP5) database. The data can be obtained directly from the PCMDI/ESGF data portal (http://www-pcmdi.llnl.gov or http://pcmdi9.llnl.gov/esgf-web-fe/), or from other sources. For SiBiaS v1.1, we downloaded the CMIP5 data from a more user-friendly website, the KNMI Climate Explorer (http://climexp.knmi.nl). Twenty-four models were selected and stored in the SiBiaS database; the models whose outputs are used in the software are listed in Table 1. The program contains rainfall and surface temperature outputs from the 24 CMIP5 GCMs under four Representative Concentration Pathway (RCP) scenarios. The RCPs are the most recent scenarios used by the IPCC for the climate projections in the Fifth Assessment Report (AR5), published at the end of 2013 [1]. They consist of four climate change scenarios defined by radiative forcing pathways until 2100, i.e. RCP2.6, RCP4.5, RCP6.0 and RCP8.5 [22,23]. RCP2.6 is the lowest scenario, in which radiative forcing peaks at ~3 W m⁻² before 2100 and then declines (the "peak and decline" pathway). The highest scenario is RCP8.5, a "rising" pathway in which radiative forcing is expected to exceed 8.5 W m⁻² in 2100.
The main reference data are gridded observations of rainfall and surface temperature, already cropped to the Indonesian domain. The Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) version 2.0 [24,25] is used as the observed reference for rainfall. It is available at two spatial grid resolutions, i.e. 0.25° (low) and 0.05° (high). For surface temperature, the Climatic Research Unit (CRU) TS3.22 dataset [26] is used as the reference. This dataset has a relatively coarse grid resolution of 0.5°. In addition to these datasets, which are available in the program by default, users can add reference data from observation stations by loading the data files through the browse file menu. For the program to read user-defined data, the files have to be prepared in the format described in the user manual [27].

Method
Two statistical bias correction techniques are implemented in this program, i.e. i) the Delta method, and ii) the Statistical intensity distribution method. A previous study provides more detailed information on the Statistical intensity distribution method [28], while the strengths and weaknesses of various bias correction and statistical downscaling methods are compared in another study [29].
The Delta method [e.g. 30] is a simple downscaling method widely used for preparing climate change scenarios at the local level. It combines a delta (Δμ) with the data in the baseline period through either addition or multiplication. In SiBiaS, the additive form is mostly used for temperature (Equation 1), while the multiplicative form is used for rainfall (Equation 2). To maintain the variability of the model when calculating the future time series of rainfall or temperature, Δμ is calculated as the difference or the ratio between the mean of the observations (μ_ob) and the mean of the modelled baseline (μ_mb).
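The two forms of the delta can be illustrated with a minimal sketch. It assumes, as a reading of the description above rather than a reproduction of Equations 1 and 2, that the corrected value is the model value plus Δμ = μ_ob − μ_mb for temperature, and the model value times Δμ = μ_ob / μ_mb for rainfall. The function names and the sample numbers are hypothetical and are not part of SiBiaS.

```python
from statistics import mean


def delta_additive(model_series, obs_baseline, model_baseline):
    """Shift a model series by the mean bias (assumed additive form, e.g. temperature)."""
    delta = mean(obs_baseline) - mean(model_baseline)  # delta = mu_ob - mu_mb
    return [x + delta for x in model_series]


def delta_multiplicative(model_series, obs_baseline, model_baseline):
    """Scale a model series by the mean ratio (assumed multiplicative form, e.g. rainfall)."""
    mu_mb = mean(model_baseline)
    if mu_mb == 0:
        raise ValueError("modelled baseline mean is zero; the ratio is undefined")
    delta = mean(obs_baseline) / mu_mb  # delta = mu_ob / mu_mb
    return [x * delta for x in model_series]


# Illustrative numbers only (not SiBiaS data).
obs_temp = [26.0, 27.0, 28.0]   # observed baseline, mean 27
mod_temp = [24.0, 25.0, 26.0]   # modelled baseline, mean 25
fut_temp = [25.5, 26.5, 27.5]   # modelled projection
corrected_temp = delta_additive(fut_temp, obs_temp, mod_temp)

obs_rain = [100.0, 200.0, 300.0]  # observed baseline, mean 200
mod_rain = [50.0, 100.0, 150.0]   # modelled baseline, mean 100
fut_rain = [60.0, 120.0, 90.0]    # modelled projection
corrected_rain = delta_multiplicative(fut_rain, obs_rain, mod_rain)
```

Because the same constant is added to or multiplied with every value, the sequence of highs and lows in the model series is preserved, which is the sense in which the model variability is maintained.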
Meanwhile, the Statistical intensity distribution method corrects the distribution of the model data by transforming it to match the distribution of the observed data [28]. In performing the bias correction, the observed and simulated data are assumed to follow a specific distribution; rainfall data commonly follow a Gamma distribution [28]. The Gamma probability density function (PDF) is given by Equation 3:
$$f(x; k, \theta) = \frac{x^{k-1} e^{-x/\theta}}{\Gamma(k)\,\theta^{k}}, \quad x > 0 \qquad (3)$$
where x is the rainfall data, k is the shape parameter and θ is the scale parameter. In general, the correction is carried out by calculating the values of the inverse cumulative distribution function (CDF) for the observed (y) and modelled (x) data. The inverse CDF values are then used to develop a regression equation of the form y = f(x). The equation obtained from the regression is treated as the correction factor, which is applied to correct both the current baseline and the future projection data from the models. Figure 1 provides an example of the steps involved in bias correction based on the statistical intensity distribution method. We tried several regression approaches to find the best-fitting regression model. A third-order polynomial regression with the fitted line forced through the origin (0,0) was found to give better correction factors for the rainfall data. For temperature data, we modified the distribution fitting: unlike rainfall, which follows a Gamma distribution, temperature has a Gaussian-shaped distribution that tends to follow the normal distribution.
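The regression step can be sketched as follows. This is a simplified illustration, not the SiBiaS implementation: it pairs empirical quantiles of the modelled and observed data instead of inverse CDF values from a fitted Gamma distribution, and then fits a third-order polynomial with no intercept by ordinary least squares. All function names and sample values are hypothetical.

```python
def quantiles(values, n_points):
    """Empirical quantiles at n_points evenly spaced probabilities (linear interpolation)."""
    s = sorted(values)
    out = []
    for i in range(n_points):
        pos = i * (len(s) - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(s) - 1)
        out.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    return out


def solve3(a, b):
    """Solve a non-singular 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_cubic_through_origin(xq, yq):
    """Least-squares fit of y = c1*x + c2*x^2 + c3*x^3 (no intercept term)."""
    powers = [[x ** p for p in (1, 2, 3)] for x in xq]
    ata = [[sum(row[i] * row[j] for row in powers) for j in range(3)] for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(powers, yq)) for i in range(3)]
    return solve3(ata, aty)


def apply_correction(coeffs, x):
    """Evaluate the fitted correction polynomial at a model value x."""
    c1, c2, c3 = coeffs
    return c1 * x + c2 * x ** 2 + c3 * x ** 3


# Illustrative data: the "observations" are exactly 1.5 times the model values.
model_rain = [3.0, 0.0, 8.0, 1.0, 5.0, 2.0, 7.0, 4.0, 6.0]
obs_rain = [1.5 * v for v in model_rain]

xq = quantiles(model_rain, 5)  # modelled quantiles (x)
yq = quantiles(obs_rain, 5)    # observed quantiles (y)
coeffs = fit_cubic_through_origin(xq, yq)
corrected = apply_correction(coeffs, 4.0)
```

Leaving out the intercept guarantees that a modelled value of zero maps to zero, so dry days remain dry after correction.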

Results and Discussion
SiBiaS is built to process baseline and projection data from CMIP5 GCM outputs into downscaled climate change scenarios for various locations in Indonesia. The program contains rainfall and surface temperature outputs from 24 CMIP5 GCMs under four RCP scenarios. These data can be selected and downscaled for a chosen region in Indonesia. To account for seasonality in the climate data, the tool corrects rainfall or temperature separately for every grid location and every calendar month; a domain with 10 grid locations therefore has 10 grids × 12 months = 120 correction factors. The program delivers two types of output, i.e. i) graphic plots of the area-averaged monthly climatology, monthly time series, anomalies of the monthly time series and probability density function (PDF), and ii) downscaled data in simple ASCII format. The SiBiaS v1.1 interface is shown in figure 2.
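The count of correction factors can be made concrete with a short sketch. It assumes a simple multiplicative factor (observed mean divided by modelled mean) for each grid location and calendar month; the data layout, the function name and the synthetic values are hypothetical and only serve to show how 10 grids and 12 months give 120 factors.

```python
from statistics import mean


def monthly_factors(obs, model):
    """Return {(grid_id, month): ratio of observed to modelled baseline mean}.

    obs and model map grid_id -> list of (month, value) records.
    """
    factors = {}
    for grid_id in obs:
        for month in range(1, 13):
            o = [v for m, v in obs[grid_id] if m == month]
            s = [v for m, v in model[grid_id] if m == month]
            factors[(grid_id, month)] = mean(o) / mean(s)
    return factors


# Synthetic baseline: 10 grid locations, 2 years of monthly values each.
n_grids = 10
obs = {g: [(m, 100.0 + 10 * m) for _ in range(2) for m in range(1, 13)]
       for g in range(n_grids)}
model = {g: [(m, 50.0 + 5 * m) for _ in range(2) for m in range(1, 13)]
         for g in range(n_grids)}

factors = monthly_factors(obs, model)
n_factors = len(factors)  # one factor per (grid, month) pair
```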

Graphic plots resulted by SiBiaS
The tool offers simple graphical results for different purposes in delivering downscaled future climate scenarios. It can also be used to assess the agreement between models in simulating rainfall or temperature in the selected study area. Figure 3a shows an example of the uncorrected temperature climatology from the model baseline compared with the observed data during the same baseline period. It shows the performance of the models in reproducing the surface temperature climatology in Indonesia. Figures of this kind can also be used to identify the model that best matches the climatology of the observed data. The main purpose of bias correction or downscaling of GCM outputs is to obtain corrected data for future climate scenarios. Figures 3b-3d provide examples of area-averaged data bias-corrected with the delta approach. They show scenarios of future rainfall climatology, monthly time series and their anomalies from each individual model. The monthly time series and their anomalies are also used to display and compare the PDFs from every model, as shown in figures 3e and 3f. The individual model results in figure 3 can also be combined into a single figure that shows the median and the range between the minimum and maximum of all the selected models. An example of such a simplified figure, showing the range across different models, is given in figure 4. This figure can also be used to obtain information about the range of uncertainty in the future climate scenarios from many models.
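The median and minimum-to-maximum range across models can be computed along the following lines. This is a minimal sketch with three hypothetical models and made-up monthly values; it is not taken from the SiBiaS code.

```python
from statistics import median


def ensemble_summary(model_climatologies):
    """Return a list of (minimum, median, maximum) across models for each of 12 months.

    model_climatologies maps a model name to a list of 12 monthly values.
    """
    names = sorted(model_climatologies)
    summary = []
    for month_idx in range(12):
        vals = [model_climatologies[name][month_idx] for name in names]
        summary.append((min(vals), median(vals), max(vals)))
    return summary


# Made-up monthly rainfall climatologies (mm) for three hypothetical models.
clims = {
    "model_a": [200.0 + 10 * m for m in range(12)],
    "model_b": [220.0 + 10 * m for m in range(12)],
    "model_c": [260.0 + 10 * m for m in range(12)],
}
summary = ensemble_summary(clims)
```

The width of the minimum-to-maximum band in each month gives a simple reading of how far the selected models disagree.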

Comparing downscaled outputs for different regions in Indonesia
SiBiaS is designed to produce bias-corrected data for generating future climate scenarios within a user-defined domain in Indonesia. Users can therefore select various areas of interest within the country. Figure 5 demonstrates a further use of the SiBiaS outputs: comparing future scenarios of rainfall climatology over different areas in Indonesia. It shows box plots derived from the outputs of 24 GCMs, depicting the uncertainty of the future rainfall climatology. The median values indicate how the rainfall climatology may unfold in the future over the selected regions, i.e. Sumatra, Jawa, Kalimantan, Sulawesi, Bali and Nusa Tenggara, Maluku, and Papua. The scenarios of future rainfall climatology show that some areas in Indonesia are expected to experience less rainfall in the dry season and more rainfall in the rainy season.

Delivering more analysis from the downscaled data outputs
Further analysis can be carried out with the ASCII data outputs of SiBiaS, which are automatically stored in the output folder of the program. They contain not only the area-averaged values used for creating the default graphical plots, but also spatial data for every grid location within the selected region. Third-party software is needed to create maps and additional graphical plots from these outputs. The outputs include the monthly climatology, monthly time series, anomalies of the monthly time series, and PDF values from the raw and anomaly data. The data are available for every grid location and for the area average of the selected domain. In addition, the outputs provide the latitude and longitude of every grid point within the selected domain, which is essential for mapping the data spatially. Figure 6 provides an example of maps created from the SiBiaS output data. The maps compare future scenarios of seasonal rainfall in the September-October-November (SON) period over Sumatra during three different periods under four RCP scenarios. Maps of this type are very useful for comparing climate scenarios across different future periods and different climate change scenarios. Figure 6 is drawn from the median value of the selected 24 GCMs. Many other maps and graphical plots can be created from the outputs of SiBiaS, and further analysis of future climate scenarios and their impacts can also be done with these data.
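As an illustration of this kind of post-processing, the sketch below computes the mean SON rainfall total for each grid point from whitespace-delimited text. The column layout (latitude, longitude, year, month, rainfall) is a hypothetical stand-in; the actual SiBiaS output format is described in the user manual and may differ. The sample values are made up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical layout, NOT the documented SiBiaS format.
SAMPLE = """\
# lat lon year month rainfall_mm
-0.5 100.0 2031 9 150.0
-0.5 100.0 2031 10 200.0
-0.5 100.0 2031 11 250.0
-0.5 100.0 2031 12 300.0
-0.5 100.0 2032 9 170.0
-0.5 100.0 2032 10 210.0
-0.5 100.0 2032 11 280.0
0.5 101.0 2031 9 100.0
0.5 101.0 2031 10 120.0
0.5 101.0 2031 11 140.0
"""


def son_mean_by_grid(text, months=(9, 10, 11)):
    """Return {(lat, lon): mean seasonal total over the years present in text}."""
    totals = defaultdict(float)  # (lat, lon, year) -> seasonal total
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the header comment
        lat, lon, year, month, value = line.split()
        if int(month) in months:
            totals[(float(lat), float(lon), int(year))] += float(value)
    by_grid = defaultdict(list)
    for (lat, lon, _year), total in totals.items():
        by_grid[(lat, lon)].append(total)
    return {grid: mean(vals) for grid, vals in by_grid.items()}


son = son_mean_by_grid(SAMPLE)
```

The resulting dictionary, keyed by latitude and longitude, is the kind of table that mapping software can plot directly.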

Conclusion
SiBiaS is built to produce user-defined climate change scenarios for various locations in Indonesia. It processes baseline and projection data from CMIP5 GCM outputs under different Representative Concentration Pathway (RCP) scenarios. The results from SiBiaS can serve as important information for providing climate change scenarios that support climate change impact, adaptation and mitigation studies in Indonesia.