An Assessment of Recent and Future Temperature Change over the Sichuan Basin, China, Using CMIP5 Climate Models

The Sichuan basin is one of the most densely populated regions of China, making the area particularly vulnerable to the adverse impacts associated with future climate change. As such, climate models are important for understanding regional and local impacts of climate change and variability, like heat stress and drought. In this study, climate models from phase 5 of the Coupled Model Intercomparison Project (CMIP5) are validated over the Sichuan basin by evaluating how well each model can capture the phase, amplitude, and variability of the regionally observed mean, maximum, and minimum temperature between 1979 and 2005. The results reveal that the majority of the models do not capture the basic spatial pattern and observed means, trends, and probability distribution functions. In particular, mean and minimum temperatures are underestimated, especially during the winter, resulting in biases exceeding −3°C. Models that reasonably represent the complex basin topography are found to generally have lower biases overall. The five most skillful climate models with respect to the regional climate of the Sichuan basin are selected to explore twenty-first-century temperature projections for the region. Under the CMIP5 high-emission future climate change scenario, representative concentration pathway 8.5 (RCP8.5), the temperatures are projected to increase by approximately 4°C (with an average warming rate of +0.72°C decade⁻¹), with the greatest warming located over the central plains of the Sichuan basin, by 2100. Moreover, the frequency of extreme months (where mean temperature exceeds 28°C) is shown to increase in the twenty-first century at a faster rate compared to the twentieth century.


Introduction
China has seen a significant increase in regional temperature since the late nineteenth century (Ren et al. 2005; Q. Wei and Chen 2011). Ding et al. (2007) calculate that the annual mean surface air temperature has increased by 0.8°C during the twentieth century, with an accelerated warming of 1.1°C during the second half of the century, which is slightly higher than the global temperature trend for the same period. Regionally, northern China has become drier while central China has become wetter during summer, and southern and east-central China have become wetter during winter (see, e.g., Hu et al. 2003; Chen et al. 2006; Gu et al. 2009; Y. Li et al. 2009, 2010; Yin et al. 2012). Since China is a densely populated country where the climate has high spatial and temporal variation, the physical environment and economic productivity across the country are particularly vulnerable to the adverse impacts associated with future climate change. In particular, water resources, energy security, human security, well-being and health, and environmental and social conditions are highly vulnerable to climate changes (Barnett et al. 2005; Patz et al. 2005; Wang 2005; Gu et al. 2012).
Recently, global climate models (GCMs) have been used to examine past, present, and future climate trends and variability of Southeast Asia and China (see, e.g., Gao et al. 2002; Jiang et al. 2004; Ding et al. 2007; Xu et al. 2009; Sun et al. 2010; Chen et al. 2012; Jiang et al. 2012; Ma et al. 2012; Wang et al. 2012; Xu et al. 2013; Zhou et al. 2014; Wu and Huang 2016; Zhou et al. 2016). To summarize previous research using the most recently implemented GCMs within the framework of phase 5 of the Coupled Model Intercomparison Project (CMIP5; Taylor et al. 2012), future warming appears in all regions of China, and there will generally be fewer cold extremes and more warm extremes. Gu et al. (2014) found that the temperature in most parts of China will increase by more than 3°C by 2100 with the greatest warming experienced during the months March through August, and that near-surface warming will accelerate in the latter part of the twenty-first century (also see Chen and Frauenfeld 2014). Xu and Xu (2012) and Wang and Chen (2014) have also shown that in winter, the northern continental regions show greater warming than the southern regions. In spite of these results, Jiang et al. (2015) found that most of the CMIP5 models have a topography-related cold bias, primarily attributable to the (relatively coarse) resolution of the regional topography over southern China and the Tibetan Plateau, resulting in an overestimation of the spatial variability and horizontal gradient in temperature across China. Other studies have shown that these modeled cold biases are also related to the poor capacity of climate models to reproduce the East Asian winter monsoon and its relationship with tropical teleconnections (see, e.g., Gong et al. 2014, 2015; Song and Zhou 2014; Wu and Zhou 2016).
It is also important to note that, traditionally, studies that have evaluated the performance of GCMs in simulating surface air temperature variability and patterns over China during the twentieth century (with respect to observations) have used multimodel ensemble means (i.e., the average of simulated temperature from a subset of GCMs) (see, e.g., Xu and Xu 2012; Zhang 2012). This is because averages across structurally different models empirically show better large-scale agreement with observations (Cubasch et al. 2001; Zhou and Yu 2006; Xu 2007; Xu et al. 2007), and noise in future predictions is thereby reduced. Although a multimodel ensemble analysis may provide a more robust climate change signal, such a method does not consider the relative strengths and weaknesses of each model, as an ensemble invariably hides the substantial variations between the individual models. Through the multimodel approach, the low-frequency natural climate variability occurring over multidecadal time scales and its mechanisms may not necessarily have the right phase. If the models collectively misrepresent some component of the forcing or partially cancel each other out, then the future natural variability in an ensemble will be inherently suppressed. Additionally, because internal climate variability increases toward smaller horizontal scales, multimodel means have much smoother geographical patterns and trends, and thus smaller local extremes, than observed patterns and trends. Therefore, aspects of climate variability are not represented well in multimodel means, and this is not acceptable for regional and local economic planning purposes that, for example, require local future predictions of temperature extremes.
Consequently, the intercomparison between individual models is important and necessary, especially for estimating the credibility of future climate projections, and it is imperative to explore individual models for their individual merits to better understand future changes. As such, studies using CMIP5 simulations have previously highlighted that most of the models in the CMIP5 data repository exhibit varying degrees of skill, depending on the region of China and the season (Chen and Frauenfeld 2014; Gu et al. 2015). It is, therefore, important to evaluate the performance of each climate model in CMIP5 for different regions (Chen and Frauenfeld 2014; Wu and Huang 2016) and to determine the best overall GCM depending on the specific application for which the model will be used.
The Sichuan basin region (Fig. 1) of China is particularly vulnerable to climate extremes [e.g., droughts, floods, cold temperatures, and heat waves (Kuo et al. 1986; Zhang et al. 2008; Li et al. 2015)] and has a confluence of a large population, insufficient arable land, and economic underdevelopment. As such, future climate change is expected to inflict significant socioeconomic and personal damage because of the dense and growing human population and economy of this region. Improving the predictability of annual mean and extreme temperatures over the Sichuan basin is thus important for prevention of climate-related stresses. In particular, future climate change will have profound implications for internal heating and cooling loads in the ever-expanding urban regions (particularly the cities of Chengdu and Chongqing, China), resulting in human physiological, perceptual, and behavioral responses (e.g., changes to overall annual energy consumption per household). Since this region has a large population and a growing and diverse economy that is (at least partly) dependent on climate, and because the country plays a significant role in climate change negotiations, being one of the largest greenhouse gas emitters, understanding future changes in temperature across the region is of importance for China's future environmental, economic, and social development.
In this paper, therefore, we will deduce the optimal GCMs that provide the most realistic interpretations of historical near-surface air temperature and, using those models, identify future regional temperature changes over the Sichuan basin, one of the most populated and climate-sensitive regions of China. Thus, the objective for this study is the evaluation of CMIP5 simulations compared to monthly observations across the Sichuan basin region by exploring the capacity of each of the CMIP5 models to simulate mean, maximum, and minimum air temperature. Here, we only focus on air temperature because it has the most direct relevance for exploring (current and future) heating and cooling demands as well as outdoor and indoor thermal comfort, in addition to having a historical record that can be compared to model simulations and atmospheric reanalyses. This paper is organized as follows: section 2 describes the datasets and analysis methods; section 3 presents a comparison of observed temperature variability with the modeled CMIP5 simulations, along with a discussion on modeled atmospheric circulation patterns and their intra-annual variability over China; section 4 shows the projected decadal temperature trends across the Sichuan basin for the twenty-first century using a selection of the best available GCMs; and section 5 provides a concluding summary, a discussion on the uncertainties associated with the modeled results, and final remarks.

a. Geographical setting
Located in southwestern China (Fig. 1a), the Sichuan basin is framed by mountain ranges that are 1000-3000 m above mean sea level (MSL) on all sides. The central part of the basin mainly consists of hills, flatlands, and low mountains ranging between 400 and 800 m MSL. With cool winters and hot, humid, and wet monsoon summers, the basin is persistently cloudy. This makes the Sichuan basin globally unique, with some of the highest cloud fractions in the world (Klein and Hartmann 1993). The persistent distribution of cloud over the basin is a combined result of the higher surface temperatures relative to the surrounding areas and high water vapor content from the prevailing southeasterly wind (Li and Gu 2006), as well as the strong regional stable stratification in the troposphere bounded by the region's complex topography. Owing to the flat lowlands and fertile ground, the Sichuan basin has always been one of the country's major agricultural production bases. It is one of the most populated regions in the country, with an estimated population of 100 million people. The combination of the basin's topography, a dense population, and rapid twentieth-century urbanization also means that the Sichuan basin is one of the most polluted regions in the country (Qiao et al. 2015).
FIG. 1. (a) Location of the Sichuan basin within China; (b) location of the meteorological stations within the Sichuan basin (from west to east): Chengdu, Yibin, Nanchong, Chongqing (north), Chongqing (south), and Jiangbei; and (c)-(i) the regional mean derived from the mean of all six stations (black) and mean annual temperatures from the six meteorological stations (colored). The black horizontal solid lines correspond to individual location means (see Table 1), and the black horizontal dashed lines in (d)-(i) correspond to the regional mean of 18.06°C.
These factors, along with the presence of high levels of anthropogenic heating, are associated with the urban heat island effect being observed in many of the urban centers (see, e.g., Yao et al. 2015). This is particularly important as urban aerosols and the heat island effect can exert a significant influence on microscale and local dynamics and microphysics, which may not be resolved in current GCMs. Therefore, capturing both the dynamical and thermodynamical effects of the surrounding plateaus and the Sichuan basin, as well as the impact of urbanization on small spatial scales, is of great importance if GCMs are to replicate the regional-scale circulation and climate patterns in this region.
b. Observations
Figure 1b shows the location of six meteorological stations [Chengdu, Chongqing (north), Chongqing (south), Jiangbei, Nanchong, and Yibin] found within the Sichuan basin, which have been used in this study. The coordinates for each station are shown in Table 1.
Note that all six stations are located within an urban setting, and they may therefore be exposed to the urban heat island effect. In consideration of the availability of temperature data, atmospheric reanalysis data, and the historical period of the CMIP5 models, the period 1979-2005 was employed for the analysis. Monthly observed temperature was obtained from the National Climatic Data Center archives (https://gis.ncdc.noaa.gov/maps/ncei/cdo/monthly). The annual mean temperature between 1979 and 2005 for each station is shown in Figs. 1c-i. The mean temperature across all stations ranges from 16.85°C at Chengdu to 18.83°C at Chongqing (north) (Table 1). The long-term trends in temperature are positive and statistically significant, ranging from 0.67° to 1.57°C in the last 27 years (Table 1). For validating the model data against the observations, a regional mean temperature from 1979 to 2005 was calculated from the six sites (see Fig. 1c), and regional mean values were similarly calculated for each CMIP5 model.
As previously mentioned, here we only focus on air temperature because of its direct relevance to thermal comfort indices, in addition to having an historical record that can be compared to model simulations. A preliminary analysis also revealed that although models and observations do not exhibit a significant trend in the amount of precipitation for the Sichuan basin, significant biases and intermodel differences exist in precipitation amount and in spatial pattern. The relationship between topography, modeled precipitation, and the East Asian monsoon is not discussed here, but is worthy of further investigation.

c. Global climate model data
To assess the historical and future projected changes in the regional climate of the Sichuan basin, monthly output from all 47 GCMs is extracted from the CMIP5 data repository. The CMIP5 set of experiments (Taylor et al. 2012) includes simulations of twentieth-century climate (referred to as historical experiments) and future projection experiments of twenty-first-century climate under the new greenhouse gas emission scenarios [referred to as representative concentration pathways (RCPs)] (Moss et al. 2010; Meinshausen et al. 2011). The RCP simulations represent mitigation scenarios that produce emissions pathways following various assumed policy decisions that would influence the time evolution of the future emissions of greenhouse gases, aerosols, ozone, and land-use changes (Moss et al. 2010). More details on the models can be found at the CMIP5 website (http://cmip-pcmdi.llnl.gov/cmip5/availability.html). Table 2 gives an overview of the climate models used in this study, including associated institutions and the resolution of the atmospheric model components of those GCMs. Since there are a different number of available ensemble members for each individual model, and since these members are generally tightly clustered relative to intermodel spread, we consider only one member from each CMIP5 model in order to give equal weighting across models. Because of the different spatial resolutions adopted by different GCMs (see Table 2), all of the models are bilinearly interpolated onto a common 0.75° × 0.75° (~80 km × ~80 km) grid for comparison between simulations and observations. For comparison with the individual CMIP5 models, an ensemble of all the models was constructed using equal weighting, and this was also regridded. In this study, we consider 1979-2005 to be the reference (base) period and also consider three additional decadal time intervals: 2030s, 2060s, and 2090s. It is important to note that some models for some time periods have missing data. The text states when individual models have been omitted from the data analysis.
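The regridding and equal-weight ensemble steps described above can be illustrated with a minimal, standard-library Python sketch. The function names (`bilinear`, `regrid`, `ensemble_mean`), the toy coordinates and field values, and the clamping behaviour at the grid edge are assumptions made for illustration only; the study does not state which software was used.

```python
from bisect import bisect_right


def bilinear(lats, lons, field, lat, lon):
    """Bilinearly interpolate field[i][j], given on ascending lats/lons, to (lat, lon)."""
    # Lower-left corner of the enclosing cell, clamped to the source grid
    # (points outside the grid are therefore linearly extrapolated).
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)
    t = (lat - lats[i]) / (lats[i + 1] - lats[i])
    u = (lon - lons[j]) / (lons[j + 1] - lons[j])
    return ((1 - t) * (1 - u) * field[i][j]
            + (1 - t) * u * field[i][j + 1]
            + t * (1 - u) * field[i + 1][j]
            + t * u * field[i + 1][j + 1])


def regrid(lats, lons, field, new_lats, new_lons):
    """Interpolate a whole field onto a new regular latitude/longitude grid."""
    return [[bilinear(lats, lons, field, la, lo) for lo in new_lons]
            for la in new_lats]


def ensemble_mean(fields):
    """Equal-weight mean of several fields that share the same grid."""
    n = len(fields)
    rows, cols = len(fields[0]), len(fields[0][0])
    return [[sum(f[r][c] for f in fields) / n for c in range(cols)]
            for r in range(rows)]


# Toy example (hypothetical values): a coarse 2-degree field mapped onto
# 0.75-degree spacing, then averaged with itself as a trivial "ensemble".
coarse_lats = [28.0, 30.0, 32.0]
coarse_lons = [104.0, 106.0, 108.0]
coarse = [[la + 2.0 * lo for lo in coarse_lons] for la in coarse_lats]
target_lats = [28.0 + 0.75 * k for k in range(5)]
target_lons = [104.0 + 0.75 * k for k in range(5)]
fine = regrid(coarse_lats, coarse_lons, coarse, target_lats, target_lons)
ensemble = ensemble_mean([fine, fine])
```

Regridding every model to the common grid before averaging, as sketched here, is one possible ordering; the text above indicates only that the ensemble was also regridded.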

d. Meteorological reanalysis data
In addition to the observations, we have also utilized the meteorological reanalysis ERA-Interim for the same period for the GCM validation. The ERA-Interim is analyzed on pressure levels and is the most recent global reanalysis product from the European Centre for Medium-Range Weather Forecasts (ECMWF) (Simmons et al. 2007; Dee et al. 2011), with a spatial resolution of N128 (horizontal Gaussian grid, nominally 0.7° in latitude/longitude; ~80 km × ~70 km near Chongqing). The ERA-Interim dataset has also been interpolated onto a 0.75° × 0.75° grid. When compared to the regional annual mean observations of temperature, ERA-Interim has a mean bias, root-mean-square error, and correlation coefficient of −1.24°C, 1.24, and 0.98, respectively. Of four meteorological reanalysis products tested [not shown: ECMWF twentieth-century reanalysis (ERA-20C), NCEP-NCAR reanalysis, and NCEP-DOE AMIP-II reanalysis], we find that ERA-Interim outperforms the others at capturing the inter- and intra-annual variability across this region. This is consistent with other recent studies (Betts et al. 2009; Mao et al. 2010; Mooney et al. 2011; Hodges et al. 2011; Bao and Zhang 2013).
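The three validation statistics quoted above (mean bias, root-mean-square error, and correlation coefficient) can be computed as in the following sketch. The function name `validation_stats` and the toy series are hypothetical; the conventions (bias defined as simulated minus observed, Pearson correlation) are assumptions consistent with the statement elsewhere in the paper that a negative bias indicates an underestimate.

```python
from math import sqrt
from statistics import mean


def validation_stats(simulated, observed):
    """Return (mean bias, RMSE, Pearson r) of a simulated series against observations."""
    diffs = [s - o for s, o in zip(simulated, observed)]
    bias = mean(diffs)                      # negative => underestimate
    rmse = sqrt(mean(d * d for d in diffs))
    s_bar, o_bar = mean(simulated), mean(observed)
    cov = sum((s - s_bar) * (o - o_bar) for s, o in zip(simulated, observed))
    var_s = sum((s - s_bar) ** 2 for s in simulated)
    var_o = sum((o - o_bar) ** 2 for o in observed)
    r = cov / sqrt(var_s * var_o)
    return bias, rmse, r


# Toy example (hypothetical annual means, degC): a product that is uniformly
# 1 degC too cold but tracks the observed variability perfectly.
toy_obs = [17.6, 18.1, 17.9, 18.4, 18.2]
toy_sim = [o - 1.0 for o in toy_obs]
toy_bias, toy_rmse, toy_r = validation_stats(toy_sim, toy_obs)
```

In this toy case the bias and RMSE have the same magnitude because the offset is constant; any scatter about the mean offset would make the RMSE larger than the absolute bias.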

a. Temperature trends since 1979
Linear trend analysis of the observations reveals that there was a significant warming trend in annual temperatures during the 1979-2005 period across the Sichuan basin. The regional mean constructed from the observations exhibits a warming rate of +0.87°C over the last 27 years (≈+0.32°C decade⁻¹; Fig. 1c), which is higher than elsewhere in China (Chen and Frauenfeld 2014). The regional mean temperature trend from ERA-Interim is consistent with the observations, with a trend of +0.83°C (27 yr)⁻¹ (p < 0.01). Figure 2 compares the temperature trend derived from all CMIP5 models to the observations over the 27-yr period. All but 2 of the 47 models have an overall warming trend throughout this period, 28 of which are statistically significant (p < 0.05). Interannual variability in temperature is evident between the individual GCMs, resulting in warming trends between 1979 and 2005 as high as +2.01°C (FGOALS-g2) and even cooling trends as low as −0.43°C (MIROC-ESM-CHEM). There are six models with a 27-yr temperature trend within ±0.1°C of the observed trend, and, in order of precision, these are BNU-ESM (+0.86°C), IPSL-CM5A-MR (+0.91°C), HadCM3 (+0.83°C), GFDL CM2.1 (+0.79°C), HadGEM2-AO (+0.96°C), and FIO-ESM (+0.97°C). Despite the spread in temperature trends being so large among the models, the multimodel ensemble trend is +0.85°C (27 yr)⁻¹, which is also statistically significant (p < 0.05), and is comparable to the observed trend. Seasonally, the regional observations indicate that the warming trends in the Sichuan basin are approximately equal in summer and in winter. The observations exhibit a warming rate of +0.60°C (p = 0.06) over the last 27 years during summer and a warming rate of +0.61°C (27 yr)⁻¹ (p = 0.06) during winter.
The regional observations indicate, therefore, that the annual warming in the Sichuan basin is driven by changes in spring and autumn temperature, with linear trends of +1.28° (p = 0.01) and +1.00°C (27 yr)⁻¹ (p = 0.03), respectively. This seasonality in temperature trends is not well captured by CMIP5 (not shown). CMIP5 models exhibit consistently larger trends in different seasons, and the seasonal variability of temperature increase during this period is greater in the majority of the GCMs than that in observations.
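As an illustration of how trends of this kind are obtained and converted between units, the sketch below fits an ordinary least-squares line to an annual series and expresses the slope per decade and per 27 years. The function name `linear_trend` and the synthetic series are hypothetical, and expressing the 27-yr trend as slope × 27 is an assumption about the convention used. Significance testing (the p values quoted above) is not reproduced here; it would typically rely on a statistics library such as SciPy.

```python
def linear_trend(years, temps):
    """Ordinary least-squares slope of temps against years (degC per year)."""
    n = len(years)
    y_bar = sum(years) / n
    t_bar = sum(temps) / n
    sxx = sum((y - y_bar) ** 2 for y in years)
    sxy = sum((y - y_bar) * (t - t_bar) for y, t in zip(years, temps))
    return sxy / sxx


years = list(range(1979, 2006))                        # 27 annual values
toy_temps = [17.5 + 0.03 * (y - 1979) for y in years]  # hypothetical series, degC

slope = linear_trend(years, toy_temps)  # degC per year
trend_per_decade = slope * 10           # degC per decade
trend_per_27yr = slope * 27             # degC per 27 yr (assumed convention)

# Unit conversion applied to the observed value quoted in the text:
# +0.87 degC over 27 years corresponds to roughly +0.32 degC per decade.
observed_per_decade = 0.87 / 27 * 10
```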

b. Intra-annual temperature variability
To uncover the annual cycle of the mean climate of the Sichuan basin, Fig. 3 presents the range of monthly mean, maximum, and minimum temperatures from the different sources of climate data: the CMIP5 models, the regionally averaged observations, ERA-Interim, and the multimodel ensemble mean. Figure 3 shows that the CMIP5 models have an annual cycle consistent with the observations, but the majority of the models underestimate mean, maximum, and minimum temperature throughout the year. This is more apparent during the winter months than in the summer months; for instance, the multimodel mean temperature bias increases from −3.95°C in summer to −6.02°C in winter.
Figure 3 also shows that the intermodel spread is greatest for maximum temperature, with some models overestimating and underestimating maximum temperature by up to +6° and −9°C (respectively) throughout the year. The observations indicate that the annual mean variability and temperature range is consistently muted across CMIP5 models, and therefore, the multimodel ensemble estimate for the intra-annual temperature variability of 10.38°C is inconsistent with the observations. Mean annual biases between the simulated and observed monthly mean temperature for each CMIP5 model were also calculated, and these are presented in Fig. 4. With a few exceptions, the mean bias is widespread, with many models having a bias greater than −2°C. The lowest and highest biases are −0.13° (MIROC4h) and −9.13°C (BCC_CSM1.1), respectively. The multimodel ensemble mean bias is −5.08°C, which is considerably higher than the bias calculated for ERA-Interim (−1.24°C). There are 38 CMIP5 models with a mean annual bias greater than −3°C; only two models (MIROC4h and IPSL-CM5A-MR) have a mean bias less than −2°C. This indicates that most models do not accurately simulate the observed climatology for temperature at a regional scale. For the seasonal temperature biases, Fig. 4 also shows that the CMIP5 models generally perform better in summer than they do in winter. The top five models that have the lowest mean bias compared to the observed mean, maximum, and minimum temperature are individually plotted in Fig. 3.
FIG. 2. [...] Trends that are statistically significant (p < 0.05) are indicated by hatching. The regional mean annual temperature trend from the observations and ERA-Interim are colored in red. The multimodel ensemble mean trend is also colored in red. CMIP5 models that are within ±0.1°C of the observed regional mean trend are colored in green.
By ranking each of the CMIP5 models and giving them an aggregated score for mean, maximum, and minimum temperature biases, we find overall that the five models with the highest skill in reproducing the climatological mean temperature over the Sichuan basin are (in descending order): MIROC4h, CESM1(FASTCHEM), MIROC5, GISS-E2-R-CC, and GISS-E2-R.
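The text does not spell out how the aggregated score is formed, so the following is only one plausible sketch: each model is ranked by the magnitude of its bias for each variable, and the ranks are summed. The function name `aggregate_rank`, the placeholder model names, and all bias values are hypothetical and are not results from this study.

```python
def aggregate_rank(biases):
    """Rank models by summed per-variable rank of absolute bias (lower is better).

    biases maps model name -> {"mean": b, "max": b, "min": b} in degC.
    Returns (models ordered best to worst, dict of summed ranks).
    """
    variables = sorted(next(iter(biases.values())))
    score = {model: 0 for model in biases}
    for var in variables:
        # Rank 1 goes to the model with the smallest absolute bias for this variable.
        ordered = sorted(biases, key=lambda m: abs(biases[m][var]))
        for rank, model in enumerate(ordered, start=1):
            score[model] += rank
    ranking = sorted(score, key=lambda m: score[m])
    return ranking, score


# Hypothetical biases for three placeholder models.
toy_biases = {
    "model_A": {"mean": -0.5, "max": 1.0, "min": -1.0},
    "model_B": {"mean": -3.0, "max": -2.0, "min": -4.0},
    "model_C": {"mean": -1.5, "max": 0.5, "min": -2.5},
}
toy_ranking, toy_scores = aggregate_rank(toy_biases)
```

Summing ranks weights the three variables equally regardless of the size of their biases; other aggregation choices (for example, averaging normalized biases) could reorder closely matched models.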

c. Mean climatology of the Sichuan basin
To obtain a better sense of model variability across the Sichuan basin, we also investigate the spatial variability of climatological annual mean temperature over the twentieth century. Figure 5 compares CMIP5 average annual mean air temperatures with observations and ERA-Interim data for the 1979-2005 period. Immediately apparent in Fig. 5 is the difference in the spatial pattern of temperature within the basin between the GCMs. Some models (e.g., BNU-ESM and FIO-ESM) have a northwest to southeast gradient in temperature, whereas other models [e.g., MIROC4h and CESM1(FASTCHEM)] clearly have an imprint of the Sichuan basin in the temperature field. Figure 5 shows that those models with a gradient do not have a local maximum in temperature, resulting in an underestimation of temperature compared to the observations. Using ERA-Interim as a proxy for regional observations, we should expect the pattern in temperature to coincide with the basic morphological shape of the Sichuan basin. The ERA-Interim dataset indicates that the mean annual temperature is greater than 16°C for much of the basin (~44% of the domain shown in Fig. 5), with the mean annual maximum of 18.00°C located in the central part of the basin. Therefore, it is clear that the models disagree in the location and the spatial extent of the temperature maximum within the Sichuan basin. This is also particularly obvious in the multimodel ensemble spatial pattern (Fig. 5), which, as a result of averaging across different datasets, has a smooth geographical temperature pattern with no local maximum compared to ERA-Interim or those models with the lowest mean annual biases [e.g., MIROC4h, IPSL-CM5A-MR, and CESM1(FASTCHEM)].
FIG. 4. Regional mean annual (gray), winter (blue), and summer (red) temperature biases as compared to the regional mean observations. A negative bias indicates the model underestimates the observations. The 47 CMIP5 models are ordered from lowest to highest (from left to right) mean annual temperature bias. The bias for ERA-Interim and the multimodel ensemble are also shown.
To explore the role of horizontal resolution in shaping the regional climate of the Sichuan basin, a linear least squares fit of the regional mean temperature bias and the grid cell area (approximated at the latitude and longitude of Chongqing) was calculated. A significant regression was found, with an r² value of 0.23, which is statistically significant at the 99% confidence level (p < 0.01). An equally strong positive and significant correlation is also found between grid cell area and minimum temperature bias (r² = 0.21; p < 0.01) but not for maximum temperature (r² = 0.06; p > 0.05). This indicates that, although the lack of small-scale topography in some of the models can account for temperature biases (i.e., those models with higher horizontal resolution, and thus better resolved topographical features, tend to have smaller temperature biases), it is not the only factor in controlling model performance. The larger cold biases in the winter seasons imply that the GCMs may represent surface-cloud feedbacks incorrectly, and therefore, further analysis of the near-surface energy balance and associated processes is required. Similarly, the results indicate that the models are not capturing the seasonal minima and maxima in mean temperature. This could be an artifact of the relatively coarse resolution of the GCMs and the lack of direct (and indirect, e.g., clouds) effects of detailed topography and surface conditions, as well as the representation of high pollution and aerosol concentrations in the troposphere, in this region.
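A minimal sketch of the resolution analysis described above: the grid cell area is approximated on a sphere at a fixed latitude, and the coefficient of determination of a least-squares fit between area and bias is computed. The function names, the latitude of 29.5°N used for Chongqing, the spherical Earth radius, and all resolution and bias values are assumptions for illustration, not values taken from the study.

```python
from math import cos, radians

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def grid_cell_area_km2(dlat_deg, dlon_deg, lat_deg):
    """Approximate area (km^2) of a dlat x dlon grid cell centred at lat_deg."""
    dy = EARTH_RADIUS_KM * radians(dlat_deg)
    dx = EARTH_RADIUS_KM * cos(radians(lat_deg)) * radians(dlon_deg)
    return dx * dy


def r_squared(x, y):
    """Coefficient of determination of a simple linear least-squares fit of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


CHONGQING_LAT = 29.5  # assumed approximate latitude, degrees north

# Hypothetical model resolutions (degrees) and absolute mean biases (degC).
toy_resolutions = [0.5, 1.0, 2.0, 3.0]
toy_abs_bias = [1.0, 2.5, 4.0, 6.5]
toy_areas = [grid_cell_area_km2(d, d, CHONGQING_LAT) for d in toy_resolutions]
toy_r2 = r_squared(toy_areas, toy_abs_bias)
```

With this approximation a 0.75° × 0.75° cell near Chongqing spans roughly 83 km north-south by 73 km east-west, the same order as the approximate cell sizes quoted in the text.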

d. Temperature probability distribution functions
To further assess uncertainty in the regional simulations of the Sichuan basin's climate, probability distribution functions (PDFs) were calculated (Fig. 6). PDFs demonstrate the capability of the models to simulate present climatic distributions of mean, maximum, and minimum temperature that are otherwise not ascertained in the above analysis. Using all station data from the six sites, area-estimated distributions of mean, maximum, and minimum temperature (rather than individual point estimates; i.e., all data from each of the six sites were binned together; there was no averaging across the six locations) were calculated. Similarly, all data from the six locations for each of the models were used to derive the modeled PDFs. Bin sizes of 0.5°C were used. To compare the similarity between the observed and simulated PDFs, the skill score S_score, which was first developed by Perkins et al. (2007), was adopted. This metric calculates the cumulative minimum values of two distributions of each binned value, thereby measuring the common area between two PDFs. Expressed formally,
$$S_{\mathrm{score}} = \sum_{i=1}^{n} \min(Z_m, Z_o),$$
where n is the number of bins used to calculate the PDF, Z_m is the frequency of values in a given bin from the model, and Z_o is the frequency of values in a given bin from the observed data. If a model simulates the observed distribution perfectly, the S_score will equal one, which is the total sum of the probability of each bin center in a given PDF. If a model simulates the observed PDF poorly (e.g., there is negligible overlap between the observed and modeled PDFs), it will have a skill score close to zero. Therefore, the confidence in the skill of a model declines as the overlap between the present observed and present simulated PDFs also decreases. This is a very simple measure that provides a robust and comparable measure of the relative similarity between model and observed PDFs (Perkins et al. 2007; Sun et al. 2015).
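The PDF overlap metric described above can be sketched in a few lines of Python. The function names (`binned_pdf`, `s_score`), the bin range of −40 to 60°C, and the toy samples are assumptions for illustration; only the 0.5°C bin width and the sum-of-minima definition come from the text.

```python
from math import floor


def binned_pdf(values, bin_width=0.5, lo=-40.0, hi=60.0):
    """Relative frequency of values in fixed-width bins covering [lo, hi)."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for v in values:
        k = floor((v - lo) / bin_width)
        if 0 <= k < n_bins:          # values outside [lo, hi) are ignored
            counts[k] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin range")
    return [c / total for c in counts]


def s_score(model_values, observed_values, bin_width=0.5, lo=-40.0, hi=60.0):
    """Sum over bins of min(Z_m, Z_o): 1 for identical PDFs, 0 for no overlap."""
    z_m = binned_pdf(model_values, bin_width, lo, hi)
    z_o = binned_pdf(observed_values, bin_width, lo, hi)
    return sum(min(m, o) for m, o in zip(z_m, z_o))


# Toy samples (hypothetical monthly temperatures, degC).
toy_obs = [10.1, 10.2, 20.1, 20.2]
toy_half = [10.1, 10.2, 30.1, 30.2]   # shares half of its mass with toy_obs
toy_none = [35.1, 35.2, 36.1, 36.2]   # shares no bins with toy_obs

score_same = s_score(toy_obs, toy_obs)
score_half = s_score(toy_half, toy_obs)
score_none = s_score(toy_none, toy_obs)
```

Because each PDF is normalized to sum to one, the score is bounded between zero and one regardless of how many monthly values each dataset contributes.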
If a climate model can be shown to poorly simulate the current PDF distribution, then one can presume that the model will not have skill in simulating future distributions. Figure 6 presents the spread in S_score for mean, maximum, and minimum temperature, while the models that have the highest S_score (i.e., the highest skill) at simulating the present climatic distributions are individually plotted (see Fig. 6 legend). The S_score for ERA-Interim (for mean temperature only) is also plotted. For mean temperature, 30 of the 47 models in the analysis have an S_score greater than 0.70, with a total of 15 models scoring greater than 0.80. By far the best model is MIROC4h, with a skill score of 0.90 (Fig. 6). IPSL-CM5A-MR is close in skill score (0.87), followed by CESM1(FASTCHEM) (0.83), MPI-ESM-MR (0.82), and MIROC5 (0.82). For comparison, the five models with the weakest S_score are GFDL CM2.1 (0.58), FGOALS-g2 (0.57), GFDL-ESM2M (0.55), GFDL-ESM2G (0.55), and BCC_CSM1.1 (0.55). Although all CMIP5 models capture more than 50% of the observed distribution in mean temperature for this period, the wide range of S_score values illustrates the considerable modeled variability, thus strongly supporting the necessity for omitting weak models from an ensemble as these weak models strongly bias the skill of the ensemble. The S_score for the multimodel ensemble is 0.72, which is substantially lower than the S_score for the five high-confidence models, ERA-Interim (0.93), and the median CMIP5 S_score of 0.74.
FIG. 6. Simulated PDF S_score for monthly mean, maximum, and minimum temperature as modeled by 47 CMIP5 models. The five highest S_score values for each variable are plotted individually (see legend). The solid red line is the multimodel median value. The bottom and top of the box indicate first and third quartiles, and bars extend to 1.5 interquartile ranges outside of the quartiles.
Unlike the PDF distributions of monthly mean temperatures, only 5 of the 47 models for minimum temperature have an S_score greater than 0.80 (Fig. 6). This indicates that minimum temperatures are less well represented overall, and this is similarly indicated by a low multimodel ensemble S_score of 0.65. MIROC4h performs the best (0.88), followed by GISS-E2-R-CC (0.82) and GISS-E2-H-CC (0.82). The three weakest performing models are INM-CM4.0 (0.44), IPSL-CM5A-LR (0.43), and IPSL-CM5B-LR (0.36). EC-EARTH, NorESM1-ME, CanCM4, and GFDL CM2.1 are not included in this analysis because of missing data. The low skill scores across the models at simulating the distribution of observed minimum temperatures are caused by the large negative (cold) biases in the winter (DJF) months. Only five models have an S_score greater than 0.50 when only considering the winter seasons. This is consistent with Figs. 3 and 4 and other studies (see, e.g., Annan et al. 2005; Gao et al. 2011; Ji and Kang 2013; Su et al. 2013; Jiang et al. 2015) that have shown the significant cold bias across China in the majority of the CMIP5 models is further exaggerated in winter. In contrast, the overall performance of the models in representing maximum temperature is much better (Fig. 6), with a high multimodel ensemble S_score of 0.76; 13 of the 47 models generate a skill score in excess of 0.80. CESM1(BGC) (0.94), CCSM4 (0.93), and CESM1(FASTCHEM) (0.93) have the strongest skills, while IPSL-CM5A-MR (0.56), GFDL-ESM2G (0.54), and GFDL-ESM2M (0.54) are the weakest. NorESM1-ME, CanCM4, and GFDL CM2.1 are not included in this analysis because of missing data.

e. Atmospheric circulation pattern
We also examine the atmospheric circulation pattern (500-hPa geopotential height and wind vectors) over Asia for the 1979-2005 period. This is important because the climate of this region is strongly affected by intra- and interannual atmospheric circulation variability [e.g., the East Asian monsoon and El Niño-Southern Oscillation (ENSO); Huang et al. 2012; Gong et al. 2015; Zhang 2015]. Therefore, a correct representation of the regional circulation pattern can also indicate model robustness.
The 500-hPa geopotential height for winter (December-February) and summer (June-August) months compared to ERA-Interim are shown in Figs. 7 and 8, respectively. Overall, the biases in geopotential height are mainly negative (as shown by the multimodel ensembles), and it is clear that there are large intermodel differences in the magnitude of the geopotential height biases for both summer and winter. During the winter (Fig. 7), the majority of the CMIP5 models have a negative geopotential height anomaly (exceeding −30 gpm) over Asia, generally resulting in a stronger subtropical jet stream over South Asia and south of the Sichuan basin (see, e.g., MRI-CGCM3). The biases in geopotential height coincide with the areas strongly affected by intra- and interannual atmospheric circulation variability. This is demonstrated in Fig. 9, which shows the mean intra-annual 500-hPa geopotential height standard deviation compared to ERA-Interim. Overall, all of the models compare well with the spatial pattern in variability; however, the magnitude of intra-annual variability in the 500-hPa geopotential height is generally overestimated by the majority of the models. Consistent with Fig. 9, therefore, the highest negative biases shown in Fig. 7 generally coincide with the East Asian trough region. As a consequence, the majority of the models have a statistically significant (p < 0.01) negative west-east (27 models in total) and a negative south-north (31 models in total) gradient in geopotential height bias. During the summer (Fig. 8), the subtropical jet stream moves north of the Sichuan basin, which is well captured by all of the models; however, there are large intermodel differences in geopotential height.
Figure 8 indicates that those models with negative anomalies over the East Asian-northwestern Pacific region in the summer generally have stronger northeasterly flow over eastern China (see, e.g., IPSL-CM5B-LR), while those models with positive anomalies generally have a stronger zonal subtropical jet [see, e.g., CESM1(WACCM)] compared to ERA-Interim. Although the biases are less negative during the summer (typically ~−7 gpm) than they are in winter, the majority of the models also have a statistically significant negative west-east gradient (34 models in total) in geopotential height bias. In summary, although some of the CMIP5 models are able to reproduce the mean midtropospheric zonal flow (e.g., geopotential heights are consistent with reanalysis), the vast majority of the models have an obvious negative bias in geopotential heights, with the highest negative biases over the East Asian trough region. Annually, this results in stronger zonal winds and positive anomalies along the climatological East Asian jet stream. These biases are already well documented in the literature (see, e.g., Song and Zhou 2014; Gong et al. 2014, 2015). It is important to note, therefore, that while the majority of the CMIP5 models have a correct representation of the spatial pattern and variability of the 500-hPa geopotential height, clear biases exist in the magnitude of these patterns. It appears that the negative bias in the geopotential heights is consistent with the regional cold biases in the lower troposphere (especially in winter), which is likely further exacerbated in the Sichuan basin by the lack of locally resolved topographical features. However, there is a very weak correlation in winter (0.14 and p = 0.33) and summer (0.30 and p = 0.04) between the geopotential height biases and lower tropospheric biases over the Sichuan basin.
This may be because of several complex microscale and local dynamical features (e.g., the distribution of cloud cover, the urban heat island effect, and unresolved topography) all contributing to the modeled negative temperature biases. Since there are negligible correlations between temperature and geopotential height biases for both seasons, we use the geopotential height biases alone as an independent check for model performance.
By ranking each of the models and giving them an aggregated score for winter and summer geopotential height biases and winter and summer latitudinal and longitudinal gradient biases, we find that the five models with the highest score are, in descending order, CESM1(CAM5), MPI-ESM-P, CanESM2, MPI-ESM-LR, and ACCESS1.3. Although this ranking provides an indicator for those models that better represent the circulation over Asia, only the CESM1(CAM5) model has an imprint of the Sichuan basin in the temperature field (Fig. 5).

f. CMIP5 model selection for future climate modeling
Based on the above model evaluation, it is clear that the vast majority of the GCMs in the CMIP5 repository can be removed from further analysis. Only a few of the models are capable of reproducing either the general circulation or temperature, and therefore, we have low confidence in future temperature projections for the Sichuan basin for those models that perform poorly in the past (when compared to the station data and ERA-Interim). Similarly, the ensemble of all CMIP5 models does not show spatial and intra-annual variability and distributions of temperature in the Sichuan basin consistent with the observations and ERA-Interim. Temperature in the Sichuan basin is poorly represented by the ensemble because the spatial variations in this region are smoothed out when the CMIP5 models are averaged together (also see Jiang et al. 2015). This is an inherent limitation of the CMIP5 ensemble itself; the complex topographical features of the Sichuan basin are smoothed, resulting in biases (Fig. 4) and a spatially muted temperature field (Fig. 5) greater than many of the individual models that do resolve the basin. Enhancing environmental safety and protection in the Sichuan basin and enabling more efficient and economic planning by decision-makers requires accurate climate predictions that cannot be adequately resolved by the ensemble. Therefore, we do not use the multimodel ensemble for the projected temperature analysis.
We subsequently use an aggregated skill score to help select a subset of the CMIP5 models for future temperature projections. The regional mean annual temperature trends (Fig. 2); the mean annual temperature biases (Fig. 4); the mean, maximum, and minimum temperature S score (Fig. 6); and a combined score for summer and winter geopotential height biases and latitudinal and longitudinal gradient biases were summed together to get an aggregated skill score. Therefore, five CMIP5 models have been identified as the most suitable at representing the spatial variability, the interannual trends, and intra-annual variability for mean, maximum, and minimum temperature in the Sichuan basin. These are MIROC4h, IPSL-CM5A-MR, CESM1(FASTCHEM), MIROC5, and CMCC-CM and have subsequently been ordered according to their mean annual temperature bias. All five of these models have an imprint of the Sichuan basin in the temperature field (Fig. 5).
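The aggregation described above can be sketched as a simple rank-sum. This is only one plausible reading: the text states that the component scores were summed but does not give the scoring rule, so the point scheme (worst model receives 1 point, best receives N points), the function name, the metric names, and the values below are all assumptions made for illustration.

```python
def aggregate_rank_score(metrics, higher_is_better):
    """Sum per-metric rank points for each model.

    metrics: {metric_name: {model_name: value}}
    higher_is_better: {metric_name: bool}
    For every metric the worst model gets 1 point and the best gets N points
    (an assumed scheme); the points are then summed across metrics.
    """
    totals = {}
    for name, values in metrics.items():
        # Order models from worst to best for this metric.
        ordered = sorted(values, key=values.get,
                         reverse=not higher_is_better[name])
        for points, model in enumerate(ordered, start=1):
            totals[model] = totals.get(model, 0) + points
    return totals


# Hypothetical values for three placeholder models, not results from the study.
metrics = {
    "s_score": {"model_a": 0.90, "model_b": 0.70, "model_c": 0.80},
    "abs_temperature_bias": {"model_a": 1.0, "model_b": 3.0, "model_c": 2.0},
}
higher_is_better = {"s_score": True, "abs_temperature_bias": False}

totals = aggregate_rank_score(metrics, higher_is_better)
ranking = sorted(totals, key=totals.get, reverse=True)
```

A rank-based sum keeps metrics with different units (S score, °C, gpm) comparable, at the cost of discarding how far apart the models are within each metric.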
In selecting these five models, in the future, we aim to fully force a regional climate model with these global model outputs in order to better understand regional atmospheric circulation and near-surface temperature and precipitation. Therefore, model assessment results

Projected decadal temperatures
Having identified five models as having high skill at replicating the observed historical variability of the climate of the Sichuan basin, projected changes in temperature during the twenty-first century will now be presented. These data are projected for three twenty-first-century decades: 2030s, 2060s, and 2090s. All models have been bias adjusted throughout the twenty-first century according to their mean, maximum, and minimum annual temperature bias (as shown in Fig. 4).
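A minimal sketch of such a bias adjustment is given below, assuming the simplest form: a constant offset equal to the difference between the modeled and observed historical means, removed from every projected value. The text does not specify the adjustment beyond the use of the annual biases in Fig. 4, so the function name and the sample values are hypothetical.

```python
from statistics import fmean


def bias_adjust(projection, model_hist, obs_hist):
    """Remove a constant mean bias from projected temperatures.

    bias = mean(model historical) - mean(observed historical);
    a cold-biased model has a negative bias, so its projections are raised.
    """
    bias = fmean(model_hist) - fmean(obs_hist)
    return [t - bias for t in projection]


# Hypothetical annual mean temperatures (deg C), for illustration only.
model_hist = [15.0, 16.0, 17.0]   # model is 3 deg C too cold on average
obs_hist = [18.0, 19.0, 20.0]
adjusted = bias_adjust([18.0, 19.0], model_hist, obs_hist)
```

The same function would be applied separately to the mean, maximum, and minimum temperature series, each with its own bias. A constant offset preserves the modeled trend and variability and corrects only the mean state.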
Future temperature projections for the five models show that there will be continued warming within the Sichuan basin, with a range in the rate of warming between 0.30° and 0.87°C decade⁻¹. Table 3 shows that under the future RCP4.5 and RCP8.5 scenarios, the Sichuan basin will be about 1°C warmer than the observed 2000s decadal mean temperature of 18.35°C by 2030. The projections indicate that this warming will continue past 2060, and by the end of the century, the mean annual temperature will have likely exceeded 20°C, resulting in the Sichuan basin being approximately 4°C warmer than it was at the beginning of the century. Under the RCP8.5 scenario, the IPSL-CM5A-MR model suggests that by 2090, the mean annual temperature will have exceeded 25°C, while both MIROC5 and CMCC-CM estimate a mean annual temperature of around 23°C. Similarly, by the end of the century, all models (which have data) for both RCP scenarios indicate that the annual maximum and minimum temperatures will have exceeded 25° and 17°C, respectively. The increase in maximum temperature is projected to become more pronounced than that in minimum temperature for the two RCP scenarios, indicating an increased likelihood of heat waves in the future. These changes may have a serious impact on the energy demand in this region, which has similarly been found in other regions of China (see, e.g., Xu and Xu 2012; Wu and Huang 2016; Zhou et al. 2016). Table 3 indicates that the increasing temperatures will be more pronounced under the high RCP8.5 emission scenario than under the moderate RCP4.5 emission scenario. This also suggests a higher likelihood of drought under RCP8.5. Assuming RCP8.5 is the most likely future greenhouse gas emission scenario (Sanford et al. 2014), Fig. 10 shows that the greatest warming occurs over the central flat plains of the Sichuan basin, with the highest temperatures between the Tuojiang and Jiang Rivers north of the Yangtze River (e.g., the Dazu, Neijiang, and Ziyang districts). Although the spatial extent and magnitude of the decadal annual mean temperatures greatly differ among the five CMIP5 models, this can be attributed to underlying topography and the shape and elevation range of the Sichuan basin. The CMIP5 projections (Table 3 and Fig. 10) for temperature are consistent in that there will be sustained warming across this region, but with greatest warming experienced in the center of the basin, throughout the twenty-first century. We also calculated the mean decadal temperatures for China as a whole and find that the change in temperature for the basin between 2030 and 2090 (the range in temperature change is 1.27°-5.44°C) is consistent with the estimated countrywide warming (the range in temperature change is 1.70°-4.82°C). The implications of increasing temperatures and drought conditions in the future will, therefore, be just as relevant for China's economy and social development in the Sichuan basin as they are elsewhere.

FIG. 9. Difference between the mean intra-annual standard deviation of the 500-hPa geopotential height (gpm) simulated by the CMIP5 models and ERA-Interim over 1979-2005. (bottom right) The standard deviation of ERA-Interim annual 500-hPa geopotential height.
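Warming rates expressed in °C per decade, such as those quoted above, can be obtained from an annual series with an ordinary least squares fit. The sketch below assumes that approach; the text does not state how the rates were derived, and the function name and the synthetic series are hypothetical.

```python
def trend_per_decade(years, temps):
    """Least squares slope of temperature against year, scaled to deg C per decade."""
    n = len(years)
    year_mean = sum(years) / n
    temp_mean = sum(temps) / n
    sxy = sum((y - year_mean) * (t - temp_mean) for y, t in zip(years, temps))
    sxx = sum((y - year_mean) ** 2 for y in years)
    return 10.0 * sxy / sxx


# Synthetic series warming at exactly 0.072 deg C per year, for illustration only.
years = list(range(2006, 2101))
temps = [18.0 + 0.072 * (y - 2006) for y in years]
rate = trend_per_decade(years, temps)
```

For this synthetic series the fitted rate recovers the prescribed 0.72°C per decade; with real model output the fit would also carry an uncertainty that this sketch does not estimate.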
To further uncover changes in temperature extremes, Fig. 11 shows the frequency of months with a mean temperature ≥28°C, with each panel split in 29-yr periods from 1890 to 2099. These results emphasize basinwide significant changes in temperature extremes consistent with the future maximum, mean, and minimum warming (Table 3). From Fig. 11, we can determine that the frequency of extreme months is expected to increase in the twenty-first century (post 2010) at a faster rate compared to the twentieth century. For example, the largest increase in the frequency of months with a mean temperature ≥28°C from one period to the next in the past, equivalent to 16 months [or an increase in frequency by 4.4%; CESM1(FASTCHEM)], occurred between 1950-79 and 1980-2009. By comparison, Fig. 11 shows that we can expect jumps in frequency of 8.9% (MIROC5) and 12.5% (CMCC-CM) between 2010-39 and 2040-69. Therefore, throughout the twenty-first century, the frequency of extreme periods is set to increase, with the greatest chance of heat waves in the central plains of the Sichuan basin. Figure 11 also indicates that periods of extreme temperature will increase in frequency more in the southeast region of the Sichuan basin than in the northeast, which is a consequence of the higher topographic Tibetan Plateau boundary compared to the relatively flatter and less steep features in the south and east. The region can also expect a decrease in cold periods (not shown) because of the upward trend in minimum temperatures. Comparing the past and future frequency of hot periods across all six sites investigated, Chengdu will experience the smallest increase in the number of extreme periods, since the city is located at a higher elevation and closer to the Tibetan Plateau.
In contrast, the IPSL-CM5A-MR, MIROC5, and CMCC-CM models indicate that by 2100 all three summer months (June, July, and August) in the city of Chongqing will have a mean monthly temperature exceeding 28°C, whereas today just one summer month exceeds this threshold. Such significant future climate changes in the Sichuan basin will have serious implications, including localized environmental degradation, substantial hydrological impacts, and potentially severe human deprivations. Therefore, understanding future climate change in this region is of concern for both the scientific community and policy makers.
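The extreme-month frequency discussed above can be sketched as a count of months at or above a threshold within fixed multidecadal periods. The sketch assumes periods of 30 calendar years (e.g., 1950-79), which matches the quoted figures, since 16 months out of 360 is about 4.4%. The function name and the synthetic series are hypothetical.

```python
def extreme_month_frequency(monthly, start_year, period_years=30, threshold=28.0):
    """Count months with mean temperature >= threshold in consecutive periods.

    monthly: monthly mean temperatures (deg C), starting in January of start_year.
    Returns {(first_year, last_year): (count, percent_of_months)} for every
    complete period; a trailing partial period is ignored.
    """
    months_per_period = 12 * period_years
    result = {}
    for i in range(0, len(monthly) - months_per_period + 1, months_per_period):
        chunk = monthly[i:i + months_per_period]
        count = sum(1 for t in chunk if t >= threshold)
        first_year = start_year + i // 12
        last_year = first_year + period_years - 1
        result[(first_year, last_year)] = (count, 100.0 * count / months_per_period)
    return result


# Synthetic series: no extreme months in 1950-79, 16 extreme months in 1980-2009.
series = [20.0] * 360 + [28.5] * 16 + [20.0] * 344
freq = extreme_month_frequency(series, start_year=1950)
```

Differencing the percentages of consecutive periods gives the period-to-period jumps in frequency that are compared in the text.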

Conclusions and discussion
The objective of this study was to evaluate the performance of 47 CMIP5 GCMs in simulating surface air
2) For seasonal and annual mean temperatures, the GCMs show substantial cold biases over the region (generally exceeding −3.0°C), especially during winter. The spatial pattern of temperatures over this region is shown to be partly dependent on the horizontal resolution of the individual models, and those models with a better representation of the Sichuan basin tend to have smaller temperature biases overall.
3) The future temperature projections for the Sichuan basin, using the five most skillful models with respect to the regional climate, indicate that the RCP8.5 scenario exhibits a consistent increase in annual temperature during the twenty-first century at an average rate (across the five models) of +0.72°C decade⁻¹. By the end of the twenty-first century, temperature is projected to have increased by approximately 4°C, with the largest warming located over the central plains of the Sichuan basin. The warming experienced within the basin during the twenty-first century is consistent with the projected warming for China as a whole.
4) The absolute frequency of extreme months (where mean temperature exceeds 28°C), and the rate of change in the frequency of extreme months, will be greater in the latter part of the twenty-first century than it ever was over the twentieth century.
The conclusions presented here are consistent with those from previous studies (see, e.g., Annan et al. 2005; Chen et al. 2011; Gao et al. 2011; Ji and Kang 2013; Su et al. 2013; Chen and Frauenfeld 2014; Gu et al. 2014, 2015; Jiang et al. 2015) across China. However, compared to previous studies that have used a multimodel ensemble mean, this study is more comprehensive by providing a detailed analysis of historical and future projected temperature changes across the entire CMIP5 repository. The results presented here indicate that the use of an ensemble does introduce uncertainties, at least with regard to the magnitude of regional intra-annual temperature and spatial variability, in comparison to the five individual skillful climate models. This is unsurprising as the forcing and the feedbacks produced by a multimodel ensemble might misrepresent some component of external and internal forcing, resulting in misleading results when compared to observed temperature changes. Therefore, the results of the multimodel ensemble are tentative. It is also clear that the ensemble mean is determined by the large intermodel spread and climate (temperature) sensitivity across this region. The ensemble average is not especially close to any of the individual CMIP5 models, and the results presented here may be different for a different-sized ensemble. As an alternative, and although similar uncertainties may also exist for individual models, it is possible to deduce high-confidence models by removing those models that do not give a realistic interpretation of the historical climate of the Sichuan basin and thus assume they will not give an accurate representation of the future climate. This study highlights that most of the models in the CMIP5 data repository exhibit varying degrees of skill and that determining the best overall model is difficult, given that the "best" depends on the application required and region of interest.
Based on this analysis and various statistical measures, there are two CMIP5 models that noticeably better simulate historical surface air temperature variability over this region. These two models are MIROC4h and IPSL-CM5A-MR. Since we have demonstrated that these two models have a reasonable coverage of the range of changes and in spatial characteristics of mean temperature across the Sichuan basin, they could be used for future downscaling experiments in this region using a finescale regional climate model with a better representation of local topography and for multiple future emission scenarios. Such conclusions only apply to this basin and may not be valid for other locations in China.
Although we can have confidence that some of the models simulate the correct magnitude and sign of low-frequency change of warming over the Sichuan basin, the vast majority of the CMIP5 models do not accurately capture the intra-annual and spatial variability, or distributions, of mean, maximum, and minimum temperature. These cold biases have also been reported in previous studies across China (Annan et al. 2005; Gao et al. 2011; Ji and Kang 2013; Su et al. 2013) and are a likely result of several types of uncertainties. First, uncertainties in the historical simulations may arise from the limited representation of high tropospheric pollution and aerosol concentrations in the Sichuan basin, their physical (e.g., transport, sources, and sinks) and chemical processes, and their connection to other various direct and indirect (e.g., clouds) natural and anthropogenic forcings. Therefore, the different magnitudes of the cold biases between models indicate that a common deficiency among the CMIP5 models still exists in reproducing climatic and microscale features (e.g., clouds, the urban heat island effect, pollution, and urbanization) in such a highly spatially heterogeneous and complex terrain. Further in-depth analysis of these features and their impact on, for example, the surface energy balance is required in the future to explain such differences between different models. Second, since topography strongly controls the climate of this region, local extremes may be smoothed in the GCMs when calculating annual and monthly extremes, and this implies that models may fail to represent surface-cloud feedbacks over this region. It is beyond the scope of this study to examine this, but temperature anomalies caused by deficiencies in climate models simulating cloud properties in China have been similarly observed by Zhou and Li (2002), Yu et al. (2004), and Chen and Frauenfeld (2014).
Since a systematic cold bias exists in all of the models, which is further exaggerated in winter and in the coarse-resolution models, this indicates that both the localized dynamical and thermodynamical effects of this complex region are not fully captured. In particular, it is already known that the Sichuan basin has one of the highest fractions of stratiform clouds of anywhere, particularly in winter (Klein and Hartmann 1993; Li and Gu 2006). The combination of (generally) flatter topography with the use of poor physical cloud parameterization schemes means that it is unlikely that the CMIP5 models accurately simulate the distribution of clouds across the Sichuan basin. Simulated stratiform clouds that are optically too thin (or missing altogether) can account for the significant cold biases via a strong negative longwave radiation bias throughout the year (which is further accentuated in winter). Although a weak and nonstatistically significant relationship was found, the near-surface cold biases are also likely related to the biases seen in the 500-hPa geopotential height, as it has previously been shown how the East Asian winter monsoon and its relationship with tropical teleconnections can impact the climate of China (Gong et al. 2014, 2015; Song and Zhou 2014; Wu and Zhou 2016). Therefore, further work is required to fully determine the role of midtropospheric flow on temperature variability in the Sichuan basin. Finally, uncertainties also exist with the station dataset; the observations are not evenly distributed across the basin and are biased toward the city of Chongqing and the urban environment. Understanding the magnitude and the reasons for these uncertainties and large intermodel differences in future temperature projections is important because of the potential impact of increased drought conditions on human welfare and the environment through decreasing river discharge and water availability and increasing risk of heat waves.
If the dynamical and thermodynamical effects in this region were represented reasonably in the models, the simulated patterns could be improved greatly.
Further research, particularly regarding ensemble projections using high-resolution regional climate models and an analysis of the uncertainties related to the model spread, is needed for a better understanding of the future changes over the Sichuan basin. Effective use of downscaling techniques should provide more confidence in the future. In addition to this, a good representation of recent and present climate is a necessary condition for confidently predicting future climate. Here we focused on mean temperature, variability, and trends within the Sichuan basin of China. To identify reasons for model biases and spread and to further reduce uncertainties of future predictions, this analysis should be extended to variables other than temperature. Besides temperature, precipitation is also an important quantity with economic impact in this region, especially as the climate of the basin is dominated by the East Asian summer monsoon (Zhang 2015), with summer precipitation (May through September) accounting for approximately 74% of the annual rainfall across the Sichuan basin. This seasonal concentration of rainfall results in higher risk of floods and droughts, and improving the predictability of summer precipitation is thus important for prevention of climate-related risks. In addition to this, clouds and aerosols impact the radiation budget and simulated temperature, and therefore, the intermodel representation of chemical and cloud processes is a key area for future research.
The evaluation of climate models against observed data is an important step in building confidence in their use for future impact assessments. It is useful to identify the Sichuan basin as responsive and possibly vulnerable to future changes in temperature and the frequency of extreme months (where mean temperature exceeds 28°C) because of the potential impacts on human societies. The results presented here can help facilitate the development of adaptation strategies to reduce climate-related stresses and risks. The intensity of the climate extremes within the Sichuan basin is likely to be reduced as greenhouse gas and pollution concentrations and emissions decrease in the future (Xu and Xu 2012; Ji and Kang 2015).