Evaluation of Eight Current Reanalyses in Simulating Land Surface Temperature from 1979 to 2003 in China

Land surface temperature T s provides essential supplementary information to surface air temperature, the most widely used metric in global warming studies. A lack of reliable observational T s data makes assessing model simulations difﬁcult. Here, the authors ﬁrst examined the simulated T s of eight current reanalyses based on homogenized T s data collected at ; 2200 weather stations from 1979 to 2003 in China. The results show that the reanalyses are skillful in simulating the interannual variance of T s in China ( r 5 0.95) except over the Tibetan Plateau. ERA-Interim and MERRA land versions perform better in this respect than ERA-Interim and MERRA. Observations show that the interannual variance of T s over the north China plain and south China is mostly inﬂuenced by surface incident solar radiation R s , followed by precipitation frequency, whereas the opposite is true over the northwest China, northeast China, and the Tibetan Plateau. This var- iable relationship is well captured by ERA-Interim, ERA-Interim land, MERRA, and JRA-55. The homogenized T s data show a warming of 0.34 8 Cdecade 2 1 from 1979 to 2003 in China, varying between0.25 8 and 0.42 8 Cdecade 2 1 for the eight reanalyses. However, the reanalyses substantially underestimate the warming trend of T s over northwestChina, northeastChina, and the TibetanPlateau and signiﬁcantly overestimate the warming trend of T s over the north China plain and south China owing to their biases in simulating the R s and precipitation frequency trends. This study provides a diagnostic method for examining the capability of current atmospheric/land reanalysis data in regional climate change studies.


Introduction
Near-surface air temperature T a provides key evidence for global warming and has been widely used to study global climate change. The observed global warming has been mainly attributed to anthropogenic greenhouse gases. Gradually, scientists and the public become more interested in climate change in the region in which they live. The mechanism underlying regional climate change is more complicated because it is determined by greenhouse gases and multiple feedbacks (Zhou and Wang 2016a,d). Among them, the effect of land surface energy feedback on regional climate change has been studied less frequently and is a more challenging topic.
Land surface radiation budget is even a stronger determining factor of the evolution of land surface temperature T s . After the sun rises, the surface is heated by surface-absorbed solar radiation. To balance the surfaceabsorbed radiation, the surface emits longwave radiation, which is determined by T s . The surface net radiation is partitioned into latent and sensible heat fluxes, and the latter directly warms the air over the surface (Zhou and Wang 2016c). Surface vegetation coverage and its wetness play an important role in the partitioning of the surface net radiation and therefore changes of T a and T s at regional scales.
Nowadays, the current reanalysis products have been widely used in climate studies as important auxiliary observations (Thorne and Vose 2010;Dee et al. 2011a). In reanalyses, the energy budget is closely associated with T s (Viterbo and Beljaars 1995;Trigo et al. 2015). Although the importance of T s has been recognized, the accuracy of reanalyzed T s products over land is not well understood, mainly because of the lack of global station-based T s observations.
The examination of simulated T s from reanalyzed models can provide insight into the parameterization of land surface processes Trigo et al. 2015). Using limited station-and satellite-based T s observations, several existing studies have revealed that surface incident solar radiation R s dominates the change in T s and that surface-atmosphere conditions, including land cover, soil moisture, soil texture, and the temperature-moisture profile, substantially influence T s (Wen et al. 2003;Weng et al. 2004;Xiao and Weng 2007;Good 2016;Guo et al. 2016).
A few studies have revealed an underestimation of the diurnal variance in the modeled T s over sparsely vegetated regions (Zeng and Dickinson 1998;Yang et al. 2002;Chen et al. 2010Chen et al. , 2011Zheng et al. 2012;Trigo et al. 2015). Much effort has been put forth to develop T s parameterizations for land surface models. Yang et al. (2002) reparameterized surface roughness (controlling the latent and sensible heat fluxes) to improve the available energy partitioning and to further meliorate the diurnal variance of the modeled T s over the Tibetan Plateau and dry land of China (Chen et al. , 2011.  improved the modeled T s over bare soil surfaces by modifying the surface roughness and constraining the minimum friction velocity. Apart from these revisions in land surface model parameterizations,  indicated that atmospheric forcing data and assimilation systems have an important impact on the modeling of T s . Zheng et al. (2012) found that the NCEP operational Global Forecast System (GFS) has a large cold bias in daytime T s over the western United States and adopted new vegetation-dependent formulations of the momentum and thermal roughness lengths to reduce this bias. Trigo et al. (2015) found a slight cold bias in daytime T s and a warm bias in nighttime T s for the ECMWF model. Surface roughness (controlling the latent and sensible heat fluxes) and surface conductivity (controlling the ground heat flux) were revised to substantially improve the modeled T s , although the introduction of realistic vegetation into the model caused a limited improvement in the simulated T s (Trigo et al. 2015).
Prior evaluations and improvements have mainly been based on comparison of the absolute value of the modeled T s using only a few T s observations at a limited number of stations. Because T s is not routinely measured at meteorological stations under the guidance of the World Meteorological Organization (WMO), T s observations are only available for a relatively brief period at a few stations, for example, from the National Oceanic and Atmospheric Administration's Surface Radiation Budget Network (NOAA SURFRAD) (Augustine et al. 2000), FLUXNET (Baldocchi et al. 2001), and the GEWEX Asian Monsoon Experiment (GAME/Tibet) (Ma et al. 2006). In China, however, T s has been routinely measured since the 1950s, and there are enough available T s data since the 1960s at approximately 2200 stations. This makes it possible to perform a comprehensive assessment of the modeled T s in reanalyses, including the climatology, interannual variance, long-term trend, and their controlling factors.
Reanalyses consist of an assimilation system, forecast model, and observation data; thus, the error in reanalyzed outputs is mainly derived from these three components. The above attempt to revise the parameterizations of the surface roughness, surface conductivity, and friction velocity has improved the simulation of the absolute value of T s , although it remains uncertain whether reanalyses can reproduce the long-term trend (Thorne and Vose 2010;Dee et al. 2011a). Thus, the extent to which reanalyses can represent the observed trend in T s and the factors affecting the simulated bias in the T s trend need to be investigated. This examination of the simulated T s trend is closely related to Earth's surface energy budget and hence to the simulated capacity of climate change by reanalyses.
A previous work (Zhou and Wang 2016b) has utilized satellite-derived T s to reveal an underestimation of the T s trend from 2002 to 2015 over approximately 70% of the global desert in the current reanalyses and examined the relationships between interannual variations of T s and atmospheric circulation changes. However, satelliteretrieved T s data, based on infrared observations, are only available under clear sky conditions and are of short duration.
Using longer datasets that include T s , R s , and precipitation data from 1979 to 2003 at approximately 2200 stations over China, this study further provides a quantitative examination of the reanalyzed T s , including the climatology, interannual variances, long-term trends, and their controlling factors, from eight reanalysis products. This study quantifies the T s errors and their dominant factors in reanalysis products, hence providing a better understanding of the modeled T s error sources and indicating possible improvements for simulating climate change.

a. Observation datasets
The latest comprehensive daily dataset, including land surface temperature T s , precipitation P, 2-m air temperature T, sunshine duration, relative humidity, and surface pressure from approximately 2400 meteorological stations in China from 1961 to 2014, was obtained from the China Meteorological Administration (CMA; http://data.cma.cn/data). This dataset has been subjected to the initial quality control by the CMA, which includes the identification of outliers, an internal consistency check, spatial and temporal consistency checks, artificial checks, and correction of suspected and erroneous data.
To reduce the effects associated with sampling and record length, the stations were required to have at least 95% of the record duration at all time scales (i.e., $28 days per month, $347 days per year, and $24 years during the study period of 1979-2003). As a result, approximately 2200 stations met the requirements and were used in this study (supplemental Fig. S1).
Sunshine duration has been suggested to be the best variable for reconstructing the long-term surface incident solar radiation R s based on the revised Ångström-Prescott Eq. (1) (Yang et al. 2006;: where n is the measured sunshine duration, N is the theoretical value of sunshine duration, and R c is the daily total solar radiation at the surface under clear sky conditions. Here, the sunshine duration represents the period when the direct solar beam irradiance exceeds 120 W m 22 . The effects of Rayleigh scattering, water vapor absorption, and ozone absorption can be considered in R c using previously discussed meteorological observations (Yang et al. 2006;. Several intensive studies have reported that R s derived via sunshine duration can accurately depict the interannual, decadal, and long-term variances of R s . In addition, this R s derivation has been shown to accurately reflect the impact of aerosols and clouds on R s over China (Tang et al. 2011;. The derived R s is nearly free from the sensitivity drift problem that commonly plagues R s observations over China . More details on the derived R s can be found in Yang et al. (2006) and . In this study, a precipitation event is defined as one day with precipitation of at least 0.1 mm.
Reanalyses consist of assimilation system, forecast model, and observation data. MERRA, NCEP-R1, and NCEP-R2 adopt three-dimensional variational data assimilation systems (3D-VAR), whereas ERA-Interim, ERA-Interim land, ERA-20C, JRA-55, and MERRA land employ four-dimensional variational data systems (4D-VAR). ERA-20C only assimilates surface pressure and marine winds (Poli et al. 2016), and the other reanalyses assimilate many of the basic upper-air atmospheric fields, including air temperature, surface pressure, marine winds, radiosonde moisture, satellite radiances, and so on, from multiple sources.
The reanalyzed T s values are pure model calculations without assimilating T s observations owing to the lack of a global T s observing network (Zhou and Wang 2016c). In a reanalysis land surface model, a skin layer without heat capacity is assumed to isolate radiative heating from underlying soil (Viterbo and Beljaars 1995); thus, T s corresponds to the temperature at the interface. The reanalyzed T s data are physically derived from the surface-atmosphere energy balance for each tile during the forecast step of the land surface model [Eq.
(2)] (Viterbo and Beljaars 1995;Viterbo et al. 1999), which links the surface with the lowest model level through dry static energy and moisture (Best et al. 2004) and thermal contact with a single four-layer soil profile (or one layer if snow is present) (Dee et al. 2011b). The surface-atmosphere energy balance equation can be written as where f Rs is a small fraction of R s transmitted directly to surface soil, snow, or ice, a is the surface albedo, « is the longwave emissivity, s is the Stefan-Boltzmann constant, R L is the downward longwave radiation, and H and LE are the sensible and latent heat fluxes, respectively. The right-hand side of the equation represents the ground heat flux through coupling with the underlying soil, snow, or ice with temperature T 0 . Moreover, C surf is the surface thermal conductivity, and T s is the land surface temperature. The area-weighted T s of each tile is then averaged from the T s of subtiles; hence, T s is a prognostic variable of the land surface model. Different reanalyses have specific schemes that might influence the simulation of T s , R s , and precipitation. NCEP-R2 adopts a simple rainfall assimilation over land for improving soil wetness to prevent long-term climate drift of soil wetness (Kanamitsu et al. 2002). The MERRA land (ERA-Interim land) T s is averaged from an offline land surface model simulation forced by bias-corrected MERRA (ERA-Interim) reanalysis (primarily for precipitation), which is considered more accurate for land surface hydrological studies (Balsamo et al. 2009;Reichle et al. 2011). Precipitation bias corrections preserve water balance closure and are vital for improving land model output (Reichle et al. 2011;Balsamo et al. 2015).
The assimilation of radiosonde temperature-moisture profiles and satellite radiances facilitates the acquisition of cloud parameters and the computation of R s and precipitation processes (Dee et al. 2011b;Dolinar et al. 2016). However, the reanalyses do not allow aerosol loadings to change annually, albeit an inclusion of their monthly climatologies .

c. Method to homogenize the observed time series
Since approximately 2003, the observation infrastructure has been changed from manual to automatic T s measurements. Owing to the absence of snow removal for automatic observations, a higher temperature might be obtained from automatic observations. Other problems related to the observation infrastructure (e.g., instrument aging and changes in observing practices) and station relocations can also lead to false time heterogeneity in time series. Therefore, the study period is selected from 1979 to 2003.
To diminish the impact of data homogenization on the trends in the observed T s , R s , and precipitation frequency, we used the RHtestsV4 software package (Wang and Feng 2013) to detect and homogenize the breakpoints in the monthly time series. The package involves two algorithms: the PMTred algorithm is based on the penalized maximal t test (PMT) with a reference series (Wang et al. 2007), and the PMFred algorithm is based on the penalized maximal F test (PMF) without a reference series (Wang 2008a). The PMFred algorithm can detect undocumented mean shifts in a time series with a linear trend using a common-trend two-phase regression model (Wang 2008a). If the breakpoint is statistically significant, the quantile-matching (QM) adjustment in RHtestsV4 is recommended for making adjustments to the time series (Wang et al. 2010;Wang and Feng 2013). The QM adjustment aims to match the empirical distributions from all detrended segments with the specific base segment (Wang et al. 2010). Moreover, the QM adjustment also considers a seasonality of discontinuity, the annual cycle, first-order autoregressive errors, and the linear trend in the time series (Wang 2008b;Wang et al. 2010;Wang and Feng 2013). Recently, the PMTred algorithm with the QM adjustment was successfully used to homogenize climatic time series (Dai et al. 2011;Tsidu 2012;Aarnes et al. 2015;Siswanto et al. 2016;Wang and Wang 2016). We adopted the PMFred algorithm to detect breakpoints at each station. As such, 714 out of 2223 (30.2%) stations have significant breakpoints at a confidence level of 95% for the T s time series and 804 out of 2223 (36.2%) stations for R s (Fig. S1). Moreover, 1268 (57%) stations have significant breakpoints for the T s or R s time series; an additional 955 (43.0%) stations have original homogenized data (Fig. S1). Considering the change in the observing instrumentation around 2003, we selected the longest available segment from 1979 to 2003 as the base segment and used the QM adjustment to adjust the significant breakpoint at a confidence level of 95%. Moreover, we preliminarily inspected the detected breakpoints with limited station information.

d. Trend calculation and partial linear regression
We interpolated the reanalyzed datasets from different spatial resolutions (Table 1) onto the observed sites using bilinear interpolation (i.e., a linear interpolation function on two-dimensional grids). When calculating the regional values, we averaged the station data inside 18 3 18 grids using an area-weighted average method to minimize the impact of the heterogeneous distribution of the sites.
To assess the performance of the reanalyzed T s , the bias, root-mean-square error (RMSE), standard deviation (STD), and correlation coefficient r were used. Furthermore, we compared the trend in the reanalyzed T s with the observed trend: where y is the reanalyzed or observed T s , R s , and precipitation frequency anomaly relative to the reference period of 1981 to 2000, a is the corresponding trend, t is the year, b is the intercept when t 5 0, and « is the equation error. This equation is treated using the ordinary least squares method and the two-tailed Student's t test.
Pearson correlation analysis is often used to reveal the relationship between two variables. However, several of the environmental variables may be linearly covariant with one another (e.g., 48% of the variance in R s might be accounted for by precipitation frequency over China). The partial least squares approach is a widely applied statistical tool to isolate the relationship between two variables from the confounding effects of several correlated variables (Wold et al. 1984;Radok and Brown 1993;Beer et al. 2010;Carvalhais et al. 2014). In evaluating the relationship between T s and R s or precipitation frequency, we used partial least squares to statistically exclude the confounding effects of the other variable: slope (x,y)jz 5 s y s x r x,y 2 r y,z r x,z where r (x,y)jz and slope (x,y)jz are the partial correlation coefficient and partial regression coefficient, respectively, between the detrended T s and R s or precipitation frequency after controlling for the detrended precipitation frequency or R s ; r x,y , r x,z , and r y,z are the correlations between the detrended T s and R s , T s and precipitation frequency, and R s and precipitation frequency, respectively; and s y and s x are the standard deviations of the detrended T s and R s or precipitation frequency. The use of the detrended time series in the above Pearson and partial linear correlation and regression analysis instead of the original nonstationary time series provides a robust estimate of their relationship (Veizer et al. 2000;Zhou et al. 2007;Podobnik and Stanley 2008).
To evaluate the potential collinearity of the independent variables in the models, we calculated the variance inflation factor (VIF). The VIF for R s and precipitation frequency were less than 4; that is, VIF of 1.90, 3.09, 1.40, 1.48, 3.12, and 2.34 are obtained over China, northwest China, the Tibetan Plateau, northeast China, the north China plain, and south China, respectively, much less than the threshold of 10, above which the collinearity of models is bound to adversely affect the regression results (Ryan 2008).

Results
a. Climatology and bias in land surface temperature Figure 1 illustrates the bias in T s from the raw observations and the eight reanalyses relative to the homogenized observations over China. The eight reanalyses roughly underestimate the absolute value of T s except for ERA-Interim land and ERA-20C over the heart of the north China plain (Fig. 1). The underestimation of T s in the reanalyses is largest over the Tibetan Plateau (210.898 to 25.448C), followed by northwest China (25.848 to 21.848C) and south China (24.218 to 22.818C), and ending with the north China plain (23.458 to 21.438C) and northeast China (22.978 to 21.368C) ( Fig. 1 and Table 2). The raw T s observations must be adjusted at more than 30% (714/2223) of the stations, especially over central China (Fig. 1). However, a small adjustment of 0.218C from the raw T s observations over 30% of the stations is insufficient for canceling the underestimation of T s by the reanalyses (Fig. 1). Overall, the eight reanalyses have different simulation capacities for the absolute values of T s ( Fig. 1 and Table 2). ERA-Interim land has the best simulation capacity for the absolute value of T s , while ERA-20C, NCEP-R1, and NCEP-R2 are the worst in this regard.
The reanalyses rationally reproduce the seasonal cycle of T s , although they have a larger underestimation in summer than in winter (Fig. 2), which is consistent with the evaluated results of reanalyzed T s using observations from the U.S. Climate Reference Network (USCRN) stations and MODIS T s products (Zhou and Wang 2016b). This underestimation likely results from a combination of vegetation growth and an increase in R s in summer with the addition of a larger overestimated precipitation frequency in summer (than in winter) in the reanalyses (Zhou and Wang 2017).
The underestimated T s might directly influence the simulation of surface air temperature. Ma et al. (2008) reported that ERA-40, NCEP-R1, and NCEP-R2 underestimated surface air temperature in China with a maximum cooling bias in most of western China and suggested that NCEP-R1 had the worst performance. Our simulation results for T s are very similar to their results and provide a plausible explanation for the underestimated surface air temperature in the reanalyses. Several previous studies have presented a very small bias in T a in ERA-Interim (Decker et al. 2012;Wang and Zeng 2012;Zhou and Wang 2016c), which is consistent with the result for T s . Wang and Zeng (2015) illustrated a larger simulated underestimation of T a in summer than in winter from simulations using ERA-Interim, MERRA, and NCEP-R1, which is reflected by the T s simulation in nature (Fig. 2).

b. Interannual variability of land surface temperature
In this study, we paid more attention to the interannual variability and trend of T s than to its climatology. Figure 3 shows Taylor diagrams of the observed and reanalyzed annual T s anomalies over China and its five subregions. We found that the correlations between these reanalyzed and observed annual T s anomalies are relatively strong (Fig. 3). Among them, NCEP-R1 and NCEP-R2 have relatively weak correlations (0.1 and 0.4, respectively) over the Tibetan Plateau.
ERA-Interim, JRA-55, ERA-Interim land, MERRA, and MERRA land perform well in terms of the simulated T s anomaly over China (r 5 0.95; RMSE 5 0.18C) (Fig. 3a) and its subregions (r 5 0.9; RMSE50.28C) (Figs. 3b-f). ERA-20C always presents a large standard deviation in T s and hence a large RMSE over China and its subregions (Fig. 3), which likely results from less assimilation of multiple observations and incomplete parameterization schemes (Poli et al. 2016).
Compared with the T s from ERA-Interim and MERRA, the T s from the land surface products (i.e., ERA-Interim land and MERRA land) have a smaller standard deviation and RMSE (Fig. 3), maybe because the offline land surface models merge the observed precipitation dataset and hence improve the partitioning of available energy over land (Reichle et al. 2011;Balsamo et al. 2015). In addition, T s over south China is depicted most accurately by the reanalyses (Fig. 3).

c. Relationship of land surface temperature to surface incident solar radiation and precipitation frequency
This section will discuss the relationship of T s against R s and precipitation frequency. In physics, the R s heats the surface, largely determining the variability of the T s at various temporal scales. In turn, the T s controls the surface-emitted longwave radiation (Wang and Dickinson 2013). Existing studies suggested that precipitation frequency is a better factor in quantifying interannual variability of soil moisture over China than precipitation amount (Piao et al. 2009;Wu et al. 2012) and then in reflecting vegetation growth and surface characteristics (e.g., surface albedo and roughness). These changes would alter the partitioning of available energy for regulating the change in T s . Additionally, upper-air raindrops (Fujita 1959;Chen and Wang 1995) or synoptic advection (Trenberth and Shea 2005;Wei et al. 2014) associated with precipitation events also have an inevitable impact on the T s . We therefore used precipitation frequency instead of precipitation amount in this study. Our following results show that precipitation FIG. 4. Composite map of the partial correlation coefficients of T s against the surface incident solar radiation R s and precipitation frequency. To avoid a spurious correlation, the detrended time series were used to estimate the partial correlation for 1979 to 2003 from the eight reanalysis products and observations: (a) homogenous observations, (b) raw observations, (c) ERA-Interim, (d) ERA-Interim land, (e) ERA-20C, (f) MERRA, (g) MERRA-land, (h) JRA-55, (i) NCEP-R1, and (j) NCEP-R2. We found that homogenization has only a slight impact on the relationship between T s and R s or precipitation frequency. frequency is a good indicator in reflecting precipitation impact on interannual variability and trend of T s .
Although the seasonal cycle of vegetation coverage impacts the simulated values of T s , current reanalyses do not allow vegetation to change annually. The T s is measured over the exposed soil surface that does change with time. Therefore, interannual relationships between R s (or precipitation frequency) and T s are expected to be similar in the model and observations. Our results below confirmed this inference. Figures 4  and 5 illustrate the partial relationships between the annual T s and R s anomalies (after controlling for the precipitation frequency) and the precipitation frequency (after controlling for R s ). The homogenization of the data has only a slight impact on the partial relationship of T s with R s and the precipitation frequency (Figs. 4 and 5).
In the observations, the interannual variance of T s is primarily regulated by the precipitation frequency and secondarily by R s over northwest China, northeast China, and the Tibetan Plateau (Figs. 4, 5, and 6). ERA-Interim, ERA-Interim land, and MERRA can coefficients between T s and R s and underestimates those between T s and the precipitation frequency over the Tibetan Plateau, northwest China, and northeast China (Figs. 4,5,and 6). In addition, none of the eight reanalyses can reproduce the observed positive partial relationship between T s and R s over the Tibetan Plateau, especially ERA-20C, NCEP-R1, and NCEP-R2 (Figs. 4, 5, and 6). According to the observations, the interannual variance of T s is dominated by R s and secondarily influenced by the precipitation frequency over the north China plain and south China (Figs. 4,5,and 6). ERA-Interim, ERA-Interim   land, ERA-20C, and JRA-55 rationally capture the observed relationships over the north China plain and south China (Figs. 4, 5, and 6). MERRA and MERRA land can represent the observed relationships over the North China Plain, and NCEP-R2 can represent the observed relationships over south China (Figs. 4,5,and 6). However, MERRA, MERRA land, and NCEP-R1 exhibit smaller correlation and regression coefficients between T s and R s over south China than those from the observations (Figs. 4, 5, and 6). The same situation occurs over the north China plain for NCEP-R1 and NCEP-R2 (Figs. 4, 5, and 6).
Overall, the observations show that the correlation and regression coefficients between T s and R s over humid regions (i.e., the north China plain and south China) are larger than those over arid and semiarid regions (i.e., northwest China, northeast China, and Tibetan Plateau), although the correlation and regression coefficients between T s and the precipitation frequency over humid regions are smaller than those over arid and semiarid regions (Figs. 4, 5, and 6).  Table 3).

d. Land surface temperature trend
However, the averaged trends across a large territory may mask regionally different values, reflecting diverse climate change trends and effects. Moreover, there is a stronger observed warming of T s over northeast China (0.578C decade 21 ; p , 0.05), northwest China (0.538C decade 21 ; p , 0.05), and the Tibetan Plateau (0.328C decade 21 ; p , 0.05) ( Fig. 7 and Table 3). Among the eight reanalyses, the increases in R s and the decreases in precipitation frequency result in the strong warming over northeast China (Fig. 7 and Figs. S2 and S3). However, the eight reanalyses generally underestimate the T s trend over these regions: 0.408-0.588C decade 21 ( p , 0.05) over northeast China, 0.068-0.378C decade 21 over northwest China, and 20.268-0.398C decade 21 over the Tibetan Plateau ( Fig. 7 and Table 3). Especially over the Tibetan Plateau, NCEP-R1 and NCEP-R2 present negative trends of 20.098 and 20.268C decade 21 , respectively ( Fig. 7 and Table 3), which contributed to the failure of NCEP-R1 to simulate surface air temperature trends over the Tibetan Plateau (You et al. 2010;Wang and Zeng 2012).
Furthermore, T s displays an observed warming of 0.298C decade 21 ( p , 0.05) over the north China plain and 0.178C decade 21 ( p , 0.05) over south China. FIG. 8. As in Fig. 7, but for trend differences in the land surface temperature T s (8C decade 21 ).
Among the eight reanalyses, the decreases in R s and precipitation frequency result in the relatively weak warming over these regions (Fig. 7 and Figs. S2 and S3). However, the reanalyses overestimate the T s warming trends over these regions [i.e., 0.298-0.528C decade 21 ( p , 0.05) over the north China plain and 0.208-0.378C decade 21 ( p , 0.05) over south China] because the reanalyses ignored the canceled effect of the increasing aerosol loading .
We further examined the impact of data homogenization on the T s trend. The T s trends derived from the original dataset are always higher than those from the homogenous dataset, especially over the north China plain and south China ( Fig. 8a and Table 3). This indicates that the homogenization might adjust the breakpoints in the time series (Wang 2008a) and help to objectively depict the T s trend, which advances the assessment of the modeled T s trend in the reanalyses.
e. Contribution of surface incident solar radiation and precipitation frequency to land surface temperature trend biases Figure 8 shows the trend differences in T s between the eight reanalyses and the homogenous observation. The overall pattern in the trend in T s is underestimated over the Tibetan Plateau and northwest China and overestimated over the north China plain and south China (Fig. 8). Evidently, the R s trends (Fig. 9) over the north China plain, south China, and northeast China are overestimated by the eight reanalyses, which primarily explains the overestimated T s trend over the north China plain, south China, and part of northeast China; for example, the spatial pattern correlation between the T s trend difference and the R s trend differences is up to 0.77 ( p , 0.05) over these regions (Figs. 8 and 9). The additional increased R s trends will heat the land surface and make the T s trend larger in the reanalyses than in the observations. The trends in the precipitation frequency are overestimated over northwest China and the Tibetan Plateau and underestimated over northeast China, the north China plain, and south China in the eight reanalyses (Fig. 10). This pattern of differences in precipitation frequency trends substantially explains the underestimated T s trend over northwest China and the Tibetan Plateau and the overestimated T s trend over the north China plain, south China, and part of northeast China; for example, the spatial pattern correlation of the T s trend difference against the R s trend differences is up to 0.61 ( p , 0.05) over China (Figs. 8 and 10). These overestimated trends in the precipitation frequency may contribute to an overestimation of soil surface moisture, allowing, therefore, more available energy for the latent heat flux and in turn leading to surface cooling (Jung et al. 2010;Wang and Dickinson 2012).
By integrating the relationship for the trend biases in T s with those of R s and the precipitation frequency over China and its five subregions (Fig. 11), it was found that the underestimated T s trends in the reanalyses mainly result from the overestimated trends in the precipitation frequency and the underestimated trends in R s over the Tibetan Plateau and northwest China, whereas the overestimated T s trends in reanalyses mainly result from the overestimated trends in R s over the north China plain and south China.
We further quantified the contribution of the trend biases in R s and the precipitation frequency to those in T s in the eight reanalyses. Over northwest China and northeast China, the trend biases in R s (precipitation frequency) can explain 56.3% (40.2%) of the trend biases in T s in the reanalyses (Fig. 12). The trend biases in T s are more largely explained by the trend biases in R s than precipitation frequency because partial sensitivity of T s to R s is more greatly overestimated by reanalyses than to precipitation frequency (Fig. 5). Over the north China plain and south China, the trend biases in R s (precipitation frequency) can explain 35.1% (26.3%) of the trend biases in T s in the reanalyses (Fig. 12). Moreover, the sensitivities of the trend biases in T s to the trend biases in R s (precipitation frequency) by 0.  (Fig. 12).
The precipitation frequency might influence R s through frequent cloud cover (Bony et al. 2015;Norris et al. 2016;Zhou et al. 2016); therefore, we further removed the collinearity of the trend in R s and the precipitation frequency while quantifying their contribution to the trend biases in T s . Following this manipulation, the trend differences in R s (precipitation frequency) can independently explain 45.2% (13.3%) of the trend biases in T s in the reanalyses over arid and semiarid regions, whereas the trend differences in R s (precipitation frequency) can independently explain 26.8% (10.8%) of the trend biases in T s in the reanalyses over humid regions. FIG. 10. As in Fig. 7, but for trend difference in the surface precipitation frequency (days decade 21 ).
In addition, the remaining part of the trend biases in T s in the reanalyses might be explained by different land surface parameters, including the surface roughness, surface conductivity, friction velocity, turbulent exchange coefficients, vegetation growth, and so forth (Zeng and Dickinson 1998;Veizer et al. 2000;Yang et al. 2006;Zhou and Wang 2016c,e).

Conclusions and discussion
The T s is one of the most important parameters regulating the energy exchange and budget between Earth's atmosphere and surface, including latent and sensible heat fluxes and longwave radiation, which are closely relevant to climate change. Several previous studies have revealed an underestimation of the diurnal variance in the modeled T s over sparsely vegetated regions; much effort has been exerted to reduce this bias (Zeng and Dickinson 1998;Yang et al. 2002;Trigo and Viterbo 2003;Chen et al. 2010Chen et al. , 2011Zheng et al. 2012;Trigo et al. 2015). However, a lack of long-term T s observations hinders the assessment and improvement of land surface models in reanalyses and hence our understanding of landatmosphere processes. In this study, using long-term T s observation from 1979 to 2003 at approximately 2200 stations over China, we first comprehensively examined simulated T s , including the climatology, interannual variances, longterm trend, and their controlling factors, from eight reanalyzed products. The T s is underestimated in the eight reanalyses, and the underestimation is the largest over the Tibetan Plateau (210.898 to 25.448C). ERA-Interim land and ERA-20C slightly overestimate T s over the heart of the north China plain mainly because of less aerosol loading in the reanalyses. In regard to the absolute value of T s , ERA-Interim has the best simulation capacity, whereas ERA-20C, NCEP-R1, and NCEP-R2 give the worst performance.
The reanalyses are skillful in simulating the interannual variance of the T s anomaly (r 5 0.95 and RMSE 5 0.18C) over China except over the Tibetan Plateau. ERA-Interim and MERRA land perform better in this aspect than ERA-Interim and MERRA. The interannual variance of T s is dominated by R s (having large partial correlation and regression coefficients) and secondarily influenced by the precipitation frequency over the north China plain and south China, whereas it is primarily regulated by the precipitation frequency (having large partial correlation and regression coefficients) and secondarily by R s over northwest China, northeast China, and the Tibetan Plateau. The interannual domination of T s is well captured by ERA-Interim, ERA-Interim land, MERRA, and JRA-55.
The T s observations present a strong warming trend over China (0.348C decade 21 ). However, the T s trend is consistently underestimated by the reanalyses over northwest China, the Tibetan Plateau, and part of northeast China and overestimated over the north China plain and south China. We further found that the trend biases in R s (precipitation frequency) can explain 56.3% (40.2%) of the trend difference in T s in the reanalyses over northwest China and northeast China and 35.1% (26.3%) of the trend biases in T s in the reanalyses over north China plain and south China. After removing the collinearity of R s and the precipitation frequency, the trend biases in R s (precipitation frequency) can independently explain 45.2% (13.3%) of trend biases in T s in the reanalyses over arid and semiarid regions and 26.8% (10.8%) of trend biases in T s in the reanalyses over humid regions. Moreover, the sensitivities of the trend differences in T s to the trend biases in R s (precipitation frequency) over arid and semiarid regions are significantly larger than those over humid regions.
Previous studies have mainly focused on the revision of land surface parameterizations, including the surface roughness, surface conductivity, and so forth, to improve the modeled T s in reanalyses (Zeng and Dickinson 1998;Veizer et al. 2000;Yang et al. 2006;. This study shows the importance of R s and precipitation frequency in determining the variability of T s , which should be studied in the near future.