Climatic warming in China during 1901–2015 based on an extended dataset of instrumental temperature records

Monthly mean instrumental surface air temperature (SAT) observations back to the nineteenth century in China are synthesized from different sources via specific quality-control, interpolation, and homogenization. Compared with the first homogenized long-term SAT dataset for China by Cao et al (2013), which contained 18 stations mainly located in the middle and eastern part of China, the present dataset includes homogenized monthly SAT series at 32 stations, with an extended coverage especially towards western China. Missing values are interpolated by using observations at nearby stations, including those from neighboring countries. Cross validation shows that the mean bias error (MBE) is generally small and falls between 0.45 °C and −0.35 °C. Multiple homogenization methods and available metadata are applied to assess the consistency of the time series and to adjust inhomogeneity biases. The homogenized annual mean SAT series shows a range of trends between 1.1 °C and 4.0 °C/century in northeastern China, between 0.4 °C and 1.9 °C/century in southeastern China, and between 1.4 °C and 3.7 °C/century in western China to the west of 105°E (from the initial years of the stations to 2015). The unadjusted data include unusually warm records during the 1940s and hence tend to underestimate the warming trends at a number of stations. The mean SAT series for China based on the climate anomaly method shows a warming trend of 1.56 °C/century during 1901–2015, larger than those based on other currently available datasets.


Introduction
Climatic warming during the past century has been evident in worldwide surface air temperature (SAT) records (Hartmann et al 2013, Jones 2016). It is clear that long-term and homogeneous instrumental SAT series are essential for quantifying the observed climate trend. Great efforts have been made for decades to develop reliable long-term SAT series for different regions of the world (Jones et al 1999, Jones 2016). The collection, compilation, and construction of long-term instrumental SAT data in China have also been on-going over recent decades (Tao et al 1991, Wang et al 1998, Cao et al 2013). However, the estimates of climatic warming trends during the last century for China in previous studies show a large range (from 0.3°C to 1.5°C/century), primarily due to various data issues, especially for the early period before 1950 (Tang et al 2009). By comparison, the regional mean SAT series based on different numbers of stations for the period since the 1950s agree well with each other (e.g. Jones et al 2008). Cao et al (2013) established for the first time a set of homogenized long-term SAT series at 18 stations in middle and eastern China and showed a regional mean warming of 1.52°C for the 1909-2010 period. Here, we update the data series and extend the dataset towards western China in particular, where observations are sparse.
The extension to a more spatially complete dataset of long-term instrumental temperature observations for China was made possible mainly by a number of early observation series becoming available for western China. At the same time, it is beneficial to update the existing dataset, as five years have passed since the early work of Cao et al (2013) was undertaken (the final year then was 2010) and the availability of SAT data in China has improved significantly in this period.
The present paper describes how the new dataset 'China Homogenized Monthly Temperature Dataset (CHMTD-V1.0) during 1873-2015' is constructed, including quality control, interpolation of missing records, and homogenization of the long-term series. The new dataset serves as an improved database for studying the geographical pattern of SAT change and, in particular, updating the estimate of the century-scale warming trend over China.
The rest of the paper is organized as follows. Section 2 introduces the sources of additional data and datasets used for comparison. Section 3 explains the basic data processing, including interpolation and homogenization techniques used to develop the dataset. Section 4 presents the results. Section 5 is a summary.

Data
The original SAT records and the associated metadata before 1951 are from two main sources. One is the early work 'Two climatic databases of long-term instrumental records of Chinese Academy of Sciences (CAS) in the People's Republic of China' (Tao et al 1991). The other is the long-term series of SAT and metadata developed since 2002 by the National Meteorological Information Center of the China Meteorological Administration (CMA), including digitized long-term instrumental records and metadata for the 60 largest cities of China. A further digitized long-term temperature dataset of 78 stations set up in 2013 is also included. The metadata, which are archived in text files for each station, include detailed information about changes in the observational instrumentation, times, locations and environment. The CMA also holds the original records of SAT and the associated metadata during 1951-2014 for over 2400 national meteorological stations across China (Cao et al 2016). Instrumental temperature records for Hong Kong and Macao are available from the websites of the Hong Kong Observatory and the Macao Meteorological and Geophysical Bureau, respectively. Instrumental temperature records for Tainan and Hengchun stations are available from the website of the Central Weather Bureau of Taiwan.
The CRUTEM4 (Jones et al 2012, Osborn and Jones 2014) database of SAT within the China area is used to compare the regional average series based on the climate anomaly method (CAM) (Jones 1994, Jones and Moberg 2003). CRUTEM4 additionally includes the previously published 18 homogenized temperature time series (Cao et al 2013). The GISTEMP dataset, which does not include the present homogenized station series from China, is also used for comparison. A principal rule for choosing stations for the present dataset is that a station must have at least 30 yr of SAT records before 1951 if it is in eastern China, or over 10 yr if it is in western China (to the west of 110°E). As shown in figure 1, the 32 stations chosen are reasonably well distributed over China. Compared with the dataset of Cao et al (2013), the new dataset includes seven more stations in western China with a relatively long observation series. For eastern China, there are also seven new stations, including two from Taiwan. Table 1 shows the station names and their abbreviations for the 32 stations in figure 1.

Data quality control and interpolation
For the original daily mean maximum and minimum temperature (Tmax and Tmin) records, we perform compilation and quality control using the method of Cao et al (2013) to determine the reliability of each station's monthly SAT series and then treat any obviously erroneous value as a missing value. Table 2 shows the percentage of missing records for the 14 new stations of the long-term monthly mean SAT series. Only two stations, Tainan and Hengchun in Taiwan, have a continuous monthly mean SAT series from their start dates to the end of 2015. Ten stations have missing rates between 0.36% and 7.44%. Wulumuqi and Tengchong have the largest missing rates (10.95% and 9.33%, respectively). Hetian station started in 1942, and has the shortest time series among all 32 stations. It is beneficial to interpolate some missing records, especially for the western part of China. To do so, records at neighboring stations (including some in neighboring countries) with instrumental observations before 1950, particularly during the 1940s, were used. After infilling all those missing records, we produced a complete temperature series for all the stations.
When undertaking the interpolation for a given station, the reference stations were chosen from 152 stations in the China area (figure 2, green dots), which are within a distance of 300 km from the candidate station and have data for more than 10 yr before 1951. For the seven stations in western China it is difficult to choose neighboring temperature series, as none may exist within China. We therefore chose 76 stations bordering China to interpolate missing records at the stations in western China. These stations are located in 11 countries: Bangladesh, India, Kyrgyzstan, Kazakhstan, Laos, Mongolia, Myanmar, Russia, Thailand, Vietnam and Nepal (figure 2, blue dots). The temperature data series at the stations outside China are from CMA's first global monthly temperature dataset over land, which was developed by integrating four existing global temperature datasets and several regional datasets from major countries or regions (Xu et al 2014). The temperature series of Kathmandu extending from January 1921 to December 1975 is archived at CRU (via P D Jones). To ensure that the temperature series at the reference stations are highly correlated with those at the candidate station, we set a distance threshold of 500 km. For Wulumuqi, Hetian and Lhasa stations, the threshold distance was enlarged to 1000 km. A three-step interpolation technique for the monthly mean SAT anomaly from the local climatological mean is applied, using one of four independent statistical approaches for each step, i.e. the standardized method, partial least squares regression, multivariate linear regression and the gradient plus inverse distance square method (full details given in Cao et al 2013). Cross validation is conducted to assess the reliability of the interpolation results. Table 3 shows the mean bias error (MBE) and root mean square error (RMSE) for each of the 12 stations from the first year of the station observations to 2015.
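As an illustration of the cross-validation statistics reported in table 3, the following minimal sketch computes MBE and RMSE for a set of estimated values against withheld observations. It also includes a plain inverse-distance-square weighted estimate, which is only a simplified stand-in for one of the four approaches named above (the gradient term is omitted). The function names, the sign convention for MBE (estimate minus observation), and the sample values are assumptions made here for illustration; they are not taken from Cao et al (2013).

```python
import math


def mbe(estimates, observations):
    """Mean bias error, assumed here to be mean(estimate - observation)."""
    diffs = [e - o for e, o in zip(estimates, observations)]
    return sum(diffs) / len(diffs)


def rmse(estimates, observations):
    """Root mean square error of the estimates."""
    squared = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared) / len(squared))


def idw_estimate(neighbor_anomalies, distances_km):
    """Inverse-distance-square weighted mean of neighbor anomalies.

    Simplified: no gradient (elevation or latitude) correction is applied.
    Distances must be strictly positive.
    """
    weights = [1.0 / d ** 2 for d in distances_km]
    weighted_sum = sum(w * a for w, a in zip(weights, neighbor_anomalies))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Hypothetical monthly anomalies (deg C) withheld at a candidate station.
    observed = [0.5, -0.2, 1.1]
    # Hypothetical estimates from two neighbors at 120 km and 300 km.
    estimated = [
        idw_estimate([0.6, 0.1], [120.0, 300.0]),
        idw_estimate([-0.1, -0.5], [120.0, 300.0]),
        idw_estimate([0.9, 1.4], [120.0, 300.0]),
    ]
    print("MBE:", mbe(estimated, observed))
    print("RMSE:", rmse(estimated, observed))
```

In a leave-one-out setting, each observed value at the candidate station would be withheld in turn and estimated from the reference stations, and the two statistics would then be computed over all withheld months.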
As table 3 shows, MBE is generally small and falls between 0.45°C and −0.35°C. RMSE is smaller than 1°C at six stations (Xining, Yingkou, Dalian, Yantai, Xiamen, and Nanning) and is generally below 1°C at Changchun and Lhasa. These interpolation errors are similar in magnitude to previous results for stations in eastern China (Cao et al 2013). A relatively large RMSE (between 1.5°C and 1.6°C) is found at two stations in western China (Wulumuqi, Lhasa) for 119 months of interpolated values.
In total, we interpolated 738 monthly values for 12 stations (accounting for 5.4% of the total monthly records) to develop a continuous time series of monthly mean SAT from the start year up to 2015 for the 32 stations.

Homogenization of SAT time series
The inhomogeneity of a SAT time series is usually expressed as a sudden change (breakpoint) compared with neighboring station records. A homogeneous time series should diminish the sudden changes due to non-climatic factors, such as changes in observational locations, times and instruments. A variety of homogenization algorithms have been developed and assessed in recent years (Venema et al 2012). In this work, we applied some existing homogenization methods together with station metadata to perform the inhomogeneity detection and adjustment. Any gradual trend biases, e.g. due to an enhancing urban heat island effect during recent decades, may remain in the homogenized series for some studied sites.
The RHtests version 3 software package (Wang and Feng 2010) was used as the primary method to detect and adjust changepoints. This package includes the PMTred algorithm, which is based on the penalized maximum t (PMT) test, and the PMFred algorithm, which is based on the penalized maximum F (PMF) test (Wang 2008). The PMT test is a relative homogeneity test and must be used with a reference series. The PMF test can be used without a reference series. Cao and Yan (2012) showed that the PMTred algorithm is suitable for detecting multiple changepoints with a reference series representing a regional temperature series when the observational network is dense, while the PMFred algorithm is often applied to a sparse observational network where a reference series may be quite distant. Therefore, we use both PMFred and PMTred algorithms with a statistical test at the 1% significance level.
Firstly, for the time series of monthly mean SAT at the 14 new stations before 1950, we apply the PMFred algorithm without a reference series because the observational stations were sparse during this period. To increase the reliability of the PMFred method in detecting changepoints, we repeat the above work using the two-phase regression method (Easterling and Peterson 1995) and a running Student's t-test. When a changepoint is detected by at least two methods or is supported by the metadata, it is accepted as a breakpoint. When the changepoints detected by different methods are close to each other (within two years), they are treated as one changepoint, whose date is taken from the available metadata or, without metadata, set to the first occurring year.
Environ. Res. Lett. 12 (2017) 064005
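The acceptance rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as implemented by the authors: the function name, the input layout (a mapping from method name to detected years), the chaining of detections that lie within two years of each other, and the reading of "supported by the metadata" as a documented change within two years of a detection are all choices made here.

```python
def accept_changepoints(detections, metadata_years=(), tolerance=2):
    """Combine changepoints from several detection methods.

    detections: dict mapping a method name to a list of detected years.
    metadata_years: years with a documented change (e.g. a relocation).
    tolerance: detections this many years apart or less are merged.
    Returns the list of accepted changepoint years.
    """
    # Flatten to (year, method) pairs in chronological order.
    pairs = sorted((y, m) for m, years in detections.items() for y in years)

    # Chain detections that are close to each other into one group.
    groups = []
    for year, method in pairs:
        if groups and year - groups[-1][-1][0] <= tolerance:
            groups[-1].append((year, method))
        else:
            groups.append([(year, method)])

    accepted = []
    for group in groups:
        years = [y for y, _ in group]
        methods = {m for _, m in group}
        documented = [
            y for y in metadata_years
            if min(years) - tolerance <= y <= max(years) + tolerance
        ]
        if documented:
            # Metadata support: use the documented date.
            accepted.append(documented[0])
        elif len(methods) >= 2:
            # Agreement of at least two methods: use the first occurring year.
            accepted.append(min(years))
    return accepted


if __name__ == "__main__":
    # Hypothetical detections for one station.
    found = {"PMFred": [1951, 1975], "two_phase": [1952], "t_test": [1990]}
    print(accept_changepoints(found, metadata_years=[1951]))
```

In this hypothetical case only the early 1950s group is accepted, because the 1975 and 1990 detections come from a single method each and have no documented change nearby.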
Secondly, for the 12 stations (excluding Tainan and Hengchun) during 1951-2015, the PMTred algorithm with a reference series is applied to detect changepoints. Each reference series is constructed on the basis of stations with continuous and homogeneous series. The homogeneity of the time series at each reference station is checked by using the PMFred algorithm. The reference stations are chosen to be highly correlated with the tested station (with a correlation coefficient larger than 0.8). For each tested station, if there are more than two reference stations, their arithmetic average is defined as the reference series. Thirdly, possible changepoints in the 1950s are further detected by using the PMFred algorithm to ensure the continuity of the whole series.
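A minimal sketch of how such a reference series might be assembled is given below. It keeps the neighbors whose Pearson correlation with the tested station exceeds 0.8 and averages them year by year. The function names are hypothetical, the homogeneity check of the neighbors by PMFred is not reproduced, and, as a simplification, the average is taken over however many neighbors pass the threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def build_reference(candidate, neighbors, min_corr=0.8):
    """Average the neighbor series that correlate strongly with the candidate.

    candidate: list of values at the tested station.
    neighbors: list of series, each aligned with the candidate.
    Returns the averaged reference series, or None if no neighbor qualifies.
    """
    selected = [s for s in neighbors if pearson(candidate, s) > min_corr]
    if not selected:
        return None
    # Arithmetic average across the selected stations at each time step.
    return [sum(values) / len(selected) for values in zip(*selected)]


if __name__ == "__main__":
    # Hypothetical annual anomalies (deg C).
    tested = [1.0, 2.0, 3.0, 4.0]
    nearby = [[2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0], [1.0, 2.0, 3.0, 5.0]]
    print(build_reference(tested, nearby))
```

The second hypothetical neighbor is anticorrelated with the tested station and is therefore excluded from the average.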
In fact, the metadata show that many stations were relocated during the early 1950s when the national meteorological network was rebuilt. Table 2 shows the detected changepoints at the 14 new stations. There are no breakpoints at three stations (Hetian, Yingkou and Tainan). Each temperature series identified to contain significant changepoints is then adjusted to the latest segment of the data series using the mean-adjustments of RHtestsV3. Finally, we obtain the homogenized time series of monthly mean SAT at each of the 14 stations. Figure 3 shows the time series of the annual mean SAT anomaly at Xining station during 1937-2015 as an example. It is notable that discontinuity dates are different for the Tmax and Tmin series (figures 3(b)-(d)). In Tmin, there are two changepoints detected due to relocation in 1975 and 1995, while the Tmax series is homogeneous. This means the two relocations had little influence on the Tmax series. This phenomenon is quite common for temperature observations in China. Overall, the Tmin series have more changepoints than the Tmax series, implying that the Tmin measurement is more sensitive to changes in the observation system in China. A possible reason for this is that the Tmin series usually has a smaller variance than the Tmax series in this region, hence any non-natural biases in the Tmin series are more statistically significant.
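The idea of adjusting earlier segments to the latest segment can be illustrated with a simple mean-shift sketch. This is not the RHtestsV3 mean-adjustment, which estimates step sizes within its regression model; here the shifts are assumed to be estimated from the segment means of the difference series (candidate minus reference), so that a real climatic trend shared with the reference is not removed. The function name and the input layout are hypothetical.

```python
def mean_adjust(candidate, reference, changepoints):
    """Shift earlier segments of a series to match its latest segment.

    candidate: list of values at the tested station.
    reference: reference series aligned with the candidate.
    changepoints: indices (0 < i < len(candidate)) where a new segment starts.
    Returns the adjusted candidate series as a new list.
    """
    diff = [c - r for c, r in zip(candidate, reference)]
    bounds = [0] + sorted(changepoints) + [len(candidate)]
    segments = [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]

    # Mean offset from the reference in the latest (target) segment.
    last_start, last_end = segments[-1]
    target = sum(diff[last_start:last_end]) / (last_end - last_start)

    adjusted = list(candidate)
    for start, end in segments[:-1]:
        segment_mean = sum(diff[start:end]) / (end - start)
        shift = target - segment_mean
        for i in range(start, end):
            adjusted[i] = candidate[i] + shift
    return adjusted


if __name__ == "__main__":
    # Hypothetical series with a 1 deg C drop at index 3 (e.g. a relocation).
    station = [1.0, 1.1, 1.2, 0.3, 0.4, 0.5]
    regional = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
    print(mean_adjust(station, regional, [3]))
```

In this hypothetical example the first three values are lowered by about 1°C, so that the adjusted series follows the steady warming seen in the reference rather than the artificial drop.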
When averaging Tmax and Tmin to generate the Tm series, the discontinuity still exists. The long-term series of Tm has one more changepoint in 1951 (figure 3(a)), corresponding to a relocation recorded in the metadata. Xining station was moved in January 1951 from downtown to the nearby airport, which was located in the eastern suburb, about 8 km away from the downtown location, causing the jump in the temperature series (figure 3(a)). The adjusted Tm series shows an increasing trend.

Cases of relative warmth during the 1940s
It is clear that the century-scale warming trend estimation is uncertain when using different datasets (Tang et al 2009, Jones 2016). The most significant differences occurred during the 1920s-1940s. In particular, during the 1940s China experienced wars and so a large number of observations are missing. Figure 4 gives the temperature series for Beijing and Nanjing stations during 1905-2015 before and after homogenization. For Beijing station, we can hardly see a warming trend in the raw temperature series before 1965, when the station was moved from downtown to the southeast of Daxing District about 3.88 km away. The homogenized series (the red line) shows a consistent warming trend and the relative warmth before the 1940s is not as obvious as in the raw data. Based on the present method, the Beijing series did not exhibit significant changepoints before 1951, though it was possible to make some minor adjustments based on other methods (e.g. Yan et al 2001). For Nanjing station, a warm peak occurs around 1941. However, Nanjing had many continuous missing months during 1938-1945. The early warm peak was a result of interpolation from three stations (Shanghai, Xuzhou and Dafeng) (Cao et al 2013). According to the metadata, Nanjing station was located in the urban district during the 1940s and observations were missed due to the war. In 1958, the station was moved to the Yuhua suburb to the south and southeast of the downtown location (6.8 km away), with a downward jump in the temperature records (although it is not statistically detectable).
By analyzing the metadata, we found that in China many stations (about 40% of the studied stations in this paper) moved from the city center to a suburb due to the rapid urbanization shortly after the foundation of the People's Republic of China in 1949. The environment of the stations did not accord with the observation criteria in the earlier periods and these relocations usually led to obvious jumps to cooler temperatures. Figure 3(a) shows a typical case (see the changepoint around 1951). Thus, cases such as Nanjing station need more attention and perhaps further adjustments. In general, there is still potential for further homogenization of the present dataset because the present work has adjusted only the most significant biases in the station series.
[...] based on the unadjusted station data over the 32 stations. As figure 6(a) shows, based on the unadjusted SAT series, nine stations have large warming trends over 1.5°C/century, with the largest warming rates over 3.0°C/century at three stations in the northeast (Hailar) and northwest (Wulumuqi and Hetian). Eight stations mostly located in southern China (Macao, Tianjin, Wuhan, Shapingba, Changsha, Fuzhou, Xiamen and Nanning) have a range of small trends between −0.6°C and 0.4°C/century. The other 15 stations have a range of trends between 0.5°C and 1.5°C/century. In figure 6(b), the homogenized annual mean SAT series show larger warming trends in general. The largest warming trends occur at Harbin (4.0°C/century), Wulumuqi (3.7°C/century), and Changchun (3.5°C/century), which are located in the far northeast or northwest of China.
By and large, the original time series tend to underestimate the local warming trends during the overall period from [...] (2009) showed that many stations in China moved from some urbanized locations to an out-of-town location during their history, causing cooling biases in the SAT series. Comparatively, the adjusted data present a more geographically coherent pattern of climatic warming (figure 6(b)) than the original data do (figure 6(a)). The adjusted data demonstrate a range of trends between 1.1°C and 4.0°C/century in northern China, between 0.4°C and 1.9°C/century in southeastern China, and between 1.4°C and 3.7°C/century in western China (to the west of 110°E).

Century-scale warming trends in China
To calculate the country-wide average trend, we applied the most widely used method, CAM (Jones 1994, Jones and Moberg 2003), to avoid the biases caused by uneven station density. The base period 1961-1990 was taken to calculate the mean climatology and temperature anomalies. Gridbox anomaly values were then produced by averaging the individual station anomaly values within each 5° × 5° grid box. Figure 7 shows the time series of SAT anomalies for China during 1901-2015 based on the original and adjusted data. The data from CRUTEM4 and GISTEMP are also used for comparison. As figure 7(a) shows, the four temperature series of CRUTEM4, GISTEMP, original (CHMTD-RAW) and adjusted (CHMTD-ADJ) station data are similar to each other from the 1950s. The correlation coefficients among them are above 0.99 for the period since 1951. The regional mean SAT series for China based on the 32 stations agrees well with that based on a much denser station network for the period 1951-2015 (T-China in figure 7(b)). The T-China series is the average of 2419 stations' homogenized Tm series over the whole of China (Cao et al 2016). The long-term temperature series have slight trends before the late 1960s, followed by a rapid warming after about 1970. For the period 1901-2015, the linear trend is 1.13°C ± 0.21°C/century based on the original data, 1.56°C ± 0.20°C/century based on the homogenized data, and 1.21°C ± 0.16°C/century and 1.30°C ± 0.19°C/century based on GISTEMP and CRUTEM4, respectively.
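The steps of the climate anomaly method described above can be sketched as follows, using annual values for brevity. This is an illustrative sketch under stated assumptions rather than the processing used for figure 7: the function names and sample coordinates are hypothetical, the grid boxes are combined with cosine-of-latitude weights (a common choice that is not specified in the text), and the trend is an ordinary least-squares slope without the uncertainty estimate.

```python
import math
from collections import defaultdict


def station_anomalies(series, base=(1961, 1990)):
    """Anomalies relative to the station's own base-period mean.

    series: dict mapping year -> annual mean SAT (deg C).
    """
    base_values = [t for y, t in series.items() if base[0] <= y <= base[1]]
    climatology = sum(base_values) / len(base_values)
    return {y: t - climatology for y, t in series.items()}


def grid_box(lat, lon, size=5.0):
    """Index of the size x size degree box that contains the station."""
    return (math.floor(lat / size), math.floor(lon / size))


def regional_mean(stations, size=5.0):
    """Average station anomalies within boxes, then average the boxes.

    stations: list of (lat, lon, series) tuples.
    Returns a dict mapping year -> regional mean anomaly.
    """
    per_year = defaultdict(lambda: defaultdict(list))
    for lat, lon, series in stations:
        box = grid_box(lat, lon, size)
        for year, anomaly in station_anomalies(series).items():
            per_year[year][box].append(anomaly)

    result = {}
    for year, boxes in per_year.items():
        weighted_sum = weight_total = 0.0
        for (lat_index, _), anomalies in boxes.items():
            center_lat = (lat_index + 0.5) * size
            weight = math.cos(math.radians(center_lat))  # assumed area weight
            weighted_sum += weight * sum(anomalies) / len(anomalies)
            weight_total += weight
        result[year] = weighted_sum / weight_total
    return result


def trend_per_century(series):
    """Ordinary least-squares slope of year -> value, in units per 100 yr."""
    years = sorted(series)
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(series[y] for y in years) / n
    num = sum((y - mean_year) * (series[y] - mean_value) for y in years)
    den = sum((y - mean_year) ** 2 for y in years)
    return 100.0 * num / den


if __name__ == "__main__":
    # Hypothetical stations, each warming by 0.01 deg C per year.
    def make(offset):
        return {y: offset + 0.01 * (y - 1961) for y in range(1961, 1991)}

    sample = [(39.9, 116.4, make(12.0)), (38.0, 117.0, make(13.0)),
              (43.8, 87.6, make(7.0))]
    print(trend_per_century(regional_mean(sample)))
```

Because anomalies are taken relative to each station's own climatology, stations with very different mean temperatures can be averaged, and averaging within boxes first prevents densely observed areas from dominating the regional mean.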
The warming trend estimated from the adjusted data is larger than that based on the raw data because, in many cases, the raw station data involve cooling biases as discussed above. Note that CRUTEM4 additionally includes 18 stations of the homogenized dataset published in Cao et al (2013), thus showing a trend nearest to that based on the present adjusted data. GISTEMP does not include the present homogenized data but involves some homogenization processing of the original data used, hence the trend based on GISTEMP is between that of the raw data and that of CRUTEM4. It is also worth noting that the differences between datasets in terms of the regional mean series appear smaller than those between the raw and adjusted data for individual stations, because inhomogeneities at individual stations may compensate for each other when calculating the regional mean.
Uncertainty in the regional mean series also arises from the different number of stations during the different periods. Two series for the period 1943-2015 are presented in figure 7(c) in addition to the original China mean temperature anomaly series involving 32 stations. One involves 25 stations with continuous records since 1920 or earlier; the second involves 15 stations with continuous records since 1905 or earlier. The differences among the three series during 1943-2015, as shown in figure 7(c), are acceptable and serve as measures of possible uncertainties in the early parts of the long-term series.
According to figure 7, the warmest year for China during the period of study is 2007, with the SAT anomaly being 1.69°C/1.70°C based on the original/adjusted data. Following the relatively cool years of 2010-2012, the SAT anomalies tend to recover in 2013-2015.

Summary
Via consistent quality control, multi-way interpolation and homogenization in this study, we have established an extended set of monthly mean SAT series in China back to the 19th century. The dataset includes 32 stations across China and is available at http://data.cma.cn/ according to the data sharing policy of the CMA. Different source datasets are synthesized to produce the new dataset. Compared to the earlier dataset of Cao et al (2013), the new dataset has a much improved coverage towards western China, with nine stations located to the west of 110°E.
Several homogenization methods and available metadata were applied. Major inhomogeneous biases in the original data were adjusted. Compared to the original data, the adjusted series of annual mean SAT shows a larger warming trend for most stations. The warming rates range from 1.1°C to 4.0°C/century in northern China. Trends are generally weaker in southeastern China.
The regional mean SAT series for China based on the 32 stations agrees well with that based on a much denser station network for the period 1951-2015. The linear trend in the regional mean SAT series of 1901-2015 is 1.56°C ± 0.20°C/century, with an enhanced warming rate of 0.26°C ± 0.04°C/decade for the recent period 1951-2015. There was little trend over the early period before 1951 due to relative warmth around the 1920s in the region (Zeng et al 2003). Uncertainty mainly arises from the sparse observations during the early period. There is potential to further homogenize the early series before 1950.