Variations in summertime compound heat extremes and their connections to urbanization in China during 1980–2020

Summertime heat extremes are exceptionally harmful and destructive to human health and socio-economic systems. Based on 2419 meteorological stations, this study investigates the temporal and spatial characteristics of summertime extreme high temperatures across China and their response to urbanization over the period 1980–2020. The results show that (a) both the maximum temperature (TXx) and minimum temperature (TNn) in the summer have shown an upward trend in most parts of China during the past 40 years; significant upward trends are found in eastern China for independent hot days (IHDs), independent hot nights (IHNs) and compound hot events (CHE). (b) Extreme heat shows a positive spatial clustering pattern across China. The hot spots of the three compound heat events (including IHD, IHN and CHE) are mainly distributed in the north; while the cold spots are roughly distributed in the south. The spatial pattern of hot and cold spots of TXx exhibits an east-west orientation while that of TNn shows a southeast-northwest orientation. (c) The correlation between the heat extremes and urbanization level varies among regions, with most exhibiting linear and exponential relationships. Significant urbanization effects on TXx and TNn are detected in all sub-regions, with contributions ranging from 11% to 41% and 14% to 29%, respectively, while urbanization contributions to the three compound heat events vary significantly among the sub-regions. Our findings present the spatio–temporal patterns of summertime heat extremes, especially the spatial clustering pattern of the compound extremes, and thus enrich the understanding of these variations and their regional response to urbanization, which may have important implications for policy-making among distinct sub-regions.


Introduction
Global climate change has led to a significant increase in the frequency and intensity of extreme climate events, posing a serious threat to human well-being (IPCC 2014). Of all extreme events, extreme heat has a high probability of occurrence, a wide range of impacts, and large potential risk, severely affecting human health, economic development, ecosystem services, food security and infrastructure (Zeng et al 2014, Frolicher et al 2018, He et al 2018. In the past few decades, extreme heat events that occurred in the summer have led to numerous irreversible disasters throughout the world. During the summer of 2003, a major heat wave that enveloped Europe resulted in approximately 30 000 deaths (Fouillet et al 2006). The prolonged heat wave during the summer of 2013 led to hydropower supply shortages in eastern China (Sun et al 2014). It was estimated 12 000 premature heat-related deaths occur annually in the United States (Shindell et al 2020). General increases in the frequency and intensity of extreme heats are projected to continue and even accelerate in the future (Matthews et al 2017, Han et al 2019, Wu et al 2020. Thus, comprehensively assessing the spatiotemporal variations of this type of event is crucial for developing mitigation and adaptation measures. Compared to the extensive researches focusing on individual climate extremes, little attention has been given to compound temperature extremes. Previous research on extremely hot days/nights usually refers to events whose daily maximum/minimum temperature exceeds a prescribed threshold (Chen and Zhai 2017). However, this univariate definition should occur solely during daytime and nighttime, respectively (Oswald and Rood 2014). For instance, a single day can be considered either a hot day or a hot night when both the maximum and minimum temperatures exceed the 90th percentile. More precisely, it should be called a 'compound heat extreme' . Combined daytime-nighttime temperature extremes might differ from individual hot days/nights not only in meteorological aspects (Purich et al 2014, Freychet et al 2017 but also, more importantly, in impacts on human and natural systems (Wang et al 2020). The normal functions of human society and ecosystems affected by extreme daytime temperatures may be alleviated by cool nights, while extremely hot nights might still trigger catastrophic failure in human thermoregulation (Garcia-Herrera et al 2010). Compound extremes combine and amplify adverse effects from both hot days and nights, as continuous high temperatures deprive humans of their chance to recover (Vaidyanathan et al 2016).
Urbanization is an irreversible process of human socioeconomic development and modernization and poses significant effects on increasing extreme heat events (Luo and Lau 2018, Zhao et al 2019, Cai et al 2020, Wu et al 2020. The rapid urbanization process has brought about changes in land use types and large-scale population gathering, resulting in increased greenhouse gas emissions and changes in the underlying surface, which has led to the continuous trend of climate warming (Grimm et al 2008, Brans et al 2018. In addition, industrial production and transportation will emit a large amount of polluting gases and particulate matter such as sulfur dioxide and nitrogen oxides, which can change the atmospheric composition to a certain extent (Huang et al 2015). Man-made buildings will change the underlying surface, affecting the exchange of heat, moisture and matter between the earth's surface and the atmosphere . Seto and Shepherd (2009) found that the interconnections between the process of urbanization and regional climate change became stronger in the 21st century. Georgescu (2013) highlighted the direct impacts of urban expansion on temperature warming and suggested that urban-induced warming can locally reach up to 4 • C above the maximum expansion scenario in Arizona, US. Studies have also indicated that a significant climate difference exists between urban and nearby suburban regions (Kalnay and Cai 2003, Zhao et al 2019. China has experienced a rapid urbanization process during the past 40 years under the reform and opening-up policy proposed in the late 1970s (Wu 2014). The urbanization effects on climate change cannot be ignored (Easterling et al 2016). According to climate simulations and meteorological observations, high temperature extremes will become more frequent, more intense, and longer lasting in summer in China (Liao et al 2021). He et al (2021b) found that synergy between heat wave and heat island effect is different under different urban development models. Luo and Lau (2021) found that the urban areas of the major city of China experienced a stronger intensification of heat stress than the surrounding rural areas during 1971-2014. Zhao et al (2019) found that the warming index showed an upward trend in the Beijing-Tianjin-Hebei (BTH) urban agglomeration region from 1980 to 2015. Cities face the combined challenges of heat wave processes and heat island effects, which seriously threaten urban prosperity and liveability (He et al 2021a). Although there are several studies on the spatial-temporal changes in temperature extremes and their responses to urbanization, the results were mainly focused on individual cities or regional urban agglomerations, without considering the different urbanization levels. In addition, previous results were mainly dependent on trend analysis methods and interpolators, ignoring the local features. It is critical to revisit the spatial pattern of extreme heat, including the individual and compound extremes, based on the geographical approaches to enrich and favour our understanding of their regional differences and their response to urban blooms under different development levels.
In this study, five extreme temperature indices (including compound hot extremes) were selected to study the spatial and temporal patterns of extreme heat during the summer, and the relationship between them and urbanization was investigated across China. The paper is arranged as follows: section 2 introduces the data and gives a brief description of the method. The results are presented in section 3, and a discussion of the results is provided in section 4. The conclusions are summarized in section 5.

Data
The daily temperature observation data from 1980 to 2020 released by the National Meteorological Information Center (NMIC) of the China Meteorological Administration (CMA) (Ren et al 2012) were used in this study; this data have been widely applied to climate change studies over mainland China (Ding et al 2016, Jiang et al 2016. The data were qualified and homogenized by the China NMIC of the CMA. Considering the completeness and continuity of the historical sequence, we removed the meteorological stations that failed to measure the daily maximum (minimum) temperature for more than one month and those that had not detected a protracted period of high (low) temperatures lasting more than three days. Ultimately, 2160 stations were selected, and their distribution are shown in figure 1.
The land use and land cover data of China used in this paper were provided by the Resources and Environmental Sciences Data Center, Chinese Academy of Sciences (Xu et al 2018). The quality and accuracy of the data were comprehensively checked and controlled by the Resources and Environmental Sciences Data Center (Liu et al 2005(Liu et al , 2010. Table 1 shows the classification of land use/land cover remote sensing monitoring data in China. The numbers 51 and 53 in the classification system represent urban land use in this paper, including the built-up areas of large, medium and small cities, county capitals and towns, and other construction land. The heat extremes in this paper included independent hot days (IHDs), independent hot nights (IHNs), compound hot events (CHEs), maximum value of the daily maximum temperature (TXx) and minimum value of the daily minimum temperature (TNn) (table 2).

Global spatial autocorrelation analysis
Spatial autocorrelation refers to the spatial dependence of observed variables, that is, whether variables with similar spatial locations have similar or opposite trends. The most commonly used statistic is Moran's I index with a range of [−1, 1] (Thompson et al 2018). This article uses Moran's I index to explore the changes in the spatial accumulation pattern of China's extremely high temperatures over the period of 1980-2020. The calculation formula is as follows: The statistical significance of Moran's I is commonly transformed to a Z score test with a standard normal distribution (Cliff and Ord 1981): where E(I) is the expected value of I for a random spatial pattern, and Var(I) represents the variance of I. A positive Z score with a significant level implies that Low woodland and shrubland with canopy closure >40% and height below 2 metres 23 Forestland with tree canopy closure of 10%-30% 24 Other woodland 3 Grassland 31 High-coverage grassland 32 Medium-coverage grassland 33 Low-coverage grassland 4 Waters 41 Naturally formed or artificially excavated rivers and land below the water level of the main trunk all year round 42 Lake 43 Reservoir pond 44 Permanent glacier snow 45 The tidal flooding zone between the high and low tides of the coastal spring 46 The land between the water level of rivers and lakes during the normal period and the water level of the flood period 5 Urban, rural, industrial and residential land 51 Urban land 52 Rural settlements 53 Other construction land 6 Unused land 61 Desert 62 The ground is mainly gravel, and the vegetation coverage is less than 5% of the land 63 The surface is of the salt-alkali accumulation can only grow the soil with strong salt-tolerant plants 64 The land is flat and low-lying, with wet plants growing on the surface 65 Land with surface soil and vegetation coverage below 5% 66 The surface is rock or gravel, which covers >5% of the land 67 Other unused land the distribution of the extreme heat index is spatially clustered. A significance level of 0.05 was used in our study.

Hot and cold spots analysis
The Getis-Ord Gi statistics were used to estimate and understand localization and distribution patterns of hot extremes across China (Getis and Ord 1992), which could identify statistically significant spatial clusters of high (hot spots) and low values (cold spots). Geographic objects with high z scores and small p values indicate statistically significant hot spots, and those with low negative z scores and small p values demonstrate statistically significant cold spots (Streletskiy et al 2015, Liu et al 2019. The principle of this index is as follows: where w ij (d) is the weight matrix within the range of distance (i ̸ = j). G i (d) represents the degree of correlation between the statistics of province i and neighbouring province j under the condition of distance weight w ij (d).

Clustering of urbanization levels
We create a Thiessen polygon for each meteorological station and use this polygon to represent the influence range of each station. In this way, there is an index value in each Thiessen polygon corresponding to the urbanized area in the polygon. For the work of classifying the level of urbanization, we calculate the rate of change of the urbanized area within each Thiessen polygon and classify the study area by using the K-nearest neighbours (KNN) method.
As one of the simplest algorithms among machine learning algorithms, the KNN algorithm is mainly used for nonparametric statistics of classification and regression (Altman 1992, Wu et al 2017.

Spatial and temporal distribution of hot extremes
The spatial distribution of the significant change trends (p < 0.05) of the five indices during 1980-2020 is shown in figure 2. Across the country, all indices have shown a significant increasing trend during the past 40 years. The spatial autocorrelation of the five indices was evaluated by the value of Moran's I ( figure 3). The values of Moran's I were greater than 0 for the five indices in each year from 1980 to 2020, implying that these indices present a positive spatial clustering pattern in the study area. The degree of spatial autocorrelation was also assessed by the Z values of the normal distribution (figure 4). The Z score of IHN in 2016 was less than 1.96, and the Z scores in 1997, 1999 and 2018 were less than 1.96 for CHE. Thus, the clustered patterns in these four years could have been the result of random chance at a 5% significance level. Heat extremes in other years exhibited tendencies  towards positive spatial autocorrelations and represented clustered patterns.
We can see that the Moran's I of each indicator also changes in different years, which indicates that the degree of its spatial autocorrelation also changes. Compared with TXx and TNn, the three compound heat indices show weak spatial autocorrelation. The Moran index values for TXx and TNn fluctuate in the range of 0.3-0.5 and 0.4-0.5, respectively. Except for IHD, the spatial autocorrelation degrees of other indicators have a downward trend, indicating weaker spatially autocorrelation over the time.
The distribution of the hot and cold spots of five indices in 1980 is shown in figure 5. According to the magnitude of the Gi statistic value at a significance level of 0.01, all stations are divided into three categories: hot spots, cold spots, and insignificant sites.
Among them, the area where the hot spots with a significance level of 0.01 were concentrated was called the high concentration area of hot spots. Similarly, the area where the cold spots with a confidence level of 0.01 is called the high concentration area of cold spots. The areas with high concentrations of hot spots for IHD were mainly distributed in the central region, while those of IHN were mainly distributed in the central and western regions and the northeast. For CHE, there is almost no high concentration area of hot spots. The high concentration areas of cold spots for IHD are distributed on the southeast coast and the BTH region, and those for IHN and CHE are distributed on the southeast coast and the middle and lower reaches of the Yangtze River, respectively. The hot and cold spots of TXx exhibit a pattern of east-west distribution, while those for TNn show a north-south pattern.   Figure 6 displays the distribution pattern of cold spots and hot spots for five indices in 2020. For three compound extreme heat events, the aggregation pattern in 2020 has undergone more clear changes than that in 1980. The hot spot gathering area of IHD moves northwards, its cold spots in the BTH region become hot spots gathering, and the cold spots in the southeast coastal area move to the southwest. The hot spot gathering area of the IHN expanded on the original basis, and a new cold spot gathering area appeared on the Tibetan Plateau. The hot spot clustering area of CHE appeared in the BTH region and Shanxi Province, and its cold spot clustering area appeared in Jiangsu and Zhejiang provinces, which underwent significant changes compared with the aggregation patterns in 1980. For TXx, the original hot spot gathering area expanded to the west, and the cold spot gathering area in the northeast increased. In contrast, the aggregation pattern of TNn has not changed significantly, but the area of hot spot aggregation areas in Yunnan Province has decreased.

Correlation between urbanization and high temperature extremes change
We divided China into 2160 Thiessen polygons based on meteorological stations and calculated the urbanization rate of each Thiessen polygon from 1980 to 2020 using the land use data. China is divided into six regions based on the principle of similar values and spatial proximity (figure 7): the western region (R1), the North China Plain and Northeast China (R2), the southeastern region (R3), parts of the Bohai Bay area (R4), the Yangtze River Delta region (R5) and the Pearl River Delta region (R6). From 1980 to 2020, the region with the fastest urbanization change is R6, with an increase rate of 9.61 square kilometres per year, while the slowest upwards trend is R1, with an upwards trend of 0.49 square kilometres per year. The changing rates of urbanization in R2, R3, R4 and R5 were 0.87, 0.99, 4.1 and 5.78 square kilometres per year, respectively. The average urbanization rate in the past 40 years was 0.88 square kilometres per year, and the R1was far below the average level. Table 3 shows the correlation relationship between the urbanization rate in different regions and five indices from 1980 to 2020. On the whole, the trend of change for each index is positively correlated with the change trend of the urbanization rate of each region. Significant positive correlations between TXx (TNn) and urbanization rate were detected for all six subregions, with correlation coefficients varying from 0.37 to 0.79 and 0.47 to 0.63 for TXx and TNn, respectively. The correlation between the urbanization trend of R1 and that of TXx is the strongest, with a correlation coefficient of 0.79 (p < 0.05). The strongest correlation between the change trend of urbanization and the change trend of TNn occurred in R5, with a correlation coefficient of 0.63. For three CHEs, the change trend of IHD has a significant correlation (p < 0.05) with the urbanization change trend of the other five regions except R4, of which the correlation in R6 is the strongest. The urbanization trends in R1 and R3 are significantly correlated with the change trend of IHN. The two regions, including R2 and R3, exhibited a significant correlation between the urbanization rate and CHE. In addition, we found that the urbanization trend in the R3 area has a significant correlation with the change trend of all indicators, which also indicates that the response of extremely hot events to urbanization in the R3 region is stronger than that in other regions.
To further study the relationship between urbanization and extreme heat and to predict the change in extreme hot events in the future according to the change in urbanization level, regression analysis was performed by using the urbanization area as the independent variable and the change in extreme heat events as the dependent variable. We used linear and nonlinear models (including quadratic curves, exponential functions, logarithmic functions and power functions) to fit the relationships and select the model with the best fitting line, as displayed in figure 8.
TXx shows exponential growth with the growth of urbanization area in R1, R2 and R3, where all grow rapidly first and then tend to be flat, and the corresponding critical values (urban area) are 15 000, 20 000 and 15 000 square kilometres, respectively. TXx increases linearly at rates of 4.67, 5.19 and 2.18 • C per 1000 square kilometres in R4, R5 and R6, respectively. TNn increases exponentially with increasing building area in R1, R2, R3 and R4. The TNn increases rapidly at first and then tends to be flat in R1, R2 and R3. The critical values of the corresponding building area are approximately 10 000, 20 000 and 15 000 square kilometres and increase linearly at rates of 4.3 • C and 3.87 • C per 1000 square kilometres in R5 and R6. IHD has an exponential relationship with urban area in R1 and increases linearly with urban area at rates of 6.44 × 10 −6 and 25.28 × 10 −5 d per square kilometre in R3 and R6, respectively. The IHN increases linearly at rates of 11.49 × 10 −6 and 9.43 × 10 −6 d per square kilometre with increasing urban area in R1 and R3. CHE increased linearly at rates of 25.99 × 10 −7 and 24.58 × 10 −7 d per square kilometre in R2 and R3, respectively.
To measure the contribution of urbanization to the change of extreme heat (the explanation degree of change for an urban area to extreme heat), we make a logarithmic transformation on the exponential model with significant significance in figure 8. Table 4 shows the contribution rate of the urban area growth in different regions to the increase in extreme heat events. In R1, the growth of urban area contributes the most to TXx, which is 41%, followed by R5, of which the value is 36%. The grown of 29% and 28% in TNn can be attributed to the growth of urban areas in R4 and R5, respectively, and this value ranges from 14% to 21% in the other four regions. The response of IHD to urbanization is the most obvious in R1, with an urbanization contribution of 19%, while this value is 16%, 11% and 10% in R6, R3 and R2, respectively. The contributions of urban area growth in R1 and R3 to IHN are 17% and 13%, respectively. In addition, the contributions of urbanization in R2 and R3 to CHE are 27% and 23%, respectively, yet the urbanization effect was not significantly detected in other regions.

Discussion
The study of extreme heat events with a single variable is no longer sufficient for people to effectively reduce the risks brought by high temperature disasters and take specific countermeasures (Zscheischler et al 2018). If the damage was caused by high temperature in both periods of time, then it was difficult to relieve the body (Vaidyanathan et al 2016, Wang et al 2020. However, traditional studies often focus on a single variable. For the above reasons, this study selected three compound heat indices and studied their temporal and spatial patterns in the summer and their response to urbanization. On the impact of urbanization on extreme climate, previous studies usually divided meteorological stations into urban and rural categories directly and obtained the impact of urbanization by comparing the difference in extreme climate index between the two types of stations. There are currently four main methods for classifying rural stations or reference stations: (a) methods of dividing reference sites based on the intensity or range of night lights measured by satellites (Peterson et al 1999, Hansen et al 2001, 2010, Peterson 2003, Ren and Ren 2011; (b) methods to determine whether a reference site belongs to a reference site based on the proportion of urbanized area in a certain radius buffer zone (Gallo et al 1996, Kalnay and Cai 2003, Ge et al 2007, He et al 2007, Wang and Ge 2012, Patra et al 2018, Tysa et al 2019; (c) methods based on the proportion of the population in the area where the station is located (Karl et al 1988, Ren et al 2008; and (d) a complex method that comprehensively considers the urbanization area, the number of urban populations, the distance from the station to the city centre, the station density and the relocation (Ren and Zhou 2014). Different classification methods could lead to large differences in classification results, and these methods are also subject to their own limitations. In addition, there are errors in night-time light data due to fewer observation years or accuracy problems; population data are often measured in administrative units, and urban stations and rural stations may exist under the same administrative unit. The comprehensive consideration method requires considerable cost and time consumption, and the proportion of each indicator should be carefully measured. The change in land use or land cover is the most direct spatial manifestation of the process of urbanization. Specifically, due to China's vast territory, the level of urbanization development varies greatly between the East and West and between coastal and inland areas. Simply dividing cities and villages ignores the impact of different urban development levels on extreme climate events.
In this study, we generated the corresponding Thiessen polygons throughout China with each meteorological station as the centre, clarified the influence range of each station, and ensured continuity in time and integrity in space. We did not distinguish urban meteorological sites and rural sites; instead, based on the rate of change in urban land use within each polygon, we divided China into areas with different levels of urbanization. In this way, there is no need to consider the difference in results due to the different selection criteria of the reference station. After clustering the change rate of the urban area, we found that in the past 40 years, the change rate showed a pattern in which the eastern region was greater than the western region and the coastal region was greater than the inland region. Li also shows that the spatial pattern of China's urbanization has changed from 'high in the north and low in the south' to 'high in the east and low in the west' (Li and Sun 2020). The relationship between land surface and near-surface temperature is nonlinear due to various competing and additional processes, such as thermal storage, anthropogenic heat flux, etc (Oke 1982). The results of this study are basically consistent with this conclusion. In addition, this study quantifies the relationship between the rate of land use change and compound heat extremes in several different regions, reflecting the linear and non-linear impact of different urbanization levels on the five extreme temperature indices. We note that there is a significant correlation between the changes in urban area and the changes in five indices in R3. Combined with the results of regression analysis, we can conclude that extreme heat has a stronger response to urbanization in Southeast China than in other regions. Previous studies have also confirmed this conclusion. Luo and Lau (Luo and Lau 2018) pointed out that cities in eastern China bear greater thermal pressure than rural areas, and urbanization contributes approximately 30% to the rise of such events. Zhou showed that urbanization has a significant impact on climate change in Southeast China, and urbanization leads to a rise in surface temperature of 0.05 • C/10a, which is higher than that in other periods and regions (Zhou et al 2004).
This paper avoids the deviation of the influence of urbanization on the temperature index caused by the different division methods of urban and rural stations. However, when dividing different regions, this study is based on the Thiessen polygon where the meteorological stations are located. In some areas in western China, where meteorological stations are relatively sparse, the method may lead to some biases. Further studies will conduct on how to reduce the uncertainty of this method especially when applying in sparse-gauged regions.

Conclusion
In this paper, we used five indices to study the temporal and spatial evolution of extreme heat during the summer in China from 1980 to 2020 and explored its response to urbanization. The main conclusions can be summarized as follows: (a) In the past 40 years, both the TXx and TNn during the summer have shown an upward trend in most parts of China. The TXx rose the fastest in Sichuan Province and the junction of Jiangsu and Zhejiang Provinces. The BTH region and the southeast coastal region are the areas where the TNn rose the fastest. Three compound extreme heat events showed a significant increase, mainly occurring in southeast China, with the number of stations accounting for approximately onetenth of the total stations. (b) Except for TNn, the hot spot clustering area of the other four indices expanded, and the clustering pattern of the hot and cold spots also changed significantly. The hot spot area of IHD spreads to the north, while the area of TXx expands to the west; IHN has a new cold spot accumulation area on the Tibetan Plateau, and for CHE, a new hot spot area appears in the BTH region. (c) The relationships between the hot extremes and urbanization area varied among regions, with most of them exhibiting linear relationship and exponential curves. Urbanization effects on the TXx and TNn were detected in all the six regions, with contributions ranging from 11% to 41%, and 14% to 29%, respectively. Compared with TXx, the urbanization effect on TNn varies small among the six sub-regions, with large contribution observed in R4 and small contribution in R2. Urbanization in R3 has an impact on the changes of the five indices, ranging from 11% to 26%. Besides, urbanization effects on the three CHEs (IHD, IHN and CHE) vary greatly among the six regions. The growth of urban area in R4 and R5 has no significant effect on the changes of three CHEs.

Data availability statement
The data that support the findings of this study are available upon request from the authors.