Regional compound humidity-heat extremes in the mid-lower reaches of the Yangtze River: a dynamical systems perspective

Heat extremes, including heatwaves, have adverse impacts not only on ecosystems but also on human health. These impacts can be seriously exacerbated when both spatial extension and compound factors (such as humidity) are involved. However, a unified framework combining compound humidity-heat extremes with their regional extension has received little scientific attention. This study addresses this gap by treating the evolution of daily mean 2 m air temperature (Tmean) and relative humidity (RH) over a large domain as two dynamical systems (DSs). The instantaneous coupling index from the DS method, combined with clustering analysis, can then sort out the regional compound humidity-heat extremes with distinct organized spatial structures. Among them, the compound humidity-heat extremes with dipole Tmean and RH patterns may be missed by methods based on regional averaging, or left undiscerned by the DS method alone. Moreover, the mechanisms behind these regional compound humidity-heat extremes with a dipole pattern are distinctive in both dynamics and thermodynamics, with a dipole structure found in the low-level atmospheric circulation. These novel findings can contribute considerably to an in-depth understanding of compound humidity-heat extremes and their mechanisms.


Introduction
Compared to extremes caused by a single variable, such as high temperature, those caused by two or more variables occurring at the same time over a given domain, called compound extremes (CEs), can lead to even more disastrous damage (Ridder et al 2020, Rogers et al 2021). Among the contributing variables, humidity can play a particularly important role (Byrne et al 2018). For example, dry-hot events cause severe food scarcity (Ciais et al 2005, Lu et al 2018), and humid-hot events result in a marked increase in deaths from heart disease, especially among the elderly (Li 2020, Raymond et al 2020, Marx et al 2021). Compared to the extensive studies on heat extremes, limited research has been devoted to compound humidity-heat extremes. Ding and Qian (2011) classified wet and dry heatwaves (HWs) by mean relative humidity (RH) and found that wet HW events were commonly located in southeastern China, while dry HW events were mostly located in northwestern China and North China (Ding and Ke 2015). Usually, the RH effect is taken into account empirically by computing apparent temperature (Fischer and Schär 2010, Russo et al 2017), heat index (HI; Xu et al 2021), or wet-bulb temperature (Stull 2011, Freychet et al 2020, Yu et al 2021). Since some empirical parameters in these formulas may not be universal, all these subjective indices introduce uncertainties into the results for humidity-heat CEs. Most importantly, extremes derived from these empirical indices cannot ensure that the extremes of Tmean and RH are synchronous, and they are related more to extremely humid-hot events than to dry-hot or humid-cold extremes.
To solve this problem, Faranda and his colleagues developed a dynamical systems (DSs) method that takes two variables (such as Tmean and RH) observed for a period over a given domain as two DSs, from which an instantaneous coupling index is calculated to quantify the coherence between the two fields at each moment (Faranda et al 2020, Luca et al 2020a, 2020b) and to study the CEs. Furthermore, the impacts of extreme events can be aggravated when they occur simultaneously over a large contiguous domain; such events are called regional extremes (REs, such as figures 1(b) and (c)). For this kind of event, the measure mostly used in previous studies is to count the number of grids or stations meeting the given criteria at a time over a given region. When the number exceeds a preset relative threshold (such as 40%), an RE is considered to occur over the analyzed region (Ding and Ke 2015, An and Zuo 2021). In order to characterize or classify the REs, an index based on the regional average is calculated over the analyzed region (Trenberth and Stepaniak 2001, Ding and Ke 2015, Xu et al 2021). Since the proportion of grids/stations experiencing extremes over the analyzed region is not always 100%, the strength of the REs is usually underestimated after regional averaging, especially when there is a dipole or tripole spatial pattern (figure 1(c)). Similarly, since the coupling index is designed only to assess how closely two atmospheric fields resemble their predefined states at each time step (Faranda et al 2020, Luca et al 2020a, 2020b), not to discriminate different spatial structures (such as the three typical patterns in figure 1), the DS method cannot be directly used to fully capture the REs either.
When CEs occur within an extensive domain, forming so-called regional compound extremes (RCEs), the situation would be even more serious, causing more economic and social losses than non-CEs or CEs at a single station (Li 2020, Raymond et al 2020, Marx et al 2021). Studies in this direction are scarce, mainly because no appropriate method exists to study this kind of extreme. In this study, the main objective is to develop a suitable procedure to study the RCEs, specifically the humidity-heat RCEs. For this purpose, we first adopted the DS method, which is adept at characterizing the evolution of atmospheric states (Faranda et al 2017a, 2017b, 2019), especially at sorting out the CEs (Faranda et al 2020, Luca et al 2020a, 2020b). We then also took advantage of both domain averaging (Trenberth and Stepaniak 2001, Ding and Ke 2015, Xu et al 2021) and cluster analysis (Caliński and Harabasz 1974, Stéfanon et al 2012) in discriminating and classifying different spatial structures, which are closely related to the organized spatial patterns found in different kinds of RCEs. In this way, we can find a novel type of RCE missed in previous studies, and the different plausible mechanisms are revealed by composite analysis (Stéfanon et al 2012).
To test the performance of our proposed procedure, a key domain has to be chosen. China is among the countries that are severely affected by heat extremes, and it is also one of the hotspots of multivariate compound events around the world (Ridder et al 2020). In particular, the mid-lower reaches of the Yangtze River are the region with the most frequently occurring extremes and a hotspot for hot CEs over China (Luo et al 2020). We take humidity-heat CEs over this region as an example to test our proposed procedure and to demonstrate its performance in studying and characterizing the humidity-heat RCEs. The rest of this study is structured as follows. Data and methods are introduced in section 2. Section 3 shows that the strong coupling days have highly correlated Tmean and RH. We then classify the RCEs into two types and analyze their spatial and temporal features. Finally, some plausible mechanisms behind the two types of RCEs are discussed, followed by a brief summary in section 4.

Study region and datasets
In this study, the domain for the mid-lower reaches of the Yangtze River is defined as a pentagonal region determined by the land area east of 111°E, 26°N-35°N (figure 1(a)), which is similar to those in previous studies (Luo et al 2020). The daily Tmean and RH employed to calculate the DS parameters are from the European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis version 5 (ERA5; Hersbach et al 2020) from 1981 to 2020 with a spatial resolution of 0.25° × 0.25°. In order to obtain convincing results, DS parameters were also calculated from daily Tmean and RH at 150 national standard stations located in the study region, provided by the China National Weather Information Center (http://data.cma.cn).
For composite analysis, daily geopotential height, zonal and meridional wind and surface air temperature were obtained from the National Centers for Environmental Prediction-National Center for Atmospheric Research Reanalysis (NCEP/NCAR-R1; Kalnay et al 1996), and outgoing longwave radiation (OLR) was obtained from NOAA's Outgoing Longwave Radiation Daily Climate Data Record (Lee 2014), all with a 2.5° horizontal resolution, in order to investigate the large-scale atmospheric features in relation to the RCEs. Anomaly fields were computed by removing the daily climatology defined for the period from 1981 to 2020. This study mainly focused on the boreal summer season (June-July-August, JJA).
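The anomaly computation described above can be sketched as follows. This is a minimal illustration, assuming the field is stored as a NumPy array of shape (time, lat, lon) and that a calendar-day label is available for every sample; the function name `daily_anomalies` and its arguments are hypothetical, and no smoothing of the climatology is applied because the text does not mention any.

```python
import numpy as np

def daily_anomalies(field, day_of_year):
    """Remove the multi-year mean of each calendar day from a (time, lat, lon) array.

    field: array of shape (n_days, n_lat, n_lon)
    day_of_year: integer array of shape (n_days,), calendar-day label of each sample
    """
    field = np.asarray(field, dtype=float)
    day_of_year = np.asarray(day_of_year)
    anomalies = np.empty_like(field)
    for day in np.unique(day_of_year):
        idx = day_of_year == day
        # Climatology for this calendar day = mean over all years in the record
        anomalies[idx] = field[idx] - field[idx].mean(axis=0)
    return anomalies
```

With a 1981-2020 record restricted to JJA, each calendar day would be averaged over 40 samples before being subtracted.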

Dynamical systems method
The DS method considers the observations of a meteorological variable over a period within a given region as a DS $x_{i,j}(t)$, such as daily mean temperature $\mathrm{Tmean}_{i,j}(t)$ or relative humidity $\mathrm{RH}_{i,j}(t)$ in this study. For a given state $\zeta_{i,j}$ (a latitude ($i$)-longitude ($j$) map of a variable for a specific day, see figure 2(a)), the analogues of $\zeta_{i,j}$ on the evolutionary trajectory can be quantified by the function
$$g(x_{i,j}(t), \zeta_{i,j}) = -\log\left[\mathrm{dist}\left(x_{i,j}(t), \zeta_{i,j}\right)\right],$$
with 'dist' denoting the Euclidean distance between $\zeta_{i,j}$ and a state on the trajectory of $x_{i,j}(t)$. Given a threshold $s_X$ (such as the 98th quantile of $g(x_{i,j}(t), \zeta_{i,j})$; the DS method is not sensitive to the choice of the quantile within the range from the 90th to the 99th), we extract the exceedances and obtain two DS parameters: the univariate inverse of instantaneous persistence $0 \leqslant \theta \leqslant 1$ and the univariate instantaneous dimension $d = 1/\sigma$, where $\sigma$ is the scale parameter of the generalized Pareto distribution fitted to the exceedances. The instantaneous persistence $\theta^{-1}$ measures the residence time around such a state, and the instantaneous dimension $d$ can be interpreted as a proxy for the active degrees of freedom of the system around a given state. This method can also be extended to bivariate cases by defining a coupling strength index for the concurrence between two variables (Faranda et al 2020, Huang et al 2022):
$$\alpha_{x,y}(\zeta_x, \zeta_y) = \frac{\nu\left[g(x_{i,j}(t), \zeta_x) > s_X \mid g(y_{i,j}(t), \zeta_y) > s_Y\right]}{\nu\left[g(x_{i,j}(t), \zeta_x) > s_X\right]},$$
where $x_{i,j}(t)$ and $y_{i,j}(t)$ represent the dynamical trajectories of two variables (e.g. Tmean and RH), and $\nu[\,\cdot\,]$ denotes the number of cases meeting the condition in the bracket for given thresholds $s_X$ and $s_Y$ (defined as in the univariate case). Here the coupling strength $0 \leqslant \alpha_{x,y}(\zeta_x, \zeta_y) \leqslant 1$ measures the extent to which the dynamical states of the two systems at a certain time are concurrent. When $\alpha = 0$, there is no concurrence between the two variables at that moment. When $\alpha = 1$, every time $x_{i,j}(t)$ on another day closely resembles the given state $\zeta_x$, $y_{i,j}(t)$ also resembles the given state $\zeta_y$ at the same time.
Similarly, the bivariate inverse of instantaneous persistence θ x,y and the bivariate instantaneous dimension d x,y can also be calculated to characterize the co-variations of the two fields. For more details on the DS method and the related Matlab code, please refer to Faranda et al (2020).
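To make the definitions above concrete, the following Python sketch computes the instantaneous dimension and the coupling index for one reference day. It is an illustration under stated assumptions, not the Matlab code of Faranda et al (2020): the function names are hypothetical, the scale parameter σ is estimated by the mean exceedance (which assumes an exponential tail), no two days are assumed to have identical fields, and the persistence θ is omitted because it requires an extremal-index estimator that is not specified here.

```python
import numpy as np

def neg_log_distances(fields, t):
    """g = -log(Euclidean distance) between the map of day t and every other day.

    fields: array of shape (n_days, n_lat, n_lon); day t plays the role of zeta.
    Assumes no other day has a field identical to day t (distance > 0).
    """
    flat = fields.reshape(fields.shape[0], -1)
    dist = np.linalg.norm(flat - flat[t], axis=1)
    return -np.log(np.delete(dist, t))  # drop the zero self-distance

def local_dimension(g, q=0.98):
    """d = 1/sigma, with sigma estimated as the mean exceedance above the q-quantile."""
    s = np.quantile(g, q)
    exceed = g[g > s] - s
    return 1.0 / exceed.mean()

def coupling_index(gx, gy, q=0.98):
    """alpha = (days that are analogues in both fields) / (days that are analogues in x)."""
    rx = gx > np.quantile(gx, q)
    ry = gy > np.quantile(gy, q)
    return np.sum(rx & ry) / np.sum(rx)
```

Looping `t` over all days would give daily series of d and α; because the same quantile is used for both variables, the two fields have the same number of analogues and α is symmetric in x and y.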
In this study, we computed three DS metrics, d, θ −1 and α, from Tmean and RH in order to study humidity-heat RCEs from a DS perspective. We term the days with α exceeding the 90th percentile of all JJA days as high-α events (figure 2(b)); the results reported in this study are qualitatively insensitive to the choice of the percentile within the range from the 80th to the 95th. The two α series computed from ERA5 and station observations are highly correlated (correlation coefficient of 0.91), which confirms the robustness of our results (figure S1 (available online at stacks.iop.org/ERL/17/064032/mmedia)).
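The selection of high-α events amounts to a percentile threshold on the daily α series, as in this small sketch. The function name `high_alpha_days` is hypothetical, and NumPy's default linear interpolation for the percentile is an assumption, since the text does not state how the percentile was estimated.

```python
import numpy as np

def high_alpha_days(alpha, percentile=90.0):
    """Return a boolean mask of days whose alpha exceeds the given percentile, and the threshold."""
    alpha = np.asarray(alpha, dtype=float)
    threshold = np.percentile(alpha, percentile)
    return alpha > threshold, threshold
```

Changing `percentile` between 80 and 95 would reproduce the sensitivity check mentioned above.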

K-means
K-means is one of the most commonly used unsupervised learning algorithms for dividing objects into groups. In this study, the K-means clustering method is applied to the daily Tmean of high-α events to further discriminate events with different organized patterns (the results are insensitive to whether Tmean, RH, or both Tmean and RH are used, since the variables in the sorted high-α events are highly correlated, figure 3). The Calinski-Harabasz criterion was used to determine the optimal number of clusters, by maximizing the ratio of the variance between clusters (heterogeneity between clusters) to the variance within clusters (homogeneity within a cluster) (Caliński and Harabasz 1974; figure S2).
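The clustering step can be sketched with a plain NumPy implementation of Lloyd's algorithm and the Calinski-Harabasz score. This is a minimal illustration and not the implementation used in the study: all function names are hypothetical, a single random initialization is used, and each daily Tmean map is assumed to have been flattened into one row of `X`.

```python
import numpy as np

def assign(X, centroids):
    """Label each sample with the index of its nearest centroid."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm on samples X of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Keep the old centroid if a cluster happens to be empty
        updated = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centroids[j] for j in range(k)])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(X, centroids)

def calinski_harabasz(X, labels):
    """Between-cluster over within-cluster dispersion, each scaled by its degrees of freedom."""
    n, groups = len(X), np.unique(labels)
    k = len(groups)
    overall = X.mean(axis=0)
    between = within = 0.0
    for j in groups:
        members = X[labels == j]
        center = members.mean(axis=0)
        between += len(members) * np.sum((center - overall) ** 2)
        within += np.sum((members - center) ** 2)
    return (between / (k - 1)) / (within / (n - k))

def choose_k(X, candidates=range(2, 7)):
    """Return the candidate k with the largest Calinski-Harabasz score and its labels."""
    results = {k: kmeans(X, k) for k in candidates}
    best = max(results, key=lambda k: calinski_harabasz(X, results[k]))
    return best, results[best]
```

In practice several random restarts per k would make the result less dependent on the initial centroids.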

High-α events have highly correlated Tmean and RH states
First of all, there is no well-defined correspondence between coupling and regional (humidity or heat) strength (figure 3(a)). In the scatter plot of the regional averaged RH (RARH) versus the regional averaged Tmean (RATM) in JJA (figure 3(a)), the coupling indices scatter over the whole space. Even for the RCEs defined by RARH and RATM (RARH and RATM exceeding the seasonal 90th percentile or falling below the seasonal 10th percentile), coupled (high-α) and uncoupled (non-high-α) events cannot be differentiated (figure 3(a)). This result reconfirms that extremes derived from regional-averaged indices or empirical indices cannot ensure that the extremes of Tmean and RH are synchronous.
For high-α events, however, there is a marked monotonic relation between RARH and RATM in JJA (figure 3(a)). More importantly, these highly correlated REs are also spatially correlated, i.e. high-α events tend to have a stronger negative spatial correlation (figure 3(b)), which indicates that the spatial distributions of Tmean and RH for these high-α events are highly similar in structure but opposite in sign. It is worth pointing out that, according to the definition of α, there seems to be no direct relationship between the coupling strength and the spatial patterns of the analyzed variables. However, the joint recurrence of the two fields implies that there may be common underlying dynamics for these two variables, such that they are more likely to show similar patterns.
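The spatial correlation between the Tmean and RH maps of a given day can be illustrated with a simple pattern correlation. This is a sketch under assumptions: the function name `pattern_correlation` is hypothetical, grid points outside the study region are assumed to be marked as NaN, and no area weighting is applied because the text does not say whether one was used.

```python
import numpy as np

def pattern_correlation(a, b):
    """Unweighted Pearson correlation across grid points between two (lat, lon) maps."""
    a = np.ravel(np.asarray(a, dtype=float))
    b = np.ravel(np.asarray(b, dtype=float))
    valid = np.isfinite(a) & np.isfinite(b)  # skip masked (NaN) grid points
    return np.corrcoef(a[valid], b[valid])[0, 1]
```

A value close to -1 corresponds to the case described above, in which warm anomalies coincide with dry anomalies over the whole domain.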
However, their composites (figures 3(c) and (d)) do not closely resemble the typical RE shown in figure 1(b). This may be due to the wide range of RARH and RATM among high-α events (figure 3(a)), which may result from different types of REs with different spatial patterns similar to those given in figures 1(b) and (c), or from the coexistence of REs and non-REs (low RARH or RATM).

Two types of RCEs and their spatial-temporal features
Since the coupling strength α cannot discriminate different spatial structures (Huang et al 2022), and there are multiple types of system states for both Tmean and RH (see the three typical patterns of Tmean in figure 1), we applied K-means cluster analysis to check whether there are different patterns, or a coexistence of REs and non-REs, among the high-α events. The optimal number of clusters for high-α events determined by the Calinski-Harabasz criterion is two (figure S2), and we term them high-α1 and high-α2 (figure S3(a)). The composite anomalies of Tmean and RH for both types of high-α events exhibit organized spatial patterns (figure 4), which are similar to the two typical states shown in figures 1(b) and (c), respectively. However, no high-α state resembles figure 1(d), i.e. there is no non-RE. The patterns of Tmean and RH of high-α1 are generally regionally homogeneous, with positive anomalies and a strong center covering almost the whole area in Tmean (figure 4(a)) and negative anomalies with a strong center in RH (figure 4(c)), thus forming a dry-hot RCE. The patterns of high-α2 show a meridional dipole structure in both Tmean (figure 4(b)) and RH (figure 4(d)), forming a dry-hot RCE in the southern sub-region and a humid-cold RCE in the northern sub-region. This is a novel type of RCE, with two kinds of RCEs occurring simultaneously.
The difference between the two types of RCEs is also revealed in the relation between the bivariate (RH-Tmean) inverse persistence and instantaneous dimension: the coupling strength α is found to decrease with increasing d and θ, and high-α events have lower values of inverse persistence and instantaneous dimension (figure S3(b)). A similar feature can also be found for the univariate inverse persistence versus instantaneous dimension (figures S3(c) and (d)). These results suggest that high-α events may have more organized spatial patterns and persist for a longer time, which is consistent with the characteristics of REs. Meanwhile, the lower d and θ in high-α1 than in high-α2 indicate that high-α1 events take the most uniform spatial patterns and the longest-lasting temporal behavior. Most of the RATM values for high-α1 exceed the seasonal 90th percentile, with RARH below the seasonal 10th percentile (figure S3(a)). Table 1 summarizes the mean regional averaged Tmean, RH and heat index (HI) within each type, where HI represents the temperature perceived by the human body, considering both temperature and RH (the empirical formula is provided in the Supplement text). With traditional methods (e.g. a regional averaged index), the second type of RCEs (high-α2) with dipole patterns will be missed because of their low regional averages. In fact, the RATM for high-α2 over the southern sub-region (grids with positive anomalies) is 29.90 °C, which already exceeds the threshold of the seasonal 90th percentile.
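The way regional averaging hides a dipole event can be seen in a small synthetic example. The numbers below are idealized and are not taken from the study: a ±3 °C anomaly is assumed on a hypothetical 10 × 8 grid spanning 26°N-35°N, only to show the cancellation.

```python
import numpy as np

lat = np.linspace(26.0, 35.0, 10)  # hypothetical grid latitudes, 1 degree apart

# Idealized anomaly (in degrees C): +3 in the southern half, -3 in the northern half
dipole = np.where(lat < 30.5, 3.0, -3.0)[:, None] * np.ones((1, 8))
# Idealized regionally homogeneous event of the same local amplitude
uniform = np.full_like(dipole, 3.0)

regional_mean_dipole = dipole.mean()         # warm and cold halves cancel
regional_mean_uniform = uniform.mean()       # full amplitude is retained
south_mean_dipole = dipole[dipole > 0].mean()  # average over warm grid points only
```

Here the dipole event has a regional mean of zero even though its southern half is as warm as the homogeneous event, which is why averaging only over the grids with positive anomalies recovers the extreme.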
The distinct features of high-α events are also revealed in their temporal behavior. High-α events last longer in both Tmean and RH (figure 5), in accordance with their lower θ in figure S3(b). The gap in time-delay correlation between high-α1 and high-α2 is not obvious in RH but is much wider in Tmean, which is in line with the finding that the values of θ for high-α1 and high-α2 are much more distinct in Tmean than in RH (figures S3(c) and (d)). Meanwhile, the values of d and θ for high-α2 are larger than those for high-α1 in both Tmean and RH.

Different mechanisms behind the two types of RCEs
The different features found for the two types of RCEs may be associated with different physical mechanisms. Previous studies have found that anomalous anticyclonic circulation and thermodynamics (and the related reduced cloud cover, increased solar radiation, adiabatic heating and moisture transport) are the fundamental components of hot extremes.
A significant positive anomaly in 850 hPa geopotential height can be found over the Yangtze River valley in high-α1 events, and a well-defined positive anomaly can also be found in the 500 and 200 hPa geopotential height fields, forming a barotropic structure over the mid-lower reaches of the Yangtze River (figures 6(a) and (c)). This quasi-barotropic feature has been found to be of great importance for HWs, since it can result in reduced cloud cover and increased solar radiation and provide favorable conditions for the formation of hot extremes. In contrast to high-α1, there is a north-low and south-high dipole pattern in the 850 hPa geopotential height anomaly for high-α2 (figure 6(b)). This dipole persists up to the 500 hPa geopotential height field (figure 6(d)), but the region is entirely controlled by a positive anomaly in 200 hPa geopotential height (figure 6(b)). This peculiar pattern conforms to the results found in figure 4(b) and has not been reported in previous studies.
Previous studies have found that heat waves over East Asia, especially Korea, can be classified into two distinct types based on the spatial patterns of atmospheric circulation anomalies: a zonal wave type characterized by pronounced zonal wave trains across the Eurasian continent, and a meridional wave type characterized by the Pacific-Japan (PJ) pattern (Yeo et al 2019). A marked zonal wave-like pattern across the Eurasian continent, termed the Circumglobal Teleconnection (CGT) by Ding and Wang (2005), can be found in both clusters (represented by a dashed purple arrow in figures 6(a) and (b)); it is strong for high-α2 but weak for high-α1. Another prominent circulation pattern affecting hot extremes that appears in high-α events is the PJ pattern, represented by a dashed black arrow and characterized by a meridional wave-like train with two key centers located near Taiwan (22°N, 120°E) and Japan (35°N, 140°E), respectively (figures 6(a) and (b)). The negative-phase PJ pattern may play an important role in the formation of the RCEs, whereas the positive-phase PJ pattern inhibits the RCEs over the Yangtze River valley but promotes HWs over Korea and Japan (Noh et al 2021). Prolonged RCEs are formed by the combination of the CGT and PJ patterns, and the location of these wave trains determines whether the spatial structure of Tmean and RH is domain-homogeneous or dipole.
Generally, heat waves over East Asia are also related to the enhancement of the well-known Western North Pacific Subtropical High (WNPSH; Luo et al 2020, Noh et al 2021). The intensification and westward extension of the WNPSH are significant at 500 hPa in high-α events (figures 6(c) and (d)). In addition, the southwesterly flow at the edge of the WNPSH can bring a large quantity of warm air from the ocean, leading to strong warming over coastal areas. This warming effect is pronounced in high-α events, with a strong southwesterly flow dominating the whole area (figures 6(e) and (f)). However, the flow in the northern part for high-α2 turns almost westerly, so that the northern part is not as warm as the southern part, thus forming a dipole structure. Overall, horizontal advection is one of the principal processes responsible for the near-surface warming in high-α events. The total cloud cover and the vertical integral of the divergence of moisture flux (figure S4) both exhibit a regionally homogeneous pattern in high-α1 and a dipole in high-α2, which is in line with the features of Tmean and RH shown in figure 4.

Discussion
It should be pointed out that previous studies classified the extreme events over this study region mainly as humid heat waves, based on a threshold of 60% for RH (RH60): a HW is classified as humid when its RH is larger than RH60 (Ding and Ke 2015, Xu et al 2021). Under this criterion, both high-α1 and high-α2 events would be classified as humid-hot RCEs, since RH is relatively constant during JJA over the study region and is always larger than RH60 (table 1). These results do not contradict those given in the previous section, since one classification is based on an absolute value that reflects human physical perception and the other is based on a relative threshold of the distribution of RH in JJA over a specific region.
Another issue related to the RCEs is the great importance of the matched spatial-temporal structures between the Tmean and RH fields (figure 3(b)). Methods based on regional averaging can only capture average information about the spatial field, but fail to capture its spatial organization, including the gradient of a given field (Trenberth and Stepaniak 2001), not to mention the concurrence of organized spatial patterns. Similarly, DS metrics can sort out the RCEs, but they cannot discriminate different types of RCEs with distinct organized spatial patterns. This is the main reason why we proposed a new framework that combines the advantages of DS metrics, regional averaging and clustering analysis to fully characterize the RCEs. The framework can be summarized as follows: (a) taking two variables, such as Tmean and RH, over a given large domain as two DSs, the instantaneous coupling index α from the DS method is calculated; (b) for a chosen threshold of the coupling index α, the RCEs are sorted out; (c) through clustering analysis, the RCEs with distinct organized spatial structures are classified; (d) by means of composite analysis, the different mechanisms behind the different types of RCEs are identified.

Conclusions
In this study, taking the evolution of the Tmean and RH fields as two DSs, the DS method was employed to investigate the humidity-heat RCEs in the mid-lower reaches of the Yangtze River. According to the dynamical coupling metric α, strongly concurrent events tend to occur when daily high temperature and low RH extremes exist simultaneously over a large domain, which forms a dry-hot RCE. To fully characterize the dynamical properties of these compound extremes, the local dimension d and the inverse instantaneous persistence θ were also computed; both are found to decrease with increasing α. In other words, d and θ are simultaneously low on a high-α day, which is conducive to a sustained compound extreme with a contiguous regional extension. The dynamical coupling metric α can sort out the concurrent spatial-temporal patterns between the Tmean and RH fields, but it cannot discriminate the different orderly structures in spatial patterns, such as domain-homogeneous or dipole. K-means clustering is efficient in classifying the sorted events into different types of high-α events. Compared with a regional averaged index, the coupling index combined with K-means clustering can pick out RCEs with dipole or even more complicated orderly spatial patterns, for which a sub-region has already met the criterion of RCEs but which may still be missed or underestimated by traditional statistical methods because of the regional averaging. The computed spatial correlation between Tmean and RH and the time-delay correlation for each variable show that the proposed procedure performs better, with both high-α1 and high-α2 events taking distinct orderly patterns and lasting far longer. These results indicate that the proposed DS-based procedure is more suitable for classifying, characterizing and quantifying the RCEs.
The composite analysis further illustrates that the two kinds of events found above are associated with different physical mechanisms. For the high-α1 RCEs, a high-pressure anomaly with a barotropic structure is prominent over the mid-lower reaches of the Yangtze River, accompanied by an intensified and westward-extended WNPSH and a strong negative meridional PJ pattern but weak mid-latitude zonal wave trains. In contrast, a north-low and south-high pressure anomaly with a quasi-barotropic structure dominates the high-α2 RCEs, accompanied by an intensified and westward-extended WNPSH, a strong negative meridional PJ pattern and strong mid-latitude zonal wave trains. The marked differences between high-α1 and high-α2 events lie in the strength of the zonal wave trains and the location of the negative (or positive) central pressure anomaly in the PJ pattern. Additionally, the contribution of horizontal advection is quite important for high-α events, bringing a large quantity of warm air from the ocean and thus inducing pronounced warming over the study region. In this study, a novel type of RCE with a dipole structure is found for the first time, with distinct physical mechanisms both dynamically and thermodynamically. These results are important for an in-depth understanding of the extreme events (Ding and Ke 2015, Xu et al 2021). These novel findings can considerably improve the understanding of the RCEs and may improve the forecasting skill for these health-damaging events (Tang et al 2021).

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).