Probability of compound climate extremes in a changing climate: A copula-based study of hot, dry, and windy events in the central United States

Climate extreme events exert disproportionate impacts on ecosystems and humankind. Focusing on univariate statistics to estimate the harm from compound extreme events usually falls short in communicating the real risk. Here, the co-occurrence of hot, dry, and windy events (HDWs) in the central United States was analyzed over the period 1949–2018. Results demonstrate south-west Kansas and north Texas as locations where HDWs are more frequent. The combination of drought and a heatwave in 1980 and 2011, increased the likelihood of HDWs. Use of copula enables the study of the co-occurrence of multiple extremes. The copula approach identified a greater risk of HDWs compared with traditional empirical analysis. The dependence structure between the temperature, humidity, and wind variables showed no effect on the co-occurrence frequency of HDWs in the warm-season (May through September). Results suggest an increase in the risk of HDWs in spite of the historical wind speed drop across the majority of Great Plains. Multivariate perspectives are necessary for a more informed assessment of compound extremes risk and for improved design of adaption strategies.


Introduction
In September 1882, Kansas and Missouri suffered from several consecutive days with severe hot, dry, and windy events (HDWs) with a temperature higher than 38 • C and the relative humidity lower than 17%. The 'furnace winds' from the southwest withered and burned up the vegetation and suspended outdoor labor (Curtis 1891). In another severe condition, after the prolonged, drought in 1887, the HDWs during July and August in Kansas damaged 30% of the promising crop worth of seven million dollars (Curtis 1891). These historic examples of extreme weather-/climate events with more than one weather variable involved are called the compound extremes. Compound extreme events are defined by IPCC as '(1) two or more extreme events occurring simultaneously or successively, (2) combinations of extreme events with underlying conditions that amplify the impact of the events, or (3) combinations of events that are not themselves extremes but lead to an extreme event or impact when combined' (Seneviratne et al 2012). Leonard et al (2014) also defined a compound extreme as 'an extreme impact that depends on multiple statistically dependent variables or events' . HDWs are exceedance/non-exceedance compound extremes when high temperature and moderately high wind speeds are accompanied by low relative humidity.
HDW events have been observed to increase evapotranspiration (Derrel et al 1993), cause a suite of severe impacts on croplands (Curtis 1891, Lydolph 1964, Yang and Wang 1978, Lydolph and Williams 1982, Leathers and Harrington 2001, Wang et al 2016, influence the rapid onset of flash drought (Otkin et al 2018, Christian et al 2019, and initiate or intensify wildfires (Flannigan and Harrington 1988, Jolly et al 2015, Srock et al 2018. In Oklahoma, wind speeds of 3.5 m s −1 to 7 m s −1 , when relative humidity ranged from 15% to 30%, were considered as wildfire thresholds (Reid et al 2010). In the southern Great Plains, a temperature of 34 • C-41 • C, relative humidity of 18%-30%, and wind speed of 6.5-8.6 m s −1 were considered favorable for wildfire in the growing season (Krueger et al 2015).
Based on our review of published literature, fewlimited studies have analyzed the spatial distribution and frequency of HDWs in the contiguous United States (CONUS). Lydolph and Williams (1982) analyzed the pattern of HDWs in the eastern half of the CONUS including the Great Plains. Analyzing data during 1951-1960 along with case-studies from 1961-1963 and 1980, they found a higher occurrence of HDWs in the Great Plains. Examining both the Great Plains and western half of CONUS, Leathers and Harrington (2001) found the greatest occurrence of hot and dry conditions in the southwest deserts over the period of 1948-1993. However, the rare occurrence of high-speed winds in the southwest shifted the higher probability of HDWs toward the southern Great Plains (Leathers and Harrington 2001). There is substantial evidence that Northern Hemisphere HDWs are usually observed between mid-May and mid-September for the CONUS and in mid-summer in the Great Plains (Lydolph and Williams 1982). Lydolph and Williams (1982) classified HDWs or 'sukhovey' into five classes, with the temperature threshold varying across a range of 29 • C-38 • C, the humidity threshold lower than 30%, and with wind speeds higher or lower than 7 m s −1 . Leathers and Harrington (2001) defined a HDW or 'furnace wind' event with a temperature higher than 35 • C, relative humidity lower than 30%, and wind speed equal to or higher than 7 m s −1 . The 35 • C threshold for temperature was considered a critical temperature for crop development in the CONUS (Leathers and Harrington 2001). The critical temperature threshold may be adjusted for different crop types. For example, a temperature higher than 29 • C, 30 • C, and 32 • C would be harmful to corn, soybean, and cotton, respectively and reduces the productivity of these crops by 30%-82% when the temperature is higher than the identified thresholds (Schlenker and Roberts 2009). Analysis of the influence of global warming on CONUS crop agriculture revealed the harmfulness of temperatures above 34 • C (Schlenker et al 2006). Here, the fixed thresholds (35 • C for temperature, 7 m s −1 for wind speed, and 30% for relative humidity) were used to analyze the co-occurrence of HDWs. This work is novel by analyzing the risk of HDWs and their spatial and temporal distribution in the CONUS and using copula to analyze the trivariate cooccurrence of HDWs. Based on the knowledge available now, no study has analyzed the trivariate cooccurrence of HDWs applying copula. While HDW events damage crops and encourage the spread of wildfire, little attention has been paid to risk analysis of HDWs and their spatial and temporal distribution in the CONUS. Here, we modeled the joint behavior of the three elements that combine to create a HDW as well as examining the temporal and spatial changes of HDWs over a 70-year period in the central CONUS. Copula families have recently been used to characterize compound hydrologic and climatic extremes. A copula is a function that links or joins a multivariate distribution function to its single dimension marginal distribution functions (Nelsen 2007). Diverse copula families characterize the scale-free dependence between/among random variables.

Data
A long-term dataset extending from 1949 through 2018 with an hourly time-step was obtained from the HadISD (www.metoffice.gov.uk/hadobs/hadisd/). HadISD is a global, station-based, sub-daily, and quality-controlled dataset developed based on the Integrated Surface Database (ISD) at the National Oceanic and Atmospheric Administration's National Centre for Environmental Information (NOAA/NCEI, formerly the National Climatic Data Center-NCDC; Dunn et al 2012Dunn et al , 2016. The dataset has been extended to cover from 1931 to the present and is designed to study the long-term changes of extreme temperature, pressure, and humidity (Dunn et al 2016). The advantage of using this dataset is the availability of all three variables (temperature, relative humidity, and wind speed) for all stations. The data from 1931-1948 were excluded because few stations had minimal missing data. For 1949-2018, stations with less than 10% missing data in temperature, relative humidity, and wind speed for the warm season were selected to study the changes of compound HDWs in the central CONUS. Missing data at a station was identified when at least one of the three climate variables was not available. Stations that had less than 10% missing data but had no data for the entire year were also excluded to reduce bias in the trend analysis results. Twenty-seven stations are available for this study of HDW events during the period 1949-2018 for the 10 states in the CONUS that comprise the Great Plains (figure 1).

Methods
Two different approaches were used to investigate the compound extremes of HDWs. An empirical method, a simple annual count of the hours with compound exceedances of temperature and wind speed above 35 • C and 7 m s −1 respectively, and relative humidity below 30%, was used. The second approach develops copula families to model the trivariate dependence of temperature, relative humidity, and wind speed. For the three random variables of X (e.g. relative humidity), Y (e.g. temperature), and Z (e.g. wind speed) with cumulative distribution functions F Z (z) = Pr (Z ⩽ z), the trivariate joint distribution function or copula (C) can be written as: (u, v, z) (1) where three variables are less than an established threshold and u, v, and z are the uniformly distributed marginals on [0,1]. However, there is an obvious difference between non-exceedance (F) and exceedance probabilities (P). The probability of hours in each warm season when temperature and wind speed exceed the 90th percentile and the relative humidity is at the same time below the 10th percentile is called the exceedance/non-exceedance probability. The exceedance/non-exceedance probability can be written as: where v ′ = 1 − v, and z ′ = 1 − z are the exceedance probabilities of temperature and wind speed, respectively, and u is the exceedance probability of relative humidity below the threshold. Therefore, the equation can be solved as (Nelsen 2007, Liu et al 2018: Then, the trivariate extremes would be defined as the area in three-dimensional space where temperature and wind speed exceed the established threshold, while relative humidity is lower than a threshold. If considering the 90th and 10th percentile thresholds as it is done in the literature of extremes and climate change; hence, v ′ = z ′ = 0.9 and u = 0.1. When considering the fixed thresholds, the value of u, v ′ , and z ′ are not necessarily equal.

Copula families.
Six copula functions were used from Archimedean and Elliptical copulas that have been widely used in hydrological and climate studies (Aghakouchak et al 2014, Zscheischler and Seneviratne 2017, Hao et al 2018, Liu et al 2018, Zhou and Liu 2018, Tavakol and Rahmani 2019a. Clayton, Gumbel, Frank and Joe copulas with different patterns of dependence structure were selected among several Archimedean copulas (figure 2). Frank copula is a symmetric copula with no tail dependence. However, the Clayton copula has a greater lower tail dependence and Joe and Gumbel copulas have a greater upper tail dependence (Nelsen 2007). Gaussian and Student-t are Elliptical and radially symmetric copulas with no tail dependence for the Gaussian copula (figure 2). For the Student-t copula, tail dependence depends on the correlation and the degree of freedom (Nelsen 2007). Sadegh et al (2017) provides a complete list of copula families and their related functions as well as the range of parameters for each copula.

Fitting the copula.
Before modeling the joint probability distribution, it is necessary to transform random variables (i.e. temperature, relative humidity, and wind speed) to uniformly distributed marginals [0,1] through calculating the normalized ranks, which is a common process for copula analysis ( After fitting the best copula for each year and each station, the exceedance probability was given by equation (3). The frequency of extreme HDWs was calculated using both empirical (counting) and copula methods. The limitations of the empirical approach are that it needs a long period of data and cannot Figure 3. The 3-dimensional distribution of normally transformed temperature, relative humidity, and wind speed (a) and the contour plots of fitted and empirical copula for temperature and relative humidity (b), temperature and wind speed (c), and relative humidity and wind speed (d) for the station located at Brownsville in Texas at 97.4 • W and 25.9 • N. Frank copula was the best-fitted copula for all three bivariate random variable combinations (b-d) and the trivariate random variables. describe the compound interaction among the contributing variables (Hao et al 2018). Compared to copula, an empirical method has a higher uncertainty when calculating the probability of extreme events (Zscheischler and Seneviratne 2017, Hao et al 2018).
This research compares the probability of extreme events calculated from both the empirical method and the copula model. In addition, the dependence between variables was calculated. Regular correlation methods (e.g. Pearson, Spearman, and Kendall) indicate the dependence on a full range of data (Maritz 1995, Helsel andHirsch 2002) and does not consider the influence of variable distribution function on correlation (Hofert et al 2018). However, the measure of association, developed for each copula family, can calculate the dependence among variables after transforming the data variables to a standard uniform distribution (Hofert et al 2018). In this study, both regular correlation methods and the measure of associations (Spearman and Kendall; http://copula.r-forge.r-project.org/) were calculated and the impact of dependence on the probability of compound extremes was analyzed. For trivariate cases, when all three variables are considered together, the measure of associations (τ XYZ ) was calculated as the average of corresponding pairwise coefficients (Nelsen 1996, Genest et al 2011. where for example the τ XY , τ XY , and τ XY denote Kendall's tau for bivariate measure of associations between pairwise variables (e.g. temperature and wind speed). The Mann-Kendall test (Mann 1945, Kendall 1975, was used to identify any monotonic trends in the time series. In addition, the two-sample Kolmogorov-Smirnov (K-S) test (Massey 1951) was used to recognize the differences between the CDFs of single and compound extreme events values for the two 35-year sub-periods: 1949-1983 and 1984-2018.

Empirical vs copula
An important step in employing copulas to model the dependence between stochastic variables is the selection of a copula that best fits the data. Trivariate copulas were selected based on larger P-value, the lowest AIC and BIC, and the greatest log-likelihood. Considering the entire period , Student-t copula was the best fit for the majority of stations (89%). When analyzing each year separately, Student-t copula (74%) and Frank copula (20%) were the bestfitted trivariate families.
Counting the number of events when wind speed and temperature are both higher than their 90th percentile and relative humidity is simultaneously less than its 10th percentile threshold, is a relatively straightforward approach to calculate the probability of HDWs (Rahmani and Tavakol 2019). This empirical approach has been used to analyze the co-occurrence of compound or consecutive multiple extremes (Beniston 2009, Fischer and Knutti 2013, Hao et al 2013, Morán-Tejeda et al 2013. The limitations with the empirical method include the shortcoming of a lack of available long-term data and the inability to model the dependence among variables (Hao et al 2018). For example, if relative humidity, temperature, and wind speed are independent, for 3672 data pairs (hourly data for one warm-season) there would be an average of only four compound HDWs each year based on the 90th and 10th percentile thresholds (0.1 * 0.1 * 0.1 * 3672 = 3.7). Relative humidity, temperature, and wind speed could be considered independent in stations with less than annual 3.7 HDWs per warm season. In this study, the simple empirical approach was not done due to how HDWs were defined in previous research. The expected number of HDW events per year based on the 90th and 10th percentiles (3.7) should be different than what was determined using fixed thresholds that seem to matter given plant physiology and wildfire frequency.
After fitting the best trivariate copula on the data, the probability of compound HDWs was calculated for each station. Given that the fixed thresholds of 35 • C for temperature, 30% for relative humidity, and 7 m s −1 for wind speed matter for agricultural crops, the probability of HDWs was calculated using the fixed thresholds. The probability of a HDW was then calculated based on the computed percentiles. To test the robustness of the fitted copulas, the cooccurrence of HDWs was also calculated based on the empirical method. the different approaches lead to similar spatial patterns in the results with the highest probability of HDWs in the region extending from southwest Kansas into northwest Texas (figure 4). By

Variability of percentiles corresponding to the fixed thresholds
The spatial variability of the computed percentiles corresponding to the fixed thresholds of 35 • C for temperature, 30% for relative humidity, and 7 m s −1 for wind speed was analyzed. For temperature, the spatial pattern of differences in the value of percentiles corresponding to 35 • C is very small with a minimum of the 91st percentile in North Texas and a maximum of the 100th percentile in the northeast (North Dakota and South Dakota) and to the east of the Rocky Mountains in Colorado and Montana ( figure 5(a)). In these regions, 35 • C is considerably higher than the local 90th percentile of temperature and the lack of hot temperatures limits the occurrence of HDWs. For relative humidity, the lowest percentile values are located in the western half of the Great Plains ( figure 5(b)). The percentile corresponding to 30% relative humidity is very close to zero in coastal south Texas, which limits the local frequency of HDWs. For wind (figure 5(c)), the percentiles ranged from 68th percentile in southwest Kansas to the 93rd percentile in Montana. The 68th percentile for wind speed at the southwest Kansas station demonstrates the relatively high probability of wind speed exceeding 7 m s −1 . Temperature is is the most limiting variable for the co-occurrence of HDWs across the entire study area ( figure 5(a)). Hypothetically, in stations where the temperature percentile is lower, then more HDWs are expected. However, along the Texas coast, generally only wind speed and temperature meet the criteria for HDWs, since humid air flowing in from the Gulf of Mexico increases atmospheric water vapor in the Great Plains (Durán-Quesada et al 2010) and limits the local probability of occurrence of HDWs.

Dependence among variables
Pearson and Mann-Kendall tests were used to calculate the association between variables to analyze the influence of correlation on the probability of HDW events. The strongest and weakest linear correlation values were discovered between temperature and relative humidity (negative values from 0.48 to 0.82), and temperature and wind (0.0 to 0.57), respectively. To eliminate the influence of observed climate change on the long-term trends of variables, all three variables were de-trended linearly and correlations were re-calculated for the de-trended data. The Student-t test results indicated no statistically significant difference between the correlation values calculated based on the original and de-trended data. Calculated Mann-Kendall correlation values (figure 6) indicate a strong association between temperature and relative humidity (negative values from 0.33 to 0.66), especially in the south (Texas) and northwest (Montana). The correlations between temperature and wind (ranged from 0.0 to 0.45) and wind and relative humidity (negative values ranged from 0.05 to Figure 6. Mann-Kendall correlation values between temperature, wind speed, and relative humidity. The correlation values are negative between temperature and relative humidity, and wind speed and relative humidity. The highest and lowest correlation occur between temperature and relative humidity (a) and temperature and wind speed (b), respectively. The legend is applied for all three plots. Figure 7. The annual co-occurrence probability of extreme temperature and relative humidity (a), temperature and wind (b), and wind and relative humidity (c) in the central CONUS. The threshold was defined based on the temperature higher than 35 • C, relative humidity less than 30%, and the wind speed higher than 7 m s −1 . 0.44) were stronger in the south Texas and northeast Kansas (figure 6). Calculated Mann-Kendall correlation for the extreme temperature, wind, and relative humidity values (using fixed thresholds) showed the same pattern of association between variables. However, the association was weaker compared to when all data were considered and fewer stations showed statistically significant correlation between the extreme values of variables. Calculating the measure of association based on fitted copula showed the same relationship between pairwise variables. The Student-t test results showed no statistically significant difference between the correlation values and the measure of association values.
Although higher correlation between variables may increase the probability of compound events (Zscheischler and Seneviratne 2017, Zhou and Liu 2018), results demonstrated no influence of correlation on the probability of compound HDWs when using the fixed thresholds. Calculating the trivariate Kendall dependence among variables (equation (4)) based on the best fitted copula for each station in each year, the lowest dependency between variables was observed in western New Mexico and Colorado. The highest trivariate dependence was found in southern Texas. Again, no relationship was found between the dependence among variables and the highest frequency of HDWs.

Pairwise probability between variables
The copula approach was also used to model the bivariate relationship between randomly distributed variables. The highest probability of compound high temperature and low humidity was discovered in a region that extends from northwest Texas into southwestern Kansas ( figure 7(a)). The probability decreased toward the northwest and northeast. A similar pattern of change in the bivariate co-occurrence probability was found for temperature and wind speed ( figure 7(b)).
Although the occurrence probability of simultaneous moderate-speed winds and low humidity is higher in western Great Plains (Borchert 1950), low temperature in northern sections of these regions (figures 7(a)-(b)) limits the occurrence of HDWs. Comparing the calculated bivariate probabilities (figure 7) with correlations (figure 6) suggests no relationship between the variable inter-dependence and the probability of compound events. In principle, although variables, especially temperature and relative humidity, were well correlated, the extremes of the variables may not co-occur since there was no tail dependence (Nelsen 2007). Unlike previous studies that showed the effect of the high correlation between precipitation and temperature on the co-occurrence of extreme events (Zscheischler and Seneviratne 2017, Zhou and Liu 2018), here no influence of correlation among variables on the frequency of compound HDWs was discovered. This finding likely relates to the lack of dependence as exhibited by using the percentiles corresponding to the fixed thresholds. In addition, previous studies (Zscheischler and Seneviratne 2017, Zhou and Liu 2018) assessed the concurrency of other combinations of variables, such as low precipitation and high temperature. However, in this study of HDWs, the diverse nature of wind speed and humidity influence the results.

Temporal changes of HDWs
The changes in the probability of HDWs were analyzed over the two 35-year sub-periods individually. K-S test results revealed no difference in the distribution of HDWs for the two periods for a  (table 1). Stations with the highest frequency of HDWs (hotspots; figure 4) had a nonsignificant, but upward trend.
To better understand the effects of individual variables on the temporal variability of HDWs, the number of extreme hot (temperature higher than 35 • C), dry (relative humidity less than 30%), and windy (wind speed higher than 7 m s −1 ) events were calculated for each station. The average frequency of occurrence of events varied across the stations, with between 1-28 hot hourly events per warm season, 2-171 dry hourly events, and 237-1109 windy hourly events per warm season. Hot extremes with the lowest number of events were considered as the primary limiting parameter for HDW in the central CONUS. The highest frequency of both hot or windy events was found at stations in Texas and Kansas. For dry events, the greatest frequency was observed in western half of the study area, in Colorado, New Mexico, and Montana.
Mann-Kendall trend test results indicate an upward trend in extreme hot events in the majority (89%, 52% significant) of stations especially over the western Great Plains (figure 8, table 1). Two stations (7%) in the eastern Dakotas had significant downward trends for hot events. The majority of stations (67%, 11% significant) had an upward trend in the frequency of low relative humidity events. In the east and northeast, the trend for dry events was mostly decreasing. Windy events had a downward trend in a majority of the central CONUS (85%, 52% significant; figure 8, table 1). Warmer temperatures initially increase evapotranspiration rates; but as soils dry, there is a reduction in evaporative cooling (Seneviratne et al 2012). The greener vegetated lands in higher latitudes with longer growing seasons and higher transpiration rates (Wolf et al 2016) are expected to see a rise in temperature related to a decrease in soil moisture and a change in the partitioning of energy between latent and sensible heat fluxes (Alexander 2011). Accompanying the hotter and drier conditions, moderate winds are expected to increase evapotranspiration (Derrel et al 1993), increase plant damage (Curtis 1891, Yang and Wang 1978, Wang et al 2016, and create environments more favorable for the spread of wildfire (Srock et al 2018). Although the prediction of a flash drought is challenging (Pendergrass et al, 2020), knowing the risk of HDWs will inform those planning for a possible flash drought (Otkin et al 2018).

Discussion
The highest risk of HDWs mirrors the highest risk of the extremely hot and windy conditions. Strengthening of the dependence between individual variables did not influence the co-occurrence of HDWs in the central CONUS (figures 4 and 6). The region extending from southwest Kansas into northwest Texas was identified as a hotspot (figure 4) where a higher occurrence of HDWs is expected, coinciding with a region of faster wind speeds (Deharpporte 1984, Elliott et al 1986, Klink 1999, Archer and Jacobson 2003. Temperature and wind are respectively the most and the least limiting variables that control the occurrence of HDWs in the central CONUS ( figure 5). HDWs occurred more frequently in 1980 and 2011, with the co-occurrence of drought and a heatwave (Karl et al, 1981, Hoerling et al 2013, NCEI 2020. Despite not finding a significant trend in droughts, Mazdiyasni and Aghakouchak (2015) found a substantial increase in concurrent heatwaves and drought during 1960-2010 in the CONUS. This shift may trigger more HDWs when moderate-speed winds occur simultaneously with compound hot and dry conditions.
Results from this study demonstrate that, although wind speeds at the 90th and 50th percentile thresholds are decreasing in the Great Plains (Pryor et al, 2009), the increase of hot and dry events (figure 8, table 1) can potentially increase the risk of HDWs. Global relative humidity is not significantly changing (Hartmann et al 2013) and weak trends in the CONUS suggest an increase in winter (Brown and Degaetano 2013). The increase in HDW events is more significant in the west, northwest, and the southern Great Plains portions of the study area (figure 8). Although HDW alone is not sufficient to start wildfires, the increasing trend of HDWs may exacerbate wildfire spread and lead to crop damage in these areas. Donovan et al (2017) mentioned an increase in the probability of annual wildfire 33 to 117 during 1985-2014 mainly in Edwards Plateau and Cross Timbers. These areas are just to the east of theas hotspot for HDWs in this study. Given the importance of human ignition, finding an exact relationship between HDWs and wildfire frequency is complicated (Tan et al 2018). In addition, it is complicated to identify the impact of natural or anthropogenic factors (Kirchmeier-Young et al 2017, Tan et al 2018. However, monitoring HDWs and finding hotspots can influence regional planning. In the east and northeast of the study area, the temperature decrease corresponding to the 'warming hole' (Pan et al 2004, Meehl et al 2012, Tavakol et al 2020b, along with a lower occurrence of windy and dry events, drops the risk of HDWs (figure 8).
As the world's population is expected to be exposed to more compound extreme events in the future (Mora et al 2018), this study suggests that traditional empirical methods fall short in communicating the risk of compound HDWs particularly in the central CONUS where higher frequency of HDWs is expected. Copula families with flexible dependence modeling are a better alternative for the analysis of compound climate extremes, since more than one variable is typically responsible for the extreme impacts (Leonard et al 2014). However, the concept of copula has been defined based on the independent and identically distributed random variables with continuous univariate margins, with no expected ties to occur (Nelsen 2007, Hofert et al 2018. Yet, because of the lack of precise measurements and rounding, it is common to observe ties in data in practice. This study acknowledges the influence of ties in distinguishing copula families (Hofert et al 2018).

Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: www.metoffice.gov.uk/hadobs/hadisd/.