Heat wave time of emergence patterns: a matter of definition

Hot extremes, such as heatwaves, have been associated with health, economic, and ecosystem-wide impacts. The timing of emergence of changes in extremes due to anthropogenic climate change is a topic of broad scientific and societal importance. While various studies have estimated the timing and impacts of heatwaves, the role of the heatwave definition in determining the relative time of emergence has not been addressed. We adopt two commonly used definitions of heatwave employed in different reports of the Intergovernmental Panel on Climate Change (IPCC) to evaluate the time at which the frequency of heatwaves becomes detectably different from the historical baseline, using an ensemble of 10 GCMs from the CMIP6 archives forced by the SSP2-4.5 concentration pathway. For a heatwave definition of sustained temperatures more than 5 °C above the historical climatology, time of emergence is earliest in high latitudes over land and is correlated with the signal (amount of warming) and noise (variability). In contrast, for a heatwave definition of sustained temperatures exceeding the 90th percentile of historical climatology, time of emergence is earliest in low latitude regions and is correlated with the signal to noise ratio. This work underscores the importance of metric choice in estimating the timing of new climate regimes, and shows that metric selection for informing adaptation timing should be tailored to the regional context.


Introduction
While global climate change measured through mean warming thresholds is often the target of international climate action, the impacts of climate change largely manifest locally through regional extreme climate and weather events. The observed annual mean climate change signal is now greater than the noise (i.e. the standard deviation of detrended annual mean values) over most land areas (Hawkins et al 2020), and is detectable even at the local and daily scale (Sippel et al 2020). Extreme heat events in particular, or heatwaves, are an annual cause of weather-related deaths (e.g. Luber and McGeehin 2008, Mitchell et al 2016, Mora et al 2017), have been shown to reduce labor productivity (Dunne et al 2013), and impact cornerstone ecosystems globally (Seddon et al 2016, Vinagre et al 2018, Stillman 2019, Breshears et al 2021). Multiple studies suggest that as global temperatures continue to rise, we can anticipate a higher frequency of heatwaves and record-shattering heat events (e.g. Diffenbaugh and Scherer 2011, Fischer et al 2021). Heatwaves are expected to double in frequency with a 2 °C rise in global temperatures, exposing 37% of humanity to severe heatwaves once every five years (Dosio et al 2018). Determining when and where these extremes are detectably different from historical climatology is of public interest, especially as extreme heat records are expected to be continuously shattered as climate change continues (Meehl and Tebaldi 2004, Russo et al 2014, 2015, 2016). Identification of the difference from historical climatology can also allow for more targeted and timely adaptation strategies to mitigate the impacts of these heat events (Wang et al 2018).
While studies have assessed heatwaves for various intensities and on various spatial and temporal scales (Coumou et al 2013, King et al 2015, Harrington et al 2016, Im et al 2017, Perkins-Kirkpatrick and Gibson 2017), few have evaluated the time of heatwave emergence and discussed definitional aspects. The time of emergence (ToE) of an anthropogenic climate signal is typically defined as the time at which the magnitude of a climate signal becomes detectable amidst the background natural variability, but there are multiple ways this can be defined for heatwaves, as we show below. Seminal papers evaluated the spatial distribution of ToE for annual or summer mean temperature change, and found early emergence in low latitudes due to the low interannual variability of these regions (Mahlstein et al 2011, Hawkins and Sutton 2012). ToE studies on extreme heat have assessed summer temperatures and extreme daily temperature emergence (Harrington et al 2016, Im et al 2017, King and Karoly 2017, King et al 2017, Lopez et al 2018). These studies all show increased frequencies of temperature extremes, with early emergence of daily temperature extremes in low latitude countries, similar to the findings for summer mean temperatures in Mahlstein et al (2011). Single day temperature extremes, however, do not have the same impact as the sustained temperature extremes that are observed in heatwaves. Studies have estimated that a 33% increase in the number of summer heatwave days in India can lead to a 78% increase in heat-related mortality (Mazdiyasni et al 2017) and that the frequency and severity of summers with sustained temperature extremes was tied to a 5-10 fold increase in heat-related mortality rate (Lüthi et al 2022). While a heatwave is generally defined as a series of consecutive hot days, there is no universal definition for the required number of days or temperature thresholds in order for a heatwave to be defined. A range of health impacts are related to extreme heat, but studies focus on a variety of different extreme heat metrics without operating under a single definition (Basu 2009). This range of definitions makes it difficult to relate the expected ToE of heat extremes to the expected ToE of heat-related impacts, which risks misinforming adaptation priorities. Thus, the sensitivity of ToE estimates to heatwave definition is the focus of the present work.
In this analysis we adopt two definitions used by the Intergovernmental Panel on Climate Change (IPCC) and compare their relative ToEs. The first definition was employed in the 2007 IPCC report, which defined a heatwave as 'at least five consecutive days with maximum temperature at least 5 °C higher than the climatology of the same calendar day' (Meehl et al 2007). The 2019 IPCC special report only qualitatively defines heatwaves as 'a period of abnormally and uncomfortably hot weather', though it defines extreme weather as an event that 'would normally be as rare as or rarer than the 10th or 90th percentile of a probability density function estimated from observations' (IPCC 2018). Taken together, we interpret the IPCC 2019 heatwave definition to be daily temperatures exceeding the 90th percentile of historical climatology, and we consider the same duration of sustained extreme heat as the first definition (i.e. 5 days above the 90th percentile). We henceforth refer to these two definitions as the '5 °C anomaly' and '90th percentile' heatwaves, respectively. Using daily temperatures from a multi-model ensemble of general circulation models (GCMs), we illustrate the extent to which the definition of heatwave drives the patterns of emergence. In addition, we evaluate how these patterns correlate with the mean temperature signal, noise, and signal to noise ratio. This work underscores the importance of metric definition in evaluating the relative times of emergence of the anthropogenic signal.

Data
We analyze daily surface air temperature (TAS) from 10 General Circulation Models (GCMs) from the Coupled Model Intercomparison Project, Phase 6 (CMIP6) archives (Eyring et al 2016). The GCMs are all forced by the Shared Socioeconomic Pathway (SSP) 2-4.5 scenario, with daily output provided from 1850-2100. The data are regridded through spatial interpolation to a 5° × 5° resolution, as done in Hawkins and Sutton (2012). We use the first available simulation from each of the GCMs provided in the archive. The GCM simulations used in this analysis are listed in the supplementary materials.

Defining heatwave frequencies
For each GCM simulation, we define a heatwave relative to historical climatology, which we define to be the 1901-1950 average temperature and distribution for each calendar day. We consider only summer months in this analysis, defined in each grid cell as the three consecutive months of the year with the warmest average temperature. A heatwave is identified if the specific threshold is exceeded for at least 5 consecutive days. If the threshold is exceeded for more than 5 days, we round down to the lowest multiple of 5. For example, a heatwave that lasts 5-9 days would be counted as one five-day heatwave, 10-14 days as two five-day heatwaves, etc.
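The counting rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function name `count_heatwaves` and its boolean-sequence input are assumptions made here for clarity.

```python
def count_heatwaves(exceeds, min_len=5):
    """Count five-day heatwaves in one summer of daily exceedance flags.

    `exceeds` is a sequence of booleans (True where the daily temperature
    is above the heatwave threshold). A run of n consecutive True days
    contributes n // min_len heatwaves, so 5-9 days count once and
    10-14 days count twice. (Hypothetical helper, for illustration only.)
    """
    total = 0
    run = 0  # length of the current run of consecutive exceedance days
    for flag in exceeds:
        if flag:
            run += 1
        else:
            total += run // min_len
            run = 0
    # Close a run that extends to the last day of the season.
    return total + run // min_len


# A 5-day event, a break, then a 12-day event (counted as two heatwaves).
season = [True] * 5 + [False] + [True] * 12
n_heatwaves = count_heatwaves(season)
```

Applying this to each summer of a simulation would give the annual heatwave frequency series used in the emergence test.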

Heatwave thresholds
For the first definition, which we term 5 °C anomaly, the temperature must exceed 5 °C above daily historical climatology. For the second definition, termed 90th percentile, the temperature must exceed the 90th percentile of the daily historical climatology.
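As a rough illustration of how the two thresholds differ, the sketch below computes both for a single calendar day at one grid cell. It assumes that the daily climatology pools all reference-period values within a 5-day window centered on the day, and that the window is truncated at the edges of the season; both choices, along with the name `daily_thresholds` and the `hist` layout, are assumptions of this sketch rather than details confirmed by the text.

```python
import statistics


def daily_thresholds(hist, day, half_window=2, anomaly=5.0):
    """Return (anomaly_threshold, percentile_threshold) in °C for one day.

    `hist` is a list of reference-period years, each a list of daily
    summer temperatures (°C). Values from `day - half_window` to
    `day + half_window` are pooled across all years (assumed pooling).
    """
    n_days = len(hist[0])
    lo = max(0, day - half_window)
    hi = min(n_days, day + half_window + 1)
    pooled = [t for year in hist for t in year[lo:hi]]
    clim_mean = statistics.fmean(pooled)
    # quantiles(n=10) returns the 9 decile cut points; index 8 is the 90th percentile.
    p90 = statistics.quantiles(pooled, n=10, method="inclusive")[8]
    return clim_mean + anomaly, p90


# Toy reference period: 11 "years" of constant summers at 0, 1, ..., 10 °C.
toy_hist = [[float(y)] * 7 for y in range(11)]
anom_thr, p90_thr = daily_thresholds(toy_hist, day=3)
```

In this toy case the 5 °C anomaly threshold sits above the 90th percentile. Where day-to-day variability is large, the ordering can reverse, which is the situation described for the Quebec grid point in the results.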
Emergence can be thought of as a signal to noise problem. In the signal and noise analysis presented below, the signal is defined as the difference between mean summer temperatures in degrees Celsius from 2070-2100 and the mean summer temperatures in degrees Celsius during the historical period 1901-1950. The noise is defined as the standard deviation of summer temperature anomalies during the historical period. The signal to noise ratio is the multi-model mean of the signal divided by noise values.
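A minimal sketch of these three quantities for one grid cell is given below. The function names are hypothetical, and the sketch assumes the sample standard deviation and a per-model ratio that is then averaged across models; the text does not specify either choice.

```python
import statistics


def signal_noise(hist_temps, future_temps):
    """Return (signal, noise, signal/noise) for one model and grid cell.

    `hist_temps` holds summer temperatures (°C) from the historical
    period and `future_temps` those from the end-of-century period.
    Noise is the sample standard deviation of historical anomalies
    (assumed here; a population estimator would differ slightly).
    """
    hist_mean = statistics.fmean(hist_temps)
    signal = statistics.fmean(future_temps) - hist_mean
    anomalies = [t - hist_mean for t in hist_temps]
    noise = statistics.stdev(anomalies)
    return signal, noise, signal / noise


def ensemble_signal_to_noise(per_model):
    """Multi-model mean of per-model S/N; `per_model` is a list of
    (hist_temps, future_temps) pairs, one per GCM."""
    return statistics.fmean(signal_noise(h, f)[2] for h, f in per_model)


sig, noi, ratio = signal_noise([20.0, 22.0, 24.0], [25.0, 27.0])
```

The same 4 °C signal gives a larger ratio where the historical spread is small, which is why low-variability regions can emerge early under a distribution-based definition.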

Statistical tests
We perform the following test for each GCM independently. For each heatwave definition (5 °C anomaly and 90th percentile) we calculate the heatwave frequency (in number of summer heatwaves/year) during the climatological reference period (1901-1950) and a 30-year moving window starting in 1926-1955. For the climatological reference period, we use a 5-day window centered on each reference day. We use the two-sample Kolmogorov-Smirnov (KS) test, similar to Mahlstein et al (2011), where the null hypothesis is that the two samples come from the same distribution. The KS test was selected because it does not assume that the data are normally distributed, an assumption that may not hold for the frequency of extreme events. We then shift our comparison window ahead 5 years at a time, and perform the KS test on 1931-1960 heatwave frequencies relative to the reference period, and so on. ToE is defined as the midpoint of the first 30-year window for which the KS test rejects the null hypothesis at the 95% confidence level.
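The moving-window procedure can be sketched as follows. To keep the example free of third-party dependencies, the KS p-value is computed with a standard asymptotic approximation written out by hand; this is an assumption of the sketch and not necessarily what the authors used (a library routine such as `scipy.stats.ks_2samp` would be the more usual choice). Heatwave counts are discrete and heavily tied, so the approximation is only indicative. All function names are hypothetical.

```python
import math


def ks_2samp_asymptotic(a, b):
    """Two-sample KS statistic D and an approximate (asymptotic) p-value."""
    a, b = sorted(a), sorted(b)
    n1, n2 = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        x = min(a[i], b[j])
        while i < n1 and a[i] <= x:  # step both empirical CDFs past x (handles ties)
            i += 1
        while j < n2 and b[j] <= x:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    en = math.sqrt(n1 * n2 / (n1 + n2))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:  # the series below is effectively 1 here and converges slowly
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


def time_of_emergence(freq_by_year, ref=(1901, 1950), first_start=1926,
                      window=30, step=5, last_year=2100, alpha=0.05):
    """Midpoint year of the first window whose heatwave frequencies differ
    from the reference period, or None if no window rejects by `last_year`.

    `freq_by_year` maps year -> number of summer heatwaves in that year.
    """
    reference = [freq_by_year[y] for y in range(ref[0], ref[1] + 1)]
    start = first_start
    while start + window - 1 <= last_year:
        sample = [freq_by_year[y] for y in range(start, start + window)]
        _, p_value = ks_2samp_asymptotic(reference, sample)
        if p_value < alpha:
            return start + (window - 1) // 2  # 1926-1955 -> 1940
        start += step
    return None


# Toy series: no heatwaves before 2000, three per summer afterwards.
toy_freq = {y: (3 if y >= 2000 else 0) for y in range(1901, 2101)}
toy_toe = time_of_emergence(toy_freq)
```

With these window settings the earliest and latest possible midpoints are 1940 and 2085. In the toy series the midpoint precedes the step change itself, a reminder that ToE here labels a 30-year window rather than a single year.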
To aggregate ToEs across the GCM ensemble, we define the ensemble ToE for a grid cell as the first year when at least 80% of GCMs detect emergence. If this is not achieved by 2100, then ToE is not defined.
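One way to express this aggregation rule is sketched below. The helper name `ensemble_toe` is hypothetical, and the sketch assumes that a model counts as having emerged from its own ToE year onward.

```python
import math


def ensemble_toe(model_toes, fraction=0.8):
    """First year by which at least `fraction` of models have emerged.

    `model_toes` lists one ToE year per GCM, with None where a model shows
    no emergence by the end of the simulation. Returns None when too few
    models emerge.
    """
    # round() guards against floating-point noise before taking the ceiling
    needed = math.ceil(round(fraction * len(model_toes), 9))
    emerged = sorted(t for t in model_toes if t is not None)
    if len(emerged) < needed:
        return None
    return emerged[needed - 1]


# Ten models: eight emerge, so the ensemble ToE is the eighth-earliest year.
toes = [1940, 1945, 1950, 1955, 1960, 1965, 1970, 1975, None, None]
cell_toe = ensemble_toe(toes)
```

Under this rule the ensemble ToE is set by the slower models, so one or two late or non-emerging models do not prevent a grid cell from being assigned a value.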
Correlations for the signal, noise, and signal to noise ratio are calculated using the Spearman rank correlation coefficient. This test evaluates the correlation between the rankings of two variables, and is thus less influenced by extreme values than the Pearson correlation coefficient.
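For reference, the Spearman coefficient is the Pearson correlation of the ranks, with tied values given their average rank. The pure-Python sketch below is illustrative only; the function names and the toy numbers are not from the study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy example: grid cells with larger noise emerge earlier (negative correlation).
toy_noise = [0.5, 1.0, 2.0, 4.0]
toy_toe_years = [2085, 2060, 2020, 1990]
rho = spearman(toy_noise, toy_toe_years)
```

Because only ranks enter the calculation, a single grid cell with an unusually large value shifts the coefficient no more than any other cell with the same rank.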

Results
Multi-model time of emergence
Figure 1 maps the multi-model ToE of summer heatwaves for the two definitions considered here. For the time series and methods considered here, the earliest possible ToE was 1940 and the latest 2085. There are marked differences in ToE patterns across the definitions. The 5 °C anomaly definition shows earliest ToE for high latitude land regions, later or no emergence for lower latitude land regions, and virtually no emergence by
The roles of the signal, noise, and signal to noise (S/N) ratio can be used to explain this difference in patterns. We find across all GCMs that the noise is significantly negatively correlated with the ToE of 5 °C anomaly heatwaves (see figures 1(a) and (c) and table 1), followed closely by the signal (see figure 2). For the ToE of 90th percentile heatwaves, we find the S/N to exhibit the highest correlation (see figures 1(b) and (d) and table 1), with the pattern being significantly correlated across all GCMs. This finding is further illustrated in figure 2.

Time of emergence at individual grid cells
Figure 2 provides the time series of the 5-day running minimum of daily mean summer temperature anomalies, along with the two approximate time of emergence thresholds, and their corresponding ToEs for five representative grid points from the ACCESS-CM2 GCM. All grid points show emergence for the 90th percentile ToE. The grid point in Canada (49° N, 77° W; Quebec) is the only selected location where the ToE of the 5 °C anomaly heatwave emerges prior to the 90th percentile ToE, though only marginally earlier (5 years). This is due to the 90th percentile resting above the 5 °C anomaly threshold in this location, a reflection of the substantial climatological noise in unperturbed summer temperatures. In contrast, the 90th percentile ToE leads the 5 °C anomaly ToE by 15 years in Southern Africa, and by more than 70 years in the Indian Peninsula.
By definition, the 90th percentile threshold has historically been exceeded at some time by all grid cells, globally. Therefore, a consistent positive change in the mean will eventually result in increased frequency of exceeding the heatwave threshold, which is shown to occur nearly globally by 2100 for the SSP2-4.5 scenario considered here. In contrast, only a fraction of grid cells show emergence for the 5 °C anomaly definition, including the three grid points over land shown in figure 2. For the two grid points over the ocean, the 5 °C threshold is never exceeded, reflecting the ocean's damping effect both on temperature extremes and on the overall signal. The grid cells over the ocean illustrate how historically this type of heatwave did not occur, with the 5 °C threshold resting substantially higher than historical extremes. If the noise remains relatively constant around the mean, then a substantial signal would be required to exceed the heatwave threshold, thus explaining the lack of emergence over the ocean in low noise, low signal regions. In regions with higher noise (e.g. Quebec/Canada and India), by contrast, 5 °C has historically been exceeded, and therefore a positive signal results in such exceedances becoming more frequent. Similarly, a region with a high enough signal can result in 5 °C anomalies being either newly observed or much more frequent.
Table 1. Spearman rank correlation coefficients for both heatwave definitions with the signal, noise, and signal to noise ratio. The reported value is the ensemble mean correlation coefficient, with the ensemble range of correlation coefficients included in parentheses.

Discussion and conclusion
This analysis shows that, according to GCMs, an increase in 90th percentile heatwave frequency has already emerged nearly globally, and that it has already emerged or will soon emerge over most land areas for both definitions considered here. However, the selected heat wave definition dictates where emergence is first observed. Heatwaves defined by the 90th percentile emerge especially early throughout much of the tropics. Similar to the ToE of mean summer temperatures from Mahlstein et al (2011), the signal to noise ratio appears as the most significant driver of 90th percentile heatwave emergence. These findings are in line with earlier studies suggesting that the frequency of new extremes is determined by the signal to noise ratio (Rahmstorf and Coumou 2011), and which identify the tropics as being the hotspot of heatwave emergence (IPCC 2018). In contrast, heatwaves defined by the 5 °C anomaly emerge soonest in high latitudes, do not emerge over much of the ocean, and are most strongly correlated with the noise, while also being highly correlated with the signal. The land-ocean warming contrast as well as polar amplification mechanisms can explain these spatial patterns of emergence.
With both definitions, noise is correlated with the time of emergence, but in different ways. For the 5 °C anomaly definition, the noise is negatively correlated, meaning more noise will lead to an earlier time of emergence. For the 90th percentile definition, the noise is positively correlated, meaning that more noise will lead to a later time of emergence. These correlations are reflected in the differences in patterns between the two definitions. Noise is highest in the high latitudes, and the 5 °C definition emerges first in those latitudes while the 90th percentile definition emerges later in those latitudes. While the noise plays a role in both definitions, the impact varies based on what is meant by a heatwave.
This work emphasizes the importance of metric definitions in evaluating the emergence of climate extremes. Defining extremes based on the magnitude of the anomaly will identify emerging hotspots in regions with both larger signals (which lead to high magnitudes of change) and larger noise (where historical anomalies already have a large magnitude, by definition). In contrast, definitions based on historical distributions will favor emergence in regions with low variability, where only a small shift in signal can lead to a statistically different distribution. Here, we have demonstrated that the definition of a heatwave significantly affects analysis of the timing of potential future impacts. Expanding the analysis to other heatwave definitions found in the scientific literature could illuminate different patterns of change. Using the ToE of heatwaves to inform the timing of adaptation strategies thus requires careful consideration of which metric is most relevant for the context and adaptation needs. For example, adaptation efforts targeted at mitigating deadly heatwaves may consider definitions that have been linked to human mortality, such as nighttime temperatures, which are thought to increase mortality due to the lack of relief from daytime heat (Kim et al 2023), or a definition that accounts for both temperature and humidity, which has been found to be a stronger indicator of heat stress than temperatures alone (Mitchell et al 2016, Schwingshackl et al 2021). This analysis does not consider the difference in heat stress between urban and rural areas (Fischer et al 2012) or how vulnerability, adaptation options, and acclimatization of the population can lessen heatwave impacts (Wang et al 2018).
Further, this study gives context to IPCC reports, some of which emphasized early emergence of heatwaves at low latitudes, while others identified high latitudes. Yet in July 2023, temperature records were shattered in broad regions at middle latitudes across the United States, Southern Europe, and China (Zachariah et al 2023), requiring a careful look at definitions and their interpretation. This work suggests that considering only one definition of heatwave, defined by percentiles, may underrepresent the widespread potential of perceptible heatwaves.

Figure 1. Time of emergence (ToE) of summer heatwave frequency. (a) ToE of 5 °C anomaly summer heatwave frequency, where ToE is defined when at least 80% of GCMs indicate emergence. (b) As in (a) but for the heatwave threshold above the 90th percentile of historical climatology. Summer is defined as the three hottest consecutive monthly averages in the climatological period, from 1901-1950. The color bar indicates the midpoint of the earliest thirty-year window where the frequency of these heatwaves is statistically different from historical frequency using the two-sample Kolmogorov-Smirnov statistical test. White indicates that ToE has not occurred for at least 80% of GCMs by the end of the simulation period (2100). (c) Noise, equal to the multi-model mean standard deviation of daily temperature anomalies in the climatological period. Units are in °C. (d) Signal to noise ratio, equal to the multi-model mean of the signal (computed as the 2071-2100 mean summer temperature relative to the climatological mean) divided by the noise. All GCMs are forced by the SSP2-4.5 emissions scenario.

Figure 2. The signal to noise ratio and its role in governing the Time of Emergence (ToE) of heatwaves. Maps illustrate the multi-model mean of the signal, computed as the mean of daily summer temperature values in 2071-2100 relative to historical values from 1901-1950. The time series plots illustrate the five-day running minimum of daily summer anomalies relative to historical values (black) for five select locations for the ACCESS-CM2 model, along with the summer mean (grey line). The blue band illustrates the noise of daily temperature anomalies, the dashed red line is the summer average historical 90th percentile of daily temperature anomalies, and the solid red line is the 5 °C temperature anomaly threshold. The dashed blue line shows the 90th percentile ToE and the solid blue line shows the 5 °C anomaly ToE if it is defined by 2100. All GCMs are forced by the SSP2-4.5 emissions scenario.