The influence of internal climate variability on heatwave frequency trends

Understanding what drives changes in heatwaves is imperative for all systems impacted by extreme heat. We examine short- (13 yr) and long-term (56 yr) heatwave frequency trends in a 21‐member ensemble of a global climate model (Community Earth System Model; CESM), where each member is driven by identical anthropogenic forcings. To estimate changes dominantly due to internal climate variability, trends were calculated in the corresponding pre-industrial control run. We find that short-term trends in heatwave frequency are not robust indicators of long-term change. Additionally, we find that a lack of a long-term trend is possible, although improbable, under historical anthropogenic forcing over many regions. All long-term trends become unprecedented against internal variability when commencing in 2015 or later, and corresponding short-term trends by 2030, while the length of trend required to represent regional long-term changes is dependent on a given realization. Lastly, within ten years of a short-term decline, 95% of regional heatwave frequency trends have reverted to increases. This suggests that observed short-term changes of decreasing heatwave frequency could recover to increasing trends within the next decade. The results of this study are specific to CESM and the ‘business as usual’ scenario, and may differ under other representations of internal variability, or be less striking when a scenario with lower anthropogenic forcing is employed.


Introduction
Heatwaves (prolonged periods of anomalously warm temperatures; Perkins and Alexander 2013) inflict disastrous impacts on human health, infrastructure, and ecosystems (McMichael and Lindgren 2011, Welbergen et al 2008, Coumou and Rahmstorf 2012, Perkins 2015). Since at least 1950, increases in heatwaves have been observed over numerous regions (Della-Marta et al 2007, Perkins et al 2012, Russo et al 2014, Ding et al 2010). These observed trends in heatwaves are predominantly statistically significant (Perkins et al 2012), and anthropogenic climate change is a main contributor (e.g. Stott et al 2004, Christidis et al 2015), with projected future changes consistently indicating increasing trends (e.g. Meehl).
When investigating climate projections, traditional analysis is generally supported by multi-model ensembles such as the global climate model archive of the fifth phase of the Coupled Model Intercomparison Project (CMIP5; Taylor et al 2012). While CMIP5 and similar ensembles provide an estimate of structural and parametric uncertainties surrounding climate projections (Taylor et al 2012), the internal variability of each participating model is almost certainly underrepresented. Numerous studies have demonstrated that even slight perturbations in a model's initial conditions, when all external forcings are constant, can result in very different trend estimates (e.g. Deser et al 2012, Perkins and Fischer 2013, Deser et al 2014) and overall changes in heatwaves (Kay et al 2015, Teng et al 2016). This is important because, given the inherent variability in the climate system, the same principle undoubtedly applies to observations. Therefore, the single set of observed heatwave changes that has been measured is not the only outcome that was possible.
The frequency, duration and intensity of heatwaves vary markedly on interannual and interdecadal scales due to climate variability phenomena (Kenyon and Hegerl 2008, Parker et al 2014, Hoerling et al 2013, Perkins 2015). Thus, different representations of internal variability will likely influence resulting trends. The present study explores the distribution of historical trends of global and regional heatwave frequency when accounting for the influence of internal climate variability. We consider short- and long-term trends to quantify the effect of variability on rates of change over different temporal periods (Marotzke and Forster 2015), as well as whether short-term trends can be indicative of the longer-term signal. In addition to previous studies (e.g. Deser et al 2012, Deser et al 2014, Marotzke and Forster 2015, Kay et al 2015, Teng et al 2016), we examine whether such trends are unprecedented due to the presence of human influence. We utilize observations and a 21-member ensemble of the global climate model CESM (Community Earth System Model; see Fischer et al 2013), and consider regional and grid-box trends.

Data
To measure observed changes in heatwaves, we use the HadGHCND observational record, a 3.75° × 2.5° quasi-global daily dataset of maximum (Tmax) and minimum (Tmin) land temperatures (Caesar et al 2006, Perkins et al 2012). Since HadGHCND is incomplete in space and time, we only use grid boxes that have data for at least 55% of the total period between 1955-2009, and 5% of the total during 2000-2011 (Perkins et al 2012). The overall time period used, 1955-2011, is common between observations and CESM (see below).
We extract daily Tmin and Tmax from version 1.0.4 of the CESM climate model, which includes the Community Atmosphere Model version 4 at 1.875° × 2.5° global resolution (see Gent et al 2011). In addition to a 982 yr control simulation with no external forcing and greenhouse gas concentrations set to pre-industrial levels, this ensemble has 21 members, each driven by identical external forcings. From 1950-2005 all members are forced with historical anthropogenic greenhouse gas and aerosol concentrations, and natural forcings. From 2006-2100 prescribed RCP8.5 forcings are employed. The members differ only in their initial conditions, where on 1 January 1950 random perturbations on the order of 10⁻¹³ are imposed on atmospheric temperature. Despite this minute alteration, a substantial amount of variability is induced across the ensemble, providing an ideal platform for this study. We exclude the first 5 yr of each historical simulation for spin-up, and concatenate the respective RCP8.5 simulation to provide data for 2006-2011, matching the length of observations (herein referred to as 'forced' simulations). Employing a scenario with less anthropogenic greenhouse forcing (e.g. RCP4.5) would likely yield more subtle findings. However, we are limited to RCP8.5 as it is the only future scenario applied to CESM.

Calculating heatwaves
We use the Excess Heat Factor (EHF) heatwave definition (Nairn et al 2009, Perkins et al 2012, Nairn and Fawcett 2013), an operational heatwave index employed by the Australian Bureau of Meteorology. Comparisons of heatwave trends calculated from the EHF and indices based on Tmin and Tmax are detailed in Perkins and Alexander (2013). EHF is based on two excess heat indices, EHI(accl.) and EHI(sig.), which are combined to derive EHF. In these indices, T_i is the average temperature for day i, and T_90i is the calendar-day 90th percentile, calculated from a 15 d window centered on T_i. The average temperature is the average of Tmin and Tmax within a 24 h cycle (9 A.M.-9 A.M.). EHI(accl.) describes the anomaly over a 3 d window against the preceding 30 d, and EHI(sig.) describes the anomaly of the same 3 d window against a climatological extreme threshold and flags particularly warm conditions. For a heatwave to occur, EHF must be positive for at least three consecutive days (i.e. i, i + 1 and i + 2). For observed data and the CESM realizations, a base period of 1961-1990 was used to define T_90i. For the control simulation, a 30 yr base period was selected at random, as there were no detectable differences between T_90i values from 500 randomly selected 30 yr periods. We consider heatwaves over a 5 month summer: May-September in the northern hemisphere and November-March in the southern hemisphere (Perkins et al 2012). The resulting record spans events commencing between 1955-2010 in the observations and realizations, and 981 yr in the control, since we omit the last year in the northern hemisphere to match the timespan in the southern hemisphere. We analyse heatwave frequency using the seasonal total of heatwave days, where a heatwave day is part of at least three consecutive days of positive EHF values. Section S1 in the supplementary material provides a regional evaluation of CESM against HadGHCND observations.
Environ. Res. Lett. 12 (2017) 044005
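As a rough illustration of the EHF calculation described in this section, the following Python sketch computes EHF and flags heatwave days for a single grid box. Several details are assumptions rather than the study's stated implementation: the combination EHF = EHI(sig.) × max(1, EHI(accl.)) follows the commonly published form of the index, the 3 d window is taken as days i, i + 1 and i + 2, and the function and argument names (`excess_heat_factor`, `heatwave_days`, `tmean`, `t90`) are hypothetical.

```python
import numpy as np

def excess_heat_factor(tmean, t90):
    """EHF per day; NaN where the 3 d window or the preceding 30 d are incomplete.

    tmean : 1-D array of daily average temperature, (Tmin + Tmax) / 2
    t90   : 1-D array of the calendar-day 90th percentile, aligned with tmean
    """
    tmean = np.asarray(tmean, dtype=float)
    t90 = np.asarray(t90, dtype=float)
    n = tmean.size
    ehf = np.full(n, np.nan)
    for i in range(30, n - 2):
        three_day = tmean[i:i + 3].mean()               # days i, i+1, i+2 (assumed window)
        ehi_accl = three_day - tmean[i - 30:i].mean()   # anomaly against the preceding 30 d
        ehi_sig = three_day - t90[i]                    # anomaly against the extreme threshold
        ehf[i] = ehi_sig * max(1.0, ehi_accl)           # assumed combination of the two indices
    return ehf

def heatwave_days(ehf, min_length=3):
    """Boolean mask of days that belong to runs of >= min_length positive EHF days."""
    positive = np.nan_to_num(np.asarray(ehf, dtype=float), nan=0.0) > 0
    mask = np.zeros(positive.size, dtype=bool)
    run_start = None
    # A trailing False closes any run that reaches the end of the record.
    for i, flag in enumerate(np.append(positive, False)):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= min_length:
                mask[run_start:i] = True
            run_start = None
    return mask
```

Summing the mask over each extended summer season would then give the seasonal total of heatwave days used as the frequency metric.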

Trend analysis
Heatwave trends were calculated per decade using Sen's Kendall slope estimator, which is robust against outliers and non-normally distributed data, both common characteristics of extremes (Sen 1968, Zhang et al 2005, Caesar et al 2011). Grid box and regional average trends were calculated at the native resolution for HadGHCND and each model realization. Trends for all 21 'Giorgi' regions (Giorgi and Francisco 2000) were originally calculated. However, we discuss only Western North America, Northern Europe, East Asia and Australia, which represent a variety of climates and differing influences of internal variability, while respecting the space constraints of this study. Trends are deemed significant at the 5% level, where the null hypothesis is no detected trend (i.e. a magnitude of 0).
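For readers unfamiliar with the estimator, a minimal sketch of Sen's slope expressed per decade is given below. It takes the median of all pairwise slopes, which is the standard definition (Sen 1968); the function name `sens_slope_per_decade` is hypothetical, and the significance test is not shown.

```python
import numpy as np

def sens_slope_per_decade(years, values):
    """Median of all pairwise slopes (Sen 1968), scaled from per-year to per-decade."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(years.size, k=1)    # every pair of time steps with i < j
    slopes = (values[j] - values[i]) / (years[j] - years[i])
    return 10.0 * np.median(slopes)
```

Because the median ignores the tails of the slope distribution, a single anomalous season has little effect on the estimate. `scipy.stats.theilslopes` provides an equivalent estimator with confidence bounds.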
The bulk of our analysis on forced trends is based on 1955-2010 (56 yr) and 1998-2010 (13 yr); the former is the longest possible period common across all datasets, and the latter covers a similar period to that in which the observed global average temperature trend was smaller than the long-term trend (e.g. Liebmann et al 2010, Trenberth and Fasullo 2013, Kosaka and Xie 2013, Marotzke and Forster 2015). These periods were also selected to analyse the role of internal variability on short- and long-term rates of change under observed external forcings. We first present rank histograms (Hamill 2001, Haughton et al 2014) to determine if CESM can capture the spatial pattern of observed trends. Rank histograms (figure 1) show the position of the observed trend against the 21 ensemble members in descending order.
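A rank histogram of this kind could be assembled as in the sketch below. It assumes that grid boxes without observations have already been removed and that ties are ignored; the function names and array layout are hypothetical.

```python
import numpy as np

def observation_ranks(obs_trend, ens_trends):
    """Rank of the observed trend within the ensemble at each grid box.

    obs_trend  : array of shape (n_boxes,)
    ens_trends : array of shape (n_members, n_boxes)
    Rank 0 means the observation exceeds every member (descending order);
    rank n_members means it lies below every member.
    """
    obs = np.asarray(obs_trend, dtype=float)
    ens = np.asarray(ens_trends, dtype=float)
    return (ens > obs).sum(axis=0)

def rank_histogram(ranks, n_members):
    """Percentage of grid boxes falling in each of the n_members + 1 ranks."""
    counts = np.bincount(ranks, minlength=n_members + 1)
    return 100.0 * counts / counts.sum()
```

For an ensemble that is statistically indistinguishable from the observations, each rank would be expected to hold roughly 100 / (n_members + 1) percent of grid boxes, so inflated outer bins point to an under-dispersive ensemble.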
Figure 1 (caption, partial). The red dotted line indicates the percentage of grid boxes where CESM members are expected to be greater than the observed trends by chance. The ensemble tends to over- or under-estimate some long-term trends, and to over-estimate some short-term trends. However, over 75%-80% of all common areas, the model estimates observed changes in heatwave frequency reasonably well.
To determine whether forced trends are unprecedented against background internal variability, we compare them to all 56 yr and 13 yr trends, respectively, calculated from the control. Regionally, we also investigate the minimum length for which a trend must be calculated to be indicative of the long-term (1955-2010) change in CESM. While previous methods have used trend significance (e.g. Liebmann et al 2010, Lewandowsky et al 2015) or other signal-to-noise analyses (e.g. Santer et al 2011), we adopt an alternative method that analyses trend magnitudes across all available temporal lengths. For each realization, we compute trends of 5 to 56 yr in length, all of which end in 2010. To adequately sample the range of longer-term trends in CESM, trends of 51 to 56 yr length were aggregated across all realizations, resulting in a sample of 105 trends. For each realization we then calculate the latest trend commencement year for which the trend starting in that year, and all trends starting earlier, consistently lie within the 1st and 99th percentiles of the aggregated sample. For example, if the resulting year was 1975, a trend spanning at least 1975-2010 is necessary to provide an adequate and stable estimation of overall long-term changes in heatwave frequency, for the specific realization in question.
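One possible reading of the commencement-year procedure is sketched below. It is an interpretation, not the study's code: the pooled reference sample is taken as the `n_long` longest trends of every member, "consistently" is read as "this trend and every longer one", and all names (`latest_commencement_years`, `n_long`, `min_len`) are hypothetical.

```python
import numpy as np

def sens_slope(values):
    """Sen's slope per decade for an annually spaced series."""
    v = np.asarray(values, dtype=float)
    i, j = np.triu_indices(v.size, k=1)
    return 10.0 * np.median((v[j] - v[i]) / (j - i))

def latest_commencement_years(ensemble, first_year, min_len=5, n_long=6):
    """ensemble: array (n_members, n_years) of seasonal heatwave-day totals.

    Returns, per member, the latest start year such that the trend from that
    year, and from every earlier start year (all ending in the final year),
    lies within the 1st-99th percentile range of the pooled longest trends.
    None is returned if even the full-length trend falls outside that range.
    """
    ens = np.asarray(ensemble, dtype=float)
    n_members, n_years = ens.shape
    lengths = np.arange(n_years, min_len - 1, -1)        # longest trend first
    # trends[m, k]: trend of member m over its final lengths[k] years
    trends = np.array([[sens_slope(row[n_years - length:]) for length in lengths]
                       for row in ens])
    lo, hi = np.percentile(trends[:, :n_long], [1, 99])  # pooled long-term sample
    inside = (trends >= lo) & (trends <= hi)
    result = []
    for m in range(n_members):
        outside = np.flatnonzero(~inside[m])
        # index of the first trend (scanning longest to shortest) outside the range
        k = outside[0] if outside.size else lengths.size
        result.append(None if k == 0 else int(first_year + n_years - lengths[k - 1]))
    return result
```

With 56 yr of data beginning in 1955, a returned value of 1975 would correspond to the worked example in the text: trends need to span at least 1975-2010 for that realization.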
Section S3 in the supplementary material details regional heatwave trends in the CMIP5 ensemble (Taylor et al 2012). While the spread in CMIP5 trends is greater, this is likely due to the larger sampling of model configurations (e.g. physics, resolution, etc). The variability of CESM trends is within that of CMIP5, and centered on a similar median. The overall conclusions of our study are very similar across both ensembles; however, the quantitative results detailed below are specific to CESM, and could differ if another climate model (with an adequate number of realizations) were used.

Results and discussion
Rank histograms of short- and long-term heatwave frequency trends (figure 1) indicate that the ensemble is under-dispersive when compared to the observed spatial trend pattern. Considering long-term trends, the observed trend is larger than the entire CESM ensemble over almost 14% of grid boxes, and smaller over 12% (figure 1(a)). This indicates an underestimation in the range of forced changes by CESM. In figure 1(b), observed trends are smaller than the ensemble over almost 18% of grid boxes, indicating that CESM overestimates short-term changes in heatwave frequency. However, for the majority of grid boxes (the 76% or 82% not affected by an over- or under-estimation) the ranking of observations against CESM is within the model's uncertainty envelope (see supplementary material available at stacks.iop.org/ERL/12/044005/mmedia). This corresponds well to where the observed long- and short-term trends are within the CESM ensemble range (figures 2(a) and (b)), with the exception of parts of Eastern (Central) Asia in figures 2(a) and (b), where the observed trend is higher (lower). While some improvements could be made in the simulation of the entire large-scale spatial pattern of heatwave frequency trends, CESM is appropriate in demonstrating the influence of internal variability on heatwave frequency over most global regions.
Figure 2 (caption, partial). (b) Same as (a) but for short-term trends; (c) 1st percentile of long-term trends from the externally forced 21-member CESM ensemble; (d) same as (c) but for short-term trends. Units of these graphs are days/decade. (e) Percentage of forced long-term trends greater than the control; (f) same as (e) but for short-term trends; (g) percentage of significantly increasing forced long-term trends; (h) same as (g) but for short-term trends.
Spatially, there are clear differences in observed heatwave frequency trends across the two time periods (1955-2010 and 1998-2010). Both the direction and magnitude of change can differ drastically, indicating that shorter-term trends (figure 2(b)) are not indicative of long-term changes (figure 2(a)). It is clear that, for most regions, there is an increase in heatwave frequency over 1955-2010, but this is not always reflected in short-term trends. Moreover, even regional long-term trends can be anomalously small. For instance, a 'warming hole' in heatwave frequency trends is detected over the U.S., although over a different area than previously documented for mean temperature (Pan et al 2004). The absence of pronounced increases is also evident elsewhere in both short- and long-term trends (figures 2(a) and (b)). The cause of the warming hole in seasonal mean temperatures is currently debated, with some studies suggesting this phenomenon exists because of a change in variability (Meehl et al 2015). Our results indicate a similar influence of variability on heatwave trends.
Further, figure 2(c) demonstrates that no or very small trends (±0.5 d decade⁻¹) in heatwave frequency were feasible, although improbable, during 1955-2010 over large regions, as indicated by the ensemble 1st percentile. This suggests that, under recent anthropogenic forcing, internal climate variability could have masked the underlying increasing trend, where the median trend of CESM is 1-4 d decade⁻¹ and the 99th percentile trend is 2-6 d decade⁻¹ (not shown). Similarly, figure 2(d) demonstrates that largely decreasing trends, generally between 5-20 d decade⁻¹, were possible over 1998-2010. Note that the respective trends in figures 2(c) and (d) are not necessarily physically consistent with one another, in that the trend in one region may not occur under the same internal variability conditions as the trend in another. However, our results suggest that internal variability can render short-term declines and longer-term pauses in heatwave frequency physically possible under observed anthropogenic forcing.
For large parts of Africa, the Maritime Continent, Central and North America, the Mediterranean, and Eurasia, all longer-term forced trends are unprecedented compared to pre-industrial conditions (figure 2(e)). Over all other regions, typically 30%-70% of forced trends exceed the range of trends under pre-industrial conditions. Short-term forced trends are less likely to be outside the range of unforced trends (figure 2(f)), though there are instances when a notable percentage of forced trends are (the Tropics and central Russia). For both long- and short-term trends, regions where forced trends are largely outside the range of the unforced trends are also more likely to be significantly positive (figures 2(g) and (h), respectively). Over tropical regions, where internal variability is typically low, trends do not have to be significantly increasing to be outside the range of unforced trends.
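The percentages mapped in figures 2(e) and (f) could be derived for a single grid box along the lines of the sketch below. Two details are assumptions: the control trends are taken from every consecutive (overlapping) window of the given length, and a forced trend counts as unprecedented when it exceeds the largest control trend. The function names are hypothetical.

```python
import numpy as np

def sens_slope(values):
    """Sen's slope per decade for an annually spaced series."""
    v = np.asarray(values, dtype=float)
    i, j = np.triu_indices(v.size, k=1)
    return 10.0 * np.median((v[j] - v[i]) / (j - i))

def control_trends(control_series, length):
    """Trend for every consecutive window of `length` years in the control run."""
    series = np.asarray(control_series, dtype=float)
    starts = range(series.size - length + 1)
    return np.array([sens_slope(series[s:s + length]) for s in starts])

def percent_unprecedented(forced_trends, ctrl_trends):
    """Percentage of forced trends larger than every trend found in the control."""
    forced = np.asarray(forced_trends, dtype=float)
    return 100.0 * np.mean(forced > np.max(ctrl_trends))
```

Calling `control_trends` with lengths of 56 and 13 would give the long- and short-term reference distributions, respectively, against which the 21 forced trends are compared.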
Over the Middle East, southern Russia, western Africa, the tropics and North America (figure 2(e)), forced long-term heatwave trends are very likely (>90% occurrence) increasing faster than what would be expected without anthropogenic influence. For all other regions most of the long-term forced trends are unprecedented, though a small number are indistinguishable from the pre-industrial control. The percentage of unprecedented forced trends is of course smaller for short-term trends (figure 2(f)). However, >50% of short-term trends are unprecedented over some regions (e.g. Central America and central Russia). Figure 3 demonstrates when short- and long-term heatwave frequency trends are consistently unprecedented against pre-industrial conditions. Similar to figure 2(e), most long-term trends commencing in 1955 or later are already unprecedented; however, trends in regions north of 60° N are unprecedented when commencing between 1990-2005 or later (figure 3(a)). Over central Australia, this applies to trends generally commencing between 1985-2015, and over the eastern United States to those commencing between 1975-1990. For short-term trends (figure 3(b)), consistently unprecedented trends appear between 2010-2030 over tropical regions, and 1990-2010 for most other regions. Therefore, as anthropogenic forcing increases, short- and long-term changes in heatwave frequency will be exceptionally more rapid: not only will we experience completely new climates in the coming decades, but we will also reach novel heatwave conditions at unmatched speeds. This result is additional to emergence studies (e.g. Diffenbaugh and Scherer 2011, Hawkins and Sutton 2012, King et al 2015), where new seasonal climate regimes are expected by 2020-2040 over the tropics and 2060-2070 over the mid-latitudes (Diffenbaugh and Scherer 2011).
The stark differences between short- and long-term trends are further evident at the regional level (figure 4). For selected regions (Giorgi and Francisco 2000), the ranges are larger for forced short-term trends (figures 4(a), (d), (g) and (j)) than for long-term trends (figures 4(b), (e), (h) and (k)). Moreover, forced short-term trends display a larger spread than corresponding pre-industrial trends, indicating that anthropogenic influence increases uncertainty in short-term changes in heatwaves, despite a general skewness towards positive trends. The occurrence of unprecedented trends (modest, or little overlap between forced and pre-industrial trends) is more evident than at the grid box level in figure 2, since smaller-scale variability is removed.
Similar to figure 2, it is unlikely that observed short-term trends exceed those expected under pre-industrial conditions, even in cases where the regionally-averaged observed trend is relatively large (e.g. Australia, figure 4(j)). The opposite is true for regional long-term trends (figures 4(b), (e), (h) and (k)), where observed changes are mostly unprecedented. For Western North America (figure 4(b)), East Asia (figure 4(h)) and Australia (figure 4(k)), there is almost no overlap between the forced and pre-industrial distributions, indicating an unprecedented shift towards faster rates of change in regional heatwave frequency.
It is clear that short-term trends are not indicative of the long-term trends, so how long does a heatwave trend need to be in order to be robust? Across all regions there is considerable spread within the ensemble in the latest year a trend can commence and still represent the long-term trend (figures 4(c), (f), (i) and (l)). In all cases, there is at least one realization where the latest starting year occurs before 1960, indicating that trends should be measured over at least 50 yr to be stably representative of the regional long-term change. Over Northern Europe (figure 4(f)) and Australia (figure 4(l)), some realizations estimate long-term representative trends from as late as 1995, where the respective influence of internal variability is likely much smaller. While outside the scope of this study, future work could examine physical reasons and the role of climate modes (e.g. El Niño/Southern Oscillation, Pacific Decadal Oscillation) on the ranges of regional heatwave trends under identical anthropogenic forcing.
It is logical that over larger regions shorter trends may be sufficient to detect a robust change, as variability is averaged out over larger spatial scales. Moreover, regions at lower latitudes could also produce shorter, yet stable trends, since the climate is less variable. Conversely, regions at higher latitudes could require longer trends (i.e. earlier commencement years) to detect a signal, since variability is larger. However, figure 4 indicates that this is not the case, and that the opposite of these expected patterns can occur. The latest commencement year for Australia, [...] figure 4 shows that, within a region, the trend length required is likely longer than the 17 yr minimum for global average temperature (Santer et al 2011, Lewandowsky et al 2015). However, the large spread in CESM suggests that the representation of internal variability is a crucial factor in determining the time required to measure a clear regional signal. This should be carefully considered when declaring observed trends as representative of an overall signal.
Small or decreasing heatwave trends under historical forcing are consistent with the observational record of average temperature (Liebmann et al 2010, Trenberth and Fasullo 2013, Risbey et al 2014, Marotzke and Forster 2015). However, will short-term regional declines in heatwaves last under anthropogenic influence? With the exception of Alaska and Northern Europe, all regions show, on average, an increasing 13 yr heatwave trend 5 yr after either no trend or a declining trend is detected (table 1). Similarly, almost all regions display at least a 50% chance of an increasing heatwave frequency trend 5 yr after a short-term decline commences. These results strengthen 10 yr after a short-term decline, when all regions display increasing trends on average. The chance of an increasing trend 10 yr after a short-term decline is mostly between 85%-95%.
This striking result indicates that short-term periods of no or decreasing heatwave frequency are transitory under anthropogenic forcing.
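The quantities summarized in table 1 could be computed for one region as in the sketch below. This is an interpretation under stated assumptions: a hiatus is any 13 yr window whose Sen's slope is zero or negative, the later trend is the 13 yr trend beginning `lag` years after the start of that window, and results are pooled over all realizations. The function and parameter names are hypothetical.

```python
import numpy as np

def sens_slope(values):
    """Sen's slope per decade for an annually spaced series."""
    v = np.asarray(values, dtype=float)
    i, j = np.triu_indices(v.size, k=1)
    return 10.0 * np.median((v[j] - v[i]) / (j - i))

def trends_after_hiatus(ensemble, window=13, lag=5):
    """Mean later trend and percentage of positive later trends after a hiatus.

    ensemble : array (n_members, n_years) of regional seasonal heatwave-day totals
    """
    ens = np.asarray(ensemble, dtype=float)
    n_years = ens.shape[1]
    later = []
    for row in ens:
        for s in range(n_years - window - lag + 1):
            if sens_slope(row[s:s + window]) <= 0:             # no trend or a decline
                later.append(sens_slope(row[s + lag:s + lag + window]))
    later = np.array(later)
    if later.size == 0:
        return np.nan, np.nan                                   # no hiatus found
    return later.mean(), 100.0 * np.mean(later > 0)
```

Running this with `lag=5` and `lag=10` would give the two pairs of columns described in the table caption.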

Conclusion
This study investigated the effects of internal climate variability on heatwave frequency trends under pre-industrial and forced conditions. It built upon previous research investigating short- and long-term average temperature trends, where internal variability dominates on short timescales and anthropogenic forcing on longer timescales (e.g. Liebmann et al 2010, Meehl et al 2014, Risbey et al 2014, Marotzke and Forster 2015). While there is some evidence that the employed version of CESM underestimates the role of variability on the global scale (figure 1), this study demonstrates, for trends in heatwave frequency: (i) the failure of short-term trends to be robust indicators of longer-term changes; (ii) that small or decreasing short- and long-term trends are possible under historical anthropogenic forcing over most global regions due to internal variability; (iii) where historically-forced long-term CESM trends are unprecedented against background climate variability; (iv) the disparity among ensemble members on the required length of a regional trend to be considered indicative of the long-term signal; and (v) the high likelihood of regional trends to regain an increasing signal within 5-10 yr of a short-term decline commencing.
Despite the uniqueness of CESM in assessing trends over different realizations of internal variability, the quantitative results are specific to this model. Based on other physical representations of internal variability and climate sensitivity, the separation of forced trends from internal variability could occur at different dates in other models (see supplementary material; Hawkins and Sutton 2012). So while this study has demonstrated that anthropogenic influence will override heatwave trends that climate variability alone dictates (figure 3), the timing of such a change will ultimately be model-specific.
In conclusion, this study has demonstrated the considerable effect internal climate variability has on trends of heatwave frequency. It is clear that short-term trends vary in magnitude and direction more than long-term trends, and that short periods of decreasing heatwave frequency are possible under anthropogenic influence. However, anthropogenic influence is forcing heatwave trends, especially over the long term, towards unprecedented rates of increase. The study has found that the actual rate of change and its robustness largely depend on the realization of internal variability of the specific sample, and not just the physical in-built variability of CESM. Lastly, over all global regions, short-term declines are followed by increasing trends within 5-10 yr, suggesting regions that experienced a decrease in heatwave frequency over 1998-2010 will see an increase within the next decade.
Table 1. Average 13 yr trend 5 yr (column 1) and 10 yr (column 3) after a regional 13 yr hiatus in heatwave frequency (rows; for region bounds, see Giorgi and Francisco 2000). Percentages of positive trends 5 yr and 10 yr after the hiatus are in columns 2 and 4, respectively.