Rethinking extreme heat in a cool climate: a New Zealand case study

New Zealand is one of many higher latitude countries where extreme heat is perceived to be a less consequential impact of climate change, by virtue of its relatively cool climate. Consequently, metrics to quantify the impacts of extreme heat in New Zealand have not kept pace with wider improvements in heatwave definitions. This study evaluates different methods to quantify extreme heat in New Zealand, with a view to improving the knowledge base underpinning future climate change risk assessments. Specifically, this analysis (1) reveals which of New Zealand’s purportedly hottest years in the satellite era are robust to different definitions of extreme heat; (2) introduces a new method of quantifying extreme heat which is applicable across different regions, and serves equally well whether an analysis is contextualised relative to the past (attribution) or for the future (projections); (3) detects previously unidentified heatwaves over recent decades; (4) identifies locally significant increases in extreme heat and the potential lengthening of summer months after only 0.5 °C of global warming; and (5) discusses further research priorities to better understand the impacts of extreme heat in New Zealand over the coming decades.


Introduction
The impacts which accompany periods of extreme heat are manifold (Perkins and Alexander 2013), and such extremes will become more frequent, intense and longer lasting as global temperatures increase (Perkins-Kirkpatrick and Gibson 2017). Some of the impacts of extreme heat will only affect those regions where temperatures are already very hot (Dunne et al 2013, Coffel and Horton 2014, Pal and Eltahir 2016, Mora et al 2017). However, many studies also demonstrate that significant health (Robine et al 2008, Christidis et al 2010, Honda et al 2014, Gasparrini et al 2015, 2017, Tobías et al 2017) and economic (Nguyen et al 2011, Mcevoy et al 2012, Zander et al 2015, Zuo et al 2015, Xia et al 2018, Steffen et al 2019) impacts can accompany extreme heatwaves, particularly when 'extreme' is defined relative to the climatological characteristics of a specific location (Sillmann et al, Perkins and Alexander 2013, Cowan et al 2014, Russo et al 2015, Perkins and Gibson 2015). Of particular note, several case studies have identified excess deaths associated with temperatures which were low in absolute terms, but anomalously hot relative to their typically cool climates: such examples include England and Wales (Christidis et al 2010), the Netherlands (Folkerts et al 2020), southern Finland (Donaldson et al 2003), Stockholm, Sweden (Åström et al 2016) and Toronto, Canada (Gasparrini et al 2015).
These studies collectively suggest that detrimental impacts of more frequently recurring periods of 'extreme' heat can be found across many populated regions of the world (Perkins-Kirkpatrick and Gibson 2017), if the metrics chosen to characterise such anomalies are carefully calibrated according to the temperature regimes which communities are historically familiar with (Frame et al 2017), and therefore adapted to (Ballester et al 2011, Nairn and Fawcett 2015, Harrington et al 2017, Hawkins et al 2020). However, the extent to which heatwaves have been identified as problematic in the context of climate change varies significantly from country to country. Such differences can be partly attributed to the past experience of a particularly extreme event, such as the European heatwave of 2003 (Robine et al 2008), the 2009 heatwaves in Victoria, Australia (Coates et al 2014), or the 2010 heatwave in Ahmedabad, India (Azhar et al 2014), which can then catalyse the implementation of early warning systems and other policy interventions (Fouillet et al 2008, Nicholls et al 2008, Council of Australian Governments et al 2011, Hess et al 2018).
New Zealand, meanwhile, is representative of a country where the risks associated with extreme heat have been the subject of much less research. In 2018, the New Zealand Ministry of Health released its first guidance about the potential development of early warning systems related to extreme heat in New Zealand (Ministry of Health 2018). Two conclusions emerged: first, that New Zealand still has no formal definition of a heatwave which is regularly monitored; and second, that there is insufficient evidence to develop triggers for the activation of future heat action plans, like those available in Australia (Nicholls et al 2008).
The freely available, gridded ERA5 product was chosen for three reasons. First, open access data encourages reproducibility. Second, because results using a gridded product are more easily applicable to climate models (Black et al 2016) than station-based observations (Avila et al 2015), any heat-related metrics defined here will be more compatible with future scenario analyses, satisfying criterion (2) above. Third, recent analyses of surface energy partitioning and near-surface meteorological variables in ERA5 found significant improvements over the previous-generation ERA-Interim product.

Identifying anomalous hot years in New Zealand using current definitions
While New Zealand has yet to adopt any formal definition of a heatwave (Ministry of Health 2018), there has been ample use of a 'hot day' metric within the local research community, including as a proxy for heat-related impacts in climate change risk assessments commissioned by regional decision makers (Ministry for the Environment 2018).
The National Institute of Water and Atmospheric Research (NIWA) defines a 'hot day' in New Zealand as any day when the daily maximum temperature (Tx) exceeds 25 °C: this definition purportedly stems from the claim that 'beef and dairy cattle tend to start experiencing heat stress at temperatures above this threshold' (https://bit.ly/2WkiSx3), rather than bearing any relationship to a broader range of impacts on human health, infrastructure or otherwise. While the notion that daily maximum temperatures exceeding 25 °C can result in the onset of stress in dairy cows is itself questionable (Perez-Mendez et al 2019), particularly when such a metric does not include humidity considerations (Bryant et al 2007, Dunn et al 2014, Polsky and Keyserlingk 2017), I nevertheless focus on this widely adopted approach first. Figure 1 presents the number of days where Tx exceeds 25 °C, for five selected hot years since 1979 (noting that a year is hereafter considered from July 1 to June 30 to capture the austral summer season, and 'hot year' should be interpreted only with respect to the metric in question, rather than annual mean temperatures). 1998/99 corresponded with an extreme La Niña event (Zhu et al 2014, Cai et al 2015), which is purported to be linked with anomalously high sea level pressure over the New Zealand domain (Tait et al 2006, NIWA 2018). Meanwhile, the summer of 2012/13 was widely considered to be New Zealand's worst drought on record (Harrington et al 2016), though several of these years were associated with robust drought impacts. When interrogating figure 1 further, we find that some regions of the South Island's West Coast did not experience any days with Tx above 25 °C. Meanwhile, large parts of the country experienced more than two months above the hot day threshold, including many rural areas where dairy farming is prevalent (Dairy 2018).
These results therefore suggest two possible outcomes: (1) if the hot day metric does characterise exposure to heat stress among beef and dairy cattle, then these five extreme years within only two decades must have led to devastating consequences across the relevant industries; or (2) the metric is failing as a proxy for livestock susceptibility to heat-related impacts, possibly because of farm-level adaptation to recurrent experiences. While outside the scope of this analysis, future research is clearly needed to interrogate the sector-specific impacts of these anomalous years, and confirm which of these two hypotheses is actually supported by observations.
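The hot day count described in this section can be sketched in a few lines. This is a minimal illustration rather than the processing chain used for figure 1: the function names and the sample temperatures are hypothetical, and only the 25 °C threshold and the July-to-June year convention come from the text.

```python
from collections import Counter
from datetime import date

HOT_DAY_THRESHOLD_C = 25.0  # 'hot day' threshold quoted in the text


def season_label(day: date) -> str:
    """Label a date by its July-to-June year, e.g. 2018-01-15 -> '2017/18'."""
    start = day.year if day.month >= 7 else day.year - 1
    return f"{start}/{str(start + 1)[-2:]}"


def hot_days_per_season(daily_tx):
    """Count days with Tx strictly above the threshold in each July-June year.

    daily_tx is an iterable of (date, tx_celsius) pairs for one location.
    """
    counts = Counter()
    for day, tx in daily_tx:
        if tx > HOT_DAY_THRESHOLD_C:
            counts[season_label(day)] += 1
    return dict(counts)


# Made-up illustrative values, not ERA5 data.
sample = [
    (date(2017, 12, 30), 26.1),
    (date(2018, 1, 15), 27.4),
    (date(2018, 6, 30), 25.0),  # equal to the threshold, so not counted
    (date(2018, 7, 1), 25.5),   # first day of the 2018/19 year
]
print(hot_days_per_season(sample))
```

Applying the same count to every grid cell of a gridded product would give a map of hot days per year of the kind the text describes.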

The value of normalised metrics of extreme heat
A recent development within the climate extremes research community has been the recognition that the most effective heatwave metrics are not those which specify absolute temperature thresholds or an absolute anomaly criterion (Frich et al 2002), but instead those which are normalised with respect to the climatology of a given region (Perkins and Alexander 2013, Nairn and Fawcett 2015, Russo et al 2015, 2016, Perkins 2015, Fischer and Knutti 2015), and for a given time of year (Perkins and Gibson 2015). Among the 27 metrics recommended by the World Meteorological Organisation's Expert Team on Climate Change Detection Indices (ETCCDI) (Alexander et al 2006, WMO 2009
When employing this normalised perspective, the inadequacy of using a singular temperature threshold to characterise all impacts of extreme heat across the widely varying local climates of New Zealand becomes clear. Figure 2 presents the full range of local climates experienced in New Zealand, whereby Tx data is separately aggregated for each grid cell, and across all days within each calendar month for the 30-year period spanning 1981-2010 (this baseline is used hereafter for all percentile-based and normalised temperature analyses). When comparing the 25 °C isotherm against the corresponding 90th percentile threshold, we find that large parts of the north of the country experience temperatures above 25 °C far too often to be characterised as 'extreme', while many regions of the South Island could experience very unfamiliar temperatures without exceeding this threshold. Given these differences between the Tx90p and 25 °C-exceedance approaches to characterising hot days, it is worth re-examining New Zealand's five 'hot' years (figure 1) from a normalised perspective.

Figure 2.
Comparing absolute and relative thresholds for extreme heat. For each land point in New Zealand, daily maximum temperatures for each calendar month were extracted across the period 1981-2010, and then the daily data were ranked from low to high, separately for each individual month. These percentiles of daily maximum temperature data for an individual month are presented along the y-axis, while the x-axis shows all combinations of land-based grid cells and months in New Zealand, also sorted from low to high. The solid black horizontal line shows the 90th percentile of daily maximum temperatures, following the ETCCDI definition of a hot day. The blue lines highlight all 'land-months' above the 25 °C isotherm.

months of anomalously high temperatures (figure 3(b)), while the severe 2012/13 drought aligned with between 60 and 90 days of relatively hot temperatures during the year, consistently across the country.
Interrogating these 5 years under a Tx90p framework also reveals that the frequent exceedances of 25 °C in 1997/98, as seen in figure 1(a), occurred mostly during the hottest months of the summer where such temperatures were common in the respective regions anyway. By contrast, figures 3(d) and (e) suggest unusual warm spells in 2017/18 and 2018/19 occurred throughout the year, possibly inducing more pronounced heat-related impacts as a consequence (Perkins and Gibson 2015).
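The contrast between a fixed 25 °C threshold and a percentile threshold defined separately for each calendar month can be illustrated with a short sketch. The baseline values below are synthetic, the function names are hypothetical, and the 'inclusive' quantile interpolation is an assumption made for the example, not necessarily the estimator used in this study.

```python
from collections import defaultdict
from statistics import quantiles


def monthly_p90(baseline):
    """Return {month: 90th percentile of Tx} from (month, tx) baseline pairs."""
    by_month = defaultdict(list)
    for month, tx in baseline:
        by_month[month].append(tx)
    # quantiles(..., n=10) returns nine cut points; the last is the 90th percentile.
    return {m: quantiles(v, n=10, method="inclusive")[-1] for m, v in by_month.items()}


def count_exceedances(days, threshold_for_month):
    """Count days whose Tx exceeds the threshold of their own calendar month."""
    return sum(1 for month, tx in days if tx > threshold_for_month[month])


# Synthetic baseline for one grid cell: January Tx of 20..30 °C, July Tx of 5..15 °C.
baseline = [(1, 20.0 + i) for i in range(11)] + [(7, 5.0 + i) for i in range(11)]
p90 = monthly_p90(baseline)

# Two warm but unremarkable January days and one anomalously warm July day.
days = [(1, 26.0), (1, 27.0), (7, 14.5)]
fixed_count = sum(1 for _, tx in days if tx > 25.0)
relative_count = count_exceedances(days, p90)
print(p90, fixed_count, relative_count)
```

In this toy case the fixed threshold flags the two January days, while the month-specific threshold flags only the July day, which is the behaviour the normalised perspective is designed to capture.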

The limitation of fixed thresholds to characterise the future evolution of heat-related impacts
While the Tx90p metric provides useful insights into the properties of recently observed hot years like those in figure 3, there are nevertheless drawbacks to any heat metric which is based on the exceedance of a fixed temperature threshold. First, the binary nature of counting exceedances of a fixed temperature threshold, whether absolute or percentile-based, means that little information can be obtained about the relative severity of different heatwaves. For example, a 10-d heatwave consisting of Tx anomalies exceeding the 90th percentile by 0.1 °C would, under the current framework, be viewed as equivalent to a 10-d period where Tx continuously exceeds the 90th percentile by 10 °C, despite the latter event obviously generating more detrimental impacts.
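The example in the preceding paragraph can be made concrete. In the sketch below the 90th percentile value of 24 °C is a hypothetical number chosen for illustration, and the helper functions are not part of any established package.

```python
def exceedance_days(tx_series, threshold):
    """Binary metric: number of days above the threshold."""
    return sum(1 for tx in tx_series if tx > threshold)


def mean_excess(tx_series, threshold):
    """Average amount by which exceeding days sit above the threshold."""
    excess = [tx - threshold for tx in tx_series if tx > threshold]
    return sum(excess) / len(excess) if excess else 0.0


p90_threshold = 24.0                       # hypothetical local 90th percentile (°C)
mild_event = [p90_threshold + 0.1] * 10    # 10 days, 0.1 °C above the threshold
severe_event = [p90_threshold + 10.0] * 10  # 10 days, 10 °C above the threshold

for name, event in (("mild", mild_event), ("severe", severe_event)):
    print(name, exceedance_days(event, p90_threshold), round(mean_excess(event, p90_threshold), 1))
```

Both events register ten exceedance days, so only a measure of magnitude, such as the mean excess here, separates them.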
Second, there are related issues to do with the statistical saturation of Tx90p and similar 'frequency-of-exceedance' metrics under future warming scenarios. As discussed in depth elsewhere (Harrington and Otto 2018), wherever a percentile-based metric of 'extreme' is defined according to a historical baseline (1981-2010 is chosen here; the ETCCDI uses 1961-90), there will come a time under future warming scenarios where almost every day in the calendar year will be defined as a 'hot day'. I refer to this phenomenon as saturation: once all days are characterised as 'hot', say after 2 °C of global warming, then the added impacts of a 3 °C or a 4 °C world would no longer be detectable using this metric, thereby rendering it uninformative for
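Saturation can be illustrated with an idealised calculation. The sketch below assumes, purely for illustration, that daily Tx is Gaussian, that warming shifts the mean without changing the variance, and that the shift is expressed in units of the local baseline standard deviation; none of these assumptions is taken from the analysis itself.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()           # baseline climate in units of sigma
Z90 = STANDARD_NORMAL.inv_cdf(0.90)      # baseline 90th percentile threshold


def hot_day_fraction(mean_shift_sigma):
    """Fraction of days above the baseline 90th percentile after a mean shift."""
    return 1.0 - STANDARD_NORMAL.cdf(Z90 - mean_shift_sigma)


for shift in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(f"shift of {shift:.0f} sigma: {hot_day_fraction(shift):.3f} of days are 'hot'")
```

The fraction starts at 0.10 by construction and approaches 1 as the shift grows, after which further warming barely changes the count even though each day is hotter.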

The flexibility of a 'sigma-exceedance' metric to quantify extreme heat
One solution to the limitations outlined in section 3.1 is to align metrics of extreme heat based on the observed past (percentile-based metrics) with a 'climate change emergence' framework, a concept which has recently gained traction within the research community.
Figure 4(a) presents three daily Tx thresholds (hereafter called 'sigma-exceedance' thresholds) which are empirically equivalent to the 90th, 95th and 99th percentiles of the 1981-2010 baseline: these are +1.3σ, +1.6σ and +2.2σ, respectively. While we focus on those thresholds constrained from the real-world Tx distributions over New Zealand (see figure 4 caption for details), it is noted that such thresholds are similar to those found using a perfect Gaussian distribution (figure 4(b)).
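The comparison with a perfect Gaussian distribution can be checked directly, since the standard deviation anomaly corresponding to a given percentile of a Gaussian is its inverse cumulative distribution function. The empirical values in the sketch are those quoted in the text; the variable names are illustrative.

```python
from statistics import NormalDist

# Empirical sigma-exceedance thresholds quoted in the text for New Zealand.
empirical_sigma = {0.90: 1.3, 0.95: 1.6, 0.99: 2.2}

# Equivalent thresholds for an idealised Gaussian distribution.
gaussian_sigma = {p: NormalDist().inv_cdf(p) for p in empirical_sigma}

for p, sigma in gaussian_sigma.items():
    print(f"{int(p * 100)}th percentile: Gaussian +{sigma:.2f} sigma, "
          f"empirical +{empirical_sigma[p]:.1f} sigma")
```

The Gaussian values (about 1.28, 1.64 and 2.33) sit within roughly 0.15σ of the empirical thresholds.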
While this change in terminology sounds trivial, three benefits emerge. First, one can now easily characterise the relative intensity of different observed heatwaves: by presenting the average σ anomalies across the duration of the heatwave, these events can be interpreted in a statistical context more widely understood in the scientific community (relative to more complex metrics like HWMId (Russo et al 2015)). Second, this revision ensures that future heatwaves, which might be orders of magnitude more severe than historical experiences (Power and Delage 2019), can still be quantified in a manner comparable to heatwaves from both the past and present day (Otto et al 2018). Third, this adjustment presents future opportunities to align the language used to describe the intensity of specific extreme events, like the 'angry' Australian summer of 2013 (Lewis and Karoly 2013), with the terms 'unusual' (>1σ), 'unfamiliar' (>2σ), 'unknown' (>3σ) and 'inconceivable' (>5σ), which were recently developed to categorise the emergence of changes in mean climate signals.
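A sigma anomaly and its descriptive category might be computed as in the following sketch. The climatological mean and standard deviation in the example are hypothetical, the function names are illustrative, and returning no label below 1σ is a choice made here rather than part of the published terminology.

```python
def sigma_anomaly(tx, clim_mean, clim_std):
    """Express a daily maximum temperature as a standardised anomaly."""
    return (tx - clim_mean) / clim_std


def emergence_label(sigma):
    """Map a sigma anomaly onto the emergence terms quoted in the text."""
    for bound, label in ((5.0, "inconceivable"), (3.0, "unknown"),
                         (2.0, "unfamiliar"), (1.0, "unusual")):
        if sigma > bound:
            return label
    return None  # no term is defined for anomalies of 1 sigma or less


# Hypothetical climatology for one grid cell and calendar day.
anomaly = sigma_anomaly(tx=27.0, clim_mean=20.0, clim_std=3.0)
print(round(anomaly, 2), emergence_label(anomaly))
```

Because the anomaly is continuous, the same calculation applies unchanged to past, present and projected temperatures.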

Identifying candidates for New Zealand's worst heatwaves
Now that the strengths and weaknesses of different metrics to quantify extreme heat have been explored, the 'sigma-exceedance' framework is used next to quantify New Zealand's worst multi-day heatwaves since 1979. Specifically, I quantified the most severe N-day heatwaves which were widespread throughout the country: N was considered from 3 to 14 d and the severity was measured according to the areal median and 90th percentile of Tx anomalies (in units of σ), based on all land grid cells across the country.
Through this process, two key events continued to emerge as particularly significant: one event, spanning November 2nd-5th 2019, which dominated the statistics for events with N < 7 d; and one event, spanning January 20th-31st 2018, which was linked to the top-ranked events over timescales of 8-14 d. The 2019 event was particularly severe, with more than half of the country (including most major cities) spending the entire 5-d period with Tx anomalies exceeding +2.2σ (which is equivalent to exceeding the local 99th percentile) for that time of year.
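A search for the most severe N-day period could be sketched as follows. The text specifies the areal median of σ anomalies as one severity measure but not how it is aggregated over the N days, so averaging over the window is an assumption of this sketch, as are the function names and the toy anomaly fields.

```python
from statistics import median


def areal_median_series(sigma_fields):
    """Reduce each day's field of sigma anomalies to its median across grid cells."""
    return [median(day) for day in sigma_fields]


def strongest_window(series, n_days):
    """Return (start_index, mean_value) of the n-day window with the highest mean."""
    best_start, best_mean = 0, float("-inf")
    for start in range(len(series) - n_days + 1):
        window_mean = sum(series[start:start + n_days]) / n_days
        if window_mean > best_mean:
            best_start, best_mean = start, window_mean
    return best_start, best_mean


# Toy example: five days of sigma anomalies over three land grid cells.
fields = [
    [0.1, 0.2, 0.3],
    [2.0, 2.5, 3.0],
    [2.2, 2.4, 2.6],
    [2.1, 2.3, 2.9],
    [0.0, 0.5, 1.0],
]
series = areal_median_series(fields)
start, severity = strongest_window(series, n_days=3)
print(start, round(severity, 2))
```

Repeating the search for each N from 3 to 14 and ranking the results would give candidate events of the kind discussed above; swapping the median for an areal 90th percentile gives the second severity measure.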
It is also worth noting that both heatwaves occurred during the hottest months of the year, such that temperatures above 25 °C were found in many regions for the duration of the events (figure S1, which is available online at stacks.iop.org/ERL/16/034030/mmedia). Both events therefore satisfy the multiple criteria to characterise extreme heat discussed throughout sections 2 and 3. Hence, if decision makers were interested in understanding the impacts associated with extreme heat in New Zealand, particularly those linked to excess morbidity and mortality in vulnerable populations, temperature-related productivity losses, or infrastructure impacts (Steffen et al 2019), then these two heatwaves represent promising candidates for analysis.

Figure 4.
Converting percentile anomalies to units of standard deviations for daily maximum temperatures. Panel (a) presents empirical estimates of how each standard deviation anomaly corresponds to an exceedance of a local 90th, 95th or 99th percentile in daily maximum temperatures. The thick coloured line shows this estimate for the median land grid cell, for each day in the calendar year (x-axis), while the thin lines show the 5%-95% uncertainty range. The black horizontal lines in panel (a) have been hereafter selected as fixed daily standard deviation thresholds (+1.3σ, +1.6σ and +2.2σ) which correspond to the three percentile-based exceedance thresholds (90th, 95th and 99th, respectively). Panel (b) shows how these varying definitions of 'extreme' compare for an idealised Gaussian distribution; strong similarities are found with those real-world thresholds shown in panel (a).

Recent changes in the observed frequency of extreme heat
The final stage of this analysis involves a preliminary assessment of how the frequency of heat extremes has changed over New Zealand in recent decades. To do this, the number of days with Tx exceeding +1.3σ (or the 90th percentile) is counted in each calendar month, for the decade spanning 1980-1989. This is then compared with the corresponding number of hot days witnessed in the decade spanning 2010-2019 (again, separately for each month), and the fractional increase in the likelihood of witnessing a hot day in the 2010s relative to the 1980s (called a 'risk ratio', following NAS (2016)) is then calculated for each grid cell.
Because research elsewhere suggests anthropogenic increases in global temperatures approximate 0.5 °C between the 1980s and the 2010s (Leach et al 2018, Haustein et al 2019), albeit with uncertainty, we consider such a comparison to be useful in contextualising the scale of impacts possible with another half degree of warming in the future, as has been done elsewhere. However, it is emphasised that any conclusions presented in figure 6 should be treated only as preliminary and requiring further analysis, since the results compare 10-year periods only, and local extremes can be influenced by modes of internal climate variability unrelated to anthropogenic drivers (Rogelj et al 2017, Pfleiderer et al 2018). In addition, only risk ratios above 1.4 are shown in figure 6, following a sensitivity test whereby two identical distributions, each of equal size to the monthly distributions compared in figure 6, were resampled 10 000 times, and a value of 1.4 represented the upper range (95th percentile) of risk ratios found by chance alone. Figure 6 presents the risk ratios of local hot day frequency for each of the 5 months centred on the main austral summer period (December-February). While robust changes are found for the months of December and January, February shows a particularly pronounced increase in the frequency of hot days, with some regions showing more than a five-fold rise in the number of hot days in the most recent decade, relative to the 1980s.

Figure 5.
The colour scales also align with corresponding percentile thresholds: the yellow colours denote exceedances of the 90th percentile, the orange colours denote exceedance of the 95th percentile, and darker red colours denote exceedances of the local 99th percentiles for the entirety of the N-day heatwave period.
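The risk ratio and the chance-alone sensitivity test described in this section can be sketched as follows. The details of the resampling are not given in the text, so this version makes several assumptions: days are treated as independent, each is 'hot' with a fixed probability of 0.10, and a sample of 310 days stands in for one calendar month over ten years. The function names and example counts are hypothetical.

```python
import random


def risk_ratio(hot_then, days_then, hot_now, days_now):
    """Ratio of hot-day probabilities: recent decade relative to baseline decade."""
    p_then = hot_then / days_then
    p_now = hot_now / days_now
    return float("inf") if p_then == 0 else p_now / p_then


def null_risk_ratio_p95(n_days, p_hot, n_resamples, seed=0):
    """95th percentile of risk ratios between two samples from one distribution.

    Each sample holds n_days independent days that are 'hot' with probability
    p_hot, so any ratio different from 1 arises by chance alone.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_resamples):
        a = sum(rng.random() < p_hot for _ in range(n_days))
        b = sum(rng.random() < p_hot for _ in range(n_days))
        ratios.append(risk_ratio(a, n_days, b, n_days))
    ratios.sort()
    return ratios[int(0.95 * (len(ratios) - 1))]


# Hypothetical counts for one grid cell and month: 30 hot days then, 62 now.
print(round(risk_ratio(30, 310, 62, 310), 2))
print(round(null_risk_ratio_p95(n_days=310, p_hot=0.10, n_resamples=2000), 2))
```

Risk ratios below the chance-alone percentile would then be masked. Real daily temperatures are autocorrelated, which this simplified test ignores, so it is likely to understate the spread expected by chance.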

Figure 6.
Month-by-month presentation of risk ratios for observed changes in extreme hot days. Each panel shows locally specific changes in the probability of witnessing a day in that specific month and during the decade spanning 2010 to 2019, where the maximum temperature anomaly locally exceeds +1.3σ (or the 90th percentile), relative to the likelihood of witnessing an equally hot day in that same month for the decade spanning 1980-1989.

There are also several regions, particularly over the central North Island (Waikato), which show more than double the number of hot days in the most recent decade for both November and March (figure S2). Supplementary analysis (figure S3) suggests that the 90th and 99th percentile thresholds of Tx have risen faster than mean temperatures in Waikato, suggesting feedback mechanisms may be playing a role in exacerbating heat extremes over these regions (Wartenburger et al 2017, Lorenz et al 2019). Changes of this nature are more commonly associated with moisture-limited regions like the Mediterranean (Seneviratne et al 2016, Vogel et al 2017) and have not been identified before in New Zealand; future research is therefore needed to disentangle the physical drivers of this change. While acknowledging the caveats surrounding the risk ratios of figure 6, these results show that New Zealand may be witnessing an effective lengthening of summertime conditions over the last several decades. Such results are supported by separate research showing an attributable increase in the likelihood of North Island droughts over the twentieth century (Harrington et al 2016). While further analysis under a formal statistical framework (Stone et al 2019) is needed to verify these results, decision makers and stakeholders should nevertheless prepare for similar trends to accompany future increases in global temperatures.

Summary and outlook
The direct impacts of rising temperatures under climate change will present a challenge to all countries, including those which exhibit cool, temperate climates in the present day (Gasparrini et al 2015). While New Zealand is no exception to this fact, the importance of such changes in extreme heat has been understudied to date, leading to ambiguity in what planning responses might be appropriate for decision makers (Ministry of Health 2018).
While placing a particular focus on New Zealand, this study presents a systematic analysis of the wider literature on metrics to characterise the impacts of extreme heat in all cool climates. By contextualising the severity of recently observed hot years, we examine the merits and drawbacks of currently used hot day definitions in New Zealand, before introducing a new metric which is (1) normalised relative to local experiences, (2) capable of disaggregating the relative severity of different heatwaves, and (3) able to assess future heatwaves which are poorly captured by metrics based on historical statistics. This new metric adds particular value when identifying recent periods of extreme heat and seamlessly analysing the evolution of changes in extreme heat, both past and future, with continuing climate change. However, it is stressed that the most effective assessment of future heat-related impacts in New Zealand will only be achieved through developing multiple metrics, each tailored to sector-specific impacts driven by extreme temperatures. Such efforts should be a priority for future research.
Finally, the results presented in this study point to several specific priorities for future research. These include: (1) a retrospective analysis of the heat stress actually experienced by livestock for the five hot years presented in figures 1 and 3; (2) examining the real-world impacts associated with the two heatwaves identified in figure 5, as well as quantifying the attributable increase in the likelihood of witnessing such events due to climate change; and (3) systematically modelling the possible extension of summertime temperature extremes identified in figure 6, including how these changes might evolve under scenarios of warming over the next several decades.