Projected increase in the spatial extent of contiguous US summer heat waves and associated attributes

The frequency, intensity and duration of heat waves are all expected to increase as the climate warms in response to increasing greenhouse gas concentrations. The focus of this study is on another dimension of heat waves, their spatial extent, which has not been studied systematically but has important implications for associated impacts. Of particular interest are spatially contiguous heat wave regions, examined here over the conterminous US for the May–September season in both the current climate and climate model projections from the CMIP5 archive (11 models total) using the RCP4.5 and RCP8.5 radiative forcing scenarios. Given their myriad impacts, heat waves are defined using multiple temperature variables, one of which includes atmospheric moisture. In addition to their spatial extent, several other physical attributes are computed across contiguous heat wave regions, including a proxy for energy use. The human population exposed to current and future heat waves is also estimated. We find that historical climate model simulations, in aggregate, show good fidelity in capturing key characteristics of heat waves in the current climate, while projections show a substantial increase in spatial extent and other attributes by mid-century under both scenarios, though generally less for RCP4.5, as expected. Overall, the study presents a framework for examining the behavior, and associated impacts, of a frequently overlooked aspect of heat waves. The projected increases in the spatial extent and other attributes of heat waves reported here provide a new perspective on some of the potential consequences of the continued increase in atmospheric greenhouse gas concentrations.


Introduction
Heat waves, generally defined as consecutive days with extreme daily temperatures, often have multiple, deleterious impacts on society and on both the natural and built environments. The physical attributes of individual heat waves, however, can vary substantially, as exemplified (and confounded) by the use of differing definitional criteria. Heat waves have been variously identified using relative or absolute thresholds of daily maximum, minimum or mean temperatures, with some definitions also including some measure of atmospheric moisture, typically when exploring impacts on human health (Perkins and Alexander 2013, Smith et al 2013, McGregor 2015, Perkins 2015, Horton et al 2016, Coffel et al 2018). However, there are three common heat wave attributes that are in widespread use: their duration, intensity and frequency of occurrence. A combination of attributes can allow for greater discrimination among events, such as using duration and intensity in computing the Heat Wave Magnitude Index (Russo et al 2014). Many previous studies conclude it is very likely that the duration, intensity and frequency of heat waves will increase as the climate system warms in response to increasing greenhouse gas concentrations (Perkins et al 2012, Bindoff et al 2013, Perkins 2015, Perkins-Kirkpatrick and Gibson 2017, Vose et al 2017, Dosio et al 2018).
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
An important physical attribute of heat waves that has not previously been examined systematically is their spatial extent, specifically, the spatial extent of contiguous regions simultaneously experiencing heat wave conditions. The physical size of such regions has important implications for heat-related impacts, including energy demand and the human population exposed to extreme daily temperatures. Atmospheric conditions favorable for heat waves can also reduce air quality (e.g. Jacob and Winner 2009, Harlan and Ruddell 2011). Earlier studies that considered the spatial extent of heat waves looked at the fraction of land area covered (not necessarily contiguously) (Russo et al 2015, Sharma and Mujumdar 2017), specific heat wave cases (Rebetez et al 2008), fixed regions defined using empirical orthogonal functions (Lau and Nath 2012), or cluster analysis of different heat wave attributes (Stefanon et al 2013, Lhotka and Kysely 2015).
Here, an algorithm is applied to daily gridded temperature data that identifies contiguous regions over the US domain that meet specified heat wave criteria under certain constraints, such as the minimum number of grid points required to define a region (see Methods). We first apply the algorithm to temperature data from the North American Regional Reanalysis (NARR; Mesinger et al 2006) for the May to September warm season during 1979-2009, and for comparison, historical runs from 11 coupled climate models contained in the CMIP5 archive (Taylor et al 2012), covering 1980-2005 (more information on the selection of the 11 models provided in Methods). We then consider climate model projections for the mid-century period of 2031-2055 by examining temperatures from the same set of 11 models forced with the representative concentration pathway RCP4.5 and RCP8.5 scenarios. To accommodate the multiple heat wave definitions and types of impacts described in the literature to date, daily values of the maximum, minimum and mean temperature are evaluated along with maximum apparent temperature computed using a linear regression fit developed by Steadman (1984) (hereafter simply called apparent temperature). In reanalysis and CMIP5 model historical runs and projections the total number of heat wave events, their duration and magnitude are computed. For maximum daily apparent temperature and daily mean temperature, the exposed human population in contiguous heat wave regions is quantified, with cooling degree days (CDD) also computed for the latter variable. Projected changes in spatial extent and other attributes are quantified, with changes in the geographic distribution of overall heat wave frequency also presented.

Methods
Daily maximum and minimum temperature along with specific humidity and surface pressure near the time of the daily maximum temperature were obtained from the NARR dataset. While NARR values are not direct observations, a recent study (Lyon and Barnston 2017) compared apparent temperatures computed for a set of 33 first-order US meteorological stations with values computed from NARR temperature and humidity at the nearest grid point, with favorable results (see supplemental information; SI is available online at stacks.iop.org/ERL/14/114029/mmedia). For the 11 CMIP5 models (table 1), the 2 m daily maximum and minimum temperature (variables tasmax and tasmin, respectively) were utilized along with daily mean values of 2 m specific humidity (huss) and surface pressure (ps) from the CMIP5 archive for the May to September season. The 11 CMIP5 models used were selected based on data availability and data quality (lack of missing, or obviously erroneous, values) and have a transient climate response and equilibrium climate sensitivity (see table S1) similar to the 30-model average as provided by Flato et al (2013).
The CMIP5 model variables are known to exhibit various biases. Of particular relevance here are model biases in temperature, specific humidity and surface pressure. Biases in all these variables have been computed for the historical period 1980-2005 (see SI and figures S1 and S2). Since heat waves are defined here using percentiles, the influence of such biases on identifying heat wave events is minimized. However, for CDD, an absolute temperature threshold is used, and the influence of temperature biases needs to be considered explicitly. To do so, computed biases in climatological monthly mean maximum and minimum temperature (for each model) were first subtracted from respective daily temperature values of the corresponding month (from May to September). CDD are then calculated using the bias-corrected temperature data. The daily mean temperature was calculated as the average of the daily maximum and minimum values. The daily maximum apparent temperature is computed following the multiple linear regression approximation of Steadman (1984) as: $T_a = -1.3 + 0.92T + 2.2e$, where $T_a$ is the apparent temperature (°C), $T$ is the near surface air temperature (°C), $e$ is the near surface vapor pressure (kPa) and we have assumed outdoor conditions in the shade with no wind. The vapor pressure (kPa) is computed as $e = 1.608 \times 10^{-3}\, q\, p$, where $q$ is the specific humidity (g kg⁻¹) and $p$ is the surface pressure (Pa). We note that an alternative linear regression approximation of apparent temperature for 'shade' conditions provided by Steadman (1984) yields very similar results to those obtained from the above equation (see SI). For the CMIP5 models, using the daily mean specific humidity versus the value at the time of maximum temperature can introduce a small bias in $T_a$, on the order of a few tenths of a degree, which is not considered sufficient to substantially alter the overall results (see SI).
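The apparent temperature calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code, and the function names are hypothetical. One unit assumption is ours: applying the 1.608 × 10⁻³ coefficient to specific humidity in g kg⁻¹ and pressure in Pa appears to give vapor pressure in Pa (it is the familiar approximation e ≈ qp/0.622 with q in kg kg⁻¹), so the sketch divides by 1000 to obtain the kPa expected by the regression.

```python
def vapor_pressure_kpa(q_g_per_kg, p_pa):
    """Near-surface vapor pressure (kPa) from specific humidity and pressure.

    Assumption (ours): e ~= q * p / 0.622 with q in kg/kg, i.e.
    e [Pa] = 1.608e-3 * q [g/kg] * p [Pa], then converted from Pa to kPa.
    """
    e_pa = 1.608e-3 * q_g_per_kg * p_pa
    return e_pa / 1000.0


def apparent_temperature_c(t_c, e_kpa):
    """Steadman (1984) linear fit quoted in the text: Ta = -1.3 + 0.92 T + 2.2 e."""
    return -1.3 + 0.92 * t_c + 2.2 * e_kpa


# Illustrative values only: 35 degC, 15 g/kg, 1000 hPa surface pressure.
e = vapor_pressure_kpa(15.0, 100000.0)
ta = apparent_temperature_c(35.0, e)
print(f"e = {e:.3f} kPa, Ta = {ta:.2f} degC")
```

Because heat waves are identified from percentiles of each variable, a constant scaling or offset in the apparent temperature would not change which days are flagged, but it would matter for any use of absolute values.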
The apparent temperature is used in the study as it represents the perceived temperature of humans given the combination of heat and humidity and is thus a measure of 'thermal comfort'. The linear regression fit of Steadman (1984) estimates apparent temperature values he obtained via a more complex calculation and additional inputs. An alternative is the Heat Index used by the US National Weather Service (Rothfusz 1990). The Heat Index uses a more complicated, nonlinear multiple regression fit to apparent temperature values computed by Steadman (1979) and surface temperature and relative humidity as inputs. The Heat Index shows greater sensitivity to extreme temperature and humidity conditions than our simpler, multiple linear regression estimation of apparent temperature. As such, researchers interested in absolute values of apparent temperature should consider using the Heat Index in their analyses instead of our simpler formulation. However, this study is focused on identifying heat waves based on relative values (percentiles) of apparent temperature rather than absolute values. Therefore, so long as the Heat Index is closely associated with our estimate of apparent temperature, the two indicators should provide very similar results. Comparisons of daily values of the Heat Index and our apparent temperature calculation were thus made at 9 stations selected from 9 climate regions across the US and revealed linear correlations 0.98 across all stations (see SI and figure S3). Given this strong relationship, and since examining specific health outcomes from heat waves is beyond the scope of this study, we used the simpler linear regression approximation of Steadman (1984) to estimate apparent temperature (see SI).
All of the data in the study were re-gridded to a common 2.0° latitude × 2.0° longitude grid, with only grid points over land included in the analysis. At least 50% of a grid box had to contain land in order to be considered land area. A common grid was needed to consistently compute heat wave spatial extent and other attributes across models, with the 2.0° × 2.0° resolution selected as it was very close to the resolution of the original data (see table 1). The spatial domain covered is 24°N-50°N, 62°W-130°W, which is a boxed region roughly covering the conterminous US. Gridded population estimates for 2015 were obtained from the Gridded Population of the World version 4 dataset (GPWv4, Doxsey-Whitfield et al 2015). The population within each 2.0° × 2.0° grid point was estimated by aggregating the GPWv4 data to this resolution.
For all temperature variables, a heat wave is defined locally when a daily temperature variable exceeds the 95th percentile of its historical distribution for 3 or more consecutive days. The percentile thresholds were identified by first ranking daily temperature values (1 May-30 September) at each grid point over the period 1979-2009 for NARR and 1980-2005 for the CMIP5 models. The slight difference in base periods for NARR and CMIP5 has little effect on the climatological mean values. Given the relatively short historical period used, daily percentile values at a given location can fluctuate up and down somewhat from one day to the next, an undesired result of sampling variability rather than changes in seasonally varying climate. To minimize this effect, the daily 95th percentile values were temporally smoothed at each grid point by applying a 15 d moving average to the ranked temperature values, which reduces the time domain of the study to 8 May-23 September. This approach follows that successfully used in a recent study of US heat waves (Lyon and Barnston 2017).
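The local definition can be illustrated with a short Python sketch for a single grid point. It is a simplified illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the 15 d smoother is assumed to be a centered moving average (which is consistent with losing 7 days at each end of the 1 May-30 September season), and a day is flagged when the threshold has been exceeded on 3 or more consecutive days ending on that day.

```python
def smooth_thresholds(daily_p95, window=15):
    """Centered moving average of daily 95th percentile values.

    Days without a full window are dropped, so a 153-day season
    (1 May-30 Sep) shrinks to 139 days (8 May-23 Sep).
    """
    half = window // 2
    return [
        sum(daily_p95[i - half:i + half + 1]) / window
        for i in range(half, len(daily_p95) - half)
    ]


def heat_wave_flags(temps, thresholds, min_days=3):
    """Return 1 for days ending a run of >= min_days exceedances, else 0."""
    flags = []
    run = 0  # length of the current run of consecutive exceedances
    for temp, threshold in zip(temps, thresholds):
        run = run + 1 if temp > threshold else 0
        flags.append(1 if run >= min_days else 0)
    return flags


# Hypothetical daily maximum temperatures against a flat 35 degC threshold.
temps = [30, 36, 36, 36, 36, 30, 36, 36]
print(heat_wave_flags(temps, [35] * len(temps)))
```

Applying `heat_wave_flags` at every grid point for every day yields the binary (yes/no) fields from which contiguous regions are then identified.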
Contiguous heat wave regions were identified by applying the connected components algorithm in Matlab (release R2014a) to gridded, binary (yes/no) heat wave data. The first step was to flag grid point values in each day's temperature field as '1' if the heat wave criteria were met and '0' if they were not. The type of connectivity in the Matlab algorithm was set at 8, which allows for adjacent and diagonal connectivity, and the minimum number of connected grid points was set at four, representing a minimum area for a heat wave of about 151 000 km² (slightly larger than the area of the State of New York). This minimum area was chosen to emphasize larger heat wave events, considered more likely to be tied to impacts (and distinct variations in the atmospheric circulation) than events that are identified at only a few, isolated grid points. Once contiguous regions have been identified, the algorithm allows them to be tracked through time and to move in space. A heat wave region is also allowed to break into sub-regions as long as a minimum of four contiguous grid points overlap with a contiguous region identified the previous day.
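The region identification step can be sketched in pure Python as a stand-in for the Matlab connected components routine used in the study. The function name and the breadth-first flood fill are our own illustrative choices; only the 8-connectivity and the four-point minimum come from the text. Tracking regions from one day to the next is not shown.

```python
from collections import deque


def label_regions(grid, min_points=4):
    """Label 8-connected regions of 1s; regions below min_points stay 0."""
    nrows, ncols = len(grid), len(grid[0])
    labels = [[0] * ncols for _ in range(nrows)]
    seen = [[False] * ncols for _ in range(nrows)]
    n_regions = 0
    for r0 in range(nrows):
        for c0 in range(ncols):
            if grid[r0][c0] != 1 or seen[r0][c0]:
                continue
            # Breadth-first flood fill from this unvisited heat wave point.
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            members = []
            while queue:
                r, c = queue.popleft()
                members.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):  # includes diagonal neighbours
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < nrows and 0 <= cc < ncols
                                and grid[rr][cc] == 1 and not seen[rr][cc]):
                            seen[rr][cc] = True
                            queue.append((rr, cc))
            if len(members) >= min_points:
                n_regions += 1
                for r, c in members:
                    labels[r][c] = n_regions
    return labels, n_regions


# Hypothetical one-day binary field: a diagonal chain of four points
# qualifies, while a two-point pair and an isolated point do not.
field = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
]
labels, n_regions = label_regions(field)
print(n_regions)
```

With 4-connectivity the diagonal chain in this example would split into smaller pieces and fall below the size threshold, which is why the choice of connectivity affects the number and extent of regions identified.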
The approach is illustrated in figure 1, which shows contiguous heat wave regions in the NARR data (at its original ≈32 km resolution) based on maximum daily temperature exceeding the 95th percentile for 3 or more days during the period 2-4 August 1980. Note that a heat wave identified on a given day would require that the daily temperature had exceeded the 95th percentile for 3 or more days ending on that date. The top panel in figure 1 shows a single, contiguous heat wave located in the central part of the US on 2 August. On 3 August (middle panel), this heat wave has broken into two sub-regions while a new and distinct heat wave emerges along the southeast coast. On 4 August (bottom panel) only the remnants of the original heat wave (region 1) remain. The various heat wave attributes are evaluated separately over the individual, contiguous heat wave regions and over their respective lifetimes.
The specific attributes evaluated are the number of heat wave events and their duration, the daily maximum and average daily spatial extent and the daily maximum and average daily normalized magnitude. The normalized magnitude was computed as the number of degrees above the 95th percentile temperature threshold divided by the difference between the threshold temperature and the median temperature averaged across all grid points in a contiguous heat wave region. This normalization was undertaken to account for geographic and within-season variations in daily temperature variance that would otherwise skew the results towards regions with higher variance. For example, based on temperature data from US observing stations (Barnston 1993), the standard deviation of daily maximum temperatures for Billings, MT in July is 4.6°C while for Miami, FL it is 1.4°C (in September the values are 7.1°C and 1.5°C, respectively). For the sake of argument, if July daily maximum temperature is normally distributed (normality is not assumed in our other analyses), the 95th percentile threshold would be 1.64 standard deviations above the median. If the daily temperature exceeded this threshold by 1.0°C at both locations, this would represent a normalized value of 0.44 at Miami but only 0.13 at Billings. Of course, in the case of energy demand and some other applications, the absolute temperature departure above threshold is important, so for daily mean temperatures we also computed the daily maximum number of CDD (threshold of 18.5°C) as summed across all points in a contiguous heat wave. We emphasize the daily maximum CDDs since meeting peak energy demand is the greatest challenge during a heat wave (Bartos et al 2016, Aufhammer et al 2017). As discussed earlier in this section, bias-corrected temperature data are used when computing CDDs from model data.
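The normalized magnitude and CDD calculations can be made concrete with a small Python sketch that reproduces the Billings and Miami arithmetic. It rests on assumptions of ours: the function names are hypothetical, the normalized ratio is computed at each grid point and then averaged over the region (the text could also be read as averaging numerator and denominator separately), the median temperatures are placeholders that cancel out of the result, and the bias correction is passed as a single offset per variable for simplicity.

```python
CDD_BASE_C = 18.5  # cooling degree day threshold quoted in the text (degC)


def normalized_magnitude(temps, p95, p50):
    """Mean over grid points of (T - T95) / (T95 - T50)."""
    ratios = [(t - hi) / (hi - med) for t, hi, med in zip(temps, p95, p50)]
    return sum(ratios) / len(ratios)


def region_cdd(tmax, tmin, tmax_bias=0.0, tmin_bias=0.0):
    """Sum of daily CDD over the grid points of one region on one day.

    Biases are subtracted from Tmax and Tmin before the daily mean is taken.
    """
    total = 0.0
    for tx, tn in zip(tmax, tmin):
        tmean = 0.5 * ((tx - tmax_bias) + (tn - tmin_bias))
        total += max(0.0, tmean - CDD_BASE_C)
    return total


# Worked example: a 1.0 degC exceedance under the normality assumption,
# where the threshold sits 1.64 standard deviations above the median.
for city, sd in (("Miami", 1.4), ("Billings", 4.6)):
    median = 30.0  # placeholder value; it cancels out of the ratio
    p95 = median + 1.64 * sd
    value = normalized_magnitude([p95 + 1.0], [p95], [median])
    print(city, round(value, 2))

# Hypothetical two-point region: daily means of 30.0 and 22.5 degC.
print(region_cdd([35.0, 30.0], [25.0, 15.0]))
```

The same 1.0°C exceedance is thus weighted more than three times as heavily in the low-variance location, while the CDD sum retains the absolute departure that matters for energy demand.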

Results
To provide context for projected changes in various heat wave attributes, the projected multi-model mean (MMM) temperature change (2031-2055 minus 1980-2005) for the CMIP5 models during the May to September season is provided in figure S5, along with the projected temperature change divided by the historical mean monthly standard deviation for both the RCP4.5 and RCP8.5 scenarios. Figure 2 shows the spatial distribution of the frequency of heat wave days for the MMM CMIP5 historical runs (1980-2005) and RCP8.5 projections (2031-2055), where a simple bidirectional smoothing has been applied to the gridded data. Results for daily maximum and minimum temperature and daily maximum apparent temperature are shown. Note that the range on the color scale is an order of magnitude larger for projections than for the historical period. The projected frequency of heat wave days clearly shows a substantial increase in all areas of the country for all three temperature variables, particularly in the southern and western portions of the country. The large, projected increase in heat wave frequency for daily maximum apparent temperature in the southwest US (see figures 2(b) and (d)) appears consistent with a faster projected rate of increase of apparent temperature than for maximum temperature in this region (not shown). Similar differential trends in these variables have been identified using station observations (Gaffen and Ross 1999, Grundstein and Dowd 2011). The historical frequency of heat waves based on minimum temperature (figure 2(e)) is lower than that for daily maximum temperature or apparent temperature, consistent with previous findings based on station observations (Lyon and Barnston 2017). This result is also seen in related results for minimum temperature in the CMIP5 historical runs and NARR data (figure S6).
On the other hand, the frequency of heat wave days for minimum temperatures shows the greatest increase in projections relative to the historical period, with the largest increase in the southeastern US. For daily maximum and daily maximum apparent temperature, the increase in heat wave days is generally greater in the southern half of the US, with a large increase also seen along the east coast. This latter result may in part reflect the fact that some grid points in these coastal locations have a substantial fraction over the ocean (the resolution of the CMIP5 model data is 2.0° × 2.0° lat./lon.) but may also reflect the fact that temperature variance is typically lower in these coastal locations, making a fairly uniform increase in temperature from anthropogenic forcing more likely to lead to heat waves there. The CMIP5 projected changes in heat wave day frequency in figure 2 are generally consistent with those from downscaled regional model projections for the US (e.g. Kunkel et al 2010).
Figure 3 summarizes the physical attributes of contiguous heat waves occurring in NARR and the CMIP5 historical runs and projections based on the RCP8.5 scenario (a similar figure for the RCP4.5 scenario is provided in figure S7). The attributes include the average number of contiguous events, their duration, daily maximum and average daily spatial extent and daily maximum and average daily normalized magnitude. The bars on the plots represent values averaged across all identified heat wave events and all models, with the whiskers indicating the range in average values across the 11 CMIP5 models. For all heat wave attributes and for all temperature variables, figure 3 shows that the MMM values from the historical runs are in very good agreement with those obtained from NARR, with differences typically less than 10% (see table S2). By this metric at least, the models appear to be doing a very good job in capturing the aggregate behavior of several important contiguous heat wave attributes. Figure 3 also shows that contiguous heat waves identified using daily minimum temperature in NARR and CMIP5 historical runs have comparatively fewer events, of shorter duration and lower spatial extent, than do the three other temperature variables. This relative difference is not found for normalized magnitude, however.
For RCP8.5 projections, the magnitude of all attributes increases substantially (for RCP4.5 the overall pattern of change is similar, with the magnitude of attribute changes generally smaller, as expected). Across temperature variables, both the average daily maximum spatial extent and average daily maximum magnitude increase by a factor of roughly 1.8 (i.e. 80%) over historical values (see table S2). The mean number of events and event duration also both roughly double by mid-century. It is perhaps not surprising that the average daily spatial extent and average daily normalized magnitude show less of an increase in projections than their respective daily maximum values: since projections show an increase in the overall number of events, some future events are expected to be just meeting the definitional criteria. It should also be kept in mind that an increase in the duration of heat wave events is a more relevant metric in quantifying changes in heat wave behavior than changes in the number of events, or overall heat wave days. For example, a single heat wave event persisting for 9 d would likely have larger impacts than 3 separate events that each only persist for 3 d (e.g. Anderson and Bell 2011, Troy et al 2015).
In terms of some potential impacts of heat waves, figure 4 shows the daily maximum and event average CDDs and daily maximum exposed population results for NARR and the CMIP5 MMM historical runs and RCP8.5 projections (RCP4.5 results are shown at the bottom of figure S7). These two variables are evaluated for daily mean temperature and the daily maximum apparent temperature. CDDs are based on daily mean temperature, which is also relevant to health outcomes as elevated nighttime temperatures preclude relief from daytime heat stress (Greene and Kalkstein 1996, Hajat et al 2006, Luber and McGeehin 2008). Extreme daily maximum apparent temperatures are directly related to heat stress (Steadman 1984). For both temperature variables, results from the CMIP5 historical runs are again very similar to those obtained using NARR (typical differences <5%; table S2). Figure 4(a) indicates the daily maximum CDDs double in the MMM projections, with the event average showing a more modest increase. The projected increase in CDDs during heat waves has major implications for meeting future energy demand and is likely to place substantial stress on the energy system (USGCRP 2018), much more so than will increases in mean temperature alone (Jaglom et al 2014, McFarland et al 2015, Larsen et al 2017). The population exposed to extreme daily maximum apparent temperature and daily mean temperature (figure 4(b)) is also seen to double in RCP8.5 projections by mid-century.
Previous work on US population exposure to extreme heat (e.g. Jones et al 2015, Coffel et al 2018) did not consider the simultaneous population exposure to heat waves in spatially contiguous regions. Of course, in such contiguous regions human populations may show some degree of acclimatization as temperatures rise, making them less vulnerable to heat waves. Sheridan and Dixon (2017), for example, find an overall decline in human vulnerability to heat waves in several major US cities since the mid-1970s. The results in figures 3 and 4 are based on values that are averaged across all heat wave events identified in the NARR and CMIP5 model data. To estimate an upper-bound (not the most likely outcome) in projected heat wave attribute changes we also examined more severe heat wave events, identified as the 90th percentile in attribute values in NARR and the CMIP5 historical runs and projections. For this purpose, only the RCP8.5 scenario was used and for CMIP5, the 90th percentile attribute values were calculated for individual models first and then averaged across all 11 models. We find (SI; figure S8 and table S3) that for these more severe events the daily maximum spatial extent, duration, exposed population and CDD more than double from historical values by mid-century under this high greenhouse gas scenario. For maximum daily spatial extent, this represents an increase of roughly 20%-30% over projected changes in average heat wave attribute values across our four temperature variables under RCP8.5.

Discussion and conclusions
This study was motivated by the lack of a systematic analysis of the spatial extent of contiguous heat wave regions. Larger spatial extent of heat waves strongly suggests larger human exposure and increased energy demand and could also have implications for fire risk and air quality, although the latter two impacts were not examined explicitly here. Each of the impacts, however, considered in isolation, or from the viewpoint of their simultaneous interactions, is clearly worthy of further research. That the historical runs from the 11 CMIP5 models used are able (in aggregate) to closely capture several of the attributes of historical heat waves identified in the NARR data provides some confidence for employing the models to examine projected changes in these attributes. The results from projections show substantial increases in the spatial extent, duration, CDD and exposed population, with respective values for the more extreme heat waves (>90th percentile of historical attribute values) all roughly doubling over historical values by mid-century under the RCP8.5 forcing scenario. The projected, exposed population is a conservative estimate as it is based on static population estimates for 2015; projected population changes would make this figure higher (Coffel et al 2018, Jones et al 2015). The magnitude of these increases under the RCP4.5 scenario is roughly 70%-90% of those found using RCP8.5. For the high-concentration RCP8.5 scenario, the projected increase in CDDs for these more extreme heat waves is roughly five times greater than historical mean values, reinforcing concerns of increased stress on the energy system from extreme heat (USGCRP 2018). Even when averaged across all projected heat waves, the duration, exposed population and CDD are found to roughly double, with the spatial extent increasing by roughly 80% from respective values for the current climate, with RCP4.5 values being 5%-20% less.
While the focus of this paper is on the spatial extent of larger, contiguous heat wave regions and associated attributes, the importance of small or local-scale aspects of heat waves is certainly recognized. For example, the urban heat island effect can significantly enhance exposure to extreme temperatures in cities (Habeeb et al 2015) where population density is also high. In addition, the joint occurrence of heat waves and drought can place additional stresses on the natural and built environment (Mazdiyasni and Agha-Kouchak 2015) and the realism of this joint behavior needs to be evaluated in climate models (e.g. Lyon 2009).
The results of the current study are also clearly sensitive to the ability of coupled climate models to properly capture the physical mechanisms that drive heat wave development and persistence, mechanisms that may change as a result of increased radiative forcing from rising greenhouse gas concentration (see review by Perkins 2015). And while this study finds a close agreement between heat wave attributes identified in CMIP5 historical runs and the NARR, the latter are not direct observations. Irrigation and intensified cropping in the central US (e.g. Cook et al 2011, Mueller et al 2015), for example, have been argued to reduce maximum temperature trends, with neither effect included in the CMIP5 projections used here. In addition, the high-concentration RCP8.5 scenario may prove to be unrealistically high by mid-century and the transient climate response to increasing greenhouse gas concentrations may exceed the real-world response, as has been suggested for some models (e.g. Kirtman et al 2013). In such cases, the results presented here may be more applicable later than mid-century but remain substantial.
Figure 4. NARR and CMIP5 multi-model mean historical and RCP8.5 projected changes in cooling degree days and exposed population to heat waves (>95th percentile for 3 or more days) based on daily maximum apparent temperature (green bars) and daily mean temperature (grey bars). Whiskers indicate the range in values across the 11 models in historical runs (1980-2005) and projections (2031-2055). (a) Daily maximum and average cooling degree days (°C). (b) Average maximum population exposed to heat waves defined using daily maximum apparent temperature (green bars) and daily mean temperature (grey bars).
In future work, more research is needed to evaluate the influence and possible nonlinearities in impacts associated with spatially contiguous heat wave regions in both the current climate and climate model projections. A clear example is testing current management assumptions and capacity needs in the energy sector regarding meeting peak load demand requirements during spatially expansive heat waves. The type of results provided here, for example, could be used to stress test current system capacity and inform management decisions and planning going forward.
Overall, the study provides a methodological framework for examining the behavior and potential impacts of a frequently overlooked aspect of heat waves. While additional research is needed, the current results suggest the impacts of the projected changes in heat wave characteristics could be substantial if greenhouse gases continue to increase, especially if that increase continues unabated.