Heat Waves: Physical Understanding and Scientific Challenges

Heat waves (HWs) can cause large socioeconomic and environmental impacts. The observed increases in their frequency, intensity and duration are projected to continue with global warming. This review synthesizes the state of knowledge and scientific challenges. It discusses different aspects related to the definition, triggering mechanisms, observed changes and future projections of HWs, as well as emerging research lines on subseasonal forecasts and specific types of HWs. We also identify gaps that limit progress and delineate priorities for future research. Overall, the physical drivers of HWs are not well understood, partly due to difficulties in the quantification of their interactions and responses to climate change. Influential factors convey processes at different spatio‐temporal scales, from global warming and the large‐scale atmospheric circulation to regional and local factors in the affected area and upwind regions. Although some thermodynamic processes have been identified, there is a lack of understanding of dynamical aspects, regional forcings and feedbacks, and their future changes. This hampers the attribution of regional trends and individual events, and reduces the ability to provide accurate forecasts and regional projections. Sustained observational networks, models of diverse complexity, narrative‐based methodological approaches and artificial intelligence offer new opportunities toward process‐based understanding and interdisciplinary research.

HWs can co-occur with other hazards or drivers in the form of compound events with cumulative impacts (e.g., Leonard et al., 2014). Examples of HW-compounded hazards include high humidity (humid HWs; e.g., S. Russo et al., 2017), droughts (e.g., Miralles et al., 2019), air pollution (e.g., Xiao et al., 2022; Zong et al., 2022) or wildfires (e.g., Sutanto et al., 2020). These compound events can lead to strong perturbations of the global carbon cycle, and disproportionate socioeconomic impacts in highly populated areas and major breadbasket regions (Bastos et al., 2020; Kornhuber et al., 2020; Sippel et al., 2018). To some extent, HWs have an inherent compound nature since their drivers interact at multiple spatio-temporal scales and intervene in other climate-related hazards. Overall, knowledge on compound events is still limited, partially due to incomplete understanding of the underlying hazards, their spatio-temporal dependences, and the multiplicity of drivers and interactions, as also illustrated in this review.
Globally, HWs have become more frequent, persistent and intense in recent decades (e.g., Seneviratne et al., 2021), and several regions have experienced unprecedented events that would have been very unlikely without increasing greenhouse gas (GHG) emissions (e.g., Robinson et al., 2021). The observed changes and associated effects of HWs are projected to soar under continued global warming, posing unprecedented risks and potentially irreversible damage to ecosystems (Deryng et al., 2014; Gasparrini et al., 2017; Harris et al., 2018; Hoegh-Guldberg et al., 2018; Ruffault et al., 2020; Watts et al., 2018). Therefore, better understanding of HWs and their responses to climate change is paramount to assess and predict associated risks, and to enhance preparedness for future events. As such, HWs are one of the four types of extreme events addressed by the Extremes Grand Challenge of the World Climate Research Programme (WCRP; Alexander et al., 2016), which is organized around major research topics targeting documentation, understanding, simulation and attribution of extremes.
This review summarizes current knowledge and challenges, focusing on recent advances since the Special Report on Extreme Events (SREX; Seneviratne et al., 2012). It also expands upon the last report of the Intergovernmental Panel on Climate Change (IPCC; Seneviratne et al., 2021) and includes questions not addressed therein. Emphasis is put on (but not limited to) HWs over land at different temporal scales (from daily to monthly). The questions addressed include the definition of HWs (Section 2), their physical drivers (Section 3), and observed changes and future projections (Section 4). Urban HWs, as well as emerging topics related to marine HWs (MHWs) and subseasonal forecasting, are discussed in Section 5. Section 6 summarizes key gaps requiring scientific progress and discusses potential areas for future research.

Defining and Characterizing Heat Waves
There are many HW definitions. According to the International Meteorological Vocabulary (WMO, 1992), a HW is a "marked warming of the air, or the invasion of very warm air, over a large area." The Glossary of Meteorology of the American Meteorological Society (AMS) defines it as "a period of abnormally and uncomfortably hot (and usually) humid weather." A similar, albeit more specific, definition is employed by the IPCC (IPCC: Annex VII: Glossary, 2021): "a period of abnormally hot weather, often defined with reference to a relative temperature threshold, lasting from 2 days to months." These definitions depend on the local climatology and/or the impacted system (e.g., "hot weather" or "warm air"), involve some degree of arbitrariness (e.g., "large area") and cannot be generalized (e.g., "abnormally hot weather" in polar regions would not be hot in other regions, not even "uncomfortable"). In the scientific literature, a plethora of HW definitions can be found (e.g., McGregor et al., 2015; Perkins & Alexander, 2013; X. Zhang et al., 2011; Zuo et al., 2015). Figure 1 provides a schematic of different types of HW indicators and their qualitative categorization in terms of complexity. It is used for illustrative purposes only, since there is not a single and easy way to classify all HW definitions. For a more detailed description of specific HW indices and targeted applications, the reader is referred to Table 1, which provides a representative list of HW indicators widely employed in the literature. The separation between climate- and impact-oriented definitions is not obvious. In particular, impact-oriented definitions should combine a hazard measure with other indicators influencing the risk of impacts on the targeted sector. However, this entails challenging estimates of exposure and vulnerability, and hence simple climate-based indices are often employed as proxies of impact assessments (e.g., the frequency of tropical nights with minimum temperature above 20°C; Table 1). Many climate-based indices rely on temperature data only (e.g., daily maximum temperature, TX). These provide the simplest definitions of HWs. Even so, selecting an index is not trivial since it implies decision-making at different levels (from data selection to methodological choices) to reach an adequate compromise between sample size and representativeness of the tail of the distribution.
Figure 1. [...] We note that these classifiers are not fully independent (e.g., some methodological choices can be determined by data availability). The location of HW indices in the 2-D space is given for indicative purposes only, and is based on subjective assessment. The diagonal line is intended to measure the level of complexity, which increases with data limitations and the number of choices. Ovals filled in blue denote temperature-based indices and those in orange identify composed indices more oriented to impact assessments. Background shaded colors define the components of the risk-based framework (hazard in purple, and exposure and vulnerability in pink).
Table 1. [...] (only the final rows are recoverable)
Index | Definition | Main HW attribute | Index category (variables) | Sectoral applications | Reference
[...] | [...] | [...] | [...] | [...] | Russo et al. (2015)
EHF (Excess Heat Factor) | Cumulative TM threshold exceedances weighted by an acclimatization factor | Accumulated intensity | Event index (TM) | Health, Agriculture, Water and Energy | Perkins and Alexander (2013)
HI (Heat Index) | A measure of thermal discomfort (or apparent temperature) | Intensity | Multivariate index (e, f) | Health | Steadman (1979)
UTCI (Universal Thermal Climate Index) | An equivalent temperature of the human physiological response to the thermal environment | Intensity | Multivariate index (f, g) | Health | Fiala et al. (2012)
Note. Columns indicate the abbreviation, a brief definition, the main HW attribute captured by the index, the index category and underlying variables, some reported sectoral applications and a reference. The index category includes extreme, event and multivariate indices (Figure 1). The former allows subclassification in absolute and percentile indices. T, TM, TX, and TN denote instantaneous, daily mean, daily maximum and daily minimum air temperature, respectively. (a) Absolute and threshold indices only measure a HW attribute if extreme conditions actually occur. (b) Sectoral applications are based on the Expert Team on Sector-specific Climate Indices (ETSCI) assessment (https://climpact-sci.org/indices/) or on applications of the original (or reformulated) indices. (c) There exist similar definitions for TN. (d) A measure of the energy needed to cool a building. There are different implementations in terms of variables, base temperatures and periods for accumulation. (e) There are different formulations using empirical non-linear functions of air (dry bulb) temperature and humidity (relative humidity or dew point temperature). (f) Depending on the author, absolute or relative thresholds can further be imposed to determine different categories of heat stress conditions. (g) Typical variables considered include air temperature, wind speed, radiation (radiant temperature) and humidity (relative humidity or dew point temperature).
Following earlier efforts to define a coordinated set of extreme indices, the Joint CCl/CLIVAR/JCOMM Expert Team on Climate Change Detection and Indices (ETCCDI) developed 27 extreme indices (Alexander et al., 2006; http://etccdi.pacificclimate.org/list_27_indices.shtml), including 17 temperature-based indicators of warm and cold extremes. These indices have been extensively used in IPCC reports (Seneviratne et al., 2012, 2021) and provide the basis for further developments. They are based on absolute thresholds, percentiles or duration criteria (spell indices), yielding complementary information on intensity, frequency or lifespan of extreme conditions. In general, ETCCDI and other HW indicators employed in the literature (single extreme indices in Figure 1) can be classified in two groups: (a) indices representing specific aspects (not necessarily extreme) of the temperature distribution and (b) those considering specific properties of the tail of the distribution. The main difference resides in the way thresholds are defined (absolute or relative values). The first group includes absolute indices. They use maximum values or exceedances above absolute thresholds over a predefined period (e.g., an individual season) during which HWs may be absent or extend beyond that period (e.g., Perkins & Alexander, 2013; S. Russo et al., 2015). These indices are often easy to understand (e.g., the monthly maximum of TX, TXx; Table 1), but do not necessarily capture the tail of the distribution and must be interpreted within the context of local climatological conditions. Moreover, they can saturate in a warming climate (Dunn et al., 2020), particularly if moderate thresholds are employed (e.g., the frequency of summer days with TX above 25°C; Table 1). Many of these indices are employed as bioclimatic indicators for risk assessments in human health or ecosystems that function under certain temperature ranges (e.g., Perkins, 2015; X. Zhang et al., 2011).
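As a minimal illustration of absolute indices, the following sketch computes the maximum of TX over a period (TXx) and the number of summer days (TX above 25°C), and shows how a moderate threshold can saturate when the whole series is shifted upward. The function names, the sample values and the uniform 3°C shift are hypothetical choices made for illustration only.

```python
def txx(tx):
    """Maximum of daily maximum temperature (TXx) over the given period."""
    return max(tx)


def summer_days(tx, threshold=25.0):
    """Number of days with TX strictly above an absolute threshold (deg C)."""
    return sum(1 for t in tx if t > threshold)


# Hypothetical daily TX values (deg C) for a ten-day period
tx_sample = [22.1, 24.8, 25.0, 26.3, 30.2, 31.5, 28.4, 24.9, 25.1, 27.0]

print(txx(tx_sample))          # 31.5
print(summer_days(tx_sample))  # 6

# Idealized uniform warming of 3 deg C: every day now exceeds 25 deg C,
# so the index reaches its upper bound (the length of the period).
tx_warmer = [t + 3.0 for t in tx_sample]
print(summer_days(tx_warmer))  # 10
```

Once the count equals the number of days in the period, further warming no longer changes the index, which is the saturation effect mentioned above.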
The second group focuses on extreme values (percentile indices) as a way to infer the frequency, persistence or intensity of extreme conditions. We include in this category spell indices (a sequence of hot days) based on percentiles (e.g., the warm spell duration index; Table 1). Percentile indices (e.g., number of days with TX above the 90th percentile, TX90p) use temperature thresholds relative to the local climate, which allows comparison across locations and times of the year and makes them common in the climate community (e.g., Dunn et al., 2020). Percentile-based thresholds are typically fixed in time or allowed to vary seasonally, depending on whether one is interested in warm-season or all-year extremes. However, they also involve some degree of subjectivity, such as the choice of the percentile, the sample used to compute that threshold (e.g., calendar day, centered windows of different length, the entire seasonal or annual distribution) or the reference period. A relatively low threshold would include warm (but not necessarily extreme) events, while higher thresholds require a sufficiently large sample size for robust estimation. Furthermore, limited sampling hampers the reliability of threshold estimates and may induce artificial discontinuities in the HW index at the beginning and end of the reference period (X. C. Zhang et al., 2005). These choices, as well as the ways of quantifying changes in extreme indices, are not straightforward, particularly in the presence of long-term trends in the underlying variables, as is the case of the present-day climate. For example, HW indicators based on standardized data relative to the local mean and variability can lead to overestimated exceedance rates outside the reference period (Sippel et al., 2015).
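The choices listed above (percentile, sampling window and reference period) can be made explicit in a short sketch. The code below estimates a calendar-day percentile threshold from a centered window pooled over the reference years and counts the exceedance days in a given year. It is a simplified illustration under stated assumptions, not the ETCCDI implementation: the window wraps around within each year, no correction is applied for discontinuities at the edges of the reference period, and all names and the synthetic data are hypothetical.

```python
import math
import random


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def calendar_day_thresholds(tx_by_year, ref_years, q=90.0, half_window=7, ndays=365):
    """One threshold per calendar day, pooling a centered window over ref_years.

    Simplification: the window wraps around within the same year.
    """
    thresholds = []
    for d in range(ndays):
        pool = [tx_by_year[y][(d + k) % ndays]
                for y in ref_years
                for k in range(-half_window, half_window + 1)]
        thresholds.append(percentile(pool, q))
    return thresholds


def exceedance_days(tx, thresholds):
    """Number of days in one year with TX above its calendar-day threshold."""
    return sum(1 for t, thr in zip(tx, thresholds) if t > thr)


# Synthetic daily TX (hypothetical): seasonal cycle, noise and a warming trend
random.seed(0)
tx_by_year = {
    y: [15 + 10 * math.sin(2 * math.pi * (d - 105) / 365)
        + random.gauss(0, 3) + 0.03 * (y - 1961)
        for d in range(365)]
    for y in range(1961, 2021)
}
thr = calendar_day_thresholds(tx_by_year, ref_years=range(1961, 1991))
print(exceedance_days(tx_by_year[1961], thr), exceedance_days(tx_by_year[2020], thr))
```

Changing `q`, `half_window` or `ref_years` in this sketch is a quick way to see how sensitive the exceedance counts are to each methodological choice.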
Overall, single extreme indices only measure one specific aspect of the phenomenon and hence they do not represent the multi-dimensional nature of HWs (R. M. Horton et al., 2016; Perkins, 2015). This has set the ground to define more complex indices that consider HWs as events and describe their characteristics, either separated or combined (event indices in Figure 1). Important HW event attributes considered in the literature include the frequency of events, maximum duration, and daily peak intensity, and less often other characteristics, such as timing of occurrence or areal extent (Sánchez-Benítez et al., 2018; Schoetter et al., 2015; Stéfanon et al., 2012). The selected attributes and associated thresholds are subject to a certain degree of arbitrariness, sometimes with questionable ability to compare events across regions and/or periods (Perkins & Alexander, 2013; S. Russo et al., 2015). These indices can summarize multiple HW event characteristics in a single metric. For instance, the Heat Wave Magnitude Index daily (HWMId, S. Russo et al., 2015; Table 1) characterizes the annual maximum magnitude of HW events with at least 3 days above the calendar-day 90th percentile of TX, where the magnitude is the sum of normalized TX exceedances (in interquartile range units) over the HW duration.
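The event-based logic described above can be sketched as follows: detect runs of at least 3 consecutive days above a daily threshold, sum a normalized daily magnitude over each run, and keep the largest value of the year. This is a hedged illustration in the spirit of HWMId rather than the reference implementation. The quartiles `t25` and `t75` used for normalization are taken as given inputs (the sample they are estimated from is an assumption left to the user), and all function names are hypothetical.

```python
def heatwave_events(tx, thresholds, min_length=3):
    """Return (start, end) index pairs (end exclusive) of runs with TX above
    the daily threshold lasting at least min_length days."""
    events, start = [], None
    for i, (t, thr) in enumerate(zip(tx, thresholds)):
        if t > thr:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                events.append((start, i))
            start = None
    # Close a run that reaches the end of the series
    if start is not None and len(tx) - start >= min_length:
        events.append((start, len(tx)))
    return events


def event_magnitude(tx, start, end, t25, t75):
    """Sum of daily exceedances over t25, expressed in interquartile range units."""
    iqr = t75 - t25
    return sum(max(0.0, (t - t25) / iqr) for t in tx[start:end])


def annual_max_magnitude(tx, thresholds, t25, t75, min_length=3):
    """Largest event magnitude in the series (0.0 if no event is found)."""
    events = heatwave_events(tx, thresholds, min_length)
    return max((event_magnitude(tx, s, e, t25, t75) for s, e in events), default=0.0)


# Hypothetical example: one 3-day event and one 2-day run that is too short
tx = [30.0, 36.0, 37.0, 38.0, 31.0, 36.0, 36.0, 30.0]
thresholds = [35.0] * len(tx)
print(heatwave_events(tx, thresholds))                           # [(1, 4)]
print(annual_max_magnitude(tx, thresholds, t25=30.0, t75=34.0))  # 5.25
```

In this example the event magnitude is (6 + 7 + 8) / 4 = 5.25, since the interquartile range is 4°C and the three event days exceed `t25` by 6, 7 and 8°C.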
The aforementioned indices are defined at the grid-cell level, therefore assuming HWs as local phenomena and disregarding the conditions in adjacent regions. Barriopedro et al. (2011) coined the term "mega-HW" to refer to outstanding long-lasting (≥7 days) events affecting large areas (≥10⁶ km²), which was further popularized in subsequent studies (e.g., Fischer, 2014; Miralles et al., 2014). Although this definition is ambiguous, it has motivated the assessment of the spatial extension of HWs and their spatio-temporal evolution by using novel approaches that capture the multi-dimensional nature of HWs (Sánchez-Benítez et al., 2020; Schumacher et al., 2019; Stéfanon et al., 2012; Thompson et al., 2022; Vogel et al., 2020). As HWs develop in space and time, and the associated impacts often depend on their spatio-temporal extent, this perspective provides a more complete structural description. For example, it has proven useful when analyzing HWs at synoptic and larger spatial scales (e.g., M. McCarthy et al., 2019; Sánchez-Benítez et al., 2018), as well as the projected changes in their attributes (Lyon et al., 2019; Vogel et al., 2020). The most documented mega-HWs are the August 2003 event that affected western and central Europe, causing ∼70,000 heat-related fatalities and US$10 billion of economic losses (e.g., García-Herrera et al., 2010), and the July-August 2010 episode of eastern Europe and western Russia, with ∼50,000 fatalities and US$15 billion costs (e.g., Barriopedro et al., 2011). The term has also been employed for recent HWs in other regions, such as the May-June 2015 episode in India (Ratnam et al., 2016) or the 2013/2014 episode in Brazil (Geirinhas et al., 2022).
HW indices require data sets of at least daily resolution with good temporal coverage to retrieve robust estimates of the underlying distributions (e.g., typically a period of 30 years or more for climate analyses). Due to limitations in data availability, the provision of operational global products has been restricted to relatively simple HW indices based on TX, daily minimum or mean temperature (Perkins, 2015; X. Zhang et al., 2011). Gridded products of ETCCDI indices based on station measurements are available from the Hadley Extremes database (HadEX3) for 1901-2018 (Dunn et al., 2020). The Global Historical Climatology Network-Daily Extremes database (GHCNDEX; Donat et al., 2013) also provides a near-real time collection of ETCCDI indices, although with lower spatial coverage and fewer quality tests. These data sets represent a compromise between data availability and quality. Despite continuous improvements, they are still sparse and/or have insufficient temporal coverage in some regions of Africa, South America, continental Asia, Indonesia and in the polar regions. Spatial-scale mismatches and unreliable estimates in regions and periods that are poorly constrained by observations affect the performance of these data sets (Sheridan et al., 2020). In particular, observational uncertainty remains large in areas of complex topography such as the Himalayas, central Andes, northern Africa or Greenland (Di Luca, Pitman, et al., 2020). Therefore, improvements are still required in data recovery, digitization and methodologies (quality control, gridding and interpolation) that preserve the spatial details in coastal and elevated regions (e.g., Dunn et al., 2020). New methodological approaches have allowed the development of global observational products of daily temperature at high resolution (e.g., Berkeley Earth, Rohde et al., 2013) that are very promising for regional and global assessments of HWs since at least 1950 (Perkins-Kirkpatrick & Lewis, 2020).
Due to the development of observational-based products and the large array of HW impacts, indices have been further refined to develop definitions tailored to specific impact sectors, often with particular emphasis on human health (e.g., McGregor et al., 2015; Mora et al., 2017; Z. Xu, Cheng, et al., 2018). Some impact-oriented HW definitions only require temperature data (e.g., the cooling degree days, Table 1, often employed as a proxy of electricity demand during HWs; Spinoni et al., 2015). Others further incorporate the event-based perspective of HWs. For example, the Excess Heat Factor (EHF; Table 1) combines two indices accounting for acclimatization and temperature exceedances over a percentile to derive HW events, defined as positive EHF values over three consecutive days (Nairn & Fawcett, 2014; Perkins & Alexander, 2013). However, heat-related impacts are more often described by multivariate indices that combine temperature with other variables (e.g., humidity, radiation and/or wind speed; Figure 1). Multivariate indices have been commonly employed in human health (e.g., Buzan & Huber, 2020). Heat stress is often diagnosed by some measurement of thermal discomfort that, in some cases, can be inferred from observational products and reanalyses. An example is the Universal Thermal Climate Index, which builds upon budget analyses of heat exchanges in energy balance models of the human body to derive an equivalent temperature that would cause the same human response as the actual environment (e.g., Fiala et al., 2012; Table 1). There also exist simplified versions of the human perceived temperature. For instance, the Heat Index provides a measure of apparent temperature based on non-linear functions of temperature and humidity (Steadman, 1979; Table 1). Similarly, HWMId has been adapted to humid HWs by replacing TX by the apparent temperature (S. Russo et al., 2017).
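To make the two-component structure of the EHF concrete, the sketch below uses a commonly cited formulation that is stated here as an assumption, since the text does not spell out the formula: a significance term (3-day mean of daily mean temperature minus the 95th percentile of the reference climatology) multiplied by an acclimatization term (3-day mean minus the mean of the previous 30 days, bounded below by 1). The function name, argument names and sample values are hypothetical.

```python
def excess_heat_factor(tm, t95, i):
    """EHF for the 3-day period starting at index i of a daily mean temperature
    series tm (assumed formulation; requires i >= 30 and i + 2 < len(tm)).

    t95 is the 95th percentile of daily mean temperature over a reference
    period, supplied by the caller.
    """
    if i < 30 or i + 2 >= len(tm):
        raise ValueError("need 30 preceding days and a full 3-day window")
    three_day_mean = sum(tm[i:i + 3]) / 3.0
    previous_30_mean = sum(tm[i - 30:i]) / 30.0
    ehi_sig = three_day_mean - t95                # excess over the climatological extreme
    ehi_accl = three_day_mean - previous_30_mean  # excess over recent (acclimatized) conditions
    return ehi_sig * max(1.0, ehi_accl)


# Hypothetical series: 30 mild days followed by an abrupt 3-day hot spell
tm = [20.0] * 30 + [30.0, 30.0, 30.0]
print(excess_heat_factor(tm, t95=26.0, i=30))  # 40.0
```

Here the significance term is 4°C and the acclimatization term is 10°C, so the abrupt onset amplifies the index by a factor of ten. With a gradual onset the acclimatization term would stay below 1 and the EHF would reduce to the significance term alone.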
Nonetheless, the heterogeneous distribution of observations and limited availability of high-quality data hamper both the development of impact-oriented indices based on variables other than temperature and robust estimates of multivariate distributions (R. M. Horton et al., 2016). Moreover, impact-oriented indices (e.g., forest fires, agriculture, infrastructure performance, physiological stress or energy demand and production) can involve complex and local-dependent heat-impact relationships, compromising their [...] None of the ETSCI indices considers exposure or vulnerability, but rather user-defined thresholds that can be accommodated to the specificities of the system (e.g., critical physiological limits). Furthermore, time-varying thresholds are often ignored in impact-oriented HW indices. However, this is a relevant issue because adaptation to climate change involves changes in the perception of HW occurrence and associated risks. To illustrate this conceptually, we use the Berkeley Earth data set and a simple percentile-based HW index (see Appendix A) to quantify the perceived occurrence of hot days under three idealized adaptation scenarios (Figure 2). These are meant to represent hypothetical systems with high (blue), low (red), and critical (yellow) adaptive capacities to the observed global warming. Scenarios are constructed by using different reference periods through the analyzed period that account for the mean climate (the "norm") perceived at any time (e.g., Vogel et al., 2020). They yield time-varying HW thresholds, which are interpreted as the critical levels above which harmful impacts would occur. That way, a slowly-varying threshold indicates little adaptation to climate change (red curve in Figure 2a), whereas systems with fast adaptive capacities tend to follow the observed changes in the mean climate (blue curve).
Given the observed warming, and assuming no changes in exposure, adaptation has strong implications for the witnessed HW occurrence (Figure 2b): high adaptation yields smaller frequencies in hazardous days than low adaptation or tipping point scenarios, meaning a substantial reduction in the associated risk. This poses additional challenges to HW definition: accounting for acclimatization in the definition of HWs requires knowledge on the adaptation time scales, tolerance ranges and critical limits of the specific system at hand.
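The effect of the reference period on the perceived frequency of hot days can be illustrated with a minimal sketch. It is not the procedure of Appendix A: the synthetic data, the linear warming trend and the two reference centers (1965 for a slowly adapting system, 1990 for a faster one) are hypothetical choices. Hot days are defined, as in the text, as days with TX above the 95th percentile of the 31-yr period taken as the norm.

```python
import math
import random


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def hot_day_frequency(tx_by_year, year, ref_center, q=95.0, half_width=15):
    """Fraction of days in `year` with TX above the q-th percentile of all days
    in the 31-yr window centered on `ref_center` (the perceived norm)."""
    pool = [t
            for y in range(ref_center - half_width, ref_center + half_width + 1)
            for t in tx_by_year[y]]
    threshold = percentile(pool, q)
    days = tx_by_year[year]
    return sum(1 for t in days if t > threshold) / len(days)


# Synthetic daily TX for 1950-2020 (hypothetical): seasonal cycle, noise, trend
random.seed(1)
tx_by_year = {
    y: [15 + 10 * math.sin(2 * math.pi * (d - 105) / 365)
        + random.gauss(0, 3) + 0.03 * (y - 1950)
        for d in range(365)]
    for y in range(1950, 2021)
}

slow = hot_day_frequency(tx_by_year, year=2005, ref_center=1965)  # norm: 1950-1980
fast = hot_day_frequency(tx_by_year, year=2005, ref_center=1990)  # norm: 1975-2005
print(f"slow adaptation: {slow:.3f}, fast adaptation: {fast:.3f}")
```

Under a warming trend, the older reference yields a lower threshold and therefore a higher perceived frequency of hot days, which is the qualitative behavior described for the low-adaptation scenario.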
Figure 2. (a) [...] idealized systems with high (blue), low (red), and critical (yellow) adaptive capacities to observed climate change in 1965-2006. For a given year (x-axis), the y-axis illustrates the centered year of the 31-yr period that the system is adapted to. The black diagonal line indicates the case of instantaneous adaptation; (b) Percentage of global land areas (y-axis) experiencing an annual mean frequency of hot days equal to or greater than a given value (x-axis) for the three adaptation scenarios (assuming the same adaptation across the globe). Hot days are those with observed TX above the 95th annual percentile of the 31-yr period perceived as normal at any time, following the adaptation scenarios in (a). Data source: Berkeley Earth data set (Rohde et al., 2013).
In summary, the complexity of HW properties and impacts is possibly the main reason for the large diversity of indices. For temperature-based indices, critical choices concern the use and definition of absolute versus percentile thresholds, followed by other levels of decision such as the duration of the spell. Absolute thresholds allow for easy interpretation, but the choice of thresholds can be challenging due to variations across climatic zones or the targeted system in the case of bioclimatic indicators. Percentile thresholds allow a flexible identification of events across regions but the results are affected by the choice of the reference period (e.g., Dunn et al., 2020). A relatively stationary baseline (e.g., 1961-1990) is often used for the quantification of observed changes (e.g., Seneviratne et al., 2012, 2021), but a more recent period (e.g., 1981-2010) is preferable in terms of data availability and future adaptation applications. In the case of more complex indices, the consideration of variables other than temperature, such as wind speed or humidity, might compromise their reliability due to limited observations or the presence of biases in model simulations.
Ultimately, the availability and quality of data, and the specific application and purpose of the study must guide the choice of the index. This is particularly crucial for impact-oriented indices, which require careful and multidisciplinary examination of the targeted sector. In this case, scrutinizing criteria based on data availability, computational feasibility, representativeness and uncertainty should be employed to assess the relevance of the index.

Physical Drivers of Heat Waves
Although HWs are manifested as local intraseasonal phenomena, they result from large- and small-scale processes that interact in complex ways and at a wide range of temporal scales (Figure 3). Influential factors are the atmospheric circulation, typically considered a fast driver, and anomalous conditions in slowly-varying climate components, which can act as proximate (e.g., land surface) and remote (e.g., upper ocean or sea ice) forcings (Coumou et al., 2018; Domeisen et al., 2023; Hoskins & Woollings, 2015; Miralles et al., 2019; Sillmann et al., 2017, and references therein). Furthermore, global (GHG concentrations) and regional (land-use/land-cover, aerosols) anthropogenic forcings are the dominant factors of long-term trends in the frequency, duration and intensity of HWs (Seneviratne et al., 2021). Based on the scale-dependent structure of Figure 3, these drivers are described below in a remote-to-proximate sequence, proceeding from planetary-to-large-scale (Section 3.1), large-scale-to-regional (Section 3.2), and regional-to-local (Section 3.3) scales. Note, however, that inter-process interactions prevent a complete separation of drivers by spatio-temporal scales. Local processes and anthropogenic forcings will also be considered in more detail in separate sections devoted to urban HWs (Section 5.1) and long-term trends (Section 4), respectively.

Planetary-To-Large-Scale Factors
At multidecadal and longer scales, anthropogenic radiative forcing by well-mixed GHGs is acknowledged as the major driver of HWs due to its influence on the global mean temperature and long-term variations (Seneviratne et al., 2021). Increasing concentrations of GHGs have perturbed the Earth's energy balance at the top of the atmosphere through an enhanced trapping of the long-wave thermal radiation emitted by the planet, resulting in continued global warming, which scales approximately linearly with the cumulative emissions of carbon dioxide (CO2) since preindustrial times. Mean warming alone involves shifts in the temperature distributions toward higher values, and hence a higher likelihood of exceeding fixed temperature thresholds. The spatial fingerprint of anthropogenic GHG forcing features marked regional variations due to cloud, water vapor, vegetation or ice/snow cover processes (e.g., polar amplification by ice/snow-albedo feedbacks), ultimately yielding more complex changes in regional/local temperature distributions (see also Section 3.3). Enhanced GHG forcing can also cause changes in atmospheric circulation, with potentially large effects on HWs (Section 3.2). However, dynamical changes are more often uncertain or weak, sometimes due to opposite contributions of fast (direct radiative effects) and slow (e.g., ocean warming) components of the forced response (Shaw & Voigt, 2015; Shepherd, 2014).
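The statement that mean warming alone raises the likelihood of exceeding a fixed threshold can be quantified with a small worked example. The sketch below assumes, purely for illustration, that temperature anomalies are Gaussian with unchanged variance and that the mean shifts by one standard deviation. Real temperature distributions can depart from both assumptions, as noted above for regional changes.

```python
from statistics import NormalDist


def exceedance_probability(threshold, mean, sd):
    """P(X > threshold) for a Gaussian variable with the given mean and sd."""
    return 1.0 - NormalDist(mean, sd).cdf(threshold)


# Fixed threshold: the 90th percentile of the baseline (standardized) climate
baseline = NormalDist(0.0, 1.0)
threshold = baseline.inv_cdf(0.90)

p_before = exceedance_probability(threshold, mean=0.0, sd=1.0)
p_after = exceedance_probability(threshold, mean=1.0, sd=1.0)  # mean shifted by 1 sd

print(f"threshold = {threshold:.3f} sd")
print(f"P(exceed) before shift = {p_before:.3f}, after shift = {p_after:.3f}")
```

Under these assumptions the exceedance probability rises from 10% to roughly 39%, that is, close to a fourfold increase for a shift of one standard deviation with no change in variability.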
Figure 3. Spatial and temporal scales of characteristic heat wave (HW) drivers. Diagram identifying the characteristic HW drivers and their relevant scales, from planetary to local spatial scale, and from multiannual to multiday temporal scale. Different processes are identified and allocated to specific scales based on a synthesis of previous literature studies. Note that different drivers may interact with each other at multiple scales, and that smaller scale processes are conditioned on larger scale states that may on their own trigger changes in those smaller scale processes. These scales of interaction are greatly simplified in this schematic.
Long-memory internal components of the climate system (e.g., ocean, cryosphere, etc.; Figure 3) also affect HWs on a wide range of temporal scales by promoting atmospheric teleconnections (Section 3.2) and anomalous heat/moisture fluxes that cause remote interactions with regional processes (Section 3.3); see, for example, Stan et al. (2017), Coumou et al. (2018), Wolf et al. (2020), Domeisen et al. (2023). The proposed mechanisms often invoke thermally-forced changes in atmospheric circulation (e.g., shifting, weakening or stalling of the mid-latitude jets, excitation of high-amplitude Rossby wave patterns or atmospheric modes of variability) induced by anomalous diabatic heating or meridional temperature gradients, although process understanding is still incomplete. Recent studies report HW increases over mid-high latitude regions of the Northern Hemisphere (NH) at interannual and longer time scales in response to shrinking summer Arctic sea ice and snow cover (Coumou et al., 2018; Tang et al., 2014). However, two-way interactions with the chaotic nature of the atmosphere make it difficult to separate cause and effect, and can lead to low signal-to-noise ratios and non-stationary responses to these drivers. HWs can also be modulated by large-scale internal modes of variability operating from interannual to decadal scales.
For example, decadal-to-multidecadal variations in warm extremes, including the unabated global increase in HWs during the recent slowdown in global warming (e.g., Johnson et al., 2018), have been partially related to low-frequency modes of variability such as the Pacific Decadal Oscillation (PDO) or the Atlantic Multidecadal Oscillation (AMO) (e.g., Kamae et al., 2014; Kenyon & Hegerl, 2008; S. D. Schubert et al., 2014; Zhou & Wu, 2016). These internal modes of variability might act as pacemakers of low-frequency variations in HWs, although observational evidence is hampered by the short record.
On interannual scales, organized large-scale tropical patterns of sea surface temperature (SST) anomalies can also modulate regional HW characteristics. The local atmospheric response to tropical SST anomalies is thermally driven via deep convection and associated latent heat release, whereas mid-latitude responses are mainly driven by dynamical upper tropospheric adjustments that act as a Rossby wave source (e.g., Trenberth et al., 1998). Several studies have reported measurable effects of El Niño-Southern Oscillation (ENSO) in many regions of the globe (Alexander et al., 2009), and of the tropical Indian Ocean Dipole (IOD) in several areas of Eurasia. For instance, warm phases of ENSO and tropical Indian Ocean warming can enhance HW occurrence in India (Rohini et al., 2016), China (Wei et al., 2020) and the entire Eurasian continent, including Europe and northeast Asia (R. Z. Zhang et al., 2022), whereas cold ENSO phases favor summer HWs in the southeastern United States (US) (e.g., Deng et al., 2018). In the Southern Hemisphere (SH), warm ENSO phases are the dominant SST-related drivers of summer temperature extremes in large areas of Australia. Extratropical SST anomalies also induce barotropic and baroclinic atmospheric responses (e.g., Kushnir et al., 2002) that can source or feed upper-level Rossby waves preceding regional HWs in remote regions. For example, a zonal tripole pattern in Pacific SSTs may provide skillful subseasonal forecasts of hot days in the eastern US (McKinnon et al., 2016b), while European HWs have been associated with spring cold SST anomalies over the North Atlantic (Della-Marta et al., 2007; Duchez et al., 2016). To the extent that these slowly-varying drivers are predictable several months in advance, and their remote teleconnections and lagged responses can be anticipated, they represent opportunities for an improved predictability of HWs beyond weather forecasts (Section 5.3).
However, atmospheric responses to SST forcing are often small compared to internal atmospheric variability (Kushnir et al., 2002), they depend on the background state (e.g., Thomson & Vallis, 2018), and the effects of atmospheric circulation on SSTs may be confounded with SST forcing. At least for some regions, SST anomalies do not seem to represent a sufficient condition or are considered of secondary importance (e.g., the influence of the South Indian Ocean SST dipole on southwestern Australian HWs; Boschat et al., 2016).
Other remote drivers of HWs operate on smaller spatio-temporal scales. For example, during the boreal summer, tropical subseasonal variations are dominated by regional monsoonal systems, such as the Indian and Western North Pacific summer monsoons, and the associated diabatic heating anomalies can drive a considerable fraction of the intraseasonal atmospheric variability over Eurasia and North America, respectively (e.g., Beverley et al., 2021; Ding & Wang, 2005; Stan et al., 2017). There is also emerging evidence of regional responses in hot extremes to intraseasonal variations in tropical convection (Madden-Julian Oscillation, MJO). Studies are mainly restricted to the boreal winter (e.g., Matsueda & Takaya, 2015), when the MJO activity is higher. However, active MJO phases over specific Indo-Pacific regions can also instigate Rossby wave trains in summer, and have recently been related to outstanding HW events (Hsu et al., 2020; Rodrigues et al., 2019), and enhanced summer HW occurrence in regions of both hemispheres (e.g., southwestern US; Y. Y. Lee & Grotjahn, 2019, and southwestern South America; Demortier et al., 2021). Figure 4 summarizes key internal modes of variability associated with warm-season HW occurrence, as inferred from linear correlations between the monthly series of HW day frequency and a selected list of representative indices for some of the drivers mentioned above. Note that local correlations can be small and do not necessarily imply a cause-effect relationship. For comparison, the analysis also includes extratropical modes of atmospheric variability, which have direct effects on the atmospheric circulation, and hence the synoptic patterns associated with HWs. These are the main drivers of HW variations in northern mid- to high latitudes, although this may partially reflect the existing imbalance of atmospheric indices between the NH and SH (see full list in Appendix A).
Tropical drivers may modulate HW frequency as well, with a dominating role over the SH, including regions where HW mechanisms have been less explored. There are extratropical land areas, even in the NH, where low-latitude drivers may exert stronger influences in HW occurrence than extratropical atmospheric modes of variability, stressing the need for further research on tropical-extratropical interactions.
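The kind of correlation analysis described for Figure 4 can be illustrated with a minimal Python sketch. It assumes one grid cell, one climate index and synthetic data; the function name `detrended_correlation`, the linear detrending and the use of SciPy's `pearsonr` are illustrative choices, not the procedure actually used to produce the figure. The p-value ignores serial autocorrelation and the degrees of freedom lost to detrending.

```python
import numpy as np
from scipy import stats


def detrended_correlation(hw_days, index, alpha=0.1):
    """Correlate two series after removing each one's linear trend.

    hw_days: HW day frequency per time step at one grid cell (hypothetical input).
    index:   standardized index of a mode of variability (hypothetical input).
    alpha:   significance level for the two-tailed test.
    """
    hw_days = np.asarray(hw_days, dtype=float)
    index = np.asarray(index, dtype=float)
    t = np.arange(hw_days.size)
    # Remove the least-squares linear trend from each series.
    hw_res = hw_days - np.polyval(np.polyfit(t, hw_days, 1), t)
    ix_res = index - np.polyval(np.polyfit(t, index, 1), t)
    # Pearson correlation with its two-sided p-value.
    r, p = stats.pearsonr(hw_res, ix_res)
    return r, p, bool(p < alpha)


# Synthetic example: a trend plus an index-driven signal plus noise.
rng = np.random.default_rng(0)
n = 120
index = rng.standard_normal(n)
hw_days = 0.05 * np.arange(n) + 2.0 * index + rng.standard_normal(n)
r, p, significant = detrended_correlation(hw_days, index)
```

Repeating this for every index and keeping the one with the largest significant correlation at each grid cell would give a map of the "most influential mode" in the sense of Figure 4. As noted above, such correlations do not establish causality.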

Large-Scale-To-Regional Factors
A key driver of HWs is the atmospheric circulation, which also modulates the degree of influence of other drivers. In mid-latitude regions, HWs can emerge as regional manifestations of quasi-stationary Rossby wave packets (waves with a local maximum amplitude and limited zonal extent) that are meridionally confined by the waveguide effect of the jet (e.g., Teng & Branstator, 2019; Wirth et al., 2018, and references therein). These waveguides allow Rossby waves to propagate zonally and trigger teleconnections across faraway regions, being typical HW drivers in regions affected by summer jets, such as Eurasia (S. D. Schubert et al., 2014), China (P. Wang, Tang, et al., 2017), US (Teng et al., 2013), north-central India (Ratnam et al., 2016), Australia (Risbey et al., 2018) or Brazil (Geirinhas et al., 2018). When these high-amplitude Rossby wave patterns recurrently affect the same regions, they lead to long warm spells and concurrent HWs in different longitudinal locations (Rogers et al., 2022; Röthlisberger et al., 2019). Under favorable background flow conditions, summer waveguides can encircle the hemisphere (circumglobal wave trains, CGTs; Branstator, 2002; Ding & Wang, 2005; Figure 3). Summer CGTs are well documented in the NH, where they can eventually develop large amplitudes and persist over specific geographical locations, synchronizing weather conditions through the hemisphere (e.g., Kornhuber et al., 2019). Several mechanisms have been suggested to cause phase-locking amplification of CGTs, including the presence of double jets or quasi-resonance (constructive interference) of free and forced Rossby waves, with localized topographic forcing affecting the longitudinal structure of CGTs (e.g., Coumou et al., 2014; Jiménez-Esteve et al., 2022; Petoukhov et al., 2013). The involvement of summer CGTs with amplified zonal wavenumbers 5-8 during regional temperature extremes of the NH has been widely reported (Coumou et al., 2014; Kornhuber et al., 2017; S. D. Schubert et al., 2014; Teng et al., 2013). In particular, wavenumbers 5 and 7 show a tendency for preferred phase positions, favoring extremes, sometimes concurrently, in central North America, eastern Europe and eastern Asia for wavenumber 5, and western-central North America, western Europe and western Asia for wavenumber 7 (Figure 5; Branstator & Teng, 2017; Kornhuber et al., 2020). In the SH, summer CGTs manifest regionally over the Indian, Australian, Pacific and South American sectors with typical wavenumbers 4-6, and can precede HWs downstream by the convergence of eastward propagating wave activity flux within the waveguide (O'Kane et al., 2016; Risbey et al., 2018). In spite of that, some studies emphasize that summer waveguides do not tend to be circumglobal (Teng & Branstator, 2019) and that the contribution of CGTs to temperature extremes is not higher than that of non-circumglobal patterns (Fragkoulidis et al., 2018; Röthlisberger et al., 2016).
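To make the notion of zonal wavenumber amplitude and phase concrete, the following Python sketch decomposes a field sampled along one latitude circle with a discrete Fourier transform. It is only an illustration under stated assumptions: the input is taken to be upper-tropospheric meridional wind on a regular longitude grid, the function name `zonal_wavenumber_spectrum` is hypothetical, and the synthetic wave amplitudes are arbitrary. The cited studies may use different variables, latitude bands and filtering.

```python
import numpy as np


def zonal_wavenumber_spectrum(field, max_k=10):
    """Amplitude and phase of zonal wavenumbers 0..max_k on one latitude circle.

    field: values on a regular longitude grid covering 0-360 degrees
           (endpoint excluded), e.g. meridional wind (assumed input).
    """
    field = np.asarray(field, dtype=float)
    n = field.size
    coeffs = np.fft.rfft(field) / n
    # For 1 <= k < n/2 a cosine of amplitude A appears as |coeff| = A / 2.
    amps = 2.0 * np.abs(coeffs)
    amps[0] = np.abs(coeffs[0])  # k = 0 is the zonal mean
    phases = np.angle(coeffs)    # phase phi in A * cos(k * lon + phi)
    return amps[: max_k + 1], phases[: max_k + 1]


# Synthetic example: a dominant wave-7 pattern plus a weaker wave-5 pattern.
lon = np.deg2rad(np.arange(0.0, 360.0, 2.5))
v = 8.0 * np.cos(7 * lon + 0.3) + 3.0 * np.cos(5 * lon)
amps, phases = zonal_wavenumber_spectrum(v)
dominant_k = int(np.argmax(amps[1:])) + 1
```

Tracking the amplitude and phase of a given wavenumber through time is one simple way to look for the amplified, phase-locked episodes discussed above. Wave packets of limited zonal extent project onto several wavenumbers at once, so a single Fourier component can understate them.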
At continental and subcontinental scales, different large-scale and synoptic weather systems can be involved in HWs, depending on the considered region. Much has been learned from specific mega-HW episodes, although results are not always generalizable to other events or regions. Overall, a large body of literature on atmospheric circulation is available for HWs in the northern mid-latitudes and in Australia. More recently, HWs in tropical and polar regions (e.g., [...] et al., 2022) have also started to gather attention because of their high vulnerability and potential environmental risks in ecosystems and the cryosphere, respectively. Figure 6 shows the atmospheric conditions associated with high-impact HWs in Europe, North America, Japan, India, Australia, South America, Siberia and Antarctica. It includes the record-shattering 2022 European mega-HW (Figure 6a) that affected western Europe, where many records were broken during 9-20 July 2022: temperatures surpassed 45°C in Spain and Portugal, and 40°C in France, Germany and, for the first time, the UK. The HW was accompanied by an intense European drought, devastating wildfires, a record-breaking Mediterranean MHW (with local SSTs of ∼30°C) and high death tolls in many European countries, including Spain (>2,000 excess deaths in July), Portugal (>1,000, 7-18 July) and the UK (>2,000, 11-22 July).
Figure 4. Map displaying the most influential mode of interannual variability in heat wave (HW) occurrence. The signal is derived from a linear correlation analysis for the 1979-2021 period using detrended monthly time series of both local HW day frequency and standardized indices representative of the modes of variability. The figure only shows signals that are statistically significant (p < 0.1, two-tailed t-test). Teleconnection patterns (TCP) include the 10 leading extratropical modes of atmospheric variability of the Northern Hemisphere and the Southern Annular Mode (SAM). For interannual modes of sea surface temperature (SST) variability, we consider up to 2-month lagged relationships. They include El Niño-Southern Oscillation (ENSO), Tropical Atlantic (TA) variability and Indian Ocean Dipole, the latter measured by the Dipole Mode Index (DMI). To account for the life cycle and diversity of ENSO patterns, we used indices defined over different regions (EN1+2, 3, 3.4, and 4), retaining the one with the largest signal at each grid-cell. Similarly, TA considers the regional SST indices of the tropical North and South Atlantic. Madden-Julian Oscillation includes monthly indices for each of its 10 phases. MON summarizes the effect of some of the major regional summer monsoons (defined for the months of the corresponding summer season), including the Indian, Western North Pacific, West African and Australian monsoons. Gray shading identifies regions of missing data. See Appendix A for details. For sources, see the Data Availability Statement. Data source: Berkeley Earth data set.
In the extratropics, persistent quasi-stationary high-pressure systems are common drivers of HWs. They often display jet meanders, Rossby wave breaking and disruption of the westerlies, which are distinctive features of blocking (e.g., Kautz et al., 2022;Woollings et al., 2018; Figure 3). Indeed, warm summer extremes and HWs have been related to co-located blocking (e.g., the 2021 North American mega-HW, Figure 6b, and the 2010 Russian mega-HW, Brunner et al., 2018;Chan et al., 2019;Neal et al., 2022;Pfahl & Wernli, 2012;Schaller et al., 2018;Schneidereit et al., 2012;Stéfanon et al., 2012). At comparatively lower latitudes, such as central Europe and some regions of North America, South America, China and Australia, HW-related systems are often better described as meridional extensions of the subtropical belt or subtropical ridges (Risbey et al., 2018;Sousa et al., 2021;P. Wang et al., 2018;Zschenderlein et al., 2019), although these systems can sometimes be confounded with blocking (e.g., the 2022 European HW; Figure 6a Figure 6f; Álvarez et al., 2019). High-pressure systems can also be instigated by regional anomalies in semi-permanent seasonal features, such as the western North Pacific subtropical high (e.g., the 2018 Japanese HW; Figure 6c; Imada et al., 2019, and HWs in southern and eastern China;Freychet et al., 2017;P. Wang et al., 2018) or the South Atlantic convergence zone (southeastern South American HWs; Cerne & Vera, 2011). In other regions at comparatively lower latitudes, such as northern Africa (L. Hu et al., 2019) or India (Ratnam et al., 2016), baroclinic systems characterized by upper-level convergence, low-level circulation anomalies and associated moisture fluxes acquire larger importance. As an example, anomalous low-level northwesterly winds led to reduced moisture flow from the ocean and positive temperature anomalies in India during the 2017 HW ( Figure 6d; Ratnam et al., 2016). 
Recent HWs in polar regions have been linked to a number of mechanisms, in all cases involving high-pressure anomalies at high latitudes (e.g., the 2020 Siberian HW; Figure 6g, and the 2020 Antarctic Peninsula HW; Figure 6h); the resulting impacts on the cryosphere (permafrost thaw, melting of snow/ice, etc.) were further aggravated by a jet-induced decrease of cold Arctic air intrusions to Siberia in winter-spring (Overland & Wang, 2021), and by moist air masses from the SH ocean and topographic effects in Antarctica (González-Herrero et al., 2022;M. Xu et al., 2021).
For atmospheric systems linked to HWs, there is limited dynamical understanding of the factors determining their onset and maintenance. For example, the activation of CGTs can occur by internal atmospheric dynamics but also by local or remote diabatic heating anomalies, such as soil moisture deficits over large continental regions or tropical and extratropical SST anomalies (e.g., Beverley et al., 2021; McKinnon et al., 2016b; O'Reilly et al., 2018; Teng & Branstator, 2019). Similarly, there is no comprehensive theory for blocking onset and sustainment (e.g., Woollings et al., 2018). In spite of this, there have been recent advances such as the identification of diabatic influences on the formation and maintenance of upper-tropospheric anticyclones associated with HWs. Cloud-diabatic processes and latent heat release by condensation of water vapor in moist ascending air streams can drive atmospheric circulation anomalies and fuel them with warm air masses (Quinting & Reeder, 2017; Schumacher, Hauser, & Seneviratne, 2022; Steinfeld & Pfahl, 2019). In turn, these high-pressure systems set up the favorable conditions for the increase and maintenance of high temperatures during HWs. The associated mechanisms involve horizontal temperature advection, adiabatic warming by subsidence and diabatic processes, either radiative (e.g., enhanced shortwave radiation by reduced cloudiness) or non-radiative (sensible and latent heat) (e.g., Zschenderlein et al., 2020). However, the relative contribution of these processes to the build-up of HWs remains poorly quantified. In some regions, HWs are frequently driven by the advection of warm stable air masses from nearby land areas (e.g., Saharan intrusions in Iberia; Sousa et al., 2019).
Warm advection has been pointed out as a relevant factor for HWs in central and eastern Europe (e.g., Miralles et al., 2014; Schumacher et al., 2019), Australia (e.g., Boschat et al., 2015) or southeastern South America (e.g., Rusticucci, 2012). Nevertheless, different formulations of horizontal temperature advection may yield contrasting results on the influence of this process, which has been regarded as either a major (Schumacher et al., 2019) or a minor (Zschenderlein et al., 2019) contributor to HWs. Land-sea temperature contrasts also affect heat and moisture transport, which are important contributors to humid HWs and nighttime temperatures in coastal regions (e.g., northern South America, eastern US and eastern Asia; Freychet et al., 2017). Other studies have rather emphasized the dominant role of adiabatic warming (e.g., in Iberia or China; Santos et al., 2015; W. Wang et al., 2016). Recent events such as the 2021 North American HW have also been linked to upwind latent heating and the occurrence of landfalling atmospheric rivers (long narrow corridors of high water vapor transport), which can lead to additional warming through sensible heat convergence and water vapor-temperature feedbacks (e.g., Mo et al., 2022; Schumacher, Hauser, & Seneviratne, 2022). The co-occurrence of high temperatures and atmospheric rivers has also been identified as a threat to the ice-shelf stability in polar regions (e.g., Clem et al., 2022). This suggests that the contribution of processes can substantially vary over small areas and/or across HW events, as found for other regions (e.g., central Africa and Australia; L. Hu et al., 2019; Hirsch & King, 2020).
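For reference, the three groups of processes named above (horizontal advection, adiabatic warming by subsidence, and diabatic heating) correspond to terms of the standard thermodynamic energy equation in pressure coordinates. The textbook form is given below in its Lagrangian (following an air parcel) and Eulerian (at a fixed location) versions; it is not necessarily the exact formulation adopted in the cited studies. Here T is temperature, p pressure, ω = Dp/Dt the vertical velocity in pressure coordinates (positive for descent), κ = R/c_p ≈ 0.286, v_h the horizontal wind and Q the diabatic heating rate per unit mass.

```latex
\frac{DT}{Dt} = \frac{\kappa T \omega}{p} + \frac{Q}{c_p},
\qquad
\frac{\partial T}{\partial t} =
  \underbrace{-\,\mathbf{v}_h \cdot \nabla_p T}_{\text{horizontal advection}}
  + \underbrace{\omega \left( \frac{\kappa T}{p} - \frac{\partial T}{\partial p} \right)}_{\text{vertical motion (subsidence warming)}}
  + \underbrace{\frac{Q}{c_p}}_{\text{diabatic heating}}
```

Horizontal advection appears explicitly only in the Eulerian form. Following a parcel, temperature changes arise solely from vertical motion and diabatic heating, so the two perspectives partition the same warming differently.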

Regional-To-Local Factors
Although persistent high-pressure systems are a necessary ingredient for the development of HWs, regional factors can modulate their onset and evolution on a wide range of temporal scales. Anthropogenic aerosols (suspended solid or liquid particles emitted directly to the atmosphere or produced in the atmosphere from precursor gasses, such as nitrates and sulfates; Figure 3) increased globally after the 1950s, stabilized thereafter, and are expected to decrease worldwide in response to air quality controls (Arias et al., 2021). Because of their relatively short lifetimes, aerosol concentrations show strong spatial variations and have regional effects on climate. Aerosols interfere with climate through radiative effects as well as cloud microphysical interactions that alter the properties of clouds, yielding proximate and remote impacts on regional temperature, the hydrological cycle and atmospheric circulation (e.g., Kasoar et al., 2018;Samset et al., 2018;Westervelt et al., 2018). Overall, most anthropogenic aerosols are associated with enhanced scattering, absorption and reflection (cloud brightening) of solar radiation, which causes dimming and net cooling, attenuating the long-term warming due to increased GHGs (Arias et al., 2021, and references therein). Precipitation responses to aerosol forcing are complex and mediated by dynamical changes, such as southward shifts of the inter-tropical convergence zone by the hemispherically asymmetric cooling or weakening of regional monsoons associated with reduced land-sea thermal contrasts (e.g., L. Liu et al., 2018). Recent and projected reductions in the emissions of anthropogenic aerosol precursors have been linked to accelerated summer warming and more severe warm extremes in some regions of Europe, east Asia and North America (e.g., L. Chen & Dirmeyer, 2019a;Y. Xu, Lamarque, & Sanderson, 2018;A. Zhao et al., 2019). 
Model simulations yield land-averaged annual maximum TX increases of ∼1°C for a complete removal of present-day anthropogenic aerosol emissions. Despite the growing body of literature, both the influence of anthropogenic aerosols on specific characteristics of HW events and their associated atmospheric circulation patterns remain unexplored. On the other hand, although the dominant aerosol cooling can be an important temperature buffer during HWs (Dey et al., 2021), the net effect depends on complex interactions with the longwave radiation in the case of absorbing aerosols, such as black carbon or dust (Mondal et al., 2021; Sicard et al., 2022). In particular, natural aerosols (e.g., dust, biomass burning, etc.) can be present in high concentrations during regional HW events (e.g., Saharan intrusions, HW-wildfires compound events). A case study of the 2010 Russian mega-HW suggests that feedbacks induced by biomass burning aerosols from wildfires could have contributed to local air stagnation, increased atmospheric stability and changes in atmospheric circulation (Baró et al., 2017). However, the radiative and dynamical effects of natural aerosols on HWs have often been overlooked and remain poorly characterized. Other anthropogenic regional forcings with potentially large influences on HWs include land-cover and land-use changes. They influence climate through biogeochemical processes (CO2 uptake, emissions of natural aerosols, methane, and biogenic precursors of both ozone and aerosols) and biophysical effects in the land-atmosphere fluxes of energy and water (Jia et al., 2019). Although re-/afforestation can mitigate global warming due to enhanced CO2 sequestration from the atmosphere, the local surface temperature response is controlled by the balance between radiative and non-radiative effects.
These effects can vary across regions and seasons depending on the change in vegetation cover and the characteristics of the bare soil and background climate (e.g., Mahmood et al., 2014, and references therein). In particular, deforestation can cause radiative cooling by increases in surface albedo, and non-radiative warming from reduced evaporative cooling and turbulent mixing of air. These two competing biophysical effects are in turn influenced by the open land and underlying soil (water availability, snow cover, and associated albedo effects), inducing asymmetric changes in maximum and minimum temperatures, as well as seasonally contrasting temperature responses to forests in temperate and boreal regions (e.g., Y. Li et al., 2015). The non-radiative cooling effect of forest gains tends to dominate annually over the tropics, some areas of the northern latitudes, as well as in summer for regions with enough water availability, attenuating warm extremes (Alkama & Cescatti, 2016;Bright et al., 2017). For instance, a model study found that complete afforestation of cleared areas could lead to a summer TXx cooling of at least 2°C over many regions of central Europe (Strandberg & Kjellström, 2019). On the other hand, changes in land use and extensive land management, such as irrigation and cropland intensification, have been linked to long-term regional cooling of hot summer days in some hotspot regions of North America, Europe and Asia (e.g., L. Chen & Dirmeyer, 2019a;N. D. Mueller, Butler, et al., 2016;Thiery et al., 2020). Under non-limited soil moisture supply conditions, these processes cause non-linear temperature responses (stronger evaporative cooling for warmer extremes), mainly due to enhanced atmospheric evaporative demand. For example, irrigation effects in model simulations yielded a mean cooling of ∼0.8°C in annual maximum TXx over irrigated areas of the globe and the 1981-2010 period (Thiery et al., 2017).
In addition to land forcings, regional land-atmosphere coupling can also exacerbate HWs through soil moisture and vegetation feedbacks that modulate the partitioning of surface energy into sensible and latent heat fluxes (e.g., Miralles et al., 2019, and references therein; Figure 3). In dry soils, evaporation is reduced, and the excess of available energy is compensated for by enhanced sensible heat flux, enabling positive feedbacks between air temperature and soil dryness. This coupling is particularly strong in transition regions where soil moisture can become a limiting factor during summer, such as central regions of Europe and US (e.g., Hirschi et al., 2014). Depleted soil moisture has been shown to increase summer temperature variability and amplify the intensity and likely the duration of HWs in both observational data sets and models of diverse complexity (e.g., Fischer et al., 2007;Hirschi et al., 2014;Lorenz et al., 2010;Miralles et al., 2014;Suarez-Gutierrez et al., 2020;Wehrli et al., 2019), although studies are often biased to European HWs and/or specific events. Antecedent soil moisture deficits have also been reported for HWs in regions of US (Benson & Dirmeyer, 2021), Australia (Herold et al., 2016), India (Rohini et al., 2016), China (J. Y. Zhang & Wu, 2011), Africa (Careto et al., 2018), Brazil (Geirinhas et al., 2022), and Argentina (Coronato et al., 2020), but with a varying degree of detail of the underlying physical processes. Soil moisture depletion and subsequent reduction in evaporative cooling may be caused by accumulated rainfall deficits in the preceding months (B. Mueller & Seneviratne, 2012;Quesada et al., 2012;A. Russo, Gouveia, et al., 2019), enhanced plant transpiration due to early greening in spring (Ma et al., 2016) or increased atmospheric demand (e.g., Vicente-Serrano et al., 2020).
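The soil moisture control on surface energy partitioning can be sketched with a simple conceptual model in Python. It assumes that the evaporative fraction (the share of available energy used for latent heat) rises linearly with soil moisture between a wilting point and a critical value and is constant above it. All names and parameter values (`theta_wilt`, `theta_crit`, `ef_max`, the 400 W m-2 of available energy) are hypothetical choices for illustration and are not taken from the studies cited above.

```python
def evaporative_fraction(theta, theta_wilt=0.10, theta_crit=0.30, ef_max=0.8):
    """Fraction of available energy going into latent heat (illustrative).

    theta: volumetric soil moisture (m3 m-3). Below theta_wilt evaporation
    shuts down; above theta_crit it is limited by energy, not by water.
    """
    if theta <= theta_wilt:
        return 0.0
    if theta >= theta_crit:
        return ef_max
    # Transitional regime: evaporation is limited by soil moisture.
    return ef_max * (theta - theta_wilt) / (theta_crit - theta_wilt)


def partition_surface_energy(available_energy, theta):
    """Split available energy (W m-2) into sensible and latent heat fluxes."""
    latent = evaporative_fraction(theta) * available_energy
    sensible = available_energy - latent
    return sensible, latent


# Same available energy over a wet and a dry soil (hypothetical values).
h_wet, le_wet = partition_surface_energy(400.0, theta=0.30)
h_dry, le_dry = partition_surface_energy(400.0, theta=0.15)
```

With these illustrative numbers, drying the soil from 0.30 to 0.15 shifts most of the available energy from latent to sensible heat. This is the first step of the positive feedback between air temperature and soil dryness described above. The sketch omits the feedback itself (warmer, drier air further depleting the soil) and any vegetation response.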
Our understanding of soil moisture feedbacks remains incomplete; they still represent an important source of uncertainty in future HW projections, and of discrepancy between observations and model simulations (e.g., Dirmeyer et al., 2018; Hirschi et al., 2014). In particular, soil moisture influences that are mediated by transpiration depend on complex vegetation dynamics and physiological responses, with different plant species showing a variety of strategies to cope with high vapor pressure deficit, soil dryness and heat stress (de Kauwe et al., 2015; Kala et al., 2016; Krich et al., 2022; Lemordant et al., 2016; Sippel et al., 2018). For instance, some trees experience higher transpiration rates during HWs when soil moisture is available, which has been reported as an evolutionary strategy against overheating (Drake et al., 2018; Krich et al., 2022). Interactive plant physiology and phenology may also cause potential competing effects in response to increasing GHG concentrations (Figure 7; e.g., Keenan et al., 2013; Skinner et al., 2018; Lemordant & Gentine, 2019). On the one hand, the increase in photosynthetic carbon fixation rates can lead to enhanced biomass, plant transpiration and evaporative cooling (CO2 fertilization effect). On the other hand, stomata can partially close to maintain a near-constant concentration of CO2 inside the leaf, therefore limiting plant transpiration (CO2 physiological effect). The relative importance of these processes varies across species and bioclimatic conditions, and in turn, modulates water use efficiency (WUE) and soil moisture availability, which can yield contrasting responses in warm extremes across seasons. For example, enhanced foliage cover during spring would induce soil moisture deficits in summer, with potential amplification on the occurrence and intensity of HWs (Lian et al., 2020), while too early greening may have limited consequences for summer HWs (Ma et al., 2016).
Similarly, during the warm season CO2 physiological effects can exacerbate HW frequency and intensity in densely vegetated regions (Skinner et al., 2018), but during the growing season the induced soil water savings could partially mitigate HW severity by increasing water availability for summer (Lemordant et al., 2016).
Figure 7. Dominant carbon, energy, and water feedbacks associated with canopy conductance under increasing CO2 concentrations. (a) Stomatal response: during the growing season, the enhanced water use efficiency (WUE) due to CO2-induced stomatal closure decreases latent heat flux by transpiration, allowing higher air temperature and soil moisture (water saving). (b) Fertilization effect: in early summer, the increase in leaf area index (LAI) due to CO2 fertilization (stimulation of biomass production) enhances latent heat fluxes through transpiration and the associated cooling effect, partially compensating some of the transpiration reduction caused by the decrease in stomatal conductance. (c) Water cycle feedback: during a summer HW, potential spring soil moisture savings could decrease the stress of the vegetation and increase transpiration, lowering peak temperatures. Minus (plus) sign reads decrease (increase). Note that these processes vary across species and do not include changes in evaporation due to increased atmospheric evaporative demand, as well as other factors affecting land-atmosphere coupling (e.g., changes in precipitation and the redistribution of atmospheric water vapor by atmospheric circulation).
A remaining gap in the understanding of HWs is the interaction between land and the mesoscale and synoptic circulation, which governs the redistribution of heat and dryness. Dry soils may modify atmospheric circulation through diabatic effects on column expansion, upper level divergence, or intensification of thermal lows (Haarsma et al., 2009; Koster et al., 2016; Sato & Nakamura, 2019; Teng & Branstator, 2019; Vautard et al., 2007; Zampieri et al., 2009). Eventual landscape conversions and management may also affect the aerodynamic roughness and momentum fluxes, causing changes in mesoscale circulation and the boundary layer, particularly if the affected area is large enough (Mahmood et al., 2014, and references therein).
In turn, these mesoscale processes and changes in the atmospheric boundary layer can instigate dynamical land-atmosphere feedbacks, which may result in positive or negative precipitation responses (e.g., Ardilouze et al., 2020; Guillod et al., 2015; Taylor et al., 2012). Although still poorly understood, these processes are important for HWs, since they can modulate soil moisture-temperature feedbacks, therefore amplifying or dampening regional temperature responses (e.g., Stéfanon et al., 2014). In particular, the dynamics of the atmospheric boundary layer may play a key role in the intensification of mega-HWs by allowing a multi-day build-up of heat and additional soil depletion (Miralles et al., 2014; Figure 3). Deep atmospheric boundary layers at daytime modulate the entrainment of warm and dry air, which is stored in residual layers at night and re-enters the boundary layer during the following diurnal cycle. More recent regional studies have shown that these warm air masses heated by surface fluxes can be advected to downwind regions (Quinting & Reeder, 2017; Schumacher et al., 2019; Zschenderlein et al., 2019). This has given rise to the term "teleconnected feedbacks," and may lead to the "self-propagation" of events, whereby transport of diabatic heating from upwind regions experiencing land-atmosphere feedbacks may instigate similar feedbacks and aggravate droughts and HWs in downstream locations (Schumacher et al., 2019; Schumacher, Keune, Dirmeyer, & Miralles, 2022). Remote influences of land-atmosphere coupling may also be mediated by dynamical feedbacks. For example, model ensembles show that soil moisture effects on European HW amplitudes are larger when the troposphere is free to respond than when it is fully constrained, suggesting potential "drier soils-enhanced ridge" feedbacks and non-local HW amplification effects induced by the atmospheric responses to soil moisture deficits (Merrifield et al., 2019).
Similarly, idealized model simulations indicate that atmospheric interactions (e.g., temperature, cloud cover, heat transport) account for a substantial fraction of the warming response to prescribed decreases in surface albedo and evapotranspiration, particularly in the extratropics (Laguë et al., 2019).

Integrating Heat Wave Drivers Across Different Scales
Based on the large-scale-to-local drivers identified in this section, Figure 8 summarizes major HW-related processes and integrates them in a conceptual framework. Although some drivers are common to HWs in different regions, the illustration may partially reflect processes associated with mid-latitude HWs, which are overrepresented in the scientific literature. Atmospheric circulation ultimately determines the onset and maintenance of HWs, but both this driver and the HW characteristics are influenced by initial conditions and interactions with long-memory components such as the land surface, upper ocean and cryosphere. High-pressure systems (1, black anticyclonic vortices in Figure 8; e.g., blocking, subtropical ridge, Rossby wave packet) favor adiabatic warming by subsidence (2, purple arrow), enhanced radiative heating (3, yellow arrow) and warm horizontal advection (4, dark red arrow) toward the region. Warm advection also depends on land-sea temperature contrasts (5, dark red arrow). The resulting near-surface warming forms deep and warm atmospheric boundary layers, which in turn modulate heat entrainment from the free troposphere (6, spiral arrows). HWs can be amplified by local land-atmosphere coupling through soil moisture feedbacks (7, orange loop) involving soil desiccation (brown shading) and enhanced sensible heat fluxes (brown arrow), as well as by teleconnected feedbacks with upwind droughts, which provide additional hot air to the warm advection (8, brown-red arrow). This and other diabatic heating sources (e.g., tropical convection, SST anomalies, and associated latent heat fluxes; 9, blue arrow) can promote Rossby wave trains, CGTs and remote teleconnections (10, colored contours) embedding the HW-related high-pressure system. Latent heat release in moist ascending air streams and atmospheric rivers also provides a source of diabatic heating and intensification of the upper tropospheric anticyclone collocated with the HW (11, light red arrow).
Figure 8. Schematics summarizing some of the large-scale-to-regional drivers of heat waves (HWs) reported in this review. A regional HW (near-surface warming in red shading) is displayed within the blue box on the right, which denotes the spatial domain and the atmospheric boundary layer. Figure excludes external forcings such as greenhouse gases, aerosols and land-use/land-cover changes affecting long-term trends in HWs. Figure not drawn to scale.
Not necessarily all drivers are present during a HW: relevant factors can differ among regions and HW events, or may combine to shape record-shattering events (Fischer et al., 2021). The level of understanding also varies across drivers. For a number of them, opposing effects have been reported (e.g., radiative and non-radiative responses to land-cover changes), and/or the response depends on the forcing properties (e.g., time of the year, vegetation types, aerosol composition, etc.). In other cases, understanding is limited due to the lack of long records needed to characterize the driver's behavior (e.g., low-frequency modes of variability) or because their signals are weak compared to the high internal variability of the atmosphere (e.g., cryosphere). Slow drivers (e.g., SST anomalies) may represent favorable rather than necessary conditions for HW occurrence, and hence their involvement would be conditional on the states of the background flow and/or other drivers, which are yet to be determined. In some regions of US, Europe or southeastern Asia, several HW drivers (e.g., CGTs, land-atmosphere coupling, tropical or extratropical SST anomalies) often act in tandem and add to the effects of other drivers operating at longer scales (e.g., aerosols, land-cover/land-use changes, GHG). However, they have mainly been addressed fragmentarily, with limited cross-understanding and quantification of their relative roles. An integrated view is needed, because slowly-varying factors can instigate the fast drivers of HWs and hence represent a potential source of enhanced predictability (see Section 5.3). Furthermore, driving factors interact at a wide range of spatial and temporal scales, and their dependencies and role in HWs may be non-stationary on a wide range of time scales (e.g., an enhanced influence of soil moisture in future HWs; Suarez-Gutierrez et al., 2020). A comprehensive framework on their interactions across multiple spatio-temporal scales is still lacking. 
In particular, the relative importance of these processes on local and regional scales is not well understood yet and would require integrated assessments of all known drivers (through e.g., factorial experiments; Wehrli et al., 2019). Finally, recent advances have mainly focused on thermodynamic aspects, which contrasts with the relatively poor understanding of dynamical processes (e.g., mesoscale circulation, land-atmosphere dynamical feedbacks and remote effects). This is somewhat paradoxical since the atmospheric circulation is the ultimate driver of HWs and hence should be considered a central element. Although all these challenges may differ in nature and avenues for progress, deciphering the big picture will require coordinated theoretical, observational and modeling developments toward process-based understanding.

Observed Changes and Attribution
Recent studies have gathered additional evidence of anthropogenic influences in the observed trends of temperature extremes at different temporal scales (e.g., Christidis et al., 2015; Diffenbaugh et al., 2017; Dunn et al., 2020; Kim et al., 2016; King et al., 2015). Observations show a global eightfold increase in the occurrence of record-breaking monthly temperatures as compared to those expected in a stationary climate (Robinson et al., 2021). Model simulations indicate that current levels of global warming are already responsible for ∼75% of hot days globally (Fischer & Knutti, 2015). Changes in temperature extremes have accelerated in the last decades at a faster rate than the mean, with overall larger changes in minimum than maximum extreme temperatures, although the trend magnitude varies across seasons, regions, spatio-temporal scales and metrics (e.g., Dunn et al., 2020; Seneviratne et al., 2014). The IPCC concluded that it is virtually certain that there has been an increase in the number of warm days and nights since 1950 globally, and very likely in Europe, Australasia, Asia, and North America at regional scales (Seneviratne et al., 2021). Some of the areas with the largest trends include tropical regions of South America, Northern Africa and the Mediterranean, and Southeast Asia, where the number of warm days and nights has doubled since the 1970s (Dunn et al., 2020). Globally, extreme humid heat has more than doubled in frequency since 1979, and human physiological tolerance limits (wet bulb temperatures of 35°C) have already been exceeded in some coastal subtropical locations of South Asia, the Middle East, and southwest North America. A recurring question is whether the observed trends in temperature extremes are driven by changes in the mean, variability or other higher moments. 
Several studies have emphasized the importance of increasing variability (e.g., Douville et al., 2016;Kodra & Ganguly, 2014), although they are often limited to areas of strong land-atmosphere coupling. Changes in temperature variance vary by region, variable (maximum, minimum or mean temperature) and time scale. However, in many regions, a shift in the mean can explain a large fraction of the observed changes (McKinnon et al., 2016a), and current evidence lends limited support to the hypothesis of global increases in temperature variance (e.g., Huntingford et al., 2013).
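The distinction between mean-driven and variability-driven changes can be illustrated with a minimal sketch. It assumes Gaussian-distributed, standardized temperature anomalies, which is a simplification, and the shift (0.5 standard deviations) and widening (20%) are hypothetical values chosen for illustration only, not estimates from the studies cited above.

```python
from statistics import NormalDist


def exceedance_probability(threshold, mean, std):
    """Probability that a Gaussian temperature anomaly exceeds `threshold`."""
    return 1.0 - NormalDist(mu=mean, sigma=std).cdf(threshold)


# Hypothetical reference climate: standardized anomalies (mean 0, std 1).
reference = NormalDist(mu=0.0, sigma=1.0)
# A "hot day" is defined here as exceeding the reference 95th percentile.
threshold = reference.inv_cdf(0.95)

p_reference = exceedance_probability(threshold, 0.0, 1.0)
p_mean_shift = exceedance_probability(threshold, 0.5, 1.0)  # mean warms by 0.5 std
p_wider = exceedance_probability(threshold, 0.0, 1.2)       # std grows by 20%
p_both = exceedance_probability(threshold, 0.5, 1.2)        # both changes together

for label, p in [
    ("reference", p_reference),
    ("mean shift only", p_mean_shift),
    ("wider variability only", p_wider),
    ("both", p_both),
]:
    print(f"{label:<24s} P(exceed) = {p:.3f}  ratio = {p / p_reference:.2f}")
```

Under these assumed values, the mean shift alone raises the exceedance probability more than the widening alone, which is the kind of comparison used to judge whether a shift in the mean can explain most of an observed change.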
Detection and attribution of extremes are employed to identify externally forced signals in an observed change or event, and quantify the relative contributions of the causal factors (e.g., NAS, 2016; Otto, 2017; Stott et al., 2016). Detection of observed trends in extremes requires robust estimates of internal variability and is thus affected by limitations in data availability, quality, coverage and completeness, which tend to be more acute in low-income countries and remote regions. Attribution further requires physical understanding and multivariate analysis fitting the observed changes to the simulated responses to external forcings. Studies show that anthropogenic activities have made record-hot summers of 1950-2012 10 times more likely in many regions (e.g., the Mediterranean, Sahara and Middle East; B. Mueller, Zhang, & Zwiers, 2016), increasing the severity and probability of unprecedentedly hot months over more than 80% of the monitored land (Diffenbaugh et al., 2017). For events of shorter duration, anthropogenic influences have been identified in the increasing trends of several HW indices, including the intensity and frequency of warm days and nights for all continents (e.g., T. Hu et al., 2020; Kim et al., 2016), as well as the duration and area affected by warm extremes for the NH continents and Australia (Dittus et al., 2016). At these spatial scales, GHG-induced global warming is the dominant factor in the observed changes in temperature extremes (Seneviratne et al., 2021). The tropics, where variability is low, are among the regions where stronger anthropogenic influences and earlier emergence of a climate change signal in temperature extremes would be expected (King et al., 2015). For some continents such as Europe, Asia and North America, the GHG-forced trends in warm extremes have been partially offset by detectable cooling signals due to aerosol increases during 1950-1980 (Seong et al., 2021). 
At regional scales, attribution of trends is challenging due to lower signal-to-noise ratios (i.e., high internal variability), influence of regional forcings (e.g., tropospheric aerosols) and processes (e.g., feedbacks), and poorer model performance in simulating the observed trends and variability (e.g., van Oldenborgh et al., 2022). However, distinguishable GHG signals in temperature extremes already emerge in many subcontinental regions, with aerosol effects being sometimes also detected (e.g., in central Europe or east Asia), particularly for warm extremes (e.g., Seong et al., 2021;Z. Wang, Jiang, et al., 2017). Quantifying the relative contribution of different anthropogenic forcings is still problematic. For example, aerosol and GHG forcings are difficult to separate, arguably due to collinearity in their signals, as well as opposite aerosol trends over the observational period (T. Hu et al., 2020). Recent studies also suggest that deforestation would have exacerbated recent mega-HW events (Strandberg & Kjellström, 2019) and contributed to up to one third of the warming of hot extremes in northern mid-latitudes since pre-industrial times (Lejeune et al., 2018). On the contrary, in some regions of the US, South Asia and Europe, extensive land management practices, such as irrigation expansion and cropland intensification may have dampened the regional increases in hot extremes. For instance, over some of the regions with the largest cropland increases, extreme summer temperatures have cooled at a rate of more than 0.20°C per decade during the 20th century (e.g., US Midwest; N. D. Mueller, Butler, et al., 2016) and model simulations indicate that irrigation may have halved the probabilities of local hot extremes (e.g., South Asia; Thiery et al., 2020). Therefore, on local scales, historical changes in these regional forcings could have comparable or even larger effects than those caused by GHG increases (e.g., L. Chen & Dirmeyer, 2019b). 
However, they have not been considered in formal attribution studies due to observational uncertainties and model discrepancies in the representation of processes and simulated temperature responses to land-use/land-cover changes.
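The separability problem caused by collinear forcing signals can be sketched with a simple diagnostic. The example below is not a formal detection and attribution method; it only computes the variance inflation factor, 1 / (1 - r^2), for two regressors, which measures how much collinearity inflates the uncertainty of regression-based scaling factors. All time series are hypothetical shapes in arbitrary units, not model output.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_inflation(x, y):
    """Variance inflation factor for a regression on two signals x and y."""
    r = pearson(x, y)
    return 1.0 / (1.0 - r * r)


steps = range(31)  # hypothetical 31-year analysis window

# Hypothetical GHG-like response: slightly accelerating warming.
ghg = [0.01 * k + 0.0002 * k * k for k in steps]
# Hypothetical aerosol-like response that nearly mirrors the GHG signal.
aerosol_collinear = [-0.015 * k for k in steps]
# Hypothetical aerosol-like response with a clearly different time shape.
aerosol_distinct = [-0.45 * math.sin(math.pi * k / 30) for k in steps]

vif_collinear = variance_inflation(ghg, aerosol_collinear)
vif_distinct = variance_inflation(ghg, aerosol_distinct)

print(f"VIF, near-mirror signals:  {vif_collinear:.1f}")
print(f"VIF, distinct time shapes: {vif_distinct:.2f}")
```

A large factor means that many combinations of the two scaling factors fit the observations almost equally well, so the individual contributions are poorly constrained even when their sum is well determined.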
Regional trends in extreme temperatures have also been associated with atmospheric circulation changes, which would have contributed to the disproportionate increases in warm extremes over some mid-latitude regions of the NH (e.g., Coumou et al., 2015;D. E. Horton et al., 2015;Rogers et al., 2022;Rousi et al., 2022). Some of the reported summer circulation trends include increasing summer frequencies of anticyclonic circulation over western Asia, eastern North America and Europe (D. E. Horton et al., 2015). Weakened mid-latitude zonal flows and baroclinicity in the NH may lead to more persistent circulation patterns, such as double jet structures in Eurasia (Rousi et al., 2022), as well as increasing occurrences of high-amplitude quasi-stationary Rossby wave patterns (e.g., Coumou et al., 2015;M. H. Lee et al., 2017;Rogers et al., 2022). However, the attribution of dynamical changes is problematic, since they are often not well understood and/or robustly simulated by models (Hoskins & Woollings, 2015;Shaw, 2019;Shepherd, 2014). The detection of robust circulation trends is further complicated by the limited period of observations and large internal variability. In particular, there is no robust compelling evidence of increases in the amplitude or frequency of wavy patterns across the diversity of proposed metrics and definitions (jet meandering, wavenumber amplitudes, amplified stationary Rossby waves; Hoskins & Woollings, 2015, and references therein). Other diagnostics of wave amplification, such as blocking frequency, do not exhibit obvious trends in summer either, except perhaps over Greenland (e.g., Woollings et al., 2018). On the other hand, the hypothesis that the jet has become more conducive to CGTs with certain wavenumbers (Section 3.2) needs to be reconciled with model simulations and theoretical arguments suggesting a reduced expectation of waveguiding for weakened jets under climate change (see e.g., Teng & Branstator, 2019). 
Some of the reported changes in mid-latitude summer circulation, including the weakening of storm tracks, and the increase in frequency, amplitude and/or persistence of quasi-stationary wave patterns, have been linked to weakened equator-to-pole temperature gradients caused by Arctic Amplification (e.g., Coumou et al., 2018, and references therein). However, the atmospheric response to Arctic sea ice loss is not robust in observations and models (Barnes & Screen, 2015; Screen et al., 2018), and there are other potential drivers of circulation changes (Shaw, 2019). They include tropical SST anomalies (Deng et al., 2018; R. M. Horton et al., 2016), the excitation of large-scale modes of variability by regional forcings (e.g., anthropogenic aerosols or widespread land-cover and land-management changes; Mahmood et al., 2014; Westervelt et al., 2018), and other mechanisms without obvious anthropogenic influences (e.g., Arctic diabatic heating by enhanced convergence of moist intrusions; Baggett & Lee, 2019). Therefore, improved understanding of the involved mechanisms is required to add robustness and elucidate whether the observed changes are caused by anthropogenic forcings (Mann et al., 2017) or internal multidecadal variability (e.g., SSTs; Huntingford et al., 2019).
Along with changes in temperature extremes, global increases in warm-season HW event characteristics, namely magnitude (Figure 9c), duration (Figure 9b), and more notably frequency (Figure 9a), have been observed since at least 1950. Some of the largest and most widespread changes occur in the Middle East and some tropical regions, and secondarily in Europe, Australia, eastern Asia or western North America (e.g., Ceccherini et al., 2017; Y. Chen & Li, 2017; Perkins-Kirkpatrick & Lewis, 2020; S. Russo et al., 2015; Vose et al., 2017). HWs have also become more extensive during 1950-2021 (Figure 9d), with the largest changes over the Middle East, northern Africa and Europe. Accordingly, the probability of concurrent HWs across the NH mid-latitudes has experienced a sevenfold increase in the last four decades (Rogers et al., 2022). Although the global increases in frequency, intensity and duration of HWs are virtually certain, confidence varies regionally depending on data availability and amount of gathered evidence (Seneviratne et al., 2021). The lack of long-term trends in some regions (e.g., the "warming hole" in eastern US) could reflect the influence of regional factors (e.g., aerosols, cropland management; see Section 3.3) or internal variability, which can render long-term pauses in regional HW trends or accelerate them, as observed during the recent period of global warming deceleration (Johnson et al., 2018). On the other hand, disproportionate increases in regional HW characteristics and concurrent HWs over parts of Europe, the US or Asia have been linked to regional forcings (e.g., deforestation; Lejeune et al., 2018) and/or atmospheric circulation changes (Mann et al., 2017; Rogers et al., 2022; Rousi et al., 2022), but the relative importance of these factors remains unclear.
The last decades have also witnessed record-breaking HW events and a threefold increase in the percentage of the global land area affected by severe HWs with respect to the early 20th century (Zampieri et al., 2016). Some (if not all) of these years witnessed record-breaking HW events over the considered regions (e.g., the 2010 European mega-HW), often accompanied by droughts (e.g., the 2012 North American HW) and wildfires (e.g., the 2009 Australian HW). As a consequence, HW episodes have become a central focus of event attribution studies (e.g., Stott et al., 2016). As individual extreme events are unique, they cannot be deterministically attributed to climate change, and different probabilistic methods for event attribution have been developed (e.g., Otto, 2017, and references therein). Overall, the risk-based approach estimates how much more likely or intense a predefined class of event (i.e., a group of events similar to the observed one) is in the actual world compared to a hypothetical climate without anthropogenic influences (counterfactual world). In contrast, the conditional or ingredient-based approach provides partial attribution by constraining the climate system to specific states at the time of the event (i.e., a conditional factor such as SSTs or atmospheric circulation). Despite the proven capabilities of current models to generate HW events, the diversity of model physics and biases can lead to substantial differences in attribution statements, and a reasonable simulation of extreme statistics may be insufficient to ensure that the mechanisms are well simulated (e.g., land-atmosphere coupling; Vautard et al., 2019). Observational-based methodologies have also been developed, such as the analog method (e.g., Yiou et al., 2017; Jézéquel et al., 2018, and references therein), which uses historical circulation analogs to infer how dynamically driven events would have unfolded in an "old world" with reduced anthropogenic influences. 
Assuming enough good analogs and long periods, it provides a conditional attribution to long-term changes rather than to anthropogenic factors alone. Fixing the atmospheric circulation emphasizes the influence of thermodynamic changes on the event, which could lead to false positives (by ignoring actual dynamical changes), but also avoids false negatives caused by uncertain circulation responses (Shepherd, 2016;Trenberth et al., 2015). This method also provides the basis for quantifying dynamical and thermodynamic contributions to the changing probabilities of events (Vautard et al., 2016;Yiou et al., 2017), although intertwined influences make this separation challenging (Harrington et al., 2019).
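The risk-based approach described above is commonly summarized with the probability ratio (PR = p1 / p0) and the fraction of attributable risk (FAR = 1 - p0 / p1), where p1 and p0 are the probabilities of the event class in the factual and counterfactual worlds. The following minimal sketch assumes Gaussian distributions with equal variance for both worlds; the temperatures, spread and event threshold are hypothetical values, and operational studies typically rely on extreme value distributions fitted to observations or large model ensembles instead.

```python
from statistics import NormalDist


def attribution_metrics(threshold, factual, counterfactual):
    """Return (probability ratio, FAR, intensity change) for one event class.

    The event class is "seasonal temperature at or above `threshold`".
    """
    p1 = 1.0 - factual.cdf(threshold)         # probability in the actual world
    p0 = 1.0 - counterfactual.cdf(threshold)  # probability without human influence
    probability_ratio = p1 / p0
    far = 1.0 - p0 / p1
    # Magnitude of an equally rare event in the counterfactual world.
    counterfactual_magnitude = counterfactual.inv_cdf(1.0 - p1)
    intensity_change = threshold - counterfactual_magnitude
    return probability_ratio, far, intensity_change


# Hypothetical summer-mean temperatures (degrees C); illustrative values only.
counterfactual = NormalDist(mu=20.0, sigma=1.5)
factual = NormalDist(mu=21.0, sigma=1.5)
threshold = 23.5  # magnitude of the observed event that defines the class

pr, far, intensity_change = attribution_metrics(threshold, factual, counterfactual)
print(f"probability ratio: {pr:.1f}")
print(f"FAR: {far:.2f}")
print(f"intensity change: {intensity_change:.2f} degrees C")
```

In this assumed setting, a 1°C change in intensity corresponds to a several-fold change in probability, which illustrates why statements framed in terms of intensity and statements framed in terms of frequency can leave different impressions of the same event.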
The largest number of extreme event attribution studies (e.g., the BAMS annual special issues "Explaining extreme events from a climate perspective"; Herring et al., 2022; the World Weather Attribution, https://www.worldweatherattribution.org) correspond to temperature extremes, and more often report human influences, including events that would have been almost impossible without anthropogenic forcing. Although human influences are regionally variable, confidence in attribution of HWs is among the strongest of all extreme events, particularly for large and persistent events, thanks to consistent evidence from observations, reasonable model simulation and physical understanding linking events to anthropogenic factors (e.g., NAS, 2016). However, the geographical coverage of events remains patchy and physical processes are often overlooked (Trenberth et al., 2015). Moreover, attribution statements of HWs mostly deal with changes in frequency or intensity, ignoring other relevant attributes (e.g., duration, spatial extent; Sánchez-Benítez et al., 2020), and do not inform on future events, which is relevant for changing risks and decision-making. Finally, the framed question and the lack of objective criteria in defining the event (e.g., spatial and temporal scales) and the experimental design (e.g., data selection, counterfactual estimates, etc.) lead to subjective choices that influence the conclusions, making statements sensitive to the way questions are posed (e.g., Angélil et al., 2018; Kirchmeier-Young et al., 2019; Otto et al., 2016; Risser et al., 2017). For example, the results can depend on whether attribution focuses on changes in the intensity or frequency of the event, because a small warming does not necessarily imply a small change in occurrence (e.g., Seneviratne et al., 2021).
Recent efforts include the development of fast attribution approaches (e.g., Philip et al., 2020) for the deployment of climate services (near real-time operational attribution), which requires coordination of the weather and climate community with communicators toward unified protocols and transparency. There is also a pressing need to incorporate the associated impacts of HWs in a risk-based framework, including the effects of exposure and vulnerability (e.g., Sillmann et al., 2021). The so-called end-to-end attribution is only starting to emerge, as it requires multi-disciplinary research and the involvement of end users, but some examples can be found in the literature. Figure 11 illustrates an attribution study of heat-related mortality in Paris and London for summer 2003 (Mitchell et al., 2016). Using apparent temperatures from a high-resolution regional climate model (RCM) as input for a health impact model based on local exposure-response relationships, it was found that anthropogenic influences increased the risk of heat-related mortality by ∼70% and ∼20% in Paris and London, respectively. Finally, the attribution field would benefit from further development of models and methodologies for event selection, process-based attribution, and communication of changes in hazards and risks that could emerge in the future.

Climate Change Projections
State-of-the-art Global Climate Models (GCMs) participating in the Coupled Model Intercomparison Project (CMIP) capture the observed changes in extreme indices and HWs over the second half of the 20th century, although with some tendency to overestimate the magnitude of trends in hot days (e.g., Fan et al., 2020; Kharin et al., 2013; Seneviratne et al., 2021). The skill of GCMs in representing temperature extremes varies with the season, region, reference data set and extreme index (better representation of absolute and threshold indices in the tropics, and of percentile indices in the extratropics), yielding lower spatial pattern skill scores for duration and percentile indices (e.g., Fan et al., 2020; Kim et al., 2020). The multi-model spread is comparable to that retrieved from observation-based products, and the ensemble mean outperforms any individual model, likely indicating compensation of errors (e.g., Kim et al., 2020). Although model performance in the representation of HWs has improved through model generations, in the last decade (i.e., from CMIP5 to CMIP6) there have been limited additional improvements in terms of mean biases or model spread (Kim et al., 2020; Wehner et al., 2020). Decomposition approaches of the magnitude of daily temperature extremes indicate potential error compensation between different contributing terms, but also the reduction of pervasive warm biases over some mid-latitude land regions due to an improved simulation of synoptic-scale variability (Di Luca, Pitman, et al., 2020). Typical regional errors, including too persistent and/or intense HWs, and overestimated variability in extreme temperatures, have been linked to too strong land-atmosphere feedbacks (e.g., Miralles et al., 2019; van Oldenborgh et al., 2022) or missed processes (e.g., irrigation; Thiery et al., 2017) in GCMs. 
Other important mechanisms related to HW intensification are underrepresented (e.g., feedbacks of surface fluxes on boundary layer properties; Dirmeyer et al., 2018). Ocean-atmosphere coupling can also affect the simulation of HWs and their trends, particularly in some tropical regions and coastal areas of open seas (Gröger et al., 2021). Moreover, the simulated changes in HW-related atmospheric circulation patterns, such as CGTs, are sensitive to the forced patterns of SST responses to anthropogenic forcings (Huntingford et al., 2019), and GCMs tend to underestimate the near-surface impacts of CGTs (e.g., Luo et al., 2022).
Model performance is also affected by the spatial resolution. High-resolution GCMs improve the simulation of temperature extremes. However, the effects of increasing the spatial resolution could be non-linear, being mainly beneficial for GCMs with coarse resolution (Di Luca, Pitman, et al., 2020). Downscaling with RCMs brings better representation of warm extremes in coastal regions or areas of complex topography, but the added value varies regionally, and often depends on the driving GCM, the parameterizations of the RCM, and their structural differences (e.g., resolution, representation of aerosol forcing or coupling; Seneviratne et al., 2021). In some cases, RCMs do not yield obvious improvements, and a good representation can be achieved for the wrong reasons (e.g., compensation of favorable circulation biases for HW occurrence with a wet soil bias or vice versa; Lhotka et al., 2018). In fact, substantial biases and model spread have been reported in RCM simulations driven by reanalysis data (Vautard et al., 2013). Limited improvement has also been reported for the simulation of some HW drivers (e.g., blocking) in high-resolution GCMs and RCMs (Jury et al., 2019). In summary, studies have found a long list of potential causes of model biases such as limited horizontal resolution, the insufficient representation of air-sea interactions, land-atmosphere coupling, land-cover/land-use changes and boundary layer processes, or deficiencies in parameterizations (e.g., radiation, convection, microphysics, land surface schemes, vegetation) affecting surface fluxes, the hydrological cycle and associated feedbacks (e.g., L. Chen & Dirmeyer, 2019b; Dirmeyer et al., 2018; Fischer et al., 2018; Schwingshackl et al., 2019; Stéfanon et al., 2014; Thiery et al., 2017; Vautard et al., 2013). Therefore, there are likely many ways to improve the simulation of temperature extremes, but gains brought in a single model do not guarantee a better performance in others.
It is virtually certain that unabated global warming will cause more frequent hot extremes over most land areas (Seneviratne et al., 2021). These changes will be accompanied by and dominate the projected increases in dry-hot compound events (e.g., Sarhadi et al., 2018; X. Wu et al., 2021). Global warming will also result in longer, more intense, extensive and frequent HW events over most land areas (e.g., S. Russo, Sillmann, et al., 2019; Vogel et al., 2020), with ∼1.5-2 additional HW events per degree of global warming over most regions. Under the Representative Concentration Pathway (RCP) 8.5 scenario, almost every year in 2081-2100 would experience record-breaking HWs, with a global mean exceedance probability of 76% (A. Zhao et al., 2019). Overall, GCM projections indicate an intensification of the patterns found in observations (Figure 9) and historical runs, which are consistent across model generations and associated forcing scenarios (RCPs in CMIP5, and Shared Socioeconomic Pathways, SSPs, in CMIP6; Kharin et al., 2013; C. Li et al., 2021; Wehner, 2020; Figure 12). Irrespective of the emission scenario, the magnitude of warm extremes over most land areas tends to increase linearly with global warming (Figure 12a), albeit with substantial variations across regions (e.g., C. Li et al., 2021) and the HW metric considered. In contrast, the probability of occurrence of unprecedented events (those breaking all previous records by large margins) is pathway-dependent, since it increases with the accelerating rate of global warming (e.g., by a factor of ∼2-3 in RCP8.5 as compared to RCP4.5; Fischer et al., 2021). Furthermore, the projected changes in intensity and probability increase non-linearly with the exceptionality of the event, being larger for more extreme than for weaker events (e.g., Kharin et al., 2018; C. Li et al., 2021). 
Accordingly, a hot day that occurs once every 20 years in the present-day climate would be ∼2.5 times more likely in a 2°C warming world, whereas a 1-in-50 year warm day would experience a five-fold increase. RCM projections confirm the dramatic changes in hotspot areas like the Mediterranean, where the strongest present-day HWs could become almost the norm by the end of the 21st century under an RCP8.5 scenario (e.g., Molina et al., 2020). RCMs can yield discrepancies with GCMs, though. For example, in several European regions, RCMs project weaker summer warming than GCMs, which has been linked to discrepant representations of topography and cloud processes, but also to structural differences in forcings (e.g., time-invariant aerosol forcing and missing CO2 physiological responses of vegetation in many RCMs; Schwingshackl et al., 2019; Boé et al., 2020). The future regional rates of increase in warm extremes are due to the combination of projected changes in mean temperature and variability, but they are overall dominated by the former. 
Figure 12. (a) [...] Box plots show the interquartile range with the median in between, and whiskers denote the 90% uncertainty range; (b) Changes in HW day frequency with respect to present-day (1981-2010 mean) for a 1.5°C global warming; (c) Same as (b) but for 2.0°C; (d) Uncertainty in HW day projections for a 2.0°C warming world, as inferred from the difference between the Global Climate Models (GCMs) with the largest (≥75th percentile) and lowest (≤25th percentile) projected changes in global mean HW day frequency. Hatching in (b-d) denotes areas where differences are not statistically significant (p > 0.05, two-sided t-test). Results are based on historical and SSP3-7.0 simulations from a multi-model ensemble of CMIP6 GCMs. See Appendix A for details.
Like in observations (Section 4.1), the contribution from increased variability is expected to be larger in transitional regions (e.g., Mediterranean, western US, southern Australia or South America; Di Luca, Vogel et al., 2017) and in coastal areas (Holmes et al., 2016); this is partially related to enhanced drying and land-sea contrasts, respectively. However, for the rest of the globe, projected changes in variability are small (e.g., tropics), oppose the mean warming trend (e.g., future decreases in variability at high latitudes), or vary in magnitude and sign across GCMs (e.g., Argüeso et al., 2016; Lewis & King, 2017; Vogel et al., 2020). In agreement with this, the spatial pattern of projected changes in warm extremes depends on the index considered. Absolute indices largely reflect changes in mean temperature, with pronounced increases in tropical regions, where a small shift in the mean leads to large changes in exceedance rates. Percentile indices are sensitive to temperature variability and show large increases in the extratropics, where variability is larger than in the tropics and can be amplified by regional feedbacks (Section 3.3). Similarly, projected increases in HW event properties (e.g., duration) tend to be larger over regions with small temperature variability such as the tropics, whereas middle and high latitudes would experience the largest increase in HW intensity (e.g., Dosio et al., 2018; Harrington et al., 2016), likely amplified by soil moisture feedbacks (e.g., Douville et al., 2016; Vogel et al., 2017).
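Two of the statements above lend themselves to a short worked illustration: changes in probability grow with the rarity of the event, and a given warming produces larger changes in exceedance rates where variability is small. The sketch below assumes Gaussian temperatures and a pure shift in the mean with unchanged variability; the warming (1°C) and standard deviations (1°C and 2°C) are hypothetical. The last lines only convert the probability changes quoted in the text into implied return periods.

```python
from statistics import NormalDist

STANDARD = NormalDist()  # standardized temperature anomalies


def probability_ratio_after_shift(return_period_years, warming, std):
    """Factor by which a present-day 1-in-N-year hot event becomes more likely
    after the mean warms by `warming`, with the variability `std` unchanged."""
    p_now = 1.0 / return_period_years
    threshold = STANDARD.inv_cdf(1.0 - p_now)  # present-day event threshold
    p_future = 1.0 - STANDARD.cdf(threshold - warming / std)
    return p_future / p_now


# Hypothetical region with 2 degrees C standard deviation, warming by 1 degree C.
ratio_20 = probability_ratio_after_shift(20, warming=1.0, std=2.0)
ratio_50 = probability_ratio_after_shift(50, warming=1.0, std=2.0)
# Same warming in a hypothetical low-variability region (1 degree C std).
ratio_20_low_var = probability_ratio_after_shift(20, warming=1.0, std=1.0)

print(f"1-in-20-year event, std 2: x{ratio_20:.1f}")
print(f"1-in-50-year event, std 2: x{ratio_50:.1f}")
print(f"1-in-20-year event, std 1: x{ratio_20_low_var:.1f}")

# Return periods implied by the probability changes quoted in the text:
# a 1-in-20-year day becoming 2.5 times more likely, and a 1-in-50-year
# day becoming five times more likely.
implied_return_periods = {20: 20 / 2.5, 50: 50 / 5.0}
print(implied_return_periods)
```

The Gaussian shift reproduces the qualitative behavior only; the factors obtained this way depend on the assumed distribution and should not be read as the projected values.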
Increases in GHG concentrations account for most of the projected HW changes, which are aggravated by aerosol reductions, mainly over the largest emission regions of the NH (China, Europe, US and India; Y. Xu, Lamarque, & Sanderson, 2018). By the end of this century, global aerosol forcing might counterbalance only 0%-20% of GHG forcing, compared to ∼80% during 1980-1990 (Bauer et al., 2022). However, per degree of global warming, future aerosol reductions may have stronger regional impacts than GHG increases on 21st century HW trends. This response has partially been related to non-linear effects from aerosol-cloud interactions (A. Zhao et al., 2019), which are still poorly represented in models and therefore represent an additional source of uncertainty in HW projections. The role of other external forcings with large regional influences (e.g., land-cover/land-use changes) has been less explored and remains uncertain. CO2-driven changes in vegetation dynamics, such as spring greening (Lian et al., 2020) and reduced summer stomatal conductance (Skinner et al., 2018) may further increase HW frequency, intensity and duration, particularly in transitional and densely vegetated regions. In model simulations with 1% annual increase in CO2, vegetation effects globally accounted for ∼15% of the CO2-induced warming in annual TXx (Lemordant & Gentine, 2019). Nevertheless, vegetation effects can cause opposite responses in extremes (e.g., regional increases in Europe or the Amazon and decreases in North America), depending on the dominant mechanism (e.g., physiological effects, biomass increases, soil moisture feedbacks, Section 3.3). These processes are often missed or not well represented in current GCMs (L. Chen & Dirmeyer, 2019b; N. D. Mueller, Butler, et al., 2016; Seneviratne et al., 2021; Thiery et al., 2020). 
For instance, cooling effects from land-cover/land-use historical changes (e.g., agricultural expansion in Europe, eastern Asia, and central-eastern US) are not robust across models, partially due to differences in the parameterizations of the biogeophysical processes. Furthermore, GCM projections often ignore future irrigation changes, which are uncertain due to increases in WUE and limits in agricultural expansion and water availability. Therefore, an improved quantification of regional changes in HWs will require improvements in land surface models (e.g., representation of phenology, multiple crop types, and transient management).
Other key uncertainties relate to changes in SSTs (e.g., Huntingford et al., 2019), land-sea temperature contrasts (e.g., Baker et al., 2019) and atmospheric circulation (e.g., projected changes in the strength and location of jet streams; Harvey et al., 2020). For example, summer mid-latitude weather patterns and long-lasting warm periods may become more persistent in response to weaker equator-to-pole temperature gradients induced by Arctic Amplification (Kornhuber & Tamarin-Brodsky, 2021;Pfleiderer et al., 2019). However, tropical warming has an opposite effect on meridional temperature gradients, and models disagree on whether their future summer changes will be dominated by Arctic or tropical warming. Moreover, these two global drivers of climate change tend to affect the pole-to-equator temperature gradients at different heights (the lower and upper troposphere, respectively), which is expected to cause different, albeit still unclear, effects on mid-latitude dynamics. In addition to forced responses, internal variability (e.g., multidecadal modes of variability) represents an additional source of uncertainty, particularly at high latitudes (∼25%-50% of the projected spread in warm-season HW days;  and for near-term projections, stressing the need for large ensembles to address regional changes (e.g., Deser et al., 2020). All these uncertainties contribute to widening the spread of regional HW projections, compromising the provision of actionable information for the development of adaptation and mitigation strategies on relevant spatial scales. 
Even for the same level of global warming, there are significant inter-model differences in HW projections, particularly at low latitudes, indicating that HW changes there are particularly sensitive to global warming (Figure 12d). Despite the increased comprehensiveness and realism of current GCMs, the uncertainty of climate projections at regional scales is still large, pointing at pervasive problems in fundamental processes and the need to understand model uncertainty (through e.g., models of diverse complexity or physically-consistent narratives of plausible changes; Collins et al., 2018; Shaw, 2019; Shepherd et al., 2018; Zappa, 2019).
Recent studies have assessed the impacts of the different Paris Agreement targets. An aggravation of HWs is unavoidable as we approach 1.5°C global warming (Hoegh-Guldberg et al., 2018). As compared to present-day, the intensity and frequency of hot extremes will very likely increase at 1.5°C global warming in all populated continents (Seneviratne et al., 2021), with significant increases over large vulnerable areas of Africa, South America and southwestern Asia (Figure 12b; e.g., Dosio et al., 2018; Nangombe et al., 2018). An additional 0.5°C global warming would cause further increases in HW occurrence (Figure 12c), and it could double the frequency of extreme events over the tropics, southwestern US and the Mediterranean. This stresses the importance of stabilizing global warming, which would also cause rapid declines in the probability of record-shattering extremes (Fischer et al., 2021). Near and beyond the bounds of the Paris Agreement, the tropics would approach a permanent HW state as compared to today's standards, and highly populated areas of eastern US, northern Latin America and China would experience unprecedented humid HWs due to combined changes in HW intensity and humidity (S. Russo et al., 2017). With continued global warming, the future human habitability of some low-latitude countries has been questioned, as they will experience uncharted changes in HWs, exacerbated by low levels of development, large population growth and reduced adaptive capacity (e.g., Buzan & Huber, 2020; Harrington et al., 2016; Mishra et al., 2017; Pal & Eltahir, 2016). Recent studies have assessed future changes in population exposure to HWs (Gasparrini et al., 2017; Z. Liu et al., 2017; Mora et al., 2017) to identify regions where adaptation options may be urgent. Within the next two decades, more than half of the world's population is projected to experience summers that will be warmer than the historically warmest summer every 2 years (B. Mueller, Zhang, & Zwiers, 2016).
Under RCP8.5, almost 75% of the population could be exposed to deadly HWs by 2100 (Mora et al., 2017). Reducing global warming from 2°C to 1.5°C would result in a decrease of ∼1.7 billion people exposed to severe HWs, many of them living in developing countries in Africa, Asia and South America (e.g., Dosio et al., 2018). By combining the relative changes in hazard probability, exposure and vulnerability, S. Russo, Sillmann, et al. (2019) estimated that the consequences of a 1.5°C warming in vulnerable regions are similar to those of a 2°C warming in the developed world, highlighting the critical risks for countries of Africa, the Middle East, southeast Asia and Latin America. Transforming climate into actionable information will require additional efforts and interdisciplinary research to account for adaptation and biased risk assessments of HWs (e.g., Ebi et al., 2018).

Urban Heat Waves
Over the last decades, the frequency of hot days and nights has significantly increased in more than half of the urban areas of the globe (Mishra et al., 2015). The exposure to severe heat is projected to soar in the future because of the combination of global warming with urban land expansion and population growth, especially in cities located in already warm regions of Africa and Asia (Klein & Anderegg, 2021). HWs are exacerbated in urban areas because air temperatures can be a few degrees Celsius higher there than in their surroundings. This is referred to as the urban heat island (UHI) effect (Oke, 1982), which results from several factors (Figure 13). The geometry of street canyons not only increases the incoming solar and longwave radiation that are absorbed due to multiple reflections, but also reduces natural ventilation (e.g., Best & Grimmond, 2015; L. Zhao et al., 2018). Moreover, the prevalence of surface materials that retain heat and absorb little water leads to disproportionate sensible heat fluxes (e.g., Ramamurthy et al., 2014). Additional heat is also released into the urban boundary layer via anthropogenic sources, including the use of air conditioning (AC) and other cooling systems during warm periods (e.g., Sailor, 2011). For example, during the 1995 Chicago HW, the UHI aggravated the impacts by raising nocturnal temperatures by 2.0°C-2.5°C in the city, where most fatalities occurred (Kunkel et al., 1996).
Typically, the largest contrast between urban and rural temperatures is observed at night, when the heat absorbed during the day is released into the boundary layer (e.g., Bohnenstengel et al., 2015). Nevertheless, there are multiple modulating factors such as the size and urbanization pattern of the city, and moisture availability. High-resolution modeling and observations show that the UHI is often amplified during HWs in the highly urbanized northeastern US, with differences between urban and rural temperatures as high as 8°C at night and in the early morning; during those events dry and deep boundary layers enable rapid soil moisture desiccation (Ramamurthy et al., 2017), as found by Miralles et al. (2014) over non-urban surfaces (Section 3.3). However, the thermal gradient between urban and rural areas as well as the proximity to large water bodies respectively induce secondary and sea breeze circulations that moderate the UHI effect. Although positive interactions between UHI and HWs have been found for different regions, the results depend on local conditions, with overall higher UHI intensification at nighttime for non-coastal cities (e.g., around 1°C air temperature increase in Beijing and 2°C in Oklahoma; Basara et al., 2010;Jiang et al., 2019) and at daytime for some coastal cities with land-sea breeze effects (e.g., around 1°C in Shanghai and up to 3°C in Athens; Founda & Santamouris, 2017;Jiang et al., 2019). Other studies have assessed urban-rural differences in coarser models with a sufficiently detailed urban land parametrization (e.g., M. P. McCarthy et al., 2010;L. Zhao et al., 2018) and in observations (e.g., Liao et al., 2018) to understand the contribution of urbanization to HWs at continental scales. 
Besides enhanced anthropogenic heat emissions in urban areas, urban-rural differences in evaporation are key contributors to the corresponding temperature differences at daytime, pointing to the importance of soil moisture limitation in the cities. Hydroclimate conditions can also modulate urban-rural contrasts in HW trends, which are larger in the wet than in the dry regions of China due to the smaller variability of UHI intensity in the former (Liao et al., 2018).
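The urban-rural contrasts discussed above can be illustrated with a minimal sketch, assuming paired urban and rural air temperature records and externally supplied HW and day/night flags. The function names (`uhi_intensity`, `uhi_amplification`) and the simple difference of means are illustrative assumptions, not the method of any cited study.

```python
import numpy as np

def uhi_intensity(t_urban, t_rural):
    """UHI intensity as the urban-minus-rural air temperature difference (degC)."""
    return np.asarray(t_urban, dtype=float) - np.asarray(t_rural, dtype=float)

def uhi_amplification(t_urban, t_rural, is_hw, is_night):
    """Mean UHI intensity during HW samples minus non-HW samples,
    computed separately for night and day (hypothetical helper)."""
    uhi = uhi_intensity(t_urban, t_rural)
    is_hw = np.asarray(is_hw, dtype=bool)
    is_night = np.asarray(is_night, dtype=bool)
    result = {}
    for label, period in (("night", is_night), ("day", ~is_night)):
        hw_mean = uhi[period & is_hw].mean()       # UHI under HW conditions
        ref_mean = uhi[period & ~is_hw].mean()     # UHI under background conditions
        result[label] = hw_mean - ref_mean         # positive: UHI amplified during HWs
    return result

# Toy example with four samples (day, night, HW day, HW night)
amp = uhi_amplification(
    t_urban=[30.0, 25.0, 36.0, 31.0],
    t_rural=[29.0, 23.0, 34.0, 27.0],
    is_hw=[False, False, True, True],
    is_night=[False, True, False, True],
)
```

In this toy case the amplification is larger at night than during the day, mimicking the behavior reported for non-coastal cities. Real analyses would also need to control for station siting, elevation and sea breeze effects.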
A review paper by Jay et al. (2021) shows that relying on AC to mitigate urban heat is unsustainable and unaffordable for many of the most vulnerable. Effective cooling strategies to mitigate urban heat have been considered at the landscape and building level, such as improving building energy efficiency and transportation technology, extending urban bluespace (areas dominated by surface waterbodies or watercourses) and greenspace (e.g., parks, green roofs and walls), increasing the albedo of pavements and roofs, and improving roof insulation or pavement porosity (Georgescu et al.). These measures, however, come with trade-offs and limitations (Jay et al., 2021). For instance, the presence of water bodies can reduce the urban thermal load during the day but may also increase it at night, probably due to their high thermal inertia (Ward et al., 2016, and references therein). Statistical relationships of land surface temperatures with land use and landscape pattern metrics indicate that interspersing greenspace into urban patches can have stronger mitigation effects than large green areas in Shanghai (J. Li et al., 2011), while in the case of forested green areas in Berlin, large patches with complex shapes contribute significantly to thermal reduction (Dugord et al., 2014). Nevertheless, different conclusions may emerge from different vegetation types, land-use patterns and climatic conditions (Dugord et al., 2014). Therefore, more research is still required on the appropriate geometry and diversity of both bluespace and greenspace for heat mitigation (Dugord et al., 2014; Gunawardena et al., 2017). Moreover, the cooling effect of urban greenspace can be reversed during HWs (e.g., the 2006 European HW) following drying and changes in albedo of shallow-rooted vegetation (i.e., grass and unforested vegetation; Ward et al., 2016).
Some studies have also concluded that green spaces with a high density of trees are more efficient than grasslands in delivering the cooling effect during the day, most probably because of their high evaporative capacity and shading effect (Grilo et al., 2020;Gunawardena et al., 2017).
Other analyses have stressed the importance of identifying appropriate locations for tree planting and reducing impervious surfaces to effectively lessen UHI effects (Coutts et al., 2016;Ziter et al., 2019). During summertime heat events in Melbourne trees are particularly effective in influencing the street microclimate in open, shallow street canyons (daytime cooling up to 1.5°C), whereas their influence is masked by the shading effects of buildings in narrow, deep canyons (cooling up to 0.9°C) (Coutts et al., 2016). The observed cooling will likely be greater in the future if the tree canopy cover of the city is increased from 22% (in the early 2010s) to 40% (2040 target). Based on summer observations in a midsize city of the Upper Midwest US, Ziter et al. (2019) noted that the greatest potential for daytime cooling (>1.5°C) can be achieved with a limited number of additional trees by increasing tree cover in neighborhoods with ≥40% canopy cover. This occurs because the benefits of tree cover grow rapidly above that threshold. Nevertheless, they emphasized that planting efforts should also take place in areas with low tree cover to consider inequalities in access to green space and societal vulnerability to extreme heat (e.g., low income and lack of AC). The same study found that the effects of canopy were limited at night; therefore, reducing impervious surfaces is critical for lessening nighttime urban heat.
Multi-model projections of urban climates also support an increase in green infrastructure as an effective means of reducing urban heat stress (taking into account the effect of both temperature and humidity) on large scales. The potential evaporative cooling efficiency has been projected to increase more in dry than in wet cities from 2006-2015 to 2091-2100, with a global average increase of 16.6% in JJA under RCP8.5 (L. Zhao et al., 2021). Similar results have been found by D. Li et al. (2019) for North American cities in summer, where spatial variations of daytime UHI intensity are mainly controlled by variations in the capacity of urban and rural areas to evaporate water. For certain cities, urban heat can almost completely be offset by adopting irrigated green roofs, and partly offset by cool (high albedo) roofs (e.g., New York City and Phoenix; Tewari et al., 2019). Conversely, Imran et al. (2018) suggest that cool roofs may be more efficient than green roofs during HWs in Melbourne, where the city-scale maximum near-surface UHI is reduced during the day by 0.60°C-1.50°C by using albedos of 0.50-0.85, and by 0.30°C-1.15°C if green roof fractions of 30%-90% are adopted. Other analyses indicate comparable potential benefits of green and cool roofs to reduce near-surface temperature during hot periods in the US (Krayenhoff et al., 2018; D. Li et al., 2014), although their performance can be limited under very dry conditions and due to dirt accumulation, respectively (D. Li et al., 2014). On the other hand, cool roofs are cheaper and easier to implement while green roofs offer some environmental advantages such as potential improvements in air quality and reduction in storm water runoff (e.g., D. Li et al., 2014).
In addition, green roofs have the advantage that they help maintain limited temperature variation during the whole year, while reflecting roofs might increase the need for domestic heating in high- and mid-latitude regions during winter (Bevilacqua et al., 2017; Masson et al., 2014). Nevertheless, as urban greening reduces temperature by enhancing evaporative cooling at the cost of increasing atmospheric humidity, the effectiveness of this practice to combat urban heat stress should be evaluated with care (L. Zhao et al., 2021).
The potential of combinations of different adaptation measures, applied at the scale of a whole city, is not fully understood, because direct experimentation is not feasible and the speed of changes in the urban environment is comparable to that of climate change (Viguié et al., 2020). Nevertheless, some modeling studies have investigated the benefits of adopting complementary adaptation strategies during periods of extreme heat, such as combining cool roofs, green roofs and street trees (e.g., for the US; Krayenhoff et al., 2018), increasing vegetation cover and roof albedos (e.g., in Berlin; S. Schubert & Grossman-Clarke, 2013), or combining different green structures like trees on curbsides, green roofs and green walls (e.g., for an urban setting in Sri Lanka; Herath et al., 2018). Full implementation of some of these strategies could have larger effects during the afternoon than at night (Krayenhoff et al., 2018; S. Schubert & Grossman-Clarke, 2013). Moreover, they cannot provide full adaptation to climate change without reductions of GHG emissions. As an example, an overall night-time summer warming (1.5-4.5 K under RCP4.5 and 4-7 K for RCP8.5, dependent on the region) persists by the end of the century across the contiguous US with combined adaptation practices (cool and evaporative roofs, and street trees), while some improvements can be achieved during the afternoon (from 2 K cooling in California to 1-2 K warming in the remainder of the US under RCP4.5, and 2-6 K warming with RCP8.5) (Krayenhoff et al., 2018). At the city scale, Viguié et al. (2020) tested a combination of some optimistic measures (10% of the surface of the city devoted to new parks, strict building insulation and widespread use of reflective materials for walls and roofs) for future HWs in Paris under a median GHG emission scenario.
The authors of that study concluded that such actions cannot totally replace the use of AC to ensure thermal comfort during a HW like that of 2003, although they provide significant cooling of the outdoor air (as much as 4.2°C at night) and cancel out the outside air temperature increase due to the heat released by AC systems. Combining the irrigation of parks and gardens in residential areas with occasional pavement watering in the city center has also been proposed as a good solution to reduce heat stress in Paris during a HW of the same magnitude, with a maximum air temperature decrease of 1.1°C during the day in the city center and 2.6°C at night in suburban areas (Daniel et al., 2018). However, temperatures remain very high and the implementation of the first of these two practices may lead to very large water consumption.
Summarizing, HW aggravation in cities remains an important topic of study and the adaptation potential of societies through different measures is to a large degree unknown. The efficacy of such strategies needs to be carefully assessed for different cities and regional climates (e.g., water-limited vs. non-water limited), warranting further research. Additional studies are particularly needed for urban areas in the hot, humid tropics and in the adjacent subtropical regions where population growth is concentrated. Understanding the complex interactions in the urban atmosphere that magnify the effects of HWs will require an integrated approach across different disciplines, with a combination of emerging observational and modeling tools (e.g., Barlow et al., 2017;Best & Grimmond, 2015;Bohnenstengel et al., 2015). These include, among others, remote sensing observations, profile measurements, extensive sensor networks, intensive observation periods at specific sites, crowdsourced data, and high-resolution models of diverse complexity.

Marine Heat Waves
MHWs are periods of spatially coherent extremely warm SSTs, typically defined as days exceeding a given local (and seasonally-varying) percentile of a fixed baseline period (Hobday et al., 2016). They display longer spatio-temporal scales than their land counterparts, often affecting millions of square kilometers for weeks to months. By imparting devastating impacts to marine life (habitats, distributions and populations), primary production and biodiversity, recent MHW episodes have evidenced the vulnerability of marine ecosystems and fisheries (e.g., Hughes et al., 2017; Smale et al., 2019). Accordingly, recent years have seen a large body of literature exploring the influence of local processes and large-scale drivers on MHWs, as well as their historical trends, future changes, and associated predictability (see e.g., Holbrook et al., 2020; Oliver et al., 2021, and references therein).
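The percentile-based definition above can be sketched for a single grid point as follows. This is a minimal illustration under stated assumptions: the function name `detect_mhw_days`, the 90th percentile, the 11-day pooling window and the 5-day minimum duration are choices made for this example rather than values given in the text, and operational implementations typically add threshold smoothing and gap-joining rules.

```python
import numpy as np

def detect_mhw_days(sst, doy, baseline_mask, pct=90.0, half_window=5, min_duration=5):
    """Flag MHW days in a daily SST series at one location (illustrative sketch).

    sst           : 1-D array of daily SST (degC)
    doy           : 1-D int array with the day of year of each sample
    baseline_mask : 1-D bool array selecting the fixed baseline period
    pct, half_window, min_duration : assumed parameters, not taken from the text
    """
    sst = np.asarray(sst, dtype=float)
    doy = np.asarray(doy)
    baseline_mask = np.asarray(baseline_mask, dtype=bool)

    # Seasonally varying threshold: percentile of baseline SSTs pooled over
    # a window of +/- half_window days around each calendar day.
    threshold = np.full(sst.shape, np.nan)
    for d in np.unique(doy):
        dist = np.abs(doy - d)
        dist = np.minimum(dist, 366 - dist)  # approximate wrap across the year end
        pool = baseline_mask & (dist <= half_window)
        if pool.any():
            threshold[doy == d] = np.percentile(sst[pool], pct)

    hot = sst > threshold  # comparisons against NaN thresholds yield False

    # Keep only runs of consecutive hot days lasting at least min_duration.
    mhw = np.zeros(hot.shape, dtype=bool)
    start = None
    for i, flag in enumerate(np.append(hot, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_duration:
                mhw[start:i] = True
            start = None
    return mhw, threshold
```

Because the threshold follows the seasonal cycle, an event is identified relative to what is normal for that time of year, so MHWs can be detected in any season and not only when absolute SSTs peak.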
Land HWs and MHWs share some similarities. MHWs also arise from regional processes interacting with large-scale modes of variability and their teleconnections across a range of spatio-temporal scales. Local factors determine the formation, maintenance and decay of MHWs through changes in the heat budget of the upper-ocean mixed layer associated with, for example, heat transport, air-sea heat fluxes or vertical mixing (e.g., Holbrook et al., 2019; Oliver et al., 2021, and references therein). Typical processes include increased stratification and warm horizontal advection by ocean currents, surface heating by enhanced radiative fluxes or suppressed heat losses from the ocean, and reduced vertical mixing and upwelling of cold deep waters to the surface. The dominant processes and characteristics of MHWs vary regionally, but also with the considered event. Intense MHWs typically occur in regions where SST variability is large (e.g., extratropical western boundary currents) and are often associated with anomalous heat transport, whereas persistent events are more frequent in tropical eastern boundary currents, where temporal autocorrelation is high (Holbrook et al., 2019; Oliver et al., 2021).
Nonetheless, several commonalities have been reported for the most intense MHW events, including a marked decrease in wind-driven ocean heat loss, which may explain their preference to occur in the summer of each hemisphere, when mixed layers are shallow and winds are weak. Overall, MHWs have also become more frequent, persistent, intense and extensive since at least the 1980s. Their frequency has doubled over 1982-2016 and ∼87% of them can be linked to global warming today (Frölicher et al., 2018; Oliver et al., 2018), which is even higher than for land HWs (Section 4.1). With the exception of the Southern Ocean 2016 MHW, most notable individual events have also been attributed to anthropogenic climate change, which has increased the probability of occurrence, duration and intensity by more than 20 times (Laufkötter et al., 2020). The observed increases in MHW properties, in particular frequency, are mainly due to mean ocean warming, which dominates the trends over nearly two thirds of the ocean, although changes in SST variability may be important in mid-latitude western boundary currents (e.g., Oliver, 2019).
GCMs reproduce the broad spatial patterns of MHWs, with an overall tendency to overestimate the persistence and areal extent, and underestimate the frequency and intensity (Frölicher et al., 2018;Plecha & Soares, 2020). As compared to global mean biases, there are regional variations in the magnitude (e.g., larger biases in frequency and duration over middle and high latitudes) and sign (e.g., overestimated intensity over the Southern Ocean, and frequency over equatorial regions) of GCM errors (Plecha & Soares, 2020). GCM biases have been partially attributed to missed small-scale features and underestimated variability as a consequence of the coarse ocean resolution. This may lead to the underrepresentation of MHW drivers, particularly in eddy-rich regions such as the western boundary currents (Pilo et al., 2019). However, there have been limited improvements across model generations (CMIP5 and CMIP6), which share deficiencies, despite the eddy-permitting ocean resolution of some CMIP6 models (Plecha & Soares, 2020).
As found for HWs over land areas (Section 4.2), GCMs consistently project additional increases with continued warming in almost all oceanic regions, as well as non-negligible differences between 1.5°C and 2°C global warming. Despite the lower warming rates of the ocean as compared to land, MHWs would experience larger increases than land HWs, because the distributions of SSTs are narrower than those of air temperature over continental regions. Regional projected changes are not spatially homogeneous, being particularly large in regions of reduced SST variability (tropics and Arctic), and comparatively small in areas with low warming or cooling rates (high latitudes of the North Atlantic and Southern Oceans; Plecha & Soares, 2020). For example, for a global warming level of 2°C, changes in the probability of MHW days exceeding the pre-industrial 99th percentile are up to 60% over the Western Pacific warm pool, 45% in the Arctic Ocean and ∼10% in the Southern Ocean (Frölicher et al., 2018). For high-emission scenarios (SSP5-8.5), highly vulnerable regions of the tropical and Arctic oceans will face permanent states of MHWs that could largely be avoided under low-emission scenarios (SSP1-2.6; Fox-Kemper et al., 2021). Limiting global warming to 1.5°C-2°C would also substantially increase the expected return periods of the most impactful observed MHWs to 5-20 years, allowing some time for recovery (Laufkötter et al., 2020).
As in the case of land HWs (Section 3.1), the occurrence of MHWs has been related to high-pressure atmospheric systems, particularly in the subtropics, and internal modes of variability (Scannell et al., 2016), such as ENSO or IOD, whose fingerprints do not simply reflect the SST signatures of these modes. In fact, regional MHWs can be associated with anomalous occurrence of land HWs over some regions (Figure 14; see also Salinger et al., 2019), encouraging further exploration of their interactions. On interannual and shorter scales, ENSO is the largest global driver of MHWs, with El Niño events increasing the intensity, duration and areal extent of MHWs, both in proximate (the tropical Pacific) and remote regions of the Indian, Atlantic and Southern oceans. Current seasonal forecasting systems are skillful in predicting the occurrence and duration of MHWs up to 4 months in advance over large areas of the global ocean (Jacox et al., 2022). The skill varies depending on the region, season and the state of large-scale climate modes, with predictability lead times of up to 12 months in regions strongly affected by ENSO (e.g., eastern tropical Pacific) and of 2 months for atmospheric-driven MHWs (e.g., over the Mediterranean Sea and western boundary currents). As the occurrence of MHWs is also modulated by low-frequency modes of variability, the background ocean state (e.g., heat content, mixed-layer depth) and ocean circulation, there are also some prospects for potential predictability on longer (decadal) time scales, at least in regions where oceanic processes dominate the dynamics of MHWs (Holbrook et al., 2020). Improved predictions of MHWs will assist effective management and decision-making in adaptation and mitigation strategies. However, risk assessments should also consider ecosystem stressors other than SSTs (e.g., ocean acidification) that may combine with MHWs to yield high-impact compound events (Holbrook et al., 2020).
Current global observational systems allow for near-real-time identification of MHWs, and future research will benefit from new generations of three-dimensional ocean reanalyses, ocean model developments and collaboration with the land HW community.

Subseasonal Forecasts
Subseasonal forecasts (from 2 weeks to 2 months) bridge the gap between weather and seasonal predictions, and are key for the deployment of early warning systems, by providing actionable anticipatory information for efficient risk management (Merz et al., 2020). Predictions of extreme events beyond weather horizons would have potential benefits to guide decision-making in many socioeconomic sectors, including agriculture, energy, health and water resources. Although the atmospheric patterns conducive to HWs are hardly predictable on these time scales, slowly-varying large-scale drivers (Section 3.1) and regional factors (Section 3.3) may improve their predictability by acting as boundary conditions for the atmospheric circulation (e.g., Domeisen et al., 2023). However, current capabilities to predict the onset, duration, location or amplitude of HWs on subseasonal time scales are limited by the chaotic nature of the atmosphere, the small number of robust predictors and the lack of understanding of the involved processes (e.g., Doblas-Reyes et al., 2013; Hoskins, 2013; Merryfield et al., 2020, and references therein). Compared to winter, extratropical summer forecasts are considered more sensitive to regional processes such as land-atmosphere coupling (e.g., Ardilouze et al., 2017), although remote drivers such as ENSO (Deng et al., 2018; O'Reilly et al., 2018), extratropical SSTs (Duchez et al., 2016; McKinnon et al., 2016b; Ossó et al., 2020) or sea ice anomalies (Wolf et al., 2020) may also impart some predictive skill. The MJO and tropical diabatic sources, summer monsoons, terrestrial diabatic heating in the middle latitudes (e.g., continental soil moisture anomalies) and high-latitude long-memory processes (e.g., snow cover, sea ice) are additional potential sources of predictability, since they can excite teleconnections preceding HWs in certain regions (Section 3.2).
When present in the initial conditions, these teleconnections can enhance the predictability of the atmosphere in remote regions up to 3 weeks (Grazzini & Vitart, 2015). Therefore, uncovering and realizing the conditional prediction skill associated with these drivers may open new "windows of opportunity", that is, situations when subseasonal forecasts are more likely to succeed (Mariotti et al., 2020).
There are modeling issues that may prevent capturing the large-scale drivers and/or regional feedbacks of HWs (e.g., Merryfield et al., 2020). Models are often initialized without assimilating potential predictors such as sea ice thickness. Initial conditions of soil moisture and land-atmosphere interactions are often not well represented due to the lack of high-quality operational land products as well as limited development and coupling validation of land surface models (Balsamo et al., 2018; Dirmeyer et al., 2018). Misrepresentation of diabatic sources and processes, likely associated with radiative or convective parameterizations, as well as mean biases in the jet latitude, regional blocking and longitudinal extension of Rossby wave packets can also reduce the predictability (e.g., Beverley et al., 2019; Pante & Knippertz, 2019; Quinting & Vitart, 2019). Different model components are often initialized individually, so that inconsistencies and/or specific parameterizations can lead to rapid adjustments and model drifts, affecting atmospheric circulation forecasts. Finally, choices in post-processing methodologies (model verification metrics, bias correction, etc.) affect the skill scores (Manrique-Suñén et al., 2020), with potential implications for impact models employed in sectoral applications. Multi-model ensembles, in development for operational use, generally outperform individual models (Pegion et al., 2019), although the diversity of protocols hampers faster advances (Merryfield et al., 2020).
Some promising results are starting to emerge, though, including more skillful predictions of the probability of occurrence of weekly-to-monthly mean temperature extremes (compared to climatological forecasts) and of outstanding HW events (e.g., Becker et al., 2017; Domeisen et al., 2022; Tian et al., 2017; Vitart & Robertson, 2018, and references therein). These studies have shown potential for extended-range forecasts of HWs on time scales of 3-4 weeks. For example, the 2010 Russian mega-HW could have been predicted up to three weeks in advance with today's models (Figure 15). Nonetheless, the predictive skill can be regionally and seasonally dependent, with typically highest scores in the tropics, and some NH land areas during late spring and summer. There are also differences in predictability across HW events and a tendency to underestimate their observed amplitudes. Land-atmosphere coupling represents a potential source of enhanced predictability for warm extremes up to 2 months, particularly in transitional regions of Europe, the US or east Asia (e.g., Seo et al., 2019). This is supported by reforecasts of historical conditions, which indicate that European summer warm extremes could be more predictable than the mean temperature on subseasonal scales (Wulff & Domeisen, 2019). Moreover, misrepresentations of land-atmosphere feedbacks or inappropriate soil moisture initializations are deemed responsible for forecast failures (e.g., Ardilouze et al., 2017; Ford et al., 2018; Seo et al., 2019). From the examined studies, it is inferred that, in addition to obvious choices (model, season, region and time horizon), the performance of subseasonal forecasts can also vary with the skill metric, HW event, index or attribute (e.g., duration, intensity, timing and location), suggesting that distinct HW characteristics may exhibit different horizons of predictability.
While temporal and spatial averaging can add some forecast skill, it does not systematically result in higher predictability, and high levels of spatio-temporal aggregation can indeed be detrimental (van Straaten et al., 2020).
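Skill relative to climatological forecasts, as referred to above, is often summarized with a skill score. The following minimal sketch uses the Brier skill score, one common choice for probabilistic forecasts of binary events. It is given here as an illustration only and is not necessarily the metric used in the cited studies. The function names and the toy data are assumptions.

```python
import numpy as np

def brier_score(prob, outcome):
    """Mean squared difference between forecast probabilities and binary outcomes."""
    prob = np.asarray(prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    return np.mean((prob - outcome) ** 2)

def brier_skill_score(prob, outcome, clim_prob):
    """Brier skill score against a constant climatological probability.

    1 indicates a perfect forecast, 0 no improvement over climatology,
    and negative values a forecast worse than climatology.
    """
    outcome = np.asarray(outcome, dtype=float)
    bs = brier_score(prob, outcome)
    bs_ref = brier_score(np.full(outcome.shape, clim_prob), outcome)
    return 1.0 - bs / bs_ref

# Toy example: a warm extreme (1) occurs in 2 of 10 forecast weeks
observed = np.array([1, 0, 0, 0, 1, 0, 0, 0, 0, 0])
forecast = np.array([0.7, 0.1, 0.2, 0.1, 0.6, 0.1, 0.3, 0.1, 0.1, 0.2])
bss = brier_skill_score(forecast, observed, clim_prob=0.2)
```

Because the score depends on the event definition and the reference climatology, values obtained for different HW indices, regions or aggregation windows are not directly comparable, which is consistent with the metric dependence noted above.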
As the capabilities of current dynamical systems to predict HWs beyond 2 weeks may still be insufficient to deploy effective mitigation strategies, recent studies have also explored the potential of statistical methods. By inferring relationships between the target and a set of selected or discovered predictors from climate data sets, statistical models offer computational efficiency and flexibility. The adopted approaches vary in complexity, ranging from simple regression models to machine and deep learning (ML) techniques (Cohen et al., 2019;Runge et al., 2019). Statistical methods alone or in combination with ensemble-forecast systems to select "best members" have been employed for seasonal predictions (Hall et al., 2017;Neddermann et al., 2019, and references therein), sometimes achieving comparable skill to that of dynamical forecasts. As a complementary tool to dynamical models, ML applications have rapidly emerged in the Earth sciences, showing promising results in the representation of unresolved physical processes (e.g., Rasp et al., 2018), the generation of data-driven weather forecasts (Dueben & Bauer, 2018;Scher & Messori, 2021), and potential applications in climate modeling (Reichstein et al., 2019;Schneider et al., 2017). ML methods have also been proposed as an alternative paradigm to extended dynamical predictions (e.g., Cohen et al., 2019). Several statistical and ML models have been developed for the prediction of temperature extremes at different lead times, including weather (Chattopadhyay et al., 2020), subseasonal (e.g., Miller et al., 2021;Vijverberg et al., 2020;Weirich Benet et al., 2023) and seasonal (Kämäräinen et al., 2019;Pyrina et al., 2021;R. Z. Zhang et al., 2022) time scales. They include a diversity of approaches in terms of model complexity, predictors, lags and trained data sets. 
In some cases, statistical models compete with (or are superior to) operational dynamical models in predicting warm extremes 3-4 weeks in advance (e.g., López-Gómez et al., 2022;Miller et al., 2021;Weirich Benet et al., 2023). Improvement has sometimes been achieved with relatively simple methods and few precursors, suggesting different levels of predictability or model sensitivities to the training set (e.g., similar performance of simple and complex models for small data sets). Therefore, there is not a single way to boost the skill of statistical subseasonal forecasts. Strategies include improvements in the signal-to-noise ratio through temporal and/or spatial data aggregation, or refined techniques to optimize the error losses or the limited sample size associated with extreme events (e.g., by using learning from less extreme events; Jacques-Dumas et al., 2022;López-Gómez et al., 2022;Vijverberg et al., 2020). ML methods have also been employed to discover subseasonal drivers of European high temperatures at different time leads (van Straaten et al., 2022), and windows of opportunity for enhanced subseasonal forecasts of HWs (Guigma et al., 2021). Overall, the uncovered linkages confirm the key role of well-known drivers (tropical forcing and soil moisture) already revealed by dynamical forecasts, but also specific regional features in extratropical SSTs, sea ice and snow cover anomalies, which may vary with the region, timing and lag considered. In this sense, deep learning weather prediction models offer great potential to generate data-driven ensemble predictions and a large number of reforecasts of extremes with low computational cost and comparable performance to that of dynamical models beyond 2 weeks (Weyn et al., 2021).
Interpretability is a common issue in ML, and some methods require a large amount of input data for training and design choices (e.g., Dueben & Bauer, 2018). Approaches to tackle the shortness of historical data include data augmentation techniques through sampling algorithms that multiply the number of observed events (e.g., Ragone & Bouchet, 2021) or dynamical-statistical hybrid strategies (e.g., transfer learning from dynamical models to reanalysis data sets; Jacques-Dumas et al., 2022). In addition to the selected architecture and hyper-parameters, a predefined subset of predictors, regions and lead times is often required, which may not capture the true source of predictability due to the intrinsic non-stationarity and intermittency of the drivers. On longer time scales, the non-stationarity of the climate system represents an additional issue (e.g., causing historical signals to weaken or new pathways to emerge), but also an opportunity if the atmospheric circulation becomes more predictable with climate change (Faranda et al., 2019). The massive amount of model-based data could be exploited to partially cope with these limitations, although the performance of the so-learned relationships could be compromised in the real world by the presence of biases. Despite these challenges, in the next years we could see massive applications of ML approaches to HW-related problems, including: (a) the refinement of impact-oriented HW definitions; (b) the identification of drivers and quantification of their relative roles in single events, observed trends and model biases; (c) the development of early warning systems, or (d) the management of uncertainty in future regional projections.

Conclusions and Scientific Roadmap
HWs have received considerable attention in recent years, given the disproportionate risks they can pose for societies and ecosystems under ongoing and continued global warming. Despite recent advances, there are still gaps in current knowledge that prevent further progress and delineate priorities for future research on HWs.
Progress has been made by moving from single extreme indices to multi-faceted indicators of HW events, but many studies employ different definitions and emphasize different attributes, hampering their comparison (Section 2). Although challenging, coordinated efforts to reduce the number of indices have been proposed, at least in the climate community, where most metrics depend on temperature data only (Perkins, 2015). Before any attempt of prioritization, a systematic assessment would be necessary to understand the specific signatures of HWs captured by each metric (frequency, duration, intensity, etc.) and their links to properties of the time series (mean, variability, skewness, serial autocorrelation, etc.). In the case of more complex (e.g., multivariate) indices, the influence of separate components in the index and its sensitivity to changes in the temperature distribution should also be addressed. Concerted efforts in the climate community could then focus on identifying redundancies among metrics, restraining the number of indices and establishing relationships between HW characteristics, generation mechanisms and specific impacts. A universal definition of HW is perhaps too ambitious and even unfeasible, since there are no obvious optimal thresholds and choices for statistical HW definitions that can be generalized. Instead, a reduced and manageable number of indices with well understood links to the shape of the distributions would help explain regional features and the responses of HW characteristics to climate change. This could also provide guidance for tailored developments of impact-oriented indices, whose expansion seems unavoidable given the diversity and regional specificities of the affected systems. In the meantime, decisions must be guided by judgments on data credibility and about which index fits best for a given purpose.
Observational evidence and theoretical understanding have revealed a long list of drivers involved in HW development and maintenance across scales (Figure 8). As these drivers have mainly been addressed fragmentarily, a comprehensive framework accounting for the effects of fast and slow drivers and their interactions across multiple spatio-temporal scales is needed. It may combine empirical and mechanistic approaches, taking advantage of the development of methodologies for quantifying multi-hazard interrelationships (Tilloy et al., 2019). Concerning the extratropical atmospheric circulation (Section 3.2), controversy remains over the type of weather systems involved (blocking, subtropical ridge, Rossby wave packets, CGT, etc.). This may just reflect different perspectives to address atmospheric circulation, but also emphasizes the lack of comprehensive dynamical theories accounting for dry and moist processes, and the need for understanding the links between dynamical aspects of weather systems and statistical HW definitions (Messori et al., 2018; Steinfeld & Pfahl, 2019). Recent studies stress the relevance of processes affecting air masses during their transport toward the HW region, but disagree on the contribution of horizontal advection, adiabatic warming and/or diabatic heating (e.g., Schumacher et al., 2019; Zschenderlein et al., 2019). This suggests methodological discrepancies in their quantification, great spatial heterogeneity and/or case-to-case variability of HWs. Comparing HWs that occur or not in conjunction with other extremes or under different combinations and states of the abovementioned drivers could shed some light on these discrepancies.
Little advance has been made beyond statistical connections on the mechanistic influence of anomalous SST patterns and low-frequency climate modes in HW occurrence and variability (Section 3.1). However, potential linkages are supported by the co-occurrence of HWs and MHWs, and shared modulating factors (e.g., ENSO). Future studies exploring interactions between HWs and MHWs could foster progress on the role of SSTs in HWs. This would in turn require better understanding of diabatic heating sources, air-sea interactions and tropical-extratropical atmospheric teleconnections. In turn, MHWs face key challenges in modeling, data availability and coverage that limit our understanding of the roles of the oceanic background state, subsurface processes, biogeochemical feedbacks and large-scale remote drivers (Section 5.2). While there is still much to learn about MHWs, collective assessments, often missed in the land HW community, have brought relatively fast advances in their understanding.
Despite the growing body of literature on regional forcings of HWs (anthropogenic aerosols, land-use/land-cover changes), understanding is still incomplete, mainly limited to regional radiative and/or thermodynamic effects, and subject to uncertainties stemming from model deficiencies and discrepancies in the representation of processes (Section 3.3). In particular, the relative roles of radiative versus non-radiative processes are not well quantified, and it is yet unclear whether and how these regional processes instigate changes in atmospheric circulation patterns conducive to HWs over proximate and remote regions (e.g., Zampieri et al., 2009). Advances could be fostered through multi-model coordinated experiments targeting the joint and separate impacts of these forcings, as well as their varying effects across types of vegetation and aerosols. Regarding soil moisture feedbacks, there are still knowledge gaps concerning the relative roles of atmospheric evaporative demand and soil moisture deficits on land evaporation. Recent findings on "teleconnected-feedbacks" concentrate on specific regions (mainly Europe), although these remote land influences may operate in other regions. Comparatively, the effects of vegetation phenology and dynamics, or feedbacks between land surface processes and dynamical aspects of HWs (e.g., atmospheric circulation, clouds and/or precipitation) remain poorly understood. Local land-atmosphere feedbacks depend on the hydraulic behavior of plant species as well as a plethora of environmental conditions (water availability, nutrients), and so does their influence on HWs. This highlights the need to account for interactive plant physiology and phenology, which are often not accurately represented in land surface models (e.g., Miralles et al., 2019), and may cause competing effects on HWs.
There is compelling evidence that global warming is leading to more frequent, persistent, intense and extensive HWs, and that increases in GHGs are the main driver of the observed changes (Section 4.1). GHG forcing dominates the observed changes in warm extremes in all continents, and new attribution studies are able to disentangle the spatially-varying cooling effects of anthropogenic aerosols (e.g., Seong et al., 2021). However, the level of understanding of regional changes is limited and efforts are still biased toward specific regions. Regional trends can deviate substantially from those induced by anthropogenic GHG increases alone, due to large internal variability, changing dynamics and/or regional and local forcings, including aerosol trends, land-use/land-cover changes or widespread urbanization, which may either reinforce or counteract GHG-induced effects. In most land areas, the specific mechanisms and their relative contributions to regional trends in HWs and their attributes have not been well identified. The multiplicity of drivers, temporal evolutions, characteristic spatio-temporal scales (short-lived vs. long-lived forcings), and interactions are confounding factors. Accordingly, the relative importance of each forcing on HW trends varies depending on the spatial scale (local, regional, continental, global), the targeted period (past, present, future) and time horizon (short-term, long-term). Land-cover changes (afforestation, deforestation or reforestation) and land-use practices (agricultural expansion and irrigation) remain poorly addressed, and may cause large, eventually counteracting effects on HW magnitude and trends (e.g., Thiery et al., 2017; Sippel et al., 2018; Section 3.3). As a consequence, different regions might undergo similar changes due to different mechanisms.
Moreover, regional drivers of HW trends can have remote influences (e.g., cooling responses to anthropogenic aerosol increases far from the main emission regions) and yield partial cancellation in the presence of opposite trends (e.g., European responses to historical anthropogenic aerosol changes), which might be misinterpreted as a lack of effect. Similarly, the interactions of regional forcings remain poorly explored. For example, anthropogenic aerosols may intervene in HW characteristics by modulating land-atmosphere feedbacks or inducing changes in the regional atmospheric circulation (Y. Xu, Lamarque, & Sanderson, 2018).
The number of studies attributing single HW events to climate change is growing fast, and outstanding events are now assessed a few weeks after they occur. However, on regional scales where HW events impact the most, the detection of anthropogenic influences is hampered by the short length of observations, multiplicity of forcings, the lack of robust model responses, and by the difficulty to differentiate the forced response from internal variability (Section 4.1). In particular, changing dynamics can strongly affect the occurrence of HW events, and there is mounting evidence of recent circulation changes, but understanding of their causes and links to anthropogenic forcings remains limited. Attributing HW events on small spatio-temporal scales (including urban HWs) and expanding capabilities to inform on future events are key scientific gaps that need to be addressed in order to add value in local decision making. In this sense, there are promising methodological developments, such as "rare event algorithms" (Ragone & Bouchet, 2021) or ML weather forecasts (e.g., Weyn et al., 2021) that may bring more robust assessments of low-probability events with lower computational costs than model simulations. Challenging steps include the detection of changes in the atmospheric circulation, the disaggregation of dynamical and thermodynamic influences, the role of different anthropogenic forcings, and the consideration of aspects that are relevant for impact risk assessments, adaptation and mitigation on regional and local scales. End-to-end attribution of the associated impacts on humans and natural systems is emerging slowly because of the need to incorporate non-climate factors in the risk-based framework. 
While this represents an excellent opportunity for interdisciplinary research and novel applications (e.g., attribution of high-impact compound events and associated damages to country-level responsibility), the complex nature of non-climatic uncertainties may require novel approaches. Recent studies advocate for causality rather than statistics-driven approaches (Olsson et al., 2022). They include the consideration of past events as benchmarks to dissect the contributing factors and construct narratives (storylines) of potential risks based on present or future feasible outcomes of these factors and their combinations (e.g., Sillmann et al., 2021, and references therein). Storylines are starting to be employed for the attribution of HW events by nudging the atmospheric circulation to that observed at the time of the event and conditioning on different contributing factors through factorial experiments (e.g., van Garderen et al., 2021). These event-based storylines are promising approaches to convey actionable climate information for decision-making, although they face methodological challenges to incorporate vulnerability and exposure.
These issues pose challenges to current models, which still struggle to represent processes associated with energy, carbon and water fluxes, and have biases in HW-related weather systems (Section 4.2). On regional scales, the model skill in simulating the observed trends can be poor, and there has been little progress in determining whether these discrepancies are due to missing or misrepresented local forcings and feedbacks, natural variability, or observational issues. Some regional land forcings (e.g., deforestation, irrigation) and associated feedbacks are treated very differently (or not considered) in current GCMs, calling for mechanistic model developments of plant growth and physiology to better understand the role of vegetation dynamics (e.g., the stomatal behavior, biogeochemical processes and water efficiency across plants). These modeling efforts should be accompanied by the deployment of observational networks for model evaluation (e.g., Balsamo et al., 2018; Dirmeyer et al., 2018; Miralles et al., 2019). On the other hand, the links between model errors and specific mechanisms are often overlooked, and hence it is still unclear whether model improvements are associated with a refined representation of physical processes. A reasonable representation of HW characteristics can be erroneously achieved by compensating model biases, so the extent to which HWs in models occur for the right reasons is still uncertain. Higher resolution alone may not guarantee an improved simulation of HWs, and would further require simultaneous adjustments in parameterizations, as well as continued efforts toward fully coupled RCMs accounting for two-way ocean-atmosphere interactions, development of convection-permitting models and physically-consistent statistical techniques for downscaling and bias correction (e.g., Coppola et al., 2021; Jacob et al., 2020).
Simultaneously, large ensembles sampling internal variability and stochastic processes are important to evaluate models and quantify forced responses of HWs, particularly at local scales. The relative benefits and the optimal balance of model complexity and ensemble size are still unclear (Roberts et al., 2018). Suggested approaches to deal with this trade-off and reduce computational costs include high-resolution simulations for specific events or drivers, or novel emulator methodologies that can replicate the distribution of the full ensemble (Deser et al., 2020). Although our capability to simulate HWs is improving, models still have errors that affect the whole spectrum of HW drivers, with limited prospects of substantial improvements in the short-term. Therefore, mechanistic understanding and process-based model evaluation of HWs will ultimately be required. This is particularly important under non-stationary conditions (e.g., enhanced atmospheric evaporative demand, earlier vegetation greening, changes in atmospheric circulation, etc.), because some of the reported influences may change or not hold in a warmer climate, potentially leading to a disproportionate intensification of HWs (e.g., Domeisen et al., 2023;Lian et al., 2020;Rasmijn et al., 2018). Accordingly, model evaluation should prioritize the correct representation of drivers and associated processes, and how they affect the simulation of HWs, instead of focusing on the accurate depiction of extremes only. Recent advances in understanding have been brought by simplified models that allow a more straightforward identification and interpretation of basic HW ingredients than GCMs (e.g., Miralles et al., 2014), although they have mainly been applied to specific events and would benefit from more systematic assessments. Therefore, synergistic approaches exploiting these complementary model capabilities are strongly advisable. 
Examples include the design of driver-tailored experiments with and without coupling to other model components or a comprehensive use of models of diverse complexity: from large-scale simplified dynamical models that allow process understanding of the atmospheric circulation and connection to GCMs (e.g., Maher et al., 2019) to numerical weather models that explicitly resolve eddies involved in the turbulent transport within the atmospheric boundary layer (e.g., N. Zhang et al., 2014).
Predictive capabilities of HWs have recently expanded to subseasonal scales due to improved observations, data assimilation schemes for initialization and models (Section 5.3). However, dynamical predictions have mainly focused on specific regions and drivers (e.g., land surface states), and are far from completing the big picture of skillful sources of predictability and associated mechanisms. Hence, it is yet unknown if the limited regional skill on subseasonal scales is due to missing predictors, misrepresented processes or a true lack of intrinsic predictability. While subseasonal forecasts are still insufficient to be used for early warning, studies suggest that potential predictability is higher than the current prediction skill. For example, in the presence of remote forcing, summer waveguides may enhance the predictive skill of HWs in certain regions. The identification of windows of opportunity when HW events are more predictable is challenging because interactions over multiple spatio-temporal scales imply time-varying contributions and conditional involvements of the drivers. Realizing this potential predictability on subseasonal scales will require improved identification of drivers (e.g., MJO, ENSO, summer monsoons, soil moisture, Arctic sea ice, etc.), enhanced forecasts of these predictors and improved representation of the processes involved in their regional influences. This can entail challenging developments in model parameterizations, resolution and coupling, and post-processing techniques for model verification and bias correction (e.g., Domeisen et al., 2022;Merryfield et al., 2020). Therefore, advances will ultimately rely on better understanding of the mechanisms that lead to HWs. Sensitivity experiments prescribing certain components (e.g., SSTs, soil moisture, large-scale atmospheric circulation) may help uncover biases, sources of predictability and the underlying mechanisms. 
In the meantime, novel methods based on ML, either alone or in combination with dynamical forecasts, arise as very promising tools to integrate products into a future ready-to-go operational system. Ideally, these statistical methods should be reliable in discovering true connections, suitable for small data sets and reasonably tractable in physical interpretation to gain dynamical understanding. Subseasonal forecasts pose a major challenge to ML due to the limited predictability and sampling at these time scales, short record of high-quality observations for training and verification, high dimensionality and spatial correlations of climate variables, as well as the multiple time-evolving spatio-temporal interactions of the drivers. These limitations may lead to reproducibility issues and a case-dependent performance of ML algorithms. Translating these potential advances into process-based understanding will require incorporating interpretability into ML methods (i.e., how the model has learned the inferred relationships; McGovern et al., 2019). Interpretable ML algorithms will help assist dynamical forecast systems by revealing drivers that should be represented in model simulations. By complementing ML techniques with physical understanding (e.g., hybrid modelling frameworks; Reichstein et al., 2019), not just enhanced predictions but a better knowledge of the HW precursors can be attained. This represents an opportunity for progress and interdisciplinary collaboration in fundamental questions related to HW detection, prediction and attribution.
Under unabated global warming, the observed changes in HWs will increase proportionally with the magnitude of global warming, with larger increases for more intense events (an "intense-gets-more-intense" pattern; Section 4.2). With the current warming rates, record-shattering events with unprecedented intensities are already possible in today's climate. The benefits of limiting warming are unquestionable. Despite the unavoidable future increases in HWs, the level of understanding of regional changes is still fragmented, and different types of models (e.g., RCMs vs. GCMs) can show discrepancies in their projections, undermining confidence in future changes and the efficient development of adaptation and mitigation plans on relevant scales. Although changes in variability can have large effects in some regions (e.g., through land-atmosphere feedbacks), recent studies suggest that observed and projected changes are overall dominated by the mean seasonal warming. If so, it would seem reasonable to focus on understanding regional changes in the mean and their uncertainties, although, paradoxically, the causes of biases in seasonal mean temperature could be different from those in extreme temperature. Ultimately, confidence in future projections will depend on our understanding of the physical mechanisms behind projected changes. Besides reducing biases, which may not be feasible in the next few years, efforts should focus on identifying the drivers and understanding their combined effects in HW projections (e.g., Baker et al., 2019). The spatial diversity of regional responses more likely involves changes in a variety of interconnected factors, including SSTs, land-sea contrasts, land-atmosphere coupling and atmospheric circulation. The latter entails several challenging issues (e.g., Collins et al., 2018).
Regional circulation responses show large spread, partially due to the superposing responses to different anthropogenic factors (e.g., GHG increases and aerosol reductions; Lucas et al., 2014), enhanced influences of internal variability, and competing effects of climate change (e.g., the opposing effects of direct CO2 radiative forcing and sea surface warming; Shaw & Voigt, 2015). For example, regional blocking decreases caused by weakened baroclinicity or jet shifts could partially be counteracted by enhanced diabatic heating release in a moister atmosphere (R. M. Horton et al., 2016). Similarly, weakened summer jets would be less conducive to CGTs, but waveguiding also depends on jet latitude and width, on non-linear interactions with synoptic eddies, as well as on the characteristics of the diabatic forcing sources. This stresses the need to address changes in the background state and the forcings of atmospheric circulation (Teng & Branstator, 2019). Advances in understanding the atmospheric circulation responses to climate change could be made by focusing on the "thermodynamic starting points" (e.g., decreased meridional temperature gradient, increased latent heat release in tropics) that initiate the otherwise unmanageable number of dynamical processes (Shaw, 2019). The responses of these global drivers of climate change can then be combined to construct a reduced number of plausible and physically consistent narratives of atmospheric circulation changes (e.g., Zappa, 2019, and references therein).
Recent studies have also addressed future risks of HWs, mainly associated with changes in exposure. Quantifying risks is more challenging, as it also depends on vulnerability and physiological understanding of the systems' responses to heat, which will require coupled transdisciplinary actions. As projected changes in HWs depend on the rarity of the event (e.g., Kharin et al., 2018), risk assessments will need to identify the extreme thresholds critical for the system's vulnerability and define strategies as a function of the level of extremeness of the hazard. Assessments of vulnerability to HWs have mainly focused on human health and often rely on country-based indicators of socioeconomic development (e.g., Human Development Index; UNDP, 2019). However, vulnerability to HWs also involves complex relationships between HWs and the targeted sector, which can differ between urban and rural areas or across population groups, and also depend on adaptation measures and acclimatization that would reduce the impact. Cities are particularly vulnerable to HWs, but they also have a unique potential to adapt (Section 5.1). While increasing urban greenspace fraction and the reflectivity of urban materials mitigates UHI effects, solutions must be tailored to specific locations. This requires integrated approaches across disciplines (e.g., Barlow et al., 2017), including the deployment of new urban-specific measurement technologies and physically-based urban parametrizations covering a range of scales that can be exploited by researchers, building designers, city planners and policy makers to make cities more resilient to HWs. While many of the measures discussed in the scientific literature may be widely applicable to urban areas both in the middle and in the low latitudes, future studies should address their economic viability, the potential environmental consequences, and other limitations for specific urban environments. 
Interdisciplinary research will also allow the exploration and careful assessment of additional strategies (e.g., land radiative management; Seneviratne et al., 2018) for adaptation and mitigation of HW-related impacts.
In summary, recent studies have provided important, but still incomplete understanding of HWs. Despite the reported improvements, long-lasting issues related to HW definition, model performance, attribution to climate change, or future projections still remain, which also hamper advances on traditional (e.g., urban HWs) and emerging topics such as subseasonal forecasts and MHWs. We argue that the fragmentary understanding of the physical drivers contributing to HWs on regional scales and their multi-dimensional interactions is a major obstacle preventing substantial progress, which is aggravated by a fundamental lack of understanding of dynamical aspects. Process-based understanding would benefit from coordinated efforts challenging the integration of theoretical, observational and modelling developments. Filling these gaps will provide a strong basis to reduce uncertainties in future regional projections and address temporal and spatial relationships with other compound hazards. Interdisciplinary research will ultimately be required to translate this gained process understanding into a risk framework including aspects such as exposure, vulnerability and adaptation, and to realize an efficient provision of actionable information to end-users and policy-makers.

Appendix A: Data Sets and Methods
Several figures have specifically been prepared for this review. This section describes the data sets and methods employed in their computation.

A1. Observations and Model Simulations
We use daily TX for 1950-2021 from the experimental observational-based Berkeley Earth data set (Rohde et al., 2013). This data set provides gridded temperature fields (with almost complete coverage of global land areas) on a regular 1° × 1° grid based on a novel kriging interpolation framework to maximize the information from weather stations (see source in the Data Availability Statement). It has been employed for the generation of Figures 2, 4, 9, 10, and 14.
The model simulations considered for Figure 12 come from 23 Global Climate Models (GCMs) (Table A1) available in the CMIP6 archive (Eyring et al., 2016). Some GCMs are provided by the same institution with modifications in model components or atmospheric resolution. Daily mean temperature (TM) and TX for 1980-2100 were obtained by merging historical simulations (1980-2014, with the observed evolution of external forcings) and projections (2015-2100) under the SSP3-7.0 scenario. Only one realization is considered for each model (variant r1i1p1f1, if available). Data were interpolated to a common regular grid of 2.5° × 2.5° before performing any computation. For each GCM, we compute global (land and ocean) annual mean TM (GTM) series for the 1980-2100 period to identify the timing of emergence of specific global warming levels (from 1°C to 3.5°C at 0.5°C intervals) with respect to the preindustrial period (1850-1899). These targets are recomputed with respect to the mean simulated climate for 1981-2010 (corrected targets) by simply subtracting the observed global warming of that period, which is ∼0.7°C in the global Berkeley Earth data set product. A given global warming level occurs when the mean GTM of a 31-yr period (and all its subsequent periods) exceeds that of 1981-2010 by the corrected amount. Note that the same warming level can be achieved at different periods depending on the GCM. Finally, the 31-yr periods with the same level of global warming in all GCMs are grouped to derive multi-model ensemble means of heat wave (HW) day annual frequency. For each global warming level and GCM, the corresponding changes in HW day frequency are computed with respect to the present-day (1981-2010) mean simulated by that model. 
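As an illustration of the procedure described above, the following minimal Python sketch identifies the first 31-yr period that reaches a given global warming level in a single GCM. The function name, its arguments and the use of running (overlapping) 31-yr windows are assumptions introduced here for illustration; they are not taken from the code used for Figure 12, and the synthetic series is not model output.

```python
from statistics import mean

# Approximate observed warming of 1981-2010 relative to preindustrial (value quoted in the text).
OBSERVED_WARMING_1981_2010 = 0.7  # degC


def first_period_reaching_level(years, gtm, level, window=31,
                                baseline=(1981, 2010),
                                observed_warming=OBSERVED_WARMING_1981_2010):
    """Return (start_year, end_year) of the earliest window whose mean GTM
    anomaly, and that of every later window, exceeds the corrected target.
    Returns None if the level is not reached in a sustained way.
    Running windows shifted by one year are an assumption of this sketch."""
    base = mean(t for y, t in zip(years, gtm) if baseline[0] <= y <= baseline[1])
    corrected = level - observed_warming  # target relative to the 1981-2010 mean
    n_windows = len(gtm) - window + 1
    anomalies = [mean(gtm[i:i + window]) - base for i in range(n_windows)]
    first = None
    # Scan backwards so that all windows after `first` are known to exceed the target.
    for i in range(n_windows - 1, -1, -1):
        if anomalies[i] > corrected:
            first = i
        else:
            break
    if first is None:
        return None
    return years[first], years[first + window - 1]


# Synthetic example: linear warming of 0.03 degC per year (illustrative only).
years = list(range(1980, 2101))
gtm = [14.0 + 0.03 * (y - 1980) for y in years]
for level in (1.0, 1.5, 2.0, 2.5, 3.0, 3.5):
    print(level, first_period_reaching_level(years, gtm, level))
```

Scanning the windows backwards is one way to enforce the condition that all subsequent periods also exceed the target, so that a temporary crossing caused by internal variability is not counted as the emergence of the warming level.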
Model uncertainty for a given global warming level is quantified as the difference between the mean HW changes projected by the GCMs with the largest (≥75th percentile) and smallest (≤25th percentile) changes in global mean HW day frequency (6 GCMs in each group).
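As a rough illustration of this uncertainty measure, the sketch below ranks models by their global mean change and differences the group means at each grid point. Selecting the groups by rank (k = 6) instead of computing percentiles explicitly, as well as the dictionary layout and function name, are assumptions made for the example.

```python
def model_uncertainty(global_change, local_change, k=6):
    """Difference between the mean local HW change of the k models with the
    largest global mean change and that of the k models with the smallest.

    global_change: {model: global mean change in HW day frequency}
    local_change:  {model: list of changes, one value per grid point}
    """
    ranked = sorted(global_change, key=global_change.get)
    low, high = ranked[:k], ranked[-k:]
    npts = len(next(iter(local_change.values())))

    def group_mean(models):
        return [sum(local_change[m][j] for m in models) / len(models)
                for j in range(npts)]

    return [h - l for h, l in zip(group_mean(high), group_mean(low))]


# Synthetic ensemble of 23 hypothetical models with two grid points each.
global_change = {f"model{i}": float(i) for i in range(23)}
local_change = {f"model{i}": [float(i), 2.0 * i] for i in range(23)}
spread = model_uncertainty(global_change, local_change)
```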

A2. HW Definition
For all data sets considered, HWs are defined locally as periods of at least three consecutive days with TX above the local 95th annual percentile of the reference period (1981-2010, except for Figure 2, where the baseline changes with time to account for acclimatization). As the annual cycle is not removed, this definition emphasizes the detection of events in the warm season (if a local seasonal cycle is present). For regions with marked seasonality, our HW definition can detect moderate extremes in the warmest months of the year (as compared to definitions based on a seasonally varying threshold). However, it ensures a large sample size for robust threshold estimates and applicability across the globe without assumptions on seasonality (e.g., a predefined warm season for each hemisphere). HW events are only computed for land grid points with non-missing data during more than 75% of the days of both the reference and full period. All days comprising a HW event will be referred to as HW days. Local HW events are characterized by several event attributes and daily metrics for each day of the HW event. They include:
- HW duration: the number of consecutive HW days.
- HW intensity: the TX exceedance above the local threshold (in °C). It can be computed for each HW day (daily HW intensity) and event (e.g., the daily peak or cumulative intensity over the HW duration).
- Areal extent: The areal extent is computed for each day and land grid point with HW conditions. It is defined as the total extension (in km²) covered by that grid cell and all spatially connected grid points with simultaneous HW conditions. Note that for a given day all adjacent grid cells experiencing HW conditions will have the same areal extent.
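The local definition above (threshold, duration and intensity) can be sketched for a single grid point as follows. The percentile interpolation rule is an assumption, since the text does not specify one, and the function names and the dictionary layout of each event are illustrative.

```python
import math


def percentile(values, q):
    """Percentile (q in 0-100) with linear interpolation between ranks
    (an assumed convention; other interpolation rules are possible)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def detect_heat_waves(tx, threshold, min_days=3):
    """Find runs of at least `min_days` consecutive days with TX above the
    threshold. Each event records its start index, duration, and the peak
    and cumulative daily intensity (exceedance above the threshold)."""
    events, run = [], []
    # A trailing sentinel closes a run that reaches the end of the series.
    for i, t in enumerate(list(tx) + [float("-inf")]):
        if t > threshold:
            run.append(i)
            continue
        if len(run) >= min_days:
            exceed = [tx[j] - threshold for j in run]
            events.append({"start": run[0], "duration": len(run),
                           "peak": max(exceed), "cumulative": sum(exceed)})
        run = []
    return events


# Synthetic daily TX values (°C) with a fixed illustrative threshold of 35.
tx = [30, 36, 37, 38, 30, 36, 36, 30, 36, 37, 36]
events = detect_heat_waves(tx, threshold=35)
```

In practice the threshold would come from `percentile(reference_tx, 95)` applied to all days of the reference period; the two-day exceedance in the middle of the example is discarded because it is shorter than three days.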
Local HW metrics allow temporal aggregation on different scales. This is employed to derive monthly and annual HW statistics such as the monthly frequency of HW days and the maximum areal extent or HW duration in the year. The monthly frequency of HW days is employed to quantify the influence of the main modes of variability (Figure 4) and Marine HWs (MHWs) (Figure 14) on HW occurrence. Annual indicators are used to compute trends in observed HW characteristics (Figure 9) and future changes in the frequency of HW days (Figure 12).
We also define the HW magnitude as the sum of daily intensities for all HW days in the year, which provides an integrated measure of HW event frequency, duration and intensity on an annual basis. This metric is not very sensitive to the specific arrangement of HW events and hence it is considered more robust than other estimates of HW severity (e.g., the highest cumulative HW intensity in the year). HW magnitude has been used to identify the most severe years for each continent (Figure 10), defined as those with the highest area-weighted HW magnitude over all land grid points of the continent.
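A minimal sketch of the annual HW magnitude and of the selection of the most severe year is given below. The cosine-of-latitude weights are an assumed way of area-weighting a regular latitude-longitude grid, and the function names and input layout are hypothetical.

```python
import math


def annual_hw_magnitude(daily_intensity_by_year):
    """Sum the daily intensities of all HW days in each year, regardless of
    how those days are arranged into events."""
    return {year: sum(vals) for year, vals in daily_intensity_by_year.items()}


def most_severe_year(magnitude_by_point, lats_deg):
    """Year with the highest area-weighted mean HW magnitude.

    magnitude_by_point: list with one {year: magnitude} dict per grid point
    lats_deg: latitude of each grid point (weights assumed ~ cos(latitude))
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    total = sum(weights)
    weighted = {}
    for year in magnitude_by_point[0]:
        weighted[year] = sum(w * m[year]
                             for w, m in zip(weights, magnitude_by_point)) / total
    return max(weighted, key=weighted.get)


# Synthetic example: two grid points at 0° and 60° latitude.
magnitudes = annual_hw_magnitude({2003: [1, 2, 3], 2004: []})
points = [{2003: 10.0, 2010: 2.0}, {2003: 1.0, 2010: 12.0}]
severe = most_severe_year(points, lats_deg=[0.0, 60.0])
```

In this example an unweighted mean would select 2010, whereas the area weighting gives less influence to the high-latitude grid point and selects 2003.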

A3. Modes of Variability
To assess the influence of internal modes of variability on HW occurrence (Figure 4), the following indices have been considered: Northern Hemisphere atmospheric teleconnection patterns (TCP), Southern Annular Mode (SAM), El Niño indices (EN1+2, 3, 3.4, and 4), Indian Ocean Dipole (IOD), Tropical North and South Atlantic indices (TNA and TSA, respectively), Madden-Julian Oscillation (MJO), and the Indian, Western North Pacific, West African and Australian summer monsoons (see sources in the Data Availability Statement). The list is not meant to be exhaustive since Figure 4 is used for illustrative purposes only. The Northern Hemisphere atmospheric TCP include the following modes of variability: North Atlantic Oscillation, East Atlantic, East Atlantic/Western Russia, Scandinavia, Polar/Eurasia, Pacific/North American, West Pacific, East Pacific-North Pacific, Pacific Transition and Tropical Northern Hemisphere. Due to its higher zonality, the Southern Hemisphere lacks a similar list of regional TCP, and its atmospheric variability is dominated by the SAM. Although the MJO is a transient phenomenon with a 30- to 60-day lifecycle, a monthly degraded version is herein employed, with 10 MJO indices measuring the anomalous convection activity over different tropical regions on a monthly basis (80°E, 100°E, 120°E, 140°E, 160°E, 120°W, 40°W, 10°W, 20°E, 70°E). The monthly indices were standardized with respect to the 1981-2010 period. Note that we do not include all reported modes of interannual variability and SST modes of low-frequency variability such as the Pacific Decadal Oscillation and the Atlantic Multidecadal Oscillation.
For each grid point, we compute the Pearson's correlation coefficient between the monthly series (1978-2021) of HW day frequency and each index separately (similar results are obtained for the Spearman's rank correlation). All monthly series were previously detrended to emphasize variations at interannual and shorter scales, and to avoid correlations inflated by the covariability of their trends. For non-atmospheric modes of variability, lagged correlations (index values up to 3 months before the local HW occurrence) are also considered to account for potential remote teleconnections. The analysis is restricted to calendar months when both the index and HW frequency are defined. The latter corresponds to the warm-season months (i.e., those with at least one HW event during the analyzed period). As correlation coefficients depend on the sample size, which varies spatially and with the mode of variability, we assess statistical significance and look for the lowest p-value to determine the mode of variability with the highest influence on local HW occurrence.
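The effect of detrending on the correlation can be illustrated with the sketch below. It assumes linear least-squares detrending (the text does not state the detrending method) and treats the series as equally spaced steps, which simplifies the restriction to warm-season months; the function names are illustrative and the p-value computation is omitted.

```python
import math


def detrend(y):
    """Remove the mean and the least-squares linear trend from a series."""
    n = len(y)
    xm = (n - 1) / 2
    ym = sum(y) / n
    sxx = sum((i - xm) ** 2 for i in range(n))
    slope = sum((i - xm) * (v - ym) for i, v in enumerate(y)) / sxx
    return [v - ym - slope * (i - xm) for i, v in enumerate(y)]


def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length series."""
    am, bm = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - am) * (y - bm) for x, y in zip(a, b))
    var_a = sum((x - am) ** 2 for x in a)
    var_b = sum((y - bm) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(index, hw_freq, lag=0):
    """Correlate detrended HW frequency with the detrended index taken
    `lag` steps earlier (lag=0 gives the simultaneous correlation)."""
    di, dh = detrend(index), detrend(hw_freq)
    if lag:
        di, dh = di[:-lag], dh[lag:]
    return pearson(di, dh)


# Synthetic example: HW frequency = linear trend + 3 x index.
index = [1.0, -1.0, 0.0, 2.0, -2.0, 1.0, 0.0, -1.0, 2.0, -2.0]
hw_freq = [0.5 * i + 3.0 * s for i, s in enumerate(index)]
r_raw = pearson(index, hw_freq)
r_detrended = lagged_correlation(index, hw_freq)
```

Here the trend in the synthetic HW series lowers the raw correlation, while the detrended series recover the underlying linear relationship.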

A4. MHWs
In Figure 14, we compute the monthly mean frequency anomaly of HW days during the months of regional MHW occurrence. The composite analysis is performed for a list of historical (1982-2016) MHW events over selected regions, as reported in Table 2 of Holbrook et al. (2019). MHW events can last several consecutive months and the number of case studies is small (seven regional events or fewer). As many of these MHW events occurred in recent years, the local time series of monthly HW day frequency were first detrended (with zero mean) over 1979-2017. This removes the effects of long-term trends in the composite. We also take into account seasonality effects in the occurrence of both local HWs and MHW events. To do so, we computed random composites of monthly HW frequency with a 500-trial bootstrapping. Each random composite uses the calendar months of MHW events but with random years of occurrence (these years are consistent for all land grid points to preserve spatial coherence). In all cases, the composited HW frequency is expressed as the local departure from the mean of the random composites, thus representing a climatological anomaly (in HW days per month). These anomalies are statistically significant at p < 0.1 if they fall outside the (5-95)th percentile range of the random distribution. This analysis has been performed for each region of MHW occurrence separately. In Figure 14, the composites of anomalous HW occurrence over Asia/Europe/North America, Africa, Australasia and South America correspond to MHW events over the Northwest Atlantic, Benguela, Great Barrier Reef and Humboldt/Peru regions, respectively.
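The bootstrapped composite described above can be sketched for one grid point as follows. The sketch assumes that each MHW event falls within a single calendar year and that one random year is drawn per event; the function name, the input layout and the rank-based percentile lookup are illustrative. Reusing the same seed for every grid point keeps the random years consistent in space.

```python
import random


def composite_anomaly(freq, events, years, n_trials=500, seed=0):
    """Composite HW frequency during MHW months, relative to random composites.

    freq:   {(year, month): monthly HW day frequency} at one grid point
    events: list of (year, [calendar months]) for each MHW event
    years:  candidate years for the random composites
    Returns (anomaly, significant), with significance at p < 0.1 judged from
    the 5th-95th percentile range of the random composites.
    """
    rng = random.Random(seed)
    n_months = sum(len(months) for _, months in events)
    observed = sum(freq[(y, m)] for y, months in events for m in months) / n_months
    random_means = []
    for _ in range(n_trials):
        total = 0.0
        for _, months in events:
            ry = rng.choice(years)  # same calendar months, random year
            total += sum(freq[(ry, m)] for m in months)
        random_means.append(total / n_months)
    baseline = sum(random_means) / n_trials
    random_means.sort()
    lo = random_means[int(0.05 * (n_trials - 1))]
    hi = random_means[int(0.95 * (n_trials - 1))]
    return observed - baseline, not (lo <= observed <= hi)


# Synthetic example: 1 HW day per month, except three months with 10 HW days.
years = list(range(1982, 2017))
freq = {(y, m): 1.0 for y in years for m in range(1, 13)}
for key in [(2003, 7), (2003, 8), (2010, 7)]:
    freq[key] = 10.0
anomaly, significant = composite_anomaly(freq, [(2003, [7, 8]), (2010, [7])], years)
```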

Data Availability Statement
Gridded daily temperature data employed for the computation of Figures 2, 4, 9, 10, and 14 were retrieved from the Berkeley Earth data set (https://berkeleyearth.org/data/, last access 23 December 2022). Daily mean and maximum surface air temperature output of GCM simulations (Figure 12) was downloaded from the CMIP6 data portal (https://esgf-node.llnl.gov/search/cmip6/). The monthly time series of indices (Figure 4) (HEAT, 101088405). S.S-S acknowledges support from the ORCA-DEEP (PID2020-115454GB-C21) project funded by the Spanish Ministerio de Ciencia e Innovación. We also thank ECMWF for providing ERA5 reanalysis data, Berkeley Earth for providing the observational temperature product, and the data providers of monthly indices and CMIP6 GCM simulations. D. Domeisen and two anonymous reviewers provided comments that helped improve the manuscript.