Detection and attribution of climate extremes in the observed record

We present an overview of practices and challenges related to the detection and attribution of observed changes in climate extremes. Detection is the identification of a statistically significant change in the extreme values of a climate variable over some period of time. Issues in detection discussed include data quality, coverage, and completeness. Attribution takes that detection of a change and uses climate model simulations to evaluate whether a cause can be assigned to that change. Additionally, we discuss a newer field of attribution, event attribution, where individual extreme events are analyzed for the express purpose of assigning some measure of whether that event was directly influenced by anthropogenic forcing of the climate system. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Contemporary climate change presents one of the most pressing challenges for human society. As the climate continues to change, the risks associated with climate extremes take on ever greater importance. Changes in the mean climate, particularly since the middle of the 20th century, have been linked to anthropogenically induced increases in greenhouse gases (Hegerl et al., 2010). Indeed, a number of recent climate assessments have concluded that observed changes in the climate system over the past century are largely a result of human activities (Seneviratne et al., 2012; Bindoff et al., 2013; Walsh et al., 2014a, 2014b). Climate extremes, by definition, are rare events; however, climate change has resulted in changes in the occurrence of extreme events (Easterling et al., 2000; Seneviratne et al., 2012). Climate extremes can result from external forcing of the climate system, such as from increasing greenhouse gases, or natural variability, or more likely some combination of the two. For example, some of the more robust climate change signals related to extremes in both the observed record and in model simulations for the future are decreases in the number of unusually cold days and nights, and increases in the number of unusually warm days and nights (Seneviratne et al., 2012; Min et al., 2013; Kim et al., 2015). Other changes include an increase in the number of heavy precipitation events (Kim et al., 2015) and a likely increase in the incidence of hurricanes in the North Atlantic since about 1970 (Kunkel et al., 2013; Seneviratne et al., 2012), while longer-term trends in hurricanes remain a subject of inquiry (Landsea, 2015; Kossin et al., 2015). Once a signal of change in an extreme is found, the question most often becomes how the change is related to human-induced climate change (Hulme, 2014).
Detection of climate change in the observed record refers to the identification of a statistically significant change in some part of the climate system. The change could be in some highly averaged mean quantity or in some measure of extreme weather or climate. Observed climate change over various time scales for many parts of the climate system is well summarized in the IPCC 5th Assessment Report (Bindoff et al., 2013) and continues to be extensively monitored (Blunden and Arndt, 2015). It has been clear for some time that changes in the occurrence of weather and climate extremes are major players in producing changes in the natural environment and society, and these kinds of changes have increasingly been the subject of research papers and scientific assessments (e.g. CCSP, 2008; Seneviratne et al., 2012).
However, it is not enough to show that a change in the climate has occurred; once a change has been detected it is important to attribute that change to some cause. Attribution, especially to human greenhouse gas emissions, lends confidence to model projections of the future driven by anthropogenic forcing as well as predictions of extremes at shorter time scales (Seneviratne and Zwiers, 2015). Attribution also provides information for more robust decisions in adaptation activities related to weather and climate extremes (Sippel et al., 2015). Traditionally, detection and attribution studies focused on mean changes (e.g. Hegerl et al., 2007; Bindoff et al., 2013); however, in the past decade or so climate extremes have become a focus of detection and attribution studies. A number of recent papers have included overviews of detection and attribution science related to extremes.
Furthermore, when an extreme event occurs, climate scientists are increasingly queried by the news media, policy makers, private enterprise and the public as to the likely cause of the event. The question of attribution of these events to human-induced climate change is of particular interest (Zwiers et al., 2013; Hulme, 2014; Hegerl, 2015). Answering this question provides valuable information regarding risk due to climate extremes, which is useful to a wide range of stakeholders for disaster risk reduction activities.
Two schools of thought have emerged in this rapidly developing field and are described in Section 4. The first, referred to in this paper as "Oxford", where the technique was first envisioned, quantifies the change in probability of an extreme event of a particular observed magnitude caused by the human alteration of the climate system. The second, referred to here as "Boulder", introduced first in a series of papers by researchers from NOAA's Earth System Research Laboratory, examines the human-induced change in magnitude of an extreme event. While providing different types of information to stakeholders, both these probabilistic and mechanistic schools of thought have been shown to be equivalent.
In this paper we present an overview of practices and challenges related to the detection and attribution of observed changes in climate extremes. In particular we mainly examine temperature and precipitation extremes, while acknowledging that trends in other kinds of extremes, such as tropical cyclones, droughts, and even extreme snow storms may exist and deserve attention. However, we do discuss a newer field of attribution, event attribution, where individual extreme events, such as storms or heatwaves, are analyzed for the express purpose of assigning some measure of the extent to which that event was directly influenced by anthropogenic forcing of the climate system.

Detection of trends in extreme temperature and precipitation
Extreme weather and climate events are a natural part of the climate system. For example, an examination of the paleoclimate record shows that megadroughts and pluvials have happened in the Western and Central United States throughout the last 2000 years (Woodhouse and Overpeck, 1998). Yet, true climate extremes are rare events. Because of this rarity researchers often relax the definition of extremes in such a way as to increase the number of observations that can be used in a statistical analysis. For example, in the case of studies of changes in the occurrence of hot daily maximum temperature extremes, rather than defining the extreme threshold such that it is observed only once every few years, the definition is often set to a threshold value (e.g. 90th percentile value) that is not truly extreme but produces a larger number of observations that exceed the threshold allowing more robust statistical results.
But what about the data sets used in these analyses? To detect an observed change in the climate system, particularly a change suitable for an attribution study, a data set of sufficient temporal and spatial coverage is necessary. Depending on the climate extreme, there is often a lack of observed climate data to document these events for many parts of the world. If the observations exist, they often are not in digital form. Also, although the situation is changing, many countries continue to be reluctant to share them with the research community (Kunkel and Frankson, 2015).
As noted above, since the analysis of climate extremes often involves examination of the tails of a statistical distribution, a threshold value may be used to determine the number of observations that exceed that value over time creating a time series of exceedance counts. Data quality can impact the counts if there are a number of erroneous values that are not screened out by quality assurance methods, or if the quality assurance methods, which are often more concerned with mean values, are too rigorous and exclude true values. Additional issues include missing data, especially if those missing data would exceed an established threshold or would affect the calculation of the threshold itself. In terms of global analyses, data may be missing for large regions of the globe resulting in a less than true global analysis (Donat et al., 2013). Finally, if longer term data are available they are often observed at weather observing stations, such as at airports, and may be impacted by issues such as urbanization or less than ideal station siting which may result in lower quality data.
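The construction of a time series of exceedance counts described above can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the procedure of any cited study: the function name `exceedance_counts`, the 90th percentile default, and the choice to compute the threshold over the whole record (rather than over a fixed base period) are all assumptions made for the example. Missing days are skipped, which also shows how gaps can lower the counts.

```python
import numpy as np

def exceedance_counts(years, values, percentile=90.0):
    """Return (unique_years, counts, threshold) for days above a percentile threshold.

    Assumption for this sketch: the threshold is computed over the whole
    record; a fixed base period could be used instead.
    """
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    valid = ~np.isnan(values)                       # missing days are ignored
    threshold = np.percentile(values[valid], percentile)
    # Replace NaN by -inf so that missing days never count as exceedances.
    exceeds = valid & (np.nan_to_num(values, nan=-np.inf) > threshold)
    unique_years = np.unique(years)
    counts = np.array([int(exceeds[years == y].sum()) for y in unique_years])
    return unique_years, counts, threshold

# Synthetic daily maximum temperatures (deg C) for 30 years of 365 days.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(1981, 2011), 365)
tmax = rng.normal(25.0, 5.0, size=years.size)
tmax[rng.random(tmax.size) < 0.02] = np.nan         # about 2% missing days

yrs, counts, thr = exceedance_counts(years, tmax)
print(yrs[:3], counts[:3], round(float(thr), 2))
```

The resulting `counts` array is the annual series to which a trend test would then be applied.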
The homogeneity of climate data may also impact analyses of climate extremes (Trewin, 2010). Climate data are considered homogeneous when all trends and variations are the result of the climate system itself. Inhomogeneities in climate data occur for a variety of reasons. Observing stations often are moved multiple times over longer periods (e.g. 50-100 years), resulting in changes in the local characteristics of the site (e.g. more trees, slight difference in elevation, etc.). Reasons for moves vary, but examples include relocation from a city center to an airport, a change in a volunteer observer who also hosts the equipment, or the need to use the site for other purposes. A common inhomogeneity source is urbanization around a station, which will generally cause localized warming, primarily in Tmin (Karl et al., 1988), the magnitude of which can be several degrees in the largest urban areas. This warming is real and relevant to impacts on urban residents, but will not be representative of real trends at a larger regional scale; thus, for attribution applications, this urban warming should be removed. Changes in instrumentation, such as a new type of thermometer or the installation of a wind shield on a rain gauge, and changes in observing practices, such as the time observations are taken, can all result in an inhomogeneous time series. The impact on the observed time series is typically either a discontinuity (jump up or down) or a gradual change that can appear as a trend (Menne and Williams, 2009), either of which can impact the analysis of changes in extremes. Methods for identifying and correcting for inhomogeneities have typically been applied to time series based on longer averaging periods, such as monthly, seasonal, or annual time series (e.g. Easterling and Peterson, 1994; Menne and Williams, 2009). In the past decade or so approaches to assess and correct for inhomogeneities in daily and even subdaily data have been developed (e.g. Della-Marta and Wanner, 2006; Trewin, 2013), but still have not been widely implemented. However, even without corrections applied to higher temporal resolution data, results of analyses of extremes are consistent with what would be expected based on analyses of mean values (e.g. Alexander et al., 2006; Min et al., 2011).
Incomplete spatial coverage of observing stations for a region or the globe is another potential source of uncertainty. Since there are a number of regions in the world that are not covered in global-scale data sets used for climate analyses, particularly for extremes (Cowtan and Way, 2014), it is unknown how the addition of these regions would impact detection and attribution studies. Even in regions that have observing stations, lower spatial density could prove problematic. Kunkel et al. (2007) used Monte Carlo techniques to examine the impact of lower spatial density of observing stations in the western United States and missing data in detecting changes in heavy precipitation over the contiguous United States. They found that limited spatial density was more important than missing data in detection studies, but that neither issue was severe enough to reduce statistical significance values below standard confidence levels.
Satellite and reanalysis products have the advantage of global coverage. However, both classes of products have significant uncertainties in their representation of extreme daily temperature and precipitation (AghaKouchak et al., 2011). Satellites do not measure these quantities directly, but instead measure radiances in different wavelength bands. Inversion algorithms for satellite data are generally not as well calibrated for large values as they are for average values. For example, AghaKouchak et al. (2011) analyzed the ability of four satellite-based precipitation products to capture precipitation extremes and suggested that extensive efforts are necessary to reliably capture these kinds of events. Reanalysis products are hybrid model-observation based datasets that are produced by assimilating observations into highly constrained climate model simulations. Biases in the reanalysis models, particularly those from coarse horizontal resolution or time-dependent uncertainties due to issues such as observing network changes (Wehner et al., 2014), can be significant for extremes. Station-based daily gridded products provide more complete regional coverage to estimate extreme values and can reduce noise that is often inherent in station-based time series (Fischer et al., 2013). The process of gridding a data set can introduce uncertainties in subsequent analyses of extremes, particularly in calculating return periods, but has minimal effect on analyses of long-term trends and inter-annual variability (for more discussion on this and other data issues see Alexander (2015)). However, as with satellite inversion algorithms, gridding algorithms are also not generally designed for the tails of the distribution. As a result of these factors, extremes from different products can differ substantially over the same locations even when the original observations are closely related (Wehner et al., 2014).
As noted earlier, there are still many regions in the world that lack higher temporal resolution climate data that are suitable for examining changes in extremes. Because many countries find it difficult or are reluctant to provide these data, the World Meteorological Organization's Joint WMO CCl/WCRP/JCOMM Expert Team on Climate Change Detection and Indices (ETCCDI) developed a set of climate extremes indices and organized a set of workshops in data-poor regions to encourage these countries to calculate the indices at the workshop. The resulting data sets of the indices were then used to construct the HadEX2 data set, which is a quasi-global gridded data set of these indices that can be used in detection and attribution studies (Alexander et al., 2006). A similar data set, GHCNDEX (Donat et al., 2013), has also been constructed by calculating these indices using the GHCN-Daily data set. Although, in general, it lacks the spatial coverage of the HadEX2 data set, it is updated on a regular basis. However, the difference in spatial coverage between the two data sets is not consistent and depends strongly on the variable chosen. Use of either of these two data sets in detection/attribution analyses presents a tradeoff between a lack of coverage and not being as up to date as possible, both of which increase uncertainty in results. Fig. 1 shows the pointwise trend in one of these temperature indices, the annual maximum daily high temperature (TXx). With some exceptions, this measure of extreme temperature is largely experiencing increased values. Aggregation into larger regions, a common practice in detection and attribution (D&A) studies, yields greater areas of significance in these trends. Fig. 2 shows the pointwise trend in an extreme precipitation metric, the annual maximum 5-day (pentadal) precipitation (Rx5day). Again, statistical significance is increased with spatial aggregation.
As discussed in more detail below, trends in both of these fields have been attributed to anthropogenic changes to the composition of the atmosphere. D&A studies of the ETCCDI indices are still rather limited, although further opportunities exist. Fig. 3 shows trends in the consecutive dry days (CDD), a crude measure of drought and/or the dry season. Other more sophisticated measures of meteorological and agricultural drought also offer opportunities (Burke et al., 2006; Wehner et al., 2011; Sheffield et al., 2012; Dai, 2013). D&A studies of seasonal rather than annual indices can be more insightful, as the large-scale meteorological patterns behind extreme temperature and precipitation often vary across the annual cycle. It follows that the statistical description, as well as the physical mechanism behind any changes, varies seasonally as well.
The above examples (Figs. 1-3) utilized metrics that have an annual time resolution. Calculation of such metrics is convenient but does not necessarily focus the analysis on high-impact extreme events, which in general are rarer than annual occurrence events. However, examination of more rare events usually requires access to the original observational data sets such that the analyses can be customized to focus on metrics more related to impacts. Fig. 4 shows grid box trends for the number of occurrences of 5-day duration cold spells that are colder than the threshold for a 1-in-5yr recurrence. A few features stand out. The direction of trends (downward) is very coherent spatially across the Northern Hemisphere with only a few grid boxes with upward trends. In the Southern Hemisphere, nearly all available grid boxes exhibit downward trends. Despite the relatively small sample size associated with this extremes definition, the downward trends are statistically significant for much of southern Eurasia, western Canada, and Australia. These results clearly illustrate that global warming has been accompanied by a decrease in the number of extreme cold episodes. As with Figs. 1-3, most of the land areas in the tropics and Southern Hemisphere are missing, due to lack of available data or reluctance of nations to provide their data for general use.
There are several methods of data analysis to detect trends. Standard linear regression is usually not the preferred method to examine trends because in most cases metrics of extremes are not normally distributed. A common nonparametric approach (e.g. Alexander et al., 2006) is Kendall's tau-based slope estimator (Sen, 1968). Statistical significance is assessed by looking at the sum of the signs of the differences of all possible pairs of data points. The estimate of the magnitude of the trend is the median of all nonzero pairwise trends. Both serial and spatial correlation need to be taken into account, as substantial levels of such correlation are often present in extremes data, a consequence of the large-scale and long-lasting meteorological patterns that are often the physical cause of extreme events.
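The nonparametric trend estimate described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the implementation used in the cited studies: the function name `mann_kendall_sen` is hypothetical, the significance test uses the normal approximation to the sum-of-signs statistic without corrections for ties, serial correlation, or spatial correlation, and the slope is the median of all pairwise slopes (the standard form of the Sen estimator).

```python
import math
import numpy as np

def mann_kendall_sen(t, y):
    """Return (sen_slope, s_statistic, two_sided_p_value).

    Assumes strictly increasing times t, no ties, and independent data;
    none of these corrections are handled in this sketch.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    i, j = np.triu_indices(n, k=1)            # all pairs with i < j
    dy, dt = y[j] - y[i], t[j] - t[i]

    # Sum of the signs of all pairwise differences.
    s = int(np.sum(np.sign(dy)))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance of s without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value

    sen_slope = float(np.median(dy / dt))     # median of pairwise slopes
    return sen_slope, s, p_value

# Synthetic annual maxima with an imposed trend of 0.05 units per year.
rng = np.random.default_rng(1)
years = np.arange(1951, 2015)
annual_max = 0.05 * (years - years[0]) + rng.gumbel(30.0, 2.0, size=years.size)
print(mann_kendall_sen(years, annual_max))
```

Because the test assumes independent data, the p-value from this sketch would be too small for series with the serial correlation noted above.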
The Generalized Extreme Value (GEV) distribution has been found to be suitable as a fit to the tails of the distribution for atmospheric variables. In this approach, the original data set is subsetted by extracting the maxima over each "block" of time, where a block is often a season or year. The cumulative distribution function of the block maxima, G(y), is given by (Coles, 2001):

$$G(y) = \exp\left\{-\left[1 + \varepsilon\left(\frac{y-\mu}{\sigma}\right)\right]^{-1/\varepsilon}\right\}, \qquad 1 + \varepsilon\left(\frac{y-\mu}{\sigma}\right) > 0,$$

where ε, μ, and σ are called the shape, location, and scale parameters. The GEV distribution has been shown to provide a good fit to seasonal and annual maximum temperatures and precipitation (Zwiers et al., 2010; Wehner, 2013). A trend can be introduced as a trend in the location parameter or log of the shape parameter (Katz, 2010). In Zwiers et al. (2011), trends were studied by assuming that the changes can be expressed by a linear (in time) change in the location parameter with the other parameters remaining constant. The maximum likelihood method was used to find the estimates of the parameters that best fit the observed data.

Fig. 2. Pointwise linear trend over the 1951-2014 period in annual maximum 5-day total precipitation (Rx5day) using GHCNDEX on a 2.5° × 2.5° latitude/longitude grid (Donat et al., 2013). Units: mm/year; stippling indicates statistically significant trends (p ≤ 0.05).

While the above statistical approaches provide a well-established numerical framework for estimating uncertainties, uncertainties in estimated probabilities for the most extreme events (e.g. the 2003 European heat wave, the 2010 Russian heat wave) must be considered very high. Exploring the true probability of the most extreme conditions requires another approach. Climate models offer a possible solution. They are based on the fundamental physical laws governing the climate system. They can produce the dynamic chaotic behavior that is characteristic of the system. In principle then, they can produce events like the singular ones described above and provide insights into their probability. Very long simulations can provide the large sample sizes needed to establish statistical confidence.
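As an illustration of the GEV fit with a linear trend in the location parameter discussed above, the following Python sketch estimates the parameters by maximum likelihood on synthetic block maxima. It is a sketch under assumptions, not the method of Zwiers et al. (2011): the function name `fit_gev_location_trend`, the Nelder-Mead optimizer, and the Gumbel-based starting values are choices made here for illustration. Note that `scipy.stats.genextreme` uses a shape parameter `c` with the opposite sign to the shape parameter in the GEV form of Coles (2001).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

def fit_gev_location_trend(t, maxima):
    """Fit a GEV with location mu0 + mu1 * (t - mean(t)), constant scale and shape."""
    t = np.asarray(t, dtype=float)
    x = np.asarray(maxima, dtype=float)
    tc = t - t.mean()                          # centred time, for a stable fit

    def neg_log_lik(params):
        mu0, mu1, log_sigma, shape = params
        # scipy's shape c has the opposite sign to the Coles (2001) shape.
        logpdf = genextreme.logpdf(
            x, c=-shape, loc=mu0 + mu1 * tc, scale=np.exp(log_sigma)
        )
        if not np.all(np.isfinite(logpdf)):
            return np.inf                      # outside the support
        return -np.sum(logpdf)

    # Starting values: least-squares trend plus Gumbel moment estimates.
    slope, intercept = np.polyfit(tc, x, 1)
    sigma0 = np.std(x - (intercept + slope * tc)) * np.sqrt(6.0) / np.pi
    start = [intercept - 0.5772 * sigma0, slope, np.log(sigma0), 0.0]

    result = minimize(neg_log_lik, start, method="Nelder-Mead",
                      options={"maxiter": 5000})
    mu0, mu1, log_sigma, shape = result.x
    return {"mu0": float(mu0), "mu1": float(mu1),
            "sigma": float(np.exp(log_sigma)), "shape": float(shape),
            "converged": bool(result.success)}

# Synthetic annual maxima with a location trend of 0.03 units per year.
rng = np.random.default_rng(42)
years = np.arange(1951, 2015)
true_loc = 30.0 + 0.03 * (years - years.mean())
annual_max = genextreme.rvs(c=0.1, loc=true_loc, scale=1.5,
                            size=years.size, random_state=rng)
print(fit_gev_location_trend(years, annual_max))
```

With only 64 block maxima, as in this example, the estimated trend and especially the shape parameter carry substantial sampling uncertainty, which is consistent with the caution expressed above about the most extreme events.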
But what about detection of changes in other phenomena of the climate system, such as tropical or convective storms? Storms in particular present difficulties in detecting long-term trends. For example, quantifying tropical cyclone numbers, as well as intensities, has presented numerous challenges. Observations of tropical cyclones in the North Atlantic in the early part of the observational record (e.g. prior to World War II) occurred only when a storm struck land or was encountered by a ship at sea. Between WWII and the 1960s, aircraft observations were added, and with the advent of satellites these observations were also included. This change of observing methods through time has resulted in much uncertainty in the annual counts of tropical cyclones in the earlier part of the record (CCSP, 2008; Vecchi and Knutson, 2011). Fig. 5 shows counts of hurricanes in the Atlantic by year based on the HURDAT data set (Jarvinen et al., 1984; Landsea et al., 2004). Vecchi and Knutson (2011) have provided adjustments for numbers of hurricanes for the period prior to 1966 based on ship track densities and sea-surface temperatures. Examining the unadjusted data in Fig. 5, there appears to be a long-term increase in numbers of hurricanes. However, using the adjusted data that account for storms that were likely missed in the pre-1966 period, there is no longer an increasing trend over the entire period (Vecchi and Knutson, 2011). Lastly, if the analysis is restricted to the modern era (satellite era, 1966 to present), confidence is high in the counts of hurricanes, and Hartmann et al. (2013) state that there is very high confidence that there has been an increase in numbers of hurricanes over that period (Kossin et al., 2007). Why there has been an increase in hurricanes in this period is still the subject of much debate (Landsea, 2007; Holland and Webster, 2007).
Detecting changes in observations of other extremes presents its own issues, and drought is a good example. With differences in the ways drought is defined using indices such as the Palmer Drought Severity Index or the Standardized Precipitation Index (see Box 3.3 in Seneviratne et al., 2012), and with their reliance on observations of temperature and precipitation to calculate these indices, or the use of modeled data sets (Sheffield et al., 2012), it can be difficult to get coherent conclusions regarding long-term changes, particularly for large regions or the globe (Hartmann et al., 2013).

Fig. 5. Annual counts of Atlantic hurricanes. The unadjusted data are from HURDAT (Jarvinen et al., 1984; Landsea et al., 2004) and the adjusted data are based on the method described in Vecchi and Knutson (2011). The 1880-1943 (red), 1944-1965 (yellow), and 1966-2014 periods are shaded to indicate the different observing methods used during those periods as discussed in the text. Data are available at http://www3.epa.gov/climatechange/science/indicators/weather-climate/cyclones.html.

Attribution of trends in extreme temperature and precipitation
Attribution is generally defined as the evaluation of the contribution of multiple potential causes to a change in the climate system or to a climate event, with accompanying statistical confidence (Hegerl et al., 2010). "Detection and Attribution", or D&A, is often used as a single phrase to describe a specific type of analysis of trends in climatic variables. In such analyses, a detected signal in the observations is sought in climate model simulations. Typically, the first step searches via statistical techniques for a spatially and/or temporally equivalent signal in an ensemble of climate model integrations constructed to realistically simulate the climate system with best estimates of observed external forcing factors. The forcing factors usually include a combination of anthropogenic factors (e.g. greenhouse gases, sulfate and carbonaceous aerosols, ozone, land use, etc.) and natural factors (e.g. solar variations, volcanoes). If the signal is not found in these simulations, then either the observed signal is a natural one and washed out in the chaos of the ensemble of model simulations, or the models are not fit for the purpose of replicating the observations. If the signal is found in the ensemble of realistic simulations, the next step is to analyze a counterfactual ensemble of simulations. To attribute the signal to human activities, one or more of the anthropogenic forcing factors is held stationary at some estimated preindustrial value in this second set of simulations. If the intended signal is then not found in these simulations, the trend is attributed to human activities.
This general approach is illustrated in Fig. 6, which compares model and observed trends of the annual maximum value of daily maximum temperature (Txx) and the annual minimum value of the daily minimum temperature (Tnn) for the coterminous United States for the 50-yr period 1956-2005. The Txx and Tnn trends were calculated for simulations from CMIP5 global climate models. Two sets of simulations were analyzed: (1) those driven with both natural and anthropogenic historical forcings ("Hist"; the factual set), and (2) those driven with only natural historical forcings ("Nat"; the counterfactual set). There were 78 Hist simulations from 29 CMIP5 models and 35 Nat simulations from 15 CMIP5 models. Trends were calculated using linear least squares regression. All of the historical simulation trends (red) are positive (warming) for both Txx and Tnn. The two distributions are distinct, but there is considerable overlap. About 37% and 54% of the Hist trends are within the envelope of the Nat trends for Txx and Tnn, respectively. Observed trends were calculated from long-term stations in the Global Historical Climatology Network-Daily (GHCND) data set (Menne et al., 2012). Stations were selected based on minimal missing data, specifically less than 5% missing daily temperature data for 1956-2005; a total of 2324 stations met this criterion. Time series of Txx and Tnn were calculated for each individual station. Then, average time series for 1° by 1° grid boxes were calculated. Finally, national time series were calculated by averaging the grid box time series. The observed Txx trend of +0.01°C/decade is not statistically significant and is within the range of both the Hist and Nat trends. By contrast, the observed Tnn trend of +0.56°C/decade is highly statistically significant (p < 0.01), on the high end of the range of the Hist trends, and well outside the distribution of the Nat simulations.
The conclusion from this analysis is that the observed Tnn trend is externally forced. However, the lack of an observed Txx trend would appear to be an ambiguous message. It is not inconsistent with external forcing (since a number of Hist simulations show little trend), but little more can be said at this spatial scale. However, trends in any variable at continental scales should be interpreted in the larger context of changes at the global scale. Statistically significant global increases in both Txx and Tnn are seen in the HadEX observations (Brown et al., 2008). The lack of a Txx trend is another facet of the local "warming hole", the lack of summer warming in much of the central and eastern U.S., which is most apparent in daytime maximum temperatures. The summer (June-August) mean daily maximum temperature trend for the coterminous U.S. for 1956-2005 is a statistically insignificant 0.04°C/decade. This lack of a detectable change in both average and daily maximum temperatures reveals that the human signal has not yet emerged out of the internal variability at this spatial scale. The very large trend in Tnn must also be interpreted in this context. Natural variations may be enhancing, rather than canceling, the human-induced increase in Tnn. In both cases, internal variability at the scale of the coterminous United States is significantly larger than at the global scale, making robust detection and attribution statements more difficult.
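The comparison of factual and counterfactual trend distributions described above can be sketched as follows. The example uses synthetic ensembles, not CMIP5 or GHCND data; the function names `decadal_trends` and `fraction_within_envelope` are hypothetical, and the "envelope" is assumed here to mean the range from the minimum to the maximum of the Nat trends.

```python
import numpy as np

def decadal_trends(years, series):
    """Least-squares linear trend per decade for each row of series (n_sim, n_years)."""
    years = np.asarray(years, dtype=float)
    series = np.atleast_2d(np.asarray(series, dtype=float))
    slopes_per_year = np.polyfit(years, series.T, 1)[0]   # one slope per simulation
    return 10.0 * slopes_per_year

def fraction_within_envelope(trends, reference):
    """Fraction of trends lying between the min and max of the reference trends."""
    trends = np.asarray(trends, dtype=float)
    reference = np.asarray(reference, dtype=float)
    inside = (trends >= reference.min()) & (trends <= reference.max())
    return float(inside.mean())

# Synthetic ensembles: noise only for "Nat", noise plus a forced trend for "Hist".
rng = np.random.default_rng(7)
years = np.arange(1956, 2006)
forced = 0.03 * (years - years[0])                        # assumed forced signal
nat = rng.normal(0.0, 1.0, size=(35, years.size))
hist = rng.normal(0.0, 1.0, size=(78, years.size)) + forced
obs = rng.normal(0.0, 1.0, size=years.size) + forced      # stand-in for observations

nat_tr, hist_tr = decadal_trends(years, nat), decadal_trends(years, hist)
obs_tr = decadal_trends(years, obs)[0]

print("fraction of Hist trends inside Nat envelope:",
      fraction_within_envelope(hist_tr, nat_tr))
print("observed trend inside Nat envelope:",
      bool(nat_tr.min() <= obs_tr <= nat_tr.max()))
print("observed trend inside Hist envelope:",
      bool(hist_tr.min() <= obs_tr <= hist_tr.max()))
```

An observed trend inside the Hist range but outside the Nat range would correspond to the Tnn case above, while a trend inside both ranges corresponds to the ambiguous Txx case.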
Although the human influence on the annual or seasonal average of many different climatic fields has been detected and attributed, the few D&A studies of trends in extremes are limited to the tails in the distribution of short-term averages of precipitation and temperature. One of the only attributions of observed trends in extreme precipitation to human changes in atmospheric GHG concentrations was shown by Min et al. (2011). This classical D&A study draws on linear principal component analysis (PCA) based optimal fingerprinting techniques that have been extensively used in attribution studies of the changes in more averaged quantities (Allen and Tett, 1999; Allen and Stott, 2003). Observed changes in annual maximum daily and pentadal precipitation based on the HadEX (Hadley Centre global land-based gridded climate extremes) data set (Alexander et al., 2006) are analyzed from 1951-2003. Although observations are limited to the Northern Hemisphere and are sparse in many regions, they find that heavy precipitation events have intensified over 2/3 of those observations. Comparing the set of CMIP3 models that provided daily precipitation in the 20th century anthropogenic and natural forcing (20c3m), the 20th century anthropogenic forcing only (ant) and the 20th century natural forcing (nat) experiments, they find that the observed increases in extreme precipitation can be attributed to human causes. Furthermore, they find that the magnitude of the simulated increases is significantly smaller than observed, suggesting that actual future extreme precipitation increases may be larger than models suggest.
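For orientation, the optimal fingerprinting regression mentioned above is commonly written in the generic form below. This is a textbook-style sketch and not necessarily the exact configuration used by Min et al. (2011); the symbols y, x_i, β_i, u, X, and C are notation introduced here for illustration.

```latex
\mathbf{y} = \sum_{i=1}^{m} \beta_i \, \mathbf{x}_i + \mathbf{u},
\qquad
\hat{\boldsymbol{\beta}} =
\left( \mathbf{X}^{\mathsf{T}} \mathbf{C}^{-1} \mathbf{X} \right)^{-1}
\mathbf{X}^{\mathsf{T}} \mathbf{C}^{-1} \mathbf{y}
```

Here y is the observed pattern of change, x_i is the model-simulated response to forcing i (the columns of X), u is internal variability with covariance C, typically estimated from control simulations, and β_i are scaling factors. A signal is usually considered detected when the confidence interval of its scaling factor excludes zero. A scaling factor significantly greater than one indicates that the simulated response is weaker than the observed change, which corresponds to the finding for extreme precipitation described above.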
Using a similar optimal fingerprinting technique, Min et al. (2013) examined indices of annual maximum daily high (Txx) and low (Tnx) temperatures as well as annual minimum daily high (Txn) and low (Tnn) temperatures using the multi-model CMIP3 ensemble. In mid-latitude and polar regions, these indices can be interpreted as measures of summer and winter high temperature extremes and summer and winter low temperature extremes respectively. Observed increases from 1951-2000 are larger in the winter than in the summer in both hemispheres. In this study, detection and attribution statements were attempted for both 5 and 10 year means of these extreme temperature indices both globally and for 5 continental scale regions. Globally, the natural and anthropogenic effects are not only attributable together but can also be separately identified for all 4 indices, reflecting the robustness of the forced signal relative to internal chaotic noise. On the continental scale, North America exhibited the most robust D&A results. Consistent with the relative magnitude of the observed changes, winter extreme temperature increases were somewhat more robustly attributable than their summer equivalents. More recent analysis using the CMIP5 models (Kim et al., 2015) shows that the newer generation of models replicates the magnitude of observed trends in these 4 metrics of extreme temperatures significantly better than the previous generation (Shiogama et al., 2006; Christidis et al., 2011; Min et al., 2013). The removal of two large modes of natural variability, the Arctic and Pacific Decadal Oscillations, further strengthened the robustness of the attribution result. Christidis et al. (2013) further examined the trend in these extreme temperature indices, attributing part of the observed changes in warm days and nights to land use changes.
In the tropics, deforestation further increases extreme high temperatures over that due to GHG increases by reducing soil moisture and hence cooling from evapotranspiration. In the mid-latitudes, the reduction in evaporative cooling has less effect than the increase in surface albedo, resulting in a slight reduction in the increase in extreme high temperatures. However, while such a top of canopy analysis is consistent with how the atmosphere interacts with the land surface, the impact of deforestation on the extremely hot temperatures actually experienced by people and animals would not likely be diminished by the replacement of forests by grassland due to the decrease in available shade.
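As a reminder of how the four annual extreme temperature indices discussed above are defined, the following minimal sketch computes them from one year of daily data and then forms non-overlapping multi-year means of the kind used in the detection analysis. The function names and the sample values are hypothetical; real analyses work on quality-controlled gridded or station data.

```python
def extreme_temperature_indices(tmax, tmin):
    """Return the four annual indices from one year of daily data.

    tmax, tmin: sequences of daily maximum and minimum temperature.
    """
    return {
        "Txx": max(tmax),  # annual maximum of the daily high temperature
        "Tnx": max(tmin),  # annual maximum of the daily low temperature
        "Txn": min(tmax),  # annual minimum of the daily high temperature
        "Tnn": min(tmin),  # annual minimum of the daily low temperature
    }

def block_means(series, width):
    """Average consecutive non-overlapping blocks of `width` years."""
    n_blocks = len(series) // width
    return [sum(series[i * width:(i + 1) * width]) / width
            for i in range(n_blocks)]

# Made-up daily values (deg C) standing in for a full year
tmax = [2.0, 11.5, 24.0, 33.5, 18.0, -1.0]
tmin = [-6.0, 3.0, 14.5, 21.0, 9.0, -9.5]
print(extreme_temperature_indices(tmax, tmin))

# Made-up annual Txx values reduced to 5-year means
annual_txx = [33.5, 34.0, 32.8, 35.1, 34.6, 35.0, 35.4, 34.9, 36.0, 35.7]
print(block_means(annual_txx, 5))
```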

Probabilistic extreme event attribution
As Section 3 above describes, detectable trends in certain measures of extreme weather can be attributed to human activities. As individual extreme weather events can have devastating impacts, it is natural to ask to what extent anthropogenic climate change is responsible for the damages. Inherent to this type of question is the causality of specific weather events. However, the climate is a chaotic system and the causal factors behind specific weather events can rarely be described in deterministic terms. Hence, it is often stated in the popular press after a notable extreme weather event that nothing can be said about the role of climate change in that particular event, with the caveat that such events can be expected to become more common in the future. This statement is most often patently false, for much can be said about the effect of climate change on many recent extreme weather events in a probabilistic formalism. The rapidly emerging field of Probabilistic Extreme Event Attribution has quantified the effect of climate change on a wide variety of extreme weather (for instance see Peterson et al., 2012, 2013; Herring et al., 2014).
While there are many different approaches to event attribution, we review two alternative, yet complementary approaches here. The earliest efforts to assess the human influence on extreme weather events consider the change in probability, based on climate model simulations of the "world that was" compared to simulations of the "world that might have been" had humans not interfered with the climate system. This "Oxford" school of thought was first proposed by Allen (2003) and implemented by Stott, Stone and Allen (2004) to describe the summer 2003 European heat wave. The chance of this particularly deadly heatwave, associated with up to 70,000 excess deaths (Robine et al., 2008), was found to be at least doubled by anthropogenic changes to the climate system.
In such analyses, the change in probability due to climate change is often expressed as a ratio, originally referred to as a "risk ratio" but more precisely termed a "probability ratio" (Fischer and Knutti, 2015),

$$\mathrm{PR} = P_{\mathrm{real}} / P_{\mathrm{nat}} \qquad (1)$$

where $P_{\mathrm{real}}$ is the probability of a simulated event of the observed magnitude in the "world that was" simulations and $P_{\mathrm{nat}}$ is its probability in the "world that might have been" simulations. The alternative "fractional attributable risk" (FAR),

$$\mathrm{FAR} = 1 - P_{\mathrm{nat}} / P_{\mathrm{real}} \qquad (2)$$

is often used in epidemiology and environmental law (Stone and Allen, 2005) to define how much of a risk is due to a particular forcing agent. Both types of attribution statements are conditioned by the assumptions of the climate model simulations. In the original Stott et al. (2004) study, CMIP3 simulations from the fully coupled ocean-atmosphere HadCM3 model were used. Using this class of models, the attribution statements based on Eqs. (1) and (2) are conditional only on the changes in the external anthropogenic forcing agents, usually atmospheric greenhouse gas, sulfate aerosol and ozone concentrations and sometimes land use changes.
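The two measures can be estimated directly from a pair of ensembles by counting threshold exceedances. The sketch below is a minimal illustration, taking PR as the ratio of the factual to the counterfactual exceedance probability and FAR as one minus the inverse of that ratio. The function names, the ensemble values and the threshold are all made up, and the ensembles are far smaller than the very large ensembles used in practice.

```python
import math

def exceedance_probability(ensemble, threshold):
    """Fraction of ensemble members at or above the event threshold."""
    return sum(1 for x in ensemble if x >= threshold) / len(ensemble)

def probability_ratio(p_real, p_nat):
    """PR = P_real / P_nat; unbounded if the event never occurs in 'nat'."""
    if p_nat == 0:
        return math.inf if p_real > 0 else math.nan
    return p_real / p_nat

def fraction_attributable_risk(p_real, p_nat):
    """FAR = 1 - P_nat / P_real (undefined when P_real is zero)."""
    if p_real == 0:
        raise ValueError("FAR is undefined when P_real is zero")
    return 1.0 - p_nat / p_real

# Made-up seasonal mean temperatures (deg C) from two small ensembles
world_that_was = [24.1, 25.3, 26.8, 27.2, 25.9, 26.1, 24.7, 27.5, 25.0, 26.4]
world_that_might_have_been = [23.5, 24.8, 26.2, 25.1, 24.0, 25.6, 23.9, 26.5, 24.4, 25.2]
threshold = 26.0  # hypothetical observed event magnitude

p_real = exceedance_probability(world_that_was, threshold)
p_nat = exceedance_probability(world_that_might_have_been, threshold)
print(f"PR = {probability_ratio(p_real, p_nat):.2f}")
print(f"FAR = {fraction_attributable_risk(p_real, p_nat):.2f}")
```

With these illustrative numbers, five of ten factual members and two of ten counterfactual members exceed the threshold, giving PR = 2.5 and FAR = 0.6.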
The technique has subsequently been refined by using stand-alone atmospheric models with prescribed sea surface temperatures (SST) and prescribed sea ice extent (Pall et al., 2011; Folland et al., 2014). In such analyses, these surface boundary conditions are derived from actual observations to represent the "world that was". For the counterfactual "world that might have been", the SST and sea ice extent are obtained by subtracting out an estimate of anthropogenic change in those fields. This can be straightforwardly obtained by the linear PCA techniques used in typical D&A analyses (Allen and Tett, 1999; Allen and Stott, 2003). As there are multiple CMIP3/5 models that are suitable for this task, there can be multiple, equally valid, formulations of the counterfactual simulations. This provides one source for estimating the structural uncertainty in PR and FAR based attribution statements. Another source of structural uncertainty is currently being explored in the coordinated C20C+ experiment (Folland et al., 2014) that involves multiple international modeling groups following this atmospheric model based event attribution protocol. In this and related experiments (for instance see http://www.climateprediction.net/weatherathome/), additional conditions are associated with any resulting attribution statements. Principal among these is that the influence of the particular state of the ocean on the simulated probability of the event in question can be quantified. For instance, if the event occurred in a year with a large ENSO event, the SST pattern in the "world that might have been" also has a large ENSO event in that year, albeit cooled by the removal of the human contribution to global warming. Hence, conditions related to the state of the ocean at the time of the event can be incorporated into the attribution statement.
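The construction of the counterfactual boundary conditions can be sketched in a few lines. The example below assumes the estimated anthropogenic SST change is already available on the same grid as the observations, here flattened to a short list of grid cells. The model labels, the function name and all values are hypothetical, and sea ice is omitted for brevity.

```python
def counterfactual_sst(observed, anthropogenic_delta):
    """Subtract an estimated anthropogenic warming pattern, cell by cell."""
    if len(observed) != len(anthropogenic_delta):
        raise ValueError("fields must be on the same grid")
    return [o - d for o, d in zip(observed, anthropogenic_delta)]

# Made-up observed SST (deg C) for four grid cells in an El Nino year;
# the third cell carries the warm ENSO anomaly
observed = [28.9, 27.4, 29.6, 26.1]

# Made-up warming estimates from two different coupled models; each one
# yields an equally valid counterfactual
delta_estimates = {
    "model_A": [0.6, 0.5, 0.7, 0.4],
    "model_B": [0.8, 0.7, 0.8, 0.6],
}

counterfactuals = {
    name: counterfactual_sst(observed, delta)
    for name, delta in delta_estimates.items()
}
for name, field in counterfactuals.items():
    print(name, [round(v, 2) for v in field])
```

Because only the estimated warming is removed, the third cell remains the warmest in every counterfactual field, which is how the ENSO state of the event year is retained. The spread across the counterfactual fields is one contribution to the structural uncertainty of the resulting attribution statement.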
This separation of natural and anthropogenic contributions to extreme events provides some of the motivation behind the alternative "Boulder" school of thought approach to event attribution. This approach begins with a careful deconstruction of the local and large-scale meteorology responsible for the event in the context of observed and simulated trends in the region of interest. The first of these studies examined another deadly heatwave, this time during summer 2010 in Russia (Dole et al., 2011), and was followed by an analysis of the 2011 Texas heatwave and drought. Quantifying the natural and anthropogenic contributions to the magnitude of extreme events provides another perspective on event attribution that may appear to be at odds with the frequency based perspective described above. Because of the asymptotic shape of the tails of the probability distributions in question, small changes in magnitude may result in large changes in frequency (Otto et al., 2012). Hence, it may appear that natural factors play a larger role than anthropogenic factors for some extreme events. However, these factors are not additive but multiplicative. More recently, Trenberth et al. (2015) have argued that anthropogenic changes in the thermodynamics of the climate system are much larger than changes in atmospheric dynamics, and concluded that statements about the human influence on the magnitude of extreme events are more practical and societally relevant than risk attribution statements. Further methodological details of probabilistic extreme event attribution are discussed in Pall et al. (2015).
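The apparent tension between the magnitude and frequency perspectives can be illustrated with a toy calculation. The sketch assumes, purely for illustration, that the variable is normally distributed with unit standard deviation and that the anthropogenic influence is a uniform shift of the mean. Neither assumption is taken from the studies cited above, and real extremes are usually modeled with extreme value distributions.

```python
import math

def normal_exceedance(threshold, mean, sd):
    """P(X > threshold) for a normal distribution."""
    return 0.5 * math.erfc((threshold - mean) / (sd * math.sqrt(2.0)))

threshold = 3.0  # event magnitude, in natural standard deviations
shift = 0.5      # assumed anthropogenic shift of the mean

p_nat = normal_exceedance(threshold, 0.0, 1.0)
p_real = normal_exceedance(threshold, shift, 1.0)

magnitude_share = shift / threshold  # anthropogenic share of the magnitude
pr = p_real / p_nat                  # change in the probability of the event

print(f"share of magnitude: {magnitude_share:.0%}")
print(f"probability ratio:  {pr:.1f}")
```

Under these assumptions the shift accounts for about one sixth of the event magnitude, yet the probability of exceeding the threshold increases by a factor of roughly 4.6.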
Event attribution studies are principally based on climate model experiments. Unlike the formal D&A studies performed at near global scales discussed in Section 4, explicit detection exercises are not always performed on the observations alone. However, some recent studies (e.g., King et al., 2015) have added a detection step at the spatial scales of interest to individual events. This technique uses a peaks over threshold extreme value formalism with a time dependent covariate. King et al. (2015) examined annual mean Central England temperatures using both a large ensemble of atmospheric models and this observationally based method and found that the increased risk for the record 2014 temperatures was entirely consistent between the two approaches.
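To give a flavor of a peaks over threshold fit with a covariate, the sketch below makes two strong simplifications that should not be read as the method of the cited studies: excesses over the threshold are taken to be exponentially distributed (a generalized Pareto distribution with zero shape parameter), and only the scale depends on the covariate, as exp(a + b x). The threshold itself and the rate of threshold crossings are held fixed. All function names and data are hypothetical.

```python
import math
import random

def log_likelihood(a, b, covariate, excess):
    """Log-likelihood of exponential excesses with scale exp(a + b * x)."""
    return sum(-(a + b * x) - y * math.exp(-(a + b * x))
               for x, y in zip(covariate, excess))

def fit_nonstationary_scale(covariate, excess, max_iter=100, tol=1e-10):
    """Maximum-likelihood estimate of (a, b) by damped Newton iteration."""
    a, b = math.log(sum(excess) / len(excess)), 0.0
    for _ in range(max_iter):
        # r_i is the excess divided by the current scale at that point
        r = [y * math.exp(-(a + b * x)) for x, y in zip(covariate, excess)]
        g_a = sum(ri - 1.0 for ri in r)
        g_b = sum(x * (ri - 1.0) for x, ri in zip(covariate, r))
        # Entries of the negative Hessian (positive definite if x varies)
        h_aa = sum(r)
        h_ab = sum(x * ri for x, ri in zip(covariate, r))
        h_bb = sum(x * x * ri for x, ri in zip(covariate, r))
        det = h_aa * h_bb - h_ab * h_ab
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        # Halve the step while it would lower the likelihood
        scale = 1.0
        current = log_likelihood(a, b, covariate, excess)
        while (log_likelihood(a + scale * step_a, b + scale * step_b,
                              covariate, excess) < current and scale > 1e-8):
            scale /= 2.0
        a, b = a + scale * step_a, b + scale * step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b

def exceedance_ratio(a, b, level, x_now, x_then):
    """Ratio of P(excess > level) at covariate x_now versus x_then."""
    p_now = math.exp(-level / math.exp(a + b * x_now))
    p_then = math.exp(-level / math.exp(a + b * x_then))
    return p_now / p_then

# Synthetic excesses whose scale grows with a centered time covariate
rng = random.Random(1)
years = list(range(-30, 31))
excesses = [rng.expovariate(1.0 / math.exp(0.5 + 0.03 * x)) for x in years]

a_hat, b_hat = fit_nonstationary_scale(years, excesses)
print(f"a = {a_hat:.2f}, b = {b_hat:.3f}")
print(f"ratio = {exceedance_ratio(a_hat, b_hat, 2.0, 30, -30):.2f}")
```

A positive fitted slope b corresponds to a detected increase in the size of the excesses with the covariate, and the exceedance ratio plays the same role as the probability ratio obtained from model ensembles.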
Uncertainties in the attributed changes in risk (Oxford School) or magnitude (Boulder School) of extreme events arise from several sources. First, observational uncertainties in the actual magnitude of an event can be significant. The median estimate of the risk ratio can be sensitive to this uncertainty, although the lower bound of a confidence interval can be less sensitive (Jeon et al., 2015). Second, while model simulations of the "world that was" are constrained by observations, the counterfactual simulations, especially those using atmosphere-only models, are highly determined by estimates of the climate sensitivity to external human forcing agents. Third, given a certain prescribed set of boundary condition changes, the extreme weather response of different models also varies. A complete attribution statement must address each of these sources of uncertainty.
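One simple way to attach a confidence interval to a probability ratio is to resample the ensemble members with replacement. The sketch below is only an illustration of sampling uncertainty for a single model and a single threshold; it does not address the observational, counterfactual or structural uncertainties listed above, and it is not necessarily the procedure used in the cited studies. The function names and the synthetic ensembles are hypothetical.

```python
import math
import random

def ensemble_pr(real, nat, threshold):
    """Probability ratio estimated from exceedance counts in two ensembles."""
    p_real = sum(x >= threshold for x in real) / len(real)
    p_nat = sum(x >= threshold for x in nat) / len(nat)
    if p_nat == 0:
        return math.inf if p_real > 0 else math.nan
    return p_real / p_nat

def bootstrap_interval(real, nat, threshold, n_boot=2000, alpha=0.10, seed=0):
    """Percentile bootstrap interval for the probability ratio."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_boot):
        r = ensemble_pr(rng.choices(real, k=len(real)),
                        rng.choices(nat, k=len(nat)), threshold)
        if not math.isnan(r):  # drop resamples with no events in either world
            ratios.append(r)
    if not ratios:
        raise ValueError("no usable bootstrap resamples")
    ratios.sort()
    lower = ratios[int((alpha / 2) * (len(ratios) - 1))]
    upper = ratios[int((1 - alpha / 2) * (len(ratios) - 1))]
    return lower, upper

# Synthetic ensembles: the factual world is shifted warm by 0.8 units
rng = random.Random(42)
real = [rng.gauss(0.8, 1.0) for _ in range(400)]
nat = [rng.gauss(0.0, 1.0) for _ in range(400)]

best = ensemble_pr(real, nat, threshold=1.5)
lower, upper = bootstrap_interval(real, nat, threshold=1.5)
print(f"PR = {best:.2f}, 90% interval [{lower:.2f}, {upper:.2f}]")
```

Repeating the calculation for a range of plausible thresholds is one way to explore how the best estimate and the lower bound respond to uncertainty in the observed event magnitude.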
Probabilistic extreme event attribution is a rapidly developing field of inquiry with a number of different, if not opposing, views. In addition to the scientific issues raised by such analyses, there are important social, legal and ethical issues that arise. While outside the scope of this review paper, such matters are discussed at length by Hulme (2014). Furthermore, while the number of events studied has been greatly enlarged by the short essays in the Bulletin of the American Meteorological Society (BAMS) supplements (see Herring et al., 2014, 2015), which range from cold snaps to heat waves and droughts to floods, the events analyzed largely occurred in the industrialized countries where the authors lived. Developing nations are, of course, not immune to the effects of climate change on extreme weather; the impacts there are often more severe and are beginning to draw the attention of the scientific community, as shown by the most recent version of the BAMS supplements (Herring et al., 2015). Systematic or comprehensive global analyses of individual extreme events have not yet been performed. Forecasts of changes in the seasonal risk of extreme events, performed in advance using existing seasonal forecasts of sea surface temperatures and sea ice extent, are currently possible. Assessment of the skill of such forecasted risk would increase confidence in extreme event attribution statements.

Future directions and opportunities
High performance computing has recently enabled a new class of climate models that can simulate extreme weather significantly better in multi-decadal integrations than the CMIP5 models. Although this is still a rapidly developing field, global atmospheric models at resolutions of 25 km can produce tropical cyclones and other intense storms (Walsh et al., 2015). Hence, simulated long period return values of extreme precipitation more closely represent observations (Wehner et al., 2014). The station based raw data that make up observational databases will not significantly improve in quality or coverage in the near future. Uncertainties in gridded observational products are large, even over the well-observed North America and Europe (Wehner et al., 2014). Satellite observations are beginning to have long enough periods of record to be useful in detection/attribution research. However, satellites have their own issues that require much post-processing to correct for biases owing to problems such as orbital drift and changes in instrumentation.
There are efforts underway to improve coverage in regions of the globe that are underrepresented in global data sets. The World Meteorological Organization (WMO) has an Expert Team on Data Rescue (ET-DARE) and is cooperating in a number of data rescue initiatives to help digitize and make available observations that remain in the archives of national meteorological services. In time these kinds of efforts will help fill in data coverage gaps in many parts of the world and should be encouraged.
Advances in extreme value statistical techniques are enabling more robust estimation of long period uncertainties through the use of physical covariates (Sillman et al., 2011). Such reductions in the uncertainty of the statistical fits will likely improve both detection and attribution of changes in extreme temperature and precipitation.
There is a demand from the journalism community for more rapid assessment of the human contribution to specific weather events while their memory is fresh in the public's mind. The Worldwide Weather Attribution project (http://www.climatecentral.org/wwa) is one such effort. This partnership between Climate Central, the University of Oxford Environmental Change Institute (Oxford ECI), the Royal Netherlands Meteorological Institute (KNMI), the University of Melbourne, and the Red Cross Red Crescent Climate Centre (the Climate Centre) is operationalizing attribution statements for certain classes of extreme weather events. Their first operational statement concluded that they are "virtually certain that the heat wave that stretched across much of Europe in early July (2015) was more likely to happen now than in the past due to climate change" (http://www.climatecentral.org/europe-2015-heatwave-climatechange). The statement was made on July 10 during the actual heat wave. The project team came to their conclusion using two methods. The first used the very large ensemble technique described in Section 4. The second used an empirical trend detection technique based on extreme value statistical techniques (King et al., 2015). Detailed statements about specific European cities were also made. In this case, the scientific rationale behind the attribution statements is well established. For other classes of events, particularly strong storms such as hurricanes or mesoscale convective systems, the effect of a warmer climate on event statistics is less well understood. Furthermore, CMIP5 class modeling systems may not be "fit for purpose" to analyze such events, requiring that specialized simulations be performed after the event occurrence. Performing and interpreting such custom simulations may well take longer than the news cycle (typically a few weeks at most).

Conclusions
This paper has provided an overview of detection and attribution as it relates to extreme events. There is increasing interest by many sectors of society for information about causes of extreme events, particularly if the cause can be linked to human-induced climate change. Furthermore, attribution studies of extremes need to begin considering the impacts of extremes by evaluating exposure and vulnerability in a risk-based framework (Cardona et al., 2012).
The science of attribution, particularly event attribution, is still emerging and for this information to be useful to a wide range of stakeholders, uncertainties in attribution results need to be assessed and articulated in a way that stakeholders can understand. Since attribution science is dependent on climate modeling results, it is important to continue to assess and improve climate models. Similarly, detection studies must have high quality observational data sets, and improvements in spatial and temporal coverage of longer-term climate data sets are also needed. Lastly, since satellite-based data sets of the climate system are now becoming long enough for use in classical style detection/attribution studies, and have proven useful in event attribution, innovative uses of these data must be pursued.