Projected changes in the frequency of climate extremes over southeast Australia

Most studies evaluating future changes in climate extremes over Australia have examined events that occur once or more each year. However, it is extremes that occur less frequently than this that generally have the largest impacts on sectors such as infrastructure, health and finance. Here we use an ensemble of high resolution (∼10 km) climate projections from the NSW and ACT Regional Climate Modelling (NARCliM) project to provide insight into how such rare events may change over southeast Australia in the future. We examine changes in the frequency of extremes of heat, rainfall, bushfire weather, meteorological drought and thunderstorm energy by the late 21st century, focusing on events that currently occur once every 20 years (those with a 5% Annual Exceedance Probability). Overall the ensemble suggests increases in the frequency of all five extremes. Heat extremes exhibit the largest change in frequency and the greatest ensemble agreement, with current 1-in-20 year events projected to occur every year in central Australia and at least every 5 years across most of southeast Australia, by the late 21st century. The five capital cities included in our model domain are projected to experience multiple climate extremes more than twice as frequently in the late 21st century, with some cities projected to experience 1-in-20 year events more than six times as frequently. Although individual simulations show decreases in some extremes in some locations, there is no strong ensemble agreement for a decrease in any of the climate extremes over any part of southeast Australia. These results can support adaptation planning and should motivate further research into how extremely rare events will change over Australia in the future.


Introduction
The southeast of Australia is the most densely populated region of the country and is exposed to multiple climate extremes that pose significant risks to communities and infrastructure. Research shows that many of the extremes affecting southeast Australia are likely to increase under climate change. For instance, extreme heat is projected to increase in duration, frequency and intensity (Argüeso et al 2015, Perkins-Kirkpatrick et al 2016, Gross et al 2017, Herold et al 2018). Extremes in both the surface and pyroconvective components of bushfire weather are projected to intensify over much of the region (Clarke and Evans 2019). Projections also indicate an increase in bushfire fuel load (Clarke et al 2016) and seasonal changes in burn windows (Evans 2019, Di Virgilio et al 2020). Drought intensity and frequency are projected to increase over some (Ukkola et al 2020) or most (Kirono et al 2020) of southeast Australia. Rainfall extremes are projected to intensify (Bao et al 2017), as are the large-scale atmospheric conditions conducive to thunderstorm development (Allen et al 2014). Conversely, East Coast Lows, which can bring strong winds and heavy precipitation to the southeast coast of Australia, are projected to occur less often on average, though this result depends on seasonality and the type of East Coast Low (Ji et al 2015, Pepler et al 2016).
While there have been multiple efforts to investigate extremes under climate change in southeast Australia, these studies have generally assessed 'moderate' extremes, which occur once or more each year (e.g. Clarke and Evans 2019, Gross et al 2017, Herold et al 2018). Here we apply extreme value theory to the dynamically downscaled NSW and ACT Regional Climate Modelling ensemble (NARCliM; Evans et al 2014) to evaluate the change in frequency of events that currently occur once every 20 years (hereafter '1-in-20 year events', also commonly referred to as events with a 5% Annual Exceedance Probability) for five climate extremes by the late 21st century. Extreme value theory has previously been applied to projections of temperature and precipitation extremes from global climate models and indicates increases in the intensity of 1-in-20 year events over Australia under multiple emission scenarios (Kharin et al 2007, Perkins et al 2009, Kharin et al 2013). We extend these studies by focusing on multiple climate extremes and analysing regional simulations run at a much higher resolution. Our results provide information relevant to risk and adaptation planning and should motivate further research into projections of extremely rare events across Australia. The five climate extremes assessed here relate to heat, rainfall, bushfire, drought and thunderstorms. While this is not an exhaustive list of extremes impacting southeast Australia, they are responsible for significant economic costs or fatalities. For example, NSW and Victoria, the two most populous Australian states, experienced approximately $3 billion (2007 $AUD) in damage from storms and bushfires over 2007 (Deloitte 2017, Coates et al 2014). The record-breaking 2019-20 bushfire season alone is estimated to have cost Australia $1.95 billion (2018 $AUD) due to smoke-related illnesses and deaths (Johnston et al 2020).
The five extremes examined here were chosen because they or their proxies are directly represented by the meteorological output of climate models. Other extremes of significance to southeast Australia, such as flood and air pollution, require a hierarchy of numerical models to simulate under climate change, and will be the focus of future work by the NSW Department of Planning, Industry and Environment (DPIE).
The next section of this paper describes the climate data used, the chosen definitions of our five climate extremes and the extreme value statistics applied. Section 3 evaluates NARCliM's ability to capture 1-in-20 year events and presents the projected frequency of these events under climate change; section 4 discusses these results and their caveats; and section 5 concludes the study.

NARCliM regional climate projections
The NARCliM climate projections consist of four global climate models (GCMs) from phase three of the Coupled Model Intercomparison Project (CMIP3), dynamically downscaled by three configurations of the Weather Research and Forecasting (WRF; Skamarock et al 2008) regional climate model (RCM). The four GCMs (MIROC3.2, ECHAM5, CGCM3.1 and CSIRO-MK3.0) were selected based on their performance over Australia, the independence of their errors and their ability to span the full range of potential future climates (Evans et al 2014). The WRF configurations differ by their choice of boundary layer physics, surface layer physics, cumulus physics, micro-physics and radiation scheme, and were selected based on their performance and the independence of their errors. Thus the 12 different GCM-RCM combinations efficiently sample the uncertainty in future climate conditions. Simulations were performed over two domains: the CORDEX Australasian domain at a ∼50 km resolution and a nested southeast Australian domain at ∼10 km resolution (figure 1 of Evans et al 2014). The latter domain is the focus of this study. The CMIP3 GCMs were forced with the Special Report on Emissions Scenarios (SRES) A2 scenario, which closely matches the radiative forcing of Representative Concentration Pathway 8.5 (RCP8.5) and leads to a projected global warming of approximately 4°C above pre-industrial temperatures by the end of the 21st century (Rogelj et al 2012). Three time periods were simulated: 1990-2009, 2020-2039 and 2060-2079. In this study we compare the climates of 1990-2009 (hereafter the 'recent past') and 2060-2079 (hereafter the 'future'), excluding 2020-2039 to focus on the largest possible climate change signal. While CORDEX simulations driven by the newer CMIP5 are available for Australasia (e.g. Evans et al 2020), we use NARCliM because our area of interest is southeast Australia and NARCliM offers a finer spatial resolution (10 km versus 50 km), which can be important for adaptation.
Furthermore, NARCliM has been thoroughly evaluated in the literature and shown to reproduce multiple aspects of southeast Australian climate well.

Climate extremes
The extremes considered in this study are heat, rainfall, bushfire weather, meteorological drought and thunderstorm energy, and we apply commonly used definitions to characterise each. Annual maxima for each extreme are calculated for each ensemble member and are used as input into our return period calculations (next section). Thus, for each model grid cell, each ensemble member provides 20 values for the recent past and future.
For extreme heat and rainfall we use the TXx and Rx1day indices, respectively (Zhang et al 2011). TXx represents maximum daytime temperature while Rx1day represents maximum single-day rainfall.
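As a minimal sketch of how the annual-maximum indices above could be computed, the following reduces a daily series to one value per year. The input layout (a list of `(year, month, day)` tuples alongside a list of values) and the function name `annual_maxima` are assumptions made for illustration and do not describe the NARCliM processing chain.

```python
from collections import defaultdict

def annual_maxima(dates, values):
    """Return {year: maximum value} for a daily series.

    dates  -- iterable of (year, month, day) tuples (hypothetical layout)
    values -- daily values, e.g. daily maximum temperature for TXx
              or daily rainfall totals for Rx1day
    """
    maxima = defaultdict(lambda: float("-inf"))
    for (year, _month, _day), value in zip(dates, values):
        if value > maxima[year]:
            maxima[year] = value
    return dict(sorted(maxima.items()))

# Toy example: two years with three daily maximum temperatures each (deg C).
dates = [(1990, 1, 1), (1990, 1, 2), (1990, 1, 3),
         (1991, 1, 1), (1991, 1, 2), (1991, 1, 3)]
tmax = [35.2, 41.7, 38.0, 39.9, 36.4, 40.3]
txx = annual_maxima(dates, tmax)
```

Applied to a 20-year simulation, this yields the 20 annual values per grid cell and ensemble member that feed the return period calculations.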
We characterise meteorological drought using the Standardised Precipitation Index (SPI; McKee et al 1993), which is endorsed by the World Meteorological Organisation (Hayes et al 2011). The SPI is calculated on a user-defined n-month time scale. An SPI value is calculated for each month in a time-series and represents the number of standard deviations that the precipitation total for the preceding n-month period lies below or above the median of the base period. The base period climate is represented by a time-series consisting of the same n calendar month totals, after the base period data have been fitted to a distribution (in our case Gamma) and normalised. In this study we calculate the 3-month SPI (n=3) to focus on seasonal droughts.
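The SPI procedure described above can be sketched as follows. This is an illustrative simplification, not the implementation used in this study: it assumes the series starts in January, that the whole series serves as the base period, that all n-month totals are positive, and it fits the Gamma distribution by the method of moments rather than by whatever estimator the original calculation used. The function names `gamma_cdf` and `spi` are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, pvariance

def gamma_cdf(x, shape, scale):
    """Gamma CDF via the power series of the regularised incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / scale
    term, total = 1.0, 1.0
    for k in range(1, 2000):
        term *= z / (shape + k)
        total += term
        if term < 1e-12 * total:
            break
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape + 1.0))

def spi(monthly_rain, n=3):
    """n-month SPI per month; None where fewer than n months are available."""
    size = len(monthly_rain)
    totals = [None] * size
    for i in range(n - 1, size):
        totals[i] = sum(monthly_rain[i - n + 1:i + 1])
    result = [None] * size
    std_normal = NormalDist()
    for cal_month in range(12):
        # All n-month totals ending in the same calendar month.
        idx = [i for i in range(cal_month, size, 12) if totals[i] is not None]
        sample = [totals[i] for i in idx]
        m, v = mean(sample), pvariance(sample)
        shape, scale = m * m / v, v / m  # method-of-moments Gamma fit (assumption)
        for i in idx:
            p = gamma_cdf(totals[i], shape, scale)
            p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep the normal quantile finite
            result[i] = std_normal.inv_cdf(p)
    return result

# Synthetic 20-year monthly rainfall series (mm), for illustration only.
rng = random.Random(1)
monthly = [rng.gammavariate(2.0, 30.0) for _ in range(240)]
spi3 = spi(monthly, n=3)
```

Negative values indicate drier-than-usual 3-month periods relative to the base period, so drought extremes correspond to the annual minima of this index.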
To characterise bushfire weather we use the McArthur Forest Fire Danger Index (FFDI; McArthur 1967), which is also used by the Australian Bureau of Meteorology to forecast fire danger. The FFDI combines surface temperature, 10 m wind velocity, relative humidity and rainfall into a unitless measure. Daily FFDI values were taken from Clarke and Evans (2019) who use the formulations of Noble et al (1980) and Finkele et al (2006).
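For orientation, the FFDI equation of Noble et al (1980), as it is commonly written, can be sketched as below. The drought factor (a value between 0 and 10 through which rainfall enters the index) is taken as a given input here; the function name, argument names and example values are illustrative assumptions, and the exact formulation used by Clarke and Evans (2019) may differ in detail.

```python
import math

def ffdi(temp_c, rel_hum_pct, wind_kmh, drought_factor):
    """McArthur FFDI in the commonly quoted Noble et al (1980) form.

    temp_c         -- air temperature (deg C)
    rel_hum_pct    -- relative humidity (%)
    wind_kmh       -- 10 m wind speed (km/h)
    drought_factor -- fuel dryness term, 0 < DF <= 10 (supplied, not computed here)
    """
    return 2.0 * math.exp(-0.450
                          + 0.987 * math.log(drought_factor)
                          - 0.0345 * rel_hum_pct
                          + 0.0338 * temp_c
                          + 0.0234 * wind_kmh)

# Hot, dry, windy day with fully cured fuel (illustrative values only).
severe_day = ffdi(temp_c=40.0, rel_hum_pct=10.0, wind_kmh=40.0, drought_factor=10.0)
mild_day = ffdi(temp_c=20.0, rel_hum_pct=60.0, wind_kmh=10.0, drought_factor=5.0)
```

The exponential form means the index responds multiplicatively to each input, which is one reason annual maxima of FFDI can change markedly under warming.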
NARCliM simulations were not performed at a spatial resolution capable of resolving the processes involved in thunderstorm generation. Consequently we focus on characterising the amount of energy available to thunderstorms, specifically the Convective Available Potential Energy (CAPE). CAPE represents the amount of energy available in the atmospheric column for convection to occur (measured in J kg⁻¹) and is a fundamental ingredient in thunderstorm generation. CAPE values were calculated on a 3-hourly basis.

Extreme value statistics
To calculate return periods and return levels for each climate extreme we fit the Generalised Extreme Value (GEV) distribution to each grid cell for the recent past and future. Curve fitting was performed with the fevd function from the extRemes package (Gilleland and Katz 2016) in the R programming language (R core team 2017). The method of linear moments was used for estimating the location, shape and scale parameters, as has been recommended for small sample sizes (see Kharin et al 2007 for a discussion).
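The fitting in this study was done with `fevd` in R. Purely as an illustration of what L-moment estimation of GEV parameters involves, the sketch below uses sample L-moments and the widely used approximation of Hosking et al (1985) for the shape parameter. It is not a reimplementation of `extRemes`; the function names are hypothetical, and the shape is returned in the sign convention where a positive value means a heavy upper tail.

```python
import math
import random

def sample_l_moments(data):
    """Return (l1, l2, t3): first two sample L-moments and the L-skewness."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2

def fit_gev_lmom(annual_maxima):
    """Return (location, scale, shape) estimated from L-moments."""
    l1, l2, t3 = sample_l_moments(annual_maxima)
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # Hosking's shape; k = -shape used here
    if abs(k) < 1e-6:  # Gumbel limit
        scale = l2 / math.log(2.0)
        return l1 - 0.5772156649 * scale, scale, 0.0
    g = math.gamma(1.0 + k)
    scale = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    loc = l1 - scale * (1.0 - g) / k
    return loc, scale, -k

# Synthetic check: draw from a GEV with known parameters and refit.
rng = random.Random(0)
true_loc, true_scale, true_shape = 30.0, 3.0, -0.1
sample = [true_loc + true_scale / true_shape
          * ((-math.log(1.0 - rng.random())) ** (-true_shape) - 1.0)
          for _ in range(20000)]
loc, scale, shape = fit_gev_lmom(sample)
```

With only 20 annual maxima per grid cell, as in this study, the estimates are far noisier than in this large synthetic sample, which is why a small-sample method matters.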
We chose the GEV distribution to minimise methodological decisions that require a level of subjectivity. The alternative to the GEV distribution-the Generalised Pareto distribution-requires a threshold to be selected to define each extreme (e.g. for extreme heat, the number of days each year above a certain temperature). This choice must balance the need to focus on extreme events against the need to have sufficient data to fit the distribution to and can include contiguous samples that violate the assumption of independence (e.g. a single heatwave producing several days of high temperatures). The GEV approach largely precludes such issues and thus is easier to automate.
Our analysis focuses on frequencies expressed as return periods. Return periods are derived from the inverse of the occurrence probability of an event, with lower return periods indicating a higher frequency of an event. As such, to determine the future frequency of events with a return period of 20 years in the recent past (1-in-20-year events) we invert the GEV distribution derived for our future period. The GEV distribution function is defined as:

$$F(x) = \exp\left\{-\left[1 + \varepsilon\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\varepsilon}\right\}$$

where ε is the shape parameter, μ the location parameter and σ the scale parameter, and where 1 + ε((x−μ)/σ) > 0 and σ > 0. The latter two conditions ensure that the distribution function yields probability values no greater than one. For some grid cells, the linear moments estimation returned parameter values that did not allow this condition to be satisfied and fitted recent past and future probability distributions that were inconsistent with each other. This was most prevalent for rainfall, meteorological drought and thunderstorm energy and is likely an artefact of our small sample size. The affected grid cells, which account for less than 1.5% of the land surface, were masked out.
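A minimal sketch of the inversion described above, assuming the standard GEV form with shape ε, location μ and scale σ: the recent-past 1-in-20 year level is computed from the recent-past parameters, its exceedance probability is evaluated under the future parameters, and the reciprocal gives the future return period. The function names and the parameter values in the example are hypothetical.

```python
import math

def gev_cdf(x, loc, scale, shape):
    """GEV distribution function F(x)."""
    if abs(shape) < 1e-9:  # Gumbel limit
        return math.exp(-math.exp(-(x - loc) / scale))
    t = 1.0 + shape * (x - loc) / scale
    if t <= 0.0:  # outside the support of the distribution
        return 0.0 if shape > 0 else 1.0
    return math.exp(-t ** (-1.0 / shape))

def gev_return_level(period, loc, scale, shape):
    """Value exceeded on average once every `period` years."""
    y = -math.log(1.0 - 1.0 / period)
    if abs(shape) < 1e-9:
        return loc - scale * math.log(y)
    return loc + scale / shape * (y ** (-shape) - 1.0)

def future_return_period(past, future, period=20.0):
    """Future return period of the recent-past `period`-year event.

    past, future -- (loc, scale, shape) tuples for the two climates
    """
    level = gev_return_level(period, *past)
    exceedance = 1.0 - gev_cdf(level, *future)
    return math.inf if exceedance <= 0.0 else 1.0 / exceedance

# Illustrative parameters only: a 2-degree shift in location, same scale and shape.
past_params = (40.0, 1.5, -0.1)
future_params = (42.0, 1.5, -0.1)
unchanged = future_return_period(past_params, past_params)
shifted = future_return_period(past_params, future_params)
```

If the two distributions are identical the result is 20 years by construction; a warmer future distribution returns a value below 20, meaning the event becomes more frequent.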
Kolmogorov-Smirnov tests were conducted to measure the goodness of fit of the GEV distribution. All tests passed at the 0.05 significance level, indicating that the fitted GEV distributions are not statistically distinguishable from the model data. Probability density functions and return level plots were also spot-checked and showed reasonable comparisons between fitted and empirical (i.e. model simulated) distributions. Supplementary figure 1 (available online at stacks.iop.org/ERC/3/011001/mmedia) shows an example of these plots for a single location and ensemble member for the recent past.
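To illustrate the kind of check involved, the sketch below computes the one-sample Kolmogorov-Smirnov statistic against a fitted distribution function and compares it with the asymptotic 5% critical value, 1.358/√n. This is an assumption-laden simplification: the test configuration used in this study is not specified here, the asymptotic critical value is only approximate for n = 20, and the test is known to be conservative when parameters are estimated from the same sample. The Gumbel example distribution and all names are hypothetical.

```python
import math

def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and a fitted CDF."""
    x = sorted(sample)
    n = len(x)
    d = 0.0
    for i, xi in enumerate(x):
        f = cdf(xi)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def passes_ks(sample, cdf, coeff=1.358):
    """True if D is below the asymptotic 5% critical value coeff / sqrt(n)."""
    return ks_statistic(sample, cdf) < coeff / math.sqrt(len(sample))

def gumbel_cdf(x, loc=30.0, scale=3.0):
    """Example fitted distribution (GEV with zero shape)."""
    return math.exp(-math.exp(-(x - loc) / scale))

# 20 values placed at evenly spaced quantiles of the example distribution,
# so the fit is close to perfect by construction.
sample = [30.0 - 3.0 * math.log(-math.log((i + 0.5) / 20)) for i in range(20)]
d = ks_statistic(sample, gumbel_cdf)
ok = passes_ks(sample, gumbel_cdf)
```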

Observations and reanalysis
To evaluate the ability of NARCliM to capture 1-in-20 year events we compare it to extremes calculated from a combination of gridded observations and reanalyses. Daily values of maximum temperature and total rainfall from the ∼5 km datasets of Jones et al (2009) were used to calculate heat extremes, rainfall extremes and meteorological drought. Bushfire weather extremes were taken from Dowdy (2018) who used the temperature, rainfall and vapor pressure datasets of Jones et al (2009) combined with surface wind fields from the NCEP-NCAR reanalysis (Kalnay et al 1996). Thunderstorm energy was based on 3 hourly CAPE values from the ∼25 km ERA5 reanalysis (Hersbach et al 2020). These datasets were regridded to the ∼10 km resolution of our NARCliM projections for comparison. Maximum temperature and CAPE were regridded with bilinear interpolation, rainfall with conservative interpolation and FFDI based on the nearest neighbour method. The above datasets are referred to as observations for this study, noting that reanalyses are simulations that assimilate available observations and thus are still subject to modelling uncertainty.
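As a rough sketch of the bilinear step used for maximum temperature and CAPE, the function below interpolates a field on a regular latitude-longitude grid to a single target point. It assumes ascending, regularly ordered axes and clamps points outside the grid to the nearest cell; the regridding software actually used for this study is not stated, and the names here are hypothetical.

```python
import bisect

def bilinear(grid, lats, lons, lat, lon):
    """Bilinearly interpolate grid[i][j] (at lats[i], lons[j]) to (lat, lon)."""
    # Index of the lower-left corner of the enclosing cell, clamped to the grid.
    i = min(max(bisect.bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect.bisect_right(lons, lon) - 1, 0), len(lons) - 2)
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    low = grid[i][j] * (1.0 - tx) + grid[i][j + 1] * tx
    high = grid[i + 1][j] * (1.0 - tx) + grid[i + 1][j + 1] * tx
    return low * (1.0 - ty) + high * ty

# Toy 2 x 2 field on unit axes.
field = [[0.0, 1.0],
         [2.0, 3.0]]
centre = bilinear(field, [0.0, 1.0], [0.0, 1.0], 0.5, 0.5)
corner = bilinear(field, [0.0, 1.0], [0.0, 1.0], 0.0, 1.0)
```

Bilinear weighting suits smooth fields such as temperature, whereas conservative remapping preserves area totals for rainfall and nearest-neighbour selection avoids smoothing a nonlinear index such as FFDI.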

Recent climate and model evaluation
We first evaluate NARCliM's ability to capture the observed spatial patterns and magnitudes of extremes by comparing the simulated and observed 1-in-20 year return values for the recent past. We focus on the NARCliM ensemble median unless stated otherwise. Figure 1 shows the simulated 1-in-20 year return values for the recent past (top row) and corresponding values calculated from observations (bottom row). The simulated spatial patterns of bushfire weather, heat, rainfall and thunderstorm energy extremes match reasonably well with observations, with pattern correlations of 0.82, 0.92, 0.66 and 0.6, respectively (figure 1). In both NARCliM and observations, bushfire weather and heat extremes are higher inland and mitigated by the Great Dividing Range. Extreme rainfall and thunderstorm energy are highest along the east coast, particularly in northeast New South Wales and southeast Queensland. While meteorological drought is not captured nearly as well as the other extremes (pattern correlation = 0.36), the broad increase in severity of 1-in-20 year events toward central Australia is captured (cf figures 1(c) and (h)). Domain mean differences in magnitude indicate that bushfire weather, heat and thunderstorm energy extremes tend to be underestimated by the ensemble while extreme rainfall tends toward overestimation (figure 1). This is consistent with previous work showing positive and negative biases in mean rainfall and temperature, respectively (Olson et al 2016) and a negative bias in bushfire weather extremes (Clarke and Evans 2019). Overall, the extremes simulated by NARCliM exhibit spatial features consistent with observations, giving us confidence in the ability of NARCliM to capture the relevant processes.
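A pattern correlation of the kind quoted above can be sketched as a Pearson correlation computed across co-located grid cells. The version below is unweighted and skips cells that are masked (represented as `None`) in either field; whether the values in figure 1 were area-weighted is not stated, so this is an assumption, as are the function name and the toy data.

```python
import math

def pattern_correlation(model, obs):
    """Pearson correlation across paired grid cells, skipping masked cells."""
    pairs = [(m, o) for m, o in zip(model, obs)
             if m is not None and o is not None]
    n = len(pairs)
    mean_m = sum(m for m, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in pairs)
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs)
    var_o = sum((o - mean_o) ** 2 for _, o in pairs)
    return cov / math.sqrt(var_m * var_o)

# Flattened toy fields of 1-in-20 year return values (one masked cell).
simulated = [41.0, 43.5, 45.0, None, 38.2, 47.1]
observed = [42.3, 44.0, 46.8, 40.0, 39.5, 47.0]
r = pattern_correlation(simulated, observed)
```

Because the correlation is insensitive to a uniform offset, a field can have a high pattern correlation while still being biased in magnitude, which is why the domain mean differences are reported separately.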
Figure 2 shows the ensemble median of projected future (2060-2079) return periods for events that occur every 20 years in the recent past (1990-2009). Values below 20 years indicate events will occur more often in the future while values above 20 years indicate they will occur less often. Stippling indicates grid cells where at least 75% of the ensemble (9 of 12 members) agree on the direction of change in frequency. Grid cells without stippling but that still show a value indicate where the ensemble agreement on the direction of change is less than 75% but where the interquartile range is less than 20 years. Thus these grid cells indicate where projected return periods are similar to the recent past. Land grid cells that are masked white indicate areas where ensemble agreement on the direction of change is less than 75% and where the interquartile range is 20 years or more. Thus these grid cells indicate where there is low ensemble agreement and a large spread in projected return periods.
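The display rules described for figure 2 can be expressed as a small decision function, sketched here for a single grid cell. The thresholds (75% agreement, 20-year interquartile range) come from the text; the function name, the labels, the quartile method (Python's default 'exclusive' interpolation) and the assumption of finite return periods are illustrative choices that may differ from those used to produce the figure.

```python
from statistics import median, quantiles

def classify_cell(future_periods, baseline=20.0, agree_frac=0.75, max_iqr=20.0):
    """Return (ensemble median or None, label) for one grid cell.

    future_periods -- projected return periods (years), one per ensemble member
    """
    n = len(future_periods)
    more_frequent = sum(rp < baseline for rp in future_periods)
    less_frequent = sum(rp > baseline for rp in future_periods)
    q1, _, q3 = quantiles(future_periods, n=4)
    if max(more_frequent, less_frequent) / n >= agree_frac:
        return median(future_periods), "stippled"    # high agreement on direction
    if q3 - q1 < max_iqr:
        return median(future_periods), "unstippled"  # low agreement, small spread
    return None, "masked"                            # low agreement, large spread

# Three illustrative 12-member cells.
agree = classify_cell([3, 4, 5, 5, 6, 6, 7, 8, 9, 10, 25, 30])
similar = classify_cell([15, 16, 17, 18, 19, 19.5, 20.5, 21, 22, 23, 24, 25])
spread = classify_cell([5, 6, 8, 10, 15, 19, 21, 40, 60, 80, 100, 150])
```

In the first cell 10 of 12 members project a return period below 20 years, so it would be stippled; the other two cells split evenly on direction and differ only in their spread.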

Future changes
Overall, the ensemble shows with high agreement an increase in the frequency of all five extremes over at least half of the model domain. There is high agreement for increasing heat extremes across all of southeast Australia (figure 2(b)) and for increasing thunderstorm energy across most of southeast Australia (figure 2(e)). For bushfire weather and meteorological drought, large areas exhibit low ensemble agreement, where a clear change signal cannot be discerned (figures 2(a) and (c)). For extreme rainfall, approximately half of the domain exhibits low ensemble agreement and large spread (white grid cells in figure 2(d)), reflecting, in part, the higher variability of the processes associated with rainfall extremes. Nonetheless, it is clear that for all five extremes all areas exhibiting high ensemble agreement correspond only to an increase in frequency of events in the future.
The largest increase in frequency is projected for heat extremes in central Australia, where 1-in-20 year events are projected to occur every year in the future, or twenty times more frequently (figure 2(b)). For bushfire weather, large parts of eastern New South Wales, Victoria and South Australia are projected to experience 1-in-20 year events every 5-10 years, or two to four times more frequently than the recent past. These are regions where bushfire activity is already considerable. Extremes in thunderstorm energy are projected to occur approximately every 5 years across most of southeast Australia. For meteorological drought, 1-in-20 year events are also projected to occur approximately every 5 years across much of Victoria, southern New South Wales and southeast South Australia, a region encompassing the southern Murray-Darling Basin which has major agricultural significance for Australia (figure 2(c)). Figure 3 shows the full ensemble spread of projected future return periods for the five capital cities covered by the model domain (selecting the grid cell centred over each city). Based on the ensemble median response all cities show an increase in the frequency of 1-in-20 year events for all five extremes. However, not all extremes in all cities meet our criteria for at least 75% ensemble agreement (orange versus blue boxes in figure 3). Of note, Adelaide and Melbourne exhibit increases with high ensemble agreement for all extremes, and all extremes are projected to occur more than twice as often in these cities compared to the recent past. Canberra exhibits increases with high agreement for all but one extreme. In Sydney and Brisbane, three and two extremes, respectively, exhibit increases with high agreement. All cities indicate a tripling in frequency of at least one extreme (most often heat).
The largest increase in frequency among all cities occurs in Adelaide for heat extremes, where 1-in-20-year events are projected to occur almost every two years, or over 9 times more frequently than in the recent past.
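The "times more frequently" figures quoted in this section follow from dividing the recent-past return period by the projected one. A brief sketch of that arithmetic, with a hypothetical function name:

```python
def frequency_ratio(future_period, baseline_period=20.0):
    """How many times more often an event occurs, given return periods in years."""
    return baseline_period / future_period

every_year = frequency_ratio(1.0)      # event occurring every year
every_5_years = frequency_ratio(5.0)   # event occurring every 5 years
every_10_years = frequency_ratio(10.0) # event occurring every 10 years
```

A return period of 1 year corresponds to a twentyfold increase, and 5 to 10 years to a fourfold to twofold increase. Conversely, a ninefold increase implies a return period of 20/9, or roughly 2.2 years.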
Despite the majority of the NARCliM ensemble projecting increases in the occurrence of all five extremes in almost every city, very long return periods are projected by a small number of members (i.e. where the ensemble maximums exceed the y-axis in figure 3). This is likely due to our small sample sizes. These cases are either not representative of the ensemble or, as in the case for rainfall in Canberra (figure 3(e)), the corresponding result is flagged by our criteria for ensemble agreement and spread.

Discussion
This paper presents the projected frequencies of five climate extremes over southeast Australia for the late 21st century, derived from the NARCliM climate model ensemble. It is intended to provide useful information for risk mitigation and adaptation planning and to motivate further research into extremely rare climate events. Overall, the ensemble shows an increase in frequency of all five extremes over at least half of southeast Australia.   There is high agreement across the NARCliM ensemble that all cities examined here will experience multiple climate extremes more frequently in the late 21st century, with multiple cities projected to experience extremes more than six times more frequently based on the ensemble median. Further, all locations in our results exhibiting high ensemble agreement (stippling in figure 2) exclusively correspond to increases in frequency. Thus no decreases in frequency are projected with high agreement for any of the five extremes across southeast Australia. Figure 3 shows that the ensemble spread in our projections can vary widely, especially for extremes other than heat extremes, which are more directly affected by increasing greenhouse gas concentrations through changes in the surface radiation budget. The larger ensemble spread for other extremes can be attributed to their less-direct relationship to greenhouse gas concentrations and larger variance of physical processes involved, but also to the design of the NARCliM ensemble itself. The GCMs and RCMs selected for NARCliM were chosen to capture as wide a range of projected climate change as possible (Evans et al 2014). This contributes confidence to locations exhibiting high ensemble agreement and helps characterise the significant uncertainty in the climate change response at other locations. Variability among ensemble members forced by different GCMs is likely also due to large-scale modes of variability in the climate system. 
The 20-year time periods simulated by NARCliM are not long enough to capture the climate variability of southeast Australia related to multi-decadal modes, such as the Interdecadal Pacific Oscillation.
The projected increases in frequency of extremes shown here are, to first order, consistent with previous research as well as our understanding of the climate processes involved. Shorter return periods of extreme heat can be attributed to global warming due to increasing greenhouse gas concentrations (Hegerl et al 2019). The Clausius-Clapeyron relationship between temperature and saturation vapor pressure suggests rainfall intensity should increase by ∼7% for every degree of warming (Trenberth et al 2003), and this has been broadly observed for daily extremes across the globe (Westra et al 2013). Our projected increases in 1-in-20 year rainfall extremes (figure 2(d)) and the intensification of annual rainfall extremes in NARCliM are consistent with this; however, scaling rates in NARCliM can substantially exceed 7%/°C for some ensemble members over Australia (Bao et al 2017). Our projected increase in meteorological drought (figure 2(c); defined by the 3 month SPI) is consistent with several aspects of CMIP5 and CMIP6 studies which show that time spent in drought as well as drought intensity may increase over the 21st century in far southeast Australia (Kirono et al 2020, Ukkola et al 2020). Our projection of more frequent bushfire weather extremes results from a combination of changing temperature, precipitation, humidity and wind speed. While we do not evaluate these components individually, research with GCMs suggests temperature plays the dominant role (Abatzoglou et al 2019). Finally, more frequent extremes in thunderstorm energy are consistent with previous non-convection resolving modelling showing increases in CAPE over the same region (Allen et al 2014, Ji et al 2020). Importantly, Allen et al (2014) showed that corresponding decreases in vertical wind shear (another important component of thunderstorm development) can be expected to partially offset the contribution of future increases in CAPE to more frequent thunderstorm environments.
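To give a sense of scale for the Clausius-Clapeyron rate mentioned above, the sketch below compounds a fixed fractional increase per degree of warming. Compounding (rather than adding) the rate, the function name and the example warming levels are assumptions made for illustration; the result describes an idealised thermodynamic scaling, not a NARCliM projection.

```python
def cc_intensity_factor(warming_c, rate_per_degree=0.07):
    """Multiplicative change in rainfall intensity for a given warming (deg C)."""
    return (1.0 + rate_per_degree) ** warming_c

one_degree = cc_intensity_factor(1.0)   # about a 7% increase
four_degrees = cc_intensity_factor(4.0) # about a 31% increase
```

Scaling rates above 7%/°C, as reported for some ensemble members, would produce correspondingly larger factors through the same calculation.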
Several caveats must be considered when interpreting our results. First and foremost, climate model simulations are imperfect representations of the real climate. Our results relate to the real climate in so much as the NARCliM ensemble incorporates its processes and behaviour. This uncertainty relates to both the GCMs driving NARCliM and the WRF configurations chosen to downscale these GCMs. For example, how well NARCliM captures persistent changes in precipitation reflected in the SPI can be affected by how well the driving GCMs capture large scale modes of variability as well as by the land processes simulated by WRF. The set of GCMs and WRF configurations used in NARCliM were chosen based on the independence of their errors and performance to ensure the uncertainty spanned is as large as reasonably warranted. Downscaled projections based on more recent phases of CMIP-such as the upcoming NARCliM1.5 (Nishant et al in review)-or on updated regional climate models may improve results. The most relevant improvement in model physics likely relates to modelling spatial resolutions capable of capturing convective processes relevant to thunderstorm development. Such resolutions have been shown to dramatically improve the representation of daily precipitation in certain seasons and would also allow sub-daily events to be assessed more robustly (Kendon et al 2017). In the absence of better models, methods to correct model biases, including some of those identified in NARCliM in section 3.1, by statistically adjusting model output towards climate observations, can be used. However, bias adjustment was not used here. Bias adjusted daily temperature and precipitation data are available for NARCliM, and use of these would likely improve the match to observations (figure 1). However, no bias adjusted data are available for CAPE and FFDI and we thus chose to keep our results internally consistent. 
More importantly, given we are interested in the simulated change in the frequency of events we place greater importance on the fact that NARCliM captures the spatial patterns of our chosen extremes rather than the biases in their magnitude (figure 1).
The use of different climate extremes definitions could also affect our results. For example, the FFDI measure of bushfire weather is a strong indicator of surface conditions. However, other bushfire weather indices measure the stability of the atmosphere (which can be important in exacerbating bushfire conditions) and, as such, may exhibit very different future changes compared to surface conditions. Our definition of drought is also confined to 3-month precipitation deficiencies, while severe droughts can last much longer. Further, the nonstationarity of the climate in our two time periods could lead to larger variances in the distributions of extremes compared to equilibrium climates, particularly for the future period where climate is likely changing at a faster rate. This may impact our GEV distributions and bias our projected return periods. Lastly, our analysis has focused on 1-in-20 year events aggregated annually. The multi-GCM and multi-RCM design of NARCliM prevented us from combining simulations when fitting the GEV distribution, which would increase our sample size and support the assessment of even rarer events. Assessing seasonal data may also produce projected changes in frequency of a different sign to the annual changes shown here.

Conclusion
Where previous work evaluating the future of extremely rare events in Australia has focused on temperature and rainfall from GCMs, here we evaluated changes in extremes of heat, rainfall, bushfire weather, meteorological drought and thunderstorm energy from a high-resolution regional climate model ensemble. Overall what is surprising is not that extremes are projected to occur more frequently, but the degree to which their frequencies are projected to increase, with all cities examined here projected to experience multiple 1-in-20 year events more than twice as frequently in the future. Further, no consistent decreases in frequency were simulated anywhere in southeast Australia. It is clear that the higher frequency at which these extremes are projected to occur will also increase the probability of coincident extreme events. How the probability of coincident extremes will change in the future and their consequent compounding effects are the subject of future work.