Validity of estimating flood and drought characteristics under equilibrium climates from transient simulations

Future flood and drought risks have been predicted to transition from moderate to high levels at global warmings of 1.5 °C and 2.0 °C above pre-industrial levels, respectively. However, these results were obtained by approximating the equilibrium climate using transient simulations with steady warming. This approach was recently criticised because the transient climate has a warmer global land temperature and higher mean precipitation intensities than the equilibrium climate. Therefore, it is unclear whether floods and droughts projected under a transient climate can be systematically substituted for those occurring in an equilibrated climate. Here, by employing a large ensemble of global hydrological models (HMs) forced by global climate models, we assess the validity of estimating flood and drought characteristics under equilibrium climates from transient simulations. Differences in flood characteristics under transient and equilibrium climates could be largely ascribed to natural variability, indicating that the floods derived from a transient climate reasonably approximate the floods expected in an equally warm, equilibrated climate. By contrast, significant differences in drought intensity between transient and equilibrium climates were detected over a larger global land area than expected from natural variability. Despite the large differences among HMs in representing the low streamflow regime, we found that the drought intensities occurring under a transient climate may not validly represent the intensities in an equally warm equilibrated climate for approximately 6.7% of the global land area.


Introduction
Human activities and the associated greenhouse gas emissions have already caused an increase in global mean temperature (GMT) of ∼1.0 °C (0.8 °C-1.2 °C) relative to the pre-industrial era. Furthermore, the GMT is projected to increase by 1.5 °C relative to pre-industrial levels between 2030 and 2052 (Masson-Delmotte et al 2018). The special report Global Warming of 1.5 °C (Masson-Delmotte et al 2018) produced by the Intergovernmental Panel on Climate Change (IPCC) suggests that higher levels of GMT are associated with higher impacts on organisms, ecosystems, and human societies (Seneviratne et al 2018, Shindell et al 2018). Likewise, future risks from floods and droughts are anticipated to transition from moderate to high at a level between +1.5 °C and +2.0 °C above pre-industrial levels (Roudier et al 2016, Gosling et al 2017, King et al 2017, Mohammed et al 2017, Thober et al 2018, Lange et al 2020). The climate scenarios used in previous studies investigating the potential impacts of climate change at relatively low levels of global warming either stabilise greenhouse gas concentrations over the 21st century or limit end-of-century radiative forcing to a specific level (Rogelj et al 2019). By relying only on a steadily warming climate (transient; 2015-2100 in figure 1), these studies ignored the condition in which the climate system is in equilibrium (see 2100-2200 in figure 1), despite the IPCC's suggestion to distinguish between the two climate response types (Masson-Delmotte et al 2018).
The intensities of precipitation (including extreme events) obtained under transient and equilibrium climates at the same GMT have been investigated. Early studies investigating the equilibrium climate focused on the interactions between atmospheric carbon dioxide concentration and the response of the climate model (Manabe et al 1991, 1992) and reported that, as a result of the thermal inertia of the oceans, the transient climate was characterised by a reduction (or delay) in moisture supply from oceans, resulting in reductions in soil moisture and runoff over continents in middle and high latitudes. Recent publications have directly compared transient and equilibrium (also called stabilised) climates, although the existing literature is limited in size and has focused mostly on temperature and precipitation patterns (Nangombe et al 2018, Wei et al 2019, King et al 2020). A common key finding of these studies is that mean precipitation intensities are higher in the transient climate than in a stabilised climate. In particular, land monsoon precipitation in the Northern Hemisphere was reported to be 30% larger in a transient climate than in a stabilised climate for a GMT increase of 2 °C (Cao and Zhao 2020). Large differences between transient and equilibrium climates have also been reported for mean and extreme temperatures. Recently, King et al (2020) revealed that 90% of the world's population is experiencing a warmer local climate under transient global warming than it would under equilibrium global warming. Such differences in temperature are particularly relevant for assessing dryland areas, which expand significantly more under a transient climate than under an equilibrium climate (Wei et al 2019).
The literature examining the potential difference in streamflow between the transient and equilibrium climate is extremely limited (Boulange et al 2018). To date, most streamflow analyses have used transient climate projections (Roudier et al 2016, Asadieh and Krakauer 2017, Donnelly et al 2017, Gosling et al 2017), and to our knowledge, only a single study has employed an equilibrated climate (Nangombe et al 2018). For other variables, nonsignificant or marginal differences in climate change impacts between a transient and equilibrium climate have been reported (Manabe et al 1992, Blackport and Kushner 2016, Boulange et al 2018). Therefore, it remains unclear whether the floods and droughts anticipated at low levels of global warming under a transient climate can systematically be substituted with those occurring in an equilibrated climate at the same GMT. In this study, we analysed the characteristics of future floods and droughts derived from streamflow globally simulated (0.5° × 0.5° spatial resolution) by seven hydrological models (HMs) forced by bias-adjusted climate projections from three global climate models (GCMs) under transient and equilibrium climates (figure 1). Altogether, simulations from 19 combinations of HMs and GCMs (hereafter the large ensemble; available simulations listed in supplementary table 1 available online at stacks.iop.org/ERL/16/104028/mmedia), operated using a common protocol under the Inter-Sectoral Impact Model Intercomparison Project Phase 2b (ISIMIP2b), were used to assess whether significant differences in global floods and droughts obtained under transient and equilibrium climates at the same GMT are detectable, and to determine the locations where floods and droughts predicted under a transient climate may not be interchangeable with those under an equilibrium climate.

Global climate models (GCMs)
The ISIMIP (Warszawski et al 2014) is an effort to increase the reliability and comprehensiveness of climate change studies through the incorporation of multiple global impact models such as global hydrological models (GHMs). The protocol of ISIMIP2b was designed to support the IPCC Special Report on the 1.5 °C target (Frieler et al 2017), and it provides bias-corrected (Lange 2019) climate projections at a spatial resolution of 0.5° × 0.5° for four GCMs that contributed to the Fifth Assessment Report of the IPCC.
In this analysis, three radiative forcing scenarios were employed for each of the three selected GCMs (HadGEM2-ES, IPSL-CM5A-LR, and MIROC5): a pre-industrial climate with constant CO2 concentration (286 ppm, spanning 1661-2005), a historical climate that includes the effects of human emissions, and a future low-emissions climate. The future climate is composed of the representative concentration pathway 2.6 (RCP 2.6, spanning 2006-2099) and its associated extended concentration pathway (ECP 2.6, spanning 2100-2299). Another GCM, GFDL-ESM2M, is included within the ISIMIP2b framework, but the bias-corrected forcing data provided do not cover the 2100-2299 period; therefore, this GCM was excluded from our analytical framework.
Given the absence of global warming and the fixed anthropogenic activities (fixed at the 1860 level) of the pre-industrial climate, this scenario can be used to reveal the effect of natural variability on our statistical analyses. Specifically, any significant differences in flood or drought characteristics detected between two randomly chosen 30 year periods belonging to the pre-industrial experiment (hereafter control simulations) can be attributed to natural variability. Next, the historical climate was used to identify two streamflow thresholds that trigger flood and drought events (supplementary figure 1). Finally, the future scenario provided both transient and equilibrium climates. The transient climate is represented by the RCP, while the equilibrium climate is approximated from the ECP (hereafter referred to as a quasi-equilibrium climate, as the climate system will never be in equilibrium (Meehl et al 2020)).

Hydrological models (HMs)
The HMs (listed in supplementary table 1) considered in this study include four GHMs: CWatM The number of meteorological variables required to operate the models varies, depending on the representation, implementation, and interactions between hydrological and anthropogenic processes, but they generally include precipitation, temperature, solar radiation (shortwave and longwave downward), wind speed, specific humidity, and surface pressure (Telteu et al 2021). The HMs included in this study were those that provided simulations for all three radiative forcing scenarios described above.

Time series analysis and metrics used
The transient and quasi-equilibrium climates are represented by 30 year periods from the RCP and ECP simulations, respectively (Boulange et al 2018, Wei et al 2019, King et al 2020). To comprehensively compare the characteristics of future floods and droughts obtained under the two climates, the GMT of the two 30 year periods must be identical. We consequently selected the transient and quasi-equilibrium periods so that their average 30 year rolling GMT were within ±0.1 °C of each other. More specifically, we initially computed, separately for all GCMs, the 30 year rolling GMT for the entire future period (figure 1). Next, restricting the search to the RCP simulations (2006-2099), we identified the transient period as the first 30 year period whose average 30 year rolling GMT was within ±0.01 °C of a target temperature. The quasi-equilibrium climate was determined next by retrieving the first 30 year period in the ECP simulation (2100-2299) whose average 30 year rolling GMT was within ±0.1 °C of the corresponding transient temperature. Consistent with the Paris Agreement, the target temperature used for MIROC5 and HadGEM2-ES was 1.5 °C above the pre-industrial level. For IPSL-CM5A-LR, because a GMT of +1.5 °C could occur as early as 2008, the target temperature was set higher, to 2.0 °C above the pre-industrial level.
The 30 year periods representing transient and quasi-equilibrium climates for the three GCMs are provided in supplementary table 2. We confirmed that the derivation of the 30 year periods associated with the quasi-equilibrium climates is not overly sensitive to the ±0.1 °C criterion (supplementary table 3).
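The period selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the GMT series is synthetic, the function names are hypothetical, and the "average 30 year rolling GMT" of a period is simplified to the mean GMT anomaly over that 30 year window.

```python
def synthetic_gmt(year):
    """Made-up GMT anomaly (deg C above pre-industrial): rise, decline, plateau."""
    if year <= 2080:
        return 1.0 + 0.01 * (year - 2006)
    if year < 2100:
        return 1.74 - 0.012 * (year - 2080)
    return 1.5


def first_matching_period(years, gmt, target, tol, start, end, window=30):
    """Return (first_year, last_year, mean_gmt) for the earliest `window`-year
    period lying inside [start, end] whose mean GMT is within `tol` of `target`.
    Returns None when no period qualifies."""
    for i in range(len(years) - window + 1):
        first, last = years[i], years[i + window - 1]
        if first < start or last > end:
            continue
        mean = sum(gmt[i:i + window]) / window
        if abs(mean - target) <= tol:
            return first, last, mean
    return None


years = list(range(2006, 2300))
gmt = [synthetic_gmt(y) for y in years]

# Transient period: searched within the RCP years, matched to the target temperature.
transient = first_matching_period(years, gmt, target=1.5, tol=0.01,
                                  start=2006, end=2099)
# Quasi-equilibrium period: searched within the ECP years, matched to the transient mean.
equilibrium = first_matching_period(years, gmt, target=transient[2], tol=0.1,
                                    start=2100, end=2299)
print(transient[:2], equilibrium[:2])
```

Matching the quasi-equilibrium window to the transient mean rather than to the nominal target keeps the two periods within the stated ±0.1 °C of each other even when the transient period does not hit the target exactly.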
For each year of the 30 year periods representing transient and quasi-equilibrium climates (supplementary table 2), we identified a flood (drought) event when daily streamflow (Q) was higher (lower) than a given threshold (supplementary figure 1). The streamflow thresholds were determined prior to the analysis, separately for all HMs and GCMs, as the 5th and 95th percentiles of historical daily streamflow (period: 1970-2004) for flood and drought, respectively (Fleig et al 2006, Rahimi et al 2021). For as long as a flood or drought event persisted, we recorded its duration (in days), volume (in km³), and intensity (in m³ s⁻¹). When the interval separating two events was less than 6 days, we assumed that the events formed a single event and consequently merged the initially distinct volumes and durations. The final intensity was the higher of the intensities of the two initial events (supplementary figure 1).
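The event identification can be illustrated with a short sketch. It is not the authors' implementation, and several details are assumptions made for the example: intensity is taken as the largest departure of Q from the threshold, volume as the accumulated departure converted from m³ s⁻¹ over days to km³, the interval between events as the number of non-event days separating them, and a merged event as the sum of the two durations and volumes. The function name `extract_events` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def extract_events(q, threshold, kind="flood", min_gap=6):
    """Identify flood or drought events in a daily streamflow series (m3/s).

    Events separated by fewer than `min_gap` non-event days are merged:
    durations and volumes are summed and the larger intensity is kept.
    """
    sign = 1.0 if kind == "flood" else -1.0  # drought: flow below threshold
    events, current = [], None
    for day, flow in enumerate(q):
        departure = sign * (flow - threshold)
        if departure > 0:
            if current is None:
                gap_is_short = events and day - events[-1]["end"] - 1 < min_gap
                if gap_is_short:
                    current = events.pop()  # reopen the previous event
                else:
                    current = {"start": day, "duration": 0,
                               "volume": 0.0, "intensity": 0.0}
            current["duration"] += 1                                   # days
            current["volume"] += departure * SECONDS_PER_DAY / 1e9     # km3
            current["intensity"] = max(current["intensity"], departure)  # m3/s
            current["end"] = day
        elif current is not None:
            events.append(current)
            current = None
    if current is not None:
        events.append(current)
    return events


# Toy series: two exceedances 2 days apart (merged) and one 8 days later (separate).
q = [1, 1, 12, 15, 1, 1, 11, 1, 1, 1, 1, 1, 1, 1, 1, 13, 1]
floods = extract_events(q, threshold=10, kind="flood")
for event in floods:
    print(event["duration"], event["intensity"], event["volume"])
```

Calling the same function with `kind="drought"` and the low-flow threshold flips the sign of the departure, so one routine covers both event types.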
In this analytical framework, extreme streamflow simulations initiate flood and drought events (supplementary figure 1). Comparisons of the characteristics of flood and drought under transient and quasi-equilibrium climates may therefore be compromised by the following two conditions. First, if the daily streamflow never exceeds (falls below) the threshold, flood (drought) events would be systematically absent (Q < threshold for flood; Q > threshold for drought). Separately, for all combinations of GCMs and HMs, we evaluated the percentage of global land area where no flood (drought) occurred during the transient and quasi-equilibrium periods. The second condition, only relevant for drought analysis, involves an excessive number of days featuring zero streamflow, which directly impacts the computation of drought characteristics. Independently for all HMs, we flagged grid cells where the 95th percentile of daily streamflow was equal to zero (Q95 = 0), pooling data from both transient and quasi-equilibrium climates and all GCMs.
In this analysis, we systematically masked grid cells belonging to Köppen-Geiger regions EF (ice cap climate), BWh (hot desert climate), and BWk (cold desert climate). In such locations, simulated streamflows are low and most HMs have difficulties representing the hydrological processes realistically (Zhang et al 2016, Gädeke et al 2020).

Statistical analysis

Transient climate vs quasi-equilibrium climate
The three characteristics of flood and drought obtained under transient and quasi-equilibrium climates were compared, individually for each land grid cell and separately for each member of the ensemble (n = 19), using the Kolmogorov-Smirnov test (α_0 = 0.05). As multiple hypothesis tests are evaluated over a large network of grid cells, the effects of multiple testing on the overall results were controlled through the false discovery rate (FDR) procedure described by Wilks (2006, 2017) and by setting the control level for the FDR (α_FDR) to 0.10. Employing this procedure, all grid cells were classified as experiencing significant or non-significant differences in flood and drought characteristics under transient and quasi-equilibrium climates. All steps involved in the statistical analysis are summarised in supplementary figure 2.
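The multiple-testing step can be sketched as below. The sketch assumes that the FDR procedure takes the Benjamini-Hochberg form, in which the local null hypothesis is rejected for every grid cell whose p-value does not exceed the largest sorted p-value p_(i) satisfying p_(i) ≤ (i/N) α_FDR. The per-cell p-values are made up here; in practice they would come from a two-sample Kolmogorov-Smirnov test (for instance `scipy.stats.ks_2samp`). The function name `fdr_reject` is hypothetical.

```python
def fdr_reject(p_values, alpha_fdr=0.10):
    """Flag the grid cells whose local test is significant under FDR control.

    p_values: one p-value per grid cell (e.g. from a two-sample KS test).
    Returns a list of booleans in the same order as `p_values`.
    """
    n = len(p_values)
    ranked = sorted(p_values)
    # Largest sorted p-value lying on or below the line (rank / n) * alpha_fdr.
    p_star = None
    for rank, p in enumerate(ranked, start=1):
        if p <= rank / n * alpha_fdr:
            p_star = p
    if p_star is None:
        return [False] * n
    return [p <= p_star for p in p_values]


# Made-up p-values for ten grid cells.
p_cells = [0.001, 0.20, 0.012, 0.04, 0.9, 0.03, 0.5, 0.7, 0.015, 0.6]
significant = fdr_reject(p_cells, alpha_fdr=0.10)
print(sum(significant), "of", len(p_cells), "cells significant")
```

The percentage of land area reported in the text would additionally weight each flagged cell by its area, which is omitted here for brevity.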
We report the percentage of global land area where a significant difference in the characteristics of flood and drought was obtained using various levels of aggregation (all 19 members, individual GCMs, and individual HMs; see figure 2).

Natural variability
Importantly, for some grid cells, significant differences in flood or drought characteristics under transient and quasi-equilibrium climates could be a result of natural variability. We assessed the effects of natural variability on the previously obtained statistical results using control simulations (figure 1). Specifically, a field significance test procedure was employed (Livezey and Chen 1983, Wilks 2006). Through a Monte Carlo (MC) procedure, we randomly selected two non-overlapping 30 year periods from the control simulation and identified areas where flood and drought characteristics differed significantly using the Kolmogorov-Smirnov test (α_0 = 0.05) and controlling the FDR, as described above. Analogous to the procedure employed for the transient and quasi-equilibrium climates, after masking the grid cells belonging to Köppen-Geiger regions EF, BWh, and BWk, we assessed the percentage of land area where a significant difference in the characteristics of floods and droughts occurred between the two control periods and reported the locations of the grid cells involved. Repeating this procedure 1000 times for all combinations of HMs and GCMs, we report the 5th-95th percentile range of the global land area where significant differences in flood and drought are expected as a result of climate variability.
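The Monte Carlo procedure can be outlined as follows. This is only a sketch: `toy_area_percent` is a hypothetical stand-in for the full grid-cell comparison of two control periods, the rejection sampling used to obtain non-overlapping periods is one possible choice, and the percentile uses simple linear interpolation.

```python
import random


def draw_two_periods(first_year, last_year, length=30, rng=random):
    """Draw two non-overlapping `length`-year periods as (start, end) tuples."""
    latest_start = last_year - length + 1
    while True:
        a = rng.randint(first_year, latest_start)
        b = rng.randint(first_year, latest_start)
        if abs(a - b) >= length:  # start years far enough apart: no shared year
            return (a, a + length - 1), (b, b + length - 1)


def percentile(values, q):
    """Percentile (q in 0..100) with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def control_range(area_fn, first_year=1661, last_year=2005, n_draws=1000, seed=0):
    """5th-95th percentile range of the significant area over `n_draws` draws."""
    rng = random.Random(seed)
    areas = [area_fn(*draw_two_periods(first_year, last_year, rng=rng))
             for _ in range(n_draws)]
    return percentile(areas, 5), percentile(areas, 95)


def toy_area_percent(period_a, period_b):
    """Hypothetical stand-in: a real version would run the KS tests with FDR
    control on the two periods and return the significant land area (%)."""
    return (period_a[0] * 31 + period_b[0] * 17) % 50 / 10.0


low, high = control_range(toy_area_percent)
print(low, high)
```

An area obtained for the transient versus quasi-equilibrium comparison that falls above the upper bound of this range would then be read as exceeding what natural variability alone produces.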

Inter-model agreement on significant difference (IMAoSD)
Next, we investigated the spatial distribution of the grid cells where the characteristics of flood and drought were significantly different under the two climates. For that purpose, we evaluated the IMAoSD for flood and drought characteristics. This indicator is computed independently at the grid cell level for every flood and drought characteristic. For various aggregation levels (large ensemble, GCMs, HMs), we first evaluated the number of models in the ensemble indicating a significant difference in each flood (drought) characteristic in every grid cell (m_dif). This number was then divided by the total number of models (m_models) in the ensemble so that IMAoSD could range from 0% to 100%. When all models in the ensemble indicate that the characteristics of flood or drought are significantly different between the two climates (m_dif = m_models), IMAoSD attains its maximum value of 100%. By contrast, when none of the models of the ensemble indicate a significant difference in flood or drought characteristics under the two climates (m_dif = 0), IMAoSD is equal to 0%. The IMAoSD indicator was additionally computed for the control simulations to reaffirm the influence of climate variability using the procedure outlined above. For a given combination of HM and GCM, there are 1000 control simulations available (from the MC procedure) to evaluate the IMAoSD.
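As a small illustration of the indicator, the sketch below computes IMAoSD = 100 × m_dif / m_models for each grid cell from per-model significance flags. The data layout (one list of booleans per ensemble member) and the function name `imaosd` are assumptions made for the example.

```python
def imaosd(significant_by_model):
    """Inter-model agreement on significant difference, in percent per grid cell.

    significant_by_model: one list per ensemble member, holding one boolean
    per grid cell (True where that member found a significant difference).
    """
    m_models = len(significant_by_model)
    n_cells = len(significant_by_model[0])
    result = []
    for cell in range(n_cells):
        m_dif = sum(member[cell] for member in significant_by_model)
        result.append(100.0 * m_dif / m_models)
    return result


# Toy ensemble of 19 members and 3 grid cells:
# cell 0 is flagged by 5 members, cell 1 by none, cell 2 by all.
ensemble = [[i < 5, False, True] for i in range(19)]
agreement = imaosd(ensemble)
print([round(value, 1) for value in agreement])
```

With 19 members the indicator can only take multiples of 100/19, which is why a value such as 26.3% corresponds to 5 agreeing members.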

Field significance of flood and drought characteristics
In the large ensemble, significant differences in the duration, intensity, and volume of floods under transient and quasi-equilibrium climates were found, on average (95% confidence interval across the ensemble indicated in brackets), over 0.9% (0%-3.4%), 1.1% (0%-4.9%), and 0.9% (0%-3.5%) of the global land area, respectively (figures 2(a)-(c)). Likewise, significant differences in the duration, intensity, and volume of droughts under the two climates were obtained over 6.4% (0.5%-16.7%), 8.9% (0.8%-8.7%), and 1.3% (0%-3.4%) of the global land area, respectively (figures 2(d)-(f)). These results are comparable to those of a previous analysis that reported significant differences in mean streamflow between transient and quasi-equilibrium climates over 5.8 ± 1.2% of the global land area based on a single GHM (Boulange et al 2018). Employing control simulations (see section 2.4.2), we revealed the contribution of natural variability to the detection of significant differences in flood and drought characteristics (figure 2). Significant differences in flood characteristics under transient and quasi-equilibrium climates occur over a small fraction of the global land area and may generally be caused by natural variability rather than the state of the two climates (Eisner et al 2017; see supplementary section 1). However, significant differences in drought characteristics under the two climates occur over a larger global land area. The significant differences between the transient and quasi-equilibrium climates exceed the differences expected from natural variability only for drought intensity (figure 2(e)). We confirmed that these results were not sensitive to the choice of the thresholds used to identify flood and drought events (supplementary figures 3 and 4).
Disaggregating the large ensemble, we assessed the percentage of global land area where significant differences in flood and drought characteristics between the two climates were identified for individual GCMs and HMs (figure 2). Across the three GCMs, the results were mostly consistent for all flood and drought characteristics. The results varied greatly among HMs (up to two orders of magnitude), particularly for drought characteristics, emphasising that the main cause of uncertainty in flood and drought analysis comes from the use of multiple HMs. Employing a single HM, we found that the differences in flood and drought characteristics obtained under transient and quasi-equilibrium climates are generally globally significant (figure 2), which is consistent with previous analyses (Boulange et al 2018, King et al 2020). We note that the percentage of global land area where the characteristics of floods and droughts are significantly different under the two climates (transient and quasi-equilibrium or control periods) was commonly higher when using MATSIRO than when using other HMs (figure 2). This difference may have been caused by the base groundwater recharge of MATSIRO, which is higher than the recharge of other HMs (Reinecke et al 2021).

Spatial patterns of the IMAoSD
The magnitude of the IMAoSD calculated for all flood characteristics under the two climates was low. As an example, the IMAoSD exceeded 26.3% (i.e. at least 5 of the 19 ensemble members reported significantly different flood characteristics under the two climates) over less than 0.01% of the global land area.
Using control simulations, we obtained similar results, indicating that the spatial distribution of the IMAoSD calculated for flood characteristics under transient and quasi-equilibrium climates was driven by natural variability. This suggests that by employing an ensemble of GCMs and HMs, and for a relatively low level of global warming, the floods derived under a transient climate can be used to infer those under an equilibrated climate.
For drought characteristics, the maximum IMAoSD values were 52.6% (drought duration) and 63.1% (drought intensity and drought volume). That is, 10 and 11 out of 19 ensemble members, respectively, reported that the characteristics of droughts are significantly different under transient and quasi-equilibrium climates. Although these high IMAoSD values occurred over a small global land area (approximately 1%), this behaviour was noticeably absent from our control simulations. Consequently, for identically low GMT, the drought intensity acquired in a transient climate is not systematically representative of the intensity occurring in a quasi-equilibrium climate (figure 3(a)).

Uncertainty in the IMAoSD
Climate change impact analysis is often performed by using different combinations of GCMs and HMs. It may therefore be difficult to infer the implications of the above results with respect to previous research. We therefore repeated the above analysis for individual GCMs and HMs, assessing whether the observations and conclusions reached using the large ensemble hold when using different aggregation levels.
When aggregating at the GCM and HM levels, the spatial patterns and magnitudes of the IMAoSD in flood characteristics (supplementary figures 4 and 5) were mostly similar to those obtained using the large ensemble. The only exception was MATSIRO, with which the annual variations in flood characteristics simulated in eastern Europe, Brazil, and central Russia (supplementary figure 5) were substantial. Overall, this analysis confirmed that the floods derived under a transient climate can reliably (regardless of the combinations of GCMs and HMs employed) be used to infer the floods occurring under an equilibrated climate.
By contrast, the magnitude of the IMAoSD in drought intensity and drought duration varied, particularly across HMs (supplementary figures 6 and 7). For example, the IMAoSD was higher than 50% over 0.1%-10.1% and 1.4%-11.1% (minimum and maximum ranges) of the global land area for drought duration and drought intensity, respectively. We confirmed, using control simulations, that the above patterns in IMAoSD are not the result of climate variability (IMAoSD exceeds 50% for less than 0.1% of the global land area for all characteristics of floods and droughts). Consistent across HMs, the IMAoSD in drought intensity was highest for parts of Brazil, central USA, East Africa, and northern Russia. For such locations, the drought intensities analysed in a transient climate may not be representative of the drought intensities experienced in a quasi-equilibrium climate at the same GMT (supplementary figure 8). The substantial variations among the HMs in the amplitude of the IMAoSD in drought intensity are investigated further in the next section.

Adjustment of the multi-model framework
The characteristics of flood under transient and quasi-equilibrium climates were generally significantly different for a small fraction of the global land area (figures 2(a)-(c)), as observed consistently across the large ensemble, GCM ensemble, and HM ensembles. By contrast, the percentage of global land area where drought intensity was significantly different under transient and quasi-equilibrium climates varied considerably (figure 2(e)). In addition, the spatial distribution of the IMAoSD in drought intensity obtained under the two climates also showed remarkable differences depending on the selected HMs (supplementary figure 8). Here, we carefully inspect the behaviours of the HMs and accordingly adjust the multi-model framework.
The absence of drought events from either the transient or quasi-equilibrium periods was consistent across all HMs and occurred, on average, over 2.2 ± 1.3% of the global land area (figure 3(b)). Consequently, it does not explain the large differences in the IMAoSD in drought intensity across GHMs.
By contrast, the percentage of global land area where the 95th percentile of daily streamflow was equal to zero (see section 2.3; grid cells where Q95 = 0) varied greatly across HMs (figure 3(b)). At most, grid cells exhibiting Q95 = 0 were detected over 33.3% of the global land area with LPJmL (figure 3(b)). This behaviour may be associated with LPJmL producing zero groundwater recharge in a large number of grid cells (Döll et al 2018). Excluding this specific HM, the 95th percentile of daily streamflow was equal to zero, on average, over 5.0 ± 2.9% of the global land area.
The use of a large multi-model ensemble is largely encouraged in climate change impact studies because ensemble mean or median projections are typically more reliable and accurate than individual model projections (Gosling et al 2017, King et al 2017, Thober et al 2018). However, due to fundamental differences in the representation of low streamflow by the HMs, the mean or median of the multi-model ensemble is not the most reliable method for assessing the differences in drought characteristics between transient and quasi-equilibrium climates. By systematically simulating annual low streamflow equal to zero, some HMs make the comparison of drought characteristics under transient and quasi-equilibrium climates impractical. Hence, we re-evaluated the IMAoSD in drought intensity by first excluding, at the grid cell level, the combinations of GCM and HM that systematically produce zero streamflow (Q95 = 0 for a given combination of HM and GCM; see figure 3(c)). After this simple adjustment, significant differences in drought intensity under transient and quasi-equilibrium climates indicated by at least half of the HMs in the ensemble occurred over 6.7% of the global land area (figure 3(c)), highlighting major river systems. Again, such behaviour was absent from the control simulations, confirming that for a relatively low level of global warming, the drought intensity derived from a transient climate does not necessarily represent the intensity occurring under an equilibrated climate. The regions where the future drought intensity obtained under a transient climate is not interchangeable with the intensity of a quasi-equilibrium climate depend on the selection of HMs, because of their representations of low streamflow. After adjusting our multi-model framework to lessen the effects of low streamflow representation, we found that for a relatively low increase in GMT, the IMAoSD in drought intensity was highest in east Brazil, East Africa, and northern Russia.
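The adjustment can be pictured with the following sketch, in which ensemble members flagged with Q95 = 0 at a grid cell are dropped from both the numerator and the denominator of the IMAoSD for that cell. The data layout, the function name `adjusted_imaosd`, and the choice to return `None` where no member remains are assumptions made for illustration, not details taken from the analysis.

```python
def adjusted_imaosd(significant_by_model, zero_q95_by_model):
    """IMAoSD (%) per grid cell, ignoring members with Q95 = 0 at that cell.

    Both arguments hold one list per ensemble member with one boolean per
    grid cell. Cells where every member is excluded yield None.
    """
    n_models = len(significant_by_model)
    n_cells = len(significant_by_model[0])
    result = []
    for cell in range(n_cells):
        valid = [m for m in range(n_models) if not zero_q95_by_model[m][cell]]
        if not valid:
            result.append(None)
            continue
        m_dif = sum(significant_by_model[m][cell] for m in valid)
        result.append(100.0 * m_dif / len(valid))
    return result


# Toy case with 4 members and 3 grid cells.
significant = [[True, True, False],
               [True, False, False],
               [False, False, False],
               [False, False, False]]
zero_q95 = [[False, False, True],
            [False, False, True],
            [True, False, True],
            [True, False, True]]
print(adjusted_imaosd(significant, zero_q95))
```

In the first toy cell the two members that simulate zero low flow are excluded, so full agreement among the remaining two members is no longer diluted to 50%.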

Summary and discussions
In this research, we determined the interchangeability of flood and drought characteristics obtained under transient and quasi-equilibrium climates using a large ensemble consisting of multiple GCMs and HMs.
The global land area where we identified significant differences in flood characteristics between the transient and quasi-equilibrium climates was not distinguishable from the area resulting from natural climate variability. In addition, the spatial patterns of IMAoSD in flood characteristics obtained using the transient and quasi-equilibrium climates and control simulations were indistinguishable. As a result, the characteristics of floods derived from a steadily warming climate reasonably approximate the characteristics expected in an equally warm, equilibrated climate, which is not expected to occur until many decades into the future.
By contrast, the significant differences in drought intensity between the transient and quasi-equilibrium climates occurred over a larger land area than expected from natural variability. However, the heterogeneity among HMs in predicting drought characteristics was high because of drastic differences in low streamflow simulation. Low streamflow regimes are dictated by natural processes, including infiltration characteristics of soils (further influenced by the number and depth of soil layers), aquifer properties, evapotranspiration rates, and vegetation types, and by anthropogenic activities, such as water abstraction (for industrial, agricultural, and municipal purposes), return flows from agricultural fields, inter-basin water transfer, and streamflow regulation by dams (Smakhtin 2001), all of which are simulated differently in the HMs. Recently, the inclusion of vegetation processes in HMs was shown to have a substantial effect on simulated groundwater recharge and hence low streamflow (Reinecke et al 2021). As a result, using the mean or median of a multi-model ensemble is not the most reliable method for assessing the differences in drought characteristics between transient and quasi-equilibrium climates (Krysanova et al 2018).
After adjusting our multi-model methodology to alleviate the effects of low streamflow representation, for approximately 6.7% of the global land area, the future drought intensity derived from a steadily warming climate may not represent the intensity expected in an equally warm, equilibrated climate.
According to our findings, previous studies quantifying the evolution of floods for a low level of global warming using a transient simulation should provide a reasonably accurate representation of the equilibrium climate. By contrast, the drought intensity characterised under a transient simulation may not necessarily be indicative of the intensity expected in an equilibrated climate. In particular, drought analyses targeting specific regions such as Brazil, central USA, East Africa, and northern Russia should pay attention to the validity of the findings in an equilibrated climate.

Data availability statement
The ISIMIP2b simulations used in this study are publicly available from the ISIMIP website (www.isimip.org).
The data that support the findings of this study are openly available at the following URL/DOI: 10.5281/zenodo.4171626.