Historical rainfall data in northern Italy predict larger meteorological drought hazard than climate projections

Simulations of daily rainfall for the region of Bologna produced by 13 climate models for the period 1850–2100 are compared with the historical series of daily rainfall observed in Bologna for the period 1850–2014 and analysed to assess meteorological drought changes up to 2100. In particular, we focus on monthly and annual rainfall data, seasonality, and drought events to derive information on the future development of critical events for water resource availability. The results show that historical data analysis under the assumption of stationarity provides more precautionary predictions for long-term meteorological droughts with respect to climate model simulations, thereby outlining that information integration is key to obtaining technical indications.


Introduction
Droughts are one of the most challenging risks for modern society. Indeed, most countries across the globe are exposed to medium/high drought risk (see Fig. 1). Recent events, like the Millennium Drought in Australia (Van Dijk et al., 2013) and droughts in California (Lund et al., 2018) and South Africa (Sousa et al., 2018), highlighted once again that drought events may occasionally last for several years, therefore leading to "multiyear droughts", with a duration of 5-10 years, or "megadroughts", with a duration of more than 1 decade (Cook et al., 2016). The sporadic occurrence of megadroughts across the globe is also confirmed by several paleoclimatic records (Cook et al., 2016; Vance et al., 2015). Multiyear droughts and megadroughts are a cause for concern as they strongly impact ecosystems, water supply, socio-economic assets, and, ultimately, public health (Tabari et al., 2021; Ukkola et al., 2020; Tabari, 2020; Stahl et al., 2016). Also, the vulnerability and exposure to these events are much higher than in the past.
Multiyear droughts are rare extreme events, which are related to large-scale atmospheric teleconnections whose dynamics are dictated by chaotic behaviours. An implication of the rare occurrence of multiyear droughts is the limited availability of data to decipher their frequency and train prediction models. For the same reason, it is also difficult to predict how climate change may exacerbate drought risk. Indeed, the warmer climate is changing the hydrological cycle and further affects precipitation (Trenberth, 2011), with evidence demonstrating that precipitation is changing in terms of annual mean (Knutti and Sedláček, 2013), seasonal variation (Polade et al., 2014), and extreme events (Papalexiou and Montanari, 2019; Alexander et al., 2006). According to the Intergovernmental Panel on Climate Change Sixth Assessment Report (IPCC AR6), such changing conditions are likely to continue in the 21st century (IPCC, 2021). Therefore, the risk of multiyear droughts is likely to increase in the future, in view also of the increased water demands driven by global warming (Li et al., 2020).
The above reasoning highlights the key role of predictions in mitigating the risk of multiyear droughts and designing adaptation procedures. Future climatic scenarios are usually generated by general circulation models (GCMs), which attempt to simulate future climate variables under given scenarios of CO2 emissions. Recently, the World Climate Research Programme (WCRP) released the Coupled Model Intercomparison Project Phase 6 (CMIP6; Eyring et al., 2016). Simulations from several models are available which deliver ensemble projections of future climate. The reliability of rainfall simulated by GCMs has been analysed by several studies that focused primarily on the generation of climate models that preceded CMIP6 (CMIP5 and previous models; see Palerme et al., 2017; Koutroulis et al., 2016; Aloysius et al., 2016; S. Kumar et al., 2014; Sillmann et al., 2013). Most of this research compared the climatic scenarios with historical datasets produced by reanalysis such as NCEP or ERA-Interim (Palerme et al., 2017; S. Kumar et al., 2014; Sillmann et al., 2013). However, on the one hand, reanalysis may introduce biases in the reproduction of extreme events. On the other hand, the observation period of these datasets covers only the recent past and therefore may undermine a statistical assessment of performances.
The present study aims at inspecting the ability of selected CMIP6 models in simulating regional-scale climate by focusing on meteorological drought occurrence. We first compare simulated statistics with those of one of the longest daily rainfall records globally available: the Bologna rainfall series, whose observation period dates back to 1813. We decided to adopt an observed record as a baseline instead of a reanalysis to take advantage of an extraordinarily long observation period. Second, we assess meteorological drought changes up to 2100.
The purpose of our study is twofold: (i) to evaluate the ability of up-to-date GCMs in simulating the statistics of observed precipitation by focusing on multiyear meteorological droughts and (ii) to infer how precipitation and drought risk will change in the future. The paper is organized as follows: Sect. 2 provides detailed descriptions of the data used in this study. Sections 3-5 describe the methods for bias correction, reliability testing, and future change assessment. Section 6 presents the results of the historical evaluation and future projections for both precipitation and drought risk. Finally, Sect. 7 summarizes our conclusions.

Rainfall observations
Italy is one of the first countries that started to systematically collect meteorological observations. Meteorological instruments and a network of observations were developed by Galileo's scholars and operated in the 17th century already. The rainfall series collected in Padua since 1725 is the longest daily record in the world, and five other rainfall stations have been continuously in operation, with few missing values, since the 18th century (Bologna, Milan, Rome, Palermo, and Turin). Therefore, a data set of enormous value has been accumulated in Italy over the last 3 centuries (Brunetti et al., 2006).
Rainfall observation in Bologna at a daily timescale dates back to 1714. The series is continuous from 1813 onwards. Brunetti et al. (2001) provided an interesting description of the history of the time series. The observatory was originally located in the centre of the city. The rain gauge was changed in 1857 and likely in 1900. After 1978 data were collected by another observatory in a nearby location. Brunetti et al. (2001) proposed monthly correction factors to resolve apparent underestimation of rainfall during 1813-1858, 1900-1928, and 1813-1978. These corrections are valid for monthly rainfall only.
The daily rainfall series observed in Bologna from 1813 to 2021 can be obtained from the European Climate Assessment & Dataset (ECA&D) (https://climexp.knmi.nl/, last access: 9 March 2022). These are original data as reported in the transcripts of the observatory, without any correction. For the purpose of the present analysis, we assume that the daily series is homogeneous. In view of the time window adopted by CMIP6 GCMs for the historical reconstruction, we limit our analysis of the Bologna time series to the period from 1850 to 2014 (see Sect. 2.2). Figure 2 shows the daily time series and the 10-year moving average, which oscillates between a minimum of 1.2 mm and a maximum of 2.5 mm (more than twice the minimum). The above extreme values occurred in the 1820s and the decade ending in 1902, respectively. After 1950, a mild increasing trend is noticed, which mirrors a similar tendency that occurred from the 1830s to the 1890s.
From each model, an ensemble simulation is generated for different initial conditions, initialization methods, physics versions, and forcing datasets. Similarly to Grose et al. (2020) and Kim et al. (2020), we analyse only the first run of each ensemble, which is identified by the abbreviation "r1i1p1f1", where "r", "i", "p", and "f" indicate initial conditions, initialization method, physics version, and forcing data set, respectively (see https://pcmdi.llnl.gov/CMIP6/Guide/dataUsers.html, last access: 9 March 2022, for more details). We assume that analysing a single ensemble member per GCM is sufficient to sample the ensemble for testing model bias. The assumption above was recently proved to be tenable by Longmate et al. (2023). Table 1 shows detailed information for the 13 GCMs used in this study. We select all the models providing future simulations for the three considered emission scenarios. The multimodel ensemble mean is the arithmetic average value of the 13 GCM outputs.

Bias correction
GCM simulations are provided at the grid scale. To compare them with observed data, one should take into account the potential bias that may be introduced by subgrid variability. For the considered timescale, subgrid variability is expected to be limited in the region of Bologna. In fact, we focus on monthly and annual rainfall data, which exhibit low spatial variability in the region (see the annual reports of the Regional Agency for Environmental Protection at https://www.arpae.it, last access: 20 February 2023; see also Antolini et al., 2016, for an analysis of subgrid variability in the considered spatial domain).
To compensate for potential bias, we applied bilinear spatial interpolation to estimate the model prediction for Bologna depending on the four nearest GCM grid points. Moreover, we applied quantile delta mapping (QDM) to correct bias with respect to the observed daily rainfall series. QDM (Cannon et al., 2015) preserves model-projected relative change in quantiles while at the same time correcting the systematic biases in quantiles of a model simulation compared to observed values. It is widely adopted for bias correction of GCM output such as precipitation (Xavier et al., 2022;Fauzi et al., 2020). Here, we apply QDM to the model runs for the historical (1850-2014) and future periods (2015-2100).
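The bilinear interpolation step can be illustrated with a short Python sketch. It is not the code used in the study: the function name `bilinear`, the grid-cell corners, and the rainfall values are hypothetical, and a regular longitude-latitude cell enclosing the target point is assumed.

```python
def bilinear(lon, lat, lon0, lon1, lat0, lat1, v00, v10, v01, v11):
    """Bilinear interpolation inside the cell [lon0, lon1] x [lat0, lat1].

    vXY is the value at (lonX, latY), e.g. v10 is the value at (lon1, lat0).
    """
    tx = (lon - lon0) / (lon1 - lon0)   # fractional position along longitude
    ty = (lat - lat0) / (lat1 - lat0)   # fractional position along latitude
    south = (1 - tx) * v00 + tx * v10   # interpolate along the southern edge
    north = (1 - tx) * v01 + tx * v11   # interpolate along the northern edge
    return (1 - ty) * south + ty * north


# Hypothetical daily rainfall (mm) at the four grid points around Bologna
# (about 11.34 E, 44.49 N); the cell corners are illustrative, not a real GCM grid.
rain_bologna = bilinear(11.34, 44.49, 10.0, 12.0, 44.0, 46.0, 3.0, 5.0, 2.0, 4.0)
print(round(rain_bologna, 3))
```

Applied day by day to the four surrounding grid-point series, the same function would yield a point-scale series for the target location, which can then be passed to the bias correction.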
First, we compute the empirical frequency of non-exceedance $q_f(t)$ of each GCM-simulated value $m_f(t)$ during the future (denoted by the subscript "f") period. Then, the relative change in quantiles $\Delta(t)$ of GCM-simulated precipitation over two time periods is given by

$$\Delta(t) = \frac{F_f^{-1}[q_f(t)]}{F_h^{-1}[q_f(t)]} = \frac{m_f(t)}{F_h^{-1}[q_f(t)]}, \tag{1}$$

where $F_f$ and $F_h$ denote the distribution of empirical frequencies of each GCM-simulated value during the future and historical period, respectively. Then, the bias-corrected value $\hat{o}(t)$ corresponding to $q_f(t)$ is computed by the frequency of non-exceedance $F_o$ of the observations during the historical period:

$$\hat{o}(t) = F_o^{-1}[q_f(t)]. \tag{2}$$

Finally, the bias-corrected GCM simulation $\hat{m}_f(t)$ for the future period is given by

$$\hat{m}_f(t) = \hat{o}(t)\,\Delta(t). \tag{3}$$

Note that bias correction for the historical GCM simulation only needs the application of Eq. (2) by plugging in $q_h(t)$ in the right-hand side instead of $q_f(t)$.
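As an illustration, the QDM steps can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the implementation used in the study: it assumes the multiplicative (ratio) form of QDM that Cannon et al. (2015) describe for precipitation, a plotting position k/(n+1) for the empirical frequencies, and a fallback of no relative change when the historical model quantile is zero. All function names and the toy series are hypothetical.

```python
from bisect import bisect_right


def nonexceedance(sample_sorted, x):
    """Empirical frequency of non-exceedance of x (plotting position k/(n+1))."""
    n = len(sample_sorted)
    k = min(max(bisect_right(sample_sorted, x), 1), n)
    return k / (n + 1)


def quantile(sample_sorted, q):
    """Inverse of the plotting position above, linear between order statistics."""
    n = len(sample_sorted)
    pos = min(max(q * (n + 1) - 1, 0.0), n - 1.0)   # 0-based fractional index
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    return sample_sorted[lo] + (pos - lo) * (sample_sorted[hi] - sample_sorted[lo])


def qdm_future(obs_hist, mod_hist, mod_fut):
    """Bias-correct a future model series while keeping the model's relative change."""
    obs_s, hist_s, fut_s = sorted(obs_hist), sorted(mod_hist), sorted(mod_fut)
    corrected = []
    for m_f in mod_fut:
        q_f = nonexceedance(fut_s, m_f)          # frequency of m_f in the future run
        h_q = quantile(hist_s, q_f)              # historical model quantile at q_f
        delta = m_f / h_q if h_q > 0 else 1.0    # relative change (1.0: assumed fallback)
        o_hat = quantile(obs_s, q_f)             # observed quantile at q_f
        corrected.append(o_hat * delta)
    return corrected


def qdm_historical(obs_hist, mod_hist):
    """Historical run: plain quantile mapping onto the observed distribution."""
    obs_s, hist_s = sorted(obs_hist), sorted(mod_hist)
    return [quantile(obs_s, nonexceedance(hist_s, m)) for m in mod_hist]


# Toy series (mm), purely illustrative: the model projects a doubling of rainfall.
obs_hist = [0.0, 1.0, 2.0, 4.0, 8.0]
mod_hist = [0.5, 1.5, 2.5, 3.5, 4.5]
mod_fut = [2.0 * m for m in mod_hist]
corrected_fut = qdm_future(obs_hist, mod_hist, mod_fut)
corrected_hist = qdm_historical(obs_hist, mod_hist)
print(corrected_fut, corrected_hist)
```

In the toy example the corrected historical run reproduces the observed quantiles, while the corrected future run is the observed quantiles multiplied by the model-projected factor of two, which is the property that distinguishes QDM from plain quantile mapping. Dry days and trace amounts would need additional care in a real application.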

Reliability testing
The reliability of GCM simulations with and without bias correction is evaluated by focusing on different temporal scales to obtain a comprehensive picture of model performances.

Monthly and annual rainfall
To assess the performance of each of the considered CMIP6 GCMs in reproducing the statistics of monthly data during the historical period (1850-2014), we use the "combined probability-probability" (CPP) plot (Koutsoyiannis and Montanari, 2022). The CPP plot compares the probability distributions $F_o$ and $F_h$ of observed and GCM-simulated monthly rainfall, respectively, during the historical period. The comparison refers to five consecutive 33-year-long time windows during 1850-2014 to assess the capability of GCMs to reproduce changes in climate statistics over time.
To make the CPP plot, first a realization $w$ from a uniform random variable is generated in the range [0, 1], and then we compute the empirical probability distribution

$$F_w(w) = F_h\left[F_o^{-1}(w)\right].$$

The ability of a GCM to reproduce observed statistics is regarded as good if the plot of $F_w(w)$ versus $w$ lies on the equality line, which is what we would like to check. In essence, the plot tests whether the two distributions, estimated from the GCM simulation and observation, are identical. Note that a CPP plot lying above (below) the equality line indicates underestimation (overestimation), while an S-shaped CPP plot with the initial part above (below) the equality line and the second part below (above) the equality line indicates an underestimation of low (high) rainfall and overestimation of high (low) rainfall (Koutsoyiannis and Montanari, 2022). We also compare the observed and simulated empirical density distributions and the lag-1 autocorrelation coefficient of annual rainfall for the whole historical period. Autocorrelation is particularly interesting as it indicates the capability of models to simulate long-term cycles, such as those occurring during multiyear droughts.
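The two diagnostics of this section can be sketched in Python as follows. The sketch is an assumption-laden illustration rather than the procedure of Koutsoyiannis and Montanari (2022): it reads the CPP curve as the simulated non-exceedance frequency evaluated at the observed quantile of probability w, a reading chosen because it is consistent with the interpretation given above (a curve above the equality line means underestimation). The function names and the toy rainfall values are hypothetical.

```python
from bisect import bisect_right
from math import ceil


def cpp_curve(obs, sim, n_points=99):
    """Return (w, F_sim(F_obs^-1(w))) pairs for w on a regular grid in (0, 1)."""
    obs_s, sim_s = sorted(obs), sorted(sim)
    n_obs, n_sim = len(obs_s), len(sim_s)
    curve = []
    for i in range(1, n_points + 1):
        w = i / (n_points + 1)
        # Observed quantile: smallest order statistic whose frequency is >= w
        x_obs = obs_s[max(ceil(w * n_obs - 1e-12), 1) - 1]
        # Empirical non-exceedance frequency of that value in the simulation
        curve.append((w, bisect_right(sim_s, x_obs) / n_sim))
    return curve


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation coefficient of a series."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


# Toy monthly rainfall (mm), purely illustrative.
obs = [12.0, 30.0, 45.0, 51.0, 60.0, 72.0, 80.0, 95.0, 110.0, 140.0]
sim_dry = [0.8 * x for x in obs]    # a model that underestimates rainfall
sim_wet = [1.25 * x for x in obs]   # a model that overestimates rainfall
curve_dry = cpp_curve(obs, sim_dry)
curve_wet = cpp_curve(obs, sim_wet)
print(curve_dry[49], curve_wet[49])
```

With these toy series the curve of the dry model stays on or above the equality line and the curve of the wet model stays below it, matching the reading of the plot described in the text.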

Mean monthly and seasonal rainfall
The goodness of the simulation of monthly and seasonal rainfall averaged over the observation period is evaluated by a graphical comparison with the observed values and the Taylor diagram (Taylor, 2001). The latter is widely applied to summarize how accurately a model simulates an observed record. It integrates three statistical metrics: correlation (R), the centred root-mean-square error (CRMSE), and the ratio of standard deviation (SD) in a single diagram. The angular coordinate represents R. The CRMSE is measured by the distance from the point of reference (observation). Finally, the radial distance from the origin represents the ratio of SD between model simulation and observation. For perfect model simulation, R and the SD ratio assume unit value and CRMSE = 0. These statistics provide a quick summary of the correspondence between the modelled and observed mean monthly and seasonal rainfall, which is particularly useful in assessing the relative merits and overall performance of climate models (Rivera and Arnould, 2020;Dong and Dong, 2021;Yazdandoost et al., 2021). The Taylor diagram is herein used to assess the goodness of the fit of (a) the mean monthly rainfall, (b) March-April-May (MAM) mean rainfall, (c) June-July-August (JJA) mean rainfall, (d) September-October-November (SON) mean rainfall, and (e) December-January-February (DJF) mean rainfall.
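The three statistics summarized by the Taylor diagram can be computed as in the following minimal Python sketch. The function name and the monthly values are hypothetical, and population (1/n) standard deviations are assumed. The statistics are linked by the relation CRMSE² = SD_o² + SD_m² − 2 SD_o SD_m R (Taylor, 2001), which is what allows them to be shown in a single diagram.

```python
from math import sqrt


def taylor_stats(obs, sim):
    """Return (R, CRMSE, SD ratio) for a simulated versus an observed series."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    dev_o = [o - mean_o for o in obs]            # centred observations
    dev_s = [s - mean_s for s in sim]            # centred simulation
    sd_o = sqrt(sum(d * d for d in dev_o) / n)
    sd_s = sqrt(sum(d * d for d in dev_s) / n)
    r = sum(a * b for a, b in zip(dev_o, dev_s)) / (n * sd_o * sd_s)
    crmse = sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_s)) / n)
    return r, crmse, sd_s / sd_o


# Hypothetical mean monthly rainfall (mm), January to December; not data from the study.
obs_monthly = [45.0, 44.0, 52.0, 60.0, 62.0, 55.0, 40.0, 48.0, 62.0, 75.0, 80.0, 58.0]
sim_monthly = [55.0, 50.0, 56.0, 58.0, 55.0, 45.0, 30.0, 38.0, 60.0, 78.0, 85.0, 66.0]
r, crmse, sd_ratio = taylor_stats(obs_monthly, sim_monthly)
print(round(r, 3), round(crmse, 2), round(sd_ratio, 3))
```

Because all three statistics are computed on centred series, a constant bias in the simulation does not affect them; the mean bias therefore has to be inspected separately, for instance through the graphical comparison of monthly means.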

Meteorological droughts
To test the GCM's reliability in simulating multiyear meteorological droughts we apply run theory (Yevjevich, 1967) to annual rainfall to characterize drought events in terms of drought frequency, duration, severity, and intensity. Run theory is one of the most effective approaches for drought identification and has been applied in several areas worldwide (Mishra et al., 2009; Wu et al., 2020; Ho et al., 2021). In detail, the long-term mean rainfall $R_{LT}$ is adopted as the threshold to identify positive or negative runs (see Fig. 3). If rainfall in a given year is lower than an assigned threshold $T_{lower} < R_{LT}$, a negative run is started which ends in the year when the rainfall is higher than $R_{LT}$. If the interval between two negative runs is only 1 year and rainfall in that year is less than a selected threshold $T_{upper} > R_{LT}$, then these two runs are combined into one drought. Finally, only runs which have a duration of no less than 3 years are determined to be multiyear drought events. Here, the thresholds $T_{upper}$ and $T_{lower}$ are defined as 20 % more and 10 % less than $R_{LT}$, respectively. The threshold values have been identified with a trial and error procedure by verifying that relevant droughts observed in the past have been consistently recognized.
Once a multiyear drought has been identified, drought duration is the time span between the start and the end of the event. Drought severity is computed as the cumulative rainfall deficit with respect to $R_{LT}$ during the drought duration divided by the mean rainfall, and drought intensity is computed as the ratio between drought severity and duration. We also estimate the maximum deficit of the drought event, namely, the largest difference between annual rainfall and $R_{LT}$ during the event. Finally, drought frequency is computed as the ratio between the number of drought events that have been identified and the length of the observation period.
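The identification procedure and the drought statistics described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study. Two points that the text leaves open are treated as assumptions: the year in which rainfall rises above the long-term mean is not counted as part of the run, and the surplus of a bridging year is included in the cumulative deficit of a merged event. The function name and the annual rainfall values are hypothetical.

```python
def multiyear_droughts(rain, lower=0.9, upper=1.2, min_years=3):
    """Identify multiyear droughts in an annual rainfall series with run theory."""
    n = len(rain)
    r_lt = sum(rain) / n                          # long-term mean rainfall
    t_lower, t_upper = lower * r_lt, upper * r_lt

    # 1) Negative runs: start below t_lower, last until rainfall exceeds r_lt.
    runs, i = [], 0
    while i < n:
        if rain[i] < t_lower:
            j = i
            while j + 1 < n and rain[j + 1] <= r_lt:
                j += 1
            runs.append([i, j])
            i = j + 1
        else:
            i += 1

    # 2) Merge runs separated by a single year with rainfall below t_upper.
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] == 2 and rain[start - 1] < t_upper:
            merged[-1][1] = end
        else:
            merged.append([start, end])

    # 3) Keep events lasting at least min_years and compute their statistics.
    events = []
    for start, end in merged:
        duration = end - start + 1
        if duration < min_years:
            continue
        deficits = [r_lt - rain[k] for k in range(start, end + 1)]
        severity = sum(deficits) / r_lt           # cumulative deficit / mean rainfall
        events.append({"start": start, "end": end, "duration": duration,
                       "severity": severity, "intensity": severity / duration,
                       "max_deficit": max(deficits)})
    return r_lt, events, len(events) / n          # last item: drought frequency


# Hypothetical annual rainfall (mm per year), mean 700 mm; not data from the study.
annual = [900.0, 600.0, 620.0, 780.0, 610.0, 590.0,
          850.0, 800.0, 750.0, 600.0, 720.0, 580.0]
r_lt, events, frequency = multiyear_droughts(annual)
for event in events:
    print(event)
```

In this toy series two pairs of negative runs are bridged by a single year with rainfall between the long-term mean and the upper threshold, so two multiyear droughts of 5 and 3 years are identified.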

Future climate change assessment
Statistics of future projections of annual and seasonal rainfall of the 13 considered GCMs under the three considered scenarios are compared with observed and simulated statistics of the historical period to evaluate future climate change in the Bologna region. For a detailed comparison of seasonal rainfall, the future time horizon is divided into near-future (2030-2059) and far-future (2070-2099) time windows. The multi-model median and the 25th-75th percentiles of the projections given by the GCMs are considered for each scenario to represent the associated ensemble uncertainty.

[...] bias correction, respectively. Note that QDM was operated over the whole historical period. While each individual model displays consistent performance in terms of probability distribution across various time windows with only minor differences, the results do not allow easy identification of the optimal model for a given time window. Before bias correction, MPI-ESM1-2-LR generally underestimates monthly rainfall for all periods while some other models (ACCESS-CM2, GFDL-ESM4, and MIROC6) end up with a prevailing overestimation. CMCC-ESM2 and CanESM5 capture low rainfall relatively well, while underestimating high rainfall. The remaining models (e.g. FGOALS-g3, IPSL-CM6A-LR, and INM-CM4-8) generally fit the observed distribution well in some time windows while exhibiting slight departures in other periods. The multi-model ensemble satisfactorily simulates the mean value while overestimating low rainfall and underestimating high rainfall.
As expected, after bias correction all models show a better performance in reproducing probability distributions in different periods except FGOALS-g3 and INM-CM4-8, which slightly underestimate low rainfall after QDM. The performance of the ensemble remains somewhat consistent, although for some periods one notes an improvement in the fit of the mean value. In general, it is confirmed that bias correction improves the model performances for historical simulations. The obvious limit of QDM is the requirement of an extended data set of historical data.
For the whole historical period, Fig. 6 shows a comparison between observed and simulated empirical density distribution of annual rainfall before and after bias correction. Before QDM, the majority of models perform worse in capturing heavier rainfall than lighter rainfall, with the opposite result occurring for EC-Earth3-Veg-LR, MIROC6, and MRI-ESM2-0. Only ACCESS-CM2 and GFDL-ESM4 overestimate both high and low rainfall. After QDM, all the models exhibit improved performances although underestimation of high rainfall is introduced by QDM for seven models and the ensemble simulation. It is noted that the multi-model ensemble (MME) mean exhibits lower variability with respect to observed data both before and after bias correction. This result is coherent with what is shown in Figs. 4 and 5.
Table 2 displays the lag-1 autocorrelation coefficient for each model simulation before and after QDM bias correction. It is interesting to note that the correlation of observed data is slightly higher than in all the models, therefore highlighting a possible model weakness in simulating temporal correlation. Since the coefficient of the observation series is 0.226, the observed values may be correlated in pairs or triplets while most of the GCMs' series are uncorrelated, thereby implying a possible lack of fit at the biennial or triennial scale. After QDM, the correlation decreases, which indicates that QDM may influence the autocorrelation of the raw GCM output.

Figure 7 shows the graphical comparison between observed and simulated mean monthly rainfall for the whole historical period before bias correction. Most of the models adequately replicate the seasonal rainfall pattern except INM-CM4-8 and FGOALS-g3. However, ACCESS-CM2 overestimates rainfall for all months. Moreover, all models except MPI-ESM1-2-LR, CMCC-ESM2, NorESM2-MM, and CanESM5 overestimate rainfall in DJF. Conversely, all models except ACCESS-CM2, GFDL-ESM4, EC-Earth3-Veg-LR, and MRI-ESM2-0 underestimate JJA and/or SON rainfall. For MAM rainfall, several models exhibit overestimation. In addition, the MME mean satisfactorily reproduces the annual cycle of rainfall, especially MAM rainfall, while slightly underestimating JJA and SON and overestimating DJF rainfall.

Figure 8 displays the same graphical comparison after QDM. Although INM-CM4-8 and FGOALS-g3 still fail to reproduce the seasonal pattern, the performance of ACCESS-CM2 has been improved. Other models show improvement in some seasons only (e.g. NorESM2-MM better captures SON rainfall but shows worse performances for MAM rainfall). In general, most of the models underestimate summer (JJA) rainfall after QDM.

Figure 9 shows the Taylor diagram for each GCM and the MME mean when simulating the seasonal rainfall pattern. It confirms that six of the models and the MME mean adequately replicate the mean monthly rainfall with relatively high R, low CRMSE, and low SD. In particular, satisfactory results are provided by EC-Earth3-Veg-LR and MRI-ESM2-0. IPSL-CM6A-LR and FGOALS-g3 display relatively poor performance. However, a considerable improvement is exhibited in reproducing MAM, SON, and DJF rainfall for these two models, especially FGOALS-g3, which can simulate SON rainfall best. For DJF rainfall, all models show a better performance with respect to other seasons, and the MME mean performs better than any individual model. Most of the models depict a worse performance in SON and JJA than in MAM and DJF, with much larger bias. In general, EC-Earth3-Veg-LR provides a satisfactory simulation of both the annual cycle and each seasonal rainfall, while GFDL-ESM4, ACCESS-CM2, and FGOALS-g3 can better reproduce the MAM, JJA, and SON rainfall, respectively. Although its R is slightly lower than that of EC-Earth3-Veg-LR, the MME mean performs satisfactorily in simulating the annual cycle of rainfall and best captures DJF rainfall but shows no advance over the single models in reproducing MAM, JJA, and SON rainfall.

Figure 10 shows the Taylor diagram after QDM. In general, several models show higher CRMSE for JJA rainfall and lower SD for SON rainfall, while the performance for MAM rainfall is similar before and after QDM. Although QDM displays no evident improvement in the models' ability to capture seasonal rainfall patterns, one notes a slightly enhanced performance for the SON season. Overall, the models still show a better ability to reproduce DJF rainfall compared to other seasons. For the annual cycle, individual models (e.g. GFDL-ESM4 and MRI-ESM2-0) exhibit improved simulation abilities, but the MME mean shows a higher SD after bias correction.

R. Guo and A. Montanari: Meteorological drought hazard estimation by historical data and climate models

Figure 6. Comparison of sample probability density of annual rainfall for observations and GCM historical simulations.

Table 2. Lag-1 autocorrelation coefficient between observed annual rainfall and each historical simulation before and after bias correction.

Meteorological droughts
Tables 3 and 4 show the characteristics of multiyear droughts derived by using run theory for both observations and each model simulation before and after bias correction for both the whole historical and whole future periods, the latter for the SSP1-2.6 and SSP5-8.5 scenarios. For the observed data, the long-term mean value of annual rainfall, $R_{LT}$ = 705 mm yr$^{-1}$, is regarded as the threshold to identify the observed multiyear drought events.
The results highlight that FGOALS-g3, INM-CM5-0, MIROC6, MPI-ESM2-0, and NorESM2-MM simulate drought frequency (DF) fairly well. Notably, all models fail to replicate the drought duration (DD), drought intensity (DI), and maximum deficit (MD). For instance, IPSL-CM6A-LR and INM-CM4-8 show the best performance in simulating DD, which, however, is still underestimated by about 10 % even by the best simulation. Although MPI-ESM1-2-LR presents the highest values of DI and MD among all the models, marked underestimation with respect to the observations still occurs. The MME mean displays relatively good performance in terms of DF but still underestimates DD, DI, and MD. In detail, the MME mean DD is underestimated by about 17 %, while the observed DI and MD are nearly 34 % and 39 % higher than the MME mean, respectively.
After QDM, a slight improvement is obtained for the simulation of DF. The MME mean confirms its satisfactory performance, although six models still end up with underestimation. Notably, all models still fail to replicate the DD, DI, and MD, with the only exception being INM-CM4-8, which satisfactorily reproduces drought characteristics. In general, the impact of QDM varies depending on each model and drought behaviour.
QDM slightly mitigates the underestimation by the MME mean of DD, DI, and MD, which, however, remain 12 %, 10 %, and 12 % lower than observations, respectively.
Future changes

Changes in average rainfall
Figure 11 shows the future projections of annual rainfall after QDM for three different scenarios (SSP1-2.6, SSP2-4.5, and SSP5-8.5), compared with the MME mean simulation of the historical period (1850-2014). Future annual rainfall shows only a slight decrease for the SSP5-8.5 scenario while fluctuating with close to stationary behaviour with respect to the historical period. After applying bias correction, future projections exhibit similarities to the raw output but with a slightly lesser degree of change.
To inspect the temporal progress of changes, the annual cycle of rainfall after bias correction is considered and the future period is divided into the near future (2030-2059) and far future (2070-2099) related to the present-day simulation. Figure 12 shows an overall decrease in monthly rainfall under all the scenarios in each future period. Under the SSP1-2.6 scenario, the MAM rainfall in the far future is expected to be less than in the near-future period. Conversely, the rainfall during the SON season in the far future will be higher than in the near future. Moreover, rainfall may be more likely to be concentrated in November rather than in October in both periods. This change in rainfall pattern also occurs in the near-future period under the SSP2-4.5 scenario, but the overall monthly rainfall shows no evident difference between periods. Under the SSP5-8.5 scenario, the MAM and JJA rainfall in the far future will be considerably less than in the near future, and the rainfall will be more concentrated in the SON and DJF seasons. The results after QDM depict an analogous future change in the seasonal rainfall pattern. With respect to raw output, however, the JJA season is projected to be drier after QDM, which aligns with the historical simulation.

Table 3. Mean values over the considered period of drought frequency (DF), duration (DD), intensity (DI), and maximum deficit (MD) for multiyear meteorological droughts exhibited by observed data and reproduced by models before bias correction for the historical (1850-2014) and future (2015-2100) periods under the two considered scenarios.

The future changes in DF with respect to historical simulations are not remarkable. DD, DI, and MD are generally underestimated with respect to historical data. In fact, the values of DD, DI, and MD for the MME mean under SSP1-2.6 and SSP5-8.5 are both lower with respect to observations. Moreover, DD of all models except GFDL-ESM4 is shorter under SSP1-2.6 with respect to observed data. Only the MME mean and models CMCC-ESM2, MIROC6, and MPI-ESM1-2 show a consistent increase in DD when turning from the historical simulation to SSP1-2.6 and SSP5-8.5. Nearly all models show an increase in DI with respect to historical simulations for at least one future scenario, and all models except FGOALS-g3 and MIROC6 show a more intense drought under SSP5-8.5 than SSP1-2.6. Changes in MD are similar to those in DI. The above considerations show that historical data depict a worse future in terms of multiyear droughts with respect to simulations before QDM.

Figure 11. Time series of annual rainfall for both the historical simulation and future projections after bias correction (mm yr$^{-1}$). Historical simulation (black) and future MME mean simulations for SSP1-2.6 (green), SSP2-4.5 (blue), and SSP5-8.5 (red). The uncertainty ranges for the historical simulation (grey shading) and future projection under SSP5-8.5 (red shading) are shown as the range between the 25th and 75th percentiles of the ensemble.

Table 4 presents the corresponding future drought changes after QDM. The good performance of the MME mean in terms of DF is confirmed. However, more models exhibit less DF compared to observations under SSP5-8.5. For DD, QDM results in a general deterioration of performances in terms of underestimation. For DI and MD, similar values to observed data are reached by the MME mean under SSP5-8.5 only but with large variability among models. In general, QDM improves MME mean performances, but one still notes large variability among models.
Moreover, the expectation of increased drought risk in the future with respect to historical observations is not confirmed even after QDM and the application of the worst emission scenario.

Conclusions
The present study refers to the region of Bologna, where the availability of a 209-year-long daily rainfall series allows us to make a unique assessment of GCMs' reliability and their predicted changes in rainfall and drought risk. The results show that GCMs provide a satisfactory simulation of rainfall seasonality, while statistics of rainfall series estimated for the long-term historical period exhibit discrepancies among models and limited reliability in some cases. In particular, the GCMs show weakness in capturing the correlation of annual rainfall, thereby implying a possible lack of fit in the simulation of cycles.
In fact, our focus is concentrated on the statistics of multiyear droughts. We found that the multi-model ensemble can satisfactorily simulate the mean frequency of droughts during the historical period. Conversely, the mean duration, intensity, and maximum deficit of multiyear droughts are underestimated.
Bias correction with QDM improves the simulation of the statistics of the monthly and annual series, while it does not show consistent enhancements in capturing the correlation of annual rainfall and the distribution of seasonal rainfall. The improvement by QDM to simulate drought characteristics is limited. Indeed, future projections by the multi-model ensemble of multiyear droughts depict a similar risk to historical observations, even after bias correction and adopting the most critical emission scenario.
Our results suggest that validation at the local scale of GCM simulations is an essential step to inform downscaling procedures and correction techniques to make sure that model predictions are consistent with the local features of climate. However, extreme events like multiyear droughts are infrequent, and therefore validating their predicted statistics is particularly challenging.
Therefore, the identification of future drought risk, which one would expect to be increased under climate change, remains a challenge, especially if we consider that the reliability of bias correction depends on the availability of observed historical data. For some situations, classical engineering methods for critical event estimation under the assumption of stationarity, with appropriate integration of the information provided by climate models to account for climate change, may still be the most precautionary approach. Therefore, rigorous use and comprehensive interpretation of the available information are needed to avoid mismanagement by also taking into account that the impact of multiyear meteorological droughts is likely to be exacerbated by further pressure on water resources due to increasing population and water demand.
Data availability. The historical rainfall series observed in Bologna can be obtained from https://climexp.knmi.nl/ (KNMI and WMO, 2022). All the CMIP6 GCM outputs are publicly available from the Copernicus Data Store (https://cds.climate.copernicus.eu/, Copernicus Climate Change Service, 2022).
Author contributions. AM proposed the main research question and supervised the work. RG made the computational analysis, elaborated additional research ideas, and prepared the paper.
Competing interests. The contact author has declared that neither of the authors has any competing interests.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.