Systematic increases in the thermodynamic response of hourly precipitation extremes in an idealized warming experiment with a convection-permitting climate model

Changes in sub-daily precipitation extremes potentially lead to large impacts of climate change due to their influence on soil erosion, landslides, and flooding. However, these changes are still rather uncertain, with only limited high-resolution results available and a lack of fundamental knowledge on the processes leading to sub-daily extremes. Here, we study the response of hourly extremes in a convection-permitting regional climate model (CPRCM) for an idealized warming experiment—repeating present-day observed weather under warmer and moister conditions. Ten months of simulation covering summer and early autumn for two domains over western Central Europe and western Mediterranean are performed. In general, we obtain higher sensitivities to warming for local-scale extreme precipitation at the original grid-scale of 2.5–3 km than for aggregated analyses at a scale of 12–15 km, representative for currently conventional regional climate models. The grid-scale sensitivity over sea, and in particular over the Mediterranean Sea, approaches 12%–16% increase per degree, close to two times the Clausius–Clapeyron (CC) relation. In contrast, over the dry parts of Spain the sensitivity is close to the CC rate of 6%–7% per degree. For other land areas, sensitivities are in between these two values, with a tendency for the cooler and more humid areas to show lower scaling rates for the most intense hourly precipitation, whereas the land area surrounding the Mediterranean Sea shows the opposite behaviour with the largest increases projected for the most extreme hourly precipitation intensities. While our experimental setup only estimates the thermodynamic response of extremes due to moisture increases, and neglects a number of large-scale feedbacks that may temper future increases in precipitation extremes, some of the sensitivities reported here reflect findings from observational trends. 
Therefore, our results can provide guidance within which to understand recent observed trends and for future climate projections with CPRCMs.


Introduction
Last year, 2018, in particular in autumn, the area around the Mediterranean Sea was hit by several occurrences of extreme precipitation, resulting in (flash) floods and landslides, widespread damage and several tens of fatalities (see supplementary [...] uncertainty in future projections of precipitation extremes, in particular for sub-daily, relatively small-scale precipitation extremes which primarily result from (organized) convective showers (Westra et al 2014, Lenderink and Attema 2015). This has multiple causes. One major factor is the lack of sufficiently reliable and sufficiently long climate model simulations. It is widely recognized that current state-of-the-art regional climate models (RCMs), which are operated at grid meshes down to 12 km, have insufficient resolution to resolve the complex dynamics of convective systems. To compensate, these models use simple statistical schemes, called parameterizations, which represent the effect of convective mixing on the atmosphere in a simplified manner. Whether these simplified schemes are sufficiently reliable in the context of climate change is doubtful (Kendon et al 2017).
For this reason, so-called convection-permitting regional climate models (CPRCMs) have been developed and applied in recent years (e.g. Prein et al 2015, Coppola et al 2018). These models run at a considerably finer grid spacing, typically between 1.5 and 4 km. They only resolve the larger-scale convective motions (>2 times the grid spacing) in convective cloud systems, which is why they are called convection-permitting instead of convection-resolving. It has been shown that these models generate better extreme statistics of sub-daily precipitation, and represent the occurrence, duration and diurnal cycle of precipitation much better (Ban et al 2014, Prein et al 2015, Khodayar et al 2016, Lind et al 2016, Vanden Broucke et al 2019). However, in the context of climate change, limited evidence exists of the added value of these CPRCMs; a number of studies have found that changes in precipitation extremes obtained with CPRCMs are different from those obtained with RCMs (mostly for short duration, local scale and in summer), while in others changes in the CPRCMs were similar to the RCM-derived changes ([...] 2017, Ban et al 2015). The issue of added value is particularly relevant as CPRCMs are computationally very expensive. Simulations of thousands of years are possible with current coarser-scale RCMs, which can shed light on issues concerning the discrimination of natural variations from systematic trends due to climate change (Aalbers et al 2018). However, the simulation length in existing CPRCM simulations is typically only 10-15 years (e.g. Kendon et al 2014, Ban et al 2015). These short simulations lead to weak signal-to-noise ratios, blurring the systematic warming-induced signal in the natural variations of the climate system.
Besides the limited number and length of CPRCM runs available, the lack of process understanding is also an important factor in explaining the large uncertainty. Commonly, the Clausius-Clapeyron (CC) relation is considered the cornerstone to understanding increases of precipitation extremes due to climatic warming ([...] Knutti 2016, Pfahl et al 2017). The CC relation governs the saturation specific humidity of the atmosphere as a function of temperature, yielding a rate of approximately 6.5% per degree near the surface for summertime temperatures. Combined with relatively small changes in relative humidity, the actual humidity of the air also increases at roughly the same rate or slightly below ([...] Muller 2010, Schneider et al 2010). Assuming only small changes in the atmospheric upward motions (implying no change in the divergent atmospheric motions, while changes in the rotational motions are still possible), one may therefore expect changes in precipitation extremes to scale with the CC relation. Yet, changes in the dynamics of the atmosphere may lead to deviations from CC scaling, both at larger scales (Pfahl et al 2017) and possibly also at the convective cloud scale (Trenberth et al 2003, Loriaux et al 2013).
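The CC rate of roughly 6%-7% per degree quoted above can be checked numerically. The sketch below is illustrative only and is not part of the study: it assumes a Magnus-type fit for saturation vapour pressure with the commonly used Bolton (1980) coefficients, and uses vapour pressure as a proxy for saturation specific humidity, to which it is nearly proportional at fixed pressure. The function names are hypothetical.

```python
import math


def saturation_vapour_pressure_hpa(t_celsius):
    """Saturation vapour pressure over water in hPa.

    Assumption: Magnus-type fit with Bolton (1980) coefficients,
    valid for typical near-surface temperatures.
    """
    return 6.112 * math.exp(17.67 * t_celsius / (t_celsius + 243.5))


def cc_rate_per_degree(t_celsius, dt=1.0):
    """Fractional increase in saturation vapour pressure per degree of warming."""
    e_cold = saturation_vapour_pressure_hpa(t_celsius)
    e_warm = saturation_vapour_pressure_hpa(t_celsius + dt)
    return (e_warm / e_cold - 1.0) / dt


if __name__ == "__main__":
    # Summertime near-surface temperatures (illustrative values).
    for t in (10.0, 15.0, 20.0, 25.0):
        print(f"T = {t:4.1f} degC: {100.0 * cc_rate_per_degree(t):.1f}% per degree")
```

The rate decreases slowly with temperature, which is why a single value of about 6.5% per degree is a reasonable shorthand for summertime conditions.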
Observation-based estimates of the sensitivity of hourly precipitation extremes suggest the existence of super-CC scaling: a dependency on near-surface temperature or dew point temperature exceeding the CC relation, up to even a factor 2 (Lenderink and van Meijgaard 2010, Lenderink et al 2011). There is still strong controversy in the literature about this behaviour (Bao et al 2017, Zhang et al 2017, Drobinski et al 2018). A number of recent trend analyses support super-CC behaviour, yet other studies do not find this ([...] Zhang et al 2017, Guerreiro et al 2018). Also, from CPRCM simulations no strong evidence has been found up to now for super-CC scaling (e.g. Ban et al 2015, Chan et al 2015).
In this paper we will address these issues in a rather idealized and well-constrained modelling setup based on a pseudo-global warming experiment (Schär et al 1996). In this experiment observed weather is repeated under warmer and moister atmospheric conditions. This leads to much better signal-to-noise ratios as the influence of variations in large-scale atmospheric conditions is, by construction, strongly suppressed (Prein et al 2016). Here, we applied this approach to a selection of 10 months for two modelling domains: one setup for western Central Europe and one setup for the western Mediterranean. While this does not provide the most realistic setup with which to study climate change in its full complexity, we believe that it does provide a good framework to study the influence of thermodynamical processes (that is, enhanced moisture content and its influence on the local dynamical processes) on sub-daily precipitation extremes in relation to the questions raised above.
In particular, this paper will focus on the following questions in this pseudo-global warming context: (i) is the response to warming of hourly precipitation at the grid-scale of a CPRCM (here 2.5-3 km) different from the response at the scale of aggregation typical of the mesh-size of conventional RCMs (here 12 km)? (ii) Is super-CC scaling in the response to warming possible under these well-constrained conditions? (iii) Are there systematic differences in the response of extremes between different climatic regions, and are there differences in response between the most rare extreme events and more moderate ones?

Modelling setup
The simulations are performed with the non-hydrostatic CPRCM HCLIM-AROME (cycle 38) (Lind et al 2016). HCLIM is an offspring of the long-term collaboration between the ALADIN and HIRLAM communities in Europe, including the numerical weather prediction model HARMONIE-AROME (Bengtsson et al 2017). HCLIM-AROME is based on the AROME physics and non-hydrostatic dynamics (Seity et al 2011), but with modifications in the physics made by the HIRLAM community (partly described in Bengtsson et al 2017). The parameterization of deep cumulus convection is completely turned off, whereas shallow convection is still parameterized. Experiments are performed for two domains: one centred over Central Western Europe (central domain, run at 2.5 km resolution) and one over the western part of the Mediterranean (southern domain, run at 3 km resolution).
HCLIM-AROME is nested within an RCM run at approximately 12 km resolution, providing boundary conditions at hourly time intervals. For the central domain, we use the RCM RACMO2 (operational at KNMI; van Meijgaard et al 2012), whereas for the southern domain HCLIM-ALADIN (operational at SMHI; Lindstedt et al 2015) is used. Both RCM runs are driven by ERA-interim (Dee et al 2011). For both domains, sets of simulations for 10 separate months have been performed. To ensure that these months capture a sufficient number of rain events as needed to produce stable statistics, the selection is based on wet summer months (May until September) after the year 2000 with a large number of exceedances of more than 25 mm d⁻¹ in the E-OBS 25 km grid (Haylock et al 2008). For the central domain summertime months are simulated (9 in JJA and one in May), whereas for the southern domain late summer/early autumn months are selected (4 in July/August and 6 in September; see table in supplementary information S2 for exact months for both domains).
In the pseudo-global warming experiment, a 2-degree warming is applied at the lateral boundaries of the larger RCM domain (RACMO and HCLIM-ALADIN) as well as at the land surface (all soil layers) and the sea surface temperature, following Attema et al (2014). However, in contrast to that study, we applied it here to a much larger domain (at the boundaries of the RCM domain, instead of directly at the CPRCM boundaries) and also in longer simulations. In the following the control experiment is denoted as CTL, whereas the pseudo-global warming experiment is referred to as TP2. The uniform warming applied at the boundaries suggests that atmospheric stability does not change. However, the fact that the perturbation is applied at model pressure levels, consistent with the geostrophic thermal wind relation (Attema et al 2014), implies a small stabilization of the atmosphere in the inner domain (Schär et al 1996). Also, in the inner domain a re-adjustment of the vertical structure of the atmosphere leads to an additional small stabilization (see supplementary information S3 for more details).

Analysis methodology
We examine the statistics of the CPRCM simulation at its native resolution, representative of rainfall at the 2.5-3 km scale, and compare this to rainfall statistics aggregated to the typical RCM scale of 12-15 km. For this purpose, we computed, for each hourly time step, the mean as well as the maximum hourly precipitation over boxes of 5×5 grid-points, Pmean 5×5 and Pmax 5×5 respectively. The latter, Pmax 5×5, is representative of the local grid-point scale extremes in the data set, whereas the box mean Pmean 5×5 field is considered representative of grid-point scale extremes for a typical RCM resolution. In the following we refer to this data set when using 'grid boxes.' We analysed the total sum of the rainfall including a small contribution from the shallow convection scheme.
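The box aggregation described above can be sketched with NumPy as follows. This is a minimal illustration rather than the processing code used for the study: the array layout (time, y, x), the trimming of incomplete boxes at the domain edge, and the function name are all assumptions.

```python
import numpy as np


def block_mean_and_max(precip, block=5):
    """Aggregate hourly precipitation to non-overlapping block x block boxes.

    precip: array of shape (time, ny, nx) on the native model grid.
    Returns (pmean, pmax), each of shape (time, ny // block, nx // block).
    Assumption: rows/columns that do not fill a complete box are dropped.
    """
    nt, ny, nx = precip.shape
    ny_b, nx_b = ny // block, nx // block
    trimmed = precip[:, : ny_b * block, : nx_b * block]
    # Split each horizontal axis into (box index, position within box).
    boxes = trimmed.reshape(nt, ny_b, block, nx_b, block)
    pmean = boxes.mean(axis=(2, 4))  # box mean, RCM-like scale
    pmax = boxes.max(axis=(2, 4))    # box maximum, local grid-point scale
    return pmean, pmax


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hourly = rng.gamma(shape=0.3, scale=2.0, size=(24, 52, 48))  # synthetic data
    pmean, pmax = block_mean_and_max(hourly)
    print(pmean.shape, pmax.shape)
```

By construction the box maximum is never smaller than the box mean, so differences in the response of the two statistics reflect how precipitation is distributed within a box.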
The data set has been analysed in two simple ways. First, we examined the absolute maximum for each grid-box over the full simulation data set of 10 months, hereafter referred to as the whole 10 month time period, in order to investigate the response to warming and the difference between Pmean 5×5 and Pmax 5×5.
Second, we investigated the data pooled over all time steps (approximately 7400 hourly time steps) and grid-boxes within certain areas. The pooled data were sorted and distributions of the sorted data were compared between the CTL and the TP2 experiment. We use the term 'pooled fraction of exceedance' (PFOE) to denote that these data are pooled from the spatial, as well as the temporal, dimension. We note that this statistic is practically equivalent to percentiles (PCTL), with PFOE = 1 − PCTL/100 (so, for example, the 99th percentile corresponds to a PFOE of 0.01), but avoids impractical values such as 99.999; see e.g. figure 3 of Kendon et al (2014). Relative changes in the TP2 simulation compared to the CTL simulation are computed as a function of this PFOE, and are normalized by the simulated change in dew point temperature over the area considered. We call these scaling rates.
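As an illustration of the PFOE-based scaling rate, the sketch below pools all values, reads off the intensity exceeded by a given fraction of the pooled sample, and normalizes the relative change by the dew point temperature change. It is a sketch under stated assumptions, not the analysis code of the study: the linear normalization (percentage change divided by degrees), the use of `np.quantile` with its default interpolation, and the function names are all assumptions.

```python
import numpy as np


def intensity_at_pfoe(pooled, pfoe):
    """Intensity exceeded by a fraction `pfoe` of the pooled sample.

    Uses PFOE = 1 - PCTL/100, so PFOE = 0.01 is the 99th percentile.
    """
    return np.quantile(pooled, 1.0 - np.asarray(pfoe))


def scaling_rate(ctl, tp2, pfoe, d_tdew):
    """Relative change TP2 vs CTL at given PFOE, in % per degree.

    ctl, tp2: arrays of hourly precipitation (any shape, pooled over
    time and space). d_tdew: dew point temperature change in degrees.
    Assumption: linear normalization of the percentage change.
    """
    p_ctl = intensity_at_pfoe(np.ravel(ctl), pfoe)
    p_tp2 = intensity_at_pfoe(np.ravel(tp2), pfoe)
    return 100.0 * (p_tp2 / p_ctl - 1.0) / d_tdew


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ctl = rng.gamma(shape=0.5, scale=3.0, size=(500, 20, 20))  # synthetic data
    tp2 = 1.234 * ctl  # uniform 23.4% intensification, i.e. 2CC for 1.8 degrees
    print(scaling_rate(ctl, tp2, pfoe=[1e-2, 1e-3, 1e-4], d_tdew=1.8))
```

With real data the two experiments differ event by event, so the rate varies with PFOE; the synthetic case above simply recovers the imposed 13% per degree at every PFOE.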
Different areas are selected to pool data from: the full CPRCM simulation domain (skipping the outermost ±250 km in order to avoid spin-up and artificial effects due to the boundaries), and a number of smaller sub-regions covering e.g. the Mediterranean Sea and its direct surroundings (see supplementary information S2 for these analysis regions). We also examined the differences between changes for land and sea/ocean points.
To establish confidence bands on the estimated changes, we applied a bootstrap method. Data sets of the same length are resampled (with replacement) from the time dimension. This was done simultaneously for the control as well as for the pseudo-global warming experiment, so that both resamples contain the same large-scale atmospheric forcing conditions. We resampled on an hourly basis, but also in time blocks of one day in order to avoid over-confident estimates because of temporal correlation. We generated 100 resamples, and computed the 5th-95th percentile range of these resamples to provide information on the uncertainty in our estimates. A test with 1000 bootstrap resamples for the Benelux area yielded very similar results.
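The paired block bootstrap can be sketched as follows. The key point is that the same resampled time indices are applied to both experiments, so each resample compares CTL and TP2 under identical large-scale forcing. This is a minimal sketch, not the code used for the study: the function name, the handling of a trailing incomplete block (dropped), and the `statistic` callback are assumptions.

```python
import numpy as np


def paired_block_bootstrap(ctl, tp2, statistic, block_len=24,
                           n_resamples=100, seed=0):
    """5th-95th percentile range of `statistic` under paired block resampling.

    ctl, tp2: arrays with time as the first axis and identical length.
    statistic: callable (ctl_sample, tp2_sample) -> float.
    block_len: block length in time steps (1 = hourly, 24 = daily blocks).
    Assumption: time steps beyond the last complete block are dropped.
    """
    rng = np.random.default_rng(seed)
    n_blocks = ctl.shape[0] // block_len
    blocks = np.arange(n_blocks * block_len).reshape(n_blocks, block_len)
    results = np.empty(n_resamples)
    for i in range(n_resamples):
        # Draw blocks with replacement; use the SAME indices for both runs.
        chosen = rng.integers(0, n_blocks, size=n_blocks)
        idx = blocks[chosen].ravel()
        results[i] = statistic(ctl[idx], tp2[idx])
    return np.percentile(results, [5, 95])


def change_at_p99(ctl_sample, tp2_sample):
    """Example statistic: relative change of the pooled 99th percentile."""
    return np.quantile(tp2_sample, 0.99) / np.quantile(ctl_sample, 0.99) - 1.0


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    ctl = rng.gamma(shape=0.5, scale=3.0, size=(24 * 30, 50))  # synthetic data
    tp2 = ctl * rng.normal(1.2, 0.1, size=ctl.shape).clip(min=0.0)
    print(paired_block_bootstrap(ctl, tp2, change_at_p99, block_len=24))
    print(paired_block_bootstrap(ctl, tp2, change_at_p99, block_len=1))
```

Daily blocks keep the hours of one day together, which typically widens the range when rain events persist over several hours.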

Results
Figure 1. Maximum of hourly precipitation over the 10 month simulation period for the central domain, for both CTL (left) and TP2 (right). The HCLIM-AROME grid-point data have first been re-gridded to boxes of 5×5 grid-points, approximately 12.5×12.5 km², and the mean (Pmean 5×5) and maximum (Pmax 5×5) of these boxes are computed. The upper (lower) panels show the mean (maximum) precipitation for each box. The full analysis region is given by the area within the solid line.

Before discussing the response of hourly precipitation, we briefly mention the temperature and dew point temperature response. While a 2-degree warming is applied at the lateral boundaries of the domain and at the surface boundary over sea, the actual warming within the domain is slightly smaller. The temperature rise averaged across the domain is 1.8-1.9 degrees, and for dew point temperature the change is close to 1.8 degrees (see supplementary information S3). This 0.2-degree temperature lag compared to the forcing at the boundaries is likely due to enhanced evaporation; for the central domain this averages 4 W m⁻², which, when using a sensitivity of 0.06°C-0.08°C per (W m⁻²), as derived from inter-annual temperature variability in Lenderink et al (2007), would lead to a cooling of 0.2°C-0.3°C. The temperature lag is mostly present during day-time and absent for the minimum temperatures, representing night-time conditions, reinforcing this mechanism. The mean precipitation increases by 9% (4%, land only) in the southern domain and 8% (7%, land only) in the central domain, equivalent to 2%-5% per degree warming (see supplementary information S3). This response is somewhat higher than the 1%-2% per degree response of globally averaged precipitation to warming as dictated by energy constraints (Allen and Ingram 2002, Held and Soden 2006), but smaller than the CC relation. We note that the typical summer drying response in mean precipitation in southern Europe, which is primarily related to atmospheric stability changes and circulation change (Brogli et al 2019), is not captured in our modelling setup.
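The per-degree figures in this paragraph follow from simple arithmetic, redone in the sketch below. It assumes that the mean precipitation changes are normalized by a warming of about 1.8 degrees (the text does not state whether temperature or dew point temperature was used); the dictionary keys are hypothetical labels.

```python
# Assumption: normalization by the domain-mean warming of about 1.8 degrees.
WARMING_DEG = 1.8

# Mean precipitation change in percent, as quoted in the text.
mean_precip_change = {
    "southern, all points": 9.0,
    "southern, land only": 4.0,
    "central, all points": 8.0,
    "central, land only": 7.0,
}

per_degree = {
    region: change / WARMING_DEG
    for region, change in mean_precip_change.items()
}

# Cooling implied by 4 W m^-2 of extra evaporation for a sensitivity
# of 0.06-0.08 degC per (W m^-2), as quoted in the text.
EXTRA_EVAPORATION_WM2 = 4.0
cooling_range = tuple(EXTRA_EVAPORATION_WM2 * s for s in (0.06, 0.08))

if __name__ == "__main__":
    for region, rate in per_degree.items():
        print(f"{region}: {rate:.1f}% per degree")
    print(f"cooling: {cooling_range[0]:.2f}-{cooling_range[1]:.2f} degC")
```

All four values fall within the 2%-5% per degree range stated above, and the implied cooling is consistent with the quoted 0.2°C-0.3°C after rounding.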

Behaviour of time-maximum hourly precipitation
Hourly precipitation shows much higher extreme values over the 10 month periods at the CPRCM grid-scale than when aggregated to the 12-15 km scale representative of RCMs (figures 1 and 2, comparing upper panels with lower panels). Whereas Pmean 5×5 values barely exceed 50 mm h⁻¹ in the control simulations for both domains, the maximum values of Pmax 5×5 reach intensities of 80 mm h⁻¹ and above.
Comparing the TP2 experiment to the CTL experiment, a clear intensification can be observed for both statistics and in both domains (figures 1 and 2, comparing left and right panels). For the central domain (figure 1) this intensification appears to occur rather uniformly over the full domain, whereas for the southern domain (figure 2) most activity and most changes appear to be over and close to the Mediterranean Sea.
To quantify these differences, we then spatially pooled the 10 month hourly precipitation maxima over all grid-boxes. The percentage changes between the TP2 and the CTL experiment now show a distinct difference between Pmax 5×5 and Pmean 5×5 (figure 3). The changes for Pmax 5×5 are substantially higher for both domains and for the major fraction of the domain. Only at the far tail of the distribution (less than 0.5% of the grid-boxes, figure 3, right-hand panels) do they become comparable.

Behaviour of time pooled hourly precipitation
We continue with analysing the response of hourly precipitation in the full data set, by also pooling data over all time steps. For this purpose, the change in hourly precipitation is rescaled with the rise in dew point temperature, thereby yielding a scaling rate. First, we considered grid-boxes over land within the entire analysis region (figure 4). In general, the response is small or even slightly negative for the low intensities (order 1 mm h⁻¹), showing that even in this fixed large-scale circulation and fixed relative humidity setting, the smallest showers get suppressed in a warmer climate. This effect, however, quickly reverses to a positive scaling rate close to or beyond the CC rate for higher precipitation amounts. The analysis produces a less obvious difference between Pmax 5×5 and Pmean 5×5, but there are still some systematic differences. In general, the grid-box mean Pmean 5×5 shows a smaller scaling rate (sensitivity to warming) by 2% per degree compared to the grid-box maximum Pmax 5×5. For both statistics, the sensitivity (when normalized by the mean dew point temperature change of approximately 1.8 degrees) is between the CC rate and 1.5 times CC.
Over sea grid-boxes, the results are different and higher scaling rates are obtained, generally at 1.5 CC for the Pmean 5×5 and 2CC for the Pmax 5×5 (figure 5). For both domains the scaling rate in the very far tail of the distribution of Pmean 5×5 appears to slope upwards. This increase of sensitivity in the tail is also obtained in most of the resamples (>90% for the central domain, and >80% in the southern domain for the daily block resamples; see supplementary information S5 for a more detailed analysis).
The pooled statistics over large areas may be composed from contributions with different behaviour across different areas as, for instance, the difference between land and ocean grid-boxes suggests. To illustrate that this is indeed the case, we now show some results for smaller sub-regions. We focus here on the grid-box maximum precipitation Pmax 5×5 , which shows in general more robust statistics.
The 10 month maxima of hourly precipitation already show a large response in the Mediterranean Sea. Therefore, we selected an area covering the Mediterranean Sea, as well as nearby land (see supplementary information S2 for these areas). We note that these areas differ between the two domains as the southern domain covers a much larger fraction of the Mediterranean Sea and we do not want to neglect that area in the analysis. Despite the difference in analysis domain and the actual months that were simulated, the results are surprisingly similar. For both domains, the sea points clearly show a scaling rate at 2CC or slightly above, as can be seen in figure 6. We note that the intensities corresponding to the values for 'PFOE' (two values plotted at the vertical dashed lines) are also approximately the same.

Figure 3. Spatially pooled distribution from the 10 month maxima of hourly precipitation. The left-hand plots give the distributions of Pmax 5×5 (stippled) and Pmean 5×5 (solid), where blue lines are the CTL and red lines TP2. Plots on the right denote the percentage response between TP2 and CTL (thick line central estimate, thin lines 5-95th range from bootstrapping using daily time blocks), where the black dashed horizontal line denotes the CC based prediction (1.8 degree times 6.5% per degree) and the orange lines denote a 2CC prediction.
The same similarity also applies to land grid-boxes close to the Mediterranean Sea. There, scaling rates are dependent on the fraction of exceedance, and generally slope upward from the CC rate for moderate hourly extremes to 1.5-2 times CC in the more extreme tail of the distribution. We note that the uncertainty margins are much smaller for the land points than for the sea points. This suggests that there are more independent events over land than over sea, which could well relate to bigger and longer lived rain events over the sea; note also the bigger difference between the daily block resampling (light grey band) as compared to the hourly resampling (dark grey band) in figure 6.
We finally turn to two other regions, each of which is only captured in one of the two domains: Spain and Benelux (figure 7). The combined results from both regions provide an interesting insight into the behaviour of precipitation extremes. The land area in Spain is characterized by high temperatures, but also by relatively low values of relative humidity (see supplementary information S3 (figure S6) showing the dew point depression as a measure of relative humidity). Scaling rates for the land area are close to the CC rate. This applies not only to the Pmax 5×5 shown here, but also to the Pmean 5×5. Because relative humidity values are low, convective clouds develop in an environment which is rather hostile to their existence. It is therefore likely that the development of these clouds is moisture limited, which sets the CC rate as the limit on their sensitivity to warming. We also note that the sea grid-boxes surrounding Spain show a 2CC rate, in correspondence with scaling results over the Mediterranean Sea (not shown).

[...] Left-hand plots show the results for Pmean 5×5 and the right-hand plots for Pmax 5×5; upper panels are results for the central domain, and lower panels for the southern domain. The two values plotted at the bottom of the graph are the hourly precipitation amounts in CTL corresponding to the pooled fraction of exceedance of 1 × 10⁻³ and 1 × 10⁻⁵, respectively. The shaded areas are the 5-95th percentile uncertainty range from a bootstrap resampling of the data using resampling of hourly time blocks (darker) and daily time blocks (lighter blue). The black dashed horizontal line denotes the CC rate (6.5% per degree) and the orange lines denote the 2×CC rate.
Turning to a less moisture-constrained environment, scaling rates for the Benelux area progressively decrease with smaller values of the PFOE, in contrast to the behaviour for the Mediterranean land area (figure 7). At relatively low intensities (between 10 and 20 mm h⁻¹) scaling rates are close to the 2CC rate, but a gradual decrease in sensitivity to the CC rate occurs for more extreme hourly precipitation.
While the latter result may appear coincidental, it is obtained in the vast majority of the resamples (>90%). The same behaviour is also obtained for data selected in the southern part of England (not shown). Moreover, there appears to be a correspondence between this behaviour and trends in extreme precipitation observed in the Netherlands. In earlier work we found a 2CC behaviour of 'soft extremes' (hourly precipitation with a typical intensity of 10 mm h⁻¹) in long-term variations in intensity for De Bilt in the centre of the Netherlands (Lenderink and Attema 2015). Yet, other researchers have found smaller trends for rarer extremes, more in correspondence with CC scaling, in Dutch hourly precipitation observations (Zhang et al 2017). Figure 8, which is derived from all hourly precipitation measurements in the Netherlands, shows that the observed trends from both studies actually fit rather well with the scaling rates derived from our experiments. In both the model experiment and the observations, the scaling rate appears to peak for moderately extreme hourly precipitation, with intensities between 10 and 20 mm h⁻¹. For more extreme precipitation, scaling rates appear to decline, while there is an indication in the observations that this trend is reversed for the most extreme hourly precipitation intensities.

Discussion and conclusions
We have conducted two idealized surrogate warming (Schär et al 1996) experiments with a CPRCM, HCLIM-AROME, to investigate the response of extreme hourly precipitation to warmer and moister conditions. Simulations have been performed for a central Europe domain at a 2.5 km grid-spacing and a southern Mediterranean domain at a 3 km grid-spacing. We note that at these grid-spacings we only resolve part of the non-hydrostatic dynamics of convection, which may have affected the lifetime and [...]

Results revealed that there is a clear scale-dependency in changes to hourly extremes moving from the CPRCM grid-point scale to an aggregated scale of 12-15 km that is representative of the common resolution in RCMs. This finding illustrates the added value of the CPRCM simulation, as a discrimination of the response between these two scales cannot be inferred from a conventional RCM simulation. We find that hourly extremes will increase faster at the local scale compared to those at the 12-15 km aggregated scales, in general agreement with Kendon et al (2017). However, our modelling results suggest that this does not necessarily apply to the most extreme events, where changes at the grid-point scale and aggregated scale become comparable. We hypothesize that the latter finding may be related to a scale increase for the most intense convective cells as supported by an analysis of observations for the Netherlands.

Over relatively moist surfaces a super-CC behaviour in the response of hourly precipitation extremes to warming is obtained. In particular, over the Mediterranean Sea response rates are close to the 2CC rate (between 12% and 16% per °C). In land areas close to the Mediterranean Sea the response increases with intensity, and the most extreme hourly precipitation appears to increase at a rate between 1.5 and 2 times CC. These results were obtained in both experiments despite the difference in experimental setup (simulation months, driving model, domain and resolution). This gives us confidence that the results are robust given the warming experimental setup and also that they are not substantially affected by internal variability.
Over the inland areas in Spain a response close to the CC rate is obtained. We think this is related to the dry (low relative humidity) average climate conditions, which imply that atmospheric moisture could limit the development of convective systems. A climate with cooler and moister conditions (medium to higher relative humidity) shows different behaviour. For the Benelux area the response is close to 2CC for moderately extreme hourly precipitation intensities, but appears to fall back to the CC rate for the most extreme hourly precipitation intensities.
Our idealized warming approach obviously does not capture the full complexity of the changes in a warming climate (Kröner et al 2017, Brogli et al 2019). We neglect systematic changes in circulation statistics. While this approach provides good signal-to-noise ratios, circulation changes are likely to affect the extreme precipitation statistics (Chan et al 2015). In relation to this, changes in atmospheric stability (beyond those already captured by a re-adjustment within the domain due to convective processes) and decreases in relative humidity could also substantially affect changes in convective activity in the future climate (Loriaux et al 2013, Keller et al 2018), although CPRCM results for the Benelux showed relatively small impacts of lapse-rate changes in a pseudo warming setting (Attema et al 2014). Finally, a coupled atmosphere-ocean regional modelling system is required to describe atmosphere-ocean feedbacks more realistically (Somot et al 2018). Despite this, we argue that by choosing well-constrained experimental setups, the results of different CPRCMs could be more easily compared, allowing us to make more confident statements on whether they behave similarly and, if not, what could be causing the differences. Our finding that the observed trend in extreme hourly precipitation intensity for the Netherlands shows a reasonable correspondence to the scaling rate derived from the pseudo-global warming experiment also strengthens our conviction that these experiments are relevant in a climate change context.