Convective-Scale Perturbation Growth across the Spectrum of Convective Regimes

Convection-permitting ensembles have led to improved forecasts of many atmospheric phenomena. However, to fully utilize these forecasts the dependence of predictability on synoptic conditions needs to be understood. In this study, convective regimes are diagnosed based on a convective time scale that identifies the degree to which convection is in equilibrium with the large-scale forcing. Six convective cases are examined in a convection-permitting ensemble constructed using the Met Office Unified Model. The ensemble members were generated using small-amplitude buoyancy perturbations added into the boundary layer, which can be considered to represent turbulent fluctuations close to the grid scale. Perturbation growth is shown to occur on different scales with an order of magnitude difference between the regimes [O(1) km for cases closer to nonequilibrium convection and O(10) km for cases closer to equilibrium convection]. This difference reflects the fact that cell locations are essentially random in the equilibrium events after the first 12 h of the forecast, indicating a more rapid upscale perturbation growth compared to the nonequilibrium events. Furthermore, large temporal variability is exhibited in all perturbation growth diagnostics for the nonequilibrium regime. Two boundary condition–driven cases are also considered and show similar characteristics to the nonequilibrium cases, implying that caution is needed to interpret the time scale when initiation is not within the domain. Further understanding of perturbation growth within the different regimes could lead to a better understanding of where ensemble design improvements can be made beyond increasing the model resolution and could improve interpretation of forecasts.


Introduction
Convection-permitting numerical weather prediction (NWP) models have led to improved forecasts of many atmospheric phenomena (e.g., fog and low cloud; convective precipitation; tropical cyclone intensity and tracks; McCabe et al. 2016; Clark et al. 2016; Xue et al. 2013). However, the atmosphere is chaotic and error growth is faster at smaller scales (Lorenz 1969). Therefore, increasing the resolution of an NWP model will result in faster error growth. For example, Hohenegger and Schär (2007a) found an order of magnitude difference between error doubling times when comparing a convection-permitting model (grid length of 2.2 km) with a coarser-resolution, convection-parameterizing model (grid length of 80 km). Rapid error growth implies more limited intrinsic predictability [defined as how predictable a situation is assuming an optimal forecast (Lorenz 1969) with near-perfect initial conditions and perfect boundary conditions] on convective scales (e.g., Hohenegger et al. 2006; Clark et al. 2009, 2010). The predictability of convection-permitting models remains an active area of research (e.g., Melhauser and Zhang 2012; Johnson and Wang 2016), important for both the modeling and forecasting communities. Several studies have shown that the predictability of precipitation depends, in part, upon whether the convection is predominantly controlled by large-scale or local factors (e.g., Done et al. 2006; Keil and Craig 2011; Kühnlein et al. 2014). Important aspects of convective-scale predictability include the timing (which is better captured with increasing resolution; Lean et al. 2008) and spatial positioning of convection.
The spatial variability of precipitation within convection-permitting forecasts has led to issues with their verification. Mittermaier (2014) provides a review of these issues and of appropriate verification techniques. Analyses with scale-dependent techniques such as the fractions skill score (FSS; Roberts and Lean 2008) have shown wide variations in the ability of models to forecast the locations of convective events. For example, a peninsula convergence line in the southwest of the United Kingdom on 3 August 2013 was forecast operationally with a high degree of spatial agreement between ensemble members close to the grid scale (i.e., predictable), whereas a convective event in the east of the United Kingdom on the previous day was poorly forecast with weak spatial agreement between ensemble members (Dey et al. 2016).
The growth and development of small-scale errors in convective-scale forecasts has been considered in various studies. Surcel et al. (2016) considered the locality of perturbation growth and showed that more widespread precipitation was associated with marginally better predictability at synoptic scales, but similar predictability to diurnally forced cases at smaller scales. Studies such as Zhang et al. (2007) and Selz and Craig (2015) have examined upscale error growth and found that an initial phase of rapid exponential error growth at the convective scale is linked to variations in the convective mass flux. Johnson et al. (2014) considered multiscale interactions to show that the growth with the largest energy occurred at wavelengths of around 30-60 km. All of these previous studies indicate a strong association between convection and error growth. However, they did not establish how such growth might depend on the character of the convection.
Convection can be classified as occurring in a spectrum between two main regimes. One regime is convective quasi equilibrium in which the large-scale production of instability is balanced by its release at the convective scale, typical for cases with large-scale synoptic uplift (Arakawa and Schubert 1974). The convection associated with this regime is often in the form of scattered showers and typically has limited spatial organization. The second regime is nonequilibrium convection. This regime occurs when there is a buildup of convective instability facilitated by some inhibiting factor. If this factor can be overcome then the convective instability is released. These types of events are more often associated with more organized forms of convection (Emanuel 1994). To distinguish between the regimes, the convective adjustment time scale $\tau_c$ may be used. This time scale was introduced by Done et al. (2006) and is defined as the ratio between the convective available potential energy (CAPE) and its rate of release at the convective scale (subscript CS):
$$\tau_c = \frac{\mathrm{CAPE}}{\left| d\mathrm{CAPE}/dt \right|_{\mathrm{CS}}}. \tag{1}$$
The rate of release can be estimated based upon the latent heat release from precipitation, leading to
$$\tau_c = \frac{1}{2} \frac{c_p \rho_0 T_0}{L_v g} \frac{\mathrm{CAPE}}{P}, \tag{2}$$
where $c_p$ is the specific heat capacity at constant pressure; $\rho_0$ and $T_0$ are a reference density (1.2 kg m$^{-3}$) and temperature (273.15 K), respectively; $L_v$ is the latent heat due to vaporization; $g$ is the acceleration due to gravity; and $P$ is the precipitation rate (which is best estimated from an accumulation over 1-3 h rather than an instantaneous precipitation rate; Flack et al. 2016). The factor of a half was introduced by Molini et al. (2011) to account for factors such as boundary layer modification, the neglect of which would lead to an overestimation of $\tau_c$. The convective adjustment time scale has been used in this study as an indicator of the convective regime. The threshold between the equilibrium and nonequilibrium regime occurs in the range of 3-12 h (Zimmer et al. 2011). As previously shown (e.g., Done et al. 2006; Keil and Craig 2011; Zimmer et al. 2011; Craig et al. 2012; Flack et al. 2016) the value of $\tau_c$ should be used to indicate the likely nature of the regime, rather than to definitively classify it. The time scale can be a particularly useful indicator at the onset of convection (when the event starts to precipitate), but is likely to reduce in value as convection develops further, particularly in nonequilibrium cases which are often long lived (e.g., Molini et al. 2011). The extent to which $\tau_c$ reduces either in time (e.g., Done et al. 2006) or with distance from the forcing region (e.g., Flack et al. 2016) depends on the event. The time scale is a particularly useful diagnostic if the value is far from the threshold. However, in practice $\tau_c$ can be close to the threshold.
In such situations additional information (such as inspection of synoptic charts) may be necessary to help determine the character of a given event, or else it should be recognized that the event may be intermediate in character.
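As an illustration of how the precipitation-based form of the time scale might be evaluated at a single point, the following is a minimal sketch rather than the code used in the study. The numerical values of the specific heat, latent heat, and gravitational acceleration are standard textbook values assumed here (the text quotes only the reference density and temperature), and the function name is hypothetical.

```python
# Constants: RHO_0 and T_0 are the reference values quoted in the text.
# C_P, L_V and G are ASSUMED standard values, not taken from the paper.
C_P = 1004.0    # specific heat capacity at constant pressure (J kg^-1 K^-1)
RHO_0 = 1.2     # reference density (kg m^-3)
T_0 = 273.15    # reference temperature (K)
L_V = 2.5e6     # latent heat of vaporization (J kg^-1)
G = 9.81        # acceleration due to gravity (m s^-2)


def convective_timescale_hours(cape, precip_mm_per_h):
    """Return the convective adjustment time scale in hours.

    cape            : CAPE in J kg^-1
    precip_mm_per_h : precipitation rate in mm h^-1 (ideally derived from
                      a 1-3 h accumulation rather than an instantaneous rate)
    Returns None where there is no precipitation, since the time scale
    is undefined there.
    """
    if precip_mm_per_h <= 0.0:
        return None
    # 1 mm of rain over 1 m^2 is 1 kg of water, so mm h^-1 -> kg m^-2 s^-1.
    precip_si = precip_mm_per_h / 3600.0
    tau_seconds = 0.5 * (C_P * RHO_0 * T_0) / (L_V * G) * cape / precip_si
    return tau_seconds / 3600.0


if __name__ == "__main__":
    # Example: moderate CAPE with a 1 mm h^-1 rain rate.
    print(convective_timescale_hours(500.0, 1.0))
```

With these assumed constants the time scale in hours is roughly 0.0067 times CAPE (in J kg$^{-1}$) divided by the rain rate (in mm h$^{-1}$), so doubling the precipitation rate at fixed CAPE halves the time scale.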
The convective adjustment time scale has been used for many purposes (e.g., Done et al. 2006; Molini et al. 2011; Done et al. 2012). Climatologies have been produced, based on observations over Germany (Zimmer et al. 2011) and model output over the United Kingdom (Flack et al. 2016). One of its key uses has been to consider the predictability of convection. Done et al. (2006) considered two MCSs over the United Kingdom and found that the total area-averaged precipitation was similar for all ensemble members in the equilibrium case and exhibited more spread for the nonequilibrium case. This regime dependence of precipitation spread was confirmed for other equilibrium and nonequilibrium cases by Keil and Craig (2011). Moreover, Keil et al. (2014) demonstrated that nonequilibrium cases were more sensitive to model physics perturbations compared to equilibrium cases. A similar contrast in the sensitivity was demonstrated for initial condition perturbations by Kühnlein et al. (2014), who further showed a relative insensitivity to variations in lateral boundary conditions. Their results are also consistent with Craig et al. (2012), who suggested that nonequilibrium conditions are more sensitive to initial condition perturbations produced by radar data assimilation: the assimilation has longer-lasting benefits for the forecasts in cases with longer $\tau_c$.
In this study we apply small boundary layer temperature perturbations in a controlled series of experiments to assess the intrinsic predictability of convection in different regimes using a selection of U.K. case studies. The case studies are chosen to cover a spectrum of $\tau_c$ and so sample over the convective regimes. We primarily focus on the magnitude and spatial characteristics of the perturbation growth as a greater understanding of the spatial predictability of convective events in various situations could lead to improved forecasts of flooding from intense rainfall events through improved modeling strategy or interpretation of forecasts. This focus is achieved by testing the following hypotheses: (i) there is faster initial perturbation growth in convective quasi equilibrium compared to nonequilibrium and (ii) because of the association of convection with explicit triggering mechanisms in the nonequilibrium regime (Done et al. 2006), perturbation growth will be relatively localized for nonequilibrium convection but more widespread for events in convective quasi equilibrium.
The rest of this paper is structured as follows: the ensembles and diagnostics are discussed in section 2, the cases considered are outlined in section 3, the perturbation growth characteristics are examined in section 4, and conclusions and a discussion are presented in section 5.

Methodology
Ensembles have been run for six case studies labeled A to F (section 3). The model and control run are described first (section 2a), followed by the perturbation strategy (section 2b) and the diagnostics (section 2c).

a. Model
The Met Office Unified Model (MetUM), version 8.2, has been used in this study. This version was operational in summer 2013 and produced forecasts for all but one of the cases examined. The dynamical core of the MetUM is semi-implicit, semi-Lagrangian, and nonhydrostatic. More details of the dynamical core of the version used in this study are described by Davies et al. (2005). The MetUM has parameterizations for unresolved processes including a microphysics scheme adapted from Wilson and Ballard (1999), the Lock et al. (2000) boundary layer scheme, the Best et al. (2011) surface layer scheme, and the Edwards and Slingo (1996) radiation scheme. No convection parameterization is used for this study. The ensembles use the U.K. Variable resolution (UKV) configuration, which has a horizontal grid length of 1.5 km in the interior domain and so is classed as convection permitting. The variable resolution part of the configuration occurs only toward the edges of the domain, where the grid length ranges from 4 to 1.5 km (Tang et al. 2013). The vertical extent of the model is 40 km, and its 70 levels are stretched such that the resolution is greatest in the boundary layer.
The 36-h simulations performed here are initialized from the Met Office global analysis (grid length 25 km) at 0000 UTC on the day of the event. All simulations have a spinup time of 3 h (estimated from temporal cross correlations of hourly precipitation accumulations between the control and perturbed forecasts of the same ensemble, not shown) associated with the downscaling of coarser-resolution initial conditions. Therefore, the analysis is restricted to the last 33 h of the ensemble forecasts. It is expected that the impact of spinup will be significantly reduced after this time in comparison to the perturbation growth.

b. Perturbation strategy
Perturbations have been applied on a single vertical level to create six-member ensembles for each of the cases. These perturbations are applied within the boundary layer across the entire horizontal domain and are based upon the formulation of Leoncini et al. (2010) and Done et al. (2012):
$$\mathrm{perturbation}(x, y) = A \exp\left[ -\frac{(x - x_0)^2 + (y - y_0)^2}{2\sigma^2} \right], \tag{3}$$
where $A$ is the amplitude of the perturbation, $x$ is the position in the zonal direction, $y$ is the position in the meridional direction, $(x_0, y_0)$ is the central position of the Gaussian distribution, and $\sigma$ is the standard deviation that determines the spatial scale of the perturbations. The amplitude is initially set to random values uniformly distributed between ±1. A superposition of Gaussian distributions is created by centering Gaussian distributions at every grid point in the domain. This result is scaled to an appropriate amplitude for the total perturbation as in Leoncini et al. (2010) and Done et al. (2012). Here the perturbation field is added to potential temperature and scaled for a maximum amplitude of 0.1 K. Such an amplitude is typical of potential temperature variations within the convective boundary layer (e.g., Wyngaard and Coté 1971). Based on the perturbation amplitude experiments in Leoncini et al. (2010) (and sensitivity experiments performed for this study; not shown), increasing the amplitude of the perturbation would increase the initial growth of differences between runs but would not significantly change the saturation level of the differences. The standard deviation used is 9 km, a distance of $6\Delta x$, for which the UKV configuration can be expected to reasonably resolve atmospheric phenomena and orography (e.g., Bierdel et al. 2012; Verrelle et al. 2015). The perturbations are designed to represent variability in turbulent fluxes that cannot be fully resolved by the model (via stochastic forcing), and are hence randomized each time they are applied. 
They are applied once every 15 min, throughout the forecast, corresponding to around half a typical eddy turnover time for a convective boundary layer (Byers and Braham 1948). The perturbations are applied at a model hybrid height of 261.6 m. This height is consistently within the boundary layer throughout the entire forecast on all days considered (not shown), but outside the surface layer to reduce modification by friction and other surface effects.
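The construction of a single perturbation field can be sketched as follows. This is an illustrative reading of the description above (a random amplitude in [-1, 1] at every grid point, a Gaussian of standard deviation 9 km centered on each point, and a final rescaling to a 0.1-K maximum), not the MetUM implementation. The function name, the uniform 1.5-km grid, and the use of plain Python lists are assumptions made for brevity.

```python
import math
import random


def perturbation_field(nx, ny, dx_km=1.5, sigma_km=9.0, max_amp_k=0.1, seed=None):
    """Return an ny-by-nx potential temperature perturbation field (K).

    A Gaussian with a random amplitude in [-1, 1] is centered on every grid
    point; the Gaussians are summed and the result is rescaled so that the
    largest absolute value equals max_amp_k.
    """
    rng = random.Random(seed)
    amps = [[rng.uniform(-1.0, 1.0) for _ in range(nx)] for _ in range(ny)]

    def weight(offset):
        # One-dimensional Gaussian factor for a separation of `offset` points.
        return math.exp(-((offset * dx_km) ** 2) / (2.0 * sigma_km ** 2))

    # The 2D Gaussian is separable, so sum along x first and then along y.
    along_x = [[sum(amps[j][k] * weight(i - k) for k in range(nx))
                for i in range(nx)] for j in range(ny)]
    field = [[sum(along_x[k][i] * weight(j - k) for k in range(ny))
              for i in range(nx)] for j in range(ny)]

    peak = max(abs(v) for row in field for v in row)
    if peak == 0.0:
        return field
    return [[max_amp_k * v / peak for v in row] for row in field]


if __name__ == "__main__":
    field = perturbation_field(nx=12, ny=10, seed=1)
    print(max(abs(v) for row in field for v in row))
```

Because the perturbations are re-randomized at each application, a sketch of the full procedure would call this function with a fresh seed every 15 min of model time and add the result to potential temperature on the chosen model level.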
The perturbation approach is simplistic, and is not designed for use on its own in generating operational ensembles. However, it is sufficient to allow for effective perturbation growth at the convective scale (e.g., Raynaud and Bouttier 2016) and it keeps the synoptic situation indistinguishable from the control. Thus, differences in the intensity and position of convection between ensemble members are solely due to these perturbations. This is a different ensemble generation method to that used for the operational convection-permitting ensemble at the Met Office. The operational ensemble uses downscaled initial and boundary conditions from the global ensemble that modify the synoptic conditions (Bowler et al. 2008, 2009). Recent additions to the operational ensemble include random noise, although this is tiled across the domain rather than continuously varying across the domain as in our experiments. The sensitivity of the results to the perturbation strategy has also been tested for perturbations applied on multiple levels and for spatially correlated potential temperature and specific humidity perturbations. The sensitivity tests result in similar behavior to the results presented here (Flack 2017, chapter 6). The use of multiple-level perturbations, rather than single-level perturbations, had no discernible impact on any of the results because the perturbations are immediately processed by the boundary layer scheme and so spread in the vertical before numerical dissipation in the advection scheme can act to dampen their magnitude. The inclusion of spatially correlated moisture perturbations resulted in marginally faster initial perturbation growth only.

c. Diagnostics
Diagnostics have been considered that take into account both the magnitude and spatial context of the perturbation growth. These are described here.

1) CONVECTIVE ADJUSTMENT TIME SCALE
The convective adjustment time scale, calculated from the control forecast, is used to indicate where the case studies lie on the spectrum between the equilibrium and nonequilibrium regimes. For this study a spatial average across the domain, for only the points where $\tau_c$ is defined, is used alongside hourly averaged $\tau_c$ maps. The method to calculate $\tau_c$ is summarized here; justification and full details are presented in Flack et al. (2016), and sensitivity tests are shown in chapter 3 of Flack (2017). These sensitivity tests implied that using precipitation accumulations resulted in a time scale that was less spatially noisy (implying clearer regime classification) compared to when instantaneous precipitation rates were used.
The method uses a Gaussian kernel, with a half-width of 60 km, to smooth the coarse-grained hourly precipitation accumulations (converted into average precipitation rates) and the CAPE before (2) is evaluated. The half-width is chosen to lie between typical cloud-separation distances and the synoptic scale, and for consistency with other studies (e.g., Keil and Craig 2011). A precipitation threshold of 0.2 mm h$^{-1}$ is applied to the precipitation field after the Gaussian kernel has been applied. The precipitation threshold is chosen to limit stratiform rain but to allow for a meaningful sample of precipitating points in the domain. The hourly model data are used to provide a higher temporal resolution of $\tau_c$ compared to Flack et al. (2016).
Should the spatially averaged $\tau_c$ for an event be clearly distinct from 3 h (i.e., not within 2-4 h) then the regime is classed as being toward the nonequilibrium end of the spectrum (clearly above 3 h) or toward the equilibrium end of the spectrum (clearly below 3 h). This 3-h threshold is chosen based on the climatology over the United Kingdom presented in Flack et al. (2016), which indicated a distinct scale break in the $\tau_c$ spectrum at around 3 h. As discussed in the introduction, caution should be exercised in the use of $\tau_c$ for intermediate values close to the threshold. Synoptic charts are therefore also considered when characterizing the studied events in section 3.
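The final two steps of this diagnostic (averaging the time scale over points where it is defined, then comparing the average with the 3-h threshold) can be sketched as below. The sketch assumes that the Gaussian smoothing has already been applied to the input maps, treats a point as defined when its smoothed precipitation rate is at least 0.2 mm h$^{-1}$ (whether the threshold is inclusive is an assumption), and uses hypothetical function names and labels.

```python
def domain_mean_tau(tau_map, precip_map, precip_threshold=0.2):
    """Average the time scale (h) over points at or above the rain threshold.

    tau_map and precip_map are 2D lists of equal shape holding the time
    scale (h) and the smoothed precipitation rate (mm h^-1).
    Returns None if no point meets the threshold.
    """
    values = [tau
              for tau_row, precip_row in zip(tau_map, precip_map)
              for tau, precip in zip(tau_row, precip_row)
              if precip >= precip_threshold]
    return sum(values) / len(values) if values else None


def classify_regime(mean_tau_h, threshold_h=3.0, margin_h=1.0):
    """Label an event from its domain-mean time scale.

    Values inside threshold_h +/- margin_h (2-4 h by default) are treated
    as intermediate, for which synoptic charts should also be consulted.
    """
    if mean_tau_h is None:
        return "undefined"
    if mean_tau_h > threshold_h + margin_h:
        return "toward nonequilibrium"
    if mean_tau_h < threshold_h - margin_h:
        return "toward equilibrium"
    return "intermediate"


if __name__ == "__main__":
    tau = [[1.0, 9.0], [2.0, 3.0]]
    rain = [[0.5, 0.1], [0.3, 0.2]]
    mean_tau = domain_mean_tau(tau, rain)
    print(mean_tau, classify_regime(mean_tau))
```

The returned label is only an indicator of the likely regime; an intermediate value signals that additional information is needed, not that the event belongs to a third regime.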

2) MEAN SQUARE DIFFERENCE
The mean square difference (MSD) is a simple and effective measure for considering the spread of an ensemble, and has been used for many years at the convective scale (e.g., Hohenegger et al. 2006; Hohenegger and Schär 2007a,b; Clark et al. 2009; Leoncini et al. 2010, 2013; Johnson et al. 2014). It is given by
$$\mathrm{MSD} = \gamma_x \sum (x_p - x_c)^2, \tag{4}$$
where $x_p$ is a variable in the perturbed forecast and $x_c$ is the same variable in the control forecast, and $\gamma_x$ is a normalization factor that depends on the variable considered.
In this study, MSD has been calculated for two variables: the temperature on a model level in the lower free troposphere (on the model level closest to 850 hPa) and hourly accumulations of precipitation exceeding 1 mm (an arbitrary threshold for convective precipitation). When the temperature is being used the normalization factor is simply the reciprocal of the number of grid points in the domain, $N$ (i.e., $\gamma_T = 1/N$).
The MSD is a gridpoint quantity and so is subject to the "double penalty" problem when applied to precipitation at convection-permitting scales. This problem occurs when a forecast is penalized twice for having precipitation in the wrong position: once for forecasting precipitation that is not observed and once for failing to forecast observed precipitation. This can complicate the interpretation of MSD. Here, we wish to use the precipitation MSD as a measure of changes in precipitation rates, and hence it is calculated only from those points where the hourly accumulation exceeds 1 mm in both the perturbed and control forecasts. So that the results are robust to total precipitation, and to enable fair comparisons across the case studies considered, the normalization factor considers the total precipitation from all points in the control forecast that exceed the threshold. Hence, the normalization factor for precipitation, $\gamma_P$, is computed from $P_c$, the hourly precipitation accumulation in the control forecast.
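A minimal sketch of the two MSD variants follows, with fields flattened to one-dimensional lists. The temperature form follows directly from the definition with a normalization of one over the number of grid points. For precipitation, the exact normalization is not reproduced in the text available here, so the sketch assumes it is the reciprocal of the sum of squared control accumulations over the threshold; that choice, and all function names, are assumptions.

```python
def msd(perturbed, control, gamma):
    """Normalized sum of squared differences between two flattened fields."""
    return gamma * sum((p - c) ** 2 for p, c in zip(perturbed, control))


def temperature_msd(temp_perturbed, temp_control):
    """Temperature MSD with normalization 1/N over all N grid points."""
    return msd(temp_perturbed, temp_control, 1.0 / len(temp_control))


def precipitation_msd(precip_perturbed, precip_control, threshold=1.0):
    """Precipitation MSD over points wet in BOTH forecasts.

    ASSUMPTION: the normalization is 1 / sum(P_c^2) over all control points
    exceeding the threshold; the paper's exact form may differ.
    Returns None if the control has no point over the threshold.
    """
    wet_pairs = [(p, c) for p, c in zip(precip_perturbed, precip_control)
                 if p > threshold and c > threshold]
    norm = sum(c ** 2 for c in precip_control if c > threshold)
    if norm == 0.0:
        return None
    perturbed = [p for p, _ in wet_pairs]
    control = [c for _, c in wet_pairs]
    return msd(perturbed, control, 1.0 / norm)


if __name__ == "__main__":
    print(temperature_msd([280.0, 281.0, 282.0], [280.0, 281.0, 284.0]))
    print(precipitation_msd([2.0, 0.5, 3.0, 4.0], [2.0, 2.0, 0.0, 2.0]))
```

Restricting the sum to points that are wet in both forecasts means that displaced cells do not contribute, so this quantity responds to changes in intensity and leaves positional differences to the spatial diagnostics.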

3) FRACTION OF COMMON POINTS
The number of common points $N_{12}$ is defined as the number of points that exceed an hourly precipitation accumulation of 1 mm in two different forecasts for the same event (be it a control-member or member-member comparison). This allows the fraction of common points ($F_{\mathrm{common}}$) to be defined as the ratio of the number of common points to the total number of precipitating points (i.e., the total number of precipitating points in the first forecast $N_1$ plus the total number of precipitating points in the second forecast $N_2$ minus the number of common points between both forecasts, so as to eliminate double counting):
$$F_{\mathrm{common}} = \frac{N_{12}}{N_1 + N_2 - N_{12}}. \tag{6}$$
The term $F_{\mathrm{common}}$ varies between zero and unity, where a value of unity implies two forecasts that are spatially identical and zero implies no common points.
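The definition above translates directly into a few lines of code. The sketch below takes two flattened hourly accumulation fields on the same grid; the function name and the choice to return None when neither forecast precipitates are assumptions.

```python
def fraction_common_points(accum_1, accum_2, threshold=1.0):
    """Fraction of common points between two flattened accumulation fields.

    A point is precipitating if its hourly accumulation exceeds `threshold`
    (mm). Returns None when neither forecast has a precipitating point,
    because the ratio is then undefined.
    """
    wet_1 = {i for i, value in enumerate(accum_1) if value > threshold}
    wet_2 = {i for i, value in enumerate(accum_2) if value > threshold}
    n_12 = len(wet_1 & wet_2)               # points wet in both forecasts
    total = len(wet_1) + len(wet_2) - n_12  # union, no double counting
    return n_12 / total if total else None


if __name__ == "__main__":
    control = [2.0, 2.0, 0.0, 0.0]
    member = [2.0, 0.0, 2.0, 0.0]
    print(fraction_common_points(control, member))
```

The denominator is the size of the union of the two precipitating sets, so the measure is symmetric in the two forecasts and can be used for both control-member and member-member comparisons.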

4) FRACTIONS SKILL SCORE
The FSS was introduced by Roberts and Lean (2008) to combat the double-penalty problem. It is a neighborhood-based technique (Ebert 2008) used for verification and is given by
$$\mathrm{FSS} = 1 - \frac{\sum (f - o)^2}{\sum f^2 + \sum o^2}, \tag{7}$$
where $f$ represents the fraction of points with precipitation over a specified threshold in the forecast (perturbed member in our case) and $o$ represents the fraction of points with precipitation over the same threshold in the observations (control forecast in our case). Here a threshold of hourly precipitation accumulations exceeding 1 mm is applied. The FSS can be adapted to consider ensemble spread by considering the mean over FSS differences between pairs of perturbed ensemble members, as proposed by Dey et al. (2014). This gives rise to the dispersive FSS (dFSS), which can be used as a tool for considering the predictability of convection (e.g., Johnson and Wang 2016). The FSS ranges between zero (forecasts completely different spatially) and unity (forecasts spatially identical). The distinction between a skillful forecast (with respect to either observations or to a different ensemble member) and a less skillful forecast is considered to occur at a value of 0.5. Although it provides information about the spatial structure of perturbation growth, the FSS does not provide information about the perturbation magnitude.
JANUARY 2018 FLACK ET AL.
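To make the neighborhood calculation concrete, the following sketch computes the FSS between two small 2D accumulation fields and a simple dispersive variant taken as the mean FSS over all pairs of members. Square neighborhoods of odd width, zero contribution from points outside the domain, and the pairwise-mean reading of the dFSS are assumptions made for illustration; the function names are hypothetical.

```python
from itertools import combinations


def neighborhood_fractions(binary, width):
    """Fraction of exceeding points in a width-by-width box around each point.

    Points outside the domain are counted as non-exceeding.
    """
    ny, nx = len(binary), len(binary[0])
    half = width // 2
    fractions = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            count = 0
            for jj in range(max(0, j - half), min(ny, j + half + 1)):
                for ii in range(max(0, i - half), min(nx, i + half + 1)):
                    count += binary[jj][ii]
            fractions[j][i] = count / (width * width)
    return fractions


def fss(forecast, reference, threshold=1.0, width=3):
    """Fractions skill score of `forecast` against `reference` (2D lists)."""
    f_bin = [[1 if v > threshold else 0 for v in row] for row in forecast]
    o_bin = [[1 if v > threshold else 0 for v in row] for row in reference]
    f = neighborhood_fractions(f_bin, width)
    o = neighborhood_fractions(o_bin, width)
    pairs = [(a, b) for f_row, o_row in zip(f, o) for a, b in zip(f_row, o_row)]
    numerator = sum((a - b) ** 2 for a, b in pairs)
    denominator = sum(a ** 2 for a, _ in pairs) + sum(b ** 2 for _, b in pairs)
    if denominator == 0.0:
        return None  # neither field exceeds the threshold anywhere
    return 1.0 - numerator / denominator


def dispersive_fss(members, threshold=1.0, width=3):
    """Mean FSS over all distinct pairs of ensemble members (assumed form)."""
    scores = [fss(a, b, threshold, width) for a, b in combinations(members, 2)]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None


def single_cell(j, i, size=5):
    """Hypothetical test field: one 2-mm cell on an otherwise dry grid."""
    grid = [[0.0] * size for _ in range(size)]
    grid[j][i] = 2.0
    return grid


if __name__ == "__main__":
    a, b = single_cell(2, 2), single_cell(2, 3)
    print(fss(a, b, width=1), fss(a, b, width=3))
```

In this example a cell displaced by one grid point scores zero at the grid scale and two-thirds with a three-point neighborhood, which illustrates how the score rewards near misses as the neighborhood widens.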

Case studies
A set of case studies is examined that covers a spectrum of $\tau_c$. This spectrum enables a picture to emerge of the differences between the regimes in real scenarios. Four of the cases (A-D) are presented in order from that closest to convective quasi equilibrium (case A) to that farthest from equilibrium (case D). Cases A-D are classified based on a combination of Figs. 1-4. Figure 1 is the operational surface analysis at 1200 UTC on the day of the convective event, Fig. 2 shows the number of ensemble members that are producing precipitation, Fig. 3 is the evolution of $\tau_c$ throughout the forecast, and Fig. 4 shows maps of $\tau_c$ at 1500 UTC. This time is selected as convective precipitation is well established in all of the forecasts and to indicate the differences in regime classification despite the similarity in the spatially averaged time scale. The other two cases (cases E and F) consider convection initiated outside of the domain, which is another scenario of importance for convective-scale modeling.

a. Case A: 20 April 2012
This case was part of the Dynamical and Microphysical Evolution of Convective Storms (DYMECS) field experiment (Stein et al. 2015) and shows typical conditions for scattered showers in the United Kingdom, which initiated at 1000 UTC. The 1200 UTC synoptic chart (Fig. 1a) shows the situation that was present throughout the entire forecast. There was a low pressure center situated in the northeast of the United Kingdom and several troughs over the country. Furthermore, the United Kingdom was positioned to the left of the [...] Fig. 2a), but have a consistent domain-averaged precipitation throughout the forecast with close agreement between the perturbed members and the control (Fig. 5); this result is in agreement with the equilibrium cases considered by Done et al. (2006, 2012) and Keil and Craig (2011). The hypothesis that this event should be placed near the equilibrium end of the spectrum is supported by $\tau_c$ being consistently below the 3-h threshold throughout the forecast both temporally (Fig. 3) and spatially (Fig. 4a). This case is thus put toward the equilibrium end of the spectrum considered.

FIG. 2. A summary of the ensemble hourly precipitation accumulations greater than 1 mm given by the number of perturbed ensemble members precipitating at that point in the domain (color bar). The mean sea level pressure from the control forecast is also shown (4-hPa contour interval). Each plot is for 1200 UTC and the blue line in (b) represents a distance of 100 km. Each panel refers to the respective case.

b. Case B: 12 August 2013
In this case a surface low was situated over Scandinavia and the Azores high was beginning to build (Fig. 1b), leading to persistent northwesterly flow. An upper-level cold front trailed a weak surface front and there was a trough passing over Scotland that provided large-scale synoptic uplift, suggesting an equilibrium-regime day. The showers associated with this day initiated at 1100 UTC. The average rainfall is approximately constant at around 0.3 mm h$^{-1}$ throughout the forecast (Fig. 5) and the ensemble members place the showers in different positions in the north of the country, with very few showers in the south (Fig. 2b). The time scale is consistently close to or below the threshold throughout the forecast period (Fig. 3) and there are no localized regions of long time scales in the map (Fig. 4b). However, it is more intermediate than case A, and is thus a marginal-equilibrium event.

c. Case C: 23 July 2013
This case was the fifth intensive observation period (IOP 5) of the Convective Precipitation Experiment (COPE; Leon et al. 2016). A low pressure system was centered to the west of the United Kingdom with several fronts ahead of the main center (Fig. 1c), which later decayed. The convection producing the most precipitation on this day, associated with surface water flooding in Nottingham (in central England; Nottingham City Council 2015), was ahead of these fronts and located along a surface trough. There were several convective events forming along the surface trough, with some of them producing intense precipitation (Fig. 5) and all tracking over similar regions. The first convection on this day initiated at 0200 UTC, and there was a further burst of convection later in the day initiating at 1500 UTC. The convective adjustment time scale (Fig. 3) showed initially long values, which later decreased as the event matured, as expected for nonequilibrium events (e.g., Done et al. 2006; Keil and Craig 2011; Flack et al. 2016). However, Fig. 3 indicates that the domain-averaged $\tau_c$, after spinup, is around 2-3 h for the majority of the forecast, suggesting an intermediate event.
Synoptic analyses (Fig. 1c) do not suggest that a region of synoptic-scale forcing exists and Fig. 4c shows localized regions of longer $\tau_c$ (exceeding 3 h). These characteristics are typically associated with nonequilibrium convective events and so this event is classified as a marginal nonequilibrium event.

d. Case D: 2 August 2013
This case was IOP 10 of the COPE field campaign, with convection initiating at 1100 UTC. The synoptic situation (Fig. 1d) shows a low pressure system centered to the west of Scotland, which led to southwesterly winds and a convergence line being set up along the North Cornish coastline (in southwest England). The convective cells that developed on this day were mainly associated with this convergence line. The convective adjustment time scale remains above the 3-h threshold for the majority of the forecast period (Fig. 3) and long $\tau_c$ are found over most of the precipitating domain at the time shown in Fig. 4d. The domain-averaged precipitation (Fig. 5) remains consistent between ensemble members. Because of the synoptic situation implying limited synoptic-scale uplift around the convergence line, consistent long $\tau_c$, and consistent positioning of precipitating cells (Fig. 2d), this case is classified as being toward the nonequilibrium end of the spectrum.

e. Case E: 27 July 2013
This case was IOP 7 of the COPE field campaign. Two MCSs influenced the U.K.'s weather throughout the forecast period. The first MCS was situated over mainland Europe influencing the Netherlands, Belgium, and southeastern parts of the United Kingdom and is associated with the initial smaller $\tau_c$ values in Fig. 3. The second MCS influenced the majority of the United Kingdom. This second MCS entered the model domain from the continent. However, unlike the previous MCS, it traveled north, across the United Kingdom, during the forecast. As this MCS entered the domain it was associated with a long $\tau_c$ which later reduced (being still associated with the same event); later still, as the MCS intensified in the evening of 27 July, $\tau_c$ increased again (Fig. 3). The precipitation associated with the MCS led to flooding in parts of Leicestershire (in central England; Leicestershire County Council 2014). The heaviest precipitation was at approximately 1500 UTC when more stratiform rain was present, and at 0300 UTC the following morning, when the MCS started to return south (Fig. 5). Throughout the day there was persistent light southerly flow (Fig. 1e), with the United Kingdom being located in a region with a weak pressure gradient. This synoptic situation, together with the long $\tau_c$, would imply a classification of this case toward the nonequilibrium end of the spectrum. However, as the MCS has been advected into the domain rather than initiated within it, we instead classify case E as a case driven by the boundary conditions.

f. Case F: 5 August 2013
This case, IOP 12 of the COPE campaign, has been deliberately chosen as a complex situation for considering convective-scale perturbation growth, and as a second case driven by the boundary conditions. For the first 25 h of the forecast a couple of fronts dominate the large-scale situation (Fig. 1f). There is embedded convection associated with the fronts, which led to localized surface water flooding in Cornwall (in southwest England) (Cornwall . There are also showers ahead of the warm front near the Outer Hebrides (to the west of Scotland; Fig. 2f), which dominate the precipitation after the front has cleared the United Kingdom. Figure 2f indicates that the front is consistently positioned in the ensemble members, but the showers are inconsistently positioned. The total precipitation across the ensemble members remains fairly consistent throughout the day after an initial heavy few hours (Fig. 5). Thus, case F represents a transition from a frontal regime to a convective regime driven by an evolving synoptic-scale flow, and most of the convection passes into the domain through the boundary conditions.

Results
The perturbation growth for the spectrum of cases is examined in this section both in terms of its magnitude (section 4a) and spatial characteristics (section 4b).

a. Magnitude of perturbation growth
We consider first whether the perturbation strategy employed induces biases in the perturbed members with respect to the unperturbed control. Figure 5 indicates that while there is some variation between the control forecast (solid lines) and the perturbed members (dashed lines) for a given case, there are no major systematic differences between the forecasts. These computations have also been performed using other precipitation thresholds (0.5 and 2 mm), with consistent results (not shown). To confirm that the perturbed forecasts show no systematic bias with respect to the control, a gamma distribution was fit to the probability density function of hourly accumulations for each run and the shape and scale parameters were compared (not shown). The shape and scale parameters indicate that the control lies within the spread of the perturbed members for both parameters in all cases. This result was further confirmed through the use of a Mann-Whitney U test, which indicates that the control and perturbed members are from a similar distribution at the 5% significance level. Combining the statistical tests with the visual similarity of the precipitation distributions implies that, unlike the experiments of Kober and Craig (2016) for example, none of our perturbed ensembles show any bias with respect to the control. Differences between our study and Kober and Craig (2016) include (but are not limited to) the magnitude of the perturbations and the time variation of the perturbations. Given the lack of bias in our study, it is deemed reasonable to assess member-member comparisons alongside member-control comparisons. Figure 6 shows the MSD for precipitation using control-member and member-member comparisons. There is generally increasing spread in the MSD with time throughout all of the cases considered. The values for MSD are similar to results obtained by Leoncini et al. (2010). Differences are apparent when comparing the evolution of the growth across cases A-D.
Sampling the ensemble members with replacement (10 000 times) to produce the 5% significance level indicates that differences in the magnitude of the MSD, throughout the forecast, between the cases are not statistically significant as there is more variation (noise) within each case than between cases. However, there is a dependence of the MSD on the convective development, as to be expected from Zhang et al. (2003) and Hohenegger et al. (2006). There is a clear difference in the behavior of the growth of MSD. Considering the ensemble plume, cases A and B have an initial rise and then level off over the first 18 h whereas cases C and D are more episodic, with both having at least two short-time-scale peaks during the increase in MSD to its overall peak value. This episodic growth is tied specifically to the intensification stages of the convective events [i.e., as the event strengthens the MSD rises and thus the ensemble spread increases; conversely when the event weakens the MSD falls (or stalls) and thus the ensemble spread decreases]. This difference between cases is also present in member-member comparisons, and when considering different thresholds for precipitation (not shown). It occurs because of the different behavior of convection in the two regimes. In convective quasi equilibrium, convection is continuously being generated to maintain the equilibrium. In contrast, in nonequilibrium there are periods (or places) when relatively little convection is occurring prior to it being ''triggered''; during such periods the growth in MSD will reduce before more rapid growth occurs again when convection initiates or intensifies. This finding is consistent with Leoncini et al. (2010) and Keil and Craig (2011) in which it was indicated that convective-scale perturbation growth is larger during convective initiation. The result is also robust to applying a precipitation threshold (not shown).
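The resampling test mentioned at the start of this paragraph can be illustrated with a minimal sketch. It assumes a percentile bootstrap on the ensemble-mean MSD at a single lead time; the function names, the choice of statistic, and the MSD values are hypothetical and are not taken from the study.

```python
import random
import statistics


def bootstrap_interval(values, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `values` (assumed statistic)."""
    rng = random.Random(seed)
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n ensemble members with replacement and store the resampled mean.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(sample))
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def intervals_overlap(a, b):
    """True if two closed intervals (lo, hi) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical MSD values (one per ensemble member) for two cases.
case_x = [1.9, 2.4, 2.1, 2.8, 2.2, 2.6]
case_y = [2.0, 2.7, 2.3, 2.5, 2.9, 2.1]

ci_x = bootstrap_interval(case_x)
ci_y = bootstrap_interval(case_y)
# Overlapping intervals mean the between-case difference is not
# distinguishable from the within-case variability in this simple test.
print(ci_x, ci_y, intervals_overlap(ci_x, ci_y))
```

In this reading, overlapping intervals correspond to the statement that there is more variation within each case than between cases.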
The perturbation growth is somewhat smoother when considering other variables, such as the 850-hPa temperature (exhibited by steadier increases in the temperature MSD compared to the precipitation MSD, not shown). Nonetheless, the temporal variability makes the concept of saturation difficult to consider in a meaningful way for the MSD diagnostic. A simple aspect of perturbation growth that remains meaningful across the spectrum is the MSD doubling time (the time it takes the MSD after spinup to double). Based on the characteristics of the regimes, it is hypothesized that the initial perturbation growth will be slower in the nonequilibrium events (than in the equilibrium events) prior to the development of strong convection and that, once convection has initiated, there will be greater ensemble variability in MSD doubling times for the nonequilibrium events. Table 1 shows the average MSD doubling time for all cases and the corresponding standard deviations in the MSD doubling time for the ensembles. The MSD doubling time is calculated from fitting a straight line to the MSD of the temperature at 850 hPa starting after spinup, and ending when the growth of the MSD becomes nonlinear, as in Hohenegger and Schär (2007a), using 15-min data. While case A has a shorter MSD doubling time than case D, there is no consistent increase in MSD doubling time from cases A to D; this implies that the MSD doubling times are not only dependent upon the convective regime. The values calculated are considerably shorter than those of Hohenegger and Schär (2007a). This difference is possibly due to the higher resolution of our convection-permitting ensemble (1.5-km grid spacing) compared to theirs (2.2-km grid spacing); although other relevant factors include the different model configurations or differences in the perturbation approaches.
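A minimal sketch of how a doubling time might be obtained from a straight-line fit is given here. It assumes an ordinary least-squares fit over the linear-growth window and defines the doubling time as the time the fitted line takes to rise from its value at the start of the window to twice that value. This definition, the function names, and the sample numbers are illustrative assumptions; the exact procedure of Hohenegger and Schär (2007a) may differ.

```python
def fit_line(times, values):
    """Ordinary least-squares fit: values ~ intercept + slope * time."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    return intercept, slope


def doubling_time(times, msd):
    """Time for the fitted line to double its value at times[0] (assumed definition)."""
    intercept, slope = fit_line(times, msd)
    start_value = intercept + slope * times[0]
    if slope <= 0 or start_value <= 0:
        raise ValueError("fitted MSD must be positive and growing")
    # Solve start_value + slope * dt = 2 * start_value for dt.
    return start_value / slope


# Hypothetical 15-min temperature MSD (K^2) from 3 h (after spinup) to 5 h.
times = [3.0 + 0.25 * k for k in range(9)]
msd = [0.2 + 0.1 * (t - 3.0) for t in times]
print(doubling_time(times, msd))  # about 2 h for this synthetic series
```

Applying `doubling_time` to each member and taking the mean and standard deviation of the results would give quantities of the kind reported in Table 1.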
The MSD doubling times indicate a larger standard deviation (spread) for cases closer to the nonequilibrium end of the spectrum (Table 1). The larger spread in doubling times implies a greater spread in the ensemble (i.e., a spread in times with the same MSD value rather than a spread of MSD values at a specific time). The larger spread toward the nonequilibrium end of the spectrum is also evident in Fig. 5a, but is more evident in Fig. 5b where the standard deviation of the ensemble precipitation indicates greater spread at the nonequilibrium end of the spectrum (cases C and D).
While cases E and F are considered to be more complex, they exhibit similar values of precipitation MSD to the rest of the cases (Fig. 6). Case E shows similar behavior to that exhibited by cases C and D, by showing two short-time-scale peaks during the increase in MSD to its overall peak value at around 18 h, which is somewhat to be expected given the initially long t c . Case F shows modest differences in the precipitation MSD values and spread between the periods dominated by the front and the showers (i.e., there is only a slight increase in the standard deviation for the MSD at this time; not shown). This is in contrast to an MSD computed over all points: here once the front leaves the domain the MSD significantly increases as, when only showers are present, the double-penalty problem occurs (MSD for all points, not shown).
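The contrast between an MSD restricted to common points and an MSD over all points can be made concrete with a small sketch. It assumes that a common point is one where both forecasts reach the precipitation threshold and that the MSD is a plain mean of squared differences with no further normalization; both choices, and the sample fields, are assumptions for illustration only.

```python
import math


def msd(control, member, threshold=1.0, common_only=True):
    """Mean square difference between two flattened precipitation fields.

    With common_only=True, only points where BOTH fields reach the
    threshold contribute, so displaced cells are ignored.
    """
    squared = []
    for c, m in zip(control, member):
        if common_only and not (c >= threshold and m >= threshold):
            continue
        squared.append((m - c) ** 2)
    if not squared:
        return math.nan  # no points to compare
    return sum(squared) / len(squared)


# Hypothetical hourly accumulations (mm): one shower is displaced by a
# grid point, another stays in place with a slightly different intensity.
control = [0.0, 0.0, 5.0, 0.0, 3.0]
member = [0.0, 5.0, 0.0, 0.0, 3.5]

print(msd(control, member, common_only=True))   # intensity difference only
print(msd(control, member, common_only=False))  # includes the double penalty
```

The displaced shower is penalized twice in the all-points version (once where it is missing and once where it appears), which is why that version grows sharply when only showers are present.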

b. Spatial aspects of perturbation growth
While there are differences in the perturbation growth between cases, they are relatively subtle, and are not statistically significant when comparing magnitude. We now consider spatial aspects of the perturbation growth. It is hypothesized, given the range of spatial scales associated with convection in the different regimes, that spatial characteristics of perturbation growth will be dependent upon the regime. This hypothesis is first considered by simple diagnosis of the fraction of common points and then via the use of the FSS and dFSS. When considering F common across the spectrum of cases (Fig. 7) the most notable difference is the localization of the perturbation growth toward the nonequilibrium end of the spectrum, indicated by a larger percentage of points remaining in the same location as in the control forecast at the nonequilibrium end of the spectrum. The cases toward the equilibrium end of the spectrum (cases A and B) show a rapid reduction in F common with forecast lead time. In those cases F common reduces to around 0.20-0.25, which is close to the fraction that would be expected by pure chance, given the number of precipitating points in the control forecast (red line in Fig. 7). On the other hand, the cases toward the nonequilibrium end of the spectrum retain a larger fraction of common points and have a large difference between that fraction and that which would be expected by chance (particularly for case C, which has a fraction of approximately 0.5 common points by the end of the simulation). This agreement in the positioning of convective events that show nonequilibrium characteristics is consistent with Done et al. (2006) and Keil and Craig (2011), and is a result that is statistically significant at the 5% significance level after bootstrapping the ensemble [i.e., there is no overlap of ensembles given the 5% significance level and differences between cases in different regimes (case A vs case D) are far larger than the variability shown by either ensemble].
Case D (Fig. 7d) has the longest time scale for the decay of F common ; however, F common at later lead times becomes closer to that expected by chance than for case C. These results are likely due to there being a large spread of t c values across the domain in case D (Fig. 5d), allowing for some mix of growth characteristics despite the overall predominance of nonequilibrium characteristics. The separation between F common at later lead times and chance is similar in cases B and D, which may be because there is an element of local forcing involved from the orography in the region where the showers are forming. The element of local forcing may improve the spatial predictability for case B, whereas the elements of the equilibrium regime limit the predictability in case D. The results also hold for member-member comparisons.
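The comparison between F common and the chance level can be sketched as follows. The definitions used here are assumptions: F common is taken as the share of a member's precipitating points that also precipitate in the control, and the chance level as the fraction of the domain that precipitates in the control (the expected overlap if the member's points were placed at random). The function names and sample fields are hypothetical.

```python
import math


def wet_mask(field, threshold=1.0):
    """Boolean mask of points that reach the precipitation threshold."""
    return [value >= threshold for value in field]


def fraction_common(control, member, threshold=1.0):
    """Share of the member's wet points that are also wet in the control."""
    c = wet_mask(control, threshold)
    m = wet_mask(member, threshold)
    n_member = sum(m)
    if n_member == 0:
        return math.nan  # member has no precipitating points
    return sum(a and b for a, b in zip(c, m)) / n_member


def chance_fraction(control, threshold=1.0):
    """Expected overlap if the member's wet points were placed at random."""
    c = wet_mask(control, threshold)
    return sum(c) / len(c)


# Hypothetical flattened fields (mm): one cell agrees, one is displaced.
control = [2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
member = [2.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(fraction_common(control, member))  # 0.5
print(chance_fraction(control))          # 0.25
```

A value of F common well above the chance fraction, as in this toy example, corresponds to the retained positional agreement described for the nonequilibrium cases.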
The cases driven by the boundary conditions (cases E and F) show different behavior to each other. Case E shows behavior similar to that of case C in retaining a large fraction of common points. This is due, in part, to the convection that is formed close to the domain boundaries as the MCS enters the domain, as the lateral boundary conditions are the same in all members. However, once the MCS has entered the domain, there must also be some contribution from the nature of the convection itself. The fronts in case F (Fig. 7f) have consistent positioning in the perturbed members for the length of time that they remain in the domain (approximately 25 h). There is a sharp drop in F common at about the time the front leaves the domain, reflecting the change from a frontal to an equilibrium (i.e., scattered showers) regime. As with the MSD, these results for F common are robust to the precipitation threshold used (not shown), thus indicating that the convective regime has an influence on the spatial predictability as in Done et al. (2006, 2012).
The FSS and dFSS results (Fig. 8) indicate the perturbation growth across multiple scales. They allow for consideration of the scale at which two forecasts agree with each other, and hence provide evidence of the scale at which perturbation growth is occurring. For all of the cases there is greater agreement as the neighborhood size increases and the decrease in agreement with lead time occurs more rapidly at the grid scale. These are expected properties of the diagnostic (e.g., Roberts and Lean 2008; Dey et al. 2014).
There is a clear difference in behavior between those cases closer to convective equilibrium and those closer to nonequilibrium. The more equilibrium-like cases, A and B, are no longer ''skillful'' at the grid scale after 13 and 9 h, respectively. In contrast, the more nonequilibrium-like case, C, remains skillful at the grid scale throughout the forecast. As in Fig. 7, case D shows a difference to case C. Case D remains skillful until 20 h (and does not drop far below the skillful threshold, unlike cases A and B). This is likely to be as a result of a mixture of regimes across the domain. These results show that there is strong predictability in the location of precipitation at O(1) km for the nonequilibrium-type situations, but markedly weaker predictability in location of O(10) km for the equilibrium-type situations. This locality of spatial predictability is confirmed to be statistically significant at the 5% significance level from bootstrapping of the ensemble members and no overlap occurs between the different cases in different regimes.
Case E also retains the grid-scale predictability exhibited in case C, which again could be partly due to the MCS entering the domain through the lateral boundaries. Case F (Fig. 8f) illustrates the complexity arising from an evolving synoptic situation. There is strong agreement in the positioning of the front on all scales with high values of FSS, but once the front leaves the domain there is a sharp reduction in the FSS implying much less agreement in the positioning of the showers as the regime becomes closer to convective quasi equilibrium.
As with the previous diagnostics, there is little distinction between member-member and member-control forecast comparisons: the dFSS shows similar results to the FSS and the results are also robust to the precipitation threshold considered (not shown). Taking together Figs. 2, 7, and 8, we find that more organized convection (associated with the nonequilibrium regime) has greater locational predictability (based on position agreement of the organized convection present in cases C-D compared to the unorganized convection in cases A-B) and more localized perturbation growth compared to convective quasiequilibrium cases (i.e., cell agreement is better at smaller scales toward the nonequilibrium end of the spectrum, and therefore the perturbation growth is more local than in equilibrium conditions). Considering also the evolution of the MSD (Fig. 6), we conclude that the perturbations used have an influence on the positioning of precipitation toward the quasi-equilibrium end of the spectrum (and hence details of location should not be trusted by forecasters) and mainly on the magnitude of precipitation toward the nonequilibrium end of the spectrum.
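The scale dependence of the FSS discussed in this section can be illustrated with a minimal sketch of the standard fractions skill score of Roberts and Lean (2008) on a small grid. The square neighborhood, the treatment of points outside the domain as dry, and the sample fields are assumptions made for illustration; they are not the configuration used for Fig. 8.

```python
import math


def neighborhood_fractions(mask, width):
    """Fraction of wet points in a width x width square around each point.

    Points outside the domain are treated as dry (an assumption).
    """
    half = width // 2
    ny, nx = len(mask), len(mask[0])
    fractions = [[0.0] * nx for _ in range(ny)]
    for j in range(ny):
        for i in range(nx):
            total = 0
            for jj in range(j - half, j + half + 1):
                for ii in range(i - half, i + half + 1):
                    if 0 <= jj < ny and 0 <= ii < nx:
                        total += mask[jj][ii]
            fractions[j][i] = total / (width * width)
    return fractions


def fss(field_a, field_b, threshold=1.0, width=1):
    """Fractions skill score between two 2D fields for an odd window width."""
    mask_a = [[1 if v >= threshold else 0 for v in row] for row in field_a]
    mask_b = [[1 if v >= threshold else 0 for v in row] for row in field_b]
    fa = neighborhood_fractions(mask_a, width)
    fb = neighborhood_fractions(mask_b, width)
    num = sum((x - y) ** 2 for ra, rb in zip(fa, fb) for x, y in zip(ra, rb))
    den = sum(x ** 2 for row in fa for x in row) + sum(y ** 2 for row in fb for y in row)
    if den == 0:
        return math.nan  # neither field has precipitation above the threshold
    return 1.0 - num / den


# Hypothetical 5 x 5 fields (mm): a single cell displaced by one grid point.
control = [[0.0] * 5 for _ in range(5)]
member = [[0.0] * 5 for _ in range(5)]
control[2][1] = 2.0
member[2][2] = 2.0

print(fss(control, member, width=1))  # no agreement at the grid scale
print(fss(control, member, width=3))  # exceeds 0.5 with a 3-point neighborhood
```

A cell displaced by one grid point gives no skill at the grid scale but exceeds the 0.5 threshold once the neighborhood is widened, which mirrors the way the equilibrium cases lose grid-scale agreement while retaining agreement at larger scales.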

FIG. 8. The fractions skill score (FSS) between runs for hourly accumulations with a threshold of 1 mm as a function of time, for cases A-F. The black lines represent the FSS at the grid scale, the blue lines represent a neighborhood width of 10.5 km, the purple lines represent a neighborhood width of 31.5 km, and the green lines represent a neighborhood width of 61.5 km. The dashed red line (FSS = 0.5) represents the separation between a skillful forecast with respect to the comparison run and not: those neighborhoods with an FSS greater than 0.5 are considered to have locational predictability, and those with an FSS less than 0.5 are considered to be unpredictable (in terms of location). The paler dashed lines represent member-member comparisons, with the vertical dot-dashed line representing the spinup time and the dot-dot-dot-dash line representing the time the front leaves the domain for case F. All values are plotted at half past the hour.

JANUARY 2018
FLACK ET AL.

Conclusions and discussion
While convection-permitting ensembles have led to a greater understanding of convective-scale predictability, the links with the synoptic-scale environment are still being uncovered. The convective adjustment time scale is one measure for how convection links to the synoptic scale and gives an indication of the convective regime. By using Gaussian perturbations inside the UKV configuration of the MetUM, a convection-permitting ensemble has been generated for a spectrum of convective cases including two cases driven by the boundary conditions. The perturbed members produced similar precipitation distributions to each other in all cases and so the perturbations did not introduce bias. There were limited differences in the magnitude of the perturbation growth (which were not statistically significantly different as diagnosed from the magnitude of the MSD) throughout the spectrum of convective cases considered. However, there were marginally larger ensemble spreads of domain-integrated precipitation for nonequilibrium events compared to the equilibrium events in agreement with Craig (2011), Done et al. (2012), and Keil et al. (2014). One of the reasons for the subtle differences in the magnitude of the perturbation growth between regimes, in our study compared to some previous studies, is that here we consider only the common points between ensemble members and the control in our precipitation MSD diagnostic. This eliminates the impact of the ''double penalty'' problem as our MSD diagnostic measures variability in precipitation intensities only and not differences in location.
Differences in the temperature MSD doubling times between the regimes were also somewhat subtle, the nonequilibrium cases having slower growth than the equilibrium cases. However, the variation in doubling times among ensemble members was somewhat larger in the nonequilibrium regime. This result reflects the generally larger temporal variability for the nonequilibrium cases compared with the equilibrium cases and is consistent with the expectation that convection is fairly continuous in equilibrium conditions and is more sporadic for nonequilibrium conditions, early on in the forecasts. This behavior further demonstrates that the perturbation growth is closely dependent upon the evolution of convection in agreement with Zhang et al. (2003), Hohenegger et al. (2006), and Selz and Craig (2015).
While there are some subtle differences when considering the predictability of intensity between ensemble members, the more striking (and statistically significant) differences emerge when considering spatial aspects of the perturbation growth. Toward the equilibrium end of the spectrum, the small boundary layer perturbations are sufficient to displace the locations of the convective cells (even when there is an element of localized forcing, as in case B), to an extent that approaches a random relocation of the cells by the end of the forecast. This gives rise to perturbation growth at scales on the order of the cloud spacing, here O(10) km. Toward the nonequilibrium end of the spectrum, the perturbations are much less effective at displacing cells, but may perturb the development of the cells. Hence, the perturbation growth is more localized to scales on the order of the cell size, here O(1) km. These results were particularly apparent from consideration of the FSS and dFSS and have implications for forecaster interpretations of convection-permitting simulations such as the locations of warnings of flooding from intense rainfall events. The regime difference may be due to distinct triggering mechanisms being necessary and identifiable in models in nonequilibrium cases, such as localized uplift associated with convergence lines or orography (Keil et al. 2014). The perturbation growth for case D presented less localization than might have been anticipated given its large spatial-mean t c . However, the case does have a relatively large spatial variation of the t c , suggesting a spatially mixed regime.
All of the results were robust to varying the precipitation threshold. Furthermore, the conclusions were tested against variations of the perturbation strategy including perturbations across multiple vertical levels and applying spatially correlated specific humidity and temperature perturbations. The impact of the different perturbation strategies was negligible and resulted in the same conclusions as presented here [further details in chapter 6 of Flack (2017)].
Two complex cases were also considered that were primarily driven by the boundary conditions, cases E and F. Case E showed an initially large t c , but as the initiation of the event was not within the domain it could not be cleanly classified into a regime. The overall characteristics of the event show strong agreement in position and localization of perturbation growth. This is consistent with the characteristics of a nonequilibrium event, although the results are affected by the use of identical boundary conditions for all ensemble members. The second case was a frontal case (case F) and was used to determine if the simple convective regime classification remains useful in more complex, spatially and temporally varying cases. Specifically, the presence of a front dominated the precipitation pattern for the first 25 h of the forecast and showers behind the front dominated the final 11 h. This case highlights that the simple regime classification using t c may not provide sufficient information on the convection embedded within the front because the large-scale characteristics of the front dominate the perturbation growth. However, the simple regime concept became useful once the front had left the domain, since perturbation growth within the postfrontal convection (which initiated inside the domain) was consistent with that of the equilibrium cases considered.
While differences in convective-scale perturbation growth are not fully described by t c , various aspects of the spatial variability can be partially described in terms of t c . The relationship of convective-scale perturbation growth with convective regime, particularly from the perspective of spatial structure, suggests that different strategies may be preferable for prediction in the two regimes. Large-member ensembles may be more valuable for forecasting events in convective quasi equilibrium because of the larger uncertainties in spatial location. The larger-member ensemble will allow for more variability in position as there is little influence on the magnitude of the total area-averaged precipitation (e.g., Done et al. 2006, 2012). On the other hand, higher-resolution forecasts may be more valuable for nonequilibrium events due to their high spatial predictability, with agreement in location being retained at the kilometer scale despite boundary layer perturbations.