Convective cloud size distributions in idealized cloud resolving model simulations


1 Introduction

obeys: where ∝ should be read "scales as", and b is a constant characterizing the power-law. Since then, power-laws (or derived forms thereof like power-laws with an exponential cutoff) have been universally recognized as the best functions to model cloud size

If we first consider the parameterization issue, knowing the precise distribution of cumulus clouds may help constrain spectral schemes (Arakawa & Schubert, 1974) for which the subgrid variability associated with convection can be described using a size-resolved cloud population. By explicitly introducing information on cloud sizes, such schemes can easily be made scale-aware (Neggers, 2015) and therefore provide interesting solutions to parameterizing convection in the "grey zone" (that is at model

identify and analyze systematic behaviors (organization) in the cloud ensembles.

The paper is organized as follows. In section 2, we give details on the numerical

Weibull and power-law distributions with an exponential cutoff (cutoff power-laws).

These can be written: where $L$ is the cloud size, $p_X$ are the theoretical distribution functions and $C_X$ are appropriate normalizing factors. α, λ, β, η, µ and ν are the parameters characterizing each distribution that we seek to estimate. Log-normal distributions have also been considered as possible alternative distributions, but they generally yielded worse fits than any other function and were therefore ruled out as viable choices.
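As an illustration of how such candidate functions and their normalizing factors can be evaluated over a finite size range, here is a minimal Python sketch. The four functional shapes below are assumptions (common textbook forms of the power-law, exponential, Weibull and cutoff power-law); the pairing of α, λ, β, η, µ and ν with each shape and the helper names are hypothetical and are not taken from the paper.

```python
import numpy as np
from scipy.integrate import quad

# ASSUMED unnormalized shapes (textbook forms, not the paper's own equations).
def shape_power_law(L, alpha):
    return L ** (-alpha)

def shape_exponential(L, lam):
    return np.exp(-lam * L)

def shape_weibull(L, beta, eta):
    return L ** (beta - 1.0) * np.exp(-eta * L ** beta)

def shape_cutoff_power_law(L, mu, nu):
    return L ** (-mu) * np.exp(-nu * L)

def normalizing_factor(shape, params, L_min, L_max):
    """Return C such that C * shape integrates to 1 over [L_min, L_max]."""
    integral, _ = quad(shape, L_min, L_max, args=tuple(params))
    return 1.0 / integral

def pdf(shape, params, L, L_min, L_max):
    """Normalized density on the finite range [L_min, L_max]."""
    C = normalizing_factor(shape, params, L_min, L_max)
    return C * shape(np.asarray(L, dtype=float), *params)
```

Normalizing numerically keeps the sketch independent of the closed-form expressions, at the cost of one quadrature per evaluation.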

The corresponding complementary cumulative distribution functions (CCDFs) are defined from $p_X(L)$ as:

$$P_X(L) = \int_{L}^{L_{max}} p_X(L')\,dL'$$

where $L$ varies from $L_{min}$ to $L_{max}$ (the size bounds over which the fits are performed), and $P_X(L)$ represents the probability that a given cloud has a size larger than $L$. This yields, for the four tested distributions: with Γ the upper incomplete gamma function defined by:

$$\Gamma(a, x) = \int_{x}^{\infty} t^{a-1} e^{-t}\,dt$$

Theoretical distributions are here defined over a finite size range $L_{min} - L_{max}$ fol-

The procedure described above makes use of several statistical methods briefly summarized below, starting with MLE. Let us assume a set of empirical data $x = (x_{min}, ..., x_{max})$ that we wish to approximate by a known distribution $p_{X|\Theta}$ described by $N$ parameters $\Theta = (\theta_1, ..., \theta_N)$. Defining the log-likelihood function as:

$$\ell(\Theta, x) = \sum_{x_i \in x} \ln p_{X|\Theta}(x_i)$$

the set of parameters $\hat{\Theta}$ yielding the best fit is the one that maximizes $\ell(\Theta, x)$. For

$\hat{\Theta}$ cannot be reduced to a simple analytical formula. In this situation, $\hat{\Theta}$ is obtained from the numerical maximization of the log-likelihood function $\ell$.
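The numerical maximization of the log-likelihood can be sketched as follows for the simplest case, a power-law truncated to a finite range. This is only an illustrative sketch: the density $p(x) = C x^{-\alpha}$ on $[x_{min}, x_{max}]$, the search bounds, the function names and the synthetic sample are all assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def log_likelihood_power_law(alpha, x, x_min, x_max):
    """Log-likelihood of p(x) = C x^(-alpha) truncated to [x_min, x_max] (assumed form)."""
    a = 1.0 - alpha
    if abs(a) < 1e-12:
        norm = np.log(x_max / x_min)           # limit alpha -> 1
    else:
        norm = (x_max ** a - x_min ** a) / a   # integral of x^(-alpha) over the range
    return -x.size * np.log(norm) - alpha * np.sum(np.log(x))

def fit_power_law(x, x_min, x_max, bounds=(0.01, 10.0)):
    """Return (alpha_hat, maximized log-likelihood); the bounds are an arbitrary choice."""
    x = np.asarray(x, dtype=float)
    x = x[(x >= x_min) & (x <= x_max)]
    result = minimize_scalar(
        lambda alpha: -log_likelihood_power_law(alpha, x, x_min, x_max),
        bounds=bounds, method="bounded",
    )
    return result.x, -result.fun

def sample_truncated_power_law(alpha, x_min, x_max, size, rng):
    """Inverse-transform sampling from the truncated power-law (alpha != 1)."""
    u = rng.random(size)
    a = 1.0 - alpha
    return (x_min ** a + u * (x_max ** a - x_min ** a)) ** (1.0 / a)

# Synthetic check: recover a known exponent from random draws.
rng = np.random.default_rng(0)
sizes = sample_truncated_power_law(2.5, 1.0, 100.0, 20000, rng)
alpha_hat, max_log_likelihood = fit_power_law(sizes, 1.0, 100.0)
```

For distributions with more than one parameter, the same idea applies with a multivariate optimizer in place of the scalar one.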

Once optimal parameters $\hat{\Theta}$ have been found, the KS statistic $D$ is computed to give an estimate of how good the fit is. $D$ is simply defined as the maximum absolute distance between the empirical and theoretical CCDFs between $x_{min}$ and $x_{max}$:

$$D = \max_{x_{min} \le x \le x_{max}} \left| P_e(x) - P_{X|\hat{\Theta}}(x) \right|$$

where $P_e$ and $P_{X|\hat{\Theta}}$ denote the empirical and theoretical cumulative distributions. Finally, to compare power-law distributions to other plausible hypotheses, an alternative goodness-of-fit test is employed, the LR test. More generally, best fits obtained for any distribution $p_X$ can be compared to any other alternative hypothesis by means of the LR test. For any best fit distribution $p_{X|\hat{\Theta}}$ determined over the range $x_{min} - x_{max}$, and any alternative distribution $p_{X|\hat{\Phi}}$ described by a set of parameters $\hat{\Phi}$ fitted over the same $x$ range, the log-likelihood ratio is defined by:

$$LR = \ln \frac{\mathcal{L}(\hat{\Phi}, x)}{\mathcal{L}(\hat{\Theta}, x)}$$

where $\mathcal{L}$ is the standard likelihood function. Because larger likelihood values indicate a higher probability for the data to be drawn from the hypothesized distribution, a negative likelihood ratio LR < 0 means that the null hypothesis is more likely than the alternative hypothesis. In contrast, a positive LR value means that the alternative hypothesis is more likely. In practice, LR must be sufficiently negative (respectively positive) for the null hypothesis (respectively the alternative hypothesis) to be unambiguously identified as the better hypothesis.
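The two diagnostics can be sketched in a few lines of Python. The sign convention follows the text (LR < 0 favors the reference, or null, distribution). Everything else is an assumption made for illustration: the function names, the two truncated candidates with fixed rather than fitted parameters, and the synthetic sample.

```python
import numpy as np

def ks_distance(x, ccdf):
    """Maximum absolute distance between empirical and theoretical distributions."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    cdf_theory = 1.0 - ccdf(x)
    cdf_below = np.arange(0, n) / n      # empirical CDF just below each sample
    cdf_above = np.arange(1, n + 1) / n  # empirical CDF at each sample
    return max(np.max(np.abs(cdf_theory - cdf_below)),
               np.max(np.abs(cdf_theory - cdf_above)))

def log_likelihood_ratio(x, pdf_reference, pdf_alternative):
    """LR = ln L(alternative) - ln L(reference); negative favors the reference."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.log(pdf_alternative(x))) - np.sum(np.log(pdf_reference(x)))

# Two candidates truncated to [a, b], with fixed, purely illustrative parameters.
a, b, lam, alpha = 0.5, 5.0, 1.0, 2.0
exp_norm = np.exp(-lam * a) - np.exp(-lam * b)
pl_norm = (b ** (1 - alpha) - a ** (1 - alpha)) / (1 - alpha)

def exp_pdf(x):
    return lam * np.exp(-lam * x) / exp_norm

def exp_ccdf(x):
    return (np.exp(-lam * x) - np.exp(-lam * b)) / exp_norm

def pl_pdf(x):
    return x ** (-alpha) / pl_norm

def pl_ccdf(x):
    return (x ** (1 - alpha) - b ** (1 - alpha)) / (a ** (1 - alpha) - b ** (1 - alpha))

# Synthetic sample drawn from the truncated exponential by inverse transform.
rng = np.random.default_rng(0)
u = rng.random(5000)
sample = -np.log(np.exp(-lam * b) + u * exp_norm) / lam

d_exp = ks_distance(sample, exp_ccdf)
d_pl = ks_distance(sample, pl_ccdf)
lr = log_likelihood_ratio(sample, exp_pdf, pl_pdf)  # reference: exponential
```

Because the sample is drawn from the exponential, its KS distance should be the smaller of the two and the ratio should come out negative when the exponential is taken as the reference.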

Considering the RICO simulation, exponential distributions generally produce best fits valid over a broader range than power-laws for clouds computed from q. That

Comparison to linear regression
To evaluate biases introduced when fitting power-laws using linear regression in log-log coordinates, power-law exponents were recomputed with this technique for all empirical CCSDs analyzed previously. The results are compared to our best fit estimates in Figure 5. Linear regression was here applied over size ranges determined visually from the distributions plotted in Figures 1 and 2
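The kind of comparison described here can be illustrated on synthetic data. The sketch below fits a straight line to a log-binned density in log-log coordinates and compares the resulting exponent with a maximum likelihood estimate. It is not the paper's procedure: the binning, the automatic use of the full sample range (instead of a visually chosen one), the function names and the synthetic truncated power-law sample are all assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def exponent_from_regression(x, n_bins=20):
    """Exponent from a least-squares line through the log-binned density."""
    x = np.asarray(x, dtype=float)
    edges = np.logspace(np.log10(x.min()), np.log10(x.max()), n_bins + 1)
    counts, _ = np.histogram(x, bins=edges)
    density = counts / (np.diff(edges) * x.size)
    centers = np.sqrt(edges[:-1] * edges[1:])   # geometric bin centers
    keep = counts > 0                            # empty bins have no logarithm
    slope, _ = np.polyfit(np.log10(centers[keep]), np.log10(density[keep]), 1)
    return -slope

def exponent_from_mle(x, x_min, x_max):
    """Exponent maximizing the likelihood of a power-law truncated to [x_min, x_max]."""
    x = np.asarray(x, dtype=float)
    sum_log = np.sum(np.log(x))

    def negative_log_likelihood(alpha):
        a = 1.0 - alpha
        norm = (x_max ** a - x_min ** a) / a
        return x.size * np.log(norm) + alpha * sum_log

    # Search restricted to alpha > 1 (an arbitrary choice that avoids a = 0).
    result = minimize_scalar(negative_log_likelihood, bounds=(1.01, 10.0), method="bounded")
    return result.x

# Synthetic sample with a known exponent, drawn by inverse transform.
true_alpha, x_min, x_max = 2.5, 1.0, 100.0
rng = np.random.default_rng(1)
a = 1.0 - true_alpha
sizes = (x_min ** a + rng.random(20000) * (x_max ** a - x_min ** a)) ** (1.0 / a)

alpha_regression = exponent_from_regression(sizes)
alpha_mle = exponent_from_mle(sizes, x_min, x_max)
```

On real cloud populations the regression estimate also depends on the chosen bins and size range, which is one reason the two estimates can differ.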

A distinctive property of the Weibull and cutoff power-law distributions is that they are expressed as the product of a power-law and an exponential. This is reminiscent of the characteristic distributions found for sub-critical percolating (Stauffer & Aharony, 1994) and SOC (Bak et al., 1988; Jensen, 1998) systems:

$$p(s) \propto s^{-\tau}\, G(s/s_0)$$

with $s$ a characteristic size, $s_0$ a correlation length, τ a critical exponent and $G$ a scaling function. In a system of finite size, the correlation length is related to the system size $L$ via $s_0 \propto L^d$, with $d$ a critical exponent. Note that the pure power-law behavior is only recovered asymptotically as the correlation length diverges. For $s_0$ sufficiently small, finite-size scaling will affect the power-law pre-factor and critical exponents.

2. When one of these thermals becomes supersaturated, that is when the system's critical threshold is exceeded, a cloud forms, that is an avalanche is triggered.

of-fit test is employed to find the optimal size range over which these fits hold. In addition, the described algorithm also permits direct comparisons between best fits ob-

where water vapor plays the role of the control parameter, and clouds correspond to avalanches triggered when water vapor locally exceeds a critical threshold (saturation).

Note however that as attractive as the concept may be, a more careful evaluation

The corresponding best fit parameters are collected in Table 6.

and CWP (bottom row) thresholds in the LBA simulation. The LR test is applied to all possible combinations of reference and alternative distributions. In each subplot, the reference distribution is read on the horizontal axis, and it is tested against an alternative distribution read on the vertical axis. The color code should be understood as follows: negative values in blue indicate that the reference distribution provides a better fit than the alternative distribution over its optimal size range; positive values in red indicate that the alternative distribution is a better fit than the reference one. Grey cells indicate that no acceptable fit could be found with the alternative distribution.