W0_sample = np.random.normal(0,1)?

In this note we explore the distribution of vacuum expectation values of the superpotential $W_0$ in explicit Type IIB flux compactifications. We show that the distribution can be approximated, universally across geometries, by a two-dimensional Gaussian with a model-dependent standard deviation. We identify this behaviour in 20 Calabi-Yau orientifold compactifications with between two and five complex structure moduli by constructing a total of $\mathcal{O}(10^7)$ flux vacua. We observe a characteristic scaling of the width $\sigma$ of our distributions with the D3-charge contribution $N_{\text{flux}}$ from fluxes, which can be approximated by $\sigma \sim \sqrt{N_{\text{flux}}}$. This $W_0$ distribution implies that locating small values of $|W_0|$, a preferred regime associated with classes of string theory solutions typically featuring hierarchies, reduces to the basic problem of finding small Euclidean norms of normally distributed values. We also identify small modifications to this Gaussian behaviour in our samples, which might indicate a breakdown of the continuous flux approximation commonly used in statistical analyses of the flux landscape.


Introduction
The distributions of observables arising in effective field theories (EFTs) from string theory hold vital information about the attainable low-energy physics within string theory. This includes insights into the genericity of low-energy phenomena, as well as into the features distinguishing string theory constructions from standard bottom-up EFT models. On a higher level, such distributions can be linked with mechanisms that select vacua resembling the physics we see around us.
Flux compactifications of Type IIB string theory provide an ideal arena to study the distribution of such low-energy properties. Here, one investigates the four-dimensional scalar potential for complex structure moduli and the axio-dilaton induced by 3-form fluxes [1] (see [2,3,4] for reviews). In particular, we are interested in understanding minima satisfying the $F$-flatness conditions $D_I W = 0$ in the large complex structure regime where moduli potentials are under computational control [5,6]. Recent numerical improvements [7] enable us to get sizable samples of such solutions across a variety of geometries at relatively small computational cost. In this setting, we look at the distribution of the flux superpotential re-scaled by the Kähler potential, $W_0 \propto \langle e^{K/2}\, W \rangle$ (Eq. (1), up to a convention-dependent normalisation), which is tightly related to the gravitino mass.
A priori, it is unclear how $W_0$ is distributed, as it is only determined upon solving a system of polynomial equations with an infinite sum of exponential corrections. In the case of large complex structure limits, the coefficients of these equations are given by topological quantities arising from the respective mirror geometries [5,6]; see also [9] for recent advances. Ultimately, we would like to address the question of whether these types of structures of string theory leave imprints on the distributions of phenomenological observables.
In this context, it has been argued that many distributions depend only on a "few" UV parameters [10,11,12,13], such as the orientifold tadpole, as will be partially confirmed within this note. In particular, it has previously been argued that $W_0$ is distributed as a Gaussian with a tadpole-dependent width. It is important to stress, though, that the aforementioned analyses have mostly been performed for hypothetical geometries in the continuous flux approximation. We instead use geometries with explicit $\mathbb{Z}_2$-involutions for the orientifold and construct vacua by solving the $F$-term conditions for quantised fluxes. This yields sizable samples of flux vacua from first principles, enabling a statistical analysis of string data.
In this quest, we find that, as a first approximation, the real and imaginary parts of $W_0$ indeed appear to be Gaussian distributed and that deviations arise only at sub-leading order. Our evidence is based on 20 background geometries for which we generated reasonably sized samples (see Table 1 for details). In this dataset, we find universal behaviour across the various geometries. From the physics perspective, this means that the $W_0$ distributions arising from string theory, at least in our current samples, are indeed simple and almost universal. Furthermore, these observations allow for projections of the sample size needed to generate solutions with small $|W_0|$, which is an essential ingredient for the hierarchical scale separation in the KKLT scenario [14]. Assuming that $W_0$ is normally distributed, the standard deviation of this distribution sets the expected hierarchical suppression.
The rest of this note is organised as follows: in Section 2 we describe our sample of geometries and algorithmic choices for fluxes. We present our numerical results in Section 3 and conclude in Section 4.

Geometry and flux sample
To produce sufficiently large samples of flux vacua, we pick 20 CY orientifold models with $h^{1,2} = 2, 3, 4, 5$, as summarised in Tab. 1, for each of which we construct at least $\mathcal{O}(10^5)$ vacua. We compute the tadpole values $Q_{D3} = h^{1,1} + h^{1,2} + 2$ from the corresponding orientifold configurations with $h^{1,2}_{+} = 0$ following the algorithm of [15]. The D7-tadpole is cancelled locally by putting D7-branes on top of the O7-planes. Using CYTools [16,9] we calculated the prepotential including the instanton series up to degree 10. We follow the conventions described in [7], to which we refer for more details.
We search for solutions to the $F$-term conditions $D_I W = 0$ for all complex structure moduli $Z^i$, $i = 1, \ldots, h^{1,2}$, and the axio-dilaton $\tau$. To solve this optimisation problem, we generate fluxes and starting points for the minimisation using our ISD+ sampling technique described in [7], restricting to choices with $N_{\text{flux}} \leq Q_{D3}$. Specifically, we sample half of the flux vector from the range $[-5, 5]$ and our starting points for the moduli within the (mirror) Kähler cone subject to the cutoff conditions of Eq. (2). On the resulting vacuum expectation values (VEVs) for the moduli, we subsequently evaluate EFT quantities of interest such as the flux superpotential $\langle W \rangle$. The latter by itself is, however, not necessarily meaningful because it changes under Kähler transformations. To avoid this, we re-scale the superpotential with the Kähler potential and define $W_0$ as in Eq. (1) (cf. [12]). Below, we report the $W_0$ distribution and examine its general behaviour across geometries. Subsequently, we focus on the gauge-invariant absolute value $|W_0|$. We leave a discussion of the physics of the phase of $W_0$ for the future (see for instance [17] for a discussion of phases in the context of soft supersymmetry-breaking terms).

Numerical results
We now turn to a discussion of our empirical findings. We show the respective distributions of the re-scaled superpotential in Figure 1. As previously advertised, we find a universal behaviour across the different geometries with varying number of moduli. The distribution looks remarkably similar to that of a two-dimensional Gaussian, which we discuss shortly. It is also striking that all of these solutions fall in a comparable range of values, indicating rather similar standard deviations for the Gaussians.
We note that some of the geometries show a feature close to $\mathrm{Im}(W_0) = 0$, which corresponds to values where a large re-scaling with the Kähler potential occurs. On further inspection, we found no inconsistencies in these solutions, which is why we attribute these structures to sampling and algorithmic biases at this stage.
The approximately Gaussian behaviour of $W_0$ was predicted by the findings of [13], where it was argued that, at least under simplifying assumptions including the continuous flux approximation, $W_0$ should follow a Gaussian distribution peaked around zero with the standard deviation $\sigma$ being proportional to the square root of the tadpole $Q_{D3}$.
In fact, we observe such a scaling of the width in our distributions, i.e., we find that it scales with $N_{\text{flux}}/Q_{D3}$. This is shown in Fig. 2, where the colours highlight samples with different ratios $N_{\text{flux}}/Q_{D3}$. To make this quantitative, the last row in Fig. 2 shows $\sigma$ as a function of $N_{\text{flux}}/Q_{D3}$ for the various models. We clearly see a universal trend for the slope across the different examples. Empirically, we find that $\sigma \sim \sqrt{N_{\text{flux}}}$ is an excellent approximation for most of our models, in agreement with [13].
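As an illustration of how such a scaling exponent could be extracted, the following minimal sketch bins a sample in $N_{\text{flux}}/Q_{D3}$, estimates the width per bin and fits a power law. It is not the pipeline used for Fig. 2: the data are synthetic stand-ins drawn with an assumed square-root law, and the names `width_per_bin` and `sigma_max`, the bin edges and the minimum bin occupancy are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def width_per_bin(w0, ratio, edges, min_count=100):
    """Estimate the Gaussian width of complex w0 in bins of ratio = N_flux / Q_D3."""
    centres, sigmas = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (ratio >= lo) & (ratio < hi)
        if sel.sum() < min_count:
            continue  # skip sparsely populated bins
        # For an isotropic two-dimensional Gaussian, E[|w0|^2] = 2 sigma^2.
        sigmas.append(np.sqrt(np.mean(np.abs(w0[sel]) ** 2) / 2))
        centres.append(np.mean(ratio[sel]))
    return np.array(centres), np.array(sigmas)

# Synthetic stand-in data (assumption): sigma = sigma_max * sqrt(ratio).
n = 200_000
sigma_max = 1.5
ratio = rng.uniform(0.05, 1.0, size=n)
w0 = sigma_max * np.sqrt(ratio) * (rng.normal(size=n) + 1j * rng.normal(size=n))

edges = np.linspace(0.05, 1.0, 11)
x, sigma = width_per_bin(w0, ratio, edges)

# Slope of log(sigma) against log(ratio); 0.5 corresponds to square-root scaling.
slope, intercept = np.polyfit(np.log(x), np.log(sigma), 1)
print(f"fitted exponent: {slope:.3f}")
```

On real data, `w0` and `ratio` would be replaced by the measured $W_0$ values and the corresponding $N_{\text{flux}}/Q_{D3}$ of each vacuum; a fitted exponent close to 0.5 would then support the square-root behaviour.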
Next, let us focus on the absolute value $|W_0|$, which is related to the value of the gravitino mass. Assuming a Gaussian distribution for the complex value, the resulting distribution for the absolute value is shown in the top right of Fig. 3. At an elementary level, the observed minimal value of $|W_0|$ decreases with the sample size $N$, scaling as $N^{-1/2}$ for a two-dimensional Gaussian. For illustrative purposes we show the respective dependence for various standard deviations. The minimal values we identify in our samples are highlighted in Fig. 3 and they deviate only slightly from the Gaussian expectation. We also observe no noticeable trend with the number of moduli or the tadpole in these minimal values at this level. We hence expect that simply generating more samples will lead to solutions with larger hierarchical suppression. Having said that, an order of magnitude of suppression requires roughly two orders of magnitude increase in the sample size. Hence, without additional search strategies on the UV side (see [18,19,20] for human strategies), the expected cost of finding flux samples with a particular hierarchical suppression grows exponentially with the number of orders of magnitude of suppression sought. Similar to finding solutions with small tadpole at $h^{1,2} \gg 1$, we strongly believe that more time needs to be devoted to developing algorithmic approaches to probe exponentially suppressed tails of distributions within the landscape.
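The dependence of the minimal $|W_0|$ on the sample size can be made concrete with a short sketch, under the assumption that $W_0$ is exactly an isotropic two-dimensional Gaussian of width $\sigma$. In that case $|W_0|$ is Rayleigh distributed, the minimum of $N$ draws is again Rayleigh distributed with scale $\sigma/\sqrt{N}$, and its mean is $\sigma\sqrt{\pi/(2N)}$. The function names below are hypothetical; the 100 runs per point follow the choice quoted in the caption of Fig. 3.

```python
import numpy as np

rng = np.random.default_rng(1)

def expected_min_abs(sigma, n_samples):
    """Mean of the minimum of n_samples Rayleigh(sigma) variables."""
    return sigma * np.sqrt(np.pi / (2 * n_samples))

def simulated_min_abs(sigma, n_samples, n_runs=100):
    """Monte Carlo mean and standard deviation of min |W_0| over n_runs samples."""
    mins = np.empty(n_runs)
    for i in range(n_runs):
        re = rng.normal(0.0, sigma, size=n_samples)
        im = rng.normal(0.0, sigma, size=n_samples)
        mins[i] = np.hypot(re, im).min()
    return mins.mean(), mins.std()

sigma = 1.0
for n_samples in (10**3, 10**4, 10**5):
    mean, std = simulated_min_abs(sigma, n_samples)
    print(n_samples, mean, std, expected_min_abs(sigma, n_samples))
```

Since the expectation falls as $N^{-1/2}$, each additional order of magnitude of suppression costs a factor of 100 in sample size under this assumption.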
Moving to a quantitative comparison, we show in the bottom two rows of Figure 3 the best-fit Gaussian distribution and the respective ratio between our data and this model. We make out two features: the fits almost consistently under-predict our data at smaller values, while they over-predict at larger values. These deviations from a simple Gaussian will be analysed in more detail in future work.
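A comparison of this kind could be sketched as follows. The sketch assumes that the best-fit model for $|W_0|$ is the Rayleigh law implied by an isotropic two-dimensional Gaussian, with the width fixed by maximum likelihood, $\hat\sigma^2 = \langle |W_0|^2 \rangle / 2$; the fitting procedure actually used for Fig. 3 may differ. The sample is a synthetic, exactly Gaussian stand-in, and all function names, the number of bins and the histogram range are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_sigma(abs_w0):
    """Maximum-likelihood Rayleigh width: sigma^2 = <|W_0|^2> / 2."""
    return np.sqrt(np.mean(abs_w0 ** 2) / 2)

def rayleigh_cdf(r, sigma):
    """Cumulative distribution of |W_0| for a 2D Gaussian of width sigma."""
    return 1.0 - np.exp(-r ** 2 / (2 * sigma ** 2))

def data_to_model_ratio(abs_w0, n_bins=30):
    """Histogram counts divided by the counts expected from the fitted model."""
    sigma = fit_sigma(abs_w0)
    edges = np.linspace(0.0, 4 * sigma, n_bins + 1)
    counts, _ = np.histogram(abs_w0, bins=edges)
    expected = len(abs_w0) * np.diff(rayleigh_cdf(edges, sigma))
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, counts / expected, sigma

# Synthetic stand-in sample (assumption): exactly Gaussian with sigma = 0.8.
w0 = 0.8 * (rng.normal(size=100_000) + 1j * rng.normal(size=100_000))
centres, ratio, sigma_hat = data_to_model_ratio(np.abs(w0))
print(sigma_hat)
print(ratio[:5])
```

For an exactly Gaussian sample the ratio scatters around one in every bin; a systematic excess at small $|W_0|$ together with a deficit at large $|W_0|$ would correspond to the pattern described above.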
Having analysed the functional behaviour of $W_0$, it is important to understand its dependence on our algorithmic choices described in Sec. 2. The influence of the sampling choices in Eq. (2) on our numerical solutions is non-linear and not known exactly. However, comparing the $W_0$ distributions obtained from the various sampling procedures discussed in [7], our results seem to be largely independent of the chosen sampling method. Next, we performed some empirical checks to get a rough idea of the resilience of our results against variations of hyperparameters (see Fig. 4). In particular, we change the range for the sampled flux vectors as well as the initial guesses for the moduli VEVs. Overall, we only observe mild effects. Nevertheless, our results are subject to these algorithmic assumptions (see [21] for examples of such effects), but at this stage we are not aware of particular biases introduced by our algorithm. A more detailed analysis of algorithmic effects (e.g. to examine the sub-leading features seen in the distribution) is beyond the scope of this paper.

Conclusions
Our numerical methods allow a first glance at largely uncharted classes of string theory solutions. Studying these ensembles of flux vacua enables us to obtain meaningful insights into the distribution of $|W_0|$. Here, we found hints of universal features across geometries, as the distributions exhibit rather similar properties. Among others, this includes a characteristic scaling of the width of our distributions with the D3-charge contribution $N_{\text{flux}}$ from fluxes. Further, we observed a close similarity to a Gaussian distribution for $W_0$, which, to our knowledge, explicitly verifies for the first time the behaviour discussed in [10,11,12,13]. This in turn offers a different perspective on finding solutions with small $|W_0|$, as it is equivalent to sampling small absolute values in a two-dimensional Gaussian.
Without more efficient sampling techniques, hierarchical suppression will require exponentially more samples. Learning strategies to efficiently generate such samples promise an exciting angle to study these particular regions of string theory solutions (see [22,23,24] for successfully learned strategies in simple string theory settings). This approach seems timely as a complement to human strategies for identifying such special solutions [18] at the tails of distributions within the landscape. For instance, it will be interesting to see how our numerical approach can be used to explicitly search for solutions with small tadpole at $h^{1,2} \gg 1$, a question often discussed under the name of the tadpole conjecture [25] (see [26,27] for numerical work in this direction).
It is clear from our work that a Gaussian distribution is only a first approximation at the level of our samples and that there are additional effects in our data which need to be accounted for. At this stage we were not able to attribute these additional features to algorithmic biases, which therefore deserve further attention in the future. Some of us previously discussed potential sampling biases arising from the sampling method of initial points [7] and from the algorithm itself [21]. Ultimately, quantifying these biases is a prerequisite for a proper understanding of the distributions arising in the string landscape.
We note that an analysis of other phenomenologically relevant variables along the lines of our analysis of the distribution of $W_0$ seems straightforward, but at this stage we do not see a pressing physics case for such an analysis. Arguably this situation changes when Kähler moduli stabilisation is included (e.g. [28]), as this sets many scales for the underlying particle physics model.
Another important line of future investigation is the comparison with random matrix potentials to understand the nature of the underlying string theory ensemble (see [29,30] for work on random matrix potentials and random potentials inspired by flux vacua at LCS, and [31,32] for spectral properties of the Hessian).
Clearly the analysis presented in this note is by no means exhaustive and the hints for universal behaviour deserve further study. We hope to report on progress along these lines at a later stage.

Figure 1: Distribution of our solutions for the real and imaginary parts of $W_0$. Each box corresponds to one of our geometries as listed in Tab. 1.

Figure 2: Scaling of the distribution of $W_0$ with $N_{\text{flux}}$. Each box corresponds again to one of our geometries as listed in Tab. 1. The colours in the first four rows highlight the dependence on the D3-charge contribution $N_{\text{flux}}$ from fluxes. For clarity, we picked only the discrete values $N_{\text{flux}}/Q_{D3} \in \{0.01, 0.2, 0.4, 0.6, 0.9, 1\}$ to show the qualitative dependence of the width on $N_{\text{flux}}$. The bottom row shows the standard deviation $\sigma$ as a function of $N_{\text{flux}}/Q_{D3}$.

Figure 3: Top left: Comparison of the expected minimal $|W_0|$ value as a function of the sample size. Different lines correspond to two-dimensional Gaussians with different standard deviations. The error bars are one standard deviation and are estimated with 100 runs each. The dots indicate our 20 samples coloured by their respective values of $h^{1,2}$. Top right: The distribution of the absolute value for a two-dimensional Gaussian for different values of the standard deviation. Middle and bottom rows: The distributions of $|W_0|$ in our flux vacua samples exhibit universal behaviour across geometries. We show the best-fit Gaussian associated with our data sample and the ratio of this model with respect to our data.

Figure 4: Dependence of the distribution on algorithmic hyperparameter choices. Top: Influence of our choice for the initial flux vector, i.e. we varied the sampling region for the respective subset of the flux vector. The different subplots show the effect for different numbers of moduli. The default value is 5. Bottom: Comparison of different regions for the initial values of the moduli fields. The default value is 5. In both cases we see a smaller effect for a larger number of moduli.

Table 1: Summary of our compactification manifolds with their respective Hodge numbers $h^{1,1}$ and $h^{1,2}$, tadpole values $Q_{D3}$ and number of vacua. In total, we find 15,486,448 solutions.