String Landscape and Fermion Masses

Besides the string scale, string theory has no parameter except some quantized flux values, and the string theory Landscape is generated by scanning over discrete values of all the flux parameters present. We propose that a typical (normalized) probability distribution $P(Q)$ of a physical quantity $Q$ tends to peak (diverge) at $Q = 0$ as a signature of string theory. In the Racetrack Kähler uplift model, where $P(\Lambda)$ of the cosmological constant $\Lambda$ peaks sharply at $\Lambda = 0$, the electroweak scale (not the electroweak model) naturally emerges when the median $\Lambda$ is matched to the observed value. We check the robustness of this scenario. In a bottom-up approach, we find that the observed quark and charged lepton masses follow the same probabilistic philosophy, with distribution $P(m)$ that diverges at $m = 0$. This suggests that the Standard Model has an underlying string theory description, and yields relations among the fermion masses, albeit in a probabilistic approach (very different from the usual sense). Along this line of reasoning, the normal hierarchy of neutrino masses is clearly preferred over the inverted hierarchy, and the sum of the neutrino masses is predicted to be $\sum m_\nu \simeq 0.0592$ eV, with an upper bound $\sum m_\nu < 0.066$ eV. This illustrates a novel way string theory can be applied to particle physics phenomenology.

February 26, 2019. arXiv:1902.06608v2 [hep-th] 25 Feb 2019


Introduction
In science, a simple criterion for the success of a model or idea is the number of parameters needed to explain or predict the observable phenomena. With about 20 parameters (the number of parameters in the neutrino sector remains open), the Standard Model (SM) of strong and electroweak interactions has been spectacularly successful in fitting and predicting thousands of data points. Can we find a theory that further reduces the 20 or so parameters? Or equivalently, can we find relations among the parameters? It is not unthinkable that string theory, as a consistent theory of quantum gravity with a single parameter (the string scale $M_S \simeq \alpha'^{-1/2}$), has the Standard Model (and beyond) as one of its solutions. Because of its richness, however, we have so far failed to find the SM as a solution. We therefore advocate a change in approach (or viewpoint). In particular, we propose to apply the intrinsic probabilistic nature of string theory to guide the search for the SM inside string theory. In a recent paper [1], we show how the electroweak scale ($m \sim 10^2$ GeV) can emerge. In this paper, we show how we can make predictions about neutrino masses. This novel way of considering string theory opens a new door to particle physics phenomenology.
We know that string theory allows for a Landscape. If we start with solutions that have a 4-dimensional spacetime with a dynamically flux-compactified 6-dimensional manifold, scanning over all discrete flux values will yield a region of the string Landscape with an exponentially large number of possibilities. At the level of the low energy 4-dimensional effective field theory, this corresponds to the introduction of a potential $V(F_i, \phi_j)$, where the flux parameters $F_i$ take discrete values and the moduli (scalar fields) $\phi_j$ are to be stabilized dynamically at some local minima. Parts of the string Landscape are generated as we scan over the "dense discretuum" of all flux values. Note that the number of flux parameters and fields may differ in different regions of the Landscape. As we move in the field space, a heavy mode that has been integrated out can become light, so we can no longer integrate it out and must include it in the effective potential. Additional $F_i$ must be introduced when new cycles appear. In short, a given $V(F_i, \phi_j)$ corresponds to a "model", valid only over a patch of the Landscape, as illustrated in Fig. 1. Our fundamental assumption here is that string theory contains a solution that describes our universe today.

Figure 1: The Landscape of string vacua. Different string ingredients (fluxes, localised objects) give rise to "qualitatively different" models or patches in the string Landscape, illustrated with different colors. Different patches are described by different effective low energy theories. Moreover, given a patch, there is still some freedom, e.g., dictated by the choice of flux values. Therefore, each point corresponds to a vacuum with a specific qualitative and/or quantitative description of physics. One of these we assume is the Standard Model.

The challenge is to identify as many criteria and properties as possible to guide our search for the SM in the string Landscape.
In a patch of the Landscape, before fixing $M_P$ or $\Lambda$, string theory has no scale, so the probability distribution of any discrete flux parameter should be flat (see the discussion below on this point). Our proposal is that some physical quantities are determined probabilistically in the Landscape. In the absence of parameters, the probability distribution $P(Q)$ for a quantity $Q(F_i)$ is either flat or peaks (diverges) at $Q = 0$; otherwise, a scale would be introduced. The degree of divergence at $Q \to 0$ is determined by dynamics, and should be independent of the scale that is introduced when we have more information, as illustrated by the relation between $\Lambda$, $M_P$ and $m$ [1]. The divergent behavior of $P(Q)$ is natural in the context of string theory and is taken as a signature of string theory. Applying this to the cosmological constant $\Lambda$ allows us to understand why an exponentially small $\Lambda$ can be natural [2]. In this paper, we show that the observed fermion masses follow a distribution $P(m)$ that peaks (diverges) at $m = 0$, which we interpret as evidence that the SM has an underlying string theory description. Applying this observation to the neutrinos allows us to see that the normal hierarchy is strongly preferred over the inverted hierarchy, and to obtain a prediction for the sum of neutrino masses,
$$\sum m_\nu \simeq 0.0592\ \text{eV}. \qquad (1.1)$$
Moreover, following the peaking behavior of $P(m)$, we can obtain (numerically) the approximate mass dependence on the fluxes, $m(F_i)$. So in the search for the SM in the Landscape, we look for a specific model with the right gauge group and particle content, in which $m(F_i)$ is reproduced. This illustrates how the bottom-up approach may reveal properties of the environment where the SM sits in the Landscape. Combining the top-down and the bottom-up approaches hopefully offers new strategies for finding where the SM is hiding in the Landscape, the holy grail of fundamental physics.
We have already advocated a probability distribution approach to address the cosmological constant issues (see [3] and references therein). Starting with a low energy effective theory derived (top-down) from and/or inspired by string theory, we can determine $\Lambda$ in terms of $M_P$, since both are calculable from $M_S$. By scanning over the discrete flux values, one can obtain the probability distribution $P(\Lambda)$ for $\Lambda$. If the median value is comparable to the observed value $\Lambda_{\rm obs}$, then we consider the smallness of $\Lambda_{\rm obs}$ to be statistically natural. This happens if the properly normalized $P(\Lambda)$ peaks sharply (i.e., diverges) at $\Lambda = 0$. Requiring a statistically natural small $\Lambda$ should be a powerful clue in the search for the SM in the Landscape.
In particular, we observe this feature in the Kähler uplift model [4][5][6][7][8], which is to date the best controlled of the proposed ways to reach de Sitter (dS) space. In the Racetrack Kähler uplift model [2], one calculates the probability distribution $P(\Lambda)$ and finds that the median value can easily match the observed value. Furthermore, a new scale $m \simeq 10^2$ GeV automatically emerges [1]. It is important to explore other patches in the Landscape to check the robustness (or uniqueness) of this intriguing relation.
Similar to the statistical determination of $\Lambda$, we assume that the quark masses are not precisely calculable but are flux-dependent, $m(F_i)$, so they are described by a probability distribution $P(m)$ that peaks (diverges) at $m = 0$. The quark mass distribution we find also fits the charged lepton masses well when we keep the same divergent behavior but reset the overall energy scale, which is expected to be lower due to the absence of the strong interaction among the leptons. We like to believe that this is an improvement, since 9 fermion masses appear natural using 3 parameters for the two probability distributions. We consider this peaked probability distribution as evidence that the SM has an underlying string theory description.
The distribution of fermion masses has also been studied before [9]. In particular, also inspired by the string theory landscape, Refs. [10,11] already pointed out that, among other properties, the measured fermion masses tend to follow distributions that peak at zero value. In this paper, motivated by the peaking behavior of $P(\Lambda)$, we focus our attention on giving a more explicit string theoretical motivation for such a peaking behavior of probability distributions. Furthermore, with more recent data, we push this philosophy further by providing a prediction for the mass of the lightest neutrino (and so for the sum of neutrino masses).
The paper is organised as follows. In Section 2, we review and extend some basic observations given in [3] regarding the peaking behaviour of $P(\Lambda)$ for some toy models. We also state the string theory constraints that lead to our proposal that the probability distributions of some physical quantities should tend to peak (diverge) at zero value. (More discussion of this proposal can be found in the last section.) In Section 3, we apply our proposal to quark and charged lepton masses, and we find that the probability distribution goes like $P(m_f \to 0) \sim m_f^{-0.731}$. We then apply this result to neutrino masses to obtain (1.1) for the normal hierarchy case. The Seesaw case is also considered, with almost no change in the prediction. In Section 4, we review the Racetrack Kähler uplift of [1] and furnish new details confirming its robustness. We show that extending the Racetrack from two to three non-perturbative terms changes only the quantitative properties but not the qualitative features. We discuss the "attractive basin" (field range) and comment on the "flux basin" (flux range) of the Racetrack Kähler uplift model. Section 5 contains some discussion and conclusion. There we also explain our proposal of the peaking $P(Q)$ in more detail.

Overview
In this Section, we recall some results obtained in a toy model [2,3] and extend them to illustrate the importance and validity of our main inputs and assumptions. The main point is to show that if we write down an effective potential satisfying some stringy requirements, then vacua with small vacuum energy are statistically preferred. Inspired by string theory, we rely upon three well-established stringy conditions:
(1) absence of free parameters;
(2) no decoupled sector;
(3a) a discretuum of flux values with smooth distributions;¹
plus, we assume also that
(3b) the discretuum of flux data is "dense enough" to make a statistical analysis meaningful.
Let us stress here that, without explicit calculations, we do not know "how dense" the discretuum is.

¹ Notice that throughout the paper we will use the terms "fluxes" ($F_i$) and "flux parameters/data" ($a, b, c, \ldots$) interchangeably. More generally, we assume that the probability distribution of discrete values $F_i$ of fluxes is flat or smooth, while $a, b, c, \ldots$ are actually functions of $F_i$, so their probability distributions may not be smooth. In that case, we expect some of them to peak at zero values.
A simple toy model obeying (1)-(3) is [3]:
$$V(\phi) = \sum_{k=1}^{n} \frac{a_k}{k!}\,\phi^k, \qquad (2.1)$$
with a real scalar field $\phi$ and independent (real) flux parameters $a_k$. Given $V(\phi)$, we can compute the vacuum energy $\Lambda$ of locally stable minima as a function of flux data, $\Lambda(a_k)$, and obtain its distribution $P(\Lambda)$ after scanning over all $a_k$ values with flat or smooth distributions $P_k(a_k)$.
(By a smooth probability distribution, we mean a distribution that is nowhere divergent and is non-vanishing at zero.) For simplicity, we deal with three cases, truncating the potential at $\phi^3$, $\phi^4$, $\phi^6$. The $\phi^3$ toy model can be studied analytically, while the $\phi^4$ and $\phi^6$ models are studied numerically.
As explained in [3], including two-loop radiative corrections in the $\phi^4$ case does not change $P(\Lambda)$. However, if we add to $V(\phi)$ (2.1) an independent $a_0$ (e.g., Bousso-Polchinski terms [12]) or a decoupled sector (e.g., a potential for another field not coupled to $V$), the peaking in $P(\Lambda)$ typically disappears. This is why condition (2) is crucial to rule out these possibilities. Fortunately, there is no decoupled sector in string theory, since all fields (moduli) couple to the closed string sector modes, including the graviton and the dilaton. Furthermore, they can couple via flux parameters as well. In other words, all fields are coupled together, directly or indirectly, via moduli and/or flux parameters.
In the Racetrack Kähler uplift model in Type IIB string theory, we find that $P(\Lambda)$ peaks sharply at $\Lambda = 0$ [2]. Note that both the Kähler uplift model [4][5][6][7][8] and the Racetrack model [13][14][15] are scenarios well explored in string phenomenology. Matching the median $\Lambda_{50}$ to the observed value, we find that the electroweak scale ($m \simeq 10^2$ GeV) emerges automatically [1], without knowing anything about the electroweak model. This result rests on an F-term effective potential $V$ in the low energy supergravity framework. For electroweak phenomenology, one may like to introduce a D-term (or some other term) to the potential. In the field theory framework, such a term will shift the vacuum energy density by a value orders of magnitude bigger than the observed $\Lambda$, thus ruining the above small $\Lambda$ property. However, in string theory, such a new term must couple to the moduli and/or flux parameters in the F-term $V$. Coupling the two terms together and going to the resulting minimum typically renders them comparable in magnitude. So it is plausible that the introduction of new terms to $V$ will not ruin the peaking behavior of $P(\Lambda)$, but only shift $\Lambda$ and $m$ by no more than a few orders of magnitude, thus not spoiling the remarkable result relating $\Lambda$ and $m$. It is important to find out how a D-term (or some other term) for the electroweak interaction couples to the moduli/flux parameters in the F-term $V$. This should put a tight constraint on the possible origin of such terms.
In principle, we could have applied the same statistical approach to other uplift models, such as the KKLT mechanism [16]. Unfortunately, not enough is understood about the properties of the KKLT model to allow us to carry out a meaningfully reliable analysis. It is important to study those possibilities more carefully, in particular how the uplift term couples to the closed string modes and the fluxes. Recent results [17][18][19] may help in this direction.
Notice that a crucial working assumption was (3b), namely taking the flux discretuum to be "dense enough" to deal with smooth (i.e., quasi-continuous) flux distributions. At this point, one should wonder whether string theory really allows for smooth flux distributions. In other words, is (3b) a good assumption? Indeed, if the discretuum is not dense enough, dS vacua might be disallowed. Let us illustrate this fact with the simple case of the $\phi^3$ model, $V = a_1\phi + \frac{a_2}{2}\phi^2 + \frac{a_3}{3!}\phi^3$, which can be dealt with analytically. There, we can set a scale so that $a_3 = 1$. By imposing $V' = 0$, $V'' > 0$ we find the minimum and its vacuum energy:
$$\phi_{\min} = -a_2 + \sqrt{a_2^2 - 2a_1}, \qquad \Lambda = V(\phi_{\min}) = -\frac{1}{6}\,\phi_{\min}^2\left(a_2 + 2\sqrt{a_2^2 - 2a_1}\right),$$
with the condition $a_2^2 - 2a_1 > 0$. Therefore, for a $\Lambda > 0$ solution to exist, we require
$$a_2 < 0, \qquad \frac{3}{8}\,a_2^2 < a_1 < \frac{1}{2}\,a_2^2.$$
We will refer to this region as the "flux basin". If, instead of fixing $a_3 = 1$, we allow $a_3$ to take discrete values, the above analysis is still valid if we replace $a_1 \to a_1/a_3$ and $a_2 \to a_2/a_3$ in the above equations. Note that the scale changes, but the degree of divergence $P(\Lambda \to 0^+) \sim -\log(\Lambda)$ remains intact.
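The $\phi^3$ analysis above can be checked numerically. The following is a minimal sketch, assuming the cubic potential $V = a_1\phi + a_2\phi^2/2 + \phi^3/6$ (with $a_3 = 1$), a normalisation chosen here because it reproduces the condition $a_2^2 - 2a_1 > 0$ quoted in the text. The function names and the uniform sampling window $[-1, 1]$ are illustrative assumptions, not taken from the paper.

```python
import math
import random


def cubic_vacuum(a1, a2):
    """Return (phi_min, Lambda) for V = a1*phi + a2*phi**2/2 + phi**3/6.

    Returns None when V has no local minimum (a2**2 - 2*a1 <= 0).
    The normalisation of V is an assumption of this sketch.
    """
    disc = a2 * a2 - 2.0 * a1
    if disc <= 0.0:
        return None
    s = math.sqrt(disc)
    phi_min = -a2 + s  # root of V' = 0 where V'' = a2 + phi = s > 0
    lam = a1 * phi_min + a2 * phi_min ** 2 / 2.0 + phi_min ** 3 / 6.0
    return phi_min, lam


def in_flux_basin(a1, a2):
    """Region of (a1, a2) with a dS minimum, derived from the assumed V."""
    return a2 < 0.0 and 3.0 * a2 * a2 / 8.0 < a1 < a2 * a2 / 2.0


def sample_positive_lambdas(n_draws, seed=0):
    """Scan flat distributions of a1, a2 on [-1, 1] (assumed window)
    and collect the vacuum energies of the dS minima."""
    rng = random.Random(seed)
    lambdas = []
    for _ in range(n_draws):
        a1, a2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        vac = cubic_vacuum(a1, a2)
        if vac is not None and vac[1] > 0.0:
            lambdas.append(vac[1])
    return lambdas
```

A histogram of `sample_positive_lambdas(...)` can then be inspected for the accumulation of vacua near $\Lambda = 0$; the sketch only scans two flux parameters and is not a substitute for the analysis of [3].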
More generally speaking, a flux basin is the range of discrete flux parameter choices $a_k$ within which the minimum behaves qualitatively the same. To be more specific, we have in mind that in the neighbourhood of the SM in the Landscape, the particle content is the same but the quantitative values of the 20 or so parameters may vary as we vary the flux values within the flux basin. We shall provide evidence that such a flux basin exists in the vicinity of the SM in the Landscape.
In the $\phi^3$ model, the better the distributions $P(a_1)$, $P(a_2)$ fill the flux basin, the more dS vacua we obtain. There are several possibilities, as illustrated with a few examples in Fig. 2. It may be possible that the separations between discrete flux values are so big that no dS solution exists (green and blue points). It is also possible that the discretuum is barely dense enough so the flux basin is not empty (orange points). We assume that the discretuum is dense enough that a smooth probability distribution is meaningful. For instance, Ref. [12] argues that 14 flux parameters should be enough to provide a dense enough discretuum to accommodate the observed $\Lambda$. This is not difficult to achieve in flux compactifications in string theory, though one cannot be sure in the absence of an explicit construction. Having a dense discretuum (3b) is crucial in the more realistic model of Racetrack Kähler uplift [1] in order to explain how the median value $\Lambda_{50}$ can be as small as the observed value. Moreover, as we will see in Section 3, (3b) also allows us to draw important conclusions regarding the probability distribution of fermion masses, which in turn supports the dense discretuum picture a posteriori.

Figure 2: … (the orange shaded area) give dS minima. The $(a_1, a_2)$ discretuum must be dense enough to allow for dS vacua. For instance, blue and orange points are obtained with uniform distributions with spacings 0.35 and 0.13 respectively, while green points are obtained with a non-uniform distribution where spacings are doubled starting from a minimum spacing of 0.02 for $a_1$ and 0.01 for $a_2$. It is clear that the orange one allows for more dS vacua than the other two. They must be denser for $P(\Lambda)$ to be statistically meaningful.
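The dependence of the number of dS vacua on the density of the discretuum can be illustrated by counting uniform grid points inside the flux basin. This is a minimal sketch under stated assumptions: the basin $a_2 < 0$, $3a_2^2/8 < a_1 < a_2^2/2$ follows from the cubic potential $V = a_1\phi + a_2\phi^2/2 + \phi^3/6$ assumed here, and the grid (passing through the origin, restricted to the window $[-1, 1]^2$) is a hypothetical choice, so the counts need not match those of Fig. 2.

```python
import math


def in_flux_basin(a1, a2):
    """dS region of the cubic toy model (assumed normalisation, a3 = 1)."""
    return a2 < 0.0 and 3.0 * a2 * a2 / 8.0 < a1 < a2 * a2 / 2.0


def count_ds_points(spacing, half_width=1.0):
    """Count points (i*spacing, j*spacing) with |a1|, |a2| <= half_width
    that fall inside the flux basin. Window and grid offset are assumptions."""
    n_max = int(math.floor(half_width / spacing + 1e-12))
    count = 0
    for i in range(-n_max, n_max + 1):
        for j in range(-n_max, n_max + 1):
            if in_flux_basin(i * spacing, j * spacing):
                count += 1
    return count


if __name__ == "__main__":
    for spacing in (0.35, 0.13, 0.01):
        print(spacing, count_ds_points(spacing))
```

With this particular window, the coarse grid of spacing 0.35 misses the basin entirely, while finer grids populate it increasingly well, in line with the qualitative point made in the text.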
Before explaining our approach, let us introduce here another notion, which we will use in the following. We define the "attractive basin", $B$, as the region in the field space around $\phi_{i,\min}$ such that if $\phi_i \in B$, then each $\phi_i$ will roll towards $\phi_{i,\min}$. For instance, in the case of $\phi^3$ (with $a_3 = 1$), the attractive basin is simply (see Fig. 3)
$$\phi > \phi_{\max} = -a_2 - \sqrt{a_2^2 - 2a_1}.$$
That is, starting from any point inside the attractive basin, $\phi$ will roll towards the bottom. (The Hubble parameter in the expansion of the universe may help to dampen any over-shooting in the rolling.)
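The notion of an attractive basin can be illustrated with a simple overdamped rolling, $d\phi/dt = -V'(\phi)$, which mimics strong Hubble friction. The sketch below assumes the cubic potential $V = a_1\phi + a_2\phi^2/2 + \phi^3/6$ (an assumed normalisation with $a_3 = 1$); the step size, number of steps and escape threshold are arbitrary illustrative choices.

```python
import math


def extrema(a1, a2):
    """Return (phi_max, phi_min) of the assumed cubic potential, or None."""
    disc = a2 * a2 - 2.0 * a1
    if disc <= 0.0:
        return None
    s = math.sqrt(disc)
    return -a2 - s, -a2 + s


def roll(phi0, a1, a2, dt=0.01, steps=100000, escape=10.0):
    """Overdamped rolling d(phi)/dt = -V'(phi) by explicit Euler steps.

    Returns the final field value, or None if the field runs away
    (|phi| > escape), i.e. phi0 was outside the attractive basin.
    """
    phi = phi0
    for _ in range(steps):
        phi -= dt * (a1 + a2 * phi + 0.5 * phi * phi)
        if abs(phi) > escape:
            return None
    return phi


if __name__ == "__main__":
    a1, a2 = 0.11, -0.5  # a point inside the flux basin
    phi_max, phi_min = extrema(a1, a2)
    for start in (0.3, 0.4, 2.0):
        print(start, roll(start, a1, a2), "phi_min =", phi_min)
```

Starting points with $\phi > \phi_{\max}$ settle at $\phi_{\min}$, while those below $\phi_{\max}$ run away; without friction, over-shooting from large $\phi$ would have to be treated separately.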

Fermion Masses
Based on the picture we have now, locating the Landscape region where the SM lies is not enough to allow us to calculate any of its 20 or so parameters. The existence of a SM flux basin implies that slight changes of some of the flux values will most likely yield the same model but with different values for the SM parameters. This seems to mean that the parameters of the SM may never be unambiguously determined, so finding the SM solution in string theory does not by itself improve our understanding of the fundamental stringy features of particle physics. So a new criterion or strategy is necessary in our search. As illustrated by the case of Λ in our approach, even if it will never be precisely determined, one can find its probability distribution and check whether the observed value is natural or not. Here we would like to explore what we can learn via a bottom-up statistical approach.
If probability distributions in flux compactification dictate some part of the underlying physics, we should find other evidence of this behavior. Hopefully, this will provide new hints on where to find the Standard Model in the Landscape. Here we examine the fermion masses. To be more specific, consider the Yukawa couplings $Y_\alpha$, where $\alpha$ runs over fermions. Each $Y_\alpha$ is a function of common fluxes $F_j$ and of fluxes $F_i^{(\alpha)}$ that are associated specifically with the $\alpha$-th fermion; therefore $Y_\alpha = Y_\alpha(F_j, F_i^{(\alpha)}) \to Y_\alpha(F_i^{(\alpha)})$, since the $F_j$ take the same values for each fermion.² In the absence of the $\alpha$-dependent $F_i^{(\alpha)}$, all fermion masses would be identical. This is certainly not the case. So it is reasonable to assume that $Y_\alpha(F_i^{(\alpha)})$ has the same functional form for each $\alpha$. Hence, if all fluxes have the same distribution (as we will assume throughout the paper), the $Y_\alpha$'s have the same $P(Y_\alpha)$. This immediately follows from the fact that the distribution of a function of random variables is fixed by its functional form and by the distributions of its arguments. If this is the case, any number of Yukawa couplings we sample from $P(Y_\alpha)$ should be distributed accordingly. Since each fermion mass in a given vacuum is just one of these Yukawa couplings times the same Higgs vacuum expectation value, the fermion masses in each vacuum should obey the same distribution $P(m_\alpha) = P(Y_\alpha)$, which we expect to be divergent at $m_\alpha = 0$. In fact, we find a two-parameter probability distribution $P(m_q)$ that describes the observed quark masses very well, where $P(m_q \to 0)$ diverges. The same degree of divergence also fits the charged lepton masses well. We like to believe that this is an improvement, since 9 fermion masses appear natural using 3 parameters for the two probability distributions.

² In principle, the $Y_\alpha$'s can be functions of some moduli $\Phi_a$ as well. However, moduli vacuum expectation values (vevs) are determined by flux data, so we can just think of the Yukawa couplings in terms of common and specific flux dependence, $Y_\alpha(F_j, F_i^{(\alpha)})$.
Here we show that both quark masses and charged lepton masses obey the same (properly normalized) Weibull distribution,
$$f(x; k, l) = \frac{k}{l}\left(\frac{x}{l}\right)^{k-1} e^{-(x/l)^k}, \qquad (3.2)$$
with shape parameter $k > 0$ and (different) mass scale $l$. For $k < 1$ (as is the case here), $f(x; k, l)$ (3.2) diverges at $x = 0$, and $k$ measures the degree of divergence there, while $l$ measures the width of the divergent peak. Note that the Weibull distribution can be rewritten as a function of $u = x/l$ and $k$ only, $f(u; k) = k u^{k-1} e^{-u^k}$ (where $\int f(u; k)\,du = 1$), which depends only on the degree of divergence $0 < k < 1$ and has no scale. It is a prime example of what we have in mind for a typical probability distribution $P(Q)$. A scale appears only when we understand the dynamics and $Q$ has a dimension. For the fermion masses $m$ below, the scale $l$ is obtained from fitting the data.
So, following our proposal, $0 < k < 1$. Once dynamics introduces a new scale (e.g., $m$), it will in principle fix $l$, while $k$ is unchanged. The distribution then enjoys the feature that the median $x_{50} = l(\ln 2)^{1/k}$ is much smaller than the mean $\bar{x} = l\,\Gamma(1 + 1/k)$,
$$r \equiv \frac{x_{50}}{\bar{x}} = \frac{(\ln 2)^{1/k}}{\Gamma(1 + 1/k)}. \qquad (3.3)$$
The actual probability distribution $P(Q)$ for a physical quantity $Q$ probably deviates from $f(Q/l; k)$.
For example, the divergence may involve powers of logarithms which are subdominant (e.g., see $P(\Lambda)$ [2]), and $P(Q)$ may tail off for large $Q$ differently from $f(Q/l; k)$. However, we will see numerically that $f(x; k, l)$ (3.2) is good enough for our purpose.
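The median-to-mean ratio of the Weibull distribution is easy to evaluate for the shape parameters quoted in this section. The following is a small sketch using only the closed-form median $l(\ln 2)^{1/k}$ and mean $l\,\Gamma(1+1/k)$ given above; the function names are illustrative.

```python
import math


def weibull_pdf(x, k, l):
    """Weibull density f(x; k, l) = (k/l) (x/l)^(k-1) exp(-(x/l)^k), x > 0."""
    u = x / l
    return (k / l) * u ** (k - 1.0) * math.exp(-(u ** k))


def weibull_cdf(x, k, l):
    """Cumulative distribution 1 - exp(-(x/l)^k)."""
    return 1.0 - math.exp(-((x / l) ** k))


def weibull_median(k, l):
    return l * math.log(2.0) ** (1.0 / k)


def weibull_mean(k, l):
    return l * math.gamma(1.0 + 1.0 / k)


def median_to_mean(k):
    """Scale-free ratio r = (ln 2)^(1/k) / Gamma(1 + 1/k)."""
    return weibull_median(k, 1.0) / weibull_mean(k, 1.0)


if __name__ == "__main__":
    for k in (0.269, 0.668, 1.0):
        print(k, median_to_mean(k))
```

The ratio depends on $k$ alone, which is why the same $r$ is quoted for quarks and charged leptons once they share the same $k$.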
We can also apply our probabilistic philosophy to the mixing angles. The CKM mixing angles for the quarks are compatible with a peaked distribution, but, if it is present, the degree of divergence is weaker, so it cannot be taken as evidence of another signature of string theory. Furthermore, the unitarity constraint on the mixing matrix (both the CKM matrix and the neutrino mixing matrix) renders the application of the probabilistic approach less obvious than the fermion mass cases.

Quark Masses
The 6 current quark masses are known [20], and we see that the Weibull distribution (3.2) with $k = 0.269$ and $l = 2290$ MeV,
$$P(m_q) = f(m_q;\ k = 0.269,\ l = 2290\ \text{MeV}), \qquad (3.4)$$
provides an excellent fit, with $r = 0.016$. We see that $r$ does not match the observed ratio $m_q^{50}/\bar{m}_q \simeq 0.023$, and this is because averaging the strange and charm quark masses to obtain the median is not reliable for such a steep distribution. On the other hand, allowing a range of values for the median (between the strange and charm quark masses), we have
$$0.0032 < \frac{m_q^{50}}{\bar{m}_q} < 0.043,$$
which is in agreement with $r = 0.016$ within the limited amount of data. We also show the corresponding percentiles in Table 1.

Notice that another distribution maintaining the (qualitative) divergent behavior at $m_f = 0$ but with a different tail can also provide a reasonable fit. Any realistic model, i.e., one containing the SM as a meta-stable vacuum, should be such that the functional form $m_f(a_k)$ gives a mass distribution $P(m_f)$ that peaks at zero, such that $m_f^{50} \ll \bar{m}_f$. For instance, as a function of a single flux parameter with a smooth distribution $P(a)$, as $m_f \to 0$, we can choose
$$m_f(a) \simeq a^{1/k} = a^{3.72}. \qquad (3.5)$$
If $m_f$ is a function of more than one flux parameter, the dependence is no longer determined. For example, $m_f(a_1, a_2)$ may take one of two acceptable forms that produce the divergent behavior in $P(m_f)$ as $a_j \to 0$ [21], the second of which is a sum of terms. In the latter case, the divergent behavior of the individual distributions $P_i(a_i)$ is weakened in the distribution $P(m_f)$ for the sum, but not erased. String theory constraint (2) clearly prefers the first possibility. Overall, this analysis indicates that the flux basin is big enough, i.e., the flux parameters $a_k$ for the quark masses are dense enough to cover the range of quark masses observed in nature. That 6 quark masses can be described by a single probability distribution (3.4) that involves only 2 parameters should be considered an improvement.
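The numbers quoted for the quark sector can be cross-checked with a short script. This is a sketch under stated assumptions: the quark masses below are approximate central values (in MeV) inserted here for illustration and are not quoted in the text, and the exponential distribution for the flux parameter $a$ is just one convenient example of a smooth $P(a)$ that is non-vanishing at zero; with that choice, $m = l\,a^{1/k}$ is exactly Weibull-distributed.

```python
import math
import random
import statistics

# Assumed inputs: approximate central values in MeV for u, d, s, c, b, t.
QUARK_MASSES_MEV = [2.2, 4.7, 95.0, 1275.0, 4180.0, 173000.0]
K_FIT = 0.269          # shape parameter quoted in the text
L_QUARK_MEV = 2290.0   # quark mass scale quoted in the text


def weibull_cdf(m, k, l):
    """Percentile of mass m under the Weibull distribution with (k, l)."""
    return 1.0 - math.exp(-((m / l) ** k))


def median_ratio_range(masses):
    """For an even number of masses, return the median-to-mean ratio using
    the lower middle value, the midpoint, and the upper middle value."""
    ms = sorted(masses)
    mean = sum(ms) / len(ms)
    lo, hi = ms[len(ms) // 2 - 1], ms[len(ms) // 2]
    return lo / mean, 0.5 * (lo + hi) / mean, hi / mean


def sample_masses(n, k, l, seed=0):
    """Draw masses m = l * a**(1/k) with a ~ Exp(1), a smooth flux
    distribution that is non-vanishing at a = 0 (illustrative choice)."""
    rng = random.Random(seed)
    return [l * rng.expovariate(1.0) ** (1.0 / k) for _ in range(n)]


if __name__ == "__main__":
    print(median_ratio_range(QUARK_MASSES_MEV))
    print([round(weibull_cdf(m, K_FIT, L_QUARK_MEV), 3) for m in QUARK_MASSES_MEV])
    draws = sample_masses(20000, K_FIT, L_QUARK_MEV)
    print(statistics.median(draws) / L_QUARK_MEV)
```

The first line of output corresponds to the range of observed ratios discussed above; the sampled median divided by $l$ should approach $(\ln 2)^{1/k}$ as the number of draws grows.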

Charged Lepton Masses
We expect that the peaking also happens in the case of the charged lepton masses $m_l$, even though we have only 3 data points. We find that the three lepton masses are also relatively well fitted by the Weibull distribution,
$$P(m_l) = f(m_l;\ k = 0.269,\ l = 164\ \text{MeV}), \qquad (3.6)$$
which surprisingly has the same $k = 0.269$, and so the same $r = 0.016$, as in the case of quark masses, but with a smaller mass scale $l = 164$ MeV. The different value for $l$ is not unexpected, as the QCD coupling to quarks but not leptons tends to raise the quark mass scale. The validity of these fits can be checked from the percentiles in Table 1. Note that $P(m_l)$ (3.6) is a very good fit even though the observed ratio (3.3) is not close to $r = 0.016$, due to the scarcity of data.

Neutrino Masses
One would expect neutrino masses to follow the same kind of probability distribution but with a different $l$. Unfortunately, we do not have observed values for neutrino masses and cannot check this explicitly. Nonetheless, given our limited knowledge of neutrino masses, we will see that the proposal just suggested yields interesting insights into their values. Namely, we show that the normal hierarchy is strongly preferred over the inverted hierarchy. Furthermore, we are able to obtain a very tight bound on the sum of the neutrino masses, $\sum m_\nu < 0.066$ eV, which implies that the lightest neutrino mass is of order $10^{-3}$ eV or smaller.
Let us first recall the data. Take the three neutrino masses $m_\nu$ to be $(m_1, m_2, m_3)$ with $m_1 < m_2 < m_3$. The study of neutrino oscillations provides us with the differences between the squares of two neutrino masses, $\Delta m^2_{ab} = m_a^2 - m_b^2$, whose values depend on the choice of normal or inverted hierarchy [22]. The normal hierarchy is slightly preferred by cosmological data [23] and further data from neutrino oscillations [24]. Following our proposal, since $P(m_\nu)$ peaks at $m_\nu = 0$ and is monotonically decreasing with $m_\nu$, the normal hierarchy is clearly preferred over the inverted one. The corresponding median-to-mean ratios $r$ (3.10) for the two hierarchies are plotted in Figure 4.
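The preference for the normal hierarchy can be illustrated by computing the median-to-mean ratio of the three neutrino masses as a function of the lightest mass. This is a rough sketch: the two mass-squared splittings below are round, assumed values of the typical measured size (the values actually used in the paper are not reproduced here), and the labelling conventions in the helper functions are likewise illustrative.

```python
import math

# Assumed inputs (eV^2): representative solar and atmospheric splittings.
DM2_SOL = 7.53e-5
DM2_ATM = 2.5e-3


def masses_normal(m_lightest):
    """Normal hierarchy: two light states split by DM2_SOL, third heavier."""
    m1 = m_lightest
    m2 = math.sqrt(m1 * m1 + DM2_SOL)
    m3 = math.sqrt(m2 * m2 + DM2_ATM)
    return [m1, m2, m3]


def masses_inverted(m_lightest):
    """Inverted hierarchy: one light state, two heavy states split by DM2_SOL."""
    light = m_lightest
    heavy_a = math.sqrt(light * light + DM2_ATM)
    heavy_b = math.sqrt(heavy_a * heavy_a + DM2_SOL)
    return [light, heavy_a, heavy_b]


def median_to_mean(masses):
    """Ratio r of the middle mass to the average of the three masses."""
    ms = sorted(masses)
    return ms[1] / (sum(ms) / 3.0)


if __name__ == "__main__":
    for m in (0.0, 0.001, 0.01, 0.05):
        print(m, median_to_mean(masses_normal(m)), median_to_mean(masses_inverted(m)))
```

With these inputs the normal hierarchy gives $r < 1$ with a minimum near 0.43, while the inverted hierarchy always gives $r > 1$, which cannot be matched by a distribution that peaks only at zero.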

Dirac neutrinos
The minimum value for (3.10) is $r^{\rm norm}_{\min} = 0.435$, which corresponds to having a Weibull distribution with $k = 0.668$ according to (3.3). Since the origin of the Dirac masses for the neutrinos should be similar to that for the other fermions, we only allow $r < 1$, so the inverted hierarchy is clearly ruled out, as shown in Fig. 4. Having seen that the inverted hierarchy is ruled out, we can ask how the normal hierarchy fits our proposal. Namely, we expect that the best probability distribution fitting the three neutrino masses in the normal hierarchy should peak at zero but not elsewhere. Let us assume that the best fit is still a Weibull distribution (3.2). As explained at the beginning of this section, to be consistent with our conjecture on $P(m_\nu)$, we need $k < 1$. Interestingly, this simple requirement imposes a tight upper bound on neutrino masses. This is because when the neutrino masses are too large, they become much closer to one another than to zero, so that a probability distribution peaking at zero is not a good fit. Now, given one of the neutrino masses, say $m_1$, we can determine all masses in the normal hierarchy, and hence the best fit. In principle, if neutrinos are Dirac fermions we should expect $k \simeq 0.269$, while we should allow some uncertainty in $k$. As an example, if we impose the bound that $k$ should not be bigger than double the fitted value of $k = 0.269$, i.e., $k < 0.538$, we then find that
$$\sum m_\nu < 0.066\ \text{eV}.$$
Compared to the lowest value for the upper bound given by experiments, $\sum m_\nu < 0.14$ eV (or a tighter bound but with less confidence, $\sum m_\nu < 0.09$ eV) [20], our result is clearly much stronger. If we demand that the same $k = 0.269$ applies to the neutrino mass distribution $P(m_\nu)$, as one expects for Dirac masses, we find that
$$m_1 \simeq 10^{-7}\ \text{eV}, \qquad \sum m_\nu \simeq 0.0592^{+0.0005}_{-0.0004}\ \text{eV}, \qquad (3.14)$$
where $m_2 \simeq 0.0086$ eV and $m_3 \simeq 0.051$ eV. The resultant $P(m_\nu)$ for the above values of $k$ is plotted in Fig. 6.
Note that the distributions $P(m/l) = f(u = m/l;\ k = 0.269)$ are identical for the quarks, the charged leptons and the Dirac neutrinos. We emphasize that this result depends strongly on the reliability of the proposed behavior of the probability distribution, which is based on the string theory "no parameter" property; so measuring the neutrino masses should provide a hint on how string theory can shed light on phenomenology.

Seesaw Mechanism
The above $P$ … where $l = 0.0013$ eV. Here we find that the lightest neutrino mass is now $m_1 \sim 10^{-8}$ eV, while the sum (3.14) essentially remains unchanged. The corresponding $P(m_\nu)$ is plotted in Fig. 6.
Using the lepton mass scale $l_l = 164$ MeV from the charged lepton distribution $P(m_l)$ (3.6) and the neutrino mass scale $l_\nu = 0.0013$ eV from the neutrino distribution $P(m_\nu)$ (3.15), we find that
$$M_M = l_l^2/l_\nu \simeq 2.1 \times 10^{16}\ \text{GeV} \simeq M_S, \qquad (3.16)$$
which is very close to the string scale obtained earlier [1], which in turn is close to the GUT scale.

Revisiting the Racetrack Kähler Uplift Model
Let us revisit the Racetrack Kähler uplift model studied in Refs. [1,2]. Here we would like to discuss the robustness of the model in two aspects:
(1) So far we have focused on the case of a single Kähler modulus. Ref. [2] considers the multi-Kähler moduli case, checks whether it is compatible with the large volume approximation in the Racetrack Kähler uplift, and shows that qualitatively the sharp peaking of $P(\Lambda)$ is maintained. Here we would like to make a few minor comments on the multi-Kähler moduli case.
(2) In going from the Kähler uplift model to the Racetrack Kähler uplift model, i.e., going from one non-perturbative term to two non-perturbative terms for the Kähler modulus in the superpotential $W$, we find that $P(\Lambda)$ becomes substantially more peaked at $\Lambda = 0$, resulting in a naturally small $\Lambda$.
Here we check what happens if we go further and include more non-perturbative terms in the single Kähler modulus case. Not surprisingly, we find that the peaking of $P(\Lambda)$ does not change qualitatively.
We consider a 6-dimensional Calabi-Yau (CY) manifold $M$ with $h^{1,1}$ Kähler moduli $T_j$ and $h^{2,1} > h^{1,1}$ complex structure moduli $U_i$, so the manifold $M$ has Euler number $\chi(M) = 2(h^{1,1} - h^{2,1}) < 0$. The simplified model of interest is given by the potential $V$ (4.1) of [2,6-8] (setting $M_P = 1$). Here, $M_P^2 \simeq \mathcal{V}/\alpha'$. The superpotential $W_0(U_i, S)$ for $U_i$ and the dilaton $S$ can in principle be extended to include other fields such as the Higgs boson. The uplift to de Sitter space and the breaking of supersymmetry are provided by the $\alpha'$ correction $\xi$ term [25,26].

Multi-Non-Perturbative terms
The non-perturbative term $W_{NP}$ for the Kähler modulus $T$ in $V$ (4.1) can be extended to the multi-Kähler moduli case. Ref. [2] considers the Swiss-Cheese case with two Kähler moduli in the Racetrack Kähler uplift model. It is shown that the sharp peaking of $P(\Lambda)$ at $\Lambda = 0$ remains in the presence of an additional Kähler modulus. It is clear from the analysis that, under reasonable circumstances, more Kähler moduli will not change the overall picture of the single Racetrack Kähler uplift model. We may view this in another way. If the additional Kähler modulus is relatively heavy with respect to the other moduli and the dilaton, we may integrate it out in the low energy effective potential $V$. Its presence in $W_{NP}$ will then be a function of the remaining light fields and some flux parameters, which may be approximated by the superpotential $W_0$ in $V$ (4.1). Note that introducing higher $\alpha'$ corrections will change the value of $\xi$, but not the overall picture.
The non-perturbative term $W_{NP}$ for the Kähler modulus $T$ is introduced by gaugino condensates in the superpotential $W$ to stabilize the $T$ modulus [16]. So far we have considered only the case with two non-perturbative terms ($n_{NP} = 2$), which form the Racetrack. However, in general there may be more gauge symmetries. To show the robustness of the model we introduce $n_{NP}$ non-perturbative terms, with the coefficients $a_i = 2\pi/N_i$ for $SU(N_i)$ gauge symmetry ($i = 1, \ldots, n_{NP}$). The dependence of $A_i$ (also functions of some flux parameters) on $U_i$, $S$ is suppressed. The $A_i$ are treated as independent (real) random variables with smooth probability distributions that allow zero values, while the dilaton $S$ and the complex structure moduli $U_i$ are to be determined dynamically, yielding $W_0$.
In the large volume region, the resulting potential may be approximated to, with $T = t + i\tau$, where we have chosen $N_1 = \max\{N_1, N_2, \ldots, N_{n_{NP}}\}$. Here we can already see that in the large volume region, the behaviour of $\lambda$ is dominated by the first term and the terms with the smallest $\beta_i$. We expect that adding further non-perturbative terms makes little change. Following Refs. [1,2], the stability conditions at $y = 0$ give us the cosmological constant, where $x$ is determined by Eq. (4.4).
As explained in Ref. [2], we can analyze the probability distribution of the cosmological constant, $P(\Lambda)$. After randomizing $A_i$, $N_i$ (with upper bound $N_{\max} \sim O(100)$ given by F-theory [27]) and $W_0$, we collect all the classically stable solutions and find the probability distribution $P(\Lambda)$. Note that for $n_{NP} \geq 3$, Eq. (4.4) cannot be solved analytically, but it always has a unique solution for the typical regime of parameters considered in [1]. To get the analytical behaviour of $P(\Lambda)$, we can numerically fit $P(\Lambda)$ to some well-known probability distribution functions. It turns out that for $n_{NP} \geq 3$ and small $\Lambda$, the Weibull distribution works very well. Remarkably, this is the same distribution as in the case of fermion masses.
On the other hand, the case of $n_{NP} = 2$ is given by $P(\Lambda)$ (4.5) [2]. So for $\beta_2 \gtrsim 1$, we see that the properly normalized $P(\Lambda)$ diverges very sharply as $\Lambda \to 0$. Note that the $(-\ln \Lambda)$ part is sub-dominant.
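To make the Weibull statement more concrete, the following minimal sketch evaluates the density and percentiles of a Weibull distribution with shape parameter $k < 1$, for which the density diverges at zero and the low percentiles sit many orders of magnitude below the mean. The shape and scale values are hypothetical placeholders chosen only for illustration; they are not the fitted values of this paper.

```python
import math

def weibull_pdf(q, k, scale):
    """Weibull density (k/scale) (q/scale)^(k-1) exp(-(q/scale)^k).

    For shape k < 1 the density diverges as q -> 0+.
    """
    z = q / scale
    return (k / scale) * z ** (k - 1.0) * math.exp(-(z ** k))

def weibull_percentile(y, k, scale):
    """Value Q_Y below which a fraction y/100 of the probability lies."""
    return scale * (-math.log(1.0 - y / 100.0)) ** (1.0 / k)

# Hypothetical parameters (assumption, not taken from the paper's fits).
k, scale = 0.05, 1.0

q10 = weibull_percentile(10, k, scale)      # 10th percentile
q50 = weibull_percentile(50, k, scale)      # median
mean = scale * math.gamma(1.0 + 1.0 / k)    # exact Weibull mean

print(f"Q_10 = {q10:.3e}, Q_50 = {q50:.3e}, mean = {mean:.3e}")
```

With a small shape parameter the ordering $Q_{10} \ll Q_{50} \ll$ mean is extreme, which is the qualitative sense in which a median can be naturally tiny even though the mean is not.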
Setting the median $\Lambda_{50}$ equal to the observed $\Lambda \sim 10^{-122} M_P^4$, and recalling that the superpotential has mass dimension 3, Eq. where the string scale $M_S$ is around the GUT scale [1], which is close to the Majorana mass $M_M$ (3.16).
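The mass-dimension argument can be checked with simple arithmetic. The sketch below takes the value $|W_0| = 10^{-51} M_P^3$ quoted in the Discussions section and assumes, purely for illustration, that the associated mass scale is $m = |W_0|^{1/3}$. Since the Planck mass convention is not fixed in this section, both the reduced and the unreduced values are used; both numerical values are assumptions of this sketch.

```python
# Planck mass in GeV under two common conventions (assumption: the text
# does not state which one is meant).
PLANCK_MASS_GEV = {
    "reduced": 2.4e18,
    "unreduced": 1.22e19,
}

# |W_0| in units of M_P^3, as quoted in the Discussions section.
W0_IN_PLANCK_UNITS = 1e-51

# A quantity of mass dimension 3 defines a mass scale through its cube root
# (illustrative identification m = |W_0|^(1/3), an assumption of this sketch).
m_in_planck_units = W0_IN_PLANCK_UNITS ** (1.0 / 3.0)

scales_gev = {
    name: m_in_planck_units * m_planck
    for name, m_planck in PLANCK_MASS_GEV.items()
}

for name, m_gev in scales_gev.items():
    print(f"{name} Planck mass: m ~ {m_gev:.0f} GeV")
```

Under either convention the cube root lands within an order of magnitude of $10^2$ GeV, which is the level of precision at which the scale is quoted in the text.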
We now compare the peaking behaviour for different $n_{NP}$. Below we show the peaking behaviour for $n_{NP} = 2, 3$ only, since further increasing $n_{NP}$ causes similar but even smaller changes. Here, $\Lambda_{50}$ is the median. Following [1], we find very simple approximation formulae for $\Lambda_{10}$ and $\Lambda_{50}$ in Table 2.
In Table 3, we present four cases, namely $\Lambda_{10}$ and $\Lambda_{50}$ matching the observed $\Lambda \sim 10^{-122} M_P^4$ for $n_{NP} = 2, 3$. Here, we see that $\Lambda_{50}$ for $n_{NP} = 3$ matches the observed value for a maximum $N_{\max} = 126$ (i.e., $SU(126)$). It turns out that the resulting new scale is still $m \sim 10^2$ GeV. For such an $N_{\max}$, $\Lambda_{50}$ for $n_{NP} = 2$ is still orders of magnitude too big compared to the observed value. We see in Table 2 that $P(\Lambda)$ is more peaked at $\Lambda = 0$ when $n_{NP}$ is smaller, but less suppressed at relatively higher percentiles. As a result, matching the median $\Lambda_{50}$ to observation can be achieved with a smaller $N_{\max}$ by increasing the number of non-perturbative terms in the Racetrack.
Table 3: Estimates for $\Lambda_Y$ for $n_{NP} = 2$ and $n_{NP} = 3$ non-perturbative terms in $W$. Here $\xi \simeq 10^{-3}$.
We have seen that converting the single non-perturbative term in $W_{NP}$ to the two-term Racetrack model makes a big difference in the form of $P(\Lambda)$, which becomes much more sharply peaked at $\Lambda = 0$. Ref. [2] shows that the introduction of additional Kähler moduli does not change the picture qualitatively. The same happens when we introduce more non-perturbative terms in the single Kähler modulus case, though quantitatively, the peaking of $P(\Lambda)$ becomes sharper. These results show the robustness of the qualitative features of the Racetrack Kähler uplift model, while it retains the flexibility to be adjusted quantitatively to fit observations.

Flux basin and attractive basin
We now comment on the flux basin and attractive basin of our model in [1] ($n_{NP} = 2$), to show how the universe can reach the vacuum we desire. As explained there, the existence of a stable dS minimum with exponentially suppressed $\Lambda$ requires an exponentially suppressed $W_0$. In the weak coupling regime, we would expect the moduli $U_i, S \sim O(1)$. Assuming the divergent behavior for $P(W_0)$ (4.8), the discrete values of $W_0$ must be dense enough so that we can obtain a $W_0$ small enough for $\Lambda$ to fit observation. This may be achieved if there is a sufficient number of flux parameters inside $W_0$ (a high dimensional flux basin), or if the spacing of the few flux parameters in $W_0$ is small enough (a dense discretuum in a low dimensional basin).
Let us now take a closer look at the attractive basin. For simplicity, since we are interested in the qualitative features, we will assume that the Kähler modulus $T$ rolls in the potential following the "crude" equations of motion in an expanding universe: where $H$ is the Hubble parameter due to the expansion of the universe, and "dot" means derivative with respect to an evolution parameter such as time. It is clear that the presence of $H$ can damp the rolling of $T$, and we assume a big enough $H$ to prevent over-shooting. We further assume initially $\dot{x} = \dot{y} = 0$ for simplicity, and denote the initial point by $(x_0, y_0)$. To proceed, we take the potential given by (see second column in Table 3) which yields a minimum with $\Lambda \simeq 10^{-122}$ at $(x, y) \simeq (139.5, 0)$, see Fig. 7. Solving the differential equations (4.9), we can find the attractive basin, see Fig. 8,
$$B = \{(x_0, y_0) \ \text{s.t.}\ x_0 \lesssim 142.5,\ |y_0|_{\max} \propto 10^{0.44 x_0}\}. \quad (4.11)$$
As we see, in the $x$ direction, there is also a relative maximum at $x \simeq 142.5$. If $x_0 > 142.5$, $T$ would roll to infinity, leading to decompactification and vanishing cosmological constant. So the de-compactified vacuum has a much bigger basin than that of the meta-stable vacuum.
When $x < 139.5$, the point $y = 0$ is a relative maximum instead of a minimum in the $y$ direction. Therefore, if $|y_0|$ is too large, $T$ would roll away from $y = 0$ and cannot reach the meta-stable minimum. In that case $T$ can roll backward in the $x$ direction, leading to collapsing solutions. On the other hand, if $|y_0|$ is sufficiently small, $T$ rolls to $y = 0$ when it passes the relative minimum in the $x$ direction. Overall, there is an attractive basin in the neighborhood of the meta-stable minimum, as expected.
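The procedure of mapping an attractive basin can be sketched numerically. The example below is not the potential of this paper: it uses a toy potential $V = (x^2 + y^2) e^{-x}$, which along $y = 0$ shares only the qualitative features of a local minimum (at $x = 0$), a barrier (at $x = 2$) and a runaway direction ($x \to \infty$). The damped equations of motion $\ddot{q} + 3H\dot{q} + \partial_q V = 0$, the constant value of $H$, the integration time and the classification thresholds are all assumptions of this sketch.

```python
import math

H = 1.0           # constant Hubble damping (assumption of this sketch)
X_BARRIER = 2.0   # position of the toy barrier along y = 0

def grad_v(x, y):
    """Gradient of the toy potential V = (x^2 + y^2) * exp(-x)."""
    e = math.exp(-x)
    return (2.0 * x - x * x - y * y) * e, 2.0 * y * e

def deriv(state):
    """Right-hand side of q'' + 3 H q' + dV/dq = 0 for q = x, y."""
    x, y, vx, vy = state
    gx, gy = grad_v(x, y)
    return (vx, vy, -3.0 * H * vx - gx, -3.0 * H * vy - gy)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def classify(x0, y0, t_end=30.0, dt=0.01):
    """Start at rest from (x0, y0) and report where the field ends up."""
    state = (x0, y0, 0.0, 0.0)
    for _ in range(int(t_end / dt)):
        state = rk4_step(state, dt)
    x, y = state[0], state[1]
    if x > X_BARRIER:
        return "runaway"       # beyond the barrier: rolls to infinity
    if math.hypot(x, y) < 1e-3:
        return "trapped"       # settled in the local minimum at the origin
    return "undetermined"      # not settled within t_end

# Scan a few hypothetical initial points.
for x0, y0 in [(0.5, 0.0), (1.5, 0.0), (3.0, 0.0), (0.5, 2.5)]:
    print((x0, y0), classify(x0, y0))
```

Scanning `classify` over a grid of initial points would trace out the toy analogue of the basin $B$; the actual basin (4.11) requires the potential and parameters of the second column of Table 3.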

Discussions and Conclusion
We believe the approach presented here, based on the probabilistic nature of string theory, can complement the ongoing discussion between Swampland and Landscape. Assuming that the dS Landscape is realised (with SM-like realisations belonging to it), we can identify some probabilistic features it should obey in order for our observed universe to be considered a statistically natural realisation in it.
Implementing the stringy constraints presented in Section 2, namely (1) absence of free parameters; (2) all fields must couple together, directly or indirectly; and (3) a "dense discretuum" of flux values with smooth distributions that we scan over, we have proposed that the probability distribution $P(Q)$ of any flux-dependent quantity $Q$ is smooth, and that it is typically uniform or "peaked at zero", i.e., $Q_{50} \ll \bar{Q}$, while it cannot exhibit a monotonically increasing behaviour with $Q$ (eventually peaking at some $Q_{\max}$).
Figure 9: Good and not so good distributions $P(Q)$: types A and B are the most typical, C is allowed by some particular flux distribution or functional form that introduces a scale, while type D is not allowed by our proposal.
Fig. 9 illustrates a few possibilities. In A, the probability distribution $P(Q)$ is flat, so there is no scale, or preference for any particular value. As we have shown in a number of cases (e.g., $P(\Lambda)$ (4.5) and $P(m_f)$ (3.4)), $P(Q)$ peaks (diverges) at $Q = 0$, at times very sharply, as illustrated in B. That the peak always happens at $Q = 0$ follows from a simple property of probability theory, together with the fact that string theory has no free parameter, so there is no preference for any specific scale. The degree of divergence of $P(Q)$ at $Q = 0$ is independent of any scale. Once we fix the dynamics (e.g., the value of $\Lambda$ and $m$), scales can come into $P(Q)$, but the degree of divergence will in general not change.
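One standard illustration of this kind of probability-theory property is the product of independent random variables: if $x$ and $y$ are uniform on $[0, 1]$, the product $z = xy$ has density $P(z) = -\ln z$, which diverges at $z = 0$ although each factor is flat. The Monte Carlo sketch below is illustrative only; the uniform factors, the number of factors and the sample size are assumptions and do not represent the flux distributions of any specific model.

```python
import random
import statistics

def sample_products(n_factors, n_samples, seed=0):
    """Draw products of n_factors independent uniform [0, 1) variables."""
    rng = random.Random(seed)
    products = []
    for _ in range(n_samples):
        z = 1.0
        for _ in range(n_factors):
            z *= rng.random()
        products.append(z)
    return products

# Ratio median/mean for products of 1, 2 and 4 flat factors.
ratios = {}
for n in (1, 2, 4):
    zs = sample_products(n, 100_000)
    median = statistics.median(zs)
    mean = sum(zs) / len(zs)
    ratios[n] = median / mean
    print(f"n = {n}: median = {median:.4f}, mean = {mean:.4f}")
```

As more flat factors are multiplied, the median falls further below the mean, which is the "peaked at zero" behaviour of Type B.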
A few comments are in order:
• Consider the Planck mass $M_P$. Given the string scale $M_S$, we should in principle obtain the probability distribution $P(M_P)$, given a specific Kähler uplift model. However, in the low energy effective theory approximation, we trust our analysis only when $M_P \gg M_S$, so $P(M_P)$ is not a particularly meaningful distribution to study in the present context. On the other hand, $P(1/M_P)$ may peak at zero, i.e., $M_P \to \infty$, which implies decompactification. This is in agreement with the above discussion, that the attractive basin for decompactification is large compared to that for the meta-stable vacuum. Fortunately, tunneling from the meta-stable minimum to the decompactified solution takes much longer than the age of our universe [3].
• A probability distribution of Type C can arise once scales are introduced dynamically, such as the value of Λ and m. It can also appear before that for a particular flux distribution. Here, the position of the peak is a new scale. It can appear as a function of the range of the discrete flux values we allow, e.g., the actual finite range, or the position of the peak and the width of a Gaussian distribution. The low energy effective theory is valid only for limited ranges of flux values. Going beyond will invalidate the particular low energy effective theory for a specific patch of the Landscape. Once a new scale appears in some flux distributions, distribution P (Q) of Type C can also emerge. It is important that the measured fermion mass distributions agree with Type B but not Type C.
• For distribution $P(Q)$ of Type B, replacing a flat (A) flux distribution by a smooth (C) one does not affect the degree of divergence of $P(Q)$ at $Q = 0$, as long as the smooth distribution is non-vanishing at zero and is nowhere divergent. The replacement of a flat flux distribution by a smooth distribution typically only affects the tail of $P(Q)$, which changes the mean value $\bar{Q}$ but has limited impact on the median $Q_{50}$.
Ref. [28] argues that any solution must have $W_0 = 0$ as the value of the superpotential. In the Racetrack Kähler uplift model (4.3), we find that this immediately implies $\Lambda = 0$, i.e., 10-dimensional Minkowski spacetime as the solution. Indeed, this decompactification is the most natural solution, as a big enough value of $t$ (which measures the compactified volume) will simply run away, $t \to \infty$. As we have seen, the meta-stable minimum we find has a very small attractive basin compared to that for the de-compactified solution. So, intuitively, we may expect that it is highly unlikely to end in the meta-stable vacuum we are in today. However, if one believes in the inflationary universe scenario, our universe starts with a large vacuum energy density and rolls down. A very sharply peaked $P(\Lambda)$ implies that there is an exponentially large number of meta-stable solutions (with exponentially small $\Lambda > 0$) for it to roll into, so ending in one of them is not as surprising as one's naive expectation suggests, since once it is trapped, it does not have a chance to explore the many more possibilities for it to de-compactify.
In the more realistic situation, we find that $|W_0| = 10^{-51} M_P^3$, which is a tiny shift away from zero, yielding an exponentially small positive $\Lambda$ that matches observation. Refs. [29,30] find a $W_0(U_i, S)$ based on an analysis of orientifolds. In general, its value $W_0$ at the locally stable minimum solution can be very small, but non-vanishing. As explained in Section 4, we can also consider multi-Kähler moduli and a superpotential $W$ with only non-perturbative terms, i.e., $W = \sum_j A_j e^{-a_j T_j}$ only. Now let us integrate out the heavy ones to reach the single Kähler modulus case (4.1), so the non-perturbative terms of the heavy Kähler moduli inside $W$ are converted to terms that are functions of $U_i$, $S$ and flux parameters, yielding a non-zero, perturbative-looking $W_0(U_i, S, F_\alpha)$ inside $W$ (4.1) (ignoring weak dependence on the light Kähler modulus). In short, a $W_0(U_i, S, F_\alpha)$ term inside $W$ is generic, and such a term is generically non-zero when we sit at a minimum.
We demonstrate that the application of string theory to particle physics phenomenology is possible even though we have no idea where the SM sits in the Landscape. We believe this novel statistical combination of top-down and bottom-up approaches provides a new way to do phenomenology. As our understanding of the Landscape improves, more phenomenology can be carried out, providing guidance in the search of the SM in the Landscape. If our proposal is correct, it could easily explain why we observe a small cosmological constant, small couplings and the distribution of quark and lepton masses (as we have seen): we live in one of the statistically