Vacuum Selection on Axionic Landscapes

We compute the distribution of minima that are reached dynamically on multi-field axionic landscapes, both numerically and analytically. Such landscapes are well suited for inflationary model building due to the presence of shift symmetries and possible alignment effects (the KNP mechanism). The resulting distribution of dynamically reached minima differs considerably from the naive expectation based on counting all vacua. These differences are more pronounced in the presence of many fields due to dynamical selection effects: while low lying minima are preferred as fields roll down the potential, trajectories are also more likely to get trapped by one of the many nearby minima. We show that common analytic arguments based on random matrix theory in the large $D$-limit to estimate the distribution of minima are insufficient for quantitative arguments pertaining to the dynamically reached ones. This discrepancy is not restricted to axionic potentials. We provide an empirical expression for the expectation value of such dynamically reached minimas' height and argue that the cosmological constant problem is not alleviated in the absence of anthropic arguments. We further comment on the likelihood of inflation on axionic landscapes in the large D-limit.


Introduction
The use of axions for inflationary model building has a long history, ranging from initial attempts to use the QCD axion and natural inflation [1] to multi-axion potentials in string theory as in N-flation [2] or monodromy inflation [3] among many other proposals, see [4,5] for recent reviews. Axions are popular candidates to drive inflation due to their (broken) shift symmetry, which protects the inflationary potential from corrections that would spoil inflation otherwise. It was soon realized that the QCD axion, as well as single field models in string theory, can not support large field inflation while remaining consistent with theoretical constraints and observations, see e.g. the current lower bound in [6] on the axion decay constant f , based on the Planck satellite mission that measured temperature fluctuations in the cosmic background radiation. This requirement is inconsistent with constraints in string -1 -JCAP04(2016)025 theory [7,8], see also [9,10] and [4], so that attention shifted to multi-field models. Here, alignment effects, such as the Kim-Nilles-Peloso mechanism [11], see also [15][16][17], enable the construction of large-field slow-roll inflationary models [18,19] that are consistent with observations [6] and appear to be under computational control [18][19][20]. 1 While inflation in such multi-field scenarios is certainly possible, it is not a priory clear how likely sufficiently long bursts of inflation are, or at which height the final vacuum after inflation sits. In the presence of many fields, whose masses and couplings constitute free parameters, we are essentially dealing with random potentials, see [30][31][32][33][34] for a small selection of relevant work. As a consequence, the first question is often addressed by sampling an ensemble of potentials and/or trajectories [31,33] while invoking anthropic arguments [35] to focus only on those trajectories that support at least sixty e-folds of inflation. Distributions for observables can be computed in a similar manner, see [36][37][38][39][40][41][42][43] for selected works (discrepancies between [38] and [39] are explained in [39]). One may also attempt to treat the functional form of the potential as random, see e.g. [44].
In this article, we are interested in the final resting place, that is the resulting vacuum after dynamical evolution on an axionic potential, and only comment briefly on the feasibility of inflation. A common approach is to count all vacua in a given potential, see [31,[45][46][47][48] for a few examples and [50] for efficient methods. Based on the resulting histogram, one attempts to draw conclusions about the likelihood of finding a vacuum at a particular height. If the distribution peaks at a height comparable to the observed value of the cosmological constant, one may not have to rely entirely on anthropic arguments [51,52], see [53,54] for recent reviews. 2 We show that simple counting arguments are misleading because they do not take into account the dynamical evolution of fields. As observed in [34] in a particularly simple toy model, low lying minima are usually more likely to be reached dynamically, particularly as the dimension of field space is increased. In essence, what would be a minimum in a low dimensional potential is more often than not a saddle point in the higher dimensional case, 3 so that fields keep on rolling further and further down. This effect can be quantified numerically in concrete landscapes and understood analytically, at least qualitatively, based on random matrix theory (see i.e. [58,59] for a textbook introduction to RMT) in the limit of many fields due to the feature of universality [60][61][62][63][64]. 4 However, a competing effect is the attraction to minima close to the trajectory: as the dimensionality is increased, more and more minima are close by, which can cause the trajectory to derail and get trapped early on. The interplay of these competing effect renders an analytic description challenging.
In this work, we consider a relatively simple axionic multi-field potential, section 2, similar to the one considered in [47,49]. We compute the distributions of dynamically reached 1 See however [9,[21][22][23][24][25][26][27][28][29] for constraints on axionic inflation based on the weak gravity conjecture. 2 Note that even if the height were in the range of the observed vacuum energy, the cosmological constant (CC) problem would not be solved, since subsequent phase transition lead to additional large contributions of the CC. Nevertheless, achieving a small bare contribution to the CC after inflation appears to be a step in the right direction. An additional complication is the measure problem, see [55][56][57] for a summary of proposed measures and reviews. 3 If one holds one field fixed and finds a minimum for the other ones, that point is usually not a minimum if the fixed field is taken into consideration again. Naively, one might think the chance for a critical point to be a minimum to be 2 −D , but for many random potentials, this chance is actually suppressed by a factor of e −constD p with p = 2 as long as the Hessian can be approximated by a random Gaussian matrix (a known results in random matrix theory), see e.g. [31]. 4 It should be noted that adding more structure to the scalar sector can render random matrix theory inapplicable, see e.g. the Type IIB flux compacitifcations examined in [48].

JCAP04(2016)025
vacua numerically in section 2.2. To explain these distributions analytically, particularly the preference of low lying minima, we examine the statistical properties of the potential: we compute the probability distribution function (PDF) of the potential and its first derivative at a random point in section 3.1 and 3.2.1. In conjunction with an empirical ansatz (tested numerically) of the average potential difference to the next critical point ∆V , we estimate the number of critical points n c that are encountered if a given potential difference is traversed, see section 3.2.2. We follow with the PDF of the Hessian in section 3.3, using results of random matrix theory. We pay particular attention to the many field limit, D 1. Some technical aspects of the computation are relegated to the appendix.
Given n c and the PDF of the Hessian, we compute the expected distribution of minima in section 4.1. While the analytic result recovers some aspects properly, such as the presence of a peak and the shift of the peak to lower values as D is increased, it is insufficient for a quantitative analysis. We identify the most likely causes for the observed discrepancies: the omission of dynamical effects as well as constraints and correlations between elements of the Hessian.
Based on numerical studies, we are able to provide a simple empirical fitting function for the ratio of the average potential difference between critical points to the mean potential difference to the nearest minimum, R(D) = dV /∆V , see section 4.2. In conjunction with our estimate for ∆V , we are able to compute the height of the most likely resting place (4.19), yielding where n D is the number of sources in the axionic potential. This result is in quantitative agreement with our numerical studies, but lacks a thorough theoretical justification, particularly the origin of the logarithmic dependence on D.
We discuss implications for the cosmological constant problem in section 4.3. We conclude that it is crucial to incorporate dynamical selection effects consistently if one is interested in the distribution of vacua that may be reached after inflation. Concretely, counting all vacua or relying on random matrix theory alone is insufficient for quantitative arguments.
We follow with a brief discussion of inflation on such landscapes in section 4.4: as the dimensionality of field space is increased, inflation becomes more likely. Due to additional numerical challenges in the large D limit, we postpone a detailed analysis to a forthcoming publication. We conclude in section 5.

Axionic landscapes
We consider multi-field axionic potentials of the form [47] where N axion is the number of axions, N source is the number of shift symmetry breaking sources from non-perturbative effects, Λ i sets the strength of each source term, φ j denote the axions with decay constants f j , n ij is a mixing matrix, θ i the relative phases between different source terms and C an overall additive constant affecting the value of the cosmological constant today.

JCAP04(2016)025
Our notation is consistent with the one in [47], in which the distribution of vacua was investigated via direct counting, taking into account constraints that enable the Kim-Nilles-Peloso (KNP) mechanism [11], see also [12][13][14]. We pursue a similar goal, but would like to take into account dynamical effects: once a field is let to evolve on a given landscape, it rolls towards low lying minima, leading to a selection effect that is expected to be stronger the higher the dimensionality of field space is, see [34,43]. On the other hand, trajectories are more likely to get trapped early on as N axion increases, since minima close to the trajectory act as attractors. Throughout this article we set

Model parameters
In line with [47], we do not consider the most general potential, but impose several constraints on model parameters. Firstly, we set so that the lowest minimum is at V = 0. Since it is unlikely that the axions settle in this minimum at the end of inflation, a non-zero contribution to the cosmological constant is generically present. To ease notation, we define and consider n D to guarantee a non-trivial landscape. If not stated otherwise, we use n = 125 and vary D.
Without loss of generality, we reabsorb the decay constants f j into the real valued mixing matrix n ij , that is we formally set f j = 1; in [47], the mixing matrix was integer valued and the f j were kept explicit. If not stated otherwise, we draw the elements of n ij from a normal distribution with zero mean and standard deviation σ n = 2 , (2.6) truncated to the interval [−3, 3] and properly rescaled. 5 The overall scale of the potential is set by the Λ i , which we draw from a flat distribution over the interval [0,1]. We further renormalize the potential such that For the relative phases, we choose a flat distribution over the interval [0, 2π). In [47], Λ i ≡ Λ = 1 and f i = f were chosen, making their potentials less general, albeit still comparable to ours. To our knowledge, there is no theoretical reason to demand Λ i = Λ j for i = j, except to keep the potential simple. Since the randomness of the potential is reduced by setting Λ i = Λ j and histograms are affected by this simplification, see figure 1, we keep the Λ i as random variables. Additional differences of our approach in comparison to the one in [47] are the re-normalization of the potential according to (2.7) and the use of a dynamical algorithm to search for minima, see section 2.  is provided in figure 1: using a distribution for the Λ i shifts the histograms to lover values, whereas the dynamical search for minima makes higher lying ones more likely to be found. We focus on the quantification of the latter effect in this publication. It should be noted that histogram (d) in figure 1 shows large variability: some histograms show a peak while others don't. Furthermore, a Gaussian does not describe such histograms well. Both effects are due to the low value of n used here. As a consequence, we use much larger values of n in the remainder of this study.
Potentials constructed according to the above prescription are usually to steep to allow for slow roll inflation. As argued in [47], one can increase the likelihood of inflation if the mixing matrix is chosen such that it is conductive for the KNP mechanism to be operational. To this end, define the matrix where α, β = 1, . . . , D. Let R be the ratio of the smallest eigenvalue of M to the second smallest. As shown in [47], demanding R 1 increases the likelihood for a flat direction to be present in the landscape due to the KNP mechanism. We demand R < 0.01 (2.9) in our numerical studies if not stated otherwise.

Numerical computation: the distribution of V min for varying dimension
We construct an ensemble of axionic potentials according to (2.1) and model parameters in section 2.1, including the condition on R in (2.9). We would like to compute the height of those minima that are reached dynamically and investigate how the peak of this distribution changes as the dimensionality of field space is increased.
To find minima, we first construct a potential. In this potential, we choose 5000 random initial field values according to a flat distribution over the interval [−10 5 , 10 5 ] and let the axions evolve from rest,φ i = 0, by solving their equations of motion in conjunction with the Friedmann equation in a flat Universe where H =ȧ/a is the Hubble parameter and a dot denotes a derivative with respect to cosmic time. We count all trajectories, regardless of whether they include an inflationary phase or not. To decide numerically if a minimum is reached, we demanḋ for i = 1, . . . , D. While shallow regions and saddle point can be mistaken for minima, we checked that the resulting distributions of are not dominated by such misidentification. We expect the peak of the resulting distribution to shift closer to zero as D is increased, as long as D n, in line with the study of truncated Fourier series potentials in [34]. This is indeed qualitatively the case, see figure 2. The resulting histograms are robust with respect to changes in the distribution of the Λ 4 i (we checked a flat and a truncated Gaussian distribution) and the omission of the phases θ i . This independence of model parameters is expected due to the central limit theorem, as long as the number of independent random variables in the potential is large.
Keeping D fixed, one may wonder how strongly the histograms vary from one realization to the next. To this end, we construct 30 potentials for D = 3, . . . , 7 and identify 500 dynamically reached minima for each potential. Noting the peak value for each histogram, we can compute the meanV min and variance, see figure 3 and table 1: while some statistical scatter is clearly visible, the resulting variability is smaller than the observed trend. The logarithm ofV min can be fitted well by a linear function ln(V min ) = const + cD , (2.14)    Table 1. The peak in the histrogram of 500 minima is averaged over 30 realisations of the potential defined in (2.1) for D = 3, . . . , 7 leading to the mean valuesV min and lnV min . See figure 3 for the accompanying variance.
for the narrow range of D that are numerically accessible, but due to the limited reach in D, many other functions could be used as fitting functions as well. This trend in (2.14) is in line with the theoretical expectation based on [34]. However, the analytic results in [34] can not be applied directly: the potential in (2.1) is lifted compared to the one in [34], so that V = 0 is the absolute minimum, and subsequently rescaled, such that 0 ≤ V ≤ 2. Furthermore, the coefficients are drawn from different distributions. The former could be accommodated by a rescaling of the results, but the latter may still have some impact, particularly for low D. Since the analytic result in [34] required a series of harsh approximations that were only tested for low D, we postpone a detailed discussion of the above results and a comparison with [34] to section 4.

Properties of random axionic potentials
Given the random axionic potential in (2.1), we would like to explain analytically the shape of the histograms in figure 2 as well as the the dependence ofV min (D) in (2.14). To this end, we aim to estimate first the number of critical points n c that are encountered on average as fields traverse a distance δV down the potential. In combination with the statistical properties of the Hessian, one should be able to compute the ensemble average of the peaks' position in the histograms in figure 2, similar to the approach taken in [34]. Furthermore, in the large D limit, one should be able to employ random matrix theory to attain analytic results.
To make the problem more tractable analytically, we make the potential less random: in (2.1) the mixing parameters n ij were random. Here, we set them deterministically to where J j runs from 1 toñ andñ ≡ n 1/D .
Evidently, this definition makes sense only ifñ is an integer, that is only for certain combinations of n and D, say n = 125 and D = 5 so thatñ = 3. As we shall see, the final results are independent ofñ so that this restriction can be lifted in retrospect. 6 By construction, elements in the mixing matrix cover the interval (0, . . . , 3] evenly instead of randomly and they are independent of the field index i = 1, . . . , D. We checked numerically that restricting the range to (0, . . . , 3] from [−3, . . . , 3] has no effect on the resulting histograms, in line with the robustness of the results with respect to the omission of the phase factors θ i . However, it should be noted that the above simplification renders all entries of the mass matrix in (2.8) identical so that the KNP mechanism is not operational. 6 We replace sums by integrals at some point, so that non-integerñ do not pose a problem any more.

JCAP04(2016)025
Next, we alter the notation for the n =ñ D independent random variables Λ 4 i to Λ J 1 ,...,J D . To enable the analytic computation of subsequent integrals, we further switch the flat distribution of the Λ i to a normal distribution for the Λ J 1 ,...,J D with mean and standard deviation We chose µ Λ ≡ 1/n, so that our potential remains nearly normalized for different n and D, that isñ a should be a small number so that negative values of Λ J 1 ,...,J D are rare. We checked that different choices for a don't change our results qualitatively as long as a 1. Given these simplifications and definitions, the potential in (2.1) reads This potential is quite similar to the one investigated in [34], so that similar techniques to attain analytic results can be employed as well. While our simplifications reduce the randomness and alter the distribution of random variables, we expect our subsequent results pertaining to the distribution of dynamically reached minima to remain valid as long as the number of random parameters entering the potential remains large, that is in the limit n D 1, essentially due to the central limit theorem. 7 To reiterate, our goal is to estimate the number of critical points as fields roll down the potential, see section 3.2, and compute the likelihood that a critical point is a minimum, see section 4.1, so that we can compute the likely height of the final resting place.

The PDF of V
The potential in (3.7) is the sum of products of random variables, the Λ J 1 ,...,J D and Since the distribution of the latter is much broader than the former and reasonably flat, we may approximate the PDF of the product by the distribution of the Λ J 1 ,...,J D . As a consequence, the potential at a random point is to a good approximation a normally distributed JCAP04(2016)025 random variable with mean and variance given by The numerical pre-factor in the variance is caused by the redefinition of the potential in (3.7) as well as replacing thex J 1 ,...,J D by their expectation values, but the scaling σ V ∝ 1/ √ n is robust and can be understood as follows: without renormalizing the potential, each term in the sum adds a random step, with even chance to be above or below one. The expectation value of the square-distance to the mean of this random walk scales with √ n. Since we rescaled the potential by a factor of 1/n, the square-distance scales as 1/ √ n, which in turn is proportional to σ V . This feature has consequences for the feasibility of inflation on such potentials, see section 4.4.

Estimation of the number of critical points n c
In this section, we estimate the number of critical points n c that are encountered if a potential difference of δV is traversed.

PDF of the potential's gradient
To estimate n c , we need the probability density function (PDF) of the partial derivative to compute the PDF of the gradient's absolute magnitude, (3.12) To ease notation, we define so that the partial derivative of the potential reads Since the argument of the sin-function at a given point is a random variable, x J 1 ,...,J D is a random variable as well, which is symmetric around its mean 1. Evidently, V k is a random variable with zero mean. We approximate the distribution of x J 1 ,...,J D by a flat one over the interval [0, 2]. Furthermore, defining the derivative of the potential becomes Here, N p stands for the number of permutations of the and we introduced the shorthand notation We proceed by calculating the variance for each of the four terms in (3.17) separately, keeping in mind that the mean of V k is zero.
The 1'st and 2'nd Term. The PDF of the product where x J 1 ,...,J D has a flat distribution and b J 1 ,...,J D a Gaussian one, is given by where c is a normalization constant and Ei the exponential Integral. We approximate this PDF by a normal distribution with zero mean and the same variance as b J 1 ,...,J D , i.e. σ Λ . This approximation is equivalent to replacing x J 1 ,...,J D by its mean value of 1 in (3.20), that is, we ignore the sin-function. In this approximation, the variance of the first term in (3.17) becomes where we kept the leading order term inñ only. Similarly, the variance of the second term becomes since N pñ =ñ D = n .

JCAP04(2016)025
The 3'rd and 4'th Term. Consider b ≡ñ where y i are independent and identically uniformly distributed, U(0, 1), random variables. According to appendix A, equation (A.8), the variance of this sum is with C ≈ 0.11. As a consequence, the variance of the third term in (3.17) is where we kept the leading order term in the last step. The variance of the forth term is zero, since it does not contain any random variable.
Since V k is the sum of three approximately Gaussian random variables and a constant, its variance becomes to leading order inñ; in the last step, we neglected a 2 = 0.01 1 and used 9C ≈ 1. Note that the constant proportionalality factor of order one in (3.29) can be reabsorbed by the free parameter β below, so that it is not crucial for us to keep any sub-leading terms. Hence, we approximate the PDF of V k by Consequently, the absolute magnitude of the potential's gradient in (3.12) obeys a χ-distribution, In the large n, D-limit, the expectation value of V can be approximated by where we used the identity Γ(x/2)Γ((x + 1)/2) = √ πΓ(x)/2 x−1 as well as the large argument limit Γ(x) ≈ √ 2πx x−1/2 exp(−x) for x 1.  We would like to estimate the potential difference ∆V between a random initial point with slope V defined in (3.12) and the nearest critical point with V critical = 0. Since our potential is relatively smooth, we expect this difference to be bigger, the larger the initial slope is. Thus, we make the Ansatz where β is a constant and we took out the expected scaling with the dimensionality of field space. Given the PDF of V in (3.31), we can compute the expectation value of ∆V to Evaluating the Gaussian integral yields where σ 2 = 1/n according to (3.29). In the large D limit, this potential difference becomes where we used the large argument limit of the Gamma-function. We tested the Ansatz in (3.34) numerically for varying D andñ and found that β is indeed approximately a constant of order one, β ≈ 1.9 ,

The Hessian
We are interested to compute the ratio of minima to critical points, particularly how this ratio changes with the dimensionality of field space. Thus, we need to compute the PDF of the Hessian. Consider a random critical point φ c at a height V (φ c ) ≡ V c . In this section a subscript c is not a free index, but denotes a quantity evaluated at a critical point. From the definition of the Hessian and the potential in (2.1) we get with (3.44) We extracted the term proportional to V c to highlight the deterministic dependence of the Hessian, and thus the likelihood of a minimum, on the height of the critical point V c . This dependence is crucial to retain, as first pointed out in [34]. The random contributionH kl has a mean of where we used the approximate normalization of the potential in (3.6). Readers not interested in the derivation of the variance may jump to section 3.3.3 for the result.
To compute the standard deviation ofH kl , we rewriteH kl as where we defined the random variables 8 The b J 1 ,...,J D are normal distributed with zero mean and variance σ Λ and, similar to section 3.2.1, we approximate the PDF of x J 1 ,...,J D by a uniform distribution on the interval [0, 2]. To keep the notation clean, we suppress the multi-index J 1 , . . . , J D on b and x in the following. Furthermore, it is useful to compute the standard deviation of the diagonal and off-diagonal elements ofH kl separately.

Diagonal elements ofH kl
For k = l, we havẽ where we used the shorthand notation defined in (3.19). The first term in (3.49) is a sum of Gaussians with identical variance and constant coefficients, so that it is also a Gaussian with variance where we used σ Λ = a/n, N p =ñ D−1 ,ñ D = n and kept only the leading order term inñ in the last step. The second term in (3.49) contains the product of a Gaussian random variable b and an approximately flat distributed one, x; as a consequence, the PDF of the first term is given by an exponential integral function where c is a normalization constant, which we further approximate by a Gaussian with zero mean and the same standard deviation as b, which is equivalent to replacing x by its mean. In this approximation, the variance of the second term for large n and D is the same as the one of the first term, The third term in (3.49) has the largest variance and is thus the most important one. To compute the variance, we ignore the −1 in (J 2 k − 1) right from the start, since the corresponding contribution to the sum is negligible in the large D,ñ-limit. The remaining term is proportional to whose variance is according to equation (A.11) in appendix A.2. As a consequence, the third term has approximately a normal distribution with variance where C ≈ 0.11 from appendix A.2 eq. (A.5) and µ Λ = 1/ñ D = 1/n. The main variability in the Hessian stems from the trigonometric functions and their random arguments as opposed to the Gaussian pre-factors Λ J 1 ,...,J k . Thus, we could have worked with constant pre-factors Λ J 1 ,...,J k = µ Λ from the start, without loosing any crucial information.

Off-diagonal
The derivation of the variance for off-diagonal elements of the Hessian, k = l, is analogous to the one for the diagonal ones in the preceding section; the only difference is the appearance of two sums, over J k and J l , instead of a single sum over J k , which changes some numerical factors. For the sake of brevity, we omit the detailed derivation, and note that the variance of the three terms appearing iñ (3.57) becomeσ where we used (A.19) for the computation ofσ 3 . The sum overÑ p in (3.57) is shorthand for summing over J 1 , . . . , J D without J k , J l .

Summary
The Hessian at a critical point of the axionic potential in (2.1) is comprised of a deterministic part, set by the height of the potential V c at said point, and a random, approximately Gaussian componentH kl , for k, l = 1, . . . , D. H kl is therefore, to a good approximation, a Gaussian random variable with mean µ kl ≈ 9 and variance see (3.56) and (3.59). We used a 1 above to argue that the variance is set be σ 3 andσ 3 . For the original potential in (2.1), Λ J 1 ,...J D is a flat distributed random variable on [0, 1]. We recomputed the variance for σ 2 and find based on results from appendix A. Thus, (3.62) should provide a decent approximation for the variance of the Hessian's elements for the potential (2.1) as well.
The histogram in figure 4 shows an exemplary comparison of numerical results for the distribution of entries of Hessians with our analytic approximation: we compute the Hessian  of the potential at random points and add 9V c /ñ 2 to it, where V c is the height of the potential at this point. We choseñ = 10 and D = 3. The solid line is our analytic approximation, not a fit. We see that the diagonal entries and off-diagonal entries of the Hessian are to a good approximation normally distributed with mean µ dia = (1 − V c )9/ñ 2 and variances σ dia and σ offdia respectively, see (3.62).

Dynamics on multi-field axionic potentials
In this section, we wish to investigate questions pertaining to the dynamical evolution of axions, particularly where they end up and whether or not inflation is likely to take place.

The distribution of minima based on random matrix theory
Given the Hessian derived in section 3.3, we can compute the probability that a given critical point φ c at height V c is a minimum by means of random matrix theory in the large D-limit, see appendix B equation (B.6). Given the mean and variance of the Hessian's elements, this probability becomes with σ offdia in (3.62). Note that P min depends onñ and D separately and not on n =ñ D . However, sinceñ is no summation index any more, one can use non-integer values ofñ according to any desired choice of n and D. Evidently, the probability for a random critical point to be minimum becomes exceedingly suppressed in the large D limit. Furthermore, for largeñ and D, that is σ kl 1, P min approaches the universal value, see (B.7),

JCAP04(2016)025
Let's consider the large D limit and ask, what is the probability that no minimum is found when starting from V ini > 1 until some 0 < V < 1? We first recall that the average potential difference between neighbouring critical points is according to (3.37), with β ≈ 1.9. Therefore, on the descent from V ini down to V , one encounters on average n c = (1 − V )/∆V ≈ (1 − V ) √ βn critical points that have a non-zero probability of being a minimum. Thus, the probability of not finding a minimum is where we kept the leading order term in P min 1 in the last step. (4.4) is a monotonic function, which explains correctly that minima are less likely to be encountered as D is increased. However, its derivative with respect to V , which should yield a distribution similar to those depicted in figure 3, does not have a peak and is therefore insufficient to explain the histograms. One cause of this discrepancy is the omission of dynamics: if fields evolve on the potential, or one simply follows the gradient, minima act as attractors to trajectories in their vicinity. Thus, it it is not surprising that more minima are found in our numerical studies than the naive application of random matrix theory and counting of critical points would have one believe. Another shortcoming becomes evident in the limit D → ∞, while keepingñ = const, yielding lim D,n→∞ P nomin (0) = 1 . (4.6) In this limit, runs that encounter a minimum appear to be of zero measure, even though the potential is sharply bounded from below at V = 0 so that every trajectory must find a minimum. Again, this discrepancy is easily understood: nowhere in the derivation of P min or ∆V did we impose the lower bound of the potential. This discrepancy should become important once the lower boundary is approached, that is once V ∼ ∆V .
In the above argument, we assumed that we start each run with V ini > 1. Of course, one can marginalize over the initial values, as we did in the computation of the histograms in figure 3, which leads to the same qualitative results: under the assumption of a flat prior for V ini over the range [0, 2], we can integrate over V ini to obtain the probability that no minimum is reached up to a final value of V , (4.7) P nomin is a again a monotonic decreasing function with P nomin (0) = 0 that approaches the universal value P nomin (0) ≈ 1 in the large D-limit, showing the same shortcomings as (4.4).
For small D andñ, (4.2), and thus (4.7), is not a good approximation. One can use (4.1) to get an expression for P nomin (V ) that can be evaluated numerically, However, this expression fails as well to explain the observed histograms quantitatively, see figure 5 for an exemplary comparison for D = 3 and n = 125: taking a derivative with respect to V yields indeed the expected peak, but its position and width show large discrepancies in comparison to the actual histograms. As noted in [65], a further conceptual shortcoming of applying random matrix theory is that the diagonal elements of the Hessian are not independent from each other or the off-diagonal elements, and vice versa. This dependence becomes unimportant in the large D limit, essentially due to the central limit theorem, but it is present for low D, as we were able to observe in numerical simulations: the Hessians at random points in our axionic landscape is indeed distinguishable from random matrixes in the GOE for low D andñ.
To conclude, the probability that a critical point is a minimum as derived in (4.1), which is at the heart of the above expressions, does not yield a quantitatively satisfactory explanation of the observed histrograms in figure 2 for any value of D. We attribute this discrepancy to three primary reasons, • the omission of dynamical effects (the primary reason), • the omission of constraints, such as the potential's lower bound (important for V ∼ ∆V ), • the omission of correlations between the Hessian's elements (relevant for low D andñ).
Evidently, P min in (4.1) does not coincide with the empirical result of dividing the average potential difference between encountered critical points ∆V by the mean potential difference to the next minimum, dV , To make quantitative progress, let us define the function for which we are able to provide an empirical analytic approximation in the next section.

Empirical distribution of minima
To model R(D) defined in (4.10) we make the Ansatz Comparison to numerical results for the potential in (2.1) yield the parameters (1σ errors) a ≈ 16.38 ± 0.14 , = (a ln(15 for the potential difference to the next minimum, where we used the large D limit in (3.38) in the last step. Thus, the expected height of the final resting place becomes One could obtain a distribution for the final resting place by integrating (4.17) over all possible V ini . Based on appendix A, we know that V ini ∈ N (1, a/ √ n), so that the resulting variance is strongly suppressed in the large n-limit. Thus, taking V ini = 1 should provide a good approximation for large n, D. This is in line with our prior theoretical result in (4.2) that P min = 0 if a critical point is at a height V > 1 in the same limit. However, for low D, we observe that some minima are reached at V > 1, see figure 2. Evidently, there is a some    dependence of the effective V ini on D. Treating V ini as a free parameter that is fitted to the data in figure 7a for low D yields V ini = 1.037 ± 0.0023 , (4.18) close to our theoretical expectation. Given this value of V ini , we compare (4.17) with the position of the histograms' peaks (see figure 3) in figure 7 and figure 8. We observe good agreement of (4.17) with the numerical results. We would like to highlight that while the five data-points could be fitted by other functions as well, as we did in (2.14), there is no such freedom left once the results depicted in figure 6 are taken into account, which have a significantly larger reach in D.
The expression for V min (D, n) in (4.15) can in principle become negative in the large D-limit for the numerical values for a and b, but it remains positive as long as n D is properly enforced. To be concrete, solving V min = 0 for D yields D = exp(( √ nβ − b)/a). Setting this expression equal to n and solving for n gives n ≈ 5075, that is, above this value one would need D > n to get V min = 0. The largest ratio of n/D for which V min = 0 is JCAP04(2016)025 possible is n/D ≈ 6, which occurs at n ≈ 565. Thus, as long as D < n/6, V min remains positive for all combinations of n and D, consistent with the definition of the positive semidefinite potential in (2.1). As such, our result is not applicable to the cases studies in [47] due to the low values of n employed there, see for instance the n = 13 and D = 8 case that we reproduced in figure 1.

Discussion: implication for vacuum selection and the cosmological constant problem
In line with the results in [34], we found, by direct application of random matrix theory in the large D limit, that almost all critical points are indeed saddle points: the probability for a minimum scales as ∝ exp(−D 2 ln(3)/4), see eq. (4.2). However, this probability can not be used directly to identify the likely resting place after dynamical evolution on the landscape, since minima act as attractors to neighbouring trajectories, see the discussion at the end of section 4.1. Thus, directly searching for all minima does not necessary shed light on our vacuum. Based on an empirical Ansatz for the average potential difference to the nearest minimum in (4.11), we were able to predict the most likely height of the dynamically reached minimum in (4.17), which scales as This logarithmic dependence on D is in stark contrast to the exponential dependence one might naively expect based on (4.2). The latter was used in [34] to argue that exceedingly low lying minima would be reached dynamically as the dimensionality of field space is increased. There, only limited numerical tests for low D were available, which were consistent with such an assertion. However, as we have seen here, the preference of low lying minima is considerably weaker than anticipated. An important consequence pertains to the cosmological constant problem: even with fine tuning of n and D, the height of the most likely final resting place can usually not be brought anywhere near the observed value of the cosmological constant. Hence, even though for large n and D an axionic landscape permits a dense distribution of vacua such that some ought to be at the correct height close to the observed value of the cosmological constant [66], it is extremely unlikely that any of them would be reached dynamically via classical evolution, since they are too close to the lower boundary of the potential at V = 0. One could add a finely tuned negative constant to the potential such the V min is at the correct height, but this would simply shift the fine tuning problem to another constant.
We did not account for tunnelling, which may very well reduce the lifetime of vacua at intermediate height drastically, see [67][68][69][70][71][72][73][74]. However, based on the observation of fluctuations in the cosmic microwave background radiation, and the absence of observable spatial curvature, we know that no tunnelling took place during the last sixty e-folds of inflation. Since any prior tunnelling event needs to leave the axions sufficiently high in the potential to allow for inflation and reheating above the energy scale needed for nucleosynthesis, we can conclude that the subsequent, classical evolution leads to the problem elaborated on above.
To conclude, we find that the cosmological constant problem is not alleviated at all by the mere presence of many fields and vacua, leaving yet again anthropic arguments in the framework of eternal inflation as a last resort [51][52][53][54]. We expect similar results to hold for other multi-field potentials, as long as they are sufficiently random.

Inflation on axionic potentials
We wish to investigate dynamics, particularily the feasibility of inflation, on axionic potentials in the limit n D 1. Thus, we need to solve the Friedmann equation in a flat universe (4.20) in conjunction with the generalized Klein-Gordon equations To achieve slow roll inflation for an extended period of time, the potential slow roll parameters, i.e. the ratio between the first and second derivative of the potential along the trajectory to the potential, need to be small. We first note that the potential at a random point is approximately normal distributed with meanV = 1 and variance σ V ∝ 1/ √ n, see eq. (3.10). Thus, as we increase n, the potential on the whole becomes flattened around the mean, which is a simple consequence of keeping the potential normalized so that its values stay in the interval between zero and two. Next, consider the slope in the steepest direction, as measured by Since V k is approximately Gaussian distributed with zero mean and σ V k ≈ 1/ √ n, see (3.29), we showed that the expectation value for the slope in (3.12) has a χ-distribution with mean and for an individual direction, we get k ∼ 1/(2n). Thus, picking a point at random, these slow roll parameters become strongly suppressed and inflation becomes likely. This is consistent with the scaling of the potential difference to the next critical point ∆V ∝ 1/ √ n, see (3.37): one has to traverse a smaller and smaller potential difference to find the next critical point, i.e. the potential is exceedingly flat and inflation should be likely. In addition, if one chooses the mass matrix in (2.8) such that the ratio of the smallest to the second smallest eigenvalue is small, the KNP mechanism is likely to yield a flat direction suitable for inflation.
To build more realistic models, one can shift the potential by the expected height of the final resting place, V min in (4.17), that is V → V −V min (D, n). Half of the trajectories land on average in a final minimum above (but considerably closer) to zero. These potentials are fine tuned. Such a shift reduces Hubble friction, increases the expected slow roll parameters and consequently reduces the likelihood of inflation to some degree. However, since 1 − V min ∼ O(1) for all values of n, D, see the discussion at the end of section 4.2, our conclusions with respect to the likelihood of inflation should remain qualitatively valid.
To test this expectation, we computed numerically the distribution for the number of e-folds N = Hdt for the unshifted and shifted potential V → V − V min (D, n), focussing on trajectories terminating at V min > 0. For D 6, we find N ∼ few before the trajectories settle into a minimum and eternal inflation results. For larger D, numerical methods become increasingly time consuming, since the computational cost scales with the number of terms in the potential. The computational limitation of globally defined random potentials is well known and can be alleviated by constructing the potential locally around the trajectory [65,75,76]. We are currently using such locally defined potentials, based on generalized Dyson Brownian potentials [75], to model axionic potentials and perform a quantitative analysis of inflation, including the computation of cosmological perturbations as in [36] via an extension of Mul-tiModeCode [77]. These results will be presented in a follow-up publication since they go beyond the scope of this article.

Conclusions
Motivated by the feasibility of inflation on multi-field axionic landscapes due to the KNP mechanism, we investigated the distribution of minima that are reached dynamically, particularly the dependence on the dimensionality of field space. Our potentials are bounded from below at V = 0 and rescaled, such that V ≤ 2 holds. Furthermore, cross couplings of the fields are included.
In numerical experiments for D ≤ 7, we found the peak of the distribution to be shifted to lower values as D is increased. To explain its position, we derived the statistical properties of the potential as well as its first and second derivative at well separated, random points in the limit of many fields and sources, n D 1. Together with an estimate for the distribution of critical points, we computed the distribution of minima. This analytic result recovers some qualitative aspects of the distribution, but fails at a quantitative level. We attribute this discrepancy to the difference between counting all minima and the methods by which minima are reached dynamically in a cosmological setting. It should be noted that almost all analytic studies in the literature use the former. We conclude that such simplified methods are insufficient for a quantitative assessment of the final resting place after inflation.
We proceeded by providing a phenomenological expression for the peak of the distribution in (4.19), which entails a logarithmic dependence on D, which does not approach the lower boundary V = 0 fast. Thus, even in the large D limit, a considerable bare contribution to the cosmological constant is present if it is not cancelled by a fine tuned additive constant.
We comment briefly on the feasibility of prolonged periods of inflation on such landscapes, which are likely, but postpone a quantitative study to future work, since the computational cost for the globally defined potentials employed in this paper is prohibitive. This problem can be alleviated by modelling the axionic potentials by locally defined ones, as in [75]. Our preliminary studies are encouraging.
A On the distribution of the sum of N non-identically distributed uniform random variables

A.1 The Irwin-Hall distribution
The Irwin Hall distribution applies to the the sum of n independent and identically uniformly distributed, U(0, 1), random variables. In the large n limit, it approaches a normal distribution with mean µ = n/2 and variance σ 2 = n/12, see [78]. where J j runs from 1 to n for each j, j runs from 1 to D, y J 1 ,...,J D = x J 1 ,...,J D /2 are independent and identically distributed (i.i.d.) uniform variables on the intervall (0, 1) and we dropped a tilde on n to keep the notation simple. We are interested to derive the distribution of E kl . To this end, consider first the simpler variable where y i are i.i.d. uniform variables on (0, 1). Note that a is the sum of n random variables a i , which are random variables on U(0, i), with i = 1, 2, . . . , n. In line with the Irwin Hall distribution, we can approximate the PDF of a by a normal distribution with where C is a constant that we computed numerically to C ≈ 0.11 . (A.5) To get closer to E kl , let us consider as the next step. Evidently, b is the sum of n random variables b i , which in turn is a random variable on U(0, 2i) for i = 1, 2, . . . , n. Thus, for large n, the distribution of b approaches again a normal distribution, but with twice the mean and four times the variance if compared to the one for a, Continuing our approach to E kl , consider next where E k = kb has a normal distribution with i .

B Random matrix theory
Here, we present some known results of random matrix theory pertaining to fluctuation probabilities of eigenvalues for matrices in the Gaussian Orthogonal Ensemble (GOE) with i.i.d. entries and zero mean, see [79]. The probability that an N × N real matrix in the GOE is positive-definite is [31,79,81,82] P ∝ exp − ln ( which is much smaller than one might naively expect based on the N eigenvalues of such a matrix. Instead of matrices in the GOE with zero mean, we want to study eigenvalue fluctuation of N × N random real symmetric matrices, whose entries have non-zero mean and different variances for diagonal and off-diagonal entries. Firstly, in the large N -limit, we can ignore the difference in variance for the diagonal elements, since it does not enter into the fluctuation probability, see e.g. the lecture notes in [80]. Based on numerically generating N × N , real, symmetric matrices with off-diagonal entries drawn from N (µ, σ) as well, we find that the probability for the largest eigenvalue to be positive for µ > 0 is approximately given by The distribution of the remaining eigenvalues obeys Wigner's Semicircle Law, so that Thus, the probability that a critical point is a minimum is approximately given by 9 P min ≈ P (λ 1 > 0) · P (λ 2 , . . . , λ N > 0) ≈ where we treat the largest eigenvalue as if it were independent from all the other eigenvalues, i.e., we don't require the remaining eigenvalues to be smaller than λ 1 . If µ < 0, equation (B.2) describes the probability that the smallest eigenvalue is less than zero so that the probability that a critical point is a minimum becomes