Propagation of chaos and the higher order statistics in the wave kinetic theory

This manuscript continues and extends in various directions the result in arXiv:2104.11204, which gave a full derivation of the wave kinetic equation (WKE) from the nonlinear Schr\"{o}dinger (NLS) equation in dimensions $d\geq 3$. The wave kinetic equation describes the effective dynamics of the second moments of the Fourier modes of the NLS solution at the kinetic timescale, and in the kinetic limit in which the size of the system diverges to infinity and the strength of the nonlinearity vanishes asymptotically according to a specified scaling law. Here, we investigate the behavior of the joint distribution of these Fourier modes and derive their effective limit dynamics at the kinetic timescale. In particular, we prove propagation of chaos in the wave setting: initially independent Fourier modes retain this independence in the kinetic limit. Such statements are central to the formal derivations of all kinetic theories, dating back to the work of Boltzmann (Stosszahlansatz). We obtain this by deriving the asymptotics of the higher Fourier moments, which are given by solutions of the wave kinetic hierarchy (WKH) with factorized initial data. As a byproduct, we also provide a rigorous justification of this hierarchy for general (not necessarily factorized) initial data. We treat both Gaussian and non-Gaussian initial distributions. In the Gaussian setting, we prove propagation of Gaussianity as we show that the asymptotic distribution retains the Gaussianity of the initial data in the limit. In the non-Gaussian setting, we derive the limiting equations for the higher order moments, as well as for the density function (PDF) of the solution. Some of the results we prove were conjectured in the physics literature; others appear to be new. This gives a complete description of the statistics of the solutions in the kinetic limit.


Introduction
Propagation of chaos is a central theme in all kinetic theories in statistical physics. Roughly speaking, it states that for a microscopic system with many interacting objects (particles or waves), two distinct objects should be statistically independent in the kinetic limit. Of course, this independence is not true before taking the limit, even if it is true at initial time, because the dynamics naturally produces correlations between the objects. Nonetheless, the fact that this independence is resurrected in the limit is a cornerstone of the whole kinetic description, in both particle and wave kinetic theories. In fact, almost all formal derivations of kinetic models, dating back to the founding work of Boltzmann, assume propagation of chaos to hold in order to get a closed kinetic equation for the lowest nontrivial marginal or moment of the solution.
Mathematically speaking, propagation of chaos can be phrased in terms of the asymptotics of appropriate correlations or joint distributions of the solution. In wave kinetic theory, also called wave turbulence theory, these are given by the (second and higher order) moments of the Fourier modes of the solution to the dispersive equation that describes the microscopic system. If $u(t)$ is this solution, the second moment $\mathbb{E}|\widehat{u}(t,k)|^2$ is the central quantity whose asymptotics, in the kinetic limit, are given by the wave kinetic equation (WKE), which acts as the wave analog of Boltzmann's equation. The formal derivations of this equation in the physics literature, dating back to the pioneering works of Peierls, Hasselmann, and others [29,21,22,27,28,33], are based on the unjustified assumption of propagation of chaos, which effectively allows one to represent higher order mixed moments by products of second order ones, thus yielding a closed equation for the second moments.
A rigorous derivation of the WKE at the kinetic timescale, starting from the nonlinear Schrödinger (NLS) equation with random initial data, has been given in our recent work [12]. This is the first result of its kind for any dispersive system (we will review some of the literature below). The derivation is done via a delicate analysis of the iterates of the NLS equation and their second order correlations, which are represented by ternary trees (and couples of such trees) often called Feynman diagrams. The analysis of such diagrams involves (a) identifying the leading order diagrams called regular couples, (b) proving that all remaining diagrams lead to negligible contributions, and (c) controlling the remainder term in the iteration. This outline is rather simplistic; in reality there are other almost-leading diagrams whose contributions have to be analyzed separately. Moreover, the problem of estimating the diagrams is probabilistically critical in the sense of [13], which, combined with the factorial growth of the number of diagrams, makes the execution of this outline far from trivial. We will review some elements of that proof in Section 3 below, and also refer the reader to Section 3 of [12] for a more detailed exposition.
In particular, the proof in [12] does not require establishing propagation of chaos for the higher moments of the solution in order to obtain the effective equation for the second moment, in sharp contrast with the earlier works that make use of the BBGKY and other similar hierarchies. This brings us to the main goal of this manuscript, which is to establish propagation of chaos and the corresponding (wave kinetic) hierarchy a posteriori, relying on the analysis introduced in [12]. Highly interesting results and unique features will appear, for the higher order statistics, depending on the initial distribution of the data, as we discuss both Gaussian and non-Gaussian initial distributions (for concreteness, only the Gaussian case was treated in [12]). In the former case, we will prove propagation of Gaussianity, which states that the asymptotic distribution of the modes remains Gaussian as it is initially. In the latter case, we will derive the limiting equations for the probability density function. We remark that this gives a complete description of the statistics of the solutions in the kinetic limit, for both Gaussian and non-Gaussian initial distributions.
1.1. The kinetic setup. To state our results more precisely, let us first recall the wave kinetic setup starting with the microscopic system given by the cubic nonlinear Schrödinger equation. In dimension $d\geq 3$, we set this equation on a large torus of size $L$. The torus may be rational or irrational, which can always be rescaled to the square torus $\mathbb{T}^d_L=[0,L]^d$ but with the twisted Laplacian $\Delta_\beta$, where $\beta=(\beta_1,\dots,\beta_d)\in(\mathbb{R}^+)^d$ determines the aspect ratios of the torus. Consider the cubic NLS equation (NLS) with random initial data $u(0)=u_{\mathrm{in}}$ of the form (DAT). Here $\mathbb{Z}^d_L=(L^{-1}\mathbb{Z})^d$, $n_{\mathrm{in}}$ is a nonnegative Schwartz function on $\mathbb{R}^d$, and $\eta_k(\omega)$ are i.i.d. random variables satisfying $\mathbb{E}\eta_k=0$, $\mathbb{E}|\eta_k|^2=1$. This distribution of initial data will be called Gaussian if the law of each $\eta_k$ is a standard complex Gaussian, and called non-Gaussian otherwise. The parameter $\alpha$ stands for the strength of the nonlinearity$^1$ and $T_{\mathrm{kin}}$ is the kinetic timescale at which the NLS dynamics is approximated by that of the WKE. The kinetic limit is taken by letting $L\to\infty$ (large box limit) and $\alpha\to 0$ (weak nonlinearity limit), according to some scaling law that specifies the relative rate of those two limits.
The general form of a scaling law is $\alpha=L^{-\gamma}$ where $0\leq\gamma\leq\infty$, with the understanding that if $\gamma=0$ then the $\alpha\to 0$ limit is taken after the $L\to\infty$ limit, and vice versa for $\gamma=\infty$. As explained in the introduction of [12], not all scaling laws are admissible for the kinetic theory, and the admissibility range can depend on the shape of the torus (i.e. the Diophantine properties of $\beta$). Indeed, without any Diophantine conditions on $\beta$, the admissible range of $\gamma$ is $0\leq\gamma\leq 1$, and one can show (e.g. [10]) that if $\gamma>1$, then the kinetic description does not hold, for example when $\beta=(1,\dots,1)$. Imposing generic Diophantine conditions on $\beta$, by removing a set of "bad" vectors of zero Lebesgue measure, widens the admissible range of $\gamma$ to $0\leq\gamma\leq\frac{d}{2}$. In [12], we treated scaling laws of the form $\alpha=L^{-\gamma}$ with $\gamma\leq 1$ but sufficiently close to 1. When $\gamma<1$ no requirements on the shape of the torus are needed, but for the endpoint $\gamma=1$, the torus needs to have generic shape, i.e. $\beta$ should belong to the complement of a fixed Lebesgue null set $Z$ defined by a set of explicit Diophantine conditions (Lemma 2.1). We remark that the approach in [12] can be used to cover the full range $\gamma\in(0,1)$; this will be addressed in a forthcoming work under preparation. In the current paper, for the sake of concreteness, we will stick to the setup in [12] and adopt the scaling law $\alpha=L^{-1}$, with the understanding that the result also applies to $\gamma$ smaller but sufficiently close to 1 and without any Diophantine condition on $\beta$. As such, throughout the proof we will assume $\beta$ is generic in the above sense, $\lambda=L^{(d-1)/2}$, and $T_{\mathrm{kin}}=L^2/2$.
For $0<\delta\ll 1$ depending on $n_{\mathrm{in}}$, define the solution $n=n(t,k)$, for $t\in[0,\delta]$ and $k\in\mathbb{R}^d$, to the wave kinetic equation
$$\partial_t n(t,k)=\mathcal{K}(n(t),n(t),n(t))(k),\qquad n(0,k)=n_{\mathrm{in}}(k),\tag{WKE}$$
where $\mathcal{K}$ denotes the trilinear collision nonlinearity. The following theorem is the main result of [12], which describes the evolution of the variance $\mathbb{E}|\widehat{u}(t,k)|^2$ in the limit. Here and below, the expectation $\mathbb{E}$ is always taken under the assumption that (NLS) has a smooth solution on $[0,\delta\cdot T_{\mathrm{kin}}]$, which happens with overwhelming probability.
Theorem 1.1 (Theorem 1.1 of [12]). Fix $A\geq 40d$, $\beta\in(\mathbb{R}^+)^d\setminus Z$, and a function $n_{\mathrm{in}}\geq 0$ whose $S^{40d}$ norm is bounded by a constant $C_1$. Assume the law of each $\eta_k$ is Gaussian. Let $\delta$ be small enough depending on $(A,\beta,C_1)$, and $L$ be sufficiently large depending on $\delta$. Set $\lambda=L^{(d-1)/2}$ so $\alpha=L^{-1}$ and $T_{\mathrm{kin}}=L^2/2$. Then, the equation (NLS), with random initial data (DAT), has a smooth solution up to time $T=\delta\cdot T_{\mathrm{kin}}$ with probability $\geq 1-L^{-A}$. Moreover, the limit (1.1) holds, where $n(t,k)$ is the solution to (WKE).
$^1$ With overwhelming probability for large $L$, it can be shown that the size of the nonlinearity (say in $L^2$ norm) is comparable to $\alpha$. This follows from the probabilistic analysis performed in [12], but can also be seen by simple heuristic considerations (cf. the introduction of [12]). We also note that it is common in the physics literature to use a different parametrization of the Fourier series in (DAT) by replacing the $L^{-d}$ factor in (DAT) with $L^{-d/2}$, in which case $\alpha$ would be defined as $\lambda^2$ and $T_{\mathrm{kin}}=1/(2\lambda^4)$.
Remark 1.2. Theorem 1.1 is stated in [12] for Schwartz $n_{\mathrm{in}}$. A closer look at the proof shows that it remains true as long as $n_{\mathrm{in}}\in S^{40d}$, and $\delta$ should only depend on the $S^{40d}$ norm of $n_{\mathrm{in}}$; see the remarks after Theorem 1.1 in [12]. The same comment also applies to all the main results of the current paper.
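The parameter choices in Theorem 1.1 can be checked by direct arithmetic. The sketch below assumes the definitions $\alpha=\lambda^2L^{-d}$ and $T_{\mathrm{kin}}=1/(2\alpha^2)$; these displays are not reproduced above, but they are the choices consistent with the values $\alpha=L^{-1}$, $T_{\mathrm{kin}}=L^2/2$ quoted in the theorem and with the alternative normalization described in footnote 1.

```latex
% Assumed definitions (inferred from the stated values, not quoted from the text):
%   \alpha = \lambda^2 L^{-d},   T_kin = 1/(2\alpha^2).
\begin{align*}
\lambda = L^{(d-1)/2}
  &\;\Longrightarrow\; \alpha = \lambda^{2}L^{-d} = L^{(d-1)-d} = L^{-1}, \\
  &\;\Longrightarrow\; T_{\mathrm{kin}} = \frac{1}{2\alpha^{2}} = \frac{L^{2}}{2}.
\end{align*}
% Alternative normalization (factor L^{-d/2} in (DAT)): \alpha = \lambda^2, hence
\[
T_{\mathrm{kin}} = \frac{1}{2\alpha^{2}} = \frac{1}{2\lambda^{4}} .
\]
```

Under these assumptions the scaling law $\alpha=L^{-\gamma}$ with $\gamma=1$ corresponds exactly to the stated choice of $\lambda$.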
1.2. Propagation of chaos: The Gaussian case. As mentioned above, the proof of Theorem 1.1 does not require obtaining asymptotics on the higher Fourier moments. Such information is provided in our first main result, which can be viewed as an extension of Theorem 1.1.
Theorem 1.3 (Propagation of chaos and Gaussianity). Under the same assumptions as Theorem 1.1 above, fix a positive integer $r$ and nonnegative integers $p_1,\dots,p_r$ and $q_1,\dots,q_r$. Then, if $p_j\neq q_j$ for at least one $j$ ($1\leq j\leq r$), the limit (1.2) holds. Here, as in Theorem 1.1, the expectation is taken only when (NLS) has a smooth solution on $[0,T]$ where $T=\delta\cdot T_{\mathrm{kin}}$ (which has probability $\geq 1-L^{-A}$). If $p_j=q_j$ for each $1\leq j\leq r$, then the limit (1.3) holds.
A key feature of Theorem 1.3 is that, up to error terms that vanish as $L\to\infty$, we have
$$\mathbb{E}\Big(\prod_{j=1}^r|\widehat{u}(t,k_j)|^{2p_j}\Big)\approx\prod_{j=1}^r\mathbb{E}|\widehat{u}(t,k_j)|^{2p_j},\quad\text{and all other moments}\approx 0.\tag{1.4}$$
This means that, for fixed $t$, the random variables $\widehat{u}(t,k)$ for different $k$ become independent in the limit (at least in terms of the marginal distributions of any finitely many of them), which justifies rigorously the propagation of chaos assumption in the literature, as described in the beginning of this paper. Note that these coefficients cannot be independent without taking limits, because correlations will always be produced by the nonlinear interactions in the NLS equation. Nonetheless, this independence reappears in the kinetic limit as $L\to\infty$ and $\alpha\to 0$, for the same subtle and deep reason that makes the kinetic approximation in (1.1) hold. Namely, the only non-vanishing interactions contributing to the expectations in (1.1)-(1.3) are those obtained by concatenating blocks of basic interactions called (1, 1)-mini couples and mini trees (see Figures 1-3), thus forming what we call regular couples (for second moments) or regular multi-couples (for higher order moments, see Section 1.6). Such interactions can only be built if $p_j=q_j$ in the notation of Theorem 1.3; moreover, in the higher order case, the associated structure actually decouples into second order structures, hence (1.4) naturally occurs. The same reasoning also holds in the non-Gaussian case below (Section 1.3), for which (1.4) remains valid.
In addition, in this Gaussian setting we have
$$\mathbb{E}|\widehat{u}(t,k)|^{2p}\approx p!\,\Big(n\Big(\frac{t}{T_{\mathrm{kin}}},k\Big)\Big)^{p},\tag{1.5}$$
which means that the law of $\widehat{u}(t,k)$ in the limit is Gaussian with variance $n\big(\frac{t}{T_{\mathrm{kin}}},k\big)$ as long as the initial state at $t=0$ is Gaussian. This has been conjectured in the physics literature under the name of propagation of Gaussianity (see also the discussion following Theorem 1.5).
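The link between the moment pattern $p!\,n^p$ and Gaussianity can be made explicit. A minimal computation, assuming $g$ denotes a centered complex Gaussian with $\mathbb{E}|g|^2=n$ (the symbol $g$ is introduced here for illustration only):

```latex
% Density of g on C = R^2 is (\pi n)^{-1} e^{-|v|^2/n}; pass to polar coordinates.
\begin{align*}
\mathbb{E}|g|^{2p}
  &= \int_{\mathbb{C}} |v|^{2p}\,\frac{1}{\pi n}\,e^{-|v|^{2}/n}\,dv
   = \int_{0}^{\infty}\rho^{2p}\,\frac{2\rho}{n}\,e^{-\rho^{2}/n}\,d\rho
   && (v=\rho e^{i\theta}) \\
  &= n^{p}\int_{0}^{\infty}s^{p}e^{-s}\,ds
   = p!\,n^{p}
   && (s=\rho^{2}/n).
\end{align*}
```

Equivalently, $|g|^2$ is exponentially distributed with mean $n$, and the exponential law is determined by its moments; this is why matching all even moments with $p!\,n^p$ identifies the limiting law of the modulus.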
1.3. The non-Gaussian case. Highly interesting results appear in the non-Gaussian case, where unlike Theorems 1.1 and 1.3, the law of $\eta_k$ may not be Gaussian. While the second moments still follow the WKE in this setting, the non-Gaussianity of the initial law starts to exhibit itself at the higher ($\geq 4$) order moments and statistics. We will assume the law of $\eta_k$ is rotationally symmetric$^1$, and has exponential tails. Then, we have the following modification to Theorem 1.3:
Theorem 1.4 (Evolution of moments). Suppose the i.i.d. random variables $\{\eta_k(\omega)\}$ have a law that is rotation symmetric, and whose moments $\mu_r$ satisfy a growth bound for some constant $C_0$ (this is equivalent to $\mathbb{E}(e^{|\eta_k|^{\beta}})<\infty$ for small $\beta>0$). Then the same limits in (1.1) and (1.2) remain true. Moreover, instead of (1.3), we have the limit (1.6). Here the functions $\mu_r(t,k)$ are defined as follows: recall $n(t,k)$ is the solution to (WKE), and let $n_0(t,k)$ be the solution to the equation (WKE-0); then $\mu_r(t,k)$ is given by (1.7).
Note that if $\{\eta_k\}$ is Gaussian, then $\mu_p=p!$, so (1.7) yields that $\mu_q(t,k)=q!\,(n(t,k))^q$, and we recover Theorem 1.3. Similarly, for $q=1$ we have $\mu_1(t,k)=n(t,k)$, so Theorem 1.1 remains true in the non-Gaussian case.
Note that in Theorem 1.4 we still have (1.4), thus propagation of chaos remains true in the non-Gaussian case. In addition, instead of (1.5) we have $\mathbb{E}|\widehat{u}(t,k)|^{2p}\approx\mu_p\big(\frac{t}{T_{\mathrm{kin}}},k\big)$ where $\mu_p(t,k)$ is defined as in (1.7). As far as we know, these expressions for higher order moments are new.
$^1$ Rotation symmetry seems to be always assumed in the physics literature; it would be interesting to see what happens without this assumption, in particular if (1.10) remains true. Here the loss of gauge invariance may lead to additional contributions, but probably they will be error terms in the end.
We remark that Theorems 1.3 and 1.4 actually hold for moments whose degree (given by $\sum_{j=1}^r(p_j+q_j)$ in the notation of (1.2)) may diverge as $L\to\infty$. Indeed, we will see in the proof that this degree can be taken as big as $\log L$ (for Theorem 1.3) or $\frac{\log L}{(\log\log L)^2}$ (for Theorem 1.4). Under slightly stronger assumptions, Theorem 1.4 allows us to describe the evolution of the law of individual Fourier modes in terms of the density function, which then provides a full description of the statistics of the NLS solution in the limit. This is summarized in our next theorem below.
Theorem 1.5 (Evolution of density). In Theorem 1.4, assume further that $\mu_r\leq C^r(2r)!$ for some constant $C$ (this is equivalent to $\mathbb{E}(e^{\beta|\eta_k|})<\infty$ for small $\beta>0$). Recall the solution $n=n(t,k)$ to (WKE). Let the density function of each $\eta_k(\omega)$ be $\rho_*=\rho_*(v)$, where $v\in\mathbb{C}$ is also viewed as an $\mathbb{R}^2$ vector; assume $\rho_*$ is a radial function. Let $\rho_k=\rho_k(t,v)$ be the solution to the linear equation (1.10). Clearly each $\rho_k$ is also radial. Fix $t\in[0,\delta]$, a positive integer $r$ and distinct vectors $k_1,\dots,k_r$. Then the corresponding Fourier modes of the solution converge in law, as $L\to\infty$, to the random variable with density function (1.12).
The factorization structure in (1.12) is a consequence of propagation of chaos, which has been established in Theorem 1.4; thus the main feature of Theorem 1.5 is the evolution of the individual density (1.10). It appears that this equation has only been discovered fairly recently in the physics literature (see [24,5], and Section 6.6 of [27]).
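The equivalence between the moment bound and the exponential moment condition in Theorem 1.5 is elementary. A sketch, assuming $\mu_r=\mathbb{E}|\eta_k|^{2r}$ (consistent with $\mu_p=p!$ in the Gaussian case) and writing $\eta$ for $\eta_k$ and $M$ for the exponential moment (both shorthand introduced here):

```latex
% (=>) The bound \mu_r \le C^r (2r)! gives a finite exponential moment
%      whenever \beta\sqrt{C} < 1. Even and odd terms of the series:
\begin{align*}
\mathbb{E}\,e^{\beta|\eta|}
  &= \sum_{m\ge 0}\frac{\beta^{m}}{m!}\,\mathbb{E}|\eta|^{m},
  \qquad
  \frac{\beta^{2r}\mu_r}{(2r)!}\le(\beta^{2}C)^{r}, \\
\frac{\beta^{2r+1}\,\mathbb{E}|\eta|^{2r+1}}{(2r+1)!}
  &\le \frac{\beta^{2r+1}\sqrt{\mu_r\,\mu_{r+1}}}{(2r+1)!}
   \le \sqrt{2}\,(\beta\sqrt{C})^{2r+1}
  && \text{(Cauchy--Schwarz)}.
\end{align*}
% (<=) If M := E e^{\beta|\eta|} < \infty, use x^{2r}/(2r)! \le e^{x} with x = \beta|\eta|:
\[
\mu_r=\mathbb{E}|\eta|^{2r}
  \le\frac{(2r)!}{\beta^{2r}}\,M
  \le\big(M\beta^{-2}\big)^{r}\,(2r)! \qquad (r\ge 1,\ \text{since } M\ge 1).
\]
```

The odd-term estimate uses $(2r)!\,(2r+2)!\leq 2\,((2r+1)!)^2$, so both the even and odd parts of the series are dominated by geometric series.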
Note that in the Gaussian case (Theorem 1.3) we have $\rho_k(t,v)=\frac{1}{\pi n(t,k)}e^{-|v|^2/n(t,k)}$, so by (1.12), the limit distribution is given by independent Gaussians with variance $n(t,k)$, which provides another manifestation of the propagation of Gaussianity. Other solutions to (1.10) can be obtained and analyzed using the method of characteristics in Fourier space, see [6].
1.4. The wave kinetic hierarchy. By taking $p_j=1$ in (1.3) or (1.6) we obtain the limits (1.13). These limit quantities are conjectured to solve an infinite hierarchy of equations called the wave kinetic hierarchy (WKH), which is a linear system for symmetric functions $n_r(t,k_1,\dots,k_r)$. This hierarchy is the analog of the Boltzmann and Gross-Pitaevskii hierarchies, and is formally derived in recent works such as Chibbaro et al. [7,8], Eyink-Shi [17] and Newell-Nazarenko-Biven [28], though it also follows from much earlier works including the foundational work of Peierls, see [29,3,27].
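The key property of (WKH) is factorizability: if the initial data is factorized, $(n_r)_{\mathrm{in}}(k_1,\dots,k_r)=\prod_{j=1}^r n_{\mathrm{in}}(k_j)$, then the solution is $n_r(t,k_1,\dots,k_r)=\prod_{j=1}^r n(t,k_j)$, where $n(t,k)$ solves (WKE) with initial data $n_{\mathrm{in}}$. This follows from direct calculations together with a suitable uniqueness theorem, which was recently proved by Rosenzweig and Staffilani in [30].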
In the above sense, we can view (WKH) as a generalization of (WKE) that allows for dependent Fourier modes. Indeed, suppose the initial data $u_{\mathrm{in}}$ of (NLS) is given by (DAT) with $\widehat{u_{\mathrm{in}}}(k)$ being independent for different $k$; then Theorem 1.4 implies that the limit (1.13) will be a factorized solution to (WKH) with factorized initial data, which is in fact the tensor product of the solution to (WKE). However, if $u_{\mathrm{in}}$ does not have independent Fourier modes, then the initial data (1.13) at time 0 will not have factorized form, in which case (1.13) at time $t$ is conjectured to be a more general solution to (WKH).
Such a scenario may arise, as discussed in Section 1.3 of [30], if one considers a hybrid, or "twice randomized data" problem of (NLS) as follows: Instead of taking $n_{\mathrm{in}}$ deterministic in (DAT), we choose it randomly according to a probability measure $\zeta$ defined on the space of all nonnegative functions $n_{\mathrm{in}}$, in such a way that the new random function $n_{\mathrm{in}}$ is independent of the pre-fixed i.i.d. random variables $\{\eta_k\}$. In the case when $\eta_k$ are random phases ($\eta_k(\omega)=e^{i\theta_k(\omega)}$ with $\theta_k$ uniformly distributed on the circle), this process of randomization is referred to as the "Random Phase and Amplitude" assumption in the wave turbulence theory literature, where in this general setup different amplitudes are not necessarily independent.
In other words, we are choosing a random initial data whose law of distribution (as a probability measure) is given by a suitable average of those specific probability measures which are laws of distribution of random data of form (DAT), i.e. having independent Fourier coefficients. This averaging is achieved by first generating a random nonnegative function $n_{\mathrm{in}}$ according to the probability measure $\zeta$ on the space of all nonnegative functions, and then selecting the random initial data as (DAT) with some pre-fixed i.i.d. random variables $\{\eta_k\}$. Since independent Fourier modes in (DAT) correspond to factorized solutions to (WKH), we know, using also the linearity of (WKH), that the above process will result in a solution to (WKH) which is an average of certain factorized solutions. These are referred to as "super-statistical solutions" in Eyink-Shi [17] and may provide a possible explanation of intermittency in wave turbulence.
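The averaging construction can be written out explicitly. In the sketch below, $m$ denotes the random profile replacing $n_{\mathrm{in}}$ and $n^{(m)}(t,k)$ is notation introduced here for the solution to (WKE) with initial data $m$; the first identity uses only that $m$ is independent of $\{\eta_k\}$ and $\mathbb{E}|\eta_k|^2=1$. The formulas illustrate the structure and are not meant as the precise statements of [17] or [30].

```latex
% Initial moments of the hybrid data, for distinct k_1,...,k_r:
\[
\mathbb{E}\prod_{j=1}^{r}\big|\sqrt{m(k_j)}\,\eta_{k_j}\big|^{2}
  =\int\prod_{j=1}^{r}m(k_j)\,d\zeta(m).
\]
% Each factorized datum evolves into a factorized solution of (WKH);
% linearity of (WKH) then gives the averaged ("super-statistical") solution:
\[
n_r(t,k_1,\dots,k_r)=\int\prod_{j=1}^{r}n^{(m)}(t,k_j)\,d\zeta(m).
\]
```

When $\zeta$ is a point mass the integral collapses to a single tensor product, which recovers the factorized case.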
Just like (WKE), the rigorous derivation of (WKH) has been an outstanding open problem. In fact these two problems are closely related; as mentioned in the beginning of this paper, there are many earlier works on similar problems that first derive the corresponding hierarchies and then restrict to factorized solutions to obtain the kinetic equations. In the wave turbulence context, such an approach is theoretically possible but has not yet been successful. Instead, we are following exactly the opposite route: we first derive the kinetic equation (WKE) in [12], then apply the same techniques to derive the hierarchy (WKH) a posteriori, in the current paper. So our last main result is the rigorous derivation of (WKH) for general non-factorized initial data, which we state as follows.
Theorem 1.6 (Derivation of (WKH)). Fix a positive number $X>0$ and a sequence of i.i.d. random variables $\{\eta_k\}$ as in Section 1.1 that satisfy the requirements of Theorem 1.4. Suppose the sequence $(n_r)_{\mathrm{in}}$ satisfies the bound (1.14) for some large constant $C_1$ (note $C_1\gg X$). We say $(n_r)_{\mathrm{in}}$ is admissible if, for any $r\geq 2$, the condition (1.15) holds. Consider a probability measure $\zeta$ on the set $\mathcal{A}$ of nonnegative functions $m=m(k)$ on $\mathbb{R}^d$. For this $\zeta$, consider the hybrid initial data $u_{\mathrm{in}}$ which is given by (DAT), except that $n_{\mathrm{in}}$ should be replaced by $m$, which is another random variable with values in $\mathcal{A}$, such that $m$ is independent of all the $\eta_k$ and the law of $m$ is given by $\zeta$. We say $(n_r)_{\mathrm{in}}$ is hybrid if there exists a $\zeta$ such that, for the above choice of $u_{\mathrm{in}}$, it holds that $\mathbb{E}\prod_{j=1}^r|\widehat{u_{\mathrm{in}}}(k_j)|^2=(n_r)_{\mathrm{in}}(k_1,\dots,k_r)$ for any $L$ and any distinct $k_j\in\mathbb{Z}^d_L$ ($1\leq j\leq r$). Let $T=\delta\cdot T_{\mathrm{kin}}$ where $\delta$ is as in Theorem 1.1 (except $C_1$ is now defined by (1.14)); the other parameters are as in Theorem 1.1. Then the following hold.
(1) The sequence $(n_r)_{\mathrm{in}}$ is hybrid if and only if it is admissible; in this case the measure $\zeta$ is unique.
(2) Assume $(n_r)_{\mathrm{in}}$ is admissible. Then with the hybrid initial data defined above, the equation (NLS) has a smooth solution up to time $T$ with probability $\geq 1-L^{-A}$. Moreover, for any fixed $r$, the corresponding limit holds with $n_r(t,k_1,\dots,k_r)$ being the unique solution to (WKH) constructed in [30] with initial data $(n_r)_{\mathrm{in}}$. For any $0\leq t\leq\delta$, this solution $(n_r)(t)$ is admissible in the sense of (1.15) for the same $X$.
We make two remarks regarding Theorem 1.6. First, the $S^{40d;r}$ norms defined in (1.14) are much stronger than the $L^\infty$-type norms defined in [30], because of the strong $S^{40d}$ norm used in Theorem 1.1. It may be possible to relax this regularity assumption to match [30], but this requires refining the proof of Theorem 1.1 (and Theorems 1.3-1.5), which we are not doing here.
Second, the admissibility requirement (1.15) seems natural in view of the conclusion (1): anything that actually arises from these hybrid initial data must be admissible.Non-admissible solutions to (WKH) do exist, but they are probably not physically meaningful as pointed out in [30].
1.5. Background literature. The proofs of Theorems 1.3-1.6 are based on the framework introduced in [12] to prove Theorem 1.1. The latter work comes as a culmination of an extensive research effort over the past years to provide a rigorous justification of the wave kinetic equation starting from nonlinear dispersive PDEs as first principles [26,4,18,14,15,16,11,9,10]. This is Hilbert's sixth problem for waves; its particle analog is the rigorous derivation of the Boltzmann equation from Newtonian mechanics (see [20,25,19,2] and references therein). We refer the reader to the introduction of [12] for a discussion of the developments leading up to it.
We should remark on the progress that has happened since the submission of [12]. First, we mention the work of Staffilani and Tran [32]. In this work, the authors consider a high ($\geq 14$) dimensional discrete KdV-type equation, with a Stratonovich-type stochastic multiplicative noise, which has the effect of regularly randomizing the phases of the Fourier modes. In the presence of this noise, the authors derive the associated kinetic equation at the kinetic timescale $T_{\mathrm{kin}}$ and in the scaling law $\alpha=L^{-0}$. The authors also have a conditional result in the absence of the noise, which assumes that some a priori estimates hold for the solution, and they verify that these conditions are met for some more restrictive sets of initial data.
Another work in this direction is due to Ampatzoglou-Collot-Germain [1] which considers the problem of deriving the WKE in an inhomogeneous setting.The authors derive this equation from a quadratic NLS-type equation for short (asymptotically vanishing) timescales, which, similar to [11], is a subcritical version of the critical setting considered here and in [12].
Note that the works [4,9,10,11,12,14,15,16,26] concern cubic nonlinearities or 4-wave interactions, while the works [1,18,32] concern quadratic nonlinearities or 3-wave interactions. Both models represent many important physical scenarios. Although the cubic case is considered in the current paper and in [12], we believe that the quadratic case can be treated in the same way without much difference in strategy (as exhibited by [1]).
1.6. Idea of the proof. Before discussing the main ideas, we first review the proof of Theorem 1.1 in [12]. The basic strategy is to perform a high order expansion of the NLS solution in Fourier space as
$$\widehat{u}(t,k)=\sum_{n=0}^{N}\mathcal{J}_n(t,k)+\mathcal{R}_N(t,k).\tag{1.19}$$
Here, $N$ is the order of the expansion which diverges appropriately with the size $L$ of the domain, $\mathcal{J}_n$ is the $n$-th Picard iterate, and $\mathcal{R}_N$ is the remainder. The iterates $\mathcal{J}_n$ can be written as the sum of $\mathcal{J}_{\mathcal{T}}$, where $\mathcal{T}$ runs over all ternary trees that have $n$ branches; these are often called Feynman diagrams. To derive (WKE) in [12], one has to compute the asymptotics of the second moments $\mathbb{E}|\widehat{u}(t,k)|^2$, which leads to the analysis of the correlations $\mathbb{E}(\mathcal{J}_{\mathcal{T}_1}\overline{\mathcal{J}_{\mathcal{T}_2}})$ for trees $\mathcal{T}_1$ and $\mathcal{T}_2$ of at most $N$ branches. These expressions naturally lead to the notion of couples, which consist of two ternary trees whose leaves are paired to each other. The key observation is that the leading couples in the expansion take a very special form, which we call regular couples, namely they are obtained by appropriately concatenating (1, 1)-mini couples and mini trees (see Figures 1-3). The proof in [12], as described before, then reduces to (a) establishing the precise asymptotics of the regular couples, which is made possible by their precise, albeit highly complex, structure, (b) showing that the remaining couples are of lower order, which constitutes the heart of the proof, and (c) showing that the remainder $\mathcal{R}_N$ is also of lower order.
Now, in Theorem 1.3, we are interested in the higher order moments of the solutions, where the order $R$ can be arbitrarily large (or even grow to infinity with $L$). If we perform the same expansion (1.19), then we need to consider expectations of products of $R$ factors of the form $\mathcal{J}_{\mathcal{T}}^{\pm}$, where, as usual, a minus superscript denotes complex conjugation. This leads to the key new concept in the current paper, which we call gardens, that are formed by $R$ trees whose leaves are paired to each other.
In the Gaussian setting of Theorem 1.3, gardens are the only new structures that emerge. Since $R$ can be arbitrarily large and may even grow to infinity with $L$, the analysis of gardens of $R$ trees will be a lot more complicated than that of couples of two trees. However, the methodology introduced in [12], originally designed to treat couples, is in fact so robust that it can be extended to gardens, even for very large $R$, with some additional twists. Indeed, the leading contributions here come from those gardens that are formed by putting together $R/2$ couples (we call them multi-couples), which can be analyzed using the results of [12]. In particular, as shown in [12], only the regular multi-couples, where each of the $R/2$ couples is regular, provide the top order contributions; these can be explicitly calculated as in [12] to match the desired right-hand side expressions, and the rest is of lower order.
As for the gardens that are not multi-couples, we apply the procedure of [12] (which is defined for couples but can be easily generalized to gardens) and conclude that they are of lower order (Proposition 4.7). A few technical differences occur here (such as in combinatorics, cf. Proposition 6.4 and Proposition 9.6 of [12]), but the most important one, which is also the reason why these terms are of lower order, comes from the structure of the molecules (see Section 6) associated with such gardens. This is stated in Proposition 6.3 (for comparison, we have $\chi=m$ instead of $\chi\leq m-R/2$ for multi-couples), which can be used to establish a power gain in the counting estimates (Proposition 6.8, note the $m-R/2$ in the exponent), and subsequently the lower order bounds.
In the non-Gaussian setting (Theorem 1.4), we need to introduce even more general structures. In fact, gardens appear from dividing the leaves of the $R$ trees as above into two-leaf pairs. In the Gaussian case, due to Isserlis' theorem, only expressions associated with gardens need to be considered; in the non-Gaussian case, we have a substitute of Isserlis' theorem (Lemma 9.1), which is reminiscent of the cumulant expansions of the moments of random variables, but with the important quantitative estimates included. This leads to the notion of over-gardens, which are basically the same as gardens but allow pairings of more than two leaves. Again, in this setting, we identify the leading over-gardens (called regular ones) and prove that the complementary set is of lower order. It is here that the non-Gaussianity starts to exhibit itself, as regular over-gardens contribute to the leading terms in addition to regular gardens, which explains the difference between (1.3) and (1.6).
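To illustrate the role of Isserlis' theorem and of its non-Gaussian substitute, consider the simplest case of a fourth moment. The notation $\mu_2=\mathbb{E}|\eta|^4$ follows Theorem 1.4; the symbol $\kappa_4$ for the correction term is introduced here for illustration and is not the paper's notation.

```latex
% Isserlis (Wick) formula for centered, jointly complex Gaussian g_1,...,g_4
% with vanishing pseudo-covariances E(g_i g_j) = 0:
\[
\mathbb{E}\big(g_1\,g_2\,\overline{g_3}\,\overline{g_4}\big)
 =\mathbb{E}(g_1\overline{g_3})\,\mathbb{E}(g_2\overline{g_4})
 +\mathbb{E}(g_1\overline{g_4})\,\mathbb{E}(g_2\overline{g_3}).
\]
% For a single standard complex Gaussian g: two pairings, so E|g|^4 = 2 = 2!.
% For a rotation symmetric, non-Gaussian \eta with E|\eta|^2 = 1:
\[
\mathbb{E}|\eta|^{4}
 =\underbrace{2}_{\text{two-leaf pairings}}
 +\underbrace{(\mu_2-2)}_{\kappa_4:\ \text{all four leaves grouped together}}.
\]
% Example: a random phase \eta = e^{i\theta} has \mu_2 = 1, hence \kappa_4 = -1.
```

The first term is the contribution a garden would account for, while the correction $\kappa_4$ is the kind of term that requires grouping more than two leaves, as over-gardens do.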
In all the proofs above, as well as in [12], the leading structures (regular couples, multi-couples and over-gardens) are still highly complex objects, whose number grows exponentially (rather than factorially) in their size. However, their redeeming feature is that one can write down exact expressions for them in the kinetic limit, which allows one to match their contribution, order by order, with the solutions of the kinetic equations that appear in (WKE), (WKE-0), or (WKH).
Finally, Theorem 1.5 is a direct consequence of (1.3) and (1.7), and uniqueness of the moment problem in this setting (i.e. the moments uniquely define the law), see Lemma 9.6. Theorem 1.6 basically follows from averaging the results of Theorem 1.3 in different scenarios, and applying the Hewitt-Savage theorem (see Lemma 10.1) to represent arbitrary densities by tensor products.
We remark that the proof in this paper relies heavily on the notions and framework introduced in [12]. On the other hand, despite a few places where we briefly go over the results and proofs of [12], the majority of this paper is devoted to the new components needed in the higher order setting. In particular, the gardens we introduce are fundamental objects with important new features (such as Proposition 6.3), which will play significant roles in future studies of wave turbulence.
1.7. Organization of the paper. The paper is organized as follows: In Section 2 we review the setup and present some reductions to the problem. In Section 3, we review the argument in [12] and the needed results from there. In Section 4, we introduce the notion of gardens, their elementary combinatorial properties, and state the needed estimates to prove Theorem 1.3. These estimates are then proved in Sections 5-8. In Section 9 we deal with the non-Gaussian case and prove Theorems 1.4 and 1.5, and in Section 10 we prove Theorem 1.6.
1.8. Acknowledgements. Yu Deng is supported in part by NSF grant DMS-1900251 and Sloan Fellowship. Zaher Hani is supported in part by NSF grant DMS-1654692 and a Simons Collaboration Grant on Wave Turbulence. The authors thank Sergey Nazarenko and Herbert Spohn for enlightening conversations and pointing out some references. Part of this work was done while the authors were visiting ICERM (Brown University), which they wish to thank for its hospitality. The first author thanks Matthew Rosenzweig for helpful discussions related to Theorem 1.6.

Preliminary reductions
2.1. Reduction of (NLS). As in [12] we make the following reductions. Suppose $u$ is a solution to (NLS), define $a = a_k(t)$ such that where $M$ is the conserved mass of $u$, then it solves the equation with the nonlinearity for $\zeta \in \{\pm\}$. Here in (2.3) and below, the summation is taken over $(k_1, k_2, k_3) \in (\mathbb{Z}^d_L)^3$, and and the resonance factor Note that $\epsilon_{k_1k_2k_3}$ is always supported in the non-degenerate set if $p_j = q_j$ for some $1 \leq j \leq r$, and ) with $\nu > 0$ being an absolute constant and the implicit constants depending on $R$, where $R$ : Note that if $a_k(t)$ solves (2.2) then $e^{i\theta} a_k(t)$ solves the same equation, with the initial data obeying the same law. From this it is easy to deduce that $\mathbb{E}\prod_{j=1}^{r} a_{k_j}(t)^{p_j}\,\overline{a_{k_j}(t)}^{\,q_j}$ Below we will always assume As we consider the limit $L \to \infty$ with $R$ fixed, we may assume $R \leq \log L$. We shall introduce a simpler notation as follows. For $1 \leq j \leq r$, take $p_j$ copies of the variable $k_j$ with sign $+$ and $q_j$ copies of the variable $k_j$ with sign $-$, and rename them as ). For simplicity we will write $k_j$ instead of $k_j^*$ below. Then (2.7) and (2.8) result from the following unified and more precise estimate, namely (2.9) Here we denote $z^+ = z$ and $z^- = \overline{z}$, and the sum is taken over all partitions The first product is taken over all $\{j, j'\} \in \mathcal{P}$, and the second product is taken over all $1 \leq j \leq 2R$ such that $\zeta_j = +$. Finally $M_{\mathrm{kin}}$ is defined as and the implicit constant in (2.9) depends only on $(d, \beta, n_{\mathrm{in}})$ but not on $R$.
The goal for the rest of the paper is then to prove (2.9).
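To make the combinatorics behind (2.9) concrete, the following Python sketch enumerates partitions of $\{1, \cdots, 2R\}$ into two-element subsets. It assumes, as for the leaf pairings of couples recalled in Section 3, that each pair joins one index of sign $+$ with one index of sign $-$. The function name `sign_compatible_pairings` and the encoding of signs as `+1` and `-1` are illustrative choices, not notation from the paper.

```python
from itertools import permutations


def sign_compatible_pairings(signs):
    """Enumerate partitions of {1, ..., len(signs)} into pairs {j, j'}
    whose two members carry opposite signs.

    signs[j-1] is +1 or -1 (an illustrative encoding of zeta_j).
    Returns a list of frozensets of two-element frozensets.
    """
    plus = [j for j, s in enumerate(signs, start=1) if s == +1]
    minus = [j for j, s in enumerate(signs, start=1) if s == -1]
    if len(plus) != len(minus):
        # No such partition exists unless the signs are balanced.
        return []
    # Each bijection from the "+" indices to the "-" indices gives one pairing.
    return [
        frozenset(frozenset(pair) for pair in zip(plus, perm))
        for perm in permutations(minus)
    ]


if __name__ == "__main__":
    for pairing in sign_compatible_pairings([+1, -1, +1, -1]):
        print(sorted(sorted(p) for p in pairing))
```

Under this assumption, $R$ indices of each sign give exactly $R!$ pairings, one for each bijection between the two sign classes.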

2.3. Parameters and notations. Most of our parameters and notations are taken from [12]. First, we fix $\beta \in (\mathbb{R}^+)^d \setminus Z$, where $Z$ is defined by the following lemma.
Throughout this paper, we will use $C$ to denote any large constant that depends only on the dimension $d$, and use $C^+$ to denote any large constant that depends on $(d, \beta, n_{\mathrm{in}})$; these may differ from line to line. Note in particular that they do not depend on the value of $R$ in (2.9). The notations $X \lesssim Y$ and $X = O(Y)$ will mean $X \leq C^+ Y$ unless otherwise stated.
Recall that $A \geq 40d$ and $\delta$, which is small enough depending on $A$ and $C^+$, are fixed as in Theorem 1.3. We also fix $\nu = (100d)^{-1} \ll 1$ and define $N = (\log L)^4$. Note that the value of $N$ is different from the one in [12]. As above we assume $R \leq \log L$. For later purposes we may need slightly larger values (like $2R$), but all our proofs work equally well as long as $R \leq 2\log L$, which will be satisfied throughout the paper. Note that we do not assume any inequality between $\delta$ and $R$.
We adopt the shorthand notation $k[A] = (k_j)_{j\in A}$ and similarly for other vectors, and also define $d\alpha[A] = \prod_{j\in A} d\alpha_j$. We also use multi-indices $\rho$ with the usual notations. Define the time Fourier transform (the meaning of $\widehat{\cdot}$ later may depend on the context) Define the $X^\kappa$ norm for functions where $\widehat{\cdot}$ denotes the Fourier transform in $t$ or $(t, s)$. In the case when $F$ or $G$ does not depend on $k$, this norm will not depend on $\kappa$ and will be denoted by $X$. Define the localized version $X^\kappa_{\mathrm{loc}}$ (and similarly $X_{\mathrm{loc}}$) as If we will only use the value of $G$ in some set (for example $\{t > s\}$ in Proposition 3.8), then in the above definition we may only require G = G in this set. Define the $Z$ norm for function $a = a_k(t)$, (2.13)

3. A brief summary of [12]

The results of this section are proved in [12]. Here we state the relevant propositions and definitions that will be needed in the proof below.
3.1. Trees, couples, and decorations. We first recall the definitions of trees, couples, and decorations, which are drawn directly from [12].

Definition 3.1 (Definition 2.1 in [12]). A ternary tree $T$ (we will simply say a tree below) is a rooted tree where each non-leaf (or branching) node has exactly three children nodes, which we shall distinguish as the left, mid and right ones. We say $T$ is trivial (and write $T = \bullet$) if it consists only of the root, in which case this root is also viewed as a leaf.
We denote generic nodes by $n$, generic leaves by $l$, the root by $r$, the set of leaves by $\mathcal{L}$ and the set of branching nodes by $\mathcal{N}$. The scale of a tree $T$ is defined by $n(T) = |\mathcal{N}|$. A tree $T$ may have sign $+$ or $-$. If its sign is fixed then we decide the signs of its nodes as follows: the root $r$ has the same sign as $T$, and for any branching node $n \in \mathcal{N}$, the signs of the three children nodes of $n$ from left to right are $(\zeta, -\zeta, \zeta)$ if $n$ has sign $\zeta \in \{\pm\}$. Once the sign of $T$ is fixed, we will denote the sign of $n \in T$ by $\zeta_n$. Define the conjugate $\overline{T}$ of a tree $T$ to be the same tree but with opposite sign.

Definition 3.2 (Definition 2.2 in [12]). A couple $Q$ is an unordered pair $(T^+, T^-)$ of two trees $T^\pm$ with signs $+$ and $-$ respectively, together with a partition $\mathcal{P}$ of the set $\mathcal{L}^+ \cup \mathcal{L}^-$ into $(n + 1)$ pairwise disjoint two-element subsets, where $\mathcal{L}^\pm$ is the set of leaves for $T^\pm$, and $n = n^+ + n^-$ where $n^\pm$ is the scale of $T^\pm$. This $n$ is also called the scale of $Q$, denoted by $n(Q)$. The subsets $\{l, l'\} \in \mathcal{P}$ are referred to as pairs, and we require that $\zeta_{l'} = -\zeta_l$, i.e. the signs of paired leaves must be opposite. If both $T^\pm$ are trivial, we call $Q$ the trivial couple (and write $Q = \times$).
For a couple $Q = (T^+, T^-, \mathcal{P})$ we denote the set of branching nodes by $\mathcal{N}^* = \mathcal{N}^+ \cup \mathcal{N}^-$, and the set of leaves by $\mathcal{L}^* = \mathcal{L}^+ \cup \mathcal{L}^-$; for simplicity we will abuse notation and write $Q = T^+ \cup T^-$. We also define a paired tree to be a tree where some leaves are paired to each other, according to the same pairing rule as for couples. We say a paired tree is saturated if there is only one unpaired leaf (called the lone leaf). In this case the tree forms a couple with the trivial tree $\bullet$.
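The definitions of trees and couples can be checked on small examples. The following Python sketch models a signed ternary tree and verifies two elementary facts: a tree of scale $n$ has $2n + 1$ leaves, and the signs of its leaves sum to the sign of the root. Together these explain why the leaves of a couple of scale $n$ can be split into exactly $n + 1$ pairs of opposite signs. The sketch assumes the convention of [12] that the children of a node of sign $\zeta$ have signs $(\zeta, -\zeta, \zeta)$ from left to right; the names `Node`, `branch`, `scale` and `leaves` are hypothetical helpers, not notation from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    sign: int = +1                                         # zeta_n, encoded as +1 or -1
    children: List["Node"] = field(default_factory=list)  # empty, or exactly three nodes

    def is_leaf(self) -> bool:
        return not self.children


def branch(node: Node) -> List[Node]:
    """Turn a leaf into a branching node with (left, mid, right) children.

    Assumed sign rule: a node of sign s gets children of signs (s, -s, s).
    """
    assert node.is_leaf()
    s = node.sign
    node.children = [Node(s), Node(-s), Node(s)]
    return node.children


def scale(node: Node) -> int:
    """Number of branching nodes in the subtree rooted at node."""
    if node.is_leaf():
        return 0
    return 1 + sum(scale(c) for c in node.children)


def leaves(node: Node) -> List[Node]:
    """Leaves of the subtree rooted at node, from left to right."""
    if node.is_leaf():
        return [node]
    return [leaf for c in node.children for leaf in leaves(c)]


if __name__ == "__main__":
    root = Node(+1)
    left, mid, right = branch(root)
    branch(mid)
    n = scale(root)
    print(n, len(leaves(root)), sum(l.sign for l in leaves(root)))
```

For a couple $(T^+, T^-)$ the leaf signs therefore sum to zero, and the $2(n+1)$ leaves contain $n + 1$ of each sign.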
L for each node n, and that for each branching node n ∈ N , where ζ n is the sign of n as in Definition 3.1, and n 1 , n 2 , n 3 are the three children nodes of n from left to right.Clearly a decoration D is uniquely determined by the values of (k l ) l∈L .For k ∈ Z d L , we say D is a k-decoration if k r = k for the root r.Given a decoration D, we define the coefficient where k 1 k 2 k 3 is as in (2.4).Note that in the support of D we have that (k n 1 , k n 2 , k n 3 ) ∈ S for each n ∈ N .We also define the resonance factor Ω n for each n ∈ N by and moreover k l = k l for each pair {l, l } ∈ P. We define E := D + D − , and define the resonance factors Ω n for n ∈ N * as in (3.2).Note that we must have Finally, we can define decorations D of paired trees, as well as D and Ω n etc., similar to the above.Definition 3.4 (Definition 4.2 in [12]).Define a regular couple to be a couple formed from the trivial couple × by repeatedly applying one of the steps A and B, where in step A one replaces a pair of leaves with a (1, 1)-mini couple, and in step B one replaces a node with a mini tree.Here a (1, 1)-mini couple is a couple formed by two trees each of scale 1 such that no siblings are paired, and a mini tree is a saturated paired tree of scale 2 such that no siblings are paired.See Figures 1-3.We also define a regular tree to be a saturated paired tree T , such that T forms a regular couple with the trivial tree.This is equivalent to the definition in Remark 4.15 of [12], namely that T can be obtained from a regular chain by replacing each leaf pair with a regular couple.Here a regular chain (see Definition 4.6 of [12]) is defined to be the result of repeatedly applying step B at a branching node or the lone leaf starting from the trivial tree •.Note that the scale of a regular couple or a regular tree is always even.Proof.See [12], Lemma 6.6.

Expansion ansatz and regular couples.
The following results are taken from [12].
Proposition 3.7. For any tree $T$, define Here in (3.4), $n$ is the scale of $T$, $\zeta(T) = \prod_{n\in\mathcal{N}} (i\zeta_n)$, $D$ runs over all $k$-decorations of $T$, and $D$ is the domain We may expand $a_k(t)$ as where the second sum is taken over all trees $T^+$ of sign $+$ such that $n(T^+) = n$.
The remainder b satisfies the equation where the terms on the right hand side are defined by In (3.8) the summations are taken over (u, v, w), each of which being either b or J n for some 0 ≤ n ≤ N ; moreover in the summation (j) for 0 ≤ j ≤ 2, exactly j inputs in (u, v, w) equals b, and in the summation (0) we require that the sum of the three n's in the J n 's is at least N .Lastly, uniformly in t ∈ [0, 1] and k ∈ Z d L , we have that Proof.The expansion (3.6) is introduced in Sections 2.2.1 and 2.2.2 in [12], and (3.4) follows by combining the formulas in Section 5.1 of [12].The equation (3.7) for b is deduced in Section 2.2.2 of [12].Finally (3.9) is a qualitative version of Theorem 1.1, which is proved in Section 12 of [12].
Note that here we are choosing $N = (\log L)^4$ instead of $N = \log L$, but the proof is not affected as long as (say) $N \ll L^{\delta^2}$.
Proposition 3.8.For any couple Q, define (3.10) Here in (3.10), n is the scale of Q, ζ * (Q) = n∈N * (iζ n ), E runs over all k-decorations of Q, the last product is taken over all l ∈ L * with sign +, and E is the domain Now suppose Q is a regular couple with scale 2n where n ≤ (log L) 50 , then there exist a function (K Q ) app (t, s, k), which is the sum of at most 2 n terms, such that each term has the form δ n •J A(t, s)• M(k) (with possibly different J A and M for different terms), and that Similarly, for any regular tree T with lone leaf l * , define Here in (3.14), n is the scale of T , ζ(T ) = n∈N (iζ n ), D runs over all k-decorations of Q, the last product is taken over all l ∈ L\{l * } with sign +, and D is the domain where (l * ) p is the parent of l * .Suppose T has scale 2n where n ≤ (log L) 20 , then there exist a function (K * T ) app (t, s, k), which is the sum of at most 2 n terms, such that each term has the form δ n • J A * (t, s) • M * (k) (with possibly different J A * and M * for different terms), and that Proof.This follows from Propositions 6.7 and 6.10 of [12].Whether the upper bound for n is (log L) 6 or (log L) 50 does not affect the proof (again, as long as n L δ 2 ).), such that T j has sign ζ j for 1 ≤ j ≤ 2R, together with a partition P of the set of leaves in all T j into two-element subsets (again called pairings) such that the two paired leaves have opposite signs, see Figure 4.The width of the garden is defined to be 2R, which is always an even number.The scale n(G) of a garden G is the sum of scales of all T j (1 ≤ j ≤ 2R).We denote to be the set of branching nodes, where L j and N j are the sets of leaves and branching nodes of T j .Note that a garden of width 2 is just a couple.If the set {1, • • • , 2R} can be partitioned into two-element subsets such that for each such subset {j, j }, the leaves in T j and T j are all paired with each other (in particular ζ j = −ζ j ), then we say this garden is a multi-couple.In this case, 
this garden is formed by R couples (T j , T j ).If each of them is a regular couple then we say the multi-couple is regular.A trivial garden is a garden when all T j are trivial trees; note that it is always a regular multi-couple (formed by R trivial couples).If in a garden G, no two trees T j and T j have all their leaves paired with each other, then we say the garden is mixed.Definition 4.2.Given a garden G, a decoration of G, denoted by I , is a set of vectors (k n ) n∈G where n runs over all nodes of G, such that (k n ) n∈T j is a decoration of T j for each 1 ≤ j ≤ 2R, and k n = k n for each pair of leaves {n, n }.Given vectors (k , where r j is the root of T j .For any branching node n ∈ N * , define Ω n as in (3.2), see Figure 4. We also define I = 2R j=1 D j , where D j is defined as in (3.1), with D j being the restriction of I to T j .Proposition 4.4.For any garden G there exists a unique prime garden G sk such that G is obtained from G sk by applying steps A and B. This G sk is called the skeleton of G. Finally, G sk is a trivial garden, if and only if G is a regular multi-couple.
Proof. The proof is the same as that of Proposition 4.13 of [12]. For the convenience of the reader we present the proof here. Denote the inverse operations of A and B by $\overline{A}$ and $\overline{B}$, where one collapses a (1, 1)-mini couple or a mini tree to a leaf pair or a single node. To prove existence of $G_{sk}$, by definition, one just needs to repeatedly apply $\overline{A}$ and $\overline{B}$ until no such operation is possible.

To prove uniqueness of $G_{sk}$, we make one key observation: if $G$ contains two basic objects (i.e. (1, 1)-mini couples or mini trees), and $D_1$ and $D_2$ are the inverse operations associated with them, then $D_1 D_2 = D_2 D_1$. In fact, this just says that collapsing one of the basic objects does not affect the other, which can be directly verified by definition. Now we can prove the uniqueness of $G_{sk}$ by induction on the scale. The base case is easy. Suppose uniqueness is true for gardens of smaller scale; then for any $G$ we look for (1, 1)-mini couples and mini trees (Definition 3.4). If there is none then $G$ is already prime; if there is only one, then we apply $\overline{A}$ or $\overline{B}$ to collapse it and apply the induction hypothesis to the resulting garden. If there is more than one, then we apply $\overline{A}$ or $\overline{B}$ to collapse any one of them and apply the induction hypothesis to the resulting garden. The final result does not depend on the first $\overline{A}$ or $\overline{B}$ we choose, because any two such steps, which can be performed for the original $G$, must commute as proved above. Therefore $G_{sk}$ is unique.

Proposition 4.5. Suppose $G$ is a garden with skeleton $G_{sk}$. Then $G$ is formed from $G_{sk}$ by replacing each leaf pair with a regular couple and each branching node with a regular tree, see Figure 5. This representation is unique. Here each $T_j$ represents a regular tree, and each $Q_j$ represents a regular couple.
Proof. The proof is basically the same as that of Proposition 4.14 of [12]. To prove existence, we can induct on the scale of $G$. The base case $G = G_{sk}$ is obvious. Suppose the result is true for $G$, and let $G^+$ be obtained from $G$ by applying A or B. We know that $G$ is obtained from $G_{sk}$ by replacing each branching node with a regular tree $T_j$ ($1 \leq j \leq n$), and replacing each leaf pair by a regular couple $Q_j$ ($1 \leq j \leq m$). Then:
(1) If one applies step A, then this step A must be applied either at a leaf pair belonging to some regular couple $Q_i$ ($1 \leq i \leq m$), or at a leaf pair belonging to some regular tree $T_i$ ($1 \leq i \leq n$). In this case the other regular trees and regular couples remain the same, and the regular tree $T_i$ or regular couple $Q_i$ is replaced by $AT_i$ or $AQ_i$.
(2) If one applies step B, then this step B must be applied either at a node belonging to some regular couple $Q_i$ ($1 \leq i \leq m$), or at a node belonging to some regular tree $T_i$ ($1 \leq i \leq n$). In this case the other regular trees and regular couples remain the same, and the regular tree $T_i$ or regular couple $Q_i$ is replaced by $BT_i$ or $BQ_i$.
In either case, notice that a regular tree or a regular couple still remains a regular tree or a regular couple after applying step A or B. This proves existence. Now to prove uniqueness of the representation, note that by Definition 3.4, the process of forming $G$ from $G_{sk}$ can also be described as follows: (i) first replace each branching node of $G_{sk}$ by a regular chain, forming a garden $G_{int}$; (ii) then replace each leaf pair in $G_{int}$ by a regular couple to form $G$. Given $G_{sk}$, clearly $G_{int}$ uniquely determines the regular chains in step (i), and also uniquely determines the regular couples in step (ii) replacing the leaf pairs in $G_{int}$, so it suffices to show that $G$ uniquely determines $G_{int}$. Now we can show, via a case-by-case argument, that $G_{int}$ contains no nontrivial regular sub-couple (i.e. no two subtrees rooted at two nodes in $G_{int}$ form a nontrivial regular couple). Since $G$ is formed from $G_{int}$ by replacing each leaf pair with a regular couple, we see that $G_{int}$ can be reconstructed by collapsing each maximal regular sub-couple (under inclusion) in $G$ to a leaf pair (because any regular sub-couple of $G$ must be a sub-couple of one of the regular couples in $G$ replacing a leaf pair in $G_{int}$). Clearly this collapsing process is commutative as explained in the proof of Proposition 4.4, hence the resulting garden $G_{int}$ is unique. This completes the proof.

Proposition 4.6. Given any $G_{sk}$, the number of gardens $G$ that have scale $m$, width $2R$ and skeleton Proof. This is basically the same as Corollary 4.16 in [12]. If $G$ has scale $m$ and width $2R$, then $G_{sk}$ has scale at most $m$ and width at most $2R$. Given $G_{sk}$, to construct $G$, using Proposition 4.5, we just need to choose a regular tree at each branching node of $G_{sk}$, and a regular couple at each leaf pair of $G_{sk}$. Note that the number of branching nodes in $G_{sk}$ is at most $m$, and the number of leaf pairs in $G_{sk}$ is at most $m + R$.
Thus the number of choices for G is at most where m = 2m + R, and C 0 is an absolute constant as in Proposition 3.5.

Expressions M
) and scale m, and k j ∈ Z d L for each 1 ≤ j ≤ 2R, and time t ∈ [0, 1], define Here in (4.1), ζ * (G) = n∈N * (iζ n ) and I = 2R j=1 D j where D j is the restriction of I to T j (which is a k j -decoration of T j ), the sum is taken over all (k 1 , • • • , k 2R )-decorations I , the last product is taken over all l ∈ L * with sign +, and I is the domain By using Isserlis' theorem (Lemma A.2 in [12]) and repeating the arguments in Section 2.2.3 of [12], we can obtain, for any tree T j (1 ≤ j ≤ 2R) with sign ζ j , that Here the sum is taken over all possible pairings P that make (T 1 , • • • , T 2R ) a garden, and G is the resulting garden.We can reduce (2.9) to the following two propositions.Here Proposition 4.7 is the key component, and Proposition 4.8 follows from similar arguments.Note also that Proposition 4.8 is actually an improvement of Propositions 12.1-12.2 of [12], where the decay of exceptional probability is improved from L −A to e −(log L) 3 .
where the sum is taken over all mixed gardens G = (T 1 , • • • , T 2R ) of width 2R and signature ) such that the scale of T j is m j for 1 ≤ j ≤ 2R, then we have uniformly in t and in (k Proposition 4.8.With probability ≥ 1 − e −(log L) 3 , we have for all 0 ≤ n ≤ N 3 , as well as ) Here R and L are defined in (3.8), and the Z norm is defined in (2.13).
Proof of Theorem 1.3. We only need to prove (2.9). Let $E_1$ be the event that (NLS) has a smooth solution on $[0, T]$, and $E \subset E_1$ be the event that Proposition 4.8 holds, then $\mathbb{P}(E_1) \geq \mathbb{P}(E) \geq 1 - e^{-(\log L)^3}$.
Note that, under the assumption E, we can bound the remainder b defined in (3.6) by b Z ≤ e −(log L) 4 .This can be proved similarly as in Proposition 12.3 of [12].In fact, the equation (3.7) satisfied by b can be written as We view this as the fixed point equation for a contraction mapping from the set {b ∈ Z : b Z ≤ e −(log L) 4 } to itself, hence the solution b is unique and satisfies the desired bound.The contraction mapping property follows from the estimates (using also the definition of B and C , see (3.8)) ) L n Z→Z ≤ e 2(log L) 3 (∀ 0 ≤ n ≤ N ), (4.12) so we may replace E 1 by E in (4.14).Under the assumption E, we may expand a k j (t) using (3.6), which leads to different combinations of terms.
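The fixed point argument for the remainder $b$ can be illustrated by a scalar toy model. The sketch below is only an illustration under strong assumptions: the operators in (3.7) are replaced by hypothetical real coefficients `r`, `ell`, `beta` and `gamma`, and the equation is taken in the schematic form $b = r + \ell b + \beta b^2 + \gamma b^3$, which is an assumption about the structure of (3.7) and not a formula quoted from the paper. The linear part is inverted first, and the resulting map is iterated starting from $b = 0$.

```python
def solve_remainder(r, ell, beta, gamma, tol=1e-15, max_iter=200):
    """Toy scalar analogue of a contraction mapping argument.

    Solves b = r + ell*b + beta*b**2 + gamma*b**3 by iterating
        b -> (r + beta*b**2 + gamma*b**3) / (1 - ell),
    which is a contraction near 0 when r is small and ell is away from 1.
    All coefficients are hypothetical placeholders.
    """
    b = 0.0
    for _ in range(max_iter):
        new_b = (r + beta * b * b + gamma * b ** 3) / (1.0 - ell)
        if abs(new_b - b) <= tol:
            return new_b
        b = new_b
    raise RuntimeError("fixed point iteration did not converge")


if __name__ == "__main__":
    b = solve_remainder(r=1e-6, ell=0.5, beta=1.0, gamma=1.0)
    print(b)
```

In this toy model the smallness of the source term `r` plays the role of the smallness of the forcing in the equation for $b$, and it keeps the iterates inside a small ball on which the quadratic and cubic terms are negligible perturbations.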
Consider the terms where all factors are of form J n .For such factors we will also replace 1 E by 1, and deal with the resulting error term later.As such, we get a contribution For fixed (m 1 , • • • , m 2R ), using the second expansion in (3.6) and ( 4.3), we can write where the sum is taken over all gardens such that the scale of T j is m j for 1 ≤ j ≤ 2R.Note that by definition, each G is uniquely expressed as the union of some couples and a mixed garden; suppose the number of couples is R 1 ≤ R, and ) to be nonzero one must have k j = k j .For P fixed, the contribution of this part of sum equals {j,j }∈P where for fixed {j, j } ∈ P, the sum is taken over all couples Q = {T j , T j } such that the two trees have signs ζ j and ζ j and scales m j and m j respectively, and the equality in (4.17) follows from (4.3).Now, upon summing over all choices for (m 1 , • • • , m 2R ) and using (3.9), we obtain that this contribution equals P {j,j }∈P where in the last inequality we have used that 1 + |n(δt, k j )| ≤ M kin for each j.
Next, consider the contribution where R 2 > 0. Up to a factor 2R 2R 2 R 1 !≤ (2R) 2R 2 R! and a permutation, we may assume and sum over the other m j , then in the same way as above, we can bound the corresponding contribution by where the sum is taken over all mixed gardens such that the scale of T j is m j .By Proposition 4.7 we have that Upon summing over (m 1 , • • • , m 2R 2 ) and using that R ≤ log L, we can bound this contribution by the right hand side of (2.9).Finally, we show that all the remainder terms are bounded by the right hand side of (2.9).In fact, the above arguments imply that in particular we have From now on we will focus on the proof of Propositions 4.7-4.8.

Irregular chains
5.1. Reduction to prime gardens. Let $G_{sk}$ be the skeleton of a garden $G$, which is then a prime garden. By Proposition 4.5, $G$ can be obtained from $G_{sk}$ by replacing each branching node $m$ with a regular tree $T^{(m)}$, and replacing each leaf pair $\{m, m'\}$ in $G_{sk}$ with a regular couple $Q^{(m,m')}$. Similar to Section 8.1 of [12], using Proposition 3.8, we can reduce For the sake of completeness we briefly recall the reduction process below.
Recall that where m is the scale of G, I is the domain defined in (4.2),I is a (k 1 , • • • , k 2R )-decoration and other objects are defined as before, all associated to the garden G.By definition, the restriction of I to nodes in G sk forms a (k 1 , • • • , k 2R )-decoration of G sk , and the relevant quantities such as Ω n are the same for both decorations (i.e. each Ω n in the decoration of G sk uniquely corresponds to some Ω n in the corresponding decoration of G).Now, let {m, m } be a leaf pair in G sk , which becomes the roots of the regular sub-couple Q (m,m ) in G.We must have k m = k m .In (9.10), consider the summation in the variables k n , where n runs over all nodes in Q (m,m ) other than m and m (these variables, together with k m and k m , form a k m -decoration of Q (m,m ) ), and the integration in the variables t n , where n runs over all branching nodes in Q (m,m ) , with all the other variables fixed.By definition, this summation and integration equals, up to some sign ζ * (Q (m,m ) ) and some power of δ(2L d−1 ) −1 , the exact expression Here we assume ζ m = + and ζ m = −, and m p is the parent of m (if m is the root of some tree then t m p should be replaced by t; similarly for (m ) p ).The relevant notations here and below are defined as in Proposition 3.8.
Similarly, let $m$ be a branching node in $G_{sk}$, which becomes the root $p$ and lone leaf $q$ of a regular tree $T^{(m)}$ in $G$. We must have $k_p = k_q$. In (9.10), consider the summation in the variables $k_n$, where $n$ runs over all nodes in $T^{(m)}$ other than $p$ and $q$ (these variables, together with $k_p$ and $k_q$, form a $k_m$-decoration of $T^{(m)}$), and the integration in the variables $t_n$, where $n$ runs over all branching nodes in $T^{(m)}$, with all the other variables fixed. In the same way, this summation and integration equals, up to some sign $\zeta(T^{(m)})$ and some power of $\delta(2L^{d-1})^{-1}$, the exact expression $K^*_{T^{(m)}}(t_{p^p}, t_q, k_p)$. Here $p^p$ is the parent of $p$ (again, if $p$ is the root of some tree then $t_{p^p}$ should be replaced by $t$).
After performing this reduction for each leaf pair and branching node of G sk , we can reduce the summation in (9.10) to the summation in k m for all leaves and branching nodes m of G sk , i.e. a (k 1 , • • • , k 2R )-decoration of G sk .Moreover, we can reduce the integration in (9.10) to the integration in t m for all branching nodes m of G sk (for a regular tree, the time variables t p p and t q for G correspond to t m p and t m for G sk where m p is the parent of m).This implies that where m 0 is the scale of G sk , I sk is the domain defined in (4.2), , the other objects are as before but associated to the garden G sk .Moreover in (5.2), the first product is taken over all leaves m of sign + with m being the leaf paired to m, the second product is taken over all branching nodes m, and m p is the parent of m.Using Proposition 3.8, in (5.2) we can decompose Here (K Q (m,m ) ) app and (K * T (m) ) app are the leading terms in Proposition 3.8, and each of them is a linear combination of functions of (t, s) multiplied by functions of k, which in turn satisfy (3.12) and (3.16); the remainders R Q (m,m ) and R * T (m) satisfy (3.13) and (3.17).We may fix a mark in {L, R} for each leaf pair and each branching node in G sk which indicates whether we select the leading term (• • • ) app or the remainder term R or R * ; for a general garden G we can do the same but only for the nodes of its skeleton G sk .In this way we can define marked gardens, which we still denote by G, and expressions of form (5.2) but with K Q (m,m ) and K * T (m)   replaced by the corresponding leading or remainder terms, which we still denote by M G .By definition, any sum of M G over unmarked gardens G equals the corresponding sum over marked gardens G for all possible unmarked gardens and all possible markings.
In the next section we will define the notion of irregular chains to exhibit the cancellation between $M_G$ for some different gardens $G$ with specific symmetries.

5.2. Irregular chains and congruence. The notion of irregular chains for gardens is defined in the same way as for couples, see Section 8.2 of [12].

Definition 5.1 (Definition 8.1 of [12]). Given a garden $G$ (or a paired tree $T$), we define an irregular chain to be a sequence of nodes $(n_0, \cdots, n_q)$, such that (i) $n_{j+1}$ is a child of $n_j$ for $0 \leq j \leq q - 1$, and the other two children of $n_j$ are leaves, and (ii) for $0 \leq j \leq q - 1$, there is a child $m_j$ of $n_j$, which has opposite sign with $n_{j+1}$, and is paired (as a leaf) to a child $p_{j+1}$ of $n_{j+1}$. We also define $p_0$ to be the child of $n_0$ other than $n_1$ and $m_0$.

Definition 5.2 (Definition 8.2 of [12]). Consider any irregular chain $H = (n_0, \cdots, n_q)$. By Definition 5.1, we know $p_j$ is the child of $n_j$ other than $n_{j+1}$ and $m_j$ for $0 \leq j \leq q - 1$, thus $p_j$ has the same sign as $n_j$ (hence it is either its first or third child). Now for two irregular chains $H = (n_0, \cdots, n_q)$ and $H' = (n'_0, \cdots, n'_q)$, with $p_j$ and $p'_j$ etc. defined accordingly, we say they are congruent, if $\zeta_{n_0} = \zeta_{n'_0}$, and for each $0 \leq j \leq q - 1$, either $p_j$ is the first child of $n_j$ and $p'_j$ is the first child of $n'_j$, or $p_j$ is the third child of $n_j$ and $p'_j$ is the third child of $n'_j$, counting from left to right.
In particular, if $q$ and the congruence class (and hence $\zeta_{n_0}$) are fixed, then an irregular chain $H$ is uniquely determined by the signs $\zeta_{n_j}$ for $1 \leq j \leq q$. We relabel the nodes $n_j, p_j$ ($0 \leq j \leq q$) by defining $\{b_j, c_j\} = \{n_j, p_j\}$, and that $b_j = n_j$ if and only if $\zeta_{n_j} = +$. Further, we label the two children of $n_q$ other than $p_q$ as $e$ and $f$, with $\zeta_e = +$ and $\zeta_f = -$.

Proposition 5.3 (Proposition 8.3 of [12]). Let $H = (n_0, \cdots, n_q)$ be an irregular chain. For any decoration $D$ (or $E$), its restriction to $n_j$ ($0 \leq j \leq q$) and their children is uniquely determined by $2(q + 2)$ vectors $k_j, \ell_j \in \mathbb{Z}^d_L$ ($0 \leq j \leq q + 1$), such that $k_{b_j} = k_j$ and $k_{c_j} = \ell_j$ for $0 \leq j \leq q$, and $k_e = k_{q+1}$ and $k_f = \ell_{q+1}$. These vectors satisfy $k_j - \ell_j = h$ for all $0 \leq j \leq q + 1$, with $h$ independent of $j$, and for each $0 \leq j \leq q$ we have $\zeta_{n_j}\Omega_{n_j} = 2\langle h, k_{j+1} - k_j\rangle_\beta$. Moreover $\epsilon_{k_{n_{j1}}k_{n_{j2}}k_{n_{j3}}} = \epsilon_{k_{j+1}\ell_{j+1}\ell_j}$, where $(n_{j1}, n_{j2}, n_{j3})$ are the children of $n_j$ from left to right. We say this decoration has small gap, large gap or zero gap with respect to $H$, if we have $0 < |h| \leq \frac{1}{100\delta L}$, $|h| > \frac{1}{100\delta L}$ or $h = 0$.

Proof. See Proposition 8.3 of [12].

Definition 5.4 (Definition 8.4 of [12]). Let $H = (n_0, \cdots, n_q)$ be an irregular chain contained in a garden $G$ or a paired tree $T$. If we replace $H$ by a congruent irregular chain $H' = (n'_0, \cdots, n'_q)$, then we obtain a modified garden $G'$ or paired tree $T'$ by (i) attaching the same subtrees of $e$ and $f$ in $G$ (or $T$) to the bottom of $e'$ and $f'$, and (ii) assigning to $n'_0$ the same parent as $n_0$ and keeping the rest of the garden unchanged.
Given a marked prime garden $G_{sk}$, we identify all the maximal irregular chains $H = (n_0, \cdots, n_q)$, such that $q \geq 10^3 d$, and all $n_j$ and their children have mark L. For each such maximal irregular chain $H$, consider $H^\circ = (n_5, \cdots, n_{q-5})$ formed by omitting 5 nodes at both ends (so that it does not affect other possible irregular chains). We define another marked prime garden $G'_{sk}$ to be congruent to $G_{sk}$, if it can be obtained from $G_{sk}$ by changing each of the irregular chains $H^\circ$ to a congruent irregular chain, as described above.
Given a marked garden $G$, we define $G'$ to be congruent to $G$, if it can be formed as follows. First obtain the (marked) skeleton $G_{sk}$ and change it to a congruent marked prime garden $G'_{sk}$. Then, we attach the regular couples $Q^{(m,m')}$ and regular trees $T^{(m)}$ from $G$ to the relevant leaf pairs and branching nodes of $G'_{sk}$. Note that if an irregular chain $H^\circ$ in $G_{sk}$ is changed to a congruent chain in $G'_{sk}$, with relevant nodes $m'_j$, $p'_j$ etc. as in Definition 5.1, then for $0 \leq j \leq q - 1$, the same regular couple $Q^{(m_j,p_{j+1})}$ is attached to the leaf pair $\{m'_j, p'_{j+1}\}$ in $G'_{sk}$. Similarly, for $1 \leq j \leq q$, if $\zeta_{n'_j} = \zeta_{n_j}$ then the same regular tree $T^{(n_j)}$ is placed at the branching node $n'_j$ in $G'_{sk}$; otherwise the conjugate regular tree $\overline{T^{(n_j)}}$ is placed at $n'_j$.
Note that the congruence relation preserves the scale of each tree of a garden; that is, if $G = (T_1, \cdots, T_{2R})$ and $G' = (T'_1, \cdots, T'_{2R})$ are congruent, then the scale of $T_j$ equals the scale of $T'_j$ for $1 \leq j \leq 2R$.

Expressions associated with irregular chains.
We shall analyze the expressions associated with irregular chains, in the same way as Section 8.3 of [12].
Given one congruence class F of marked gardens as in Definition 5.4, consider the sum which is taken over all marked gardens G ∈ F .Let the lengths of all the irregular chains H • involved in the congruence class F , as in Definition 5.4, be q Since these irregular chains do not affect each other, we may focus on one individual chain, say ; that is, we only sum over G ∈ F obtained by altering this irregular chain H • .
In the summation and integration in (5.2), we will first fix all the variables $k_n$ and $t_n$, except $k_n$ with $n \in \{n_j, p_j, m_{j-1}\}$ ($1 \leq j \leq q$) and $t_n$ with $n = n_j$ ($1 \leq j \leq q - 1$), and sum and integrate over these variables. Note that we are fixing $k_{n_0}$ and $k_{p_0}$ as well as $k_e$ and $k_f$, in the notation of Definition 5.2, and are thus fixing $(k_0, \ell_0, k_{q+1}, \ell_{q+1})$ and $k_0 - \ell_0 = k_{q+1} - \ell_{q+1} = h$ as in Proposition 5.3. It is easy to see that in the summation and integration in (5.2) over the fixed variables (i.e. those $k_n$ and $t_n$ not in the above list), the summand and integrand do not depend on the way $H^\circ$ is changed, because the rest of the garden is preserved under the change of $H^\circ$, by Definition 5.4.
We thus only need to consider the sum and integral over the variables listed above.By Proposition 5.3, this is the same as the sum over the variables k j (1 ≤ j ≤ q), with j := k j − h, and integral over the variables t j := t n j (1 ≤ j ≤ q − 1), which satisfies t 0 > t 1 > • • • > t q−1 > t q with t 0 := t n 0 and t q := t nq .For any possible choice of H • (there are 2 q of them), the sum and integral can be written, using (5.2) and Proposition 5.3, as Here in (5.5), we have ) app and K * j = (K * T (n j ) ) app where T (n j ) is chosen to have sign +; note that if T is the regular tree conjugate to T then K * T = K * T , and the same holds for the leading contribution (• • • ) app .
Note that, to calculate the above-mentioned contribution (i.e. the sum (5.4) with only $H^\circ$ altered), we need to sum over all possible choices of $H^\circ$ (i.e. all possible choices of $\zeta_{n_j}$ ($1 \leq j \leq q$)), in addition to the summation and integration in (5.5). This results in the expression
$$\sum_{\zeta_{n_j}\in\{\pm\}\,(1\leq j\leq q)} (5.5) = \text{some function of } (k_0, \ell_0, k_{q+1}, \ell_{q+1}, t_0, t_q). \tag{5.6}$$
Now (5.6) is exactly the same expression that is explicitly calculated in Sections 8.3.1 and 8.3.2 of [12], so we shall take the results of such calculations from [12] and apply them below. There are three cases depending on the value of $h := k_0 - \ell_0$.
(1) The zero gap case ($h = 0$): this is very easy, as we have $k_j = \ell_j$, so in view of the $\epsilon_{k_{j+1}\ell_{j+1}\ell_j}$ factors we must have $k_1 = \cdots = k_q = k_0$, so the expression (5.5) gains a large negative power of $L$, and can be treated in the same way as the small gap term below.
(2) The small gap case ($0 < |h| \leq (100\delta L)^{-1}$): we have * tq e πiλtq dσdλ. (5.7) Here $m_{tot}$ is the sum of the scales of all regular couples $Q^{(p_j,m_{j-1})}$ and regular trees $T^{(n_j)}$, , and the functions G and P satisfy (5.8)
(3) The large gap case ($|h| > (100\delta L)^{-1}$): we have the same expression (5.7) and the same bound (5.8), but the factor $L^{-40d}$ on the right hand side of the second inequality of (5.8) should be replaced by 1.
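The three cases above depend only on the size of the gap vector $h$ relative to the threshold $(100\delta L)^{-1}$. The following Python sketch classifies a given $h$ accordingly. It assumes that $|h|$ denotes the Euclidean norm and that $h$ is passed as a tuple of floats; the function name `gap_type` and the returned labels are illustrative and do not appear in the paper.

```python
import math


def gap_type(h, delta, L):
    """Classify a gap vector h as 'zero', 'small' or 'large'.

    Uses the threshold 1 / (100 * delta * L); |h| is taken to be the
    Euclidean norm, which is an assumption made for this illustration.
    """
    norm = math.sqrt(sum(x * x for x in h))
    if norm == 0.0:
        return "zero"
    threshold = 1.0 / (100.0 * delta * L)
    return "small" if norm <= threshold else "large"


if __name__ == "__main__":
    # With delta = 0.001 and L = 100 the threshold equals 0.1.
    print(gap_type((0.01, 0.0, 0.0), delta=0.001, L=100))
    print(gap_type((0.5, 0.0, 0.0), delta=0.001, L=100))
```

Since nonzero vectors in $\mathbb{Z}^d_L$ have norm at least $1/L$, the small gap case in this sketch can only occur when $\delta \leq 1/100$.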
Below we will ignore the zero gap case. In the other two cases, we define the new marked garden $\mathcal{G}^{<}_{sk}$ as follows. In the small gap case, and in the large gap case assuming also $k_0 \neq k_{q+1}$, we remove the whole chain $\mathcal{H}^\circ$ by setting $(\mathfrak{p}_0, \mathfrak{e}, \mathfrak{f})$ (see Definition 5.2) to be the three children nodes of $\mathfrak{n}_0$, with the order determined by their signs and the relative position of $\mathfrak{p}_0$, and remove the other nodes (i.e. $(\mathfrak{n}_j, \mathfrak{p}_j)$ for $1 \leq j \leq q$ and $\mathfrak{m}_j$ for $0 \leq j \leq q-1$). In the large gap case assuming $k_0 = k_{q+1}$, we must have $k_0 \neq k_q$ since $k_q \neq k_{q+1}$ in view of the factor $\epsilon_{k_{q+1}\ell_{q+1}\ell_q}$ in (5.5), so in this case we remove the chain $(\mathfrak{n}_0, \cdots, \mathfrak{n}_{q-1})$, which is the chain $\mathcal{H}^\circ$ less one node, in the same way as above.
In either case, denote the scale of $\mathcal{G}^{<}_{sk}$ by $m_0^{<}$. Note that $\mathcal{G}^{<}_{sk}$ does not depend on the choice of $\mathcal{H}^\circ$ in the fixed congruence class (unless in the large gap case, where this dependence does not matter), and for the decoration of $\mathcal{G}^{<}_{sk}$ coming from the decoration of $\mathcal{G}_{sk}$, we have $\zeta_{\mathfrak{n}_0}\Omega_{\mathfrak{n}_0} = \Omega^*$ for each choice of $\mathcal{H}^\circ$. Then, we can reduce the expression (5.9) using (5.7), where in (5.9) the sum is taken over all marked gardens $\mathcal{G}$ formed by altering the irregular chain $\mathcal{H}^\circ$ in $\mathcal{G}_{sk}$, and $(*)$ represents either "sg" or "lg", where we restrict to the small gap or large gap case. In fact, using (5.7) we have Here in (5.10) the sum is taken over all $(k_1, \cdots, k_{2R})$-decorations $\mathscr{I}^{<}_{sk}$ of $\mathcal{G}^{<}_{sk}$, and the other notations are all associated with $\mathcal{G}^{<}_{sk}$, except for the time domain and the $\epsilon$ factor, which are modified as follows. For the time domain we add the one extra condition $t_{\mathfrak{n}_0^p} > t_{\mathfrak{n}_0} + \sigma$ (where $\mathfrak{n}_0^p$ is the parent of $\mathfrak{n}_0$) to the original definition (4.2). As for the $\epsilon$ factor, in the "sg" case we remove the one factor $\epsilon_{k_{\mathfrak{n}_{01}}k_{\mathfrak{n}_{02}}k_{\mathfrak{n}_{03}}}$ (where $\mathfrak{n}_{0j}$ are the children of $\mathfrak{n}_0$ from left to right) from the original definition (3.1), while in the "lg" case we leave it as in the original definition. Moreover, the variables $(k_0, \ell_0)$ are defined as in Definition 5.3, and the functions $G$ and $P$ etc. are as in (5.7), which satisfy either (5.8) or the alternative version in the "lg" case. We also insert the corresponding "sg" or "lg" cutoffs restricting to $0 < |h| \leq 1/(100\delta L)$ or $|h| > 1/(100\delta L)$ in (5.10). Finally, in the functions $\mathcal{K}^*_{\mathcal{T}^{(\mathfrak{n}_0)}}$ and $\mathcal{K}_{\mathcal{Q}^{(\mathfrak{m},\mathfrak{m}')}}$ for the leaf pair $\{\mathfrak{m}, \mathfrak{m}'\}$ containing $\mathfrak{p}_0$, the input variable $t_{\mathfrak{n}_0}$ should be replaced by $t_{\mathfrak{n}_0} + \sigma$.

Remark 5.5. In the small gap case, due to the absence of $\epsilon_{k_{\mathfrak{n}_{01}}k_{\mathfrak{n}_{02}}k_{\mathfrak{n}_{03}}}$ in the $\epsilon$ factor, in the summation in (5.10), the decoration $(k_{\mathfrak{n}})$ may be resonant at the node $\mathfrak{n}_0$ (i.e. $(k_{\mathfrak{n}_{01}}, k_{\mathfrak{n}_{02}}, k_{\mathfrak{n}_{03}}) \in S$, see (2.6)), but it must not be resonant at any other branching node. This resonance may lead to an (at most) $L^{4d}$ loss in the counting estimates in Proposition 6.8, but this can always be covered by the $L^{-40d}$ gain from $P$ in (5.8). See Remark 6.9 for further explanation.

5.4. Summary. Now we may repeat the reduction described above for every irregular chain $\mathcal{H}^\circ$ in $\mathcal{G}_{sk}$, noticing that these irregular chains do not affect each other, in the same way as in Section 8.4 of [12]. Let $\mathcal{G}^{\#}_{sk}$ be the marked garden obtained by removing all the irregular chains $\mathcal{H}^\circ$ from $\mathcal{G}_{sk}$ as described above in Section 5.3. This does not depend on the choice of $\mathcal{G}_{sk}$ in the fixed congruence class, nor on the choice of $\mathcal{G} \in \mathscr{F}$. We then have (5.4) Here in (5.11), $m_0$ is the scale of $\mathcal{G}^{\#}_{sk}$ and $m_1$ is the sum of all the $m_{tot}$ and $q$ in (5.10), the summation is taken over all $k$-decorations $\mathscr{I}^{\#}_{sk}$ of $\mathcal{G}^{\#}_{sk}$, and the other notations are all associated with $\mathcal{G}^{\#}_{sk}$, except for the time domain, for which we add the extra conditions $t_{\mathfrak{n}^p} > t_{\mathfrak{n}} + \sigma_{\mathfrak{n}}$ (where $\mathfrak{n}^p$ is the parent of $\mathfrak{n}$) to the original definition (4.2), for $\mathfrak{n} \in \Xi$, where $\Xi$ is a subset of the set $(\mathcal{N}^{\#}_{sk})^*$ of branching nodes. The vector parameters are λ and $k[\mathcal{G}^{\#}_{sk}]$ is the vector of all the $k_{\mathfrak{n}}$'s. The functions $G(\lambda)$ and (5.12) We also insert various small gap or large gap cutoff functions, and some input variables in some of the $\mathcal{K}_{\mathcal{Q}^{(\mathfrak{m},\mathfrak{m}')}}$ or $\mathcal{K}^*_{\mathcal{T}^{(\mathfrak{m})}}$ functions may be translated by some $\sigma_{\mathfrak{n}}$, in the same way as in (5.10). Finally, the $\epsilon$ factor associated with $\mathscr{I}^{\#}_{sk}$ may miss a few $\epsilon_{k_{\mathfrak{n}}k_{\mathfrak{n}_1}k_{\mathfrak{n}_2}k_{\mathfrak{n}_3}}$ factors compared to the original definition (3.1), but for each such missing factor we can gain a power $L^{-40d}$ on the right hand side in the second inequality in (5.12).
At this point, we may expand the functions $\mathcal{K}_{\mathcal{Q}^{(\mathfrak{m},\mathfrak{m}')}}$ and $\mathcal{K}^*_{\mathcal{T}^{(\mathfrak{m})}}$ (or their leading or remainder contributions) using their Fourier $L^1$ (or $X^{\kappa}_{\mathrm{loc}}$) bounds, and combine the $\mathcal{K}$ factors and the $P$ factor in (5.11), to further reduce to the expression (5.13) Here in (5.13), the function $G$ is different from the one in (5.11), but still satisfies the same first inequality in (5.12) with weights in $\lambda$ and $\mu$ also included. Using the second inequality in (5.12), the $X^{\kappa}_{\mathrm{loc}}$ bounds for $\mathcal{K}_{\mathcal{Q}^{(\mathfrak{m},\mathfrak{m}')}}$ and $\mathcal{K}^*_{\mathcal{T}^{(\mathfrak{m})}}$ and their components, and the definition of markings L and R, we deduce that the function $X_{tot}$ satisfies uniformly in $\lambda$, where $r_0$ is the total number of branching nodes and leaf pairs that are marked R in the marked garden $\mathcal{G}^{\#}_{sk}$. In (5.14) we can also gain a power $L^{-40d}$ per missing factor $\epsilon_{k_{\mathfrak{n}}k_{\mathfrak{n}_1}k_{\mathfrak{n}_2}k_{\mathfrak{n}_3}}$ in the $\epsilon$ factor associated with $\mathscr{I}^{\#}_{sk}$, as described above.
Note that the garden $\mathcal{G}^{\#}_{sk}$ is still mixed, and prime. Moreover by definition, it does not contain an irregular chain of length $> 10^3 d$ with all branching nodes and leaf pairs marked L. In particular, if $r_0$ is the number of branching nodes and leaf pairs that are marked R, $r_{irr}$ is the number of maximal irregular chains, and $Q$ is the total length of these irregular chains, then we have (5.15) Based on this information, as well as the first inequality in (5.12) and (5.14), we will establish an absolute upper bound for the expression (5.13). This will be done in the following two sections.

Gardens and molecules
Definition 6.1 (Definition 9.1 in [12]). A molecule $\mathbb{M}$ is a directed graph, formed by vertices (called atoms) and edges (called bonds), where multiple and self-connecting bonds are allowed. We will write $v \in \mathbb{M}$ and $\ell \in \mathbb{M}$ for atoms $v$ and bonds $\ell$ in $\mathbb{M}$; we also write $\ell \sim v$ if $v$ is an endpoint of $\ell$. We further require that (i) each atom has at most 2 outgoing bonds and at most 2 incoming bonds (a self-connecting bond counts as outgoing once and incoming once), and (ii) there is no saturated (connected) component, where connectedness is always understood in terms of undirected graphs, and a component is saturated if it contains only degree 4 atoms. For a molecule $\mathbb{M}$ we define $V$ to be the number of atoms, $E$ the number of bonds and $F$ the number of components. Define $\chi := E - V + F$.

Definition 6.2 (Definition 9.3 in [12]). Given a garden $\mathcal{G}$, define the molecule $\mathbb{M}$ associated with $\mathcal{G}$ as follows. The atoms of $\mathbb{M}$ are all the 4-element subsets formed by a branching node $\mathfrak{n} \in \mathcal{N}^*$ and its three children nodes. For any two atoms, we connect them by a bond if either (i) a branching node is the parent in one atom and a child in the other, or (ii) two leaves from these two atoms are paired with each other. We call this bond a PC (parent-child) bond in case (i) and an LP (leaf-pair) bond in case (ii). Note that multiple bonds are possible, and a self-connecting bond occurs when two sibling leaves are paired. We fix a direction of each bond as follows. If a bond corresponds to a leaf pair, then it goes from the atom containing the leaf with $-$ sign to the atom containing the leaf with $+$ sign. If a bond corresponds to a branching node $\mathfrak{n}$ that is not a root, suppose $\mathfrak{n}$ is the parent in the atom $v_1$ and is a child in the atom $v_2$; then the bond goes from $v_1$ to $v_2$ if $\mathfrak{n}$ has $+$ sign, and goes from $v_2$ to $v_1$ otherwise. See Figure 6 for an example.
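To make Definition 6.1 concrete, here is a minimal Python sketch that stores a molecule as a list of directed bonds, checks requirements (i) and (ii), and computes $V$, $E$, $F$ and $\chi = E - V + F$. The function name `molecule_stats`, the integer labelling of atoms, and the `(tail, head)` encoding of bonds are assumptions of this sketch and do not appear in the source.

```python
from collections import defaultdict

def molecule_stats(num_atoms, bonds):
    """Compute V, E, F and chi = E - V + F for a directed multigraph.

    bonds is a list of (tail, head) pairs of atom indices; multiple bonds
    and self-connecting bonds (tail == head) are allowed.
    """
    out_deg = [0] * num_atoms
    in_deg = [0] * num_atoms
    parent = list(range(num_atoms))  # union-find forest for components

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for tail, head in bonds:
        out_deg[tail] += 1  # a self-connecting bond counts once as outgoing
        in_deg[head] += 1   # and once as incoming
        parent[find(tail)] = find(head)

    # Requirement (i): at most 2 outgoing and at most 2 incoming bonds.
    if any(o > 2 or i > 2 for o, i in zip(out_deg, in_deg)):
        raise ValueError("an atom has more than 2 outgoing or incoming bonds")

    degree = [o + i for o, i in zip(out_deg, in_deg)]
    components = defaultdict(list)
    for v in range(num_atoms):
        components[find(v)].append(v)

    # Requirement (ii): no component consisting only of degree 4 atoms.
    for atoms in components.values():
        if all(degree[v] == 4 for v in atoms):
            raise ValueError("saturated component found")

    V, E, F = num_atoms, len(bonds), len(components)
    return {"V": V, "E": E, "F": F, "chi": E - V + F}
```

For instance, two atoms joined by a double bond with opposite directions, `molecule_stats(2, [(0, 1), (1, 0)])`, give $V = 2$, $E = 2$, $F = 1$ and $\chi = 1$.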
By definition of mixed gardens we know that no $\mathcal{T}_i$ and $\mathcal{T}_j$ have their leaves completely paired. For the molecule $\mathbb{M}$, clearly the number of atoms is $V = m$, since each atom in $\mathbb{M}$ corresponds to a unique branching node in $\mathcal{G}$. Moreover the number of bonds is $E = 2m - R$. This is because each bond corresponds to either a unique non-root leaf pair or a non-root branching node. The total number of leaf pairs and branching nodes (including roots) is $(m + R) + m = 2m + R$; however, each root should be subtracted once (it should be excluded from the set of branching nodes if it is a branching node, and should be excluded from the set of leaf pairs if it is a leaf and is paired to another leaf), and only once (because there do not exist two roots that are both leaves and are paired to each other). This implies $E = 2m - R$ as there are $2R$ roots.
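The counting in the preceding paragraph can be replayed numerically. The sketch below assumes, as in Definition 6.2, that every branching node has exactly three children, so that a tree with $s$ branching nodes has $2s + 1$ leaves, and that all leaves are paired with no two roots paired to each other. The function name `garden_counts` and its input format (a list of the scales of the $2R$ trees) are hypothetical.

```python
def garden_counts(tree_scales):
    """Replay the count V = m, E = 2m - R for a garden of width 2R.

    tree_scales lists the number of branching nodes of each of the 2R trees.
    Assumptions: ternary trees (a tree of scale s has 2s + 1 leaves), all
    leaves paired, and no two roots paired to each other.
    """
    if len(tree_scales) % 2 != 0:
        raise ValueError("a garden has an even number 2R of trees")
    R = len(tree_scales) // 2
    m = sum(tree_scales)                           # total scale
    leaves = sum(2 * s + 1 for s in tree_scales)   # equals 2m + 2R
    leaf_pairs = leaves // 2                       # equals m + R
    V = m                                          # one atom per branching node
    E = leaf_pairs + m - 2 * R                     # subtract each root once
    return {"R": R, "m": m, "leaf_pairs": leaf_pairs, "V": V, "E": E}
```

For four trees of scales 1, 1, 0, 0 this gives $R = 2$, $m = 2$, four leaf pairs, and $E = 2 = 2m - R$, matching the formula in the text.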
Finally, for any $\mathcal{T}_j$ ($1 \leq j \leq 2R$), let $S_j$ be the set of atoms corresponding to branching nodes in $\mathcal{T}_j$; then $\mathbb{M}$ is the union of all $S_j$ ($1 \leq j \leq 2R$). By definition all atoms in $S_j$ are connected to each other. Moreover, if some leaf in $\mathcal{T}_j$ is paired to some leaf in $\mathcal{T}_{j'}$ then $S_j$ and $S_{j'}$ are also connected to each other. Since the leaves in the union of any odd number of trees $\mathcal{T}_j$ cannot all be paired with each other (since each $\mathcal{T}_j$ has an odd number of leaves), and since the garden does not contain two trees $\mathcal{T}_i$ and $\mathcal{T}_j$ with their leaves completely paired, we know that any connected component in $\mathbb{M}$ must be the union of at least four $S_j$; in particular $F \leq R/2$. This implies that

Proof. This is basically the same as Proposition 9.6 in [12]. For each atom $v \in \mathbb{M}$, each bond $\ell \sim v$ corresponds to a unique node $\mathfrak{n}$ in the 4-node subset corresponding to $v$. We may assign a code to this pair $(v, \ell)$ indicating the relative position of $\mathfrak{n}$ in this subset (say code 0 if $\mathfrak{n}$ is the parent node, and codes 1, 2 or 3 if $\mathfrak{n}$ is the left, mid or right child node). In this way we get an encoded molecule which has a code assigned to each pair $(v, \ell)$ where $\ell \sim v$. Clearly if $\mathbb{M}$ is fixed then the corresponding encoded molecule has at most $C^{m+R}$ possibilities, so it suffices to reconstruct $\mathcal{G}$ from the encoded molecule.
In fact, if the encoded molecule is fixed, then the branching nodes of $\mathcal{G}$ uniquely correspond to the atoms of $\mathbb{M}$. Moreover, the branching node corresponding to $v_2$ is the $\alpha$-th child of the branching node corresponding to $v_1$, if and only if $v_1$ and $v_2$ are connected by a bond $\ell$ such that the codes of $(v_1, \ell)$ and $(v_2, \ell)$ are $\alpha$ and 0 respectively. Next, we can determine the leaves of $\mathcal{G}$ by putting a leaf as the $\alpha$-th child for each branching node and each $\alpha$, as long as this position is not occupied by another branching node; moreover, the $\alpha$-th child of the branching node corresponding to $v_1$ and the $\beta$-th child of the branching node corresponding to $v_2$ are paired, if and only if $v_1$ and $v_2$ are connected by a bond $\ell$ such that the codes of $(v_1, \ell)$ and $(v_2, \ell)$ are $\alpha$ and $\beta$ respectively.
Finally, note that a node $\mathfrak{n}$ is a root if and only if it is not a child of any other node, so we can uniquely identify the roots of the trees. Permuting these $2R$ roots leads to at most $(CR)!$ choices, and once a permutation is fixed, the garden $\mathcal{G}$ will also be fixed as the structure of each tree, as well as the leaf pairing structure, has been fixed as above. This gives at most $(CR)!\,C^m$ possible choices for $\mathcal{G}$. Note that, if one of the trees in $\mathcal{G}$ is trivial, then the reconstruction will be slightly different, but this does not affect the result.

Definition 6.5 (Definition 9.7 in [12]). We define the type I and type II (molecular) chains in a molecule $\mathbb{M}$, as in Figure 7. Note that type I chains are formed by double bonds, and type II chains are formed by double bonds and pairs of single bonds. For type I chains, we require that the two bonds in any double bond have opposite directions. For type II chains, we require that any pair of single bonds have opposite directions, see Figure 7.
Given a molecule $\mathbb{M}$, the main subject of this section is the following counting problem associated with $\mathbb{M}$, similar to [12].

Definition 6.6 (Definition 9.8 in [12]). Given a molecule $\mathbb{M}$ and a set $S$ of atoms. Suppose we fix (i) $a_\ell \in \mathbb{Z}^d_L$ for each bond $\ell \in \mathbb{M}$, (ii) $c_v \in \mathbb{Z}^d_L$ for each (non-isolated, same below) atom $v \in \mathbb{M}$, assuming $c_v = 0$ if $v$ has degree 4, (iii) $\Gamma_v \in \mathbb{R}$ for each atom $v$, and (iv) for each atom $v$. Here in (6.1) the sum is taken over all bonds $\ell \sim v$, and $\zeta_{v,\ell}$ equals 1 if $\ell$ is outgoing from $v$, and equals $-1$ otherwise. We also require that (a) the values of $k_\ell$ for different $\ell \sim v$ are all equal given each $v \in S$, and this value equals $f_v$ if also $d(v) < 4$, and (b) for any $v \notin S$ and any bonds $\ell_1, \ell_2 \sim v$ of opposite directions (viewing from $v$), we have $k_{\ell_1} \neq k_{\ell_2}$. Note that this actually makes $\mathfrak{D}$ depend on $S$, but we will omit this dependence for simplicity. We say an atom $v$ is degenerate if $v \in S$, and is tame if moreover $d(v) < 4$.
In addition, we may add some extra conditions to the definition of $\mathfrak{D}(\mathbb{M})$. These conditions are independent of the parameters, and have the form of (combinations of) $(k_{\ell_1} - k_{\ell_2} \in E)$ for some bonds $\ell_1, \ell_2 \in \mathbb{M}$ and fixed subsets $E \subset \mathbb{Z}^d_L$. Let Ext be the set of these extra conditions, and denote the corresponding set of vectors $k[\mathbb{M}]$ by $\mathfrak{D}(\mathbb{M}, \mathrm{Ext})$. We are interested in the quantities $\sup \#\mathfrak{D}(\mathbb{M}, \mathrm{Ext})$, where the supremum is taken over all possible choices of parameters $(a_\ell, c_v, \Gamma_v, f_v)$.

Remark 6.7. The vectors $k[\mathbb{M}]$ will come from decorations of the garden $\mathcal{G}$ from which $\mathbb{M}$ is obtained.
Let $v \in \mathbb{M}$ be an atom corresponding to a branching node $\mathfrak{n} \in \mathcal{G}$. Then $d(v) = 4$ unless $\mathfrak{n}$ is the root of some $\mathcal{T}_j$, or some other $\mathcal{T}_i$ is a trivial tree paired with a child of $\mathfrak{n}$ (there may be more than one such $i$).
It is easy to check, using Definitions 3.3 and 6.6, that the following hold. If , then the right hand sides of the above equations should be corrected by suitable algebraic sums of $k_j$ and (or) $k_i$, and $|k_j|_\beta^2$ and (or) $|k_i|_\beta^2$, where $j$ and $i$ are associated with $\mathfrak{n}$ as stated above. Note that all these $k_j$ and $k_i$ are fixed when considering the decoration k then either the values of $k_\ell$ for different $\ell \sim v$ are all equal (and this value equals $k_j$ if $d(v) < 4$ where $j$ is as above), or for any bonds $\ell_1, \ell_2 \sim v$ of opposite directions we have $k_{\ell_1} \neq k_{\ell_2}$. Note that a degenerate atom corresponds exactly to a branching node $\mathfrak{n}$ for which $\epsilon_{k_{\mathfrak{n}_1}k_{\mathfrak{n}_2}k_{\mathfrak{n}_3}} = -1$.

Proposition 6.8. Let $\mathbb{M}$ be the molecule associated with a mixed garden $\mathcal{G}$ of width $2R$ and scale $m$, where $R, m \leq (\log L)^{20}$. Suppose also that $\mathbb{M}$ does not contain any triple bond. Then, $\mathfrak{D}(\mathbb{M})$ is the union of at most $C^m$ subsets. Each subset has the form $\mathfrak{D}(\mathbb{M}, \mathrm{Ext})$, and there exists $0 \leq r \leq m$, and a collection of at most $C(r + R)$ molecular chains of either type I or type II in $\mathbb{M}$, such that (i) the number of atoms not in one of these chains is at most $C(r + R)$, and (ii) for any type II chain in the collection and any two paired single bonds $(\ell_1, \ell_2)$ in this chain (see Figure 7), the set Ext includes the condition $(k_{\ell_1} = k_{\ell_2})$. Moreover we have the estimate that where $m_1$ is the number of atoms in the union of type I chains.

Remark 6.9. In view of Remark 5.5, in Definition 6.6 we may also fix some set $S^*$ of atoms such that neither (a) nor (b) is required for $v \in S^*$, but we are allowed to multiply the left hand side of (6.2) by $L^{-40d\cdot|S^*|}$. In this way we can restate Proposition 6.8 appropriately, and the new result can be easily proved with little difference in the arguments, due to the large power gains. For simplicity we will not include this in the proof below.
Proof of Proposition 6.8. The proof is basically the same as the proof of Proposition 9.10 in [12]. We define the same steps as in Section 9.3 of [12], including the good and normal steps, and apply the same algorithm as in Section 9.4 of [12]. Let the total number of good steps in the process be $r \geq 0$ (we may assume $r \leq m$ up to a constant because the total number of steps is at most $O(m)$); then we may repeat the proofs in Section 9.5 of [12]. The only difference here is the initial state of the molecule (as $\mathbb{M}$ is obtained from a mixed garden rather than a couple), but in the current case we still have $V_{<4} + F = O(R)$, where $V_{<4}$ and $F$ are the number of atoms with degree $< 4$ and the number of connected components, and the constant in $O$ depends only on $d$.
Note that in the proof of Proposition 9.10 in [12], the quantities that are monitored include $V$, $E$, $F$, $V_j$ for $0 \leq j \leq 4$ (which is the number of atoms with degree $j$), $V_2^*$ (which is the number of degree 2 atoms with two single bonds), and $\xi$ (which is the number of "special bonds" connecting two degree 3 atoms that have a special form, see Definition 9.12 of [12]). Since $V_{<4} + F = O(R)$, it is clear that in the beginning, the value of each of these quantities in the current case is the same as in [12], up to errors of size $O(R)$. Thus, the same proof as in [12] yields that $\mathbb{M}$ contains at most $C(R + r)$ type I or II molecular chains, such that the number of atoms not in one of these chains is at most where $\kappa$ and $\gamma$ are calculated retrospectively from the algorithm, as described in Section 9.2 of [12]. The calculation for $\kappa$ is the same up to $O(r)$ errors, so we have $\kappa = \frac{m+m_1}{2}$ up to errors $O(R + r)$ which are acceptable. To calculate $\gamma$, note that in [12] we are actually calculating $\gamma - \chi$, and the same proof yields that $(d - 1)(\gamma - \chi) \leq -2\nu r$ for the initial molecule. Now by Proposition 6.
as desired.

$L^1$ coefficient bounds
We now return to the study of the expression (5.13). Let $\mathcal{G}^{\#}_{sk}$ and $(r_0, r_{irr})$ be as in Section 5.4. For simplicity, in this section we will write $\mathcal{G}^{\#}_{sk}$ simply as $\mathcal{G}$, and the associated sets $(\mathcal{N}^{\#}_{sk})^*$ as $\mathcal{N}^*$ etc. Recall, by (5.15), that the total length of the irregular chains in $\mathcal{G}$ is at most $C(r_0 + r_{irr})$. Let $\Xi$ be a subset of $\mathcal{N}^*$; we may define, as in (5.13), the function where $\sigma = \sigma[\Xi] \in [0, 1]^{\Xi}$, and the domain $\mathcal{I}$ is defined as in (4.2), but with the extra conditions $t_{\mathfrak{n}^p} > t_{\mathfrak{n}} + \sigma_{\mathfrak{n}}$ for $\mathfrak{n} \in \Xi$, where $\mathfrak{n}^p$ is the parent of $\mathfrak{n}$. Then, let $m_0$ be the scale of $\mathcal{G}$, we can write Let $\mathbb{M}$ be the molecule associated with $\mathcal{G}$ as in Definition 6.2. It is easy to see that $\mathbb{M}$ contains no triple bond, as triple bonds in $\mathbb{M}$ can only come from (1, 1)-mini couples and mini trees (as in Definition 3.4) in $\mathcal{G}$. By the proofs in Section 6, we can introduce at most $C^{m_0}$ sets of extra conditions Ext, such that the summation in $\mathscr{I} = k[\mathcal{G}]$ in (7.2) can be decomposed into the summations with each of these sets of extra conditions imposed on $k[\mathcal{G}]$. Moreover, for each choice of Ext there is $1 \leq r_1 \leq n_0$ such that the conclusion of Proposition 6.8, including (6.2), holds true (with $r$ replaced by $r_1$).
Notice that a type I chain in $\mathbb{M}$ can only be obtained from either one irregular chain, or the union of two irregular chains in $\mathcal{G}$; for couples this can be proved in the same way as in Section 10.1.2 of [12] (which deals with type II chains), and the same proof works also for gardens. Therefore, the total length $p$ of type I chains in $\mathbb{M}$ is bounded by the total length of irregular chains in $\mathcal{G}$, which is at most $C(r_0 + r_{irr})$. However, each irregular chain in $\mathcal{G}$ also corresponds to a type I chain in the base molecule, so $r_{irr} \leq C(r_1 + R)$, hence $p \leq C(r + R)$, where $r = r_0 + r_1$. This means the number of atoms in $\mathbb{M}$ that are not in one of the (at most $C(r + R)$) type II chains is at most $C(r + R)$. Now, suppose $\mathfrak{n}$ and $\mathfrak{n}'$ are two branching nodes in $\mathcal{G}$ which correspond to two atoms in $\mathbb{M}$ that are connected by a double bond in a type II chain; then we must have $\zeta_{\mathfrak{n}'}\Omega_{\mathfrak{n}'} = -\zeta_{\mathfrak{n}}\Omega_{\mathfrak{n}}$ under the extra conditions in Ext, see Remark 6.7. In fact we will restrict $\{\mathfrak{n}, \mathfrak{n}'\}$ to the interior of this type II chain by omitting 5 pairs of atoms at both ends of the chain, in the same way as in Definition 5.4. Then, we make such $\{\mathfrak{n}, \mathfrak{n}'\}$ a pair, and choose one node from each such pair to form a set $\mathcal{N}^{ch}$. If it happens that one of $\{\mathfrak{n}, \mathfrak{n}'\}$ is a parent of the other, we assume the parent belongs to $\mathcal{N}^{ch}$. Let $\mathcal{N}^{rm}$ be the set of branching nodes not in these pairs, and define $\widetilde{\mathcal{N}} = \mathcal{N}^{ch} \cup \mathcal{N}^{rm}$.
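We will be interested in estimates on the function $\mathcal{U}_{\mathcal{G}}$ in (7.1) where $\alpha_{\mathfrak{n}} = \delta L^2 \zeta_{\mathfrak{n}}\Omega_{\mathfrak{n}} + \lambda_{\mathfrak{n}}$, which means that $\alpha_{\mathfrak{n}} + \alpha_{\mathfrak{n}'} = \mu_{\mathfrak{n}}$ for each $\mathfrak{n} \in \mathcal{N}^{ch}$, where $\mathfrak{n}'$ is the node paired to $\mathfrak{n}$ and $\mu_{\mathfrak{n}} = \lambda_{\mathfrak{n}} + \lambda_{\mathfrak{n}'}$ is a parameter depending on $\lambda$. Under this assumption on $\alpha_{\mathfrak{n}}$, we can write for some function $\mathcal{V}_{\mathcal{G}}$. This function actually depends also on the parameters $\mu_{\mathfrak{n}}$ for $\mathfrak{n} \in \mathcal{N}^{ch}$, but we will omit this for notational convenience. We then prove the following: Then, uniformly in $(t, s)$, in the choices of $(S_{\mathfrak{n}})_{\mathfrak{n} \in \widetilde{\mathcal{N}}}$, and in the parameters $(\mu_{\mathfrak{n}})_{\mathfrak{n} \in \mathcal{N}^{ch}}$, we have where $r = r_0 + r_1$.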
Proof. The proof is exactly the same as the proof of Proposition 10.1 in [12], except that all the "couples" should be replaced by "gardens". The reason is that the proof in [12] goes by induction; moreover in each inductive step we remove either two branching nodes or one chain containing modules A and B (see Sections 10.1-10.2 of [12]). In either case this step involves at most two trees and the other trees are not affected, so the proof is the same for couples and for general gardens.
In the proof in [12] we have also introduced the simpler structures (for the purpose of induction) of unsigned couples and double-trees, which are naturally replaced by unsigned gardens and multitrees (the collection of $2R$ trees with some branching nodes paired, compared to two trees in [12]). The rest of the proof is exactly the same. Note also that the exponents $Cr$ in Proposition 10.1 of [12] are replaced by $C(r + R)$ as the number of type II chains in $\mathbb{M}$, as well as the number of atoms not in one of these chains, is now $C(r + R)$ instead of $Cr$ due to Proposition 6.8.

Proof of Theorem 1.3
In this section we prove Propositions 4.7 and 4.8, which completes the proof of Theorem 1.3.

8.1. Proof of Proposition 4.7. Note that, if $\mathcal{G}$ and $\mathcal{G}'$ are congruent in the sense of Definition 5.4, then they have the same width, same signature, and the same scale for each of their component trees; moreover $\mathcal{G}$ is mixed if and only if $\mathcal{G}'$ is mixed. Thus, the sum in (4.4) can be decomposed into different terms, where each term has the form (5.4) for one congruence class $\mathscr{F}$.
For any fixed $\mathscr{F}$, consider (5.4), which then equals (5.13) and (7.2). Note that in (7.2) the $\mathcal{G}$ actually means $\mathcal{G}^{\#}_{sk}$ by our notation. Using the decay factors in (5.14) we can restrict to the subset where $|k_{\mathfrak{l}} - a_{\mathfrak{l}}| \leq 1$ ($\forall \mathfrak{l} \in (\mathcal{L}^{\#}_{sk})^*$) for some fixed parameters $(a_{\mathfrak{l}})$, with summability in $(a_{\mathfrak{l}})$ guaranteed. Using the first inequality in (5.12), we may also fix the value of $\lambda$ (and hence $\mu_{\mathfrak{n}}$).
As in Section 6, by decomposing into at most $C^{m_0}$ terms (where $m_0$ is the scale of $\mathcal{G}^{\#}_{sk}$), we can add the set of extra conditions Ext, which also defines the sets $\widetilde{\mathcal{N}}$ (as in Section 7), etc., and the value $r_1 \geq 1$. Let $r = r_0 + r_1$ as above, then thanks to Ext, we can use (7.3) Moreover, for each $\mathfrak{n} \in \widetilde{\mathcal{N}}$, the value $\delta L^2 \zeta_{\mathfrak{n}}\Omega_{\mathfrak{n}} + \lambda_{\mathfrak{n}}$ belongs to some subset of $\mathbb{R}$ of cardinality at most $L^{3d}$, as $k[\mathcal{G}^{\#}_{sk}]$ varies (this is because each $k_{\mathfrak{n}}$ belongs to a ball of radius at most $n \leq (\log L)^{20}$ under our assumptions). In particular the value $m_{\mathfrak{n}} = \lfloor \delta L^2 \zeta_{\mathfrak{n}}\Omega_{\mathfrak{n}} + \lambda_{\mathfrak{n}} \rfloor$ belongs to a set $S_{\mathfrak{n}} \subset \mathbb{Z}$ with cardinality at most $L^{3d}$, for all possible choices of $k[\mathcal{G}^{\#}_{sk}]$.

To estimate (7.2) with $\lambda$ fixed, we first integrate in $\sigma$. Using (5.14), we can estimate (7.2) using where sk (we also have additional factors that will be collected at the end). We next fix the values of $m_{\mathfrak{n}} \in S_{\mathfrak{n}}$ for each $\mathfrak{n}$; note that then by definition, so if we use (7.4) to sum over $(m_{\mathfrak{n}})$ in the end, we can further estimate (8.1) using where $a_{\mathfrak{l}}$ and $b_{\mathfrak{n}}$ are constants, and we also include the conditions in Ext. Now (8.2) is almost exactly the counting problem $\mathfrak{D}(\mathbb{M}, \mathrm{Ext})$ stated in Definition 6.6, due to Remark 6.7, except that we only assume $|k_{\mathfrak{l}} - a_{\mathfrak{l}}| \leq 1$ for leaves $\mathfrak{l}$. However, for any branching node $\mathfrak{n}$ there exists a child $\mathfrak{n}'$ of $\mathfrak{n}$ such that $k_{\mathfrak{n}} \pm k_{\mathfrak{n}'}$ belongs to a fixed ball of radius $\mu^{\circ}_{\mathfrak{n}}$ (with $\mu^{\circ}_{\mathfrak{n}}$ defined in Lemma 3.6), so by using (3.3), one can reduce (8.2) to at most $C^{m_0}$ counting problems, each of which has exactly the same form as $\mathfrak{D}(\mathbb{M}, \mathrm{Ext})$ in Definition 6.6. Therefore, (8.2) can be bounded using Proposition 6.8 (and using Remark 6.9 if necessary). Collecting all the factors appearing in the above estimates, we get that which is then bounded by $(C^+\delta^{1/4})^m L^{-3\nu(r+R)/2}\delta^{-q/2}$, where $q$ is the total length of type I chains in the molecule obtained from $\mathcal{G}^{\#}_{sk}$. We know $q \leq C(r + R)$ so $\delta^{-q/2} \leq L^{\nu(r+R)/2}$, which implies that

Finally, suppose we fix $r$; then the molecule associated with $\mathcal{G}^{\#}_{sk}$ (see Definition 6.2) is, up to at most $C(r + R)$ remaining atoms, a union of at most $C(r + R)$ type II chains with total length at most $m_0$. This clearly has at most $(C(r + R))!\,C^m$ possibilities. By Proposition 6.4, the number of choices for $\mathcal{G}^{\#}_{sk}$ is also at most $(C(r + R))!\,C^m$. To form $\mathcal{G}_{sk}$ from $\mathcal{G}^{\#}_{sk}$ one needs to insert at most $C(r + R)$ irregular chains with total length at most $m$, which also has at most $C^m$ possibilities. Finally, using Corollary 4.6, we see that $\mathcal{G}$ has at most $(C(r + R))!\,C^m$ choices. The number of choices for markings, as well as Ext, are also at most $C^m$ and can be accommodated. This means that, if we decompose (4.4) into terms of form (5.4), then further decompose by markings and/or Ext etc., then each of the resulting terms has an index $r \geq 0$, such that each term of index $r$ is bounded by $(C^+\delta^{1/4})^m L^{-\nu(r+R)}$, see (8.4), and that the number of terms with index $r$ is at most because in any case $r + R$ is bounded by a power of $\log L$, which is at most $L^{\nu}$. This completes the proof of Proposition 4.7.

8.2. Proof of Proposition 4.8. The proof is almost identical with the corresponding proofs in [12], which we briefly present here.
First, by Chebyshev's inequality, to prove (4.6) it suffices to show that and Note that due to our choice $N = (\log L)^4$ instead of $N = \log L$, the proof of (4.6) is conceptually easier than [12] as we do not need the hypercontractivity property (Lemma A.3 of [12]) or the higher moment estimates. Now, to prove (8.6), we argue as in the proof of Proposition 12.1 of [12] (but with $p$ replaced by 2), and apply Gagliardo-Nirenberg to bound the left hand side of (8.6), up to a multiple $L^{10d}$, by sup which is then bounded by $(C^+\sqrt{\delta})^n L^{40d}$ in the same way as [12]. In fact, the bound for $\mathbb{E}|(\mathcal{J}_n)_k(t)|^2$ is as in Proposition 2.5 of [12] (again our choice $N = (\log L)^4$ here does not affect the proof), while the bound for $\mathbb{E}|\partial_t(\mathcal{J}_n)_k(t)|^2$ is as in (12.4) of [12], which is proved by similar arguments. This settles (8.6). The proof of (8.7) is the same, except that $\mathcal{J}_n$ is replaced by $\mathcal{R}$ and $n$ is replaced by $N$.

Finally, to prove (4.7), again by Chebyshev's inequality, it suffices to show that the kernel $(\mathscr{L}^n)^{\zeta}_k(t, s)$ of the $\mathbb{R}$-linear operator $\mathscr{L}^n$ (with $\zeta \in \{\pm\}$ indicating the linear and conjugate linear parts) can be decomposed as and that for each $n \leq m \leq N^3$, the kernel $(\mathscr{L}^n)^{m,\zeta}_k(t, s)$ satisfies that Now the decomposition is provided as in Proposition 11.2 of [12], and (8.8) is proved as in the proof of Proposition 12.2 of [12] (in particular this proof does require the hypercontractivity property). Note that in that proof, we actually further decompose into pieces $(\mathscr{L}^n)^{m,\zeta}_{M,k}(t, s)$ for dyadic $M$, and prove (8.8) for $(\mathscr{L}^n)^{m,\zeta}_{M,k}$ with the right hand side involving a negative power of $M$; see (12.10) of [12]. Both proofs carry over to the current case with our choice $N = (\log L)^4$ without any change, which then proves (4.7) and completes the proof of Proposition 4.8.

The non-Gaussian case
In this section we briefly discuss the non-Gaussian case, i.e. Theorem 1.4, which we prove in Sections 9.1-9.2 (Theorem 1.5 basically follows from it and is proved separately in Section 9.3). Since much of the proof will be identical with Theorem 1.3, we will only elaborate on the parts where the proofs are different.
First, in the Gaussian case our proof yields uniform estimates as long as $R \leq \log L$ (or $R \leq 2\log L$); here we will make the slightly stronger assumption $R \leq \log L/(\log\log L)^2$. Again we may consider $2R$ at some places, but it does not affect the result.
Next, using the expansion (3.6), we can reduce the proof of Theorem 1.4 to analyzing the correlations The rest of the proof, including the treatment of the remainder term $b$, can be done in the same way as with these correlations, see Section 8.2. Specifically, the Gaussian hypercontractivity inequality, which is used in Section 8.2, can be substituted by similar inequalities for the current density function thanks to the assumed bound $\mu_r \leq (Cr)!$; an instance of such an argument can be found in Lemma 3.1 of [11], which treats the particular case of the uniform distribution on the unit circle, but the general case can be treated in the same manner. Therefore, below we will focus on the study of (9.1).

9.1. A substitute for Isserlis' theorem. The obvious difference in the study of (9.1) in the non-Gaussian case is that Isserlis' theorem is not available. Instead we have the following substitute:

Lemma 9.1. Recall all the random variables $\eta_k$ are i.i.d. with radial law, and $\mathbb{E}|\eta_k|^{2r} = \mu_r$ with $\mu_1 = 1$ and $\mu_r \leq (Cr)!$ for a positive integer $C$. Then, for any Here in (9.2), O runs over all partitions of $n$ into even positive integers (in particular if $n$ is odd then the right hand side of (9.2) is zero). For fixed O, the O runs over all over-pairings of the set 3).To prove λ(2, , we simply notice $\mu_1 = 1$ and that if O contains a certain number of terms 2, then any O O must contain at least the same number of 2's. Then we may proceed inductively using (9.3).
Next we prove (9.2) with $\lambda(O)$ defined by (9.3). Assume all different values of these $k_j$ are $m_i$ ($1 \leq i \leq r$), where for each $i$ there are $a_i$ copies of $m_i$ with corresponding $\zeta_j = +$, and $b_i$ copies with $\zeta_j = -$. We may assume $a_i = b_i$ (otherwise it is easy to check that both sides of (9.2) are zero), hence the left hand side of (9.2) is reduced to In fact, consider any subset A ∈ O, say $|A| = 2a$, which is partitioned $a = b_1 + \cdots + b_q$ as described above. To divide $A$ into subsets of cardinalities $2b_j$ ($1 \leq j \leq q$) to form part of O (we may call this part O A ), we need to divide the set of elements with $+$ sign and the set of elements with $-$ sign separately, leading to at most $(a!)^2/((b_1)!\cdots(b_q)!)^2$ choices, considering also that there may be repetitions due to symmetry. Applying this for each $A$, we get the upper bound (9.6).
Using the definition of η O,O , we can further reduce this to where each $P_i$ is a partition of $2a_i$ into even positive integers, and at least one $P_i$ is nontrivial (i.e. contains at least two elements). Suppose $q$ of these $r$ partitions are nontrivial, then $|P_1| + \cdots + |P_r| \geq r + q$. Thus we get an upper bound $\sum_{q=1}^{r}\binom{r}{q}\frac{r!}{(r+q)!}\,\sup$ where $P$ runs over all nontrivial even partitions of $2a$. As $\binom{r}{q}\frac{r!}{(r+q)!} \leq 1/q!$, it suffices to show that which would then imply that (9.7) $\leq 1/2$ and thus complete the induction. The proof of (9.8) is easy. Note that $\log G$ is convex, so if Using also that $(m/3)^m \leq m! \leq m^m$, we can further bound this by $(C_0 \max((s-1)/3, a-s+1))^{C_0(s-1)} \leq (a/4)^{-C_0(s-1)}a^{2(s-1)}$.
The number of choices of $P$ is at most $a^{s-1}$, so the left hand side of (9.8) is bounded by $\sum_{s=2}^{a} a^{s-1}(a/4)^{-C_0(s-1)}a^{2(s-1)} \leq \frac{1}{4}$ provided $C_0$ is big enough (we may assume $a \geq 5$, since the cases $a \leq 4$ are easily verified). This proves (9.8), and finishes the inductive proof of (9.4).
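The two elementary inequalities quoted in this part of the proof, $\binom{r}{q}\frac{r!}{(r+q)!} \leq 1/q!$ and $(m/3)^m \leq m! \leq m^m$, can be sanity-checked on a finite range with exact rational arithmetic. The following sketch is only a numerical check, not a proof; the helper names `binomial_factor`, `check_binomial_bound` and `check_factorial_bounds` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def binomial_factor(r, q):
    """Exact value of C(r, q) * r! / (r + q)! as a rational number."""
    return Fraction(comb(r, q) * factorial(r), factorial(r + q))

def check_binomial_bound(max_r):
    """Check C(r, q) * r! / (r + q)! <= 1 / q! for 1 <= q <= r <= max_r."""
    return all(
        binomial_factor(r, q) <= Fraction(1, factorial(q))
        for r in range(1, max_r + 1)
        for q in range(1, r + 1)
    )

def check_factorial_bounds(max_m):
    """Check (m / 3)^m <= m! <= m^m for 1 <= m <= max_m."""
    return all(
        Fraction(m, 3) ** m <= factorial(m) <= m ** m
        for m in range(1, max_m + 1)
    )
```

The first bound follows from $r!/(r-q)! \leq (r+q)!/r!$, since each of the $q$ factors $r, r-1, \ldots, r-q+1$ is at most the corresponding factor among $r+1, \ldots, r+q$.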
Finally, suppose the sum of elements in O that are at least 4 is q, then |O| ≥ (n − q)/2.Thus by (9.4) we have which completes the proof.
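As an illustration of the objects in Lemma 9.1, the sketch below enumerates the partitions of $n$ into even positive integers and checks the counting fact used in the last step, namely that a partition whose parts of size at least 4 sum to $q$ has at least $(n - q)/2$ parts. The names `even_partitions` and `satisfies_size_bound`, and the encoding of a partition as a non-increasing tuple, are assumptions of this sketch.

```python
def even_partitions(n, max_part=None):
    """Yield the partitions of n into even positive integers.

    Each partition is a non-increasing tuple; nothing is yielded for odd n.
    """
    if n % 2 != 0:
        return
    if n == 0:
        yield ()
        return
    start = n if max_part is None else min(n, max_part)
    start -= start % 2  # largest admissible even part
    for part in range(start, 1, -2):
        for rest in even_partitions(n - part, part):
            yield (part,) + rest

def satisfies_size_bound(partition):
    """Check |O| >= (n - q) / 2, with q the sum of the parts that are >= 4."""
    n = sum(partition)
    q = sum(a for a in partition if a >= 4)
    return 2 * len(partition) >= n - q
```

The bound holds because the parts equal to 2 alone already number exactly $(n - q)/2$.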
Here O is a set of over-pairings of the leaves of the trees T j (1 ≤ j ≤ 2R), which is a partition of these leaves into subsets, such that the number of leaves with sign + in each subset is equal to the number of leaves with sign −.
where all the objects are defined as in (4.1), except that for the decoration $\mathscr{I}$ we require that $k_{\mathfrak{n}}$ ($\mathfrak{n} \in A$) are all equal for each $A \in \mathcal{O}$.
Clearly an over-garden $\mathcal{OG}$ can be turned into a garden $\mathcal{G}$ by dividing each over-pairing $A \in \mathcal{O}$ into leaf pairs; below we will write $\mathcal{OG} \sim \mathcal{G}$ for this. In this case $\mathcal{M}_{\mathcal{OG}}$ is the same as $\mathcal{M}_{\mathcal{G}}$, except for the finitely many additional conditions of form $k_{\mathfrak{l}} = k_{\mathfrak{l}'}$ associated with the over-pairing structure of $\mathcal{OG}$, which are added to the decoration $\mathscr{I}$ in the summation (9.10). Now by (9.9) we have where $\mathcal{G}$ satisfies the same condition as $\mathcal{OG}$ but runs over gardens instead of over-gardens. Now the study of (9.1) reduces to the study of the quantities $\mathcal{M}_{\mathcal{OG}}$ for over-gardens $\mathcal{OG}$. To this end we introduce the notion of regular over-gardens, and one simple linear algebra lemma.

Definition 9.3. Define an over-garden $\mathcal{OG}$ to be a regular over-garden, if there exists $\mathcal{G}$ such that $\mathcal{OG} \sim \mathcal{G}$, and (i) $\mathcal{G}$ is a regular multi-couple (Definition 4.1), and (ii) for each leaf $\mathfrak{l}$ in each over-pairing $A \in \mathcal{O}$ with $|A| \geq 4$, the tree $\mathcal{T}_j$ containing $\mathfrak{l}$ must be a regular tree and $\mathfrak{l}$ must be its lone leaf.
Lemma 9.4. Let $\mathcal A\subset\mathcal B$ be two sets of affine linear equations posed on $\mathbb R^n$, in terms of the coordinates $(x_1,\cdots,x_n)$. Let $A\supset B$ be the affine submanifolds of $\mathbb R^n$ determined by the equations in $\mathcal A$ and $\mathcal B$ respectively, assume $B\neq\varnothing$, and denote $p=\mathrm{codim}_A(B)$. Let $1\leq r\leq n-1$.
For any fixed

Proof. We omit the proof as it is elementary.

Now we can state the main estimate for the non-Gaussian case.
Consider the sum $I$ as in (9.12), but we restrict to non-regular over-gardens $\mathcal{OG}$ in the summation. Then we have

Proof. Let $\mathcal{OG}$ be an over-garden and $\mathcal{OG}\sim\mathcal G$. Since the expression $\mathcal M_{\mathcal{OG}}$ in (9.10) is just the expression $\mathcal M_{\mathcal G}$ in (4.1) with finitely many extra requirements of the form $k_{\mathfrak l}=k_{\mathfrak l'}$ in the summation, it is clear that $\mathcal M_{\mathcal{OG}}$ can at least be estimated in the same way as $\mathcal M_{\mathcal G}$ with no power loss. Also the number of $\mathcal O$ is at most $C^m$, and for fixed $\mathcal G$ and $\mathcal O$, the third sum in (9.12) contains $\sigma_1(\mathcal O)$ terms, where $\sigma_1(\mathcal O)$ has the same upper bound as $\lambda(\mathcal O)$ in Lemma 9.2.
Therefore, compared to the bounds for $\mathcal M_{\mathcal G}$ that we already know, we only have two tasks in proving the desired result: first, gain the extra power $L^{-\nu}$, and second, gain enough extra powers to cancel the factor $|\lambda(\mathcal O)|$ in case the latter is too large. Note also that if $\mathcal O$ is given and $\mathcal G$ and $\mathcal G'$ are congruent in the sense of Definition 5.4, then the over-gardens $\mathcal{OG}\models\mathcal O$, $\mathcal{OG}\sim\mathcal G$ are in one-to-one correspondence with the over-gardens $\mathcal{OG}'\models\mathcal O$, $\mathcal{OG}'\sim\mathcal G'$, and the cancellations between the terms $\mathcal M_{\mathcal{OG}}$ and $\mathcal M_{\mathcal{OG}'}$ are the same as the cancellations between $\mathcal M_{\mathcal G}$ and $\mathcal M_{\mathcal G'}$ in Section 5, up to minor modifications. As such, we can exploit the same cancellation for irregular chains in $\mathcal G$ as in Section 5 and [12]. Now let us go over the process of studying $\mathcal M_{\mathcal G}$, and see what the extra conditions $k_{\mathfrak l}=k_{\mathfrak l'}$ may do at each step of this process. Below let $q_0$ be the number of independent extra equations $k_{\mathfrak l}=k_{\mathfrak l'}$, then $q_0\sim q$, where $q$ is the sum of elements in $\mathcal O$ that are at least 4 as in Lemma 9.1. In fact, we have $q=\sum_{4\leq a\in\mathcal O}a$ and $q_0=\sum_{4\leq a\in\mathcal O}\left(\frac a2-1\right)$, hence $2q_0\leq q\leq 4q_0$. We will keep track of the codimension introduced by these $q_0$ extra equations to the affine manifold of all possible decorations $(k_{\mathfrak l})$ using Lemma 9.4.
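The comparison $2q_0\leq q\leq 4q_0$ follows term by term from $a-2\leq a\leq 2a-4$ for $a\geq 4$. A short sketch that computes both quantities for a given even partition (represented here, as an assumption, by a plain Python list of even positive integers; the function names are hypothetical):

```python
def q_and_q0(partition):
    """Return (q, q0) for an even partition given as a list of even positive integers.

    q  = sum of the elements that are at least 4,
    q0 = sum of (a/2 - 1) over the same elements.
    """
    big = [a for a in partition if a >= 4]
    q = sum(big)
    q0 = sum(a // 2 - 1 for a in big)
    return q, q0

def bounds_hold(partition):
    """Check the two-sided comparison 2*q0 <= q <= 4*q0."""
    q, q0 = q_and_q0(partition)
    return 2 * q0 <= q <= 4 * q0

if __name__ == "__main__":
    for example in ([2, 2, 2], [2, 4], [4, 4, 6], [2, 2, 10]):
        print(example, q_and_q0(example), bounds_hold(example))
```

The upper bound is attained by elements equal to 4, while the lower bound is approached as the elements grow.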
(1) Assume that $\mathcal G$ is a regular multi-couple. In this case, we shall estimate the summation (together with the integration) in (9.10) using the method in Section 6 of [12]. Note that here we have to treat all the regular couples in $\mathcal G$ together, instead of one at a time as in Section 6 of [12], because of the extra conditions linking different regular couples together, but this will only cause minor changes to the proof. In particular, we define the variables $x_{\mathfrak n}$ and $y_{\mathfrak n}$ as in the proof of Proposition 6.7 of [12]. Note that there are $2R$ linear equations that any decoration of the leaves of $\mathcal G$ must satisfy (and such a decoration of leaves uniquely determines the full decoration of $\mathcal G$); moreover the set of decorations satisfying these $2R$ equations is in affine bijection with the set of free variables $(x_{\mathfrak n},y_{\mathfrak n})$, see the proof of Proposition 6.7 of [12]. Now, with the extra conditions, the dimension of the affine manifold of all possible decorations $(k_{\mathfrak l})$ gets strictly lower, and the codimension $r$ introduced satisfies $r\gtrsim\max(1,q_0-O(R))$. In fact we have $r>0$ because at least one extra condition must take the form $k_{\mathfrak l}=k_{\mathfrak l'}$ where $\mathfrak l$ is not the lone leaf of a regular tree by Definition 9.3, and this equation will be independent of the $2R$ original equations stated above (since the only way for this extra condition to be dependent is for the two trees containing $\mathfrak l$ and $\mathfrak l'$ to be distinct and coupled, which would easily imply that they are two regular trees with lone leaves $\mathfrak l$ and $\mathfrak l'$). The lower bound $q_0-O(R)$ holds because the number of independent extra equations is $q_0$, and we subtract $O(R)$ because some extra equations combined may imply some of the $2R$ original equations.
Using the affine bijection, we know that the $(x_{\mathfrak n},y_{\mathfrak n})$ variables must satisfy $r$ independent linear equations. Then, we proceed as in the proof of Proposition 6.1 in [12], and sum over the $(x_{\mathfrak n},y_{\mathfrak n})$ variables one by one. At each step, suppose we are summing over the pair $(x_j,y_j)$; depending on the extra equations satisfied by these variables, we have one of three possibilities: (a) there is no restriction on $(x_j,y_j)$ and we are summing over all choices of $(x_j,y_j)$; (b) we are summing over $(x_j,y_j)$ that satisfies one linear equation (such as $x_j=\mathrm{const}$ or $y_j=\mathrm{const}$ or $x_j\pm y_j=\mathrm{const}$); (c) we are summing over $(x_j,y_j)$ that satisfies two linear equations, i.e. over only one point $(x_j^*,y_j^*)$. In each case the summation can be performed in the same way as in [12], and in cases (b) and (c) we gain a power of $L$ in this summation, compared to the factor $L^{2d-2}$ in [12]. Also, by repeating Lemma 9.4, we know that case (b) or (c) must occur at least $r$ times during this process.
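The mechanism behind the power gain, namely that each additional independent linear equation removes one free summation variable, can be seen in a toy lattice count. This sketch is only an illustration under simplifying assumptions (a cube $\{0,\dots,N-1\}^n$ of integer points and exact linear constraints, with no resonance condition); it is not the counting problem of [12], and the function name is hypothetical.

```python
from itertools import product

def count_solutions(n, size, equations):
    """Count points x in {0,...,size-1}^n satisfying sum(c_i * x_i) == rhs
    for every (coeffs, rhs) pair in `equations`."""
    total = 0
    for x in product(range(size), repeat=n):
        if all(sum(c * xi for c, xi in zip(coeffs, x)) == rhs
               for coeffs, rhs in equations):
            total += 1
    return total

if __name__ == "__main__":
    size = 10
    no_constraint = count_solutions(3, size, [])
    one_equation = count_solutions(3, size, [((1, -1, 0), 0)])
    two_equations = count_solutions(3, size, [((1, -1, 0), 0), ((0, 1, -1), 0)])
    # Each independent equation lowers the count by one factor of `size`.
    print(no_constraint, one_equation, two_equations)
```

In this toy setting the counts drop from $N^3$ to $N^2$ to $N$, which mirrors cases (a), (b) and (c) above in spirit.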
Therefore, putting everything together, with these extra conditions we gain a power $L^{-cr}$ for some small constant $c$, with $r\gtrsim\max(1,q-O(R))$, in the estimate of $\mathcal M_{\mathcal{OG}}$ compared to $\mathcal M_{\mathcal G}$. This already provides the needed $L^{-\nu}$ gain. It also covers any possible loss due to $\lambda$ by our choice $R\leq\log L/(\log\log L)^2$, which can then be covered by the $L^{-\nu}$ gain; also the various losses of $C^m$ are unimportant as they can be absorbed into $(C^+)^m$ in (9.13).
(2) Now we assume $\mathcal G$ is not a regular multi-couple. Then the sum of $\mathcal M_{\mathcal G}$ already gains the power $L^{-\nu}$ in view of Proposition 4.7 and also Proposition 10.4 in [12]. It then suffices to cover the possible loss due to $\lambda(\mathcal O)$. We perform the reduction steps as in previous sections, and analyze the extra conditions appearing in each step. As in (1), the total codimension introduced by the $q_0$ extra equations is $r\gtrsim\max(1,q_0-O(R))$; we may assume $q_0\gg R$ because otherwise the loss $|\lambda(\mathcal O)|\leq C^mm^{Cq_0}\leq C^mL^{\nu/2}$ can already be covered by the guaranteed $L^{-\nu}$ gain. Therefore we now have $r\gtrsim q_0$.
Step 1. We first remove the regular couples and regular trees to reduce $\mathcal G$ to its skeleton $\mathcal G_{\mathrm{sk}}$ as in Proposition 4.4. In this process we fix all the remaining $k_{\mathfrak n}$ variables (which are determined by the variables $k_{\mathfrak l_1}$ for leaves $\mathfrak l_1$ of $\mathcal G_{\mathrm{sk}}$) and sum over the variables $k_{\mathfrak l_2}$, where $\mathfrak l_2$ are leaves of these regular couples and regular trees, similar to (1) above. By Lemma 9.4, there exist $r_1$ and $r_2$ with $r_1+r_2=r$, such that for any fixed $(k_{\mathfrak l_1})$, the codimension of the submanifold formed by the $(k_{\mathfrak l_2})$ variables is $r_2$, and the codimension of the submanifold formed by the $(k_{\mathfrak l_1})$ variables is $r_1$. By repeating the argument in (1) above, we can gain a power $L^{-cr_2}$ in summing over the $(k_{\mathfrak l_2})$ variables. Note that some extra equations satisfied by the $(k_{\mathfrak l_2})$ variables may be of the form $k_{\mathfrak l_2}=\mathrm{const}$ instead of $k_{\mathfrak l_2}=k_{\mathfrak l_2'}$ as in (1), but this does not affect the proof.
Step 2. We further remove the irregular chains from the skeleton $\mathcal G_{\mathrm{sk}}$ and exploit the cancellation as in Section 5. Note that if $\mathcal G$ is not a regular multi-couple, then any $\mathcal{OG}\sim\mathcal G$ must be non-regular in the sense of Definition 9.3, so for fixed $\mathcal G$, the summation in $\mathcal{OG}$ we are studying here is still the same as the one in (9.12) even though we have made the restriction that $\mathcal{OG}$ is non-regular. Thus, as said above, the cancellation for irregular chains works the same way in the current situation as in Section 5. The extra conditions again lead to a gain of powers in $L$. As in Step 1, we can write $r_1=r_3+r_4$, such that we can gain a power $L^{-cr_3}$ in the current step, and after removing the irregular chains, the remaining decoration (of the remaining garden $\mathcal G_{\mathrm{sk}}^{\#}$, see Section 5.4) still satisfies $r_4$ extra linear equations.
Step 3. Finally we reduce the estimate of the remaining expression to the counting problem associated with the molecule formed from $\mathcal G_{\mathrm{sk}}^{\#}$. Here we only need to show that if, in addition to the equations in the counting problem, the variables in question also satisfy $r_4$ additional independent linear equations, then we can improve the upper bound for the counting problem by a power $L^{-cr_4}$ with a small constant $c$.
To see this, we follow the procedure described in Section 6, and in particular apply the algorithm introduced in Section 9.4 of [12]. In this process, where we fix some of the variables in each step, we keep track of the codimension, or the number $p$ of independent equations satisfied by the remaining variables. Initially we have $p\gtrsim r_4$, while in the end we have $p=0$. Therefore, there must be at least $r_4$ steps where $p$ strictly decreases. If such a step is a good step in the sense of Section 9.3 of [12], then we gain a constant power $L^{-\nu}$ there; even if it is a normal step, since $\Delta p<0$, by Lemma 9.4, in doing the counting estimate for this step we can take into account an additional independent linear equation satisfied by the variables under consideration. For example, if we perform the step (3R-1) defined in Section 9.3.8 of [12], then the corresponding counting problem we solve is (say) a system in $(a,b,c)$ with $|a|,|b|,|c|\lesssim 1$, which has $O(L^{2d-2})$ solutions. However, if we add to this system another independent linear equation $\alpha a+\beta b+\gamma c=\mathrm{const}$, where $(\alpha,\beta,\gamma)$ is not a multiple of $(1,-1,1)$, then the number of solutions will be at most $L^{d'}$ with $d'<2d-2$. This leads to a power gain in each such step, so in total we can gain a power $L^{-cr_4}$ for some constant $c$.
After the above three steps, the total power we gain would be $L^{-c(r_2+r_3+r_4)}=L^{-cr}$, which is enough to cover the loss $C^mm^{Cq}$ from $\lambda(\mathcal O)$ because $r\gtrsim q_0\gtrsim q$. Therefore in any case we can cover the possible loss with an extra gain of $L^{-\nu}$, hence (9.13) holds. This completes the proof.

With Proposition 9.5 we can now prove Theorem 1.4.
Proof of Theorem 1.4. We use (3.6) to expand $\mathbb E\big(\prod_{j=1}^{2R}a_{k_j}^{\zeta_j}(t)\big)$. The estimate for the remainder term $b$ can be done using arguments similar to Section 8.2, which we shall omit. Then, using also (9.12), we can write

where $\mathcal G$ runs over all gardens of width $2R$ such that the scale of each tree is at most $N$, and $\mathcal O$ runs over all even partitions of $2(m+R)$ where $m$ is the scale of $\mathcal G$. By Proposition 9.5 and summing over all possible $m_j$ as in the proof of Theorem 1.3 above, we see that with $R$ fixed and $L\to\infty$, the contribution of non-regular over-gardens $\mathcal{OG}$ decays like $L^{-\nu}$ in the limit. Thus, we only need to consider regular over-gardens $\mathcal{OG}$. Suppose $\mathcal{OG}\sim\mathcal G$, then $\mathcal G$ is a regular multi-couple. Therefore, unless we can divide $\{1,\cdots,2R\}$ into pairs such that for each pair $\{i,j\}$ we have $k_i=k_j$ and $\zeta_i=-\zeta_j$, the contribution of regular over-gardens must vanish; in particular (1.2) is true. Now we only need to prove (1.6).
For any regular couple $\mathcal Q=(\mathcal T_1,\mathcal T_2)$, the tree $\mathcal T_1$ is a regular tree if and only if $\mathcal T_2$ is a regular tree (and hence the two lone leaves are paired). In this case we say $\mathcal Q$ is tangential (since the two trees only have one leaf-pair in common), otherwise we say $\mathcal Q$ is non-tangential. Note that by the proof of Theorem 1.3 in [12] we have $\sum_{\mathcal Q}\mathcal M_{\mathcal Q}(t,t,k)=n(\delta t,k)+O(L^{-\nu})$, (9.15) where $\mathcal Q$ runs over all regular couples with both trees having scale at most $N$. If we restrict to tangential couples only, then the sum should be approximated by $n_0(\delta t,k)$ where $n_0$ is defined in (WKE-0). The easiest way to see this is that the expression $\mathcal M_{\mathcal Q}(t,t,k)$ contains a factor $n_{\mathrm{in}}(k)$ if and only if $\mathcal Q$ is tangential, because for any regular tree $\mathcal T$ with root $\mathfrak r$ and lone leaf $\mathfrak l$ we must have $k_{\mathfrak l}=k_{\mathfrak r}=k$. Thus, since the sum (9.15) over all couples exactly matches the Taylor expansion of $n(\delta t,k)$ (as shown in [12]), we know that the sum over tangential couples will exactly match the terms in the Taylor expansion that contain the factor $n_{\mathrm{in}}(k)$. Due to the form of (KIN), it is easy to see that the sum of these terms is exactly $n_0(t,k)$, hence the result. Therefore we have

We now return to the sum (9.14) over regular over-gardens $\mathcal{OG}$. For (1.6), we may rename $(k_1,\cdots,k_{2R})$ such that there are $2a_i$ copies of $k_i^*$ for $1\leq i\leq r$ (with half of them having sign $+$ and half having sign $-$), where the $k_i^*$ are all different and $a_1+\cdots+a_r=R$. For simplicity we will write $k_i$ instead of $k_i^*$ below. Clearly the $2a_i$ trees corresponding to the input variable $k_i$ must form $a_i$ couples in $\mathcal G$; assume $b_i$ of these $a_i$ couples are tangential and the rest are non-tangential, where $0\leq b_i\leq a_i$. Note also that for any $\mathcal{OG}\sim\mathcal G$ we have $\mathcal M_{\mathcal{OG}}=\mathcal M_{\mathcal G}$ because over-pairings can only happen at lone leaves of regular trees. Therefore, for fixed $(b_1,\cdots,b_r)$, we can calculate the contribution of regular over-gardens as

which is just (1.6) given (1.7). This completes the proof.

9.3. Evolution of density. Finally we prove Theorem 1.5. Note that if $\mu_r\leq C^r(2r)!$, then by (1.7) for any $t$ we also have $\mu_r(t,k)\leq C^r(2r)!$, perhaps for some different $C$. Thus, convergence in law will be a consequence of the following lemma:

Lemma 9.6. Suppose $\{X_n\}$ are $\mathbb R^d$-valued random variables, such that for any multi-index $\mu$ the limit $A_\mu:=\lim_{n\to\infty}\mathbb E(X_n^\mu)$ exists and $|A_\mu|\leq C^{|\mu|}(|\mu|)!$. Then $\{X_n\}$ converges in law to a random variable $X$ satisfying $\mathbb E(X^\mu)=A_\mu$ for any multi-index $\mu$. Moreover, the law with these given moments is unique.
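As an illustration of how moments with factorial-type growth pin down a law, the following sketch recovers the characteristic function of a radial random variable $X\in\mathbb C\cong\mathbb R^2$ from its even moments via the Bessel series $\mathbb E\,e^{i\xi\cdot X}=\sum_r(-1)^r(|\xi|/2)^{2r}\,\mathbb E|X|^{2r}/(r!)^2$. The test case is an assumption chosen for convenience: a complex Gaussian with $\mathbb E|X|^2=n$, whose moments are $\mathbb E|X|^{2r}=r!\,n^r$ and whose characteristic function is $e^{-n|\xi|^2/4}$. Function names are hypothetical.

```python
from math import exp, factorial

def gaussian_moments(n, terms):
    """Even moments E|X|^(2r) = r! * n^r of a complex Gaussian with E|X|^2 = n."""
    return [factorial(r) * n ** r for r in range(terms)]

def radial_char_fn(moments, xi_abs):
    """Characteristic function of a radial law on R^2 at |xi| = xi_abs,
    computed from the truncated series
        sum_r (-1)^r * (xi_abs/2)^(2r) * E|X|^(2r) / (r!)^2.
    `moments[r]` is assumed to hold E|X|^(2r)."""
    return sum((-1) ** r * (xi_abs / 2) ** (2 * r) * m / factorial(r) ** 2
               for r, m in enumerate(moments))

if __name__ == "__main__":
    n, xi_abs = 1.0, 1.5
    approx = radial_char_fn(gaussian_moments(n, 40), xi_abs)
    exact = exp(-n * xi_abs ** 2 / 4)
    print(approx, exact)
```

The Gaussian moments $r!\,n^r$ satisfy a bound of the form $C^r(2r)!$, so they fall within the scope of the lemma, and the series converges for every $\xi$.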
Proof. First the assumption implies that $\mathbb E|X_n|^2$ is uniformly bounded in $n$, thus the sequence of laws of $X_n$ is tight. For any subsequence $X_{n_k}$ we then have a further subsequence that converges in law to (say) some random variable $X$. For any $\mu$, since $\mathbb E|X_n|^{2|\mu|}$ is bounded in $n$, it is easy to see that $\mathbb E(X^\mu)=\lim\mathbb E(X_n^\mu)=A_\mu$. Therefore, it suffices to prove that the law of the random variables $X$ with $\mathbb E(X^\mu)=A_\mu$ is unique. This is true because, since $|A_\mu|\leq C^{|\mu|}(|\mu|)!$, we have $\mathbb E(e^{\varepsilon|X|})<\infty$ for small enough $\varepsilon$, hence $f(\xi)=\mathbb E(e^{i\xi\cdot X})$ is well-defined and holomorphic in the region $|\mathrm{Im}\,\xi|<\varepsilon$. The moments $\mathbb E(X^\mu)$ uniquely determine the Taylor expansion of $f(\xi)$ at $\xi=0$, hence uniquely determine the value of $f(\xi)$ in a neighborhood of 0, and consequently in the whole region $|\mathrm{Im}\,\xi|<\varepsilon$ by analyticity. In particular the moments uniquely determine the value of $f(\xi)$ on $\mathbb R^d$, which is the characteristic function of $X$. Thus the law of $X$ is unique, as desired.

Now, for any $t\in[0,\delta]$ and $k\in\mathbb Z_L^d$, consider the unique radial density $\rho=\rho_k(t,v)$ (where $v\in\mathbb C$ is also viewed as an $\mathbb R^2$ vector) such that

We are now ready to prove Theorem 1.6.
Proof of Theorem 1.6. First, if $(n_r)_{\mathrm{in}}$ is hybrid then it must be admissible. In fact, by the definition of hybrid initial data we have

for any distinct $k_j\in\mathbb Z_L^d$ ($1\leq j\leq r$). Since $m$ and $(n_r)_{\mathrm{in}}$ are all continuous functions, by taking suitable limits and letting $L\to\infty$ we know that (10.3) is actually true for all $k_j\in\mathbb R^d$ ($1\leq j\leq r$). Then we simply integrate (10.3) in $k_r$, using the integral condition in the definition of $\mathcal A$, to get (1.15).
From now on we shall assume $(n_r)_{\mathrm{in}}$ is admissible; then by Lemma 10.1, we can find a unique measure $\zeta$ such that (10.3) holds. Consider the hybrid data $u_{\mathrm{in}}$ described in the statement of Theorem 1.6. We can view it as obtained by first randomly selecting $m\in\mathcal A$ with law given by $\zeta$, then working in the same setting as in Theorems 1.1 and 1.3-1.5 with the particular choice $n_{\mathrm{in}}=m$. Since $m\in\mathcal A$, by Theorem 1.1 we know that the conditional probability $\mathbb P(E|m)\geq 1-L^{-A}$ for any $m\in\mathcal A$, where $E$ is the event that (NLS) has a smooth solution up to time $T$. This implies that $\mathbb P(E)=\int_{\mathcal A}\mathbb P(E|m)\,d\zeta(m)\geq 1-L^{-A}$.
Finally we prove (1.18). As is clear from the proof of Theorem 1.4, the remainder in (1.6) is in fact $O(L^{-\nu})$ for some absolute constant $\nu>0$, where the implicit constant in $O(\cdot)$ may depend on $r$, but is uniform in $(t,k_j)$ and $n_{\mathrm{in}}=m$, as long as $m\in\mathcal A$. Thus, by Theorem 1.4, for the hybrid data $u_{\mathrm{in}}$, for $t\in[0,T]$ we have that

In fact, for any $m\in\mathcal A$, by definition $\widetilde m$ is the solution to (WKE) with initial data $m$, so by direct calculation we see that $(\widetilde m)^{\otimes r}$ is a solution to (WKH) with initial data $m^{\otimes r}$. Since (WKH) is linear,

$dk_1\,dk_2\,dk_3$. (KIN)

Here $\delta$ is the Dirac delta, and for $k=(k^1,\cdots,k^d)$ and $\ell=(\ell^1,\cdots,\ell^d)$ we denote

Figure 1. A (1, 1)-mini couple. Here and below two leaves of the same color are paired. There are two possibilities indicated by codes 00 and 01 as in [12].

Figure 2. A mini tree. There are six possibilities indicated by codes 10 ∼ 31 as in [12].

Figure 3. Steps A and B as in Definition 3.4.

Definition 4.3. Define the steps A and B for gardens in the same way as for couples in Definition 3.4, see Figure 3. Define a garden $\mathcal G$ to be prime if it is not obtained from any other garden by performing steps A or B.

Figure 5. A garden whose skeleton is the garden in Figure 4, see Proposition 4.5. Here each $\mathcal T_j$ and $\mathcal T_j'$ represents a regular tree, and each $\mathcal Q_j$ represents a regular couple.

Figure 6. The molecule associated with the garden $\mathcal G$ in Figure 4. Here the atoms 1 ∼ 4 correspond to branching nodes k 1 ∼ k 4, and atoms 5 ∼ 7 correspond to branching nodes 1 ∼ 3 in $\mathcal G$.

Proposition 6.4. Fix $m$ and $R$. Given any molecule $\mathbb M$ of $m$ atoms, the number of gardens $\mathcal G$ of width $2R$ and scale $m$ that correspond to $\mathbb M$ in the sense of Definition 6.2 is at most $(CR)!\,C^m$.

Figure 7. The two types of molecular chains. For type II, the single bonds of the same color are paired single bonds, and must have opposite directions.
$\{1,\cdots,n\}$ subordinate to $\mathcal O$ (we use the notation $\mathscr O\models\mathcal O$), which are partitions of $\{1,\cdots,n\}$ such that the cardinalities of the subsets exactly form the partition $\mathcal O$, and for each subset $A$ exactly half of the signs $\zeta_j$ ($j\in A$) are $+$ and half are $-$. The coefficients $\lambda(\mathcal O)$ are constants depending only on $\mathcal O$ (and $n$), and $\lambda(2,\cdots,2)=1$; in general we have $\lambda(2,\cdots,2,2a_1,\cdots,2a_r)=\lambda(2a_1,\cdots,2a_r)$. Moreover, let $q$ be the sum of the elements in $\mathcal O$ that are at least 4, then we have $|\lambda(\mathcal O)|\leq C_1^nn^{C_1q}$ for some constant $C_1\gg C$.

Proof. We may assume $n$ is even and half the signs $\zeta_j$ ($1\leq j\leq n$) are $+$ and half are $-$, since otherwise both sides of (9.2) are zero. Denote by $|\mathcal O|$ the number of elements in $\mathcal O$ (counted with multiplicity). For two partitions $\mathcal O$ and $\mathcal O'$, we write $\mathcal O'\preceq\mathcal O$ if $\mathcal O'$ can be formed by further partitioning some elements in $\mathcal O$ into even integers (also define $\succeq$ and $\prec$ etc. accordingly). Similarly for set partitions $\mathscr O$ and $\mathscr O'$, we write $\mathscr O'\preceq\mathscr O$ if $\mathscr O'$ can be formed by further partitioning some subsets in $\mathscr O$ (still keeping half of the signs $+$ and half $-$ in each subset). Now, for $\mathcal O'\preceq\mathcal O$, we define $\xi_{\mathcal O,\mathcal O'}$ as follows: given a partition $\mathscr O\models\mathcal O$, consider the number of partitions $\mathscr O'\preceq\mathscr O$ such that $\mathscr O'\models\mathcal O'$. The number of choices for $\mathscr O'$ is independent of the choice of $\mathscr O$, and we define it to be $\xi_{\mathcal O,\mathcal O'}$. Obviously $\xi_{\mathcal O,\mathcal O}=1$. We define the coefficients $\lambda(\mathcal O)$ for each $\mathcal O$ such that they satisfy the following recurrence relation: first $\lambda(2,\cdots,2)=1$, and for each $\mathcal O$, we have $\prod_{2b\in\mathcal O}\mu_b=\sum_{\mathcal O'\preceq\mathcal O}\xi_{\mathcal O,\mathcal O'}\cdot\lambda(\mathcal O')$. (9.3) Here the product is taken over all elements $2b$ appearing in $\mathcal O$, counted with multiplicity. Clearly the values of $\lambda(\mathcal O)$ for each $\mathcal O$ are uniquely determined by (9.

2 . ( 9 . 5 )
is a partition of $n$ and the sets $\{j:k_j=m_i\}$ form a partition $\mathscr O^*\models\mathcal O^*$. Moreover, on the right hand side of (9.2), in order for the product $\prod_{A\in\mathscr O}\prod_{j,j'\in A}\mathbf 1_{k_j=k_{j'}}$ to be nonzero, one must have $\mathcal O\preceq\mathcal O^*$ and $\mathscr O\preceq\mathscr O^*$, and in this case this product equals 1. Thus the right hand side of (9.2) equals

using the definition of $\xi_{\mathcal O^*,\mathcal O}$ and (9.3), as desired.

Next we prove that $C_0=C+40$. The base case $\mathcal O=(2,\cdots,2)$ is clear. By induction, and using that $\mu_r\leq(Cr)!$, we only need to prove that for any $\mathcal O$, we have $\sum_{\mathcal O'\prec\mathcal O}\xi_{\mathcal O,\mathcal O'}\frac{(|\mathcal O|)!}{(|\mathcal O'|)!}\frac{\prod_{2b\in\mathcal O'}(C_0b)!}{\prod_{2b\in\mathcal O}(C_0b)!}\leq 1$

Now, fix a partition $\mathscr O$ of $\{1,\cdots,n\}$ subordinate to $\mathcal O$. To construct $\mathscr O'$, we first fix a partition of each element of $\mathcal O$ into even positive integers, such that the terms in these partitions exactly constitute $\mathcal O'$. Let the number of choices for these partitions be $\eta_{\mathcal O,\mathcal O'}$. Once these partitions are fixed, we have that Number of choices for
Structure of gardens. The key concept of this paper is a generalization of couples, which we call gardens.
Definition 4.1. Given a sequence $(\zeta_1,\cdots,\zeta_{2R})$, where $\zeta_j\in\{\pm\}$ and exactly $R$ of them are $+$, we define a garden $\mathcal G$ of signature $(\zeta_1,\cdots,\zeta_{2R})$ to be an ordered collection of trees $(\mathcal T_1,\cdots,\mathcal T_{2R}$

As assumed we have $R\leq\log L$. Using mass conservation we can bound $|a_{k_j}(t)|\lesssim L^d$ for each $k_j$, so if E 1 is replaced by E 1 \E in (4.14), the corresponding contribution is bounded by

(4.13) follow from (4.7), our choice $N=(\log L)^4$ and Neumann series expansions. Now, to prove (2.9) we need to calculate E 1 E 1 • 19))

Since $R\leq\log L$, this allows us to control the terms where all factors are of the form $\mathcal J_n$, but with $\mathbf 1_E$ replaced by $1-\mathbf 1_E$ (where we simply apply Cauchy-Schwarz and use the fact $\mathbb P(E^c)\leq e^{-(\log L)^3}$); similarly, if at least one factor in the expansion is the remainder $b$, then we can also apply Cauchy-Schwarz and use the bound $|b_k(t)|\leq e^{-(\log L)^4}$ together with (4.19) to control this term. This completes the proof.
The total number of these leaves is $2(m+R)$ where $m$ is the sum of scales of $\mathcal T_j$ ($1\leq j\leq 2R$), and $\mathscr O$ induces an even partition of $2(m+R)$ which we denote by $\mathcal O$, such that $\mathscr O\models\mathcal O$ in the sense of Lemma 9.1. The coefficient $\lambda(\mathcal O)$ is as in Lemma 9.1, and $\mathcal{OG}$ is the set of these $2R$ trees together with the set of over-pairings $\mathscr O$, which we refer to as an over-garden.
Note that we may also write $\mathcal{OG}\models\mathcal O$ instead of $\mathscr O\models\mathcal O$. Finally, like (4.1) we have