Moment bounds for SPDEs with non-Gaussian fields and application to the Wong-Zakai problem

Upon its inception, the theory of regularity structures allowed for the treatment of many semilinear perturbations of the stochastic heat equation driven by space-time white noise. When the driving noise is non-Gaussian, the machinery of the theory can still be used, but it must be combined with an infinite number of stochastic estimates in order to compensate for the loss of hypercontractivity. In this paper we obtain a more streamlined and automatic set of criteria implying these estimates, which facilitates the treatment of further problems involving non-Gaussian noise, such as some general phase coexistence models. As an example, we prove here a generalization of the Wong-Zakai theorem of Hairer and Pardoux.


Introduction
In the paper [10] the main focus was the convergence of smooth approximations to the solution u of the SPDE
∂_t u = ∂_x^2 u + H(u) + G(u)ξ. (1.1)
Here u(t, x) is a function from R_+ × S^1 to R, the functions H, G : R → R are respectively twice and five times continuously differentiable, and ξ denotes space-time white noise. One can immediately obtain a solution u to (1.1) by viewing it as an infinite-dimensional Itô integral equation in time.
The fundamental obstacle to interpreting (1.1) without stochastic calculus is the irregularity of ξ. The smooth approximations satisfy the above equation with ξ replaced by ξ_ε := ξ * ρ_ε, where ρ_ε is a mollifier converging to a space-time delta function as ε ↓ 0. More concretely, the authors of [10] set ρ_ε(t, x) := ε^{-3} ρ(ε^{-2} t, ε^{-1} x), where ρ : R^2 → R is an even, smooth, compactly supported function which integrates to 1. Let u_ε denote the classical solution to the equation driven by ξ_ε. Unsurprisingly, the u_ε do not converge to u in general. One already sees this in finite dimensions, where the Wong-Zakai theorem [19, 20] (for more recent progress cf. [14] and references therein) states that smooth approximations to an SDE converge to the Stratonovich solution of the SDE, which in general differs from the Itô solution. Of course, this discrepancy can be cured by "renormalizing" the SDE, that is, by inserting a Stratonovich-Itô correction term into the mollified SDEs. The main result of [10] is the corresponding result in the SPDE setting.
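The finite-dimensional phenomenon is easy to see numerically. The following sketch (our own illustration, not taken from [10]) drives the ODE dx/dt = x·w'(t) with a piecewise-linear approximation w of a Brownian path: the result lands near the Stratonovich solution exp(W_T) of dX = X ∘ dW, rather than near the Itô solution exp(W_T - T/2) of dX = X dW.

```python
import numpy as np

rng = np.random.default_rng(0)

T, n, m = 1.0, 400, 50        # horizon, coarse steps, ODE substeps per coarse step
dt = T / n
dW = rng.normal(0.0, np.sqrt(dt), size=n)   # Brownian increments
W_T = dW.sum()

# Piecewise-linear approximation of W drives the ODE dx/dt = x * w'(t):
# integrate with small Euler substeps on each linear piece.
x = 1.0
for dw in dW:
    for _ in range(m):
        x += x * (dw / m)

strat = np.exp(W_T)            # Stratonovich solution of dX = X o dW
ito = np.exp(W_T - T / 2)      # Ito solution of dX = X dW

# The smooth approximation is close to the Stratonovich solution, not the Ito one.
print(abs(x - strat), abs(x - ito))
```

The gap between the two limiting solutions is exactly the Itô-Stratonovich correction that has to be inserted into the mollified equation if one wants the Itô solution in the limit.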
We defer exact formulae for the renormalization constants; they can be written explicitly as integrals involving the heat kernel and the mollifier. The ε^{-1} term in (1.2) is exactly the Itô-Stratonovich correction term, which diverges as expected: there is no infinite-dimensional analogue of Stratonovich integration. The definition of H(u) in (1.2) also involves two finite renormalizations, which are chosen so that it is precisely the Itô solution to which the u_ε converge. In fact, along the way the proof of [10] yields a natural notion of solution to (1.1) which is pathwise, a situation analogous to [15] (on rough paths) and [5] (on evolution equations).
In [10, Remark 1.7] Hairer and Pardoux ask if an analogous statement can be proven if one replaces the mollified space-time white noise ξ_ε(t, x) with ε^{-3/2} ζ(ε^{-2} t, ε^{-1} x), where ζ is a non-Gaussian random field which is supported on smooth functions and satisfies a central limit theorem. They conjectured that, in addition to the renormalization seen in the Gaussian case, one would see additional terms of order ε^{-1/2}.
The question of [10, Remark 1.7] is our point of departure. Let ζ be a stationary, centered, generically non-Gaussian random field on R^2 which is almost surely continuous and for which all cumulants exist and are exponentially decaying. We also assume that ζ is normalized so that ∫ E[ζ(0)ζ(z)] dz = 1. Let ζ^{(ε)} be a random field on R × [-1/(2ε), 1/(2ε)] which is a periodization of ζ (see Remark 1.3). We then set
ζ_ε(t, x) := ε^{-3/2} ζ^{(ε)}(ε^{-2} t, ε^{-1} x). (1.4)
Our result is then the following, which is proved using the theory of regularity structures developed in [7].
Theorem 1.2. Let ζ be as above, let H and G be as in Theorem 1.1, and as before let u be the Itô solution to (1.1) started from some initial condition u(0, ·) ∈ C(S^1). Then there exist constants, all of which are independent of the parameter ε, such that the following holds: suppose that u_ε is the classical solution to the renormalized random PDE (1.5)-(1.6), whose counterterms include in particular c^{(4)}_ζ G'(u)G'(u)G^2(u), and that u_ε is started with the initial condition u(0, ·) ∈ C(S^1). Then for every T > 0 the family of random functions u_ε converges in law, as ε ↓ 0, to the Itô solution u in the space C^{α/2,α}([0, T] × S^1), for any α ∈ (0, 1/2).
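At the level of covariances, the scaling (1.4) together with the normalization ∫ E[ζ(0)ζ(z)] dz = 1 is exactly what makes ζ_ε an approximation of white noise. A sketch (ignoring the periodization, and writing C_2(z) = E[ζ(0)ζ(z)]):

```latex
\mathbf{E}\big[\zeta_\varepsilon(z)\,\zeta_\varepsilon(\bar z)\big]
  = \varepsilon^{-3}\, C_2\!\big(\varepsilon^{-2}(t-\bar t),\, \varepsilon^{-1}(x-\bar x)\big),
\qquad z=(t,x),\ \ \bar z=(\bar t,\bar x).
```

Integrating against a test function φ and substituting w = (ε^{-2}(t - t̄), ε^{-1}(x - x̄)), the Jacobian ε^3 cancels the prefactor and the expression converges, as ε ↓ 0, to φ(z) ∫ C_2(w) dw = φ(z); that is, the covariance of ζ_ε converges to a delta function. The higher cumulants, which do not vanish for non-Gaussian ζ, are what produce the additional counterterms of order ε^{-1/2} conjectured in [10, Remark 1.7].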
We now explicitly specify the renormalization constants appearing in the above theorem. (The n-th cumulant of ζ will be written as C_n.) For example, we have
C = ∫_{R^2} ∫_{R^2} dz_1 dz_2 P(z_2 - z_1) P(0 - z_2) C_3(z_1, z_2, 0),
where P denotes the heat kernel.
Finally, there is a notation for a renormalized kernel: if the variables corresponding to its endpoints are z_1 and z_2, then it represents the corresponding renormalized kernel of those two variables.

Remark 1.3. The field ζ^{(ε)} is said to be a periodization of ζ in the same sense as in [12, Assumption 2.1], i.e. for every sufficiently small ε > 0 there exists a coupling of ζ^{(ε)} and ζ such that (1.8) holds for every T > 0 and every δ > 0. As an example, let μ be a Poisson point process on R^2 × [0, 1] with uniform intensity measure, let φ(t, x, a) be a continuous function bounded by e^{-|t|-|x|}, and set
ζ(t, x) := ∫_{R^2 × [0,1]} φ(t - s, x - y, a) μ(ds, dy, da) - μ(φ). (1.9)
Let μ^{(ε)} be the periodic extension to R^2 × [0, 1] of a Poisson point process on R × [-1/(2ε), 1/(2ε)] × [0, 1] with uniform intensity measure, let φ^{(ε)} be obtained from φ by multiplying with a continuous cutoff in x that is equal to 1 when |x| < ε^{-1/4} and 0 when |x| > 2ε^{-1/4}, and let ζ^{(ε)} be as in (1.9), with μ replaced by μ^{(ε)} and φ replaced by φ^{(ε)}. Then one can verify that for ε small enough (1.8) is satisfied, where the natural coupling between ζ and ζ^{(ε)} is such that the two fields agree on a suitable space-time region. Note that the cumulants of ζ^{(ε)} are allowed to have infinite range (but exponentially decaying) in t.

Remark 1.4. The same results as Corollaries 1.10, 1.11, 1.13 in [10] (on local continuity of the solution with respect to the initial condition in a pathwise sense, a sharp regularity result for the solution, etc.) can also be proved. Since the proofs follow along the same lines, we refrain from redoing them here.

Moment estimates for SPDE with non-Gaussian fields
Much of our paper is spent developing a set of criteria for estimating moments of certain non-Gaussian random variables, written (Π_0 τ)(φ^λ_0) and defined in Section 2. These random variables are indexed by rooted trees. Each such random variable is a multilinear functional of the driving noise ζ_ε, and the tree describes how to write this functional as an integral: the edges correspond to kernels, each leaf carries an occurrence of the driving noise, and all vertices other than the root correspond to integrated space-time variables.
Roughly speaking, the p-th moment of such a random variable can be written as a graphical sum in which one takes p copies of the tree, identifies their roots, and then sums over all possible ways of grouping the leaves into cumulants. For Gaussian noise only second cumulants appear, but in general we have higher-order cumulants, which we view as hyperedges (edges incident to more than two vertices). We refer to this graphical sum as a cumulant expansion.
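The combinatorial backbone of this expansion is the classical moment-cumulant formula E[X_1 ⋯ X_p] = Σ_π Π_{B∈π} κ(X_i : i ∈ B), the sum running over all set partitions π of {1, ..., p}. A minimal sketch (our own illustration, with a toy cumulant function): for a centered Gaussian only pair cumulants survive, so the sum collapses to pairings (the Wick/Isserlis formula), recovering E[X^4] = 3σ^4.

```python
def partitions(items):
    """Enumerate all set partitions of a list, as lists of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # put `first` in its own block
        yield [[first]] + part
        # or insert `first` into each existing block
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

def moment(indices, cumulant):
    """Joint moment from joint cumulants:
    E[X_{i1}...X_{ip}] = sum over partitions of products of block cumulants."""
    total = 0.0
    for part in partitions(list(indices)):
        prod = 1.0
        for block in part:
            prod *= cumulant(block)
        total += prod
    return total

# Centered Gaussian with variance sigma2: only the pair cumulant is nonzero.
sigma2 = 1.0
def gaussian_cumulant(block):
    return sigma2 if len(block) == 2 else 0.0

print(moment(range(4), gaussian_cumulant))  # 3 pairings of 4 leaves -> 3.0
```

In the non-Gaussian setting the function `gaussian_cumulant` is replaced by cumulants of all orders, and each block of size at least 3 becomes a hyperedge in the graphical sum.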
Estimates on moments are needed to establish regularity/homogeneity of certain random space-time processes via a generalization of the Kolmogorov continuity criterion; the random variable we are estimating is the analogue of an "increment" of the process, with the size of the increment given by a parameter λ. Our primary goal is, for each p, a bound of the type
E[|(Π_0 τ)(φ^λ_0)|^p] ≲ λ^{p|τ|}, (1.10)
uniform in λ ∈ (0, 1] and in ε small, where the homogeneity |τ| is determined by the structure of the tree.
Here, for any continuous function φ : R^2 → R, any λ ∈ (0, 1], and any z = (t, x), we write φ^λ_z(s, y) := λ^{-3} φ(λ^{-2}(s - t), λ^{-1}(y - x)) for the rescaled and recentered test function. For the bound (1.10) we need to estimate, for each of the larger hypergraphs appearing in the cumulant expansion of the given moment, a complicated convolution of kernels and cumulants. The paper [11] developed criteria on graphs which guarantee that they satisfy the desired upper bound. Moreover, by hypercontractivity (e.g. [7, Lemma 10.5]), for a random variable X in a Wiener chaos of fixed order one has E[|X|^{2p}] ≲_p (E[X^2])^p. Thus in the Gaussian case one needs (1.10) only for p = 2, which is obtained by checking a small number of graphs.
^3 See [9, Section 5] for a more in-depth discussion.
When hypercontractivity is not available, establishing (1.10) for every p involves estimating infinitely many hypergraphs^4. This problem was tackled in [12], where a criterion on individual trees was given which implies that any larger graph built by merging p copies of such a tree satisfies the criterion of [11]. However, the criteria of [12] require one to perform some manipulations on the trees and on the larger graphs obtained after the merging: (i) cumulants had to be replaced by collections of edges and good factors had to be distributed according to "epsilon allocation rules", and (ii) so-called "positively renormalized" edges had to be estimated by hand on a case-by-case basis, leading to more trees to check.
In this paper we provide streamlined criteria for these trees, and proving the sufficiency of these criteria will also be easier. By working with hyperedges directly, issue (i) is avoided; this makes proving very general results like [16] much easier. Handling the positively renormalized edges automatically deals with issue (ii), which makes the treatment of the Wong-Zakai problem much easier.
In Section 2 we fix our regularity structure and formulate the abstract fixed point problem for the Wong-Zakai equation in a space of modelled distributions. In Section 3 we prove Theorem 3.19, which states that certain criteria on the graphs yield the desired moment bounds. In Section 4 we apply the results of Section 3 to the Wong-Zakai problem and prove Theorem 1.2.

Regularity Structures
The moment estimates we prove will be used as input for the theory of regularity structures developed in [7] (see also [6]). This machinery allows us to go from these estimates to the construction of an actual solution to the SPDE in question along with convergence of regularized and renormalized solutions to this limiting solution.
We refer readers looking for a detailed exposition to [8], [3, Ch. 15], and [1]; our description of the theory will be quite brief. The most basic object in the theory is a regularity structure, which consists of a triple (A, T, G). The set A ⊂ R is a list of the possible homogeneities we allow in our expansions; it is assumed to be locally finite and bounded below. T is a graded vector space T = ⊕_{α∈A} T_α, where each T_α is a Banach space with a distinguished basis. G is a group of continuous linear transformations on T with the property that for all α ∈ A, τ ∈ T_α, and Γ ∈ G one has (Γτ - τ) ∈ ⊕_{β<α} T_β. A regularity structure is used to describe "jets of abstract Taylor expansions": the vector space T is the target space for the jets, and the structure group G includes transformations of the target space corresponding to change-of-basepoint operations.

The Wong-Zakai regularity structure
The specific regularity structure we use for our Wong-Zakai type model is exactly the same as the one used in [10]; in particular, T is spanned by a set of indeterminates τ ∈ W, each carrying a homogeneity |τ|. We first define a larger class of indeterminates and then take W as an appropriate subset.
We start with the indeterminates 1, X_0, and X_1, which are the abstract counterparts of 1, t, and x; since our scaling is parabolic we set |1| = 0, |X_0| = 2, and |X_1| = 1. Given a multi-index k = (k_0, k_1) ∈ N^2 we write X^k as a shorthand for X_0^{k_0} X_1^{k_1}. For such k we set |k|_s := 2k_0 + k_1, so that |X^k| = |k|_s. We also write T̄ for the commutative algebra generated by 1, X_0, and X_1, the set of abstract polynomials.
Define U to be the smallest collection of indeterminates which contains 1, X_0, and X_1 and satisfies the conditions (i) τ ∈ U ⇒ I(τ) ∈ U, (ii) τ ∈ U ⇒ I(Ξτ) ∈ U, and (iii) τ, τ̄ ∈ U ⇒ ττ̄ ∈ U. Finally we set
W := {τ ∈ U ∪ {τΞ : τ ∈ U} : |τ| ≤ 5/2}. (2.1)
A is given by the set of homogeneities that appear in W, which by [7, Lemma 8.10] is bounded below and locally finite; moreover, for each α ∈ A the vector space T_α spanned by the indeterminates of homogeneity α in W is finite dimensional. We often draw the elements of W as symbolic trees, with Ξ drawn as a leaf. Each occurrence of the abstract integration map I is then denoted by a downward straight line. The product of τ and τ̄ is represented by attaching the trees for τ and τ̄ at the root; for example, we have the tree for ΞI^2(Ξ). Note that we never see an expression of the form Ξ^2 in W. We also use a shorthand symbol for ΞX_1. The elements of W with negative homogeneity form a finite list of such trees. Having defined the T of the Wong-Zakai regularity structure, we now turn to defining the structure group G. To do this we introduce another set of indeterminates, denoted W_+, and denote by T_+ the commutative algebra they generate. The construction of the structure group can be summarized as follows: there will be a single "abstract" matrix of indeterminates from T_+ which acts on T; all the individual elements of G arise by specifying an appropriate map f ∈ T_+^*, where T_+^* is the set of algebra homomorphisms from T_+ to R.
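The homogeneity bookkeeping above is purely recursive and easy to mechanize. In the sketch below (our own illustration), we take the convention of [10] that the noise symbol Ξ carries homogeneity -3/2 - κ for a small κ > 0 (the value of κ used here is arbitrary), abstract integration I raises homogeneity by 2, and products add homogeneities.

```python
from fractions import Fraction

KAPPA = Fraction(1, 100)          # small parameter kappa > 0 (illustrative value)
XI = Fraction(-3, 2) - KAPPA      # homogeneity of the noise symbol Xi, as in [10]

def I(h):
    """Abstract integration against the heat kernel raises homogeneity by 2."""
    return h + 2

def X(k0, k1):
    """Homogeneity of the monomial X^k under the parabolic scaling |k|_s = 2k0 + k1."""
    return Fraction(2 * k0 + k1)

# Products of symbols add homogeneities; a few examples:
print(I(XI))          # |I(Xi)|    =  1/2 - kappa
print(XI + I(XI))     # |Xi I(Xi)| = -1 - 2*kappa  (negative homogeneity)
print(XI + X(0, 1))   # |Xi X_1|   = -1/2 - kappa
```

Running such a recursion over all symbols generated by the rules (i)-(iii), subject to the cutoff |τ| ≤ 5/2 of (2.1), is how the finite list W, and in particular its negative-homogeneity part, can be enumerated.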
Following [10], we fix the set W_+. Here the operators J_k(·) on W_+ are analogous to I[·] on W. We use the convention that J_k(τ) := 0 if |τ| ≤ -2 + |k|_s. The abstract matrix described earlier will be the map Δ : T → T ⊗ T_+, which we now define recursively on T. The base cases are the polynomial symbols and Ξ; Δ is then extended recursively to products and to symbols of the form I(τ). The product on the right-hand side of the definition on products is component-wise. Also note that the sum in the definition on I(τ) has only finitely many non-vanishing terms.
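The recursion for Δ takes, schematically, the following form (a sketch after [7, Section 8]; the precise conventions and index ranges are those of [7]):

```latex
\Delta \mathbf{1} = \mathbf{1}\otimes\mathbf{1}, \qquad
\Delta \Xi = \Xi\otimes\mathbf{1}, \qquad
\Delta X_i = X_i\otimes\mathbf{1} + \mathbf{1}\otimes X_i, \\
\Delta(\tau\bar\tau) = (\Delta\tau)(\Delta\bar\tau), \qquad
\Delta \mathcal{I}(\tau)
  = (\mathcal{I}\otimes\mathrm{Id})\,\Delta\tau
  + \sum_{\ell,m} \frac{X^{\ell}}{\ell!}\otimes \frac{X^{m}}{m!}\,\mathcal{J}_{\ell+m}(\tau).
```

The convention J_k(τ) = 0 for |τ| ≤ -2 + |k|_s is exactly what makes the sum in the last line finite.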
Given any f ∈ T_+^*, we define a linear transformation Γ_f : T → T by setting Γ_f τ := (Id ⊗ f)Δτ. We then define G to be the set of all linear transformations of this form. The only non-trivial thing to check is that G forms a group. This is done by equipping T_+ with a Hopf-algebra structure for which Δ serves as a comodule coproduct; we refer the curious reader to [7, Section 8].
The existence of such a kernel K (a compactly supported kernel agreeing with the heat kernel near the origin and annihilating polynomials up to a fixed order) is not hard to show, see [7, Section 5], and we consider it fixed for the rest of the paper. We now have everything in place to define the set of admissible models M on the Wong-Zakai regularity structure. A model is a pair of maps (Π, Γ) with Π : R^2 → L(T, S'(R^2)), written z ↦ Π_z, and Γ : R^2 × R^2 → G, written (z, z̄) ↦ Γ_{z z̄}. Here L(T, S'(R^2)) is the space of linear maps from T into the space of tempered distributions S'(R^2). These maps are required to satisfy the algebraic conditions Π_z Γ_{z z̄} = Π_{z̄} and Γ_{z z̄} Γ_{z̄ z̃} = Γ_{z z̃} for any z, z̄, z̃ ∈ R^2.
(2.4)
Let B be the set of all φ ∈ S(R^2) supported on the ball of radius 1 and with |D^k φ| ≤ 1, where D denotes differentiation and k ∈ N^2 with |k|_s ≤ 2. We also require that models satisfy the analytic bounds (2.5) for each compact set K ⊂ R^2, uniformly over φ ∈ B, λ ∈ (0, 1], τ ∈ W, and α ∈ A. In the second bound, ‖τ‖_α denotes the T_α-norm of the T_α-component of τ. The set of models can be equipped with a family of pseudometrics indexed by compact sets K ⊂ R^2: for two models Z = (Π, Γ) and Z̄ = (Π̄, Γ̄) one sets |||Z; Z̄|||_K to be the maximum of the two optimal constants for the bounds of (2.5), where in the first and second bound the individual objects are replaced by the differences (Π_z - Π̄_z) and (Γ_{z z̄} - Γ̄_{z z̄}), respectively. Together the pseudometrics |||·; ·|||_K make the space of models a non-linear metric space.
To formulate the condition of admissibility it is convenient to switch to parameterizing models by pairs of maps (Π, f), where Π is as before and f : R^2 → T_+^*, z ↦ f_z. The correspondence between f's and Γ's is given by Γ_{z z̄} = (Γ_{f_z})^{-1} Γ_{f_{z̄}}, where we use the notation of (2.3). It is straightforward to verify that a pair (Π, Γ) constructed in this way automatically satisfies condition (2.4).
The notion of admissibility can then be stated as follows.
Definition 2.1. A pair of maps (Π, f) as above is said to be admissible on (T, G) if the defining identities relating Π and f to the kernel K hold for all z, z̄ ∈ R^2 and for any multi-index k. We remark that if the pair (Π, f) is admissible and Π satisfies the first bound of (2.5), then the Γ's built from f satisfy the second, and (Π, f) determines a model. We denote by M the complete metric space of admissible models.
We now describe one way to lift a continuous space-time function ψ to a corresponding admissible model Z^ψ = (Π, f). The algebraic constraints placed on admissible models are quite strong: once one prescribes the action of Π on the noise symbol and on products, one can use the identities of Definition 2.1 to define the rest of the action of (Π, f). We call the model Z^ψ built this way the canonical model built from ψ.

A defect of the family of models Z^ε is that they do not converge to a limiting model in M as ε ↓ 0, the key difficulties coming from symbols τ which correspond to products of insufficiently regular space-time processes. We will have to modify this family to get a new collection of renormalized models Ẑ^ε = (Π̂^ε, f̂^ε); in general these new models will not satisfy the second identity of (2.6), that is, they will no longer respect the product structure. It is a fairly non-trivial task to determine how to deform the product property of a canonical model and still be left with an admissible model. In the theory of regularity structures this type of deformation of the product property is encoded via the action of a linear map M : T_0 → T_0 for an appropriate subset T_0 ⊂ T. One then has the following theorem, which is a combination of Prop. 8.36, Def. 8.41, and Theorem 8.44 in [7] and Theorem B.1 of [11].
Let M : T_0 → T_0 be a linear map that commutes with both the application of I[·] and multiplication by X^k. Then there exist a unique linear, multiplicative map M̂ : T_+ → T_+ fixing abstract polynomials and a unique linear map Δ^M : T → T ⊗ T_+ compatible with M in the sense of [7, Def. 8.41]. Suppose furthermore that the map Δ^M is upper triangular, that is, for every α ∈ A and τ ∈ T_α, Δ^M τ equals τ ⊗ 1 plus terms whose first factor has homogeneity strictly greater than α. Furthermore, the family of M satisfying the above properties forms a group R under composition.
Later, we will prescribe renormalization maps M_ε, sketch how one checks the upper-triangularity condition for Δ^{M_ε}, and set Π̂^ε accordingly. Returning to our previous example, one will have Δ^{M_ε} = M_ε ⊗ 1, with M_ε acting on the relevant symbol by subtracting a constant of order ε^{-1}.

Modeled distributions and abstract fixed point problem
Given an admissible model Z, one can then formulate abstract fixed point problems in spaces of modelled distributions D^{γ,η}.
Definition 2.3. Given an admissible model Z ∈ M and γ, η ∈ R, we define the space of modelled distributions D^{γ,η} to be the set of all functions U : R^2 → ⊕_{α<γ} T_α satisfying the corresponding local bounds on every compact set K ⊂ R^2.
One of the main theorems of [7] says that there exists a reconstruction operator R mapping elements of D^{γ,η} to actual functions or distributions. In the space D^{γ,η} one can define notions of multiplication and of composition with smooth functions. It is also possible to construct a linear operator P on this space which represents space-time convolution with the heat kernel, namely one has RPU = P * RU, where * denotes space-time convolution. These constructions allow us to formulate and solve abstract fixed point problems in the space D^{γ,η}, and then apply the operator R to the abstract solution, which yields an actual function or distribution. For instance, for equation (1.1) the abstract fixed point problem in the space D^{γ,η} is formulated as in (2.8), where P u_0 is understood as naturally lifted to the abstract polynomials T̄. We actually consider this fixed point problem in a subspace D^{γ,η}_U ⊂ D^{γ,η} consisting of functions that take values in the span of U rather than W (see (2.1)).
with the understanding that any product of terms whose homogeneities add up to γ or more vanishes. Another property we will use is that PU - IU ∈ T̄, so any solution U to (2.8) satisfies the corresponding identity for all points z = (t, x) with t ∈ (0, T). It follows from (2.10) and (2.9) that if we consider the solution as an element of D^{γ,η} with γ greater than, but sufficiently close to, 3/2, then U can be written in terms of two functions u and u'. The symbols appearing there were introduced in Subsection 2.1. In D^γ, for γ > 0 sufficiently close to 0, we have an identity for the renormalized product; this expansion is needed to derive the renormalized equations (1.5) and (1.6).

Graphical Moment Bounds
As discussed earlier, the moments we need to bound can be represented by sums over graphs, with hyperedges representing the higher cumulants of the non-Gaussian noise. In this section we prove Theorem 3.19, which states that Assumption 3.17 for a given tree implies the desired bounds for every graph that appears in the aforementioned sum for the moment of that tree. We do not specialize to (1.1) and instead work in a general setting.

Assumptions on kernels associated to hyper-edges
We start by recalling the notion of labeled coalescence trees, which will be useful both for the definition of norms on kernels that are functions of more than two variables and for the proof of Theorem 3.7. A labeled coalescence tree (T, ℓ) is a rooted binary tree in which every inner node v is associated with a natural number ℓ_v, in a way that respects the partial order of the nodes: ℓ_v ≥ ℓ_w whenever v ≥ w (i.e. whenever w belongs to the shortest path connecting v to the root).
We denote by T_n the set of labeled coalescence trees with precisely n leaves. Given (T, ℓ) ∈ T_n we define D(T, ℓ) ⊂ (R^d)^n to be the set of all tuples (x_1, ..., x_n) whose pairwise separations are compatible with the labels ℓ.
Definition 3.1. For any α ≥ 0 and any function κ_n of n > 2 space-time points, we define the corresponding norms; in these definitions, ϱ denotes the root of the tree (T, ℓ).

Definition 3.2.
Given any α ≥ 0 and p ∈ N, and a function κ_2 of two space-time points, the norm ‖κ_2‖_{α,p} is defined analogously.
Proof. Without loss of generality suppose that (A.1) holds with θ = e^{-1}. Fix n. We want to show that if z ∈ D(T, ℓ) then the right-hand side above is bounded by some constant times 2^{n ℓ(ϱ)|s|/2}. If 2^{-ℓ(ϱ)} < ε then this is immediate (just bound the exponential factor by one); otherwise the claim follows from the inequality e^{-t} ≤ t^{-n|s|/2} for t ≥ 1.

The entire graph with hyper-edges
We now state a modified version of the bound on generalized convolutions found in [11]. The difference here is that we allow for the presence of the hyperedges described above. The proof of our version of the bound essentially follows in the same way as that found in [11], so instead of giving a full proof here we only list the ways in which the proof needs to be modified.
The basic setting for these proofs is the encoding of the key properties of our generalized convolution as a (decorated) finite graph G = (V, E). As before, V is the vertex set, which includes a subset of distinguished vertices V_⋆, one of which we call 0. The set E can be decomposed as E = E_2 ⊔ E_h (⊔ denoting disjoint union), where E_2 is the set of ordinary directed edges (denoted by ordered pairs (e_-, e_+) with e_-, e_+ ∈ V) and E_h is the set of hyperedges (subsets e ⊂ V with |e| ≥ 3)^7.
We make further assumptions on the set E which we list below.
• For any e ∈ E_2 one has |e ∩ ∪_{ē∈E_h} ē| ≤ 1. • For all e ∈ E_h one has e ∩ V_⋆ = ∅.
The edges e ∈ E are also decorated with labels a_e, r_e, where a_e ∈ R and r_e ∈ Z. We now list the assumptions we make on these labels.
(a) For every e ∈ E_h, a_e = |e||s|/2 and r_e = 0.
(b) If e ∈ E_2, ē ∈ E_h, and e ∩ ē ≠ ∅, then r_e ≤ 0 and e ∩ ē = {e_-}.^8
The edges e ∈ E_2 are associated with kernels K_e which are smooth away from the origin and satisfy ‖K_e‖_{a_e, p} < ∞ for every p > 0. The edges e ∈ E_h are associated with functions κ_e on (R^d)^{|e|} with ‖κ_e‖_{a_e} < ∞, where ‖κ_e‖_{a_e} is defined in Definition 3.1. We will write κ_e(e) := κ_e(x_1, ..., x_{|e|}) if e = {x_1, ..., x_{|e|}}. For edges e ∈ E_2 one also has renormalized kernels K̂_e, defined as follows. If r_e < 0 then one defines the distribution K̂_e in terms of a collection of real numbers {I_{e,k}}_{|k|_s < |r_e|}, the distributional "kernel" K̂_e acting on smooth φ as in (3.3). To lighten notation we assume that the κ_e are always symmetric functions of their arguments. With these notations, the key quantity of interest is the generalized convolution I_G^λ defined below.
^7 We sometimes use the term edge to refer to any element of E, not just the elements of E_2.
^8 This requirement allows us to ensure that we never need estimates on derivatives of the kernels associated to our hyperedges.
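The negative renormalization encoded by r_e < 0 is the usual Taylor-jet subtraction at the origin. As a sketch consistent with [11, Appendix A] (the numbers I_{e,k} then correspond to the subtracted moments of K_e), K̂_e acts on a test function φ by

```latex
\big(\widehat K_e\big)(\varphi)
  \;=\; \int_{\mathbf R^{d}} K_e(x)\,
  \Big( \varphi(x) \;-\; \sum_{|k|_{\mathfrak s} < |r_e|} \frac{x^{k}}{k!}\, D^{k}\varphi(0) \Big)\, dx .
```

Subtracting the jet makes the integral absolutely convergent even though K_e alone is too singular at the origin to integrate against φ directly.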
For any V̄ ⊂ V, the subsets E↑(V̄) and E↓(V̄) of E are defined in the same way as in [11], namely E↑(V̄) = {e ∈ E : e ∩ V̄ = {e_-}, r_e > 0} and E↓(V̄) = {e ∈ E : e ∩ V̄ = {e_+}, r_e > 0}; in particular E↑(V̄), E↓(V̄) ⊂ E_2 by our assumption on the label r_e. We use the shorthands r_e^+ = (r_e ∨ 0) and r_e^- = -(r_e ∧ 0). We now state our main assumptions on the labels (a_e, r_e)_{e∈E}. 3. For every subset V̄ ⊂ V containing 0 and of cardinality at least 2, one has the corresponding bound. 4. For every non-empty subset V̄ ⊂ V \ V_⋆, one has the corresponding bounds. Remark 3.6. The second assumption above is automatic for V̄ = e ∈ E_h, since condition (A.2) then asks that |s||V̄|/2 < |s|(|V̄| - 1), which follows from |e| = |V̄| ≥ 3.
The main result of this subsection is the following: under these assumptions one has a bound of the form |I_G^λ| ≲ λ^α, where α = |s| |V \ V_⋆| - Σ_{e∈E} a_e, λ ∈ (0, 1], p = max{|r_e| : e ∈ E} + 1, and the proportionality constant depends only on the structure of G and the labels r_e.

Multiscale expansion
The proof of Theorem 3.7 is by a multiscale analysis implemented through a scale decomposition of all kernels. For e ∈ E_2 the kernels K̂_e are decomposed into an infinite collection of kernels K̂_e^{(n)} with n ∈ N^3, just as in [11, Lemmas A.4, A.5]. We remark that, as in [11], for an edge e ∈ E_2 without any kernel K_e associated to it, the kernel K̂_e^{(n)} is still defined and we set (a_e, r_e) = (0, 0).
We also implement a multiscale decomposition for the κ_e as follows, where n_{i,j} = n_{{v_i, v_j}} ∈ N^3. We define κ_e^{(n_e)} as follows: we set κ_e^{(n_e)} = 0 unless n_{i,j} = (m_{i,j}, 0, 0) for every i < j; in the latter case we set κ_e^{(n_e)} to be κ_e multiplied by the corresponding cutoffs, where N = |e|, Ψ^{(n)} is the cutoff function supported in the annulus of radius ~ 2^{-n} with k-th derivative bounded by 2^{|k|_s n}, and Σ_n Ψ^{(n)} = 1.
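The cutoffs Ψ^{(n)} form a standard dyadic (Littlewood-Paley type) partition of unity. The following sketch (one scalar variable, and our own bump construction rather than the one used in the paper) checks numerically that the annular pieces at scales 2^{-n} sum to 1 away from the origin.

```python
import numpy as np

def bump(r):
    """Smooth cutoff: 1 for r <= 1, 0 for r >= 2, smooth in between."""
    r = np.asarray(r, dtype=float)
    out = np.zeros_like(r)
    out[r <= 1.0] = 1.0
    mid = (r > 1.0) & (r < 2.0)
    t = r[mid] - 1.0                      # in (0, 1) on the transition region
    # standard smooth step built from exp(-1/s)
    f = lambda s: np.where(s > 0, np.exp(-1.0 / np.maximum(s, 1e-300)), 0.0)
    out[mid] = f(1.0 - t) / (f(1.0 - t) + f(t))
    return out

def Psi(n, r):
    """Annular piece at scale 2^{-n}: supported where |r| ~ 2^{-n}."""
    return bump(2.0 ** n * r) - bump(2.0 ** (n + 1) * r)

# The sum over scales telescopes to bump(r) - bump(2^N r) -> 1 for 0 < r <= 1.
r = np.linspace(1e-4, 1.0, 1000)
total = sum(Psi(n, r) for n in range(30))
print(np.max(np.abs(total - 1.0)))
```

The derivative bound |D^k Ψ^{(n)}| ≲ 2^{|k|_s n} in the text is automatic for such a construction, since Ψ^{(n)} is a fixed profile rescaled by 2^n.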
As in [11], for λ ∈ (0, 1] let N_λ be the set of n such that 2^{-|n_e|} ≤ λ for every e = (0, v) with v ∈ V \ {0}. Since φ^λ can be viewed as a kernel with a = 0 and norm λ^{-|s|}, for the proof of Theorem 3.7 it suffices to show (3.9).

Multiscale clustering and coalescence trees
As in [11], one can associate a coalescence tree (T, ℓ) ∈ T(V) to any collection of vertex positions {x_v}_{v∈V_0} with x_v ∈ R^d via a coalescing process. For any two nodes u, v of T, u ∧ v denotes their most recent common ancestor. For any edge e ∈ E of the graph (possibly a hyperedge), e↑ is the common ancestor of all the leaves corresponding to the points in e, and e⇑ is the immediate ancestor of e↑. Given a labelled tree (T, ℓ) ∈ T(V) and a constant c > 0, we define the set N(T, ℓ) of functions n : V^2 → N^3 as in [11], with the additional constraint that for every e ∈ E_h and every {v, w} ⊆ e we enforce n_{(v,w)} = (m, 0, 0) with |m - ℓ_{v∧w}| ≤ c. If {v, w} ∉ E_2 and {v, w} ⊄ e for every e ∈ E_h, then the set N(T, ℓ) imposes no requirement on n_{(v,w)}. Lemma 3.9. There exists c > 0 such that the following holds: for every n with the property that the integral in (3.9) is non-vanishing, there exists an element (T, ℓ) ∈ T(V) with n ∈ N(T, ℓ).
Proof. The only difference between our setting and that of [11, Lemma A.9] is the set of additional constraints imposed by requiring the support of κ_e^{(n_e)} to be non-empty for every e ∈ E_h; the argument otherwise remains exactly the same. Lemma 3.9 allows us to bound I_G^λ by a sum over labelled trees as in (3.10), where T_λ(V) ⊂ T(V) is the set of coalescence trees such that 2^{-ℓ_{v∧w}} ≤ λ for any v, w ∈ V_⋆. As in [11], when one wants to implement negative renormalizations to get a better (convergent) bound on the contribution from certain problematic labeled trees (T, ℓ) ∈ T_λ(V), the procedure is to replace the integrand appearing in (3.9) with a cleverly chosen function K̃^{(n)}(x) (see below) which satisfies supp K̃^{(n)} ⊂ D(T, ℓ) and integrates to the same value. In the resulting expression, T• denotes the set of inner nodes of T.
In [11] the key criterion used to obtain the bound (3.8) is the following lemma. The distinguished node v_⋆ in this lemma will correspond to the internal node of T which is the first common ancestor of the leaves in V_⋆.
Furthermore, suppose that the two conditions on η stated above hold. Then one has I_λ(η) ≲ λ^{|η|}, uniformly for λ ∈ (0, 1], where |η| is an explicit exponent determined by η. Since Lemma 3.10 is a result only about coalescence trees, and has nothing to do with the graph (V, E), we do not need to re-prove it.
The goal is to show that Assumption 3.5 implies that for any labeled tree (T, ℓ) ∈ T_λ(V) we can find an η : T• → R such that: (i) the bound holds uniformly in n ∈ N(T, ℓ), and (ii) the function η satisfies the two conditions of Lemma 3.10.

Definition of η and proof of the theorem
As in [11], let A_- ⊂ E be the subset of edges e with r_e < 0 such that e↑ has only two descendants, e_- and e_+, in the tree T. Given any edge e = (e_-, e_+) and any r > 0, we define an operator Y_e^r acting on sufficiently smooth functions V, where D_{e_+} is differentiation with respect to the coordinate x_{e_+}, and P_e(x)_v = x_v if v ≠ e_+ while P_e(x)_v = x_{e_-} if v = e_+ (in other words, P_e replaces x_{e_+} by x_{e_-}).
We replace the integrand in (3.10) by the function K̃^{(n)} obtained by applying the operators Y_e^{r_e} for the edges A_- = {e^{(1)}, ..., e^{(k)}}. By our assumption (see assumption (b) at the beginning of this subsection), if e ∈ A_- intersects a hyperedge, then the intersection is the single vertex e_-. Therefore the operator Y_e^r leaves κ_e^{(n_e)} unchanged, which is very important in the following proofs. Define η̃ as in (3.11). Although the definition looks the same as in [11], the e ∈ E here may be a hyperedge, and in the case e ∈ E_h we have η̃_e(v) = -½|s||e| 1_{e↑}(v). In other words, for a hyperedge e = {v_1, ..., v_n} we attach a weight of absolute value |s|n/2 to the first common ancestor of v_1, ..., v_n. Lemma 3.11. The functions K̃^{(n)} satisfy the bound (3.12), uniformly in n ∈ N(T, ℓ).
Proof. Since the operator Y_e^r leaves κ_e^{(n_e)} unchanged, the functions K̃^{(n)} can be factored as in (3.13), and the last factor there is given by (3.14). We refer to [11, Lemma A.16] for the precise definitions of the notations ∂A_-, x|y, and Q_x^{k,e}(dy_e) appearing above, and only remark that K̃_1^{(n)} and the first product on the right-hand side of (3.13) can both be bounded, as in [11], by the right-hand side of (3.12) if η̃(v) were defined as in (3.11) with the sum over e ∈ E replaced by the sum over e ∈ E_2. By the multiplicative structure of the second factor, it remains to show the analogous bound for each e ∈ E_h, uniformly over x ∈ R^d. This follows immediately from Definitions 3.1 and 3.8. Note that the cutoff functions Ψ^{(m_{i,j})} in (3.7) impose that the tree (T̄, ℓ̄) over the |e| vertices of e induced from the tree (T, ℓ) has precisely e↑ as its root.
The proof of Theorem 3.7 is finished with the following lemma, which states that η̃ satisfies the two conditions of Lemma 3.10.
Proof. The proof of [11, Lemma A.19] applies to our situation essentially verbatim, so we just give a sketch here. Fix ν ∈ T•, and write L_ν ⊂ V for the set of leaves which are descendants of ν in T.
When one calculates Σ_{v≥ν} η̃(v), the result takes three different forms. If L_ν = e for some e with r_e < 0, then it takes the value |s| - a_e + r_e. Otherwise, the value depends on whether 0 ∈ L_ν or 0 ∉ L_ν: in the former case the value of the sum is given by the difference of the right-hand and left-hand sides of (A.2) of Assumption 3.5, while in the latter it is given by the difference of the right-hand and left-hand sides of (A.3) of Assumption 3.5; in both cases one takes V̄ = L_ν.
On the other hand, if ν ≤ ν_⋆ then Σ_{v≥ν} η̃(v) is given by the difference of the right-hand and left-hand sides of (A.4) with V̄ = V \ L_ν.

The elementary graphs
In this subsection we will show that Assumption 3.17, imposed on an "elementary graph", implies that Assumption 3.5 holds for every larger graph built from it as described below. The vertex set of an elementary graph H is decomposed into internal vertices H_in and external vertices H_ex, and we enforce that deg(v) = 1 for every v ∈ H_ex and deg(v) ≥ 2 for every v ∈ H_in. Any edge e with e ∩ H_ex ≠ ∅ will be called an "external edge". Edges which are not external edges are called "internal edges". The unique internal vertex connected to an external vertex v will be denoted i(v). We require that the distinguished vertex 0 belongs to H_in, and that for every external edge e one has a_e = |s|. We also enforce that for all edges e with |e| = 2 one has a_e < 2|s|.
We can construct graphs by Wick contracting several copies of H, similarly to [12], except that we now build hyper-edges over external vertices instead of identifying them. For a set D we denote by P(D) the set of partitions of D. Definition 3.14. Given a set A and an integer p > 1, let {A^(i)}_{i=1}^p be p copies of A and let D be their p-fold disjoint union, that is D = ⊔_{i=1}^p A^(i). For π ∈ P(D) we say that π ∈ P_w(D; A, p) ⊂ P(D) if for every B ∈ π one cannot find 1 ≤ i ≤ p such that B ⊂ A^(i). In other words, we enforce that every block of the partition π contains elements from at least two different copies of A; in particular, one must have |B| > 1.
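Definition 3.14 amounts to a one-line condition once elements of the disjoint union are tagged with their copy index. A minimal sketch (the function name `is_wick_admissible` is ours):

```python
def is_wick_admissible(partition):
    """Membership test for P_w(D; A, p) as in Definition 3.14.

    partition: iterable of blocks, each a set of tagged elements (i, a),
    where i is the index of the copy A^(i) the element comes from.
    Admissible iff no block is contained in a single copy."""
    return all(len({i for i, _ in block}) >= 2 for block in partition)
```

For p = 2 and A = {x, y}, the pairing {{x^(1), x^(2)}, {y^(1), y^(2)}} is admissible, while the partition whose blocks each stay inside one copy is not.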
Definition 3.15. Suppose that we are given an elementary graph H and an integer p > 1. For 1 ≤ i ≤ p let H^(i) be a copy of the graph H. Suppose that we are also given a partition π ∈ P_w(⊔_{i=1}^p H_ex^(i); H_ex, p). From H, p, and π we will construct a labeled graph G = (U, E) which will be called a p-fold Wick contraction of H.
To define the vertex set U we first start with ⊔_{i=1}^p H^(i) and then identify all p copies of the distinguished vertex 0. The edge set E(U) consists of the edges of the copies H^(i) together with the hyper-edges given by the blocks of π. As in the last section we have the decomposition E(U) = E_2(U) ⊔ E_h(U).
Each edge e ∈ E(U) is naturally associated with a label (a_e, r_e), which is (|s|, −1) if e ∈ E_c(U) ∩ E_2(U), or (|e||s|/2, 0) if e ∈ E_c(U) ∩ E_h(U), or otherwise inherits the label (a_e, r_e) from H. We also set U_0 := U \ {0} and let U_⋆ ⊂ U be given by the set ⊔_{i=1}^p H_⋆^(i) with all the copies of 0 identified.
While we have defined enough structure to formulate Assumption 3.5 for G, it turns out that this labeled graph does not quite satisfy that assumption in general: the second inequality will be violated whenever π has a block B = {u, v} of cardinality 2.
Pictorially one then has (3.15), where, as before, for an external vertex w ∈ H_ex^(i) we denote by i(w) the unique element of H_in^(i) which is connected to w by an edge of E_0(H^(i)). In this scenario a subset V̄ ⊂ {u, v, i(u), i(v)} with |V̄| > 2 is called a bad chain for G.
The outer two edges of (3.15) each carry a label (|s|, −1). While they are divergent by power counting, we expect them to be integrable since they represent approximate identities. The solution is to perform the integration over the vertices u and v before we perform our multiscale analysis. Pictorially, we replace (3.15) by a single edge which carries the label (|s|, −1).

Definition 3.16. Given a partition π ∈ P_w(⊔_{i=1}^p H_ex^(i); H_ex, p) we define a labeled graph Ḡ = (V, E(V)) which we call a reduced p-fold Wick contraction of H. Let G = (U, E(U)) denote the corresponding non-reduced p-fold Wick contraction as in Definition 3.15. Ḡ represents the reduced graph obtained after we have integrated out the variables in U_rem.^10 The vertex set is given by V = U \ U_rem, while in the edge set E(V) each chain through removed vertices is replaced by a single edge; all the other edges inherit their labeling from G.

^9 In practice one can always attach a new edge with label |s|, representing the Dirac function, to an external vertex to ensure this assumption holds.

EJP 22 (2017), paper 68.
Finally, we let V_0 = V \ {0} and V_⋆ = U_⋆, and also define a retraction map r : U → V associated to Ḡ. We now state our counterpart of Assumption 3.5 for the elementary graphs H. For various sets of edges we recall the notation (3.5).
Our goal is to show that Assumption 3.17 on an elementary graph H implies Assumption 3.5 for any reduced p-fold Wick contraction built from H. It is more convenient to instead prove a weaker version of Assumption 3.5 for the corresponding non-reduced Wick contraction. We have the following lemma, which is a straightforward consequence of our definitions.

^10 In plain words, U_rem is the set of vertices, like u and v in (3.15), which we want to remove from the vertex set.
Lemma 3.18. Let H be an elementary graph, G = (U, E(U)) be a p-fold Wick contraction of H, and Ḡ = (V, E(V)) be the corresponding reduced p-fold Wick contraction. Suppose that G satisfies items 1, 3, and 4 of Assumption 3.5 and that (A.2) holds for every subset Ū ⊂ U_0 of cardinality at least 3 which is not a bad chain. Then the graph Ḡ satisfies Assumption 3.5.
Proof. The fact that Ḡ satisfies item 1 is quite clear, so we focus on the other items.
The first key point is that, for any of the conditions 2, 3, and 4 of Assumption 3.5 and an appropriate V̄ ⊂ V, the difference between the left- and right-hand sides of the needed inequality remains the same if one replaces V̄, E(V̄), E_0(V̄), and E_↓(V̄) by Ū := r^{−1}(V̄), E(Ū), E_0(Ū), and E_↓(Ū), respectively: one must have |Ū \ V̄| = 2n for some n ≥ 0, and making this switch increases both sides by 2n|s|. The second point is that for |V̄| > 2 the set r^{−1}(V̄) will not be a bad chain.
We will denote the p copies of H by H^(1), …, H^(p) and write H̄^(i) := Ū ∩ H^(i). If Ū has connected components of size 1 then it suffices to check (A.2) for the smaller vertex set where one drops these components; the same holds for components of size 2 (here one uses the assumption a_e < 2|s| in Definition 3.13). If all the components of Ū have cardinality at least 3 and (A.2) holds for each of these components, then summing up these bounds yields an even stronger bound for Ū. This covers all the disconnected cases, so we may assume that Ū is connected.
The left-hand side of (A.2) is then given by (3.16). For passing to the bottom lines we use the following reasoning for each of the terms on the first line: (i) the sum over J_1 is obviously zero; (ii) since |Ū| ≥ 3 and Ū is assumed to be connected, the edges involved in the J_2-summation are all external with labels |s|; (iii) one can apply (H.2) to the sum over J_{≥3}; and (iv) one can bound the sum over e ∈ E_0(Ū) ∩ E_c(U) directly. Observe that if J_{≥3} ≠ ∅ then by (H.2) the inequality in (3.16) is actually strict. We now deal with the case where J_1 = J_2 = ∅ and |J_{≥3}| = n ≥ 1. If J_{≥3} = {i} then the last term on the first line of (3.16) must vanish, so the right-hand side is bounded above by |s|(|Ū| − 1), as desired. If n > 1 we must have H̄_ex^(i) ≠ ∅ for all i ∈ J_{≥3} and our claim then follows. Note that in all remaining cases one must have H̄_ex^(i) ≠ ∅. The case when |J_{≥3}| ≥ 1 and J_1 ∪ J_2 ≠ ∅ follows similarly to the two cases treated above: the upper bounds in (3.18) and (3.19) still apply if we increase them by (½|J_1| + (3/2)|J_2|) × |s|, but the quantity |s|(|Ū| − 1) we compare them against goes up by (|J_1| + 2|J_2|) × |s|.
Henceforth we assume J_{≥3} = ∅. Suppose that (|J_1|, |J_2|) = (1, 1) or (0, 2). Since Ū is not a bad chain, it must be the case that E_0(Ū) ∩ E_c(U) = ∅. Then by (3.16) one has Σ_{e ∈ E_0(Ū)} a_e ≤ |J_2| × |s|, which is strictly smaller than |s| × (|J_1| + 2|J_2| − 1). In the remaining scenarios, J_{≥3} = ∅ and (|J_1|, |J_2|) ≠ (1, 1), (0, 2); (A.2) then follows by a direct observation. We now turn to proving that (H.3) implies (A.3); the relevant computation involves the quantities (a_e + r_e − 1)_−, and for the last equality we note that 0 is neither internal nor external. We now show that (H.4) implies (A.4). Suppose Ū ⊂ U \ U_⋆; the claim follows using the bound on (r_e − 1)_−. We now state a lemma which is a partial converse to the above theorem in the case of symmetric pairings of elementary graphs. A symmetric pairing of H is a special type of Wick contraction: one has p = 2 and π = {{v^(1), v^(2)} : v ∈ H_ex}. Proof. Clearly H satisfies the first item of Assumption 3.17 as a consequence of (V, E) satisfying the first item of Assumption 3.5.
We claim that, given an appropriate H̄ ⊂ H, the needed criteria for the other case of item 2 (where H̄ ∩ H_ex = ∅) and for items 3 and 4 are equivalent to the corresponding items of Assumption 3.5 for the set V̄ ⊂ V; one then obtains Σ a_e < |s|(|V̄| − 1) = |s|(2|H_in| − 1), which is exactly the desired condition for H̄.
Remark 3.21. The above lemma establishes a type of "hypercontractive" or "equivalence of moments" bound in the non-Gaussian setting.
We state two lemmas and a remark before proceeding to the stochastic estimates. Lemma 3.22. Let H = (V, E) be an elementary graph satisfying Assumption 3.17. Assume that there exist internal edges e_1, e_2 ∈ E with a_{e_j} = |e_j||s|/2 and r_{e_j} ≤ 0 for j = 1, 2, and let H̄ be the graph with edge set Ē = (E \ {e_1, e_2}) ∪ {e}, where e = e_1 ∪ e_2 and a_e = |e||s|/2. In plain words, H̄ is formed by merging e_1 and e_2 into one hyper-edge e. Then H̄ also satisfies Assumption 3.17.
Proof. For items 1-3 of Assumption 3.17 and any allowable subset H̃ ⊂ V, the left-hand side of the corresponding bound for H̃ as a subgraph of H̄ is always smaller than or equal to the left-hand side for H̃ as a subgraph of H, while the right-hand side remains the same.
For item 4 and any allowable subset H̃ ⊂ V, the left-hand side of the bound for H̃ as a subgraph of H̄ is always larger than or equal to the left-hand side for H̃ as a subgraph of H, while the right-hand side again remains the same.
Therefore, if H satisfies Assumption 3.17, then after merging e_1 and e_2 into one hyper-edge the new graph also satisfies Assumption 3.17.

Lemma 3.23. For graphs H such that H_ex = ∅, Assumption 3.17 for H is equivalent to Assumption 3.5 for V = H.
Proof. Immediate upon comparing the two assumptions under the condition H_ex = ∅. Remark 3.24. When we do our stochastic estimates there are symbols τ and self-contractions π on τ such that the elementary graph H_{τ,π} has a bad chain and thus fails to satisfy Assumption 3.17. However, it is easy to see that the finite set of bad chains of H_{τ,π} can be eliminated by integrating out a noise vertex in each one, yielding a graph H̄_{τ,π} that does satisfy the assumption. Below, this is done implicitly whenever the scenario arises.

Application: Wong-Zakai theorem for non-Gaussian noise
We now apply the machinery of the preceding sections to prove Theorem 1.2.
Our maps M : T_0 → T_0 will be of the form M = exp(−Σ_{i=1}^7 ℓ_i L^(i)), where the L^(i) are nilpotent linear operators on T_0 given as follows. L^(1) is defined in the same way as the map called L in [10]: it iterates over all occurrences of the relevant symbol as a "subsymbol" of τ and "erases" it in the graphical notation. Proof. One can check that L^(i) L^(j) τ = 0 for all τ ∈ W_0 and any i, j; thus the operators L^(i) all commute and one actually has M = I − Σ_{i=1}^7 ℓ_i L^(i). Furthermore, since R is a group, it suffices to check, for 1 ≤ j ≤ 7, the upper-triangularity of ∆_{M_j} where M_j := e^{−ℓ_j L_j}. This can be checked by computation. The j = 1 case requires the most work, but the computations for this case are exactly the same as those found in [10, Sec. 4.2] for the operator called L.
For j ≥ 2 we observe that one has, for all τ ∈ T_0, M_j J_k(τ) = J_k(M_j τ) and ∆_{M_j} = M_j ⊗ Id. Clearly ∆_{M_j} is upper triangular.
We then define, for each ε ∈ (0, 1], a map M_ε ∈ R by specifying the constants in (4.3). Recall that the constants on the right-hand side are defined in (1.7).
We then construct the renormalized model (Π^(ε,ε̄), Γ^(ε,ε̄)) from ζ_{ε,ε̄} together with the renormalization maps M_{ε,ε̄}, with constants specified in the same way as in (4.3), except that we replace every P by our truncation K and every C_n by C_n^(ε). One may find that, for a fixed ε > 0, sending ε̄ to zero does not exactly recover the renormalization constants for (Π^(ε), Γ^(ε)); for instance, in the latter model the renormalization constants are defined via P. This does not matter, for two reasons: (i) we will only consider the situation where ε is much smaller than ε̄; (ii) when we bound the moments of (Π^(ε)_x τ)(ϕ^λ_x) below, we will actually replace the constants for Π^(ε) by the ones defined via K and C_n^(ε) here, with an error that goes to zero as ε → 0. More precisely, for the constants of the form ε^{−1}C and ε^{−1/2}C, by the exponential decay of C_n and the fact that P(z) = K(z) for |z| < 1, one can easily see by a scaling argument that this error is bounded by ε^θ with θ ∈ (0, 1); thus even though these errors are multiplied by "graphs" which may diverge as ε → 0, they still vanish in that limit. For the other constants one can also argue as in [10] that the error of this replacement vanishes as ε → 0. Proposition 4.2. Let (Π^(ε), Γ^(ε)) and (Π^(ε,ε̄), Γ^(ε,ε̄)) be defined as above. There exist κ, η > 0 such that, for every symbol τ in the collection treated below and every p > 0, the bounds (4.4) and (4.5) hold uniformly in all ε̄ ∈ (0, 1], all ε ∈ (0, 1] sufficiently small depending on ε̄, all λ ∈ (0, 1], all test functions ϕ ∈ B, and all x ∈ R².
We also draw an arrow with label (2, 0) for the kernel K' = ∂_x K and for the test function (t, x) ↦ xϕ^λ(t, x). Note that ϕ̃(t, x) := xϕ(t, x) is again an admissible test function and one has xϕ^λ(t, x) = λϕ̃^λ(t, x). As a consequence, whenever a test function ϕ is replaced with the test function ϕ̃, one gains an additional power of λ.
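The identity xϕ^λ = λϕ̃^λ can be verified numerically. The sketch below assumes the standard parabolic rescaling ϕ^λ(t, x) = λ^{−3} ϕ(t/λ², x/λ) with |s| = 3 (not restated in the text above, so treat it as an assumption); the bump function and names are illustrative.

```python
import math

def rescale(phi, lam):
    # parabolic rescaling phi^lambda with |s| = 3 and scaling (2, 1)
    return lambda t, x: lam ** -3 * phi(t / lam ** 2, x / lam)

phi = lambda t, x: math.exp(-(t ** 2 + x ** 2))  # a smooth stand-in test function
phi_tilde = lambda t, x: x * phi(t, x)           # phi~(t, x) = x * phi(t, x)

lam, t, x = 0.37, 0.01, -0.02
lhs = x * rescale(phi, lam)(t, x)                # x * phi^lambda(t, x)
rhs = lam * rescale(phi_tilde, lam)(t, x)        # lambda * (phi~)^lambda(t, x)
assert abs(lhs - rhs) < 1e-12
```

Unwinding the definitions, both sides equal λ^{−3} x ϕ(t/λ², x/λ), which is the extra factor of λ referred to above.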

The symbols , , and
We start with the simplest object after the noise. By translation invariance we may take the point x in (4.5) to be 0. Using Definition A.3, which allows us to represent a product of the noises as Wick products, one has (4.6), where E_ε here and below is an error term which arises from the difference between ε^{−1}C and C_ε and goes to zero. Now if we compare the above graphs with the corresponding ones in [10, Eq. (5.2)] in the Gaussian noise case, we realize that they are essentially the same graphs. Since in [10] it is checked that the symmetric pairing of the first graph, as well as the last graph above, satisfy Assumption 3.5, our Lemma 3.20 and Lemma 3.23 immediately yield Assumption 3.17 for the two graphs on the right-hand side of (4.6), and therefore by Theorem 3.19 one concludes the desired bounds on (Π^(ε)_0 τ)(ϕ^λ_0) for this symbol.
The other two symbols are bounded in the same way. In these cases, the graphs appearing in the expansions are the same as those in the Gaussian case. However, for other symbols below there will in general be new terms due to the nontrivial higher cumulants of our non-Gaussian noise (we sometimes refer to these simply as "new terms"). The discussion here shows that we only need to treat these new terms.

The symbol
Besides the terms appearing in [10], we have the following new terms in the expression for (Π^(ε)_0 τ)(ϕ^λ_0). It is straightforward to check that Assumption 3.17 is satisfied for the two graphs on the right-hand side, which yields the desired moment bounds.

The symbol
We now turn to the next symbol. In this case, besides the terms shown in [10], we have the following new terms in the expression for (Π^(ε)_0 τ)(ϕ^λ_0). One can check that Assumption 3.17 holds for these two graphs as well.

The symbol
The new terms for (Π^(ε)_0 τ)(ϕ^λ_0) are shown below. The sum of the last two terms can be written as a sum of 11 graphs (expanding the terms represented by the barred arrows yields 12 terms, and the renormalization constant cancels one of them); each of them has a fourth-order hyper-edge which can be split into two edges. After this split all the graphs satisfy Assumption 3.5, as checked in [10], so they satisfy Assumption 3.17 by Lemmas 3.23 and 3.22.
After cancellations with the renormalization constant we only need to check the assumption for the following graphs, understood as functions of z, with label (3½, −1). It can be checked that these eight graphs all satisfy the conditions in Assumption 3.17.

The symbol
The new terms are shown below. For the last two terms, after cancellation with C_{3,ε}, there are seven terms, each having a hyper-edge that can be split into two edges representing a cumulant of the top and bottom noises and a cumulant of the left and right noises. These terms all satisfy Assumption 3.5, as checked in [10]. So we are left with the graphs labeled by (3½, −1). It can be checked that these seven graphs all satisfy the conditions in Assumption 3.17.

The symbol
The new terms are shown below. The sum of the 2nd and the 3rd terms is −6 [graph] + 3 [graph]. The sum of the last two terms is −3 [graph] + 3 [graph] − [graph]. Therefore, for this symbol, six graphs remain to be checked. Indeed, they all satisfy Assumption 3.17.

The symbol
The new terms are shown below. The sum of the first two terms is 2 times [graph]. The 4th term is equal to (noting that the heat kernels in [graph] annihilate constants)

−2 [graph] + [graph].

Finally, the sum of the last two terms is [graphs]. The first two graphs above have a fourth-order hyper-edge; by splitting it into two edges and using Lemma 3.22, they become the graphs checked in [10].
Therefore, for this symbol, there are still nine graphs to be checked. It is straightforward (though a bit tedious) to check that they all satisfy Assumption 3.17.
Proof of Proposition 4.2. Collecting all the results of this section, we obtain the weaker bound (4.5). The bound (4.4) follows in essentially the same way, by the argument in [12] (see also the verification of the second bound in [7, Theorem 10.7]). Indeed, as we consider the difference between Π^(ε)_0 τ and Π^(ε,ε̄)_0 τ for any τ ≠ Ξ, we obtain the same graphs as above, and in each graph some of the instances of δ are replaced by ϱ_ε̄ and exactly one instance is replaced by δ − ϱ_ε̄. By the bound ‖δ − ϱ_ε̄‖_{−3−κ} ≲ ε̄^κ, one obtains the same bound as (4.5) with an extra factor ε̄^κ, which is exactly what is required for (4.4). The case τ = Ξ can also be proved as in [12].

Proof of the main theorem
It was shown in [10] that if we replace ζ_ε by ξ_ε with ξ_ε = ϱ_ε * ξ, where ξ is Gaussian space-time white noise, the renormalised models built from ξ_ε converge to a limit Ẑ = (Π̂, Γ̂), called the Itô model, which satisfies the following property. For every τ ∈ U and every (t, x), the process s ↦ (Π̂_{(t,x)} τ)(s, ·) is F_s-adapted for s > t and, for every smooth test function ϕ supported in the future {(s, y) : s > t}, one has an identity in which the integral on the right-hand side is an Itô integral.
The goal of this section is to show that our renormalised models (Π^(ε), Γ^(ε)) built from ζ_ε defined above converge to the same limit. We prove this by applying a "diagonal argument" as in [12]. Let Ẑ_ε denote the renormalised model built from ζ_ε (recall (1.7)), and let Ẑ = (Π̂, Γ̂) be the Itô random model. Then, as ε → 0, one has Ẑ_ε → Ẑ in distribution in the space M of admissible models for T.
It therefore remains to bound the second term. This follows from the same argument as in [12]. Firstly, one has a central limit theorem for the noise ζ_ε: for every α < −3/2 the field ζ_ε converges in law to space-time white noise ξ in C^α(R × S¹); see [12, Prop. 6.1]. Therefore lim_{ε→0} E‖ζ_{0,ε̄} − ζ_{ε,ε̄}‖^p_{C^1;K} = 0 for any bounded domain K. Also, the map from the space of stationary, almost surely periodic noises (equipped with the L^p norm of the C^1 norm) to the space of random admissible models is continuous. Therefore one has lim_{ε→0} E|||Ẑ_{ε,ε̄}; Ẑ_{0,ε̄}||| = 0 for every fixed (sufficiently small) ε̄ > 0, so that the second term in (4.9) also vanishes, concluding the proof.
We finally collect all our results and prove the main theorem.

A Cumulants and Wick products
We define the joint cumulant functions C_n(z_1, …, z_n) of a centered space-time random field ζ. Given any finite subset A of space-time points we define a joint cumulant C(A) of ζ. The definition operates recursively in |A|: one sets (A.1), where P(A) denotes the set of all partitions of A. The key cumulant identity comes from moving all the cumulants of (A.1) to the left-hand side.
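The recursion in (A.1) can be made concrete for a single random variable, where moments m_n = E[X^n] are expressed as a sum over partitions of {1, …, n} of products of cumulants, and κ_n is solved for recursively. A sketch (the function names are ours):

```python
from itertools import combinations

def set_partitions(elems):
    """Enumerate all partitions of a list of distinct elements."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for k in range(len(rest) + 1):
        for combo in combinations(rest, k):
            remaining = [e for e in rest if e not in combo]
            for part in set_partitions(remaining):
                yield [(first,) + combo] + part

def cumulant(n, m):
    """kappa_n from moments m (dict k -> E[X^k]), via the recursion (A.1):
    m_n = sum over partitions pi of prod over blocks B of kappa_{|B|}."""
    total = 0
    for part in set_partitions(list(range(n))):
        if len(part) > 1:  # the one-block partition contributes kappa_n itself
            prod = 1
            for b in part:
                prod *= cumulant(len(b), m)
            total += prod
    return m[n] - total
```

For a standard Gaussian (moments 0, 1, 0, 3, …) this recovers κ_2 = 1 and κ_3 = κ_4 = 0, reflecting the fact that a field is Gaussian precisely when all cumulants beyond the second vanish.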
We give an example of a random field ζ satisfying the above property.
We now give the definition of the Wick product of a generically non-Gaussian random field evaluated at a collection of points. Note that :∏_{z∈B} ζ(z): is understood as an operation on the collection {ζ(z)}_{z∈B}, not as a function of the product itself.
The second line of (A.4) follows from the first by applying (A.1). Note that any non-empty Wick product always has mean zero.
The key Wick product identity comes from moving all the subtracted terms on the right-hand side of the second line of (A.4) to the first line of the right-hand side.
We close with a lemma, sometimes called the "diagram formula" in the literature^11, which generalizes (A.1). It states that moments of Wick products can be calculated by summing over partitions without "self-contractions"; here M := {i : 1 ≤ i ≤ m}, P := {k : 1 ≤ k ≤ p}, and P_M(M × P) denotes the set of all partitions π ∈ P(M × P) with the property that for every B ∈ π there exist (i, k), (i', k') ∈ B such that k ≠ k' (in particular |B| > 1).
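The diagram formula can be checked directly in a toy setting: specializing to Wick powers of a single random variable X, the expectation of a product of Wick powers is a sum over partitions in which every block meets at least two distinct Wick factors, each block weighted by the cumulant of its size. A sketch (names `wick_moment`, `cum` are ours):

```python
from itertools import combinations

def set_partitions(elems):
    """Enumerate all partitions of a list of distinct elements."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for k in range(len(rest) + 1):
        for combo in combinations(rest, k):
            remaining = [e for e in rest if e not in combo]
            for part in set_partitions(remaining):
                yield [(first,) + combo] + part

def wick_moment(sizes, cum):
    """E[:X^{sizes[0]}: ... :X^{sizes[-1]}:] given cumulants cum[k] = kappa_k.

    Elements are pairs (i, k): the i-th factor inside the k-th Wick product."""
    elems = [(i, k) for k, n in enumerate(sizes) for i in range(n)]
    total = 0
    for part in set_partitions(elems):
        # "no self-contractions": every block meets at least two Wick factors
        if all(len({k for _, k in b}) >= 2 for b in part):
            weight = 1
            for b in part:
                weight *= cum.get(len(b), 0)
            total += weight
    return total
```

For a standard Gaussian (only κ_2 = 1 nonzero) this reproduces the familiar orthogonality E[:X^m: :X^n:] = n! δ_{mn}, since only perfect matchings across the two factors survive.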