Convergence of the critical finite-range contact process to super-Brownian motion above the upper critical dimension: I. The higher-point functions

We consider the critical spread-out contact process in Z^d with d\ge1, whose infection range is denoted by L\ge1. In this paper, we investigate the r-point function \tau_{\vec t}^{(r)}(\vec x) for r\ge3, which is the probability that, for all i=1,...,r-1, the individual located at x_i\in Z^d is infected at time t_i by the individual at the origin o\in Z^d at time 0. Together with the results of the 2-point function in [van der Hofstad and Sakai, Electron. J. Probab. 9 (2004), 710-769; arXiv:math/0402049], on which our proofs crucially rely, we prove that the r-point functions converge to the moment measures of the canonical measure of super-Brownian motion above the upper-critical dimension 4. We also prove partial results for d\le4 in a local mean-field setting.

1 Introduction and results

Introduction
The contact process is a model for the spread of an infection among individuals in the d-dimensional integer lattice Z d . Suppose that the origin o ∈ Z d is the only infected individual at time 0, and assume for now that every infected individual may infect a healthy individual at a distance less than L ≥ 1. We refer to this type of model as the spread-out contact process. The rate of infection is denoted by λ, and it is well known that there is a phase transition in λ at a critical value λ c ∈ (0, ∞) (see, e.g., [21]).
In the previous paper [15], and following the idea of [22], we proved the 2-point function results for the contact process for d > 4 via a time discretization, as well as a partial extension to d ≤ 4. The discretized contact process is a version of oriented percolation in Z d × εZ + , where ε ∈ (0, 1] is the time unit. The proof is based on the strategy for ordinary oriented percolation (ε = 1), i.e., on the application of the lace expansion and an adaptation of the inductive method so as to deal with the time discretization.
In this paper, we use the 2-point function results in [15] as a key ingredient to show that, for any r ≥ 3, the r-point functions of the critical contact process for d > 4 converge to those of the canonical measure of super-Brownian motion, as was proved in [19] for ordinary oriented percolation. We follow the strategy in [19] to analyze the lace expansion, but derive an expansion which is different from the expansion used in [19]. The lace expansion used in this paper is closely related to the expansion in [14] for the oriented-percolation survival probability. The latter was used in [13] to show that the probability that the oriented-percolation cluster survives up to time n decays proportionally to 1/n. Due to this close relation, we can reprove an identity relating the constants arising in the scaling limit of the 3-point function and the survival probability, as was stated in [12,Theorem 1.5] for oriented percolation.
The main contributions of this paper, in comparison to other papers on the topic, are the following:

1. Our proof yields a simplification of the expansion argument, which remains inherently difficult but has been simplified as much as possible, making use of, and extending, the combined insights of [9,14,15,19].

2. The expansion for the higher-point functions yields expansion coefficients similar to those for the survival probability in [14], thus making the investigation of the contact-process survival probability more efficient and allowing for a direct comparison of the various constants arising in the 2- and 3-point functions and the survival probability. Such a comparison was proved for oriented percolation in [12, Theorem 1.5]; on the basis of the expansion in [18], it was not directly possible.

3. The extension of the results to certain local mean-field limit type results in low dimensions, as was initiated in [5] and taken up again in [15].

4. A simplified argument for the continuum limit of the discretized model, which was performed in [15] through an intricate weak-convergence argument, and which in the current paper is replaced by a soft argument based on subsequential limits and the uniformity of our bounds.
The investigation of the contact-process survival probability is deferred to Part II of this paper [17], in which we also discuss the implications of our results for the convergence of the critical spread-out contact process towards super-Brownian motion, in the sense of convergence of finite-dimensional distributions [20]. See also [11] and [24] for more expository discussions of the various results for oriented percolation and the contact process for d > 4, and [25] for a detailed discussion of the applications of the lace expansion.

Main results
We define the spread-out contact process as follows. Let C_t ⊆ Z^d be the set of infected individuals at time t ∈ R_+, and let C_0 = {o}. An infected site x recovers in a small time interval [t, t + ε] with probability ε + o(ε), independently of t, where o(ε) is a function that satisfies lim_{ε↓0} o(ε)/ε = 0; in other words, x ∈ C_t recovers at rate 1. A healthy site x becomes infected, depending on the status of its neighbouring sites, at rate λ Σ_{y∈C_t} D(x − y), where λ ≥ 0 is the infection rate. We denote the associated probability measure by P^λ. We assume that the function D : Z^d → [0, 1] is a probability distribution which is symmetric with respect to the lattice symmetries. Further assumptions on D involve a parameter L ≥ 1, which serves to spread out the infections and will be taken to be large. In particular, we require that D(o) = 0 and ‖D‖_∞ ≡ sup_{x∈Z^d} D(x) ≤ C L^{−d}. Moreover, with σ defined by

σ² = Σ_{x∈Z^d} |x|² D(x),  (1.1)

where |·| denotes the Euclidean norm on R^d, we require that C_1 L ≤ σ ≤ C_2 L and that there exists a ∆ > 0 such that

Σ_{x∈Z^d} |x|^{2+2∆} D(x) ≤ C L^{2+2∆}.  (1.2)

See [15, Section 5] for the precise assumptions on D. A simple example of D is

D(x) = 1_{{0 < ‖x‖_∞ ≤ L}} / ((2L+1)^d − 1),

which is the uniform distribution on the cube of radius L.
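The assumptions on D can be checked concretely. The following is a minimal numerical sketch (the function names are ours, for illustration only): it builds the uniform distribution on the cube of radius L and verifies D(o) = 0, normalization, the sup-bound, and that σ is of order L.

```python
import itertools

def make_uniform_D(L, d):
    # Uniform kernel: D(x) = 1/((2L+1)^d - 1) for 0 < ||x||_inf <= L, D(o) = 0.
    support = [x for x in itertools.product(range(-L, L + 1), repeat=d)
               if any(c != 0 for c in x)]
    return {x: 1.0 / len(support) for x in support}

def sigma(D):
    # sigma^2 = sum_x |x|^2 D(x), with |.| the Euclidean norm.
    return sum(sum(c * c for c in x) * p for x, p in D.items()) ** 0.5

L, d = 3, 2
D = make_uniform_D(L, d)
assert abs(sum(D.values()) - 1.0) < 1e-12                        # probability distribution
assert (0, 0) not in D                                           # D(o) = 0
assert max(D.values()) <= 1.0 / ((2 * L + 1) ** d - 1) + 1e-12   # ||D||_inf <= C L^{-d}
assert 0.5 * L <= sigma(D) <= 2 * L                              # C1 L <= sigma <= C2 L
```

For the uniform kernel with L = 3 and d = 2, one finds σ² = 392/48 ≈ 8.17, so σ ≈ 2.86 is indeed comparable to L.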

Previous results for the 2-point function
We first state the results for the 2-point function proved in [15]; those results will be crucial for the current paper. In the statements, σ is defined in (1.1) and ∆ in (1.2). Besides the high-dimensional setting d > 4, we also consider a low-dimensional setting, i.e., d ≤ 4. In this case, the contact process is not believed to be in the mean-field regime, and Gaussian asymptotics are thus not expected to hold as long as L remains finite. However, following the rescaling of Durrett and Perkins in [5], we have proved Gaussian asymptotics when range and time grow simultaneously [15]. We suppose that the infection range grows as

L_T = L_1 T^b,  (1.7)

where L_1 ≥ 1 is the initial infection range, T ≥ 1 and b > 0. We denote by σ²_T the variance of D = D_T in this situation. The error estimate in (1.9) is uniform in k ∈ R^d with |k|²/log(2 + t) sufficiently small.

(ii) Let d ≤ 4, δ ∈ (0, 1 ∧ ∆ ∧ α) and L_1 ≫ 1. There exist λ_T = 1 + O(T^{−µ}) for some µ ∈ (0, α − δ) and C_1, C_2 ∈ (0, ∞) (depending only on d) such that, for every 0 < t ≤ log T,

(1/τ^{λ_T}_{Tt}(0)) Σ_x |x|² τ^{λ_T}_{Tt}(x) = σ²_T T t [1 + O(T^{−µ}) + O((1 + Tt)^{−δ})],  (1.14)

with the error estimate in (1.13) uniform in k ∈ R^d with |k|²/log(2 + Tt) sufficiently small.
In the rest of the paper, we will always work at the critical value, i.e., we take λ = λ_c for d > 4 and λ = λ_T as in Theorem 1.1(ii) for d ≤ 4. We will often omit the λ-dependence and write τ^{(r)}_{\vec t}(\vec x) = τ^λ_{\vec t}(\vec x) to emphasize the number of arguments of the r-point function. While τ^{λ_c}_t(x) tells us what paths in a critical cluster look like, the critical r-point functions give us information about the branching structure of critical clusters. Our goal in this paper is to prove that the suitably scaled critical r-point functions converge to those of the canonical measure of super-Brownian motion (SBM).

The r-point function for r ≥ 3
To state the result for the r-point function for r ≥ 3, we begin by describing the Fourier transforms of the moment measures of SBM. These are most easily defined recursively, and will serve as the limits of the r-point functions. We define

M̂^{(1)}_t(k) = e^{−|k|² t/(2d)},  k ∈ R^d, t ∈ R_+,  (1.16)

and define recursively, for l ≥ 2,

M̂^{(l)}_{\vec t}(\vec k) = ∫_0^{\underline t} dt M̂^{(1)}_t(k_1 + ··· + k_l) Σ_{∅ ≠ I ⊆ J_1} M̂^{(|I|)}_{\vec t_I − t}(\vec k_I) M̂^{(l−|I|)}_{\vec t_{J∖I} − t}(\vec k_{J∖I}),  (1.17)

where J = {1, ..., l}, J_1 = J ∖ {1}, \underline t = min_i t_i, \vec t_I is the vector consisting of the t_i with i ∈ I, and \vec t_I − t is subtraction of t from each component of \vec t_I. The quantity M̂^{(l)}_{\vec t}(\vec k) is the Fourier transform of the l-th moment measure of the canonical measure of SBM (see [19, Sections 1.2.3 and 2.3] for more details on the moment measures of SBM).
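The recursion is easy to evaluate numerically. The sketch below (function names ours) assumes the recursion takes the form used in [19], with the branch time integrated over [0, min_i t_i]; it evaluates M̂^{(l)} by the midpoint rule and checks the elementary identity M̂^{(2)}_{(t,t)}(0, 0) = t.

```python
import math
from itertools import combinations

d = 4  # illustrative choice of the dimension entering M^(1)

def M1(t, k):
    # \hat M^(1)_t(k) = exp(-|k|^2 t / (2d))
    return math.exp(-sum(c * c for c in k) * t / (2 * d))

def M(ts, ks, steps=400):
    # Recursive evaluation of \hat M^(l), l = len(ts), integrating the
    # branch time over [0, min_i t_i] by the midpoint rule.
    l = len(ts)
    if l == 1:
        return M1(ts[0], ks[0])
    tmin, ksum = min(ts), tuple(map(sum, zip(*ks)))
    def integrand(s):
        total = 0.0
        for size in range(1, l):                     # nonempty I subset of J_1
            for I in combinations(range(1, l), size):
                rest = tuple(i for i in range(l) if i not in I)
                total += (M(tuple(ts[i] - s for i in I), tuple(ks[i] for i in I), steps)
                          * M(tuple(ts[i] - s for i in rest), tuple(ks[i] for i in rest), steps))
        return M1(s, ksum) * total
    h = tmin / steps
    return h * sum(integrand((j + 0.5) * h) for j in range(steps))

# Sanity check: at \vec k = 0, \hat M^(2)_{(t,t)}(0,0) = t.
assert abs(M((1.0, 1.0), ((0.0,), (0.0,))) - 1.0) < 1e-9
```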
The following is the result for the r-point function for r ≥ 3, linking the critical contact process and the canonical measure of SBM:

Theorem 1.2 (Convergence of r-point functions to SBM moment measures). (i) Let d > 4, δ ∈ (0, 1 ∧ ∆ ∧ (d−4)/2) and L ≫ 1. There exist finite and positive constants A, V and v (depending on d and L) such that, for every r ≥ 2, k ∈ R^{d(r−1)}, t ∈ (0, ∞)^{r−1} and as T ↑ ∞,

τ̂^{(r)}_{\vec t T}(\vec k/√(vσ²T)) = A (A²VT)^{r−2} [M̂^{(r−1)}_{\vec t}(\vec k) + O(T^{−δ})],  (1.18)

uniformly in k ∈ R^{d(r−1)} with Σ_{i=1}^{r−1} |k_i|² bounded.

(ii) Let d ≤ 4, k ∈ R^{d(r−1)}, t ∈ (0, ∞)^{r−1} and let δ, L_1, λ_T, µ be the same as in Theorem 1.1(ii). For every r ≥ 2, 0 < max_i s_i ≤ log T and as T ↑ ∞, (1.19) holds uniformly in k ∈ R^{d(r−1)} with Σ_{i=1}^{r−1} |k_i|² bounded.
Since the statements for r = 2 in Theorem 1.2 follow from Theorem 1.1, we only need to prove Theorem 1.2 for r ≥ 3. As described in more detail in Part II [17], Theorems 1.1-1.2 can be rephrased to say that, under their hypotheses, the moment measures of the rescaled critical contact process converge to those of the canonical measure of SBM. The consequences of this result for the convergence of the critical contact process towards SBM are deferred to [17]. Theorem 1.2 will be proved using the lace expansion, which perturbs the r-point functions for the critical contact process around those for critical branching random walk. To derive the lace expansion, we use a time discretization. The time-discretized contact process has a parameter ε ∈ (0, 1]. The boundary case ε = 1 corresponds to ordinary oriented percolation, while the limit ε ↓ 0 yields the contact process. We will prove Theorem 1.2 for the time-discretized contact process and prove that the error terms are uniform in the discretization parameter ε. As a consequence, we reprove Theorem 1.2 for oriented percolation; the first proof of Theorem 1.2 for oriented percolation appeared in [19]. In [3,4], spread-out oriented percolation is investigated in the setting where the finite-variance condition (1.2) fails. It was shown there that, for certain infinite-variance step distributions D in the domain of attraction of an α-stable distribution with α ∈ (0, 2) and d > 2α, the Fourier transform of the two-point function converges to that of an α-stable random variable. We conjecture that, in this case, the r-point functions satisfy a limiting result similar to (1.18), with the argument of the r-point function in (1.18) replaced by \vec k/(vT^{1/α}) for some v > 0, and with the limit given by the moment measures of a superprocess whose motion is α-stable and whose branching has finite variance (in the terminology of [6, Definition 1.33, p. 22], this is the (α, d, 1)-superprocess, and SBM corresponds to α = 2). These limiting moment measures should satisfy (1.17), but with (1.16) replaced by e^{−|k|^α t}, which is the Fourier transform of an α-stable motion at time t.

Organization
The paper is organized as follows. In Section 2, we describe the time discretization, state the results for the time-discretized contact process and give an outline of the proof. In this outline, the proof of Theorem 1.2 is reduced to Propositions 2.2 and 2.4. In Proposition 2.2, we state the bounds on the expansion coefficients arising in the expansion for the r-point function. In Proposition 2.4, we state and prove that the sum of these coefficients converges, when appropriately scaled, as ε ↓ 0. The rest of the paper is devoted to the proof of Propositions 2.2 and 2.4. In Sections 3-4, we derive the lace expansion for the r-point function, thus identifying the lace-expansion coefficients. In Sections 5-7, we prove the bounds on these coefficients, and thus prove Proposition 2.2.

Outline of the proof
In this section, we give an outline of the proof of Theorem 1.2, and reduce this proof to Propositions 2.2 and 2.4. This section is organized as follows. In Section 2.1, we describe the time-discretized contact process. In Section 2.2, we outline the lace expansion for the r-point functions and state the bounds on the coefficients in Proposition 2.2. In Section 2.4, we prove Theorem 1.2 for the time-discretized contact process subject to Proposition 2.2. Finally, in Section 2.5, we prove Proposition 2.4, and complete the proof of Theorem 1.2 for the contact process.

Discretization
In this section, we introduce the discretized contact process, which is an interpolation between oriented percolation on the one hand, and the contact process on the other. This section contains the same material as [15,Section 2.1]. We shall also use the notation N = {1, 2, . . .}, Z + = {0}∪ N and R + = [0, ∞).
The contact process can be constructed using a graphical representation as follows. We consider Z d × R + as space-time. Along each time line {x} × R + , we place points according to a Poisson process with intensity 1, independently of the other time lines. For each ordered pair of distinct time lines from {x} × R + to {y} × R + , we place directed bonds ((x, t), (y, t)), t ≥ 0, according to a Poisson process with intensity λ D(y − x), independently of the other Poisson processes. A site (x, s) is said to be connected to (y, t) if either (x, s) = (y, t) or there is a non-zero path in Z d × R + from (x, s) to (y, t) using the Poisson bonds and time line segments traversed in the increasing time direction without traversing the Poisson points. The law of {C t } t∈R + defined in Section 1.2 is equal to that of {x ∈ Z d : (o, 0) is connected to (x, t)} t∈R + .
We follow [22] and consider an oriented percolation process in Z^d × εZ_+, with ε ∈ (0, 1] a discretization parameter, as follows. A pair b = ((x, t), (y, t + ε)) of sites in Z^d × εZ_+ is called a bond; in particular, b is said to be temporal if x = y, and spatial otherwise. Each bond is either occupied or vacant independently of the other bonds, and a bond b = ((x, t), (y, t + ε)) is occupied with probability p_ε(y − x), where

p_ε(x) = (1 − ε) δ_{o,x} + λ ε D(x),

provided that sup_x p_ε(x) ≤ 1. We denote the associated probability measure by P^λ_ε. It has been proved in [2] that P^λ_ε converges weakly to P^λ as ε ↓ 0. See Figure 1 for a graphical representation of the contact process and the discretized contact process. As explained in more detail in Section 2.2, we prove our main results by proving the results first for the discretized contact process, and then taking the continuum limit ε ↓ 0.

Figure 1: Graphical representation of the contact process and the discretized contact process.
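The discretized dynamics are straightforward to simulate. The sketch below (all names and parameter values ours, purely illustrative and not at the critical point) assumes bond probabilities consistent with p̂^λ_ε(k) = 1 − ε + λεD̂(k): each infected site keeps itself alive through its temporal bond with probability 1 − ε, and attempts infections through spatial bonds with probability λεD(y − x).

```python
import itertools
import random

def make_uniform_D(L, d):
    # Uniform spread-out kernel on the cube of radius L, with D(o) = 0.
    pts = [x for x in itertools.product(range(-L, L + 1), repeat=d)
           if any(c != 0 for c in x)]
    return {x: 1.0 / len(pts) for x in pts}

def step(infected, D, lam, eps, rng):
    # One eps-step: temporal bond occupied w.p. 1 - eps,
    # spatial bond ((x,t),(y,t+eps)) occupied w.p. lam * eps * D(y - x).
    nxt = set()
    for x in infected:
        if rng.random() < 1.0 - eps:
            nxt.add(x)
        for dx, p in D.items():
            if rng.random() < lam * eps * p:
                nxt.add(tuple(a + b for a, b in zip(x, dx)))
    return nxt

rng = random.Random(0)
D = make_uniform_D(L=2, d=2)
C = {(0, 0)}                          # C_0 = {o}
for _ in range(20):                   # evolve up to time t = 2 with eps = 0.1
    C = step(C, D, lam=1.2, eps=0.1, rng=rng)
```

Setting ε = 1 recovers ordinary oriented percolation, in line with the remark below on the boundary case.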
We denote by (x, s) −→ (y, t) the event that (x, s) is connected to (y, t), i.e., either (x, s) = (y, t) or there is a non-zero path in Z^d × εZ_+ from (x, s) to (y, t) consisting of occupied bonds. The r-point functions, for r ≥ 2, \vec t = (t_1, ..., t_{r−1}) ∈ εZ_+^{r−1} and \vec x = (x_1, ..., x_{r−1}) ∈ Z^{d(r−1)}, are defined as

τ^{(r)}_{\vec t;ε}(\vec x) = P^λ_ε( ⋂_{i=1}^{r−1} {(o, 0) −→ (x_i, t_i)} ).

Similarly to (1.6), the discretized contact process has a critical value λ_c^{(ε)}. The discretization procedure will be essential in order to derive the lace expansion for the r-point functions for r ≥ 3, as it was for the 2-point function in [15]. Note that for ε = 1 the discretized contact process is simply oriented percolation. Our main result for the discretized contact process is the following theorem, similar to Theorem 1.2:

Theorem 2.1 (Convergence of time-discretized r-point functions to SBM moment measures).
For oriented percolation for which ε = 1, Theorem 2.1(i) reproves [18,Theorem 1.2]. The uniformity in ε in Theorem 2.1 is crucial in order for the continuum limit ε ↓ 0 to be performed, and to extend the results to the contact process.

Overview of the expansion for the higher-point functions
In this section, we give an introduction to the expansion methods of Sections 3-4. For this, it will be convenient to introduce new notation for sites in Z^d × εZ_+. We write Λ = Z^d × εZ_+, and we write a typical element of Λ as x rather than (x, t) as was used until now. We fix λ = λ_c^{(ε)} throughout Section 2.2 for simplicity, though the discussion also applies without change when λ < λ_c^{(ε)}. We begin by discussing the underlying philosophy of the expansion, which is identical to the one described in [19, Section 2.2.1].
As explained in more detail in [15], the basic picture underlying the expansion for the 2-point function is that a cluster connecting o and x can be viewed as a string of sausages. In this picture, the strings joining sausages are the occupied pivotal bonds for the connection from o to x. Pivotal bonds are the essential bonds for the connection from o to x, in the sense that each occupied path from o to x must use all the pivotal bonds. Naturally, these pivotal bonds are ordered in time. Each sausage corresponds to an occupied cluster from the endpoint of a pivotal bond, containing the starting point of the next pivotal bond. Moreover, a sausage consists of two parts: the backbone, which is the set of sites that are along occupied paths from the top of the lower pivotal bond to the bottom of the upper pivotal bond, and the hairs, which are the parts of the cluster that are not connected to the bottom of the upper pivotal bond. The backbone may consist of a single site, but may also consist of sites on at least two bond-disjoint connections. We say that both these cases correspond to double connections. We now extend this picture to the higher-point functions.
For connections from the origin to multiple points x = (x 1 , . . . , x r−1 ), the corresponding picture is a "tree of sausages" as depicted in Figure 2. In the tree of sausages, the strings represent the union over i = 1, . . . , r − 1 of the occupied pivotal bonds for the connections o −→ x i , and the sausages are again parts of the cluster between successive pivotal bonds. Some of them may be pivotal for {o −→ x j ∀j ∈ J}, while others are pivotal only for {o −→ x j } for some j ∈ J.
We regard this picture as corresponding to a kind of branching random walk. In this correspondence, the steps of the walk are the pivotal bonds, while the sites of the walk are the backbones between subsequent pivotal bonds. Of course, the pivotal bonds introduce an avoidance interaction on the branching random walk. Indeed, the sausages are not allowed to share sites with the later backbones (since otherwise the pivotal bonds in between would not be pivotal).
When d > 4, or when d ≤ 4 and the range of the contact process is sufficiently large as described in (1.7)-(1.8), the interaction is weak; in particular, the different parts of the backbone in between different pivotal bonds are small, and the steps of the walk are effectively independent. Thus, we can think of the higher-point functions of the critical time-discretized contact process as "small perturbations" of the higher-point functions of critical branching random walk. We now use this picture to give an informal overview of the expansions derived in Sections 3-4.

Figure 2: (a) A configuration for the discretized contact process. Both • and △ denote occupied temporal bonds; • is connected from o, while △ is not. The arrows are occupied spatial bonds, representing the spread of an infection to neighbours. (b) Schematic depiction of the configuration as a "string of sausages."

We start by introducing some notation. For r ≥ 3, let J = {1, ..., r − 1} and, for I = {i_1, ..., i_s} ⊆ J, write \vec x_I = (x_{i_1}, ..., x_{i_s}) and \vec x_I − y = (x_{i_1} − y, ..., x_{i_s} − y), and abuse notation by writing τ(\vec x_I) for the (s+1)-point function with these arguments. From the sausage at the origin, there may be anywhere from zero to r − 1 pivotal bonds for {o −→ \vec x_J} emerging, where we write o =⇒ x for the event that there are at least two bond-disjoint occupied paths from o to x, or o = x. (2.10) Configurations with zero or more than two pivotal bonds will turn out to constitute an error term. Indeed, when there are zero pivotal bonds, then o =⇒ x_i for some i, which constitutes an error term. When there are more than two pivotal bonds, the sausage at the origin has at least three disjoint connections to different x_i's, which also turns out to constitute an error term. Therefore, we are left with configurations that have one or two branches emerging from the sausage at the origin. When there is one branch, this branch contains \vec x_J. When there are two branches, one branch contains \vec x_I for some nonempty I ⊆ J_1 and the other branch contains \vec x_{J∖I}, where we require 1 ∈ J ∖ I to make the identification unique. The first expansion deals with the case where there is a single branch from the origin.
It serves to decouple the interaction between that single branch and the branches of the tree of sausages leading to \vec x_J. The expansion writes τ(\vec x_J) in the form

τ(\vec x_J) = A(\vec x_J) + (B ⋆ τ)(\vec x_J),  (2.11)

where (f ⋆ g)(x) represents the space-time convolution of two functions f, g : Λ → R, given by

(f ⋆ g)(x) = Σ_{y∈Λ} f(y) g(x − y).  (2.12)

For details, see Section 3, where (2.11) is derived. We have that

B(x) = (π ⋆ p_ε)(x),  (2.13)

where π(x) is the expansion coefficient for the 2-point function as derived in [15, Section 3]. Moreover, for r = 2,

A(x) = π(x),  (2.14)

so that (2.11) becomes

τ(x) = π(x) + (π ⋆ p_ε ⋆ τ)(x).  (2.15)

This is the lace expansion for the 2-point function, which serves as the key ingredient in the analysis of the 2-point function in [15]. The next step is to write A(\vec x_J) as

A(\vec x_J) = Σ_{I⊆J_1} Σ_{y_1∈Λ} B(y_1, \vec x_I) τ(\vec x_{J∖I} − y_1),  (2.16)

where, to leading order, J ∖ I consists of those j for which the first pivotal bond for x_j is the same as the one for x_1, while for i ∈ I this first pivotal bond is different. The equality (2.16) is the result of the first expansion for A(\vec x_J). In this expansion, we wish to treat the connections from the top of the first pivotal bond to \vec x_{J∖I} as being independent of the connections from o to \vec x_I that do not use the first pivotal bond. In the second expansion for A(\vec x_J), we wish to extract a factor τ(\vec x_I − y_2), for some y_2, from the connection from o to \vec x_I that is still present in B(y_1, \vec x_I). This leads to a result of the form

A(\vec x_J) = Σ_{I⊆J_1} Σ_{y_1,y_2∈Λ} τ(\vec x_{J∖I} − y_1) C(y_1, y_2) τ(\vec x_I − y_2) + Σ_{I⊆J_1} a(\vec x_{J∖I}, \vec x_I),  (2.17)

where a(\vec x_{J∖I}, \vec x_I) is an error term and, to first approximation, C(y_1, y_2) represents the sausage at o together with the pivotal bonds ending at y_1 and y_2, with the two branches removed. In particular, C(y_1, y_2) is independent of I. The leading contribution to C(y_1, y_2) is p_ε(y_1) p_ε(y_2) with y_1 = y_2, corresponding to the case where the sausage at o is the single point o. For details, see Section 4, where (2.17) is derived. We use a new expansion for the higher-point functions, which is a simplification of the expansion for oriented percolation in Z^d × Z_+ in [19]. The difference resides mainly in the second expansion, i.e., the expansion of A(\vec x_J).
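The space-time convolution ⋆ is elementary to realize on finitely supported functions. The following sketch (the dict representation is ours) encodes f : Λ → R as a dictionary keyed by space-time points and checks that the Kronecker delta at the origin is the unit for ⋆.

```python
from collections import defaultdict

def conv(f, g):
    # (f * g)(x) = sum_{y in Lambda} f(y) g(x - y), for finitely supported f, g.
    out = defaultdict(float)
    for y, fy in f.items():
        for z, gz in g.items():
            out[tuple(a + b for a, b in zip(y, z))] += fy * gz
    return dict(out)

delta = {(0, 0.0): 1.0}               # Kronecker delta at the space-time origin
f = {(0, 0.5): 0.3, (1, 0.5): 0.7}
assert conv(delta, f) == f            # delta is the unit for convolution
assert conv(f, delta) == f
```

With this representation, an identity such as τ = A + B ⋆ τ can be checked term by term on finitely supported toy data.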
In the course of the expansion, in Section 4.4, we shall also describe a close relation between the expansion coefficients for the r-point functions derived in this paper and those for the survival probability of the discretized contact process derived in [14]. See [17] for a more detailed discussion of this relation and its consequences.

The main identity and estimates
In this section, we solve the recursion (2.11) by iteration, so that on the right-hand side no r-point function appears. Instead, only s-point functions with s < r appear, which opens up the possibility for an inductive analysis in r. The argument in this section is virtually identical to the argument in [18, Section 2.3], and we add it to make the paper self-contained.
We define

ν(x) = Σ_{n=0}^∞ B^{⋆n}(x),  (2.18)

where B^{⋆n} denotes the n-fold space-time convolution of B with itself, with B^{⋆0}(x) = δ_{o,x}. The sum over n in (2.18) terminates after finitely many terms, since by definition B((x, t)) ≠ 0 only if t ∈ εN, so that in particular B((x, 0)) = 0. Therefore, B^{⋆n}(x) = 0 if n > t_x/ε, where, for x = (x, t) ∈ Λ, t_x = t denotes the time coordinate of x. Then (2.11) can be solved to give

τ(\vec x_J) = (ν ⋆ A)(\vec x_J).  (2.19)

The function ν can be identified as follows. We note that (2.19) for r = 2 yields that

τ(x) = (ν ⋆ π)(x).  (2.20)

Thus, extracting the n = 0 term from (2.18) and using (2.14) to write one factor of B as A ⋆ p_ε (cf. (2.13)) for the terms with n ≥ 1, it follows from (2.20) that

ν(x) = δ_{o,x} + (τ ⋆ p_ε)(x).  (2.21)

Substituting (2.21) into (2.19), the solution to (2.11) is then given by (2.22), where a(\vec x_J) = a(\vec x_J; 1) + ···, and where we recall that r_1 = |J ∖ I| + 1 and r_2 = |I| + 1 and write the superscripts of the r-point functions explicitly. Since 1 ≤ |I| ≤ r − 2, we have that r_1, r_2 ≤ r − 1, which opens up the possibility for induction in r.
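Because B is supported on times t ∈ εN, the series ν = Σ_{n≥0} B^{⋆n} is a finite sum once restricted to times ≤ t. The toy sketch below (one spatial dimension; the kernel B and all names are ours) makes the termination explicit.

```python
from collections import defaultdict

def conv(f, g):
    # Space-time convolution (f * g)(x) = sum_y f(y) g(x - y).
    out = defaultdict(float)
    for y, fy in f.items():
        for z, gz in g.items():
            out[(y[0] + z[0], y[1] + z[1])] += fy * gz
    return dict(out)

def nu(B, t_max, eps):
    # nu = sum_{n >= 0} B^{*n}; since B((x,t)) is nonzero only for t in eps*N,
    # B^{*n} vanishes for n > t_max/eps and the sum terminates.
    total = {(0, 0.0): 1.0}           # n = 0 term: the Kronecker delta
    power = dict(total)
    for _ in range(int(round(t_max / eps))):
        power = {x: v for x, v in conv(power, B).items() if x[1] <= t_max + 1e-9}
        for x, v in power.items():
            total[x] = total.get(x, 0.0) + v
    return total

B = {(0, 0.5): 0.5}                   # toy one-step kernel, supported at t = eps
n = nu(B, t_max=1.0, eps=0.5)
assert n[(0, 0.0)] == 1.0 and n[(0, 0.5)] == 0.5 and n[(0, 1.0)] == 0.25
```

For this toy B, the geometric structure ν = δ + B + B⋆B + ··· is visible directly in the returned dictionary.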
The first term on the right-hand side of (2.26) is the main term. The leading contribution to ψ(y_1, y_2) is obtained using the leading contribution to C described below (2.17). Here, we write ψ_{s_1,s_2}(y_1, y_2) for ψ((y_1, s_1), (y_2, s_2)). We will analyse (2.26) using the Fourier transform. For brevity, we write \vec t = (t_1, ..., t_{r−1}) and \vec k = (k_1, ..., k_{r−1}). For I ⊂ {1, 2, ..., r − 1}, we also write \vec k_I = (k_i)_{i∈I}, k_I = Σ_{i∈I} k_i and k = Σ_{i=1}^{r−1} k_i. For I ⊆ J, we further write \underline t_I = min_{i∈I} t_i and \underline t = \underline t_J. With this notation, the Fourier transform of (2.26) becomes (2.28), where Σ•_{t≤s≤t′} is an abbreviation for Σ_{s∈[t,t′]∩εZ_+}. The identity (2.28) is our main identity and will be our point of departure for analysing the r-point functions for r ≥ 3. Apart from ψ and ζ^{(r)}, the right-hand side of (2.26) involves the s-point functions with s = 2, r_1, r_2. As discussed below (2.26), we can use an inductive analysis, with the r = 2 case given by the result of Theorem 1.1 proved in [15]. The term involving ψ is the main term, whereas ζ^{(r)} will turn out to be an error term.
The analysis will be based on the following important proposition, whose proof is deferred to Sections 5-7. In its statement, we use the notation b^{(ε)}_{s_1,s_2} introduced in (2.29). We note that the number of powers of ε is precisely such that, for d > 4, the relevant sums over s_1 and s_2 are bounded uniformly in ε. We also rely on the notation β = L^{−d}; for d ≤ 4, we write β_T = L_T^{−d}. Then, the main bounds on the lace-expansion coefficients are as follows:

Proposition 2.2 (Bounds on the lace-expansion coefficients). The lace-expansion coefficients satisfy the following properties:
(i) Let d > 4, δ ∈ (0, 1 ∧ ∆ ∧ (d−4)/2) and λ = λ_c^{(ε)}. Let t̄ denote the second-largest element of {t_1, ..., t_{r−1}}. There exist C_ψ, C^{(r)}_ζ > 0 (independent of L) and L_0 = L_0(d) such that, for all L ≥ L_0, q ∈ {0, 2}, s_i ≥ 0, \vec t, r ≥ 3 and k_i ∈ [−π, π]^d, and uniformly in ε ∈ (0, 1], the bounds (2.34)-(2.35) hold.
(ii) Let d ≤ 4, and let δ, L_1, λ_T and µ ∈ (0, α − δ) be as in Theorem 1.1(ii). There exist C_ψ, C^{(r)}_ζ > 0 (independent of L) and L_0 = L_0(d) such that, for L_1 ≥ L_0 with L_T defined as in (1.7), r ≥ 2, 0 < s ≤ log T, as T ↑ ∞, and uniformly in ε ∈ (0, 1], the bounds (2.36)-(2.37) hold.

The quantity V^{(ε)}, defined in (2.38) at λ = λ_c^{(ε)}, is finite uniformly in ε > 0. The constant V of Theorem 1.2 should then be given by lim_{ε↓0} V^{(ε)}; in Proposition 2.4 below, we will prove the existence of this limit. Since ψ̂ satisfies (2.27), it follows from Proposition 2.2 that, uniformly in ε > 0, V^{(ε)} = 2 − ε + O(β). This establishes the claim on V of Theorem 1.2(i). For d ≤ 4, on the other hand, β = β_T converges to zero as T ↑ ∞, so that V^{(ε)} is replaced by 2 − ε in Theorem 2.1(ii).

Induction in r
In this section, we prove Theorem 2.1 for fixed ε ∈ (0, 1], assuming (2.28) and Proposition 2.2. We fix λ = λ_c^{(ε)} throughout this section. The argument is an adaptation of that in [18, Section 2.4], modified so as to deal with the uniformity in the time discretization. In particular, in this section we prove Theorem 2.1 for oriented percolation, for which ε = 1.
We start by giving the proof for d > 4. Let t̄ denote the second-largest element of {t_1, ..., t_{r−1}}. We will prove that, for d > 4, there are positive constants L_0 = L_0(d) and C = C(d) such that (2.40) holds uniformly in t ≥ t̄ and in \vec k ∈ R^{(r−1)d} with Σ_{i=1}^{r−1} |k_i|² bounded, and uniformly in ε > 0. Since the M̂^{(r−1)}_{\vec t}(\vec k) are smooth functions of \vec t (cf. [19, (2.51)]), proving this is sufficient to prove Theorem 2.1(i).
We will prove (2.40) by induction on r, with the base case r = 2 given by Theorem 2.1(i) for r = 2. Indeed, Theorem 2.1(i) for r = 2 gives (2.41), using the facts that |k|² is bounded, t_1 ≤ t, and κ < (d−4)/2.

Proof of Theorem 2.1(i) assuming Proposition 2.2. Let r ≥ 3. The proof is by induction on r, with the induction hypothesis that (2.40) holds for τ^{(s)} with 2 ≤ s < r. We have seen in (2.41) that (2.40) does hold for r = 2. The induction is advanced using (2.28). By (2.35), φ̂^{(r)}_{\vec t}(\vec k) is an error term. Thus, we are left to determine the asymptotic behaviour of the first term on the right-hand side of (2.28).
Fix \vec k with Σ_{i=1}^{r−1} |k_i|² bounded. To abbreviate the notation, we write \vec k^{(t)} = \vec k/√(v^{(ε)} σ² t). Recall the notation t = min{t_1, ..., t_{r−1}}. Given 0 ≤ s_0 ≤ t, let t_0 = min{s_0, t − s_0}. We will show that, for every nonempty subset I ⊂ J_1, the bounds (2.42)-(2.43) hold. Using the fact that κ < 1, the summation in the error term can be seen to be bounded by a multiple of t̄^{1−κ} ≤ t^{1−κ}. With the induction hypothesis and the identity r_1 + r_2 = r + 1, (2.43) then implies (2.44), where the error arising from the error terms in the induction hypothesis again contributes an amount of the same order. The summation on the right-hand side of (2.44), divided by t, is a Riemann sum approximation to an integral; the error in approximating the integral by this Riemann sum is O(εt^{−1}). Therefore, using (1.17), we obtain (2.45). Since t ≥ t̄, it follows that t^{r−2−κ} ≤ C t^{r−2} (t̄ + 1)^{−κ}. Thus, it suffices to establish (2.42).
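The O(ε) discretization error of such Riemann sums (before division by t) can be observed numerically. A toy check with f(s) = e^{−s} (setup ours, purely illustrative): halving ε should roughly halve the approximation error.

```python
import math

def riemann(f, t, eps):
    # Left-endpoint Riemann sum: eps * sum over s in [0, t) cap eps*Z_+ of f(s).
    n = int(round(t / eps))
    return eps * sum(f(j * eps) for j in range(n))

t = 2.0
exact = 1.0 - math.exp(-t)            # integral of e^{-s} over [0, t]
err = lambda eps: abs(riemann(math.exp.__call__ if False else (lambda s: math.exp(-s)), t, eps) - exact)
ratio = err(0.1) / err(0.05)          # halving eps should roughly halve the error
assert 1.8 < ratio < 2.2
```

The first-order error term is ε(f(0) − f(t))/2, which is consistent with the observed ratio close to 2.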
To prove (2.42), we write the quantity inside the absolute value signs on its left-hand side as a sum of terms T_i, each of which we bound separately. To complete the proof, it suffices to show that, for each nonempty I ⊂ J_1, the absolute value of each T_i is bounded above by the right-hand side of (2.42).
In the course of the proof, we will make use of some bounds on sums involving b^{(ε)}_{s_1,s_2}:

Lemma 2.3. Let κ > 0 and fix ᾱ ∈ (0, α); recall that β_T = β_1 T^{−bd} and let β̂_T = β_1 T^{−ᾱ}. There exists a constant C = C(κ, d) such that the following bounds hold uniformly in ε ∈ (0, 1]: (2.50) for d > 4 and (2.51) for d ≤ 4.

Proof. (i) This is straightforward from (2.29), when we pay special attention to the number of powers of ε present in b^{(ε)}_{s_1,s_2} and use the fact that the powers of (1 + s_1) and of (1 + s_2 − s_1) are (d − 2)/2 > 1.
(ii) We shall only perform the proof for d ≤ 4 with d ≠ 2, the proof for d = 2 being a slight modification of the argument below. Using (2.29), we can perform the sum to obtain (2.51), as long as ᾱ ∈ (0, α). Using that β̂_T converges to 0 as T ↑ ∞, this proves (2.51).
By the induction hypothesis and the fact that t̄_{I_i} ≤ t, it follows that |τ̂^{(r_i)}_{\vec t_{I_i}}(\vec k^{(t)}_{I_i})| = O(t^{r_i − 2}), uniformly in \vec t_{I_i} and \vec k_{I_i}. Therefore, it follows from (2.34) and the definition of V^{(ε)} in (2.38) that (2.52) holds, where the final bound follows from the second bound in (2.50). Similarly, by (2.34) with q = 2, now using the first bound in (2.50), we obtain (2.53). It remains to prove (2.55). To begin the proof of (2.55), we first examine the domain of summation over s_1. Therefore, |T_3| is bounded by (2.56). We expand the product of (2.57) and (2.58); this gives four terms, one of which is cancelled by the τ̂^{(r_1)} in (2.56). Three terms remain, each of which contains at least one factor from the second terms in (2.57)-(2.58). In each term we retain one such factor and bound the other factor by a power of t, and we estimate ψ̂ using (2.34). This gives a bound for the j = 0 contribution to (2.56) equal to the sum of (2.59) plus a similar term with J ∖ I replaced by I. By the induction hypothesis, the difference of r_1-point functions in (2.59) is equal to (2.60). Using (1.17), the difference in (2.60) can be seen to be at most O(s_1 t^{−1}). Therefore, (2.59) is bounded above, using (2.50), by (2.61).

The proof of Theorem 2.1(ii) is similar, now using Proposition 2.2(ii) instead of Proposition 2.2(i) and Lemma 2.3(ii) instead of Lemma 2.3(i). For d ≤ 4, we will prove that there are positive constants L_0 = L_0(d) and C = C(d) such that, for λ_T and µ as in Theorem 1.1(ii), L_1 ≥ L_0, with L_T defined as in (1.7), and δ ∈ (0, 1 ∧ ∆ ∧ α), the bound (2.62) holds. We will again prove (2.62) by induction on r, with the base case r = 2 given by Theorem 2.1(ii) for r = 2. This part is a straightforward adaptation of the argument in (2.41), and is omitted.
We now advance the induction hypothesis. By (2.28) and (2.37), by Lemma 2.3(ii) together with the fact that β̂_T = β_1 T^{−µ} and the tree-graph inequality, and by (2.27), we obtain the required bounds. The remainder of the argument can now be completed as in (2.43)-(2.45), using the induction hypothesis in (2.62) instead of the one in (2.40).

The continuum limit
In this section we state the results concerning the continuum limit ε ↓ 0. The proof will crucially rely on the convergence of A (ε) , V (ε) and v (ε) as ε ↓ 0. The convergence of A (ε) and v (ε) was proved in [15, Proposition 2.6], so we are left to study V (ε) . When 1 ≤ d ≤ 4, the roles of A (ε) , V (ε) and v (ε) are taken by A (ε) = 1, V (ε) = 2 − ε and v (ε) = 1, so there is nothing to prove. Thus, we are left to study the convergence of V (ε) as ε ↓ 0 for d > 4.

Proposition 2.4 (Continuum limit of V (ε) ). For d > 4, there exists a finite and positive constant V such that lim ε↓0 V (ε) = V. (2.68)
Before proving Proposition 2.4, we first complete the proof of Theorem 1.2.
Proof of Theorem 1.2. We start by proving Theorem 1.2(i). We first claim that lim ε↓0 τ̂ λ (ε) c t;ε ( k) exists. For this, the argument in [15, Section 2.5] can easily be adapted from the 2-point function to the higher-point functions.
We use the convergence of τ̂ λ (ε) c t;ε ( k), together with Theorem 2.1(i) and the uniformity in ε ∈ (0, 1] of the error term in (2.4), to obtain (2.69), where we have made use of the convergence of v (ε) to v, and the fact that k → M̂ (r−1) t ( k) is continuous. This proves (1.18).
The proof of Theorem 1.2(ii) is similar, where on the right-hand side of (2.69) we need to replace A, A (ε) , v and v (ε) by 1, V (ε) by 2 − ε, V by 2 and δ by µ ∧ δ.
Proof of Proposition 2.4. The proof of the continuum limit is substantially different from the proof used in [15], where, among other things, it was shown that A (ε) and v (ε) converge as ε ↓ 0. The main idea behind the argument in this paper also applies to the convergence of A (ε) and v (ε) , as we first show. This simpler argument leads to an alternative proof of the convergence of A (ε) and v (ε) .
For this proof, we use [15, Proposition 2.1], which gives a bound that holds uniformly in ε ∈ (0, 1]. The uniformity of the error term can be reformulated accordingly, and we therefore obtain a bound that holds uniformly in ε ∈ (0, 1] and t ≥ 0. Now we take the limit ε ↓ 0, and use that, as proved in [15, Section 2.4], the limit lim ε↓0 τ̂ λc t;ε (0) exists. (2.74) Since A (ε) = 1 + O(β), uniformly in ε ∈ (0, 1], we see from (2.71) that t → τ̂ λc t;ε (0) is bounded. Therefore, we conclude that also τ̂ λc t (0) is uniformly bounded in t ≥ 0. Therefore, there exists a subsequence of times {t l } ∞ l=1 satisfying t l → ∞ such that τ̂ λc t l (0) converges as l → ∞. Denote the limit of τ̂ λc t l (0) by A. Then we obtain from (2.72) and (2.74) the claimed convergence. This completes the proof of convergence of A (ε) . A similar proof can also be used to show that the limit lim ε↓0 v (ε) = v exists. On the other hand, the proof in [15] was based on the explicit formula for A (ε) , in which p̂ λ ε (k) = 1 − ε + λεD̂(k), and on the fact that ε −2 π̂ λ (ε) c s;ε (0) converges as ε ↓ 0 for every s > 0. That proof was much more involved, but also allowed us to give a formula for A in terms of the pointwise limits of ε −2 π̂ λ (ε) c s;ε (0). For the convergence of V (ε) , we adapt the above simple argument proving convergence of A (ε) . We use (2.4) for r = 3, t = (t, t) and k = 0, rewritten in such a way that the error term satisfies γ ε (t) = O((t + 1) −δ ) uniformly in ε. We next let ε ↓ 0, and use that the relevant limits both exist. The above bounds are true for any t. Moreover, by the tree-graph inequality and the fact that τ̂ (2) s (0) is uniformly bounded in s by K, say, we obtain a bound that is uniform in t. Therefore, there exists a subsequence {t l } ∞ l=1 with lim l→∞ t l = ∞ such that the limit exists. Then, using that γ(t) = o(1) as t → ∞, we come to the conclusion that lim ε↓0 V (ε) exists. This completes the proof of Proposition 2.4.
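The mechanism in the first step, the convergence of the discretized transform p̂ λε (k) = 1 − ε + λεD̂(k) under ε ↓ 0, can be illustrated numerically. The sketch below is illustrative only: it uses the random-walk analogue of τ̂ with a uniform spread-out D in d = 1, and parameter values of our own choosing; it shows that p̂ λε (k)^{t/ε} approaches exp(−t(1 − λD̂(k))) as ε ↓ 0.

```python
import math

def D_hat(k, L=3):
    """Fourier transform of D, here taken uniform on {-L,...,L}\\{0} in d = 1
    (an illustrative choice of spread-out step distribution)."""
    return sum(math.cos(j * k) for j in range(1, L + 1)) / L

def discretized_transform(k, t, eps, lam=1.0, L=3):
    """p_hat(k)^(t/eps) with p_hat(k) = 1 - eps + lam*eps*D_hat(k): the
    transform of the (t/eps)-fold convolution, the random-walk analogue
    of the discretized two-point function."""
    p_hat = 1 - eps + lam * eps * D_hat(k, L)
    return p_hat ** round(t / eps)

def continuum_limit(k, t, lam=1.0, L=3):
    """The eps -> 0 limit exp(-t(1 - lam*D_hat(k)))."""
    return math.exp(-t * (1 - lam * D_hat(k, L)))
```

For instance, the gap `abs(discretized_transform(0.5, 2.0, eps) - continuum_limit(0.5, 2.0))` shrinks roughly linearly in ε, matching the uniformity of the error term used above.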

Linear expansion for the r-point function
In this section, we derive the expansion (2.11) which extracts an explicit r-point function τ ( x J − v), and an unexpanded contribution A( x J ). In Section 4, we investigate A( x J ) using two expansions. The first of these expansions extracts a factor τ ( x J\I − y 1 ) from A( x J ) in (2.17), and the second expansion extracts a factor τ ( x I − y 2 ) from A( x J ). This will lead to (2.16)-(2.17).
From now on, we suppress the dependence on λ and ε when no confusion can arise. The r-point function is defined in (3.1), where we recall the notation (2.8) and (2.10). Rather than expanding (3.1), we expand a generalized version of the r-point function defined below.
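For intuition, the r-point function can be approximated by direct Monte Carlo simulation of the discretized contact process, i.e., oriented percolation on Z × εZ + with temporal bonds occupied with probability 1 − ε and spatial bonds with probability λεD (cf. the discretization of [15]). The sketch below is illustrative only: the one-dimensional lattice, the finite width, and the parameters (L, λ, ε) are our own choices, not values from the paper.

```python
import random

def simulate_infected(L=3, lam=1.5, eps=0.5, steps=8, width=30, rng=None):
    """One realization of the discretized contact process started from the
    origin: returns the set of infected sites after each time step.
    Temporal bonds ((x,t),(x,t+eps)) are occupied with prob 1-eps; spatial
    bonds ((x,t),(y,t+eps)) with prob lam*eps*D(y-x), D uniform on
    {-L,...,L}\\{0} (cf. p_hat = 1 - eps + lam*eps*D_hat)."""
    rng = rng or random.Random()
    p_spat = lam * eps / (2 * L)          # lam*eps*D(dx) for each of 2L neighbours
    infected = {0}
    history = [set(infected)]
    for _ in range(steps):
        nxt = set()
        for x in infected:
            if rng.random() < 1 - eps:    # temporal bond
                nxt.add(x)
            for dx in range(-L, L + 1):   # spatial bonds
                if dx != 0 and rng.random() < p_spat:
                    nxt.add(x + dx)
        infected = {x for x in nxt if abs(x) <= width}  # finite-box truncation
        history.append(set(infected))
    return history

def three_point_mc(x1, t1, x2, t2, n=500, **kw):
    """Monte Carlo estimate of the 3-point function: the probability that
    x1 is infected at step t1 AND x2 is infected at step t2."""
    hits = 0
    for i in range(n):
        h = simulate_infected(rng=random.Random(i), **kw)
        if x1 in h[t1] and x2 in h[t2]:
            hits += 1
    return hits / n
```

For example, `three_point_mc(1, 2, -1, 2)` estimates the probability that sites 1 and −1 are both infected at time step 2.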
Figure: the vertices at the top of the right figure are the components of x J .
Definition 3.1 (Connections through C). Given a configuration and a set of sites C, we say that y is connected to x through C, if every occupied path from y to x has at least one bond with an endpoint in C. This event is written as y C −→ x. Below, we derive an expansion for P(v C −→ x J ). This is more general than an expansion for the r-point function itself. Thus, to obtain the linear expansion for the r-point function, we need to specialize to v = o and C = {o}. Before starting with the expansion, we introduce some further notation.
Definition 3.2 (Clusters and pivotal bonds). Let C(x) = {y ∈ Λ : x −→ y} denote the forward cluster of x ∈ Λ. Given a bond b, we define C̃ b (x) ⊆ C(x) to be the set of sites to which x is connected in the (possibly modified) configuration in which b is made vacant. We say that b is pivotal for x −→ y, if x is connected to y in the possibly modified configuration in which the bond is made occupied, whereas x is not connected to y in the possibly modified configuration in which the bond is made vacant.
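Definition 3.2 can be made concrete on a finite bond configuration. The sketch below is a toy implementation (the encoding of bonds as oriented pairs is ours, not the paper's): it computes the forward cluster C(x) by breadth-first search, the modified cluster C̃ b (x), and tests pivotality of a bond by comparing the configurations with b occupied and b vacant.

```python
from collections import deque

def forward_cluster(x, bonds):
    """C(x): sites reachable from x along occupied oriented bonds.
    `bonds` is a set of pairs (u, v), each pointing one time step forward."""
    adj = {}
    for u, v in bonds:
        adj.setdefault(u, []).append(v)
    cluster, queue = {x}, deque([x])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in cluster:
                cluster.add(v)
                queue.append(v)
    return cluster

def cluster_without(x, bonds, b):
    """C-tilde^b(x): sites reachable from x when bond b is made vacant."""
    return forward_cluster(x, bonds - {b})

def is_pivotal(b, x, y, bonds):
    """b is pivotal for x -> y: the configuration with b occupied connects
    x to y, while the configuration with b vacant does not."""
    return (y in forward_cluster(x, bonds | {b})
            and y not in forward_cluster(x, bonds - {b}))
```

Sites here are (position, time) pairs, so that every bond points forward in time, as in the oriented-percolation setting of the paper.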
Remark (Clusters as collections of bonds). We shall also often view C(x) and C̃ b (x) as collections of bonds, and abuse notation to write, for a bond a, that a ∈ C(x) (resp. a ∈ C̃ b (x)) when a ∈ C(x) and a is occupied (resp. a ∈ C̃ b (x) and a is occupied).
We now start the first step of the expansion. For a bond b = (x, y), we write b = x and b̄ = y for its bottom and top. The event {v C −→ x J } can be decomposed into two disjoint events, depending on whether there is or is not a pivotal bond for the connection. See Figure 3 for schematic representations of E ′ (v, x; C) and E ′ (v, x J ; C).
If there are such pivotal bonds, then we take the first bond among them. This leads to the partition in (3.7). For the second term in (3.7), we will use a Factorization Lemma (see [10], and, in particular, [14, Lemma 2.2]). To state that lemma below, we first introduce some notation in (3.8). We adopt the convenient convention that {x −→ x in C} occurs if and only if x ∈ C.
We will often omit "occurs" and simply write {E in C}. For example, we define the restricted r-point function in (3.9), where we emphasize that, by the convention below (3.8), τ C (v, x J ) = 0 when v ∈ C. Note that, by Definition 3.1, (3.10) holds. A nice property of the notion of occurring "in" is its compatibility with operations in set theory (see [10, Lemma 2.3]). The statement of the Factorization Lemma is in terms of two independent percolation configurations. The laws of these independent configurations are indicated by subscripts, i.e., E 0 denotes the expectation with respect to the first percolation configuration, and E 1 denotes the expectation with respect to the second percolation configuration. We also use the same subscripts for random variables, to indicate which law describes their distribution. Thus, the law of X 0 is described by E 0 . Lemma 3.5 (Factorization Lemma). Given a site w ∈ Λ, fix λ ≥ 0 such that C(w) is almost surely finite. For a bond b and events E, F determined by the occupation status of bonds with time variables less than or equal to t for some t < ∞, the factorization (3.12) holds. We now apply this lemma to the second term in (3.7). First, we note that, as required in Lemma 3.5, the occupation status of b is independent of the other two events in (3.13). Therefore, when we abbreviate p b = p ε (b̄ − b) (recall (2.9)) and make use of (3.9)-(3.10) as well as (3.12), we obtain (3.15). On the right-hand side of (3.15), again a generalised r-point function appears, which allows us to iterate the expansion by substituting the expansion for P(b̄ C̃ b (v) −−→ x J ) into the right-hand side of (3.15). In order to simplify the expressions arising in the expansion, we first introduce some useful notation in (3.16), for a (random or deterministic) variable X. Note that, by this notation, (3.17) holds. This completes the first step of the expansion. Let us take stock of what we have achieved so far.
In (3.18), we see that the generalized r-point function P(v C −→ x J ) is written as the sum of A (0) (v, x J ; C), a term which is a convolution of the expansion coefficient B (0) (v, y; C) with an ordinary r-point function τ ( x J − y), and a remainder term. The remainder term again involves a generalized r-point function. Thus, we can iterate the above procedure until no more generalized r-point functions are present. This will prove (2.11).
In order to facilitate this iteration, and to expand the right-hand side of (3.18) further, we first introduce some more notation. For N ≥ 1, we define (3.19), where the superscript n of M (n) denotes the number of nested expectations involved, and, for n ≥ 0, (3.20). We now resume the expansion of the right-hand side of (3.18). Note that P(v C −→ x J ) appears again in the right-hand side of (3.18), but now with v and C replaced by b̄ and C̃ b (v), respectively. Applying (3.18) to its own right-hand side, we obtain (3.21). By repeated application of (3.18) to (3.21) until the remainder vanishes (which happens after a finite number of iterations, see below (3.20)), we arrive at the following conclusion, which is the linear expansion for the generalised r-point function. Applying Proposition 3.6 to the r-point function in (3.3), we arrive at the linear expansion, where we abbreviate A( x J ) as in (3.24), and similarly for B. In the remainder of this paper, we will specialise to the case where v = o and C = {o}, and abbreviate as in (3.25). This completes the proof of (2.11). In the next section, we will use Proposition 3.6 for a general set C in order to obtain the expansion for A( x J ).
For future reference, we state a convenient recursion formula, which follows immediately from the second representation in (3.19).

Expansion for A( x J )
We now consider A( x J ) in (3.24). Our goal is to extract two factors τ ( x J\I − y 1 ) and τ ( x I − y 2 ) from A( x J ), for some I ⊊ J with I ≠ ∅ and some y 1 , y 2 ∈ Λ. Let r 1 = |J \ I| + 1 and r 2 = |I| + 1. We devote Section 4.1 to the extraction of the first r 1 -point function τ ( x J\I − y 1 ), and Section 4.2 to the extraction of the second r 2 -point function τ ( x I − y 2 ).

First cutting bond and decomposition of A (N) ( x J )
First, we recall (3.17) and, by the recursive definition (3.19) for N ≥ 1, we obtain (4.1), where the subscripts indicate which probability measure describes the distribution of which cluster. For example, the cluster arising in (3.19) is random with respect to one measure, but is deterministic for P N . Therefore, to obtain an expansion for A (N) ( x J ), it suffices to investigate P(E ′ (v, x J ; C)) for given v ∈ Λ and C ⊂ Λ. In this section, we shall extract an r 1 -point function. Recall (3.2) and (3.4) to see that there must be a j ∈ J such that v is connected to x j accordingly, where we use the convention in (4.3). Because of this convention, for j = 1, the corresponding event is trivially satisfied. If j ≥ 2, then we can ignore the intersection in the second line of (4.6) for i = 1, . . . , j − 1, so that the event in the second line is automatically satisfied. We now define the first cutting bond, which, by definition and (3.4), leads to (4.6). The contribution due to F ′ (v, x J ; C) will turn out to be an error term. Next, we consider the union over j ∈ J in (4.6). When b is the x j -cutting bond, there is a unique nonempty set I ⊂ J j ≡ J \ {j} such that b is pivotal for v −→ x i for all i ∈ J \ I, but not pivotal for v −→ x i for any i ∈ I. On this event, the intersection in the third line of (4.6) can be ignored. For a nonempty set I ⊊ J, we let j I be the minimal element of J \ I, i.e., (4.7). Then, the union over j ∈ J in (4.6) is rewritten as (4.8). To this event, we will apply Lemma 3.5 and extract a factor τ ( x J\I − b̄). To do so, we first rewrite this event in a similar fashion to (3.13): by Proposition 4.2, it equals the right-hand side of (4.9), where the first and third events on the right-hand side are independent of the occupation status of b.
Similarly to (3.13), H 2 and H 3 can be rewritten, so that, to prove (4.9), it remains to show (4.16). When j I = 1, which is equivalent to 1 ∈ I, the first intersection is an empty intersection, so that, by convention, it is equal to the whole probability space. We use an identity in which we write "(in Λ \ C)" to indicate that the equality is true with and without the restriction that the connections take place in Λ \ C. Therefore, we can rewrite (4.17) as an expression which equals (4.16). This proves (4.9).
By the independence statement in Lemma 3.5, the occupation status of b is independent of the first and third events in the right-hand side of (4.9). This completes the proof of Proposition 4.2.
We continue with the expansion of P(E ′ (v, x J ; C)). By (4.6) and (4.8), as well as Lemma 3.5, Proposition 4.2 and (3.10), we obtain (4.20), where, in the second equality, we use that the relevant factor depends only on bonds before time t b . Applying Proposition 3.6 to P(b̄ C̃ b (v) −−→ x J\I ) and using the notation B δ in (4.21), we obtain (4.22). The first step of the expansion for A (N) ( x J ) is completed by substituting (4.22) into (4.1) as follows. Let a (0) ( x J ; 1) be given by (4.23) (see Figure 6) and, for N ≥ 1, by (4.24). Define, furthermore, for N ≥ 0, the quantities in (4.25)-(4.26), where we use the convention that the term for N = 0 vanishes. Here a (N) ( x J ; 1) and a (N) ( x J\I , x I ; 2) will turn out to be error terms. Then, using (4.1), (4.22), and the definitions in (4.23)-(4.26), we arrive at (4.28) for all N ≥ 0, where we further make use of the recursion relation in (3.19). In Section 4.2, we extract a factor τ ( x I − y 2 ) out of B̃ (N) (y 1 , x I ) and complete the expansion for A( x J ).
Figure 6: Schematic representations of a (1) ( x J ; 1), B̃ (1) (y 1 , x I ) and a (1) ( x J\I , x I ; 2).

Second cutting bond and decomposition of B̃ (N) (y 1 , x I )
First, we recall that, for N = 0, (4.29) holds, where, by (4.3), for j I = 1, {o −→ (x 1 , . . . , x j I −1 )} c is the whole probability space, while, for j I > 1 and since j I − 1 ∈ I by (4.7), B̃ (0) (y 1 , x I ) ≡ 0. For N ≥ 1, we recall (4.30). Therefore, to decompose B̃ (N) (y 1 , x I ) and extract τ ( x I − y 2 ), it suffices to consider (4.31) for any fixed I ⊊ J with I ≠ ∅, v ∈ Λ, C ⊂ Λ and a bond b, where the second term is zero if j I = 1 (see (4.3)). If j I > 1, then both terms in the right-hand side are of the form (4.32), where Ω is the whole probability space. (This should not be confused with the convention in (4.3).) We note that the random variables in the above expectation depend only on bonds, other than b, both of whose end-vertices are in C̃ b (v), and are independent of the occupation status of b. For an event E and a random variable X, we introduce the notation in (4.33). Since C̃ b (v) = C(v) almost surely with respect to P̃ b , we can simplify (4.32) as (4.35). To investigate (4.35), we now introduce a second cutting bond. Definition 4.3 (Second cutting bond). For t ≥ t v , we say that a bond e is the t-cutting bond for the connection under consideration if the corresponding conditions hold. We then obtain an identity for (4.35). By the convention (4.33), this equality also holds when j I = 1 and A = {v}, so that in both cases we are left to analyse (4.38). To the right-hand side, we will apply Lemma 3.5 and extract a factor τ ( x I − y 2 ).
To do so, we first rewrite the event in the second indicator on the right-hand side as follows. Proposition 4.4 (Setting the stage for the factorization II). For A ⊂ Λ, t ≥ t v and a bond e, the event can be rewritten so that the first and third events on the right-hand side are independent of the occupation status of e.
We continue with the expansion of the right-hand side of (4.38). First, we note that B δ (b, y 1 ; C(v)) is random only when t y 1 is strictly larger than t b , and depends only on bonds both of whose end-vertices are in C(v; t y 1 − ε), which is almost surely finite as long as the interval [t v , T ] is finite. As a result, we claim that, almost surely, (4.42) holds. Indeed, this follows since the first term of B δ (b, y 1 ; C(v)) in (4.21) does not depend on C(v) at all, while the other term, due to the definition of B(b, y 1 ; C(v)) in (3.20), only depends on C(v) up to time t y 1 − ε. As a result, by conditioning on C(v; t y 1 − ε) and using Proposition 4.4, the summand in (4.38) for e = b can be written as (4.43), where the second expression is obtained by using t b ≤ t y 1 ≤ t e and the fact that the event {e is occupied} is independent of the other events. To the expectation on the right-hand side of (4.43), we apply Lemma 3.5 with E in (3.12) replaced by Ẽ b , which, we recall, is the expectation for oriented percolation defined over the bonds other than b. Then, (4.43) equals (4.44), where the first equality is due to the fact that the event {e −→ x I in Λ \ C̃ e (v)} depends only on bonds after t e (≥ t b ), so that Ẽ b can be replaced by E, and the second equality is obtained by using (3.9)-(3.10). By performing the sum over B ⊂ Λ and using (4.42), we see that (4.44) equals (4.45), provided that X depends only on bonds before t b . As in the derivation of (4.22) from (4.20), we use Proposition 3.6 to conclude that, by (4.38) and (4.45)-(4.46), the identity (4.47) holds. The expansion for B̃ (N) (y 1 , x I ) is completed by using (4.30)-(4.31) and (4.47) as follows. For convenience, we introduce some abbreviations.
Figure 7: Schematic representations of a (1) (y 1 , x I ; 3) ± . The random variable B δ (b N+1 , y 1 ; C(b N )) in (4.50) for N = 1 is reduced to B (0) (b 2 , y 1 ; C(b 1 )) (in bold dashed lines).
Using this notation, as well as the abbreviations in (4.49), we define, for ℓ = 3, 4, the functions in (4.50)-(4.52). These functions correspond to the second term in the left-hand side of (4.47) and the first and second terms in the right-hand side of (4.47), respectively, when (4.47) is substituted into (4.30). We note that the functions in (4.50) depend on I via the indicator ½{j I > 1}, which is due to the fact that both terms in the right-hand side of (4.31) contribute to the case j I > 1, while, for j I = 1, the contribution comes only from the first term, which has been treated as the case A = {b N }. Now we arrive at (4.53), where a (N) (y 1 , x I ; ℓ) for ℓ = 3, 4 will turn out to be error terms. This extracts the factor τ ( x I − y 2 ) from B̃ (N) (y 1 , x I ).

Summary of the expansion for A( x J )
Recall (4.28) and (4.53), and define, for N ≥ 0, the corresponding quantities; let a (N) ( x J ) be given by (2.25) and define A( x J ) accordingly. Now, we can summarize the expansion in the previous two subsections as follows. Proof. We substitute (4.53) into (4.28). Note that, by (4.7), j I > 1 precisely when 1 ∈ I. Thus, also taking notice of the difference in J \ I, which contains 1 in (2.17) but may not in (4.28), we split the sum over I arising in (4.28) as in (4.58), where y ′ 1 , y ′ 2 and I ′ in the middle expression correspond to y ′ 1 = y 2 , y ′ 2 = y 1 and I ′ = J \ I on the left-hand side of (4.58). Therefore, we arrive at (4.56)-(4.57). This completes the derivation of the lace expansion for the r-point function.
Figure 8: Schematic representations of φ (1) (y 1 , y 2 ) ± and a (1) (y 1 , x I ; 4) ± . The random variables B δ (b N+1 , y 1 ; C(b N )), B δ (e, y 2 ; C̃ e (b N )) and A(e, x I ; C̃ e (b N )) in (4.49)-(4.52) for N = 1 are reduced, respectively, to B (0) (b 2 , y 1 ; C(b 1 )), B (0) (e, y 2 ; C̃ e (b 1 )) and A (0) (e, x I ; C̃ e (b 1 )) (depicted in bold dashed lines).

Proof of (2.33) and a comparison to the survival probability expansion coefficients
In this section, we prove (2.33) and compare the lace-expansion coefficients for the r-point functions to those of the survival probability derived in [14].

Bounds on B(x) and A( x J )
In this section, we prove the following proposition, in which we denote the second-largest element of {t j } j∈J by t̄ = t̄ J . Proposition 5.1. (i) For λ ≤ λ (ε) c , N ≥ 0, t ∈ εN, t J ∈ (εZ + ) |J| and q = 0, 2,

(5.2)
where the constant in the O(β) term is independent of ε, L, N and t (or t̄ in (5.2)).

(5.4)
where the constants in the O(β T ) terms are independent of ε, L 1 , T, N and t (or t̄ in (5.4)).
In Section 5.1, we define several constructions that will be used later to define bounding diagrams for B(x), A( x), C(y 1 , y 2 ) and a( x). There, we also summarize the effects of these constructions. Then, we prove the above bounds on B(x) in Section 5.2, and the bounds on A( x J ) in Section 5.3. Throughout Sections 5-7, we shall frequently assume that λ ≤ 2, which follows from (2.5) for d > 4 and L ≫ 1, and from the restriction on λ T in Theorem 1.1 for d ≤ 4 and L 1 ≫ 1.

Constructions: I
First, in Section 5.1.1, we introduce several constructions that will be used in the following sections to define bounding diagrams on relevant quantities. Then, in Section 5.1.2, we show that these constructions can be used iteratively by studying the effect of applying constructions to diagram functions. Such iterative bounds will be crucial in Sections 5.2-5.3 to prove Proposition 5.1.

Definitions of constructions
For b = (u, v) with u = (u, s) and v = (v, s + ε), we will abuse notation to write p b = p ε (v − u), and define L(u, v; x) (see Figure 9), where ϕ for u = v corresponds to λεD ⋆ τ for u ≠ v. We call the lines from u to x in L(u, v; x) the L-admissible lines. Here, with lines, we mean ϕ(x − u) and (ϕ ⋆ λεD)(x − u) when u ≠ v. If u = v, then we define both lines from u to x in each term in L(u, u; x) to be L-admissible. We note that these lines can be represented by 2-point functions, and thus, below, we will frequently interpret lines as 2-point functions.
We will use the following constructions to prove Proposition 5.1. Definition 5.2 (Constructions B, ℓ, 2 (i) and E). (i) Construction B. Given any diagram line η, say τ (x − v), and given y, we define Construction B η spat (y) and Construction B η temp (y) as in (5.8)-(5.9). Construction B η (y) is the operation whose result is the sum of τ (x − v)δ x,y and the results of Construction B η spat (y) and Construction B η temp (y) applied to τ (x − v). Construction B η (s) is the operation in which Construction B η (y, s) is performed and then followed by summation over y ∈ Z d . Constructions B η spat (s) and B η temp (s) are defined similarly. We omit the superscript η and write, e.g., Construction B(y) when we perform Construction B η (y) followed by a sum over all possible lines η. We denote the result of applying Construction B(y) to a diagram function f (x) by f (x; B(y)), and define f (x; B spat (y)) and f (x; B temp (y)) similarly. For example, applying Construction B spat (y) to the line ϕ(x) produces a contribution δ o,y (λεD ⋆ τ )(x), in which p of ϕ is replaced by λεD.
(ii) Construction ℓ. Given any diagram line η, Construction ℓ η (y) is the operation in which a line to y is inserted into the line η; this means, for example, that the 2-point function τ (u − v) corresponding to the line η is replaced by a convolution of two 2-point functions joined at an intermediate vertex, from which a line to y is attached. We omit the superscript η and write Construction ℓ(y) when we perform Construction ℓ η (y) followed by a sum over all possible lines η. We write F (v, y; ℓ(z)) for the diagram where Construction ℓ(z) is performed on the diagram F (v, y). Similarly, for y = (y 1 , . . . , y j ), Construction ℓ( y) is the repeated application of Construction ℓ(y i ) for i = 1, . . . , j. We note that the order of application of the different Constructions ℓ(y i ) is irrelevant.
(iii) Constructions 2 (i) and E. For a diagram F (v, u) with two vertices carrying labels v and u and with a certain set of admissible lines, Constructions 2 (1) u (w) and 2 (0) u (w) produce the corresponding diagrams, where η is the sum over the set of admissible lines for F (v, u). Here and elsewhere, we use Einstein's summation convention: each diagram function F (v, u; 2 (i) u (w)) depends only on v and w, but not on u. We call the L-admissible lines of the added factor L(u, z; w) in (5.12) the 2 (1) -admissible lines for F (v, u; 2 (1) u (w)). Construction E y (w) is the successive application of Constructions 2 (1) y (z) and 2 (0) z (w) (cf. Figure 10): F (v, y; E y (w)) = F (v, y; 2 (1) y (u), 2 (0) u (w)), where η is the sum over the 2 (1) -admissible lines for F (v, y; 2 (1) y (u)). Note that F (v, y; E y (w)) also depends only on v and w, but not on y. We further define the E-admissible lines to be all the lines added in the Constructions 2 (1) y (z) and 2 (0) z (w).
Figure 10: The terms resulting from Construction 2 (1) y (z) are due to the fact that L(y, u; z) for some u consists of 2 terms, and that the result of Construction B η (u) consists of 3 (= 2 + 1) terms, one of which is the trivial contribution F (v, y) δ y,u . The number of admissible lines in the resulting diagram is 2 for this trivial contribution, otherwise 1. Therefore, the number of resulting terms at the end is 54, which is the sum of 6 (due to the identity in (5.13)), 24 (= 4 × 6, due to the non-trivial contribution in the first stage followed by Construction 2 (0) z (w)) and 24 (= 2 × 2 × 6, due to the trivial contribution having 2 admissible lines followed by Construction 2 (0) z (w)).

Effects of constructions
In this section, we summarize the effects of applying the above constructions to diagrams, i.e., we prove bounds on diagrams obtained by applying constructions to simpler diagrams in terms of the bounds on those simpler diagrams. We also use the following bounds on τ̂ t that were proved in [15]: there is a K = K(d) such that (5.15)-(5.16) hold for d > 4 and any t ≥ 0. For d ≤ 4 with 0 ≤ t ≤ T log T , we replace β by β T = L −d T , and σ 2 by σ 2 T = O(L 2 T ). Furthermore, by [15, Lemma 4.5], we have that, for q = 0, 2 and d > 4, the corresponding convolution bound holds for some c < ∞. Again, for d ≤ 4, we replace σ q β by σ q T β T . Lemma 5.3 (Effects of Constructions B and ℓ). Let s ∧ min i∈I t i ≥ 0, and let f t I ( x I ) be a diagram function that satisfies Σ x I f t I ( x I ) ≤ F ( t I ) by assigning the l 1 or l ∞ norm to each diagram line and using (5.15)-(5.16) in order to estimate those norms. Let d > 4. Then, there exist C 1 , C 2 < ∞ which are independent of ε, s and t I such that, for any line η and q = 0, 2, Σ x I ,y |y| q f t I ( x I ; B η (y, s)) ≤ (N η σ 2 s) q/2 (δ s,tη + εC 1 ) F ( t I ), (5.17) Σ x I ,y |y| q f t I ( x I ; ℓ η (y, s)) ≤ C 2 (N η σ 2 s) q/2 (1 + s ∧ t η ) F ( t I ), (5.18) where N η is the number of diagram lines along the path from o to the endpoint of η. Proof. The first inequality (5.17), where δ s,tη is due to the trivial contribution in B η (y, s), is a generalisation of [15, Lemma 4.6], where η was an admissible line. For q = 2, in particular, we first bound |y| 2 by N η Σ Nη i=1 |y i − y i−1 | 2 , where (y 0 , s 0 ) ≡ o, (y 1 , s 1 ), (y 2 , s 2 ), . . . , (y Nη , s Nη ) ≡ (y, s) are the endpoints of the diagram lines along the (shortest) path from o to (y, s). Then, we estimate each contribution from |∆y i | 2 ≡ |y i − y i−1 | 2 using the bound on |∇ 2 τ̂ s i −s i−1 (0)| in (5.15) or the bound on sup ∆y i |∆y i | 2 (τ s i −s i−1 * D)(∆y i ) in (5.16). As a result, we gain an extra factor O(s i − s i−1 )σ 2 or O(s i − s i−1 )σ 2 T , depending on the value of d. Summing all contributions yields the factor O(s)σ 2 or O(s)σ 2 T .
The rest of the proof is similar to that of [15, Lemma 4.6]. To prove the second inequality (5.18), we note the corresponding identity. We first perform the sum over y using (5.15)-(5.16), and then perform the sum over z using (5.17). This yields the claim for d > 4; for d ≤ 4, we only need to replace σ in the above computation by σ T . This completes the proof of Lemma 5.3. Lemma 5.4 (Effect of Construction 2 (1) ). There is a constant c < ∞, which does not depend on f, L f , C f and t, such that, for d > 4, Σ x |x| q f (u; 2 (1) u (x, t)) satisfies the stated bound. Proof. The idea of the proof is the same as that of [15, Lemma 4.7]. Here we only explain the case q = 0; the extension to q = 2 is proved identically to the extension to q = 2 in [15, Lemma 4.8].
First we recall the definition (5.12). Then, we obtain the bound in (5.23) on Σ x f (u; 2 (1) u (x, t)). Since f s (u) has at most L f lines at any fixed time between 0 and s, by Lemma 5.3, we obtain the corresponding estimate. By (5.6) and (5.16), we have that, for d > 4 and any u, v ∈ Z d and s, s ′ ≤ t, the bound (5.24) holds. For d ≤ 4, β is replaced by β T . The factor ε δ (u,s),(v,s ′ ) will be crucial when we introduce the 0th-order bounding diagram (see, e.g., (5.36) and (5.63) below). To bound the convolution (5.23), however, we simply ignore this factor. Then, the contribution to (5.23) from δ s,s ′ in (5.24) is bounded by c ′ β or c ′ β T (depending on d) multiplied by the remaining sum. The constants c ′′ , c ′′′ above are independent of ε and t. To obtain the required factor, we argue as follows. This completes the proof.
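The factors O(s)σ 2 appearing in Lemmas 5.3-5.4 reflect the fact that spatial variance grows linearly in time, which is exact for the underlying random walk: the second moment of an s-fold convolution of D is exactly sσ 2 . The quick check below (d = 1, spread-out D; an illustration of the mechanism, not of the percolation bound itself) verifies this additivity.

```python
def step_distribution(L=3):
    """Spread-out step distribution D, uniform on {-L,...,L}\\{0} (d = 1)."""
    supp = [x for x in range(-L, L + 1) if x != 0]
    return {x: 1.0 / len(supp) for x in supp}

def convolve(f, g):
    """Convolution of two distributions on Z, stored as dicts."""
    h = {}
    for x, fx in f.items():
        for y, gy in g.items():
            h[x + y] = h.get(x + y, 0.0) + fx * gy
    return h

def second_moment(f):
    """Sum_x x^2 f(x), i.e. -(d^2/dk^2) f_hat(0) for a symmetric f."""
    return sum(x * x * fx for x, fx in f.items())

D = step_distribution()
sigma2 = second_moment(D)        # sigma^2 = (1 + 4 + 9)/3 = 14/3 for L = 3
walk = dict(D)
for s in range(2, 6):
    walk = convolve(walk, D)
    # variance of the s-step walk is exactly s * sigma^2
    assert abs(second_moment(walk) - s * sigma2) < 1e-9
```

In the lace-expansion bounds the same growth appears only as an upper bound, since τ is a percolation two-point function rather than a random-walk transition kernel.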

Bound on B(x)
In this section, we estimate B(x). First, in Section 5.2.1, we prove a diagrammatic bound on B (N) (v, y; C); then, in Section 5.2.2, we prove the bound on B (N) (x).

Diagrammatic bound on B (N) (v, y; C)
First we define bounding diagrams for B (N) (v, y; C). For v, w, c ∈ Λ, we define S (0) (v, w; c) as in (5.29)-(5.30). For v, w ∈ Λ and C ⊆ Λ, we define w − = (w, t w − ε) and S (0) (v, w; C) as in (5.31)-(5.32), where (c, w) ∈ C precisely when the bond (c, w) is a part of C. We now comment on this issue in more detail. Note that C ⊆ Λ appearing in B (N) (v, y; C) is a set of sites. However, we will only need bounds on B (N) (v, y; C) for C = C̃ N for some N . As a result, the set C of sites here has a special structure, which we will conveniently make use of. That is, in the sequel, we will consider C to consist of sites and bonds simultaneously, as in Remark 3 in the beginning of Section 3, and call C a cluster-realization when it arises in this way for some c ∈ Λ. The diagram S (0) (v, w; C) is closely related to the diagram Σ c∈C S (0) (v, w; c), apart from the fact that S (0,1) (v, w; c) is multiplied by λεD(w − c) in (5.31) and by ½{(c,w)⊆C}(1 − δ c,w − ) in (5.32). In all our applications, the role of C is played by a C̃ N -cluster, and, in such cases, since (c, w) is a spatial bond, λεD(w − c) is the probability that the bond (c, w) is occupied. This factor ε is crucial in our bounds. Furthermore, we define P (0) as in (5.34)-(5.35), where the admissible lines for the application of Construction 2 (0) w (y) in (5.34)-(5.35) are (τ ⋆ λεD)(w − v) and τ (w − v) in the second lines of (5.29)-(5.30). If c = v, then, by the first line of (5.29) and recalling (5.13) and the definition of Construction B applied to a "line" of length zero (see below (5.9)), we have (5.36). We further define the diagram P (N) (v, y; c) (resp. P (N) (v, y; C)) by N applications of Construction E to P (0) (v, y; c) in (5.34) (resp. P (0) (v, y; C) in (5.35)). We call the E-admissible lines arising in the final Construction E the N th admissible lines. We note that, by (5.6) and this notation, an equivalent way of writing (5.14) is obtained, where η is the sum over the admissible lines for F (v, y). In particular, we obtain the recursion (5.39), where η is the sum over the (N − 1)th admissible lines.
The following lemma states that the diagrams constructed above indeed bound B (N) (v, y; C). Lemma 5.5. For N ≥ 0, v, y ∈ Λ, and a cluster-realization C ⊂ Λ with min c∈C t c < t v , the bound (5.40) holds. Proof. A similar bound was proved in [14, Proposition 6.3], and we follow its proof as closely as possible, paying attention to the powers of ε.
We prove by induction on N the two statements (5.41)-(5.42), where η is the sum over the N th admissible lines. The first inequality, together with (3.20), immediately implies (5.40).
To verify (5.41) for N = 0, we first prove (5.43), where E ◦ F denotes the disjoint occurrence of the events E and F . It is immediate that (5.44) holds (see, e.g., [14, (6.12)]). However, when ε ≪ 1, the above bound is not good enough, since it does not produce sufficiently many factors of ε. Therefore, we now improve the inclusion. We denote by w the element of C with the smallest time index such that v −→ w. Such an element must exist, since E ′ (v, y; C) ⊂ {v C −→ y}. Then, there are two possibilities: either v is not connected to w − ≡ (w, t w − ε), or w − ∈ C. In the latter case, since C is a cluster-realization with min c∈C t c < t v , there must be a vertex c ∈ C such that the spatial bond (c, w) is a part of C. Together with (5.44), it is not hard to see that (5.43) holds.
Recall that a spatial bond b has probability λεD(b̄ − b) of being occupied. We note that, since {u −→ w} ◦ {u −→ y} occurs, and when w ≠ u and y ≠ u, there must be at least one spatial bond b with b = u, such that either b̄ −→ w or b̄ −→ y. Therefore, this produces a factor ε. Also, when w ≠ y and u ≠ y, the disjoint connections in {w −→ y} ◦ {u −→ y} produce a spatial bond pointing at y. Taking all of the different possibilities into account, and using the BK inequality (see, e.g., [7]), we obtain (5.45), which is (5.41) for N = 0.
To verify (5.42) for N = 0, we use (5.46), where η is the sum over the 0th admissible lines. Indeed, to relate (5.46) to (5.45), fix a backward occupied path from w to v. We note that this must share some part with the occupied paths from v to y. Let u be the vertex with the highest time index of this common part. Then, there must be a spatial bond b with b = u. Recall that the result of Construction B η (z) is the sum of Σ η P (0) (v, y; C) τ (w − y) and the results of Construction B η spat (z) and Construction B η temp (z). We also recall (5.8)-(5.9). Therefore, z in (5.46) is b = u in the contribution due to Construction B η spat (z), and z = b̄ in the contribution from Construction B η temp (z). This completes the proof of (5.42) for N = 0. To advance the induction, we fix N ≥ 1 and assume that (5.41)-(5.42) hold for N − 1. By (3.19), (5.45), (5.35) and (5.32), we have (5.47). Note that t c ≥ t b . By the Markov property, we obtain (5.48). Substituting (5.48) into (5.47) and using (5.31) and (5.34), we arrive at (5.49), where η is the sum over the admissible lines for P (0) (b̄, y; C̃ N−1 ). The argument in (5.47)-(5.49) then proves the first claim. We use the induction hypothesis (5.42) to bound the arising factor, as well as the fact that the corresponding identity holds (cf. (5.39)), where η ′ is the sum over the (N − 1)th admissible lines. This leads to the claimed bound. This completes the advancement of (5.42).
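The BK inequality invoked above, P(E ◦ F) ≤ P(E)P(F) for increasing events E, F, can be verified exhaustively on a toy configuration space of finitely many independent bonds. In the sketch below (our own toy setup, not the paper's), E and F are events of the form {some bond in A is occupied}, for which E ◦ F requires two distinct occupied bonds, one witnessing each event.

```python
from itertools import product

def prob(event, p):
    """Exact probability of `event` over independent bonds: bond i is
    occupied with probability p[i]; enumerate all 2^n configurations."""
    total = 0.0
    for omega in product([0, 1], repeat=len(p)):
        if event(omega):
            w = 1.0
            for occ, pi in zip(omega, p):
                w *= pi if occ else 1 - pi
            total += w
    return total

def check_bk(A, B, p):
    """Verify P(E o F) <= P(E) P(F) for E = {some bond in A occupied},
    F = {some bond in B occupied}; disjoint occurrence E o F needs two
    DISTINCT occupied bonds, one in A and one in B."""
    E = lambda w: any(w[a] for a in A)
    F = lambda w: any(w[b] for b in B)
    EoF = lambda w: any(w[a] and w[b] for a in A for b in B if a != b)
    return prob(EoF, p) <= prob(E, p) * prob(F, p) + 1e-12

assert check_bk({0, 1}, {1, 2}, [0.3, 0.5, 0.7])
assert check_bk({0}, {1, 2}, [0.3, 0.5, 0.7])   # disjoint A, B: equality case
```

When A and B are disjoint, E and F are independent and the inequality holds with equality; overlap in A ∩ B is where the inequality becomes strict.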
We close this section by listing a few related results that will be used later on. First, it is not hard to see that (5.42) can be generalised to (5.53). Next, we introduce the definitions (5.56)-(5.57). We will use the recursion formula (5.58) (cf. (5.39)), in which η is the sum over the Nth admissible lines; this can easily be checked by induction on M (see also [14, (6.21)-(6.24)]). We will also make use of the following lemma, which generalises (5.58) to cases where more constructions are applied, namely (5.59), where η is the sum over the Nth admissible lines for P^{(N)}(b). Recall that Construction ℓ(x) for x = (x_1, . . . , x_j) is the repeated application of Construction ℓ_{η_i}(x_i) for i = 1, . . . , j, followed by sums over all possible lines η_i for i = 1, . . . , j.
Proof. The above inequality is similar to (5.58), but now with two extra constructions performed on the arising diagrams. The equality in (5.58) is replaced by an upper bound in (5.59), since on the right-hand side there are more possibilities for the lines on which Constructions ℓ(x) and ℓ(z) can be performed.

Proof of the bound on B^{(N)}(x)
We now specialise to v = o and C = {o}, for which we recall (3.25) and (5.56)-(5.57). The main result in this section is the following bound on P^{(N)}_t(x) ≡ P^{(N)}((x, t)), from which, together with Lemma 5.5, the inequalities (5.1) and (5.3) easily follow: for λ ≤ λ_c^{(ε)}, N ≥ 0, t ∈ εZ_+ and q = 0, 2, where the constant in the O(β) term is independent of ε, L, N and t.
where the constants in the O(β_T) terms are independent of ε, T, N and t.
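The displayed bound on P^{(N)}_t(x) is not reproduced here; the following is only a hedged, schematic rendering of the shape such a bound takes, consistent with how it is invoked around (5.82)-(5.85) below (σ², the variance of D, is an assumed notation, and the precise power of β for N = 0 depends on the exact statement):

```latex
% Schematic shape only (not the verbatim statement): for q = 0, 2,
\sum_{x\in\mathbb{Z}^d} |x|^q\, P^{(N)}_t(x)
  \;\le\; O(\beta)^{N}\,\sigma^q\,(1+t)^{q/2}\,(1+t)^{-(d-2)/2}.
```

The essential features are the factor O(β) per expansion order N and the Gaussian-type time decay (1+t)^{−(d−2)/2}, which is summable in t precisely when d > 4.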

Bound on A(x_J)
In this section, we investigate A(x_J). First, in Section 5.3.1, we prove a d-independent diagrammatic bound on (3.25). Then, in Section 5.3.2, we prove the bound (5.2) for d > 4 and the bound (5.4) for d ≤ 4 simultaneously.

Diagrammatic bound on A(x_J)
The main result proved in this section is the following proposition. To prove Lemma 5.8, we first note that, by (3.16)-(3.17) and (3.19)-(3.20), we are led to study P(E′(v, x_J; C)). As a result, Lemma 5.8 is a consequence of the following lemma, in which η is the sum over the (N−1)th admissible lines for P^{(N−1)}(v, b_N; C). Ignoring the restriction b_N = z and using an extension of (5.53), we obtain Y ≤ P^{(N)}(v, z; C, ℓ(x_I)). (5.72) For X, we use (5.36) and (5.39) to obtain the corresponding bound. Finally, we use the BK inequality to bound the remaining factor. This completes the proof.
Proof of Lemma 5.9. Recall (5.43). We show below that (5.74) holds. First, we prove (5.70) assuming (5.74). Substituting (5.74) into P(E′(v, x_J; C)), for the sum over z ≠ v we use the BK inequality to extract P(z −→ x_{J\I}) ≡ τ(x_{J\I} − z) and apply an inequality that results from an extension of the argument around (5.46). This completes the proof of (5.70). It remains to prove (5.74). Summarising (4.5)-(4.9), we can rewrite E′(v, x_J; C); note that the first event on the right-hand side is a subset of the second event when I = J_j and z = x_j, for which J \ I = {j} and {z −→ x_{J\I}} = {x_j −→ x_j} is the trivial event. This completes the proof of (5.74) and hence of Lemma 5.9.

Proof of the bound on A(x_J)
We prove (5.2) for d > 4 and (5.4) for d ≤ 4 simultaneously, using Lemmas 5.3 and 5.7-5.8. Below, we will frequently use (5.79), where we recall max_{i∈I} t_i ≤ T log T for d ≤ 4. For simplicity, let I = {1, . . . , i}. Then, (5.79) is an easy consequence of Lemma 5.3 and the tree-graph inequality [1]. First we prove (5.2), for which d > 4, for N ≥ 1. By Lemma 5.8, we have (5.81). Note that the number of lines contained in each diagram for P^{(N)}(z) at any fixed time between 0 and t_z is bounded, say, by L, due to its construction. Therefore, by Lemmas 5.3 and 5.7, we obtain (5.82), and further (5.83). More generally, by denoting the second-largest element of {s, t_I} by \bar{s, t_I}, we have (5.84), where the combinatorial factor (L+|I|−1)!/(L−1)! is independent of β and N. Substituting this and (5.60) into (5.81) and using (5.79), we obtain, since (d − 2)/2 > 1, the bound (5.85), where \bar t = \bar t_J. This proves (5.2) for N ≥ 1.
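For orientation, the tree-graph inequality of [1] used to deduce (5.79) can be illustrated in its simplest instance, the 3-point function. This is a hedged, schematic version, with time variables in εZ_+:

```latex
% Tree-graph bound for the 3-point function (schematic): the connections
% from o to (x_1,t_1) and (x_2,t_2) must branch at some space-time point
% (z,s); the BK inequality then factorizes the three resulting disjoint
% connections into 2-point functions.
\tau^{(3)}_{t_1,t_2}(x_1,x_2)
  \;\le\; \sum_{s\in\varepsilon\mathbb{Z}_+:\; s\le t_1\wedge t_2}\;
          \sum_{z\in\mathbb{Z}^d}
  \tau_s(z)\,\tau_{t_1-s}(x_1-z)\,\tau_{t_2-s}(x_2-z).
```

Iterating this bounds the r-point function by a sum over binary trees with r − 1 leaves, each edge contributing a 2-point function; summing over the spatial variables then yields bounds of the form (5.79).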
To prove (5.4), for which d ≤ 4, for N ≥ 1, we simply replace O(β)^N in (5.84) by O(β_T)O(β_T)^{N−1}, using Lemma 5.7(ii) instead of Lemma 5.7(i). Then, we use the factor β_T to control the sums over s ∈ εZ_+ in (5.85), as in (5.28). Since t_{J\I} ≤ T log T, we obtain the required bound. This completes the proof of (5.4) for N ≥ 1.
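The dichotomy between d > 4 and d ≤ 4 ultimately rests on the elementary estimate below, with a = (d − 2)/2; this is standard calculus, stated here for orientation:

```latex
% Growth of the time-sums appearing in (5.85), depending on the decay
% exponent a = (d-2)/2 of the summand:
\sum_{s=0}^{t} (1+s)^{-a} \;=\;
  \begin{cases}
    O(1), & a > 1,\\
    O\!\big(\log(2+t)\big), & a = 1,\\
    O\!\big((1+t)^{1-a}\big), & a < 1.
  \end{cases}
```

For d > 4 (a > 1) the time-sums converge uniformly in t, whereas for d ≤ 4 they grow with t, up to t ≤ T log T; the small factor β_T is then used to compensate this growth, as in (5.28).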
Next we consider the case of N = 0. Similarly to the above computation, the contribution from the latter sum in (5.68), over z ≠ v (= o in the current setting), equals εO(β(1 + \bar t)^{r−3}) for d > 4 and εO(β_T(1 + \bar t)^{r−3}) for d ≤ 4. It remains to estimate the contribution from z = v. If ε is large (e.g., ε = 1), then we simply use the BK inequality, and by (5.79) we obtain the desired bound. If ε ≪ 1, then we must be more careful. Since {o −→ x_I} and {o −→ x_{J\I}} occur bond-disjointly, and since there is only one temporal bond growing out of o, there must be a nonempty subset I′ of I or of J \ I and a spatial bond b with bottom o such that {b −→ x_{I′}} ∘ {o −→ x_{J\I′}} occurs. Then, by the BK inequality and (5.79), we obtain the claimed estimate. This completes the proof of (5.2) for d > 4 and (5.4) for d ≤ 4.
We will prove Lemma 6.3 using the following three lemmas, the first of which is stated for v, x ∈ Λ and t_v < t ≤ t_x (cf. Figure 11). Lemma 6.6. Let X be a non-negative random variable which is independent of the occupation status of the bond b, and let F be an increasing event. Then, (6.45) and, for N ≥ 1, (6.46) hold. The remainder of this subsection is organised as follows. In Section 6.3.1, we prove Lemma 6.3 assuming Lemmas 6.5-6.7. Lemma 6.5 is an adaptation of [14, Lemmas 7.15 and 7.17] for oriented percolation, which applies here since the discretized contact process is an oriented percolation model. The origin of the event {z −→ w−} ∪ {w− ∉ A} in (6.43) is similar to the intersection with the second line in (5.43), for which we refer to the proof of (5.43). Lemma 6.6 is identical to [14, Lemma 7.16]. We omit the proofs of these two lemmas. In Section 6.3.2, we prove Lemma 6.7.
6.3.1 Proof of Lemma 6.3 assuming Lemmas 6.5-6.7

Proof of Lemma 6.3 for N_2 = 0. First we prove the bound on φ^{(N,N_1,0)}(y_1, y_2), which we represent by means of (4.46) and (4.48). Note that, by Lemma 6.5, H_{t_{y_1}}(b_N, e; {b_N}) is a subset of V_{t_{y_1}−ε}(b_N, e), which is an increasing event. We also note that the event E′(b_N, b_{N+1}; C̃_{N−1}) and the random variable B are independent of the occupation status of b_{N+1}. By Lemma 6.6 and using (3.16) and (3.19), the claimed bound follows.

Figure 12: Schematic representations of the events.

The bound (6.17) for N_2 = 0 now follows from Lemma 6.7.

Proof of Lemma 6.7
Proof of Lemma 6.7 for N_1 = 0. Since B^{(0)}_δ(b_{N+1}, y_1; C̃_N) = δ_{b_{N+1}, y_1}, the sums over b_{N+1} on the left-hand sides of (6.45)-(6.46) are identical to the sums on the right-hand sides over b with b = y_1. We also note that t_{y_1} = t_{b_{N+1}} + ε in this case. By the definitions of R^{(N)} and Q^{(N)} in (6.13)-(6.14), to prove Lemma 6.7 for N_1 = 0 it suffices to show (cf. (3.27)) the corresponding identities for the left-hand sides of (6.57)-(6.58). On the other hand, by the recursive definition of P^{(N)} (cf. (5.58)), the right-hand sides of (6.57)-(6.58) can be rewritten as in (6.61)-(6.62), where Construction ℓ(c) in (6.61) is applied to the (N−1)th admissible lines of P^{(N)} and that in (6.62) is applied to the (N−2)th admissible lines. By comparing the above expressions and following the argument around (5.47)-(5.49), it thus suffices to prove (6.63)-(6.64). First we prove (6.63). Note that, by (3.16), the left-hand side of (6.63) can be expanded; using (6.40), we obtain the corresponding bound. The event E′(b_N, b_{N+1}; C̃_{N−1}) implies the disjoint connections necessary to obtain the bounding diagram; these can be accounted for by an application of Construction ℓ(v), and then {v −→ y_2} ∘ {z −→ y_2} can be accounted for by an application of Construction 2^{(0)}_v(y_2). The event {x ∈ C̃_N} implies additional connections, accounted for by an application of Construction ℓ(x). By (6.13), this completes the proof of (6.63).
Next, we prove (6.64). Note that, by (3.19), we have (6.68). Using (6.43) and following the argument below (6.67), we obtain the corresponding bound. Similarly to the above, E′(b_N, b_{N+1}; C̃_{N−1}) implies the existence of the disjoint connections necessary to obtain the bounding diagram P^{(0)}(b_N, b_{N+1}; C̃_{N−1}). The event subject to the union over z is accounted for by an application of Construction B(u), followed by multiplication by Σ_{w: t_w > t_{b_{N+1}}} S^{(0)}(u, w; C̃_{N−1}, 2^{(0)}_w(y_2)), resulting in the corresponding bounding diagram. The event {x ∈ C̃_N} is accounted for by applying Construction ℓ(x_I) to P^{(0)}(b_N, b_{N+1}; C̃_{N−1}, B(u)) and Construction ℓ(x_{J\I}) to S^{(0)}(u, w; C̃_{N−1}, 2^{(0)}_w(y_2)), followed by the summation over I ⊂ J. Then, by (5.32) and (5.35), we obtain the resulting estimate. Note that the relevant factor is a random variable (since C̃_{N−1} is random) which depends only on bonds in the time interval [t_{b_N}, t_{b_{N+1}}], and that t_a ≥ t_{b_{N+1}}, which is due to (5.29)-(5.30) and the restriction on t_w. Therefore, by the Markov property (cf. (5.48)) and (5.34), we arrive at (6.72), whose last factor is P^{(0)}(u, y_2; a, ℓ(x_{J\I})). Some care is needed to estimate the remaining factor. First, by (5.32) and t_v ≤ t_{b_{N+1}} ≤ t_a, we obtain a first bound; then, by the BK inequality, we have (6.73). However, by a version of (5.55), we have an identity in which η is the sum over the admissible lines of the diagram P^{(0)}(b_{N−1}, b_N; C̃_{N−2}). Therefore, the sum of the second line on the right-hand side of (6.73) is bounded accordingly.

(6.77)
By the definition of Construction ℓ_η(c), the diagram function D(v) can be rewritten as (6.78). Thanks to this identity, the second line of (6.77) can be regarded as the result of applying Construction B_η(v) to the diagram line (τ ∗ λεD)(v − c′) at v, followed by a multiplication by τ(a − v) and a summation over v. This is not accounted for in the first line of (6.77), and is the difference between the result of Construction ℓ(a) and that of Construction ℓ′(a) in the first line of (6.77). By this observation, we obtain (6.79). Therefore, by applying the bounds (6.76) and (6.79) to (6.73) and using (5.31) and (5.34), we arrive at the desired estimate. Finally, by Lemma 5.6 and a version of (6.14), we obtain the claimed bound. This completes the proof of (6.64).
7.2 Proof of (7.6)

To prove (7.6), it thus suffices to show that the sum of ã^{(N,N′)}(x_{J\I}, x_I; 2) over N′ satisfies (7.6). We discuss the following three cases separately: (i) |J \ I| = 1; (ii) |J \ I| ≥ 2 and N′ = 0; (iii) |J \ I| ≥ 2 and N′ ≥ 1. The reason why a^{(N)}(x_j, x_{J_j}; 2) for some j is small is the same as that for a^{(N)}(x_J; 1), explained in Section 7.1. However, as seen in Figure 6, the reason for general a^{(N)}(x_{J\I}, x_I; 2) with |J \ I| ≥ 2 to be small is different: it is because there are at least three disjoint branches coming out of a "bubble" started at o.
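Schematically, and only as a hedged sketch of the mechanism (not the precise diagram of Figure 6): if the three disjoint branches to x_1, x_2, x_3 emanate from a space-time point (z, s) that is doubly connected to o (a "bubble"), then the BK inequality gives

```latex
% Crude accounting of a bubble with three disjoint branches:
% o => (z,s) denotes a double (disjoint) connection from o to (z,s).
\mathbb{P}\Big(\{o \Rightarrow (z,s)\} \circ \{(z,s)\to x_1\}
   \circ \{(z,s)\to x_2\} \circ \{(z,s)\to x_3\}\Big)
 \;\le\; \tau_s(z)^2 \prod_{k=1}^{3} \tau_{t_k-s}(x_k - z),
```

and the bubble factor Σ_{(z,s)} τ_s(z)² is small, of order β for d > 4, which is the source of the gain.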
(ii) If |J \ I| ≥ 2 and N′ = 0, then we use (5.68) to obtain the bound on ã^{(N,0)}(x_{J\I}, x_I; 2). Following the argument between (6.72) and (6.81) (see also (6.84)), we obtain Σ_{b_{N+1}} 𝟙{b_N −→ x_I} P^{(0)}(b_{N+1}, z; C̃_N, ℓ(x_{I′})) ≤ P^{(N+1)}(z; ℓ(x_I), ℓ_{≤t_z}(x_{I′})), (7.20) where ℓ_{≤t_z}(x_{I′}) means that we apply Construction ℓ(x_{I′}) to the lines contained in P^{(N+1)}(z; ℓ(x_I)), but with at least one of the |I′| constructions applied before time t_z. This excludes the possibility that there is a common branch point for x_{I∪I′} after time t_z. Let (7.21)-(7.22) be defined accordingly. To estimate the sums of (7.21)-(7.22) over x_J ∈ Z^{d|J|}, we use extensions of (7.14) involving the factors (1 + s)^{(d−2)/2} and (1 + \bar{s, t_I})^{|I|−1}, where the second sum in the last line is interpreted as zero if T_{I′} > t_{I\I′}. The first sum is readily bounded by O(β_T)Δt, whereas the second sum, if it is nonzero (so that, in particular, T_{I′} ≤ \bar t), is also bounded by O(β_T)Δt.
For d ≤ 2, we use (7.27) twice, together with 1 + t_{I\I′} ∧ T_{I′} ≤ 1 + \bar t = Δt, to obtain the analogous bound. This completes the proof of (7.56) and hence of Lemma 7.3.