New results on pathwise uniqueness for the heat equation with colored noise

We consider strong uniqueness and thus also existence of strong solutions for the stochastic heat equation with a multiplicative colored noise term. Here, the noise is white in time and colored in $q$-dimensional space ($q \geq 1$) with a singular correlation kernel. The noise coefficient is Hölder continuous in the solution. We discuss improvements of the sufficient conditions obtained in Mytnik, Perkins and Sturm (2006) that relate the Hölder coefficient with the singularity of the correlation kernel of the noise. For this we use new ideas of Mytnik and Perkins (2011), who treat the case of strong uniqueness for the stochastic heat equation with multiplicative white noise in one dimension. Our main result on pathwise uniqueness confirms a conjecture that was put forward in their paper.


Introduction
This work is the third in a series of papers dealing with the pathwise uniqueness of the stochastic heat equation with Hölder continuous noise coefficients. For $t > 0$ and $x \in \mathbb{R}^q$ we set $X(0, x) = X_0(x)$ and consider
$$\frac{\partial X}{\partial t} = \frac{1}{2}\Delta X + \sigma(t, x, X)\,\dot{W}(t, x) + b(t, x, X) \quad \text{a.s.} \tag{1}$$
Here, $X : \mathbb{R}_+ \times \mathbb{R}^q \to \mathbb{R}$ is random, $\Delta$ denotes the Laplacian, $\dot{W}$ is a space-time noise on $\mathbb{R}_+ \times \mathbb{R}^q$, and $\sigma$ and $b$ are real-valued functions. Stochastic partial differential equations (SPDEs) of the form (1) arise naturally in the description of the densities of measure-valued processes on $\mathbb{R}^q$ that are obtained, for instance, as diffusion limits of spatial branching particle systems. For example, in the case of super-Brownian motion in dimension $q = 1$ the measure at any positive time $t > 0$ has a density $X_t(x) = X(t, x)$ a.s., and this density satisfies equation (1) with $\sigma(t, x, X) = \sqrt{X}$, $b \equiv 0$ and $\dot{W}$ space-time white noise ([KS88], [Rei89]).
Here, we want to focus on equation (1) in any dimension $q \geq 1$ in the case when the noise coefficient $\sigma$ is not necessarily Lipschitz but merely Hölder continuous in the solution $X$, and $\dot{W}$ is a noise that is white in time and colored in space. This means that $W$ is a Gaussian martingale measure on $\mathbb{R}_+ \times \mathbb{R}^q$ as introduced in [Wal86] with spatial correlation kernel $k : \mathbb{R}^{2q} \to \mathbb{R}$ specified as follows. For $\phi \in C_c(\mathbb{R}^q)$, the continuous compactly supported functions on $\mathbb{R}^q$, the real-valued process $(W_t(\phi))_{t \geq 0}$ is a Brownian motion with quadratic variation given by
$$\langle W(\phi) \rangle_t = t \int_{\mathbb{R}^q} \int_{\mathbb{R}^q} \phi(w)\,\phi(z)\,k(w, z)\,dw\,dz. \tag{2}$$
SPDEs with colored noise of this form arise as diffusion limits of branching particle systems in a random environment, whose spatial correlation is described by the kernel $k$, in the case that $\sigma(t, x, X) = X$, see [Stu03] and also [Myt96]. More general noise coefficients $\sigma$ should correspond to an additional dependence of the branching on the local particle density, see [Zäh10] for a recent general formulation in the non-spatial setting without a random environment. In this article we give conditions for pathwise uniqueness of solutions to equation (1) with the correlation kernel $k$ of the following form: There exist constants $\alpha \in (0, 2 \wedge q)$ and $c_3 > 0$ such that
$$k(w, z) \leq c_3 \left( |w - z|^{-\alpha} + 1 \right) \quad \text{for all } w, z \in \mathbb{R}^q. \tag{3}$$
For noise correlation kernels of this form, existence and pathwise uniqueness of solutions to (1) when $\sigma$ is Hölder continuous in the solution were previously considered in [MPS06], where an equivalent formulation of condition (3) can be found as well as further conditions that any correlation kernel as in (2) must satisfy. The techniques used in [MPS06] for finding sufficient conditions for pathwise uniqueness were further refined in [MP11], albeit for (1) in dimension $q = 1$ with space-time white noise. In this work, we want to utilize the ideas of [MP11] in order to improve the results of [MPS06].
In order to rigorously describe our new results as well as the preceding results of [MPS06] and [MP11], we introduce some conditions on the coefficients as well as some notation. We will impose a growth condition and a Hölder continuity condition on $\sigma$ as well as the standard Lipschitz condition on $b$. So assume that there exists a constant $c_4$ such that for all $(t, x, X) \in \mathbb{R}_+ \times \mathbb{R}^{q+1}$,
$$|\sigma(t, x, X)| + |b(t, x, X)| \leq c_4 (1 + |X|). \tag{4}$$
Furthermore, for some $\gamma \in (0, 1)$ there are $A_1, A_2 > 0$ and for all $T > 0$ there is an $A_0(T)$ so that for all $t \in [0, T]$ and all $(x, X, X') \in \mathbb{R}^{q+2}$,
$$|\sigma(t, x, X) - \sigma(t, x, X')| \leq A_0(T)\, e^{A_1 |x|} (1 + |X| + |X'|)^{A_2} |X - X'|^{\gamma}, \tag{5}$$
and there is a $B > 0$ such that for all $(t, x, X, X') \in \mathbb{R}_+ \times \mathbb{R}^{q+2}$,
$$|b(t, x, X) - b(t, x, X')| \leq B\,|X - X'|. \tag{6}$$
Also, we denote by $C_c$, $C_0$, $C_b$ the spaces of continuous functions with compact support, vanishing at infinity, or bounded, respectively. By $C(E, F)$ we denote the continuous functions from $E$ to $F$ for some topological spaces $E$ and $F$. If the function is $k$-times continuously differentiable for $k \in \mathbb{N} \cup \{\infty\}$ we write a superscript $k$. We also write $B^q(x, r)$ for the ball with center $x$ and radius $r$ in $\mathbb{R}^q$. Throughout the paper we will use the convention that constants denoted by $c_{i.j}$, $c_i$ refer to their appearance in Lemma $i.j$ or Equation $(i)$, respectively. We will denote generic constants by $C$, which may change their values from line to line. Further dependence on parameters is indicated in brackets. Finally, let $p_t(x) = (2\pi t)^{-q/2} \exp\left(-\frac{|x|^2}{2t}\right)$ be the $q$-dimensional heat kernel. We say that $(X, W)$ is a (stochastically weak) solution if there exists a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \geq 0}, P)$ that supports a colored noise $W$ defined as in (2) and (3) such that $X$ and $W$ are adapted and the mild formulation of (1) holds, namely
$$X(t, x) = \int p_t(x - y)\,X(0, y)\,dy + \int_0^t \!\! \int p_{t-s}(x - y)\,\sigma(s, y, X(s, y))\,W(ds\,dy) + \int_0^t \!\! \int p_{t-s}(x - y)\,b(s, y, X(s, y))\,dy\,ds \tag{7}$$
almost surely for all $t \geq 0$ and $x \in \mathbb{R}^q$, where we used the abbreviation $\int$ for $\int_{\mathbb{R}^q}$. (In the following the integration domain will always be assumed to be $\mathbb{R}^q$ if nothing else is specified.) For more details about these so-called mild solutions and the existence of the stochastic integral with respect to $W$ see [Dal99]; for more about the notion of weak solutions see [Jac80] Def. 5.2(a).
Define the space of tempered functions by $C_{tem} := \{f \in C(\mathbb{R}^q, \mathbb{R}) : \|f\|_\lambda < \infty \ \forall \lambda > 0\}$, where $\|f\|_\lambda := \sup_{x \in \mathbb{R}^q} |f(x)|\, e^{-\lambda |x|}$.
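As a quick illustration of this definition, the following worked computation checks two test functions. Both functions are our own choices for illustration and do not appear in the source: polynomial growth is tempered, exponential growth is not.

```latex
% Example 1 (illustrative): f(x) = |x|^k with fixed k >= 1.
% For every \lambda > 0 the map r -> r^k e^{-\lambda r} is maximal at r = k/\lambda, hence
\[
  \|f\|_\lambda = \sup_{r \ge 0} r^{k} e^{-\lambda r}
  = \Bigl(\frac{k}{\lambda e}\Bigr)^{k} < \infty ,
  \qquad\text{so } f \in C_{tem}.
\]
% Example 2 (illustrative): g(x) = e^{|x|}. For 0 < \lambda < 1,
\[
  \|g\|_\lambda = \sup_{x \in \mathbb{R}^q} e^{(1-\lambda)|x|} = \infty ,
  \qquad\text{so } g \notin C_{tem}.
\]
```

The second example shows why the norm has to be finite for every $\lambda > 0$ and not only for large $\lambda$.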
For the existence of solutions we state the following.
Theorem 1.1. Let $X_0 \in C_{tem}$ and let $b$, $\sigma$ be continuous functions satisfying (4). Assume that (3) holds for some $\alpha \in (0, 2 \wedge q)$. Then there exists a stochastically weak solution to (1) with sample paths in $C(\mathbb{R}_+, C_{tem})$. Additionally, it holds that for all $T, \lambda, p > 0$,
$$E\Bigl[ \sup_{0 \leq t \leq T} \sup_{x \in \mathbb{R}^q} |X(t, x)|^p\, e^{-\lambda |x|} \Bigr] < \infty. \tag{8}$$
This theorem is essentially Theorem 1.2 and Theorem 1.8 of [MPS06] combined, except that we add a drift $b$ and allow space and time dependence of $b$ and $\sigma$. The full proof addressing these straightforward generalizations can be found in Chapter 8 of [Rip12].
We say that pathwise uniqueness for (1) holds if for any two solutions $X^1$ and $X^2 \in C_{tem}$ on the same filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \geq 0}, P)$ supporting a noise $W$ and with $X^1_0 = X^2_0$ almost surely we have that $X^1(t, x) = X^2(t, x)$ for all $t \geq 0$, $x \in \mathbb{R}^q$ almost surely. We are now in the position to state our main result regarding pathwise uniqueness of solutions to (1):
Theorem 1.2. Let $X_0 \in C_{tem}$ and assume that $b, \sigma : \mathbb{R}_+ \times \mathbb{R}^q \times \mathbb{R} \to \mathbb{R}$ satisfy (4), (5) and (6). Assume that (3) holds for some $\alpha \in (0, 2 \wedge q)$. Then pathwise uniqueness for solutions of (1) holds if $\alpha < 2(2\gamma - 1)$.
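To make the condition of Theorem 1.2 concrete, here is a small worked example. The value $\gamma = 3/4$ is chosen purely for illustration, and the comparison condition $\alpha < 2\gamma - 1$ is the one attributed to [MPS06] in this paper.

```latex
% Both conditions are read under the standing assumption \alpha \in (0, 2 \wedge q).
\[
  \text{[MPS06]: } \alpha < 2\gamma - 1,
  \qquad
  \text{Theorem 1.2: } \alpha < 2(2\gamma - 1).
\]
% Illustrative value \gamma = 3/4:
\[
  2\gamma - 1 = \tfrac12, \qquad 2(2\gamma - 1) = 1 ,
\]
% so the admissible range of \alpha grows from (0, 1/2) to (0, 1).
% Both conditions are non-empty only for \gamma > 1/2, and as \gamma \uparrow 1:
\[
  2\gamma - 1 \uparrow 1, \qquad 2(2\gamma - 1) \uparrow 2 .
\]
```

In dimensions $q \geq 2$ the second limit matches the upper end $2 \wedge q = 2$ of the admissible range of $\alpha$, whereas the first limit stops at 1.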
Our main result improves the sufficient conditions for pathwise uniqueness given in [MPS06] in the same setting: there, it was shown that pathwise uniqueness holds if $\alpha < 2\gamma - 1$. Since it was known already then from [Dal99, PZ00] that for Lipschitz continuous noise coefficients $\sigma$ (corresponding to $\gamma = 1$) pathwise uniqueness holds if $\alpha < 2 \wedge q$, there was an obvious gap for $\gamma$ close to 1 in dimensions $q \geq 2$. We close this gap with the present work. In addition, heuristic arguments
(b) Condition (5) implies the following local Hölder condition: For all $K > 1$ there is an $L_K$ so that for all $t \in [0, K]$ and $x \in B^q(0, K)$, $X, X' \in [-K, K]$,

Proof of Theorem 1.2
The proof of Theorem 1.2 is inspired by the idea of Yamada and Watanabe [YW71] that was already used in [MPS06] and [MP11]. We closely follow Section 2 in [MP11], as most of the ideas can be transferred from white to colored noise and also to the multi-dimensional setting. Now consider Theorem 1.2 and assume its hypotheses throughout. Let $X^1$ and $X^2$ be two solutions of (1) on $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \geq 0}, P)$ with sample paths in $C(\mathbb{R}_+, C_{tem})$ a.s., with the same initial condition, $X^1_0 = X^2_0 = X_0 \in C_{tem}$, and of course the same noise $W$. We start by observing that $X^i$ for $i = 1, 2$ satisfy the distributional form of (1): In fact, for adapted processes with sample paths in $C(\mathbb{R}_+, C_{tem})$, the mild formulation (7) is equivalent to the distributional formulation (10) of solutions to (1), see page 1917 of [MPS06]. Let for any $K > 1$ be a stopping time. Since $X^i \in C(\mathbb{R}_+, C_{tem})$ we have $T_K \to \infty$ for $K \to \infty$. Up to time $T_K$ condition (5) implies that for some $R_0, R_1 > 0$. Thus, a stopping time argument allows us to prove Theorem 1.2 for $\sigma$ where (5) is replaced by (12) (see the text after (2.30) in [MP11] for more on the sufficiency of this argument).
In order to apply an argument similar to that of Yamada and Watanabe we set, for any $n \in \mathbb{N}$ as in [MP11], $a_n = \exp\{-n(n + 1)/2\}$ and fix a positive function $\psi_n \in C^\infty(\mathbb{R}, \mathbb{R}_+)$ such that $\operatorname{supp} \psi_n \subset (a_n, a_{n-1})$, $\psi_n(x) \leq \frac{2}{nx}$ and $\int_{a_n}^{a_{n-1}} \psi_n(x)\,dx = 1$.
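The three requirements on $\psi_n$ are compatible, which can be seen from a short computation using only the definition $a_n = \exp\{-n(n+1)/2\}$ given above. The following is a consistency check, not part of the original argument.

```latex
% Ratio of consecutive scales:
\[
  \frac{a_{n-1}}{a_n}
  = \exp\Bigl\{\frac{n(n+1)}{2} - \frac{(n-1)n}{2}\Bigr\} = e^{n}.
\]
% Mass of the upper envelope 2/(nx) on the interval (a_n, a_{n-1}):
\[
  \int_{a_n}^{a_{n-1}} \frac{2}{nx}\,dx
  = \frac{2}{n}\,\log\frac{a_{n-1}}{a_n} = 2 \;\ge\; 1 .
\]
```

Since the envelope carries mass 2, there is room for a smooth function below it with support in $(a_n, a_{n-1})$ and total mass exactly 1.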
As this function approximates a δ-function at zero as n → ∞, we define which then approximates the modulus. More precisely, we have Next we fix a point x ∈ R q and t 0 > 0 and a positive function Φ ∈ C ∞ c (R q , R + ) such that supp Φ ⊂ B q (0, 1) and Φ(y)dy = 1. Let Φ m x (y) = m q Φ(m(y − x)) for m > 0. Define the difference of the solutions u := X 1 − X 2 and note that we can write down an equation of the form (10) for u. Let ·, · denote the scalar product on L 2 (R q ) and assume t ∈ [0, t 0 ]. We apply the Itô-formula for the semimartingale u t (·), Φ m x (·) , which is the difference of the two semimartingales given in (10), with φ n as in (13) in order to obtain × σ(s, w, X 1 (s, w)) − σ(s, w, X 2 (s, w)) σ(s, z, X 1 (s, z)) − σ(s, z, X 2 (s, z)) We integrate this function of x against another non-negative test function Ψ ∈ C ∞ c ([0, t 0 ] × R q ). Choose K 1 ∈ N so large that for λ = 1, We then apply the classical and stochastic versions of Fubini's Theorem, see Theorem 2.6 of [Wal86]. The expectation condition in Walsh's Theorem 2.6 may be realized by localization, using the stopping times T K for K → ∞. Arguing as in the proof of Proposition II.5.7 of [Per02] to handle the time dependence in Ψ we then obtain that for any t ∈ [0, t 0 ], , Ψ s σ(s, y, X 1 (s, y)) − σ(s, y, X 2 (s, y)) W (ds dy) × σ(s, w, X 1 (s, w)) − σ(s, w, X 2 (s, w)) σ(s, z, X 1 (s, z)) − σ(s, z, X 2 (s, z)) Now set m n = a −1/2 n−1 = exp{(n − 1)n/4} for n ∈ N. This choice of m n differs from that in [MPS06] and is essential for the improvements that are made here to the results in [MPS06], in particular to their Lemma 4.3.
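The choice of $m_n$ can be unpacked with the definition $a_n = \exp\{-n(n+1)/2\}$ and the scaling $\Phi^m_x(y) = m^q \Phi(m(y - x))$ from above. The following short computation is a sketch of the resulting length scale and is meant only as a reading aid.

```latex
% The scaling parameter expressed through a_n:
\[
  m_n = a_{n-1}^{-1/2}
  = \exp\Bigl\{\tfrac12 \cdot \tfrac{(n-1)n}{2}\Bigr\}
  = \exp\{(n-1)n/4\},
  \qquad
  m_{n+1}^{-1} = a_n^{1/2} = \sqrt{a_n}.
\]
% Since supp \Phi \subset B^q(0,1) and \Phi^m_x(y) = m^q \Phi(m(y - x)):
\[
  \operatorname{supp} \Phi^{m_{n+1}}_x \subset B^q\bigl(x, m_{n+1}^{-1}\bigr)
  = B^q\bigl(x, \sqrt{a_n}\bigr),
  \qquad
  \int \Phi^{m_{n+1}}_x(y)\,dy = \int \Phi(y)\,dy = 1 .
\]
```

So the mollifier at level $n + 1$ averages $u$ over a ball of radius $\sqrt{a_n}$ around $x$.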
We quote essentially Lemma 2.2 from [MPS06] (where m n is used for m) and add a last point treating I mn,n 5 (t): Lemma 2.1. For any stopping time T and constant t ≥ 0 we have: Proof. The points (a), (b) and (c) are proven in Lemma 2.2 of [MPS06]. We only need to show the last point (d), for which we follow (2.48) of [MP11]. Since |φ ′ n (x)| ≤ 1 for all x ∈ R q by (15), (6) implies that for a stopping time T , The integral over y converges pointwise in x and s due to continuity. Using (8) we can obtain an integrable bound for this integrand and Lebesgue's Dominated Convergence Theorem thus implies for n → ∞,Ĩ and hence in L 1 since, again by (8), (Ĩ n 5 (t)) n∈N is L 2 -bounded.
It is $I_3^{m_{n+1},n+1}$ that will mostly concern us for the rest of this work. In its integral definition we may assume $|x| \leq K_1$ by (17) and so $|w| \vee |z| \leq K_1 + 1$. If $K \geq K_1$, $s \leq T_K$ and $|w| \leq K_1 + 1$ we have by (11) Therefore (3), (9) and the fact that $\psi_n(x) \leq \frac{2}{nx} 1\{a_n < x < a_{n-1}\}$ show that since We note that $a_{n+1}^{-1} = a_n^{-1-2/n}$. Thus, as the quantity of interest we define
Proposition 2.2. Suppose $\{U_{M,n,K} : M, n, K \in \mathbb{N}, K \geq K_1\}$ are $\mathcal{F}_t$-stopping times such that for each $K \in \mathbb{N}^{\geq K_1}$, are satisfied. Then the conclusion of Theorem 1.2 holds.
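The identity relating $a_{n+1}$ and $a_n$ that is noted before Proposition 2.2 can be verified directly from $a_n = \exp\{-n(n+1)/2\}$. The second display is our own reading of how it combines with the bound on $\psi_n$ and should be taken as a sketch.

```latex
% Direct verification of the identity:
\[
  a_n^{-1-2/n}
  = \exp\Bigl\{\frac{n(n+1)}{2}\cdot\frac{n+2}{n}\Bigr\}
  = \exp\Bigl\{\frac{(n+1)(n+2)}{2}\Bigr\}
  = a_{n+1}^{-1}.
\]
% On the support of \psi_{n+1}, where x > a_{n+1}, this gives
\[
  \psi_{n+1}(x) \le \frac{2}{(n+1)\,x}
  \le \frac{2}{n+1}\,a_{n+1}^{-1}
  = \frac{2}{n+1}\,a_n^{-1-2/n}.
\]
```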
The proof of this proposition is the same as the proof of Proposition 2.1 in [MP11], here using Lemma 2.1. What one shows is that $(t, x) \mapsto E[u(t, x)]$ is a non-negative subsolution of the heat equation with Lipschitz drift started at 0. Hence, the two solutions coincide pointwise, and so by continuity of paths we have $X^1 = X^2$. We omit the details and refer to the proof of Proposition 2.1 in [MP11].
Observe that all that is left for the proof of our main result, Theorem 1.2, is the construction of the stopping times $U_{M,n,K}$ and the verification of $(H_1)$ and $(H_2)$. As these steps are extremely long, we first give a heuristic explanation for the sufficiency of $\alpha < 2(2\gamma - 1)$ leading to $(H_2)$, even though we will not yet discuss the construction of the stopping times, which is done in Section 6.
where | · | always denotes the Euclidean norm on the corresponding space.
Note that the indicator function in the definition of I n in (26) implies that there is anx 0 ∈ B q (x, √ a n ) such that |u(s,x 0 )| ≤ a n . If we could takex 0 = w = z we could bound I n (t) by see page 1929 of [MPS06]. Thus, (H 1 ) and (H 2 ) would follow immediately with U M,n,K = T K . (The criticality of α < 2(2γ − 1) in this argument is deceptive as it follows from our choice of m n .) Thus, in order to satisfy the hypotheses of Proposition 2.2 we now turn to obtaining good bounds on |u(s, w) − u(s,x 0 )| with |x 0 − w| ≤ 2 √ a n . The standard 1 − α/2 − ε-Hölder modulus of u (see Theorem 2.1 in [SSS02]) will not give a sufficient result. In [MPS06], provided that α < 2γ − 1, the Hölder modulus near points where u is small was refined to 1 − ε for any ε > 0. More precisely, let Theorem 2.3. For each K ∈ N and 0 < ξ < 1− α 2 1−γ ∧ 1 there is an N 0 = N 0 (ξ, K, ω) ∈ N a.s. such that for all natural numbers N ≥ N 0 and all (t, x) ∈ Z(N, K), Theorem 4.1 of [MPS06] is stated and proved for equation (1) without a drift. For the necessary changes to include the drift we refer to Section 9.9 in [Rip12].
We now argue how this locally improved Hölder regularity can be used. As already mentioned after (27) the choice of m n is crucial. It is related to the locally improved Hölder regularity and so for the moment set m n = a −λ0 n−1 for some λ 0 > 0. We will take the liberty to use the approximation m n ≈ a −λ0 n in the following heuristic argument. Then for (H 2 ) it suffices to show For x fixed, the pointx 0 mentioned before (27) will now lie in B q (x, m −1 n ) and on the other hand only those w and z with |w − x| ∨ |z − x| ≤ m −1 n will appear in the integral (28). So w, z ∈ B q (x 0 , 2a λ0 n ). Theorem 2.3 implies that for α < 2γ − 1 u(t, ·) is ξ-Hölder continuous near its zero set for ξ < 1, which allows us to bound |u(s, w) − u(s,x 0 )| by (2a λ0 n ) ξ , and therefore |u(s, w)| by a n + 2a λ0ξ n which in turn is bounded by 3a λ0ξ n if λ 0 ≤ 1. We can use this and (27) in (28) to bound I n (t) for 0 < λ 0 ≤ 1 by a constant times the following if 2γ − 1 > α and we choose λ 0 , ξ close to one. This was just the result in [MPS06]. However, in Theorem 2.3 the restriction by 1 in the condition ξ < 1− α 2 1−γ ∧ 1 seems unnatural and not optimal.
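The exponent count behind this heuristic can be sketched as follows. The sketch ignores constants and the factor $1/(n+1)$, uses the approximation $m_n \approx a_n^{-\lambda_0}$ from the text, and assumes that the difference of the noise coefficients is controlled by $|u|^{\gamma}$ on the relevant balls. The scaling identity for the kernel term is our own computation and may differ in form from the bound the paper cites.

```latex
% Scaling of the singular part of the kernel:
% substitute w' = m(w - x), z' = m(z - x); the integral is finite since \alpha < q.
\[
  \iint \Phi^m_x(w)\,\Phi^m_x(z)\,|w - z|^{-\alpha}\,dw\,dz
  = m^{\alpha} \iint \Phi(w')\,\Phi(z')\,|w' - z'|^{-\alpha}\,dw'\,dz' .
\]
% Collecting powers of a_n, with |u(s,w)|, |u(s,z)| \lesssim a_n^{\lambda_0 \xi}:
\[
  \underbrace{a_n^{-1-2/n}}_{\text{from } \psi_{n+1}}
  \cdot \underbrace{a_n^{-\alpha\lambda_0}}_{\text{kernel, } m^{\alpha}}
  \cdot \underbrace{a_n^{2\gamma\lambda_0\xi}}_{|u(s,w)|^{\gamma}\,|u(s,z)|^{\gamma}}
  = a_n^{\,2\gamma\lambda_0\xi - \alpha\lambda_0 - 1 - 2/n}.
\]
% This tends to 0 as n \to \infty whenever
\[
  2\gamma\lambda_0\xi - \alpha\lambda_0 - 1 > 0 ,
\]
% and letting \lambda_0, \xi \uparrow 1 this becomes \alpha < 2\gamma - 1.
```

The restriction $\xi < 1$ is therefore exactly what limits this version of the argument to $\alpha < 2\gamma - 1$.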
To obtain an improved result we need to extend the range of ξ beyond 1. We will obtain a statement close to the following one: where ∇u denotes the spatial derivative (in a loose sense as u is not differentiable). Actually, we cannot really write down (30) formally, but some statements come close to it, e.g. Corollary 5.10 for m =m + 1. At this point we would like to note that a similar argument as in [MP11] shows that, using the techniques for α > 2(2γ − 1), we will not be able to improve (30) to So we can extend the range of ξ up to 2 − ε, but not beyond with this technique. Assuming α < 2(2γ − 1) and (30), we outline the idea of how we will be able to derive (28). We choose 0 = β 0 < β 1 < · · · < β L =β < ∞, a finite grid, and definê x | < a n , |∇u(s, x)| ∈ (a βi+1 n , a βi n ]} for i < L and for i = 0, x | < a n , |∇u(s, x)| > a β1 n } and for i = L,Ĵ n,L (s) = {x ∈ R q : | u s , Φ mn+1 x | < a n , |∇u(s, x)| ∈ [0, a βL n ]}. Since our goal of proving I n (t) → 0 will be attained, if we can show that For a grid of β i fine enough we will be able to replace the condition that the absolute value of the gradient is contained in (a βi+1 n , a βi n ] in the definition ofĴ n,i (s) by the condition that it is approximately equal to a βi n for i = 1, . . . , L. Note that due to the boundedness of the support of Φ n x , for x ∈Ĵ n,i (s) there must bex n (s) ∈ B q (x, a λ0 n ) such that |u(s,x n (s))| < a n . By (31) we have for w ∈ B q (x, a λ0 n ) and [x n (s), w] the Euclidean geodesic between the two points: |u(s, w)| ≤ a n + sup (|∇u(s, x)| + |w − x| ξ )|x n (s) − w| ≤ a n + (a βi n + 2a λ0ξ n )a λ0 n ≤ 7(a n ∨ a if we choose λ 0 = 1 2 , which is the smallest possible value for balancing the terms. Similarly, β i ≤ 1 2 is optimal in (34). If we put this estimate into (32), then we can boundÎ n,i (t) by and (27) leads to the bound for some K 1 > 0, since Ψ is compactly supported. 
If β i is rather small, we find ourselves in the situation that the Hölder estimate (34) is not that strong. With a choice of λ 0 = 1 we would have gotten back to the case α < 2γ − 1, since small β i corresponds to neglecting the estimate on derivatives. However, particularly in that case we can give a good estimate on |Ĵ n,i (s)|, the q-dimensional Lebesgue measure ofĴ n,i (s). But, let us first consider β L =β. Then, by the estimate in (35) we havê as n → ∞ as long as we require β L =β ≥ 1/2. From this and the considerations just after (34), we know that it should suffice to chooseβ = 1/2, or more precisely, choosingβ smaller will not lead to an optimal result, whereasβ > 1 2 will not improve the result. We still need to check the convergence for i = 0, . . . , L−1 and write in order to simplify notation β = β i and J n =Ĵ n,i (s). From (31) we see that if x ∈ J n , then there is a direction σ x ∈ S q−1 := {x ∈ R q : |x| = 1} with σ x · ∇u(s, y) ≥ 1 2 a β n if |y − x| ≤ La β/ξ n for an appropriate constant L and (y − x) σ x , meaning that (y − x) is parallel to σ x . Assuming for the heuristic that u(s, x) > −a n (which we only know precisely for a pointx n (s) ∈ B q (x, a 1/2 n ) due to | u s , Φ a 1/2 n x | < a n ) we obtain because of the positive gradient for y ∈ x + R + σ x by the Fundamental Theorem of Calculus: Similarly, one can also show (but we will not go into details here) that, by adapting L appropriately, if x, z ∈ J n and |x − z| ≤ La β/ξ n , we also have for z ′ ∈ z + σ x · [4a 1−β n ; La β/ξ n ] that u(s, z ′ ) > a n and thus z ′ / ∈ J n . So for x ∈ J n , denoting by {x + σ ortho x } the plane through x orthogonal to σ x , we have This is what we wanted to show and ends the heuristic outline of the proof (some more details in the case of white noise can be found in Section 2 of [MP11]). Remark 2.4. In the previous heuristics it suffices to consider one direction of the gradient. 
This will be sufficient to obtain uniqueness for $\alpha < 2(2\gamma - 1)$ rigorously. However, it is tempting to include further information on the gradient, e.g. $\nabla u \approx (a_n^{\beta^1}, a_n^{\beta^2}, \dots)$. We believe that no further improvement can be achieved, since (34) only requires the size of the principal component of the gradient.
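For the flattest gradient level, the exponent count of the heuristic can be sketched in the same spirit. The sketch ignores constants, takes $\lambda_0 = 1/2$ and $\beta = 1/2$ as in the discussion above, assumes $\xi \geq 1$, and again assumes that the kernel term contributes a factor $m_{n+1}^{\alpha} = a_n^{-\alpha/2}$ and that the noise coefficients contribute $|u|^{\gamma}$ each. It does not cover the levels with smaller $\beta$, where the size of the sets $\hat{J}_{n,i}(s)$ has to be used in addition.

```latex
% Size of u near a point where the mollified u is below a_n:
\[
  |u(s,w)| \lesssim a_n + a_n^{\beta + 1/2} + a_n^{(1+\xi)/2} \lesssim a_n
  \qquad (\beta = \tfrac12,\ \xi \ge 1).
\]
% Collecting powers of a_n:
\[
  a_n^{-1-2/n} \cdot a_n^{-\alpha/2} \cdot a_n^{2\gamma}
  = a_n^{\,2\gamma - \alpha/2 - 1 - 2/n}.
\]
% Since a_n^{-2/n} = e^{n+1} grows only exponentially while a_n decays like e^{-n^2/2},
% this tends to 0 as n \to \infty precisely when
\[
  2\gamma - \tfrac{\alpha}{2} - 1 > 0
  \quad\Longleftrightarrow\quad
  \alpha < 2(2\gamma - 1).
\]
```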

Verification of the hypotheses of Proposition 2.2
In this section we make the heuristics of the previous section rigorous in the sense that we derive hypothesis $(H_2)$. This proof relies on the definition of sets similar to the ones defined before (32) and on Proposition 3.2, whose proof is given in Section 6 and contains the verification of hypothesis $(H_1)$.
We follow the arguments of Section 3 in [MP11] and will also restrict our attention to the case $b \equiv 0$ for notational convenience. All of the results can be extended to non-trivial $b$ satisfying the Lipschitz condition (6); for more details we refer to Section 8 of [MP11] or Section 9.10 of [Rip12]. Otherwise, we assume the setting of the beginning of Section 2. That means that $X^1$ and $X^2$ are two solutions of the SPDE (1) with the same noise $W$ and $u := X^1 - X^2$ is the difference of the two, i.e.
where D(s, y) = σ(s, y, X 1 (s, y)) − σ(s, y, X 2 (s, y)) which by (12) obeys Let (P t ) t≥0 be the heat-semigroup acting on C tem . For δ ≥ 0 set With the help of the Stochastic-Fubini-Formula (Theorem 2.6 in [Wal86], where localization with T K and (8) are used for the condition on the expectation) reformulate that for δ ≤ t to We define the following functions for which we easily obtain u 1,δ (t, x) = G δ (t, t, x). We denote by the spatial derivative of the heat-kernel. Then the following result holds, which is analogous to Lemma 3.1 in [MP11] and the lines preceding it and has essentially the same proof: Lemma 3.1. The random fields G δ and F δ,l are both jointly continuous in (s, t, x) ∈ R 2 + × R q and Additionally, u 1,δ and u 2,δ are both C(R + , C tem )-valued.
Note that for the special choice of s = t in the previous lemma we have that .
be the set of points with the smallest $u$-values in a certain neighborhood close to $x$ and let be a measurable choice of a point in $B_n(t, x)$ (e.g. with the smallest first coordinate; if this does not suffice to uniquely select a point, take the smallest second coordinate, and so on). Let us fix two positive but very small constants $\varepsilon_0$, $\varepsilon_1$ throughout the paper and $\beta_{L+1} = \frac{1}{2} - \varepsilon_1$. So altogether for $i = 0, \dots, L + 1$: We define the following subsets of $\mathbb{R}^q$: and for $i = 1, \dots, L - 1$: Recall (26) and observe that for $t \geq 0$, $n \in \mathbb{N}$: To verify the hypotheses of Proposition 2.2, it suffices to show the existence of stopping times $U_{M,n,K}$ satisfying $(H_1)$ as well as for $i = 0, \dots, L$, We will get to the definition of these stopping times in Section 6. We now define $\sigma_x := \sigma_x(n, s) := \nabla u_{1,a_n}(s, \hat{x}_n(s, x))\,\bigl(|\nabla u_{1,a_n}(s, \hat{x}_n(s, x))|\bigr)^{-1}$ as the direction of the gradient $\nabla u_{1,a_n}$ at the point $\hat{x}_n(s, x)$ close to $x$. We also set where dependence on $\beta_i$ is not written out explicitly if there are no ambiguities.
To get (H 2,i ) we need to derive some properties of points in J n,i . Therefore, set and for i = 1, . . . , L − 1: We also define two deterministic constants and will from now on always assume that The next proposition shows that we can ultimately estimate the size of the setsJ n,i (s) instead of that of J n,i (s) : The proof of this proposition can be found in Section 6. We will use this proposition to show (H 2,i ) at the end of this section. We need the following notation for i ∈ {0, . . . , L}: where we omit the dependence on β i if there are no ambiguities and obtain: Lemma 3.3. If i ∈ {0, . . . , L} and n > n M (ε 1 ), then Proof.
by (43), (45) and because a ε1 n < 2 −8 by (47). This gives the first inequality. For the second one, We give some elementary properties of the setsJ n,i (s).
Proof. To prove (a) let n, i, s, x, x ′ , x ′′ be as above. Since the distance to x of any point on the line between x ′ and x ′′ is bounded from above by 5l n (β i ). By the Mean Value Theorem and the definition ofJ n,i (s), we get To prove (b) w.l.o.g. consider (x ′′ − x ′ ) · σ x ≥ 0 and so estimate analogously to (a) (remember that [·, ·] denotes the Euclidean geodesic between two points in R q ): Next, we prove (c) using that |y ′ − x| ∨ |y ′′ − x| < √ a n +l n (β i ) +l n (β i ) ≤ 5l n (β i ).
where, in the next to last inequality, we used that x ∈J n,i (s) for the ∇u 1,a λ i n -part and y ∈J n,i (s) for the u 2,a λ i n -part. Finally, prove (d) much in the same way as the previous claims: We have |x n (s, x) − w| < |x n (s, x) − x| + |x − w| ≤ 2 √ a n ≤l n (β i ) by Lemma 3.3. So we can apply (a) for x ′ =x n (s, x) and The next lemma provides some conclusions that can be drawn about points that lie inJ n,i (s) for i ∈ {0, . . . , L − 1} and s ∈ R + .
Clearly, |z| ≤ √ a n and for Therefore, we can apply Lemma 3.4 (b) in the case (x − x) > 0 to obtain The same can be done in the case (x − x) · σ x < 0.
To show (b) use the same ideas as before, where Lemma 3.4 (b) is replaced by Lemma 3.4 (c), in order to deduce that 97 32 a n .
So, we can apply (b) for x, y ∈J n,i (s) to obtain that Let Σ x be a q × (q − 1) dimensional matrix consisting of an orthonormal basis of the orthogonal space σ ortho x = {y ∈ R q : σ x · y = 0} and let |A| denote the Lebesgue measure of a measurable set A ⊂ R q .
Lemma 3.6. For i ∈ {0, . . . , L − 1} and s ≥ 0, n ∈ N there is a constant c 3.6 = c 3.6 (q) such that ) and cover the compact setJ n,i (s) with a finite number of these balls, say . So, if we increase the radius of the balls around x ′1 , . . . , x ′Q ′ tol n (β i )/2, it suffices to use those balls whose centers have at least distancel n (β i )/4, which we denote by x 1 , . . . , x Q . If we consider B q (x k ,l n (β i )/8), k = 1, . . . , Q, then all of these balls are disjoint. Thus, we have and alsoJ Next we want to consider the Lebesgue measure of the sets on the right-hand-side using some kind of Cavalieri decomposition and Lemma 3.5 (c). Fix k ∈ {1, . . . , Q} and denote by C(q) the volume of the q-dimensional Euclidean ball. We have Here, we were able to apply Lemma 3.5(c) in the last inequality with z = x k + Σ x k z ′ since |x k + Σ x k z ′ − x k | = |Σ x z ′ | = |z ′ | ≤l n /2. And therefore, by (49) and (50) for c 3.6 = 4 · 4 q C(q − 1) we obtain |J n,i (s)| ≤ c 3.6 K q 0 l n (β i )(l n (β i )) −1 .
We are now in the position to complete the Verification of the Hypothesis (H 2 ) in Proposition 2.2. Let n > n M (ε 1 ) ∨ n 0 (ε 0 , ε 1 ), t > 0 and M ∈ N fixed. First, consider i = 0. For x ∈ J n,0 (s) and |y − x| ≤ √ a n we have |u(s, y)| ≤ 3a (1−ε0)/2 n due to Proposition 3.2. So, we obtain in (46) for n large enough so that ε 1 > 2 n : n a −α/2 n t 0 K q 0 l n (β 0 )l n (β 0 ) −1 (by (47) and Lemma 3.6) And this expression tends to zero as n → ∞ since by (43) Next, let i ∈ {1, . . . , L} and assume x ∈J n,i (s), y ∈ R q , |y − x| ≤ √ a n . So, we can use Lemma 3.4 (d) to get that |u(s, y)| ≤ 5a Put that into (46) for y = w and y = z to obtain that To treat the integral in w and z, we use (27) leading to Next, we use Lemma 3.6 in the case i ∈ {1, . . . , L − 1} and obtain . Hence, it suffices to check for positivity of ρ 1,i and ρ 2,i to obtain the desired result.
. Additionally, note that by (43), So we can calculate (43). To finish the proof, we note that in the case i = L it suffices to use a trivial bound on the integral in (54) and we obtain with β L ≥ 1 2 − 6ε 1 − ε 0 ≥ 1 2 − 7ε 1 from (43): And so, we are done with the proof of Proposition 2.2. ✷

Heat kernel estimates
This section will be concerned with estimates for the heat kernel in R q defined by and its derivative in space There are already a number of results in Section 5 of [MPS06] regarding bounds on heat kernels, in particular when they are connected by a correlation kernel and also in Section 4 of [MP11] regarding the derivatives of heat kernels. Here, we will combine the techniques used for those results in order to obtain bounds on integrals of the derivatives p t,l that are connected by a correlation kernel related to colored noise. All of the proofs are put into the appendix. As necessary we will highlight the dependence of constants C on various quantities. This first simple lemma will be used frequently later on: Then there is a constant C = C(r 0 , r 1 ) > 0 such that for all r ∈ [r 0 , r 1 ] and a ≥ 0, u ≥ 1, A trivial consequence is the following Lemma 4.2 in [MP11]: For the heat kernel in R q there is a constant C > 0 such that for l = 1, . . . , q, t > 0, x ∈ R q , The next lemma is about the integral over distances of heat kernel derivatives: A simple extension of Lemma 5.1 in [MPS06] is the following lemma: and there is a constant C = C(K, R) such that for x, y ∈ [−K, K] q : Using the two previous lemmas we can obtain a result on integrals "outside" a certain area:

Local bounds on the difference of two solutions
In this section we present the extension of Theorem 2.3, i.e. the results showing (in some sense) "Hölder-continuity of order 2". This section is very similar in its ideas to Section 5 of [MP11]. Hence, we do not give all of the proofs but can refer the interested reader to Section 9.4 of [Rip12] for the details. First, let us recall that for n ∈ N, a n = exp(−n(n + 1)/2) and for (t, x), Define for N, K, n ∈ N, β ∈ [0, 1/2] the random set |u(t 0 ,x 0 )| ≤ a n ∧ ( √ a n 2 −N ), and |∇u 1,an (t 0 ,x 0 )| ≤ a β n }.
The induction start is proved as in Proposition 5.1 of [MP11], here using Theorem 2.3 instead of their Lemma 2.3, so we omit it. The induction step from $(P_m)$ to $(P_{m+1})$ is a bit more technical and needs some preparation. It will be completed at the end of this section.
As the proof is essentially the same as that of Lemma 5.2 in [MP11], we omit it. The lemma gives control on $u(s, y)$ for $y$ close to points in $Z(N, n, K, \beta)$. To do the induction step we want to use this control in the estimate $|D(r, w)| \leq R_0 e^{R_1 |w|} |u(r, w)|^{\gamma}$ from (39), which appeared in $\delta > 0$, $0 \leq s \leq t$, $x \in \mathbb{R}^q$ and for $1 \leq l \leq q$, see (42) and Lemma 3.1. This is related to the derivative of $u_{1,\delta}$ as given in Lemma 3.1. Using the bound from Lemma 5.3 will lead to an improved bound on $u_{1,\delta}$. Later, we will also give estimates for $u_{2,\delta}$, and the combination of the two bounds allows us to do the induction step at the end of the section.
To estimate F δ,l we use the following decomposition for s ≤ t ≤ t ′ , s ′ ≤ t ′ : All of these three expressions in the moduli are martingales in the upper integral bound, when the rest of the values x, x ′ , t, t ′ , (s ∧ s ′ − δ) + stay fixed. We want to consider the quadratic variations of these martingales and use the Dubins-Schwarz theorem. In order to calculate the first two quadratic variations we need to introduce the following partition of R q (for fixed values of x, x ′ , η 0 ): whenever 0 ≤ r < t. For estimating (63), we now introduce the following square functions for i, j ∈ {1, 2}:

Now we want to establish an upper bound for
when s, t, x, s ′ , t ′ , x ′ are subject to some restrictions. Then (64) is clearly an upper bound itself for the quadratic variation of each of the three martingales in (63).
Remark 5.4. Since we would carry out the same calculations for any coordinate direction $l$, we restrict ourselves now to $l = 1$ for the estimates on $F_{\delta,l}$. We already omitted this dependence in the definitions leading up to (64). Also, note that the dependence of constants on the universal constants $\alpha$, $q$, $\gamma$, $R_0$ and $R_1$ will not be mentioned in the following lemmas.
With the help of Lemma 4.5 for t = t ′ ≤ K bound this by where we used Lemma 4.1. The proof for the temporal estimate is similar but we omit it here.
Next we need to consider the distances for the cases i = j = 1.
Proof. First, we estimate Q X . Let ξ = 1−(8R) −1 ∈ (15/16, 1) and set N 5.6 = N 1 (m, n, ξ, ε 0 , K, β). W.l.o.g. s > δ and therefore we always have d((r, w), (t, x)) ∧ d((r, z), (t, x)) ≥ √ a n in the integral. An application of Lemma 5.3 and the bound on |w − x|, |z − x| respectively gives Then, Lemma 4.3 allows the following bound Using we can bound the above by We start with an estimate on I 1 . If r ≤ s − δ and t − r ≥d 2 N then Use that to obtain We want to drop the minimum with 1 to consider It holds that for p ∈ (−1, 1 Here, the first inequality follows since for p = 0 the left hand side equals 1 |p| |a p − b p |, which can be bounded using 1 − x ≤ − log x, x ≥ 0. For p = 0 we even have equality. The second inequality follows by distinguishing cases for p negative, positive, and zero, and by noting that K ≥ 1. Hence, we have The log-term is bounded by C(K, R)|x − x ′ | −η1/2 (use Lemma 4.1). Moreover, by Lemma 4.1(c) in [MP11] we bound Therefore, To finish the proof for Q X we replace ξ = 1 − (8R) −1 by 1 and γ ′ = γ(1 − 2η 0 ) by γ at the cost of by some algebra using η 1 > 32η 0 ∨ R −1 . We will not give the proof for Q T as it is quite similar except that some exponents change slightly.
Notation: Let us now introduce
We note that in this definition and in what follows, $\lambda \in [0, 1]$ replaces the analogous $\alpha$ of [MP11].
Proof. We do the proof for $l = 1$ only, see Remark 5.4. Let $R = 33\eta_1^{-1}$, $\eta_0 \in (R^{-1}, \eta_1/32)$ and first consider only the case $t \le t'$. Set
By Corollary 5.8, for $(t, x) \in Z(N, n, K, \beta)$ and $N \ge N_{5.8}$ it holds that
i.e. $N_4 = N_4(a_n, \varepsilon_0, N_{5.8}, c_{78})$ and hence $N_3 = N_3(n, \varepsilon_0, N_{5.8}, K, \eta_1)$, which is stochastically bounded uniformly in $(n, \lambda, \beta)$. Let $N' \in \mathbb{N}$ be such that $\bar d \le 2^{-N'}$, which implies $\bar d^{1-\eta_1/2} \le 2^{-N'\eta_1/4}\,\bar d^{1-3\eta_1/4}$. Then it is true that on the event
Recalling the decomposition of $F_{\delta,1}$ in (63) into the sum of three martingales and applying the Dubins-Schwarz theorem, we can write
as long as
where we used the reflection principle in the next to last inequality. Next apply a lemma similar to the Kolmogorov-Centsov estimate, Lemma 5.7 in [MP11], which is used in the proof of Proposition 5.8 in [MP11]. For details we refer to the proof in Section 9.4 of [Rip12]. Then, we obtain for a certain $N_{5.9}$, which is bounded uniformly in $n, \lambda, \beta$: For $N \ge N_{5.9}$ and $(t,$
Thus,
However, if $t' \le t$, then $(t', x') \in Z(N - 1, n, K + 1, \beta)$, and interchanging $(s, t, x)$ with $(s', t', x')$ gives the same estimate as (78), so that $Q^{\mathrm{tot}}_{a_n^\lambda}(s', t', x', s, t, x)$ is bounded by 4 times the right hand side of (78). Proceeding as in the case $t \le t'$ we end up with (81) replaced by $|F_{a_n^\lambda,1}(s', t', x') - F_{a_n^\lambda,1}(s, t, x)| \le 2^{-86} q^{-4}\,\bar d^{1-\eta_1}\,\bar\Delta_{u_1'}(m, n, \lambda, \varepsilon_0, 2^{-N})$. This completes the proof for the first coordinate. Clearly, the constants $c_{5.9}$ and $N_{5.9}$ can be chosen such that the result holds uniformly for all coordinates $1 \le l \le q$.
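The Dubins-Schwarz and reflection-principle step follows a standard pattern, sketched below for a generic continuous martingale. The names $M$, $B$, $c$, $x$ and $T$ are placeholders introduced only for this sketch and are not the quantities of the proof; the first line just spells out the exponent splitting used for $\bar d$.

```latex
% (i) Exponent splitting, using \bar d \le 2^{-N'}:
\[
\bar d^{\,1-\eta_1/2}
  = \bar d^{\,\eta_1/4}\,\bar d^{\,1-3\eta_1/4}
  \le 2^{-N'\eta_1/4}\,\bar d^{\,1-3\eta_1/4}.
\]
% (ii) Generic tail bound (assumptions: M is a continuous local martingale with M_0 = 0,
% B is a Brownian motion with M_t = B_{\langle M\rangle_t}, possibly on an enlarged
% probability space; c > 0 and x > 0 are deterministic).
\begin{align*}
P\Bigl(\sup_{t \le T} |M_t| \ge x,\ \langle M\rangle_T \le c\Bigr)
  &\le P\Bigl(\sup_{u \le c} |B_u| \ge x\Bigr) \\
  &\le 2\,P\Bigl(\sup_{u \le c} B_u \ge x\Bigr) \\
  &= 4\,P(B_c \ge x) && \text{(reflection principle)} \\
  &\le 2\exp\Bigl(-\frac{x^2}{2c}\Bigr).
\end{align*}
% The last step uses the Gaussian tail bound P(Z \ge z) \le \tfrac12 e^{-z^2/2}, z \ge 0.
```

The point of such a bound is that a deterministic bound on the quadratic variation on a good event translates into a Gaussian tail for the increment, which can then be summed over a dyadic grid.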
Recalling the definition of $J_{n,i}$, however, we only "know" the range of the gradients of $u_{1,\delta}$ for $\delta = a_n$. It will be helpful to find a result relating this range to the gradients of $u_{1,\delta}$ for $\delta = a_n^\lambda$. The definition of $F_{\delta,l}$ allows us to relate these two gradients, since for $\delta \ge a_n$ and $s = t - \delta + a_n$,
$\partial_{x_l} u_{1,\delta}(t, x) = \partial_{x_l} P_\delta(u_{(t-\delta)^+})(x) = \partial_{x_l} P_{t-s+a_n}(u_{(s-a_n)^+})(x) = -F_{a_n,l}(s, t, x) = -F_{a_n,l}(t - \delta + a_n, t, x).$ (82)
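A quick check of the index bookkeeping behind (82), written out as a sketch; it only uses the substitution $s = t - \delta + a_n$ stated above.

```latex
% With s = t - \delta + a_n:
\begin{align*}
t - s + a_n &= t - (t - \delta + a_n) + a_n = \delta, \\
s - a_n     &= (t - \delta + a_n) - a_n = t - \delta,
\end{align*}
% so that P_{t-s+a_n}\bigl(u_{(s-a_n)^+}\bigr) = P_{\delta}\bigl(u_{(t-\delta)^+}\bigr).
% The constraint \delta \ge a_n gives s = t - (\delta - a_n) \le t.
```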
Note that the last equalities hold for any $t, \delta, a_n \ge 0$, and they are trivial if $t - \delta \le 0$. So we need to relate $F_{a_n,l}(t - a_n^\lambda + a_n, t, x)$ and $F_{a_n,l}(t, t, x)$. We can show a lemma on the square function $Q_{T,a_n}(s, t, t, x)$ using Lemmas 4.2 and 4.4 and ideas from the proof of Lemma 5.6. This is then transferred to the following proposition using the same techniques as in the proof of Proposition 5.9. For details we refer to Lemma 5.9 in [MP11] or to [Rip12].
The proof of this result is similar to, and even easier than, the proof of Proposition 5.9 and is omitted here; details can be found in Section 9.8 of [Rip12].
We leave out the proof since it is essentially the same as the proof of Proposition 5.13 in [MP11], but present the key idea. By definition
Then use Corollary 5.10 for $T_1$ and $T_2$ and Proposition 5.12 for $T_3$ to get the result.
We would also like to obtain a similar result for $u_{2,\delta}$. We omit its proof, which is simpler than the previous calculations. In the statement of the result we use the following abbreviations:
Proposition 5.14. Let $0 \le m \le \bar m + 1$ and assume that $(P_m)$ holds. For all $n \in \mathbb{N}$, there is an $N_{5.14} = N_{5.14}(m, n, \eta_1, \varepsilon_0, K, \lambda, \beta)(\omega) \in \mathbb{N}$ almost surely such that for all $N \ge N_{5.14}$, $(t, x) \in Z(N, n, K, \beta)$, $t' \le T_K$:
Moreover, $N_{5.14}$ is stochastically bounded uniformly in $(n, \lambda, \beta)$.
Clearly, $\sqrt{a_n}\,2^{-N(1-\xi)} \le \sqrt{a_n}/2 \le a_n^\beta/2$, and by (85) and an easy calculation (see Lemma 5.15 in [MP11]) we arrive at
Therefore, we can write
This completes the proof of Proposition 5.2.
We define four collections of random times, the first one being
x 0 )| ≤ a n ∧ ( √ a n ε), |∇u 1,an (t,x 0 )| ≤ a β n , and |∇u 1,a λ n (t, x) − ∇u 1,a λ n (t,
whenever $M, n \in \mathbb{N}$, $\beta > 0$. We define $U^{(1)}_{M,n,0}$ in the same way, omitting the condition on $|\nabla u_{1,a_n}(t, \hat x_0)|$. These random times are actually stopping times by Theorem IV.T.52 of [Mey66].
Remark 6.2. We note that the fact that we consider splitting at $\delta = a_n^\lambda$ rather than $\delta = a_n$ is essential in the previous lemma.
In the case $\beta = 0$ we make the analogous definition without the condition on $|\nabla u_{1,a_n}(t, \hat x_0)|$. Again those are stopping times, and we obtain the analogous statement:
Lemma 6.4. For all $n \in \mathbb{N}$, $\beta$ as in (45) it holds that U
The proof of this lemma requires Proposition 5.11 and is structurally the same as that of Lemma 6.1. As the fourth collection of stopping times define
Lemma 6.5. Almost surely U
By Lemmas 6.1, 6.3, 6.4 and 6.5, $U_{M,n}$ fulfills $(H_1)$. Hence there is not much left to do in order to complete the proof of Proposition 3.2. It just remains to show the compactness of $\tilde J_{n,i}(s)$ and the inclusion $\tilde J_{n,i}(s) \supset J_{n,i}(s)$ for all $s < U_{M,n}$. We will be mostly concerned with $\tilde J_{n,i}(s) \supset J_{n,i}(s)$, which we show in several steps. We assume (47) throughout the rest of the section, i.e. $a_n^{\varepsilon_1} \le 2^{-M-4}$ and $\sqrt{a_n} \ge 2^{-a_n^{-\varepsilon_0\varepsilon_1/4}}$.
We first give a list of three lemmas that are analogous to Lemmas 6.5, 6.6 and 6.7 of [MP11]. As the proofs are quite similar we only show the last lemma since it also contains a slight improvement of Lemma 6.7 of [MP11].

The proof uses $U^{(1)}_{M,n,\beta}$, but is left out. To finish things we only need a similar result for the $u_2$ expressions, for which we give the details of the proof:
Lemma 6.8. When $i \in \{0, \dots, L\}$, $0 \le s < U_{M,n}$, $x \in J_{n,i}(s)$, $x', x'' \in \mathbb{R}^q$ and $|x - x'| \le 4\sqrt{a_n}$, then $|u_{2,a_n^{\lambda_i}}(s,$
as long as $|x' - x''| \le \bar l_n(\beta_i)$.
Assume that $|x' - x''| \le \bar l_n(\beta_i)$ $(\le 2^{-M})$. Since $s < U^{(2)}_{M,n,\beta_i}$, it holds that
We will now show the following claim:
Lemma 6.10. If $0 \le s < U_{M,n}$ and $x \in J_{n,0}(s)$ then
and
This statement has just the same proof as Lemma 6.8 in [MP11], so we omit it. We are finally going to complete the
Proof of Proposition 3.2. The compactness of $\tilde J_{n,i}(s)$ follows from the continuity of all the functions involved, and the inclusion $J_{n,i}(s) \subset \tilde J_{n,i}(s)$ follows from Lemmas 6.7, 6.8 and 6.10.
Next, combine that lemma with Lemma 5.1 in [MPS06]:
Proof of Lemma 4.3. There are two estimates to make, one for each part of the $\wedge$. First, let us consider the left part. Expanding the product in the integral gives
Note that by a change of variables (and $|w| = |-w|$) the last two lines coincide. The same is true for the first two lines except that $t$ and $t'$ differ. Thus, expression (101) is equal to
$\iint |p_{t,l}(w)\,p_{t,l}(z)|\,(|w - z|^{-\alpha} + 1)\,dw\,dz$
$+ \iint |p_{t',l}(w)\,p_{t',l}(z)|\,(|w - z|^{-\alpha} + 1)\,dw\,dz$
$+ 2\iint |p_{t,l}(w - (x - x'))\,p_{t',l}(z)|\,(|w - z|^{-\alpha} + 1)\,dw\,dz.$ (102)
For the first line of (102) we write, using $|w_l| \le |w|$ and (55),
by an application of Lemma 5.1 in [MPS06] and the fact that $t \le t'$. For the second line (with $t'$) we can do exactly the same and obtain the same bound even with $t$ instead of $t'$, since $t \le t'$.
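As a heuristic cross-check of the power of $t$ one expects from the singular part of the kernel, the following scaling computation may help. It assumes that $p_t(w) = (2\pi t)^{-q/2} e^{-|w|^2/(2t)}$ is the Gaussian kernel and that $p_{t,l} = \partial_{w_l} p_t$; both are assumptions about notation made for this sketch, and the proof itself relies on Lemma 5.1 of [MPS06] rather than on this argument.

```latex
% Substitute w = \sqrt{t}\,u and z = \sqrt{t}\,v. Under the assumptions above,
%   p_{t,l}(\sqrt{t}\,u) = t^{-(q+1)/2}\,p_{1,l}(u), \qquad dw\,dz = t^{q}\,du\,dv,
%   |w - z|^{-\alpha} = t^{-\alpha/2}\,|u - v|^{-\alpha}.
\begin{align*}
\iint |p_{t,l}(w)\,p_{t,l}(z)|\,|w - z|^{-\alpha}\,dw\,dz
  &= t^{-(q+1)}\,t^{-\alpha/2}\,t^{q}
     \iint |p_{1,l}(u)\,p_{1,l}(v)|\,|u - v|^{-\alpha}\,du\,dv \\
  &= C\,t^{-1-\alpha/2},
\end{align*}
% where C < \infty: since \alpha < q, the kernel |u - v|^{-\alpha} is locally integrable,
% and p_{1,l} is bounded with Gaussian decay.
```

Each spatial derivative of the heat kernel contributes a factor $t^{-1/2}$ and the singular kernel contributes $t^{-\alpha/2}$, which is the exponent pattern visible in this sketch.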
In order to prepare Lemma 4.5 we give the following proof.
Next apply Lemma 5.1 (b) of [MPS06] if $r_3 > 0$ and their Lemma 5.1 (a) if $r_3 = 0$, to get the first estimate. For the second estimate note that by (55) and $|x| \le \sqrt{q}\,K$,
$\le C(K, R)\,p_{2t}(x - w)(t^{r_1/2} + 1)$
so that we obtain the result by the first part.