Schauder estimates for solutions of linear parabolic integro-differential equations

We prove optimal pointwise Schauder estimates in the spatial variables for solutions of linear parabolic integro-differential equations. Optimal Hölder estimates in space-time for those spatial derivatives are also obtained.


Introduction
Integro-differential equations appear naturally in the study of discontinuous stochastic processes. In a series of papers by Caffarelli and Silvestre [5,6,7], regularity results for solutions of fully nonlinear elliptic integro-differential equations, such as Hölder estimates, Cordes-Nirenberg type estimates and the Evans-Krylov theorem, were established. Regularity for parabolic integro-differential equations has also been studied, e.g., in [8,9,10,13,14,23] and many others. In this paper, we prove optimal pointwise Schauder estimates in the spatial variables for solutions of linear parabolic integro-differential equations. In general, one cannot expect any interior continuity of the derivative of local solutions in the time variable, even for the fractional heat equation $u_t + (-\Delta)^{\sigma/2} u = 0$, without extra assumptions; see Example 2.4.1 in [10].
We consider the linear parabolic integro-differential equation (1.1), where $Lu(x) := \int_{\mathbb{R}^n} \delta u(x, y; t) K(x, y; t)\, dy$ (1.2), $\delta u(x, y; t) = u(x + y, t) + u(x - y, t) - 2u(x, t)$ and $K(x, y; t)$ is a positive kernel. We restrict our attention to symmetric kernels, which satisfy $K(x, y; t) = K(x, -y; t)$ (1.3). This assumption is somewhat implicit in the expression (1.1). We also assume that the kernels are uniformly elliptic, $\frac{(2-\sigma)\lambda}{|y|^{n+\sigma}} \le K(x, y; t) \le \frac{(2-\sigma)\Lambda}{|y|^{n+\sigma}}$ (1.4), for some $\sigma \in (0, 2)$ and $0 < \lambda \le \Lambda < \infty$, which is an essential assumption leading to local regularization. Finally, we suppose that the kernels are $C^1$ away from the origin and satisfy $|\nabla_y K(x, y; t)| \le \frac{\Lambda}{|y|^{n+\sigma+1}}$ (1.5), and in certain cases we additionally assume that the kernels are $C^2$ away from the origin and satisfy (1.6). These smoothness assumptions are usually used to reduce the influence of the boundary data in the exterior domain, and one of the consequences is that solutions of translation invariant (or "constant coefficients") equations have high regularity. Moreover, the conditions (1.5) and (1.6) are scaling invariant, which will be used in our perturbative arguments. We say that a kernel $K \in \mathcal{L}_0(\lambda, \Lambda, \sigma)$ if $K$ satisfies (1.3) and (1.4), and $K \in \mathcal{L}_1(\lambda, \Lambda, \sigma)$ if $K$ satisfies (1.3), (1.4) and (1.5). If, in addition, $K$ satisfies (1.6), then we say that $K \in \mathcal{L}_2(\lambda, \Lambda, \sigma)$.
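The scaling invariance of the kernel bounds can be checked directly. The following is a sketch under an assumed parabolic rescaling; the symbols $r$, $u_r$ and $K_r$ are introduced here only for illustration and are not notation taken from the text above.

```latex
% Assumed rescaling for r > 0 (illustrative notation):
%   u_r(x,t) = u(rx, r^\sigma t),
%   K_r(x,y;t) = r^{n+\sigma} K(rx, ry; r^\sigma t).
\begin{align*}
\int_{\mathbb{R}^n} \delta u_r(x,y;t)\, K_r(x,y;t)\, dy
  &= r^{\sigma} \int_{\mathbb{R}^n} \delta u(rx, z; r^\sigma t)\,
     K(rx, z; r^\sigma t)\, dz
  \qquad (z = ry,\ dy = r^{-n}\,dz), \\
\partial_t u_r(x,t) &= r^{\sigma}\, (\partial_t u)(rx, r^\sigma t), \\
K_r(x,y;t) &\le r^{n+\sigma}\, \frac{(2-\sigma)\Lambda}{|ry|^{n+\sigma}}
  = \frac{(2-\sigma)\Lambda}{|y|^{n+\sigma}}, \\
|\nabla_y K_r(x,y;t)|
  &= r^{n+\sigma+1}\, \bigl|(\nabla_y K)(rx, ry; r^\sigma t)\bigr|
  \le \frac{r^{n+\sigma+1}\,\Lambda}{|ry|^{n+\sigma+1}}
  = \frac{\Lambda}{|y|^{n+\sigma+1}} .
\end{align*}
```

Under this assumed rescaling the time derivative and the nonlocal term pick up the same factor $r^\sigma$, and $K_r$ satisfies the upper bound in (1.4) and the bound (1.5) with the same constants; the lower bound in (1.4) follows in the same way.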
Various Schauder estimates for solutions of some linear elliptic nonlocal equations were obtained before in, e.g., [1,2,11,16], and global Schauder estimates for some linear parabolic nonlocal equations with non-symmetric kernels were obtained in [22] using probabilistic arguments, compared to which a feature of our estimate (1.9) in Theorem 1.1 is that the solution $u$ of (1.1) is In the case of second order parabolic equations, if the coefficients are of class $C^\alpha_x$ in $x$ and only measurable in the time variable, then for a solution $u$ of such equations, its second order spatial derivatives $\nabla_x^2 u$ are Hölder continuous in space-time. Such results and related ones can be found in, e.g., [3,12,15,17,20,21,25]. Similar optimal interior Hölder estimates in space-time for spatial derivatives of solutions of (1.1) will follow from Theorem 1.1; see Corollary 2.6 and Corollary 2.7 in Section 2.3. In Theorem 1.1, we require that $K$ and $f$ have regularity in $t$ at $t = 0$ as well, which is needed in our compactness arguments for weak limits of nonlocal parabolic operators.
One common difficulty in approximation arguments for the regularity of solutions of nonlocal equations is controlling the error of the tails at infinity, which results in a slight loss of regularity compared to second order equations, especially in the case when $\sigma + \alpha > 1$ with $\sigma < 1$, and in the case $\sigma + \alpha > 2$. In this paper, we approximate the genuine solution by solutions of "constant coefficients" equations instead of polynomials, an approach inspired by [4,19]. In this way, we do not need to take care of the tails at infinity, which leads to the optimal regularity. The only place where (1.5) or (1.6) is used is in obtaining higher regularity of solutions of the corresponding "constant coefficients" equations.
In the following section, we prove the optimal pointwise Schauder estimates (1.9). We first establish high regularity for solutions of translation invariant equations in Section 2.1, which is the only place where we require $K$ to be $C^2$ away from the origin, especially for $\sigma + \alpha \in (2, 3)$. In Section 2.2 we use perturbative arguments to prove Theorem 1.1. Section 2.3 is devoted to the Hölder estimates in space-time for those spatial derivatives. In the Appendix, we recall some definitions and notions of nonlocal operators from [6,10], and establish two approximation lemmas for our own purposes, which are variants of those in [6,10].
Tianling Jin was supported in part by NSF grant DMS-1362525. Jingang Xiong was supported in part by the First Class Postdoctoral Science Foundation of China (No. 2012M520002) and Beijing Municipal Commission of Education for the Supervisor of Excellent Doctoral Dissertation (20131002701).

Translation invariant equations
In this section, we first establish good regularity for solutions of translation invariant equations, which play the role of "constant coefficients" equations in the second order case.
, then there exists a positive constant $c_1$ depending only on $n, \lambda, \Lambda, \sigma_1$ such that and there exists another positive constant $c_2$ depending only on $n, \lambda, \Lambda, \sigma_1, \sigma_2$ such that This proposition will follow from the next lemma and standard integration by parts techniques.
and v k be the solution of the following translation invariant equation The existence and uniqueness of such v k is guaranteed by Theorem 3.3 in [9]. Then we have by and thus, by the maximum principle again, Let w k+1 = v k+1 − v k . It follows from the assumption estimate (2.3) that for x ∈ B ρ k+2 , where β = min(σ − 1,ᾱ), and C depends only on n, λ, Λ,ᾱ,c, ε 0 and σ 1 . Meanwhile, it follows from the assumption estimate (2.3) that This finishes the proof.
Proof of Proposition 2.1. First of all, we know from Theorem 6.2 in [9] that $\nabla_x v$ is locally Hölder continuous in space-time. We will use integration by parts techniques, which can be found in [5]. Let $\eta_1$ be a smooth cut-off function supported in $B_7$ with $\eta_1 \equiv 1$ in $B_6$. Let $w_1 = \nabla_x(\eta_1 v)$. Then it satisfies in the viscosity sense that This proves the case of $i = 1$.
, then there exists a positive constant $c_1$ depending only on $n, \lambda, \Lambda, \sigma_0$ such that When $\sigma \le 1 - \varepsilon_0$ for some $\varepsilon_0 > 0$, there exists a positive constant $c_2$ depending only on $n, \lambda, \Lambda, \sigma_0, \varepsilon_0$ such that When $\sigma = 1$, then for all $\beta \in (0, 1)$ there exists a positive constant $c_3$ depending only on $n, \lambda, \Lambda, \sigma_0, \beta$ such that The proofs of Lemma 2.3 and Proposition 2.4 are very similar to those of Lemma 2.2 and Proposition 2.1, respectively, and we leave them to the reader.

Proof of the main theorem
Now we are in a position to prove Theorem 1.1 by approximation.
Proof of Theorem 1.1. The strategy of the proof is to find a sequence of approximate solutions which are sufficiently regular, such that the error between the genuine solution and the approximate solutions can be controlled at a desired rate.
We may assume that We claim that we can inductively find a sequence of functions $w_i$, $i = 0, 1, 2, \cdots$, such that for all $i$, and and and where $\tau$ is an arbitrary constant in $(0, 1)$, $c_1 > 0$ and $\alpha_1 \in (0, 1)$ are constants depending only on $\lambda, \Lambda, n, \sigma_0$, and $c_2 > 0$ additionally depends on $\alpha$. Then, Theorem 1.1 follows from this claim and standard arguments. Indeed, as in (2.5), we have, for Note that we used $|\sigma + \alpha - j| \ge \varepsilon_0$ for $j = 1, 2, 3$ in obtaining $C_1, C_2, C_3$, which actually blow up at a rate of $O(|\sigma + \alpha - j|^{-1})$ as $\sigma + \alpha \to j \in \{1, 2, 3\}$. The estimate (1.9) is proved using the claim. Now we are left to prove this claim. Before we provide the detailed proof, we would like to first mention the idea and the structure of (2.8)-(2.12):
• Solving (2.8) and (2.9) inductively is how we construct this sequence of functions $\{w_i\}$.
• (2.10) will follow from the approximation lemmas in the appendix, where (2.12) will be used.
• (2.11) will follow from (2.10), maximum principles and the estimates in Proposition 2.1 and Proposition 2.4.
The proof of the above claim is by induction, and it consists of three steps.
Let $w_0$ be the viscosity solution of We also think of $w_0 \equiv u$ in $\mathbb{R}^n \times (-5^\sigma, -4^\sigma)$. Then by comparison principles, where $c_0$ is a positive constant depending only on $n, \lambda, \Lambda, \sigma_0$. By normalization, we may assume that For some universal small positive constant $\gamma < 1$, which will be chosen in (2.23), we may also assume that |f ( . This can be achieved by scaling: for $r < 1$ small, if we let $\tilde K$ then we see that $\tilde u$ (2.16)
Step 2: Prove the claim for $i = 0$.
Let $w_0$ be the one in Step 1. It follows from Proposition 2.1 and Proposition 2.4 that there exists a positive constant $c_2$ depending only on $\lambda, \Lambda, n, \sigma_0, \alpha$ such that (2.17) For $\tau \in (0, 1]$, it follows from Theorem 6.1 in [9] (see [5] for the elliptic case) and standard scaling and covering arguments (contributing at most a factor of $4/\tau$) that there exist constants $\alpha_1 \in (0, 1)$, $c_1 > 0$, depending only on $n, \lambda, \Lambda, \sigma_0$, such that Let us set up to apply the first approximation lemma in the Appendix, Lemma A.1. Let $\varepsilon = 5^{-(\sigma+\alpha)}$ and $M_1 = 1$, and let us fix a modulus of continuity $\rho(s) = s^{\alpha_1}$. Then for these $\rho$, $\varepsilon$, $M_1$, there exist $\eta_1$ (small) and $R$ (large) so that Lemma A.1 holds. We can rescale the equation of $u$ so that it holds in a very large cylinder containing The latter can be done due to (2.18). We will choose $\gamma < \eta_1/50$ in (2.23). Then we can conclude from Lemma A.1 that and thus, This proves that (2.8), (2.9), (2.10) and (2.11) hold for $i = 0$. Moreover, in the viscosity sense. Indeed, let $t_0 \in (0, 1)$; we smooth $w_0$ by using a mollifier $\eta_\varepsilon(x, t)$, and let $g_\varepsilon = \eta_\varepsilon * w_0$ (thinking of It follows from Theorem 4.1 in [10] that $\partial_t v_\varepsilon$ is Hölder continuous in space-time. Thus, (2.20) Meanwhile, by the interior Hölder estimates, $w_0^\varepsilon$ converges locally uniformly to some continuous function $w$. By the stability result Theorem 5.3 in [10], $w$ is a viscosity solution of Hence $w \equiv w_0$. Thus, by sending $\varepsilon \to 0$ and $t_0 \to 0$ with a standard perturbation argument (using $\nu/t$ for small $\nu$), (2.19) holds in $B_{4-2\tau} \times (-4^\sigma + \tau^\sigma, 0]$ in the viscosity sense. By the choice of $\gamma$ in (2.23), It follows from the Hölder estimates (2.18) proved in [9] and standard rescaling and covering arguments (contributing at most a factor of $4/\tau$) that This finishes the proof of (2.12) for $i = 0$.
where we used (2.23) in the second inequality. Thus, it follows from (2.18) and (2.24) that Thus, (2.12) holds for $i + 1$ as well. This finishes the proof of the claim.
A corollary of Theorem 1.1 is the Schauder estimate for elliptic equations. If we consider the linear elliptic integro-differential equation for all $r \in (0, 1]$, $x \in B_5$, and for some positive constant $M_f$.
The constant $C$ in (2.29) does not blow up as $\sigma \to 2$, but it does blow up as $\sigma + \alpha$ approaches an integer.
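The blow-up as $\sigma + \alpha$ approaches an integer is the usual behaviour of a geometric series whose ratio tends to $1$. The following sketch assumes, purely for illustration, that the relevant constants arise from summing terms of size $5^{-is}$ with $s = |\sigma + \alpha - j|$; the symbol $s$ and this schematic form are assumptions, not formulas taken from the text.

```latex
% Illustrative assumption: s = |\sigma + \alpha - j| \in (0, 1].
\begin{align*}
\sum_{i=0}^{\infty} 5^{-is} &= \frac{1}{1 - 5^{-s}}, \\
1 - 5^{-s} &\ge \frac{4}{5}\, s \quad \text{for } s \in [0,1]
  \qquad (\text{concavity of } s \mapsto 1 - 5^{-s}), \\
\frac{1}{1 - 5^{-s}} &\le \frac{5}{4s}, \qquad
\frac{1}{1 - 5^{-s}} \sim \frac{1}{s \ln 5} \quad \text{as } s \to 0^{+} .
\end{align*}
```

Under this assumption the sum is $O(s^{-1})$, which is consistent with the rate $O(|\sigma + \alpha - j|^{-1})$ noted in the proof of Theorem 1.1.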

Hölder estimates in space-time for spatial derivatives
Another corollary of the pointwise Schauder estimate (1.9) is the following uniform (in t) interior Schauder estimates in spatial variables.
By using (2.17) and the equation (2.13) itself, this estimate is clear if we consider it as an a priori estimate.
Proof of Lemma 2.8. Let $t_0 \in (0, 4^\sigma)$. To proceed, we smooth $v$ by using a mollifier $\eta_\varepsilon(x, t)$, and let $g_\varepsilon = \eta_\varepsilon * v$. Let $v_\varepsilon$ be the solution of It follows from Theorem 4.1 in [10] that $\partial_t v_\varepsilon$ is Hölder continuous in space-time. By Proposition 2.1 and Proposition 2.4, we know that $v_\varepsilon$ is $C^2$ in $x$. Thus, $v_\varepsilon$ satisfies its equation in the classical sense. By the equation of $v_\varepsilon$, where the estimates in Proposition 2.1 and Proposition 2.4 are used in the third inequality. Meanwhile, by the interior Hölder estimates, $v_\varepsilon$ converges locally uniformly to some continuous function $w$. By the stability result Theorem 5.3 in [10], $w$ is a viscosity solution of Hence $w \equiv v$, and thus, we have that We finish the proof by sending $t_0 \to 0$.
Remark 2.9. Indeed, by similar arguments and the integration by parts technique used in the proof of Proposition 2.1, one can also show that $\nabla_x v$ is Lipschitz in time, as well as $\nabla_x^2 v$ if $K \in \mathcal{L}_3(\lambda, \Lambda, \sigma, \alpha)$ and $\sigma > 1$ (so that we have estimates for $\nabla_x^4 v$). We omit the proof here.
Proof of Corollary 2.7. If we let $w_l$ be as in the proof of Theorem 1.1, then by Lemma 2.8 and Remark 2.9, $w_l$ and $\nabla_x w_l$ are Lipschitz in time, as well as $\nabla_x^2 w_l$ provided that $K \in \mathcal{L}_3(\lambda, \Lambda, \sigma, \alpha)$. By Corollary 2.6, we may assume that which proves (2.33).
Suppose that $1 < \sigma + \alpha < 2$. We have Since $\nabla_x u(0, 0) = \sum_{l=0}^{\infty} \nabla_x w_l(0, 0)$, we have By the equation of $w_l$, Lemma 2.8, Remark 2.9 and (2.11), we have Meanwhile, it follows from the estimate (1.9) that For $5^{-(i+1)} \le |x| < 5^{-i}$, we have, by the triangle inequality, Thus Hence, we have shown that
Suppose that $2 < \sigma + \alpha < 3$. We have Meanwhile, it follows from the estimate (1.9) that By the triangle inequality and the estimate for $I_1$, we have, for Thus By the equation of $w_l$, Lemma 2.8, Remark 2.9 and (2.11), we have Thus, by combining the estimates for $II_1$, $II_2$, $II_3$, we have that This completes the proof of Corollary 2.7.
If we do not assume $K \in \mathcal{L}_3(\lambda, \Lambda, \sigma, \alpha)$ when $\sigma + \alpha > 2$, we have that $\nabla_x^2 u$ is of class $C^\beta$ in the time variable for some $\beta > 0$. This is because $\nabla_x^2 u$ is Hölder continuous in $x$ and $\nabla_x u$ is Hölder continuous in $t$, which implies that $\nabla_x^2 u$ is Hölder continuous in $t$ as well; see Lemma 3.1 on page 78 in [18].
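The interpolation step behind this remark can be sketched as follows. This is a standard difference-quotient argument given under stated assumptions, not a reproduction of the proof in [18]; the function $g$ (standing for a component of $\nabla_x u$), the constants $A$, $B$, the exponents $a, b \in (0,1)$, the unit vector $e$ and the step $h$ are all illustrative names.

```latex
% Assumptions (illustrative): for x, y in a ball and t, s in a time interval,
%   |\nabla_x g(x,t) - \nabla_x g(y,t)| \le A |x-y|^{a},
%   |g(x,t) - g(x,s)| \le B |t-s|^{b},
% and h > 0 is small enough that x + he stays in the ball.
\begin{align*}
\Bigl| \partial_e g(x,t) - \frac{g(x+he,t) - g(x,t)}{h} \Bigr|
  &= \Bigl| \int_0^1 \bigl( \partial_e g(x,t)
      - \partial_e g(x+\theta h e,t) \bigr)\, d\theta \Bigr|
  \le A h^{a}, \\
|\partial_e g(x,t) - \partial_e g(x,s)|
  &\le 2A h^{a} + \frac{2B\, |t-s|^{b}}{h}, \\
h = |t-s|^{\frac{b}{1+a}} \ \Longrightarrow\
|\partial_e g(x,t) - \partial_e g(x,s)|
  &\le 2(A+B)\, |t-s|^{\frac{ab}{1+a}} .
\end{align*}
```

With this choice of $h$ both terms carry the same power of $|t-s|$, so under these assumptions one obtains Hölder continuity in time with exponent $ab/(1+a)$ for $|t-s|$ small.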

A Approximation lemmas
Our proof of the Schauder estimates uses perturbative arguments, and we need the following two approximation lemmas, which are variants of Theorem 5.6 in [10] (Lemma 7 in [6] in the elliptic case). We make a few modifications for our own purposes, and we include the lemmas in this appendix for completeness and convenience. For our particular linear equations alone, these approximation lemmas could be simplified considerably, but we would like to include nonlinear equations as well in this step.
Given such a nonlocal operator $I$ defined on $\Omega \times (-T, 0]$, a norm $\|I\|$ was defined in Definition 5.3 in [10]. Here, we also define a (weaker) norm $\|I\|_*$ for our own purposes, and $\|I\|_* = \sup_{t \in (-T, 0]} \|I(t)\|_*$. We say that a nonlocal operator $I$ is uniformly elliptic with respect to $\mathcal{L}_0(\lambda, \Lambda, \sigma)$, which will be written as $\mathcal{L}_0(\sigma)$ for short, if $\frac{\Lambda\, \delta v(x, y; t)^{+} - \lambda\, \delta v(x, y; t)^{-}}{|y|^{n+\sigma}}\, dy$.
It is also convenient to define the limit operators when $\sigma \to 2$ as It has been explained in [6] that $M^+_{\mathcal{L}_0(2)}$ is a second order uniformly elliptic operator, whose ellipticity constants $\tilde\lambda$ and $\tilde\Lambda$ depend only on $\lambda$, $\Lambda$ and the dimension $n$. Moreover, is the second order Pucci operator with ellipticity constants $\tilde\lambda$ and $\tilde\Lambda$. Similarly, we also have corresponding relations for $M^-_{\mathcal{L}_0(2)}$. For compactness arguments, we shall use the concept of weak convergence of nonlocal operators, which can be found in Definition 5.1 in [10] (Definition 41 in [6] in the elliptic case).
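To see why the normalizing factor $(2-\sigma)$ in (1.4) yields a second order operator in the limit $\sigma \to 2$, one can test on a quadratic function. The following is a sketch for the model kernel $K(y) = (2-\sigma)|y|^{-n-\sigma}$ (that is, $\lambda = \Lambda = 1$); the symmetric matrix $A$, the radius $R$ and the test function $v$ are illustrative choices and not notation from [6].

```latex
% Illustrative test function: v(x) = \tfrac12 x^{T} A x near x, with A symmetric,
% so that \delta v(x,y) = y^{T} A y for |y| < R; v is assumed bounded on R^n.
\begin{align*}
(2-\sigma) \int_{B_R} \frac{y^{T} A y}{|y|^{n+\sigma}}\, dy
  &= (2-\sigma) \int_0^R r^{1-\sigma}\, dr
     \int_{S^{n-1}} \theta^{T} A \theta \, d\theta
   = R^{2-\sigma}\, \frac{|S^{n-1}|}{n}\, \operatorname{tr}(A), \\
(2-\sigma) \int_{\mathbb{R}^n \setminus B_R} \frac{dy}{|y|^{n+\sigma}}
  &= (2-\sigma)\, \frac{|S^{n-1}|}{\sigma}\, R^{-\sigma}
  \longrightarrow 0 \quad \text{as } \sigma \to 2 .
\end{align*}
```

For fixed $R$ the local part converges to $\frac{|S^{n-1}|}{n} \operatorname{tr}(A)$, a multiple of the Laplacian of $v$, while the tail contribution (bounded by $4\|v\|_{L^\infty}$ times the second integral) vanishes. This is consistent with the constants in the estimates staying bounded as $\sigma \to 2$.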
Lemma A.1. For some $\sigma \ge \sigma_0 > 0$ we consider nonlocal continuous operators $I_0$, $I_1$ and $I_2$ uniformly elliptic with respect to $\mathcal{L}_0(\sigma)$. Assume also that $I_0$ is translation invariant and $I_0 0 = 1$.
Given $M_1 > 0$, a modulus of continuity $\rho$ and $\varepsilon > 0$, there exist $\eta_1$ (small, independent of $\sigma$) and $R$ (large, independent of $\sigma$) so that if $u$, $v$, $I_0$, $I_1$ and $I_2$ satisfy in the viscosity sense, and
Proof. It follows from the proof of Theorem 5.6 in [10] with modifications. But since the choice of norms is different, we include the proof for completeness. We argue by contradiction. Suppose the above lemma were false. Then there would be sequences $\sigma_k$, I 2], $\eta_k \to 0$ and all the assumptions of the lemma are valid, but sup is a sequence of uniformly elliptic translation invariant operators with respect to $\mathcal{L}_0(\sigma_k)$, by Theorem 5.5 in [10] (and its proof) we can take a subsequence, still denoted by $I_0^{(k)}$, that converges weakly to some nonlocal operator $I_0$, and $I_0$ is also translation invariant and uniformly elliptic with respect to the class $\mathcal{L}_0(\sigma)$.
It follows from the boundary regularity Theorem 3.2 in [10] that $u_k$ and $v_k$ have a modulus of continuity, uniform in $k$, in $B_1 \times [-1, 0]$. Thus, $u_k$ and $v_k$ have a uniform (in $k$) modulus of continuity on $B_{R_k} \times [-1, 0]$ with $R_k \to \infty$. We have subsequences of $\{u_k\}$ and $\{v_k\}$, still denoted by $\{u_k\}$ and $\{v_k\}$, which converge locally uniformly in $\mathbb{R}^n \times [-1, 0]$ to $u$ and $v$, respectively, as well as in $C(-1, 0, L^1(\mathbb{R}^n, \omega))$ by the dominated convergence theorem. Moreover, In the following, we are going to show in the viscosity sense that The second equality of (A.3) follows from Theorem 5.3 in [10]. To prove the first equality of (A.3), let $p$ be a second order parabolic polynomial touching $u$ from below at a point $(x, t)$. Since $u_k$ converges uniformly to $u$ in $B_1 \times [-1, 0]$, for large $k$, we can find $(x_k, t_k) \in B_r(x) \times (t - r, t]$ and $d_k$ so that $p + d_k$ touches $u_k$ at $(x_k, t_k)$. Furthermore, $(x_k, t_k) \to (x, t)$ and $d_k \to 0$ as $k \to \infty$.
Lemma A.2. For some $\sigma \ge \sigma_0 > 0$ we consider nonlocal continuous operators $I_0$, $I_1$ and $I_2$ uniformly elliptic with respect to $\mathcal{L}_0(\sigma)$. Assume also that $I_0$ is translation invariant and $I_0 0 = 0$.
Since $I_0^{(k)}$ is a sequence of uniformly elliptic operators, we can take a subsequence, still denoted by $I_0^{(k)}$, that converges weakly to some nonlocal operator $I_0$, and $I_0$ is also translation invariant and elliptic with respect to the class $\mathcal{L}_0(\sigma)$.
Since $v_k$ is bounded and has a fixed modulus of continuity on $((B_{3/2} \setminus B_1) \times [-1, 0]) \cup (B_1 \times \{t = -1\})$, by Theorem 3.2 in [10] there is another modulus of continuity that extends to It follows from the proof of (A.3) that $u$ and $v$ solve the same equation $u_t - I_0(u, x, t) = 0 = v_t - I_0(v, x, t)$ in $B_1 \times (-1, 0]$. Then $u = v$, which is a contradiction.