On a class of hypoelliptic operators with unbounded coefficients in ${\mathbb R}^N$

We consider a class of non-trivial perturbations ${\mathscr A}$ of the degenerate Ornstein-Uhlenbeck operator in ${\mathbb R}^N$. In fact, we perturb both the diffusion and the drift part of the operator (say $Q$ and $B$), allowing the diffusion part to be unbounded in ${\mathbb R}^N$. Assuming that the kernel of the matrix $Q(x)$ is independent of $x\in {\mathbb R}^N$, that the Kalman rank condition is satisfied at any $x\in{\mathbb R}^N$ by the same $m<N$, and developing a revised version of Bernstein's method, we prove that a semigroup $\{T(t)\}$ of bounded operators (in the space of bounded and continuous functions) can be associated with the operator ${\mathscr A}$. Moreover, we provide several uniform estimates for the spatial derivatives of the semigroup $\{T(t)\}$, both in isotropic and anisotropic spaces of (H\"older-) continuous functions. Finally, we prove Schauder estimates for some elliptic and parabolic problems associated with the operator ${\mathscr A}$.


Introduction
In the last decades the interest in elliptic and parabolic operators with unbounded coefficients in unbounded domains has grown considerably, due to their applications to stochastic analysis and mathematical finance.
The literature on uniformly elliptic operators with unbounded coefficients in $\mathbb{R}^N$ is nowadays rather complete (we refer the interested reader, e.g., to [3]). The picture changes drastically when one considers degenerate elliptic operators with unbounded coefficients. The prototype of such operators is the degenerate Ornstein-Uhlenbeck operator, defined on smooth functions $\varphi$ by
$$\mathscr{A}\varphi(x)=\frac12\,\mathrm{Tr}\big(QD^2\varphi(x)\big)+\langle Bx,D\varphi(x)\rangle,\qquad x\in\mathbb{R}^N,\tag{1.1}$$
where $Q=(q_{ij})$ and $B=(b_{ij})$ are suitable square matrices such that $Q$ is singular while the condition $\det Q_t>0$ is nevertheless satisfied for any $t>0$. Here,
$$Q_t=\int_0^t e^{sB}Qe^{sB^*}\,ds,\qquad t>0.$$
The condition $\det Q_t>0$ is equivalent to the well-known Kalman rank condition, which requires that
$$\mathrm{rank}\,\big[\,Q^{1/2},\,BQ^{1/2},\,\ldots,\,B^{m}Q^{1/2}\,\big]=N\tag{1.2}$$
for some $m<N$. In particular, $\mathscr{A}$ is hypoelliptic in Hörmander's sense. A suitable change of the orthonormal basis of $\mathbb{R}^N$ (see Remark 2.5) allows us to rewrite the operator $\mathscr{A}$ on smooth functions $\varphi$ as
$$\mathscr{A}\varphi(x)=\frac12\sum_{i,j=1}^{p_0}\hat q_{ij}D_{ij}\varphi(x)+\langle Bx,D\varphi(x)\rangle,\tag{1.3}$$
for some positive definite $p_0\times p_0$ matrix $\hat Q=(\hat q_{ij})$ and some $p_0\in\{1,\ldots,N-1\}$.
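To make the rank condition (1.2) concrete, the following classical two-dimensional Kolmogorov example may help (a standard illustration, not taken from the present paper):

```latex
% Degenerate Kolmogorov operator in R^2: diffusion only in the first variable,
% drift mixing the two variables.
Q=\begin{pmatrix}1&0\\0&0\end{pmatrix},\qquad
B=\begin{pmatrix}0&0\\1&0\end{pmatrix},\qquad
\mathscr{A}\varphi=\tfrac12 D_{11}\varphi+x_1D_2\varphi.
% Here Q^{1/2}=Q, BQ^{1/2} has the single non-zero entry in position (2,1), and
\operatorname{rank}\big[\,Q^{1/2},\,BQ^{1/2}\,\big]
=\operatorname{rank}\begin{pmatrix}1&0&0&0\\0&0&1&0\end{pmatrix}=2=N,
% so the Kalman condition holds with m=1, although Q itself is singular.
```

Although the diffusion acts only along $x_1$, the drift propagates regularity to the $x_2$ direction, which is precisely the mechanism behind hypoellipticity.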
In [17] Lunardi proves that one can associate a semigroup of bounded operators $\{T(t)\}$ in $C_b(\mathbb{R}^N)$ (the space of all bounded and continuous functions) with the operator $\mathscr{A}$ in a natural way, i.e., for any $f\in C_b(\mathbb{R}^N)$, $T(t)f$ is the value at $t>0$ of the (unique) classical solution to the homogeneous Cauchy problem
$$\begin{cases} D_tu(t,x)=\mathscr{A}u(t,x), & t\in\,]0,+\infty[,\ x\in\mathbb{R}^N,\\ u(0,x)=f(x), & x\in\mathbb{R}^N,\end{cases}\tag{1.4}$$
where by classical solution we mean a function $u$ which (i) is once continuously differentiable with respect to the time variable and twice continuously differentiable with respect to the spatial variables in $]0,+\infty[\times\mathbb{R}^N$, (ii) is continuous in $[0,+\infty[\times\mathbb{R}^N$ and bounded in $[0,T_0]\times\mathbb{R}^N$ for any $T_0>0$, and (iii) solves (1.4). One of the main peculiarities of the Ornstein-Uhlenbeck operator is that an explicit representation formula for the associated semigroup is available. This fact allows the author of [17] to prove uniform estimates for the spatial derivatives of the function $T(t)f$ when $t$ approaches $0$ and $f$ belongs to various spaces of (Hölder-) continuous functions. In fact, the behavior of the spatial derivatives of $T(t)f$ depends on the variable along which one differentiates. As a byproduct, this shows that the right (Hölder-) spaces in which to study the semigroup $\{T(t)\}$ are not the usual ones, but rather anisotropic spaces modelled on the degeneracy of the operator $\mathscr{A}$. Denoting, roughly speaking, by $C^\theta(\mathbb{R}^N)$ these anisotropic spaces, Lunardi shows that
$$\|T(t)f\|_{C^\theta(\mathbb{R}^N)}\le Ct^{-\frac{\theta-\alpha}{2}}\|f\|_{C^\alpha(\mathbb{R}^N)},\qquad t\in\,]0,1],\tag{1.5}$$
for any $0<\alpha\le\theta$ and some positive constant $C$, independent of $t$, i.e., the same kind of estimate one can expect in the non-degenerate case, when $C^\alpha$ and $C^\theta$ are the usual Hölder spaces, even for unbounded coefficients; see, e.g., [2,18].
Estimate (1.5) is the keystone for applying an abstract interpolation argument from [16] to prove optimal Schauder estimates for the solutions both to the elliptic equation
$$\lambda u(x)-\mathscr{A}u(x)=f(x),\qquad x\in\mathbb{R}^N,\ \lambda>0,\tag{1.6}$$
and to the non-homogeneous Cauchy problem (1.7), when $f,g,h$ are suitable continuous functions such that $g(t,\cdot)$, $f$, $h$ have some additional degrees of smoothness.
Recently, the second author, in [13,14], has extended these results to some non-trivial perturbations of the Ornstein-Uhlenbeck operator in (1.1). More precisely, in [13,14] the operator (1.3) has been studied under the assumption that the $(N-p_0)\times p_0$ matrix $B_3$ has full rank, and assuming that the matrix $\hat Q$ depends on $x\in\mathbb{R}^N$ and its entries are possibly unbounded functions at infinity. These assumptions imply that the Kalman rank condition (1.2) is satisfied at any $x\in\mathbb{R}^N$, with $m=1$.
To prove the crucial estimates (1.5), a technique different from that in [17] has been applied, since in this new situation no explicit representation formula for the associated semigroup is available. More precisely, such estimates have been obtained by developing a variant of the classical Bernstein method of [1].
Recently, the results in [17] have been generalized, both with analytic and probabilistic methods, in [22,23,25] to non-trivial perturbations of the operator $\mathscr{A}$ in (1.3) in which an additional unbounded drift term is added. More specifically, Saintier in [25] considers the case when the differential operator is of type $\hat{\mathscr{A}}=\mathscr{A}+\sum_{j=1}^{p_0}F_jD_j$, with $\mathscr{A}$ being given by (1.3), with an even $N$, $p_0=N/2$ and $Q=B=I$. Here, $F$ is any smooth function with bounded derivatives up to the third order. This operator arises, e.g., in the study of the motion of a particle $y$ of unit mass subject to a force field depending on $y$ and on its first-order derivative, perturbed by a noise. We refer the interested reader to [8] for further details. Applying the same techniques as those in [13,14], Saintier proves optimal Schauder estimates for the solutions to both (1.6) and (1.7). Note that in this situation the operator $\mathscr{A}$ satisfies the Kalman rank condition with $m=1$. The same problem is investigated with a stochastic approach in [22].
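The motivating model just mentioned can be sketched as follows (a standard formal derivation, with notation of our own choosing, not taken from [8] or [25]): for a unit-mass particle with position $y$ and velocity $v=\dot y$, subject to a force $F(y,v)$ and a white noise, one obtains a degenerate generator of the type $\hat{\mathscr{A}}$.

```latex
% Second-order Newton equation perturbed by noise, rewritten as a first-order
% system in R^N = R^{p_0} x R^{p_0} (so N = 2p_0 is even):
\ddot y(t)=F\big(y(t),\dot y(t)\big)+\dot W(t)
\quad\Longleftrightarrow\quad
\begin{cases} dy_t=v_t\,dt,\\ dv_t=F(y_t,v_t)\,dt+dW_t.\end{cases}
% The generator of the process (v_t,y_t) acts formally on smooth functions u as
\hat{\mathscr{A}}u=\tfrac12\Delta_vu+\langle v,\nabla_yu\rangle+\langle F,\nabla_vu\rangle,
% i.e., second-order derivatives appear only in the p_0 velocity variables,
% while the drift couples them to the p_0 position variables.
```

The diffusion thus acts on only half of the variables, and the perturbation $F$ enters exactly through the first-order terms $\sum_j F_jD_j$.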
Very recently, the results in [22,25] have been generalized in [23], with both analytic and stochastic methods, to the case when $\hat{\mathscr{A}}=\mathscr{A}+\sum_{j=1}^{p_0}F_jD_j$ for some $p_0<N$, $\mathscr{A}$ still being given by (1.3).

In this paper we extend part of the results in [13,14,22,23,25], considering a class of elliptic operators that, up to a change of coordinates, may be written in the form
$$\mathscr{A}\varphi(x)=\frac12\sum_{i,j=1}^{p_0}q_{ij}(x)D_{ij}\varphi(x)+\langle F(x)+Bx,D\varphi(x)\rangle,\tag{1.8}$$
for some $p_0<N$, where the matrix $Q_0(x)=(\hat q_{ij}(x))$, defined by $\hat q_{ij}\equiv q_{ij}$ if $i,j\le p_0$ and $\hat q_{ij}\equiv 0$ otherwise, and the matrix $B$ satisfy the Kalman rank condition (1.2) for some $m$ independent of $x$. We assume that $F:\mathbb{R}^N\to\mathbb{R}^N$ is a smooth function with derivatives whose growth at infinity is comparable with the growth of the minimum eigenvalue of the matrix $Q^{\frac12}(x)$. In the particular case when $F\equiv 0$, our results apply to any elliptic operator of the type
$$\mathscr{A}\varphi(x)=\frac12\sum_{i,j=1}^{N}q_{ij}(x)D_{ij}\varphi(x)+\langle Bx,D\varphi(x)\rangle,\tag{1.9}$$
provided the Kalman rank condition is satisfied, at any fixed $x\in\mathbb{R}^N$, for some $m<N$ independent of $x$.

The paper is organized as follows. First, in Section 2 we introduce the function spaces we deal with, as well as some notation. Moreover, we introduce the hypotheses that will be assumed throughout the paper and we recall some preliminary results, mainly from [13]. Next, in Section 3, the main part of this paper, we prove uniform estimates for the spatial derivatives of the semigroups associated with the family of non-degenerate elliptic operators $\mathscr{A}_\varepsilon:=\mathscr{A}+\varepsilon\Delta_\star$, where $\Delta_\star$ is the Laplacian containing the missing second-order derivatives, i.e., $\Delta_\star=D^2_{(p_0+1)(p_0+1)}+\cdots+D^2_{NN}$. More precisely, we show that the constants appearing in the estimates can be chosen to be independent of $\varepsilon\in\,]0,1[$. Then, in Section 4, using these estimates, we prove that we can associate a semigroup $\{T(t)\}$ of bounded operators in $C_b(\mathbb{R}^N)$ with the operators $\mathscr{A}$ in (1.8) and (1.9), and that the uniform estimates of the preceding section extend to $\{T(t)\}$. We also state some remarkable continuity properties of the semigroup $\{T(t)\}$.
Further, we show that we can associate a "weak" generator with the semigroup {T (t)}, a generalization of the classical concept of infinitesimal generator of a strongly continuous semigroup, and we give a characterization of its domain. In Section 5, we prove Schauder estimates for the distributional solutions to the elliptic equation (1.6) and the non-homogeneous Cauchy problem (1.7). Finally, in Appendix A we prove some technical lemmas that are used in the proof of the uniform estimates.

Main assumptions and preliminaries
In this section we introduce the main assumptions on the operators we consider. We also fix the notation and define the function spaces we use in this paper.

2.1.
Hypotheses. The following assumptions on the coefficients of the operator $\mathscr{A}$ in (1.8) and (1.9) will be in force throughout the paper. We begin by considering the case when $\mathscr{A}$ is given by (1.8).
(i) $Q(x)=(q_{ij}(x))$ is a $p_0\times p_0$ symmetric matrix, with entries belonging to $C^\kappa(\mathbb{R}^N)$ for some $\kappa\in\mathbb{N}$, $\kappa\ge3$, such that
$$\sum_{i,j=1}^{p_0}q_{ij}(x)\xi_i\xi_j\ge\nu(x)|\xi|^2,\qquad \xi\in\mathbb{R}^{p_0},\ x\in\mathbb{R}^N,\tag{2.1}$$
for some positive function $\nu$ such that $\inf_{\mathbb{R}^N}\nu=\nu_0>0$. Further, $|D^\alpha q_{ij}(x)|\le C_\alpha\,\nu(x)$ for any $x\in\mathbb{R}^N$, any $i,j=1,\ldots,p_0$ and any multi-index $\alpha$ with $|\alpha|\le\kappa$, for some positive constant $C_\alpha$.
(ii) There exist integers $p_1,\ldots,p_r$, with $p_0\ge p_1\ge\ldots\ge p_r$, such that the matrix $B$ can be split into blocks as follows:
$$B=\begin{pmatrix}
\star & \star & \cdots & \star & \star\\
B_1 & \star & \cdots & \star & \star\\
0 & B_2 & \cdots & \star & \star\\
\vdots & & \ddots & & \vdots\\
0 & 0 & \cdots & B_r & \star
\end{pmatrix},$$
where $B_h$ is a $p_h\times p_{h-1}$ matrix with full rank, i.e., $\mathrm{rank}(B_h)=p_h$ ($h=1,\ldots,r$).
The hypotheses on the coefficients of the operator A in (1.9) are the following.
(i) The entries $q_{ij}$ of the matrix $Q(x)$ belong to $C^\kappa(\mathbb{R}^N)$ ($i,j=1,\ldots,N$) for some $\kappa\in\mathbb{N}$, $\kappa\ge3$, and there exists a positive function $\nu$ with $\inf_{\mathbb{R}^N}\nu>0$ playing the role of the function in Hypothesis 2.1(i). (ii) For any $\alpha\in\mathbb{N}^N_0$ with length at most $\kappa$, there exists a positive constant $C=C_\alpha$ such that $|D^\alpha q_{ij}|\le C_\alpha\,\nu$ in $\mathbb{R}^N$. (iii) The kernel of the matrix $Q(x)$ is independent of $x\in\mathbb{R}^N$ and it is a proper subspace of $\mathbb{R}^N$. Moreover, $\ker(Q(0))$ does not contain non-trivial subspaces which are invariant for $B^*$.
Remark 2.4. Note that Hypothesis 2.3(iii) can be rewritten in one of the following equivalent forms: (a) the matrix $Q_t(x)=\int_0^te^{sB}Q(x)e^{sB^*}\,ds$ is positive definite for any $t>0$ and any $x\in\mathbb{R}^N$; (b) there exists $r<N$ such that the rank of the block matrix $[\,Q^{1/2}(x),BQ^{1/2}(x),\ldots,B^rQ^{1/2}(x)\,]$ is $N$ for any $x\in\mathbb{R}^N$. To prove this claim, it suffices to adapt to our situation the proof of [12, Proposition A.1]. For the reader's convenience we give a detailed proof in the appendix (see Lemma A.1).
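Condition (a) can be checked by hand in the two-dimensional Kolmogorov example (an illustration of our own, using the notation of Remark 2.4 with $Q$ constant):

```latex
% With Q = diag(1,0) and B the nilpotent matrix with the single entry b_{21}=1,
% we have e^{sB} = I + sB, hence
Q_t=\int_0^t(I+sB)\,Q\,(I+sB^*)\,ds
=\int_0^t\begin{pmatrix}1&s\\ s&s^2\end{pmatrix}ds
=\begin{pmatrix}t&\tfrac{t^2}{2}\\[2pt]\tfrac{t^2}{2}&\tfrac{t^3}{3}\end{pmatrix},
\qquad
\det Q_t=\frac{t^4}{12}>0\quad(t>0),
% so Q_t is positive definite for every t>0 even though Q itself is singular.
```

Note also the anisotropy already visible here: the diagonal entries of $Q_t$ degenerate at different rates ($t$ versus $t^3/3$) as $t\to0^+$.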
Remark 2.5. If the coefficients of the operator $\mathscr{A}$ in (1.9) satisfy Hypotheses 2.3, then one can find a suitable change of variables which transforms $\mathscr{A}$ into an operator of the type (1.8) (with $F\equiv0$). To check this fact, one considers the sequence $\{V_k:k\in\mathbb{N}\}$ of nested vector spaces generated by the iterated action of $B$ on the range of $Q(0)$; in view of Lemma A.1 and Hypothesis 2.3(iii), this sequence stabilizes at $\mathbb{R}^N$ after finitely many steps, and an orthonormal basis adapted to it provides the desired change of variables.

In view of Remark 2.5, without loss of generality, throughout the paper we can limit ourselves to dealing with the case when $\mathscr{A}$ is given by (1.8) and its coefficients satisfy Hypotheses 2.1.

General notation.
Functions. For any real-valued function $u$ defined on a domain of $\mathbb{R}\times\mathbb{R}^N$, we write indifferently $u(t,\cdot)$ and $u(t)$ when we want to stress the dependence of $u$ on the time variable $t$. Moreover, for any smooth real-valued function $v$ defined on a domain of $\mathbb{R}^N$, we denote by $Dv$ its gradient and by $|Dv(x)|$ the Euclidean norm of $Dv(x)$ at $x$. Similarly, by $D^kv$ ($k\in\mathbb{N}$) we denote the vector consisting of all the $k$-th order derivatives of $v$ with no repetitions. This means that we identify $k$-th order derivatives of type $\frac{\partial^kv}{\partial x_{i_1}\cdots\partial x_{i_k}}$ and $\frac{\partial^kv}{\partial x_{j_1}\cdots\partial x_{j_k}}$ when $(j_1,\ldots,j_k)$ is a permutation of $(i_1,\ldots,i_k)$.
We agree that the vector $D^kv$ contains each such derivative exactly once.

Asymptotics. Given any real-valued function $u$ defined in some neighborhood of $+\infty$ and $m\in\mathbb{N}$, we use the usual notation $u=o(s^m)$ when $\lim_{s\to+\infty}s^{-m}u(s)=0$. If $\{u_a\}_{a\in F}$ is a family of functions which are defined in a right-neighborhood of $0$ (independent of $a$), we write $u_a=o(t^m)$ (for some $m\in\mathbb{N}$) when $\lim_{t\to0^+}t^{-m}u_a(t)=0$ for any such parameter $a$.
Matrices. We denote the $k\times k$ identity matrix by $I_k$ and the transpose of a matrix $A$ by $A^*$. For any matrix $A$ we denote by $\|A\|$ its Euclidean norm. If $A$ is symmetric, $\lambda_{\min}(A)$ is the minimum eigenvalue of $A$. Finally, we use the symbol "$\star$" to denote matrices when we are not interested in their entries.
Miscellanea. We agree that $\mathbb{N}_0=\mathbb{N}\cup\{0\}$. Given a multi-index $\alpha=(\alpha_1,\ldots,\alpha_m)\in\mathbb{N}^m_0$, we denote by $|\alpha|:=\sum_{i=1}^m\alpha_i$ its length. Moreover, by $a^+$ we denote the maximum between $a\in\mathbb{R}$ and $0$. For any $R>0$, we denote by $B(R)$ the open ball in $\mathbb{R}^N$ centered at $x=0$ and with radius $R$; $\overline{B(R)}$ is its closure.

2.3.
Ordering the derivatives of smooth functions. Here, we introduce a splitting of the vector of all the derivatives of a function u : R N → R of a given order into sub-blocks. This splitting will be extensively used in Section 3.
Given $k,q\in\mathbb{N}$, we introduce a (total) ordering "$\prec_q$" on the set $I_{k,q}$ of all the multi-indices in $\mathbb{N}^{q+1}_0$ with length $k$. We say that $(m_0,\ldots,m_q)\prec_q(m'_0,\ldots,m'_q)$ if there exists $h\in\{0,\ldots,q\}$ such that $m_j=m'_j$ for any $j=0,\ldots,h-1$ and $m_h>m'_h$. We thus may order the elements of $I_{k,q}$ in a sequence $i^{(k,q)}_1,\ldots,i^{(k,q)}_{c_{k,q}}$. Here, $c_{k,q}:=\binom{q+k}{q}$.

Now, to order the entries of the vector $D^ku$ ($k\in\mathbb{N}$) we proceed as follows. Let $\{p_0,\ldots,p_r\}$ be a given set of non-increasing integers such that $p_0+\cdots+p_r=N$; throughout the paper these will be fixed as in Hypotheses 2.1(ii). We set $p_{-1}:=0$ and introduce the sets $I_j=\{i\in\mathbb{N}:r_j<i\le r_{j+1}\}$ ($j=0,\ldots,r$), where $r_l=\sum_{k=0}^{l}p_{k-1}$ for any $l=0,\ldots,r+1$. Moreover, we split $\mathbb{R}^N$ into the direct sum $\mathbb{R}^N=\bigoplus_{j=0}^r\mathbb{R}^{p_j}$. Hence, any multi-index $\alpha\in\mathbb{N}^N_0$ can be split as $\alpha=(\alpha_0,\ldots,\alpha_r)$ with $\alpha_j\in\mathbb{N}^{p_j}_0$ ($j=0,\ldots,r$), and we can write $|\alpha|:=(|\alpha_0|,\ldots,|\alpha_r|)$. We can now split the vector $D^ku$ as follows: (i) we split $D^ku$ into blocks according to the rule $D^ku=(D^k_1u,\ldots,D^k_{c_{k,r}}u)$, where $D^k_ju$ ($j=1,\ldots,c_{k,r}$) contains all the derivatives $D^\alpha u$ of order $k$ such that $|\alpha|=i^{(k,r)}_j$; (ii) we order the entries of the vectors $D^k_ju$ ($j=1,\ldots,c_{k,r}$) according to the following rule: if $D^\alpha u$ and $D^\beta u$ belong to the block $D^k_ju$, we say that $D^\alpha u$ precedes $D^\beta u$ if $\beta\prec_{N-1}\alpha$.
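As a toy illustration of this splitting (our own example, with $N=3$, $p_0=2$, $p_1=1$, so $r=1$ and the variables are grouped as $(x_1,x_2\,|\,x_3)$), consider the second-order derivatives of a smooth function $u$:

```latex
% For k=2 and q=r=1 there are c_{2,1} = \binom{1+2}{1} = 3 blocks, ordered by
% \prec_1 acting on |alpha| = (|alpha_0|, |alpha_1|):
i^{(2,1)}_1=(2,0),\qquad i^{(2,1)}_2=(1,1),\qquad i^{(2,1)}_3=(0,2),
% which yields the block decomposition of the vector of second-order derivatives
D^2u=\big(\underbrace{D_{11}u,\,D_{12}u,\,D_{22}u}_{D^2_1u},\ \
\underbrace{D_{13}u,\,D_{23}u}_{D^2_2u},\ \
\underbrace{D_{33}u}_{D^2_3u}\big).
```

The blocks thus collect derivatives according to how many times one differentiates in the non-degenerate directions: first purely "good" derivatives, then mixed ones, and last those involving only the degenerate direction.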

Hölder spaces.
Here, we introduce most of the isotropic function spaces we deal with in this paper.
Definition 2.6. For any $k\ge0$, $C^k_b(\mathbb{R}^N)$ denotes the subset of $C^k(\mathbb{R}^N)$ of functions which are bounded together with their derivatives up to the $[k]$-th order. We endow it with its natural norm. Given an open set $\Omega$ (possibly $\Omega=\mathbb{R}^N$), by $C^\infty_c(\Omega)$ we denote the set of all infinitely many times differentiable functions with compact support in $\Omega$.

Uniform estimates for the approximating semigroups
To investigate the elliptic and parabolic problems associated with $\mathscr{A}$, we approximate this operator by the uniformly elliptic operators $\mathscr{A}_\varepsilon$, defined on smooth functions $\varphi$ by $\mathscr{A}_\varepsilon\varphi=\mathscr{A}\varphi+\varepsilon\Delta_\star\varphi$, for any $\varepsilon>0$. It is known that one can associate a semigroup of bounded linear operators $\{T_\varepsilon(t)\}$ on $C_b(\mathbb{R}^N)$ with each operator $\mathscr{A}_\varepsilon$. For any $f\in C_b(\mathbb{R}^N)$ and any $t>0$, $T_\varepsilon(t)f$ is the value at $t$ of the unique classical solution to the Cauchy problem (3.1). The uniqueness of the classical solution to problem (3.1) follows from a corresponding maximum principle (see, e.g., Proposition 3.1(ii)). The existence of a solution to problem (3.1) can be proved by approximating such a problem with Dirichlet Cauchy problems in balls centered at $0$ with radius $n$, and using classical Schauder estimates and a compactness argument to show that the sequence of solutions $\{u_n\}$ to such Dirichlet Cauchy problems converges, as $n\to+\infty$, to a function $u_\varepsilon$ which turns out to solve problem (3.1). We refer the reader, for example, to [3, Chapter 1] and [20, Section 4] for more details.
By letting ε go to 0 and applying a compactness argument we will show the existence of a semigroup "generated by" A . For this purpose we need estimates for the spatial derivatives of {T ε (t)} uniformly for ε ∈]0, 1]. This section is devoted to the proof of such estimates.
We start with a maximum principle for (degenerate) elliptic and parabolic equations, which leads to the uniqueness of the distributional solutions to problems (1.6) and (1.7), but which will also be crucial in the proof of the estimates for the spatial derivatives in Theorems 3.2 and 3.3. We postpone the more or less standard proof to Appendix A.
Proposition 3.1. Consider an elliptic operator as above, with the coefficients $q_{ij}$ and $F_j$ ($i,j=1,\ldots,m$) being (possibly) unbounded functions in $\mathbb{R}^N$ which may grow, respectively, at most quadratically and linearly at infinity. Then the following assertions hold true.
(i) Suppose that $u\in C_b(\mathbb{R}^N)$ and $\lambda>0$ satisfy the relevant differential inequality, and, further, that $D_iu$ and $D_{ij}u$ exist in the classical sense for any $i,j=1,\ldots,m$. Then the maximum principle holds. (ii) Suppose that $u$ is as above, corresponding to some $f\in C_b(\mathbb{R}^N)$ and $g\in C(]0,T_0]\times\mathbb{R}^N)$. Further, assume that $D_iu$, $D_{ij}u$ ($i,j=1,\ldots,m$) exist in the classical sense.
The following theorem will be the most crucial ingredient for the construction of the semigroup associated with $\mathscr{A}$.

Theorem 3.2. For any $\varepsilon>0$, any $h\in\mathbb{N}$ and any $f\in C^h_b(\mathbb{R}^N)$, the function $T_\varepsilon(t)f$ belongs to $C^\kappa_b(\mathbb{R}^N)$ for any $t>0$. Moreover, for any $T_0>0$ and any $h,l\in\mathbb{N}$ with $h\le l$, the function $(t,x)\mapsto t^{(l-h)/2}D^lT_\varepsilon(t)f(x)$ is bounded and continuous in $[0,T_0]\times\mathbb{R}^N$.

Proof. We restrict ourselves to showing the assertion in the case when $h=0$, the other cases being similar and even easier. We split the proof into two steps. In the first one, we prove that there exists a positive constant $C$, independent of $f$, such that (3.3) holds for any $f\in C_b(\mathbb{R}^N)$ and $t>0$. Next, in Step 2, we prove that the function $(t,x)\mapsto t^{l/2}D^lT_\varepsilon(t)f(x)$ is continuous up to $t=0$.
Step 1. Without loss of generality, we can limit ourselves to proving (3.3) in the particular case when $f\in C^\infty_c(\mathbb{R}^N)$. Indeed, in the general case it suffices to approximate $f\in C_b(\mathbb{R}^N)$ with a sequence of smooth functions $f_n\in C^\infty_c(\mathbb{R}^N)$, bounded in $C_b(\mathbb{R}^N)$ and converging to $f$ locally uniformly in $\mathbb{R}^N$. It is well known that $T_\varepsilon(\cdot)f_n$ converges to $T_\varepsilon(\cdot)f$ uniformly in $[0,T_0]\times B(M)$, as $n\to+\infty$, for any $M,T_0>0$, with interior estimates holding with some positive constant $\hat C$, depending on $M,T_0$. Hence, $D^lT_\varepsilon(t)f_n$ converges to $D^lT_\varepsilon(t)f$ locally uniformly in $]0,+\infty[\times\mathbb{R}^N$, and this allows us to extend (3.3) to any $f\in C_b(\mathbb{R}^N)$.

Now, for the proof of (3.3) for $f\in C^\infty_c(\mathbb{R}^N)$, let $\varphi\in C^\infty_c(\mathbb{R})$ be a non-increasing function such that $\varphi(t)=1$ for any $t\in\,]-1/2,1/2[$ and $\varphi(t)=0$ for any $t\in\mathbb{R}\setminus\,]-1,1[$. For $R>1$ we define the cut-off functions $\eta_R$ in terms of $\varphi$ and the auxiliary function $v$, where $u_R$ denotes the classical solution to the Dirichlet Cauchy problem in the ball $B(R)$ with initial value $f$, and $a$ is a positive parameter to be fixed later on ($a$ will be small). To simplify the notation, we drop the index $R$ when there is no danger of confusion.
The classical Schauder estimates of [11, Chapter 4, Theorem 5.1] imply that $v$ is continuous in its domain. Moreover, a straightforward computation shows that $v$ solves a Cauchy problem in which, for any $t\in[0,T_0]$ and any $x\in B(R)$, the function $g$ is given in terms of the commutators $[D^m,\mathscr{A}]$ between the operators $D^m$ and $\mathscr{A}$. Using the ellipticity assumption on the $q_{ij}$'s we get a bound in which $D^m_\star u$ (respectively $D^m_{\star\star}u$) denotes the vector whose entries are the $m$-th order derivatives $\frac{\partial^mu}{\partial x_{i_1}\cdots\partial x_{i_m}}$ with $i_j\le p_0$ for some $j=1,\ldots,m$ (respectively $i_j>p_0$ for all $j=1,\ldots,m$).
We turn to estimating the function $g_3$. From Hypotheses 2.1 it follows easily that the needed growth bound holds for any $x\in\mathbb{R}^N$ and some positive constant $C_1$. Taking this into account and using Young's inequality, we conclude that the corresponding estimate holds for any $t\in[0,T_0]$ and some positive constant $C_2$, independent of $\varepsilon$ and $t$. The term $g_4$ can be estimated similarly, taking now (2.4) into account; here the relevant constants ($h=1,\ldots,l$) denote the maxima of the sup-norms of the functions $D^hq_{ij}$ (respectively $D^hF_j$) ($i,j=1,\ldots,N$). Hence, taking Hypotheses 2.1(i) and 2.1(ii) into account, we can write the resulting estimate for any $t\in[0,T_0]$ and some positive constant $C_3$, independent of $t$. Summing up, from (3.4), (3.6) and (3.7) we easily deduce the corresponding bound for any $m=1,\ldots,l+1$. Since in $M^\star_m(a,T_0)$ and $M^{\star\star}_m(a,T_0)$, apart from the first negative term, everything vanishes as $a\searrow0$ for any $m=1,\ldots,l+1$, it follows that for sufficiently small $a>0$ (independent of $R$!) the inequality $g(t,x)\le0$ holds for any $t\in[0,T_0]$ and any $x\in B(R)$. The classical maximum principle then yields the claim.

Step 2. We now conclude the proof by showing that the function $w_l:(t,x)\mapsto t^{l/2}D^lT_\varepsilon(t)f(x)$ is continuous up to $t=0$. For $f\in C^\infty_c(\mathbb{R}^N)$ this follows for any $R>0$; hence, by a compactness argument, we can easily settle this case. Let us now consider the general case when $f\in C_b(\mathbb{R}^N)$, approximated by a sequence $\{f_n\}\subset C^\infty_c(\mathbb{R}^N)$ which is bounded in $C_b(\mathbb{R}^N)$ and converges to $f$ locally uniformly in $\mathbb{R}^N$. Let us fix $k,m\in\mathbb{N}$. By [15, Proposition 1.1.3(iii)], we know that the interpolation inequality (3.8) holds for some positive constant $P_l=P_l(M)$ and any function $\psi\in C^{l+1}(B(M))$. Applying (3.8) to $\psi=t^{l/2}T_\varepsilon(\cdot)f_n-t^{l/2}T_\varepsilon(\cdot)f$ and using the already proved inequality (3.3), we conclude that (3.9) holds for some positive constant $P'_l$. The right-hand side of (3.9) vanishes as $n\to+\infty$. By the arbitrariness of $T_0$ and $M$, it follows immediately that the function $w_l$ is continuous in $[0,+\infty[\times\mathbb{R}^N$. In particular, it vanishes at $t=0$, since this is the case for the approximating smooth data. This completes the proof.

We are now in a position to prove the main result of this section.
Our ultimate aim is to show that the semigroups $\{T_\varepsilon(t)\}$ converge to a semigroup $\{T(t)\}$ which is associated with the operator $\mathscr{A}$, and we also wish to establish estimates for the spatial derivatives of $\{T(t)\}$. Contrary to the uniformly elliptic situation of $\{T_\varepsilon(t)\}$, the behavior near $t=0$ of the partial derivatives $D^\alpha T(t)f$ is expected to depend not only on the length $|\alpha|$ of the multi-index $\alpha$, but also on the directions along which we differentiate. Thus the well-known behavior $t^{-|\alpha|/2}$ is replaced by some function growing faster near $0$. The exact behavior is well known, e.g., for the Ornstein-Uhlenbeck semigroup (see [17]), and the optimal exponent is actually given by a function $q:\mathbb{N}^{r+1}_0\to\mathbb{R}_+$ defined below. With this function, the still to be constructed semigroup $\{T(t)\}$ will obey the corresponding estimate for $f\in C_b(\mathbb{R}^N)$ and $\alpha\in\mathbb{N}^N_0$ (recall the notation $|\alpha|=(|\alpha_0|,|\alpha_1|,\ldots,|\alpha_r|)$ from Subsection 2.3). If, instead, we have control over certain derivatives of $f$, say $f\in C^h_b(\mathbb{R}^N)$, we expect a better behavior. Indeed, this will be the case. For the precise statement we will need the function $q_h:\mathbb{N}^{r+1}_0\to\mathbb{R}_+$. This function describes the expected behavior near $t=0$ in the estimates of the derivatives, and it models the following: if we have a function in $C^h_b(\mathbb{R}^N)$, then we can drop any $h$ partial derivatives from a multi-index $\alpha$, since these should not contribute to the power of $t$. We do this in such a way that the derivatives which would give the largest contribution in the derivative estimate are dropped. Then we can evaluate our $q$ on this new multi-index and get the right behavior near $t=0$.
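For orientation, in the degenerate Ornstein-Uhlenbeck case the sharp short-time behavior established in [17] has the following schematic form (we state it only as a guide; the precise definitions of $q$ and $q_h$ are those given in the paper):

```latex
% Derivatives along the j-th block of variables each cost a factor t^{-(2j+1)/2}:
\|D^\alpha T(t)f\|_\infty\le C\,t^{-q(|\alpha|)}\|f\|_\infty,\qquad t\in\,]0,1],
\qquad
q(m_0,\ldots,m_r)=\sum_{j=0}^r\Big(j+\tfrac12\Big)m_j.
% E.g., one derivative in the non-degenerate directions (j=0) costs t^{-1/2},
% as in the uniformly elliptic case, while one derivative in the last block
% (j=r) costs the much stronger factor t^{-(2r+1)/2}.
```

This exhibits the anisotropy: the farther a direction sits from the diffusion block in the chain $B_1,\ldots,B_r$, the more singular the corresponding derivative of $T(t)f$ is as $t\to0^+$.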
The estimates for the spatial derivatives of $\{T_\varepsilon(t)\}$, uniform with respect to $\varepsilon\in\,]0,1]$, are given by the following result.
(vii) Let $\alpha$ and $\bar\alpha$ be two multi-indices such that $\bar\alpha_j\le\alpha_j$ for all $j=0,\ldots,r$ and $\bar\alpha_{j_0}<\alpha_{j_0}$ for some $j_0$. Further, let $\beta=2e$...

Also the next linear algebraic lemma will be used in the proof of Theorem 3.3. For a proof we refer to [13, Lemma 2.6].

Lemma 3.5. Suppose that $Q=(q_{ij})$ and $A$ are non-negative definite $N\times N$ square matrices. Further, assume that, for some $m\in\mathbb{N}$, the $m\times m$ submatrix $Q_0=(q_{ij})_{i,j\le m}$, obtained by erasing the last $N-m$ rows and columns, is positive definite, and that $q_{ij}=0$ if $\max\{i,j\}>m$. Then $\mathrm{Tr}(QA)\ge\lambda_{\min}(Q_0)\,\mathrm{Tr}(A_1)$, where $A_1$ is the submatrix obtained from $A$ by erasing the last $N-m$ rows and columns.
Proof of Theorem 3.3. Throughout the proof, we simply write $c_k$ and $i^{(k)}_m$ for $c_{k,r}$ and $i^{(k,r)}_m$, and set $u_\varepsilon=T_\varepsilon(\cdot)f$; the $H^{(l)}(t)$ ($l=0,\ldots,k$) are suitable symmetric matrices. Namely, $H^{(0)}=1$ and the matrices $H^{(l)}(t)$ ($l=1,\ldots,k$) are split into $c_l$ blocks $H^{(l)}_{m,p}(t)$ according to the splitting of the vector $D^lu_\varepsilon$ introduced in Subsection 2.3. The relevant weights are fixed, for any $t>0$, in terms of constants $s_{m,\ell(m)}$ to be determined later on, just as well as the positive parameters $a>1$, $\eta^{(l)}_{m,m}$ and $\eta^{(l)}_{m,\ell(m)}$. We put the requirements (3.11) on these parameters, for any $l=1,\ldots,k$ and any $m>c_{l-1}$. Conditions (3.11) guarantee that the matrix $H^{(l)}$ ($l=1,\ldots,k$) is positive definite for any $t>0$. Moreover, we will also need to assume conditions (3.12) for any $l=1,\ldots,k$. For the moment, as it will be crucial in what follows, we assume that the constants $\eta^{(l)}_{m,m}$ and $\eta^{(l)}_{m,\ell(m)}$ ($m=1,\ldots,c_l$), satisfying conditions (3.11)(a) and (3.12), can actually be determined. We will return to this point at the end and show that this is indeed the case.
where the function $g_\varepsilon$ is given by (3.13); the matrix $\dot H^{(l)}$ is obtained by entrywise differentiation of the matrix $H^{(l)}$ with respect to time, and we set $D^0D_iu_\varepsilon=D_iu_\varepsilon$. Note also that the commutators here are understood coordinatewise.
When h = k we agree that the first sum in the second line of (3.13) disappears.
We are going to prove that we can fix $T_0$ small enough, but independent of $\varepsilon$, such that $g_\varepsilon\le0$ in $]0,T_0]\times\mathbb{R}^N$. In particular, this implies the corresponding bound for some positive constant $\hat C$. Since the matrices $H^{(j)}(t)$ are positive definite for any $j$ and any $t$ if we assume (3.11), we obtain that (3.10) holds in the time interval $[0,T_0]$. The semigroup property then allows us to extend this estimate to any compact time interval $J\subset\,]0,+\infty[$. We now turn to the estimation of $g_\varepsilon$.
Estimating the function $g_{1,\varepsilon}$. Lemma 3.5 and the ellipticity condition (2.1) imply the lower bound (3.14). This is a term of negative type, and it will help us to control (most of) the remaining terms in (3.13). More precisely, the right-hand side of (3.14) contains all the derivatives $D^\alpha u_\varepsilon$ of order less than or equal to $k+1$ such that, if we split $\alpha=(\alpha_0,\ldots,\alpha_r)$ (as explained in Subsection 2.3), then $\alpha_0\ne0$. So we miss all the derivatives of $u_\varepsilon$ of the type $D^\alpha u_\varepsilon$ with $|\alpha|\le k+1$ and $\alpha_0=0$. We will recover these latter derivatives from (a part of) the term $g_{2,\varepsilon}$.
Using the very definition of the matrices $H^{(l)}$ we obtain, for any $t>0$, the corresponding identity. Thanks to (3.11)(a), we can fix the constants $\gamma^{(l)}_m$. By Young's inequality (we will use the same trick several times in the sequel) and Lemma 3.4(iii) we now infer the required bound.

Estimating the term $g_{2,\varepsilon}$. Observe that the corresponding identity holds for any $t>0$. By virtue of Lemma A.2 and a straightforward computation, since the relevant matrix has maximum rank (which equals the number of its columns) for any $m$ and any $l$, we can fix the matrices $H^{(l)}_{m,\ell(m)}$. Hence, observing that, by properties (ii) and (v) in Lemma 3.4, the corresponding estimate holds for any $t>0$, we can now estimate the second and the third terms in (3.16). For this purpose, we first conclude from Lemma 3.4(iv) the inequalities (3.18a) and (3.18b), valid for any $t\in\,]0,1]$; they will allow us to split the powers of $t$ by using Young's inequality in the estimate of the second and third terms in (3.16). Since we are looking for a right-neighborhood of $t=0$ where $g_\varepsilon$ is non-positive, without loss of generality we can assume that $t\in\,]0,1]$. We now consider several cases according to the values of $p$ and $s$, handling the different cases for $p$ and for $s$ in parallel. First, suppose that $p,s\le c_{l-1}$. Using conditions (3.12)(c) and (3.12)(d), together with (3.18a), (3.18b) and Young's inequality, we obtain the corresponding bound for any $t\in\,]0,1]$. The case when $m\le c_{l-1}$ and $s>c_{l-1}$ can be addressed similarly, taking now (3.12)(c) into account; we thus obtain the analogous bound for any $t\in\,]0,1]$. Finally, we consider the case when $m,p,s>c_{l-1}$. Observe that $p<m$ for $p\in A^{(l)}_m$, $p\ne m$, and hence condition (3.12)(b) applies, while (3.12)(a) controls the remaining exponents. These yield the desired bound for any $t\in\,]0,1]$. We now estimate the fourth term in (3.16). First, notice that, by Lemma 3.4(iv), the corresponding inequality holds. If $\ell(m)>c_{l-1}$, then, because of (3.12)(e), we can write a bound which holds for any $t\in\,]0,1]$.
Estimating the term $g_{3,\varepsilon}$. As already remarked, this term occurs only if $h<k$. We begin by estimating the term $t^{s_{m,\ell(m)}}|D^l_{\ell(m)}u_\varepsilon(t)|\cdot|D^l_mu_\varepsilon(t)|$ when $l>h$ and $m>c_{l-1}$; note that, by Lemma 3.4, the required inequality holds for any $t>0$, where $\beta^{(l)}_{\ell(m)}$ is as above. From (3.25), taking condition (3.12)(a) into account, we now get the desired bound for any $t>0$.
Estimating the terms $g_{4,\varepsilon}$ and $g_{5,\varepsilon}$. We begin with $g_{4,\varepsilon}$, the case of $g_{5,\varepsilon}$ being completely analogous. Observe that, for any $m=1,\ldots,c_l$, the relevant identity holds with some matrices $P^{(l,z)}_m$ whose entries depend linearly only on the derivatives (of order at least $1$ and at most $l$) of the functions $q_{ij}$ ($i,j=1,\ldots,p_0$). In particular, these matrices are independent of $\varepsilon$, and they can be split according to the splitting of Subsection 2.3. From the second equality in (3.27) (which is immediate if we recall that $q^\varepsilon_{ij}$ is constant if at least one of $i$ and $j$ is greater than $p_0$) it follows that the terms appearing in (3.27), obtained from $[D^\alpha,\mathrm{Tr}(Q^\varepsilon D^2)]u_\varepsilon$, are of the form $D^\beta u_\varepsilon$, with coefficients depending on the derivatives of $q_{ij}$, $i,j\le p_0$, and with some $\beta\in\mathbb{N}^N_0$ such that $|\beta|=(\beta_0,\ldots,\beta_r)$ with $1\le\beta_0\le\alpha_0+2$, $\beta_j\le\alpha_j$ for any $j=1,\ldots,r$ and $|\beta|\le|\alpha|+1$; here the multi-indices $\beta$ are obtained in this way from multi-indices $\alpha$ with $|\alpha|=i^{(l)}_m$. By Lemma 3.4 we get inequalities which will allow us to split the powers of $t$ by using Young's inequality, and hence we can write the corresponding bound for any $t>0$. In the following, we assume again $t\in\,]0,1]$ and denote by $C$ positive constants, independent of $\nu$, $t$ and $a$, which may vary from line to line. The summands in the first term in the right-hand side of (3.28) can be estimated for any $m\le c_l$ as follows: for the first term in the right-hand side of (3.29), since $z\le l$ there, we can use (3.12)(c) and then Young's inequality. The summands in the second and the third terms in (3.28) can be estimated likewise: by (3.12)(c), (3.12)(d) and (3.12)(e) (this latter for $\ell(m)>c_{l-1}$), Young's inequality applies. Putting everything together, we obtain (3.32). In just the same way, we can estimate the function $g_{5,\varepsilon}$ and get the analogous bound for any $t\in\,]0,1]$.
If we now fix the parameter $a$ sufficiently large, condition (3.11)(b) is satisfied and, for an even larger constant $a$, the terms in the right-hand side of (3.32) will be negative, provided that one can choose the parameters $\eta^{(l)}_m$ suitably. By this restriction, condition (3.11)(a) will be satisfied. So from now on we concentrate only on (3.12). Notice that for $c_{l-1}<m<p$ we have $\ell(m)<\ell(p)$, so (3.12)(b) is automatically satisfied by monotonicity. Note also that for such $m$ we have $\ell(m)<m$. Hence, if we choose the parameters $a^{(l)}_n$ suitably for all $n$, also (3.12)(a) and (3.12)(e) will be satisfied. Now we turn to the actual construction, keeping all the above requirements in mind.

Remark 3.6. Notice that the above proof also works for other functions $q:\mathbb{N}^{r+1}_0\to\mathbb{R}_+$ replacing $q_h$, as long as such a function $q$ has properties similar to those of $q_h$ listed in Lemma 3.4.

Construction of the semigroup
In this section we prove that, for any $f\in C_b(\mathbb{R}^N)$, the Cauchy problem (4.1) admits a unique classical solution. (iv) For any $f\in C_b(\mathbb{R}^N)$ and any multi-index $\alpha\in\mathbb{N}^N_0$ with $|\alpha|\le\kappa-1$, the derivative $D^\alpha T(\cdot)f$ exists in the classical sense in $]0,+\infty[\times\mathbb{R}^N$ and is a continuous function. Moreover, there exists a positive constant $C$, depending only on $\omega$, $h$ and $\alpha$, such that (4.2) holds for any $f\in C^h_b(\mathbb{R}^N)$ and any $\alpha$ as above.

Proof. (i) First of all, notice that uniqueness follows immediately from the maximum principle, Proposition 3.1. Throughout the proof, we denote by $C$ positive constants, independent of $\varepsilon\in\,]0,1[$, which may vary from line to line. As a first step, we show that, for any $\omega>0$, there exists a positive constant $\hat C=\hat C(\omega)$, independent of $\varepsilon$, such that (4.3) holds for any $f\in C^h_b(\mathbb{R}^N)$ and any $|\alpha|\le\kappa$. Estimate (4.3) follows from the semigroup law and from (3.10). Indeed, fix $\omega>0$ and let $C_0=\min\{1,\inf_{t\in[1,+\infty[}t^{-q_h(|\alpha|)}e^{\omega t}\}$. Splitting $T_\varepsilon(t)=T_\varepsilon(1)T_\varepsilon(t-1)$ for any $t>1$, and taking (3.10) in Theorem 3.3 into account, we get the claim. We can now prove that problem (4.1) admits a unique classical solution for any $f\in C_b(\mathbb{R}^N)$. For this purpose, as in the proof of Theorem 3.2, we set $u_\varepsilon=T_\varepsilon(\cdot)f$. Then, from (4.3), we easily deduce the corresponding bounds for any $0<T_1<T_2$. Since the function $u_\varepsilon$ solves the Cauchy problem (3.1) and the coefficients of the operator $\mathscr{A}_\varepsilon$ are locally bounded, uniformly with respect to $\varepsilon\in\,]0,1[$, the function $D_tu_\varepsilon$ is bounded in $]T_1,T_2[\times B(R)$, for any $R>0$, by a constant independent of $\varepsilon$. Therefore, $u_\varepsilon\in\mathrm{Lip}([T_1,T_2];C(\overline{B(R)}))$ with norm independent of $\varepsilon\in\,]0,1[$. By applying [15, Propositions 1.1.2(iii) and 1.1.4(i)], we now deduce that $u_\varepsilon\in C^{\theta/2,\kappa-1+\theta}(]T_1,T_2[\times B(R))$ for any $\varepsilon$ as above and some $\theta\in\,]0,1[$, with $C^{\theta/2,\kappa-1+\theta}$-norm bounded by a constant independent of $\varepsilon$. As a byproduct, using that $u_\varepsilon$ solves (3.1), we deduce that $D_tu_\varepsilon\in C^{\theta/2,\kappa-3+\theta}(]T_1,T_2[\times B(R))$ and, again, its $C^{\theta/2,\kappa-3+\theta}$-norm is bounded by a constant independent of $\varepsilon$.
Since T 1 , T 2 , R are arbitrarily fixed, using both a compactness and a diagonal argument, we can determine a sequence {ε n }, converging to 0, such that {u εn } converges in C 1,κ−1 (K), for any compact set K ⊂]0, +∞[×R N , to a function u f ∈ C 1+θ/2,κ−1+θ loc (]0, +∞[×R N ). Of course, the function u f solves the differential equation in (4.1) for t > 0. The continuity of u f up to t = 0 and the condition u f (0, ·) = f are obtained in three steps.
Step 1. Suppose that f ∈ C 2 c (R N ). Then, by the proof of [20, Proposition 4.3], we know that ‖u ε (t, ·) − f ‖ ∞ can be estimated by a function of t, independent of ε, which vanishes as t → 0 + . Hence, taking the limit first as n → +∞ and then as t → 0 + , we obtain that u f is continuous at t = 0, where it equals f . So we have shown that u f is the unique classical solution to problem (4.1). Moreover, we infer that u ε converges to u f , as ε → 0 + , in C 1,κ−1 ([T 1 , T 2 ] × B(R)) for any T 1 , T 2 , R as above. Indeed, by uniqueness (or by the maximum principle in Proposition 3.1), any sequence u εn (with ε n positive and tending to 0) which converges in C 1,κ−1 loc (]0, +∞[×R N ) must converge to u f . Again, the maximum principle implies that for a non-negative f ∈ C c (R N ) the solution u f is non-negative as well.
Step 2. Suppose now that f vanishes at infinity. Then, we can approximate f by a sequence {f n } of smooth and compactly supported functions. By estimate (3.2), we know that ‖u fn − u fm ‖ ∞ ≤ ‖f n − f m ‖ ∞ for any n, m ∈ N. Letting m → +∞ yields ‖u fn − u f ‖ ∞ ≤ ‖f n − f ‖ ∞ . Hence, u fn converges to u f uniformly in [0, +∞[×R N . Since u fn is continuous in [0, +∞[×R N and u fn (0, ·) = f n , it follows that u f is continuous in [0, +∞[×R N as well and that u f (0, ·) = f holds. The same argument as in the last part of Step 1 shows that, also in this situation, the function T ε (·)f converges to u f in C 1,κ−1 loc (]0, +∞[×R N ) as ε → 0 + . Moreover, for non-negative f the solution u f is non-negative as well.
(iv) By (i), we know that the function T (t)f belongs to C κ−1 (R N ) for any t > 0 and any f ∈ C b (R N ), and T ε (t)f converges to T (t)f in C κ−1 loc (R N ) as ε → 0 + . Since the constant in (4.3) is independent of ε ∈]0, 1[, it is immediate to conclude that (4.2) holds for any |α| ≤ κ − 1.
With respect to derivatives in the first and second block of variables we can prove more regularity.
Proof. We set u = T (·)f and split the proof into several steps. In the first one we show a formula that will be used in Steps 2 to 5, in the actual proof of (4.5). Until Step 5, we will assume at least f ∈ C κ−1 b (R N ), and then in Step 5 we proceed with an approximation argument. Finally, in Step 6, we show that the function D α u is continuous in ]0, +∞[×R N .
Step 1. We fix R > 0, j ∈ {1, . . . , N } and f ∈ C κ−1 b (R N ), and prove that, for any smooth cutoff function η, formula (4.6) holds, D ⋆ u and D 2 ⋆ u denoting, respectively, the vector of the first-order derivatives of u with respect to the variables with indices not greater than p 0 and the square submatrix obtained by erasing the last N − p 0 rows and columns from D 2 u.
To prove (4.6), for any δ ∈] − 1, 1[ we introduce the operator τ j δ on C b (R N ). Moreover, we set w j ε,δ = ϑτ j δ v ε , where v ε = ηu ε . In the sequel, in order to shorten the notation, when there is no danger of confusion we only stress explicitly the dependence on ε of the functions considered. As is easily seen, w ε satisfies (4.8), where D ⋆⋆ ψ denotes the vector of the first-order derivatives of the function ψ : R N → R with respect to the last N − p 0 variables. In view of the variation of constants formula (see [21, Theorem 3.5]), we obtain that w ε satisfies (4.9). We are going to show that we can take the limit as ε → 0 + in (4.9) and write (4.10), where g j,δ is obtained from g j,δ,ε by replacing u ε with u and letting ε = 0 in (4.8). By the results in the proof of Theorem 4.1(i), it follows immediately that the continuous function g j,δ,ε converges to the function g j,δ uniformly in [0, +∞[×R N as ε → 0 + . This implies that, for any r, s > 0, T ε (r)g j,δ,ε (s, ·) converges to T (r)g j,δ (s, ·) locally uniformly in R N as ε → 0 + . Indeed, for any compact set K ⊂ R N , we can split the difference by the triangle inequality, and from the proof of Theorem 4.1(i) we see that the last term in the resulting chain of inequalities vanishes as ε → 0 + . Moreover, since the semigroups {T ε (t)} are contractive, the function (r, s) → T ε (r)g j,δ,ε (s, ·) is bounded in [0, +∞[×[0, +∞[×R N , uniformly with respect to ε ∈]0, 1[. Therefore, the dominated convergence theorem yields (4.10).
We can now prove formula (4.6). Since, by Theorem 4.1(iv), the function u and the spatial derivatives of u involved in the definition of g j,δ are bounded in [0, T ] × R N for any T > 0, it is immediate to see that g j,δ (s, ·) converges to g j (s, ·) uniformly in R N for any s > 0, as δ → 0. Formula (4.6) now follows from (4.10) via the dominated convergence theorem.
Step 2. Here, and in the forthcoming Steps 3 and 4, we assume that f ∈ C κ b (R N ). Let us fix a multi-index α = (α 1 , . . . , α N ) ∈ N N 0 with |α| = κ and |(α 1 , . . . , α p0 )| ≥ 1. We denote by j the largest integer such that α j ≠ 0, set β := α − e (N ) j , and denote by ι the smallest integer with β ι > 0. To prove that the derivative D α u exists in the classical sense, it suffices to show that we can differentiate, with respect to the multi-index β, the function in (4.6). For this purpose, we observe that, from (4.2) with h = κ − 3 and h = κ − 2, we deduce that (4.11) holds for any t ∈]0, +∞[, ψ ∈ C κ−3+θ b (R N ) and θ = 0, 1. By interpolation, we can extend the previous estimate to any θ ∈ [0, 1]. Estimate (4.3) implies that, for any multi-index γ of length κ − 1 and any t > 0, the function D γ u ε (t, ·) is Lipschitz continuous in R N with Lipschitz seminorm that can be bounded by Ce ωt for any ω > 0 and some C = C(ω), the constants being uniform in ε. Since u ε converges to u in C 1,κ−1 loc (]0, +∞[×R N ), the function D γ u(t, ·) is Lipschitz continuous in R N and its seminorm can be bounded by Ce ωt . As a byproduct, we infer that, for any θ ∈]0, 1[ and any T 0 > 0, the function g j (s, ·) can be estimated in the norm appearing in (4.11), uniformly with respect to s ∈]0, T 0 [. Consequently, if we take θ > 2ι/(2ι + 1), we get an integrable function on the right-hand side of (4.11), and hence we can differentiate under the integral sign in (4.6). This proves that the derivative D α u exists in the classical sense. Moreover, it satisfies (4.2). Indeed, the sup-norm of D α u can be controlled from above by the Lipschitz seminorm of D β u which, as we have shown, can be estimated from above by Ce ωt for any t > 0, any ω > 0 and some C = C(ω).
Step 3. We now assume that |α| = κ and α i = 0 for all i = 1, . . . , p 0 , whereas α j ≠ 0 for some j ∈ {p 0 + 1, . . . , p 0 + p 1 }; we again set β := α − e (N ) j . We are going to show that we can differentiate formula (4.6) with respect to the multi-index β. For this purpose, let again ι be the largest integer with β ι ≠ 0, and note that it suffices to prove that, for any T 0 > 0, the function g j is bounded in ]0, T 0 [ with values in C κ−2+θ b (R N ) for some θ ∈](2ι − 1)/(2ι + 1), 1[. Indeed, once this property is proved, estimate (4.2) gives an integrable bound for all 0 < s < t ≤ T 0 , for arbitrary T 0 > 0 and some C = C(ω), and we can complete the proof applying the same arguments as in the previous step. Due to the structure of g j , in order to prove that g j is bounded in ]0, T 0 [ with values in C κ−2+θ b (R N ), it suffices to show that, for any pair of indices l ≤ p 0 and l ′ ≤ p 0 + p 1 with l ≤ l ′ , the function D ll ′ u(t, ·) belongs to C κ−2+θ (R N ) and sup t∈]1/M,M[ ‖D ll ′ u(t, ·)‖ C κ−2+θ (B(M)) < +∞ for any M > 0. Actually, only the first and the fifth second-order terms in (4.7) have to be taken care of. Indeed, applying D γ with |γ| = κ − 2 to any of the other terms g of (4.7) (in which there are only first-order derivatives of u), we get that D γ g(t, ·) is Lipschitz continuous, uniformly in ]0, t 0 [, and sup t∈]0,t0[ ‖g(t, ·)‖ C κ−2+θ (R N ) < +∞ (these properties follow by approximating u by u ε , as we have done several times above).
We now prove the assertion about D ll ′ u. So let β ′ ∈ N N 0 with |β ′ | = κ − 2, denote by i the largest integer such that β ′ i > 0, and define β = β ′ + e (N ) i . From (4.2) and from the assertion already proved in Step 2 we obtain, using interpolation as well, that (4.12) holds for any θ, ρ ∈ [0, 1]. From (4.12) it now follows that (4.13) holds for any 0 < s < t ≤ T 0 . Hence, if we fix γ ∈]0, 1[ and take ρ = θ 1 = γ/3, we see that the function in the right-hand side of (4.13) is integrable in ]0, T 0 [ for any T 0 > 0. Thus, we can differentiate under the integral sign in (4.6) and conclude that the function D ll ′ u is bounded in ]R −1 , R[ with values in C κ−2+θ 1 (B(R)). Due to the arbitrariness of R, it follows that D ll ′ u is bounded in H with values in C κ−2+θ 1 (K) for any compact sets H × K ⊂]0, +∞[×R N .
As a second step, using (4.12), we deduce that D ll ′ u is bounded in H with values in C κ−2+θ 2 (K) for any H and K as above, where θ 2 = γ(1 + θ 1 )/(3 − 2θ 1 ). Iterating this argument, we see that D ll ′ u is bounded in H with values in C κ−2+θ k (K), where the sequence {θ k } is defined by the recurrence θ k+1 = γ(1 + θ k )/(3 − 2θ k ) for k ≤ k 0 , where either k 0 = +∞ or k 0 is the largest integer such that θ k < 3/2. It is easy to see that θ k < θ k+1 holds for any k ≤ k 0 and any choice of γ ∈]0, 1[. For the choice γ = 3/4, the equation ℓ = γ(1 + ℓ)/(3 − 2ℓ) has no real solutions. This fact, combined with the monotonicity property, implies that there exists k 1 such that θ k1 > 1. It follows that D ll ′ u is bounded in H with values in C κ−2+θ (K) for any θ ∈]0, 1[ and, consequently, g j is locally bounded in ]0, +∞[ with values in C κ−2+θ b (R N ) for any θ ∈]0, 1[. The proof of Step 3 is complete.
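The behavior of the bootstrap sequence {θ k } can be checked numerically. The following sketch (purely illustrative, not part of the proof) iterates the recurrence above starting from θ 1 = γ/3 with γ = 3/4, and verifies both that the fixed-point equation has negative discriminant and that the increasing sequence exceeds 1 after finitely many steps.

```python
# Iterate the bootstrap recurrence from Step 3 (illustrative check only):
# theta_1 = gamma/3, theta_{k+1} = gamma (1 + theta_k) / (3 - 2 theta_k).
gamma = 0.75

# For gamma = 3/4 the fixed-point equation l = gamma (1 + l)/(3 - 2 l)
# rearranges to 8 l^2 - 9 l + 3 = 0, whose discriminant is negative,
# so the increasing sequence cannot stall below 1.
assert 9**2 - 4 * 8 * 3 < 0

theta, steps = gamma / 3.0, 0
while theta <= 1.0:
    theta = gamma * (1.0 + theta) / (3.0 - 2.0 * theta)
    steps += 1
print(steps, round(theta, 4))  # the sequence passes 1 after a few iterations
```

Running the loop shows that θ k exceeds 1 after fewer than ten iterations, consistent with the existence of k 1 claimed above.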
Step 4. We now show that (4.5) holds for any t ∈]0, +∞[, any f ∈ C κ b (R N ), any h ∈ N with h < κ, and any α ∈ N N 0 of length κ such that α j ≠ 0 for some j ≤ p 0 + p 1 . For this purpose, fix j ≤ p 0 + p 1 such that α j ≠ 0 and let β = α − e (N ) j . From (4.2) it follows that, for any x ∈ R N −1 , the Lipschitz seminorm of the function ψ := (D β T (t)f )(x 1 , . . . , x j−1 , ·, x j+1 , . . . , x N ) can be bounded as required, with C depending only on ω. Since the Lipschitz seminorm of the function ψ equals the sup-norm of the function (D α T (t)f )(x 1 , . . . , x j−1 , ·, x j+1 , . . . , x N ) (which is already known to exist by Steps 2 and 3), the desired estimate follows.
Step 5. We now prove (4.5) for a general f ∈ C h b (R N ) (h < κ) and any multi-index α ∈ N N 0 such that |α| = κ and α j ≠ 0 for some j ≤ p 0 + p 1 . Let us notice that we can limit ourselves to proving that the derivative D α T (t)f exists in the classical sense for any t > 0 and any f ∈ C b (R N ). Indeed, once this property is checked, estimate (4.5) can be proved arguing as in Step 4. We begin by considering the case when f ∈ BU C(R N ), and we fix a sequence {f n } ⊂ C κ b (R N ) converging to f uniformly in R N . Applying (4.5) to f n − f m for any t > 0 and any n, m ∈ N, it follows that {D α T (t)f n } is a Cauchy sequence in C b (R N ) and, consequently, the derivative D α T (t)f exists in the classical sense and is the uniform limit of {D α T (t)f n }.

Step 6. To complete the proof, we have to show that, for any multi-index α of length κ such that α j ≠ 0 for some j ≤ p 0 + p 1 , the function D α T (·)f is continuous in ]0, +∞[×R N . For this purpose, let i be the largest integer such that α i > 0. Let us fix y ∈ R N −1 and introduce the function ψ = D β u(·, y 1 , . . . , y i−1 , ·, y i+1 , . . . , y N ) where, again, β = α − e (N ) i , and still β j ′ > 0 for some j ′ ≤ p 0 + p 1 . From the results in Steps 2 to 5 we know that ψ is bounded in ]a, b[ with values in C 1+θ (B(R)) for some θ ∈]0, 1[ and any a, b, R > 0 with a < b. Applying [15, Propositions 1.1.2(iii) and 1.1.4(iii)] to the function ψ(t, ·) − ψ(s, ·) (s, t ∈ [a, b]), we immediately obtain an estimate with a constant C independent of y. Since u ∈ C 1,κ−1 (]0, +∞[×R N ), we immediately deduce that the right-hand side of this estimate vanishes as |t − s| → 0, implying that the function D α u(·, x) is continuous in [a, b], uniformly with respect to x ∈ R N . This is enough to conclude that D α u is continuous in ]0, +∞[×R N .

Remark 4.3. (i) We remark that the results proved in Theorem 4.1 are stronger than those in [24].
(ii) A direct computation shows that the bootstrap argument used in Step 3 of the proof of Theorem 4.2 cannot be applied to prove the existence of the derivative D α T (t)f in the classical sense when |α| = κ and α j = 0 for all j = 1, . . . , p 0 + p 1 .

4.1. Properties of the semigroup. In this section we first state some continuity properties of the semigroup {T (t)} that will play a fundamental role in the proof of the Schauder estimates of Section 5. Then, we characterize the domain of the weak generator of the semigroup. Since the proof of the following proposition can be obtained arguing as in [14], we omit it.
Consequently, {T (t)} can be extended to the space B b (R N ) of all bounded and Borel measurable functions f : R N → R as a semigroup of positive contractions.
In contrast with the classical case of bounded coefficients, in general the semigroup associated with an elliptic operator with unbounded coefficients is neither analytic in C b (R N ) nor strongly continuous in BU C(R N ). Assertion (ii) in Proposition 4.4, however, expresses the fact that the semigroup {T (t)} is bi-continuous for the topology τ c of locally uniform convergence (see [9, 10] or [6]) or, what is essentially the same, that it is a locally equicontinuous semigroup with respect to the mixed topology. The mixed topology is the finest locally convex topology agreeing with τ c on ‖ · ‖ ∞ -bounded sets (see [27] or [26] for the definition of the mixed topology, [6] for the equivalence of these two families of semigroups, and [28, Section IX.2] for locally equicontinuous semigroups). This allows us to associate an infinitesimal generator (A, D(A)) with the semigroup (see [9, 10]). With this definition, the infinitesimal generator (A, D(A)) is a Hille-Yosida operator, and the resolvent of A can be computed as the Laplace transform R(λ, A)f = ∫ 0 +∞ e −λt T (t)f dt, where the integral exists in the topology τ c for all positive λ. In general, one could replace here the τ c -convergence by pointwise convergence, resulting in the so-called "weak generator"; in our case, however, this would not make any difference.
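The Laplace-transform representation of the resolvent can be illustrated on a toy finite-dimensional contraction semigroup. The sketch below (all names and numerical values are our own, purely for illustration) approximates the integral by quadrature for a diagonal generator and compares it with (λ − A) −1 f .

```python
import numpy as np

# Toy illustration of R(lambda, A)f = \int_0^infty e^{-lambda t} T(t)f dt on
# the diagonal contraction semigroup T(t) = diag(e^{t a_i}) with a_i <= 0.
# The data a, lam, f and the truncation of the half-line are arbitrary choices.
a = np.array([-1.0, -2.5, -0.3])        # spectrum of the generator A
lam = 1.7                               # lambda > 0
f = np.array([1.0, -2.0, 0.5])          # plays the role of a bounded function

t = np.linspace(0.0, 60.0, 200001)      # truncated half-line [0, 60]
dt = t[1] - t[0]
integrand = np.exp(-lam * t)[:, None] * np.exp(np.outer(t, a)) * f
# composite trapezoidal rule along the t-axis
R_num = dt * (0.5 * integrand[0] + integrand[1:-1].sum(axis=0) + 0.5 * integrand[-1])

R_exact = f / (lam - a)                 # (lambda - A)^{-1} f for diagonal A
print(np.max(np.abs(R_num - R_exact)))  # small discretization/truncation error
```

Since λ − a i ≥ 2 in this example, the tail of the integral beyond t = 60 is negligible and the quadrature reproduces the resolvent to high accuracy.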
Remark 4.5. We note that assertion (iii) in Proposition 4.4 also follows directly from the first part of (ii). Actually, we even have the equivalence of these two statements; for details see, e.g., [6].
The next proposition characterizes the domain D(A).
Proposition 4.6. The characterization (4.14) of D(A) holds true. Moreover, Af = A f for any f ∈ D(A). Here and above, A f is meant in the sense of distributions.
This tells us essentially that C 2 b (R N ) is a core for the generator A with respect to the mixed topology or, what is the same, a bi-core with respect to τ c (see [10]). For the proof we use an invariance argument and need the following preparatory lemma.
Lemma 4.7. The following properties hold: (i) for any f ∈ D(A 0 ) and any t > 0, T ε (t)A ε f converges to T (t)A f locally uniformly in R N as ε → 0 + ; (ii) if {f n } ⊂ C b (R N ) is a bounded sequence converging locally uniformly to some function f ∈ C b (R N ), then, for any λ > 0, R(λ, A)f n converges to R(λ, A)f locally uniformly in R N ; (iii) for any λ > 0, R(λ, A) is a bounded operator mapping C h b (R N ) into itself for any h ∈ N such that h < κ.
Proof. (i). We begin the proof by recalling that, for any t > 0, T ε (t) and A ε commute on D(A 0 ) since they commute on On the other hand, recalling that {T ε (t)} is a contraction semigroup, we can write for any t > 0. Since A ε f converges uniformly in R N to A f as ε → 0 + , estimate (4.15) implies that T ε (t)A ε f tends to T (t)A f , locally uniformly in R N .
(ii) This is a property shared by the resolvents of generators of bi-continuous semigroups, see [9, 10]. For the sake of completeness we give the straightforward proof. Let {f n } and f be as in the statement of the lemma, and observe that, for any compact set K ⊂ R N , the difference R(λ, A)f n − R(λ, A)f can be estimated on K by splitting the Laplace-transform integral defining the resolvent.

Proof of Proposition 4.6. Taking Lemma 4.7 into account, it is easy to check that, for any f ∈ D(A 0 ), (4.16) holds for any x ∈ R N . Therefore, f ∈ D(A) and Af = A f on D(A 0 ). We could now conclude the proof by using density and the invariance of D(A 0 ) under {T (t)}, referring, e.g., to [10, Proposition 1.21] or to [19, Proposition 2.12] (the analogous statement for strongly continuous semigroups is in [5, Proposition II.1.7]). We nevertheless give a complete proof.
Let us fix f ∈ D̃ (the function space defined by the right-hand side of (4.14)) and let {f n } ⊂ C 2 b (R N ) be a sequence, bounded with respect to the sup-norm, which converges to f locally uniformly in R N and is such that the sequence {A f n } ⊂ C b (R N ) is bounded and converges locally uniformly in R N to some function g ∈ C b (R N ). By the above results, we know that (4.17) holds. Lemma 4.7(ii) allows us to take the limit as n → +∞ in (4.17), getting f = R(λ, A)(λf − g), so that f ∈ D(A) and Af = g. We claim that Af = A f (where A f is meant in the distributional sense). For this purpose, it suffices to observe that, for any ϕ ∈ C ∞ c (R N ), the identity (4.18) holds, where A ∗ is the formal adjoint of the operator A . Letting n → +∞ in (4.18), the claim follows.
We have thus proved that D̃ is contained in D(A) and that Af = A f for any f ∈ D̃. We now prove that D(A) ⊂ D̃. For this purpose, we fix f ∈ D(A) and let h ∈ C b (R N ) be such that f = R(1, A)h. By convolution, we can determine a sequence of smooth functions {h n } ⊂ C 2 b (R N ), bounded in C b (R N ) and converging locally uniformly to h as n → +∞. By Lemma 4.7(ii) and (iii), the sequence {R(1, A)h n } is contained in C 2 b (R N ) and converges to f locally uniformly in R N . Further, arguing as in the proof of (4.16), one can easily show that A R(1, A)h n = −h n + R(1, A)h n for any n ∈ N. Hence, the sequence {A R(1, A)h n } is bounded in C b (R N ) and converges to −h + f ∈ C b (R N ), locally uniformly in R N . It follows that f ∈ D̃.

Schauder estimates
In this section we prove Schauder estimates for the (distributional) solutions to the elliptic equation (5.1) and to the non-homogeneous Cauchy problem (5.2). Throughout the section, we assume that Hypotheses 2.1 are satisfied with κ equal to the least common multiple of the odd numbers between 1 and 2r + 1.
The main results of this section are collected in the following two theorems.
Such a function u is the unique distributional solution to equation (5.1) which is bounded and continuous in R N and twice continuously differentiable in R N with respect to the first p 0 variables, with bounded derivatives.
The general case, when β ∈]0, κ[ is such that β/(2j + 1) ∉ N for any j = 0, . . . , r, now follows from the interpolation theorem.

The following lemma is a straightforward consequence of the estimates in Theorem 4.1.

Lemma 5.4. For any ω > 0, there exists a positive constant C = C(ω) such that the corresponding estimate holds.

Combining Theorem 4.1 and Lemma 5.4, we can now prove the following.
The estimate (5.9) is the keystone in the proofs of Theorems 5.1 and 5.2. The candidates to be the solutions to equation (5.1) and to the non-homogeneous Cauchy problem (5.2) are, respectively, the functions R(λ, A)f and u defined by (5.12). The results in the following proposition are now a straightforward consequence of estimate (5.9) and the interpolation arguments in [16, Section 3]. For this reason we skip the proof, referring the reader to the quoted paper.
for any t ∈ [0, T 0 ], and estimate (5.4) is satisfied with some positive constant C independent of f and g.
We can now complete the proofs of Theorems 5.1 and 5.2.
Proof of Theorem 5.1. By Proposition 4.6, we know that Aψ = A ψ for any ψ ∈ D(A), where A ψ is meant in the sense of distributions. Hence, the resolvent equation immediately implies that the function R(λ, A)f is a distributional solution of equation (5.1). Moreover, by Proposition 5.6(i), R(λ, A)f ∈ C 2+θ (R N ) and satisfies estimate (5.3). As a byproduct, Proposition 3.1(i) implies that R(λ, A)f is the unique distributional solution to equation (5.1) satisfying the properties in the statement of Theorem 5.1. The proof is now complete.
Proof of Theorem 5.2. The uniqueness part of the statement follows immediately from the maximum principle in Proposition 3.1(ii). Moreover, by virtue of Proposition 5.6, we can limit ourselves to proving that the convolution term in (5.12), which we simply denote by v, is a distributional solution to (5.2) with f ≡ 0. For smooth g with compact support this is an easy and classical argument using the variation of constants. For the general case, we pick a sequence {g n } of smooth functions, bounded in the sup-norm and converging locally uniformly in [0, T 0 ] × R N to g. Moreover, for any n ∈ N, we denote by v n the convolution function defined as v, but with g replaced by g n . As already indicated above, a straightforward computation, based on estimate (4.2) with |α| = 2 and h = 2, shows that v n is a classical solution to problem (5.2) (with f ≡ 0 and g replaced by g n ). Moreover, its sup-norm may be bounded by a positive constant independent of n and, by Proposition 4.4, v n converges to v pointwise in [0, T 0 ] × R N . Now, we observe that, for any smooth test function, the corresponding integral identity holds, where A ∗ is the formal adjoint of the operator A . Letting n → +∞, we deduce that v is a distributional solution of (5.2) with f ≡ 0.

Proof. We will show that (i) ⇔ (ii), (ii) ⇔ (iii), (ii) ⇔ (iv), (iii) ⇔ (v). We preliminarily note that both W (x) and W r (x) are independent of x, so that, in the rest of the proof, we simply write W and W r instead of W (x) and W r (x). (i) ⇔ (ii): To prove this equivalence, it suffices to observe that, for any x ∈ R N , the set W (x) is the largest subspace of Ker(Q(x)) which is invariant for B ∗ .
(ii) ⇔ (iv): Let us fix t > 0, x ∈ R N and let ξ ∈ R N be such that ⟨Q t (x)ξ, ξ⟩ = 0. This implies that ⟨e sB Q(x)e sB ∗ ξ, ξ⟩ = 0 for any s ∈ [0, t]. Hence, Q(x)e sB ∗ ξ = 0 for any s as above. Now, Q(x)e sB ∗ ξ = 0 for any s ∈ [0, t] if and only if Q(x)(B ∗ ) k ξ = 0 for any k ∈ N 0 , that is, if and only if ξ ∈ W . The equivalence between (ii) and (iv) follows immediately.
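A concrete (and entirely illustrative) instance of this equivalence is the classical Kolmogorov example in R 2 , with Q = diag(1, 0) singular and drift Bx = (0, x 1 ): the Kalman matrix [Q 1/2 , BQ 1/2 ] has full rank, and Q t = ∫ 0 t e sB Q e sB ∗ ds satisfies det Q t = t 4 /12 > 0 for every t > 0. The sketch below checks both facts numerically; all choices in it are our own example, not taken from the paper.

```python
import numpy as np

# Kolmogorov example in R^2 (illustrative): Q singular, drift Bx = (0, x_1).
Q = np.array([[1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])

# Kalman rank condition: rank [Q^{1/2}, B Q^{1/2}] = 2.
Qh = np.sqrt(Q)                           # Q^{1/2} for this diagonal Q
kalman = np.hstack([Qh, B @ Qh])
assert np.linalg.matrix_rank(kalman) == 2

# Q_t = \int_0^t e^{sB} Q e^{sB*} ds by quadrature; B^2 = 0, so e^{sB} = I + sB.
t_final = 0.8
s = np.linspace(0.0, t_final, 20001)
ds = s[1] - s[0]
Qt = sum((np.eye(2) + si * B) @ Q @ (np.eye(2) + si * B).T * ds for si in s)
# trapezoidal end-point correction for the Riemann sum above
Qt -= 0.5 * ds * (Q + (np.eye(2) + t_final * B) @ Q @ (np.eye(2) + t_final * B).T)

print(np.linalg.det(Qt), t_final**4 / 12)  # both equal t^4/12 > 0
```

The positivity of det Q t for all t > 0, despite det Q = 0, is exactly the hypoellipticity mechanism described in the introduction.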
The following lemma plays a crucial role in the proof of Theorem 3.3.
where e (N ) j denotes the j-th element of the canonical basis of R N .

Proof. By using the chain rule and taking the structure of the matrix B in (2.3) into account, it is easy to see that (A.3) holds for any multi-index α ∈ N N 0 and any x ∈ R N . By definition, we have i ℓ(m) = (0, . . . , 0, 1, d j1 − 1, . . . , d j2 , . . . , d j k , 0, . . . , 0). In (A.3), consider all the possible multi-indices α ∈ N N 0 with α ≤ i ℓ(m) . We see immediately that the commutator [D i ℓ(m) , ⟨B·, D⟩]w is given by the right-hand side of (A.1) for some matrices J (l) m ; we claim that each J (l) m has full rank, which equals the number of its columns. We split the rest of the proof into two steps.
Step 1. First, we show that we can make some reductions. More precisely, we show that, without loss of generality, we can limit ourselves to proving the assertion for a generic smooth function w in two particular situations, in which we have not written out the terms that do not contribute to J (l) m . This means that we can argue for the function D γ w, hence assuming (ii), and the general case will follow as well.
where [B ∗ j1 ] −s denotes the matrix obtained from B ∗ j1 by dropping its first s columns. The block matrix above is block lower triangular and has full rank, since all its diagonal blocks do; its rank equals the number of its columns. Thus J (l) m , which is similar to the above block matrix, has the asserted properties.
The following two lemmas are used in the forthcoming proof of the maximum principle of Proposition 3.1.
Lemma A.3. For the first-order differential operator B, formally defined by Bu(x) = ⟨Bx, Du(x)⟩ for any x ∈ R N and any u ∈ C(R N ), where Du is meant in the sense of distributions, the following hold: (i) for any u ∈ BU C(R N ) such that Bu ∈ C(R N ), there exists a sequence {u n } of smooth functions, converging to u uniformly in R N , such that Bu n ∈ C b (R N ) for any n ∈ N and Bu n converges to Bu locally uniformly in R N . In particular, if u is compactly supported in R N , then u n is compactly supported in supp(u) + B(1) for any n ∈ N.

Proof. (i) For any n ∈ N, let u n = u ∗ ̺ n , where ̺ n = n N ̺(n·) and ̺ ∈ C ∞ c (B(1)) is a positive function with ‖̺‖ L 1 (R N ) = 1; here "∗" denotes the convolution operator. As is immediately checked, the function u n is smooth and converges to u uniformly in R N . Moreover, if u is compactly supported in R N , then each function u n is compactly supported in supp(u) + B(1).
To prove that Bu n converges to Bu locally uniformly in R N , we observe that Bu n = Bu ∗ ̺ n + Tr(B)u n + u ∗ B̺ n . (A.6) This is enough for our aims. Indeed, as is immediately seen, u ∗ B̺ n converges to −Tr(B)u uniformly in R N . It follows that the right-hand side of (A.6) tends to Bu as n → +∞, locally uniformly in R N . Formula (A.6) is immediately checked in the particular case when u ∈ C 1 b (R N ) by means of a straightforward computation, based on an integration by parts. To prove it for any u ∈ C b (R N ), it suffices to write it with u n and u replaced, respectively, by u m n = v m ∗ ̺ n and v m , where {v m } is a sequence of smooth functions converging to u uniformly, and then take the pointwise limit as m → +∞. Indeed, it is immediate to check that u m n , Bu m n and v m ∗ B̺ n converge, respectively, to u n , Bu n and u ∗ B̺ n , locally uniformly in R N , as m → +∞. Moreover, since Bv m converges to Bu in the sense of distributions, Bv m ∗ ̺ n converges to Bu ∗ ̺ n pointwise in R N as m → +∞.
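The mollifier construction used here can be visualized numerically. In the 1D sketch below (a toy example of our own, not from the paper), a Lipschitz but non-smooth u is convolved with ̺ n = n̺(n·) for a normalized bump ̺ supported in ]−1, 1[, and the uniform error sup |u n − u| decays like 1/n, in line with the standard estimate sup |u n − u| ≤ [u] Lip /n.

```python
import numpy as np

def bump(y):
    # standard smooth bump supported in (-1, 1), not yet normalized
    out = np.zeros_like(y)
    inside = np.abs(y) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - y[inside] ** 2))
    return out

def trap(vals, grid):
    # composite trapezoidal rule on a uniform grid
    h = grid[1] - grid[0]
    return h * (vals.sum() - 0.5 * (vals[0] + vals[-1]))

y = np.linspace(-1.0, 1.0, 4001)
Z = trap(bump(y), y)                     # normalization: ||rho||_{L^1} = 1

def mollify(u, xs, n):
    # u_n(x) = (u * rho_n)(x), rho_n = n rho(n .), supported in (-1/n, 1/n)
    s = np.linspace(-1.0 / n, 1.0 / n, 2001)
    w = n * bump(n * s) / Z
    return np.array([trap(u(x - s) * w, s) for x in xs])

u = lambda x: np.abs(np.sin(x))          # Lipschitz but not C^1 (kink at 0)
xs = np.linspace(-3.0, 3.0, 601)
errs = [float(np.max(np.abs(mollify(u, xs, n) - u(xs)))) for n in (5, 20, 80)]
print(errs)                              # decreasing, O(1/n) as n grows
```

The error is concentrated near the kink of u, exactly where u fails to be differentiable, and shrinks as the support of ̺ n collapses.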
(ii) The proof is similar to the previous one. We extend u to ] − ∞, 0[×R N by setting ũ(t, x) = u(−t, x) for such (t, x). Next, we approximate ũ by the sequence {ũ n } obtained by taking the convolution of ũ with a standard sequence {̺ n } of mollifiers in R N +1 . Using the same approximation argument as in the proof of part (i), one can show that D t ũ n − Bũ n = (D t u − Bu) ∗ ̺ n − Tr(B)ũ n − ũ ∗ B̺ n in [a, +∞[×R N for any positive number a such that na > 1. Letting n → +∞, it is easy to check that D t ũ n − Bũ n converges to D t u − Bu locally uniformly in ]0, +∞[×R N .