Regularity of density for SDEs driven by degenerate Lévy noises

By using Bismut's approach to the Malliavin calculus with jumps, we study the regularity of the distributional density for SDEs driven by degenerate additive Lévy noises. Under a full Hörmander condition, we prove the existence of the distributional density and its weak continuity in the first variable. Under a uniform first-order Lie bracket condition, we also prove the smoothness of the density.


Theorem 1.1. Assume that (H^α_1) holds, that b is smooth with bounded derivatives of all orders, and that for any x ∈ R^d and some n = n(x) ∈ N,

Rank[A_1, B_1(x)A_1, ···, B_n(x)A_1, A_2, B_1(x)A_2, ···, B_n(x)A_2] = d. (1.4)

Then X_t(x) admits a density ρ_t(x, y) with respect to the Lebesgue measure, so that for any bounded measurable function f, E f(X_t(x)) = ∫_{R^d} f(y) ρ_t(x, y) dy. In particular, the semigroup (P_t)_{t≥0} has the strong Feller property.
Remark 1.2. When A_1 = 0 and b(x) = Bx is linear, condition (1.4) is called the Kalman rank condition. In this case, the smoothness of the density of the corresponding Ornstein-Uhlenbeck process has been studied in [16, 9].
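In the linear case of Remark 1.2, condition (1.4) reduces to controllability of the pair (B, A). A minimal numerical sketch of checking the Kalman rank condition (the matrices B and A below are illustrative choices, not from the paper):

```python
import numpy as np

def kalman_rank(B, A):
    """Rank of the controllability matrix [A, BA, B^2 A, ..., B^{d-1} A]."""
    d = B.shape[0]
    blocks, M = [], A
    for _ in range(d):
        blocks.append(M)
        M = B @ M
    return np.linalg.matrix_rank(np.hstack(blocks))

# Degenerate noise: only the first coordinate is forced directly,
# but the drift matrix B propagates it to the second coordinate.
B = np.array([[0.0, 0.0], [1.0, 0.0]])
A = np.array([[1.0], [0.0]])
print(kalman_rank(B, A))  # 2 = d, so the rank condition holds
```

With B = 0 the same A gives rank 1, i.e. the noise stays confined to the first coordinate and the condition fails.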
Concerning the smoothness of the density, we have the following partial result.

(1.7)
where ∇^k denotes the k-th order gradient operator. In particular, if m = ∞, then X_t(x) admits a smooth density ρ_t(x, y) for every t > 0. (1.8)
In the continuous diffusion case (i.e., A_2 = 0 and A_1 = A_1(x)), under Hörmander's conditions, Malliavin [13] proved that SDE (1.1) has a smooth density by using the stochastic calculus of variations (nowadays called the Malliavin calculus; for a systematic introduction we refer to the book [14]). Since the pioneering work [13], many works have been devoted to extending Malliavin's theory to the jump case (cf. [5, 4, 15, 8], etc.). However, unlike the case of continuous Brownian functionals, there is no unified treatment for Poisson functionals, since the canonical Poisson space has a nonlinear structure. We mention that Bismut's approach is based on the Girsanov transformation (cf. [5]), while Picard's approach uses the difference operator to establish an integration by parts formula (cf. [15]).
When A_1 = 0 and κ(z) = c|z|^{−d−α}, Theorems 1.1 and 1.3 were proved in [22] and [7] by using the Malliavin calculus for subordinated Brownian motions (cf. [11]). Concerning the smoothness of the distributional density for degenerate SDEs driven by purely jump noises, Takeuchi [20], Cass [6] and Kunita [10] have already studied this problem under various Hörmander-type conditions; however, their results do not cover the present general case (see also [23, 24, 21] for related works). Compared with [22] and [7], in this work we use Bismut's approach to prove Theorems 1.1 and 1.3, and we need to assume that the Lévy measure is absolutely continuous with respect to the Lebesgue measure. Note that in [7] the Lévy measure may be singular and the drift is allowed to grow arbitrarily, which cannot be handled in the current setting.
In the proof of our main theorems, one of the difficulties we face is that L_t may have infinite moments. To overcome this difficulty, we consider two independent Lévy processes L^0_t and L^1_t with Lévy measures ν_0(dz) := 1_{|z|<1} κ(z) dz and ν_1(dz) := 1_{|z|≥1} ν(dz), respectively. Clearly, L_t has the same law as L^0_t + L^1_t. Notice that L^1_t is a compound Poisson process. Let 0 =: τ_0 < τ_1 < τ_2 < ··· < τ_n < ··· be the jump times of L^1_t. It is well known that E := {τ_n − τ_{n−1}, n ∈ N} and G := {ΔL^1_{τ_n} := L_{τ_n} − L_{τ_n−}, n ∈ N} are two independent families of i.i.d. random variables. Let ℓ_t be a càdlàg purely discontinuous R^d-valued function with finitely many jumps and ℓ_0 = 0. Following the argument of [22, Subsection 3.3], we consider the following SDE for X_t(x; ℓ). Clearly, we then have (see [22, (3.19)]) where for a function g(x) and y ∈ R^d, ϑ_y g(x) := g(x + y).
Based on (1.9) and arguing as in [22, Subsection 3.3], it suffices to prove Theorems 1.1 and 1.3 for X_t(x; 0); that is, we only need to consider the SDE (1.1) driven by W_t and L^0_t. This paper is organised as follows: in Section 2, we recall Bismut's approach to the Malliavin calculus with jumps. In [4], Bichteler, Gravereaux and Jacod systematically developed this approach; however, α-stable-like noises do not fall into their framework. Thus, we have to extend the integration by parts formula to a more general class of Lévy measures. Moreover, we also prove a Kusuoka-Stroock formula for Poisson stochastic integrals. In Section 3, we introduce the reduced Malliavin matrix for SDE (1.1) used in Bismut's approach (cf. [4]), and give some necessary estimates. In Sections 4 and 5, we prove Theorems 1.1 and 1.3, respectively.
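The splitting of L_t into a small-jump part L^0_t and a compound Poisson part L^1_t described above can be illustrated numerically. A minimal one-dimensional sketch of the big-jump part, assuming an α-stable-like density κ(z) = c|z|^{−1−α} (the values of c and α are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# One-dimensional illustration with alpha-stable-like density
# kappa(z) = c |z|^{-1-alpha}; c and alpha are illustrative choices.
alpha, c = 1.5, 1.0
lam = 2 * c / alpha          # nu({|z| >= 1}) = integral of kappa over |z| >= 1

def big_jump_part(T):
    """Simulate the compound Poisson process L^1 on [0, T]:
    jump times tau_n with i.i.d. exponential gaps, and i.i.d. jumps
    drawn from the normalized tail of kappa (symmetrized Pareto)."""
    t, times, jumps = 0.0, [], []
    while True:
        t += rng.exponential(1.0 / lam)   # tau_n - tau_{n-1} ~ Exp(lam)
        if t > T:
            break
        u = rng.random()
        size = (1 - u) ** (-1.0 / alpha)  # |jump| has Pareto(alpha) law on [1, inf)
        sign = rng.choice([-1.0, 1.0])    # symmetric Levy measure
        times.append(t)
        jumps.append(sign * size)
    return np.array(times), np.array(jumps)

times, jumps = big_jump_part(T=10.0)
print(len(times), "big jumps; all of size >= 1:", bool(np.all(np.abs(jumps) >= 1)))
```

By construction every jump of L^1 has modulus at least 1, so all moments of L^0 (whose jumps are bounded by 1) are finite, which is exactly the point of the decomposition.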

Function spaces.
Let p ≥ 1. We introduce the following spaces for later use.
Lemma 2.2. (i) For any p ≥ 1, the spaces (H_p, ‖·‖_{H_p}) and (V_p, ‖·‖_{V_p}) are Banach spaces.
(iii) We only prove the density of V_0 in V_p, i.e., for each v ∈ V_p there exists a sequence v_n ∈ V_0 such that lim_{n→∞} ‖v_n − v‖_{V_p} = 0. We construct the approximation in three steps.
(1) For ε ∈ (0, 1), define Notice that for ε ∈ (0, 1) and R > 1, where ̺(z) is defined by (2.1). By the dominated convergence theorem, we have
(2) Next we may assume that for some compact set U ⊂ Γ_0, v(s, z) = 0 for z ∉ U. By (2.7) and Remark 2.1, it is easy to see that
(3) Lastly we assume that v is smooth in z and satisfies (2.7). For R > 1, we construct v_R(s, z). Clearly, v_R ∈ V_0. By (2.9) and the dominated convergence theorem, we have The proof is complete.
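Step (3) above smooths v in the z variable by mollification. The following one-dimensional sketch illustrates why the mollified function converges back to v as the bump width shrinks (the Gaussian bump and the test function are illustrative choices, not the paper's cutoffs χ_ε, χ_R):

```python
import numpy as np

def mollify(v, grid, delta):
    """Convolve samples of v on a uniform grid with a normalized
    Gaussian bump of width delta (support truncated at 4*delta)."""
    dz = grid[1] - grid[0]
    m = max(1, int(4 * delta / dz))
    k = np.arange(-m, m + 1) * dz
    bump = np.exp(-k**2 / (2 * delta**2))
    bump /= bump.sum()
    return np.convolve(v, bump, mode="same"), m

grid = np.linspace(-2.0, 2.0, 4001)
v = np.abs(grid)                       # Lipschitz in z but not smooth at 0
errs = []
for delta in (0.1, 0.01):
    v_d, m = mollify(v, grid, delta)
    errs.append(np.max(np.abs(v_d - v)[m:-m]))   # ignore boundary effects
print(errs[0] > errs[1])               # error decreases as delta -> 0
```

The mollified function is smooth for every fixed width, and the sup-error away from the boundary shrinks with the width, mirroring the dominated-convergence argument in the proof.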
For p > 1, by Itô's formula, we have By Doob's maximal inequality and Young's inequality, we further have which together with (2.12) gives (2.10).
(ii) As above, for p ≥ 2, by Taylor's expansion, we have which in turn gives (2.11).
For v ∈ V_0 and ε > 0, define The following lemma is elementary.
Lemma 2.4. For any v ∈ V_0 with compact support U ⊂ Γ_0 with respect to z, there exist ε_0 > 0 and a constant C > 0 such that for any ε ∈ (0, ε_0) and all t, z, Proof. For any z ∈ U, since v and ∇_z v are bounded, we have which gives the desired estimate (2.13) by the compactness of U and κ ∈ C^1(Γ_0; (0, ∞)). As for (2.14), it follows by a direct calculation.
For p ≥ 1 and Θ : By Burkholder's inequality and (2.2), we have Let Q^ε_t solve the following SDE: whose solution is explicitly given by the Doléans-Dade formula: Proof. For any p ≥ 2, by (2.17), (2.13) and (2.10), we have From this and (2.17), one sees that Q^ε_t is a nonnegative martingale with E Q^ε_t = 1. For (2.18), by equation (2.17), Thus, by Burkholder's inequality and Lemma 2.4, we obtain (2.18).
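For the reader's convenience, recall the general form of the Doléans-Dade (stochastic) exponential: for a semimartingale M with M_0 = 0, the unique solution of dQ_t = Q_{t−} dM_t, Q_0 = 1, is

```latex
Q_t \;=\; \mathcal{E}(M)_t
     \;=\; \exp\!\Big(M_t - \tfrac{1}{2}\langle M^{c}\rangle_t\Big)
     \prod_{0 < s \le t} \big(1 + \Delta M_s\big)\, e^{-\Delta M_s} .
```

If, as in (2.17), the driving martingale is purely discontinuous, the continuous-bracket term ⟨M^c⟩ vanishes and only the product over jumps remains.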
Let C^∞_p(R^m) be the class of all smooth functions on R^m which, together with all their derivatives, have at most polynomial growth. Let FC^∞_p be the class of all Wiener-Poisson functionals on Ω of the following form: where f ∈ C^∞_p(R^{m_1+m_2}), h_1, ···, h_{m_1} ∈ H_0 and g_1, ···, g_{m_2} ∈ V_0 are non-random, and where "(·)" stands for w(h_1), ···, w(h_{m_1}), μ(g_1), ···, μ(g_{m_2}). By Hölder's inequality and (2.11), it is easy to see that for any p ≥ 1, where Θ_ε is defined by (2.20). Thus, D_Θ F is well defined, i.e., it does not depend on the representation of F. We have that (D_Θ, FC^∞_p) is closable in L^p for any p > 1. Proof. (i) is standard by a monotone class argument.
(ii) We first assume Θ = (h, v) ∈ H_0 × V_0. By (2.22) and Theorem 2.6, we have By the definition of D_Θ F, it is easy to see that Moreover, by (2.16) we also have
(iii) Fix p > 1. Let F_n be a sequence in FC^∞_p converging to zero in L^p, and suppose that D_Θ F_n converges to some ξ in L^p. We want to show that ξ = 0. For any G ∈ FC^∞_p, noticing that F_n G ∈ FC^∞_p, by Hölder's inequality we have By (i), we obtain ξ = 0. The proof is complete.
Definition 2.8. For given Θ = (h, v) ∈ H_{∞−} × V_{∞−} and p > 1, we define the first-order Sobolev space W^{1,p}_Θ as the completion of FC^∞_p in L^p(Ω, F, P) with respect to the norm:
We have the following integration by parts formula.
Moreover, we also have the following chain rule.

Kusuoka-Stroock formula.
In this subsection we establish a commutation formula between the gradient and Poisson stochastic integrals. On Wiener space this formula was given by Kusuoka and Stroock [12]; on configuration space a similar formula was proved in [18]. Proof. (i) First of all, we assume that η(s, z) = 1_{(t_0, t_1]}(s) η(z), where η(z) is F_{t_0}-measurable, satisfies (2.27), and z → η(z) has compact support U ⊂ Γ_0. (2.29) For n ∈ N, let D_n be the grid of R^d with step 2^{−n}. For a point z ∈ R^d, let φ_n(z) be the lower-left corner point in D_n closest to z. For ε ∈ (0, 1) and R > 1, let χ_ε and χ_R be defined by (2.3) and (2.4). For δ ∈ (0, 1), let η_δ(z) be defined as in (2.8), and define η^{δ,n}_{ε,R}(ω, y) := χ_ε(y) χ_R(y) From this definition, we can write where ξ_j ∈ W^{1,∞−}_Θ is F_{t_0}-measurable and g_j is smooth with support By definition (2.21), it is easy to check (2.28) for I(η^{δ,n}_{ε,R}). Thus, for proving (2.28), by Lemma 2.3 it suffices to prove that for any p > 1, We only prove the second limit; the first is similar. For fixed ε, R, set η_{ε,R} := χ_ε χ_R η. Since η^{δ,n}_{ε,R}(z) = η_{ε,R}(z) = 0 for z ∉ U_{ε,R}, by Remark 2.1 and Hölder's inequality, we have On the other hand, since η has compact support U, by (2.27) and the dominated convergence theorem, we have Combining (2.31) and (2.32), we obtain (2.30).
(ii) Next we assume that for some compact set U ⊂ Γ_0, η(s, z) = 0 for z ∉ U. In this case, we have By Lemma 2.3 and (2.33), for any p ≥ 2, we have By the assumptions and the dominated convergence theorem, both terms converge to zero as n → ∞, and we obtain (2.28).

Reduced Malliavin matrix for SDEs driven by Lévy noises
As discussed in the introduction, in the remainder of this paper we shall assume that Since ν(dz) is symmetric, by the Lévy-Itô decomposition we can write

By Proposition 2.11, for any
Let X_t = X_t(x) solve the following SDE: Proof. Consider the Picard iteration: X^0_t = x and, for n ∈ N, It is by now standard to prove that for any t ≥ 0 and p ≥ 1, By Gronwall's inequality, it is easy to prove that for any T > 0 and p ≥ 1, Let Y_t solve the following SDE: Thus, by (3.3) we have X_t ∈ W^{1,p}_Θ and D_Θ X_t = Y_t. The proof is complete.
Let J_t := J_t(x) := ∇X_t(x) be the Jacobian matrix and K_t := K_t(x) := J_t^{−1}(x). Then J_t and K_t solve the ODEs (3.4), and it is easy to see that By (3.2) and the variation of constants formula, we have
Below, let ζ(z) be a nonnegative smooth function with where η_l(z) := ∂_l ζ(z) + ζ(z) ∂_l log κ(z). In particular, for any p ≥ 2, Proof. Since d(z, Γ^c_0) ≤ |z| ∧ (1 − |z|), by (3.5) and (3.7) it is easy to check that Θ_j(x) ∈ H_{∞−} × V_{∞−}. Moreover, by definition (2.15) we immediately have (3.8). As for (3.9), it follows by (2.16).
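Since the noise is additive, the Jacobian flow J_t = ∇X_t(x) and its inverse K_t solve the linear random ODEs dJ_t = ∇b(X_t) J_t dt and dK_t = −K_t ∇b(X_t) dt. A minimal Euler sketch of this pair (the drift b is an illustrative choice; the noise is omitted from the code since it does not enter the Jacobian equations for additive noise):

```python
import numpy as np

def jacobian_flow(b, b_grad, x0, T=1.0, n=10000):
    """Euler scheme for X_t together with J_t = grad X_t and K_t = J_t^{-1}:
    dJ = grad b(X) J dt,  dK = -K grad b(X) dt."""
    d = len(x0)
    X, J, K = x0.copy(), np.eye(d), np.eye(d)
    dt = T / n
    for _ in range(n):
        G = b_grad(X)
        X = X + b(X) * dt
        J = J + G @ J * dt
        K = K - K @ G * dt
    return X, J, K

b = lambda x: np.array([x[1], -np.sin(x[0])])           # illustrative drift
b_grad = lambda x: np.array([[0.0, 1.0], [-np.cos(x[0]), 0.0]])
X, J, K = jacobian_flow(b, b_grad, np.array([0.5, 0.0]))
print(np.max(np.abs(J @ K - np.eye(2))))  # K_t = J_t^{-1} up to Euler error
```

The identity d(K_t J_t) = 0 explains why the two ODEs produce mutually inverse matrices, which is what the reduced Malliavin matrix construction relies on.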

Lemma 4.2. There exists a subsequence n_m → ∞ such that P(Ω
where Proof. By ν(dz) = ν(−dz) and Doob's maximal inequality, we have Similarly, we also have The proof is complete.
By [17, p. 64], we have By Itô's formula, we also have Then P(Ω^V_4) = 1. Now we can prove the following key lemma.
Assume that (4.1) holds for some n = n(x) ∈ N, and set Ω̃ := ∩ Ω_0. Then by Lemmas 4.1-4.4, we have P(Ω̃) = 1. We want to prove that under (4.1), for each t > 0, the reduced Malliavin matrix Σ_t(x, ω) is invertible for each ω ∈ Ω̃. Without loss of generality, we assume t = 1 and fix ω ∈ Ω̃. For simplicity of notation, we drop (x, ω) below. By (3.10), for a row vector u ∈ R^d we have Suppose that for some u ∈ S^{d−1}, Since s → K_s is continuous and ω ∈ Ω_0, we have Hence, by (3.4) we have which implies that Now we use induction to prove (4.2). Suppose that (4.2) holds for some n ∈ N. In view of ω ∈ Ω^{B_n}_4, we have for all t ∈ [0, 1], By the induction hypothesis and the definition of H^{B_n}_t, we further have which together with ω ∈ Ω^{B_n}_3 implies that In particular, Since ω ∈ Ω^{B_n}_2, we also have u M^{B_n}_t A_i = 0 for all t ∈ [0, 1], which together with (4.3) implies that u K_s B_{n+1} A_i(X_s) = 0 for all s ∈ [0, 1].
Thus, we obtain u A_i = u B_1 A_i = ··· = u B_n A_i = 0, i = 1, 2, which contradicts (4.1). The proof is complete. 4.2. Proof of Theorem 1.1. We can now finish the proof of Theorem 1.1 by the same argument as in [7]. We divide the proof into two steps.
(1) Below we fix t > 0 and x ∈ R^d. For each n ∈ N, define a finite measure μ_n by For each ϕ ∈ C^∞_b(R^d), by the chain rule and (3.11), we have where ∇ = (∂_1, ···, ∂_d). Thus, by the integration by parts formula (2.24), we have for i = 1, ···, d, where
From this and Lemma 3.3, by cumbersome calculations we derive that where C_n is independent of t, x. Hence, μ_n is absolutely continuous with respect to the Lebesgue measure (cf. [14]), and by the Sobolev embedding theorem (cf. [1]), the density p_n(y) satisfies, for any q ∈ [1, d/(d − 1)), where the constant C_{d,q,n} is independent of t, x. Therefore, for any Borel set F ⊂ R^d and R > 0, we have where m is the Lebesgue measure and q > 1. In particular, for any Lebesgue-null set A ⊂ R^d, By Lemma 4.5 and the dominated convergence theorem, we obtain that for any Lebesgue-null set A ⊂ R^d, P(X_t(x) ∈ A) = 0, which means that the law of X_t is absolutely continuous with respect to the Lebesgue measure.
(2) Let χ_n ∈ C^∞(R^d) be a smooth function with Let f be a bounded nonnegative measurable function. By Lusin's theorem, for any ε > 0 there exist a set F_ε ⊂ {x ∈ R^d : |x| < n + 1} and a nonnegative continuous function g ∈ C_c(R^d) such that Let μ_{t,x;n} be defined by μ_{t,x;n}(A) := E[1_A(X_t(x)) Φ_n(Σ_t(x))], A ∈ B(R^d).
By the dominated convergence theorem and (4.6), we have for any R > 0, First letting ε → 0 and then R → ∞, we obtain for n ∈ N, lim_{x→x_0} E[(f χ_n)(X_t(x)) Φ_n(Σ_t(x))] ≤ E f(X_t(x_0)). (4.8) On the other hand, by the definition (3.10) of Σ_t(x), it is easy to see that x → (X_t(x), Σ_t(x)) is continuous in probability.
Thus, by the dominated convergence theorem and (4.8), we have which, by Lemma 4.5 and letting n → ∞, implies lim sup_{x→x_0} E f(X_t(x)) ≤ E f(X_t(x_0)).
Applying the above limit to the nonnegative function ‖f‖_∞ − f, we also have the reverse inequality. Thus, we obtain the desired continuity (1.5).
Lemma 5.1. Let Y_t = y + ∫_0^t β_s ds be an R^d-valued process, where β_t takes the following form: where γ_t : R_+ → R^d, Q_t : R_+ → R^d ⊗ R^d and g_t(z) : R_+ × R^d → R^d are three left-continuous F_t-adapted processes. Suppose that for some R > 0, |β_t|, |Q_t|, |γ_t| ≤ R and |g_t(z)| ≤ R(1 ∧ |z|). Then there exists a constant C ≥ 1 such that for any t ∈ (0, 1), δ ∈ (0, 1/3) and ε ∈ (0, t/3), The following lemma is simple. We also need the following estimate, where ζ(z) is defined by (3.7).