Sobolev embeddings for kinetic Fokker-Planck equations

We introduce intrinsic Sobolev-Slobodeckij spaces for a class of ultra-parabolic Kolmogorov type operators satisfying the weak H\"ormander condition. We prove continuous embeddings into Lorentz and intrinsic H\"older spaces. We also prove approximation and interpolation inequalities by means of an intrinsic Taylor expansion, extending analogous results for H\"older spaces. The embedding at first order is proved by adapting a method by Luc Tartar which only exploits scaling properties of the intrinsic quasi-norm, while for higher orders we use uniform kernel estimates.


Introduction
In this paper we develop a functional framework for the study of kinetic Fokker-Planck equations. Specifically, we introduce intrinsic Sobolev spaces suitably related to a system of Hörmander vector fields: our main results are embedding, interpolation and approximation theorems, which are basic tools in many problems concerning partial differential equations.
Let $(t,x)$ denote a point in $\mathbb{R}\times\mathbb{R}^N$ and, for fixed $d\le N$, consider the vector fields
$$\partial_{x_1},\dots,\partial_{x_d},\qquad Y := \langle Bx,\nabla_x\rangle + \partial_t, \tag{1.1}$$
where $B$ is a constant $N\times N$ matrix and $\nabla_x = (\partial_{x_1},\dots,\partial_{x_N})$. We assume that Hörmander's condition is satisfied:
$$\operatorname{rank}\,\operatorname{Lie}(\partial_{x_1},\dots,\partial_{x_d},Y) = N+1. \tag{1.2}$$
The classical example we have in mind is the Langevin kinetic model, given by the system of stochastic differential equations
$$\begin{cases} dV_t = dW_t,\\ dP_t = V_t\,dt, \end{cases} \tag{1.3}$$
where $W$ is a $d$-dimensional Brownian motion. Here the processes $V$ and $P$ represent the velocity and position of a system of $d$ particles. The forward Kolmogorov (or Fokker-Planck) operator of (1.3), written in terms of the variables $x=(v,p)\in\mathbb{R}^d\times\mathbb{R}^d$, is in the form of a sum of squares of the vector fields $\partial_{v_1},\dots,\partial_{v_d}$ plus a drift (or transport term) $Y_0$, precisely
$$\frac{1}{2}\sum_{i=1}^{d}\partial_{v_i}^2 - Y_0,\qquad Y_0 := \langle v,\nabla_p\rangle + \partial_t. \tag{1.4}$$
In this example, $N=2d$ and
$$B = \begin{pmatrix} 0_d & 0_d\\ I_d & 0_d \end{pmatrix},$$
where $I_d$ and $0_d$ denote the $d\times d$ identity and null matrices respectively. Operator (1.4) satisfies Hörmander's condition, is hypoelliptic and has a Gaussian fundamental solution that is the transition density of the Markov process $(V,P)$ in (1.3).
The literature on generalized Sobolev spaces for Hörmander vector fields is vast (see, for instance, [25], [26], [10], [31], [17]). When dealing with the regularity properties of PDEs modeled on the vector fields (1.1)-(1.2), such as the equation in divergence form
$$\sum_{i,j=1}^{d}\partial_{x_i}\big(a_{ij}\,\partial_{x_j}u\big) - Yu = 0, \tag{1.5}$$
it is standard to assign a formal weight to each of the vector fields, namely one for the diffusion directions $\partial_{x_1},\dots,\partial_{x_d}$ and two for the drift $Y$; also, consistently with the structure of the equation, $Y$ should be interpreted as a second order derivative in the intrinsic sense. As noted earlier in [25], among others, this fact raises a question about the role of $Y$ in the definition of the first order intrinsic Sobolev space $W^{1,p}$: indeed, in the degenerate case $d<N$, the regularity properties of (1.5) rely strongly on Hörmander's condition and involve the second order derivative $Y$ in a crucial way. Many remarkable results have been proven for weak solutions of (1.5), defined as functions $u$ such that $Yu$ belongs to $L^2$, in addition to the minimal assumptions $u,\partial_{x_1}u,\dots,\partial_{x_d}u\in L^2$ needed to write equation (1.5) in the sense of distributions: we refer for instance to [3], [4], [19], [5], [6], [1] and [9]. In [22] a first $L^2$-$L^\infty$ estimate was proven by using Moser's approach; moreover, in [12] a Harnack inequality for kinetic Fokker-Planck equations with rough coefficients was proven by extending the De Giorgi-Nash-Moser theory.
In Section 2 we introduce intrinsic Sobolev-Slobodeckij spaces for (1.1), denoted by $W^{k,p}_B$, where at first order (i.e. $k=1$) the vector field $Y$ appears as a fractional derivative of order $1/2$: this approach is coherent with the scaling properties of the Hörmander vector fields and therefore seems suitable for the study of (1.5). In particular, we can give a natural definition of weak solution $u$ of (1.5) in the Sobolev space $W^{1,2}_B$ without requiring $Yu\in L^2$, as is usually done in the literature: as far as we know, this is the first result in this direction.
We mention that the use of fractional derivatives makes it difficult to prove embedding results by means of representation formulas in terms of a parametrix, at least for $k=1$, as in [22] or [5]. Indeed, for the proof of our main embedding result, Theorem 1.1, we use a remarkable method developed by Tartar [28], which is based only on scaling arguments and a characterization of Lorentz spaces given in Lemma A.5.
In the following statement $\mathbf{d}$ denotes the homogeneous dimension of $\mathbb{R}^{N+1}$ induced by the vector fields (1.1), whose precise definition is given in (2.6): to fix ideas, $\mathbf{d}=4d+2$ for the Fokker-Planck operator (1.4).
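The value $4d+2$ quoted for the Langevin case can be checked with a few lines of Python. This is only a sketch: definition (2.6) is not reproduced in this excerpt, so the formula below (weight two for the time variable, weight $2i+1$ for each of the $d_i$ variables in the $i$-th block) is an assumption based on the standard convention for Kolmogorov operators, and the function name `homogeneous_dimension` is hypothetical.

```python
def homogeneous_dimension(block_sizes):
    """Homogeneous dimension for block sizes (d_0, d_1, ..., d_r).

    ASSUMED formula (standard for Kolmogorov operators, not copied from
    the paper): 2 for the time variable plus (2i + 1) * d_i for block i.
    """
    return 2 + sum((2 * i + 1) * d_i for i, d_i in enumerate(block_sizes))


# Langevin model: N = 2d with blocks d_0 = d_1 = d (velocity, position).
for d in (1, 2, 3):
    print(d, homogeneous_dimension([d, d]), 4 * d + 2)
```

Under this assumption the two printed values agree for every $d$, in line with the value stated in the text.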
where $L^{p,q}$ denotes the Lorentz space. In particular, $W^{1,p}_B\subseteq L^q$ for $p\le q\le p^*$; ii) for $\mathbf{d}<p<\infty$ we have Moreover, if $u\in W$ (1.9)

The Morrey embedding (1.7) is given in terms of the optimal generalized Hölder spaces $C^{k,\alpha}_B$, only recently introduced in [18] together with an intrinsic Taylor formula. Embeddings for the higher order spaces $W^{k,p}_B$ are provided in Theorem 7.1. Remarkably, estimate (1.9) extends Trudinger's result [30]. Embedding results for Kolmogorov equations were also proved in [7] and, more recently, in [11].
We acknowledge that Tartar himself applied his approach to the Langevin operator (1.4): according to [28], Appendices II and III, he proved that for a function $f=f(t,x,v)$ on $\mathbb{R}\times\mathbb{R}^d\times\mathbb{R}^d$, with $f$, $\nabla_v f$, $Y_0f := (\partial_t+v\cdot\nabla_x)f\in L^p$, one can first prove the "crude" embedding estimate for some $q>p$ and then get the embeddings:

As a main motivation, our study is a first step in the development of a theory of generalized Besov spaces for possible applications to stochastic partial differential equations: we mention that recent results for stochastic kinetic equations were established in [21] and [32]. Secondly, even for deterministic kinetic equations, our results improve the known regularity estimates available in the literature by providing the natural functional framework for weak solutions of kinetic Fokker-Planck equations.
The paper is structured as follows. In Section 2 we state the precise assumptions, introduce the intrinsic Sobolev and Hölder spaces and collect some preliminary results concerning the geometric structure induced on $\mathbb{R}^{N+1}$ by the vector fields (1.1). In Section 3 we prove a first interpolation result, Proposition 3.5, that provides a simplified and equivalent definition of the intrinsic Sobolev quasi-norm. In Section 4 we show an intrinsic Taylor expansion, Theorem 4.1, for functions in $W^{k,p}_B$, which extends the analogous results for intrinsic Hölder spaces proved in [18]. Crucial approximation and interpolation results, Theorems 5.2 and 5.3, are proven in Section 5. Section 6 contains the proof of our main result, Theorem 1.1, on the embeddings of $W^{1,p}_B$. Finally, in Section 7 we prove Theorem 7.1 on the higher order embeddings. For the reader's convenience, in the Appendix we recall some basic results about interpolation and Lorentz spaces.
In the context of our proofs we will often use the notation $A\lesssim B$, meaning that $A\le cB$ for some positive constant $c$ which may depend on the quantities specified in the corresponding statement.

Assumptions
We recall that Hörmander's condition is equivalent to the well-known Kalman rank condition for controllability of linear systems (cf., for instance, Section 9.5 in [20]); also, it was shown in [13] that, up to a change of basis, condition (1.2) is equivalent to the following

Assumption 2.1 (Hörmander's condition). The matrix $B$ takes the block-form where $B_j$ is a $(d_{j-1}\times d_j)$-matrix of rank $d_j$ with In general, the $*$-blocks in (2.1) are arbitrary. Our second standing assumption is the following

Assumption 2.2 (Homogeneity). All the $*$-blocks in (2.1) are null.
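The equivalence with the Kalman rank condition can be tested on the Langevin example from the Introduction. The sketch below assumes the usual form of the Kalman criterion for the pair $(B,A)$, where $A$ is the $N\times d$ matrix whose top $d\times d$ block is the identity (the diffusion acts on the first $d$ variables): the block matrix $[A, BA, \dots, B^{N-1}A]$ must have rank $N$. The helper names (`matmul`, `rank`, `kalman_rank`, `langevin_B`) are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(m):
    """Rank by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, n_rows):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == n_rows:
            break
    return r


def kalman_rank(B, d):
    """Rank of [A, BA, ..., B^(N-1) A], with A = (I_d; 0) of size N x d (assumed form)."""
    n = len(B)
    A = [[1 if i == j else 0 for j in range(d)] for i in range(n)]
    blocks, cur = [], A
    for _ in range(n):
        blocks.append(cur)
        cur = matmul(B, cur)
    K = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(K)


def langevin_B(d):
    """The 2d x 2d matrix with I_d in the lower-left block and zeros elsewhere."""
    n = 2 * d
    return [[1 if (i >= d and j == i - d) else 0 for j in range(n)] for i in range(n)]


print(kalman_rank(langevin_B(2), 2))        # full rank N = 4 expected
print(kalman_rank([[0, 0], [0, 0]], 1))     # B = 0: the condition fails for N = 2
```

For the Langevin matrix the rank equals $N=2d$, whereas for $B=0$ with $d<N$ the rank stays at $d$, so the condition fails.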
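As proven in [13], Assumption 2.2 is equivalent to the fact that the kinetic Fokker-Planck operator is homogeneous of degree two with respect to the family of dilations defined as follows: first of all, consistently with the block decomposition (2.1) of $B$, we write $x\in\mathbb{R}^N$ as the direct sum $x = x^{[0]}+\cdots+x^{[r]}$ where $x^{[i]}\in\mathbb{R}^N$ is defined as
$$x^{[i]}_k := \begin{cases} x_k & \text{if } \bar d_{i-1}<k\le \bar d_i,\\ 0 & \text{otherwise,}\end{cases}\qquad \bar d_i := \sum_{j=0}^{i} d_j,\quad \bar d_{-1} := 0,\quad i=0,\dots,r.$$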

Intrinsic Hölder and Sobolev spaces
In this section we recall the definition of intrinsic Hölder space as given in [18] and introduce a notion of intrinsic Sobolev space, naturally associated to the system of vector fields (1.1). Let $h\mapsto e^{hX}z$ denote the integral curve of a Lipschitz vector field $X$ starting from $z\in\mathbb{R}^{N+1}$, defined as the unique solution of
$$\begin{cases} \dfrac{d}{dh}e^{hX}z = X\big(e^{hX}z\big), & h\in\mathbb{R},\\[4pt] e^{hX}z\big|_{h=0} = z. \end{cases}$$
For the vector fields in (1.1), we have
$$e^{h\partial_{x_i}}(t,x) = (t,x+he_i),\qquad e^{hY}(t,x) = (t+h,e^{hB}x),$$
where $e_i$ is the $i$-th element of the canonical basis of $\mathbb{R}^N$.
Definition 2.3. Let $m_X$ be a formal weight associated to the vector field $X$. For $\alpha\in\,]0,m_X]$, we say that $u\in C^{\alpha}_X$ if the quasi-norm

Hereafter, we set the formal weight of the vector fields $\partial_{x_1},\dots,\partial_{x_d}$ equal to one and the formal weight of $Y$ equal to two, which is coherent with the homogeneity of the Fokker-Planck operator $K$ with respect to the dilations $D_\lambda$ in (2.3). From [18] we recall the following

Definition 2.4 (Intrinsic Hölder spaces). For $\alpha\in\,]0,1]$ we define the Hölder quasi-norms where $\nabla_d := (\partial_{x_1},\dots,\partial_{x_d})$ and inductively, for $n\ge 2$,

Next we introduce the intrinsic Sobolev spaces. First, as in [17], for any $u\in L^p$, with $p\ge 1$, we define the fractional Sobolev-Slobodeckij quasi-norm of order $s\in\,]0,1[$ along a Lipschitz vector field $X$ as $[u]_{X,s,p} := \big(\int_{\mathbb{R}^{N+1}}\cdots$

Dilation and translation groups
Besides the homogeneity with respect to 2) has also the remarkable property of being invariant with respect to the left translations in the group law
$$(t,x)\circ(s,\xi) = (t+s,e^{sB}x+\xi),\qquad (t,x),(s,\xi)\in\mathbb{R}^{N+1}.$$
Indeed, a simple computation shows that, for any $z$, where $(t,x)^{-1} = (-t,-e^{-tB}x)$. Analogously, we have (see, for instance, [23]) A $D_\lambda$-homogeneous norm on $\mathbb{R}^{N+1}$ is defined as is usually called the homogeneous dimension of $\mathbb{R}^{N+1}$ with respect to $D_\lambda$.
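The group law $(t,x)\circ(s,\xi) = (t+s,e^{sB}x+\xi)$ and the inverse $(t,x)^{-1} = (-t,-e^{-tB}x)$ can be checked concretely in the Langevin case. The following sketch assumes the Langevin matrix $B$ of the Introduction, for which $B^2=0$ and hence $e^{sB}=I+sB$, i.e. $e^{sB}(v,p) = (v,p+sv)$; the function names (`exp_sB`, `compose`, `inverse`) are hypothetical. Exact rationals are used so that equalities hold without rounding.

```python
from fractions import Fraction


def exp_sB(s, x, d):
    """Apply e^{sB} = I + sB to x = (v, p), for the Langevin B (B^2 = 0)."""
    v, p = x[:d], x[d:]
    return v + [p_i + s * v_i for p_i, v_i in zip(p, v)]


def compose(z, zeta, d):
    """Group law (t, x) o (s, xi) = (t + s, e^{sB} x + xi)."""
    (t, x), (s, xi) = z, zeta
    y = exp_sB(s, x, d)
    return (t + s, [a + b for a, b in zip(y, xi)])


def inverse(z, d):
    """Inverse (t, x)^{-1} = (-t, -e^{-tB} x)."""
    t, x = z
    return (-t, [-a for a in exp_sB(-t, x, d)])


d = 1
z = (Fraction(1, 2), [Fraction(3), Fraction(-2)])
w = (Fraction(-1, 3), [Fraction(1), Fraction(5)])
print(compose(z, inverse(z, d), d))   # identity element (0, [0, 0])
print(compose(z, w, d) == compose(w, z, d))  # the law is not commutative
```

The second print illustrates that the translation group is non-commutative, which is why left translations have to be specified.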
Lemma 2.12. For any $n\in\mathbb{N}$ and $u\in L^p$, with $p\ge 1$, we have

Proof. The first equality follows by a simple change of variable. Next, for $i=1,\dots,d$ we have, by (2.5) (by the change of variable $z'=D_\lambda z$) Similarly (by the change of variables $(h',z') = (\lambda^2h,D_\lambda z)$) (2.11) (2.10) and (2.11) give the second equality for $n=1$. The case $n=2$ is analogous and the general case $n>2$ follows by induction.
Alternative Sobolev norms and a first interpolation result

where Indeed these derivatives can be recovered by taking appropriate iterated commutators of the vector fields $\partial_{x_1},\dots,\partial_{x_d}$ and $Y$: exploiting these commutators, we can also rearrange the terms appearing in $|\cdot|_{n,p,B}$ and provide a more explicit characterization which only makes use of the intrinsic derivatives in the form (3.1).
First we recall some preliminary notions from [18], Section 4. By the structure of the matrix $B$, for any $n=0,\dots,r$ and $v\in\mathbb{R}^N$ we have and $B^n=0$ for $n>r$. In particular, if $v\in V_0$ then we have Moreover there exist subspaces is bijective. For $v\in V_0$, we introduce the following iterated commutators and recursively Then, for any $i\in\mathbb{N}_0$, $2i+1\le n$, and $v\in V_0$ we have

Proof. We use an induction argument on $n$. If $n\le 2$ there is nothing to prove because $X^{(0)}_v$ is a linear combination of the vector fields $\partial_{x_1},\dots,\partial_{x_d}$ and the claim follows by definition.
Assume (3.6) is true for some fixed $n>2$ and let us prove it for $n+1$. We proceed by induction on $i$. For $i=0$, again there is nothing to prove. We assume (3.6) for some $i>0$ such that $2(i+1)+1\le n+1$ and prove it for $i+1$. Here

Proof. By induction it is not difficult to prove that Next, since $\psi_i$ in (3.4) is bijective, for every $j=1+\bar d_{i-1},\dots,\bar d_i$ there exists $w_j\in V_{0,i}$ such that $B^iw_j = e_j\in V_i$. Then $f := X^{(i)}$ which means that $f$ is the weak derivative $\partial_{x_j}u$.
By Proposition 3.2 and the definition of intrinsic Sobolev spaces we eventually infer the following: Then, for any $k\in\mathbb{N}_0$, $\beta\in\mathbb{N}_0^N$ such that $2k+\langle\beta\rangle_B = l\le n$, we have

Corollary 3.4. The following quasi-norms are equivalent:

Proof. By induction it is not difficult to check that $|u|_{n,p,B}$ controls all the $L^p$-norms of the $n$th-order derivatives that are compositions of $Y$ and $\partial_{x_1},\dots,\partial_{x_d}$ for any possible permutation, as well as the fractional quasi-norms of the $(n-1)$th-order derivatives. Then it suffices to note that, by (3.5) we have then, proceeding as in the proof of Proposition 3.2 to rearrange the derivatives, we get (3.7). Moreover, by definition we have and the fractional part of the quasi-norm coincides with (3.7). As for the first term in the sum we need to distinguish two cases: if $n=2l+1$ for some $l\in\mathbb{N}$, then we get an equivalence with (3.7) by rearranging the derivatives as in the proof of Proposition 3.2; indeed, compared to $W^{n-1,p}_B$ we have the additional set of Euclidean derivatives $\partial_{x_j}$, $j=1+\bar d_{l-1},\dots,\bar d_l$, which can be recovered from $\sum_{j=1}^{d}\partial_{x_j}Y^lu$ by (3.9), and similarly for the mixed derivatives. If $n=2l$ we have the derivatives $Y^lu$ that cannot be written as sums of iterated commutators and thus we get (3.8).

Proof. The proof is based on a two-step induction.

Interpolation inequality and equivalence of the norms } ¨}W
Step 1: case $n=1$ and $m=2$. The estimate is standard (cf. for instance [2], Chapter 5). On the other hand, by Fubini's Theorem we have $J_p(z,h)\,dz$, $\quad J_p(z,h) := |u(e^{hY}z)-u(z)|^p$.
By the mean value theorem along the vector field $Y$, for every $z\in\mathbb{R}^{N+1}$ The claim follows by a scaling argument: indeed, applying (3.11) to $u(D_{\varepsilon^{-1}}\,\cdot)$, by (2.9) we get

Step 2: induction on $n$, $m$ with $m=n+1$. We first prove the preliminary interpolation inequality: We have
$$u(e^{Y}z)-u(z)-Yu(z) = \int_0^1\big(Yu(e^{\delta Y}z)-Yu(z)\big)\,d\delta,$$
and therefore $\big|u(e^{Y}z)-u(z)-\int_0^1\big(Yu(e^{\delta Y}z)-Yu(z)\big)\,d\delta\big|^p\,dz\le$ (by the triangular and Hölder inequalities) where and (by the change of variables $z'=e^{Y}z$ and $\delta'=\delta-1$) reasoning as for $I_2$ in the last step. Then (3.12) follows by a scaling argument.
Next we prove that if, for some n P N, (3.10) holds with n " n, m " n `1 then it also holds with n " n `1 and m " n `2.By Step 1, (3.8) and (3.12), if n `1 is even we have (by the inductive hypothesis) If n `1 is odd, by (3.8) we derive the same estimate only exploiting Step 1.To conclude it suffices to take ε 1 " ε 2c 1 c 2 .
Step 3: backward induction on $n$. Let $m\in\mathbb{N}$, $m>2$, be fixed. We prove that if (3.10) is true for $m$ and $n=\bar n$ for some $\bar n\in\{2,\dots,m-1\}$, then it is also true for $m$ and $n=\bar n-1$. By

The proof is based on an induction procedure developed in [18] to derive the $C^{n,\alpha}_B$ estimate of the remainder. For completeness, here we give a fairly comprehensive presentation of the main lines, and refer to [18] for the details of the construction. To simplify the exposition we first split the proof into different steps, corresponding to particular cases of (4.2).

Lemma 4.2. There exists $c=c(n,p)$ such that, for any $u\in W^{n+1,p}_B\cap C^\infty$ and $\delta\in\mathbb{R}$, we have where $[n/2]$ denotes the integer part of $n/2$.
Proof. We first check that, for $u\in W$ Notice that, for $u\in W^{1,p}_B$, we have $L^p$-bounds only on the first $d$ spatial derivatives. Thus, in order to prove (4.7) we must exploit estimate (4.4) and connect any arbitrary point $z=(t,x)\in\mathbb{R}^{N+1}$ to $z\circ(0,\xi) = (t,x+\xi)$ through a chain of integral curves associated only to the vector fields $\partial_{x_1},\dots,\partial_{x_d}$ and $Y$.
To do so, we define a sequence of points $(z_k = (t,x(k)))_{k=0,\dots,r}$ adjusting, at any step $k$, the set of variables of the layer $V_k$ in (3.3). Following [18], Lemma 4.22 we set and $z_{-1}=z$, $z_0 = \gamma^{(0)}_{v_0,\delta_0}(z_{-1}) := e^{\delta_0\langle v_0,\nabla\rangle}z = (t,x+\xi^{[0]})$. For $k=1,\dots,r$ let where $v_k = w_k/|w_k|$ and $w_k$ is the unique vector in $V_{0,k}\subset V_0$ such that $B^kw_k = \xi^{[k]}+x^{[k]}-x^{[k]}(k-1)$.
Importantly, it can be proven by induction that, for any $v\in V_0$ we have In other words the flow $\gamma^{(k)}_{v,\delta}$ only affects the set of variables $[k:r]$. Moreover, $\gamma^{(r)}_{v_r,\delta_r}(z_{r-1}) = (t,x+\xi)$ by construction. Notice also that at any step, $\delta_k$ does not depend on $x$ and also, the specific choice of $w_k$ implies $\delta_k\le c_B|\xi|_B$ (cf. [18], Lemma 4.22).
We are ready to prove (4.7): by the Minkowski inequality we have where we exploited the changes of variables $z'=z_{k-1}=z_{k-1}(z)$ in the last step: here we use the fact that, by Remark 2.11, the Jacobian of the change of variables has determinant equal to one. Then the proof is completed once we have proved that, for any $k,i\in\{0,\dots,r\}$ we have We proceed by induction on $k$. The case $k=0$ follows from (4.8). Assume now (4.9) holds for some $k\in\{0,\dots,r-1\}$: as before we have and the claim follows from the inductive step and (4.4).
We are ready to prove Theorem 4.1.
Proof of Theorem 4.1.We prove that, for any n P N 0 , ζ P R N `1 and u P W n` .
By definition $e^{sY}z = (t+s,e^{sB}x)$, $\quad z\circ\zeta = (t+s,\xi+e^{sB}x)$.
Hence $F_1(z,\zeta)$ contains increments that only differ in the spatial variables, while $F_2(z,\zeta)$ contains increments that only differ along $Y$.
To estimate $F_2(z,\zeta)$ we first notice that the increments in the Taylor polynomials appearing in $F_2(z,\zeta)$ are given by
$$(e^{sY}z)^{-1}\circ(z\circ\zeta) = (0,\xi),\qquad z^{-1}\circ(z\circ\zeta) = (s,\xi),$$
thus we have Taking the $L^p$ norm in $dz$ and using (4., where $\bar x$ is the point in $\mathbb{R}^N$ defined by $\bar x^{[i]} = x^{[i]}+\xi^{[i]}$ if $2i+1>n$.
Notice that px `ξ ´xq β " ξ β for xβy B ď n and |x ´ξ ´x| B ď |ξ| B .For x P R N , we introduce the notation x rks 0 ď i ă j ď r.
Now, since u P W n`1,p B has weak derivatives of order r n`1 2i`1 s in any direction of the increments ris, i ď r n 2 s, it is not difficult to check, similarly to (4.3) that On the other hand, by (4.12), we have Then, taking the L p norm in dz, we have Remark 4.4.By Theorem 4.1, for any i P t dj´1 `1, . . ., dj u with 2j `1 ą n we have in particular Then, using Fubini's Theorem it is straightforward to check that, for any ε ą 0 Together with Corollary 3.3 this gives the expected regularity in any spatial direction, which is not prescribed a priori by the definition of the spaces.Also, by Corollary 3.3 we can further infer that, for any k P N 0 , β P N N 0 with 2j `1 ą n ´2k ´xβy B ě 0 we have 5 Approximation and interpolation 5.1 Approximation in W n,p B Let ϕ be a test function supported on }z} B ď 1 with unitary integral.Following [23], we define the n-th order approximation for u P W n,p B as where T n upζ, zq is the B-Taylor polynomial in (4.1) and d the homogeneous dimension of R N `1.
We also recall the useful Lemma 3.2 from [23] which still holds for functions in W n,p B : indeed, its proof relies only on basic algebraic rules of derivation, namely the Leibniz formula and the chain rule for compositions with smooth functions.Lemma 5.1.For any u P W n,p B and z, ζ P R N `1 we have Theorem 5.2 (Approximation).Let n, m P N with n ă m.There exist constants c 1 " cpn, p, Bq and c 2 " cpn, m, p, Bq such that for any u P W n,p B and 0 ă ε ď 1 we have Proof.We denote by D l any weak derivative of intrinsic order l, that is D l " Y k B β x with 2k`xβy B " l, and let We prove the following preliminary estimates for u P W n,p B X C 8 0 : ) ) in the integration set of I 2 , and therefore, again by Hölder inequality, we have where the last inequality easily follows by a change of variables, and this proves (5.5).Lastly, (5.6) requires more attention.We have ´Tn´1 upζ, e hY zq ´upe hY zq ¯pD l ϕq ´Dε ´1 pζ ´1 ˝ehY zq ¯´pD l ϕq `Dε ´1 pζ ´1 ˝zq ˘¯dζ ε d ˇˇˇp dh J n pζ, zqD l ϕ `Dε ´1 pζ ´1 ˝zq ˘dζ ε d ˇˇp dh |h| 1`p 2 dz with J n pζ, zq " T n´1 upζ, e hY zq ´upe hY zq ´pT n´1 upζ, zq ´upzqq .
The term S 1 can be controlled as ru 0,ε s Y, 1 2 ,p : for simplicity here we assume that we can control the support of the increment of D l ϕ independently of ε on the whole integration set of h to exploit a Hölder inequality (otherwise we can just split the integration set and proceed as for the terms I 1 and I 2 of ru 0,ε s Y, 1 2 ,p ), then we have Then, after the change of variables z " e λhY z and exchanging the order of integration, the term inside the square brackets in (5.7) is bounded by By (2.5) and (2.8), recalling that λh ď ε 2 and }ζ} B ď 1 in the current integration set, we have Therefore, substituting in (5.7) we find The cases n " 1 or n " 2 are easier: it is easy to check that T 0 upζ, e hY zq " T 0 upζ, zq " upζq and T 1 upζ, e hY zq " T 1 upζ, zq, therefore it suffices to use (4.3) and proceed as above.Collecting the estimates for S 1 , S 21 , S 22 we get (5.6).
We are ready to prove (5.2) and (5.3) for u P W n,p B X C 8 0 , then the general statement follows by density.Clearly }u ´un,ε } p " }I pn,0q ε u} p À ε n }u} W n,p B by (5.4).On the other hand, by (3.7), with some slight abuse of notation, we have Since D i z T n upζ, zq " 0 for any i ą n we have meaning that D m , D m´i , D i may stand for any intrinsic derivative of order m, m ´i, i.Then, using Lemma 5.1 and that ż we can write and thus (5.9) Next we use an ingenious approach, devised by Tartar [28], which consists of applying (6.1) to a suitable non-linear transformation of u.Precisely, we consider ϕ k puq where pϕ k q kPZ is an appropriate sequence of functions involving the levels a k :" u ˚pe k q of Lemma A.5: for v P R and k P Z we set We have the following crucial Lemma 6.3.For p P r1, 8q, let u P W 1,p B .There exists a positive constant c " cpp, Bq such that where p ˚is the critical exponent in (1.6).
Proof. Notice that
$$(a_k-a_{k+1})\mathbf{1}_{(|u|\ge a_k)}\le\varphi_k(u)\le(a_k-a_{k+1})\mathbf{1}_{(|u|\ge a_{k+1})},$$
where $\mathbf{1}_A$ denotes the indicator function of the set $A$. Hence, for any $q\ge 1$ we have
$$\mathrm{Leb}(|u|\ge a_k)^{\frac1q}(a_k-a_{k+1})\le\|\varphi_k(u)\|_q\le(a_k-a_{k+1})\,\mathrm{Leb}(|u|\ge a_{k+1})^{\frac1q},$$
where $\mathrm{Leb}(\cdot)$ represents the Lebesgue measure. By (A.8), which follows by construction of $(a_k)_{k\in\mathbb{Z}}$ (also recall definition (A.4) of distribution function), we get
$$e^{\frac{k-1}{q}}(a_k-a_{k+1})\le\|\varphi_k(u)\|_q\le e^{\frac{k+1}{q}}(a_k-a_{k+1}). \tag{6.5}$$
From the first inequality in (6.5) and (6.1) applied to $\varphi_k(u)\in W^{1,p}_B$ with $q,\theta$ as in Lemma 6.1, we infer
$$e^{\frac{k-1}{q}}(a_k-a_{k+1})\lesssim\|\varphi_k(u)\|_p^{1-\theta}\,|\varphi_k(u)|^{\theta}_{1,p,B}\lesssim$$
(by the second inequality in (6.5))
$$\lesssim e^{\frac{(k+1)(1-\theta)}{p}}(a_k-a_{k+1})^{1-\theta}\,|\varphi_k(u)|^{\theta}_{1,p,B}.$$
Equivalently, we have and this concludes the proof since $\frac1q-\frac{1-\theta}{p} = \frac{\theta}{p^*}$.
(6.6) Thus, combining (6.4) with (6.6), we deduce that $e^{\frac{k}{p^*}}(a_k-a_{k+1})\in\ell^p(\mathbb{Z})$ for any $u\in W^{1,p}_B$.

[Case $1\le p<\mathbf{d}$] A direct application of Lemma A.5 gives the improved Sobolev embedding In particular $W^{1,p}_B\subseteq L^q$ for any $q\in[p,p^*]$ by a standard application of the Young inequality. $\lesssim\sum_{k\in\mathbb{Z}}(a_k-a_{k+1})^p\,e^{\frac{pk}{p^*}}<\infty$ by (6.4). Being decreasing, $(a_k)_{k\in\mathbb{Z}}$ is then a bounded sequence and this yields $W$

[Case $p=\mathbf{d}$] As in the case $p<\mathbf{d}$, the embeddings (1.8) follow from Lemma A.5. To get estimate (1.9), it suffices to repeat the argument used by Tartar in [29], Chapter 30: more precisely, for $p=\mathbf{d}$ we have $1/p^*=0$ so that $a_k-a_{k+1}\in\ell^{\mathbf{d}}(\mathbb{Z})$ by (6.4)-(6.6); applying Hölder's inequality we first prove that for every $\varepsilon>0$ there exists a constant $c=c(\varepsilon,u)>0$ such that On the set where $a_{k+1}\le|u|<a_k$, which has measure less than $e^{k+1}$ by (A.

Proposition 3.5.
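Let $1\le n<m$ and $p\ge 1$. There exists $c=c(m,p,B)$ such that
$$|u|_{n,p,B}\le c\Big(\varepsilon\,|u|_{m,p,B}+\varepsilon^{-\frac{n}{m-n}}\|u\|_p\Big),\qquad u\in W^{m,p}_B,\ \varepsilon>0. \tag{3.10}$$
In particular the norms $\|\cdot\|_{W^{n,p}_B}$ and $|||\cdot|||_{W^{n,p}_B}$ are equivalent.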
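An additive interpolation inequality with a free parameter $\varepsilon$, of the type in (3.10), is equivalent to a multiplicative one obtained by optimizing over $\varepsilon$. The sketch below illustrates this elementary calculus step numerically; it is not part of the paper's argument. Writing $a$ for $|u|_{m,p,B}$ and $b$ for $\|u\|_p$ (both treated here as arbitrary positive numbers), the minimum of $\varepsilon a+\varepsilon^{-n/(m-n)}b$ is attained at $\varepsilon_* = \big(\tfrac{n}{m-n}\tfrac{b}{a}\big)^{(m-n)/m}$ and is proportional to $a^{n/m}b^{(m-n)/m}$. All function names are hypothetical.

```python
def additive_bound(eps, a, b, n, m):
    """Right-hand side eps * a + eps^(-n/(m-n)) * b, without the constant c."""
    return eps * a + eps ** (-n / (m - n)) * b


def optimal_eps(a, b, n, m):
    """Minimizer of the additive bound over eps > 0 (first-order condition)."""
    return ((n / (m - n)) * b / a) ** ((m - n) / m)


def multiplicative_bound(a, b, n, m):
    """Closed form of the minimum: C(n, m) * a^(n/m) * b^((m-n)/m)."""
    k = n / (m - n)
    return (k ** ((m - n) / m) + k ** (-n / m)) * a ** (n / m) * b ** ((m - n) / m)


a, b, n, m = 3.0, 7.0, 2, 5
eps_star = optimal_eps(a, b, n, m)
print(additive_bound(eps_star, a, b, n, m), multiplicative_bound(a, b, n, m))
```

The two printed numbers coincide up to rounding, so (3.10) yields a bound of the form $|u|_{n,p,B}\lesssim|u|_{m,p,B}^{n/m}\,\|u\|_p^{(m-n)/m}$ whenever $|u|_{m,p,B}>0$.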

[Case $p>\mathbf{d}$] We have $p^*<0$ and therefore for any $k\le 0$ we have $\|\nabla_d u\|_p+[u]_{Y,\frac12,p}$, $\quad|u|_{2,p,B} := |\nabla_d u|_{1,p,B}+\|Yu\|_p$, are equivalent and therefore define the same functional spaces. This means that the intermediate order quasi-norms are not needed to characterize $W^{n,p}_B$. Then, if we replace $[u]_{Y,s,p}$ by $\{u\}_{Y,s,p}$ in Definition 2.6, we get equivalent norms.