Regularity of Stochastic Kinetic Equations

We consider regularity properties of stochastic kinetic equations with multiplicative noise and drift term which belongs to a space of mixed regularity ($L^p$-regularity in the velocity-variable and Sobolev regularity in the space-variable). We prove that, in contrast with the deterministic case, the SPDE admits a unique weakly differentiable solution which preserves a certain degree of Sobolev regularity of the initial condition without developing discontinuities. To prove the result we also study the related degenerate Kolmogorov equation in Bessel-Sobolev spaces and construct a suitable stochastic flow.


Introduction
We consider the linear Stochastic Partial Differential Equation (SPDE) of kinetic transport type (1.1) and the associated stochastic characteristics described by the stochastic differential equation (SDE) (1.2). We consider the simplest noise $dW_t = \sum_{k=1}^{d} e_k\, dW_t^k$, $\{e_k\}_{k=1,\dots,d}$ being an orthonormal basis of $\mathbb{R}^d$.
For physical reasons, we chose in (1.1) a specific linear form for the drift in the degenerate component, i.e., $v \cdot D_x$. It seems reasonable to expect that a possible generalization to the case of a nonlinear drift term like $G(x, v) \cdot D_x$ could be obtained under suitable Hörmander type conditions ensuring that the system is hypoelliptic. We should mention that some results in this direction have already been obtained: two strong well-posedness results for the degenerate SDE (1.2) with nonlinear Hölder continuous drift terms are presented in [6] and [38], respectively. However, using our approach, such results are not enough to prove a well-posedness result for the SPDE (1.1), since a full hypoelliptic regularity result is not yet available at the level of the corresponding degenerate Kolmogorov equations.
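To make the structure of the stochastic characteristics concrete, here is a minimal simulation sketch. It assumes that (1.2) has the standard kinetic form $dX_t = V_t\,dt$, $dV_t = F(X_t, V_t)\,dt + dW_t$, which is consistent with the drift $v \cdot D_x$ and with noise acting only on the velocity component. The function names, the Euler-Maruyama discretization and the sample force `rough_force` are illustrative choices, not taken from the paper.

```python
import numpy as np

def simulate_characteristics(F, x0, v0, T=1.0, n_steps=1000, noise=1.0, rng=None):
    """Euler-Maruyama scheme for dX = V dt, dV = F(X, V) dt + noise * dW.

    Assumed form of the characteristics; `noise` scales the Brownian term
    (noise=0.0 gives the deterministic system).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    v = np.asarray(v0, dtype=float).copy()
    dt = T / n_steps
    xs, vs = [x.copy()], [v.copy()]
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=v.shape)
        # Both updates use the state at the beginning of the step.
        x, v = x + v * dt, v + F(x, v) * dt + noise * dW
        xs.append(x.copy())
        vs.append(v.copy())
    return np.array(xs), np.array(vs)

def rough_force(x, v, alpha=0.6):
    """Hypothetical non-Lipschitz force sign(x)|x|^alpha (no cutoff applied)."""
    return np.sign(x) * np.abs(x) ** alpha
```

The noise enters only through the velocity update; the position feels it indirectly through the transport term, which is the degenerate structure discussed above.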
Our aim is to show that noise has a regularizing effect on both the SDE (1.2) and the SPDE (1.1), in the sense that it provides results of existence, uniqueness and regularity under assumptions on F which are forbidden in the deterministic case. Results of this nature have been proved recently for other equations of transport type, see for instance [19], [16], [18], [2], but here, for the first time, we deal with the case of "degenerate" noise, because dW t acts only on a component of the system. It is well known that the kinetic structure has good "propagation" properties from the v to the x component; however, for the purpose of regularization by noise one needs precise results which are investigated here for the first time and are technically quite non trivial. Let us describe more precisely the result proved here.
First, Theorem 4.5 shows that weak existence and uniqueness in law hold for the SDE (1.2) assuming only $F \in L^p(\mathbb{R}^{2d}; \mathbb{R}^d)$ with $p > 4d$. To prove strong existence for (1.2) and existence of a stochastic flow we investigate the SDE (1.2) under the assumption (see below for more details) that $F$ is in the mixed regularity space $L^p(\mathbb{R}^d_v; W^{s,p}(\mathbb{R}^d_x; \mathbb{R}^d))$ for some $s \in (\tfrac{2}{3}, 1)$ and $p > 6d$; this means that we require

Notation and examples
We will use either a dot or $\langle \cdot, \cdot \rangle$ to denote the scalar product in $\mathbb{R}^d$ and $|\cdot|$ for the Euclidean norm. Other norms will be denoted by $\|\cdot\|$, and for the sup norm we shall use both $\|\cdot\|_\infty$ and $\|\cdot\|_{L^\infty(\mathbb{R}^d)}$. $C_b(\mathbb{R}^d)$ denotes the Banach space of all real continuous and bounded functions $f : \mathbb{R}^d \to \mathbb{R}$ endowed with the sup norm; $C_b^1(\mathbb{R}^d)$ is the subspace of all functions which are differentiable on $\mathbb{R}^d$ with bounded and continuous partial derivatives on $\mathbb{R}^d$; for $\alpha \in \mathbb{R}_+ \setminus \mathbb{N}$, $C^\alpha(\mathbb{R}^d)$ denotes the corresponding Hölder space; $C_c^\infty(\mathbb{R}^d)$ is the space of all infinitely differentiable functions with compact support. $C$, $c$, $K$ will denote different constants, and we use subscripts to indicate the parameters on which they depend.
Throughout the paper, we shall use the notation z to denote the point (x, v) ∈ R 2d . Thus, for a scalar function g(z) : R 2d → R, D z g will denote the vector in R 2d of derivatives with respect to all variables z = (x, v), D x g ∈ R d denotes the vector of derivatives taken only with respect to the first d variables and similarly for D v g(z). We will have to work with spaces of functions of different regularity in the x and v variables: we will then use subscripts to distinguish the space and velocity variables, as in Hypothesis 2.1.
Let us state the regularity assumptions we impose on the force field $F$. Hypothesis 2.1. The function $F : \mathbb{R}^{2d} \to \mathbb{R}^d$ is a Borel function such that where $s \in (2/3, 1)$ and $p > 6d$. We may have various types of pathologies. We shall mention here some of them in the very simple case of $d = 1$. First, note that this function belongs to $L^p(\mathbb{R}_v; H^s_p(\mathbb{R}_x))$ for some $s > \tfrac{2}{3}$ and $p > 6$. To check this fact one can first observe that $\partial_x(\mathrm{sign}(x)|x|^\alpha) = \alpha|x|^{\alpha-1}$ in the distributional sense, so that $F(v, \cdot) \in H^1_q(\mathbb{R}_x)$ for some appropriate value of $q > 2$, and then use the Sobolev embedding theorem: $H^1_q(\mathbb{R}) \subset H^s_p(\mathbb{R})$ for $\frac{1}{p} = \frac{1}{q} - 1 + s$. Thus $F$ satisfies our Hypothesis 2.1. On the other hand, when $\alpha \in (\tfrac{1}{2}, \tfrac{2}{3})$, the function $\mathrm{sign}(x)|x|^\alpha$ is not in $C^\gamma_{loc}(\mathbb{R})$ for any $\gamma > 2/3$ and the results of [37], [38] do not apply.
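The exponent bookkeeping behind this example can be spelled out as follows. This is only a sketch: it looks at the behaviour near $x = 0$ and assumes that the cutoff (written $\theta$ in the text) takes care of integrability away from the origin.

```latex
% Local integrability of the distributional derivative near x = 0:
\int_{-1}^{1} \big|\alpha |x|^{\alpha-1}\big|^{q}\,dx < \infty
\iff q(1-\alpha) < 1
\iff q < \frac{1}{1-\alpha}.
% A choice q > 2 is therefore possible exactly when
\frac{1}{1-\alpha} > 2 \iff \alpha > \tfrac12 .
% Sobolev embedding in dimension one, H^1_q \subset H^s_p:
\frac1p = \frac1q - (1-s).
% Requiring s > 2/3 and p > 6 gives
\frac1q = \frac1p + (1-s) < \frac16 + \frac13 = \frac12 ,
% which is consistent with the condition q > 2 stated above.
```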
Let us come to the description of the pathologies of characteristics and kinetic equation when F (x, v) = ±θ (x, v) |x| α .
Proof. Let us check that (x t , v t ) = At β , Aβt β−1 with the specified values of (β, A) and a small range of t, are solutions. We have With a little greater effort one can show, in this specific example, that every solution (x t , v t ) from the initial condition (0, 0) has, for small time, the form (x t , v t ) = A (t − t 0 ) β , Aβ (t − t 0 ) β−1 1 t≥t0 for some t 0 ≥ 0, or it is (x t , v t ) = (0, 0) ((β, A) always given by (2.5)) and that existence and uniqueness holds from any other initial condition, even from points of the form (0, v 0 ), v 0 = 0, around which F is not Lipschitz continuous. Given T > 0 and R > 0 large enough, there is thus, at every time t ∈ [0, T ], a set Λ t ⊂ R 2 of points "reached from (0, 0)", which is the set Using this family of sets one can construct examples of non uniqueness for the transport equation (2.3), because a solution f (t, x, v) is not uniquely determined on Λ t . However, these examples are not striking since the region of non-uniqueness, ∪ t≥0 Λ t , is thin and one could say that uniqueness is restored by a modification of f on a set of measure zero. But, with some additional effort, it is also possible to construct an example with In this case, for some negative m (depending on R and α), one can construct infinitely many solutions (x t , v t ) starting from any point in a segment (x 0 , 0), , uniformly in t. The next proposition identifies an example with non-Lipschitz F where this persistence of regularity is lost. More precisely, even starting from a smooth initial condition, unless it has special symmetry properties, there is a solution with a point of discontinuity. This pathology is removed by noise, since we will show that with sufficiently good initial condition, the unique solution f (t, z) is of class W 1,r loc (R 2 ) for every r ≥ 1 and t ∈ [0, T ] a.s., hence in particular continuous. 
However, in the stochastic case, we do not know whether the solution is Lipschitz under our assumptions, whereas presumably it is under the stronger Hölder assumptions on F of [38].
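The explicit solutions of the form $(x_t, v_t) = (At^\beta, A\beta t^{\beta-1})$ used in the examples above can be checked numerically. The sketch below assumes that, near the origin and for $x > 0$, the deterministic system reduces to $\ddot{x} = x^\alpha$ (that is, $F = +|x|^\alpha$ with the cutoff equal to one). Under that assumption, matching powers and coefficients gives $\beta = 2/(1-\alpha)$ and $A = [\beta(\beta-1)]^{-1/(1-\alpha)}$; these values are derived here and are not quoted from (2.5).

```python
import numpy as np

def explicit_branch(alpha):
    """Return (beta, A) such that x(t) = A * t**beta solves x'' = x**alpha for t > 0.

    Derived under the assumption F = |x|^alpha near the origin (x > 0).
    """
    beta = 2.0 / (1.0 - alpha)
    A = (beta * (beta - 1.0)) ** (-1.0 / (1.0 - alpha))
    return beta, A

def residual(alpha, t):
    """Residual x''(t) - x(t)**alpha along the explicit branch x(t) = A t^beta."""
    beta, A = explicit_branch(alpha)
    x = A * t ** beta
    x_ddot = A * beta * (beta - 1.0) * t ** (beta - 2.0)
    return x_ddot - x ** alpha
```

Since $x \equiv 0$ is also a solution starting from $(0, 0)$, a vanishing residual along the nonzero branch exhibits two distinct solutions from the same initial condition.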
2) has a unique local solution on any domain not containing the origin, for every initial condition. For every t 0 > 0 (small enough with respect to R), the two initial conditions At β . As a consequence, the transport equation (2.3) with for some t 0 > 0, has a solution with a discontinuity at time t 0 at position (x, v) = (0, 0).
Proof. The proof is elementary but a full proof is lengthy. We limit ourselves to a few simple facts, without proving that system (2.2) is forward well posed (locally in time) and that the transport equation (2.3) is also well posed in the set of weak solutions. We only stress that the claim $(x_{t_0}, v_{t_0}) = (0, 0)$ when the initial condition is $(At_0^{\beta}, -A\beta t_0^{\beta-1})$ can be checked by direct computation (as in the previous proposition) and the discontinuity of the solution $f$ of (2.3) is a consequence of the transport property, namely the fact that whenever $f$ is regular we have identity (2.6), where $(x_t, v_t)$ is the unique solution with initial condition $(x_0, v_0)$. Hence we have this identity for points close (but not equal) to the coalescing ones mentioned above, where the forward flow is regular and a smooth initial condition $f_0$ gives rise to a smooth solution; but then, from identity (2.6) in nearby points, the limit does not exist if $t_0$ is as above and $f_0(At_0^{\beta}, -A\beta t_0^{\beta-1})$

Well-posedness for degenerate Kolmogorov equations in Bessel-Sobolev spaces

Preliminaries on function spaces and interpolation theory
Here we collect basic facts on Bessel and Besov spaces (see [3], [35] and [32] for more details). In the sequel if X and Y are real Banach spaces then Y ⊂ X means that Y is continuously embedded in X.
This is a Banach space endowed with the norm $\|f\|_{H^s_p} = \|J^s f\|_p$, where $\|\cdot\|_p$ is the usual norm of $L^p(\mathbb{R}^d)$ (we identify functions which coincide a.e.).

It can be proved that
and an equivalent norm in H s To show this characterization one can use that [32, page 134]), and basic properties of convolution and Fourier transform. We note that if s 2 > s 1 and, moreover, C ∞ c (R d ) is dense in any H s p (R d ). One can compare Bessel spaces with Besov spaces B s p,q (R d ) (see, for instance, Theorem 6.2.5 in [3]). Let p, q ≥ 2, s ∈ (0, 2), to simplify notation.
with equivalence of norms. However if s = 1, such that We also have the following result (cf. [ s ∈ (0, 2), p ≥ 2. Next we state a known result (see [3,Theorem 6.4.4]; for a direct proof see Appendix in [17]). This is useful to give an equivalent formulation to Hypothesis 2.1.

Interpolation of functions with values in Banach spaces
We follow Section VII in [27] and [9]. Let A 0 be a real Banach space. We will consider the Banach space L p (R d ; A 0 ), 1 ≤ p < ∞, d ≥ 1. As usual this consists of all strongly measurable functions f from R d into A 0 such that the real valued function f (x) A0 belongs to L p (R d ). We have by using the interpolation space (A 0 , A 1 ) θ,q , q ∈ (1, ∞), p ≥ 1 and θ ∈ (0, 1). One can prove that with equivalence of norms (see [27] and [9]). In the sequel we will often use, for s ≥ 0, p ≥ 2, L p (R d ; H s p (R d )) . (3.11) We will often identify this space with the Banach space , for a.e. v ∈ R d , and, moreover (see (3.1)) (here F x denotes the partial Fourier transform in the x-variable; as usual we identify functions which coincide a.e.). As a norm we consider , (3.14) p ≥ 2, 0 < s < 2. Finally using (3.10) and (3.9) we get for 0 < s 0 < s 1 < 2, θ ∈ (0, 1), In the sequel when no confusion may arise, we will simply write L . This convention about vector-valued functions will be used for other function spaces as well.
This section is devoted to the study of the degenerate Kolmogorov equation. We shall start by considering the simpler equation with $B = 0$, i.e., equation (3.17). Recall that $D_v\psi$ and $D_x\psi$ denote respectively the gradient of $\psi$ in the $v$-variables and in the $x$-variables; moreover, $D_v^2\psi$ indicates the Hessian matrix of $\psi$ with respect to the $v$-variables (we have $\Delta_v\psi = \mathrm{Tr}(D_v^2\psi)$).
It turns out that X p,s is a Banach space endowed with the norm: (3.17)). With a slight abuse of notation, we will still write f ∈ X p,s for vector valued functions f : R 2d → R 2d , meaning that all components f i : R 2d → R, i = 1 . . . 2d belong to X p,s .
The following theorem improves results in [4] and [5]. In particular it shows that there exists the weak derivative D x ψ ∈ L p (R 2d ) so that (3.17) admits a strong solution ψ which solves equation (3.17) in distributional sense.
Step 1. We prove existence of solutions and estimates (3.19) and (3.20). Let us first introduce the Ornstein-Uhlenbeck semigroup is the Gaussian measure with mean 0 and covariance matrix 13]) we know that P t g is well-defined also for any g ∈ L p (R 2d ), z a.e.; moreover P t : Using the Jensen inequality, the Fubini theorem and (3.26) it is easy to prove that G λ g is well defined for g ∈ L p (R 2d ), z a.e., and belongs to L p (R 2d ). Moreover, for any p ≥ 1, ) . Arguing as in [31,Lemma 13] one can show that there exist classical solutions ψ n to (3.17) with g replaced by g n . Moreover, ψ n = G λ g n . By [31,Theorem 11], which is based on results in [5], we have that 29) λ > 0, n ≥ 1, C = C(p, d). Using also (3.28) we deduce easily that (ψ n ) and (D 2 v ψ n ) are both Cauchy sequences in L p (R 2d ). Let us denote by ψ ∈ L p (R 2d ) the limit function; it holds that ψ = G λ g and D 2 v ψ ∈ L p (R 2d ). Passing to the limit in (3.17) when ψ and g are replaced by ψ n and g n we obtain that ψ solves (3.17) in a weak sense (v · D x ψ is intended in distributional sense). By (3.29) as n → ∞ we also get To prove (3.19) it remains to show the estimate for D v ψ. This follows from (3.32) To this purpose we will use a result in [4] and interpolation theory.
By (3.10) and (3.8) with s 0 = 0 and s 1 = 2/3 we can interpolate between (3.33) and [29,Proposition 1.2.6]) and get, for ε ∈ (0, 2/3), ) and fix k = 1, . . . , d. By approximating g with regular functions, it is not difficult to prove that there exists the weak derivative Taking into account (3.8), (3.9) and (3.10) we can interpolate between (3.34) and (3.36) (see also (3.11) and (3.12)) and get ) with ε small enough (recall that s ∈ (1/3, 1)) we finally obtain that is linear and continuous. Moreover, we have with ψ = G λ g Step 2. We prove the last assertion (3.21). The main problem is to show that x, v ∈ R d . We know that h s ∈ L p (R 2d ) by our hypothesis on g. A straightforward computation based on the Fubini theorem shows that By using (3.30) (with g replaced by h s and ψ by G λ h s ) we easily obtain that By (3.40), (3.41) and the Fubini theorem we know that It follows that, for x ∈ R d a.e., In order to prove (3.42) with C(λ) → 0, we consider r ∈ (0, 1) such that rp > d. Let We can apply the Sobolev embedding theorem (see [35, page 203]) and get that Now we easily obtain (3.42) using (3.40) and (3.41), since We complete the study of the regularity of solutions to equation (3.17) with the next result in which we strengthen the assumptions of Lemma 3.5. Note that the next assumption on p holds when p > 6d as in Hypothesis 2.1.
). In addition assume that p(s − 1 3 ) > 2d, then the following statements hold. (i) The solution ψ = G λ g (see (3.28)) is bounded and Lipschitz continuous on R 2d . Moreover there exists the classical derivative D x ψ which is continuous and bounded on R 2d and, for λ > 0, The boundedness of ψ follows easily from estimates (3.19) and (3.20) using the Sobolev embedding since in our case p > 2d. Let us concentrate on proving the Lipschitz continuity.
First we recall a Fubini type theorem for fractional Sobolev spaces (see [33]): [35, page 203]) we get the assertion. According to (3.46) we check Estimate (3.48) follows by (3.37) which gives Let us concentrate on (3.49). We still use the interpolation theory results of Section 3.2 but here in addition to (3.12) we also need to identify (3.19) and (3.20) in Theorem 3.4 and using (3.35) we with c = c(d, p) > 0. Thus we can consider the following linear maps (s ∈ (1/3, 1) will be fixed below) Interpolating, choosing s ∈ (1/3, 1) such that we get (see (3.8) and (3.10) with θ = s−s and by the estimates in (3.51) we find and we finally get (3.49).
(ii) We fix j = 1, . . . , d and prove the assertion with D v ψ replaced by ∂ vj ψ. By Theorem 3.4 we already know that there exists D v ∂ vj ψ ∈ L p (R 2d ). Therefore to show the assertion it is enough to check that there exists the weak derivative We use again (3.53) with the same θ. Since 2θ > 1 we know in particular that D ). Thus we have that there exists the weak derivative ∂ vj D x ψ(x, ·), for x a.e., and (3.55) This finishes the proof. Now we study the complete equation .12) and (3.13)). From the previous results we obtain (see also Definition 3.3) ) > 0 such that for any λ > λ 0 there exists a unique solution ψ = ψ λ ∈ X p,s to (3.56) and moreover , ψ is bounded on R 2d and there exist the classical derivatives D x ψ and D v ψ which are bounded and continuous on R 2d ; we also have with C(λ) → 0 as λ → ∞ Proof. First note that, since p > 2d, the boundedness of ψ follows by the Sobolev embedding (recall also (3.18)). Similarly the second estimate in (3.59) follows from (3.60).
We consider the Banach space ) and use an argument similar to the one used in the proof of [10, Proposition 5]. Introduce the operator T λ : Y → Y , . It is clear that T λ is linear and bounded. Moreover we find easily that there exists λ 0 > 0 such that for any λ > λ 0 we have that the operator norm of T λ is less than 1/2.
Let us fix λ > λ 0 . Since T λ is a strict contraction, there exists a unique solution f ∈ Y to f − T λ f = g . (3.61) Uniqueness. Let ψ 1 and ψ 2 be solutions in X p,s . Set w = ψ 1 − ψ 2 . We know that By uniqueness (see Theorem 3.4) we get that w = G λ f . Hence, for z a.e., Since T λ is a strict contraction we obtain that f = 0 and so ψ 1 = ψ 2 .
Existence. It is not difficult to prove that is the unique solution to (3.56).
Regularity of ψ and estimates. All the assertions follow easily from (3.62) since (I − T λ ) −1 g ∈ Y and we can apply Theorem 3.4, Lemmas 3.5 and 3.6.
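The contraction argument used above (solving $f - T_\lambda f = g$ when the operator norm of $T_\lambda$ is less than $1/2$) can be illustrated in finite dimensions. In the sketch below a matrix `T` with spectral norm below $1/2$ stands in for $T_\lambda$; this is a toy analogue chosen for illustration and not the operator on the Banach space $Y$ considered in the proof.

```python
import numpy as np

def solve_by_contraction(T, g, tol=1e-12, max_iter=200):
    """Solve f - T f = g by the fixed-point iteration f_{n+1} = g + T f_n.

    The iteration converges geometrically whenever the operator norm of T
    is strictly less than 1 (here we have in mind norm <= 1/2).
    Returns the approximate solution and the number of iterations used.
    """
    f = g.copy()
    for n in range(max_iter):
        f_next = g + T @ f
        if np.linalg.norm(f_next - f) < tol:
            return f_next, n + 1
        f = f_next
    raise RuntimeError("iteration did not converge")
```

The iterates are the partial sums of the Neumann series $\sum_{n \ge 0} T^n g$, which is one way to read the expression $(I - T_\lambda)^{-1} g$ appearing in the proof.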
In the Appendix we will also present a result on the stability of the PDE (3.56), see Lemma 6.1.

Regularity of the characteristics
We will prove existence of a stochastic flow for the SDE (1.2) assuming Hypothesis 2.1.
We can rewrite our SDE as follows.
(4.1) With this new notation, (1.2) can be rewritten as We have  is again defined componentwise (ũ : R 2d → R d ).
Remark 4.1. In the following, according to (4.1), we will say that the singular diffusion $Z_t$ (the noise acts only on the last $d$ coordinates $\{e_{d+1}, \dots, e_{2d}\}$) or the associated Kolmogorov operator are hypoelliptic to refer to the fact that the vectors $e_{d+1}, \dots, e_{2d}, Ae_{d+1}, \dots, Ae_{2d}$ generate $\mathbb{R}^{2d}$. Equivalently, using $Q$ given in (4.1) and the adjoint matrix $A^*$, the symmetric matrix $Q_t = \int_0^t e^{sA} Q e^{sA^*}\, ds$ is positive definite for any $t > 0$ (cf. (3.25)).
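Both formulations of hypoellipticity in Remark 4.1 can be checked numerically. The sketch below assumes the kinetic block structure $A = \begin{pmatrix} 0 & I \\ 0 & 0 \end{pmatrix}$ and $Q = \begin{pmatrix} 0 & 0 \\ 0 & I \end{pmatrix}$, which matches the description of the noise acting on the last $d$ coordinates but is not restated in this part of the text. The function names are illustrative.

```python
import numpy as np

def kinetic_matrices(d):
    """Assumed drift matrix A and noise covariance Q for the kinetic system."""
    I, O = np.eye(d), np.zeros((d, d))
    A = np.block([[O, I], [O, O]])   # dX = V dt
    Q = np.block([[O, O], [O, I]])   # noise on the last d coordinates only
    return A, Q

def kalman_rank(d):
    """Rank of [e_{d+1..2d}, A e_{d+1..2d}]; equals 2d when the vectors generate R^{2d}."""
    A, _ = kinetic_matrices(d)
    E = np.eye(2 * d)[:, d:]         # columns e_{d+1}, ..., e_{2d}
    return np.linalg.matrix_rank(np.hstack([E, A @ E]))

def covariance(t, d, n=2001):
    """Q_t = int_0^t e^{sA} Q e^{sA*} ds, approximated by the trapezoidal rule."""
    A, Q = kinetic_matrices(d)
    s = np.linspace(0.0, t, n)
    # A @ A = 0 for this A, so exp(sA) = I + sA exactly.
    E = np.eye(2 * d)[None, :, :] + s[:, None, None] * A[None, :, :]
    integrand = E @ Q @ np.transpose(E, (0, 2, 1))
    h = s[1] - s[0]
    return h * (integrand[0] / 2 + integrand[1:-1].sum(axis=0) + integrand[-1] / 2)

def covariance_closed_form(t, d):
    """Closed form of Q_t under the assumed block structure."""
    I = np.eye(d)
    return np.block([[t**3 / 3 * I, t**2 / 2 * I], [t**2 / 2 * I, t * I]])
```

Under these assumptions $\det Q_t = (t^4/12)^d$, so $Q_t$ is positive definite for every $t > 0$ even though $Q$ itself is singular.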
We collect here some preliminary results, which we will need later. Recall the OU process solving (4.6), i.e., $L_t = L_t^z = e^{tA}z + \int_0^t e^{(t-s)A} R\, dW_s$. Using the fact that $L_t$ is hypoelliptic, for any $t > 0$, one gets that the law of $L_t$ is equivalent to the Lebesgue measure in $\mathbb{R}^{2d}$ (see for example the proof of the next lemma).
We also have the following result.
Lemma 4.2. Let (L z t ) be the OU process solution of (4.6). Let f : R 2d → R belong to L q (R 2d ) for q > 2d. Then there exists a constant C depending on q, d and T such that Proof. We need to compute where P t is the Ornstein-Uhlenbeck semigroup introduced in (3.24). By changing variable and using the Hölder inequality we find, for t ∈ [0, T ], z ∈ R 2d , |f (e tA z + Q t y)| q dy with c q independent of z. We now have to study when t 0 1 (det(Q s )) 1/2q ds < ∞ . By a direct computation for s → 0 + (det(Q s )) 1/2q ∼ c(s 4d ) 1/2q , hence the result follows for q > 2d.
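The integrability condition at the end of the proof can be made explicit. The computation below is a sketch under the assumption that $A$ and $Q$ have the kinetic block form $A = \begin{pmatrix} 0 & I_d \\ 0 & 0 \end{pmatrix}$, $Q = \begin{pmatrix} 0 & 0 \\ 0 & I_d \end{pmatrix}$; in that case the asymptotic relation for $\det(Q_s)$ is in fact an identity.

```latex
Q_s = \int_0^s e^{rA} Q e^{rA^*}\,dr
    = \begin{pmatrix} \tfrac{s^3}{3} I_d & \tfrac{s^2}{2} I_d \\[2pt]
                      \tfrac{s^2}{2} I_d & s\, I_d \end{pmatrix},
\qquad
\det Q_s = \Big(\tfrac{s^4}{3} - \tfrac{s^4}{4}\Big)^{d} = \Big(\tfrac{s^4}{12}\Big)^{d}.
% Hence
\int_0^t \frac{ds}{(\det Q_s)^{1/(2q)}}
  = 12^{d/(2q)} \int_0^t s^{-2d/q}\,ds < \infty
\iff \frac{2d}{q} < 1
\iff q > 2d .
```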
We state now the classical Khas'minskii lemma for an OU process. The original version of this lemma ( [23], or [34, Section 1, Lemma 2.1]) is stated for a Wiener process, but the proof only relies on the Markov property of the process, so that its extension to this setting requires no modification.
we also have (4.10) We now introduce a generalization of the previous Khas'minskii lemma which we will use to prove the Novikov condition, allowing us to apply Girsanov's theorem. Proposition 4.4. Let (L t ) be the OU process solution of (4.6). Let f : R 2d → R belong to L q (R 2d ) for q > 2d. Then, there exists a constant K f depending on d, q, T and continuously depending on f L q (R 2d ) such that Proof. From Lemma 4.2, for any a > 1 s.t. q/a > 2d we get The next result can be proved by using the Girsanov theorem (cf. [22] and [28]).
defined on a stochastic basis (Ω, F, (F t ), P) on which it is defined an R d -valued Wiener process (W t ) = W , we can define the process is an F t -martingale. Then, by the Girsanov theorem (H t ) t∈[0,T ] is a d-dimensional Wiener process on (Ω, F T , (F s ) s≤T , Q), where Q is the probability measure on (Ω, F T ) having density Φ = Φ T with respect to P. We have that on the new probability space Uniqueness. To prove weak uniqueness we use some results from [28]. First note that the process We can apply to V = (V t ) [ i.e. the laws of V = (V t ) t∈[0,T ] and W = (W t ) t∈[0,T ] are equivalent. Moreover, by [28, It follows that, for any Borel set B ∈ B(C([0, T ]; R d )), this shows easily that uniqueness in law holds. Clearly (iii) follows from (ii). Let us prove (ii).
(ii) The processes L = (L t ) and Z = (Z t ), t ∈ [0, T ], satisfy the same equation (4.2) in (Ω, F, F t , Q, (H t )) and (Ω, F, F t , P, (W t )) respectively. Therefore, by weak uniqueness, the laws of L and Z on C([0, T ]; R 2d ) are the same (under the probability measures Q and P respectively). Hence, for any Borel set J ⊂ C([0, T ]; R 2d ), we have Since W t = ( L t , e d+1 , . . . , L t , e 2d ) we see that each W s is measurable with respect to the σ-algebra generated by the random variable L s , s ≤ T . By considering L as a random  We can now prove that the result of Lemma 4.2 holds also when replacing the OU process L t with Z t . Lemma 4.6. Let F ∈ L p (R 2d ; R d ) for p > 4d and Z z t be a solution of (4.2). Let f : R 2d → R belong to L q (R 2d ) for some q > 2d. Then there exists a constant C depending on q, d and T such that and a constant K f depending on q, d, T and continuously depending on f L q (R 2d ) for Proof. As seen in the previous proof, the laws of L t and Z t are the same under Q and P respectively. Then, applying Hölder's inequality with 1/a + 1/a = 1 we have Taking a > 1 small enough so that q/a > 2d, we can apply Lemma 4.2 to |f | a and control the first expectation on the right hand side with a constant times f L q (R 2d ) . Then we which has finite expectation due to Proposition 4.4. Both these estimates are uniform in z, so that (4.14) follows. Similarly, we have Both terms on the right hand side are finite due to Proposition 4.4: this proves (4.15).
From now to the end of the paper we will assume Hypothesis 2.1.

Lemma 4.7.
Any process $(Z_t)$ which is a solution of the SDE (4.2) has finite moments of any order, uniformly in $t \in [0, T]$: for any $q \ge 2$
Proof. It follows from (4.15) that for any $q \ge 1$, $E\big|\int_0^T F(Z_t)\, dt\big|^q \le C$. Using this bound, the explicit density of $RW_t$ and the Grönwall lemma we obtain the assertion.
In the proof of strong uniqueness of solutions of the SDE (4.2) we will have to deal with a new SDE with a Lipschitz drift coefficient, but a diffusion which only has derivatives in L p . However, following an idea of Veretennikov [36], we can deal with increments of the diffusion coefficient on different solutions by means of the process N t defined in (4.17). The following lemma generalizes Veretennikov's result to our degenerate kinetic setting and even provides bounds on the exponential of the process N t . It will be a key element to prove continuity of the flow associated to (4.2) and will also be used in Subsection 4.3 to study weak derivatives of the flow. Lemma 4.8. Let Z t , Y t be two solutions of (4.2) starting from z, y ∈ R 2d respectively, where · HS denotes the Hilbert-Schmidt norm. Then, N t is a well-defined, real valued, continuous, adapted, increasing process such that E and for any k ∈ R, uniformly with respect to the initial conditions z, y: We have where the constant C k depends on k, p, T and F L p (R 2d ) , but is uniform in z, y and r.
We can use again the Girsanov theorem (cf. the proof of Theorem 4.5). The process is a d-dimensional Wiener process on (Ω, (F s ) s≤T , F T , Q), where Q is the probability measure on (Ω, F T ) having the density ρ r with respect to P, Recalling the Ornstein-Uhlenbeck process L t (starting at z r ), i.e., we have: is an OU process on (Ω, (F s ) s≤T , F T , ρ r P). We now find, by the Hölder inequality, for some a > 1 such that 1/a + 1/a = 1, Observe that the bound on the moments of ρ r is uniform in the initial conditions z, y ∈ R 2d due to (4.21). Setting f (z) = |DD vũ (z)| 2a and using the Girsanov Theorem, assertion (4.20) follows from Lemma 4.2 if we fix a > 1 such that q = p/2a > 2d. Therefore, the process N t is well defined and E[N t ] < ∞ for all t ∈ [0, T ]. (4.18) and the other properties of N t follow.
To prove the exponential integrability of the process N t we proceed in a way similar to [15,Lemma 4.5]. Using the convexity of the exponential function we get and we can continue as above (superscripts denote the probability measure used to take expectations) The last integral is finite due to Proposition 4.4 because p/2 > 2d. The proof is complete. Proposition 4.9 (Itô formula). If ϕ : R 2d → R belongs to X p,s ∩ C 1 b and Z t is a solution of (4.2), for any 0 ≤ s ≤ t ≤ T the following Itô formula holds: Proof. Note that we can use (iii) in Theorem 4.5 to give a meaning to the critical term t s ∆ v ϕ(Z r ) dr. The result then follows approximating ϕ with regular functions and using Lemma 4.6.
Let $\varphi_\varepsilon \in C_c^\infty$ be such that $\varphi_\varepsilon \to \varphi$ in $X_{p,s}$. The functions $\varphi_\varepsilon$ satisfy the assumptions of the classical Itô formula, which provides an analogue of (4.24) for $\varphi_\varepsilon(Z_t)$. For any fixed $t$, the random variables $\varphi_\varepsilon(Z_t) \to \varphi(Z_t)$ P-almost surely. Using that $D\varphi$ is bounded and that, almost surely, $F(Z_r)$ and $AZ_r$ are in $L^1(0, T)$ (this follows by Lemma 4.6 and Lemma 4.7 respectively), the dominated convergence theorem gives the convergence of the first term in the Lebesgue integral. For the second term we use Lemma 4.
In the same way, one can show that $E \int_s^t |D_v\varphi_\varepsilon(Z_r) - D_v\varphi(Z_r)|^2\, dr$ converges to zero, which implies the convergence of the stochastic integral by the Itô isometry.

Remark 4.10.
Using the boundedness of $\varphi$, it is easy to generalize the above Itô formula (4.24) to $\varphi^a(Z_t)$ for any $a \ge 2$.
We can finally prove the well-posedness in the strong sense of the degenerate SDE (4.2). A different proof of this result in a Hölder setting is contained in [6], but no explicit control on the dependence on the initial data is given there, so that a flow cannot be constructed. See also the more recent results of [37]. We here present a different, and in some sense more constructive, proof. This approach, based on ideas introduced in [19], [24], [15], will even allow us to obtain some regularity results on certain derivatives of the solution. We will use Theorem 3.7 from Section 3.3, which provides the regularity X p,s ∩ C 1 b (R 2d ) of solutions of (4.5).
Theorem 4.11. Equation (4.2) is well posed in the strong sense.
Proof. Since we have weak well posedness by (i) of Theorem 4.5, the Yamada-Watanabe principle provides strong existence as soon as strong uniqueness holds. Therefore, we only need to prove strong uniqueness. This can be done by using an appropriate change of variables which transforms equation (4.2) into an equation with more regular coefficients. This method was first introduced in [19], where it is used to prove strong uniqueness for a non degenerate SDE with a Hölder drift coefficient.
Here, the SDE is degenerate and we only need to regularize the second component of the drift coefficient, $F(\cdot)$, which is not Lipschitz continuous. We therefore introduce the auxiliary PDE (4.5) with $\lambda$ large enough such that $\|DU_\lambda\|_{L^\infty(\mathbb{R}^{2d})} < 1/2$ holds (see (3.59)). In the following we will always use this value of $\lambda$ and, to ease notation, we shall drop the subscript for the solution $U_\lambda$ of (4.5), writing $U_\lambda = U$. Let $Z_t$ be one solution to (4.2) starting from $z \in \mathbb{R}^{2d}$. Since $U \in X_{p,s} \cap C_b^1$ (see Theorem 3.7), we can apply the Itô formula of Proposition 4.9 to $U(Z_t)$; using the SDE to rewrite the last term, we obtain a reformulation of (4.2) with more regular coefficients. Let now $Y_t$ be another solution starting from $y \in \mathbb{R}^{2d}$ and let $\gamma(z) = z + U(z)$. We have $\gamma(z) - \gamma(y) = z - y + U(z) - U(y)$, and so $|z - y| \le |U(z) - U(y)| + |\gamma(z) - \gamma(y)|$. Since we have chosen $\lambda$ such that $\|DU\|_{L^\infty(\mathbb{R}^{2d})} < 1/2$, there exist finite constants $C, c > 0$ such that the equivalence (4.28) between $|z - y|$ and $|\gamma(z) - \gamma(y)|$ holds. For $a \ge 2$, let us apply the Itô formula to $|\gamma(Z_t) - \gamma(Y_t)|^a$. Note that $Z_t$ has finite moments of all orders, and $U$ is bounded, so that also the process $\gamma(Z_t)$ has finite moments of all orders. Using also that $DU$ is a bounded function, we deduce that the stochastic integral is a martingale $M_t$. As in [24] and [14] we now consider the process $B_t$, where we have used the equivalence (4.28) between $|Z_t - Y_t|$ and $|\gamma(Z_t) - \gamma(Y_t)|$ and $N_t$ is the process defined by (4.17) and studied in Lemma 4.8. Just as the process $N_t$, also $B_t$ has finite moments, and even its exponential has finite moments. With this notation at hand, and again by the Itô formula, the term $e^{-C_{a,d}B_t}\, dM_t$ is still the differential of a zero-mean martingale. Integrating, taking the expected value, and using again the equivalence (4.28) between $|Z_t - Y_t|$ and $|\gamma(Z_t) - \gamma(Y_t)|$ and the fact that $U$ is Lipschitz continuous, we obtain an estimate to which Grönwall's inequality applies: there exists a finite constant $C$ such that (4.33) holds. Using that $B_t$ is increasing and a.s. $B_T < \infty$, taking $z = y$ we get for any fixed $t \in [0, T]$ that $P(Z_t \neq Y_t) = 0$.
Strong uniqueness follows by the continuity of trajectories. This completes the proof.
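The equivalence between $|z - y|$ and $|\gamma(z) - \gamma(y)|$ used throughout the proof follows from the bound $\|DU\|_{L^\infty} < 1/2$ alone. Written out, with $\gamma(z) = z + U(z)$ as suggested by the identity $\gamma(z) - \gamma(y) = z - y + U(z) - U(y)$; the explicit constants $1/2$ and $3/2$ are worked out here and are not quoted from the text:

```latex
|U(z) - U(y)| \le \|DU\|_{L^\infty(\mathbb{R}^{2d})}\,|z-y| \le \tfrac12\,|z-y| ,
% so that, by the triangle inequality in both directions,
|\gamma(z)-\gamma(y)| \le |z-y| + |U(z)-U(y)| \le \tfrac32\,|z-y| ,
\qquad
|\gamma(z)-\gamma(y)| \ge |z-y| - |U(z)-U(y)| \ge \tfrac12\,|z-y| .
```

In particular $\gamma$ is injective and bi-Lipschitz, so uniqueness for the transformed processes $\gamma(Z_t)$, $\gamma(Y_t)$ transfers back to $Z_t$, $Y_t$.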

Corollary 4.12.
Using the finite moments of the exponential of the process B t , we can also prove that for any a ≥ 2, (4.34) Proof. Using Hölder's inequality and for an appropriate constant c, we have

Stochastic flow
The main result of this section is the existence of a stochastic flow generated by the SDE (4.2), which is presented in Theorem 4.18. This result follows in a standard way from the results of Lemma 4.13 and Corollary 4.17. One possible line of proof is to follow [25,Chapter II.2] or [26,Chapter 4.5], adapting such results to the irregular coefficients (as in [15]) and degenerate setting considered here.
Another standard result which follows from Corollary 4.12 and Lemma 4.13 is the (local) Hölder continuity of the flow, which we present in Theorem 4.15. Lemma 4.13. Let a be any real number. Then there is a positive constant C a independent of t ∈ [0, T ] and z ∈ R 2d such that Proof. Using the boundedness of the solution U of the PDE (4.5) (see (4.25)) one can show the equivalence Then, it is enough to prove that E 1 + |γ t | 2 a ≤ C a,d 1 + |γ(z)| 2 a . Set f (z) := (1 + |z| 2 ). The idea is to apply the Itô formula to g(γ t ), where g(z) = f a (z). Since

Here we have used the relation d γ t , γ t = σ(γ t ) σ t (γ t ) dt . Since γ t has finite moments, the first term on the right hand side of (4.35) is a martingale with zero mean. Note that f (z) ≥ 1, so that f a−1 ≤ f a and |z| ≤ f 1/2 (z). Moreover, since σ is bounded and b is Lipschitz continuous, | b(z)| ≤ C(1 + |z|) ≤ Cf 1/2 (z). Using all this, we can see that the second and third term on the right hand side of (4.35) are dominated by a constant times t 0 g(γ s ) ds. Therefore, taking expectations in (4.35) we have The first inequality was obtained in Corollary 4.12. To prove the second inequality we use the equivalence (4.28) between Z t and γ(Z t ). We use the Itô formula (4.29) for γ(Z t ) and γ(Z s ): we can control the differences of the first and last term using the fact that U and DU are bounded, together with Burkholder's inequality and for the linear part we use Hölder's inequality and Lemma 4.13: Applying Kolmogorov's regularity theorem (see [25,Theorem I.10.3]), we immediately obtain the following Theorem 4.15. The family of random variables (Z z t ), t ∈ [0, T ], z ∈ R d , admits a modification which is locally α-Hölder continuous in z for any α < 1 and β-Hölder continuous in t for any β < 1/2.
From now on, we shall always use the continuous modification of Z provided by this theorem.
To obtain the injectivity of the flow, we review the computations of Proposition 4.14: we now want to allow the exponent $a$ to be negative. The proof of the following lemma is given in the Appendix.

Lemma 4.16.
Let $a$ be any real number and $\varepsilon > 0$. Then there is a positive constant $C_{a,d}$ (independent of $\varepsilon$) such that, for any $t \in [0, T]$ and $z, y \in \mathbb{R}^{2d}$, (4.37)

From the above results one can obtain the following theorem. The line of proof is quite standard, but the interested reader can find a complete proof in Section 4.2 of [17].

Regularity of the derivatives
Although $F$ is not even weakly differentiable, from the reformulation (4.26) of equation (4.2) it is reasonable to expect differentiability of the flow, since the derivatives $DX_t$, $DV_t$ with respect to the initial conditions $(x, v)$ formally solve suitable SDEs with well-defined, integrable coefficients. We have the following result.

Then, for any $t \in [0, T]$, P-a.s., the random variable $\phi_t(z)$ admits a weak distributional derivative with respect to $z$; moreover $D_z\phi_t \in L^p_{loc}(\Omega \times \mathbb{R}^{2d})$ (i.e., $D_z\phi_t \in L^p(\Omega \times K)$ for any compact set $K \subset \mathbb{R}^{2d}$), for any $p \ge 1$.
where the process $N_t$ is defined as in (4.17), but with $Z = \phi(z + he)$ and $Y = \phi(z)$, and for every $h > 0$, $dM^h_t$ is the differential of a martingale because $DU$ is bounded and $\xi^h_t$ has finite moments. Setting $C_p = (C_2)^2 C_{p,d}$ we get

After integrating and taking expectations we find

A similar estimate holds for the case $i \le d$. We now apply Grönwall's inequality and, proceeding as in the proof of Corollary 4.12, we finally get that (4.42)

Step 2. Derivative of the Flow. Remark that, due to the boundedness of $DU$, the bound (4.42) is uniform in $h$ and $z$, and we get

Stochastic kinetic equation
We present here results on the stochastic kinetic equation (1.1). The first result concerns existence of solutions with a certain Sobolev regularity (see Theorem 5.4). The second one is about uniqueness of solutions (see Theorem 5.7).
We will use the results of the previous sections together with results similar to the ones given in [16] to approximate the flow associated to the equation of characteristics. We report them in the Appendix for the sake of completeness. To prove that some degree of Sobolev regularity of the initial condition is preserved one has to deal with weakly differentiable solutions, according to the definition introduced in [16].

2. $P\big(f(t, \cdot) \in \cap_{r \ge 1} W^{1,r}_{loc}(\mathbb{R}^{2d})\big) = 1$ for every $t \in [0, T]$ and both $f$ and $Df$ are in

In the next result the inverse of $\phi_t$ will be denoted by $\phi^t_0$.
Theorem 5.4. If $F$ satisfies Hypothesis 2.1 and $f_0 \in \cap_{r \ge 1} W^{1,r}(\mathbb{R}^{2d})$, then $f(t, z) := f_0(\phi^t_0(z))$ is a weakly differentiable solution of the stochastic kinetic equation (1.1).

Proof. The proof follows the one of [16, Theorem 10]. We divide it into several steps.
Let $f_{0,n}$ be a sequence of smooth functions which converges to $f_0$ in $W^{1,r}(\mathbb{R}^{2d})$, for any $r \ge 1$, and so uniformly on $\mathbb{R}^{2d}$ by the Sobolev embedding. This can be done for instance by using standard convolution with mollifiers. Moreover, suppose that $F_n$ are smooth approximations converging to $F$ in $L^p(\mathbb{R}^{2d})$ ($p$ is given in Hypothesis 2.1), let $\phi_{t,n}$ be the regular stochastic flow generated by the SDE (4.3) where $B$ is replaced by $B_n = RF_n$, and let $\phi^t_{0,n}$ be the inverse flow. Then $f_n(t, z) := f_{0,n}(\phi^t_{0,n}(z))$ is a smooth solution of

We shall pass to the limit in each one of these terms. We are forced to use this very weak convergence due to the term

where we may only use weak convergence of $Df_n$.
Step 2. Convergence of $f_n$ to $f$. We claim that, uniformly in $n$ and for every $r \ge 1$,

Let us show how to prove the second bound; the first one can be obtained in the same way. The key estimate is the bound (6.6) on the derivative of the flow, which is proved in the Appendix. We use the representation formula for $f_n$ and the Hölder inequality to obtain

The first term on the right-hand side can be uniformly bounded using Lemma 6.3. Also the last integral can be bounded uniformly: changing variables (all functions are regular) we get $\int |Df_{0,n}(y)|^{2r}\, E\big[J_{\phi_{t,n}}(y)\big]\,dy$, where $J_{\phi_{t,n}}(y)$ is the Jacobian determinant of $\phi_{t,n}(y)$. Then we conclude using again the Hölder inequality, (6.6) and the boundedness of $(f_{0,n})$ in $W^{1,r}(\mathbb{R}^{2d})$ (for every $r \ge 1$). Remark that all the bounds obtained are uniform in $n$ and $t$.
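The Hölder and change-of-variables step of Step 2 can be sketched as follows. This is a reconstruction under the assumption that the representation formula reads $f_n(t, z) = f_{0,n}(\phi^t_{0,n}(z))$ with both maps smooth; the ball $B_R$ and the use of the Cauchy-Schwarz form of the Hölder inequality are illustrative choices, not taken from the text.

```latex
% Chain rule for f_n(t,z) = f_{0,n}(\phi^t_{0,n}(z)):
\[
  |Df_n(t,z)| \le |Df_{0,n}(\phi^t_{0,n}(z))|\,|D\phi^t_{0,n}(z)| .
\]
% Cauchy-Schwarz on \Omega \times B_R:
\[
  E\int_{B_R} |Df_n(t,z)|^r\,dz
  \le \Big( E\int_{B_R} |D\phi^t_{0,n}(z)|^{2r}\,dz \Big)^{1/2}
      \Big( E\int_{B_R} |Df_{0,n}(\phi^t_{0,n}(z))|^{2r}\,dz \Big)^{1/2} .
\]
% Change of variables y = \phi^t_{0,n}(z), i.e. z = \phi_{t,n}(y), dz = J_{\phi_{t,n}}(y) dy
% (the Jacobian determinant of a flow is positive; f_{0,n} is deterministic):
\[
  E\int_{B_R} |Df_{0,n}(\phi^t_{0,n}(z))|^{2r}\,dz
  \le \int_{\mathbb{R}^{2d}} |Df_{0,n}(y)|^{2r}\, E\big[J_{\phi_{t,n}}(y)\big]\,dy .
\]
```

The first factor is of the type controlled by a uniform bound on the derivative of the inverse flow, and the second is the integral that appears after the change of variables.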
We can now consider the convergence of $f_n$ to $f$. Let us first prove that, given $t \in [0, T]$ and $\varphi \in C^\infty_c(\mathbb{R}^{2d})$,

(convergence in probability). Using the representation formulas $f_n = f_{0,n}(\phi^t_{0,n})$, $f = f_0(\phi^t_0)$ and the Sobolev embedding $W^{1,4d} \hookrightarrow C^{1/2}$ we have ($\mathrm{Supp}(\varphi) \subset B_R$, where $B_R$ is the ball of radius $R > 0$ and center $0$)

The first term converges to zero by the uniform convergence of $f_{0,n}$ to $f_0$. From

and the convergence in probability (5.5) follows. This allows us to pass to the limit in the first and in the last term of equation (5.1).

which allows us to pass to the limit in the stochastic integral term of (5.1). Hence, one can easily show convergence of all terms in (5.1) except for the one in (5.2), which will be treated in Step 4.

on $Df_n$. We can then apply [16, Lemma 16], which gives $P\big(f(t, \cdot) \in W^{1,r}_{loc}(\mathbb{R}^{2d})\big) = 1$ for any $r \ge 1$ and $t \in [0, T]$, and

for every $R > 0$ and $t \in [0, T]$. Hence, by monotone convergence we have

A similar bound can be proved for $f$ itself using (5.3), the convergence in probability (5.5) and the Vitali convergence theorem.
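The exponent in the embedding of $W^{1,4d}$ into $C^{1/2}$ used above comes from Morrey's inequality on $\mathbb{R}^N$ with $N = 2d$. The short computation below is included only as a check; $N$, $q$, $\theta$ and $u$ are auxiliary symbols that do not appear in the text.

```latex
% Morrey: for q > N, W^{1,q}(R^N) embeds into C^{0,\theta}(R^N) with \theta = 1 - N/q.
\[
  N = 2d,\quad q = 4d \;\Longrightarrow\; \theta = 1 - \frac{2d}{4d} = \frac12 ,
\]
% so that, for u in W^{1,4d}(R^{2d}) and z, z' in R^{2d},
\[
  |u(z) - u(z')| \le C\,\|u\|_{W^{1,4d}(\mathbb{R}^{2d})}\,|z - z'|^{1/2} .
\]
```

Applied to $u = f_{0,n}$, an estimate of this kind presumably lets one control $f_{0,n}(\phi^t_{0,n}) - f_{0,n}(\phi^t_0)$ by $|\phi^t_{0,n} - \phi^t_0|^{1/2}$, uniformly in $n$.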
Step 4. Passage to the limit. Finally, we prove that we can pass to the limit in equation (5.1) and deduce that $f$ satisfies property 3 of Definition 5.1. It remains to consider the term $E\big[Y \int_0^t \int_{\mathbb{R}^{2d}} b_n(s, z) \cdot Df_n(s, z)\, \varphi(z)\, dz\, ds\big]$. Since $F_n \to F$ in $L^p(\mathbb{R}^{2d})$, it is sufficient to use a suitable weak convergence of $Df_n$ to $Df$. Precisely, for $t \in [0, T]$,

We have to prove that both $I^{(1)}_n(t)$ and $I^{(2)}_n(t)$ converge to zero as $n \to \infty$. By the Hölder inequality, for all $t \in [0, T]$

where $1/p + 1/p' = 1$ and $C = C_{Y,T,\varphi}$. Thus, from (5.4), $I^{(1)}_n(t)$ converges to zero as $n \to \infty$. Moreover, $h_n(s)$ converges to zero as $n \to \infty$ for almost every $s$ and satisfies the assumptions of the Vitali convergence theorem (we shall prove these two claims in Step 5 below). Hence $I^{(2)}_n(t)$ converges to zero. Now we may pass to the limit in equation (5.1) and from the arbitrariness of $Y$ we obtain property 3 of Definition 5.1.
Step 5. Auxiliary facts. We have to prove the two properties of $h_n(s)$ claimed in Step 4. For every $s \in [0, T]$, [16, Lemma 16] gives

We may extend the convergence property (5.8) to all $\varphi \in L^p(\mathbb{R}^{2d})$ by means of the bounds (5.4) and (5.7), which proves the first claim.

We now present the uniqueness result for weakly differentiable solutions. We exploit (in Step 2 of the proof of Theorem 5.7) a renormalization property of solutions, which is proved in Step 1. The proof seems to be of independent interest, see the following remark.
Remark 5.5. The main idea of our proof is to exploit the specific form of the equation by using, in Step 2 of the proof, localizing test functions that behave differently in the $x$ and $v$ variables. We then have to perform two limits, and choosing the right order allows us to deal with the problematic part of the drift coefficient.
It seems that this small trick extends the applicability of the classical line of proof, based on renormalized solutions, the DiPerna-Lions commutator lemma [12] and Grönwall's lemma, to a wider class of degenerate equations.

Remark 5.6. Potentially, the proof could also be carried out via the maximum principle, along the lines of [40, Section 4]. This however requires a generalization of the known results, since for the linear part of the drift term we only have $v/(1+|z|) \in L^\infty(\mathbb{R}^{2d})$ (which allows one to obtain the renormalization property for solutions $f$), but $v/(1+|z|) \notin L^2(\mathbb{R}^{2d})$. Therefore, $b(z)/(1+|z|) \notin L^2(\mathbb{R}^{2d})$.
Theorem 5.7. If $F$ satisfies Hypothesis 2.1 and, moreover, $\operatorname{div}_v F \in L^\infty(\mathbb{R}^{2d})$ ($\operatorname{div}_v F$ is understood in the distributional sense), then weakly differentiable solutions are unique.
Proof. By linearity of the equation we just have to show that the only solution starting from $f_0 = 0$ is the trivial one.
Step 1. $f^2$ is a solution. We prove that for any solution $f$, the function $f^2$ is still a weak solution of the stochastic kinetic equation. Take test functions of the form $\varphi^n_\zeta(z) = \rho_n(\zeta - z)$, where $(\rho_n)_n$ is a family of standard mollifiers ($\rho_n$ has support in $B_{1/n}$). Let $\zeta = (\xi, \nu) \in \mathbb{R}^{2d}$, $f_n(t, \zeta) = (f(t, \cdot) * \rho_n)(\zeta)$. By definition of solution we get that, P-a.s.,

The functions $f_n$ are smooth in the space variable. For any fixed $\zeta \in \mathbb{R}^{2d}$, by the Itô formula we get

Now we multiply by $\varphi \in C^\infty_c(\mathbb{R}^{2d})$ and integrate over $\mathbb{R}^{2d}$. Using the Itô integral we pass to the limit as $n \to \infty$ and find, P-a.s.,

Let us fix $t \in [0, T]$. By definition of weakly differentiable solution it is not difficult to pass to the limit in probability as $n \to \infty$ in all the terms in the left-hand side of (5.9). Indeed, we can use that, for every $t \in [0, T]$, $r \ge 1$, $f_n(t, \cdot) \to f(t, \cdot)$ in $W^{1,r}_{loc}(\mathbb{R}^{2d})$, P-a.s.,

which is finite. Thus we can apply the Vitali theorem and deduce the assertion. Let us check (5.11). By Sobolev regularity of weakly differentiable solutions we know that

Hence it is enough to prove that $\int_{B_R} |R_n(s, \zeta)|\, d\zeta \to 0$. Recall that

Using the fact that $b \in L^p_{loc}(\mathbb{R}^{2d})$, with $p$ given in Hypothesis 2.1, the Hölder inequality and basic properties of convolutions we have

as $n \to \infty$. This shows that (5.11) holds. We have proved that also $f^2$ is a weakly differentiable solution of the stochastic kinetic equation.

Now we fix $m \ge 1$ and pass to the limit as $n \to \infty$ by the Lebesgue theorem. We infer

$\int_{\mathbb{R}^{2d}} g(s, z)\, \Delta\eta(v/m)\, dz\, ds$.

Passing to the limit as $m \to \infty$ we arrive at

Since in particular $g(t, z) \in C^0([0, T]; W^{1,r}(\mathbb{R}^{2d}))$, with $r = p/(p-1)$, we obtain

Applying the Grönwall lemma we get that $g$ is identically zero and this proves uniqueness for the kinetic equation.
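The final Grönwall step can be made explicit as follows. Here $u(t) \ge 0$ is a hypothetical name for the bounded, measurable quantity controlled in the last estimate (for instance an integral of $g(t, \cdot)$), and $C$ is the constant appearing there; both are assumptions about notation that is not recoverable from the text.

```latex
% Assumption: 0 <= u(t) <= C \int_0^t u(s) ds for all t in [0,T], u bounded and measurable.
% Set w(t) := e^{-Ct} \int_0^t u(s) ds, which is absolutely continuous with w(0) = 0.
\[
  w'(t) = e^{-Ct}\Big( u(t) - C\int_0^t u(s)\,ds \Big) \le 0
  \quad\text{for a.e. } t
  \;\Longrightarrow\; w(t) \le w(0) = 0 .
\]
% Since u >= 0 we also have w >= 0, hence \int_0^t u(s) ds = 0 and
\[
  0 \le u(t) \le C\int_0^t u(s)\,ds = 0 \qquad \text{for all } t \in [0,T] .
\]
```

In this homogeneous form no exponential factor survives, which is why a zero initial condition forces the solution to vanish identically.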
The stochastic integral in (6.1) is a martingale with zero mean ($\sigma$ is bounded). Proceeding as in (4.32), we get $E\big[e^{-N_t} g(\eta_t) - e^{-N_0} g(\eta_0)\big] \le C_{a,d} \int_0^t E\big[e^{-N_s} g(\eta_s)\big]\, ds$.
We now present some results on the convergence and regularity of approximations $\phi^t_{0,n}$ of the inverse flow $\phi^t_0$ associated to the SDE (4.2). Note that $\phi^t_{0,n}$ are solutions of SDEs with regular coefficients, see the proof of Theorem 5.4. These results are adapted from [16] and based on the following lemma on the stability of the PDE (4.5), which is of independent interest.

Lemma 6.1 (Stability of the PDE (4.5)). Let $U_n$ be the unique solutions provided by Theorem 3.7 to the PDE (4.5) with smooth approximations $B_n(z) = (0, F_n(z))$ of $B(z) = (0, F(z))$ and some $\lambda$ large enough for (6.4) to hold. If $F_n \to F$ in $L^p(\mathbb{R}^d_v; H^s_p(\mathbb{R}^d_x))$, with $s, p$ as in Hypothesis 2.1, then $U_n$ and $D_v U_n$ converge pointwise and locally uniformly to the respective limits. In particular, for any $r > 0$ there exists a function $g(n) \to 0$ as $n \to \infty$ such that

Proof. To ease notation, we shall prove the convergence result for the forward flows $\phi_{t,n} \to \phi_t$. This is enough since the backward flow solves the same equation with a drift of opposite sign. Since the flow $\phi_t$ is jointly continuous in $(t, z)$, the image of $[0, T] \times B_R$ is contained in $[0, T] \times B_r$ for some $r < \infty$. Thus for $z \in B_R$, from Lemma 6.1 we get $|U_n(\phi_{t,n}) - U(\phi_t)| \le g(n) + \tfrac12 |\phi_{t,n} - \phi_t|$ and $|D_v U_n(\phi_{t,n}) - D_v U(\phi_t)| \le g(n) + |D_v U_n(\phi_{t,n}) - D_v U_n(\phi_t)|$. Extending the definition (4.27) to $\gamma_n(z) = z + U_n(z)$ we have the approximate equivalence $\tfrac23 |\gamma_n(\phi_{t,n}) - \gamma(\phi_t)| - g(n) \le |\phi_{t,n} - \phi_t| \le 2\, |\gamma_n(\phi_{t,n}) - \gamma(\phi_t)| + g(n)$.
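The approximate equivalence can be checked directly from the two ingredients stated above, namely $\gamma_n(z) = z + U_n(z)$ (and likewise $\gamma(z) = z + U(z)$) together with the bound $|U_n(\phi_{t,n}) - U(\phi_t)| \le g(n) + \tfrac12 |\phi_{t,n} - \phi_t|$. The abbreviations $\delta$ and $\Delta$ are ours, and the sketch only recovers the upper bound with $2 g(n)$ in place of $g(n)$, which is harmless since $g(n)$ may be redefined and still tends to zero.

```latex
% Abbreviations (not in the text): \delta := |\phi_{t,n} - \phi_t|,
% \Delta := |\gamma_n(\phi_{t,n}) - \gamma(\phi_t)|, with g(n) >= 0.
% Since \gamma_n(\phi_{t,n}) - \gamma(\phi_t) = (\phi_{t,n} - \phi_t) + (U_n(\phi_{t,n}) - U(\phi_t)):
\[
  \Delta \le \delta + g(n) + \tfrac12\,\delta = \tfrac32\,\delta + g(n)
  \;\Longrightarrow\;
  \tfrac23\,\Delta - g(n) \le \tfrac23\,\Delta - \tfrac23\,g(n) \le \delta ,
\]
\[
  \delta \le \Delta + g(n) + \tfrac12\,\delta
  \;\Longrightarrow\;
  \delta \le 2\,\Delta + 2\,g(n) .
\]
```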
Therefore, it is enough to prove the convergence result for the transformed flows $\gamma_{t,n} = \gamma_n(\phi_{t,n}) \to \gamma(\phi_t) = \gamma_t$. Proceeding as in the proof of Theorem 4.11 we get, for any $a \ge 2$, $\frac{1}{a}\, d|\gamma_{t,n} - \gamma_t|^a \le |\gamma_{t,n} - \gamma_t|^{a-2}\, (\gamma_{t,n} - \gamma_t) \cdot \big[ \lambda \big( U_n(\phi_{t,n}) - U(\phi_t) \big) + A(\phi_{t,n} - \phi_t) \big]\, dt + \dots$

The stochastic integral is a martingale. Since $\frac{|\phi_{t,n} - \phi_t|}{|\gamma_{t,n} - \gamma_t|} \le C \Big( 1 + \frac{g(n)}{|\gamma_{t,n} - \gamma_t|} \Big)$, the term on the last line in (6.5) can be bounded using (6.3) by a constant times $|\gamma_{t,n} - \gamma_t|^a\, dB_{t,n} + |\gamma_{t,n} - \gamma_t|^{a-2}\, g^2(n)\, (dB_{t,n} + dt)$, where for every $n$ the process $B_{t,n}$ is defined as in (4.31) but with $DU_n(\phi_{t,n})$ and $DU_n(\phi_t)$ in the place of $DU(Z_t)$ and $DU(Y_t)$ respectively. One can show that $B_{t,n}$ shares the same integrability properties as the process $N_t$ studied in Lemma 4.8, uniformly in $n$, see [16, Lemma 14]. Computing $E\big[e^{-B_{t,n}} |\gamma_{t,n} - \gamma_t|^a\big]$ using the Itô formula and taking the supremum over $t \in$

Using the integrability properties of $\phi_t$, $\phi_{t,n}$, $U(\phi_t)$, $U_n(\phi_{t,n})$ one can see that all terms are bounded, uniformly in $n$. To conclude the proof we can pass to the limit lim sup E e −Bt,n γ t,n − γ t a ds , apply Grönwall's lemma and proceed as in Corollary 4.12 to get rid of the exponential term.

Proof. Let us show the bound for the forward flows $\phi_{t,n}$. These are regular flows: let $\theta_{t,n}$ and $\xi_{t,n}$ denote the weak derivatives $D\phi_{t,n}$ and $D\gamma_{t,n} = D\gamma_n(\phi_{t,n})$, respectively. They are equivalent in the sense of (4.39), so we shall prove the bound for $\xi_{t,n}$ instead of $\theta_{t,n}$.
Proceeding as in the proof of Theorem 4.19 we obtain as in (4.41) $d\big( e^{-C_1 B_{t,n}} |\xi_{t,n}|^a \big) \le e^{-C_1 B_{t,n}}\, C_2\, |\xi_{t,n}|^a\, dt + dM_t$, where the process $B_{t,n}$ is simply given by $\int_0^t |D D_v U_n(\phi_{s,n})|^2\, ds$. We can integrate, take expected values and the supremum over $t \in [0, T]$, and apply Grönwall's inequality to get $\sup_{t \in [0,T]} E\big[ e^{-C_1 B_{t,n}} |\xi_{t,n}|^a \big] \le C_T\, |\xi_{0,n}|^a = C_{a,d,T}$.
Observe that this bound is uniform in $n$ and $z \in \mathbb{R}^{2d}$. Proceeding as in Corollary 4.12 we can get rid of the exponential term and obtain the desired uniform bound on $\xi_{t,n}$.
Finally, we thank the referees for their useful comments.