Itô-Krylov's formula for a flow of measures

We prove Itô's formula for the flow of measures associated with an Itô process having a bounded drift and a uniformly elliptic and bounded diffusion matrix, and for functions in an appropriate Sobolev-type space. This formula is the almost analogue, in the measure-dependent case, of the Itô-Krylov formula for functions in a Sobolev space on $\mathbf{R}^+ \times \mathbf{R}^d$.


Introduction
We fix (Ω, F, (F_t)_{t≥0}, P) a filtered probability space satisfying the usual conditions. Let T > 0 be a finite time horizon, d, d_1 ∈ N* with d_1 ≥ d, and (B_t)_{t≥0} an (F_t)_{t≥0}-Brownian motion of dimension d_1.
We consider the Itô process on R^d defined, for t ∈ [0, T], by

X_t = X_0 + ∫_0^t b_s ds + ∫_0^t σ_s dB_s,    (1.1)

where X_0 ∈ L^2(Ω, F_0; R^d), and b : [0, T] × Ω → R^d and σ : [0, T] × Ω → R^{d×d_1} are progressively measurable processes. In the following, we will denote by µ_t the law of X_t and by a the matrix σσ*.
Let us fix a real-valued function u defined on the 2-Wasserstein space P_2(R^d), i.e. the space of probability measures on R^d having a finite moment of order 2. In this paper, we are interested in Itô's formula for u and the flow of probability measures (µ_t)_{t∈[0,T]}. This formula describes the dynamics of t ↦ u(µ_t), essentially by computing its derivative (see (1.2) below). It has a wide range of applications, for example in Mean-Field Games, McKean-Vlasov control problems and McKean-Vlasov Stochastic Differential Equations (SDEs), but also in the study of interacting particle systems and the propagation of chaos. These applications will be detailed below.
Itô's formula for a flow of measures naturally requires differential calculus on the space of measures P_2(R^d). We will use the linear (functional) derivative, which is a standard notion of differentiability for functions of measures relying on the convexity of P_2(R^d). The function u admits a linear derivative if there exists a real-valued and continuous function δu/δm defined on P_2(R^d) × R^d, at most of quadratic growth with respect to the space variable uniformly on each compact set of P_2(R^d), and such that for all µ, ν ∈ P_2(R^d),

u(µ) − u(ν) = ∫_0^1 ∫_{R^d} (δu/δm)((1 − λ)ν + λµ)(v) d(µ − ν)(v) dλ.

The standard Itô formula for a flow of measures can be found in [4] (see Theorem 6.1), in Section 3 of [11], and in Chapter 5 of [9] (see Theorem 5.99) under less restrictive assumptions. It states that for all t ∈ [0, T],

u(µ_t) = u(µ_0) + ∫_0^t E[b_s • ∂_v (δu/δm)(µ_s)(X_s)] ds + (1/2) ∫_0^t E[a_s • ∂_v^2 (δu/δm)(µ_s)(X_s)] ds,    (1.2)

where x • y denotes the usual scalar product of two vectors x, y ∈ R^d and A • B := Tr(A*B) the usual scalar product of two matrices A, B ∈ R^{d×d}. The common point between these results is that the function u has to be C^2 in some sense. More precisely, it is always assumed that for all µ ∈ P_2(R^d), the linear derivative (δu/δm)(µ)(•) belongs to C^2(R^d), or equivalently that the L-derivative ∂_µ u(µ)(•) belongs to C^1(R^d) (see below for the definition of the L-derivative and its link with the linear derivative). This paper aims at proving Itô's formula (1.2) for functions u having a linear derivative δu/δm that is not C^2 with respect to the space variable.
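Formula (1.2) can be sanity-checked numerically in the simplest linear case u(µ) = ∫ x² dµ(x), for which ∂_v (δu/δm)(µ)(v) = 2v and ∂_v² (δu/δm)(µ)(v) = 2; for X_t = X_0 + B_t in dimension 1 (so b ≡ 0, a ≡ 1), the formula predicts u(µ_t) = u(µ_0) + t. The following Monte Carlo sketch is purely illustrative and not part of the paper's argument:

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 400_000, 1.0

# Itô process X_t = X_0 + B_t with X_0 ~ N(0, 1): drift b = 0, diffusion sigma = 1.
x0 = rng.normal(size=n)
xt = x0 + np.sqrt(t) * rng.normal(size=n)

# Linear test function on P_2(R): u(mu) = second moment of mu.
u_mu0 = np.mean(x0**2)          # u(mu_0), close to 1
u_mut = np.mean(xt**2)          # u(mu_t), close to 1 + t

# Itô's formula (1.2) for this u reduces to u(mu_t) = u(mu_0) + (1/2) int_0^t E[2] ds.
prediction = u_mu0 + t
print(abs(u_mut - prediction))  # small Monte Carlo error
```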
We now fix the assumptions on the Itô process (X_t)_{t∈[0,T]}. In this paper, we always assume that the drift b and the diffusion matrix σ in (1.1) satisfy the following properties.
(A) There exists K > 0 such that almost surely, for all t ∈ [0, T], |b_t| + |σ_t| ≤ K.

(B) There exists δ > 0 such that almost surely, for all t ∈ [0, T] and all ξ ∈ R^d, a_t ξ • ξ ≥ δ|ξ|^2.

Assumptions (A) and (B) stem from Section 2.10 of [20]. Therein, Krylov deals with controlled diffusion processes and needs to apply the standard Itô formula to the so-called pay-off function, which is not C^2. That is why he proves an extension of the classical Itô formula for the Itô process (X_t)_{t∈[0,T]} satisfying Assumptions (A) and (B), and for a function g : R^d → R belonging to an appropriate Sobolev space. The crucial point is that (X_t)_t satisfies the non-degeneracy Assumption (B). It ensures that the noise does not degenerate, which produces a regularizing effect. Let us explain how. The non-degeneracy assumption leads to Krylov's inequality (see Theorem 4.1, taken from Section 2.3 of [20]). This inequality, in turn, implies that for almost all t ∈ [0, T], µ_t, the law of X_t, has a density p(t, •) with respect to the Lebesgue measure (see Proposition 4.3). Moreover, this density belongs to L^{(d+1)'}([0, T] × R^d), where (d + 1)' denotes the conjugate exponent of d + 1 defined in Section 2. The existence of densities together with this integrability property makes it possible to assume only Sobolev regularity for the function g. More precisely, Itô-Krylov's formula is established under the assumption that g is continuous on R^d and that ∇g belongs to the Sobolev space W^{1,k}_loc(R^d) for k ≥ d + 1, i.e. that ∇g and ∇^2 g are in L^k_loc(R^d) (see Section 2.10 of [20]).
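Informally, the role of the exponents can be summarized by a duality pairing: once the marginal laws have densities p with p ∈ L^{(d+1)'}([0, T] × R^d), any time-space average of a function f along the trajectories is controlled through Hölder's inequality. This schematic display is our summary, with f standing for any of the derivatives involved:

```latex
\mathbb{E}\int_0^T f(t,X_t)\,dt
  = \int_0^T \int_{\mathbf{R}^d} f(t,x)\,p(t,x)\,dx\,dt
  \le \|f\|_{L^{d+1}([0,T]\times\mathbf{R}^d)}\,
      \|p\|_{L^{(d+1)'}([0,T]\times\mathbf{R}^d)},
\qquad \frac{1}{d+1} + \frac{1}{(d+1)'} = 1.
```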
Our goal here is to take advantage of the regularizing effect of the noise, stemming from the existence of the densities p(t, •) and their integrability property, to establish an analogue of Itô-Krylov's formula in the measure-dependent case. Looking at Itô's formula for a flow of measures (1.2), the regularizing effect comes from the presence of expectations, which average the derivatives of δu/δm, with respect to the space variable, over all the trajectories of (X_t)_t. Indeed, the regularization by noise will only appear through the space variable of the linear derivative and not through its measure variable. This is not surprising since the space of measures P_2(R^d) is somehow infinite-dimensional while the noise is finite-dimensional. Thus, we cannot expect a true regularization in the measure variable of δu/δm. The fact that a finite-dimensional noise cannot have a complete regularizing effect in the space P_2(R^d) is explained in [23] in the context of McKean-Vlasov SDEs.
In order to prove Itô's formula (1.2) for u, it is clear that u needs to admit a linear derivative with at least distributional derivatives of order 1 and 2, with respect to the space variable, in L^k(R^d) for some k, as for the standard Itô-Krylov formula. Let us describe our assumptions on u more precisely. As said before, for almost all t ∈ [0, T], the law µ_t has a density p(t, •) such that p belongs to L^{(d+1)'}([0, T] × R^d). Denoting by P(R^d) the space of measures µ ∈ P_2(R^d) having a density with respect to the Lebesgue measure in L^{(d+1)'}(R^d), our assumptions on the derivatives of (δu/δm)(µ)(•) are only made for measures µ belonging to P(R^d). This is natural since for almost all t ∈ [0, T], µ_t belongs to P(R^d), and the derivatives of δu/δm are evaluated along the flow (µ_t)_{t∈[0,T]} and integrated in time. Moreover, because of the integrability property of the densities p(t, •), the derivatives of (δu/δm)(µ)(•) do not need to be defined and continuous on the whole space R^d, because they are somehow integrated against the densities p(t, •) (see (1.2)). We say "somehow" because this is not completely the case, since b and a are random. But as they are bounded, we can omit them in some sense. More precisely, the integrability property of the densities leads us to assume that u admits a linear derivative such that for all µ ∈ P(R^d), ∂_v (δu/δm)(µ)(•) belongs to the Sobolev space W^{1,k}(R^d) defined in Section 2, with k ≥ d + 1. This is exactly the same condition as in the standard Itô-Krylov formula, except that we replace W^{1,k}_loc(R^d) by W^{1,k}(R^d). This is essentially explained by the expectations in Itô's formula (1.2). Indeed, the process (X_t)_t cannot be localized by stopping times. Moreover, we assume that the map µ ↦ ∂_v (δu/δm)(µ)(•) is continuous for a distance on P(R^d) satisfying the assumptions of Definition 2.3. This continuity assumption can be interpreted as the fact that the noise has no regularizing effect in the measure variable of the linear derivative, as explained above. The precise assumptions of our Itô-Krylov's formula are given in Definition 3.1 and Theorem 3.3. Finally, in Theorem 3.12, we extend our formula to functions depending also on the time and space variables and satisfying the assumptions of Definition 3.10.
We now focus on some applications of Itô's formula for a flow of measures. It has been developed alongside the growing interest in Mean-Field Games and McKean-Vlasov SDEs over the last decade. Mean-Field Games were initiated independently by Caines, Huang and Malhamé in [5] and by Lasry and Lions in [21]. The notion of Master equation was introduced by Lions in his lectures at the Collège de France [22] in order to describe Mean-Field Games. Master equations are Partial Differential Equations (PDEs) on the space of probability measures and can be derived with the help of Itô's formula. We refer to Lions' lectures [22], the notes written by Cardaliaguet [6], and the books of Carmona and Delarue [9,10] for more details on Mean-Field Games and Master equations. We also mention Bensoussan, Frehse and Yam [1] and Carmona and Delarue [8], where Master equations are derived, with the help of Itô's formula in [8]. The question of existence and uniqueness of classical solutions to Master equations was addressed by Cardaliaguet, Delarue, Lasry and Lions in [7] and by Chassagneux, Crisan and Delarue in [11]. From a different point of view, Mou and Zhang deal with the well-posedness of Master equations in some weaker senses in [24].
Moreover, Itô's formula appears to be the natural way to connect a McKean-Vlasov SDE (more precisely, the associated semigroup (P_t)_t acting on the space of functions of measures) to a PDE on the space of probability measures (the Master equation), in the same manner as for classical SDEs. It turns out to be a crucial tool to study the stochastic flow generated by a McKean-Vlasov SDE, as explained in Chapter 5 of [9]. The link between McKean-Vlasov SDEs and PDEs on the space of measures is at the heart of the work of Buckdahn, Li, Peng and Rainer [4], where the authors prove that the PDE admits a unique classical solution expressed with the flow of measures associated with the McKean-Vlasov SDE. Moreover, in the parallel work [11], Chassagneux, Crisan and Delarue adopt a similar approach and study the flow generated by a forward-backward stochastic system of McKean-Vlasov type under weaker assumptions on the coefficients of the equation. Both works are motivated by Mean-Field Games, and Itô's formula plays a key role in them. In [15], Crisan and McMurray prove that the Master equation admits a unique classical solution for some irregular terminal conditions using Malliavin calculus. They point out a smoothing effect concerning the differentiability of the solution with respect to the measure, even though there is no noise in the measure direction. Furthermore, the problem of propagation of chaos for the interacting particle system associated with a McKean-Vlasov SDE can also be addressed with the help of the associated PDE on the space of measures (see Chapter 5 of [9]). It allows one to obtain quantitative weak propagation of chaos estimates between the law of the solution to the McKean-Vlasov SDE and the empirical measure of the associated particle system. This approach was adopted, for example, by Chaudru de Raynal and Frikha in [14,13], by Delarue and Tse in [16], and by Chassagneux, Szpruch and Tse in [12]. Let us also mention that the Master equation satisfied by the semigroup has recently been used by Jourdain and Tse in [19] to study the mean-field fluctuations (CLT) of an interacting particle system. Finally, Itô's formula for a flow of measures is also important to deal with McKean-Vlasov control problems, because it allows one to derive a dynamic programming principle describing the value function of the problem, as presented in Chapter 6 of [9].
Recently, Itô's formula has been extended to flows of measures generated by càdlàg semi-martingales. This was achieved independently by Guo, Pham and Wei in [18], who studied McKean-Vlasov control problems with jumps, and by Talbi, Touzi and Zhang in [26], who worked on mean-field optimal stopping problems. In both works, dynamic programming principles are established thanks to Itô's formula for a flow of measures. Finally, we also mention that several Itô-Wentzell-Lions formulae for functional random fields of Itô type depending on measure flows have been established by dos Reis and Platonov in [17].
Let us explain our choice to work with the linear derivative. Indeed, the L-derivative, which was introduced by Lions in his lectures at the Collège de France [22], is also well-adapted to establish Itô's formula for a flow of measures. We say that u is L-differentiable if its lifting U, defined on L^2(Ω; R^d) by U(X) := u(L(X)), is Fréchet differentiable. The advantage of the L-derivative is that it permits the use of standard tools of differential calculus on Banach spaces. Of course, there is a link between the L-derivative and the linear derivative of u. Indeed, in general, the L-derivative ∂_µ u(µ)(•) is equal to the gradient of the linear derivative, ∂_v (δu/δm)(µ)(•) (see Propositions 5.48 and 5.51 in [9] for the precise assumptions). Under our assumptions presented above, the Sobolev embedding theorem ensures that for all µ ∈ P(R^d), (δu/δm)(µ)(•) belongs to C^1(R^d; R), and that ∂_v (δu/δm)(µ)(•) is continuous and bounded on R^d. We would be tempted to deduce that u admits an L-derivative given, as recalled above, by ∂_v (δu/δm)(µ)(•). However, this term is assumed to exist only for measures µ ∈ P(R^d) and not for all µ ∈ P_2(R^d). This is the case in Example 3.6, where this term is not well-defined for every µ ∈ P_2(R^d) (see Remark 3.7). It is therefore more restrictive to work with the L-derivative, which justifies our choice of the linear derivative.
The paper is organized as follows. Section 2 gathers some notations and definitions used throughout the paper. In Section 3, more precisely in Definitions 3.1 and 3.10, we define the spaces of functions for which we will establish Itô-Krylov's formula. These formulas are given in Theorem 3.3 for functions defined on P_2(R^d) and in Theorem 3.12 for functions depending also on the time and space variables. Moreover, we give examples of functions for which our formulas hold and we discuss our assumptions through them. The proofs for these examples are postponed to Appendix A for ease of reading. In Section 4, we give some preliminary results. We start with Krylov's inequality and its consequences for the existence of densities for the flow of measures (µ_t)_{t∈[0,T]} in Proposition 4.3. Then we recall some classical results on convolution and regularization. Finally, Sections 5 and 6 are respectively dedicated to the proofs of Theorems 3.3 and 3.12.

Notations and definitions
2.1. General notations. Let us introduce some notations used several times in the article.
- B_R is the open ball centered at 0 and of radius R in R^d for the Euclidean norm.
- p' is the conjugate exponent of p ∈ [1, +∞], defined by 1/p + 1/p' = 1.
- W^{m,k}_loc(R^d) is the space of functions u such that for all R > 0, u belongs to W^{m,k}(B_R).
- (ρ_n)_n is a mollifying sequence on R^d, that is, a sequence of non-negative C^∞ functions such that for all n, ∫_{R^d} ρ_n(x) dx = 1 and ρ_n vanishes outside B_{1/n}. We assume that ρ_n(x) = ρ_n(−x) for all x.
- * denotes the convolution of two functions, when it is well-defined, or of two probability measures.
-B(E) is the Borel σ-algebra where E is a metric space.
-A * denotes the transpose of the matrix A ∈ R d×d .
-A • B denotes the usual scalar product of two matrices A, B ∈ R d×d given by A • B := Tr(A * B).
2.2. Spaces of measures and linear derivative. The set P(R^d) is the space of probability measures on R^d equipped with the topology of weak convergence. The Wasserstein space P_2(R^d) denotes the set of measures µ ∈ P(R^d) such that ∫_{R^d} |x|^2 dµ(x) < +∞, equipped with the 2-Wasserstein distance W_2 defined for µ, ν ∈ P_2(R^d) by

W_2(µ, ν) := inf_{π ∈ Π(µ,ν)} ( ∫_{R^d × R^d} |x − y|^2 dπ(x, y) )^{1/2},

where Π(µ, ν) is the subset of P_2(R^d × R^d) of measures with marginal distributions µ and ν. We will work with the standard notion of linear derivative for functions of measures.
Definition 2.1 (Linear derivative). A function u : P_2(R^d) → R admits a linear derivative if there exists a continuous function δu/δm : P_2(R^d) × R^d → R satisfying the following properties.
(1) For all compact K ⊂ P_2(R^d), there exists C > 0 such that sup_{µ∈K} |(δu/δm)(µ)(v)| ≤ C(1 + |v|^2) for all v ∈ R^d.

(2) For all µ, ν ∈ P_2(R^d),

u(µ) − u(ν) = ∫_0^1 ∫_{R^d} (δu/δm)((1 − λ)ν + λµ)(v) d(µ − ν)(v) dλ.

Remark 2.2. Instead of the second point of the previous definition, it is equivalent to assume that for all µ, ν ∈ P_2(R^d) and all λ ∈ [0, 1],

(d/dλ) u((1 − λ)ν + λµ) = ∫_{R^d} (δu/δm)((1 − λ)ν + λµ)(v) d(µ − ν)(v).

One can find more details in Chapter 5 of [9], in particular the connection with the L-derivative.

Definition 2.3. Let us define P(R^d) as the space of measures µ ∈ P_2(R^d) which admit a density dµ/dx with respect to the Lebesgue measure belonging to L^{(d+1)'}(R^d). We endow P(R^d) with a general distance d_P satisfying properties (H1) and (H2).

Note that for all n ≥ 1 and for all µ ∈ P_2(R^d), µ * ρ_n ∈ P(R^d). Indeed, its density is given by x ↦ ρ_n * µ(x) = ∫_{R^d} ρ_n(x − y) dµ(y), and Jensen's inequality ensures that it belongs to L^{(d+1)'}(R^d). Considering the space (P(R^d), d_P) comes naturally with Assumptions (A) and (B) on the Itô process X: as explained in the introduction, they imply the existence of a density p ∈ L^{(d+1)'}([0, T] × R^d) for the flow of measures. Note that d_k is well-defined since, for any µ ∈ P(R^d), its density belongs to L^{(d+1)'}(R^d). The proof is postponed to the Appendix (Section A.1).
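The Jensen argument can be spelled out as follows, writing q = (d+1)' and using the convexity of t ↦ t^q, the fact that µ is a probability measure, and Fubini-Tonelli's theorem:

```latex
\|\rho_n * \mu\|_{L^q}^q
  = \int_{\mathbf{R}^d}\Big(\int_{\mathbf{R}^d}\rho_n(x-y)\,d\mu(y)\Big)^q dx
  \le \int_{\mathbf{R}^d}\int_{\mathbf{R}^d}\rho_n(x-y)^q\,d\mu(y)\,dx
  = \|\rho_n\|_{L^q}^q < +\infty.
```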

Itô-Krylov's formula, ad-hoc spaces of functions and examples
Let us now introduce the Sobolev-type space of functions on P_2(R^d) for which we will prove Itô's formula for a flow of measures.

Definition 3.1. Let W_1(R^d) be the space of continuous functions u : P_2(R^d) → R having a linear derivative δu/δm such that for all µ ∈ P(R^d), the function (δu/δm)(µ)(•) admits distributional derivatives of order 1 and 2 in L^k(R^d), for a certain k ≥ d + 1, and satisfying the following properties.
(1) The map µ ∈ P(R^d) ↦ ∂_v (δu/δm)(µ)(•) ∈ L^k(R^d) is continuous for a certain distance d_P satisfying (H1) and (H2).
(2) There exists α ∈ N such that k ≥ (1 + α)d and, for all compact K ⊂ P_2(R^d), the L^k(R^d)-norms of ∂_v (δu/δm)(µ)(•) and ∂_v^2 (δu/δm)(µ)(•) are bounded, uniformly over µ ∈ K ∩ P(R^d), by a constant depending only on K.
Remark 3.2. - The space W_1(R^d) contains in particular the functions which satisfy Assumption (1) in Definition 3.1 uniformly with respect to the measure µ.
- The control in Assumption (2) allows us to take advantage of the continuity of the flow in P_2(R^d) (because the control is assumed on compact subsets of P_2(R^d)), but also of its integrability properties proved in Lemmas 4.5 and 4.6. The form of the inequality suggests the integration of functions in L^k(R^d) with respect to µ, at least when the function u is linear in µ.
Theorem 3.3 (Itô-Krylov's formula). Let u be a function in W_1(R^d), defined in Definition 3.1. We have for all t ∈ [0, T],

u(µ_t) = u(µ_0) + ∫_0^t E[b_s • ∂_v (δu/δm)(µ_s)(X_s)] ds + (1/2) ∫_0^t E[a_s • ∂_v^2 (δu/δm)(µ_s)(X_s)] ds.    (3.1)

Remark 3.4. The function u is assumed to have a linear derivative on the whole space P_2(R^d). This seems a bit strong at first sight in comparison with the assumptions on its spatial derivatives, which are only made for measures µ ∈ P(R^d). Indeed, we could consider working with a linear derivative defined only on the space of densities, as done for example in [1]. However, in order to establish Itô-Krylov's formula by regularization, the function u needs to be continuous on the whole space P_2(R^d) and not only on P(R^d): the fact that µ_t belongs to P(R^d) is proved only for almost all t. Thus, as the function u has to be continuous on P_2(R^d), we have chosen to assume the existence of a linear derivative on P_2(R^d) even though we could have required it only on the space of densities.

Now, we focus on examples of functions belonging to W_1(R^d). Let us start with the linear case.
Let us now focus on the multi-linear case.

Example 3.6 (Polynomials on the Wasserstein space).
The proof is postponed to the Appendix (Section A.2).

Remark 3.7. - In Definition 3.1, the distributional derivatives of the linear derivative (δu/δm)(µ) are not necessarily integrable functions for all µ ∈ P_2(R^d). Of course, in Example 3.5, this is the case for all µ ∈ P_2(R^d), as the linear derivative does not depend on the measure µ. However, in Example 3.6 with N = 2, the linear derivative is given by

(δu/δm)(µ)(v) = ∫_{R^d} g(v, y) dµ(y) + ∫_{R^d} g(x, v) dµ(x).    (3.2)

Formally, the derivative with respect to v of the first integral in (3.2) is ∫_{R^d} ∂_v g(v, y) dµ(y). This term is not well-defined for general measures µ ∈ P_2(R^d) because we have only assumed that ∇g ∈ (W^{1,k}(R^{2d}))^{2d} with k ≥ 2d. Indeed, for k = 2d, we just know by the Sobolev embedding theorem that ∇g belongs to (L^r(R^{2d}))^{2d} for r ∈ [2d, +∞[ (see Corollary 9.11 in [3]). As we will see in the proof (Section A.2 of the Appendix), it is well-defined as an integrable function of v if we restrict ourselves to measures µ ∈ P(R^d). This also justifies why we have chosen to work with the linear derivative instead of the L-derivative. Indeed, the L-derivative of u would be equal to the gradient of the linear derivative, ∂_v (δu/δm)(µ)(•), which is not well-defined for all µ ∈ P_2(R^d). Thus, the function u does not need to be L-differentiable in the usual sense in our setting.
- Our assumptions on the derivatives of δu/δm in Definition 3.1 deal with P(R^d) instead of the whole space P_2(R^d), essentially because in Itô's formula (3.1) these derivatives only appear under integrals along the flow (µ_s)_{s∈[0,T]}, which belongs to P(R^d) for almost all s ∈ [0, T]. However, we assume that u is continuous on the whole space P_2(R^d).

The next example focuses on the particular case of convolution, which has to be treated differently from Example 3.6 with N = 2 because the structure of the convolution mixes the two variables.
Here, the particular structure of the convolution enables us to work on the whole space P_2(R^d) instead of P(R^d), as explained in the first point of Remark 3.2. The proof is postponed to the Appendix (Section A.3).
Finally, we give a non-linear example of functions belonging to W_1(R^d). The proof is again postponed to the Appendix (Section A.4).
We now deal with the extension of Itô's formula to functions depending also on the time and space variables. First, we define the space of functions generalizing the space W_1(R^d).

Definition 3.10. Let W_2(R^d) be the space of continuous functions u : [0, T] × R^d × P_2(R^d) → R admitting a linear derivative and satisfying the following properties.

(1)-(3) Regularity and continuity conditions in the time and space variables, together with a continuity condition in the measure variable analogous to Assumption (1) of Definition 3.1, with an exponent k_1 ≥ max{d(2α_1 + 1), d + 1} for some α_1 ∈ N.

(4) There exists k_2 ≥ 2d such that for all (t, µ) ∈ [0, T] × P(R^d), (δu/δm)(t, •, µ)(•) admits distributional derivatives with respect to v of order 1 and 2 belonging, for all t and R > 0, to L^{k_2}(B_R), each of them being continuous and measurable with respect to its variables.

(5) An integrability control on these derivatives analogous to Assumption (2) of Definition 3.1, uniform on compact subsets of P_2(R^d).
Remark 3.11. The next theorem is the natural extension of the formula for functions in W_2(R^d). Let (η_s)_{s∈[0,T]} and (γ_s)_{s∈[0,T]} be two progressively measurable processes, taking values respectively in R^d and R^{d×d_1} and satisfying Assumptions (A) and (B). We set, for all t ≤ T,

ξ_t = ξ_0 + ∫_0^t η_s ds + ∫_0^t γ_s dB_s,

where ξ_0 is an F_0-measurable random variable with values in R^d.

Theorem 3.12 (Extension of Itô-Krylov's formula). Let u be a function in W_2(R^d), defined in Definition 3.10. We have almost surely, for all t ∈ [0, T],

u(t, ξ_t, µ_t) = u(0, ξ_0, µ_0) + ∫_0^t (∂_t u + η_s • ∂_x u + (1/2)(γ_s γ_s*) • ∂_x^2 u)(s, ξ_s, µ_s) ds
+ ∫_0^t Ẽ[ b̃_s • ∂_v (δu/δm)(s, ξ_s, µ_s)(X̃_s) + (1/2) ã_s • ∂_v^2 (δu/δm)(s, ξ_s, µ_s)(X̃_s) ] ds
+ ∫_0^t ∂_x u(s, ξ_s, µ_s) • γ_s dB_s,    (3.3)

where (Ω̃, F̃, P̃) is a copy of (Ω, F, P), (X̃, b̃, σ̃) is an independent copy of (X, b, σ), ã := σ̃σ̃*, and Ẽ denotes the expectation on (Ω̃, F̃, P̃).
Let us now give examples of functions belonging to the space W 2 (R d ).
Example 3.13. Let g ∈ C^0(R^{2d}; R) be a function such that its distributional derivative ∇g belongs to (W^{1,k}(R^{2d}))^{2d} for some k ≥ 5d. Then the associated function belongs to W_2(R^d). The proof is postponed to the Appendix (Section A.5).
is well-defined and continuous for some suitable distance d_P satisfying (H1) and (H2). The proof is again postponed to the Appendix (Section A.6).
Remark 3.15. In the abstract, we said that our Itô-Krylov's formula for a flow of measures was the almost analogue of the standard Itô-Krylov formula. We used the word "almost" because Assumption (1) in Definition 3.10 is not completely satisfactory. Indeed, we do not assume Sobolev regularity with respect to time, as is the case in Itô-Krylov's formula for functions defined on [0, T] × R^d. Nevertheless, if we consider u(t, µ) := ∫_{R^d} g(t, x) dµ(x) with g continuous on [0, T] × R^d, at most of quadratic growth in x uniformly in t, and such that the distributional derivatives ∂_t g, ∂_x g and ∂_x^2 g are in L^k([0, T] × R^d) for some k ≥ d + 1, we will succeed in proving Itô-Krylov's formula for u.
Let us give the idea of the proof. We regularize u by setting u_n(t, µ) := ∫_{R^d} (g * ρ_n)(t, x) dµ(x), where (ρ_n)_n is a mollifying sequence on R × R^d. The function u_n clearly satisfies the assumptions of the standard Itô formula for a flow of measures (see Proposition 5.102 in [9]). It ensures that for all t ∈ [0, T],

u_n(t, µ_t) = u_n(0, µ_0) + ∫_0^t E[∂_t (g * ρ_n)(s, X_s) + b_s • ∂_x (g * ρ_n)(s, X_s) + (1/2) a_s • ∂_x^2 (g * ρ_n)(s, X_s)] ds.    (3.4)

As g is continuous, (g * ρ_n)_n converges to g uniformly on compact sets. It follows from the growth assumption on g that u_n converges pointwise to u. Using that ∂_t (g * ρ_n) = (∂_t g) * ρ_n converges to ∂_t g in L^k([0, T] × R^d), Krylov's inequality ensures the convergence of the first integral in (3.4). The same holds for the two other integrals in (3.4). Taking the limit n → +∞ in (3.4) yields, for all t ∈ [0, T],

u(t, µ_t) = u(0, µ_0) + ∫_0^t E[∂_t g(s, X_s) + b_s • ∂_x g(s, X_s) + (1/2) a_s • ∂_x^2 g(s, X_s)] ds.

In the general case, when the dependence on µ of the function u is not explicit, we cannot apply Krylov's inequality. Indeed, consider a function u : [0, T] × P_2(R^d) → R. In Itô's formula for u, as in the classical formula, there should be the term ∫_0^t ∂_t u(s, µ_s) ds. The assumption does not imply that this term is well-defined. One possible hypothesis is to assume that for all compact K ⊂ P_2(R^d), sup_{µ∈K} |∂_t u(•, µ)| ∈ L^1([0, T]). Following our strategy to prove Itô-Krylov's formula, we would consider the mollified version of u defined by u_n(t, µ) := ∫_R u(s, µ * ρ_n^1) ρ_n^2(t − s) ds, where (ρ_n^1)_n and (ρ_n^2)_n are mollifying sequences on R^d and on R respectively. Assume that we have proved Itô's formula for u_n. In order to take the limit and deduce Itô's formula for u, we would like to show that

∫_0^t ∂_t u_n(s, µ_s) ds → ∫_0^t ∂_t u(s, µ_s) ds.

However, this convergence is not obvious in the general case, since the presence of µ_s prevents us from using the classical results on convolution, and we cannot apply Krylov's inequality if the dependence on the measure argument is not linear.
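The uniform convergence g * ρ_n → g on compact sets used above is easy to visualize numerically. The following sketch (illustrative only, with a hypothetical compactly supported bump of width ε in place of ρ_n) mollifies the non-C^1 function g(x) = |x| and checks that the uniform error on [−1, 1] is of order ε, as expected for a 1-Lipschitz function:

```python
import numpy as np

h = 1e-3                          # grid step
x = np.arange(-3, 3, h)
g = np.abs(x)                     # continuous but not C^1 at 0

# Hypothetical mollifier: a smooth bump supported in [-eps, eps], normalized to integral 1.
eps = 0.1
y = np.arange(-eps, eps + h, h)
rho = np.exp(-1.0 / np.clip(1 - (y / eps) ** 2, 1e-12, None))
rho /= rho.sum() * h              # now sum(rho) * h = 1

g_n = np.convolve(g, rho, mode="same") * h   # discrete approximation of g * rho

# |g * rho(x) - g(x)| <= eps since g is 1-Lipschitz and rho has support of radius eps.
interior = np.abs(x) <= 1.0
err = np.max(np.abs(g_n[interior] - g[interior]))
print(err)                        # bounded by eps (up to discretization error)
```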

Preliminaries
4.1. Krylov's inequality and densities. The key element to prove the theorem is Krylov's inequality. We recall it in the next theorem, taken from [20] (see Theorem 4 in Section 2.3).
Theorem 4.1 (Krylov's inequality). Assume that b and σ are defined for all t ≥ 0 and satisfy the analogues (A1) and (A2) of Assumptions (A) and (B) on R^+. For X_0 an R^d-valued F_0-measurable random variable, we define the Itô process X = (X_t)_t, for all t ≥ 0, by

X_t = X_0 + ∫_0^t b_s ds + ∫_0^t σ_s dB_s.

Let λ > 0 be a positive constant and p ≥ d + 1. Then, there exists a constant N = N(d, p, λ, δ, K) such that for all measurable functions f : R^+ × R^d → R^+,

E ∫_0^{+∞} e^{−λt} f(t, X_t) dt ≤ N ‖f‖_{L^p(R^+ × R^d)}.

We will use the following corollary for a finite horizon of time.
Corollary 4.2. If b and σ satisfy Assumptions (A) and (B), there exists a constant N such that for all measurable functions f : [0, T] × R^d → R^+,

E ∫_0^T f(t, X_t) dt ≤ N ‖f‖_{L^{d+1}([0,T] × R^d)}.

Proof. We set b_t = b_T and σ_t = σ_T for t > T to guarantee that Assumptions (A1) and (A2) are satisfied, without changing the process X on [0, T]. It remains to apply Krylov's inequality to f̄(t, x) := f(t, x) 1_{t∈[0,T]}, which gives the existence of such a constant N.

Krylov's inequality also provides the existence of a density with respect to the Lebesgue measure for µ_s, for almost all s ∈ [0, T].
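The way this estimate is exploited downstream is a plain Hölder pairing between f ∈ L^{d+1} and the density p ∈ L^{(d+1)'}. The discrete analogue of that inequality (here with d = 2, so the exponents are 3 and 3/2) can be checked directly; this is only an illustration of the exponent bookkeeping, not part of the proof:

```python
import numpy as np

rng = np.random.default_rng(1)
h = 0.01                      # cell volume of a crude discretization of [0, T] x R^d
f = rng.random(1000)          # stands for |f|, a non-negative test function
p = rng.random(1000)          # stands for the density p(t, x)

d = 2
q = d + 1                     # here q = 3
q_conj = q / (q - 1)          # conjugate exponent (d + 1)' = 3/2

pairing = np.sum(f * p) * h                       # discrete version of the double integral
f_norm = (np.sum(f**q) * h) ** (1 / q)            # ||f||_{L^{d+1}}
p_norm = (np.sum(p**q_conj) * h) ** (1 / q_conj)  # ||p||_{L^{(d+1)'}}

print(pairing, f_norm * p_norm)                   # Hölder: pairing <= f_norm * p_norm
```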
Proposition 4.3. There exists a measurable function p : [0, T] × R^d → R^+ belonging to L^{(d+1)'}([0, T] × R^d) such that for all measurable functions f : [0, T] × R^d → R^+,

E ∫_0^T f(t, X_t) dt = ∫_0^T ∫_{R^d} f(t, x) p(t, x) dx dt.

If τ is a stopping time such that (X_t)_{t∈[0,T]} belongs to B_R almost surely on the set {τ > 0}, then

E ∫_0^{τ∧T} f(t, X_t) dt ≤ N ‖f‖_{L^{d+1}([0,T] × B_R)}.

Moreover, for almost all s ∈ [0, T], µ_s = L(X_s) is equal to p(s, •) dx.
We give the proof for the sake of completeness.
Proof. We denote by µ̄ the push-forward measure of λ ⊗ P, where λ is the Lebesgue measure on [0, T], by the measurable map (t, ω) ↦ (t, X_t(ω)). Note that µ̄ is a finite measure on [0, T] × R^d. The monotone convergence theorem and Krylov's inequality ensure that for all measurable functions f : [0, T] × R^d → R^+,

∫ f dµ̄ = E ∫_0^T f(t, X_t) dt ≤ N ‖f‖_{L^{d+1}([0,T] × R^d)}.

Applying this bound with f = 1_A for A ∈ B([0, T] × R^d) with Lebesgue measure 0, we deduce that µ̄(A) = 0. Thus µ̄ is absolutely continuous with respect to the Lebesgue measure on [0, T] × R^d; we denote by p its density. Krylov's inequality exactly proves that the map f ↦ ∫ f dµ̄ is continuous on L^{d+1}([0, T] × R^d), which yields p ∈ L^{(d+1)'}([0, T] × R^d). To prove the localized inequality, it is enough to notice that on {τ > 0}, X_t ∈ B_R for t ≤ τ ∧ T, and to apply Krylov's inequality to f 1_{[0,T] × B_R}.

Next, we establish that for almost all s ∈ [0, T], µ_s = p(s, •) dx. We fix s ∈ [0, T], n ≥ 1 large enough, and A ∈ B(R^d). Applying the identity above with f = 1_{[s−1/n, s+1/n] × A} and using Fubini-Tonelli's theorem, we deduce that

∫_{s−1/n}^{s+1/n} P(X_t ∈ A) dt = ∫_{s−1/n}^{s+1/n} ∫_A p(t, x) dx dt.

Since t ↦ P(X_t ∈ A) is bounded and since Fubini's theorem implies that t ↦ ∫_A p(t, x) dx belongs to L^1([0, T]), it follows from the Lebesgue differentiation theorem (see Theorem 7.7 in [25]) that for almost all s ∈ [0, T],

P(X_s ∈ A) = ∫_A p(s, x) dx.

We denote by R the set of all Borel sets in R^d of the form ∏_{i=1}^d ]a_i, b_i[, with a_i < b_i rational numbers for all i. The set R is at most countable, thus for almost all s ∈ [0, T] the preceding equality holds simultaneously for all A ∈ R. The monotone class theorem enables us to conclude.
Note that for almost all s ∈ [0, T], p(s, •) ∈ L^{(d+1)'}(R^d), using Fubini-Tonelli's theorem. We deduce the following corollary.

We now prove two lemmas dealing with the integrability of the density p.
By definition of the conjugate exponent, we get the claimed bound.

Lemma 4.6. Let p and q be the densities of two Itô processes of the form (1.1) satisfying (A) and (B), given by Proposition 4.3. Then for k, α ∈ N such that k ≥ max{d + 1, d(α + 1)}, the function s ↦ ‖q(s, •)‖_{L^{k'}(R^d)} (1 + ‖p(s, •)‖_{L^{(d+1)'}(R^d)}^α) belongs to L^1([0, T]).

Proof. Owing to Lemma 4.5, the function s ↦ ‖q(s, •)‖_{L^{k'}(R^d)} is integrable with the required exponent. Using Hölder's inequality, the proof is complete once we prove that s ↦ ‖p(s, •)‖_{L^{(d+1)'}(R^d)}^α belongs to the conjugate Lebesgue space on [0, T]. This is equivalent to our assumption k ≥ d(α + 1).

4.2. Classical results on convolution and regularization. Fix p ∈ [1, +∞[. We will need the two following basic facts, which we recall for the sake of clarity.
- For all f ∈ L^p(R^d) and all g ∈ L^1(R^d), the convolution f * g is well-defined and belongs to L^p(R^d). Moreover, we have ‖f * g‖_{L^p} ≤ ‖f‖_{L^p} ‖g‖_{L^1}.
- For all f ∈ L^p(R^d) and all g ∈ L^{p'}(R^d), the convolution f * g is well-defined and belongs to C^0(R^d) ∩ L^∞(R^d), with ‖f * g‖_∞ ≤ ‖f‖_{L^p} ‖g‖_{L^{p'}}.

The following proposition will also be useful.

Proposition 4.9. Let f ∈ C^0(R^d) be a function admitting distributional derivatives of order 1 and 2. Then f * ρ_n belongs to C^∞(R^d) and for all i, j ∈ {1, . . . , d}, ∂^2_{ij}(f * ρ_n) = (∂^2_{ij} f) * ρ_n.

The next lemma deals with the convolution of a function f ∈ L^p with µ ∈ P(R^d).
Proof. Note that the convolution f * µ is well-defined as an element of L^p(R^d) thanks to Jensen's inequality, which shows that ‖f * µ‖_{L^p} ≤ ‖f‖_{L^p}. Let (µ_n)_n be a sequence of P(R^d) weakly converging to µ ∈ P(R^d). Using Skorokhod's representation theorem (see Theorem 6.7 in [2]), there exist a probability space (Ω', F', P') and a sequence of random variables (X_n)_n converging P'-almost surely to a random variable X, such that the law of X_n is µ_n for all n and the law of X is µ. For any a ∈ R^d, let us denote by τ_a f the translation of f defined, for all x ∈ R^d, by τ_a f(x) := f(x − a). Jensen's inequality and Fubini-Tonelli's theorem yield

‖f * µ_n − f * µ‖_{L^p}^p ≤ E'[ ‖τ_{X_n − X} f − f‖_{L^p}^p ].

It follows from the almost sure convergence of (X_n)_n to X and the continuity of the translation operator in L^p that ‖τ_{X_n − X} f − f‖_{L^p}^p → 0 almost surely. Moreover, this quantity is bounded by 2^p ‖f‖_{L^p}^p, so the dominated convergence theorem concludes.

Proof. Let π ∈ P_2(R^d × R^d) be an optimal coupling between µ and ν. We consider a couple of random variables (X, Y) with law π, and a random variable Z independent of (X, Y) with law m. The law of X + Z being µ * m and the law of Y + Z being ν * m, one has

W_2^2(µ * m, ν * m) ≤ E|(X + Z) − (Y + Z)|^2 = E|X − Y|^2 = W_2^2(µ, ν).

Then, for all x ∈ E, we can find a version of u(x) such that the map (x, v) ↦ ũ(x, v) is measurable.

The next corollary follows from the fact that (ρ_n(x) dx)_n converges weakly to δ_0.

where λ denotes the Lebesgue measure on R^d. From the Lebesgue differentiation theorem (see Theorem 7.7 in [25]), we deduce that for all x ∈ E, ũ(x, •) = u(x) λ-almost everywhere. We prove that for all n ≥ 1, u_n is continuous.
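The contraction inequality W_2(µ * m, ν * m) ≤ W_2(µ, ν), proved above by coupling, can be illustrated numerically in dimension one, where the W_2 distance between two empirical measures with the same number of atoms is obtained by sorting (monotone rearrangement). This sketch is illustrative only and not part of the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def w2_empirical(a, b):
    """W_2 between empirical measures of two same-size samples (1D: sorting is optimal)."""
    return np.sqrt(np.mean((np.sort(a) - np.sort(b)) ** 2))

n = 10_000
x = rng.normal(0.0, 1.0, n)      # sample of mu
y = rng.normal(1.0, 2.0, n)      # sample of nu
z = rng.normal(0.0, 0.5, n)      # common sample of m

x_opt = np.sort(x)
y_opt = np.sort(y)               # (x_opt[i], y_opt[i]) is the optimal coupling of (mu, nu)

# Adding the same noise z to both marginals gives a (sub-optimal) coupling of mu*m and nu*m
# whose cost equals W_2(mu, nu); the optimal cost on the left can only be smaller.
lhs = w2_empirical(x_opt + z, y_opt + z)
rhs = w2_empirical(x, y)
print(lhs, rhs)                  # lhs <= rhs: contraction by convolution
```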

Proof of Theorem 3.3
The proof will be divided into three parts.
Step 1 is dedicated to proving that all the terms in Itô-Krylov's formula (3.1) are well-defined. In Step 2, we regularize u by convolution of the measure argument with a mollifying sequence (ρ_n)_n. The effect of replacing u(µ) by u(µ * ρ_n) is that the linear derivative is regularized by convolution in its space variable. Then, we apply the standard Itô formula for a flow of measures. We finally take the limit n → +∞ in Step 3 with the help of Krylov's inequality.
Step 1: All the terms in (3.1) are well-defined.
Let us show that the two integrals in (3.1) are well-defined.
Measurability. Thanks to Lemma 4.13, we can find a version of ∂_v (δu/δm) which is measurable with respect to (µ, v).

Integrability. We can omit the coefficients b and a when proving the integrability properties because they are uniformly bounded. Taking advantage of the existence of a density coming from Proposition 4.3, we have by Hölder's inequality

E ∫_0^t |∂_v (δu/δm)(µ_s)(X_s)| ds = ∫_0^t ∫_{R^d} |∂_v (δu/δm)(µ_s)(v)| p(s, v) dv ds ≤ ∫_0^T ‖∂_v (δu/δm)(µ_s)(•)‖_{L^k(R^d)} ‖p(s, •)‖_{L^{k'}(R^d)} ds < +∞,

and similarly for the second-order term.

Step 2: Itô's formula for the mollification of u.
For n ≥ 1, we set u_n : µ ∈ P_2(R^d) ↦ u(µ * ρ_n). By standard arguments, for each n ≥ 1, u_n has a linear derivative given by

(δu_n/δm)(µ)(v) = ((δu/δm)(µ * ρ_n)(•) * ρ_n)(v),

using the symmetry of ρ_n. Now, we aim at applying the standard Itô formula for a flow of probability measures (see for example Theorem 5.99 in Chapter 5 of [9] with the L-derivative) to u_n for a fixed n ≥ 1.
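To see where the convolution in the space variable comes from, note that the linear derivative of the map µ ↦ µ * ρ_n in the direction of a Dirac mass δ_v is the measure with density y ↦ ρ_n(y − v); combining this with the chain rule for the linear derivative and the symmetry of ρ_n gives, schematically:

```latex
\frac{\delta u_n}{\delta m}(\mu)(v)
  = \int_{\mathbf{R}^d} \frac{\delta u}{\delta m}(\mu * \rho_n)(y)\,\rho_n(y - v)\,dy
  = \Big(\frac{\delta u}{\delta m}(\mu * \rho_n)(\cdot) * \rho_n\Big)(v).
```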
Our aim is now to take the limit n → +∞ in (5.1). As for all µ ∈ P_2(R^d), µ * ρ_n → µ for W_2 and u is continuous on P_2(R^d), we deduce that (u_n)_n converges pointwise to u. It remains to take the limit in the two integrals of (5.1). We show that

∫_0^t E[b_s • ∂_v (δu_n/δm)(µ_s)(X_s)] ds → ∫_0^t E[b_s • ∂_v (δu/δm)(µ_s)(X_s)] ds.    (5.2)

Since b is uniformly bounded, it is enough to prove that

∫_0^T ∫_{R^d} |∂_v (δu_n/δm)(µ_s)(v) − ∂_v (δu/δm)(µ_s)(v)| p(s, v) dv ds → 0.

By Proposition 4.3, Hölder's inequality and then the L^1 * L^k convolution inequality, this quantity is bounded by I_1 + I_2, where

I_1 := ∫_0^T ‖∂_v (δu/δm)(µ_s * ρ_n)(•) − ∂_v (δu/δm)(µ_s)(•)‖_{L^k(R^d)} ‖p(s, •)‖_{L^{k'}(R^d)} ds,
I_2 := ∫_0^T ‖∂_v (δu/δm)(µ_s)(•) * ρ_n − ∂_v (δu/δm)(µ_s)(•)‖_{L^k(R^d)} ‖p(s, •)‖_{L^{k'}(R^d)} ds.

The integrand in I_1 converges to 0 for almost all s using Assumption (1) in Definition 3.1, once we show that d_P(µ_{s_k} * ρ_{n_k}, µ_s) → 0 for any sequences s_k → s and (n_k)_k. Note that the set {µ_s * ρ_n, s ∈ [0, T], n ≥ 1} is relatively compact in P_2(R^d). If (n_k)_k has a bounded subsequence, the convergence follows directly (see Lemma 4.11). Otherwise, we can assume that (n_k)_k converges to +∞. We use the triangle inequality to get

d_P(µ_{s_k} * ρ_{n_k}, µ_s) ≤ d_P(µ_{s_k} * ρ_{n_k}, µ_s * ρ_{n_k}) + d_P(µ_s * ρ_{n_k}, µ_s).

The last term converges to 0 owing to Lemma 4.12, and the first is bounded by W_2(µ_{s_k}, µ_s) by the contraction inequality (see Lemma 4.11), which converges to 0. Thus Assumption (2) in Definition 3.1 ensures that there exists C > 0 such that for almost all s ∈ [0, T] and for all n, ‖∂_v (δu/δm)(µ_s * ρ_n)(•)‖_{L^k(R^d)} ≤ C. It follows from the convolution inequality L^{k'} * L^1 that for almost all s, the integrand of I_1 is bounded by 2C ‖p(s, •)‖_{L^{k'}(R^d)}, which is integrable on [0, T] thanks to Lemma 4.5 since k ≥ max{d(α + 1), d + 1}. We conclude by the dominated convergence theorem that I_1 converges to 0. The term I_2 also converges to 0 following the same method. Indeed, for almost all s, ∂_v (δu/δm)(µ_s)(•) ∈ L^k(R^d), thus the integrand converges to 0 by Lemma 4.8, and we conclude with the dominated convergence theorem. Therefore (5.2) is proved. Following the same lines, we take the limit n → +∞ in the last integral of (5.1) to obtain, for all t ∈ [0, T], the second-order term in (3.1). This concludes the proof of Theorem 3.3.

Proof of Theorem 3.12
The strategy of the proof is the following. In Step 1, we prove some integrability results coming from Assumption (5) in Definition 3.10. Step 2 is devoted to proving that all the terms in Itô-Krylov's formula (3.3) are well-defined, using a localization argument, Krylov's inequality, and Step 1. Moreover, we see that it is enough to prove the formula up to random times localizing the process ξ. Step 3 is dedicated to regularizing u using convolutions, both in the space and measure variables. In Steps 4 and 5, we follow the strategy of the proof of Theorem 5.102 in [9] to prove Itô-Krylov's formula for u_n, the mollified version of u. Finally, Step 6 aims at taking the limit n → +∞ thanks to Krylov's inequality.
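For the reader's convenience, the Krylov inequality used in Step 2 is, in its classical localized form (a paraphrase of Krylov's estimate for Itô processes with bounded drift and bounded, uniformly elliptic diffusion; the constant N depends on these bounds, on d, T and M):

```latex
\mathbb{E} \int_0^{T_M} |f(t, \xi_t)|\, dt \;\le\; N\, \| f \|_{L^{d+1}([0,T] \times B_M)}
\qquad \text{for every Borel function } f,
```

where T_M localizes ξ inside the ball B_M. It is this estimate that turns L^{d+1}-type integrability of the derivatives of u into almost sure finiteness of the time integrals in (3.3).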
Note that there are three kinds of integrals in Itô's formula (3.3): the terms involving standard time and space derivatives in the first line, those involving the linear derivative in the second line, and the martingale term in the third line. We will treat them separately.
Step 1: Integrability properties. It follows from Assumption (5) in Definition 3.10 and Lemma 4.6 that for any M > 0 the quantities J_1(M) and J_2(M) below are finite. To prove this, we follow the method employed in Step 3 of the preceding proof to justify the dominated convergence theorem. We give details only for J_2(M), since it requires a bit more attention. Owing to Assumption (2) in Definition 3.10, we know the regularity of δu/δm for all (t, µ), and the Sobolev embedding theorem (see Corollary 9.14 in [3]) ensures that the relevant embedding is continuous. Thanks to Assumption (5) in Definition 3.10, there exists a constant C_M > 0 such that the bound holds for almost all s and for all n ≥ 1, where we used the fact that {µ_s * ρ_n, s ∈ [0, T], n ≥ 1} is relatively compact in P_2(R^d) and the convolution inequality L^{k_1'} * L^1. We conclude with Lemma 4.6 since k_1 ≥ max{d(2α_1 + 1), d + 1}. Note that these integrability properties remain true if we replace µ_s * ρ_n by µ_s and remove the supremum. We justify this only for the second point: it follows from the continuity assumption (2) in Definition 3.10 that the pointwise convergence holds for almost all s ∈ [0, T], and the bound follows.

Step 2: Meaning of the terms in (3.3) and localization.
Let (T_M)_M be the sequence of stopping times converging almost surely to T, and set ξ^M_t := ξ_{t∧T_M}, which is bounded by M on the set {T_M > 0}.
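The defining formula for T_M is elided above; a natural choice consistent with the stated properties (a hypothetical reconstruction, using that ξ has continuous paths) is the exit time from the ball B_M:

```latex
T_M := \inf\{\, t \ge 0 : |\xi_t| \ge M \,\} \wedge T,
```

so that |ξ^M_t| ≤ M on {T_M > 0}, and T_M ↑ T almost surely as M → +∞.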
(i) Terms involving standard derivatives in (3.3). We prove that the first integral is almost surely finite. By Proposition 4.3 and Hölder's inequality, one obtains a bound which is finite (see (6.1) in Step 1). We deduce that almost surely, for all M ≥ 1, the integral up to T_M is finite. But it is clear that for almost all ω ∈ Ω and for M bigger than some random constant M(ω) ≥ 1, T_M(ω) = T. Thus, since η is uniformly bounded, ∫_0^T |∂_x u(s, ξ_s, µ_s) • η_s| ds is finite almost surely. The other terms in the first line of Itô's formula (3.3) are treated with the same method.
(ii) Martingale term in (3.3). We need to prove that the integrand of the stochastic integral is almost surely square-integrable in time. Reasoning as before, this is a consequence of the fact that J_2 is finite. Therefore the martingale term in (3.3) is well-defined.
(iii) Terms involving the linear derivative in (3.3). We remark that X and ξ can be seen as independent processes on the product space Ω × Ω, with L(X̃_s) = p(s, •) dx and L(ξ_s) = q(s, •) dx for almost all s. Hölder's inequality gives a bound by the quantity defined in (6.1), which is finite. We deduce as previously that this term is almost surely finite; the term involving ∂²_v δu/δm is dealt with similarly.
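On the product space, the Hölder bound presumably takes the following shape (a sketch; k_2 and its conjugate k_2' are the exponents of Definition 3.10, and the exact exponents used in (6.1) may differ):

```latex
\mathbb{E}\,\widetilde{\mathbb{E}}\Big[ \Big| \partial_v \tfrac{\delta u}{\delta m}(s, \xi_s, \mu_s)(\widetilde X_s) \Big|\, \mathbf{1}_{\{|\xi_s| \le M\}} \Big]
= \int_{B_M} \int_{\mathbf{R}^d} \Big| \partial_v \tfrac{\delta u}{\delta m}(s, x, \mu_s)(v) \Big|\, p(s, v)\, q(s, x)\, dv\, dx
\le \Big\| \partial_v \tfrac{\delta u}{\delta m}(s, \cdot, \mu_s)(\cdot) \Big\|_{L^{k_2}(B_M \times \mathbf{R}^d)}
\, \| q(s, \cdot) \|_{L^{k_2'}(B_M)} \, \| p(s, \cdot) \|_{L^{k_2'}(\mathbf{R}^d)} .
```

The first factor is controlled by Assumption (5), the last two by the density estimates of Proposition 4.3 and Lemma 4.6.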
Since all the terms in (3.3) are well-defined, it is enough to prove Itô-Krylov's formula for u(t ∧ T_M, ξ_{t∧T_M}, µ_{t∧T_M}) almost surely for all t ∈ [0, T], and then to take the limit M → +∞ using the continuity with respect to t of the integrals in Itô-Krylov's formula. So we fix τ := T_M for some M ≥ 1 and we aim at proving the formula up to time τ.
Step 3: Regularization of u. Let u_n be the function defined by u_n(t, x, µ) := u(t, •, µ * ρ_n) * ρ_n(x). It is clearly continuous on [0, T] × R^d × P_2(R^d), as u is. Since ∂_t u is jointly continuous, it follows from Leibniz's rule that u_n is C^1 with respect to t and that we can differentiate under the integral, i.e. for all (t, x, µ), ∂_t u_n(t, x, µ) = ∂_t u(t, •, µ * ρ_n) * ρ_n(x), which is also jointly continuous. As a result of Lemma 4.8 and Proposition 4.9, u_n is C^2 with respect to x, and the derivatives ∂_x u_n and ∂²_x u_n are continuous on [0, T] × R^d × P_2(R^d) by the dominated convergence theorem and the joint continuity of u. We define ρ̃_n by ρ̃_n(x, v) := ρ_n(x) ρ_n(v) for all x, v ∈ R^d; it is easy to see that (ρ̃_n)_n is a mollifying sequence on R^{2d}. Next, we claim that for all (t, x) ∈ [0, T] × R^d, u_n(t, x, •) has a linear derivative given by (6.2). This convolution is well-defined since δu/δm is jointly continuous. To prove (6.2), note first that the bound of Assumption (3) in Definition 3.10 implies that for all (t, x) ∈ [0, T] × R^d, δu_n/δm(t, x, µ)(•) is at most of quadratic growth, uniformly in µ on each compact set, and the dominated convergence theorem proves that δu_n/δm(t, x, •)(•) is continuous. As explained in Remark 2.2, it is enough to compute, for µ, ν ∈ P_2(R^d) and λ ∈ [0, 1], the derivative with respect to λ of u_n(t, x, m_λ), where m_λ := λµ + (1 − λ)ν. Thanks to the bound in Assumption (3) in Definition 3.10, for all compact K ⊂ R^d one has the required domination, and we conclude with the help of Leibniz's rule and Fubini's theorem. It follows from the joint continuity of δu/δm and Leibniz's rule that δu_n/δm is C^2 with respect to v, the ball B_{M+1} appearing because the support of ρ̃_n is included in B_1. Since K * ρ_n is compact in P_2(R^d) and included in P(R^d), Assumption (5) in Definition 3.10 ensures that there exists C > 0 such that the bound holds for all µ ∈ K.
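Unfolding the definitions, the double mollification and its linear derivative (6.2) presumably read as follows (a sketch; the identification of the second formula with a convolution against ρ̃_n holds up to the reflection v → −v, and exactly when ρ_n is symmetric):

```latex
u_n(t, x, \mu) = \int_{\mathbf{R}^d} u(t, x - y, \mu * \rho_n)\, \rho_n(y)\, dy,
\qquad
\frac{\delta u_n}{\delta m}(t, x, \mu)(v)
= \int_{\mathbf{R}^{2d}} \frac{\delta u}{\delta m}(t, x - y, \mu * \rho_n)(v + z)\, \rho_n(y)\, \rho_n(z)\, dy\, dz.
```

The convolution in the measure variable smooths δu/δm in v, while the convolution in x smooths u in the space variable; both are needed for the classical Itô formula of Step 4.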
Step 4: Itô's formula (3.3) for u n when the coefficients b and σ are continuous.
We claim that the function U_n : (t, x) → u_n(t, x, µ_t) is C^1 in time and C^2 in space. The regularity with respect to x is clear from the preceding properties of u_n. Let us thus focus on the regularity with respect to the time variable. For (t, x) ∈ [0, T] × R^d fixed, the regularity assumption on u with respect to t and the standard Itô formula for a flow of measures applied to u_n(t, x, •) (see Theorem 5.99 in Chapter 5 of [9]) give, for h ∈ R satisfying t + h ≥ 0, an expansion of U_n(t + h, x) − U_n(t, x). The two other terms in (6.4) can be dealt with similarly: the dominated convergence theorem, justified by (6.3), ensures that the corresponding functions of (s, x) are continuous. We can now apply the classical Itô formula for U_n and ξ, up to the random time τ defined at the end of Step 2, to obtain (6.5): almost surely, for all t ∈ [0, T], the expansion holds, its last term being of the form ∫ Ẽ[∂²_v δu_n/δm(s, ξ_s, µ_s)(X̃_s) • ã_s] ds. Note that (6.5) does not require Assumptions (A) and (B) on the Itô process X; these assumptions will only be used in Step 6.
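Formula (6.5) is only partially legible above; consistent with (1.2) and with the display recalled at the end of the proof, it presumably reads:

```latex
\begin{aligned}
u_n(t \wedge \tau, \xi_{t \wedge \tau}, \mu_{t \wedge \tau})
= {}& u_n(0, \xi_0, \mu_0)
+ \int_0^{t \wedge \tau} \Big( \partial_t u_n + \partial_x u_n \bullet \eta_s
+ \tfrac{1}{2}\, \partial_x^2 u_n \bullet \gamma_s \gamma_s^* \Big)(s, \xi_s, \mu_s)\, ds \\
& + \int_0^{t \wedge \tau} \partial_x u_n(s, \xi_s, \mu_s) \bullet (\gamma_s\, dB_s) \\
& + \int_0^{t \wedge \tau} \widetilde{\mathbb{E}}\Big[
\partial_v \tfrac{\delta u_n}{\delta m}(s, \xi_s, \mu_s)(\widetilde X_s) \bullet \widetilde b_s
+ \tfrac{1}{2}\, \partial_v^2 \tfrac{\delta u_n}{\delta m}(s, \xi_s, \mu_s)(\widetilde X_s) \bullet \widetilde a_s
\Big]\, ds .
\end{aligned}
```

The first two lines are the classical Itô expansion for ξ, and the last line is the measure-flow contribution in the linear-derivative notation.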
Step 5: Removing the continuity hypothesis on the coefficients b and σ.
We consider (b^m)_m and (σ^m)_m two sequences of continuous and progressively measurable processes approximating b and σ. We set, for t ≤ T, X^m_t := X_0 + ∫_0^t b^m_s ds + ∫_0^t σ^m_s dB_s, and denote by µ^m_t the law of X^m_t. Owing to Step 4, Itô's formula (6.5) holds true for X^m and ξ. Now, we aim at taking the limit m → +∞ in (6.5). Note that the set K := {µ^m_s, s ≤ T, m ≥ 1} ∪ {µ_s, s ≤ T} is compact in P_2(R^d); indeed, using Jensen's inequality and the Burkholder-Davis-Gundy (BDG) inequalities, the second moments are uniformly controlled. We deduce that almost surely, for all t ∈ [0, T], u_n(t, ξ_t, µ^m_t) converges to u_n(t, ξ_t, µ_t). Now, we take the limit m → +∞ in the integrals in Itô's formula (6.5).
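The convergence of the flows rests on a standard moment estimate; a sketch (assuming b^m → b and σ^m → σ in the appropriate L² sense, which is the natural approximation condition elided above, and with C the BDG constant):

```latex
W_2(\mu^m_t, \mu_t)^2 \le \mathbb{E}\,|X^m_t - X_t|^2
\le 2T\, \mathbb{E}\!\int_0^T |b^m_s - b_s|^2\, ds
+ 2C\, \mathbb{E}\!\int_0^T |\sigma^m_s - \sigma_s|^2\, ds
\xrightarrow[m \to +\infty]{} 0,
```

using Jensen's inequality for the drift term and the BDG inequality (or Itô's isometry) for the stochastic integral, uniformly in t.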
(i) Martingale term in (6.5). Using BDG's inequality, there exists C > 0 such that the expected supremum of the difference of the stochastic integrals is controlled, and the dominated convergence theorem can be applied since γ is bounded and ∂_x u_n is jointly continuous.

(ii) Terms involving the linear derivative in (6.5). Let us write the difference of the corresponding terms as I_1 + I_2; Cauchy-Schwarz's inequality then gives the required bounds.
We conclude that I_1 converges to 0 thanks to the bound (6.3) proved in Step 3 and since ξ is bounded by M on the set {τ > 0}. To show that I_2 → 0, we use the fact that b is bounded by K to obtain the corresponding estimate. The continuity of ∂_v δu_n/δm and the convergence in L^2 of (X̃^m_s)_m to X̃_s ensure that for all ω ∈ Ω, ∂_v δu_n/δm(s, ξ_s(ω), µ^m_s)(X̃^m_s) − ∂_v δu_n/δm(s, ξ_s(ω), µ_s)(X̃_s) converges in probability on Ω̃ to 0 as m goes to infinity. Using a uniform integrability argument coming from (6.3), we deduce that I_2 converges to 0. Following the same strategy, one obtains the analogous convergence for all t ∈ [0, T].

(iii) Terms involving standard derivatives in (6.5). Since η and γ are uniformly bounded, it follows from the dominated convergence theorem that almost surely, for all t ≤ T, the integrals over [0, t ∧ τ] converge.
This concludes Step 5.

Step 6: Taking the limit n → +∞.
From Step 5, we deduce that Itô's formula (6.5) in Step 4 holds for u_n up to time τ. To conclude the proof, we need to take the limit n → +∞ in each term of (6.5); it then remains to remove the stopping time τ as explained at the end of Step 2 (i.e. letting τ → T). The continuity of u ensures that almost surely, for all t ≤ T, u_n(t, ξ_t, µ_t) → u(t, ξ_t, µ_t). We now focus on the integrals in Itô's formula (6.5).
(i) Martingale term in (6.5). Thanks to BDG's inequality, Hölder's inequality, and the boundedness of γ, we obtain a bound that splits into two terms I_1 and I_2, which we prove converge to 0. First note that, due to the convolution inequality L^r * L^1, the control on B_{R+1} follows from the fact that the support of each ρ_n is included in B_1. Hence, as a consequence of the Sobolev embedding theorem, for all t the relevant function is continuous. Since µ_s ∈ P(R^d) for almost all s and thanks to Assumption (2) in Definition 2.3, we deduce that the integrand in Ĩ_1 converges to 0 for almost all s. It follows from the dominated convergence theorem (see (6.1) in Step 1) that Ĩ_1 converges to 0, as well as I_1. We now focus on I_2. The integrand in I_2 converges to 0 for almost all s because ∂_x u(s, •, µ_s) ∈ L^{2k_1}(B_M). We conclude with the dominated convergence theorem as previously. This shows that, up to an extraction, the convergence holds almost surely.

(ii) Terms involving the linear derivative in (6.5). Following the same strategy, we obtain the corresponding bound using Hölder's inequality.
The dominated convergence theorem, justified by the integrability properties established in Step 1, allows us to conclude. The term involving ∂²_x u in (6.5) is dealt with similarly.
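The convolution inequalities L^r * L^1 and L^{k'} * L^1 used throughout are instances of Young's inequality with one exponent equal to 1; since each ρ_n is a probability density,

```latex
\| f * \rho_n \|_{L^r(\mathbf{R}^d)} \le \| f \|_{L^r(\mathbf{R}^d)}\, \| \rho_n \|_{L^1(\mathbf{R}^d)} = \| f \|_{L^r(\mathbf{R}^d)} .
```

In particular, mollification never increases the L^r-norms appearing in the estimates of this step.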
Computation of the distributional derivatives and continuity. Let µ ∈ P(R^d) and let f ∈ L^{(d+1)'}(R^d) be its density. By interpolation, we know that f ∈ L^{r'}(R^d) for all r ≥ d + 1. Let ϕ ∈ C_c^∞(R^d) and i ∈ {1, . . ., d}. Using Fubini's theorem, justified by the quadratic growth of g and the fact that f dx ∈ P_2(R^d), we obtain the integration-by-parts identity for the approximating densities. Our aim is to take the limit n → +∞ on both sides of the previous equality. Using Fubini's theorem, the left-hand side is equal to ∫∫ g(v, y) ∂_{v_i}ϕ(v) dv f_n(y) dy, and it converges to ∫∫ g(v, y) ∂_{v_i}ϕ(v) f(y) dy dv. Indeed, f_n dx → f dx in W_2 and the function y → ∫_{R^d} g(v, y) ∂_{v_i}ϕ(v) dv is continuous and at most of quadratic growth. For the right-hand side, we prove convergence to −∫∫ ∂_{v_i} g(v, y) f(y) ϕ(v) dy dv. Note that the limit is well-defined by Hölder's inequality, the right-hand side being finite, and the same inequality gives the convergence thanks to (A.1). Taking the limit n → +∞ in (A.2), we deduce the claimed identity. Moreover, the derivative belongs to L^k(R^d): applying Hölder's inequality, one obtains the bound, and this inequality together with the linearity in f justifies that the map µ → ∂_v δu/δm(µ) is continuous from (P(R^d), d_k) into L^k(R^d). Following the same lines, we show that the distributional derivative of order 2 of δu/δm(µ), for µ ∈ P(R^d), is given by an R^{d×d}-valued function, which is also continuous from (P(R^d), d_k) into L^k(R^d). Indeed, as previously, Fubini's theorem ensures that the relevant kernel belongs to L^{k_2}(B_R × R^d). Moreover, the corresponding function is continuous because F ∈ C^1(R^d × R; R), and thus y → ∂_y F(•, y) ∈ L^∞(B_R) is continuous. The same reasoning proves the remaining identity and the continuity of the associated function for all R > 0. We conclude that u ∈ W_2(R^d) thanks to Remark 3.11.
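In summary, the computation above establishes the weak-derivative identity

```latex
\partial_{v_i} \left( v \mapsto \int_{\mathbf{R}^d} g(v, y)\, f(y)\, dy \right)
= \int_{\mathbf{R}^d} \partial_{v_i} g(\cdot, y)\, f(y)\, dy
\quad \text{in the sense of distributions, for } i = 1, \dots, d,
```

together with its continuity in µ = f dx with respect to the distance d_k.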

Definition 3.10. Let W_2(R^d) be the set of continuous functions u : [0, T] × R^d × P_2(R^d) → R satisfying the following properties, for a certain distance d_P satisfying (H1) and (H2).

Proposition 4.3. Under Assumptions (A) and (B) on the coefficients b and σ, there exists a function p such that, for almost all t ∈ [0, T], the law of X_t is p(t, •) dx.

Lemma 4.5. Let p be the density given by Proposition 4.3. Then for all k ...

4.3. Convolution of probability measures.

4.4. Measurability. We will need the following lemma to guarantee that, for u ∈ W_1(R^d), we can find versions of ∂_v δu/δm and ∂²_v δu/δm that are measurable with respect to (µ, v) ∈ P(R^d) × R^d.

Lemma 4.13. Let u : E → L^k(R^d) be a continuous function, where E is a metric space and k > 1.

The function of interest is then measurable by composition. First, note that µ_s ∈ P(R^d) for almost all s ∈ [0, T] (see Corollary 4.4), so we can change µ_s on a negligible set of times s to ensure that µ_s ∈ P(R^d) for all s ∈ [0, T]. Moreover, µ_s = lim_{n→+∞} µ_s * ρ_n for d_P by Assumption (H2) in Definition 2.3. It remains to show that s → µ_s * ρ_n ∈ P(R^d) is continuous, and thus measurable, for all n. This follows from the continuity of s → µ_s ∈ P_2(R^d) and from Assumption (H1) in Definition 2.3.

for some constant C coming from Assumption (2) in Definition 3.1, because the flow (µ_s)_{s≤T} is compact in P_2(R^d) and belongs to P(R^d) for almost all s. The last bound is finite thanks to Lemma 4.5 since k ≥ max{d(α + 1), d + 1}. The same properties hold for the term involving ∂²_v δu/δm.
Assumption (H1) in Definition 2.3 provides that µ^m * ρ_n → µ * ρ_n for d_P when m → +∞. Finally, using the first assumption in Definition 3.1, we conclude that D_1 converges to 0 when m → +∞. This shows the continuity of ∂_v δu_n/δm on P_2(R^d) × R^d. The same reasoning proves the joint continuity of ∂²_v δu_n/δm. Let K ⊂ P_2(R^d) be a compact set. For µ ∈ K and v ∈ R^d, one has the corresponding bound.

By Fubini's theorem,

∫_{R^d} (∫_{R^d} g(v, y) f(y) dy) ∂_{v_i}ϕ(v) dv = ∫_{R^d×R^d} g(v, y) f(y) ∂_{v_i}ϕ(v) dy dv.

Let us define f_n(x) := (1/µ(B_n)) ((f 1_{B_n}) * ρ_n)(x), for n large enough to have µ(B_n) > 0. The function f_n is a probability density which belongs to C_c^∞(R^d). It easily follows from Lemma 4.7, Lemma 4.8 and the dominated convergence theorem that

f_n → f in L^{k'} and f_n dx → f dx in W_2. (A.1)

For a fixed n, since f_n is smooth, we have by definition of the distributional derivative

∫_{R^d×R^d} g(v, y) f_n(y) ∂_{v_i}ϕ(v) dy dv = − ∫_{R^d×R^d} ∂_{v_i} g(v, y) f_n(y) ϕ(v) dy dv. (A.2)

∫_{R^d×R^d} g(v, y) f(y) ∂_{v_i}ϕ(v) dy dv = − ∫_{R^d×R^d} ∂_{v_i} g(v, y) f(y) ϕ(v) dy dv.

Hence, the distributional derivative of v → ∫_{R^d} g(v, y) f(y) dy is given by the function v → ∫_{R^d} ∂_v g(v, y) f(y) dy.
This proves exactly that for all µ ∈ P_2(R^d) and all x, v ∈ R^d, ∂_v δu/δm(x, µ)(v) = ∇g(v) ∂_y F(x, ∫_{R^d} g dµ). Since ∇g ∈ L^{k_2}(R^d) and ∂_y F(•, ∫_{R^d} g dµ) ∈ L^∞(B_R) for all R > 0 and µ ∈ P_2(R^d), the function above has the required local integrability.

The law of X_t is equal to p(t, •) dx and belongs to P(R^d) (see Proposition 4.3). Let us give two examples for the distance d_P.

Example 2.4. The Wasserstein distance W_2 clearly satisfies Assumptions (H1) and (H2) in Definition 2.3. Another family of examples is given by the distances d_k defined for k ∈ [d + 1, +∞[ and µ, ν ∈ P(R^d).

Remark 3.11. The space W_2(R^d) contains the functions satisfying the first four assumptions of Definition 3.10 with (P(R^d), d_P) replaced by (P_2(R^d), W_2), assuming moreover that the functions in Assumptions (2) and (4) are continuous with respect to (t, µ) ∈ [0, T] × P_2(R^d); indeed, Assumption (5) is then automatically satisfied with α_1 = α_2 = 0 because K is compact. The bound in Assumption (3) is quite natural: if the supremum in this bound were taken only over a compact set of P_2(R^d), it would be the definition of the linear derivative, but we also need to control δu/δm locally uniformly in the space variable x ∈ R^d because of our regularization procedure through a convolution both in the space and measure variables. Assumptions (2), (4) and (5) are generalizations of those in Definition 3.1, adapted to the presence of the space and time variables. In Assumption (5), the condition on k_2 and α_2 changes a bit compared with the analogous assumption in Definition 3.1, essentially because it deals with functions on R^{2d} instead of R^d. Let us mention that Assumption (5) in Definition 3.10 can be replaced by the integrability properties (6.1) established in Step 1 of the proof of the next theorem (see Section 6).
µ_s * ρ_n → µ_s for d_P for almost all s, thanks to Assumption (H2) in Definition 2.3. Let us now prove that the dominated convergence theorem applies. Since the integrand is indexed by two sequences, we have to extract a convergent subsequence from (µ_{s_k} * ρ_{n_k})_k. Up to an extraction, we can assume that (s_k)_k converges to some s ∈ [0, T]. There are two cases: if there exists l such that n_k = l infinitely often, then µ_{s_k} * ρ_l → µ_s * ρ_l in W_2 by the contraction inequality (see Lemma 4.11), and we conclude thanks to the dominated convergence theorem and the joint continuity of δu/δm. Moreover, for all compact K ⊂ P_2(R^d) and for all M > 0, the corresponding bound holds.

The displayed terms read
∫_0^{t∧τ} (∂_t u(s, ξ_s, µ_s) + ∂_x u(s, ξ_s, µ_s) • η_s) ds + (1/2) ∫_0^{t∧τ} ∂²_x u(s, ξ_s, µ_s) • γ_s γ*_s ds + ∫_0^{t∧τ} ∂_x u(s, ξ_s, µ_s) • (γ_s dB_s).
This ends the proof as explained in Step 2.

Proof of Example 2.4. (1) It follows from the contraction inequality in Lemma 4.11 and Corollary 4.12.