Self-improvement of the Bakry-Émery condition and Wasserstein contraction of the heat flow in RCD(K,∞) metric measure spaces

We prove that the linear heat flow in an RCD(K,∞) metric measure space (X,d,m) satisfies a contraction property with respect to every L^p-Kantorovich-Rubinstein-Wasserstein distance. In particular, we obtain a precise estimate for the optimal W_∞-coupling between two fundamental solutions in terms of the distance of the initial points. The result is a consequence of the equivalence between the RCD(K,∞) lower Ricci bound and the corresponding Bakry-Émery condition for the canonical Cheeger-Dirichlet form in (X,d,m). The crucial tool is the extension to the non-smooth metric measure setting of Bakry's argument, which allows one to improve the commutation estimates between the Markov semigroup and the Carré du Champ associated to the Dirichlet form. This extension is based on a new a priori estimate and a capacitary argument for regular and tight Dirichlet forms that are of independent interest.


Introduction
RCD(K, ∞) spaces in short. This more restrictive class of spaces can also be characterised in terms of the Evolution Variational Inequality formulation of (H_t)_{t≥0}, see (4.15), which provides the W_2 contraction property W_2(H_t µ, H_t ν) ≤ e^{−Kt} W_2(µ, ν) for every µ, ν ∈ P_2(X). (1.2) The RCD(K, ∞) condition is still stable with respect to measured Gromov-Hausdorff convergence [4,24] and thus includes all possible measured Gromov-Hausdorff limits of Riemannian manifolds under uniform lower curvature bounds.
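As a sanity check of the contraction property (1.2), one can look at the Euclidean model case. The setting below is an assumption made purely for illustration (X = R^n with the Lebesgue measure, K = 0, and the heat flow normalised so that its generator is the Laplacian); it is a sketch, not part of the argument of the paper.

```latex
% Illustrative model case (assumption): X = \mathbb{R}^n, m = Lebesgue measure,
% K = 0, generator L = \Delta, so that the fundamental solutions are Gaussians.
\begin{align*}
  \mathsf H_t\delta_x &= \mathcal N(x,2t\,\mathrm{Id}), \qquad
  \mathsf H_t\delta_y = \mathcal N(y,2t\,\mathrm{Id})
    = (\tau_{y-x})_\sharp\,\mathsf H_t\delta_x,
    \quad \tau_a(z) := z+a, \\
  W_2(\mathsf H_t\delta_x,\mathsf H_t\delta_y) &\le |x-y|
    && \text{(use the translation coupling } (\mathrm{id},\tau_{y-x})_\sharp \mathsf H_t\delta_x\text{)}, \\
  W_2(\mathsf H_t\delta_x,\mathsf H_t\delta_y) &\ge |x-y|
    && \text{(by Jensen, } W_2 \text{ dominates the distance of the barycentres)}.
\end{align*}
```

In this model case (1.2) therefore holds with equality for Dirac masses, which shows that the constant e^{−Kt} cannot be improved when K = 0.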
In RCD(K, ∞) spaces E := 2Ch is a strongly local Dirichlet form admitting a Carré du champ Γ(f) that coincides with the squared minimal weak upper gradient |Df|_w^2 associated to (1.1), see (4.8) and (4.9). In terms of the generator L : D(L) ⊂ L^2(X, m) → L^2(X, m) of (P_t)_{t≥0}, this provides the useful Leibnitz and composition rules, at least for a suitable class of functions in D(L), see § 2.2.
Distance and energy are intimately correlated by the explicit formula (1.1) (which involves the metric slope of Lipschitz functions) and by the somehow dual property that expresses d as the canonical distance [12] associated to E: every bounded function f ∈ D(E) with Γ(f) ≤ 1 has a continuous representative f̃, (1.3a) d(x, y) := sup{ψ(x) − ψ(y) : ψ ∈ D(E) ∩ C_b(X), Γ(ψ) ≤ 1}. (1.3b) Having a Carré du champ at our disposal, it is then possible to consider a weak version (see (3.1)) of the Carré du champ itéré Γ_2 (1.4) and to prove a weak BE(K, ∞) condition of the type Γ_2(f) ≥ KΓ(f), where Γ(f) := Γ(f, f), Γ_2(f) := Γ_2(f, f), (1.5) in a suitable weaker integral form (Definition 3.1), but still sufficient to get the crucial pointwise gradient bound (1.6). It turns out that the implication RCD(K, ∞) ⇒ BE(K, ∞) can also be inverted and the two points of view are ultimately equivalent. This has been shown by [5]: starting from a Polish topological space (X, τ) endowed with a local Dirichlet form E with the associated Carré du champ Γ and the intrinsic distance d satisfying (1.3a,b) and inducing the topology τ, if BE(K, ∞) holds, then (X, d, m) is an RCD(K, ∞) metric measure space.
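For orientation, the classical (smooth, formal) Bakry-Émery calculus behind these notions can be sketched as follows. The formulas are the standard ones for a diffusion generator acting on a sufficiently rich algebra of functions; their normalisation is an assumption and they are not a quotation of the displays of this paper, whose weak formulation is precisely designed to avoid the regularity used here.

```latex
% Formal computation in the smooth setting (assumption: f is regular enough
% for all the quantities below to make sense pointwise).
\begin{align*}
  \Gamma(f,g) &= \tfrac12\big(L(fg) - f\,Lg - g\,Lf\big), \\
  \Gamma_2(f,g) &= \tfrac12\big(L\,\Gamma(f,g) - \Gamma(f,Lg) - \Gamma(Lf,g)\big).
\end{align*}
% Interpolation argument: for fixed t > 0 set g_s := P_{t-s} f and
% \Lambda(s) := e^{-2Ks} P_s \Gamma(g_s), s \in [0,t]. Then
\begin{align*}
  \frac{d}{ds}\Lambda(s)
   &= e^{-2Ks} P_s\big(L\,\Gamma(g_s) - 2\,\Gamma(g_s,Lg_s) - 2K\,\Gamma(g_s)\big) \\
   &= 2\,e^{-2Ks} P_s\big(\Gamma_2(g_s) - K\,\Gamma(g_s)\big) \;\ge\; 0
   \qquad\text{if } \Gamma_2 \ge K\,\Gamma, \\
  \Lambda(0) \le \Lambda(t)
   \quad&\Longleftrightarrow\quad
   \Gamma(P_t f) \le e^{-2Kt}\, P_t\,\Gamma(f).
\end{align*}
```

The last line is the prototype of a pointwise gradient bound: the Γ_2 inequality enters only through the sign of the derivative of the interpolating quantity.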

Applications of BE(K, ∞): refined gradient estimates and Wasserstein contraction
The identification between RCD(K, ∞) and BE(K, ∞) leads to the possibility of applying a large number of the results and techniques originally proved for smoother spaces satisfying the Bakry-Émery condition. Carrying out this project is not always simple, since proofs often use extra regularity or algebraic assumptions (see e.g. [8,Page 24]) that prevent a direct application to the nonsmooth context. Among the most useful properties, Bakry [7,8] showed that the Γ_2 condition expressed through the pointwise bounds (1.6) is potentially self-improving, since it leads to the stronger commutation inequality (1.7). (1.7) is in fact a consequence of the crucial estimate (1.8), a formula whose meaning can be better understood recalling that in a Riemannian manifold (M^d, g) endowed with the canonical Riemannian volume m = Vol_g, we have (1.9). (1.8) can be derived by applying the Γ_2 inequality (1.5) to polynomials of two or more functions f_1, f_2, · · · . However, Bakry's clever strategy of [7,8] requires a multivariate differential formula for the Γ_2 operator, which typically involves further smoothness assumptions. The aim of the present paper is twofold: on the one hand, we want to show how to obtain the estimate (1.8) in a very general setting, starting from the weak integral formulation of BE(K, ∞).
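The geometric content of such a self-improved estimate can be read off in the smooth Riemannian case by means of the Bochner identity. The following derivation is a sketch under the assumption that L is the Laplace-Beltrami operator and Ric ≥ K g; the resulting inequality is the classical form of the estimate for Γ(Γ(f)) and is given here only as an illustration.

```latex
% Smooth Riemannian model (assumptions: L = \Delta_g, \mathrm{Ric} \ge K g).
\begin{align*}
  \Gamma(f) &= |\nabla f|_g^2, \qquad
  \Gamma_2(f) = |\mathrm{Hess}\, f|_g^2 + \mathrm{Ric}(\nabla f,\nabla f)
    && \text{(Bochner identity)}, \\
  \Gamma\big(\Gamma(f)\big)
   &= \big|\nabla |\nabla f|^2\big|^2
    = 4\,\big|\mathrm{Hess}\, f(\nabla f,\cdot)\big|^2
    \le 4\,|\mathrm{Hess}\, f|^2\,|\nabla f|^2, \\
  |\mathrm{Hess}\, f|^2 &= \Gamma_2(f) - \mathrm{Ric}(\nabla f,\nabla f)
    \le \Gamma_2(f) - K\,\Gamma(f), \\
  \Longrightarrow\quad
  \Gamma\big(\Gamma(f)\big) &\le 4\,\big(\Gamma_2(f) - K\,\Gamma(f)\big)\,\Gamma(f).
\end{align*}
```

In other words, the gap Γ_2(f) − KΓ(f) controls the full Hessian, and this extra information is what the bare inequality Γ_2 ≥ KΓ does not display explicitly.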
This result is independent of the theory of metric measure spaces, and it is obtained for general Dirichlet forms in Polish spaces satisfying standard regularity and tightness assumptions. It relies on a simple estimate showing that Γ(f) ∈ D(E) if f belongs to the space D_∞, whose elements f are characterised by f ∈ D(L) with Γ(f) ∈ L^∞(X, m), Lf ∈ D(E). Tightness and regularity of E are then sufficient to give a measure-theoretic sense to LΓ(f), to Γ_2(f) and to multivariate calculus for Φ ∘ f thanks to capacitary arguments. The main point here is that Γ_2(f) may be singular with respect to m, but its singular part is nonnegative; moreover, the multiplication of the measure Γ_2(f) with functions in D(E) still makes sense since the latter admit a quasi-continuous representative and polar sets are negligible w.r.t. the measure Γ_2(f).
Finally, the application of (1.5) to contraction estimates for the heat flow (H_t)_{t≥0} in Wasserstein spaces follows Kuwada's duality approach [26], thanks to (1.3a), (1.3b) and the refined argument developed in [5]. We can then prove the optimal contraction estimate for every L^p-Wasserstein distance and, when K ≥ 0, for any transport cost depending on the distance d in an increasing way, see (4.1) (see [29] for a similar estimate in R^n).
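Schematically, the duality principle invoked here pairs gradient estimates for the semigroup with contraction estimates for the dual flow on measures. The display below is a hedged summary of its general shape, with conjugate exponents p and q; the precise hypotheses and the class of admissible functions are those of [26] and [5] and are not reproduced.

```latex
% Schematic form of the duality between gradient bounds and Wasserstein
% contraction (assumption: p, q \in [1,\infty] with 1/p + 1/q = 1).
\begin{align*}
  \Gamma(P_t f)^{q/2} \le e^{-qKt}\, P_t\big(\Gamma(f)^{q/2}\big)
  \quad&\Longleftrightarrow\quad
  W_p(\mathsf H_t\mu,\mathsf H_t\nu) \le e^{-Kt}\, W_p(\mu,\nu), \\
  q = 2:\quad \Gamma(P_t f) \le e^{-2Kt} P_t\,\Gamma(f)
  \quad&\Longleftrightarrow\quad W_2\text{-contraction}, \\
  q = 1:\quad \Gamma(P_t f)^{1/2} \le e^{-Kt} P_t\big(\Gamma(f)^{1/2}\big)
  \quad&\Longleftrightarrow\quad W_\infty\text{-contraction}.
\end{align*}
```

This is why a self-improvement of the gradient bound from the exponent q = 2 down to q = 1 translates into contraction for every W_p, up to p = ∞.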

Plan of the paper
We will recall in Section 2 a few basic results concerning Dirichlet forms, Carré du champ, multivariate differential calculus and capacities. A simple but important estimate is proved in Lemma 2.6. After a brief review of the weak formulation of the BE(K, ∞) condition, Section 3 contains the main properties for the measure-theoretic interpretation of the Carré du champ itéré Γ_2 and the corresponding multivariate calculus rules. The main estimates are then proved in Theorem 3.4 and its Corollary 3.5.
Applications to RCD(K, ∞) spaces and to Wasserstein contraction of the heat flow are finally discussed in Section 4.

Acknowledgment
We would like to thank Luigi Ambrosio, Nicola Gigli, Michel Ledoux for various fruitful discussions and the anonymous reviewer for the accurate and valuable report.

Notation, Dirichlet forms and Carré du Champ
Let (X, τ) be a Polish topological space. We will denote by B(X) the collection of its Borel sets and by M(X) the space of Borel signed measures with finite total variation, i.e. σ-additive maps µ : B(X) → R. M(X) is endowed with the weak convergence with respect to the duality with the continuous and bounded functions of C_b(X). M_+(X) and P(X) will denote the convex subsets of nonnegative finite measures and of probability measures in X, respectively.
We will consider a σ-finite Borel measure m ∈ M_+(X) with full support supp(m) = X and a strongly local, symmetric Dirichlet form E : L^2(X, m) → [0, ∞] with proper domain V := {f ∈ L^2(X, m) : E(f) < ∞} dense in L^2(X, m). E generates a mass preserving Markov semigroup (P_t)_{t≥0} in L^2(X, m) with generator L and domain D(L) dense in V.
We will still use the symbol E to denote the associated bilinear form in V. V is a Hilbert space with the graph norm induced by E: We will assume that E admits a Carré du Champ Γ(·, ·): it is a symmetric, bilinear and continuous map Γ : In the following we set
We will denote by f := (f_i)_{i=1}^n an n-tuple of real measurable functions defined on X and by Φ(f) = Φ(f_1, · · · , f_n) the corresponding composed function.
For a proof of the following properties, we refer to [14, Ch. I, §6]: notice that we do not assume any bounds on the derivatives of Φ and Ψ since they will be composed with (essentially) bounded functions.
L.2 If f ∈ V and g ∈ G ∞ then f g ∈ V.
L.3 If f, g ∈ V ∞ (or f ∈ V and g ∈ G ∞ ) and h ∈ V then [14, Ch. I, Cor. 6.1.3] If a property of points in X holds in a complement of an E-polar set we say that it holds E-quasi-everywhere (E-q.e.) .
E-nests and E-polar sets can also be characterized in terms of capacities; we recall here a version that will be useful later on. The capacity Cap (it corresponds to Cap_{h,1} with h ≡ 1 in the notation of [19]) of an open set A ⊂ X is defined by and it can be extended to arbitrary sets B ⊂ X by Notice also that Cap(A) ≥ m(A). QR.2 There exists a dense subset of V whose elements have E-quasi-continuous representatives.

QR.3
There exists an E-polar set N ⊂ X and a countable collection of E-quasi-continuous f is unique up to q.e. equality. Notice that When m(X) < ∞ so that Cap(X) < ∞, Theorem 2.2(i) shows that QR.1 is equivalent to the tightness condition there exist compact sets K_n ⊂ X, n ≥ 1, such that lim In the general case of a σ-finite measure m satisfying (2.6), we have the following simple criterion of quasi-regularity, where (with a slight abuse of notation) we will denote by V ∩ C(X) the subspace of V consisting of those functions which admit a continuous representative.
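The inequality Cap(A) ≥ m(A) recalled above is a one-line consequence of the definition of capacity. The sketch below assumes the standard definition for Dirichlet forms (capacity of an open set as an infimum of squared V-norms, extended to arbitrary sets by outer approximation); the notation is the usual one and is an assumption rather than a quotation.

```latex
% Assumed standard definition of the capacity associated to (\mathcal E, \mathbb V),
% with \|u\|_{\mathbb V}^2 = \|u\|_{L^2(X,m)}^2 + \mathcal E(u).
\begin{align*}
  \mathrm{Cap}(A) &:= \inf\big\{ \|u\|_{\mathbb V}^2 :
      u \in \mathbb V,\ u \ge 1 \ m\text{-a.e. in } A \big\}
      && (A \subset X \text{ open}), \\
  \mathrm{Cap}(B) &:= \inf\big\{ \mathrm{Cap}(A) :
      B \subset A,\ A \text{ open} \big\}
      && (B \subset X \text{ arbitrary}).
\end{align*}
% If u \ge 1 m-a.e. in A then
\begin{align*}
  \|u\|_{\mathbb V}^2 \;\ge\; \int_X u^2 \, dm \;\ge\; \int_A 1 \, dm \;=\; m(A),
  \qquad\text{hence}\qquad \mathrm{Cap}(A) \ge m(A).
\end{align*}
```

In particular every set of zero capacity is m-negligible, so statements holding quasi-everywhere are stronger than statements holding m-almost everywhere.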

Lemma 2.4 (A criterion for quasi-regularity)
Let us assume that there exists a nondecreasing sequence (X n ) n∈N of open subsets of X satisfying (2.6) and let us suppose that is dense in V and it separates the points of X.
Then E is quasi-regular.
Proof. Let us set F_k := ⋃_{j=1}^k K_{j,j}. (F_k)_{k∈N} is a nondecreasing sequence of compact sets and whenever k ≥ n we get We introduce the convex set [14, Ch. I, § 9.2] in the case of a finite measure m(X) < ∞), applied to the representation of ℓ through the 1-excessive function u_ℓ of (2.10).
Proposition 2.5 Let us assume that E is quasi-regular. Then for every ℓ ∈ V ′ + there exists a (unique) σ-finite and nonnegative Borel measure µ in X such that every E-polar set is µ-negligible and then µ is a finite measure and µ(X) ≤ M .
We will identify ℓ with µ. Notice that if µ ∈ V′_+ and 0 ≤ ν ≤ cµ, then also ν ∈ V′_+ since The next Lemma provides a simple but important application of the previous Proposition to the case of a function u with measure-valued Lu. We first recall a well known approximation procedure (see e.g. [32, Proof of Thm. 2.7]), which will turn out to be useful in the sequel. For f ∈ L^2(X, m) let us set (2.13) P^ε is positivity preserving and it is not difficult to check that for ε > 0 P^ε f ∈ D(L) and for (2.14)
Lemma 2.6 Let us assume that the strongly local Dirichlet form E is quasi-regular, according and there exists a unique finite Borel measure µ := µ_+ − g m with µ_+ ≥ 0, µ_+(X) ≤ ∫_X g dm such that every E-polar set is |µ|-negligible, the q.c. representative of any function in V belongs to L^1(X, |µ|), and
Proof. Let u_ε := P^ε u, ε ≥ 0, and notice that by the regularisation properties of (P^ε)_{ε>0} which in particular yields Lu_ε + P^ε g ≥ 0. Choosing ϕ := u_ε in (2.18) and inverting the sign of the inequality we obtain We can then pass to the limit as ε ↓ 0 obtaining (2.16). Moreover, taking nonnegative functions ϕ, ψ ∈ L^2 ∩ L^∞(X, m) with 0 ≤ ϕ(x) ≤ 1 and ψ(x) > 0 for m-a.e. x ∈ X (such a function ψ exists since m is σ-finite) and setting ϕ_n(x) := 1 ∧ (ϕ(x) + nψ(x)), (2.18) applied to the differences ϕ_{n+1} − ϕ_n ≥ 0 (notice that ϕ ≡ ϕ_0), yields that for every n ≥ 0 Passing to the limit as n → ∞, since ϕ_n ↑ 1 m-a.e. we obtain since (P_t)_{t≥0} is mass preserving and thus ∫_X Lu_ε dm = 0. Let us now denote by ℓ the linear Choosing a nonnegative ϕ ∈ V_∞ in (2.18) and passing to the limit ε ↓ 0 we easily find that Applying the previous Proposition 2.5 we conclude.
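A common way to realise a regularisation of the semigroup with the properties used in the proof above is to mollify in the time variable. The following sketch assumes a particular mollifier κ (a hypothetical choice, not necessarily the one of (2.13)) and shows why the regularised function lies in the domain of the generator.

```latex
% Time mollification of the semigroup (assumption: \kappa \in C_c^\infty(0,\infty),
% \kappa \ge 0, \int_0^\infty \kappa(r)\,dr = 1; the symbol \kappa is illustrative).
\begin{align*}
  P^\varepsilon f &:= \frac1\varepsilon \int_0^\infty P_r f \,\kappa(r/\varepsilon)\,dr,
     \qquad f \in L^2(X,m),\ \varepsilon > 0, \\
  L\,P^\varepsilon f
    &= \frac1\varepsilon \int_0^\infty \frac{d}{dr}\big(P_r f\big)\,\kappa(r/\varepsilon)\,dr
     = -\frac1{\varepsilon^2} \int_0^\infty P_r f \,\kappa'(r/\varepsilon)\,dr .
\end{align*}
% The boundary terms in the integration by parts vanish because \kappa has
% compact support in (0,\infty); the right-hand side is a Bochner integral
% in L^2(X,m), so P^\varepsilon f \in D(L).
```

Positivity preservation is inherited from that of each P_r, since κ ≥ 0, and P^ε f → f in L^2(X, m) as ε ↓ 0 by the strong continuity of the semigroup.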
We denote by and we will write L ⋆ u = µ.
By a standard approximation argument by truncation we extend the previous identity to arbitrary ζ ∈ V (notice that f̃ is essentially bounded and ζ̃ ∈ L^1(X, |µ|)).

The Bakry-Émery condition
Let us assume that the Dirichlet form E admits a Carré du champ Γ and let us introduce the multilinear form Γ 2 When f = g we also set .
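The idea behind a weak, multilinear version of Γ_2 is to move the outermost generator onto a test function. The display below is a sketch of this mechanism; the bracket notation and the class of admissible f, g, ϕ are assumptions made for illustration.

```latex
% Formal derivation of a weak form of \Gamma_2 (assumptions: f, g, \varphi regular
% enough; the notation \Gamma_2[f,g;\varphi] is illustrative).
\begin{align*}
  \int_X \Gamma_2(f,g)\,\varphi \, dm
   &= \frac12 \int_X \big( L\,\Gamma(f,g) - \Gamma(f,Lg) - \Gamma(g,Lf) \big)\,\varphi \, dm \\
   &= \frac12 \int_X \Big( \Gamma(f,g)\, L\varphi
        - \big( \Gamma(f,Lg) + \Gamma(g,Lf) \big)\,\varphi \Big)\, dm
    \;=:\; \Gamma_2[f,g;\varphi].
\end{align*}
% The second equality uses the symmetry of L in L^2(X,m). The last expression
% only requires f, g \in D(L) with Lf, Lg \in \mathbb V and a bounded
% \varphi \in D(L) with bounded L\varphi.
```

With this formulation an inequality such as Γ_2 ≥ KΓ can be tested against nonnegative ϕ without assuming that Γ(f, g) belongs to the domain of L.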

Definition 3.1 (Bakry-Émery condition)
We say that the strongly local Dirichlet form E satisfies the BE(K, ∞) condition, K ∈ R, if it admits a Carré du Champ Γ and where I_{2K}(t) = ∫_0^t e^{2Ks} ds.
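For later use it may help to have the elementary closed form of the function I_{2K} at hand; the computation below uses only its definition as an integral of an exponential.

```latex
% Closed form and small-time expansion of I_{2K}.
\begin{align*}
  I_{2K}(t) = \int_0^t e^{2Ks}\,ds =
  \begin{cases}
    \dfrac{e^{2Kt}-1}{2K} & \text{if } K \neq 0, \\[2ex]
    t & \text{if } K = 0,
  \end{cases}
  \qquad
  I_{2K}(t) = t + K t^2 + O(t^3) \quad \text{as } t \downarrow 0 .
\end{align*}
```

In particular I_{2K}(t) > 0 for every t > 0 and I_{2K}(t)/t → 1 as t ↓ 0, independently of the sign of K.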

An estimate for Γ f and multivariate calculus for Γ 2
Let us introduce the space The following Lemma provides a further crucial regularity property for Γ f when f ∈ D ∞ and shows how to define a measure-valued Γ ⋆ 2 (f ) operator.

Lemma 3.2 Let E be a strongly local and quasi-regular Dirichlet form. If BE(K, ∞) holds then for every
and Moreover, D ∞ is an algebra (closed w.r.t. pointwise multiplication) and if f = (f i ) n i=1 ∈ (D ∞ ) n then Φ(f ) ∈ D ∞ for every smooth function Φ : R n → R with Φ(0) = 0.
Proof. Let us first notice that for every f ∈ G ∞ we have Γ f ∈ L 1 (X, m) ∩ L ∞ (X, m) ⊂ L p (X, m) for every p ∈ [1, ∞].
For every f ∈ D ∞ we denote by Γ ⋆ 2 (f ) the finite Borel measure By Lemma 2.6, Γ ⋆ 2 (f ) has finite total variation, since The measure Γ ⋆ 2 (u) vanishes on sets of 0 capacity. We denote by γ 2 (u) ∈ L 1 (X, m) its density with respect to m: The main point is that Γ ⋆ 2 (·) can have a singular part Γ ⊥ 2 (·) w.r.t. m, but this is nonnegative and it does not affect many crucial inequalities.
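The reason why the singular part is harmless can be made explicit through the Lebesgue decomposition. The following sketch uses only the notation introduced above, with ϕ̃ denoting the quasi-continuous representative of a nonnegative test function ϕ.

```latex
% Lebesgue decomposition of the measure-valued \Gamma_2 and its consequence.
\begin{align*}
  \Gamma_2^\star(f) &= \gamma_2(f)\, m + \Gamma_2^\perp(f),
     \qquad \Gamma_2^\perp(f) \ge 0, \quad \Gamma_2^\perp(f) \perp m, \\
  \int_X \tilde\varphi \, d\Gamma_2^\star(f)
    &= \int_X \varphi\, \gamma_2(f)\, dm + \int_X \tilde\varphi \, d\Gamma_2^\perp(f)
     \;\ge\; \int_X \varphi\, \gamma_2(f)\, dm
     \qquad \text{for every } \varphi \ge 0 .
\end{align*}
```

Thus any inequality that bounds the measure Γ_2^⋆(f) from above by an absolutely continuous measure yields the same pointwise bound for the density γ_2(f), and any m-a.e. lower bound for γ_2(f) is inherited by the measure.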
According to (3.1) we also set for f, g ∈ D ∞ Γ ⋆ 2 (f, g) := and similarly The next lemma extends to the present nonsmooth setting the multivariate calculus for Γ 2 of [7,8].
Proof. The fact that Φ(f ) ∈ D ∞ has been proved in Lemma 3.2. In the following we will assume that the indices i, j, h, k run from 1 to n and we will use Einstein summation convention.
We set g ij : we will also consider the quasi-continuous representative.

By (2.3) and Lemma 3.2 we have
Since φ i φ j ∈ D ∞ by L.6 and g ij ∈ M ∞ by Lemma 3.2, we can apply (2.21) obtaining where we used g ij = g ji , On the other hand where we changed k with j in the first term and h with j in the last one. We end up with that gives (3.12).
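For comparison, the classical change of variables formula for Γ_2 in the smooth diffusion setting has the following shape. It is stated here as a sketch under smoothness assumptions, with the Hessian-type trilinear form H written out in terms of Γ; indices are summed over 1, …, n and Φ_i, Φ_{ij} denote partial derivatives of Φ evaluated at f.

```latex
% Classical multivariate formula for \Gamma_2 (assumption: smooth setting,
% L a diffusion operator; summation over repeated indices).
\begin{align*}
  \mathrm H[f](g,h) &:= \tfrac12\Big( \Gamma\big(g,\Gamma(f,h)\big)
       + \Gamma\big(h,\Gamma(f,g)\big) - \Gamma\big(f,\Gamma(g,h)\big) \Big), \\
  \Gamma_2\big(\Phi(\mathbf f)\big)
    &= \Phi_i \Phi_j \,\Gamma_2(f_i,f_j)
     + 2\,\Phi_i \Phi_{jk}\, \mathrm H[f_i](f_j,f_k)
     + \Phi_{ik}\Phi_{jh}\, \Gamma(f_i,f_j)\,\Gamma(f_k,f_h).
\end{align*}
% Check for n = 1: \mathrm H[f](f,f) = \tfrac12 \Gamma(f,\Gamma(f)), so that
% \Gamma_2(\Phi(f)) = (\Phi')^2 \Gamma_2(f) + \Phi'\Phi''\,\Gamma(f,\Gamma(f))
%                     + (\Phi'')^2 \Gamma(f)^2 .
```

On a Riemannian manifold H[f](g, h) is the Hessian of f evaluated on the gradients of g and h, and the three terms correspond to the expansion of the squared Hessian norm of Φ(f) together with the Ricci term.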
It could be useful to remember that in the smooth context of a Riemannian manifold (M n , g) as for (1.9) we have [8,Page 96]

A pointwise estimate for Γ Γ f
Applying the previous results and adapting the ideas of [7] we can now state our first fundamental estimates.
Theorem 3.4 Let E be a strongly local and quasi-regular Dirichlet form. If (BE(K, ∞)) holds then for every f, g, h ∈ D ∞ (so that Γ f , Γ g , Γ h ∈ V ∞ ) we have (all the inequalities are to be intended m-a.e. in X) We choose the polynomial Φ : (3.18) keeping the same notation of Lemma 3.3 we have (3,2)}.
If f ∈ D_∞ Lemma 3.2 yields Φ(f) ∈ D_∞ and we can then apply the inequality (3.9) obtaining (3.19) where both sides of the inequality depend on λ, a, b ∈ R. Evaluating γ_2(Φ(f)) by (3.14), and choosing a countable dense set Q of the parameters (λ, a, b) in R^3, for m-almost every x ∈ X the previous inequality holds for every (λ, a, b) ∈ Q. Since the dependence of the left and right side of the inequality w.r.t. λ, a, b is continuous, we conclude that for m-almost every x ∈ X the inequality holds for every (λ, a, b) ∈ R^3. Apart from an m-negligible set, for every x we can then choose a : Since λ is arbitrary and we eventually obtain which provides (3.15). (3.16) then follows by first noticing that so that We argue now by approximation, fixing f, g ∈ D_∞ and approximating an arbitrary h ∈ V_∞ with a sequence h_n ∈ D_∞ (e.g. by (2.13)) converging to h in energy with we have thanks to (3.17).
Since G is continuous, we obtain G ε (0)e 2αKt ≤ G ε (t) which yields, after passing to the limit as ε ↓ 0 e 2αKt Since D ∞ is dense in V we can extend (3.26) to arbitrary f ∈ V and then obtain (3.23), since ζ is arbitrary.

RCD(K, ∞)-metric measure spaces
In this section we will apply the previous result to prove new contraction properties w.r.t. transport costs (in particular W p Wasserstein distance) for the heat flow in RCD(K, ∞) metric measure spaces.

Basic notions
Metric measure spaces, transport and Wasserstein distances, entropy. We will quickly recall a few basic facts concerning optimal transport of probability measures, also to fix notation; we refer to [2,36] for more details. Let (X, d) be a complete and separable metric space endowed with a Borel measure m satisfying supp(m) = X, m(B_r(x̄)) ≤ c_1 exp(c_2 r^2) for every r > 0, (m-exp) for some constants c_1, c_2 ≥ 0 and a point x̄ ∈ X.
Recall that for every Borel probability measure µ ∈ P(Y ) in a separable metric space Y and every Borel map r : Y → X, the push-forward r ♯ µ ∈ P(X) is defined by r ♯ µ(B) = µ(r −1 (B)) for every B ∈ B(X). If µ i ∈ P(X), i = 1, 2, we denote by Π(µ 1 , µ 2 ) the collection of all couplings µ between µ 1 and µ 2 , i.e. measures in P(X ×X) whose marginals π i ♯ µ coincide with µ i (here π i (x 1 , x 2 ) = x i ). Given a nondecreasing continuous function h : [0, ∞) → [0, ∞), we consider the transport cost where we implicitly assume that the minimum is +∞ if couplings with finite cost do not exist. In the particular case h(r) := r p we set and we also set Denoting by P p (X) the space of Borel probability measures with finite p-th moment, i.e.
µ ∈ P_p(X) ⇐⇒ ∫_X d^p(x, x̄) dµ(x) < ∞ for some (and thus any) x̄ ∈ X, (4.4) (P_p(X), W_p) is a complete and separable metric space. The relative entropy of a measure µ ∈ P_2(X) is defined as The entropy functional is well defined and lower semicontinuous w.r.t. W_2 convergence (see e.g. [3, §7.1]).
The Cheeger energy and its L^2-gradient flow. We first recall that the metric slope of a Lipschitz function f : X → R is defined by The Cheeger energy [15,3] is obtained as the L^2-lower semicontinuous envelope of the functional f ↦ (1/2) ∫_X |Df|^2 dm: If Ch(f) < ∞ it is possible to show that the collection admits a unique element of minimal norm, the minimal weak upper gradient |Df|_w, which is also minimal with respect to the order structure [3, §4], i.e.
[EVI_K]: For every µ̄ ∈ P_2(X) there exists a curve (µ_t)_{t≥0} ⊂ D(Ent_m) such that lim_{t↓0} µ_t = µ̄ and d dt + the subdifferential ∂Ch is single-valued and coincides with the linear generator L, (h_t)_{t≥0} = (P_t)_{t≥0}, and for every µ̄ = f m ∈ P_2(X) with f ∈ L^2(X, m) the curve µ_t = h_t f m is the unique solution of (4.15). Finally, any essentially bounded function f ∈ D(Ch) with |Df|_w ≤ L admits an L-Lipschitz representative f̃, and for every f ∈ Lip_b(X), g ∈ C_b(X) we have Proof. The equivalence (I)⇔(IV) has been proved in [4,Thm. 5.1] in the case when m ∈ P_2(X) and extended to the general case by [1]. (IV)⇒(II),(III) has been proved in [ When m(X) < ∞ we can always choose X_n := X and QR.1′ reduces to the tightness property (2.9), which has been proved in [4,Lemma 6.7], following an argument of [28,Proposition IV.4.2]. In the general case we can adapt the same argument: we recall here the various steps for the reader's convenience.
Let us fix a point x̄ ∈ X and let us set X_n := B_n(x̄). In order to prove that (X_n)_{n∈N} is an E-nest, we introduce the 1-Lipschitz cut-off functions ψ_n : X → [0, 1] For every f ∈ V we can consider the approximations f_n := ψ_n f in V_{X_n}. Lebesgue's Dominated Convergence Theorem shows that f_n → f strongly in L^2(X, m) as n → ∞. The Leibnitz rule yields so that lim_{n→∞} E(f − f_n) = 0 as well. This shows that f_n → f strongly in V and (X_n)_{n∈N} is an E-nest.
In order to prove QR.1′, we fix n ∈ N, we consider a dense sequence (x_j)_{j∈N} in X_{n+1}, and we define the functions w_k : X → [0, 1] It is easy to check that the w_k are 1-Lipschitz and pointwise nonincreasing, they satisfy 0 ≤ w_k ≤ ψ_{n+1} ≤ 1 and the pointwise limit w_k ↓ 0 as k → ∞, so that w_k → 0 strongly in L^2(X, m) since supp(w_k) ⊂ X_{n+1} and m(X_{n+1}) < ∞. The finiteness of m(X_{n+1}) also yields that (w_k)_{k∈N} is bounded in V, so that w_k ⇀ 0 weakly in V as k → ∞.
The Banach-Saks theorem ensures the existence of an increasing subsequence (k_h)_{h∈N} such that the Cesàro means v_h := (1/h) Σ_{i=1}^h w_{k_i} converge to 0 strongly in V. This implies [19,Thm. 1.3.3] that a subsequence (v_{h(l)}) of (v_h) converges to 0 quasi-uniformly, i.e. for all integers m ≥ 1 there exists a closed set G_m ⊂ X such that Cap(X_{n+1} \ G_m) < 1/m and v_{h(l)} → 0 uniformly on G_m. As w_{k_{h(l)}} ≤ v_{h(l)}, if we set F_m = ∪_{i≤m} G_i, we have that w_{k_{h(l)}} → 0 as l → ∞ uniformly on F_m for all m and Cap(X_{n+1} \ F_m) ≤ 1/m. Therefore, for every δ > 0 we can find an integer p ∈ N such that w_p < δ on F_m; since ψ_{n+1}(x) ≡ 1 when x ∈ X_n, the definition of w_p implies ∀ x ∈ X_n ∩ F_m ∃ j ∈ N, j ≤ p : d(x, x_j) < δ, i.e. X_n ∩ F_m ⊂ ⋃_{j=1}^p B(x_j, δ).
Notice that σ ∈ Γ(H_t µ, H_t ν) since e.g. for every ϕ ∈ C_b(X) we have and a similar computation holds integrating functions depending only on y. Therefore, since (4.22) yields d(x, y) ≤ e^{−Kt} d(u, v) for γ_{u,v}-a.e. (x, y) ∈ X × X, (4.25) iii) follows immediately from (4.23) by choosing h(r) := r^p so that h_{Kt}(r) = e^{pKt} r^p.
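The way the pointwise bound (4.25) produces the W_p contraction can be sketched as follows. The display assumes that γ is an optimal coupling for W_p(µ, ν), that γ_{u,v} is a coupling between the fundamental solutions H_t δ_u and H_t δ_v satisfying (4.25), and that σ is obtained by superposing the γ_{u,v} with respect to γ; these identifications are assumptions made for illustration.

```latex
% From the pointwise coupling bound to W_p contraction (assumptions:
% \gamma optimal for W_p(\mu,\nu), \sigma := \int \gamma_{u,v}\, d\gamma(u,v),
% p \in [1,\infty)).
\begin{align*}
  W_p^p(\mathsf H_t\mu,\mathsf H_t\nu)
   &\le \int_{X\times X} \mathsf d^p(x,y)\, d\sigma(x,y)
    = \int_{X\times X} \Big( \int_{X\times X} \mathsf d^p(x,y)\, d\gamma_{u,v}(x,y) \Big)\, d\gamma(u,v) \\
   &\le e^{-pKt} \int_{X\times X} \mathsf d^p(u,v)\, d\gamma(u,v)
    = e^{-pKt}\, W_p^p(\mu,\nu).
\end{align*}
```

Taking p-th roots gives W_p(H_t µ, H_t ν) ≤ e^{−Kt} W_p(µ, ν), and since the constant does not depend on p, letting p → ∞ formally suggests the corresponding bound for W_∞.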