A sharp symmetrized form of Talagrand's transport-entropy inequality for the Gaussian measure

This note presents a sharp transport-entropy inequality that improves on Talagrand's inequality for the Gaussian measure, arising as a dual formulation of the functional Santaló inequality. We also discuss some extensions and connections with concentration of measure.

Theorem 1.1. Let $\mu$ be a centered probability measure on $\mathbb{R}^d$, and let $\nu$ be any probability measure on $\mathbb{R}^d$. Then
\[ W_2(\mu,\nu)^2 \le 2\,\mathrm{Ent}_\gamma(\mu) + 2\,\mathrm{Ent}_\gamma(\nu). \]
Moreover, equality holds if and only if there exists a symmetric positive definite matrix $A$ such that $\mu$ is a non-degenerate centered Gaussian measure with covariance $A$ and $\nu$ is a Gaussian measure with covariance $A^{-1}$ (not necessarily centered).

We can then recover Talagrand's inequality by taking $\mu = \gamma$. The assumption that one of the two measures must be centered cannot be removed in general: if we take $\mu$ and $\nu$ both as non-centered standard Gaussian measures, with respective barycenters $m_1$ and $m_2$, the inequality would become $|m_1 - m_2|^2 \le |m_1|^2 + |m_2|^2$, which does not hold for all choices of $m_1$ and $m_2$.
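The failure of the inequality for two non-centered Gaussians can be checked by hand. The sketch below assumes the standard closed-form expressions for translates of $\gamma$, and writes $\gamma_m$ for the standard Gaussian translated by $m$ (a notation introduced here only for illustration).

```latex
% \gamma_m denotes the law of X + m with X \sim \gamma (illustrative notation).
% Its density with respect to \gamma is \exp(m \cdot x - |m|^2/2).
\begin{align*}
\mathrm{Ent}_\gamma(\gamma_m)
  &= \int \log\frac{d\gamma_m}{d\gamma}\,d\gamma_m
   = \int \Big(m\cdot x - \tfrac12|m|^2\Big)\,d\gamma_m(x)
   = \tfrac12|m|^2, \\
W_2(\gamma_{m_1},\gamma_{m_2})^2 &= |m_1-m_2|^2 .
\end{align*}
% Plugging these values into W_2^2 \le 2 Ent + 2 Ent:
\[
|m_1-m_2|^2 \le |m_1|^2+|m_2|^2
\iff m_1\cdot m_2 \ge 0 ,
\]
% which fails, for instance, for m_2 = -m_1 \neq 0.
```

The inequality therefore fails as soon as the two barycenters form an obtuse angle, and it holds trivially when one of them is zero.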

Remark 1.2.
It is easy to check using the triangle inequality for $W_2$ that Talagrand's inequality implies $W_2(\mu,\nu)^2 \le 4\,\mathrm{Ent}_\gamma(\mu) + 4\,\mathrm{Ent}_\gamma(\nu)$. The point here is that we can improve the prefactor in such a way that the statement becomes strictly stronger than Talagrand's inequality. This may be useful for applications where sharp estimates are desirable. For example, we will not lose a factor 2 when deriving concentration inequalities from the functional inequality. See also [21] for applications of symmetrized transport-entropy inequalities to concentration of measure.
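For reference, the triangle-inequality argument with the non-sharp prefactor 4 can be written out as follows. This is a short sketch that only uses Talagrand's inequality in the form $W_2(\cdot,\gamma)^2 \le 2\,\mathrm{Ent}_\gamma(\cdot)$.

```latex
\begin{align*}
W_2(\mu,\nu)^2
 &\le \big(W_2(\mu,\gamma)+W_2(\gamma,\nu)\big)^2
   && \text{(triangle inequality)}\\
 &\le 2\,W_2(\mu,\gamma)^2+2\,W_2(\gamma,\nu)^2
   && \text{(since } (a+b)^2\le 2a^2+2b^2\text{)}\\
 &\le 4\,\mathrm{Ent}_\gamma(\mu)+4\,\mathrm{Ent}_\gamma(\nu)
   && \text{(Talagrand's inequality, applied twice).}
\end{align*}
```

No centering assumption is needed for this weaker bound, which is consistent with the fact that centering only matters for the sharp prefactor.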
It turns out that this inequality is dual to the functional Santaló inequality [1,26]. This connection was pointed out to us by N. Gozlan.

Theorem 1.3 (Functional Santaló inequality). Let $f$ and $g$ be measurable functions on $\mathbb{R}^d$ satisfying $f(x) + g(y) \le -x\cdot y$ for all $x, y \in \mathbb{R}^d$. If $e^f$ (or $e^g$) has its barycenter at zero, then
\[ \int e^{f}\,dx \int e^{g}\,dx \le (2\pi)^d. \tag{1.1} \]
This statement is due to Lehec [26], and improves on previous results of Ball [4] (for even functions) and Artstein, Klartag and Milman [1] (where the function with barycenter at zero was assumed to be concave). It is a functional generalization of a result of Santaló on volumes of convex bodies [29]. See also [19] for related results. Duality between transport-entropy inequalities and integral bounds goes back to [7], which gave a dual formulation of classical transport-entropy inequalities.
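As a sanity check on the constant $(2\pi)^d$, one can test the functional Santaló inequality on the Gaussian pair. The choice of $f$ and $g$ below is an illustrative example, not taken from the references.

```latex
% Illustrative choice: f(x) = -|x|^2/2 and g(y) = -|y|^2/2.
\begin{align*}
f(x)+g(y) + x\cdot y
  &= -\tfrac12|x|^2-\tfrac12|y|^2 + x\cdot y
   = -\tfrac12|x-y|^2 \le 0, \\
\int x\,e^{f(x)}\,dx &= 0
  \qquad \text{(the integrand is odd)},\\
\int e^{f}\,dx\int e^{g}\,dx
  &= (2\pi)^{d/2}\,(2\pi)^{d/2} = (2\pi)^d .
\end{align*}
```

The constraint and the barycenter condition are both satisfied, and the inequality holds with equality for this pair.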
The simplest way to prove Theorem 1.1 is to derive it from the functional Santaló inequality by a duality argument. We shall nonetheless give an alternate proof, which still relies on results derived using the functional Santaló inequality. Despite not being the simplest proof, we believe it is of some interest, as it highlights the connection to optimal transport and calculus of variations.
As was pointed out by Klartag [22] (and discovered independently by Barthe and Cordero-Erausquin), such inequalities can be pushed to uniformly log-concave measures using the Caffarelli contraction theorem, leading to the following variant of Theorem 1.1. This statement was pointed out to us by Dario Cordero-Erausquin.

Theorem 1.5. Let $\theta = e^{-V}dx$ be a symmetric, uniformly log-concave probability measure, that is, the potential $V$ is smooth and satisfies $\mathrm{Hess}\,V \ge \alpha\,\mathrm{Id}$ for some $\alpha > 0$. Then for any symmetric probability measure $\mu$ and any other probability measure $\nu$, we have
\[ W_2(\mu,\nu)^2 \le \frac{2}{\alpha}\big(\mathrm{Ent}_\theta(\mu) + \mathrm{Ent}_\theta(\nu)\big). \]

The symmetry assumptions on $\mu$ and $\theta$ could be relaxed, and we would then have to assume instead that $\int T\,d\mu = 0$, where $T$ is the optimal transport map sending $\theta$ onto $\gamma$.

Section 2 will contain the proofs of the results we just described. Section 3 will present a reverse form of the improved Talagrand inequality, under some convexity and symmetry assumptions on the measures. Finally, Section 4 will describe a concentration estimate that will be easily deduced from Theorem 1.1, in connection with Maurey's property ($\tau$) [28].

ECP 23 (2018), paper 81.

Proof of Theorem 1.1
The proof uses the following two statements. The first is a result of Santambrogio [30].

Theorem 2.1. Let $\mu$ be a centered probability measure that is not supported on a hyperplane. Then there exists an essentially continuous convex function $\varphi$, unique up to translations, such that $\rho = e^{-\varphi}dx$ is a probability measure on $\mathbb{R}^d$ whose pushforward by the map $\nabla\varphi$ is $\mu$. Moreover, it satisfies
\[ \mathrm{Ent}_\gamma(\rho) - \frac12 W_2(\rho,\mu)^2 = \inf_{\nu}\Big\{\mathrm{Ent}_\gamma(\nu) - \frac12 W_2(\nu,\mu)^2\Big\}, \]
that is, $\rho$ minimizes the functional $\nu \mapsto \mathrm{Ent}_\gamma(\nu) - \frac12 W_2(\nu,\mu)^2$.

The first part of this statement was first proved by Cordero-Erausquin and Klartag [14], and is the main focus of [30]. The second part is a byproduct of the method Santambrogio used, but turns out to be useful for our purpose. Note that this result contains the Talagrand inequality, which is obtained when taking $\mu$ to be the Gaussian measure. We refer to [14] for a definition of essential continuity.
When $\rho$ is the minimizer, the above quantity can be rewritten as
\[ -\frac12\int |x - \nabla\varphi|^2\,d\rho + \mathrm{Ent}_\gamma(\rho). \]
The first term is minus one half of the Fisher information of $\rho$ relative to the Gaussian, so this quantity can be identified as the negative of the deficit in the classical Gaussian logarithmic Sobolev inequality for $\rho$. This is a first hint that this result may be useful to study improvements to functional inequalities. Connections between deficit estimates for Gaussian functional inequalities and moment maps have also been investigated in [24].

The second tool we shall use is a reverse form of the Gaussian logarithmic Sobolev inequality, proven in [2] under some regularity assumptions. In [15], an alternative, simpler proof was established, which removed those extra regularity assumptions. We point out that the simpler proof of [15] uses the functional Santaló inequality of [1]. To state the reverse LSI, we first define the Shannon entropy of a probability measure $\rho = e^{-\varphi}dx$, given by $S(\rho) := \int \varphi\,d\rho$. We then have the following inequality:

Theorem 2.2. Take $\rho = e^{-\varphi}dx$ a probability measure, and assume it is log-concave. Then
\[ S(\gamma) - S(\rho) \ge \frac12\int \log\det\nabla^2\varphi\,d\rho. \]
We can now give the proof of Theorem 1.1. We can assume without loss of generality that $\mu$ is absolutely continuous with respect to the Lebesgue measure. Let $f$ be the density of $\mu$ with respect to the Lebesgue measure, and consider the convex function $\varphi$ given by Theorem 2.1. It satisfies almost everywhere the Monge-Ampère PDE
\[ e^{-\varphi} = f(\nabla\varphi)\,\det\nabla^2\varphi, \]
which is equivalent to
\[ \log f(\nabla\varphi) = -\varphi - \log\det\nabla^2\varphi. \]

Here we have used the inequality $\int x\cdot\nabla\varphi\,d\rho \le d$. If $\varphi$ were smooth, this would be an equality, immediately justified by an integration by parts. Since $\varphi$ is not necessarily very smooth, we cannot do this, but [14] justified that when $\varphi$ is essentially continuous (which they proved is the case here) the inequality is still true despite the lack of regularity. Equivalently,
\[ \mathrm{Ent}_\gamma(\rho) - \frac12 W_2(\rho,\mu)^2 \ge -\mathrm{Ent}_\gamma(\mu). \]
But since Theorem 2.1 states that $\rho$ is a minimizer of $\nu \mapsto \mathrm{Ent}_\gamma(\nu) - \frac12 W_2(\nu,\mu)^2$, this implies that for any probability measure $\nu$ with finite first moment
\[ \mathrm{Ent}_\gamma(\nu) - \frac12 W_2(\nu,\mu)^2 \ge -\mathrm{Ent}_\gamma(\mu), \]
which after rearranging the terms is the statement we were aiming to prove.
Moreover, for equality to hold, it must also hold in the inequality $S(\gamma) - S(\rho) \ge \frac12\int \log\det\nabla^2\varphi\,d\rho$. Since equality there is known to hold only when $\rho$ is Gaussian with some positive definite covariance matrix $A$, and $\mu$ is the pushforward of $\rho$ by $\nabla\varphi$, the measure $\mu$ is then also Gaussian, and its covariance matrix is $A^{-1}$. A standard computation confirms that equality indeed holds in such a situation.
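The standard computation behind the equality case can be sketched as follows. It assumes the usual closed-form expressions for the relative entropy of a Gaussian with respect to $\gamma$ and for the $W_2$ distance between two Gaussians. The notation $N(m,\Sigma)$ for a Gaussian with mean $m$ and covariance $\Sigma$ is introduced here for illustration.

```latex
% Closed forms used (standard facts, stated here as assumptions):
%   Ent_\gamma(N(m,\Sigma)) = (1/2) ( tr \Sigma + |m|^2 - d - \log\det\Sigma ),
%   W_2(N(0,A), N(m,B))^2  = |m|^2 + tr( A + B - 2 (A^{1/2} B A^{1/2})^{1/2} ).
\begin{align*}
W_2\big(N(0,A),N(m,A^{-1})\big)^2
  &= |m|^2 + \operatorname{tr}A + \operatorname{tr}A^{-1} - 2d
  && \text{since } A^{1/2}A^{-1}A^{1/2} = \mathrm{Id}, \\
2\,\mathrm{Ent}_\gamma\big(N(0,A)\big)
  &= \operatorname{tr}A - d - \log\det A, \\
2\,\mathrm{Ent}_\gamma\big(N(m,A^{-1})\big)
  &= \operatorname{tr}A^{-1} + |m|^2 - d + \log\det A .
\end{align*}
% The two log-determinants cancel, so the sum of the last two lines
% equals the first line: equality holds in W_2^2 \le 2 Ent + 2 Ent.
```

The cancellation of the log-determinant terms is what singles out the pairing of covariances $A$ and $A^{-1}$.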

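Expanding the squared distance in $W_2(\mu,\nu)^2$ and absorbing the second moments into the entropies, the inequality of Theorem 1.1 can be rewritten as
\[ \inf_\pi \int -x\cdot y\,d\pi \le \mathrm{Ent}_{dx}(\mu) + \mathrm{Ent}_{dx}(\nu) + d\log(2\pi), \tag{2.1} \]
where the infimum runs over all couplings $\pi$ of $\mu$ and $\nu$, and $\mathrm{Ent}_{dx}$ denotes the entropy relative to the Lebesgue measure. The left-hand side of this inequality is still a transport cost, but with cost $-x\cdot y$ instead of $|x - y|^2$.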
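The passage from the quadratic cost to the cost $-x\cdot y$ can be made explicit. The sketch below assumes that $\mu$ and $\nu$ have finite second moments, writes $\mathrm{Ent}_{dx}$ for the entropy relative to the Lebesgue measure, and lets $\pi$ range over couplings of $\mu$ and $\nu$.

```latex
\begin{align*}
\tfrac12 W_2(\mu,\nu)^2
  &= \tfrac12\int|x|^2\,d\mu + \tfrac12\int|y|^2\,d\nu
     + \inf_\pi\int -x\cdot y\,d\pi, \\
\mathrm{Ent}_\gamma(\mu)
  &= \mathrm{Ent}_{dx}(\mu) + \tfrac12\int|x|^2\,d\mu + \tfrac d2\log(2\pi),
\end{align*}
% and similarly for \nu. Subtracting the second moments on both sides,
\[
\tfrac12 W_2(\mu,\nu)^2 \le \mathrm{Ent}_\gamma(\mu)+\mathrm{Ent}_\gamma(\nu)
\iff
\inf_\pi\int -x\cdot y\,d\pi
  \le \mathrm{Ent}_{dx}(\mu)+\mathrm{Ent}_{dx}(\nu)+d\log(2\pi).
\]
```

The second moments appear with the same coefficient on both sides, which is why the Gaussian reference measure can be traded for the Lebesgue measure at the price of the constant $d\log(2\pi)$.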
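The equivalence between the improved Talagrand inequality and the functional Santaló inequality is a consequence of the dual formulations of transport cost and entropy:
\[ \inf_\pi \int -x\cdot y\,d\pi = \sup_{f(x)+g(y)\le -x\cdot y}\ \int f\,d\mu + \int g\,d\nu, \tag{2.2} \]
\[ \mathrm{Ent}_{dx}(\mu) = \sup_{f}\ \int f\,d\mu - \log\int e^{f}\,dx. \tag{2.3} \]
The first identity is the Kantorovitch dual formulation of the optimal transport problem with cost $-x\cdot y$ (see for example [32]), while the second identity is the classical reformulation of entropy as the Legendre transform of the log-Laplace functional.

Let us first prove that the symmetrized Talagrand inequality implies the functional Santaló inequality. Take $f$ and $g$ such that $f(x) + g(y) \le -x\cdot y$ for all $x, y$, and such that $\int x\,e^{f}\,dx = 0$. Let $\mu$ and $\nu$ be the probability measures with densities proportional to $e^{f}$ and $e^{g}$ respectively. Since $\mu$ is centered, we can apply (2.1), which after taking into account (2.2) becomes
\[ \int f\,d\mu + \int g\,d\nu \le \int f\,d\mu - \log\int e^{f}\,dx + \int g\,d\nu - \log\int e^{g}\,dx + d\log(2\pi). \tag{2.4} \]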
This is easily seen to be the same thing as (1.1), after removing the terms appearing on both sides and taking the exponential.
For the converse, fix $\mu$ a centered probability measure and $\nu$ any probability measure, and consider $f$ and $g$ satisfying $f(x) + g(y) \le -x\cdot y$. There exists $\lambda \in \mathbb{R}^d$ such that $\int x\,e^{f+\lambda\cdot x}\,dx = 0$. Indeed, the condition on $f$ and $g$ implies that $f$ decays to $-\infty$ at infinity faster than any linear function, so this quantity is well defined for all $\lambda$. It is a smooth monotone function of $\lambda$, so its range is convex, and any coordinate is unbounded, so its range is the whole space. Let $\tilde f(x) = f(x) + \lambda\cdot x$ and $\tilde g(y) = g(y + \lambda)$. Then $\tilde f(x) + \tilde g(y) \le -x\cdot y$ and $\int x\,e^{\tilde f}\,dx = 0$. Applying the Santaló inequality, we get
\[ \log\int e^{\tilde f}\,dx + \log\int e^{\tilde g}\,dx \le d\log(2\pi). \]
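Since $\int \tilde f\,d\mu = \int f\,d\mu$ because $\mu$ is centered, and since $\int e^{g}\,dx = \int e^{\tilde g}\,dx$, the dual formulation of the entropy gives
\[ \int f\,d\mu + \int g\,d\nu \le \mathrm{Ent}_{dx}(\mu) + \mathrm{Ent}_{dx}(\nu) + d\log(2\pi), \]
and taking the supremum over all $f$ and $g$ yields (2.1), which concludes the proof.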
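The three properties of the shifted functions used in this argument can be verified directly. In the sketch below, $\tilde f(x) = f(x)+\lambda\cdot x$ and $\tilde g(y) = g(y+\lambda)$ denote the shifted functions of the proof, and $\mu$ is centered.

```latex
\begin{align*}
\tilde f(x)+\tilde g(y)
  &= f(x) + g(y+\lambda) + \lambda\cdot x
   \le -x\cdot(y+\lambda) + \lambda\cdot x = -x\cdot y, \\
\int e^{\tilde g(y)}\,dy
  &= \int e^{g(y+\lambda)}\,dy = \int e^{g(y)}\,dy
   && \text{(translation invariance of } dy\text{)}, \\
\int \tilde f\,d\mu
  &= \int f\,d\mu + \lambda\cdot\int x\,d\mu = \int f\,d\mu
   && \text{(}\mu\text{ is centered)}.
\end{align*}
```

The linear tilt of $f$ is thus compensated exactly by the translation of $g$, which is what allows the barycenter condition of the Santaló inequality to be enforced at no cost.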

Proof of Theorem 1.5
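The argument is the same as in [22]. We use the Caffarelli contraction theorem [9], which states that there exists an $\alpha^{-1/2}$-Lipschitz map $T$ sending $\gamma$ onto $\theta$. The map $T$ is invertible, and we can consider its inverse $T^{-1}$ (which is the optimal transport map sending $\theta$ onto $\gamma$). Let $\tilde\mu$ (resp. $\tilde\nu$) be the image of $\mu$ (resp. $\nu$) by $T^{-1}$. By symmetry of $\theta$, $T$ is a symmetric function, and hence $\tilde\mu$ is still a centered measure. We then have
\[ W_2(\mu,\nu)^2 \le \frac{1}{\alpha} W_2(\tilde\mu,\tilde\nu)^2 \le \frac{2}{\alpha}\big(\mathrm{Ent}_\gamma(\tilde\mu) + \mathrm{Ent}_\gamma(\tilde\nu)\big) = \frac{2}{\alpha}\big(\mathrm{Ent}_\theta(\mu) + \mathrm{Ent}_\theta(\nu)\big), \]
which completes the proof. We could remove the symmetry assumption on $\theta$ and $\mu$ by requiring instead that the image of $\mu$ by $T^{-1}$ is centered.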
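Two elementary facts drive this contraction argument: a Lipschitz map contracts $W_2$, and relative entropy is unchanged when both measures are pushed forward by the same bijection. A sketch of both, with $\tilde\pi$ denoting an arbitrary coupling of $\tilde\mu$ and $\tilde\nu$ (notation introduced here for illustration):

```latex
% (i) (T \times T)_\# \tilde\pi is a coupling of \mu and \nu, hence
\begin{align*}
W_2(\mu,\nu)^2
  \le \int |T(x)-T(y)|^2\,d\tilde\pi(x,y)
  \le \frac1\alpha\int |x-y|^2\,d\tilde\pi(x,y),
\end{align*}
% and taking the infimum over \tilde\pi gives
% W_2(\mu,\nu)^2 \le \alpha^{-1} W_2(\tilde\mu,\tilde\nu)^2.
%
% (ii) Since T is a bijection with T_\#\gamma = \theta and T_\#\tilde\mu = \mu,
% the densities satisfy (d\mu/d\theta) \circ T = d\tilde\mu/d\gamma, so
\begin{align*}
\mathrm{Ent}_\theta(\mu)
  = \int \log\frac{d\mu}{d\theta}\,d\mu
  = \int \log\Big(\frac{d\mu}{d\theta}\circ T\Big)\,d\tilde\mu
  = \mathrm{Ent}_\gamma(\tilde\mu).
\end{align*}
```

Combining (i) and (ii) with Theorem 1.1 applied to the pair $(\tilde\mu,\tilde\nu)$ produces the factor $1/\alpha$ in front of the entropies.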

A remark on stability
Given a functional inequality $F(f) \le G(f)$ with sharp constants for which all equality cases are known, a natural question is to determine whether one can prove an improvement of the form $F(f) + d(f, E)^{\alpha} \le G(f)$, where $E$ is the set of functions for which equality holds, and $d$ is some suitable distance on the space of functions or measures considered.
This problem has recently been studied for several Gaussian functional inequalities, such as the isoperimetric problem [16,5], the logarithmic Sobolev inequality [12,17,8] and Talagrand's inequality [17,13]. In particular, [24] investigated applications of the moment map problem to such deficit estimates. In [10], a deficit estimate for the inverse logarithmic Sobolev inequality was established. It takes the following form:
where $\epsilon_0$, $R$, $\eta$ depend on $d$ and $R(\epsilon) \to +\infty$ as $\epsilon$ goes to zero. This result was established using a stability estimate for the functional Santaló inequality that was obtained in [6]. In high-dimensional situations, this is not a very good estimate, and it is an open problem whether the exponent $1/(129d^2)$ can be replaced by a constant independent of the dimension.
Using this estimate to refine the proof of Theorem 1.1, we straightforwardly obtain

Theorem 2.4. Let $\mu$ be a centered measure, and assume that $\nu$ is a probability measure such that
for some $\epsilon < \epsilon_0$. Then there exist $c > 0$, a positive definite matrix $A$ and a point $x_0 \in \mathbb{R}^d$ such that the moment map $\varphi$ of $\mu$ satisfies
where $\epsilon_0$, $R$, $\eta$ depend on $d$ and $R(\epsilon) \to +\infty$ as $\epsilon$ goes to zero.

A reverse inequality
We shall use a reverse form of the Santaló inequality for unconditional measures to derive a reverse form of our transport-entropy inequality. It is not clear to us if this inequality has any application, but since reverse Santaló inequalities have attracted some attention, in relation to the Mahler conjecture, we felt this problem was natural, and the estimate worth writing down.
A function or probability measure is said to be unconditional if it is invariant under the reflection with respect to each coordinate hyperplane $\{x_i = 0\}$, $i \in \{1, \dots, d\}$. In [18], following earlier results in [23] (see also [19,3]), the following inverse functional Santaló inequality was established:

Theorem 3.1. Let $f$ be an unconditional convex function, and let $f^*$ be its Legendre transform $f^*(x) := \sup_y\, x\cdot y - f(y)$. Then
\[ \int e^{-f}\,dx \int e^{-f^*}\,dx \ge 4^d. \]

By duality, we can establish the following inequality:

Theorem 3.2. Let $\mu = e^{-f}dx$ be an unconditional log-concave measure, and let $\mu^* = e^{-f^*}dx$. Then
\[ \mathrm{Ent}_\gamma(\mu) + \mathrm{Ent}_\gamma(\mu^*) \le \frac12 W_2(\mu,\mu^*)^2 + \frac{d}{2}\log(\pi/2). \]

Proof. Let $\mu$ and $\nu$ be two unconditional log-concave measures, and denote by $\mathcal{F}_{uc}$ the set of all unconditional convex functions. We can localize the duality formulas
\[ \inf_\pi \int -x\cdot y\,d\pi = \sup_{f\in\mathcal{F}_{uc}} \int f\,d\mu + \int f^*\,d\nu; \]
For the second formula, this is trivial since we know the optimizer is the logarithm of the density. For the first one, this is a consequence of the convexity of the optimizer in Kantorovitch duality, and of the fact that the optimizer necessarily inherits symmetry properties shared by both measures. Using these formulas and following the same approach as in the proof of Theorem 1.1, we get
\[ \mathrm{Ent}_{dx}(\mu) + \mathrm{Ent}_{dx}(\mu^*) \le \inf_\pi \int -x\cdot y\,d\pi - d\log(4). \]
Adding second moments and a constant, we get
\[ \mathrm{Ent}_\gamma(\mu) + \mathrm{Ent}_\gamma(\mu^*) \le \frac12 W_2(\mu,\mu^*)^2 + \frac{d}{2}\log(\pi/2). \]

It is known that this kind of estimate can be obtained as a consequence of the Santaló inequality via property ($\tau$). If we use the classical property ($\tau$) of Maurey [28], we obtain this estimate with constant 1/4 instead of 1/2 in the exponent, but for general sets. The case of even sets with constant 1/2 is an immediate consequence of the symmetric property ($\tau$) obtained by Lehec [25]. See also the survey [20] for the relationship between property ($\tau$) and transport inequalities. The point here is that deducing this concentration bound from the improved Talagrand inequality is completely straightforward.