A Brascamp-Lieb type covariance estimate

In this article, we derive a new covariance estimate. The estimate has a structure similar to that of the Brascamp-Lieb inequality and is optimal for ferromagnetic Gaussian measures. It can be naturally applied to deduce decay of correlations of lattice systems of continuous spins. We also discuss the relation of the new estimate to known estimates such as a weighted estimate due to Helffer & Ledoux. The main ingredient of the proof of the new estimate is a directional Poincaré inequality, which seems to be unknown.


Introduction
The main goal of this article is to deduce a new covariance estimate for a certain class of Gibbs measures dµ := Z^{-1} exp(−H(x)) dx on a finite-dimensional Euclidean space X (see Section 2 and Theorem 2.3 below). Here and later on, Z denotes a generic normalization constant turning µ into a probability measure. The covariance estimate can be seen as an analogue of the Brascamp-Lieb inequality (BLI), which estimates variances. The BLI was originally introduced by Brascamp & Lieb in [BL76]:
Theorem 1.1 (Brascamp & Lieb). Let H : X → R be a smooth strictly convex function. Then for all smooth functions f
var_µ(f) ≤ ∫ ⟨∇f, (Hess H)^{-1} ∇f⟩ dµ. (1.1)
Compared with the BLI, the bound of the new covariance estimate is slightly weaker in the convex case. The covariance estimate of Theorem 2.3 can be interpreted in the following way: The correlations of a non-convex perturbed Gibbs measure are dominated by the correlations of a suitably chosen Gaussian measure with ferromagnetic interaction. The proof of Theorem 2.3 is given in Section 2 and is based on a new type of functional inequality, which we call directional Poincaré inequality (see Theorem 2.7 below). The proof of the directional Poincaré inequality (PI) is based on ideas that were outlined by Ledoux for the proof of the weighted covariance estimate (cf. [Led01] and Theorem 3.1).
The use of the new covariance estimate is illustrated in Section 3, where we show how the estimate can be used to deduce decay of correlations of certain lattice systems of continuous spins. We distinguish two cases: In Section 3.1 we consider exponential decay of correlations. We show that the new covariance estimate yields a well-known weighted covariance estimate due to Helffer (see Theorem 3.1, [Hel99, Section 4] or [Led01, Proposition 2.1 or 3.1]). This weighted covariance estimate is the central ingredient in a common method to deduce exponential decay of correlations for unbounded spin systems with a non-convex single-site potential and a weak finite-range interaction (see [Hel99, Theorem 2.1], [BH99, Theorem 1.1], [BH00, Theorem 3.1] or [Led01, Proposition 6.2]). Additionally, we show how Theorem 2.3 directly yields an exponential decay of correlations in this situation without relying on Theorem 3.1 (see Corollary 3.3 and Proposition 3.4).
In Section 3.2 we consider algebraic decay of correlations. Using the new Brascamp-Lieb type covariance estimate, we give a criterion to deduce algebraic decay of correlations of lattice systems of continuous spins (see Proposition 3.5).
The main result of this article (i.e. Theorem 2.3) was successfully applied in other articles of the author. Because there is a deep connection between decay of correlations and the validity of certain functional inequalities like the logarithmic Sobolev inequality (LSI) or the PI (see for example [Zeg90, Zeg96, Hel99, BH99, Yos99, Yos01] or [BH99] for an overview), it is not surprising that Theorem 2.3 is one of the key ingredients to derive the LSI for the canonical ensemble µ_{N,m} in the case of a weak two-body interaction [Men11b]. Additionally, Proposition 3.5 was used in [Men13] to refine the Otto-Reznikoff approach to the LSI.
We conclude the introduction by making a comment on the origin of the content of this article. Most of the material of this article is contained in the dissertation [Men11a] of the author but unpublished until now. The proof of the Brascamp-Lieb type covariance estimate of Theorem 2.3 emerged out of joint discussions with Felix Otto.
2. The Brascamp-Lieb type covariance estimate and its proof.
We consider a finite-dimensional Euclidean space X. Norms | · | and gradients ∇ are derived from the Euclidean structure. If a probability measure µ on X satisfies the PI, we directly obtain the following standard covariance estimate:
Lemma 2.1. Assume µ satisfies the PI with constant ̺ > 0. Then for any smooth functions f and g we have
|cov_µ(f, g)| ≤ (1/̺) (∫ |∇f|^2 dµ)^{1/2} (∫ |∇g|^2 dµ)^{1/2}. (2.1)
Even if the estimate (2.1) is optimal (cf. [OR07, Remark 4]), it does not yield information about the dependence of the covariance on the specific coordinates. Hence, the estimate (2.1) is useless for deducing decay of covariances. For example, let us consider a Gaussian Gibbs measure
µ(dx) = Z^{-1} exp(−(1/2) ⟨x, Ax⟩) dx
on R^N with a symmetric and positive definite N × N matrix A. Then it is known that
cov_µ(x_i, x_j) = (A^{-1})_{ij}. (2.2)
Therefore, we can hope for a finer estimate than (2.1) that is also sensitive to the dependence of the functions f and g on the specific coordinates x_i. Our covariance estimate shows this feature:
Assumption 2.2. We consider a probability measure dµ := Z^{-1} exp(−H(x)) dx on a direct product of Euclidean spaces X = X_1 × · · · × X_N. We assume that
• the conditional measures µ(dx_i | x̄_i), 1 ≤ i ≤ N, where x̄_i denotes the coordinates other than x_i, satisfy a uniform PI with constant ̺_i > 0, which means that for all smooth functions f
var_{µ(dx_i | x̄_i)}(f) ≤ (1/̺_i) ∫ |∇_i f|^2 µ(dx_i | x̄_i);
• the numbers κ_ij, 1 ≤ i ≠ j ≤ N, satisfy |∇_i ∇_j H(x)| ≤ κ_ij < ∞ uniformly in x ∈ X. Here, | · | denotes the operator norm of a bilinear form.
Theorem 2.3. Let Assumption 2.2 hold and assume that the symmetric matrix A = (A_ij) defined by
A_ii = ̺_i and A_ij = −κ_ij for i ≠ j (2.3)
is positive definite. Then for all smooth functions f and g
|cov_µ(f, g)| ≤ Σ_{i,j=1}^N (A^{-1})_{ij} (∫ |∇_i f|^2 dµ)^{1/2} (∫ |∇_j g|^2 dµ)^{1/2}. (2.4)
The structure of the estimate in Theorem 2.3 is related to the BLI in the sense that variance is replaced by covariance and that Hess H is replaced by A.
Remark 2.4 (Connection to BLI). We assume X_i = R for i ∈ {1, . . . , N} and let A be a symmetric positive definite N × N matrix. We consider a ferromagnetic Gaussian Hamiltonian given by
H(x) = (1/2) ⟨x, Ax⟩ = (1/2) Σ_{i,j=1}^N A_ij x_i x_j,
where ferromagnetic means that the coupling is attractive, i.e. A_ij ≤ 0 for all i ≠ j.
Then the covariance estimate (2.4) coincides with the BLI given by (1.1) provided the function f = g is an affine function.
The next remark considers the optimality of Theorem 2.3.
Remark 2.5 (Optimality). Provided the Hamiltonian H is ferromagnetic Gaussian, the estimate of Theorem 2.3 is optimal. This remark is verified by setting f(x_n) = x_n and g(x_k) = x_k and using (2.2).
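The optimality statement of Remark 2.5 can be checked by hand on a small example. The following sketch is illustrative only. The 3 × 3 matrix `A` is a hypothetical choice, and the helper `inverse` is not part of the article. The sketch assumes the reading of Theorem 2.3 in which the comparison matrix carries the single-site constants ̺_i on the diagonal and −κ_ij off the diagonal. For the Gaussian Hamiltonian (1/2)⟨x, Ax⟩ the conditional law of x_i is a Gaussian with variance 1/A_ii, so one may take ̺_i = A_ii and κ_ij = |A_ij|.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment with the identity; Fractions keep the arithmetic exact.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical ferromagnetic coupling matrix: off-diagonal entries are <= 0.
A = [[2, -1, 0],
     [-1, 2, -1],
     [0, -1, 2]]
N = len(A)

# Assumption: rho_i = A_ii and kappa_ij = |A_ij| for the Gaussian measure.
rho = [A[i][i] for i in range(N)]
kappa = [[abs(A[i][j]) if i != j else 0 for j in range(N)] for i in range(N)]

# Comparison matrix: rho_i on the diagonal, -kappa_ij elsewhere.
comparison = [[rho[i] if i == j else -kappa[i][j] for j in range(N)]
              for i in range(N)]

# Gaussian covariances cov(x_n, x_k) = (A^{-1})_{nk}, cf. (2.2).
covariance = inverse(A)

# For f = x_n and g = x_k the gradients are unit vectors, so the bound
# reduces to the single entry (comparison^{-1})_{nk}.
bound = inverse(comparison)
```

In this example `bound` and `covariance` agree entry by entry, which is the sense in which the estimate cannot be improved for ferromagnetic Gaussian measures.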
Remark 2.6 (Criterion for PI). Theorem 2.3 contains a well-known criterion for the PI: if A ≥ ̺ Id for some ̺ > 0, then µ satisfies a PI with constant ̺, which means that for all smooth functions f
var_µ(f) ≤ (1/̺) ∫ |∇f|^2 dµ.
The assumption under which Theorem 2.3 holds has the same algebraic structure as the assumption in the Otto-Reznikoff criterion for LSI (cf. [OR07, Theorem 1]). The only difference is that the uniform LSI constant for the single-site conditional measures is replaced by the uniform PI constant.
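To use the criterion of Remark 2.6 in practice one needs a number ̺ > 0 with A ≥ ̺ Id. One simple sufficient condition, which is not taken from the article, is Gershgorin's circle theorem: for a symmetric matrix with diagonal entries ̺_i and off-diagonal entries −κ_ij, every eigenvalue is at least min_i (̺_i − Σ_{j≠i} κ_ij). The sketch below computes this lower bound. The function name and all numbers are hypothetical.

```python
def gershgorin_lower_bound(rho, kappa):
    """Lower bound for the smallest eigenvalue of the symmetric matrix
    with diagonal rho[i] and off-diagonal entries -kappa[i][j]."""
    n = len(rho)
    return min(rho[i] - sum(kappa[i][j] for j in range(n) if j != i)
               for i in range(n))

# Hypothetical single-site constants and interaction bounds.
rho = [1.0, 1.0, 1.0]
kappa = [[0.0, 0.2, 0.1],
         [0.2, 0.0, 0.2],
         [0.1, 0.2, 0.0]]

# If this number is positive, then A >= rho_lower * Id, and the criterion
# of Remark 2.6 gives a PI with constant rho_lower.
rho_lower = gershgorin_lower_bound(rho, kappa)
```

The bound is only sufficient: the matrix A may be positive definite even when this number is negative.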
The starting point of the proof of Theorem 2.3 is a representation of the covariance, which was used by Helffer [Hel95] to give another proof of the BLI. More precisely, one can express the covariance of the measure µ as
cov_µ(f, g) = ∫ ∇ϕ · ∇g dµ, (2.5)
where the potential ϕ is defined as the solution of the elliptic equation
−∇ · (µ ∇ϕ) = (f − ∫ f dµ) µ. (2.6)
Here we used the convention that µ also denotes the Lebesgue density of the probability measure µ. As a solution of (2.6) we understand any ϕ ∈ H^1(µ) such that for all ζ ∈ H^1(µ)
∫ ∇ζ · ∇ϕ dµ = ∫ ζ (f − ∫ f dµ) dµ. (2.7)
The existence of such solutions follows directly from the Riesz representation theorem applied to the Hilbert space
H := {ζ ∈ H^1(µ) : ∫ ζ dµ = 0} (2.8)
equipped with the inner product
⟨ζ, ξ⟩_H := ∫ ∇ζ · ∇ξ dµ. (2.9)
The completeness of H w.r.t. the chosen inner product follows from the fact that µ satisfies some PI, which is guaranteed by our Assumption 2.2.
Let us return to the proof of Theorem 2.3. An application of the Cauchy-Schwarz inequality to (2.5) yields
|cov_µ(f, g)| ≤ Σ_{i=1}^N (∫ |∇_i ϕ|^2 dµ)^{1/2} (∫ |∇_i g|^2 dµ)^{1/2}.
Now, an application of the following theorem yields the desired estimate (2.4) and completes the proof of Theorem 2.3.
Theorem 2.7 (Directional PI). Assume that the conditions of Theorem 2.3 are satisfied. For any function f let the potential ϕ be a solution of (2.6). Then for all i ∈ {1, . . . , N}
(∫ |∇_i ϕ|^2 dµ)^{1/2} ≤ Σ_{j=1}^N (A^{-1})_{ij} (∫ |∇_j f|^2 dµ)^{1/2}. (2.10)
Before we turn to the proof of Theorem 2.7, let us explain why we call the estimate (2.10) directional PI. For this let us recall the dual formulation of the PI (cf. for example [OV00]), which is an easy consequence of the dual characterization of the norm on the Hilbert space H given by (2.8) and (2.9).
Lemma 2.8 (Dual formulation of the PI). A probability measure µ satisfies the PI with constant ̺ > 0 if and only if for any function f and the solution ϕ of (2.6) Note that the directional PI given by (2.10) estimates each coordinate of the gradient of ϕ separately and therefore is a refinement of the dual formulation of the PI given by (2.11). As in [OV00, Section 3], the function ϕ formally denotes the tangent vector of the curve (1+εf)µ at ε = 0. Therefore, ∇ϕ can be interpreted as the infinitesimal optimal displacement transporting the measure µ into (1+εf)µ (cf. [OV00, Section 5]). So, the left-hand side of (2.10) measures the average flux of mass into the direction of the i-th coordinate against a weighted gradient of f. For this reason we call (2.10) directional PI.
One can also interpret the estimate (2.10) in terms of the Witten complex (for a nice overview see [Hel02]). At least formally one can introduce the inverse Witten-Laplacian A −1 1 as A −1 1 ∇f := ∇ϕ, which maps the gradient of some function f onto the gradient of the solution ϕ of the equation (2.6). Let Π i denote the projection onto the space X i , i ∈ {1, . . . , N}. Then the estimate (2.10) becomes a weighted estimate of the L 2 -operator norm of Π i A −1 1 .
Let us now turn to the proof of Theorem 2.7, which is the only missing ingredient in the proof of Theorem 2.3. The argument is very basic. It combines the core inequality of Ledoux's argument for [Led01, Proposition 3.1] with linear algebra that was used in the argument of [OR07, Theorem 1].
Proof of Theorem 2.7. To make the main ideas of the argument more visible, we assume that the Euclidean spaces X i , i ∈ {1, . . . , N}, are one dimensional i.e. X i = R. The argument for general Euclidean spaces X i is almost the same. Then the product space X = X 1 × · · · × X N becomes R N . The gradient ∇ i on X i is just the partial derivative ∂ i w.r.t. the i-th coordinate. The first ingredient of the proof is the basic estimate for j ∈ {1, . . . , N} (2.12) which is just an equivalent formulation of the PI with constant ̺ j for the single-site measure µ(dx j |x j ) (cf. [Led01, Proposition 1.3, (1.8)] or [HS94,Hel98]). The second ingredient of the proof is the identity Indeed, by partial integration one sees that Applying now (2.7) on the terms of the r.h.s. yields the identity Let us have a closer look at the second term on the r.h.s of the last identity. It follows from the definition of µ that A combination of the last two formulas yields the desired identity (2.13).
Now, we turn to the proof of (2.10). A combination of (2.12) and (2.13) yields the estimate Applying Cauchy-Schwarz on the last estimate yields for all j ∈ {1, . . . , N} (2.14) A simple linear algebra argument outlined in [OR07,Lemma 9] shows that the elements of the inverse of A are non negative i.e. (A −1 ) ij ≥ 0 for all i, j ∈ {1, . . . , N}. Hence, (2.14) yields The proof of Theorem 2.3 is just a direct application of Theorem 2.7.
Proof of Theorem 2.3. Using the definition of ϕ, cf. (2.6), we obtain the following estimate of the covariance
|cov_µ(f, g)| = |∫ ∇ϕ · ∇g dµ| ≤ Σ_{i=1}^N (∫ |∇_i ϕ|^2 dµ)^{1/2} (∫ |∇_i g|^2 dµ)^{1/2}.
Now, the statement follows directly from Theorem 2.7.
3. Application of the B-L type covariance estimate: Decay of correlations
In this section we show how Theorem 2.3 can be used to deduce decay of correlations. We distinguish between two cases: exponential decay of correlations (see Section 3.1) and algebraic decay of correlations (see Section 3.2).
3.1. Exponential decay of correlations. We start by recalling a method based on Helffer [Hel99] that has often been used to derive exponential decay of correlations of spin systems with finite-range or exponentially decaying interaction (cf. [BH99] and [BH00]). This method is based on a weighted covariance estimate, which we present in the spirit of Ledoux [Led01, Proposition 3.1], but rephrased in our framework. We assume that there exists ̺ > 0 such that in the sense of quadratic forms Then the matrix A is positive definite and for all functions f and g, At the end of this section, we give a new proof of Theorem 3.1 showing that the weighted covariance estimate (3.2) is an easy consequence of our covariance estimate of Theorem 2.3. This shows that the statement of Theorem 2.3 is consistent with the existing literature.
Remark 3.2. Using a direct argument for deducing Theorem 3.1, one sees that the condition (3.1) can be relaxed to a weaker condition (for the argument we refer the reader to [Men11a, Section 1.2.1] or [Che08, Proposition 3.2]). More precisely, let the symmetric N × N matrix A(x) = (A_ij(x)) be defined by a direct application of Theorem 3.1 yields the following criterion for exponential decay of correlations.
We assume that there exists ̺ > 0 such that in the sense of quadratic forms Then for all functions f = f(x_i) and g = g(x_j), i, j ∈ {1, . . . , N}, This criterion may also be stated more generally for functions with arbitrary disjoint supports. It is implicitly contained in the prelude of [Led01, Proposition 6.2].
At the end of this section we will also give a direct proof of Corollary 3.3, which is based only on the covariance estimate of Theorem 2.3 and does not need the weighted covariance estimate of Theorem 3.1. Now, let us give an example of how Corollary 3.3 can be applied. For that purpose we consider a two-dimensional lattice system with non-convex single-site potential and weak nearest-neighbor interaction. The same type of argument would also work for any dimension and finite-range interaction. Let X denote a two-dimensional periodic lattice of N sites and let δ(·, ·) denote the graph distance on it. We assume that µ ∈ P(X) has the Hamiltonian where the smooth potential ψ is a bounded perturbation of a Gaussian in the sense that By a combination of the Bakry-Émery criterion (cf. Theorem A.1) and the Holley-Stroock perturbation principle (cf. Theorem A.2) all conditional measures µ(dx_i | x̄_i) satisfy a uniform LSI with constant ∆ := exp(− osc δψ). From (3.5) we see that Hence, we know that if the interaction is sufficiently weak in the sense of ε < ∆/4, the matrix A of Theorem 2.3 satisfies A ≥ (∆ − 4ε) Id. Analogously one obtains that if ε < (∆/4) e^{-1}, the matrix Ã of Corollary 3.3 satisfies Ã ≥ (∆ − 4εe) Id. Therefore, an application of Corollary 3.3 yields exponential decay of correlations:
Proposition 3.4. Assume that ε < (∆/4) e^{-1}. Then for any functions f = f(x_i) and g = g(x_j), i, j ∈ {1, . . . , N}, This statement reproduces the correlation bounds established by Helffer [Hel99] and reproved by Ledoux in [Led01, Proposition 6.2].
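The mechanism behind this example can be made concrete with a small numerical sketch. It is not the argument of the article. It assumes that the comparison matrix has the form A = ∆ Id − ε · (nearest-neighbor adjacency) on a periodic square lattice, and it checks the elementary random-walk bound (A^{-1})_{ij} ≤ (4ε/∆)^{δ(i,j)} / (∆ − 4ε), which holds because a site has four neighbors and every walk from i to j has at least δ(i,j) steps. The lattice size and the values of `gap` (standing in for ∆) and `eps` are hypothetical.

```python
def torus_neighbors(site, side):
    """Four nearest neighbours of a site on a side x side periodic lattice."""
    x, y = site
    return [((x + 1) % side, y), ((x - 1) % side, y),
            (x, (y + 1) % side), (x, (y - 1) % side)]

def graph_distance(a, b, side):
    """Graph distance between two sites of the periodic lattice."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return min(dx, side - dx) + min(dy, side - dy)

def inverse_column(origin, side, gap, eps, sweeps=200):
    """Approximate the column of A^{-1} belonging to `origin`, where
    A = gap * Id - eps * (nearest-neighbour adjacency), by Jacobi iteration.
    Starting from zero, the k-th iterate is the k-th partial Neumann sum."""
    sites = [(x, y) for x in range(side) for y in range(side)]
    column = {s: 0.0 for s in sites}
    for _ in range(sweeps):
        column = {s: ((1.0 if s == origin else 0.0)
                      + eps * sum(column[t] for t in torus_neighbors(s, side))) / gap
                  for s in sites}
    return column

# Hypothetical parameters with 4 * eps < gap (weak interaction).
side, gap, eps = 5, 1.0, 0.05
origin = (0, 0)
column = inverse_column(origin, side, gap, eps)

# Random-walk bound: at most 4^n walks of length n, none shorter than the distance.
ratio = 4 * eps / gap
decay_bound = {s: ratio ** graph_distance(origin, s, side) / (gap - 4 * eps)
               for s in column}
```

If in addition 4ε/∆ ≤ e^{-1}, the factor (4ε/∆)^{δ(i,j)} is bounded by e^{-δ(i,j)}, which matches the smallness condition on ε used in Proposition 3.4.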
Let us now prove the statements mentioned in this section.
Proof of Theorem 3.1 using Theorem 2.3. We start with deducing that A is positive definite. Because A is a symmetric matrix, it suffices to show that every eigenvalue of A is positive. Let λ ∈ R be an eigenvalue of A with eigenvector x, i.e. Ax = λx.
An application of (3.1) to the vector Dx yields which implies λ > 0. Now, we will deduce (3.2). Because A is symmetric, the inverse A −1 also is symmetric. Therefore, an application of Theorem 2.3 yields the estimate where the vectors z,z ∈ R N are defined for i, j ∈ {1, . . . , N} by Therefore, (3.2) is verified provided holds for any z ∈ R N . From the hypothesis (3.1) it follows that Hence, we have which immediately yields (3.6).
Direct proof of Corollary 3.3 using only Theorem 2.3. Let us fix two indices i, j ∈ {1, . . . , N}. Let f and g be arbitrary functions just depending on x i and x j respectively. We apply Theorem 2.3 and get where A is defined as in (2.3). Therefore, it remains to estimate the element (A −1 ) ij . By Neumann series (also called the random walk expansion of A −1 (cf. [BFS82])) we have e −δ(i,s) e −δ(s,l) e −δ(l,j) e −δ(i,s) e −δ(s,l) e −δ(l,j) κ is κ sl κ lj ̺ i ̺ s ̺ l ̺ j + · · · · · · . (3.8) By the triangle inequality we get for all i, s, j ∈ {1, . . . , N}. Hence, we can continue the estimation of (3.8) as whereÃ is defined as in (3.3). By (3.4) we have the bound which together with (3.7) and (3.9) finishes the proof.
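The Neumann series used in this proof can be illustrated numerically. The sketch below is not taken from the article. It assumes that A has the form D − K with D = diag(̺_1, . . . , ̺_N) and K = (κ_ij), so that A^{-1} = Σ_m D^{-1} (K D^{-1})^m, whose first terms are δ_ij/̺_i + κ_ij/(̺_i ̺_j) + Σ_s κ_is κ_sj/(̺_i ̺_s ̺_j). All numbers and function names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_inverse(rho, kappa, terms):
    """Partial sum of A^{-1} = sum_m D^{-1} (K D^{-1})^m for A = D - K,
    where D = diag(rho) and K = (kappa_ij) has zero diagonal."""
    n = len(rho)
    d_inv = [[1.0 / rho[i] if i == j else 0.0 for j in range(n)]
             for i in range(n)]
    step = matmul(kappa, d_inv)      # K D^{-1}
    term = d_inv                     # summand for m = 0
    total = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
        term = matmul(term, step)    # next summand D^{-1} (K D^{-1})^{m+1}
    return total

# Hypothetical single-site constants and interaction bounds.
rho = [2.0, 2.0, 2.0]
kappa = [[0.0, 0.5, 0.25],
         [0.5, 0.0, 0.5],
         [0.25, 0.5, 0.0]]
n = len(rho)
A = [[rho[i] if i == j else -kappa[i][j] for j in range(n)] for i in range(n)]

approx_inverse = neumann_inverse(rho, kappa, terms=80)

# Residual of A * approx_inverse against the identity matrix.
product = matmul(A, approx_inverse)
residual = max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))
```

Every summand of the series has non-negative entries, so the partial sums also illustrate why the entries of A^{-1} are non-negative.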
3.2. Algebraic decay of correlations. In this section we show how Theorem 2.3 can be used to deduce an algebraic decay of correlations in the case of algebraically decaying interaction. Because in the article [Men13] the statement of Proposition 3.5 is applied to a d-dimensional lattice system, we change the notation a little bit.
Proposition 3.5. Let Λ ⊂ Z^d be an arbitrary finite subset of the d-dimensional lattice Z^d. We consider a probability measure dµ := Here, | · | denotes the operator norm of a bilinear form.
• the numbers κ ij decay algebraically in the sense of is strictly diagonally dominant i.e. for some δ > 0 it holds for any i ∈ Λ (3.11) j∈Λ,j =i Then for all functions f = f (x i ) and g = g(x j ), i, j ∈ Λ, and for any i, j ∈ Λ Proof of Proposition 3.5. Because the matrix A is strictly diagonal dominant in the sense of (3.11) by assumption, the matrix A is also positive definite. Therefore an application of Theorem 2.3 directly yields the estimate (3.12). So, it is only left to deduce the estimate (3.13). As in the proof of Corollary 3.3 the Neumann series representation of A −1 yields for i = j It follows from our assumption (3.11) that κm n ̺ n ≤ m∈Λ κ nm ̺ n ≤ c < 1 uniformly in n,m ∈ Λ. (3.15) Therefore we get the estimate Letñ denote the smallest integer larger than log |i−j| d+α | log c| . Then we have Considering (3.14) it only remains to estimate ñ k=0 T k . Assume for the moment that A combination of (3.14), (3.16), and (3.18) yields the desired statement (3.13).
Then there is at least one pair of (s 0 , s 1 ), (s 1 , s 2 )), . . ., (s k−1 , s k ), or s k , s k+1 that satisfies the estimate By the equivalence of norms in finite-dimensional vector-spaces the last inequality yields Therefore we have We show how the second term on the right hand side can be estimated. The estimation of the other terms works almost the same, hence we skip it. We have s 1 ,...,s k ∈Λ (s 1 ,s 2 ) satisfies (3.19) ≤ C (k + 1) d+α |i − j| d+α + 1 .
With similar bounds for the other terms we get the desired estimate T_k ≤ C (k + 1)^{d+α+1} / (|i − j|^{d+α} + 1), which closes the argument.
Appendix A. The criterion of Bakry-Émery and the Holley-Stroock perturbation principle
In this section we state the criterion of Bakry-Émery and the Holley-Stroock perturbation principle, which we used in the main part of this article to deduce the PI for certain measures. Because we only work with the PI in this article, we state those criteria for the PI. However, note that both criteria also hold on the stronger level of the LSI. The Bakry-Émery criterion connects convexity of the Hamiltonian to the validity of the PI. In non-convex cases the standard tool to deduce the PI is the Holley-Stroock perturbation principle. The perturbation principle of Holley-Stroock [HS87] allows one to deduce the PI constant of a non-convex Hamiltonian from the PI of an appropriately convexified Hamiltonian. However, due to its perturbative nature, the dependence of the PI constant ̺ on physical parameters like system size or temperature is usually bad.
and Christian Loeschcke for the fruitful and inspiring discussions on this topic. The author was financially supported by the Deutsche Forschungsgemeinschaft through the Gottfried Wilhelm Leibniz program and partially by the Bonn International Graduate School in Mathematics during the years 2007 to 2009, where most of the content of this article originated.