Spectral calculus and Lipschitz extension for barycentric metric spaces

The metric Markov cotype of barycentric metric spaces is computed, yielding the first class of metric spaces that are not Banach spaces for which this bi-Lipschitz invariant is understood. It is shown that this leads to new nonlinear spectral calculus inequalities, as well as a unified framework for Lipschitz extension, including new Lipschitz extension results for CAT(0) targets. An example that elucidates the relation between metric Markov cotype and Rademacher cotype is analyzed, showing that a classical Lipschitz extension theorem of Johnson, Lindenstrauss and Benyamini is asymptotically sharp.


Introduction
Our main purpose here is to compute a bi-Lipschitz invariant, called metric Markov cotype, for barycentric metric spaces, an important class of metric spaces that contains all uniformly convex Banach spaces, as well as all complete simply connected metric spaces that are nonpositively curved in the sense of Aleksandrov.
The notion of metric Markov cotype arises from the deep work [3] of K. Ball on the Lipschitz extension problem. Based mainly on Ball's ideas in [3], combined with some additional geometric ingredients, we establish a fully nonlinear version of Ball's extension theorem that allows for targets that are not necessarily Banach spaces. Due to our computation of metric Markov cotype for barycentric spaces, this yields a versatile Lipschitz extension theorem that contains as special cases many Lipschitz extension theorems that appeared in the literature, as well as Lipschitz extension results that were previously unknown.
Another use of metric Markov cotype is due to [43], where it is shown to yield spectral calculus inequalities for nonlinear spectral gaps. Consequently, our computation of metric Markov cotype for barycentric metric spaces implies new nonlinear spectral calculus inequalities which, in the special case of CAT (0) spaces, lay the groundwork for our forthcoming construction [44] of expanders with respect to certain Hadamard spaces and random graphs.
Finally, we show that a beautiful construction of Kalton [29] yields a closed linear subspace X of L 1 (thus in particular X has Rademacher cotype 2) that fails to have finite metric Markov cotype. By obtaining a quantitative version of Kalton's result, we show that a classical Lipschitz extension theorem of Johnson, Lindenstrauss and Benyamini [23] is asymptotically sharp.
In order to give precise formulations of the above results one needs to recall some background. This will be done in the subsequent sections that contain a detailed description of the contents of this paper.
1.1. Markov type and metric Markov cotype. Given n ∈ N and π ∈ ∆^{n−1} def= {x ∈ [0, 1]^n : ∑_{i=1}^n x_i = 1}, recall that a stochastic matrix A = (a_{ij}) ∈ M_n(R) (here and in what follows, M_n(R) denotes as usual the n by n matrices with real entries) is said to be reversible relative to the probability vector π if π_i a_{ij} = π_j a_{ji} for all i, j ∈ {1, . . . , n}. The following important definition is due to K. Ball [3].

Definition 1.1 (Markov type p). A metric space (X, d_X) is said to have Markov type p ∈ (0, ∞) with constant M ∈ (0, ∞) if for every n, t ∈ N and every π ∈ ∆^{n−1}, if A = (a_{ij}) ∈ M_n(R) is a stochastic matrix that is reversible relative to π then every x_1, . . . , x_n ∈ X satisfy

∑_{i=1}^n ∑_{j=1}^n π_i (A^t)_{ij} d_X(x_i, x_j)^p ≤ M^p t ∑_{i=1}^n ∑_{j=1}^n π_i a_{ij} d_X(x_i, x_j)^p.   (1)

The infimum over those M ∈ (0, ∞) satisfying (1) is denoted M_p(X).
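As a numerical sanity check of Definition 1.1, the following sketch (our own illustration, not from the paper) builds a stochastic matrix that is reversible relative to a probability vector and verifies inequality (1) with M = 1 for random points in Euclidean space; this is consistent with Ball's theorem, quoted below, that Hilbert space has Markov type 2 with constant 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a stochastic matrix A reversible relative to pi: symmetrize a random
# nonnegative matrix S, take pi proportional to its row sums, normalize rows.
# Then pi_i * a_ij = s_ij / total is symmetric in (i, j), i.e. A is reversible.
n = 6
S = rng.random((n, n))
S = S + S.T
pi = S.sum(axis=1) / S.sum()
A = S / S.sum(axis=1, keepdims=True)

# Random points in R^d, with p = 2 (the Euclidean/Hilbertian case).
d = 3
x = rng.random((n, d))
D2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)  # squared distances

def markov_type_sides(t):
    """Both sides of inequality (1) for p = 2 and constant M = 1."""
    At = np.linalg.matrix_power(A, t)
    lhs = (pi[:, None] * At * D2).sum()      # sum_ij pi_i (A^t)_ij d(x_i,x_j)^2
    rhs = t * (pi[:, None] * A * D2).sum()   # t * sum_ij pi_i a_ij d(x_i,x_j)^2
    return lhs, rhs

for t in range(1, 20):
    lhs, rhs = markov_type_sides(t)
    assert lhs <= rhs + 1e-9  # inequality (1) with M = 1 in Hilbert space
```

The reversibility construction (symmetric weights, row normalization) is the standard way to produce examples of the matrices appearing in Definition 1.1.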
The triangle inequality implies that M_1(X) = 1 for every metric space (X, d_X), and Ball proved in [3] that M_p(ℓ_p) = 1 for p ∈ [1, 2]. In [50] it is shown that M_2(ℓ_p) ≲ √p for p ∈ [2, ∞) (here, and in what follows, A ≲ B and B ≳ A denote the estimate A ≤ CB for some absolute constant C ∈ (0, ∞)). Additional examples of computations of Markov type will be discussed in Section 1.5.
Markov type is a bi-Lipschitz invariant that has proved itself useful for a variety of problems in metric geometry, one of which will be recalled below. We refer to [3] for the natural probabilistic interpretation of (1) that explains the above terminology (this interpretation is not needed in the present paper, but it is important elsewhere).

Definition 1.2 (metric Markov cotype p). A metric space (X, d_X) is said to have metric Markov cotype p ∈ (0, ∞) with constant N ∈ (0, ∞) if for every n, t ∈ N and every π ∈ ∆^{n−1}, if A = (a_{ij}) ∈ M_n(R) is a stochastic matrix that is reversible relative to π then for every x_1, . . . , x_n ∈ X there exist y_1, . . . , y_n ∈ X satisfying

∑_{i=1}^n π_i d_X(x_i, y_i)^p + t ∑_{i=1}^n ∑_{j=1}^n π_i a_{ij} d_X(y_i, y_j)^p ≤ N^p ∑_{i=1}^n ∑_{j=1}^n π_i (𝒜_t(A))_{ij} d_X(x_i, x_j)^p,   (2)

where 𝒜_t(A) def= (1/t) ∑_{s=1}^t A^s denotes the Cesàro average of A. The infimum over those N ∈ (0, ∞) satisfying (2) is denoted N_p(X).
Definition 1.2 is taken from [43]. In [3] Ball suggested a seemingly different notion of Markov cotype, but it is in fact equivalent to Definition 1.2, as explained in Section 7. Due to applications of (2) that will be described later, we believe that it is beneficial to work with the above definition of metric Markov cotype rather than Ball's original formulation. See Section 7 for a description of Ball's approach.
Condition (2) originates from an attempt to introduce an invariant that is "dual" to Markov type by reversing the inequality in (1). However, no non-singleton metric space can satisfy (1) with the direction of the inequality reversed (this follows formally from observations in [51] and [42], and can also be easily verified directly). Inequality (2) achieves a similar reversal of (1) by allowing one to pass from the initial points x_1, . . . , x_n ∈ X to new points y_1, . . . , y_n ∈ X. The first summand in the left hand side of (2) ensures that on average (with respect to π) y_i is close to x_i. The remaining terms in (2) correspond to the reversal of (1), with {x_i}_{i=1}^n replaced by {y_i}_{i=1}^n in the left hand side, and the power A^t replaced by the Cesàro average (1/t) ∑_{s=1}^t A^s. Due to [3, 43], Banach spaces that admit an equivalent norm whose modulus of convexity has power type p have metric Markov cotype p; in particular N_p(ℓ_p) ≲ 1 for p ∈ [2, ∞) and N_2(ℓ_p) ≲ 1/√(p − 1) for p ∈ (1, 2]. Prior to the present work this was the only nontrivial class of metric spaces whose metric Markov cotype was known. Here we enrich the repertoire of metric spaces for which one can prove a metric Markov cotype inequality such as (2), treating also spaces that are not necessarily Banach spaces.
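The mechanism behind (2) can be tested numerically in the Euclidean model case (our own illustration). Taking the points y_i to be the Cesàro-smoothed averages of the x_j, which mirrors the kind of barycentric smoothing used later in the proof of Theorem 1.5, inequality (2) holds in R^d with p = 2; the constant 16 below is a deliberately generous numerical cushion rather than a sharp value.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stochastic matrix A reversible relative to pi, as in Definition 1.2.
n, d, t = 8, 2, 5
S = rng.random((n, n)); S = S + S.T
pi = S.sum(axis=1) / S.sum()
A = S / S.sum(axis=1, keepdims=True)
x = rng.random((n, d))

# Cesàro average (1/t) * sum_{s=1}^t A^s.
Ces = sum(np.linalg.matrix_power(A, s) for s in range(1, t + 1)) / t

# Candidate points y_i: barycentric (here linear) smoothing of the x_j.
y = Ces @ x

def pairwise_sq(u, v):
    return ((u[:, None, :] - v[None, :, :]) ** 2).sum(axis=-1)

lhs = (pi * ((x - y) ** 2).sum(axis=1)).sum() \
      + t * (pi[:, None] * A * pairwise_sq(y, y)).sum()
rhs = (pi[:, None] * Ces * pairwise_sq(x, x)).sum()
assert lhs <= 16 * rhs + 1e-9  # inequality (2) with a generous constant N^2 = 16
```

A spectral computation in the π-weighted inner product shows that this choice of y in fact satisfies (2) with a small absolute constant; the assertion above only checks a weaker bound.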
1.2. Barycentric metric spaces. In order to avoid measurability considerations that are irrelevant to the discussion at hand, we will tacitly assume throughout this article that all measures are finitely supported and all σ-algebras are finite.
The set of probability measures on a set X is denoted P_X. Denoting the point mass at x ∈ X by δ_x ∈ P_X, every µ ∈ P_X can be written uniquely as µ = ∑_{i=1}^n λ_i δ_{x_i} for some n ∈ N, distinct x_1, . . . , x_n ∈ X and (λ_1, . . . , λ_n) ∈ ∆^{n−1} ∩ (0, 1]^n. A coupling of µ, ν ∈ P_X is a measure π ∈ P_{X×X} such that ∑_{z∈X} π(x, z) = µ(x) and ∑_{z∈X} π(z, y) = ν(y) for every x, y ∈ X (both of these sums are finite). The set of all the couplings of µ and ν is denoted Π(µ, ν) ⊆ P_{X×X}. If (X, d_X) is a metric space and p ∈ [1, ∞) then the corresponding Wasserstein p metric on P_X is defined as usual by

∀ µ, ν ∈ P_X,   W_p(µ, ν) def= inf_{π∈Π(µ,ν)} ( ∑_{(x,y)∈X×X} d_X(x, y)^p π(x, y) )^{1/p}.

In what follows, a mapping B : P_X → X satisfying B(δ_x) = x for every x ∈ X will be called a barycenter map.

Definition 1.3 (W_p barycentric metric space). Fix p, Γ ∈ [1, ∞). A metric space (X, d_X) is said to be W_p barycentric with constant Γ if there exists a barycenter map B : P_X → X such that

∀ µ, ν ∈ P_X,   d_X(B(µ), B(ν)) ≤ Γ W_p(µ, ν).   (3)

The notion of W_p barycentric metric spaces was studied by several authors: see e.g. [33, 17, 60, 20, 54, 53]. Note that if (X, d_X) is W_p barycentric with constant Γ then it is also W_q barycentric with constant Γ for every q ≥ p. Normed spaces are W_1 barycentric with constant Γ = 1, as exhibited by the barycenter map B(µ) = ∫_X x dµ(x). Metric spaces that are nonpositively curved in the sense of Busemann (see [9]) are also W_1 barycentric with constant Γ = 1, as shown in [17, 53].

Definition 1.4 (p-barycentric metric space). Fix p, K ∈ [1, ∞). A metric space (X, d_X) is said to be p-barycentric with constant K if there exists a barycenter map B : P_X → X such that for every x ∈ X and µ ∈ P_X we have

d_X(x, B(µ))^p + (1/K^p) ∫_X d_X(B(µ), y)^p dµ(y) ≤ ∫_X d_X(x, y)^p dµ(y).   (4)

One could consider variants in which, say, a constant factor multiplies the right hand side of (4), but as will become clear from the ensuing considerations, Definition 1.4 suffices for many purposes.
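For finitely supported measures the Wasserstein distance is a finite linear program over the coupling polytope Π(µ, ν). The following sketch (our own illustration, assuming SciPy is available) computes W_p for measures on the real line and checks it on two cases where the value is known by hand.

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_p(xs, mu, ys, nu, p=1):
    """W_p between the finitely supported measures mu (on points xs) and
    nu (on points ys), via linear programming over couplings."""
    m, n = len(xs), len(ys)
    cost = np.abs(np.subtract.outer(xs, ys)) ** p  # d(x, y)^p on the real line
    # Marginal constraints: rows of the coupling sum to mu, columns to nu.
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1
    for j in range(n):
        A_eq[m + j, j::n] = 1
    b_eq = np.concatenate([mu, nu])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun ** (1 / p)

# Point masses: W_p(delta_0, delta_3) = 3.
w = wasserstein_p(np.array([0.0]), np.array([1.0]), np.array([3.0]), np.array([1.0]))
assert abs(w - 3.0) < 1e-6

# W_1 between (1/2) delta_0 + (1/2) delta_1 and delta_0 equals 1/2.
w2 = wasserstein_p(np.array([0.0, 1.0]), np.array([0.5, 0.5]),
                   np.array([0.0]), np.array([1.0]))
assert abs(w2 - 0.5) < 1e-6
```

On the real line the map µ → mean(µ) is a barycenter map, and in the second example |mean(µ) − mean(ν)| = 1/2 = W_1(µ, ν), matching the W_1 barycentric property of normed spaces with Γ = 1.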
We refer to the books [6, 26, 9] for an extensive discussion of the important class of CAT(0) metric spaces, which includes e.g. complete simply connected Riemannian manifolds with nonpositive sectional curvature and Euclidean Tits buildings. For the sake of readers who are not familiar with this notion, we note that the class of CAT(0) metric spaces can be defined as those metric spaces (X, d_X) for which there exists a mapping B : P_X → X that satisfies (4) with p = 2 and K = 1 for probability measures µ that are supported on at most two points [60, Thm. 4.9]. Readers who are not familiar with the theory of uniformly convex Banach spaces are referred to [18, 5].
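In the model case of Euclidean space with B(µ) the mean of µ, inequality (4) with p = 2 and K = 1 holds as an identity, namely the bias-variance decomposition E|x − Y|² = |x − EY|² + E|Y − EY|²; this is consistent with Hilbert spaces being CAT(0). A minimal numerical check (our own illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# A finitely supported measure mu on R^d and an arbitrary point x.
d, n = 3, 7
pts = rng.random((n, d))         # support of mu
w = rng.random(n); w /= w.sum()  # weights of mu
x = rng.random(d)

bary = w @ pts                   # B(mu): the (linear) barycenter of mu
# Left and right hand sides of (4) with p = 2, K = 1.
lhs = ((x - bary) ** 2).sum() + (w * ((pts - bary) ** 2).sum(axis=1)).sum()
rhs = (w * ((pts - x) ** 2).sum(axis=1)).sum()
assert abs(lhs - rhs) < 1e-9     # equality, not just inequality, in R^d
```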
1.3. Metric Markov cotype for barycentric metric spaces. In Section 3 we prove the following result.

Theorem 1.5. Fix p, K, Γ ∈ [1, ∞). Suppose that (X, d_X) is a metric space that is W_p barycentric with constant Γ and also p-barycentric with constant K. Then (X, d_X) has metric Markov cotype p with

N_p(X) ≲ ΓK.   (5)

The special case of Theorem 1.5 when X is a Banach space whose modulus of uniform convexity has power type p was proved in [3, 43]. Our proof of Theorem 1.5 is based on an extension of the method of [43] to the present nonlinear setting. In particular we prove for this purpose a nonlinear analogue of Pisier's martingale cotype inequality [57]; see Section 2 below.
Remark 1.6. The property of having metric Markov cotype p is clearly a bi-Lipschitz invariant. Similarly, the property of being W_p barycentric is a bi-Lipschitz invariant, but this is not the case for the property of being p-barycentric. Thus Theorem 1.5 leaves something to be desired, since its assumption is not invariant under bi-Lipschitz deformations while its conclusion is. By examining the proof of Theorem 1.5 one can extract a somewhat tedious bi-Lipschitz invariant condition that implies the same conclusion (5). It would be interesting to obtain a clean intrinsic characterization of those metric spaces (X, d_X) that are bi-Lipschitz equivalent to a p-barycentric metric space. For Banach spaces this was done in [45], the desired metric invariant being the notion of Markov p-convexity (see [45] for the definition). The method of [45] relies on the Banach space structure, so it remains open to characterize intrinsically those W_p barycentric metric spaces that are bi-Lipschitz equivalent to a p-barycentric metric space. It would also be interesting to characterize those Finsler manifolds that are p-barycentric.
Following [43], given a metric space (X, d_X) and p ∈ (0, ∞), for every n ∈ N and every symmetric stochastic matrix A = (a_{ij}) ∈ M_n(R) let γ_+(A, d_X^p) denote the infimum over those γ_+ ∈ (0, ∞] for which every x_1, . . . , x_n, y_1, . . . , y_n ∈ X satisfy

(1/n²) ∑_{i=1}^n ∑_{j=1}^n d_X(x_i, y_j)^p ≤ (γ_+/n) ∑_{i=1}^n ∑_{j=1}^n a_{ij} d_X(x_i, y_j)^p.   (6)

Letting d_R denote the standard metric on R, i.e., d_R(x, y) = |x − y|, by simple linear algebra we see that γ_+(A, d_R²) = 1/(1 − λ(A)), where λ(A) denotes the second largest element of the decreasing rearrangement of the moduli of the eigenvalues of A. One should therefore think of the quantity γ_+(A, d_X^p) as measuring the magnitude of the nonlinear absolute spectral gap of the matrix A with respect to the geometry of X. We refer to [43] for a detailed discussion of nonlinear spectral gaps and their applications.
Despite the fact that we call inequalities such as (6) "spectral inequalities", there is no actual spectrum present here, and therefore tools that are straightforward in the linear setting due to the link to linear algebra fail to hold true in general. This is especially important in the context of nonlinear spectral calculus, where one aims to relate the quantity γ_+(𝒜_t(A), d_X^p) to γ_+(A, d_X^p). We refer to [43] for an explanation of the importance of this problem, where the following theorem is proved.

Theorem 1.7. There exists a universal constant κ ∈ (0, ∞) with the following property. Suppose that p ∈ [1, ∞) and that (X, d_X) is a metric space that has metric Markov cotype p. Then for every n, t ∈ N, every symmetric stochastic matrix A ∈ M_n(R) satisfies

γ_+(𝒜_t(A), d_X^p) ≤ κ^p N_p(X)^p (1 + γ_+(A, d_X^p)/t).

By combining Theorem 1.5 and Theorem 1.7 we conclude that the following result holds true.

Theorem 1.8. There exists a universal constant c ∈ (1, ∞) such that for every p, K, Γ ∈ [1, ∞), if (X, d_X) is a metric space that is W_p barycentric with constant Γ and p-barycentric with constant K then for every n, t ∈ N, every symmetric stochastic matrix A ∈ M_n(R) satisfies

γ_+(𝒜_t(A), d_X^p) ≤ (cΓK)^p (1 + γ_+(A, d_X^p)/t).   (7)

For future applications it is worthwhile to single out the following special case of Theorem 1.8.
Corollary 1.9 was the main motivation for the investigations that led to the present paper, since it plays a key role in our forthcoming work [44] that establishes for the first time the existence of expanders with respect to certain Hadamard spaces and random graphs.
For the purpose of the applications in [44], the fact that the spectral calculus inequality (7) involves Cesàro averages of A rather than powers of A is immaterial, but it is natural to ask if it is possible to relate γ_+(A^t, d_X^p) to γ_+(A, d_X^p). In the setting of general barycentric metric spaces this question remains open, but for CAT(0) spaces, or more generally under the requirement K = 1 in (4), it is indeed possible to do so, albeit via an upper bound on γ_+(A^t, d_X^p) in terms of γ_+(A, d_X^p) that is weaker than the right hand side of (7).

Theorem 1.10. There is a universal constant C ∈ (0, ∞) with the following property. Fix p, Γ ∈ [1, ∞) and suppose that (X, d_X) is a metric space that is W_p barycentric with constant Γ and p-barycentric with constant K = 1. Then for every n, t ∈ N, every symmetric stochastic matrix A ∈ M_n(R) satisfies

γ_+(A^t, d_X^p) ≤ (CΓ)^p max{1, p γ_+(A, d_X^p)/t}.

Our proof of Theorem 1.10 relies on ideas from [43, Sec. 6], where a similar treatment is given to uniformly convex Banach spaces (in this special context the conclusion of Theorem 1.10 holds true even without the restriction K = 1). In the present nonlinear setting several modifications of the argument of [43] are required; see Section 4 below.
1.5. Lipschitz extension. If (X, d_X) and (Y, d_Y) are metric spaces then for every S ⊆ X denote by e(X, S, Y) the infimum over those L ∈ (0, ∞) such that for every Lipschitz function f : S → Y there exists F : X → Y that extends f and satisfies ‖F‖_Lip ≤ L‖f‖_Lip. Denote also e(X, Y) def= sup_{S⊆X} e(X, S, Y). The goal of the Lipschitz extension problem is to understand which pairs of metric spaces (X, d_X), (Y, d_Y) satisfy e(X, Y) < ∞, and, when that happens, to obtain good bounds on e(X, Y). Due to its intrinsic importance as well as many applications in analysis and geometry, the Lipschitz extension problem has been extensively investigated over the past century. We shall not attempt to indicate the vast literature on this topic, referring instead to the book [11] and the references therein.

K. Ball introduced [3] the notions of Markov type and cotype in order to prove an important Lipschitz extension theorem known today as Ball's extension theorem. Based on Ball's ideas in [3], the following result is proved in Section 5.

Theorem 1.11. Fix p, Γ ∈ [1, ∞). Let (X, d_X) be a metric space of Markov type p and let (Y, d_Y) be a metric space of metric Markov cotype p that is W_p barycentric with constant Γ. Suppose that Z ⊆ X and f : Z → Y is Lipschitz. Then for every finite subset S ⊆ X there exists F : S → Y with F|_{S∩Z} = f|_{S∩Z} and

‖F‖_Lip ≲ Γ M_p(X) N_p(Y) ‖f‖_Lip.   (8)

By combining Theorem 1.11 with Theorem 1.5 we deduce the following Lipschitz extension result.
Corollary 1.12. Fix p, K, Γ ∈ [1, ∞). Suppose that (X, d_X) is a metric space of Markov type p and that (Y, d_Y) is a metric space that is W_p barycentric with constant Γ and also p-barycentric with constant K. Suppose that Z ⊆ X and f : Z → Y is Lipschitz. Then for every finite subset S ⊆ X there exists F : S → Y with F|_{S∩Z} = f|_{S∩Z} and ‖F‖_Lip ≲ Γ²K M_p(X) ‖f‖_Lip.

In [3] Ball obtained the conclusion of Theorem 1.11 when Y is a Banach space, under the assumption that it satisfies a certain linear invariant that he called Markov cotype 2. He also proved that Banach spaces that admit an equivalent norm whose modulus of uniform convexity has power type 2 satisfy this assumption.
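As a point of contrast with the vector-valued setting addressed by Ball's extension theorem, for real-valued targets the extension problem has a classical exact solution: McShane's formula F(x) = inf_{z∈Z} (f(z) + L d_X(x, z)) extends any L-Lipschitz f : Z → R to an L-Lipschitz F : X → R, so that e(X, R) = 1 for every metric space X. A minimal sketch on the real line (our own illustration):

```python
# McShane extension of an L-Lipschitz function f defined on a finite subset Z
# of the real line; the same formula works in any metric space.
def mcshane_extend(Z, f, L):
    return lambda x: min(f[z] + L * abs(x - z) for z in Z)

Z = [0.0, 2.0]
f = {0.0: 0.0, 2.0: 4.0}   # Lipschitz constant 2 on Z
F = mcshane_extend(Z, f, L=2.0)

assert F(0.0) == 0.0 and F(2.0) == 4.0   # F extends f
assert F(1.0) == 2.0                     # interpolated value, still 2-Lipschitz
```

The difficulty that Markov type and cotype are designed to overcome is precisely that no such pointwise infimum formula is available when the target is a general Banach or barycentric metric space.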
In [3, Sec. 6], Ball proposed a way to define Markov cotype 2 for metric spaces: he first defined a bi-Lipschitz invariant of metric spaces that he called "approximate convexity", and for approximately convex metric spaces he defined a notion of metric Markov cotype which is the same as (2), except that in the right hand side of (2) the Cesàro average of A is replaced by a certain Green's matrix corresponding to A. The precise formulation of these concepts is recalled in Section 7, where we show that Ball's notion of metric Markov cotype coincides with the notion of metric Markov cotype of Definition 1.2. Our contribution here is to show that Ball's strategy yields the desired Lipschitz extension result, with the following differences: the W_p barycentric condition is used in a key duality step (Lemma 5.2 below), and Lemma 5.1 below removes the need to use the notion of approximate convexity. Other than these changes and some expository simplifications, Section 5 is nothing more than a realization of Ball's plan as he originally envisaged it.

Theorem 1.11 yields an extension of f to finitely many additional points, with a bound on the Lipschitz constant that is independent of the number of the additional points. This result is the main geometric content of the Lipschitz extension phenomenon studied here, but using standard arguments one can formally deduce from Theorem 1.11 bona fide solutions of the Lipschitz extension problem.
Specifically, let I denote the set of all finite subsets of X and let U be a free ultrafilter on I. Denoting by Y_U the associated ultrapower of Y (see [30] for background on ultrapowers of metric spaces), Y is canonically embedded in Y_U and it follows formally from Theorem 1.11 that there exists a mapping Φ : X → Y_U that extends f and satisfies (8). If for some λ ∈ [1, ∞) there were a λ-Lipschitz retraction from Y_U onto Y, then by composing Φ with this retraction we would deduce that

e(X, Y) ≲ λΓ M_p(X) N_p(Y).   (9)

If Y is a Banach space then Y_U is also a Banach space, and, as proved in [22], it follows from the principle of local reflexivity [38, 25] that there is a linear isometry T : Y** → Y_U such that T(Y**) contains the canonical image of Y in Y_U, and there is a norm 1 projection of Y_U onto T(Y**). It therefore follows from Theorem 1.11 that e(X, Y**) ≲ Γ M_p(X) N_p(Y).
If in addition there is a λ-Lipschitz retraction from Y * * onto Y then it would follow that (9) holds true.
It is a long standing open problem whether for every separable Banach space Y there is a Lipschitz retraction from Y * * onto Y , but in the nonseparable setting it has been recently proved by Kalton [28] that this need not hold true. A dual Banach space is always canonically norm 1 complemented in its bi-dual, and in [29,Sec. 5] Kalton proved that if Y either has an unconditional finite dimensional decomposition (UFDD) or is a separable order continuous Banach lattice then there is a Lipschitz retraction from Y * * onto Y .
If Y is a complete CAT (0) metric space then so is Y U , and moreover Y is a closed convex subset of Y U . In this case there is a 1-Lipschitz retraction from Y U onto Y (the nearest point map); see [9, Ch. II.2].
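The simplest instance of the nearest point retraction invoked above is the metric projection of the line onto a closed interval (a convex subset of the CAT(0) space R), which is 1-Lipschitz; the following sketch (our own illustration) checks this on a few points.

```python
# Nearest point retraction of R onto the closed convex set [lo, hi].
def retract(x, lo=0.0, hi=1.0):
    return min(max(x, lo), hi)

# The retraction is 1-Lipschitz: |P(x) - P(y)| <= |x - y|.
pts = [-2.0, -0.3, 0.2, 0.9, 1.7]
for x in pts:
    for y in pts:
        assert abs(retract(x) - retract(y)) <= abs(x - y) + 1e-12
```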
The above discussion yields a variety of target spaces Y for which the assumptions of Theorem 1.11 imply that e(X, Y) < ∞. We single out in particular the following statement.
The Markov type of several important classes of metric spaces has been computed in the literature, and when one takes (X, d X ) to be one of those spaces Corollary 1.13 becomes a versatile Lipschitz extension theorem that encompasses a wide range of seemingly disparate Lipschitz extension results, that have been previously proved mostly via completely different methods.
Specifically, in [50] it was proved that Banach spaces that admit an equivalent norm whose modulus of uniform smoothness has power type p have Markov type p. It was also proved in [50] that trees, hyperbolic groups, complete simply connected Riemannian manifolds of pinched sectional curvature and Laakso graphs all have Markov type 2, and that spaces that admit a padded random partition (see [36]), in particular doubling metric spaces and planar graphs, have Markov type p for all p ∈ (0, 2). In [10] it was shown that series parallel graphs have Markov type 2, and finally in the recent work [13] it was shown that spaces that admit a padded random partition have Markov type 2. Thus, in particular, doubling spaces and planar graphs have Markov type 2. In [52] it was shown that spaces with finite Nagata dimension admit a padded random partition, and so by [13] they too have Markov type 2. In [55] it was shown that Aleksandrov spaces of nonnegative curvature have Markov type 2, and in [1] the Markov type of certain Wasserstein spaces was computed.
In light of these results, taking as an example the case when (Y, d_Y) is a Hadamard space in Corollary 1.13, we see that if (X, d_X) is a doubling space, planar graph, or a space with finite Nagata dimension, then e(X, Y) is finite. These results were previously proved in [36] via the method of random partitions (Lipschitz extension for spaces of bounded Nagata dimension was previously treated in [34] and only later it was shown in [52] that they admit a padded random partition, so that the corresponding extension results are a special case of [36]). It also follows that if (X, d_X) has nonnegative curvature in the sense of Aleksandrov and (Y, d_Y) is a Hadamard space then e(X, Y) ≲ 1, a result that has been previously proved in [35] as a special case of an elegant generalization of the classical Kirszbraun extension theorem [31].
Given a metric space (X, d X ) and α ∈ (0, 1] let X α denote the metric space (X, d α X ). By the triangle inequality X α has Markov type p with constant 1 for every p ∈ (1, 1/α]. It therefore follows from the above discussion that e(X α , Y ) < ∞ for every metric space X, provided that Y has metric Markov cotype p ∈ (1, 1/α] and there is a Lipschitz retraction from Y U onto Y . In particular, every 1/2-Hölder mapping from a subset of a metric space X into a Hadamard space Y can be extended to a Y -valued 1/2-Hölder mapping defined on all of X; this statement was previously known when Y is a Hilbert space due to the work of Minty [47]. One can state several additional examples of this type, but we single out only one more special case of Corollary 1.13 that does not seem to follow from previously known theorems: if X is a Banach space whose modulus of smoothness has power type 2 (thus by [50] X has Markov type 2), e.g. X can be an L p (µ) space or the Schatten trace class S p for p ∈ [2, ∞), and Y is a Hadamard space, then e(X, Y ) < ∞.
1.6. On a construction of Kalton. Kalton recently used his "method of sections" to obtain several striking results on the nonlinear geometry of Banach spaces. Using Kalton's beautiful work in [29], we prove the following result in Section 6. Theorem 1.14. There exists a closed linear subspace of ℓ 1 that fails to have metric Markov cotype p for every p ∈ (0, ∞).
Much of the impetus for research on bi-Lipschitz invariants stems from the search for nonlinear formulations of key concepts in Banach space theory; see the surveys [4,49] and the references therein for more on this program. In particular, the use of the term "cotype" in Definition 1.2 arises from an analogy with the Banach space notion of Rademacher cotype (see e.g. [41]). ℓ 1 , and hence all of its linear subspaces, has Rademacher cotype 2, so Theorem 1.14 shows that for Banach spaces metric Markov cotype and Rademacher cotype are different notions. Nevertheless, it would be very interesting to understand the metric Markov cotype of ℓ 1 itself rather than its closed subspaces (note that, due to the existential quantifier in Definition 1.2, metric Markov cotype is not trivially inherited by subspaces). Question 1.15. Does ℓ 1 have metric Markov cotype 2? Less ambitiously, does ℓ 1 have metric Markov cotype p for some p ∈ [2, ∞)?
If ℓ 1 had metric Markov cotype 2 then it would follow from Corollary 1.13 that e(ℓ 2 , ℓ 1 ) < ∞. Whether or not e(ℓ 2 , ℓ 1 ) is finite is a long-standing open question that was asked by Ball in [3]; see [39] for algorithmic ramifications of this important question.
The proof of Theorem 1.14 yields the following quantitative statement. For every n ∈ N there exists an n-dimensional subspace Z_n of ℓ_1 satisfying

N_2(Z_n) ≳ (log n)^{1/4}.   (10)

In the reverse direction, every n-dimensional subspace X of ℓ_1 satisfies N_2(X) ≲ √(log n). Indeed, by [61] we know that X is 2-isomorphic to a subspace of ℓ_1^k, with k ≲ n log n (for our purpose we can also use the weaker bound on k of [58]). By Hölder's inequality ℓ_1^k is O(1)-isomorphic to a subspace of ℓ_p with p = 1 + 1/log k, so the desired upper bound on N_2(X) follows from [3, 43]. We ask whether (10) can be sharpened.

Question 1.16. Is it true that for arbitrarily large n ∈ N there exists an n-dimensional subspace X of ℓ_1 with N_2(X) ≳ √(log n)?
An interesting byproduct of our quantitative analysis of Kalton's construction is that it shows for the first time that an old Lipschitz extension result of Johnson, Lindenstrauss and Benyamini [23] cannot be improved. Given ε ∈ (0, 1) and metric spaces (X, d_X) and (Y, d_Y), let e_ε(X, Y) denote the supremum of e(X, S, Y) over all subsets S ⊆ X satisfying d_X(x, y) ≥ ε · diam(S) for all distinct x, y ∈ S. In other words, we are interested in the extension of Y-valued Lipschitz functions from ε-separated subsets of X, where ε-separated means that all positive distances in the subset are at least an ε-fraction of its diameter. In [23] it was shown that for every ε ∈ (0, 1), every metric space (X, d_X) and every Banach space (Y, ‖·‖_Y) we have

e_ε(X, Y) ≲ 1/ε.   (11)

Specifically, a first proof of (11) was given by Johnson and Lindenstrauss in [23] when Y is a Hilbert space, and in the appendix of the same paper Johnson and Lindenstrauss include a different argument, subsequently found by Benyamini, establishing (11) when Y is a general Banach space. A very short proof of (11) was later found by Johnson, Lindenstrauss and Schechtman [24]. Johnson and Lindenstrauss proved [23] that e_ε(ℓ_1, ℓ_2) ≳ 1/⁴√ε. Constructions of Johnson, Lindenstrauss and Schechtman [24] and Lang [32] yield the estimate e_ε(ℓ_∞, ℓ_2) ≳ 1/√ε. Here we show that (11) is sharp up to absolute constant factors, even when X is a Hilbert space and Y is an appropriately chosen closed subspace of ℓ_1.
Theorem 1.17. There exists a closed subspace Y of ℓ_1 that satisfies e_ε(ℓ_2, Y) ≳ 1/ε for every ε ∈ (0, 1). Specifically, for every n ∈ N there exists a 5^n-dimensional subspace Y_n of ℓ_1 and a 1/⁴√n net N of the unit ball of ℓ_2^n such that e(ℓ_2^n, N, Y_n) ≳ ⁴√n.

It would be very interesting to understand those pairs of Banach spaces X, Y for which e_ε(X, Y) = o(1/ε) as ε → 0. Our interest in this natural question is partially motivated by the forthcoming work [2], where it is asked whether e_ε(ℓ_1, ℓ_1) = o(1/ε), and it is shown that a positive answer to this question would have applications to dimension reduction in ℓ_1 (e.g., it is shown in [2] that if e_ε(ℓ_1, ℓ_1) = o(1/ε) then any n-point subset of ℓ_1 embeds with distortion O(1) into some Banach space of dimension (log n)^{O(1)}). Due to Theorem 1.17, one is tempted to believe that in fact e_ε(ℓ_1, ℓ_1) ≳ 1/ε, but the present approach does not seem to shed light on this question.

2. Pisier's martingale inequality in barycentric spaces
Martingales in metric spaces have been studied for several decades; see e.g. [14,16,17,59,12]. Here we will use a natural notion of martingale in barycentric metric spaces, the main goal being to extend an important martingale inequality of Pisier [57] from the setting of uniformly convex Banach spaces to the setting of barycentric metric spaces. This inequality will be used crucially in the proof of Theorem 1.5.
Let Ω be a finite set and µ ∈ P_Ω be a probability measure such that µ(ω) > 0 for every ω ∈ Ω. Suppose that (X, d_X) is a metric space and fix a barycenter map B : P_X → X. Let F ⊆ 2^Ω be a σ-algebra. For every ω ∈ Ω let F(ω) ⊆ Ω be the unique atom of F to which ω belongs. Given an X-valued random variable Z : Ω → X, its conditional barycenter B(Z|F) : Ω → X is defined as

∀ ω ∈ Ω,   B(Z|F)(ω) def= B( (1/µ(F(ω))) ∑_{ω′∈F(ω)} µ(ω′) δ_{Z(ω′)} ).

If m ∈ N and {∅, Ω} = F_0 ⊆ F_1 ⊆ · · · ⊆ F_m ⊆ 2^Ω is a filtration, then a sequence {Z_i}_{i=0}^m of X-valued random variables is called a martingale with respect to {F_i}_{i=0}^m if B(Z_i|F_{i−1}) = Z_{i−1} for every i ∈ {1, . . . , m}. We warn that in contrast to the usual setting of martingales in Banach spaces, this definition does not necessarily imply that B(Z_i|F_j) = Z_j for every j ∈ {0, . . . , i − 2}. Nevertheless, the above notion of martingale suffices to prove the nonlinear analogue of Pisier's martingale cotype inequality that we need.
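In the model case X = R with B(µ) the mean of µ, the conditional barycenter is the ordinary conditional expectation with respect to the finite σ-algebra F. The following sketch (our own illustration) computes it for a two-atom σ-algebra and checks the tower-type identity that holds for the linear barycenter, the very identity the warning above says can fail for nonlinear barycenter maps.

```python
import numpy as np

rng = np.random.default_rng(3)
Z = rng.random(8)                     # an R-valued random variable on Omega = {0,...,7}
mu = np.full(8, 1 / 8)                # uniform probability measure on Omega
atoms = [[0, 1, 2, 3], [4, 5, 6, 7]]  # atoms of the sigma-algebra F

# B(Z|F): on each atom, the barycenter (mean) of Z restricted to that atom.
cond = np.empty(8)
for atom in atoms:
    m = sum(mu[w] * Z[w] for w in atom) / sum(mu[w] for w in atom)
    for w in atom:
        cond[w] = m                   # B(Z|F) is constant on each atom of F

# For the linear barycenter, averaging B(Z|F) over Omega recovers the overall
# barycenter of Z (tower property); this can fail for nonlinear barycenters.
assert abs((mu * cond).sum() - (mu * Z).sum()) < 1e-12
```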
3. Proof of Theorem 1.5

Throughout the remainder of this paper it will be convenient to use the following notation for Cesàro averages: given t ∈ N and A ∈ M_n(R), write 𝒜_t(A) def= (1/t) ∑_{s=1}^t A^s.
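Two properties of Cesàro averages that are used repeatedly below are that they preserve stochasticity and reversibility relative to π (since each power A^s does). A minimal numerical check, our own illustration:

```python
import numpy as np

def cesaro(A, t):
    """The Cesàro average (1/t) * sum_{s=1}^t A^s."""
    return sum(np.linalg.matrix_power(A, s) for s in range(1, t + 1)) / t

rng = np.random.default_rng(4)
S = rng.random((5, 5)); S = S + S.T
pi = S.sum(axis=1) / S.sum()
A = S / S.sum(axis=1, keepdims=True)     # stochastic, reversible relative to pi

C = cesaro(A, 7)
assert np.allclose(C.sum(axis=1), 1)                      # still stochastic
assert np.allclose(pi[:, None] * C, (pi[:, None] * C).T)  # still reversible
```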
The following simple lemma will be used in the proof of Theorem 1.5, as well as in Section 7.
Lemma 3.1. Suppose that A = (a_{ij}) ∈ M_n(R) is a stochastic matrix that is reversible relative to π ∈ ∆^{n−1}. Then for every metric space (X, d_X) and every x_1, . . . , x_n ∈ X,

Proof. By the triangle inequality, for every i, j, k ∈ {1, . . . , n} we have

Consequently,

Proof of Theorem 1.5. Fix p, K, Γ ∈ [1, ∞), a metric space (X, d_X) and a barycenter map B : P_X → X with respect to which (X, d_X) is both W_p barycentric with constant Γ and p-barycentric with constant K.
We also fix n, t ∈ N, a probability vector π ∈ ∆^{n−1} and a stochastic matrix A = (a_{ij}) that is reversible relative to π. Given x_1, . . . , x_n ∈ X our goal is to prove that there exist y_1, . . . , y_n ∈ X such that (2) is satisfied with N ≲ ΓK.
In the proof of Theorem 1.5 we may assume that the right hand side of (2) is nonzero. By restricting to the support of π we may also assume that π ∈ (0, 1)^n. Letting Π ∈ M_n(R) be given by Π_{ij} = π_j, choose ε ∈ (0, 1/2) small enough so that it suffices to prove the desired conclusion for the matrix B def= (1 − ε)A + εΠ in place of A. Since the matrix B is stochastic and reversible relative to π and none of its entries vanish, this shows that it suffices to prove Theorem 1.5 under the assumption that π_i, a_{ij} > 0 for every i, j ∈ {1, . . . , n}.
Multiplying (21) by π_ℓ and summing over ℓ ∈ {1, . . . , n}, while using the fact that A^{s−1} is stochastic and reversible relative to π, shows that

In order to bound the left hand side of (22) from below, observe that for every i, j ∈ {1, . . . , n}, condition (3) of our assumption that (X, d_X) is W_p barycentric with constant Γ implies that

Moreover, by the triangle inequality and the convexity of u → u^p on [0, ∞),

Another application of (3) shows that

Consequently, if we define

then it follows from (23), (24) and (25) that

A substitution of (27) into (22) now yields the following estimate.
By combining (28) and Lemma 3.1 we deduce that

Next, we need to bound the quantity ∑_{i=1}^n π_i d_X(x_i, y_i)^p. We first claim that for every i ∈ {1, . . . , n}, every s ∈ {0, . . . , t} and every z ∈ X we have

The proof of (30) is by induction on s. For s = 0 the desired inequality (30) holds as equality. Assuming the validity of (30) for some s ∈ {0, . . . , t − 1}, and recalling (19) and (20), observe that for every j ∈ {1, . . . , n} we have

Consequently, it follows from (4) that

thus completing the inductive verification of (30). When s = t inequality (30) becomes

Hence, for every i ∈ {1, . . . , n},

By summing (29) and (33) we conclude that the desired inequality (2) holds true with y_1, . . . , y_n chosen as in (26) and N^p = (4ΓK)^p + 1. This completes the proof of Theorem 1.5.

4. Proof of Theorem 1.10

In order to state our results we need to first introduce a small amount of notation. Given a metric space (X, d_X) and p ∈ [1, ∞), for every n ∈ N we denote by L_p^n(X) the space of all functions f : {1, . . . , n} → X, equipped with the metric

d_{L_p^n(X)}(f, g) def= ( (1/n) ∑_{i=1}^n d_X(f(i), g(i))^p )^{1/p}.

Suppose that B : P_X → X is a barycenter map. In what follows it will be convenient to use the following slight abuse of notation: for every f : {1, . . . , n} → X write B(f) def= B( (1/n) ∑_{i=1}^n δ_{f(i)} ), and denote also by B(f) the constant function in L_p^n(X) that takes this value. For a symmetric stochastic matrix A = (a_{ij}) ∈ M_n(R) define a mapping A ⊗ I_X^n : L_p^n(X) → L_p^n(X) by setting, for every i ∈ {1, . . . , n} and f : {1, . . . , n} → X,

(A ⊗ I_X^n)(f)(i) def= B( ∑_{j=1}^n a_{ij} δ_{f(j)} ).

We warn that, unlike in the setting of Banach space valued mappings, given two symmetric stochastic matrices A, B ∈ M_n(R) the composition (A ⊗ I_X^n) ∘ (B ⊗ I_X^n) need not be of the form C ⊗ I_X^n for some symmetric stochastic matrix C ∈ M_n(R), and in particular the identity (A ⊗ I_X^n) ∘ (B ⊗ I_X^n) = (AB) ⊗ I_X^n need not hold true.

Definition 4.1. Fix p ∈ [1, ∞) and a metric space (X, d_X) equipped with a barycenter map B : P_X → X.
Given T : L_p^n(X) → L_p^n(X), define λ_p(T) ∈ [0, ∞] to be the infimum over those λ ∈ (0, ∞] for which every f ∈ L_p^n(X) satisfies

d_{L_p^n(X)}(T(f), B(T(f))) ≤ λ · d_{L_p^n(X)}(f, B(f)).

Lemma 4.2 below relates the quantities γ_+(A^t, d_X^p) and λ_p(A ⊗ I_X^n). Note that it assumes that (X, d_X) is p-barycentric with constant K, but K does not appear in the conclusion (34). The reason for this is that the proof of Lemma 4.2 uses a weaker version of (4) in which the rightmost term on the left hand side of (4) is dropped, i.e., the assumption that (X, d_X) is p-barycentric with constant K is used in Lemma 4.2 only through the requirement that every x ∈ X and µ ∈ P_X satisfy d_X(x, B(µ))^p ≤ ∫_X d_X(x, y)^p dµ(y).

Lemma 4.2. Fix p, K, Γ ∈ [1, ∞) and n, t ∈ N. Suppose that (X, d_X) is a metric space that is both W_p barycentric with constant Γ and p-barycentric with constant K. Let A = (a_{ij}) ∈ M_n(R) be a symmetric stochastic matrix such that λ_p(A ⊗ I_X^n) < 1. Then

In what follows, for p, K ∈ [1, ∞) we denote by β_p(K) the unique β ∈ [1, ∞) satisfying

β = K (1 − (β − 1)^p)^{1/p}.   (35)

Observe that β_p(K) ∈ [1, min{K, 2}] and β_p(K) = 1 if and only if K = 1. We also have

max_{β∈[1,2]} min{ β, K (1 − (β − 1)^p)^{1/p} } = β_p(K).   (36)

To verify (36) note that the function β → K(1 − (β − 1)^p)^{1/p} decreases from K to 0 on [1, 2], so the maximum that appears in (36) is attained when β = K(1 − (β − 1)^p)^{1/p}, or, due to (35), when β = β_p(K). An equivalent way to state (36) is that for every a, b ∈ [0, ∞) with b ≤ a we have

min{ a + b, K (a^p − b^p)^{1/p} } ≤ β_p(K) a.   (37)

To deduce (37) from (36) simply write b = (β − 1)a for some β ∈ [1, 2].

Lemma 4.3. Fix p, K ∈ [1, ∞) and n ∈ N. Suppose that (X, d_X) is a p-barycentric metric space with constant K. Then every symmetric stochastic matrix A = (a_{ij}) ∈ M_n(R) satisfies

Assuming the validity of Lemma 4.2 and Lemma 4.3 for the moment, we now show how they imply Theorem 1.10.
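The operator A ⊗ I_X^n can be experimented with numerically. In the model case X = R with the linear barycenter B(µ) = mean(µ), applying A ⊗ I_X^n is just matrix-vector multiplication, so the composition identity (A ⊗ I_X^n) ∘ (A ⊗ I_X^n) = (A²) ⊗ I_X^n does hold; this is exactly the feature that, as warned above, can fail for genuinely nonlinear barycenter maps. A sketch (our own illustration):

```python
import numpy as np

def tensor_apply(A, f, barycenter):
    """(A tensor I)(f)(i): barycenter of the measure sum_j a_ij * delta_{f(j)}."""
    return np.array([barycenter(A[i], f) for i in range(len(f))])

# Linear barycenter on R: the weighted mean.
mean_barycenter = lambda weights, points: float(weights @ points)

# A symmetric stochastic matrix: lazy random walk on a 4-cycle.
P = np.roll(np.eye(4), 1, axis=1)             # cyclic shift permutation
A = 0.5 * np.eye(4) + 0.25 * P + 0.25 * P.T   # symmetric, rows sum to 1

rng = np.random.default_rng(5)
f = rng.random(4)                             # an element of L_p^4(R)

twice = tensor_apply(A, tensor_apply(A, f, mean_barycenter), mean_barycenter)
assert np.allclose(twice, tensor_apply(A @ A, f, mean_barycenter))
```

Replacing `mean_barycenter` by a nonlinear barycenter map (e.g. a weighted median) generally breaks the final assertion, which is why the iterates (A ⊗ I_X^n)^{[t]} below must be analyzed directly rather than reduced to powers of A.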
Proof of Theorem 1.10. We are now assuming that (X, d_X) is both W_p barycentric with constant Γ and p-barycentric with constant K = 1. Under the latter assumption the conclusion of Lemma 4.3 yields, in particular, that λ_p(A ⊗ I_X^n) < 1, so we may use Lemma 4.2 in conjunction with (38) to obtain the desired estimate.

It now remains to note the elementary inequality
which follows by considering the cases t ≤ pγ and t ≥ pγ separately.
We now proceed to prove Lemma 4.2 and Lemma 4.3. To this end, given n, t ∈ N and T : L_p^n(X) → L_p^n(X), we denote the t-fold iterate of T by T^{[t]}, i.e., T^{[t]} = T ∘ T^{[t−1]}. We also use the convention that T^{[0]} is the identity mapping. If X is a Banach space then (A ⊗ I_X^n)^{[t]} = A^t ⊗ I_X^n, but this need not hold true when X is not a Banach space. Observe that a direct iterative application of Definition 4.1 implies that λ_p(T^{[t]}) ≤ λ_p(T)^t (39).

Lemma 4.4. Fix p, K ∈ [1, ∞). Suppose that (X, d_X) is a p-barycentric metric space with constant K. Then for every n, t ∈ N, every symmetric stochastic matrix A = (a_{ij}) ∈ M_n(R) and every f ∈ L_p^n(X), (40) holds.

Proof. We will prove by induction on t that if B = (b_{ij}) ∈ M_n(R) has nonnegative entries then (41) holds. The desired inequality (40) is then the special case of (41) when B is the identity matrix. (41) holds as an equality when t = 0, so assume inductively that (41) holds true for some t ∈ N ∪ {0}. Fix i, j ∈ {1, . . . , n} and consider the probability measure µ_j ∈ P_X given by (42). Then B(µ_j) = (A ⊗ I_X^n)^{[t+1]}(f)(j). Apply (4) with µ = µ_j and x = f(i) to obtain (43) and (44), where (43) uses (42) and (44) uses the inductive hypothesis.
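As noted above, outside the Banach space setting the iterate (A ⊗ I_X^n)^{[t]} need not coincide with A^t ⊗ I_X^n. A minimal sketch of this phenomenon (our illustration, not an example from the text): take X = R with the weighted median as barycenter map — a legitimate barycenter map, since it sends each point mass δ_x to x — and a concrete symmetric stochastic A and f for which (A ⊗ I_X^3)^{[2]}(f) ≠ (A² ⊗ I_X^3)(f):

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight (in sorted order) reaches 1/2.

    Applied to a point mass (one value carrying weight 1) this returns that
    value, so the map qualifies as a barycenter map in the sense of the text.
    """
    cum = 0.0
    for v, w in sorted(zip(values, weights)):
        cum += w
        if cum >= 0.5 - 1e-9:  # small tolerance guards against float drift
            return v
    return max(values)

def tensor(A, f):
    """(A ⊗ I_X^n)(f)(i) = B(sum_j a_ij δ_{f(j)}), with B = weighted median."""
    return [weighted_median(f, row) for row in A]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[0.5, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]        # symmetric and stochastic
f = [0.0, 1.0, 10.0]

iterate_twice = tensor(A, tensor(A, f))  # (A ⊗ I)^{[2]}(f)
square_once = tensor(matmul(A, A), f)    # (A² ⊗ I)(f)
```

Here iterate_twice[0] = 0 while square_once[0] = 1, so the two maps genuinely differ; with the mean in place of the median (i.e., in the Banach space R with its linear barycenter) they would agree.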
Lemma 4.5. Fix p, Γ ∈ [1, ∞) and n ∈ N. Suppose that (X, d_X) is a W_p barycentric metric space with constant Γ. Then for every f, g ∈ L_p^n(X), (45) holds.

Proof. By (3) we have (46). By the triangle inequality in L_p^n(R) we have (47). For every i ∈ {1, . . . , n} the triangle inequality in (X, d_X) implies (48). In combination with (47) and another application of the triangle inequality in L_p^n(R), we deduce an estimate which, due to (46), implies the desired inequality (45).
Lemma 4.6. Fix p, K, Γ ∈ [1, ∞) and n, t ∈ N. Suppose that (X, d_X) is a metric space that is both W_p barycentric with constant Γ and p-barycentric with constant K. Let A ∈ M_n(R) be a symmetric stochastic matrix such that λ_p((A ⊗ I_X^n)^{[2t]}) < 1. Then the asserted estimate holds.

Proof. For every f, g : {1, . . . , n} → X we have (49). We proceed to bound each of the terms on the right hand side of (49) separately.
First, define µ_f, µ_g ∈ P_X as follows.
Proof of Lemma 4.2. The desired estimate (34) is a consequence of (39) and Lemma 4.6.
We now proceed to prove Lemma 4.3. Recalling the definition of β_p(K) in (35), we first establish the following estimate.

Lemma 4.7. Fix p, K ∈ [1, ∞). Suppose that (X, d_X) is a p-barycentric metric space with constant K. Then for every n ∈ N, every symmetric stochastic matrix A ∈ M_n(R) and every f ∈ L_p^n(X) we have (58).

Proof. By the triangle inequality in L_p^n(X) we obtain (61). Next, define ν ∈ P_X as follows. It then follows from (59) and (62) that the required intermediate bounds hold. By combining (60), (63), (64) and (65), an application of (4) to the measure µ = ν with x = B(f) yields the estimate (66). The desired estimate (58) now follows by combining (61) and (66) with (37).
Lemma 4.8. Fix p, K ∈ [1, ∞). Suppose that (X, d_X) is a p-barycentric metric space with constant K. Then for every n ∈ N, every symmetric stochastic matrix A = (a_{ij}) ∈ M_n(R) and every f ∈ L_p^n(X) we have (67).

Proof. For every i ∈ {1, . . . , n} define ν_i ∈ P_X as follows. An application of (4) with µ = ν_i and x = B(f) therefore implies (69). By averaging (69) over i ∈ {1, . . . , n} we conclude that (70) holds. Next, the definition of γ₊(A, d_X^p) implies (71). For every i ∈ {1, . . . , n}, an application of (4) with µ = (1/n) ∑_{j=1}^n δ_{f(j)} and x = (A ⊗ I_X^n)(f)(i) implies the estimate (72). By averaging (72) over i ∈ {1, . . . , n} we obtain (73). By substituting (73) into (71), and plugging the resulting estimate into (70), we arrive at a bound which simplifies to give the desired inequality (67).
Proof of Lemma 4.3. Simply combine Lemma 4.7 and Lemma 4.8.

5. Proof of Theorem 1.11
Lemma 5.1 below plays an important role in our proof of Theorem 1.11. It was proved by the second named author in collaboration with M. Csörnyei (2001); we thank her for letting us include it here.
Lemma 5.1. Fix m, n ∈ N and p ∈ [1, ∞). Let B = (b_{ij}) ∈ M_{n×m}(R) and C = (c_{ij}) ∈ M_n(R) be stochastic matrices (of dimensions n by m and n by n, respectively). Fix π ∈ ∆^{n−1} and suppose that C is reversible relative to π. Then for every metric space (X, d_X) and every z_1, . . . , z_m ∈ X there exist w_1, . . . , w_n ∈ X such that (74) holds.

Proof. Let f : {z_1, . . . , z_m} → ℓ_∞^m be an isometric embedding (e.g. one can take f(z) = ∑_{r=1}^m d_X(z, z_r) e_r, where {e_r}_{r=1}^m is the standard basis of R^m). Define y_1, . . . , y_n ∈ ℓ_∞^m as in (76). Next, for every i ∈ {1, . . . , n} choose w_i ∈ {z_1, . . . , z_m} such that (77) holds. By the triangle inequality we obtain a first estimate. Consequently, using the stochasticity of C and its reversibility relative to π, we obtain (78). Recalling (76), the convexity of the function v ↦ ‖v‖_∞^p implies (79). Next, due to (77) and the fact that CB is a stochastic matrix, recalling (76) and using the convexity of the function v ↦ ‖v‖_∞^p once more, we deduce (80). A combination of (78), (79) and (80) now implies (81). Next, note that by the triangle inequality, for every i, j ∈ {1, . . . , n} and r ∈ {1, . . . , m} we have (82). By multiplying inequality (82) by π_i b_{ir} c_{ij}, summing over r ∈ {1, . . . , m} and i, j ∈ {1, . . . , n}, and using the stochasticity of B and C, we deduce (83). Recalling (76) and using the convexity of the function v ↦ ‖v‖_∞^p, we have (84). A combination of (83) with (79), (80), and (84) yields an estimate which, due to (81), yields the desired estimate (74).
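The embedding used at the start of the proof, f(z) = ∑_{r=1}^m d_X(z, z_r) e_r, is the classical Fréchet embedding of an m-point metric space into ℓ_∞^m: it is isometric on {z_1, . . . , z_m} because max_r |d_X(z_a, z_r) − d_X(z_b, z_r)| is at most d_X(z_a, z_b) by the triangle inequality and equals it at r = b. A quick numerical sanity check (the four-point metric space below is our arbitrary choice):

```python
def frechet_embed(D):
    """Given the distance matrix D of points z_1,...,z_m, return the Frechet
    embeddings f(z_i) = (d(z_i, z_1), ..., d(z_i, z_m)) as vectors in l_inf^m."""
    return [list(row) for row in D]

def linf_dist(u, v):
    """Distance in l_inf^m."""
    return max(abs(a - b) for a, b in zip(u, v))

# Distance matrix of an arbitrary 4-point metric space (points 0, 1, 3, 7 on
# the real line, with d(x, y) = |x - y|).
pts = [0.0, 1.0, 3.0, 7.0]
D = [[abs(x - y) for y in pts] for x in pts]
emb = frechet_embed(D)
```

For every pair of points, the ℓ_∞ distance between their images reproduces the original distance exactly.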
The following lemma is a natural variant of [3, Lem. 1.1].
Proof. We proceed via the following duality argument due to Ball [3] (which is itself inspired by the work of Maurey [40]), with a slight twist that brings in the assumption that Y is W p barycentric.
Consider the following set of n by n symmetric matrices.
Let D ⊆ M n (R) be the set of all n by n symmetric matrices with nonnegative entries and define E def = conv (C + D).
For every i, j ∈ {1, . . . , n} define t_{ij} as follows. The assumption of Lemma 5.2 can be rephrased as (85). It follows that the matrix T = (t_{ij}) belongs to E, since otherwise, by the separation theorem (Hahn–Banach), there would exist a symmetric matrix H ∈ M_n(R) satisfying (86). Since E ⊇ C + D it follows from (86) that the entries of H are nonnegative, i.e., H ∈ D. Now (86) contradicts (85) since E ⊇ C.
Proof of Theorem 1.11. Fix m, n ∈ N. Take x_1, . . . , x_n ∈ X ∖ Z and z_1, . . . , z_m ∈ Z. If H = (h_{ij}) ∈ M_{n+m}(R) is symmetric then write H in block form, where U(H) ∈ M_m(R), V(H) ∈ M_n(R) and W(H) ∈ M_{n×m}(R). With this notation, define R_H and, for every y_1, . . . , y_n ∈ Y, the quantity L_H(y_1, . . . , y_n). Fix from now on M ∈ (M_p(X), 2M_p(X)] and N ∈ (N_p(Y), 2N_p(Y)]. Due to Lemma 5.2 it suffices to show that for every symmetric matrix H = (h_{ij}) ∈ M_{n+m}(R) with nonnegative entries and for every δ ∈ (0, 1) one can find y_1, . . . , y_n ∈ Y such that L_H(y_1, . . . , y_n) ≤ Λ(R_H + δ). Since the Lipschitz condition of f on {z_1, . . . , z_m} controls the remaining terms, it suffices to establish the existence of y_1, . . . , y_n ∈ Y that satisfy the inequality (89). Fix t ∈ N and ε ∈ (0, 1) that will be determined later. Note that the diagonal entries {V(H)_{ii}}_{i=1}^n are irrelevant for the validity of (89), so we may assume from now on that V(H)_{ii} = 0 for all i ∈ {1, . . . , n}.
Define π ∈ R^n as in (90); thus π ∈ ∆^{n−1}. Next, define B = (b_{ir}) ∈ M_{n×m}(R) by setting, for every i ∈ {1, . . . , n} and r ∈ {1, . . . , m}, the entries as in (91). Thus B is a stochastic matrix. Finally, define A = (a_{ij}) ∈ M_n(R) by setting the diagonal entries as in (92) and, for distinct i, j ∈ {1, . . . , n}, the off-diagonal entries as in (93). The role of ε is only to ensure that the denominators that appear in (90), (91), (92) and (93) do not vanish. An inspection of the ensuing argument reveals that there is flexibility in the choice of the normalizing factors in (92) and (93); the choices above were made in order to simplify some expressions in what follows. Fixing ε, we will assume from now on that t is sufficiently large so as to ensure that a_{11}, . . . , a_{nn} are all nonnegative. Thus A is a stochastic matrix. Note also that, since V(H) is symmetric, an inspection of (90) and (93) reveals that A is reversible relative to π. We will now show that the points y_1, . . . , y_n thus found satisfy the desired inequality (89).
For every i, j ∈ {1, . . . , n} and r, s ∈ {1, . . . , m} we have (100). Consequently, (102) holds, where, using the stochasticity of A and B, we obtain (104); using the stochasticity of A and B and the fact that X has Markov type p with M > M_p(X), we obtain (105); and, using the stochasticity of A and B and the reversibility of A_τ(A) relative to π, we obtain (106). By substituting (104), (105) and (106) into (103), and combining the resulting estimate with (102) and (100) while recalling the definition of Λ in (88), we arrive at the desired bound, where we used the identities (98) and (99). Choosing ε ∈ (0, 1) so that 2ε ∑_{i=1}^n ∑_{r=1}^m d_X(x_i, z_r)^p ≤ δ yields the desired estimate (89).
6. Proof of Theorem 1.14 and Theorem 1.17

Both Theorem 1.14 and Theorem 1.17 rely on a quantitative variant of a beautiful construction of Kalton [27, 29]. Before passing to the construction itself, we record some basic facts on metric Markov cotype.

Lemma 6.1. Fix p ∈ (0, ∞) and let (X, d_X) be a metric space with metric Markov cotype p. Suppose that S ⊆ X is a Lipschitz retract of X, i.e., there exists a Lipschitz mapping ρ : X → S such that ρ(s) = s for every s ∈ S. Then S also has metric Markov cotype p, and in fact quantitatively so.

Proof. Fix N > N_p(X) and n, t ∈ N. Suppose that A = (a_{ij}) ∈ M_n(R) is a stochastic matrix that is reversible relative to π ∈ ∆^{n−1}. For every x_1, . . . , x_n ∈ S there exist y_1, . . . , y_n ∈ X such that the inequality in the definition of metric Markov cotype holds with constant N. Then, since ρ(x_i) = x_i for every i ∈ {1, . . . , n}, the points ρ(y_1), . . . , ρ(y_n) ∈ S satisfy the required inequality.

We next show that the real line R (equipped with the standard metric) fails to have metric Markov cotype p for any p ∈ (0, 2). Lemma 6.2 below contains a simple explicit example that exhibits this fact, but when p ∈ [1, 2) there is also a roundabout way to see that R fails to have metric Markov cotype p, via the link to Lipschitz extension. Indeed, it follows from the definition of metric Markov cotype that N_p(ℓ_p) = N_p(R). Ball proved [3] that the Markov type p constant of ℓ_p satisfies M_p(ℓ_p) = 1 for p ∈ [1, 2). Corollary 1.13 therefore implies that e(ℓ_p, ℓ_p) ≲ N_p(R). But it is known that e(ℓ_p, ℓ_p) = ∞ for every p ∈ [1, 2): for p ∈ (1, 2) see [48], and for p = 1 it is observed in [39] that this follows from [8] or [19] ([39] also provides an interesting third proof of the fact that e(ℓ_1, ℓ_1) = ∞).

Lemma 6.2. For every p ∈ (0, 2) the real line R (equipped with the standard metric) fails to have metric Markov cotype p.
Proof. Fix p ∈ (0, 2) and suppose for the sake of obtaining a contradiction that N_p(R) < ∞. Fixing n ∈ N, define A = (a_{ij}) ∈ M_n(R) by a_{11} = a_{nn} = 1/2 and a_{i,i+1} = a_{i+1,i} = 1/2 for every i ∈ {1, . . . , n − 1}; the remaining entries of A vanish. Thus A is a symmetric stochastic matrix, corresponding to the standard random walk on {1, . . . , n} in which, if the walker is at either 1 or n, then with probability 1/2 it does nothing in the next step and with probability 1/2 it moves in the next step to its unique neighbor in {2, n − 1}. Let {W_0, W_1, W_2, . . .} denote this walk, i.e., W_0 is uniformly distributed on {1, . . . , n} and, conditioned on W_t = i, the probability that W_{t+1} = j equals a_{ij}. Thus, for every t ∈ N we have (107). To justify the final inequality in (107), proceed by induction on t as follows. Since |W_{t+1} − W_t| ≤ 1 point-wise, for the induction step it suffices to show that for every t ∈ N we have E[(W_t − W_0)(W_{t+1} − W_t)] ≥ 0. By conditioning on W_0, W_t, it suffices to check the point-wise inequality (108), which is easy to verify. Due to (107), for every t ∈ N we have (109). The definition of metric Markov cotype p therefore implies that there exist y_1, . . . , y_n ∈ R such that (110) holds. Suppose first that p ∈ [1, 2). In this case we choose t as follows. Using Hölder's inequality and (109) we deduce the corresponding bound. Consequently, for every i ∈ {2, . . . , n} we have (111). Inequality (111) implies that if y_1 ≤ n/2 then y_i ≤ 3n/4 for every i ∈ {1, . . . , n}, and if y_1 ≥ n/2 then y_i ≥ n/4 for every i ∈ {1, . . . , n}. Hence (112) holds. By substituting (112) into (109) and recalling (110) we arrive at an estimate which is a contradiction for large enough n. It remains to deal with the case p ∈ (0, 1]. Now our choice of t is different. Observe that since p ∈ (0, 1], the analogous estimate holds for every i ∈ {2, . . . , n}.

Define a mapping f_θ^n : A_n → Y_θ^n as follows, and observe that f_θ^n(−a) = −f_θ^n(a) for every a ∈ A_n.

Lemma 6.4. For every n ∈ N, θ ∈ (0, 1] and τ ∈ [θ, 1] the corresponding comparison holds.

Lemma 6.5. Suppose that F : S^{n−1} → Y_θ^n satisfies (117) and that F(a) = f_θ^n(a) for every a ∈ A_n. Then L ≥ n^{θ/4}.
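For the walk appearing in the proof of Lemma 6.2 above, the near-linear growth of E[(W_t − W_0)²] asserted in (107) can be checked exactly by propagating the conditional distribution of the walk through its transition matrix. In the sketch below the parameters n = 50 and t ≤ 10, as well as the lower bound t/2 in the check, are our illustrative choices; the linear growth is only meaningful while t is much smaller than n²:

```python
def lazy_path_matrix(n):
    """Transition matrix of the walk on {1,...,n}: an interior state moves to
    each neighbor with probability 1/2; the endpoints 1 and n stay put with
    probability 1/2 and move to their unique neighbor with probability 1/2."""
    A = [[0.0] * n for _ in range(n)]
    A[0][0] = A[n - 1][n - 1] = 0.5
    for i in range(n - 1):
        A[i][i + 1] = A[i + 1][i] = 0.5
    return A

def second_moments(n, T):
    """Exact values of E[(W_t - W_0)^2] for t = 1..T, W_0 uniform on {1,...,n}."""
    A = lazy_path_matrix(n)
    out = [0.0] * T
    for i in range(n):                 # condition on the starting state W_0
        dist = [0.0] * n
        dist[i] = 1.0
        for t in range(T):             # one step of the walk: dist <- dist A
            dist = [sum(dist[k] * A[k][j] for k in range(n)) for j in range(n)]
            out[t] += sum(dist[j] * (j - i) ** 2 for j in range(n)) / n
    return out
```

In the diffusive regime t ≪ n² the computed second moments grow roughly like t, which is the behavior the proof exploits before the boundary is felt.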
Proof. By replacing F(x) with (F(x) − F(−x))/2 we may assume without loss of generality that F(−x) = −F(x) for every x ∈ S^{n−1}. Since F takes values in Y_θ^n, it follows from (115) that there exists a mapping ψ : S^{n−1} → ℓ_1^{5^n} such that (118) holds. Let σ_{n−1} denote the normalized Haar measure on S^{n−1}. We claim that for every y ∈ B_∞^{5^n} = [−1, 1]^{5^n} we have (119). The proof of (119) is a standard application of the concentration of measure phenomenon on S^{n−1}. Indeed, consider the set U_y. Since ψ(−x) = −ψ(x) for every x ∈ S^{n−1} we have σ_{n−1}(U_y) ≥ 1/2. For x ∈ S^{n−1} and t ∈ (0, ∞) note that (120) holds. Indeed, if ⟨ψ(x), y⟩ ≥ t then for every u ∈ U_y, since ⟨ψ(u), y⟩ ≤ 0, we have t ≤ L n^{τ/4} · ‖x − u‖_2^τ, implying (120). By the isoperimetric inequality on S^{n−1} (see e.g. [46]) it follows from (120) that (121) holds, where c ∈ (0, ∞) is a universal constant. By symmetry, the same estimate holds true for σ_{n−1}({x ∈ S^{n−1} : ⟨ψ(x), y⟩ ≤ −t}), therefore completing the proof of (119). The Pietsch Domination Theorem [56] (see also [37, Prop. 3.1]) implies that there exists a Borel probability measure µ on B_∞^{5^n} such that for every x ∈ ℓ_1^{5^n} the norm of Q_n(x) is at most 2π_1(Q_n) times the µ-average of |⟨x, y⟩|, where π_1(Q_n) is the 1-summing norm of Q_n. For x ∈ S^{n−1} choose a ∈ A_n such that ‖x − a‖_2 ≤ 1/n^{1/4}. Recalling (116) and (118), since F(a) = f_θ^n(a) we have Q_n(ψ(a)) = a. Hence the quantity in question is at least 1 − L n^{(τ−θ)/4}/n^{τ/4} = 1 − L/n^{θ/4}. (123) By combining (122) and (123), and recalling that τ ≥ θ, the proof of Lemma 6.5 is complete.
(124)

Lemma 6.4 with θ = 2/n and τ = 2/p asserts the corresponding bound for the function f_{2/n}^{2^{n²}}. In light of (124), since Y_{2/n}^{2^{n²}} is finite dimensional, it follows from Corollary 1.13 that there exists a function F : ℓ_2^{2^{n²}} → Y_{2/n}^{2^{n²}} that extends f_{2/n}^{2^{n²}} and satisfies (117) with θ = 2/n, τ = 2/p and L bounded by a constant multiple of N_p(Y_{2/n}^{2^{n²}}). We therefore deduce from Lemma 6.5 that (125) holds. For every n ∈ N the restriction to the nth coordinate is a 1-Lipschitz retraction from Y onto Y_{2/n}^{2^{n²}}, so by Lemma 6.1 we have (126). By Corollary 6.3 we also have N_p(Y) = ∞ for p ∈ (0, 2), so in order to complete the proof of Theorem 1.14 it remains to show that Y is isomorphic to a subspace of ℓ_1. Since Y is the ℓ_1 direct sum of the spaces Y_{2/n}^{2^{n²}} ⊆ ℓ_1^{5^{2^{n²}}} ⊕ ℓ_2^{2^{n²}}, it remains to recall that ℓ_2^k is (1 + ε)-isomorphic to a subspace of ℓ_1 for every ε ∈ (0, 1) and k ∈ N (e.g. by Dvoretzky's theorem [15]).
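The last step invokes the fact that ℓ_2^k admits a (1 + ε)-isomorphic embedding into ℓ_1. One standard concrete route (our choice of illustration; the text itself cites Dvoretzky's theorem) uses Gaussian averaging: for a standard Gaussian vector g in R^k one has E|⟨g, x⟩| = √(2/π)·‖x‖_2, so x ↦ (⟨g_i, x⟩)_{i=1}^N, suitably rescaled, maps into ℓ_1^N with distortion tending to 1 as N grows. A Monte Carlo sketch:

```python
import math
import random

def l1_embedding_ratio(x, N, seed=0):
    """Estimate sqrt(pi/2) * (1/N) * sum_i |<g_i, x>| / ||x||_2 for i.i.d.
    standard Gaussian vectors g_1, ..., g_N; the ratio should be close to 1,
    reflecting the identity E|<g, x>| = sqrt(2/pi) * ||x||_2."""
    rng = random.Random(seed)
    k = len(x)
    norm = math.sqrt(sum(v * v for v in x))
    acc = 0.0
    for _ in range(N):
        g = [rng.gauss(0.0, 1.0) for _ in range(k)]
        acc += abs(sum(gi * xi for gi, xi in zip(g, x)))
    return math.sqrt(math.pi / 2.0) * (acc / N) / norm
```

For moderate sample sizes the ratio is already within a few percent of 1, uniformly over the direction of x, which is exactly the low-distortion phenomenon being used.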
Proof of Theorem 1.17. By Lemma 6.4 and Lemma 6.5 with θ = τ = 1, we obtain a lower bound on e(S^{n−1}, A_n, Y^n). Since the diameter of A_n is at most 2 and the minimal nonzero distance in A_n is at least 1/n^{1/4}, the proof of Theorem 1.17 is complete.

7. Comparison with Ball's approach
Fix n ∈ N, t ∈ [1, ∞) and let A ∈ M n (R) be a stochastic matrix that is reversible with respect to π ∈ ∆ n−1 . Since A has norm 1 when viewed as an operator on L 2 (π), we can consider the following matrix.
We also denote the corresponding Green's matrix by B_t(A), where I_n ∈ M_n(R) denotes the identity matrix.
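The displayed formulas for the averaging matrix and the Green's matrix did not survive here; the standard choices in this line of work (an assumption on our part) are the Cesàro average A_t(A) = (1/t) ∑_{s=1}^t A^s and B_t(A) = ((t+1)I_n − tA)^{−1} = ∑_{s≥0} t^s/(t+1)^{s+1} A^s, the inverse existing precisely because tA/(t+1) has norm strictly less than 1 on L_2(π). Under those assumed forms, the Neumann series identity and the entrywise comparison between the two matrices (cf. the discussion of (127) below) can be checked numerically:

```python
import random

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B, ca=1.0, cb=1.0):
    n = len(A)
    return [[ca * A[i][j] + cb * B[i][j] for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def random_symmetric_stochastic(n, seed=0):
    """Symmetric stochastic matrix: small random symmetric off-diagonal
    entries, diagonal chosen to make every row sum to 1."""
    rng = random.Random(seed)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = rng.random() / n
    for i in range(n):
        A[i][i] = 1.0 - sum(A[i][j] for j in range(n) if j != i)
    return A

def cesaro(A, t):
    """A_t(A) = (1/t) * (A + A^2 + ... + A^t)  [assumed form]."""
    n = len(A)
    P, S = identity(n), [[0.0] * n for _ in range(n)]
    for _ in range(t):
        P = mat_mul(P, A)
        S = mat_add(S, P)
    return [[S[i][j] / t for j in range(n)] for i in range(n)]

def green_series(A, t, terms=400):
    """Partial sum of sum_{s>=0} t^s/(t+1)^{s+1} A^s, the Neumann series of
    ((t+1)I - tA)^{-1}  [assumed form]; the tail decays like (t/(t+1))^s."""
    n = len(A)
    P = identity(n)
    S = [[P[i][j] / (t + 1) for j in range(n)] for i in range(n)]
    coeff = 1.0 / (t + 1)
    for _ in range(terms):
        P = mat_mul(P, A)
        coeff *= t / (t + 1.0)
        S = mat_add(S, P, 1.0, coeff)
    return S
```

Since every coefficient t^s/(t+1)^{s+1} with 1 ≤ s ≤ t is at least 1/(e(t+1)), the Cesàro average is entrywise dominated by e(t+1)/t times the Green's matrix (≈ 3.63 for t = 3), matching the kind of comparison invoked below.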
In [3] Ball worked with the following linear invariant of Banach spaces. For p ∈ (0, ∞) say that a Banach space (X, ‖·‖_X) has Markov cotype p with constant N ∈ (0, ∞) if for every n ∈ N and t ∈ [1, ∞), whenever A = (a_{ij}) ∈ M_n(R) is a symmetric stochastic matrix that is reversible relative to π ∈ ∆^{n−1} and x_1, . . . , x_n ∈ X, the corresponding inequality holds. As we shall see shortly, in Banach spaces Markov cotype p implies metric Markov cotype p as in Definition 2, but their equivalence remains open. Note that Ball proved in [3] that ℓ_1 fails to have Markov cotype 2, so this question has relevance to Question 1.15.
In the closing remarks of his paper [3], Ball proposed the following two-step definition of metric Markov cotype for metric spaces. First, given p ∈ [1, ∞) say that a metric space (X, d_X) is p-approximately convex if there exists K ∈ (0, ∞) with the following property. Fix m, n ∈ N and let B = (b_{ij}) ∈ M_{n×m}(R) and C = (c_{ij}) ∈ M_n(R) be stochastic matrices such that C is reversible relative to π ∈ ∆^{n−1}. Then for every z_1, . . . , z_m ∈ X there exist w_1, . . . , w_n ∈ X such that (128) holds, where D_π ∈ M_n(R) is given as in (75), i.e., it is the diagonal matrix whose diagonal equals π. Assuming that (X, d_X) is approximately convex, Ball defined it to have metric Markov cotype p if there exists N ∈ (0, ∞) such that for every n, t ∈ N, if A = (a_{ij}) ∈ M_n(R) is stochastic and reversible relative to π ∈ ∆^{n−1}, then for every x_1, . . . , x_n ∈ X there exist y_1, . . . , y_n ∈ X such that (129) holds. Denote by N_p^B(X) the infimum over those N ∈ (0, ∞) for which (129) holds true. By Lemma 5.1 every metric space is p-approximately convex (with K in (128) at most 6). So, the first step of Ball's definition is not needed. Observe also that for Banach spaces Markov cotype p trivially implies (129). Moreover, there is an immediate link between (129) and (2): due to (127) we have A_t(A)_{ij} ≲ B_t(A)_{ij} for every integer t ≥ 2 and i, j ∈ {1, . . . , n}. Therefore every metric space (X, d_X) satisfies N_p^B(X) ≲ N_p(X). Despite the fact that one cannot bound from above (entry-wise) the matrix B_t(A) by a constant multiple of the matrix A_t(A), the following lemma implies that N_p(X) ≲ N_p^B(X).
Lemma 7.1. Let (X, d_X) be a metric space. Suppose that n, t ∈ N and A ∈ M_n(R) is a stochastic matrix that is reversible relative to π ∈ ∆^{n−1}. Then for every p ∈ [1, ∞) and x_1, . . . , x_n ∈ X we have (130).

Proof. Write u def= ⌈pt⌉ and note that for every m ∈ N we have

∑_{i=1}^n ∑_{j=1}^n π_i (A^{mu})_{ij} d_X(x_i, x_j)^p = ∑_{i ∈ {1,...,n}^{m+1}} π_{i_1} (A^u)_{i_1 i_2} ⋯ (A^u)_{i_m i_{m+1}} d_X(x_{i_1}, x_{i_{m+1}})^p,

where in (131) we used the triangle inequality and Hölder's inequality, and in (132) we used Lemma 3.1.