Comparison of metric spectral gaps

Let $A=(a_{ij})\in M_n(\R)$ be an $n$ by $n$ symmetric stochastic matrix. For $p\in [1,\infty)$ and a metric space $(X,d_X)$, let $\gamma(A,d_X^p)$ be the infimum over those $\gamma\in (0,\infty]$ for which every $x_1,...,x_n\in X$ satisfy $$ \frac{1}{n^2} \sum_{i=1}^n\sum_{j=1}^n d_X(x_i,x_j)^p\le \frac{\gamma}{n}\sum_{i=1}^n\sum_{j=1}^n a_{ij} d_X(x_i,x_j)^p. $$ Thus $\gamma(A,d_X^p)$ measures the magnitude of the {\em nonlinear spectral gap} of the matrix $A$ with respect to the kernel $d_X^p:X\times X\to [0,\infty)$. We study pairs of metric spaces $(X,d_X)$ and $(Y,d_Y)$ for which there exists $\Psi:(0,\infty)\to (0,\infty)$ such that $\gamma(A,d_X^p)\le \Psi(\gamma(A,d_Y^p))$ for every symmetric stochastic $A\in M_n(\R)$ with $\gamma(A,d_Y^p)<\infty$. When $\Psi$ is linear a complete geometric characterization is obtained. Our estimates on nonlinear spectral gaps yield new embeddability results as well as new nonembeddability results. For example, it is shown that if $n\in \N$ and $p\in (2,\infty)$ then for every $f_1,...,f_n\in L_p$ there exist $x_1,...,x_n\in L_2$ such that \begin{equation}\label{eq:p factor} \forall\, i,j\in \{1,...,n\},\quad \|x_i-x_j\|_2\lesssim p\|f_i-f_j\|_p, \end{equation} and $$ \sum_{i=1}^n\sum_{j=1}^n \|x_i-x_j\|_2^2=\sum_{i=1}^n\sum_{j=1}^n \|f_i-f_j\|_p^2. $$ This statement is impossible for $p\in [1,2)$, and the asymptotic dependence on $p$ in \eqref{eq:p factor} is sharp. We also obtain the best known lower bound on the $L_p$ distortion of Ramanujan graphs, improving over the work of Matou\v{s}ek. Links to Bourgain--Milman--Wolfson type and a conjectural nonlinear Maurey--Pisier theorem are studied.

Let A = (a_ij) ∈ M_n(R) be an n by n symmetric stochastic matrix. Being symmetric, A has real eigenvalues, which we arrange as 1 = λ_1(A) ≥ λ_2(A) ≥ . . . ≥ λ_n(A) ≥ −1, and we set

λ(A) def= max{λ_2(A), −λ_n(A)}. (2)

For p ∈ [1, ∞) and a metric space (X, d_X), we denote by γ(A, d_X^p) the infimum over those γ ∈ (0, ∞] for which every x_1, . . . , x_n ∈ X satisfy

(1/n²) Σ_{i=1}^n Σ_{j=1}^n d_X(x_i, x_j)^p ≤ (γ/n) Σ_{i=1}^n Σ_{j=1}^n a_ij d_X(x_i, x_j)^p. (3)

Thus for p = 2 and X = (R, d_R), where we denote d_R(x, y) def= |x − y| for every x, y ∈ R, we have γ(A, d_R^2) = 1/(1 − λ_2(A)). For this reason one thinks of γ(A, d_X^p) as measuring the magnitude of the nonlinear spectral gap of the matrix A with respect to the kernel d_X^p : X × X → [0, ∞). See [MN12] for more information on the topic of nonlinear spectral gaps.
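For p = 2 and X = R this can be checked numerically: writing both sides of (3) as quadratic forms gives (1/n²)Σ_{i,j}(x_i − x_j)² = (2/n)x^T(I − J/n)x and (1/n)Σ_{i,j} a_ij (x_i − x_j)² = (2/n)x^T(I − A)x, where J is the all-ones matrix, so γ(A, d_R^2) = 1/(1 − λ_2(A)). A minimal numpy sketch (the lazy random walk on the cycle below is merely an arbitrary choice of symmetric stochastic matrix):

```python
import numpy as np

n = 8
# A symmetric stochastic matrix: the lazy random walk on the n-cycle.
S = np.roll(np.eye(n), 1, axis=1)          # cyclic shift
A = 0.5 * np.eye(n) + 0.25 * (S + S.T)

eigvals, eigvecs = np.linalg.eigh(A)       # ascending order
lam2 = eigvals[-2]                         # second-largest eigenvalue

def poincare_ratio(x):
    """LHS/RHS of (3) for p = 2, X = R; equals x^T(I-J/n)x / x^T(I-A)x."""
    lhs = np.mean((x[:, None] - x[None, :]) ** 2)                  # (1/n^2) sum (x_i-x_j)^2
    rhs = np.einsum('ij,ij->', A, (x[:, None] - x[None, :]) ** 2) / n
    return lhs / rhs

# Random non-constant vectors never exceed 1/(1 - lam2) ...
rng = np.random.default_rng(0)
ratios = [poincare_ratio(rng.standard_normal(n)) for _ in range(1000)]
gamma = 1.0 / (1.0 - lam2)
assert max(ratios) <= gamma + 1e-9

# ... and an eigenvector for lam2 attains it, so gamma(A, d_R^2) = 1/(1 - lam2).
v2 = eigvecs[:, -2]
assert abs(poincare_ratio(v2) - gamma) < 1e-9
```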
Suppose that (X, d_X) is a metric space with |X| ≥ 2 and fix distinct points a, b ∈ X. Fix also p ∈ [1, ∞), n ∈ N, and an n by n symmetric stochastic matrix A = (a_ij). By considering x_1, . . . , x_n ∈ {a, b} in (3), one obtains a two-point (isoperimetric-type) restriction of the Poincaré inequality. It therefore follows from Cheeger's inequality [Che70] (in our context, see [JS88] and [LS88]) that any finite upper bound on γ(A, d_X^p) immediately implies a spectral gap estimate for the matrix A. The ensuing discussion always tacitly assumes that metric spaces contain at least two points.
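To make the two-point restriction concrete (a sketch, not the paper's argument verbatim): taking x_i = a for i in a subset S and x_i = b otherwise, both sides of (3) acquire the common factor d_X(a, b)^p, and the inequality collapses to |S|(n − |S|)/n ≤ γ Σ_{i∈S, j∉S} a_ij, a cut-versus-volume bound of Cheeger type. For the real line, where γ(A, d_R^2) = 1/(1 − λ_2(A)), this can be verified over all cuts of a small example:

```python
import numpy as np
from itertools import combinations

n = 8
S_shift = np.roll(np.eye(n), 1, axis=1)
A = 0.5 * np.eye(n) + 0.25 * (S_shift + S_shift.T)   # lazy walk on the n-cycle

lam2 = np.linalg.eigvalsh(A)[-2]
gamma = 1.0 / (1.0 - lam2)          # = gamma(A, d_R^2)

# Two-point restriction of (3): for every subset S of {0, ..., n-1},
#   |S| (n - |S|) / n  <=  gamma * sum_{i in S, j not in S} a_ij.
for k in range(1, n):
    for subset in combinations(range(n), k):
        mask = np.zeros(n, dtype=bool)
        mask[list(subset)] = True
        cut = A[np.ix_(mask, ~mask)].sum()
        assert k * (n - k) / n <= gamma * cut + 1e-9
```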
In (5), and in what follows, the notations U ≲ V and V ≳ U mean that U ≤ CV for some universal constant C ∈ (0, ∞). If we need to allow C to depend on parameters, we indicate this by subscripts; thus, e.g., U ≲_β V means that U ≤ C(β)V for some C(β) ∈ (0, ∞) which is allowed to depend only on the parameter β. The notation U ≍ V stands for (U ≲ V) ∧ (V ≲ U), and correspondingly the notation U ≍_β V stands for (U ≲_β V) ∧ (V ≲_β U).
A simple application of the triangle inequality (see [MN12, Lem. 2.1]) shows that γ(A, d_X^p) is finite if and only if λ_2(A) < 1 (equivalently, the matrix A is ergodic, or the graph on {1, . . . , n} whose edges are the pairs {i, j} for which a_ij > 0 is connected).
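This equivalence is easy to confirm computationally: for a symmetric stochastic matrix, the multiplicity of the eigenvalue 1 equals the number of connected components of the support graph, so λ_2(A) = 1 exactly in the disconnected case. A quick check (the matrices below are ad-hoc examples):

```python
import numpy as np

def lam2(A):
    """Second-largest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(A)[-2]

# Connected support graph: lazy walk on the 6-cycle.
n = 6
S = np.roll(np.eye(n), 1, axis=1)
A_conn = 0.5 * np.eye(n) + 0.25 * (S + S.T)

# Disconnected support graph: two disjoint copies of the lazy 3-cycle walk.
m = 3
T = np.roll(np.eye(m), 1, axis=1)
B = 0.5 * np.eye(m) + 0.25 * (T + T.T)
A_disc = np.block([[B, np.zeros((m, m))], [np.zeros((m, m)), B]])

assert lam2(A_conn) < 1 - 1e-9        # ergodic: gamma(A, d_X^p) is finite
assert abs(lam2(A_disc) - 1) < 1e-9   # disconnected: eigenvalue 1 is repeated
```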
It is often quite difficult to obtain good estimates on nonlinear spectral gaps. This difficulty is exemplified by several problems in metric geometry that can be cast as estimates on nonlinear spectral gaps: see [Laf08, Laf09, MN12, MN13a, Lia13, dlS13] for some specific examples, as well as the ensuing discussion on nonlinear type. In this general direction, here we investigate the following basic "meta-problem."

Question 1.1 (Comparison of nonlinear spectral gaps). Given p ∈ [1, ∞), characterize those pairs of metric spaces (X, d_X) and (Y, d_Y) for which there exists an increasing function Ψ = Ψ_{X,Y} : (0, ∞) → (0, ∞) such that for every n ∈ N and every ergodic symmetric stochastic A ∈ M_n(R) we have

γ(A, d_X^p) ≤ Ψ(γ(A, d_Y^p)). (6)

The case p = 2 and (Y, d_Y) = (R, d_R) of Question 1.1 is especially important, so we explicitly single it out as follows. See also the closely related question that Pisier posed as Problem 3.1 in [Pis10].
Question 1.2 (Bounding nonlinear gaps by linear gaps). Characterize those metric spaces (X, d_X) for which there exists an increasing function Ψ = Ψ_X : (0, ∞) → (0, ∞) such that for every n ∈ N and every ergodic symmetric stochastic A ∈ M_n(R) we have

γ(A, d_X^2) ≤ Ψ(1/(1 − λ_2(A))). (7)

Question 1.1 and Question 1.2 seem to be difficult, and they might not have a useful simple-to-state answer. As an indication of this, in [MN13a] it is shown that there exists a CAT(0) metric space (X, d_X), and for each k ∈ N there exist n_k ∈ N with lim_{k→∞} n_k = ∞, such that there are symmetric stochastic matrices A_k, B_k ∈ M_{n_k}(R) with the properties exhibited in [MN13a]. Such questions are difficult even in the setting of Banach spaces: a well-known open question (see e.g. [Pis10]) asks whether (7) holds true when X is a uniformly convex Banach space. If true, this would yield a remarkable "linear to nonlinear transference principle for spectral gaps," establishing in particular the existence of super-expanders (see [MN12]) with logarithmic girth, a result which could then be used in conjunction with Gromov's random group construction [Gro03] to rule out the success of an approach to the Novikov conjecture that was discovered by Kasparov and Yu [KY06]. There is little evidence, however, that every uniformly convex Banach space admits an inequality such as (7), and we suspect that (7) fails for some uniformly convex Banach spaces.

When the function Ψ appearing in (6) can be taken to be linear, i.e., Ψ(t) = Kt for some K ∈ (0, ∞), Theorem 1.3 below provides the following geometric answer to Question 1.1. Given p ∈ [1, ∞) and a metric space (Y, d_Y), for every m ∈ N we denote by ℓ_p^m(Y) the metric space Y^m equipped with the metric

d_{ℓ_p^m(Y)}((y_1, . . . , y_m), (z_1, . . . , z_m)) def= (Σ_{i=1}^m d_Y(y_i, z_i)^p)^{1/p}. (8)

Recall also the standard notation ℓ_p^m = ℓ_p^m(R). A coordinate-wise application of (3) shows that every symmetric stochastic A satisfies γ(A, d_{ℓ_p^m(Y)}^p) = γ(A, d_Y^p). Fixing D ∈ [1, ∞), suppose that (X, d_X) is a metric space such that for every n ∈ N and x_1, . . . , x_n ∈ X there exist m ∈ N and a nonconstant mapping f : {x_1, . . . , x_n} → ℓ_p^m(Y) such that

(1/n²) Σ_{i=1}^n Σ_{j=1}^n d_{ℓ_p^m(Y)}(f(x_i), f(x_j))^p ≥ (‖f‖_Lip^p / D) · (1/n²) Σ_{i=1}^n Σ_{j=1}^n d_X(x_i, x_j)^p, (9)

where ‖f‖_Lip denotes the Lipschitz constant of f, i.e.,

‖f‖_Lip def= max_{x,y∈{x_1,...,x_n}, x≠y} d_{ℓ_p^m(Y)}(f(x), f(y)) / d_X(x, y).

Then, by substituting f into (3) and using (9), every symmetric stochastic matrix A ∈ M_n(R) satisfies γ(A, d_X^p) ≤ Dγ(A, d_Y^p). The above simple argument, combined with standard metric embedding methods, already implies that a variety of metric spaces satisfy the spectral inequality (7) with Ψ linear. This holds in particular when (X, d_X) belongs to one of the following classes of metric spaces: doubling metric spaces, compact Riemannian surfaces, Gromov hyperbolic spaces of bounded local geometry, Euclidean buildings, symmetric spaces, homogeneous Hadamard manifolds, and forbidden-minor (edge-weighted) graph families; this topic is treated in Section 7. We will also see below that the same holds true for certain Banach spaces, including L_p(µ) spaces for p ∈ [2, ∞).
The following theorem asserts that the above reasoning using metric embeddings is the only way to obtain an inequality such as (6) with Ψ linear. Its proof is a duality argument that is inspired by the proof of a lemma of K. Ball [Bal92, Lem. 1.1] (see also [MN13b, Lem. 5.2]).

Theorem 1.3. Fix n ∈ N and p, K ∈ [1, ∞). Given two metric spaces (X, d_X) and (Y, d_Y), the following assertions are equivalent.
(1) For every symmetric stochastic n by n matrix A we have γ(A, d_X^p) ≤ Kγ(A, d_Y^p).
(2) For every D ∈ (K, ∞) and every x_1, . . . , x_n ∈ X there exist m ∈ N and a nonconstant mapping f : {x_1, . . . , x_n} → ℓ_p^m(Y) that satisfies

(1/n²) Σ_{i=1}^n Σ_{j=1}^n d_{ℓ_p^m(Y)}(f(x_i), f(x_j))^p ≥ (‖f‖_Lip^p / D) · (1/n²) Σ_{i=1}^n Σ_{j=1}^n d_X(x_i, x_j)^p.

It is worthwhile to single out the special case of Theorem 1.3 that corresponds to Question 1.2, in which case the embeddings are into ℓ_2.
Corollary 1.4. Fix n ∈ N and K ∈ [1, ∞). Given a metric space (X, d X ), the following assertions are equivalent.
(1) For every symmetric stochastic matrix A ∈ M_n(R) we have γ(A, d_X^2) ≤ K/(1 − λ_2(A)).
(2) For every D ∈ (K, ∞) and every x_1, . . . , x_n ∈ X there exist m ∈ N and a nonconstant mapping f : {x_1, . . . , x_n} → ℓ_2^m that satisfies the conclusion of assertion (2) of Theorem 1.3 with p = 2 and Y = R.
Inequality (12) is proved in Section 4 via a direct argument using analytic and probabilistic techniques, but once this inequality is established one can use duality through Corollary 1.4 to deduce the following new Hilbertian embedding result for arbitrary finite subsets of ℓ_p.

Corollary 1.6. If n ∈ N and p ∈ (2, ∞) then for every x_1, . . . , x_n ∈ ℓ_p there exist y_1, . . . , y_n ∈ ℓ_2 such that

∀ i, j ∈ {1, . . . , n},  ‖y_i − y_j‖_2 ≲ p‖x_i − x_j‖_p, (13)

and

Σ_{i=1}^n Σ_{j=1}^n ‖y_i − y_j‖_2^2 = Σ_{i=1}^n Σ_{j=1}^n ‖x_i − x_j‖_p^2.

The dependence on p in (13) is sharp up to the implicit universal constant, and the conclusion of Corollary 1.6 is false for p ∈ [1, 2) even if one allows any dependence on p in (13) (provided it is independent of n): for the former statement see Lemma 1.11 below and for the latter statement see Lemma 1.12 below. It would be interesting to find a constructive proof of Corollary 1.6, i.e., a direct proof that does not rely on duality to show that the desired embedding exists.
Remark 1.7. Questions in the spirit of Theorem 1.5 have been previously studied by Matoušek [Mat97], who proved that there exists a universal constant C ∈ (0, ∞) such that for every n ∈ N and every n by n symmetric stochastic matrix A ∈ M_n(R) the estimate (14) holds. In order to obtain an embedding result such as Corollary 1.6 one needs to bound γ(A, ‖·‖_{ℓ_p}^2) rather than γ(A, ‖·‖_{ℓ_p}^p) by a quantity that grows linearly with 1/(1 − λ_2(A)). We do not see how to use Matoušek's approach in [Mat97] to obtain such an estimate, and we therefore use an entirely different method (specifically, complex interpolation and Markov type) to prove Theorem 1.5.
As stated above, when p ∈ [1, 2) the analogue of Theorem 1.5 can hold true only if we allow the right hand side of (12) to depend on n. Specifically, we ask the following question.
Question 1.8. Is it true that for every p ∈ [1, 2], every n ∈ N and every n by n symmetric stochastic matrix A we have (15)? Due to Theorem 1.3, an affirmative answer to Question 1.8 is equivalent to the assertion that if p ∈ [1, 2] then for every x_1, . . . , x_n ∈ ℓ_p there exist y_1, . . . , y_n ∈ ℓ_2 that satisfy the corresponding average distortion condition. We conjecture that the answer to Question 1.8 is positive. As partial motivation for this conjecture, we note that by an important theorem of Arora, Rao and Vazirani [ARV09] the answer is positive when p = 1.
A possible approach towards proving this conjecture for p ∈ (1, 2) is to investigate whether the quantity γ(A, ‖·‖_X^2) behaves well under interpolation. In our case one would write 1/p = θ/2 + (1 − θ)/1 for an appropriate θ ∈ (0, 1) and ask whether or not it is true that for every n ∈ N, every n by n symmetric stochastic matrix A satisfies a corresponding interpolation inequality. Investigating the possible validity of such interpolation inequalities for nonlinear spectral gaps is interesting in its own right; in Section 4.3 we derive a weaker interpolation inequality in this spirit.
Given two metric spaces (X, d_X) and (Y, d_Y), the distortion of X in Y, denoted c_Y(X), is defined to be the infimum over those D ∈ [1, ∞) for which there exist s ∈ (0, ∞) and a mapping f : X → Y satisfying s·d_X(x, y) ≤ d_Y(f(x), f(y)) ≤ Ds·d_X(x, y) for all x, y ∈ X. When Y = ℓ_p we use the simpler notation c_p(X) = c_{ℓ_p}(X). The parameter c_2(X) is known in the literature as the Euclidean distortion of X. The Fréchet-Kuratowski embedding (see e.g. [Hei01]) shows that c_∞(X) = 1 for every separable metric space X. We therefore define p(X) to be the infimum over those p ∈ [2, ∞] such that c_p(X) < 10. The choice of the number 10 here is arbitrary, and one can equally consider any fixed number bigger than 1 in place of 10 to define the parameter p(X); we made this arbitrary choice rather than adding an additional parameter only for the sake of notational simplicity.
For n ∈ N and d ∈ {3, . . . , n − 1} let p(n, d) be the expectation of p(G) when G is distributed uniformly at random over all connected n-vertex d-regular graphs, equipped with the shortest-path metric. Thus, if p = p(n, d) then in expectation a connected n-vertex d-regular graph G satisfies c_p(G) ≤ 10. In [Mat97] Matoušek evaluated the asymptotic dependence of the largest possible distortion of an n-point metric space in ℓ_p, yielding the estimate p(n, d) ≳ log_d n, which is an asymptotically sharp bound if d = O(1). As a consequence of our proof of Theorem 1.5, it turns out that Matoušek's bound is not sharp as a function of the degree d. Specifically, in Section 4.1 we prove the following result.

Proposition 1.9. For every n ∈ N and d ∈ {3, . . . , n − 1} we have (16) and (17).

The significance of the quantity log_d n appearing in Matoušek's bound is that up to universal constant factors it is the typical diameter of a uniformly random connected n-vertex d-regular graph [BFdlV82]. Thus, in (16) there must be some restriction on the size of d relative to n, since when d is at least a constant power of n the typical diameter of G is O(1), and therefore c_p(G) ≤ c_2(G) = O(1). Both (16) and (17) assert that p(n, d) tends to ∞ faster than the typical diameter of G when n^{o(1)} = d → ∞ (note also that (17) is consistent with the fact that p(n, d) must become bounded when d is large enough). While we initially expected Matoušek's bound to be sharp, Proposition 1.9 indicates that the parameter p(n, d), and more generally the parameter p(X) for a finite metric space (X, d_X), deserves further investigation. In particular, we make no claim that Proposition 1.9 is sharp.
The link between Theorem 1.5 and Proposition 1.9 is that our proof of Theorem 1.5 yields the following bound, which holds for every n ∈ N, every n by n symmetric stochastic matrix A, and every p ∈ [2, ∞).
where we recall that λ(A) was defined in (2). Proposition 1.9 is deduced in Section 4.1 from (18) through a classical argument of Linial, London and Rabinovich [LLR95]. The bounds appearing in Proposition 1.9 hold true with p(n, d) replaced by p(G) when G is an n-vertex d-regular Ramanujan graph [LPS88,Mar88], and it is an independently interesting open question to evaluate c p (G) up to universal constant factors when G is Ramanujan. Some estimates in this direction are obtained in Section 4.1, where a similar question is also studied for Abelian Alon-Roichman graphs [AR94].
1.2. Ozawa's localization argument for Poincaré inequalities. Theorem 1.10 below provides a partial answer to Question 1.1 when X and Y are certain Banach spaces. Its proof builds on an elegant idea of Ozawa [Oza04] that was used in [Oza04] to rule out coarse embeddings of expanders into certain Banach spaces. While Ozawa's original argument did not yield a nonlinear spectral gap inequality in the form that we require here, it can be modified so as to yield Theorem 1.10; this is carried out in Section 5. Throughout this article the unit ball of a Banach space (X, ‖·‖_X) is denoted B_X def= {x ∈ X : ‖x‖_X ≤ 1}.

Theorem 1.10. Let (X, ‖·‖_X) and (Y, ‖·‖_Y) be Banach spaces. Suppose that α, β : [0, ∞) → [0, ∞) are increasing functions, β is concave, and lim_{t→0} β(t) = 0. Suppose also that there exists a mapping φ : B_X → Y such that

α(‖x − y‖_X) ≤ ‖φ(x) − φ(y)‖_Y ≤ β(‖x − y‖_X) for all x, y ∈ B_X. (19)

Then for every q ∈ [1, ∞), every n ∈ N, and every symmetric stochastic matrix A ∈ M_n(R) we have (20).

The key point of Theorem 1.10 is that one can use local information such as (19) in order to deduce a Poincaré-type inequality such as (3).
Certain classes of Banach spaces X are known to satisfy the assumptions of Theorem 1.10 when Y is a Hilbert space. These include: L_p(µ) spaces for p ∈ [1, ∞), as shown by Mazur [Maz29] (see [BL00, Ch. 9, Sec. 1]); Banach spaces of finite cotype with an unconditional basis, as shown by Odell and Schlumprecht [OS94]; more generally, Banach lattices of finite cotype, as shown by Chaatit [Cha95] (see [BL00, Ch. 9, Sec. 2]); and Schatten classes of finite cotype (and more general noncommutative L_p spaces), as shown by Raynaud [Ray02]. For these classes of Banach spaces Theorem 1.10 furnishes a positive answer to Question 1.2 with Ψ given by the right hand side of (20).
Note that (22) suffices via duality (i.e., Corollary 1.4) to obtain an embedding result for arbitrary subsets of ℓ p as in Corollary 1.6, with exponentially worse dependence on p. Yet another proof of such an embedding statement appears in Section 7, though it also yields a bound in terms of p that is exponentially worse than Corollary 1.6. We do not know how to prove the sharp statement of Corollary 1.6 other than through Theorem 1.5, whose proof is not as elementary as the above mentioned proofs that yield an exponential dependence on p.
1.3. Average distortion embeddings and nonlinear type.
such that for every n ∈ N and every x_1, . . . , x_n ∈ X there exists a nonconstant Lipschitz function f : {x_1, . . . , x_n} → Y satisfying the corresponding average distortion inequality. When p = q we use the simpler notation Av. The notion of average distortion and its relevance to approximation algorithms was brought to the fore in the influential work [Rab08] of Rabinovich. Parts of Section 7 below are inspired by Rabinovich's ideas in [Rab08]. Earlier applications of this notion include the work of Alon, Boppana and Spencer [ABS98] that related average distortion to asymptotically sharp isoperimetric theorems on product spaces; see Remark 7.9 below. In the linear theory of Banach spaces average distortion embeddings have been studied in several contexts; see e.g. the work on random sign embeddings in [Elt83, FJS88].
With the above terminology, Theorem 1.3 asserts that the linear dependence (11) holds true if and only if for every finite subset S ⊆ X there exists m ∈ N such that (23) holds. As stated earlier, the estimate (23) cannot be improved (up to the implicit constant factor): this is a special case of the following lemma.
Lemma 1.11. For every p, q, r, s ∈ [1, ∞) with 2 ≤ q ≤ p and every n ∈ N there exist x_1, . . . , x_n ∈ ℓ_p such that if y_1, . . . , y_n ∈ ℓ_q satisfy (24) then there exist i, j ∈ {1, . . . , n} for which the lower bound (25) holds.

The proof of Lemma 1.11 is given in Section 4.3. It suffices to say at this juncture that the points x_1, . . . , x_n ∈ ℓ_p are the images of the vertices of a bounded degree expanding n-vertex regular graph under Matoušek's ℓ_p-variant [Mat97] of Bourgain's embedding [Bou85].
We also stated earlier that Corollary 1.6 fails for p ∈ [1, 2): this is a special case of the following lemma.
Lemma 1.12. Fix p, q, r, s ∈ [1, ∞) with p ∈ [1, 2) and q ∈ (p, ∞). For arbitrarily large n ∈ N there exist x_1, . . . , x_n ∈ ℓ_p such that for every y_1, . . . , y_n ∈ ℓ_q the lower bound (25) holds.

1.3.1. Bourgain-Milman-Wolfson type. The reason for the validity of the lower bound (25) is best explained in the context of nonlinear type: a metric invariant that furnishes an obstruction to the existence of average distortion embeddings. Let F_2 be the field of cardinality 2 and for n ∈ N let e_1, . . . , e_n be the standard basis of F_2^n. We also write e = e_1 + . . . + e_n. Following Bourgain, Milman and Wolfson [BMW86], given p, T ∈ (0, ∞), a metric space (X, d_X) is said to have BMW type p with constant T if for every n ∈ N, every f : F_2^n → X satisfies

(1/2^n) Σ_{x∈F_2^n} d_X(f(x + e), f(x))^2 ≤ T² n^{(2/p)−1} Σ_{j=1}^n (1/2^n) Σ_{x∈F_2^n} d_X(f(x + e_j), f(x))^2. (26)

X has BMW type p if it has BMW type p with constant T for some T ∈ (0, ∞); in this case the infimum over those T ∈ (0, ∞) for which (26) holds true is denoted BMW_p(X). For background on this notion we refer to [BMW86], as well as [Pis86, NS02, Nao12a]. These references also contain a description of the closely related important notion of Enflo type [Enf76], a notion whose definition is recalled below but will not be further investigated here. The simple proof of the following lemma appears in Section 6.
Lemma 1.13. Fix p ∈ (0, ∞). For every two metric spaces (X, d_X) and (Y, d_Y),

The case r = s = 2 of Lemma 1.12 follows from Lemma 1.13 and the computations of BMW type that appear in the literature. The remaining cases of Lemma 1.12 are proved using similar ideas. The Hamming cube is the Cayley graph on F_2^n corresponding to the set of generators {e_1, . . . , e_n}. The shortest path metric on this graph coincides with the ℓ_1^n metric under the identification of F_2^n with {0, 1}^n ⊆ R^n. Let H_n be the 2^n by 2^n symmetric stochastic matrix which is the normalized adjacency matrix of the Hamming cube. Thus for x, y ∈ F_2^n the (x, y)-entry of H_n equals 0 unless x − y ∈ {e_1, . . . , e_n}, in which case it equals 1/n. It is well known (and easy to check) that λ_2(H_n) = 1 − 2/n, so γ(H_n, d_R^2) ≍ n. A simple argument (that is explained in Section 6) shows that the definition (26) is equivalent to the requirement that γ(H_n, d_X^2) ≲_X n^{2/p} for every n ∈ N. The notion of Enflo type p that was mentioned above is equivalent to the requirement that γ(H_n, d_X^p) ≲_X n for every n ∈ N. For p ∈ [1, 2], by considering the identity mapping of F_2^n into ℓ_p^n (observe that ‖x − y‖_p^p = ‖x − y‖_1 for every x, y ∈ F_2^n) one sees that γ(H_n, ‖·‖_p^2) ≳ n^{2/p} ≍ 1/(1 − λ_2(H_n))^{2/p}. Thus (21) is sharp.
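The stated spectrum of H_n is easy to confirm computationally for small n: the eigenvalues of H_n are 1 − 2k/n for k = 0, 1, . . . , n, with multiplicities binomial(n, k), so in particular λ_2(H_n) = 1 − 2/n. A quick numpy check:

```python
import numpy as np
from itertools import product

n = 4
verts = list(product([0, 1], repeat=n))       # F_2^n as 0/1 tuples
idx = {v: i for i, v in enumerate(verts)}

# Normalized adjacency matrix of the Hamming cube: entry 1/n iff x - y is a
# standard basis vector, i.e. x and y differ in exactly one coordinate.
N = 2 ** n
H = np.zeros((N, N))
for x in verts:
    for j in range(n):
        y = tuple(x[k] ^ (k == j) for k in range(n))   # flip coordinate j
        H[idx[x], idx[y]] = 1.0 / n

eig = np.sort(np.linalg.eigvalsh(H))[::-1]             # descending order
assert abs(eig[0] - 1) < 1e-9                 # top eigenvalue is 1
assert abs(eig[1] - (1 - 2.0 / n)) < 1e-9     # lambda_2(H_n) = 1 - 2/n
assert abs(eig[-1] + 1) < 1e-9                # bottom eigenvalue is -1 (k = n)
```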
1.3.2. Towards a nonlinear Maurey-Pisier theorem. Every metric space has BMW type 1 and no metric space has BMW type greater than 2; see Remark 6.4 below. Thus, for a metric space (X, d_X) define

p_X def= sup{p ∈ [1, 2] : X has BMW type p}. (27)

Maurey and Pisier [MP76] associate a quantity p_X to every Banach space X, which is defined analogously to (27) but with BMW type replaced by Rademacher type. The (linear) notion of Rademacher type is recalled in Section 6.2 below; at this juncture we just want to state, for the sake of readers who are accustomed to the standard Banach space terminology, that despite the apparent conflict of notation between (27) and [MP76], a beautiful theorem of Bourgain, Milman and Wolfson [BMW86] asserts that actually the two quantities coincide.
The following theorem is due to Bourgain, Milman and Wolfson.
Theorem 1.14 ([BMW86]). Suppose that (X, d_X) is a metric space with p_X = 1. Then c_X(F_2^n, ‖·‖_1) = 1 for every n ∈ N.

Thus, the only possible obstruction to a metric space (X, d_X) having BMW type p for some p > 1 is the presence of arbitrarily large Hamming cubes. This is a metric analogue of a classical theorem of Pisier [Pis73] asserting that the only obstruction to a Banach space having nontrivial Rademacher type is the presence of ℓ_1^n for every n ∈ N. In light of the Maurey-Pisier theorem [MP76] for Rademacher type, it is natural to ask if a similar result holds true for a general metric space X even when p_X > 1: is it true that for every metric space (X, d_X) we have sup_{n∈N} c_X(F_2^n, ‖·‖_{p_X}) < ∞? The answer to this question is negative if p_X = 2. Indeed, we have p_R = 2 yet c_R(F_2^n, ‖·‖_2) tends to infinity exponentially fast as n → ∞, because there is an exponentially large subset S of F_2^n with the property that ‖x − y‖_2 ≍ √n for every distinct x, y ∈ S. If, however, p_X ∈ (1, 2) then the above question, called the Maurey-Pisier problem for BMW type, remains open. The Maurey-Pisier problem for BMW type was posed by Bourgain, Milman and Wolfson in [BMW86], where they obtained a partial result about it: they gave a condition on a metric space (X, d_X) that involves its BMW type as well as an additional geometric restriction that ensures that sup_{n∈N} c_X(F_2^n, ‖·‖_{p_X}) < ∞; see Section 4 of [BMW86]. In Section 6.1 we prove the following theorem.
Theorem 1.15. For every metric space (X, d_X) and every d ∈ N there exists N = N(d, X) ∈ N such that

Thus, if BMW_{p_X}(X) < ∞, i.e., the supremum defining p_X in (27) is attained, then for every d ∈ N one can embed (F_2^d, ‖·‖_{p_X}) into ℓ_2^N(X) for sufficiently large N ∈ N. Note that it follows immediately from (27) that BMW_p(ℓ_2^N(X)) = BMW_p(X) for every N ∈ N, so Theorem 1.15 is a complete metric characterization of the parameter p_X when the supremum defining p_X in (27) is attained. Note also that by passing to ℓ_2^N(X) we overcome the issue that was described above if p_X = 2, since trivially (F_2^n, ‖·‖_2) is isometric to a subset of ℓ_2^n(R). This indicates why Theorem 1.15 is easier than the actual Maurey-Pisier problem for BMW type, whose positive solution would require using the assumption p_X < 2. We therefore ask whether or not it is true that for every metric space (X, d_X) and every p ∈ (1, 2), if there is K ∈ (0, ∞) such that for every d ∈ N there exists N ∈ N for which (F_2^d, ‖·‖_p) embeds with distortion K into ℓ_2^N(X), then sup_{d∈N} c_X(F_2^d, ‖·‖_p) < ∞? For p = 1 the answer to this question is positive due to Theorem 1.14, and by virtue of Theorem 1.15 a positive answer to this question would imply an affirmative solution of the Maurey-Pisier problem for BMW type.
Recalling Lemma 1.13, given the relation between BMW type and average distortion embeddings, it is natural to study the following weaker version of the Maurey-Pisier problem for BMW type: is it true that for every metric space (X, d_X) we have (28)? In (28) we restrict to p_X < 2 because one can show that (29) holds; see Remark 6.9 below for the proof of (29). By Theorem 1.14 and Lemma 1.13, for every metric space (X, d_X) we have (30). Given p ∈ (1, ∞), it is therefore natural to ask whether or not for every metric space (X, d_X) the estimate (31) holds. If (31) were true for every p ∈ (1, 2) then a positive answer to the question that appears in (28) would imply a positive solution to the Maurey-Pisier problem for BMW type. More generally, in light of the availability of results such as (31), it would be of interest to relate average distortion embeddings to bi-Lipschitz embeddings. For example, is it true that if a metric space (X, d_X) satisfies Av^{(2)}_{ℓ_2}(X) < ∞ then for every finite subset S ⊆ X we have c_2(S) = o_X(log |S|)? If the answer to this question is positive then Corollary 1.6 would imply that for p ∈ (2, ∞) any n-point subset of ℓ_p embeds into Hilbert space with distortion o_p(log n). No such improvement over Bourgain's embedding theorem [Bou85] is known for finite subsets of ℓ_p if p ∈ (2, ∞); for p ∈ [1, 2) see [CGR08, ALN08].
For this reason one thinks of γ_+(A, d_X^p) as measuring the magnitude of the nonlinear absolute spectral gap of the matrix A with respect to the kernel d_X^p : X × X → [0, ∞). The quantity γ_+(A, d_X^p) is useful in various contexts (see [MN12]), and in particular it will be used in some of the ensuing arguments. It is natural to ask for the analogue of Question 1.1 with γ(·, ·) replaced by γ_+(·, ·). However, it turns out that this question is essentially the same as Question 1.1, as explained in Proposition 2.1 below.
Proposition 2.1. Suppose that there exists an increasing function Ψ : (0, ∞) → (0, ∞) such that for every n ∈ N and every n by n symmetric stochastic matrix A we have (33). Then for every n ∈ N and every n by n symmetric stochastic matrix A we also have (34). Conversely, suppose that Φ : (0, ∞) → (0, ∞) is an increasing function such that for every n ∈ N and every n by n symmetric stochastic matrix A we have (35). Then for every n ∈ N and every n by n symmetric stochastic matrix A we also have (36).

Before passing to the (simple) proof of Proposition 2.1, we record for future use some basic facts about nonlinear spectral gaps.
Lemma 2.2. Fix p ∈ [1, ∞), n ∈ N and a metric space (X, d_X). Then every symmetric stochastic matrix A = (a_ij) ∈ M_n(R) satisfies (37).

Proof. Fix distinct a, b ∈ X and let x_1, . . . , x_n ∈ X be i.i.d. points, each of which is chosen uniformly at random from {a, b}. Then E[d_X(x_i, x_j)^p] = d_X(a, b)^p/2 whenever i ≠ j; taking expectations of both sides of (3), the desired conclusion now follows from the definition (3). The rightmost inequality in (37) follows by substituting x_1 = . . . = x_n = a and y_1 = . . . = y_n = b into (32).
Lemma 2.3. Fix p ∈ [1, ∞) and a metric space (X, d_X). Then for every integer n ≥ 2, every n by n symmetric stochastic matrix A = (a_ij) satisfies the following estimate.

Proof. Since the diagonal entries of A play no role in the definition of γ(A, d_X^p), it follows immediately from (3) that (42) holds. Next, fix x_1, . . . , x_n, y_1, . . . , y_n ∈ X. By the triangle inequality and the convexity of t → |t|^p, for every i, j ∈ {1, . . . , n} we have (40) and (41). By averaging (40) and (41), and then averaging the resulting inequality over i, j ∈ {1, . . . , n}, we obtain (43); similarly one obtains (44), (45), (46) and (47). By averaging (44), (45), (46) and (47) we see that (48) holds. By multiplying (48) by a_ij/n and summing over i, j ∈ {1, . . . , n}, while using the fact that A is symmetric and stochastic, we conclude that (49) holds. A substitution of (49) into (43), and then a substitution of the resulting inequality into (42), yields the desired estimate.
Proof of Theorem 1.3. Fix D ∈ (K, ∞) and define ε ∈ (0, ∞) by (50). Let C ⊆ M_n(R) be the set of n by n symmetric matrices C = (c_ij) for which there exist y_1, . . . , y_n ∈ Y with |{y_1, . . . , y_n}| ≥ 2 and (51). Letting P ⊆ M_n(R) be the set of all symmetric n by n matrices with nonnegative entries and vanishing diagonal, denote (52). We first claim that the matrix T = (t_ij) ∈ M_n(R) belongs to M. Indeed, if this were not the case then by the separation theorem (Hahn-Banach) there would exist a symmetric matrix H = (h_ij) ∈ M_n(R) which has at least one nonzero off-diagonal entry and whose diagonal vanishes, satisfying (53). Since P ⊆ M, the fact that the left hand side of (53) is bounded from below implies (54). If the entries a_ij are now defined accordingly for every i, j ∈ {1, . . . , n} then, provided δ ∈ (0, 1) is small enough, A = (a_ij) ∈ M_n(R) is a symmetric stochastic matrix all of whose entries are positive such that (55). Because M ⊇ C it follows from (54) and (55) that (56). Since all the entries of A are positive, γ(A, d_Y^p) ∈ (0, ∞), and therefore (56) furnishes the desired contradiction.

Interpolation and Markov type
Fix p ∈ [1, ∞) and m ∈ N. Following K. Ball [Bal92], given a metric space (X, d_X) define its Markov type p constant at time m, denoted M_p(X; m), to be the infimum over those M ∈ (0, ∞) such that for every n ∈ N, every x_1, . . . , x_n ∈ X and every n by n symmetric stochastic matrix A = (a_ij) we have

(1/n) Σ_{i=1}^n Σ_{j=1}^n (A^m)_{ij} d_X(x_i, x_j)^p ≤ M^p m (1/n) Σ_{i=1}^n Σ_{j=1}^n a_ij d_X(x_i, x_j)^p.

Remark 4.1. In Section 1.3 we recalled the notions of BMW type and Enflo type. The link between these notions and Ball's notion of Markov type is that Markov type p implies Enflo type p (see [NS02]). One can also define natural variants of Markov type that imply BMW type (see the inequalities appearing in Theorem 4.4 of [NPSS06]). Recently Kondo proved [Kon11] that there exists a Hadamard space (see e.g. [BH99]) that fails to have Markov type p for any p > 1, answering a question posed in [NPSS06]. Since Hadamard spaces have Enflo type 2 (see [Oht09a]), this yields the only known example of a metric space that has Enflo type 2 but fails to have nontrivial Markov type (observe that the notions of Enflo type 2 and BMW type 2 coincide).
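Ball's theorem that Hilbert space has Markov type 2 with constant 1 can be sanity-checked numerically: for real scalars the Markov type inequality reduces, in the eigenbasis of A, to the spectral fact 1 − λ^m ≤ m(1 − λ) for λ ∈ [−1, 1]. A small sketch (the lazy cycle walk is an arbitrary choice of symmetric stochastic matrix):

```python
import numpy as np

n = 10
S = np.roll(np.eye(n), 1, axis=1)
A = 0.5 * np.eye(n) + 0.25 * (S + S.T)   # symmetric stochastic (lazy n-cycle walk)

rng = np.random.default_rng(1)
x = rng.standard_normal(n)               # points x_1, ..., x_n in R
D2 = (x[:, None] - x[None, :]) ** 2      # d(x_i, x_j)^2

def energy(B):
    """(1/n) sum_ij B_ij (x_i - x_j)^2."""
    return np.einsum('ij,ij->', B, D2) / n

# Markov type 2 with constant M = 1: time-m energy grows at most linearly in m.
for m in range(1, 20):
    assert energy(np.linalg.matrix_power(A, m)) <= m * energy(A) + 1e-9
```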
Lemma 4.2. Fix p ∈ [1, ∞) and m, n ∈ N. Let (X, d_X) be a metric space and A = (a_ij) ∈ M_n(R) be a symmetric stochastic matrix. Then (60) holds.

Proof. This is immediate from the definitions: for every x_1, . . . , x_n ∈ X we have (61).

The modulus of uniform smoothness of a Banach space (X, ‖·‖_X) is defined for τ ∈ (0, ∞) as

ρ_X(τ) def= sup{ (‖x + τy‖_X + ‖x − τy‖_X)/2 − 1 : x, y ∈ X, ‖x‖_X = ‖y‖_X = 1 }.

X is said to have modulus of smoothness of power type q if there exists C ∈ (0, ∞) such that ρ_X(τ) ≤ Cτ^q for all τ ∈ (0, ∞). It is straightforward to check that in this case necessarily q ∈ [1, 2]. It is shown in [BCL94] that X has modulus of smoothness of power type q if and only if there exists a constant S ∈ [1, ∞) such that (62) holds for every x, y ∈ X. The infimum over those S ∈ [1, ∞) for which (62) holds true is called the q-smoothness constant of X, and is denoted S_q(X). Observe that every Banach space satisfies S_1(X) = 1.
The following theorem is due to [NPSS06].

Theorem 4.3 ([NPSS06]). Suppose that (X, ‖·‖_X) is a Banach space whose modulus of smoothness has power type q. Then (63) holds.

The statement corresponding to Theorem 4.3 in [NPSS06] (specifically, see Theorem 4.4 there) allows for a multiplicative constant with unspecified dependence on p and q, while in (63) we stated an explicit dependence on these parameters that will serve us later on several occasions. We shall therefore proceed to sketch the proof of Theorem 4.3 so as to explain why the dependence on p and q in (63) is indeed valid.
Proof of Theorem 4.3 (sketch). For every measure space (Ω, µ) we have the bound (64) on S_q(L_p(µ, X)). The case q = 2 of (64) appears in [Nao12b], and the proof for general q ∈ [1, 2] follows mutatis mutandis from the proof in [Nao12b]. This has been carried out explicitly in Lemma 6.3 of [MN12], whose statement asserts the weaker bound S_q(L_p(µ, X)) ≲ p^{1/q} S_q(X), but the proof of [MN12, Lem. 6.3] without any change whatsoever yields (64).
We record for future use the following corollary, which is an immediate consequence of Lemma 4.2 and Theorem 4.3.
Corollary 4.4. Fix q ∈ (1, 2] and p ∈ [q, ∞). Suppose that (X, ‖·‖_X) is a Banach space whose modulus of smoothness has power type q. Then for every m, n ∈ N and every symmetric stochastic A = (a_{ij}) ∈ M_n(R), the estimate obtained by combining Lemma 4.2 with (63) holds, where C ∈ (0, ∞) is a universal constant.
We refer to [BL76] for the background on complex interpolation that is used below. We also recall the definition of λ(A) in (2). The following theorem is the main result of this section.
Theorem 4.5. Let (H, Z) be a compatible pair of Banach spaces with H a Hilbert space. Suppose that θ ∈ [0, 1] and consider the complex interpolation space X = [H, Z]_θ. Fix q ∈ [1, 2] and suppose that X has modulus of smoothness of power type q. Then for every n ∈ N and every n by n symmetric stochastic matrix A ∈ M_n(R) we have an estimate bounding γ(A, ‖·‖_X^2) in terms of λ(A). Before proving Theorem 4.5 we present some of its immediate corollaries. First, since in the setting of Theorem 4.5 we always have S_q(X) ≲ 1 for q = 2/(1 + θ) (see [Pis79, CR82]), the following corollary is a special case of Theorem 4.5.
Corollary 4.6. Under the assumptions of Theorem 4.5, the corresponding bound on γ(A, ‖·‖_X^2) holds. In order for the above results to fit into the framework of Question 7, we need to bound γ(A, ‖·‖_X^2) in terms of λ_2(A) rather than λ(A). This is the content of the next corollary.
Corollary 4.7. Under the assumptions of Theorem 4.5 we have the estimate (68). Proof. Since A is symmetric and stochastic, all of its eigenvalues are in the interval [−1, 1]. Consequently, all the eigenvalues of the symmetric stochastic matrix (I + A)/2 are nonnegative, and hence $$\lambda\left(\frac{I+A}{2}\right)=\frac{1+\lambda_2(A)}{2}. \qquad (69)$$ An application of Theorem 4.5 to the matrix (I + A)/2, while taking into account the identities (69) and (39), implies the desired estimate (68) via an elementary inequality. We can now complete the proof of Theorem 1.5, and consequently also of its corresponding dual statement, Corollary 1.6.
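The eigenvalue bookkeeping behind Corollary 4.7 can be checked directly. The sketch below assumes that λ(B) denotes the largest absolute value of an eigenvalue of B after discarding the Perron eigenvalue 1 (the definition labeled (2) is not shown in this extract); it verifies, for the simple walk on a 6-cycle, that (I + A)/2 is symmetric stochastic with nonnegative spectrum and that the identity (69) holds.

```python
import numpy as np

def lam(A):
    """lambda(A): max |mu| over the non-Perron eigenvalues of A."""
    mu = np.sort(np.linalg.eigvalsh(A))[::-1]
    return np.abs(mu[1:]).max()

def lam2(A):
    """lambda_2(A): second largest eigenvalue of A."""
    return np.sort(np.linalg.eigvalsh(A))[::-1][1]

n = 6
C = np.roll(np.eye(n), 1, axis=1)
A = 0.5 * (C + C.T)                    # simple random walk on the 6-cycle
B = 0.5 * (np.eye(n) + A)              # the lazy walk used in Corollary 4.7

assert np.allclose(B.sum(axis=1), 1) and np.allclose(B, B.T)
assert np.linalg.eigvalsh(B).min() >= -1e-12        # spectrum of B is nonnegative
assert abs(lam(B) - (1 + lam2(A)) / 2) < 1e-12      # identity (69)
```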
As in the above proof of Theorem 1.5, specializing Theorem 4.5 to X = ℓ_p for p ∈ [2, ∞) yields the following corollary, which was stated in the Introduction as inequality (18).
Corollary 4.8. For every p ∈ [2, ∞), every n ∈ N and every n by n symmetric stochastic matrix A we have $$\gamma\big(A,\|\cdot\|_{\ell_p}^2\big)\lesssim \frac{p^2}{1-\lambda_2(A)}.$$ We now proceed to prove Theorem 4.5.
Proof of Theorem 4.5. In what follows, given a Banach space (Y, ‖·‖_Y) and n ∈ N we let L_2^n(Y) denote the Banach space whose underlying vector space is Y^n, equipped with the norm $$\|(y_1,\dots,y_n)\|_{L_2^n(Y)}=\left(\frac{1}{n}\sum_{i=1}^n\|y_i\|_Y^2\right)^{1/2}.$$ If H is a Hilbert space with scalar product ⟨·, ·⟩_H, then L_2^n(H) is a Hilbert space, whose scalar product is always understood to be given by $$\big\langle (x_1,\dots,x_n),(y_1,\dots,y_n)\big\rangle_{L_2^n(H)}=\frac{1}{n}\sum_{i=1}^n\langle x_i,y_i\rangle_H.$$ Define an operator T : L_2^n(R) → L_2^n(R) by setting, for every x ∈ R^n and i ∈ {1, . . . , n}, $$(Tx)_i=\sum_{j=1}^n a_{ij}x_j-\frac{1}{n}\sum_{j=1}^n x_j.$$ Equivalently, T = (t_{ij}) with t_{ij} = a_{ij} − 1/n. The operator T ⊗ I_Y : L_2^n(Y) → L_2^n(Y), where I_Y denotes the identity on Y, is then given coordinatewise by the same formula. Recalling (2) and (70), we have the operator norm bound $$\|T\|_{L_2^n(\mathbb R)\to L_2^n(\mathbb R)}=\lambda(A). \qquad (72)$$
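The scalar identity (72) is easy to confirm numerically: T = A − J/n annihilates constants and agrees with A on mean-zero vectors, so its spectral norm is the largest non-Perron eigenvalue modulus of A. A minimal sketch (illustrative names, random symmetric stochastic test matrix):

```python
import numpy as np

n = 5
rng = np.random.default_rng(2)
P = np.eye(n)[rng.permutation(n)]
A = 0.5 * (P + P.T)                      # symmetric stochastic matrix
T = A - np.ones((n, n)) / n              # T = A - J/n, i.e. t_ij = a_ij - 1/n

mu = np.sort(np.linalg.eigvalsh(A))[::-1]
lam = np.abs(mu[1:]).max()               # lambda(A): non-Perron spectral radius
opnorm = np.linalg.norm(T, 2)            # spectral norm of T on L_2^n(R)
assert abs(opnorm - lam) < 1e-10         # the identity (72)
```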
The norm of T ⊗ I_Z : L_2^n(Z) → L_2^n(Z) can be bounded crudely by using the fact that A is a symmetric stochastic matrix: for every z ∈ L_2^n(Z) the triangle inequality gives $$\|(T\otimes I_Z)z\|_{L_2^n(Z)}\le 2\|z\|_{L_2^n(Z)}. \qquad (73)$$ An interpolation of (72) and (73) (see [BL76]) shows that $$\|T\otimes I_X\|_{L_2^n(X)\to L_2^n(X)}\le \lambda(A)^{1-\theta}\,2^{\theta}. \qquad (74)$$ For every x ∈ L_2^n(X) let x̄ ∈ L_2^n(X) be the vector whose ith coordinate equals x_i − (1/n)∑_{j=1}^n x_j. We shall now apply a trick that was used by Pisier in [Pis10], where it is attributed to V. Lafforgue: we can ensure that the condition (77) holds true if we work with a large enough power of A, and we can then return to an inequality that involves A rather than its power by using Markov type through Corollary 4.4. Specifically, taking m to be the smallest integer for which (77) holds with A replaced by A^m, we may apply (78) to get the estimate (80). An application of Corollary 4.4 with p = 2 now implies the desired estimate, where in (81) we used an elementary inequality.

4.1. Ramanujan graphs and Alon–Roichman graphs. Given a connected n-vertex graph G = ({1, . . . , n}, E_G), we let d_G denote its shortest-path metric, which takes values in {0, 1, . . . , n − 1}, and we assume throughout that G is d-regular, i.e., for every i ∈ {1, . . . , n} the number of j ∈ {1, . . . , n} such that {i, j} ∈ E_G equals d. The normalized adjacency matrix of G will be denoted A_G, i.e., (A_G)_{ij} = (1/d)·1_{{i,j}∈E_G} for every i, j ∈ {1, . . . , n}. Thus A_G is an n by n symmetric stochastic matrix. We denote λ_i(G) = λ_i(A_G) for every i ∈ {1, . . . , n}, and we correspondingly set λ(G) = λ(A_G). Given a metric space (X, d_X) and q ∈ [1, ∞), an important idea of Linial, London and Rabinovich [LLR95] relates γ(G, d_X^q) to a lower bound on c_X(G) as follows. Let f : {1, . . . , n} → X be a nonconstant function and apply (3) with A = A_G and x_i = f(i) for every i ∈ {1, . . . , n}, thus obtaining the estimate (83). For notational simplicity we will assume from now on that G is a vertex-transitive graph, since in this case we have the averaging identity (84). By combining (83) and (84), Matoušek's argument in [Mat97] deduces from his bound (14) that if G = ({1, . . . , n}, E_G) is a vertex-transitive graph such that λ_2(G) is bounded away from 1 by a universal constant, then for every p ∈ [2, ∞) the estimate (85) holds. Denote p(G) := p({1, . . . , n}, d_G), where we recall that in Section 1.1 we defined, for a separable metric space (X, d_X), the quantity p(X, d_X) (or simply p(X) if the metric is clear from the context) to be the infimum over those p ∈ [2, ∞] for which c_p(X) ≤ 10. It follows from (85) that p(G) ≳ diam(G) (still under the assumption that λ_2(G) is bounded away from 1). Using Corollary 4.8, we now show that it is possible to improve over this estimate.
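The Linial–London–Rabinovich pipeline can be sketched numerically for the cycle C_n (a vertex-transitive 2-regular graph). The code below assumes the scalar fact γ(A, |·|²) = 1/(1 − λ₂(A)), valid for embeddings into R, and combines it with the average squared shortest-path distance to produce an (83)-style distortion lower bound; names and the final bound are purely illustrative.

```python
import numpy as np

n = 12
C = np.roll(np.eye(n), 1, axis=1)
A = 0.5 * (C + C.T)                       # A_G for the 2-regular cycle C_12
assert np.allclose(A, A.T) and np.allclose(A.sum(axis=1), 1)

lam2 = np.sort(np.linalg.eigvalsh(A))[::-1][1]
gamma = 1.0 / (1.0 - lam2)                # gamma(A_G, |.|^2) for X = R

d = np.array([[min((i - j) % n, (j - i) % n) for j in range(n)]
              for i in range(n)])         # shortest-path metric of C_n
avg_sq = (d ** 2).sum() / n**2            # average squared distance over all pairs
edge_sq = (A * d ** 2).sum() / n          # equals 1: adjacent vertices at distance 1
assert abs(edge_sq - 1.0) < 1e-12

# (83)-style conclusion: a nonconstant f: V -> R with |f_i - f_j| <= d_G(i,j)
# has average squared spread at most gamma, so the line distortion of C_n
# is at least sqrt(avg_sq / gamma) > 1.
lower_bound = np.sqrt(avg_sq / gamma)
assert lower_bound > 1
```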
Proof. By combining (83) and (84) (for q = 2) with Corollary 4.8 we obtain the asserted estimate. By the definition of p(G), the following corollary is a formal consequence of Proposition 4.9.
Proof. By combining (83) and (84) with Corollary 4.8 we obtain the desired estimate, where we used the fact that $$\frac{1}{n^2}\sum_{i=1}^n\sum_{j=1}^n d_G(i,j)^2\gtrsim (\log_d n)^2,$$ and that λ(G) ≤ 2/√d. The latter bound uses the fact that G is a Ramanujan graph (in fact, weaker bounds on λ(G) suffice for our purposes), and the former bound holds true for any connected n-vertex d-regular graph (see [Mat97] for a simple proof of this).
If G is a uniformly random n-vertex d-regular graph then by [BS87] we have λ(G) ≲ 2/⁴√d with high probability (for the best known bound on λ(G) when G is a random d-regular graph, see [Fri08]). By arguing identically to the proof of Proposition 4.11, we see that with high probability (86) and (87) hold true for such G, implying Proposition 1.9.
Corollary 4.8 also implies new distortion bounds for Abelian Alon–Roichman graphs [AR94]. These are graphs that are obtained from the following random construction. Let Γ be a finite Abelian group, and think of Γ as the set {1, . . . , n}, equipped with an Abelian group operation. Fix ε ∈ (0, 1/2) and set k = ⌈3ε^{−2} log n⌉. Let g_1, . . . , g_k ∈ Γ be chosen independently and uniformly at random. This induces a random Cayley graph G whose generating multi-set is {g_1, g_1^{−1}, . . . , g_k, g_k^{−1}}. As explained in [AR94], with probability that tends to 1 as n → ∞ the graph G is connected. Note that since G is a Cayley graph it is vertex-transitive. It follows from [CM08] that, provided n is large enough, we have λ(G) ≤ ε with probability at least 1/2. Moreover, by [NR09, Prop. 3.5] we have diam(G) ≲ (log n)/log(1/ε). A substitution of these estimates into Proposition 4.9 shows that $$2\le p\le \log(1/\varepsilon)\implies c_p(G)\gtrsim \frac{\log n}{\sqrt{p}\,\log(1/\varepsilon)},\qquad\text{and}\qquad p\ge \log(1/\varepsilon)\implies c_p(G)\gtrsim \frac{\log n}{p\,\log(1/\varepsilon)}.$$
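The Abelian Alon–Roichman construction is concrete enough to simulate. The sketch below builds the random Cayley multigraph of Z_n with k = ⌈3ε⁻² log n⌉ generators closed under inverses, and verifies the structural facts used above (symmetric stochastic normalized adjacency); the spectral claim λ(G) ≤ ε is probabilistic, so we only record λ rather than assert the sharp bound.

```python
import numpy as np

rng = np.random.default_rng(7)
n, eps = 64, 0.4
k = int(np.ceil(3 * np.log(n) / eps**2))
gens = rng.integers(1, n, size=k)
multiset = np.concatenate([gens, (-gens) % n])    # {g_1, g_1^-1, ..., g_k, g_k^-1}

A = np.zeros((n, n))
for g in multiset:
    for i in range(n):
        A[i, (i + g) % n] += 1.0 / len(multiset)  # normalized adjacency A_G

assert np.allclose(A.sum(axis=1), 1)              # stochastic
assert np.allclose(A, A.T)                        # symmetric: closed under inverses
mu = np.sort(np.linalg.eigvalsh(A))[::-1]
lam = np.abs(mu[1:]).max()                        # lambda(G); small w.h.p.
assert lam < 1                                    # connected and non-bipartite here
```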
Remark 4.12. We warn that there is some subtlety in the definition of the parameter p(X) for a separable metric space (X, d_X). Given that X is isometric to a subset of ℓ_∞, it is indeed natural to ask for the smallest p ∈ [2, ∞] such that X embeds with bounded distortion, say, distortion 10, into ℓ_p. As an example of an application that was shown to us by Yuval Rabani, one can use the methods of [KOR00] to prove that subsets of ℓ_p admit an efficient approximate nearest neighbor data structure with approximation guarantee e^{O(p)}, so the parameter p(X) relates to approximate nearest neighbor search in X (it would be very interesting to determine the correct asymptotic dependence on p here). But understanding the set of p ∈ [2, ∞) for which X admits a bi-Lipschitz embedding into ℓ_p can be subtle. In particular, it is not true that if X embeds into ℓ_p then for every q > p it also embeds into ℓ_q. In fact, we have the estimates (88) and (89) for every n ∈ N and p, q ∈ (2, ∞).
The asymptotic identity (88) is a standard consequence of the fact that L_q has Rademacher cotype q (see e.g. [Woj91]). The remarkable asymptotic identity (89) is due to [FJS88] (using a computation of [GPP80]). The implicit dependence on p, q in (89) is unknown, and it would be of interest to evaluate it up to a universal constant factor. Observe that the exponent of n in (89) tends to (p − 2)/p² > 0 as q → ∞, and therefore the implicit constant in (89) must tend to 0 as q → ∞.

4.2. Curved Banach spaces in the sense of Pisier. Motivated by his work on nonlinear spectral gaps [Laf08], V. Lafforgue associated the following modulus to a Banach space (X, ‖·‖_X); this modulus has been investigated extensively by Pisier in [Pis10]. Given ε ∈ (0, ∞), let ∆_X(ε) denote the infimum over those ∆ ∈ (0, ∞) such that every n ∈ N and every matrix T = (t_{ij}) ∈ M_n(R) with $$\big\|(|t_{ij}|)\big\|_{L_2^n(\mathbb R)\to L_2^n(\mathbb R)}\le 1\qquad\text{and}\qquad \|T\|_{L_2^n(\mathbb R)\to L_2^n(\mathbb R)}\le \varepsilon$$ satisfy ‖T ⊗ I_X‖_{L_2^n(X)→L_2^n(X)} ≤ ∆. Pisier introduced the following terminology in [Pis10]: X is said to be curved if ∆_X(ε) < 1 for some ε ∈ (0, 1), X is said to be fully curved if ∆_X(ε) < 1 for all ε ∈ (0, 1), and X is said to be uniformly curved if lim_{ε→0} ∆_X(ε) = 0. It is shown in [Pis10] that if X is either fully curved or uniformly curved then it admits an equivalent uniformly convex norm. A remarkable characterization of Pisier [Pis10] shows that ∆_X(ε) ≲ ε^α for some α ∈ (0, ∞) if and only if X arises from complex interpolation with a Hilbert space: formally, this happens if and only if X is isomorphic to a quotient of a subspace of an ultraproduct of θ-Hilbertian Banach spaces for some θ ∈ (0, 1); we refer to [Pis10] for the definition of these notions. A more complicated structural characterization of uniformly curved spaces (based on real interpolation) is also obtained in [Pis10].
One can use the above notions to give a generalized abstract treatment of results in the spirit of Theorem 4.5. Fix ε ∈ (0, 1) and suppose that ∆_X(ε) < 1/2. Let A ∈ M_n(R) be symmetric and stochastic and let T = (t_{ij}) ∈ M_n(R) be given as in (70). By (72) we have ‖T‖_{L_2^n(R)→L_2^n(R)} = λ(A). Moreover, since abs(T) = (|a_{ij} − 1/n|) and A is symmetric and stochastic, it is immediate to check that ‖abs(T)‖_{L_2^n(R)→L_2^n(R)} ≤ 2. By the definition of the modulus ∆_X(·) we therefore have ‖T ⊗ I_X‖_{L_2^n(X)→L_2^n(X)} ≤ 2∆_X(λ(A)/2). Choose m ∈ N so that λ(A^m)/2 = λ(A)^m/2 ≤ ε, and apply the above reasoning with A replaced by A^m. Arguing as in (75) and (76), we obtain an estimate to which we can apply the notion of Markov type through Lemma 4.2, deducing the following statement.
In particular, using Theorem 4.3 and arguing as in the proof of Corollary 4.7, if X has modulus of smoothness of power type 2 then every symmetric stochastic matrix A ∈ M_n(R) satisfies the corresponding spectral gap estimate.
In conjunction with Theorem 1.3 we deduce the following geometric embedding result for uniformly curved Banach spaces that can be renormed so as to have modulus of smoothness of power type 2.

4.3. An interpolation inequality for nonlinear spectral gaps.
The modulus of uniform convexity of a Banach space (X, ‖·‖_X) is defined for ε ∈ [0, 2] as $$\delta_X(\varepsilon)=\inf\left\{1-\frac{\|x+y\|_X}{2}:\ x,y\in X,\ \|x\|_X=\|y\|_X=1,\ \|x-y\|_X=\varepsilon\right\}.$$ X is said to be uniformly convex if δ_X(ε) > 0 for all ε ∈ (0, 2]. Furthermore, X is said to have modulus of convexity of power type p if there exists a constant c ∈ (0, ∞) such that δ_X(ε) ≥ cε^p for all ε ∈ [0, 2]. It is straightforward to check that in this case necessarily p ≥ 2. By Proposition 7 in [BCL94] (see also [Fig76]), X has modulus of convexity of power type p if and only if there exists a constant K ∈ [1, ∞) such that for every x, y ∈ X $$\left\|\frac{x+y}{2}\right\|_X^p+\frac{1}{K^p}\left\|\frac{x-y}{2}\right\|_X^p\le \frac{\|x\|_X^p+\|y\|_X^p}{2}. \qquad (91)$$ The infimum over those K for which (91) holds true is called the p-convexity constant of X, and is denoted K_p(X). Note that every Banach space satisfies K_∞(X) = 1. Below we shall use the convention K_p(X) = ∞ if p ∈ [1, 2). For 1 ≤ q ≤ 2 ≤ p, the p-convexity constant of X is related to the q-smoothness constant of X (recall (62)) via the following duality relation [BCL94, Lem. 5]: $$\frac{1}{p}+\frac{1}{q}=1\implies K_p(X)=S_q(X^*).$$
Theorem 4.15. Let (X, Y) be a compatible pair of Banach spaces. Fix θ ∈ [0, 1] and consider the complex interpolation space Z = [X, Y]_θ. Fix also p, q ∈ [1, ∞] and r ∈ [1, 2]. Then for every n ∈ N and every n by n symmetric stochastic matrix A we have a bound on γ(A, ‖·‖_Z^s) in terms of γ(A, ‖·‖_X^p) and γ(A, ‖·‖_Y^q), where c ∈ (0, ∞) is a universal constant and s ∈ [2, ∞] is given in terms of p, q, r and θ. Observe that for every a, b ∈ (0, ∞) and every θ ∈ (0, 1) we have the elementary inequality a^θ b^{1−θ} ≤ θa + (1 − θ)b. Consequently, the conclusion of Theorem 4.15 implies that $$\gamma\big(A,\|\cdot\|_Z^s\big)\lesssim_{X,Y,Z,s}\ \gamma\big(A,\|\cdot\|_X^p\big)^{\theta s/r}\,\gamma\big(A,\|\cdot\|_Y^q\big)^{(1-\theta)s/r}.$$ Such an estimate is in the spirit of the interpolation inequality (15), but it is insufficient for the purpose of addressing Question 1.8. Theorem 4.15 does suffice to prove Lemma 1.11, so we assume the validity of Theorem 4.15 for the moment and proceed now to prove Lemma 1.11.
Proof of Lemma 1.11. Matoušek proved in [Mat97] that if (X, d_X) is an n-point metric space then for p ∈ [2, ∞) we have c_p(X) ≲ (log n)/p; this is (92). The case p = 2 of (92) is Bourgain's embedding theorem [Bou85]. Now, for every n ∈ N let G_n = ({1, . . . , n}, E_{G_n}) be a 4-regular graph with sup_{n∈N} λ_2(G_n) < 1, i.e., {G_n}_{n=1}^∞ forms an expander sequence. Fixing n ≥ e^p, by (92) we know that there exist x_1, . . . , x_n ∈ ℓ_p satisfying (93). Suppose that y_1, . . . , y_n ∈ ℓ_q satisfy (24), i.e., (94) holds. If ‖y_i − y_j‖_{ℓ_q} ≤ D‖x_i − x_j‖_{ℓ_p} for every i, j ∈ {1, . . . , n} then we need to show that D ≳ p/(q + r). Note that since G_n is 4-regular, a constant fraction of the pairs (i, j) ∈ {1, . . . , n}² satisfy d_{G_n}(i, j) ≳ log n (the standard argument showing this appears e.g. in [Mat97]). Hence, due to the leftmost inequality in (93), it follows from (94) that (95) holds, where in (96) we bounded γ(G_n, ‖·‖_{ℓ_q}^q) using (14), and we used the fact that if {i, j} ∈ E_{G_n} then ‖y_i − y_j‖_{ℓ_q} ≤ D‖x_i − x_j‖_{ℓ_p} ≲ D(log n)/p, due to (93). By contrasting (96) with (95) we conclude that D ≳ p/q, as required.
Hence, if we argue as in (96) we see that D ≳ p/r, as required.
We now prove Theorem 4.15.
Proof of Theorem 4.15. We may assume that A is ergodic, so that γ(A, ‖·‖_X^p), γ(A, ‖·‖_Y^q) < ∞, since otherwise the conclusion of Theorem 4.15 is vacuous; by Lemma 2.3 we then have the bounds (97) and (98). For a Banach space (W, ‖·‖_W) and t ∈ [1, ∞] let L_t^n(W)_0 be the subspace of L_t^n(W) consisting of mean-zero vectors, i.e., $$L_t^n(W)_0 := \left\{(w_1,\dots,w_n)\in L_t^n(W):\ \sum_{i=1}^n w_i=0\right\}.$$ Let Q : L_t^n(R) → L_t^n(R)_0 be the canonical projection, i.e., for every v ∈ L_t^n(R) and i ∈ {1, . . . , n}, $$(Qv)_i=v_i-\frac{1}{n}\sum_{j=1}^n v_j.$$ Then by the triangle inequality ‖Q ⊗ I_W‖_{L_t^n(W)→L_t^n(W)_0} ≤ 2, and consequently we obtain a corresponding bound on ‖(BQ) ⊗ I_W‖ for every B ∈ M_n(R) such that B(L_t^n(R)_0) ⊆ L_t^n(R)_0. Note that if B is symmetric and stochastic then B(L_t^n(R)_0) ⊆ L_t^n(R)_0, and by [MN12, Lem. 6.6] for every t ∈ [1, ∞] the estimate (100) holds. (Observe that we always have ‖B ⊗ I_W‖_{L_t^n(W)→L_t^n(W)} ≤ 1 because B is symmetric and stochastic, so (100) is most meaningful when t ∈ [2, ∞) and K_t(W) < ∞.) Consequently, for every m ∈ N we have the estimate (102).
The same reasoning applied to W = Y and t = q, while using (98), shows that (103) holds. Interpolation of (102) and (103) yields the following estimate.
Lemma 5.1. Under the assumptions of Theorem 1.10, fix n ∈ N, q ∈ [1, ∞) and x_1, . . . , x_n ∈ X satisfying the normalization (108). For every i ∈ {1, . . . , n} let r_i ∈ (0, ∞) be the smallest r > 0 such that (109) holds. Then for every n by n symmetric stochastic matrix A = (a_{ij}) ∈ M_n(R), the quantity min_{i∈{1,...,n}} r_i admits the bound (110).

Proof. Note that the concavity on [0, ∞) of the function x ↦ β(x^{1/q})^q implies the pointwise bound (111) for every λ ∈ (0, ∞). By relabeling the points if necessary we may assume without loss of generality that r_1 = min_{i∈{1,...,n}} r_i, and for the sake of simplicity we use below the notation introduced in (112) and (113). For i ∈ {1, . . . , n} define y_i as in (114), where ρ : X → B_X is given in (107). Since ρ is 2-Lipschitz, the estimate (115) holds. By definition y_i ∈ x_1 + r_1 B_X, so we may consider the vectors obtained from the y_i in (117). It follows from (119) and Markov's inequality that if we set C as in (120) then |C| ≥ 3n/4, and by (114) it then follows that (121) holds. Recalling the definitions (113), (120) and (117), for every i ∈ B ∩ C we have (122), and hence for every i, j ∈ B ∩ C we obtain (123). Fix an arbitrary index k ∈ B ∩ C and define S as in (124). It follows from (123) that S ⊇ B ∩ C, and therefore by (121) we have a lower bound on |S|. Moreover, by the definition (124), for every i ∈ {1, . . . , n} \ S and j ∈ S we can bound d_X(x_i, x_k) from below. Hence, averaging over i ∈ {1, . . . , n} \ S and j ∈ S, and using the trivial fact that γ(A, d_R^q) ≤ γ (since Y contains an isometric copy of R), a combination of (126) and (129) yields the estimate (130). Consequently, an application of Markov's inequality shows that $$\left|\left\{i\in\{1,\dots,n\}:\ \|x_i-x_k\|_X\ge 2^{1/q}\big(2^{q+1}\gamma+2^{q-1}r^q\big)^{1/q}\right\}\right|\le \frac{n}{2}.$$
If (2r)^q ≤ r_1^q/2 then it follows from (130) that r_1^q ≤ 2^{q+3}γ, implying the desired estimate (110). So, suppose that (2r)^q > r_1^q/2, which, by recalling the definition of r in (123), yields an explicit upper bound on r_1^q, implying the validity of (110) in this case as well.
Proof of Theorem 1.10. We continue to use the notation introduced in the statement and proof of Lemma 5.1. In particular, the assumptions (108) and (112) are (without loss of generality) still in force.

Limitations of Ozawa's method.
In the discussion immediately preceding the estimates (21) and (22) we stated that for every p ∈ [1, ∞) there exists a mapping φ : ℓ_p → ℓ_2 satisfying (135) and (136) for every x, y ∈ ℓ_p. The estimates (135) and (136) are a special case of the following bounds on the modulus of uniform continuity of the Mazur map [Maz29] (see also [BL00, Ch. 9]). Let (Ω, µ) be a measure space, fix p, q ∈ [1, ∞), and let M_{p,q} : L_p(µ) → L_q(µ) denote the Mazur map, $$M_{p,q}(f):=\mathrm{sign}(f)\,|f|^{p/q}.$$
If p ≥ q then for every f, g ∈ L_p(µ) with ‖f‖_{L_p(µ)}, ‖g‖_{L_p(µ)} ≤ 1 we have the two-sided estimate (137). Note that (135) is a special case of (137), and (136) is also a consequence of (137) because M_{p,q}^{−1} = M_{q,p}. While the bounds appearing in (137) are entirely standard, they seem to have always been stated in the literature either with implicit multiplicative constant factors or with suboptimal constant factors. These constants play a role in our context, so we briefly include the proof of (137), following the lines of the proof of [BL00, Prop. 9.2]. An elementary inequality, valid for every u, v ∈ R and θ ∈ [1, ∞), immediately implies (with θ = p/q) the leftmost inequality in (137). To prove the rightmost inequality of (137), one uses a second elementary inequality, also valid for every u, v ∈ R and θ ∈ [1, ∞); then (138) follows from an application of Hölder's inequality with exponents p/q and p/(p − q), and (139) follows in turn. Returning to Theorem 1.10 (in particular using the notation and assumptions that were introduced in its statement), if one wants the bound (20) to be compatible with the assumption of Theorem 1.3 one needs (20) to yield an upper bound on γ(A, ‖·‖_X^q) that grows linearly with γ(A, ‖·‖_Y^q). This is equivalent to the requirement (140). Since (140) is supposed to hold for every n ∈ N and every n by n symmetric stochastic matrix A, (140) is the same as requiring that β(t) ≲_{X,Y} t for every t ∈ (0, ∞]. Specializing the above discussion to Y = ℓ_2 and q = 2, if β(t) ≤ Kt for some K ∈ (0, ∞) and every t ∈ (0, ∞) then (20) yields the estimate $$\gamma\big(A,\|\cdot\|_X^2\big)\lesssim \frac{(K/\alpha(1/4))^2}{1-\lambda_2(A)}.$$
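The Mazur map is easy to experiment with numerically. The sketch below (with counting measure, so L_p is a finite-dimensional ℓ_p) verifies the exact normalization ‖M_{p,q}(f)‖_q = ‖f‖_p^{p/q} and probes the Lipschitz-type upper bound on random unit vectors; since the sharp constant in (137) is not displayed in this extract, the probe asserts only the slightly lossy constant (p/q)·2^{(p−q)/(pq)}, which follows from the pointwise mean-value estimate plus Hölder.

```python
import numpy as np

def mazur(f, p, q):
    """Mazur map M_{p,q}(f) = sign(f) |f|^{p/q}."""
    return np.sign(f) * np.abs(f) ** (p / q)

def lp_norm(f, p):
    return (np.abs(f) ** p).sum() ** (1 / p)

rng = np.random.default_rng(3)
p, q = 4.0, 2.0
f = rng.standard_normal(10)
f /= lp_norm(f, p)                                   # ||f||_p = 1
assert abs(lp_norm(mazur(f, p, q), q) - 1) < 1e-10   # ||M_{p,q} f||_q = 1

# Probe the modulus of continuity on random unit vectors.  The constant
# (p/q) * 2^{(p-q)/(pq)} is a safely provable (if slightly lossy) bound.
const = (p / q) * 2 ** ((p - q) / (p * q))
for _ in range(200):
    g = rng.standard_normal(10)
    g /= lp_norm(g, p)
    lhs = lp_norm(mazur(f, p, q) - mazur(g, p, q), q)
    assert lhs <= const * lp_norm(f - g, p) + 1e-9
```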

In particular, if p ∈ (2, ∞) then due to (135) we get the estimate Av^{(2)}_{ℓ_2}(ℓ_p) ≲ p·2^{3p/2}, which is exponentially worse than (23). The following lemma shows that this exponential loss is inherent to the use of Theorem 1.10 for the purpose of obtaining average distortion embeddings of finite subsets of ℓ_p into ℓ_2, i.e., that K/α(1/4) must grow exponentially in p as p → ∞.
Proof. Fix m, n ∈ N and s ∈ (0, 1]. For every x ∈ Z^n define ψ(x) ∈ ℓ_p to be the vector whose jth coordinate equals (s/n^{1/p})·e^{2πx_j i/m} if j ∈ {1, . . . , n}, and whose remaining coordinates vanish. Then ‖ψ(x)‖_{ℓ_p} = s ≤ 1 for every x ∈ Z^n.
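The normalization of ψ is immediate (each of the n nonzero coordinates has modulus s/n^{1/p}), and can be confirmed numerically; the parameter values below are arbitrary.

```python
import numpy as np

n, m, p, s = 8, 12, 3.0, 0.5
rng = np.random.default_rng(5)
x = rng.integers(0, m, size=n)                      # a point of Z^n (mod m)
psi = (s / n ** (1 / p)) * np.exp(2j * np.pi * x / m)
lp = (np.abs(psi) ** p).sum() ** (1 / p)            # ell_p norm of psi(x)
assert abs(lp - s) < 1e-12                          # ||psi(x)||_p = s exactly
```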
By the results of Section 3 of [MN08], if m is divisible by 4 and m ≤ (2/3)π√n then the estimates (144) and (145) hold. Substituting (145) and (144) into (143), we obtain an inequality that simplifies to give the desired estimate (142).

Bourgain–Milman–Wolfson type
Here we study aspects of nonlinear type in the sense of Bourgain, Milman and Wolfson [BMW86], proving in particular Lemma 1.12, Lemma 1.13 and Theorem 1.15, which were stated in the Introduction.
For every k ∈ {1, . . . , n} let Ω_k^n ⊆ F_2^n × 2^{{1,...,n}} be defined by $$\Omega_k^n:=\big\{(x,S):\ x\in \mathbb F_2^n,\ S\subseteq\{1,\dots,n\},\ |S|=k\big\},$$ so that |Ω_k^n| = 2^n·\binom{n}{k}. Fixing a metric space (X, d_X) and f : F_2^n → X, define the quantity E_k(f) as an average over Ω_k^n, where for I ⊆ {1, . . . , n} we set e_I = ∑_{j∈I} e_j. Thus, using the notation of the Introduction, we have e_{{1,...,n}} = e. We also record for future use the following simple consequence of the triangle inequality in (X, d_X).
Lemma 6.1. Fix n ∈ N and q ∈ [1, ∞). Suppose that (X, d_X) is a metric space and f : F_2^n → X. Then for every k, m ∈ {1, . . . , n} with k + m ≤ n, the subadditivity-type estimate (149) holds.
The pointwise estimate (150), combined with the triangle inequality in L_q(Ω_{k+m}^n), implies the bounds (152) and (153). The desired estimate (149) now follows from a substitution of (152) and (153) into (151).
For m, n ∈ N with n ≥ m, and for every f : F_2^m → X, denote the natural lifting of f to F_2^n by f^{↑n} : F_2^n → X, that is, f^{↑n}(x_1, . . . , x_n) = f(x_1, . . . , x_m). We then have the following identity for every k ∈ {1, . . . , n}.
With this notation, a metric space (X, d_X) has BMW type p ∈ (0, ∞) if and only if there exists T ∈ (0, ∞) such that for every n ∈ N and every f : F_2^n → X, $$E_n(f)\le T\,n^{1/p}E_1(f). \qquad (157)$$
Let BMW_p^n(X) denote the infimum over those T ∈ (0, ∞) for which (157) holds true for every f : F_2^n → X. Thus BMW_p(X) = sup_{n∈N} BMW_p^n(X).

Consequently, $$\forall\, m, n\in\mathbb N,\qquad m\le n\implies \mathrm{BMW}_p^m(X)\,m^{1/p}\le \mathrm{BMW}_p^n(X)\,n^{1/p}. \qquad (158)$$ Remark 6.4. Let (X, d_X) be a metric space and p ∈ (1, ∞) be such that X has BMW type p. Then necessarily p ≤ 2. Indeed, choose distinct x_0, x_1 ∈ X and for every n ∈ N define f_n : F_2^n → X by f_n(z) = x_{z_1}. Then E_n(f_n) = d_X(x_0, x_1), while E_1(f_n) = d_X(x_0, x_1)/√n. This means that n^{1/p}·BMW_p(X) ≥ √n for all n ∈ N, which implies that p ≤ 2. A straightforward application of the triangle inequality (see [BMW86]) implies that every metric space has BMW type 1 with BMW_1(X) = 1.
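The example of Remark 6.4 can be checked by brute force. The code below uses the natural reading of the (garbled) definition of E_k(f): E_k(f)² averages d_X(f(x + e_S), f(x))² over all x ∈ F_2^n and all subsets S of size k. For f(z) = x_{z_1}, flipping all n coordinates always flips z_1, while a single random flip hits coordinate 1 with probability 1/n.

```python
import numpy as np
from itertools import combinations, product

def E_k(f, n, k):
    """E_k(f): quadratic average of |f(x + e_S) - f(x)| over (x, S), |S| = k."""
    total, count = 0.0, 0
    for x in product((0, 1), repeat=n):
        for S in combinations(range(n), k):
            y = list(x)
            for i in S:
                y[i] ^= 1                     # flip the coordinates in S
            total += (f(tuple(y)) - f(x)) ** 2
            count += 1
    return np.sqrt(total / count)

n = 4
f = lambda z: float(z[0])   # the two-valued map of Remark 6.4 with x_0 = 0, x_1 = 1
assert abs(E_k(f, n, n) - 1.0) < 1e-12             # E_n(f) = d(x_0, x_1) = 1
assert abs(E_k(f, n, 1) - 1 / np.sqrt(n)) < 1e-12  # E_1(f) = d(x_0, x_1)/sqrt(n)
```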
Recalling the definition of p_X ∈ [1, 2] in (27), we have the following lemma, which relies on a sub-multiplicativity argument that was introduced by Pisier [Pis73] in the context of Rademacher type of normed spaces and has been implemented in the context of nonlinear type by Bourgain, Milman and Wolfson [BMW86] (see also [Pis86]).
Lemma 6.5. For every metric space (X, d_X) we have BMW_{p_X}^n(X) ≥ 1 for all n ∈ N.

Proof. Write p = p_X and suppose, for the sake of obtaining a contradiction, that there exist m ∈ N and ε ∈ (0, 1) such that BMW_p^m(X) < ε. Since BMW_p^1(X) = 1, we have m ≥ 2, and we may also assume without loss of generality that ε > 1/m^{1/p}. If we define q by $$\varepsilon = m^{\frac{1}{q}-\frac{1}{p}}, \qquad (159)$$ then q ∈ (p, ∞). By [BMW86, Lem. 2.3] (see also [Pis86, Lem. 7.2]), for every k, n ∈ N we have BMW_p^{kn}(X) ≤ BMW_p^k(X) · BMW_p^n(X). Consequently, for every i ∈ N we have BMW_p^{m^i}(X) < ε^i. For every n ∈ N choose i ∈ N such that m^{i−1} ≤ n < m^i. Since, by (158), BMW_p^n(X)·n^{1/p} increases with n, it follows from (160) that $$\mathrm{BMW}_q^n(X)=\frac{\mathrm{BMW}_p^n(X)\,n^{1/p}}{n^{1/q}}\le \frac{\mathrm{BMW}_p^{m^i}(X)\,m^{i/p}}{n^{1/q}}<\frac{\varepsilon^i m^{i/p}}{n^{1/q}}=\frac{m^{i/q}}{n^{1/q}}\le m^{1/q}.$$
Consequently BMW_q(X) < ∞, i.e., X has BMW type q. Since q > p, this contradicts the definition of p_X = p.
Lemma 6.6. Fix p ∈ [1, 2], n ∈ N, and k_1, . . . , k_m ∈ {1, . . . , n} such that k_1 + · · · + k_m ≤ n. Then for every metric space (X, d_X) and every f : F_2^n → X, the corresponding estimate holds.

Proof of Theorem 1.15. Denote p_X = p. Fix ε ∈ (0, 1/3) and define n as in (163). Since n ≥ d, we can consider F_2^d as a subset of F_2^n (say, canonically embedded as the first d coordinates).
By Lemma 6.5 we have BMW_p^n(X) ≥ 1, and therefore by the definition of BMW_p^n(X) there exists f : F_2^n → X satisfying (164). By Lemma 6.6 with m = ‖x − y‖_1 and k_1 = · · · = k_m = 1, for every x, y ∈ F_2^n we have the estimate (165). Fixing x, y ∈ F_2^d ⊆ F_2^n, write n = a‖x − y‖_1 + b for appropriate integers a ≥ 0 and b ∈ [0, ‖x − y‖_1). By Lemma 6.6 with m = b and k_1 = · · · = k_m = 1 we obtain (166). Using Lemma 6.6 once more, this time with m = a + 1, k_1 = b and k_2 = · · · = k_{a+1} = ‖x − y‖_1, and noting that since ‖x − y‖_1 ≤ d ≤ εn we have m ≤ (1 + ε)n/‖x − y‖_1, we obtain a further estimate; in combination with (167) and our assumption (164), this implies (168). Recalling (163), we have ‖x − y‖_1/n ≤ d/n ≤ ε/BMW_p(X)^4 (since x, y ∈ F_2^d), and it therefore follows from (168) that (169) holds. Since ε ∈ (0, 1/3) can be taken to be arbitrarily small, the proof is concluded by combining (166) and (169).

Obstructions to average distortion embeddings of cubes.
We start by proving Lemma 1.13, whose proof is very simple.
Then there exists x ∈ F_2^m and i ∈ {1, . . . , n} such that the conclusion (174) holds.

Proof. By Pisier's inequality [Pis86] we have (175). For every fixed x ∈ F_2^m it follows from Kahane's inequality (with asymptotically optimal dependence on r; see e.g. [Tal88]) that the corresponding estimate holds. Combined with (175), this implies the desired bound after maximizing over x ∈ F_2^m and summing over the coordinates. There are classes of Banach spaces Y, including Banach lattices of nontrivial type and UMD spaces, for which it is known that Pisier's inequality (175) holds true with the log m factor replaced by a constant that may depend on Y and r but not on m; see [NS02, HN13]. For such spaces we therefore obtain (174) without the log m term.
Lemma 6.8. Assume that p ∈ [1, 2), and fix q ∈ (p, 2] and r, s ∈ [1, ∞). Suppose that (Y, ‖·‖_Y) is a Banach space with S_q(Y) < ∞, i.e., Y has modulus of uniform smoothness of power type q. If f : F_2^m → Y satisfies (173) then there exists x ∈ F_2^m and i ∈ {1, . . . , n} such that (176) holds.

Proof. Due to (173), in order to prove (176) it suffices to establish (177). Note that it suffices to prove (177) when r ≥ 2, since otherwise we could replace q by r and use the fact that S_r(Y) ≤ S_q(Y).
By considering the standard random walk on the Hamming cube F_2^n and arguing mutatis mutandis as in [NS02, Sec. 5], (177) is a formal consequence of the Markov type estimate of Theorem 4.3. Alternatively, one can deduce (177) directly via the martingale argument in [KN06, Sec. 5], the only difference being the use of the martingale inequality (65) in place of Pisier's martingale inequality [Pis75].
Proof of Lemma 1.12. Since for q ∈ [2, ∞) we have S_2(ℓ_q) ≤ √(q − 1), Lemma 1.12 is a special case of Lemma 6.8 with Y = ℓ_q, n = 2^m and x_1, . . . , x_{2^m} being an arbitrary enumeration of F_2^m ⊆ Y.

Remark 6.9. As promised in the Introduction, here we justify (29). The fact that Av_R(F_2^n, ‖·‖_2) ≲ ⁴√n is simple: consider the mapping φ : F_2^n → R given by φ(x) = √(max{‖x‖_1 − n/2, 0}). The square root makes φ 1-Lipschitz with respect to the metric induced on F_2^n by the Euclidean norm ‖·‖_2, since ‖x − y‖_1 = ‖x − y‖_2² on F_2^n. By the central limit theorem, the average of |φ(x) − φ(y)|² over (x, y) ∈ F_2^n × F_2^n is of order √n.
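Both properties of φ can be verified exhaustively on a small cube. The sketch below checks the 1-Lipschitz bound over all pairs (using |√a − √b| ≤ √|a − b| together with ‖x − y‖_1 = ‖x − y‖_2² on the cube) and computes the average of |φ(x) − φ(y)|²; the square root in φ is our reading of the construction, as noted above.

```python
import numpy as np
from itertools import product

def phi(x, n):
    """phi(x) = sqrt(max{||x||_1 - n/2, 0}) on the Hamming cube."""
    return np.sqrt(max(sum(x) - n / 2, 0.0))

n = 8
cube = list(product((0, 1), repeat=n))
vals = {x: phi(x, n) for x in cube}

# exhaustive 1-Lipschitz check with respect to the Euclidean metric
for x in cube:
    for y in cube:
        d2 = np.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
        assert abs(vals[x] - vals[y]) <= d2 + 1e-12

# average squared spread of phi (of order sqrt(n) by the CLT)
avg = np.mean([(vals[x] - vals[y]) ** 2 for x in cube for y in cube])
```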
The corresponding lower bound Av_R(F_2^n, ‖·‖_2) ≳ ⁴√n is an example of a lower bound on the average distortion of the cube F_2^n that is not proved through the use of nonlinear type. Suppose that f : F_2^n → R satisfies |f(x) − f(y)| ≤ ‖x − y‖_2 for every x, y ∈ F_2^n. Suppose also that S ⊆ F_2^n satisfies |S| ≥ 2^{n−1}. Then by Harper's inequality [Har66] (see also [Led01, Thm. 2.11]), for every t ∈ (0, ∞) we have $$\frac{\big|\{x\in\mathbb F_2^n:\ \forall\, y\in S,\ \|x-y\|_2\ge t\}\big|}{2^n}=\frac{\big|\{x\in\mathbb F_2^n:\ \forall\, y\in S,\ \|x-y\|_1\ge t^2\}\big|}{2^n}\le e^{-2t^4/n}.$$
Remark 6.10. Additional obstructions to average distortion embeddings that do not fall into the framework described in this section have been obtained in the context of integrality gap lower bounds for the Goemans–Linial semidefinite relaxation of the Uniform Sparsest Cut Problem. The best known result in this direction is due to [KM13] (improving over the works [DKSV06, KR09]), where it is shown that for arbitrarily large n ∈ N there exists an n-point metric space (X, d_X) such that the metric space (X, √d_X) embeds isometrically into ℓ_2, yet Av

Existence of average distortion embeddings
The main purpose of this section is to state criteria for the existence of average distortion embeddings. In what follows we often discuss probability distributions over random subsets or random partitions of metric spaces. To avoid measurability issues we focus our discussion on finite metric spaces. Such topics can be treated for infinite spaces as well, as done in [LN05].
7.1. Random zero sets. Fix ∆, ζ ∈ (0, ∞) and δ ∈ (0, 1). Following [ALN08], a finite metric space (X, d_X) is said to admit a random zero set at scale ∆ which is ζ-spreading with probability δ if there exists a probability distribution µ over 2^X such that every x, y ∈ X with d_X(x, y) ≥ ∆ satisfy $$\mu\left(\left\{Z\subseteq X:\ x\in Z\ \wedge\ d_X(y,Z)\ge \frac{\Delta}{\zeta}\right\}\right)\ge \delta. \qquad (178)$$ We denote by ζ(X; δ) the infimum over those ζ ∈ (0, ∞) such that for every scale ∆ ∈ (0, ∞) the finite metric space (X, d_X) admits a random zero set at scale ∆ which is ζ-spreading with probability δ. If (X, d_X) is an infinite metric space then we write ζ(X; δ) for the supremum of ζ(S; δ) over all finite subsets S ⊆ X. The following proposition asserts that random zero sets can be used to obtain embeddings into the real line R with low average distortion.
Proposition 7.1. Fix n ∈ N and δ ∈ (0, 1). Suppose that (X, d_X) is a metric space with ζ(X; δ) < ∞. Then for every p ∈ [1, ∞) and every x_1, . . . , x_n ∈ X there exists a 1-Lipschitz function f : X → R satisfying the corresponding average distortion bound. Thus, using the notation of Section 1.3, Av^{(p)}_R(X) ≲ ζ(X; δ)/δ^{1/p}. Proposition 7.1 will be proven in Section 7.4 below. We will now explain how Proposition 7.1 can be applied to a variety of metric spaces; due to the discussion preceding Theorem 1.3, such spaces satisfy the spectral inequality (7) with Ψ linear.

7.1.1. Random partitions. Many spaces are known to admit good random zero sets. Such examples often (though not always) arise from metric spaces for which one can construct random padded partitions. If (X, d_X) is a finite metric space let P(X) denote the set of all partitions of X. For P ∈ P(X) and x ∈ X, the unique element of P to which x belongs is denoted P(x) ⊆ X. Given ε, δ ∈ (0, 1), the metric space (X, d_X) is said to admit an ε-padded random partition with probability δ if for every ∆ ∈ (0, ∞) there exists a probability distribution µ_∆ over partitions of X with the following properties.
• Every P in the support of µ_∆ consists of clusters of diameter at most ∆.
• For every x ∈ X we have µ_∆({P ∈ P(X) : B_X(x, ε∆) ⊆ P(x)}) ≥ δ, where B_X(x, r) := {y ∈ X : d_X(x, y) ≤ r} for every r ∈ [0, ∞).
([ALN08, Fact 3.4] states this for the arbitrary choice δ = 1/2, but its proof does not use this specific value of δ in any way.) One should interpret (180) as asserting that a lower bound on ε(X; δ) implies an upper bound on ζ(X; δ/4). The following classes of metric spaces (X, d_X) are known to satisfy ε(X; δ) > 0 for some δ ∈ (0, 1): doubling metric spaces, compact Riemannian surfaces, Gromov hyperbolic spaces of bounded local geometry, Euclidean buildings, symmetric spaces, homogeneous Hadamard manifolds, and forbidden-minor (edge-weighted) graph families. The case of doubling spaces goes back to [Ass83], with subsequently improved bounds on ε(X; δ) obtained in [GKL03]. The case of forbidden-minor graph families is due to [KPR93], with subsequently improved bounds obtained in [FT03]. The case of compact Riemannian surfaces is due to [LN05], with subsequently improved bounds obtained in [LS10]. The remaining cases follow from the general fact [NS11] that if (X, d_X) has bounded Nagata dimension then ε(X; δ) > 0 for some δ ∈ (0, 1) (see [LS05] for more information on Nagata dimension of metric spaces). We single out the following two consequences of Proposition 7.1 and (the easy direction of) Theorem 1.3, with explicit quantitative bounds arising from the estimates on ε(X; δ) obtained in [GKL03, LS10].
Corollary 7.2. Suppose that (X, d_X) is a metric space that is doubling with constant K ∈ [2, ∞). Then for every n ∈ N and every symmetric stochastic matrix A ∈ M_n(R), the corresponding spectral gap comparison holds.

Corollary 7.3. Suppose that (X, d_X) is a two-dimensional Riemannian manifold of genus g ∈ N ∪ {0}. Then for every n ∈ N and every symmetric stochastic matrix A ∈ M_n(R), the corresponding spectral gap comparison holds.

The fact that the conclusion of Proposition 7.1 holds true under the assumption that ε(X; δ) > 0 (as follows by combining Proposition 7.1 with (180)) was proved by Rabinovich [Rab08] in the case p = 1. It has long been well known to experts (and was stated explicitly in [BLR10]) that the original proof of Rabinovich extends mutatis mutandis to every p ∈ [1, ∞). The (simple) proof of Proposition 7.1 below builds on the ideas of [Rab08].
An example of a class of metric spaces that admits good random zero sets for reasons other than the existence of random padded partitions is the class of spaces that admit a quasisymmetric embedding into Hilbert space. We refer to [Hei01] and the references therein for more information on quasisymmetric embeddings; it suffices to say here that L 1 (µ) spaces provide such examples (see [DL97]). It follows from [ALN08] (using in part ideas of [ARV09, NRS05, Lee05, CGR08]) that if (X, d X ) is a metric space that admits a quasisymmetric embedding into Hilbert space then there exist ε, δ ∈ (0, 1) (depending only on the modulus of quasisymmetry of the implicit embedding) such that for every n ∈ N, any n-point subset S ⊆ X satisfies $\varepsilon(S;\delta)\ge \varepsilon/\sqrt{\log n}$. Consequently we have the following statement.
Corollary 7.4. Suppose that (X, d X ) is a metric space that admits a quasisymmetric embedding into a Hilbert space. Then there exists a constant C ∈ (0, ∞) (depending only on the modulus of quasisymmetry of the implicit embedding) such that for every n ∈ N and every symmetric stochastic matrix A ∈ M n (R) we have

Note that Bourgain's embedding theorem [Bou85] implies that (181) holds true for every metric space (X, d X ) if one replaces the term log n by (log n)², in which case C can be taken to be a universal constant.

7.2. Localized weakly bi-Lipschitz embeddings. Following the terminology of [NPSS06], for D ∈ [1, ∞) say that a metric space (X, d X ) admits a weakly bi-Lipschitz embedding with distortion D into a metric space (Y, d Y ) if for every ∆ ∈ (0, ∞) there exists a nonconstant Lipschitz mapping f ∆ : X → Y such that for every x, y ∈ X, $$d_X(x,y)\ge \Delta \implies d_Y(f_\Delta(x),f_\Delta(y))\ge \frac{\Delta}{D}\,\|f_\Delta\|_{\mathrm{Lip}}. \qquad (182)$$ The origin of this terminology is that such embeddings preserve (by design) weak (p, q) metric Poincaré inequalities. Specifically, a standard way by which one rules out the existence of bi-Lipschitz embeddings is via generalized Poincaré-type inequalities, as follows. Suppose that n ∈ N and p, q, K ∈ (0, ∞), and there exist two measures µ, ν on {1, . . . , n}² such that every y 1 , . . . , y n ∈ Y satisfy (183). Clearly if f : X → Y is a bi-Lipschitz embedding then the inequality (183) holds for (X, d X ) as well, with the right hand side of (183) multiplied by $\|f\|_{\mathrm{Lip}}\|f^{-1}\|_{\mathrm{Lip}}$. Thus strong (p, q) inequalities such as (183) are bi-Lipschitz invariants that can be used to show that certain spaces (X, d X ) must incur large distortion in any bi-Lipschitz embedding into (Y, d Y ). The obvious weak (p, q) variant of (183) is the assertion that for every u ∈ (0, ∞) and every y 1 , . . . , y n ∈ Y we have (184). By definition, if a metric space (X, d X ) admits a weakly bi-Lipschitz embedding with distortion D into a metric space (Y, d Y ) satisfying (184) then for every n ∈ N, any x 1 , . . .
, x n ∈ X satisfy the corresponding weak (p, q) inequality. We will see below that one can prove a nonlinear spectral gap inequality such as (6) with Ψ linear by showing that (X, d X ) admits a weakly bi-Lipschitz embedding into (Y, d Y ). To this end it suffices to localize the condition (182) to balls of proportional scale, as follows. For D ∈ [1, ∞) say that a metric space (X, d X ) admits a localized weakly bi-Lipschitz embedding with distortion D into a metric space (Y, d Y ) if for every z ∈ X and ∆ ∈ (0, ∞) there exists a nonconstant Lipschitz mapping f z ∆ : X → Y such that for every x, y ∈ B X (z, 32∆) we have $$d_X(x,y)\ge \Delta \implies d_Y(f^z_\Delta(x),f^z_\Delta(y))\ge \frac{\Delta}{D}\,\|f^z_\Delta\|_{\mathrm{Lip}}. \qquad (185)$$ The factor 32 here was chosen to be convenient for the ensuing arguments, but it is otherwise arbitrary.
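The bi-Lipschitz invariance of strong (p, q) inequalities such as (183), used in Section 7.2 above, amounts to a two-line computation. A sketch, in which the displayed generic form of (183) is an assumption (the exact shape of the lost display is not reproduced in this text):

```latex
% Assume (183) has the generic (p,q)-Poincare form
\Big(\sum_{i,j} d_Y(y_i,y_j)^p\,\mu(i,j)\Big)^{1/p}
  \le K \Big(\sum_{i,j} d_Y(y_i,y_j)^q\,\nu(i,j)\Big)^{1/q}.
% For a bi-Lipschitz f : X -> Y, apply this with y_i = f(x_i) and use
%   d_X(x_i,x_j) \le \|f^{-1}\|_{Lip}\, d_Y(f(x_i),f(x_j)),
%   d_Y(f(x_i),f(x_j)) \le \|f\|_{Lip}\, d_X(x_i,x_j),
% to obtain the same inequality for (X,d_X), with the constant
% multiplied by \|f\|_{Lip}\|f^{-1}\|_{Lip}:
\Big(\sum_{i,j} d_X(x_i,x_j)^p\,\mu(i,j)\Big)^{1/p}
  \le \|f\|_{\mathrm{Lip}}\|f^{-1}\|_{\mathrm{Lip}}\,
      K \Big(\sum_{i,j} d_X(x_i,x_j)^q\,\nu(i,j)\Big)^{1/q}.
```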
Proposition 7.5. Fix n ∈ N and p, D ∈ [1, ∞). Suppose that (X, d X ) is a metric space that admits a localized weakly bi-Lipschitz embedding with distortion D into a metric space (Y, d Y ). Then for every x 1 , . . . , x n ∈ X there is a nonconstant mapping f : {x 1 , . . . , x n } → Z, where Z ∈ {Y, R}, such that the desired average distortion estimate holds. Observe that if (Y, d Y ) contains an isometric copy of an interval [a, b] ⊆ R (in particular if Y is a Banach space) then the conclusion of Proposition 7.5 can be taken with Z = Y.

Lemma 3.5 in [ALN08] asserts that for every δ ∈ (0, 1), every finite metric space (X, d X ) admits a weakly bi-Lipschitz embedding into ℓ 2 with distortion ζ(X; δ)/√δ. Consequently, all the examples that arise from random padded partitions as described in Section 7.1.1 fall into the framework of Proposition 7.5, the only difference being that an application of Proposition 7.5 rather than Proposition 7.1 yields an embedding into Hilbert space rather than into the real line. This difference is discussed further in Section 7.3 below. The following lemma shows that Proposition 7.5 has wider applicability than Proposition 7.1: in combination with Proposition 7.5 it yields a different proof of the case p ∈ (2, ∞) of (22) that avoids the use of Theorem 1.3.

$\|f^z_\Delta\|_{\mathrm{Lip}}\le 2K$. If x, y ∈ z + 32∆B X satisfy $\|x-y\|_X\ge \Delta$ then the localized condition (185) holds. For p ∈ [1, 2), due to Lemma 1.12 and Proposition 7.5, ℓ p does not admit a localized weakly bi-Lipschitz embedding into Hilbert space (this can also be proved directly via a shorter argument). Since finite subsets of ℓ p embed isometrically into ℓ 1 (see e.g. [DL97]), it follows from [ARV09] that every n-point subset of ℓ p admits a weakly bi-Lipschitz embedding into ℓ 2 with distortion O(√log n). By [Lee05], it is also true that every n-point subset of ℓ p admits a weakly bi-Lipschitz embedding into ℓ 2 with distortion $O\big((\log n)^{(2-p)/p^2}\big)$, which is better than the O(√log n) bound of [ARV09] if $\sqrt{5}-1 < p \le 2$. Therefore for every n ∈ N and every n by n symmetric stochastic matrix A, the corresponding nonlinear spectral gap bounds follow. These are the currently best known bounds towards Question 1.8.

7.3. Dimension reduction. As discussed in Section 7.2, Proposition 7.1 yields an average distortion embedding into the real line, while Proposition 7.5, when applied in the context of spaces with random zero sets, yields an average distortion embedding into Hilbert space.
Here we briefly compare these two notions. The following lemma is a simple application of the classical Johnson-Lindenstrauss dimension reduction lemma [JL84].
Lemma 7.7. If (X, d X ) is an n-point metric space then $$\mathrm{Av}^{(2)}_{\mathbb{R}}(X)\lesssim \sqrt{\log n}\cdot \mathrm{Av}^{(2)}_{\ell_2}(X)\le \sqrt{\log n}\cdot \mathrm{Av}^{(2)}_{\mathbb{R}}(X). \qquad (186)$$

Proof. The rightmost inequality in (186) is trivial. Write $D = \mathrm{Av}^{(2)}_{\ell_2}(X)$ and take x 1 , . . . , x m ∈ X. By the Johnson–Lindenstrauss lemma [JL84] there exist k ∈ N with $k\lesssim \log n$ and a 1-Lipschitz mapping $f=(f_1,\dots,f_k):\{x_1,\dots,x_m\}\to \ell_2^k$ attaining the average distortion bound D. Therefore there exists s ∈ {1, . . . , k} such that the coordinate f s accounts for at least a 1/k fraction of the average squared distances. Since f s : {x 1 , . . . , x m } → R is also 1-Lipschitz, we conclude that $\mathrm{Av}^{(2)}_{\mathbb{R}}(X)\lesssim \sqrt{\log n}\cdot D$. The following lemma shows that Lemma 7.7 is almost asymptotically sharp.
Choose an arbitrary point x i ∈ C i and set X n = {x 1 , . . . , x n } ⊆ S m−1 ⊆ $\ell_2^m$. Since X n is isometric to a subset of Hilbert space, $\mathrm{Av}^{(2)}_{\ell_2}(X_n) = 1$. Suppose that f : {x 1 , . . . , x n } → R is a 1-Lipschitz function. By the nonlinear Hahn–Banach theorem (see [BL00]) we can think of f as the restriction to X n of a 1-Lipschitz function defined on all of S m−1 . The Poincaré inequality on the sphere S m−1 (see e.g. [Cha84, Led01]) asserts that the variance of f is controlled by its gradient. For every i, j ∈ {1, . . . , n} and every (x, y) ∈ C i × C j we obtain the corresponding pointwise estimates, and hence, by choosing ε = 1/(2√m), we have shown that every 1-Lipschitz function f : X n → R satisfies the desired upper bound. Recalling that $n \asymp (\kappa/\varepsilon)^m = (2\kappa\sqrt{m})^m$, i.e., $m \asymp \log n/\log\log n$, the proof of (187) is complete.
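The coordinate-selection step in the proof of Lemma 7.7 rests on the identity $\mathbb{E}\langle g,u\rangle^2=\|u\|_2^2$ for a standard Gaussian vector $g$, which is why a random one-dimensional projection preserves the average squared distance in expectation. A minimal numerical sketch; the point set, seed, and trial count are arbitrary illustrative choices, not taken from the text:

```python
import random

random.seed(0)

def gauss_vec(d):
    # A standard Gaussian vector in R^d.
    return [random.gauss(0.0, 1.0) for _ in range(d)]

# An arbitrary point set in R^3 (hypothetical data).
pts = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (1, 1, 1)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

avg_sq = sum(sq_dist(u, v) for u in pts for v in pts) / len(pts) ** 2

# Sample, over many Gaussian directions g, the average squared
# increments of the one-dimensional map x -> <g, x>; in expectation
# this recovers avg_sq exactly, since E<g,u>^2 = |u|_2^2.
trials = 10000
acc = 0.0
for _ in range(trials):
    g = gauss_vec(3)
    proj = [sum(gi * xi for gi, xi in zip(g, p)) for p in pts]
    acc += sum((a - b) ** 2 for a in proj for b in proj) / len(pts) ** 2
est = acc / trials

assert abs(est - avg_sq) / avg_sq < 0.1
```

With the fixed seed the empirical average lands within a few percent of the true value; the 10% tolerance merely leaves statistical slack.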
Remark 7.9. Given an n-point metric space (X, d X ) let S(X) denote the maximum of $\frac{1}{2n^2}\sum_{(x,y)\in X\times X}|f(x)-f(y)|^2$ over all 1-Lipschitz functions f : X → R. The quantity S(X) was introduced by Alon, Boppana and Spencer in [ABS98], where they called it the spread constant of X. They proved that the spread constant of X governs the asymptotic isoperimetric behavior of $\ell_1^n(X)$ as n → ∞. They also state that "The spread constant appears to be new and may well be of independent interest." We agree with this assertion. In particular, it would be worthwhile to investigate the computational complexity of the problem that takes as input an n-point metric space (X, d X ) and outputs in polynomial time a number that is guaranteed to be a good approximation of its spread constant. We are not aware of hardness of approximation results for this question. Let $S_{\ell_2}(X)$ denote the maximum of $\frac{1}{2n^2}\sum_{(x,y)\in X\times X}\|f(x)-f(y)\|_2^2$ over all 1-Lipschitz functions f : X → ℓ 2 . The quantity $S_{\ell_2}(X)$ can be computed in polynomial time with arbitrarily good precision, since (by definition) it can be cast as a semidefinite program (see [GLS93]). The proof of Lemma 7.7 can be viewed as a simple approximation algorithm for the spread constant, achieving an approximation guarantee of O(log n). Lemma 7.8 can be viewed as yielding an almost matching integrality gap lower bound for this semidefinite program. Note that the parameter $S_{\ell_2}(X)$ itself has also been studied in the literature in the context of the problem of finding the fastest mixing Markov process on a given graph; see [SBXD06]. See also the works [Fie89, GHW08, GHR12] that study this quantity in the context of the absolute algebraic connectivity of a graph. Clearly $(\mathrm{Av}^{(2)}_{\mathbb{R}}(X))^2$ is closely related to S(X): it amounts to finding the (multi)subset of X with largest spread constant. The same can be said for the relation between $(\mathrm{Av}^{(2)}_{\ell_2}(X))^2$ and $S_{\ell_2}(X)$.

7.4. Proofs of Proposition 7.1 and Proposition 7.5.
We start by recording the following very simple lemma, whose proof is a straightforward application of the triangle inequality.
Proof of Proposition 7.5. Fix n ∈ N and choose x 1 , . . . , x n ∈ X. Define r ∈ (0, ∞) as in (193) and k ∈ {1, . . . , n} as in (195). Let M be defined as in (197) and suppose that $|M| > n^2/2^{7p}$. An application of (185) with ∆ = r/8 and z = x k shows that the desired estimate holds. This yields the desired average distortion embedding into Y. If, on the other hand, $|M| \le n^2/2^{7p}$ then the existence of the desired embedding into R follows by choosing f(x) = max{0, d X (x, x k ) − 2r} and applying Lemma 7.12.
It is natural to ask how the quantities $\mathrm{Av}^{(p)}_Y(X)$ and $\mathrm{Av}^{(q)}_Y(X)$ are related to each other for distinct p, q ∈ [1, ∞) and two metric spaces (X, d X ) and (Y, d Y ). We shall now briefly address this matter.
Suppose that $D > \mathrm{Av}^{(q)}_Y(X)$ and fix x 1 , . . . , x n ∈ X. Then there exists a nonconstant mapping f : {x 1 , . . . , x n } → Y attaining the average distortion bound D. Suppose first that q < p, and continue using the notation of Lemma 7.10 and Lemma 7.11. If $|M| > n^2/2^{7p}$ then one obtains the estimates (212) and (213), in which the quantity $\frac{1}{2^{7p}}\cdot\frac{r^q}{8^q}$ appears and where c ∈ (0, ∞) is a universal constant. By substituting (212) and (213) into (211) we therefore obtain (214). By extending φ to a mapping Φ : X → Y with $\|\Phi\|_{\mathrm{Lip}}\lesssim e(X, Y)\,\|\varphi\|_{\mathrm{Lip}}$ we see that (215) holds, where C ∈ (0, ∞) is a universal constant. By using Lemma 7.12 if $|M| \le n^2/2^{7p}$, we deduce from (215) the implication (217) comparing $\mathrm{Av}^{(p)}_Y(X)$ with $\mathrm{Av}^{(q)}_Y(X)$ when p < q. It seems unlikely that (217) is sharp.

Appendix: a refinement of Markov type
Below is an application of Theorem 1.5 that I found in collaboration with Yuval Peres. I thank him for agreeing to include it here.
Fix n ∈ N and an n by n symmetric stochastic matrix A = (a ij ). Then for every m ∈ N and x 1 , . . . , x n ∈ R we have $$\frac{1}{n}\sum_{i=1}^n\sum_{j=1}^n (A^m)_{ij}(x_i-x_j)^2\le \Big(1+\lambda_2(A)+\cdots+\lambda_2(A)^{m-1}\Big)\frac{1}{n}\sum_{i=1}^n\sum_{j=1}^n a_{ij}(x_i-x_j)^2. \qquad (218)$$ The inequality (218) becomes evident when one expresses the vector (x 1 , . . . , x n ) ∈ R n in an orthonormal eigenbasis of A, showing also that the multiplicative factor 1 + λ 2 (A) + · · · + λ 2 (A)^{m−1} is sharp. By using the estimate |λ 2 (A)| ≤ 1, it follows from (218) that Hilbert space has Markov type 2 with M 2 (ℓ 2 ) = 1; this was Ball's original proof of this fact in [Bal92]. Suppose that p ∈ (2, ∞). In [NPSS06] it was shown that ℓ p has Markov type 2, and in fact that $M_2(\ell_p)\lesssim \sqrt{p}$. This is the same as asserting the following inequality, which holds true for every n by n symmetric stochastic matrix A = (a ij ) and every x 1 , . . . , x n ∈ ℓ p .
Since A is symmetric and stochastic, (225) is the same as (223).
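The factor-m consequence of (218) — the Markov type 2 inequality for R with constant 1, obtained from $|\lambda_2(A)|\le 1$ — can be sanity-checked numerically. A small sketch; the 3×3 matrix A and the vector x are arbitrary test data, not taken from the text:

```python
# Check (1/n) sum_{i,j} (A^m)_{ij} (x_i - x_j)^2
#        <= m * (1/n) sum_{i,j} a_{ij} (x_i - x_j)^2   for m = 1..9.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def energy(B, x):
    # (1/n) * sum_{i,j} B_{ij} (x_i - x_j)^2.
    n = len(x)
    return sum(B[i][j] * (x[i] - x[j]) ** 2
               for i in range(n) for j in range(n)) / n

A = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]  # symmetric and stochastic
x = [1.0, -2.0, 3.0]

Am = A
for m in range(1, 10):
    if m > 1:
        Am = mat_mul(Am, A)  # Am = A^m
    assert energy(Am, x) <= m * energy(A, x) + 1e-12
```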
Corollary 7.14. Fix p ∈ [1, ∞) and m, n ∈ N. Suppose that A = (a ij ) is an n by n symmetric stochastic matrix. Then for every metric space (X, d X ) and every x 1 , . . . , x n ∈ X we have the estimate (226).

Proof. We may assume without loss of generality that $\gamma(A, d_X^p) < \infty$, i.e., that A is ergodic. In this case we have $\lim_{t\to\infty} (A^t)_{ij} = 1/n$ for every i, j ∈ {1, . . . , n}. Therefore (226) follows from Lemma 7.13 (with t → ∞).

The following corollary is an immediate consequence of Corollary 7.14 and the definition of the Markov type p constant $M_p(X; m)$.
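The ergodicity fact invoked in the proof — for an ergodic symmetric stochastic A, $(A^t)_{ij}\to 1/n$ as $t\to\infty$ — is easy to watch happen on a concrete matrix. A toy illustration; the 3×3 matrix below is an arbitrary ergodic example:

```python
# Powers of an ergodic symmetric stochastic matrix converge to the
# matrix with all entries 1/n.
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[0.0, 0.5, 0.5],
     [0.5, 0.25, 0.25],
     [0.5, 0.25, 0.25]]  # symmetric, stochastic, aperiodic, irreducible

P = A
for _ in range(49):
    P = mat_mul(P, A)  # P = A^50

n = len(A)
assert all(abs(P[i][j] - 1.0 / n) < 1e-9 for i in range(n) for j in range(n))
```

The nontrivial eigenvalues of this A are 0 and −1/2, so the entries of A^t approach 1/3 at the geometric rate 2^{−t}.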
To state two additional examples of consequences of this type, fix K ∈ [2, ∞) and let (X, d X ) be a metric space that is doubling with constant K. In [DLP13] it is shown that $M_2(X) \lesssim \log K$, so in combination with Corollary 7.2 we deduce from Corollary 7.15 the estimate (227). Similarly, it was shown that if G is a connected planar graph then $M_2(G, d_G) \lesssim 1$, so in combination with Corollary 7.2 we deduce from Corollary 7.15 the estimate (228). Despite the validity of satisfactory spectral estimates such as (220), (221), (222), (227) and (228), they do not follow automatically from the fact that the metric space in question has Markov type 2 alone. Specifically, there exists a metric space (X, d X ) that has Markov type 2, yet it is not true that $\gamma(A, d_X^2) \lesssim_X 1/(1-\lambda_2(A))$ for every symmetric stochastic matrix A. To see this, by Theorem 1.3 it suffices to prove the following result.
Proof. If Λ ⊆ R n is a lattice of rank n then denote the length of the shortest nonzero vector in Λ by N(Λ). Also, let r(Λ) denote the infimum over those r ∈ (0, ∞) such that Euclidean balls of radius r centered at the points of Λ cover R n . The dual lattice of Λ is denoted Λ * ; thus Λ * is the set of all x ∈ R n such that $\sum_{i=1}^n x_i y_i$ is an integer for every y ∈ Λ.
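The quantities just defined can be made concrete on the integer lattice. A toy sketch for Z² (brute force over small coefficient vectors; this only makes sense for tiny examples and is not how shortest vectors are computed in general):

```python
import math
from itertools import product

def shortest_vector(basis, rng=3):
    # N(Lambda): length of the shortest nonzero lattice vector, found
    # here by enumerating small integer coefficient vectors.
    best = math.inf
    dim = len(basis[0])
    for c in product(range(-rng, rng + 1), repeat=len(basis)):
        if all(ci == 0 for ci in c):
            continue
        v = [sum(ci * bi[k] for ci, bi in zip(c, basis)) for k in range(dim)]
        best = min(best, math.sqrt(sum(vk * vk for vk in v)))
    return best

Z2 = [[1.0, 0.0], [0.0, 1.0]]
N = shortest_vector(Z2)
assert abs(N - 1.0) < 1e-12

# For Z^2 the deepest hole is the center of a unit square, so the
# covering radius is r(Z^2) = sqrt(2)/2; the dual lattice (Z^2)* is
# Z^2 itself.
r = math.sqrt(0.5 ** 2 + 0.5 ** 2)
assert abs(r - math.sqrt(2) / 2) < 1e-12
```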
For every n ∈ N choose an arbitrary rank n lattice Λ n ⊆ R n that satisfies $r(\Lambda_n) \lesssim N(\Lambda_n)$. See [Rog50] for the existence of such lattices. Let R n /Λ * n be the corresponding flat torus, equipped with the natural Riemannian quotient metric $d_{\mathbb{R}^n/\Lambda_n^*}(\cdot,\cdot)$. Also, let µ n denote the normalized Riemannian volume measure on R n /Λ * n . Consider the ℓ 2 product $X = \big(\prod_{n=1}^\infty \mathbb{R}^n/\Lambda_n^*\big)_{\ell_2}$. Thus X consists of all the sequences $x = (x_n)_{n=1}^\infty \in \prod_{n=1}^\infty (\mathbb{R}^n/\Lambda_n^*)$ such that $\sum_{n=1}^\infty d_{\mathbb{R}^n/\Lambda_n^*}(x_n, 0)^2 < \infty$, equipped with the metric given by $d_X(x, y)^2 = \sum_{n=1}^\infty d_{\mathbb{R}^n/\Lambda_n^*}(x_n, y_n)^2$ for every x, y ∈ X. Since R n /Λ * n has vanishing sectional curvature, it is an Aleksandrov space of nonnegative curvature (see [Oht09b, Sec. 3]), and therefore by a theorem of Ohta [Oht09b] it has Markov type 2 constant at most $1+\sqrt{2}$. Being an ℓ 2 product of spaces with uniformly bounded Markov type 2 constant, X also has Markov type 2.
Fix ε ∈ (0, 1). By covering the fundamental parallelepiped of Λ * n by homothetic copies of itself, we see that there exists a finite measurable partition {U 1 , . . . , U k } of the torus R n /Λ * n into sets of diameter at most ε such that $\mu_n(U_i) = 1/k$ for every i ∈ {1, . . . , k}.
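The n = 1 instance of this partitioning step is already instructive: the circle R/Z splits into k = ⌈1/ε⌉ half-open arcs of equal measure 1/k and diameter at most ε in the quotient metric. A minimal sketch (the value of ε is an arbitrary choice):

```python
import math

# Partition R/Z into k = ceil(1/eps) half-open arcs [i/k, (i+1)/k).
# Each arc has measure exactly 1/k, and its diameter in the quotient
# metric is at most its length 1/k <= eps.
eps = 0.3
k = math.ceil(1 / eps)
cells = [(i / k, (i + 1) / k) for i in range(k)]

assert abs(sum(b - a for a, b in cells) - 1.0) < 1e-12   # total measure 1
assert all(abs((b - a) - 1.0 / k) < 1e-12 for a, b in cells)  # equal measure
assert 1.0 / k <= eps   # each arc has diameter at most eps
```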