Non-triviality of the phase transition for percolation on finite transitive graphs

We prove that if $(G_n)_{n\geq1}=((V_n,E_n))_{n\geq 1}$ is a sequence of finite, vertex-transitive graphs with bounded degrees and $|V_n|\to\infty$ that is at least $(1+\epsilon)$-dimensional for some $\epsilon>0$ in the sense that \[\mathrm{diam} (G_n)=O\left(|V_n|^{1/(1+\epsilon)}\right) \text{ as $n\to\infty$}\] then this sequence of graphs has a non-trivial phase transition for Bernoulli bond percolation. More precisely, we prove under these conditions that for each $0<\alpha<1$ there exists $p_c(\alpha)<1$ such that for each $p\geq p_c(\alpha)$, Bernoulli-$p$ bond percolation on $G_n$ has a cluster of size at least $\alpha |V_n|$ with probability tending to $1$ as $n\to \infty$. In fact, we prove more generally that there exists a universal constant $a$ such that the same conclusion holds whenever \[\mathrm{diam} (G_n)=O\left(\frac{|V_n|}{(\log |V_n|)^a}\right) \text{ as $n\to\infty$.}\] This verifies a conjecture of Benjamini up to the value of the constant $a$, which he suggested should be $1$. We also prove a generalization of this result to quasitransitive graph sequences with a bounded number of vertex orbits and prove that one may indeed take $a=1$ when the graphs $G_n$ are all Cayley graphs of Abelian groups. A key step in our proof is to adapt the methods of Duminil-Copin, Goswami, Raoufi, Severo, and Yadin from infinite graphs to finite graphs. This adaptation also leads to an isoperimetric criterion for infinite graphs to have a nontrivial uniqueness phase (i.e., to have $p_u<1$) which is of independent interest. We also prove that the set of possible values of the critical probability of an infinite quasitransitive graph has a gap at $1$ in the sense that for every $k,n<\infty$ there exists $\epsilon>0$ such that every infinite graph $G$ of degree at most $k$ whose vertex set has at most $n$ orbits under Aut$(G)$ either has $p_c=1$ or $p_c\leq 1-\epsilon$.


Introduction
In Bernoulli bond percolation, the edges of a connected, locally finite graph $G = (V, E)$ are chosen to be either deleted (closed) or retained (open) independently at random with retention probability $p \in [0, 1]$. We write $\mathbf{P}_p = \mathbf{P}_p^G$ for the law of the resulting random subgraph and refer to the connected components of this subgraph as clusters. When $G$ is infinite, the critical probability $p_c = p_c(G)$ is defined to be \[p_c = \sup\bigl\{p \in [0, 1] : \text{every cluster is finite $\mathbf{P}_p^G$-almost surely}\bigr\}.\]
It is a fact of fundamental importance that percolation undergoes a non-trivial phase transition in the sense that $0 < p_c < 1$ on most infinite graphs. Indeed, in the traditional setting of Euclidean lattices, it is a classical consequence of the Peierls argument that $p_c(\mathbb{Z}^d) < 1$ for every $d \geq 2$ [23, Theorem 1.10], while the complementary bound $p_c \geq 1/(M - 1) > 0$ holds for every graph of maximum degree $M$ by elementary path-counting arguments [25, Chapter 3]. Besides the obvious importance of such results to the study of percolation itself, non-triviality of the percolation phase transition also implies the non-triviality of the phase transition for many other important models in probability and mathematical physics, including the random cluster, Ising, and Potts models; see [24, Section 3.4] and Remark 1.11 below.
Since the pioneering work of Benjamini and Schramm [11], there has been substantial interest in understanding the behaviour of percolation beyond the traditional setting of Euclidean lattices. A natural level of generality is that of (vertex-)transitive graphs, i.e., graphs for which any vertex can be mapped to any other vertex by a graph automorphism. More generally, one can also consider quasitransitive graphs, i.e., graphs G = (V, E) for which the action of the automorphism group Aut (G) on V has finitely many orbits. We refer the reader to [23] for background on percolation in the Euclidean context and [47] for percolation on general transitive graphs.
Benjamini and Schramm conjectured in the same work that $p_c < 1$ for every infinite, connected, quasitransitive graph that has superlinear volume growth (or, equivalently, is not rough-isometric to $\mathbb{Z}$). The remaining cases of this conjecture were finally resolved in the recent breakthrough work of Duminil-Copin, Goswami, Raoufi, Severo, and Yadin [18]. Several important cases of the conjecture have been known for much longer, including the cases that the graph in question has polynomial volume growth (see the discussion in [18, Section 1]), exponential volume growth [45], or is a Cayley graph of a finitely presented group [5,68]. Indeed, the precise result proven in [18] is that $p_c < 1$ for every infinite, connected, bounded degree graph satisfying a $d$-dimensional isoperimetric inequality for some $d > 4$ (we recall what this means later in the introduction). The general conjecture follows since every infinite transitive graph that does not satisfy this condition must have polynomial volume growth by a theorem of Coulhon and Saloff-Coste [16], and hence is covered by previous results. Further works concerning this problem include [10,15,52,56,63]; see the introduction of [18] for a detailed guide to the relevant literature.
The purpose of this paper is to develop an analogous theory for finite transitive and quasitransitive graphs. For such graphs there are multiple, potentially inequivalent ways to define the supercritical phase (see Remark 1.13); we work with the most stringent such definition, which requires the existence of a giant cluster whose volume is proportional to that of the entire graph. Given a graph $G = (V, E)$ and parameters $\alpha, q \in (0, 1)$, define the critical probability $p_c(G, \alpha, q)$ via \[p_c(G, \alpha, q) = \inf\bigl\{p \in [0, 1] : \mathbf{P}_p\bigl(\text{there exists an open cluster $K$ such that $|K| \geq \alpha|G|$}\bigr) \geq q\bigr\}.\]
Of course, we trivially have that $p_c(G, \alpha, q) < 1$ whenever $G$ is a finite connected graph and $\alpha, q \in (0, 1)$, so the relevant problem is instead to find conditions on sequences of graphs $(G_n)_{n \geq 1}$ guaranteeing that $p_c(G_n, \alpha, q)$ is bounded away from $1$ as $n \to \infty$.
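The quantity inside the definition of $p_c(G, \alpha, q)$ can be estimated numerically. The following is a minimal Monte Carlo sketch, not part of the paper's argument: it samples Bernoulli-$p$ bond percolation on the torus $(\mathbb{Z}/n\mathbb{Z}) \times (\mathbb{Z}/m\mathbb{Z})$ and estimates the probability that some cluster has at least $\alpha|V|$ vertices. The function names, the union-find implementation, and the default trial count are illustrative assumptions.

```python
import random


def torus_edges(n, m):
    """Edges of the torus (Z/nZ) x (Z/mZ) with standard generators (assumes n, m >= 3)."""
    def idx(x, y):
        return (x % n) * m + (y % m)

    edges = []
    for x in range(n):
        for y in range(m):
            edges.append((idx(x, y), idx(x + 1, y)))
            edges.append((idx(x, y), idx(x, y + 1)))
    return edges


def largest_cluster_fraction(num_vertices, edges, p, rng):
    """Sample Bernoulli-p bond percolation; return |largest cluster| / |V|."""
    parent = list(range(num_vertices))
    size = [1] * num_vertices

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        if rng.random() < p:  # the edge is open
            ru, rv = find(u), find(v)
            if ru != rv:
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru
                size[ru] += size[rv]
    return max(size[find(v)] for v in range(num_vertices)) / num_vertices


def giant_probability(n, m, p, alpha, trials=200, seed=0):
    """Estimate P_p(there is a cluster K with |K| >= alpha * |V|) on the n x m torus."""
    rng = random.Random(seed)
    edges = torus_edges(n, m)
    hits = sum(
        largest_cluster_fraction(n * m, edges, p, rng) >= alpha
        for _ in range(trials)
    )
    return hits / trials
```

Scanning `giant_probability(n, m, p, alpha)` over a grid of values of `p` and taking the smallest `p` at which the estimate exceeds `q` gives a rough empirical proxy for $p_c(G, \alpha, q)$ on small tori.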
By analogy with the infinite case, we would ideally like to say that this holds whenever the graphs $G_n$ are "not one-dimensional" in some sense. However, a precise interpretation of what this should mean is much more delicate to determine in the finite case. In the infinite case, well-known results of Trofimov [71], Gromov [28] and Bass and Guivarc'h [6,29] imply that for every transitive graph $G$ of at most polynomial growth there exists an integer $d$ such that for each $n \in \mathbb{N}$ the ball of radius $n$ in $G$ has cardinality bounded above and below by constants times $n^d$. In particular, any such graph with superlinear growth has at least quadratic growth. Sequences of finite transitive graphs, on the other hand, may have volume growth that is only very barely superlinear. Indeed, if one considers a highly asymmetric torus $(\mathbb{Z}/n\mathbb{Z}) \times (\mathbb{Z}/m\mathbb{Z})$ with $m = m(n) \ll n$, one can show that the percolation phase transition is non-trivial if and only if $m = \Omega(\log n)$ (see Lemma 3.3 and Remark 3.4). This example led Benjamini [7, Conjecture 2.1] to make the following conjecture.
Conjecture 1.1 (Benjamini 2001). For every $k \geq 1$, $\lambda > 0$ and $\alpha, q \in (0, 1)$ there exists $\varepsilon = \varepsilon(\lambda, \alpha, q) > 0$ such that if $G = (V, E)$ is a finite, vertex-transitive graph of degree at most $k$ satisfying \[\mathrm{diam}(G) \leq \frac{\lambda |V|}{\log |V|}\] then $p_c(G, \alpha, q) \leq 1 - \varepsilon$.
This conjecture has been verified for expander graphs (which automatically have diameter at most logarithmic in their volume) by Alon, Benjamini, and Stacey [4]; see also [9,41,53,58] for more refined results in this case. Malon and Pak [49] verified the conjecture for Cayley graphs of Abelian groups generated by Hall bases (a.k.a. hypercubic tori) and expressed a belief that the conjecture should be false in general. Note that the case of the symmetric hypercubic torus $(\mathbb{Z}/n\mathbb{Z})^d$ is classical; indeed it follows from the work of Grimmett and Marstrand [26] that such a torus has a giant cluster with high probability for every $p > p_c(\mathbb{Z}^d)$.
The main result of the present paper verifies Conjecture 1.1 for all but the very weakest instances of the hypotheses, applying in particular to graphs $G = (V, E)$ that are $(1+\varepsilon)$-dimensional in the sense that $\mathrm{diam}(G) \leq \lambda|V|^{1/(1+\varepsilon)}$ for some $\varepsilon > 0$ and $\lambda < \infty$. In fact, all of our results also apply more generally in the quasitransitive case. Of course, since every finite graph is trivially quasitransitive, we must define our quasitransitivity assumption quantitatively if it is to have any impact. Given $n \in \mathbb{N}$, we therefore define a graph $G = (V, E)$ to be $n$-quasitransitive if the action of $\mathrm{Aut}(G)$ on $V$ has at most $n$ distinct orbits, so that transitive graphs are $1$-quasitransitive.
Theorem 1.2. There exists an absolute constant $a \geq 1$ such that for every $k, n \geq 1$, $\lambda > 0$ and $\alpha, q \in (0, 1)$ there exists $\varepsilon = \varepsilon(k, n, \lambda, \alpha, q) > 0$ such that if $G = (V, E)$ is a finite, $n$-quasitransitive graph of degree at most $k$ satisfying \[\mathrm{diam}(G) \leq \frac{\lambda |V|}{(\log |V|)^a}\] then $p_c(G, \alpha, q) \leq 1 - \varepsilon$.
A crucial ingredient in our argument is a direct proof of Theorem 1.2 for an arbitrary Abelian Cayley graph. In fact, in this case we show that the exponent $a$ can be taken to be $1$, resolving Conjecture 1.1 in full for such graphs and significantly generalising the results of Malon and Pak [49].
Theorem 1.3. Let $k \geq 1$, $\lambda > 0$, and $\alpha, q \in (0, 1)$. Then there exists $\varepsilon = \varepsilon(k, \lambda, \alpha, q) > 0$ such that if $G = (V, E)$ is a Cayley graph of a finite Abelian group with degree at most $k$ satisfying \[\mathrm{diam}(G) \leq \frac{\lambda |V|}{\log |V|}\] then $p_c(G, \alpha, q) \leq 1 - \varepsilon$.

About the proof
Our work relies crucially on the quantitative structure theory of finite transitive graphs as developed in a series of works by Tessera and the second author [64, 65, 66, 67], building on Breuillard, Green and Tao's celebrated structure theorem for finite approximate groups [12]. Roughly speaking, this theory states that for each integer $d \geq 1$ and each locally finite, vertex-transitive graph $G$, there exists a scale $m$ such that $G$ looks at least $(d + 1)$-dimensional on scales smaller than $m$ and looks like a nilpotent group of dimension at most $d$ on scales larger than $m$; see Section 5 for detailed statements. Again, we stress that it is indeed possible for a vertex-transitive graph to look higher-dimensional on small scales than it does on large scales: Consider for example the torus $(\mathbb{Z}/n\mathbb{Z})^{d_1} \times (\mathbb{Z}/m\mathbb{Z})^{d_2 - d_1}$ with $d_2 > d_1$ and $m \ll n$, which looks $d_2$-dimensional on scales up to $m$ and $d_1$-dimensional on scales $k$ satisfying $m \ll k \leq n$. Given these structure-theoretic results, the proof of Theorem 1.2 has three main components: an analysis of percolation on Cayley graphs of finite nilpotent groups, an analysis of percolation under a high-dimensional isoperimetric condition using the techniques of [18], and finally an argument showing that we can patch together the outputs of these two analyses at the relevant crossover scale if necessary.
Let us now outline this proof in a little more detail.
• In Section 2.2 we reduce Theorem 1.2 to the transitive case by noting that every $n$-quasitransitive graph is rough-isometric to a transitive graph of comparable volume and diameter.
• In Section 3, we prove Theorem 1.3 by using techniques from additive combinatorics to reduce from arbitrary Abelian Cayley graphs to boxes in $\mathbb{Z}^d$ with the standard generating set, which can be handled by the methods of Malon and Pak [49]. Then, in Section 3.3, we prove a version of Theorem 1.2 for Cayley graphs of nilpotent groups (Theorem 3.19), but where the exponent $a$ is taken to be the step of the group (i.e., the length of the lower central series). This is done by induction on the step, with the base case being handled by Theorem 1.3, and is the first place in which we lose additional powers of log in our analysis.
• In Section 4, which can be read independently of the rest of the paper, we adapt the methods of [18] to analyze percolation on finite graphs satisfying a 12-dimensional isoperimetric condition. As described in more detail just after the statement of Theorem 1.5 below, the proof of that paper adapts straightforwardly to show under a (4 + ε)-dimensional isoperimetric assumption that there is a non-trivial phase in which there exist large clusters (i.e. clusters of size going to infinity with the volume of the graph), but an additional argument is needed to deduce the existence of a giant cluster (i.e. a cluster of volume proportional to the volume of the graph).
• In Section 5 we review the structure theory of vertex-transitive graphs as developed in [65,66]. We also prove an important supporting technical proposition, stating roughly that at the scale where the graph crosses over from being at least $(d + 1)$-dimensional to at most $d$-dimensional, we can find a set that well approximates the ball and that induces a subgraph satisfying a $(d + 1)$-dimensional isoperimetric inequality.
• Finally, in Section 6 we put all these ingredients together to deduce Theorem 1.2. Note that a non-trivial argument is still required to complete this stage of the proof, which is the second and last place that additional powers of log are lost in our analysis.

Further results
We now state our other main results.
Isoperimetric criteria for percolation. We now discuss our results concerning percolation on finite graphs under isoperimetric conditions, which build on the work of [18] and play an important role in the proof of Theorem 1.2 as described above. Let $d \geq 1$ and $c > 0$. A locally finite graph $G = (V, E)$ is said to satisfy a $d$-dimensional isoperimetric inequality with constant $c$, abbreviated $(\mathrm{ID}_{d,c})$, if \[|\partial_E K| \geq c \min\{|K|, |V \setminus K|\}^{(d-1)/d}\] for every finite set of vertices $K \subseteq V$. Here, $\partial_E K$ denotes the edge boundary of $K$, i.e., the set of edges with one endpoint in $K$ and the other in $V \setminus K$. It is a classical result of Coulhon and Saloff-Coste [16] that transitive graphs of at least $d$-dimensional volume growth always satisfy $d$-dimensional isoperimetric inequalities, and strong quantitative versions of this result for finite transitive graphs have recently been proven by Tessera and the second author in [66]. The aforementioned theorem of Duminil-Copin, Goswami, Raoufi, Severo, and Yadin [18], which we review in detail in Section 4, may be phrased quantitatively as follows. (While they did not phrase their results in this way, one may easily verify that all the constants appearing in their proof can be taken to depend only on the parameters $d$, $c$, and $k$; see Remark 4.2 below.)
for every $p \geq 1 - \varepsilon$ and every finite non-empty set $A \subseteq V$.
Our second main theorem extends this result to finite graphs under a stronger assumption on the dimension. Note that neither Theorem 1.4 nor Theorem 1.5 requires transitivity. Given two sets of vertices $A$ and $B$, we write $\{A \leftrightarrow B\}$ for the event that there is an open path connecting $A$ to $B$. The dimensional threshold $6 + 2\sqrt{7}$ appearing here satisfies $6 + 2\sqrt{7} \approx 11.29 < 12$.
Theorem 1.5. Let $G = (V, E)$ be a finite, connected graph with degrees bounded by $k$ that satisfies a $d$-dimensional isoperimetric inequality $(\mathrm{ID}_{d,c})$ for some $d > 6 + 2\sqrt{7}$ and $c > 0$. There exists a positive constant $\eta = \eta(d, c, k)$ such that for every $\varepsilon > 0$ there exists 0 for every $p \geq p_0$ and every two non-empty sets $A, B \subseteq V$.
As we shall see in Section 4, the proof of [18] extends straightforwardly to show that if $G$ satisfies a $d$-dimensional isoperimetric assumption for some $d > 4$ then there exists $\varepsilon = \varepsilon(k, d) > 0$ such that if $p \geq 1 - \varepsilon$ then each vertex $v$ has a good probability to be connected to any set $A \subseteq V$ satisfying $|A| \geq |V|^{1-\delta}$, where $\delta = \delta(d) = (d - 4)/4(d - 1)$ is an explicit positive constant tending to $0$ as $d \downarrow 4$. An additional argument is required to deduce that a giant component exists, and we have been able to implement such an argument only under a stronger assumption on $d$.
Remark 1.6. We have not optimized the value of the dimensional threshold $6 + 2\sqrt{7} \approx 11.29$ appearing here. We have been able to extend the result to some lower values of the dimension, but not all the way down to $4 + \varepsilon$, and do not pursue these improvements here. We expect that Theorems 1.4 and 1.5 should hold for any $d > 1$, but this appears to be beyond the scope of existing methods.
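To make the edge boundary $\partial_E K$ concrete, the following sketch computes it for a set of vertices in a finite graph and evaluates an isoperimetric ratio. The normalisation by $\min\{|K|, |V \setminus K|\}^{(d-1)/d}$ is an assumption made for this illustration (it is the usual convention for finite graphs), and the function names are hypothetical.

```python
def torus_edges(n):
    """Edges of the n x n torus (Z/nZ)^2 with standard generators (assumes n >= 3)."""
    edges = []
    for x in range(n):
        for y in range(n):
            edges.append(((x, y), ((x + 1) % n, y)))
            edges.append(((x, y), (x, (y + 1) % n)))
    return edges


def edge_boundary(edges, K):
    """Edges with exactly one endpoint in K."""
    K = set(K)
    return [(u, v) for (u, v) in edges if (u in K) != (v in K)]


def isoperimetric_ratio(edges, K, num_vertices, d):
    """|boundary of K| / min(|K|, |V \\ K|)^((d-1)/d); this normalisation is an assumption."""
    K = set(K)
    small_side = min(len(K), num_vertices - len(K))
    return len(edge_boundary(edges, K)) / small_side ** ((d - 1) / d)


# An s x s box inside the n x n torus has 4s boundary edges when s < n.
n, s = 10, 3
box = [(x, y) for x in range(s) for y in range(s)]
boundary_size = len(edge_boundary(torus_edges(n), box))
ratio = isoperimetric_ratio(torus_edges(n), box, n * n, d=2)
```

For boxes that are small relative to the torus, the ratio with `d=2` stays equal to 4, which is the sense in which the two-dimensional torus behaves two-dimensionally at small scales.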
Critical probability gap for infinite transitive graphs. Although the main focus of this paper is percolation on finite graphs, a number of the techniques apply equally well to infinite graphs. In particular, this allows us to make the results of [18] more quantitative in the following sense. Recall that a transitive graph is said to have superlinear volume growth if $\limsup_{n \to \infty} \frac{1}{n}|B(o, n)| = \infty$, where $B(o, n)$ denotes the graph-distance ball of radius $n$ around a vertex $o$.
Theorem 1.7 (Critical probability gap). Let $k, n \in \mathbb{N}$. Then there exists $\varepsilon = \varepsilon(k, n) > 0$ such that if $G$ is an infinite, connected $n$-quasitransitive graph of degree at most $k$ with superlinear volume growth then $p_c(G) \leq 1 - \varepsilon$. In particular, every infinite, connected, $n$-quasitransitive graph $G$ of degree at most $k$ has either $p_c(G) \leq 1 - \varepsilon$ or $p_c(G) = 1$.
Since the main results of [18] are already proven quantitatively as discussed above, the main novelty of the proof of Theorem 1.7 comes from the application of quantitative forms of Gromov's theorem as developed in [64][65][66][67] to handle the low-dimensional case in a quantitative way.
A critical probability gap of the form established by Theorem 1.7 was first suggested to hold by Gábor Pete [55, p. 225], who noted that it would follow from Schramm's locality conjecture [9] together with the (then conjectural) results of [18]. Moreover, Theorem 1.7 shows in particular that, in the formulation of the locality conjecture, one may harmlessly replace the assumption that $p_c(G_n) < 1$ for all sufficiently large $n$ with the a priori stronger assumption that $\limsup_{n \to \infty} p_c(G_n) < 1$. See [9,37] for overviews of this conjecture and the progress that has been made on it.
Corollaries for the uniqueness threshold. Recall that if $G = (V, E)$ is an infinite, connected, locally finite graph then the uniqueness threshold $p_u = p_u(G)$ for Bernoulli bond percolation on $G$ is defined by \[p_u = \inf\bigl\{p \in [0, 1] : \text{there is a unique infinite cluster $\mathbf{P}_p$-almost surely}\bigr\}.\] It is a result originally due to Häggström, Peres, and Schonmann [30,31,59] that if $G$ is quasitransitive then there is a unique infinite cluster almost surely for every $p > p_u$. Benjamini and Schramm [11, Question 3] asked whether the strict inequality $p_u < 1$ holds for every transitive graph with one end. Of course, when the graph in question is amenable we have that $p_c = p_u$ by the classical results of Aizenman, Kesten, and Newman [3] and Burton and Keane [14], so that the question has a positive answer in this case by the results of [18]. In the nonamenable setting, the question has been resolved positively for Cayley graphs of finitely presented groups by Babson and Benjamini [5] (see also [68]), for graphs defined as direct products by Peres [54], and for Cayley graphs of Kazhdan groups and wreath products by Lyons and Schramm [48], but remains open in general.
Theorem 1.5 leads to an interesting isoperimetric criterion for an infinite graph to have $p_u < 1$. We define the internal isoperimetric dimension of an infinite, connected, locally finite graph $G = (V, E)$ to be the supremal value of $d$ for which there exists a positive constant $c$ and an exhaustion $V_1 \subseteq V_2 \subseteq \cdots$ of $V$ by finite connected sets such that the subgraph $G_n$ of $G$ induced by $V_n$ satisfies the $d$-dimensional isoperimetric inequality $(\mathrm{ID}_{d,c})$ for every $n \geq 1$. Notions closely related to the internal isoperimetric dimension have been studied systematically in the recent work of Hume, Mackay, and Tessera [36], whose methods implicitly lead to computations of the internal isoperimetric dimension in various examples: For example, one can prove via their methods that $\mathbb{Z}^d$ has internal isoperimetric dimension $d$ for every $d \geq 1$, the $3$-regular tree has internal isoperimetric dimension $1$, and graphs rough-isometric to $d$-dimensional hyperbolic space
We are now ready to state our results on the uniqueness threshold.
Theorem 1.8. Let $k \in \mathbb{N}$, $c > 0$ and $d > 6 + 2\sqrt{7}$, and suppose that $G = (V, E)$ is an infinite, connected graph with degree at most $k$ for which there exists an exhaustion $V_1 \subseteq V_2 \subseteq \cdots$ of $V$ by finite connected sets such that each subgraph $G_n$ of $G$ induced by $V_n$ satisfies the $d$-dimensional isoperimetric inequality $(\mathrm{ID}_{d,c})$. Then there exists $\varepsilon = \varepsilon(d, c, k) > 0$ such that Bernoulli-$p$ bond percolation on $G$ has a unique infinite cluster almost surely for every $p \geq 1 - \varepsilon$.
Corollary 1.9. Every infinite, connected, bounded degree graph with internal isoperimetric dimension strictly greater than $6 + 2\sqrt{7}$ has $p_u < 1$.
As above, the dimensional threshold appearing in Theorem 1.8 has not been optimized, and we conjecture that the same conclusion holds for every infinite, connected, bounded degree graph with internal isoperimetric dimension strictly greater than 1. We remark that our notion of internal isoperimetric dimension is also closely related to the isoperimetric criteria for p u < 1 for graphs of polynomial growth developed in the work of Teixeira [63].
When G is transitive, Theorem 1.8 follows immediately from Theorem 1.5 applied to the sequence of finite graphs G n together with a result of Schonmann [59] stating that if G is transitive then there is almost-sure uniqueness at p ⇐⇒ lim where B(x, n) denotes the graph distance ball of radius n around x. See also [48,61] for stronger versions of this theorem. In order to deduce Theorem 1.8 from Theorem 1.5 without the assumption that G is transitive, we prove the following variation on Schonmann's theorem in Section 4.3.
then Bernoulli-p bond percolation on G has a unique infinite cluster almost surely for each p > p 0 .

Further discussion and remarks
Remark 1.11 (Other models). It follows by standard stochastic domination arguments that each of our main theorems implies analogous results for several other percolation-type models, including site percolation [27], finitely dependent models [44], and the Fortuin-Kasteleyn random cluster model [24]. Using the Edwards-Sokal coupling [21], it follows moreover that the ferromagnetic Ising and Potts models have uniformly non-trivial ordered phases on the classes of graphs treated by these theorems (i.e., that the pair correlations of these models are uniformly bounded away from zero at sufficiently low temperatures).
Remark 1.12 (Sharp thresholds). It is a standard consequence of the abstract sharp-threshold theorem of Kahn, Kalai, and Linial [39] that if $G_n$ is a sequence of vertex-transitive graphs with volume tending to infinity then $|p_c(G_n, \alpha, 1 - \varepsilon) - p_c(G_n, \alpha, \varepsilon)| \to 0$ as $n \to \infty$ for each fixed $0 < \alpha, \varepsilon < 1$. See e.g. [25, Section 4.7] for background on the use of such sharp-threshold theorems in percolation. This allows us to immediately deduce the statement given in the abstract from that of Theorem 1.2: There exists a universal constant $a$ such that if $(G_n)_{n \geq 1} = ((V_n, E_n))_{n \geq 1}$ is a sequence of finite, vertex-transitive graphs with bounded degrees and $|V_n| \to \infty$ such that \[\mathrm{diam}(G_n) = O\left(\frac{|V_n|}{(\log |V_n|)^a}\right)\] as $n \to \infty$ then for each $0 < \alpha < 1$ there exists $p_c(\alpha) < 1$ such that for each $p \geq p_c(\alpha)$, Bernoulli-$p$ bond percolation on $G_n$ has a cluster of size at least $\alpha|V_n|$ with probability tending to $1$ as $n \to \infty$.
Remark 1.13 (Large clusters vs. giant clusters). We now discuss a key issue underlying many of the additional difficulties that arise in finite volume but not in infinite volume. Let $(G_n)_{n \geq 1} = ((V_n, E_n))_{n \geq 1}$ be a sequence of finite, vertex-transitive graphs of bounded degree and with $|V_n| \to \infty$. Our main theorems give criteria under which the critical probability is strictly less than $1$. In the case that $(G_n)_{n \geq 1}$ converges locally to some infinite, connected, locally finite, vertex-transitive graph $G$, it is natural to wonder whether the two critical probabilities $p_c(G)$ and $p_c((G_n)_{n \geq 1})$ are necessarily equal. Indeed, if this were the case, one would be able to deduce our main results (and more!) rather easily from the results of [18,65] by a compactness argument. Alon, Benjamini, and Stacey [4] proved that the equality $p_c(G) = p_c((G_n)_{n \geq 1})$ holds when $(G_n)_{n \geq 1}$ is an expander sequence. This equality is not true in general, however; indeed, the elongated torus $G_{n,k} = (\mathbb{Z}/n\mathbb{Z}) \times (\mathbb{Z}/k^n\mathbb{Z})$ (with its standard generating set) Benjamini-Schramm converges to the square lattice $\mathbb{Z}^2$, which has $p_c(\mathbb{Z}^2) = 1/2$, but has $p_c((G_{n,k})_{n \geq 1}) \to 1$ as $k \to \infty$.
On the other hand, there are several alternative notions of critical probability for the sequence $(G_n)_{n \geq 1}$ that do always coincide with $p_c(G)$ when $G_n$ converges locally to $G$. For example, if we let $o_n$ be a vertex of $G_n$ for each $n \geq 1$, let $K_{o_n}$ be the cluster of $o_n$, and define \[p_T = p_T((G_n)_{n \geq 1}) := \sup\Bigl\{p \in [0, 1] : \limsup_{n \to \infty} \mathbf{E}_p^{G_n} |K_{o_n}| < \infty\Bigr\}\] then it follows from the sharpness of the phase transition [2,19] that $p_T((G_n)_{n \geq 1}) = p_c(G)$ whenever $G_n$ is a sequence of transitive graphs converging locally to a transitive, locally finite graph $G$. It follows by similar sharpness arguments that the critical probability $p_T$ can also be characterised as the supremal value of $p$ for which the cluster volumes $|K_{o_n}|$ are tight as $n \to \infty$. Together with the structure theoretic results of [65], one can deduce rather easily from these considerations and the results of [18] that if $(G_n)_{n \geq 1} = ((V_n, E_n))_{n \geq 1}$ is any sequence of finite, vertex-transitive, bounded degree graphs with $\mathrm{diam}(G_n) = o(|V_n|)$ as $n \to \infty$ then there exists $p < 1$ and $f$ : for every $n \geq 1$ and $v \in V_n$. In other words, a sublinear diameter suffices for there to be a nontrivial phase in which the cluster of the origin is unboundedly large with good probability. However, as we see in the example of the elongated torus, it is possible in finite graphs to have a non-trivial phase in which there are many large clusters but no giant cluster. As mentioned above, this issue underlies many of the additional technical difficulties that arise for finite transitive graphs but not for infinite transitive graphs.
Remark 1.14. In forthcoming work of Easo and the first author, which complements the present paper, it is proven that the giant cluster is unique and has concentrated volume in supercritical percolation on any finite vertex-transitive graph.

Notation
Throughout the paper we assume without loss of generality that graphs do not have loops or multiple edges. This can indeed be done without loss of generality since adding loops has no effect on percolation, while adding multiple edges only makes it easier for a giant cluster to exist. Given a subset $A$ of a group $\Gamma$, we write $A^{-1} = \{a^{-1} : a \in A\}$ and $\hat{A} = A \cup \{\mathrm{id}\} \cup A^{-1}$. Given in addition $m \in \mathbb{N}$, we define $A^m = \{a_1 \cdots a_m : a_1, \ldots, a_m \in A\}$, write $\hat{A}^m = (\hat{A})^m$, and define $\hat{A}^0 = \{\mathrm{id}\}$. Given two sets $A$ and $B$ we also write $AB = \{ab : a \in A, b \in B\}$. We write $\langle A \rangle = \bigcup_{m \geq 0} \hat{A}^m$ for the subgroup of $\Gamma$ generated by $A$, and say that $A$ generates $\Gamma$ if $\langle A \rangle = \Gamma$.
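Given a group $\Gamma$ and a finite generating set $S$ of $\Gamma$, the (undirected) Cayley graph $\mathrm{Cay}(\Gamma, S)$ is the graph with vertex set $\Gamma$ and edge set $\bigl\{\{x, y\} : x, y \in \Gamma,\ x = ys \text{ for some } s \in (S \cup S^{-1}) \setminus \{\mathrm{id}\}\bigr\}$.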
Cayley graphs are always transitive since left multiplication defines a transitive action of $\Gamma$ on $\mathrm{Cay}(\Gamma, S)$ by graph automorphisms. We write $\mathrm{diam}_S(\Gamma)$ for the diameter of $\mathrm{Cay}(\Gamma, S)$, which is equal to the minimal $m$ such that $\Gamma = \hat{S}^m$. When $\Gamma$ is Abelian we will often denote the same objects using additive notation, so that e.g. $-A = \{-a : a \in A\}$, $\hat{A} = A \cup \{0\} \cup (-A)$, and $mA = \{a_1 + \cdots + a_m : a_1, \ldots, a_m \in A\}$.
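As an illustration of these definitions in the Abelian case, the sketch below builds $\mathrm{Cay}(\Gamma, S)$ for $\Gamma = \mathbb{Z}/m_1\mathbb{Z} \times \cdots \times \mathbb{Z}/m_r\mathbb{Z}$ in additive notation and computes $\mathrm{diam}_S(\Gamma)$ by breadth-first search from the identity, which suffices because Cayley graphs are transitive. The function names and the tuple representation of group elements are assumptions made for the example.

```python
from collections import deque
from itertools import product


def abelian_cayley_graph(moduli, gens):
    """Adjacency lists of Cay(Gamma, S) for Gamma = Z/m_1 x ... x Z/m_r (additive)."""
    def add(x, s):
        return tuple((a + b) % m for a, b, m in zip(x, s, moduli))

    zero = tuple(0 for _ in moduli)
    # Symmetrise the generating set and discard the identity, as in the definition.
    steps = {add(zero, s) for s in gens}
    steps |= {tuple((-a) % m for a, m in zip(s, moduli)) for s in gens}
    steps.discard(zero)
    vertices = product(*(range(m) for m in moduli))
    return {x: sorted({add(x, s) for s in steps}) for x in vertices}


def diameter(adj, root):
    """Diameter of a transitive graph, computed as the eccentricity of the root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    if len(dist) != len(adj):
        raise ValueError("the given set does not generate the group")
    return max(dist.values())


# The torus (Z/6Z) x (Z/4Z) with its standard generating set.
adj = abelian_cayley_graph((6, 4), [(1, 0), (0, 1)])
torus_diameter = diameter(adj, (0, 0))
```

Here `torus_diameter` is the smallest $m$ with $\Gamma = \hat{S}^m$ for the standard generators of the $6 \times 4$ torus.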
As above, we write $\mathbf{P}_p = \mathbf{P}_p^G$ for the law of Bernoulli-$p$ bond percolation on a graph $G$, including the superscript only when the choice of graph is ambiguous. We write $\{x \leftrightarrow y\}$ for the event that $x$ and $y$ belong to the same open cluster, and write $\{A \leftrightarrow B\}$ for the event that there is an open cluster intersecting both $A$ and $B$. We write $K_x$ for the cluster containing the vertex $x$.
Given a graph $G = (V, E)$ and a subgroup $H < \mathrm{Aut}(G)$, we write $Hv = \{hv : h \in H\}$ for the orbit of a vertex $v$ under $H$ and write $H_v = \{h \in H : hv = v\}$ for the stabilizer of $v$ in $H$. We also use similar notation for orbits and stabilizers of edges. We will often take $o$ to be a fixed root vertex of a vertex-transitive graph and write $B(o, n)$ for the graph-distance ball of radius $n$ around $o$.

Background on percolation and vertex-transitive graphs
In this section we present some basic tools for use in the rest of the paper.
The Harris-FKG inequality. Let $G = (V, E)$ be a countable graph. A set $A \subseteq \{0, 1\}^E$ is said to be increasing if whenever $\omega \subseteq \omega'$ and $\omega \in A$ we have $\omega' \in A$. The Harris-FKG inequality [23, Chapter 2.2] states that increasing events are positively correlated under product measures, so that $\mathbf{P}_p(A \cap B) \geq \mathbf{P}_p(A)\mathbf{P}_p(B)$ for every two increasing measurable sets $A$ and $B$. Since we will usually be free to increase $p$ whenever needed, we may also use the following trick to handle some situations in which Harris-FKG does not apply: If $G$ is a countable graph and $\omega$ and $\omega'$ are independent Bernoulli percolation configurations with retention probabilities $1 - p_1$ and $1 - p_2$ respectively, then $\omega \cup \omega'$ is distributed as Bernoulli percolation with retention probability $1 - p_1 p_2$.
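The union trick can be checked edge by edge: an edge is closed in $\omega \cup \omega'$ exactly when it is closed in both configurations, which happens with probability $p_1 p_2$ by independence. The following sketch compares this closed-form value with a seeded simulation; the function names and the sample size are illustrative assumptions.

```python
import random


def union_retention_probability(p1, p2):
    """Retention probability of the union of independent configurations
    with retention probabilities 1 - p1 and 1 - p2."""
    return 1 - p1 * p2


def empirical_union_density(p1, p2, num_edges=200_000, seed=0):
    """Fraction of open edges in the union, estimated over independent edges."""
    rng = random.Random(seed)
    open_count = 0
    for _ in range(num_edges):
        in_omega = rng.random() < 1 - p1        # open in the first configuration
        in_omega_prime = rng.random() < 1 - p2  # open in the second configuration
        open_count += in_omega or in_omega_prime
    return open_count / num_edges
```

Since edges are independent in both configurations, they remain independent in the union, so matching the single-edge marginal is all that is needed to identify its law.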
Criteria for a giant component. The next lemma gives necessary and sufficient conditions for a giant component in terms of point-to-point connection probabilities. 1], and let α ∈ (0, 1).
Remark 2.2. Note that, for any given $u \in V$, an assumption that $\mathbf{P}_p(u \leftrightarrow v) \geq \beta$ for every $v \in V$ is enough to be able to apply the first part of Lemma 2.1, since $\mathbf{E}_p|K_u| \geq \beta|V|$ in that case.
Remark 2.3. The second part of Lemma 2.1 is not true without the transitivity assumption: consider a path of length n connected at one end to a complete graph with n vertices.
The first item of the lemma follows trivially by applying Markov's inequality to $|V| - |K_u|$. The second item appears in a slightly less general form in the work of Benjamini [7, Proposition 1.3], who attributed the argument to Schramm. For completeness, we now give a quick proof of this lemma at the full level of generality that we require.
noting that $A$ is symmetric and contains the identity. Since $\{\gamma \in \Gamma : \gamma o = v\} = \gamma_v \Gamma_o$ for every $v \in V$, we have $|A| > |\Gamma|/(m + 1)$. Moreover, we have trivially that $\{\gamma \in \Gamma : d(o, \gamma o) \leq 1\} \subseteq A$, and since this set generates $\Gamma$ we deduce that $\langle A \rangle = \Gamma$ also. (This is why we considered $\alpha \wedge p$ instead of $\alpha$.) Thus, we may apply Lemma 2.6 to deduce that $A^{3m} = \Gamma$. Setting $k = 3m$, for each $v \in V$ there therefore exist $a_1, \ldots, a_k \in A$ such that $\gamma_v = a_k \cdots a_1$ and hence that $v = a_k \cdots a_1 o$.
Writing $v_0 = o$ and $v_i = a_i \cdots a_1 o$ for each $1 \leq i \leq k$ we deduce by Harris-FKG and the definition of $A$ that
This implies that there are at least $\alpha q|V|/2$ vertices $v$ satisfying $\mathbf{P}_p(u \leftrightarrow v) \geq \alpha q/2$, so that if $G$ is vertex transitive then the second desired conclusion follows from Lemma 2.5.
The following variation on Lemma 2.5 will also be useful. Recall that when $H$ is a subgroup of $\mathrm{Aut}(G)$ and $v$ is a vertex of $G$, we write $Hv$ for the orbit $Hv = \{hv : h \in H\}$ and $H_v$ for the stabiliser $H_v = \{h \in H : hv = v\}$.
Lemma 2.7. Let $G = (V, E)$ be a finite vertex-transitive graph, let $o \in V$, let $p \in [0, 1]$, and let $H < \mathrm{Aut}(G)$. If $\alpha > 0$ is such that $|\{v \in Ho :$
Proof. Let $A = \{h \in H : \mathbf{P}_p(o \leftrightarrow ho) \geq \alpha\}$, which is symmetric and contains the identity. Then we have by transitivity that $|A| = |\{v \in Ho : \mathbf{P}_p(o \leftrightarrow v) \geq \alpha\}| \cdot |H_o|$ and $|H| = |Ho| \cdot |H_o|$, so that $|A| > |H|/2$. It follows that $A^2 = H$, since for each $h \in H$ we have that $hA \cap A \neq \emptyset$ and hence that there exist $a_1, a_2 \in A$ such that $ha_1 = a_2$. The claim now follows similarly to the proof of Lemma 2.5.

Quotients and rough isometries
We now discuss several useful ways in which percolation on two different graphs can be compared. We will be particularly interested in the cases that either one graph is a quotient of the other or the two graphs are rough-isometric.
Monotonicity under quotients. We first recall a coupling of percolation on a graph and a quotient of that graph due to Benjamini and Schramm [11, Theorem 1] (see also [34, Section 2]), which implies in particular that if a graph $G$ admits a quotient in which $p_c < 1$ then $G$ also has $p_c < 1$. See also [50] for strengthened forms of this result.
Proposition 2.8 (Benjamini-Schramm). Let $G = (V, E)$ be a locally finite graph, let $H < \mathrm{Aut}(G)$, and let $\pi : G \to G/H$ be the quotient map. For each $v \in V$ and $p \in [0, 1]$, the cluster of $\pi(v)$ in Bernoulli-$p$ percolation on $G/H$ is stochastically dominated by the image under $\pi$ of the cluster of $v$ in Bernoulli-$p$ percolation on $G$. That is,
See e.g. [25, Chapter 4.1] for background on stochastic domination. The next lemma is a straightforward observation allowing us to make a comparison in the opposite direction when $H$ has bounded edge-orbits, at the cost of increasing $p$ in a way that depends on the size of these orbits. It implies in particular that if $H$ has bounded orbits and $G$ has $p_c < 1$ then $G/H$ has $p_c < 1$ also.
Lemma 2.9. Suppose that $G = (V, E)$ is a locally finite graph and that $H < \mathrm{Aut}(G)$ satisfies $|He| \leq k$ for some $k \geq 1$ and every $e \in E$. For each $v \in V$ and $p \in [0, 1]$, the cluster of $\pi(v)$ in Bernoulli-$(1 - (1 - p)^k)$ percolation on $G/H$ stochastically dominates the image under $\pi$ of the cluster of $v$ in Bernoulli-$p$ percolation on $G$. That is,
Rough isometries and rough embeddings. We now make note of a simple folkloric lemma allowing us to compare percolation on two rough-isometric graphs, or more generally on a graph that can be roughly embedded into another graph. We first recall the relevant definitions, referring the reader to e.g. [47, Chapter 2.6] for further background. Let $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ be connected, locally finite graphs and let $\alpha \geq 1$ and $\beta \geq 0$. We abuse notation and write $d(\,\cdot\,, \cdot\,)$ for the graph distance on both $G_1$ and $G_2$.
A function φ : V 1 → V 2 is said to be an (α, β)-rough isometry if the following conditions hold: Note that the relation of rough isometry is approximately symmetric in the sense that for each α ≥ 1 and β ≥ 0 there exist α = α (α, β) ≥ 1 and β = β (α, β) ≥ 0 such that whenever there exists an (α, β)-rough isometry φ : V 1 → V 2 between two graphs G 1 = (V 1 , E 1 ) and G 2 = (V 2 , E 2 ) then there also exists an (α , β )-rough isometry ψ : V 2 → V 1 . (Indeed, simply choose ψ(y) to be an arbitrary element of the set {x : d(φ(x), y) ≤ β} for each y ∈ V 2 .) More generally, we say that φ is an (α, β)-rough embedding if the following conditions hold: Note that every (α, β)-rough isometry between graphs with degrees bounded by k is an (α, β )rough embedding for some β = β (α, β, k).
It is a standard and easily verified fact that if $\Gamma$ is a group, $S_1$ and $S_2$ are generating sets of $\Gamma$, and $m \geq 1$ is such that $S_1 \subseteq \hat{S}_2^m$, then the identity function $\Gamma \to \Gamma$ is an $(m, 1)$-rough embedding (i.e., an $m$-Lipschitz injection) from $\mathrm{Cay}(\Gamma, S_1)$ to $\mathrm{Cay}(\Gamma, S_2)$. Lemma 2.10 therefore has the following immediate corollary. Corollary 2.11. For each $k$ and $m$ there exists a constant $C = C(k, m)$ such that the following holds. Let $f : [0, 1] \to [0, 1]$ be the increasing homeomorphism $f(p) = 1 - (1 - p^{1/C})^C$, let $\Gamma$ be a group, and let $S_1$ and $S_2$ be finite generating sets of $\Gamma$ of size at most $k$ and with $\hat{S}_1 \subseteq \hat{S}_2^m$. Then for each $v \in \Gamma$ and $p \in [0, 1]$, the cluster of $v$ in Bernoulli-$f(p)$ percolation on $\mathrm{Cay}(\Gamma, S_2)$ stochastically dominates the cluster of $v$ in Bernoulli-$p$ percolation on $\mathrm{Cay}(\Gamma, S_1)$.
Remark 2.12. Lemma 2.10 and Corollary 2.11 could also be phrased slightly more strongly as statements concerning stochastic ordering of the random equivalence relations given by connectivity.
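The reparametrisation $f(p) = 1 - (1 - p^{1/C})^C$ of Corollary 2.11 can be explored numerically. The following Python sketch only evaluates the formula; the function name and the value of $C$ used in any call are illustrative assumptions, since the corollary does not give the constant $C(k, m)$ explicitly.

```python
def rough_embedding_parameter(p: float, C: float) -> float:
    """Evaluate f(p) = 1 - (1 - p**(1/C))**C from Corollary 2.11.

    C is a placeholder here: the corollary only asserts that a suitable
    constant C = C(k, m) exists.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if C < 1:
        raise ValueError("this sketch assumes C >= 1")
    return 1.0 - (1.0 - p ** (1.0 / C)) ** C


if __name__ == "__main__":
    # Tabulate f on a coarse grid for an arbitrary illustrative constant.
    for p in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(p, rough_embedding_parameter(p, C=3))
```

For $C \geq 1$ one has $f(0) = 0$, $f(1) = 1$ and $f(p) \geq p$, which matches the role of $f$ as the price paid in the parameter when passing from $\mathrm{Cay}(\Gamma, S_1)$ to $\mathrm{Cay}(\Gamma, S_2)$.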

Quasitransitive graphs
We reduce Theorems 1.2 and 1.7 to the transitive case via the following folkloric result.
Proposition 2.13. Let $n, k \in \mathbb{N}$ and suppose $G = (V, E)$ is a connected $n$-quasitransitive graph of degree at most $k$. Then there exists a connected transitive graph $G' = (V', E')$ of degree at most $(k + 1)^{2n}$ satisfying $\mathrm{diam}(G') \leq \mathrm{diam}(G)$ and an injective $(2n, n)$-rough isometry $G' \to G$. Moreover, if $G$ is finite then we may insist that $|V|/n \leq |V'| \leq |V|$.
The proof of Proposition 2.13 begins with the following simple observation.
Lemma 2.14. Let n ∈ N, suppose G = (V, E) is a connected n-quasitransitive graph and let Γ = Aut (G). Then every v ∈ V lies at a distance of at most n − 1 from each Γ-orbit of V .
Proof of Lemma 2.14. This is equivalent to the claim that the quotient graph G/Γ has diameter at most n − 1, which is trivial since this graph is connected and has at most n vertices.
Proof of Proposition 2.13. Let $\Gamma = \mathrm{Aut}(G)$ and let $V' \subseteq V$ be some $\Gamma$-orbit, noting that if $G$ is finite then we may take $V'$ to be an orbit of maximum size so that $|V|/n \leq |V'| \leq |V|$ as required.
Since $\Gamma$ acts by isometries on $G$ it also acts by isometries on $G' = (V', E')$, and this action is transitive by definition of $V'$. Moreover, the degree of a vertex $x$ in $G'$ is at most $|B_G(x, 2n)| \leq (k + 1)^{2n}$, as required.
We claim that the inclusion map $V' \to V$ is a $(2n, n)$-rough isometry $G' \to G$. Indeed, Lemma 2.14 shows that for every $v \in V$ there exists $u \in V'$ with $d_G(u, v) \leq n$, and we trivially This completes the proof that the inclusion $V' \to V$ is a $(2n, n)$-rough isometry $G' \to G$, and also proves that $G'$ is connected with diameter at most $\mathrm{diam}(G)$.
Corollary 2.15. If Theorems 1.2 and 1.7 hold for n = 1 then they hold for all n ≥ 1.
Proof. In the case of Theorem 1.7 this is immediate from Proposition 2.13, Lemma 2.10 and the fact that superlinear growth is preserved under rough isometries, so we concentrate on Theorem 1.2. Let k, n ≥ 1, and λ > 0 and let G = (V, E) be a finite, n-quasitransitive graph of degree at most k satisfying By Proposition 2.13 there exists a transitive graph G = (V , E ) of degree at most (k + 1) 2n with |V |/n ≤ |V | ≤ |V | and diam(G ) ≤ diam(G), and hence and an injective (2n, n)-rough isometry φ : G → G. If Theorem 1.2 holds for transitive graphs then we may conclude by that theorem and the second part of Lemma 2.1 that for each ε > 0 there exists It then follows from Lemma 2.10 that for each ε > 0 there exists δ 2 = δ 2 (k, n, λ, ε) such that for every u, v ∈ V and p ≥ 1 − δ 2 , from which the claim follows easily by the first part of Lemma 2.1.

Nilpotent and Abelian groups
In this section we study percolation on Cayley graphs of Abelian and nilpotent groups. We study percolation in boxes in Z d in Section 3.1, use techniques from additive combinatorics to generalize these results to arbitrary Abelian Cayley graphs in Section 3.2, then use an inductive argument to study percolation on Cayley graphs of nilpotent groups in Section 3.3. Finally, in Section 3.4, which can be read independently of the rest of the section, we prove uniform upper bounds on p c for Cayley graphs of infinite, virtually nilpotent groups.

Percolation in an elongated Euclidean box
Malon and Pak proved Theorem 1.3 for certain specific types of generating sets called Hall bases [49, Theorem 1.2]. In this section we prove a variant of their result that concerns percolation on boxes rather than tori. In order to state this result it will be convenient to introduce some notation. Given $n_1, \ldots, n_d \in \mathbb{N}$, we define the box $B(n_1, \ldots, n_d) \subset \mathbb{Z}^d$ via \[B(n_1, \ldots, n_d) = \{x \in \mathbb{Z}^d : |x_i| \leq n_i \text{ for every } 1 \leq i \leq d\}.\] We view $B(n_1, \ldots, n_d)$ as an induced subgraph of $\mathbb{Z}^d$, so that $\mathrm{diam}(B(n_1, \ldots, n_d)) = 2(n_1 + \cdots + n_d)$.
We now establish an analogue of Theorem 1.3 for Euclidean boxes. Then for every ε > 0 there exists p 0 = p 0 (λ, ε) such that P p (x ↔ y) ≥ 1 − ε for every p ≥ p 0 and x, y ∈ B. We first consider the case d = 2. The analysis of this case follows by standard methods and is similar to [23,Chapter 11.5], so we will keep our presentation brief and focus on the main conceptual ideas.
Remark 3.4. One can also show conversely that if $p < 1$ is fixed and $m = m(n) = o(\log n)$ then the torus $(\mathbb{Z}/n\mathbb{Z}) \times (\mathbb{Z}/m\mathbb{Z})$, and in particular its subgraph $B(n, m)$, does not contain a giant component with high probability under Bernoulli-$p$ percolation as $n \to \infty$. Indeed, this torus contains $n/2$ copies of the cycle $\mathbb{Z}/m\mathbb{Z}$ with disjoint 1-neighbourhoods, and each such copy has all edges incident to it closed with probability $(1 - p)^{3m}$, independently of all other copies. We deduce that if $n \gg (1 - p)^{-3m}$ then there will exist with high probability many cycles having this property. Since the locations of these cycles are uniform among the $n/2$ possibilities, their complement will not contain a giant cluster with high probability.
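The counting in Remark 3.4 can be made concrete with a short computation. The Python sketch below evaluates the expected number of cycles all of whose incident edges are closed, namely $(n/2)(1-p)^{3m}$, together with the width at which this expectation is of order one. The function names are hypothetical, and $n/2$ is rounded down for odd $n$ as a simplifying assumption.

```python
import math


def expected_isolated_cycles(n: int, m: int, p: float) -> float:
    """Expected number of the n/2 disjoint cycles Z/mZ in the torus
    (Z/nZ) x (Z/mZ) whose 3m incident edges are all closed."""
    return (n // 2) * (1.0 - p) ** (3 * m)


def critical_width(n: int, p: float) -> float:
    """Width m at which n * (1 - p)**(3m) equals 1, i.e.
    m = log(n) / (3 * log(1 / (1 - p))). Requires 0 < p < 1."""
    return math.log(n) / (3.0 * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    p = 0.9
    for n in (10**3, 10**6, 10**9):
        m = max(1, round(math.log(math.log(n))))  # a width with m = o(log n)
        print(n, m, expected_isolated_cycles(n, m, p), critical_width(n, p))
```

When $m$ is well below `critical_width(n, p)` the expectation grows with $n$, which is the regime $n \gg (1-p)^{-3m}$ of the remark.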
In order to prove Lemma 3.3, we first prove the following standard lemma concerning two dimensional bond percolation in a square box.
Lemma 3.5. For each p > 1/2 there exists a constant c(p) > 0, with c(p) → 1 as p → 1, such that every point x ∈ B(n, n) has probability at least c(p) to be connected to all four sides of the box B(n, n) in Bernoulli-p bond percolation on B(n, n).
Proof. We will prove that for an arbitrary p > 1/2 there exists a constant c(p) > 0 such that every point x ∈ B(n, n) has probability at least c(p) to be connected to all four sides of the box B(n, n) in Bernoulli-p bond percolation on B(n, n); using the fact that P p θ (A) ≥ P p (A) θ for every p, θ ∈ [0, 1] and every increasing event A [23, Theorem 2.38], we may then take c(p) → 1 as p → 1 as required. It follows from a standard duality argument [23, Lemma 11.21] that B(n, n) has a leftright crossing with probability at least 1/2 when p ≥ 1/2, and hence by symmetry and Harris-FKG that B(n, n) has both a left-right crossing and a top-bottom crossing with probability at least 1/4 when p ≥ 1/2. Moreover, it is a consequence of Kesten's theorem [40] that the critical probability for the quarter-plane [0, ∞) 2 ⊆ Z 2 is 1/2 (see e.g. [23,Chapter 11.5]), and hence that if p > 1/2 then there exists q = q(p) > 0 such that the origin is connected to infinity in the quarter-plane with probability at least q. Letting x ∈ B(n, n) and considering the four copies of the quarter-plane with corner at x, we have by Harris-FKG that with probability at least q 4 /4, B(n, n) has both a left-right crossing and a top-bottom crossing and x is connected to the boundary of B(n, n) within each of the four quarter-planes with corner at x. On this event, we see by a simple topological argument that x must be connected to all four sides of the box as required; see Figure 1 for an illustration. inequality that if p > 2/3 then the probability that e is connected to the bottom of the rectangle by a dual path of closed edges is at most r≥m If we take such a p, then it follows by a union bound that there does not exist any closed dual top-bottom crossing of the rectangle B(n, m) with probability at least 1 − 2/n. On this event there must exist an open path in the primal connecting the left and right sides of the rectangle.
On the other hand, for each p > 1/2 we have by Lemma 3.5 that there exists a constant c(p) > 0, with c(p) → 1 as p → 1, such that every element of the box [−m, m] 2 has probability at least c(p) to be connected to all four sides of the box [−m, m] 2 by open paths contained within this box. We can easily deduce from this that there is a giant cluster with high probability when p is close to 1. Indeed, if x and y are any two points in the rectangular box B(m, n), then if p is close enough to 1, x and y each have probability at least 1 − ε to be connected to both the top and bottom of the rectangle. By the Harris-FKG inequality, the probability that x and y are both connected to both the top and bottom of the rectangle and that there is an open left-right crossing of the rectangle is at least the product of these probabilities, and hence is close to 1 when p is close to 1 and n is large. On this event x must be connected to y (see Figure 1 for an illustration), and the claim is easily deduced.
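The two-dimensional statement can be probed empirically. The following Python sketch is a minimal Monte Carlo estimate of $P_p(x \leftrightarrow y)$ for Bernoulli bond percolation on the rectangle $[-m, m] \times [-n, n]$, using a union-find structure. It is an illustration only and plays no role in the proof; the function names, the number of trials and the seed are assumptions made for the example.

```python
import random


def find(parent, v):
    """Union-find root lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def connection_frequency(m, n, p, x, y, trials=200, seed=0):
    """Empirical frequency of {x <-> y} in Bernoulli-p bond percolation
    on the box [-m, m] x [-n, n], viewed as an induced subgraph of Z^2."""
    rng = random.Random(seed)
    vertices = [(i, j) for i in range(-m, m + 1) for j in range(-n, n + 1)]
    hits = 0
    for _ in range(trials):
        parent = {v: v for v in vertices}
        for (i, j) in vertices:
            # Each edge is visited once, from its lower-left endpoint.
            for w in ((i + 1, j), (i, j + 1)):
                if w in parent and rng.random() < p:
                    parent[find(parent, (i, j))] = find(parent, w)
        hits += find(parent, x) == find(parent, y)
    return hits / trials


if __name__ == "__main__":
    # Opposite corners of an elongated box, for a few values of p.
    for p in (0.6, 0.8, 0.95):
        print(p, connection_frequency(3, 30, p, (-3, -30), (3, 30)))
```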
We now apply the two-dimensional case to analyze the general case. Following Malon and Pak, we will do this by constructing a homomorphic image of a two-dimensional box inside a box of general dimension, taking care to make sure the two-dimensional box has diameter of the correct order.
We may assume without loss of generality that n 1 ≤ · · · ≤ n d . We may also assume that |B| ≥ 100e λ , noting that this combines with (3.1) to force d ≥ 2. We claim that there exists 1 ≤ k < d such that Indeed, we will prove that if k is minimal such that N 1 · · · N k ≥ λ −1 log |B| then k < d and N k+1 · · · N d ≥ λ −1 log |B|. If k = 1 then this inequality is immediate since where the final inequality follows by calculus and the assumption that |B| ≥ 100e λ . We may therefore assume that k > 1. In this case the bounds (3.1) and d ≥ 2 imply that and hence that k < d. If (3.2) does not hold then we have that and hence that Since λ ≥ 1 this is contrary to the assumption that |B| ≥ 100, so that N k+1 · · · N d ≥ λ −1 log |B| as claimed. Let 1 ≤ k < d be such that (3.2) holds and set m = 1 2 (N 1 · · · N k −1) and n = 1 2 (N k+1 · · · N d −1), noting that m, n ∈ N since the N i are all odd. It follows from (3.2) and the assumption that Recall that a Hamiltonian path in a graph is a path that visits each vertex exactly once. Continuing to follow Malon and Pak, choose Hamiltonian paths such that φ 1 (0) = 0 and φ 2 (0) = 0, noting that such paths trivially exist, and note that the map φ = (φ 1 , φ 2 ) : B(m, n) → B is a bijective graph homomorphism satisfying φ(0) = 0. In particular, B(m, n) is isomorphic to a spanning subgraph of B, so that the desired result follows from (3.4) and Lemma 3.3.
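The Hamiltonian paths used above are asserted to exist trivially. One explicit choice, given here as a sketch rather than as the construction of Malon and Pak, is a boustrophedon ("snake") ordering of a box whose side lengths $N_i = 2n_i + 1$ are all odd. The function name `snake_path` is hypothetical.

```python
def snake_path(half_widths):
    """Return a Hamiltonian path of the box prod_i [-n_i, n_i] in Z^d.

    The first coordinate is swept back and forth while the remaining
    coordinates follow, recursively, a snake path of the lower-dimensional
    box, so consecutive vertices differ by 1 in exactly one coordinate.
    """
    if not half_widths:
        return [()]
    n0 = half_widths[0]
    sweep = list(range(-n0, n0 + 1))
    path = []
    for index, tail in enumerate(snake_path(half_widths[1:])):
        row = sweep if index % 2 == 0 else sweep[::-1]
        path.extend((x,) + tail for x in row)
    return path


if __name__ == "__main__":
    path = snake_path((1, 2))
    m = (len(path) - 1) // 2
    # Indexing the path by {-m, ..., m}, the middle vertex is the origin.
    print(len(path), path[m])
```

Because every side length is odd, the middle vertex of this path is the origin, so indexing the path by $\{-m, \ldots, m\}$ with $2m + 1 = N_1 \cdots N_k$ gives a map sending $0$ to $0$, as required of $\varphi_1$ and $\varphi_2$.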

Abelian groups
In this section we prove the following generalisation of Theorem 1.3. Throughout this section we use additive notation for Abelian groups. In particular, we write 0 for the identity element of an Abelian group, and given a subset A of an abelian group and a positive integer r we write rA = {a 1 + · · · + a r : a i ∈ A}. Given vertices x and y and a set of vertices A, we write {x A ← → y} for the event that x is connected to y by an open path of edges with both endpoints in A.
Theorem 3.7. For each k ≥ 1 and λ, ε ∈ (0, 1] there exist constants C = C k ∈ N and p 0 = p 0 (k, λ, ε) < 1 such that the following holds. Let Γ be an Abelian group with generating set S = {x 1 , . . . , x k } and let r ≥ 1 be such that |rŜ| ≥ λ(r + 1) log(r + 1). Then Remark. Theorem 3.7 is not true in an arbitrary group. For example, if S is a generating set for a non-Abelian free group Γ then |Ŝ r | grows exponentially in r, but under percolation on Cay(Γ, S) we have P p (x ↔ y) → 0 as d(x, y) → ∞ for every fixed p < 1.
Before we prove Theorem 3.7, let us confirm that it really does generalise Theorem 1.3.
Proof of Theorem 1.3. This follows by applying Theorem 3.7 with r = diam S (Γ) and using that |Γ| ≥ diam S (Γ), and then applying the first part of Lemma 2.1.
(Note that the $L_i$ need not be integers.) The progression $P$ is called proper if each of its elements has a unique representation of the form $\ell_1 a_1 + \cdots + \ell_k a_k$ with $|\ell_i| \leq L_i$ for every $i$. Note that if $L_1, \ldots, L_k$ are integers and $P = P_{a_1,\ldots,a_k}(L_1, \ldots, L_k)$ is a progression in an Abelian group then $mP = P_{a_1,\ldots,a_k}(mL_1, \ldots, mL_k)$ for every integer $m \geq 1$. 3 . Let $\Gamma$ be an Abelian group with generating set $S = \{x_1, \ldots, x_k\}$. Then for each $r \geq 1$ there exist non-negative integers $L_1, \ldots, L_k \leq r$ such that the progression $P = P_{x_1,\ldots,x_k}(L_1, \ldots, L_k)$ is proper and satisfies $P \subseteq r\hat{S} \subseteq C_k(P + \hat{S})$.
Note that we do not claim that the progression C k P is proper. Before proving this proposition, let us see how it implies Theorem 3.7, and hence in particular Theorem 1.3.
Let C = C k be the constant coming from Proposition 3.9 and let P = P x 1 ,...,x k (L 1 , . . . , L k ) be as in the statement of that proposition, so that P is proper and Since P is proper, there is a subgraph of G = Cay(Γ, S) with vertex set P that is isomorphic to the box B = B(L 1 , . . . , L k ). (The subgraph of G induced by P may 'wrap around the sides' of P and be strictly larger than this subgraph, but this does not cause any problems.) Moreover, we also have that and |CP + CŜ| ≤ |CP | · |Ŝ| C ≤ (2k + 1) C |CP | in the last line. Proposition 3.1 therefore implies that there exists for every x, y ∈ P and p ≥ p 0 . It follows trivially that for every p ≥ p 0 and x ∈ P ∪Ŝ also. Let y ∈ CP + CŜ, so that we can write y = a 1 + a 2 + . . . + a for some ≤ 2C and a 1 , . . . , a ∈ P ∪Ŝ. Writing y 0 = 0 for the identity of Γ and y i = i j=1 a j for each 0 ≤ i ≤ , we deduce from the Harris-FKG inequality that for every p ≥ p 0 and y ∈ CP + CŜ. A further application of Harris-FKG then yields that for every p ≥ p 0 and x, y ∈ CP +CŜ. The claim follows since rŜ ⊆ CP +CŜ and 2CP ⊆ 2CrŜ.
The remainder of this section is dedicated to proving Proposition 3.9, and is of an entirely additive-combinatorial nature. In order to facilitate an inductive proof we will prove a more general and technical statement concerning progressions that are proper modulo a set. We begin with some relevant definitions. Definition 3.10 (Divisibility by a subset). Let Γ be an Abelian group and let Q ⊆ Γ be symmetric and contain the identity. If a subset A ⊂ Γ satisfies then we may define an equivalence relation "≡ mod Q" on A by saying that x ≡ y mod Q if and only if x − y ∈ Q. We write A/Q for the set of equivalence classes of this equivalence relation, and call A/Q the quotient of A by Q. If in addition to (3.7) we have that then we say that A is divisible by Q. Note that if A is divisible by Q then so is every subset of A. If Q ⊂ Γ is symmetric and contains 0, we will say that a progression P ⊆ Γ is proper mod Q if it is proper, divisible by Q, and no two of its elements belong to the same equivalence class of P/Q. Equivalently, P is proper mod Q if it is proper and x − y / ∈ Q for each two distinct elements x, y ∈ P .
These definitions satisfy the following elementary inductive property.
We now prove the following proposition, which generalizes Proposition 3.9. We will always apply this proposition with Q = {0}, but we include Q in the statement in order to facilitate the inductive proof. Note also that we do not require that the set A generates Γ. Proposition 3.12. For each k ≥ 1 let C k = 2 6k (k!) 3 . Let Γ be an Abelian group, let Q ⊂ Γ be symmetric and contain the identity, and let A = {a 1 , . . . , a k } ⊂ Γ be a subset of Γ with k elements. For each r ∈ N such that rÂ is divisible by Q there exist non-negative integers L 1 , . . . , L k ≤ r such that the progression P = P a 1 ,...,a k (L 1 , . . . , L k ) is proper mod Q and satisfies P ⊆ rÂ ⊆ C k (P + Q +Â).
Proof. We prove the claim by induction on $k$, beginning with the base case $k = 1$. To prove this case, let $A = \{a\}$ for some $a \in \Gamma$ and suppose that $r\hat{A} = \{-ra, \ldots, ra\} = P_a(r)$ is divisible by $Q$. If $P_a(r)$ is proper mod $Q$ then the proposition is satisfied, so we may assume not and set $L < r$ to be the maximum non-negative integer such that $P_a(L)$ is proper mod $Q$. By maximality there exist distinct $\ell, \ell' \in \mathbb{Z}$ with $|\ell|, |\ell'| \leq L + 1$ such that $(\ell - \ell')a \in Q$. The divisibility of $P_a(r)$ by $Q$ then implies that $m(\ell - \ell')a \in Q$ for all $m \in \mathbb{Z}$ such that $|m(\ell - \ell')| \leq r$, and so $r\hat{A} \subseteq P_a(|\ell - \ell'|) + Q$.
In particular, we have that $P_a(L) \subseteq r\hat{A} \subseteq P_a(|\ell - \ell'|) + Q \subseteq P_a(2L + 2) + Q \subseteq 2(P_a(L) + Q + \hat{A})$ and the proposition is satisfied. Now suppose that $k \geq 2$, let $\Gamma$, $Q$, $A$, and $r$ be as in the statement, and suppose that the claim has already been proven for all smaller values of $k$. It suffices to prove that if $r$ satisfies the additional assumption that $r\hat{A} \not\subseteq (r - 1)\hat{A} + Q$, or equivalently that there exists $x \in r\hat{A}$ such that $x \not\equiv y \bmod Q$ for every $y \in (r - 1)\hat{A}$, (3.9) then there exist non-negative integers $L_1, \ldots, L_k \leq r$ such that $P = P_{a_1,\ldots,a_k}(L_1, \ldots, L_k)$ is proper mod $Q$ and satisfies $P \subseteq r\hat{A} \subseteq (C_k - 1)(P + Q + \hat{A})$.
Indeed, suppose that we have proven this claim and that $r$ does not satisfy (3.9). Then either $r\hat{A} \subseteq Q$, in which case the claim is trivial, or we may take $r' \in \mathbb{N}$ to be maximal with $1 \leq r' \leq r$ such that there exists $x \in r'\hat{A}$ that is not equal to any element of $(r' - 1)\hat{A}$ mod $Q$. Since $r\hat{A}$ is divisible by $Q$, so that equivalence mod $Q$ is an equivalence relation on $r\hat{A}$, the maximality of $r'$ implies that every element of $r\hat{A}$ is equivalent to an element of $r'\hat{A}$ mod $Q$, so that $r\hat{A} \subseteq r'\hat{A} + Q$. Applying the claim to the set $r'\hat{A}$, we obtain that there exist non-negative integers $L_1, \ldots, L_k \leq r' \leq r$ such that the progression $P = P_{a_1,\ldots,a_k}(L_1, \ldots, L_k)$ is proper mod $Q$ and satisfies $P \subseteq r'\hat{A} \subseteq (C_k - 1)(P + Q + \hat{A})$, so that $P \subseteq r\hat{A} \subseteq C_k(P + Q + \hat{A})$ as required.
We now carry out the induction step under the additional hypothesis (3.9) as discussed above. By condition (3.9) we may pick x ∈ rÂ \ ((r − 1)Â + Q). Note then that x = m 1 a 1 + · · · + m k a k for some m 1 , . . . , m k ∈ Z such that |m 1 | + · · · + |m k | = r. By relabelling the generators a 1 , . . . , a k if necessary, we may assume without loss of generality that m k = max i |m i |, which implies in particular that r k ≤ m k ≤ r. We next claim that if 1 , . . . , k ∈ Z satisfy | k | ≤ m k and 1 a 1 + · · · + k a k ∈ Q then Indeed, since Q is symmetric we may assume that 0 ≤ k ≤ m k . For such ( i ) k i=1 we have trivially that x ∈ (m 1 − 1 )a 1 + · · · + (m k − k )a k + Q, and using that x / ∈ (r − 1)Â + Q we deduce that as claimed. Applying this claim with 1 = · · · = k−1 = 0 shows that the progression P a k (m k /2) is proper mod Q. Similarly, taking arbitrary 1 , . . . , k−1 with | i | ≤ m k /4k establishes the implication We claim that P a 1 ,...,a k−1 (m k /4k, . . . , m k /4k) is divisible by P a k (m k /2) + Q. We start the proof of this claim by showing that the quotient of P a 1 ,...,a k−1 (m k /4k, . . . , m k /4k) by P a k (m k /2) + Q is well defined (i.e., that (3.7) holds). Given x, y, z ∈ P a 1 ,...,a k−1 (m k /4k, . . . , m k /4k) with x − y ∈ P a k (m k /2) + Q and y − z ∈ P a k (m k /2) + Q, it follows from (3.11) that there exist u, v ∈ P a k (m k /4) such that x ≡ y + u mod Q and y + u ≡ z + u + v mod Q (equivalence mod Q being well defined for x, y + u and z + u + v since they all belong to rÂ, which is divisible by Q). We deduce from this that x ≡ z + u + v mod Q, which implies in particular that x − z ∈ P a k (m k /2) + Q as required. To prove moreover that P a 1 ,...,a k−1 (m k /4k, . . . , m k /4k) is divisible by P a k (m k /2) + Q (i.e., to verify (3.8)), suppose that x, x , y, y ∈ P a 1 ,...,a k−1 (m k /4k, . . . , m k /4k) satisfy x ≡ x mod P a k (m k /2) + Q and y ≡ y mod P a k (m k /2) + Q. 
As before, (3.11) implies that there exist u, v ∈ P a k (m k /4) such that x ≡ x + u mod Q and y ≡ y + v mod Q, and since rÂ is divisible by Q it follows that x + y ≡ x + y + u + v mod Q. This in turn implies that x + y − x − y ∈ P a k (m k /2) + Q, and hence that x + y ≡ x + y mod P a k (m k /2) + Q as required. Note moreover that writing A = {a 1 , . . . , a k−1 } we have that m k 4k Â is a subset of P a 1 ,...,a k−1 (m k /4k, . . . , m k /4k), and is therefore divisible by P a k (m k /2) + Q also.
Define n ∈ N ∪ {0} to be minimal such that m k 4k Â ⊆ nÂ + P a k (m k /2) + Q, noting that n ≤ m k /4k and, by (3.11), that Applying the induction hypothesis to the sets A and P a k (m k /2) + Q, we deduce that there exist integers L 1 , . . . , L k−1 ≤ n such that P a 1 ,...,a k−1 (L 1 , . . . , L k−1 ) is proper mod P a k (m k /2) + Q and satisfies Set L k = m k /4k , so that r/4k 2 ≤ L k ≤ r/4k by (3.10). Considering separately the cases L k = 0 and L k ≥ 1 yields that P a k (r) ⊆ 4k 2Â + 8k 2 P a k (L k ). (3.14) Since the progression P a 1 ,...,a k−1 (L 1 , . . . , L k−1 ) is proper mod P a k (m k /2) + Q and P a k (m k /2k) is proper mod Q, we may apply Lemma 3.11 to deduce that the progression P a 1 ,...,a k (L 1 , . . . , L k ) is proper mod Q as required. This progression is also clearly contained in rÂ since L i ≤ r/4k for every 1 ≤ i ≤ k. It follows from (3.12) and (3.13) that Noting that P a k (m k /2) ⊆ 4kP a k (L k ) + 2kÂ (the term 2kÂ being necessary only if L k = 0), we deduce that It follows from (3.10) that r ≤ km k ≤ 8k 2 L k + 4k 2 , so we deduce that It follows from this and (3.14) that The claim follows since 32k 3 (C k−1 + 2) + 1 ≤ 64k 3 C k−1 = C k for every k ≥ 2.
Proof of Proposition 3.9. This follows by applying Proposition 3.12 with A = S and Q = {0}.
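The notion of a proper progression used throughout this section is easy to test by brute force in a small cyclic group. The Python sketch below takes the progression with generators $a_1, \ldots, a_k$ and integer side lengths $L_1, \ldots, L_k$ to be the set of all $\ell_1 a_1 + \cdots + \ell_k a_k$ with $|\ell_i| \leq L_i$, working in $\mathbb{Z}/N\mathbb{Z}$ with $Q = \{0\}$. The ambient group, the function names and the examples are assumptions chosen for illustration.

```python
from itertools import product


def progression_values(gens, lengths, modulus):
    """List l_1*a_1 + ... + l_k*a_k mod N over all integer coefficient
    vectors with |l_i| <= L_i (one entry per coefficient vector)."""
    ranges = (range(-L, L + 1) for L in lengths)
    return [
        sum(l * a for l, a in zip(coeffs, gens)) % modulus
        for coeffs in product(*ranges)
    ]


def is_proper(gens, lengths, modulus):
    """A progression is proper when distinct coefficient vectors give
    distinct group elements, i.e. every element has a unique representation."""
    values = progression_values(gens, lengths, modulus)
    return len(set(values)) == len(values)


if __name__ == "__main__":
    print(is_proper((1,), (7,), 15))       # 15 coefficient values in Z/15Z
    print(is_proper((1,), (8,), 15))       # 17 coefficient values in Z/15Z
    print(is_proper((1, 5), (2, 1), 15))   # a rank-2 example
```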

Finite nilpotent groups
In this section we extend Theorem 1.3 to finite nilpotent groups. We first recall some basic relevant facts about nilpotent groups, referring the reader to e.g. [ The lower central series of Γ is the nested sequence of normal subgroups Γ 1 > Γ 2 > · · · defined recursively by Each subgroup Γ i appearing in the lower central series is easily seen to be characteristic in Γ (that is, Γ i is fixed by every automorphism of Γ), and by definition Γ i /Γ i+1 is central in Γ/Γ i+1 for every i ≥ 1 (that is, every element of Γ i /Γ i+1 commutes with every element of Γ/Γ i+1 ). The group Γ is said to be nilpotent if there exists s ≥ 1 such that Γ s+1 = {id}. Note in this case that Γ i = {id} for every i > s and Γ s = Γ s /Γ s+1 is central in Γ = Γ/Γ s+1 . The minimal such s is known as the step of Γ, and is equal to 1 if and only if Γ is Abelian. The primary goal of this section is to prove the following proposition.
Proposition 3.13. For each k, s ∈ N, λ ≥ 1, and ε > 0 there exists p 0 (k, s, λ, ε) < 1 such that the following holds: If Γ is a finite s-step nilpotent group and S is a generating set for Γ of size at most k satisfying then Bernoulli-p bond percolation on Cay(Γ, S) satisfies P p (x ↔ y) ≥ 1 − ε for every p ≥ p 0 and x, y ∈ Γ.
The proof of Proposition 3.13 will proceed by induction on the step s of the group, with the Abelian case already being handled by Theorem 3.7. The proof will rely on the fact that if Γ is nilpotent of step s then Γ/Γ s is nilpotent of step s − 1. We begin by recalling some relevant basic facts about nilpotent groups.
is a homomorphism in each variable modulo Γ j+1 . Moreover, if γ i ∈ [Γ, Γ] for some i then φ j (γ 1 , . . . , γ j ) ∈ Γ j+1 . In particular, if Γ is j-step nilpotent group then φ j is a homomorphism in each variable and the commutator subgroup [Γ, Γ] is in the kernel of each of these homomorphisms. We will split the proof of the induction step into two cases according to whether |Γ s | ≥ log |Γ| or |Γ s | < log |Γ|. It will be useful to know that |Γ s | cannot be too large. We recall that the rank of a finitely generated group Γ is defined to be the minimal cardinality of a generating set of Γ. We now treat the case that |Γ s | ≥ log |Γ|, for which we will be able to use a rather general argument that does not use the induction hypothesis and is not specific to the nilpotent setting. Let Γ be a group with finite generating set S, and let H be a subgroup. Given r ≥ 1, we say that H is r-quasiconnected ifŜ r ∩ H generates H, or equivalently if the cosets of H are connected as subsets of Cay(Γ,Ŝ r ). In particular, Lemma 3.15 implies that if Γ is s-step nilpotent then Γ s is C s -quasiconnected with respect to any generating set. The next proposition shows very generally that the existence of a quasiconnected central subgroup of moderate size implies non-triviality of the percolation phase transition. As above, we say that a subgroup H of a group Γ is central if γh = hγ for every h ∈ H and γ ∈ Γ, noting that central subgroups are always normal.
Proof. We will construct a surjective rough embedding of a Euclidean box onto Cay(Γ, S). The box we construct will satisfy the hypotheses of Proposition 3.1 by (3.16), so that we can conclude by applying that proposition together with Lemma 2.10. Let π : Γ → Γ/H be the projection map. It suffices by Corollary 2.11 to prove the assertion about the non-triviality of the percolation phase transition for G = Cay(Γ,Ŝ r ) rather than Cay(Γ, S). Let G 1 be the subgraph of G induced by H, which is isomorphic to Cay(H,Ŝ r ∩ H), and let G 2 be Cay(Γ/H,Ŝ r /H). Recall that every finite graph all of whose degrees are even admits an Eulerian circuit, i.e., a cycle that passes through every edge exactly once [17, Theorem 1.8.1]. It follows that every finite graph admits a path that visits every vertex and crosses each edge exactly twice. Thus, there exist n 1 , n 2 ≥ 1 and surjective functions φ 1 : {1, . . . , n 1 } → H and φ 2 : {1, . . . , n 2 } → Γ/H such that • φ i (j) and φ i (j + 1) are adjacent in G i for each i ∈ {1, 2} and 1 ≤ j ≤ n i − 1 and • the path in G i associated to φ i crosses each edge of G i at most twice.
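The walk that crosses each edge exactly twice can be produced algorithmically: doubling every edge makes all degrees even, and an Eulerian circuit of the doubled multigraph is the desired walk. The Python sketch below uses Hierholzer's algorithm for this. It assumes a finite connected simple graph given as an adjacency dictionary, and the function name is hypothetical.

```python
def doubled_edge_walk(adj, start):
    """Closed walk from `start` that visits every vertex of a finite
    connected simple graph and crosses each edge exactly twice.

    adj maps each vertex to a list of its neighbours (symmetric, no loops).
    """
    # Two copies of every undirected edge are available for traversal.
    remaining = {frozenset((u, v)): 2 for u in adj for v in adj[u]}
    pointer = {u: 0 for u in adj}
    stack, walk = [start], []
    while stack:
        u = stack[-1]
        # Skip neighbours whose edge copies have all been used.
        while (pointer[u] < len(adj[u])
               and remaining[frozenset((u, adj[u][pointer[u]]))] == 0):
            pointer[u] += 1
        if pointer[u] == len(adj[u]):
            walk.append(stack.pop())  # u is exhausted: emit it
        else:
            v = adj[u][pointer[u]]
            remaining[frozenset((u, v))] -= 1
            stack.append(v)
    return walk[::-1]


if __name__ == "__main__":
    triangle_with_tail = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(doubled_edge_walk(triangle_with_tail, 0))
```

A walk of this kind has $2|E| + 1$ vertices, so listing its vertices in order gives a surjection of the type $\varphi_i$ described above.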
The centrality of H implies that the map φ defines a graph homomorphism B → G, i.e., that φ(a, j) and φ(a , j ) are adjacent in G whenever |a − a | + |j − j | = 1. Equivalently, φ(a, j) −1 φ(a , j ) ∈Ŝ r whenever |a − a | + |j − j | = 1. Indeed, if a = a and j = j + 1 we have trivially that while if a = a + 1 and j = j then we have by centrality of H that which belongs toŜ r since φ 1 (a) and φ 1 (a + 1) are adjacent in G 1 . Finally, we observe that the preimage of each element of Γ under φ has between 1 and 4(2k+1) 2r elements. Indeed, φ(a, j) belongs to the right coset of H determined by π(s 1 · · · s j−1 ), so that for each γ ∈ Γ there are at most 2(2k + 1) r values of j for which φ(a, j) lies in the same H coset as γ, and for each j there are at most 2(2k + 1) r values of a for which φ 1 (a) = γs −1 j−1 · · · s −1 1 . Thus φ is a (1, 4(2k + 1) 2r )-rough embedding of B into G. Since φ is surjective on vertices and min{n 1 , n 2 } ≥λ log max{n 1 , n 2 } the claim follows from Proposition 3.1 and Lemma 2.10. Proposition 3.17 together with Lemmas 3.15 and 3.16 immediately handles the case of Proposition 3.13 in which |Γ s | ≥ log |Γ|. We next address the case that |Γ s | < log |Γ|, for which the induction hypothesis will actually be used. We begin with some preliminaries. Let Γ be an sstep nilpotent group with finite generating set S, so that Γ s is generated by S s = {[x 1 , . . . , x s ] : x 1 , . . . , x s ∈ S} by Lemma 3.15. Since Γ s is central in Γ, we have moreover that z Γ s is an Abelian subgroup of Γ for each z ∈ Γ and that z Γ s is generated by S s ∪ {z}. The next lemma shows that when s ≥ 2 we can always take z ∈ S in such a way that the Abelian group z Γ s "looks at least two-dimensional" within a ball that contains Γ s . which implies the claim since m + 1 ≤ k s + 1 ≤ 2k s .
We are now ready to complete the proof of Proposition 3.13.
Proof of Proposition 3.13. We prove the proposition by induction on the step s of the nilpotent group, the Abelian s = 1 base case having already been handled by Theorem 3.7. Let s ≥ 2, k ∈ N, λ ≥ 1, and ε > 0. Let Γ be a finite s-step nilpotent group and let S be a generating set for Γ of size at most k satisfying We need to prove that there exists p 0 = p 0 (s, k, λ, ε) < 1 such that if p ≥ p 0 then Bernoulli-p bond percolation on Cay(Γ, S) satisfies P p (x ↔ y) ≥ 1 − ε for every x, y ∈ Γ. We may split into two cases according to whether or not |Γ s | ≥ log |Γ|, taking p 0 to be the maximum of the constants produced in the two cases.
In the first case, $|\Gamma_s| \geq \log |\Gamma|$ and the claim follows immediately from Proposition 3.17 with $H = \Gamma_s$. Indeed, the hypotheses of this proposition are satisfied by Lemma 3.15, which implies that $\Gamma_s$ is $r$-quasiconnected for some $r = r(s)$, and Lemma 3.16, which implies that $|\Gamma_s| \leq c|\Gamma|/\log |\Gamma|$ for some constant $c = c(s, k)$. Now suppose that $|\Gamma_s| < \log |\Gamma|$. In this case, we clearly have that Letting $G_1 = \mathrm{Cay}(\Gamma/\Gamma_s, S)$, it follows from the induction hypothesis that there exists $q_1 = q_1(s, k, \lambda, \varepsilon) < 1$ such that if $p \geq q_1$ then Let $z \in S$ and $r \geq 1$ be as in the statement of Lemma 3.18 and let $G_2 = \mathrm{Cay}(\langle z \rangle \Gamma_s, S_s \cup \{z\})$. Using the conclusions of Lemma 3.18 together with Theorem 3.7 yields that there exists $q_2 = q_2(s, k, \varepsilon)$ such that Letting $G = \mathrm{Cay}(\Gamma, S \cup S_s)$, we deduce from these estimates together with Proposition 2.8 that for every $x, y \in \Gamma$ belonging to a common coset of $\Gamma_s$.
Letting q 3 < 1 be defined by 1 − q 3 = (1 − q 1 ∨ q 2 ) 2 so that Bernoulli-q 3 percolation on G has the same law as the union of two independent copies of Bernoulli-q 1 ∨ q 2 percolation, we obtain that if p ≥ q 3 then for every x, y ∈ Γ. Since ε > 0 was arbitrary, we may conclude by applying Corollary 2.11.
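The choice of $q_3$ rests on the elementary fact that an edge is open in the union of two independent Bernoulli-$q$ configurations with probability $1 - (1 - q)^2$. The Python sketch below records this relation and its inverse. The function names and the optional `copies` parameter are assumptions added for illustration.

```python
def union_parameter(q: float, copies: int = 2) -> float:
    """Edge density of the union of `copies` independent Bernoulli-q
    percolation configurations: 1 - (1 - q)**copies."""
    return 1.0 - (1.0 - q) ** copies


def split_parameter(q_union: float, copies: int = 2) -> float:
    """Inverse map: the q for which the union of `copies` independent
    Bernoulli-q configurations is Bernoulli-q_union."""
    return 1.0 - (1.0 - q_union) ** (1.0 / copies)


if __name__ == "__main__":
    q1, q2 = 0.9, 0.95
    q3 = union_parameter(max(q1, q2))  # 1 - q3 = (1 - max(q1, q2))**2
    print(q3, split_parameter(q3))
```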

Subgroups of bounded index and virtually nilpotent groups
In this section we show how to reduce the study of percolation on a group to the study of percolation on a subgroup of bounded index. Our main objective in so doing is to prove the following extension of Proposition 3.13.
Theorem 3.19. Let $k, n, s \in \mathbb{N}$ and let $\lambda, \varepsilon > 0$. Then there exists $p_0 = p_0(k, n, s, \lambda, \varepsilon) < 1$ such that if $\Gamma$ is a finite group containing an $s$-step nilpotent subgroup $H$ of index $n$, and if $S$ is a generating set for $\Gamma$ of size at most $k$ satisfying \[\mathrm{diam}_S(\Gamma) \leq \frac{\lambda|\Gamma|}{(\log |\Gamma|)^s},\] then Bernoulli-$p$ bond percolation on $\mathrm{Cay}(\Gamma, S)$ satisfies $P_p(x \leftrightarrow y) \geq 1 - \varepsilon$ for every $p \geq p_0$ and $x, y \in \Gamma$.
Another objective is to prove the following quantitative version of [47,Corollary 7.19], which is an important ingredient in the proof of Theorem 1.7. Note here that the upper bound on p c does not depend on the size of the generating set.
Theorem 3.20. For each n ≥ 1 there exists ε = ε(n) > 0 such that if Γ is an infinite, finitely generated group that is not virtually cyclic and that contains a nilpotent subgroup of index at most n, and if S is a finite generating set of Γ, then p c (Cay(Γ, S)) ≤ 1 − ε.
In fact, proving these results is fairly straightforward given the results of the previous section and the following standard lemmas. Although this lemma is well known as folklore, we were not able to find a reference and have therefore included a proof for completeness. . , x j ] with each $x_i \in S$ generate $\Gamma_j$ modulo $\Gamma_{j+1}$. Thus each quotient $\Gamma_j/\Gamma_{j+1}$ with $j = 2, \ldots, s$ is an Abelian group generated by finitely many elements of finite order. Since $\Gamma_{s+1} = \{\mathrm{id}\}$, it follows that $\Gamma_2 = [\Gamma, \Gamma]$ is finite, and hence that $\Gamma$ is finite-by-cyclic. Finite-by-cyclic groups are well known to be virtually cyclic: indeed, if $\Gamma/H \cong \mathbb{Z}$ for some finite $H \lhd \Gamma$ and if $xH$ is a generator of $\Gamma/H$ then $H$ is a complete set of coset representatives for the cyclic subgroup $\langle x \rangle$ of $\Gamma$. This completes the proof.

Percolation from isoperimetry
The goal of this section is to prove Theorems 1.5, 1.8 and 1.10. The section is organised as follows: In Section 4.1 we explain how the methods of Duminil-Copin, Goswami, Raoufi, Severo, and Yadin [18] can be adapted to the setting of finite graphs, then apply the resulting theorems to prove Theorem 1.5 in Section 4.2. Finally, in Section 4.3 we prove Theorem 1.10 and then deduce Theorem 1.8 from Theorems 1.5 and 1.10.

Connecting large sets via the Gaussian free field
In this section we discuss those results that can be obtained by a direct application of the methods of [18] to finite graphs. Before stating these results, we first introduce some relevant definitions. Let $G = (V, E)$ be a finite, connected graph and let $B \subseteq V$ be a distinguished set of boundary vertices. We say that the pair $(G, B)$ satisfies a $d$-dimensional isoperimetric inequality with constant $c$, abbreviated $(\mathrm{ID}(B)_{d,c})$, if
\[|\partial_E A| \geq c |A|^{(d-1)/d} \qquad \text{for every non-empty set } A \subseteq V \setminus B.\]
Note that, in contrast to our earlier definition $(\mathrm{ID}_{d,c})$, we now require that every large subset of $V \setminus B$ has large boundary, not just those with at most half the total volume of $V$. In particular, $B$ must itself be large for $(\mathrm{ID}(B)_{d,c})$ to hold.

For each two non-empty, disjoint sets of vertices $A$ and $B$ in the finite connected graph $G$, we write $C_{\mathrm{eff}}(A \leftrightarrow B)$ for the effective conductance between $A$ and $B$, which is defined by
\[C_{\mathrm{eff}}(A \leftrightarrow B) = \sum_{v \in A} \deg(v)\, \mathbf{P}_v(\tau_B < \tau_A^+),\]
where $\mathbf{P}_v$ is the law of a simple random walk started at $v$ and $\tau_A$ and $\tau_A^+$ denote the first time and first positive time that the walk visits $A$ respectively. It is a theorem of Lyons, Morris, and Schramm [46] that if $G = (V, E)$ satisfies a $d$-dimensional isoperimetric inequality $(\mathrm{ID}_{d,c})$ then there exists a positive constant $c' = c'(d, c)$ such that
\[C_{\mathrm{eff}}(A \leftrightarrow B) \geq c' \min\{|A|, |B|\}^{(d-2)/d} \tag{4.1}\]
for every two disjoint non-empty sets $A, B \subseteq V$. See e.g. [47] for further background on effective conductances.
The following theorem follows by an essentially identical proof to that of [18,Theorem 1.2].
for every p 0 ≤ p ≤ 1 and every non-empty set A ⊆ V .
Rather than reproduce the entire proof of [18, Theorem 1.2], which would take rather a lot of space, we instead give a brief summary of the main ideas of that proof, with particular emphasis on the (very minor) changes needed to prove Theorem 4.1. The proof relies crucially on the relationship between Bernoulli bond percolation and the Gaussian free field. The basic idea is that, by using the Gaussian free field, we can construct a percolation in random environment model for which the existence of a phase transition is easier to prove than for standard Bernoulli percolation. The main technical step of the proof of [18] then shows that, under a suitable isoperimetric assumption, we can 'integrate out' the randomness of the environment and compare the new model to standard Bernoulli percolation of sufficiently high retention probability.
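We now recall the relevant definitions. Let $G = (V, E)$ be a finite, connected graph and let $B$ be a non-empty subset of $V$. Let $P_B : V^2 \to \mathbb{R}$ be the transition matrix of a random walk on $G$ that is killed when it first visits $B$, so that
\[P_B(u, v) = \frac{\text{number of edges between $u$ and $v$}}{\deg(u)}\]
for every $u, v \in V$. The Green function $G_B : V^2 \to \mathbb{R}$ is defined so that $\deg(v)\, G_B(u, v)$ is the expected number of times a random walk started at $u$ visits $v$ before first hitting $B$. (Note that the normalization by the degree is not always included in the definition of the Green function, but is convenient for our applications here as it makes the Green function symmetric.) The Gaussian free field (GFF) on $G$ with Dirichlet boundary conditions on $B$ is the mean-zero Gaussian random vector $\varphi = (\varphi_v)_{v \in V}$ with covariances given by the Green function. That is, $\varphi$ is a Gaussian random vector with
\[\mathbb{E}^{\mathrm{GFF}}_B[\varphi_v] = 0 \quad \text{and} \quad \mathbb{E}^{\mathrm{GFF}}_B[\varphi_u \varphi_v] = G_B(u, v) \qquad \text{for every } u, v \in V,\]
where we write $\mathbb{E}^{\mathrm{GFF}}_B$ for expectations taken with respect to the law of $\varphi$. Equivalently, the law $\mathbb{P}^{\mathrm{GFF}}_B$ of the GFF on $G$ with Dirichlet boundary conditions on $B$ can be defined as the measure on $\mathbb{R}^V$ that is supported on $\{\psi \in \mathbb{R}^V : \psi(b) = 0 \text{ for every } b \in B\} \cong \mathbb{R}^{V \setminus B}$ and has density with respect to Lebesgue measure $\mathrm{Leb}$ on $\mathbb{R}^{V \setminus B}$ given by
\[\frac{d\mathbb{P}^{\mathrm{GFF}}_B}{d\mathrm{Leb}}(\psi) = \frac{1}{Z(G, B)} \exp\Bigl[-\frac{1}{4} \sum_{e \in E^{\to}} \bigl(\psi(e^+) - \psi(e^-)\bigr)^2\Bigr], \tag{4.3}\]
where $E^{\to}$ denotes the set of oriented edges of the graph $G$, $e^-$ and $e^+$ denote the two endpoints of the oriented edge $e$, and $Z(G, B)$ is a normalizing constant.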
(Although our graphs are not oriented, we can still think of each edge as having exactly two possible orientations.) Let $\lambda > 0$. It follows from the representation (4.3) that if we condition on the absolute values $(|\varphi_v + \lambda|)_{v \in V}$, then the signs $(\mathrm{sgn}(\varphi_v + \lambda))_{v \in V}$ are distributed as an Ising model on $G$ with plus boundary conditions on $B$ and with coupling constants given by $J(e) = |\varphi_x + \lambda||\varphi_y + \lambda|$ for each edge $e$ with endpoints $x$ and $y$. (The discussion here serves only as background; we will not use the Ising model directly in this paper.) Using this fact with $\lambda = 1$, together with the relationship between the Ising model and Bernoulli percolation given by the Edwards-Sokal coupling, the authors of [18] derived an important estimate concerning connection probabilities in a certain percolation in random environment model derived from the GFF. Before stating this estimate, let us first give the relevant definitions. Given a vector of probabilities $\mathbf{p} = (p_e)_{e \in E}$, we write $\mathbb{P}_{\mathbf{p}}$ for the law of the inhomogeneous Bernoulli bond percolation process in which each edge $e$ is either deleted or retained independently at random with retention probability $p_e$. Finally, given a real number $x$ we write $x_+ = \max\{x, 0\}$. The following proposition follows by the same proof as [18, Proposition 2.1]. (Indeed, the proof of that proposition implicitly establishes the proposition as stated here and then applies a limiting argument to deduce a similar claim for connections to infinity in infinite graphs.) where $p(\varphi)_e := 1 - \exp\bigl[-2(\varphi_x + 1)_+ (\varphi_y + 1)_+\bigr]$ for each edge $e$ with endpoints $x$ and $y$.
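Proof of Proposition 4.3. For each $t \in \mathbb{R}^A$ let $X_t^A(\varphi) = \exp\bigl[-\sum_{x \in A} t_x(\varphi_x + 1)\bigr]$. Equation (2.5) of [18] states in our notation that for every $t \in \mathbb{R}^A$. We have by the definitions that $\sum_{x \in A} t_x(\varphi_x + 1)$ is a Gaussian with mean $\sum_{x \in A} t_x$ and variance $\sum_{x, y \in A} t_x t_y G_B(x, y)$, so that
\[\mathbb{E}^{\mathrm{GFF}}_B\bigl[X_t^A(\varphi)\bigr] = \exp\Bigl[-\sum_{x \in A} t_x + \frac{1}{2} \sum_{x, y \in A} t_x t_y G_B(x, y)\Bigr].\]
Taking $t_x = \deg(x)\, \mathbf{P}_x(\tau_B < \tau_A^+)$ for each $x \in A$ yields the claimed bound, since we have by a standard calculation that $\sum_{x \in A} t_x = C_{\mathrm{eff}}(A \leftrightarrow B)$ and
\[\sum_{y \in A} G_B(x, y)\, t_y = \sum_{y \in A} \mathbf{P}_x(\text{the walk visits $A$ for the last time at $y$ before hitting $B$}) = 1\]
for every $x \in A$.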
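The last-exit identity used at the end of this proof can be checked exactly on a small example. The sketch below is only an illustration under stated assumptions: it takes $G_B$ to be the inverse of the matrix $D - \mathrm{Adj}$ restricted to $V \setminus B$ (which is the degree-normalised Green function described above), it takes $t_x = \deg(x)\,\mathbf{P}_x(\tau_B < \tau_A^+)$, and it uses a hypothetical 3-by-3 grid with hand-picked sets `A` and `B`. None of the function names come from [18].

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def grid(n):
    """Adjacency lists of the n-by-n square grid (a simple graph)."""
    nbrs = {(i, j): [] for i in range(n) for j in range(n)}
    for (i, j) in list(nbrs):
        for w in ((i + 1, j), (i, j + 1)):
            if w in nbrs:
                nbrs[(i, j)].append(w)
                nbrs[w].append((i, j))
    return nbrs


def laplacian(nbrs, verts):
    """Matrix D - Adj restricted to the vertex list verts."""
    return [[(len(nbrs[u]) if u == v else 0) - nbrs[u].count(v) for v in verts]
            for u in verts]


def green(nbrs, B):
    """G_B(u, v) for u, v outside B: inverse of the restricted Laplacian."""
    free = [v for v in nbrs if v not in B]
    lap = laplacian(nbrs, free)
    G = {}
    for v in free:
        column = solve(lap, [int(u == v) for u in free])
        G.update({(u, v): val for u, val in zip(free, column)})
    return G


def escape_weights(nbrs, A, B):
    """t_x = deg(x) * P_x(hit B before returning to A) for each x in A."""
    inner = [v for v in nbrs if v not in A and v not in B]
    rhs = [sum(1 for y in nbrs[u] if y in B) for u in inner]
    # h(v) = P_v(hit B before A): harmonic on inner, 1 on B, 0 on A.
    h = dict(zip(inner, solve(laplacian(nbrs, inner), rhs)))
    h.update({b: Fraction(1) for b in B})
    h.update({a: Fraction(0) for a in A})
    return {x: sum(h[y] for y in nbrs[x]) for x in A}


nbrs = grid(3)
A, B = {(2, 2), (1, 2)}, {(0, 0)}      # disjoint, non-empty (illustrative choice)
G = green(nbrs, B)
t = escape_weights(nbrs, A, B)
c_eff = sum(t.values())                # effective conductance between A and B
last_exit = {x: sum(G[(x, y)] * t[y] for y in A) for x in A}
exponent = -c_eff + Fraction(1, 2) * sum(
    t[x] * t[y] * G[(x, y)] for x in A for y in A)
print(c_eff, last_exit, exponent)
```

With this choice of $t$ the exponent in the Gaussian moment generating function collapses to $-\frac12 C_{\mathrm{eff}}(A \leftrightarrow B)$, which is the form of the bound discussed in the surrounding text.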
Proposition 4.3 establishes the same estimate as Theorem 4.1 but applying to the percolation in random environment model associated to the GFF rather than to our original Bernoulli percolation model. An important and technical part of the argument of [18] shows that, in high-dimensional graphs, the two models can be compared in such a way that we can deduce Theorem 4.1 from Proposition 4.3.
for every p 0 ≤ p ≤ 1 and every non-empty set A ⊆ V .
Note that the proof does not establish a stochastic domination relation between the two models, and indeed such a relation does not hold.
Proof of Proposition 4.4. We first apply the classical relationship between isoperimetric inequalities and return probability bounds (see e.g. [42,Theorem 3.2.7] or [47, Corollary 6.32]) to deduce from (ID(B) d,c ) that there exists a constant C = C(d, c, k) such that for every u, v ∈ V . This bound is of the same form as the hypothesis (H d ) of [18], and given this bound the proof proceeds exactly as that of [18,Proposition 3.2]; one need only check that the value of p 0 given by that proof depends only on d, c, and k. Let us now give a brief indication of how this can be done. That proof gives a value for p 0 of the form 1 − (1 − q)e −h , where q ∈ [ 1 2 , 1) and h > 0 both depend on two other quantities α > 0 and n 0 ∈ N, and in principle on the graph G. The quantity q is computed explicitly from n 0 in the proof of Proposition 3.2, with no dependence on the graph. The quantity h is computed in Lemma 3.5, and depends only on n 0 and the maximum degree of the graph since all the constants arising in the quoted theorem of Liggett, Schonmann and Stacey [44] depend only on the maximum degree. The value of α and a preliminary value for n 0 are given by Lemma 3.6; they depend on several constants appearing in the conclusion of the proof of that lemma (pages 21 and 22 of the published version), each of which can easily be checked to depend only on the parameters d, c, and k (indeed, these constants are introduced only to simplify a completely explicit expression involving the maximum degree and sums of return probabilities). The integer n 0 is then possibly increased at the start of the proof of Proposition 3.2, but only to ensure that it is larger than some constant depending only on α.

From large sets to small sets
We now apply Theorem 4.1 to complete the proof of Theorem 1.5. We first establish the following crude relationship between the two kinds of isoperimetric inequality we consider.
Putting together Theorem 4.1 and Lemma 4.5 yields the following immediate corollary.
We now apply Corollary 4.6 to prove Theorem 1.5. To do this, we will apply Theorem 4.1 with $B$ taken to be a random set given by a so-called ghost field, which we now introduce. For each $p \in [0, 1]$ and $h > 0$ we write $\mathbb{P}_{p,h}$ for the joint distribution of the Bernoulli-$p$ bond percolation configuration $\omega$ and an independent ghost field $\mathcal{G}$ of intensity $h$, that is, a random subset of $V$ in which each vertex is included independently at random with probability $1 - e^{-h}$. Note that the ghost field $\mathcal{G}$ has the property that
\[\mathbb{P}_{p,h}(A \cap \mathcal{G} = \emptyset) = e^{-h|A|}\]
for every $A \subseteq V$. Moreover, it follows from a standard Chernoff bound calculation for sums of Bernoulli random variables [51, Theorem 4.5] that there exists a universal constant $a \in (0, 1)$ such that if $h \leq 1$ then $|\mathcal{G}|$ satisfies the lower tail estimate (Indeed, one may take $a = (e - 2)^2/(8(e - 1)^2)$.)

Proof of Theorem 1.5. Let $d > 6 + 2\sqrt{7}$ and write $\alpha = (d - 2)/d$. Let $a \in (0, 1)$ be the universal constant from (4.7). By the results of Lyons, Morris, and Schramm [46] stated in (4.1), there exists a positive constant $\eta = \eta(d, c, k) \leq a$ such that $C_{\mathrm{eff}}(A \leftrightarrow B) \geq \eta \min\{|A|, |B|\}^{\alpha}$ for every two disjoint non-empty sets $A, B \subseteq V$. The condition $d > 6 + 2\sqrt{7}$ allows us to take $\delta$ such that .
Fix one such choice of δ and let p 0 = p 0 (δ, η/16, d, c, k) < 1 be as in Corollary 4.6 so that for every two non-empty sets A, B ⊆ V with max{|A|, |B|} ≥ η 16 |V | 1−δ . Since small values of |V | can be handled by increasing p, we may assume throughout the rest of the proof that |V | ≥ (8/η) 1/(1−δ) so that η 8 |V | −δ ≥ |V | −1 . For each A ⊆ V , let K A = v∈A K v be the union of all clusters intersecting A. We first apply Corollary 4.6 with one of the sets equal to the ghost field G to show that K A is much larger than |A| with high probability when p ≥ p 0 and |A| is either large or small in a certain sense. Indeed, Corollary 4.6 implies immediately that if |V | −1 ≤ h ≤ 1 and A ⊆ V are such that either |A| ≥ η 16 |V | 1−δ or h ≥ η 8 |V | −δ then we have by (4.7) and the choice of η that On the other hand, we have by definition of the ghost field that for every h > 0 and A ⊆ V . It follows by (4.9), (4.10) and Markov's inequality that by assumption, we may apply the previous inequality with h = η 8 |V | −δ to obtain that It follows from (4.11) that for every A ⊆ V with |A| ≤ η 8 |V | 1−δ . Now for each i ≥ 1, let p i be defined by 1 − p i = (1 − p 0 ) i+1 , so that Bernoulli-p i percolation has the same distribution as the union of i + 1 independent copies of Bernoulli-p 0 percolation. This relationship immediately yields that the recursive inequality holds for every i ≥ 0, A ⊆ V and n, m ≥ 1, which combines with (4.11) and the fact that and we deduce by induction on i that so that ϕ is increasing and moreover that (4.14) It follows that there exists a constant i 0 , and hence that there exists a constant C 1 = 2i 0 such that for every non-empty set A ⊆ V with |A| ≤ η 8 |V | 1−δ , and hence for every non-empty set A ⊆ V since the inequality holds vacuously in the case |A| > η 8 |V | 1−δ . 
Considering again the coupling of p i 0 +1 percolation with p i 0 and p 0 percolation, we deduce that if A, B ⊆ V are non-empty then for some constant C 2 = C 2 (d, c, k) = (2C 1 + 1), where we used (4.8) to bound the third term in the second line.
To complete the proof, it remains to show that the constant prefactor $C_2$ in (4.15) can be made arbitrarily small by increasing $p$ in a uniform manner. This is particularly important in the case that $A$ and $B$ are singletons, in which case the right hand side of (4.15) might be larger than 1, rendering the inequality useless. Let $B(v, r)$ denote the graph-distance ball of radius $r$ around $v$ for every $v \in V$ and $r \geq 1$. Since $|B(v, r)| \geq r + 1$ for every $v \in V$ and $r \leq \mathrm{diam}(G)$, it follows from (4.15) that there exists a constant $r_0 = r_0(d, c, k)$ such that for every $u, v \in V$. Applying the Harris-FKG inequality, it follows that there exists a positive constant $c_1 = c_1(d, c, k)$ such that for every $v \in V$, where in the second line we used that $B(v, r)$ is incident to at most $k^{r+1}$ edges. It follows from (4.15) and (4.16) by calculus that there exists a positive constant $\eta' = \eta'(d, c, k)$ such that for every two non-empty sets $A, B \subseteq V$. The claim as stated in the theorem, in which we can increase $p$ in a uniform way to introduce an arbitrarily small constant prefactor, follows easily by calculus from this together with the fact [23, Theorem 2.38] that $\mathbb{P}_{p^\theta}(\mathscr{A}) \geq \mathbb{P}_p(\mathscr{A})^\theta$ for every $p, \theta \in (0, 1)$ and every increasing event $\mathscr{A}$.

Consequences for the uniqueness threshold
In this section we prove our results concerning the uniqueness threshold. More specifically, we first prove Theorem 1.10, then deduce Theorem 1.  Then P p (x ↔ ∞) is continuous on [p 0 , 1] for every x ∈ V . Moreover, if p 0 ≤ p 1 ≤ 1 is such that there is a unique infinite cluster P p 1 -almost surely, then there is a unique infinite cluster P p 2 -almost surely for every p 1 ≤ p 2 ≤ 1.
Remark 4.8. We believe that the bounded degree assumption is not really necessary for this result to hold, and hence should not be necessary for Theorem 1.10 to hold either.
The proof of Theorem 1.10 will also apply [23, Theorem 2.45], which states that if A ⊆ {0, 1} E is an increasing event and I r (A ) denotes the event that A holds in any configuration obtained from ω by deleting at most r edges then for every 0 ≤ p 1 < p 2 ≤ 1 and r ≥ 1. That is, if A holds with high probability at p 1 then it will hold and be stable to the perturbation of a large number of edges with high probability at p 2 > p 1 .
Proof of Theorem 1.10. It follows from the hypotheses that there exists a function f : N → (0, 1] with f (n) → 0 as n → ∞ such that for every two finite sets A, B ⊆ V . Letting (V n ) n≥1 be an exhaustion of V by finite connected graphs, we note that has dense complement in (p 0 , 1]. We will do this by proving that D is contained in the set of discontinuities of point-to-point connection probabilities and hence that D has at most countably many elements. Fix p 0 < p 1 < 1. Given finite sets A and B and r ≥ 1, let {A r B} be the event that there exists a collection of r edge-disjoint open paths each of which starts in A and ends in B. By Menger's theorem this is equivalent to the event that there does not exist any set of r − 1 open edges whose deletion disconnects A from B. Applying (4.17) yields that there exists a constant C 1 , depending on the choice of p 1 , such that for every A, B ⊆ V finite and r ≥ 1. It follows in particular that there exists a positive constant c, depending on the choice of p 1 , such that if we define g(n) = −c log f (n) for every n ≥ 1 then for every A, B ⊆ V finite and 1 ≤ r ≤ g(min{|A|, |B|}). Since f (n) → 0 as n → ∞, we have that g(n) → ∞ as n → ∞.
Let p ≥ p 1 and suppose that u and v belong to distinct infinite clusters with positive probability in Bernoulli-p percolation on G. Let ω 1 and ω 2 be independent copies of Bernoulli-p percolation on G, and let ω ∈ {0, 1} E be defined by where we say that an edge touches the cluster of a vertex if it has at least one endpoint in that cluster; note that these touching edges are exactly those edges that are revealed when exploring the cluster. It is easily seen that ω is itself distributed as Bernoulli-p bond percolation on G, and that the clusters of u and v are the same in ω 1 and ω. Condition on ω 1 and suppose that u and v belong to distinct infinite clusters of ω 1 . Let m ≥ 1 and let A and B be finite subsets of K u and K v respectively such that |A|, |B| ≥ m. Using that ω 2 is independent of ω 1 and applying (4.19), we have with probability at least 1 − f (m) that there exists a collection of at least g(m) disjoint open paths connecting A to B in ω 2 . Since ω and ω 2 coincide for those edges that do not touch the cluster of u or v in ω 1 we deduce that, with probability at least 1 − f (m), there exists a collection of g(m) edge-disjoint paths in G each of which starts in the cluster of u in ω, ends in the cluster of v in ω, and is ω-open other than at its first and last edge. Since m was arbitrary, g(m) → ∞ as m → ∞, and the probability of the aforementioned event tends to 1 as m → ∞, we deduce that, for each k ≥ 1, there almost surely exists a collection of k edge-disjoint paths in G each of which starts in the cluster of u in ω, ends in the cluster of v in ω, and is ω-open other than at its first and last edge. Considering the standard monotone coupling of percolation at p and p + ε, we deduce that for every p 1 ≤ p ≤ p + ε ≤ 1, u, v ∈ V , and k ≥ 1 and hence that and since increasing functions have at most countably many points of discontinuity we deduce that D ∩ [p 1 , 1] is at most countable also. 
Since $p_0 < p_1 < 1$ was arbitrary, we deduce that $D$ is at most countable and hence that $D$ has dense complement in $(p_0, 1]$ as required.

Proof of Theorem 1.8. It follows from Theorem 1.5 that there exist constants $\eta = \eta(d, c, k) > 0$ and $p_0 = p_0(d, c, k) < 1$ such that (4.20) for every $p \geq p_0$ and every two finite sets of vertices $A$ and $B$. (Indeed, simply take $n$ to be sufficiently large that $A, B \subseteq V_n$ and apply Theorem 1.5 to $G_n$.) The claim therefore follows from Theorem 1.10.
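The appeal to Menger's theorem in the proof of Theorem 1.10 (the equivalence between having $r$ edge-disjoint open paths from $A$ to $B$ and having no set of $r - 1$ open edges whose deletion disconnects $A$ from $B$) can be illustrated on a small deterministic example. The sketch below is not part of the argument: it treats a fixed edge list as the set of open edges, assumes $A$ and $B$ are disjoint, computes the maximum number of edge-disjoint paths by unit-capacity augmenting paths, and compares it with a brute-force minimum cut. The 3-by-3 grid and all function names are illustrative choices.

```python
from collections import deque
from itertools import combinations


def reachable(edges, A, B):
    """Return True if some vertex of B can be reached from A along edges."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, []).append(v)
        nbrs.setdefault(v, []).append(u)
    seen, queue = set(A), deque(A)
    while queue:
        u = queue.popleft()
        if u in B:
            return True
        for w in nbrs.get(u, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False


def max_edge_disjoint_paths(edges, A, B):
    """Maximum number of edge-disjoint paths from A to B (A, B disjoint)."""
    cap = {}
    for u, v in edges:                  # each undirected edge gives two unit arcs
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
    arcs_from = {}
    for (u, v) in cap:
        arcs_from.setdefault(u, []).append(v)
    count = 0
    while True:
        parent, queue, end = {a: None for a in A}, deque(A), None
        while queue and end is None:    # breadth-first search in the residual graph
            u = queue.popleft()
            for w in arcs_from.get(u, []):
                if w not in parent and cap[(u, w)] > 0:
                    parent[w] = u
                    if w in B:
                        end = w
                        break
                    queue.append(w)
        if end is None:
            return count
        while parent[end] is not None:  # push one unit of flow along the path found
            u = parent[end]
            cap[(u, end)] -= 1
            cap[(end, u)] += 1
            end = u
        count += 1


def min_edge_cut(edges, A, B):
    """Smallest number of edges whose deletion disconnects A from B (brute force)."""
    for size in range(len(edges) + 1):
        for removed in combinations(range(len(edges)), size):
            kept = [e for i, e in enumerate(edges) if i not in removed]
            if not reachable(kept, A, B):
                return size


# All 12 edges of the 3-by-3 grid, regarded as open.
edges = [((i, j), (i + di, j + dj)) for i in range(3) for j in range(3)
         for di, dj in ((1, 0), (0, 1)) if i + di < 3 and j + dj < 3]
corner_paths = max_edge_disjoint_paths(edges, {(0, 0)}, {(2, 2)})
side_A, side_B = {(0, 0), (0, 1), (0, 2)}, {(2, 0), (2, 1), (2, 2)}
side_paths = max_edge_disjoint_paths(edges, side_A, side_B)
print(corner_paths, side_paths)
```

In both test cases the two quantities agree, so the event that at least $r$ edge-disjoint paths exist coincides with the event that every disconnecting edge set has size at least $r$.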

The structure theory of vertex-transitive graphs
In this section we review the structure theory of vertex-transitive graphs that will be used in the proofs of our main theorems and prove some related supporting technical propositions. Let us begin with a brief historical overview of the relevant theory. The structure theory of vertex-transitive graphs that we use in this paper has its roots in celebrated results of Gromov and Trofimov from the 1980s. Gromov's theorem states that every group of polynomial growth is virtually nilpotent [28]. Trofimov's work shows that every transitive graph of polynomial growth is roughly isometric to a Cayley graph [71] (see also [65, Remark 2.2] for more details), the underlying group of which is then virtually nilpotent by Gromov's theorem. Combined with a formula of Bass [6] and Guivarc'h [29], these results immediately imply that if $G = (V, E)$ is an infinite transitive graph of polynomial growth then there exist constants $c, C > 0$ and an integer $d \geq 1$ such that
\[c n^d \leq |B(o, n)| \leq C n^d \tag{5.1}\]
for all $o \in V$ and $n \in \mathbb{N}$.
It is a classical result of Coulhon and Saloff-Coste [16] that the lower bound of (5.1) implies in particular that G satisfies (ID d,c ) (with a possibly smaller constant c). This extends without too much difficulty to transitive graphs (see e.g. [66,Proposition 4.1]). Moreover, in a transitive graph the upper bound of (5.1) easily implies that G does not satisfy (ID d,c ) with any larger value of d (see e.g. [66,Proposition 6.7]). An infinite transitive graph thus has a well-defined integer "dimension" d that manifests itself both as the graph's growth rate and as its isoperimetric dimension.
Of course, if $G$ is finite then the existence of constants $c, C$ such that (5.1) holds with $d = 0$ is completely trivial. For our analysis of percolation on finite graphs, therefore, we need something more finitary and quantitative. Moreover, even in an infinite graph the growth rate and isoperimetry can behave differently on different scales: for example, the graph of $\mathbb{Z}^{d_1} \times (\mathbb{Z}/m\mathbb{Z})^{d_2 - d_1}$ with $d_2 > d_1$ looks $d_2$-dimensional on scales up to $m$, and thereafter looks $d_1$-dimensional. Indeed, as was first noted by Tao [62], the growth degree of a transitive graph of polynomial growth can increase and decrease several times as the scale increases, before finally settling down to the rate detected by the bounds (5.1). See [62, Example 1.11] for a particularly illuminating example of a Cayley graph in which the growth rate is faster at large scales than at small scales.
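The change of dimension with scale can be seen concretely in the simplest case $d_1 = 1$, $d_2 = 2$. The following sketch computes ball sizes by breadth-first search in the Cayley graph of $\mathbb{Z} \times (\mathbb{Z}/m\mathbb{Z})$ with respect to the standard generators; the choice $m = 21$, the cut-off radius and the function name are illustrative assumptions.

```python
from collections import deque


def ball_sizes(m, radius):
    """Sizes |B(o, r)| for r = 0..radius in the Cayley graph of Z x (Z/mZ)
    with respect to the standard generators (+-1, 0) and (0, +-1)."""
    dist = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        x, y = queue.popleft()
        if dist[(x, y)] == radius:      # do not explore beyond the cut-off
            continue
        for w in ((x + 1, y), (x - 1, y), (x, (y + 1) % m), (x, (y - 1) % m)):
            if w not in dist:
                dist[w] = dist[(x, y)] + 1
                queue.append(w)
    sizes = [0] * (radius + 1)
    for d in dist.values():
        sizes[d] += 1
    for r in range(1, radius + 1):      # cumulative sums turn spheres into balls
        sizes[r] += sizes[r - 1]
    return sizes


m = 21
sizes = ball_sizes(m, 60)
small_scale = [sizes[r] for r in range(0, 6)]           # quadratic growth
large_scale_steps = {sizes[r + 1] - sizes[r] for r in range(m // 2, 59)}
print(small_scale, large_scale_steps)
```

For radii $r$ with $2r + 1 \leq m$ the ball has exactly $2r^2 + 2r + 1$ vertices, as in $\mathbb{Z}^2$, whereas for $r \geq \lfloor m/2 \rfloor$ each unit increase of the radius adds exactly $2m$ vertices, which is linear growth.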
The key result allowing us to understand the different "local" dimensions of transitive graphs at different scales is the celebrated theorem of Breuillard, Green and Tao [12] describing the structure of finite approximate groups. Roughly speaking, an approximate group is a subset of a group that is "approximately closed" under the group operation. Such sets arose implicitly in the original proof of Gromov's theorem, and "approximate closure" can be seen as a natural finitary analogue of polynomial growth (see e.g. [70,Proposition 11.3.1]). Breuillard, Green and Tao essentially show that every finite approximate group has a large finite-by-nilpotent piece; when applied in the context of polynomial growth, this implies in particular a quantitative, finitary version of Gromov's theorem [12,Corollary 11.7] (see also [35,60] for earlier results in this direction). Tessera and the second author also recently used approximate groups to give a quantitative, finitary version of Trofimov's result [65], which complements Breuillard, Green and Tao's finitary Gromov-type theorem in the same way that Trofimov's original work complements Gromov's theorem. For more general background on approximate groups see [69,70]; for some other examples of applications of approximate groups, see [22] and [12, §11].
After some fairly delicate additional work, these results lead to a number of refinements of the bounds (5.1) and their isoperimetric consequences. For a complete picture of the state of the art, as well as a detailed bibliography, see [66,67]. Of particular relevance to the present work is a result of Tessera and the second author stating that if $|B(o, n)| \geq n^d$ for some vertex $o$ of a transitive graph $G = (V, E)$ and some $n \in \mathbb{N}$ then all subsets of $V$ of size at most $\frac{1}{2}|B(o, n)|$ satisfy the $d$-dimensional isoperimetric inequality $(\mathrm{ID}_{d,c})$ for some $c = c(d)$ (see Theorem 5.3 below). This confirmed a conjecture of Benjamini and Kozma [8]. This isoperimetric inequality was in fact motivated by another application to probability: Tessera and the second author use it to give a quantitative, finitary refinement of Varopoulos's famous result that the simple random walk on a vertex-transitive graph is transient if and only if the graph has superquadratic growth [66], verifying and extending another conjecture of Benjamini and Kozma. In particular, this leads to a gap for the escape probability of a random walk on a vertex-transitive graph [66, Corollary 1.7], very similar in spirit to the gap for the critical percolation probability we obtain here in Theorem 1.7. For some other applications of this structure theory to probability see [

We now present some specific results for use in the present work. The first follows directly from Tessera and the second author's finitary version of Trofimov's theorem. The next result shows that a bound of the form $|B(o, n)| \leq c n^d$ is enough to ensure that the "local" dimension of a transitive graph can never go above $d$ at higher scales. This verified a conjecture of Benjamini.
Note that the case in which m = diam(G) includes the case in which G is infinite and m = ∞.
Proof. It is stated in [66,Theorem 1.20] that for each positive integer k there exists a positive constant c(k) ≤ 1 such that if n ≥ 1 is such that |B(o, n)| ≥ n d for some d ≥ 1 then The case of Theorem 5.3 in which |B(o, n)| ≥ n d , which includes the case that n = 1, follows immediately. Now suppose that n ≥ 2 and that |B(o, n)| < n d .
for every A ⊆ V with |A| ≤ 1 2 |B(o, n)|: This is trivial when d−δ < 1, and follows from [66, Theorem 1.20] otherwise. A little algebra gives that and we easily obtain that Again, speaking very roughly, Theorems 5.1 and 5.3 tell us that for every locally finite transitive graph G = (V, E), there is a scale m such that G "looks high-dimensional" on scales smaller than m and "looks low-step nilpotent" on scales larger than m.

Transitive graphs as quotients of Cayley graphs
In this section we describe a construction due to Abels [1] expressing any vertex-transitive graph as a quotient of a Cayley graph of its isometry group. We start in a fairly abstract setting. Suppose $\Gamma$ is a group with a symmetric generating set $S$, and suppose that $H < \Gamma$ is a subgroup with respect to which $S$ is bi-invariant in the sense that $HSH = S$. This implies that given $x, y \in \Gamma$ and $h \in H$ we have $x \sim y$ in $\mathrm{Cay}(\Gamma, S)$ if and only if $xh \sim yh$ in $\mathrm{Cay}(\Gamma, S)$, where we write $u \sim v$ to mean that two vertices $u, v$ of a given graph are neighbours. We may therefore define an injective homomorphism $\rho : H \to \mathrm{Aut}(\mathrm{Cay}(\Gamma, S))$ by setting $\rho(h)(x) = xh^{-1}$ for $x \in \Gamma$. We then denote by $A(\Gamma, S, H)$ the quotient graph $\mathrm{Cay}(\Gamma, S)/\rho(H)$. Note that the vertex set of $A(\Gamma, S, H)$ is the set $\Gamma/H$ of left cosets of $H$ in $\Gamma$, and the action of $\Gamma$ on $\mathrm{Cay}(\Gamma, S)$ induces a transitive action of $\Gamma$ on $A(\Gamma, S, H)$ given by $g(xH) = (gx)H$, so that $A(\Gamma, S, H)$ is a transitive graph.
It turns out that every transitive graph whose automorphism group is discrete (and hence every finite transitive graph) can be realised in this way, as follows. Note that if $\Gamma$ is a closed subgroup of $\mathrm{Aut}(G)$ then the stabilizer $\Gamma_o$ is compact, so that $\Gamma$ is discrete if and only if $\Gamma_o$ is finite. , which in turn means that there exists $s \in S$ such that $\gamma h s = \gamma'$. The bi-invariance of $S$ then implies that $hs \in S$, and hence that $\gamma \sim \gamma'$ in $\mathrm{Cay}(\Gamma, S)$, as claimed. Since $\Gamma$ acts transitively on $G$, the map is a well-defined bijection by the orbit-stabiliser theorem. This map $\phi$ defines a graph isomorphism $G \to A(\Gamma, S, \Gamma_o)$, since for every $\gamma, \gamma' \in \Gamma$ we have as claimed.
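The construction can be carried out by brute force on a small finite example. The sketch below takes $G$ to be the triangular prism, $\Gamma = \mathrm{Aut}(G)$, $H = \Gamma_o$ and, as an assumption made here to avoid loops in the quotient, $S = \{\gamma \in \Gamma : \gamma(o) \sim o\}$. It then checks that $S$ is symmetric and bi-invariant and that the map $\gamma H \mapsto \gamma(o)$ carries the edges of $A(\Gamma, S, H)$ onto the edges of $G$. The choice of graph and all identifiers are illustrative.

```python
from itertools import permutations

# Triangular prism: triangles {0,1,2} and {3,4,5} joined by a perfect matching.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (0, 3), (1, 4), (2, 5)]
EDGE_SET = {frozenset(e) for e in EDGES}
N, ORIGIN = 6, 0


def compose(g, h):
    """The permutation g o h (apply h first, then g)."""
    return tuple(g[h[i]] for i in range(N))


def inverse(g):
    """The inverse permutation of g."""
    inv = [0] * N
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)


# Gamma = Aut(G), found by testing every permutation of the vertices.
GAMMA = [g for g in permutations(range(N))
         if all(frozenset((g[u], g[v])) in EDGE_SET for u, v in EDGES)]
# H = stabiliser of the origin; S = elements moving the origin to a neighbour.
H = [g for g in GAMMA if g[ORIGIN] == ORIGIN]
S = {g for g in GAMMA if frozenset((g[ORIGIN], ORIGIN)) in EDGE_SET}

# Bi-invariance HSH = S and symmetry S = S^{-1}.
HSH = {compose(compose(h1, s), h2) for h1 in H for s in S for h2 in H}
S_INV = {inverse(s) for s in S}

# Vertices of A(Gamma, S, H) are left cosets xH; xH ~ yH iff x^{-1} y lies in S.
coset_of = {x: frozenset(compose(x, h) for h in H) for x in GAMMA}
vertex_of = {coset_of[x]: x[ORIGIN] for x in GAMMA}     # the map xH -> x(o)
quotient_edges = {frozenset((coset_of[x], coset_of[y]))
                  for x in GAMMA for y in GAMMA
                  if compose(inverse(x), y) in S}
image_edges = {frozenset(vertex_of[c] for c in e) for e in quotient_edges}
print(len(GAMMA), len(H), len(S), image_edges == EDGE_SET)
```

Adjacency of cosets is well defined precisely because $HSH = S$: replacing $x$ by $xh$ and $y$ by $yh'$ replaces $x^{-1}y$ by $h^{-1}x^{-1}yh'$, which stays in $S$.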

Isoperimetry in induced subgraphs
We will see in Section 6 that it is a straightforward matter to deduce Theorem 1.7, which concerns the value of $p_c$ in infinite transitive graphs, from the structure-theoretic results Theorems 5.1 and 5.3 together with our analyses of virtually nilpotent groups from Section 3 and graphs of large isoperimetric dimension from Section 4. Our main results regarding finite transitive graphs require a rather more delicate approach owing to the possibility that the auxiliary graph $G/H$ is of 'intermediate size', i.e., that $1 \ll |V/H| \ll |V|$, in which case we cannot rely on the results of either Section 3 or Section 4 alone to establish the existence of a giant component. The purpose of this section is to prove the following proposition, which is a variation on Theorem 5.3 and will be used to apply the results of Section 4 in the case that $G/H$ is of intermediate size.
Proposition 5.5. For all integers $d \geq 1$ and $k \geq 1$, every $\varepsilon > 0$, and every $\rho \in (0, 1)$ there exist positive constants $\ell = \ell(d, k, \varepsilon, \rho)$, $\varepsilon_0 = \varepsilon_0(d)$, and $c = c(d, k, \varepsilon, \rho)$ such that the following holds. Let

Note that a nontrivial argument is still required to deduce our main theorems from this together with our analyses of the nilpotent and high-dimensional cases. This argument is carried out in Section 6.
We now begin to work towards the proof of Proposition 5.5. We begin by adapting the arguments of Coulhon-Saloff-Coste and Tessera and the second author to prove the following result, which essentially says that if a finite set A of vertices in a vertex-transitive graph is sparse in every ball of radius r then its external vertex boundary ∂ + V A has size at least a constant times |A|/r. Proposition 5.6 (Locally sparse sets have large boundary). Let G = (V, E) be a locally finite, vertex-transitive graph, and let A be a finite set of vertices of G. If ρ ∈ (0, 1) and r ≥ 1 are such that |A ∩ B(x, r)| ≤ ρ|B(x, r)| for every x ∈ V then Proposition 5.6 can be seen as a generalisation of [66,Proposition 4.1], and is implicitly contained in the proof of that result. We provide the details here for the convenience of the reader.
We now briefly recall some relevant definitions that will be used in the proof of Proposition 5.6. Given a locally finite, vertex-transitive graph G, the group Aut (G) of automorphisms of G is a locally compact group with respect to the topology of pointwise convergence, and every closed subgroup of Aut (G) is also a locally compact group in which vertex stabilisers are compact and open. Moreover, an arbitrary closed subgroup Γ < Aut (G) admits a (left) Haar measure µ, the properties of which include that See [33, §15] for a detailed introduction to Haar measures. Given a locally compact group Γ with left Haar measure µ, we define the space L 1 (Γ) with respect to µ, so that Γ acts on L 1 (Γ) via γf (x) = f (xγ). Note that since a right translate of a Haar measure is again a Haar measure, by property (4) Proof. We follow the proof of [66,Proposition 4.4]. If there exists s ∈ S such ∆ Γ (s) ≥ 1 + log 2 which certainly gives the required bound. We may therefore assume that ∆ Γ (s) ≤ 1 + log 2 r for every s ∈ S. This implies that for every γ ∈ S r . Define a linear operator M : The hypothesis on A implies that its indicator function 1 A satisfies M (1 A )(x) ≤ ρ for every x ∈ Γ, and hence that On the other hand, given γ ∈ S r written as γ = s 1 · · · s r with each s i ∈ S, the triangle inequality implies that By (5.3), this implies that 1 A − γ1 A 1 ≤ 2r sup s∈S 1 A − s1 A 1 = 2r sup s∈S µ(A As), and averaging this bound over S r then gives that To conclude, simply note that for all s ∈ S we have that µ(A As) = µ(A \ As) + µ(As \ A) so that the desired bound follows from (5.4) and (5.5).
for every $0 \leq r \leq 2\ell n$. Averaging over $n < r \leq 2\ell n$, we deduce from Theorem 5.2 that there exist constants $C_1$ and $C_2$ depending only on $d$ such that The only important feature of this bound is that the right hand side tends to zero as $\ell \to \infty$, at a rate depending only on $d$ and $k$. Indeed, letting $c_1 = c_1(d)$ be the constant from Corollary 5.9, if $I \subseteq B(o, 2\ell n)$ is a set attaining the minimum in the definition of $\eta(\ell)$ then there exists $x \in V$ such that $|I \cap B(x, n)| \geq \bigl(1 - \frac{\eta(\ell)}{c_1 \varepsilon^{1/d}}\bigr)|B(x, n)|$.
It follows that there exists a constant $\ell = \ell(d, k, \rho, \varepsilon)$ such that if $I \subseteq B(o, 2\ell n)$ is a set attaining the minimum in the definition of $\eta(\ell)$ then there exists $x \in V$ such that $|I \cap B(x, n)| \geq \rho|B(x, n)|$.
Fix one such set $I$ and $x \in V$. Since $I \subseteq B(x, 5\ell n)$, it suffices to prove that the subgraph of $G$ induced by $I$ satisfies a $d$-dimensional isoperimetric inequality with constants depending only on $d$, $k$, $\varepsilon$ and $\rho$. Since $|B(o, 2\ell n)| \geq |B(o, n)| \geq \varepsilon n^d$, we can apply Theorem 5.3 to deduce that there exists a positive constant $c_2 = c_2(d, k, \rho, \varepsilon)$ such that $|\partial_E A| \geq c_2 |A|^{(d-1)/d}$ for every subset $A$ with $|A| \leq |I|/2 \leq |B(o, 2\ell n)|/2$. We are not done at this point of course, since what we really need is a lower bound on the size of the boundary of $A$ considered as a subset of the subgraph of $G$ induced by $I$. Write $\partial_I A$ for this boundary. Fix $A \subseteq I$ with $|A| \leq |I|/2$, write $A' = I \setminus A$ and, following Le Coz and Gournay, note that
\[2|\partial_I A| = |\partial_E A| + |\partial_E A'| - |\partial_E I|. \tag{5.8}\]
We have by minimality of $I$ that Thus, writing $\alpha = |A|/|I|$ and $\gamma = (d - 1)/d$, it follows that and hence by (5.8) that

with $v \in V$ has size at most $M$. Proposition 5.4 implies that $S = \{\gamma \in \Gamma : d(\gamma(Ho), Ho) \leq 1\}$ is a finite symmetric generating set for $\Gamma$, and since the growth of $G$ is superlinear, the same proposition also implies that the growth of $\mathrm{Cay}(\Gamma, S)$ is superlinear. As is well known (see e.g. [70, Lemma 11.1.2 and Proposition 11.1.3]), this means that $\Gamma$ is not virtually cyclic. Theorem 3.20 therefore implies that there is an absolute constant $\varepsilon_0 > 0$ such that $p_c(\mathrm{Cay}(\Gamma, S)) \leq 1 - \varepsilon_0$. Proposition 5.4 implies that $G/H$ is isomorphic to a quotient of $\mathrm{Cay}(\Gamma, S)$ by a subgroup of $\mathrm{Aut}(\mathrm{Cay}(\Gamma, S))$ of order at most $M$, so Lemma 2.9 implies that $p_c(G/H) \leq 1 - \varepsilon$ for some absolute constant $\varepsilon > 0$. The theorem then follows from Proposition 2.8.
The remainder of this section is dedicated to the proof of Theorem 1.2. We begin with some simple and standard geometric lemmas.
Lemma 6.1 (cf. Ruzsa's covering lemma [57]). Let $A$ be a subset of a graph $G$, and let $m \in \mathbb{N}$. Let $X$ be a maximal subset of $A$ such that the balls $B(x,m)$ with $x \in X$ are pairwise disjoint. Then $A \subseteq \bigcup_{x \in X} B(x, 2m)$.
Proof. The maximality of $X$ implies that for every $a \in A$ there exists $x \in X$ such that $B(x,m) \cap B(a,m) \neq \emptyset$, and hence $a \in B(x, 2m)$ by the triangle inequality.
Lemma 6.2. Let $G$ be a graph of diameter at least $n$ and let $v$ be a vertex of $G$. Then $B(v,n)$ contains at least $(n-2m)/(4m+2)$ disjoint balls of radius $m$ for each $m \leq n/2$. As such, if $G$ is transitive and $1 \leq m_1 \leq m_2 \leq \mathrm{diam}(G)$ then
Proof. The second claim follows easily from the first by a small calculation. We now prove the first, following [66, Lemma 5.3]. Since $\mathrm{diam}(G) \geq n$, there exists a geodesic of length $k = n/2$ starting at $v$. Let $x_0 = v, x_1, x_2, \ldots, x_k$ be the vertices of this geodesic, written in increasing order of distance from $v$. The balls $B(x_{(2m+1)i}, m)$ with $0 \leq i \leq (k-m)/(2m+1)$ are then disjoint subsets of $B(v,n)$. This is easily seen to imply the claim.
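The covering statement of Lemma 6.1 can be illustrated computationally. The sketch below builds a maximal set $X$ greedily and then checks that the balls of radius $2m$ around $X$ cover $A$. The $12 \times 12$ torus, the particular set `A`, the value `m = 2` and the function names are assumptions chosen only for illustration; the greedy procedure is one way to obtain a maximal packing, not a construction taken from the text.

```python
from collections import deque

def ball(adj, center, radius):
    """Vertices within graph distance `radius` of `center`, by breadth-first search."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not explore beyond the requested radius
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist)

def maximal_packing(adj, A, m):
    """Greedily choose X inside A with the balls B(x, m), x in X, pairwise disjoint.

    Every rejected point had a ball meeting an already chosen ball,
    so no point of A can be added afterwards: X is maximal.
    """
    X, occupied = [], set()
    for a in A:
        b = ball(adj, a, m)
        if occupied.isdisjoint(b):
            X.append(a)
            occupied |= b
    return X

def torus_adj(n):
    """Adjacency lists of the n x n discrete torus (an assumed test graph)."""
    return {(x, y): [((x + 1) % n, y), ((x - 1) % n, y),
                     (x, (y + 1) % n), (x, (y - 1) % n)]
            for x in range(n) for y in range(n)}

adj = torus_adj(12)
A = [v for v in adj if (v[0] * v[1]) % 3 != 1]  # an arbitrary subset of vertices
m = 2
X = maximal_packing(adj, A, m)
cover = set().union(*(ball(adj, x, 2 * m) for x in X))
covered_all = set(A) <= cover
print(len(X), covered_all)
```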
If $m_3 = \mathrm{diam}(G)$ then the theorem follows from Theorem 5.3 and Theorem 1.5. If $m_1 \leq 10^{156} \vee M^2$ then, letting $R = 10^{78} \vee M$, we may apply Theorem 5.1 with $d = 14$ to obtain groups $H \trianglelefteq \mathrm{Aut}(G)$ and $\Gamma < \mathrm{Aut}(G/H)$ such that $\Gamma$ acts transitively on $V/H$ and has a nilpotent subgroup of step and index at most $M$, such that the stabiliser in $\Gamma$ of each orbit $Hv$ with $v \in V$ has size at most $M$, and such that each such orbit has diameter at most $R$, and hence size at most $R^{14}$. Proposition 5.4 then implies that $S = \{\gamma \in \Gamma : d(\gamma(Ho), Ho) \leq 1\}$ is a symmetric generating set for $\Gamma$ of size at most $(k+1)M$, that $\mathrm{diam}_S(\Gamma) = \mathrm{diam}(G/H)$, and that $|\Gamma| = |\Gamma_{Ho}||V/H| \preceq |V|$, and hence that It then follows from Theorem 3.19, Proposition 5.4, Lemma 2.9 and Proposition 2.8 that for every $\varepsilon > 0$ there exists $q_1 = q_1(k, \lambda, \varepsilon) < 1$ such that \[ \mathbb{P}^G_p(u \leftrightarrow Hv) \geq \sqrt{1-\varepsilon} \] for every $u, v \in V$ and $p \geq q_1$. Since the orbits of $H$ have diameter at most $R$, there also exists $q_2 = q_2(\varepsilon) < 1$ such that \[ \mathbb{P}^G_p(u \leftrightarrow v) \geq \sqrt{1-\varepsilon} \] for $p \geq q_2$ and every $u, v$ belonging to the same orbit of $H$. Letting $q_3$ be defined by $1 - q_3 = (1-q_1)(1-q_2)$, we have as usual that Bernoulli-$q_3$ percolation is distributed as the union of independent Bernoulli-$q_1$ and Bernoulli-$q_2$ percolation configurations, so that \[ \mathbb{P}^G_{q_3}(u \leftrightarrow v) \geq \mathbb{P}^G_{q_1}(u \leftrightarrow Hv) \cdot \min\left\{ \mathbb{P}^G_{q_2}(w \leftrightarrow v) : w \in Hv \right\} \geq 1 - \varepsilon \] for every $u, v \in V$, and the theorem is proved in this case.
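The sprinkling step above rests on the fact that an edge is open in the union of two independent configurations with probability $1 - (1-q_1)(1-q_2)$. Because edges are independent in each configuration, this identifies the union as Bernoulli-$q_3$ percolation. The following sketch verifies the single-edge computation exactly with rational arithmetic. The values $q_1 = 9/10$ and $q_2 = 4/5$ and the function names are illustrative assumptions.

```python
from fractions import Fraction

def union_parameter(q1, q2):
    """The parameter q3 defined by 1 - q3 = (1 - q1)(1 - q2)."""
    return 1 - (1 - q1) * (1 - q2)

def open_probability_in_union(q1, q2):
    """Exact probability that an edge is open in at least one of two
    independent configurations with edge densities q1 and q2."""
    total = Fraction(0)
    for open1 in (True, False):
        for open2 in (True, False):
            weight = (q1 if open1 else 1 - q1) * (q2 if open2 else 1 - q2)
            if open1 or open2:
                total += weight
    return total

q1, q2 = Fraction(9, 10), Fraction(4, 5)
q3 = union_parameter(q1, q2)
print(q3, open_probability_in_union(q1, q2))
```

The two printed values agree, which is the single-edge content of the statement that the union of the two configurations has the law of Bernoulli-$q_3$ percolation.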
From now on we assume that $10^{156} \leq m_1 \leq m_2 \leq m_3 < \mathrm{diam}(G)$, which covers all outstanding cases of the theorem. Note in this case that the three scales $m_1$, $m_2$, and $m_3$ satisfy the hypothesis of Proposition 5.5 and are well separated from each other. Indeed, we have that and hence that \[ m_2 \geq 10^{12} \cdot m_1 \geq 10^{168} \vee M^2 \quad \text{and} \quad m_3 \geq 10^{14} \cdot m_2 \geq 10^{182}. \tag{6.2} \] Let $H \trianglelefteq \mathrm{Aut}(G)$ and $\Gamma < \mathrm{Aut}(G/H)$ be the groups given by applying Theorem 5.1 with $d = 13$. Thus $\Gamma$ acts transitively on $V/H$ and has a nilpotent subgroup of step and index at most $M$, the stabiliser in $\Gamma$ of each orbit $Hv$ with $v \in V$ has size at most $M$, and each such orbit has diameter at most m
It follows in particular that there exists $y \in B(o, m_2)$ such that $|I_{2,o} \cap Hy| \geq \tfrac{3}{4}|Hy|$. Since $\Gamma$ acts transitively on $V/H$, we may apply an automorphism $\gamma$ of $G$ mapping some element of $I_{2,o} \cap Hy$ to $o$ to obtain a set $\gamma I_{2,o}$ such that $o \in \gamma I_{2,o}$ and $|(\gamma I_{2,o}) \cap Ho| \geq \tfrac{3}{4}|Ho|$. Applying Theorem 1.5 and Lemma 2.1 to the subgraph of $G$ induced by $\gamma I_{2,o}$ yields that there exists $q_2 = q_2(k, \varepsilon)$ such that for every $v \in (\gamma I_{2,o}) \cap Ho$ and $p \geq q_2$. Since $|(\gamma I_{2,o}) \cap Ho| \geq \tfrac{3}{4}|Ho|$, we may apply Lemma 2.7 to conclude that for every $v \in Ho$ and $p \geq q_2$. This immediately implies the claim (6.3).
Large $H$-orbits. We now consider the second case, in which $|Hv| \geq (\log |V|)^{400}$ for every $v \in V$. We will continue to use the sets $(I_{2,v})_{v \in V}$ as constructed in the case of small $H$-orbits. Note that $B(o, m_2)$ contains the orbit $Ho$, and since $|B(o, m_2)| \leq (m_2 + 1)^{13}$, this implies that $m_2^{13} \succeq (\log |V|)^{400}$, and hence that \[ m_2 \succeq (\log |V|)^{400/13}. \tag{6.4} \] Note in particular that this implies we may assume $m_2$ to be larger than any given constant depending on $k$, $\lambda$, $\varepsilon$, the theorem being trivial for graphs of bounded volume. Let the constant $\ell_2 = \ell_2(k)$ be as in the construction of the sets $I_{2,v}$ above, so that for each $v \in V$ there exists a subset $I_{2,v} \subseteq B(v, \ell_2 m_2)$ such that the subgraph of $G$ induced by $I_{2,v}$ satisfies a 13-dimensional isoperimetric inequality $(\mathrm{ID}_{13,c_2})$ for some $c_2 = c_2(k) > 0$ and $|I_{2,v} \cap B(v, m_2)| \geq \tfrac{3}{4}|B(v, m_2)|$.