Percolation in an ultrametric space

We study percolation on the hierarchical lattice of order $N$ where the probability of connection between two points separated by distance $k$ is of the form $c_k/N^{k(1+\delta)},\; \delta>-1$. Since the distance is an ultrametric, there are significant differences with percolation on the Euclidean lattice. There are two non-critical regimes: $\delta<1$, where percolation occurs, and $\delta>1$, where it does not occur. In the critical case, $\delta =1$, we use an approach in the spirit of the renormalization group method of statistical physics, in which connectivity results for Erd\H{o}s-R\'enyi random graphs play a key role. We find sufficient conditions on $c_k$ such that percolation occurs, or that it does not occur. An intermediate situation called pre-percolation is also considered. When percolation occurs, we prove uniqueness of the constructed percolation clusters. In a previous paper \cite{DG1} we studied percolation in the $N\to\infty$ limit (mean field percolation), which provided a simplification that allowed us to find a necessary and sufficient condition for percolation. For fixed $N$ there are open questions, in particular regarding the existence of a critical value of a parameter in the definition of $c_k$ and, if it exists, the behaviour at the critical point.


INTRODUCTION
Percolation theory on a lattice (e.g., the Euclidean lattice $\mathbb{Z}^d$) began with the work of Broadbent and Hammersley in 1957. The principal features of the model are that the space of sites is infinite and its geometry plays an essential role. The main problem is to determine if there is an infinite connected component, in which case it is said that percolation occurs. In the first models the connections (bonds) were only between nearest neighbors. (See [22,12] for background, and for a physics point of view, [34].) The question of percolation on non-Euclidean graphs, including nonamenable Cayley graphs given by finitely generated groups, was formulated in [6]. The study of long range percolation began in the mathematical physics literature (e.g., [1,7,27,32]). In this case connections are allowed between points at any distance from each other, with probability depending on the distance. The main problem is the same, and the geometry remains crucial. (See also [6,8,9,15,35].) The theory of random graphs started with the work of Erdős and Rényi in 1959. The model consists of a finite number $n$ of vertices with connection probability $p_n$ between pairs of vertices, depending on $n$ in some way, and there is no structure on the set of vertices. The results refer to what happens as $n \to \infty$, for example with the largest connected component. (See [10,23,19] for background.) Results from the theory of random graphs have been useful as technical tools in studies on percolation (e.g. [4,5]).
By introducing a structure on the set of vertices of a random graph or some special form of connection probabilities by means of a kernel which induces a sort of geometry, it is possible to generate a large class of interesting models which share properties of both percolation and/or (classical) random graphs, and also "small world" random graphs (see e.g. the models in [2,11,36]).
On the other hand, hierarchical structures arise in the physical, biological and social sciences due to the multiscale organization of many natural objects (see e.g. [3,29]). In particular the hierarchical Ising model which was introduced by Dyson [20] has played an important role in statistical physics (see [13,14,33]) and in population genetics (see e.g. [30]). Important applications of hierarchical structures have also been made by Kleinberg [24,25] in the area of search algorithms in computer science. A basic model is the hierarchical group Ω N of order N (defined in Section 2), which can be represented as the set of leaves at the top of an infinite regular tree, where the distance between two points is the number of levels from the top to their most recent common node. Such a distance satisfies the strong triangle inequality d(x, y) ≤ max{d(x, z), d(z, y)} for any x, y, z, which is the characteristic property of an ultrametric. (See e.g. [31] for background on ultrametric spaces.) The main qualitative difference between Euclidean-type lattices and an ultrametric space such as Ω N is that in the former case it is possible to go far by a sequence of small steps, while in the latter case that is not possible, and the only way to go far is to make jumps of ever bigger sizes. This has important consequences for random walks for which there are analogies and differences with the Euclidean case (see e.g. [17,18] and references therein) and percolation in ultrametric spaces [16]. In particular, percolation in Ω N is possible only in the form of long range percolation, that is, with positive probabilities of connections between vertices separated by arbitrarily large distances.
With these precedents it is natural to investigate percolation in ultrametric spaces such as Ω N where classical tools do not apply. Our aim is to develop a mathematical framework that might be useful generally for this kind of model, not thinking about specific motivations from or applications to physics or any other field.
In [16] we studied asymptotic percolation in $\Omega_N$ as $N \to \infty$ (or mean field percolation) with connection probabilities of the form $c_k/N^{2k-1}$ between two points separated by distance $k$, and we obtained a necessary and sufficient condition for percolation. (See Subsection 3.1 for the definition of asymptotic percolation.) The Erdős-Rényi results on giant components of random graphs were a useful tool, although there are significant differences between classical random graphs and ultrametric ones (e.g., the average length of paths in the giant component of an ultrametric ball is much longer than in the classical case).
In the present paper we study percolation in $\Omega_N$ for fixed $N$ with connection probabilities of the form $c_k/N^{k(1+\delta)}$, $\delta > -1$, between two points separated by distance $k$. In this case percolation means that there is a positive probability that a given point of $\Omega_N$ belongs to an infinite connected component. This is a quite different situation from the asymptotic $N \to \infty$ model, and new methods must be used. Properties of Erdős-Rényi graphs are again useful, but now it is connectivity results that help, especially the result in the Appendix based on Durrett's approach to connectivity [19]. There are three regimes: $\delta < 1$, $\delta > 1$ and $\delta = 1$. Roughly speaking, under certain natural assumptions on $c_k$, for $\delta > 1$ percolation does not occur, and for $\delta < 1$ percolation occurs and the infinite connected component is unique. The most difficult case is the critical one, $\delta = 1$, where we use certain special forms of $c_k$ and percolation may or may not occur. Our aim is to find forms of $c_k$ such that percolation occurs, or that it does not occur, and in the case of percolation it turns out that the infinite connected component is unique.
We emphasize that taking the limit $N \to \infty$ provides a simplification which allowed us to obtain a sharp result, that is, a necessary and sufficient condition for percolation, in [16]. In the case of finite $N$ that does not seem possible at present, and our work leads to some open problems, in particular regarding the behaviour at the critical values of the parameters in the form of $c_k$ in the critical case $\delta = 1$. We also consider an intermediate situation that we call pre-percolation, which is necessary for percolation. Pre-percolation occurs in one of our results.
While we were working on this paper we learned about the manuscript of Koval et al. [26] (for which we thank them), where they also study percolation in $\Omega_N$ for fixed $N$, with connection probabilities of the form $1 - \exp\{-\alpha/\beta^k\}$, $\alpha \ge 0$, $\beta > 0$, between two points separated by distance $k$. Some of their results for $\beta > 1$ and ours may be compared by setting $\beta = N^{1+\delta}$ (see Remark 3.2).
For the cases where percolation occurs, especially with $\delta = 1$, we use an approach in the spirit of the renormalization group method of statistical physics which has been employed, for example, in [14] for ferromagnetic systems on Dyson's hierarchical lattice $\Omega_2$ and for the study of long range percolation on $\mathbb{Z}^d$ (see [27,32]).
In Section 2 we describe the model (the hierarchical group Ω N and the associated random graph G N ). In Section 3 we recall the result on mean field percolation [16] in order to compare it with a result in the present paper, and we state our results for δ < 1 and δ > 1 (Theorem 3.1) and the critical case δ = 1 (Theorems 3.3 and 3.5), and we mention some open problems (Subsection 3.4). Sections 4 and 5 contain the proofs. In an appendix we give a result on connectivity of random graphs derived from [19], which is a key ingredient for the proof of percolation in the critical case.

DESCRIPTION OF THE MODEL
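2.1. The hierarchical group $\Omega_N$. For an integer $N \ge 2$, the hierarchical group of order $N$, also called the hierarchical lattice of order $N$, is defined as
$$\Omega_N = \Bigl\{x = (x_1, x_2, \ldots) : x_i \in \{0, 1, \ldots, N-1\},\ i = 1, 2, \ldots,\ \sum_i x_i < \infty\Bigr\},$$
with addition componentwise mod $N$; in other words, $\Omega_N$ is a countable Abelian group given by the direct sum of a countable number of copies of the cyclic group of order $N$. The hierarchical distance on $\Omega_N$ is defined as
$$d(x, y) = \begin{cases} 0 & \text{if } x = y,\\ \max\{i : x_i \ne y_i\} & \text{if } x \ne y.\end{cases}$$
It is a translation-invariant metric which satisfies the strong (non-Archimedean) triangle inequality $d(x, y) \le \max\{d(x, z), d(z, y)\}$ for any $x, y, z$.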
Hence (Ω N , d) is an ultrametric space, and it is well known that it can be represented as the leaves at the top of an infinite regular tree where N branches emerge up from each node, and the distance between two points of Ω N is the number of levels from the top to their most recent common node.
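As a small illustration, the following Python sketch computes the hierarchical distance and checks the strong triangle inequality by brute force on a 3-ball of $\Omega_3$. It assumes the usual convention that $d(x, y)$ is the largest coordinate index at which $x$ and $y$ differ, with points stored as finite tuples of digits (trailing zeros implied); the function names are illustrative only.

```python
from itertools import product


def hier_dist(x, y):
    """Hierarchical distance: largest 1-based index where x and y differ (0 if equal).

    Points are finite tuples of digits in {0, ..., N-1}; missing trailing
    coordinates are treated as 0, as in a direct sum.
    """
    n = max(len(x), len(y))
    xs = tuple(x) + (0,) * (n - len(x))
    ys = tuple(y) + (0,) * (n - len(y))
    d = 0
    for i, (a, b) in enumerate(zip(xs, ys), start=1):
        if a != b:
            d = i  # keep the largest index at which the coordinates differ
    return d


def is_ultrametric(points):
    """Brute-force check of d(x, y) <= max(d(x, z), d(z, y)) over all triples."""
    return all(
        hier_dist(x, y) <= max(hier_dist(x, z), hier_dist(z, y))
        for x in points
        for y in points
        for z in points
    )


N, depth = 3, 3
ball = list(product(range(N), repeat=depth))  # the 27 points of a 3-ball
print(is_ultrametric(ball))
```

The check is exhaustive only on this finite ball; it is meant as a sanity check of the definition, not as a proof.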
For each integer $k \ge 1$, a $k$-ball in $\Omega_N$, denoted by $B_k$, is a set of all points which are at distance at most $k$ from each other. Any point of a ball can serve as a center. Once a center is chosen, one may speak of the interior and the boundary of the ball. A $k$-ball contains $N^k$ points, and its boundary contains $N^{k-1}(N-1)$ points. For $k > 1$, a $k$-ball is the union of $N$ $(k-1)$-balls, which are at distance $k$ from each other. For $j > k \ge 1$, a $j$-ball is the union of $N^{j-k}$ $k$-balls, which are at distance at least $k+1$ and at most $j$ from each other. Two balls are either disjoint, or one is contained in the other (this is the reason why connections between nearest neighbours alone cannot produce percolation). For $k \le j < \ell$, the $(j, \ell]$-annulus around $B_k$ is the set of all points $y$ such that $j < d(x, y) \le \ell$, where $x$ is any point in $B_k$. The $(j, \ell]$-annulus is also described as $B_\ell \setminus B_j$ with $B_k \subset B_j \subset B_\ell$, and it contains $N^\ell(1 - N^{j-\ell})$ points. The number of points in a bounded subset $A \subset \Omega_N$ is denoted by $|A|$. The probability that a point chosen at random (uniformly) in $B_k$ belongs to $A \subset B_k$ is $|A|N^{-k}$.
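The counting statements above can be checked by enumeration. The sketch below, with illustrative helper names and the small values $N = 3$, $\ell = 3$ chosen only for the example, counts the points of a ball, of its boundary, and of an annulus around the point 0.

```python
from itertools import product


def hier_dist(x, y):
    """Largest 1-based index at which two equal-length digit tuples differ."""
    return max((i for i, (a, b) in enumerate(zip(x, y), start=1) if a != b), default=0)


N, ell = 3, 3
points = list(product(range(N), repeat=ell))  # an ell-ball containing 0
origin = (0,) * ell


def count_within(k):
    """Number of points at distance at most k from the origin (a k-ball)."""
    return sum(1 for y in points if hier_dist(origin, y) <= k)


def count_at(k):
    """Number of points at distance exactly k (the boundary of the k-ball)."""
    return sum(1 for y in points if hier_dist(origin, y) == k)


def count_annulus(j, l):
    """Number of points y with j < d(0, y) <= l."""
    return sum(1 for y in points if j < hier_dist(origin, y) <= l)


for k in range(1, ell + 1):
    print(k, count_within(k), N**k, count_at(k), N ** (k - 1) * (N - 1))
print(count_annulus(1, 3), N**3 - N**1)
```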
We fix a point of Ω N which we denote as 0. Most of our considerations about percolation will refer to balls containing 0.
2.2. The random graph $G_N$. We define an infinite random graph $G_N$ with the points of $\Omega_N$ as vertices, and for each $k \ge 1$ the probability of connection, $p_{x,y}$, between $x$ and $y$ with $d(x, y) = k$ is given by
$$p_{x,y} = \frac{c_k}{N^{k(1+\delta)}}, \qquad (2.1)$$
where $\delta > -1$ and $c_k > 0$, all connections being independent. This can be realized in terms of a collection of independent uniform $[0, 1]$ random variables $\{U_{(x,y)}\}$ by adding an (undirected) edge between $x$ and $y$ if and only if $U_{(x,y)} \le p_{x,y}$ (see e.g. [23], page 4). Our aim is to find sufficient conditions on $c_k$ and $\delta$ which imply that percolation occurs, or that it does not occur.
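The construction can be sketched on a single finite ball as follows. This is a minimal simulation, assuming connection probabilities of the form $c_j/N^{j(1+\delta)}$ at distance $j$; the clipping at 1 is our own safeguard so that the value is always a probability, and the names `sample_ball_graph` and `largest_cluster_size` are hypothetical. The strict comparison differs from $U \le p$ only on an event of probability zero.

```python
import random
from itertools import combinations, product


def hier_dist(x, y):
    """Largest 1-based index at which two equal-length digit tuples differ."""
    return max((i for i, (a, b) in enumerate(zip(x, y), start=1) if a != b), default=0)


def sample_ball_graph(N, k, delta, c, rng):
    """Sample the edges of the random graph restricted to one k-ball.

    c is a function j -> c_j; the probability is clipped at 1 (an assumption).
    """
    points = list(product(range(N), repeat=k))
    edges = []
    for x, y in combinations(points, 2):
        j = hier_dist(x, y)
        p = min(1.0, c(j) / N ** (j * (1 + delta)))
        if rng.random() < p:  # one independent uniform variable per pair
            edges.append((x, y))
    return points, edges


def largest_cluster_size(points, edges):
    """Size of the largest connected component, via union-find."""
    parent = {v: v for v in points}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x, y in edges:
        parent[find(x)] = find(y)
    sizes = {}
    for v in points:
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values())


rng = random.Random(0)
pts, edges = sample_ball_graph(N=2, k=6, delta=1.0, c=lambda j: 2.0 * j, rng=rng)
print(len(pts), len(edges), largest_cluster_size(pts, edges))
```

Only connections inside the ball are sampled here, which matches the convention used later for the attached clusters.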
We will study separately the cases $\delta > 1$, $\delta < 1$, and $\delta = 1$. As we shall see, the case $\delta = 1$ requires a more delicate analysis. In this case we take $c_k$ of the following special forms:
(i)
$$c_k = C_0 + C_1 \log k + C_2 k^{\alpha}, \qquad (2.2)$$
with constants $C_0 \ge 0$, $C_1 \ge 0$, $C_2 \ge 0$ and $\alpha > 0$.
(ii) We consider the hierarchical distances with logarithmic scale
$$k_n = k_n(K) := \lfloor K n \log n \rfloor, \quad n = 1, 2, \ldots, \qquad (2.3)$$
so that
$$k_{n+1} - k_n \sim K \log n \quad \text{as } n \to \infty \qquad (2.4)$$
(where $\sim$ has the usual meaning, see beginning of the proofs), and we consider the class of connection rates given by (2.1) with $\delta = 1$, $c_k$ satisfying
$$c_{k_n} = C + a \log n \cdot N^{b \log n} = C + a \log n \cdot n^{b \log N}, \qquad (2.5)$$
and
$$c_{k_n} \le c_j \le c_{k_{n+1}} \quad \text{for } k_n < j < k_{n+1}, \qquad (2.6)$$
with constants $C \ge 0$, $a > 0$, $b \ge 0$. The constant $K$ is chosen suitably in each case. The value $N = 2$ is special because $\log 2 < 1$, and that is why for some results with $N = 2$ we set $K > 1/\log 2$. The reason for considering $k_n(K)$ and these forms for $c_{k_n}$ is that they provide a four parameter family of comparison connection rates (with parameters $K$, $C$, $a$, $b$) suitable for the renormalization analysis used in the proof of percolation in Theorem 3.5. An intuitive argument for this is given before Theorem 3.5. Results of Theorem 3.5 are used to prove Theorem 3.3, which is our main result.
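To get a feeling for these scales, the short sketch below evaluates $k_n(K) = \lfloor K n \log n \rfloor$ and $c_{k_n} = C + a \log n \cdot N^{b \log n}$ numerically, and checks the identity $N^{b \log n} = n^{b \log N}$. The parameter values are arbitrary choices for illustration, not values used in the paper.

```python
import math


def k_n(n, K):
    """Hierarchical distance k_n(K) = floor(K * n * log n)."""
    return math.floor(K * n * math.log(n))


def c_kn(n, N, C, a, b):
    """Connection rate c_{k_n} = C + a * log(n) * N**(b * log(n))."""
    return C + a * math.log(n) * N ** (b * math.log(n))


# Illustrative parameters only.
N, K, C, a, b = 3, 2.0, 1.0, 1.5, 2.5
for n in (2, 10, 100, 1000):
    gap = k_n(n + 1, K) - k_n(n, K)
    print(n, k_n(n, K), gap, round(K * math.log(n), 2), c_kn(n, N, C, a, b))

# The two expressions for the growth factor agree up to rounding error.
print(math.isclose(N ** (b * math.log(50)), 50 ** (b * math.log(N)), rel_tol=1e-9))
```

The gap $k_{n+1} - k_n$ is close to $K(\log n + 1)$ for finite $n$, so the ratio to $K \log n$ approaches 1 only slowly.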
A set of vertices any two of which are linked by a path of connections is called a cluster. By transitivity of $(\Omega_N, d)$ we may focus on clusters containing 0. In the proofs it is implicitly assumed that we consider a sequence of nested balls $(B_k)_{k \ge 1}$ such that 0 belongs to $B_1$ (or to some $B_k$). We denote by $X_k$ the largest cluster contained in $B_k$, considering only connections within $B_k$ and not through points outside of $B_k$. If there is more than one largest cluster, we choose one of them uniformly from the existing ones. In this way each $k$-ball $B_k$ has a unique attached cluster $X_k$. Note that for $B_k \subset B_{k+j}$, either $X_k \cap X_{k+j} = \emptyset$ or $X_k \subset X_{k+j}$. Those clusters will be used only in the proofs of sufficient conditions for percolation. The assumption of connections only within balls makes the renormalization approach quite practical for percolation, and connections through points outside would only add to the possibility of percolation.
Definition 2.1. We say that percolation occurs in G N if there is a positive probability that a fixed point of G N (for example 0) belongs to an infinite cluster.
If percolation occurs, the probability that there is an infinite cluster is 1. Indeed, the event that there is an infinite cluster is measurable with respect to the tail σ-algebra generated by the connections involving points outside each k-ball (containing 0) for every k, and the connections involving points outside a ball are independent of those inside, so by a 0-1 law the probability that there is an infinite cluster is 0 or 1.
In some cases we consider percolation clusters of positive density, that is, $|X_k|N^{-k}$ does not decrease to 0 as $k \to \infty$.

Remark 2.2.
It follows immediately from the construction in terms of the family $\{U_{(x,y)}\}$ that given two families of connection probabilities $p^1_{x,y}$, $p^2_{x,y}$ with $p^2_{x,y} \ge p^1_{x,y}$ for any $x, y$, percolation for family 1 implies percolation for family 2.
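The monotonicity in this remark rests on using the same uniform variables for both families. A minimal sketch of that coupling on a finite vertex set follows; the function name and the constant probabilities are illustrative assumptions.

```python
import random
from itertools import combinations


def coupled_edges(vertices, p1, p2, rng):
    """Build two edge sets from the same uniform variables.

    p1 and p2 are functions (x, y) -> connection probability. If p2 >= p1
    pointwise, every edge of the first graph is also an edge of the second.
    """
    e1, e2 = set(), set()
    for x, y in combinations(vertices, 2):
        u = rng.random()  # one shared uniform variable per pair
        if u < p1(x, y):
            e1.add((x, y))
        if u < p2(x, y):
            e2.add((x, y))
    return e1, e2


rng = random.Random(42)
e1, e2 = coupled_edges(range(30), lambda x, y: 0.2, lambda x, y: 0.5, rng)
print(len(e1), len(e2), e1 <= e2)
```

Since the first edge set is contained in the second, any infinite cluster of the first graph would also be connected in the second, which is the content of the remark.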

RESULTS
3.1. Mean field percolation. For completeness, we start by recalling the result on asymptotic percolation as $N \to \infty$ [16]. This will also be used for comparison with a result below. The probability of connection between two points separated by distance $k$ is $c_k/N^{2k-1}$. Note that this corresponds to the critical case $\delta = 1$ with $c_k$ in (2.1) multiplied by $N$, and this may be viewed as a normalization required for obtaining the result in the limit. Asymptotic percolation is said to occur if […], where $P_{\mathrm{perc}}$ is the probability of percolation. For each $k \ge 1$, let $\beta_k \in (0, 1)$ satisfy
$$\beta_k = 1 - \exp\bigl(-c_k \beta_{k-1}^2 \beta_k\bigr),$$
where $c_k \beta_{k-1}^2 > 1$. Note that $\beta_k$ is the well-known survival probability of a Poisson branching process with parameter $c_k \beta_{k-1}^2$. This corresponds to hierarchical level $k$, and the $\beta_{k-1}^2$ comes from the sizes of two connected giant components at the previous level $k - 1$. Assume that $c_k \nearrow \infty$ as $k \to \infty$, $c_1 > 2 \log 2$ and $c_2 > 8 \log 2$. The results (see [16], Theorem 2.2 and Lemma 2.1) are that asymptotic percolation occurs if and only if $\sum_{k=1}^{\infty} e^{-c_k} < \infty$, and when it occurs, the probability of percolation is given by […] (which is strictly positive if and only if the exponential series converges), and percolation takes place through a cascade of clusters (in this case giant components) at consecutive hierarchical distances. For example, if $c_k = a \log k$ for large $k$, $a > 0$, then asymptotic percolation occurs if and only if $a > 1$. See Remark 3.6(1) for a partially analogous result with fixed $N$.
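The survival probability of a Poisson branching process with parameter $\lambda > 1$ is the positive solution of $\beta = 1 - e^{-\lambda \beta}$, a standard fact. The sketch below solves this equation by fixed-point iteration; the function name, tolerance and iteration cap are our own choices, and this is only an illustration of the quantity $\beta_k$, not the computation used in [16].

```python
import math


def survival_probability(lam, tol=1e-12, max_iter=10000):
    """Positive solution of beta = 1 - exp(-lam * beta); 0 if lam <= 1.

    Iterating from beta = 1 decreases monotonically to the largest fixed point.
    """
    if lam <= 1.0:
        return 0.0
    beta = 1.0
    for _ in range(max_iter):
        new = 1.0 - math.exp(-lam * beta)
        if abs(new - beta) < tol:
            return new
        beta = new
    return beta


for lam in (1.5, 2 * math.log(2), 2.0, 5.0):
    print(round(lam, 4), survival_probability(lam))
```

For $\lambda = 2 \log 2$ the solution is exactly $1/2$, since $1 - e^{-\log 2} = 1/2$.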

Remark 3.2.
Our results and those of [26] can be compared for $\beta > 1$ therein, since in this case their connection probabilities satisfy $p_k = 1 - \exp(-\alpha/\beta^k) \sim \alpha/\beta^k$ as $k \to \infty$. If we set $\beta = N^{1+\delta}$ and let $c_k = c$ not depending on $k$ in (2.1), then the decay rates agree, with $c$ corresponding to their $\alpha$. The comparison is between Theorem 1 in [26] and our Theorem 3.1. We have that for $\delta > 1$, $\beta > N^2$, percolation does not occur with any value of $c$, and for $-1 < \delta < 1$, $1 < \beta < N^2$, percolation occurs with $c$ sufficiently large, corresponding to $\alpha > \alpha_c(\beta)$ in [26]. In addition, in [26] it is proved that there exists $\alpha_c(\beta) > 0$ such that percolation does not occur for $\alpha < \alpha_c(\beta)$. Our main objective is to investigate the critical case $\delta = 1$, $\beta = N^2$, for which percolation does not occur for any $\alpha$ in [26]. Our results in this case are stated in the next subsection.
3.3. The case δ = 1. In the previous subsection we have seen that δ = 1 identifies the critical exponential decay rate for percolation. In this subsection we formulate our main results that determine the critical polynomial rate for percolation.
Theorem 3.3. Let $\delta = 1$ and $c_k = C_0 + C_1 \log k + C_2 k^{\alpha}$.
(a) If $\alpha > 2$, then for any $C_1$ there exist $C_0^* > 0$ and $C_2^* > 0$ such that if $C_0 > C_0^*$ and $C_2 > C_2^*$, percolation occurs and the percolation cluster is unique.
(b) If $C_2 = 0$ and $C_1 < N$, then percolation does not occur for any $C_0$.
(c) If $\alpha > 2$, there exists $C^* > 0$ such that if $\max(C_0, C_1, C_2) < C^*$, then percolation does not occur.
The proof of this result is based on a renormalization argument that is formulated using the hierarchical distances k n (K) defined in (2.3) and the family of connection rates c kn defined in (2.5) with parameters C, a, b. This is the substance of Theorem 3.5.
In order to express one of the results below we introduce the following notion.

Definition 3.4.
We call pre-percolation the situation that (with probability 1) there exists $n_0$ such that there is at least one connection from the annulus $(k_n, k_{n+1}]$ to the annulus $(k_{n+1}, k_{n+2}]$ for all $n \ge n_0$.
Note that for percolation, in addition to pre-percolation, there would have to be paths connecting points in $(k_{n+1}, k_{n+2}]$ which are connected to $(k_n, k_{n+1}]$ to points in $(k_{n+1}, k_{n+2}]$ which are connected to $(k_{n+2}, k_{n+3}]$, etc.
Before stating the next theorem, let us give an intuitive explanation for the choice of $k_n$ and $c_{k_n}$ in (2.3)-(2.6) with $b > 0$, and the assumption $K < b$ (the assumption $2/\log N < K$ is a technical requirement for the method of proof). For this argument only we use the notation $\approx$ for approximate equality for large $n$ without giving it a rigorous meaning. The idea is that $k_n$ is the right scaling and the form of $c_{k_n}$ is the right one which combines exactly with $k_n$ in order to produce the percolation cluster. We consider the largest clusters $X_{k_n}$ in each one of the $N^{k_{n+1}-k_n} \approx N^{K \log n}$ $k_n$-balls in a $k_{n+1}$-ball, and assume that their sizes are $|X_{k_n}| \approx \beta N^{k_n}$ for some $\beta \in (0, 1)$, and that the probability of connection between two points in different clusters is $c_{k_n}/N^{2k_{n+1}}$ (which is a lower bound for the actual probabilities). Let $s_n(\beta)$ denote the probability that two such clusters $X_{k_n}$ and $X'_{k_n}$ in disjoint $k_n$-balls are connected. Then […] Consider the E-R random graph $G(N^{K \log n}, r_n(\beta))$, and write $r_n(\beta)$ as […]
If $K > b$, then by the E-R theory only order $\log(N^{K \log n})$ of the $X_{k_n}$ are connected, hence the ratio of the size of the largest connected component in the $k_{n+1}$-ball to the size of the ball decreases to 0 as $n \to \infty$, so there cannot be a percolation cluster (of positive density). Therefore we choose $K < b$. Now let […] then by Theorem 2.8.1 of [19] the probability that the graph $G(N^{K_1 \log n}, r_n(\beta))$ is connected tends to 1 as $n \to \infty$. Connectivity of that graph, whose vertices are the clusters $X_{k_n}$ in the $k_n$-balls in a $k_{n+1}$-ball, means that the largest cluster in the $k_{n+1}$-ball contains the largest clusters in all the $k_n$-balls it contains. If this can be proved for all sufficiently large $n$, then percolation follows.
Note that this argument provides a percolation cluster if $K < b$, but it does not imply that percolation does not occur if $K > b$, since it only rules out a percolation cluster of positive density in that case.
(b) Assume that $\{c_{k_n}\}$ satisfy (2.5), (2.6) with $b > 0$ and $c_{k_n} = C + a \log n \cdot N^{b \log n}$, where $k_n = k_n(K)$ is given by (2.3) and the pair $(K, b)$ satisfies $2/\log N < K < b$.
(1) Then there exist $C > 0$ and $a^* > 0$ such that for $a > a^*$ there is a sequence $(\beta_n)_n$ such that […] (3.2) and, for the clusters $X_{k_n}$ in a nested sequence of $k_n$-balls $B_{k_n}$ containing 0,
$$P(\text{there exists } n_{00} \text{ such that } |X_{k_n}| \ge \beta_n N^{k_n} \text{ for all } n \ge n_{00}) = 1, \qquad (3.3)$$
percolation occurs, and the percolation cluster is unique.
(2) Assume that $b < 2K - 1/\log N$. Then the percolation cluster is given by a "cascade" of clusters at distances $k_n$; more precisely, there exists a (random) number $n_0$ such that for $n \ge n_0$ connections between $X_{k_n} \cap (B_{k_n} \setminus B_{k_{n-1}})$ and $X_{k_{n+\ell}} \cap (B_{k_{n+\ell}} \setminus B_{k_{n+\ell-1}})$ occur only for $\ell = 1, 2$.
(c) In the special case with connection probabilities $c_{k_n}/N^{2k_{n+1}}$ at distances $j$ with $k_n + 1 \le j \le k_{n+1}$, and $0 < b \le 2/\log N < K$, percolation does not occur.

Remark 3.6.
(1) In Theorem 3.5(a) a slightly different result holds for $N = 2$ since $\log 2 < 1$, so that we need $K > 1/\log 2$.
(2) Note the consistency of Theorem 3.5(a) with the example of asymptotic percolation recalled in Subsection 3.1. The difference is that in the finite $N$ case we only have pre-percolation, and the connections are at hierarchical distances $k \log k$ rather than $k$.
(3) Theorem 3.5(a)(ii) implies that in part (b) pre-percolation occurs with any $a > 0$ and $b > 0$.
(4) The cascade of clusters in Theorem 3.5(b)(2) is analogous to the cascade of giant components in the mean field case [16].
(5) The formulation in Theorem 3.5(b) is used as a technical tool to prove the result in Theorem 3.3, and also provides a setting to give a refined result.

Open problems and related developments.
(1) In Theorem 3.3(b) we have proved that with $c_k = C_0 + C_1 \log k + C_2 k^{\alpha}$ percolation does not occur if $C_2 = 0$ and $C_1 < N$. On the other hand, in Theorem 3.3(a) we proved that for $\alpha > 2$ and $C_0$ sufficiently large, percolation occurs. It remains an open question whether percolation can occur for all $\alpha > 0$, or even for $C_2 = 0$ and some $C_1$ sufficiently large. We next explain that to resolve these questions, analogues of well-known results for Erdős-Rényi graphs would be needed for a class of ultrametric random graphs. An ultrametric random graph URG($M$, $d$) is a random graph on a finite set $M$ with ultrametric $d$ and with connection probabilities $p_{x,y}$ that depend on the ultrametric distance $d(x, y)$.
Consider the case $C_2 = 0$. The expected number of in-edges to the annulus $(k_{n+1}, k_{n+2}]$ from $B_{k_{n+1}}$ is of order $O(a(1 - 1/N) \log n)$ and the expected number of out-edges to the $(k_{n+2}, k_{n+3}]$-annulus is of order $O(a(1 - 1/N) \log(n + 1))$. In order to determine the number of in-edges that connect to an out-edge, it would be necessary to determine the probability that two randomly chosen vertices in the $(k_{n+1}, k_{n+2}]$-annulus are connected by a path in the associated ultrametric random graph. This is related to the problem of determining the distribution of sizes of the connected components. These are open problems.
Consider the case $C_2 > 0$. In Theorem 3.5(c) we have proved that the random graph based on a lower bound for the connection probabilities at distances $k_n + 1, \ldots, k_{n+1}$ does not exhibit percolation in the case $0 < b \le 2/\log N < K$ (corresponding to the case $\alpha < 2$). In order to refine the argument and determine the behaviour for the actual connection probabilities which arise if $0 < \alpha < 2$, it would be necessary to determine the size of the largest connected component, that is, the number of $k_n$-balls (more precisely their largest connected components) in the $k_{n+1}$-ball which are connected and in the largest connected component in the associated ultrametric random graph (as $n \to \infty$). This is an open problem.
(2) It would also be of interest to consider the intermediate case with $\delta = 1$ but with connection probabilities of the form $n(\log n)^k/N^{2k_n}$.
(4) Berger [7] has studied the behaviour of random walk on the infinite cluster of long-range percolation in Euclidean lattices of dimensions $d = 1, 2$. It would also be interesting to investigate this behaviour on the infinite clusters obtained in Theorems 3.1(b) and 3.5(b). Long-range random walks on $\Omega_N$ have been studied in [17].
3.5. The renormalization group approach. The basic strategy we employ is in the spirit of the renormalization group method of statistical physics [14], which has been used by Newman and Schulman [27], Section 2, in their study of long range percolation in the Euclidean lattice.
Consider the countable ultrametric space (Ω N , d). For each integer k we define an equivalence relation on Ω N by x ≡ k y iff d(x, y) ≤ k, that is, x and y belong to the same k-ball. Now consider the set of equivalence classes E k furnished with the ultrametric where x is an equivalence class containing a point x. Then the resulting set of equivalence classes with ultrametric d k can be identified with (Ω N , d).
Given a graph with edges E N given by a symmetric subset of Ω N × Ω N we obtain a new graph as follows. The set of vertices is the set of all k-balls, and the set of edges E N,1 are such that x has a connection to y if there is a connection in G N between the largest connected subset (cluster) of x and the largest connected subset of y. Using the above identification this defines a new graph G 1 N = ΦG N on (Ω N , d). Iterating this procedure we obtain a sequence G k N = Φ k G N , k ≥ 1, of graphs all having vertex set Ω N . In addition we assign to each vertex v in G k N the [0, 1]-valued random variable denotes the set of vertices in G N contained in the connected cluster in the k-ball corresponding to v obtained as the union of the clusters in the (k − 1)-balls it contains. This construction defines the renormalization mapping Φ : G N → G N such that Φ k : G N → G k N for each k. Note that at each iteration the connection probabilities are the probabilities that the largest connected subsets of equivalence classes are connected and that these probabilities are random and dependent (because the connected components have random sizes), and they change at each iteration.
Rather than working directly with the sequence G k N we choose a subsequence k n and construct a sequence of renormalization maps Φ kn such that the number of points in a ball of radius 1 (with respect to the new distances d kn ) increases to infinity as n → ∞. In particular, we will show that there exists an increasing sequence of integers (k n ) (see (2.3)) and a sequence of graphs G kn with N kn−kn−1 vertices constructed recursively as follows, that is, where Φ n depends on n since it is a mapping from a graph with vertices Ω N kn −k n−1 to a graph with vertices Ω N k n+1 −kn and with connection probabilities between vertices that are a function of the distance between them. Moreover we can identify G kn with a subgraph of G N and these subgraphs are a decreasing function of n. We establish percolation by showing that the intersection of these subgraphs starting at a given point in Ω N is non-empty with positive probability. The difference now is that Φ kn+1 is not obtained by iteration but by means of Φ n : Given the sequence (k n ) we can consider the equivalence classes given by x ≡ kn y iff d(x, y) ≤ k n , and define the ultrametric d kn by We now consider the graph G kn whose vertices are the d kn equivalence classes. Two points in G kn at d kn -distance ℓ ≥ 1 are connected if there is a G N -edge joining the largest connected components in these equivalence classes in G kn−1 . Note that there are N kn+1−kn points in a ball of d kn -radius 1, N kn+2−kn in a ball of radius 2, etc. The proof of percolation in the case δ < 1 given in Section 4 involves showing that as n → ∞ the graphs G kn occupy a certain portion of the k n -balls and an increasing sequence can be linked in a cascade with probability approaching 1.
In order to apply these ideas to the more delicate critical case δ = 1 in Section 5 we define Y kn (v), v ∈ G kn as above. Then given the random graph G N , the nontriviality, lim inf n→∞ Y kn > 0, has probability 0 or 1. Our goal is to find a sufficient condition for this to be 1. In order to achieve this our strategy is to look for a pair of sequences k n → ∞, lim inf n β n > 0, such that the probability that Y kn ≥ β n converges to 1 as n → ∞. This program will be carried out in Subsection 5.2 using as basic tools large deviation estimates for binomial distributions and probability bounds for the connectivity of an Erdős-Rényi graph.

PROOFS FOR THE CASES δ > 1 AND δ < 1
We first mention a few notational points. In some places in the proofs in this and the following section, where a number appears that should be a non-negative integer but is not necessarily so, it should be interpreted as its integer part. $a_n \sim b_n$ means that $a_n/b_n \to 1$ as $n \to \infty$, $a_n \gg b_n$ means that $b_n = o(a_n)$, and $a_n \lesssim b_n$ means that $0 \le \liminf_n a_n/b_n \le \limsup_n a_n/b_n \le 1$.
Definition 4.1. For $0 < \gamma < 1$, we say that a $k$-ball $B_k$ is $\gamma$-good if its attached cluster $X_k$ satisfies $|X_k| \ge N^{\gamma k}$.
If $0 \in B_k$ and $B_k$ is $\gamma$-good, then the probability that $0 \in X_k$ is greater than or equal to $N^{(\gamma - 1)k}$.

Proof of Theorem 3.1.
(a) It suffices to show that $P(B_k \text{ is connected to its complement for infinitely many } k) = 0$. For $j \ge k$, $P(B_k$ is connected to the $(j, j + 1]$-annulus around it$)$ […] hence $k_{n+1} - k_n \sim \log n$ as $n \to \infty$. Choose $\gamma$ so that $(1 + \delta)/2 < \gamma < 1$. Then the probability of connection between two clusters $X_{k_n}$ and $X'_{k_n}$ in (disjoint) $\gamma$-good $k_n$-balls in a $k_{n+1}$-ball is bounded below by […] (4.5) There are $N^{k_{n+1}-k_n}$ $k_n$-balls in a $k_{n+1}$-ball. If at least $N^{\gamma(k_{n+1}-k_n)}$ of them are $\gamma$-good and their clusters are connected within the $k_{n+1}$-ball, then the size of the cluster $X_{k_{n+1}}$ in the $k_{n+1}$-ball is greater than or equal to $N^{\gamma k_{n+1}}$, so the $k_{n+1}$-ball is $\gamma$-good.
It remains to verify assumption (4.15). We do this by means of a connectivity result for E-R random graphs. Writing c k /N k(1+δ) in (2.1) as c k /N k , c k = c k /N δk , we have that for all k ≤ n, c k > c n log N n with c n = c N δn log N n .
We consider the E-R random graph $G(N^n, c_n/N^n)$ whose vertices are the points of an $n$-ball. If
$$c > N^{\delta n} \log N^n \qquad (4.24)$$
for some very large $n$, then the probability that $G(N^n, c/N^n)$ is connected is close to 1, by Theorem 2.8.1 in [19]. It is possible to choose $c$ large enough so that (4.24) holds because the only restriction on $c$ is $c < N^{(1+\delta)n}$. This implies that the probability that all the points in the $n$-ball are connected is close to 1. Then, taking $n = k_{n_0}$, (4.15) is true. Finally, the uniqueness follows from Theorem 2 in [26] (see Remark 4.2). ✷
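The role of connectivity can be illustrated numerically with the classical Erdős-Rényi threshold: for $p = c \log n / n$, the graph $G(n, p)$ is connected with probability tending to 1 if $c > 1$ and tending to 0 if $c < 1$. The sketch below estimates the connectivity frequency by simulation; it is a generic illustration with hypothetical function names and arbitrary parameters, not the specific estimate of Theorem 2.8.1 in [19].

```python
import math
import random
from itertools import combinations


def er_is_connected(n, p, rng):
    """Sample G(n, p) and report whether it is connected (union-find)."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                components -= 1
    return components == 1


def connectivity_frequency(n, c, trials, seed=0):
    """Fraction of sampled graphs G(n, c*log(n)/n) that are connected."""
    rng = random.Random(seed)
    p = min(1.0, c * math.log(n) / n)
    return sum(er_is_connected(n, p, rng) for _ in range(trials)) / trials


for c in (0.3, 1.0, 3.0):
    print(c, connectivity_frequency(n=200, c=c, trials=20))
```

Well above the threshold nearly every sample is connected, and well below it nearly none is; close to $c = 1$ the outcome at finite $n$ is genuinely random.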

Remark 4.2.
Uniqueness of the infinite cluster is proved in [26] (Theorem 2). They prove that (Ω N , d) can be embedded into Z such that any ball of radius r will be represented by N r consecutive integers, and the collection of balls of radius r partitions Z and that the embedding is stationary and ergodic. The uniqueness then follows from Gandolfi et al. [21], Theorem 0, on the uniqueness of the infinite cluster for long range percolation on Z satisfying the positive finite energy condition. The proof uses only the properties that the connection probabilities between vertices x, y are strictly positive and depend only on d(x, y), and therefore is also applicable to our case. An intuitive argument for uniqueness of the percolation cluster (of positive density), using the argument of the theorem, is that two chains of nested balls will eventually intersect, and by ultrametricity from then on they coincide, so, if their largest clusters occupy a sizeable part of the balls, they will eventually be the same.

PROOFS FOR
Let $A_{n,j}$ denote the event that there is no connection between the $k_{n+1}$-ball $B_{k_{n+1}}$ and the $(k_{n+1}+j-1, k_{n+1}+j]$-annulus around $B_{k_{n+1}}$, $j = 1, \ldots, k_{n+2}-k_{n+1}$. Then
(c) Let $A_n$ denote the event that there is no connection between the interior of a $k_{n+1}$-ball $B_{k_{n+1}}$ and the $(k_{n+1}, k_{n+2}]$-annulus around $B_{k_{n+1}}$. Then $P(A_n) \sim n^{-a}$ as $n \to \infty$, (5.3) and this implies that (i) if $a < 1$, then with probability 1 there are infinitely many pairs $((k_n, k_{n+1}], (k_{n+1}, k_{n+2}])$ of successive annuli that are not connected, (ii) if $a > 1$, then with probability 1 there are at most finitely many pairs $((k_n, k_{n+1}], (k_{n+1}, k_{n+2}])$ of successive annuli that are not connected.
(d) Let $A_{n,j,\ell}$ denote the event that a $k_n$-ball $B_{k_n}$ is connected to the $(k_{n+\ell}+j-1, k_{n+\ell}+j]$-annulus around $B_{k_n}$, $j = 1, \ldots, k_{n+\ell+1}-k_{n+\ell}$. Then for $\ell \ge 1$, is a special case with $\ell = 1$).
(e) Let $A_n$ denote the event that there are connections from a $k_n$-ball $B_{k_n}$ to the complement of a $k_{n+1}$-ball $B_{k_{n+1}}$, with $B_{k_n} \subset B_{k_{n+1}}$. Then log(n + ℓ) (n + ℓ) ℓ log N as $n \to \infty$.
Proof.
(a) By (2.5) and (2.6), $c_{k_{n+1}+j} \sim c_{k_n}$ as $n \to \infty$, hence and then (5.1) follows.
(b) Since there are $N^{k_{n+1}-k_n-1} \sim N^{\log n}$ $k_n$-balls in the interior of the $k_{n+1}$-ball, then from (5.1),
(c) By (5.2) and independence, Since $\sum_n n^{-a} < \infty$ if and only if $a > 1$, then by independence and the (second) Borel-Cantelli lemma, for $a < 1$, with probability 1 there are infinitely many successive $(k_n, k_{n+1}]$-annuli that are not connected, and by the (first) Borel-Cantelli lemma, for $a > 1$, with probability 1 there are at most finitely many successive $(k_n, k_{n+1}]$-annuli that are not connected.
(d) By (2.5) and (2.6),
(e) By (5.4), This proof has been done for $K = 1$ and $\log N > 1$, hence $N \ge 3$. For $N = 2$ we have $k_{n+\ell} - k_n \sim K \log(n+\ell)$ (see (2.4)), which yields $K \log N$ instead of $\log N$ in the last step, so we take $K > \frac{1}{\log 2}$ for summability. ✷
Lemma 5.2. Let $K$ and $b$ be as in part (b)(2) of Theorem 3.5, that is, $0 < b < 2K - \frac{1}{\log N}$. Let $A_{n,j}$ denote the event that the cluster $X_{k_n}$ in a $k_n$-ball $B_{k_n}$ is connected to the $(k_{n+j}, k_{n+j+1}]$-annulus around $B_{k_n}$, $j \ge 2$. Then there is a positive constant $M$ such that Hence with probability 1 there exists a (random) number $n_0$ such that for all $n \ge n_0$ the connections between the clusters $X_{k_n}$ restricted to the $(k_n, k_{n+1}]$-annuli do not skip over two successive annuli, that is, there are no connections between the annulus $(k_{n-1}, k_n]$ and the annuli $(k_{n+2}, k_{n+3}]$, $(k_{n+3}, k_{n+4}]$, etc.
Proof. By (2.5) and (2.6), It is easy to show that the assumptions on $K$ and $b$ imply that
The main tools for proving part (b) of the theorem are a large deviation inequality for the binomial distribution and a connectivity result for an E-R random graph. Recall that a graph is said to be connected if it has only one connected component and no isolated vertices.
We first recall the large deviation bound for the binomial distribution [23] (Corollary 2.4).
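As a numerical illustration of the kind of inequality involved, the following Python sketch compares the exact lower tail of a binomial variable $S \sim \mathrm{Bin}(n, p)$ with the standard Chernoff-type bound $P(S \le (1-\varepsilon)np) \le \exp(-\varepsilon^2 np/2)$, $0 < \varepsilon < 1$. This particular form is an assumption chosen for illustration; the exact statement of the cited corollary is not reproduced here and may differ in its constants. All function names are hypothetical.

```python
import math

def binomial_lower_tail(n, p, m):
    """Exact P(S <= m) for S ~ Bin(n, p), by direct summation."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, m + 1))

def chernoff_lower_bound(n, p, eps):
    """Standard Chernoff-type bound exp(-eps^2 * n * p / 2) on P(S <= (1 - eps) n p).
    Assumed form; not necessarily the constants of the cited corollary."""
    return math.exp(-eps**2 * n * p / 2)

def compare(n, p, eps):
    """Return (exact lower tail, Chernoff-type bound) for the threshold (1 - eps) n p."""
    m = math.floor((1 - eps) * n * p)
    return binomial_lower_tail(n, p, m), chernoff_lower_bound(n, p, eps)

if __name__ == "__main__":
    for n, p, eps in [(200, 0.5, 0.2), (1000, 0.5, 0.1), (500, 0.3, 0.3)]:
        exact, bound = compare(n, p, eps)
        print(n, p, eps, exact, bound)
```

In the proof below, a bound of this type is applied to the number of good balls, with $\varepsilon = \varepsilon_n \to 0$ and $np$ growing like a power of $n$.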
Note that if $0 \in B_k$ and $B_k$ is $\beta$-good, then the probability that $0 \in X_k$ is $\ge \beta$.
Proof. By transitivity we may assume that 0 belongs to the $k_n$-ball whose largest cluster is $X_{k_n}$. Then by the assumption (5.8) we have that which implies percolation. ✷
Lemma 5.7. Let $0 < b < 2K$ in (2.5) with $k_n$ as in (2.3), and let $0 < \beta < 1$. Let $X_{k_n}$ and $X'_{k_n}$ be the largest clusters in two (disjoint) $\beta$-good $k_n$-balls in a $k_{n+1}$-ball. Then $P(X_{k_n} \text{ and } X'_{k_n} \text{ are connected within the } k_{n+1}\text{-ball}) \gtrsim r_n(\beta)$ as $n \to \infty$, (5.9) where $r_n(\beta) = \frac{\beta^2 a \log n}{N^{(2K-b)\log n}}$. (5.10)
Proof. By (2.6) and (2.5), $k_n < d(X_{k_n}, X'_{k_n}) \le k_{n+1}$, so $P(X_{k_n} \text{ and } X'_{k_n} \text{ are connected within the } k_{n+1}\text{-ball})$

Proof of Theorem 3.5.
(a) (i) Lemma 5.1(f) guarantees that with probability 1 there exists $n_0$ such that if $n > n_0$ there are no connections between the $(k_n, k_{n+1}]$-annulus and the complement of $B_{k_{n+2}}$. Moreover, if $a < 1$, by Lemma 5.1(c)(i) there are infinitely many $n$ such that the $(k_n, k_{n+1}]$-annulus and the $(k_{n+1}, k_{n+2}]$-annulus are not connected. This implies that with probability 1 there exists some $n \ge n_0$ such that there are no connections from $B_{k_{n+1}}$ to the exterior. (ii) The pre-percolation statement follows from Lemma 5.1(c)(ii).
(b) (1) We begin by indicating the main ideas of the proof. We consider a sequence of nested balls $B_{k_n}$ (containing 0) and their largest clusters $X_{k_n}$. Recall that each $k_{n+1}$-ball is composed of $N^{k_{n+1}-k_n}$ (disjoint) $k_n$-balls.
At each stage we will focus on the subset of the $k_n$-balls in a $k_{n+1}$-ball that are $\beta_n$-good (i.e., $|X_{k_n}| \ge \beta_n N^{k_n}$, see Definition 5.5), where $(\beta_n)_n$ is a sequence of numbers in $(0, 1)$ to be determined below. By construction the events that different $k_n$-balls are $\beta_n$-good are independent, and by transitivity they all have the same probability $p^G_n(\beta_n) = P(|X_{k_n}| \ge \beta_n N^{k_n})$. (5.11) Let $N_n$ denote the number of $\beta_n$-good $k_n$-balls and recall (5.9) and (5.10). Now we consider the E-R random graph $G(N_n, r_n(\beta_n))$ whose vertices are the $N_n$ $\beta_n$-good $k_n$-balls in the $k_{n+1}$-ball, with connection probability $r_n(\beta_n)$.
The key idea of the proof is to establish that with probability 1 there is a (random) number $n_0$ such that for $n \ge n_0$, $G(N_n, r_n(\beta_n))$ is connected, (5.12) which implies that $|X_{k_{n+1}}| \ge N_n \beta_n N^{k_n}$. (5.13) We denote by $E_n$ the event $E_n = \{G(N_n, r_n(\beta_n)) \text{ is connected}\}$ (5.14) and $p^A_n(\beta_n) = P(E_n)$. (5.15) In order to prove (5.12) for all large $n$, by Borel-Cantelli it suffices to show that We denote by $F_n$ the event where $\varepsilon_n \in (0, 1)$ and $p^G_n(\beta_n)$ is given by (5.11), and $p^B_n(\beta_n, \varepsilon_n) = P(F_n)$. (5.18)
The next key idea is to choose a sequence of numbers $\varepsilon_n \in (0, 1)$ of the form $\varepsilon_n = n^{-(1+\theta)}$, $\theta$ to be chosen below, with $\sum_n \varepsilon_n < \infty$ such that $\sum_n (1 - p^B_n(\beta_n, \varepsilon_n)) < \infty$. (5.19) If both events $E_n$ and $F_n$ occur, then by (5.13) Therefore, since $E_n$ and $F_n$ are independent (because $E_n$ is defined in terms of distance $k_{n+1}$, and $F_n$ in terms of distance $k_n$), then We will show that for sufficiently large values of $C$ and $a$ there exists a sequence $\beta_n$ such that $\liminf_n \beta_n > 0$ (5.22) and $\sum_n (1 - p^G_n(\beta_n)) < \infty$, (5.23) in order to obtain the results (3.2) and (3.3).
Since the quantities involved in the scheme described above are interdependent, we need to overcome the interaction among them. We proceed as follows:
• We set $\varepsilon_n = n^{-(1+\theta)}$ for some $0 < \theta < \frac{K \log N}{2} - 1$ (recall that $K > \frac{2}{\log N}$), hence $K \log N > 2(1+\theta)$. (5.24)
• In Steps 1 and 2 below we will obtain estimates $\sum_n (1 - p^A_n(\beta)) < \infty$. (5.26)
We will then show that we can choose $n_0$, $C$ and $a$ such that $\beta_n \ge \frac{1}{5}$ and $p^G_n(\beta_n) \ge \frac{1}{2}$ for all $n \ge n_0$. To complete the proof we proceed step by step. We first verify (5.25) and (5.26).
Step 2. We prove the result for a fixed $b$ satisfying and then recalling Remark 2.2 observe that a simple coupling argument shows that the result remains true for all larger values of $b$. Assume that $\beta_n \ge \frac{1}{5}$ for $n \ge n_0$. This assumption will be verified in Step 3. Define another constant Recalling (5.10), (5.14) and conditioning on the event $\{N_n \ge N^{K_1 \log n}\}$, we have $p^A_n(\beta_n) \ge P(G(N_n, r_n(\beta_n)) \text{ is connected} \mid N_n \ge N^{K_1 \log n})\, P(N_n \ge N^{K_1 \log n})$, and therefore $1 - p^A_n(\beta_n) \le P(G(N_n, r_n(\beta_n)) \text{ is not connected} \mid N_n \ge N^{K_1 \log n}) + (1 - P(N_n \ge N^{K_1 \log n}))$. (5.32) Since $p^G_n(\beta_n) \ge \frac{1}{2}$ for large $n$, $\varepsilon_n \to 0$ and $K_1 < K$ by (5.31), then, for large $n$, $N^{K_1 \log n} < (1 - \varepsilon_n)\, p^G_n(\beta_n)\, N^{K \log n}$, so, by Step 1, $\sum_n (1 - P(N_n \ge N^{K_1 \log n})) < \infty$. (5.33) Using $N^{K_1 \log n} \le N_n \le N^{K \log n}$, assuming $\beta_n \ge \frac{1}{5}$ and taking $a > 25 K \log N$, and applying the inequality in the Appendix we obtain where $M$ and $L$ are positive constants. Since $K_1 \log N > 1$, the sum of the second terms converges, and the sum of the third terms also converges. The sum of the first terms converges if Hence for $a > a^*$, together with (5.32), (5.33) we have $\sum_n (1 - p^A_n(\beta_n)) < \infty$ uniformly for $\beta_n \ge \frac{1}{5}$, which implies (5.26).
Step 3. We must show that the assumptions on $\beta_n$ and $p^G_n(\beta_n)$ used in Steps 1 and 2 are self-consistent, that is, we can choose $n_0$, $\beta_{n_0}$, and $p^G_{n_0}(\beta_{n_0})$ such that $p^G_n(\beta_n) \ge$ We proceed as follows. Given $\theta$ satisfying (5.24) and recalling (5.25), (5.26) we can choose $n_0$ such that the following products satisfy: provided that $\beta_n \ge \frac{1}{5}$ for $n = n_0, \ldots, n_0 + k$.
Step 4. For $a > a^*$, $n_0$ and $C$ chosen above we then have $\beta_n \ge \frac{1}{5}$ and $p^G_n(\beta_n) \ge$
(2) The proof follows immediately from Lemma 5.2.
(c) Consider the case $0 < b \le \frac{2}{\log N} < K$ but modifying the model by replacing the actual connection probabilities at distances $k_n + 1, \ldots, k_{n+1}$ with the lower bound $\frac{c_{k_n}}{N^{2k_{n+1}}}$. In this case the lower bound on the connection probabilities in Lemma 5.7 can be replaced by the upper bound $P(X_{k_n} \text{ and } X'_{k_n} \text{ are connected within the } k_{n+1}\text{-ball}) \le r_n(\beta)$ as $n \to \infty$, where $r_n(\beta) = \frac{a \log n}{N^{(K - 2/\log N)\log n}} \cdot \frac{1}{N^{K \log n}}$.
We can then consider the E-R graph $G(N_n, r_n(\beta))$. Assuming that $N_n$ is of order $N^{K \log n}$, in this case by Erdős-Rényi theory the resulting random graph has only of order $\log(N^{K \log n})$ good $k_n$-balls in the largest connected component in the $k_{n+1}$-ball. This would imply that the limiting density of the largest connected component in the $k_n$-balls decreases to 0 as $n \to \infty$, so that percolation does not occur.
We then have that the probability that there are more than $\frac{1+\varepsilon}{\alpha(\lambda)} \log N_n$ good $k_n$-balls in the largest connected component in the $k_{n+1}$-ball for infinitely many $n$ is 0. Therefore there cannot be an infinite connected component (with positive density). ✷

Proof of Theorem 3.3.
(a) Note that if $c_k = C_0 + C_1 \log k + C_2 k^{\alpha}$, with $\alpha > 2$, then in Theorem 3.5 we can choose $b$, $K$ such that $\frac{2}{\log N} < K < b < \frac{\alpha}{\log N}$.
Then $c_{k_n} = C_0 + C_1 \log \lfloor K n \log n \rfloor + C_2 \lfloor K^{\alpha} n^{\alpha} (\log n)^{\alpha} \rfloor \ge C + a \log n \cdot n^{b \log N}$ for sufficiently large $C_0$ and $C_2$, where $C$ and $a$ are as in Theorem 3.5(b). The proof then follows from (2.3), (2.5), (2.6) and the assumptions on $b$ in Theorem 3.5(b) and Remark 2.2.
(b) If $C_2 = 0$ and $C_1 < N$, then $c_{k_n} = C_0 + C_1(\log K + \log n + \log\log n) \le \tilde C_0 + a N \log n$ for some $0 < a < 1$ and $\tilde C_0 > C_0$ (with $K = 1$ if $N \ge 3$). The result then follows from Theorem 3.5(a)(i).
(c) The existence of $C^*$ follows by the argument in [26] (Theorem 1(b)) as follows. The expected number of edges from a given vertex is (see ( which is less than 1 for sufficiently small $C_0$, $C_1$, $C_2$. The result follows by coupling the largest connected cluster containing a given point with a subcritical branching process (see e.g. [23], page 109). ✷
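The subcriticality condition in part (c) can be explored numerically. The Python sketch below assumes the critical case $\delta = 1$, so that two points at distance $k$ are connected with probability $c_k/N^{2k}$, and uses the fact that a point of the hierarchical lattice has $(N-1)N^{k-1}$ points at distance exactly $k$. Under these assumptions the expected degree is $\frac{N-1}{N}\sum_{k \ge 1} c_k N^{-k}$. This is an illustrative reconstruction, not the formula displayed in the original; the function names and the truncation level `k_max` are hypothetical, and the code presumes $c_k \le N^{2k}$ so that the probabilities are valid.

```python
import math

def c_k(k, C0, C1, C2, alpha):
    """Assumed form c_k = C0 + C1*log(k) + C2*k**alpha, as in Theorem 3.3."""
    return C0 + C1 * math.log(k) + C2 * k**alpha

def expected_degree(N, C0, C1, C2, alpha, k_max=200):
    """Truncated expected number of edges from a given vertex:
    sum over k of (N-1)*N**(k-1) points at distance k, each connected
    with probability c_k / N**(2k) (assumes these are valid probabilities)."""
    total = 0.0
    for k in range(1, k_max + 1):
        total += (N - 1) / N * c_k(k, C0, C1, C2, alpha) * float(N) ** (-k)
    return total

def is_subcritical(N, C0, C1, C2, alpha, k_max=200):
    """Mean offspring below 1: the comparison branching process dies out."""
    return expected_degree(N, C0, C1, C2, alpha, k_max) < 1.0

if __name__ == "__main__":
    print(expected_degree(2, 1.0, 0.0, 0.0, 3.0))
    print(is_subcritical(2, 0.5, 0.1, 0.01, 3.0))
```

For constant $c_k = C_0$ the sum is geometric and the expected degree equals $C_0/N$, which gives a quick sanity check on the truncation.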

Appendix. Connectivity of a random graph
Consider the E-R random graph $G(n, \frac{a \log n}{n})$, $a > 0$. Using a random walk approximation for cluster growth in a susceptible-infected-removed epidemic model, Durrett [19] proves the known result that $P(G(n, \frac{a \log n}{n}) \text{ is connected}) \to 1$ as $n \to \infty$ if $a > 1$. Putting together the parts of the proof one obtains (see p. 64) the lower bound for $a > 1$,
$$P\left(G\left(n, \tfrac{a \log n}{n}\right) \text{ is connected}\right) \ge \left[\left(1 - \frac{14 (a \log n)^{13} e^{(13 a \log n)/n}}{n^{a}}\right)\left(1 - \frac{1}{n^{2.1}}\right)\left(1 - \frac{1}{n^{2}}\right)\right]^{n} \cdot \left(1 - e^{-(\log n)^{3}/100}\right)^{n(n-1)}.$$
Then using the inequalities $1 - x > e^{-2x}$, $0 < x < 0.7968$, and $1 - e^{-x} < x$, $x > 0$, it follows that for $a > 1$,
$$P\left(G\left(n, \tfrac{a \log n}{n}\right) \text{ is not connected}\right) \le M\left[(\log n)^{13} n^{1-a} + n^{-1} + n^{2} \exp(-L (\log n)^{3})\right],$$
where $M$ and $L$ are positive constants.
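The connectivity threshold at $a = 1$ can be observed in a small Monte Carlo experiment. The following Python sketch samples $G(n, \frac{a \log n}{n})$ and tests connectivity with a union-find structure. It is only an illustration under assumed parameters (the function names, number of trials and seed are hypothetical) and does not reproduce the quantitative bound above.

```python
import math
import random

def is_connected(n, edges):
    """Union-find connectivity test on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1

def sample_er_edges(n, p, rng):
    """Sample the edge set of G(n, p): each pair is present independently."""
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]

def estimate_connectivity(n, a, trials=200, seed=0):
    """Monte Carlo estimate of P(G(n, a*log(n)/n) is connected)."""
    rng = random.Random(seed)
    p = min(1.0, a * math.log(n) / n)
    hits = sum(is_connected(n, sample_er_edges(n, p, rng)) for _ in range(trials))
    return hits / trials

if __name__ == "__main__":
    for a in (0.3, 1.0, 3.0):
        print(a, estimate_connectivity(100, a, trials=100))
```

For $a$ well below 1 the expected number of isolated vertices is of order $n^{1-a}$, so the estimate is near 0, while for $a$ well above 1 it is near 1, in line with the first term $(\log n)^{13} n^{1-a}$ of the bound.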