Long-range percolation on the hierarchical lattice

We study long-range percolation on the hierarchical lattice of order $N$, where each edge of length $k$ is present with probability $p_k=1-\exp(-\alpha\beta^{-k})$, independently of all other edges. For fixed $\beta$, we show that the critical value $\alpha_c(\beta)$ is non-trivial if and only if $N<\beta<N^2$. Furthermore, we show uniqueness of the infinite component and continuity of the percolation probability and of $\alpha_c(\beta)$ as a function of $\beta$. Together, these results give a complete picture of the phase diagram of this model.


Introduction and main results
The use of percolation theory in statistical physics has long been recognized. The study of long-range percolation on $\mathbb{Z}^d$ goes back to [18] and has led to a series of interesting problems and results [2,17,4,5,19]; see [6, Section 2] for an extensive overview. In [10], long-range percolation on the hierarchical lattice $\Omega_N$ (to be defined below) is studied in the asymptotic regime $N \to \infty$. The contact process on $\Omega_N$ for fixed $N$ has been studied in [3].
In this paper we study the case of finite $N$. Long-range percolation on the hierarchical lattice is quite different from long-range percolation on the usual lattice $\mathbb{Z}^d$: classical methods break down, and the results differ as well. The results and methods in this paper should therefore appeal to both mathematicians and physicists.
For an integer $N \ge 2$, we define the set
$$\Omega_N := \big\{ x = (x_1, x_2, \ldots) : x_i \in \{0, 1, \ldots, N-1\},\ x_i \neq 0 \text{ for only finitely many } i \big\},$$
and define a metric on it by
$$d(x,y) := \begin{cases} 0 & \text{if } x = y, \\ \max\{\, i : x_i \neq y_i \,\} & \text{if } x \neq y. \end{cases}$$
The pair $(\Omega_N, d)$ is called the hierarchical lattice of order $N$.
One can think of the vertices of the hierarchical lattice as the leaves of a regular tree without a root, see Figure 1. The metric $d$ can then be interpreted as the number of generations (levels) up to the "most recent common ancestor" of two vertices. Let $\mathbb{N}$ be the non-negative integers, including $0$, and let $\mathbb{N}^+ := \mathbb{N} \setminus \{0\}$. The set $\Omega_N$ is countable, and we can introduce a natural labeling of its vertices via the map $f : \Omega_N \to \mathbb{N}$ given by
$$f(x) := \sum_{i \ge 1} x_i N^{i-1}.$$
We will sometimes abuse notation and write $n$ for $f^{-1}(n) \in \Omega_N$. The metric space $(\Omega_N, d)$ satisfies the strengthened version of the triangle inequality
$$d(x,y) \le \max\big( d(x,z), d(z,y) \big)$$
for any triple $x, y, z \in \Omega_N$. Such spaces are called ultrametric (sometimes non-Archimedean) [20]. For $x \in \Omega_N$, define $B_r(x)$ to be the ball of radius $r$ around $x$. Several important geometric properties follow from the definition of the space $(\Omega_N, d)$ and its ultrametricity:
1. $B_r(x)$ contains $N^r$ vertices for any $x$;
2. for every $x \in \Omega_N$ there are $(N-1)N^{k-1}$ vertices at distance $k$ from it;
3. if $y \in B_r(x)$ then $B_r(x) = B_r(y)$;
4. as a consequence of the previous property, for all $x$, $y$ and $r$ we have either $B_r(x) = B_r(y)$ or $B_r(x) \cap B_r(y) = \emptyset$.
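As a concrete illustration (not part of the paper), the metric and these ball properties can be checked numerically via the integer labeling; the function names below are our own.

```python
import math

# Sketch of the hierarchical lattice of order N via the integer labeling
# f(x) = sum_i x_i N^(i-1): the distance d(m, n) is the position of the
# most significant base-N digit in which the labels m and n differ.

def hier_dist(m, n, N=2):
    """Hierarchical distance between the vertices labeled m and n."""
    k, d = 1, 0
    while m or n:
        if m % N != n % N:
            d = k                     # highest differing digit so far
        m, n, k = m // N, n // N, k + 1
    return d

def ball(center, r, N=2):
    """B_r(center): under the labeling, a block of N^r consecutive labels."""
    base = (center // N ** r) * N ** r
    return range(base, base + N ** r)

def p_edge(k, alpha, beta):
    """Connection probability for a pair of vertices at distance k."""
    return 1 - math.exp(-alpha * beta ** (-k))
```

For instance, `ball(5, 2)` with $N = 2$ is $\{4, 5, 6, 7\}$, and the number of labels at distance exactly $k$ from $0$ is $(N-1)N^{k-1}$.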
Now consider long-range percolation on $\Omega_N$. Every pair of vertices $(x,y) \in \Omega_N \times \Omega_N$ is (independently of all other edges) connected by a single edge with probability
$$p_k := 1 - \exp\big(-\alpha \beta^{-k}\big),$$
where $k = d(x,y)$ and where $0 \le \alpha < \infty$ and $0 < \beta < \infty$ are the parameters of the model. The vertices $x \in \Omega_N$ and $y \in \Omega_N$ are in the same connected component if there exists a path from $x$ to $y$, that is, a finite sequence $x = x_0, x_1, \ldots, x_n = y$ of vertices such that every pair $(x_{i-1}, x_i)$ with $1 \le i \le n$ shares an edge. The edges are not directed.
We denote the size of a set S of vertices by |S|. The connected component (also called "cluster") containing the vertex x is denoted by C(x). Since |C(x)| has the same distribution for every x ∈ Ω N we may study |C(0)| instead of |C(x)|.
Let $P_{\alpha,\beta}$ be the probability measure governing this percolation process (on the appropriate probability space and sigma-algebra) and $E_{\alpha,\beta}$ the corresponding expectation operator. When no confusion is possible, we omit the subscripts $\alpha$ and $\beta$. Denote the percolation probability by
$$\theta(\alpha, \beta) := P_{\alpha,\beta}\big( |C(0)| = \infty \big).$$
It follows from a standard coupling argument that $\theta(\alpha, \beta)$ is non-decreasing in $\alpha$ for any given $\beta$. Therefore, it is reasonable to define the critical value
$$\alpha_c(\beta) := \inf\{ \alpha \ge 0 : \theta(\alpha, \beta) > 0 \}.$$
Throughout the paper we use the following notation. For a set $S$ of vertices, let $\overline{S} := \Omega_N \setminus S$ denote its complement. The set $C_n(x)$ is the cluster of vertices that are connected to $x$ by a path that uses only vertices inside $B_n(x)$. For disjoint sets $S_1, S_2 \subset \Omega_N$, the event that at least one edge connects a vertex in $S_1$ with a vertex in $S_2$ is denoted by $S_1 \leftrightarrow S_2$; the event that no such edge exists is denoted by $S_1 \nleftrightarrow S_2$. Let $C^m_n(x)$ be the largest cluster in $B_n(x)$; if more than one such cluster exists, $C^m_n(x)$ is defined to be one of them, chosen uniformly among all candidates. In any case, $|C^m_n(x)| = \max_{y \in B_n(x)} |C_n(y)|$.
Theorem 1 ((non-)triviality of the phase transition)
(a) If $0 < \beta \le N$, then $\alpha_c(\beta) = 0$.
(b) If $N < \beta < N^2$, then $0 < \alpha_c(\beta) < \infty$.
(c) If $\beta \ge N^2$, then $\alpha_c(\beta) = \infty$.

Theorem 2 (uniqueness of the infinite cluster) There is a.s. at most one infinite cluster, for any value of $\alpha$ and $\beta$.
Theorem 3 (continuity of the percolation function) $\theta(\alpha, \beta)$ is continuous in $\alpha > 0$ and in $\beta > 0$.

Theorem 4 (continuity of the critical value) The function $\beta \mapsto \alpha_c(\beta)$ is continuous on $(N, N^2)$.

In order to prove Theorems 3 and 4, we need the following result, which is interesting in its own right.
Theorem 5 (size of large components) If $\alpha$ and $\beta$ are such that $\theta := \theta(\alpha, \beta) > 0$, then for every $\varepsilon > 0$,
$$\lim_{n \to \infty} P\big( |C^m_n(0)| > (\theta - \varepsilon) N^n \big) = 1.$$
In the next two sections we prove Theorem 1 and Theorem 2, respectively. After that we prove the remaining results, and we end with a discussion of possible generalisations.

Proof of Theorem 1
Proof of (a). Denote by $E_k$ the event that the origin shares an edge with at least one vertex at distance $k$. Then
$$P(E_k) = 1 - \exp\big( -\alpha (N-1) N^{k-1} \beta^{-k} \big),$$
and the events $(E_k)_{k \ge 1}$ are independent. It is easy to see that if $\beta \le N$ then $\sum_{k=1}^{\infty} P(E_k)$ diverges for any $\alpha > 0$. Therefore, by the second Borel-Cantelli lemma, infinitely many of the events $E_k$ occur with probability $1$, and $\theta(\alpha, \beta) = 1$ for any $\alpha > 0$ and $0 < \beta \le N$. This implies $\alpha_c(\beta) = 0$ for $0 < \beta \le N$.

⊓ ⊔
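For completeness, the divergence of the sum can be written out as follows (our reconstruction from the definitions, not verbatim from the paper): since there are $(N-1)N^{k-1}$ vertices at distance $k$ from the origin, each connected to it independently with probability $1 - \exp(-\alpha\beta^{-k})$,

```latex
P(E_k) \;=\; 1 - \exp\!\big(-\alpha (N-1) N^{k-1} \beta^{-k}\big)
       \;=\; 1 - \exp\!\Big(-\alpha \tfrac{N-1}{N} \big(\tfrac{N}{\beta}\big)^{k}\Big).
% For beta <= N the exponent is bounded below by alpha (N-1)/N for all k,
% so P(E_k) >= 1 - exp(-alpha (N-1)/N) > 0 and sum_k P(E_k) = infinity.
```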
Proof of (c). By monotonicity it suffices to prove that $\alpha_c(N^2) = \infty$, so we now take $\beta = N^2$. A straightforward computation shows that for every $j$,
$$P\big( B_j(0) \leftrightarrow \overline{B_j(0)} \big) = 1 - e^{-\alpha/N},$$
which is strictly less than $1$ and independent of $j$. Let $n_0 = 0$ and let $n_{i+1}$ be the smallest $n \ge n_i$ such that there is no edge between $B_{n_i}(0)$ and $\overline{B_n(0)}$; since the cluster of the origin can only be infinite if $B_{n_i}(0) \leftrightarrow \overline{B_{n_i}(0)}$ for every $i$, it is enough to prove that there a.s. exists $i$ such that $B_{n_i}(0) \nleftrightarrow \overline{B_{n_i}(0)}$. Because the events $\{B_{n_i}(0) \leftrightarrow \overline{B_{n_i}(0)}\}$ are independent and all have the same probability strictly less than $1$, the result follows.

⊓ ⊔
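The "straightforward computation" for $\beta = N^2$ can be sketched as follows (our reconstruction): each of the $N^j$ vertices of $B_j(0)$ sees $(N-1)N^{k-1}$ vertices at every distance $k > j$, all of which lie outside $B_j(0)$, so

```latex
P\big(B_j(0) \leftrightarrow \overline{B_j(0)}\big)
  = 1 - \exp\!\Big(-\alpha N^{j} \sum_{k > j} (N-1) N^{k-1} N^{-2k}\Big)
  = 1 - \exp\!\Big(-\alpha (N-1) N^{j-1} \cdot \tfrac{N^{-j}}{N-1}\Big)
  = 1 - e^{-\alpha/N},
% using  sum_{k > j} N^{-k} = N^{-j}/(N-1),
```

which is indeed strictly less than $1$ and independent of $j$.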
Proof of (b). The strict positivity of $\alpha_c(\beta)$ follows from the fact that the expected number of edges incident to a given vertex equals
$$\sum_{k=1}^{\infty} (N-1) N^{k-1} \big( 1 - \exp(-\alpha \beta^{-k}) \big) \le \frac{\alpha(N-1)}{\beta - N},$$
which can be made strictly smaller than $1$ by choosing $\alpha$ small enough. Hence the expected number of edges from a given vertex is strictly smaller than $1$, and by coupling with a subcritical branching process, the almost sure finiteness of the percolation cluster follows. The finiteness of $\alpha_c(\beta)$ is much more involved. Choose an integer $K$ and a real number $\eta$ such that $\sqrt{\beta} < \eta < N$ and $N^K - 1 \ge \eta^K$; this is possible since $\sqrt{\beta} < N$. We say that a ball of radius $nK$ is good if its largest connected component has size at least $\eta^{nK}$. Denote by $s_n$ the probability that a ball of radius $nK$ is good, that is, $s_n := P\big( |C^m_{nK}(0)| \ge \eta^{nK} \big)$. By convention, we set $s_0 = 1$.
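The bound on the expected degree can be written out as follows (our reconstruction, valid for $\beta > N$):

```latex
\sum_{k=1}^{\infty} (N-1) N^{k-1}\big(1 - e^{-\alpha \beta^{-k}}\big)
  \;\le\; \sum_{k=1}^{\infty} (N-1) N^{k-1} \alpha \beta^{-k}
  \;=\; \frac{\alpha (N-1)}{\beta} \sum_{k=0}^{\infty} \Big(\frac{N}{\beta}\Big)^{k}
  \;=\; \frac{\alpha(N-1)}{\beta - N},
% using 1 - e^{-x} <= x and the geometric series, which converges since beta > N.
```

so for $\alpha < (\beta - N)/(N-1)$ the cluster of the origin is dominated by a subcritical Galton-Watson process.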
We say that a ball of radius $nK$ is very good if it is good and, in addition, its largest component shares an edge with the largest component of the first (the one with the smallest index) good sub-ball of radius $nK$ in the same ball of radius $(n+1)K$. Note that according to this definition, the first good sub-ball of radius $nK$ in a ball of radius $(n+1)K$ is automatically very good.
Since $N^K - 1 \ge \eta^K$, the ball $B_{(n+1)K}(0)$ will certainly be good if (a) it contains at least $N^K - 1$ good sub-balls of radius $nK$, and (b) all these good sub-balls are very good.
We next estimate the probability of the events in (a) and (b). Clearly, the number of good sub-balls of radius $nK$ in a ball of radius $(n+1)K$ has a binomial distribution with parameters $N^K$ and $s_n$. Furthermore, given the collection of good sub-balls, the probability that the first such good sub-ball is very good is equal to $1$, and the probability for any of the other good sub-balls to be very good is at least
$$1 - \varepsilon_n, \qquad \text{where } \varepsilon_n := \exp\big( -\alpha \eta^{2nK} \beta^{-(n+1)K} \big),$$
since the distance between two vertices in a ball of radius $(n+1)K$ is at most $(n+1)K$ and the largest component of a good sub-ball contains at least $\eta^{nK}$ vertices. We conclude that the number of very good sub-balls is stochastically larger than a random variable having a binomial distribution with parameters $N^K$ and $s_n(1 - \varepsilon_n)$. It follows that
$$s_{n+1} \ge P\big( \mathrm{Bin}(N^K, s_n(1-\varepsilon_n)) \ge N^K - 1 \big), \tag{4}$$
where $\mathrm{Bin}(n,p)$ denotes a random variable with a binomial distribution with parameters $n$ and $p$. Notice that since $e^{-x} < 1/x$ for all $x > 0$, we have
$$\varepsilon_n \le \frac{\beta^{(n+1)K}}{\alpha\, \eta^{2nK}} = \frac{\beta^K}{\alpha} \Big( \frac{\beta}{\eta^2} \Big)^{nK},$$
and since $\eta^2 > \beta$, for any $\delta > 0$ we can find $\alpha$ so large that $\varepsilon_n \le \delta^n$ for all $n$.
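The behaviour of this recursion can be illustrated numerically (a sketch, not part of the proof): iterating $s_{n+1} = P(\mathrm{Bin}(M, s_n(1-\varepsilon_n)) \ge M-1)$ with $\varepsilon_n = \delta^n$ shows that $s_n$ climbs to $1$ once $s_1$ is close enough to $1$ and $\delta$ is small. The values of $M$, $s_1$ and $\delta$ below are arbitrary illustrative choices.

```python
def top_two_tail(M, p):
    """P(Bin(M, p) >= M - 1): at most one of the M trials fails."""
    return M * p ** (M - 1) * (1 - p) + p ** M

def iterate_s(M, s1, delta, steps=30):
    """Iterate s_{n+1} = P(Bin(M, s_n (1 - delta^n)) >= M - 1)."""
    s, out = s1, [s1]
    for n in range(1, steps):
        s = top_two_tail(M, s * (1 - delta ** n))
        out.append(s)
    return out
```

With $M = 16$, $s_1 = 0.999$ and $\delta = 10^{-3}$, the sequence increases and rapidly approaches $1$.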
The expression in (4) is very close to the iteration formulae in [11], and it pays to recall the setup from that paper first.
Let $\pi(s) := P\big( \mathrm{Bin}(N^K, s) \ge N^K - 1 \big)$. Writing $G_p(\cdot)$ for $\pi(p\,\cdot)$, we then obtain the iteration
$$u_{n+1} = G_p(u_n). \tag{5}$$
In [11], this iteration arises in the study of fractal percolation, and it is shown that the limit $u = \lim_{n \to \infty} u_n$ always exists and is positive if and only if $p$ is so large that the equation $G_p(x) = x$ has a positive solution. This result is very similar to (and can be proved in the same way as) the classical non-extinction criterion for branching processes. When $p$ approaches $1$, the largest solution of $G_p(x) = x$ also approaches $1$, at essentially the same rate, and therefore the limit $u$ approaches $1$ as well. Now observe that (4) can be rewritten as
$$s_{n+1} \ge G_{1-\varepsilon_n}(s_n).$$
This is very similar to (5), the only difference being that the subscript of the iteration function now depends on $n$. However, we already showed above that for $\alpha$ large enough, $\varepsilon_n$ goes down exponentially fast at any given rate. It is not hard to believe that this implies that $s_n$ converges to $1$ exponentially fast, and we make this precise now.
We have
$$1 - s_{n+1} \le P\big( \mathrm{Bin}(N^K, s_n(1-\varepsilon_n)) \le N^K - 2 \big) \le \binom{N^K}{2} \big( 1 - s_n(1-\varepsilon_n) \big)^2,$$
since the event on the right requires at least two of the $N^K$ trials to fail. Writing $\xi_n := 1 - s_n$ and $C := \binom{N^K}{2}$, and using $1 - s_n(1-\varepsilon_n) \le \xi_n + \varepsilon_n$, we arrive at the inequality
$$\xi_{n+1} \le C (\xi_n + \varepsilon_n)^2.$$
Choose first $\gamma$ so small that $4C \le \gamma^{-1}$, and then $\alpha$ so large that $\varepsilon_n \le \gamma^{n+1}$ for all $n$ and $\xi_1 \le \gamma^2$. In an inductive fashion, if $\xi_n \le \gamma^{n+1}$ then
$$\xi_{n+1} \le C\big( \gamma^{n+1} + \gamma^{n+1} \big)^2 = 4C\gamma^{2n+2} \le \gamma^{2n+1} \le \gamma^{n+2},$$
which implies that
$$\xi_n \le \gamma^{n+1} \quad \text{for all } n. \tag{9}$$
Hence, for $\alpha$ large enough, $s_n$ converges to $1$ exponentially fast. The exponential convergence of $s_n$ to $1$ is not quite enough for our purposes. Indeed, $s_n$ is the probability that a ball of radius $nK$ contains a component of size at least $\eta^{nK}$, but this component does not necessarily contain the origin. Therefore, we have to make one extra step. Let
$$t_n := P\big( |C_{nK}(0)| \ge \eta^{nK} \big).$$
We claim that
$$t_{n+1} \ge t_n\, P\big( \mathrm{Bin}(N^K - 1, s_n(1-\varepsilon_n)) \ge N^K - 2 \big). \tag{10}$$
To see this, we argue as before. If $|C_{nK}(0)| \ge \eta^{nK}$, then $B_{nK}(0)$ may serve as the first good sub-ball in the derivation above. If this component is connected to at least $N^K - 2$ other large components in $B_{(n+1)K}(0)$ as above, then the component of the origin in $B_{(n+1)K}(0)$ is large enough, that is, has size at least $\eta^{(n+1)K}$. From this, (10) follows. Since a simple coupling gives that
$$P\big( \mathrm{Bin}(N^K - 1, s_n(1-\varepsilon_n)) \ge N^K - 2 \big) \ge P\big( \mathrm{Bin}(N^K, s_n(1-\varepsilon_n)) \ge N^K - 1 \big),$$
and since the derivation above actually gives that the right-hand side of this inequality converges to $1$ exponentially fast, it follows that $\lim_{n \to \infty} t_n > 0$, which is enough to prove the result. ⊓ ⊔

Remarks (I) From the proof of the strict positivity of $\alpha_c(\beta)$ for $\beta > N$, we may also deduce that $\alpha_c(\beta) \ge (\beta - N)/(N-1)$. (II) Since we may choose $\gamma$ arbitrarily small in equation (9), we in fact have that for every $\varepsilon > 0$ we can choose $\alpha$ so large that $t_n > 1 - \varepsilon$ for all $n$. This implies that for every $\beta \in (N, N^2)$, we can choose $\alpha$ so large that $\theta(\alpha, \beta) > 1 - \varepsilon$.

Proof of Theorem 2
We will use Theorem 0 from [13]: if a stationary percolation model on $\mathbb{Z}$ satisfies the so-called positive finite energy condition, then there can be a.s. at most one infinite component.
In order to be able to use this result, we will first embed the metric generating tree into $\mathbb{Z}$ in a stationary (and ergodic) way. The embedding will be such that for each $r$: (a) any ball of radius $r$ is represented by $N^r$ consecutive integers; (b) the collection of balls of radius $r$ partitions $\mathbb{Z}$.
We first describe the construction rather loosely, and after that provide a formal construction. For ease of description, a collection of m consecutive integers is called an interval of length m.
The ball of radius $1$ containing $0$, that is, $B_1(0)$, is chosen uniformly at random among all $N$ possible intervals of length $N$ containing the origin of $\mathbb{Z}$. Once we have chosen this ball, all other balls of radius $1$ are determined by requirements (a) and (b) above, although it is not yet clear at this point to exactly which balls in $\Omega_N$ they correspond. To get an idea of this first step of the procedure, note that for $N = 2$ there are only two possibilities, one of which is depicted here; the other possibility is obtained by translating the edges over one unit to the right (or to the left, for that matter). Next, we determine $B_2(0)$. The ball $B_2(0)$ is a union of $N$ balls of radius $1$ and contains $B_1(0)$. There are $N$ possible ways to achieve this, keeping in mind that any ball of radius $2$ must -- according to (a) above -- be an interval of length $N^2$. We now simply choose one of the $N$ possible ways to do this, with probability $1/N$ each. Once we have chosen $B_2(0)$, all other balls of radius $2$ are determined for the same reason as before. The following picture illustrates a possible choice for $B_2(0)$ given the choice of $B_1(0)$ made before. That this picture is consistent with the previous one perhaps requires some reflection: one can see that this holds by first identifying the two $0$'s in both pictures, and then building up the balls $B_r(0)$, $r = 1, 2, \ldots$, in that order.
It is intuitively clear that this construction yields a stationary metric generating tree, in the sense that the distribution of the stochastic process which assigns to each pair $\{z, z'\}$ of points of $\mathbb{Z}$ the distance between them is invariant under integer translations. However, we would like to formalise the construction in such a way that not only does stationarity follow as an easy corollary, but we also obtain that the embedding of the metric generating tree is in fact ergodic with respect to translations.
A possible formal construction is the following. Our probability space is the unit interval $[0,1]$ endowed with Lebesgue measure on its Borel sigma-field. For $\gamma \in [0,1]$, let $\gamma = 0.\gamma_1\gamma_2\cdots$ be its $N$-adic expansion, that is,
$$\gamma = \sum_{i=1}^{\infty} \gamma_i N^{-i}, \qquad \gamma_i \in \{0, 1, \ldots, N-1\},$$
where we ignore those $\gamma$ for which the expansion is not unique -- this is a set of Lebesgue measure zero anyway. In the construction above, we saw that for each $r$, $B_{r-1}(0)$ can be seen as one of the balls of radius $r-1$ among the balls making up $B_r(0)$. The metric generating tree corresponding to $\gamma \in [0,1]$ is obtained as follows. We let $B_r(0)$ be such that $B_{r-1}(0)$ is the $(\gamma_r + 1)$-st ball in $B_r(0)$, counted from left to right. For instance, in the preceding two figures with $N = 2$, we have $\gamma_1 = 1$ and $\gamma_2 = 0$. The map which assigns to each $\gamma$ (apart from the exceptional null set discussed before) a metric generating tree is denoted by $\phi$. This map $\phi$ is invertible on a set of full Lebesgue measure.
It is clear that this construction formalises the informal description given earlier. Furthermore, one can write down explicitly the transformation $S : [0,1] \to [0,1]$ which corresponds to the left-shift $T$ on the space of metric generating trees, in the sense that $\phi \circ S = T \circ \phi$, hence $T = \phi S \phi^{-1}$. Indeed, a little reflection shows that $S$ can be described as follows: if $Y(\gamma) = \min\{ n : \gamma_n \neq N-1 \}$, then $S(\gamma)_k$ (that is, the $k$-th digit of $S(\gamma)$) is given by
$$S(\gamma)_k = \begin{cases} 0 & \text{if } k < Y(\gamma), \\ \gamma_k + 1 & \text{if } k = Y(\gamma), \\ \gamma_k & \text{if } k > Y(\gamma). \end{cases}$$
This transformation has been studied in the literature and goes by the name of Kakutani-von Neumann transformation, see e.g. [12]. It is easy to check that Lebesgue measure is invariant under the action of $S$, and this immediately proves that the construction of our random metric generating tree is stationary on $\mathbb{Z}$.
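In digit form, one step of $S$ is the $N$-adic "adding machine", which can be sketched as follows (our own code, acting on a finite truncation $(\gamma_1, \ldots, \gamma_m)$ of the digit sequence):

```python
def kvn_step(digits, N=2):
    """One step of the Kakutani-von Neumann map on the first m N-adic digits:
    digits before Y (the first digit != N-1) are reset to 0, digit Y is
    incremented, and all later digits are untouched."""
    d = list(digits)
    for i, g in enumerate(d):
        if g != N - 1:
            d[i] = g + 1
            return d
        d[i] = 0
    return d  # all digits were N-1; the carry moves beyond the truncation
```

On a truncation of length $m$, iterating $S$ cycles through all $N^m$ digit patterns, reflecting the fact that $S$ permutes the $N$-adic intervals of level $m$ cyclically (and hence preserves Lebesgue measure).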
Proof of Theorem 2. The construction above shows that the metric generating tree can be embedded into $\mathbb{Z}$ in a stationary way. We claim that this implies that the whole long-range percolation process on the hierarchical lattice can be realised as a stationary percolation process on $\mathbb{Z}$. To see this, we assign a uniform-$[0,1]$ distributed random variable $U_e$ to each edge $e$, in such a way that the collection is independent. Given a realisation of the metric generating tree, we declare edge $e$ to be open if $U_e \le 1 - \exp(-\alpha/\beta^{|e|})$, where $|e|$ denotes the length of $e$. This gives a realisation of the percolation process with the correct distribution, and shows that we have embedded the full percolation process on the hierarchical lattice into $\mathbb{Z}$ in a stationary way. Since every pair of vertices shares an edge with positive probability, irrespective of the presence or absence of other edges, the positive finite energy condition is met and the result follows.
⊓ ⊔

With a little more work one can also see that the construction is in fact ergodic, that is, any event which is invariant under the shift on $\mathbb{Z}$ has probability $0$ or $1$. To show this, we first show that Lebesgue measure on $[0,1]$ is ergodic with respect to $S$. This result is known, but we give a simple (and new) proof for the convenience of the reader.

Lemma 7 Lebesgue measure is the unique $S$-invariant probability measure on $[0,1]$; in particular, $S$ is ergodic with respect to Lebesgue measure.

Proof. For $m \ge 1$ and $k = 0, \ldots, N^m - 1$, write $I_{m,k} := [kN^{-m}, (k+1)N^{-m})$. The map $S$ permutes the $N^m$ intervals $I_{m,k}$ cyclically, so for every $\gamma \in [0,1]$ and every $m$ and $k$,
$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} 1_{I_{m,k}}(S^j\gamma) = N^{-m}, \tag{11}$$
where $1_A$ denotes the indicator function of $A$. Now consider the collection $M$ of invariant probability measures for the transformation $S$. From the fact that Lebesgue measure is invariant under $S$, we see that $M$ is not empty. It is well known and easy to see that the set $M$ is convex, and that the ergodic measures are precisely the extremal points of $M$. Since $M \neq \emptyset$, this implies that there is at least one ergodic measure with respect to $S$.

Let $\nu$ be any ergodic measure with respect to $S$. It then follows from the ergodic theorem and (11) that $\nu(I_{m,k}) = N^{-m}$ for any $m$ and $k = 0, \ldots, N^m - 1$. However, there is only one measure that satisfies this condition, namely Lebesgue measure on $[0,1]$. Hence $\nu$ must be Lebesgue measure, which we already know is indeed invariant. Since some ergodic measure exists and every ergodic measure equals Lebesgue measure, Lebesgue measure is ergodic. ⊓ ⊔

Theorem 8
The embedding of our long range percolation process on the hierarchical lattice into Z is ergodic.
Proof. From Lemma 7 it follows that the metric generating tree is embedded ergodically. Adding the i.i.d. random variables U e as before does not destroy ergodicity, and the final configuration is a factor of this ergodic process and hence ergodic itself. ⊓ ⊔

Proof of Theorem 5
The proof consists of three steps:
1. For every constant $K > 0$, the indicator function of the event that both $|C(0)| = \infty$ and $|C_n(0)| < K(\beta/N)^n$ converges a.s. to $0$ as $n \to \infty$.
2. The fraction of the vertices in $B_n(0)$ which are in a cluster of size at least $K(\beta/N)^n$ converges a.s. to $\theta$ as $n \to \infty$.
3. Combine the previous two steps.
Step 1. We compute, conditionally on $C_n(0)$ and on the event $|C_n(0)| \le K(\beta/N)^n$,
$$P\big( C_n(0) \leftrightarrow \overline{B_n(0)} \,\big|\, C_n(0) \big) = 1 - \exp\Big( -\alpha |C_n(0)| \sum_{k > n} (N-1)N^{k-1}\beta^{-k} \Big) \le 1 - \exp\Big( -\frac{\alpha K (N-1)}{\beta - N} \Big),$$
where we used $\sum_{k > n} (N-1)N^{k-1}\beta^{-k} = (N-1)N^n/(\beta^n(\beta - N))$. Let $n_1$ be the smallest $n$ for which $|C_n(0)| \le K(\beta/N)^n$; if $C_{n_i}(0) \leftrightarrow \overline{B_{n_i}(0)}$, then $n_{i+1}$ is the smallest $n > n_i$ for which $C_{n_i}(0) \nleftrightarrow \overline{B_n(0)}$ and for which $|C_n(0)| \le K(\beta/N)^n$. The right-hand side of the display above is strictly less than $1$ and independent of $n_i$. So there will a.s. be an $n_i$ for which $C_{n_i}(0) \nleftrightarrow \overline{B_{n_i}(0)}$, and it follows that the indicator of the event $\{|C(0)| = \infty,\ |C_n(0)| < K(\beta/N)^n\}$ converges a.s. to $0$.

Step 2. We use the random embedding of the hierarchical lattice into $\mathbb{Z}$, introduced in the previous section. By Theorem 8 and the ergodic theorem we have, for every $k > 0$,
$$\lim_{M \to \infty} \frac{1}{2M+1} \sum_{x=-M}^{M} 1\big\{ |C_k(x)| > K(\beta/N)^k \big\} = P\big( |C_k(0)| > K(\beta/N)^k \big) \quad \text{a.s.}$$
From Step 1 we know that this final probability increases to $\theta$ as $k \to \infty$, and it follows that the a.s. limiting fraction of vertices of $\mathbb{Z}$ in a cluster of size at least $K(\beta/N)^k$ can be made arbitrarily close to $\theta$. Note that the collection of vertices $\{-N^n, -N^n+1, -N^n+2, \ldots, N^n\}$ contains the image under the embedding of the ball $B_n(0)$, and this image contains a fraction $N^n/(2N^n+1)$ of those vertices. Whether or not $|C_n(x)| > K(\beta/N)^n$ is decided independently for vertices in different $n$-balls, so the fraction of the vertices in $B_n(0)$ which are in a cluster of size at least $K(\beta/N)^n$ converges a.s. to $\theta$.

Step 3. The strategy is to split those components in $B_{n+1}(0)$ which have size at least $K(\beta/N)^n$ into clusters of size roughly $K(\beta/N)^n$. Then we use those clusters as "meta-vertices" of an $N$-partite graph, in which meta-vertices in different $n$-balls are connected if the clusters they represent are connected by an edge of length $n+1$. Meta-vertices in the same $n$-ball never share an edge. We show that if we choose $K$ and $n$ large enough, then the largest component of the graph of meta-vertices contains a fraction of the meta-vertices close to $1$, which shows that for large $n$, the fraction of vertices in the largest cluster of $B_{n+1}(0)$ is close to $\theta$. We will now be more precise.
By Step 2 we know that for every $K > 0$, every $\varepsilon > 0$ and all large enough $n$, it holds that
$$P\Big( \big| \{ x \in B_n(y) : |C_n(x)| > K(\beta/N)^n \} \big| \ge (\theta - \varepsilon)N^n \Big) \ge 1 - \varepsilon.$$
We now fix $\varepsilon$. The ball $B_n(y)$ is said to be good if
$$\big| \{ x \in B_n(y) : |C_n(x)| > K(\beta/N)^n \} \big| \ge (\theta - \varepsilon)N^n.$$
We condition on the event that all $n$-balls in $B_{n+1}(0)$ are good. The probability of this event is bounded below by $(1-\varepsilon)^N > 1 - N\varepsilon$. Now, for every good ball $B_n(y)$, $y \in \Omega_N$, we partition the set $\{ x \in B_n(y) : |C_n(x)| > K(\beta/N)^n \}$ into "meta-vertices". For the moment we denote this set by $B'_n(y)$. For $x \in B'_n(y)$ we partition $C_n(x)$ into $\lfloor |C_n(x)| / \lceil K(\beta/N)^n \rceil \rfloor$ sets, all of size at least $\lceil K(\beta/N)^n \rceil$.
Here $\lceil x \rceil := \inf\{ n \in \mathbb{Z} : n \ge x \}$ is the ceiling of $x$ and $\lfloor x \rfloor := \sup\{ n \in \mathbb{Z} : n \le x \}$ is the floor of $x$. The vertices that are not in such a set are ignored for the moment. Denote the collection of meta-vertices that contain vertices of $B_{n+1}(0)$ by $V_n$. We note that if $B_n(y)$ is good and $K$ is large enough, then it contains at least $(\theta - \varepsilon)N^n / \lceil 2K(\beta/N)^n \rceil \ge (\theta - \varepsilon)N^n / (3K(\beta/N)^n)$ meta-vertices.
We construct a new $N$-partite graph on $V_n$ as follows. Let $V_n$ be the vertex set and let $E_n$ be the set of edges between those vertices, obtained as follows. Choose $\lceil K(\beta/N)^n \rceil$ original vertices from every meta-vertex in $V_n$; choosing these vertices may be done in any way that is independent of the presence of edges of length $n+1$ or larger. Denote these chosen sets by $A_n$-sets. Two meta-vertices $x, y \in V_n$ share an edge in $E_n$ if the original vertices that make up $x$ and $y$ are at distance $n+1$ from each other, and there is at least one edge in the original graph between the corresponding $A_n$-sets. Otherwise there is no edge between the meta-vertices.
As observed before, the number of meta-vertices in $V_n$ that consist of vertices from a good ball $B_n(x)$ is at least $(\theta - \varepsilon)N^n/(3K(\beta/N)^n)$. Since $\beta < N^2$, this quantity grows to $\infty$ as $n \to \infty$. The expected degree of a vertex in $V_n$ exceeds
$$(N-1)\, \frac{(\theta - \varepsilon)N^n}{3K(\beta/N)^n} \Big( 1 - \exp\big( -\alpha K^2 (\beta/N)^{2n} \beta^{-(n+1)} \big) \Big),$$
which is larger than $\lambda := (N-1)(\theta - \varepsilon)\alpha K/(6\beta)$ for all large enough $n$. This holds for every $K > 0$, and therefore the expected degree can be made arbitrarily large.
This $N$-partite graph falls within the class of inhomogeneous random graphs of Bollobás, Janson and Riordan [7]. The degree of every meta-vertex is asymptotically Poisson distributed, with mean bounded below by $\lambda$, and we know that the (unique) largest component of such an $N$-partite graph contains with high probability (in the limit $n \to \infty$) a fraction $\rho$ of the meta-vertices, where $\rho$ is the largest solution of
$$\rho = 1 - e^{-\lambda \rho}.$$
By tuning $K$, $\lambda$ can be made arbitrarily large, and $\rho$ can be taken such that $\rho > 1 - \varepsilon$. So, for every $\varepsilon > 0$ and large enough $n$, the graph $(V_n, E_n)$ contains a unique giant component, which contains a fraction at least $1 - \varepsilon$ of the vertices of $V_n$, with probability at least $1 - \varepsilon$.
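For illustration (our own sketch, not from [7]): the largest solution $\rho$ of $\rho = 1 - e^{-\lambda\rho}$ can be found by fixed-point iteration started at $1$, and it tends to $1$ as $\lambda \to \infty$.

```python
import math

def giant_fraction(lam, iters=500):
    """Largest solution of rho = 1 - exp(-lam * rho), iterated from rho = 1.
    Starting above the fixed point, the increasing map decreases to it."""
    rho = 1.0
    for _ in range(iters):
        rho = 1 - math.exp(-lam * rho)
    return rho
```

For example, $\rho \approx 0.797$ for $\lambda = 2$, while $\rho > 0.9999$ for $\lambda = 10$; for $\lambda \le 1$ the iteration collapses to the trivial solution $\rho = 0$.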
Since we have conditioned on the event that all $n$-balls in $B_{n+1}(0)$ are good, the fraction of vertices of $B_{n+1}(0)$ that belong to meta-vertices in $V_n$ is bounded below by $\theta - 2\varepsilon$. (The factor $2$ is due to the fact that the sizes of different meta-vertices differ by at most a factor $2$.) Therefore, with the same conditioning, the largest cluster in $B_{n+1}(0)$ has size at least $(1-2\varepsilon)(\theta-2\varepsilon)N^{n+1}$ with probability exceeding $1 - \varepsilon$. Multiplying by the probability that all $n$-balls in $B_{n+1}(0)$ are good gives that the probability that the largest cluster in $B_{n+1}(0)$ has size at least $(1-2\varepsilon)(\theta-2\varepsilon)N^{n+1}$ is bounded below by $(1-\varepsilon)(1-N\varepsilon)$. Given $\varepsilon' > 0$, choosing $\varepsilon \in (0, \varepsilon'/\max(4, N+1))$ above, we obtain that $P\big( |C^m_n(0)| > (\theta - \varepsilon')N^n \big)$ is at least $1 - \varepsilon'$ for all large $n$, and this finishes the proof. ⊓ ⊔

Remark We realise that it is possible to prove the statement of Step 2 by using the strong law of large numbers. If we do this, then it is only a small step from the proof of Theorem 5 to a proof of Theorem 2. However, we think that the proof presented in the previous section contains some valuable ideas, and therefore it should be included in this paper.

Proof of Theorem 3
Continuity proofs of percolation functions typically split into separate proofs of left and right continuity, one of which usually follows from standard arguments. In this case, continuity from the right in $\alpha$ and continuity from the left in $\beta$ are the easy parts:

Lemma 9 $\theta(\alpha, \beta)$ is continuous from the right in $\alpha > 0$ and continuous from the left in $\beta > 0$.
Proof. For $\alpha > 0$ and $\beta \le N$ we have $\theta(\alpha, \beta) = 1$, so the statement of the lemma holds in that domain. Note that
$$\theta(\alpha, \beta) = \lim_{i \to \infty} P\big( C_i(0) \leftrightarrow \overline{B_i(0)} \big).$$
A straightforward computation yields that for $\beta > N$,
$$P\big( C_i(0) \leftrightarrow \overline{B_i(0)} \big) = E\Big[ 1 - \exp\Big( -\alpha |C_i(0)|\, \frac{(N-1)N^i}{\beta^i(\beta - N)} \Big) \Big].$$
Since $|C_i(0)|$ depends on the state of only finitely many edges, this expectation is continuous in $\alpha$ and $\beta$. In particular, $P( C_i(0) \leftrightarrow \overline{B_i(0)} )$ is continuous from the left in $\beta$ for $\beta > N$, and therefore it is continuous from the left for $\beta > 0$ (in fact, it is also continuous from the right). Furthermore, $P( C_i(0) \leftrightarrow \overline{B_i(0)} )$ is increasing in $\alpha$, decreasing in $\beta$ and decreasing in $i$. Because a decreasing limit of increasing (resp. decreasing) functions which are continuous from the right (resp. left) is continuous from the right (resp. left), the statement follows. ⊓ ⊔

In order to prove that $\theta(\alpha, \beta)$ is continuous from the left in $\alpha > 0$ and continuous from the right in $\beta > 0$, we use a renormalisation argument. Fix $\alpha > 0$ and $N \le \beta < N^2$. To get insight into the argument, we first (falsely) assume that for given $\varepsilon > 0$ and large enough finite $k$ there is a $\delta > 0$ such that
$$P\big( |C^m_k(0)| \ge \delta N^k \big) = 1.$$
(Although this assumption is false, we can get this probability arbitrarily close to $1$ by choosing $k$ large enough, $\delta$ small enough, and using Theorem 5.) Now we use renormalisation. The balls of radius $k$ are considered as vertices of $\Omega_N$, which we call "meta-vertices". If two vertices in the original model are at distance $k + l$, then the meta-vertices containing them are at distance $l$. Meta-vertices in the new model are connected if and only if the largest clusters in the original $k$-balls represented by these meta-vertices are connected by an edge. The new model is again a percolation model on $\Omega_N$. Let $x$ and $y$ be meta-vertices at distance $l$ from each other.
Define, for $\delta > 0$ small,
$$\alpha' := \alpha \delta^2 N^{2k} (\beta + \delta)^{-k}.$$
Given the states of all other edges, the (conditional) probability that $x$ and $y$ are connected to each other is always bounded below by
$$1 - \exp\big( -\alpha (\delta N^k)^2 \beta^{-(k+l)} \big) \ge 1 - \exp\big( -\alpha' (\beta+\delta)^{-l} \big),$$
and by the choice of $\alpha'$, the right-hand side is just the connection probability at distance $l$ for the parameters $\alpha'$ and $\beta + \delta$. Hence, the renormalised model stochastically dominates the percolation model with parameters $\alpha'$ and $\beta + \delta$.
Since $N^2/(\beta + \delta) > 1$, $\alpha'$ can be made arbitrarily large by choosing $k$ large. In particular, it can be chosen such that $\theta(\alpha', \beta + \delta) > 1 - \varepsilon$ (by the second remark after the proof of Theorem 1). It follows that for large enough $k$ the renormalised model percolates. The only problem is that we have incorrectly assumed that $P\big( |C^m_k(0)| \ge \delta N^k \big) = 1$, and we will deal with this problem now. We need the notion of mixed percolation (cf. [8]). In mixed percolation, vertices are removed independently of each other, each together with all of its adjacent edges. Formally, the measure $P^{\mathrm{mixed}}_{\alpha,\beta,\gamma}$ is constructed as follows: each vertex is first removed with probability $\gamma$, independently of all other vertices; on the remaining vertices, edges are then drawn as in the original percolation model with parameters $\alpha$ and $\beta$.

Lemma 10 For every $\alpha$, $\beta$ and $\varepsilon > 0$, there is a $\gamma > 0$ such that the mixed percolation model with parameters $\alpha(1+\varepsilon)$, $\beta$ and $\gamma$ stochastically dominates the ordinary percolation model with parameters $\alpha$ and $\beta$.

Before giving the proof of this result, we show how it can be used to prove Theorem 3. The following lemma suffices.
Proof. Fix $\alpha$, $\beta$ and $\varepsilon > 0$. Let $\alpha'$ be such that $\theta(\alpha', \beta) > 1 - \varepsilon/3$, which is possible by Theorem 1(b) and the second remark after its proof. Furthermore, let $\gamma \in (0, \varepsilon/3)$ be such that the conclusion of Lemma 10 holds, which is possible by Lemma 10. Let $K$ be such that the two conditions needed below are satisfied, which is possible by $N^2 > \beta$ and Theorem 5, respectively. Finally, let $\delta > 0$ be such that $\delta < \min(\alpha/3, (N^2 - \beta)/2)$ and such that the relevant finite-volume probabilities change by less than $\varepsilon/3$, which is possible by the continuity of these probabilities in $\alpha$ and $\beta$ for finite $K$.
We say that the ball $B_K(x)$ is good if $C^m_K(x)$ has size at least $(\theta(\alpha,\beta) - \varepsilon/3)N^K$. Delete all vertices that are in a ball of radius $K$ which is not good, and also all vertices that are not in the largest cluster of a good ball. As above, we interpret the remaining components as the vertices of the hierarchical lattice of order $N$, in which vertices are independently deleted with probability at most $\gamma$, by (15). Remaining clusters in the original graph whose vertices are at distance $K + l$ from each other are connected by at least one edge with probability at least $1 - \exp(-2\alpha'\beta^{-l})$ (by the choice of $K$), irrespective of the existence or absence of other connections; here we have used that $\alpha - \delta > 2\alpha/3$. Hence the rescaled process stochastically dominates a mixed percolation process with parameters $2\alpha'$, $\beta$ and $\gamma$. Now note that, by exchangeability, conditioned on $B_K(0)$ being good, the probability that $0$ is in the largest cluster of $B_K(0)$ is at least $\theta(\alpha,\beta) - \varepsilon/3$. Furthermore, conditioned on $0$ being in the largest cluster of a good ball, the probability that $0$ is in an infinite cluster when the parameters are $\alpha - \delta$ and $\beta + \delta$ is larger than $1 - \varepsilon/3$. Combining these observations and $\gamma < \varepsilon/3$ gives $\theta(\alpha - \delta, \beta + \delta) \ge \theta(\alpha, \beta) - \varepsilon$.

Given the vertices (open or closed), the presence or absence of an edge is independent of the presence or absence of other edges. The corresponding measure we denote by $\hat{P}^{\mathrm{mixed}}_{\alpha,\beta,\gamma}$. The set of vertices which can be reached by a path from vertex $x$ is denoted by $\hat{C}(x)$. Note that in the directed model, the presence of a path from $x$ to $y$ does not necessarily imply that there exists a path from $y$ to $x$. We define the directed version of the original (not mixed) measure, $\hat{P}_{\alpha,\beta}$, in a similar way, and note that $\hat{P}_{\alpha,\beta} = \hat{P}^{\mathrm{mixed}}_{\alpha,\beta,0}$. Standard arguments (see e.g. [9,16]) can be used to compare the directed and undirected models.

Proof of Lemma 10. The directed mixed percolation graph with parameters $\alpha$, $\beta$ and $\gamma$ can be obtained as follows (the ordinary model is obtained by taking $\gamma = 0$). We assign i.i.d. random variables $X_x$ to the vertices $x \in \Omega_N$, all Poisson distributed with parameter $\alpha(N-1)/(\beta - N)$.
We construct a directed multigraph (a graph in which multiple edges between two vertices in the same direction are allowed). Vertices are open with probability $1 - \gamma$, independently of each other. If $x$ is open, then $X_x$ directed edges start at $x$. The endpoints of these edges are chosen independently from $\Omega_N \setminus \{x\}$, a vertex at distance $r$ from $x$ being chosen with probability $(\beta - N)(N-1)^{-1}\beta^{-r}$. If $x$ is closed, then no edges start at $x$. We obtain the original directed graph by replacing the collection of all edges from $x$ to $y$ (if there is at least one) by a single edge from $x$ to $y$, for all $x, y \in \Omega_N$. Let $Z_1$ be a Poisson distributed random variable with parameter $\alpha(N-1)/(\beta - N)$. Furthermore, let $Z_2 = Y_1 Y_2$, where $Y_1$ is equal to $1$ with probability $1 - \gamma$ and equal to $0$ with probability $\gamma$, and where $Y_2$ is independent of $Y_1$ and Poisson distributed with parameter $\alpha(1+\varepsilon)(N-1)/(\beta - N)$. For the ordinary percolation model, the number of edges starting at $x$ in the multigraph is distributed as $Z_1$, while for the mixed percolation model, the number of edges starting at $x$ is distributed as $Z_2$. It is now easy to check that for every $\varepsilon > 0$ there is a $\gamma > 0$ such that $P(Z_1 = 0) = P(Z_2 = 0)$, and that for this $\gamma$ and all $k > 0$ we have $P(Z_2 \ge k) \ge P(Z_1 \ge k)$. The statement of Lemma 10 now follows by a straightforward coupling argument.
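The choice of $\gamma$ and the resulting tail comparison can be checked numerically (a sketch with arbitrary illustrative values $m = \alpha(N-1)/(\beta-N) = 1$ and $\varepsilon = 0.5$; the function names are ours):

```python
import math

def poisson_tail(mean, k):
    """P(Poisson(mean) >= k)."""
    return 1 - sum(math.exp(-mean) * mean ** j / math.factorial(j)
                   for j in range(k))

def matching_gamma(m, eps):
    """The gamma with P(Z1 = 0) = P(Z2 = 0), where Z1 ~ Poisson(m) and
    Z2 = Y1 * Y2 with Y1 ~ Bernoulli(1 - gamma), Y2 ~ Poisson(m (1 + eps))."""
    return ((math.exp(-m) - math.exp(-m * (1 + eps)))
            / (1 - math.exp(-m * (1 + eps))))
```

With this $\gamma$ one can verify that $P(Z_2 \ge k) \ge P(Z_1 \ge k)$ for all $k \ge 1$, which is the stochastic domination used in the coupling.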

Proof of Theorem 4
The proof of Theorem 4 is split into separate proofs of continuity from the right and from the left of α c (β).
To prove this, we use [1]. In that paper, a corresponding statement is shown for long-range percolation on $\mathbb{Z}^d$. Inspection of the proof of this result yields that the proof also works on the hierarchical lattice. Now use the following lemma.

⊓ ⊔
Proof of Lemma 14. Assign independent uniform-$(0,1)$ random variables to all pairs of vertices in $\Omega_N$. The random variable assigned to the pair $(x,y)$ is denoted by $U(x,y)$. We say that $x$ and $y$ share an edge for the parameters $\alpha$ and $\beta$ if $U(x,y) < 1 - \exp(-\alpha\beta^{-d(x,y)})$. This construction provides a coupling of the long-range percolation models for all values of $\alpha$ and $\beta$ simultaneously. Define $C(x; \alpha, \beta)$ as the cluster of vertices that can be reached from $x$ by paths when the parameters are $\alpha$ and $\beta$.
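The coupling can be sketched in code (our own illustration; `dist` encodes the hierarchical distances of a few vertex pairs): edges are monotone in $\alpha$ because each shared uniform $U(x,y)$ is compared with a threshold that increases with $\alpha$.

```python
import math
import random

def coupled_edge_sets(dist, alphas, beta, seed=0):
    """Open-edge sets for several alpha values from one family of shared
    uniforms: pair e is open iff U_e < 1 - exp(-alpha * beta**(-dist[e]))."""
    rng = random.Random(seed)
    U = {e: rng.random() for e in dist}
    return {a: {e for e in dist
                if U[e] < 1 - math.exp(-a * beta ** (-dist[e]))}
            for a in alphas}
```

Since the open-edge sets are nested in $\alpha$, the cluster $C(x; \alpha, \beta)$ is non-decreasing in $\alpha$ under this coupling.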

Possible generalizations
Possible generalizations of the model considered in this paper include: 1. Randomness in the hierarchical lattice. The hierarchical lattice $\Omega_N$ is generated by an $N$-regular tree. An interesting question is how randomness in the underlying tree (or in the induced random metric) affects the percolation process on the resulting lattice. One possibility is to let the metric generating tree be a Galton-Watson tree. The analysis of long-range percolation on such random hierarchical structures is not a trivial extension of the analysis in this paper, since in renormalisation schemes one has to take care of all kinds of dependencies between the sizes of balls of a given diameter.

2. More general connection functions. In this paper we focused on $p_k = 1 - \exp(-\alpha\beta^{-k})$. What are necessary and sufficient conditions on a function $g(k)$ such that, when $p_k = 1 - \exp(-\alpha g(k))$, we have $0 < \alpha_c < \infty$?
3. Random cluster models. We only consider independent percolation on the hierarchical lattice; we did not yet try to incorporate the random cluster (or Fortuin-Kasteleyn) model [14]. Some work has already been done for the Ising model on the hierarchical lattice [15].