On the internal distance in the interlacement set

We prove a shape theorem for the internal (graph) distance on the interlacement set $\mathcal{I}^u$ of the random interlacement model on $\mathbb Z^d$, $d\ge 3$. We provide large deviation estimates for the internal distance of distant points in this set, and use these estimates to study the internal distance on the range of a simple random walk on a discrete torus.


Introduction and the results
We study properties of the interlacement set $\mathcal{I}^u$ of the random interlacement model. We are mainly interested in its connectivity properties, in particular in the internal distance (sometimes called the chemical distance) on the interlacement cluster.
The random interlacement model was introduced in [Szn10] in order to describe the microscopic structure in the bulk which arises when studying the disconnection time of a discrete cylinder or the vacant set of random walk on a discrete torus. It can be informally described as a dependent site percolation on $\mathbb Z^d$, $d \ge 3$, which is 'generated' by a Poisson cloud of independent simple random walks whose intensity is driven by a non-negative multiplicative parameter $u$. The set covered by these random walks is called the interlacement set at level $u$ and is denoted by $\mathcal{I}^u$. As the precise definition of $\mathcal{I}^u$ is rather lengthy, we postpone it to Section 2 and state our results first.
Let $P^u_0 = \mathbb P[\,\cdot \mid 0 \in \mathcal{I}^u]$ be the conditional distribution given that the origin is in the interlacement set $\mathcal{I}^u$. For $x, y \in \mathcal{I}^u$ we define $\rho_u(x, y)$ to be the internal distance between $x$ and $y$ within the interlacement set $\mathcal{I}^u$:
$$\rho_u(x, y) = \min\{n : \text{there exist } x_0, x_1, \dots, x_n \in \mathcal{I}^u \text{ such that } x_0 = x,\ x_n = y, \text{ and } \|x_k - x_{k-1}\|_1 = 1 \text{ for all } k = 1, \dots, n\},$$
where $\|\cdot\|_1$ denotes the $\ell^1$-norm in $\mathbb Z^d$. As we shall see below, the set $\mathcal{I}^u$ is $\mathbb P$-a.s. connected for all $u$, so $\rho_u(x, y) < \infty$ for all $u > 0$ and $x, y \in \mathcal{I}^u$. Assuming that $x \in \mathcal{I}^u$, let $\Lambda^u(x, n) = \{y \in \mathcal{I}^u : \rho_u(x, y) \le n\}$ be the ball centred at $x$ with radius $n$ in the internal distance. We abbreviate $\Lambda^u(n) := \Lambda^u(0, n)$.
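To make the definition of the internal distance concrete, here is a minimal sketch that computes it by breadth-first search on an arbitrary finite set of lattice sites. The function names `bfs_distances`, `internal_distance` and `internal_ball` are hypothetical and serve only as an illustration; the toy set below is a U-shaped path, not a sample of the interlacement set.

```python
from collections import deque


def bfs_distances(sites, x, radius=None):
    """Internal distances from x to the sites reachable inside `sites`.

    `sites` is a set of integer tuples (points of Z^d) and x must belong to it.
    Steps are nearest-neighbour steps, i.e. of l1-length one.
    If `radius` is given, the search stops at that internal distance.
    """
    dist = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        if radius is not None and dist[v] >= radius:
            continue
        for i in range(len(v)):
            for step in (-1, 1):
                w = v[:i] + (v[i] + step,) + v[i + 1:]
                if w in sites and w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
    return dist


def internal_distance(sites, x, y):
    """Internal distance between x and y, or None if they are not connected."""
    return bfs_distances(sites, x).get(y)


def internal_ball(sites, x, n):
    """Ball of radius n around x in the internal distance."""
    return set(bfs_distances(sites, x, radius=n))


# Toy example: a U-shaped set in Z^3 (an assumption made for illustration).
U = {(0, 0, 0), (0, 1, 0), (0, 2, 0), (1, 2, 0), (2, 2, 0), (2, 1, 0), (2, 0, 0)}
print(internal_distance(U, (0, 0, 0), (2, 0, 0)))
print(sorted(internal_ball(U, (0, 0, 0), 2)))
```

In this toy set the two endpoints are at $\ell^1$-distance 2 but at internal distance 6, which is the kind of discrepancy that Theorem 1.3 controls on large scales.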
The first main result of this paper is the shape theorem for large balls in the internal distance.
Remark 1.2. Clearly, the set $D_u$ is symmetric under rotations and reflections of $\mathbb Z^d$ and $D_u \subset \{x \in \mathbb R^d : \|x\|_1 \le 1\}$ for all $u$. It is straightforward to show that $D_u \to \{x \in \mathbb R^d : \|x\|_1 \le 1\}$ as $u \to \infty$; it would be interesting, however, to be able to say something about the behaviour of $D_u$ as $u \to 0$ (e.g., does the shape become close to the Euclidean ball, and what can be said about the size of $D_u$ as $u \to 0$?).
The key technical step in the proof of Theorem 1.1 is the fact (which is of independent interest) that the distance within the interlacement cluster is typically of the same order as the usual distance.
Theorem 1.3. For every $u > 0$ and $d \ge 3$ there exist constants $C, C' < \infty$ and $\delta \in (0, 1)$ such that
$$P^u_0[\text{there exists } x \in \mathcal{I}^u \cap [-n, n]^d \text{ such that } \rho_u(0, x) > Cn] \le C' e^{-n^{\delta}}.$$
A corresponding result for Bernoulli percolation on $\mathbb Z^d$ was proved by Antal and Pisztora; in their case the constant $\delta$ equals one and is optimal, see [AP96, Theorem 1.1]. We did not try to optimise the constant $\delta$ in Theorem 1.3.
Remark 1.4. It is trivial to replace $P^u_0$ by $\mathbb P$ in Theorems 1.1 and 1.3. To this end it is only necessary to extend $\rho_u(x, y)$ to all $x, y \in \mathbb Z^d$ by setting $\rho_u(x, y) := \rho_u(x_u, y_u)$, where $x_u$ (respectively, $y_u$) is the closest point to $x$ (respectively, $y$) on $\mathcal{I}^u$ (one can choose the rule for breaking ties in any convenient translation-invariant way).
The methods used to show Theorem 1.1 also imply the following result.
Theorem 1.5. It holds that $\mathbb P[\mathcal{I}^u \text{ is connected for all } u > 0] = 1$.
Previously it was known that for every fixed $u > 0$, the set $\mathcal{I}^u$ is $\mathbb P$-a.s. connected (see (2.21) in [Szn10]); the above theorem means that $\mathbb P$-a.s. there are no 'exceptional values' of the parameter $u$. Note also that much more is known about the connectivity of $\mathcal{I}^u$ for fixed $u$, see [PT11,RS12].
Theorems 1.1 and 1.3 indicate that the interlacement set $\mathcal{I}^u$ looks at large scales very much like $\mathbb Z^d$. In the same direction, Ráth and Sapozhnikov recently proved that the interlacement set $\mathcal{I}^u$ percolates in slabs [RS11a], and that random walk on $\mathcal{I}^u$ is transient [RS11b].
Theorem 1.3 can also be used to answer a related question: 'How much does the range of the random walk on the torus resemble the torus?' To this end we consider $(X_k)_{k\in\mathbb N}$ to be a simple random walk on the discrete $d$-dimensional torus of size $N$, $\mathbb T^d_N = (\mathbb Z/N\mathbb Z)^d$, and write $P_N$ for its law when started from the uniform distribution. We let $\mathcal{I}^u_N$ denote the range of the random walk up to time $uN^d$, $\mathcal{I}^u_N = \{X_0, \dots, X_{\lfloor uN^d \rfloor}\}$. Let $\rho^u_N(x, y)$ be the minimal distance of $x, y \in \mathcal{I}^u_N$ within $\mathcal{I}^u_N$, defined similarly as $\rho_u$, and let $d_N(x, y)$ be their usual graph distance on the torus.
Theorem 1.6. For large enough $\bar C$ and $\gamma$, we have
$$\lim_{N \to \infty} P_N\bigl[\rho^u_N(x, y) \le \bar C d_N(x, y) \text{ for all } x, y \in \mathcal{I}^u_N \text{ such that } d_N(x, y) \ge \ln^{\gamma} N\bigr] = 1.$$
This theorem improves the result of Shellef [She10], where a similar claim was proved for $\bar C$ growing very slowly with $N$ using entirely different methods. More precisely, [She10] requires $\bar C = \ln^{(k)} N$, where $\ln^{(k)}$ is the $k$-times iterated logarithm, $k \ge 1$ being arbitrary. On the other hand, Shellef needs $\gamma = 5d$ only; we do not have control on the size of this constant.
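The quantities compared in Theorem 1.6 can be explored numerically on very small tori. The following sketch, under the assumption that a brute-force search over all pairs is affordable, simulates the range of the walk up to time $\lfloor uN^d \rfloor$ and reports the largest ratio between the distance within the range and the torus distance. All function names and the parameter `min_dist` (playing the role of the threshold $\ln^{\gamma} N$) are hypothetical, and such small experiments say nothing about the actual constants in the theorem.

```python
import random
from collections import deque


def torus_walk_range(N, d, u, rng):
    """Sites visited up to time floor(u * N^d) by a walk started uniformly."""
    steps = int(u * N ** d)
    x = tuple(rng.randrange(N) for _ in range(d))
    visited = {x}
    for _ in range(steps):
        i = rng.randrange(d)
        s = rng.choice((-1, 1))
        x = x[:i] + ((x[i] + s) % N,) + x[i + 1:]
        visited.add(x)
    return visited


def torus_distance(x, y, N):
    """Graph distance between x and y on the torus (Z/NZ)^d."""
    return sum(min((a - b) % N, (b - a) % N) for a, b in zip(x, y))


def range_distances(visited, x, N):
    """Distances from x within the range, by breadth-first search."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        v = queue.popleft()
        for i in range(len(v)):
            for s in (-1, 1):
                w = v[:i] + ((v[i] + s) % N,) + v[i + 1:]
                if w in visited and w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
    return dist


def max_stretch(visited, N, min_dist):
    """Largest ratio (distance within the range) / (torus distance)
    over pairs at torus distance at least max(1, min_dist)."""
    threshold = max(1, min_dist)
    worst = 1.0
    for x in visited:
        for y, r in range_distances(visited, x, N).items():
            dN = torus_distance(x, y, N)
            if dN >= threshold:
                worst = max(worst, r / dN)
    return worst


rng = random.Random(0)
visited = torus_walk_range(N=6, d=3, u=1.0, rng=rng)
print(len(visited), max_stretch(visited, N=6, min_dist=2))
```

The range of a single walk is always connected, so every pair of visited sites has a finite distance within the range; the content of Theorem 1.6 is that the ratio stays bounded for distant pairs as $N$ grows.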
The main difficulty of the paper lies in proving our results for $d < 5$, in particular for $d = 3$. In fact, for $d \ge 5$ there is a rather simple argument, based on the results of [RS11b], which shows Theorem 1.3 with $\delta = 1$, and which we sketch in the Appendix. This argument uses the fact that for $d \ge 5$ the random interlacement restricted to a thick-enough two-dimensional slab dominates in some sense the standard Bernoulli percolation, which allows an application of [AP96]. Heuristically, in large dimensions it is possible to construct 'long straight connections' within $\mathcal{I}^u$ locally, independently of the connections in other places.
It seems that this argument cannot be extended to $d < 5$. It is much harder to construct the straight connections locally in an independent manner. This we do in Section 6, where we dominate the internal distance between the origin and the point $(n, 0, \dots, 0)$ by the sum of a sequence of random variables with a finite range of dependence and stretched exponential tails, cf. (6.11) below. To obtain the finite range of dependence, we need to show that connections within a large box of size $m$ can be constructed using fewer than $\Theta(m^{d-2})$ random walk trajectories (which is the typical number of random walks intersecting this box; here and in the sequel we write $f(m) = \Theta(g(m))$ when for positive constants $c_1, c_2$ we have $c_1 g(m) \le f(m) \le c_2 g(m)$ for all $m$). In fact, in Proposition 4.2 we will show that a 'backbone' of $\mathcal{I}^u$ in this box can be constructed using $\Theta(m^{d-2-h})$ trajectories only, $h < 2/d$. This also means that for every $u > 0$ the interlacement set $\mathcal{I}^u$ is 'largely supercritical', that is, it remains locally connected even when considerably thinned.
The paper is organised as follows. After introducing the notation in Section 2, we collect in Section 3 some estimates on the hitting probabilities of sets and on the range of the simple random walk. Section 4 contains the key technical result of this paper, Proposition 4.2. This proposition roughly states that all points in (a possibly thinned version of) the set $\mathcal{I}^u$ within a box of size $n$ are at internal distance at most of order $n^2$ from each other, with very high probability. Using this proposition, in Section 5, we give a short proof of Theorem 1.5. Sections 6-8 contain the proofs of Theorems 1.3, 1.1, and 1.6.

Preliminaries
In this section we fix the notation and recall the definition of the random interlacement model.
Let $\mathbb N = \{0, 1, \dots\}$ be the set of natural numbers. We denote by $e_1, \dots, e_d$ the coordinate vectors in $\mathbb Z^d$, and write $\|\cdot\|$, $\|\cdot\|_1$, $\|\cdot\|_\infty$ for the Euclidean, $\ell^1$, and $\ell^\infty$ norms, respectively. We use $B(x, r)$ to denote the closed $\|\cdot\|_\infty$-ball centred at $x$ with radius $r$, and abbreviate $B(r) := B(0, r)$. We say that $A \subset \mathbb Z^d$ is connected if for any $x, y \in A$ there is a nearest-neighbour path that lies fully inside $A$ and connects $x$ to $y$. We write $|A|$ for the cardinality of $A$, $\mathrm{diam}(A) = \max_{x,y\in A} \|x - y\|_\infty$ for its diameter in the $\ell^\infty$-norm, and $\partial A = \{x \in A : \exists y \in A^c, \|x - y\| = 1\}$ for its internal boundary.
Let us write $P_x$ for the law of a discrete-time simple random walk $(X_n)_{n\in\mathbb N}$ on $\mathbb Z^d$ started from $x$. For $A \subset \mathbb Z^d$ we denote by $H_A$, $\tilde H_A$ and $T_A$ the entrance time in $A$, the hitting time of $A$, and the exit time from $A$:
$$H_A = \inf\{n \ge 0 : X_n \in A\}, \qquad \tilde H_A = \inf\{n \ge 1 : X_n \in A\}, \qquad T_A = \inf\{n \ge 0 : X_n \notin A\}. \tag{2.1}$$
For a finite set $A \subset \mathbb Z^d$ we define the equilibrium measure $e_A(x) = P_x[\tilde H_A = \infty]\,\mathbf 1\{x \in A\}$, $x \in \mathbb Z^d$, and denote by $\mathrm{cap}(A) = \sum_{x\in A} e_A(x)$ its total mass.
We now recall the definition of the random interlacement from [Szn10]. In order to do this we need to introduce some further notation which is, however, mostly used only locally. Let $W$ be the space of doubly-infinite nearest-neighbour trajectories in $\mathbb Z^d$ which tend to infinity at positive and negative infinite times, and let $W^\star$ be the space of equivalence classes of trajectories in $W$ modulo time-shift. (These spaces are equipped with $\sigma$-algebras $\mathcal W$, $\mathcal W^\star$ as in (1.2), (1.10) of [Szn10].) The random interlacement is defined via a Poisson point process taking values in the space $\Omega$ of point measures on the space $W^\star \times [0, \infty)$ with the intensity measure $\nu \otimes du$. We denote by $\mathbb P$ the law of this process.
To describe the measure $\nu$ appearing in the intensity of the Poisson point process, for $A \subset \mathbb Z^d$, $u \ge 0$, we denote by $\mu^u_A$ the mapping from $\Omega$ to the space of point measures on $W$ which selects from $\omega \in \Omega$ the trajectories with labels smaller than $u$ intersecting $A$ and parametrises them so that they enter $A$ at time $0$. Formally, for $\omega = \sum_{i} \delta_{(w^\star_i, u_i)} \in \Omega$,
$$\mu^u_A(\omega) = \sum_{i} \delta_{s_A(w^\star_i)}\, \mathbf 1\{\mathrm{Ran}(w^\star_i) \cap A \neq \emptyset,\ u_i < u\}, \tag{2.2}$$
where $\mathrm{Ran}(w^\star) = \bigcup_{n\in\mathbb Z} \{w(n)\}$ for an arbitrary $w$ in the equivalence class of $w^\star$, and $s_A(w^\star)$ is the unique $w \in W$ in this equivalence class such that $w(0) \in A$ and $w(-n) \notin A$ for all $n > 0$. As follows from [Szn10], Theorem 1.1, the measure $\nu$ is uniquely determined by the following two properties which we will frequently use:
• For every finite set $A \subset \mathbb Z^d$, under $\mathbb P$, the number $\eta^u_A := \mu^u_A(\omega)(W)$ of trajectories in $\omega$ with labels smaller than $u$ entering $A$ has the Poisson distribution with parameter $u\,\mathrm{cap}(A)$.
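• Write $\mu^u_A = \sum_{i=1}^{\eta^u_A} \delta_{w_i}$. Then, under $\mathbb P$, the $w_i$ are i.i.d., independent of $\eta^u_A$, with the law given by
$$\mathbb P\bigl[(w_i(n))_{n \ge 0} \in F\bigr] = \sum_{x \in A} \frac{e_A(x)}{e_A(A)}\, P_x[F]$$
for any measurable set $F$ in the space of single-infinite nearest-neighbour paths. It means that the $w_i$, restricted to non-negative times, are i.i.d. simple random walk trajectories started from the normalised equilibrium measure $e_A(\cdot)/e_A(A)$.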
The interlacement set at level $u$ is then defined as the trace of all trajectories in $\omega$ with labels smaller than $u$,
$$\mathcal{I}^u(\omega) = \bigcup_{i : u_i < u} \mathrm{Ran}(w^\star_i), \qquad \text{for } \omega = \sum_i \delta_{(w^\star_i, u_i)} \in \Omega.$$
We now explain the conventions for the use of constants in this paper. We denote by $C, C_1, C'_1, C_2, \dots$ the 'global' constants, that is, those that are used throughout the paper, and by $c, c', c_1, c_2, c_3, \dots$ the 'local' constants, that is, those that are used only in a small neighbourhood of the place where they appear for the first time. For the local constants, we restart the numbering either at the beginning of each subsection or at the beginning of each long proof. All these constants are positive and finite and may depend on the dimension, $u$, and other quantities that are supposed to be fixed; usually we omit expressions like 'there exist positive constants $c_1, c_2$ such that ...' and just directly insert the $c$'s into the formulas.
Also, the reader will notice that very frequently in this paper the probability of events (indexed by some integer parameter, say, $n$) will happen to be bounded from above by $e^{-cn^{\delta}}$ or from below by $1 - e^{-cn^{\delta}}$, where $\delta$ is typically (but not necessarily) between 0 and 1. So, we decided to use the following definition:
Definition 2.1. We write $f(n) = \mathrm{s.e.}(n)$ if there exist positive constants $c, c', \delta$ such that $0 \le f(n) \le c' e^{-cn^{\delta}}$ for all $n$.
Observe that $n^c\, \mathrm{s.e.}(n) = \mathrm{s.e.}(n)$ for any fixed $c > 0$. So, it is quite convenient to use this notation e.g. in the following situation: assume that we have at most $n^c$ events, each of probability bounded from above by $\mathrm{s.e.}(n)$. Then, the probability of their union is $\mathrm{s.e.}(n)$ as well.
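The absorption of polynomial factors can be verified in one line. The sketch below assumes that $\mathrm{s.e.}(n)$ stands for a quantity bounded by $c' e^{-c_0 n^{\delta}}$ with positive constants; the symbols $c_0$, $c'$, $K$ and the events $A_1, \dots, A_m$ are introduced here for illustration only.

```latex
% K := \sup_{n \ge 1} \bigl( c \ln n - \tfrac{c_0}{2} n^{\delta} \bigr) < \infty,
% because n^{\delta} grows faster than \ln n for every \delta > 0.
n^{c} \cdot c' e^{-c_0 n^{\delta}}
  = c' \exp\bigl( c \ln n - c_0 n^{\delta} \bigr)
  \le c' e^{K} \exp\bigl( -\tfrac{c_0}{2}\, n^{\delta} \bigr).

% Union bound over m \le n^{c} events, each of probability at most c' e^{-c_0 n^{\delta}}:
P\Bigl[ \bigcup_{i=1}^{m} A_i \Bigr]
  \le \sum_{i=1}^{m} P[A_i]
  \le n^{c} \cdot c' e^{-c_0 n^{\delta}}
  \le c' e^{K} \exp\bigl( -\tfrac{c_0}{2}\, n^{\delta} \bigr).
```

The price of the polynomial factor is only a change of constants (here $c_0 \to c_0/2$ and $c' \to c' e^{K}$), while the exponent $\delta$ is unchanged.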

Estimates on hitting probabilities
In this section we collect several estimates on hitting probabilities of subsets of $\mathbb Z^d$ by random walk trajectories. We recall that $P_x$ denotes the law of the simple random walk $(X_n)_{n\in\mathbb N}$ in $\mathbb Z^d$, $d \ge 3$, starting at $x$. We denote by $g$ the 'stopped' Green function: $g(x, y; n) = \sum_{k=0}^{n} P_x[X_k = y]$, and write $g(x, y)$ for $g(x, y; \infty)$. For $d \ge 3$ it holds that $g(x, y)$ is finite for all $x, y \in \mathbb Z^d$, $g(x, y; n) = g(y, x; n) = g(0, y - x; n)$, and, for all $n \ge \|x - y\|^2$,
$$g(x, y; n) \ge \frac{c_1}{1 + \|x - y\|^{d-2}}, \tag{3.1}$$
$$g(x, y) \le \frac{c_2}{1 + \|x - y\|^{d-2}} \tag{3.2}$$
for all $x, y \in \mathbb Z^d$. The upper bound (3.2) follows directly from Theorem 1.5.4 of [Law91]. The lower bound (3.1) can be proved by easily adapting the proof of the same theorem.
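For $x \in \mathbb Z^d$ and $A \subset \mathbb Z^d$, let $q_x(A; n) = P_x[H_A \le n]$ be the probability that, starting from $x$, the simple random walk enters $A$ before time $n$. We use the abbreviation $q_x(y; n) := q_x(\{y\}; n)$ for the hitting probabilities of one-point sets, and $q_x(A) := q_x(A; \infty)$ for the probability that the simple random walk ever enters the set $A$. It is elementary to obtain that for all $x, y \in \mathbb Z^d$ and $n \ge \|x - y\|^2$ (see e.g. Theorem 2.2 of [AMP02])
$$\frac{c_3}{1 + \|x - y\|^{d-2}} \le q_x(y; n) \le \frac{c_4}{1 + \|x - y\|^{d-2}}. \tag{3.3}$$
Next, for $x \in \mathbb Z^d$ and a finite set $A \subset \mathbb Z^d$, define $g(x, A; n) = \sum_{y\in A} g(x, y; n)$.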
Clearly, $g(x, A; n)$ is the expected number of visits to $A$ up to time $n$, starting from $x$. As before, we set $g(x, A) := g(x, A; \infty)$.
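For small $n$ the stopped Green function can be computed exactly by propagating the law of the walk step by step. The following is a minimal sketch of that computation; the function names `stopped_green` and `expected_visits` are hypothetical, and the approach is only practical for small $n$ since the number of reachable sites grows like $n^d$.

```python
from collections import defaultdict


def stopped_green(d, n):
    """Return {y: g(0, y; n)} for the simple random walk on Z^d.

    g(0, y; n) is the sum over k = 0, ..., n of P_0[X_k = y].
    """
    origin = (0,) * d
    law = {origin: 1.0}            # law of X_k, starting with k = 0
    green = defaultdict(float)
    green[origin] = 1.0            # contribution of the term k = 0
    for _ in range(n):
        nxt = defaultdict(float)
        for v, p in law.items():
            share = p / (2 * d)    # each of the 2d neighbours is equally likely
            for i in range(d):
                for s in (-1, 1):
                    w = v[:i] + (v[i] + s,) + v[i + 1:]
                    nxt[w] += share
        law = nxt
        for v, p in law.items():
            green[v] += p
    return dict(green)


def expected_visits(green, A):
    """g(0, A; n): expected number of visits to A up to time n, from the origin."""
    return sum(green.get(y, 0.0) for y in A)


g = stopped_green(3, 4)
print(g[(0, 0, 0)], expected_visits(g, {(1, 0, 0), (-1, 0, 0)}))
```

Summing the table over all sites gives $n + 1$, the total number of time steps counted, which is a convenient sanity check.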
The following lemma will be used repeatedly to estimate the hitting probabilities: .
Proof. Using the definition of g and the strong Markov property, Since , the second inequality in (3.4) follows. The first inequality is then implied by
Let us use the notation $\ell(x, A) = \max_{y\in A} \|x - y\|_\infty$ for the maximal distance between $x$ and the points of $A$. The following two simple lemmas contain lower bounds on hitting probabilities of sets.
Lemma 3.2. Suppose that A is a connected finite subset of Z d , containing at least two sites. Then, for all x ∈ Z d and n ≥ (ℓ(x, A)) 2 , Proof. Since A is connected, it is possible to find (not necessarily connected) set A ′ ⊂ A with the following properties: . Indeed, it holds that the size of the projection of A on one of the coordinate axes is at least diam(A) and this projection is an interval; then, for all points in the projection pick exactly one element of A that projects there, and erase 'unnecessary' points of A. Then, by (3.1) we have for any n ≥ (ℓ(x, A)) 2 and, by (3.2), for any y ∈ A ′ , Since q x (A; n) ≥ q x (A ′ ; n) for all n, the claim follows from Lemma 3.1.
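The thinning step in this proof (keep exactly one point of $A$ above each point of a coordinate projection) is easy to write down explicitly. The sketch below is an illustration under the assumption that $A$ is given as a finite collection of integer tuples; the function name `thin_by_projection` is hypothetical, and which point is kept above each projected value is an arbitrary choice.

```python
def thin_by_projection(A):
    """Pick one point of A above each value of its widest coordinate projection.

    Returns the chosen axis and the selected points, sorted along that axis.
    For a connected set the projection is an interval, so consecutive selected
    points differ by exactly one in the chosen coordinate.
    """
    A = list(A)
    d = len(A[0])
    # Axis along which the projection of A contains the most points.
    axis = max(range(d), key=lambda i: len({x[i] for x in A}))
    chosen = {}
    for x in A:
        chosen.setdefault(x[axis], x)   # keep the first point seen above each value
    return axis, sorted(chosen.values(), key=lambda x: x[axis])


# Toy example: an L-shaped connected set in Z^3.
A = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)]
axis, thinned = thin_by_projection(A)
print(axis, thinned)
```

Any two selected points are at distance at least the difference of their projected coordinates, which is what makes the Green function sums over the thinned set easy to bound.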
The previous lemma works well for sparse connected sets. For more densely packed sets we need another estimate: Proof. Again using (3.1), we have for any n ≥ (ℓ(x, A)) 2 To obtain an upper bound on g(y, A; n) for y ∈ A, we observe that where we have used an obvious worst-case estimate (all the points of A are grouped around y, forming roughly a ball of radius Θ(|A| 1/d )) on the passage from the first to the second line of the above display. Then, applying Lemma 3.1 we conclude the proof of Lemma 3.3.
We end this section by stating a few well-known facts about the behavior of the set of sites visited by a simple random walk by time n. As we could not locate suitable references, we also sketch their proofs.
Lemma 3.4. Suppose that d ≥ 3 and let R(n) = {X 0 , . . . , X n } be the set of sites visited by a simple random walk by time n. Then, for any fixed α 1 ∈ (0, 1), Proof. The upper bound on the diameter follows from any convenient large deviation bound on the displacement of the simple random walk (e.g. Lemma 1.5.1 of [Law91]).
To control the diameter and the number of visited sites from below, we use the following simple argument. We divide the temporal interval $[0, n^2]$ into $c^{-1} n^{2\alpha_1}$ subintervals of length $cn^{2-2\alpha_1}$, for a large enough $c$. Clearly, on each of the subintervals of length $cn^{2-2\alpha_1}$ the maximal displacement of the simple random walk is at least $n^{1-\alpha_1}$ with a constant probability, e.g., by the central limit theorem. Noting that by time $k$ the number of visited sites is at most $k$, and that the expectation of this number is at least $c'k$ (it is straightforward to obtain this from (3.1)), we deduce that, also with at least constant probability, the number of different sites visited by the random walk during a fixed temporal interval of length $cn^{2-2\alpha_1}$ is at least $n^{2-2\alpha_1}$ (if $c$ is large enough). Finally, to estimate the probability that the event of interest occurs on at least one of the $c^{-1} n^{2\alpha_1}$ subintervals, use the independence. The claim then follows easily.
We also need an estimate on the number of different sites visited by several random walks: Lemma 3.5. Consider k independent simple random walks (X Proof. We use a similar argument as in the previous proof. We divide the k walks into c −1 n α 3 groups, each containing ckn −α 3 walks. Consider now the ckn −α 3 walks of the, say, first group, suppose that they are labelled from 1 to ckn −α 3 . Let be the set of sites visited by the walks from the first group. For y ∈ Z d , define to be the number of walks of the first group that start at distance at most n from y.
So, if c is large enough Since, trivially, |V | ≤ ckn 2−α 3 , it holds that |V | ≥ kn 2−α 3 with at least a constant probability. As the same reasoning applies to each of the c −1 n α 3 groups, the claim of the lemma follows by independence.

Intersections of random walks
In this section we show that the set of points visited by sufficiently many walks started in B(n) is typically well connected; the precise statement of this fact is contained in Proposition 4.2.
To state this proposition we need some notation. We consider two sequences of positive random variablesη 2 independent simple random walks starting from some sites x (1) , . . . , x (η (n) 2 ) ∈ B(n). We write P for the joint distribution of these walks.
m } be the set of different sites visited by kth random walk until time m. We write H k A ,H k A for the entrance and hitting time of A by random walk X (k) (recall (2.1)).
(We do not indicate the dependence on n in order to keep the notations not too heavy.) In words, the definition says that the trajectories are (s, m)-connected if one can go from the starting point of the ith trajectory to the starting point of the jth trajectory within the cluster of the firstη (n) 1 trajectories, by changing no more than s times the trajectory, and using at most m sites in the beginning of each trajectory.
Let us define for k ≤η (n) 2 the following set of integers: 3) be the index set of the walks that do not come back to B(n) after the time 3n 2 .
For d ≥ 3 and h < 2 d , define (in fact, this quantity represents the necessary number of steps in the recursive construction used in the proof of Proposition 4.2, see (4.8) and (4.15); at this point we only observe that β(d, h) is finite since dh 2 < 1). The following proposition plays the key role in this paper: (4.6) and P ∀i ≤η will be of order n d−2 , so that h = 0. The proposition implies that the model of random interlacements is 'far from the criticality' with respect to the connectedness of the interlacement cluster; we typically need much less than Θ(n d−2 ) walks to ensure that the interlacement set is 'well connected'.
(c) In the most important case $h = 0$, it holds that $\beta(3, 0) = 1$, $\beta(4, 0) = 2$, $\beta(5, 0) = 3$, $\beta(6, 0) = 4$, but then $\beta(7, 0) = 6$. Comparing this with the results of [RS12,PT11] (where it is proved that every two points in $\mathcal{I}^u$ can be joined by a path switching the trajectory at most $\lceil d/2 \rceil - 1$ times) indicates that the constants $\beta(d, h)$ are not optimal. The authors did not check if the formula (4.4) can be further simplified, but it is clear that $\beta(d, h) = \Theta(d \ln d)$ as $d \to \infty$. In any case, for our needs it is enough to know that $\beta(d, h)$ is finite for any $d \ge 3$ and $h < 2/d$, and this fact is quite obvious.
First, let us describe informally the idea of the proof for the particular case $h = 0$ (one may note that there are many similarities with the proof of Theorem 3.2 of [AMP02], and with techniques used in [RS12]). Consider the random walk $X^{(1)}$ and run it up to time $n^2$. Then, $\mathrm{diam}(R_1(n^2))$ is typically of order $n$, so any other random walk $X^{(k)}$ hits the set $R_1(n^2)$ with probability at least of order roughly $n^{-(d-3)}$ (with logarithmic correction for $d = 3$) by Lemma 3.2. Since there are $\Theta(n^{d-2})$ other available walks, with high probability $R_1(n^2)$ will be hit by $\Theta(n)$ different other walks. In dimension $d = 3$, running these $\Theta(n)$ walks for $n^2$ time units more after the respective hitting moments of $R_1(n^2)$ is already enough to meet all the other trajectories (again applying Lemma 3.2, one obtains that the probability that any other trajectory hits none of those walks is almost exponentially small in $n$). In dimension $d \ge 4$ this argument, however, just barely does not work.
So, what to do in dimension 4? Consider those $\Theta(n)$ trajectories (of length $n^2$) that intersect the initial one. Together with the initial trajectory, they form a connected set of cardinality roughly $n^3$. We then apply Lemma 3.3 to obtain that a random walk starting somewhere at the boundary of $B(n)$ will hit such a set with probability at least of order $n^{-2} \times n^{3(1-\frac{2}{d})}$. Since (recall that now $d = 4$) we have $\Theta(n^2)$ walks in total, typically $\Theta(n^{3(1-\frac{2}{d})})$ of them will hit that set. Since in four dimensions Lemma 3.2 gives a lower bound of order $n^{-1}$ for the hitting probability of the initial piece of length $n^2$ of a generic trajectory and $3(1 - \frac{2}{4}) = \frac{3}{2} > 1$, running these $\Theta(n^{3/2})$ walks a bit more we meet all the other trajectories with high probability (see Figure 1 for an illustration of the proof for $d = 4$).
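The exponent bookkeeping in the heuristic for $d = 4$ can be written out as follows. This is only a restatement of the orders of magnitude quoted in the text, with all constants and logarithmic corrections ignored; the constant $c$ in the last line is a generic positive constant introduced for illustration.

```latex
% Stage 1: walks hitting the initial trajectory R_1(n^2), in general dimension d
\underbrace{n^{d-2}}_{\text{available walks}} \times
\underbrace{n^{-(d-3)}}_{\text{hitting probability}} = n .

% For d = 4, stopping here gives an expected number of hits of a fixed further
% trajectory of order n \times n^{-1} = 1, so the argument 'just barely' fails.

% Stage 2 (d = 4): the n trajectories of length n^2 form a set of size about n^3
\underbrace{n^{2}}_{\text{available walks}} \times
\underbrace{n^{-2} \times n^{3(1-\frac{2}{4})}}_{\text{hitting probability}}
  = n^{3/2} .

% Final step (d = 4): each of the n^{3/2} walks hits a given trajectory piece
% with probability of order n^{-1}, so the probability that none of them does is
\bigl( 1 - c\, n^{-1} \bigr)^{n^{3/2}} \le \exp\bigl( -c\, n^{1/2} \bigr).
```

The final bound is stretched exponential in $n$ precisely because $\frac{3}{2} > 1$, which is the inequality singled out in the text.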
If we recursively define the sequence $a^{(d)}$ then the necessary number of iterations $\beta(d, 0)$ can be calculated as follows: Since it is straightforward to obtain from the recursion (4.8) that we see that the above definition of $\beta(d, 0)$ agrees with (4.4). In order to make the above argument rigorous, we have to address several issues, for example:
• Deal with the dependence of the walks that participate in different stages of the above construction. This can be done by dividing the walks we use into $\beta(d, h)$ groups and using one group at each stage.
• In fact, the trajectories can go back to the ball $B(n)$ at later epochs (i.e., much later than $n^2$). To prove (4.7), we have to ensure that the random walks constructed at the $\beta(d, h)$th stage meet these pieces of the trajectories too, otherwise we would have no good control on the distance within the interlacement cluster. So, we have to control the 'total number of returns' (see (4.11) below). In addition, in the above construction we shall use only the walks conditioned on not returning to $B(n)$ after time $3n^2$ (in order not to be obliged to condition on too detailed a future behaviour of the trajectory).
• Finally, all the events described in the informal construction should not only be 'typical' in some sense, but hold with probability at least $1 - \mathrm{s.e.}(n)$. For that, we need to 'adjust' (by sufficiently small amounts) the values in the power of $n$ at each stage.
Proof of Proposition 4.2. We start with the formal proof of Proposition 4.2. To simplify the notation we write β = β(d, h). Recall (4.3) and define for m = 1, . . . , β Since, clearly, there is a constant c 4 > 0 such that for all x ∈ B(n) we have  Inequality (4.9) further implies that that for every k, m ≥ 1 (4.11) In the sequel, we will repeatedly use the following observation. For a simple random walk X, let X [0,2n 2 ] be the piece of trajectory of the walk X up to time 2n 2 .
Then there is a constant c 8 > 0 such that for any event A which depends only on the initial piece of the trajectory of length 2n 2 (4.12) Indeed, to prove (4.12), we write and use (4.9) to argue that the last term is at least of constant order. As a last preparatory observation, note that, for any ε > 0, by Lemma 3.4 and the observation following Definition 2.1, . For any j we obtain using Lemma 3.2, and (4.12) with A = {R j (n 2 ) ∩ V 1 = ∅}, (4.14) We introduce the set of indices K 1 = j ∈ J (n) 1 \ {i 1 } : R j (n 2 ) ∩ V 1 = ∅ . By (4.10), (4.13), and (4.14), using the independence the random walks X (j) , it holds that where ε 1 := 2ε (ε is supposed to be sufficiently small so that 1 − h − ε 1 > 0). For d = 3, everything is ready to finish the proof of Proposition 4.2, but for other values of d we first need to describe a general step of the construction (recall that β steps are necessary). Define recursively (recall (4.8)) From the above recursion it is straightforward to obtain that So, with β defined by (4.4), it holds that a β > d − 3. Assume that for some 1 ≤ m ≤ β −1 we have constructed the connected sets V m ⊂ Z d and also the sets K m ⊂ J (4.16) Then, define . By Lemma 3.5 (observe that, by (4.12), its proof still goes through in this situation) and (4.16) it holds that P |V m+1 | ≥ n 2+am−2εm ≥ 1 − s.e.(n). (4.17) Observe that, by Lemma 3.3 and (4.12) with A = {R(n 2 ) ∩ V m+1 = ∅}, for any j ∈ J (n) m+1 (4.18) So, using (4.10), (4.13), (4.17), and (4.18), we obtain and (for the next induction step) denote ε m+1 = a m+1 − (2 + a m − 2ε m )(1 − 2 d ) + ε m , so that (4.16) would hold with m + 1 instead of m.
Now we describe the last step needed for the proof of (4.5), (4.6), and (4.7). Assume that on the initial step the parameter ε was chosen to be so small that a β − ε β > d − 3 + ε. Consider the walks with indices in K β ; after hitting V β the rest of the trajectory is conditionally independent from the initial part, so Lemma 3.2 and (4.12) imply that, for j ∈ K β P X for any connected set A ⊂ B(2n) such that diam(A) ≥ n 1−ε . Using this together with (4.11) and (4.13), we conclude the proof of Proposition 4.2.

Proof of Theorem 1.5
Using Proposition 4.2, it is straightforward to show Theorem 1.5. Denote by $\eta^{(n)}_u$ the number of trajectories at level $u$ entering $B(n)$, that is the number of trajectories in the support of $\mu^u_{B(n)}$ (recall (2.2) for the notation). By the definition of random interlacement, $\eta^{(n)}_u$ has Poisson distribution with parameter $u\,\mathrm{cap}(B(n)) = \Theta(un^{d-2})$. Therefore, using e.g. Chernoff bounds we obtain for small enough $c_1$ and large enough $c_2$ that u , and using (5.1), we see that, if both events in the left-hand sides of (4.5) and (4.6) occur, then I u ′ (n) should be connected for all u ′ ∈ [u,û]; on the other hand, the probability of these events approaches 1 as n → ∞. So, (5.2) cannot be true.
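For reference, the Chernoff bounds for a Poisson random variable that are invoked in this proof take the following standard form. The factors $2$ and $\frac12$ below are chosen for illustration and are not necessarily the constants $c_1$, $c_2$ of the proof; $X$ denotes a generic Poisson variable with mean $\lambda$.

```latex
% For X ~ Poisson(\lambda), optimising e^{-\theta a} E e^{\theta X} over \theta gives
P[X \ge a] \le e^{-\lambda} \Bigl( \frac{e\lambda}{a} \Bigr)^{a} \quad (a > \lambda),
\qquad
P[X \le a] \le e^{-\lambda} \Bigl( \frac{e\lambda}{a} \Bigr)^{a} \quad (0 < a < \lambda).

% With a = 2\lambda and a = \lambda/2 respectively:
P[X \ge 2\lambda] \le \exp\bigl( -(2\ln 2 - 1)\,\lambda \bigr),
\qquad
P[X \le \tfrac{1}{2}\lambda] \le \exp\bigl( -\tfrac{1}{2}(1 - \ln 2)\,\lambda \bigr).
```

With $\lambda = u\,\mathrm{cap}(B(n)) = \Theta(un^{d-2})$, both probabilities are exponentially small in $n^{d-2}$, so the number of trajectories entering $B(n)$ is of order $n^{d-2}$ up to an event of probability $\mathrm{s.e.}(n)$.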

Large deviations for the internal distance
In this section we prove Theorem 1.3. To this end we fix $a \in (0, 1/3)$ and investigate the properties of $\mathcal{I}^u$ when restricted to
$$G^{(n)}_a = \bigcup_{k=0}^{n} B(ke_1, n^a).$$
In words, $G^{(n)}_a$ is the $n^a$-neighbourhood of the segment between the origin and $ne_1$ (recall that $B(x, r)$ denotes the ball in the $\|\cdot\|_\infty$-distance).
First, we need the following elementary estimate on e G (n) a (A). Lemma 6.1. Let F k be the hyperplane {x ∈ Z d : x · e 1 = k}. Then, for any k ∈ {−⌊n a ⌋ + 1, . . . , ⌊n + n a ⌋ − 1}, it holds that Proof. We adapt the proof of Proposition 2.4.5 of [Law91].
Let η n = η n (u, a) be the number of trajectories of µ u G (n) a (recall (2.2)). As before, we enumerate the corresponding random walks as X (1) , . . . , X (ηn) , denote their starting positions by x (1) , . . . , x (ηn) , and let R k (m) be the set of different sites visited by kth random walk by time m. Let us define U k = B(⌊kn a ⌋e 1 , n a ) = {y ∈ G (n) a : |y · e 1 − ⌊kn a ⌋| < n a }, k = 0, . . . , n. (6.2) Due to Lemma 6.1, for any k ∈ {0, . . . , n}, Let η n,k = |{i ≤ η n : X (i) 0 ∈ U k }| be the number of walks starting in U k . Using the large deviation properties of the Poisson distribution, as in (5.1), we obtain We now fix a small positive constant ε > 0, and define for 1 ≤ k ≤ η n Denote byÎ (respectively,Ĩ) the 'interlacement' set formed only by the initial pieces of lengtht k (respectively,t k ) of the trajectories (X (k) , k = 1, . . . , η n ): Observe that I u ⊃Î ⊃Ĩ. Further, by the central limit theorem, for any x ∈ G (n) a , P x [T G (n) a ≤ n 2a ] ≥ c. Therefore, using the strong Markov property recursively on the definition of j k , P[j k ≥ n ε ] ≤ s.e.(n). When j k ≤ n ε , then diam(R k (t k )) ≤ 2n 2(a+ε) . Therefore, Heuristically, the setÎ, is 'well suited' for application of Proposition 4.2 as it has no 'dangling ends' in G (n) a . By this we mean that knowing that X (k) is in G (n) a at some time j, its next n 2a steps will be contained inÎ: On the other hand, the trajectories inĨ are 'short range', which will introduce some independence later. We now introduce a notation that will be useful many times, see Figure 2 for its illustration. For x ∈ Z d \ {0} and y ∈ Z d , we define ζ (x) 0 (y) = max{m ≤ 0 : mx + y ∈ I u }, k (y) x to be the site on I u corresponding to ζ (x) k (y). When y = 0 and/or x = e 1 , we omit them from the notation, that is e.g. ζ k := ζ (e 1 ) k (0).   i (y) depends on n. It may happen thatψ 1 = ψ 1 ,ψ 0 (ne 1 ) = ψ 0 (ne 1 ), but that is not certain. 
In any case,
$$\rho_u(\psi_1, \psi_0(ne_1)) \le \rho_u(\psi_1, \hat\psi_1) + \hat\rho(\hat\psi_1, \hat\psi_0(ne_1)) + \rho_u(\hat\psi_0(ne_1), \psi_0(ne_1)). \tag{6.6}$$
To bound the right-hand side, we need a few lemmas.
Lemma 6.2. Let g d (k) = e −c 4 k when d ≥ 4, and g 3 (k) = e −c 4 k/ ln k . Then, for every a Further, for y such that B(y, n a/2 ) ⊂ G Proof. Let S k = {y + jx : 0 ≤ j ≤ k}. The first claim follows directly from the definition of I u (observe that for any A ⊂ Z d it holds that P u [A ∩ I u = ∅] = e −u cap(A) ) and the simple estimate on the capacity of the 'segment' S k (see e.g. [Law91], Proposition 2.4.5) For the second statement, we assume without loss of generality that y = 0, and define A n = {0 ≤ k ≤ n ε , k even}. For every j ∈ A n , and x ∈ U j , by Lemma 3.2, Combining this estimate with (6.3), usingt k ≥ n 2(a+ε) , we obtain in d = 3, For d ≥ 4 the calculation is very similar. Actually, it is sufficient to consider only the term j = 0, as there are no logarithmic terms in the denominator.
Similarly, applying Proposition 4.2 to the sequence of sets U k , k = 0, . . . , n, we obtain that P[Î is connected] ≥ 1 − s.e.(n). To bound the middle term on the right-hand side of (6.6), we consider the sequence of random variableŝ It is clear that on the event { ψ 1 (ke 1 ) − ke 1 ≤ n a : k = 0, . . . , n}, which by Lemma 6.2 has probability 1 − s.e.(n), we havê ρ(ψ 1 ,ψ 0 (ne 1 )) ≤ n k=0T n k . (6.11) To control the sum, we need a tail estimate onT n k that is uniform in n. Proof. Without loss of generality we consider k = 0 only. First, we fix m ≤ n a /2 and control the number of trajectories entering B(m). We claim that To prove (6.12) we use an argument similar to the proof of Lemma 6.2. We define A n = {0 ≤ k ≤ n ε/2 , k even}. For every j ∈ A n , and x ∈ U j , by Lemma 3.3, q x (B(m); n 2(a+ε) ) ≥ cm d−2 ((j + 1)n a ) d−2 .
Using (6.3), the number of walks starting in U j hitting B(m) has a Poisson distribution with parameter at least Using the stability of the Poisson distribution, this yields that the number of walks starting in j∈An U j hitting B(m) has a Poisson distribution with mean at least c ′′ m d−2 . Claim (6.12) then follows from the large deviation properties of the Poisson distribution again. We now apply Proposition 4.2 with m instead of n, G m = B(m) andη being the number of walks entering B(m). Assumptions (4.1), (4.2) are satisfied by the previous discussion. The construction ofÎ assures that the walks do not stop earlier than after making 2m 2 ≤ 2n 2a steps. Therefore, by an argument similar to proof of (6.8), for c ′ = 2(2β(h, d) + 3) Both terms in the parentheses are s.e.(m), the first one by Lemma 6.2, the second one by Proposition 4.2. Taking m ℓ such that c ′ m 2 ℓ = ℓ, the lemma follows for for ℓ < c ′ (n/2) 2a .
For the remaining ℓ's it suffices to observe that s.e.(ℓ) ≤ s.e.(n) and apply the same reasoning as before with B(n a ) instead of B(m).
and uses the same reasoning for the even terms in the right-hand side, and bounds the odd terms using (6.7) and Proposition 4.2, using the same argument as e.g. in (6.10). This completes the proof of Theorem 1.3.

Proof of the shape theorem
In this section, to prove Theorem 1.1, we use a more or less standard argument based on the Subadditive Ergodic Theorem. For the reader's convenience, let us state this theorem here (we use the version of [Lig85]):
Theorem 7.1. Suppose that $\{Y(m, n)\}$ is a collection of positive random variables indexed by integers satisfying $0 \le m < n$ such that
(i) $Y(0, n) \le Y(0, m) + Y(m, n)$ for all $0 \le m < n$;
(ii) the joint distribution of $\{Y(m + 1, m + k + 1), k \ge 1\}$ is the same as that of $\{Y(m, m + k), k \ge 1\}$ for each $m \ge 0$;
(iii) for each $k \ge 1$ the sequence of random variables $\{Y(nk, (n + 1)k), n \ge 1\}$ is a stationary ergodic process;
(iv) $EY(0, 1) < \infty$.
Then, it holds that
We are going to verify the hypotheses of Theorem 7.1 for the sequence of random variables , under the measure $P^u_0$. First, (i) is obvious since $\rho_u$ is a metric. Stationarity and ergodicity in (ii)-(iii) follow from the corresponding properties of $\mathcal{I}^u$, see Theorem 2.1 of [Szn10]. The property (iv) then follows from the estimate 1 ) > n] ≤ s.e.(n), which can be proved by applying the same procedure as in (6.10), using Lemma 6.2 and Proposition 4.2.
Theorem 7.1 implies that for any $x \in \mathbb{Z}^d$ there exists a positive number $\sigma'_u(x)$ such that [...] and $\sigma_u(0) := 0$. With (7.1) it is straightforward to obtain (observe that, according to our notation, $\psi^{(x)}_0(nx)$ is either $nx$ itself in the case $nx \in \mathcal{I}^u$, or it is the 'last site before $nx$' on the discrete ray $\{kx, k \ge 0\}$ if $nx \notin \mathcal{I}^u$), using also the usual Ergodic Theorem and (6.7), that [...]. It is also straightforward to obtain that for any integer $m$ and $x \in \mathbb{Z}^d$, it holds that $\sigma_u(mx) = m\sigma_u(x)$; this permits us to extend $\sigma_u$ to $\mathbb{Q}^d$ by $\sigma_u(x) := m^{-1}\sigma_u(mx)$, where $m$ is such that $mx \in \mathbb{Z}^d$. Also, it is clear that $\sigma_u(x) \ge \|x\|_1$ for any $x \in \mathbb{Q}^d$. Next, the goal is to prove that $\sigma_u$ is a norm.
Lemma 7.2. For all $x, y \in \mathbb{Q}^d$ we have $\sigma_u(x + y) \le \sigma_u(x) + \sigma_u(y)$. (7.4)

Proof. [...]; from the Ergodic Theorem we obtain [...]. Since $\rho_u$ is a metric, we have (see Figure 3) [...] for any $n$.

Figure 3. On the proof of Lemma 7.2.

Then, the trick is to take the limit as $n \to \infty$ in (7.6) in probability. First of all, a direct application of (7.1)-(7.2) shows that the first term in the right-hand side of (7.6) converges to $\sigma_u(x)$, even $P^u_0$-a.s. Next, under $P^u_0$ it holds that [...] in distribution, so the second term in the right-hand side of (7.6) converges to $\sigma_u(y)$ in distribution and hence in probability. As for the term in the left-hand side of (7.6), write [...]. Again, the first term in the right-hand side of (7.7) converges $P^u_0$-a.s. to $\sigma_u(x + y)$. To obtain that the second term in the right-hand side of (7.7) converges to 0 in probability, observe that [...]. (7.8) Since the third term in the right-hand side of (7.8) is equal in distribution to [...], (7.5) implies that the left-hand side of (7.8) converges to 0 in probability, and so Theorem 1.3 implies that the second term in the right-hand side of (7.7) converges to 0 in probability. This proves (7.4).

Now, Lemma 7.2 shows that $\sigma_u$ can be extended to a norm in $\mathbb{R}^d$ by continuity, and we are able to finish the proof of Theorem 1.1.
Since $D_u$ is compact, one can find a finite set $F := \{x_1, \ldots, x_k\} \subset D_u \cap \mathbb{Q}^d$ such that $\sigma_u(x_i) < 1$ for $i = 1, \ldots, k$, and (with $C$ from Theorem 1.3) [...]. Consider any $x_i \in F$; let $m_i$ be the minimal positive integer such that $m_i x_i \in \mathbb{Z}^d$. Let $n = j m_i + s$, where $0 \le s \le m_i - 1$. Then, for all $n$ large enough it holds by (7.3) that [...]. Now, Theorem 1.3, (6.7) and the Borel-Cantelli lemma imply that $P^u_0$-a.s. for all $n$ large enough we have [...] for all $i = 1, 2, \ldots, k$. So $nD_u \cap \mathcal{I}^u \subset \Lambda_u((1 + \varepsilon')n)$, which completes the first part of the proof.

Random walk on the torus
It remains to show Theorem 1.6. We recall that $\mathbb{T}^d_N$ denotes the $d$-dimensional discrete torus of size $N$, $P_N$ the law of the simple random walk on $\mathbb{T}^d_N$ started from the uniform distribution, and $\rho^u_N(x, y)$ the internal distance within the set $\mathcal{I}^u_N$ of sites visited by the random walk before time $uN^d$, $\mathcal{I}^u_N = \{X_0, \ldots, X_{\lfloor uN^d \rfloor}\}$. Let $B_N(x, r) \subset \mathbb{T}^d_N$ be the ball of radius $r$ around $x$ in the usual distance, $d_N$, on the torus. We first control the internal distance in balls of radius $\ln^\gamma N$.
Lemma 8.1. [...] Then, for $c_1$, $\gamma$ large enough, $P_N[\text{there exist } y, z \in B_N(x, \ln^\gamma N) \text{ such that } \rho^u_N(y, z) \ge c_1 \ln^\gamma N] = o(N^{-d})$. (8.1)

Before proving this lemma, let us explain how it implies Theorem 1.6.
Proof of Theorem 1.6. By the lemma and a simple union bound, with probability tending to 1, the complement of the event in (8.1) holds for all $x \in \mathbb{T}^d_N$. If this is the case, we can chain these balls to obtain the claim of the theorem. More precisely, consider $x, y$ such that $d_N(x, y) > \ln^\gamma N$. Then, one can find points $x = x_1, x_2, \ldots, x_n = y$ such that $x_{i+1} \in B(x_i, \ln^\gamma N)$, $i < n$, and $\sum_{i=1}^{n-1} d_N(x_i, x_{i+1}) \le 2 d_N(x, y)$. As we assume that the complement of the event in (8.1) holds for all balls, for all $i < n$, [...]. The theorem then follows using the triangle inequality, setting $\bar{C} = 2c_1$.
Proof of Lemma 8.1. Let $r = \ln^\gamma N$ and $R = Cr$, with $C$ of Theorem 1.3. By Theorem 1.1 of [TW11], for any $\alpha > 0$, there exists a coupling $Q$ of random interlacement on $\mathbb{Z}^d$ and random walk on the torus, such that [...]. For points that are in $\mathcal{I}^{u(1-\varepsilon)} \cap B(x, r)$ we can use Theorem 1.3 and obtain the required statement. For points in $\mathcal{I}^u_N \setminus \mathcal{I}^{u(1-\varepsilon)}$, however, this simple argument fails and we need more details on the coupling construction.
The construction starts by splitting the random walk trajectory into so-called excursions. These excursions are independent simple random walk trajectories started at the boundary of $B_N(x, R)$ and stopped when staying a sufficiently long time out of $B_N(x, N^{1-\varepsilon})$, see Section 4 of [TW11] for the precise definition.
We denote the excursions started before time $uN^d$ by $X^{(1)}, \ldots, X^{(\eta)}$, where $\eta$ is random. These excursions are constructed in such a way that [...]. Using Lemma 4.3 of [TW11], it is easy to see that the random variable $\eta$ satisfies [...]. Further, combining Lemmas 3.9, 3.10 of [TW11], it follows that the distribution of the starting points satisfies for every $z \in \partial B(x, R)$ and $i = 2, \ldots, R$ [...].

There is a small issue with the fact that the last point of the trajectory, $X_{\lfloor uN^d \rfloor}$, might be contained in the last excursion, cf. (8.2). To solve this issue, observe that our techniques apply to both 'clusters' $\mathcal{C} := \bigcup_{i=1}^{\eta-1} \operatorname{Ran} X^{(i)}$ and $\bar{\mathcal{C}} := \bigcup_{i=1}^{\eta} \operatorname{Ran} X^{(i)}$. Hence, with probability $1 - \text{s.e.}(R)$, the internal distances on these clusters within $B(x, r)$ are bounded by $c_1 r$. If this is the case, then the trajectory of $X^{(\eta)}$ must intersect $\mathcal{C}$ at least every $(2c_1 + 1)$ steps. For any $y \in B(x, r) \cap X^{(\eta)} \cap \mathcal{I}^u_N$ there is thus a path of length at most $(2c_1 + 1)$ lying inside $\mathcal{I}^u_N$ which connects $y$ to $\mathcal{C}$. Lemma 8.1 then follows by the triangle inequality, by increasing $c_1$ to $2(2c_1 + 1) + c_1$.
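As a numerical illustration of the objects studied in this section (it plays no role in the proofs), the following minimal Python sketch samples the range $\mathcal{I}^u_N$ of a simple random walk on the torus and computes internal distances from the starting point by breadth-first search. The function names `neighbours`, `walk_range`, `internal_distances` and `torus_l1` are hypothetical, and reading the 'usual distance' $d_N$ as the $\ell_1$ distance on the torus is an assumption of the sketch.

```python
import random
from collections import deque


def neighbours(x, N):
    """Nearest neighbours of site x on the discrete torus (Z/NZ)^d."""
    for i in range(len(x)):
        for step in (-1, 1):
            yield x[:i] + ((x[i] + step) % N,) + x[i + 1:]


def walk_range(N, d, u, rng):
    """Return (start, visited), where visited = {X_0, ..., X_floor(u N^d)} for a
    simple random walk on the torus started from a uniformly chosen site."""
    x = tuple(rng.randrange(N) for _ in range(d))
    start, visited = x, {x}
    for _ in range(int(u * N ** d)):
        x = rng.choice(list(neighbours(x, N)))
        visited.add(x)
    return start, visited


def internal_distances(source, sites, N):
    """Breadth-first search from source using only steps between sites in `sites`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in neighbours(x, N):
            if y in sites and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def torus_l1(x, y, N):
    """l1 distance on the torus (assumed reading of the 'usual distance' d_N)."""
    return sum(min((a - b) % N, (b - a) % N) for a, b in zip(x, y))


if __name__ == "__main__":
    rng = random.Random(0)
    N, d, u = 10, 3, 0.5
    start, sites = walk_range(N, d, u, rng)
    dist = internal_distances(start, sites, N)
    # Largest ratio of internal distance to torus distance, seen from the start.
    ratio = max(dist[y] / torus_l1(start, y, N) for y in sites if y != start)
    print(len(sites), ratio)
```

The printed ratio is an empirical analogue, from a single base point and at a small size $N$, of the constant comparing $\rho^u_N$ with $d_N$; it says nothing about the asymptotic regime covered by Theorem 1.6.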

Appendix A. Domination by Bernoulli percolation
We sketch here a simple argument proving Theorem 1.3 in $d \ge 5$, with $\delta = 1$. This argument is based on the domination of the interlacement set $\mathcal{I}^u$ in thick two-dimensional slabs by the standard Bernoulli percolation. This domination seems to be folklore in the random interlacement community, but to our knowledge it does not appear in any previous publication.
Let $K$ be a sufficiently large constant and $\varepsilon \in (0, 1)$. Let $E_2$ be the set of nearest-neighbour edges of $\mathbb{Z}^2$, and for every $e = (x, y) \in E_2$, let [...], where we identify $x = (x_1, x_2) \in \mathbb{Z}^2$ with $(x_1, x_2, 0, \ldots, 0) \in \mathbb{Z}^d$ in the standard way. $G_e$ is a thin parallelepiped of length $K + 2K^\varepsilon$ and width $2K^\varepsilon$ along the scaled edge $Ke$.
Let $W^\star_e \subset W^\star$ (recall Section 2 for the notation) be the set of all doubly-infinite trajectories modulo time shift that hit $G_e$ but not $\bigcup_{e' : \operatorname{dist}(e, e') \ge 1} G_{e'}$. The fact that the co-dimension of $\mathbb{Z}^2$ is at least 3, that is, the random walk is transient in the direction perpendicular to $\mathbb{Z}^2$, can be used to show that $P[\omega \in \Omega \text{ contains a trajectory with label smaller than } u \text{ that intersects } G_e \text{ but is not in } W^\star_e]$ [...]. (A.1) From Lemma 2 of [RS11b], it follows that the probability that there is a connection along the long direction of $G_e$ within $\mathcal{I}^u$ can be made arbitrarily large by increasing $K$. Let $C_e$ be the event that this connection uses only the trajectories in $W^\star_e$. Using (A.1), it follows that the probability of $C_e$ can also be made arbitrarily large by increasing $K$.
Moreover, Proposition 1 of [RS11b] (or our Proposition 4.2) can be used to show that for any $x \in \mathbb{Z}^2 \subset \mathbb{Z}^d$ and $v, w \in \mathcal{I}^u \cap B(Kx, K^\varepsilon)$ there is a connection between $v$ and $w$ within $\mathcal{I}^u \cap B(Kx, 2K^\varepsilon)$ with probability tending to 1 as $K$ increases. We denote by $D_x$ the event that this connection uses only the trajectories in $\bigcup_{e \ni x} W^\star_e$. Again, applying (A.1) and choosing $K$ large, the probability of $D_x$ can be made arbitrarily large.
Finally, call the edge $e = (x, y) \in E_2$ good when $C_e \cap D_x \cap D_y$ occurs. It follows that the probability that $e$ is good can be made arbitrarily large by choosing $K$ large. Since $W^\star_e$ and $W^\star_{e'}$ are disjoint subsets of $W^\star$ when $\operatorname{dist}(e, e') \ge 1$, the events '$e$ is good' and '$e'$ is good' are independent when $\operatorname{dist}(e, e') \ge 4$. Moreover, using the events $D_x$, the connections realising $C_e$, $C_{e'}$ in two adjacent good edges $e$, $e'$ can be connected to form one path.
Using the domination argument of [LSS97], we see that for $K$ large the good edges dominate the supercritical Bernoulli percolation on $\mathbb{Z}^2$; in particular, with probability one there is an infinite cluster $C \subset \mathbb{Z}^2$ of good edges. Moreover, when $x, y \in C$ are connected by a path of length $\ell$ in $C$, $B(Kx, K^\varepsilon)$ and $B(Ky, K^\varepsilon)$ are connected by a path of length at most $\ell\{(K + 2K^\varepsilon)(2K^\varepsilon + 1)^{d-1} + 2(4K^\varepsilon + 1)^d\}$ within $\mathcal{I}^u$ (the factor in braces is simply the volume of the parallelepiped $G_e$ plus the volume of the two boxes $B(Kx, 2K^\varepsilon)$, $B(Ky, 2K^\varepsilon)$).
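To get a feeling for the size of the factor in braces, the following short Python sketch evaluates it for concrete parameters. The function names `slab_path_factor` and `path_length_bound` are hypothetical, and $K^\varepsilon$ is treated as a real number without rounding, which is an assumption of the sketch.

```python
def slab_path_factor(K, eps, d):
    """Volume bound (K + 2K^eps)(2K^eps + 1)^(d-1) + 2(4K^eps + 1)^d:
    the parallelepiped G_e plus the two boxes of radius 2K^eps."""
    k = K ** eps  # K^eps, kept as a real number (assumption of this sketch)
    parallelepiped = (K + 2 * k) * (2 * k + 1) ** (d - 1)
    boxes = 2 * (4 * k + 1) ** d
    return parallelepiped + boxes


def path_length_bound(ell, K, eps, d):
    """Upper bound on the length within the interlacement set of a path that
    follows ell good edges of the coarse-grained lattice."""
    return ell * slab_path_factor(K, eps, d)


if __name__ == "__main__":
    # Example: K = 16, eps = 1/2, d = 5, so K^eps = 4.
    print(slab_path_factor(16, 0.5, 5))
    print(path_length_bound(10, 16, 0.5, 5))
```

The factor depends only on $K$, $\varepsilon$ and $d$, so the bound is linear in $\ell$; this linearity is what the argument needs in order to pass from distances in the coarse-grained cluster to internal distances in $\mathcal{I}^u$.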
Theorem 1.1 of [AP96] then implies that the claim of Theorem 1.3 holds (with $\delta = 1$) for all $x \in \mathcal{I}^u \cap \bigcup_{y \in C} B(Ky, K^\varepsilon)$. The extension to all $x \in \mathcal{I}^u$ is then straightforward: repeat the argument for other coordinate directions and use Proposition 1 of [RS11b] or Proposition 4.2 to make the final connections to those $x$'s that are not in the $K^\varepsilon$-neighbourhood of $KC$.