Phase transitions for edge-reinforced random walks on the half-line

We study the behaviour of a class of edge-reinforced random walks on $\mathbb{Z}_+$, with heterogeneous initial weights, where the weight of an edge can be updated only when the edge is traversed from right to left. We describe the different behaviours of this process and the phase transitions that arise as trade-offs between the strength of the reinforcement and that of the initial weights. Our results complement the ones given by Davis~\cite{Davis89, Davis90}, Takeshima~\cite{Takeshima00, Takeshima01} and Vervoort~\cite{Vervoort00}.


Introduction
Reinforced random walks (RRW) have been extensively studied over the past 30 years. The canonical model is the one introduced by Coppersmith and Diaconis [2], called the Linearly Edge-Reinforced Random Walk (LERRW), which can be described as follows. Consider a locally finite graph G and assign initial weight one to each edge. These weights are updated according to the behaviour of the process. The LERRW takes values on the vertices of G; at each step it jumps to one of the neighbours of the present vertex, say x. The probability of picking a particular neighbour is proportional to the weight of the edge connecting that vertex to x. Each time the process traverses an edge, the weight of that edge is increased by one. When G is a tree, the LERRW is a random walk in an i.i.d. random environment. In general, it can be represented as a mixture of Markov chains (see Merkl and Rolles [6]). The mixing measure is connected with $H^{2|2}$ models, which in turn are used to explain the phenomenon of Anderson localization. For more information about this connection and details about $H^{2|2}$ models, see for example [7] and its bibliography.
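For readers who wish to experiment, the LERRW dynamics just described are easy to simulate on the half-line. The following Python sketch is our own illustration (the names `lerrw_step` and `run_lerrw` are not from the literature); it implements the rule "jump with probability proportional to edge weights, then add one to the traversed edge".

```python
import random

def lerrw_step(pos, weights, rng):
    """One step of the linearly edge-reinforced random walk on Z_+.

    `weights[x]` is the current weight of edge {x, x+1}; all start at 1.
    The walk at `pos` picks a neighbouring edge with probability
    proportional to its weight, then increases the traversed edge's
    weight by one (linear reinforcement).
    """
    left = weights[pos - 1] if pos > 0 else 0.0   # reflection at the origin
    right = weights[pos]
    if rng.random() < right / (left + right):
        weights[pos] += 1.0
        return pos + 1
    weights[pos - 1] += 1.0
    return pos - 1

def run_lerrw(n_steps, seed=0):
    """Run the walk from the origin for n_steps steps; return path and weights."""
    rng = random.Random(seed)
    weights = [1.0] * (n_steps + 1)   # enough edges for n_steps moves
    pos, path = 0, [0]
    for _ in range(n_steps):
        pos = lerrw_step(pos, weights, rng)
        path.append(pos)
    return path, weights
```

Since the weight at $-1$ is zero, the walk is reflected at the origin, matching the half-line setting studied below.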
Our goal is to study a large class of edge-reinforced walks on $\mathbb{Z}_+$, inspired by the work of Davis [3,4]. We allow heterogeneous initial weights on the edges, and reinforcement schemes that need not be linear. A theorem of Vervoort [11] establishes an interesting recurrence criterion for a large class of RRW with general initial weights. The reinforcement scheme of these processes is characterised by the fact that there is a chance that an edge increases its weight when traversed from left to right (see Theorem V below for a precise statement). Hence, we focus our attention on the case where the reinforcement can happen only when the process traverses an edge from right to left. Intuitively, this class of processes is the 'most' transient, and it exhibits an interesting phase transition in terms of the initial weights and the reinforcement.
We provide a general phase diagram for edge-reinforced random walks taking values on the vertices of $\mathbb{Z}_+$ with heterogeneous initial weights. This includes a description of phase transitions that arise as trade-offs between the strength of the reinforcement and that of the initial weights. We use a martingale approach, a theorem of Austin (see [1]) and the so-called Rubin construction (see [4]) combined with Cramér-type bounds. Some of the methods used in the proofs are close in spirit to the ones proposed by Davis in [3,4].
1.1. Edge-reinforced random walks on the half-line. We define the edge-reinforced random walk (ERRW), denoted by $X = \{X_n\}_n$, as follows. This process takes values on the vertices of $\mathbb{Z}_+$ and at each step it jumps to one of the nearest neighbours. Denote by $\{x, x+1\}$ the non-oriented edge connecting x and x+1. In contrast, we use $(x, y)$ to denote the oriented edge connecting x to y. Define
$$\ell_n(x) := \#\big\{0 \le m < n : \{X_m, X_{m+1}\} = \{x, x+1\}\big\},$$
that is, the number of traversals of the edge $\{x, x+1\}$ by time n. For each $x \in \mathbb{Z}_+$, let $f_x = (f(\ell, x) : \ell \in \mathbb{Z}_+)$ be a non-decreasing sequence of positive numbers, called the reinforcement scheme at $x \in \mathbb{Z}_+$. For each $n \ge 0$, the weights at time n are defined by
$$w_n(x) := f(\ell_n(x), x),$$
and the transition probability is given by
$$P\big(X_{n+1} = x+1 \mid X_n = x,\ \mathcal{F}_n\big) = \frac{w_n(x)}{w_n(x-1) + w_n(x)},$$
where $\mathcal{F}_n$ denotes the natural filtration of the process. Here we set $w_n(-1) = 0$ for all $n \in \mathbb{Z}_+$, which implies a reflection at the origin. We say that the path $X(\omega)$ is recurrent if every point is visited infinitely often, and transient if every point is visited only finitely many times. Finally, if the set R of points that $X(\omega)$ visits infinitely often is finite and non-empty, then we say that $X(\omega)$ localizes. Takeshima [9] proved that the ERRW on $\mathbb{Z}_+$ is either recurrent, transient, or localizing. Notice that there are cases where $0 < P(X \text{ is transient}) < 1$.
Moreover, with positive probability the process drifts away to infinity. In fact, it behaves like a transient Markov chain on the sites x with x ≥ 2.
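The dynamics of Section 1.1 are straightforward to simulate for an arbitrary reinforcement scheme. The following Python sketch is our own illustration (the name `errw` and the dictionary bookkeeping are not notation from the paper); it implements the transition rule above, with $w_n(-1) = 0$ giving the reflection at the origin.

```python
import random

def errw(n_steps, f, seed=0):
    """Simulate the ERRW of Section 1.1 for a reinforcement scheme f.

    `f(ell, x)` is the weight of edge {x, x+1} after it has been
    traversed `ell` times.  The walk jumps from x to a neighbour with
    probability proportional to the current edge weights; the weight at
    -1 is taken to be 0, which reflects the walk at the origin.
    """
    rng = random.Random(seed)
    traversals = {}              # ell_n(x): traversal count of edge {x, x+1}
    pos, path = 0, [0]
    for _ in range(n_steps):
        w_left = f(traversals.get(pos - 1, 0), pos - 1) if pos > 0 else 0.0
        w_right = f(traversals.get(pos, 0), pos)
        if rng.random() < w_right / (w_left + w_right):
            traversals[pos] = traversals.get(pos, 0) + 1
            pos += 1
        else:
            traversals[pos - 1] = traversals.get(pos - 1, 0) + 1
            pos -= 1
        path.append(pos)
    return path
```

For instance, `errw(1000, lambda ell, x: 1.0 + ell)` gives a linearly reinforced walk; a factor-type scheme in the spirit of Theorem 2 could be passed as `f = lambda ell, x: ((ell // 2) + 1) ** 0.4 * (x + 1) ** 0.8`. We stress that this particular parametrisation is our illustration, not a definition taken from the paper.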
For $\ell \in \mathbb{Z}_+$ and $k \in \mathbb{N}$, define
$$F^{(k)}_\ell := \sum_{x=0}^{\infty} \frac{1}{f(\ell, x)^k}.$$
The quantity $F^{(1)}_0$ can be associated with the effective resistance of the network with the initial weights, which characterizes the behaviour of the corresponding Markov chain (see [5]). We say the ERRW is initially recurrent (resp. initially transient) if $F^{(1)}_0 = +\infty$ (resp. $F^{(1)}_0 < +\infty$). Davis [3] proved that initially recurrent reinforced random walks are not necessarily recurrent.
Theorem D. (i) If $F^{(2)}_0 = +\infty$, then X is either recurrent or it localizes on a single edge. (ii) There exists a reinforcement scheme $(f_x : x \in \mathbb{Z}_+)$ such that $F^{(2)}_0 < +\infty$, $F^{(1)}_0 = +\infty$, and X is transient with positive probability.

The known phases can be summarised in the following table, where we combined Theorem D with other results by Davis [4], Sellke [8] and Takeshima [9,10]. The question marks in Table I indicate what is left open in general. In this paper we partially fill these gaps for a general class of reinforcement schemes, which we call Factor Type Reinforcement (FTR), of the form
$$f(\ell, x) = \delta_\ell\, f(0, x),$$
where $\delta = (\delta_\ell : \ell \in \mathbb{Z}_+)$ is a positive non-decreasing sequence with $\delta_0 = 1$. Furthermore, Vervoort proved the following result (Theorem 8.2.2 in [11]). For the sake of completeness we include the proof in the Appendix.
By virtue of Theorem V, we can focus on the case where δ is unbounded and an edge can be reinforced only when the process traverses it from right to left, i.e. when $\delta_{2k} = \delta_{2k+1}$ for all $k \in \mathbb{Z}_+$. Indeed, as the walk starts at the origin, the traversals of a fixed edge alternate in direction, starting from left to right; hence the left-to-right traversals are exactly the odd-numbered ones, and the condition $\delta_{2k} = \delta_{2k+1}$ says precisely that these do not increase the weight.

Main results
it is non-decreasing, and $\delta_0 = 1$. Our main result is the following.
Theorem 2. Let X be a reinforced random walk with FTR, and suppose that $\sum_{\ell=0}^{\infty} \delta_\ell^{-1} < +\infty$; then X localizes on a single edge a.s.
As highlighted in the next example, if we perturb a single reinforcement even slightly, we can witness a transition from recurrence to transience.
and $g_\varepsilon(2k+1) = 0$ for all $k \in \mathbb{Z}_+$. Define the family of reinforced random walks accordingly.

Remark 4. We emphasize that outside the ranges $\alpha \in (1/2, 1]$ and $\rho \in [0, \infty)$ the behaviour of the process is known from previous results (see Table I and Figure 1). Moreover, the proofs of parts 3) and 4) of Theorem 2 cover more general cases, as stated in Propositions 7 and 8. In principle, parts 1) and 2) can also be adapted to more general reinforcements, e.g. with a slowly varying factor. More precisely, our proofs rely on integral estimates of certain series; reinforcement schemes given by power functions are easy to deal with and yield explicit estimates, but the method itself covers more general cases.

Proof of Theorem 2
Let $\tau := \inf\{n > 0 : X_n = 0\}$ and
$$M_n := \sum_{x=0}^{X_{n \wedge \tau} - 1} \frac{1}{w_{n \wedge \tau}(x)}, \qquad n \in \mathbb{Z}_+.$$
The process $M = (M_n : n \in \mathbb{Z}_+)$ is in general a non-negative supermartingale and will play a major role in our proofs. In fact, by virtue of our assumption that δ is DT, M is indeed a martingale (see Lemma 3.0 in [4] for details). We use the following 0-1 law (see Sellke [8] or Takeshima [10] for a proof).
3.1. Proof of Theorem 2 part 1). We argue by contradiction and suppose that $P(E) > 0$. On E, for any $y \in \mathbb{Z}_+$ there exists $n_y \in \mathbb{N}$ such that $w_n(x) = w_\infty(x)$ for all $x \le y$ and all $n \ge n_y$; this implies that, for all $n \ge n_y$, the corresponding terms of M are fixed on E.
By taking limits, we have that, on E,
$$M_\infty := \lim_{n \to \infty} M_n = \sum_{x=0}^{\infty} \frac{1}{w_\infty(x)}. \tag{3.1}$$
On the other hand, M is a non-negative martingale. Combining Doob's convergence theorem (see [12]) with (3.1), we have that $\sum_{x=0}^{\infty} (N_x + 1)^{-\rho} (x+1)^{-\alpha} < +\infty$ on the event E.
At the same time, as M is a non-negative martingale, we can apply Austin's theorem (see [1]), which implies that $\sum_{n} (M_{n+1} - M_n)^2 < +\infty$ a.s. For $n < \tau$ and $x \in \mathbb{Z}_+$, the increments of M at the successive visits to x can be bounded from below in terms of the weights. By Hölder's inequality, these two facts are incompatible under the assumptions of part 1). This gives a contradiction, and proves that $P(E) = 0$. The result follows from Sellke's 0-1 law (see Theorem S).

3.2. Proof of Theorem 2 part 2). In this section we prove that X is transient a.s. under the assumptions of part 2). We argue by contradiction: suppose that X is recurrent a.s. (see Theorem S). We prove that M is bounded in $L^2$, which implies that M is uniformly integrable and that $P(M_\infty > 0) > 0$, which in turn implies transience. In fact, for $\rho \in (0, 1/2)$, there exists a constant $c > 0$ such that the bound (3.2) holds; in the last step of (3.2), we used Jensen's inequality, as the map $y \mapsto y^{1-2\rho}$ is concave for $\rho \in (0, 1/2)$. A similar bound holds for $\rho = 1/2$. By virtue of (3.2), in order to prove Theorem 2 part 2) it is enough to prove that (3.4) holds for some constant C. To see why (3.4) is sufficient for our purposes, simply notice that $2\alpha - 2(1-2\rho)/(1-\rho) > 1$ is equivalent to the condition of the Theorem. The remaining part of this section is devoted to the proof of (3.4). In particular, Lemma 6 below is the key result for our goal.
Definition 5. Fix $x \in \mathbb{N}$ and set $\gamma_x := (x+1)^\alpha / x^\alpha$. Consider a generalised Pólya urn which initially contains one white and one black ball. The reinforcement scheme for white balls is $f^{(w)}(k) = k^\rho$, for $k \in \mathbb{N}$; the reinforcement scheme for black balls is $f^{(b)}(k) = \gamma_x k^\rho$, for $k \in \mathbb{N}$. In other words, if the composition of the urn at stage n is z white balls and y black balls, then the probability of picking a white ball at the next stage is $f^{(w)}(z) / \big(f^{(w)}(z) + f^{(b)}(y)\big)$. At each stage a ball is picked, and returned to the urn together with another ball of the same colour. Denote by $P_x$ the measure describing this model, and by $E_x$ the expected value with respect to $P_x$. Denote by $(W_n, B_n)$, with $n \in \mathbb{Z}_+$, the composition of the urn at time n, with $W_0 = B_0 = 1$. Denote by $\mathrm{Po}^{(x)}$ the sequence $(W_n, B_n : n \in \mathbb{Z}_+)$ under the measure $P_x$.

Lemma 6. Assume that $\alpha \in (1/2, 1]$ and $\rho \in (1-\alpha, 1/2]$. Let $H_n := \inf\{k \in \mathbb{N} : W_k = n\}$ for $n \in \mathbb{N}$, and define $B^*_n := B_{H_n}$. There exists a constant C such that the stated bound holds for any $x, n \in \mathbb{N}$.

Proof. Consider Rubin's embedding (see the Appendix of [4]), which can be shortly described as follows. Let $(Y_i)_i$ and $(Z_i)_i$ be two independent sequences of independent exponential random variables with parameter one. Set, for each $n \in \mathbb{N}$,
$$\widetilde{W}_n := \sum_{i=1}^{n} \frac{Y_i}{f^{(w)}(i)}, \qquad \widetilde{B}_n := \sum_{i=1}^{n} \frac{Z_i}{f^{(b)}(i)}.$$
The variables $(\widetilde{W}_n, \widetilde{B}_n : n \in \mathbb{N})$ can be used to generate a Pólya urn process with the features of Definition 5. In this context, for $s, n \in \mathbb{N}$, the event $\{\widetilde{B}_s < \widetilde{W}_n\}$ occurs if and only if, by the time the urn contains $n+1$ white balls, it contains at least $s+1$ black ones. Let $a_{n,x} := n \cdot \gamma$, where $\lceil x \rceil$ denotes the smallest integer larger than or equal to x. Fix a sequence $\theta_n \in (0, 1/2)$, to be specified later. For $s \ge a_{n,x}$ we have:
Call the product of the first two terms $I_{\theta_n,n,x}$ and the third term $II_{\theta_n,n,x,s}$. The first inequality follows from an elementary bound on the exponential function, the third inequality uses an integral comparison, and the fourth and fifth inequalities follow by noting that $a_{n,x} \ge n\gamma$ and $\gamma_x \ge 1$, respectively. By choosing $\theta_n = \tfrac{1}{2} n^{-(1-\rho)/2}$, we see that $I_{\theta_n,n,x}$ is bounded by a positive constant $C_2$. It then remains to bound $\sum_{s=a_{n,x}+1}^{\infty} II_{\theta_n,n,x,s}$. Suppose that n is large enough to imply $C_1 n^{-\frac{1-\rho}{2}} a_{n,x}^{1-\rho} \ge 1$. Substituting $t = C_1 n^{-\frac{1-\rho}{2}} s^{1-\rho}$ and comparing the sum with an integral, we obtain a bound on $\sum_{s=a_{n,x}+1}^{\infty} II_{\theta_n,n,x,s}$; in the last inequality, we used the fact that, for $m \in \mathbb{N}$,
$$(m+1)^{1-\rho} - m^{1-\rho} \le (1-\rho)\, m^{-\rho},$$
which can be proved via the mean value theorem applied to the function $h(t) = (m+t)^{1-\rho}$, defined for $t > 0$. Finally, the claimed bound holds for all large n, and by choosing $C > 0$ large enough we obtain it for all n.
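Definition 5 is concrete enough to simulate directly. The following Python sketch is our own illustration (the names `polya_urn` and `b_star` are not from the paper); it generates the urn composition and the quantity $B^*_n = B_{H_n}$ used in Lemma 6.

```python
import random

def polya_urn(n_draws, rho, gamma_x, seed=0):
    """Generalised Pólya urn of Definition 5.

    Start with one white and one black ball.  With z white and y black
    balls, white is drawn with probability z**rho / (z**rho + gamma_x * y**rho),
    and the drawn colour gains one ball.  Returns the trajectory
    [(W_0, B_0), (W_1, B_1), ...].
    """
    rng = random.Random(seed)
    w = b = 1
    traj = [(w, b)]
    for _ in range(n_draws):
        p_white = w ** rho / (w ** rho + gamma_x * b ** rho)
        if rng.random() < p_white:
            w += 1
        else:
            b += 1
        traj.append((w, b))
    return traj

def b_star(traj, n):
    """B*_n = B_{H_n}: the number of black balls when the urn first
    contains n white balls (None if that never happens in traj)."""
    for w, b in traj:
        if w == n:
            return b
    return None
```

Rubin's embedding in the proof above is an alternative, continuous-time way to generate the same draw sequence: the i-th white (resp. black) arrival occurs after an exponential time divided by $f^{(w)}(i)$ (resp. $f^{(b)}(i)$), and merging the two arrival streams in time order reproduces the urn.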
We can use a collection of independent generalised Pólya urns $(\mathrm{Po}^{(x)} : x \in \mathbb{N})$, where $\mathrm{Po}^{(x)}$ has distribution $P_x$, to generate a reinforced random walk X. In this context, the jumps from vertex x are modelled using the urn $\mathrm{Po}^{(x)}$: each time the process is at x, we pick a ball from that urn and observe its colour; if it is black the walk moves to $x+1$, and it moves to $x-1$ otherwise. Recall that $N_x$ is the total number of jumps from x to $x-1$ before time τ. As we assume that X is recurrent a.s., the variable $N_x$ is $\sigma(\mathrm{Po}^{(k)} : k \in \{1, 2, \ldots, x-1\})$-measurable; therefore $N_x$ is independent of $\mathrm{Po}^{(x)}$. Applying Lemma 6 to the urn $\mathrm{Po}^{(x)}$ with $n = N_x$, we obtain (3.7). As $\rho \in (0, 1/2]$, we have $(1+\rho)/2 < 1$. Taking expectations on both sides of (3.7) and using Jensen's inequality yields the corresponding bound in expectation.

Proof of Theorem 2 part 2). Consider a sequence $(a_x : x \in \mathbb{N})$ satisfying (3.9). Notice that the sequence $a_x := E[N_x]$ satisfies (3.9), and $a_1$ is finite by virtue of the definition of $N_1$. Hence (3.4) is proved once we establish the corresponding bound for some positive constant C and all $x \in \mathbb{N}$. We prove this by induction. It holds for $x = 1$, as we can simply choose C large enough. Suppose it holds for x. Using (3.9), we obtain (3.12). Set $\widetilde{C} := c\, C^{(\rho-1)/2}$, with c a constant not depending on C. Notice that, as $\rho \in (0, 1/2]$, the larger C is, the smaller $\widetilde{C}$ becomes, approaching zero in the limit. Using $\alpha, \rho \le 1$, the right-hand side of (3.12) can be bounded as in (3.13). We can choose $\widetilde{C}$ smaller than 1 (i.e. C large enough); hence the right-hand side of (3.13) is bounded by $\frac{x}{x+1}$ times the required bound, which completes the induction.

3.3. Proof of Theorem 2 part 3). We prove a more general result; the proof is closely related to the one given by Davis [3].

Proof.
For each $x \in \mathbb{Z}_+$, the weights satisfy $w_n(x) \ge f(0, x)$ for all n. Thus we have $\sum_{x=0}^{\infty} \frac{1}{f(0, x)^2} =: \lambda < +\infty$. By the orthogonality of martingale increments, for any N the second moment $E[M_N^2]$ is bounded in terms of λ, which shows that $\{M_n\}$ is an $L^2$-bounded martingale. In particular, M is uniformly integrable and $P(M_\infty > 0) > 0$. This, together with Sellke's 0-1 law (see Theorem S), shows that X is transient.

3.4. Proof of Theorem 2 part 4). We provide a proof of a more general result, which includes initially transient cases.
Proposition 8. Suppose that X has FTR, that the initial weights satisfy a growth condition with some constant $C \in (0, \infty)$, and that $\sum_{\ell=0}^{\infty} \delta_\ell^{-1} < +\infty$. Then X localizes on a single edge a.s.

Proof. For each $x \in \mathbb{N}$, define
$$E_x := \Big\{ \sum_{n=0}^{\infty} \mathbb{1}_{(X_n, X_{n+1}) = (x, x+1)} = 0 \Big\},$$
that is, the event that the process never jumps from x to $x+1$. At the k-th visit of the process to vertex x, the conditional probability that it jumps to $x-1$ can be bounded from below, and summing these bounds shows that $\sum_x P(E_x) = +\infty$. The second Borel-Cantelli lemma implies that $P(E_x \text{ occurs for infinitely many } x) = 1$ and $P(X \text{ is of finite range}) = 1$. In fact, we have $P(X \text{ localizes on a single edge}) = 1$ by an application of Rubin's theorem (see Corollary 3.6 in [9]).
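The localization mechanism of Proposition 8 can be observed numerically. The sketch below is our own illustration: it simulates an ERRW with a factor-type weight $\delta_\ell\, w_0(x)$ (our reading of FTR) whose factor sequence grows geometrically, so that $\sum_\ell \delta_\ell^{-1} < +\infty$; the particular choices of δ, $w_0$ and the name `ftr_walk` are ours.

```python
import random

def ftr_walk(n_steps, delta, w0, seed=0):
    """ERRW with factor-type reinforcement: the weight of edge {x, x+1}
    after ell traversals is delta(ell) * w0(x).  Reflection at the
    origin is enforced via a zero weight at -1."""
    rng = random.Random(seed)
    count = {}               # traversal counts per edge {x, x+1}
    pos, path = 0, [0]
    for _ in range(n_steps):
        wl = delta(count.get(pos - 1, 0)) * w0(pos - 1) if pos > 0 else 0.0
        wr = delta(count.get(pos, 0)) * w0(pos)
        if rng.random() < wr / (wl + wr):
            count[pos] = count.get(pos, 0) + 1
            pos += 1
        else:
            count[pos - 1] = count.get(pos - 1, 0) + 1
            pos -= 1
        path.append(pos)
    return path

# Strong reinforcement: delta(ell) = 2^(ell // 2) satisfies delta_0 = 1 and
# delta_{2k} = delta_{2k+1} (only right-to-left traversals reinforce), and
# sum 1/delta < infinity, the regime of Proposition 8.
strong = ftr_walk(2000, lambda ell: 2.0 ** (ell // 2), lambda x: 1.0, seed=4)
```

In such runs the range of the walk typically remains very small, in line with the almost-sure localization on a single edge predicted by Proposition 8; the location of that edge, however, is random.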