The true reinforced random walk with bias

We consider a self-attracting random walk in dimension d=1, in the presence of a field of strength s that biases the walker toward a target site. We focus on the dynamic case (true reinforced random walk), where memory effects are implemented at each time step, as opposed to the static case, where memory effects are accounted for globally. We analyze in detail the asymptotic long-time behavior of the walker through the main statistical quantities (e.g. distinct sites visited, end-to-end distance), and we discuss a possible mapping between this dynamic self-attracting model and the trapping problem for a simple random walk, in analogy with the static model. Moreover, we find that, for any s>0, the random walk becomes ballistic and that field effects always prevail over memory effects without any singularity, already in d=1; this is in contrast with the behavior observed in the static model.


Introduction
The reinforced random walk is an interesting example of stochastic process with memory, defined as the statistical problem of a random walker which is attracted towards the sites it has already visited. It represents the attractive counterpart of the well-known self-avoiding walk, which is biased away from the already visited region. Memory of this type is often present in physics, in ecology, in search strategies and in biological systems: in all these cases, the walkers modify the environment they move on in a local manner, by leaving signals, trails or concentrations of substances which bias their subsequent motion [1,2,3,4,5,6,7]. The process with memory can take place in physical space, like in bacterial motion, or on a theoretical network, as it happens in models of genetic mutations in sequence spaces [8,9,10].
The memory effect of the true reinforced random walk is incorporated locally in the interaction between steps, and can be implemented through several rules [2]. For instance, one can include the attractive part in the dynamics by increasing the weight of the jump probability towards a given site according to the number of times the site has been visited [11,12]. In this case, the resulting self-attracting walk collapses in any finite dimension and, at long times, the walker is confined to a finite set of sites. A more interesting way to account for memory effects is to let the jump probability to a given site be $\sim \exp(ku)$, where $k = 1$ for sites already visited at least once and $k = 0$ for the others. This model, called the one-step (true) reinforced random walk, is related to the Donsker-Varadhan Wiener sausage problem [13,14,15] and its properties have been studied in detail in [12,16,17,18,19]. In particular, it was shown that on lattices of dimension $d \geq 2$ it exhibits a non-trivial phase transition between a diffusive and a collapsed phase as a function of the parameter $u$. This is evidenced by, e.g., the behavior of the mean-squared displacement $x_t$ and of the mean number $S_t$ of distinct sites visited up to time $t$, which change at a critical value $u_c$. Indeed, $S_t \sim t$ for $u < u_c$ (with a logarithmic correction for $d = 2$), but $S_t \sim t^{d/(d+1)}$ for $u > u_c$. In $d = 1$, no phase transition has been found and $S_t$ scales as in normal diffusion, $S_t \sim x_t \sim \sqrt{t}$. Beyond this truly dynamical approach to modeling self-attracting random walks, a so-called static model has also been defined. The static model, sometimes confused with its dynamic counterpart, actually describes the statistics of self-attracting chains: each possible path is considered as a particular configuration within the space the process is embedded in, and the weight of each instance is assigned globally according to the self-attracting interaction.
As pointed out in the case of the self-avoiding walks [20], static and dynamical self-attracting random walks differ considerably and their long-time and critical behavior do not coincide.
When a field biasing the walker toward a given direction and/or point is present, only a few results are available. For instance, the model with bias has been studied in [21,22,23] in the case of static memory, showing a second-order (first-order) phase transition in $d = 1$ ($d \geq 2$) as a function of the bias parameter, between a ballistic regime, where the displacement $x_t$ scales linearly with time, $x_t/t \to v > 0$, and a sub-ballistic regime, where $x_t \sim t^{\alpha}$ with $\alpha < 1$. These results have been obtained by exploiting the mapping onto a random walk diffusing in a medium with a uniform distribution of traps [21,24]. However, this powerful mapping does not hold when memory effects are accounted for dynamically, so the true reinforced random walk still presents many open problems.
Here we shall concentrate on the 1D version of the true one-step reinforced random walk, accounting also for the presence of a bias; the bias is taken to point towards a given site called the target. Indeed, while already exhibiting a non-trivial behaviour, the 1D model is amenable to an analytical treatment of the main statistical quantities of interest in the dynamical process. Such results may help to clarify the considerable controversy and confusion generated by the reinforced random walk despite its simple definition. In particular, we consider the number of distinct sites visited after a time $t$ and its growth rate, the mean displacement, and the mean time spent on the border of the (instantaneous) span, and we provide analytic expressions for their asymptotic behavior, in very good agreement with extensive numerical simulations. As we will show, in the presence of a bias with strength $s$ the random walk becomes ballistic, i.e. $x_t \sim S_t \sim t$; namely, in $d = 1$, field effects prevail over memory effects for any $s > 0$, without any singularity. This is in contrast with the behavior observed in the static model [21,24]. Finally, exploiting the results obtained in the asymptotic region, we obtain a formal expression for the probability of realizing a given path, highlighting its explicit dependence on $S_t$, $x_t$ and on the number of times the random walk has sojourned on the borders of the (instantaneous) spanned region. This allows us to extend to the dynamic model the mapping with the trapping problem, already exploited for the static model.
The paper is organized as follows. In Sec. 2 we provide the main definitions and in Sec. 3 we describe the mapping with the trapping model. Then we analyze in detail the main statistical quantities: the number of distinct sites visited (Sec. 4), the time spent on the edges of the spanned region (Sec. 5) and the mean time to return to the target (Sec. 6). Finally, Sec. 7 contains our conclusions, and the Appendices include details of the analytical calculations.

Definitions
Let $G = \{V, E\}$ be a graph with vertex set $V$ and edge set $E$. Given $i, j \in V$, the stochastic process we consider is defined by the following single-step jump probability from site $i$ to a neighboring site $j$:

$$p_t(j|i) = \frac{e^{u k_t(j)}}{\sum_{l \in \Gamma(i)} e^{u k_t(l)}}, \qquad (1)$$

with $k_t(j) = 1, 0$ according to whether $j$ has or has not been visited before time $t$, respectively; $\Gamma(i)$ is the set of neighbors of $i$, and $u$ is a constant whose sign distinguishes between reinforced ($u > 0$) and repulsive ($u < 0$) walks. We also account for the presence of a field biasing the random walk towards ($s > 0$) or away from ($s < 0$) a given site $T \in V$, hereafter called the target:

$$p_t(j|i) = \frac{e^{u k_t(j) - s d_j}}{\sum_{l \in \Gamma(i)} e^{u k_t(l) - s d_l}}, \qquad (2)$$

where $d_j$ is the chemical distance between $j$ and the target, i.e. the number of links between $j$ and the position of the target, while $s$ tunes the field effect.
In the following we consider the case of a reinforced ($u > 0$) walk in the presence of an attractive target ($s > 0$), defined on a 1-dimensional lattice. When $d = 1$, the walker position at time $t$ can be denoted by a scalar quantity $x_t$, and the site where the target is located is denoted by $x_T$. Specializing (2), we obtain the following probability to jump from $x_t$ to $x_t \pm 1$:

$$p(x_t \to x_t \pm 1) = \frac{e^{u k(x_t \pm 1) - s |x_t \pm 1 - x_T|}}{N_t}, \qquad (3)$$

where $N_t$ is the proper normalization factor and we dropped the redundant index $t$ from $k$.
Without loss of generality we can choose x 0 = 0, so that x t represents the end-to-end distance of the walk at time t.
For the analytical investigation of the process, it is convenient to focus separately on the initial transient regime (before eventually reaching the target) and the asymptotic regime (after reaching the target); these regimes correspond to a "longitudinal field" and to a "central field", respectively. In the former case, as long as $x_t + 1 < x_T$, the position of the target can be ignored and the transition probability can be simplified as

$$p(x_t \to x_t \pm 1) = \frac{e^{u k(x_t \pm 1) \pm s}}{N_t}, \qquad N_t = e^{u k(x_t + 1) + s} + e^{u k(x_t - 1) - s}. \qquad (4)$$
More generally, we can look at the process as a biased random walk moving on an inhomogeneous substrate characterized by three regions: the interior of the visited region and its two interfaces with the unvisited region, where the single step probability assumes different expressions. Referring to figure (1), where the single step probabilities to move rightwards in the spanned region, at the right interface and at the left interface are denoted by $\alpha$, $\beta$ and $\gamma$, respectively, we have

$$\alpha = \frac{e^{s}}{2\cosh s}, \qquad \beta = \frac{e^{s - u/2}}{2\cosh(s - u/2)}, \qquad \gamma = \frac{e^{u/2 + s}}{2\cosh(u/2 + s)}. \qquad (6)$$
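As a quick numerical sanity check, the closed forms for α, β and γ can be compared with the direct normalization of the weights exp(uk ± s) of the transient regime. The following is a minimal sketch: the function names are illustrative and not part of the original work, and α and β are taken as given in Appendix A.1.

```python
import math


def step_right_probability(u, s, k_right, k_left):
    """Probability of a rightward step from the weights exp(u*k + s) and exp(u*k - s)."""
    w_right = math.exp(u * k_right + s)
    w_left = math.exp(u * k_left - s)
    return w_right / (w_right + w_left)


def alpha_beta_gamma(u, s):
    """Closed forms: alpha (inside the span), beta (right border), gamma (left border)."""
    alpha = math.exp(s) / (2.0 * math.cosh(s))
    beta = math.exp(s - u / 2.0) / (2.0 * math.cosh(s - u / 2.0))
    gamma = math.exp(u / 2.0 + s) / (2.0 * math.cosh(u / 2.0 + s))
    return alpha, beta, gamma


u, s = 1.0, 0.3
alpha, beta, gamma = alpha_beta_gamma(u, s)
# Inside the span both neighbours are visited (k = 1, 1).
print(alpha, step_right_probability(u, s, 1, 1))
# Right border: the right neighbour is new (k = 0), the left one is visited (k = 1).
print(beta, step_right_probability(u, s, 0, 1))
# Left border: the right neighbour is visited (k = 1), the left one is new (k = 0).
print(gamma, step_right_probability(u, s, 1, 0))
```

Each printed pair should coincide; for u > 0 and s > 0 the values also satisfy α > β > 1 − γ.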
The mapping with trap models
According to the model described in the previous section, we assign a probability $p(j|i)$ to the walker to step from $i$ to the neighbor site $j$ during the evolution. As a result, memory effects are accounted for at each time step, that is, normalization is local. Another common way of modeling a reinforced (or repulsive) interaction between steps is to assign to any possible realization $w$ of the walk an energy $E_{u,s}(w) = uS(w) - sx(w)$, where $S(w)$ is the number of distinct sites visited and $x(w)$ is the end-to-end distance of the walk $w$ (here, again, $u$ and $s$ are parameters tuning the interaction). Then, the probability to realize a particular path is taken to scale exponentially with the related energy; the normalization is therefore global (see the next subsection for more details). Models with local and global normalization conditions are also referred to as dynamic and static models, respectively [18].
A comparative study of these models in one dimension [25] has shown that the emergent behavior depends crucially on the kind of normalization, either global or local. In fact, in the static model, by definition, all the paths sharing the same number of visited sites S (and, in the presence of bias, sharing also the same elongation x) have the same probability, conversely in the dynamic model paths with the same S (and x) can occur with different probability, depending on the number of times the walk visits the borders between different regions (see figure (1)), as will be explained in the following.
Another point which distinguishes dynamic and static models is the possibility for the latter to be mapped into a trap model [23,21]; a possible extension of this mapping to the dynamic case is discussed in Sec. 3.2.
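The difference between local and global normalization can be made concrete on two short paths. The sketch below multiplies the single-step probabilities of the dynamic rule along a path, in the transient regime where the target lies far to the right; the function name and the two example paths are illustrative choices, not taken from the original work. Both paths have t = 4, S = 3 and x = 0, so the static model assigns them the same weight, whereas the dynamic rule does not.

```python
import math


def dynamic_path_probability(path, u, s=0.0):
    """Product of the single-step probabilities of the dynamic rule along `path`."""
    visited = {path[0]}
    prob = 1.0
    for here, there in zip(path, path[1:]):
        if abs(there - here) != 1:
            raise ValueError("path must consist of nearest-neighbour steps")
        k_right = 1 if here + 1 in visited else 0
        k_left = 1 if here - 1 in visited else 0
        w_right = math.exp(u * k_right + s)
        w_left = math.exp(u * k_left - s)
        w_step = w_right if there == here + 1 else w_left
        prob *= w_step / (w_right + w_left)
        visited.add(there)
    return prob


path_a = [0, 1, 0, -1, 0]  # extends the span on both sides
path_b = [0, 1, 2, 1, 0]   # extends the span on the right only

for u in (0.0, 1.0):
    print(u, dynamic_path_probability(path_a, u), dynamic_path_probability(path_b, u))
```

For u = 0 both paths have probability 1/16, as for a simple random walk; for u > 0 the two probabilities differ because the paths leave different sites of the span at different stages of its growth.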

The static model
As anticipated, in the static model [18,23] an energy is associated to any path $w$; in general the energy is $E_{u,s}(w) = uS(w) - sx(w)$, where $u$ is the cost associated to a newly visited site and $s$ accounts for the field. Accordingly, the probability for the path $w$ of length $t$ to occur is defined as

$$P_{u,s}(w) = \frac{e^{-E_{u,s}(w)}}{Z_{u,s}(t)},$$

with $Z_{u,s}(t)$ the partition function, given by

$$Z_{u,s}(t) = \sum_{S,x} \hat{W}(S, x, t)\, e^{-uS + sx},$$

where $\hat{W}(S, x, t)$ is the number of paths of length $t$ with elongation $x$ and $S$ distinct sites visited. Starting from the partition function, we can compute the mean number of distinct sites visited $\langle S_t \rangle$, the mean displacement $\langle x_t \rangle$ and the mean square displacement $\langle x_t^2 \rangle$ as

$$\langle S_t \rangle = -\frac{\partial \ln Z_{u,s}(t)}{\partial u}, \qquad \langle x_t \rangle = \frac{\partial \ln Z_{u,s}(t)}{\partial s}, \qquad \langle x_t^2 \rangle = \frac{1}{Z_{u,s}(t)} \frac{\partial^2 Z_{u,s}(t)}{\partial s^2}.$$

Now, we recall that the survival probability for a 1-dimensional biased walk in the presence of a quenched concentration $c$ of static traps after $t$ steps can be written as [21]

so that, using (9) and posing $1 - c = e^{-u}$, we obtain

Of course, for the unbiased case, starting from (11), we recover the well-known Rosenstock approximation

where the last relation, holding in the early-time regime and for $c \ll 1$, i.e. $u \ll 1$, allows the simple estimate $Z_{u,s}(t) \approx e^{-u \langle S_t \rangle}$. In general, through this mapping, the partition function describing the static model on the substrate $G$ at time $t$ can be directly associated to the survival probability of a standard random walk on $G$ at time $t$, in the presence of a concentration $c = 1 - e^{-u}$ of static traps.
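For short walks the static model can be evaluated exactly by brute force. The following sketch assumes that each path carries the Boltzmann weight exp(−E) with E = uS − sx as defined above, enumerates all 2^t nearest-neighbour paths, and checks that the mean number of visited sites and the mean displacement agree with the logarithmic derivatives of the partition function. The function name and the finite-difference step are illustrative choices.

```python
import itertools
import math


def static_partition(t, u, s):
    """Return (Z, <S>, <x>) by enumerating all 2**t nearest-neighbour paths from 0."""
    z = s_sum = x_sum = 0.0
    for steps in itertools.product((-1, 1), repeat=t):
        pos = lo = hi = 0
        for step in steps:
            pos += step
            lo, hi = min(lo, pos), max(hi, pos)
        n_sites = hi - lo + 1  # in 1D the visited set is the interval [lo, hi]
        weight = math.exp(-u * n_sites + s * pos)
        z += weight
        s_sum += n_sites * weight
        x_sum += pos * weight
    return z, s_sum / z, x_sum / z


t, u, s, h = 8, 0.7, 0.2, 1e-5
z, mean_s, mean_x = static_partition(t, u, s)

# Central finite differences of ln Z with respect to u and s.
dlnz_du = (math.log(static_partition(t, u + h, s)[0])
           - math.log(static_partition(t, u - h, s)[0])) / (2.0 * h)
dlnz_ds = (math.log(static_partition(t, u, s + h)[0])
           - math.log(static_partition(t, u, s - h)[0])) / (2.0 * h)

print(mean_s, -dlnz_du)
print(mean_x, dlnz_ds)
```

The enumeration scales as 2^t, so it is only meant as a check at small t, not as a way to reach the asymptotic regime.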

The Dynamical model
Let us consider a particular path of $t$ steps, $w = \{x_0, ..., x_t\}$; exploiting (4) we can write the probability $P_{u,s}(w)$ of realizing the path $w$ in terms of macroscopic quantities, such as the number of distinct sites visited at time $t$ and the end-to-end distance. In fact, we have

This product can be decomposed into three parts, according to whether $x_i$ corresponds to the right border ($i \in \partial_R$), to the left border ($i \in \partial_L$) or to a site within the span of the random walk ($i \in I$), namely $\prod_{i=0}^{t} = \prod_{i \in \partial_R} \cdot \prod_{i \in \partial_L} \cdot \prod_{i \in I}$. Considering the values that $k_+$ and $k_-$ assume in the different regions, we obtain

We call $t_I$, $t_R$, $t_L$ the number of times that the random walker has sojourned in each of the three regions, respectively. Then, recalling that $t = t_I + t_R + t_L$, the denominator becomes

Therefore, given $u$ and $s$, all the paths sharing the same span, elongation and border times, whose number shall be denoted by $W(S, t_B, t)$, are realized with the same probability

As a consequence, the probability of realizing any of these paths is $P_{u,s}(S, t_B, t) = P_{u,s}(w) W(S, t_B, t)$. We also define $W(S, t)$ as the number of paths with span $S$ and $f(t_B|S, t)$ as the relative fraction with $t_B$ visits on border sites, so that

For simplicity, let us now focus on the unbiased case; this allows us to neglect the quantity $x_t$, and (15) simplifies into

that is, the probability of a path in the dynamic model depends on the two stochastic (non-Markovian) variables $S_t$ and $t_B = t_R + t_L$. We now calculate the probability $P_{u,s=0}(S, t)$ of realizing any arbitrary path of length $t$ which spans $S$ distinct sites, regardless of the number of times $t_B$ it visits a border. This can be obtained by summing $P_{u,s=0}(S, t_B, t)$ over $t_B$, namely

We note that $W(S, t)\, 2^{-t}$ can be looked at as the probability $P_{u=0,s=0}(S, t)$ that a simple random walk covers a span $S$ in a time $t$.
Thus, recalling (16), we have

where in the last passage we assumed $f(t_B|S, t)$ to be Poissonian with average $\langle t_B \rangle_{S,t} = 2S$, so that, exploiting the cumulant expansion, we get

As shown in Fig. 2, this picture is in agreement with numerical results at long times (with respect to $S$), while at shorter times time-dependent corrections have to be introduced. The expression in (18) can be looked at as the survival probability for a random walker which has visited $S$ sites in the presence of a concentration $c = 1 - \exp(u')$ of traps. In the limit $u \to 0$ we get $c = 0$, while for $u \to \infty$ we get $c = 1$, as expected.
Note that in the static case the variables were $t$ and $S$, and the latter was eliminated via the cumulant expansion. In the dynamic case we have $t_B$ as an additional variable, which is eliminated analogously, so that we are left with $P_{u,s}(S, t)$, to be compared with the survival probability of a walk for which both the length $t$ and the span $S$ are specified.

The number of distinct visited sites
In this section, we estimate analytically the number of distinct sites visited, as a function of time. We recall that, restricting the analysis to the transient regime, the relative distance to the target can be neglected. First, we focus on the distribution $P_{u,s}(S, t)$, which was introduced in the previous section and provides the probability that, at time $t$, the number of distinct sites visited is $S$. We also define
• $G_{u,s}(S, t)$, the probability that at time $t$ the span is incremented from $S - 1$ to $S$, with $G(S, 0) = \delta_{S,1}$;
• $F_{u,s}(S, t)$, the probability that a walker, starting from a frontier node, widens the span from $S$ to $S + 1$ in exactly $t$ time steps;
• $\bar{F}_{u,s}(S, t)$, the probability that a walker, starting from a frontier node of a span $S$, moves inwards and remains within the internal region for (at least) the following $t$ time steps.
Thus, the following relations hold (we drop the subscripts $u, s$ to lighten the notation)

As for $G(S, t)$, it satisfies the recursive equation

The two coupled equations can be treated within a generating-function formalism, where $\tilde{h}(\lambda) \equiv \sum_t h(t) \lambda^t$ is the generating function of an arbitrary function $h(t)$. As explained in Appendix A.1, when the left border can be neglected (e.g., in the presence of a strong bias or at long times), the particle is likely not to return to the left border, so that in $F(S, t)$ we can drop the dependence on $S$, and we find

where $\tilde{F}$ is the generating function of the probability to extend the span on the right side. The final formula is obtained by plugging the last expression into (21). By inverting the transform $\tilde{P}(S, \lambda)$ we get an estimate for $P(S, t)$ which, as shown in figure (3), is in good agreement with simulations as long as the bias is rather strong or $t$ is large. Notice that, by choosing $u = 0$ and $s = 0$, we get $\tilde{F}(\lambda) = \lambda / [1 + \sqrt{1 - \lambda^2}]$, which is consistently the generating function of the first-passage probability to a neighboring site for a simple random walk [26].
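The consistency check quoted for u = 0 and s = 0 can be reproduced numerically. The sketch below computes the first-passage probabilities from site 0 to site 1 for a simple random walk by propagating the occupation probabilities with an absorbing site at 1, sums the series Σ_t f(t) λ^t, and compares it with λ/[1 + √(1 − λ²)]. The function name and the truncation time are illustrative assumptions.

```python
import math


def first_passage_probs(t_max, p_right=0.5):
    """f[t] = probability that a walk started at 0 first reaches site 1 at step t."""
    f = [0.0] * (t_max + 1)
    # occupation[d] = probability of sitting at site -d (d >= 0) without having reached 1
    occupation = {0: 1.0}
    for t in range(1, t_max + 1):
        new_occupation = {}
        for d, p in occupation.items():
            if d == 0:
                f[t] += p * p_right  # step 0 -> 1: first passage at time t
            else:
                new_occupation[d - 1] = new_occupation.get(d - 1, 0.0) + p * p_right
            new_occupation[d + 1] = new_occupation.get(d + 1, 0.0) + p * (1.0 - p_right)
        occupation = new_occupation
    return f


lam = 0.8
f = first_passage_probs(400)
series = sum(f_t * lam ** t for t, f_t in enumerate(f))
closed_form = lam / (1.0 + math.sqrt(1.0 - lam ** 2))
print(series, closed_form)
```

For λ < 1 the truncated series converges quickly, since the neglected terms are bounded by λ raised to the truncation time.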
Starting from equation (21), we can also obtain the generating function of the moments of the number of distinct sites visited, that is In particular, for the first moment we obtain .
The previous equation can be handled via Tauberian theorems (see e.g. [26]) in order to infer the asymptotic behavior of $S_t$. In fact, setting $y = 1/(1 - \lambda)$, $\tilde{S}(\lambda)$ can be restated as $\tilde{S}(\lambda) = y^2 L(y)$, where

$$L(y) = \frac{2e^{2s} - e^{u}(e^{2s} - 1) + e^{u} \sqrt{(e^{2s} - 1)^2 + 4e^{2s}(2y - 1)/y^2}}{2e^{2s} - y e^{u}(e^{2s} - 1) + y e^{u} \sqrt{(e^{2s} - 1)^2 + 4e^{2s}(2y - 1)/y^2}}$$

turns out to be a slowly varying function such that, when $y \to \infty$, $L(y) \to 1/f(s, u)$, with $f(s, u) = 1 + 2e^{u}/(e^{2s} - 1)$. Hence, we have

$$S_t \sim \frac{t}{f(s, u)}. \qquad (25)$$

This result is checked numerically in figure (5). Therefore, at long times, $S_t$ grows linearly with time, that is to say, the bias prevails over memory effects. This is consistent with [27], where the qualitative behavior of the 1D random walk is found not to be affected by memory effects. More precisely, the rate of growth of the number of distinct sites visited is just given by $1/f(s, u)$: it depends exponentially on $u$, without exhibiting any singularity. Indeed, for any $s > 0$ the velocity turns out to be positive. This estimate can be compared with the result found in [21] for the 1-dimensional static model, which exhibits a phase transition between a phase with zero drift velocity, $v = 0$ for $s < u$, and a ballistic phase with $v = \tanh(s - u)$ for $s > u$. The transition is argued to be of second order, in the sense that $v \to 0$ continuously and with a discontinuous second derivative.
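The linear growth with rate 1/f(s, u) can be checked with a direct simulation of the dynamic rule in the transient regime (target far to the right). The following is a minimal Monte Carlo sketch, not the simulation code used for figure (5); the function name, the parameter values and the seed are illustrative assumptions.

```python
import math
import random


def simulate_span(u, s, n_steps, seed=0):
    """Number of distinct sites visited after n_steps of the biased reinforced walk."""
    rng = random.Random(seed)
    pos = lo = hi = 0  # in 1D the visited sites form the interval [lo, hi]
    for _ in range(n_steps):
        k_right = 1 if pos + 1 <= hi else 0
        k_left = 1 if pos - 1 >= lo else 0
        w_right = math.exp(u * k_right + s)
        w_left = math.exp(u * k_left - s)
        if rng.random() < w_right / (w_right + w_left):
            pos += 1
        else:
            pos -= 1
        lo, hi = min(lo, pos), max(hi, pos)
    return hi - lo + 1


u, s, n_steps = 1.0, 0.5, 200_000
f_su = 1.0 + 2.0 * math.exp(u) / (math.exp(2.0 * s) - 1.0)
measured_rate = simulate_span(u, s, n_steps, seed=1) / n_steps
print(measured_rate, 1.0 / f_su)
```

A single long run is used here because, in the ballistic regime, the span is a sum of many nearly independent waiting times; averaging over several seeds would reduce the residual fluctuations further.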
As for the second moment $\tilde{S^2}(\lambda)$, we have

Again, we use Tauberian theorems to infer the asymptotic behaviour and we get

with $L_2(y)$ converging to $2 L_1(y)^2$ when $y \to \infty$. We therefore get the asymptotic behaviour $\langle S_t^2 \rangle \sim t^2 / [f(s, u)]^2$, which also suggests that, in the limit of long $t$, the distribution of $S$ is Poissonian.

The linear growth rate of S t
A better estimate of $S_t$ at small and intermediate times can be independently obtained by calculating the average time taken by the walker to increase the extent of the visited region. In this approach we still focus only on the movement of the right edge, neglecting the movement of the other border. More precisely, we compute the mean time taken to pass from a span of $S$ sites to $S + 1$ sites. Referring to figure (4), this corresponds to the mean first-passage time $t(0 \to 1)$. This quantity enables us to determine the rate of growth in the asymptotic law for $S_t$. Starting from site 0, the random walk can either jump directly to site 1 with probability $\beta$, or jump to site $-1$ with probability $1 - \beta$; a similar decomposition applies to the time $t(-1 \to 1)$ to go from site $-1$ to site 1, thus we have

$$t(0 \to 1) = \beta + (1 - \beta)\,[1 + t(-1 \to 1)], \qquad (27)$$

$$t(-1 \to 1) = \tau + t(0 \to 1), \qquad (28)$$

where $\tau \equiv t(-1 \to 0)$. The explicit expression for $\tau$ is obtained in Appendix A.2 as $\tau = (\tanh s)^{-1}$; thus, substituting, we have

$$t(0 \to 1) = 1 + \frac{2e^{u}}{e^{2s} - 1}. \qquad (29)$$

The mean waiting time can be related to the growth rate $v$ of $S_t$, namely

$$v = \frac{1}{t(0 \to 1)}. \qquad (30)$$

Since $t(0 \to 1)$ does not depend on time, we have $S_t \approx vt$, hence recovering the same result as (25), obtained through the Tauberian theorem. We can improve the estimate of $t(0 \to 1)$ appearing in (29) by taking into account also the presence of the left border, assumed to be reflecting; this assumption yields a lower bound for $t(0 \to 1)$. In this case the time taken by the random walk to expand the visited region depends on the span length. Decomposing the first passage time in the same way as in (27) and (28), the only difference being the inclusion of the left border, the equations to solve are

where $L$ is the coordinate of the left border, set at $-S$ (see Appendix A.1), x|y (z) is the splitting probability of reaching $x$ before $y$ starting from $z$, and $t(z \to x; y)$ is the conditional mean exit time, i.e. the mean first-passage time to reach $x$ from $z$ without seeing $y$ [28].
With some algebra one obtains t(0 → 1) = 1 + (e u + e 2s )(e u + e −2s )e −2s e −2(S−1) tanh s + (34) where g(l, t) = [(t 2(l−1) − 3)(t 2l − 1) + 2l(t 2 − 1)]/[t(t 2(l−1) − 1)] 2 . We note that when s ≫ 1, the previous equation correctly gives t(0 → 1) → 1, while for S ≫ 1 we recover the expression in (30). This analytical estimate has been compared with data from numerical simulations in Fig. (6): the asymptotic behavior is nicely recovered. Moreover, the expression in Eq. (34) with respect to the one in Eq. (30), provides a better estimate at small times, yet recovering the very same asymptotic behavior at large times.
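The effect of a reflecting left border on the mean time to extend the span can also be evaluated numerically, without the closed form. The sketch below assumes a perfectly reflecting wall placed a given number of sites to the left of the right border, step probabilities α inside the span and β on the right border, and solves the mean first-passage equations with a single left-to-right sweep; the function name and the parameter `wall_distance` are hypothetical. It does not reproduce Eq. (34) itself, but it shows the same two features: a lower bound at finite span and convergence to 1 + 2e^u/(e^{2s} − 1) for large spans.

```python
import math


def mean_time_to_extend(u, s, wall_distance):
    """Mean first-passage time 0 -> 1 with a reflecting wall at -wall_distance (>= 1)."""
    alpha = math.exp(s) / (2.0 * math.cosh(s))
    beta = math.exp(s - u / 2.0) / (2.0 * math.cosh(s - u / 2.0))
    # Write T(i) = a(i) + T(i + 1) for the sites left of the border.
    a = 1.0  # at the wall: T(-L) = 1 + T(-L + 1)
    for _ in range(wall_distance - 1):
        # interior site: T(i) = 1 + alpha * T(i + 1) + (1 - alpha) * T(i - 1)
        a = (1.0 + (1.0 - alpha) * a) / alpha
    # right border: T(0) = 1 + (1 - beta) * T(-1), with T(-1) = a + T(0)
    return (1.0 + (1.0 - beta) * a) / beta


u, s = 1.0, 0.5
f_su = 1.0 + 2.0 * math.exp(u) / (math.exp(2.0 * s) - 1.0)
for wall_distance in (1, 2, 5, 20, 200):
    print(wall_distance, mean_time_to_extend(u, s, wall_distance), f_su)
```

The sweep converges geometrically in the wall distance, with ratio e^{-2s}, which is why the finite-span correction matters only at small and intermediate times.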

Border times
Let us now consider the average total time t B t spent on borders and let us highlight its dependence on S t . This relation was anticipated in Sec. 3.2, as it was crucial to extend the mapping between the trapping problem and the reinforced walk to the dynamic case. Given a walk w, we define t I the number of times it stays on any strictly internal site and t L (t R ) the number of times that the process visits the left (right) border; of course, t B = t L + t R .
Focusing on long times, from the definitions above and recalling Fig. (1) and Eq. (6) we can write the following relations We preliminary note that a positive field (s > 0) implies α > β > 1 − γ (see (6)) as well as t R t ≫ t L t ; therefore, we can write S t ∼ β t R t and x t ∼ t, from which we can infer t R t ∼ t, namely the propagation is ballistic, which is consistent with the above results. On the other hand, in the absence of bias (s = 0), we get 1 − γ = β and α = 1/2, from which S t ∼ t B t and x t t ∼ t B t , consistently with [29]. Now, we focus on the occupation of border sites, namely t B , and try to highlight its connection with S t . We consider the unbiased case, where the random walk has probability 1/2 to move left/right within the span and probability φ = exp(−u/2)/(2 cosh u/2) to move outwards the span from a frontier site.
Let us denote by $\Delta S_t$ and $\Delta t_{B,t}$ the discrete differentials, given by

Since both $\Delta S_t$ and $\Delta t_{B,t}$ can take only the values 0 or 1, the conditional probability $P(\Delta S_t | \Delta t_{B,t-1})$ can be written in matrix form:

Therefore, for the related average values we obtain

from which a linear relation between $\langle S \rangle_t$ and $\langle t_B \rangle_{t-1}$ follows:

In the bias-free case we measured numerically the joint probability $P_{u,s}(S, t_B, t)$ that the walker has stayed $t_B$ times on any border and that $S$ distinct sites have been visited; results are represented in figure (7). From this distribution we calculated $\langle S \rangle_t$, confirming (37). Of course, for $u = 0$, we get $\phi = 1/2$ as expected.
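The proportionality between new sites and border visits in the unbiased case can be illustrated with a short simulation. In the sketch below every visit to a border site is followed by an outward step with probability φ = exp(−u/2)/(2 cosh(u/2)), so the number of newly visited sites divided by the number of border visits should approach φ. The function name, the convention of starting each run after the first step (span {0, 1}), and the run lengths are illustrative assumptions rather than the setup used for figure (7).

```python
import math
import random


def border_statistics(u, n_steps, n_runs, seed=0):
    """Return (new sites, border visits, phi) accumulated over unbiased reinforced walks."""
    rng = random.Random(seed)
    phi = 1.0 / (1.0 + math.exp(u))  # equals exp(-u/2) / (2 cosh(u/2))
    new_sites = 0
    border_visits = 0
    for _ in range(n_runs):
        # State after the first step, taken to the right by symmetry: span {0, 1}.
        pos, lo, hi = 1, 0, 1
        for _ in range(n_steps):
            if pos == hi:
                border_visits += 1
                step = 1 if rng.random() < phi else -1
            elif pos == lo:
                border_visits += 1
                step = -1 if rng.random() < phi else 1
            else:
                step = 1 if rng.random() < 0.5 else -1
            pos += step
            if pos > hi:
                hi = pos
                new_sites += 1
            elif pos < lo:
                lo = pos
                new_sites += 1
    return new_sites, border_visits, phi


new_sites, border_visits, phi = border_statistics(u=1.0, n_steps=2000, n_runs=400, seed=3)
print(new_sites / border_visits, phi)
```

For u = 0 the same code gives φ = 1/2, the value expected for a simple random walk.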

Target
In this last section we analyze the behaviour of the random walk once the target has been reached; this corresponds to the case of a random walk in the presence of a "central field" pointing towards the point x T . For simplicity we assume that the span around the point x T is symmetric and we calculate the rate of growth for the number of distinct sites visited as well as the average time taken by the random walk to return to the target.

Mean exit time and scaling law for S t
After having reached the target, the single step probability follows the framework depicted in figure (8). In order to calculate the mean time taken by a random walk to increase the span width, we need the mean waiting time to revisit the borders. As anticipated, we take the spanned region as symmetric with respect to the target; more precisely, to simplify the notation we take $x_T = 0$ and borders at $-L$ and $L$, respectively. This condition simplifies the framework for the single step probability (see figure (9)): we have one reflecting point in $x_T$ and a region characterized by a rightwards drift, where the probability for $x_t \to x_t + 1$ is $1 - \alpha$, while at the border $x = L$, the probability to move in the right direction is $1 - \gamma$. Following a procedure similar to the one presented in Section (4.1), we can calculate the mean exit time, i.e. the first passage time from $L$ to $L + 1$, namely $t(L \to L + 1)$. The details of the computation are shown in Appendix B.1, leading to:

From this result we can estimate the scaling law for $S_t$. In fact, $t(L \to L + 1)$ is the mean time the walker has to wait in order to increase the number of visited sites from $S = 2L$ to $2L + 1$. The time $t(S)$ corresponding to $S$ visited sites is:

where $C$ is a finite constant depending on $s$ and $u$. In Fig. (10) we show that the expression in (39) correctly estimates the leading behavior of $S_t$. In particular, in the limit $S \to \infty$, the first term in (39) provides the leading contribution; therefore, at long times we can retain only this term and invert the relation to get

with $B(s, u) = \exp(3s + u - 2\tanh s)/(2 \sinh s \tanh s)$. In Fig. (10) we show the comparison with simulations. The leading behaviour of (40) is in good agreement with numerical data, the difference being due to our approximations (the symmetric span, the continuum limit in (39)). Notice that the long time behaviour is dominated by the bias, encoded by the factor $(\tanh s)^{-1}$, while memory effects yield small corrections encoded by the term $B(s, u)$.

Figure 8. The single step probabilities in the region around the target. In this scheme the RW is supposed to have visited a symmetric region around the target.
Figure 9. The framework of figure (8) is mapped into a semi-infinite structure where, exploiting the symmetry, the target is placed at the origin and looked at as a reflective barrier.

Mean return time to the target
Let us now consider the mean time to return to the target, denoted as $\tau_T$. Referring to figure (9), this is given by $\tau_T = 1 + t(1 \to 0)$, namely the walker takes one step to move from $x_T \equiv 0$ to 1 and a time $t(1 \to 0)$ to go from 1 back to 0. Assuming a span $2L$, so that the distance between the target and the border is $L$, we can calculate $t(1 \to 0)$ by fixing the distance of the border from the target at $L$, hence imposing a reflecting barrier at $L + 1$; this corresponds to a situation where $u \gg 1$ and/or $L \gg 1$. From the calculations reported in Appendix B.2:

where $t_{exit}$ is the exit time from an interval $L$, starting from $L - 1$ and in the presence of a bias towards 0. Therefore, the mean time taken by the random walk to return to the target site is

In the limit $L \to \infty$, $\tau_T$ reaches the asymptotic value

Notice that $\tau_T$ can be related to the stationary probability for a biased random walker to be at the target site [30]. In fact, according to Kac's formula for irreducible graphs, there exists a unique stationary probability $\pi$, and the mean number of steps needed to return to any point $i$ is $1/\pi_i$.
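Kac's formula can be illustrated on a simplified version of the setting above. The sketch assumes a nearest-neighbour chain on the sites 0, ..., n with reflecting walls at both ends and a constant bias towards the origin inside; the memory-dependent step probability at the border of the span is deliberately left out, and all function names are hypothetical. The stationary distribution is obtained from detailed balance and the mean return time to the origin from a sweep of the mean first-passage equations; their product should equal one.

```python
import math


def stationary_distribution(p_right):
    """Stationary law of a birth-death chain on 0..n, from detailed balance."""
    n = len(p_right) - 1
    weights = [1.0]
    for i in range(n):
        # pi[i] * p_right[i] = pi[i + 1] * (1 - p_right[i + 1])
        weights.append(weights[-1] * p_right[i] / (1.0 - p_right[i + 1]))
    total = sum(weights)
    return [w / total for w in weights]


def mean_return_time_to_origin(p_right):
    """1 + t(1 -> 0), with t(i -> 0) = a(i) + t(i - 1 -> 0) swept from the far wall."""
    n = len(p_right) - 1
    a = 1.0  # reflecting wall at n: t(n -> 0) = 1 + t(n - 1 -> 0)
    for i in range(n - 1, 0, -1):
        a = (1.0 + p_right[i] * a) / (1.0 - p_right[i])
    return 1.0 + a


s, n = 0.4, 10
alpha = math.exp(s) / (2.0 * math.cosh(s))  # probability of stepping towards the origin
# Reflecting origin (always steps right), biased interior, reflecting wall at n.
p_right = [1.0] + [1.0 - alpha] * (n - 1) + [0.0]
pi = stationary_distribution(p_right)
tau = mean_return_time_to_origin(p_right)
print(tau, 1.0 / pi[0])
```

The chain is periodic, but it is irreducible, which is all that the return-time identity requires.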

Conclusions
In this paper we studied analytically the 1D one-step reinforced random walk: at each time step the probability for the random walk to jump to a given site is ∼ exp(ku), where k = 1 for the sites visited at least once and k = 0 for the others; a positive value for the parameter u, ruling memory effects, ensures that the walk is self-attractive. Since memory effects are accounted for at each time step, this model is often referred to as "dynamic" or "true". Such models find applications in the investigation of systems where diffusing particles are able to change the environment as, e.g., in the evolution of a surface of growing aggregates [31], in spatial exploration with learning [2] and angiogenesis [4]. In our analysis we also considered the presence of a field s biasing the particle towards a given direction/site.
First, we highlighted some basic differences between the dynamic and the static model. In the latter case memory effects are accounted globally: the probability of realizing a given path w of length t scales exponentially with the number of distinct sites visited S and with its end-to-end distance x. The dynamic model is by far more difficult to treat, due to the fact that memory effects are accounted locally, hence yielding a complex non-Markovian dynamics. Indeed, we found an analytic expression for the probability P u,s (w) to realize a particular path w, for a given choice of parameters u and s, where we evidenced a dependence not only on S and x, but also on the number of times t B that the walker has sojourned on the border sites of the (instantaneous) span. By averaging P u,s (w) over all paths exhibiting the same span we found an expression which can be compared to the survival probability of a standard random walk with a given span, in the presence of a uniform concentration c of traps, hence extending the mapping already proved and far exploited in the static model. Interestingly, here c depends exponentially on u as well as on the walker "velocity".
We also obtained an expression, in terms of generating functions, for the probability $P_{u,s}(S, t)$ of visiting $S$ distinct sites in a time $t$, and found analytically that $S_t \sim \sqrt{t}$ in the absence of bias (recovering previous results [29]), while the growth becomes ballistic, $S_t \sim t$, as soon as $s > 0$. Thus, memory effects induce only second-order corrections in this case.
Finally, we studied the joint probability P u,s (S, t B , t), which for any given (large) S is peaked at t B = S/φ, where φ is the probability to move outwards starting from an edge of the spanned region.
All analytical results have been confirmed, at least in their leading terms, by numerical simulations.
whereF (S, t) is the probability that a random walker, starting from a frontier site, does not broaden the span S after t steps. Now, G(S, t) satisfies the recursive equation where F (S, t) is the probability that a random walker, starting from a frontier site, increases the span width from S to S + 1 in t steps. In terms of the generating functions, equation (A.1) becomes:G The previous finite difference equation has solutioñ where we have considered that G(S = 1, T ) = δ t,0 , that is,G(S = 1, λ) = 1.
Due to the bias, the probability distribution for the position of the random walk will be peaked in the region close to the right edge and it will move away from the left edge, increasing the mutual distance. Hence, we can assume that the random walk never returns to the left edge, namely that this edge is fixed, and consequently focus on the right side of the visited region. More precisely, we can fix the right border at $x = 0$, so that the visited region is the half-line of negative $x$; the single step probability towards the right in this region is $\alpha = \exp(s)/[2\cosh(s)]$, while on the edge the probability becomes $\beta = \exp(s - u/2)/[2\cosh(s - u/2)]$ in order to account for memory effects, see figure (4).
Within this framework the probabilityF (L, k) is independent of L, and can be restated as F (0 → 1, k), namely the first-passage probability from x = 0 to x = 1 in k steps.
For the probability F (0 → 1, t) the following equations hold: In the former equation the first passage probability to reach 1 from 0 is splitted in two terms according to whether the random walk moves directly to 1 or to -1, while in the latter equation the first-passage probability from -1 to 1, is decomposed in two processes: the random walk first reaches 0, then from 0 it reaches 1. The related generating functions arẽ By plugging equation (A.4) into (A.5) we obtaiñ The advantage of this formulation is that now we have to findF (−1 → 0, λ), which involves a simple random walk in the presence of bias; this can be found via standard techniques (see e.g. [26]) as With this result we finally get In the main textF (0 → 1, λ) is denoted asF (λ) to lighten notation. Now, we can obtain an expression for the probability distribution of the number of distinct sites visited P (S, t). The probabilityG(S, λ) is simply F (−1 → 0, t) The probabilityF (0 → 1, λ) can be obtained from the first-passage probabilitỹ which plugged into (A.7) yields: The final formula is obtained using (A.6) in (A.8).

Appendix B. Target
Appendix B.1. Computation of the mean passage time from L to L + 1 Following a procedure similar to the one presented in Appendix A.2, we calculate the mean exit time, i.e the first passage time from L to L + 1, denoted as t(L → L + 1).
Referring to figure (7), we can write where x|y (z) means the splitting probability of reaching x before y starting from z and t(z → x; y) is the conditional mean exit time i.e. the mean first-passage time to reach x from z without seeing y [28]. After some algebra (B. What we need in the last expression are the conditional exit times, i.e. t(1 → L − 1; 0) and t(1 → 0; L − 1), but these involve only a simple random walk with bias and are known from the literature [32]. After some algebra we obtain t(L − 1 → L + 1) = exp u/2 − 2s − 2(L − 1) tanh s cosh(s + u/2) sinh s .
The last expression has to be substituted in (B.1) to obtain the mean exit time from a support of 2L distinct visited sites t(L + 1 → L + 2) = 1 + 2e u 1 − e −2s e 2(s+L tanh s) − 1 . (B.7) Here, we derive the mean time to return to the target: looking at figure (9), this requires one step (from site 0 to site 1) together with the mean number of steps to return in 0 starting from 1. Hence, we distinguish between the paths which reach the border L before returning to the target and those that do not. Accordingly, we have (1) t(L → 0) = t exit (1) + L|0 (1) 0|L (L − 1) t exit (L − 1) + 2 γ + 1 .