Reinforced walks in two and three dimensions

In probability theory, reinforced walks are random walks on a lattice (or more generally a graph) that preferentially revisit neighboring "locations" (sites or bonds) that have been visited before. In this paper, we consider walks with one-step reinforcement, where one preferentially revisits locations irrespective of the number of visits. Previous numerical simulations [A. Ordemann et al., Phys. Rev. E 64, 046117 (2001)] suggested that the site model on the lattice shows a phase transition at finite reinforcement between a random-walk like and a collapsed phase, in both 2 and 3 dimensions. The very different mathematical structure of bond and site models might also suggest different phenomenology (critical properties, etc.). We use high statistics simulations and heuristic arguments to suggest that site and bond reinforcement are in the same universality class, and that the purported phase transition in 2 dimensions actually occurs at zero coupling constant. We also show that a quasi-static approximation predicts the large time scaling of the end-to-end distance in the collapsed phase of both site and bond reinforcement models, in excellent agreement with simulation results.


A. Background: Physics
Random walks with memory have a large number of applications in physics and other sciences. Many variants have thus been studied in different contexts. The best known example is presumably the self-avoiding walk [1], which models the large scale behavior of flexible chain polymers in good solvents. As pointed out by Amit et al. [2] the name 'self-avoiding walk' is something of a misnomer, since this model describes either static self-avoiding chains or self-killing walks. When the self-avoiding walker tries to revisit a site it has visited before, it is not gently turned towards another neighboring site; it is killed. In a more general version of this model, walkers carry an initial weight of unity, which decreases by a fixed factor whenever a site is revisited (the Domb-Joyce model [3]). If a site $i$ had been visited $n$ times before, the weight is diminished at the $(n+1)$-th visit by a factor $e^{nu}$ with $u < 0$.
When the sign of the interaction is changed to $u > 0$, so that the weight is multiplied by a factor $e^{nu} > 1$ at each revisit, the resulting self-attracting walk degenerates in any finite dimension; for large times, the walker just oscillates between two sites. This extreme behavior is avoided if the weight change is independent of $n$ and one distinguishes only between sites which have and have not been visited before. This model is related to the Donsker-Varadhan [4] "Wiener sausage" problem [5] and leads to a power law scaling $R_t \sim t^{1/(d+2)}$ for the end-to-end distance after $t$ time steps in $d$ dimensions of space.
In contrast to these "static" models, where instances are weighted and the weights are modified by interactions, one can define "dynamic" models where the walks are biased by the interaction. The oldest such model is the true self-avoiding walk (TSAW) of Amit et al. [2]. Assume that at time $t$ the walker is at site $i$, and that the number of previous visits to each of the $N$ neighbors is $n_j$, $j = 1, \ldots, N$. Then the probability to step to neighbor $j$ at the next time is
$$p_j = \frac{e^{u n_j}}{\sum_{k=1}^{N} e^{u n_k}}, \qquad u < 0. \tag{1}$$
This is a much milder modification than the original self-avoiding walk. Accordingly, the r.m.s. end-to-end distance scales as $R_t^2 \sim t$ for $d > 2$, while there are logarithmic corrections at the 'upper critical dimension' $d = d_c = 2$. In contrast, the upper critical dimension for the self-avoiding walk is $d_c = 4$, and $R_t^2 \sim t^{2\nu_d}$ for $d < d_c$ with $\nu_d > 1/2$ [1].

B. True self-attracting walks
When the sign of $u$ is switched to positive, the resulting "true self-attracting walks" (TSATWs) are also closer to random walks than the ordinary self-attracting walks. It seems that the behavior of the TSATW with $p_j$ given by Eq. (1) but with $u > 0$ is unknown. On the other hand, there are several numerical studies of the TSATW with one-step reinforcement [6,7,8,9,10,11],
$$p_j = \frac{e^{u \kappa_j}}{\sum_{k=1}^{N} e^{u \kappa_k}}, \tag{2}$$
where $\kappa_j = 0$ if the site $j$ has never been visited before, and $\kappa_j = 1$ otherwise. By far the most extensive studies were those of [10,11], which claimed that one-step reinforcement TSATWs on the lattice showed a nontrivial phase transition in both $d = 2$ and $d = 3$, with $u_c(d = 2) = 0.88 \pm 0.05$ and $u_c(d = 3) = 1.92 \pm 0.03$. In both cases, the behavior of $R_t^2$ is supposed to change from $R_t^2 \sim t$ at $u < u_c$ to $R_t^2 \sim t^{2/(d+1)}$ at $u > u_c$. Hence the phase transition is between a random-walk-like and a collapsed phase. At the critical point, $R_t$ scales with a new exponent $\nu_c$ which is $0.40 \pm 0.01$ in $d = 2$ and $0.303 \pm 0.005$ in $d = 3$ [11]. These phase transitions are also seen in the average number $S_t$ of sites visited up to time $t$. This scales as $S_t \sim t$ for $u < u_c$ (with a logarithmic correction for $d = 2$), but as $t^k$ with $k = d/(d+1)$ for $u > u_c$. The latter was derived from a quasi-static approximation [12] in Ref. [6]. The quasi-static approximation seems to be satisfied to high precision (see below). At $u = u_c$, Ordemann et al. found $k_c = 0.80 \pm 0.01$ ($d = 2$) resp. $0.91 \pm 0.01$ ($d = 3$) [11].
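The one-step reinforcement rule can be illustrated with a minimal sketch: each neighbor is weighted by $e^{u}$ if it has been visited before and by 1 otherwise, and the weights are normalized to give the step probabilities. The function names, the use of a Python set on an unbounded 2-d square lattice, and the seed handling are illustrative assumptions, not the implementation used for the simulations reported here.

```python
import math
import random


def step_probabilities(visited_flags, u):
    """p_j = exp(u * kappa_j) / sum_k exp(u * kappa_k), kappa = 1 if visited."""
    weights = [math.exp(u) if seen else 1.0 for seen in visited_flags]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_site_tsatw(t_max, u, seed=0):
    """Run one 2-d site TSATW of t_max steps; return (R_t^2, S_t)."""
    rng = random.Random(seed)
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    x, y = 0, 0
    visited = {(0, 0)}  # assumption: a set stands in for the lattice of spins
    for _ in range(t_max):
        neighbors = [(x + dx, y + dy) for dx, dy in moves]
        probs = step_probabilities([n in visited for n in neighbors], u)
        x, y = rng.choices(neighbors, weights=probs, k=1)[0]
        visited.add((x, y))
    # Squared end-to-end distance and number of distinct visited sites
    return x * x + y * y, len(visited)
```

Averaging the two returned values over many seeds gives estimates of $R_t^2$ and $S_t$; a set is convenient for short walks but far too memory-hungry for walks of the lengths discussed in this paper.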

C. Background: Mathematics
In a parallel and largely independent development, these and similar random walks with memory have been extensively studied in the probability theory literature. For a recent survey, see [13]. The rigorous mathematical study of reinforced walks displays much more breadth than the rather limited study of one-step site reinforcement in the statistical physics literature. In contrast to the physics literature, which focuses on the site model, bond or "edge" reinforced random walks (ERRW) have been studied in great detail and with multiple reinforcement as in Eq.(1) with positive u. Such walks (most clearly on trees) are closely related to Pólya urn processes and similar problems with reinforcement that can be solved exactly (note that walks with bond reinforcement are often called 'trails' in the physics literature [15]). For the models with multiple site reinforcement discussed above (called vertex-reinforced random walks or VRRW), the related urn process is Friedman-like and less tractable [13,14], see endnote [28].
The mathematicians have discovered profound differences between these two models. For example, ERRW is recurrent on finite graphs [16,17], meaning that every edge is traversed infinitely often, while VRRW is not, becoming trapped e.g. on a line of five vertices [13,14,18] or more generally on "trapping subgraphs" [13,14,19,20]. Many properties of the ERRW remain unknown; for example, the recurrence of ERRW on the infinite 2-d lattice is an open question. Even the one-step ERRW model (called once-reinforced in the mathematics literature) has only been successfully studied on a few special graphs, e.g. the infinite regular or Galton-Watson tree (where it is transient [13,21,22]) or the infinite ladder (where it is recurrent [13,23]). The recurrence of one-step ERRW on the infinite 2-d lattice remains essentially open, although Sellke showed the separate recurrence of each coordinate [13,24]. Pemantle [13,14] provides a more complete description of these and other results.
The difference in mathematical tractability and underlying structure might suggest that models with bond and site reinforcement show different phenomenology. But this result would be unexpected from considerations of universality in statistical physics.

D. Overview of Results
In the present paper, we clarify some of these issues by means of high precision simulations. Our main results are:
• Bond and site reinforced TSATWs with one-step reinforcement show the same critical behavior and are likely in the same universality class;
• There is no finite-$u$ phase transition in the 2-dimensional TSATW model with one-step reinforcement. Walks are in the collapsed phase for all $u > 0$ and the phase transition happens at $u_c = 0$;
• The critical point and the critical exponents for TSATWs with one-step reinforcement in $d = 3$ are markedly different from the values obtained in [10,11]; and
• The quasistatic approximation for the end-to-end distance seems to become exact as $t \to \infty$ for the collapsed phase.

A. Methods
Simulations of TSATWs with one-step reinforcement are straightforward. To keep track of previous visits, one has to store a one-bit "spin" variable $s_i$ for each site (bond) $i$, and clear all spins after each walk. For convenience we sometimes used one byte per spin, which has the added advantage that clearing is needed only after every 255th walk. This requires $L^d/8$ resp. $L^d$ bytes of memory for site TSATWs and $dL^d/8$ resp. $dL^d$ bytes for bond TSATWs. Memory limitations were more severe than CPU time, so in the following we show more detailed results for site TSATWs than for bond TSATWs.
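The byte-per-spin bookkeeping can be sketched as follows. One plausible way to realize "clearing only after every 255th walk" is to store the label of the current walk (1 to 255) in each visited spin, so that spins left over from earlier walks are automatically stale. The class and function names are hypothetical and the labeling scheme is an assumption on our part; the text above states only the memory cost and the clearing interval.

```python
def memory_bytes(L, d, bonds=False, bits_per_spin=1):
    """Bytes needed for one spin per site (or per bond) on an L^d lattice."""
    n_spins = (d if bonds else 1) * L ** d
    return n_spins * bits_per_spin // 8


class VisitedMap:
    """One byte per spin; a spin is 'visited' if it holds the current walk label."""

    def __init__(self, n_spins):
        self.spins = bytearray(n_spins)  # zero means "never visited"
        self.label = 0
        self.clears = 0  # counts full wipes, for illustration only

    def start_walk(self):
        if self.label == 255:  # labels exhausted: wipe the whole array once
            self.spins[:] = bytes(len(self.spins))
            self.label = 0
            self.clears += 1
        self.label += 1

    def visit(self, i):
        self.spins[i] = self.label

    def is_visited(self, i):
        return self.spins[i] == self.label
```

For example, `memory_bytes(2**16, 2)` gives the one-bit cost of a 2-d lattice with $2^{32}$ sites, and passing `bits_per_spin=8` gives the one-byte cost.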
The most serious potential source of systematic errors arises from lattices that are too small. If open boundary conditions (b.c.) are used, the walk cannot go beyond the boundary, and both $R_t$ and $S_t$ are underestimated. If periodic b.c. are used and the walk wraps around the lattice, it finds visited terrain in front of it and $R_t$ is overestimated, while $S_t$ is still underestimated. We used lattices with helical boundary conditions and with up to $N = 2^{32}$ sites ($d = 2$) resp. $2^{34}$ sites ($d = 3$), see endnote [29]. For each walk, the spans $x_{\max} - x_{\min}$ in all $d$ directions were measured, and it was checked that the fraction of walks where any span was $\geq L$ did not exceed $10^{-4}$. This restricted the number of steps per walk to $t_{\max} \leq 10^8$ for $d = 2$, and to $t_{\max} \approx 4 \times 10^7$ for $d = 3$. The total number of walks for each parameter setting was typically $\approx 2 \times 10^4$ to $\approx 2 \times 10^5$.
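As an illustration of these two ingredients, the sketch below uses the standard single-index form of helical boundary conditions, in which site $i$ on a lattice of $N = L^d$ sites has neighbors $i \pm L^k \bmod N$ for $k = 0, \ldots, d-1$, together with a span check on the unwrapped coordinates. The function names are hypothetical, and storing the full path is an assumption made for clarity; a production code would update the minima and maxima on the fly.

```python
def helical_neighbors(i, L, d):
    """Neighbors of site i on an L^d lattice with helical b.c.: i +/- L^k mod N."""
    n_sites = L ** d
    return [(i + sign * L ** k) % n_sites for k in range(d) for sign in (1, -1)]


def max_span(path):
    """Largest extent x_max - x_min over all directions of an unwrapped path."""
    d = len(path[0])
    return max(max(p[k] for p in path) - min(p[k] for p in path)
               for k in range(d))


def walk_fits(path, L):
    """True if no span reaches L, so the walk cannot have wrapped onto itself."""
    return max_span(path) < L
```

A walk for which `walk_fits` returns `False` is one of those counted in the fraction that is required to stay below $10^{-4}$.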

B. Variance reduction
For small $u$, where walk-to-walk variation is significant, substantially increased accuracy is obtained by the following variance reduction procedure. Assume that the walker has already made $t$ steps and is presently at a site with cartesian coordinates $\mathbf{x}_t$. Given $\mathbf{x}_t$ and the states of the neighboring sites (i.e., visited or unvisited), one can calculate the expected increment $\langle \Delta\mathbf{x}_{t+1} \rangle$ for the next step, since one knows the probability for the walker to step in each direction. From this, one obtains an estimate for the increment of $R^2$,
$$\langle \Delta R^2_{t+1} \rangle = 2\, \mathbf{x}_t \cdot \langle \Delta\mathbf{x}_{t+1} \rangle + 1, \tag{5}$$
where we have used the fact that $\Delta\mathbf{x}_{t+1} \cdot \Delta\mathbf{x}_{t+1} = 1$. The improved estimate is obtained by summing these increments,
$$\tilde{R}^2_t = \sum_{t'=0}^{t-1} \left( 1 + 2\, \mathbf{x}_{t'} \cdot \langle \Delta\mathbf{x}_{t'+1} \rangle \right). \tag{6}$$
Further improvement is obtained by taking the optimal linear combination of the direct sample average and this estimator,
$$[R^2_t]_{\rm opt} = \alpha_t R^2_t + (1 - \alpha_t)\, \tilde{R}^2_t, \tag{7}$$
with $\alpha_t$ fixed for each $t$ such that the variance of $[R^2_t]_{\rm opt}$ is minimal. Setting the derivative of this variance with respect to $\alpha_t$ to zero requires estimates of the variances of $R^2_t$ and $\tilde{R}^2_t$ as well as of their covariance. In Fig. 1 we show the errors (single standard deviations, divided by $t$) of the three estimators for 2-d site TSATWs with $u = 0.34$.
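A minimal sketch of this procedure for the 2-d site model follows. At each step it accumulates $1 + 2\,\mathbf{x}_t \cdot \langle \Delta\mathbf{x}_{t+1} \rangle$ alongside the ordinary walk, and the mixing weight is the standard minimum-variance solution $\alpha = (V_b - C)/(V_a + V_b - 2C)$, where $V_a$ and $V_b$ are the variances of the direct and improved estimators and $C$ is their covariance. The function names and the set-based lattice are illustrative assumptions, not the code used for Fig. 1.

```python
import math
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def walk_estimators(t_max, u, rng):
    """Return (direct R_t^2, improved estimate) for one 2-d site TSATW."""
    x, y = 0, 0
    visited = {(0, 0)}
    improved = 0.0
    for _ in range(t_max):
        weights = [math.exp(u) if (x + dx, y + dy) in visited else 1.0
                   for dx, dy in MOVES]
        total = sum(weights)
        # Expected displacement of the next step, given the local environment
        ex = sum(w * dx for w, (dx, _) in zip(weights, MOVES)) / total
        ey = sum(w * dy for w, (_, dy) in zip(weights, MOVES)) / total
        improved += 1.0 + 2.0 * (x * ex + y * ey)
        dx, dy = rng.choices(MOVES, weights=weights, k=1)[0]
        x, y = x + dx, y + dy
        visited.add((x, y))
    return float(x * x + y * y), improved


def optimal_alpha(direct, improved):
    """Weight alpha minimizing Var[alpha * direct + (1 - alpha) * improved]."""
    n = len(direct)
    ma, mb = sum(direct) / n, sum(improved) / n
    va = sum((a - ma) ** 2 for a in direct) / (n - 1)
    vb = sum((b - mb) ** 2 for b in improved) / (n - 1)
    cov = sum((a - ma) * (b - mb) for a, b in zip(direct, improved)) / (n - 1)
    denom = va + vb - 2.0 * cov
    return 0.5 if denom == 0.0 else (vb - cov) / denom
```

For $u = 0$ the expected displacement vanishes at every step, so the improved estimator returns exactly $t$ with zero variance; this is the limit in which the method is most effective.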

C. Site TSATWs in d = 3
The mean squared end-to-end distance divided by the number of steps, $t^{-1} R_t^2$, is shown in Fig. 2 for site TSATWs in $d = 3$. In this and in all subsequent figures, curves are not labelled by $u$ but by $w = \exp(u)$. We see clearly that there are significant corrections to scaling (all curves bend upward at small $t$), but they are no worse than in other nonequilibrium critical phenomena. A more careful analysis, taking these corrections into account, gives $u_c = 1.831 \pm 0.002$ ($w_c = \exp(u_c) = 6.24 \pm 0.01$) and $\nu_c = 0.378 \pm 0.004$. In particular, we can rule out the possibility that $u_c > 1.85$ from the simple fact that all curves for $u > 1.85$ (i.e. for $w > 6.35$) are clearly S-shaped and curve down at large $t$. These estimates are incompatible with those of [11], $u_c = 1.92 \pm 0.03$ and $\nu_c = 0.303 \pm 0.005$. Possible explanations for these earlier results are that corrections to scaling were neglected in [11] or that the lattices used were too small.
The cross-over behavior near $u \approx u_c$ can be fitted to the usual ansatz
$$R_t^2 = t^{2\nu_c}\, F\big((u - u_c)\, t^{\phi}\big) \tag{8}$$
with $\phi = 0.185 \pm 0.020$, as seen from the data collapse shown in Fig. 3. The deviations from a perfect collapse seen in this figure are due to the corrections to scaling at small $t$ seen in Fig. 2, which are not included in the scaling ansatz Eq. (8). The apparent collapse could have been improved by the widespread practice of plotting the sub- and supercritical branches separately, without demanding that they join smoothly (the function $F(z)$ must be analytic at $z = 0$). But the results obtained in this way would be spurious.
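A data collapse of this kind amounts to a change of variables: each measured point $(t, R_t^2)$ at coupling $u$ is mapped to the scaling variable $(u - u_c)\,t^{\phi}$ on the horizontal axis and to $R_t^2/t^{2\nu_c}$ on the vertical axis. The helper below is a hypothetical illustration of that mapping only; it does not optimize $u_c$, $\nu_c$, or $\phi$.

```python
def collapse_point(t, r2, u, u_c, nu_c, phi):
    """Map one data point (t, R_t^2) at coupling u to collapse coordinates."""
    z = (u - u_c) * t ** phi       # scaling variable, zero at criticality
    y = r2 / t ** (2.0 * nu_c)     # rescaled squared end-to-end distance
    return z, y
```

If the ansatz holds, the points from all values of $u$ fall on a single curve $y = F(z)$, with sub- and supercritical data joining smoothly at $z = 0$.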
Results for the average number of visited sites, $S_t$, again divided by $t$, are shown in Fig. 4. This time the corrections to scaling are much bigger. This is not unexpected, since there are also large corrections to the asymptotic law $S_t \sim t$ for ordinary 3-d random walks. These corrections make an independent estimate of $u_c$ impossible, whence we shall use the estimate obtained from $R_t$, i.e. $u_c = 1.831 \pm 0.002$. The corrections to scaling also make the estimation of the exponent $k_c$ very uncertain, in spite of the extremely small statistical errors (much smaller than the thickness of the curves). Our best estimate is $k_c = 0.977 \pm 0.010$. This is again incompatible with the estimate $0.91 \pm 0.01$ of [11]. The leading correction to scaling exponent, defined by $S_t = t^{k_c}[a + b/t^{\Delta} + o(t^{-\Delta})]$, is found to be $\Delta = 0.22 \pm 0.03$. This is to be compared to $\Delta = 1/2$ for ordinary 3-d walks [25].
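With the exponents held fixed, the correction-to-scaling form $S_t = t^{k_c}[a + b/t^{\Delta}]$ is linear in the amplitudes $a$ and $b$, so they follow from an ordinary least-squares line through $S_t/t^{k_c}$ plotted against $t^{-\Delta}$. The sketch below shows that step under the assumption that $k_c$ and $\Delta$ are supplied by the user; it is a generic illustration, not the fitting procedure used to obtain the estimates quoted above.

```python
def fit_correction(ts, s_values, k_c, delta):
    """Least-squares a, b in S_t / t^k_c = a + b * t^(-delta), exponents fixed."""
    xs = [t ** (-delta) for t in ts]
    ys = [s / t ** k_c for t, s in zip(ts, s_values)]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope: amplitude of the leading correction
    a = my - b * mx        # intercept: asymptotic amplitude
    return a, b
```

Scanning over trial values of $\Delta$ and checking how straight the resulting line is gives a rough handle on the correction exponent itself.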
For the supercritical case, $u > u_c$, the following argument was given in [6]: Let us assume that the visited sites form, for large $t$, a compact $d$-dimensional domain $V_t$ whose volume increases as $S_t \equiv |V_t| \propto R_t^d$ with $R_t \sim t^{\nu}$. Its surface is fuzzy but not fractal, i.e. it increases as $|\partial V_t| \propto R_t^{d-1}$. If the walker is uniformly distributed inside $V_t$, then the chance for it to be at the boundary is $|\partial V_t|/S_t \propto 1/R_t$. This is then also proportional to the chance that the walker will make the next step outside $V_t$, i.e. $dS_t/dt \sim t^{-\nu}$. Integrating this gives $\nu = 1/(d + 1)$ [6]. The main assumption here is not that $\partial V_t$ is nonfractal (as stated in [6,11]), but that the walker is uniformly distributed inside $V_t$. This would be exact if the boundary did not grow at all (i.e. in the limit $u \to \infty$), but for finite $u$ it corresponds to a quasistatic approximation in the sense of [12].
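Written out step by step, the integration in this argument reads as follows. This is only a restatement of the reasoning above, with the resulting exponents evaluated for $d = 2$ and $d = 3$.

```latex
\begin{align}
\frac{dS_t}{dt} &\propto \frac{|\partial V_t|}{S_t}
   \propto \frac{R_t^{\,d-1}}{R_t^{\,d}} = \frac{1}{R_t} \sim t^{-\nu}
   \quad\Longrightarrow\quad S_t \sim t^{1-\nu}, \\
S_t &\propto R_t^{\,d} \sim t^{d\nu}
   \quad\Longrightarrow\quad 1 - \nu = d\nu
   \quad\Longrightarrow\quad \nu = \frac{1}{d+1}, \qquad
   k = d\nu = \frac{d}{d+1}, \\
d = 2:&\quad \nu = \tfrac{1}{3},\ k = \tfrac{2}{3}; \qquad
d = 3:\quad \nu = \tfrac{1}{4},\ k = \tfrac{3}{4}.
\end{align}
```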
In order to test this quasistatic approximation of the supercritical behavior, we plot in Fig. 5 the ratio $R_t^2/\sqrt{t}$ for several values of $u > u_c$. We see very large corrections to scaling (the corrections to $S_t$ would be even larger), but the curves do seem to become horizontal for $t \to \infty$. For $u \geq 2.5$, our best estimate is $R_t \sim t^{\nu}$ with $\nu = 0.25 \pm 0.01$, in perfect agreement with the prediction of the quasistatic approximation.
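One common way to quantify whether such curves have become horizontal is to compute a local (effective) exponent from the logarithmic slope of $R_t^2$ between two times, $\nu_{\rm eff} = \tfrac{1}{2} \ln(R_{t_2}^2/R_{t_1}^2) / \ln(t_2/t_1)$. The helper below is a generic illustration of that diagnostic with a hypothetical name; the text does not state that the quoted estimate was obtained this way.

```python
import math


def effective_exponent(t1, r2_1, t2, r2_2):
    """Local exponent nu from R_t^2 ~ t^(2 nu) between times t1 < t2."""
    return 0.5 * math.log(r2_2 / r2_1) / math.log(t2 / t1)
```

A drift of this quantity towards $1/(d+1)$ with increasing $t$ is what the quasistatic approximation predicts in the collapsed phase.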

D. Site TSATWs in d = 2
In two dimensions the situation seems at first glance similar, except for the fact that corrections to scaling are even larger. The latter is not surprising: random walks are recurrent in $d = 2$, while they are not in any $d > 2$. The number of visited sites increases not as $t$ in $d = 2$, but as $S_t = \pi t/\ln(8t)\,[1 + O(1/\ln t)]$ [25]. Related to this is the fact that true self-avoiding walks have upper critical dimension $d = 2$, leading to logarithmic corrections in most observables for $d = 2$. As a consequence, one should also expect logarithmic corrections for TSATWs.
Results for the end-to-end distance are shown in Fig. 6. Again we show a log-log plot of $R_t^2/t$, for easy comparison with Fig. 2. The main difference between these two plots is that the curves fan out in Fig. 6 already for very small $t$, while they fan out only at much later times in Fig. 2. While the curves for $u < u_c$ in Fig. 2 first seem to follow the scaling $R_t \sim t^{\nu_c}$ and cross over to $R_t \sim t^{1/2}$ only at large $t$, no such cross-over is seen in Fig. 6. Careful inspection shows that all curves for $u > 0.58$ (i.e. $e^u > 1.79$) bend down at large $t$, indicating that $u_c \leq 0.58$ and that the estimate $u_c = 0.88 \pm 0.05$ of [11] is untenable. If we want to see a critical point with an associated non-trivial power law in these data, then a possible candidate is $u_c \approx 0.54$ and $\nu_c \approx 0.47$.
An attempted data collapse for the data of Fig. 6, again using Eq. (8) and optimized values $u_c = 0.548$, $\nu_c = 0.475$, and $\phi = 0.085$, is shown in Fig. 7. We might mention that the exponents proposed in [11], $\nu_c = 0.40 \pm 0.01$ and $\phi \approx 0.2$, seem to be ruled out. A data collapse using these exponents is shown in panel (b) of Fig. 7. Although it has an acceptable overall dispersion, this is achieved mainly by fitting the small-$t$ data well, while grossly misrepresenting the data for large $t$.
Although the collapse seen in Fig. 7(a) is satisfactory, the smallness of $\phi$ and the closeness of $\nu_c$ to the random walk value $\nu = 1/2$ suggest a very different interpretation. We propose that there is in fact no phase transition at any $u_c > 0$. Instead, the TSATW is collapsed for any $u > 0$, i.e. $u_c = 0$. This is also consistent with the fact that 2-d random walks are recurrent, i.e. the interaction should be a relevant perturbation for any $u > 0$. It is difficult to obtain direct numerical evidence for this scenario, due to the very slow cross-over from the random walk behavior to the collapsed behavior, and due to the presence of strong corrections. In order to make any progress, we have to understand these corrections better.
In order to analyze the behaviour for very small $u$ more closely, let us define the quantity
$$\Psi_t(u) = -\frac{1}{u} \ln \frac{R_t^2}{t}. \tag{9}$$
It is obviously well defined for $u \neq 0$, but it can be defined also for $u = 0$ using l'Hôpital's rule,
$$\Psi_t(0) = -\frac{1}{t} \left. \frac{\partial R_t^2}{\partial u} \right|_{u=0}.$$
We used here the fact that $R_t^2 = t$ exactly for $u = 0$. Numerically, $\Psi_t(0)$ can be estimated by a slight generalization of the reduced variance method discussed in subsection II B. We simulate just ordinary random walks, but keep track of the visited sites and calculate $\partial R_t^2/\partial u$ using Eqs. (2), (5), and (6).
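The sketch below evaluates this quantity under the assumption that it has the form $\Psi_t(u) = -u^{-1} \ln(R_t^2/t)$, which is our reading of the properties listed in the text (defined at $u = 0$ by l'Hôpital's rule using $R_t^2 = t$, positive for both signs of $u$, and growing like $\ln t$ in the collapsed phase). The function names are hypothetical, and the $u = 0$ branch takes an externally estimated derivative $\partial R_t^2/\partial u$ as input.

```python
import math


def psi(u, r2, t):
    """Assumed form Psi_t(u) = -ln(R_t^2 / t) / u, valid for u != 0."""
    if u == 0:
        raise ValueError("use psi_at_zero for u = 0")
    return -math.log(r2 / t) / u


def psi_at_zero(dr2_du, t):
    """l'Hopital limit at u = 0, where R_t^2 = t: Psi_t(0) = -(dR_t^2/du) / t."""
    return -dr2_du / t
```

Both attraction ($u > 0$, $R_t^2 < t$) and repulsion ($u < 0$, $R_t^2 > t$) give a positive value with this definition.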
It is easily seen that $\Psi_t(u)$ is positive for all $u$. Plots of $\Psi_t(u)$ versus $t$, both for positive and for negative values of $u$, are shown in Fig. 8. Assume there is a collapse transition at $u = u_c$. We then expect that $\Psi_t(u)$ diverges as $\ln t$ for $u > u_c$ and $t \to \infty$, while it should stay bounded for $u < u_c$. More precisely, we expect that $\Psi_t(u) \sim {\rm const} - a/t^{\delta}$ for $u < u_c$, where $\delta$ is another correction to scaling exponent. Plotting $\Psi_t(u)$ versus $t^{-\delta}$ should thus give straight lines converging to finite values for $t^{-\delta} \to 0$ if $u < u_c$, but upward bent curves diverging for $t^{-\delta} \to 0$ if $u > u_c$. One such plot, showing $\Psi_t(u)$ versus $t^{-0.22}$, is given in Fig. 9. From this and similar plots with different exponents, we conclude that (i) the data are consistent with this scenario; (ii) the critical point is at $u_c \approx 0$, most likely at $u_c = 0$ exactly; (iii) the correction to scaling exponent is $\delta = 0.22 \pm 0.05$; and (iv) at the critical point, $\Psi_t$ scales either as $\Psi_t(0) \sim \ln \ln t$ or as $\Psi_t(0) \sim [\ln t]^{\alpha}$ with $0 < \alpha \ll 1$. The former ($\Psi_t(0) \sim \ln \ln t$) seems preferred, but a clear distinction between these alternatives is not possible.
Studying $S_t$, the number of visited sites, is not very revealing. As seen from Fig. 10, there is no value of $u$ for which the curve is straight. The value $u \approx 0.7$ ($w = e^u \approx 2$) yields the straightest curve in the large-$t$ range $10^5 < t < 10^8$, but this is clearly not asymptotic, as the curves for larger $u$ indicate: at intermediate times they have not yet crossed over to asymptotic behavior and curve up, but they finally curve down for very large $t$.
For coupling constants $u \gg 1$ one finds again that the prediction of the quasistatic approximation, $R_t \sim t^{1/3}$, is in excellent agreement with the data (see Fig. 11). As in the 3-d case, corrections to this prediction are very large for small values of $u$, but they decrease quickly for $u \to \infty$. One would of course like to verify the quasistatic approximation for smaller $u$, but this seems at present impossible without going to lattice sizes beyond the reach of our normal computational resources.

E. Bond TSATWs in d = 3
We now turn to the bond reinforced random walk in $d = 3$. Results for the end-to-end distance $R_t^2$ are shown in Fig. 12. The plot is superficially quite similar to Fig. 2, with substantial corrections to scaling and the critical reinforcement $u_c$ occurring at much higher $u$. This latter fact is unsurprising as bond reinforcement is much more "dilute" than its site-reinforced cousin; consider that the equivalent of a visited site in the bond-reinforced model must have all six bonds visited in $d = 3$. Analysis suggests $u_c = 2.475 \pm 0.003$ and $\nu_c = 0.380 \pm 0.004$. Note that $\nu_c$ is within error of the estimate $\nu_c = 0.378 \pm 0.004$ for the site TSATW in $d = 3$. This is the first piece of evidence that the bond and site models are in the same universality class.
In Fig. 13 we show a data collapse with the same scaling ansatz Eq. (8), using $u_c = 2.475$, $\nu_c = 0.380$ and $\phi = 0.185 \pm 0.020$. The critical exponents $\nu_c$ and $\phi$ are within error of and identical to, respectively, those for the site TSATW in $d = 3$ (see Fig. 3), and the data collapse is if anything even better than that of Fig. 3. We see similar corrections to scaling at small $t$ and excellent collapse at large $t$, facilitated by our ability to simulate long walks ($\sim 10^7$ steps) due to the high value of $u_c$.
Results for the average number of visited sites, $S_t$, are not shown, but are similar to Fig. 4, with substantial corrections to scaling. We thus use the estimate of $u_c$ obtained from $R_t$, i.e. $u_c = 2.475 \pm 0.003$. The best estimate of the exponent $k_c$ (again made difficult by corrections to scaling) is $k_c = 0.970 \pm 0.010$. This is incompatible with the estimate $0.91 \pm 0.01$ of [11] but within the error of our estimate for the site-reinforced model, $k_c = 0.977 \pm 0.010$. The leading correction to scaling exponent, defined by $S_t = t^{k_c}[a + b/t^{\Delta} + o(t^{-\Delta})]$, is found to be $\Delta = 0.25 \pm 0.05$. This is within error of the estimate $\Delta = 0.22 \pm 0.03$ in the site case.
Hence in all cases the estimates of the critical exponents for the site-reinforced model given by [11] are excluded by our results as candidate exponents for the bond-reinforced model. The estimates of all exponents for the bond-reinforced model agree (within error) with those we obtained for the site-reinforced case, see Sec. II C. It is thus unsurprising that Fig. 14 similarly verifies the quasistatic approximation $R_t \sim t^{\nu}$ with $\nu = 0.25$ for $u > u_c$ in the large $t$ limit. Crossover to the asymptotic behavior occurs at smaller $t$ as $u \to \infty$. It is harder to verify the large $t$ limit for small $u > u_c$; we cannot run sufficiently long walks while limiting spurious self-intersection and hence reliably estimating $R_t$.
[Fig. 13: data of Fig. 12, plotted as $R_t^2/t^{2\nu_c}$ versus $(u - u_c)\,t^{\phi}$, with $\nu_c = 0.380$, $u_c = 2.475$, and $\phi = 0.185$. The critical exponents are within error bars of those used in the data collapse of Fig. 3.]

F. Bond TSATWs in d = 2
Our results for $d = 3$ indicate that the bond- and site-reinforced models are in the same universality class. This implies that $u_c = 0$ for bond TSATW in $d = 2$. To test this, we study small-$u$ walks, which will cross over to the collapsed behavior only for large values of $t$. This regime is even more difficult to study in the bond-reinforced case, due to the dilute nature of the bond reinforcement. The simulations used are as large as possible ($2^{32}$ sites and walks of $\approx 10^8$ steps) but in many cases these walks just reach the beginning of what may be the scaling regime.
In Fig. 15 we show results for the end-to-end distance $R_t^2/t$. As in the site-reinforced case for $d = 2$, the curves fan out in Fig. 15 for small $t$, with no apparent cross-over of the sort seen in Fig. 6 or Fig. 12. An estimate of the critical reinforcement $u_c$ from these data is extremely difficult. An attempted data collapse for the data of Fig. 15, using the scaling ansatz Eq. (8) and optimized values $u_c = 0.73$, $\nu_c = 0.481$, and $\phi = 0.058$, is shown in Fig. 16(a). While the data collapse acceptably for these values, the exponents proposed in [11], $u_c = 0.88$, $\nu_c = 0.40 \pm 0.01$ and $\phi \approx 0.2$, can be ruled out, as a data collapse using these exponents is completely unsatisfactory, see Fig. 16(b). As the estimated $\phi = 0.058$ is even smaller than that obtained for the site-reinforced model in $d = 2$ (where $\phi = 0.085$) and as $\nu_c = 0.481$ is very close to the random walk value $\nu = 1/2$, we argue that there is no phase transition at any $u_c > 0$ in the bond-reinforced model, either. In particular, similar heuristic arguments about the recurrence of 2-d random walks suggest again that any non-zero reinforcement is a relevant perturbation. Hence we study again the function $\Psi_t(u)$ [Eq. (9)] for $u > 0$, $u = 0$, and $u < 0$ (see Fig. 17). As for the site case (subsection II D), plotting $\Psi_t(u)$ against $1/t^{\delta}$ with different exponents $\delta$ reveals the detailed asymptotic behavior. Such plots (not shown here) indicate that $u_c = 0$, and that the correction to scaling exponent in the uncollapsed phase $u < u_c$ is $\delta = 0.20 \pm 0.05$, well within error of the estimated $\delta = 0.22 \pm 0.05$ of the site case.
As was the case for the site-reinforced model in d = 2, it is not particularly illuminating to study the number of visited sites S t . The corrections to scaling are even larger, and hence it is impossible to estimate the correct scaling exponent from these data.
For coupling constants $u \gg 0$ one finds that the prediction of the quasistatic approximation, $R_t \sim t^{1/3}$, is in excellent agreement with the data (see Fig. 18). Corrections to this prediction are very large for small values of $u$, but become irrelevant as $u \to \infty$, as is already apparent at $w = e^u = 16$. Familiar limitations to accessible lattice size make the quasistatic approximation impossible to verify for smaller $u$, where crossover to the asymptotic behavior takes place at very large $t$. Settling the validity of the quasistatic approximation in the small $u$ regime will in all likelihood require the development and application of appropriate analytical methods.

III. DISCUSSION AND OUTLOOK
Despite its simplicity, the once-reinforced site variant of the True Self Attracting Walk (TSATW) has generated considerable controversy since its original statement by Sapozhnikov [6]. In this paper we have used a combination of careful high-statistics simulations and heuristic arguments to attempt a resolution of many of these disputes. In $d = 3$ we confirm the existence of a phase transition from random walk-like to collapsed behavior for finite reinforcement $u_c$. Our simulations provide overwhelming evidence for rejecting the proposed $u_c$ and scaling exponents of [10,11]. We find $u_c = 1.831 \pm 0.002$ and $\nu_c = 0.378 \pm 0.004$, with the crossover exponent $\phi = 0.185 \pm 0.020$. In addition we verify the quasistatic approximation $R_t^2 \sim \sqrt{t}$ for large $t$ and $u$. In $d = 2$ we argue that there is no phase transition at any finite reinforcement $u_c$. For any $u > 0$ the walks go to a collapsed phase, and the "critical behavior" at $u_c = 0$ is simply that of a random walk. The quasistatic approximation $R_t^2 \sim t^{2/3}$ is also verified for large $t$ and $u$.
In addition to the site-reinforced TSATW, we studied the bond-reinforced variant and found evidence that despite the underlying mathematical differences (and related difficulties) bond-reinforced TSATW is in the same universality class as site-reinforced TSATW. In $d = 3$ we found a phase transition at finite reinforcement $u_c = 2.475 \pm 0.003$ and scaling behavior extremely similar to the site-reinforced model, with similar success of the quasistatic approximation. In $d = 2$ we found evidence of a phase transition at $u_c = 0$, although the evidence here is somewhat weaker due to the long time needed to cross over to the collapsed behavior and the memory limitations imposed by the extremely large lattices needed to minimize spurious self-intersection.
An obvious limitation of our work is the lack of an analytical understanding as to why the transition to a collapsed phase should occur for any nonzero reinforcement in the site and bond models in $d = 2$. In the mathematics literature, the once-reinforced ERRW has been studied by mapping it to a diffusion with a drift term (directed inward) at the boundary [13,27]. As far as we know this technique has only been applied in $d = 1$ and is even in this case of considerable technical difficulty. There are also techniques mapping the stochastic process to a (deterministic) dynamical system, the so-called "stochastic approximation", which is largely unknown to the physics literature [13,14]. This suggests that some sensible map to a continuous process or a dynamical system might enable an analytic proof of $u_c = 0$ in one or both of the variants of once-reinforced TSATW in $d = 2$.
More generally, the universality result we propose for site and bond TSATW in d = 3 and d = 2 suggests the possibility of a deep dialogue between the statistical physics and probability literatures. The perspective of statistical physics generates different questions (with respect to phase transitions, critical behavior, and universality) that complement the rigorous results derived within the probability community. Furthermore, the probability literature as reviewed in [13] contains an enormous number of unexplored models for random walks with reinforcement. It is also clear that these walk processes are specific instances of a general study of random processes with reinforcement, with many applications in the biological and social as well as physical sciences [13]. The statistical physics of such models remains an almost entirely open question.
We end on a cautionary note about the use of simulation in these problems. As pointed out by Pemantle [14], the convergence times for some random processes with reinforcement can be astronomical; the Friedman urn, for example, does not reach its asymptotic behavior until a googol updates or more. This suggests that in some cases the high statistics simulations that would be applied by statistical physicists may only be probing the transient behavior of such models. While the transient behavior has its own intrinsic interest, we suggest that a dialogue between the two fields would do much to drive research in mutually beneficial directions, while avoiding pitfalls along the way.