Linking the mixing times of random walks on static and dynamic random graphs

This paper considers non-backtracking random walks on random graphs generated according to the configuration model. The quantity of interest is the scaling of the mixing time of the random walk as the number of vertices of the random graph tends to infinity. Subject to mild general conditions, we link two mixing times: one for a static version of the random graph, the other for a class of dynamic versions of the random graph in which the edges are randomly rewired but the degrees are preserved. The link is provided by the probability that the random walk has not yet stepped along a previously rewired edge. We use this link to compute the scaling of the mixing time for three specific classes of random rewirings. Depending on the speed and the range of the rewiring relative to the current location of the random walk, the mixing time may exhibit no cut-off, one-sided cut-off or two-sided cut-off, a trichotomy that was also found in earlier work. Interestingly, for a class of dynamics that are `mesoscopic', i.e., non-local and non-global, we find new behaviour with six subregimes. Proofs are built on a new and flexible coupling scheme, in combination with sharp estimates on the degrees encountered by the random walk in the static and the dynamic version of the random graph. Some of these estimates require sharp control on possible short-cuts in the graph between the edges that are traversed by the random walk.


Introduction
Target. In the present paper we study the mixing time of a non-backtracking random walk on a dynamically rewired random graph initially drawn according to the configuration model. Our core result is a link between the mixing times on the static and the dynamic random graph. Subject to mild conditions on the degrees of the vertices and the dynamics of the underlying graph, we show that, up to an error that vanishes as the number of vertices tends to infinity, the total variation distance to the stationary distribution on the dynamic random graph is given by the total variation distance on the static random graph multiplied by the probability that the random walk has not yet stepped along a previously rewired edge. Phrased in symbols, we show that
$$D^{\mathrm{dyn}}_{x,\xi}(t) = P_{x,\xi}(\tau > t)\, D^{\mathrm{stat}}_{x,\xi}(t) + o_P(1), \tag{1.1}$$
where $x$ is the starting vertex of the random walk, $\xi$ is the starting configuration of the random graph, $D^{\mathrm{dyn}}_{x,\xi}(t)$ and $D^{\mathrm{stat}}_{x,\xi}(t)$ are the total variation distances between the distribution of the random walk at time $t$ and the stationary distribution for the dynamic, respectively, the static random graph, and $\tau$ is the first time the random walk crosses a rewired edge (see Theorem 1.4 below for a precise statement). The time $\tau$ acts as a randomised stopping time and plays a central role in our analysis.
Innovative aspects. Our goal is to build a general framework that can be applied to a large class of random graph dynamics for which the degree structure is preserved, including dynamics that depend on the position of the random walk and dynamics that are non-Markovian. To do so we use a coupling that works well for non-backtracking random walks. We show that (1.1) holds under general conditions that appear to be the weakest possible, and that can be verified in specific examples. In particular, we use (1.1) to identify the scaling of the random walk mixing time for three choices of the dynamics where the rewiring is done in a certain range around the current position of the random walk. Depending on the speed and the range of the rewiring, the mixing time may exhibit no cut-off, one-sided cut-off or two-sided cut-off, a trichotomy that was also found in earlier work (see Section 1.5 for an extensive literature overview). Interestingly, for a class of dynamics that are non-local and non-global, we find new behaviour with six subregimes (see Fig. 3 below), two of which include critical crossover times where the mixing profile changes shape.

Model and notation
It is convenient to describe our model in terms of half-edges. Write $V$ to denote the vertex set of the graph, $n := |V|$ the number of vertices, and $\deg(v)$ the degree of vertex $v \in V$. To each vertex $v \in V$ we associate $\deg(v)$ half-edges, forming the set $H_v$. The set of all half-edges is $H := \bigcup_{v \in V} H_v$. We denote the vertex $v$ for which $h \in H_v$ by $v(h) \in V$. If $x, y \in H_v$, $x \neq y$, then we write $x \sim y$ and say that $x$ and $y$ are siblings of each other. Using $|X|$ to denote the cardinality of a set $X$, we define the degree of a half-edge $h \in H$ as
$$\deg_H(h) := |\{x \in H : x \sim h\}| = \deg(v(h)) - 1. \tag{1.2}$$
We identify an edge with a pair of half-edges. A configuration is a pairing $\xi$ of half-edges with the property that $\xi(h) \neq h$ and $\xi(\xi(h)) = h$ for all $h \in H$. The set of all configurations on $H$ is denoted by $\mathrm{Conf}_H$, and the uniform distribution on $\mathrm{Conf}_H$ is denoted by $U_{\mathrm{Conf}_H}$.
With a slight abuse of notation, we will use the same symbol ξ to denote the set of pairs of half-edges forming ξ, so {x, y} ∈ ξ means that ξ(x) = y and ξ(y) = x. Note that ξ may represent a multi-graph, possibly with self-loops. A random graph corresponding to a configuration where the half-edges are paired uniformly at random is called the configuration model (see [10], [24,Chapter 7]). The quantities above depend on n, but this dependence will be mostly suppressed from the notation.
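The uniform pairing of half-edges described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `sample_configuration`, the integer labelling of half-edges, and the `owner` list (mapping a half-edge to its vertex) are assumptions made for the example. Shuffling the half-edges and pairing consecutive entries yields a uniformly distributed pairing.

```python
import random


def sample_configuration(degrees, seed=None):
    """Sample a uniform pairing of half-edges (configuration model).

    Half-edges are labelled 0, 1, ..., sum(degrees) - 1 (a labelling
    chosen for this sketch). Returns (owner, xi), where owner[h] is the
    vertex of half-edge h and xi[h] is the half-edge paired with h.
    """
    rng = random.Random(seed)
    # owner[h] = v(h): vertex v contributes degrees[v] half-edges.
    owner = [v for v, d in enumerate(degrees) for _ in range(d)]
    if len(owner) % 2 != 0:
        raise ValueError("the total degree must be even")
    order = list(range(len(owner)))
    rng.shuffle(order)
    xi = [None] * len(owner)
    # Pair consecutive half-edges in the shuffled order.
    for a, b in zip(order[0::2], order[1::2]):
        xi[a], xi[b] = b, a
    return owner, xi
```

The returned list `xi` satisfies the two defining properties of a configuration: no half-edge is paired with itself, and the pairing is an involution. Self-loops and multiple edges are allowed, in line with the remark that a configuration may represent a multi-graph.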
We study Markov chains {(X t , C t )} t∈N 0 , where X t ∈ H denotes the non-backtracking random walk component and C t ∈ Conf H corresponds to the evolution of the underlying graph. The evolution is chosen in such a way that it does not change the degree sequence of the graph (and consequently does not change the stationary distribution of the random walk on the graph), and can be visualised by breaking up pairs of half-edges and pairing them again, both according to prescribed rules. At each time t ∈ N, we first update the configuration and then let the walk move. Remark 1.1 (Notation). Note that (X t−1 , C t−1 ) is the state just before the transition at time t, while (X t , C t ) is the state just after the transition at time t.
Our main result concerns the total variation distance between the distribution of the random walk component and the stationary uniform distribution $U_H$ on the set of half-edges, defined as
$$D^{\mathrm{dyn}}_{x,\xi}(t) := \| P_{x,\xi}(X_t \in \cdot) - U_H(\cdot) \|_{\mathrm{TV}}. \tag{1.3}$$
Here, the total variation distance between two probability measures $\mu$ and $\nu$ on the same finite state space $S$ is defined by
$$\| \mu - \nu \|_{\mathrm{TV}} := \frac{1}{2} \sum_{s \in S} |\mu(s) - \nu(s)|. \tag{1.4}$$
We are concerned with the behaviour of $D^{\mathrm{dyn}}_{x,\xi}(t)$ for "typical" choices of $x$ and $\xi$. We formalise the notion of typicality in the following definition:
Definition 1.2 (With high probability). Recall that $n = |V|$ and let $\mu := U_H \times U_{\mathrm{Conf}_H}$. A statement that depends on the initial half-edge $x$ and the initial configuration $\xi$ is said to hold with high probability, abbreviated whp, if the $\mu$-measure of the set of pairs $(x, \xi)$ for which the statement holds tends to 1 as $n \to \infty$.
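As a small numerical aid, the total variation distance on a finite state space can be computed as half the $\ell^1$-distance between the two probability vectors. The sketch below assumes that measures are stored as Python dictionaries mapping states to probabilities; the name `total_variation` is chosen for this example only.

```python
def total_variation(mu, nu):
    """Total variation distance between two probability measures.

    mu and nu are dicts mapping states to probabilities; a state that
    is missing from a dict is treated as having probability zero.
    """
    states = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(s, 0.0) - nu.get(s, 0.0)) for s in states)
```

For instance, a point mass at one of two states is at distance 1/2 from the uniform distribution on those two states.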
Another important object is the first time the random walk steps along a previously rewired edge:
Definition 1.3 (Randomized stopping time). Let $R_t$ be the set of edges being rewired at time $t$, $R_{\le t} := \bigcup_{s=1}^{t} R_s$, and let $I_t$ denote the indicator of the event that the random walk steps along a previously rewired edge at time $t$, i.e., $I_t = 1$ when $X_{t-1} \in R_{\le t}$ and $I_t = 0$ otherwise. We define the randomized stopping time $\tau$ as
$$\tau := \min\{t \in \mathbb{N} : I_t = 1\}. \tag{1.5}$$
Note that, since rewiring happens before the random walk steps, $X_{t-1}$ is the position of the random walk just before it steps over an edge that is rewired at time $t$.
For x ∈ H and ξ ∈ Conf H , we denote by D stat x,ξ (t) the total variation distance of the random walk on the static random graph to the stationary uniform distribution U H at time t, and by P x,ξ (τ > t) the probability that τ > t, both given the starting state (x, ξ).

Mixing for general rewiring mechanisms
The main theorem of this paper is the following statement linking the total variation distance to the stationary distribution for the static and the dynamic version of the random graph:
Theorem 1.4 (Link between static and dynamic mixing). Suppose that $t = O(\log n)$. Subject to Conditions 3.1 and 3.5 below, the following holds whp in $x$ and $\xi$:
$$D^{\mathrm{dyn}}_{x,\xi}(t) = P_{x,\xi}(\tau > t)\, D^{\mathrm{stat}}_{x,\xi}(t) + o_P(1). \tag{1.6}$$
Conditions 3.1 and 3.5 are regularity conditions. The former is rather standard in the literature and ensures that the underlying graph is sparse and that the non-backtracking random walk is well-defined. The latter, representing one of the novelties of this article, ensures that the non-backtracking random walk is well-mixed when it steps along a previously rewired edge and that the time at which this happens does not depend on the fine details of its past trajectory. The proof of Theorem 1.4 is based on a coupling argument in which the random walk on the dynamically rewired random graph is coupled to a modified random walk on the static random graph that at certain random times makes uniform jumps. These jumps correspond to the times at which the random walk steps along a previously rewired edge. The coupling must be good enough to beat the errors in the comparison. A key ingredient of the coupling is that the non-backtracking random walk on the configuration model is whp self-avoiding on the scale of the mixing time.
Note that while {(X t , C t )} t∈N is Markov, the marginal {X t } t∈N need not be, even though the stationary distribution of the latter is still the uniform distribution.

Application to specific rewiring mechanisms
We next consider three choices of random rewiring, referred to as local-to-global, near-to-global and global-to-global, controlled by two parameters: (1) r n , representing the radius of the ball around the current location of the random walk in which edges are allowed to be rewired with an edge that is drawn uniformly at random from the set of all edges; (2) α n , representing the probability that an edge in this ball is rewired per unit of time. By rewiring we mean breaking up two pairs of chosen edges into four half-edges and tying these up at random (for details, see Section 4).
At every unit of time a subset of the edges is rewired. The rewiring of each edge is always with an edge that is chosen uniformly at random from the set of all edges. For the subset of edges that is rewired we consider three choices:
• Local-to-global (r n = 1): The edge that corresponds to the current position of the random walk has probability α n to be rewired.
• Near-to-global (1 < r n < r max ): All the edges in the r n -ball around the current position of the random walk have probability α n to be rewired, independently of each other.
• Global-to-global (r n = r max ): All the edges have probability α n to be rewired, independently of each other.
Here, r max is the maximal radius (see (1.12) below), provided that the graph is connected (which happens whp under the conditions that will be stated below).
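The elementary rewiring move described above (break two edges into four half-edges and tie these up at random) can be sketched as follows. This is only an illustration under stated assumptions: the precise law of the re-pairing is specified in Section 4 and is not reproduced here, so the sketch simply re-pairs the four half-edges uniformly at random, which may also restore the original two edges. The function name `rewire_pair` and the list representation of the configuration are hypothetical.

```python
import random


def rewire_pair(xi, a, b, rng=random):
    """Break the edges {a, xi[a]} and {b, xi[b]} and re-pair at random.

    xi is a list with xi[h] the half-edge paired with h. A new list is
    returned and the input is left unchanged. Assumption of this sketch:
    the four half-edges are re-paired uniformly at random.
    """
    four = [a, xi[a], b, xi[b]]
    if len(set(four)) != 4:
        raise ValueError("the two edges must be distinct and without self-loops")
    rng.shuffle(four)
    new_xi = list(xi)
    # Tie up the shuffled half-edges two by two.
    for p, q in ((four[0], four[1]), (four[2], four[3])):
        new_xi[p], new_xi[q] = q, p
    return new_xi
```

Because only the pairing changes and no half-edge changes its vertex, the degree sequence is preserved by construction, which is the property that keeps the uniform distribution on half-edges stationary.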
Global-to-global rewiring was considered in [2] and [3], while local-to-global rewiring was considered in an unpublished chapter of the doctoral thesis [22]. In the present paper, however, we prove results under weaker assumptions. For an overview of previous work, see Section 1.5. Near-to-global rewiring is new and turns out to hold surprises:
Theorem 1.5 (Scaling of cross-rewired time). Suppose that $\lim_{n\to\infty} \alpha_n = 0$ and $t = O(\log n)$. Subject to Conditions 3.1(R1) and (R3) below, the following hold whp in $x$ and $\xi$:
(A) For local-to-global rewiring defined in Section 4.2:
(B) For near-to-global rewiring defined in Section 4.3, subject to Condition 3.6 below: (a) If $\lim_{n\to\infty} \alpha_n r_n^2 = \infty$, then
(C) For global-to-global rewiring defined in Section 4.4:
Note that the tail probability $P_{x,\xi}(\tau > t)$ exhibits a trichotomy for near-to-global rewiring, with an additional crossover at time $t = r_n$ when $\lim_{n\to\infty} \alpha_n r_n^2 = \beta \in (0, \infty)$. Condition 3.6 says that the empirical degree distribution converges to a limit as $n \to \infty$, and so do its first and second moments. It implies that whp the radius (i.e., the typical distance between vertices) of the random graph is
$$\frac{\log n}{\log \nu}, \tag{1.12}$$
where $\nu$ is the size-biased mean of the limiting empirical degree distribution [25,Theorem 7.1], which is assumed to satisfy $\nu \in (1, \infty)$. Thus, for near-to-global rewiring we can only choose
Condition 3.6 is needed for Theorem 1.5(B) only. The fact that it is not needed for Theorem 1.5(C) weakens the conditions in [2,3]. We expect Theorem 1.5(B) to fail without Condition 3.6. Namely, when the degree distribution has infinite variance, graph distances are of smaller order than $\log n$, and in fact are of order $\log \log n$ under an appropriate power-law assumption on the empirical degree distribution [15,25,27,28]. In the latter setting, for $r_n = c \log n$ we expect near-to-global rewiring to behave similarly to global-to-global rewiring.
In order to exploit Theorem 1.4, we need to also control D stat x,ξ (t). For this we use the following result from [6], which requires additional regularity conditions stronger than Condition 3.1 (see Appendix B for further details): Theorem 1.6 (Scaling of static mixing time). Subject to (B.3) and Condition B.1, the following holds whp in x and ξ: where c * ∈ (0, ∞) is the constant defined in (B.3).

1.
Each of the three choices of rewiring shows a trichotomy between fast dynamics ($\alpha_n r_n \gg 1/\log n$), moderate dynamics ($\alpha_n r_n \asymp 1/\log n$) and slow dynamics ($\alpha_n r_n \ll 1/\log n$), with $r_n = 1$ for local-to-global rewiring, $1 < r_n < r_{\max}$ for near-to-global rewiring and $r_n = r_{\max}$ for global-to-global rewiring. For fast dynamics the mixing time is of smaller order than $\log n$, which is the mixing time on the static random graph, and so speed-up occurs. For moderate and slow dynamics the mixing time is of order $\log n$, and so no speed-up occurs. The one-sided cut-off for moderate dynamics shows that there is a competition between static and dynamic. For fast dynamics only Conditions 3.1 and 3.5 are needed, while for moderate and slow dynamics (B.3) and Condition B.1 are needed as well. For fast dynamics the scaling does not depend on the choice of degrees, subject to the mild regularity imposed by Condition 3.1. On the other hand, for moderate and slow dynamics it does, because the constant $c_*$ equals the limit as $n \to \infty$ of the empirical average of the logarithm of the degrees of the half-edges.

2.
Whereas for local-to-global and global-to-global rewiring the trichotomy controls the scaling, for near-to-global rewiring several subregimes show up. In particular, crossovers in the mixing time occur at critical values of the scaling parameter c (see (1.19) and (1.21)). These arise from a crossover in the cross-rewired time that appears as soon as t r n (see (1.9)). What happens is that all edges on the r n -future of the path can be rewired before the random walk reaches them, but only until time t − r n : for any time s ∈ (t − r n , t] only t − s edges are left on the future path until time t. The extra condition in Condition 3.6 ensures that whp the r n -balls carried around by the random walk do not overlap significantly, i.e., short-cuts of length ≤ r n are negligible until time t = O(log n).

4.
The coupling of the random walk on the dynamically rewired random graph to the modified random walk is implicit in the proof of the main theorem in [3]. There the main idea was that the path probabilities for the two random walks coincide for self-avoiding paths, and it was shown that the two random walks are with high probability self-avoiding. The crucial observation was that, on a typical configuration drawn according to the configuration model, the random walks are self-avoiding with high probability. The particular form of Condition 3.5 was motivated by this observation, and suggests that the same results may hold when the initial graph is drawn according to some other distribution, on which non-backtracking random walks are typically self-avoiding.

5.
The graph regularity conditions in Condition 3.5 are mild, but can be violated. Consider for example a modification of the local-to-global rewiring in which the probability α n of the half-edge X t−1 being rewired at time t depends on the specific value of X t−1 , e.g. α n (X t−1 ) = 1/ deg H (X t−1 ). This would lead to a violation of Condition 3.5(D1). Condition 3.5(D2) can be violated by a graph rewiring mechanism that at each time gives preferential treatment to some half-edges. For example, fix a set of half-edges F with |F | n, and define a "local-to-F" graph dynamics where the edge that might get rewired at time t with {X t−1 , C t−1 (X t−1 )} is chosen from the set of edges generated by the configuration C t−1 such that each edge contains at least one half-edge from F . This obviously results in a violation of Condition 3.5(D2).
6. The scaling regimes considered in Theorem 1.5, Corollaries 1.7-1.9 and Figures 1-3 are chosen so as to end up with non-trivial scaling profiles. Apart from conditions on t in terms of α n and r n , there is also the implicit condition that t = O(log n). However, since the probability that τ > t is monotone decreasing in both α n and r n , we get trivial scaling profiles outside these regimes.

Previous work
The past decade has witnessed much activity towards understanding processes, both random and deterministic, on dynamic networks [1,7,17,18,20,21,29,31,34,38]. Research is motivated not only by mathematical interest, but also by numerous applications in computer science and data science. One of the emerging efforts is concerned with the study of mixing times of random walks on dynamic networks, and how they compare with those of random walks on static networks. The present paper fits within this line of research.
In [2] we introduced a version of a dynamic configuration model in which a fraction of the edges gets rewired at each step of the random walk according to a global-to-global rewiring mechanism. We obtained an expression for the mixing time of a non-backtracking random walk under conditions that guarantee a locally tree-like structure of the graph and fast dynamics. In [3] we extended our results to moderate and slow dynamics. In particular, we obtained a trichotomy for the mixing time of non-backtracking random walks, of the type as stated in Corollary 1.9. In the current paper, however, we achieve this trichotomy under weaker assumptions.
Trichotomies were also found in subsequent work. The closest to our setting is [14], where the authors consider a dynamic directed version of the configuration model. Contrary to our setting, for the directed graph the rewiring no longer preserves the stationary measure, and the analysis in [14] is restricted to a rewiring mechanism in which all the edges are freshly resampled at each step of the random walk. Two trichotomies are derived for the worst-case total variation distance, respectively, for the joint Markov process given by the graph and the random walk and for the non-Markov process given by the random walk marginal. Trichotomies can also emerge in the presence of other random mechanisms that do not directly change the graph. This is well illustrated in [13,37,38], where crossovers were established for random walks on random graphs with various PageRank-like transitions. Results are analogous to Theorem 1.4, with the role of the randomized stopping time τ replaced by the first time the walk gets "teleported" by a PageRank-like transition.
Mixing studies for random walks on dynamic random graphs started with [33], which considered random walks on dynamic percolation clusters on a d-dimensional discrete torus, i.e., a stochastic version of percolation where edges appear and disappear independently at a given rate. In [33] and subsequent works [23,32], mixing times were identified for several parameter regimes controlling the rates of the random walk and the random graph dynamics. Similar results were obtained for dynamic percolation on the complete graph [35,36]. One of the main difficulties with the dynamic percolation setting is that the stationary distribution of the random walk changes over time, which explains why results tend to be restricted to specific parameter regimes. Some further advances were achieved in [4,35], where general bounds on mixing times, and other quantities such as hitting, cover and return times, were derived for certain classes of evolving graphs under proper expansion assumptions. Typically, random walk mixing on a dynamic graph is faster than on a static graph, although [4] contains some (artificial) examples where the dynamics makes the mixing slower. Speed-up of mixing times for general Markov chains was recently analysed in [16], which also contains an overview of related results.
Unlike for dynamic graphs, mixing times of random walks on static random graphs form a well-established subject. For the present paper it is important to note the work in [6,8,30], where (two-sided) cut-offs on time scale log n were established for both simple and non-backtracking random walks on a fairly general class of sparse undirected random graphs with good expansion properties. More recently, similar results were obtained for static random graphs with directed edges [11,12] or with a community structure [5].

Outline
The remainder of this paper is organised as follows. In Section 2 we define the random walk and the random graph dynamics. In Section 3 we prove Theorem 1.4. In Section 4 we prove Theorem 1.5. In Appendix A we show that the joint Markov chain of random walk and dynamically rewired random graph is irreducible, aperiodic and doubly-stochastic. In Appendix B we recall the precise form of Theorem B.2. In Appendix C we identify the general form of the transition matrix for rewirings and prove that the stationary distribution for the class of "anything-to-global" rewirings is the uniform distribution on H.

Random graph dynamics and random walk
In this section we set up the model. In Section 2.1 we give a general description of the rewiring mechanism for the random graph (specific choices will be considered in Section 4). In Section 2.2 we define the non-backtracking random walk. In Section 2.3 we define the joint process of random graph and random walk.

Random graph dynamics
We consider a general class of graph dynamics in which some edges are randomly rewired at each unit of time according to a prescribed rule. First a subset of edges to be rewired is chosen randomly, then these edges are broken into half-edges, and afterwards the resulting half-edges are paired randomly according to a prescribed distribution. The set of half-edges involved in the rewiring at time t ∈ N is denoted by R t .
Suppose that X t−1 = x and C t−1 = ξ. Then, at time t, the above dynamics gives rise to a distribution Q x (ξ, ·) on Conf H . In [2], [3] a specific choice of dynamics was considered in which Q x (ξ, ·) did not actually depend on x. In such a situation, the configuration component forms a Markov chain itself.

Random walk
We consider a non-backtracking random walk on a dynamic random graph in which some edges are rewired at each step. By non-backtracking we mean that the random walk cannot traverse the same edge twice in a row. Since in our model the underlying graph is dynamic and the edges change over time, the random walk is more conveniently defined as a random walk on the set of half-edges $H$. Recall that at time $t \in \mathbb{N}$ we update the configuration to $C_t = \xi$ and only then let the random walk make a move. Then the random walk moves according to the transition probabilities
$$P_\xi(x, y) := \begin{cases} \dfrac{1}{\deg_H(\xi(x))}, & \text{if } y \sim \xi(x), \\ 0, & \text{otherwise.} \end{cases}$$
More descriptively, when the random walk is on a half-edge $x$ and the graph is in configuration $\xi$, the random walk moves to one of the siblings of the half-edge that the current half-edge $x$ is paired with, chosen uniformly at random (see Fig. 4). The transition probabilities are symmetric with respect to the pairing given by $\xi$, i.e., $P_\xi(x, y) = P_\xi(\xi(y), \xi(x))$. In particular, the transition matrix is doubly stochastic, and so the uniform distribution on $H$, denoted by $U_H$, is the stationary distribution for the random walk process:
$$\sum_{x \in H} U_H(x)\, P_\xi(x, y) = U_H(y) \quad \text{for all } y \in H.$$
Figure 4: The random walk moves from half-edge $X_t$ to half-edge $X_{t+1}$, one of the siblings of the half-edge $\xi(X_t)$ that $X_t$ is paired to.
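The step rule described in words above (move to a uniformly chosen sibling of the half-edge paired with the current one) can be illustrated with a short Python sketch. The representation is an assumption of this example: half-edges are integers, `owner[h]` is the vertex of half-edge `h`, and `xi[h]` is its partner. The helper `transition_matrix` builds the full matrix so that double stochasticity can be checked numerically on a small graph.

```python
import random


def siblings(owner, h):
    """Half-edges attached to the same vertex as h, excluding h itself."""
    return [g for g in range(len(owner)) if owner[g] == owner[h] and g != h]


def nbrw_step(owner, xi, x, rng=random):
    """One non-backtracking step from half-edge x in configuration xi."""
    return rng.choice(siblings(owner, xi[x]))


def transition_matrix(owner, xi):
    """Dense transition matrix of the walk on half-edges."""
    n = len(owner)
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        sibs = siblings(owner, xi[x])
        for y in sibs:
            P[x][y] = 1.0 / len(sibs)
    return P
```

On any configuration in which every vertex has degree at least 2, both the row sums and the column sums of the matrix equal 1, which is the double stochasticity used to conclude that the uniform distribution on half-edges is stationary.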

Joint process
The law of the joint Markov chain $(X_t, C_t)_{t \in \mathbb{N}}$, starting from initial half-edge $x \equiv X_0$ and initial configuration $\xi \equiv C_0$, is given by the conditional probabilities
$$P_{x,\xi}(X_t = z, C_t = \zeta \mid X_{t-1} = y, C_{t-1} = \eta) = Q_y(\eta, \zeta)\, P_\zeta(y, z),$$
where the transition probabilities $Q_y(\cdot, \cdot)$ remain to be chosen. While the joint process is Markov, the marginal processes $X = (X_t)_{t \in \mathbb{N}}$ and $C = (C_t)_{t \in \mathbb{N}}$ need not be Markov. Consequently, the total variation distance $\| P_{x,\xi}(X_t \in \cdot) - U_H(\cdot) \|_{\mathrm{TV}}$ is not guaranteed to be decreasing in $t$, even when it converges to 0. We emphasise that at each time step the graph evolution happens first and only then the random walk makes a move.
Furthermore, note that when the graph dynamics does not depend on the random walk, i.e., $Q_x(\cdot, \cdot) = Q_y(\cdot, \cdot)$ for all $x, y \in H$, the uniform distribution $U_H$ is the stationary distribution for the random walk, i.e., for all $\xi \in \mathrm{Conf}_H$ and $t \in \mathbb{N}$,
$$\sum_{x \in H} U_H(x)\, P_{x,\xi}(X_t = y) = U_H(y) \quad \text{for all } y \in H.$$
This can easily be seen by noting that the random walk conditioned on a realisation of the graph dynamics is a time-inhomogeneous Markov chain for which $U_H$ is the stationary distribution.

Proof of the main theorem
In this section we build up the apparatus that is required to prove Theorem 1.4. In Section 3.1 we formulate the regularity conditions for the graph and its evolution. In Section 3.2 we introduce the modified random walk, which lives on the static random graph. In Section 3.3 we propose a coupling of the modified random walk and the dynamically rewired random walk. In Section 3.4 we analyse the errors in the coupling. In Section 3.5 we use the coupling to prove Theorem 1.4.

Regularity conditions
In the formulation of Theorem 1.4 we refer to certain regularity conditions, which we lay out next. The first set of conditions concerns the degrees of the graph: Condition 3.1(R1) ensures that the graph is sparse, and together with Condition 3.1(R2) guarantees that the paths of the random walk are with high probability self-avoiding on relevant time scales (see Lemma 3.10 below). Condition 3.1(R3) is a consistency condition ensuring that the non-backtracking random walk is well-defined. As stated in the introduction, Condition 3.1 is standard in the literature, unlike the forthcoming Condition 3.5. To state this new condition, we require further notation: and sequences of half-edges x 0 [r] ) and the role of the sequences in ) be the event that (see Fig. 5): are dynamically self-avoiding with respect to T .
When the event DSA(T, ) occurs, we say that the random walk on the dynamic random graph has a dynamically self-avoiding history up to time t. We call , the sequences in Definition 3.2 have the following interpretation: times when the random walk steps along a previously rewired edge, half-edges the random walk visits up to time t − 1.
half-edges that are paired withx 1 , . . . ,x r in the initial configuration.
With these definitions in hand, we can now state the conditions on the random graph dynamics: Condition 3.5 (Regularity of graph dynamics). Recall that I t is the indicator of the event that a random walk steps over a rewired edge at time t (see Definition 1.3). For all t = O(log n) and all T = {t 1 , . . . , t r } ⊂ [t − 1] the following conditions hold (note that I t is random given (I s ) 0<s<t ): avoiding histories with respect to T , where the bound is uniform in the histories.
that describe dynamically self-avoiding histories with respect to T , where the bound is uniform in the histories.
In case (D1) and (D2) cannot be verified for all sets of sequences that describe a dynamically self-avoiding history with respect to T , the following suffices:
Part (D1) states that the times at which the random walk steps over a rewired edge are almost independent of the fine details of the random walk, provided it has a good dynamically self-avoiding history. Part (D2) states that a random walk with a good dynamically self-avoiding history is close to being mixed right after it steps over a rewired edge. The error terms of order o(1/ log n) are chosen such that we can carry out the estimates in Lemma 3.10 below. Part (D3) ensures that good dynamically self-avoiding histories are typical.
To identify the scaling of the mixing time for near-to-global rewiring in Corollary 1.8, we need an extra regularity condition:
Condition 3.6 (Regularity of degree distribution). Let $p_n := \frac{1}{n} \sum_{v \in V} \delta_{\deg(v)}$ denote the empirical degree distribution. We require that:
(R1*) $\lim_{n\to\infty} p_n = p$ pointwise, for some probability distribution $p$ on $\mathbb{N}$.
The size-biased mean minus one of $p$ is
$$\nu := \frac{\sum_{k \in \mathbb{N}} k(k-1)\, p(k)}{\sum_{k \in \mathbb{N}} k\, p(k)}$$
and is assumed to satisfy $\nu > 1$.
We may interpret $\nu$ as the average forward degree of a uniformly chosen half-edge, which plays the role of the mean offspring in the branching-process approximation of the local limit of the configuration model. In view of Condition 3.1(R3), the condition $\nu > 1$ amounts to the requirement $p \neq \delta_2$.
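To make the role of the average forward degree concrete, the following Python sketch computes the size-biased mean minus one for a finite degree sequence, together with the leading-order typical distance $\log n / \log \nu$ from (1.12). It works with the empirical degree sequence rather than the limit $p$, and the function names are chosen for this illustration only.

```python
import math


def forward_degree_mean(degrees):
    """Size-biased mean minus one of the empirical degree distribution.

    Equals the average number of siblings of a uniformly chosen half-edge.
    """
    total = sum(degrees)
    if total == 0:
        raise ValueError("the degree sequence has no half-edges")
    return sum(d * (d - 1) for d in degrees) / total


def typical_distance(degrees):
    """Leading-order typical distance log n / log nu, requiring nu > 1."""
    nu = forward_degree_mean(degrees)
    if nu <= 1.0:
        raise ValueError("need nu > 1")
    return math.log(len(degrees)) / math.log(nu)
```

For a 3-regular degree sequence every half-edge has two siblings, so the function returns 2, and the typical distance is the base-2 logarithm of the number of vertices. For a 2-regular sequence the value is exactly 1, which is the degenerate case excluded by the condition on $\nu$.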

Modified random walk
We define a modified random walk, denoted by $(Y_t)_{t \in \mathbb{N}}$, as a random walk on a static random graph that at certain random times makes uniform jumps. Formally, we have a sequence $(J_t)_{t \in \mathbb{N}}$ of random variables adapted to a filtration $(\mathcal{F}_t)_{t \in \mathbb{N}}$, taking values in $\{0, 1\}$ according to a pre-specified distribution on $\{0, 1\}^{\mathbb{N}}$. For fixed $t \in \mathbb{N}$, $J_t$ is seen as the indicator of the event that the modified random walk makes a uniform jump at time $t$. The law of the modified random walk $(Y_t)_{t \in \mathbb{N}}$ on $\xi$ that starts from the initial half-edge $x \equiv Y_0$, which is adapted to $(\mathcal{F}_t)_{t \in \mathbb{N}}$, is given by the conditional probabilities
$$P^{\mathrm{mod}}_{x,\xi}(Y_t = z \mid Y_{t-1} = y, J_t = j) = \begin{cases} P_\xi(y, z), & \text{if } j = 0, \\ U_H(z), & \text{if } j = 1. \end{cases}$$
Note that, according to the definition, neither $(J_t)_{t \in \mathbb{N}}$ nor the pair $(Y_t, J_t)_{t \in \mathbb{N}}$ needs to be Markov, but $(Y_t)_{t \in \mathbb{N}}$ is Markov conditionally on a realisation of $(J_t)_{t \in \mathbb{N}}$. Uniform jumps of the modified random walk can be rephrased in the following form. Draw an auxiliary half-edge uniformly at random, independently of the random walk path and the jump times. If $J_t = 1$, then we choose a uniform sibling of this auxiliary half-edge, say $y$, and set $Y_t = y$. Since the auxiliary half-edge is uniform and one of its siblings is chosen uniformly at random, the resulting half-edge is distributed uniformly on $H$. Even though the auxiliary half-edge is already chosen uniformly at random, working with its sibling (which is also a half-edge chosen uniformly at random) will come in handy in the coupling argument in Section 3.3.
As an analogue of τ , we define σ to be the first time that the modified random walk makes a uniform jump, i.e., σ := inf{t ∈ N : J t = 1}. (3.10)
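The modified random walk and the time $\sigma$ can be simulated along the following lines. This is a sketch under simplifying assumptions: the jump indicators are passed in as a fixed finite 0/1 sequence (in the paper their law may depend on the past), half-edges are integers with `owner[h]` the vertex of `h` and `xi[h]` its partner, and the name `modified_walk` is hypothetical.

```python
import random


def modified_walk(owner, xi, x0, jumps, rng=random):
    """Simulate the modified walk on the static configuration xi.

    jumps is a finite sequence of indicators J_1, J_2, ... (an input of
    this sketch). Returns (path, sigma), where sigma is the first time
    with a uniform jump, or None if no jump occurs.
    """
    def siblings(h):
        return [g for g in range(len(owner)) if owner[g] == owner[h] and g != h]

    path, sigma = [x0], None
    for t, jump in enumerate(jumps, start=1):
        if jump:
            # Uniform jump: uniform sibling of a uniform auxiliary half-edge.
            aux = rng.randrange(len(owner))
            nxt = rng.choice(siblings(aux))
            if sigma is None:
                sigma = t
        else:
            # Ordinary non-backtracking step on the static graph.
            nxt = rng.choice(siblings(xi[path[-1]]))
        path.append(nxt)
    return path, sigma
```

Passing an all-zero sequence of indicators recovers the non-backtracking random walk on the static graph, for which no uniform jump ever occurs.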

Coupling of modified and dynamically rewired random walk
We couple the law P x,ξ (X t ∈ ·) of the random walk on the dynamic random graph, with initial half-edge x and initial configuration ξ, to the law P mod x,ξ (Y t ∈ ·) of the modified random walk. We want the coupled random walks to stick together as much as possible. When the two random walks make different steps, we say that the coupling of the two random walks has failed. Until the coupling fails, the times at which the random walk on the dynamically rewired graph makes a step over a previously rewired edge correspond to the times at which the modified random walk makes a uniform jump.
Definition 3.7 (Coupling to a modified random walk). Let X t be a non-backtracking random walk starting in the initial state (x, ξ), where x ∈ H, ξ ∈ Conf H , and Y t be a modified random walk on ξ starting in x. First, define a sequence of auxiliary random sets (A t ) t∈N 0 . Call A t the set of active half-edges at time t. Let A 0 be the set consisting of the initial half-edge of the random walk and its siblings, i.e., A 0 := H v(x) . Define the coupling of the non-backtracking random walk X t and the modified random walk Y t at any time t ∈ N by the following rules: 1. If ξ(X t−1 ) or any of its siblings belong to A t−1 , then declare the coupling as failed.
2. If deg H (X t−1 ) > (log n) 2+ε (recall that n := |V |), then declare the coupling as failed. If Condition 3.5(D3) is not needed, then this rule is suspended (see Remark 3.8 below for further details).
3. If the coupling has not yet failed, then maximally couple the distribution of I t , conditionally on the history of the random walk and the rewired edges seen by the random walk, to the distribution of J t , conditionally on the values of the indicators J 1 , . . . , J t−1 . The following three outcomes are possible:
(a) If the coupling of the conditional distributions of I t and J t is successful and I t = J t = 0, then create A t as the union of A t−1 with ξ(X t−1 ) and all its siblings. Let the random walk on the dynamic graph make a move and set Y t := X t .
(b) If the coupling of the conditional distributions of I t and J t is successful and I t = J t = 1, then maximally couple the distribution of C t (X t−1 ), i.e., the half-edge paired with X t−1 in configuration C t , conditionally on the history of the random walk and I t = 1, to the distribution of Y t :
i. If the coupling of C t (X t−1 ) and Y t is successful, and neither C t (X t−1 ) nor any of its siblings is already contained in A t−1 , then add ξ(X t−1 ) and all its siblings, along with C t (X t−1 ) and all its siblings, to A t−1 in order to obtain A t . Phrased in symbols:
$$A_t := A_{t-1} \cup H_{v(\xi(X_{t-1}))} \cup H_{v(C_t(X_{t-1}))}. \qquad (3.11)$$
Let the random walk on the dynamic graph make a move, and set Y t := X t .
ii. Otherwise, declare the coupling as failed.
(c) If the coupling of the conditional distributions of I t and J t is not successful, namely if I t ≠ J t , then declare the coupling of the two random walks as failed.
4. If the coupling has failed let X t and Y t evolve independently.
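Rule 3 relies on maximal couplings of two conditional distributions. As a reminder of what such a coupling looks like for discrete laws, here is a minimal sketch; the function name `maximal_coupling` and the dictionary representation of probability mass functions are illustrative assumptions. It places as much mass as possible on the diagonal, so that the two coupled variables differ with probability equal to the total variation distance.

```python
from fractions import Fraction

def maximal_coupling(p, q):
    """Joint pmf {(a, b): mass} of a maximal coupling of the pmfs p and q.

    p and q are dicts mapping outcomes to probabilities. Returns the joint
    pmf together with the total variation distance between p and q.
    """
    support = sorted(set(p) | set(q))
    overlap = {a: min(p.get(a, 0), q.get(a, 0)) for a in support}
    joint = {(a, a): m for a, m in overlap.items() if m > 0}
    tv = 1 - sum(overlap.values())
    if tv > 0:
        # Residual masses have disjoint supports; couple them independently.
        rp = {a: p.get(a, 0) - overlap[a] for a in support}
        rq = {b: q.get(b, 0) - overlap[b] for b in support}
        for a in support:
            for b in support:
                m = rp[a] * rq[b] / tv
                if m > 0:
                    joint[(a, b)] = joint.get((a, b), 0) + m
    return joint, tv
```

Applied to two Bernoulli laws for I t and J t , the off-diagonal mass of the returned joint law is exactly the probability that the coupling in rule 3(c) fails.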
Remark 3.8 (Failure of the coupling after a high-degree half-edge is encountered). In Lemma 3.10 we will see that failure of the coupling as described in item 2 above is needed only when Condition 3.5(D3) comes into play. This will only happen for one of the three examples in Section 4, namely, near-to-global.

Failures in the coupling
Remark 3.9 (Possible failures). At each time t ∈ N, the random walk and the coupled modified random walk try to avoid stepping on the active half-edges A t−1 . The coupling of these two random walks fails in four cases described in Definition 3.7:
I. In step 3(b)ii: A. if the coupling of C t (X t−1 ) and Y t is not successful, B. if the two random walks step over a half-edge in A t−1 .
II. In step 3(c), if the coupling of I t and J t is not successful.
III. In step 1, if the pair of X t−1 in the starting configuration is already in A t−1 .
IV. In step 2, if the random walk encounters a half-edge X t−1 with a high degree.
Failure cases IB and III correspond to the situation in which the random walks do not have dynamically self-avoiding histories. Consequently, the random walks have dynamically self-avoiding histories before the coupling of the two random walks fails. Failure case IA corresponds to the situation in which the conditional distribution of C t (X t−1 ) is too far from the uniform distribution in total variation distance. Failure case II corresponds to the situation in which the conditional distribution of the times at which the random walk on the dynamically rewired graph steps over a previously rewired edge and the conditional distribution of the times at which the modified random walk makes uniform jumps are far from each other in total variation distance. Finally, failure case IV corresponds to the situation in which, during the graph exploration, the random walk encounters a half-edge with an anomalously high degree.
The next lemma states that these failure events are unlikely up to logarithmic times when Conditions 3.1 and 3.5 hold for the random walk on the dynamically rewired random graph: (3.12) that describes a good dynamically self-avoiding history with respect to T s (see Definition 3.3).
Consider the modified random walk for which the jump distribution has conditional distribution .

(3.13)
Then, whp in x and ξ, 14) and, with σ as defined in (3.10), ) in the right-hand side of (3.13) is not compatible with the initial state (x, ξ) (i.e., when the conditioning is on an event of probability zero), then we set the right-hand side of (3.13) equal to zero. The proof below uses an annealing argument in which the "mismatched" events play no role.
Proof of Lemma 3.10. Let P couple x,ξ denote the law of the coupling of the two non-backtracking random walks described in Section 3.2 with X 0 = x and C 0 = ξ. Also, use F ∈ N to denote the time at which this coupling fails. Due to Condition 3.1(R3), these random walks are always well-defined. Since the two random walks agree up to the time F , that is until the coupling fails, we have So, in order to prove the claim it suffices to show that, whp in x and ξ, To achieve this, we use an annealing argument on the initial graph and the initial location. Recall that µ = U H × U Conf H , and let (3.18) We will show that by exploring the initial configuration using the paths of the random walk and its coupled modified random walk until the coupling fails at time F . 2. At time s ∈ N, first explore the half-edge to which X s−1 = Y s−1 is paired in the initial configuration ξ, then let the coupled random walks evolve in accordance with Definition 3.7, and update A s accordingly.
This exploration process covers the part of the graph seen by the random walks, along with the parts affected by the rewiring at the positions of the random walks, and stops as soon as the coupling of the two random walks fails. We will carry out the proof in a setting where Conditions 3.5(D1) and (D2) hold. At the end of the proof we will briefly comment on the changes required when Condition 3.5(D3) comes into play.
Suppose that the coupling of the two random walks has not failed before time s. Failure at time s can occur in the following three cases (see also Remark 3.9):
1. The coupling of I s and J s fails in step 3(c) of Definition 3.7.
2. The coupling of C s (X s−1 ) and Y s fails in step 3(b)ii of Definition 3.7.
3. The random walks jointly step over a half-edge that lies in A s−1 in either step 3(b)ii or step 1 of Definition 3.7.
For case 1, we note that, since the distribution of J t for the modified random walk is given by (3.13), Condition 3.5(D1) implies that the probability of coupling failure is o(1/ log n).
For case 2 we note that, by Remark 3.9, before the coupling of the two random walks fails, the random walk has a dynamically self-avoiding history. By Condition 3.5(D2), the total variation distance between the conditional distribution of C s (X s−1 ) and the uniform distribution U H is o(1/ log n). Since Y s is also distributed uniformly on H, the probability of the event in case 2 is o(1/ log n).
For case 3, we first need an upper bound on the size of A s−1 . Each time we explore the initial configuration, we add at most d max half-edges to the set of active half-edges. In case a rewiring occurs, then we add at most 2d max half-edges to the set of active half-edges. This gives us the following crude bound: (3.20) For a fail event in step 3(b)ii, we see that the probability that C s (X s−1 ) ∈ A s−1 is smaller than since the random walk has a dynamically self-avoiding history before the coupling of the two random walks fails (see Remark 3.9), so the total variation distance between the conditional distribution of C s (X s−1 ) and the uniform distribution U H is o(1/ log n), by Condition 3.5(D2). For a fail event in step 1, we see that the probability that C 0 (X s−1 ) ∈ A s−1 is smaller than Taking a union bound up to time t, and using that by assumption t = O(log n), d max = o(n/(log n) 2 ) (Condition 3.1(R2)) and |H| = Θ(n) (Condition 3.1(R1)), we get which in turn implies that, In case we rely on Condition 3.5(D3), a fourth possible failure of the coupling shows up, namely, if the random walk encounters a half-edge of degree larger than (log n) 2+ε . The probability of this failure is o(1/ log n) by Condition 3.5(D3). The estimates for the other possible failures carry over, because if the coupling did not fail at some time s due to a meeting with a high-degree half-edge, then the random walk path traced up to time s is good and we can apply the same arguments as above.

Link between dynamic and static
In this section we prove Theorem 1.4. Consider the modified random walk in the statement of Lemma 3.10 and sample uniform jump times up to time t. For any fixed T = {t 1 , . . . , t r } ⊂ [t], we see that the modified random walk conditionally on the event J(T ) := {J s = 0 for s ∈ [t] \ T, J s = 1 for s ∈ T } is a time-inhomogeneous Markov chain that makes random-walk steps at times s ∈ [t] \ T and jumps to half-edges chosen uniformly at random at times s ∈ T .
Conditionally on T ⊂ [t] being non-empty, it is obvious that at time t the random walk on a graph satisfying Condition 3.1 is well-mixed for any starting point x ∈ H, ξ ∈ Conf H and so we claim that and since J(T ), ∅ ≠ T ⊂ [t], by definition implies σ ≤ t, we also get On the other hand, since the modified random walk up to time t conditionally on the event {σ > t} is the same as the random walk on the static graph, for any x ∈ H and ξ ∈ Conf H , we have (3.28) Using the triangle inequality twice, we obtain and Inserting (3.27) and (3.28), we obtain (3.31) Now using Lemma 3.10, we see that, whp in x and ξ, which concludes the proof of Theorem 1.4.
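The mechanism behind the product form in (1.1) can be seen in a short computation for the modified random walk. The following is only a sketch under an idealising assumption that is ours: conditionally on $\{\sigma \le t\}$ the law of $Y_t$ is taken to be exactly $U_H$, and the error terms controlled via Lemma 3.10 are ignored. The shorthand $p$ and $\mu_t$ is introduced for illustration only.

```latex
\begin{align*}
p &:= \mathbb{P}^{\mathrm{mod}}_{x,\xi}(\sigma > t), \qquad
\mu_t := \mathbb{P}^{\mathrm{mod}}_{x,\xi}(Y_t \in \cdot \mid \sigma > t),\\[2pt]
\mathbb{P}^{\mathrm{mod}}_{x,\xi}(Y_t \in \cdot)
  &= p\,\mu_t + (1-p)\,U_H
  && \text{(law of total probability)},\\[2pt]
\big\| \mathbb{P}^{\mathrm{mod}}_{x,\xi}(Y_t \in \cdot) - U_H \big\|_{\mathrm{TV}}
  &= \big\| p\,(\mu_t - U_H) \big\|_{\mathrm{TV}}
   = p\,\big\| \mu_t - U_H \big\|_{\mathrm{TV}}
   = p\, D^{\mathrm{stat}}_{x,\xi}(t)
  && \text{(static walk on } \{\sigma > t\}\text{)}.
\end{align*}
```

Replacing the modified random walk by the random walk on the dynamic graph, and σ by τ, is where the coupling of Lemma 3.10 enters and produces the $o_{\mathbb{P}}(1)$ error.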

Examples of admissible dynamics
In Section 4.1 we introduce three choices of rewiring. In Sections 4.2-4.4 we identify, for each of these choices, the scaling of the probability that the random walk does not step along a previously rewired edge, which settles Theorem 1.5.
In Appendix A we show that each of the three choices of rewiring leads to an irreducible and aperiodic joint Markov chain for the random walk and the random graph.

Three choices of rewiring
We explore rewirings that fit into a larger scheme of random graph dynamics, namely, where the decision which edges to rewire depends on their distance to the current position of the random walk.
Definition 4.1 (Sets of edges to be rewired). Recall that the configuration ξ is a pairing of all the half-edges (which induces a set of edges) and H is the set of all half-edges. By abuse of notation, in Section 1.1 we introduced the expression {a, b} ∈ ξ, a, b ∈ H, to mean that the half-edges a, b form an edge in the configuration ξ. For any ξ ∈ Conf H , h ∈ H and r n ∈ N, define the following sets of edges: (4.1) In words, Local ξ (h) is the edge to which the half-edge h belongs and Near ξ,rn (h) are the edges that can be reached in r n steps by the non-backtracking random walk when the graph is in configuration ξ (and is not evolving).
Figure (local and near sets). The red edge forms the local set, which is also the near set with r n = 1. The red and green edges form the near set with r n = 2. The red, green and orange edges form the near set with r n = 3.
With the above notation we can define the dynamics:
Definition 4.2 (Random walk with (K t )-to-(L t ) rewiring). Recall that X t−1 is the position of the random walk before the transition at time t and C t−1 is the configuration of the random graph before an update at time t. Let (K t ) t∈N , (L t ) t∈N be sequences of sets of edges, which can be different at each time t. Define the random walk with (K t )-to-(L t ) rewiring as the following process:
3. (a) If |R t | ≥ |L t \ R t |, then break up all the edges in R t ∪ L t into half-edges and re-pair them at random. More formally, pick ½ |R t ∪ L t | different half-edges (the half-edges forming R t ∪ L t ) and order them randomly. Also order randomly the half-edges not chosen in the previous step. The new pairing is generated by pairing the successive elements from the first and the second ordered sets described above.
(b) Otherwise, for every e ∈ R t , choose e' ∈ L t \ R t uniformly at random without replacement. Denote the set of all edges e' chosen in the previous step by R' t . Break up R t into half-edges and order them randomly. Do the same with R' t . Just as in (a), the new pairing is given by the successive elements of the first and the second ordered set.
The new pairing of half-edges obtained in either (a) or (b) above is the new graph configuration C t .
4. The random walk moves from X t−1 to X t on the evolved graph C t .
Remark 4.3 (Sets of edges generated from a configuration). When in the sequel we write L t ≡ ξ ∈ Conf H , we mean that the set of edges L t is generated by the configuration ξ, which is a pairing of the entire set of half-edges H.

Local-to-global rewiring
In this section we focus on a rewiring mechanism that is called local-to-global. Using the language of Definition 4.2, this is a rewiring with K t = Local C t−1 (X t−1 ) (see Definition 4.1) and L t ≡ C t−1 (see Remark 4.3). Observe that the set K t depends explicitly on the position of the random walk X t−1 before the transition at time t occurs. For ξ, η ∈ Conf H and x ∈ H, define Then the transition matrix for the random graph from configuration ξ to configuration η when the random walk is at position x equals where I(ξ, η) = 1 if η = ξ, and I(ξ, η) = 0 otherwise, i.e., I is the identity matrix. The first term of (4.2) captures the situation when rewiring does not happen and the graph remains the same. On the other hand, the off-diagonal symmetric matrix Q R x (ξ, η) in the second term represents the possible evolution of the graph by local-to-global rewiring. Note that the only possible transitions between graph states are those where the two configurations ξ and η differ in exactly two pairs of half-edges. The condition ξ(η(x)) = η(ξ(x)) in (4.2) says that rewiring always happens at the position of the random walk. The value 1/(|H| − 2) comes from the fact that at time t the rewiring mechanism can choose to pair the half-edge X t−1 with any half-edge chosen uniformly at random from H \ {X t−1 , C t−1 (X t−1 )}, which is a set of size |H| − 2.
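The count |H| − 2 can be made concrete by enumerating the outcomes of a single local-to-global rewiring. The sketch below is an illustration under stated assumptions rather than the paper's formal construction: a configuration is stored as a list `pairing` with `pairing[h]` the partner of half-edge `h`, the function names are hypothetical, and the edge at the walker is assumed to be rewired with probability `alpha`.

```python
import random

def local_to_global_outcomes(pairing, x):
    """All configurations obtained by rewiring the edge {x, pairing[x]} with another edge."""
    a, b = x, pairing[x]
    outcomes = []
    for c in range(len(pairing)):
        if c in (a, b):
            continue
        d = pairing[c]
        new = list(pairing)
        # x is re-paired with c; the two freed half-edges b and d are paired together.
        new[a], new[c], new[b], new[d] = c, a, d, b
        outcomes.append(new)
    return outcomes

def local_to_global_step(pairing, x, alpha, rng):
    """With probability alpha rewire the edge at x, otherwise keep the configuration."""
    if rng.random() < alpha:
        return rng.choice(local_to_global_outcomes(pairing, x))
    return list(pairing)
```

Each outcome differs from the original configuration in exactly two pairs, and the new partner of `x` ranges over all half-edges other than `x` and its old partner, which gives the |H| − 2 equally likely outcomes.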
Since Q R x is symmetric for all x ∈ H, we see that the measure U Conf H , defined by
$$U_{\mathrm{Conf}_H}(\xi) := \frac{1}{|\mathrm{Conf}_H|}, \qquad \xi \in \mathrm{Conf}_H,$$
is the stationary distribution for Q R x for any x ∈ H. This implies that U Conf H is also the stationary distribution for Q x for all x ∈ H.
Remark 4.4 (Symmetry of transition matrix for graph dynamics). Local-to-global rewiring is one of the examples where the transition matrix is symmetric. Symmetry does not hold in general, even within the restricted class of "something-to-global" rewirings. Still, for such rewirings the transition matrices are always doubly stochastic. For more details see Appendix C.
Using this fact, we have the following result for the joint Markov chain:

Proposition 4.5 (Stationary distribution).
For any α n ∈ [0, 1], U H × U Conf H is the stationary distribution for the random walk with local-to-global rewiring with parameter α n .
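Proof. Recall from Section 2.2 that P η is the transition matrix for the non-backtracking random walk on the graph η. Since U H is stationary for P η for any η ∈ Conf H , and U Conf H is stationary for Q x for any x ∈ H, it follows that for any y ∈ H and η ∈ Conf H ,
$$\sum_{x \in H} \sum_{\xi \in \mathrm{Conf}_H} U_H(x)\, U_{\mathrm{Conf}_H}(\xi)\, Q_x(\xi,\eta)\, P_\eta(x,y) = \sum_{x \in H} U_H(x)\, P_\eta(x,y)\, U_{\mathrm{Conf}_H}(\eta) = U_H(y)\, U_{\mathrm{Conf}_H}(\eta). \qquad (4.5)$$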
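The stationarity statement can be checked exactly on a toy example by enumerating all configurations. The sketch below is a sanity check under stated assumptions, not part of the proof: the graph is updated first and the walk then moves on the updated graph, the non-backtracking step leaves via a uniform sibling of the half-edge just reached, and all helper names (`all_pairings`, `freeze`, `nbrw_row`, `rewiring_row`, `stationarity_defect`) are hypothetical. The function sums, for every target state (y, η), the one-step transition probabilities from all states (x, ξ); the uniform distribution is stationary when every such sum equals 1.

```python
from fractions import Fraction

def all_pairings(items):
    """Yield every perfect matching of `items` as a dict h -> partner."""
    if not items:
        yield {}
        return
    a, rest = items[0], items[1:]
    for i, b in enumerate(rest):
        for m in all_pairings(rest[:i] + rest[i + 1:]):
            yield {a: b, b: a, **m}

def freeze(conf):
    """Hashable key for a configuration."""
    return tuple(sorted(conf.items()))

def nbrw_row(x, eta, vertex_of):
    """P_eta(x, .): cross the edge at x, then leave via a uniform sibling of eta[x]."""
    z = eta[x]
    sibs = [y for y in vertex_of if vertex_of[y] == vertex_of[z] and y != z]
    return {y: Fraction(1, len(sibs)) for y in sibs}

def rewiring_row(x, xi, alpha):
    """Q_x(xi, .) for local-to-global rewiring with rewiring probability alpha."""
    row = {freeze(xi): 1 - alpha}
    others = [c for c in xi if c not in (x, xi[x])]
    for c in others:
        eta = dict(xi)
        b, d = xi[x], xi[c]
        eta[x], eta[c], eta[b], eta[d] = c, x, d, b
        row[freeze(eta)] = alpha / len(others)
    return row

def stationarity_defect(vertex_of, alpha):
    """Largest deviation from 1 of the total inflow into a joint state, and the state count."""
    half_edges = sorted(vertex_of)
    confs = {freeze(m): m for m in all_pairings(half_edges)}
    inflow = {}
    for x in half_edges:
        for xi in confs.values():
            for eta_key, q in rewiring_row(x, xi, alpha).items():
                for y, p in nbrw_row(x, confs[eta_key], vertex_of).items():
                    key = (y, eta_key)
                    inflow[key] = inflow.get(key, 0) + q * p
    return max(abs(total - 1) for total in inflow.values()), len(inflow)
```

With two vertices of degree 3 there are 6 half-edges and 15 configurations, so the joint chain has 90 states, small enough for exact rational arithmetic.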
It is not obvious that the joint Markov chain is irreducible and aperiodic. In Appendix A we show that this is nonetheless the case when α n ∈ (0, 1), and so the distribution of the joint Markov chain at time t converges to U H × U Conf H as t → ∞. An important implication is that the distribution of the random walk alone at time t converges to U H as t → ∞. Indeed, for any x ∈ H, ξ ∈ Conf H and t ∈ N, we have and since the right-hand side tends to 0 as t → ∞, D dyn x,ξ (t) also tends to 0 as t → ∞. On the other hand, this argument does not automatically imply that D dyn x,ξ (t) is non-increasing in t. We are now ready to prove the scaling results stated in Theorem 1.5(A) and Corollary 1.7: that describe a dynamically self-avoiding history with respect to T . Conditionally on the event DSA(T, ), x t−1 cannot have been rewired before time t. Indeed, by construction the half-edges that are rewired before time t are x t 1 −1 , . . . , x tr−1 , x t 1 −1 , . . . ,x tr−1 ,x 1 , . . . ,x r andx 1 , . . . ,x r , while x t−1 is not equal to any of these. So we have ) is the uniform distribution on H \ {x t−1 , C t−1 (x t−1 )}, because after rewiring the half-edge x t−1 cannot end up being paired with itself or the half-edge it was paired with before. This gives Since this holds for any choice of On the other hand, the event {τ = t} is the same as the event {min{s ∈ N : R s = 1} = t}, since when a rewiring occurs the random walk steps over a rewired edge with probability 1. This implies that, for any x and ξ, where SA(t) is the event that the random walk is self-avoiding until time t. The first equality comes from the requirement that none of the edges the random walk steps over until time t gets rewired, the second equality uses that lim n→∞ α n = 0. Since

Near-to-global rewiring
In this section we focus on near-to-global rewiring. In view of Definition 4.2, this is a rewiring with K t = Near C t−1 ,rn (X t−1 ) (recall Definition 4.1) and L t ≡ C t−1 (see Remark 4.3) at any time t. Just like in the previous example, this is also a rewiring mechanism where the sets K t are dependent on the current position of the random walk.
The layout is the same as in the previous section, the main difference being the presence of the additional parameter r n that controls the size of the set of edges that are being considered for rewiring at each unit of time. We will see that this parameter controls the trichotomy. We only consider r n = O(log n), since the expected diameter of the configuration model is of order log n (see (1.12) and [25,26]). For r n = o(log n) the behaviour is dominated by the local properties of the graph dynamics and is similar to that for the local-to-global rewiring studied in Section 4.2. On the other hand, once r n = Θ(log n) we get a significant contribution from a certain "boundary term" in the computation of the tail probability P(τ > t | SA(t)), and we find a behaviour that is more similar to the global-to-global rewiring studied in Section 4.4.
First, we claim that the random walk is again irreducible and aperiodic: Proposition 4.6 (Irreducibility and aperiodicity). Non-backtracking random walk with near-to-global rewiring is aperiodic and irreducible.
Proof. In Appendix A we show that the joint Markov chain with local-to-global rewiring is irreducible and aperiodic. Since near-to-global rewiring admits all the transitions that are admitted for local-to-global rewiring, the proof carries over.
Next, we claim that the stationary distribution is again uniform:

Proposition 4.7 (Stationary distribution).
For any α n ∈ [0, 1] and r n = O(log n), U H × U Conf H is the stationary distribution for the random walk with near-to-global rewiring with parameters α n , r n .
Proof. Apply Proposition C.2 to establish that U Conf H is stationary for the chosen graph dynamics. After that the rest of the proof carries over from Proposition 4.5.
We are now ready to prove Theorem 1.5(B) and Corollary 1.8. First we settle Condition 3.5(D2) for good histories. After that we identify the asymptotics of P(τ > t | SA(t)) and settle Condition 3.5(D1) for good histories. Both are tricky because they force us to investigate the possible occurrence of short-cuts in the configuration. The key ingredient in the proof is that short-cuts are unlikely when t = O(log n) and r n ≤ (1 − ε)ρ max log n for some ε > 0, which requires the error term in Condition 3.5(D1). We finally settle Condition 3.5(D3). At the end we put the pieces together and wrap up the proof.
Proof of Condition 3.5(D2). Because the rewiring is done with the global set, we have (4.11) and, just as in (4.8), Thus, Condition 3.5(D2) is satisfied.
Identification of P(τ > t | SA(t)). On the event SA(t), for 1 ≤ k < l ≤ t, let S rn kl be the indicator of the event that there is a short-cut of length ≤ r n between the half-edges visited by the random walk at times k and l, i.e., a connection not running along the path of the random walk itself. Abbreviate SH rn (t) = (S rn kl ) 1≤k<l≤t . Then, for any x, ξ, This equality comes from the requirement that from time 1 until time (t − r n ) + none of the r n half-edges on the future path must be rewired, while from time (t − r n ) + + 1 until time t none of the r n half-edges on the future path until time t must be rewired. Rewrite (4.13) as with  The first factor in (4.14) equals 16) and produces the scaling in Theorem 1.5(B) (recall (4.10)). We therefore need to show that the second factor in (4.14) is negligible. For this it suffices to show the following: Lemma 4.8 (Bound on number of short-cuts). Subject to Condition 3.6, χ rn (t) = 0 whp uniformly in t = O(log n) and r n ≤ (1 − ε)ρ max log n for some ε > 0.
Proof. Recall that SA(t) is the event that the random walk is self-avoiding until time t. Consider the ball B t (x) of radius t around the starting point x of the random walk. Recall that, conditionally on SA(t), (1.2) implies that the probability for the random walk to choose a t-step self-avoiding path consisting of half- . (4.17) Condition on h. Note that SA(t) is equivalent to the event that all half-edges in h are distinct, which we assume from now on. It is helpful to distinguish between disjoint short-cuts and non-disjoint short-cuts. A disjoint short-cut between two half-edges h i and h j is a short-cut that does not use any of the other half-edges in h. Not all short-cuts are disjoint. Indeed, a disjoint short-cut gives rise to other short-cuts that are counted in 1≤k≤t S rn kt , which we call non-disjoint. For example, for r n ≥ 2, if there is a disjoint short-cut of one edge between h i and h i+4 , then there necessarily is a short-cut between h i and h i+5 also. The point is that χ rn (t) = 0 precisely when there are no disjoint short-cuts. We must also bring the graph dynamics into the picture.
We call a disjoint short-cut a disjoint (s, i, j, k)-short-cut when the r n -neighbourhood of the random walk at time s creates a disjoint short-cut consisting of k edges between h i and h j . This is only possible when s ≤ i ≤ s + r n and k ≤ r n , since otherwise h i would not be in the r n -neighbourhood of the random walk at time s, and when j > i + r n , since otherwise the path of k-edges would not be a short-cut.
We aim to show that, for r n ≤ (1−ε)ρ max log n and ε > 0, the probability that there exists a disjoint (s, i, j, k)-short-cut vanishes as n → ∞. To do so, we rely on the first-moment method. We make crucial use of the fact that the configuration model is the stationary distribution under our graph dynamics. This implies that, conditionally on h, all other half-edges at time s are paired uniformly at random, so that we can use configuration model estimates. Given h, the expected number of disjoint (s, i, j, k)-short-cuts is bounded by (see [25,Proposition 7.4]) where ν n is the size-biased mean of the empirical degree distribution p n (recall (3.7)), n is the sum of the degrees (= number of half-edges), and the error term o(1) is uniform in k ≤ C log n. The quantities in (4.19) introduce corrections that come from the fact that, conditionally on h, only a subset of size n of the half-edges is randomly paired at time s. Due to Condition 3.6, the sum over 1 ≤ k ≤ r n ≤ (1 − ε)ρ max log n of this expression is bounded by (max 1≤i≤t deg H (h i )) 2 n −ε/2 for n large enough. Thus, for r n ≤ (1 − ε)ρ max log n, by a union bound over 1 ≤ i, j ≤ t, the probability that there exists a disjoint short-cut before time t is bounded by Since t = O(log n), we can use an annealing argument to show that, subject to Condition 3.6, max 1≤i≤t deg H (h i ) ≤ t 2 whp. Indeed, let $\tilde h_i$ denote the half-edge to which h i is paired, so that $\deg_H(h_{i+1}) = \deg_H(\tilde h_i)$. Then the distribution of $\deg_H(\tilde h_i)$ is the size-biased degree distribution minus 1. By Condition 3.6, the mean of this size-biased distribution is uniformly bounded, so that by the Markov inequality the probability that deg H (h i+1 ) ≥ B is at most C/B for any B > 0 and some C < ∞. Hence, by a union bound over 1 ≤ i ≤ t, the probability that max 1≤i≤t deg H (h i ) > t 2 is at most Ct/t 2 = o(1).
Since r n , t = O(log n), we conclude that the probability that χ rn (t) > 0 is whp at most
$$r_n\, t^2\, (t^2)^2\, n^{-\varepsilon/2} = o(1), \qquad (4.21)$$
as required.
We can now complete the identification of P(τ > t | SA(t)). By Lemma 4.8, P x,ξ (τ > t | SA(t), SH rn (t)) is asymptotically equal to the expression in (4.16) whp, uniformly in r n ≤ (1 − ε)ρ max log n and t = O(log n). Taking the expectation w.r.t. SH rn (t), we get that the same is true for P x,ξ (τ > t | SA(t)). Taking the expectation w.r.t. x, ξ as well, we conclude that the same is true for P(τ > t | SA(t)), as required.
Proof of Condition 3.5(D1). For all paths that describe a dynamically self-avoiding history with respect to T ⊂ [t − 1], the probability that at time t the random walk steps along a rewired edge is (4.22) with (recall t r from (3.1)) β n,t = 1 − (1 − α n ) (t−tr)∧rn (4.23) being the probability that the t th edge is rewired when it is in the range of the random walk path, and ε n,t (T, ) ≥ 0 is the contribution due to short-cuts. Note that β n,t is independent of (T, , so that to verify Condition 3.5(D1), we only need to bound ε n,t (T, where ε n,t (T, ) is the probability that the t th edge is rewired due to a short-cut that puts it in the r n -neighbourhood of the location of the random walk at some time k < t−r n , but is not rewired due to a rewiring on the path of the random walk. The crux of the argument is to show that the event DSA(T, ) affects a negligible amount of half-edges. After that we are in a situation where we can once again apply configuration model estimates, as in (4.18).
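The role of β n,t in (4.23) can be made tangible with a few lines of code. This is a minimal sketch under assumptions that are ours: the walk is self-avoiding, no rewired edge has been crossed yet (so that t r is taken to be 0 throughout), and the short-cut contribution is dropped. The function names are hypothetical. Multiplying the factors 1 − β n,s over s ≤ t gives a power of 1 − α n whose exponent counts, for each step, how many times the edge about to be crossed was exposed to rewiring.

```python
def rewiring_probability(alpha, t, t_r, r_n):
    """beta_{n,t} = 1 - (1 - alpha)^{min(t - t_r, r_n)}, following (4.23)."""
    return 1 - (1 - alpha) ** min(t - t_r, r_n)

def exposure_exponent(t, r_n):
    """Sum over s = 1..t of min(s, r_n): total number of rewiring opportunities."""
    return sum(min(s, r_n) for s in range(1, t + 1))

def survival_without_shortcuts(alpha, t, r_n):
    """Product over s <= t of (1 - beta_{n,s}) with t_r = 0 (assumption)."""
    prob = 1.0
    for s in range(1, t + 1):
        prob *= 1 - rewiring_probability(alpha, s, 0, r_n)
    return prob
```

With r n = 1 the exponent is t, and with r n ≥ t it is t(t + 1)/2, which interpolates between one rewiring opportunity per crossed edge and s opportunities for the edge crossed at time s.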
) implies certain restrictions on the pairing of half-edges for every s ∈ [0, t − 1]. These restrictions can be of two kinds: they can pair two half-edges with certainty or with a probability that depends on the fine details of the rewiring dynamics. In the near-to-global case these probabilities are generally close to 1. Denote by H s the (partially) random set of half-edges that are paired by the event DSA(T, ) at times. The following observation is crucial: Proof. Since the graph is initially drawn according to the configuration model, and the configuration model is the stationary distribution of the graph dynamics, we see that on the set H \ H s the pairing is uniformly at random. Because the paired half-edges in H s are fixed, they do not affect the half-edges in H \ H s . Let us clarify the possible restrictions implied by 1. Edges already traversed by the random walk can get stuck in the configuration seen by the random walk. More formally, edges {X q , C q (X q )} with q ≤ s need not be a part of the near-set Near C p−1 ,rn (X p−1 ) for any time p ≥ s. This concerns half-edges in 2. Edges that are traversed at time q with q > s and q / ∈ T must not get rewired before the random walk crosses them. This concerns half-edges in x [0,t−1] andx [0,t−1] .
3. Edges that are traversed at time q with q > s and q ∈ T can (but need not) get rewired before the random walk crosses them. If they get rewired just before the random walk crosses them and near-sets at times < q do not containx q , then {x q ,x q } and {x q ,x q } must remain paired until time q. This concerns half-edges in Observe that only the edges that consist of half-edges in can be fixed. If we take the union of all these half-edges H , we get a crude upper estimate on H s that is valid for all s ∈ [0, t − 1].
Next we estimate the number of half-edges that are influenced by the restrictions implied by DSA(T, to see thatx [r] andx [r] both contain at most t − 1 half-edges. Summing the four contributions, we see that indeed |H | = O(t).
We are now ready to apply configuration-model estimates: Lemma 4.11 (Bound on number of short-cuts). Subject to Condition 3.6, conditionally on DSA(T, Proof. Observe that χ rn * (t) > 0 implies the existence of an (s, i, j, k)-short-cut at some time s ∈ [0, t − 1]. In Lemma 4.8 we proved a result about the rarity of these short-cuts where we assumed only Condition 3.6. The statement of the current lemma furthermore assumes that the event DSA(T, In Lemma 4.9 we have shown that at any time s ∈ [0, t − 1] the conditioning on DSA(T, ) only affects the pairing of some half-edges in H s . In Lemma 4.10 we gave an estimate of |H s | for any s ∈ [0, t − 1]. These two results bring us into the same setting as we had in the proof of Lemma 4.8, namely, we see that configuration model estimates hold (recall (4.18)). Therefore, by the same argument as above, given that r n , t = O(log n), we once again claim that the probability of χ rn * (t) > 0 is at most Since Condition 3.5(D1) concerns sequences x [0,t−1] that are good, we have and so r n t 2 ((log n) 2+ε ) 2 n −ε/2 = o(1), (4.28) as required. Note that χ rn * (t) ≤ χ rn (t) (compare (4.15) and (4.25)).
Now we see that the contribution of the ε n,t (T, ) term in (4.22) is O(n −ε/2 ) and therefore Condition 3.5(D1) holds.
Completion of the proof of Theorem 1.5(B) and Corollary 1.8. We already verified Condition 3.5, and have shown that P(τ > t | SA(t)) is asymptotically equal to the expression in (4.16). Furthermore, by (4.10), SA(t) occurs whp, uniformly in t = O(log n). This completes the proof of Theorem 1.5(B). Finally, given Condition 3.1(R1), we can again use Corollary B.4, which combined with (B.3) yields Corollary 1.8.

Global-to-global rewiring
In this section we focus on global-to-global rewiring. This choice was already explored in [2], [3], with the minor difference that in the present paper the parameter α n is the probability that an edge gets rewired per unit of time, while in [2], [3] it was the fraction of edges that get rewired per unit of time. This difference has no impact on the scaling of the mixing times. Global-to-global rewiring corresponds to the choice K t = L t ≡ C t−1 (see Remark 4.3) for all t in Definition 4.2. Unlike for the previous examples, now the rewiring is independent of the position of the random walk, so the graph dynamics becomes Markovian.
As before, the use of Corollary B.4 depends on Condition 3.1(R1). The proof of Theorem 1.5(C) uses that for all x and ξ, (4.30) The first equality comes from the requirement that up to time t each of the half-edges on the future path of the random walk must not get rewired. We thus obtain the scaling in Theorem 1.5(C) (again recall (4.10)). Given Condition 3.1(R1), we can again use Corollary B.4, which combined with (B.3) yields Corollary 1.9. Irreducibility and aperiodicity of the rewiring were settled in [2]. The fact that the stationary distribution is the configuration model is settled by Proposition C.2, in combination with an argument analogous to Proposition 4.5. It remains to establish Condition 3.5. Global-to-global rewiring satisfies the graph-dynamics regularity conditions formulated in Condition 3.5.
Proof. Since any edge can get rewired at any time, we have where we use that the edge crossed at time t has had exactly t opportunities to get rewired. Since this holds for any choice of , Condition 3.5(D1) follows with zero error. Moreover, since a half-edge can get rewired to any half-edge except itself and its current pair, we know that

A Irreducibility and aperiodicity
In this section we show that the random walk with local-to-global rewiring is irreducible and aperiodic. This ensures that the total variation distance D x,ξ (t) converges to 0 as t → ∞ for fixed x ∈ H, ξ ∈ Conf H and α n ∈ (0, 1). Our proof builds on the proof of irreducibility of the switch chain on multigraphs given in [19].
The random walk with local-to-global rewiring (X t , C t ) t∈N (see Section 4.2) is irreducible and aperiodic for any initial state (x, ξ) ∈ H × Conf H and any choice of α n ∈ (0, 1).
Proof. Let V = {v 1 , . . . , v n } and assume that deg to v 2 , and so on. Let v 1 , . . . , v 2k ∈ V be the odd-degree vertices. We fix a configuration ξ 0 ∈ Conf H such that each vertex has the maximum number of self-loops, i.e., each vertex v ∈ V with even degree has ½ deg(v) self-loops, each vertex v ∈ V with odd degree has ½ (deg(v) − 1) self-loops, and there is exactly one edge between every pair of odd-degree vertices v 2i−1 , v 2i for i = 1, . . . , k (see Figure 7). We will show that the pair (1, ξ 0 ) ∈ H × Conf H is accessible from any pair (x, ξ) ∈ H × Conf H by allowed moves for the random walk with local rewiring.
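The reference configuration ξ 0 is easy to build explicitly. The following sketch constructs a pairing with the maximal number of self-loops from a degree sequence; the function name and the labelling of half-edges (consecutive integers, vertex by vertex) are illustrative assumptions, and the odd-degree vertices are matched in consecutive pairs in index order rather than after a relabelling of the vertices.

```python
def max_self_loop_configuration(degrees):
    """Pairing with as many self-loops as possible.

    Returns (vertex_of, pairing), where vertex_of[h] is the vertex of
    half-edge h and pairing[h] is its partner. The sum of the degrees is
    assumed to be even, so the number of odd-degree vertices is even.
    """
    vertex_of, pairing, leftover = [], {}, []
    for v, d in enumerate(degrees):
        hs = list(range(len(vertex_of), len(vertex_of) + d))
        vertex_of.extend([v] * d)
        # Pair the half-edges of v among themselves: floor(d / 2) self-loops.
        for a, b in zip(hs[0:d - d % 2:2], hs[1:d:2]):
            pairing[a], pairing[b] = b, a
        if d % 2:
            leftover.append(hs[-1])
    # One edge between the 1st and 2nd, 3rd and 4th, ... odd-degree vertices.
    for a, b in zip(leftover[0::2], leftover[1::2]):
        pairing[a], pairing[b] = b, a
    return vertex_of, pairing
```

For the degree sequence (3, 4, 3, 2) this yields five self-loops and a single edge joining the two odd-degree vertices.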

2. Suppose that $x$ is not on a self-loop, i.e., it is on an edge between two odd-degree vertices. We first move to $(x', \xi_0)$ without rewiring, where $x' \in H$ is on a self-loop. After that we apply the procedure in item 1 to $(x', \xi_0)$.

Next, we show that for any $(x,\xi) \in H \times \mathrm{Conf}_H$ with $\xi \neq \xi_0$ we have access from $(x,\xi)$ to $(y,\xi_0)$ for some $y \in H$. To do this, we show that we can move from $(x,\xi)$ to some $(y,\eta) \in H \times \mathrm{Conf}_H$ such that the configuration $\eta$ has more edges in common with $\xi_0$ than $\xi$ has, i.e., $|\xi \cap \xi_0| < |\eta \cap \xi_0|$, by considering two scenarios.

To show that we can access an arbitrary state $(x,\xi)$ from $(1,\xi_0)$, we first note that we can access $(y,\xi_0)$, for any $y$, from $(1,\xi_0)$ by relabelling the half-edges and using the first argument above. Then we see that we can access $(x,\xi)$ from $(y,\xi_0)$ for any $y$ by using the above strategy of reducing the edges and using the cycles to move around. Hence, the Markov chain is irreducible. Since, by traversing the self-loop without rewiring, we can reach $(1,\xi_0)$ from itself in one step, we see that the Markov chain is also aperiodic.
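The reference configuration $\xi_0$ used in this proof can be written down explicitly. The following Python sketch is an illustration only: the function name `max_self_loop_pairing`, the 0-based labels and the vertex-by-vertex labelling of the half-edges are assumptions made here and are not notation from the paper. Odd-degree vertices are joined two by two in order of appearance.

```python
def max_self_loop_pairing(degrees):
    """Build a pairing with the maximal number of self-loops at every vertex.

    degrees[j] is the degree of vertex j. Half-edges are labelled
    0, 1, ..., sum(degrees) - 1, vertex by vertex (an assumption of this sketch).
    Returns (owner, pairing): owner[h] is the vertex of half-edge h,
    pairing[h] is the half-edge that h is paired with.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the total degree must be even")
    owner = []
    for v, d in enumerate(degrees):
        owner.extend([v] * d)
    pairing = {}
    leftovers = []  # one unpaired half-edge per odd-degree vertex
    start = 0
    for d in degrees:
        halves = list(range(start, start + d))
        start += d
        # Pair consecutive half-edges of the same vertex into self-loops.
        for a, b in zip(halves[0:d - d % 2:2], halves[1:d:2]):
            pairing[a], pairing[b] = b, a
        if d % 2 == 1:
            leftovers.append(halves[-1])
    # Join the odd-degree vertices two by two with a single edge each.
    for a, b in zip(leftovers[0::2], leftovers[1::2]):
        pairing[a], pairing[b] = b, a
    return owner, pairing


owner, pairing = max_self_loop_pairing([3, 3, 4, 1, 1])
self_loops = sum(1 for h in pairing if owner[h] == owner[pairing[h]]) // 2
print(self_loops)
```

For the degree sequence in the usage line, the two vertices of degree 3 each carry one self-loop, the vertex of degree 4 carries two, and the four odd-degree vertices are joined in two pairs.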

B Cut-off without dynamics
In order to use the results of [6] we need to assume the conditions stated there, which are collected in Condition B.1. Conditions B.1(R1**) and (R2**) are technical and proof-generated. It might be possible to relax them via a truncation argument [9]. Condition B.1(R3**) ensures that the random walk does not behave deterministically and that the configuration model is connected whp. Note that (R1**) and (R3**) are considerably more stringent than (R2) and (R3) in Condition 3.1.

As shown in [6], Theorem B.2 holds under Condition B.1.

Remark B.3. We are aware of the fact that Condition B.1 is used in [6] to prove a much stronger statement than Theorem B.2, one that is related to an exact computation of the cut-off window. Therefore, if we are interested only in proving Theorem B.2, weaker conditions might suffice.
Combining Theorem 1.4 and Theorem B.2, we obtain Corollary B.4.

C Transition matrix for $(K_t)$-to-$(L_t)$ rewiring
Recall Definition 4.2, where we have introduced the general class of rewirings considered in this paper. In this appendix we provide a general expression for the transition matrix of the graph dynamics. Furthermore, we explore the conditions required for this transition matrix to be doubly stochastic.

The matrix element $Q^{K_t \to L_t}_X(\eta,\xi)$, which represents the rewiring of the edges in the set $X$ that realises the transition from graph state $\eta$ to graph state $\xi$, is the reciprocal of the number of accessible states:

$$Q^{K_t \to L_t}_X(\eta,\xi) = \begin{cases} \dfrac{1}{2^{|X|} \prod_{i=0}^{|X|-1} \big(|L_t| - |L_t \cap K_t| - i\big)} & \text{if $\xi$ is accessible from $\eta$ by rewiring all edges in $X$,} \\[2ex] 0 & \text{otherwise.} \end{cases} \qquad \text{(C.2)}$$
Observe that the matrix given by (C.1) is a sum of multiple terms. Let us explain the meaning of these terms through the example of the general term

$$(1-\alpha_n)^{|K_t|-k}\, \alpha_n^{k} \sum_{\{e_1,\ldots,e_k\} \subseteq K_t} Q^{K_t \to L_t}_{\{e_1,\ldots,e_k\}}.$$

First, the factor $(1-\alpha_n)^{|K_t|-k}$ represents the probability of $|K_t| - k$ edges not getting rewired, and its counterpart $\alpha_n^{k}$ represents the probability of $k$ edges getting rewired. The sum runs over all $k$-tuples from $K_t$, and the matrix $Q^{K_t \to L_t}_{\{e_1,\ldots,e_k\}}$ represents the possible rewiring of the $k$-tuple of edges we are summing over.
The Markov chain transition matrix must be stochastic. Let us check this by an explicit computation. Take an arbitrary graph state $\eta$. In the row that lists the probabilities of all the possible transitions from $\eta$, we get the following contributions:

$$\sum_{k=0}^{|K_t|} (1-\alpha_n)^{|K_t|-k}\, \alpha_n^{k} \binom{|K_t|}{k} = \big[(1-\alpha_n) + \alpha_n\big]^{|K_t|} = 1. \qquad \text{(C.4)}$$
The combinatorial factor $\binom{|K_t|}{k}$ counts the different ways of choosing $k$-tuples from $K_t$. Since the entries in $Q^{K_t \to L_t}_X(\eta,\xi)$ are chosen to be the reciprocal of the number of accessible states, it is not surprising that they sum up to 1. The factor $2^k$ comes from the ability to break up an edge into two ordered sets of half-edges.
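The binomial identity behind (C.4) is easy to confirm numerically. The snippet below is a small sanity check, not part of the proof; the function name `row_sum` and the sample values of $\alpha_n$ and $|K_t|$ are chosen here purely for illustration.

```python
from math import comb, isclose


def row_sum(alpha, size_K):
    """Sum over k = 0..|K_t| of (1 - alpha)^(|K_t| - k) * alpha^k * C(|K_t|, k)."""
    return sum(
        (1 - alpha) ** (size_K - k) * alpha ** k * comb(size_K, k)
        for k in range(size_K + 1)
    )


# Illustrative values only: three rewiring probabilities, three sizes of K_t.
for alpha in (0.1, 0.5, 0.9):
    for size_K in (1, 5, 20):
        assert isclose(row_sum(alpha, size_K), 1.0)
```

The check covers only the scalar weights; the entries of $Q^{K_t \to L_t}_X$ contribute a factor 1 per $k$-tuple because they are reciprocals of the number of accessible states.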
Observe that the matrix defined by (C.1) has a "binomial" structure, but that it is not of the form

$$Q^{K_t \to L_t}_X(\eta,\xi) = \prod_{e \in X} \big[(1-\alpha_n) I + \alpha_n Q^{K_t \to L_t}_{\{e\}}\big]. \qquad \text{(C.5)}$$

Clearly, (C.5) would be correct if we drew $e \in R_t$ in Definition 4.2 with replacement, in which case the state space for the rewiring of $|X|$ edges would have size $2^{|X|} \prod_{i=1}^{|X|} \big(|L_t| - |L_t \cap K_t|\big)$. In the current setting, where we draw $e$ without replacement, the state space for the rewiring of $|X|$ edges is smaller, namely, of size $2^{|X|} \prod_{i=0}^{|X|-1} \big(|L_t| - |L_t \cap K_t| - i\big)$, due to the removal of already drawn edges.
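To see how quickly the two state-space sizes separate, one can evaluate both products for small $|X|$. The sketch below does this; the function names and the numbers $|L_t| = 10$, $|L_t \cap K_t| = 2$ are hypothetical and serve only as an illustration.

```python
from math import prod


def size_with_replacement(x, size_L, size_LK):
    """2^x * prod_{i=1}^{x} (|L_t| - |L_t ∩ K_t|): every draw sees the same pool."""
    return 2 ** x * prod(size_L - size_LK for _ in range(x))


def size_without_replacement(x, size_L, size_LK):
    """2^x * prod_{i=0}^{x-1} (|L_t| - |L_t ∩ K_t| - i): the pool shrinks by one per draw."""
    return 2 ** x * prod(size_L - size_LK - i for i in range(x))


# Hypothetical sizes: |L_t| = 10 and |L_t ∩ K_t| = 2.
for x in range(4):
    print(x, size_with_replacement(x, 10, 2), size_without_replacement(x, 10, 2))
```

The two counts agree for $|X| \le 1$ and differ from $|X| = 2$ onwards, which is why the product form (C.5) fails as soon as more than one edge is rewired.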
While we have seen that the transition matrix is stochastic, it is doubly stochastic only subject to additional conditions. For the purpose of this paper we need the following fact:

Proposition C.2 (Double stochasticity of $(K_t)$-to-global rewiring transition matrix). The transition matrix given in (C.1) is doubly stochastic for $L_t \equiv \xi$, in the sense that edges in $L_t$ are generated by the pairing $\xi$ of the whole set of half-edges $H$ (recall Remark 4.3).
Proof. The proof is by explicit computation. Choose an arbitrary graph state $\xi$ and count the contributions to the sum over the column corresponding to transitions leading to $\xi$; this gives (C.6). The term $(|H| - 1 - (2|K_t| - 1) - 2i)$ in (C.6) is based on the following observation. We are counting possible pairs of half-edges where we see a difference in $\xi$ compared to $\eta$. This way we get the whole set of half-edges, of size $|H|$, without the considered half-edge itself and without all but one half-edge in $K_t$. Rewiring cannot create an edge between half-edges that gave rise to $K_t$, and the term $-1$ arises from the one half-edge from $K_t$ the considered half-edge is paired with in $\xi$. The term $-2i$ again arises because we are drawing without replacement. Now observe that $2|L_t| = |H|$ and $L_t \cap K_t = K_t$. Apply the binomial theorem to get the claim.
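The last step of the proof rests on the substitution $2|L_t| = |H|$, $L_t \cap K_t = K_t$, under which each factor $|H| - 1 - (2|K_t| - 1) - 2i$ equals $2(|L_t| - |L_t \cap K_t| - i)$. The sketch below checks only this algebraic identity for a few sizes; it does not reproduce (C.6) itself, and the function names and test sizes are assumptions made here for illustration.

```python
from math import prod


def incoming_pair_count(k, size_H, size_K):
    """prod_{i=0}^{k-1} (|H| - 1 - (2|K_t| - 1) - 2i), the term quoted in the proof."""
    return prod(size_H - 1 - (2 * size_K - 1) - 2 * i for i in range(k))


def state_space_size(k, size_L, size_LK):
    """2^k * prod_{i=0}^{k-1} (|L_t| - |L_t ∩ K_t| - i), the without-replacement count."""
    return 2 ** k * prod(size_L - size_LK - i for i in range(k))


def identity_holds(size_H, size_K, k):
    """Compare the two products under 2|L_t| = |H| and L_t ∩ K_t = K_t."""
    size_L = size_H // 2
    return incoming_pair_count(k, size_H, size_K) == state_space_size(k, size_L, size_K)


# Hypothetical even values of |H|, all admissible |K_t| and all k <= |K_t|.
checked = all(
    identity_holds(h, m, k)
    for h in (8, 20, 50)
    for m in range(h // 2 + 1)
    for k in range(m + 1)
)
print(checked)
```

Once the two products cancel, what remains is the same binomial sum as in the row computation, which is where the binomial theorem enters.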