Mixing and relaxation time for Random Walk on Wreath Product Graphs

Suppose that G and H are finite, connected graphs, G regular, X is a lazy random walk on G and Z is a reversible ergodic Markov chain on H. The generalized lamplighter chain X* associated with X and Z is the random walk on the wreath product H\wr G, the graph whose vertices consist of pairs (f,x) where f=(f_v)_{v\in V(G)} is a labeling of the vertices of G by elements of H and x is a vertex in G. In each step, X* moves from a configuration (f,x) by updating x to y using the transition rule of X and then independently updating both f_x and f_y according to the transition probabilities on H; f_z for z different from x,y remains unchanged. We estimate the mixing time of X* in terms of the parameters of H and G. Further, we show that the relaxation time of X* is of the same order as the maximal expected hitting time of G plus |G| times the relaxation time of the chain on H.

1. Introduction. Suppose that G and H are finite connected graphs with vertices V(G), V(H) and edges E(G), E(H), respectively. We refer to G as the base graph and to H as the lamp graph. Let X(G) = {f : V(G) → H} be the set of markings of V(G) by elements of H. The wreath product H ≀ G is the graph whose vertices are pairs (f, x), where f = (f_v)_{v∈V(G)} ∈ X(G) and x ∈ V(G). There is an edge between (f, x) and (g, y) if and only if (x, y) ∈ E(G), (f_x, g_x), (f_y, g_y) ∈ E(H) and f_z = g_z for all z ∉ {x, y}. Suppose that P and Q are transition matrices for Markov chains on G and on H, respectively. The generalized lamplighter walk X^⋄ (with respect to the transition matrices P and Q) is the Markov chain on H ≀ G which moves from a configuration (f, x) by
1. picking y adjacent to x in G according to P, then
2. updating each of the values of f_x and f_y independently according to Q on H.
The states of the lamps f_z at all other vertices z ∈ G remain fixed. It is easy to see that if P and Q are irreducible, aperiodic and reversible with stationary distributions π_G and π_H, respectively, then the unique stationary distribution of X^⋄ is the product measure
\[ \pi^\diamond(f, x) = \pi_G(x) \prod_{v \in V(G)} \pi_H(f_v), \]
and X^⋄ is itself reversible. In this article, we will be concerned with the special case that P is the transition matrix for the lazy random walk on G.
In particular, P is given by
\[ P(x, y) = \begin{cases} \frac{1}{2} & \text{if } x = y, \\ \frac{1}{2 d(x)} & \text{if } (x, y) \in E(G), \\ 0 & \text{otherwise,} \end{cases} \tag{1.1} \]
for x, y ∈ V(G), where d(x) is the degree of x. We further assume that the transition matrix Q on H is irreducible and aperiodic. This and the assumption (1.1) guarantee that we avoid issues of periodicity.
[Figure: the red bullets on each copy of H represent the state of the lamp over each vertex v ∈ G, and the walker is drawn as a red bullet marked W.]
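The dynamics described above can be made concrete on a very small example. The following Python sketch builds the transition matrix of the lamplighter walk for a triangle as base graph and a two-state lamp chain, and checks numerically that the product measure is stationary and that the chain is reversible. The function names (`lazy_walk`, `stationary`, `lamplighter`) and the example matrix `Q` are illustrative choices, not taken from the paper. The sketch also assumes that a lazy step (y = x) refreshes the lamp at x twice, i.e. by Q², which is one reading of the rule "update f_x and f_y" when x = y.

```python
import itertools
import numpy as np

def lazy_walk(adj):
    """Lazy random walk: stay with probability 1/2, else move to a uniform neighbour."""
    n = len(adj)
    P = np.zeros((n, n))
    for x, nbrs in enumerate(adj):
        P[x, x] = 0.5
        for y in nbrs:
            P[x, y] += 0.5 / len(nbrs)
    return P

def stationary(P):
    """Stationary distribution as the normalised left eigenvector for eigenvalue 1."""
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmax(np.real(w))])
    return pi / pi.sum()

def lamplighter(P, Q):
    """Transition matrix of the lamplighter walk; states are pairs (f, x)."""
    n, m = P.shape[0], Q.shape[0]
    Q2 = Q @ Q  # assumption: a lazy step refreshes the lamp at x twice
    states = [(f, x) for f in itertools.product(range(m), repeat=n) for x in range(n)]
    index = {s: i for i, s in enumerate(states)}
    K = np.zeros((len(states), len(states)))
    for (f, x), i in index.items():
        for y in range(n):
            if P[x, y] == 0.0:
                continue
            if y == x:
                for a in range(m):
                    g = f[:x] + (a,) + f[x + 1:]
                    K[i, index[(g, x)]] += P[x, x] * Q2[f[x], a]
            else:
                for a in range(m):
                    for b in range(m):
                        g = list(f)
                        g[x], g[y] = a, b
                        K[i, index[(tuple(g), y)]] += P[x, y] * Q[f[x], a] * Q[f[y], b]
    return states, K

adj = [[1, 2], [0, 2], [0, 1]]               # base graph G: a triangle
P = lazy_walk(adj)
Q = np.array([[0.7, 0.3], [0.6, 0.4]])       # hypothetical reversible lamp chain on H
states, K = lamplighter(P, Q)
pi_G, pi_H = stationary(P), stationary(Q)
pi_star = np.array([pi_G[x] * np.prod([pi_H[a] for a in f]) for f, x in states])
flow = pi_star[:, None] * K                  # flow[i, j] = pi(i) K(i, j)
print(np.allclose(pi_star @ K, pi_star), np.allclose(flow, flow.T))
```

The state space has |H|^{|G|} · |G| elements, so this brute-force construction is only feasible for toy sizes; it is meant as a sanity check of the definitions, not as a computational method.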
1.1. Main Results. In order to state our general result, we first need to review some basic terminology from the theory of Markov chains. Let P be the transition kernel for a lazy random walk on a finite, connected graph G with stationary distribution π.
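The ε-mixing time of P on G in total variation distance is given by
\[ t_{\mathrm{mix}}(G, \varepsilon) := \min\Big\{ t \ge 0 : \max_{x \in V(G)} \tfrac{1}{2} \sum_{y \in V(G)} \big| P^t(x, y) - \pi(y) \big| \le \varepsilon \Big\}. \tag{1.2} \]
Throughout, we set t_mix(G) := t_mix(G, 1/4). The relaxation time of a reversible Markov chain with transition matrix P is
\[ t_{\mathrm{rel}}(G) := \frac{1}{1 - \lambda_2}, \tag{1.3} \]
where λ_2 is the second largest eigenvalue of P.
The maximal hitting time of P is
\[ t_{\mathrm{hit}}(G) := \max_{x, y \in V(G)} E_x[\tau_y], \tag{1.4} \]
where τ_y denotes the first time t that X(t) = y and E_x stands for the expectation under the law in which X(0) = x. The random cover time τ_cov is the first time when all vertices have been visited by the walker X, and the cover time t_cov(G) is
\[ t_{\mathrm{cov}}(G) := \max_{x \in V(G)} E_x[\tau_{\mathrm{cov}}]. \]
The next needed concept is that of strong stationary times.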
Definition 1.1. A randomized stopping time τ is called a strong stationary time for the Markov chain X t on G if P x [X τ = y, τ = t] = π(y)P x [τ = t], that is, the position of the walk when it stops at τ is independent of the value of τ .
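The parameters t_mix, t_rel and t_hit of the base graph can be computed exactly for small examples. The following Python sketch does this for the lazy random walk on the cycle Z_8, assuming the standard definitions (worst-case total variation distance at level 1/4, inverse spectral gap, and maximal expected hitting time). The helper names are illustrative and not part of the paper.

```python
import numpy as np

def lazy_cycle(n):
    """Lazy simple random walk on the cycle Z_n (n >= 3)."""
    P = np.zeros((n, n))
    for x in range(n):
        P[x, x] += 0.5
        P[x, (x + 1) % n] += 0.25
        P[x, (x - 1) % n] += 0.25
    return P

def relaxation_time(P):
    """1 / (1 - lambda_2); eigvalsh is valid here because P is symmetric."""
    lam = np.sort(np.linalg.eigvalsh(P))
    return 1.0 / (1.0 - lam[-2])

def max_hitting_time(P):
    """max over x, y of E_x[tau_y], solving (I - P restricted off y) h = 1 for each y."""
    n = P.shape[0]
    best = 0.0
    for y in range(n):
        keep = [x for x in range(n) if x != y]
        A = np.eye(n - 1) - P[np.ix_(keep, keep)]
        h = np.linalg.solve(A, np.ones(n - 1))   # h[x] = E_x[tau_y]
        best = max(best, h.max())
    return best

def mixing_time(P, pi, eps=0.25):
    """Smallest t with max_x ||P^t(x, .) - pi||_TV <= eps."""
    Pt = np.eye(P.shape[0])
    t = 0
    while 0.5 * np.abs(Pt - pi).sum(axis=1).max() > eps:
        Pt = Pt @ P
        t += 1
    return t

n = 8
P = lazy_cycle(n)
pi = np.full(n, 1.0 / n)
t_rel = relaxation_time(P)
t_hit = max_hitting_time(P)
t_mix = mixing_time(P, pi)
print(t_rel, t_hit, t_mix)
```

For the lazy walk on Z_n the expected hitting time between vertices at distance k is 2k(n − k), so the maximal hitting time for n = 8 is 32, which the linear solve reproduces.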
The adjective randomized means that the stopping time can depend on some extra randomness, not just on the trajectory of the Markov chain; for a precise definition see [13, Section 6.2.2]. Our main results are summarized in the following theorems:
Theorem 1.3. Let us assume that G and H are connected graphs with G regular and the Markov chain on H is ergodic and reversible. Then there exist universal constants c_1, C_1 such that the relaxation time of the generalized lamplighter walk on H ≀ G satisfies
\[ c_1 \le \frac{t_{\mathrm{rel}}(H \wr G)}{t_{\mathrm{hit}}(G) + |G|\, t_{\mathrm{rel}}(H)} \le C_1. \tag{1.6} \]
If further the Markov chain on H admits a strong stationary time τ_H with halting states (assumption (A)), then the upper bound in (1.7) matches the lower bound. This holds for many natural chains such as the lazy random walk on the hypercube Z_2^d, on tori Z_n^d, and some walks on the permutation group S_n (the random transpositions or random adjacent transpositions shuffle, and the top-to-random shuffle, for instance).
[17], including the cycle Z_n, the hypercube Z_2^d and more generally tori Z_n^d, n, d ∈ N, and dihedral groups Z_2 ⋉ Z_n, n ∈ N, are also obtained by the construction of strong stationary times with halting states on direct and semidirect products of groups. Further, Pak constructs strong stationary times possessing halting states for the random walk on k-sets of n-sets, i.e. on the group S_n/(S_k × S_{n−k}), and on subsets of n × n matrices over the full linear group, i.e. on GL(n, F_q)/(GL(k, F_q) × GL(n − k, F_q)).

1.2. Previous Work. The mixing time of Z_2 ≀ G was first studied by Häggström and Jonasson in [11] in the case of G being the complete graph K_n and the one-dimensional cycle Z_n. Generalizing their results, Peres and Revelle [18, Theorem 1.2, 1.3] proved that there exist constants c_i, C_i depending on ε such that for any transitive graph G,
\[ c_1 t_{\mathrm{hit}}(G) \le t_{\mathrm{rel}}(\mathbb{Z}_2 \wr G) \le C_1 t_{\mathrm{hit}}(G), \qquad c_2 t_{\mathrm{cov}}(G) \le t_{\mathrm{mix}}(\mathbb{Z}_2 \wr G, \varepsilon) \le C_2 t_{\mathrm{cov}}(G). \]
The vertex transitivity condition was dropped in [13, Theorem 19.1, 19.2]. These bounds match Theorems 1.3 and 1.4, since H_n = Z_2 implies that the terms not containing H_n in the denominator of (1.6) and in the bounds in (1.7) dominate.
In [16], it is shown that t mix (Z 2 G n ) ∼ 1 2 t cov (G n ) whenever (G n ) is a sequence of graphs satisfying some uniform local transience assumptions, including G n = Z d n with d ≥ 3 fixed. Moving towards larger lamp spaces, if the base is the complete graph K n and |H n | = o(n) one can determine the order of mixing time from [13,Theorem 20.7], since in this case the lamplighter chain is a product chain on n i=1 H n . Levi [14] investigated random walks on wreath products when H = Z 2 . In particular, he determined the order of the mixing time of K n λ K n , 0 ≤ λ ≤ 1, and he also had upper and lower bounds for the case H d Z n , i.e. H is the d-dimensional hypercube and the base is a cycle of length n, however, the bounds failed to match for general d and n. Further, Fill and Schoolfield [10] investigated the total variation and l 2 mixing time of K n S n , where the base graph is the Cayley graph of the symmetric group S n with transpositions chosen as the generator set, and the stationary distribution on K n is not necessarily uniform.
In the case H_n = Z_2, the mixing time of the lamplighter chain is closely related to the cover time of the base graph, and thus it helps in understanding the geometric structure of the set of points visited last by the random walk [4, 5, 6, 16]. Further, larger lamp graphs give more information on the local time structure of the base graph G. This relates our work to the literature on blanket time (the first time when the local times of all vertices are within a constant factor of each other) [3, 8, 20].

1.3. Outline. The remainder of this article is structured as follows. In Section 3 we state a few necessary theorems and lemmas about the Dirichlet form, strong stationary times, different notions of distances and their relations. In Lemmas 3.3 and 3.5 we construct a crucial stopping time τ^⋄ and a strong stationary time τ_2^⋄ on H ≀ G, which we will use several times throughout the later proofs. Then we prove the main theorem about the relaxation time in Section 4, and the mixing time bounds in Section 5.
2. Notations. Throughout the paper, objects related to the base or the lamp graph will be indexed by G and H, respectively, and the superscript ⋄ always refers to an object related to the whole H ≀ G. Unless misleading, G and H also refer to the vertex sets of the graphs, i.e. v ∈ G means v ∈ V(G). P_µ and E_µ denote probability and expectation under the law in which the initial distribution of the Markov chain under investigation is µ. Similarly, P_x is the law under which the chain starts at x.

3. Preliminaries.
In this section we collect the preliminary lemmas needed to carry through the proofs quickly afterwards. The reader familiar with the notions of strong stationary times, separation distance, and Dirichlet forms might want to jump forward to Lemmas 3.3 and 3.5 immediately, and check the other lemmas here only when needed.
The first lemma is a common useful tool to prove lower bounds for relaxation times, by giving the variational characterization of the spectral gap. First we start with a definition.
Let P be a reversible transition matrix with stationary distribution π on the state space Ω and let E_π[φ] := Σ_{y∈Ω} φ(y)π(y). The Dirichlet form associated to the pair (P, π) is defined for functions φ and η on Ω by
\[ \mathcal{E}(\varphi, \eta) := \langle (I - P)\varphi, \eta \rangle_\pi = \sum_{x \in \Omega} [(I - P)\varphi](x)\, \eta(x)\, \pi(x). \tag{3.1} \]
It is not hard to see [13, Lemma 13.11] that
\[ \mathcal{E}(\varphi) := \mathcal{E}(\varphi, \varphi) = \frac{1}{2} \sum_{x, y \in \Omega} [\varphi(x) - \varphi(y)]^2 \pi(x) P(x, y) = \frac{1}{2} E_\pi\big[ (\varphi(X_1) - \varphi(X_0))^2 \big]. \]
The next lemma relates the spectral gap of the chain to the Dirichlet form (for a short proof see [2] or [13, Lemma 13.12]):
Lemma 3.1 (Variational characterization of the spectral gap). The spectral gap γ = 1 − λ_2 of a reversible Markov chain satisfies
\[ \gamma = \min_{\varphi :\, \mathrm{Var}_\pi \varphi \neq 0} \frac{\mathcal{E}(\varphi)}{\mathrm{Var}_\pi \varphi}, \]
where
\[ \mathrm{Var}_\pi \varphi := E_\pi[\varphi^2] - (E_\pi[\varphi])^2. \]
A very useful tool for proving the upper bound on t_rel and both bounds for t_mix is the concept of strong stationary times. Recall Definition 1.1. It is not hard to see ([13, Lemma 6.9]) that the defining property is equivalent to
\[ P_x[X_t = y, \tau \le t] = \pi(y) P_x[\tau \le t]. \tag{3.3} \]
To be able to relate the tail of the strong stationary times to the mixing time of the graphs, we need another distance from the stationary measure, called the separation distance:
\[ s_x(t) := \max_{y \in \Omega} \Big[ 1 - \frac{P^t(x, y)}{\pi(y)} \Big]. \tag{3.4} \]
The relation between the separation distance and any strong stationary time τ is the following inequality from [2] or [7] or [13, Lemma 6.11]:
\[ s_x(t) \le P_x[\tau > t]. \tag{3.5} \]
Throughout the paper, we will need a slightly stronger result than (3.5). Namely, by [7, Remark 3.39] or from the proof of (3.5) in [13, Lemma 6.11] it follows that equality holds in (3.5) if τ has a halting state h(x) for x. We point out that, unfortunately, [13, Remark 6.12] is not true: the statement cannot be reversed. The state h(x, t) maximizing the separation distance at time t can also depend on t, and thus the existence of a halting state is not necessarily needed to get equality in (3.5).
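The variational characterization of the spectral gap can be checked numerically. The sketch below assumes the standard expression for the Dirichlet form of a reversible chain, E(φ) = ½ Σ_{x,y} [φ(x) − φ(y)]² π(x)P(x, y), as in [13, Lemma 13.11]; the weighted-graph chain and all variable names are hypothetical choices made for illustration. It verifies that the second eigenfunction attains the ratio E(φ)/Var_π(φ) = γ and that random test functions never go below γ.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
W = rng.random((n, n))
W = W + W.T                        # symmetric positive edge weights (loops allowed)
d = W.sum(axis=1)
P = W / d[:, None]                 # reversible chain: P(x, y) = W(x, y) / d(x)
pi = d / d.sum()                   # stationary distribution, proportional to d

def dirichlet(phi):
    """E(phi) = 1/2 * sum_{x,y} (phi(x) - phi(y))^2 pi(x) P(x, y)."""
    diff = phi[:, None] - phi[None, :]
    return 0.5 * np.sum(pi[:, None] * P * diff ** 2)

def variance(phi):
    """Var_pi(phi) = E_pi[phi^2] - (E_pi[phi])^2."""
    return np.sum(pi * phi ** 2) - np.sum(pi * phi) ** 2

# D^{-1/2} W D^{-1/2} is symmetric and has the same spectrum as P.
S = W / np.sqrt(np.outer(d, d))
lam, U = np.linalg.eigh(S)         # eigenvalues in ascending order, lam[-1] = 1
gap = 1.0 - lam[-2]
psi = U[:, -2] / np.sqrt(d)        # eigenfunction of P for lambda_2

ratio_psi = dirichlet(psi) / variance(psi)
ratios = [dirichlet(phi) / variance(phi) for phi in rng.standard_normal((200, n))]
print(gap, ratio_psi, min(ratios))
```

In the lower bound proofs, a well-chosen test function plays the role of `phi`: any explicit φ gives an upper bound E(φ)/Var_π(φ) on the gap, hence a lower bound on the relaxation time.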
On the other hand, one can always construct τ such that (3.5) holds with equality for every x ∈ Ω. This is a key ingredient of our proofs, so we cite it as a theorem (with notation adjusted to the present paper).
Theorem 3.2. Let (X_t, t ≥ 0) be an irreducible aperiodic Markov chain on a finite state space Ω with initial state x and stationary distribution π, and let s_x(t) be the separation distance defined as in (3.4). Then
1. if τ is a strong stationary time for X_t, then s_x(t) ≤ P_x(τ > t) for all t ≥ 0;
2. conversely, there exists a strong stationary time τ such that s_x(t) = P_x(τ > t) for all t ≥ 0.
Combining these, we will call a strong stationary time τ separation optimal if it achieves equality in (3.5). Note that every strong stationary time possessing halting states is separation optimal, but the converse is not necessarily true. The next two lemmas, which we will use several times, construct two stopping times for the graph H ≀ G. The first one will be used to bound the separation distance from below, and the second one to bound it from above.
We start by introducing notation for the number of moves made on the lamp graph H_v, v ∈ G, up to time t:
\[ L_v(t) := 2 \sum_{i=0}^{t} \mathbf{1}\{X_i = v\} - \mathbf{1}\{X_0 = v\} - \mathbf{1}\{X_t = v\}. \tag{3.6} \]
Slightly abusing terminology, we call it the local time at vertex v ∈ G. Let us further denote the random walk with transition matrix Q on H by Z. Since the moves on the different lamp graphs H_v, v ∈ G, are taken independently given L_v(t), v ∈ G, we can define for each v ∈ G an independent copy of the chain Z, denoted by Z_v, running on H_v. Thus, the position of the lamplighter chain at time t can be described as
\[ (F_t, X_t) = \big( (Z_v(L_v(t)))_{v \in G},\, X_t \big). \]
Below we will use copies of a strong stationary time τ_H for each v ∈ G, meaning that τ_H(v) is defined in terms of Z_v, and that the copies are conditionally independent given the local times.
Lemma 3.3. Let τ_H be any strong stationary time for the Markov chain on H. Take the conditionally independent copies (τ_H(v))_{v∈G} given the local times L_v(t), realized on the lamp graphs H_v, and define the stopping time τ^⋄ for X^⋄ by
\[ \tau^\diamond := \inf\{ t : \tau_H(v) \le L_v(t) \text{ for all } v \in G \}. \tag{3.7} \]
Then, for any starting state (f_0, x_0) we have
\[ P_{(f_0, x_0)}\big[ X^\diamond_t = (f, x),\ \tau^\diamond = t \big] = \prod_{v \in G} \pi_H(f_v) \cdot P_{(f_0, x_0)}\big[ X_t = x,\ \tau^\diamond = t \big]. \tag{3.8} \]
If further τ_H has halting states, then the vectors ((h(f_v(0)))_{v∈G}, y) are halting state vectors for τ^⋄ and initial state (f_0, x_0) for every y ∈ G.
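The local times introduced before Lemma 3.3 are easy to read off from a trajectory of the walker. The following Python sketch (with hypothetical helper names) counts, for every step of the walk, one lamp move at the departure vertex and one at the arrival vertex, and then checks two consequences of this bookkeeping: the local times sum to 2t, and L_v(t) equals twice the number of visits to v, minus one if the walk starts at v and minus one if it ends at v.

```python
import random
from collections import Counter

def local_times(path):
    """Number of lamp moves at each vertex for a trajectory path = [X_0, ..., X_t]."""
    L = Counter()
    for a, b in zip(path, path[1:]):
        L[a] += 1   # the lamp at the departure vertex is refreshed
        L[b] += 1   # the lamp at the arrival vertex is refreshed
    return L

def lazy_cycle_path(n, t, rng):
    """Trajectory of the lazy walk on the cycle Z_n started at 0."""
    x, path = 0, [0]
    for _ in range(t):
        x = (x + rng.choice([0, 0, 1, -1])) % n   # stay w.p. 1/2, else +-1
        path.append(x)
    return path

rng = random.Random(1)
t = 1000
path = lazy_cycle_path(5, t, rng)
L = local_times(path)
visits = Counter(path)
print(sum(L.values()), dict(L))
```

On a regular base graph the walker spends asymptotically equal time at every vertex, so each L_v(t) is close to 2t/|G| for large t; the estimates for the relaxation time quantify how unlikely a much smaller local time is.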
We postpone the proof and continue with a corollary of the lemma: Corollary 3.4. Let τ H be a strong stationary time for the Markov chain on H which has a halting state h(z) for any z ∈ H. Then define τ as in Lemma 3.3. Then for the separation distance on the lamplighter chain H G the following lower bound holds: Proof. Observe that reaching the halting state vector (h(f v (0)), x) implies the event τ ≤ t so we have (3.9) This quotient is less than 1 since both the numerator and the denominator are probability distributions on G. Then, using this and Lemma 3.3, the right hand side of (3.9) equals Clearly the separation distance is larger than the left hand side of (3.9), and the proof of the claim follows. Note that the proof only works if τ H has a halting state and thus it is separation-optimal.
Proof of Lemma 3.3. First we show that (3.8) holds using the conditional independence of τ H (v)-s given the number of moves L v (t) on the lamp graphs H(v), v ∈ G. Clearly, conditioning on the trajectory of the walker {X 1 , . . . , X t−1 , X t = x} := X[1, t] contains the knowledge of L v (t)-s as well. We will omit to note the dependence of P on initial state (f 0 , x 0 ) for notational convenience. The left hand side of condition (3.3) equals Recall that Z v stands for the Markov chain on the lamp graph H v , and their conditional independence given L v (t)-s. Due to (3.3) and τ H being strong stationary for H we have for all v ∈ G that [1,t] ]. Now we use that τ H (v)-s are conditionally independent given the local times to see that [1,t] Note that the second product gives exactly P τ ≤ t|X [1,t] , yielding As X t = x remains fixed over the summation, thus summing over all possible X[1, t] trajectories yields To turn the inequality τ ≤ t inside the probability to equality can be done the same way as in (3.3) and left to the reader. To see that the vector of halting states (h(f v (0)), y) is a halting state for τ for any y ∈ G is based on the simple fact that reaching the halting state vector (h(f v ) v∈G , y) means that all the halting states h(f v ), v ∈ G have been reached on all the lamp graphs H v , v ∈ G-s. Thus, by definition of the halting states, all the strong stationary times τ H (v) have happened. Then, by its definition, τ has happened as well.
Recall the definition (3.7) of τ^⋄. Then we can construct a strong stationary time for H ≀ G, described in the next lemma.
Lemma 3.5. Let τ^⋄ be the stopping time defined as in Lemma 3.3, let τ_G(x) be a strong stationary time for G starting from x ∈ G, and define τ_2^⋄ by
\[ \tau_2^\diamond := \tau^\diamond + \tau_G(X_{\tau^\diamond}), \]
where the chain is restarted at time τ^⋄ from (F_{τ^⋄}, X_{τ^⋄}), run independently of the past, and τ_G is measured in this walk. Then τ_2^⋄ is a strong stationary time for H ≀ G.
Proof of Lemma 3.5. The intuitive idea of the proof is based on the fact that τ G is conditionally independent of τ H -s and thus the lamp graphs stay stationary after reaching τ , and stationarity on G is reached by adding the term τ G (X τ ). The proof is not very difficult but it needs a delicate sequence of conditioning. To have shorter formulas, we write shortly P for P (f 0 ,x 0 ) . First we condition on the events {τ = s, X s = (g, y)} and make use of (3.8) from Lemma 3.3.
Now for the conditional probability inside the sum on the right hand side we have where τ G (y) • θ s means the time-shift of τ G (y) by s, and we also used that τ G is only depending on y. We claim that The first equality holds true since τ G (y) is independent of the lampgraphs and the transition rules of X on H G tells us that the lamp-chains stay stationary. We omit the details of the proof. The second equality is just the strong stationarity property of τ G . Thus, using this and rearranging the order of terms on the right hand side of (3.12) we end up with Then, realizing that the sum is just P[τ + τ G (X τ ) = t] finishes the proof.
We continue with a lemma which relates the separation distance to the total variation distance. Let us first define
\[ d_x(t) := \frac{1}{2} \sum_{y \in \Omega} \big| P^t(x, y) - \pi(y) \big|. \]
The total variation distance of the chain from stationarity is defined as
\[ d(t) := \max_{x \in \Omega} d_x(t). \]
The next lemma relates the total variation and the separation distance:
Lemma 3.6. For any reversible Markov chain and any state x ∈ Ω, the separation distance from initial vertex x satisfies
\[ d_x(t) \le s_x(t), \tag{3.14} \]
\[ s_x(2t) \le 4\, d(t). \tag{3.15} \]
Proof. For a short proof of (3.14) see [2] or [13, Lemma 6.13], and combine [13, Lemma 19.3] with a triangle inequality to conclude (3.15).
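The two directions of Lemma 3.6 can be observed numerically. The sketch below takes the lemma in the form d_x(t) ≤ s_x(t) and s_x(2t) ≤ 4 d(t) (the second inequality being what one obtains from [13, Lemma 19.3] together with d̄(t) ≤ 2d(t)); this reading, the choice of the lazy walk on Z_7 and the helper names are assumptions made for the illustration.

```python
import numpy as np

def lazy_cycle(n):
    """Lazy simple random walk on the cycle Z_n (n >= 3)."""
    P = np.zeros((n, n))
    for x in range(n):
        P[x, x] += 0.5
        P[x, (x + 1) % n] += 0.25
        P[x, (x - 1) % n] += 0.25
    return P

n = 7
P = lazy_cycle(n)
pi = np.full(n, 1.0 / n)

def tv(Pt):
    """Vector of d_x(t) = ||P^t(x, .) - pi||_TV over starting states x."""
    return 0.5 * np.abs(Pt - pi).sum(axis=1)

def sep(Pt):
    """Vector of s_x(t) = max_y [1 - P^t(x, y) / pi(y)] over starting states x."""
    return (1.0 - Pt / pi).max(axis=1)

powers = [np.eye(n)]
for _ in range(40):
    powers.append(powers[-1] @ P)       # powers[t] = P^t

checks = []
for t in range(21):
    d_x, s_x, s_2t = tv(powers[t]), sep(powers[t]), sep(powers[2 * t])
    first = bool(np.all(d_x <= s_x + 1e-12))             # d_x(t) <= s_x(t)
    second = bool(np.all(s_2t <= 4 * d_x.max() + 1e-12))  # s_x(2t) <= 4 d(t)
    checks.append((first, second))
print(all(a and b for a, b in checks))
```

The first inequality is used to turn tail bounds on strong stationary times into total variation upper bounds, the second to turn separation lower bounds into total variation lower bounds at the cost of halving the time.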
We will also make use of the following lemma: ([13, Corollary 12.6]) Lemma 3.7. For a reversible, irreducible and aperiodic Markov chain, The two fundamental steps to prove Lemma 3.7 are the inequalities stating that for all x ∈ Ω we have with π min = min y∈Ω π(y). This inequality follows from [13, Equation (12.11), (12.13)]. We note that Lemma 3.6 implies that the assertion of Lemma 3.7 stays valid if we replace d(t) 1/t by the separation distance s(t) 1/t .

4. Relaxation time bounds.
4.1. Proof of the lower bound of Theorem 1.3. We prove c 1 = 1/(16 log 2) in the lower bound of the statement of Theorem 1.3. First note that it is enough to prove that t hit (G) and |G|t rel (H) are both lower bounds, hence their average is a lower bound as well. First we start showing the latter.
Let us denote the second largest eigenvalue of Q by λ H and the corresponding eigenfunction by ψ. It is clear that E π H (ψ) = 0 and we can normalize it such that Var π H (ψ) = E π H (ψ 2 ) = 1 holds. Let us define thus φ is actually not depending on the position of the walker, only on the configuration of the lamps. Let X t = (F t , X t ) be the lamplighter chain with stationary initial distribution π . In the sequel we will calculate the Dirichlet form (3.1) for φ at time t, first conditioning on the path X[0, t] of the walker: We remind the reader that in each step of the lamplighter walk, the state of the lamp graph H v is refreshed both at the departure and arrival site of the walker. Thus, knowing the trajectory of the walker implies that we also know L v (t), the number of steps made by the Markov chain Z v on H v . Moreover, the collection of random walks (Z v ) v∈G on the lamp graphs are independent given L v (t)-s. We can calculate the conditional expectation on the right hand side of (4.1) by using the argument above and the fact that E π H (ψ) = 0 as follows: Next, the product form of the stationary measure π ensures that we can move to π H inside the sum and condition on the starting state Z v (0): Since ψ was chosen to be the second eigenfunction for Q, clearly where in the last step we assumed λ H > 1/2, since in this case we have On the other hand, if λ H < 1/2, than t rel (H) < 2 and we will use the other lower bound t hit (G) which is at least of order |G|. Dividing by Var π φ = |G|, and using the variational characterization of the spectral gap in Lemma 3.1, we get that the spectral gap γ t * at time Since γ t is by definition the spectral gap of the chain at time t, we have so we get a lower bound t rel (H G) ≥ 1 5 log 2 |G|t rel (H). To get the lower bound t hit (G)/4 we adjust the proof for 0 − 1 lamps (H = Z 2 ) [13, Theorem 19.1] to our setting. First pick a vertex w ∈ G which maximizes the expected hitting time E π G (τ w ). 
As before, we will use the second eigenfunction ψ with eigenvalue λ H with E π H (ψ) = 0, E π H (ψ 2 ) = 1 and define φ (f , x) := ψ(f w ).
Easy to see with the same conditioning argument we used in (4.2) and (4.3) that the Dirichlet form at time t equals Now we will show that E π λ Lw(t) H ≥ 1/4. To see this we first note that for any t we have for the hitting time To see the first line: either the walk hits w before time t, or the expected additional time it takes to arrive at w is bounded by t hit regardless of where it is at time t. The second line follows by averaging over π G .
Next, [13,Lemma 10.2] states that t hit ≤ 2 max v E π [τ v ] holds for every irreducible Markov chain. We exactly picked w such that it maximizes E π G (τ v ), so we have t hit ≤ 2E π G [τ w ], so multiplying the previous displayed inequality by 2 gives Now substituting t = t hit /4 and rearranging terms results in Since {L w (t hit /4) = 0} = {τ w > t hit /4}, we can use this inequality to obtain the upper bound Analogous to the last lines of the proof of the lower bound above, (see (4.4)) we obtain the other desired lower bound: Putting together the two bounds we get

4.2. Proof of the upper bound of Theorem 1.3. To prove the upper bound, we will estimate the tail behavior of the strong stationary time τ_2^⋄ in Lemma 3.5, relate it to s^⋄(t), the separation distance on H ≀ G, and then use Lemmas 3.7 and 3.6 to see that s^⋄(t)^{1/t} → λ^⋄. We will use separation-optimal τ_H and τ_G in the construction of τ_2^⋄. Their existence is guaranteed by Theorem 3.2. We will write P for P_{(f,x)} for notational convenience. Combining (3.5) and the fact that τ^⋄ happens when all the stopping times τ_H(v), v ∈ G, have happened on the lamp graphs, by a union bound we have for any choice of 0 < α < 1
\[ s^\diamond_{(f,x)}(t) \le P[\tau_2^\diamond > t] \le P[\tau_{\mathrm{cov}} > \alpha t/3] \tag{4.5} \]
\[ + P\Big[ \exists w \in G : L_w(\alpha t) < \frac{\alpha t}{2|G|} \,\Big|\, \tau_{\mathrm{cov}} \le \alpha t/3 \Big] \tag{4.6} \]
\[ + P\Big[ \exists w \in G : \tau_H(w) > L_w(\alpha t) \,\Big|\, \forall v \in G : L_v(\alpha t) \ge \frac{\alpha t}{2|G|} \Big] \tag{4.7} \]
\[ + P\big[ \tau_G(X_{\tau^\diamond}) > (1 - \alpha) t \big]. \tag{4.8} \]
Namely, there are four possibilities. The first option is that there is a state w ∈ G which has not been hit yet, i.e. the cover time of the chain is greater than αt/3, giving the term (4.5). The constant 1/3 could have been chosen differently; we picked αt/3 so that the remaining 2αt/3 time is still enough to gain large enough local time on the vertices v ∈ G. Secondly, even though every state w of the graph G is reached before time αt/3, the remaining time was not enough to have at least αt/(2|G|) moves on some lamp graph H(w), giving the term (4.6). The third option is that even though there have been many moves on all the lamp graphs, there is a vertex w ∈ G where τ_H(w) has not happened yet, yielding the term (4.7). The fourth term (4.8) handles the case where the strong stationary time τ_G is too large. We will handle the terms separately. (For convenience, we will write t instead of αt in estimating the first three terms.) We can estimate the first term (4.5) by a union bound over the vertices of G; the resulting bound (4.9) is expressed in terms of t_hit, the maximal hitting time of the graph G, see (1.4). To see this, use Markov's inequality on the hitting time of w ∈ G to obtain that for all starting states v ∈ G we have P_v[τ_w > 2t_hit] ≤ 1/2, and then run the chain in blocks of length 2t_hit.
In each block we hit w with probability at least 1/2, so we have P_v[τ_w > K(2t_hit)] ≤ 2^{−K}. To get the bound for general t, we can move from ⌊t/t_hit⌋ to t/t_hit by adding an extra factor of 2, and (4.9) immediately follows by a union bound.
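The block argument for the hitting time tail can be verified exactly on a small base graph. The following Python sketch (hypothetical helper names, lazy walk on the cycle Z_6 as an example) computes t_hit from a linear system, then evaluates max_v P_v[τ_w > K · 2t_hit] through powers of the transition matrix of the walk killed at w, and compares it with the geometric bound 2^{−K}.

```python
import numpy as np

def lazy_cycle(n):
    """Lazy simple random walk on the cycle Z_n (n >= 3)."""
    P = np.zeros((n, n))
    for x in range(n):
        P[x, x] += 0.5
        P[x, (x + 1) % n] += 0.25
        P[x, (x - 1) % n] += 0.25
    return P

def killed_walk(P, w):
    """Sub-stochastic matrix of the walk killed at w, and E_x[tau_w] for x != w."""
    keep = [x for x in range(P.shape[0]) if x != w]
    Pw = P[np.ix_(keep, keep)]
    h = np.linalg.solve(np.eye(len(keep)) - Pw, np.ones(len(keep)))
    return Pw, h

n = 6
P = lazy_cycle(n)
t_hit = max(killed_walk(P, w)[1].max() for w in range(n))
block = int(np.ceil(2 * t_hit))            # block length 2 * t_hit, rounded up

Pw, _ = killed_walk(P, 0)
tails = {}
for K in range(1, 6):
    # Entry v of Pw^s @ 1 is P_v[tau_w > s]: the walk has avoided w for s steps.
    survive = np.linalg.matrix_power(Pw, K * block) @ np.ones(n - 1)
    tails[K] = survive.max()
    print(K, tails[K], 2.0 ** -K)
```

The bound 2^{−K} is typically far from tight on such a small graph; what matters in the proof is only that the tail decays geometrically at a rate governed by t_hit, uniformly in the starting state.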
For the third term (4.7) we claim the following upper bound holds: (4.10) To see this we estimate the probability of the event {τ H (w) ≥ L w (t) L w (t) ≥ t 2|G| } on a single lamp graph and then use a union bound to lose a factor |G| and arrive at the right hand side. First note that according to Lemma 3.7, the tail of the strong stationary time τ H is driven by λ t H . More precisely, using the inequality (3.16) we have that for any initial state h ∈ H: Since we have made at least L w (t) ≥ t 2|G| steps on each coordinate, the claim (4.10) follows. The fourth term (4.8) can be handled analogously and yields an error probability exp{−ct/t rel (G)} which then, taking the power of 1/t and limit as in Lemma 3.7, will lead to a term of order t rel (G). Then, taking into account that t rel (G) ≤ ct mix (G) ≤ Ct hit (G) holds for any lazy reversible chain (see e.g. [13, Chapter 11.6,12.4]), we can ignore this term.
The intuition behind the estimates below for the second term (4.6) is that since the total time was at least 2t/3 after hitting, regularity of G implies that the average number of moves on a lamp graph equals 4t/(3|G|) by the double refreshment at any visit to the vertex. Thus, the probability of having less than t/(2|G|) moves must be small.
More precisely, we introduce the excursion lengths to a vertex w ∈ G. Let us define for all w ∈ G the first return time to state w as
\[ R(w) := \min\{ t \ge 1 : X_t = w \} \quad \text{for the walk started at } X_0 = w. \]
The strong Markov property implies that the excursion lengths R_i(w), i ≥ 1, where R_i(w) is the time spent between the (i − 1)th and ith visit to w, are i.i.d. random variables distributed as the first return time R(w).
Thus, having not enough local time on some site w ∈ G can be expressed in terms of the excursion lengths R i (w)-s as follows: since conditioning on hitting before t/3 ensures that we had at least 2t/3 steps to gain the t/4|G| visits to w, and by the definition (3.6) of L v (t), this guarantees that L w (t) < t/2|G|. We aim to estimate the right hand side of (4.11) using the moment generating function of the first return time R(w). To be able to carry out the estimates we need a bound on the tail behavior of the return times. A very similar argument can be used to the one we used for the tail of the cover time (4.9), namely the following holds: Running the chains in blocks of 2t hit + 1, one can see that in each block the chain has a chance at least 1/2 to return to w, so we have for each t > 2t hit + 1 where the factor 3 comes from ignoring to take the integer part of t/t hit and neglecting the +1 term in the denominator. We can use this tail behavior to estimate the moment generating function where we cut the expectation at 2|G|. Using the bounds in (4.12) yields: Setting arbitrary β < log 2/(2t hit ) makes the second term integrable, and with the special choice of β = log 2 4t hit we obtain the following estimate: with an appropriately chosen 0 < δ < 1/3. Now we apply Markov's inequality to the function e β t/4|G| i=1 R i (w) to estimate the right hand side of (4.11): (4.14) where we also used the independence of the excursions R i (w)-s. Using the estimate in (4.13) to bound the right hand side we gain that (4.15) where we used β = log 2/(4t hit ), and modifiedδ := 3δ/2 ≤ 1/2. 
Using the relation of the local time to the excursion lengths in (4.11) we finally get that the second term (4.6) is bounded from above by Mind that all the estimates (4.9), (4.10) and (4.16) were independent of the initial state (f , x) ∈ H G, so using the second inequality in (3.16) and maximizing over all possible initial states yields us (4.17) In the final step we apply Lemma 3.7: we take the power 1/t and limit as t tends to infinity with fixed graph sizes |G| and |H| on the right hand side of (4.2) to get an upper bound on λ 2 . Then we use that ( for small x and obtain the bound on t rel finally: This finishes the proof of the upper bound on the relaxation time.

5. Mixing time bounds.
Based on the fact that H has a separationoptimal strong stationary time τ H , the idea of the proofs is to relate the separation distance to the tail behavior of the stopping times τ and τ 2 constructed in Lemmas 3.3 and 3.5, respectively. Then these estimates are turned into bounds of the total variation distance using the relations in Lemma 3.6. This method gives us the upper bound in (1.7) and the corresponding lower bound under the assumption (A). For the lower bound without the assumption, we will need slightly different methods. We continue with the definition of the blanket time: Let us further denote It is known from [8] that there exist universal constants C and C such that C t cov ≤ B 2 ≤ Ct cov . Thus, our first goal is to show that at time we have for any starting state (f , x) that We remind the reader that τ 2 = τ + τ G (X τ ) and thus the following union bound holds: where in the third term we mean that we restart the chain after time 8B 2 + |G|t u H , and measure τ G starting from there. The first term on the right hand side is less than 1/8 by Markov's inequality, the third is less than 1/16 by the definition of the worst case quantile. The second term can be handled by conditioning on the local time sequence of vertices and on the blanket time: (for shorter notation we introduce t 1 : The fact that B 2 ≤ 8B 2 means that the number of visits to every vertex v ∈ G must be greater than half of the average, which is at least 1 2 t u H . Since L w (t) is twice the number of visits by (3.6), {τ H (w) > L w (t 1 )} ⊆ {τ H (w) > t u H }. By the definition of the quantiles, holds for every h ∈ H and w ∈ G. Applying a simple union bound on the conditional probability on the right hand side of (5.6) yields where we used that the sum of the probabilities on the right hand side is at most 1. Combining these estimates with (5.5) yields (5.4). It remains to relate the worst-case quantiles to the total variation mixing times. 
Here we will make use of the separation-optimal property of τ H and τ G . Now just consider the walk on G. Let us start the walker on G from an initial state x 0 ∈ G for which the maximum is attained in the definition (5.1) of the quantile t quant 1/16 (τ G ). Then, by (3.15) we have that one step before the quantile we have This immediately implies that 1 By the submultiplicative property of the total variation distance d(kt) ≤ 2 k d(t) k we have that t mix (G, 1 64 ) ≤ 6t mix G, 1 4 . So we arrive at Similarly, starting all the lamps from the position h 0 where the maximum is attained in the definition of t u H = t quant 1/16|G| (τ H ), one step before the quantile we have On the other hand, on the whole lamplighter chain H G we need the other direction: For every starting state (f , x) (3.14) and (5.4) implies that Maximizing over all states (f , x) yields Putting the estimates in (5.7) and (5.8) to (5.9), we get that Since B 2 (G) ≤ Ct cov (G), and t mix (G) ≤ 2t hit (G) ≤ 2t cov (G) for any G (see for instance [13]), the assertion of Theorem 1.4 follows with C 2 = 8(C + 3), where C is the universal constant relating the blanket time B 2 to the cover time t cov in [8].
We remark why we did not make the constant C 2 explicit: If the blanket time B 2 were not used in our estimates, the error probability that some vertex w ∈ G does not have enough local time would need to be added. This, similarly to the term (4.6) behaves like |G|e −c(tcov+|G|t mix (H, 1 G ))/t hit . If we do not assume anything about the relation of t hit (G) and t cov (G) and on t mix (H, 1 G ), then this error term will not necessarily be small. For example, if G n is a cycle of length n, H n is a sequence of expander graphs, then t cov (G n ) = t hit (G n ) = Θ(n 2 ), and t mix (H, 1 G ) = log |H| · log |G| = log |H| log n, and we see that the term is not small if log |H| = o(n/ log n).

5.2. Proof of the lower bounds of Theorem 1.4. As we did with the relaxation time, it is enough to prove that all the bounds are lower bounds separately, and then take an average. We start by showing that the upper bound in (1.7) is sharp under the assumption that there is a strong stationary time τ_H with halting states.

Lower bound under Assumption (A).
We first aim to show that c|G| t_mix(H, 1/|G|) ≤ t_mix(H ≀ G). Consider the stopping time τ^⋄ constructed in Lemma 3.3. Corollary 3.4 tells us that the tail of τ^⋄ bounds the separation distance at time t from below. We again emphasize that this bound holds only if τ_H in the construction of τ^⋄ is not only separation optimal but also has a halting state. Our first goal is to bound the tail of τ^⋄ from below, and then to relate it to the total variation distance.
First set clearly this time is nontrivial if t quant |G| −1/2 /2 (τ H ) = 1. We handle the case if it equals 1 later. We can estimate the upper tail of τ by conditioning on the number of moves on the lamp graphs H v , v ∈ G: For each sequence (L v (t )) v∈G we define the random set Since v L v (t ) = 2t = 1 2 |G|t H , we have that for arbitrary local time configuration (L v (t )) v , Thus we can lower bound (5.11) by restricting the event only to those w ∈ G coordinates which belong to this set, i.e. whose local time is small: where in the second line we used that for Conditioned on the sequence (L v (t )) v , the times τ H (w) for w ∈ S (Lv)v are independent. On each lamp graph H(v) let us pick the starting state to be h 0 ∈ H where the maximum is attained in the definition of t quant |G| −1/2 /2 (τ H ). Since t H is one step before the quantile, we have (5.14) We need to start the lamp-chains from the worst-case scenario h 0 ∈ H for two reasons: First, we needed to define the quantile as in (5.1) to be able to relate it to the total variation mixing time on H, see below. Then, the fact that t quant ε was defined as the worst-case starting state quantile means that for other starting states the quantile may be smaller, and the lower bound can possibly fail.
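Combining (5.14) with (5.12) and the conditional independence, the event in (5.13) is stochastically dominated from below: conditionally on the local times, its probability is at least $P[V \ge 1]$, where $V$ is a binomial random variable with parameters $\big(|G|/2,\ |G|^{-1/2}/2\big)$. Clearly, for $|G| > 8 > 16(\log 2)^2$ we have $P[V \ge 1] \ge 1/2$. Combining this with (5.13) and summing over all possible sequences $(L_v(t'))_{v \in G}$ we easily get that $P[\tau > t'] \ge 1/2$ when every lamp is started from $h_0$, and Corollary 3.4 turns this into a lower bound on the separation distance at time $t'$. In the next few steps we relate the tails of $\tau$ and $\tau_H$ to the mixing times of the graphs. First, combining the previous inequality with (3.15) for the starting state $(h_0, x)$ yields a lower bound on the total variation distance and hence the estimate (5.15), which bounds $t_{\mathrm{mix}}(H \wr G)$ from below by a constant multiple of $|G|\, t_H$. On the other hand, using the definition of $t_H$ together with Lemma 3.6 and maximizing over all $h \in H$, we get that
(5.16) $$d_H(t_H + 1) \le |G|^{-1/2}/2.$$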
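The role of the threshold $16(\log 2)^2$ can be checked in one line. The following is only a sketch: it assumes that the binomial estimate in question is $P[V \ge 1] \ge 1/2$ (our reading of the condition on $|G|$, not a formula quoted from the text) and uses nothing beyond $1 - p \le e^{-p}$.

```latex
\begin{align*}
  P[V = 0]
    &= \Bigl(1 - \tfrac12 |G|^{-1/2}\Bigr)^{|G|/2}
     \le \exp\Bigl(-\tfrac12 |G|^{-1/2} \cdot \tfrac12 |G|\Bigr)
     = \exp\Bigl(-\tfrac14 |G|^{1/2}\Bigr), \\
  \exp\Bigl(-\tfrac14 |G|^{1/2}\Bigr) \le \tfrac12
    &\iff |G|^{1/2} \ge 4 \log 2
     \iff |G| \ge 16 (\log 2)^2 \approx 7.69 .
\end{align*}
```

So under this reading, $|G| > 8$ is exactly what is needed for $V \ge 1$ to occur with probability at least $1/2$.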
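On the other hand, the total variation distance for any Markov chain has the following sub-multiplicative property for any integer $k$, see [13, Section 4.5]:
(5.17) $$d(kt) \le \big(2\,d(t)\big)^k.$$
Taking $t = t_H + 1$ and $k = 2$, and combining with (5.16), we have that
$$d_H\big(2(t_H + 1)\big) \le \big(2 \cdot |G|^{-1/2}/2\big)^2 = 1/|G|,$$
which immediately implies $t_{\mathrm{mix}}(H, 1/|G|) \le 2(t_H + 1)$.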
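Combining this with (5.15) yields the desired lower bound: for some constant $c > 0$,
$$c\,|G|\,\big(t_{\mathrm{mix}}(H, 1/|G|) - 2\big) \le t_{\mathrm{mix}}(H \wr G).$$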
Note that the term $-2$ in the brackets can be dropped by picking a possibly smaller constant and taking the graph large enough. The case $t^{\mathrm{quant}}_{|G|^{-1/2}/2}(\tau_H) = 1$ can be handled as follows. First note that we can replace the exponent $1/2$ in the quantile by an arbitrary $0 < \alpha < 1$ and run the proof with $t^{\mathrm{quant}}_{|G|^{-\alpha}/2}(\tau_H)$. If this still equals 1 for all $\alpha$, then $\tau_H = 1$ a.s. In this case it is enough to hit the vertices to mix immediately, and thus the term $|G|\,t_{\mathrm{mix}}(H)$ is of smaller order than the cover time $t_{\mathrm{cov}}(G)$. The case when $|G| \le 8$ but $|H| \to \infty$ is easy, since in this case $t_{\mathrm{mix}}(H, 1/|G|) \le 2\,t_{\mathrm{mix}}(H)$ and one can argue that mixing on $H \wr G$ requires mixing on a single lamp graph $H_w$ for a fixed $w \in G$. Thus the lower bound remains valid.
The cover time of $G$ is already a lower bound in the case of 0-1 lamps by [18], hence also for general lamps, but, for completeness, we adapt the proof of [13, Theorem 19.2] to our setting. By Lemma 3.3 we can estimate the separation distance on $H \wr G$ from below by the tail of the cover time of $G$; this is the estimate (5.18). Now, using the submultiplicativity of $d(t)$ in (5.17) and the relation between the separation distance and the total variation distance in (3.15), the separation distance at time $8\,t_{\mathrm{mix}}(H \wr G, 1/4)$ is at most $1/4$. Combining with (5.18) yields that, for every starting state, the walk on $G$ fails to cover $G$ by time $8\,t_{\mathrm{mix}}(H \wr G, 1/4)$ with probability at most $1/4$. Thus, run the chain in blocks of length $8\,t_{\mathrm{mix}}(H \wr G, 1/4)$ and conclude that in each block it covers with probability at least $3/4$. Hence the cover time is dominated by $8\,t_{\mathrm{mix}}(H \wr G, 1/4)$ times a geometric random variable with success probability $3/4$, which bounds its expectation. Maximizing over all possible starting states yields $t_{\mathrm{cov}}(G) \le 11\,t_{\mathrm{mix}}(H \wr G, 1/4)$, finishing the proof.
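The constant 11 in the last step comes from the mean of the geometric number of blocks. A short sketch of the arithmetic follows, where $T$, $N$ and $\tau_{\mathrm{cov}}$ are names introduced here for illustration: $T := 8\,t_{\mathrm{mix}}(H \wr G, 1/4)$ is the block length, $N$ is the number of blocks run until $G$ is covered, and $\tau_{\mathrm{cov}}$ is the cover time of the walk on $G$.

```latex
\begin{align*}
  P_x[N > k] &\le (1/4)^k \quad (k \ge 0)
    &&\text{(each block covers with probability at least } 3/4\text{)}, \\
  E_x[\tau_{\mathrm{cov}}] \le T\, E_x[N]
    &= T \sum_{k \ge 0} P_x[N > k]
     \le T \sum_{k \ge 0} (1/4)^k = \tfrac43\, T, \\
  \tfrac43\, T &= \tfrac{32}{3}\, t_{\mathrm{mix}}(H \wr G, 1/4)
     \le 11\, t_{\mathrm{mix}}(H \wr G, 1/4).
\end{align*}
```

The bound on $P_x[N > k]$ uses that the per-block estimate holds for every starting state, so it can be applied again at the start of each new block.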

5.2.2. Proof of the lower bound of Theorem 1.4, without Assumption (A). Now we turn to the general case and first show that $c\,t_{\mathrm{rel}}(H)\,|G| \log |G|$ is a lower bound. No laziness assumption on the chain on $H$ is needed to get this bound. We use the distinguishing function method. Namely, take an eigenfunction $\varphi_2$ of the transition matrix $Q$ on $H$ corresponding to the second eigenvalue $\lambda_H$, and define $\psi : H \wr G \to \mathbb{C}$ by
$$\psi(f, x) := \sum_{v \in G} \varphi_2(f_v).$$
One can always normalize such that $\mathrm{Var}_{\pi_H}[\varphi_2] = 1$; since $E_{\pi_H}[\varphi_2] = 0$, this gives $E_{\pi}[\psi] = 0$ and $\mathrm{Var}_{\pi}[\psi] = |G|$ under the stationary product measure. This normalization has two useful consequences. First, by Chebyshev's inequality, the set $A = \{\psi < 2|G|^{1/2}\}$ has measure at least $3/4$ under stationarity. Second, $\varphi_2(g_0) := \max_{g \in H} \varphi_2(g) > 1$, otherwise the variance would be less than 1. We aim to show that the set $A$ has measure less than $1/2$ at time $c\,t_{\mathrm{rel}}(H)\,|G| \log |G|$; then we are done by the following characterization of the total variation distance, see [2, 13]:
$$\|\mu - \nu\|_{\mathrm{TV}} = \max_{A \subset \Omega} |\mu(A) - \nu(A)|.$$
Let us start all the lamp graphs from the state $g_0 \in H$ where the maximum of $\varphi_2$ is attained, and write $X^*_t$ for the state of the lamplighter chain at time $t$. Then we can condition on the local time sequence and use the eigenvalue property of $\varphi_2$ to obtain
(5.20) $$E\big[\psi(X^*_t) \mid (L_v(t))_{v \in G}\big] = \varphi_2(g_0) \sum_{v \in G} \lambda_H^{L_v(t)}.$$
Since $\sum_v L_v(t) = 2t$, we can apply Jensen's inequality to the function $y \mapsto \lambda_H^{y}$ to get a lower bound on the expectation:
(5.21) $$\sum_{v \in G} \lambda_H^{L_v(t)} \ge |G|\,\lambda_H^{2t/|G|}.$$
To give a lower bound on the right hand side we must assume here that $\lambda_H > 0$, or equivalently $t_{\mathrm{rel}}(H) > C > 1$. Thus we first handle the other case, i.e. when $t_{\mathrm{rel}}(H) < 2$. Then the lower bound we are about to show is of order $|G| \log |G|$, which is always at most the order of $t_{\mathrm{cov}}(G)$, due to a result by Feige [9] stating that for simple random walk on any connected graph $G$, $t_{\mathrm{cov}}(G) \ge (1 + o(1))\,|G| \log |G|$.
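The Chebyshev step behind the claim that $A$ has stationary measure at least $3/4$ can be written out as follows. This is a sketch under assumptions that are our reading of the argument rather than formulas quoted from it: $\psi(f, x) = \sum_{v \in G} \varphi_2(f_v)$, the normalization is $\mathrm{Var}_{\pi_H}[\varphi_2] = 1$, and the stationary measure $\pi$ makes the lamps independent with law $\pi_H$.

```latex
\begin{align*}
  E_{\pi}[\psi] &= \sum_{v \in G} E_{\pi_H}[\varphi_2] = 0,
  \qquad
  \mathrm{Var}_{\pi}[\psi] = \sum_{v \in G} \mathrm{Var}_{\pi_H}[\varphi_2] = |G|, \\
  \pi(A^c) &= \pi\bigl(\psi \ge 2|G|^{1/2}\bigr)
    \le \frac{\mathrm{Var}_{\pi}[\psi]}{\bigl(2|G|^{1/2}\bigr)^2}
    = \frac{|G|}{4|G|} = \frac14 .
\end{align*}
```

Here $E_{\pi_H}[\varphi_2] = 0$ holds because $\varphi_2$ is orthogonal to the constant eigenfunction of the reversible chain on $H$.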
When $t_{\mathrm{rel}}(H) > 2$, we can use that $1 - x > e^{-1.5x}$ for $0 < x < 0.5$ to get a lower bound on the right hand side of (5.21). Then set $t = c\,t_{\mathrm{rel}}(H)\,|G| \log |G|$, turning the estimate in (5.20) into
$$E\big[\psi(X^*_t) \mid (L_v(t))_{v \in G}\big] \ge \varphi_2(g_0)\,|G|^{1-3c},$$
where $X^*_t$ denotes the state of the lamplighter chain at time $t$. The conditional variance of $\psi(X^*_t)$ given the local times can easily be bounded from above. Now let us estimate the measure of the set $A$ at time $t$ using the lower bound on the expectation: on $A$ we have $\psi < 2|G|^{1/2}$, so $\psi(X^*_t)$ must lie at least $\varphi_2(g_0)|G|^{1-3c} - 2|G|^{1/2}$ below its conditional mean. We use that $\varphi_2(g_0) > 1$ and that, if $c < 1/6$, the term $\varphi_2(g_0)|G|^{1-3c}$ dominates, so for $|G|$ large enough we can drop the negative term and compensate for it with a multiplicative factor of $1/2$, say. Thus, conditioning on the local time sequence first, Chebyshev's inequality yields for any sequence $(L_v(t))_{v \in G}$:
$$P\big[X^*_t \in A \mid (L_v(t))_{v \in G}\big] \le \frac{4\,\mathrm{Var}\big[\psi(X^*_t) \mid (L_v(t))_{v \in G}\big]}{\varphi_2(g_0)^2\,|G|^{2-6c}}.$$
Combining this with the estimate on the conditional variance yields a bound that is independent of the local time sequence, so by the law of total probability the same upper bound holds without conditioning on the local times. Now taking $c < 1/6$ and $|G|$ large enough, we see that the right hand side can be made smaller than $1/2$, finishing the proof.
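The exponent $1 - 3c$ and the threshold $c < 1/6$ can be traced through a short computation. The sketch below assumes the usual convention $t_{\mathrm{rel}}(H) = 1/(1 - \lambda_H)$ and the Jensen bound $\sum_{v} \lambda_H^{L_v(t)} \ge |G|\,\lambda_H^{2t/|G|}$; both are our reading of the argument rather than formulas quoted from it.

```latex
\begin{align*}
  \lambda_H &= 1 - \frac{1}{t_{\mathrm{rel}}(H)}
     > \exp\Bigl(-\frac{1.5}{t_{\mathrm{rel}}(H)}\Bigr)
     && \text{since } 0 < \tfrac{1}{t_{\mathrm{rel}}(H)} < \tfrac12, \\
  \lambda_H^{2t/|G|}
    &> \exp\Bigl(-\frac{3t}{|G|\, t_{\mathrm{rel}}(H)}\Bigr)
     = \exp(-3c \log |G|) = |G|^{-3c}
     && \text{for } t = c\, t_{\mathrm{rel}}(H)\, |G| \log |G|, \\
  |G|\, \lambda_H^{2t/|G|} &> |G|^{1-3c},
  \qquad 1 - 3c > \tfrac12 \iff c < \tfrac16 .
\end{align*}
```

So for $c < 1/6$ the conditional mean of $\psi$ grows faster than the threshold $2|G|^{1/2}$ that defines $A$, which is what makes the Chebyshev bound effective.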
To see that the cover time is a lower bound in the general case, couple the chain on $H \wr G$ to $\mathbb{Z}_2 \wr G$, i.e. jump to the stationary distribution on $H_v$ once the walker on the base hits vertex $v$, and use [18] or [13] to see that $t_{\mathrm{cov}}(G) \le t_{\mathrm{mix}}(\mathbb{Z}_2 \wr G) \le t_{\mathrm{mix}}(H \wr G)$.
Next we show that $c\,|G|\,t_{\mathrm{mix}}(H)$ is a lower bound if the chain on $H$ is lazy.
Let us start with a definition for a general Markov chain $X$ on $\Omega$ with stationary distribution $\pi$:
$$t_{\mathrm{stop}} := \max_{x \in \Omega} \min\big\{ E_x[\tau] : \tau \text{ is a stopping time with } P_x[X_\tau \in \cdot\,] = \pi(\cdot) \big\}.$$
We call a stopping time mean-optimal if $E[\tau] = t_{\mathrm{stop}}(G)$. Lovász and Winkler [15] show that optimal stopping rules always exist for irreducible Markov chains. We aim to show that
$$\tfrac12\,|G|\,t_{\mathrm{stop}}(H) \le t_{\mathrm{stop}}(H \wr G).$$
Take a mean-optimal stopping time $\tau^*$ on $H \wr G$ reaching minimal expectation, i.e. $E_{(f^*, x^*)}[\tau^*] = t_{\mathrm{stop}}(H \wr G)$ for some $(f^*, x^*) \in H \wr G$ and $E_{(f, x)}[\tau^*] \le t_{\mathrm{stop}}(H \wr G)$ for $(f, x) \neq (f^*, x^*)$. We use this $\tau^*$ to define a stopping rule $\tau_H(v)$ on $H_v$ for every $v \in G$. Namely, look at a coordinate $v \in G$ and at the chain restricted to the lamp graph $H_v$, i.e. only at the moves made on the coordinate $H_v$. Then stop the chain on $H_v$ when $\tau^*$ stops on the whole of $H \wr G$.
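Start the chain from any $(f_0, x_0)$. Since $\sum_{v \in G} L_v(t) = 2t$, we have
$$\sum_{v \in G} E_{f_v(0)}[\tau_H(v)] = 2\,E_{(f_0, x_0)}[\tau^*].$$
Take the vertex $w \in G$ (which can depend on $x_0$) that minimizes the expectation $E_{f_w(0)}[\tau_H(w)]$. Clearly for this vertex the expected value is at most the average:
$$E_{f_w(0)}[\tau_H(w)] \le \frac{1}{|G|} \sum_{v \in G} E_{f_v(0)}[\tau_H(v)] = \frac{2}{|G|}\,E_{(f_0, x_0)}[\tau^*].$$
The left hand side is at least as large as what a mean-optimal stopping rule on $H$ can achieve, and the right hand side is at most $\frac{2}{|G|}\,t_{\mathrm{stop}}(H \wr G)$. Thus we arrive at $\tfrac12\,|G|\,t_{\mathrm{stop}}(H) \le t_{\mathrm{stop}}(H \wr G)$.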
In the last step we use [19, Corollary 2.5], which states that $t_{\mathrm{stop}}$ and $t_{\mathrm{mix}}$ are equivalent up to universal constants for lazy reversible chains, and get that $c_1\,|G|\,t_{\mathrm{mix}}(H) \le t_{\mathrm{mix}}(H \wr G)$.
6. Further directions. The next step in understanding generalized lamplighter walks might be to investigate which properties of $G$ and $H$ are needed to exhibit cutoff (for a definition see [2, 13]), or to determine the mixing time in the uniform metric.
For $\mathbb{Z}_2 \wr G$, already [11] implies a total variation cutoff with threshold $\tfrac12\,t_{\mathrm{cov}}(K_n)$ when $G$ is the complete graph $K_n$, and that there is no cutoff if $G$ is a cycle of length $n$. The results of [18] include a proof of total variation cutoff for $\mathbb{Z}_2 \wr \mathbb{Z}_n^2$ with threshold $t_{\mathrm{cov}}(\mathbb{Z}_n^2)$. The results in [16] also include cutoff at $\tfrac12\,t_{\mathrm{cov}}(G_n)$, under some uniform local transience assumptions on $G_n$. Further, Levi [14] proved that the wreath product of two complete graphs $K_{n^\lambda} \wr K_n$, $0 \le \lambda \le 1$, exhibits a cutoff at $\frac{1+\lambda}{2}\,n \log n$.
For the mixing time in the uniform metric, we know [18, Theorem 1.4] that if $G$ is a regular graph such that $t_{\mathrm{hit}}(G) \le K|G|$, then there exist constants $c, C$ depending only on $K$ such that
(6.1) $$c\,|G|\,\big(t_{\mathrm{rel}}(G) + \log |G|\big) \le t_u(\mathbb{Z}_2 \wr G) \le C\,|G|\,\big(t_{\mathrm{mix}}(G) + \log |G|\big).$$
These bounds fail to match in general. For example, for the hypercube $\mathbb{Z}_2^d$, $t_{\mathrm{rel}}(\mathbb{Z}_2^d) = \Theta(d)$ [13, Example 12.15] while $t_{\mathrm{mix}}(\mathbb{Z}_2^d) = \Theta(d \log d)$ [13, Theorem 18.3]. Then [12] showed that the lower bound in (6.1) is sharp under conditions which are satisfied by the $d(n)$-dimensional tori $G_n = \mathbb{Z}_n^{d(n)}$ for arbitrarily chosen $n$ and $d(n)$.
7. Acknowledgement. We thank Elisabetta Candellero, Gábor Pete and Tim Kam Wong for useful comments and Louigi Addario-Berry, Perla Sousi and Peter Winkler for a useful discussion of halting states.