Random walks on finite groups with few random generators

Let $G$ be a finite group. Choose a set $S$ of size $k$ uniformly from $G$ and consider a lazy random walk on the corresponding Cayley graph. We show that for almost all choices of $S$, given $k = 2a\, \log_2 |G|$, $a>1$, this walk mixes in under $m = 2a \,\log\frac{a}{a-1} \log |G|$ steps. A similar result was obtained earlier by Alon and Roichman and also by Dou and Hildebrand using different techniques. We also prove that when sets are of size $k = \log_2 |G| + O(\log \log |G|)$, $m = O(\log^3 |G|)$ steps suffice for mixing of the corresponding symmetric lazy random walk. Finally, when $G$ is abelian we obtain better bounds in both cases.


Introduction
In the past few years there has been significant progress in the analysis of random walks on groups with random support. Still, for general groups $G$ and small sets of generators, such as those of size $O(\log |G|)$, more progress is yet to be made. Our results partially fill this gap.
Here is the general setup of the problem. Let $G$ be a finite group, $n = |G|$. For a given $k$, choose uniformly $k$ random elements $g_1, \dots, g_k \in G$. Denote by $S$ the set of these elements. A lazy random walk $W = W(G, S)$ is defined as a finite Markov chain $X_t$ with state space $G$ such that $X_0 = e$ and $X_{t+1} = X_t \cdot g_i^{\epsilon_i}$, where $i = i(t)$ are independent and uniform in $[k] = \{1, \dots, k\}$, and $\epsilon_i$ are independent and uniform in $\{0, 1\}$. Denote by $Q^m$ the probability distribution of $X_m$. If $S$ is a set of generators, then $Q^m(g) \to 1/|G|$ as $m \to \infty$, i.e. the walk $W$ has a uniform stationary distribution $U$, $U(g) = 1/n$ for all $g \in G$.
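The chain just described can be iterated exactly on a small group. The following is a minimal sketch, assuming $G = \mathbb{Z}_2^r$ with elements encoded as $r$-bit integers and the group operation given by XOR; the function name `lazy_walk_distribution` and this encoding are illustrative choices and do not come from the paper.

```python
def lazy_walk_distribution(r, gens, m):
    """Exact distribution of X_m for the lazy walk on Z_2^r.

    Elements are r-bit integers, the product is XOR, and gens is the
    list (g_1, ..., g_k). One step keeps the current state with
    probability 1/2 and otherwise multiplies by a uniform generator.
    """
    n = 1 << r
    k = len(gens)
    q = [0.0] * n
    q[0] = 1.0  # X_0 = e
    for _ in range(m):
        nxt = [0.5 * p for p in q]  # exponent 0: the walk stays put
        for g in gens:
            w = 0.5 / k  # exponent 1 and generator g chosen
            for x in range(n):
                nxt[x ^ g] += w * q[x]
        q = nxt
    return q
```

For instance, `lazy_walk_distribution(3, [1, 2, 4], 60)` is numerically indistinguishable from the uniform distribution on 8 elements, while a set such as `[1]` that does not generate the group never leaves the subgroup $\{0, 1\}$.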
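Define the total variation distance $d(m)$ of the walk after $m$ steps as follows:
$$d(m) = \max_{A \subset G} |Q^m(A) - U(A)| = \frac{1}{2} \sum_{g \in G} \left|Q^m(g) - \frac{1}{n}\right|.$$
Also define the separation distance
$$s(m) = n \max_{g \in G} \left(\frac{1}{n} - Q^m(g)\right).$$
It is easy to see that $0 \le d(m) \le s(m) \le 1$. It is also known (see e.g. [AD2]) that $d(m+1) \le d(m)$, $s(m+1) \le s(m)$ for all $m > 0$, and $s(2m) < C\, d(m)$ for a universal constant $C$ and $m$ large enough that $d(m) < 1/16$.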
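Both distances are easy to evaluate once the distribution of $X_m$ is known as a vector. The sketch below assumes the standard definitions $d = \frac12 \sum_g |Q(g) - 1/n|$ and $s = n \max_g (1/n - Q(g))$; the function names are hypothetical.

```python
def total_variation(q):
    """Total variation distance between q and the uniform distribution."""
    n = len(q)
    return 0.5 * sum(abs(p - 1.0 / n) for p in q)


def separation(q):
    """Separation distance between q and the uniform distribution."""
    n = len(q)
    return max(1.0 - n * p for p in q)
```

A point mass on a group of order 4 gives `total_variation([1, 0, 0, 0]) == 0.75` and `separation([1, 0, 0, 0]) == 1.0`, consistent with $d(m) \le s(m)$.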
The general problem is to find the smallest $m$ such that $d(m), s(m) \le \epsilon$ for almost all choices of $S$. Clearly, if $k$ is small enough, then almost surely $S$ is not a set of generators and $s(m) = 1$, $d(m) \ge 1/2$. The example of $G = \mathbb{Z}_2^r$ shows that if $k < r = \log_2 n$ this is the case. Thus it is reasonable to consider only the case $k \ge \log_2 n$.
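For $G = \mathbb{Z}_2^r$ the threshold at $k = r$ can be made quantitative. The sketch below uses the standard count of full-rank $r \times k$ matrices over the two-element field, which gives the probability $\prod_{i=0}^{r-1} (1 - 2^{i-k})$ that $k$ uniform elements generate the group when $k \ge r$. This formula is not stated in the paper, and the function name is hypothetical.

```python
def prob_generates_Z2r(r, k):
    """Probability that k uniform elements of Z_2^r generate the group."""
    if k < r:
        return 0.0  # fewer than r vectors cannot span an r-dimensional space
    p = 1.0
    for i in range(r):
        # the (i+1)-st independent direction must avoid a subspace of size 2^i
        p *= 1.0 - 2.0 ** (i - k)
    return p
```

With $k = r$ the probability stays bounded away from 1, and it approaches 1 quickly as $k - r$ grows.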
Theorem 1. Let $G$ be a finite group, $n = |G|$. Let $\epsilon > 0$, $a > 1$ be given. Then $E[s(m)] \to 0$ as $n \to \infty$, where the expectation is taken over all choices of $S = \{g_1, \dots, g_k\}$ of size $k > 2a \log_2 n$, and where $s(m)$ is the separation distance of the lazy random walk $W(G, S)$ after $m$ steps, for $m > 2(1+\epsilon)\, a \ln \frac{a}{a-1} \log_2 n$.
For example, when $a = 2$, $\epsilon \to 0$, we have that $m \approx 2.77 \log_2 n$ steps of the lazy walk are enough to drive the expected separation distance to 0, where the set of generators has size $k > 4 \log_2 n$ and is chosen uniformly in $G$.
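The sizes in Theorem 1 can be tabulated for concrete $n$ and $a$. This is a minimal sketch that reads the constants off the abstract ($k = 2a \log_2 n$ and $m = 2a \log\frac{a}{a-1} \log_2 n$); it assumes the inner logarithm is natural, since that is what reproduces the value $2.77$ for $a = 2$. The function name and the `eps` parameter are illustrative.

```python
import math


def theorem1_parameters(n, a, eps=0.0):
    """Return (k, m): generator count and step count suggested by Theorem 1."""
    if a <= 1:
        raise ValueError("Theorem 1 requires a > 1")
    log2n = math.log2(n)
    k = 2 * a * log2n
    m = (1 + eps) * 2 * a * math.log(a / (a - 1)) * log2n
    return k, m
```

For $n = 2^{20}$ and $a = 2$ this gives $k = 80$ and $m / \log_2 n = 4 \ln 2 \approx 2.77$.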
Our second result deals with the case when $k = \log_2 n + o(\log n)$. While we cannot show that $m = O(\log n)$ steps is enough (later we show that this is not true in general), we prove that $m = O(\log^3 n)$ suffices. For a technical reason, we need to use a symmetric lazy random walk $W^\circ(G, S)$ defined as follows: $X_0 = e$, $X_{t+1} = X_t \cdot g_i^{\epsilon_i}$, where $\epsilon_i$ are independent and uniform in $\{\pm 1, 0\}$, and $g_i$ are independent and uniform in $S$.
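Theorem 2. Let $G$ be a finite group, $n = |G|$. Let $\epsilon > 0$ be given. Then $E[s(m)] \to 0$ as $n \to \infty$, where the expectation is taken over all choices of $S = \{g_1, \dots, g_k\}$ of size $k > \log_2 n + (1+\epsilon) \log_2 \log_2 n$, and where $s(m)$ is the separation distance of the symmetric lazy random walk $W^\circ(G, S)$ after $m$ steps, for $m > (1+\epsilon)\, 3 \ln 2\, (\log_2 n)^3$.
Both results are obtained as an application of the Erdős-Rényi results on random subproducts (see [ER]).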
A brief history of the problem. In [AD1] Aldous and Diaconis formulated the following informal conjecture for the usual (not lazy) random walks: if both $k$ and $\log_k n$ are large, then the total variation distance $d(m)$ is small with high probability for $m > (1+\epsilon) \log_k n$.
In the superlogarithmic case the conjecture was modified and proved by Dou and Hildebrand in [DH]. They showed that $E[d(m)] \to 0$ as $n \to \infty$ if $k > (\log n)^a$, $a > 1$, and $m > \frac{a}{a-1} \log_k n\,(1+\epsilon)$. They also showed that the factor $\frac{a}{a-1}$ cannot be lowered for certain classes of groups. A different proof was later found by Roichman (see [R]).
The case $k = O(\log n)$ for general groups was first explored by Alon and Roichman in [AR], where the authors showed that the second largest eigenvalue $\lambda_2$ of the Cayley graph $\Gamma(G, S)$ is bounded by a constant. This immediately implies that $m = O(\log n)$ steps is enough for mixing. Formally, they showed that given $1 > \delta > 1/e$, $k \ge (1 + o(1))\, 2e^4 \ln 2/(\delta e - 1)$, then $E(\lambda_2) < \delta$. Although our results do not imply these, for the mixing time this gives bounds that are slightly worse than ours. We should note that the authors work with symmetric sets of generators.
Another approach was introduced by Dou and Hildebrand in [DH, §5]. They showed that if $k = a \log n$, $m > b \log n$, where $a > e^2$, $b < a/4$, and $b \log(eb/a) < -1$, then $E[d(m)] \to 0$ as $n \to \infty$. The result of Theorem 1 is a somewhat stronger version of a similar result. In particular, we require just $a > 2$. Again, a direct comparison of the results is cumbersome since the authors use different measures of mixing (separation vs. total variation distance) and different types of walks ($1/2$ vs. $0$ holding probability), and in addition the latter result expresses $m = m(k, n)$ only implicitly. Let us point out, however, that as $k/\log_2 n \to \infty$ the result in [DH] gives an asymptotically better bound on the expected number of steps ($m/\log_2 n \to 0$ vs. $m/\log_2 n \to 2$). On the other hand, our probabilistic method approach seems slightly more straightforward and easier to generalize.
The case $k = \log_2 n + o(\log n)$ studied in Theorem 2 is also not new in this setting. In [AR] the authors remarked that one can easily get an $m = O(\log^4 n)$ bound using just the diameter bound. We have every reason to believe that the power 3 can be (and should be) brought down to at least 2. However, examples show that $m = \Omega(\log n \log \log n)$ in some cases (see below).
For specific groups, such as abelian groups, the situation is well understood. Several authors have obtained sharp bounds for these random random walks (see [G, H, PV, W]). A good guidance for the case is enough (see [W]). This shows that the bound in Theorem 1 is of the right order. We should also mention that in the case $G \cong \mathbb{Z}_2^r$ the results of Wilson in [W] give extremely sharp estimates on convergence. The paper also suggests a possible generalization to all abelian groups, but the results have yet to be published.
An interesting observation is due to Hildebrand (see also [AR, §3]). In [H] he showed that if $G$ is abelian, $k = (\log n)^a$, where $a < 1$, then for any given $\epsilon > 0$, $b > 0$ and $m = (\log n)^b$ we have $d(m) > 1 - \epsilon$ for sufficiently large $n$. Thus there is a phase transition around $a = 1$. Therefore our results can be interpreted as a look inside this phase transition.
Although the examples above cover only abelian groups, the reader should be warned that abelian groups might give an incomplete picture. For example, besides $\mathbb{Z}_2^r$ there are nonabelian groups with the property that they cannot be generated by fewer than $\log_2 n$ generators (cf. [AP]). Random walks on them are yet to be better understood. The following result, obtained as a bonus from the proof of Theorem 1, is another illustration of the "abelian groups are easier" principle.
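Theorem 3. Let $G$ be a finite abelian group, $n = |G|$. Let $\epsilon > 0$, $a > 1$ be given. Then $E[s(m)] \to 0$ as $n \to \infty$, where the expectation is taken over all choices of $S = \{g_1, \dots, g_k\}$ of size $k > a \log_2 n$, and where $s(m)$ is the separation distance of the lazy random walk $W(G, S)$ after $m$ steps, for $m > (1+\epsilon)\, a \ln \frac{a}{a-1} \log_2 n$.
In other words, we can save a factor of 2 in Theorem 1. This makes the result tight when $G = \mathbb{Z}_2^r$ (cf. [W]). Also, we get an analog of Theorem 2 which gives a much tighter bound in this case.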
Theorem 4. Let $G$ be a finite abelian group, $n = |G|$. Let $w$ be any function of $n$ such that $w \to \infty$ as $n \to \infty$. Then $E[s(m)] \to 0$ as $n \to \infty$, where the expectation is taken over all choices of $S = \{g_1, \dots, g_k\}$ of size $k > \log_2 n \left(1 + C\, \frac{\log \log \log n}{\log \log n}\right)$, where $C$ is a universal constant (independent of $n$), and $s(m)$ is the separation distance of the lazy random walk $W(G, S)$ after $m$ steps, for $m > \log_2 n\, (\log \log_2 n + w)$.
For example, $w = \log \log \log n$ will work. Heuristically, Theorem 4 corresponds to a nonexistent case $a = 1$ in Theorem 3. Roughly, let $a = 1 + 1/\log_2 n$. Then $k = \log_2 n + 1$, and $m > (1+\epsilon) \log_2 n \log \log_2 n$, which is basically what Theorem 4 says.
Finally, let us mention a somewhat relevant conjecture of Babai. He conjectured in [B2] that there exists a universal constant $c$ such that if $G$ is simple, then the diameter $\Delta$ of any Cayley graph on $G$ is at most $(\log n)^c$. Together with the standard bound $\mathrm{mix} < C\, |S|\, \Delta^2 \log n$ on the mixing time (see [AF, DSC]), and given $k = |S| = O((\log n)^{c_1})$, this implies that $m = C (\log n)^{c_2}$ steps is always enough for convergence (assuming $S$ is a set of generators). On the other hand, it is known that $P = \Pr(\langle S \rangle = G) \to 1$ as $n \to \infty$, where $P$ is the probability that a random $S$, $|S| = k \ge 2$, generates $G$. This is a result of Liebeck and Shalev (see [LS]), conjectured earlier by Kantor and Lubotzky (see [KL]). Therefore we conclude that the Babai conjecture implies that for simple $G$ and random $S$ of constant size $k \ge 2$ the mixing time is almost surely polylogarithmic in $n$.
Note here that the above conjecture of Babai as well as its application to convergence is open even for G = A n .The best known result is due to Babai and Hetyei (see [BH]) who found ∆ ≤ (log n) log n(1/2+o(1)) bound for almost all pairs of even permutations.

Proof of Theorem 1
Let $G$ be a finite group, $n = |G|$. Throughout the paper we will ignore the small difference between random subsets $S$ and random sequences $J$ of group elements. The reason is that the two concepts are virtually identical, since the probability of repetition of elements (having $g_i = g_j$, $1 \le i < j \le k$) when $k = O(\log n)$ is exponentially small. Thus in the future we will substitute uniform sets $S$ of size $k$ by uniform sequences $J \in G^k$, which, of course, can have repeated elements.
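Fix a sequence $J = (g_1, \dots, g_k) \in G^k$. Random subproducts are defined as $g_1^{\epsilon_1} \cdot \ldots \cdot g_k^{\epsilon_k}$, where $\epsilon_i \in \{0, 1\}$ are given by independent unbiased coin flips. Denote by $P_J$ the probability distribution of the random subproducts on $G$. Erdős and Rényi showed in [ER] that if $g_1, \dots, g_k$ are chosen uniformly and independently, then
$$(*) \qquad \Pr\left(\max_{g \in G} \left|P_J(g) - \frac{1}{n}\right| \le \frac{\epsilon}{n}\right) > 1 - \delta \quad \text{for} \quad k > 2 \log_2 n + 2 \log_2 \frac{1}{\epsilon} + \log_2 \frac{1}{\delta}.$$
The proofs of Theorems 1 and 3 are based on $(*)$.
Let $m > 2 \log_2 |G|$, and let $J$ be as above. Denote by $Q_J$ the probability distribution $Q_J^m$ of the lazy random walk $W(G, S)$ after $m$ steps, where $S = S(J)$ is the set of elements in $J$. Suppose we can show that with probability $> 1 - \alpha/2$ we have $s_J(m) \le \alpha/2$, where $\alpha \to 0$ as $n \to \infty$. This would imply the theorem. Indeed, since $s_J(m) \le 1$ always, we then have $E[s(m)] \le \alpha/2 + \alpha/2 = \alpha$. Recall that after $m$ steps the walk is at $X_m = g_{i_1}^{\epsilon_1} \cdot \ldots \cdot g_{i_m}^{\epsilon_m}$, where $i_1, \dots, i_m$ are uniform and independent in $[k] = \{1, \dots, k\}$.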
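The distribution $P_J$ of random subproducts can be computed exactly for small groups, which gives a way to observe how often a random sequence $J$ is close to uniform. The sketch below assumes $G = \mathbb{Z}_2^r$ with XOR as the group operation; the function names, the `trials` parameter and the seeding are illustrative assumptions, not the paper's notation.

```python
import random


def subproduct_distribution(r, seq):
    """Distribution of g_1^e1 * ... * g_k^ek on Z_2^r with fair coin exponents."""
    n = 1 << r
    p = [0.0] * n
    p[0] = 1.0
    for g in seq:
        # exponent 0 keeps x, exponent 1 arrives at x from x XOR g
        p = [0.5 * p[x] + 0.5 * p[x ^ g] for x in range(n)]
    return p


def max_deviation(p):
    """Smallest eps with max_g |p(g) - 1/n| <= eps / n."""
    n = len(p)
    return n * max(abs(v - 1.0 / n) for v in p)


def fraction_good(r, k, eps, trials, seed=0):
    """Empirical fraction of uniform J in G^k whose P_J has deviation <= eps."""
    rng = random.Random(seed)
    n = 1 << r
    good = 0
    for _ in range(trials):
        seq = [rng.randrange(n) for _ in range(k)]
        if max_deviation(subproduct_distribution(r, seq)) <= eps:
            good += 1
    return good / trials
```

Calling `fraction_good(r, k, eps, trials)` for increasing `k` gives an empirical view of how the fraction of good sequences grows with the length of $J$ on this particular family of groups.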
Let $J = (g_1, \dots, g_k)$ be fixed. For a given $I = (i_1, \dots, i_m) \in [k]^m$, consider $J(I) = (g_{i_1}, \dots, g_{i_m})$ and $R_I = P_{J(I)}$. By definition of a lazy random walk we have
$$Q_J^m = \frac{1}{k^m} \sum_{I \in [k]^m} R_I.$$
We will show that for almost all choices of $J$ and $I$, the probability distribution $R_I$ is almost uniform.
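The fact that the $m$-step lazy walk is the average of the subproduct distributions $R_I$ over all index sequences $I \in [k]^m$ can be checked by brute force on a small example. This is a sketch under the assumption $G = \mathbb{Z}_2^r$ (integers under XOR, with `n` a power of two); all function names are hypothetical.

```python
from itertools import product


def subproduct_distribution(n, seq):
    """R_I: distribution of the random subproduct of seq on Z_2^r (XOR)."""
    p = [0.0] * n
    p[0] = 1.0
    for g in seq:
        p = [0.5 * p[x] + 0.5 * p[x ^ g] for x in range(n)]
    return p


def lazy_walk(n, gens, m):
    """Distribution of the lazy walk after m steps, started at the identity."""
    k = len(gens)
    q = [0.0] * n
    q[0] = 1.0
    for _ in range(m):
        q = [0.5 * q[x] + sum(q[x ^ g] for g in gens) / (2 * k)
             for x in range(n)]
    return q


def average_over_index_sequences(n, gens, m):
    """Average of R_I over all k^m index sequences I in [k]^m."""
    k = len(gens)
    avg = [0.0] * n
    for idx in product(range(k), repeat=m):
        r_i = subproduct_distribution(n, [gens[i] for i in idx])
        avg = [a + b / k ** m for a, b in zip(avg, r_i)]
    return avg
```

The enumeration is exponential in $m$, so this is only a sanity check for tiny parameters such as `n = 8`, `gens = [1, 2, 6]`, `m = 4`.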
First we deduce Theorem 1 from the lemmas and then prove the lemmas.
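Proof of Theorem 1. Let $I'$ be an L-subsequence of $I$ of length $l > 2 \log_2 n + 3 \log_2 1/\delta$. Since the numbers in $I'$ are all different, for at least a $(1 - \delta)$ fraction of all $J = \{g_1, \dots, g_k\}$ we have
$$\max_{g \in G} \left|R_{I'}(g) - \frac{1}{n}\right| \le \frac{\delta}{n}.$$
Indeed, this is a restatement of $(*)$ with $\epsilon = \delta$.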
Note here that we do not require the actual group elements $g_{i_j}$, $i_j \in I'$, to be different. By coincidence they can be the same. But we do require that the numbers in $I'$ are all different, so that the corresponding group elements are independent.
Let $l = 2 \log_2 n + 3 \log_2 1/\delta$, $k > a\, l$, and $m > (1+\epsilon)\, a\, l \ln \frac{a}{a-1}$. Denote by $P(l)$ the probability that a uniformly chosen $I \in [k]^m$ contains an L-subsequence of length $l$. By Lemma 1, with probability $> P(l)(1 - \delta)$ we have
$$\max_{g \in G} \left|R_I(g) - \frac{1}{n}\right| \le \frac{\delta}{n},$$
where the probability is taken over all $I \in [k]^m$ and all $J \in G^k$. Setting $\delta = \delta(\alpha, \epsilon, n)$ small enough, we immediately obtain $s_J(m) \le \alpha/2$ with probability $> 1 - \alpha/2$, where the probability is taken over all $J \in G^k$. By the observations above, this is exactly what we need to prove the theorem.
Now take $\delta = \alpha/4$, $\beta = \epsilon/2$. By Lemma 2, and since $l > \log_2 n$, we have $P(l) > 1 - \alpha/4$ for $n$ large enough. We conclude that $P(l)(1 - \delta) > (1 - \alpha/4)^2 > 1 - \alpha/2$. This finishes the proof of Theorem 1.

Proof of Lemmas
Proof of Lemma 1. For any $x, y \in G$ denote by $y^x$ the element $x y x^{-1} \in G$.
Clearly, if $y$ is uniform in $G$ and independent of $x$, then $y^x$ is also uniform in $G$.
Let $Q$ be a distribution on a group $G$ which depends on $J \in G^m$ and takes values in $G$. We call $Q$ $(\alpha, \beta)$-good if with probability $> (1 - \beta)$ it satisfies the inequality $\max_{g \in G} |Q(g) - 1/n| \le \alpha/n$.
Consider the following random subproducts:
$$h = g_1^{\epsilon_1} \cdots g_r^{\epsilon_r} \cdot x \cdot g_{r+1}^{\epsilon_{r+1}} \cdots g_l^{\epsilon_l},$$
where $x$ is fixed, while $g_1, \dots, g_l$ are uniform and independent in $G$, and $\epsilon_1, \dots, \epsilon_l$ are uniform and independent in $\{0, 1\}$. We have
$$h = g_1^{\epsilon_1} \cdots g_r^{\epsilon_r} \cdot (g_{r+1}^{x})^{\epsilon_{r+1}} \cdots (g_l^{x})^{\epsilon_l} \cdot x.$$
Similarly, let $x, y, \dots$ be fixed group elements. Then random subproducts $h = g_1^{\epsilon_1} \cdots x \cdots g_r^{\epsilon_r} \cdots y \cdots g_l^{\epsilon_l} \cdots$ are distributed as $R_I \cdot f(x, y, \dots)$, $I = (1, \dots, r, \dots, l, \dots)$. Indeed, pull the rightmost fixed element all the way to the right, then pull the previous one, etc. We conclude that if $R_I$ is $(\alpha, \beta)$-good, then the distribution of $h$ is also $(\alpha, \beta)$-good.
Note that in the observation above we can relax the condition that the elements $x, y, \dots$ are fixed. Since we do not have to change their relative order, it is enough to require that they are independent of the elements $g_i$ to the right of them.
Now let $I = (i_1, \dots, i_m) \in [k]^m$, and let $I'$ be an L-subsequence of $I$. Define $Q(h)$ to be the distribution of random subproducts $h = g_{i_1}^{\epsilon_1} \cdots g_{i_m}^{\epsilon_m}$ where all the powers $\epsilon_j$ are fixed except for those with $j \in I'$. We claim that if $R_{I'}$ is $(\alpha, \beta)$-good, then $Q(h)$ is also $(\alpha, \beta)$-good. Indeed, pull all the elements that are not in $I'$ to the right. By definition of the L-subsequence, the elements in $I'$ to the right of those that are not in $I'$ must be different and thus independent of each other. Thus by the observation above $Q(h)$ is also $(\alpha, \beta)$-good.
Now, the distribution $R_I$ is defined as an average of the distributions $Q(h)$ over all of the $2^{m-l}$ choices of the values $\epsilon_s$ of elements not in $I' = (i_{r_1}, \dots, i_{r_l})$. Observe that for fixed $g_1, \dots, g_k$ and different choices of the $\epsilon_s$, $s \ne r_j$, the distributions of subproducts $h$ can be obtained from each other by a shift (i.e. by multiplication by a fixed group element). Therefore each of these distributions has the same separation distance. In other words, each $J$ is either "good" altogether or "bad" altogether for all $2^{m-l}$ choices. Therefore after averaging we obtain an $(\alpha, \beta)$-good distribution $R_I$. This finishes the proof of the lemma.
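Proof of Lemma 2. The problem is equivalent to the following question: what is the probability that in the usual coupon collector's problem with $k$ coupons, after $m$ trials we have at least $l$ different coupons? Indeed, observe that if all $m$ chosen coupons correspond to elements in a sequence $I \in [k]^m$, then distinct coupons correspond to an L-subsequence $I'$ of length $l$. Note that in our case $k = a\, l$ and $m = (1+\beta)\, k \ln \frac{a}{a-1}$.
Let $\tau$ be the first time we collect $l$ out of $k$ possible coupons. Let us compute the expected time $E(\tau)$. By the usual argument (see [F]) we have
$$E(\tau) = \frac{k}{k} + \frac{k}{k-1} + \dots + \frac{k}{k-l+1} = k \left(\ln k - \ln(k-l) + o(1)\right).$$
The $o(1)$ in the last equality comes from cancellation of the Euler-Mascheroni constant $\gamma$ when $k, (k-l) \to \infty$, and the formula (see e.g. [WW], §12.1):
$$1 + \frac{1}{2} + \dots + \frac{1}{N} = \ln N + \gamma + o(1).$$
When $k = a\, l$, we obtain $E(\tau) = k \left(\ln \frac{a}{a-1} + o(1)\right)$.
Let us compute $Var(\tau)$. Since $\tau$ is a sum of independent geometric random variables, we have
$$Var(\tau) = \sum_{i=0}^{l-1} \frac{i/k}{(1 - i/k)^2} \le \frac{l\, a}{(a-1)^2} = O(k).$$
Now let $m = (1+\beta)\, E(\tau)$. The probability $P(l)$ that after $m$ trials we collect $l$ coupons is equal to $\Pr(\tau \le m)$. Use the Chebyshev inequality:
$$1 - P(l) = \Pr(\tau > m) \le \Pr\left(|\tau - E(\tau)| > \beta\, E(\tau)\right) \le \frac{Var(\tau)}{\beta^2 E(\tau)^2} = O\left(\frac{1}{\beta^2 k}\right).$$
This finishes the proof of the lemma.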
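The coupon collector quantities used in this argument are straightforward to evaluate numerically. The sketch below relies on the standard facts that the waiting time for the $(i+1)$-st new coupon is geometric with success probability $(k-i)/k$, so the mean is $\sum_{i<l} k/(k-i)$ and the variance is $\sum_{i<l} (i/k)/(1-i/k)^2$. The function names are hypothetical.

```python
import math


def expected_time_to_collect(k, l):
    """Expected number of draws to see l distinct coupons out of k."""
    return sum(k / (k - i) for i in range(l))


def variance_of_time(k, l):
    """Variance of the number of draws to see l distinct coupons out of k."""
    return sum((i / k) / (1 - i / k) ** 2 for i in range(l))


def chebyshev_failure_bound(k, l, beta):
    """Chebyshev upper bound on Pr(tau > (1 + beta) * E(tau))."""
    e = expected_time_to_collect(k, l)
    return variance_of_time(k, l) / (beta * e) ** 2
```

For $k = 200$, $l = 100$ (so $a = 2$) the expected time is within one percent of $k \ln\frac{a}{a-1} = 200 \ln 2$, in line with the asymptotics used in the proof.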

Proof of Theorem 2
The proof of Theorem 2 is based on a different result of Erdős and Rényi. In [ER], along with $(*)$, they proved that if a sequence $J = (g_1, \dots, g_k)$ is chosen uniformly and independently, then
$$(**) \qquad \Pr\left(P_J(g) > 0 \ \text{for all} \ g \in G\right) > 1 - \delta \quad \text{for} \quad k \ge \log_2 n + \log_2 \log_2 n + 2 \log_2 \frac{1}{\delta} + 5.$$
Here $P_J$ is the distribution of random subproducts as in $(*)$. We will use $(**)$ to prove that with probability $> 1 - \alpha$ taken over all choices of $J \in G^k$ we have $s_J(m) \le \alpha$, where $m > 3 k^2 \ln n$ and $k$ is as above. By the same reasoning as in the proof of Theorem 1, this implies Theorem 2.
Consider group elements $g_1, \dots, g_k$ such that every $g \in G$ is given by a subproduct $g = g_1^{\epsilon_1} \cdot \ldots \cdot g_k^{\epsilon_k}$, where $\epsilon_i \in \{0, 1\}$. Denote by $p(k)$ the probability of this event given that $g_1, \dots, g_k$ are chosen uniformly in $G$.
We will use the subproducts above as paths on the Cayley graph generated by $g_i^{\pm 1}$. Note here that every generator occurs in each of the subproducts at most $N = 1$ time. Also, the diameter $\Delta$ of the Cayley graph above is bounded by the maximum length of subproducts: $\Delta \le k$.
Let us use the path arguments (see [DSC, AF]) to compare the reversible Markov chain (which corresponds to our symmetric random walk) and a trivial Markov chain which at each step sends the chain to a uniform group element. For the second largest eigenvalue $\lambda$ we get : as $n \to \infty$, where the second inequality can be found e.g. in [B1].
Now take $k = \log_2 n + (1+\epsilon) \log_2 \log_2 n$. Since $\epsilon \log_2 \log_2 n - 5 \to \infty$, we obtain that $p(k) \to 1$ as $n \to \infty$, where $p(k)$ is the probability of choosing $J \in G^k$ such that the event on the left hand side of $(**)$ holds. But in this case $s_J(m) \to 0$ as $n \to \infty$, where $m = (1+\beta)\, 3 k^2 \ln n$ as above. Now take $\beta = \epsilon/2$ and express $m$ in terms of $\log_2 n$. This finishes the proof of the theorem.

Proof of Theorems 3 and 4
The proofs of Theorems 3 and 4 are very similar to the proof of Theorem 1, so we will just point out the differences.
When $G$ is an abelian group, we can use a result of Erdős and Hall in [EH]. They showed that there exists a universal constant $C$ such that for any $\epsilon, \delta > 0$, $n \to \infty$ we have
$$(*') \qquad \Pr\left(\max_{g \in G} \left|P_J(g) - \frac{1}{n}\right| \le \frac{\epsilon}{n}\right) > 1 - \delta \quad \text{for} \quad k \ge \log_2 n \left(1 + C\, \frac{\log \log \log n}{\log \log n}\right).$$
Using the arguments from the proof of Theorem 1 we immediately obtain Theorem 3.
Similarly, to obtain Theorem 4 we need a proper analog of Lemma 2. Let us estimate how big $m$ must be in order to get $l$ distinct numbers in $I$. This is just the coupon collector's problem. If there is a total of $k$ different coupons, $k \ge l$, let $X$ be the time to get $l$ different coupons. Then the expected time $E = E(X)$ is given by
$$E(X) = \frac{k}{k} + \frac{k}{k-1} + \dots + \frac{k}{k-l+1}.$$
Also, since the variance of the coupon collector's problem is given by $Var(X) = O(l^2)$ (see [F, §9.9]) the Chebyshev inequality gives where $C_1$, $C_2$ are universal constants. Thus if $k \ge l = \log_2 n + C \log_2 n\, \frac{\log \log \log n}{\log \log n}$ we have $\delta \to 0$. Now take $\delta = \epsilon$ as in the proof of Theorem 1, and let $m > l(\log l + w)$. Proceeding as in the proof of Theorem 1, we finish the proof of Theorem 4.