A method to derive concentration of measure bounds on Markov chains

We explore a method introduced by Chatterjee and Ledoux in a paper on eigenvalues of principal submatrices. The method provides a tool for proving concentration of measure in cases where there is a Markov chain meeting certain conditions and where the spectral gap of the chain is known. We provide several additional applications of this method, including results on operator compressions using the Kac walk on $SO(n)$ and a Kac walk coupled to a thermostat, and a concentration of measure result for the length of the longest increasing subsequence of a random walk distributed under the invariant measure of the asymmetric exclusion process.


Introduction
In their analysis of concentration of measure for random submatrices [7], Chatterjee and Ledoux prove that for an arbitrary Hermitian matrix of order $n$ and $k \le n$ sufficiently large, the distribution of eigenvalues is almost the same for any principal submatrix of order $k$. Their proof uses the random transposition walk on $S_n$ together with concentration of measure techniques. To generalize their results, we observe that the key requirements are a Markov chain that does not change too many matrix entries at once and whose spectral gap is known. Instead of a Markov chain on $S_n$, we first consider a Markov chain on $SO(n)$. We introduce the Kac walk on $SO(n)$ and demonstrate that it is sufficiently similar to the transposition walk for Chatterjee and Ledoux's results to carry over to the more general case of operator compressions. It should be noted that a similar result has been proved by Meckes and Meckes [13] using different techniques. In more recent work [14], Meckes and Meckes have extended their techniques to several other classes of random matrices and prove almost sure convergence of the empirical spectral measure. The purpose of this paper is to highlight the fact that the method of Chatterjee and Ledoux extends to more general settings, provided the Markov chain used satisfies appropriate conditions. To emphasize this point, we also apply the method to obtain a concentration of measure result for a compression by a matrix of Gaussians, using the Kac walk coupled to a thermostat, and to the length of the longest increasing subsequence of a random walk evolving under the asymmetric exclusion process.
Following the notation of Chatterjee and Ledoux, for a given Hermitian matrix $A$ of order $n$ with eigenvalues $\lambda_1, \ldots, \lambda_n$, we let $F_A$ denote the empirical distribution function of $A$. This is defined as
$$F_A(x) = \frac{1}{n} \#\{ 1 \le i \le n : \lambda_i \le x \}.$$


The Kac Walk on $SO(n)$

The following model, introduced by Kac [8], describes a system of particles evolving under a random collision mechanism such that the total energy of the system is conserved. Given a system of $n$ particles in one dimension, the state of the system is specified by $v = (v_1, \ldots, v_n)$, the velocities of the particles. At each time step, $i$ and $j$ are chosen uniformly at random from $\{1, \ldots, n\}$ and $\theta$ is chosen uniformly at random on $(-\pi, \pi]$. The indices $i$ and $j$ correspond to a collision between particles $i$ and $j$ such that the energy
$$E = \sum_{i=1}^{n} v_i^2$$
is conserved. Under this constraint, after a collision the new velocities are of the form
$$v_i^{\mathrm{new}} = v_i \cos(\theta) + v_j \sin(\theta), \qquad v_j^{\mathrm{new}} = -v_i \sin(\theta) + v_j \cos(\theta),$$
with all other velocities unchanged. Let $R_{ij}(\theta)$ be the rotation matrix given by
$$R_{ij}(\theta) = \begin{pmatrix} I & & & & \\ & \cos(\theta) & & \sin(\theta) & \\ & & I & & \\ & -\sin(\theta) & & \cos(\theta) & \\ & & & & I \end{pmatrix},$$
where the $\cos(\theta)$ and $\sin(\theta)$ terms are in the rows and columns labeled $i$ and $j$, and the $I$ denote identity matrices of various sizes (possibly $0$). We will use the convention that $R_{ii}(\theta) = I$. After one step of the process, $v^{\mathrm{new}} = R_{ij}(\theta)\, v$.
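As a sanity check on these dynamics, the following short Python sketch (the function name is ours, purely for illustration) performs one collision step on a velocity vector and verifies that the kinetic energy is conserved:

```python
import math
import random

def kac_step(v, rng):
    """One collision of the Kac walk: rotate (v_i, v_j) by a uniform random angle."""
    n = len(v)
    i, j = rng.randrange(n), rng.randrange(n)
    theta = rng.uniform(-math.pi, math.pi)
    w = list(v)
    if i != j:  # convention: R_ii(theta) = I, so nothing happens when i == j
        w[i] = v[i] * math.cos(theta) + v[j] * math.sin(theta)
        w[j] = -v[i] * math.sin(theta) + v[j] * math.cos(theta)
    return w

rng = random.Random(0)
v = [1.0, -2.0, 0.5, 3.0]
w = kac_step(v, rng)
# the step is a rotation in the (i, j) plane, so sum(v_i^2) is conserved
assert abs(sum(x * x for x in v) - sum(x * x for x in w)) < 1e-9
```

Note that a single step changes at most two coordinates, which is the locality property the method relies on.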
In our case, we will consider this process acting on $SO(n)$: instead of vectors in $\mathbb{R}^n$, our states will be matrices $G \in SO(n)$. We can then define the one-step Markov transition operator $Q$ for the Kac walk, acting on continuous functions $f$ on $SO(n)$:
$$Qf(G) = \frac{1}{n^2} \sum_{i,j=1}^{n} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(R_{ij}(\theta)\, G)\, d\theta \qquad (1)$$
for any $G \in SO(n)$.
Theorem 2.1 ([6,12]). The Kac walk on $SO(n)$ is ergodic and its invariant distribution is the uniform distribution on $SO(n)$. Furthermore, the spectral gap of the Kac walk on $SO(n)$ is $\frac{n+2}{2(n-1)n}$.

Recall that for any reversible Markov chain we can define the Dirichlet form $\mathcal{E}_Q(\cdot, \cdot)$. It is well known that for a Markov chain with spectral gap $\lambda_1$, the Poincaré inequality holds:
$$\lambda_1 \operatorname{Var}_\mu(f) \le \mathcal{E}_Q(f, f).$$
For the Kac walk we have
$$\mathcal{E}_Q(f, f) = \int_{SO(n)} f (I - Q) f \, d\mu_n,$$
where $\mu_n$ is the Haar measure on $SO(n)$ normalized so that the total measure is $1$. Let us define the triple norm
$$|||F|||_\infty^2 = \frac{1}{2} \sup_{G \in SO(n)} \int \big( F(G) - F(G') \big)^2 \, Q(G, dG'). \qquad (2)$$
The following result is analogous to Theorem 3.3 from Ledoux's book The Concentration of Measure Phenomenon [10]. We reproduce the proof of Theorem 3.3 here to verify that, even though our situation does not satisfy the hypotheses of that theorem, the exact same argument carries through for the Kac walk on $SO(n)$.
Theorem 2.2. Consider the Kac walk on $SO(n)$ and let $F : SO(n) \to \mathbb{R}$ be given such that $|||F|||_\infty \le 1$. Then $F$ is integrable with respect to $\mu_n$ and for every $r \ge 0$,
$$\mu_n\left( F \ge \int F \, d\mu_n + r \right) \le 3\, e^{-r \sqrt{\lambda_1}/2},$$
where $\lambda_1 = \frac{n+2}{2(n-1)n}$ is the spectral gap of the Kac walk on $SO(n)$.

Main Result
Using these results, along with the method of Chatterjee and Ledoux, we are able to prove the following result.

Theorem 3.1. Take any $1 \le k \le n$ and an $n$-dimensional Hermitian matrix $G$. Let $A$ be the $k \times k$ matrix consisting of the first $k$ rows and columns of the matrix obtained by conjugating $G$ by a rotation matrix $R_{ij}(\theta) \in SO(n)$ chosen uniformly at random. If we let $F$ be the expected spectral distribution of $A$, then for each $r > 0$,

Proof. The proof of this theorem uses the method introduced by Chatterjee and Ledoux [7], with appropriate changes made to apply to the situation we are considering.
Let $R_{ij}(\theta) \in SO(n)$ and let $A$ be as stated above. Note that since $A$ is a compression of a Hermitian operator, it is also Hermitian. Fix $x \in \mathbb{R}$ and let $f = F_A(x)$, where $F_A$ is the empirical spectral distribution of $A$. Let $Q$ be the transition operator defined in (1) and let $|||\cdot|||_\infty$ be as in (2). Using Lemma 2.2 from Bai [3], we know that for any two Hermitian matrices $A$ and $B$ of order $k$,
$$\| F_A - F_B \|_\infty \le \frac{1}{k}\, \mathrm{rank}(A - B).$$
In our case, taking one step in the Kac walk is equivalent to rotating in a random plane by a random angle. Hence $A$ and its image after one step will differ in at most two rows and two columns, bounding the rank of the difference by $2$. This yields a bound on $|||f|||_\infty^2$ carrying a factor of $2k/n$, which bounds the probability that at least one of $i$ and $j$ is at most $k$; if both $i$ and $j$ are greater than $k$, the step leaves $A$ unchanged. From Theorems 2.1 and 2.2 we then obtain, for $r > 0$, a concentration bound for $f$; since it holds for all $r$, we can replace $>$ by $\ge$. Next we fix $\ell \in \mathbb{Z}_{\ge 2}$ and a partition of points $t_1 \le \cdots \le t_\ell$. Take any $x \in \mathbb{R}$ and let $i$ be an index such that $t_i \le x < t_{i+1}$. Combining these two facts gives a bound valid for any $r > 0$. Letting $\ell = k^{1/2} + 1$ then concludes the proof of our theorem.
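The rank inequality of Bai used above can be checked numerically. The following sketch (helper names are ours, and a coarse grid stands in for the supremum over $x$) builds one random plane rotation, compresses before and after, and verifies $\|F_A - F_B\|_\infty \le \mathrm{rank}(A - B)/k$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 4

# a random real symmetric (Hermitian) matrix G and a rotation in the (i, j) plane
M = rng.standard_normal((n, n))
G = (M + M.T) / 2
i, j, theta = 0, 5, rng.uniform(-np.pi, np.pi)
R = np.eye(n)
R[i, i] = R[j, j] = np.cos(theta)
R[i, j] = np.sin(theta)
R[j, i] = -np.sin(theta)

def esd(A, xs):
    """Empirical spectral distribution F_A evaluated at the points xs."""
    lam = np.linalg.eigvalsh(A)
    return np.array([(lam <= x).mean() for x in xs])

A = (R @ G @ R.T)[:k, :k]   # compression after one Kac step
B = G[:k, :k]               # compression before the step
xs = np.linspace(-5, 5, 2001)
gap = np.max(np.abs(esd(A, xs) - esd(B, xs)))
rank = np.linalg.matrix_rank(A - B)
assert rank <= 2                      # the step touches at most two rows/columns
assert gap <= rank / k + 1e-12        # Bai: ||F_A - F_B||_inf <= rank(A - B)/k
```

Since the rotation only mixes coordinates $i$ and $j$, the difference $A - B$ is supported on a single row and column here, which is why the rank bound of $2$ holds.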

Kac Model Coupled to a Thermostat
Using a spectral gap result from [4], we are able to demonstrate the application of this method to a more complicated Markov chain. In this system, the particles of the Kac system interact among themselves at rate $\lambda$ and interact with a particle from a thermostat at rate $\mu$. The particles in the thermostat are Gaussian with variance $\frac{1}{\beta}$, so they have already reached equilibrium. The Markov transition operator for the Kac walk is defined as in (1), and the Markov transition operator for the thermostat is given by the corresponding average of the maps $V_j(\theta, \omega)$, where $\omega = (\omega_1, \omega_2, \ldots, \omega_n)$, $V_j(\theta, \omega)$ sends each element $g_{ij}$ in column $j$ to $g_{ij} \cos(\theta) + \omega_i \sin(\theta)$ for $i = 1$ to $n$, and $\omega_{ij}^* = -g_{ij} \sin(\theta) + \omega_i \cos(\theta)$. In [4] the authors consider the Markov chain acting on a vector; we consider the Markov chain acting on a matrix by treating the matrix as $n$ independent vectors. With this adaptation, the spectral gap result we use below (Theorem 4.1) follows immediately from the results proved in [4].

For the thermostat alone (setting $\lambda = 0$), we can again prove a theorem analogous to Ledoux's Theorem 3.3. Let $\mathcal{G}$ be the set of $n \times n$ matrices with independent and identically distributed $N(0, 1/\beta)$ entries. We can define the Dirichlet form and the triple norm for the thermostat analogously. Using these, we can prove a concentration of measure result for the thermostat analogous to Theorem 2.2.

Theorem 4.2. Consider the Gaussian thermostat and let $F : \mathcal{G} \to \mathbb{R}$ be such that $|||F|||_\infty \le 1$. Then $F$ is integrable with respect to $\nu_n$ and for every $r \ge 0$,
$$\nu_n\left( F \ge \int F \, d\nu_n + r \right) \le 3\, e^{-r \sqrt{\lambda_1}/2},$$
where $\lambda_1 = \frac{\mu}{2n}$ is the spectral gap of the thermostat process.
We omit the proof here, as it closely parallels the proof of Theorem 2.2.
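To make the thermostat update concrete, here is a minimal Python sketch (naming is ours; the matrix plays the role of $n$ independent velocity vectors) of a single interaction $V_j(\theta, \omega)$ with fresh $N(0, 1/\beta)$ thermostat variables:

```python
import numpy as np

def thermostat_step(G, beta, rng):
    """One thermostat interaction: mix a random column of G with fresh N(0, 1/beta) noise."""
    n = G.shape[0]
    j = rng.integers(n)
    theta = rng.uniform(-np.pi, np.pi)
    omega = rng.normal(0.0, 1.0 / np.sqrt(beta), size=n)
    out = G.copy()
    # V_j(theta, omega): g_ij -> g_ij cos(theta) + omega_i sin(theta)
    out[:, j] = G[:, j] * np.cos(theta) + omega * np.sin(theta)
    return out

rng = np.random.default_rng(0)
beta = 2.0
G = rng.normal(0.0, 1.0 / np.sqrt(beta), size=(5, 5))
H = thermostat_step(G, beta, rng)
# only one column changes, so rank(G - H) <= 1
assert np.linalg.matrix_rank(G - H) <= 1
```

Note that if $g$ and $\omega$ are independent $N(0, 1/\beta)$ variables, then $g \cos\theta + \omega \sin\theta$ is again $N(0, 1/\beta)$, which is consistent with $\nu_n$ being invariant for this update.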
Using this result and Theorem 4.1, we can prove the following concentration of measure inequality.
Theorem 4.3. Take any $1 \le k \le n$ and an $n$-dimensional Hermitian matrix $G$. Let $S$ be an $n \times k$ matrix whose columns are the first $k$ columns of a random matrix with distribution $\nu_n$. Let $A$ be the $k \times k$ matrix obtained by conjugating $G$ by $S$, that is, $A = S^* G S$. Letting $F$ denote the expected spectral distribution of $A$, then for each $r > 0$, where $\mu$ is the rate of the interaction with the thermostat.
Proof. The proof of this theorem closely follows the proof of Theorem 3.1, with appropriate changes made. Let $A$ be as stated above, and let $A'$ be $A$ after one step of the Markov chain. Fix $x \in \mathbb{R}$ and let $f = F_A(x)$, where $F_A$ is the empirical spectral distribution of $A$. Notice that $\mathrm{rank}(A - A') \le 3$, since after one step of the chain at most $3$ columns of $A$ will have changed (two from the Kac walk and one from the thermostat). Again using the inequality from [3], we bound $|||f|||_\infty^2$ by a sum of two terms, the first over possible interactions in the Kac process and the second over possible particle interactions with the thermostat. Using Theorems 4.1 and 4.2, and following the rest of the proof of Theorem 3.1 (with the appropriate constants changed), we obtain the claimed bound.

An Additional Application: The Length of the Longest Increasing Subsequence of a Random Walk Evolving under the Asymmetric Exclusion Process
Consider a random walk $X$ on $\{1, \ldots, n\}$. Represent $X$ by an element of $\{0,1\}^n$, where $X_i = 0$ corresponds to a step down in the walk at position $i$ and $X_i = 1$ corresponds to a step up. We will assume that $\sum_{i=1}^{n} X_i = \frac{n}{2}$, so that we have the same number of up steps as down steps. We can view this random walk as the initial configuration of a particle process, with $X_i = 1$ corresponding to a particle at position $i$ and $X_i = 0$ corresponding to no particle at position $i$. Consider the asymmetric exclusion process acting on this configuration with the following dynamics. At each step of the process, a number $i$ is chosen uniformly in $\{1, \ldots, n-1\}$. If $X_i = X_{i+1}$, the configuration stays the same. If $X_i = 1$ and $X_{i+1} = 0$, the values of $X_i$ and $X_{i+1}$ switch with probability $1 - \frac{q}{2}$, and if $X_i = 0$ and $X_{i+1} = 1$, the values switch with probability $\frac{q}{2}$, for a parameter $q$ satisfying $0 < q < 1$. Viewed in this way, the asymmetric exclusion process is a Markov process on the set of such random walks. See [11] for an in-depth discussion of the asymmetric exclusion process.
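The update rule above can be written out directly; this Python sketch (the function name is ours) implements one step of the chain and checks that the particle number, and hence the balance of up and down steps, is conserved:

```python
import random

def asep_step(X, q, rng):
    """One update of the discrete-time ASEP on {0,1}^n described above."""
    n = len(X)
    i = rng.randrange(n - 1)          # edge (i, i+1), 0-indexed
    X = list(X)
    if X[i] == 1 and X[i + 1] == 0:
        if rng.random() < 1 - q / 2:  # favored swap
            X[i], X[i + 1] = 0, 1
    elif X[i] == 0 and X[i + 1] == 1:
        if rng.random() < q / 2:      # unfavored swap
            X[i], X[i + 1] = 1, 0
    return X

rng = random.Random(0)
X = [1, 0, 1, 0, 1, 0]
for _ in range(100):
    X = asep_step(X, q=0.5, rng=rng)
assert sum(X) == 3  # swaps conserve the number of particles
```

Since each step swaps at most one adjacent pair, a step changes any height of the walk by at most $1$, which is the locality property used below.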
In our case, take $q = 1 - c/n^\alpha$ for a constant $c$ and $0 < \alpha < 1$, so that $q \approx e^{-c/n^\alpha}$. Taylor approximating and simplifying then gives $\lambda_n = c^2/(2 n^{2\alpha})$. Now let $M_X$ denote the height of the midpoint of the random walk at a fixed time during the process; in other words, assuming $n$ is even, $M_X = \sum_{i=1}^{n/2} (2 X_i - 1)$. Note that the range of this function is $[-n/2, n/2]$. Let $M'_X$ be the evolution of $M_X$ after one step of the process. Notice that $|M_X - M'_X| \le 1$, since switching the positions of two adjacent particles can change the height of the midpoint by at most $1$. The factor $\frac{1}{n-1}$ in the resulting bound on $|||M_X|||_\infty^2$ appears because the only choice of $i$ that can affect the midpoint is $i = n/2$. Plugging into the Chatterjee–Ledoux theorem, we have the following result.
Theorem 5.2. Letting $M_X$ denote the height of the midpoint of the random walk after evolution under the asymmetric exclusion process, for all $r > 0$ and $q = 1 - c/n^\alpha$,

Notice that this implies that the height of the midpoint has fluctuations bounded above by a constant times $n^{\alpha - 1/2}$ for $0 < \alpha < 1$.
Consider the length $L_X$ of the longest increasing (non-decreasing) subsequence of the random walk, defined as the maximal length of a non-decreasing subsequence of the heights of the walk; we refer to the literature for a more in-depth description of this topic and results for the simple random walk. Notice that the height of the midpoint gives a lower bound on the length of the longest increasing subsequence. Using ASEP as our Markov process and the spectral gap above, we can prove concentration of measure for $L_X$. Notice that switching the positions of two adjacent particles via ASEP can change $L_X$ by at most $1$. As before, let $X'$ be the evolution of $X$ after one step of the process. Then, bounding the relevant probability above by $1$ and plugging into the Chatterjee–Ledoux formula, we get the following result.
Theorem 5.3. Letting $L_X$ denote the length of the longest increasing subsequence of the random walk after evolution under the asymmetric exclusion process, for all $r > 0$ and $q = 1 - c/n^\alpha$,

This implies that the fluctuations of $L_X$ are bounded above by a constant times $n^\alpha$. In particular, for $q = 1 - c/\sqrt{n}$, the fluctuations are bounded above by a constant times $\sqrt{n}$.
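For concreteness, $L_X$ can be computed in $O(n \log n)$ time by a patience-sorting scan over the heights of the walk; the sketch below (helper names are ours) does this for a short walk:

```python
from bisect import bisect_right

def heights(X):
    """Heights of the walk: partial sums of the +/-1 steps encoded by X."""
    s, out = 0, []
    for x in X:
        s += 1 if x == 1 else -1
        out.append(s)
    return out

def longest_nondecreasing(seq):
    """Patience-sorting length of the longest non-decreasing subsequence."""
    tails = []  # tails[k] = smallest possible tail of a subsequence of length k+1
    for v in seq:
        pos = bisect_right(tails, v)
        if pos == len(tails):
            tails.append(v)
        else:
            tails[pos] = v
    return len(tails)

S = heights([1, 1, 0, 1, 0, 0, 1, 0])   # steps +1, +1, -1, +1, -1, -1, +1, -1
assert S == [1, 2, 1, 2, 1, 0, 1, 0]
assert longest_nondecreasing(S) == 4    # e.g. the subsequence 1, 1, 1, 1
```

Using `bisect_right` rather than `bisect_left` is what makes the subsequence non-decreasing rather than strictly increasing.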
By approximating the invariant measure for the asymmetric exclusion process by a product measure and using alternative concentration of measure techniques, we are able to show that for $q < 1 - c/n$ with $c = -20 \log(3/5)$, the height of the midpoint is of order $kn$ for some constant $k > 0$. This gives a lower bound on the length of the longest increasing subsequence of the walk under this distribution. Details of this calculation will be provided in a future publication.

Remarks
Using this method, we are able to show concentration of measure of the empirical spectral distribution not only for operator compressions via $SO(n)$ but also for operators that are "compressed" by conjugation with a Gaussian matrix. It is likely that this method could be applied to a much wider range of Markov chains, provided the chain does not change too many entries at once, has an appropriate invariant distribution, and has a known spectral gap. It is possible that better bounds for the Gaussian compression could be obtained by adapting the method to use the "second" spectral gap or the exponential decay rate in relative entropy found in [4].
It is worth noting that Talagrand's isoperimetric inequality [15] gives concentration of measure for the length of the longest increasing subsequence of a random permutation, but it cannot be used in the context of this ASEP random walk, as it requires independence. With Chatterjee and Ledoux's method, independence is not needed; we only need a spectral gap bound for the Markov chain.