Quenched CLT for random toral automorphism

We establish a quenched Central Limit Theorem (CLT) for a smooth observable of random sequences of iterated linear hyperbolic maps on the torus. To this end we also obtain an annealed CLT for the same system. We show that, almost surely, the variance of the quenched system is the same as for the annealed system. Our technique is the study of the transfer operator on an anisotropic Banach space specifically tailored to use the cone condition satisfied by the maps.


Introduction
The issue of limit laws in dynamical systems has been widely explored in recent decades and has clear relevance for physical applications. A prime example of a physically relevant system is the study of the statistical behavior of a Lorentz gas with randomly distributed obstacles. The case of periodic obstacles is known to be ergodic. This follows from the recurrence [19], which in turn follows from the CLT [10,20], which has been proved in [18] (see also [8] for more refined results on these issues). See [15] for more details and for the treatment of some (locally) aperiodic cases. By contrast, the random case (although one may naïvely think of it as an easier case) stands as a challenge.
If one considers the simplest possibility (the random position of the obstacles is a small i.i.d. perturbation of a periodic configuration) then, by Poincaré section, one is readily reduced to considering a random sequence of hyperbolic symplectic maps. Yet, such a sequence of maps is not i.i.d. due to the presence of recollisions. Recollisions are notoriously a source of serious problems in the study of gases but, quite surprisingly, even disregarding the recollision problem (i.e. for the i.i.d. case), the problem is poorly understood.
In this paper we address the easiest setting in which such a situation occurs: an i.i.d. sequence of smooth uniformly hyperbolic symplectic maps. To make the presentation as clear as possible we will steer away from the full generality in which the present results can be obtained (although we will comment on it) and we will consider an i.i.d. sequence of linear two dimensional toral automorphisms. Exponential decay of correlations has been shown in this setting in [1].
For such a model we will show that the time-$N$ average of any smooth zero-mean observable has Gaussian fluctuations of order $\sqrt{N}$ for almost every sequence of maps. Moreover, we identify the variance of such Gaussian fluctuations.

[…] and Dmitry Dolgopyat for communicating to one of us (CL) reference [2]. MS would like to thank the Finnish Cultural Foundation for funding. MS and AA were supported in part by NSF DMR-01-279-26 and AFOSR AF 49620-01-1-0154. CL would like to thank the Courant Institute where he was visiting when this work started.

The model and the results
Let us consider two$^{1}$ matrices $\{A_i\}_{i=0}^{1} \subset SL(2,\mathbb{N})$ and define the toral automorphisms $T_i x = A_i x \bmod 1$. Let $\wp \in [0,1]$ and set $p_0 = \wp$, $p_1 = 1-\wp$. We can then introduce the Markov operator $Q_\wp : L^\infty(\mathbb{T}^2,\mathbb{R}) \to L^\infty(\mathbb{T}^2,\mathbb{R})$ defined by

$$Q_\wp g := p_0\, g\circ T_0 + p_1\, g\circ T_1 .$$

Such an operator defines a Markov process. To describe it we consider the space of trajectories $\Omega^* := (\mathbb{T}^2)^{\mathbb{N}}$ endowed with the product topology and, letting $(x_0, x_1, \dots)$ be a general element of $\Omega^*$, we have the obvious dynamics $\tau : \Omega^* \to \Omega^*$ defined by $\tau(x_0, x_1, \dots) = (x_1, \dots)$. For each initial measure $\mu$ on $\mathbb{T}^2$, the above Markov process defines a Borel probability measure $\mathbb{P}_\mu$ on $\Omega^*$. Let $\mathbb{E}_{\mathbb{P}_\mu}$ be the expectation with respect to such a measure. Then […]

$^{1}$ In fact, the following would hold almost verbatim also for any larger collection of matrices.
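The construction above can be made concrete with a short numerical sketch. The two matrices below are an illustrative choice (the text only requires $A_0, A_1 \in SL(2,\mathbb{N})$), the helper names are hypothetical, and the form of $Q_\wp$ used here, $Q_\wp g = p_0\, g\circ T_0 + p_1\, g\circ T_1$, is assumed to be the standard Markov operator of the random composition.

```python
import numpy as np

# Illustrative matrices with nonnegative integer entries and determinant 1
# (an assumption; any pair in SL(2, N) could be used instead).
A = [np.array([[2, 1], [1, 1]]), np.array([[1, 1], [1, 2]])]

def sample_sequence(n, p, rng):
    """Draw omega_1..omega_n i.i.d. with P(omega_k = 0) = p."""
    return (rng.random(n) >= p).astype(int)

def orbit(x0, omega):
    """Return x_0, ..., x_n with x_k = A_{omega_k} x_{k-1} mod 1."""
    xs = [np.asarray(x0, dtype=float) % 1.0]
    for w in omega:
        xs.append((A[w] @ xs[-1]) % 1.0)
    return np.array(xs)

def markov_operator(g, p):
    """Assumed form: (Q g)(x) = p * g(T_0 x) + (1 - p) * g(T_1 x)."""
    return lambda x: p * g((A[0] @ x) % 1.0) + (1 - p) * g((A[1] @ x) % 1.0)
```

A trajectory in the sense of the text is then `orbit(x0, sample_sequence(n, p, rng))` with `rng = np.random.default_rng()`.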
The measure $\mathbb{P}_\mu$ is supported on a very small set of trajectories: $\mathbb{P}_\mu$-almost surely […]. In other words we can define the probability space $\Omega = \Sigma \times \mathbb{T}^2$ (again equipped with the product topology), the map $F : \Omega \to \Omega$ defined by […], where $(\tau\omega)_i = \omega_{i+1}$, and the measure $\mathbb{P}_{\wp,\mu} = P_\wp \times \mu$, where $P_\wp$ is the Bernoulli measure with probability $\wp$ of having zero. We will denote by $E_\wp$ the expectation with respect to $P_\wp$. Note that if $\mu$ is simultaneously $T_0$- and $T_1$-invariant, then $\mathbb{P}_{\wp,\mu}$ is invariant for the map $F$. Since the maps are symplectic, this happens for the normalized Lebesgue measure $m$. Let us set $\mathbb{P}_\wp := \mathbb{P}_{\wp,m}$ and call $\mathbb{E}_{\mathbb{P}_\wp}$ the corresponding expectation. Finally, we define the map $\Psi : \Omega \to \Omega^*$ by […]. It is then easy to verify that $\tau^k(\Psi(\omega,x)) = \Psi(F^k(\omega,x))$ for all $(\omega,x) \in \Omega$, $k \in \mathbb{N}$, and $\mathbb{E}_{\mathbb{P}_\mu}(h) = \mathbb{E}_{\mathbb{P}_{\wp,\mu}}(h \circ \Psi)$ for each continuous function $h : \Omega^* \to \mathbb{R}$;$^{2}$ that is, the two dynamical systems $(\Omega^*, \tau, \mathbb{P}_m)$ and $(\Omega, F, \mathbb{P}_\wp)$ are isomorphic and so are the $\sigma$-algebras $\mathcal{F}_k = \sigma\{x, \omega_1, \dots, \omega_k\}$ and $\tilde{\mathcal{F}}_k = \sigma\{x_0, \dots, x_k\}$. We will use the two processes above interchangeably as far as the study of measure theoretical properties is concerned.

For […] we are interested in studying the $P_\wp$-almost sure asymptotic behavior, as $N \to \infty$, of the random variables […]. The first relevant fact lies in the following lemma.
Proof. By the above discussion the ergodicity of $(\Omega, F, \mathbb{P}_\wp)$ is equivalent to the ergodicity of the stationary Markov process $\mathbb{P}_m$. It is well known that the ergodicity of such a process is equivalent to the fact that $Q_\wp g = g$ implies $g = \text{constant}$ for each bounded measurable $g$. In Section 3 we will see (Corollary 3.3) that there exist $p, q > 0$ such that, for each $f \in C^{p+2d+1}$ and $g \in C^q$, the estimate (2.1) […] holds. Taking $f \in C^{p+2d+1}$ and $g \in L^\infty$, we can choose a sequence $(g_j)_{j=1}^{\infty} \subset C^q$ that converges to $g$ in $L^1$, and we obtain (2.1) also for such functions. But this means that $Q_\wp g = g$ implies $m(fg) = m(f)m(g)$ for each $f \in C^{p+2d+1}$, which readily implies that $g$ is constant.

$^{2}$ […] On the other hand, if we already have the equality for functions depending on $n$ variables, we can write $h(x_0, \dots, x_n, x_{n+1}) = g_{x_0,\dots,x_n}(x_{n+1})$ and, by induction, […]. The assertion then follows by the density of the local functions among the continuous ones.
Since $m(S_N(\omega)) = 0$, thanks to the previous lemma, we can apply the Birkhoff Ergodic Theorem and obtain […]. The next step is to investigate the variable $N^{-1/2} S_N$ and prove that it satisfies a (quenched) CLT.
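To see what the quenched statement means in practice, one can fix a single sequence $\omega$ and look at the distribution of $N^{-1/2} S_N(\omega)$ over initial points drawn from Lebesgue measure. The sketch below is only a heuristic illustration: the matrices and the observable `f` are hypothetical choices, and the indexing $S_N = \sum_{k=1}^{N} f\circ T_{\omega_k}\circ\cdots\circ T_{\omega_1}$ is an assumption about the definition of $S_N$.

```python
import numpy as np

# Illustrative matrices in SL(2, N); not prescribed by the text.
A = [np.array([[2, 1], [1, 1]]), np.array([[1, 1], [1, 2]])]

def f(x):
    """Smooth zero-mean observable on T^2 (hypothetical choice)."""
    return np.cos(2 * np.pi * x[0]) + np.sin(2 * np.pi * (x[0] + x[1]))

def normalized_sums(omega, n_points, rng):
    """For one fixed sequence omega, return N^{-1/2} S_N(omega)(x) for
    n_points initial conditions x sampled uniformly on the torus."""
    x = rng.random((2, n_points))      # each column is an initial point
    s = np.zeros(n_points)
    for w in omega:
        x = (A[w] @ x) % 1.0           # x_k = T_{omega_k} x_{k-1}
        s += f(x)                      # accumulate the Birkhoff-type sum
    return s / np.sqrt(len(omega))
```

For a typical sequence, a histogram of the returned values should look approximately Gaussian; comparing several independently drawn sequences gives a rough check that the variance does not depend on $\omega$.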
In fact, $\Sigma^2_\wp$ depends analytically on $\wp$. Moreover, if $f$ is not a simultaneous continuous coboundary$^{5}$ for each admissible$^{6}$ map $T_i$, then $\Sigma^2_\wp > 0$.

Remark 2.2. Imposing $\Sigma^2_\wp > 0$ clearly excludes fewer observables in the random case than in the deterministic one. From Theorem 1 and the classical Livschitz Theorem [14] it easily follows that $\Sigma^2_\wp = 0$ if and only if […] is a closed orbit for some sequence of admissible maps $T_{\omega_1}, \dots, T_{\omega_k}$. Unfortunately, to use such a criterion it may be necessary to check a very large number of trajectories. Yet, if $\wp \in\, ]0,1[$, then the situation may be much simpler.
[…] is ergodic,$^{7}$ then $g$ must be constant and hence $f \equiv 0$, contrary to assumptions. That is: if $\{T_0, T_1\}$ are admissible and $T_1 \circ T_0^{-1}$ is ergodic, then $\Sigma^2_\wp > 0$.

Remark 2.3. Note that one cannot possibly extend our results to include all sequences; for instance, there exist sequences containing alternating, "deterministic", stretches of either $T_0$'s or $T_1$'s. If these stretches are of rapidly and ever increasing length, then the variance fails to exist.
Before proving such a strong result we will obtain its averaged (annealed) version.
In turn such a result is based on a fine understanding of the dynamical properties of certain transfer operators associated to the process.

$^{3}$ In the proof we use $f \in C^r(\mathbb{T}^2, \mathbb{R})$ for $r$ large enough. Yet, since our bounds for $r$ are far from optimal (nor do we strive to optimize them) we see no point in giving an explicit bound for $r$.
$^{4}$ By $\mathcal{N}(0, \Sigma^2)$ we mean the centered Gaussian random variable with variance $\Sigma^2$. The symbol $\Rightarrow$ stands for convergence in distribution. As usual, $\mathcal{N}(0,0)$ stands for the measure concentrated at zero.
$^{5}$ By a simultaneous continuous coboundary for a set of maps $\{T_i\}$ we mean that there exists a continuous function $g$ such that $f = g - g \circ T_i$ for each map $T_i$.
$^{6}$ The map $T_i$ is admissible if it appears with nonzero probability with respect to $P_\wp$. In our case the admissible maps are $\{T_0, T_1\}$ unless $\wp \in \{0,1\}$.
$^{7}$ Note that this may easily fail even if $T_0 \neq T_1$. Indeed, consider the case […]
Remark 2.5. Note that one could obtain similar results for any finite collection of smooth symplectic hyperbolic maps in any dimension or piecewise smooth maps in dimension two. This can be achieved at the price of using in the following section the functional setting of [13,5] or [7] for the piecewise smooth case.
Our first task will be to obtain some information on the spectral properties of such operators. To do so in a useful way it is necessary to introduce appropriate functional spaces. Instead of appealing to the general theory developed in [6,4,12,5,13] we will take advantage of the simplicity of the present setting and introduce explicitly a particularly simple version of such a theory. We will then see how it can be used to address the ergodic theoretical questions we are interested in.

Spectral properties of the Transfer operators
For further use (see Section 5) we need to study more general automorphisms than the ones introduced in the previous section, namely […] with $d \in \{1, 2\}$. These matrices are symplectic with respect to the symplectic form […], where $J_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ is the standard symplectic form in two dimensions.$^{8}$ Let us introduce the transfer operators $L^{(d)}$ induced by the above automorphisms. We will consider the operator obtained from the latter by averaging over the Bernoulli measure, namely […]. We also need to study perturbed operators of the form […]. In order to avoid unnecessary proliferation of indices, we set […], $L_\wp := L^{(1)}_\wp$, and $L_{g,\wp} := L^{(1)}_{g,\wp}$. Finally, notice that the transfer operator $L_\wp$ and the Markov operator $Q_\wp$ are dual: […].

To study such operators it is necessary to introduce appropriate Banach spaces (see [13,5,12]). Here, given the simplicity of the situation, we can quickly introduce and use spaces inspired by [5,12], thereby making the presentation self-contained.$^{9}$ Given […] and, for all $v \in C_+$, $(i_1, \dots, i_n) \in \{0,1\}^n$, and $n \in \mathbb{N}$, […]. Moreover one can compute that there exists $\beta > 1$ such that […].

Now, we proceed to define the norm for the Banach space we want to consider. Notice that the natural objects in these cones are not vectors but Lagrangian subspaces. Recall that, given a symplectic form $J$, a Lagrangian subspace […]. For our choice of symplectic form, every Lagrangian subspace can also be written as the set E = {v : v = −Uv} for a specific symmetric $d \times d$ matrix $U$. Our convention is to write a minus sign in front of the $U$ here, because then $E \subset C_\beta$ if and only if $\beta^{-1}\mathbb{1} \le U \le \beta\mathbb{1}$.
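The symplecticity claim is easy to check numerically. The sketch below verifies $A^{T} J A = J$ for an illustrative matrix in $SL(2,\mathbb{N})$ and for a block-diagonal lift to four dimensions. The lift $A \oplus A$ paired with the form $J_2 \oplus J_2$ is an assumption about how the $d = 2$ case is arranged (the corresponding formulas are not reproduced in the text above), and the helper names are hypothetical.

```python
import numpy as np

# Standard symplectic form in two dimensions, as in the text.
J2 = np.array([[0, 1], [-1, 0]])

def is_symplectic(M, J):
    """Check M^T J M == J exactly (integer arithmetic, no rounding)."""
    return np.array_equal(M.T @ J @ M, J)

def direct_sum(M, N):
    """Block-diagonal matrix diag(M, N)."""
    top_right = np.zeros((M.shape[0], N.shape[1]), dtype=M.dtype)
    bottom_left = np.zeros((N.shape[0], M.shape[1]), dtype=M.dtype)
    return np.block([[M, top_right], [bottom_left, N]])

# Illustrative matrix with nonnegative integer entries and determinant 1.
A0 = np.array([[2, 1], [1, 1]])
```

For any $2\times 2$ matrix one has $A^{T} J_2 A = \det(A)\, J_2$, which is why every element of $SL(2,\mathbb{N})$ passes the check; for instance `is_symplectic(direct_sum(A0, A0), direct_sum(J2, J2))` evaluates the four-dimensional case under the stated assumption.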
Let us denote the set of Lagrangian subspaces as L. For a Lagrangian subspace E and a vector k, we set E, k := sup Because the Bernoulli weights above sum to unity, 9 Actually our choice is more flexible than the one in [12], in the spirit of [5], and would allow to treat C k maps, although it is not the goal here.
A straightforward computation shows that (L (d) We begin estimating the summand. Before that, we simplify notation and rename the matrix product. To avoid the problem of too many indices, we denote it only with the subscript n and assume the dimension and the sequence to be implicit: Using (3.3) and the fact E ⊂ C − , we have Indeed, setting k n := A −1 n−1 k, where we have chosenv = −k n in the second line. On the other hand the choicě v = U −1ǩ n yields The inequality (3.6) follows from the above estimates, (3.5) and C 0 Λ n−1 |k n | ≥ |k|. Next, we consider two subcases. If |k| ≥ B β,n , then | E n , k | ≥ 1. Hence, which is a good estimate provided q is large enough so that Λ p λ −(p+q) < 1. The remainder is a finite sum which can be estimated because if Accordingly, settingμ := max{λ −p , Λ p λ −p−q } we can collect all the above inequalities as where B n = C q+2p 0 B β,n Λ np . Next, for each µ ∈ (μ, 1) choose n 0 such that C 2μ n0 ≤ µ n0 and, for each n ∈ N, write n = kn 0 + m with m ∈ {0, . . . , n 0 − 1}. One can thus iterate the second of the (3.7) and obtain which finally yields We can then consider the closure, B p,q , of C ∞ in the space of distributions with respect to the norms · p,q . It is easy to prove the following: ℘ are well defined bounded operators on B p,q , provided Λ p < λ p+q . In addition, the unit ball of B p,q is relatively compact in B p−1,q+1 .
℘ acting on B p,q has an essential spectral radius smaller than µ. The rest of the spectrum consists of finitely many eigenvalues of finite multiplicity, all in the unit disk. The only eigenvalue of modulus one is one and the constant function equal to one is the corresponding eigenfunction.
Before proceeding, let us mention that for any $r, n \in \mathbb{N}$ we endow the space $C^r(\mathbb{T}^n, \mathbb{R})$ with the norm $\|g\|_{C^r} := \sum_{s=0}^{r} \|g^{(s)}\|_\infty$. Moreover, a simple computation shows that $C^r \subset \mathcal{B}_{p,q}$, provided $r > p + 2d$.

Proof. First of all notice that, for all $f \in \mathcal{B}_{p,q}$ and $g \in C^q$, […] holds. If $p, q$ satisfy the hypothesis of Theorem 3.2, $L_\wp$ has a spectral gap $\delta_\wp > 0$. Thus, […].

Here is the last fact we need to know about the above functional analytic setting.
Clearly g ∈ C r (T 2d , R) implies g r ≤ g C r . The B p,q -norm of the product then reads (3.9) f g p,q = sup Let us analyze the second term first: The desired bound follows, if r ≥ q, from Now, look at the summand in the first term of (3.9), ignoring the l = 0 case that can be taken care of separately. 10 We are aware that our choices of norms and the subsequent estimates, are not the optimal ones. We are simply trying to simplify the arguments as much as possible even at the expense of some, not really relevant, optimality.
Notice that $|\langle E, l\rangle| \le |\langle E, k\rangle| + |\langle E, k - l\rangle|$. Thus, on the one hand […]. On the other hand […]

Hence Theorem 3.2 and Lemma 3.4 show that […] is a bounded operator on $\mathcal{B}_{p,q}$ provided $f \in C^{2p+q+5}$ and depends analytically on $\lambda$. To continue it is necessary to study the leading eigenvalue of such an operator. Note that, in general, given any positive operator $L$ on the spaces $\mathcal{B}_{p,q}$ with maximal simple eigenvalue one, with a spectral gap and $m(L\varphi) = m(\varphi)$ for each smooth $\varphi$, for any smooth complex valued function $g$ we can define the family of operators $L_\nu \varphi := L(e^{\nu g}\varphi)$ and, thanks to Lemma 3.4, the standard perturbation theory applies. Thus there exist $\phi_\nu$, $\mu_\nu$, with $\mu_0 = 1$, such that […]. Differentiating this relation with respect to $\nu$ and integrating one readily obtains$^{11}$ […]. Finally, differentiating again yields […]. Thus, by standard perturbation theory and in view of Theorem 3.2 we can write […], where the spectral radius of $R_\nu$ is smaller than $\rho < 1$ for all $|\nu| \le \nu_0$ for some $\nu_0 > 0$, and $|\mu_\nu - 1 - \mu_0' \nu - \tfrac{1}{2}\mu_0'' \nu^2| \le C|\nu|^3$ for some fixed constant $C > 0$ and $|\nu| \le \nu_0$. In addition, $Q_\nu$ is a rank one operator of the form $\phi_\nu \otimes m_\nu$ where $m_\nu$ belongs to the dual of the space, $m_0 = m$, and $|m_\nu(1) - 1| \le C|\nu|$.
If we apply the above to the operator $L_{i\lambda N^{-1/2} f,\wp}$, with $\nu = i\lambda N^{-1/2}$ (hence $g = f$ and $\phi_0 \equiv 1$), then remembering equation (4.1) it follows that […] is always nonnegative and given by […]. Moreover, if $f$ is not a $C^0$ simultaneous coboundary for the admissible automorphisms $T_i$ (see Remark 2.2), then $\Sigma^2_\wp > 0$.
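The way the eigenvalue expansion produces a Gaussian limit can be spelled out in a short worked computation. It is a sketch under two assumptions that are not displayed above: that $\mu_0' = 0$ (which is what one expects for a zero-mean observable) and that $\mu_0'' = \Sigma^2_\wp$. The remainder terms $R_\nu^N$ and the factor $m_\nu(1)$ are ignored here, since they contribute $O(\rho^N)$ and $1 + O(|\nu|)$ respectively.

```latex
% Assumptions: \mu_0 = 1, \mu_0' = 0, \mu_0'' = \Sigma_\wp^2, and \nu = i\lambda N^{-1/2}.
\begin{align*}
\mu_\nu &= 1 + \tfrac12 \mu_0'' \nu^2 + O(|\nu|^3)
         = 1 - \frac{\lambda^2 \Sigma_\wp^2}{2N} + O\!\left(N^{-3/2}\right), \\
\mu_\nu^{N} &= \exp\!\left[ N \ln\!\left( 1 - \frac{\lambda^2 \Sigma_\wp^2}{2N}
              + O\!\left(N^{-3/2}\right) \right) \right]
             = \exp\!\left[ -\frac{\lambda^2 \Sigma_\wp^2}{2} + O\!\left(N^{-1/2}\right) \right]
             \xrightarrow[N\to\infty]{} e^{-\lambda^2 \Sigma_\wp^2 / 2}.
\end{align*}
```

The limit is the characteristic function of a centered Gaussian with variance $\Sigma^2_\wp$, which is the form an averaged CLT takes at the level of Fourier transforms.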

Proof. A direct computation yields
n m(f L n ℘ f ) Using Corollary 3.3, the last sum converges exponentially fast in n. Hence Σ 2 ℘ exists and is nonnegative simply because it is the limit of a nonnegative quantity.
To address this last issue, suppose Σ 2 ℘ = 0. Then n |m(f L n ℘ f )| ≤ C uniformly in N . This means that the random variables Z N := N −1 k=0 X k are uniformly bounded in L 2 . By the Banach-Alaoglu Theorem, they form a weak-* relatively compact set. We can then extract a subsequence (N j ) ∞ j=1 such that, for each ϕ ∈ L 2 (Ω, P ℘ ), lim j→∞ E P℘ (ϕZ Nj ) = E P℘ (ϕY ) for some L 2 random variable Y . If we choose ϕ to be a function of the x only, it follows that m). On the other hand, for each smooth ϕ, That is f = g − Q ℘ g. Next, consider the L 2 random variables G n := g • π • F n and .
In fact, the process $(M_n)$ is a martingale, since […]. Thus, […]. The inequality (4.7) and the boundedness of […]. The continuity of $g$ follows from the usual Livschitz rigidity arguments.$^{12}$

In order to prove analyticity of the variance $\Sigma^2_\wp$ with respect to $\wp$, first notice that there is a positive lower bound on the spectral gap $\delta_\wp$ appearing in the proof of Corollary 3.3 in a complex neighborhood of $[0,1]$. Thus, the series in (4.6) converges uniformly in $\wp$. The partial sums are polynomials in $\wp$, hence the limit $\Sigma^2_\wp$ is an analytic function of $\wp$.

$^{12}$ In fact, in the present simple case one can provide the following direct proof: clearly $g =$ […] $f$ for an admissible choice of $T_i$, convergence taking place in the $\|\cdot\|_{p,q}$ norm. Let $v_{u,s}$ be the unstable and stable vectors of $T_i$, respectively, and $\varphi \in C^\infty$. Then […]. On the other hand, $g = \sum_{k=0}^{n} f \circ T_i^{k} + g \circ T_i^{n+1}$, and the mixing of $T_i$ (proven exactly as in […]). Taking the sup over $\{\varphi \in C^\infty : \|\varphi\|_{L^1} = 1\}$, it follows that $\nabla g \in L^\infty$, which implies $g \in W^{1,2}$. Hence, by Morrey's inequality, $g \in C^0$.
We finish the section with two simple but important results.
Proof. The argument follows verbatim the previous discussion. Thus to prove the Lemma we only need to compute the second derivative of the leading eigenvalue, which we still denote $\mu_\nu$, and to show that $\mu_0'' = 2\Sigma^2_\wp$. In analogy with (4.3), […], where $m_2$ is the normalized Lebesgue measure on $\mathbb{T}^4$. Then […].

[…] There exists $L_0 > 0$ such that, for all $L \in (0, L_0)$, the following estimate holds: […].

Proof. This is an averaged large deviation estimate and can be obtained exactly as the averaged CLT was obtained. Although the idea is standard we give here a sketch of the proof. For any random variable $Y$, for each $\beta > 0$, […]. We again apply the perturbation theory techniques from the beginning of this section to estimate the right-hand side. Using (4.4) with $\nu = \beta N^{-1}$, $g = f$, $\phi_0 = 1$, we have […]. If we define the Legendre transform $I_C(L) = \sup_{|\nu| \le C} \left( L\nu - \ln \mu_\nu \right)$ and we call $\nu_*$ the value at which the sup is attained, then choosing $\beta = \nu_* N$ we have […]. To compute $I_C(L)$ explicitly we expand […]. Minimizing this quadratic expression leads to the value $\nu_* = L/\mu_0''$ and gives (recalling $\mu_0'' = \Sigma^2_\wp$) the estimate, provided $L \le C_\epsilon$ where $C_\epsilon$ is small.
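The quadratic approximation of the rate function used in the last step can be written out explicitly. This is a sketch assuming $\mu_0 = 1$ and $\mu_0' = 0$ (the latter is expected for a zero-mean observable but is not displayed above), with higher-order terms controlled by the cubic bound on $\mu_\nu$.

```latex
% Assumptions: \mu_0 = 1, \mu_0' = 0, |\nu| small.
\begin{align*}
\ln \mu_\nu &= \tfrac12 \mu_0'' \nu^2 + O(|\nu|^3), \\
L\nu - \ln \mu_\nu &= L\nu - \tfrac12 \mu_0'' \nu^2 + O(|\nu|^3), \\
\frac{d}{d\nu}\Big( L\nu - \tfrac12 \mu_0'' \nu^2 \Big) = 0
  &\iff \nu_* = \frac{L}{\mu_0''}, \\
I_C(L) &= \frac{L^2}{2\mu_0''} + O(L^3) = \frac{L^2}{2\Sigma_\wp^2} + O(L^3).
\end{align*}
```

The stationary point $\nu_*$ lies in the admissible range $|\nu| \le C$ only when $L$ is small, which is consistent with the restriction to $L \in (0, L_0)$ in the statement.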

Quenched CLT
Now that we have the CLT on average we would like to establish it for a large class of sequences. Let $\Sigma^2_\wp$ be the variance of the averaged CLT with respect to the Bernoulli process with parameter $\wp$. We wish to show that for $P_\wp$-almost all sequences $\omega$ we have the CLT with variance $\Sigma^2_\wp$. To this end we start with an $L^2$ estimate: assuming that $Y_N$ is a sequence of random variables such that $\bar{Y} := \lim_N E_\wp(Y_N)$ exists and is real, we can compute […]. Thus, recalling the notation $f_{\omega,k} := f \circ T_{\omega_k} \circ \cdots \circ T_{\omega_1}$ and the bound (4.5), […] (5.1).

The first term on the right-hand side can be conveniently reinterpreted by introducing a product system. That is, consider the maps $T_{\omega_k} \oplus T_{\omega_k} : \mathbb{T}^4 \to \mathbb{T}^4$, which are represented by the block matrices […]. By Lemma 4.2 and by (5.1), […]. By the Chebyshev inequality the above estimate implies […].

One would then like to prove almost sure convergence by applying a Borel-Cantelli argument, but two problems are in the way: on the one hand the sum over $N$ of the above bound diverges; on the other hand one wants the limit to hold almost surely for all $\lambda$, that is, one has potentially uncountably many sets to deal with. Both problems can be dealt with by applying Borel-Cantelli to subsequences and then showing that, by controlling the limit along such subsequences, one controls the limit for each $N$ and $\lambda$. First of all, notice that […]. On the other hand, notice that the estimate (4.8) in Lemma 4.3 also implies […]. Next, consider $b \in (\tfrac{1}{2}, 1)$ and the sets$^{13}$ […]. We can then write […], where we have assumed $\Sigma_\wp k^{-1} \le \varepsilon$ in order to deal with the difference e […] − 1.

$^{13}$ Here $[x]$ stands for the integer closest to $x$.
We can estimate the above expression by […]. Thus, remembering (5.3), […]. Then the estimates (4.8), (5.4) and (5.2) imply, for $k \ge \Sigma_\wp \varepsilon^{-1}$, […], from which it follows that the sum over $k$ is finite. By Borel-Cantelli it follows that the above events […]. Here we used the fact that, for each fixed $N$, $|\lambda| \le \log_2 N$ implies $(N, \lambda) \in J_{\lfloor \log_2 N \rfloor}$. Let us call $\Omega_\varepsilon$ the bad set of sequences, involving $N_\varepsilon = \infty$. It is a set that increases as $\varepsilon$ decreases, such that $P_\wp\big(\bigcup_{\varepsilon > 0} \Omega_\varepsilon\big) = \lim_{\varepsilon \downarrow 0} P_\wp(\Omega_\varepsilon) = 0$; the bad set is independent of $\varepsilon$.

This concludes the proof and establishes the almost sure CLT, where almost sure means that, fixing any Bernoulli measure, the set of the sequences for which we do not have the CLT has zero measure. Note, however, that the limit (more precisely, the variance) is not constant but depends on $\wp$. This is natural since the deterministic limits $\wp = 0$ and $\wp = 1$ generically have different variances and, as $\wp$ varies, the variance should interpolate smoothly between these two extremal values, which indeed is confirmed by Lemma 4.1.