Regularity and convergence rates for the Lyapunov exponents of linear co-cycles

We study linear co-cycles in GL(d,R) (or C) depending on a parameter (in a Lipschitz or Hölder fashion) and establish Hölder regularity of the Lyapunov exponents for the shift dynamics on the base. We also obtain rates of convergence of the finite volume exponents to their infinite volume limits. The technique is that developed jointly with Michael Goldstein for Schrödinger co-cycles. In particular, we extend the Avalanche Principle, which had been formulated originally for SL(2,R) co-cycles, to GL(d,R).


Introduction
Let (X, Σ, m; θ) be an ergodic space, where m is an invariant measure under the ergodic transformation θ. Let A : X → GL(d, K), where K = R or K = C, be a measurable transformation such that $\log\|A(x)\|,\ \log\|A^{-1}(x)\| \in L^1(X, m)$. Define the co-cycle as $\tilde A : X \times K^d \to X \times K^d$, $\tilde A(x, v) := (\theta x, A(x)v)$. Powers of $\tilde A$ lead to matrix products $A^{(n)}_x := A(\theta^{n-1}x) \cdots A(\theta x) A(x)$. The classical Oseledec theorem [Ose], also known as the multiplicative ergodic theorem, guarantees a filtration by linear subspaces, i.e., for a.e. x ∈ X there exist linear subspaces $K^d = V^r_x \supsetneq V^{r-1}_x \supsetneq \dots \supsetneq V^1_x \supsetneq V^0_x = \{0\}$ and a sequence of positive numbers $\mu_r > \mu_{r-1} > \dots > \mu_1$ associated with this filtration, which determine the Lyapunov exponents as follows. One has that $\dim V^j_x$ is a.e. constant, and furthermore, the (set) complement of $V^{j-1}_x$ in $V^j_x$ is characterized by the property that $\frac{1}{n} \log\|A^{(n)}_x v\| \to \log\mu_j$ as n → ∞. The Lyapunov exponents are precisely the numbers $\{\log\mu_j\}_{j=1}^{r}$ counted with multiplicity given by $\dim V^j_x - \dim V^{j-1}_x$. Finally, we remark that $A(x)V^j_x = V^j_{\theta x}$ for a.e. x. See also [Rag, Rue, Led].
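As a numerical illustration of the finite-scale quantity n^{-1} log of the norm of the matrix product along an orbit, the following is a minimal sketch for 2×2 real co-cycles. The helper names (`matmul`, `op_norm`, `finite_scale_exponent`), the choice of base map (a rotation by the golden mean) and the constant symmetric matrix used as a sanity check are assumptions made for illustration only; they are not taken from the text.

```python
import math

def matmul(a, b):
    """Product a*b of 2x2 matrices stored as ((a11, a12), (a21, a22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def op_norm(m):
    """Operator norm (largest singular value) of a real 2x2 matrix."""
    s = sum(v * v for row in m for v in row)      # sigma_1^2 + sigma_2^2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]   # +/- sigma_1 * sigma_2
    disc = max(s * s - 4.0 * det * det, 0.0)
    return math.sqrt((s + math.sqrt(disc)) / 2.0)

def finite_scale_exponent(A, theta, x, n):
    """Return (1/n) * log || A(theta^{n-1} x) ... A(theta x) A(x) ||.

    The running product is renormalized at every step to avoid overflow;
    the logarithms of the scaling factors are accumulated instead.
    """
    prod = ((1.0, 0.0), (0.0, 1.0))
    log_scale = 0.0
    for _ in range(n):
        prod = matmul(A(x), prod)                 # newest factor goes on the left
        r = op_norm(prod)
        prod = tuple(tuple(v / r for v in row) for row in prod)
        log_scale += math.log(r)
        x = theta(x)
    return log_scale / n

# Sanity check (illustrative): a constant symmetric matrix, for which the norm of
# the n-th power equals the n-th power of the top eigenvalue (3 + sqrt(5)) / 2.
OMEGA = (math.sqrt(5.0) - 1.0) / 2.0
rotation = lambda x: (x + OMEGA) % 1.0
constant = lambda x: ((2.0, 1.0), (1.0, 1.0))
estimate = finite_scale_exponent(constant, rotation, 0.0, 200)
exact = math.log((3.0 + math.sqrt(5.0)) / 2.0)
```

For a genuinely x-dependent A one would average `finite_scale_exponent` over many starting points x to approximate the integral defining the finite-scale exponent.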
The Lyapunov exponents are of fundamental importance in dynamics and have been widely studied, see the encyclopedic treatment in [BarPes]. For more recent research literature see for example [AviVia1,AviVia2,BonVia,BGMV]. Results for co-cycles with shift base (but in a very different spirit from this paper) can be found in [Avi, AviKri, FayKri, Kri].

Statement of the main results
We mainly consider co-cycles whose base dynamics exhibits very weak mixing properties. Let A : T → GL(d, K) be continuous, and define the co-cycle $\tilde A(x, v) := (x + \omega, A(x)v)$ on $T \times K^d$, where ω is Diophantine, which will mean that $\|n\omega\| \ge \frac{c(\omega)}{n(\log n)^a}$ for all n ≥ 2 (4) with a > 1 arbitrary but fixed; here ‖·‖ denotes the distance to the nearest integer. Lebesgue almost every ω ∈ (0, 1) satisfies this condition. The purpose of this note is to point out a mild regularity result that the Lyapunov exponents exhibit as functions of a parameter which A is assumed to depend on. To be precise, we prove the following theorem where $(\mathcal{E}, d)$ is a compact metric space. The study of Lyapunov exponents (especially, conditions which ensure positivity of the top exponent) of linear co-cycles with deterministic base dynamics goes back to the seminal work by M. Herman [Her].
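The Diophantine condition (4) can be probed numerically over a finite range of n. The sketch below computes the smallest value of n (log n)^a times the distance from nω to the integers for 2 ≤ n ≤ n_max; the function names and the test frequencies (the golden mean and the rational 1/2) are illustrative assumptions. A finite check of this kind can only suggest, never prove, that (4) holds for a given ω.

```python
import math

def dist_to_integers(t):
    """Distance from the real number t to the nearest integer."""
    return abs(t - round(t))

def diophantine_constant(omega, a, n_max):
    """Smallest value of ||n*omega|| * n * (log n)**a over 2 <= n <= n_max.

    If condition (4) holds with constant c(omega), this number is >= c(omega).
    """
    return min(
        dist_to_integers(n * omega) * n * math.log(n) ** a
        for n in range(2, n_max + 1)
    )

golden = (math.sqrt(5.0) - 1.0) / 2.0
c_golden = diophantine_constant(golden, 1.5, 10_000)   # stays bounded away from 0
c_rational = diophantine_constant(0.5, 1.5, 10_000)    # vanishes already at n = 2
```

Floating-point rounding limits how large `n_max` can sensibly be taken, since the distance to the integers must remain well above the rounding error of `n * omega`.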
Theorem 1. Suppose $A : T \times \mathcal{E} \to GL(d, K)$ is continuous (K = R, C), and analytic as a function $x \mapsto A(x, E)$ uniformly in $E \in \mathcal{E}$. Furthermore, suppose $E \mapsto A(x, E)$ is Hölder continuous, uniformly in x ∈ T. Assume that the Lyapunov exponents satisfy the gap condition $\lambda_j(E) - \lambda_{j+1}(E) > \kappa > 0$ for all $E \in \mathcal{E}$ and $1 \le j < d$ (5). Then all $\lambda_j(E)$ are Hölder continuous as a function of $E \in \mathcal{E}$. Moreover, if (5) holds at some point $E_0 \in \mathcal{E}$, then each $\lambda_j(E)$ is Hölder continuous locally around $E_0$. In other words, if all exponents are distinct at $E_0$, then they are all Hölder continuous locally around $E_0$, and therefore also remain distinct near $E_0$.
The proof is based on an adaptation of the technique developed jointly with Michael Goldstein in [GolSch1] for SL(2, R) Schrödinger co-cycles. In fact, [GolSch1] dealt with Schrödinger co-cycles for which E plays the role of the "energy" (spectral parameter), and one needed to impose the condition of positive Lyapunov exponents. In Theorem 1 this positivity is exactly mirrored by the gap condition. For Schrödinger co-cycles Hölder regularity of the exponents is best possible; this follows from the connection between the Lyapunov exponent and the integrated density of states as expressed by the Thouless formula, see [Bou1].
The Hölder class $C^\alpha$ which our argument produces for the Lyapunov exponents depends on the size of the gaps, i.e., α depends on κ in (5). Note that for SL(2, R) co-cycles for which the larger exponent is positive, one can refine the argument in [GolSch1] in such a way that α does not depend on κ, see [Bou2, Chapter 8]. The point here is that for such SL(2, R) co-cycles the exponents are λ(E) > 0 > −λ(E) whence one can take κ > λ. This property allows one to obtain uniform control on the Hölder exponent α even as κ → 0. For GL(d, K) with d > 2 there is no relation between the size of the gaps and the individual exponents, and it therefore remains an open question as to how α may deteriorate as κ → 0, let alone what happens when the gap condition is violated. For example, one might expect that the top exponent remains Hölder continuous provided (5) holds with j = 1. However, the argument presented in this paper breaks down completely if, say, $\lambda_1 > \lambda_2 = \lambda_3$. Two more comments are in order:
• In addition to the regularity of the exponents, we also obtain an estimate on the rate of convergence of the finite-scale Lyapunov exponents. To be specific, we show in Section 6 that $|\lambda_{j,n}(E) - \lambda_j(E)| \le \frac{C}{n}$ (6) for all n ≥ 1 where $\lambda_{j,n}$ are the finite volume exponents. Moreover, one has the following dichotomy: (i) either (6) is optimal along a sequence of n → ∞ or (ii) the convergence in (6) is exponentially fast.
• It is of course natural to inquire about other types of dynamics with weak mixing properties, such as shifts x → x + ω on higher-dimensional tori $T^\nu$, as well as shifts on T for which ω satisfies weaker Diophantine conditions than (4). In these cases one has a somewhat weaker result than Theorem 1. While the Lyapunov exponents are again continuous, they are not seen to be Hölder by the technology used here. Rather, one obtains regularity of the form $|\lambda_j(E) - \lambda_j(E')| \le C \exp\big(-c\,|\log d(E, E')|^{\sigma}\big)$ (7) with some 0 < σ < 1 and c > 0. See Section 7.
We remark that (5) implies $\dim V^j_x(E) = j$ for all $1 \le j \le d$, where $\{V^j_x(E)\}_j$ denotes the Oseledec filtration in dependence of E.
The technique developed in [GolSch1] is based on the following two main ingredients:
• The Avalanche Principle (AP) for 2 × 2 matrices. This is a deterministic statement.
• A Large Deviation Theorem (LDT) for the matrices $A^{(n)}_x$. These are quantitative versions of the Fürstenberg-Kesten theorem, or of Kingman's sub-additive ergodic theorem. The LDTs are analytical statements, and they depend on the structure of the co-cycles and the dynamics on the base.
The LDT from [GolSch1], see also [BouGol] and [Bou2], applies easily to higher-dimensional co-cycles. On the other hand, the AP does not carry over from the 2 × 2 case without changes. In fact, the gap condition is intimately tied to the formulation of the AP as it appears in the following section. In the final section of this paper we apply the main strategy to the products of random matrices and derive the same type of conclusions as we did for the shift dynamics.
This note is organized as follows. In Section 3 we discuss the deterministic AP. In Section 4 we discuss the large deviation estimates for shifts on T under the Diophantine condition (4). Here the analytic dependence of A on x is important. The proof of Theorem 1 is presented in Section 5, and sharp convergence rates for the Lyapunov exponents are presented in Section 6. The final two sections are very sketchy, and discuss other types of dynamics on the base. While Section 7 deals with more general shifts, Section 8 covers products of random matrices.

The Avalanche Principle
The point of Lemma 1 below, which is a completely deterministic result, is that it effectively allows one to linearize long (non-commuting) matrix products $A_n A_{n-1} \cdots A_2 A_1$. One cannot expect such a result for general products, even if each factor $A_j$ has large norm. The mechanism which underlies this linearization is as follows:
• we assume that each matrix has a dominant simple singular value, see (9).
• no adjacent pair $A_{j+1} A_j$ cancels, where cancellation means that the dominant stretching of $A_j$ is annihilated by $A_{j+1}$, see (10).
Appropriate quantitative formulations of these properties guarantee that the whole product A n . . . A 1 is "to leading order" nothing other than the composition of the 1-dimensional dominant actions, see (11). This is what we mean by linearization.
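The linearization can be observed numerically in the classical SL(2, R) setting of [GolSch1]. The sketch below evaluates the quantity log‖A_n ⋯ A_1‖ + Σ_{j=2}^{n-1} log‖A_j‖ − Σ_{j=1}^{n-1} log‖A_{j+1}A_j‖, which is the expression whose averaged form appears in the proof of Theorem 1; that (11) has exactly this shape in GL(d, K) is an assumption here. The test matrices (a rotation composed with diag(L, 1/L)), the value L = 100 and the angles are arbitrary illustrative choices, as are all function names.

```python
import math

def matmul(a, b):
    """Product a*b of 2x2 matrices stored as ((a11, a12), (a21, a22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def op_norm(m):
    """Operator norm (largest singular value) of a real 2x2 matrix."""
    s = sum(v * v for row in m for v in row)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return math.sqrt((s + math.sqrt(max(s * s - 4.0 * det * det, 0.0))) / 2.0)

def rot_stretch(theta, stretch):
    """R(theta) * diag(stretch, 1/stretch): an SL(2,R) matrix of norm `stretch`."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c * stretch, -s / stretch), (s * stretch, c / stretch))

def avalanche_defect(mats):
    """Defect of the avalanche expansion for mats = [A_1, ..., A_n]."""
    n = len(mats)
    prod = mats[0]
    for m in mats[1:]:
        prod = matmul(m, prod)                       # builds A_n ... A_1
    total = math.log(op_norm(prod))
    total += sum(math.log(op_norm(m)) for m in mats[1:n - 1])      # j = 2..n-1
    total -= sum(math.log(op_norm(matmul(mats[j + 1], mats[j])))   # j = 1..n-1
                 for j in range(n - 1))
    return total

# Large norms and no cancellation between neighbours: the defect is tiny.
angles = [0.3, -0.2, 0.25, 0.1, -0.3, 0.2, 0.15, -0.1]
factors = [rot_stretch(t, 100.0) for t in angles]
defect = avalanche_defect(factors)
```

In this example each factor has norm 100 and adjacent products lose only a factor |cos θ_j| relative to the product of the norms, so both hypotheses of the principle are comfortably satisfied.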
Lemma 1. Let {A j } n j=1 ⊂ GL(d, K) satisfy the following properties: for each 1 ≤ j ≤ n there exists a 1-dimensional subspace S j ⊂ K d such that In addition, we assume that Then for some absolute constant C.
Proof. We denote by π j the orthogonal projection onto S j , and by π ⊥ j that onto S ⊥ j . Define T j := A j S j and Since A j is invertible, we see that T j is a line. By a polar decomposition, each A j =Ũ j D j U j where D j is diagonal, andŨ j , U j are unitary. Moreover, In particular, where each ε j is either the empty symbol, or equals ⊥. Consider the contribution to (14) where all ε j equal the empty symbol: Comparing (16) with (15) yields Here we used that In conclusion, as well as Next, we note that Combining (19) with (17) and (18) implies (11).
Note that Lemma 1 remains unchanged if we replace any A j with a nonzero multiple of itself. Also, for the scalar case (11) is identically zero. The fact that S j are lines, rather than planes or higher-dimensional subspaces, played a crucial role in the proof. Indeed, for higher-dimensional spaces we would need to replace (12) with Then (13) remains valid withλ j instead of λ j . However, in the estimation of the dominant branch of (14) these quantities are not sufficient. Instead, we could only use the lower bounds and it is not possible to conclude as we did for dim(S j ) = 1. Note that for the case of SL(2, R) matrices as in [GolSch1] the line-condition is automatic. To be precise, we simply take α j = A j −1 so that condition (9) becomes A j ≥ 4µ.
To illustrate this point further, consider orthogonal projections $P_j$ in $R^3$ of rank 1. Let $A_j := P_j + \varepsilon(I - P_j)$ (20) where ε > 0 is small but fixed. This ensures that $A_j \in GL(3, R)$. Then $\|A_j\| = 1$, and $\|A_{j+1} A_j\| \to |\cos\theta_j|$ as ε → 0 where $\theta_j$ is the angle between the range of $P_j$ and that of $P_{j+1}$. Moreover, Lemma 1 gives the precise conditions in terms of (10) and (9) such that (11) holds for non-zero ε in (20). Note that the former condition simply reads $|\cos\theta_j| > \mu^{-1/4}$. On the other hand, define now $A_j := (I - P_j) + \varepsilon P_j$. In that case $\|A_{j+1} A_j\| = 1$ for any 0 < ε ≤ 1 since 2-planes have a non-zero intersection in $R^3$. It follows that (10) trivially holds, as does (9) at least when ε → 0. However, (11) becomes for any ε > 0 which is absurd since the left-hand side can be zero in the limit ε → 0, and we may also send µ → ∞ as ε → 0.

Lemma 2.
Under the Diophantine condition on ω, see (4), one has for any δ > 0 where c, C, b are positive constants depending on A, ω, and n ≥ 2. Here Proof. The argument proceeds in two steps, the first step being analytical.
The map x → n −1 log A (n) x (E) extends to a neighborhood A of T in the complex plane as a subharmonic function, uniformly in E ∈ E. Moreover, we have for some M = M(A, E) as well as Properties (23) and (24) ensure that Theorem 3.8 in [GolSch1] applies to the subharmonic function z (E) uniformly in n. The conclusion is as follows: for all δ > 0, and any m ≥ 1, The second step is to note the almost invariance of the co-cycles: To obtain (21) we now set m := δn/C in (25) and conclude via (26).
By general principles, as a monotone decreasing sequence. Indeed, it follows from the co-cycle relation A Averaging in x implies (n + m)λ 1,n+m ≤ nλ 1,n + mλ 1,m , whence the claim. We now turn to the exterior powers of A (n) x , which we denote by Λ p A (n) x . On a general vector space V with a linear operator L : V → V one defines where f is an alternating linear form on (V * ) p . We have the following more general version of the LDT, which allows us to control the Lyapunov exponents λ j with j ≥ 2.
Lemma 3. Under a suitable Diophantine condition on ω one has for any δ > 0 where c, C, b are positive constants depending on A, ω, and n ≥ 2. The λ i,n are defined inductively via Proof. The proof is essentially the same as that of the previous lemma; indeed, we retain the analyticity of on the annulus A uniformly in E ∈ E. This allows us to apply the subharmonic machinery as before. Moreover, we have the product structure which ensures the almost invariance (26) (note that Λ p A remains invertible). So the argument leading to the previous lemma applies.
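The product structure invoked in the proof is the multiplicativity of exterior powers, Λ^p(AB) = (Λ^p A)(Λ^p B), which is the Cauchy-Binet formula. A minimal sketch for p = 2 follows: in the basis e_i ∧ e_j (i < j) the matrix of Λ^2 A consists of the 2×2 minors of A. The function names and the integer test matrices are illustrative assumptions; integers are used so that the identity can be checked exactly.

```python
from itertools import combinations

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def second_exterior_power(a):
    """Matrix of Lambda^2 a in the basis e_i ^ e_j with i < j.

    The entry in row (i, j) and column (k, l) is the 2x2 minor
    a[i][k] * a[j][l] - a[i][l] * a[j][k].
    """
    pairs = list(combinations(range(len(a)), 2))
    return [[a[i][k] * a[j][l] - a[i][l] * a[j][k] for (k, l) in pairs]
            for (i, j) in pairs]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1, 2, 0], [0, 1, 5], [3, 0, 1]]

lhs = second_exterior_power(matmul(A, B))                          # Lambda^2 (A B)
rhs = matmul(second_exterior_power(A), second_exterior_power(B))   # (Lambda^2 A)(Lambda^2 B)
```

Since the norm of Λ^p M is the product of the p largest singular values of M, this multiplicativity is what lets the sums λ_1 + ... + λ_p be treated as top exponents of a new co-cycle.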
In addition, we note that as decreasing limits of continuous functions, the sums are upper-semicontinuous. But more regularity is harder to obtain.

The proof of Theorem 1
Proof of Theorem 1. Normalizing, we may assume that A : For simplicity, we consider the case of the two largest Lyapunov exponents first, and our goal is to show that the largest Lyapunov exponent is Hölder.
In other words, j = 1 in (5). Fix E 0 ∈ E. We may choose n 0 such that where δ 0 > 0 is a small constant which will be specified later. By (27) 1 Define B n ⊂ T as the set of x ∈ T for which with p = 1 or p = 2. By Lemma 3, |B n | < e −cδ 2 0 n for n large enough. On B c n we have Let N > n be a multiple of n (for simplicity of notation) and write We wish to apply the AP to this product with A j := A (n) x+jnω (E 0 ). In order to do this, we take N = ⌊e δ 1 n ⌋ where 0 < δ 1 ≪ δ 2 0 is another constant. The union of N/n = k shifted copies of B n has measure < N n e −cδ 2 0 n < e −(cδ 2 0 −δ 1 )n < e −cδ 2 0 n/2 Denoting this set again by B n , we see that conditions (9) hold with µ determined by (32), i.e., µ = e n(κ−4δ 0 ) > N C where C is a large constant, say C = κ/(2δ 1 ). By the LDT applied to A (2n) x (E 0 ) and A (n) x (E 0 ), we may also assume that (10) holds on B c n for all 0 ≤ j < k. Hence, (11) holds, viz., provided x ∈ B c n . The right-hand side of (33) is of the form N −C for a large constant C, as is the measure of B n . Averaging (33) over B c n therefore yields |Nλ 1,N (E 0 ) + (k − 2)nλ 1,n (E 0 ) − 2(k − 1)nλ 1,2n (E 0 )| < N −C which further implies, for all n ≥ n 0 , and with N = ⌊e δ 1 n ⌋ where δ 1 depends on κ, We now argue that (34) remains valid if we replace E 0 with a nearby E. To see this, we first run the argument leading up to (34) for all n ∈ [n 0 , e n 0 ] =: N 0 and note that (32) in that case remains valid near E 0 ; in fact on some ball B := B(E 0 , ε) ⊂ E where ε depends on κ and n 0 . For example, we may take ε = exp(−Ce n 0 ) with some large C. Furthermore, the condition (9) and (10) also remain valid on that ball. Therefore, (34) does hold for E ∈ B with n ∈ N 0 . In particular, with The idea is now to combine this estimate with one for λ 2,N (E). This will allow us to show that the gap between the {λ j,n (E)} d j=1 does not shrink by much when we pass from scale n ∈ N 0 to scale N ∈ N 1 with E ∈ B. 
If d = 2 we are of course done, see (28). If d > 2, then we need to re-run the AP-argument for Λ 2 A (n) x (E 0 ). This is legitimate, since the two largest exponents of the Λ 2 A co-cycle are The same AP argument as before therefore yields and thus, with N ∈ N 1 related to n ∈ N 0 as before, Subtracting this from (35) yields By the same token, we have this property for all gaps, i.e., for any 1 ≤ j < d, provided δ 0 is sufficiently small, and n 0 large. We claim that we may now iterate this argument without shrinking B. First, we define {⌊e δ 1 n ⌋ | n ∈ N ℓ } =: N ℓ+1 for each ℓ ≥ 1 and note that by construction In fact, N ℓ and N ℓ+1 overlap. To n ∈ N 0 associate N ∈ N 1 as above, and then set N = ⌊e δ 1 N ⌋ ∈ N 2 . We may now repeat the argument leading to (34), (36), respectively, for the N, N scales. Indeed, these estimates now become for all E ∈ B. Next, we replace λ 1,N (E 0 ) in (34) (with E ∈ B instead of E 0 ) with λ 1,2N (E). Subtracting the resulting estimate from (34) implies that for all N ∈ N 1 , and similarly for the other exponents. In combination with (38) and (37) this allows us to conclude that for all E ∈ B, It is essential here that we do not lose more factors of δ 0 , but only subtract terms such as on the right-hand side which are summable over a sequence of scale N j ∈ N j . In view of (40) we can indeed repeat the arguments again for the next scale N ′ := ⌊e δ 1 N ⌋ ∈ N 3 and so on. This establishes our claim concerning infinite iterations while keeping the ball B fixed.
In particular, (39) will hold for all large N. Summing these estimates over a sequence 2 k N implies that and all large N and all 1 ≤ j ≤ d. In particular, we have the uniform gaps Note that we have shown that the validity of (42) at one point E 0 implies its validity for all E near E 0 .
To establish the Hölder regularity of the exponents, we first replace λ 1,N (E 0 ) with λ 1 (E) in (34) which yields and all large N (the relation between n and N is as before). The idea is now to pick two E, E ′ ∈ B and then to compare the resulting estimates (43) with an N that is adjusted to d(E, E ′ ), the distance between E, E ′ ∈ E. To be specific, we claim the crude bound for some β 0 ∈ (0, 1]. This follows from the assumed Hölder regularity of E → A(x, E) with a constant C 0 that depends on A. Indeed, expanding the co-cycle product and writing the difference in a telescoping fashion leads to the estimate Furthermore, for all n ≥ 1 as claimed.
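The telescoping step rests on the identity A_n ⋯ A_1 − B_n ⋯ B_1 = Σ_k A_n ⋯ A_{k+1} (A_k − B_k) B_{k-1} ⋯ B_1, which for factors of norm at most M gives the crude bound n M^{n-1} max_k ‖A_k − B_k‖ on the difference of the products. The sketch below verifies the identity exactly on small integer matrices and checks the bound in the (submultiplicative) Frobenius norm; all names and test matrices are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def identity(d):
    return [[1 if i == j else 0 for j in range(d)] for i in range(d)]

def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def frob(a):
    """Frobenius norm, which is submultiplicative."""
    return math.sqrt(sum(x * x for row in a for x in row))

def ordered_product(mats, d):
    """Return M_k ... M_1 for mats = [M_1, ..., M_k] (identity if empty)."""
    result = identity(d)
    for m in mats:
        result = matmul(m, result)
    return result

def telescoping_sum(As, Bs):
    """Sum over k of A_n...A_{k+1} (A_k - B_k) B_{k-1}...B_1."""
    d = len(As[0])
    total = [[0] * d for _ in range(d)]
    for k in range(len(As)):
        left = ordered_product(As[k + 1:], d)
        right = ordered_product(Bs[:k], d)
        total = add(total, matmul(matmul(left, subtract(As[k], Bs[k])), right))
    return total

As = [[[1, 2], [0, 1]], [[2, 0], [1, 1]], [[1, 1], [1, 0]], [[0, 1], [1, 3]]]
Bs = [[[1, 2], [1, 1]], [[2, 0], [1, 2]], [[1, 1], [1, 0]], [[0, 1], [2, 3]]]

difference = subtract(ordered_product(As, 2), ordered_product(Bs, 2))
M = max(frob(m) for m in As + Bs)
gap = max(frob(subtract(a, b)) for a, b in zip(As, Bs))
crude_bound = len(As) * M ** (len(As) - 1) * gap
```

The exponential factor M^{n-1} is the reason the comparison between two parameter values is only useful at scales n that are logarithmic in the inverse distance between them.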
We infer from (43), (44) that Optimizing over N yields the desired result: where γ = γ(κ, β 0 ). The same argument applies to the other exponents by repeating these considerations for the powers Λ p A (n) x (E).
It is natural to ask whether λ 1 (E) remains Hölder if λ 1 (E) − λ 2 (E) > 0. In other words, if there is a gap between the two largest Lyapunov exponents, does it follow that the top exponent is Hölder? The proof we just gave does not show this, since it relies on statements about λ 2,n (E) over different scales n, uniformly in E. Such control can only be obtained via the AP, at least within the confines of our methods. However, the AP for Λ 2 A (n) x requires a gap between λ 2 and λ 3 , and so on. So it is really important for our argument that all exponents are distinct, even to show that the top one is Hölder regular.
We remark that the proof we just gave is not optimal in some ways. The main improvements one can make relate to various upper bounds we used, such as (44) where one can insert the Lyapunov exponent λ 1 (E) into the exponent, viz.
for large n; here we are using that $\lambda_1(E) > 0$ which follows from (5). To obtain (48), one relies on the upper bound $\sup_{x \in T} \log\|A^{(n)}_x(E)\| \le n\lambda_1(E) + (\log n)^{C_0}$ for large n, and analogously Here $C_0$ is a constant that depends on ω, and A. For a proof, see [GolSch2, Proposition 4.3]. Results of this type played an important role in the finer results required for the Cantor structure of the spectrum, see [GolSch2, GolSch3].
Moreover, they were used by Bourgain in [Bou2, Chapter 8] to show that the Hölder exponent does not depend on the Lyapunov exponent (as long as it is positive) for Schrödinger co-cycles. In addition to these finer upper bounds one also needs to improve on (21) to accomplish this, namely by removing the dependence on $\delta^2$ on the right-hand side. The appearance of this $\delta^2$, too, is closely related to a crude upper bound, see (26). Indeed, it is natural that we may place the Lyapunov exponent on the right-hand side of (26), which then shows that for $\delta \simeq \lambda_1$ we may put δ on the right-hand side of (21) rather than $\delta^2$. This is exactly what is done in [Bou2, Chapter 8].
However, we do not seem to gain anything from these improvements, which is why we have not implemented them in the general GL(d, R) case. As already mentioned in Section 2, for d > 2 there is no relation between the size of the gaps and the Lyapunov exponents themselves. Thus, unlike the d = 2 case, the methods of this paper cannot possibly lead to gap-independent Hölder classes.

Rates of convergence
As a byproduct of the argument presented in the previous section we obtained the convergence rates (41), viz.
Here E is as in Theorem 1 (cover the compact space E by finitely many open balls B as in the previous section). The purpose of this section is to remove the log n factor from this rate. This is the analogue of [GolSch1,Theorem 5.1] where the same result was obtained for SL(2, R) Schrödinger co-cycles.

Proposition 2. Under the assumptions of Theorem 1 one has
for all n ≥ 1. The constant C depends on κ, A, E, ω.
Proof. We write, A (2n) x (E) The idea is now to apply the AP to the three matrix products A (2n) x+nω (E), and A (n) x (E) with factors of the form A (ℓ j ) x+m j ω (E). Here n is large and for some large C 1 . This can be justified as in Section 5 for all x ∈ B c n where |B n | < n −10 , say (taking C 1 large). Subtracting the resulting representation (11) for log A (n) x+nω (E) and log A (n) x (E) from that for log A (2n) for all x ∈ B c n and some ℓ ≃ log n. The power n −1 can be improved on the right-hand side but this is of no consequence. Integrating (50) over T and noting that the integral over B n makes a negligible contribution yields (dropping E for simplicity) Setting R(n) := 2n|λ 1,2n − λ 1,n | we infer that R(n) ≤ R(ℓ) + C n Iterating and summing yields R(n) ≤ C for all n ≥ 1 as claimed. Applying the exact same reasoning to the exterior powers Λ p A yields the analogous estimate for the other exponents.
We now turn to the question of whether (49) is an optimal estimate. As the example of constant A shows, this of course need not be the case. However, based on the observation from Section 5 that the quantities λ j,2ℓ (E) − λ j (E) and λ j,ℓ (E) − λ j,2ℓ (E) differ only by an amount that is exponentially small in ℓ, we now establish the following dichotomy: either (49) is optimal, or the convergence is exponentially fast.
Corollary 3. Under the assumptions of Theorem 1 there exists a constant c 1 = c 1 (κ, A, E, ω) > 0 such that for all sufficiently large ℓ and all 1 ≤ j ≤ d. Moreover, there is ℓ 0 = ℓ 0 (c 1 ) so that if for some ℓ 1 ≥ ℓ 0 and some j, then In other words, on balls B on which (5) holds, we have the following property: for each E ∈ B, either λ j,n (E) → λ j (E) exponentially fast, or for infinitely many n.
By (53) and our assumption,

Other types of shifts
As already noted in Section 2, the proof is modular and rests on two main ingredients: the avalanche principle on the one hand, and the large deviation theorems on the other hand. In fact, the proof in Section 5 applies to any GL(d, K) co-cycle over an "abstract" base dynamics θ : X → X as in Section 1, provided one has an LDT for all δ > 0 and all sufficiently large n ≥ n 0 (δ) with some constant c(δ) > 0. In fact, a weaker statement such as for all δ > 0, n ≥ n 0 (δ) and some τ = τ(δ) > 0 would still lead to a result. But instead of Hölder regularity one obtains a modulus of the type (7). For example, instead of (51) we only obtain under the assumption (55). In the iteration which underlies the proof of Theorem 1 given in Section 5 we can therefore only pass from scale n to N = exp(n^β) for some 0 < β < 1. The details are routine, and are left to the reader. Examples of dynamics that give rise to LDTs of the form (55) are shifts on higher tori T^ν as well as those on T for which (4) is relaxed such as to where a > 1 is arbitrary but fixed. See for example [GolSch1, Proposition 10.2], [Bou2, Proposition 7.18] for precise statements along these lines. It is not known if these weaker results can be improved or not; in other words, for higher-dimensional tori we currently do not have an estimate such as (54).

Products of random matrices
For the sake of completeness, and less to say anything new, we now sketch how the machinery of this note relates to the classical theory of products of random matrices. A standard reference for everything that we will need is the book by Bougerol, Lacroix [BouLac]; see also Le Page [LeP], and Ledrappier's lecture notes [Led].
Let µ be a probability measure on GL(d, R) and consider i.i.d. variables $\{Y_j\}_{j=1}^{\infty}$ with common distribution µ such that $E[\log^+ \|Y_1\|] < \infty$. We assume that µ is strongly irreducible, i.e., there is no finite union of proper subspaces of $R^d$ which is invariant under every matrix in $T_\mu$, the smallest closed semigroup containing the support of µ. Furthermore, we assume that µ is contracting, i.e., there exists a sequence $\{M_j\}_j$ in $T_\mu$ such that $\|M_j\|^{-1} M_j$ converges to a rank-1 matrix.
Fürstenberg's theorem [Fur] says that for d = 2 the co-cycle generated by the random sequence $\{Y_j\} \subset SL(2, R)$ has Lyapunov exponents $\lambda_1 > 0 > \lambda_2 = -\lambda_1$. Moreover, there exists a unique µ-invariant probability measure on PR 1 , denoted by ν, such that where M · x̄ is the direction of Mx where x belongs to the equivalence class x̄. If these conditions are valid for all exterior powers $\Lambda^p Y_1$, a theorem of Guivarc'h and Raugi, see [BouLac, page 78], extends Fürstenberg's result to d > 2, ensuring that all exponents are distinct.
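The positivity of the top exponent for random products can be observed in a small simulation. The sketch below estimates n^{-1} log‖Y_n ⋯ Y_1 v‖ for i.i.d. factors drawn uniformly from two fixed SL(2, R) shear matrices; the choice of these two matrices, the starting vector, the sample size and the seed are illustrative assumptions, and the function names are hypothetical. The vector is renormalized at each step so that only the logarithmic growth is accumulated.

```python
import math
import random

def apply(m, v):
    """Apply the 2x2 matrix m to the vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def random_product_exponent(matrices, n, seed=0):
    """Estimate (1/n) log ||Y_n ... Y_1 v|| for i.i.d. uniform draws Y_j."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    v = (math.sqrt(0.5), math.sqrt(0.5))
    log_growth = 0.0
    for _ in range(n):
        w = apply(rng.choice(matrices), v)
        r = math.hypot(w[0], w[1])
        log_growth += math.log(r)        # accumulate the one-step growth
        v = (w[0] / r, w[1] / r)         # renormalize to avoid overflow
    return log_growth / n

SHEAR_UP = ((1.0, 1.0), (0.0, 1.0))
SHEAR_DOWN = ((1.0, 0.0), (1.0, 1.0))
estimate = random_product_exponent([SHEAR_UP, SHEAR_DOWN], 20_000)
```

Each shear alone has zero exponent, so the positive value of `estimate` reflects the non-commutativity of the random product rather than the growth of any single factor.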
By a theorem of Le Page [LeP], we further know that one has exponential convergence to the invariant measure ν, see [BouLac, page 106] for the precise meaning. Finally, assuming that µ has exponential moments, one has a large deviation estimate of the following form: for every δ > 0, $\limsup_{n \to \infty} n^{-1} \log P\big[\, |\log\|Y_n Y_{n-1} \cdots Y_2 Y_1\| - n\lambda_1| > n\delta \,\big] < -c(\delta) < 0$, see [LeP, Théorème 7], or [BouLac, page 131]. We therefore have all ingredients in order to apply the exact same argument as above. Starting from the case where the random co-cycle does not depend on a parameter, we conclude for example that Corollary 3 holds (with E fixed). Thus, one either has exponential convergence of the Lyapunov exponents, or (52) holds. However, by the aforementioned convergence result of Le Page it follows that (52) does not occur in the random case and one always has exponential convergence. In particular, this logic shows that the statement of Corollary 3 is sharp.
For random co-cycles depending "nicely" on a parameter, we may again conclude that the exponents are Hölder in the parameter. However, we see no need to make this precise; indeed, if the dynamics is random or strongly mixing then we might expect much better regularity of the exponents. At least for Schrödinger co-cycles this is indeed the case, see Campanino, Klein [CamKle] and Simon, Taylor [SimTay]. The approach we followed here is clearly not able to capture results of that strength; conversely, the techniques in these papers cannot handle deterministic dynamics such as shifts.