Random perturbations of hyperbolic dynamics

A sequence of large invertible matrices given by a small random perturbation around a fixed diagonal and positive matrix induces a random dynamics on a high-dimensional sphere. For a certain class of rotationally invariant random perturbations it is shown that the dynamics approaches the stable fixed points of the unperturbed matrix up to errors even if the strength of the perturbation is large compared to the relative increase of nearby diagonal entries of the unperturbed matrix specifying the local hyperbolicity. This work is motivated by the (long-term) aim of controlling the growth of the finite volume eigenfunctions of the Anderson model in the weak coupling regime of disorder.


Model, main results and comments
Let us consider the random dynamics on the $L$-dimensional sphere $S^L$, $L \geq 2$, given by

$$v_n = T_n \cdot v_{n-1}, \qquad n \in \mathbb{N}, \tag{1}$$

where the action $\cdot : \mathrm{GL}(L+1,\mathbb{R}) \times S^L \to S^L$ of the general linear group is

$$T \cdot v = \frac{Tv}{\|Tv\|}, \tag{2}$$

and the random matrices $T_n$ are of the form

$$T_n = R\,(1 + \lambda\, r_n U_n). \tag{3}$$

Here $R = \mathrm{diag}(\kappa_{L+1}, \ldots, \kappa_1)$ is a fixed unperturbed positive diagonal matrix whose entries satisfy $\kappa_1 \geq \cdots \geq \kappa_{L+1} > 0$, and the random perturbation $\lambda r_n U_n$ is given by a coupling constant $\lambda \in [0,1)$, a radial randomness $r_n \in [0,1]$ and an angular randomness induced by orthogonal matrices $U_n$. The main assumption is that the $r_n$ and the $U_n$ are all independent and identically distributed, with a distribution of the $r_n$ that is absolutely continuous with a bounded density and with Haar distributed $U_n \in \mathrm{O}(L+1)$. Hence the object of study is a particular Markov process on the continuous state space $S^L$.
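The set-up above can be explored numerically. The following is a minimal sketch, not part of the analysis: it assumes that $r_n$ is uniform on $[0,1]$ (one admissible choice with a bounded density), that the factors in $T_n$ are ordered as $R(1+\lambda r_n U_n)$, and it samples Haar orthogonal matrices by the standard QR construction with sign correction. The function names and the numerical values of $\kappa$ and $\lambda$ are illustrative only.

```python
import numpy as np

def haar_orthogonal(dim, rng):
    """Sample a Haar distributed element of O(dim) via QR of a Gaussian matrix."""
    g = rng.standard_normal((dim, dim))
    q, upper = np.linalg.qr(g)
    # Rescale the columns by the signs of diag(upper); without this
    # correction the law of q is not exactly the Haar measure.
    return q * np.sign(np.diag(upper))

def act(matrix, v):
    """Action of an invertible matrix on the sphere: apply, then normalize."""
    w = matrix @ v
    return w / np.linalg.norm(w)

def random_path(kappa, lam, v0, n_steps, rng):
    """Path v_0, ..., v_N of the dynamics v_n = T_n . v_{n-1}."""
    dim = len(kappa)
    R = np.diag(kappa[::-1])  # R = diag(kappa_{L+1}, ..., kappa_1)
    path = [v0]
    for _ in range(n_steps):
        r = rng.uniform(0.0, 1.0)  # assumption: r_n uniform on [0, 1]
        T = R @ (np.eye(dim) + lam * r * haar_orthogonal(dim, rng))
        path.append(act(T, path[-1]))
    return np.array(path)

rng = np.random.default_rng(0)
kappa = np.array([2.0, 1.5, 1.0, 0.5])  # kappa_1 >= ... >= kappa_{L+1} > 0, L = 3
v0 = np.array([1.0, 0.0, 0.0, 0.0])     # start at the most unstable fixed point
path = random_path(kappa, 0.2, v0, 50, rng)
Q = haar_orthogonal(4, rng)
```

Each row of `path` is a unit vector, so the process indeed lives on $S^L$.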
The above is the standard set-up of the theory of products of random matrices [3], except that usually the action is studied on the projective space and not its double cover by $S^L$, but for the sake of simplicity we suppress this difference. By Furstenberg's Theorem the random action has a unique invariant probability measure $\mu_{r,\lambda}$ on $S^L$ if $\mathrm{supp}(r) \neq \{0\}$ and $\lambda \neq 0$ (see [3], Part A, Theorem III.4.3). This paper is about proving further quantitative information about this invariant measure in the special case described above. Hence the paper is thematically located at the interface between random matrix theory, the theory of products of random matrices and random dynamical systems. One of the key technical elements in the proofs is a stochastic order underlying the process (1) with $R = 1$, see Proposition 12 below.
Let us begin by describing the dynamics (1) heuristically. The unperturbed deterministic dynamics $R\,\cdot$ induced by $R$ is maximally hyperbolic if the deterministic local expansion rates $\delta R_i$, which measure the relative increase between neighboring diagonal entries of $R$, are strictly positive for all $i = 1, \ldots, L$. Then there is a simple stable fixed point given by the unit vector $e_{L+1}$ corresponding to the last component (the fixed point is unique only on projective space). The deterministic dynamics $R^N \cdot v_0$ converges, up to a sign, to the unit vector $e_j$ if $j$ is the largest index such that the $j$th component of the initial condition $v_0$ does not vanish. However, $e_j$ is an unstable fixed point of $R\,\cdot$ if $j \leq L$. All these facts are elementary to check.
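These elementary facts can be checked numerically. The sketch below uses an illustrative maximally hyperbolic diagonal (the values $1, 2, 3, 4$ are an assumption made only for the example) and iterates the normalized action of $R$; the helper name `iterate_deterministic` is hypothetical.

```python
import numpy as np

def iterate_deterministic(diag_R, v0, n_steps):
    """Compute R^N . v0 for a diagonal R by repeated normalization."""
    v = v0 / np.linalg.norm(v0)
    for _ in range(n_steps):
        v = diag_R * v                # action of the diagonal matrix
        v = v / np.linalg.norm(v)     # projection back to the sphere
    return v

# Diagonal of R = diag(kappa_{L+1}, ..., kappa_1), increasing along the diagonal.
diag_R = np.array([1.0, 2.0, 3.0, 4.0])

# Generic initial condition: convergence to the stable fixed point e_{L+1}.
v_generic = iterate_deterministic(diag_R, np.array([0.5, 0.5, 0.5, 0.5]), 200)

# Vanishing last component: convergence to e_3, the largest non-vanishing index.
v_special = iterate_deterministic(diag_R, np.array([0.5, 0.5, 0.7, 0.0]), 200)

# A tiny last component suffices to leave the unstable fixed point e_3.
v_perturbed = iterate_deterministic(diag_R, np.array([0.5, 0.5, 0.7, 1e-12]), 200)
```

The third run illustrates the instability of $e_j$ for $j \leq L$: an initial weight of $10^{-12}$ in the last component is amplified by a factor $(4/3)^{200}$ relative to the third one.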
In the following, we also speak of the unit eigenvector $e_{L+2-j}$ of the eigenvalue $\kappa_j$ as the $j$th channel specified by the unperturbed dynamics. We will not assume maximal hyperbolicity in the following.
If now the strength of the perturbation is non-zero and satisfies $\lambda \ll \delta R_1$, one can prove that the random dynamics leaves any unstable fixed point and is driven to the vicinity of the stable fixed point, in which it then remains. This implies that the Furstenberg invariant measure $\mu_{r,\lambda}$ is also supported on a neighborhood of the stable fixed point. More generally, if $\lambda \ll \delta R_i$ for some $i$, then $\mu_{r,\lambda}$ is supported by a neighborhood of $\{0\}^{L+1-i} \times S^{i-1} \subset S^L$. We are, however, interested in a situation where several of the $\delta R_i$ may vanish or are at least smaller than $\lambda$. Hence the unperturbed dynamics may be merely partially hyperbolic. In this situation the random perturbation is not small compared to the local hyperbolicity of $R$. Intuitively, it is clear that the random dynamics may then visit all points on $S^L$ because the randomness can overcome the hyperbolic character of $R$ and lead to significant escapes from anywhere. This just means that the support of the invariant measure is the whole sphere $S^L$. This last fact is precisely part of the following first result.

Theorem 1. Suppose that $\lambda \in (0,1)$, that the i.i.d. distribution of the $r_n$ is absolutely continuous with a bounded density and that the i.i.d. $U_n$ are Haar distributed. Then for any $N \geq 1$ and initial condition $v_0$, the random variables $v_N \in S^L$ are distributed absolutely continuously w.r.t. the normalized surface measure $\nu_L$ on $S^L$. In particular, the Furstenberg measure $\mu_{r,\lambda}$ is absolutely continuous w.r.t. $\nu_L$. If $\lambda > \delta R_i$ for all $i = 1, \ldots, L$ and $1 \in \mathrm{supp}(r)$, then the support of $\mu_{r,\lambda}$ is the whole sphere $S^L$.

Now let us suppose that the randomness, while being large compared to the local expansion rates, $\lambda > \delta R_i$, is small compared to the expansion rates $\delta R_{i,j}$ from channel $i$ to channel $j$ for some $j > i$. Then if $\lambda < \delta R_{i,j}$, there is still some hyperbolic contraction on this larger scale, even though the local hyperbolicity is dominated by the randomness.
Hence a finer analysis of the interplay between the randomness and the hyperbolic unperturbed dynamics is needed. Intuitively, one certainly expects the random dynamics to spend little time in the channel $j$, and this should lead to a small weight of the Furstenberg measure on this channel. Roughly, this is what we actually prove below. To state our main result more precisely, we need some further notation. Let us partition the channels into three parts of sizes $(L_a, L_b, L_c)$ with $L_a + L_b + L_c = L + 1$, and decompose each vector accordingly into an upper, a middle and a lower part, $v = (a(v), b(v), c(v))$. Moreover, let us introduce the macroscopic gap $\gamma = \gamma(R, L_b, L_c)$ between the upper and lower parts by

Note that the macroscopic gap $\gamma$ is positive provided that $\kappa_{L_c} > \kappa_{L_b + L_c + 1}$. Now the deviation of the random path $(v_n)_{n \in \mathbb{N}}$ defined by (1) and (3) from the attractive part $\{0\}^{L_a} \times S^{L_b + L_c - 1}$ of phase space can be measured as the norm $\|a(v_N)\|$ of the upper part. The main result provides a quantitative bound on the expectation value of $\|a(v_N)\|^2$ for $N$ sufficiently large, when the expectation is taken over the randomness contained in $T_n$ for $n = 1, \ldots, N$.
Using the invariance property of the Furstenberg measure $\mu_{r,\lambda}$, one deduces the following

Corollary 3. Under the same hypotheses as in Theorem 2,

The estimates (5) and (6) strongly differ from the behavior for $R = 1$ where no hyperbolicity is present. Then $\mathbb{E}\,\|a(v_N)\|^2 \sim L_a L^{-1}$ holds for large $N$ independently of $\lambda > 0$, which just reflects the equidistribution of the random dynamics on all channels (this follows from Proposition 14 below). To us, the most interesting regime is that of large $L_a$, $L_b$ and $L_c$, say all a fraction of $L$, and of $\gamma$ of the order of $1$ (but possibly less than $1$). Then the r.h.s. in (5) and (6) is approximately proportional to $\lambda^2$, which is the expected behavior. Indeed, the random kicks of order $\lambda$ are uniform and thus do not distinguish between channels, and hence the drift into each channel is given by their variance, which is of order $\lambda^2$, so that $\mathbb{E}\,\|a(v)\|^2$ should be of the order $\lambda^2$ times the proportion $L_a L^{-1}$ of channels in $a(v)$.
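The contrast between the case $R = 1$ and a hyperbolic $R$ can be illustrated by a small Monte Carlo experiment. The sketch below rests on several assumptions that are not taken from the text: $r_n$ is uniform on $[0,1]$, the factors are ordered as $T_n = R(1+\lambda r_n U_n)$, and the diagonal entries of the hyperbolic example are chosen ad hoc. It also uses that, for Haar distributed $U_n$ independent of $v_{n-1}$, the vector $U_n v_{n-1}$ is uniformly distributed on the sphere, so that no orthogonal matrices have to be sampled. For the uniform measure on $S^L \subset \mathbb{R}^{L+1}$ the exact value of $\mathbb{E}\,\|a(v)\|^2$ is $L_a/(L+1)$, which agrees with $L_a L^{-1}$ to leading order in $L$.

```python
import numpy as np

def uniform_on_sphere(n_samples, dim, rng):
    """Rows are independent uniformly distributed unit vectors in R^dim."""
    g = rng.standard_normal((n_samples, dim))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def mean_upper_weight(diag_R, lam, L_a, n_steps, n_samples, rng):
    """Monte Carlo estimate of E ||a(v_N)||^2, started from e_{L+1}."""
    dim = len(diag_R)
    v = np.zeros((n_samples, dim))
    v[:, -1] = 1.0
    for _ in range(n_steps):
        r = rng.uniform(0.0, 1.0, size=(n_samples, 1))  # assumption: r uniform
        w = uniform_on_sphere(n_samples, dim, rng)       # same law as U_n v_{n-1}
        v = diag_R * (v + lam * r * w)                   # R (1 + lam r U) v
        v /= np.linalg.norm(v, axis=1, keepdims=True)
    # a(v) consists of the first L_a components.
    return np.mean(np.sum(v[:, :L_a] ** 2, axis=1))

rng = np.random.default_rng(3)
L_a, lam = 2, 0.3
flat = np.ones(6)                                    # R = 1, no hyperbolicity
hyper = np.array([0.5, 0.5, 1.0, 1.0, 2.0, 2.0])     # illustrative hyperbolic R

weight_flat = mean_upper_weight(flat, lam, L_a, 400, 4000, rng)
weight_hyper = mean_upper_weight(hyper, lam, L_a, 400, 4000, rng)
uniform_value = L_a / len(flat)                      # L_a / (L + 1)
```

In this experiment `weight_flat` is close to `uniform_value`, while `weight_hyper` is smaller by a large factor, in line with the qualitative picture described above.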
Our main motivation for the present study is the potential application to the field of discrete random Schrödinger operators like the Anderson model, see [3,4,1] for general mathematical background information. Little is known rigorously about the so-called weak localization regime of such operators in space dimension higher than or equal to 3. In this regime, the eigenfunctions are not expected to be exponentially localized and the quantum dynamics is believed to be diffusive, as for a Brownian motion. Furthermore, random matrix theory is expected to provide a good description of the eigenvalues and eigenfunctions locally in space and within a suitable range of energies. In infinite volume the spectral measures likely have an absolutely continuous component.
Let us mention a few rigorous contributions to the understanding of this weak-localization regime. Some attempts try to extract random matrices from the Anderson model [14], others start directly from models containing random matrices. For example, random band matrices provide such an intermediate model for which one then has quantum diffusion [6]. Another one is the Wegner $N$-orbital model in the limit $N \to \infty$. It becomes an Anderson model with free random potential values. This model has a rather explicit solution [13], which in turn allows one to prove diffusive behavior of the underlying quantum dynamics [19]. Such diffusive behavior has also been proved in supersymmetric sigma models [5]. Furthermore, one can argue for diffusive behavior in a scaling limit of the Anderson model itself by sending the coupling constant of the randomness to $0$ while increasing time [7]. The approach closest to the present study constructs solutions of finite volume approximations explicitly via the transfer matrix method. Best understood is then the quasi-one-dimensional limit in which one has strong Anderson localization, that is, pure-point spectrum with exponentially localized eigenfunctions with a rate called the (inverse) localization length [3,4,8]. In a perturbative regime of small coupling of the randomness, one can calculate this localization length [20,15] and, more generally, the whole Lyapunov spectrum [16,17], provided the random dynamics of the transfer matrices is well understood. For such systems, one can also derive flow equations for the finite volume growth exponents, the so-called DMPK-equations [2,21,18]. Among these works, only [20,18] address the hyperbolic character of the unperturbed dynamics (corresponding to the $R$ above), however, only in the regime of very small randomness (corresponding to $\lambda \ll \delta R_i$).
In order to apply the results of this paper (notably Theorem 2) to the transfer matrices of the Anderson model and extract relevant information on its eigenfunctions, several nontrivial extensions have to be worked out. First of all, the transfer matrices at real energies have a symplectic symmetry that has to be implemented and then leads, in particular, to a supplementary symmetry in the Lyapunov spectrum. This can be done as in [3,20,11]. Then one has to consider the dynamics not only on unit vectors, but rather on the whole flag manifold [3,20]. Furthermore, while the transfer matrices can be brought into the form (3) [20], the random matrices $U_n$ stemming from the Anderson model are not Haar distributed and contain far fewer random entries. In the quasi-one-dimensional regime, this can be dealt with using commutator methods, see [8] and [17] for a perturbative result when $R$ is elliptic, that is, of unit norm.
Theorem 2 also has some shortcomings of its own. First of all, neither the theorem nor its proof provides a good quantitative estimate on $N_0$. Furthermore, the proof does not readily carry over to the case where $1 + \lambda r U$ is replaced by $\exp(\lambda r U)$. Actually, many of the arguments below depend heavily on geometric considerations and explicit calculations exploiting formulas for averages over the Haar measure.
Acknowledgements We thank Andreas Knauf for many discussions as well as helpful and constructive comments. F. D. received funding from the Studienstiftung des deutschen Volkes. This work was also supported by the DFG.

Outline of the proof of Theorems 1 and 2
Throughout the remainder of the paper we assume that $\lambda \in (0,1)$, that the $r_n \in [0,1]$ are i.i.d. with a bounded density and that the $U_n \in \mathrm{O}(L+1)$ are Haar distributed and independent. The one-dimensional Lebesgue measure will be denoted by x. We also abbreviate absolutely continuous and absolute continuity by a.c.

Lemma 4. The random variable
If also $w \in S^L$, the Radon-Nikodym densities satisfy

In order to explain the main point of the lemma, let us introduce the random variable $Y = \langle v, Uv \rangle$ and spell out the action (2) explicitly:

$$\langle v, (1 + \lambda r U) \cdot v \rangle = \frac{1 + \lambda r Y}{\sqrt{1 + 2 \lambda r Y + \lambda^2 r^2}}. \tag{8}$$

Now $r$ and $Y$ both have an a.c. distribution and are independent of each other, but numerator and denominator in (8) are correlated and standard Lipschitz continuity arguments on zero measure sets do not seem to apply. Therefore some further argument is needed. This is lengthy, but elementary. Details of the proof of Lemma 4 and all the technical results described in this section are deferred to Section 3. The next lemma states an elementary invariance property.
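The correlation between numerator and denominator can be made explicit. Since $U$ is orthogonal and $v$ is a unit vector, $\|v + \lambda r U v\|^2 = 1 + 2\lambda r Y + \lambda^2 r^2$ with $Y = \langle v, Uv\rangle$, so that the overlap $\langle v, (1+\lambda r U)\cdot v\rangle$ equals $(1 + \lambda r Y)(1 + 2\lambda r Y + \lambda^2 r^2)^{-1/2}$. The following short numerical check of this identity is only a sketch; the dimension and the values of $\lambda$ and $r$ are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, lam, r = 5, 0.7, 0.4

# Any orthogonal matrix works for the identity; take the Q factor of a Gaussian matrix.
U, _ = np.linalg.qr(rng.standard_normal((dim, dim)))

# A random unit vector v.
v = rng.standard_normal(dim)
v /= np.linalg.norm(v)

Y = v @ (U @ v)                      # Y = <v, U v>

# Left-hand side: overlap of v with its normalized image under 1 + lam r U.
image = v + lam * r * (U @ v)
lhs = v @ (image / np.linalg.norm(image))

# Right-hand side: explicit formula in terms of r and Y only.
rhs = (1 + lam * r * Y) / np.sqrt(1 + 2 * lam * r * Y + (lam * r) ** 2)
```

Both `lhs` and `rhs` depend on the randomness only through the pair $(r, Y)$, which is why the distribution of the overlap can be studied by a two-dimensional change of variables.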
Now the two previous lemmas combined imply that the action (1 + λrU)· has a symmetric transition probability density w.r.t. ν L and thus, in particular, the detailed balance condition is satisfied.
Moreover, its Radon-Nikodym density satisfies

The final preparatory result involves the deterministic hyperbolic part of the dynamics.
Once all these lemmata are proved (once again, see Section 3), it is possible to complete the proof of the first part of Theorem 1, namely to prove the absolute continuity stated therein.
Proof of Theorem 1. Let $v_0 \in S^L$ be a given initial condition. Then by Lemma 7 the random vector $v_1 = T_1 \cdot v_0$ is a.c. w.r.t. $\nu_L$. Then, again by Lemma 7, also the random vector $v_2 = T_2 \cdot v_1$ is a.c. This procedure can now be iterated and yields the density $\rho_{v_n}$ of $v_n$. In the limit $n \to \infty$, the $\rho_{v_n}$ converge to the density $\rho$ of the Furstenberg measure $\mu_{r,\lambda}$, which is hence indeed absolutely continuous. The density $\rho$ thus also satisfies the corresponding fixed point equation. The final statement of Theorem 1, namely the fact that the support of the Furstenberg measure is the whole sphere $S^L$, follows from the next Lemma 8, which shows that each point on $S^L$ can be reached by an explicit path of finite length. □

Lemma 8. Suppose that $\lambda > \max_{i=1,\ldots,L} \delta R_i$ and that $1 \in \mathrm{supp}(r)$. Then for every $w \in S^L$ there exist $N \in \mathbb{N}$, $r_1, \ldots, r_N \in \mathrm{supp}(r)$ and orthogonal matrices $U_1, \ldots, U_N$ such that $v_N = w$.
Next let us outline the proof of Theorem 2. It will be useful to split each $T_n$ into the unperturbed, deterministic action $R$ and a random perturbation $1 + \lambda r U$, and to analyze the action of both factors separately. The unperturbed action $R\,\cdot$ leads to a decrease of the norm of the upper part and an increase of the norm of the lower part. More precisely, provided that $\kappa_{L_c} > \kappa_{L_b + L_c + 1}$, one has for any $v \in S^L$ obeying $a(v) \neq 0 \neq c(v)$ the bound

The former inequality is now strengthened.
This implies that the unperturbed dynamics obeys

The random perturbation, on the other hand, may augment $\|a(v)\|$. However, in expectation this growth is bounded by a term of order $O(\lambda^2)$.
At first glance, it may now appear straightforward to prove upper bounds on $\mathbb{E}\,\|a(v_N)\|^2$ for large $N$ by combining Lemmata 9 and 10. An iterative application turns out to be more involved, however. The core task is to deal with the expectation value of the products in (13). This is tackled by the following elementary lemma.
Consequently the next aim is to bound the cumulative distribution function (16) from above so that inequalities (13) and (15) can be used. This turns out to be possible by comparing the random dynamics (1) generated by (3) with the random dynamics generated by $1 + \lambda r_n U_n$ instead of $T_n$, that is, the case of $R = 1$ which has no hyperbolicity. The comparison of the cumulative distribution function (16) under these two random dynamics is based on the next lemma.
Remark. Since $\|c((1 + \lambda r U) \cdot v)\|$ and $\|c((1 + \lambda r U) \cdot w)\|$ are $\mathbb{R}$-valued, the validity of (17) for all $\epsilon \in [0,1]$ is equivalent to the stochastic order as defined, e.g., in Section 17.7 of [9]. ⋄

Now one can iteratively combine the second part of (12) and Proposition 12. For ordered products, we use the following notation:

Corollary 13 allows one to bound (16) by the r.h.s. of (19) with $\delta = \epsilon^2$. This r.h.s. can readily be estimated if one knows the invariant probability measure on $S^L$ under the dynamics $(1 + \lambda r U)\,\cdot$ (it is again unique and given by the Furstenberg measure). The following proposition shows that this invariant measure is equal to the normalized invariant surface measure $\nu_L$ on $S^L$. In the terminology of [17,16] this means that the dynamics $(1 + \lambda r U)\,\cdot$ has the so-called random phase property.
While a proof is included in Section 3, let us note that Proposition 14 also follows by general principles from the detailed balance condition following from Lemma 6. At large $N$, the r.h.s. of (19) therefore approaches $\nu_L(\{v \in S^L : \|c(v)\|^2 < \delta\})$, see [3], Part A, Theorem III.4.3. Therefore the following geometric identity will be needed.
which just means that $\|c(v)\|^2$ is distributed according to the beta distribution with parameters $\left(\frac{L_c}{2}, \frac{L_a + L_b}{2}\right)$. For $(L_a, L_b) = (1, 1)$ this can, moreover, be bounded as follows:

The following Corollary 16 combines Proposition 14 and Lemma 15 and concludes the transient focus on the special case of $R = 1$.
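The beta law can be checked against its first two moments. With the convention that the beta distribution with parameters $(\alpha, \beta)$ has a density proportional to $x^{\alpha - 1}(1-x)^{\beta - 1}$, the squared norm of $L_c$ components of a uniformly distributed point on $S^L \subset \mathbb{R}^{L+1}$ has parameters $\alpha = L_c/2$ and $\beta = (L_a + L_b)/2$. The sketch below samples the uniform measure by normalizing Gaussian vectors; the sizes $(L_a, L_b, L_c) = (2, 3, 4)$ and the sample size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
L_a, L_b, L_c = 2, 3, 4
dim = L_a + L_b + L_c                 # dim = L + 1

# Uniform points on the sphere S^L: normalized standard Gaussian vectors.
g = rng.standard_normal((200_000, dim))
v = g / np.linalg.norm(g, axis=1, keepdims=True)

# c(v) is the lower part, that is, the last L_c components.
c_sq = np.sum(v[:, -L_c:] ** 2, axis=1)

# Mean and variance of the beta distribution with parameters (alpha, beta).
alpha, beta = L_c / 2, (L_a + L_b) / 2
beta_mean = alpha / (alpha + beta)
beta_var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))

sample_mean = c_sq.mean()
sample_var = c_sq.var()
```

In particular the mean is $L_c/(L+1)$, the proportion of channels contained in the lower part.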
An iterative application of this inequality from $\tilde N = \tilde N_0$ to $N - 1$ yields a bound for all $N \geq \tilde N_0$. The choice that optimizes the order in $\lambda$ of this bound then proves (5).

As $U$ is Haar distributed, the random vector $Uv$ is distributed according to $\nu_L$ (see [12], Chapter 3). Considering, moreover, the projection $\varsigma_v : S^L \to \mathbb{R}$ into the direction $v$ given by $\varsigma_v(w) = \langle v, w \rangle$, the pushforward of $\nu_L$ under $\varsigma_v$ is also a.c. w.r.t. x with a Radon-Nikodym density given by

Therefore the random variable $Y = \langle v, Uv \rangle$ is a.c. distributed w.r.t. x and its Radon-Nikodym density $\rho_Y$ is given by (24). Now, let $x \in [0,1)$. Due to (8),

where

Now let $y \in (-1,1)$. Using the Radon-Nikodym density $\rho_r$ of the random variable $r$ w.r.t. x, one has

Because

it follows from (25) and (27) that

The r.h.s. will now be bounded separately in the cases $y < \mp x$ and $y > \mp x$. Note that the case $y = \mp x$ never materializes because of the condition $s^{\pm}_y(x) \in [0,1]$ due to (26). Now, if $y < \mp x$, the bound $s^{\pm}_y(x) \leq 1$ implies

due to (26) and hence

On the other hand, for $y > \mp x$ the bound $s^{\pm}_y(x) \geq 0$ implies

and hence similarly

Inserting inequalities (28) and (29) implies that

Comparing with (24) for $L \geq 2$, one concludes that for any $\epsilon > 0$ there exists a function $g_\epsilon \in L^1\big((-1,1), \rho_Y\,dx\big)$ such that for all $y \in (-1,1)$

Now one can differentiate (25) and exchange the derivative with the integral so that for all
In particular, the l.h.s. exists if $x \in [0,1)$ so that $\langle v, (1 + \lambda r U) \cdot v \rangle$ is a.c. on $[0,1)$. The estimate

now concludes the proof of the absolute continuity. For the proof of (7),

but $W^* U W$ is distributed identically to $U$ due to the invariance of the Haar measure.
holds. But $VU$ is distributed identically to $U$ and this implies (10). As $(1 + \lambda s U) \cdot V v = V (1 + \lambda s V^* U V) \cdot v$, the proof of (9) follows in a similar manner. □

Proof of Lemma 6. Let us fix a vector $v \in S^L$ and introduce the real random variable $Z = \langle v, (1 + \lambda r U) \cdot v \rangle$. By Lemma 4, $Z$ has an a.c. distribution w.r.t. x with a density that will be denoted by $\rho_Z$. With this,

where $v^\perp \in S^L$ is a random unit vector orthogonal to $v$. By

with $P^\perp$ being the projection onto the orthogonal complement of the span of $v$, so that $v^\perp$ only depends on the direction of the component of $Uv$ orthogonal to $v$, which is independent of the component parallel to $v$.

Now by the above and Lemma 4 the distribution of (Z, (i
is the distribution of $(1 + \lambda r U) \cdot v$. The function $F$ is a bijection when restricted to $(-1,1) \times S^{L-1}$ and the two sets $\{\pm 1\} \times S^{L-1}$ are mapped to one point each. Locally, the restriction of $F$ to $(-1,1) \times S^{L-1}$ is bi-Lipschitz, in particular with uniform Lipschitz constants on compact subsets of $(-1,1) \times S^{L-1}$. Thus one deduces that this pushforward is also a.c., and thus so is the distribution of $(1 + \lambda r U) \cdot v$.