GOE statistics for Anderson models on antitrees and thin boxes in $\mathbb{Z}^3$ with deformed Laplacian

Sequences of certain finite graphs, antitrees, are constructed along which the Anderson model shows GOE statistics, i.e. a re-scaled eigenvalue process converges to the ${\rm Sine}_1$ process. The Anderson model on the graph is a random matrix given by the sum of the adjacency matrix and a random diagonal matrix with independent identically distributed entries along the diagonal. The strength of the randomness stays fixed; there is no re-scaling with matrix size. The random matrices giving GOE statistics can also be viewed as random Schr\"odinger operators $\mathcal{P}\Delta+\mathcal{V}$ on thin finite boxes in $\mathbb{Z}^3$ where the Laplacian $\Delta$ is deformed by a projection $\mathcal{P}$ commuting with $\Delta$.

class which is referred to as universal behavior or simply universality. GOE statistics apply to models with time reversal symmetry in delocalized regimes. Without time reversal symmetry (for instance in the presence of magnetic phases) disordered systems are expected to follow the local statistics of the Gaussian Unitary Ensemble (GUE) given by the ${\rm Sine}_2$ process.
Let us give a short introduction to Random Matrix Theory. For more detailed information we refer to the standard literature, e.g. [Meh, AGZ, For]. The Gaussian Orthogonal Ensemble ${\rm GOE}=({\rm GOE}(N))_{N\in\mathbb{N}}$ is given by the collection of Gaussian distributions ${\rm GOE}(N)$ with density proportional to $e^{-\frac{N}{4}{\rm Tr}(H^2)}$ on the set ${\rm Sym}(N)$ of real symmetric $N\times N$ matrices for $N\in\mathbb{N}$. The distribution ${\rm GOE}(N)$ is invariant under orthogonal conjugation $H\mapsto O^\top HO$ with $O^\top O=I_N$, where $I_N$ denotes the $N\times N$ unit matrix. If the random matrix $H_N$ is drawn from ${\rm GOE}(N)$ this means that it has real entries and that the entries $(H_N)_{j,k}$ for $j\le k$ (entries above and on the diagonal) are independent Gaussian random variables with mean zero, with variance $\frac{1}{N}$ for the off-diagonal variables and variance $\frac{2}{N}$ for the diagonal. The other entries are determined as $H_N$ is symmetric. Similarly, the Gaussian Unitary Ensemble is defined in a way that the distribution is Gaussian and invariant under unitary conjugation.
The normalization in $N$ assures that in the limit $N\to\infty$ the (random) density of states measures converge to the Wigner semi-circle measure supported on $[-2,2]$, that is, $\frac{1}{N}\sum_{i=1}^N\delta_{\lambda_i^N}\to\frac{1}{2\pi}\sqrt{4-\lambda^2}\,1_{[-2,2]}(\lambda)\,d\lambda$. Here, $\{\lambda_i^N : i=1,\ldots,N\}$ is the (random) set of eigenvalues of the random matrix $H_N$ and $\delta_\lambda$ denotes the delta measure supported at $\lambda$. The distribution of these eigenvalues for finite $N$ follows the $\beta$-ensemble rule, that is, the symmetrized distribution of $(\lambda_1^N,\ldots,\lambda_N^N)\in\mathbb{R}^N$ is proportional to $e^{-\frac{\beta N}{4}\sum_i\lambda_i^2}\prod_{i<j}|\lambda_i-\lambda_j|^\beta\prod_i d\lambda_i$ where $\beta=1$ for GOE and $\beta=2$ for GUE.
The ${\rm Sine}_\beta$ processes emerge as limits of the local statistics of these joint distributions. For some value $\lambda_0\in(-2,2)$ in the so-called bulk spectrum one can consider the eigenvalue process around $\lambda_0$, that is, the shifted eigenvalue process ${\rm spec}(H_N-\lambda_0I_N)=\{\lambda_i^N-\lambda_0 : i=1,\ldots,N\}$. According to the semi-circle law, the number of eigenvalues in a fixed small neighborhood around $\lambda_0$ is proportional to $N\sqrt{4-\lambda_0^2}$ and the distance of $\lambda_0$ to the next eigenvalue is roughly proportional to $1/(N\sqrt{4-\lambda_0^2})$. So in order to get a limiting point process it is reasonable to consider $\Sigma_N:=\frac{N\sqrt{4-\lambda_0^2}}{2\pi}\,{\rm spec}(H_N-\lambda_0I_N)$. We view this random discrete set as a random counting measure $\sigma_N=\sum_{x\in\Sigma_N}\delta_x$, that is, $\sigma_N(A)=|\Sigma_N\cap A|$ for $A\subset\mathbb{R}$. Thus, we have a probability distribution $\mathcal{N}_N$ on the set of discrete counting measures on $\mathbb{R}$. In general such a distribution is called a point process. The distributions $\mathcal{N}_N$ converge weakly, $\mathcal{N}_N\Rightarrow{\rm Sine}_1$ (or $\Sigma_N\Rightarrow{\rm Sine}_1$) for $N\to\infty$. Weak convergence of point processes is given by convergence of the Laplace functional, $\Psi_{\mathcal{N}_N}(f)\to\Psi_{\mathcal{N}}(f)$ for non-negative continuous functions $f$ with compact support, where $\Psi_{\mathcal{N}}(f):=\mathbb{E}_{\mathcal{N}}(\exp(-\sigma(f)))=\int d\mathcal{N}(\sigma)\,\exp(-\sigma(f))$.
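The local re-scaling described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the eigenvalues are assumed to be supplied by the reader (for instance from a numerical diagonalization of a sampled matrix).

```python
import math

def semicircle_density(lam: float) -> float:
    """Density of the Wigner semi-circle law, supported on [-2, 2]."""
    if abs(lam) >= 2.0:
        return 0.0
    return math.sqrt(4.0 - lam * lam) / (2.0 * math.pi)

def local_scaling(n: int, lam0: float) -> float:
    """Factor N * sqrt(4 - lam0^2) / (2 * pi) used to define Sigma_N."""
    return n * semicircle_density(lam0)

def rescale(eigenvalues, lam0):
    """Shift the eigenvalues by lam0 and blow them up to unit mean spacing."""
    c = local_scaling(len(eigenvalues), lam0)
    return sorted(c * (lam - lam0) for lam in eigenvalues)
```

With this scaling the expected number of re-scaled points per unit length near the origin is of order one, which is what makes a non-trivial limiting point process possible.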
In essence, this functional takes over the role of the characteristic function (Fourier transform) for probability distributions on $\mathbb{R}$. Now, in the considered case, the processes $\mathcal{N}_N$ and ${\rm Sine}_\beta$ are also uniquely characterized by their moment measures (or joint intensities), which all exist, and one obtains vague convergence of those. Under certain growth conditions vague convergence of finite moment measures is sufficient for obtaining weak convergence, which is the case here. For a point process $\mathcal{N}$ the finite moment measures are given by the expectations over the power measures; this means for a bounded Borel set $A_1\times A_2\times\cdots\times A_k\subset\mathbb{R}^k$ the $k$-th finite moment measure is given by $\mathbb{E}_{\mathcal{N}}\big(\sigma(A_1)\,\sigma(A_2)\cdots\sigma(A_k)\big)$. ${\rm Sine}_1$ is a so-called Pfaffian process where all the finite moment measures are absolutely continuous and their densities are given by a certain Pfaffian. If $H_N$ were drawn from the GUE we would have convergence to the ${\rm Sine}_2$ process, which is a determinantal process given by the famous Sine-kernel, $K(x,y)=\frac{\sin(\pi(x-y))}{\pi(x-y)}$, giving the name Sine processes. This means that the joint intensities are given by the determinants $\det\big(K(x_i,x_j)\big)_{i,j=1}^k$.
Universality (limiting ${\rm Sine}_\beta$ process) has been proved for many random matrix ensembles, e.g. [DG, ESY, TV], and in particular very recent works by Ajanki, Erdős, Krüger and Schröder allow very general profiles of covariance structures and dependencies with slow correlation decay of the random entries [AEK, EKS]. However, these ensembles are still far from ensembles of very sparse matrices (many zero entries) or matrices with randomness only along the diagonal, both of which apply to Anderson models.
The Anderson model is supposed to describe the quantum motion of electrons in randomly disordered solids like doped semi-conductors. Unlike the random matrix ensembles, here one considers operators on an infinite dimensional separable Hilbert space. Typically it is given by $\ell^2(\mathbb{Z}^d)$ or $\ell^2(\mathbb{G})$ for some countable graph $\mathbb{G}$ and the random entries just appear on the diagonal. This means that one considers a random operator $H=\Delta+V$ given by the sum of a real random diagonal multiplication operator $V$ (in the canonical basis) with independent identically distributed entries along the diagonal and a graph-Laplacian or adjacency operator $\Delta$. The physically most relevant models are given by the sum of a random potential and the discrete Laplacian on $\mathbb{Z}^d$, $d=1,2,3$. There are also continuous analogues defining Anderson models on $L^2(\mathbb{R}^d)$ where $\Delta$ is the actual Laplacian on $\mathbb{R}^d$ and $V$ a random multiplication operator made out of a sum of bump-like potentials centered around lattice points and multiplied by i.i.d. real random variables.
For Anderson models one can also consider eigenvalue statistics if one restricts the model to sequences of finite cubic boxes in $\mathbb{Z}^d$ or adequate finite sub-graphs of $\mathbb{G}$ approaching the infinite graph. Restricting the Anderson model to a finite box gives a random matrix. However, the sequences of such random matrices are very different from the random matrix ensembles mentioned above. The random entries are only on the diagonal and the variances are constant and not re-scaled with the size of the matrix. The off-diagonal entries are typically very sparse, describing the graph structure (edges, edge-weights).
For one and quasi-one dimensional models, e.g. [GMP, KuS, KLS] and for large disorder or at band edges in any dimension, e.g. [FS, AM, DLS, Klo] the Anderson model localizes. This means one has pure point spectrum and exponentially localized eigenfunctions, a phenomenon called Anderson localization. In regimes of Anderson localization one finds Poisson type statistics (i.e. limiting Poisson point processes) [Min, Wan, GK].
While there is a huge literature on Anderson localization, there is still a major open problem concerning delocalization. For Anderson models of low disorder on $\mathbb{Z}^d$ for dimension $d\ge3$ it is expected that some absolutely continuous spectrum persists. Moreover, in these delocalized regimes one also expects some form of universality (GOE statistics) for the eigenvalue statistics along increasing boxes approaching $\mathbb{Z}^d$. However, so far even the existence of this delocalized regime for models on $\mathbb{Z}^d$ is mathematically unproven.
Existence of a delocalized regime for Anderson models was first shown for infinite dimensional regular trees (Bethe lattice) and then extended to similar tree like structures and tree-strips [Kle,ASW,FHS,KLW,AW,KS,Sa1,Sa2]. Only recently some examples of graphs with finite d-dimensional growth rate (d > 2) have been introduced with rigorous proofs of absolutely continuous spectrum for Anderson models on them. These are so called antitrees and similar graph structures [Sa3,Sa4]. The word antitree describes that these graphs are far from trees as they have some local complete-graph-like structures in them. These can be viewed as local mean-field structures which give a local averaging effect on the random potentials preventing localization.
A connection from Anderson models to GOE statistics has been found by considering long strips within $\mathbb{Z}^2$ and re-scaling the random potential in relation to the graph size. Originally one also had to modify the Laplacian slightly [VV], which was later resolved in [SV]. In this paper we combine methods from [Sa3] with [VV, SV] to construct examples of sequences of Anderson models on finite graphs with fixed disorder strength that show GOE statistics in the limit. An additional re-scaling of the randomness (in relation to the non-random parts) as in [SV, VV] is not needed. The graphs are tensor products of two-dimensional grids and a complete graph with normalized edge weights. As tensor products of such a complete graph with the line $\mathbb{Z}$ or half line $\mathbb{Z}_+$ are special cases of antitrees as described in [Sa3], we call these graphs antitrees as well. The locally averaging graph structure of the complete graph part replaces the re-scaling of the randomness in [VV, SV]. In some sense this sequence of considered models lies in between the theory of random band matrix ensembles and the Anderson models on $\mathbb{Z}^d$.
1.1. The considered graphs and related random matrices. Let us introduce more precisely the graph structures we will consider.
Definition 1.1. a) A discrete weighted graph $(\mathbb{G},W)$ is a countable or finite set $\mathbb{G}$ together with a symmetric, real valued weight function $W:\mathbb{G}\times\mathbb{G}\to\mathbb{R}$. Two distinct points $x\neq y\in\mathbb{G}$ are considered to be connected by an edge if and only if $W(x,y)\neq0$, in which case $W(x,y)=W(y,x)\in\mathbb{R}$ is the edge weight. The diagonal elements $W(x,x)$ will be referred to as point weights. One may think of $W$ as a real symmetric matrix indexed by points in $\mathbb{G}$. This is the adjacency matrix of the weighted graph.
b) The complete graph of $s$ elements with re-normalized edge weights $(K_s,P_s)$ is given by $K_s:=\{1,\ldots,s\}$, $P_s(j,k)=\frac{1}{s}$ for any $j,k\in K_s$; thus any point is connected to any other point and the weights are normalized by $\frac{1}{s}$. With this normalization, $P_s$ can be viewed as a rank one orthogonal projection and thus $\|P_s\|=1$ independent of $s$.
c) If $(\mathbb{G},W)$ is a discrete weighted graph, then the $\mathbb{G}$-antitree of constant width $s$ is given by the tensor product $(\mathbb{G},W)\otimes(K_s,P_s)=(\mathbb{G}\times K_s,W\otimes P_s)$ where $W\otimes P_s((x,j),(y,k))=W(x,y)P_s(j,k)=\frac{1}{s}W(x,y)$.
In part c), the $\mathbb{G}$-antitree of constant width is basically obtained by replacing any vertex $x\in\mathbb{G}$ by a set $S_x$ of $s$ vertices and the edges between $x$ and $y$ by $s^2$ edges connecting all points in $S_x$ and $S_y$. Carrying out this procedure where the size of $S_x$ is not constant we would get a general $\mathbb{G}$-antitree (of non-constant width $|S_x|$). The antitrees we worked with in [Sa3] would all be $\mathbb{Z}_+$-antitrees in this sense, where $\mathbb{Z}_+$ is the half line of positive integers with edges only between neighbors. Therefore, we use the term antitree here as well.
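The two structural facts in Definition 1.1 b) and c) can be checked with exact rational arithmetic. The following is a minimal Python sketch; the helper names and the index convention $(x,j)\mapsto xs+j$ for the tensor product are choices made for this illustration and are not taken from the text.

```python
from fractions import Fraction

def complete_graph_weights(s):
    """Weight matrix P_s of the complete graph K_s with all entries 1/s."""
    return [[Fraction(1, s)] * s for _ in range(s)]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def antitree_weights(w, s):
    """Tensor product W (x) P_s; the point (x, j) gets index x * s + j."""
    g = len(w)
    p = complete_graph_weights(s)
    return [[w[x][y] * p[j][k] for y in range(g) for k in range(s)]
            for x in range(g) for j in range(s)]
```

For instance, `matmul(P, P) == P` with `P = complete_graph_weights(4)` confirms that $P_s$ is a projection, and its trace equals one, so it has rank one.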
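We will consider such antitrees of constant width (tensor products with $K_s$) for (long) two-dimensional strips. Such adjacency matrices can also be obtained through some deformation of the Laplacian of 3-dimensional (thin) boxes as we shall see. More precisely, the $n\times r$ strip $\mathbb{Z}_{n\times r}$ with point weight $w$ is the set $\{1,\ldots,n\}\times\{1,\ldots,r\}\subset\mathbb{Z}^2$ with weight function $W^w(x,y)=1$ if $\|x-y\|_1=1$, $W^w(x,x)=w$, and $W^w(x,y)=0$ otherwise. The corresponding $\mathbb{Z}_{n\times r}$-antitree of constant width $s$ shall be denoted by $\mathbb{A}^w_{n\times r,s}$ and the corresponding $nrs\times nrs$ adjacency operator by $\mathcal{A}^w_{n\times r,s}$, i.e. $(\mathbb{A}^w_{n\times r,s},\mathcal{A}^w_{n\times r,s})=(\mathbb{Z}_{n\times r},W^w)\otimes(K_s,P_s)$. To represent it in matrix form, we will split the $nrs\times nrs$ matrix $\mathcal{A}^w_{n\times r,s}$ into $rs\times rs$ blocks and each of these blocks is split into $s\times s$ blocks. Identifying the base space with $\mathbb{Z}_{n\times r\times s}:=\{1,\ldots,n\}\times\{1,\ldots,r\}\times\{1,\ldots,s\}$, $\mathcal{A}^w_{n\times r,s}$ can be considered as an operator on $\ell^2(\mathbb{Z}_{n\times r\times s})$ and we use the canonical orthonormal basis $(\delta_{x_1,x_2,x_3})$ with lexicographical order to represent $\mathcal{A}^w_{n\times r,s}$ as a matrix. We start with the mean-field vector $1_s:=\frac{1}{\sqrt{s}}(1,\ldots,1)^\top\in\mathbb{C}^s$, so that $P_s=1_s1_s^\top$. (1.1)
Then, define the $rs\times rs$ matrices with $s\times s$ blocks $P_{r,s}:=I_r\otimes P_s={\rm diag}(P_s,\ldots,P_s)$ and $A^w_{r,s}:=\begin{pmatrix}wP_s&P_s&&\\P_s&\ddots&\ddots&\\&\ddots&\ddots&P_s\\&&P_s&wP_s\end{pmatrix}$. (1.2)
Here, $I_r$ is the $r\times r$ identity matrix and $A^w_{r,s}$ is block-tri-diagonal with $s\times s$ blocks. Finally, we have $\mathcal{A}^w_{n\times r,s}$ as a block-tri-diagonal $nrs\times nrs$ matrix structured in $rs\times rs$ blocks: $\mathcal{A}^w_{n\times r,s}=\begin{pmatrix}A^w_{r,s}&P_{r,s}&&\\P_{r,s}&\ddots&\ddots&\\&\ddots&\ddots&P_{r,s}\\&&P_{r,s}&A^w_{r,s}\end{pmatrix}$. (1.3)
Let $\nu$ be a probability distribution on $\mathbb{R}$. The Anderson type model on $\mathbb{A}^w_{n\times r,s}$ with single site distribution $\nu$ is given by the random real symmetric matrix $H^w_{n,r,s}:=\mathcal{A}^w_{n\times r,s}+\mathcal{V}_{nrs}$ (1.4) where $\mathcal{V}_{nrs}$ is an $nrs\times nrs$ real diagonal matrix with independent identically $\nu$-distributed random variables along the diagonal. We will assume that the distribution is compactly supported. This is a family of random band matrices with randomness only on the diagonal, size $N=nrs$, band-width $2rs$ and sparse structure in the entries, but with some local mean-field setup within groups of $s\times s$ blocks.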
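To make the matrix structure concrete, here is a small Python sketch that assembles one sample of such a matrix for tiny parameters. It rests on assumptions that are not fixed by the text: the strip weight function is taken to be 1 on nearest-neighbour pairs and $w$ on the diagonal, the single site distribution is taken to be uniform on $[-\sigma,\sigma]$ (one possible centered, compactly supported choice), and all function names are hypothetical.

```python
import random

def strip_weights(n, r, w):
    """Weight matrix of the n x r strip: w on the diagonal, 1 between neighbours."""
    pts = [(x1, x2) for x1 in range(n) for x2 in range(r)]  # lexicographic order

    def weight(p, q):
        if p == q:
            return w
        return 1.0 if abs(p[0] - q[0]) + abs(p[1] - q[1]) == 1 else 0.0

    return [[weight(p, q) for q in pts] for p in pts]

def anderson_matrix(n, r, s, w, sigma, seed=0):
    """One sample of (strip weights) (x) P_s plus an i.i.d. diagonal potential."""
    rng = random.Random(seed)
    base = strip_weights(n, r, w)
    size = n * r * s
    # Index a = (x1 * r + x2) * s + x3, so a // s is the index of the strip point.
    h = [[base[a // s][b // s] / s for b in range(size)] for a in range(size)]
    for a in range(size):
        h[a][a] += rng.uniform(-sigma, sigma)  # assumed uniform potential
    return h
```

In this ordering two indices are coupled only if their strip points coincide or are neighbours, so all entries with index distance at least $2rs$ vanish, in line with the band-width stated above.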
Assumption. (A1) We assume that the distribution $\nu$ of the single site potential (diagonal entries of $\mathcal{V}_{nrs}$) is compactly supported, say in the interval $[-\sigma,\sigma]$.
(A2) Furthermore let us assume that the distribution is centered, $\int v\,d\nu(v)=0$.
We will often need averaged quantities over the distribution $\nu$ or products thereof $\nu^{\otimes k}$ as the diagonal entries of $\mathcal{V}_{nrs}$ are all independently $\nu$-distributed. When the random variables and their dependence on these entries are clear, we will denote the expectation values by $\mathbb{E}$. Furthermore, in these expressions a variable $v$ will express a $\nu$-distributed independent random variable. Now let us introduce the harmonic mean $h_\lambda:=\big(\mathbb{E}\big(\frac{1}{\lambda-v}\big)\big)^{-1}$ and define the interval $I_{w,\nu}$. The harmonic-mean to arithmetic-mean inequality gives $h_\lambda<\lambda$ for $\lambda>\sigma$ and we find (1.7). So for small $\sigma$ or large $w$ the set $I_{w,\nu}$ will not be empty.
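The inequality $h_\lambda<\lambda$ can be seen explicitly for a discrete single site distribution. The sketch below assumes that $h_\lambda$ denotes the harmonic mean of $\lambda-v$ under $\nu$, i.e. $h_\lambda=1/\mathbb{E}(1/(\lambda-v))$, and represents $\nu$ by a list of atoms; both the representation and the function name are illustrative choices.

```python
from fractions import Fraction

def harmonic_mean(lam, atoms):
    """h_lam = 1 / E[1 / (lam - v)] for a discrete law given as (value, probability) pairs."""
    return 1 / sum(p / (lam - v) for v, p in atoms)

# Symmetric two-point law on {-sigma, +sigma}: centered and compactly supported.
sigma = Fraction(1)
two_point = [(-sigma, Fraction(1, 2)), (sigma, Fraction(1, 2))]
h = harmonic_mean(Fraction(2), two_point)
```

For this two-point law one finds $h_\lambda=\lambda-\sigma^2/\lambda$ in closed form, so the gap between the harmonic and the arithmetic mean is of order $\sigma^2/\lambda$.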
Theorem 1.2. Let $H^w_{n,r,s}$ be the Anderson models on the antitree $\mathbb{A}^w_{n\times r,s}$ with single site distribution $\nu$ under the assumptions (A1) and (A2). For almost any $\lambda\in I_{w,\nu}$ there exist sequences $s_k\gg n_k\gg r_k\to\infty$ and normalization constants $N_k$ such that $N_k\,{\rm spec}(H^w_{n_k,r_k,s_k}-\lambda)\Rightarrow{\rm Sine}_1$ for $k\to\infty$. The growth of $s_k/n_k\to\infty$ and $r_k\to\infty$ can be chosen as slow as one wants, meaning that for any increasing function $f(n)$ growing towards infinity one finds sequences $s_k,n_k,r_k$ satisfying this limit with $s_k/n_k<f(n_k)$ and $r_k<f(n_k)$.
Remark 1.3. The spectrum ${\rm spec}(H^w_{n_k,r_k,s_k}-\lambda)$, i.e. the set of eigenvalues of $H^w_{n_k,r_k,s_k}-\lambda$, is considered as a random point process and the convergence holds in the sense of a weak limit of random point processes as described in the introduction for the GOE case.
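Let us go back to blocks in $\mathbb{Z}^3$ and consider the set $\mathbb{Z}_{n\times r\times s}$ as introduced above, an $n\times r\times s$ grid within $\mathbb{Z}^3$. Now let us introduce the discrete Laplacian $\Delta_{n,r,s}$ on $\mathbb{Z}_{n\times r\times s}$ but with periodic boundary conditions in the last coordinate direction. For the other directions we use Dirichlet boundary conditions. This corresponds to introducing an additional edge from points $(x_1,x_2,1)$ to $(x_1,x_2,s)$ for any $x_1,x_2$. All the edges get weight one and we have no point weights. This means the matrix (or weight function) associated to $\Delta_{n,r,s}$ has the entry 1 at $(x,y)$ if $\|x-y\|_1=1$ or if $x_1=y_1$, $x_2=y_2$ and $\{x_3,y_3\}=\{1,s\}$, and the entry 0 otherwise. Using the same basis structure as before we obtain $\Delta_{n,r,s}=\begin{pmatrix}\Delta_{r,s}&I_{rs}&&\\I_{rs}&\ddots&\ddots&\\&\ddots&\ddots&I_{rs}\\&&I_{rs}&\Delta_{r,s}\end{pmatrix}$ where in general $I_m$ will denote the $m\times m$ identity matrix and $\Delta_{r,s}$ is an $rs\times rs$ tri-diagonal block matrix made of $s\times s$ blocks given by $\Delta_{r,s}=\begin{pmatrix}\Delta^p_s&I_s&&\\I_s&\ddots&\ddots&\\&\ddots&\ddots&I_s\\&&I_s&\Delta^p_s\end{pmatrix}$, $\Delta^p_s=\begin{pmatrix}0&1&&&1\\1&0&1&&\\&\ddots&\ddots&\ddots&\\&&1&0&1\\1&&&1&0\end{pmatrix}$. Because of the periodic boundary condition in the third coordinate we use the superscript 'p' for $\Delta^p_s$. This periodicity is reflected by the top-right and bottom-left entry 1 in $\Delta^p_s$. This Laplacian commutes with the orthogonal projection $\mathcal{P}$ onto the functions which are constant along the third coordinate direction, i.e.
$\mathcal{P}\Delta_{n,r,s}=\Delta_{n,r,s}\mathcal{P}$ where $(\mathcal{P}\psi)(x_1,x_2,x_3)=\frac{1}{s}\sum_{k=1}^s\psi(x_1,x_2,k)$. In matrix form using $s\times s$ blocks we have the block structure $\mathcal{P}={\rm diag}(P_s,\ldots,P_s)$ with $nr$ such blocks. Using $\Delta^p_s1_s=2\cdot1_s$, implying $P_s\Delta^p_s=\Delta^p_sP_s=2P_s$, and $I_sP_s=P_sI_s=P_s$ we find $\mathcal{P}\Delta_{n,r,s}=\mathcal{P}\Delta_{n,r,s}\mathcal{P}=\mathcal{A}^2_{n\times r,s}$. Here, $\mathcal{A}^2$ is not the square of $\mathcal{A}$; rather, in our notation as above it means that we take the adjacency matrix $\mathcal{A}^w_{n\times r,s}$ of the $\mathbb{Z}_{n\times r}$-antitree with the point weight $w=2$ on $\mathbb{Z}_{n\times r}$. This leads to the following corollary:
Corollary 1.4. Under the assumptions (A1) and (A2), for almost any $\lambda\in I_{2,\nu}$ there exist sequences $s_k\gg n_k\gg r_k\to\infty$ (where $s_k/n_k$ and $r_k$ can grow as slow as wanted in comparison to the growth of $n_k$) such that with the correct normalization $N_k$ we find $N_k\,{\rm spec}(\mathcal{P}\Delta_{n_k,r_k,s_k}+\mathcal{V}_{n_kr_ks_k}-\lambda)\Rightarrow{\rm Sine}_1$ for $k\to\infty$.
Remark 1.5. (i) The Laplacian $\Delta$ on $\mathbb{Z}^3$ or $\mathbb{N}^3$ can be seen as some sort of limit of $\Delta_{n,r,s}$ for $n,r,s\to\infty$ and has spectrum $[-6,6]$. Indeed, for any $n,r,s\in\mathbb{N}$ we have ${\rm spec}\,\Delta_{n,r,s}\subset[-6,6]$. However, the projection $\mathcal{P}$ reduces to the subspace with top energy for the Laplacian in the $x_3$ direction, leading to ${\rm spec}\,\mathcal{P}\Delta_{n,r,s}\subset[-2,6]$, and for $n,r,s\to\infty$ one fills this interval.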
(ii) Modifying the proofs slightly one can replace $\Delta_{n,r,s}$ by the Dirichlet-Laplacian $\Delta^D_{n,r,s}$ on the $n\times r\times s$ grid. However, since $\Delta^D_{n,r,s}$ does not commute with $\mathcal{P}$ one needs to consider $\mathcal{P}\Delta^D_{n,r,s}\mathcal{P}+\mathcal{V}_{nrs}$, and a similar calculation as above shows $\mathcal{P}\Delta^D_{n,r,s}\mathcal{P}=\mathcal{A}^{2-2/s}_{n\times r,s}$. So the whole difference is a further $s$-dependence which in the limit $s\to\infty$ has no influence. This reflects the fact that in the limit towards infinity the boundary conditions of the Laplacian do not matter.
(iii) This result cannot be seen as a limiting statistics for boxes on some fixed Anderson model on a separable, infinite dimensional Hilbert space because there is no limit of the projections $\mathcal{P}=\mathcal{P}(n,r,s)$ in $\ell^2(\mathbb{Z}^3)$ and there is no operator limit of $\mathcal{P}\Delta_{n,r,s}$ for boxes of size $n\times r\times s$ approaching $\mathbb{Z}^3$.
(iv) With $s_k/n_k$ and $r_k$ growing very slowly, less than any power of $n_k$, the corresponding subsets $\mathbb{Z}_{n_k\times r_k\times s_k}$ of $\mathbb{Z}^3$ look like very thin rectangular shaped boxes in $\mathbb{Z}^3$.
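The block identities behind $\mathcal{P}\Delta_{n,r,s}=\mathcal{A}^2_{n\times r,s}$ and behind the point weight $2-\frac{2}{s}$ in the Dirichlet case reduce to statements about $s\times s$ matrices, which can be verified exactly. A minimal Python sketch (helper names are illustrative; $s\ge3$ is assumed so that the periodic edge is distinct from the nearest-neighbour edges):

```python
from fractions import Fraction

def chain_adjacency(s, periodic):
    """Adjacency matrix of the chain {1,...,s}; periodic adds the edge between 1 and s."""
    if s < 3:
        raise ValueError("need s >= 3")
    d = [[0] * s for _ in range(s)]
    for k in range(s - 1):
        d[k][k + 1] = d[k + 1][k] = 1
    if periodic:
        d[0][s - 1] = d[s - 1][0] = 1
    return d

def mean_field_projection(s):
    """P_s: all entries equal to 1/s."""
    return [[Fraction(1, s)] * s for _ in range(s)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def scaled(c, m):
    return [[c * x for x in row] for row in m]
```

With `P = mean_field_projection(s)` one gets `matmul(P, chain_adjacency(s, True)) == scaled(2, P)`, while the Dirichlet chain gives `matmul(matmul(P, D), P) == scaled(2 - Fraction(2, s), P)`.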

Transfer matrices
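The block structure of $\mathcal{A}^w_{n\times r,s}$ will allow the analysis of the eigenvalue equation through transfer matrices of similar type as in [Sa3]. In order to see this, let us identify $\psi\in\mathbb{C}^{nrs}$ with $(\psi_i)_{i=1}^n$, $\psi_i\in\mathbb{C}^{rs}$, and write $\mathcal{V}_{nrs}={\rm diag}(V_1,\ldots,V_n)$ with diagonal $rs\times rs$ matrices $V_i$ whose entries are independent real random variables and $\nu$-distributed. The projection $P_{r,s}$ can be written as $P_{r,s}=\Phi_{r,s}\Phi_{r,s}^\top$ where $\Phi_{r,s}:=I_r\otimes1_s$ (2.1) is an $rs\times r$ matrix. Recall that $1_s\in\mathbb{C}^s$ is the normalized 'mean-field column vector'. With the convention $\psi_0=\psi_{n+1}=0$, the eigenvalue equation $H^w_{n,r,s}\psi=z\psi$ reads $(A^w_{r,s}+V_i)\psi_i+P_{r,s}(\psi_{i-1}+\psi_{i+1})=z\psi_i$ for $i=1,\ldots,n$. (2.3) Setting $\vec u_i:=\Phi_{r,s}^\top\psi_i$ this gives $\vec u_i=\Phi_{r,s}^\top(z-A^w_{r,s}-V_i)^{-1}\Phi_{r,s}(\vec u_{i+1}+\vec u_{i-1})$ whenever the inverse exists. Consider the $r\times r$ matrix $\Phi_{r,s}^\top(z-A^w_{r,s}-V_i)^{-1}\Phi_{r,s}$. Hence, it is defined and invertible for all but finitely many values of $z\in\mathbb{R}$ as the determinant is a rational function of $z$. If it is invertible we can re-write the eigenvalue equation in the following form, $\begin{pmatrix}\vec u_{i+1}\\\vec u_i\end{pmatrix}=T^{w,z}_{i;r,s}\begin{pmatrix}\vec u_i\\\vec u_{i-1}\end{pmatrix}$ where $T^{w,z}_{i;r,s}:=\begin{pmatrix}\big(\Phi_{r,s}^\top(z-A^w_{r,s}-V_i)^{-1}\Phi_{r,s}\big)^{-1}&-I_r\\I_r&0\end{pmatrix}$. (2.5)
We call $T^{w,z}_{i;r,s}$ the $i$-th transfer matrix at energy $z$ of $\mathcal{A}^w_{n\times r,s}$. We write the energy or spectral parameter $z$ as an upper index because the dependence on $z$ is somewhat of the same flavor as the one on $w$. Now let $Q_s$ be an $s\times(s-1)$ matrix such that $(1_s,Q_s)$ is orthogonal, meaning that $1_s^\top Q_s=0$ and $Q_s^\top Q_s=I_{s-1}$. Moreover, let $\Delta^D_r$ denote the adjacency matrix of the chain $\{1,\ldots,r\}$ (Dirichlet boundary conditions), which is slightly different from the periodic one $\Delta^p_s$ used above, so that $A^w_{r,s}=\Phi_{r,s}(wI_r+\Delta^D_r)\Phi_{r,s}^\top$. The diagonal matrix $V_i$ can be further partitioned into $s\times s$ blocks to obtain $V_i={\rm diag}(V_{i,1},\ldots,V_{i,r})$, $V_{i,j}={\rm diag}(v_{i,j,1},\ldots,v_{i,j,s})$ with $v_{i,j,k}$ being the random potential at the point $(i,j,k)$, so that $\Phi_{r,s}^\top(z-V_i)^{-1}\Phi_{r,s}={\rm diag}\big(1_s^\top(z-V_{i,j})^{-1}1_s\big)_{j=1}^r$. (2.10) Therefore, with $v^z_{i,j;s}:=\big(\frac{1}{s}\sum_{k=1}^s\frac{1}{z-v_{i,j,k}}\big)^{-1}$, we finally obtain $T^{w,z}_{i;r,s}=\begin{pmatrix}{\rm diag}(v^z_{i,1;s},\ldots,v^z_{i,r;s})-wI_r-\Delta^D_r&-I_r\\I_r&0\end{pmatrix}$. (2.11)
For some parameters $z=\lambda\in\mathbb{R}$ some of the inverses in the definition of the transfer matrix (2.5) are not defined. However, whenever possible we define it by analytic extension of the map $z\mapsto T^{w,z}_{i;r,s}$. Note that by definiteness of the imaginary parts in the occurring inverses there is never a problem for non-real parameters $z\in\mathbb{C}\setminus\mathbb{R}$. For this reason we define:
Definition 2.1. The value $\lambda\in\mathbb{R}$ (spectral parameter) is called singular for $H^w_{n,r,s}$ at the $i$-th slice if the map $z\mapsto T^{w,z}_{i;r,s}$ is not defined at $\lambda$ after analytic extension. We call $\lambda\in\mathbb{R}$ singular for $H^w_{n,r,s}$ if it is singular at some slice $i=1,\ldots,n$.
Note that by (2.9) and (2.11) the finite set of singular parameters for $H^w_{n,r,s}$ is contained in the convex hull of the support of $\nu$ and hence inside the interval $[-\sigma,\sigma]$.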

The spectrum
For the spectrum and the determination of singular energies we may first split off some (trivial) part of the matrix $H^w_{n,r,s}$. For calculating the appearing Schur complement in the $i$-th transfer matrix it is sufficient to consider the subspace $\mathbb{V}_i\subset\mathbb{C}^{rs}$ which is the union of all cyclic spaces of $A^w_{r,s}+V_i$ associated to the column vectors of $\Phi_{r,s}$. It is clear that $A^w_{r,s}+V_i$ leaves the (random) subspace $\mathbb{V}_i$ and its orthogonal complement $\mathbb{V}_i^\perp$ invariant. We have $\mathbb{C}^{nrs}\cong(\mathbb{C}^{rs})^n$ and with this isomorphy we can identify the product $\mathbb{V}$ of the $\mathbb{V}_i$ as a subspace of $\mathbb{C}^{nrs}$, and we also have a natural embedding $\hat{\mathbb{V}}_i^\perp$ of the complements. This means $\psi\in\mathbb{V}\Leftrightarrow\forall i=1,\ldots,n:\psi_i\in\mathbb{V}_i$ and $\psi\in\hat{\mathbb{V}}_i^\perp\Leftrightarrow(\psi_j=0$ for $j\neq i$ and $\psi_i\in\mathbb{V}_i^\perp)$. We should mention that it is possible that $\mathbb{V}_i^\perp=\{0\}$ for all $i$ and $\mathbb{V}=\mathbb{C}^{nrs}$ is the full space. In fact, for continuous distributions $\nu$ of the single-site potentials this will happen with probability one.
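Proposition 3.1. We find including multiplicities that ${\rm spec}(H^w_{n,r,s})={\rm spec}(H^w_{n,r,s}|_{\mathbb{V}})\cup\bigcup_{i=1}^n{\rm spec}(V_i|_{\mathbb{V}_i^\perp})$ where $\mathbb{V}_i^\perp$ is non-trivial and ${\rm spec}(V_i|_{\mathbb{V}_i^\perp})$ non-empty if and only if there is $j\in\{1,\ldots,r\}$ such that $V_{i,j}$ has a multiple eigenvalue, with $V_{i,j}$ as defined in (2.10).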
Proof. Since ${\rm ran}\,\Phi_{r,s}\subset\mathbb{V}_i$ for any $i=1,\ldots,n$ we see that $H^w_{n,r,s}$ leaves $\mathbb{V}$ invariant. Similarly, for any $\psi_i\in\mathbb{V}_i^\perp$ we have $P_{r,s}\psi_i=\Phi_{r,s}\Phi_{r,s}^\top\psi_i=0$, giving that $H^w_{n,r,s}$ also leaves all the spaces $\hat{\mathbb{V}}_i^\perp$ invariant and the restriction of $H^w_{n,r,s}$ to $\hat{\mathbb{V}}_i^\perp$ is isomorphic to the restriction of $A^w_{r,s}+V_i$ to $\mathbb{V}_i^\perp$. Now for $\psi_i\in\mathbb{V}_i^\perp\subset\mathbb{C}^{rs}$ we can split up $\psi_i$ once more into $r$ parts $(\psi_{i,j})_{j=1}^r$ by $\mathbb{C}^{rs}=(\mathbb{C}^s)^r$ and use the block structure for $A^w_{r,s}$ as in (1.2) and $\Phi_{r,s}$ as in (2.1). Then $\Phi_{r,s}^\top\psi_i=0$ implies $1_s^\top\psi_{i,j}=0$ for all $j=1,\ldots,r$. This in turn implies $P_s\psi_{i,j}=1_s1_s^\top\psi_{i,j}=0$ and from (1.2) we get $A^w_{r,s}\psi_i=0$. Therefore we have $H^w_{n,r,s}=H^w_{n,r,s}|_{\mathbb{V}}\oplus\bigoplus_{i=1}^nV_i|_{\mathbb{V}_i^\perp}$ in terms of an orthogonal sum of operators (in fact matrices). The spectral decomposition follows. Moreover, by construction, $\mathbb{V}_i^\perp$ is non-trivial precisely if there is a non-zero eigenvector $\psi_i$ of $A^w_{r,s}+V_i$ which is orthogonal to all column vectors of $\Phi_{r,s}$. By the calculations above this is equivalent to finding $j$ and $\psi_{i,j}\neq0$ such that $1_s^\top\psi_{i,j}=0$ and $\psi_{i,j}$ is an eigenvector of $V_{i,j}$ as defined in (2.10). Using the fact that $V_{i,j}$ is diagonal, one can find such an eigenvector precisely if $V_{i,j}$ has an eigenvalue with multiplicity more than one.
Considering the eigenvalue equation (2.3)
Lemma 3.2. For any non-singular energy $\lambda\in\mathbb{R}$ the matrices $\Psi^{\lambda,i}$ are defined or can be defined by analytic extension of $z\mapsto\Psi^{z,i}$ at $z=\lambda$.
Proof. Let $\lambda$ be non-singular for $H^w_{n,r,s}$. If $\lambda\notin{\rm spec}(A^w_{r,s}+V_i)$ then the statement is clear by existence of the first inverse in (3.3) and existence of the last term at least by analytic extension in $\lambda$. So let $\lambda$ be an eigenvalue of $A^w_{r,s}+V_i$. In order to show that $\Psi^{\lambda,i}$ is defined by analytic extension, it is sufficient to show that $\varphi^\top\Psi^{\lambda,i}$ can be defined by analytic extension for any eigenvector $\varphi$ of the real symmetric matrix $A^w_{r,s}+V_i$ because there is an orthonormal basis of eigenvectors. So let $(A^w_{r,s}+V_i)\varphi=\lambda_0\varphi$. Then, for $\varepsilon\neq0$, $\varepsilon$ small, For $\lambda\neq\lambda_0$ it is clear that the limit $\varepsilon\to0$ exists as $\lambda$ is non-singular and therefore the limit of the second term exists. Let us now assume $\lambda=\lambda_0$. We need to use some Schur complement formulas. Since $\Phi_{r,s}^\top\Phi_{r,s}=I_r$ we can choose some orthonormal basis for $\mathbb{C}^{rs}$ and $\mathbb{C}^r$ such that $\Phi_{r,s}\equiv\begin{pmatrix}I\\0\end{pmatrix}$. We may work in these bases and give $\lambda I-A^w_{r,s}-V_i$ and $\varphi$ the corresponding block structures $\lambda I-A^w_{r,s}-V_i\equiv\begin{pmatrix}A&B\\B^\top&D\end{pmatrix}$, $\varphi\equiv\begin{pmatrix}\varphi_1\\\varphi_2\end{pmatrix}$. The symbol $\equiv$ shall remind that this is not how the matrices are defined but their appearance after some basis change putting $\Phi_{r,s}$ in the block structure as indicated. Then the eigenvalue equation for $\varphi$ transforms to $A\varphi_1+B\varphi_2=0$, $B^\top\varphi_1+D\varphi_2=0$ (3.4) and we find using the Schur complement formula where $P_{\ker}$ is the orthogonal projection onto the kernel of $D$ (note that $D$ is self-adjoint). We know that the limit of $B(D+\varepsilon I)^{-1}B^\top$ for $\varepsilon\to0$ exists as $\lambda$ is not singular. Hence, for any vector $v$ we have that Therefore ${\rm ran}\,B^\top\subset(\ker D)^\perp$ and $BP_{\ker}=(P_{\ker}B^\top)^\top=0$. Hence, This implies for $\varepsilon\to0$, which exists because the extension of the Schur complement is analytic in $z=\lambda$.
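Let us now introduce the products of the transfer matrices: $X^{w,z}_{i;r,s}:=T^{w,z}_{i;r,s}T^{w,z}_{i-1;r,s}\cdots T^{w,z}_{2;r,s}T^{w,z}_{1;r,s}$. (3.5) We finally obtain the key proposition of this section.
Proposition 3.3. Let $\lambda\in\mathbb{R}$ be non-singular for $H^w_{n,r,s}$. Then $\lambda$ is an eigenvalue of $H^w_{n,r,s}$ if and only if $\det\Big(\begin{pmatrix}I_r&0\end{pmatrix}X^{w,\lambda}_{n;r,s}\begin{pmatrix}I_r\\0\end{pmatrix}\Big)=0$ (3.6) or $\lambda$ is an eigenvalue of $V_i|_{\mathbb{V}_i^\perp}$ for some $i=1,\ldots,n$. Particularly, if $\lambda>\sigma$ then $\lambda$ is an eigenvalue of $H^w_{n,r,s}$ if and only if (3.6) holds.
Note that the second statement follows immediately as $\|V_i\|\le\sigma\Rightarrow{\rm spec}(V_i)\subset[-\sigma,\sigma]$ and all singular energies are also inside the interval $[-\sigma,\sigma]$.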
Proof. First let $\lambda\in{\rm spec}(H^w_{n,r,s})$. Then either $\lambda\in{\rm spec}(V_i|_{\mathbb{V}_i^\perp})$ for some $i=1,\ldots,n$ or $\lambda\in{\rm spec}(H^w_{n,r,s}|_{\mathbb{V}})$. For the latter case let $\psi=(\psi_i)_{i=1}^n$ be a corresponding non-zero eigenvector; note $\psi\in\mathbb{V}$.
Claim 1: For some $i=1,\ldots,n$ we have $\vec u_i=\Phi_{r,s}^\top\psi_i\neq\vec0$. If $\Phi_{r,s}^\top\psi_i=0$ for all $i=1,\ldots,n$, then $H^w_{n,r,s}\psi=\lambda\psi$ also implies $V_i\psi_i=\lambda\psi_i$ and we have $\psi_i\in\mathbb{V}_i^\perp$ and hence $\psi\in\mathbb{V}^\perp$, implying $\psi=0$ as $\psi\in\mathbb{V}$ as well.
Claim 2: $(\vec u_i)_i=(\Phi_{r,s}^\top\psi_i)_i$ satisfy the transfer matrix equation (2.5) at $z=\lambda$ with $\vec u_0=\vec u_{n+1}=\vec0$. If all appearing inverses in the definition of $T^{w,\lambda}_{i;r,s}$ in (2.5) exist for all $i=1,\ldots,n$ then this is clear, so we focus on the case when the transfer matrix is defined only by analytic extension. The eigenvalue equation for $\lambda$ leads to which after multiplying with $\Phi_{r,s}^\top$ from the left gives In both equations we have to set $\vec u_0=\vec0$ for $i=1$ and $\vec u_{n+1}=\vec0$ for $i=n$. With Lemma 3.2 the limit $\varepsilon\to0$ shows that $(\vec u_i)_i$ satisfies the transfer matrix equation with the transfer matrices defined by analytic extension to $\lambda$.
As not all of the $\vec u_i$ are zero, and $\vec u_0=\vec0$, we find that $\vec u_1\neq\vec0$ and we have $\begin{pmatrix}I_r&0\end{pmatrix}X^{w,\lambda}_{n;r,s}\begin{pmatrix}\vec u_1\\\vec0\end{pmatrix}=\vec u_{n+1}=\vec0$. (3.7) This implies (3.6). Conversely, assume (3.6); then we find $\vec u_1\neq\vec0$ satisfying (3.7). We again focus on the case where one or more of the transfer matrices at $z=\lambda$ are only defined by analytic extension.
In the limit ε → 0 with Lemma 3.2 we obtain that ψ(λ) is a non-zero eigenvector for the eigenvalue λ.

Effective energy, effective potential and elliptic channels
For $\lambda>\sigma$ the random variables $(v^\lambda_{i,j;s})_{i,j}$ are well defined and independent identically distributed; the distribution depends on $\lambda$ and $s$. Moreover, the law of large numbers gives for $\lambda>\sigma$ and $s\to\infty$ a limit distribution concentrated on the point $h_\lambda$. From (2.11) we thus define the effective energy by $E(\lambda):=h_\lambda-w=\big(\mathbb{E}\big(\frac{1}{\lambda-v}\big)\big)^{-1}-w$, where $v$ is a $\nu$-distributed random variable. Another important quantity will be the $\lambda$-dependent variance Moreover, let us define From now on we mostly make considerations for a fixed $\lambda\notin[-\sigma,\sigma]$ and will omit the $\lambda$-dependence most of the time. Note that $\mathbb{E}(Y_{i,j;s})=0$ and in $(i,j)$ we have a family of real (for $\lambda$ real), independent identically distributed random variables. Harmonic mean estimates for bounded random variables as in Theorem A.1 give (4.5) The error bounds are uniform in $\lambda$ on compact sets outside $[-\sigma,\sigma]$ (including compact subsets of $\mathbb{C}$).
The upper left $r\times r$ block entry of the transfer matrices are given by is the effective random potential in the $i$-th slice. In the $s\to\infty$ limit the eigenvalues and eigenvectors of $EI_r-\Delta^D_r$ will classify some of the asymptotic behavior of the products. Let us note that $\lambda\in I_{w,\nu}$ implies $E(\lambda)\in(-4,4)$. Moreover, $E(\lambda)$ is a continuous and strictly monotone function of $\lambda$ in $I_{w,\nu}$. As in [SV] we now separate elliptic and hyperbolic channels and diagonalize $\Delta^D_r$ by the orthogonal matrix $O_{jk}:=\sqrt{\frac{2}{r+1}}\,\sin\big(\frac{\pi jk}{r+1}\big)$, $j,k=1,\ldots,r$.
The eigenvalue of $\Delta^D_r$ corresponding to the $j$-th column vector of $O$ is given by $a_j=2\cos\big(\frac{\pi j}{r+1}\big)$, $j=1,\ldots,r$, so that $O^\top\Delta^D_rO={\rm diag}(a_1,\ldots,a_r)$. We focus on the case $-4<E(\lambda)\le0$; the other case is symmetric. In this case $E-a_j<2$. In the notions of [SV] we have a parabolic channel if there exists $j$ such that $|E-a_j|=2$, which in this case means $E-a_j=-2$. For any given $r$, there are $r$ such values of $E$ (and of $\lambda$). The union over $r\in\mathbb{N}$ gives some countable set of values in $E$ and $\lambda$, respectively. We will omit these values. Then, if there is no parabolic channel, there is $r_h=r_h(r,E)$ such that
$E-a_j<-2$ for $j=1,\ldots,r_h$ (hyperbolic channels),
$-2<E-a_j<2$ for $j=r_h+1,\ldots,r$ (elliptic channels).
So we have $r_h$ hyperbolic and $r_e:=r-r_h$ elliptic channels. Note that for any fixed $E$ and $r\to\infty$, $r_e=r_e(r,E)$ is of the order of $r$, $r_e\sim cr$ for some $c>0$. Then we define $\gamma_j\in\mathbb{R}$ and $z_j\in\mathbb{C}$, $|z_j|=1$ by and as in [SV] we define $\Gamma={\rm diag}(\gamma_1,\ldots,\gamma_{r_h})$, $Z={\rm diag}(z_1,\ldots,z_{r_e})$ (4.8) as well as Here, $U$ is a $2r_e\times2r_e$ diagonal matrix, $\tilde O$ a $2r\times2r$ orthogonal matrix written in $r\times r$ blocks and $Q$ a $2r\times2r$ matrix where the rows are divided in 4 blocks of sizes $r_h,r_e,r_e,r_h$ and the columns in 4 blocks of sizes $r_h,r_e,r_h,r_e$. All the non-zero blocks indicated above are diagonal square matrices. These matrices depend on $\lambda$. In order to get the eigenvalue processes we will have to vary the spectral parameter around $\lambda$ but we will use these fixed $Q$, $\tilde O$ and $U$ to describe our basis change, cf. (5.2). But first let us impose one more condition on the choice of $\lambda$, or rather $E(\lambda)$. For fixed $r$ the value $r_h$ changes exactly at the points $E=E(\lambda)$ where we have some parabolic channel. Hence, $I(r_0):=\{\lambda\in I_{w,\nu} : r_e(r,E(\lambda))=r_0\}$ is a union of intervals where $r_h$ and $r_e$ are constant and $Z=Z(\lambda)$ is an analytically dependent diagonal $r_0\times r_0$ matrix.
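The classification into hyperbolic and elliptic channels only depends on $r$ and the effective energy $E$, so it is easy to tabulate. The following Python sketch does this for given values; the function names are illustrative, the energy is passed in directly rather than computed from $\lambda$, and a small tolerance is used to detect the excluded parabolic case.

```python
import math

def dirichlet_eigenvalues(r):
    """Eigenvalues a_j = 2 cos(pi j / (r + 1)), j = 1, ..., r, in decreasing order."""
    return [2.0 * math.cos(math.pi * j / (r + 1)) for j in range(1, r + 1)]

def diagonalizing_matrix(r):
    """O_jk = sqrt(2 / (r + 1)) * sin(pi j k / (r + 1))."""
    c = math.sqrt(2.0 / (r + 1))
    return [[c * math.sin(math.pi * j * k / (r + 1)) for k in range(1, r + 1)]
            for j in range(1, r + 1)]

def classify_channels(r, energy, tol=1e-12):
    """Return the index lists (hyperbolic, elliptic) for the given effective energy."""
    hyperbolic, elliptic = [], []
    for j, a in enumerate(dirichlet_eigenvalues(r), start=1):
        gap = abs(energy - a)
        if math.isclose(gap, 2.0, abs_tol=tol):
            raise ValueError("parabolic channel at j = %d" % j)
        (elliptic if gap < 2.0 else hyperbolic).append(j)
    return hyperbolic, elliptic
```

For example, `classify_channels(5, -0.5)` yields one hyperbolic channel ($j=1$) and four elliptic ones, whereas $E=-1$ with $r=5$ hits a parabolic channel at $j=2$ and is rejected.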
Lemma 4.2. For each $r_0>0$ and Lebesgue almost all $\lambda\in I(r_0)$ we find that $Z$ as defined above is chaotic and moreover for any unitary diagonal $r_0\times r_0$ matrix $Z_*$ there is an increasing sequence $(n_k)_k$ of integers such that $Z^{n_k+1}\to Z_*$ for $k\to\infty$.
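The approximation property in Lemma 4.2 can be explored numerically for a diagonal unitary matrix given by its phases. The Python sketch below searches for the exponent $n$ up to a bound that brings $Z^{n+1}$ closest to a target $Z_*$ in the maximum norm over the diagonal entries. It is only an experiment under the assumption that the phases are supplied directly; it neither tests nor proves that a given $Z$ is chaotic, and the function name is made up.

```python
import cmath
import math

def best_power(angles, target_angles, max_n):
    """Find n in 1..max_n minimizing max_j |z_j^(n+1) - z*_j| for unit-modulus entries.

    angles:        phases of the diagonal entries z_j = exp(i * angle_j)
    target_angles: phases of the diagonal entries of the target matrix Z_*
    """
    best_n, best_err = None, float("inf")
    for n in range(1, max_n + 1):
        err = max(abs(cmath.exp(1j * (n + 1) * a) - cmath.exp(1j * t))
                  for a, t in zip(angles, target_angles))
        if err < best_err:
            best_n, best_err = n, err
    return best_n, best_err
```

Increasing `max_n` can only decrease the returned error, and for phases that are rationally independent together with $\pi$ one would expect the error to become arbitrarily small along a subsequence, in the spirit of the lemma.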
So we will consider $\lambda$ and $r$ such that we have elliptic channels, there is no parabolic channel and such that $Z$ is chaotic.

The limit of thin boxes with fixed width
We will first look at the situation s = mn with m and r constant and consider the eigenvalue process for n → ∞. Furthermore, we scale energy differences to λ by n(h 2 λ σ 2 λ +1) (cf. (A.5)) and define Here, the error bound is uniform for ε n varying inside a compact set so that λ ε n ∈ I w,ν . Let us map out the correspondences between notations here and in [SV] in order to understand the relations of the propositions. In principle s amounts to the disorder strength and 1 s corresponds to λ 2 or better to σ 2 λ 2 in [SV,Section 5], m amounts to σ −2 . Note in particular that the use of λ as in this paper does not correlate to the use of λ in [SV]. But E(λ) is the more important quantity here which corresponds to E in [SV], the use of ε is the same. The size of the transfer matrices r here corresponds to d in [SV], moreover r h and r e correspond to d h and d e in [SV], respectively.
Since s = mn from now on, we will omit the index s and replace it by m and n. Because of the different roles of m and n we will place the indices differently. This way notations correspond somewhat to the ones used in [SV]. Then using the definitions (4.8) and (4.9) for some fixed λ without parabolic channel such that Z is chaotic we define For ε = 0 the limit s → ∞ gives the non-random matrix written in blocks of sizes r h , 2r e , r h . Note that the upper block has the eigenvalues of size (absolute value) < 1, the middle part the eigenvalues of size 1 and the lower part the eigenvalues of size > 1. Hence, when considering products the upper part is decaying, the lower part growing and the middle part stays of order 1. For the products we will look at the same basis changes and scaling and define X ε,m i;r,n ∶= Q −1Õ⊺ X w,λ ε n i;r,mnÕ Q = T ε,m i;r,n T ε,m i−1;r,n ⋯ T ε,m 1;r,n (5.4) We also have to consider the impact of the perturbation in the spectral parameter. From (4.3), (4.6), (5.1) we obtain using s = mn that The error term is non-random and the bound is uniform for ε n varying in compact sets where λ ε n stays outside [−σ, σ]. Equation (5.5) is in essence the analogue of [SV,equation (5.5)] where 1 √ m here plays the role of σ there. Some difference is that here the randomness and the drift-term have some dependence on ε and m, however, this dependence will not matter in the limit. Note that by construction E(Y ε,m i;n ) = 0. Using (4.4) and (4.5) we find for ε varying inside compact sets that The error terms mean that the reminder terms are bounded by C(1 (mn) + ε n) with a uniform C as long as ε n stays inside some compact interval so that always λ ε n ∈ I w,ν . In particular for any compact set K there is N such that for n > N and ε ∈ K this bound is uniform.
Using these bounds, the moment bound in (4.5) and the independence of the Y i,j;s we see that Theorem B.1 is applicable towards an SDE limit for the products X ε,m i;n for fixed ε, m with scaling i ∼ n. More precisely, from the decomposition of T r in (5.3) define and let X ε,m i;r,n ∶= where X 0 is some adequate r × r matrix such that the Schur complement exists. Then, Theorem B.1 (i) gives a weak limit of stochastic processes being some stochastic processes with Λ ε,m 0 ∶= I 2re which for (ε, m) fixed satisfy some SDE (stochastic differential equation) in t. A special choice of X 0 and hence X 0 as in [SV] is needed for proving the limiting eigenvalue statistics mentioned further below. The covariance structure of the matrix Brownian motions appearing can be calculated as in [SV,Proposition 5.3 and Section 5.3], especially [SV,eq. (5.37)], as we have almost the same type of random matrices here with the same elliptic and hyperbolic channels and the same diagonalization of ∆ D r . This gives the following.
Proposition 5.1. Let λ be such that Z is chaotic. The family of processes Λ ε,m t satisfy SDEs in the evolution in t of the form A t and B t are independent matrix Brownian motions, A t is Hermitian and B t complex symmetric, i.e.
All covariances which do not follow are zero.
Note that the factors h 4 λ σ 2 λ arise from the variance of Y i,j;s , as compared to the variance 1 of the potential used in [SV]. As in [SV], for energies close to λ all the eigenvalues of H w n,r,s are given by the zeros of λ ′ ↦ det( I r 0 X w,λ ′ n;t,s Ir 0 ) (see Proposition 3.3). Note that for any C > 0 there is n 0 = n 0 (C, λ) such that λ ε n ∈ I w,ν ⊂ R ∖ [−σ, σ] for all |ε| < C and all n > n 0 . Using the calculations in [SV,Theorem 5.4], one can change the analytic function in ε characterizing the eigenvalues along suitable subsequences and, by Theorem B.1 (iii), obtain another characterization of this point process in the limit. This leads to the following.
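To illustrate the characterization of eigenvalues as zeros of a determinant built from transfer matrix products, the following is a minimal sketch for the simplest analogue: a one-dimensional chain with 2 × 2 transfer matrices (so r = 1). It is not the antitree model of this paper; the helper names `transfer_product` and `chain_hamiltonian`, the uniform potential and the chain length are assumptions made only for illustration. The sketch checks that the upper-left entry of the transfer matrix product equals the characteristic polynomial, so its zeros in the spectral parameter are exactly the eigenvalues.

```python
import numpy as np

def transfer_product(potential, lam):
    """Return T_n(lam) ... T_1(lam) for a 1D chain (illustrative r = 1 case)."""
    product = np.eye(2)
    for v in potential:
        # One step of psi_{k+1} = (lam - v_k) psi_k - psi_{k-1}
        step = np.array([[lam - v, -1.0], [1.0, 0.0]])
        product = step @ product
    return product

def chain_hamiltonian(potential):
    """Adjacency matrix of a path graph plus a diagonal potential."""
    n = len(potential)
    hopping = np.eye(n, k=1) + np.eye(n, k=-1)
    return hopping + np.diag(potential)

rng = np.random.default_rng(0)
potential = rng.uniform(-1.0, 1.0, size=6)  # hypothetical iid potential
H = chain_hamiltonian(potential)

lam = 0.3
lhs = transfer_product(potential, lam)[0, 0]
rhs = np.linalg.det(lam * np.eye(len(potential)) - H)
print(lhs, rhs)  # the two numbers agree up to rounding

# At each eigenvalue the upper-left entry of the product vanishes.
residuals = [abs(transfer_product(potential, e)[0, 0]) for e in np.linalg.eigvalsh(H)]
print(max(residuals))
```

In the block setting of this paper the scalar upper-left entry is replaced by the determinant of an r × r block, but the mechanism (boundary conditions on both ends select the zeros) is the same.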
Proposition 5.2. Let E n,r,s be the process of eigenvalues of H n,r,s − λI nrs re-scaled by the factor n(h 2 λ σ 2 λ + 1), i.e. let E n,r,s = n(h 2 λ σ 2 λ + 1) spec (H n,r,s − λ I nrs ) . Fixing r let λ ∈ I w,ν be such that Z (as defined in (4.8)) is chaotic, let n k be some strictly increasing sequence such that Z n k +1 → Z * for k → ∞. Then, E n k ,r,mn k converges to the zero process of the determinant of a r e (λ) × r e (λ) matrix, The important part here is that the SDEs can be jointly solved in ε with unique analytic versions in ε (distributions on the set of analytic functions, see Theorem B.1 (ii) ). Therefore, the random set of zeros, zeros ε f (Λ ε,m 1 ) = {ε ∈ C ∶ f (Λ ε,m 1 ) = 0} for an analytic function f is well defined (as a distribution on the set of sets) and makes sense as a point process if P(f (Λ ε,m 1 ) ≡ 0 ∀ε ∈ C) = 0. The factor (h 2 λ σ 2 λ + 1) occurs here because it also occurs in the perturbations λ ε n of λ. Note that with fixing r and letting s ∼ n going to infinity of the same order we basically look at a sequence of graphs resembling a quasi-two-dimensional limit.

The GOE limit
Let us now explain how one gets from Propositions 5.1 and 5.2 to the limiting GOE statistics as in [SV, VV]. Formally, the first step is like a derivative of the SDE in Proposition 5.1 for small 1/√m when replacing ε by ε/√m, meaning that we zoom in more locally. Then, in the limit m → ∞, many (groups of) eigenvalues of this process move to infinity, and one group remains whose spacing is like that of the eigenvalues of a random matrix with Gaussian entries. These random matrices are almost as in the GOE; there is only a slightly different covariance structure and some dependence. Afterwards, the limit r → ∞ finally leads to the Sine 1 process. Together with the limit of the previous section, this is a triple limit leading to the GOE statistics. Fixing r we look at the process √ m (X ε √ m,m i;r,n − X 0 ) in the limit m → ∞. On the level of the limiting process Λ ε,m t as in Proposition 5.1 let us note that
In the limit m → ∞ the SDE can be easily solved and one finds as in [SV] Λ ε,m t m→∞ ⇒ Λ ε t ∶= ε t S
Now taking λ such that Z is chaotic as in Proposition 5.2 and taking a sequence n k with Z n k +1 → I re we find the limiting eigenvalue processes E n k ,r,mn k k→∞ ⇒ E r,m ∶= zeros ε det I re I re Λ ε,m 1 I re −I re .
Then, working with analytic versions in ε and 1/√m for this family of processes one finds as in [SV]
Using the calculations as in [SV,Lemma 5.5] in combination with Theorem B.1 (iii), one can obtain this limit along a double sequence n k ≫ m k → ∞. More precisely:
Proposition 6.1. Let λ be such that Z is chaotic, let n k be a strictly increasing sequence of natural numbers such that Z n k +1 → I re and let m k → ∞ be an increasing sequence such that √ m k Z n k +1 − I re → 0. Then for t > 0, jointly in t ∈ (0, 1] and ε varying in any finite subset of C we find
Moreover, for the re-scaled eigenvalue process E n,r,s as defined above we find
Let us note that from the construction it is clear that, given any function f (n) increasing (slowly) to ∞, one can restrict to sequences such that m k < f (n k ).
Proof of Theorem 1.2. Let b be a standard Gaussian variable and let K = K(r e ) be an independent real symmetric r e × r e matrix with Gaussian entries such that E((K ii ) 2 ) = 5/4 and E((K ij ) 2 ) = 1 for i ≠ j. Then in distribution,
As explained in [VV], using methods of [ESYY] the local eigenvalue process converges to the Sine 1 process for r e → ∞. More precisely, √ r e spec(K(r e ) + b I re ) re→∞ ⇒ Sine 1 .
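As a numerical sanity check of the matrix model K(r e ) + b I appearing in this step, one can sample it and look at a scale-free local statistic. The sketch below assumes centered Gaussian entries with the variances stated above (5/4 on the diagonal, 1 off the diagonal) and uses the mean ratio of consecutive bulk spacings as a diagnostic; this ratio statistic is not used in the paper and is swapped in here only because it needs no unfolding and is unaffected by the factor √ r e and by the shift b. The function names, matrix size and sample count are illustrative assumptions. For GOE-type local statistics the mean ratio is commonly reported to be about 0.53, compared to about 0.39 for Poisson statistics.

```python
import numpy as np

def sample_shifted_matrix(size, rng):
    """Sample K + b*I with K real symmetric Gaussian, Var(K_ii) = 5/4, Var(K_ij) = 1."""
    upper = np.triu(rng.normal(size=(size, size)), k=1)
    diagonal = np.diag(rng.normal(scale=np.sqrt(1.25), size=size))
    b = rng.normal()  # independent standard Gaussian shift
    return upper + upper.T + diagonal + b * np.eye(size)

def mean_spacing_ratio(size, samples, rng):
    """Average of min(gap, next gap) / max(gap, next gap) over bulk eigenvalues."""
    ratios = []
    for _ in range(samples):
        eig = np.linalg.eigvalsh(sample_shifted_matrix(size, rng))
        bulk = eig[size // 4 : 3 * size // 4]  # discard spectral edges
        gaps = np.diff(bulk)
        ratios.append(np.minimum(gaps[:-1], gaps[1:]) / np.maximum(gaps[:-1], gaps[1:]))
    return float(np.mean(np.concatenate(ratios)))

rng = np.random.default_rng(1)
ratio = mean_spacing_ratio(size=60, samples=200, rng=rng)
print(ratio)
```

Such a finite-size experiment only indicates level repulsion of GOE type; it does not replace the convergence argument of [VV] and [ESYY].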
Now, for almost all λ ∈ I w,ν i.e. almost all E(λ) ∈ (−4, 4) we find that for all r ∈ N, Z is chaotic and there is no parabolic channel. Let us fix such a λ. Then for r → ∞ we also find r e (r, E) → ∞ and hence (r + 1)r e h 2 λ σ λ E r ⇒ Sine 1 .
This convergence and the convergence mentioned in Proposition 6.1 happen in the topology of weak convergence for point processes. Therefore, one can construct some diagonal sequence m k , n k , r k → ∞ such that with s k = m k n k and r e,k = r e (r k , E) we find m k (r k + 1) r e,k h 2 λ σ λ E n k ,r k ,s k k→∞ ⇒ Sine 1 .
This proves Theorem 1.2 with the normalization constant N k ∶= (h 2 λ σ 2 λ + 1) n k s k (r k + 1)r e,k h 2 λ σ λ . Now let f (n) be any (slowly) increasing function with f (n) → ∞ for n → ∞. In Proposition 6.1 one may choose m k < f (n k ) and start the sequence with n k > f (r) . Therefore, we may choose m k < f (n k ) and r k < f (n k ).

Appendix A. Harmonic means of random variables
In the transfer matrices we see effective potentials that are harmonic means of certain independent identically distributed (iid) random variables. Certain estimates are crucial for the proofs. We therefore consider in this section independent identically distributed random variables X k ∈ [a, b], 0 < a < b, k ∈ N. These variables correspond to E − v i,j,k . We will consider the harmonic means V s and the harmonic average h defined by where E denotes the expectation value. V s corresponds to the random variables v λ i,j;s as in (2.9) and h corresponds to h λ . The second and third moment of the centered random variable 1 X j − 1 h will be of some importance, therefore let Note σ 1 = 0 and σ 2 2 is the variance of 1 X k and corresponds to σ 2 λ in the application of the following estimates.
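The role of h and σ 2 can be made concrete with a small Monte Carlo experiment. The sketch below assumes that V s is the harmonic mean of X 1 , . . . , X s (the reciprocal of the average of the 1/X k ) and that h = 1/E(1/X k ), and it takes X k uniform on [a, b] purely as an example distribution. By a first-order expansion (delta method) one expects V s to concentrate at h with variance close to h⁴ σ 2 ² / s, which is consistent with the factor h 4 λ σ 2 λ mentioned in Section 5. The precise statements of Theorem A.1 are not reproduced by this sketch, and the function name and parameter values are assumptions.

```python
import numpy as np

a, b = 1.0, 3.0                          # support of the illustrative uniform law
inv_mean = np.log(b / a) / (b - a)       # E(1/X) for X uniform on [a, b]
h = 1.0 / inv_mean                       # harmonic average h = 1 / E(1/X)
sigma2_sq = 1.0 / (a * b) - inv_mean**2  # Var(1/X), since E(1/X^2) = 1/(ab)

def harmonic_means(s, samples, rng):
    """Draw `samples` independent harmonic means of s iid uniform variables."""
    x = rng.uniform(a, b, size=(samples, s))
    return 1.0 / np.mean(1.0 / x, axis=1)

rng = np.random.default_rng(2)
s = 400
v = harmonic_means(s, samples=10000, rng=rng)

empirical = s * np.var(v)      # rescaled empirical variance of V_s
predicted = h**4 * sigma2_sq   # first-order prediction h^4 * sigma_2^2
print(np.mean(v), h)
print(empirical, predicted)
```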
Theorem A.1. There exists a continuous function C = C(a, b, h, σ 2 , σ 3 ) such that uniformly in s, Moreover, for the higher moments we find where we used that unpaired indices lead to zero expectation and the fact that (2m)! 2 m m! is the number of pairings of the set {1, . . . , 2m}. Now using that there are s m m-tuples (k 1 , . . . , k m ) and using the bound of Y k as mentioned above we find s 2 which with (A.4) (using the second-last and last term) gives (A.1). Taking powers of (A.4) and using similar estimates lead to (A.2) and (A.3).
For the general moment bound we use
When varying the spectral parameter we also need to understand how the harmonic average varies for the definition in (5.1). This amounts to replacing X k by X k,ε = X k + ε and recalculating h ε = 1/E(X −1 k,ε ). Note that, by the continuity of C = C(a, b, h, σ 2 , σ 3 ), the error terms in the formulas above are also uniform in ε on compact sets |ε| ≤ c as long as c < a, because X k,ε ∈ [a − c, b + c] under such perturbations. Using
where the error bound is uniform on compact sets |ε| ≤ c with a − c > 0.
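For orientation, here is a short first-order expansion of the harmonic average under the shift X k ↦ X k + ε. It is a sketch that assumes only h ε = 1/E(X −1 k,ε ) and that σ 2 ² is the variance of 1/X k , as stated in this appendix; it is not claimed to be the display omitted above. It does explain why the factor (h² σ 2 ² + 1) appears in the rescaling of the spectral parameter.

```latex
\begin{align*}
\frac{1}{h_\varepsilon}
  = \mathbb{E}\Big(\frac{1}{X_k+\varepsilon}\Big)
  &= \mathbb{E}\Big(\frac{1}{X_k}\Big)
     - \varepsilon\,\mathbb{E}\Big(\frac{1}{X_k^{2}}\Big)
     + \mathcal{O}(\varepsilon^{2})
   = \frac{1}{h} - \varepsilon\Big(\sigma_2^{2} + \frac{1}{h^{2}}\Big)
     + \mathcal{O}(\varepsilon^{2}), \\
h_\varepsilon
  &= h + \varepsilon\,\big(h^{2}\sigma_2^{2} + 1\big) + \mathcal{O}(\varepsilon^{2}).
\end{align*}
```

The second line follows by inverting the first and using E(1/X k ²) = σ 2 ² + 1/h².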

Appendix B. SDE limits for products of random matrices
In this appendix we summarize the key results of [SV] that are used in this paper. Let $(\Omega, \mathcal{A}, \mathbb{P})$ be a probability space, let $B_r = \{z \in \mathbb{C} : |z| < r\}$ be the open ball of radius $r$ around zero, and consider a family of analytic random matrices $T^{\varepsilon}_{k;n} : \Omega \to \mathbb{C}^{r\times r}$ for $k, n \in \mathbb{N}$, $\varepsilon/n \in B_r$, of the form
$$T^{\varepsilon}_{k;n} = T_0 + \frac{1}{\sqrt{n}}\, V_{k;n} + \frac{1}{n}\,(\varepsilon Y_n + W_n) + \frac{1}{n^{3/2}}\, Z^{\varepsilon}_{k;n},$$
where $(V_{k;n}, Z^{\varepsilon}_{k;n})_{k=0}^{\infty}$ are independent identically distributed random variables (for fixed $\varepsilon$ and $n$) and $T_0$, $Y_n$ and $W_n$ are non-random (that is, they are fixed for all $\omega \in \Omega$). Analyticity means that for any $\omega \in \Omega$ the dependence of $T^{\varepsilon}_{k;n}(\omega)$, and thus of $Z^{\varepsilon}_{k;n}(\omega)$, on $\varepsilon \in nB_r = B_{nr}$ is analytic. The lowest order term shall be block-diagonalized in the form
where $\|\Gamma_0\| < 1$, $\|\Gamma_2\| < 1$. Here, $\mathrm{U}(r_1)$ is the unitary group of $r_1 \times r_1$ matrices. Moreover, we assume that the $V_{k;n}$ have mean zero, $\mathbb{E}(V_{k;n}) = 0$, and that we have, uniformly for $\varepsilon/n \in B_r$ and $n \in \mathbb{N}$, an 8th moment bound in the following sense
Furthermore, we assume that the limits $\lim_{n\to\infty} Y_n = Y$ and $\lim_{n\to\infty} W_n = W$ exist and that we have limits of all second moments of the complex entries of $V_{k;n}$, meaning that
$$\lim_{n\to\infty} \mathbb{E}\big(V_{k;n}^{\top} M V_{k;n}\big) = h(M) \quad\text{and}\quad \lim_{n\to\infty} \mathbb{E}\big(V_{k;n}^{*} M V_{k;n}\big) = \hat{h}(M)$$
exist, giving linear maps from $\mathbb{C}^{r\times r}$ to itself. Here, $\mathbb{E}$ denotes the expectation, i.e. the integral over $\omega \in \Omega$ with respect to the probability measure $\mathbb{P}$. Without the limit $n \to \infty$ these functions encode all joint second moments of the random matrix entries of $V_{k;n}$. First, let us define some projections we will need:
$$P_{\leq 1} = \begin{pmatrix} I_{r_0+r_1} \\ 0_{r_2\times(r_0+r_1)} \end{pmatrix}, \qquad P_1 = \begin{pmatrix} 0_{r_0\times r_1} \\ I_{r_1} \\ 0_{r_2\times r_1} \end{pmatrix}, \qquad P_2 = \begin{pmatrix} 0_{(r_0+r_1)\times r_2} \\ I_{r_2} \end{pmatrix}.$$
The exponential growing part for powers of T 0 will be projected away by a Schur complement: Let X 0 be such that X 0 ∶= (P ⊺ ≤1 X −1 0 P ≤1 ) −1 exists and consider X ε k;n ∶= I r 0 U −k P ⊺ ≤1 T ε k;n T ε k−1;n ⋯T ε 2;n T ε 1;n X 0 The rotations through U in T 0 lead to an averaging effect. The averaged covariances for a limiting Brownian motion will be described by the following functions, Here, ⟨U ⟩ is the smallest compact group containing the unitary U and by the notation du we integrate u ∈ ⟨U ⟩ over the normalized Haar measure on that group. Furthermore, for the drift term we define in a similar way W ∶= ⟨U ⟩ u P ⊺ 1 WP 1 − P 1 h(P 2 Γ 2 P ⊺ 2 )P 1 U * u * du and Y ∶= ⟨U ⟩ u P ⊺ 1 Y P 1 U * u * du .
Theorem B.1. (i) In the scaling limit k ∼ n → ∞ the family of processes can be described by an SDE (stochastic differential equation) in the sense that the family of processes (for varying ε) converges in distribution X ε ⌊nt⌋;n ⇒ 0 r 0 ×r 0 Λ ε t X 0 for n → ∞ , t > 0 .
Here, (Λ ε t ) t>0 is a family of processes in C r 1 ×r 1 satisfying an SDE in t of the form dΛ ε t = dB t Λ ε t + (εY + W )Λ ε t dt with Λ ε 0 = I r 1 . B t is a matrix-valued Brownian motion (independent of ε) with covariance structure
(ii) There is an analytic version of this family of processes, that is, a version (with the same finite-dimensional distributions) such that the random functions ε ↦ Λ ε t are analytic in ε. Moreover, let f ∶ C (r 0 +r 1 )×(r 0 +r 1 ) → C be complex-analytic such that P(f (Λ ε 1 ) = 0 ∀ε ∈ C) = 0. Then, one has a well-defined point process
(iii) For some analytic function f 0 ∶ C r×r → C define the point processes E n ∶= zeros ε f 0 (T ε n;n T ε n−1;n ⋯T ε 1;n ), which should be discrete countable sets with probability one. Assume that one finds X 0 as above and analytic functions f n ∶ C (r 0 +r 1 )×(r 0 +r 1 ) → C such that for any compact set K ⊂ C we have P E n ∩ K = zeros ε f n (X ε n;n ) ∩ K → 1 , f n →f uniformly on K and f (Λ ε ) ∶=f 0 Λ ε 1 X 0 fulfills the conditions of part (ii).
Then, in the sense of weak convergence of point processes, E n ⇒ zeros ε f (Λ ε 1 ) .
Proof. Part (i) follows directly from [SV,Theorem 1.1]. Also, note that for a finite set of ε, say (ε 1 , . . . , ε m ) ∈ C m , we can simply consider block-diagonal matrices diag(T ε 1 k;n , . . . , T εm k;n ) to obtain the joint distributions for different ε in the limit. This leads to the use of the same Brownian motions for different ε, and we have in fact convergence to a random field (ε, t) ↦ Λ ε t . For part (ii), first note that the limit is independent of Z ε k;n . Hence, we can first set this part equal to 0, obtaining families of random matrices T ε k;n for all ε ∈ C depending analytically on ε. As argued in [SV,Section 5.2], using the uniform bounds (in ε) one can use [VV,Corollary 15] to get a unique version for which ε ↦ Λ ε t is analytic (uniqueness in the sense of joint probability distributions on the set of analytic functions). For f as given, one then obtains well-defined distributions on the set of countable subsets of C defined by the zeros of f (Λ ε 1 ), which gives a point process. Part (iii) is basically proved in [SV,Theorem 5.4] for a specific case, following again [VV,Corollary 15]. First, for the weak convergence of point processes it is sufficient that for any compact set K ⊂ C the point processes restricted to K converge. Secondly, for K ⊂ C compact and n 0 large enough we have K ⊂ n 0 B r , and all T ε k;n are defined for ε ∈ K and n ≥ n 0 . Moreover, using the uniform bounds and arguments in [SV], for ω ∈ Ω 0 ⊂ Ω with P(Ω 0 ) = 1 the Schur complements X ε k;n are well defined for sufficiently large n (with a possibly random lower bound). Again, by [VV,Corollary 15] one finds analytic versions in ε, all realized on the same probability space, such that the convergence in part (i) is uniform on compact sets (almost surely).
Thus, the zeros of f n (X ε n;n (ω)) converge to the ones of f (Λ ε 1 (ω)) uniformly in K (almost surely), if this limiting function is not identically zero in ε. This implies the weak convergence of the point processes given by the zeros in K.
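The SDE limit of Theorem B.1 (i) can be checked numerically in the simplest special case. The sketch below makes several simplifying assumptions that are not part of the general theorem: scalar matrices (r 0 = r 2 = 0, r 1 = 1), T 0 = U = 1, ε = 0, no Z term, and real centered V k with bounded uniform law. In this case the product of ⌊nt⌋ factors 1 + V k /√n + W/n should approximate the solution of dΛ t = Λ t dB t + W Λ t dt (in the Itô sense) with Var(B t ) = v t, so that log Λ 1 is Gaussian with mean W − v/2 and variance v. The function name `log_products` and all parameter values are illustrative.

```python
import numpy as np

def log_products(n, drift, variance, samples, rng):
    """Return log of prod_{k<=n} (1 + V_k/sqrt(n) + drift/n) for iid centered V_k."""
    # V_k uniform on [-c, c] has variance c^2 / 3; boundedness keeps all factors positive.
    c = np.sqrt(3.0 * variance)
    v = rng.uniform(-c, c, size=(samples, n))
    factors = 1.0 + v / np.sqrt(n) + drift / n
    return np.sum(np.log(factors), axis=1)

rng = np.random.default_rng(3)
n, drift, variance = 1000, 0.5, 0.4
logs = log_products(n, drift, variance, samples=2000, rng=rng)

# Geometric Brownian motion prediction for log(Lambda_1):
predicted_mean = drift - variance / 2.0
predicted_var = variance
print(np.mean(logs), predicted_mean)
print(np.var(logs), predicted_var)
```

The shift of the mean by −v/2 is the Itô correction. In the matrix case with hyperbolic blocks, the additional drift contribution involving h(P 2 Γ 2 P ⊺ 2 ) and the averaging over ⟨U⟩ have no counterpart in this scalar sketch.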