Inhomogeneous Circular Law for Correlated Matrices

We consider non-Hermitian random matrices $X \in \mathbb{C}^{n \times n}$ with general decaying correlations between their entries. For large $n$, the empirical spectral distribution is well approximated by a deterministic density, expressed in terms of the solution to a system of two coupled non-linear $n \times n$ matrix equations. This density is interpreted as the Brown measure of a linear combination of free circular elements with matrix coefficients on a non-commutative probability space. It is radially symmetric, real analytic in the radial variable and strictly positive on a disk around the origin in the complex plane with a discontinuous drop to zero at the edge. The radius of the disk is given explicitly in terms of the covariances of the entries of $X$. We show convergence even down to local spectral scales just slightly above the typical eigenvalue spacing with an optimal rate of convergence.


Introduction
Many random matrix models exhibit a strong concentration of measure phenomenon; their empirical eigenvalue distributions are well approximated by deterministic measures as their sizes tend to infinity. For Hermitian matrices, the simplest and most prominent example is the celebrated semicircle law for Wigner ensembles with independent and identically distributed (i.i.d.) entries above the diagonal [49]. Girko's circular law [29,13] is its non-Hermitian analogue. For matrices $X = (x_{ij})_{i,j=1}^n$ with centred i.i.d. entries, unrestricted by symmetry and with normalisation $\mathbb{E}|x_{ij}|^2 = \frac{1}{n}$, it asserts convergence of the eigenvalue distribution to the uniform probability measure on the unit disk in the complex plane.
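As a purely numerical illustration of the circular law (a minimal sketch of ours, not part of the paper's argument), one can sample such a matrix and check how its eigenvalues fill the unit disk:

```python
import numpy as np

# Sample an n x n matrix with centred i.i.d. complex Gaussian entries,
# normalised so that E|x_ij|^2 = 1/n, and inspect its spectrum.
rng = np.random.default_rng(0)
n = 2000
X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
eigs = np.linalg.eigvals(X)

# Under the circular law, the fraction of eigenvalues in the disk of
# radius r approaches r^2 for r <= 1.
for r in (0.5, 0.9, 1.0):
    print(r, np.mean(np.abs(eigs) <= r))  # expect roughly 0.25, 0.81, 1.0
```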
Establishing similar concentration results and identifying the limiting spectral density while simultaneously relaxing the two basic assumptions of identical distributions and independence of the entries has since been the focus of many works in random matrix theory. When the entries are independently drawn from different distributions, their variance profile $s_{ij} = \mathbb{E}|x_{ij}|^2$ becomes an additional parameter of the model that determines the density through the nonlinear Dyson equation for $n$ unknowns. Since in general no explicit formula for its solution is available, analysing the characteristic properties of the spectral density has attracted considerable attention.
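For orientation, in the Hermitian Wigner-type case the Dyson equation alluded to here takes the standard form $-1/m_i(z) = z + \sum_j s_{ij} m_j(z)$ for the diagonal resolvent entries $m_i$. The following sketch (our illustration of that classical vector equation, not of the non-Hermitian system studied in this paper) solves it by damped fixed-point iteration:

```python
import numpy as np

def solve_vector_dyson(S, z, tol=1e-10, max_iter=10_000):
    """Fixed-point iteration for -1/m_i(z) = z + (S m)_i with Im z > 0.

    S is an (n, n) array of variances s_ij >= 0; the solution m with
    Im m_i > 0 is unique, and the spectral density at E = Re z is
    recovered from the average of Im m as Im z -> 0.
    """
    n = S.shape[0]
    m = np.full(n, 1j, dtype=complex)  # initial guess in the upper half-plane
    for _ in range(max_iter):
        m_new = -1.0 / (z + S @ m)
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = 0.5 * m + 0.5 * m_new      # damping improves convergence
    return m

# Constant profile s_ij = 1/n reproduces the semicircle law.
n = 500
S = np.full((n, n), 1.0 / n)
m = solve_vector_dyson(S, 0.0 + 1e-6j)
print(np.mean(m.imag) / np.pi)  # ~ 1/pi, the semicircle density at E = 0
```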
In the Hermitian case, convergence of the empirical spectral measure is well established [10,32,45] and a classification of the degree of regularity of the asymptotic density as well as of its possible singularities has been given [1]. Even when the independence of matrix entries is dropped and local correlations with sufficient decay are considered this classification persists [7] and concentration of the spectral measure has been proven in broad generality [11,14,20,26,30,34,39,41,44].
There are far fewer results on the existence and characteristics of limiting spectral densities for non-Hermitian matrices since their spectral instability makes such questions more challenging compared to the Hermitian situation. For random matrices X with centred, independent entries and a general variance profile, the convergence of the spectral measure of X to a rotationally symmetric, continuous limiting density σ was shown in [25], and independently in [6] on all mesoscopic scales in the bulk spectrum under stronger assumptions on the variance profile and regularity of the entry distribution. The extension of convergence on mesoscopic scales to the spectral edges and optimal control of the spectral radius was achieved in [8]. These three papers avoided the requirement of identical variances imposed earlier.
In the present paper we also depart from the independence assumption on the entries. We consider a large class of centred non-Hermitian random matrices X ∈ C n×n with general decaying correlations among their entries. Throughout this class, the limiting spectral density σ is determined solely by the covariances between the matrix entries and has the following properties: (i) the density is rotationally symmetric around zero, (ii) its support is a disk centred at the origin, (iii) the density is real analytic as a function of the radial variable inside the disk and has a jump at its boundary.
The analyticity is a new result even when the entries of X are independent (apart from the explicitly known circular law case). In this case, the other properties are known [6]. Remarkably, the support of σ is always connected in the non-Hermitian case, in the independent as well as the correlated setup. This is in sharp contrast to the Hermitian case, where the support can be disconnected even for matrices with centred, independent entries and a variance profile [3].
The random matrices X we consider here are finite-dimensional approximations of certain matrix-valued circular elements from free probability theory. For such elements, one is interested in their Brown measure, a generalisation of the spectral measure of normal operators to general operators in a finite von Neumann algebra. It was introduced in [21] and revived in [33]. Since then, significant attention has been given to determining the Brown measure and understanding its properties for specific classes of non-normal operators, see e.g. [15,16,33,36]. In the present work, we prove that the Brown measure of these matrix-valued circular elements has the properties (i), (ii), (iii) listed above.
Convergence of the eigenvalue density to a limiting measure is commonly expressed by showing that, for each ball with fixed diameter on the scale of the entire spectrum, the fraction of eigenvalues in it agrees asymptotically with the mass assigned to this ball by the limiting measure. Such a global law is refined to a local law, showing convergence on mesoscopic scales, by allowing the diameter to decrease with $n$ as long as it stays slightly above the typical eigenvalue spacing. We now review some previous results on local laws for non-Hermitian random matrices with independent entries. A bulk local law for random matrices with centred, independent entries of identical variances was shown in [18]. Additionally requiring the first three moments of the entry distribution to match a standard Gaussian, the local law including the edge was established in [47] and in [19]. The third moment matching condition for the edge local law was then removed in [50].
For the bulk local law, the assumption of identical variances was dropped in [6]. In this situation, the limiting density differs substantially from the circular law generically. Under weaker moment assumptions and asymptotically identical variances, a bulk local law with the circular law as limiting density was shown in [31]. In the setup of [6], the edge local law was proven in [8].
The availability of a local law has wide-ranging implications for the spectral analysis of any random matrix model. In the present paper, we apply it to exclude eigenvalues away from the support of the limiting spectral density σ: with high probability, all eigenvalues are concentrated in a disk around the origin whose radius is determined by the covariances of the matrix entries. We also obtain the complete isotropic delocalisation of all eigenvectors associated to the bulk eigenvalues. Furthermore, local laws have been a key ingredient in the study of more refined eigenvalue statistics. In the non-Hermitian i.i.d. setup they have been crucially used in the proofs of universality of bulk and edge eigenvalues with a four moment matching condition in [47], edge universality with two matching moments in [23], and the central limit theorem for linear statistics in [22,24].
Non-Hermitian random matrices without any symmetry constraint also play an important role in various applications. In particular, they are used to model connectivities in food webs and neural networks [5,37,40,46]. Since understanding the stability properties of such systems requires precise knowledge of the eigenvalue locations of the associated random matrix model, our work contributes to this line of research by allowing the correlation among the connectivities to depend on underlying geometric structures.
The analysis of the eigenvalue density of a non-Hermitian random matrix X is commonly reduced via Girko's Hermitization trick [29] to the study of a family of Hermitian matrices with spectral parameter ζ ∈ C. Consequently, the main task is to control the resolvent $G(\zeta, \eta)$ of these Hermitian matrices, which for large $n$ is well approximated by the solution $M = M(\zeta, \eta)$ of the matrix Dyson equation (1.2). This equation has a unique solution for every ζ ∈ C and η > 0 if $\operatorname{Im}M = \frac{1}{2\mathrm{i}}(M - M^*)$ is required to be positive definite [35]. Here, $\mathcal{S}$ is a linear map on $\mathbb{C}^{2n\times 2n}$ defined through (1.3) for any deterministic matrix $R \in \mathbb{C}^{2n\times 2n}$ with $n\times n$ blocks $R_{11}, R_{12}, R_{21}, R_{22}$. The operator $\mathcal{S}$ captures the covariances between the entries of X.
The main tool developed in the present paper is a precise stability result for the non-linear high dimensional matrix equation (1.2). From [4,26], we know that G = G(ζ, η) satisfies a perturbed version of (1.2) with a small error term when n becomes large. Thus, G is close to M if stability of (1.2) against small perturbations is controlled. Moreover, the limiting spectral density σ for X is obtained as a derivative of Im M with respect to |ζ|, where ζ is the spectral parameter of X. Thus, any analysis of σ also requires stability of (1.2).
In previous works, the matrix structure of (1.2) was crucially simplified due to more restrictive assumptions on X. If the entries of X are independent then (1.2) reduces to a vector-valued equation for the diagonal of M and thus the Dyson equation is formulated on the commutative algebra of diagonal matrices. For identical variances, all diagonal entries of M coincide, yielding a single scalar equation.
In the matrix setup, a general version of (1.2) and its stability have been studied in [4] under a strong irreducibility condition on $\mathcal{S}$, called flatness. However, $\mathcal{S}$ as defined in (1.3) does not fulfil this flatness condition due to its special block structure, making the equation inherently unstable. This issue was overcome in [6] for the vector Dyson equation, i.e. when the entries of X are independent. Owing to the commutative structure of this vector case, an additional symmetry of M could be exploited to obtain stability against perturbations respecting this symmetry.
The analysis in the present situation necessitates tackling, at the same time, both main challenges from [4] and [6]: the non-commutativity of the MDE and the instability due to the specific block structure, respectively. The genuinely non-commutative structure of the MDE is a major obstacle throughout the entire argument, requiring the introduction of appropriately symmetrised objects that are much more complicated than their counterparts in the commutative setup. To resolve the instability, we perform a non-linear transformation of the MDE that allows us to restrict the analysis to the manifold of perturbations respecting the additional symmetry of M. This transformation is also applicable in the context of other non-normal models, e.g. non-Hermitian polynomials in several non-commutative variables. Furthermore, it is crucial for showing that σ is a real analytic function of $|\zeta|^2$.
with a constant C > 0. Furthermore, the correlations among the entries of $\sqrt{n}X$ decay in the product metric $d \times d$ on $[n]^2$ faster than any power law, i.e. there is a sequence of positive constants $C_\nu$ such that the decay bound (2.4) holds for any two measurable functions $f_i$.

A3 Flatness: There is a constant c > 0 such that the flatness lower bound holds for any two deterministic vectors $x, y \in \mathbb{C}^n$, where $\langle\,\cdot\,,\cdot\,\rangle$ and $\|\cdot\|$ denote the standard Euclidean scalar product and norm on $\mathbb{C}^n$, respectively.
A4 Smallest singular value: For each ε > 0 and $\nu \in \mathbb{N}$, there is $C_{\varepsilon,\nu} > 0$ such that the corresponding lower tail bound on the smallest singular value holds for all $n \in \mathbb{N}$ and all ζ ∈ C. Here, $s_{\min}(X - \zeta)$ denotes the smallest singular value of $X - \zeta$.
A4' Bounded conditional density: There are $q \in (1, \infty]$ and κ > 0 such that, for each pair $(i, j) \in [n]^2$, there is a probability density for which the stated conditional bound holds almost surely for all measurable $B \subset \mathbb{C}$ (or $B \subset \mathbb{R}$).
A5 There is c > 0 such that the spectral radius $\varrho(S)$ of S satisfies $\varrho(S) \ge c$. Moreover, there is an $n$-independent monotonically decreasing function $f : (0, \infty) \to (0, \infty)$ such that (2.8) holds for all $\tau > \varrho(S)$ and all $n \in \mathbb{N}$.
We remark that Assumption A4' implies Assumption A4, as shown in Proposition 2.10 below. Moreover, in Section 2.3, we explain how some assumptions can be relaxed (see in particular Remark 2.11 for weaker versions of A2) and give examples satisfying the assumptions listed above.
The $n$-independent constants appearing in Assumptions A1–A5 will be called model parameters. While many constants in the following depend on these parameters, we consider them fixed and often do not mention this dependence explicitly.
Remark 2.1. The monotonicity of f (τ ) in Assumption A5 is not a restriction since multiplying the right-hand side of (2.8) by τ yields a monotonically decreasing function. Furthermore, Assumption A3 implies Assumption A5 (cf. Lemma B.3). The weaker Assumption A5 is imposed to exclude eigenvalues away from the support of the asymptotic spectral density of X, while Assumption A3 is imposed to guarantee convergence of the empirical eigenvalue distribution to this density in the spectral bulk.
The first main result is that, with very high probability, X does not have any eigenvalues away from the disk of radius $\sqrt{\varrho(S)}$ centred at the origin. This will be proven in Section 3.1 below.
Theorem 2.2 (No eigenvalue outliers). Let X satisfy A1, A2 and A5. Then, for every $\nu \in \mathbb{N}$ and $\tau_* > 0$, there exists a constant $C_\nu > 0$ such that the stated outlier bound holds uniformly for all $n \in \mathbb{N}$.
The next theorem states that, for large $n$, the empirical spectral distribution $\frac{1}{n}\sum_{\zeta \in \operatorname{Spec}X}\delta_\zeta$ is well approximated by a deterministic probability density σ on the complex plane.

Theorem 2.3 (Global inhomogeneous circular law). Let X satisfy A1–A4. Then there is a (possibly $n$-dependent) deterministic probability density $\sigma : \mathbb{C} \to [0, \infty)$ such that the empirical spectral distribution of X approaches $\sigma(\zeta)\,\mathrm{d}^2\zeta$ weakly in probability as $n \to \infty$; that is, the corresponding convergence of linear statistics holds for every bounded, continuous function $f : \mathbb{C} \to \mathbb{C}$ and every ε > 0.

The proof of Theorem 2.3 will be presented in Section 3.2 below. The density σ will be explicitly defined in (2.11) below in terms of the solution to a system of two coupled $n\times n$ matrix equations determined by the operators S and S* from (2.1). The existence and uniqueness of this solution is stated in the following proposition, whose proof is deferred to the end of Subsection 4.3 below.
Proposition 2.4 (Existence and uniqueness). Let X satisfy A1–A3, let S, S* be defined as in (2.1) and let $\tau \in [0, \varrho)$, where $\varrho = \varrho(S)$ is the spectral radius of S. Then the coupled system of matrix equations (2.9) has a unique solution $V_1(\tau) = V_1$, $V_2(\tau) = V_2 \in \mathbb{C}^{n\times n}$ such that both $V_i$ are positive definite and satisfy the accompanying constraint. This solution can be extended to real analytic functions $V_1, V_2 : (-c, \varrho) \to \mathbb{C}^{n\times n}$ with some $n$-independent constant c > 0.
We will refer to (2.9) as the Dyson equation since, as we will see in Section 4, it is equivalent to a Dyson equation that describes the limit of the resolvent of self-adjoint random matrices. The next theorem, Theorem 2.5, expresses the density σ in terms of the solution to (2.9) and shows that its support is a disk centred at the origin of the complex plane. It is proven at the end of Section 5.2.
Definition 2.6 (Self-consistent density of states). We call the probability density σ, defined through (2.11), the self-consistent density of states associated to S or to X.
In order to formulate the local law in the spectral bulk, we introduce observables around a fixed spectral parameter $\zeta_0 \in \mathbb{C}$ on mesoscopic scales $n^{-\alpha}$ with $\alpha \in (0, 1/2)$; these observables are obtained by rescaling a fixed test function $f$ around $\zeta_0$. For any r > 0, we denote the disk of radius r centred at the origin by $D_r := \{\zeta \in \mathbb{C} : |\zeta| < r\}$.
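A standard way to realise such observables (a sketch in our notation; the precise normalisation in the paper's definition may differ) is the rescaling $f_{\zeta_0}(\zeta) := n^{2\alpha} f\bigl(n^{\alpha}(\zeta - \zeta_0)\bigr)$ for $f \in C_0^2(\mathbb{C})$, which concentrates $f$ on the scale $n^{-\alpha}$ around $\zeta_0$ while preserving its $L^1$-norm.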
The proof of Theorem 2.7 will be given in Section 6 below. Under the stronger Assumption A4', the condition $\|\Delta f\|_{L^{2+a}} \le n^C\|\Delta f\|_{L^1}$ in Theorem 2.7 is not necessary (as explained at the beginning of its proof). As a corollary, we prove complete delocalisation of the eigenvectors of X. In the case of independent entries, eigenvector delocalisation was first proven in [43].
Corollary 2.8 (Isotropic eigenvector delocalisation). Let X satisfy A1–A3. For any $\tau_* > 0$, let $U_{\tau_*}$ denote the set of eigenvectors of X with corresponding eigenvalue in $D_{\sqrt{\varrho}-\tau_*}$, where $\varrho = \varrho(S)$. Then for any ε > 0 and $\nu \in \mathbb{N}$ there exists a constant $C_{\varepsilon,\nu}$ such that the isotropic delocalisation bound holds for all $n \in \mathbb{N}$ and all $v \in \mathbb{C}^n$. Corollary 2.8 will be proven in Section 6.2 below.

Brown measure of matrix-valued circular elements
We now illustrate how the probability density defined in (2.11) is interpreted as the Lebesgue density of the Brown measure associated to a matrix linear combination of circular operators, and thus how Theorem 2.5 provides information about this measure. To that end, let $(\mathcal{M}, \tau)$ be a tracial $W^*$-probability space. For $\ell \in \mathbb{N}$, free circular elements $c_1, \dots, c_\ell \in \mathcal{M}$ and deterministic matrices $a_1, \dots, a_\ell \in \mathbb{C}^{n\times n}$, we consider the operator X defined in (2.14). We are interested in the spectral distribution of X. Since X is non-normal, we consider the Brown measure, a generalisation of the spectral measure for normal operators. The Brown measure $\mu_X$ of X is the unique compactly supported probability measure on $\mathbb{C}$ whose logarithmic potential is given by the Fuglede–Kadison determinant of $X - \zeta$ for every $\zeta \in \mathbb{C}$. The Brown measure was originally introduced in [21] and revived in [33]. The Fuglede–Kadison determinant was first defined in [28]. For an introduction to both of these objects, we refer to [38, Section 11].
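For reference, the standard definitions read as follows (stated in our own notation; cf. [38]): writing $\tau_n := \operatorname{tr}_n \otimes\, \tau$ for the normalised trace on $\mathcal{M}^{n\times n}$, the Fuglede–Kadison determinant of $Y \in \mathcal{M}^{n\times n}$ is $\Delta(Y) := \exp(\tau_n(\log|Y|))$ with $|Y| = (Y^*Y)^{1/2}$, and the Brown measure of X is the distributional Laplacian $\mu_X = \frac{1}{2\pi}\Delta_\zeta \log\Delta(X - \zeta)$; equivalently, $\log\Delta(X - \zeta) = \int_{\mathbb{C}}\log|w - \zeta|\,\mathrm{d}\mu_X(w)$ for all $\zeta \in \mathbb{C}$.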
In the next result, we express the Brown measure $\mu_X$ of X from (2.14) in terms of the operators S and S* on $\mathbb{C}^{n\times n}$ defined through (2.17) for any $R \in \mathbb{C}^{n\times n}$. In particular, we identify the support of $\mu_X$ and classify its regularity.
Proposition 2.9 (Regularity of $\mu_X$). Let $X \in \mathcal{M}^{n\times n}$ be defined as in (2.14). We assume that there are constants C > c > 0 such that the two-sided flatness bounds hold for all positive semidefinite $R \in \mathbb{C}^{n\times n}$. Then the Brown measure $\mu_X$ of X has the Lebesgue density σ, where σ is defined via (2.11) with S and S* from (2.17). In particular, the Brown measure of X has all properties of σ stated in Theorem 2.5.
The proof of Proposition 2.9 is presented in Section 5.4 below.

Relaxed assumptions and examples
In this subsection, we explain how the assumptions A1–A5 are related, how some of them can be relaxed, and we provide some concrete examples satisfying these assumptions. The first result, Proposition 2.10, shows that A4' implies A4; it follows directly from Proposition 7.1 below.
Remark 2.11 (Relaxing Assumption A2). We chose to assume the decay of correlations within the matrix X in the form A2 because it is easy to state. However, for our proof it suffices to assume that the decay of correlations (2.4) holds with a fixed power ν > 12p, with $p \in \mathbb{N}$ from (2.3), provided higher order cumulants of the matrix entries of X satisfy a certain compatibility condition.

In analogy to [26, Example 2.12], we also provide a simple description of our assumptions for the case of Gaussian random matrices, while relaxing the polynomial decay of correlations from (2.4) to order ν = 2 when $d(i, j) = |i - j|$ is the standard metric on $[n]$.
Example 2.12 (Results for correlated Gaussian matrices). Let $X \in \mathbb{C}^{n\times n}$ be a random matrix with centred Gaussian entries such that the covariance decay condition holds for all $i, j, k, l \in [n]$, as well as $\mathbb{E}|\operatorname{Tr}(BX)|^2 \ge \frac{c}{n}\operatorname{Tr}B^*B$ for all $B \in \mathbb{C}^{n\times n}$, where c, C > 0 are some positive constants. Then the conclusions of Theorem 2.2, Theorem 2.3, Theorem 2.7 and Corollary 2.8 hold for X.
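A simple way to generate such a correlated Gaussian ensemble numerically (our sketch; the smoothing kernel and the decay exponent are illustrative choices, not taken from the paper):

```python
import numpy as np

# Sketch: a centred Gaussian matrix with correlations along the index
# distance |i - j|, decaying like the second power of the distance. We
# obtain it by smoothing an i.i.d. Gaussian field along the row index.
rng = np.random.default_rng(1)
n = 300
W = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Convolution weights w_k ~ (1 + |k|)^{-2} give summable correlations.
K = 20
w = (1.0 + np.abs(np.arange(-K, K + 1))) ** -2.0
w /= np.sqrt(np.sum(w**2))  # keep the entry variance of order one

X = np.zeros((n, n), dtype=complex)
for k in range(-K, K + 1):
    X += w[k + K] * np.roll(W, shift=k, axis=0)  # correlate along rows
X /= np.sqrt(2 * n)  # normalise so that E|x_ij|^2 = 1/n
```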
Next, we will formulate a condition for block matrices that ensures Assumption A4'. We denote by $E_{ij} \in \mathbb{C}^{N\times N}$ the matrix whose $(i, j)$-entry is 1 and whose other entries are zero, that is, $E_{ij} = (\delta_{ik}\delta_{jl})_{k,l\in[N]}$. In the following lemma, we write z for a matrix-valued variable $z = (z_{\gamma\delta})_{\gamma,\delta\in[K]}$.
We denote by $\mathrm{d}z$ integration with respect to all entries of z; omission of the integration over $z_{\alpha\beta}$ is indicated by dropping the factor $\mathrm{d}^2z_{\alpha\beta}$.

Lemma 2.13 (Block matrices). Let $K \in \mathbb{N}$ be fixed. Let $\{x_{ij} : i, j \in [N]\}$ be a family of independent random matrices in $\mathbb{C}^{K\times K}$ satisfying $\mathbb{E}x_{ij} = 0$ for all $i, j \in [N]$. We assume that, for all $i, j \in [N]$, the matrix $x_{ij}\sqrt{NK}$ has a density $f_{ij}$ on $\mathbb{C}^{K\times K}$, i.e.
for all measurable subsets $B \subset \mathbb{C}^{K\times K}$. If there are q > 0 and C > 0 such that the stated integrability bound holds for all $\alpha, \beta \in [K]$, then Assumption A4' is satisfied for the block matrix built from the blocks $x_{ij}$. An analogous statement holds when $x_{ij}\sqrt{NK}$ has a density on $\mathbb{R}^{K\times K}$ instead of $\mathbb{C}^{K\times K}$. Lemma 2.13 will be proven in Section 7.1 below.

Notations
Here we introduce some notations that will be used throughout the paper. We start with basic notations for matrices. We equip the space of $d\times d$ matrices with the normalised scalar product $\langle A, B\rangle := \frac{1}{d}\operatorname{Tr}(A^*B)$, the corresponding norm $\|A\|_{\mathrm{hs}}^2 := \langle A, A\rangle$, and we use the shorthand $\langle A\rangle = \frac{1}{d}\operatorname{Tr}A$ for the normalised trace. By $\|A\|$ we denote the operator norm induced by the standard Euclidean metric on $\mathbb{C}^d$. More generally, for linear operators $A : \mathsf{A} \to \mathsf{B}$ from a normed space $\mathsf{A}$ to a normed space $\mathsf{B}$, we indicate the corresponding operator norm by writing $\|A\|_{\mathsf{A}\to\mathsf{B}}$, and simply $\|A\|_{\mathsf{A}}$ in case $\mathsf{A} = \mathsf{B}$. Since we often work with 2×2-block matrices having block dimension $n$, we will frequently use the block notation from (1.3), where $R \in \mathbb{C}^{2n\times 2n}$ and $R_{ij} \in \mathbb{C}^{n\times n}$.
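For concreteness, these conventions translate into the following small helpers (a sketch; the function names are ours):

```python
import numpy as np

def avg_trace(A):
    """Normalised trace <A> = Tr(A) / d."""
    return np.trace(A) / A.shape[0]

def scalar_product(A, B):
    """Normalised scalar product <A, B> = Tr(A* B) / d."""
    return np.trace(A.conj().T @ B) / A.shape[0]

def hs_norm(A):
    """Norm induced by the normalised scalar product, ||A||_hs = <A, A>^(1/2)."""
    return np.sqrt(scalar_product(A, A).real)

def op_norm(A):
    """Operator norm induced by the Euclidean metric (largest singular value)."""
    return np.linalg.norm(A, 2)
```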
For nonnegative quantities φ, ψ we use the comparison relation $\varphi \lesssim \psi$ whenever $\varphi \le C\psi$ with an $n$-independent constant C > 0. This constant is uniform in all parameters except the model parameters from Assumptions A1–A4 and possibly other parameters that are either clearly indicated or obvious from the context. In particular, C is uniform in the spectral parameter ζ within the domain under consideration. If $c\psi \le \varphi \le C\psi$ we write $\varphi \sim \psi$, and $\varphi = \psi + O(\nu)$ is shorthand for $|\varphi - \psi| \lesssim \nu$. We also use the comparison relation for positive definite matrices, where it is interpreted in the sense of quadratic forms.

Inhomogeneous circular law
In this section we prove Theorems 2.2 and 2.3. These proofs will illustrate how Girko's Hermitization trick translates these questions to Hermitian random matrices which will be analysed via their resolvents and the associated matrix Dyson equation. The proof of Theorem 2.3 is a prototype of the more complicated proof of Theorem 2.7 in Section 6 below.
The fundamental observation due to Girko [29] is that ζ ∈ C is an eigenvalue of X if and only if the kernel of $H_\zeta$ is nontrivial, where the Hermitian matrix $H_\zeta$ is defined in (3.1). The family $(H_\zeta)_{\zeta\in\mathbb{C}}$ is called the Hermitization of X. All spectral information about the kernel of $H_\zeta$ is captured by the resolvent $G = G(\zeta, \eta)$ of $H_\zeta$, defined in (3.2) for ζ ∈ C and η > 0.
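The following sketch illustrates Girko's observation numerically; here we take $H_\zeta$ to be the standard Hermitization $\bigl(\begin{smallmatrix}0 & X-\zeta\\ (X-\zeta)^* & 0\end{smallmatrix}\bigr)$, which we assume matches (3.1):

```python
import numpy as np

def hermitization(X, zeta):
    """Girko's Hermitization H_zeta = [[0, X - zeta], [(X - zeta)*, 0]]."""
    n = X.shape[0]
    A = X - zeta * np.eye(n)
    return np.block([[np.zeros((n, n)), A], [A.conj().T, np.zeros((n, n))]])

rng = np.random.default_rng(2)
n = 100
X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)

# zeta is an eigenvalue of X iff 0 is an eigenvalue of H_zeta; equivalently,
# the eigenvalues of H_zeta are +- the singular values of X - zeta.
zeta = np.linalg.eigvals(X)[0]
H = hermitization(X, zeta)
print(np.min(np.abs(np.linalg.eigvalsh(H))))  # ~ 0 up to rounding errors
```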
We will see in Proposition 3.7 below that the resolvent G is well approximated by the matrix $M = M(\zeta, \eta) \in \mathbb{C}^{2n\times 2n}$, which is the unique solution of the matrix Dyson equation (MDE) (3.3) under the constraint that the imaginary part $\operatorname{Im}M = \frac{1}{2\mathrm{i}}(M - M^*)$ is positive definite. Here, the matrix-valued function $Z : \mathbb{C}^2 \to \mathbb{C}^{2n\times 2n}$ and the self-energy operator $\mathcal{S}$, a linear operator on $\mathbb{C}^{2n\times 2n}$, are defined in (3.4), where all blocks in these matrix representations are of size $n\times n$ (see (2.1) for the definitions of S and S*). The existence and uniqueness of M have been shown in [35]. We represent M in terms of the 2×2-block structure corresponding to the right-hand side of (3.3). For this purpose, we first introduce the matrices $V_1, V_2 \in \mathbb{C}^{n\times n}$, which are the unique solution of (3.5) for any η > 0 and τ ≥ 0 under the constraint that $V_1$ and $V_2$ are positive definite. We note that (3.5) is a regularised version of the Dyson equation (2.9), used for the definition of σ in (2.11), with regularisation parameter η > 0. Moreover, we introduce the auxiliary matrix

$U(\tau, \eta) := \bigl(\tau + (\eta + S^*[V_1(\tau, \eta)])(\eta + S[V_2(\tau, \eta)])\bigr)^{-1}$. (3.6)

Then we obtain the representation (3.7) of M, since the right-hand side of (3.7) satisfies (3.3) and has a positive definite imaginary part. Thus, solving (3.3) for M with positive definite imaginary part is equivalent to solving (3.5) for positive definite $V_1$, $V_2$. From (3.7) we easily obtain (3.8).

Exclusion of eigenvalues away from the disk
We now prepare the proof of Theorem 2.2. First we note that if X satisfies A1 and A2 then, for all positive definite $R \in \mathbb{C}^{n\times n}$, the upper bound (3.9) holds. The next lemma describes the behaviour of $\operatorname{Im}M(\zeta, \eta)$ when $|\zeta|^2 > \varrho(S)$.
Proof of Lemma 3.1. Multiplying (3.5a) with $\eta + S^*[V_1]$ from the left and (3.5b) from the right with $\eta + S[V_2]$, and realizing that the resulting right-hand sides coincide, reveals the identity (3.11). Taking the inverse on both sides of (3.11) and applying this identity to the result of multiplying (3.5a) with $V_1$ from the right and with $\eta + S^*[V_1]$ from the left yields (3.12). We reorganize the terms in (3.12), use that $\tau - S$ is invertible for $\tau > \varrho(S)$, and obtain a bound on $\langle V_2\rangle$. Here, we used in the last step that $(\tau - S)^{-1}$ is positivity preserving, due to its Neumann series, and that $(\eta + S[V_2])V_1(\eta + S[V_2]) \ge 0$. Therefore, we have shown that $\langle V_2\rangle \lesssim_\delta \eta/\tau$ for all η > 0 and all $\tau \ge \varrho(S) + \delta$. Similarly, we get $\langle V_1\rangle \lesssim_\delta \eta/\tau$. Using $\langle V_1 + V_2\rangle \lesssim_\delta \eta/\tau$ and (3.9) to estimate the right-hand side of (3.12) from above implies the corresponding norm bound. Hence, $\|V_2\| \lesssim_\delta \eta/\tau$, and a similar argument yields $\|V_1\| \lesssim_\delta \eta/\tau$. Owing to (3.8), these estimates and $\tau = |\zeta|^2$ complete the proof of Lemma 3.1.
For the upcoming arguments, it is convenient to use the following notion of events that occur with "very high probability".

Definition 3.2 (With very high probability). We say that a (sequence of) events $(A_n)_{n\in\mathbb{N}}$ occurs with very high probability if for every ν > 0 there is $C_\nu > 0$ such that $\mathbb{P}(A_n) \ge 1 - C_\nu n^{-\nu}$ (3.13) for all $n \in \mathbb{N}$.

The constants $C_\nu$ in (3.13) will typically depend on the model parameters. Note that an intersection of $n^C$-many events holding with very high probability also holds with very high probability.
Proof of Theorem 2.2. The theorem will follow from the next lemma and an interpolation argument. As we will see in its proof in Appendix A below, this lemma is a direct consequence of [26, Corollary 2.3] and Lemma 3.1.

Lemma 3.3. Let X satisfy A1, A2 and A5. If ζ ∈ C satisfies $\varrho(S) + \delta \le |\zeta|^2 \le \delta^{-1}$ for some δ ∼ 1, then there is ε ∼ 1 such that $\operatorname{Spec}(H_\zeta) \cap (-\varepsilon, \varepsilon) = \emptyset$ with very high probability.
Since $\operatorname{Spec}(X) = \{\zeta \in \mathbb{C} : 0 \in \operatorname{Spec}(H_\zeta)\}$, we conclude from Lemma 3.3 that with very high probability X has no eigenvalues in the annulus $\mathbb{A} := \{\zeta : \varrho(S) + \delta < |\zeta|^2 < \delta^{-1}\}$. We will now show that there are no eigenvalues of X outside $D_{\delta^{-1/2}}$ either. For this purpose we apply Lemma 3.3 to the Hermitization $tH_{\zeta/t}$ of $tX$ for any $t \in [0, 1]$. We choose a finite subset $Z \subset \mathbb{A}$ such that $Z + D_{n^{-1}}$ covers the entire annulus $\mathbb{A}$ and $|Z| \le n^C$ for some C > 0. By a union bound and Lemma 3.3, we find that for any $(t, \zeta) \in (n^{-1}[n]) \times Z$, with very high probability, $\operatorname{Spec}(tH_{\zeta/t}) \cap (-\varepsilon, \varepsilon) = \emptyset$. Thus, by Lipschitz continuity of $tH_{\zeta/t}$ in t and ζ, we have $0 \notin \bigcup_{t\in[0,1]}\bigcup_{\zeta\in\mathbb{A}}\operatorname{Spec}(tH_{\zeta/t})$ with very high probability. In particular, the eigenvalues of each matrix along the interpolation $t \mapsto tX$ between the zero matrix and X, which depend continuously on t, do not cross the annulus. Therefore, X has the same number of eigenvalues inside the disk of radius $(\varrho(S) + \delta)^{1/2}$ as the zero matrix, namely $n$, i.e. it has no eigenvalues outside this disk.

Global inhomogeneous circular law
In this section, we prove Theorem 2.3. We first derive the basic formula relating the eigenvalue density of X to the Hermitian matrices H ζ defined in (3.1). This approach goes back to Girko [29]. Then we motivate and collect all other ingredients required for the proof of Theorem 2.3.
The starting point is the relation (3.14) for the averaged linear statistics with a test function $f \in C_0^2(\mathbb{C})$, where we used in the first step that $\log|\cdot|$ is (up to a constant) the fundamental solution of the Laplace equation in $\mathbb{R}^2$. The right-hand side of (3.14) can be expressed purely in terms of the Hermitian matrices $H_\zeta$ owing to (3.15). The resolvent G contains all spectral information about $H_\zeta$. In particular, $\log|\det H_\zeta|$ is expressed in terms of G via the well-known identity (3.16) for any T > 0 (see [47] for the use of (3.16) in a similar context). Hence, owing to (3.14), (3.15) and (3.16), it suffices to control G in order to understand the averaged linear statistics. As indicated in Section 3.1, the resolvent G is well approximated by the solution M of the MDE (3.3) for large $n$.
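A standard version of the identity (3.16), under the convention $G(\zeta, \eta) = (H_\zeta - \mathrm{i}\eta)^{-1}$ (our assumption here), reads $\log|\det H_\zeta| = \log|\det(H_\zeta - \mathrm{i}T)| - \int_0^T \operatorname{Im}\operatorname{Tr}G(\zeta, \eta)\,\mathrm{d}\eta$; the following sketch verifies it numerically on a small test matrix:

```python
import numpy as np
from scipy.integrate import quad

# Numerical check of the log-determinant identity
#   log|det H| = log|det(H - iT)| - int_0^T Im Tr (H - i eta)^{-1} d eta,
# a standard version of (3.16); the conventions here are our own.
rng = np.random.default_rng(3)
d = 50
A = rng.standard_normal((d, d))
H = (A + A.T) / 2  # a real symmetric test matrix
lam = np.linalg.eigvalsh(H)

T = 1e3
lhs = np.sum(np.log(np.abs(lam)))
integral, _ = quad(lambda eta: np.sum(eta / (lam**2 + eta**2)), 0, T, limit=200)
rhs = 0.5 * np.sum(np.log(lam**2 + T**2)) - integral
print(lhs, rhs)  # agree up to quadrature error
```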
We now collect some auxiliary results about M and σ. We will need the following bounds on M proven at the end of Section 4.1.
Lemma 3.4 (Bounds on M). Let X satisfy A1–A3. Then, uniformly for ζ ∈ C and η > 0, the bound (3.17) holds. Moreover, for any T > 0 and ζ ∈ C, we have the bounds (3.18).

The self-consistent density of states σ introduced in (2.11) relates to Im M in the way expected from (3.14), (3.15) and (3.16) as well as G ≈ M. This is the content of the next lemma.
Lemma 3.5. The identity (3.19) relating σ and the function L holds for every $f \in C_0^2(\mathbb{C})$. The integral in the definition of L exists in the Lebesgue sense due to (3.18).

Lemma 3.5 in particular shows that $\Delta L = -2\pi\sigma$ in the sense of distributions, i.e. L is the logarithmic potential of the probability measure $\sigma(\zeta)\,\mathrm{d}^2\zeta$. The proof of Lemma 3.5 requires a very detailed analysis of the MDE (3.3) and its stability properties, and will be presented in Section 5.3 below.
For technical reasons it is easier to work with discrete approximations of the ζ-integrals introduced above. To that end, we apply the sampling method formulated in the following lemma, which is a special case of [47, Lemma 36] and was used there in a similar context.

Lemma 3.6 (Monte Carlo sampling). Let Ω ⊂ C be a bounded subset of positive Lebesgue measure and μ the normalized Lebesgue measure on Ω. Let $F : \Omega \to \mathbb{C}$ be a square-integrable function with respect to μ. For $m \in \mathbb{N}$, let $\xi_1, \dots, \xi_m$ be independent random variables distributed according to μ. Then, for any δ > 0, with probability at least 1 − δ, the sample average $\frac{1}{m}\sum_{i=1}^m F(\xi_i)$ differs from $\int_\Omega F\,\mathrm{d}\mu$ by at most $(m\delta)^{-1/2}\|F\|_{L^2(\mu)}$.
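The mechanism behind Lemma 3.6 is plain Monte Carlo integration with a Chebyshev-type error bound; a minimal sketch (ours, with an explicitly computable test function):

```python
import numpy as np

# Monte Carlo sampling as in Lemma 3.6: approximate the mu-average of F over
# Omega by an average over m i.i.d. sample points. By Chebyshev's inequality,
# the error exceeds ||F||_{L^2(mu)} / sqrt(m * delta) with probability < delta.
rng = np.random.default_rng(4)

def sample_disk(m, radius=1.0):
    """i.i.d. points, uniformly distributed on the disk of the given radius."""
    r = radius * np.sqrt(rng.uniform(size=m))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=m)
    return r * np.exp(1j * theta)

F = lambda z: np.log(np.abs(z - 2.0))  # square-integrable on the unit disk
m = 100_000
xi = sample_disk(m)
# log|z - 2| is harmonic on the unit disk, so its exact average is log 2.
print(np.mean(F(xi)))  # ~ 0.693
```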
The next bound on G − M is the last missing ingredient for the proof of Theorem 2.3.
Proposition 3.7 (Global law for $H_\zeta$, averaged version). Let X satisfy A1 and A2. Then there is an absolute constant K > 0 such that, for all sufficiently small δ > 0, the averaged bound (3.20) holds with very high probability, uniformly for all $n \in \mathbb{N}$, $\zeta \in D_\varphi$ and $\eta \in [n^{-\delta}, n^{100}]$.
Proposition 3.7 is implied by Proposition A.4 below. The former has an important consequence, namely the following bound on the number of eigenvalues of H ζ close to zero. Note that the moduli of the eigenvalues of H ζ are the singular values of X − ζ. The eigenvalues of H ζ are denoted by λ 1 (ζ), . . . , λ 2n (ζ). Together with Assumption A4, the bound in the next lemma will be used to control the integral in (3.16) for small η.
Lemma 3.8. With very high probability, $|\{i \in [2n] : |\lambda_i(\zeta)| \le \eta\}| \lesssim n\eta$ uniformly for all $\eta \in [n^{-\delta}, n^{100}]$ and $\zeta \in D_\varphi$ with any fixed φ > 0. Here, the constant $C_\nu$ implicit in the very-high-probability notion from Definition 3.2 depends on δ and φ, as well as on the constants in A1 and A2, in addition to ν.
Proof. The trace of G is bounded by $n$, i.e. $|\operatorname{Tr}G| \lesssim n$ with very high probability for all $\eta \in [n^{-\delta}, n^{100}]$, due to (3.20) and $\|M\| \lesssim 1$ by (3.17). Therefore, setting $\Sigma_\eta := \{i \in [2n] : |\lambda_i(\zeta)| \le \eta\}$ and using $\operatorname{Im}\operatorname{Tr}G(\zeta, \eta) = \sum_i \eta/(\lambda_i(\zeta)^2 + \eta^2) \ge |\Sigma_\eta|/(2\eta)$, we conclude $|\Sigma_\eta| \lesssim n\eta$.

We will now conclude Theorem 2.3 from Proposition 3.7 and Lemma 3.8.
Proof of Theorem 2.3. We will show below that, for all sufficiently small δ > 0, the estimate (3.21) holds with very high probability, uniformly for all $f \in C_0^2(\mathbb{C})$ satisfying $\|\Delta f\|_{L^{2+a}} \le n^C\|\Delta f\|_{L^1}$ and $\operatorname{supp}f \subseteq D_\varphi$, where a > 0, C > 0 and φ > 0 are some constants. In (3.21), the constant $C_\nu$ in the definition (3.13) depends only on δ, a, C and φ in addition to ν and the constants from A1 and A2.
Given (3.21), Theorem 2.3 follows from the exclusion of eigenvalues outside of $D_{2R}$, $R = \varrho(S)$, with very high probability by Theorem 2.2, and by approximating functions from $C_b(\mathbb{C})$ on $D_{2R}$ by functions from $C_0^2(\mathbb{C})$. Without loss of generality we can assume that φ > 4R + 1, since $R = \varrho(S) \lesssim 1$ due to Assumptions A1 and A2.
What remains is proving (3.21). We set $\Omega = D_\varphi$. Combining (3.14) and (3.19), as well as using the second bound in (3.18), yields (3.22). Here, we denoted by μ the normalized Lebesgue measure on Ω and introduced the functions h and F in (3.23). We now apply Lemma 3.6 to the first term on the right-hand side of (3.22). Note that $\zeta \mapsto \log|\xi - \zeta|$ lies in $L^p(\Omega)$ for every $p \in [1, \infty)$. Hence, owing to the first bound in (3.18), we get that, for any $p \in [1, \infty)$, $\|h\|_{L^p(\Omega)} \lesssim_p 1$ uniformly for T > 0. In particular, the function F defined in (3.23) is square-integrable on Ω. Thus, Lemma 3.6 is applicable, and choosing $\delta = n^{-\nu}$ and $m = n^{\nu+2C+22}$ shows that (3.24) holds with very high probability, where $\xi_1, \dots, \xi_m$ are independent random variables distributed according to μ.

We set $T = n^{100}$ and now show that for all sufficiently small δ > 0 we have (3.25) with very high probability, uniformly for all ζ ∈ Ω. To that end, we define $\eta_* := n^{-\delta}$ and, using (3.15) and (3.16), decompose the error in (3.25) into four terms $h_1, \dots, h_4$. Next, we establish individual estimates on $h_1, \dots, h_4$ which hold with very high probability. We get $|h_1(\zeta)| \le 2n^{-1+K\delta}$ from (3.20) as well as a union bound and a continuity argument in η. To estimate $h_2$, we write $\lambda_j \equiv \lambda_j(\zeta)$ and compute the corresponding sum of logarithms. We decompose this sum into two regimes, $|\lambda_j| < \eta_*^{1/2}$ and $|\lambda_j| \ge \eta_*^{1/2}$, and estimate each regime separately. For the first regime, Assumption A4 and Lemma 3.8 yield the required bound with very high probability for all ε > 0 small enough. In the remaining regime, $\log(1 + x) \le x$ yields the analogous bound. By slightly shrinking δ, these estimates imply $|h_2(\zeta)| \le n^{-\delta}$. For $h_3$, we obtain the corresponding estimate, where we used $\log(1 + x) \le x$ in the first step and $|x_{ij}| \prec n^{-1/2}$, due to Assumption A1, in the last step. Since ζ ∈ Ω, we conclude $|h_3(\zeta)| \lesssim n^{-10}$ with very high probability. Since $\|M\| \lesssim 1$ due to (3.17), we have $0 \le h_4(\zeta) \lesssim \eta_*$. Hence, the proof of (3.25) is complete.
Since m is at most of polynomial order in $n$, a union bound over $\xi_1, \dots, \xi_m$ and (3.25) yield (3.28) with very high probability. Here, we applied Lemma 3.6 again in the last step and used $\operatorname{supp}f \subseteq \Omega$. Finally, we combine the relation (3.22), the estimates (3.24) and (3.28), as well as $\|\Delta f\|_{L^2} \lesssim \|\Delta f\|_{L^{2+a}} \le n^C\|\Delta f\|_{L^1}$, and obtain (3.21). This completes the proof of Theorem 2.3.

Dyson equation and its stability
In this section, we analyse the solution M to the matrix Dyson equation (3.3) and its stability against perturbations. These results are the core of this article as they will be the basis of the proofs of Theorem 2.5 and Lemma 3.5 about the properties of σ as well as the local law for H ζ (cf. Theorem 6.2 below).
The matrix Dyson equation and its stability have been analysed in [4,7]. However, their main regularity and stability results impose the flatness condition (see [4, equation (2.7)] and (4.2) below) on the self-energy operator S . This condition is not satisfied by S as defined in (3.4). In fact, the special structure of S poses significant challenges since it leads to an instability in the Dyson equation (3.3). Dealing with this instability is the main purpose of this section.
In Subsection 4.1 we will establish some basic properties of the solution to (3.5) and hence (3.3). Then we will prove stability of the Dyson equation in the bulk of the spectrum against small perturbations in Theorem 4.3 of Subsection 4.2, using an important technical lemma that will be proven in Subsection 4.3.

Solution
The first result of this subsection establishes matching upper and lower bounds on the solution of (3.5) in the sense of quadratic forms. For this proposition only the following flatness property of S is needed. Due to Assumptions A1–A3, the operators S and S* are both comparable to the normalised trace in the sense of quadratic forms, i.e.

$S[A] \sim \langle A\rangle \quad\text{and}\quad S^*[A] \sim \langle A\rangle$ (4.2)

for any positive semi-definite A. In fact, the lower bound $S[A] \gtrsim \langle A\rangle$ is just an equivalent formulation of Assumption A3, while the upper bound $S[A] \lesssim \langle A\rangle$ is a simple consequence of Assumptions A1 and A2. The same is true for S*.
Proposition 4.1. The identity $\langle V_1\rangle = \langle V_2\rangle$ (4.3) holds for all τ ≥ 0 and η > 0, as well as the bounds (4.4).

Proof. Throughout this proof, we will refer to some identities from the proof of Lemma 3.1. To see (4.3), we multiply (3.11) with $V_1$ from the right and with $V_2$ from the left and take the normalized trace. Now we prove (4.4). First we observe that $V_1$ and $V_2$ are both comparable to their normalized traces, which coincide as we have just shown; this is (4.5).
This is seen directly from the two equations (3.5), since the right-hand sides are both comparable to the same multiple of the identity due to (4.2) and (4.3). Now let $S_2$ be the unique positive definite Perron–Frobenius eigenmatrix of S with normalisation $\langle S_2\rangle = 1$, i.e. $S[S_2] = S_2$. Because of (4.2) this eigenmatrix satisfies $S_2 \sim 1$. We take the scalar product with $S_2$ on both sides of (3.12) and get (4.6). Depending on whether τ ≤ 1 or τ > 1, we either subtract $\tau\langle S_2, V_1\rangle$ or $\langle S_2, V_1\rangle$ on both sides of (4.6) and use (4.2) as well as $S_2 \sim 1$ to obtain (4.7). The claim (4.4) is now an immediate consequence of (4.7) and (4.5).
As a consequence of Proposition 4.1, we can also estimate the singular values of U, defined in (3.6), from above and below. When multiplying (3.5a) with $\eta + S^*[V_1]$ from the left and (3.5b) with $\eta + S[V_2]$ from the right, we obtain the identities (4.8). Furthermore, when we multiply (3.5a) with $V_1$ from the left and (3.5b) with $V_2$ from the right, we obtain (4.9). Multiplying (4.9) by U and using (4.8) also reveals (4.10). Finally, using (4.4) in (4.8) shows the comparison relation (4.11), uniformly for η > 0 and τ ≥ 0. For future reference we also record the identities (4.12), which result from multiplying (3.5a) from the left and right by $V_1$ and (3.5b) by $V_2$, and then using (4.8).
As a consequence of (4.4) for τ ≥ 1, we can extend $V_i$ continuously to η = 0 by $V_i(\tau, 0) = 0$. This is summarised in the following corollary, whose proof is immediate from the representation of M in (3.7) and the definition of U in (3.6).
Proof of Lemma 3.4. First, we get (3.17) from (3.7), (4.4) and (4.11). Second, both bounds in (3.18) follow directly from the estimate (4.13), which holds uniformly for ζ ∈ C and η > 0. The remaining bound in (4.13) is verified next, which completes the proof of Lemma 3.4.

Stability
In this subsection we will establish stability of the MDE (3.3) and its solution against small perturbations. As indicated at the beginning of the section, (3.3) has an inherent instability due to the structure of $\mathcal{S}$. This instability originates from a single unstable direction and implies that stability can only be expected with respect to perturbations D that take values in a manifold of codimension 1 in $\mathbb{C}^{2n\times 2n}$. Through a special choice of coordinates, this manifold can be mapped to the orthogonal complement of the matrix $E_- \in \mathbb{C}^{2n\times 2n}$ defined in (4.14), and thus projected out.

Before we state the stability theorem, we introduce a norm that is designed to prove isotropic convergence of the resolvent G from (3.2) to M, i.e. to prove $|\langle x, (G - M)y\rangle| \to 0$ for any fixed vectors $x, y \in \mathbb{C}^{2n}$ in a high-moment sense. A similar norm was introduced in [26] for the same purpose, and to match the notation of that work we introduce coefficients $\kappa_R(\alpha, \beta)$ with $\alpha, \beta \in [2n]^2$, associated to any linear operator $R : \mathbb{C}^{2n\times 2n} \to \mathbb{C}^{2n\times 2n}$ via (4.15). Through this one-to-one correspondence between R and $\kappa_R$ we define $\kappa_c := \kappa_{\mathcal{S}}$. We also recall the following notation from [26]. For an expression $f_{a_1 a_2 \dots a_k}$ with indices $a_1, \dots, a_k$, we write $f_{x a_2 \dots a_k} = \sum_a x_a f_{a a_2 \dots a_k}$ if an index is averaged against a vector x, and similarly if more than one index is averaged. We also write $f_{\cdot\, a_2 \dots a_k}$ for the vector $(f_{a a_2 \dots a_k})_a$.

Let us now fix two deterministic vectors $x, y \in \mathbb{C}^{2n}$ and $K \in \mathbb{N}$. Then, for fixed η and ζ, writing M = M(ζ, η), we recursively define the sets of vectors entering the norm; here $e_a$ denotes the a-th standard basis vector in $\mathbb{C}^{2n}$. The $\|\cdot\|_*$-norm is then defined in (4.16). The definition of $\|\cdot\|_*$ is chosen such that the arguments from [26] can be followed directly in the proof of Theorem 6.2 below. The norm is dominated by the standard operator norm, $\|A\|_* \le 2\|A\|$, and by construction and A2 it satisfies the bounds (4.17) for all $A, B \in \mathbb{C}^{2n\times 2n}$ and $R \in \{M, Z, \mathcal{S}[M], (\mathrm{i}\eta 1 + \mathcal{S}[M])^{-1}\}$. The bounds (4.17) follow exactly as (73), (70b) and (70a) in [26] from A2.

Now we present our main stability theorem. It states that, when (3.3) is properly rewritten and restricted to $E_-^\perp$, it is stable against small perturbations.

Theorem 4.3 (Stability). For any sufficiently small δ > 0 (depending on model parameters), the rewritten equation admits a solution map G on the stated domains. For D, $B_1$ and $B_2$ we have the following choices: either balls with $n$-independent radii $c_1, c_2 > 0$, depending only on the model parameters and on δ, or $\|\cdot\|_*$-balls for sufficiently large $n$ (depending on model parameters, δ and K). Here the superscripts indicate with respect to which norm the ball $B_r(A)$ of radius r around A is meant. The function G is analytic in all variables.
Proof. We solve the implicit equation (4.20). We will show that J is a well-defined bounded holomorphic function on $D \times B_2$ with values in $E_-^\perp$. In particular, we will see that (4.21) holds, where the constants hidden in the comparison relation may depend on δ and K in addition to the model parameters; we will keep this convention until the end of this proof. The theorem then follows from the implicit function theorem, Lemma B.1, and the bound (4.22) on the inverse of the derivative $\nabla J_{\zeta,\bar\zeta,\eta}$. Note that the inverse of the derivative in (4.22) is restricted to the hyperplane $E_-^\perp$, and the $\|\cdot\|_*$-norm on $E_-^\perp$ is simply the restriction of the $\|\cdot\|_*$-norm from (4.16) on $\mathbb{C}^{2n\times 2n}$. To see that $J_{\zeta_1,\zeta_2,\eta}$ leaves the hyperplane $E_-^\perp$ invariant, we compute (4.23). Here we used $\langle E_-(\mathcal{S}[R])R\rangle = 0$ for any $R \in \mathbb{C}^{2n\times 2n}$ and $G \in E_-^\perp$ in the first identity, and, in the second identity, the general fact that the corresponding determinant identity holds by the Schur complement formula for any invertible 2×2-block matrix with square blocks.
In the remainder of the proof we verify (4.25), where the inverse is understood to be restricted to $M^{-1}E_-^\perp$ and the $\|\cdot\|_*$-norm on the hyperplane $M^{-1}E_-^\perp$ is simply the restriction of the $\|\cdot\|_*$-norm on $\mathbb{C}^{2n\times 2n}$. The bounds (4.25) are a consequence of the following three lemmas.
Lemma 4.4 (Resolvent control for $\mathcal{L}$). Let $\mathcal{L}$ be defined as in (4.24). For any sufficiently small δ > 0 (depending on model parameters) there is a constant $\varepsilon \sim_\delta 1$ such that, uniformly in $\eta \in (0, \delta^3)$ and ζ ∈ C with $|\zeta|^2 \le 1 - \delta$, we have the resolvent bound (4.26), and the ε-ball around zero contains a single isolated eigenvalue of $\mathcal{L}$, cf. (4.27). Approximate right and left eigenvectors corresponding to this isolated eigenvalue of $\mathcal{L}$ are given by the identities (4.28).
Lemma 4.5 (Smoothing lemma). Let $\mathbb{C}^d$ be equipped with two norms $\|\cdot\|_\#$, $\|\cdot\|_+$ and let $B \in \mathbb{C}^{d\times d}$ satisfy the corresponding smoothing bound for some constant C > 0. Then for $\xi \notin \operatorname{Spec}(B) \cup \{0\}$ we have the stated resolvent bound.

Lemma 4.6 (Twist lemma). Let $\mathbb{C}^d$ be equipped with a scalar product $\langle\cdot\,,\cdot\rangle$ and a norm $\|\cdot\|_\#$ (not necessarily induced by the scalar product), let ε ∈ (0, 1) and let $A \in \mathbb{C}^{d\times d}$ be such that $D_\varepsilon \cap \operatorname{Spec}A = \{\alpha\}$. We assume that α is a non-degenerate eigenvalue of A with $Aa = \alpha a$ for some $a \in \mathbb{C}^d$ with $\|a\|_\# = 1$. Let $P = a\,p^*$ with some $p \in \mathbb{C}^d$ be the corresponding spectral projection, and let $b \in \mathbb{C}^d$ be a vector such that the non-degeneracy condition (4.30) holds. Suppose that A has a bounded inverse on the range of $1 - P$, i.e. (4.31) holds.
Then A has a bounded inverse when restricted to $b^\perp$, namely (4.32) holds.

Lemma 4.4 is an important technical result that allows us to apply analytic perturbation theory to the isolated eigenvalue λ of the non-selfadjoint operator $\mathcal{L}$. Its proof is given in Subsection 4.3 below. The proof of Lemma 4.5 amounts to simply taking the $\|\cdot\|_\#$-norm in the corresponding resolvent identity. The proof of Lemma 4.6 is postponed to Appendix B.
To show (4.25) we use that, by Lemma 4.4, the spectral projection $P_{\mathcal{L}}$ corresponding to the isolated eigenvalue λ of $\mathcal{L}$ close to zero has rank one, where $(\mathcal{L} - \lambda)L_r = (\mathcal{L} - \lambda)^*L_l = 0$, i.e. $L_r$ and $L_l$ are the unique (up to normalisation) right and left eigenvectors of $\mathcal{L}$, respectively. Now we extend the resolvent control (4.26) from the $\|\cdot\|_{\mathrm{hs}}$-norm to the norms $\|\cdot\|$ and $\|\cdot\|_*$ with the help of Lemma 4.5, applied to the choice $B = \mathrm{Id} - \mathcal{L}$. This is possible because $\|\mathcal{S}\|_{\mathrm{hs}\to\|\cdot\|} + \|\mathcal{S}\|_{\#\to\|\cdot\|} \lesssim 1$. In particular, we may use analytic perturbation theory in the $\|\cdot\|$-norm and find the expansion (4.33) according to (4.28). Applying Lemma 4.6 with the indicated choices shows the invertibility of $\mathcal{L}$ on $M^{-1}E_-^\perp = (E_-\operatorname{Im}(M^{-1}))^\perp$ in the $\|\cdot\|_\#$-norm. Here, the positive constants c and C are chosen sufficiently small and large, respectively, in order to ensure the assumptions (4.30) and (4.31) of Lemma 4.6. In the case $\# = \|\cdot\|$ we have $c \sim C \sim 1$, and in the case $\# = *$ we have $c \sim n^{-1/2K}$ and $C \sim 1$. The expansion (4.33) is used to ensure that indeed $|\langle a, b\rangle| \ge 2\varepsilon$ as required in (4.30), and (4.31) follows from the resolvent control on $\mathcal{L}$ in the #-norm.
Corollary 4.7 (Perturbations). Let η > 0 and ζ ∈ C with $\eta + ||\zeta| - 1| \ge \delta$ for some fixed δ > 0. For any $D \in \mathbb{C}^{2n\times 2n}$ and $G \in E_-^\perp$ with $\|G - M\|_* + \|D\|_* \le n^{-7/K}$ (respectively $\|G - M\| + \|D\| \ll 1$) that satisfy the perturbed Dyson equation (4.34), the matrix G is close to M = M(ζ, η) in the sense of (4.35) for sufficiently large $n$.

We also introduce the matrix Dyson equation with general spectral parameter, given by (4.36) with $z \in \mathbb{H} := \{w \in \mathbb{C} : \operatorname{Im}w > 0\}$, as well as $\mathcal{S}$ and $Z = Z(\zeta, \bar\zeta)$ from (3.4) with ζ ∈ C. There is a unique solution M = M(ζ, z) to (4.36) under the constraint Im M ≥ 0 [35]. Note that (4.36) is the counterpart of (3.3), where the special spectral parameter iη ∈ H is replaced by a general z ∈ H. In particular, both solutions agree for z = iη. To (4.36), we associate the self-consistent density of states $\rho_\zeta$ of $H_\zeta$, defined as the unique probability measure on R whose Stieltjes transform is given by (4.37) for any z ∈ H. The support of $\rho_\zeta$ is called the self-consistent spectrum of $H_\zeta$. By Corollary A.1 below, $\operatorname{supp}\rho_\zeta$ is bounded away from zero for any $\zeta \notin D_{1+\delta}$, due to A1–A3 and our normalisation (4.1).
The remaining regime $1 - |\zeta| \ge \delta$ and η < δ is treated using Theorem 4.3. In this case we rewrite (4.34) in the form (4.38), where $J = J_{\zeta,\bar\zeta,\eta}$ is given in (4.20). We have seen in the proof of Theorem 4.3 that the corresponding expansion holds with $\Delta = G - M$. Applying (4.17), we take the $\|\cdot\|_*$-norm on both sides and estimate the resulting terms, where for the last summand we used that $\|AB\|_* \lesssim n^{1/2}\|A\|\,\|B\|_*$ for any pair of matrices A, B. Owing to the assumption $\|\Delta\|_* \le n^{-7/K}$, this verifies (4.39).

Resolvent control on L
In this subsection we prove Lemma 4.4 by considering a reduction L of $\mathcal{L}$ to the space of diagonal block matrices, or equivalently to $\mathbb{C}^{n\times n} \oplus \mathbb{C}^{n\times n}$. We introduce a shorthand notation as well as the average and scalar product on $\mathbb{C}^{n\times n} \oplus \mathbb{C}^{n\times n}$, defined in (4.40) for $A, B, A_1, B_1, A_2, B_2 \in \mathbb{C}^{n\times n}$. We denote linear operators $\mathcal{A}$ on $\mathbb{C}^{n\times n} \oplus \mathbb{C}^{n\times n}$ by the block notation (4.41). We split the stability operator $\mathcal{L}$ into diagonal and off-diagonal contributions. Here we introduced the projection and embedding operators from (4.42), the reduced stability operator L from (4.43), with $E_- \in \mathbb{C}^{n\times n} \oplus \mathbb{C}^{n\times n}$ defined in analogy to (4.14) through (4.44), and the off-diagonal contribution $\mathcal{Q} : \mathbb{C}^{2n\times 2n} \to \mathbb{C}^{2n\times 2n}$ to the stability operator, given in (4.45). Similarly to (4.44), we also use the analogous notation on $\mathbb{C}^{n\times n} \oplus \mathbb{C}^{n\times n}$.

On the level of the reduced stability operator, the result analogous to Lemma 4.4 is the following statement.

Lemma 4.8 (Resolvent control for L). For any sufficiently small δ > 0 (depending on model parameters) there is a constant $\varepsilon \sim_\delta 1$ such that, uniformly in $\eta \in (0, \delta^3)$ and ζ ∈ C with $|\zeta|^2 \le 1 - \delta$, we have the resolvent bound (4.46). Furthermore, the ε-ball around zero contains a single isolated eigenvalue $\lambda \ne 0$ of L, cf. (4.47). Approximate right and left eigenvectors corresponding to this isolated eigenvalue of L are given by the identities (4.48), which are valid globally for η > 0 and ζ ∈ C.
The proof of Lemma 4.8 requires some preparation, but first we will see how the lemma is used to establish Lemma 4.4. For $t \in [0, 1]$ we consider the interpolation $\mathcal{L}_t := \mathcal{L} - t\mathcal{Q}$ that removes the off-diagonal contribution. With the help of (4.46) we now establish the lower bound $\|(\mathcal{L}_t - \xi)[R]\|_{\mathrm{hs}} \gtrsim \|R\|_{\mathrm{hs}}$ for any $R \in \mathbb{C}^{2n\times 2n}$ and all ξ in the domain where the resolvent is controlled, i.e. $\xi \notin (2 + D_\varepsilon) \cup (1 + D_{1-\varepsilon}) \cup D_\varepsilon$. This finishes the proof of (4.26) with the choice t = 0. Furthermore, it shows that no eigenvalues can leave the complement of the domain where the resolvent is controlled along the continuous interpolation. We conclude that the non-degeneracy property (4.27) holds if it can be established for $\mathcal{L}_1 = P^*LP + \mathrm{Id} - P^*P$. But $\mathcal{L}_1$ leaves both the space of diagonal and the space of off-diagonal block matrices invariant, and acts as L on the first and as the identity on the latter. Thus (4.27) follows from (4.47). Finally, the fact that $\lambda \ne 0$ follows e.g. from the general result on the weak, i.e. η-dependent, stability of the Dyson equation from Lemma A.3.
To prepare the proof of Lemma 4.8 we introduce some auxiliary operators. The purpose of these operators is to allow for a rewriting of the non-Hermitian reduced stability operator L in terms of Hermitian operators for which spectral information can be turned into norm bounds.

Definition 4.9. For any η > 0 and τ ≥ 0 we define the $n\times n$ matrices P, $K_1$ and $K_2$ in (4.49), in terms of the solution $V_i = V_i(\tau, \eta)$ to (3.5) and $U = U(\tau, \eta)$ from (3.6). Furthermore, we define the linear operators $\mathcal{T}_{\tau,\eta}$, $\mathcal{F}_{\tau,\eta}$ and $\mathcal{V}_{\tau,\eta}$ on $\mathbb{C}^{n\times n} \oplus \mathbb{C}^{n\times n}$ through (4.50), (4.51) and (4.52), respectively.

The matrices from Definition 4.9 allow us to rewrite L through the formula (4.53). The following three lemmas list important analytical properties of the operators from Definition 4.9.
Lemma 4.10 (Properties of P, $K_1$, $K_2$ and V). Fix η > 0 and τ ≥ 0. The matrices $K_1$, $K_2$ and P defined in (4.49) satisfy the identities (4.54) as well as the comparison relations (4.55). The operator V from (4.52) is invertible and satisfies (4.56).

Proof. The identities (4.54) follow from the identities for U and $V_i$ collected in Section 4.1.

Lemma 4.11 (Spectral properties of F). The operator F defined in (4.51) satisfies the following properties uniformly for η > 0 and τ ≥ 0:

1. It is self-adjoint with respect to the scalar product (4.40) and positivity preserving, i.e. (4.58) holds, where $\mathcal{C}_+$ denotes the cone of positive definite matrices and $\overline{\mathcal{C}}_+$ its closure.

2. It has a positive spectral radius, and $\pm\|\mathcal{F}\|_{\mathrm{hs}}$ are non-degenerate eigenvalues of F with unique corresponding eigenvectors $F_\pm$ of the form (4.59) for some normalized ($\|F_1\|_{\mathrm{hs}} = \|F_2\|_{\mathrm{hs}} = 1$) matrices $F_1, F_2 \in \mathcal{C}_+$. Both these matrices are comparable to the identity matrix.

3. The spectral gap of F is bounded away from zero, i.e. there exists ε ∼ 1 such that (4.60) holds.

4. The spectral radius of F is given by the formula (4.61).

5. The eigenvectors $F_\pm$ satisfy (4.62).
Proof. The self-adjointness of F is clear from its definition (4.51), and the property of being positivity preserving is inherited from the same property of S (cf. (2.1)). Thus (4.58) holds true. Properties 2 and 3 now follow from the block structure given in (4.51): the spectrum of F is determined by the spectrum of $\mathcal{F}^*\mathcal{F}$. Because of (4.55) and $V_1 \sim V_2 \sim \rho$ (cf. (4.4)), the operators $\mathcal{F}^*\mathcal{F}$ and $\mathcal{F}\mathcal{F}^*$ inherit the flatness property (4.2) from S. Thus we can apply [4, Lemma 4.8] to infer (4.63), where ε ∼ 1 is a bound on the spectral gap. According to the same lemma, the eigenvalue 1 in (4.63) is non-degenerate with corresponding normalised eigenmatrices $F_1, F_2 \in \mathcal{C}_+$ that satisfy (4.59). In particular, $F_\pm$ are eigenvectors of $\mathcal{F}^2$ corresponding to $\|\mathcal{F}\|_{\mathrm{hs}}^2$ and, consequently, $\mathcal{F}F_\pm = \pm\|\mathcal{F}\|_{\mathrm{hs}}F_\pm$.

It remains to verify Properties 4 and 5. For this purpose we will use that $\mathcal{V}V_\pm$ are approximate eigenvectors in the sense of (4.64). Indeed, (4.64) follows from (4.65) by using the definition of V in (4.52) to identify the first summand on the right-hand side of (4.65) as $\pm\mathcal{V}V_\pm$. In (4.65) we used the definitions of V and F for the first equality and the identities (4.54) for the second equality. For (4.61) we choose the + in (4.64), take the scalar product with $F_+$ and use that F is self-adjoint. To establish (4.62) we apply Lemma B.3 for S replaced by $\mathcal{F}\mathcal{F}^*/\|\mathcal{F}\|_{\mathrm{hs}}^2$ and $\mathcal{F}^*\mathcal{F}/\|\mathcal{F}\|_{\mathrm{hs}}^2$, i.e. for the diagonal entries of $(\mathcal{F}/\|\mathcal{F}\|_{\mathrm{hs}})^2$. Due to (4.64), the projections of $\mathcal{V}V_+$ to the first and second component provide approximate eigenvectors for these two operators. The resolvent control from Lemma B.3 allows us to use analytic perturbation theory, and the size of the error term in (4.62) is a consequence of (4.55), (4.4) and the definition of V in (4.52). This finishes the proof of the lemma.

Lemma 4.12 (Spectral properties of T). The operator T defined in (4.50) satisfies the following properties uniformly for η > 0 and τ ≥ 0:

1. It is self-adjoint, $\mathcal{T}^* = \mathcal{T}$.
2. Let $P = \sum_{i=1}^n \pi_i p_i q_i^*$, with $\pi_i \ge 0$ and orthonormal bases $(p_i)_i$ and $(q_i)_i$ of $\mathbb{C}^n$, be the singular value decomposition of P. The eigenvectors of T are given in (4.66). In particular, the spectrum of T is bounded away from 1 by some $\varepsilon_1 > 0$ satisfying (4.67).

3. An eigenvector of T corresponding to the eigenvalue −1 is given by (4.68).

4. On $\mathcal{V}V_+$ the operator T acts as a contraction, i.e. there is an $\varepsilon_2 > 0$ such that (4.69) holds.

Proof. The self-adjointness of T follows immediately from its definition in (4.50). The form of the eigenvectors in (4.66) is a consequence of the following general fact. Let $A = \sum_{i=1}^n \alpha_i a_i x_i^*$ and $B = \sum_{i=1}^n \beta_i b_i y_i^*$ be singular value decompositions of matrices A and B, and let $\mathcal{A}[R] := ARB$ be the operator that multiplies a matrix R from the left by A and from the right by B. Then $\mathcal{A}[x_l b_k^*] = \alpha_l\beta_k\, a_l y_k^*$. In particular, the analogous formula holds for $f(\mathcal{A}^*\mathcal{A})$ for any function f that is continuous on the positive reals. With these formulas, (4.66) is easily verified using the definition of T. The bound (4.67) on the spectrum of T now follows from (4.66) and (4.55). For (4.68) and (4.69) we use the identities involving $\frac{\mp 1 + \tau P^*P}{(1 + \tau P^*P)^{3/2}}$ that follow from the definitions of T and V in (4.50) and (4.52), respectively. To show (4.69) we also use (4.55) and (4.4).
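The general fact about singular value decompositions used in this proof is easy to verify numerically (our sketch):

```python
import numpy as np

# Check: for A = sum_i alpha_i a_i x_i^* and B = sum_i beta_i b_i y_i^*,
# the operator R -> A R B maps x_l b_k^* to alpha_l beta_k a_l y_k^*.
rng = np.random.default_rng(5)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

Ua, sa, Vha = np.linalg.svd(A)
Ub, sb, Vhb = np.linalg.svd(B)

l, k = 2, 4
x_l = Vha[l, :].conj()   # right singular vector of A
b_k = Ub[:, k]           # left singular vector of B
y_k = Vhb[k, :].conj()   # right singular vector of B

R = np.outer(x_l, b_k.conj())             # the matrix x_l b_k^*
lhs = A @ R @ B
rhs = sa[l] * sb[k] * np.outer(Ua[:, l], y_k.conj())
print(np.max(np.abs(lhs - rhs)))  # ~ 0
```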
Proof of Lemma 4.8. We start by verifying (4.48). Indeed, owing to the representation of M in (3.7), we have (4.70), where we used the identities (4.12) for the second equality. By using the comparison relations (4.4) and (4.11) to bound the last summand on the right-hand side of (4.70), we conclude (4.48a). The identity (4.48b) is verified by using the definition of M and (4.12) again.

Now we turn to the proof of the resolvent bound (4.46) for the reduced stability operator L. We rewrite this operator using (4.53) and apply this representation to the resolvent of $P\mathcal{C}_M\mathcal{S}P^*$ to get (4.71). For $\eta \le \delta^3$ and $\tau = |\zeta|^2 \le \delta$ we use $\mathcal{T} = -1 + O_{\mathrm{hs}}(\tau) = -1 + O_{\mathrm{hs}}(\delta)$, which follows from the definition of T in (4.50) and (4.55), as well as $V_1 \sim V_2 \sim 1$ in this regime (cf. (4.4)). From (4.56), the spectral properties of F, (4.60), and $\|\mathcal{F}\|_{\mathrm{hs}} = 1 + O(\eta) = 1 + O(\delta^3)$ (cf. (4.61)), as well as (4.71), we infer that there is an ε ∈ (0, 1/2) such that (4.72) holds. In particular, (4.46) holds true. The non-degeneracy (4.47) of the eigenvalue in $D_\varepsilon$ follows from the non-degeneracy of the eigenvalue $\|\mathcal{F}\|_{\mathrm{hs}}$ of F as stated in Lemma 4.11. The statement $\lambda = O_\delta(\eta)$ about the non-degenerate isolated eigenvalue in (4.47) follows from $\mathcal{V}V_-$ being an approximate eigenvector (cf. (4.48a)) and the resolvent bound (4.46).

For $\eta \le \delta^3$ and $\tau \in [\delta, 1 - \delta]$ we will apply Lemma B.2 with the choices $F := \mathcal{F}/\|\mathcal{F}\|_{\mathrm{hs}}$ and $T := \mathcal{T}$. We verify the assumptions of the lemma. The required upper bound $\|\mathcal{T}\|_{\mathrm{hs}} \le 1$ follows from (4.67), and (B.10) holds true because of (4.60). Furthermore, according to (4.62) and (4.69), we have the required bounds, where $F_+$ is the normalized eigenvector of F corresponding to the eigenvalue $\|\mathcal{F}\|_{\mathrm{hs}}$ and we used (4.4) to see the bounds in terms of δ. Finally, note that $\lambda \ne 0$, because the representation (4.53) shows that, with $\|\mathcal{T}\| \le 1$ (cf. (4.67)), the operator L is invertible as long as $\|\mathcal{F}\|_{\mathrm{hs}} < 1$, which is always true for η > 0 since the right-hand side of (4.61) does not vanish.

As used in the proof of Proposition 2.4 above, the uniformity of the statement of Theorem 4.3 in η > 0 allows for an extension of M as well as $V_1, V_2$ to η = 0 in the following sense.

Corollary 4.13 (Extension to η = 0). The solution M of (3.3) has a unique continuous extension to $\mathbb{C} \times [0, \infty)$, i.e. to η = 0. For every $\zeta \notin \partial D$, this extension, still denoted by M, also has a continuation to a neighbourhood of (ζ, 0) that is real analytic in Re ζ, Im ζ and η. The size of this neighbourhood only depends on the model parameters and on $\operatorname{dist}(\zeta, \partial D) = ||\zeta| - 1|$.

Self-consistent density of states
In this section we use the information about the solution of the Dyson equation to control the self-consistent density of states σ corresponding to X. In Subsection 5.1 we begin by establishing upper and lower bounds on the density. In Subsection 5.2 we provide a detailed description of $V_i$ and σ at the edge of the spectrum, and we summarise the results of that subsection in the proof of Theorem 2.5. Subsections 5.3 and 5.4 contain the proofs of Lemma 3.5 and Proposition 2.9, respectively.

Upper and lower bounds in the bulk
In this subsection we establish lower and upper bounds on the density σ inside the spectrum, i.e. we show (2.12) away from the edge of the spectrum at |ζ| = 1.
Lemma 5.1 (Formula for density). For any ζ ∈ D the density σ admits the formula (5.1), where all expressions on the right-hand side are evaluated at η = 0 (cf. Corollary 4.13) and $\tau = |\zeta|^2$, and where the auxiliary quantity is defined in (5.2), with $K_i$ the matrices from (4.49). For τ = 0 the very right-hand side of (5.1) is interpreted as its limit τ ↓ 0. Here and in the following, we use the notation introduced in (5.3).

Proof. By definition of σ in (2.11) and the identity (4.9), we have (5.4), where we used the structure of $\mathcal{S}$ and the projection P from (4.42) in the second equality. We compute the derivative of M with respect to τ by differentiating both sides of (3.3) and solving for $\partial_\tau M$; this yields (5.5). By definition of M and the identities (4.8) for U we have (5.6). Thus, inserting (5.6) into (5.5) and recalling the definition of L from (4.43) shows (5.7). We plug this into (5.4) and verify the first equality in (5.1). Note that $L^{-1}$ is applied to the orthogonal complement of $(S[V_2], S^*[V_1])$ in (5.7). To check the orthogonality of the vector on the right-hand side, we can use (4.8) at η = 0.
For the second equality in (5.1) we recall the definitions of T, F and V from (4.50), (4.51) and (4.52), as well as the identities (4.54), which take the form (5.8) at η = 0. Then we compute the middle expression in (5.1), where we used (2.9) for the last equality. Again with (2.9) we also have (5.9).
Now we insert the representation (4.53) for the reduced stability operator into the middle formula of (5.1). Afterwards we use (5.8) and (5.9) to get The vector in the second argument of the scalar product is a representation of the Perron-Frobenius eigenvector for F. Indeed, by the definitions of F in (4.51) and K i in (5.2) we see that Because of (5.10) we also have the identity which finishes the proof of the second equality in (5.1) and thus the proof of the lemma.
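For orientation, in the special i.i.d. case, where the self-energy acts as $S[R] = \langle R \rangle$ and the Dyson equation becomes scalar, the objects above can be computed explicitly: the positive solution of $v = (v+\eta)/((v+\eta)^2 + \tau)$ with $\tau = |\zeta|^2$ converges, as $\eta \downarrow 0$, to $\sqrt{1-\tau}$ inside the disk, consistent with the circular law. The following numerical sketch illustrates this scalar reduction only; the helper `v_scalar` is ours and is not part of the general correlated model.

```python
import numpy as np

def v_scalar(tau, eta):
    """Positive solution v of the scalar Dyson equation
        v = (v + eta) / ((v + eta)**2 + tau),
    equivalently the unique root with positive real part of the cubic
        v**3 + 2*eta*v**2 + (eta**2 + tau - 1)*v - eta = 0   (for eta > 0)."""
    roots = np.roots([1.0, 2.0 * eta, eta**2 + tau - 1.0, -eta])
    return roots[np.argmax(roots.real)].real

# Inside the disk (tau < 1) the eta -> 0 limit is strictly positive:
for tau in [0.0, 0.25, 0.5, 0.9]:
    print(f"tau={tau:4.2f}  v={v_scalar(tau, 1e-12):.6f}"
          f"  sqrt(1-tau)={np.sqrt(1 - tau):.6f}")
```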
Proof of Corollary 5.2. We consider two separate regimes. First, upper and lower bounds on $\sigma$ follow in a neighbourhood of $\zeta = 0$ by continuity of $\sigma$ (cf. Corollary 4.13 and (2.11)) and from $\sigma(0) \sim 1$. The latter is easy to see because at $\tau = \eta = 0$ the Dyson equation simplifies considerably. In particular, the reduced stability operator has the form $L = 1 - A$, where $A$ preserves the cone of pairs of positive definite matrices. Thus the first identity in (5.1) implies $\sigma(0) \sim 1$. Note that we can expand $L^{-1} = (1 - A)^{-1}$ in a Neumann series because of the representation (4.53), $\|T\| \le 1$ and $\|F\|_{\mathrm{hs}} < 1$.

Now we consider the regime $\tau^{1/2} = |\zeta| \le 1 - \delta$ away from the origin. Here, owing to the second relation in (5.1), the density $\sigma$ admits lower and upper bounds in terms of a resolvent norm. We apply Lemma B.2 with $T = T$, $F = F/\|F\|_{\mathrm{hs}}$ and $f_\pm = V V_\pm/\|V V_\pm\|_{\mathrm{hs}}$ and note that the non-degenerate eigenvalue $1$ of $A$ corresponds to the eigenvector $V V_-$, which is projected out when we take the norm. Thus we obtain a resolvent bound with some $\varepsilon \sim 1$, which implies the desired norm bound.
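The Neumann-series inversion invoked in the proof is the elementary fact that $\|A\| < 1$ makes $1 - A$ invertible with $(1-A)^{-1} = \sum_k A^k$; a minimal numerical check on a generic contraction (not the operator $A$ from the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A *= 0.9 / np.linalg.norm(A, 2)          # enforce spectral norm 0.9 < 1

# Truncated Neumann series: (1 - A)^{-1} = sum_k A^k, tail of size O(0.9**K).
K, term, series = 400, np.eye(6), np.zeros((6, 6))
for _ in range(K):
    series += term
    term = term @ A

print(np.max(np.abs(series - np.linalg.inv(np.eye(6) - A))))   # negligible
```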

Solution close to the edge
In this subsection we explicitly determine the leading order of the solution $V_1$, $V_2$ to (3.5) close to the edge $\tau = |\zeta|^2 = 1$ of the spectrum. We use the result to determine the jump height (2.13) of the density $\sigma$ at the edge. Let $S_2$ and $S_1$ be the unique positive definite right and left eigenvectors of $S$, respectively, i.e. $S S_2 = S_2$ and $S^* S_1 = S_1$, satisfying the normalisation $\langle S_1 \rangle = \langle S_2 \rangle = 1$. We also write $\rho := \rho_\zeta := \langle V_1 \rangle/\pi$ for the harmonic extension of the self-consistent density of states of $H_\zeta$ to the complex upper half plane and recall that $\rho$ is comparable to the right hand side of (4.4).
where $\alpha$ satisfies the cubic equation.

Proof. We write $\tau = 1 + \varepsilon$ for some small $\varepsilon$. The case $\varepsilon \le -c$ for some constant $c \sim 1$ is trivial since then $\rho \sim 1$ and the error term in (5.13) dominates. Similarly, for $\varepsilon \ge c$ we have $\rho \sim \alpha \sim \eta$, i.e. in both regimes the proposition does not contain any information. Solving (4.10) shows the expansion (5.14) of $U$. We use this expansion for $U$ in (4.9), multiply with $V_1^{-1}$ from the left and use the decomposition (5.16) to find (5.15). Here $\widehat{V}_1$ and $\widehat{V}_2$ are the spectral projections of $V_1$ and $V_2$ corresponding to the spectra of $S$ and $S^*$ complementary to the isolated eigenvalue $1$, respectively, i.e. $\widehat{V}_i = Q_i V_i$ with $Q_i$ the corresponding spectral projections. In particular, projecting both sides of (5.15) onto the range of $Q_2$ implies $\|\widehat{V}_2\| \lesssim \eta + \rho^3$. Here we used that $\|(1 + \varepsilon - S)^{-1} Q_2\| \lesssim 1$, which follows from Lemma B.3. By exchanging the roles of $V_1$ and $V_2$ we also find $\|\widehat{V}_1\| \lesssim \eta + \rho^3$. Therefore, (5.15) can be expanded further. Now we apply the rank one projection $P_2$ on both sides and obtain the leading order relation for $\alpha$, where we used $\langle S_1 \rangle = 1$ and, for the error term, $|\varepsilon| \rho^3 \lesssim \rho^5 + \eta \rho^2$ due to (4.4). Finally, (5.13) follows from an identity that is a consequence of (4.3) and $\alpha_2 = \alpha$. Moreover, (5.16) and $\|\widehat{V}_i\| \lesssim \eta + \rho^3$ yield (5.12).
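In the same scalar toy model as in the sketch after Lemma 5.1, the regimes appearing in the proof are visible numerically: the solution behaves like $\sqrt{\varepsilon}$ just inside the edge and collapses to order $\eta/\varepsilon$ just outside. (Again an illustration of the i.i.d. reduction only, not of the correlated model.)

```python
import numpy as np

def v_scalar(tau, eta):
    # positive root of v**3 + 2*eta*v**2 + (eta**2 + tau - 1)*v - eta = 0,
    # the scalar Dyson equation from the sketch after Lemma 5.1
    roots = np.roots([1.0, 2.0 * eta, eta**2 + tau - 1.0, -eta])
    return roots[np.argmax(roots.real)].real

# Behaviour near the edge tau = 1, at tiny eta:
for eps in [1e-2, 1e-4, 1e-6]:
    v_in, v_out = v_scalar(1 - eps, 1e-12), v_scalar(1 + eps, 1e-12)
    print(f"eps={eps:.0e}  inside: v={v_in:.3e} (sqrt(eps)={eps**0.5:.1e})  "
          f"outside: v={v_out:.3e} (eta/eps={1e-12/eps:.1e})")
```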
For the next corollary, we introduce the operator $\mathcal{M} : \mathbb{C}^{n \times n} \times \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n} \times \mathbb{C}^{n \times n}$ defined by (5.17). Moreover, we recall that $L$ and $E_-$ were defined in (4.43) and (4.44), respectively.
Corollary 5.4 (Resolvent control for stability operator close to the edge). The following holds.

1. The operator $\mathcal{M} L$ has the invariant subspace $E_-^\perp$.

2. There is $\delta \sim 1$ such that, for any $\eta \in (0, \delta)$ and $\zeta \in D_{1+\delta} \setminus D_{1-\delta}$, the eigenvalues of $\mathcal{M} L|_{E_-^\perp}$ close to zero are isolated in the sense of a resolvent bound with some $\varepsilon \sim 1$ and $\# = \mathrm{hs}, \|\cdot\|$. In fact, $\mathcal{M} L|_{E_-^\perp}$ has only one eigenvalue in $D_\varepsilon$. This eigenvalue is simple and its spectral projection is a small perturbation of $K$ from Corollary B.4, due to (5.12) and (5.14). Thus the claim follows from Corollary B.4 for $\# = \mathrm{hs}, \|\cdot\|$ by perturbation theory for sufficiently small $\delta \sim 1$.
Corollary 5.5 (Density at the edge). At the edge of the spectrum the self-consistent density of states has the expansion (5.19) for any $\zeta \in D$.
Proof. We set $\eta = 0$ throughout the proof. We use the first identity in (5.1) and insert $\mathcal{M}$ from (5.17). We consider $\tau = |\zeta|^2 = 1 - \varepsilon$ for some $\varepsilon > 0$. Since $\eta = 0$, we get from (5.12) an expansion of $V_1$, $V_2$ in $\varepsilon$, where we used $\rho \sim \sqrt{\varepsilon}$. From the expansion of $U$ in (5.14) this implies an expansion of the stability operator, whose first order perturbation is given by $\mathcal{D}$. According to Corollary 5.4 the operator $\mathcal{M} L = (\mathcal{M}_e + \mathcal{O}(\varepsilon))(L_e + \varepsilon \mathcal{D} + \mathcal{O}(\varepsilon^2))$ has an isolated eigenvalue $\lambda$ close to $0$ when restricted to $E_-^\perp$. Therefore, we can use perturbation theory to determine its value to leading order. This finishes the proof.
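The perturbation-theory step above, evaluating an isolated simple eigenvalue to first order, can be checked on generic matrices: for $L(\varepsilon) = L_0 + \varepsilon D$ the eigenvalue near $\lambda_0$ moves as $\lambda_0 + \varepsilon \langle l, D r\rangle/\langle l, r\rangle + \mathcal{O}(\varepsilon^2)$, with $r$, $l$ the right and left eigenvectors of $L_0$. A numerical sketch with generic matrices (not the operators $\mathcal{M}_e$, $L_e$, $\mathcal{D}$ from the proof):

```python
import numpy as np

rng = np.random.default_rng(1)

# First-order perturbation of a simple isolated eigenvalue.
L0 = np.diag([0.0, 1.0, 2.0, 3.0]) + 0.1 * rng.standard_normal((4, 4))
D = rng.standard_normal((4, 4))

lamR, R = np.linalg.eig(L0)
lamL, Lv = np.linalg.eig(L0.T)                  # left eigenvectors of L0
i = np.argmin(np.abs(lamR))                     # eigenvalue closest to 0
j = np.argmin(np.abs(lamL - lamR[i]))           # matching left eigenvalue
r, l = R[:, i], Lv[:, j]
slope = (l @ D @ r) / (l @ r)

for eps in [1e-2, 1e-3, 1e-4]:
    lam_eps = min(np.linalg.eigvals(L0 + eps * D), key=abs)
    print(f"eps={eps:.0e}  error of first-order prediction: "
          f"{abs(lam_eps - (lamR[i] + eps * slope)):.2e}")   # decays like eps**2
```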
Proof of Theorem 2.5. The analyticity of $V_1$, $V_2$, and thus the well-definedness of $\sigma$ in (2.11), was shown in Corollary 4.13; the upper and lower bounds on $\sigma$ from (2.12) were established in Corollary 5.2 away from the edge and in Corollary 5.5 close to the edge. Integrating the definition of $\sigma(\zeta)$ over $D_1$ and recalling $\varrho(S) = 1$ from (4.1), as well as $V_1(\tau) \to 0$ and $V_2(\tau) \to 0$ for $\tau \to 1$ due to Corollary 4.13 and (4.4), implies that $\sigma$ is a probability density on $\mathbb{C}$. Finally, the jump height (2.13) of $\sigma$ right at the spectral edge is read off from (5.19).

Proof of Lemma 3.5
In this subsection, we prove Lemma 3.5, the basic property of σ used in the proofs of the global and local inhomogeneous circular law, Theorem 2.3 and Theorem 2.7.
The identity (5.24) directly shows that $L$ is rotationally symmetric on $\mathbb{C}$. Moreover, it implies that $L$ is a continuous function of $\zeta$ on $\mathbb{C}$ since $V_1(\tau)$ and $V_2(\tau)$ are continuous functions of $\tau$.
Since $L$ is rotationally symmetric it suffices to show (3.19) under the same constraint on $f$. If $f \in C_0^2(\mathbb{C})$ is rotationally symmetric then a simple change of coordinates yields (5.27). We now split the $\tau$-integration into $(0, 1)$ and $(1, \infty)$ and use the differentiability of $L$ on both domains to integrate by parts. More precisely, integrating by parts twice, using the continuity of $L$ and $L(1) = 0$ as well as (5.26) and $\lim_{\tau \uparrow 1} \partial_\tau L(\sqrt{\tau}) = -1/2$, yields the contribution of $(0, 1)$. Here, we used in the last step that $\pi \sigma(\zeta) = \partial_\tau \big(\tau U(\tau)\big)\big|_{\tau = |\zeta|^2}$ if $|\zeta| < 1$, due to the definition of $\sigma$ in (2.11) and the definition of $U$ in (3.6). Secondly, an integration by parts, (5.26), the continuity of $L$ and $L(1) = 0$ yield the contribution of $(1, \infty)$. By plugging these identities into (5.27), we obtain the claimed formula, where the last step follows from $\sigma(\zeta) = 0$ if $|\zeta| \ge 1$ by definition (see (2.11)). This proves (3.19).
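The integration by parts above is, in essence, the radial form of the distributional identity $\Delta_\zeta \log|\zeta - w| = 2\pi \delta_w$ that underlies (3.19). A quick numerical check of this identity against a smooth test function (grid quadrature; the Gaussian test function is our choice):

```python
import numpy as np

# Weak check of (1/(2*pi)) * Laplacian log|zeta - w| = delta_w.
def f(x, y):
    return np.exp(-(x**2 + y**2))                # smooth, rapidly decaying test function

def lap_f(x, y):                                  # its Laplacian, computed by hand
    r2 = x**2 + y**2
    return (4.0 * r2 - 4.0) * np.exp(-r2)

w = 0.3005 + 0.2005j                              # kept off the grid points
h = 0.01
x = np.arange(-6.0, 6.0, h)
X, Y = np.meshgrid(x, x)
Z = X + 1j * Y

integral = np.sum(lap_f(X, Y) * np.log(np.abs(Z - w))) * h**2 / (2.0 * np.pi)
print(integral, f(w.real, w.imag))                # both approximately 0.878
```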

Proof of Proposition 2.9
In this subsection, we establish Proposition 2.9.
Proof of Proposition 2.9. In the light of Theorem 2.5, it suffices to show that (2.15) holds when $\mu_X(\mathrm{d}\zeta)$ is replaced by $\sigma(\zeta)\, \mathrm{d}^2\zeta$. To that end, let $L(\zeta)$ be defined as in (3.19). We first show that (5.28) holds for all $\zeta \in \mathbb{C}$. Using [38, Theorem 11 and Proposition 13 in Chapter 9] this is easy to see, where $\mathbf{M}(\zeta, z)$ is defined analogously to (3.1) with $\mathbf{X}$ from (2.14) and $\mathbf{S}$ is defined as in (3.4) with $S$ and $S^*$ from (2.17). We introduce the tracial state $\varphi := \langle\, \cdot \,\rangle \otimes \tau$ on $M_{n \times n}$ and the matrix $E_{22} \in \mathbb{C}^{2n \times 2n}$ which has the identity matrix in its lower-right $n \times n$ block and vanishes otherwise. Thus, the definitions of $\varphi$ and $\mathbf{M}$ as well as (B.27) imply (5.30). We remark that the integrals exist due to (3.18). In (5.30), we used (5.31), which holds for all $\lambda \in \mathbb{C}$ with some harmonic function $h : \mathbb{C} \to \mathbb{C}$. In the proof of Lemma 3.5, we saw that $L(\lambda) = -\log|\lambda|$ if $|\lambda|$ is sufficiently large. Hence, $h(\lambda) \to 0$ as $|\lambda| \to \infty$, which implies $h \equiv 0$. Therefore, (5.28) and (5.31) with $h \equiv 0$ prove (2.19) and, thus, Proposition 2.9.
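The link between $L(\zeta)$ and the Hermitization exploited here rests on Girko's identity relating the eigenvalues of $X$, the determinant of $X - \zeta$ and the spectrum of $H_\zeta$; it is elementary to verify numerically (illustrative check, with our own normalisation of $X$):

```python
import numpy as np

rng = np.random.default_rng(5)

# Hermitization identity: with Y = X - zeta and H = [[0, Y], [Y^*, 0]],
#   sum_i log|lambda_i(X) - zeta| = log|det Y| = (1/2) * sum_j log|mu_j(H)|,
# where mu_j(H) are the (real) eigenvalues of the Hermitization H.
n, zeta = 60, 0.4 + 0.3j
X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
Y = X - zeta * np.eye(n)
H = np.block([[np.zeros((n, n)), Y], [Y.conj().T, np.zeros((n, n))]])

lhs = np.sum(np.log(np.abs(np.linalg.eigvals(X) - zeta)))
mid = np.linalg.slogdet(Y)[1]
rhs = 0.5 * np.sum(np.log(np.abs(np.linalg.eigvalsh(H))))
print(lhs, mid, rhs)                              # the three values coincide
```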

Local inhomogeneous circular law
This section is devoted to the proof of Theorem 2.7, which is based on the next theorem. Its formulation and the notation in the subsequent arguments are simplified by the following notion of high probability estimate, first introduced in [27].
Definition 6.1 (Stochastic domination). Let $X = X^{(n)}$ and $Y = Y^{(n)}$ be two sequences of nonnegative random variables. We say that $X$ is stochastically dominated by $Y$, denoted by $X \prec Y$, if, for any $\varepsilon > 0$ and $\nu > 0$, there is $C \equiv C_{\varepsilon,\nu}$ such that
$$\mathbb{P}\big( X > n^{\varepsilon} Y \big) \le C_{\varepsilon,\nu}\, n^{-\nu} \tag{6.1}$$
for all $n \in \mathbb{N}$.
We remark that stochastic domination is compatible with the basic arithmetic operations (see e.g. [27, Lemma 4.4]). The constants $C_{\varepsilon,\nu}$ in (6.1) will typically depend on the model parameters.
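As an illustration of Definition 6.1 (not used in any proof): the normalized fluctuation $X_n = |g_1 + \cdots + g_n|/\sqrt{n}$ of a sum of i.i.d. standard Gaussians satisfies $X_n \prec 1$, since $\mathbb{P}(X_n > n^{\varepsilon})$ is a Gaussian tail probability and hence decays faster than any power of $n$. A small Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(2)

# X_n = |g_1 + ... + g_n| / sqrt(n) has the law of |N(0,1)| for every n,
# so P(X_n > n**eps) = P(|N(0,1)| > n**eps) decays superpolynomially in n.
eps, trials = 0.1, 200_000
z = np.abs(rng.standard_normal(trials))           # samples of X_n, for any n
for n in [10**2, 10**3, 10**4, 10**5]:
    print(f"n={n:>6}  n^eps={n**eps:.2f}  "
          f"P(X_n > n^eps) ~ {np.mean(z > n**eps):.5f}")
```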
To simplify the formulation of the next result, we fix $\tau_* \in (0, \varrho(S))$ and define the corresponding spectral domains.

Theorem 6.2 (Local law for $H_\zeta$). Let $X$ satisfy A1-A3 and $M$ be the solution of (3.3). Then we have the isotropic local law (6.2), uniformly for all deterministic vectors $x, y \in \mathbb{C}^{2n}$. Moreover, the averaged local law (6.3) holds uniformly for all deterministic matrices $R \in \mathbb{C}^{2n \times 2n}$.
We will prove Theorem 6.2 in Subsection 6.1 below. The next lemma is an application of Theorem 6.2 and estimates the number of eigenvalues of $H_\zeta$ that are small in modulus. It will be used in the proof of Theorem 2.7 to control the integral in (3.16) for small $\eta$.
Proof. We follow the proof of Lemma 3.8 and use $|\operatorname{Tr} G| \prec n$ for all $\eta \ge n^{-1+\varepsilon}$, which holds due to (6.3), instead of $|\operatorname{Tr} G| \lesssim n$. This proves Lemma 6.3.
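The mechanism behind Lemma 6.3 is the elementary spectral bound $\#\{k : |\lambda_k| \le \eta\} \le 2\eta \operatorname{Im} \operatorname{Tr} G(\mathrm{i}\eta)$, since every eigenvalue with $|\lambda_k| \le \eta$ contributes at least $1/(2\eta)$ to $\operatorname{Im} \operatorname{Tr} G(\mathrm{i}\eta)$. A numerical sketch on a Wigner matrix standing in for $H_\zeta$ (illustration only):

```python
import numpy as np

rng = np.random.default_rng(3)

# Each eigenvalue lambda_k contributes eta / (lambda_k**2 + eta**2) to
# Im Tr G(i*eta); those with |lambda_k| <= eta contribute at least 1/(2*eta).
n = 400
W = rng.standard_normal((n, n))
H = (W + W.T) / np.sqrt(2 * n)                    # Wigner stand-in for H_zeta
lam = np.linalg.eigvalsh(H)

for eta in [0.1, 0.03, 0.01]:
    count = int(np.sum(np.abs(lam) <= eta))
    bound = 2 * eta * np.sum(eta / (lam**2 + eta**2))
    print(f"eta={eta:5.2f}  #small eigenvalues = {count:3d}  bound = {bound:6.1f}")
```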
Proof of Theorem 2.7. We first remark that the condition $\|\Delta f\|_{L^{2+a}} \le n^{C} \|\Delta f\|_{L^1}$ is not needed in Theorem 2.7 if the stronger Assumption A4' holds. This can be seen by following the argument below. We now prove Theorem 2.7 assuming A1-A4. In fact, the proof is a simple refinement of the proof of Theorem 2.3 and we solely describe the necessary modifications. We replace $f$ by $f_{\zeta_0, \alpha}$ and choose $\Omega = D_{\sqrt{\varrho(S) - \tau_*/2}}$. We remark that $\operatorname{supp} f_{\zeta_0, \alpha} \subseteq \Omega$ for all sufficiently large $n$ as $\alpha > 0$. The functions $F$ and $h$ as well as the measure $\mu$ are defined analogously according to the new choices of $f$ and $\Omega$.

Local law for $H_\zeta$: proof of Theorem 6.2
This section is devoted to the proof of Theorem 6.2. The local law for Hermitian random matrices with decaying correlations was established in [4,26]. In order to get the isotropic version stated in Theorem 6.2 we will follow the strategy from [26].
The main input for the local law for $H_\zeta$, Theorem 6.2, is the following estimate on the error term $D$ in terms of the $p$-norms $\|Y\|_p := (\mathbb{E}|Y|^p)^{1/p}$ for random variables $Y$ and their counterparts for random matrices $A \in \mathbb{C}^{2n \times 2n}$.

Proposition 6.4 (Bound on error matrix). There is a constant $C > 0$, depending only on the model parameters, such that for any $\eta \in [n^{-1}, n^{100}]$, $p \in \mathbb{N}$, $\varepsilon > 0$, $R \in \mathbb{C}^{2n \times 2n}$ and $x, y \in \mathbb{C}^{2n}$ with $q := C p^4/\varepsilon$ the following holds true:
$$\big\| \langle x , D y \rangle \big\|_p \lesssim_{\varepsilon, p} \|x\|\, \|y\|\, n^{\varepsilon} \sqrt{\frac{\|\operatorname{Im} G\|_q}{n\eta}}\,,$$
together with the analogous averaged bound for $R$.

Before deriving Proposition 6.4 from [26, Theorem 4.1], we explain the definition of the self-energy operator in [26], which differs from the self-energy operator $\mathcal{S}$ used in the present work and defined in (3.4). Instead of $\mathcal{S}$, the self-energy operator considered in [26] (and denoted by $\mathcal{S}$ there), which we denote by $\mathfrak{S}$, is defined in (6.10), with $Z = Z(\zeta, \zeta)$ and the operators $R, R^* : \mathbb{C}^{n \times n} \to \mathbb{C}^{n \times n}$ defined through (6.11). Moreover, [26] works with the solution $\mathfrak{M}$ (denoted by $M$ in [26]) of the MDE (4.36) with the self-energy $\mathfrak{S}$ instead of $\mathcal{S}$, i.e. $\mathfrak{M} = \mathfrak{M}(\zeta, z)$ satisfies (6.12) for all $z \in \mathbb{H}$ and all $\zeta \in \mathbb{C}$, where $Z = Z(\zeta, \zeta)$.
The proof of Lemma 6.5 is given in Appendix B below. The matrix $H_\zeta$ satisfies [26, Assumptions (A), (B), (C)] and the modified version (6.13) of [26, Assumption (D)] according to Assumptions A1, A2 and Lemma 6.5. Since the modification (6.13) does not affect any of the proofs in [26], we can apply [26, Theorem 4.1] to $H_\zeta$. Owing to the different self-energy operator in [26], as explained above, the bounds from [26, Theorem 4.1] are formulated for the error term in (6.14). Thus, to prove the proposition, it suffices to show the bounds (6.15) and (6.16) on the additional error term arising from $((\mathfrak{S} - \mathcal{S})[G])G$. To see (6.15), we use that for any unit vectors $x, y \in \mathbb{C}^n$ and $R, Q \in \mathbb{C}^{n \times n}$ the relevant quadratic form can be bounded in terms of the vectors $v_{jk} := \big( \sum_i x_i\, \mathbb{E}\, X_{ij} X_{lk} \big)_l \in \mathbb{C}^n$, where we also employed a general moment inequality for random variables. Since the diagonal contributions of $\mathfrak{S}$ and $\mathcal{S}$ coincide, we conclude the claimed estimate, where the decay $\|v_{jk}\| \lesssim n^{-1}(1 + d(j,k))^{-\nu}$ of arbitrarily high order $\nu \in \mathbb{N}$ was used. The Ward identity $\eta\, G^* G = \operatorname{Im} G$ now implies (6.15).
The remaining inequality, (6.16), follows from the Ward identity and an estimate of $\|(\mathfrak{S} - \mathcal{S})[R]\|_{\mathrm{hs}}$ in terms of $\|R\|_{\mathrm{hs}}$.

Proof of Theorem 6.2. Using the strategy of [26] with the modifications explained after Proposition 6.4 necessitates replacing a few objects in the arguments of [26] by their counterparts in the present setup. Indeed, $\mathfrak{S}$, $\mathfrak{M}$ and the error term of [26] are replaced by $\mathcal{S}$, $M$ and $D$, respectively. Furthermore, whenever [26, Theorem 4.1] is used in [26], we use Proposition 6.4 instead. The now missing Assumption (E) from [26] was used in the argument solely for the purpose of establishing stability of the MDE, i.e. to show that the inverse of $L$ defined in (4.24) is bounded (note that $L$ is the analogue of $1 - \mathcal{C}_M \mathcal{S}$ from [26]). We now point out where the boundedness of $L^{-1}$, and the resulting stability in the form of [26, equation (74)], has to be replaced by the use of Corollary 4.7.
Any direct use of [26, equation (74)] is simply replaced by (4.35), using that $G \in E_-^\perp$ by Lemma B.5 below. Otherwise, the boundedness of $L^{-1}$ is only used to establish the averaged bound [26, equation (84)]. To establish this bound in the current setting, we start from the quadratic equation (6.17) for the difference $\Delta := G - M$, stated right after [26, equation (83)]. Outside the self-consistent spectrum, i.e. for $|\zeta| - 1 \ge \delta$ or for $\eta \ge \delta$, we can invert $L$ and follow the argument from [26]. In the regime $|\zeta| \le 1 - \delta$ and $\eta \le \delta$ the operator $L$ does not have a bounded inverse. Thus we have to proceed more cautiously: starting from (6.17), we use the operator $\mathcal{M}$ defined in (4.24). Since $\Delta \in E_-^\perp$ and $\mathcal{M} L$ preserves the subspace $E_-^\perp$, we see that $\mathcal{M} D \in E_-^\perp$. Thus we can use (4.25) to invert $L$ on $\mathcal{M}^{-1} E_-^\perp$ and after that follow [26] again. This proves Theorem 6.2.
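The Ward identity used in the proof of Proposition 6.4 above is a deterministic fact about resolvents of self-adjoint matrices; a quick numerical verification on a generic example:

```python
import numpy as np

rng = np.random.default_rng(4)

# Ward identity: for G = (H - i*eta)^(-1) with H self-adjoint,
#   eta * G^* G = Im G,   where Im G := (G - G^*) / (2i).
n, eta = 50, 0.05
W = rng.standard_normal((n, n))
H = (W + W.T) / 2
G = np.linalg.inv(H - 1j * eta * np.eye(n))
im_G = (G - G.conj().T) / 2j
print(np.max(np.abs(eta * (G.conj().T @ G) - im_G)))   # ~ machine precision
```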

Eigenvector delocalization for X
In this subsection we prove Corollary 2.8 and Lemma 6.3, which are consequences of the local law for $H_\zeta$, Theorem 6.2.
Proof of Corollary 2.8. As in the proof of [2, Corollary 1.14], we conclude from (6.2) in Theorem 6.2 and $\|M\| \lesssim 1$, due to (3.17), that (6.18) holds uniformly for all deterministic vectors $x \in \mathbb{C}^{2n}$ and all vectors $u \in \mathbb{C}^{2n}$ contained in the kernel of $H_\zeta$ for some $\zeta \in D_<$. If $u \in \mathbb{C}^n$ is an eigenvector of $X$ corresponding to an eigenvalue $\zeta \in D_<$, then $(0, u)^t \in \mathbb{C}^{2n}$ lies in the kernel of $H_\zeta$. Hence, Corollary 2.8 follows from (6.18) and a simple union bound.
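The delocalization asserted in Corollary 2.8 is easy to observe numerically in the i.i.d. model: all unit eigenvectors of $X$ have sup-norm of order $n^{-1/2}$ up to logarithmic corrections. (Illustration of the phenomenon only; Corollary 2.8 concerns the general correlated model.)

```python
import numpy as np

rng = np.random.default_rng(6)

# Delocalization: every unit eigenvector u of an i.i.d. matrix satisfies
# ||u||_inf ~ n**(-1/2) up to logarithmic corrections.
for n in [100, 400, 1600]:
    X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)
    _, U = np.linalg.eig(X)                       # columns are unit eigenvectors
    print(f"n={n:>5}  sqrt(n) * max_u ||u||_inf = {np.sqrt(n) * np.max(np.abs(U)):.2f}")
```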

Bound on the smallest singular value
In this section we bound the smallest singular value of $X + A$ if $X$ satisfies Assumption A4' and $A$ is deterministic. This is done in Proposition 7.1 below, which, in particular, implies Proposition 2.10. Moreover, we prove Lemma 2.13 in the next subsection. We recall that $s_{\min}(R)$ denotes the smallest singular value of a matrix $R \in \mathbb{C}^{n \times n}$.
Proof. The following proof mimics that of [17, Lemma 4.12], which is valid for independent entries. Going back to [42], the smallest singular value is often estimated by an inequality in terms of the distances of the rows $R_1, \ldots, R_n$ of $X + A$ to the subspaces $R_{-i} := \operatorname{span}\{R_j : j \neq i\}$ (see also [17, Lemma 4.16]). Owing to this inequality and a union bound, we obtain (7.1). We fix $i \in [n]$. Let $y$ be a unit vector that is orthogonal to $R_{-i}$ and measurable with respect to $\{R_j : j \neq i\}$. The Cauchy-Schwarz inequality yields a bound in terms of the orthogonal projection $\pi_i$ onto the orthogonal complement of $R_{-i}$. Since $y$ is normalised, we find $j \in [n]$ such that $|y_j| \ge n^{-1/2}$. This yields a conditional bound, where we denote by $X^{ij}$ the family of random variables $X^{ij} := \{x_{kl} : (k, l) \in [n]^2 \setminus \{(i,j)\}\}$.
We now estimate the conditional probability with respect to $X^{ij}$ for any $j \in [n]$ such that $|y_j| \ge n^{-1/2}$. We only consider the case where $\psi_{ij}$ is a density on $\mathbb{C}$. (If $\psi_{ij}$ is a density on $\mathbb{R}$ then we proceed completely analogously.) The condition $|y_j| \ge n^{-1/2}$, the identity (2.7) in A4' and Hölder's inequality imply the required estimate with some $\mathbb{C}$-valued random variable $a_{ij}$, which is measurable with respect to $X^{ij}$. Thus, estimating the sum in (7.1) by $n$ and using the bound on $\mathbb{E}\,\|\psi_{ij}\|_q$ from A4' complete the proof of Proposition 2.10.
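The row-distance inequality at the start of the proof is straightforward to test numerically; the following sketch checks it on a generic square matrix, computing each distance via an orthogonal projection onto the span of the other rows (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(7)

# Row-distance bound: s_min(R) >= n**(-1/2) * min_i dist(R_i, span{R_j : j != i}).
n = 40
R = rng.standard_normal((n, n)) + np.eye(n)       # a generic square matrix

def dist_to_other_rows(R, i):
    others = np.delete(R, i, axis=0)              # all rows except the i-th
    Q, _ = np.linalg.qr(others.T)                 # orthonormal basis of their span
    r = R[i]
    return np.linalg.norm(r - Q @ (Q.T @ r))      # orthogonal distance

s_min = np.linalg.svd(R, compute_uv=False)[-1]
bound = min(dist_to_other_rows(R, i) for i in range(n)) / np.sqrt(n)
print(f"s_min = {s_min:.4f} >= {bound:.4f}")
```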

Proof of Lemma 2.13
Proof of Lemma 2.13. For all $i, j \in \mathbb{N}$ and $\alpha, \beta \in [K]$, we set $X^{i,j,\alpha,\beta} := \{(x_{kl})_{\gamma,\delta} : (k, l, \gamma, \delta) \neq (i, j, \alpha, \beta)\}$ and define $\psi_{i,j,\alpha,\beta}$ accordingly, with the convention $\psi_{i,j,\alpha,\beta}(z) = 0$ if the denominator vanishes. A simple computation then shows the claimed identity.

Appendix A

In this appendix we show how Lemma 3.3 and Proposition 3.7 can be derived from existing results. We recall that the self-consistent density of states $\rho_\zeta$ was defined in (4.37) and the self-consistent spectrum is $\operatorname{supp} \rho_\zeta$.
The following corollary to Lemma 3.1 states that the self-consistent spectrum supp ρ ζ is bounded away from zero for any spectral parameter ζ outside the disk of radius ̺(S).
Proof. The corollary follows from Lemma 3.1 and the corresponding implication from [7].

As explained after Proposition 6.4, the self-energy operator $\mathfrak{S}$ from [26] (cf. (6.10)) differs slightly from $\mathcal{S}$ defined in (3.4) and used in the present work. Therefore, applying results from [26] requires controlling the difference between $M(\zeta, z)$ and $\mathfrak{M}(\zeta, z)$, the solutions of the MDEs (4.36) and (6.12), respectively. This is done in the next lemma. In analogy to $\rho_\zeta$, we define $\widehat{\rho}_\zeta$ as the unique probability measure on $\mathbb{R}$ with Stieltjes transform $z \mapsto \langle \mathfrak{M}(\zeta, z) \rangle$.
(ii) There is a universal constant $K > 0$ such that, for all $\eta > 0$ and $\zeta \in \mathbb{C}$ with $|\zeta| \lesssim 1$, we have the corresponding closeness bound.

Proof of Lemma A.2. We start by showing that the operators $R$ and $R^*$ from (6.11), which constitute the off-diagonal entries of $\mathfrak{S} - \mathcal{S}$, can be considered small perturbations. Indeed, we will prove that (A.1) holds for every $R \in \mathbb{C}^{n \times n}$. To check (A.1) we simply use that
$$| \langle x, (R R) y \rangle | \le \frac{1}{\sqrt{n}} \sum_{i,j,k,l} |x_i y_j x_l y_k|\, K_{ij,lk}\,,$$
with coefficients $K_{ij,lk}$ that, owing to Assumptions A1 and A2 as well as Young's inequality, still have a polynomial decay of arbitrarily high order $\nu \in \mathbb{N}$. Thus, the volume growth condition yields (A.1).

For the proof of (i), we compare the two MDEs and estimate the resulting difference $\Delta$, where we used $\|R\| \lesssim n^{-1/2}$ by (A.1) and $\|M\| \le \|\mathfrak{M}\| + \|\Delta\|$. Thus we can bootstrap the bound $\|\Delta\| \lesssim n^{-1/2}$ from the regime $\eta \ge 1$. For the proof of (ii), we show the improved norm bound (A.2) on $R$ in the hs-sense. To show (A.2), for each $R \in \mathbb{C}^{n \times n}$, we estimate the hs-norm, where the second bound holds because for any $\nu \in \mathbb{N}$ the coefficients $K_{uv,u'v'}$ satisfy the analogous decay. As above, from (A.2), we get $\|D\|_{\mathrm{hs}} \lesssim \|R\|_{\mathrm{hs}}\, \|M\|^2\, n^{-1+2\varepsilon}/(1 + \eta^2)$ and infer (ii) of the lemma from (ii) of Lemma A.3. Moreover, [7, (v) in Lemma D.1] holds for $\mathfrak{M}$, i.e. $\operatorname{dist}(\mathrm{i}\eta, \operatorname{supp} \widehat{\rho}_\zeta) \gtrsim 1$.
The next proposition is a generalization of Proposition 3.7.
Proposition A.4 (Global law for $H_\zeta$, general version). Let $X$ satisfy A1 and A2. Then there is $C > 0$ such that for all $\phi > 0$ and all sufficiently small $\delta > 0$ the bounds (A.3a) and (A.3b) hold with very high probability, uniformly for all $n \in \mathbb{N}$, $\zeta \in D_\phi$ and $\eta \in [n^{-\delta}, n^{100}]$, as well as for all deterministic vectors $x, y \in \mathbb{C}^{2n}$ and deterministic matrices $R \in \mathbb{C}^{2n \times 2n}$. Here $K$ is some absolute constant and the constant $C_\nu$ implicit in Definition 3.2 of 'very high probability' depends only on $\delta$ and $\phi$ as well as the constants from A1 and A2, in addition to $\nu$.
Proof. The proposition is an immediate consequence of [26, Theorem 2.1] since $\eta \ge n^{-\delta}$ means that the spectral parameter in the MDE is sufficiently far away from the self-consistent spectrum associated to $H_\zeta$. As alluded to after Proposition 6.4, the self-energy in [26] is $\mathfrak{S}$ instead of $\mathcal{S}$. Consequently, the resolvent $G$ is compared to $\mathfrak{M}$, the solution of (6.12), instead of $M$. Thus (A.3a) and (A.3b) follow from the closeness of $\mathfrak{M}$ to $M$ established in (i) and (ii) of Lemma A.2, respectively.
Since all cases $|||\kappa_{\#_1 \#_2}|||_{\#_1 \#_2} \lesssim n^{\varepsilon}$ with $\#_i = d, c$ are proven similarly, by simply interchanging the roles of certain indices, we only show the case $\#_1 = \#_2 = d$. Due to [26, Lemma A.1] and Assumption A2, the required bound holds for any fixed $\nu \in \mathbb{N}$ and $\varepsilon > 0$.

Proof of Lemma 4.6. Let $w \in \mathbb{C}^d$ with $w \perp b$. To prove (4.32) we use the spectral projection $P$ from (4.29) and its complementary projection $Q := 1 - P$, as well as $\|a\|_\# = 1$, to obtain the estimate (B.4). Since $w$ is orthogonal to $b$ we have the identity
$$0 = \langle b, w \rangle = \langle b, a \rangle \langle p, w \rangle + \langle b, Q w \rangle\,. \tag{B.5}$$
In particular, we find an upper bound on $|\langle p, w \rangle|$ in terms of $\|Q w\|_\#$, namely (B.6), where we used the assumption from (4.30). Continuing from (B.4), we arrive at (B.7) because $|\alpha| \le \varepsilon$ by assumption.
To finish the proof of (4.32) we use (B.8). Combining the two lower bounds (B.6) and (B.8) on $\|Q w\|_\#$ and optimizing over the values of $|\langle p, w \rangle|$, while using that $\varepsilon \le 1$, yields the claimed estimate. Together with (B.7) this finishes the proof of Lemma 4.6.
Lemma B.1 (Quantitative implicit function theorem). Let $T : \mathbb{C}^A \times \mathbb{C}^D \to \mathbb{C}^A$ be a continuously differentiable function with invertible derivative $\nabla^{(1)} T(0,0)$ at the origin with respect to the first argument and $T(0,0) = 0$. Suppose $\mathbb{C}^A$ and $\mathbb{C}^D$ are equipped with norms, both denoted by $\|\cdot\|$, and let the linear operators on these spaces be equipped with the corresponding induced operator norms. Let $\delta > 0$ be such that the non-degeneracy condition holds, where $B_\delta^\#$ denotes the $\delta$-ball around $0$ with respect to $\|\cdot\|$ in $\mathbb{C}^\#$, and suppose that, for some positive constants $C_1$, $C_2$, the corresponding derivative bounds hold, where $\nabla^{(2)}$ is the derivative with respect to the second variable. Then there is a constant $\varepsilon > 0$, depending only on $\delta$, $C_1$ and $C_2$, and a unique function $f : B_\varepsilon^D \to B_\delta^A$ such that $T(f(d), d) = 0$ for all $d \in B_\varepsilon^D$. The function $f$ is continuously differentiable. If $T$ is analytic, then so is $f$.

Lemma B.2. Let $\|a\|$ denote the Euclidean norm of a vector $a \in \mathbb{C}^d$ and $\|A\|$ the induced operator norm of a matrix $A \in \mathbb{C}^{d \times d}$. Fix $\varepsilon_1, \varepsilon_2, \varepsilon_3 \in (0, 1)$ with $100\, \varepsilon_3 \le \varepsilon_1 \varepsilon_2^2$. Let $F, T \in \mathbb{C}^{d \times d}$ be self-adjoint matrices such that $\|T\| \le 1$ and the spectral gap condition (B.10) holds, where $\pm 1$ are non-degenerate eigenvalues of $F$ with corresponding normalized eigenvectors $f_\pm$, i.e. $F f_\pm = \pm f_\pm$. Suppose that $\|T f_+\| \le 1 - \varepsilon_2$ and $\|(1 + T) f_-\| \le \varepsilon_3$. Then the resolvent of $T F$ satisfies the stated bound. Furthermore, there is a single eigenvalue $\zeta_0$ close to $1$ and this eigenvalue is non-degenerate.

Proof. First we observe that $f_-$ satisfies approximate eigenvalue equations (B.13) for both $T F$ and $F T$. We now prove that, when restricted to the orthogonal complement of $f_-$, the matrix $T F$ is strictly smaller than $1$ in norm. More precisely, we will establish the bound (B.14) on $\|T F a\|$, uniformly for all unit vectors $a \perp f_-$. To show (B.14) we fix a unit vector $a \in f_-^\perp$, $\|a\| = 1$, and decompose it according to $f_+$ and its orthogonal complement,
$$a = \sqrt{1 - \alpha^2}\, f_+ + \alpha\, \widehat{a}\,, \qquad \widehat{a} \perp f_+\,,\quad \|\widehat{a}\| = 1\,,$$
for some $\alpha \in [0, 1]$. Because $\|T\| \le 1$ and $F$ has a spectral gap (cf. (B.10)), we see that $\|T F a\|$ is bounded from above as in (B.15). On the other hand, by using the assumption $\|T f_+\| \le 1 - \varepsilon_2$, we also get a second bound (B.16). For $\alpha \le \varepsilon_2/2$ we use (B.16), while for $\alpha \ge \varepsilon_2/2$ we use (B.15), to infer (B.14).
With the help of (B.13) and (B.14) we represent $T F$ with respect to $f_-$ and an orthonormal basis of $f_-^\perp$. Thus we see that there is a unitary matrix $U \in \mathbb{C}^{d \times d}$ as well as $\alpha \in \mathbb{C}$, $b, a \in \mathbb{C}^{d-1}$ and $B \in \mathbb{C}^{(d-1) \times (d-1)}$ such that $A := U^* T F U$ has a block structure whose representation on the very right is in terms of an invertible operator and the two self-adjoint operators from (B.23).

Here we introduced the shorthand notation
$$K_1 := (S_2 S_1 S_2)^{1/4}\,, \qquad K_2 := (S_1 S_2 S_1)^{1/4}\,,$$
together with the rank-one projection $P$. Note that the definitions of $V$, $F$ and $T$ above are compatible with (4.52), (4.51) and (4.50) in the limit $\tau \to 1$, $\eta \downarrow 0$, while in the same limit we have $K_i = \lim \widetilde{K}_i/\langle V_1 \rangle$ and $P = \lim \widetilde{P}\, \langle V_1 \rangle$, where $\widetilde{K}_i$ and $\widetilde{P}$ denote the corresponding objects from Section 4.
Since $\Sigma$ from (B.22) is a direct sum of $S$ and $S^*$, the claim (B.20) is equivalent to the same statement with $S$ replaced by $\Sigma$. Owing to (B.21) we have $\|V\|_{\mathrm{hs}} \|V^{-1}\|_{\mathrm{hs}} \sim \|V\| \|V^{-1}\| \sim 1$. Therefore, (B.20) for $\Sigma$ now follows from the corresponding facts about $T$ and $F$, valid with some $\varepsilon \sim 1$, where $S_\pm = (S_1, \pm S_2)$. Here, the last bound is obtained from [4, Lemma 4.8] in the same way as (4.60) in Lemma 4.11 was obtained. Since via Lemma 4.5 we can lift the resolvent control to the other norm $\# = \|\cdot\|$, this finishes the proof of the lemma.

Corollary B.4. For the operator $K$ the resolvent bound (B.24) holds for any sufficiently small $\varepsilon > 0$ and $\# = \mathrm{hs}, \|\cdot\|$. Furthermore, the eigenvalue $0$ has algebraic and geometric multiplicity equal to $2$, with corresponding right and left eigenvectors given in (B.25), where $E_\pm$ are defined in (4.44).
Proof. Since the operator $K$ acts separately on the first and second component of a pair of matrices, the assertions about the multiplicity of $0$ and about (B.25) follow from Lemma B.3 and a simple computation. Similarly, it suffices to prove the resolvent control (B.24) for each component, i.e. to show it for $S_1(1 - S)$ and $S_2(1 - S^*)$. We will only consider the first since the latter is treated similarly with the roles of $S$ and $S^*$ interchanged. We define the projections
$$P A := \frac{\langle S_1, A \rangle}{\langle S_1, S_2 \rangle}\, S_2\,, \qquad \widehat{P} A := \frac{\langle 1, A \rangle}{\langle 1, S_2 \rangle}\, S_2\,, \qquad P^\perp A := \frac{\langle S_2, A \rangle}{\langle S_2, S_2 \rangle}\, S_2\,.$$
Due to Lemma B.3, the rank-one projections $P$ and $\widehat{P}$ are the spectral projections associated to the non-degenerate eigenvalue $0$ of $1 - S$ and of $S_1(1 - S)$, respectively. The claim now follows because the operator $S_1(1 - S)$ has a bounded inverse on the image of $Q$; here, for the inequality we used $S_1 \sim 1$ and Lemma B.3, and for the last relation
$$\|Q A\|_{\mathrm{hs}}^2 = \|Q \widetilde{A}\|_{\mathrm{hs}}^2 = \|\widetilde{A}\|_{\mathrm{hs}}^2 + \frac{|\langle S_1, \widetilde{A} \rangle|^2}{\langle S_1, S_2 \rangle^2} \sim \|\widetilde{A}\|_{\mathrm{hs}}^2 \sim \|\widehat{Q} \widetilde{A}\|_{\mathrm{hs}}^2 = \|\widehat{Q} A\|_{\mathrm{hs}}^2\,,$$
where we used the shorthand $\widetilde{A} = Q^\perp A$ as well as $Q A = Q \widetilde{A}$ and $\widehat{Q} A = \widehat{Q} \widetilde{A}$; the second comparison relation holds for the same reason as the first. This finishes the proof of (B.24) for $\# = \mathrm{hs}$. For $\# = \|\cdot\|$ we use Lemma 4.5.