Large deviations principle for the largest eigenvalue of Wigner matrices without Gaussian tails

We prove a large deviation principle for the largest eigenvalue of Wigner matrices without Gaussian tails, namely such that the distribution tails $\mathbb{P}( |X_{1,1}|>t)$ and $\mathbb{P}(|X_{1,2}|>t)$ behave like $e^{-bt^{\alpha}}$ and $e^{-at^{\alpha}}$ respectively for some $a,b\in (0,+\infty)$ and $\alpha\in (0,2)$. The large deviation principle is of speed $N^{\alpha/2}$ and with a good rate function depending only on the tail distribution of the entries.


Introduction and main result
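Let $(X_{i,j})_{i<j}$ be i.i.d. complex random variables such that $\mathbb{E}(X_{1,2}) = 0$, $\mathbb{E}|X_{1,2}|^2 = 1$ and $\mathbb{E}(X_{1,2}^2) = 0$, and let $(X_{i,i})_{i\geq 1}$ be i.i.d. real random variables such that $\mathbb{E}(X_{1,1}) = 0$. Let $X(N)$ be the $N \times N$ Hermitian matrix with up-diagonal entries $(X_{i,j})_{1\leq i\leq j\leq N}$. We call such a sequence $(X(N))_{N\in\mathbb{N}}$ a Wigner matrix. In the following, we drop the $N$ and write $X$ instead of $X(N)$. Consider now the normalized random matrix $X_N = X/\sqrt{N}$. Let $\lambda_i$ denote the eigenvalues of $X_N$, with $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_N$. We define the empirical spectral measure $\mu_{X_N}$ of $X_N$ by
$$\mu_{X_N} = \frac{1}{N}\sum_{i=1}^{N}\delta_{\lambda_i}.$$
We know by Wigner's theorem (see [1, p.7]) that almost surely
$$\mu_{X_N} \underset{N\to+\infty}{\rightsquigarrow} \sigma_{sc},$$
where $\rightsquigarrow$ denotes weak convergence and $\sigma_{sc}$ denotes the semicircular law, defined by
$$\sigma_{sc}(dx) = \frac{1}{2\pi}\sqrt{4-x^2}\,\mathbf{1}_{|x|\leq 2}\,dx.$$
Furthermore, assuming that $\mathbb{E}|X_{1,1}|^2 < +\infty$ and $\mathbb{E}|X_{1,2}|^4 < +\infty$, we know from [11] and [3] that
$$\lambda_N \underset{N\to+\infty}{\longrightarrow} 2 \quad \text{a.s.}$$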
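As a quick sanity check on the normalization of the semicircular law, the following sketch integrates its density numerically. It assumes the standard density $\sqrt{4-x^2}/(2\pi)$ on $[-2,2]$; the function names and the midpoint rule are illustrative choices, not part of the paper. The total mass and the variance should both be close to 1, and the fourth moment close to 2.

```python
import math


def semicircle_density(x: float) -> float:
    """Density of the semicircular law on [-2, 2] (standard normalization, assumed)."""
    if abs(x) >= 2.0:
        return 0.0
    return math.sqrt(4.0 - x * x) / (2.0 * math.pi)


def semicircle_moment(k: int, n_cells: int = 20000) -> float:
    """Midpoint-rule approximation of the k-th moment of the semicircular law."""
    width = 4.0 / n_cells
    total = 0.0
    for i in range(n_cells):
        x = -2.0 + (i + 0.5) * width  # midpoint of the i-th cell
        total += (x ** k) * semicircle_density(x) * width
    return total


mass = semicircle_moment(0)      # total mass
variance = semicircle_moment(2)  # second moment
fourth = semicircle_moment(4)    # fourth moment
```

The support $[-2,2]$ of this density is what makes 2 the natural limit of the largest eigenvalue.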
Given these two fundamental results for the convergence of the empirical spectral measure and largest eigenvalue of Wigner matrices, we can try to understand the deviations of these two quantities around their respective limits and ask ourselves if (µ X N ) N and (λ N ) N admit large deviations principles.
In 1997, A. Guionnet and G. Ben Arous proved in [2] a large deviation principle with respect to the weak topology for the empirical spectral measure in the case of β-ensembles, with speed $N^2$ and an explicit rate function. In [1, p.81] this result has been extended by the same authors to matrix models distributed according to a Gibbs measure, in which case the joint law of the eigenvalues is of the form
$$\frac{1}{Z^N_{V,\beta}}\prod_{1\leq i<j\leq N}|\lambda_i-\lambda_j|^{\beta}\, e^{-N\sum_{i=1}^{N}V(\lambda_i)}\prod_{i=1}^{N}d\lambda_i,$$
with $V$ a continuous potential growing faster than $\log|x|$ at $\pm\infty$, $Z^N_{V,\beta}$ a normalization constant, and $\beta > 0$. In this setting, the large deviation principle for the spectral measure still holds with speed $N^2$ and with a good rate function involving Voiculescu's non-commutative entropy. With an extra assumption on the normalization constant $Z^N_{V,\beta}$, one can derive a large deviations principle for the largest eigenvalue of the same family of matrix models, with speed $N$ and an explicit good rate function (see [4] and [1, p.83]). These large deviations results rely entirely on the knowledge of the joint distribution of the eigenvalues. Another interesting matrix model is the so-called deformed Wigner ensemble, for which some large deviations principles for the extreme eigenvalues have been proven. First, in [13], a large deviation principle is derived for the extreme eigenvalue of a GOE or GUE matrix perturbed by a rank one deterministic symmetric or Hermitian matrix. Then in [5], the large deviations for the joint law of the extreme eigenvalues of a deterministic Wigner matrix perturbed by a low rank Hermitian matrix with delocalized eigenvectors are extensively studied.
Thus, in the setting of Wigner matrices whose coefficients have a sub-Gaussian tail but are not Gaussian, the existence of a large deviation principle for the empirical distribution of eigenvalues or for the largest eigenvalue is still an open problem.
Recently, in [8], C. Bordenave and P. Caputo gave a large deviation principle for the empirical spectral measure of Wigner matrices whose coefficients do not have Gaussian tails, a case where no explicit formula for the joint law of the eigenvalues is available.

Main result
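We recall that a sequence of random variables $(Z_n)_{n\geq 1}$ taking values in some topological space $\mathcal{X}$ equipped with the Borel σ-field $\mathcal{B}$ follows a large deviations principle (LDP) with speed $\upsilon : \mathbb{N} \to \mathbb{N}$ and rate function $J : \mathcal{X} \to [0, +\infty]$ if $J$ is lower semicontinuous, $\upsilon$ increases to infinity, and for all $B \in \mathcal{B}$,
$$-\inf_{B^{\circ}} J \leq \liminf_{n\to+\infty}\frac{1}{\upsilon(n)}\log\mathbb{P}(Z_n\in B) \leq \limsup_{n\to+\infty}\frac{1}{\upsilon(n)}\log\mathbb{P}(Z_n\in B) \leq -\inf_{\overline{B}} J,$$
where $B^{\circ}$ denotes the interior of $B$ and $\overline{B}$ the closure of $B$. We recall that $J$ is said to be lower semicontinuous if its level sets $\{x \in \mathcal{X} : J(x) \leq t\}$ are closed.
If furthermore all the level sets are compact, then we say that $J$ is a good rate function.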
In the following we make the same statistical assumptions as in [8] which we state below.
Under these assumptions, it has been proven in [8] that the empirical spectral measure $\mu_{X_N}$ of the normalized matrix $X_N$ follows a large deviations principle with speed $N^{1+\alpha/2}$ and good rate function $I$ defined for all $\mu \in \mathcal{M}_1(\mathbb{R})$, where $\mathcal{M}_1(\mathbb{R})$ denotes the space of all probability measures on $\mathbb{R}$, by
$$I(\mu) = \begin{cases}\Phi(\nu) & \text{if } \mu = \sigma_{sc}\boxplus\nu \text{ for some } \nu\in\mathcal{M}_1(\mathbb{R}),\\ +\infty & \text{otherwise},\end{cases}$$
where $\boxplus$ denotes the free convolution and $\Phi$ denotes a good rate function (see [8] for further details). In the following, we denote by $\lambda_Y$ the largest eigenvalue of $Y \in H_N(\mathbb{C})$, where $H_N(\mathbb{C})$ denotes the space of Hermitian matrices of size $N$. We will prove in this paper the following large deviations result.

Theorem.
Under assumptions (1.1), the sequence $(\lambda_{X_N})_{N\in\mathbb{N}}$ follows a large deviations principle with speed $N^{\alpha/2}$ and with good rate function defined for all $x \in \mathbb{R}$, by Observe that the rate function is infinite on $(-\infty, 2)$. Indeed, in order to make a deviation of the top eigenvalue at the left of 2, we need to force the support of the empirical spectral measure to be in $(-\infty, 2-\varepsilon)$, for some $\varepsilon > 0$. But this event has an infinite cost at the exponential scale $N^{\alpha/2}$, since the empirical spectral measure follows a large deviation principle with speed $N^{1+\alpha/2}$ according to [8]. As illustrated in figure 1, this rate function also has the particularity of being discontinuous at 2. As we will show, the deviations of the top eigenvalue are given by finite rank perturbations of a Wigner matrix. It is well-known that finite rank perturbations of Wigner matrices show a threshold phenomenon with respect to the strength of the perturbation (see [15], [10], [16], [6], [12] for further details), which the rate function seems to reflect by this discontinuity at 2. This picture may also mean that there is a more subtle behaviour of the largest eigenvalue in the right neighbourhood of 2 which is still to be understood.
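The threshold phenomenon mentioned above can be summarized in a few lines. The sketch below encodes the classical statement for a rank-one perturbation of strength θ of a normalized Wigner matrix: the top eigenvalue converges to $\theta + 1/\theta$ when $\theta > 1$ and stays at the edge 2 otherwise. The function name is hypothetical, and the formula is the standard one from the finite rank perturbation literature rather than a result proved in this paper.

```python
def top_eigenvalue_limit(theta: float) -> float:
    """Classical almost-sure limit of the top eigenvalue of a normalized Wigner
    matrix perturbed by a rank-one matrix of strength theta (assumed setting)."""
    if theta > 1.0:
        # Above the threshold the top eigenvalue detaches from the bulk.
        return theta + 1.0 / theta
    # At or below the threshold it sticks to the right edge of the semicircle.
    return 2.0


limits = {theta: top_eigenvalue_limit(theta) for theta in (0.5, 1.0, 2.0, 4.0)}
```

The limit is continuous in θ at the threshold $\theta = 1$, so the discontinuity at 2 is a feature of the rate function and not of this limiting map.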

Heuristics
We will show that one can obtain the lower bound of the LDP by finite rank perturbation method. For simplicity, let us assume the X i,j 's are exponential variables with parameter 1. Thus, the matrix X satisfies assumptions (1.1) with a = b = α = 1. Let x > 2 and θ = 1/G σsc (x). By Weyl's inequality we have, N e 1 e * 1 and e 1 the first coordinate vector of C N . Since θ > 1, we have according to [16], But X 1,1 has exponential law with parameter 1, thus Therefore,
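The arithmetic behind this heuristic can be checked numerically. The sketch below assumes the standard closed form $G_{\sigma_{sc}}(x) = (x - \sqrt{x^2-4})/2$ for $x > 2$ and the exponential tail $\mathbb{P}(X_{1,1} > s) = e^{-s}$; all function names are illustrative. It verifies that $\theta = 1/G_{\sigma_{sc}}(x)$ satisfies $\theta + 1/\theta = x$, and that forcing the single entry $X_{1,1}$ above $\theta\sqrt{N}$ costs exactly θ at the scale $N^{1/2}$.

```python
import math


def stieltjes_semicircle(x: float) -> float:
    """Stieltjes transform of the semicircular law at real x > 2 (standard closed form)."""
    if x <= 2.0:
        raise ValueError("x must be greater than 2")
    return (x - math.sqrt(x * x - 4.0)) / 2.0


def required_strength(x: float) -> float:
    """Perturbation strength theta = 1 / G(x) used in the heuristic."""
    return 1.0 / stieltjes_semicircle(x)


def log_prob_entry_exceeds(theta: float, n: int) -> float:
    """log P(X_11 > theta * sqrt(n)) for an exponential variable with parameter 1."""
    return -theta * math.sqrt(n)


x = 3.0
theta = required_strength(x)
# Cost normalized by the speed N^{alpha/2} with alpha = 1.
normalized_costs = [log_prob_entry_exceeds(theta, n) / math.sqrt(n)
                    for n in (10**2, 10**4, 10**6)]
```

For $x = 3$ this gives $\theta \approx 2.618 > 1$, and every normalized cost equals $-\theta$, which is the lower bound the heuristic suggests for exponential entries.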

Outline of proof
The outline of the proof will follow closely the one of the LDP for the empirical spectral measure derived in [8].
Following [8], we start by cutting the entries of $X_N$ according to their size. We decompose $X_N$ in the following way.
In a first phase, namely in Parts 5 and 6, we identify which parts of the decomposition of $X_N$ contribute significantly to the deviations of the largest eigenvalue from its limit value 2. We start by showing in Part 6.1 that we can neglect in the deviations of $\lambda_{X_N}$ the contributions of $B^{\varepsilon}$ and $D^{\varepsilon}$, corresponding to the intermediate and large entries respectively. Then in Part 6.2 we prove that we can replace $A$ by a Hermitian matrix $H_N$ with entries bounded by $(\log N)^d/\sqrt{N}$ and independent of $C^{\varepsilon}$. From the LDP of the empirical spectral measure of $X_N$ with speed $N^{1+\alpha/2}$ proved in [8], we deduce in Proposition 6.5 that the deviations at the left of 2 have an infinite cost at the scale $N^{\alpha/2}$. We can now focus on the deviations of the largest eigenvalue of $H_N + C^{\varepsilon}$ at the right of 2. As in many papers on finite rank deformations of Wigner matrices (see [6] for example), we see the largest eigenvalue of $H_N + C^{\varepsilon}$, provided it is not in the spectrum of $H_N$, as the largest zero of where $r$ is the rank of $C^{\varepsilon}$, $\theta_1, \dots, \theta_r$ are the non-zero eigenvalues of $C^{\varepsilon}$ in nondecreasing order, and $u_1, \dots, u_r$ are orthonormal eigenvectors associated with $\theta_1, \dots, \theta_r$.
This method is made efficient in the studies of the deviations of λ H N +C ε at the right of 2, by the fact that the spectrum of H N can be considered at the exponential scale N α/2 nearly as included in (−∞, 2] as we show in Proposition 6.7 and by the fact that C ε is a sparse matrix as shown in Lemma 5.5, and can be considered to have a given finite number of non-zero entries.
In Part 6.3, we show that the function $f_N$ is exponentially equivalent to a certain limit function $f$ defined for any $x > 2$ by Using concentration inequalities, we show in Proposition 6.11 that at the exponential scale $N^{\alpha/2}$, uniformly in $x$ in a compact subset of $(2, +\infty)$, Next, we prove in Theorem 6.12 an isotropic property of the semicircular law using the estimates in [16] of the resolvent entries of Wigner matrices, allowing us to deduce in Proposition 6.13 that where we denote by $G_{\sigma_{sc}}(x)$ the resolvent of the semicircular law. Using the fact that the spectral radius of $C^{\varepsilon}$ can be considered as bounded, as shown in Lemma 5.3, and using the uniform continuity of the determinant on compact sets of $H_r(\mathbb{C})$, we get, as stated in Theorem 6.9, uniformly in $x$ in any compact subset of $(2, +\infty)$, In Part 6.5, we show that provided $\lambda_{H_N+C^{\varepsilon}}$ is greater than 2 and $\lambda_{C^{\varepsilon}}$ is greater than 1, the largest zero of $f_N$, namely $\lambda_{H_N+C^{\varepsilon}}$, is exponentially equivalent to the largest zero of $f$, denoted $\mu_{\varepsilon,N}$, which is simply Observe that although $f_N$ and $f$ are holomorphic functions, we cannot use Rouché's theorem to deduce that their zeros are close, since they are close only on compact subsets of $(2, +\infty)$. Instead we use a trick, somewhat similar to the one used in [6, p. 513], which lets us content ourselves with this uniform closeness between $f_N$ and $f$ on compact subsets of $(2, +\infty)$. We perturb the spectrum of $C^{\varepsilon}$ so that its largest eigenvalue is simple and bounded away from its second largest eigenvalue by some $\gamma > 0$. The classical intermediate value theorem shows that any continuous function $\varphi$ close to $f$ on any compact subset of $(2, +\infty)$ admits a zero in $(2, +\infty)$ and that its largest zero is close to the largest zero of $f$. Since $f$ remains in a compact set of continuous functions, we can prove a uniform continuity of the "largest zero function" in Lemma 6.17.
We deduce in Proposition 6.16 that the largest zeros of $f_N$ and of $f$ are exponentially equivalent at the scale $N^{\alpha/2}$, allowing us to conclude in Theorem 6.14 that the largest zeros $\mu_{\varepsilon,N}$ of $f$ are exponentially good approximations of $\lambda_{X_N}$. Then in Part 7, we prove that $(\mu_{\varepsilon,N})_N$ satisfies a LDP for each $\varepsilon$ and we deduce a LDP for $(\lambda_{X_N})_N$. The key of the proof is Proposition 5.5, which allows us to assume in the study of the deviations that the matrix $C^{\varepsilon}$ has only a finite number of non-zero entries. With this observation, the problem can be reduced to a finite-dimensional one. We define $\tilde{E}_r$, the space of equivalence classes, under the action of permutation matrices, of infinite Hermitian matrices with at most $r$ non-zero entries. We easily see that this space can be embedded in the quotient space $H_r(\mathbb{C})/S_r$. With the topology inherited from $H_r(\mathbb{C})/S_r$, the map which associates to any matrix of $\tilde{E}_r$ its largest eigenvalue is continuous, and allows us to apply a contraction principle to get the large deviations principle for $(\mu_{\varepsilon,N})_N$. In Proposition 7.1 we establish a LDP for $C^{\varepsilon}$ seen as an element of $\tilde{E}_r$. The contraction principle yields a LDP for $(\mu_{\varepsilon,N})_N$, which is stated in Proposition 7.3. We deduce a LDP for $(\lambda_{X_N})_N$ in Proposition 7.4.

Concentration inequalities
Throughout the rest of this paper, we fix $\kappa > 0$ such that for all indices $1 \leq i, j \leq N$ and all $t$ large enough, independently of $N$, We decompose $X_N$ in the following way. with where $d$ is such that $d\alpha > 1$, $\varepsilon > 0$, and where $|z|_{\infty} = \max(|\Re(z)|, |\Im(z)|)$ for every complex number $z$.
With a slight adaptation of the concentration inequality from [14, p. 239] for the largest eigenvalue of a random Hermitian matrix with bounded entries, we get the following proposition.

Proposition.
Let H ∈ H N (C) be a random Hermitian matrix with entries bounded by a constant K > 0 such that (H i,j ) i≤j are independent variables and C be a deterministic Hermitian matrix. Then we have for all t > 0 the following concentration inequality, From the assumption (1.1) on the tail distributions of the entries, we get the following lemma.

Lemma.
For all 1 ≤ i, j ≤ N and for t > 0 Applying the result of Proposition 4.1, we get the following corollary.

Corollary. For all
where A is the matrix with entries and where λ A denote the largest eigenvalue of A.

Proof. If we apply Proposition 4.1 to the Wigner matrix
we get for any t > 0, Since α < 2, we have at the exponential scale N α/2 for all t > 0, lim sup We know from [11] and [3] that the largest eigenvalue of X N converges in mean to 2. Besides by Weyl's inequality we have But with κ > 0 defined in (3). Since dα > 1, we have putting the estimate above into (7) Putting together (6) and (8), we get

Proposition.
Let $u$ be a unit vector and $\mu$ a real number. Let $H$ be a random Hermitian matrix such that the entries $(H_{i,j})_{1\leq i\leq j\leq N}$ are independent and bounded by $K > 0$. We denote by $\mathcal{C}$ the set of Hermitian matrices $X$ of size $N$ with top eigenvalue $\lambda_X < \mu$. Let also $x$ be in $(\mu, +\infty)$.
(i). The function f u : C → R defined by  [7, p.117], we know that t → 1/t is operator convex on (0, +∞). Consequently, t → (x − t) −1 is operator convex on (−∞, x), and in particular on (−∞, µ). It means that the mapping f u defined on C by where , denote the canonical Hermitian product on As a supremum of affine functions,f u is convex and by the property above it is also 1/d 2 -Lipschitz.
We now show that $\tilde{f}_u$ satisfies a bounded differences inequality in quadratic mean in the sense of [14, p.249] on the product space $H_N(\mathbb{C})$ of Hermitian matrices with entries bounded by $K$. Let $H$ and $H'$ be two Hermitian matrices with entries bounded by $K$. Let $\zeta(H)$ be an element of the subdifferential of $\tilde{f}_u$ at the point $H$. Then we have We can apply Lemma 8.6 in the Appendix and it follows that for all $t > 0$

Exponential tightness
Proof. According to Weyl's inequality (see Lemma 8.3 in the Appendix) we have, where $A$, $B^{\varepsilon}$, $C^{\varepsilon}$ and $D^{\varepsilon}$ are as in (4). Therefore We are going to estimate at the exponential scale $N^{\alpha/2}$ the probability of each of the events We already know by Corollary 4.3 that for $t$ large enough we have, For the second event $\{\lambda_{B^{\varepsilon}} > t\}$, we start by proving the following lemma.
Proof. We repeat here almost verbatim the argument used in the proof of Lemma 2.3 in [8, p.7]. We have Let λ > 0. Then by Chernoff's inequality, Denoting by µ the distribution of |X i,j |, we have . (3). Recall that for µ a probability measure on R and g ∈ C 1 we have the following integration by part formula. Thus, (13) Putting together (13) and (12) we get Since dα > 1, we have for N large enough Finally, putting this last estimate into (11) we get which gives the claim.
Coming back to the proof of Proposition 5.1, we notice that Hence, lim sup We focus now on the third event $\{\lambda_{C^{\varepsilon}} > t\}$. The estimate is given by the following lemma.

Lemma. For all
with κ as in (3) and where ρ(C ε ) denote the spectral radius of C ε .
Consequently for $N$ large enough we have, But by Bennett's inequality (see Lemma 8.5 in the Appendix) we have As Using (19), we get lim sup Using inequalities (17) and (18) and the last exponential estimate (20), we get the claim lim sup Finally, we now turn to the estimation of the last probability $\mathbb{P}(\lambda_{D^{\varepsilon}} > t)$. It follows directly from the following lemma.

Lemma. For all
where ρ (D ε ) denote the spectral radius D ε and κ is as in (3).
Proof. Just as in the proof of Lemma 5.3, we have By Markov's inequality we get which gives the announced estimate at the exponential scale.
Putting together the different exponential estimates (10), (15), (16) and Lemma 5.4, we get, using inequality (9), with $C_1 > 0$ a constant small enough, lim sup Let $\varepsilon = t^{-1/(2\alpha+1)}$. Then, we have for $t > 1$, Indeed since $t > 1$, $t^2\varepsilon^{-2+\alpha} \geq t\varepsilon^{\alpha+1}$, and with this choice of $\varepsilon$, $t\varepsilon^{\alpha+1} = \varepsilon^{-\alpha} = t^{\alpha/(2\alpha+1)}$. Putting the latter inequality into (21) and taking the limsup as $t$ goes to $+\infty$ we get lim sup t→+∞ lim sup We now show that $C^{\varepsilon}$ is sparse and that, at the exponential scale, we can consider $C^{\varepsilon}$ to have only a given number of non-zero entries. This will be crucial later when we see $C^{\varepsilon}$ as a finite rank perturbation of the matrix $H_N$.
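The choice $\varepsilon = t^{-1/(2\alpha+1)}$ balances the exponents appearing in the estimate. A small numerical check of the identities $t\varepsilon^{\alpha+1} = \varepsilon^{-\alpha} = t^{\alpha/(2\alpha+1)}$ and of the inequality $t^2\varepsilon^{-2+\alpha} \geq t\varepsilon^{\alpha+1}$ for $t > 1$ is sketched below; the helper names and sample values are illustrative only.

```python
def epsilon_for(t: float, alpha: float) -> float:
    """The choice epsilon = t^(-1/(2*alpha+1)) made in the text."""
    return t ** (-1.0 / (2.0 * alpha + 1.0))


def balanced_exponents(t: float, alpha: float):
    """Return the four quantities compared in the text for a given t and alpha."""
    eps = epsilon_for(t, alpha)
    return (
        t * eps ** (alpha + 1.0),            # t * eps^(alpha+1)
        eps ** (-alpha),                     # eps^(-alpha)
        t ** (alpha / (2.0 * alpha + 1.0)),  # t^(alpha/(2*alpha+1))
        t ** 2 * eps ** (alpha - 2.0),       # t^2 * eps^(-2+alpha)
    )


samples = [balanced_exponents(t, alpha)
           for t in (1.5, 5.0, 100.0)
           for alpha in (0.3, 1.0, 1.9)]
```

In each sample the first three entries agree up to rounding and the fourth dominates them, as claimed for $t > 1$.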

Proposition. For all
Proof. We follow here the argument of the proof of Lemma 2.2 in [8, p. 6]. We have, Using Bennett's inequality (see in the Appendix Proposition 8.5) and denoting x log x, we have for N large enough, where we used in the last inequality the fact that σ ≤ N 2 e −κε α N α/2 . Taking the limsup at the exponential scale we get the claim.
As a consequence of the latter proposition, we get the following result.

Proposition.
For all ε > 0, Proof. As the rank of a matrix is bounded by the number of non-zero entries, we see that Proposition 5.5 yields the claim.

First step
We show here that we can neglect at the exponential scale N α/2 the contributions of the very large entries (namely those such that where A, B ε , C ε and D ε are as in (4).
Proof. We have by Lemma 8.2 in the Appendix, But by Lemma 5.4, lim sup with κ as in (3). Hence, In short, (λ A+C ε ) N,ε are exponentially good approximations of (λ X N ) N .
Proof. Since we already proved Proposition 6.1, it suffices to show for all t > 0, By Lemma 8.2 in the appendix we have, Using Lemma 5.2 and the fact that α ∈ (0, 2), we get the claim.

Second step
We now show that in the study of the deviations of λ A+C ε , we can consider A and C ε to be independent. We will prove the following result.

Theorem. We denote by
With a similar argument as in the proof of Proposition 5.5, we get the following lemma.

Lemma. Let
Proof of Theorem 6.3. Thanks to Proposition 6.2, it is enough to prove for all ε > 0 lim sup We will follow the same coupling argument to remove the dependency between A and C ε as in the proof of Proposition 2.1 in [8].
Then $A$ and $H_N$ are independent of $\mathcal{F}$ and have the same law. By Lemma 8.2 in the appendix we have, Let $t > 0$ and $F = \{|I| < tN/(\log N)^{2d}\}$. Then we have by Lemma 6.4 But according to (22), Thus, But $C^{\varepsilon}$ is $\mathcal{F}$-measurable and, conditionally on $\mathcal{F}$, $A$ is a random Hermitian matrix whose up-diagonal entries are independent and bounded by $(\log N)^d/\sqrt{N}$. According to Proposition 4.1, we get Applying again Proposition 4.1 to $H_N$ and $C^{\varepsilon}$, we get But $A$ and $H_N$ are independent of $\mathcal{F}$ and have the same law. Therefore, Thus by the triangle inequality,

Exponential approximation of the equation of eigenvalues outside the bulk
As a consequence of the LDP for the empirical spectral measure proved in [8], we show in the next proposition that the deviations at the left of 2 have an infinite cost at the exponential scale $N^{\alpha/2}$. This result will allow us to focus only on understanding the deviations of the largest eigenvalue outside the bulk.
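The comparison of the two speeds can be written out as a short worked estimate. This is only a sketch: the set $F$ and the constant $c$ are auxiliary names introduced here, and $h$ is a bounded continuous function supported in $(x,2)$ with $\langle\sigma_{sc},h\rangle = 1$. The set $F = \{\mu : \langle\mu,h\rangle = 0\}$ is closed for the weak topology and does not contain $\sigma_{sc}$; since the rate function $I$ of [8] is good and vanishes only at $\sigma_{sc}$, the constant $c = \min(1, \inf_F I)$ is positive.

```latex
\begin{align*}
\{\lambda_{X_N} \leq x\} &\subset \{\mu_{X_N} \in F\}, \\
\limsup_{N\to+\infty} \frac{1}{N^{1+\alpha/2}} \log \mathbb{P}\left(\mu_{X_N} \in F\right)
  &\leq -\inf_{F} I \leq -c < 0, \\
\frac{1}{N^{\alpha/2}} \log \mathbb{P}\left(\lambda_{X_N} \leq x\right)
  &\leq -\frac{c}{2}\, N \quad \text{for } N \text{ large enough},
\end{align*}
```

so the left-hand side of the last line tends to $-\infty$ as $N \to +\infty$.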

Proposition.
$\forall x < 2$, lim sup Proof. According to [8], we know that the empirical spectral measure $\mu_{X_N}$ satisfies a LDP with speed $N^{1+\alpha/2}$ and with good rate function $I$ which achieves 0 only for the semicircular law $\sigma_{sc}$. Let $x < 2$ and $h$ be a bounded continuous function whose support is in $(x, 2)$ such that $\langle\sigma_{sc}, h\rangle = 1$. We have
6.6 Definition. As in Theorem 6.3 we denote by $P_N$ the law of $X_{1,1}$ conditioned on Let $H$ be a random Hermitian matrix independent of $X$ such that $(H_{i,j})_{i\leq j}$ are independent, and for $1 \leq i \leq N$, $H_{i,i}$ has law $P_N$ and for all $i < j$, $H_{i,j}$ has law $Q_N$. We denote by $H_N$ the normalized matrix $H/\sqrt{N}$.
To this end, we need a control over the spectrum of $H_N$ which allows us to assume that the spectrum of $H_N$ is nearly included in $(-\infty, 2]$ at the exponential scale we are considering. Arguing as in Corollary 4.3, we get the following proposition.

Proposition (Control on the spectrum of H N ). Let δ > 0 and let
In view of Theorem 6.3, Proposition 6.7, and Proposition 6.5, we are reduced to understanding the deviations outside the bulk, at the exponential scale $N^{\alpha/2}$, of the largest eigenvalue of the perturbed Wigner matrix $H_N + C^{\varepsilon}$, where $C^{\varepsilon}$ can be assumed, thanks to Proposition 5.6, to be a finite rank matrix. We will use here the same approach as in many papers on finite rank deformations of Wigner matrices (see for example [6] or [12]) to determine the behavior of the extreme eigenvalues outside the bulk of a perturbed Wigner matrix. This approach is based on a determinant computation, stated here without proof in the following lemma. It is a direct consequence of Frobenius' formula (see Proposition 8.1 in the Appendix).

Lemma.
Let H and C be two Hermitian matrices of size N . Denote by r the rank of C, by θ 1 , ..., θ r the non-zero eigenvalues of C in nondecreasing order and u 1 , ..., u r orthonormal eigenvectors associated with those eigenvalues and by Sp(H) the spectrum of H.
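To make the determinant computation concrete, here is a minimal sketch in the simplest case: a $2\times 2$ real diagonal $H$ and a rank-one $C = \theta u u^{T}$. It assumes the lemma takes the classical form $\det(x - H - C) = \det(x - H)\,\bigl(1 - \theta\,\langle u, (x-H)^{-1}u\rangle\bigr)$ for $x \notin \mathrm{Sp}(H)$, so that the top eigenvalue of $H + C$ is a zero of the second factor. The numerical values and function names are illustrative.

```python
import math


def top_eigenvalue_2x2(a: float, b: float, c: float) -> float:
    """Largest eigenvalue of the real symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2.0 + math.hypot((a - c) / 2.0, b)


def secular(x: float, h, theta: float, u) -> float:
    """1 - theta * <u, (x - H)^{-1} u> for H = diag(h), x outside the spectrum of H."""
    return 1.0 - theta * sum(ui * ui / (x - hi) for hi, ui in zip(h, u))


h = (-1.0, 0.5)                    # eigenvalues of the diagonal matrix H
theta = 2.0                        # non-zero eigenvalue of the rank-one matrix C
u = (math.cos(0.7), math.sin(0.7)) # unit eigenvector of C

# Entries of H + C = diag(h) + theta * u u^T.
a = h[0] + theta * u[0] ** 2
b = theta * u[0] * u[1]
c = h[1] + theta * u[1] ** 2

lam = top_eigenvalue_2x2(a, b, c)
residual = secular(lam, h, theta, u)  # should vanish at the top eigenvalue

# Determinant identity checked at a point outside both spectra.
x = 4.0
lhs = (x - a) * (x - c) - b * b
rhs = (x - h[0]) * (x - h[1]) * secular(x, h, theta, u)
```

The same mechanism, with an $r \times r$ determinant in place of the scalar factor, is what reduces the eigenvalue problem for $H_N + C^{\varepsilon}$ to a problem of size $r$.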
The goal of this section is to prove an exponential approximation of the equation of eigenvalues of the perturbed matrix outside the bulk. We will prove the following result.
6.9 Theorem. Let H N be defined just as in 6.6 and let C N be an independent random Hermitian matrix. Let r be the rank of C N , θ 1 , ..., θ r the non-zero eigenvalues in non-decreasing order of C N and u 1 , ..., u r orthonormal eigenvectors of C N associated with those eigenvalues.
Let δ > 0, ρ > 0, and r ∈ N. Define the event where ρ(C N ) is the spectral radius of C N . For any t > 0 and any compact set K included in (2 + δ, +∞) lim sup where f is defined for any x > 2 by

First step
We start by showing that M N is close to its conditional expectation with respect to C N . As a consequence of Proposition 4.4, we get the following concentration result.

Proposition. Let u, v be two unit vectors. For
Proof. Since $b_N$ is a bilinear form, by the polarization formula we see that we only need to prove By assumption, $H_N$ has its entries bounded by $(\log N)^d/\sqrt{N}$. From Proposition 4.4 we get with $\mu = 2 + \delta$, where $\tilde{f}$ where we denote by $\langle\cdot,\cdot\rangle$ the canonical Hermitian product on $H_N(\mathbb{C})$. But we have where the supremum is taken over the compact set $K_N$ of Hermitian matrices with entries bounded by $(\log N)^d/\sqrt{N}$. Thus on one hand, we have for all $t > 0$ Invoking Proposition 6.7, we get On the other hand, It only remains to show that Indeed, putting together (26) with (28) and the claim above, we will get by the triangle inequality We now show (30). Since $x > 2 + \delta$, we have for all $H \in \mathcal{C}_{\delta}$ Let $H$ be a Hermitian matrix with entries bounded by But $H/(\|H\| + 1)$ is in $\mathcal{C}_{\delta}$, thus $|\tilde{f}(H/(\|H\| + 1))| \leq 1/\eta$. Besides, $\tilde{f}$ is $1/\eta^2$-Lipschitz with respect to the Hilbert-Schmidt norm, therefore We deduce From Proposition 6.7 we get which ends the proof of the claim.
We are now ready to prove that M N , restricted to the event that the spectrum of H N is in (−∞, 2 + δ), is exponentially equivalent to its conditional expectation with respect to C N , uniformly on any compact included in (2 + δ, +∞).

Proposition (Concentration in the equation of eigenvalues outside the bulk).
For all x > 2 + δ, we define where C δ = {X ∈ H N (C) : λ X < 2 + δ} and where H N is as in 6.6. For all compact set K included in (2 + δ, +∞), and where E C N denote the conditional expectation with respect to C N , and where |M | ∞ = sup i,j |M i,j |, for any matrix M .
Proof. Fix $x$ in $(2 + \delta, +\infty)$ and $i, j \in \{1, \dots, r\}$. We will denote by $\mathbb{P}_{C_N}$ and $\mathbb{E}_{C_N}$ respectively the conditional probability and the conditional expectation with respect to $C_N$. We have Thus from Proposition 6.10, we get Taking the union over all the $i, j$ in $\{1, \dots, r\}$, we get for any $x \in (2 + \delta, +\infty)$ We now use an ε-net argument to extend this concentration inequality uniformly in $x$ in a given compact set $K$ included in $(2 + \delta, +\infty)$. Let $n \in \mathbb{N}$. Since there are at most $\delta(K)n$ points in $\{x \in K : nx \in \mathbb{Z}\}$, where $\delta(K)$ denotes the diameter of $K$, we deduce that for any $t > 0$ Taking $n$ large enough we get The second step of the proof of Theorem 6.9 will be to prove an isotropic property of the semicircular law. This will be made possible thanks to the estimates of the coefficients of the resolvent of Wigner matrices in [16].

Theorem. For any compact set included in
Proof. Let $u$ and $v$ be two unit vectors. Let $K$ be a compact set included in $(2 + \delta, +\infty)$. We denote by $\eta = \inf K - (2 + \delta)$. To ease the notation, we denote for any $z \notin \mathrm{Sp}(H_N)$, $R(z) = (z - H_N)^{-1}$ the resolvent of $H_N$. Let $\eta > 0$ and $x \in K$. We write $z = x + i\eta$. We have Thus, Take $\eta = 1/\log N$. From Proposition 6.7 we get uniformly on $K$. Expanding the scalar product and using the exchangeability of the entries of $H_N$ we get Since $u$ and $v$ are unit vectors, we have But thanks to Proposition 3.1 in [16], we have where we denote by $R(X_N)$ the resolvent of $X_N$ and where $P_9$ is a polynomial of degree 9. But recall from the proof of Theorem 6.3 that $H_N$ has the same law as the matrix $A'$ where $A'$ is the $N \times N$ matrix such that Therefore where $R(A')$ denotes the resolvent of $A'$ at $x + i\eta$. Using the resolvent equation (see Lemma 8.4 in the Appendix) we get where $\|\cdot\|_{HS}$ denotes the Hilbert-Schmidt norm. But it is easy to see that since we know from Lemma 4.2 that with $\kappa$ as in (3) and $d\alpha > 1$. Combining (35), (34) and (33), we get uniformly in $x \in \mathbb{R}$, But according to [16, Proposition 3.1], we have also uniformly on $K$. Thus, putting (36), (37) together with (32), we get uniformly on $K$. The proof is now complete by injecting this result into (31).
Putting together the exponential equivalent proved in Proposition 6.11 and the isotropic property of Theorem 6.12 with the control on the spectrum of $H_N$ proved in Proposition 6.7, we get the following exponential equivalent for $M_N$.

Proposition.
Let H N be as in 6.6 and C N be a random Hermitian matrix independent of H N . Let r be the rank of C N , θ 1 , θ 2 , ..., θ r the non-zero eigenvalues of C N in non-decreasing order and u 1 , u 2 , ..., u r orthonormal eigenvectors associated with those eigenvalues. We define for x / ∈ Sp(H N ), and for all x > 2, Let δ > 0. We have for all compact set K included in (2 + δ, +∞) and t > 0, Proof. By triangular inequality, we have From Theorem 6.12 we know that We get the claim by applying Proposition 6.11.
We are now ready to give the proof of Theorem 6.9.
Proof of Theorem 6.9. Let $K$ be a compact set included in $(2 + \delta, +\infty)$. Assuming $W$ happens, we see that for all $x \in K$, the matrices $M_N(x)$ and $M(x)$ have their spectral radii bounded by where $d(2 + \delta, K)$ is the distance of $2 + \delta$ from $K$. Therefore $M(x)$ and $M_N(x)$ remain in a compact set $K_r \subset M_r(\mathbb{C})$. But the determinant function is uniformly continuous on $K_r$. Let $t > 0$ and $s > 0$ such that for all $M, N \in K_r$, We have Therefore using Proposition 6.13 and taking the limsup at the exponential scale, we get the claim.

Exponential equivalence of the largest solutions of the eigenvalue equation and the limit equation.
We are interested here in finding simple exponentially good approximations of $(\lambda_{X_N})_N$ which will allow us to derive a large deviation principle for $\lambda_{X_N}$. To this end, define for all $N \in \mathbb{N}$ and $\varepsilon > 0$ We will show in this section the following result.
6.14 Theorem. For all $t > 0$ lim ε→0 lim sup Since we know by Theorem 6.3 that $(\lambda_{H_N+C^{\varepsilon}})_{N,\varepsilon}$ are exponentially good approximations of $(\lambda_{X_N})_N$, we only need to prove Theorem 6.14 with $\lambda_{H_N+C^{\varepsilon}}$ instead of $\lambda_{X_N}$. We will focus first on finding an exponential equivalent of $\lambda_{H_N+C_N}$ where $C_N$ is a general random Hermitian matrix independent of $H_N$. We know by Lemma 6.8 that, provided $\lambda_{H_N+C_N}$ is outside the spectrum of $H_N$, it is the largest zero of $f_N$ defined for all $z \notin \mathrm{Sp}(H_N)$ by where $r$ is the rank of $C_N$, $\theta_1, \dots, \theta_r$ are the non-zero eigenvalues of $C_N$ in nondecreasing order and $u_1, u_2, \dots, u_r$ are orthonormal eigenvectors associated with those eigenvalues. But from Theorem 6.9, we know that this function is arbitrarily close to a certain limit function $f$ on every compact set included in $(2, +\infty)$ with exponentially high probability, with $f$ defined for all $z \notin (-2, 2)$ by Therefore, one can hope that the largest zero of $f_N$, which is the top eigenvalue of $H_N + C_N$, is arbitrarily close to the largest zero of $f$. Observe that $f$ admits zeros only when $\theta_r > 1$, in which case its largest zero is
6.15 Remark. For all $z \in (0, 1]$, $G^{-1}_{\sigma_{sc}}(z) = z + \frac{1}{z}$.
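The role of the inverse $G^{-1}_{\sigma_{sc}}(z) = z + 1/z$ can be illustrated numerically. The sketch below makes two assumptions that are not confirmed by the displayed formulas above: the closed form $G_{\sigma_{sc}}(x) = (x - \sqrt{x^2-4})/2$ for $x \geq 2$, and a limit function of the product form $f(x) = \prod_{i=1}^{r}(1 - \theta_i G_{\sigma_{sc}}(x))$. Under these assumptions the largest zero of $f$ is $G^{-1}_{\sigma_{sc}}(1/\theta_r) = \theta_r + 1/\theta_r$ when $\theta_r > 1$. All function names are hypothetical.

```python
import math


def stieltjes_semicircle(x: float) -> float:
    """Stieltjes transform of the semicircular law at real x >= 2 (standard closed form)."""
    return (x - math.sqrt(x * x - 4.0)) / 2.0


def inverse_stieltjes(z: float) -> float:
    """Inverse of the Stieltjes transform on (0, 1], as in Remark 6.15."""
    return z + 1.0 / z


def limit_function(x: float, thetas) -> float:
    """Assumed limit function f(x) = prod_i (1 - theta_i * G(x))."""
    g = stieltjes_semicircle(x)
    return math.prod(1.0 - theta * g for theta in thetas)


def largest_zero_of_limit(thetas):
    """Largest zero of the assumed f on (2, +infinity), or None if there is none."""
    top = max(thetas)
    if top <= 1.0:
        return None  # no factor can vanish since G(x) < 1 for x > 2
    return inverse_stieltjes(1.0 / top)


thetas = (-0.5, 1.5, 3.0)
rho = largest_zero_of_limit(thetas)  # 3 + 1/3 for this choice of thetas
```

Since $G_{\sigma_{sc}}$ is decreasing on $(2, +\infty)$, every factor is positive to the right of `rho`, so `rho` is indeed the largest zero under the stated assumptions.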

Proposition.
Let $H_N$ be as in 6.6 and let $C_N$ be a random Hermitian matrix independent of $H_N$. Let $\delta > 0$ and $l \geq 2 + 2\delta$. For all $t > 0$ and $r \in \mathbb{N}$, Proof. We start by reducing the problem to the case where $C_N$ has its top eigenvalue simple and bounded away from its second largest eigenvalue. Let $u$ be an eigenvector associated with the largest eigenvalue of $C_N$. Let $\gamma > 0$. We denote by $C_{N,\gamma}$ the matrix defined by By definition, the largest eigenvalue of $C_{N,\gamma}$ is bounded away from its second largest eigenvalue by $\gamma$. Provided that $\lambda_{C_N} \geq 1$, we denote by $\mu_{N,\gamma}$ the random variable

Easy computations yield
Therefore for $\gamma$ small enough, $l$ large enough and $\delta$ small enough, It is now clear that it is sufficient to prove Proposition 6.16 with $V^{\gamma}_{r,l}$ instead of $V_{r,l}$, where where $\theta_r(X)$ and $\theta_{r-1}(X)$ denote respectively the largest and the second largest eigenvalue of $X$. We know from Theorem 6.9 that the functions $f_N$ and $f$ are arbitrarily close on any compact set with exponentially high probability. Since we cannot make the error on the distance between $f_N$ and $f$ in Theorem 6.9 depend on $C_N$, we now need a kind of uniform continuity property of the largest zero of continuous functions belonging to a certain compact set, in order to get that their largest zeros are close with exponentially high probability. This is the object of the following lemma.

Lemma (Uniform continuity of zeros). Let $K$ be a compact set included in
$(2 + \delta, +\infty)$ and $K'$ be a closed set included in $K$ such that there is some open set $U$ satisfying $K' \subset U \subset K$. Let $\rho > 0$, $r \in \mathbb{N}$, and $\gamma > 0$. Define where $\rho_r = G^{-1}_{\sigma_{sc}}(1/\theta_r)$ whenever $\theta_r \geq 1$. $\mathcal{K}$ is a compact set of the space $C(K)$ of continuous functions on $K$. Thus, for any $t > 0$, there is $s > 0$ such that if $f \in \mathcal{K}$ and $g \in C(K)$ are such that $\sup_{x\in K} |f(x) - g(x)| < s$, then $g$ admits at least one zero and its largest zero $z_{\max}(g)$ is such that Proof. Since the $\theta_i$'s are in a compact set and $K$ is compact, it follows that $\mathcal{K}$ is compact in $C(K)$.
Let $\varphi$ be defined for any $g \in C(K)$ by where $z_{\max}(g)$ denotes the largest zero of $g$ if it exists. Observe that for any $f \in \mathcal{K}$, $f$ admits at least one zero in $K$ and its largest zero is $\rho_r$. Since $\theta_r - \theta_{r-1} \geq \gamma$, $\rho_r$ is a simple zero of $f$. Thus, we see that $\varphi$ is continuous at each $f$ in $\mathcal{K}$, and since $\mathcal{K}$ is compact, we get the claim.
We now come back to the proof of Proposition 6.16. Observe that the fact that $C_N \in V^{\gamma}_{r,l}$ yields that $\mu_N \leq l$. Let $K$ be a compact set such that there is an open set $U$ satisfying $[2 + 2\delta, l] \subset U \subset K \subset (2 + \delta, +\infty)$. Applying Lemma 6.17 with $K' = [2 + 2\delta, l]$ and $K$, we get a $s > 0$ such that By Theorem 6.9 we deduce which ends the proof of Proposition 6.16.
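The continuity of the largest zero can be observed on a toy example. The sketch below again assumes the product form $f(x) = \prod_i (1 - \theta_i G_{\sigma_{sc}}(x))$ with the standard closed form of $G_{\sigma_{sc}}$; it perturbs $f$ by a constant $s$ (so that the sup-distance is exactly $s$) and locates the largest zero by a grid scan from the right followed by bisection. The compact set $[2.1, 6]$, the values of θ and the helper names are illustrative choices.

```python
import math


def stieltjes_semicircle(x: float) -> float:
    """Stieltjes transform of the semicircular law at real x >= 2 (standard closed form)."""
    return (x - math.sqrt(x * x - 4.0)) / 2.0


def make_limit_function(thetas):
    """Assumed limit function f(x) = prod_i (1 - theta_i * G(x))."""
    def f(x: float) -> float:
        g = stieltjes_semicircle(x)
        return math.prod(1.0 - theta * g for theta in thetas)
    return f


def largest_zero(func, left: float, right: float, grid: int = 4000, iters: int = 80):
    """Largest sign change of func on [left, right], refined by bisection.
    Returns None when no sign change is detected on the grid."""
    step = (right - left) / grid
    hi, f_hi = right, func(right)
    for k in range(1, grid + 1):
        lo = right - k * step
        f_lo = func(lo)
        if f_lo == 0.0:
            return lo
        if f_lo * f_hi < 0.0:
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                f_mid = func(mid)
                if f_mid * f_hi > 0.0:  # same sign as the right end: zero lies left of mid
                    hi, f_hi = mid, f_mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        hi, f_hi = lo, f_lo
    return None


f = make_limit_function((0.5, 3.0))
z_f = largest_zero(f, 2.1, 6.0)  # close to 3 + 1/3
# Displacement of the largest zero under a perturbation of sup-norm s.
shifts = {s: abs(largest_zero(lambda x, s=s: f(x) - s, 2.1, 6.0) - z_f)
          for s in (1e-1, 1e-2, 1e-3)}
```

The displacement shrinks with $s$, which is the qualitative content of the lemma; the simplicity of the largest zero (here guaranteed by the gap between the two values of θ) is what keeps the displacement of order $s$.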
We are now ready to give the proof of Theorem 6.14.
Proof of Theorem 6.14 . According to Proposition 6.5, we only need to prove for δ > 0 small enough that lim ε→0 lim sup With δ < t/3, we see that it is actually sufficient to show lim ε→0 lim sup Using Proposition 6.16 but with C ε instead of C N we get for any l ≥ 2 + 2δ, and k ∈ N, where µ ε,N is defined as in (38) and where As a consequence of Lemma 5.3 and Proposition 5.6 we get for any ε > 0, lim r,l→+∞ lim sup Thus, lim sup Using the fact that (λ H N +C ε ) N,ε are an exponentially good approximation of (λ X N ) N according to Theorem 6.3, we get lim ε→0 lim sup But (λ X N ) N is exponentially tight according to Proposition 5.1, thus we finally get the claim (41).

Large deviation principle for the largest eigenvalue of X N
Our aim here is to prove for each ε a large deviations principle for (µ ε,N ) N . Since (µ ε,N ) ε,N are exponentially good approximations of the largest eigenvalue of X N , we will get a large deviations principle for λ X N . For every r ∈ N we define Let S denote the group ∪ n>0 S n . We denote by Ẽ r the set of equivalence classes of E r under the action of S, which is defined by where M σ denotes the permutation matrix associated with the permutation σ, i.e. M σ = (δ i,σ(j) ) i,j . We observe that the vector space Ẽ r can be embedded into H r (C)/S r . Identifying Ẽ r with a subspace of H r (C)/S r , we equip Ẽ r with the quotient topology of H r (C)/S r . This topology is metrizable by the distance d̃ given by Since the map λ which associates to a matrix of H r (C) its largest eigenvalue is continuous and invariant under conjugation, it induces a map λ on H r (C)/S r which is still continuous. Therefore the map λ which associates to a matrix of Ẽ r its largest eigenvalue is continuous for the topology defined above.
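The invariance of the spectrum under the action of permutations can be illustrated on a small example. The sketch below assumes that the action is conjugation, A ↦ M σ −1 A M σ , with M σ = (δ i,σ(j) ) i,j as above; the displayed definition is not reproduced here, so this is an assumption. Under this convention conjugation simply relabels the indices, (M σ −1 A M σ ) i,j = A σ(i),σ(j) , and the traces of powers (hence the eigenvalues, and in particular the largest one) are unchanged. The matrix, the permutation and the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def perm_matrix(sigma):
    """Permutation matrix M with M[i][j] = 1 if i == sigma[j], else 0."""
    n = len(sigma)
    return [[1 if i == sigma[j] else 0 for j in range(n)] for i in range(n)]


def conjugate_by_perm(A, sigma):
    """Return M^{-1} A M; for a permutation matrix, M^{-1} is the transpose."""
    M = perm_matrix(sigma)
    M_inv = [list(row) for row in zip(*M)]
    return matmul(matmul(M_inv, A), M)


def power_traces(A, kmax):
    """Traces of A, A^2, ..., A^kmax, i.e. the power sums of the eigenvalues."""
    n = len(A)
    P, traces = A, []
    for _ in range(kmax):
        traces.append(sum(P[i][i] for i in range(n)))
        P = matmul(P, A)
    return traces


# A small Hermitian matrix with integer real and imaginary parts (exact arithmetic).
A = [[1, 2 + 1j, 0],
     [2 - 1j, 0, 3],
     [0, 3, -2]]
sigma = [2, 0, 1]  # the permutation 0 -> 2, 1 -> 0, 2 -> 1

B = conjugate_by_perm(A, sigma)
relabelled = [[A[sigma[i]][sigma[j]] for j in range(3)] for i in range(3)]

print(B == relabelled)                            # conjugation relabels indices
print(power_traces(A, 3) == power_traces(B, 3))   # same power sums of eigenvalues
```

Since the first n power sums determine the eigenvalues of an n × n matrix, equality of the traces is enough to conclude that the two matrices share the same spectrum.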
Let ε > 0. Let P ε N,r denote the law of C ε conditioned on the event {C ε ∈ E r } and P̃ ε N,r the image measure of P ε N,r under the projection π : E r → Ẽ r .

Proposition.
Let r ∈ N and ε > 0. Then (P̃ ε N,r ) N ∈N satisfies a large deviations principle with speed N α/2 and good rate function I ε,r defined for all Ã ∈ Ẽ r by
We recall here a lemma from [8] which will be very useful to derive a large deviations principle for C ε .

Lemma. (According to [8, p.18])
For all γ > 0 and all x ≠ 0 there is a sequence (b N ) N which converges to b such that for N large enough :
Similarly for all z ≠ 0 and all 0 < γ < |z|, there is a sequence (a N ) N which converges to a such that for N large enough :
Proof of Proposition 7.1. Property of the rate function: Let γ > 0. Let ϕ denote the injective linear map between E r and H r (C)/S r and p : H r (C) → H r (C)/S r the projection given by the action of S r on H r (C). We have Since K γ is a compact set of H r (C), we see that the γ-level set of I ε,r is a compact set of E r . Therefore I ε,r is a good rate function.
Exponential tightness: Let γ > 0. We denote By the same argument as above, K γ is a compact set of E r . Then by definition of P ε N,r , we have But 1 i,j |C ε i,j |>γ and 1 C ε ∈Er are respectively nondecreasing and nonincreasing with respect to the absolute value of each entry of C ε . Therefore by Harris' inequality we have Now choose a 1 such that 0 < 2a 1 < a and b 1 such that 0 < b 1 < b. By Chernoff's inequality we have But integrating by parts just as in the proof of Lemma 5.2 we get, Similarly we get for N large enough and with a 2 such that 2a 2 ∈ (2a 1 , a) E e Therefore putting together (45) and (46) into (44), we get at the exponential scale lim sup which proves that (P̃ ε N,r ) N is exponentially tight.
Lower bound: Let r ∈ N and A ∈ H r (C) with at most r non-zero entries. Let also N ≥ r. Without loss of generality we can assume Let δ > 0 be such that δ < min min
But by independence, we have Since δ < min min But according to Lemma 7.2, for N large enough we get, with κ defined in (4), Putting (49) and (50) into (48) we get Hence at the exponential scale, lim inf Besides, by the Borel-Cantelli lemma, we have Putting these estimates into (47), we get lim inf
Upper bound: Since C ε has its non-zero entries in R ε,ε −1 = {z ∈ C : ε ≤ |z| ≤ ε −1 }, we see that whenever A does not have all its non-zero entries in R ε,ε −1 . Let A ∈ H r (C) be such that A has all its non-zero entries included in R ε,ε −1 . Since X ∈ H r (C) → Σ r i=1 |X i,i | α and X ∈ H r (C) → Σ 1≤i≠j≤r |X i,j | α are continuous, then by definition of the topology we put on Ẽ r the functions X̃ ∈ Ẽ r → Σ i≥1 |X i,i | α and X̃ ∈ Ẽ r → Σ i≠j |X i,j | α are continuous. Let h(δ) be a nonnegative function such that h(δ) → 0 as δ → 0, satisfying } are nondecreasing with respect to the absolute value of each entry of C ε , and {C ε ∈ E r } is nonincreasing with respect to the absolute value of each entry of C ε . By Harris' inequality we have Taking N large enough, we can assume A ∈ H N (C). But by Chernoff's inequality and taking 0 < b 1 < b, But we know from (45) that for N large enough, As this inequality is true for all b 1 < b, we have letting b 1 go to b, Putting these two last estimates into (51), we get lim sup δ→0 lim sup
The idea is now to use the fact that, according to Proposition 5.5, C ε has at most r non-zero entries with arbitrarily exponentially large probability, together with the continuity of the largest eigenvalue map on Ẽ r , to apply a contraction principle in order to get a LDP for (µ ε,N ) N . By Proposition 5.5, we get lim r→+∞ lim sup We can apply [9,Theorem 4.2.16] and deduce that (µ ε,N ) N satisfies a weak LDP with speed N α/2 and rate function defined for all x ∈ R by Ψ ε (x) = sup where J ε is defined in Proposition 7.3. To conclude that Ψ ε = J ε , we need to show that J ε is lower semicontinuous. We will in fact show that J ε has compact level sets.
Let τ > 0 and x ∈ R. If with I ε,r being defined in Proposition 7.1. Since I ε,r is a good rate function and f is continuous on Ẽ r , we have Thus, the τ -level set of J ε is compact, which concludes the proof.
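Harris' inequality, used in the exponential tightness and upper bound steps, states that for independent coordinates a coordinatewise nondecreasing function and a coordinatewise nonincreasing function are negatively correlated. The following toy check is only an illustration under invented assumptions: three independent variables taking the values 0, 1, 2 (playing the role of the absolute values of the entries) with hypothetical probabilities, an event of the type "the sum is large" (nondecreasing) and an event of the type "at most one non-zero entry" (nonincreasing). Probabilities are computed exactly by enumeration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical law of the absolute value of one entry (assumption for the demo).
VALUES = (0, 1, 2)
PROBS = (Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))
N_ENTRIES = 3


def expectation(h):
    """Exact expectation of h over N_ENTRIES independent entries."""
    total = Fraction(0)
    for idx in product(range(len(VALUES)), repeat=N_ENTRIES):
        weight = Fraction(1)
        for i in idx:
            weight *= PROBS[i]
        total += weight * h([VALUES[i] for i in idx])
    return total


def large_sum(c):
    """Indicator of {sum of entries >= 2}: nondecreasing in each entry."""
    return 1 if sum(c) >= 2 else 0


def few_nonzero(c):
    """Indicator of {at most one non-zero entry}: nonincreasing in each entry."""
    return 1 if sum(1 for v in c if v != 0) <= 1 else 0


p_both = expectation(lambda c: large_sum(c) * few_nonzero(c))
p_large = expectation(large_sum)
p_few = expectation(few_nonzero)

# Harris' inequality: P(A and B) <= P(A) P(B) for A increasing, B decreasing.
print(p_both, p_large * p_few, p_both <= p_large * p_few)
```

This is the mechanism that allows the conditioning on {C ε ∈ E r } to be removed at the cost of an inequality in the proof above.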

Theorem.
The sequence (λ X N ) N ∈N satisfies a LDP with speed N α/2 and good rate function defined by,
Proof. We already know by Proposition 5.1 that (λ X N ) N is exponentially tight. Thus it is sufficient to prove that (λ X N ) N satisfies a weak LDP. Since we know from Theorem 6.14 that (µ ε,N ) N are exponentially good approximations of (λ X N ) N and, for each ε > 0, (µ ε,N ) N satisfies a LDP with rate function J ε , then by [9, Theorem 4. with We show in the next lemma that we can compute J explicitly.

Lemma.
For all x ∈ R,
Proof. As α ∈ (0, 2), we have for all matrices A ∈ H n (C), We deduce Because the rate function of λ X N is for all x > 2 we deduce that for all x > 2, If b ∧ a 2 = b, then the infimum is achieved for the 1 × 1 matrix with the entry G σsc (x) −1 . If b ∧ a 2 = a 2 , the infimum is achieved for the 2 × 2 Hermitian matrix with zero diagonal entries and both off-diagonal entries equal to G σsc (x) −1 .

Therefore in both cases
With Lemma 7.5 proven, it is easy to see that J has compact level sets. Therefore, from (52), we get that Φ = J, which concludes the proof.
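The two candidate minimizers in Lemma 7.5 share the same largest eigenvalue G σsc (x) −1 , which can be verified with a short computation. The sketch below assumes the convention G σsc (x) = (x − √(x² − 4))/2 for real x > 2 and uses the closed form for the eigenvalues of a 2 × 2 Hermitian matrix; the helper names and the sample value x = 3 are hypothetical.

```python
import math


def stieltjes_semicircle(x):
    """Assumed convention: G(x) = (x - sqrt(x^2 - 4)) / 2 for real x > 2."""
    if x <= 2.0:
        raise ValueError("this sketch only handles real x > 2")
    return (x - math.sqrt(x * x - 4.0)) / 2.0


def largest_eig_2x2_hermitian(a, b, d):
    """Largest eigenvalue of [[a, b], [conj(b), d]] with a, d real."""
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    return mean + radius


x = 3.0
c = 1.0 / stieltjes_semicircle(x)

# 1 x 1 candidate: the matrix (c), whose only eigenvalue is c.
top_1x1 = c
# 2 x 2 candidate: zero diagonal, off-diagonal entries equal to c; eigenvalues are +c and -c.
top_2x2 = largest_eig_2x2_hermitian(0.0, c, 0.0)

print(top_1x1, top_2x2)
# With theta = c, the relation G^{-1}(1/theta) = theta + 1/theta recovers x.
print(c + 1.0 / c)
```

Both matrices therefore correspond to the same deviation x of the largest eigenvalue; which of them achieves the infimum depends only on the comparison of the constants in the lemma.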

Acknowledgement
I would like to thank Alice Guionnet for welcoming me at MIT during April and May 2014, where I was able to put this paper into shape. I feel very grateful to have had this opportunity to work with Alice Guionnet and for the time and availability she offered me, as well as for her excellent guidance, enthusiasm and generosity. I would also like to thank MIT for its hospitality and all the people who made my time there so enjoyable. Finally I would like to thank my supervisor Charles Bordenave for his inspiring advice and the attention he gave to this paper.