DIFFERENTIAL OPERATORS AND SPECTRAL DISTRIBUTIONS OF INVARIANT ENSEMBLES FROM THE CLASSICAL ORTHOGONAL POLYNOMIALS. THE CONTINUOUS CASE

Following the investigation by U. Haagerup and S. Thorbjørnsen, we present a simple differential approach to the limit theorems for empirical spectral distributions of complex random matrices from the Gaussian, Laguerre and Jacobi Unitary Ensembles. In the framework of abstract Markov diffusion operators, we derive by the integration by parts formula differential equations for Laplace transforms and recurrence equations for moments of eigenfunction measures. In particular, a new description of the equilibrium measures as adapted mixtures of the universal arcsine law with an independent uniform distribution is emphasized. The moment recurrence relations are used to describe sharp, non asymptotic, small deviation inequalities on the largest eigenvalues at the rate given by the Tracy-Widom asymptotics. Submitted


Introduction
Limiting distributions of spectral measures of random matrices have been studied extensively in mathematical physics, since the pioneering work by E. Wigner (cf. [Me], [De], [Fo]), as well as in the context of multivariate analysis in statistics (cf. [Bai]).
To illustrate, as an introduction, some of the classical results, consider first, for each integer $N \geq 1$, a selfadjoint centered Gaussian random $N \times N$ matrix $X = X^N$ with variance $\sigma^2$. By this, we mean that $X$ is an $N \times N$ Hermitian matrix such that the entries above the diagonal are independent complex (real on the diagonal) Gaussian random variables with mean zero and variance $\sigma^2$. Equivalently, $X$ is distributed according to
$$P(dX) = \frac{1}{C} \exp\big( - \mathrm{Tr}(X^2)/2\sigma^2 \big)\, dX \qquad (1)$$
where $dX$ is Lebesgue measure on the space of Hermitian $N \times N$ matrices, and $X$ is then said to be an element of the Gaussian Unitary Ensemble (GUE). For such a Hermitian random matrix $X$, denote by $\lambda^N_1, \ldots, \lambda^N_N$ the eigenvalues of $X = X^N$. Denote furthermore by $\mu^N_\sigma$ the mean (empirical) spectral measure $E\big( \frac{1}{N} \sum_{i=1}^N \delta_{\lambda^N_i} \big)$. It is a classical result due to E. Wigner [Wi] that the mean density $\mu^N_\sigma$ converges weakly, as $\sigma^2 \sim \frac{1}{4N}$, $N \to \infty$, to the semicircle law (on $(-1, +1)$).
The second main example arose in the context of the so-called complex Wishart distributions. Let $G$ be a complex $M \times N$ random matrix the entries of which are independent complex Gaussian random variables with mean zero and variance $\sigma^2$, and set $Y = Y^N = G^* G$. The law of $Y$ defines similarly a unitary invariant probability measure on the space of Hermitian matrices called the Laguerre Unitary Ensemble (LUE) (cf. [Fo]). Denote by $\lambda^N_1, \ldots, \lambda^N_N$ the eigenvalues of the Hermitian matrix $Y = Y^N$ and by $\mu^N_\sigma$ the mean spectral measure $E\big( \frac{1}{N} \sum_{i=1}^N \delta_{\lambda^N_i} \big)$. It is a classical result due to V. Marchenko and L. Pastur [M-P] (see also [G-S], [Wa], [Jon] for the real case in the context of sample covariance matrices) that the mean density $\mu^N_\sigma$ converges weakly, as $\sigma^2 \sim \frac{1}{4N}$ and $M = M(N) \sim cN$, $N \to \infty$, $c > 0$, to the so-called Marchenko-Pastur distribution (with parameter $c > 0$), or free Poisson distribution.
Recently, U. Haagerup and S. Thorbjørnsen [H-T] gave an entirely analytical treatment of these results on asymptotic eigenvalue distributions. This analysis is made possible by the determinantal representation of the eigenvalue distribution as a Coulomb gas and the use of orthogonal polynomials (cf. [Me], [De], [Fo], [H-T]...). For example, in the GUE case, by unitary invariance of the ensemble (1) and the Jacobian formula, the distribution of the eigenvalues $(\lambda^N_1, \ldots, \lambda^N_N)$ of $X = X^N$ is given by
$$\frac{1}{C}\, \Delta_N(x)^2 \prod_{i=1}^N d\mu(x_i/\sigma), \quad x = (x_1, \ldots, x_N) \in R^N, \qquad (2)$$
where
$$\Delta_N(x) = \prod_{1 \leq i < j \leq N} (x_j - x_i)$$
is the Vandermonde determinant, $d\mu(x) = e^{-x^2/2} \frac{dx}{\sqrt{2\pi}}$ the standard normal distribution on $R$ and $C$ the normalization factor. Denote by $P_\ell$, $\ell \in N$, the normalized Hermite polynomials with respect to $\mu$. Since, for each $\ell$, $P_\ell$ is a polynomial function of degree $\ell$, up to a constant depending on $N$, the Vandermonde determinant $\Delta_N(x)$ is easily seen to be equal to $\det\big( P_{\ell-1}(x_k) \big)_{1 \leq k, \ell \leq N}$.
On the basis of this observation and the orthogonality properties of $P_\ell$, the marginals of the eigenvalue vector $(\lambda^N_1, \ldots, \lambda^N_N)$ (the so-called correlation functions) may be represented as determinants of the (Hermite) kernel $K_N(x, y) = \sum_{\ell=0}^{N-1} P_\ell(x) P_\ell(y)$. In particular, the mean spectral measure $\mu^N_\sigma$ of the Gaussian matrix $X$ is given, for every bounded measurable real-valued function $f$ on $R$, by
$$\int f\, d\mu^N_\sigma = \int f(\sigma x)\, \frac{1}{N} \sum_{\ell=0}^{N-1} P_\ell^2(x)\, d\mu(x). \qquad (3)$$
Coulomb gas of the type (2) may be associated to a number of orthogonal polynomial ensembles for various choices of the underlying probability measure $\mu$ and its associated orthogonal polynomials. This concerns unitary random matrix ensembles (cf. [De], [Fo]...) as well as various random growth models associated to discrete orthogonal polynomial ensembles (cf. in particular [Joha1], [Joha2]). In this work, we only consider random matrix models associated to the classical orthogonal polynomials of the continuous variable (for the discrete case, see [Le2]) for which determinantal representations of the correlation functions are available. For example, the mean spectral measure $\mu^N_\sigma$ of the Wishart matrix $Y$ may be represented in the same way by
$$\int f\, d\mu^N_\sigma = \Big( 1 - \frac{M}{N} \Big)^+ f(0) + \int f(\sigma^2 x)\, \frac{1}{N} \sum_{\ell=0}^{M \wedge N - 1} P_\ell^2(x)\, d\mu(x) \qquad (4)$$
for every bounded measurable function $f$ on $R_+$, where $P_\ell$, $\ell \in N$, are now the normalized orthogonal (Laguerre) polynomials for the Gamma distribution $d\mu(x) = \Gamma(\gamma + 1)^{-1} x^\gamma e^{-x} dx$ on $(0, +\infty)$ with parameter $\gamma = |M - N|$. (The Dirac mass at 0 comes from the fact that when $M < N$, $N - M$ eigenvalues of $Y$ are necessarily 0.)
Based on these orthogonal polynomial representations, U. Haagerup and S. Thorbjørnsen deduce in [H-T] the asymptotic distributions of the complex Gaussian and Wishart spectral measures from explicit formulas for the Laplace transforms of the mean spectral measures in terms of confluent hypergeometric functions. They also determine recurrence equations for moments and describe the almost sure asymptotic behavior of the largest and smallest eigenvalues of these models.
In this paper, we actually push forward the investigation by U. Haagerup and S. Thorbjørnsen. With respect to their work, we however avoid any appeal to series expansions and rather only concentrate on the differential aspects of confluent hypergeometric functions. We present the methodology in the abstract framework of Markov diffusion generators, which provides a convenient language to derive both the basic differential equations on Laplace transforms and recurrence formulas for moments from the integration by parts formula. The setting moreover allows us to consider in the same framework, and with no more effort, spectral measures described by Jacobi polynomials, associated to Beta random matrices. Precisely, if $G_1$ and $G_2$ are independent complex, respectively $M_1 \times N$ and $M_2 \times N$ with $M_1 + M_2 \geq N$, random matrices the entries of which are independent complex Gaussian random variables with mean zero and variance 1, set The law of $Z$ defines a unitary invariant probability distribution, element of the Jacobi Unitary Ensemble (JUE). Denote again by $\lambda^N_1, \ldots, \lambda^N_N$ the eigenvalues of the Hermitian matrix $Z = Z^N$. Denote furthermore by $\mu^N$ the mean spectral measure $E\big( \frac{1}{N} \sum_{i=1}^N \delta_{\lambda^N_i} \big)$. Then, using the Coulomb gas description of the eigenvalues (cf. [T-W2], [Fo]), the mean spectral measure $\mu^N$ of $Z = Z^N$ may be represented, for every bounded measurable function $f$ on $(-1, +1)$, by , where $P_\ell$, $\ell \in N$, are the normalized orthogonal (Jacobi) polynomials for the Beta, or Jacobi, distribution a, b > 0, are described by P. Forrester in the symmetric case, see [Fo] and the references therein, and B. Collins [Co] for another particular family using asymptotics of Jacobi polynomials [C-I]. These limits may be predicted by the free probability calculus and the result of D. Voiculescu [Vo] and have been obtained this way in full generality by M. Capitaine and M. Casalis [C-C]. In our setting, their construction will follow the one of the Marchenko-Pastur distribution (and seems to be connected with free multiplicative products of Bernoulli laws [Co]).
It is worthwhile mentioning that the Gaussian Unitary random matrix Ensemble may be seen as a limiting case of the Laguerre Unitary random matrix Ensemble, and similarly the latter is a limiting case of the Jacobi Unitary random matrix Ensemble. Namely, with σ = 1,
The main results of this work are presented in Section 3 where the framework of abstract Markov operators is used to derive identities for a general class of eigenfunction measures. In this setting, Hermite, Laguerre and Jacobi operators are treated in complete analogy in Section 4. On the basis of the general abstract identities developed in Section 3, we actually deal first with the eigenfunction measures $P_N^2\, d\mu$ for which the universal limiting arcsine distribution is emphasized in accordance with the classical theory in the compact case developed by A. Maté, P. Nevai and V. Totik [M-N-T]. The method of [M-N-T], based on the recurrence relations for orthogonal polynomials, may certainly be adapted similarly to the non-compact case (and orthogonal polynomials with varying coefficients). We however develop here a strategy using differential equations and moment recurrence relations that is well-suited to the small deviation inequalities on the largest eigenvalues presented in Section 5. The limiting distributions for the Cesaro averages $\frac{1}{N} \sum_{\ell=0}^{N-1} P_\ell^2\, d\mu$ (corresponding to the mean spectral measures) then appear as mixtures of affine transformations of the arcsine law with an independent uniform distribution, leading thus to a new picture of the Marchenko-Pastur law and the equilibrium law of Beta random matrices. These equilibrium measures also appear as limits of empirical measures on the zeroes of the associated orthogonal polynomials.
From the general identities of Section 3, we obtain in Section 5 recurrence formulas for moments of the mean spectral measures of the three ensembles GUE, LUE and JUE. These recurrence equations are used to derive in a simple way small deviation inequalities on the largest eigenvalues for matrices with fixed size of the GUE, LUE and JUE, in concordance with the Tracy-Widom asymptotics [T-W1]. For example, it is known that the largest eigenvalue $\lambda^N_{\max}$ of the GUE random matrix $X = X^N$ with $\sigma^2 = \frac{1}{4N}$ converges almost surely to the right endpoint 1 of the support of the limiting spectral distribution (semicircle law). C. Tracy and H. Widom [T-W1] described the fluctuations of $\lambda^N_{\max}$ at the rate $N^{2/3}$. They showed that (some multiple of) $N^{2/3} (\lambda^N_{\max} - 1)$ converges weakly to the Tracy-Widom distribution $F$ constructed as a Fredholm determinant of the Airy kernel (as a limit in this regime of the Hermite kernel using delicate Plancherel-Rotach orthogonal polynomial asymptotics), and that may be described as
$$F(s) = \exp\Big( - \int_s^\infty (x - s)\, u(x)^2\, dx \Big), \quad s \in R, \qquad (6)$$
where $u(x)$ is the unique solution of the Painlevé equation $u'' = 2u^3 + xu$ with the asymptotics $u(x) \sim \mathrm{Ai}(x)$ as $x \to \infty$. Similar results have been established for the largest eigenvalues of the Laguerre [Joha1], [John] and recently the Jacobi [Co] (cf. also [Fo]) Unitary Ensembles. For discrete orthogonal polynomial ensembles, see [Joha1], [Joha2].
Using the moment equations for the spectral distribution, we actually show that for every $N$ and $0 < \varepsilon \leq 1$,
$$P\big( \{ \lambda^N_{\max} \geq 1 + \varepsilon \} \big) \leq C\, e^{-N \varepsilon^{3/2}/C}$$
for some numerical constant $C > 0$, and similarly for the largest and smallest (soft edge) eigenvalues of the LUE and soft edge of the symmetric JUE. These small deviation inequalities thus agree with the fluctuation rate $N^{2/3}$ (choose ε of the order of $N^{-2/3}$) and the asymptotics of the Tracy-Widom distribution for $s$ large. They are also related to the large deviation principle of [BA-D-G]. We obtain these results from bounds on the $p$-th moments of the trace by a simple induction procedure on the recurrence relations. The argument stays at a rather mild level, however easily tractable. In particular, with respect to the Tracy-Widom asymptotics, only the first correlation function is used and no sharp asymptotics of orthogonal polynomials is required. Non asymptotic small deviation inequalities on the largest eigenvalues of the LUE (as a limit of the discrete Meixner orthogonal polynomial ensemble) may also be shown to follow from earlier bounds that K. Johansson [Joha1] obtained using a delicate large deviation theorem together with a superadditivity argument (cf. [Le2]). In the limit from the LUE to the GUE, these bounds also cover the GUE case. An approach via hypercontractivity is attempted in [Le1]. In [Au], bounds over integral operators are used for this task.
In the companion paper [Le2], the analytic approach presented here is developed, with similar results, for discrete orthogonal polynomial ensembles as deeply investigated recently by K. Johansson [Joha1], [Joha2].

Abstract Markov operator framework
As announced, to unify the various examples we will investigate next, we consider more generally the setting of abstract Markov generators (see [Bak], [F-O-T]). The general conclusions obtained here might be of independent interest. Let thus µ be a probability measure on some measurable space (E, E). Let L be a Markov generator invariant and symmetric with respect to µ. To work with more freedom, we assume that we are given an algebra F of functions, dense in the domain of L and contained in all $L^p$ spaces (with respect to µ).
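Given functions $f, g$ in F, define the carré du champ operator Γ by
$$\Gamma(f, g) = \frac{1}{2} L(fg) - \frac{1}{2} f\, Lg - \frac{1}{2} g\, Lf.$$
Since L is invariant, the integration by parts formula indicates that for any $f, g$ in F,
$$\int f (-Lg)\, d\mu = \int \Gamma(f, g)\, d\mu. \qquad (7)$$
Our main assumption in this work is that L is a diffusion operator, that is, Γ is a derivation in the sense that, for every smooth function φ on $R^k$, any $f \in$ F and any finite family $(f_1, \ldots, f_k)$ in F,
$$\Gamma\big( \phi(f_1, \ldots, f_k), f \big) = \sum_{i=1}^k \partial_i \phi(f_1, \ldots, f_k)\, \Gamma(f_i, f). \qquad (8)$$
This setting conveniently includes the three basic one-dimensional diffusion operators we will consider in this work.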
a) The first one is the Hermite or Ornstein-Uhlenbeck operator. Here, $E = R$ and
$$Lf = f'' - x f'$$
is the Hermite operator with invariant measure the standard Gaussian measure $d\mu(x) = e^{-x^2/2} \frac{dx}{\sqrt{2\pi}}$. The carré du champ operator is given, on smooth functions, by $\Gamma(f, g) = f' g'$. (One may choose for F the class of smooth functions with polynomial growth.) In addition, the (normalized in $L^2(\mu)$) Hermite polynomials $P_N$, $N \in N$, that form an orthogonal basis of $L^2(\mu)$, are eigenfunctions of $-L$ with respective eigenvalues $N$, $N \in N$.
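As a quick sanity check of this first example, the carré du champ of the Hermite operator can be computed by hand. The computation below assumes the standard definition Γ(f, g) = ½(L(fg) − f Lg − g Lf) and the usual form Lf = f'' − x f' of the Ornstein-Uhlenbeck operator; both are the customary conventions and are taken here as assumptions.

```latex
\begin{align*}
L(fg) &= (fg)'' - x\,(fg)' = f''g + 2 f'g' + f g'' - x f' g - x f g', \\
f\,Lg + g\,Lf &= f g'' - x f g' + g f'' - x g f', \\
\Gamma(f,g) &= \tfrac12 \big( L(fg) - f\,Lg - g\,Lf \big) = f' g' .
\end{align*}
```

For instance, h(x) = x gives −Lh = x and Γ(h, h) = 1, and the degree two polynomial (x² − 1)/√2 satisfies −L((x² − 1)/√2) = 2 (x² − 1)/√2, in line with the eigenvalue N = 2.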
b) Next we turn to the Laguerre operator. Consider the second order differential operator L acting on smooth functions $f$ on $E = (0, +\infty)$ as
$$Lf = x f'' + (\gamma + 1 - x) f'$$
where $\gamma > -1$. The Laguerre operator L has invariant (Gamma) probability distribution $d\mu(x) = d\mu_\gamma(x) = \Gamma(\gamma + 1)^{-1} x^\gamma e^{-x} dx$ (on $E = (0, +\infty)$). The carré du champ operator is given, on smooth functions, by $\Gamma(f, g) = x f' g'$. (One may choose for F the class of smooth functions on $E = (0, +\infty)$ with polynomial growth.) The (normalized in $L^2(\mu)$) Laguerre polynomials $P_N$, $N \in N$, that form an orthogonal basis of $L^2(\mu)$, are eigenfunctions of $-L$ with eigenvalues $N$, $N \in N$.
c) Our third example concerns the Jacobi operator
$$Lf = (1 - x^2) f'' + \big( \beta - \alpha - (\alpha + \beta + 2) x \big) f'$$
acting on smooth functions $f$ on $E = (-1, +1)$, where $\alpha, \beta > -1$. The carré du champ operator is given, on smooth functions, by $\Gamma(f, g) = (1 - x^2) f' g'$. When κ is an integer, the ultraspheric measure $\mu_\kappa$ is the projection on a diameter of the uniform measure on the κ-dimensional sphere. Ultraspherical measures will be the basic limiting distributions in this setting, and particular values of κ are κ = 1 corresponding to the arcsine law, κ = 2 corresponding to the uniform distribution, and κ = 3 corresponding to the semicircle law. To this end, it is worthwhile describing the Laplace, or Fourier, transforms of the Jacobi distributions. Namely, let by the integration by parts formula, In particular thus If $J_\kappa$ is the Laplace transform of the symmetric Jacobi distribution $\mu_\kappa$ with parameter $\kappa > 0$, then $J_\kappa$ solves the differential equation
$$\lambda J_\kappa'' + \kappa J_\kappa' - \lambda J_\kappa = 0. \qquad (9)$$
Note also that for every $\kappa > 0$,
$$\lambda J_{\kappa+2} = (\kappa + 1) J_\kappa' \qquad (10)$$
by (9). It will also be useful later to observe from (9) that the function The differential equation (9) may be expressed equivalently on the moments of $\mu_\kappa$. Namely, if $\chi^\kappa_p = \int x^{2p} d\mu_\kappa$, $p \in N$ (the odd moments are clearly all zero by symmetry), then we have the recurrence equation
$$\chi^\kappa_p = \frac{2p - 1}{2p - 1 + \kappa}\, \chi^\kappa_{p-1}, \quad p \geq 1. \qquad (12)$$
It is worthwhile mentioning, following [Ma], that on the real line, up to affine transformations, the three preceding examples are the only Markov diffusion operators that may be diagonalized in a basis of orthogonal polynomials. Moreover, it is not difficult to check as in the introduction (cf. [Sz]) that the Hermite and Laguerre operators may be obtained in the limit from the Jacobi operators using a proper scaling.
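The differential equation for the Laplace transform of a symmetric Jacobi distribution, and the corresponding moment recurrence, can be recovered from invariance alone. The sketch below assumes the symmetric Jacobi operator in the form L_κ f = (1 − x²) f'' − κ x f', with invariant measure dμ_κ proportional to (1 − x²)^{κ/2 − 1} dx on (−1, +1); this normalization is an assumption chosen to match the cases κ = 1, 2, 3 named above, and χ^κ_p is notation introduced here for the 2p-th moment of μ_κ.

```latex
\begin{align*}
0 = \int L_\kappa\big(e^{\lambda x}\big)\, d\mu_\kappa
  &= \lambda^2 \int (1 - x^2)\, e^{\lambda x}\, d\mu_\kappa
     - \kappa \lambda \int x\, e^{\lambda x}\, d\mu_\kappa
   = \lambda^2 \big( J_\kappa - J_\kappa'' \big) - \kappa \lambda\, J_\kappa' , \\
\text{that is,}\quad & \lambda J_\kappa'' + \kappa J_\kappa' - \lambda J_\kappa = 0 . \\
\text{Writing } J_\kappa(\lambda) &= \sum_{p \ge 0} \chi^\kappa_p\, \frac{\lambda^{2p}}{(2p)!}
 \text{ and comparing the coefficients of } \lambda^{2p-1} : \\
\chi^\kappa_p \Big( \frac{1}{(2p-2)!} + \frac{\kappa}{(2p-1)!} \Big)
  &= \frac{\chi^\kappa_{p-1}}{(2p-2)!}
 \quad \Longrightarrow \quad
 \chi^\kappa_p = \frac{2p-1}{2p-1+\kappa}\, \chi^\kappa_{p-1}, \qquad \chi^\kappa_0 = 1 .
\end{align*}
```

With κ = 1, 2, 3 the ratio is (2p − 1)/(2p), (2p − 1)/(2p + 1) and (2p − 1)/(2p + 2), which gives the moments of the arcsine, uniform and semicircle laws on (−1, +1) respectively.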

Differential equations
In the preceding abstract framework, we now describe a simple procedure, relying on the integration by parts formula (7), to establish differential equations on measures given by pqdµ where p and q are eigenfunctions of −L, with respective eigenvalues ρ p and ρ q (> 0).
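We assume that we are given a function $h$ of F such that
$$-Lh = A(h) \qquad (13)$$
for some real-valued function $A$ on $R$. Assume moreover that for all $f, g \in$ F,
$$\Gamma(h, f)\, \Gamma(h, g) = B(h)\, \Gamma(f, g) \qquad (14)$$
where $B : R \to R$ is smooth enough (or only µ-almost everywhere). We also agree that $\Gamma(h, h) = B(h)$.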
Furthermore, we assume that $h, p, q$ are compatible for L in the sense that for real constants $d_p$, $d_q$, and functions $D_p, D_q : R \to R$,
$$-L\Gamma(h, p) = d_p \Gamma(h, p) + D_p(h)\, p \quad \text{and} \quad -L\Gamma(h, q) = d_q \Gamma(h, q) + D_q(h)\, q. \qquad (15)$$
The main result in this setting is a differential identity on the operator F given by for smooth functions $\theta : R \to R$. (We assume throughout the argument that all the integrals we will deal with are well-defined.)
Proposition 3.1. Let $\theta : R \to R$ be smooth. Under the preceding assumptions on $p$, $q$ and $h$, where R is the differential operator defined, on smooth functions $\phi : R \to R$, by
Proof. First note that, by the integration by parts formula (7) and diffusion property (8), and assumptions (13), (14), for any smooth $\phi : R \to R$ and any $f \in$ F, As a consequence, by (8) again, and where we used (14).
In particular, together with (16) for θ ′ , We also have that together with the same formula exchanging the roles of p and q.
Using (15), we now have that and similarly with $q$. To handle the last term on the right-hand side of (20), we use again the eigenfunction property of $q$ to get that Therefore, (20) yields We now summarize the preceding: Applying (17) to $\theta''$, and then (21) to $\theta'$ (with the same formula exchanging the roles of $p$ and $q$), we get that The conclusion follows by applying (18) to $B\theta''$ and (19).
Corollary 3.2. In the setting of Proposition 3.1, where the real-valued functions $T_4, T_3, T_2, T_1, T_0$ are given by
In case the functions $A$, $B$, $D$ are polynomials, Corollary 3.2 may be used directly to deduce a differential equation for the Laplace or Fourier transforms of $pq\, d\mu$ as well as recurrence formulas for its moments. Indeed, when $\theta(x) = e^{\lambda x}$, $x \in R$ and λ is a parameter in some open domain in $C$, we get the following consequence. For a given (complex) polynomial of the variable λ, denote by the corresponding differential operator acting on smooth (complex) functions φ.
Corollary 3.3. Let $\varphi(\lambda) = \int e^{\lambda h} pq\, d\mu$ assumed to be well-defined for λ in some open domain of $C$. Then,
When $\theta(x) = x^k$, $x \in R$ and $k \in N$, we may deduce from Corollary 3.2 recurrence equations for moments (which will be used specifically in Section 5). In the next sections, we examine the Hermite, Laguerre and Jacobi diffusion operators with respect to their orthogonal polynomials.

Limiting spectral distributions of GUE, LUE and JUE
In this section, we test the efficiency of the preceding abstract results on the three basic examples, Gaussian, Laguerre and Jacobi, random matrix ensembles. We actually discuss first eigenfunction measures of the type $P_N^2\, d\mu$ for which, under appropriate normalizations, the arcsine law appears as a universal limiting distribution (cf. [M-N-T]). The Cesaro averages $\frac{1}{N} \sum_{\ell=0}^{N-1} P_\ell^2\, d\mu$ associated to the mean spectral measures of the random matrix models are then obtained as appropriate mixtures of the arcsine law with an independent uniform distribution. This approach leads to a new look at the classical equilibrium measures of the spectral measures (such as the Marchenko-Pastur distribution). We are indebted to T. Kurtz and O. Zeitouni for enlightening remarks leading to this presentation.
We refer below to [Sz], [Ch]..., or [K-K], for the basic identities and recurrence formulas for the classical orthogonal polynomials.
a) The Hermite case. Let $\varphi(\lambda) = \int e^{\lambda x} P_N^2\, d\mu$ where µ is the standard normal distribution on $R$ and $P_N$ the $N$-th (normalized) Hermite polynomial for µ. We apply the general conclusions of Section 3. Choose $h = x$ so that $A = x$, $B = 1$, and $p = q = P_N$ so that
$$\lambda \varphi'' + \varphi' - \lambda (\lambda^2 + 4N + 2) \varphi = 0.$$
Changing λ into σλ, σ > 0, $\varphi_\sigma(\lambda) = \varphi(\sigma\lambda)$ solves the differential equation
$$\lambda \varphi_\sigma'' + \varphi_\sigma' - \sigma^2 \lambda (\sigma^2 \lambda^2 + 4N + 2) \varphi_\sigma = 0.$$
Whenever $\varphi_\sigma$ and its derivatives converge as $\sigma^2 \sim \frac{1}{4N}$, $N \to \infty$, the limiting function Φ will thus satisfy the differential equation
$$\lambda \Phi'' + \Phi' - \lambda \Phi = 0 \qquad (22)$$
which is precisely the differential equation satisfied by the Laplace transform $J_1$ of the arcsine law (cf. (9)). Precisely, let $V_N$, $N \in N$, be random variables with distributions $P_N^2\, d\mu$, and denote by ξ a random variable with the arcsine distribution on $(-1, +1)$. Using the recurrence equation of the (normalized) Hermite polynomials, it is easily checked that, for example, $\sup_N E(V_N^4/N^2) < \infty$. Extracting a weakly convergent subsequence, it follows by uniform integrability that, along the imaginary axis, the Fourier transform $\varphi_\sigma$ of $\sigma V_N$ ($\sigma^2 \sim \frac{1}{4N}$), as well as its first and second derivatives $\varphi_\sigma'$ and $\varphi_\sigma''$, converge pointwise as $N \to \infty$ to Φ, Φ′ and Φ′′ respectively. By (22), $\Phi = J_1$ so that $V_N / 2\sqrt{N}$ converges weakly to ξ.
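The moment claims used in this argument can be checked numerically from the three-term recurrence alone. The sketch below is an illustration, not the paper's method: it assumes the recurrence x P_n = √(n+1) P_{n+1} + √n P_{n−1} for the normalized Hermite polynomials of the standard Gaussian measure, and the function names are hypothetical. The k-th moment of P_N² dμ is read off as the coefficient of P_N in the expansion of x^k P_N.

```python
import math


def multiply_by_x(coeffs):
    """Expand x * sum_n c_n P_n in the orthonormal Hermite basis.

    Uses x P_n = sqrt(n+1) P_{n+1} + sqrt(n) P_{n-1} (assumed normalization).
    """
    out = {}
    for n, c in coeffs.items():
        out[n + 1] = out.get(n + 1, 0.0) + math.sqrt(n + 1) * c
        if n >= 1:
            out[n - 1] = out.get(n - 1, 0.0) + math.sqrt(n) * c
    return out


def eigenfunction_moment(N, k):
    """k-th moment of the measure P_N^2 dmu, i.e. <x^k P_N, P_N>."""
    coeffs = {N: 1.0}
    for _ in range(k):
        coeffs = multiply_by_x(coeffs)
    return coeffs.get(N, 0.0)


def arcsine_moment(p):
    """2p-th moment of the arcsine law on (-1, +1)."""
    return math.comb(2 * p, p) / 4 ** p


if __name__ == "__main__":
    N = 2000
    for p in (1, 2, 3):
        scaled = eigenfunction_moment(N, 2 * p) / (4 * N) ** p
        print(p, scaled, arcsine_moment(p))
```

The second and fourth moments come out as 2N + 1 and 6N² + 6N + 3, so E(V_N⁴)/N² stays bounded, and after division by (2√N)^{2p} the even moments approach those of the arcsine law.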
Proposition 4.1. Let $\mu^N_\sigma$ be the mean spectral measure of the random matrix $X^N$ from the GUE with parameter σ. Whenever $\sigma^2 \sim \frac{1}{4N}$, then $\mu^N_\sigma$ converges weakly to the law of $\sqrt{U}\, \xi$ with ξ an arcsine random variable on $(-1, +1)$ and $U$ uniform on $[0, 1]$ and independent from ξ.

Of course, $\sqrt{U}\, \xi$ has the semicircle law, in accordance with the classical Wigner theorem. This may be checked directly, or on the moment identities (12). Alternatively, and more in the differential context of this paper, one may observe from Lemma 5.1 below that whenever ψ is the Laplace transform of the measure $\frac{1}{N} \sum_{\ell=0}^{N-1} P_\ell^2\, d\mu$, then $2N\lambda\psi = \varphi' - \lambda\varphi$. In the limit under the scaling $\sigma^2 \sim \frac{1}{4N}$, $\lambda\Psi = 2\Phi' = 2J_1'$ so that, by (10), $\Psi = J_3$, the Laplace transform of the semicircular law.
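The moment check mentioned here is short enough to carry out exactly. The sketch below uses only standard facts: by independence, E[(√U ξ)^{2p}] = E[U^p] E[ξ^{2p}], with E[U^p] = 1/(p+1) and arcsine moments C(2p, p)/4^p, while the semicircle moments on (−1, +1) are generated by the ratio (2p − 1)/(2p + 2). The function names are hypothetical; since all laws involved have compact support, equality of moments identifies the distributions.

```python
from fractions import Fraction
from math import comb


def arcsine_moment(p):
    """2p-th moment of the arcsine law on (-1, +1)."""
    return Fraction(comb(2 * p, p), 4 ** p)


def uniform_moment(p):
    """p-th moment of the uniform law on [0, 1]."""
    return Fraction(1, p + 1)


def mixture_moment(p):
    """2p-th moment of sqrt(U) * xi with U and xi independent."""
    return uniform_moment(p) * arcsine_moment(p)


def semicircle_moment(p):
    """2p-th moment of the semicircle law on (-1, +1), via the recurrence."""
    chi = Fraction(1)
    for j in range(1, p + 1):
        chi *= Fraction(2 * j - 1, 2 * j + 2)
    return chi


if __name__ == "__main__":
    for p in range(6):
        print(p, mixture_moment(p), semicircle_moment(p))
```

Odd moments vanish on both sides by symmetry, so only the even ones need comparing.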
To investigate spectral measures of random matrices and their equilibrium distributions, we are actually interested in the measures 1 N N −1 ℓ=0 P 2 ℓ dµ.We proceed as in the Hermite case, with however the additional dependence of µ = µ γ and P ℓ = P γ ℓ on the varying parameter γ = γ N ∼ c ′ N , N → ∞, c ′ ≥ 0. For f : R → R bounded and continuous, write where U is uniform on [0, 1] and independent from ξ.
We obtain in this way convergence of the mean spectral measure µ N σ , as presented in the introduction.By appropriate operations on (4), we conclude to the following statement.
Proposition 4.2.Let µ N σ be the mean spectral measure of the random matrix Y N from the LUE with parameter σ.Whenever σ 2 ∼ 1 4N and M = M (N ) ∼ cN , N → ∞, c > 0, then µ N σ converges weakly to where ν is the law of , with ξ an arcsine random variable on (−1, +1) and U uniform on [0, 1] and independent from ξ.
Now of course, the law ν c ′ of the random variable 1 2 on (v − u, v + u) ⊂ (0, +∞) where u and v are described by ( 23) so that This may be checked directly, although at the expense of some tedious details.Alternatively, more in the context of this paper, one may observe from Lemma 5.3 below that whenever ψ is the Laplace transform of the measure 1 In the limit under the scaling Since Φ = e vλ J 1 (uλ) (with u and v given by ( 23)), it must be, by ( 10), that Ψ ′ = const.e vλ J 3 (uλ).Now, assuming a priori that Ψ is the Laplace transform Ψ(λ) = e λx dν c ′ of some (probability) distribution ν c ′ on (0, +∞), we have that Changing the variable, In other words, ν c ′ is the .Note that when c ′ = 0, then v − u = 0 and ν 0 is simply the distribution of ζ 2 where ζ has the semicircle law.
c) The Jacobi case. Let $\varphi(\lambda) = \int e^{\lambda x} P_N^2\, d\mu$ where $\mu = \mu_{\alpha,\beta}$ is the Beta distribution with parameters $\alpha, \beta > -1$ on $(-1, +1)$ and $P_N = P^{\alpha,\beta}_N$ the $N$-th (normalized) Jacobi polynomial for $\mu_{\alpha,\beta}$. As in the Hermite and Laguerre cases, we apply the general conclusions of Section 3. Choose $h = x$ so that $A = (\alpha + \beta + 2)x + \alpha - \beta$, $B = 1 - x^2$, and $p = q = P_N$ so that Hence $s = 0$, $t = -(\alpha + \beta)$, and as $N \to \infty$, whenever φ and its derivatives converge, the limiting function Φ solves the differential equation In other words, by (11), $\Phi(\lambda) = e^{v\lambda} J_1(u\lambda)$ where Precisely, let $V_N = V^{\alpha,\beta}_N$, $N \in N$, be random variables with distributions $P_N^2\, d\mu$. Recall we denote by ξ a random variable with the arcsine distribution on $(-1, +1)$ and Laplace transform $J_1$. Using the recurrence equation of the (normalized) Jacobi polynomials of parameter α, β, it is easily checked that, for example, sup Extracting a weakly convergent subsequence, it follows by uniform integrability that, along the imaginary axis, φ, φ′, φ′′ converge pointwise as and Φ′′ respectively. As we have seen, $\Phi(\lambda) = e^{v\lambda} J_1(u\lambda)$ so that $V_N$ converges weakly to $u\xi + v$ where $u$ and $v$ are given by (25). (Note that when $a' = b' = 0$, $\Phi = J_1$, recovering the result of [M-N-T] in this particular case.) Cesaro means of orthogonal polynomials are studied as in the Hermite and Laguerre cases so as to yield the following conclusion. Recall the mean spectral measure $\mu^N$ of the Beta random matrix Together with (5), we conclude to the following statement.
Proposition 4.3.Let µ N be the mean spectral measure of the random matrix , with ξ an arcsine random variable on (−1, +1) and U uniform on [0, 1] and independent from ξ.

Moment equations and small deviation inequalities
In this section, we draw from the general framework developed in Section 3 moment recurrence formulas for the spectral measures of Hermite, Laguerre and Jacobi random matrix ensembles to obtain sharp small deviation bounds, for fixed $N$, on the largest eigenvalues. As presented in the introduction, in the three basic examples under study, the largest eigenvalue $\lambda^N_{\max}$ of the random matrices $X^N$, $Y^N$ and $Z^N$ is known to converge almost surely to the right endpoint of the support of the corresponding limiting spectral measure. Universality of the Tracy-Widom distribution arises in the fluctuation of $\lambda^N_{\max}$ at the common rate $N^{2/3}$ and has been established in [T-W1] for the GUE (cf. also [De]), in [Joha1] and [John] for the LUE, and recently in [Co] for the JUE. We present here precise non asymptotic deviation inequalities at the rate $N^{2/3}$. Classical measure concentration arguments (cf. [G-Z]) do not apply to reach the rate $N^{2/3}$. We use for this task a simple induction argument on the moment recurrence relations of the orthogonal polynomial ensembles. As in the preceding section, we investigate successively the Hermite, Laguerre and Jacobi Unitary Ensembles.
a) GUE random matrices. In order to describe the moment relations for the measure $\frac{1}{N} \sum_{\ell=0}^{N-1} P_\ell^2\, d\mu$, where µ is the standard normal distribution and $P_\ell$, $\ell \in N$, are the (normalized) orthogonal Hermite polynomials for µ, the following alternate formulation of the classical Christoffel-Darboux (cf. [Sz]) formula will be convenient.
Lemma 5.1. For any smooth function $g$ on $R$, and each $N \geq 1$,
Proof. Let L be the first order operator In other words, for every integer $p \geq 2$, ). This recurrence equation, reminiscent of the three step recurrence equation for orthogonal polynomials, was first put forward in an algebraic context by J. Harer and D. Zagier [H-Z]. It is also discussed in the book by M. L. Mehta [Me].
The proof above is essentially the one of [H-T].
It may be noticed that $b^N_p \to \chi_p$ as $\sigma^2 \sim \frac{1}{4N}$, $N \to \infty$, for every $p$, where $\chi_p$ is the $2p$-moment of the semicircle law. It may actually even be observed from (27) that when $\sigma^2 = \frac{1}{4N}$, for every fixed $p$ and every $N \geq 1$, where $C_p > 0$ only depends on $p$.
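The convergence of the moments, and the induction bound used later for the largest eigenvalue, can be explored with exact rational arithmetic. The sketch below assumes that the recurrence (27) is the Harer-Zagier recurrence written in the normalization σ² = 1/(4N), namely b_p = r_p b_{p−1} + r_p · (2p−3)/(2p) · p(p−1)/(4N²) · b_{p−2} with r_p = (2p−1)/(2p+2) and b_0 = 1; this explicit form and the function names are assumptions made for the illustration.

```python
from fractions import Fraction


def semicircle_moments(p_max):
    """2p-th moments chi_p of the semicircle law on (-1, +1), p = 0..p_max."""
    chi = [Fraction(1)]
    for p in range(1, p_max + 1):
        chi.append(Fraction(2 * p - 1, 2 * p + 2) * chi[-1])
    return chi


def gue_moments(n, p_max):
    """2p-th moments b_p of the mean spectral measure for sigma^2 = 1/(4n).

    Assumed recurrence (Harer-Zagier, rescaled), with b_0 = 1.
    """
    b = [Fraction(1)]
    for p in range(1, p_max + 1):
        r = Fraction(2 * p - 1, 2 * p + 2)
        value = r * b[p - 1]
        if p >= 2:
            correction = Fraction(2 * p - 3, 2 * p) * Fraction(p * (p - 1), 4 * n * n)
            value += r * correction * b[p - 2]
        b.append(value)
    return b


def bound_holds(n, p_max):
    """Check chi_p <= b_p <= (1 + p(p-1)/(4 n^2))^p * chi_p for p = 0..p_max."""
    chi = semicircle_moments(p_max)
    b = gue_moments(n, p_max)
    return all(
        chi[p] <= b[p] <= (1 + Fraction(p * (p - 1), 4 * n * n)) ** p * chi[p]
        for p in range(p_max + 1)
    )


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, [float(x) for x in gue_moments(n, 4)])
    print([float(x) for x in semicircle_moments(4)])
```

For N = 1 the matrix is a single centered Gaussian variable with variance 1/4, and the recurrence returns its moments 1, 1/4, 3/16, 15/64, which gives an independent check of the assumed normalization.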
Let now $X = X^N$ be a GUE matrix with $\sigma^2 = \frac{1}{4N}$. As we have seen, the mean spectral measure $\mu^N_\sigma$ of $X^N$ then converges weakly as $N \to \infty$ to the semicircle law on $(-1, +1)$. The largest eigenvalue $\lambda^N_{\max}$ of $X^N$ is known (cf. [Bai], [H-T]) to converge almost surely to the right endpoint +1 of the support of the limiting semicircle law. (By symmetry, the smallest eigenvalue converges to −1.) Actually, the fact that $\limsup_{N \to \infty} \lambda^N_{\max} \leq 1$ almost surely will follow from the bounds we provide next and the Borel-Cantelli lemma. To prove the lower bound for the liminf requires a quite different set of arguments, namely the classical strengthening of the convergence of the mean spectral measure into almost sure convergence of the spectral measure. (This is shown in [H-T] via a measure concentration argument.) C. Tracy and H. Widom [T-W1] proved that (some multiple of) $N^{2/3} (\lambda^N_{\max} - 1)$ converges weakly to the distribution $F$ of (6). We bound here, for fixed $N$, the probability that $\lambda^N_{\max}$ exceeds $1 + \varepsilon$ for small ε's in accordance with the rate $N^{2/3}$ and the asymptotics at infinity of the Airy function.
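The chain of inequalities behind such a bound can be sketched as follows. This is an outline under stated assumptions, not a complete proof: it takes for granted the moment bound b_p ≤ (1 + p(p−1)/4N²)^p χ_p obtained by induction on the recurrence (27), uses the standard estimate χ_p ≤ C p^{−3/2} for the semicircle moments, and does not track the numerical constant C.

```latex
\begin{align*}
P\big( \lambda^N_{\max} \ge 1 + \varepsilon \big)
 &\le N\, \mu^N_\sigma\big( [1 + \varepsilon, \infty) \big)
 && \text{(union bound over the $N$ eigenvalues)} \\
 &\le N\, (1 + \varepsilon)^{-2p}\, b_p
 && \text{(Markov's inequality applied to $x^{2p}$)} \\
 &\le N\, (1 + \varepsilon)^{-2p} \Big( 1 + \frac{p(p-1)}{4N^2} \Big)^{p} \chi_p
 && \text{(moment bound)} \\
 &\le C\, N\, p^{-3/2} \exp\Big( - p\,\varepsilon + \frac{p^3}{4N^2} \Big)
 && \text{($\chi_p \le C p^{-3/2}$ and $\log(1+\varepsilon) \ge \varepsilon/2$ for $0 < \varepsilon \le 1$)} .
\end{align*}
```

Choosing p of the order of √ε N (an integer, at least 1) makes the exponent equal to −(3/4) N ε^{3/2} up to rounding, while the prefactor N p^{−3/2} is of the order of N^{−1/2} ε^{−3/4}, which is bounded as soon as ε ≥ N^{−2/3}. For smaller ε the quantity N ε^{3/2} is at most 1, so a bound of the form C e^{−N ε^{3/2}/C} holds trivially after enlarging C.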
Fix thus $\sigma^2 = \frac{1}{4N}$. By (3), for every ε > 0, Hence, for every $p \geq 0$, Note that, for every fixed $p$, the moment equations (28) converge as ). The moments $\chi^{c'}_p$ may be recognized as the moments of the Marchenko-Pastur distribution $\nu_{c'}$ of (24). It may actually even be observed from (28) that when $\sigma^2 = \frac{1}{4N}$ and $\gamma = \gamma_N = c'N$, for every fixed $p$ and every $N \geq 1$, where $C_p > 0$ only depends on $p$ and $c'$ (use that b ).
We turn to the Laguerre ensemble, and consider now the Wishart matrix $Y = Y^N$ from the LUE with $\sigma^2 = \frac{1}{4N}$ and $M = M(N) = [cN]$, $c \geq 1$. As we have seen, the mean spectral measure $\mu^N_\sigma$ of $Y^N$ then converges weakly as $N \to \infty$ to the Marchenko-Pastur distribution $\nu_{c'}$ of (24), $c' = c - 1 \geq 0$, with support $(v - u, v + u)$ where $u > 0$ and $v$ are given by (23). The largest eigenvalue $\lambda^N_{\max}$ of $Y^N$ is known to converge almost surely to the right endpoint $v + u$ of the support of $\nu_{c'}$ while the smallest eigenvalue $\lambda^N_{\min}$ converges almost surely towards $v - u$ (cf. [Bai]). As for the GUE, the fact that $\limsup_{N \to \infty} \lambda^N_{\max} \leq v + u$ almost surely will follow from the bounds we provide next and the Borel-Cantelli lemma. The lower bound on the liminf follows from the almost sure convergence of the spectral measure. K. Johansson [Joha1] and I. Johnstone [John] independently proved that (some multiple of) $N^{2/3} (\lambda^N_{\max} - (v + u))$ converges weakly to the Tracy-Widom distribution $F$. We bound here, for fixed $N$, the probability that $\lambda^N_{\max}$ exceeds $v + u + \varepsilon$ for small ε's at the appropriate rate and dependence in ε.
Fix thus $\sigma^2 = \frac{1}{4N}$ and $\gamma = [cN] - N$, $c \geq 1$. For simplicity, we assume throughout the argument below that $\gamma = c'N$, $c' = c - 1 \geq 0$, the general case being easily handled along the same line. By (4), we have that for every ε > 0 and $p \geq 0$, where $b_p = b^N_p$ is the $p$-th moment of the mean spectral measure $\mu^N_\sigma$ (that is of the trace of $Y^N$) and $u > 0$ and $v$ are given by (23). The recurrence equation (28 Two cases have to be distinguished. If $u = v = \frac{1}{2}$ ($c' = 0$), the argument leading to Proposition 5.2 may essentially be repeated. Assume thus in the following that $v > u > 0$. We show by recurrence that for some constant $C > 0$ only depending on $c \geq 1$ and possibly varying from line to line below, for every 1 To this task, it suffices to check that We thus conclude as for the proof of Proposition 5.2 to the next sharp small deviation inequality.
For c = 1 (c ′ = 0), a slightly weaker inequality, involving some polynomial factor, is established in [Le1] using hypercontractivity of the Laguerre operator.
Interestingly enough, the same argument may be developed to establish the same exponential bound on the smallest eigenvalue λ N min of Y = Y N in case v > u > 0, that is the so-called "soft wall" in the physicist language as opposed to the "hard wall" 0 (since all the eigenvalues are non-negative).Fluctuation results for λ N min at the soft edge have been obtained in [B-F-P] (see also [Fo]) in a physicist language, with again the limiting Tracy-Widom law (for the hard wall with a Bessel kernel, cf.[T-W2], [Fo]).Assume thus that v > u > 0. For any 0 < ε < 1, Hence, for every p ≥ 0, Setting bp = b −p , the recurrence equations ( 28 As a consequence, provided that u 2 (1 We may then conclude, as in the Hermite and Laguerre cases, to the following result.By symmetry, a similar bound holds for the smallest eigenvalue.
The hypercontractive method of [Le1] may be used similarly on the Jacobi operator to prove a slightly weaker inequality, involving some polynomial factor.
As a consequence of Proposition 5.7 and the Borel-Cantelli lemma, it immediately follows that lim sup N →∞ λ N max ≤ u almost surely.Almost sure convergence of the spectral measure is established in [Co].
The arguments developed here do not seem to yield at this point bounds at fixed N on the largest eigenvalues of the GUE, LUE or (symmetric) JUE under their limiting values (and similarly for the smallest eigenvalues).This would be of particular interest in case of the "hard walls" v − u = 0 for the LUE and v ± u = ±1 for the JUE where the Airy kernel is replaced by the Bessel kernel [T-W2] at the rate N 1/2 .
$b_p$ where we recall that $b_p = b^N_p$ are the $2p$-moments of the mean spectral measure $\mu^N_\sigma$ (that is of the trace of $X^N$). Now, by induction on the recurrence formula (27) for $b_p$, it follows that, for every $p \geq 0$, $b_p \leq \big( 1 + \frac{p(p-1)}{4N^2} \big)^p \chi_p$ ($b_0 = 1$, $b_1 = (\gamma + N)\sigma^2$). As for (27), this recurrence relation already appeared, with the same proof, in the paper [H-T] by U. Haagerup and S. Thorbjørnsen.
holds as a consequence of the recurrence hypothesis (29) for $p - 1$. Hence, by iteration, for every $p \leq N/C$,