On the distribution of maximum value of the characteristic polynomial of GUE random matrices

Motivated by recently discovered relations between logarithmically correlated Gaussian processes and characteristic polynomials of large random $N \times N$ matrices $H$ from the Gaussian Unitary Ensemble (GUE), we consider the problem of characterising the distribution of the global maximum of $D_{N}(x):=-\log|\det(xI-H)|$ as $N \to \infty$ and $x\in (-1,1)$. We arrive at an explicit expression for the asymptotic probability density of the (appropriately shifted) maximum by combining the rigorous Fisher-Hartwig asymptotics due to Krasovsky \cite{K07} with the heuristic {\it freezing transition} scenario for logarithmically correlated processes. Although the general idea behind the method is the same as for the earlier considered case of the Circular Unitary Ensemble, the present GUE case poses new challenges. In particular we show how the conjectured {\it self-duality} in the freezing scenario plays the crucial role in our selection of the form of the maximum distribution. Finally, we demonstrate a good agreement of the found probability density with the results of direct numerical simulations of the maxima of $D_{N}(x)$.


Introduction.
The space of all N × N Hermitian matrices H with probability density function is known as the Gaussian Unitary Ensemble (or GUE) [1,37,42]. Here and henceforth the variance is chosen to ensure that asymptotically for N → ∞, the limiting mean density of the GUE eigenvalues is given by the Wigner semicircle law ρ(x) = (2/π) √ 1 − x 2 supported in the interval x ∈ [−1, 1]. The characteristic polynomial p N (x) = det(xI − H) of the matrix H constitutes one of the most basic quantities of interest, encoding all eigenvalues of H through the roots of p N (x). As one varies the argument x over an interval containing many eigenvalues for a given realization of the ensemble, the value of the polynomial p N (x) shows huge variations by the orders of magnitude for large N, see   The purpose of this article is to describe the statistical properties of the highest peak displayed by the modulus of the GUE polynomial |p N (x)|, namely the probability density for the maximum value attained by |p N (x)| over the interval [−1, 1] on the real line as N → ∞. Our main result is the following where u is a continuous random variable characterized by the two-sided Laplace transform of its probability density: (1.4) E(e us ) = 1 C 2 4s π s Γ(s + 1)Γ(s + 3)G(s + 7/2) 2 G(s + 6)G(s + 1) where Γ(z) and G(z) stand for the Euler gamma-function and the Barnes digamma-function, correspondingly. The normalization C can be evaluated explicitly as where u ′ is an independent random variable with two-sided Laplace transform In the end of the paper we provide convincing numerical evidence that this Laplace transform does indeed define a unique random variable u ′ . This immediately implies that the probability density of u is the convolution of a Gumbel random variable with u ′ . Such a convolution structure is expected to appear universally when studying the extreme value statistics of logarithmically correlated fields [45,14] such as the Gaussian Free Field, Branching Brownian Motion, 1/f noises and indeed the characteristic polynomial of a random unitary matrix. In each of these cases the re-centered statistics of the maximum is expected to have the above Gumbel convolution form, although the distribution of the random variable u may vary from case to case, and is related to the so-called derivative martingale structure. The explicit form of the distribution of u was so far conjectured only in a very few cases [20,24] including the maxima of CUE polynomial [21,22]. Here we add to this list.
In recent years, much interest has accumulated regarding the statistical behaviour of characteristic polynomials of various random matrices as a function of the spectral variable x. To a large extent this interest was stimulated by the established paradigm that many statistical properties of the Riemann zeta function along the critical line, that is ζ(1/2 + it), 3 can be understood by comparison with analogous properties of the characteristic polynomials of random matrices [31,29,11,13,28].
For invariant ensembles [37,42] of self-adjoint matrices with real eigenvalues, statistical characteristics of p N (x) depend very essentially on the choice of scale spanned by the real variable x. From that end it is conventional to say that x spans the local (or microscopic) scale if one considers intervals containing in the limit N → ∞ typically only a finite number of eigenvalues (the corresponding scale for GUE in (1.1) is of the order of 1/N). At such scales, standard objects of interest are correlation functions containing products and ratios of characteristic polynomials, which show determinantal/Pfaffian structures [5,27,44,4,7,32] for Hermitian/real symmetric matrices and tend to universal limits at the local scale. Similar structures arise for properly defined characteristic polynomials p N (θ) = det I − U e −iθ of circular ensembles (like CUE, COE, and CSE) [37] of unitary random matrices U uniformly distributed with respect to the Haar measure on U(N) (and other classical groups) [11,8,12], whose properties on the local scale are indistinguishable from their Hermitian counterparts.
Next, when x spans an interval containing in the limit N → ∞ typically of order of N eigenvalues one speaks of the global (or macroscopic) scale behaviour. At such a scale properties of p N (x) display both universal and non-universal features, the latter depending on the ensemble chosen. The study of characteristic polynomials at such a scale was initiated in [29] where it was shown that the function V N (θ) = −2 log | det(1 − U e −iθ )|, with U belonging to the CUE, converges (in an appropriate sense) to a random Gaussian Fourier series of the form where the coefficients v n , v n are independent standard complex Gaussian random variables, i.e. E{v n } = 0, E{v 2 n } = 0 and E{v n v n } = 1. The covariance structure associated with such a process is given by E{V (θ 1 )V (θ 2 )} = −2 log |e iθ 1 − e iθ 2 | as long as θ 1 = θ 2 . Such a (generalized) random function V (θ) is a representative of random processes known in the literature under the name of 1/f noises, see [26,22] for background discussion and further references.
Recently the study of the global scale behaviour was extended to the GUE polynomial p N (x) in [23] by using earlier insights from [30] and [34]. That work revealed again a structure analogous to that of (1.8), though different in detail. Namely, it was shown that the natural limit ofD N (x) := − log |p N (x)| + E{log |p N (x)|} is given by the random Chebyshev-Fourier series with T n (x) = cos(n arccos(x)) being Chebyshev polynomials and real a n being independent standard Gaussians. A quick computation shows that the covariance structure associated with the generalized process F (x) is given by an integral operator with kernel as long as x = y. Such a limiting process F (x) is an example of an aperiodic 1/f -noise.
Finally, one can consider an intermediate, or mesoscopic spectral scales, with intervals typically containing in the limit N → ∞ the number of eigenvalues growing with N, but representing still a vanishingly small fraction of the total number N of all eigenvalues. The properties of the characteristic polynomials at such scales were again addressed in [23] where it was shown that for the GUE that object gives rise to a particular (singular) instance of the so-called fractional Brownian motion (fBm) [36,15] with the Hurst index H = 0, again characterized by correlations logarithmic in the spectral parameter.
The discussion above serves, in particular, the purpose of pointing to an intimate connection between Gaussian random processes with logarithmic correlations and the modulus of characteristic polynomials at global and mesoscopic scales. The relation is important as logarithmically correlated random processes and fields attract growing attention in Mathematical Physics and Probability and play an important role in problems of Quantum Gravity, Turbulence, and Financial Mathematics, see e.g. [17]. In particular, the periodic 1/f noise (1.8) emerged in constructions of conformally invariant planar random curves [3].
From a quite different perspective the processes similar to (1.8) and (1.9) appeared in the context of statistical mechanics of disordered systems when studying extreme values of random multifractal landscapes supporting spinglass-like thermodynamics [20,24,26,2]. The latter link is especially important in the context of the present paper. The idea that it is beneficial to look at |p N (θ)| as a disordered landscape consisting of many peaks and dips, and think of an associated statistical mechanics problems was put forward in [21,22]. It allowed to get quite non-trivial analytical insights into statistics of the maximal value of CUE polynomial sampled over the full circle θ ∈ [0, 2π], or over its mesoscopic sub-intervals. This was further used to conjecture the associated properties of the modulus of the Riemann zeta-function along the critical line. Some relations between between CUE characteristic polynomials and logarithmically correlated processes (in the form of the so-called "multiplicative chaos" measures introduced by Kahane, see [43] for a review) was recently rigorously verified in [46]. The case of GUE polynomials remained however outstanding.
It is our objective in this paper to provide two separate means of supporting the Prediction 1.1. First, we will provide careful and explicit, albeit in part heuristic, analytical arguments. Although our technique is inspired by the approach of [22] it contains new nontrivial features necessary to overcome challenges arising from the non-uniform eigenvalue density ρ(x) reflecting absence of translational invariance for the GUE at the global spectral scale (note e.g. the non-trivial recentering in (1.2)). All this makes actual calculation for the GUE much more involved in comparison to the CUE and the limiting random variable u above appears to be more complicated than its CUE counterpart. Secondly, we will test our prediction with numerical experiments for matrices of size N = 3000 and around 250, 000 realizations. This is especially important as part of our analysis is based on very plausible but as yet not fully rigorous considerations. Finally, is natural to expect that the same distribution should be shared by the maximum modulus of characteristic polynomials for Hermitian random matrices with independent entries taken from the so-called Wigner ensembles, see [18].
Before giving the detail of our procedure in the next section we need to quote the following fundamental asymptotic result obtained by Krasovsky [34] which will be central for our considerations: and G(z) is the Barnes G-function. Differentiating with respect to α, we deduce that The most salient feature of the asymptotics (1.12) is the product of differences on the second line, which when rewritten in the form can be looked at as another evidence of the limiting Gaussian process with logarithmic correlations at the background, cf. (1.10). We will however see below that naively replacing the (shifted) log |p N (x j )| with the corresponding 1/f noise (1.9) is not a valid approximation as the factors (1 − x 2 j ) α 2 j /2 in (1.12) do play an essential role in determining the extreme value statistics of |p N (x j )|. Curiously, in a simpler case of CUE polynomial a similar replacement with the periodic noise (1.8) does actually work heuristically and yields the correct functional form of the distribution of maximum.
Acknowledgements: We are grateful for helpful comments from Christian Webb during the preparation of this manuscript. We acknowledge support from EPSRC grant EP/J002763/1 "Insights into Disordered Landscapes via Random Matrix Theory and Statistical Mechanics".

2.
Statistical mechanics approach to the distribution of GUE characteristic polynomials.
Following the ideas of [22] we recast the problem of computing the value of a global maximum of |p N (x)| (with an appropriate shift by the mean value) as a statistical mechanics problem characterized by the partition function , inverse temperature β > 0 and β-independent non-negative parameter q. Specifically, if we define the associated "free energy" as F (β) = −β −1 log Z N (β), then Note that if compared to a similar partition function for CUE case the main new feature in 2.1 is the factor (ρ(x)) q . Although naively presence of such a factor may seem irrelevant when taking β → ∞ limit, we will actually see that it plays a very important role in supporting our procedure of extracting the free energy for β exceeding some critical value. Now we aim to compute the integer moments of the partition function defined in (2.1): In the limit N → ∞ the leading asymptotics of the above integral can be extracted by replacing the factor E k j=1 |p N (x j )| 2β with its asymptotics from (1.12). In this way one obtains After changing variables x j = 2y j − 1 the integral above assumes the form with the quantity S k (a, b, −γ) being the well-known Selberg integral [19]: It is easy to see that the found expression for the partition function moments E(Z k β ) in (2.5,2.4) is well-defined for any 0 < γ = β 2 < 1 and for an integer k satisfying 1 < k < γ −1 .
To understand how to deal with the case k > 1 β 2 , we recall that Krasovsky's asymptotic formula (1.12) is valid only when all of the differences |x i − x j | remain finite when N → ∞, and should be replaced by a different expression when |x i − x j | ∼ N −1 . One can check that the divergence of the integral for k > 1/β 2 is due precisely to the fact that these near degeneracies become important. Relying on our experience with the corresponding situation for the CUE [22] case suggests that taking into account the correct short-scale cutoff cures the formal divergence, but changes the asymptotics of the moments E(Z k β ) with N: namely, these become of the order of N 1+k 2 β 2 for k > β −2 whereas they are of the order of N (1+β 2 )k for k < β −2 . Such a change of behaviour will lead to a log-normal (far) tail in the distribution. Note that for CUE case the above behaviour conjectured in [22] was validated by recent rigorous calculation [10]. There is no doubt that the same mechanism is operational in present case as well and will be validated by extending the theory of [10] from Toeplitz to Hankel case. Actually, as argued in [20] the moments with k > β −2 play only secondary role when addressing the question of extreme value statistics which is controlled exclusively by moments with 1 < k < β −2 . Our next goal is to use the latter integer moments for restoring the associated part of the probability density P(Z β ) for the partition function. This will be achieved if we manage to find the distribution for a random variable z β whose positive integer moments are given by Such a task actually requires finding a way to continue analytically those moments to complex k. Below we will arrive to the required continuation by exploiting a relatively simple heuristic procedure suggested in [24]. Note that in a series of insightful papers [38,39,40,41] Ostrovsky developed a rigorous mathematical procedure of the required continuation which provides an aposteriori justification of the results obtained via the heuristic approach.
Let us now define a function M where A = −1, B 1 = 2β 2 , B 2 = 2(a 1 + b 1 + 1) + β 2 (2a 2 + 2b 2 − 1) and C = (β 2 − 1/2). Then a straightforward computation which relies on the identity following from (2.18) shows that the ratio reproduces the right-hand side of (2.16) from which we conclude which finally implies that β (0) with (see (2.6) and (2.9)) 2.2. Duality and the freezing transition. The pair (2.3)-(2.23) solves the problem of finding the complex moments M β (s) = E(z 1−s β ) of the random variable z β for any complex s, and β < 1. Knowledge of such moments can be used to restore the probability function of z β , hence of the partition function Z β , and of its logarithm ( the free energy). Our goal is however to study the limit of the latter as β → ∞ and one therefore should have a way of extracting information on the distribution for β > 1. In doing this we rely on the freezing transition scenario for logarithmically correlated random landscapes. The background idea of such scenario goes back to [9] and was further advanced and clarified in the series of works [20,24,25,26]. In brief, this scenario predicts a phase transition at the critical value β = 1 and amounts to the following principle: Thermodynamic quantities which for β < 1 are self-dual functions of the inverse temperature β, that is remain invariant under the transformation β → β −1 , retain for all β > 1 the value they acquired at the point of self-duality β = 1. Although such scenario is not yet proven mathematically in full generality and has a status of a conjecture supported by physical arguments and available numerics, recently a few nontrivial aspects of freezing were verified within rigorous probabilistic analysis, see e.g. [2,14,45] for efforts in this direction.
Within that scenario, one of the main outcomes of the analysis performed in [24] is that the self-dual object associated with the distribution of the partition function for logarithmically correlated landscapes is expected to be the appropriately defined Laplace transform: g β (y) = exp(−e βy Z β /Z e β ), (2.24) where Z e β is a typical scale of the partition function which is extracted from the integer moments and in our case can be chosen as ( c.f. (2.3), (2.6)) Moreover, defining the probability density p β (y) by p β (y) = −g ′ β (y) one can show that the double-sided Laplace transform for such a probability density is related to complex moments M β (s) of the scaled partition function via the following relation (see eq.(26) of [24]) Actually, as shown in [24] the freezing scenario implies that the variable y whose probability density is given by p β=1 (y) is precisely the fluctuating part of the height of the global minimum of the random potential which is our main object of interest. Our strategy therefore will be to check if self-duality holds for the right-hand side combination in (2.26) when we substitute there moments M β (s) from (2.23).
To that end we first use the identity which again follows from (2.18): to replace the factor G β (β(s − 1) + 1/β) in the denominator of (2.20) and further define β (s) It is easy to check that the self-duality of the right-hand side in (2.28) is equivalent to the self-duality of the functionM A direct inspection makes it clear that the self-duality is only possible if either a 1 = a 2 , b 1 = b 2 or a 1 = b 2 , b 1 = a 2 . For the GUE characteristic polynomials, we have a 1 = b 1 = 1/2, a 2 = b 2 = q/2 so that duality occurs only if q = 1. We therefore have to choose q = 1 to be able to rely upon the freezing scenario allowing to interpret the function p β=1 (y) calculated from its Laplace transform via (2.26) as the probability density for the (shifted) global minimum. Using (2.29) with a 1 = a 2 = b 1 = b 2 = 1 2 we get (2.31) The latter formula constitutes our main analytical result and finally leads to our Prediction 1.1.

Numerical study of the distribution of the maximum modulus of GUE characteristic polynomials
The purpose of this Section is to provide a numerical test of Prediction 1.1.

3.1.
Results. In Figure 3.1 we present a histogram of the recentered and rescaled maximum of the GUE characteristic polynomial, defined by  Table 1 we display values of the parameters c * N and s * N for the studied range of sizes N, as determined empirically from the mean and variance of the random variable u.
The observed decay with N is certainly consistent with asymptotic validity of our Prediction 1.1, though the convergence to asymptotic is too slow to make more definite claims. To resolve further decrease of the coefficients c * N and s * N would require much larger matrices and is computationally demanding.  Finally, we provide a numerical validation of the decomposition (1.6). In Figure 3.3 we plot the inverse Laplace transform of (1.7) obtained by a direct numerical evaluation of the integral in the Bromwich inversion formula for the Laplace transform. The positive and normalized curve clearly corresponds to a bona fide probability density of some real random variable u ′ .
3.2. Numerical method. The numerical evaluation of the maximum value (1.2) may be considered quite a non-trivial problem in its own right, for at least two reasons. Firstly, the characteristic polynomial p N (x) having zeros as the eigenvalues of H, displays O(N) oscillations in the spectral interval [−1, 1] with hugely varying peaks heights. This produces considerable clusterings of 'near-maxima' which may confuse any naive attempt to find the true maximum value. Secondly, the slow changing nature of the correction terms in Prediction 1.1, of order log(N) and log log(N)) respectively, require one to go to somewhat large matrices to resolve reasonable asymptotic behaviour. The problem is further compounded by the numerical instability of calculating determinants of such matrices. Our solution to these problems heavily relies on a sparse realization of matrices H due to Dumitriu and Edelman [16]. They discovered that the eigenvalues of GUE matrices H have the same joint probability density as those of the following real symmetric tri-diagonal matrix: where N (0, 2) is a normal random variable with mean 0 and variance 2. The sub-diagonal is composed of random variables χ 2n having the same density as χ 2 2n where χ 2 2n is a χ-square random variable with 2n degrees of freedom. To compute the maximum value of p N (x) = det(xI −H) = det(xI −H), we begin by exploiting the known asymptotic behaviour Further progress is now possible thanks to the fact that determinants of tri-diagonal matrices satisfy a linear recurrence relation. Furthermore, by an appropriate rescaling, the recursion computes determinants of all leading principal minors simultaneously, thus computing f j (x) for all j = 1, . . . , N in linear time. Now to find the maximum, we define a mesh M = {−1 + n/∆ : n = 0, . . . , 2∆} with ∆ ∼ 2N and evaluate f N (x) at each of the points in M. At those points where f N (x) is maximal the Matlab function 'fminbnd' is invoked to converge onto the global maximum. Figure 1.2 illustrates the complexity of the problem. Our algorithm is sufficiently precise to distinguish the true maximum (located at x ≈ −0.3 in red) from other possible candidates, e.g. x ≈ −0.7 as well as the thousands of other local maxima.