Superintegrability of matrix Student's distribution

For ordinary matrix models, the eigenvalue probability density decays rapidly at infinity; in other words, it has "short tails". This ensures that all the multiple trace correlators (multipoint moments) are convergent and well-defined. Still, many critical phenomena are associated with an enhanced probability of seemingly rare effects, and one expects that they are better described by "long-tail" models. In the absence of the exponential fall-off, the integrals for high moments diverge, and this could imply a loss of the (super)integrability properties pertinent to matrix and eigenvalue models and, presumably, to the non-perturbative (exact) treatment of more general quantum systems. In this paper, we explain that this danger to the modern understanding could be exaggerated. We consider a simple family of long-tail matrix models which preserve the crucial feature of superintegrability: exact factorized expressions for a full set of basic averages. It turns out that superintegrability can survive after an appropriate (natural and obvious) analytical continuation even in the presence of divergences, which opens new perspectives for the study of long-tail matrix models.


Introduction
In abstract theory, thermodynamical equilibrium is associated with the Gaussian distribution [1]. Still, it is well known that, in practice, many relevant random distributions do not behave this way: strong deviations from the average appear much more often than expected, and are suppressed by powers rather than exponentially. In other words, the true distributions often have "long tails". This is well known for earthquakes [2,3], stock trade [4], flicker (pink) noise [5], self-organized criticality [6,7] and many other examples, which can inspire the creation of entirely new paradigms [8,9].
In fact, this is well known even from the undergraduate course of experimental physics: the true distributions of random data obey the long-tail Student distribution rather than the classical Gaussian one. The explanation is also well known: since most results tend to lie around the peak, one can easily underestimate the true dispersion and get the illusion that it is small. Then, with more measurements, one observes stronger deviations from the putative mean than originally anticipated, and they seem not to be exponentially damped. The reason is that the true dispersion is bigger, but one needs (exponentially) more data to see this. In formal language, the reason is that one actually measures not the true Gaussian distribution, but that of the ratio of Gaussian distributed quantities, which is characterized by another distribution, the Student distribution [10]. It is power-like for any number n of measurements, and tends to the Gaussian distribution only when n → ∞. The Student distribution is therefore a nice representative of the long-tail class. Still, it is deduced from the Gaussian one, and one can expect that the more sophisticated (or more abstract) properties will survive.
In this letter, we explain that this expectation is not far from the truth: the Student distribution remains superintegrable, i.e. all the correlators are given by nicely factorized rational functions.
At the same time, these functions can be negative! Of course, this is impossible for physically meaningful quantities: once the underlying Gaussian hidden variables are restored, both the poles and the negativity disappear. Nevertheless, for the formal distribution this does happen: integrals in the non-physical domain formally start to diverge, and an analytical continuation preserves the factorization at the expense of positivity, i.e. it reveals some other layer of physical reality and requires a new interpretation. Most importantly, however, we demonstrate that there is no contradiction between long tails and modern non-perturbative quantum theory, where the invariance of exact functional integrals under an arbitrary change of integration variables (fluctuating fields) manifests itself in Ward identities, which often give rise to integrability properties of effective actions.
We will mainly consider matrix Student distributions in order to have more degrees of freedom, so that the theory is closer to real complicated systems. Long-tail distributions of this kind were also studied previously in various applications, from financial markets to very theoretical string issues, see e.g. [11][12][13][14][15][16].

Student distribution
The density of the Student distribution [10] is given by the formula
$$d\mu(z) \propto \frac{dz}{(1+z^2)^{\alpha}} \qquad (1)$$
and it is evidently "long-tail" as compared to the Gaussian distribution $e^{-x^2}dx$. Nevertheless, in the simplest case $\alpha = 1$ it is nothing but the distribution of the ratio $z = X/Y$ of two Gaussian variables, X and Y:
$$\frac{dz}{1+z^2} \propto dz \int dX\, dY\, e^{-X^2-Y^2}\, \delta\!\left(z - \frac{X}{Y}\right) \qquad (2)$$
and the long tail arises because the denominator can be much smaller than the numerator, while both remain near the Gaussian peaks and are not exponentially damped. Indeed, z exceeds some large value $z_0$ whenever $Y < 1/z_0$, and the integral $\int_0^{1/z_0} e^{-Y^2} dY \sim 1/z_0$ decays only as a power of $z_0$. More general is the situation when Y is an n-component vector, a collection of n Gaussian variables, while $z = X/|Y|$ with $|Y|^2 = \sum_i Y_i^2$. Then the degree of the power-like distribution changes, but the long-tail feature persists:
$$\frac{dz}{(1+z^2)^{\frac{n+1}{2}}} \propto dz \int dX\, d^nY\, e^{-X^2-|Y|^2}\, \delta\!\left(z - \frac{X}{|Y|}\right) \qquad (3)$$
i.e. $\alpha = \frac{n+1}{2}$ in (1).
The Student distribution plays a great role in experimental sciences because it naturally arises when one tries to estimate the mean value of a Gaussian random variable $x \sim N(\nu, \sigma)$ from a sample of its measurements $x_0, \ldots, x_n$, but does not know the dispersion σ in advance. Then, since most results are near the Gaussian peak, it is easy to underestimate the true dispersion, before the truly rare events show up and demonstrate that the dispersion is bigger than it originally seemed. At a more formal level, let $\bar x$ and $S$ be the average and the mean quadratic deviation defined from the sample. The question is how the variable $z := \sqrt{n}\,\bar x / S$ is distributed. For simplicity, we put the mean value ν of the Gaussian distributed variable x to zero. Then the relevant density dµ(z) is given by the integral of the Gaussian weights of all the measurements with a δ-function fixing the value of z. One can perform an orthogonal change of integration variables such that the first integration variable becomes proportional to $\bar x$. Denoting by Y the set of x-variables orthogonal to this $X \sim \bar x$, one returns to the calculation (3). Note that σ drops out from the final formula.
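The statement that the ratio of a Gaussian variable to the length of an n-component Gaussian vector has a power-like density can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the Gaussian weights to be $e^{-X^2-|Y|^2}$, uses the fact that the δ-function and the angular integration reduce the integral to a one-dimensional radial integral of $r^n e^{-r^2(1+z^2)}$, and compares its z-dependence with $(1+z^2)^{-(n+1)/2}$. The function names are hypothetical.

```python
import math

def radial_integral(z, n, upper=10.0, steps=20000):
    """Simpson estimate of the integral of r**n * exp(-r**2 * (1 + z**2)) over [0, upper].

    Assumption: after integrating out X with the delta-function (Jacobian |Y|)
    and the angles of Y, the density of z = X/|Y| is proportional to this integral.
    """
    c = 1.0 + z * z
    h = upper / steps
    f = lambda r: r**n * math.exp(-c * r * r)
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def ratio_density_shape(z, n):
    """Unnormalized density of z = X/|Y|, divided by its value at z = 0."""
    return radial_integral(z, n) / radial_integral(0.0, n)

if __name__ == "__main__":
    for n in (1, 2, 5):
        for z in (0.5, 1.0, 3.0):
            closed_form = (1.0 + z * z) ** (-(n + 1) / 2)
            print(n, z, ratio_density_shape(z, n), closed_form)
```

The two printed columns should agree to quadrature accuracy, which illustrates that the power of the tail is fixed by the number n of Gaussian components in the denominator.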
Since the Student distribution does not decay fast enough at infinity, $d\mu(z) \sim O(|z|^{-2\alpha})$ as $|z| \to \infty$, its moments $\langle z^n \rangle = \int d\mu(z)\, z^n$ do not converge starting from a certain n. Therefore one needs extra considerations to regularize/define the divergent integrals. One of the ways to do this is through some matrix generalization of (1), which we introduce in the next Section. We discuss various subtleties and implications of this approach in Section 6.
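For the one-variable density the convergent moments can be written through Gamma functions, and the same expression provides a natural candidate for the continuation of the divergent ones. The sketch below assumes the unnormalized weight $(1+z^2)^{-\alpha}$, for which $\langle z^{2k}\rangle$ converges only when $2k < 2\alpha - 1$ and equals $\Gamma(k+\tfrac12)\Gamma(\alpha-k-\tfrac12)/\big(\Gamma(\tfrac12)\Gamma(\alpha-\tfrac12)\big)$ by the Beta-function integral. The quadrature check is restricted to cases where the transformed integrand is bounded; the function names are hypothetical.

```python
import math

def even_moment(k, alpha):
    """<z^(2k)> for the weight (1+z^2)^(-alpha), via the Beta-function formula.

    Outside the convergence region 2k < 2*alpha - 1 this is the analytic
    continuation in alpha; math.gamma raises ValueError exactly at the poles.
    """
    return (math.gamma(k + 0.5) * math.gamma(alpha - k - 0.5)
            / (math.gamma(0.5) * math.gamma(alpha - 0.5)))

def simpson(f, a, b, steps=2000):
    """Composite Simpson rule with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def even_moment_quadrature(k, alpha):
    """Same moment by direct integration after the substitution z = tan(theta).

    Only implemented for 2*alpha - 2*k - 2 >= 0, where the integrand
    sin^(2k) * cos^(2*alpha - 2k - 2) is bounded on [0, pi/2].
    """
    p = 2 * alpha - 2 * k - 2
    if p < 0:
        raise ValueError("integrand is unbounded or the moment diverges")
    cos = lambda t: max(math.cos(t), 0.0)  # guard against tiny negative rounding
    num = simpson(lambda t: math.sin(t) ** (2 * k) * cos(t) ** p, 0.0, math.pi / 2)
    den = simpson(lambda t: cos(t) ** (2 * alpha - 2), 0.0, math.pi / 2)
    return num / den

if __name__ == "__main__":
    print(even_moment(1, 3.0), even_moment_quadrature(1, 3.0))  # convergent case
    print(even_moment(1, 1.0))  # continued value of a divergent integral
```

In the second printed line the integral itself diverges, while the continued expression is finite and negative, which is the one-variable shadow of the loss of positivity discussed later in the text.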

Matrix Student distribution
The matrix generalization of (1) is straightforward. It has even appeared in the literature [17], but with no relation to integrability and other hidden structures typical for matrix models. Let X and Y be rectangular real-valued matrices of sizes N × M and N × (N + n − 1) respectively. Consider the density function for their peculiar combination Z, in which A and B are constant square N × N and M × M matrices parameterizing the model, and the inverse square root entering it is taken in the sense of a quadratic form, not of a linear operator. Performing the integration over X with the help of the δ-function and taking into account the relevant Jacobian, one is left with an integral over Y that contains a determinant. Rewriting this determinant as an integral over M/2 auxiliary copies of complex Grassmann variables, one can perform the integral over Y, and the remaining Grassmann integral is taken by diagonalization. The result is the desired matrix generalization of the Student distribution.
In what follows, we concentrate on the simplest form of this matrix Student distribution, with M = N, when Z is a square matrix, $B = I$, $A = \frac{1}{a^2} I$ and $n = 2\alpha + 1 - 2N$. Moreover, we take Z to be a Hermitian matrix instead of an arbitrary real-valued matrix, the derivation of the integration measure being similar. The case of rectangular non-Hermitian Z, and of arbitrary matrices A and B, will be considered elsewhere.
For the unit A and B matrices, it is obvious that the Student matrix model depends only on the eigenvalues of the N × N square matrix-variable Z, and it possesses the usual properties of matrix models [18][19][20][21], of which we discuss just two: Ward identities [22][23][24][25] and superintegrability [26][27][28]. The third basic property, integrability [29,30], is trivially present, since it is a formal property of the one-matrix integral over the Hermitian matrix with an invariant measure, and the matrix model (12) is exactly of this type. These properties correspond to the introduction of sources/deformations in the partition function (12), but in two slightly different ways. Also, as we explain in Section 6, though the results are formally valid for all values of α and N, they should be treated with care to be physically sensible.

Ward identities
The measure of the simplest matrix Student distribution can be written in terms of eigenvalues, and the averages and the power-sum (multi-trace) correlators $C_{i_1,\ldots,i_m} := \langle \mathrm{Tr}\, Z^{i_1} \cdots \mathrm{Tr}\, Z^{i_m} \rangle$ are defined as usual, with the average $\langle f(Z) \rangle := \int f(Z)\, d\mu \big/ \int d\mu$. Inserting a suitably chosen set of full derivatives in the integrand, one obtains a set of Ward identities (16) for the correlators, labeled by an integer n and a multi-index I, where I denotes $i_1, \ldots, i_m$, and $\backslash i_s$ means deletion of the element $i_s$ from the multi-index. All integrals are defined in the sense of the principal value, which is important in order to eliminate possible contributions of the boundary terms at infinity. Actually, these constraints are as strong as the ones for the Gaussian Hermitian model: if one starts from the initial conditions $C_\emptyset = 1$, $C_1 = 0$, one can unambiguously find every correlator $C_{i_1,\ldots,i_m}$ in a finite number of steps. Specifically, to find every correlator with the sum of indices equal to, say, p, provided all the correlators up to and including degree p − 2 are already known, one needs to consider equations (16) with
$$n + |I| = p - 1 \qquad (17)$$
For instance, in order to find the simplest non-trivial correlators $C_2$ and $C_{1,1}$, one needs to consider two equations: those with n = 1 and I = ∅, and with n = 0 and I = [1].
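The mechanism by which total derivatives fix all the correlators recursively can be illustrated in a one-variable toy case. This is not the matrix identities (16), only an analogue under stated assumptions: for N = 1 and the weight $(1+z^2)^{-\alpha}$ (scale a = 1), integrating $\frac{d}{dz}\big[z^{k+1}(1+z^2)^{1-\alpha}\big]$ gives $(k+1)\langle z^k\rangle + (k+3-2\alpha)\langle z^{k+2}\rangle = 0$, so that every even moment follows from $\langle 1\rangle = 1$ in finitely many steps. The function name is hypothetical.

```python
from fractions import Fraction

def moments_from_ward(alpha, max_power):
    """Even moments <z^k>, k = 0, 2, ..., max_power, from the toy Ward recursion.

    Assumptions: one variable, weight (1+z^2)^(-alpha), <1> = 1, and odd moments
    vanishing by symmetry. Where the integrals diverge, the recursion is used as
    the definition of the continued moment.
    """
    alpha = Fraction(alpha)
    moments = {0: Fraction(1)}
    for k in range(0, max_power - 1, 2):
        denominator = 2 * alpha - k - 3
        if denominator == 0:
            raise ZeroDivisionError(f"<z^{k + 2}> has a pole at this alpha")
        # <z^(k+2)> = (k+1) / (2*alpha - k - 3) * <z^k>
        moments[k + 2] = Fraction(k + 1) / denominator * moments[k]
    return moments

if __name__ == "__main__":
    print(moments_from_ward(5, 6))
    print(moments_from_ward(1, 2))
```

Every moment obtained this way is a product of simple linear factors in α, and the first call returns only positive values, while the second returns a negative value for a formally divergent integral.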

Superintegrability
The fact that it is possible to define seemingly divergent matrix-model correlators of the long-tail Student model in an integrable way, i.e. in such a way that they are consistent with an infinite system of Ward identities which determines them unambiguously, is already surprising enough, and has interesting implications for the predictability of risks, avalanches and earthquakes. However, there is even more: the correlators of the model can be found once and for all; in other words, the model is superintegrable [28,31]. As usual, this requires a switch from $C_I$ to an appropriate basis in the space of correlation functions. Namely, the correlators of the Schur polynomials $\chi_\lambda$, which are peculiar linear combinations of the power-sum (multi-trace) correlators enumerated by the Young diagram λ, are simple factorized expressions, again in terms of the Schur polynomials with just the same λ, depending now on the matrix size N. In general, the answer (18) contains an extra contribution $P_\lambda(\alpha, N)$ as compared with the case of the short-tailed Hermitian Gaussian matrix model. In its definition, [...] denotes the integer part of a number, and $\delta_{m|2}$ is equal to 1 for even m and 0 for odd m. This contribution is, in fact, the product, over the subset of boxes of the Young diagram λ with coordinates (i, j) that belong to diagonals with odd content $c_{i,j} = i - j$, of the peculiar combination $2(\alpha - N - c_{i,j}) + 1$.
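To make the change of basis concrete, the Schur polynomials can be computed as functions of the power sums $p_k$ (the counterparts of $\mathrm{Tr}\, Z^k$) from the Jacobi-Trudi determinant $\chi_\lambda = \det h_{\lambda_i - i + j}$, with the complete homogeneous polynomials $h_k$ obtained from Newton's identity $k\,h_k = \sum_{i=1}^k p_i h_{k-i}$. The sketch below is a generic illustration with hypothetical function names, not code from the paper. Evaluating at $p_k = N$, which is assumed here to be the meaning of a Schur polynomial "depending on the matrix size N", gives the dimension of the corresponding GL(N) representation.

```python
from fractions import Fraction

def complete_homogeneous(p, degree):
    """h_0..h_degree from power sums p[1..degree] via k*h_k = sum_i p_i*h_(k-i)."""
    h = [Fraction(1)]
    for k in range(1, degree + 1):
        h.append(sum(p[i] * h[k - i] for i in range(1, k + 1)) / k)
    return h

def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return Fraction(1)
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def schur(partition, power_sum):
    """Schur polynomial chi_lambda at the power sums p_k = power_sum(k).

    `partition` is a non-increasing tuple of positive integers.
    """
    degree = sum(partition)
    p = [None] + [Fraction(power_sum(k)) for k in range(1, degree + 1)]
    h = complete_homogeneous(p, degree)
    get_h = lambda k: h[k] if k >= 0 else Fraction(0)
    size = len(partition)
    matrix = [[get_h(partition[i] - i + j) for j in range(size)] for i in range(size)]
    return det(matrix)

if __name__ == "__main__":
    N = 3
    for lam in [(1,), (2,), (1, 1), (2, 1)]:
        print(lam, schur(lam, lambda k: N))
```

For generic power sums the same function reproduces the familiar combinations $\chi_{[2]} = (p_1^2 + p_2)/2$ and $\chi_{[1,1]} = (p_1^2 - p_2)/2$.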
There are two straightforward, if a bit tedious, ways to prove the superintegrability formula (18). One way is via the Ŵ-representation [32] and its apparently simple form in the Schur basis [33]. Another way is via the determinant (Jacobi-Trudi) formula for the Schur polynomials and the use of orthogonal polynomials [34]. A third way to prove this formula, probably the easiest one, is to combine the determinant representation of the matrix model (which is due to its integrable properties, [30]) with that of the Schur polynomials, as was done in [33, sec.2.2].
Let us note that the matrix integral (12) lies in the class of Selberg-type matrix and eigenvalue models, i.e. those with logarithmic potentials, however, with a specific choice of the integration contours. These models are known [35][36][37] to possess all the basic properties of matrix models except superintegrability: the latter strongly depends on the details of the model, and, hence, one had to check it for the concrete matrix Student case.

Long tails and analytical continuation of factorized formulas
After the strongest simplifying property of superintegrability is established, one can wonder what the difference between the long-tail and Gaussian models is at all. It is, of course, in the physical meaning of the exactly calculable correlators: they are polynomial in the integration variables, and thus the long-tail correlators can diverge. This raises an interesting question of how superintegrability is reflected in the properties of physically meaningful, convergent correlators, but here we restrict ourselves to just a brief review of the problem.
The superintegrability formula (18) in the long-tail case contains a peculiar factor of $P_\lambda^{-1}$, which can cause poles in N: the averages seem to become infinite at certain $N = N^*$ and then become negative! This is particularly amusing for quantities that are intuitively strictly non-negative. Recall, however, that the relation to the Gaussian-distributed matrices in Section 3 implies that $\alpha = \frac{N + (N+n-1)}{2}$, where N and N + n − 1 are the dimensions of the matrix Y. Moreover, the second dimension M should be at least as big as the first one: otherwise, the rank of Y is not sufficient to correctly extract the square root and perform the inversion in (7). Putting $\alpha = N + \frac{n-1}{2}$, one finds that there are no longer poles in N, and the sign of the average of a strictly non-negative quantity no longer changes as N changes. There is still some denominator, which, however, becomes positive and finite starting from some finite n; in this example n > 2. This denominator is the only reminder that the model in question has long tails. In other words, positivity is not required, and is not preserved, as soon as one deals with α as a free parameter, fully independent of N. This once again highlights the importance of finding the right physical degrees of freedom for the model in question. And, as often happens with matrix models, the choice can depend on whether we need physical or mathematical predictions. The power of matrix model theory should be used with attention and care.
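The interplay between the pole, the sign change and the substitution for α can be followed explicitly in a one-variable toy case. The derivation below is only a sketch under stated assumptions (N = 1, weight $(1+z^2)^{-\alpha}$ with the scale a set to 1) and is not the matrix example of the text; it uses nothing but the Beta-function integral.

```latex
% One-variable toy case: N = 1, weight (1+z^2)^{-\alpha}, scale a = 1 (assumption)
\langle z^2 \rangle
  = \frac{\int_{-\infty}^{\infty} z^2\,(1+z^2)^{-\alpha}\,dz}
         {\int_{-\infty}^{\infty} (1+z^2)^{-\alpha}\,dz}
  = \frac{B\!\left(\tfrac{3}{2},\,\alpha-\tfrac{3}{2}\right)}
         {B\!\left(\tfrac{1}{2},\,\alpha-\tfrac{1}{2}\right)}
  = \frac{1}{2\alpha-3}

% Substituting \alpha = N + \frac{n-1}{2} at N = 1:
\langle z^2 \rangle \Big|_{\alpha=\frac{n+1}{2}} = \frac{1}{n-2}
```

With α treated as a free parameter, the continued average has a pole at α = 3/2 and is negative below it. Once α is tied to the number n of underlying Gaussian variables, the same expression is positive and finite for n > 2, in line with the bound quoted above.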

Conclusion
Our conclusion in this paper is that the long-tail distributions can remain as simple and structured as the exponentially damped ones, and their non-trivial phase structure can remain well under control. In particular, various basic properties, from a set of mutually consistent Ward identities (giving rise to the W-representation, [33,38]) to integrability and further superintegrability, do survive, at least in the simplest long-tail example, that is, the matrix Student distribution. Remarkably, the long-tail-inspired divergence of integrals for the correlators, which exhibit integrability in the most straightforward way, is easily avoided by an appropriate analytical continuation beyond the physical domain, where positivity can be traded for the preservation of superintegrability.
It remains to understand how general this result is, and to extend it beyond the Student distribution analyzed in this paper, which is long-tail, but still has an apparent Gaussian distribution in the background. Obvious next steps are to consider β- and (q, t)-deformations, as well as generalizations to monomial non-Gaussian and tensor models.