Riemannian Gaussian distributions, random matrix ensembles and diffusion kernels

We show that the Riemannian Gaussian distributions on symmetric spaces, introduced in recent years, are of standard random matrix type. We exploit this to compute analytically marginals of the probability density functions. This can be done fully, using Stieltjes-Wigert orthogonal polynomials, for the case of the space of Hermitian matrices, where the distributions have already appeared in the physics literature. For the case where the symmetric space is the space of $m \times m$ symmetric positive definite matrices, we show how to compute efficiently the normalization constants and marginals by evaluating Pfaffians at specific values of $m$. Equivalently, the same result can be obtained by constructing specific skew-orthogonal polynomials with respect to the log-normal weight function (skew Stieltjes-Wigert polynomials). Other symmetric spaces are studied, and the same type of result is obtained for the quaternionic case. Moreover, we show that the probability density functions are a particular case of diffusion reproducing kernels of the Karlin-McGregor type, describing non-intersecting Brownian motions, which are also diffusion processes in the Weyl chambers of Lie groups.


Introduction
The statistical analysis of a probability measure Q on a differentiable manifold is a topic attracting much attention, and it has developed considerably due to the wide range of interesting applications that require the analysis of data not living in ordinary Euclidean spaces. There is great interest, for example, in processing datasets which lie in the space $P_m$ of $m \times m$ symmetric positive definite matrices.
This space can be equipped with a Riemannian metric, giving it the structure of a Riemannian homogeneous space of negative curvature. This Riemannian manifold has been thoroughly studied from many points of view, corresponding to different disciplines, such as harmonic analysis and number theory [1,2], matrix analysis [3], and information geometry and multivariate statistics [4,5].
This and other symmetric spaces have been studied endowed with metrics such as the Rao-Fisher metric or the log-Euclidean metric [6], in order to develop the tools to properly carry out statistical inference of data on such spaces. The landscape of applications is very vast as it includes, for example, medical imaging [7,8], continuum mechanics [9], radar signal processing [10,11] and computer vision [12][13][14][15][16], among many other references that could be quoted here.
In the present paper, the Riemannian metric will be the Rao-Fisher metric, also known as the affine-invariant metric, and is the subject of Section 1.1. A specific probabilistic model to represent the statistical variability of data in non-Euclidean spaces, such as $P_m$, has been developed by introducing the so-called Gaussian probability distributions on Riemannian symmetric spaces [17][18][19][20].
The Riemannian Gaussian distribution $G(\bar{Y}, \sigma)$ depends on two parameters, $\bar{Y} \in P_m$ and $\sigma > 0$. The form of its probability density function, generalising that of a Gaussian distribution on the Euclidean space $\mathbb{R}^p$, is given by
$$p(Y \,|\, \bar{Y}, \sigma) = \frac{1}{\zeta(\sigma)} \exp\left( -\frac{d^2(Y, \bar{Y})}{2\sigma^2} \right), \qquad (1)$$
where $d : P_m \times P_m \to \mathbb{R}_+$ is Rao's Riemannian distance (defined below), and the density is with respect to the Riemannian volume element of $P_m$, denoted by $dv(Y)$. In comparison to a Gaussian distribution on $\mathbb{R}$, the normalising factor $\zeta(\sigma)$ plays the role of the factor $\sqrt{2\pi\sigma^2}$. Among other possible applications, the distributions provide a statistical foundation for the concept of Riemannian centre of mass, which plays a major role in many contexts [21,22]. They also play a role in machine learning when the dataset is inherently structured [23][24][25].
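As a concrete numerical illustration of the density above, Rao's distance between two points of $P_m$ can be computed from the generalized eigenvalues of the pair of matrices. The sketch below assumes the affine-invariant normalization $d^2(Y, \bar{Y}) = \sum_i (\log \lambda_i)^2$, with $\lambda_i$ the eigenvalues of $\bar{Y}^{-1} Y$; the function names are ours, and the density is left unnormalised (dividing by $\zeta(\sigma)$ would normalise it).

```python
import numpy as np
from scipy.linalg import eigvalsh

def rao_distance(Y, Ybar):
    # d(Y, Ybar) = || log(Ybar^{-1/2} Y Ybar^{-1/2}) ||_F, computed via the
    # generalized eigenvalues lambda_i of the pencil (Y, Ybar).
    lam = eigvalsh(Y, Ybar)
    return np.sqrt(np.sum(np.log(lam) ** 2))

def gaussian_density_unnorm(Y, Ybar, sigma):
    # Unnormalised Riemannian Gaussian density exp(-d^2 / (2 sigma^2));
    # dividing by zeta(sigma) would make it a density w.r.t. dv(Y).
    return np.exp(-rao_distance(Y, Ybar) ** 2 / (2 * sigma ** 2))

print(rao_distance(np.diag([np.e, 1.0]), np.eye(2)))  # log-eigenvalues (1, 0), distance 1
```

A quick consistency check is affine invariance, $d(A Y A^{\top}, A \bar{Y} A^{\top}) = d(Y, \bar{Y})$ for any invertible $A$, which is the invariance underlying the Rao-Fisher metric.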
In addition to the relationship between Riemannian Gaussian distributions and the concept of Riemannian centre of mass and the potential for applications, discussed in [17][18][19][20], we shall show here that one of the hallmarks of these distributions is analytical tractability. In particular, we will show how the obtained distributions are of standard random matrix type and how the ensuing analytical tools associated to the study of random matrices can be used effectively to further characterize the distributions analytically [26][27][28].
More specifically, we shall use Pfaffian and determinant descriptions of the random matrix ensembles and the classical orthogonal polynomial method in random matrix theory [26][27][28]. We do this in the setting which includes skew-orthogonal polynomials, necessary to solve models which do not correspond to the case of Hermitian matrices, such as the case of the real symmetric matrices, which is central for statistical applications.
We now briefly summarize the derivation of the model for the case of real symmetric matrices, along the lines of [17][18][19][20], to set up the context. Further details are in these references; from Section 2 onwards, we will focus on the analytical characterization of the joint probability density functions using random matrix theory.
1.1. The Rao-Fisher metric: distance, geodesics and volume. We overview the general setup, following closely the presentation in [18].
The Rao-Fisher metric [5,2]
$$ds^2(Y) = \mathrm{tr}\left[ \left( Y^{-1}\, dY \right)^2 \right] \qquad (2)$$
is a Riemannian metric on $P_m$. It induces a Riemannian distance $d : P_m \times P_m \to \mathbb{R}_+$, known as Rao's distance, in the standard way. When equipped with the Rao-Fisher metric, the space $P_m$ becomes a Riemannian manifold of negative sectional curvature [1,2]. Since it is also complete and simply connected, the distance between any two points of $P_m$ is given by the length of the geodesic connecting them.
The volume form on $P_m$ induced by the Rao-Fisher metric is [2]
$$dv(Y) = \det(Y)^{-\frac{m+1}{2}} \prod_{i \leq j} dY_{ij}, \qquad (3)$$
where the indices denote matrix entries. One can then consider the expressions of $ds^2(Y)$ and $dv(Y)$, given by (2) and (3), in terms of polar coordinates [27]. Then, for any class function $f : P_m \to \mathbb{R}$, the angular coordinates can be integrated out, obtaining the radial integration formula (4), with the constant $\omega_m$ given in (5) (see [29, Page 71]) in terms of the multivariate Gamma function $\Gamma_m$ of [29]. Formula (4) follows directly from [2, Proposition 3, Page 43] but has been written following the notation in [19].
For any $\sigma > 0$, let $f(Y|\sigma)$ be given by
$$f(Y|\sigma) = \exp\left( -\frac{d^2(Y, I)}{2\sigma^2} \right), \qquad (6)$$
where $I \in P_m$ is the $m \times m$ identity matrix. The normalization constant $\zeta(\sigma)$ is then defined as
$$\zeta(\sigma) = \int_{P_m} f(Y|\sigma)\, dv(Y), \qquad (7)$$
where $dv(Y)$ is the volume form (3). Then, integrating out the angular coordinates as in (4), one obtains the radial expression (8) for $\zeta(\sigma)$, with $\omega_m$ defined in (5).
We show how to use random matrix tools, in particular the evaluation of Pfaffians, to compute $\zeta(\sigma)$ analytically for several values of $m$, and also in the large $m$ limit. This extends some of the results in [17,18] and the other works discussing Gaussian distributions on Riemannian symmetric spaces, where the case $m = 2$ is evaluated analytically and the rest is carried out numerically, with Monte Carlo methods. Before that, we note that the corresponding model for Hermitian matrices, also discussed in [19], which is characterized by an analogous expression but with the $\sinh(|r_i - r_j|/2)$ term squared, is fully solvable with orthogonal polynomials. It has already been studied in detail in the physics literature, but we summarize and develop the results in the next Section, and we will study cases which are more difficult to treat analytically, such as (8), in Section 3 and in the Appendix. Finally, Section 4 contains a discussion of these statistical distributions from the point of view of diffusion processes.

Full solution of the case of Hermitian matrices
Let us define the function $z_\beta(\sigma)$, that we call the partition function, adopting the language of random matrix theory, as
$$z_\beta(\sigma) = \int_{\mathbb{R}^m} \prod_{1 \leq i < j \leq m} \left( 2 \sinh\left( \frac{|r_i - r_j|}{2} \right) \right)^{\beta} \prod_{i=1}^{m} e^{-c_\beta r_i^2/\sigma^2}\, dr_i. \qquad (9)$$
The parameter $\beta$ is called Dyson's index. While (9) makes sense for every $\beta > 0$, with a suitable choice of $c_\beta > 0$, we will be mainly interested in $\beta \in \{1, 2, 4\}$, corresponding to a random matrix ensemble with orthogonal, unitary and symplectic symmetry, respectively. To avoid cumbersome overall factors of $2^{m/2}$ in what follows, we adopt the normalization of [30] and choose $c_\beta$ as in (10), but we stress that other suitable choices exist, as for example $c_\beta = \beta/2$, and the present discussion does not depend on such a choice. Note that the variance with this normalization is $\frac{\sigma^2}{2 c_\beta}$, so in particular it is $\sigma^2$ for $\beta = 1$ but $\frac{\sigma^2}{2}$ for $\beta = 2$. For $\beta = 1$, comparing (9) with the radial expression (8) for $\zeta(\sigma)$, we find the relation (11) between $\zeta(\sigma)$ and $z_{\beta=1}(\sigma)$. The crucial fact we want to exploit is that $z_\beta(\sigma)$ can be written in fully standard random matrix form, meaning in terms of a Vandermonde determinant. This was done in [31], in the context of the study of vicious walkers and, later on, it was also crucial to study Chern-Simons gauge theories [32]. One can therefore use all the power of the traditional random matrix tools, such as determinantal expressions and the method of orthogonal polynomials. Denoting $x_i = e^{u_i}$ and using the identity
$$\prod_{1 \leq i < j \leq m} 2 \sinh\left( \frac{u_i - u_j}{2} \right) = \prod_{1 \leq i < j \leq m} (x_i - x_j) \prod_{i=1}^{m} x_i^{-(m-1)/2}, \qquad (12)$$
we see that the sinh interaction becomes a Vandermonde determinant (13). Changing variables $r_i = u_i - \frac{\sigma^2 \beta}{4 c_\beta}\left( m - 1 + \frac{2}{\beta} \right)$ to complete the square, it follows that $z_\beta(\sigma)$ takes the form (14), which describes the partition function of the Stieltjes-Wigert random matrix model. Note that this model is of the standard random matrix form but with a weight function of log-normal type, $w(x) \propto e^{-c_\beta (\log x)^2/\sigma^2}$. Random matrix theory gives formulas for any marginal of the joint probability density function of eigenvalues, including the normalization constant, in terms of the polynomials orthogonal with respect to the log-normal weight $w(x)$ above (this holds for the case $\beta = 2$; the other cases are more complicated, as they require skew-orthogonal polynomials). These polynomials are the Stieltjes-Wigert polynomials [33], which are essentially a type of q-deformed Hermite polynomials.
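The algebraic step that converts the sinh interaction into a Vandermonde determinant can be verified numerically. The sketch below checks the form of the identity we use, $\prod_{i<j} 2\sinh((u_i-u_j)/2) = \prod_{i<j}(x_i-x_j)\,\prod_i x_i^{-(m-1)/2}$ with $x_i = e^{u_i}$, on random data (our rendering of the identity; the published (12) may differ by an overall convention):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
u = rng.normal(size=m)
x = np.exp(u)

# Left: hyperbolic "Vandermonde" of the sinh-kernel matrix model.
lhs = np.prod([2 * np.sinh((u[i] - u[j]) / 2)
               for i in range(m) for j in range(i + 1, m)])

# Right: ordinary Vandermonde in x_i = e^{u_i}, times a product of powers
# which is later absorbed by completing the square in the Gaussian weight.
vdm = np.prod([x[i] - x[j] for i in range(m) for j in range(i + 1, m)])
rhs = vdm * np.prod(x ** (-(m - 1) / 2))

print(np.isclose(lhs, rhs))  # True
```

The pairwise factorisation $2\sinh((u_i-u_j)/2) = (x_i - x_j)(x_i x_j)^{-1/2}$ makes the identity hold termwise, for any ordering of the $u_i$.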
A Riemannian log-normal probability distribution was already introduced in [34] (see [35] for a review) but is not of the random matrix type and not comparable to (14) due to an extra factor in the Vandermonde part.
Another way to see this q-deformed structure at play is to define the q-parameter as in (15) and introduce the q-number $[x]_q$ of (16), which reduces to an ordinary real number, $[x]_q \to x$, in the limit $q \to 1^-$. Then we have the expression (17), where $\tilde{r}_i = r_i/\sigma$. This latter form shows that the partition function $z_\beta(\sigma)$ is, up to an explicitly known overall $\sigma$-dependent factor, the q-deformation of the Gaussian $\beta$-ensemble, for every $\beta > 0$, with $q$ as in (15). We now study the partition function $z_\beta(\sigma)$. Before that, we briefly review the relation between random matrix ensembles and symmetric spaces in Subsection 2.1. Then we first discuss the case $\beta = 2$ in the rest of the present Section. The other two cases, $\beta = 1$ and $\beta = 4$, are studied in Section 3.
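For instance, with the symmetric convention $[x]_q = (q^{x/2} - q^{-x/2})/(q^{1/2} - q^{-1/2})$ (the precise convention of (16) may differ), the limit $[x]_q \to x$ as $q \to 1^-$ is immediate to check numerically:

```python
import numpy as np

def q_number(x, q):
    # Symmetric q-number [x]_q = (q^{x/2} - q^{-x/2}) / (q^{1/2} - q^{-1/2});
    # one common convention, which may differ from (16) in normalisation.
    return (q ** (x / 2) - q ** (-x / 2)) / (q ** 0.5 - q ** -0.5)

for q in (0.9, 0.99, 0.999):
    print(q, q_number(3.0, q))  # approaches 3 as q -> 1^-
```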
2.1. Cartan classification: from symmetric spaces to matrix ensembles. A correspondence between symmetric spaces and random matrix ensembles was established by Altland and Zirnbauer [36] and used to classify the latter in terms of the former (see [37,38] for an overview). These results have been recently extended to include double-coset spaces [39]; that setup seems especially well-suited for applications in statistical analysis. As we have seen above, when dealing with invariant joint probability distributions, the integrals only depend on the eigenvalues $(x_1, \ldots, x_m)$ of the random matrix. The Jacobian coming from this reduction is the Vandermonde determinant [26]:
$$\prod_{1 \leq i < j \leq m} |x_i - x_j|^{\beta}. \qquad (18)$$
For matrix ensembles of the transfer matrix type (following the terminology of [38]), the Jacobian is naturally
$$\prod_{1 \leq i < j \leq m} \left| 2 \sinh\left( \frac{x_i - x_j}{2} \right) \right|^{\beta}, \qquad (19)$$
which is recast in the form (18) with an exponential change of variables. For circular ensembles, on the other hand, one gets [40]
$$\prod_{1 \leq i < j \leq m} \left| e^{i\theta_i} - e^{i\theta_j} \right|^{\beta}, \qquad (20)$$
which is related to (18) through $x = e^{i\theta}$. One readily observes that (18), (19) and (20) are precisely the Jacobians of the passage to spherical coordinates in symmetric spaces of zero, negative and positive curvature, respectively. We are interested in ensembles of real symmetric, Hermitian or quaternion real Hermitian matrices, corresponding to $\beta = 1$, $\beta = 2$ and $\beta = 4$, respectively [26,27]. These three matrix algebras are realised as the tangent space at the origin of coset spaces of Cartan type A, yielding the classification summarized in Table 1. In conclusion, we single out three hyperbolic symmetric spaces, which correspond to random transfer matrix ensembles with $\beta \in \{1, 2, 4\}$. All of them are discussed in the following. Besides, we can go beyond Cartan type A: two examples of type D are provided in Appendix B.

2.2.
Moments of the log-normal distribution and the Stieltjes-Wigert orthogonal polynomials. In what follows we exploit the rewriting (14), which allows us to directly apply standard results from random matrix theory. Define the inner product
$$(f, g)_2 = \int_0^{\infty} f(x)\, g(x)\, \frac{e^{-(\log x)^2/\sigma^2}}{\sigma \sqrt{\pi}}\, dx \qquad (21)$$
for analytic real functions $f$ and $g$. We keep the dependence on $\sigma$ implicit in the notation. The inner product of two monomials is
$$(x^i, x^j)_2 = e^{\sigma^2 (i+j+1)^2/4}. \qquad (22)$$
For the cases $\beta = 1$ and $\beta = 4$ we will need to define skew-symmetric products but, for $\beta = 2$, $z_\beta(\sigma)$ can be evaluated exactly [31], thanks to the Stieltjes-Wigert polynomials, which are orthonormal polynomials with respect to the inner product $(\cdot, \cdot)_2$ defined in (21), that is, they satisfy the orthonormality relation (23). This family of polynomials is given explicitly in (24), where the q-parameter is defined in (25) (not to be confused with $q$ in (15)), and (26) is the q-binomial. Let us denote by $a_n$ the leading coefficient, that is $P_n(x) = a_n x^n + \ldots$, and define the monic orthogonal polynomials $p_n(x) = P_n(x)/a_n$ (27), which satisfy the orthogonality relation
$$(p_i, p_j)_2 = h_i\, \delta_{ij}. \qquad (28)$$
The main tool to evaluate $z_{\beta=2}(\sigma)$ exactly is the celebrated Andréief identity [45,46], which allows us to rewrite the matrix model (14) as a determinant:
$$z_{\beta=2}(\sigma) \propto m! \det_{1 \leq i,j \leq m} \left[ (x^{i-1}, x^{j-1})_2 \right]. \qquad (29)$$
In turn, the inner products of monomials inside the determinant can be replaced by $(p_{i-1}, p_{j-1})_2$, so that the matrix (of which we are taking the determinant) becomes diagonal in this basis, and we obtain [31] the closed expression (30), with $q$ defined in (25) and $\Gamma_q$ the q-Gamma function (31); the result is written out explicitly in (32). We plot $z_2(\sigma)$ as a function of $\sigma$ for various $m$ in Figure 1, and as a function of $m$ for various fixed $\sigma$ in Figure 2. The dependence on $m$ is better captured by $\log z_\beta$ rather than by $z_\beta$ itself. For this reason, while our discussion will focus on $z_\beta$, in the plots and numerical evaluations we will instead consider $\log z_\beta$.

2.3. Eigenvalue density, β = 2.
Let us introduce the eigenvalue density which, by definition, is obtained by integrating the joint probability density over $m - 1$ of the $m$ eigenvalues [26], Eq. (33). Clearly, from the invariance of the integrand of $z_2(\sigma)$ under permutations of the $r_i$'s, we can integrate over any $m - 1$ of the $m$ variables and obtain the same definition of $\rho_\beta(r; \sigma)$. Throughout this Subsection we consider $\beta = 2$. It is convenient to use the change of variables $x_i = e^{r_i + \sigma^2 m/2}$ as in (14), and to exploit the Stieltjes-Wigert polynomials [32]. We can use the Christoffel-Darboux formula to rewrite the eigenvalue density of any random matrix ensemble as [27, Eq. (5.13)] the expression (34), where $w(x)$ is the weight function, $p_n$ are the monic orthogonal polynomials with respect to the weight $w(x)$, and the prime means derivative with respect to $x$. In our case, $w(x) = \frac{e^{-(\log x)^2/\sigma^2}}{\sigma \pi^{1/2}}$ and the $p_n$ are the monic Stieltjes-Wigert polynomials introduced in the previous Subsection. We find the expression (35). Undoing the change of variables, using $\rho_2(x)\, dx = \rho_2(e^{r + \sigma^2 m/2})\, e^{r + \sigma^2 m/2}\, dr$, and multiplying the last expression by the Gaussian prefactor coming from the weight function in the Christoffel-Darboux formula (34), we find that $\rho_2(r; \sigma)$ takes the form (36), with the coefficients $c_k(q)$ obtainable from the expressions above. The upshot is that $\rho_2(r; \sigma)$ is the sum of $2m - 1$ Gaussian distributions, centered at the points $r_k$ given in (37). It is also easy to check that the coefficients $c_k(q)$ have alternating signs, thus the Gaussians centered at $r_k$ with $k$ odd interfere constructively with the other Gaussians at odd positions, and destructively with those at even positions. We therefore expect $\rho_2(r; \sigma)$ to be described by $m$ peaks with $m - 1$ valleys among them. Moreover, the support of the eigenvalue density grows linearly with the product $t \equiv m\sigma^2$. We plot the eigenvalue densities for various $\sigma$ in Figure 3, and in the small $\sigma$ and large $\sigma$ regimes in Figure 4.

2.4.
Limits of $z_\beta(\sigma)$. Before delving into a more detailed analysis of $z_\beta(\sigma)$ for $\beta \in \{1, 4\}$, it is instructive to analyze various limits for generic $\beta > 0$.

2.4.1. $\sigma \to 0^+$ limit. First, it is clear from (17) that sending $\sigma \to 0^+$ we recover the usual Gaussian $\beta$-ensemble, whose partition function is known exactly for every $\beta > 0$, see [27, Eq. (1.160)]. In particular, with our normalization we have the expression (38) for $m$ even (and a similar expression for $m$ odd), and (39). Comparing with formula (11), the limit (38) shows in particular that $\zeta(\sigma)$ goes to zero as $\sim \sigma^{m^2/2}$ in the small $\sigma$ limit, that is, for small variance in the dataset.
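The $\sigma \to 0^+$ reduction to the Gaussian ensemble can be illustrated numerically by comparing the two integrands for $m = 2$, $\beta = 2$ with Gauss-Hermite quadrature; for concreteness we take the weight $e^{-r^2/\sigma^2}$ (that is, $c_2 = 1$), which only affects overall scaling, and use $2\sinh(\sigma d/2) \approx \sigma d$ at small $\sigma$:

```python
import numpy as np

# Gauss-Hermite nodes/weights for integrals against e^{-t^2}
t, w = np.polynomial.hermite.hermgauss(80)
T1, T2 = np.meshgrid(t, t)
W = np.outer(w, w)

def ratio(sigma):
    # sinh-kernel partition integrand over Gaussian-ensemble integrand,
    # m = 2, beta = 2, after rescaling r_i = sigma * t_i (c_2 = 1 assumed).
    sinh_part = np.sum(W * (2 * np.sinh(sigma * (T1 - T2) / 2)) ** 2)
    gauss_part = sigma ** 2 * np.sum(W * (T1 - T2) ** 2)
    return sinh_part / gauss_part

for s in (0.5, 0.1, 0.02):
    print(s, ratio(s))  # -> 1 as sigma -> 0
```

The deviation from 1 is of order $\sigma^2$, consistent with expanding $\sinh$ to next order.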
2.4.2. $\sigma^2 \to \infty$ limit. Another possible limit is $\sigma^2 \to \infty$. We give the result taking first the large $\sigma^2$ limit and then approximating for large $m$. It is convenient to write $z_\beta(\sigma)$ in the form (41), where we have changed variables $\tilde{r}_i = r_i/\sigma^2$ and approximated $2\sinh(\sigma^2 |x|/2) \approx e^{\sigma^2 |x|/2}$ at large $\sigma^2$. Making the ansatz that $\tilde{r}_i$ grows as $\tilde{r}_i = m^{\alpha} s_i$ for some $\alpha > 0$ and $s_i$ of order 1, we see that a saddle point exists in the large $m$ limit if the two terms in the exponential in (41) are of the same order in $m$. The first term goes as $m^{1+2\alpha}$ and the second as $m^{2+\alpha}$, meaning that the large $m$ limit requires $m^{1+2\alpha} = m^{2+\alpha}$, that is, $\alpha = 1$. This implies that $\log z_\beta(\sigma) \propto \sigma^2 m^3$ at leading order at large $\sigma^2$ and large $m$.
We can also foresee the form of the sub-leading correction. Indeed, it will come from integrating over the fluctuations around the saddle point. The entries of the Hessian matrix from (41) are $\sim c_\beta \sigma^2$ and, performing the $m$ Gaussian integrals and taking this coefficient into account, we expect the sub-leading correction to be of order $\sim m \log \sigma^2$. Note, however, that (41) as it stands is not suitably written to study the saddle point equation. To get rid of the absolute value, we should restrict to the principal Weyl chamber in $\mathbb{R}^m$ and look for a saddle point therein. We do not pursue this approach, and instead quote the result found in [47], where a rigorous approach to this limit has been undertaken, based on work by Baxter [48]. One gets the asymptotic expression (42).

2.4.3. Planar limit. Another useful limit is the scaled large $m$ limit, with $m \to \infty$ and $\sigma^2 \to 0$ keeping their product $m\sigma^2 \equiv t$ fixed. In this limit, the leading contribution to $z_\beta$ comes from the saddle point configuration, and one finds the expression (43), where $F_{\mathrm{univ.}}(t)$ is a $\beta$-independent quantity, explicitly found by solving the saddle point equation. To obtain (43) we have used the normalization $c_\beta = \beta/2$, and the normalization (10) is recovered simply by shifting $t \to 2t$ in the $\beta = 4$ case. Both $F_{\mathrm{univ.}}(t)$ and the limiting eigenvalue density $\rho(r; t)$ are explicitly known: they were computed almost twenty years ago in the context of Chern-Simons theory [51]. We refer to [51] or the review [52] for a detailed analysis and formulas in this regime.
In [31] this scaled limit was studied and subdivided into three cases, according to the value of $m\sigma^2 \equiv t$ being finite, infinite or zero. In Section 4, we will discuss the relationship between the probability densities in [17][18][19] and diffusion processes, and how they appear in the study of Brownian motion in Weyl chambers of Lie groups.

The cases of real symmetric and quaternion Hermitian matrices
According to the results in Subsection 2.4, we can extract the behaviour of z β (σ) in certain limits from the knowledge of z 2 (σ). Nevertheless, we can do more, and evaluate exactly z 1 (σ) and z 4 (σ) using standard tools in random matrix theory.
3.1. Skew-symmetric products and β = 1 partition function. We discuss $z_{\beta=1}(\sigma)$, which is directly related to $\zeta(\sigma)$ through (11). In contrast to the case $\beta = 2$, the partition functions $z_{\beta=1}(\sigma)$ and $z_{\beta=4}(\sigma)$ do not correspond to determinants, but rather to Pfaffians. It is convenient to discuss separately the two cases of $m$ even and $m$ odd, starting with the former. Before that, we define the skew-symmetric products $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_4$ of [30] in (44)-(46). Note that the coefficient in the exponent of the weight function is $c_\beta$, as defined in (10).
Computing the skew-symmetric products of any pair of monomials $x^j$ and $x^k$ gives the expressions (47), where we have used the change of variables $x = e^r$. In (47), $\mathrm{erf}(x)$ is the error function, and in the last line we have used the integration formula for $\mathrm{erf}(x)$ against a Gaussian weight. The error function is odd, $\mathrm{erf}(-x) = -\mathrm{erf}(x)$, whence both products are manifestly skew-symmetric. (Footnote 5: The extension of Baxter's result [48] to arbitrary $\beta$, as we need, is straightforward. The statistics of extreme eigenvalues in this regime has been recently studied in [49,50].)
Assuming $m$ even, we use the de Bruijn identity [53] to write
$$z_{\beta=1}(\sigma) \propto \mathrm{Pf}\left[ \langle x^{i-1}, x^{j-1} \rangle_1 \right]_{1 \leq i,j \leq m}, \qquad (48)$$
with the skew-symmetric product $\langle x^{i-1}, x^{j-1} \rangle_1$ given in (47). For $m$ odd, instead, the de Bruijn identity [53] leads us to the Pfaffian of an $(m+1) \times (m+1)$ skew-symmetric matrix, Eq. (49). In both cases, we can expand the Pfaffian and obtain $z_{\beta=1}(\sigma)$, and hence $\zeta(\sigma)$, as a finite sum of terms, explicitly known from (47). The advantage of this approach is that the Pfaffians can be obtained explicitly at fixed $m$ with the aid of a computer algebra system. We provide numerical results at various $m$ and $\sigma^2$ in Tables 2 and 3, computed in Mathematica using the algorithm of [54]. Table 3: $-\log z_{\beta=1}$ for even $m$ from 2 to 40 and fixed $\sigma^2 = 0.01$; evaluation time 0.133s (whole table) on a MacBook.
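For small $m$, Pfaffians of this kind can be evaluated with a few lines of code. The sketch below (not the algorithm of [54], which is more efficient) expands along the first row and checks the defining property $\mathrm{Pf}(A)^2 = \det(A)$:

```python
import numpy as np

def pfaffian(A):
    # Pfaffian by Laplace-type expansion along the first row:
    # Pf(A) = sum_j (-1)^(j-1) A[0, j] Pf(A with rows/cols 0 and j removed).
    # Fine for small matrices; factorially expensive in general.
    n = A.shape[0]
    if n % 2 == 1:
        return 0.0
    if n == 0:
        return 1.0
    if n == 2:
        return A[0, 1]
    total = 0.0
    for j in range(1, n):
        rest = [k for k in range(1, n) if k != j]
        total += (-1) ** (j - 1) * A[0, j] * pfaffian(A[np.ix_(rest, rest)])
    return total

# Sanity check: Pf(A)^2 = det(A) for a random skew-symmetric matrix.
rng = np.random.default_rng(0)
B = rng.normal(size=(6, 6))
A = B - B.T
print(np.isclose(pfaffian(A) ** 2, np.linalg.det(A)))  # True
```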
From $z_{\beta=1}(\sigma)$ we immediately obtain $\zeta(\sigma)$, which we show in Figure 5 as a function of $\sigma^2$ for various $m$, and in Figure 6 as a function of $m$ for various fixed values of $\sigma^2$. We also show the agreement of $\log z_{\beta=1}$ with the theoretical predictions of Subsection 2.4 for small $\sigma$ and large $\sigma$ in Figure 7. While the Pfaffian representation of $z_{\beta=1}$ is very useful for computational purposes, the simple example $m = 2$ can be easily solved by direct integration, Eq. (50). We now observe that the integral of an error function against a Gaussian weight is again an error function and, after simplifications, we obtain the expression (51). This is a check of the result (48) for the smallest value of $m$. For higher values of $m$ a direct computation becomes more involved. However, from the Pfaffian representation, we infer that the result is always a finite sum of products of error functions, weighted by integer powers of $e^{\sigma^2/4}$. So, for example, de Bruijn's identity immediately gives the expression (52). A similar type of expression, involving a sum of error functions, is obtained for the Gaussian distribution on the Poincaré ball [23]. In that case, the Vandermonde term $\sinh(|r_i - r_j|/2)$ is replaced by the simpler term $\sinh(|r|/2)$.
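The key fact that the integral of an error function against a Gaussian weight is again an error function can be checked numerically. One standard form of this identity (our notation; constants in (47) and (51) will differ) is $\int_{-\infty}^{\infty} e^{-x^2}\, \mathrm{erf}(a x + b)\, dx = \sqrt{\pi}\, \mathrm{erf}\!\left( b/\sqrt{1+a^2} \right)$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import erf

def lhs(a, b):
    # Integral of erf(a x + b) against the Gaussian weight e^{-x^2}.
    val, _ = quad(lambda x: np.exp(-x ** 2) * erf(a * x + b), -np.inf, np.inf)
    return val

def rhs(a, b):
    # Closed form: the result is again a single error function.
    return np.sqrt(np.pi) * erf(b / np.sqrt(1 + a ** 2))

print(lhs(1.3, 0.7), rhs(1.3, 0.7))
```

Differentiating both sides with respect to $b$ reduces the identity to an ordinary Gaussian integral, which is how such formulas are proven.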
It would be interesting to explore whether the quick numerical evaluation of the normalization constant obtained here, based on Pfaffians, can be further exploited, given the relevance of the distribution to the engineering of algorithms for machine learning and the detection of structured collections of data on graphs [25], as well as in the study of auto-encoders [23,24], just to name a few applications, in addition to the ones discussed in [17,19,18,20].

3.2.
Eigenvalue density, β = 1. The eigenvalue density at $\beta = 2$ was obtained relying on the explicit knowledge of the Stieltjes-Wigert polynomials. For a closed expression of $\rho_{\beta=1}$, we would need the corresponding skew-orthogonal polynomials [26,30], which however are not known in closed form. To gain insight, let us first study the simplest case $m = 2$. The eigenvalue density is then obtained by direct integration, and the computations are identical to the ones at the end of the previous Subsection. We find the expression (54). This formula already shows the salient features of the $\beta = 1$ eigenvalue density: it is a sum of products of Gaussians (centered at different, shifted points) multiplied by error functions.
Even without a closed form for the skew-orthogonal polynomials, the eigenvalue density can still be calculated for any fixed $m$. For this, we need to introduce some notation. Let $p_n(x)$ be the monic skew-orthogonal polynomials of the Stieltjes-Wigert family, that is, they satisfy
$$\langle p_{2k}, p_{2\ell+1} \rangle_1 = h_k\, \delta_{k\ell}, \qquad \langle p_{2k}, p_{2\ell} \rangle_1 = 0 = \langle p_{2k+1}, p_{2\ell+1} \rangle_1, \qquad (55)$$
with the skew-symmetric product $\langle \cdot, \cdot \rangle_1$ defined in (45). These polynomials have an explicit Pfaffian representation [55,56]. To reduce clutter, let us denote throughout this Subsection by $z_1^{(m)}$ the quantity defined in (56): $z_1^{(m)}$ is just the partition function at $\beta = 1$ for a given value of $m$, with the overall factor coming from the definition (14) stripped off, to avoid cumbersome coefficients.
The constants $h_k$ in (55) can then be expressed as $h_k = z_1^{(2k+2)}/z_1^{(2k)}$. Moreover, we have the Pfaffian formula (57) for the polynomials of even degree $n = 2\ell$, and the formula (58) for those of odd degree $n = 2\ell + 1$. In the latter expression we have introduced the skew-symmetric matrix of [56], Eq. (59). We also need the functions $\phi_n(x)$ defined in (60). Explicit expressions of the first few skew-orthogonal polynomials and of the corresponding $\phi_n(x)$ functions are given in Appendix A.
Putting these definitions to work, we can obtain $\rho_{\beta=1}(x)$ as a particular case of [27, Proposition 6.3.3, Page 254]. We focus on the case of even $m$ for simplicity, but a similar expression exists for odd $m$. We get the expression (61). This expression allows us to characterize the eigenvalue density $\rho_{\beta=1}(r; \sigma)$, undoing the change of variables $x = e^{r - \sigma^2 (m+1)/2}$. Each $p_n(x)$ is a polynomial of degree $n$ in the variable $x$ (and hence in $e^r$), with coefficients given by ratios of sums in which each term is of the form $e^{\sigma^2 a/4}\, \mathrm{erf}(|\sigma| b/2)$, with $a$ and $b$ integers. In turn, if we write schematically
$$p_n(x) = \sum_{k=0}^{n} \alpha_{n;k}\, x^k, \qquad (62)$$
with $\alpha_{n;n} = 1$ by definition, we find
$$\phi_n(e^u) = \frac{1}{\sqrt{2}} \sum_{k=0}^{n} \alpha_{n;k}\, e^{\sigma^2 (k+1)^2}. \qquad (63)$$
Writing (61) in terms of the exponential variable $r$ instead of $x$, and putting everything together after some rewriting, we arrive at the expression (64). In the latter expression the coefficients $\alpha_{n;k}$ are read off from the skew-orthogonal polynomials $p_n(x)$. Since the formulas for both the polynomials and the normalizations $z_1^{(n)}$ are known, the challenge of obtaining the density $\rho_{\beta=1}(r; \sigma)$ is reduced to the task of evaluating Pfaffians. Moreover, once $\{p_n(e^u), \phi_n(e^u)\}_{n=0,\ldots,m-1}$ are found, it is possible to go beyond the eigenvalue density and compute any marginal distribution of the eigenvalues [26, Chapter 6], [27, Chapter 6].
We conclude that $\rho_{\beta=1}(r; \sigma)$ is a sum of terms that depend on $r$ through the product of a Gaussian and an error function in $r$. Comparing the results at $\beta = 2$ and $\beta = 1$, we see that each Gaussian term in $\rho_{\beta=2}(r; \sigma)$ is corrected by an error function in the $\beta = 1$ case. The same happens with the coefficients: the dependence $\sim e^{\sigma^2 a/2}$ in the $\beta = 2$ case is replaced by $\sim e^{\sigma^2 a/2}\, \mathrm{erf}(|\sigma| b/2)$ at $\beta = 1$.

3.3. β = 4 partition function.
We can apply the tools of Subsection 3.1 to study the partition function $z_{\beta=4}(\sigma)$ as well. This computes the normalization constant of the density considered in [57]. In this case, we must use a different de Bruijn identity [53], which allows us to write $z_{\beta=4}(\sigma)$ in terms of a Pfaffian with entries the skew-symmetric products given in (46). Note that the Pfaffian is that of a $2m \times 2m$ skew-symmetric matrix. We plot the result as a function of $\sigma^2$ in Figure 8, and as a function of $m$ for various fixed values of $\sigma^2$ in Figure 9. Moreover, we evaluate $\log z_{\beta=4}(\sigma)$ numerically in Tables 4 and 5. The determination of the density $\rho_{\beta=4}(r; \sigma)$ for arbitrary $m$ would entail constructing skew-orthogonal polynomials with respect to the skew-symmetric product $\langle \cdot, \cdot \rangle_4$, in analogy with the $\beta = 1$ case; however, these polynomials are not available in closed form. Figure 9: $\log z_{\beta=4}(\sigma)$ as a function of $m$ for $\sigma^2 = 0.5$. Table 5: $-\log z_{\beta=4}(\sigma)$ for $m$ from 1 to 20 and fixed $\sigma^2 = 0.02$; evaluation time 0.110s (whole table) on a MacBook.

Diffusion and kernel interpretation
The first instance where the probability density (1) appeared in physics seems to be the problem of vicious walkers in statistical physics [58]. The formulation of the vicious random walker problem on a lattice, according to the so-called lock-step model, consists in considering m random walkers, initially an even number of lattice spacings apart, on a one-dimensional lattice. Then, at regular time intervals, each walker takes either a step one lattice site to the left or a step one lattice site to the right, with equal probability 1/2. Coincidence of two walkers at the same site and time results in the annihilation of both walkers, and hence the process ends (whence the name vicious).
In the diffusive limit, this model describes a set of non-intersecting Brownian motions and one obtains the Gaussian Riemannian density (1) and also the ones corresponding to other symmetric spaces, as we shall see, by consideration of the classical, determinantal formalism of non-intersecting random walkers and non-intersecting Brownian motion, initiated by Karlin and McGregor in the seminal work [59] (see more recent work and textbooks [60][61][62] for a clear description, including a modern explanation of Dyson's Brownian motion [63]).
Given the relevance of diffusion processes in the statistical analysis of data [64,65], it is worth dwelling further on the diffusive origins and interpretations of the probability densities discussed above.
In [58], the probability density that all walkers survive at time $t$ is obtained, giving an expression in terms of the positions $(x_1, \ldots, x_m)$ which, again using the right-hand side of (12), is the probability density function of (9), up to a shift of variables. Therefore, this density is the solution of a diffusion process, and it satisfies a heat equation [31]. Interestingly, the problem of $m$ vicious walkers on a one-dimensional lattice has an equivalent description in terms of Brownian motion in the Weyl chamber of a Lie group [60,66]. All the densities discussed in this work emerge in a natural way in such a context. To see this, we quote a basic theorem [60], which states that if $c_t$ is the density function for a continuous stochastic process, then, for absorbing boundary conditions, it holds that
$$b_t(x, \eta) = \sum_{w \in W} \mathrm{sign}(w)\, c_t(x, w\eta). \qquad (67)$$
In this formula, $\eta = (\eta_1, \ldots, \eta_m)$ is the vector of initial positions of the $m$ particles, $b_t(x, \eta)$ is the probability density for the particles to be at $x$ given the initial positions $\eta$, and the sum is over the elements $w$ of the Weyl group $W$. These expressions are all determinantal, and this theorem applies to standard Brownian motion in the Weyl chambers of $A_{m-1}$, $B_m$ (and thus $C_m$), and $D_m$, with absorbing or reflecting boundary conditions. The measure for unconstrained standard Brownian motion is
$$c_t(x, \eta) = \prod_{i=1}^{m} N_t(x_i - \eta_i), \qquad (68)$$
where $N_t$ is the normal distribution function with mean 0 and variance $t$.
Consider the root system $A_{m-1}$, with associated Weyl group the symmetric group $S_m$. The principal Weyl chamber is characterized by $x_1 > x_2 > \cdots > x_m$. This models the Brownian motion of $m$ independent particles in one dimension. With absorbing boundary conditions, collisions are forbidden and the sum can be written as a determinant, which gives
$$b_t(x, \eta) = \det_{1 \leq i,j \leq m} \left[ N_t(x_i - \eta_j) \right]. \qquad (70)$$
This determinant gives the probability density for $m$ particles in independent Brownian motion, started at positions $(\eta_1, \ldots, \eta_m)$, to be at positions $(x_1, \ldots, x_m)$ at time $t$ without having collided.
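The Karlin-McGregor determinant is straightforward to evaluate numerically; the following sketch (our function names) implements it and checks two sanity properties: it reduces to the free heat kernel for $m = 1$, and it is positive inside the Weyl chamber while antisymmetric under swapping two particles:

```python
import numpy as np

def heat_kernel(y, t):
    # N_t: Gaussian density with mean 0 and variance t
    return np.exp(-y ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t)

def karlin_mcgregor(x, eta, t):
    # det over i, j of N_t(x_i - eta_j): density for m Brownian particles
    # started at eta to be at x at time t without any collision.
    x, eta = np.asarray(x, float), np.asarray(eta, float)
    return np.linalg.det(heat_kernel(x[:, None] - eta[None, :], t))

# m = 1 reduces to a single free Brownian particle:
print(np.isclose(karlin_mcgregor([0.5], [0.0], 1.0), heat_kernel(0.5, 1.0)))
# Positive for ordered configurations inside the chamber:
print(karlin_mcgregor([2.0, 1.0, 0.0], [2.0, 1.0, 0.0], 0.5) > 0)
```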
The expression (70) can be rewritten as (71). The $\eta_j$ are all integers and, if in addition we choose $\eta_j = m - j$, the determinant on the right-hand side is the Vandermonde determinant in the variables $e^{x_i/t}$, equal to
$$\prod_{1 \leq i < j \leq m} \left( e^{x_j/t} - e^{x_i/t} \right). \qquad (72)$$
Therefore, for this equal-spacing condition on the $\eta_j$, we have the relation (73), where $c_t(x)$ is given by (68). Therefore, because of (12), the density is that of a Gaussian distribution on the Riemannian space of symmetric positive definite matrices introduced in [18,17,57], and studied here in Section 3. The corresponding Gaussian density for the space of Hermitian matrices appears from the above when studying the probability of reunion, using for example the extensivity property of probabilities [66, Eq. (14)], which in turn is related to the reproducing property of the (determinantal) kernel (71).
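The reduction of the determinant to a Vandermonde form under the equal-spacing choice $\eta_j = m - j$ can be checked directly: expanding each Gaussian entry leaves a determinant with columns $e^{x_i \eta_j / t}$, which is a Vandermonde matrix in $y_i = e^{x_i/t}$. A numerical sketch (we compare absolute values, since sign conventions for the column ordering differ):

```python
import numpy as np

rng = np.random.default_rng(2)
m, t = 4, 0.8
x = np.sort(rng.normal(size=m))
y = np.exp(x / t)

# Columns e^{x_i eta_j / t} with eta_j = m - j (j = 1..m): Vandermonde in y_i.
M = np.array([[y[i] ** (m - j) for j in range(1, m + 1)] for i in range(m)])
det = np.linalg.det(M)
vdm = np.prod([y[j] - y[i] for i in range(m) for j in range(i + 1, m)])
print(np.isclose(abs(det), abs(vdm)))  # True, up to an overall sign convention
```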
For the root system $B_m$, the corresponding Weyl group is the hyperoctahedral group, which consists of permutations with any number of sign changes. Its Weyl chamber is $x_1 > x_2 > \cdots > x_m > 0$. This also models $m$ independent particles in one dimension, now with an additional wall at $x = 0$ [60]. In the Brownian motion setting at hand, the formalism gives
$$b_t(x, \eta) = \det_{1 \leq i,j \leq m} \left[ N_t(x_i - \eta_j) - N_t(x_i + \eta_j) \right]. \qquad (75)$$
This determinant gives the measure for $m$ particles which start at $(\eta_1, \ldots, \eta_m)$ to be at $(x_1, \ldots, x_m)$ at time $t$, neither having collided nor having touched $x = 0$. Choosing again $\eta_j = m - j$, the determinant in (75) can be written in a Vandermonde form [60]. Doing so, and after some rewriting, that part gives the expression (76), which is associated with the Jacobian for matrix integration on $\mathfrak{o}(2m+1)$. These non-intersecting Brownian motions are interwoven with many other areas. One of these relationships is the connection with the Harish-Chandra-Itzykson-Zuber integral [70,71,61]. For $\mathfrak{o}(m)$ and $\mathfrak{sp}(2m)$, this has been further developed in [72] where, in addition, it is proven that the density $b_t(x, \eta)$ satisfies a diffusion equation.
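For a single particle ($m = 1$), the $B$-type determinant reduces to the classical method of images for an absorbing wall at the origin; a minimal sketch, with our function names:

```python
import numpy as np

def heat(y, t):
    return np.exp(-y ** 2 / (2 * t)) / np.sqrt(2 * np.pi * t)

def absorbed_at_wall(x, eta, t):
    # Method of images / hyperoctahedral sum for one particle with an
    # absorbing wall at the origin: N_t(x - eta) - N_t(x + eta).
    return heat(x - eta, t) - heat(x + eta, t)

print(absorbed_at_wall(0.0, 1.0, 0.7))        # vanishes exactly at the wall
print(absorbed_at_wall(1.2, 1.0, 0.7) > 0.0)  # positive in the chamber x > 0
```

The subtracted "image" term enforces the absorbing boundary condition at $x = 0$, exactly as the sign-changed Weyl group elements do in the general formula.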
Likewise, for $D_m$, with Weyl group the even hyperoctahedral group, which includes permutations with an even number of sign changes, the principal Weyl chamber is $x_1 > x_2 > \cdots > x_m$, with $x_{m-1} > -x_m$. This does not give a natural model for $m$ particles in one dimension, but one can still find a determinantal expression leading to the $\mathfrak{so}(2m)$ case studied in Appendix B, see Eq. (80), which also appeared in statistical work in [19] (though it was not analyzed analytically there). In Appendix B we study that case and the one corresponding to $\mathfrak{sp}(2m)$, Eq. (93), which is closely related to the odd orthogonal case quoted above (see also [72]).
It is worth mentioning that $b_t(x, \eta)$ is itself a reproducing kernel, as can be quickly checked (see again, for example, [66, Eq. (14)]), even though this fundamental property does not seem to have been exploited, despite the broad literature on the subject.
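This reproducing (Chapman-Kolmogorov) property can be checked numerically for $m = 2$; the sketch below (our own, with arbitrarily chosen times and positions) integrates over the Weyl chamber $x_1 > x_2$:

```python
import numpy as np
from scipy.integrate import dblquad

def b2(x1, x2, e1, e2, t):
    """m = 2 Karlin-McGregor density: 2x2 determinant of 1d heat kernels."""
    g = lambda u: np.exp(-u * u / (2 * t)) / np.sqrt(2 * np.pi * t)
    return g(x1 - e1) * g(x2 - e2) - g(x1 - e2) * g(x2 - e1)

eta, y, s, t = (1.0, 0.0), (1.5, 0.2), 0.3, 0.4

# integrate b_s(eta -> x) b_t(x -> y) over the chamber x1 > x2
lhs, _ = dblquad(lambda x2, x1: b2(x1, x2, *eta, s) * b2(y[0], y[1], x1, x2, t),
                 -10, 10, lambda x1: -10, lambda x1: x1)
rhs = b2(y[0], y[1], *eta, s + t)
print(np.isclose(lhs, rhs, rtol=1e-4))   # True
```

The finite integration limits are safe here because the Gaussian factors concentrate the mass well inside $[-10, 10]$.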
It would be interesting if this fact could be used in a statistical context, either in conjunction with the results described and cited above or on its own, given the well-established relevance of reproducing kernels in traditional statistical analysis [73] and, at the same time, the remarkable and deep relevance of diffusion in the statistical analysis of data at many different levels [64,65].
It should be stressed that this is a different reproducing kernel from the one that immediately emerges by identifying the probability density in terms of a random matrix ensemble (14). Random matrix theory [26] associates to (14) a reproducing kernel of the form $K_m(x, y) = \sqrt{w(x)\,w(y)} \sum_{k=0}^{m-1} p_k(x)\, p_k(y)$, where $w(x)$ is the Stieltjes-Wigert weight and the $p_k(x)$ are the orthonormal Stieltjes-Wigert polynomials, introduced in Section 2.2. This is a delta-sequence type of kernel [73]. Since the polynomials are $q$-deformations of Hermite polynomials, the kernel is a one-parameter deformation of a Hermite kernel, used in probability density estimation [74,73]. From it, in principle, any marginal of the probability density can be obtained. The result in Section 2.3 is an example of this, corresponding to the diagonal limit of the kernel. See [75] for explicit evaluations, and [76] for recent developments along these lines.
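The Stieltjes-Wigert case requires the $q$-polynomials themselves, but the delta-sequence mechanism is already visible in the $q \to 1$ (Hermite) limit mentioned above. A minimal sketch (our own) builds the kernel from orthonormal Hermite functions and checks the reproducing property:

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval
from scipy.integrate import quad

def kernel(x, y, n=6):
    """K_n(x, y) = sqrt(w(x) w(y)) sum_{k<n} p_k(x) p_k(y), with
    w(x) = exp(-x^2) and orthonormal (physicists') Hermite polynomials."""
    total = 0.0
    for k in range(n):
        coeffs = np.zeros(k + 1)
        coeffs[k] = 1.0
        sq_norm = math.sqrt(math.pi) * 2 ** k * math.factorial(k)  # ||H_k||^2
        total += hermval(x, coeffs) * hermval(y, coeffs) / sq_norm
    return math.exp(-(x * x + y * y) / 2) * total

# reproducing property: integral of K(x, y) K(y, z) over y equals K(x, z)
lhs, _ = quad(lambda yy: kernel(0.3, yy) * kernel(yy, -0.5), -10, 10)
print(np.isclose(lhs, kernel(0.3, -0.5)))   # True
```

The check works because the functions $p_k(x)\sqrt{w(x)}$ are orthonormal, so the kernel is the projector onto their span; the Stieltjes-Wigert kernel has the same structure with the log-normal weight.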
An interesting open problem is to construct the skew-orthogonal polynomials associated to the Stieltjes-Wigert (log-normal) weight function. This would put the analytical results for the real-symmetric and quaternionic models on an equal footing with the full solution of the case of Hermitian matrices. That would be a new result from the random matrix theory point of view as well, since only classical orthogonal polynomials [30,55] and, more recently, semi-classical polynomials [77] have been studied, whereas skew-orthogonal polynomials in the $q$-deformed setting, to which the Stieltjes-Wigert polynomials belong, remain unstudied.

The methods presented in the main text can be extended to other symmetric spaces. As discussed previously, switching to other spaces in the Cartan classification corresponds, on the random matrix theory side, to considering matrix integrals with different integration domains. We restrict ourselves to hyperbolic spaces and $\beta = 1$ for concreteness.
We consider the partition function of the $\beta = 1$ ensemble of $\mathfrak{so}(2m)$ matrices. The choice of $\mathfrak{so}$ belongs to the D class in the Cartan classification, and $\beta = 1$ (orthogonal symmetry) means that we are working in the tangent space at the origin of the coset space $SO(2m)/U(m)$, hence in the DIII class [37]. The model also appears in the latter part of [19], where it is shown to be related to integration on a Siegel disc.
The partition function is (80) $z^{\mathfrak{so}(2m)}_1(\sigma) = \int_{\mathbb{R}^m} \prod_{1\leq i<j\leq m} \left| 2\sinh\tfrac{r_i - r_j}{2} \right| \left| 2\sinh\tfrac{r_i + r_j}{2} \right| \prod_{i=1}^m e^{-r_i^2/(2\sigma^2)}\, dr_i$. We are now working with $2m \times 2m$ matrices; the Weyl group permuting the $2m$ eigenvalues is therefore of order $(2m)!$. We then use the invariance of the integral under such permutations to restrict the integration domain to the principal Weyl chamber (81) $r_1 \geq r_2 \geq \cdots \geq r_m \geq 0$. Using the simple property $2\sinh\tfrac{r_i - r_j}{2} \cdot 2\sinh\tfrac{r_i + r_j}{2} = 2\left( \cosh r_i - \cosh r_j \right)$ and changing variables $x_i = \cosh r_i$, we arrive at (84) $z^{\mathfrak{so}(2m)}_1(\sigma) = 2^{m(m-1)/2} \int_{x_1 \geq \cdots \geq x_m \geq 1} \prod_{1\leq i<j\leq m} (x_i - x_j) \prod_{i=1}^m w_{\mathfrak{so}}(x_i; \sigma)\, dx_i$, where we have defined the weight function $w_{\mathfrak{so}}$. The function $w_{\mathfrak{so}}$ satisfies (86) $w_{\mathfrak{so}}(x; \sigma)\, dx \big|_{x = \cosh r} = e^{-\frac{r^2}{2\sigma^2}}\, dr$, that is, $w_{\mathfrak{so}}(x; \sigma) = e^{-(\operatorname{arccosh} x)^2/(2\sigma^2)} / \sqrt{x^2 - 1}$ for $x > 1$.
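The change of variables (86) determines $w_{\mathfrak{so}}$ explicitly. The following sketch (ours) checks that the total mass of $w_{\mathfrak{so}}$ on $(1, \infty)$ matches that of the half-line Gaussian, $\int_0^\infty e^{-r^2/(2\sigma^2)}\, dr = \sigma\sqrt{\pi/2}$:

```python
import numpy as np
from scipy.integrate import quad

sigma = 0.8   # arbitrary scale parameter for this check

def w_so(x):
    """Weight from (86): w_so(x) dx at x = cosh(r) equals
    exp(-r^2/(2 sigma^2)) dr, hence
    w_so(x) = exp(-arccosh(x)^2/(2 sigma^2)) / sqrt(x^2 - 1)."""
    return np.exp(-np.arccosh(x) ** 2 / (2 * sigma ** 2)) / np.sqrt(x * x - 1)

# split the range: the integrable 1/sqrt(x-1) singularity sits at x = 1
mass = quad(w_so, 1.0, 2.0, limit=200)[0] + quad(w_so, 2.0, np.inf, limit=200)[0]
print(np.isclose(mass, sigma * np.sqrt(np.pi / 2), rtol=1e-5))   # True
```

The endpoint singularity at $x = 1$ is integrable (it behaves like $(x-1)^{-1/2}$), so the quadrature converges.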
We will need the de Bruijn identity [53], which converts an ordered multiple integral of a determinant into a Pfaffian. Moreover, we introduce the skew-symmetric product $\langle f, g \rangle_{\mathfrak{so}} = \int_1^\infty \int_1^\infty f(x)\, g(y)\, \operatorname{sgn}(y - x)\, w_{\mathfrak{so}}(x; \sigma)\, w_{\mathfrak{so}}(y; \sigma)\, dx\, dy$. We are now ready to apply the de Bruijn identity to (84). We obtain the Pfaffian form $z^{\mathfrak{so}(2m)}_1(\sigma) = 2^{m(m-1)/2}\, \operatorname{Pf}\left[ \langle x^{i-1}, x^{j-1} \rangle_{\mathfrak{so}} \right]_{1\leq i,j\leq m}$ (for $m$ even). The procedure extends to the $\mathfrak{sp}(2m)$ case. For $\beta = 1$, this corresponds to the class CI in the Cartan classification [37]. The partition function is given in [27], Eq. (93). We mimic the steps above, using the invariance to restrict the integration domain to the principal Weyl chamber and changing variables $x_i = \cosh r_i$. We arrive at (94) $z^{\mathfrak{sp}(2m)}_1(\sigma) = 2^{m(m+1)/2} \int_{x_1 \geq \cdots \geq x_m \geq 1} \prod_{1\leq i<j\leq m} (x_i - x_j) \prod_{i=1}^m w_{\mathfrak{sp}}(x_i; \sigma)\, dx_i$. The de Bruijn identity [53] then gives the Pfaffian form (97), with the skew-symmetric product $\langle \cdot, \cdot \rangle_{\mathfrak{sp}}$ evaluated on monomials in closed form. We note that these models, in the trigonometric version and with squared interaction terms (that is, $\beta = 2$), have been solved fully in [44], using the connection to determinants, instead of Pfaffians, of Toeplitz+Hankel matrices. One method that could be applied here would be, for example, the use of expansions in Schur polynomials; see [44, Theorem 4] and the proof therein.
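The de Bruijn mechanism can be sanity-checked against a weight for which the answer is classical. The sketch below (our own; it uses a Gaussian weight rather than $w_{\mathfrak{so}}$, and $m = 4$) builds the skew-product matrix of monomials, takes its Pfaffian, and compares $m!\, \operatorname{Pf}[A]$ with Mehta's $\beta = 1$ integral $\int_{\mathbb{R}^m} \prod_{i<j} |x_i - x_j| \prod_i e^{-x_i^2/2}\, dx = (2\pi)^{m/2} \prod_{j=1}^m \Gamma(1 + j/2)/\Gamma(3/2)$:

```python
import numpy as np
from math import gamma, pi, factorial
from scipy.integrate import dblquad

w = lambda u: np.exp(-u * u / 2)   # Gaussian weight (assumption of this sketch)

def skew(a, b):
    """<x^a, x^b> = double integral over x < y of
    (x^a y^b - x^b y^a) w(x) w(y)."""
    val, _ = dblquad(lambda y, x: (x**a * y**b - x**b * y**a) * w(x) * w(y),
                     -8, 8, lambda x: x, 8)
    return val

m = 4
A = np.zeros((m, m))
for j in range(m):
    for k in range(j + 1, m):
        A[j, k] = skew(j, k)
        A[k, j] = -A[j, k]

# Pfaffian of a 4x4 skew-symmetric matrix
pf = A[0, 1] * A[2, 3] - A[0, 2] * A[1, 3] + A[0, 3] * A[1, 2]

# de Bruijn: ordered integral of Delta(x) prod w = Pf[A]; symmetrizing
# gives Mehta's beta = 1 value for m! Pf[A]
mehta = (2 * pi) ** (m / 2) * np.prod(
    [gamma(1 + j / 2) / gamma(1.5) for j in range(1, m + 1)])
print(np.isclose(factorial(m) * pf, mehta, rtol=1e-4))   # True
```

For this weight and $m = 4$ the Pfaffian evaluates to $2\pi$ exactly; replacing the Gaussian weight by $w_{\mathfrak{so}}$ or $w_{\mathfrak{sp}}$ gives the Pfaffian evaluations of (84) and (94) discussed above.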