Matrix Integral Approach to MIMO Mutual Information Statistics in High-SNR Regime

In this work, an analytical framework for deriving the exact moments of multiple-input- multiple-output (MIMO) mutual information in the high-signal-to-noise ratio (SNR) regime is proposed. The idea is to make efficient use of the matrix-variate densities of channel matrices instead of the eigenvalue densities as in the literature. The framework is applied to the study of the emerging models of MIMO Rayleigh product channels and Jacobi MIMO channels, which include several well-known channels models as special cases. The corresponding exact moments of any order for the high-SNR mutual information are derived. The explicit moment expressions are utilized to construct simple yet accurate approximations to the outage probability. Despite the high-SNR nature, simulation shows usefulness of the approximations with finite SNR values in the scenario of low outage probability relevant to MIMO communications.


Introduction
Mutual information is one of the most important quantities in information theory. It is crucial in the analysis and design of various communications and signal processing systems. In multiple-input-multiple-output (MIMO) communications, the supremum of the mutual information provides the fundamental performance measure of the channel capacity. Efforts have been made to understand the statistical properties of MIMO mutual information for different channel models. However, knowledge in the literature is essentially limited to either the exact mean values [1][2][3][4][5][6] or the limiting means and variances [7][8][9][10][11]. The first moment is relevant to the ergodic mutual information, whereas the higher order moments describe the outage probability essential to study to slow or block fading channels. Our study is also motivated by the fact that the prevailingly adopted asymptotic variances based approximative outage probabilities [8][9][10][11] fail to capture the true distribution when the number of antennas is small or the outage probability is low. Accurate characterizations require the exact higher order moments, which govern the tail of the distribution.
Deriving the exact higher order moments of MIMO mutual information for any given arbitrary signal-to-noise ratio (SNR) is a notable and longstanding challenging task. In fact, even the exact second moment is still unknown in the literature for any non-trivial MIMO channel model. The purpose of this paper is to show that, with the assumption of high-SNR, exact moments of any order for a wide class of MIMO channels can be explicitly obtained. The idea of the proposed approach stems from our observations that moment expressions of high-SNR mutual information can be efficiently obtained by means of integrals over matrix-valued channel densities. This is contrary to the existing approach [1][2][3][4][5][6][8][9][10][11], where the starting point is the seemingly simpler integrals over the eigenvalue densities of channel matrices.
To show the usefulness of the proposed framework, we study the mutual information of the MIMO Rayleigh product channels [3][4][5]7,10,12] and the Jacobi MIMO channels [6,9,11]. The MIMO Rayleigh product channel takes the well-known MIMO Rayleigh channels with [2] and without [1] correlation as special cases. The Jacobi MIMO channel is useful in modeling MIMO optical channels [6,11] and interference-limited multiuser channels [9]. The main results of this present work are the exact yet explicit formulas of all integer moments of mutual information for the above mentioned channel models in the high-SNR regime. We utilize the derived moments to construct analytical approximations to channel outage probabilities. Despite the high-SNR assumption, the resulting approximations turns out to be reasonably accurate for finite SNR values.
The high-SNR regime provides important insights into the statistical performance of MIMO channels. In particular, it characterizes the minimum required transmit power, which is also referred to as the high-SNR power offset [13]. The considered high-SNR mutual information is directly related to the high-SNR power offset, where its mean values have been derived for different channel models in [13]. As an application of our results, we may study as a possible future work the distribution of the power offset pertaining to the nonergodic channels, an open problem discussed in [13]. This open problem has been partially addressed in [14] for the case of a product of two MIMO Rayleigh channels.

MIMO Mutual Information
Consider a MIMO system consisting of n transmit and m receive antennas, the channel in between is described by an m × n random matrix H. Assuming i.i.d. input across the transmit antennas and that the channel H is only known to the receiver, the mutual information in nats/second/Hz of the MIMO channel is [1] where m ≤ n is assumed without loss of generality. In Equation (1), ln(·) is the natural logarithm, det(·) is the matrix determinant, r is the SNR, and θ m ≤ · · · ≤ θ 2 ≤ θ 1 denote the eigenvalues of the Hermitian matrix HH † . In the high-SNR regime, by ignoring the constant I m in Equation (1) the mutual information is approximated by where I denotes the approximation to I in the high SNR regime. This approximation becomes exact as the SNR r grows to infinity. A fundamental information-theoretic quantity for MIMO channels is the outage probability, which is the probability that a given rate exceeds the value of the mutual information. For high-SNR mutual information Equation (2), the outage probability P out (z) is defined as a function of the rate z as P out (z) = P (I < z) .
Clearly, the above defined high-SNR outage probability P out (z) is an approximation to the true outage probability P (I < z), which approaches to its exact value as the SNR tends to infinity. Finding simple and explicit expressions to the outage probability Equation (4) for finite-size systems of the following MIMO channel models is the focus of this work.

MIMO Rayleigh Product Channels
The MIMO Rayleigh product channel, originally proposed in [7], is a relevant model for the indoor wireless propagation in pico-cellular networks such as train stations, office buildings, and airports. Physical motivation for such a channel model can be found, for example, in ( [15], Section 3). The MIMO Rayleigh product channel has received increasing attention due to the recent breakthrough in understanding its finite-size distribution [3][4][5]. Assuming a MIMO channel with d 0 transmit and d M receive antennas, the information transmitted to the receiver goes through M − 1 successive layers, each having d i (i = 1, . . . , M − 1) scatterers, the corresponding channel equals the product of M channel matrices The dimensions of the i-th channel H i is d i × d i−1 , where each channel is assumed to be an independent MIMO Rayleigh channel. All the scattering between the MIMO Rayleigh channels H i and H i+1 happens through the d i scatterers in the layer i, which can be thought as d i keyholes. In the literature, the channel model considered in Equation (5) has been referred to different names such as multiple cluster scattering channel, progressive scattering channel, or multiple Rayleigh scattering channel. We choose to use the term MIMO Rayleigh product channel in this paper to emphasize that each channel in the product is described by the Rayleigh fading.
The effect of spatial correlation among receive antennas is considered, which is caused by physical constraints of the terminal size. We choose the separable correlation model, where the effect of correlation is multiplicative, i.e., where the d M × d M deterministic matrix Σ is the correlation matrix. For convenience of the discussion, we study here the receiver side correlation, whereas the transmitter side correlation can be considered at the same time. The only difference is that the result (21a) will have an additional term ln det Ω, where the d 0 × d 0 matrix Ω specifies the correlation matrix at the transmitter side. As will be discussed in detail in Section 3, one could assume without loss of generality that the dimensions of the channels are ordered as which is known as the weak commutation relation for products of random matrices [4]. Strictly speaking, the weak commutation relation was established in [4] for the case Σ = I m . The extension to an arbitrary Σ can be seen by the fact that the parametrization ( [4], Equations (2)-(4)) is essentially the same in the presence of Σ. The extension is further confirmed by the observation that the density (9) is indeed invariant under any permutation of the parameters ν i . For notational convenience, we further denote the smallest dimension d M by m = d M .
By the above assumptions, the density of the joint eigenvalues of the hermitian matrix HH † is given by [5] with f j (θ) = G M,0 where 0 ≤ θ m ≤ · · · ≤ θ 2 ≤ θ 1 < ∞ and c is a normalization constant that does not depend on θ. Here, σ j denotes the j-th eigenvalue of the correlation matrix Σ with σ i = σ j for i = j. The case when some of σ i are equal can be resolved by the L'Hopital's rule. The function in Equation (10) is a Meijer's G-function [5] and the determinant in Equation (9) det θ is a Vandermonde determinant. The MIMO Rayleigh product channel, seen in Equations (5) and (6), is a general channel model that includes the following well-known models as special cases: • Uncorrelated MIMO Rayleigh product channels [3,4,7,10] when Σ = I m , • MIMO Rayleigh channels [2] when M = 1, • Uncorrelated MIMO Rayleigh channels [1,8,9] when M = 1 and Σ = I m .
In the literature, the exact ergodic mutual information E [I] of the MIMO Rayleigh product channels with and without correlation has been derived in [3][4][5], respectively. For the MIMO Rayleigh channels, the exact ergodic mutual information with and without correlation has been derived in [1,2], respectively. For the exact higher order moments E I k , k = 2, 3, . . . , that are needed to characterize the outage probability, no explicit finite-size results seem available for any of the above channel models. We exclude the discussion of finite-size results in the literature that involve, as final results, integral representations or combinatorial objects such as determinants and sums over partitions. For example, for different channel models in [16,17] the exact outage mutual information has been represented as contour integrals involving determinants, which may only be evaluated numerically in most cases. The existing results are limited to a scenario of large channel dimensions, where some asymptotic formulas for the second moment are available [8-10].

Jacobi MIMO Channels
The Jacobi MIMO channel is a useful model for the MIMO optical communications [6,11] as well as the interference-limited multiuser MIMO [9]. We will, however, formulate the problem mainly in the context of the former application, where the relevance to the latter will be briefly discussed.
The spatial degrees of freedom of the MIMO Rayleigh channels (and its generalization to MIMO product channels) lead to the well-known capacity scaling law [1]. The idea behind the MIMO fiber optical channels is to attain a similar scaling law by exploiting the spatial degrees of freedom as well. Particularly, multiple spatial transmission in the same fiber is possible by designing a multi-mode and/or multi-core fiber. To explore the spatial diversity, the Jacobi MIMO optical channel was proposed in [6,11], which relies on the following assumptions. The propagation through the fiber is assumed to be as lossless such that it can be modeled as an l × l random unitary matrix UU † = I l , also known as the scattering matrix. Assuming n transmitting and m receiving modes (with m ≤ n), the effective optical MIMO channel is given by the upper left sub-matrix of the scattering matrix U = u ij with the condition that l ≥ m + n, i.e., H = u ij i=1,...,m; j=1,...,n .
With these assumptions, the joint density of the eigenvalues of the hermitian channel matrix HH † is expressed as [6,11] where 0 ≤ θ m ≤ · · · ≤ θ 2 ≤ θ 1 ≤ 1, and c is a normalization constant that does not depend on θ. The ensemble Equation (13) is referred to as the Jacobi ensemble [18] in random matrix theory, and hence the name Jacobi MIMO channels in the information theory and communications theory communities. Notice that when the SNR is high enough the optical channel becomes nonlinear which may not be modeled by the above channel model. Fortunately, our derived analytical results are valid for moderate high SNR values as shown in Section 4. The eigenvalue density of the interference-limited MIMO channel considered in [9] takes the form of Equation (13) with the same parameter α 1 = n − m as the difference between the number of transmit and receive antennas. For this application, the parameter α 2 is now where k denotes the number of interferers. For a detailed discussion of the interference-limited MIMO channels and its connection to the Jacobi ensemble, we refer to [9]. For applications in optical MIMO communications, the exact ergodic mutual information E [I] of the Jacobi MIMO channels was derived in [6], whereas an unexplicit as well as a limiting second moment formulas can be found in [11]. Note that the exact higher order moments might be also derived from the representation in ( [11], Equation (13)), which would carry the same combinatorial structure involving sums over partitions. On the contrary, our proposed moments expressions for high SNRs in Proposition 2 are simple and explicit, the difficulty of which does not increase with the order of moment considered. For applications to interference-limited MIMO channels, the first two limiting moments and a differential equation for the moments were obtained in [9]. Finally, we note that the random variable ln det HH † for Jacobi MIMO channels was shown in ( [19], Proposition 2.4) to have the same distribution as a product of independent Beta random variables, whose density function can be obtained in principle, albeit complicated, by multiple convolutions of the underlying Beta densities.

Exact Moments of High-SNR Mutual Information
Despite that even the exact second moment E I 2 of the mutual information Equation (1) is difficult to obtain for any previously discussed channel model, we will show that the exact moments E I k , k = 1, 2, . . . , of the high-SNR mutual information Equation (2) can be explicitly calculated. These moments will be utilized to construct simple and accurate approximations to the outage probability of the considered MIMO channels.
To compute the moments, one naturally starts from integrals involving the eigenvalue densities, seen in Equations (9) and (13) as has been done in the literature [1][2][3][4][5][6][8][9][10][11]. Contrary to this prevailing approach, our starting point is the integrals over the matrix-variate densities of the channel matrices HH † . This may be counter-intuitive as the matrix integral involves to the order of m 2 real variables, whereas the integral over eigenvalue density only involves m variables. We will show, however, that by starting from matrix integrals the exact moment of any order can be derived in a straightforward manner.
As m ln r in Equation (2) is a constant, we first focus on the random variable The cumulant generating function K(s) of X is defined as where the i-th order cumulantκ i of X can be recovered from the generating function as Denote the i-th order cumulant of I by κ i , we have which is due to the shift-equivariant and the shift-invariant property, respectively. With the knowledge of cumulants, the corresponding moments can be determined. In general, the i-th order moment is an i-th degree polynomial in the first i cumulants and vice versa. For example, the first five moments of I written in terms of its first five cumulants are listed below We now present the main technical contributions of this paper, which are summarized in the two propositions and two corollaries below.

All the Integer Moments of Mutual Information of MIMO Rayleigh Product Channels
Proposition 1. The i-th exact cumulant κ i of the high-SNR mutual information (Equation (2)) of the MIMO Rayleigh product channels (Equation (5)) is given by where denotes the i-th order polygamma function [20].
The proof of Proposition 1 is in the Appendix A. Note that for positive integer arguments, the polygamma function of order 0 (digamma function) reduces to a finite sum as with γ ≈ 0.5772 being the Euler's constant, and the polygamma functions of order i ≥ 1 in Equation (24) also become finite sums as where is the Riemann zeta function. In particular, the trigamma function (i = 1) reduces to Note also that an exact moment expression of the MIMO Rayleigh product channel has been recently obtained in ( [12], Proposition 4), which involves a combinatorial structure of matrix determinant. On the contrary, our proposed moment expressions are simple and explicit partially due to the efficient use of the cumulant generating function, an approach we try to advocate in this paper.
The general results in Proposition 1 can be simplified to various special cases depending on the channel models considered. The following two special cases, as summarized in the Corollaries 1 and 2, may deserve separate attention.

Corollary 1.
In the case of square channel matrices the results of Proposition 1 are simplified to The proof of Corollary 1 is in Appendix B. Note that the above channel model corresponds to the scenario of equal number of scatterers on each layer (cf. Equation (5)). Note also that Proposition 1 is in fact not directly valid in the case of Equation (27), where some general expressions from Equations (A22a) and (A23a) will be needed, cf. Equations (A4) and (A5).

Corollary 2.
In the case M = 1, i.e., the MIMO Rayleigh channels, the results of Proposition 1 reduce to The proof of Corollary 2 follows directly by setting M = 1 in Equation (21). In the literature, the first two (unsimplified) cumulants κ 1 and κ 2 of Corollary 2 have been reported in [13,21,22], where some bounds on the corresponding outage probability can be also found in [22]. In particular, the variance formulas provided in ( [22], Section IV) can be further simplified to finite sums. Note that the derivation in [22] essentially ended with the expressions for the moment generating functions, whereas we have derived explicit expressions for all the moments via the corresponding cumulants. In addition, in our construction of the outage probability from the moments no generic probabilistic bounds, such as the Chernoff bound used in [22], are needed.
It is interesting to observe that the terms involving SNR γ and spatial correlation Σ only appear in the first cumulants in Equations (21a), (28a) and (29a) of the high-SNR mutual information.
As a result, the corresponding cumulants for the uncorrelated channel models only incur a change in Equations (21a), (28a) and (29a) by setting ln det Σ = ln det I m = 0, whereas the higher cumulants in Equations (21b), (28b) and (29b) remain the same.

All the Integer Moments of Mutual Information of Jacobi MIMO Channels
Proposition 2. The i-th exact cumulant κ i of the high-SNR mutual information in Equation (2) of the Jacobi MIMO channels in Equation (12) is given by The proof of Proposition 2 can be found in Appendix C. A similar, but unsimplified, κ 1 expression was obtained in the context of mean power offset of interference-limited MIMO channels ( [13], Equation (78)). In such a setting, our derived higher cumulants may useful in the study of the distribution of power offset, which may be addressed separately as a future work.

Moment-Based Approximations to Outage Probability
With the derived exact cumulant expressions from Equations (21), (28), (29), (31) and the cumulant-moment relations Equation (20), moment-based approximations to the outage probability in Equation (4) for each of the discussed channel model can now be constructed. The basic idea of moment based approximation is to match the moments and support of an intractable distribution by an elementary distribution and the associated orthogonal polynomials [23,24]. The theory of moment-based approximation as developed by Ha and Provost [23,24] assigns Gaussian, Gamma, or Beta density as the elementary distribution (initial approximation) when the considered random variable is supported in (−∞, ∞), [a, ∞), or [a, b] (a, b being finite), respectively. The respective orthogonal polynomials for the chosen elementary densities are Hermite, Laguerre, and Jacobi polynomials. The parameters of the initial approximations are obtained by matching the first two moments, whereas the orthogonal polynomials encode the higher moments. An important property is that the approximation accuracy in general improves as the number of moments involved increases. Note also that the moment-based approximation provides closed-form distribution functions by taking the moment expressions as input, where no numerical simulation of any random variable is needed.
Since the random variable of interest in Equation (16) is supported in X ∈ [a, ∞) for the MIMO Rayleigh product channels and in X ∈ (−∞, 0] for the Jacobi MIMO channels, the gamma distribution is chosen as the elementary distribution of the approximation. As a consequence, the Laguerre polynomials as the associated orthogonal polynomials are utilized to construct the approximation. The resulting approximative distribution function F q (x) to the distribution function P(X < x) of the random variable X in Equation (16), obtained by matching the first q moments of X to a linear combination of Laguerre polynomials up to degree q, can be read off from ( [23], Appendix 2) as where with being the lower incomplete Gamma function. The parameters α, β in the approximative distribution function in Equation (32) are found by matching the first two moments of X to a Gamma random variable having a density 1 where the derived higher cumulant expressions enter the approximation via Equation (34), cf. Equation (20). Finally, the resulting approximation to the outage probability of Equation (4) is obtained by keeping in mind the relation of Equation (2) as The simplest form of the moment-based approximation corresponds to q = 2 in Equation (32) as where only the first two moments are involved. Note that the above Equations (32)-(39) are valid for X ∈ [0, ∞) of the MIMO Rayleigh product channels and its special cases. For the Jacobi MIMO channels having X ∈ (−∞, 0], one constructs an approximation to the random variable −X in a parallel manner as the above with details omitted. In principle, with the number of moments E X i , i = 1, . . . , involved in the term in Equation (34) increases, the accuracy of the approximation in Equation (38) improves [23,24]. This is especially true for the tail of the distribution, where the accuracy is governed by the higher order moments. We now focus on the numerical study of the approximation accuracy in various realistic scenarios. The emphasis will be on the low outage probability in the regime when the SNR and the number of antennas are finite. We point out that the outage probability at low outage is a crucial metric for optical communications, where requesting a retransmission in the case of packet loss may not be possible [11].
We first study the impact of the number of moments q on the approximation accuracy, where we consider the Jacobi MIMO channels Equation (12) for illustration. Figure 1 shows the outage probability of the high-SNR mutual information Equation (2) assuming the channel dimensions m = 4 and n = 6, where different dimensions l = 12, 16, and 20 of the scattering matrices are considered. The numerical simulations are compared with the moment-based approximative outage probability, where the number of moments considered are q = 2 and q = 5. It is observed that the accuracy of the proposed approximation improves (most prominently in the tails) as the number of moments increases from q = 2 to q = 5. We also see that the outage probability decreases as the number of untapped channels l − m − n decreases, where the same phenomenon has been observed in [11] as well.  (2)) of Jacobi multiple-input-multiple-output (MIMO) channels (Equation (12)) for different l with m = 4, n = 6, and r = 20 dB.
We now study the impact of finite SNR on the accuracy of the outage probability. Since the approximation is formally valid when the SNR approaches infinity, the study helps evaluate the usefulness of the results in practical scenarios of finite SNR values. In Figure 2, we consider the MIMO Rayleigh product channels in Equation (5) where the exponential correlation model is chosen for the correlation matrix Σ with ρ being set at 0.7. In Figure 3, we consider the Jacobi MIMO channels in Equation (5) with the channel dimensions l = 12, m = 4, and n = 6. In both figures, the number of moments used in the approximation equals q = 5. Despite the high-SNR assumption, it is observed that the proposed approximation is already reasonably accurate for not-so-high SNR values even for outage probability as low as 10 −4 . Simulations with other parameter values have been extensively performed, where similar accuracy performance as in Figures 2 and 3 persists.

Conclusions
In this work, we study the mutual information of MIMO Rayleigh product channels and Jacobi MIMO channels, which has received substantial attention very recently. For the considered channel models, the exact moments of any order of the high-SNR mutual information are derived. The results are derived by making use of the relevant matrix-variate densities of the channel matrices. The obtained exact moments lead to closed-form approximations to the outage probability. Simulations demonstrate the usefulness of the results in practical scenarios of low outage probability with finite SNR values and channel dimensions. Future work includes extending the analytical framework to studying the moments of mutual information for MIMO channels with more complicated matrix-variate densities such as the MIMO Rician channels.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Proof of Proposition 1
The proofs of Propositions 1 and 2 as well as Corollary 1 rely on the matrix-variate gamma and beta distributions as well as some finite sum identities involving polygamma functions, which are summarized in the definitions and the lemma below.
Definition A2. The matrix-variate beta density with parameters (α, β) is defined as ( [18], p. 357) where X is a Hermitian matrix of dimensions m × m.
Lemma A1. For an integer l > 0, we have and when l = 0, we have n ∑ k=1 ψ 0 (k) = nψ 0 (n) − n + 1, Proof. We will first prove the identities of Equation (A4), whereas the identities of Equation (A5) can then be established by taking appropriate limits of Equation (A4) to resolve the indeterminacy.
To prove Equation (A4a), we observe from the definition of digamma function of Equation (23) that which gives This proves Equation (A4a). To show Equation (A4b), by the series representation of polygamma function of Equation (24), one has which similarly gives To prove Equation (A5a), setting l = 0 in Equation (A4a) results in where the "0 × ∞" type indeterminacy can be resolved by using the fact that [20] This proves Equation (A5a). Similarly, to prove Equation (A5b) we set l = 0 in Equation (A4b), which gives The indeterminate term −iψ i−1 (0) − 0ψ i (0) can be handled by the fact that [20] as which gives Equation (A5b). This completes the proof of Lemma A1.
With the Lemma A1, we turn to the proof of Proposition 1.
Proof. In order to utilize the matrix integral approach, we need to parameterize the product channel matrix of Equation (5) as [4] where the original dimensions are ordered as in Equation (7) with m = d M . The above transform is known as the weak commutation relation for products of random matrices [4]. The density (Equation (A19)) is recognized as the product of M matrix-variate gamma densities (Equation (A1)), where the parameters d i , i = 0, . . . , M − 1 take the role of the parameter α in Equation (A1). Consequently, the cumulant generating Equation (17) where the second to the last step is established by using the definition of matrix-variate gamma density in Equation (A1). In fact, the results of the integrations can be directly read off from the normalization constant in Equation (A1) with each d i (α in Equation (A1)) replaced by s + d i . By the definition of Equation (18), the i-th cumulantκ i of the MIMO Rayleigh product channels Equation (5) can now be computed as where we have used the definitions of the multivariate gamma function in Equation (A2) and the polygamma function in Equation (24). By the relation between the cumulants in Equation (19), we now have ψ 0 (d j−1 − m + k) + ln det Σ + m ln r (A22a) = ln E det HH † s (A27b) = ln Γ m (l) Γ m (n)Γ m (l − n) + ln det HH † s+n−m det I m − HH † l−n−m dHH † (A27c) The integration in the second to last line is a trivial deformation of the density of Equation (A26), which was computed by replacing in the normalization constant of Equation (A26) the appearance of α by s + α. By Equation (18), the i-th cumulantκ i of the Jacobi MIMO channels of Equation (12) are now calculated asκ where the definitions in Equations (24) and (A2) have been utilized. Finally, by using Equation (A4) in Lemma A1 and the relation between the cumulants in Equation (19), we prove Proposition 2.