A Random Matrix Approach to VARMA Processes

We apply random matrix theory to derive the mean spectral density of large sample covariance matrices generated by multivariate VMA(q), VAR(q) and VARMA(q1, q2) processes. In particular, we consider the limit where the number of random variables N and the number of consecutive time measurements T are large but the ratio N/T is fixed. In this regime the underlying random matrices are asymptotically equivalent to Free Random Variables (FRV). We apply the FRV calculus to calculate the eigenvalue density of the sample covariance for several VARMA-type processes. We explicitly solve the VARMA(1,1) case and demonstrate perfect agreement between the analytical result and the spectra obtained by Monte Carlo simulations. The proposed method is purely algebraic and can be easily generalized to q1 > 1 and q2 > 1.


I. INTRODUCTION
Vector auto-regressive (VAR) models play an important role in contemporary macro-economics, being an example of an approach called the "dynamic stochastic general equilibrium" (DSGE), which is superseding traditional large-scale macro-econometric forecasting methodologies [1]. The motivation behind them is based on the assertion that more recent values of a variable are more likely to contain useful information about its future movements than the older ones. On the other hand, a standard tool in multivariate time series analysis are vector moving average (VMA) models, which amount to a linear regression of the present value of the time series on the past values of a white noise. A broader class of stochastic processes used in macro-economics combines both these kinds in the form of vector auto-regressive moving average (VARMA) models. These methodologies can capture certain spatial and temporal structures of multidimensional variables which are often neglected in practice; including them not only results in more accurate estimation, but also leads to models which are more interpretable. They are widely used by academia and central banks (cf. the European Central Bank's Smets-Wouters model for the euro zone [2]), as they constitute quite a simple version of the DSGE equations.
VARMA models are constructed from a number of univariate ARMA (Box-Jenkins; see for example [3]) processes, typically coupled with each other. In this paper, we investigate only a significantly simplified circumstance where there is no coupling between the many ARMA components. One may argue that this is too far-fetched and will be of no use in describing economic reality. However, one may also treat it as a "zeroth-order hypothesis," analogously to the idea of [4,5] in finance, namely that the case with no cross-covariances is considered theoretically, and subsequently compared to some real-world data modeled by a VARMA process; any discrepancy between the two will reflect nontrivial cross-covariances present in the system, thus permitting their investigation. This latter route is taken in this communication.
A challenging and yet increasingly important problem is the estimation of large covariance matrices generated by these stationary VARMA(q 1 , q 2 ) processes, since high dimensionality of the data as compared to the sample size is quite common in many statistical problems (the "curse of dimensionality"). Therefore, an appropriate "noise cleaning" procedure has to be implemented, and random matrix theory (RMT) provides a natural and efficient toolkit for doing that. In particular, the mean spectral densities (a.k.a. "limiting spectral distributions," LSD) of the Pearson estimators of the cross-covariances for the VMA(1) and VAR(1) models, in the relevant high-dimensionality sector and under the full decoupling, have been derived in [6] by applying the framework proposed by [7].
In this paper, we suggest that such calculations can be considerably simplified by resorting to the mathematical concept of the free random variables (FRV) calculus [8,9], succinctly introduced in sec. II. Our general FRV formula [10] allows us not only to rediscover, with much less strain, the two fourth-order polynomial equations obtained in [6] in the VMA(1) and VAR(1) cases, but also to derive a sixth-order equation (45) which produces the mean spectral density for the more involved VARMA(1, 1) model. The results are verified by numerical simulations, which show a perfect agreement. This is all done in sec. III.

II. DOUBLY CORRELATED WISHART ENSEMBLES AND FREE RANDOM VARIABLES
A. Doubly Correlated Wishart Ensembles

Correlated Gaussian Random Variables
VARMA(q_1, q_2) stochastic processes, as we will see below, fall within quite a general set-up encountered in many areas of science where the probabilistic nature of multiple degrees of freedom evolving in time is relevant, for example, multivariate time series analysis in finance, applied macro-econometrics and engineering. To describe this framework, consider a situation of N time-dependent random variables which are measured at T consecutive time moments (separated by some time interval δt); let Y_{ia} be the value of the i-th (i = 1, ..., N) random variable at the a-th time moment (a = 1, ..., T); together, they make up a rectangular N × T matrix Y. In what usually would be the first approximation, each Y_{ia} is supposed to be drawn from a Gaussian probability distribution. We will also assume that they have zero mean, ⟨Y_{ia}⟩ = 0. These degrees of freedom may in principle display mutual correlations. A set of correlated zero-mean Gaussian numbers is fully characterized by the two-point covariance function, C_{ia,jb} ≡ ⟨Y_{ia} Y_{jb}⟩, if the underlying stochastic process generating these numbers is stationary. Linear stochastic processes, including VARMA(q_1, q_2), belong to this category. We will restrict our attention to an even narrower class where the cross-correlations between different variables and the auto-correlations between different time moments are factorized, i.e.,

C_{ia,jb} = C_{ij} A_{ab}. (1)

In this setting, the inter-variable covariances do not change in time (and are described by an N × N cross-covariance matrix C), and also the temporal covariances are identical for all the variables (and are included in a T × T auto-covariance matrix A; both these matrices are symmetric and positive-definite). The Gaussian probability measure with this structure of covariances is known from textbooks,

P_{c.G.}(Y) DY = (1/N_{c.G.}) exp( −(1/2) Tr( C^{−1} Y A^{−1} Y^T ) ) DY, (2)

where the normalization constant N_{c.G.} = (2π)^{NT/2} (Det C)^{T/2} (Det A)^{N/2}, the integration measure DY ≡ ∏_{i=1}^{N} ∏_{a=1}^{T} dY_{ia}, and the letters "c.G." stand for "correlated Gaussian."
Now, a standard way to approach correlated Gaussian random numbers is to recall that they can always be decomposed as linear combinations of uncorrelated Gaussian degrees of freedom; indeed, this is achieved through the transformation

Y = √C Ỹ √A, (3)

where Ỹ is an N × T matrix of uncorrelated standard Gaussian numbers; the square roots of the covariance matrices, necessary to facilitate the transition, exist due to the positive-definiteness of C and A. The new normalization reads N_{G.} = (2π)^{NT/2}.

Estimating Equal-Time Cross-Covariances
An essential problem in multivariate analysis is to determine (estimate) the covariance matrices C and A from given N time series of length T of the realizations of our random variables Y_{ia}. For simplicity, we do not distinguish in notation between random numbers, i.e., the population, and their realizations in actual experiments, i.e., the sample. Since the realized cross-covariance between degrees of freedom i and j at the same time a is Y_{ia} Y_{ja}, the simplest method to estimate the equal-time cross-covariance c_{ij} is to compute the time average,

c_{ij} ≡ (1/T) Σ_{a=1}^{T} Y_{ia} Y_{ja}, i.e., c = (1/T) Y Y^T = (1/T) √C Ỹ A Ỹ^T √C. (4)

This is usually named the "Pearson estimator," up to the prefactor, which depending on the context is 1/(T − 1) or 1/T. Other estimators might be introduced, such as between distinct degrees of freedom at separate time moments ("time-delayed estimators"), or with certain decreasing weights given to older measurements to reflect their growing obsolescence ("weighted estimators"), but we will not investigate them in this article. Furthermore, in the last equality in (4), we cast c through the uncorrelated Gaussian numbers contained in Ỹ, the price to pay for this being that the covariance matrices now enter into the expression for c, making it more complicated; this will be the form used hereafter. The random matrix c is called a "doubly correlated Wishart ensemble" [11].
Let us also mention that the auto-covariance matrix A can be estimated through a ≡ (1/N) Y^T Y. However, one verifies that this object carries identical information to the one contained in c (it is "dual" to c), and therefore may safely be discarded. Indeed, these two estimators have the same non-zero eigenvalues (modulo an overall rescaling by r), and the larger one has |T − N| additional zero modes.
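This duality is easy to check numerically. Below is a minimal sketch in plain NumPy (the sizes N and T are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 20, 50                       # N variables, T time snapshots; r = N/T
Y = rng.standard_normal((N, T))     # uncorrelated Gaussian data for the check

c = Y @ Y.T / T                     # Pearson estimator, N x N
a = Y.T @ Y / N                     # "dual" estimator, T x T

ec = np.sort(np.linalg.eigvalsh(c))[::-1]   # eigenvalues, descending
ea = np.sort(np.linalg.eigvalsh(a))[::-1]

r = N / T
# the N non-zero eigenvalues coincide modulo the rescaling by r,
# and the larger matrix a has |T - N| additional zero modes
assert np.allclose(ec, r * ea[:N])
assert np.allclose(ea[N:], 0.0, atol=1e-8)
```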
Any historical estimator is inevitably marred by measurement noise; it will reflect the true covariances only to a certain degree, with a superimposed broadening due to the finiteness of the time series. More precisely, there are N(N + 1)/2 independent elements in C, to be estimated from NT measured quantities Y; hence the estimation accuracy will depend on the "rectangularity ratio,"

r ≡ N/T, (5)

the closer r is to zero, the more truthful the estimate. This is a cornerstone of classical multivariate analysis. Unfortunately, a practical situation will typically feature a large number of variables sampled over a comparably big number of time snapshots, so that we may approximately talk about the "thermodynamical limit,"

N → ∞, T → ∞, with r = N/T fixed. (6)

On the other hand, it is exactly this limit in which the FRV calculus (see the subsection below for its brief elucidation) can be applied; hence, the challenge of de-noising is somewhat counterbalanced by the computationally powerful FRV techniques.

Any study of a (real symmetric K × K) random matrix H will almost surely include a fundamental question about the average values of its (real) eigenvalues λ_1, ..., λ_K. They are concisely encoded in the "mean spectral density,"

ρ_H(λ) ≡ (1/K) Σ_{i=1}^{K} ⟨δ(λ − λ_i)⟩. (7)

Here the expectation map ⟨...⟩ is understood to be taken w.r.t. the probability measure P(H)DH of the random matrix. We will always have this distribution rotationally invariant (i.e., invariant under H → O^T H O, with O orthogonal), and hence the full information about H resides in its eigenvalues, distributed on average according to (7).
On the practical side, it is more convenient to work with either of two equivalent objects, referred to as the "Green's function" (or the "resolvent") and the "M-transform" of H,

G_H(z) ≡ (1/K) ⟨Tr (z 1_K − H)^{−1}⟩, M_H(z) ≡ z G_H(z) − 1. (8)

The latter is also called the "moments' generating function," since if the "moments" M_{H,n} ≡ (1/K) ⟨Tr H^n⟩ of H exist, it can be expanded into a power series around z → ∞ as M_H(z) = Σ_{n≥1} M_{H,n}/z^n. It should however be underlined that even for probability measures disallowing such an expansion (heavy-tailed distributions, preeminent in finance, being an example), the quantities (8) still manage to entirely capture the spectral properties of H; hence the name "M-transform" is more appropriate, in addition to being more compact.
We will show that for our purposes (multiplication of random matrices; see par. II B 2) the M-transform serves better than the Green's function. However, it is customary to write the relationship between (7) and (8) in terms of the latter; it results from a well-known formula for generalized functions, lim_{ε→0+} 1/(x ± iε) = pv(1/x) ∓ iπδ(x), and reads

ρ_H(λ) = −(1/π) lim_{ε→0+} Im G_H(λ + iε). (9)
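Both definitions are straightforward to exercise on a fixed matrix. The sketch below (plain NumPy; the matrix and the probe point are arbitrary illustrative choices) compares the moment expansion of the M-transform with its resolvent definition, and recovers the total spectral weight from the imaginary part of the Green's function:

```python
import numpy as np

rng = np.random.default_rng(1)
K = 6
H = rng.standard_normal((K, K))
H = (H + H.T) / 2                           # real symmetric K x K matrix

def G(z):                                   # Green's function (resolvent trace)
    return np.trace(np.linalg.inv(z * np.eye(K) - H)) / K

def M(z):                                   # M-transform, M = z G - 1
    return z * G(z) - 1

# moment expansion around z -> infinity: M(z) = sum_{n>=1} M_n / z^n
z = 50.0                                    # well outside the spectrum
moments = [np.trace(np.linalg.matrix_power(H, n)) / K for n in range(1, 15)]
series = sum(m / z**n for n, m in enumerate(moments, start=1))
assert abs(M(z) - series) < 1e-8

# Sokhotski-Plemelj: rho(lambda) ~ -(1/pi) Im G(lambda + i*eps);
# integrating over lambda must reproduce the total weight 1
eps = 1e-2
lam = np.linspace(-20, 20, 20001)
rho = -np.imag([G(l + 1j * eps) for l in lam]) / np.pi
total = rho.sum() * (lam[1] - lam[0])
assert abs(total - 1.0) < 1e-2
```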

The N -Transform and Free Random Variables
The doubly correlated Wishart ensemble c (4) may be viewed as a product of several random and non-random matrices. The general problem of multiplying random matrices seems formidable. In classical probability theory, it can be effectively handled in the special situation when the random terms are independent: then, the exponential map reduces it to the addition problem of independent random numbers, solved by considering the logarithm of the characteristic functions of the respective PDFs, which proves to be additive. In matrix probability theory, a crucial insight came from D. Voiculescu and coworkers and R. Speicher [8,9], who showed how to parallel the commutative construction in the noncommutative world. It starts with the notion of "freeness," which basically comprises probabilistic independence together with a lack of any directional correlation between two random matrices. This nontrivial new property happens to be the right extension of classical independence, as it allows for an efficient algorithm of multiplying free random variables (FRV), which we state below: Step 1: Suppose we have two random matrices, H 1 and H 2 , mutually free. Their spectral properties are best wrought into the M -transforms (8), M H1 (z) and M H2 (z).
Step 2: The critical maneuver is to turn attention to the functional inverses of these M-transforms, the so-called "N-transforms,"

M_H(N_H(z)) = N_H(M_H(z)) = z. (10)

Step 3: The N-transforms submit to a very straightforward rule upon multiplying free random matrices (the "FRV multiplication law"),

N_{H_1 H_2}(z) = (z/(1 + z)) N_{H_1}(z) N_{H_2}(z). (11)

Step 4: Finally, it remains to functionally invert the resulting N-transform N_{H_1 H_2}(z) to gain the M-transform of the product, M_{H_1 H_2}(z), and consequently, all the spectral properties via formula (9).
It is stunning that such a simple prescription (relying on the choice of the M -transform as the carrier of the mean spectral information, and the construction of its functional inverse, the N -transform, which essentially multiplies under taking the free product) resolves the multiplication problem for free random noncommutative objects.
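As a simple consistency check of the law (11), one may take trivial covariances and compare the empirical M-transform of a simulated Wishart matrix c = (1/T)YY^T with the prediction following from the FRV algebra, namely the Marchenko-Pastur N-transform N_c(z) = (1+z)(1+rz)/z (a special case of the formulas of the next subsection). A sketch with arbitrary illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 300, 1200                  # rectangularity ratio r = N/T = 0.25
r = N / T
Y = rng.standard_normal((N, T))
c = Y @ Y.T / T                   # uncorrelated Wishart ensemble (C = A = 1)
eig = np.linalg.eigvalsh(c)

def M_empirical(z):               # M_c(z) = (1/N) sum_i lambda_i / (z - lambda_i)
    return np.mean(eig / (z - eig))

# FRV prediction: N_c(z) = (1+z)(1+rz)/z, so M = M_c(z) solves
# z = (1+M)(1+rM)/M, i.e. r M^2 + (1 + r - z) M + 1 = 0;
# pick the root that decays as 1/z for large z
z = 5.0                           # real point above the upper spectral edge
disc = (1 + r - z)**2 - 4 * r
M_theory = (-(1 + r - z) - np.sqrt(disc)) / (2 * r)

assert abs(M_empirical(z) - M_theory) < 0.02
```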
Let us just mention that the addition problem may be tackled along similar lines: in this case, the Green's function should be exploited and its functional inverse considered (G_H(B_H(z)) = B_H(G_H(z)) = z; it is sometimes called the "Blue's function" [12,13]), which obeys the "FRV addition law," B_{H_1+H_2}(z) = B_{H_1}(z) + B_{H_2}(z) − 1/z, for two free random matrices. In this paper, we do not resort to this addition formula, even though our problem could be approached through it as well.

Doubly Correlated Wishart Ensembles from Free Random Variables
The innate potential of the FRV multiplication algorithm (11) is fully revealed when inspecting the doubly correlated Wishart random matrix c = (1/T) √C Ỹ A Ỹ^T √C (4). This has been done in detail in [10], so we will only accentuate the main results here, referring the reader to the original paper for a thorough explanation.
The idea is to use twice the cyclic property of the trace (which permits cyclic shifts in the order of the terms), and twice the FRV multiplication law (11) (to break the N-transforms of products of matrices down to their constituents), in order to reduce the problem to the uncorrelated Wishart ensemble (1/T) Ỹ^T Ỹ. This last model is further simplified, again by the cyclic property and the FRV multiplication rule applied once, to the standard GOE random matrix squared (together with the projector P ≡ diag(1_N, 0_{T−N}), designed to chip the rectangle Ỹ off the square GOE), whose properties are firmly established. The derivation results in the compact relation

N_c(z) = r z N_C(z) N_A(r z). (12)

This is the basic formula. Since the spectral properties of c are given by its M-transform, M ≡ M_c(z), it is more pedagogical to recast (12) as an equation for the unknown M,

z = r M N_C(M) N_A(r M). (13)

It provides a means for computing the mean spectral density of a doubly correlated Wishart random matrix once the "true" covariance matrices C and A are given.
In this communication, only a particular instance of this fundamental formula is applied, namely with an arbitrary auto-covariance matrix A, but with trivial cross-covariances, C = 1_N. Using that N_{1_K}(z) = 1 + 1/z, equation (13) thins out to

z = r (1 + M) N_A(r M), (14)

which will be strongly exploited below. Let us mention that the equalities (13), (14) have been derived through other, more tedious, techniques (the planar Feynman-diagrammatic expansion, the replica trick) in [14-18].

III. VARMA FROM FREE RANDOM VARIABLES
In what follows, we will assume that the VMA(q), VAR(q), or VARMA(q 1 , q 2 ) stochastic processes are covariance (weak) stationary; for details, we refer to [19]. It implies certain restrictions on their parameters, but we will not bother with this issue in the current work. Another consequence is that the processes display some interesting features, such as invertibility.
For all this, we must in particular take both N and T large from the start, with their ratio r ≡ N/T fixed (6). More precisely, we stretch the range of the a-index from minus to plus infinity. This means that all the finite-size effects (appearing at the ends of the time series) are readily disregarded. In particular, there is no need to care about initial conditions for the processes, and all the recurrence relations are assumed to continue to the infinite past.
A. The VMA(q) Process

The Definition of VMA(q)
We consider a situation when N stochastic variables evolve according to identical independent VMA(q) (vector moving average) processes, which we sample over a time span of T moments. This is a simple generalization of the standard univariate weak-stationary moving average MA(q). In such a setting, the value Y_{ia} of the i-th (i = 1, ..., N) random variable at time moment a (a = 1, ..., T) can be expressed as

Y_{ia} = Σ_{α=0}^{q} a_α ε_{i,a−α}. (15)

Here all the ε_{ia}'s are IID standard (mean zero, variance one) Gaussian random numbers (white noise), ⟨ε_{ia} ε_{jb}⟩ = δ_{ij} δ_{ab}. The a_α's are some (q + 1) real constants; importantly, they do not depend on the index i, which reflects the fact that the processes are identical and independent (no "spatial" covariances among the variables). The order q of the process is a positive integer.

The Auto-Covariance Matrix
In order to handle such a process (15), notice that the Y_{ia}'s, being linear combinations of uncorrelated Gaussian numbers, must also be Gaussian random variables, albeit displaying some correlations. Therefore, to fully characterize these variables, it is sufficient to calculate their two-point covariance function; this is straightforwardly done (see appendix 1 for details),

⟨Y_{ia} Y_{jb}⟩ = δ_{ij} A^{(1)}_{ab}, (16)

where

A^{(1)}_{ab} = A^{(1)}(a − b), with A^{(1)}(d) = Σ_{α=0}^{q−|d|} a_α a_{α+|d|} for |d| ≤ q, and A^{(1)}(d) = 0 for |d| > q. (17)

In other words, the cross-covariance matrix is trivial, C = 1_N (no correlations between different variables), while the auto-covariance matrix A^{(1)}, responsible for temporal correlations, can be called "(2q + 1)-diagonal." In the course of this article, we will use several different auto-covariance matrices, and for brevity, we decide to label them with superscripts; their definitions are all collected in appendix 2.
For example, in the simplest case of VMA(1), it is tri-diagonal,

A^{(1)}(0) = a_0^2 + a_1^2, A^{(1)}(±1) = a_0 a_1, A^{(1)}(d) = 0 for |d| ≥ 2. (18)
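These matrix elements can be confirmed by direct simulation: the lag-0 and lag-1 covariances of a simulated VMA(1) trajectory should equal a_0^2 + a_1^2 and a_0 a_1, respectively, with all higher lags vanishing. A quick sketch with illustrative values a_0 = 1, a_1 = 0.5:

```python
import numpy as np

rng = np.random.default_rng(3)
a0, a1 = 1.0, 0.5
T = 200_000                               # one long trajectory
eps = rng.standard_normal(T + 1)
Y = a0 * eps[1:] + a1 * eps[:-1]          # VMA(1): Y_a = a0 e_a + a1 e_{a-1}

lag0 = np.mean(Y * Y)                     # should be a0^2 + a1^2 = 1.25
lag1 = np.mean(Y[1:] * Y[:-1])            # should be a0 * a1 = 0.5
lag2 = np.mean(Y[2:] * Y[:-2])            # should vanish

assert abs(lag0 - (a0**2 + a1**2)) < 0.02
assert abs(lag1 - a0 * a1) < 0.02
assert abs(lag2) < 0.02
```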

The Fourier Transform and the M -Transform of the Auto-Covariance Matrix
Such an infinite matrix (17) is translationally invariant (as announced, this is one of the implications of weak stationarity), i.e., the value of any of its entries depends only on the distance between its indices, A^{(1)}_{ab} = A^{(1)}(a − b), non-zero for |d| = 0, 1, ..., q and A^{(1)}(d) = 0 for |d| > q. Hence, it is convenient to rewrite this matrix in the Fourier space,

Â^{(1)}(p) ≡ Σ_{d} A^{(1)}(d) e^{i p d} = | Σ_{α=0}^{q} a_α e^{i p α} |^2. (19)

In this representation, the M-transform of A^{(1)} is readily obtained [10],

M_{A^{(1)}}(z) = ∫_{−π}^{π} (dp/2π) Â^{(1)}(p) / ( z − Â^{(1)}(p) ). (20)

This integral can be evaluated by the method of residues for any value of q, which we do in appendix 3, where we also print the general result (42). In particular, for q = 1,

M_{A^{(1)}}(z) = −1 + z / ( √(z − (a_0 + a_1)^2) √(z − (a_0 − a_1)^2) ), (21)

where the square roots are principal. Substituting (21) into (14) and clearing the denominators then yields a fourth-order polynomial equation for the M-transform M of the Pearson estimator c of the VMA(1) process,

(1 + rM)^2 ( z − r(1 + M)(a_0 + a_1)^2 ) ( z − r(1 + M)(a_0 − a_1)^2 ) = z^2. (22)
The FRV technique therefore allowed us to find this equation through a few lines of simple algebra. It had already been derived in [6], and (22) may be verified to coincide with the version given in that paper; the pertinent equation is printed there before (A.6), and to compare the two, one needs to change their variables into ours according to y → 1/r, x → z/r, and m → −r(1 + M)/z. The last equality means that m and m̄ of [6] correspond in our language to the Green's functions −rG_c(z) and −G_a(z/r), respectively, where a = (1/N) Y^T Y is the Pearson estimator dual to c. As mentioned, a quick extension to the case of arbitrary q is possible; however, the resulting equations for M will be significantly more complicated; for instance, for q = 2, a lengthy ninth-order polynomial equation is found.
B. The VAR(q) Process

The Definition of VAR(q)
A set-up of N identical and independent VAR(q) (vector auto-regressive) processes is somewhat akin to (15), i.e., we consider N decoupled copies of a standard univariate AR(q) process,

Y_{ia} = Σ_{β=1}^{q} b_β Y_{i,a−β} + a_0 ε_{ia}. (23)

It is again described by the demeaned and standardized Gaussian white noise ε_{ia} (which triggers the stochastic evolution), as well as (q + 1) real constants a_0, b_β, with β = 1, ..., q. As announced before, the time stretches into the past infinity, so no initial condition is necessary. Although at first sight (23) may appear to be a more involved recurrence relation for the Y_{ia}'s, it is actually easily reduced to the VMA(q) case: rewriting it as ε_{ia} = (1/a_0) Y_{ia} − Σ_{β=1}^{q} (b_β/a_0) Y_{i,a−β}, one remarks that exchanging the Y_{ia}'s with the ε_{ia}'s yields precisely the VMA(q) process with the constants a^{(2)}_0 ≡ 1/a_0, a^{(2)}_β ≡ −b_β/a_0, β = 1, ..., q. In other words, the auto-covariance matrix A^{(3)} of the VAR(q) process (23) is simply the inverse of the auto-covariance matrix A^{(2)} of the corresponding VMA(q) process with the described modification of the parameters,

A^{(3)} = ( A^{(2)} )^{−1}. (24)

This inverse exists thanks to the weak stationarity supposition.

The Fourier Transform and the M -Transform of the Auto-Covariance Matrix
The Fourier transform of the auto-covariance matrix A^{(3)} of VAR(q) is therefore the (number) inverse of its counterpart for VMA(q) with its parameters appropriately changed,

Â^{(3)}(p) = 1 / Â^{(2)}(p), (25)

where

Â^{(2)}(p) = (1/a_0^2) | Σ_{β=0}^{q} b_β e^{i p β} |^2, (26)

and where we define b_0 ≡ −1.
In order to find the M-transform of the inverse matrix, A^{(3)} = (A^{(2)})^{−1}, one employs a general result, true for any (real symmetric) random matrix H, and obtainable through easy algebra,

M_{H^{−1}}(z) = −1 − M_H(1/z). (27)

Since the quantity M_{A^{(2)}}(z) is known for any q (42), so is M_{A^{(3)}}(z) via (27), but we will not print it explicitly. Let us just give it for q = 1, in which case (27) and (21) yield

M_{A^{(3)}}(z) = − a_0^2 / ( √(a_0^2 − (1 − b_1)^2 z) √(a_0^2 − (1 + b_1)^2 z) ), (28)

where again the square roots are principal.
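The identity (27) is exact for any fixed symmetric positive-definite matrix, so it can be verified directly on a randomly generated example; a minimal sketch (the matrix and the probe point z are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
K = 8
B = rng.standard_normal((K, K))
H = B @ B.T + K * np.eye(K)        # symmetric positive-definite => invertible
eig = np.linalg.eigvalsh(H)

def M(spec, z):                    # M-transform of a fixed matrix from its spectrum
    return np.mean(spec / (z - spec))

z = 0.01                           # any point away from both spectra
lhs = M(1.0 / eig, z)              # M_{H^{-1}}(z)
rhs = -1 - M(eig, 1.0 / z)         # -1 - M_H(1/z)
assert abs(lhs - rhs) < 1e-10
```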

The Auto-Covariance Matrix
Despite being somewhat outside the main line of thought of this article, an interesting question is to search for an explicit expression for the auto-covariance matrix A^{(3)} from its Fourier transform (25),

A^{(3)}(d) = ∫_{−π}^{π} (dp/2π) Â^{(3)}(p) e^{−i p d}, (29)

where we exploited the fact that A^{(3)} must be translationally invariant, A^{(3)}_{ab} = A^{(3)}(a − b). This computation sheds light on the structure of temporal correlations present in a VAR setting. The integral (29) is evaluated by the method of residues in a very similar manner to the one shown in appendix 3, and we do this in appendix 4. We discover that the auto-covariance matrix is a sum of q exponential decays,

A^{(3)}(d) = Σ_{γ=1}^{q} C_γ exp(−|d|/T_γ), (30)

where the C_γ are constants and the T_γ the characteristic times (44), γ = 1, ..., q; these constituents are given explicitly in (43). This is a well-known fact; nevertheless, we wanted to establish it again within our approach.
For example, for q = 1, the auto-covariance matrix of VAR(1) is a single exponential decay,

A^{(3)}(d) = ( a_0^2 / (1 − b_1^2) ) b_1^{|d|}, (31)

where we assumed for simplicity 0 < b_1 < 1 (the formula can easily be extended to all values of b_1).
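This single exponential decay is easy to confirm by a direct simulation of the AR(1) recurrence; a sketch with illustrative values a_0 = 1, b_1 = 0.5, where an initial transient is discarded so that the recurrence effectively extends into the infinite past:

```python
import numpy as np

rng = np.random.default_rng(5)
a0, b1 = 1.0, 0.5
T, burn = 400_000, 1_000
eps = rng.standard_normal(T + burn)
Y = np.empty(T + burn)
Y[0] = 0.0
for a in range(1, T + burn):               # AR(1): Y_a = b1 Y_{a-1} + a0 e_a
    Y[a] = b1 * Y[a - 1] + a0 * eps[a]
Y = Y[burn:]                               # drop the transient

var_th = a0**2 / (1 - b1**2)               # predicted lag-0 value, 4/3
assert abs(np.mean(Y * Y) - var_th) < 0.02
for d in (1, 2, 3):                        # lag-d value: var_th * b1^|d|
    lag = np.mean(Y[d:] * Y[:-d])
    assert abs(lag - var_th * b1**d) < 0.02
```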

The Pearson Estimator of the Covariances from Free Random Variables
Having found an expression for the M-transform of the auto-covariance matrix A^{(3)} of VAR(q) (27), (42), we may proceed to investigate equation (14). For q = 1, inserting (28) into (14) leads to

r M √(a_0^2 − (1 − b_1)^2 u) √(a_0^2 − (1 + b_1)^2 u) = −a_0^2, with u ≡ z / ( r(1 + M) ), (32)

which upon squaring and clearing the denominators becomes a fourth-order polynomial equation for M. This equation has been derived by another method in [6], and our result confirms their equation (A.8), with the change in notation, y → 1/r, x → z/r, z → rM.

C. The VARMA(q1, q2) Process

The Definition of VARMA(q1, q2)
The two types of processes which we elaborated on above, VAR(q_1) and VMA(q_2), can be combined into one stochastic process called VARMA(q_1, q_2),

Y_{ia} − Σ_{β=1}^{q_1} b_β Y_{i,a−β} = Σ_{α=0}^{q_2} a_α ε_{i,a−α}. (33)

Now it is a straightforward and well-known observation (which can be verified by a direct calculation) that the auto-covariance matrix A^{(5)} of this process is simply the product (in any order) of the auto-covariance matrices of the VAR and VMA pieces; more precisely,

A^{(5)} = A^{(1)} ( A^{(4)} )^{−1}, (34)

where A^{(1)} corresponds to the generic VMA(q_2) model (17), while A^{(4)} denotes the auto-covariance matrix of VMA(q_1) with a slightly different modification of the parameters compared to the one used previously, namely a_β ≡ −b_β, for β = 1, ..., q_1. We have thus already made use here of the fact that the auto-covariance matrix of a VAR process is the inverse of the auto-covariance matrix of a certain corresponding VMA process (24); the new change in parameters needed in moving from VAR to VMA effectively has a_0 = 1 w.r.t. what we had before (24), which is understandable: the "missing" a_0 is now included in the matrix of the other VMA(q_2) process.

FIG. 1: The mean spectral density of the Pearson estimator c for the VARMA(1, 1) process, obtained from (45), for various values of the process' parameters. The scale of these parameters is determined by choosing a_0 = 1 everywhere. Recall that the theoretical formula (45) is valid in the thermodynamical limit (6) of N, T → ∞, with r = N/T kept finite. UP LEFT: We set the remaining VARMA parameters to a_1 = 0.3, b_1 = 0.2, while the rectangularity ratio takes the values r = 0.5 (the purple line), 0.1 (red), 0.02 (magenta), 0.004 (pink); each one is 5 times smaller than the preceding one. We observe how the graphs become increasingly peaked (narrower and taller) around λ = 1 as r decreases, which reflects the convergence of the estimator c toward its underlying value C = 1_N. UP RIGHT: We fix r = 0.25 and consider the two VARMA parameters equal to each other, with the values a_1 = b_1 = 0.6 (purple), 0.4 (red), 0.2 (magenta), 0.01 (pink). DOWN LEFT: We hold r = 0.25 and b_1 = 0.2, and modify a_1 = 0.6 (purple), 0.4 (red), 0.2 (magenta), 0.0 (pink); for this last value, the VARMA(1, 1) model reduces to VAR(1). DOWN RIGHT: Similarly, but this time we assign r = 0.25 and a_1 = 0.2, while changing b_1 = 0.6 (purple), 0.4 (red), 0.2 (magenta), 0.0 (pink); this last value corresponds to VMA(1).

The Fourier Transform and the M -Transform of the Auto-Covariance Matrix
The Fourier transform of the auto-covariance matrix A^{(5)} of VARMA(q_1, q_2) (34) is simply the ratio of the respective Fourier transforms (19) and (25),

Â^{(5)}(p) = Â^{(1)}(p) / Â^{(4)}(p), (35)

where

Â^{(4)}(p) = | Σ_{β=0}^{q_1} b_β e^{i p β} |^2, (36)

where we recall b_0 ≡ −1. For instance, for VARMA(1, 1) (it is described by three constants, a_0, a_1, b_1), one explicitly has

Â^{(5)}(p) = ( a_0^2 + a_1^2 + 2 a_0 a_1 cos p ) / ( 1 + b_1^2 − 2 b_1 cos p ). (37)

The M-transform of A^{(5)} can consequently be derived from the general formula (20). We will evaluate here the pertinent integral only for the simplest VARMA(1, 1) process, even though an arbitrary case may be handled by the technique of residues,

M_{A^{(5)}}(z) = ( 1 / (z b_1 + a_0 a_1) ) [ −a_0 a_1 + z (a_0 + a_1 b_1)(a_0 b_1 + a_1) / ( √(z(1 − b_1)^2 − (a_0 + a_1)^2) √(z(1 + b_1)^2 − (a_0 − a_1)^2) ) ]. (38)
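The VARMA(1, 1) M-transform can be cross-checked numerically: the defining integral of the M-transform over the Fourier symbol can be computed by simple quadrature and compared with the closed form that the residue calculation yields (the closed-form expression coded below is our residue evaluation). A sketch with illustrative values a_0 = 1, a_1 = 0.3, b_1 = 0.2 and a real probe point above the spectral support:

```python
import numpy as np

a0, a1, b1 = 1.0, 0.3, 0.2
z = 10.0                                    # above the upper spectral edge

# Fourier symbol of the VARMA(1,1) auto-covariance matrix
p = np.linspace(-np.pi, np.pi, 200_000, endpoint=False)
Ahat = (a0**2 + a1**2 + 2*a0*a1*np.cos(p)) / (1 + b1**2 - 2*b1*np.cos(p))

# M-transform from the defining integral, by periodic trapezoidal quadrature
M_quad = np.mean(Ahat / (z - Ahat))

# closed form obtained by residues
S = np.sqrt(z*(1 - b1)**2 - (a0 + a1)**2) * np.sqrt(z*(1 + b1)**2 - (a0 - a1)**2)
M_closed = (-a0*a1 + z*(a0 + a1*b1)*(a0*b1 + a1)/S) / (z*b1 + a0*a1)

assert abs(M_quad - M_closed) < 1e-8
```

Setting b_1 = 0 (resp. a_1 = 0) in the closed form reproduces the pure VMA(1) (resp. VAR(1)) M-transforms, which serves as an additional consistency check.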

The Auto-Covariance Matrix
One might again attempt to track the structure of temporal covariances in a VARMA process. This can be done either by the inverse Fourier transform of (35), or through a direct computation based on the recurrence relation (33) (importantly, adhering to the assumption that it stretches into the past infinity). Let us print the result just for VARMA(1, 1),

A^{(5)}(d) = ( (a_0 + a_1 b_1)(a_0 b_1 + a_1) / ( b_1 (1 − b_1^2) ) ) b_1^{|d|} − ( a_0 a_1 / b_1 ) δ_{d,0}, (39)

where for simplicity 0 < b_1 < 1. This is an exponential decay, with the characteristic time of the VAR piece,

T_1 = −1 / ln b_1, (40)

with an additional term on the diagonal.

The Pearson Estimator of the Covariances from Free Random Variables

Inserting the M-transform (38) into the fundamental FRV formula (14) yields an equation for the M-transform of the Pearson estimator c of the VARMA(1, 1) process; it happens to be polynomial of order six, and we print it (45) in appendix 5. It may be solved numerically, a proper solution chosen (the one which leads to a sensible density: real, positive, normalized to unity), and finally, the mean spectral density ρ_c(λ) derived from (9). We show the shapes of this density for a variety of the values of the parameters r, a_0, a_1, b_1 in fig. 1. Moreover, in order to test the result (45), and more broadly, to further establish our FRV framework in the guise of formula (14), the theoretical form of the density is compared to Monte Carlo simulations in fig. 2; they remain in excellent concord. These are the main findings of this article.
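Beyond the full density, which requires solving the sixth-order equation (45), simple spectral observables of the simulated ensemble can be checked immediately; for instance, the first moment of the eigenvalue density of c must equal the VARMA(1, 1) process variance (the d = 0 auto-covariance). A sketch with illustrative values r = 0.25, a_0 = 1, a_1 = 0.3, b_1 = 0.2:

```python
import numpy as np

rng = np.random.default_rng(6)
N, T, burn = 100, 400, 500
a0, a1, b1 = 1.0, 0.3, 0.2

# N independent VARMA(1,1) trajectories: Y_a - b1 Y_{a-1} = a0 e_a + a1 e_{a-1}
eps = rng.standard_normal((N, T + burn + 1))
Y = np.zeros((N, T + burn))
Y[:, 0] = a0 * eps[:, 1] + a1 * eps[:, 0]
for a in range(1, T + burn):
    Y[:, a] = b1 * Y[:, a - 1] + a0 * eps[:, a + 1] + a1 * eps[:, a]
Y = Y[:, burn:]                             # discard the initial transient

c = Y @ Y.T / T                             # Pearson estimator
eig = np.linalg.eigvalsh(c)

# first moment of the spectrum = process variance (a0^2 + a1^2 + 2 a0 a1 b1)/(1 - b1^2)
var_th = (a0**2 + a1**2 + 2*a0*a1*b1) / (1 - b1**2)
assert abs(eig.mean() - var_th) < 0.05
```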

IV. CONCLUSIONS
In this paper we attempted to advertise the power and flexibility of the Free Random Variables calculus for multivariate stochastic processes of the VARMA type. The FRV calculus is ideally suited for multidimensional time series problems, provided the dimensions of the underlying matrices are large. The operational procedures are simple, algebraic and transparent. The structure of the final formula, which relates the moments' generating functions of the population covariance and the sample covariance, allows one to easily derive the eigenvalue density of the sample covariance. We illustrated in detail how this procedure works for VARMA(1, 1), confronted the theoretical prediction with numerical data obtained from Monte Carlo simulations of the VARMA process, and observed perfect agreement.
The FRV calculus is not restricted to Gaussian variables. It also works for non-Gaussian processes, including those with heavy-tailed increments belonging to the Lévy basin of attraction, where the moments do not exist. Since the majority of data collected nowadays is naturally stored in the form of huge matrices, we believe that the FRV technique is the most natural candidate for the matrix-valued "probability calculus" that can provide efficient algorithms for cleaning (de-noising) large sets of data and unraveling essential but hidden correlations.

Appendix 1

In this appendix, we sketch a proof of the formula (17) for the auto-covariance matrix of the VMA(q) process. As mentioned, since the random variables are centered Gaussian, this matrix alone suffices to completely capture all their properties. We set i = j; the dependence on this index may be dropped, as there are no cross-correlations here. We use the definition (15) of VMA(q), as well as the auto-covariance structure of the white noise, ⟨ε_{ia} ε_{jb}⟩ = δ_{ij} δ_{ab}. This leads to

⟨Y_{ia} Y_{ib}⟩ = Σ_{α,β=0}^{q} a_α a_β ⟨ε_{i,a−α} ε_{i,b−β}⟩ = Σ_{α,β=0}^{q} a_α a_β δ_{a−b,α−β} = Σ_{d=−q}^{q} δ_{a−b,d} Σ_{α} a_α a_{α−d}, (41)

which, upon splitting the sum over d into three pieces (from −q to −1, d = 0, and from 1 to q), is quickly seen to coincide with (17).
Appendix 4

We proceed by the technique of residues, analogously to appendix 3, however this time with the aid of another variable, x ≡ e^{−ip}, related to the one previously used through y ≡ 2 cos p = x + 1/x. The integration measure is dp = i dx/x, and the integration path is counterclockwise around the centered unit circle. The denominator of the integrand is a polynomial of order q in the variable y, having thus some q roots ỹ_β, β = 1, ..., q. Therefore, there are 2q corresponding solutions for the variable x, half of them inside the integration path and half outside; let x̄_β be the solutions to x + 1/x = ỹ_β with absolute values less than 1. Only these contribute to the integral, and their residues straightforwardly give the result (43). This is indeed a sum of q exponentials (x̄_γ)^{|d|}, γ = 1, ..., q. Remark that the solutions may be complex; hence this is really q different exponential decays exp(−|d|/T_γ), with the characteristic times (44) (these times are positive as the roots have absolute values less than 1), possibly modulated by sinusoidal oscillations when a root has an imaginary part.
The sixth-order equation (45) in a Mathematica file can be obtained from the authors upon request.