Index of a matrix, complex logarithms, and multidimensional Fresnel integrals

We critically discuss the problem of finding the $\lambda$-index $\mathcal{N}(\lambda)\in [0,1,\ldots,N]$ of a real symmetric matrix $\mathbf{M}$, defined as the number of eigenvalues smaller than $\lambda$, using the entries of $\mathbf{M}$ as only input. We show that a widely used formula $$ \mathcal{N}(\lambda)=\lim_{\epsilon\to 0^+}\frac{1}{2\pi \mathrm{i}}\left[\log\det(\mathbf{M}-\lambda+\mathrm{i}\epsilon)-\log\det(\mathbf{M}-\lambda-\mathrm{i}\epsilon)\right] $$ based on the branch-cut structure of the complex logarithm should be handled with care, as it generically fails to produce the correct result if the same branch is chosen for the two logarithms. We improve the formula using multidimensional Fresnel integrals, showing that even the new version provides at most a self-consistency equation for $\mathcal{N}(\lambda)$, whose solution is not guaranteed to be unique. Our results are corroborated by explicit examples and numerical evaluations.

based on the branch-cut structure of the complex logarithm should be handled with care, as it generically fails to produce the correct result if the same branch is chosen for the two logarithms. We improve the formula using multidimensional Fresnel integrals, showing that even the new version provides at most a self-consistency equation for N (λ), whose solution is not guaranteed to be unique. Our results are corroborated by explicit examples and numerical evaluations.

The question
Consider a N × N real symmetric matrix M = (M ij ), whose eigenvalues (all real) are λ 1 ≤ λ 2 ≤ . . . ≤ λ N . Is there a way to count how many eigenvalues of M fall below a threshold λ, using as only input ‡ the entries of M ? This question might appear rather bizarre at first sight. In practice, though, this is often the scenario one faces when dealing with non-invariant random matrices, where the only available information concerns the distribution of matrix entries. This setting is relevant to Wigner matrices, especially in connection with sparse undirected graphs. Having a formula that connects directly the entries of M to the number of its eigenvalues falling below a certain threshold -an interesting and well-studied observable in the context of random matrices [1,2] -would certainly come very handy in these cases.
For the time being, let us focus on a single, deterministic matrix M , and define the λ-index of M , denoted by N (λ) as N (λ) = # of eigenvalues of M less than λ . (1) A formula that is claimed to do the job exists (see [3][4][5][6][7] for equivalent or related versions of the formula given here) At first sight, this formula indeed seems like an excellent candidate. First, it only requires as input the entries of M (in the form of a determinant). Second, the complex function f (z) = log det(M − z), having cuts on the real axis, allegedly counts how many eigenvalues fall below λ by recording how many jumps of width 2πi one meets while approaching the real axis from above and from below. But is this really the case?
The first thing to notice is that the notation log (ln is also often used) in (2) is ambiguous: the complex logarithm is a multi-valued function (see Appendix A for details), hence it is important to specify which branch is chosen. If the same branch is chosen for the two logarithms appearing in (2) (for example, the Principal branch, the default choice in Mathematica), then a simple example will be sufficient to show that N (λ) can only take a handful of (half)-integer values, making (2) rather unfit for purpose. whose eigenvalues are λ = {λ 1 , λ 2 , λ 2 , λ 4 } = {−11.0323..., −2.79789..., −0.631656..., 2.46186...}. Therefore, N (0) = 3 (just an example for λ = 0). Now, let us pretend we do not have any way to access the list of eigenvalues directly (hence we cannot just count them!). We just wish to infer N (0) by performing manipulations on the entries of M alone. Trying to use (2), we have to compute whose principal logarithms (hereafter denoted by Log), evaluated e.g. for = 10 −8 in Mathematica yield The difference between the two is quite clearly 2πi, therefore N (0) = 1 using (2) (and not N (0) = 3 as expected!). This is understandable, as the difference between imaginary parts of two principal logarithms is bounded between −2πi and 2πi, being therefore rather unfit for any counting purpose. We can try now to gain some further intuition by exploiting the very bit of information that we pledged not to use, namely the exact values λ = {λ 1 , λ 2 , λ 2 , λ 4 } of the eigenvalues of M . Let us cheat for a moment then, and compute numerically the following quantities to machine precision (fix e.g. = 10 −8 ) Log(λ 4 + i ) 0.9009181991462066 + 10 −9 i (10) For each of the negative eigenvalues (like λ 1 ), subtracting (9) from (8) we get a contribution 2πi, while for each of the positive eigenvalues (like λ 4 ) there is no imaginary part and the subtraction gives 0. Therefore we get as many 2πi contributions as there are negative eigenvalues (or more generally, eigenvalues < λ). But isn't this precisely achieving the counting of eigenvalues that we were after? Indeed, the following formula does the job perfectly § and in our case would correctly return N (0) = 3. The point is that (12) is not equivalent to (2), as one might be naively tempted to assume (using the standard mantra Tr log = log det). § Not completely true, but bear with us for a moment.
Let us now pause for a moment and summarise what we have done. We have shown that knowing the individual eigenvalues we can determine N (λ) exploiting the branch cut structure of the complex logarithm, and land on the correct answer (Eq. (12)). If instead we do not know (or cannot use) the eigenvalues (as postulated at the beginning), and we try to cast Eq. (12) in the form of Eq. (2) (involving only the entries of M ), we get the wrong answer. Why?
The reason can be ultimately traced back to the failure of the standard relation among real variables ln(x 1 x 2 ) = ln(x 1 ) + ln(x 2 ) in a complex setting. Indeed, in general as the imaginary part of the l.h.s. is bounded between −π and π, while the imaginary part of the r.h.s. -being the sum of two -is not guaranteed to lie again between −π and π. That is why the sum of principal logarithms in (12) does not translate into a single principal logarithm of a product (the determinant in Eq. (2)). Would fiddling with the branches of the complex logarithm in Eq. (2) cure the problem? Choosing the same (non-principal) branch for the two logarithms appearing in Eq.
(2) will unfortunately have no effect: the addition of the same amount 2πin (n ∈ N) to each of the two principal values (which is the operation needed to define a new non-principal logarithm, see Appendix A) gets wiped out in the subtraction. One would be forced to choose two different branches of the complex logarithms depending on the specific form of the matrix M and the value of the threshold λ, with the obvious drawback that the resulting formula will be heavily case-dependent and ultimately of little use.

Problem n. 2
There is yet another problem with this formalism. Indeed, Eq. (12) is not completely flawless either.
The problem specifically arises when the matrix M has N 0 eigenvalues exactly equal to λ. Then, applying (12) we would get = lim where we have isolated the contribution from the eigenvalues that are exactly equal to λ. However, due to the following identity (proved in Appendix B) Obviously, we are purposely using an overly complicated method, instead of just counting them! the formula (12) actually overestimates the correct number of eigenvalues strictly smaller than λ by an amount N 0 /2. An improved index formula can be nevertheless defined by discounting half of the number of eigenvalues exactly equal to λ. This is achieved by (see again Appendix B) Let us check it on the following example. We consider This matrix has the following eigenvalues λ = {λ 1 , λ 2 , λ 3 , λ 4 } = {−2, −1, 0, 2}. If we apply again (12) to λ = 0, we would get N (0) = 5/2, which is clearly incorrect. Indeed we get for = 10 −2 Clearly the zero eigenvalue λ 3 induces a jump of width πi, which interferes with the counting of eigenvalues strictly smaller than λ = 0. Let us now consider the correction term that appears in (17). We get and from (17) as expected.
To summarise, we have produced two formulas for the λ-index (Eq. (12) and (17)), which return the correct resultΠ. From now on, we will focus on the "shorter" version (Eq. (12)) for illustrative purposes: all our considerations can be easily extended to the more accurate version (Eq. (17)) if needed.
We have also showed that Eq. (12) is not equivalent to the starting point in Eq. (2). This is really disappointing, though, because Eq. (12) relies on the complete knowledge of all the individual eigenvalues (which of course makes the whole exercise rather pointless), while Eq. (2) does not (but fails to produce the correct result!).
Is there a way to convert the "correct" formula (12) into an expression involving only the entries of the matrix M ? A possible strategy -based on a careful evaluation of multidimensional Fresnel integrals -and its limitations is outlined in the next section.

Multidimensional Fresnel integrals
For reasons that will become readily apparent, we now carefully evaluate the multidimensional Fresnel integral where > 0, x = (x 1 , . . . , x N ) T is a real column vector, M is a real symmetric matrix with real eigenvalues {λ i }, λ a real number and 1 is the N × N identity matrix. Note that the integral is perfectly convergent, as the coefficient of terms ∼ x 2 in the exponent is negative (− /2). As a warm-up, we consider the single Fresnel integral which is equal to Note that we have intentionally avoided writing the result in terms of square roots of complex numbers. The way the result is written in (23) presents no ambiguities whatsoever, and can be easily checked numerically with arbitrary precision. Now, turning back to (21), let M = ODO T be the spectral decomposition of M , with O the orthogonal matrix of eigenvectors. Making a change of variable y = O T x with unit Jacobian, we readily realize than Π Modulo the incorrect handling by Eq. (12) of matrices with eigenvalues exactly equal to λ.
where {λ k } are the eigenvalues of M . Therefore The first question to address is whether this expression may be cast in a more familiar (at least to a physicist's eye) form involving the inverse square root of a determinant. This procedure requires again a very meticulous care in the manipulations, and is reported in Appendix C.
Multiplying by (2π) −N/2 and taking the principal logarithm on both sides of (25), we get where · denotes the largest integer less than or equal to (·). The last term on the r.h.s. is due to another startling property of complex logarithms, namely Log(exp(z)) may not be just equal to z ! Eq. (26) is particularly interesting, as it connects an expression that depends on the entries of M (on the l.h.s.) to the term (depending on the eigenvalues of M ), which appears in (12). May (26) be then the long-sought key for our problem? It would certainly be, were it not for the last term in (26), which (as we now proceed to show) still depends explicitly on the exact pattern of eigenvalues. Indeed, by definition of principal logarithm where atan2(y, x) is defined in (B.4). Since > 0, we have In view of (28), there will be as many +π terms arising in (27) -once written in terms of arctan -as there are eigenvalues smaller than λ (= N (λ)). Similarly, we may consider the multidimensional Fresnel integral where > 0, x = (x 1 , . . . , x N ) T is a real column vector, M is a real symmetric matrix with real eigenvalues {λ i }, λ a real number and 1 is the N × N identity matrix. Note the change in sign of i/2 in the argument of the exponential, necessary to ensure that the integral is convergent. As a warm-up, we again consider the single-integral version of it which is equal to Now, turning back to (30), let M = ODO T the spectral decomposition of M , with O the orthogonal matrix of eigenvectors. Making a change of variable y = O T x with unit Jacobian, we readily realize than where {λ k } are the eigenvalues of M . Therefore Note that the sum of logarithms appearing on the r.h.s. is precisely the second term on the r.h.s. of (12). Multiplying by (2π) −N/2 and taking the principal logarithm on both sides, we get We now proceed to evaluate Im( ). We have again by definition of principal logarithm where atan2(y, x) is defined in (B.4). Since − < 0, we have Therefore, from (12) and using (26) and (35) we can write an equation for the index of the matrix M + as which can then be rewritten explicitly as Note that (rather unexpectedly) the λ-index N (λ) itself has cropped up on the right hand side as well. This is due to as many terms ±π arising in (28) and (37) as there are eigenvalues less than λ in the spectrum of M . Eq. (40) is thus to be regarded as a self-consistency equation for N (λ), although at this stage the rather annoying explicit dependence on the eigenvalues has not yet disappeared.
Taking the last limit inside the floor brackets is safe, except in the case where 1/2 + N (λ)/4 − N/8 (or 1/2 − N (λ)/4 + N/8) are integers. This is due to the fact that for m ∈ Z, the limit and the "floor" operations do not necessarily commute. Indeed, assume lim →0 + φ( ) = 0. Then depending on whether φ( ) > 0 or < 0 for small positive . Under the assumption that 1/2 + N (λ)/4 − N/8 and 1/2 − N (λ)/4 + N/8 are not integers (which needs to be checked case by case), we can safely neglect the arctan terms * , and we land on the following self-consistent equation for N (λ) where we have also erased the constants (2π) −N/2 that get cancelled in the difference of Logs. This is one of the main results of this paper. Note that both integrals on the r.h.s. of (43) are convergent and only depend on the entries of M as desired. Moreover, the r.h.s. is very close to the integral formula that in some papers is claimed to provide directly the λ-index. This claim, however, can be immediately ruled out for the very same reason why (12) and (2) are not the same thing: the imaginary part of the difference of two principal logarithms divided by πi is bounded between −2 and 2 and is therefore unfit to "count" quantities up to N .
The way formula (43) becomes operational is as follows: for a given real symmetric matrix N × N M , evaluate the r.h.s. of (43). Then, for that given value of N , create a table of Therefore, if the r.h.s. of (43) comes up = 1, then the λ-index of the matrix M could only be = 3, or = 7, or = 11. As this example already shows, the drawback of the formula is that uniqueness is not guaranteed: in other words, there may exist multiple N (λ) for which equality holds (see also Fig. 1 and Example 2 below). In this sense, the r.h.s. only provides a set of allowed /forbidden values of the index that the matrix M can have, but in general may fail to single out the "correct" one. This non-uniqueness is the price to pay for wiping the eigenvalues out the game entirely: eq. (43) should be thus regarded as the "amended" version of (2) where i) the ambiguous "log" is replaced by the principal Log, ii) the inverse square roots of determinants are replaced by * For any finite N , the terms φ( ) = 1 4π k arctan ± λ k −λ can be made arbitrarily small. In particular, inside a "floor" bracket x + φ( ) , they can be made smaller than the distance between x and its nearest integer (and thus immaterial), unless x is itself an integer.
Bear in mind, however, that outstanding issues still persist around Eq. (43): it has been derived under the assumptions that (i) the matrix M does not have eigenvalues exactly equal to λ, and (ii) the arguments of the "floor" functions are not integers. multidimensional Fresnel integrals (see also Appendix C), but iii) uniqueness of the λ-index value is generically lost. For the sake of completeness, we give here below a 4 × 4 example where uniqueness is eventually attained, and a 5 × 5 example where it is not.
whose eigenvalues are λ = {−9.25637, 6.54579, −2.46691, 1.17749}. Therefore N (0) = 2. Setting and The integrals (evaluated with Mathematica with = 10 −6 ) read One notices that the correct value of the index N (0) = 2 produces a l.h.s. that matches the r.h.s. (= 0). On the other hands, two more incorrect values for the index N (0) = 0, 4 also appear to be compatible with the r.h.s. . A closer look reveals that in those two cases, the assumption that 1/2 + N (λ)/4 − N/8 and/or 1/2 − N (λ)/4 + N/8 are not integers is not satisfied. Therefore, let us cheat for a moment, and go back to Eq. (40) to evaluate the limit at λ = 0 and for N (0) = 0 more carefully. To do so, we clearly need to use some information about the eigenvalues, whichonce again -we pledged not to use. But please bear with us for a moment.
Having N (0) = 0 means that all eigenvalues are positive, therefore 1 4π k arctan λ k > 0 and 1 4π k arctan − λ k < 0 for positive. Therefore the limit in (50) yields and not −1 as one would have naively obtained by just setting = 0 in (50). Therefore the l.h.s. of (43) should be corrected and would actually read −2 and not 0, implying that N (λ = 0) = 0 is not a good match anymore.
Similarly, for N (λ = 0) = 4 we need to evaluate the limit and not 1 as one would have naively obtained by just setting = 0 in (52). Therefore the l.h.s. of (43) should be corrected and would actually read +2 and not 0, implying that N (λ = 0) = 4 is not a good match either. In summary, the formula (43) (depending only on the entries of the matrix M ) eventually provides the correct and unique value for the index N (λ = 0) = 2, but this only happens because other potential matchings between the l.h.s. and the r.h.s. can be ruled out following a careful evaluation of the limits in the original formula (40) (see (50) and (52)). This strategy -in principle relying on explicit information about eigenvalues, which we should never use -can clearly be followed only in a few rare instances, namely when the sign of the arctan terms in (40) can be determined from the mere knowledge of N (λ) alone (e.g. knowing that all eigenvalues are positive/negative, without using their explicit values).
Outside these rare instances, the best we can do when either 1/2 + N (λ)/4 − N/8 or 1/2 − N (λ)/4 + N/8 happen to be integers is to take advantage of Eq. (42). In practice, the effect of the arctan terms -depending explicitly on the exact pattern of eigenvalues -leads at most to lowering the "floor" argument by one. Therefore, the Table 1 -assuming we could not analyse the "floor" terms in full detail, as we just did -could be amended as follows.
Not being able to analyse the "floor" functions in full detail clearly has the disappointing consequence that other potential matchings could not be ruled out, and uniqueness would be lost.  Table 2: Possible values of the l.h.s. of (43) for λ = 0 and N = 4 depending on the value of N (λ = 0) ∈ {0, 1, 2, 3, 4}, taking into account possible extra matchings due to the arguments of the "floor" functions being integers. and The integrals (evaluated with Mathematica with = 10 −6 ) read from which the r.h.s. of (43) yields −3/2 (to the numerical precision). On the other hand, we can produce the following table for the l.h.s. of (43) (with λ = 0 and N = 5).  Table 3: Possible values of the l.h.s. of (43) for λ = 0 and N = 5 depending on the value of N (λ = 0) ∈ {0, 1, 2, 3, 4, 5}.
One notices that in this case the correct value for the index (N (λ = 0) = 5), yielding a l.h.s. equal to the r.h.s. −3/2 comes alongside another incorrect matching N (λ = 0) = 1. In this case, the spurious solution of the self-consistency equation (43) cannot be ruled out, because the corresponding arguments of the "floor" functions are not integers.

Summary
In summary, we have shown that the claimed algebraic identity Eq. (2) generically fails to count the number of eigenvalues of real symmetric matrices M falling below a threshold λ (the λ-index N (λ)) every time the same branch of the complex logarithm is chosen for the two determinants appearing in (2). The improved formula (12) is equally unsatisfactory, for two reasons: first, it depends explicitly on the eigenvalues of M themselves, and not on its entries (making the whole exercise rather pointless); second, it also fails to deal correctly with matrices M having a certain number of eigenvalues exactly equal to λ.
The two determinants in the crippled formula (2) have been represented as multidimensional Gaussian-Fresnel (or less frequently as Grassmann integrals [3]) in previous studies of the index of random matrices [4][5][6][7]. We have shown that these representations per se do not cure the problem, which is originated by the phase pattern of complex logarithms.
Indeed, by a very careful evaluation of multidimensional (convergent) Fresnel integrals, supported by several examples and numerical checks, we have demonstrated that the difference of principal logarithms of multidimensional Fresnel integrals only yields a set of possible λ-indices of M , which may well contain spurious values alongside the correct one. The very same problem would arise if Grassman integrals were used instead of Fresnel.
The inaccurate identity (2) -and its "integral" versions -are typically used in connection with the celebrated "replica trick" [8,9]: it is therefore a very interesting question how the corresponding results turn out to be likely correct while relying on a technically flawed starting point. This issue will be discussed in a separate publication.

Appendix A. Complex logarithms
A complex logarithm is defined as a solution w of the equation where z is any nonzero complex number. When z and w are written as z = re iΘ (−π < Θ ≤ π) and w = u + iv, eq. (A.1) becomes e u e iv = re iΘ . If z = re iθ is a nonzero complex number, the argument θ takes any one of the values θ = Θ + 2nπ (n = 0, ±1, ±2, . . .) where Θ = Arg z. Therefore, the definition of the logarithmic function (A.4) can be written as log z = ln r + iθ.
If we let α denote any real number and restrict the value of θ so that α < θ < α + 2π the function log z = ln r + iθ, r > 0, α < θ < α + 2π (A.6) is single-valued and continuous in the stated domain.
A branch of a multiple-valued function f is any single-valued function F that is analytic in some domain at each point z of which the value F (z) is one of the values f (z). The requirement of analyticity, of course, prevents F from taking on a random selection of the values of f . Observe that, for each fixed α the single-valued function log z = ln r + iθ, is a branch of the multiple-valued function log z = ln r + iθ . (A.8) Some identities involving real logarithms in calculus carry over to complex analysis and others do not (see [10] for an excellent online resource).
If z 1 and z 2 denote any two nonzero complex numbers, it is straightforward to show that log(z 1 z 2 ) = log z 1 + log z 2 (A.9) but this statement involves a multiple-valued function! Hence it means that if two of the three logarithms are specified, then there is a value of the third logarithm such that this equation holds. As noted before, however, the identity (A.9) is no longer necessarily valid if log is replaced by Log. where the function atan2(y, x) is such that −π < atan2(y, x) < π and defined in terms of the standard arctan(x) (whose range is [−π/2, π/2]) as arctan(y/x) + π for y ≥ 0, x < 0 arctan(y/x) − π for y < 0, x < 0 Now there are three cases: • λ = 0. In this case atan2( , − ) → arctan(−1) + π = 3π/4 for → 0 + , while atan2( , + ) → arctan(1) = π/4. Therefore, as expected.
To drag a determinant into the game, we would need to swap the square root and the product symbols. This can only be done at the price of introducing an extra phase, as (Z 1 Z 2 ) a = Z a 1 Z a 2 e 2πiaN + , with with Arg denoting the principal argument of z. Therefore, we can conclude that where we have used i N/2 = exp((N/2)Log(i)) = exp(iN π/4), andN (λ) ∈ Z depends on the specific pattern of phases of the numbers {λ k − λ + i }. The consequence is that a representation of the multidimensional Fresnel integral in terms of inverse (principal) square root of a determinant does exist, but it carries a sign "ambiguity" † † that would be resolved knowing the precise phase patterns of the numbers {λ k − λ + i }. However, expressing the integerN (λ) in terms of the entries of M alone is rather unnatural, and is preferable to keep adopting the representation (25) that is free of ambiguities.