1 Introduction

The quantitative genetic theory of multivariate trait evolution provides a powerful framework to analyze and predict phenotypic evolution (Steppan et al. 2002; Blows 2007; Blows and Walsh 2009; Walsh and Blows 2009; Teplitsky et al. 2014). At the core of the theory is the Lande equation, which describes a population’s response to directional selection under certain simplifying conditions (Lande 1979; Lande and Arnold 1983):

$$\begin{aligned} \Delta \bar{\textbf{z}}= \textbf{G} \varvec{\upbeta }, \end{aligned}$$

where \(\varvec{\upbeta }\) is a p-dimensional selection gradient vector (partial regression coefficients of relative fitness on the trait values), \(\textbf{G}\) is a \(p \times p\) (additive) genetic covariance matrix, and \(\Delta \bar{\textbf{z}}\) is a p-dimensional vector of per-generation change in the population’s mean trait values. (Throughout the paper, p denotes the (nominal) number of traits/variables analyzed, and all vectors are column vectors unless transposed.) In light of this theory, the \(\textbf{G}\) matrix is taken to represent the population’s (in)ability to respond to directional selection, which is what Hansen, Houle and colleagues have termed evolvability (Houle 1992; Hansen 2003; Hansen et al. 2003a, b, 2011; Hansen and Houle 2008; Hansen and Pélabon 2021). The theory has motivated extensive investigations into aspects of the \(\textbf{G}\) matrix or its substitutes, ranging from theoretical and simulation-based analyses (e.g., Pavlicev and Hansen 2011; Pavlicev et al. 2011; Jones et al. 2012; Chevin 2013; Melo and Marroig 2015; Hansen et al. 2019) to empirical inter-population/interspecific comparisons of trait covariation and its extension to inter-population divergence (e.g., Cheverud 1982, 1989; Schluter 1996; Porto et al. 2009; Marroig et al. 2009; Rolian 2009; Bolstad et al. 2014; Haber 2016; Puentes et al. 2016; Costa e Silva et al. 2020; Machado 2020; McGlothlin et al. 2022; Opedal et al. 2022, 2023; Hubbe et al. 2023).
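As a minimal numerical illustration of the Lande equation (a sketch in R with hypothetical values, \(p = 2\)):

```r
## The Lande equation, delta z-bar = G beta, with hypothetical values (p = 2)
G <- matrix(c(1.0, 0.5,
              0.5, 2.0), nrow = 2)  # additive genetic covariance matrix
beta <- c(0.1, -0.2)                # selection gradient
(dz <- c(G %*% beta))               # predicted response: 0.00 -0.35
```

Note that the response in the first trait is zero despite direct selection on it (\(\beta_1 = 0.1\)): genetic covariation can bias or constrain the response, which is what the evolvability measures below quantify.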

Based on this framework, and following earlier theoretical developments (Hansen 2003; Hansen et al. 2003a, b), Hansen and Houle (2008) proposed several scalar indices to capture certain aspects of the multivariate evolvability represented in \(\textbf{G}\) (see also Kirkpatrick 2009; Marroig et al. 2009). Of these, (unconditional) evolvability e, conditional evolvability c, autonomy a, integration i, respondability r, and flexibility f (due to Marroig et al. 2009) concern a single \(\textbf{G}\) matrix (Fig. 1A), whereas response difference d concerns comparison between two \(\textbf{G}\) matrices. Related to the latter is the use of response correlation \(\rho \) for matrix comparison (Cheverud 1996; Cheverud and Marroig 2007) (Fig. 1B). Notably, all these indices are simple or multiple ratios of quadratic forms in \(\varvec{\upbeta }\) (see below). In this paper, they are collectively called evolvability measures. Many of them can be related to the rate of adaptation or evolutionary constraint/bias (e.g., Hansen and Houle 2008; Chevin 2013; Bolstad et al. 2014; Hansen et al. 2019), providing means to characterize and compare \(\textbf{G}\) matrices in biologically meaningful ways.

Fig. 1

Geometric interpretations of evolvability measures. In a hypothetical bivariate trait space, genetic covariance matrices \(\textbf{G}\) are schematically represented by equiprobability ellipses around the mean phenotype \(\bar{\textbf{z}}\), and selection gradients \(\varvec{\upbeta }\) by acute-headed brown arrows. Response vectors \(\Delta \bar{\textbf{z}}\) are matrix products of \(\textbf{G}\) and \(\varvec{\upbeta }\), represented by broad-headed red/magenta arrows. (a) One-matrix measures: evolvability e is the norm of \(\Delta \bar{\textbf{z}}\) projected on \(\varvec{\upbeta }\), standardized by the norm of the latter, \(|\varvec{\upbeta } |\); conditional evolvability c is the same but when \(\Delta \bar{\textbf{z}}\) is constrained along \(\varvec{\upbeta }\); autonomy a is the ratio of c to e; respondability r is \(|\Delta \bar{\textbf{z}} |\) standardized by \(|\varvec{\upbeta } |\); and flexibility f is the similarity in direction (cosine of the angle) between \(\Delta \bar{\textbf{z}}\) and \(\varvec{\upbeta }\). (b) Two-matrix comparison measures: response difference d is the distance between the end points of the two \(\Delta \bar{\textbf{z}}\)’s standardized by \(|\varvec{\upbeta } |\), and response correlation \(\rho \) is their similarity in direction

The evolvability measures are primarily defined with respect to a given selection gradient \(\varvec{\upbeta }\). In this sense, they represent aspects of \(\textbf{G}\) matrices under a fixed, deterministic selection regime. In many theoretical and comparative analyses of evolvability (see references above), however, interest is often in characterizing \(\textbf{G}\) matrices without reference to \(\varvec{\upbeta }\). For this purpose, it is sensible to take the expectation of an evolvability measure assuming a certain probability distribution of \(\varvec{\upbeta }\). Most typically, although not necessarily, the uniform distribution on a unit hypersphere is assumed, representing a completely randomly directed and uncorrelated selection regime. The resultant quantities are hereafter called average measures. Biologically speaking, the average measures may be regarded as representations of general evolvability (in the sense of Riederer et al. 2022), as opposed to specific evolvability, which is represented by the evolvability measures as functions of a fixed \(\varvec{\upbeta }\).

However, there is a major obstacle to using average measures for practical or even theoretical investigations: most of them lack known explicit, exact expressions beyond their definitions as expectations, except for the average evolvability \({\bar{e}}\) and, under certain special conditions, the average conditional evolvability \({\bar{c}}\) and the average respondability \({\bar{r}}\) (Hansen and Houle 2008; Kirkpatrick 2009). Consequently, previous studies had to rely on approximations. Presumably the most widely used approximation method is Monte Carlo evaluation (e.g., Marroig et al. 2009; Bolstad et al. 2014; Haber 2016; Grabowski and Porto 2017), in which a large number of \(\varvec{\upbeta }\) vectors are randomly generated on a computer and the values calculated from the primary definition are literally averaged to yield an estimate of the average measure. This has long been done in the so-called random skewers analysis for matrix comparison using the response correlation \(\rho \) (e.g., Cheverud 1996; Cheverud and Marroig 2007; Revell 2007; see also Rohlf 2017), and it is implemented in the R packages evolvability (Bolstad et al. 2014) and evolqg (Melo et al. 2016) for calculating various average measures. This method necessarily involves random fluctuation in the estimate and can require substantial computational time, although the latter is becoming less of a concern with modern-day computers. On the other hand, Hansen and Houle (2008, 2009) themselves provided approximate expressions for average measures based on the delta method (which they called “analytical” or “standard” approximations). Apart from the ill-defined notation used therein, a practical problem is that the accuracy of this sort of approximation is generally unknown, so a separate Monte Carlo evaluation is usually required to ascertain whether the delta method approximation is acceptably accurate. This approximation has been used in some subsequent studies (e.g., Hansen and Voje 2011; Brommer 2014; Delahaie et al. 2017; Saltzberg et al. 2022) and is available in the evolvability package. Apart from these methods, Kirkpatrick (2009) used numerical integration to evaluate his average selection response, a measure equivalent to respondability (see below), but this approach has not been widely adopted for evaluating average measures.
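As a concrete illustration of the Monte Carlo approach, the following minimal base-R sketch estimates the average flexibility of a hypothetical \(\textbf{G}\) (the packages cited above implement refined versions of the same idea):

```r
## Monte Carlo evaluation of an average measure (here, average flexibility)
## under spherically distributed selection gradients; hypothetical G
set.seed(1)
p <- 4
G <- crossprod(matrix(rnorm(p * p), p))  # an arbitrary covariance matrix
f_mc <- replicate(10000, {
  beta <- rnorm(p)                       # spherical normal beta
  dz <- c(G %*% beta)                    # response vector G beta
  sum(beta * dz) / sqrt(sum(beta^2) * sum(dz^2))  # flexibility f(beta)
})
mean(f_mc)                # Monte Carlo estimate of average flexibility
sd(f_mc) / sqrt(10000)    # its standard error
```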

This technical paper provides exact expressions for the following average measures: average evolvability \({\bar{e}}\), average conditional evolvability \({\bar{c}}\), average autonomy \({\bar{a}}\), average respondability \({\bar{r}}\), average flexibility \({\bar{f}}\), average response difference \({\bar{d}}\), and average response correlation \({\bar{\rho }}\). These expressions are derived from existing and new results on the moments of simple or multiple ratios of quadratic forms in normal variables. They are expressed as infinite series involving zonal or invariant polynomials of matrix arguments, and in most cases can be numerically evaluated as their partial sums to yield quasi-exact values. For some of the expressions, an upper bound for the truncation error is available. In addition to the special condition where \(\varvec{\upbeta }\) is spherically distributed as assumed in most previous studies (but see also Chevin 2013), this study also concerns the general condition that \(\varvec{\upbeta }\) is normally distributed with potentially nonzero mean and nonspherical covariance. The latter condition can be of substantial biological interest, as it can model a fairly wide range of random directional and/or correlated selection regimes.

The paper is structured as follows. Section 2 first reviews the definitions of the evolvability measures; some of them are redefined to accommodate potentially singular \(\textbf{G}\) matrices. After reviewing some known results on the moments of a ratio of quadratic forms, it then provides new expressions for average evolvability measures under the spherical distribution of \(\varvec{\upbeta }\). Section 3 presents a new R implementation of the analytic results and evaluates its performance by numerical experiments. Section 4 concludes the main body of the paper by adding some theoretical and practical considerations on the use of average measures. As the zonal and invariant polynomials appear to have rarely been used in the biological literature, Appendix A gives a brief overview of their theory, providing a basis for the present mathematical results. Appendix B states a new result on the moment of a multiple ratio of quadratic forms in normal variables, and Appendix C presents new expressions for average measures under the general normal distribution of \(\varvec{\upbeta }\). Appendix D clarifies connections to previous results derived under special conditions (Hansen and Houle 2008; Kirkpatrick 2009).

2 Theory

2.1 Notations

In the following discussion, it is convenient to define the linear combination of the variables (traits) along the direction of \(\varvec{\upbeta }\). For this purpose, consider the decomposition \(\varvec{\upbeta } = |\varvec{\upbeta } | \textbf{u}\), where \(|\cdot |\) is the vector norm or length, \(|\textbf{a} | = \sqrt{\textbf{a}^T \textbf{a}}\) for an arbitrary vector \(\textbf{a}\) (the superscript \({}^T\) denotes matrix transpose), and \(\textbf{u}\) is a unit vector: \(\textbf{u}^T \textbf{u} = 1\). Define the \(p \times (p - 1)\) matrix \(\textbf{U}_{(-1)}\) so that the matrix \(\textbf{U} = (\textbf{u}, \textbf{U}_{(-1)})\) is orthogonal. With these, the orthogonal linear transform of the variables \(\textbf{z}^* = \textbf{U}^T \textbf{z}\) will be considered; the first entry of this new vector \(\textbf{u}^T \textbf{z}\) represents the score along the direction of \(\varvec{\upbeta }\). Note that the covariance matrix of the new variables \(\textbf{z}^*\) can be written as

$$\begin{aligned} \textbf{G}^* = \textbf{U}^T \textbf{G} \textbf{U} = \begin{pmatrix} \textbf{u}^T \textbf{G} \textbf{u} & \textbf{u}^T \textbf{G} \textbf{U}_{(-1)} \\ \textbf{U}_{(-1)}^T \textbf{G} \textbf{u} & \textbf{U}_{(-1)}^T \textbf{G} \textbf{U}_{(-1)} \end{pmatrix}. \end{aligned}$$

Previous authors (Hansen and Houle 2008; Marroig et al. 2009; Grabowski and Porto 2017) defined some evolvability measures under the (sometimes implicit) constraint \(|\varvec{\upbeta } | = 1\). To avoid potential confusion, this standardized vector is always denoted by \(\textbf{u}\) herein. Throughout this paper, \(\textbf{G}\) is assumed to be given rather than random and validly constructed as a covariance matrix, that is, symmetric and nonnegative definite (i.e., either positive definite or positive semidefinite, in the terminology of Schott 2016).

Although Hansen and Houle (2008) assumed \(\textbf{G}\) to be positive definite, this assumption is not imposed here, to accommodate potentially singular (and positive semidefinite) \(\textbf{G}\) where possible. This is done by using the generalized inverse; for a matrix \(\textbf{A}\), a generalized inverse \(\textbf{A}^{-}\) is any matrix that satisfies \(\textbf{A} \textbf{A}^{-} \textbf{A} = \textbf{A}\). If \(\textbf{A}\) is nonsingular, \(\textbf{A}^{-} = \textbf{A}^{-1}\).

For notational simplicity, the range or column space and the null space of a nonzero \(p \times q\) matrix \(\textbf{A}\) are defined: \(R \left( \textbf{A} \right) = \{ \textbf{y}: \textbf{y} = \textbf{A} \textbf{x}, \textbf{x} \in {\mathbb {R}}^q \}\), and \(N \left( \textbf{A} \right) = \{ \textbf{x}: \textbf{0}_p = \textbf{A} \textbf{x}, \textbf{x} \in {\mathbb {R}}^q \}\), where \(\textbf{0}_p\) is the p-dimensional vector of zeros. When \(\textbf{A}\) is nonnegative definite, these are the spaces spanned by the eigenvectors corresponding to its nonzero and zero eigenvalues, respectively. The p-dimensional identity matrix is denoted by \(\textbf{I}_p\).
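As a small R illustration of these constructions (a sketch with arbitrary example values; MASS::ginv() returns the Moore–Penrose inverse, one particular generalized inverse):

```r
## Completing u to an orthogonal U = (u, U_(-1)), and a generalized inverse
## of a singular G; arbitrary example values
library(MASS)                              # for ginv()
set.seed(1)
p <- 3
beta <- c(2, -1, 1)
u <- beta / sqrt(sum(beta^2))              # u = beta / |beta|
## orthonormal basis of the orthogonal complement of u, from I - u u^T
U_m1 <- svd(diag(p) - tcrossprod(u))$u[, 1:(p - 1)]
U <- cbind(u, U_m1)
zapsmall(crossprod(U))                     # = I_p, so U is orthogonal
G <- tcrossprod(matrix(rnorm(p * 2), p))   # a rank-2 (hence singular) G
Gi <- ginv(G)                              # Moore-Penrose generalized inverse
all.equal(G %*% Gi %*% G, G)               # the condition A A^- A = A
```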

2.2 Evolvability measures: fixed selection

Evolvability \(e(\varvec{\upbeta })\) is the variance in \(\textbf{G}\) along the direction of selection \(\textbf{u}\), or equivalently the norm of the response vector \(\Delta \bar{\textbf{z}}\) projected onto \(\varvec{\upbeta }\), standardized by \(|\varvec{\upbeta } |\) (Hansen and Houle 2008) (Fig. 1):

$$\begin{aligned} e(\varvec{\upbeta })&= \frac{ \varvec{\upbeta }^T \Delta \bar{\textbf{z}}}{ \varvec{\upbeta }^T \varvec{\upbeta } } \nonumber \\&= \frac{ \varvec{\upbeta }^T \textbf{G} \varvec{\upbeta } }{ \varvec{\upbeta }^T \varvec{\upbeta } } \end{aligned}$$
(1)
$$\begin{aligned}&= \textbf{u}^T \textbf{G} \textbf{u}. \end{aligned}$$
(2)

The numerator in (1) can be related to the rate of adaptation (change in mean fitness) under simple directional selection (Agrawal and Stinchcombe 2009; Chevin 2013).

Conditional evolvability \(c(\varvec{\upbeta })\) is conceptually defined as the variance along \(\textbf{u}\) when evolution in other traits/directions is prevented by stabilizing selection (Hansen 2003; Hansen et al. 2003a, 2019). With the above notation, it is the conditional variance of the first element of \(\textbf{z}^*\) given the other elements (redefined here following Hansen and Houle 2008):

$$\begin{aligned} c(\varvec{\upbeta })&= \textbf{u}^T \textbf{G} \textbf{u} - \textbf{u}^T \textbf{G} \textbf{U}_{(-1)} \left( \textbf{U}_{(-1)}^T \textbf{G} \textbf{U}_{(-1)} \right) ^{-} \textbf{U}_{(-1)}^T \textbf{G} \textbf{u} \nonumber \\&= (\textbf{u}^T \textbf{G}^{-} \textbf{u})^{-} \quad \text {if } \varvec{\upbeta }, \textbf{u} \in R \left( \textbf{G} \right) , \end{aligned}$$
(3)
$$\begin{aligned}&= \frac{ \varvec{\upbeta }^T \varvec{\upbeta } }{ \varvec{\upbeta }^T \textbf{G}^{-} \varvec{\upbeta } }. \end{aligned}$$
(4)

Here, the second equality in (3) follows from well-known results on the (generalized) inverse of a partitioned matrix (e.g., Schott 2016, theorems 7.1 and 7.12), for which Hansen and Houle (2008) restated a proof for nonsingular \(\textbf{G}\). A lack of conditional variance along \(\varvec{\upbeta }\) results in \(c(\varvec{\upbeta }) = 0\), which happens when \(\textbf{G}\) is singular and \(\varvec{\upbeta } \notin R \left( \textbf{G} \right) \); the expressions (3) and (4) do not hold in this case.

Autonomy \(a(\varvec{\upbeta })\) is defined as the complement of the squared multiple correlation of the first element of \(\textbf{z}^*\) with respect to the other elements:

$$\begin{aligned} a(\varvec{\upbeta })&= 1 - (\textbf{u}^T \textbf{G} \textbf{u})^{-1} \textbf{u}^T \textbf{G} \textbf{U}_{(-1)} \left( \textbf{U}_{(-1)}^T \textbf{G} \textbf{U}_{(-1)} \right) ^{-} \textbf{U}_{(-1)}^T \textbf{G} \textbf{u} = \frac{c(\varvec{\upbeta })}{e(\varvec{\upbeta })} \nonumber \\&= (\textbf{u}^T \textbf{G} \textbf{u} \textbf{u}^T \textbf{G}^{-} \textbf{u})^{-1} \quad \text {if } c(\varvec{\upbeta }) \ne 0, \end{aligned}$$
(5)
$$\begin{aligned}&= \frac{(\varvec{\upbeta }^T \varvec{\upbeta })^2}{\varvec{\upbeta }^T \textbf{G} \varvec{\upbeta } \varvec{\upbeta }^T \textbf{G}^{-} \varvec{\upbeta }}. \end{aligned}$$
(6)

This quantity is undefined when \(e(\varvec{\upbeta }) = \textbf{u}^T \textbf{G} \textbf{u} = 0\). Hansen and Houle (2008) also defined the integration \(i(\varvec{\upbeta })\), which is the squared correlation just mentioned: \(i(\varvec{\upbeta }) = 1 - a(\varvec{\upbeta })\).

Respondability \(r(\varvec{\upbeta })\) is the magnitude of response standardized by that of selection, or the ratio of the norms of \(\Delta \bar{\textbf{z}}\) and \(\varvec{\upbeta }\):

$$\begin{aligned} r(\varvec{\upbeta })&= \frac{|\Delta \bar{\textbf{z}} |}{|\varvec{\upbeta } |} \nonumber \\&= \sqrt{ \frac{\varvec{\upbeta }^T \textbf{G}^2 \varvec{\upbeta }}{\varvec{\upbeta }^T \varvec{\upbeta }} } . \end{aligned}$$
(7)

The definition and terminology of r here follow Hansen and Houle (2008), but it should be noted that Kirkpatrick (2009) independently devised an equivalent measure, the (relative) selection response (R in his notation), with a slightly different standardization.

Flexibility \(f(\varvec{\upbeta })\) quantifies similarity in direction between \(\Delta \bar{\textbf{z}}\) and \(\varvec{\upbeta }\) in terms of vector correlation (cosine of the angle formed by two vectors), and is also the ratio of \(e(\varvec{\upbeta })\) to \(r(\varvec{\upbeta })\) (Hansen and Houle 2008; Marroig et al. 2009):

$$\begin{aligned} f(\varvec{\upbeta })&= \frac{e(\varvec{\upbeta })}{r(\varvec{\upbeta })} = \frac{ \varvec{\upbeta }^T \Delta \bar{\textbf{z}}}{ |\varvec{\upbeta } | |\Delta \bar{\textbf{z}} | } \nonumber \\&= \frac{ \varvec{\upbeta }^T \textbf{G} \varvec{\upbeta } }{ \sqrt{ \varvec{\upbeta }^T \varvec{\upbeta } \varvec{\upbeta }^T \textbf{G}^2 \varvec{\upbeta } } }. \end{aligned}$$
(8)

This quantity is undefined if \(|\Delta \bar{\textbf{z}} | = 0\), i.e., \(\varvec{\upbeta } \in N \left( \textbf{G} \right) \). Otherwise, \(0 < f(\varvec{\upbeta }) \le 1\) from the nonnegative definiteness of \(\textbf{G}\) and the definition of vector correlation. The term flexibility was coined by Marroig et al. (2009) at Hansen’s suggestion, although the use of the vector correlation or angle had been suggested in a few earlier works (Blows and Walsh 2009; Rolian 2009).

All the above indices are one-matrix measures that primarily concern characterization of a single \(\textbf{G}\) matrix. On the other hand, the following two indices concern pairwise comparisons between two \(\textbf{G}\) matrices in terms of (dis)similarity of the response vectors given the same \(\varvec{\upbeta }\). The two \(\textbf{G}\) matrices and \(\Delta \bar{\textbf{z}}\) are denoted here with subscripts: \(\textbf{G}_i\), \(\Delta \bar{\textbf{z}}_i\). Response difference \(d(\varvec{\upbeta })\) is a measure of dissimilarity, defined as a standardized difference between the two response vectors (Hansen and Houle 2008):

$$\begin{aligned} d(\varvec{\upbeta })&= \frac{| \Delta \bar{\textbf{z}}_1 - \Delta \bar{\textbf{z}}_2 |}{|\varvec{\upbeta } |} \nonumber \\&= \sqrt{ \frac{\varvec{\upbeta }^T \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \varvec{\upbeta }}{\varvec{\upbeta }^T \varvec{\upbeta }}}. \end{aligned}$$
(9)

Response correlation \(\rho (\varvec{\upbeta })\) is a measure of similarity in direction, defined as the vector correlation between the two response vectors (e.g., Cheverud 1996; Cheverud and Marroig 2007):

$$\begin{aligned} \rho (\varvec{\upbeta })&= \frac{ \Delta \bar{\textbf{z}}_1^T \Delta \bar{\textbf{z}}_2 }{ |\Delta \bar{\textbf{z}}_1 | |\Delta \bar{\textbf{z}}_2 | } \nonumber \\&= \frac{ \varvec{\upbeta }^T \textbf{G}_1 \textbf{G}_2 \varvec{\upbeta } }{ \sqrt{ \varvec{\upbeta }^T \textbf{G}_1^2 \varvec{\upbeta } \varvec{\upbeta }^T \textbf{G}_2^2 \varvec{\upbeta }}}. \end{aligned}$$
(10)

This quantity is undefined when at least one of \(|\Delta \bar{\textbf{z}}_1 |\) and \(|\Delta \bar{\textbf{z}}_2 |\) is 0. Otherwise, \(-1 \le \rho (\varvec{\upbeta }) \le 1\).
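For reference, the fixed-\(\varvec{\upbeta }\) measures above translate directly into base R. The following is a sketch of (1)–(10), not the implementation used in Sect. 3; c_beta() and a_beta() use solve() for \(\textbf{G}^{-1}\) and hence assume a nonsingular \(\textbf{G}\):

```r
## Evolvability measures for a fixed selection gradient beta, eqs. (1)-(10)
e_beta   <- function(beta, G) c(crossprod(beta, G %*% beta) / crossprod(beta))
c_beta   <- function(beta, G) c(crossprod(beta) / crossprod(beta, solve(G, beta)))
a_beta   <- function(beta, G) c_beta(beta, G) / e_beta(beta, G)
r_beta   <- function(beta, G) {
  dz <- c(G %*% beta)                 # response vector G beta
  sqrt(sum(dz^2) / sum(beta^2))
}
f_beta   <- function(beta, G) e_beta(beta, G) / r_beta(beta, G)
d_beta   <- function(beta, G1, G2) r_beta(beta, G1 - G2)
rho_beta <- function(beta, G1, G2) {
  dz1 <- c(G1 %*% beta); dz2 <- c(G2 %*% beta)
  sum(dz1 * dz2) / sqrt(sum(dz1^2) * sum(dz2^2))
}
## example with a hypothetical bivariate G
G <- matrix(c(1, 0.5, 0.5, 2), 2)
beta <- c(1, 1)
sapply(c(e = e_beta, c = c_beta, a = a_beta, r = r_beta, f = f_beta),
       function(fn) fn(beta, G))
```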

2.3 Ratio of quadratic forms

A key to deriving expressions for average measures is that the measures defined above are simple or multiple ratios of quadratic forms in \(\varvec{\upbeta }\) or \(\textbf{u}\). Ratios of quadratic forms, especially those in normal random variables, play a pivotal role in statistics, as many practically important test statistics can be written in this form. Naturally, there is a vast body of literature on the distribution and moments of quadratic forms and ratios thereof (e.g., Magnus 1986; Jones 1986, 1987; Smith 1989, 1993; Mathai and Provost 1992; Hillier 2001; Forchini 2002; Hillier et al. 2009, 2014; Bao and Kan 2013).

Regarding the ratio of quadratic forms in normal variables, there are several equivalent ways to derive its moments. One approach starts from the joint moment-generating function of the quadratic forms (e.g., Cressie et al. 1981; Meng 2005) and typically results in integrals that do not have simple closed-form expressions (e.g., Magnus 1986; Jones 1986, 1987; Gupta and Kabe 1998). Another is to expand the ratio into an infinite series, which is then integrated term by term using the zonal and invariant polynomials (e.g., Smith 1989, 1993; Hillier 2001; Hillier et al. 2009, 2014). (See Bao and Kan (2013) for connections between these two approaches.) The latter approach typically yields an infinite series involving zonal or invariant polynomials, which can be evaluated with reasonable speed and accuracy with the aid of recent algorithmic developments (Hillier et al. 2009, 2014; Bao and Kan 2013); this is the approach followed in the present paper. The zonal polynomials are a special form of homogeneous polynomials in the eigenvalues of a matrix, which generalize powers of scalars to symmetric matrices (e.g., James 1960, 1964; Muirhead 1982, 2006; Gross and Richards 1987; Mathai et al. 1995). The invariant polynomials are a generalization of the zonal polynomials to multiple matrix arguments (Davis 1979, 1980; Chikuse 1980, 2003; Chikuse and Davis 1986b). A brief overview of these mathematical tools is provided in Appendix A.

A general result for the moments of a simple ratio of quadratic forms in normal variables is restated in the following proposition.

Proposition 1

(Smith 1989; Bao and Kan 2013) Let \(\textbf{x} \sim N_p \left( \varvec{\upmu }, \textbf{I}_p \right) \), \(\textbf{A}\) be a \(p \times p\) symmetric matrix, \(\textbf{B}\) be a \(p \times p\) nonnegative definite matrix, and m and n be positive real numbers. When the expectation of \(\frac{(\textbf{x}^T \textbf{A} \textbf{x})^m}{(\textbf{x}^T \textbf{B} \textbf{x})^n}\) exists, it can be written as

$$\begin{aligned} {{\,\textrm{E}\,}}\! \left( \frac{(\textbf{x}^T \textbf{A} \textbf{x})^m}{(\textbf{x}^T \textbf{B} \textbf{x})^n} \right)&= 2^{m - n} \alpha _{\textbf{A}}^{-m} \alpha _{\textbf{B}}^{n} e^{ -\frac{ \varvec{\upmu }^T \varvec{\upmu } }{2} } \sum _{i=0}^{\infty } \sum _{j=0}^{\infty } \sum _{k=0}^{\infty } \frac{ \left( -m \right) _{i} \left( n \right) _{j} \Gamma \left( \frac{p}{2} + m - n + k \right) }{ 2^k \left( \frac{1}{2} \right) _{k} \Gamma \left( \frac{p}{2} + i + j + k \right) } \nonumber \\&\qquad \cdot \frac{\left( \frac{1}{2} \right) _{i+j+k} }{ i! j! k! } C^{\left[ {i} \right] , \left[ {j} \right] , \left[ {k} \right] }_{\left[ {i+j+k} \right] } \! \left( \textbf{I}_p - \alpha _{\textbf{A}} \textbf{A} , \textbf{I}_p - \alpha _{\textbf{B}} \textbf{B} , \varvec{\upmu } \varvec{\upmu }^T \right) , \end{aligned}$$
(11)

where \(\alpha _{\textbf{A}}\) and \(\alpha _{\textbf{B}}\) are any positive constants that satisfy \(0< \alpha _{\textbf{A}} < 2 / \lambda _{\max } \! \left( \textbf{A} \right) \) and \(0< \alpha _{\textbf{B}} < 2 / \lambda _{\max } \! \left( \textbf{B} \right) \), with \(\lambda _{\max } \! \left( \cdot \right) \) denoting the largest eigenvalue of the argument matrix, \(\left( a \right) _{k} = a (a + 1) \dots (a + k - 1)\) is the rising factorial (Pochhammer symbol), and \(C^{\left[ {i} \right] , \left[ {j} \right] , \left[ {k} \right] }_{\left[ {i+j+k} \right] } \! \left( \cdot , \cdot , \cdot \right) \) are the (i, j, k)-th top-order invariant polynomials (whose explicit form is presented in Appendix A).

Remark

Conditions for the existence of the moment are stated in Bao and Kan (2013, proposition 1 therein): when \(\textbf{B}\) is positive definite, the moment exists if and only if \(\frac{p}{2} + m > n\); when \(\textbf{B}\) is positive semidefinite, slightly different conditions apply, depending on the rank of \(\textbf{B}\) and the structures of \(\textbf{A}\) and \(\textbf{B}\). When m is an integer, the expression simplifies so that the summation over i disappears. Also, when \(\varvec{\upmu } = \textbf{0}_p\), it simplifies substantially, as all terms with \(k > 0\) are zero. If \(\textbf{A}\) or \(\textbf{B}\) is asymmetric, it can be symmetrized by using \(\textbf{x}^T \textbf{A} \textbf{x} = \textbf{x}^T \textbf{A}^T \textbf{x} = \textbf{x}^T (\textbf{A} + \textbf{A}^T) \textbf{x} / 2\).

In practice, the above series can be approximated by its partial sum, with recursive algorithms to calculate \(d_{i, j, k} \! \left( \textbf{A}_1 , \textbf{A}_2 , \textbf{A}_3 \right) = \frac{ \left( \frac{1}{2} \right) _{i + j + k} }{ i! j! k!} C^{\left[ {i} \right] , \left[ {j} \right] , \left[ {k} \right] }_{\left[ {i + j + k} \right] } \! \left( \textbf{A}_1 , \textbf{A}_2 , \textbf{A}_3 \right) \) (Chikuse 1987; Hillier et al. 2009, 2014; Bao and Kan 2013). General covariance structures for \(\textbf{x}\) can be accommodated under certain conditions by transforming the variables and quadratic forms, as described in Appendix C. Appendix C also provides a similar result for multiple ratios of the form \(\frac{(\textbf{x}^T \textbf{A} \textbf{x})^l}{(\textbf{x}^T \textbf{B} \textbf{x})^m (\textbf{x}^T \textbf{D} \textbf{x})^n}\).
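For concreteness, such moments can be evaluated numerically with the qfratio R package introduced in Sect. 3.1. The following is a sketch assuming the package's documented interface, in which qfrm(A, B, p, q, mu, ...) evaluates \({{\,\textrm{E}\,}}[(\textbf{x}^T \textbf{A} \textbf{x})^p / (\textbf{x}^T \textbf{B} \textbf{x})^q]\) by the partial sums just described:

```r
## Sketch: evaluating E[(x'Ax)^2 / (x'Bx)] for x ~ N(mu, I_p) via qfratio;
## the arguments (A, B, p, q, mu, m) are assumed from the package
## documentation; m is the truncation order of the series
library(qfratio)
p <- 4
A <- diag(1:p)             # an example symmetric matrix
B <- diag(p)               # positive definite denominator matrix
mu <- rep(0.5, p)          # nonzero mean, so all three sums in (11) contribute
qfrm(A, B, p = 2, q = 1, mu = mu, m = 150)
```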

2.4 Average measures for uniformly distributed selection

With the above results, average measures can readily be derived as expectations with respect to random \(\varvec{\upbeta }\) under the assumption of normality. Previous treatments of evolvability measures (Hansen and Houle 2008; Kirkpatrick 2009) and the random skewers analysis (Revell 2007) typically assumed that \(\textbf{u}\) is uniformly distributed on the unit hypersphere in the p-dimensional space. This condition holds under any spherical multivariate normal distribution of \(\varvec{\upbeta }\): \(\varvec{\upbeta } \sim N_p \left( \textbf{0}_p, \sigma ^2 \textbf{I}_p \right) \) for arbitrary positive \(\sigma ^2\). This section provides expressions for average measures under this condition. (Note that the normality of \(\varvec{\upbeta }\) is not necessary for \(\textbf{u}\) to be uniformly distributed; all results in this section hold as long as the distribution of \(\varvec{\upbeta }\) is spherically symmetric.) Expressions for the general normal case \(\varvec{\upbeta } \sim N_p \left( \varvec{\upeta }, \varvec{\Sigma } \right) \) are given in Appendix C. Most of the results below can be derived either by applying Proposition 1 or Proposition 4 in Appendix B, or directly as done in Appendix A.

The average evolvability \({\bar{e}}\) is straightforward to obtain:

$$\begin{aligned} {\bar{e}}&= {{\,\textrm{E}\,}}_{\textbf{u}}\! \left( \textbf{u}^T \textbf{G} \textbf{u} \right) \nonumber \\&= {{\,\textrm{E}\,}}_{\textbf{u}}\! \left( {{\,\textrm{tr}\,}}\left( \textbf{G} \textbf{u} \textbf{u}^T \right) \right) \nonumber \\&= {{\,\textrm{tr}\,}}\left( \textbf{G} \right) / p, \end{aligned}$$
(12)

as derived by Hansen and Houle (2008). This is because \({{\,\textrm{E}\,}}_{\textbf{u}}\! \left( \textbf{u} \textbf{u}^T \right) = \textbf{I}_p / p\) under this condition.
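Equation (12) is easy to verify numerically in base R (with an arbitrary example \(\textbf{G}\)):

```r
## Numerical check of e-bar = tr(G)/p, eq. (12), with u uniform on the sphere
set.seed(2)
p <- 5
G <- crossprod(matrix(rnorm(p * p), p))  # an arbitrary covariance matrix
u <- matrix(rnorm(1e5 * p), ncol = p)
u <- u / sqrt(rowSums(u^2))              # 1e5 uniform unit vectors (as rows)
mean(rowSums((u %*% G) * u))             # Monte Carlo average of u'Gu
sum(diag(G)) / p                         # tr(G)/p: should closely agree
```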

Conditional evolvability \(c(\varvec{\upbeta })\) and autonomy \(a(\varvec{\upbeta })\) can be nonzero only when \(\varvec{\upbeta } \in R \left( \textbf{G} \right) \) (see above). When \(\textbf{G}\) is singular, this happens with probability 0 for any \(\varvec{\upbeta }\) continuously distributed across the entire space. Hence, the average conditional evolvability \({\bar{c}}\) and the average autonomy \({\bar{a}}\) are 0 when \(\textbf{G}\) is singular. Although the following expressions can operationally be applied to singular \(\textbf{G}\) by using \(\textbf{G}^-\) in place of \(\textbf{G}^{-1}\), such a result is invalid because the ratio expressions do not hold when \(\varvec{\upbeta } \notin R \left( \textbf{G} \right) \).

When \(\textbf{G}\) is nonsingular, the average conditional evolvability \({\bar{c}}\) is:

$$\begin{aligned} {\bar{c}}&= {{\,\textrm{E}\,}}_{\varvec{\upbeta }}\! \left( \frac{ \varvec{\upbeta }^T \varvec{\upbeta } }{ \varvec{\upbeta }^T \textbf{G}^{-1} \varvec{\upbeta } } \right) \nonumber \\&= \alpha \sum _{k=0}^{\infty } \frac{ \left( \frac{1}{2} \right) _{k} }{ \left( \frac{p}{2} \right) _{k} } C_{\left[ {k} \right] } \! \left( \textbf{I}_p - \alpha \textbf{G}^{-1} \right) \nonumber \\&= \alpha \sum _{k=0}^{\infty } \frac{ \left( 1 \right) _{k} }{ \left( \frac{p}{2} \right) _{k} } d_{k} \! \left( \textbf{I}_p - \alpha \textbf{G}^{-1} \right) \nonumber \\&= \alpha \, {}_2F_1 \left( \frac{1}{2}, 1; \frac{p}{2} ; \textbf{I}_p - \alpha \textbf{G}^{-1} \right) , \end{aligned}$$
(13)

where \(\alpha \) is any positive constant that satisfies \(0< \alpha < 2 / \lambda _{\max } \left( \textbf{G}^{-1}\right) \) (see Proposition 1), \( C_{\left[ {k} \right] } \! \left( \cdot \right) \) are the kth top-order zonal polynomials, and \({}_2F_1 \left( a, b; c; \cdot \right) \) is an alternative expression using the hypergeometric function of matrix argument with the three parameters a, b, and c (see Appendix A). When \(p = 2\), this expression simplifies to the geometric mean of the two eigenvalues of \(\textbf{G}\), as mentioned by Hansen and Houle (2008) (Appendix D).
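In terms of the qfratio interface sketched above, (13) is the moment of \((\varvec{\upbeta }^T \textbf{I}_p \varvec{\upbeta }) / (\varvec{\upbeta }^T \textbf{G}^{-1} \varvec{\upbeta })\) and can be evaluated as follows (a sketch assuming the documented qfrm() interface):

```r
## c-bar as the moment E[(b'b) / (b' G^{-1} b)] via qfratio; nonsingular G
library(qfratio)
set.seed(1)
p <- 4
G <- crossprod(matrix(rnorm(p * p), p))  # example nonsingular G
qfrm(diag(p), solve(G), p = 1, q = 1, m = 200)
```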

The average autonomy \({\bar{a}}\) is, when \(\textbf{G}\) is nonsingular,

$$\begin{aligned} {\bar{a}}&= {{\,\textrm{E}\,}}_{\varvec{\upbeta }}\! \left( \frac{(\varvec{\upbeta }^T \varvec{\upbeta })^2}{\varvec{\upbeta }^T \textbf{G} \varvec{\upbeta } \varvec{\upbeta }^T \textbf{G}^{-1} \varvec{\upbeta }} \right) \nonumber \\&= \alpha _1 \alpha _2 \sum _{i=0}^{\infty } \sum _{j=0}^{\infty } \frac{ \left( \frac{1}{2} \right) _{i+j} }{ \left( \frac{p}{2} \right) _{i+j} } C^{\left[ {i} \right] , \left[ {j} \right] }_{\left[ {i+j} \right] } \! \left( \textbf{I}_p - \alpha _1 \textbf{G} , \textbf{I}_p - \alpha _2 \textbf{G}^{-1} \right) \nonumber \\&= \alpha _1 \alpha _2 \sum _{i=0}^{\infty } \sum _{j=0}^{\infty } \frac{ \left( 1 \right) _{i} \left( 1 \right) _{j} }{ \left( \frac{p}{2} \right) _{i+j} } d_{i,j} \! \left( \textbf{I}_p - \alpha _1 \textbf{G} , \textbf{I}_p - \alpha _2 \textbf{G}^{-1} \right) , \end{aligned}$$
(14)

where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{G} \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{G}^{-1} \right) \). The average integration \({\bar{i}}\) is simply \(1 - {\bar{a}}\) (and equals 1 when \(\textbf{G}\) is singular).

The average respondability \({\bar{r}}\) is

$$\begin{aligned} {\bar{r}}&= {{\,\textrm{E}\,}}_{\varvec{\upbeta }}\! \left( \sqrt{ \frac{\varvec{\upbeta }^T \textbf{G}^2 \varvec{\upbeta }}{\varvec{\upbeta }^T \varvec{\upbeta }} } \right) \nonumber \\&= \alpha ^{-\frac{1}{2}} \sum _{k=0}^{\infty } \frac{ \left( -\frac{1}{2} \right) _{k} \left( \frac{1}{2} \right) _{k} }{ \left( \frac{p}{2} \right) _{k} k! } C_{\left[ {k} \right] } \! \left( \textbf{I}_p - \alpha \textbf{G}^2 \right) \nonumber \\&= \alpha ^{-\frac{1}{2}} \sum _{k=0}^{\infty } \frac{ \left( -\frac{1}{2} \right) _{k} }{ \left( \frac{p}{2} \right) _{k} } d_{k} \! \left( \textbf{I}_p - \alpha \textbf{G}^2 \right) \nonumber \\&= \alpha ^{-\frac{1}{2}} \, {}_2F_1 \left( -\frac{1}{2}, \frac{1}{2}; \frac{p}{2} ; \textbf{I}_p - \alpha \textbf{G}^2 \right) , \end{aligned}$$
(15)

where \(\alpha \) is to satisfy \(0< \alpha < 2 / \lambda _{\max } \! \left( \textbf{G}^2 \right) \) (note that \(\lambda _{\max } \! \left( \textbf{G}^2 \right) = \lambda _{\max } \! \left( \textbf{G} \right) ^2\) since \(\textbf{G}\) is nonnegative definite). Kirkpatrick (2009) discussed an equivalent quantity, the average (relative) selection response (\({\bar{R}}\) there), but did not provide a closed-form expression for arbitrary \(\textbf{G}\). It is possible to show that his expression for the special case of complete integration is entailed in the present result (Appendix D).
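In terms of the qfratio interface sketched above, (15) is the \((\frac{1}{2}, \frac{1}{2})\)-th moment with numerator matrix \(\textbf{G}^2\); the average response difference \({\bar{d}}\) introduced below (17) takes the identical form with \(\left( \textbf{G}_1 - \textbf{G}_2 \right) ^2\) in its place. A sketch, assuming fractional exponents are accepted as the package documentation indicates:

```r
## r-bar as E[(b' G^2 b)^(1/2) / (b'b)^(1/2)] via qfratio; d-bar, eq. (17),
## is obtained by replacing G %*% G with (G1 - G2) %*% (G1 - G2)
library(qfratio)
set.seed(1)
p <- 4
G <- crossprod(matrix(rnorm(p * p), p))
qfrm(G %*% G, diag(p), p = 1/2, q = 1/2, m = 200)
```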

The average flexibility \({\bar{f}}\) is

$$\begin{aligned} {\bar{f}}&= {{\,\textrm{E}\,}}_{\varvec{\upbeta }}\! \left( \frac{\varvec{\upbeta }^T \textbf{G} \varvec{\upbeta }}{\sqrt{ \varvec{\upbeta }^T \varvec{\upbeta } \varvec{\upbeta }^T \textbf{G}^2 \varvec{\upbeta }}} \right) \nonumber \\&= \alpha ^{\frac{1}{2}} \sum _{k=0}^{\infty } \frac{ \left( \frac{1}{2} \right) _{k} \left( \frac{1}{2} \right) _{k+1} }{ \left( \frac{p}{2} \right) _{k+1} k! } C^{\left[ {k} \right] , \left[ {1} \right] }_{\left[ {k+1} \right] } \! \left( \textbf{I}_p - \alpha \textbf{G}^2, \textbf{G} \right) \nonumber \\&= \alpha ^{\frac{1}{2}} \sum _{k=0}^{\infty } \frac{\left( \frac{1}{2} \right) _{k}}{\left( \frac{p}{2} \right) _{k+1}}d_{k,1} \! \left( \textbf{I}_p - \alpha \textbf{G}^2, \textbf{G} \right) , \end{aligned}$$
(16)

where \(\alpha \) is to satisfy \(0< \alpha < 2 / \lambda _{\max } \! \left( \textbf{G}^2 \right) \). In general, \({\bar{f}}\) is well defined even if \(\textbf{G}\) is singular, unless \(\varvec{\upbeta } \in N \left( \textbf{G} \right) \) with nonzero probability (which of course does not happen here).

The average response difference \({\bar{d}}\) has a form essentially identical to that of \({\bar{r}}\), except for the argument matrix:

$$\begin{aligned} {\bar{d}}&= {{\,\textrm{E}\,}}_{\varvec{\upbeta }}\! \left( \sqrt{ \frac{\varvec{\upbeta }^T \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \varvec{\upbeta }}{\varvec{\upbeta }^T \varvec{\upbeta }} } \right) \nonumber \\&= \alpha ^{-\frac{1}{2}} \sum _{k=0}^{\infty } \frac{ \left( -\frac{1}{2} \right) _{k} \left( \frac{1}{2} \right) _{k} }{ \left( \frac{p}{2} \right) _{k} k! } C_{\left[ {k} \right] } \! \left( \textbf{I}_p - \alpha \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \right) \nonumber \\&= \alpha ^{-\frac{1}{2}} \sum _{k=0}^{\infty } \frac{ \left( -\frac{1}{2} \right) _{k} }{ \left( \frac{p}{2} \right) _{k} } d_{k} \! \left( \textbf{I}_p - \alpha \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \right) \nonumber \\&= \alpha ^{-\frac{1}{2}} \, {}_2F_1 \left( -\frac{1}{2}, \frac{1}{2}; \frac{p}{2} ; \textbf{I}_p - \alpha \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \right) , \end{aligned}$$
(17)

where \(\alpha \) is to satisfy \(0< \alpha < 2 / \lambda _{\max } \! \left( \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \right) \).

The average response correlation \({\bar{\rho }}\) is

$$\begin{aligned} {\bar{\rho }}&= {{\,\textrm{E}\,}}_{\varvec{\upbeta }}\! \left( \frac{ \varvec{\upbeta }^T \textbf{G}_1 \textbf{G}_2 \varvec{\upbeta } }{ \sqrt{ \varvec{\upbeta }^T \textbf{G}_1^2 \varvec{\upbeta } \varvec{\upbeta }^T \textbf{G}_2^2 \varvec{\upbeta } } } \right) \nonumber \\&= (\alpha _1 \alpha _2)^{\frac{1}{2}} \sum _{i=0}^{\infty } \sum _{j=0}^{\infty } \frac{ \left( \frac{1}{2} \right) _{i} \left( \frac{1}{2} \right) _{j} \left( \frac{1}{2} \right) _{i+j+1} }{ \left( \frac{p}{2} \right) _{i+j+1} i! j! } \nonumber \\&\quad \cdot C^{\left[ {i} \right] , \left[ {j} \right] , \left[ {1} \right] }_{\left[ {i+j+1} \right] } \! \left( \textbf{I}_p - \alpha _1 \textbf{G}_1^2 , \textbf{I}_p - \alpha _2 \textbf{G}_2^2 , \frac{\textbf{G}_1 \textbf{G}_2 + \textbf{G}_2 \textbf{G}_1}{2} \right) \nonumber \\&= (\alpha _1 \alpha _2)^{\frac{1}{2}} \sum _{i=0}^{\infty } \sum _{j=0}^{\infty } \frac{ \left( \frac{1}{2} \right) _{i} \left( \frac{1}{2} \right) _{j} }{ \left( \frac{p}{2} \right) _{i+j+1} } d_{i,j,1} \! \left( \textbf{I}_p - \alpha _1 \textbf{G}_1^2 , \textbf{I}_p - \alpha _2 \textbf{G}_2^2 , \frac{\textbf{G}_1 \textbf{G}_2 + \textbf{G}_2 \textbf{G}_1}{2} \right) , \end{aligned}$$
(18)

where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{G}_1^2 \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{G}_2^2 \right) \).
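The multiple-ratio measures \({\bar{a}}\) (14), \({\bar{f}}\) (16), and \({\bar{\rho }}\) (18) fall outside the scope of simple-ratio routines but map onto the package's multiple-ratio counterpart. A sketch, assuming the documented interface in which qfmrm(A, B, D, p, q, r, ...) evaluates \({{\,\textrm{E}\,}}[(\textbf{x}^T \textbf{A} \textbf{x})^p / ((\textbf{x}^T \textbf{B} \textbf{x})^q (\textbf{x}^T \textbf{D} \textbf{x})^r)]\):

```r
## Sketch: multiple-ratio average measures via qfratio::qfmrm(); the
## interface (A, B, D, p, q, r, m) is assumed from the package documentation
library(qfratio)
set.seed(3)
p  <- 4
G1 <- crossprod(matrix(rnorm(p * p), p))   # example nonsingular G matrices
G2 <- crossprod(matrix(rnorm(p * p), p))
Ip <- diag(p)
qfmrm(Ip, G1, solve(G1), p = 2, q = 1, r = 1, m = 100)      # a-bar, eq. (14)
qfmrm(G1, Ip, G1 %*% G1, p = 1, q = 1/2, r = 1/2, m = 100)  # f-bar, eq. (16)
## rho-bar, eq. (18): asymmetric numerator G1 G2 symmetrized as in (18)
qfmrm((G1 %*% G2 + G2 %*% G1) / 2, G1 %*% G1, G2 %*% G2,
      p = 1, q = 1/2, r = 1/2, m = 100)
```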

2.5 Truncation error

The successive terms in the above series decrease in magnitude, so a partial sum up to a certain order can in principle be used as an approximation of the exact value of the infinite series. Although each term can easily be evaluated with a recursive algorithm (Bao and Kan 2013; Hillier et al. 2014), accurate numerical evaluation involving higher-order terms can be computer-intensive in practice. The speed of numerical convergence and the computational time will be examined in the next section.

The choice of \(\alpha \) is arbitrary within the constraint \(0< \alpha < 2 / \lambda _{\max }\) with respect to the relevant matrix (see above), but it influences the signs of the terms as well as the speed of convergence. When \(0 < \alpha \le 1 / \lambda _{\max }\), all the zonal and invariant polynomials in the above expressions are positive because the argument matrices have only nonnegative eigenvalues. Therefore, all terms in the summations in \({\bar{c}}\) (13), \({\bar{a}}\) (14), \({\bar{f}}\) (16), and \({\bar{\rho }}\) (18) are positive, so truncation results in (slight) underestimation. On the other hand, all terms in \({\bar{r}}\) (15) and \({\bar{d}}\) (17), except those for \(k = 0\), are negative under the same condition, so truncation results in overestimation. If \(1 / \lambda _{\max }< \alpha < 2 / \lambda _{\max }\), the signs of the polynomials can (though do not always) fluctuate, rendering the behavior of the series less predictable.

It is possible to derive bounds for the truncation errors in \({\bar{c}}\), \({\bar{r}}\), \({\bar{f}}\), and \({\bar{d}}\) by applying the method of Hillier et al. (2009, theorem 6) with slight modifications, if \(\textbf{G}\) is nonsingular and the constants \(\alpha \) are taken within \(0 < \alpha \le 1 / \lambda _{\max }\). Let \({\bar{c}}_M\), \({\bar{r}}_M\), \({\bar{f}}_M\), and \({\bar{d}}_M\) be the relevant partial sums up to \(k = M\). The truncation error bounds are:

$$\begin{aligned} 0&\le {\bar{c}} - {\bar{c}}_M \le \frac{\left( 1 \right) _{M+1}}{\left( \frac{p}{2} \right) _{M+1}} \left( \alpha ^{-\frac{p-2}{2}} \left|\textbf{G} \right|^{\frac{1}{2}} - \alpha \sum _{k=0}^{M} d_{k} \! \left( \textbf{I}_p - \alpha \textbf{G}^{-1} \right) \right) , \end{aligned}$$
(19)
$$\begin{aligned} 0&\ge {\bar{r}} - {\bar{r}}_M \ge \frac{\left( -\frac{1}{2} \right) _{M+1}}{\left( \frac{p}{2} \right) _{M+1}} \left( \alpha ^{-\frac{p+1}{2}} \left|\textbf{G} \right|^{-1} - \alpha ^{-\frac{1}{2}} \sum _{k=0}^{M} d_{k} \! \left( \textbf{I}_p - \alpha \textbf{G}^{2} \right) \right) , \end{aligned}$$
(20)
$$\begin{aligned} 0&\le {\bar{f}} - {\bar{f}}_M \nonumber \\&\le \frac{\left( \frac{1}{2} \right) _{M+1}}{\left( \frac{p}{2} \right) _{M+2}} \left( \alpha ^{-\frac{p+1}{2}} \left|\textbf{G} \right|^{-1} \frac{ {{\,\textrm{tr}\,}}{ \left( \textbf{G}^{-1} \right) } }{2} - \alpha ^{\frac{1}{2}} \sum _{k=0}^{M} d_{k,1} \! \left( \textbf{I}_p - \alpha \textbf{G}^2 , \textbf{G} \right) \right) , \end{aligned}$$
(21)
$$\begin{aligned} 0&\ge {\bar{d}} - {\bar{d}}_M \nonumber \\&\ge \frac{\left( -\frac{1}{2} \right) _{M+1}}{\left( \frac{p}{2} \right) _{M+1}} \left( \alpha ^{-\frac{p+1}{2}} \left|\textbf{G}_1 - \textbf{G}_2 \right|^{-1} - \alpha ^{-\frac{1}{2}} \sum _{k=0}^{M} d_{k} \! \left( \textbf{I}_p - \alpha \left( \textbf{G}_1 - \textbf{G}_2 \right) ^{2} \right) \right) , \end{aligned}$$
(22)

where the constants \(\alpha \) are as in the corresponding definitions (but to satisfy \(0 < \alpha \le 1 / \lambda _{\max }\)). Unfortunately, the same method is not applicable to \({\bar{a}}\) or \({\bar{\rho }}\). Also, if \(\textbf{G}\) (or \(\textbf{G}_1 - \textbf{G}_2\) in the case of \({\bar{d}}\)) is singular, these error bounds do not hold.

Empirically, a larger value of \(\alpha \) often improves the speed of convergence, as more eigenvalues of the argument matrix approach 0, but this should be used with caution since the above error bounds are not strictly applicable if \(\alpha > 1 / \lambda _{\max }\). To be precise, the error bounds should hold even under this condition provided that all the zonal/invariant polynomials above \(k = M\) are nonnegative, but it is not obvious in which cases that condition can be shown to hold.
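To make the structure of these series and bounds concrete, the following base-R sketch evaluates the \({\bar{c}}\) series (13) together with the bound (19), using the standard recursion \(d_k = \frac{1}{2k} \sum _{j=1}^{k} {{\,\textrm{tr}\,}}(\textbf{A}^j) \, d_{k-j}\) for the top-order coefficients (Hillier et al. 2009) and fixing \(\alpha = 1 / \lambda _{\max }\). This is a bare-bones illustration without the scaling safeguards against overflow that the qfratio implementation (below) employs:

```r
## Direct evaluation of the c-bar series (13) and its error bound (19), with
## d_k(A) = ((1/2)_k / k!) C_[k](A) computed by the recursion
## d_k = (1/(2k)) * sum_{j=1}^k tr(A^j) d_{k-j}; no overflow safeguards
cbar_series <- function(G, M = 100) {
  p <- nrow(G)
  Gi <- solve(G)
  alpha <- 1 / max(eigen(Gi, symmetric = TRUE, only.values = TRUE)$values)
  A <- diag(p) - alpha * Gi              # argument matrix; eigenvalues in [0, 1)
  tr_Aj <- numeric(M)                    # tr(A^j), j = 1, ..., M
  Aj <- diag(p)
  for (j in 1:M) { Aj <- Aj %*% A; tr_Aj[j] <- sum(diag(Aj)) }
  d <- numeric(M + 1); d[1] <- 1         # d[k + 1] holds d_k; d_0 = 1
  for (k in 1:M) d[k + 1] <- sum(tr_Aj[1:k] * d[k:1]) / (2 * k)
  ## coefficients (1)_k / (p/2)_k, k = 0, ..., M
  cf <- exp(lgamma(1:(M + 1)) + lgamma(p / 2) - lgamma(p / 2 + 0:M))
  partial <- alpha * cumsum(cf * d)      # partial sums c-bar_M
  bound <- exp(lgamma(M + 2) + lgamma(p / 2) - lgamma(p / 2 + M + 1)) *
    (alpha^(-(p - 2) / 2) * sqrt(det(G)) - alpha * sum(d))  # eq. (19)
  list(value = partial[M + 1], bound = bound, partial = partial)
}
set.seed(4)
G <- crossprod(matrix(rnorm(16), 4))
res <- cbar_series(G, M = 200)
res$value; res$bound                     # c-bar_M and its truncation error bound
## Monte Carlo cross-check of c-bar
bet <- matrix(rnorm(1e5 * 4), ncol = 4)
mean(rowSums(bet^2) / rowSums((bet %*% solve(G)) * bet))
```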

3 Numerical evaluation

3.1 Software implementation

There exist two Matlab programs distributed by Raymond M. Kan (https://www-2.rotman.utoronto.ca/~kan/) for evaluating the moments of simple ratios of quadratic forms in normal variables (Hillier et al. 2009, 2014; Bao and Kan 2013), which can be used to evaluate \({\bar{c}}\), \({\bar{r}}\), and \({\bar{d}}\) if a suitable environment is available. Alternatively, these average measures could be evaluated as hypergeometric functions of matrix argument, for which an efficient, general algorithm has been developed by Koev and Edelman (2006) and implemented in the R package HypergeoMat (Laurent 2022). However, that approach seems relatively inefficient for the present applications, where the terms corresponding to all non-top-order partitions can be omitted. None of these programs can be used to evaluate moments of multiple ratios like \({\bar{a}}\), \({\bar{f}}\), and \({\bar{\rho }}\).

To make all the above expressions available in the popular statistical software environment R, the present author has developed an independent implementation of the algorithms of Hillier et al. (2014) and Bao and Kan (2013) and published it as the package qfratio on the CRAN repository (Watanabe 2023). This package can be used to evaluate general moments of ratios of quadratic forms in normal variables with arbitrary mean and covariance (under certain constraints for singular cases, as discussed in Appendix C.1). It uses RcppEigen (Bates and Eddelbuettel 2013) to efficiently execute the computationally intensive recursive algorithms.
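As a usage sketch covering the general-normal case of Appendix C (the mu and Sigma argument names are assumptions taken from the package documentation):

```r
## Sketch: e-bar under beta ~ N(eta, S), i.e., E[(b'Gb)/(b'b)], via qfratio;
## the mu/Sigma arguments are assumed from the package documentation
library(qfratio)
set.seed(1)
p <- 3
G <- crossprod(matrix(rnorm(p * p), p))
S <- crossprod(matrix(rnorm(p * p), p))   # nonspherical covariance of beta
eta <- rep(1, p)                          # nonzero mean of beta
qfrm(G, diag(p), p = 1, q = 1, mu = eta, Sigma = S, m = 200)
```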

3.2 Numerical experiment: methods

Using the new R package, numerical experiments were conducted on a small set of artificial covariance matrices to evaluate the behavior and efficiency of the present series expressions. For each of \(p = 5, 20, 50\), and 200, three different eigenvalue conformations were used; two were set to have a single large eigenvalue with different magnitudes of integration (\(V_\textrm{rel} = 0.1\) and 0.5, corresponding to relatively weak and rather strong integration, respectively), and one was set to have a quadratically decreasing series of eigenvalues (for the algorithms, see Watanabe (2022) and the R package eigvaldisp (https://github.com/watanabe-j/eigvaldisp)). The matrices with these eigenvalue conformations are referred to as \(\textbf{G}_{0.1}\), \(\textbf{G}_{0.5}\), and \(\textbf{G}_\textrm{Q}\), respectively. For each of these, two eigenvector conformations were used, so that the resultant covariance matrices were either diagonal matrices of the eigenvalues or correlation matrices with the same eigenvalues constructed with Givens rotations (Davies and Higham 2000). For the one-matrix measures (\({\bar{c}}\), \({\bar{r}}\), \({\bar{a}}\), and \({\bar{f}}\)), the choice of eigenvectors is inconsequential. For the two-matrix measures (\({\bar{d}}\) and \({\bar{\rho }}\)), comparisons were made between pairs of matrices with identical and different eigenvectors. The results were also compared with those from the traditional Monte Carlo evaluation using the qfratio package and from the delta method approximation of Hansen and Houle (2008, 2009) using new code.

When possible, the average computational time from 10 runs was recorded for the present series evaluation and for Monte Carlo evaluation with 10,000 iterations in each condition. The exception was \({\bar{\rho }}\) in certain conditions with \(p \ge 50\), where only a single run was timed for the series evaluation because it did not reach numerical convergence (see below). Where a truncation error bound is available, the order of series evaluation M was chosen so that the error is below \(1.0 \times 10^{-8}\). Where an error bound is unavailable, the series was evaluated until the result, rounded to the ninth decimal place, appeared stable. (These values were arbitrarily chosen for benchmarking purposes and are not to be taken as a guide for applications.) No formal timing was done for the delta method approximation, as it is not computer-intensive. All computational times reported are elapsed times rather than CPU times. The calculations were executed on a regular desktop PC with an Intel® Core i7-8700 CPU and 16 GB RAM, running R version 4.2.3 (R Core Team 2023) with its built-in BLAS. For those problems that required more computational memory for convergence (\({\bar{\rho }}\) with \(\textbf{G}_{0.5}\) and \(p \ge 20\); see below), another machine with 256 GB RAM was used to calculate accurate values. The code used in the numerical experiments is provided as Online Resource 1, as well as maintained on a GitHub repository (https://github.com/watanabe-j/avrevol).

3.3 Numerical experiment: results

The present series evaluations as implemented in the R package qfratio reached numerical convergence within reasonable computational time in all cases except \({\bar{\rho }}\) with large (\(p = 200\)) or highly integrated (\(\textbf{G}_{0.5}\)) matrices (detailed below). The order of evaluation M required to achieve a given level of accuracy varied across evolvability measures and eigenvalue conformations, but in most cases increased with p (Figs. 2, 3, 4, 5, 6, 7; Tables 1, 2, 3, 4, 5, 6). The scaling with respect to p tended to be more rapid for the series evaluation than for the Monte Carlo evaluation, when the required accuracy for the former and the number of iterations for the latter were predetermined as described above (Fig. 8).

For \({\bar{c}}\), the computational time required for the error bound to converge below \(1.0 \times 10^{-8}\) was at most on the order of \(\sim \)0.1 s and compared favorably with the Monte Carlo evaluation with 10,000 iterations across the entire range of p examined (Fig. 8; Table 1). For \({\bar{r}}\) and \({\bar{f}}\), the computational time varied from \(\sim \)1.5 ms at \(p = 5\) to several seconds at \(p = 200\), and in some eigenvalue conformations the series evaluation was outperformed by the Monte Carlo evaluation with a fixed number of iterations in terms of plain computational time (Fig. 8; Tables 2, 3). However, it should be remembered that results from Monte Carlo evaluations always involve stochastic error, whose standard error (SE) scales only with the inverse square root of the number of iterations (i.e., 100 times as many calculations are required to reduce the SE by a factor of 10). It is also notable that the partial sums themselves typically appeared to reach asymptotes much earlier than the error bounds did for these measures (Figs. 2, 3, 4). It took substantially longer to evaluate \({\bar{a}}\) accurately than the other one-matrix measures, since it involves a double series, so that the computational burden scales roughly with \(M^2\). Consequently, the computational time required for \({\bar{a}}\) to converge numerically at the order of \(10^{-9}\) varied from \(\sim \)7 ms to \(\sim \)170 s (Fig. 8; Table 4). Technically, this large computational time was partly due to the necessity of scaling individual terms in the evaluation of \({\bar{a}}\) to avoid numerical overflow and underflow.

Fig. 2

Results of numerical experiments for \({\bar{c}}\). Partial sums from the series expression are shown as blue solid lines, and the associated truncation error bounds, plotted as deviations from the partial sum, as red broken lines, across varying orders of evaluation M. The mean and its 95% confidence interval from a Monte Carlo run with 10,000 iterations are shown as brown circles and vertical bars. Delta method approximations by Hansen and Houle (2008, 2009) are shown as green squares. Rows correspond to eigenvalue conformations (see text) and columns to the number of variables p. Each panel spans the ±10% region around the terminal value of the series evaluation

Fig. 3

Results of numerical experiments for \({\bar{r}}\). See Fig. 2 for legend

Fig. 4

Results of numerical experiments for \({\bar{f}}\). See Fig. 2 for legend. Delta method approximation is not available

Fig. 5

Results of numerical experiments for \({\bar{a}}\). See Fig. 2 for legend. Truncation error bounds are not available. In some results, delta method approximation is too inaccurate to be plotted in the panels

Fig. 6

Results of numerical experiments for \({\bar{d}}\). The top three rows are comparisons between covariance matrices with different eigenvalues and the same eigenvectors, whereas the bottom three rows are between matrices with the same eigenvalues and different eigenvectors (indicated by “+ evecs”). Comparisons involving both different eigenvalues and different eigenvectors are omitted owing to space limitations. See Fig. 2 for legend. Error bounds are not available in some same-eigenvalue comparisons because of the singularity of the argument matrices (see text)

Fig. 7

Results of numerical experiments for \({\bar{\rho }}\). The top three rows are comparisons between covariance matrices with different eigenvalues and the same eigenvectors, whereas the bottom three rows are between matrices with the same eigenvalues and different eigenvectors. Comparisons involving both different eigenvalues and different eigenvectors are omitted owing to space limitations. See Fig. 2 for legend. Neither delta method approximations nor error bounds are available

Fig. 8

Summary of computational time. a conditional evolvability \({\bar{c}}\); b respondability \({\bar{r}}\); c flexibility \({\bar{f}}\); d autonomy \({\bar{a}}\); e response difference \({\bar{d}}\); f response correlation \({\bar{\rho }}\). Average computational time t for each eigenvalue conformation (ad; legend in b) and combination of matrices (e, f; legend in f) is shown against varying p, along with average computational time for Monte Carlo evaluation (MC). “+ evecs” indicates combinations with different eigenvectors. Both axes in log scale

Table 1 Results of numerical experiments for \({\bar{c}}\)
Table 2 Results of numerical experiments for \({\bar{r}}\)
Table 3 Results of numerical experiments for \({\bar{f}}\)
Table 4 Results of numerical experiments for \({\bar{a}}\)
Table 5 Results of numerical experiments for \({\bar{d}}\)
Table 6 Results of numerical experiments for \({\bar{\rho }}\)

For the two-matrix comparisons, \({\bar{d}}\) behaved similarly to \({\bar{r}}\), with which it shares the same functional form (Fig. 6; Table 5). In some comparisons between matrices with the same eigenvalues and different eigenvectors, the error bound of \({\bar{d}}\) could not be calculated because the argument matrix \(\left( \textbf{G}_1 - \textbf{G}_2 \right) ^2\) was singular, owing to the artificial conformations of eigenvalues in \(\textbf{G}_{0.1}\) and \(\textbf{G}_{0.5}\), where the trailing eigenvalues are identical in magnitude. Series evaluation of \({\bar{\rho }}\) took substantially more computational time and memory than that of the other measures. For \(p \ge 20\), the series could not be evaluated to numerical convergence in any comparison involving \(\textbf{G}_{0.5}\), because of the excessively large memory requirement on the 16 GB RAM machine used for the main calculations, at least when accuracy on the order of \(10^{-9}\) is aimed at (Fig. 7; Table 6 includes accurate values from a larger machine). This is apparently because \(\textbf{G}_{0.5}^2\) has rather small eigenvalues, which translate into extremely large magnitudes of the higher-order terms \(d_{i,j,1}\) in (18). For \(p = 200\), accurate series evaluation of \({\bar{\rho }}\) seemed infeasible in most combinations because of the excessive computational memory required. The only exception among the present test cases was the comparison between two \(\textbf{G}_\textrm{Q}\)’s with different eigenvectors, where the series converged fairly rapidly (Table 6).

As expected, the Monte Carlo evaluation always yielded an unbiased estimate, with the confidence intervals having the intended nominal coverage of the converged series value where available. As implemented in the qfratio package, iterations on the order of \(10^4\) can be executed within a second, so Monte Carlo evaluation might be favored over the new series expressions when the computational burden is a constraint and stochastic error can be tolerated. Interestingly, the relative accuracies of the series and Monte Carlo evaluations showed a qualitatively similar pattern of variation across eigenvalue conformations for a given average measure and p; the Monte Carlo evaluation tended to yield a large SE when the series evaluation or its error bound was slow to converge (Figs. 2, 3, 4, 5, 6, 7). In other words, eigenvalue conformations seem to largely determine the accuracy with which average measures can be empirically evaluated.

The delta method approximations by Hansen and Houle (2008, 2009) can be evaluated almost instantaneously (from \(\sim \)0.1 to \(\sim \)7 ms at \(p = 5\) and 200, respectively), as they are straightforward arithmetic operations once the eigenvalues of the argument matrix are obtained. For \({\bar{c}}\) in large p, they showed notable empirical accuracy (Fig. 2; Table 1) (see also Hansen and Houle 2008). In many other cases, however, they yielded inaccurate values, often deviating from the converged series by >5%, and it is difficult to discern even a consistent trend in the direction of the errors (Figs. 3, 5, 6; Tables 2, 4, 5). Except in the former cases, it is rather questionable whether these approximations can be reliably used.

4 Discussion

The present paper derived new expressions for average evolvability measures using results on the moments of ratios of quadratic forms in normal variables. These series expressions are exact in theory, although for practical evaluation the summation must be truncated, yielding a truncation error. A great advantage of the present approach is that a hard upper bound for the truncation error is known for some of the measures: namely, the average conditional evolvability \({\bar{c}}\), average respondability \({\bar{r}}\), average flexibility \({\bar{f}}\), and average response difference \({\bar{d}}\) (above). This is unlike the confidence intervals from Monte Carlo evaluations, which always involve some uncertainty. In addition, empirically speaking, evaluation of most of the average measures, excepting the average autonomy \({\bar{a}}\) and average response correlation \({\bar{\rho }}\), can be faster than Monte Carlo evaluation for modest-sized problems (e.g., \(p \le 50\)), depending on the accuracy aimed at. The accuracy and speed of the series evaluation will facilitate practical studies that involve quantification and/or comparison of evolvability or integration using average measures, for which Monte Carlo evaluation or the delta method approximation has traditionally been used.

As a practical guide for calculating average measures, the series evaluation is recommended whenever numerical convergence can be reached within reasonable computational time and memory. Its feasibility largely depends on the eigenvalues of \(\textbf{G}\), but it is always easy to check whether convergence has been reached by inspecting a profile of the partial sums and by using the error bounds when available. When computational time and/or memory is a constraint (as for \({\bar{a}}\) and \({\bar{\rho }}\) in large p with many small eigenvalues), an alternative will be Monte Carlo evaluation, which is adequate because the evolvability measures considered herein have finite second moments (Appendix B). The delta method approximations as proposed by Hansen and Houle (2008, 2009) can yield rather inaccurate values with hardly predictable errors (above), and thus do not remain a viable option for estimating average measures, except perhaps when even the (typically sub-second) computational burden of the Monte Carlo evaluation is a bottleneck, e.g., when one has large Bayesian samples of \(\textbf{G}\). However, a primary problem with the delta method remains: although it empirically works well in certain situations (e.g., \({\bar{c}}\) in large p), there is no theoretical guarantee that it works well in general, so its accuracy always needs to be assessed by another method of evaluation.

It should be remembered that, in empirical analyses, \(\textbf{G}\) is almost always estimated with uncertainty, so that estimates of (average) evolvability measures involve error and potentially bias (Houle and Meyer 2015; Grabowski and Porto 2017). Although uncertainty may partially be accommodated by Bayesian sampling of \(\textbf{G}\) (Aguirre et al. 2014) or by parametric-bootstrap-like sampling with a multivariate normal approximation (Houle and Meyer 2015), these methods will not by themselves deal with potential estimation bias. The present series expressions are not free from this problem, which needs to be addressed in future investigations. It might be possible to derive sampling moments of the (average) measures under simple sampling conditions (e.g., Wishartness), as was recently done for eigenvalue dispersion indices of covariance and correlation matrices (Watanabe 2022). A potentially promising fact is that the expectations of zonal and invariant polynomials of Wishart matrices are known (Constantine 1963; Davis 1980); deriving equivalent results for more complicated functions of random matrices, like those appearing in the present expressions, remains an open task. Apart from that, simulation-based approaches would also be useful for obtaining rough estimates of sampling error and bias (Haber 2011; Grabowski and Porto 2017), as sketched below. In any case, the present expressions will greatly facilitate future investigations into the sampling properties of average measures, as they provide quasi-exact values given (estimates of) \(\textbf{G}\), unlike the approximate methods used in previous studies.
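As an illustration of such a simulation-based assessment, the following sketch uses Wishart sampling as a hypothetical model for the sampling distribution of \(\hat{\textbf{G}}\); the "true" \(\textbf{G}\) and the degrees of freedom are arbitrary choices:

```r
## Sketch of a simulation-based look at sampling error in an average measure,
## using Wishart sampling as a hypothetical model for the distribution of
## G-hat. The 'true' G and the degrees of freedom n are arbitrary choices.
set.seed(1)
p <- 5
A <- matrix(rnorm(p * p), p, p)
G <- crossprod(A) / p

n <- 50                                               # hypothetical df
Ghat <- stats::rWishart(1000, df = n, Sigma = G / n)  # so that E[Ghat] = G

## e_bar = tr(G) / p under spherical beta; evaluate it for each G-hat
e_bar_hat  <- apply(Ghat, 3, function(X) sum(diag(X)) / p)
e_bar_true <- sum(diag(G)) / p

c(true = e_bar_true, mean_est = mean(e_bar_hat), sd_est = sd(e_bar_hat))
```

Because \({\bar{e}}\) is linear in \(\textbf{G}\), it is exactly unbiased under this model; it is with nonlinear measures such as \({\bar{c}}\) and \({\bar{a}}\) that estimation bias becomes a concern.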

Although the present paper dealt with six evolvability measures, this is not to endorse all of them. Some of these measures may be disfavored on biological or other theoretical grounds. For example, Hansen and Houle (2008) criticized the use of (average) response correlation \({\bar{\rho }}\) in the random skewers analysis for its insensitivity to potentially different magnitudes of response vectors and recommended (average) response difference \({\bar{d}}\) as an alternative. (To be fair, Cheverud (1996) argued for the use of \({\bar{\rho }}\) by claiming that there tends to be a large sampling bias in the magnitude of response vectors from empirical covariance matrices, but that claim remains to be justified.) Rohlf (2017) also pointed out difficulties in interpreting \({\bar{\rho }}\) without knowledge of the eigenvalues and eigenvectors of the matrices compared, and recommended simply using those quantities for comparing matrices. The present author refrains from going into detail on the biological validity or utility of individual evolvability measures, because these will ultimately depend on the scope and context of individual studies as well as the nature of the data analyzed.

It seems worth repeating one of Rohlf's (2017) insightful caveats here. Rohlf (2017) considered the angle between response vectors (the arc cosine of response correlation \(\rho \)) from two \(\textbf{G}\) matrices in bivariate traits and empirically showed that the angle can have a bimodal distribution when \(\varvec{\upbeta }\) is distributed uniformly on the unit circle (a numerical illustration follows below). The mean of such a multimodal distribution is not straightforward to interpret, although, at least as empirically observed, the multimodality may not be a universal phenomenon (see Machado et al. 2018). At present, the possibility remains that other evolvability measures can have multimodal distributions as well. Generally speaking, the distribution of the ratio of quadratic forms \(\textbf{x}^T \textbf{A} \textbf{x} / \textbf{x}^T \textbf{B} \textbf{x}\), where \(\textbf{B}\) is positive definite, is known to have different functional forms across intervals bordered by the eigenvalues of \(\textbf{B}^{-1} \textbf{A}\) (Hillier 2001; Forchini 2002, 2005), around which distinct density peaks can potentially be observed. Formal evaluation of the conditions for multimodality seems rather tedious (if tractable at all) and is not pursued herein. It should at least be borne in mind that the average measures are insensitive to potential multimodality and that, generally speaking, a scalar summary index of this sort can mask such nuanced aspects of the underlying distribution and the multivariate covariance structure.
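The following sketch reproduces the gist of Rohlf's observation with two arbitrary bivariate matrices, one strongly correlated and one strongly anti-correlated:

```r
## Empirical sketch of Rohlf's (2017) caveat with two arbitrary bivariate
## G matrices, strongly correlated and strongly anti-correlated respectively.
set.seed(1)
G1 <- matrix(c(1,  0.9,  0.9, 1), 2, 2)
G2 <- matrix(c(1, -0.9, -0.9, 1), 2, 2)

theta <- runif(1e5, 0, 2 * pi)
B <- rbind(cos(theta), sin(theta))       # beta uniform on the unit circle

R1 <- G1 %*% B                           # response vectors
R2 <- G2 %*% B
rho <- colSums(R1 * R2) /
  (sqrt(colSums(R1^2)) * sqrt(colSums(R2^2)))   # response correlation
angle <- acos(pmin(pmax(rho, -1), 1))    # angle between response vectors

hist(angle, breaks = 100)                # two density peaks: near 0 and
                                         # near the maximum attainable angle
```

For these particular matrices, \(\textbf{G}_1 \textbf{G}_2\) is proportional to the identity, so the numerator of \(\rho \) is constant in \(\varvec{\upbeta }\) and the angle is driven entirely by the response norms; density then concentrates at the two extremes, and the mean angle falls between the two modes.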

The present paper suggested slight modifications of some evolvability measures to accommodate potentially singular \(\textbf{G}\) matrices. As originally defined by Hansen and Houle (2008), conditional evolvability c and autonomy a (and integration i) could not be evaluated for singular \(\textbf{G}\), as they required matrix inversion. This restriction can be lifted by using the generalized inverse, as done above. Conditional evolvability c and autonomy a are zero unless \(\varvec{\upbeta } \in R \left( \textbf{G} \right) \). The corresponding average measures (\({\bar{c}}\) and \({\bar{a}}\)) are likewise zero unless the same condition is satisfied with nonzero probability (i.e., when \(\varvec{\upbeta }\) is continuously distributed, its distribution is to be restricted to \(R \left( \textbf{G} \right) \)), as partially surmised by Hansen and Houle (2008). The new expressions enable calculation of c (4), a (6), i, and their averages (C30), (C31) even when \(\textbf{G}\) is singular, as long as \(\varvec{\upbeta } \in R \left( \textbf{G} \right) \). Biologically speaking, the singularity of \(\textbf{G}\) represents the presence of absolute genetic constraint, where no genetic variance is available in certain directions. Although the detection of absolute constraint in empirical systems requires frameworks other than evolvability measures (Mezey and Houle 2005; Hine and Blows 2006; Meyer and Kirkpatrick 2008; Pavlicev et al. 2009), the present development will facilitate investigations into systems with absolute constraints. These will also involve, e.g., shape variables that lack variation in certain directions of the full (nominal) trait space (see below). However, it is necessary to distinguish between singularity in a true \(\textbf{G}\) matrix and singularity in its empirical estimates (\(\hat{\textbf{G}}\), say) due to small sample size. Whether evolvability measures can be meaningfully applied to the latter should be viewed with much caution, because \(R ( \hat{\textbf{G}} )\) fluctuates as a subspace under random sampling when the sample size is not large enough to span the entire \(R \left( \textbf{G} \right) \).
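As a minimal sketch of this modification, the following computes a generalized-inverse version of conditional evolvability for a rank-deficient \(\textbf{G}\), assuming the scale-invariant form \(c = \varvec{\upbeta }^T \varvec{\upbeta } / (\varvec{\upbeta }^T \textbf{G}^{+} \varvec{\upbeta })\) when \(\varvec{\upbeta } \in R(\textbf{G})\) and \(c = 0\) otherwise; the helper function and example matrix are illustrative only, not part of any package:

```r
## Sketch of conditional evolvability for a singular G via the Moore-Penrose
## generalized inverse (MASS::ginv). Assumed form: c = beta'beta /
## (beta' G^+ beta) if beta lies in R(G), and 0 otherwise.
library(MASS)

set.seed(1)
p <- 4
L <- matrix(rnorm(p * 2), p, 2)
G <- tcrossprod(L)                       # rank 2: absolute constraint present

cond_evol <- function(G, beta, tol = 1e-8) {
  Gp <- ginv(G)
  P  <- G %*% Gp                         # orthogonal projector onto R(G)
  if (max(abs(P %*% beta - beta)) > tol) return(0)   # beta outside R(G)
  sum(beta^2) / drop(crossprod(beta, Gp %*% beta))
}

beta_in  <- L[, 1]                       # lies in R(G) by construction
beta_out <- rnorm(p)                     # almost surely outside R(G)
cond_evol(G, beta_in)                    # positive
cond_evol(G, beta_out)                   # zero
```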

Previous treatments of evolvability measures (Hansen and Houle 2008; Kirkpatrick 2009; Marroig et al. 2009; Bolstad et al. 2014) and random skewers (Revell 2007) almost exclusively considered the simple condition where selection gradients are spherically distributed, \(\varvec{\upbeta } \sim N_p (\textbf{0}_p, \sigma ^2 \textbf{I}_p)\), or uniformly distributed on a unit hypersphere. As detailed in Appendix C, the present results can be extended to a more general condition where the selection gradient has an arbitrary normal distribution, \(\varvec{\upbeta } \sim N_p (\varvec{\upeta }, \varvec{\Sigma })\). These extended results can be useful when the interest is in characterizing and comparing the evolvability of populations under a certain directional and/or correlated selection regime (a Monte Carlo sketch is given below). Of course, the original expressions of evolvability measures for a fixed \(\varvec{\upbeta }\) can be used when selection is considered deterministic, so the utility of the new extended expressions lies in cases where selection is considered random but potentially directional and/or correlated.
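A Monte Carlo counterpart of these extended conditions is straightforward. The sketch below draws \(\varvec{\upbeta }\) from a general normal distribution with a singular \(\varvec{\Sigma }\), anticipating the subspace-restricted case discussed further below; all parameter choices are arbitrary illustrations:

```r
## Monte Carlo sketch under a general normal selection regime,
## beta ~ N(eta, Sigma), here with a singular Sigma restricting selection
## to a 2-dimensional subspace. All parameter choices are arbitrary.
library(MASS)

set.seed(1)
p <- 4
A <- matrix(rnorm(p * p), p, p)
G <- crossprod(A) / p

V <- qr.Q(qr(matrix(rnorm(p * 2), p, 2)))    # orthonormal basis of a subspace
Sigma <- tcrossprod(V)                       # rank-2 (singular) covariance
eta <- drop(V %*% c(2, 1))                   # directional mean in the subspace

B <- t(mvrnorm(1e4, mu = eta, Sigma = Sigma))    # p x n draws of beta
e_vals <- colSums(B * (G %*% B)) / colSums(B^2)  # e(beta), scale-invariant
c(e_bar = mean(e_vals), se = sd(e_vals) / sqrt(ncol(B)))
```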

Chevin (2013) theoretically investigated the rate of adaptation (the change in log average fitness across generations) under the same condition, where \(\varvec{\upbeta }\) has a general normal distribution. He notably showed that, when stabilizing selection is assumed to be weak, the rate of adaptation is approximately equal to \(\varvec{\upbeta }^T \textbf{G} \varvec{\upbeta }\). He called this quantity “evolvability”, ascribing it to Hansen and Houle (2008), and discussed its interpretation at length. However, that quantity should be distinguished from the evolvability e as defined by Hansen et al. (2003b) and Hansen and Houle (2008), which is standardized by the norm of the selection gradient \(\varvec{\upbeta }\).Footnote 4 This standardization can also be seen as a projection of the selection gradients onto a unit hypersphere (Mardia and Jupp 1999); directional and correlated selection can then be expressed as concentrations of density on the hypersphere. Accordingly, Chevin's quantity and the present expressions for evolvability measures under general normal selection regimes have different interpretations and utility. The former would be more appropriate as a measure of the absolute rate of adaptation, whereas the latter seems more appropriate for characterizing \(\textbf{G}\) matrices independently of the magnitude of selection.
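To make the distinction explicit, the two quantities are (notation as above):

$$\begin{aligned} e(\varvec{\upbeta }) = \frac{\varvec{\upbeta }^T \textbf{G} \varvec{\upbeta }}{\varvec{\upbeta }^T \varvec{\upbeta }} \qquad \text {versus} \qquad \varvec{\upbeta }^T \textbf{G} \varvec{\upbeta }, \end{aligned}$$

of which only the former is invariant to the rescaling \(\varvec{\upbeta } \rightarrow k \varvec{\upbeta }\).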

The present expressions for the general normal case would be particularly useful when the selection gradients are assumed to be restricted to a certain subspace rather than distributed across the entire trait space, i.e., when \(\varvec{\Sigma }\) is singular. Such a restriction can potentially arise for biological reasons, but is probably more often encountered as a structural constraint in certain types of traits, e.g., shape variables. When the support of the (nominal) traits is restricted to a certain subspace, it would be sensible to assume that selection is also restricted to the same subspace (e.g., when geometric shapes are concerned, it is senseless to consider components of selection corresponding to geometric rotation or translation). Along with the accommodation of potentially singular \(\textbf{G}\) matrices, this extension allows average measures to be applied to the various types of traits encountered in current evolutionary biology. Overall, the present development will facilitate theoretical and empirical investigations into the evolution of multivariate characters by providing accurate means to quantify and compare various aspects of evolvability, integration, and/or constraint.