Abstract
Theory predicts that the additive genetic covariance (\(\textbf{G}\)) matrix determines a population’s short-term (in)ability to respond to directional selection—evolvability in the Hansen–Houle sense—which is typically quantified and compared via certain scalar indices called evolvability measures. Often, interest is in obtaining the averages of these measures across all possible selection gradients, but explicit formulae for most of these average measures have not been known. Previous authors relied either on delta-method approximations, whose accuracy is generally unknown, or on Monte Carlo evaluations (including the random skewers analysis), which necessarily involve random fluctuations. This study presents new, exact expressions for the average conditional evolvability, average autonomy, average respondability, average flexibility, average response difference, and average response correlation, utilizing their mathematical structures as ratios of quadratic forms. The new expressions are infinite series involving top-order zonal and invariant polynomials of matrix arguments, and can be numerically evaluated as partial sums, with known error bounds for some of the measures. Whenever these partial sums numerically converge within reasonable computational time and memory, they can replace the previous approximate methods. In addition, new expressions are derived for the average measures under a general normal distribution for the selection gradient, extending the applicability of these measures to a substantially broader class of selection regimes.
1 Introduction
The quantitative genetic theory of multivariate trait evolution provides a powerful framework to analyze and predict phenotypic evolution (Steppan et al. 2002; Blows 2007; Blows and Walsh 2009; Walsh and Blows 2009; Teplitsky et al. 2014). At the core of the theory is the Lande equation, which describes a population’s response to directional selection under certain simplifying conditions (Lande 1979; Lande and Arnold 1983):

\( \Delta \bar{\textbf{z}} = \textbf{G} \varvec{\upbeta }, \)
where \(\varvec{\upbeta }\) is a p-dimensional selection gradient vector (the vector of partial regression coefficients of relative fitness on the trait values), \(\textbf{G}\) is a \(p \times p\) (additive) genetic covariance matrix, and \(\Delta \bar{\textbf{z}}\) is a p-dimensional vector of per-generation changes in the population’s mean trait values. (Throughout the paper, p denotes the (nominal) number of traits/variables analyzed, and all vectors are column vectors unless transposed.) In light of this theory, the \(\textbf{G}\) matrix is supposed to represent the population’s (in)ability to respond to directional selection—what has been termed evolvability by Hansen, Houle and colleagues (Houle 1992; Hansen 2003; Hansen et al. 2003a, b, 2011; Hansen and Houle 2008; Hansen and Pélabon 2021). The theory has motivated extensive investigations into aspects of the \(\textbf{G}\) matrix or its substitutes, ranging from theoretical and simulation-based analyses (e.g., Pavlicev and Hansen 2011; Pavlicev et al. 2011; Jones et al. 2012; Chevin 2013; Melo and Marroig 2015; Hansen et al. 2019) to empirical inter-population/interspecific comparisons of trait covariation and its extension to inter-population divergences (e.g., Cheverud 1982, 1989; Schluter 1996; Porto et al. 2009; Marroig et al. 2009; Rolian 2009; Bolstad et al. 2014; Haber 2016; Puentes et al. 2016; Costa e Silva et al. 2020; Machado 2020; McGlothlin et al. 2022; Opedal et al. 2022, 2023; Hubbe et al. 2023).
Based on this framework, and following earlier theoretical developments (Hansen 2003; Hansen et al. 2003a, b), Hansen and Houle (2008) proposed several scalar indices to capture certain aspects of the multivariate evolvability represented in \(\textbf{G}\) (see also Kirkpatrick 2009; Marroig et al. 2009). Of these, (unconditional) evolvability e, conditional evolvability c, autonomy a, integration i, respondability r, and flexibility f (due to Marroig et al. 2009) concern a single \(\textbf{G}\) matrix (Fig. 1A), whereas response difference d concerns comparison between two \(\textbf{G}\) matrices. Related to the latter is the use of response correlation \(\rho \) for matrix comparison (Cheverud 1996; Cheverud and Marroig 2007) (Fig. 1B). Notably, all these indices are simple or multiple ratios of quadratic forms in \(\varvec{\upbeta }\) (see below). In this paper, they are collectively called evolvability measures. Many of them can be related to the rate of adaptation or evolutionary constraint/bias (e.g., Hansen and Houle 2008; Chevin 2013; Bolstad et al. 2014; Hansen et al. 2019), providing means to characterize and compare \(\textbf{G}\) matrices in biologically meaningful ways.
The evolvability measures are primarily defined with respect to a given selection gradient \(\varvec{\upbeta }\). In this sense, they represent aspects of \(\textbf{G}\) matrices under a fixed, deterministic selection. In many theoretical and comparative analyses of evolvability (see references above), however, interest is often in characterizing \(\textbf{G}\) matrices without referring to \(\varvec{\upbeta }\). For this purpose, it is sensible to take the expectation of an evolvability measure assuming a certain probability distribution of \(\varvec{\upbeta }\). Most typically, although not necessarily, the uniform distribution on a unit hypersphere is assumed for this purpose, representing a completely randomly directed and uncorrelated selection regime. The resultant quantities are called average measures hereafter. Biologically speaking, the average measures may be regarded as representations of general evolvability (in the sense of Riederer et al. 2022), as compared to specific evolvability, which is represented by the evolvability measures as functions of a fixed \(\varvec{\upbeta }\).
However, there is a major obstacle to using average measures for practical or even theoretical investigations: most of the average measures have lacked explicit, exact expressions beyond their definitions as expectations, except for the average evolvability \({\bar{e}}\) and, under certain special conditions, the average conditional evolvability \({\bar{c}}\) and the average respondability \({\bar{r}}\) (Hansen and Houle 2008; Kirkpatrick 2009). Consequently, previous studies had to rely on approximations of average measures. Presumably the most widely used approximation method is Monte Carlo evaluation (e.g., Marroig et al. 2009; Bolstad et al. 2014; Haber 2016; Grabowski and Porto 2017), in which a large number of \(\varvec{\upbeta }\) vectors are randomly generated on a computer and the values calculated with the primary definition are literally averaged to yield an estimate of the average measure. This has long been done in the so-called random skewers analysis for matrix comparison using the response correlation \(\rho \) (e.g., Cheverud 1996; Cheverud and Marroig 2007; Revell 2007; see also Rohlf 2017), and it is implemented in the R packages evolvability (Bolstad et al. 2014) and evolqg (Melo et al. 2016) for calculating various average measures. This method necessarily involves random fluctuations in the estimate and can require substantial computational time, although the latter is becoming less of a concern with modern-day computers. On the other hand, Hansen and Houle (2008, 2009) themselves provided approximate expressions for average measures based on the delta method (which they called “analytical” or “standard” approximations). Apart from some ill-defined notations therein, a practical problem is that the accuracy of this sort of approximation is generally unknown, so a separate Monte Carlo evaluation is usually required to ascertain whether the delta method approximation has acceptable accuracy.
This approximation has been used in some subsequent studies (e.g., Hansen and Voje 2011; Brommer 2014; Delahaie et al. 2017; Saltzberg et al. 2022) and is available in the evolvability package. Apart from these methods, Kirkpatrick (2009) used numerical integration to evaluate his average selection response, a measure equivalent to respondability (below), but this method has not been widely used in evaluating average measures.
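To make the Monte Carlo approach concrete, the following is a minimal sketch in Python/NumPy (the implementations cited above are in R, and the function names here are hypothetical). It draws selection gradients uniformly on the unit hypersphere by normalizing spherical normal deviates, averages the evolvability \(e(\textbf{u}) = \textbf{u}^T \textbf{G} \textbf{u}\) over them, and compares the estimate with the exact average \({{\,\textrm{tr}\,}} \textbf{G} / p\) discussed later in the text.

```python
import numpy as np

def monte_carlo_average(G, measure, n_iter=10_000, seed=0):
    """Monte Carlo estimate of an average evolvability measure.

    Draws selection gradients uniformly on the unit hypersphere
    (normalized spherical normal deviates) and averages the given
    measure over them, as in the random-skewers approach.
    """
    rng = np.random.default_rng(seed)
    p = G.shape[0]
    beta = rng.standard_normal((n_iter, p))
    u = beta / np.linalg.norm(beta, axis=1, keepdims=True)  # unit vectors
    return float(np.mean([measure(G, ui) for ui in u]))

def evolvability(G, u):
    # e(u) = u^T G u for a unit-length selection gradient u
    return u @ G @ u

G = np.diag([4.0, 2.0, 1.0, 0.5])
e_bar_mc = monte_carlo_average(G, evolvability)
# Compare with the exact average evolvability, tr(G)/p
print(e_bar_mc, np.trace(G) / G.shape[0])
```

The estimate fluctuates around the exact value; the magnitude of the fluctuation shrinks only as the square root of the number of iterations, which is the drawback the exact series expressions below avoid.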
This technical paper provides exact expressions for the following average measures: average evolvability \({\bar{e}}\), average conditional evolvability \({\bar{c}}\), average autonomy \({\bar{a}}\), average respondability \({\bar{r}}\), average flexibility \({\bar{f}}\), average response difference \({\bar{d}}\), and average response correlation \({\bar{\rho }}\). These expressions are derived from existing and new results on the moments of simple or multiple ratios of quadratic forms in normal variables. They are expressed as infinite series involving zonal or invariant polynomials of matrix arguments, and in most cases can be numerically evaluated as their partial sums to yield quasi-exact values. For some of the expressions, an upper bound for the truncation error is available. In addition to the special condition where \(\varvec{\upbeta }\) is spherically distributed as assumed in most previous studies (but see also Chevin 2013), this study also concerns the general condition that \(\varvec{\upbeta }\) is normally distributed with potentially nonzero mean and nonspherical covariance. The latter condition can be of substantial biological interest, as it can model a fairly wide range of random directional and/or correlated selection regimes.
The paper is structured as follows. Section 2 first reviews the definitions of the evolvability measures; some of them are redefined to accommodate potentially singular \(\textbf{G}\) matrices. After reviewing some known results on the moments of a ratio of quadratic forms, it then provides new expressions for average evolvability measures under the spherical distribution of \(\varvec{\upbeta }\). Section 3 presents a new R implementation of the analytic results and evaluates its performance by numerical experiments. Section 4 concludes the main body of the paper by adding some theoretical and practical considerations on the use of average measures. As the zonal and invariant polynomials appear to have rarely been used in the biological literature, Appendix A gives a brief overview of their theories, providing a basis for the present mathematical results. Appendix B states a new result on the moment of a multiple ratio of quadratic forms in normal variables, and Appendix C presents new expressions for average measures under the general normal distribution of \(\varvec{\upbeta }\). Appendix D clarifies connections to previous results derived under special conditions (Hansen and Houle 2008; Kirkpatrick 2009).
2 Theory
2.1 Notations
In the following discussion, it is convenient to define the linear combination of the variables (traits) along the direction of \(\varvec{\upbeta }\). For this purpose, consider the decomposition \(\varvec{\upbeta } = |\varvec{\upbeta } | \textbf{u}\), where \(|\cdot |\) is the vector norm or length, \(|\textbf{a} | = \sqrt{\textbf{a}^T \textbf{a}}\) for an arbitrary vector \(\textbf{a}\) (the superscript \({}^T\) denotes matrix transpose), and \(\textbf{u}\) is a unit vector: \(\textbf{u}^T \textbf{u} = 1\). Define the \(p \times (p - 1)\) matrix \(\textbf{U}_{(-1)}\) so that the matrix \(\textbf{U} = (\textbf{u}, \textbf{U}_{(-1)})\) is orthogonal. With these, the orthogonal linear transform of the variables \(\textbf{z}^* = \textbf{U}^T \textbf{z}\) will be considered; the first entry of this new vector, \(\textbf{u}^T \textbf{z}\), represents the score along the direction of \(\varvec{\upbeta }\). Note that the covariance matrix of the new variables \(\textbf{z}^*\) can be written as

\( \textbf{U}^T \textbf{G} \textbf{U} = \begin{pmatrix} \textbf{u}^T \textbf{G} \textbf{u} & \textbf{u}^T \textbf{G} \textbf{U}_{(-1)} \\ \textbf{U}_{(-1)}^T \textbf{G} \textbf{u} & \textbf{U}_{(-1)}^T \textbf{G} \textbf{U}_{(-1)} \end{pmatrix}. \)
Previous authors (Hansen and Houle 2008; Marroig et al. 2009; Grabowski and Porto 2017) defined some evolvability measures under the (sometimes implicit) constraint \(|\varvec{\upbeta } | = 1\). To avoid potential confusion, this standardized vector is always denoted by \(\textbf{u}\) herein. Throughout this paper, \(\textbf{G}\) is assumed to be given rather than random and validly constructed as a covariance matrix, that is, symmetric and nonnegative definite (i.e., either positive definite or positive semidefinite, in the terminology of Schott 2016).
Although Hansen and Houle (2008) assumed \(\textbf{G}\) to be positive definite, this assumption is not imposed here to accommodate potentially singular (and positive semidefinite) \(\textbf{G}\) where possible. This is done by using the generalized inverse; for a matrix \(\textbf{A}\), a generalized inverse \(\textbf{A}^{-}\) is such a matrix that satisfies \(\textbf{A} \textbf{A}^{-} \textbf{A} = \textbf{A}\). If \(\textbf{A}\) is nonsingular, \(\textbf{A}^{-} = \textbf{A}^{-1}\).
For notational simplicity, the range or column space and the null space of a nonzero \(p \times q\) matrix \(\textbf{A}\) are defined: \(R \left( \textbf{A} \right) = \{ \textbf{y}: \textbf{y} = \textbf{A} \textbf{x}, \textbf{x} \in {\mathbb {R}}^q \}\), and \(N \left( \textbf{A} \right) = \{ \textbf{x}: \textbf{0}_p = \textbf{A} \textbf{x}, \textbf{x} \in {\mathbb {R}}^q \}\), where \(\textbf{0}_p\) is the p-dimensional vector of zeros. When \(\textbf{A}\) is nonnegative definite, these are the spaces spanned by the eigenvectors corresponding to its nonzero and zero eigenvalues, respectively. The p-dimensional identity matrix is denoted by \(\textbf{I}_p\).
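As an aside on the construction of \(\textbf{U} = (\textbf{u}, \textbf{U}_{(-1)})\), one concrete way to complete a unit vector \(\textbf{u}\) to an orthogonal matrix is a Householder reflection. The following Python/NumPy sketch (the function name is hypothetical, not from the paper) does this and checks that the leading entry of \(\textbf{U}^T \textbf{G} \textbf{U}\) is the variance along \(\textbf{u}\), i.e., \(\textbf{u}^T \textbf{G} \textbf{u}\).

```python
import numpy as np

def orthogonal_completion(u):
    """Orthogonal matrix U whose first column is the unit vector u.

    Uses a Householder reflection that maps the first standard basis
    vector e1 onto u; the remaining columns then form U_{(-1)}.
    """
    p = len(u)
    e1 = np.zeros(p)
    e1[0] = 1.0
    v = u - e1
    if np.linalg.norm(v) < 1e-12:   # u already (numerically) equals e1
        return np.eye(p)
    return np.eye(p) - 2.0 * np.outer(v, v) / (v @ v)

# Example: covariance of z* = U^T z when Cov(z) = G
G = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 0.5]])
beta = np.array([1.0, 2.0, -1.0])
u = beta / np.linalg.norm(beta)
U = orthogonal_completion(u)
G_star = U.T @ G @ U                # covariance of the rotated variables
# top-left entry is the variance along u, i.e. u^T G u
print(np.isclose(G_star[0, 0], u @ G @ u))
```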
2.2 Evolvability measures: fixed selection
Evolvability \(e(\varvec{\upbeta })\) is the variance in \(\textbf{G}\) along the direction of selection \(\textbf{u}\), or equivalently the norm of the response vector \(\Delta \bar{\textbf{z}}\) projected onto \(\varvec{\upbeta }\), standardized by \(|\varvec{\upbeta } |\) (Hansen and Houle 2008) (Fig. 1):

\( e(\varvec{\upbeta }) = \frac{\varvec{\upbeta }^T \textbf{G} \varvec{\upbeta }}{\varvec{\upbeta }^T \varvec{\upbeta }} \)  (1)
\( \qquad = \textbf{u}^T \textbf{G} \textbf{u}. \)  (2)
The numerator in (1) can be related to the rate of adaptation (change in mean fitness) under simple directional selection (Agrawal and Stinchcombe 2009; Chevin 2013).
Conditional evolvability \(c(\varvec{\upbeta })\) is conceptually defined as the variance along \(\textbf{u}\) when evolution in other traits/directions is not allowed due to stabilizing selection (Hansen 2003; Hansen et al. 2003a, 2019). With the above notations, it is the conditional variance of the first element of \(\textbf{z}^*\) given the other elements (redefined here after Hansen and Houle 2008):

\( c(\varvec{\upbeta }) = \left( \textbf{u}^T \textbf{G}^{-} \textbf{u} \right) ^{-1} \)  (3)
\( \qquad = \frac{\varvec{\upbeta }^T \varvec{\upbeta }}{\varvec{\upbeta }^T \textbf{G}^{-} \varvec{\upbeta }}. \)  (4)
Here, the equation of (3) is from well-known results on the (generalized) inverse of a partitioned matrix (e.g., Schott 2016, theorems 7.1 and 7.12), for which Hansen and Houle (2008) restated a proof in nonsingular \(\textbf{G}\). A lack of conditional variance along \(\varvec{\upbeta }\) results in \(c(\varvec{\upbeta }) = 0\), which happens when \(\textbf{G}\) is singular and \(\varvec{\upbeta } \notin R \left( \textbf{G} \right) \); the expressions (3) and (4) do not hold in this case.
Autonomy \(a(\varvec{\upbeta })\) is defined as the complement of the squared multiple correlation of the first element of \(\textbf{z}^*\) with respect to the other elements:

\( a(\varvec{\upbeta }) = \frac{c(\varvec{\upbeta })}{e(\varvec{\upbeta })} = \left( \textbf{u}^T \textbf{G} \textbf{u} \cdot \textbf{u}^T \textbf{G}^{-} \textbf{u} \right) ^{-1} = \frac{ \left( \varvec{\upbeta }^T \varvec{\upbeta } \right) ^2 }{ \left( \varvec{\upbeta }^T \textbf{G} \varvec{\upbeta } \right) \left( \varvec{\upbeta }^T \textbf{G}^{-} \varvec{\upbeta } \right) }. \)
This quantity is undefined when \(e(\varvec{\upbeta }) = \textbf{u}^T \textbf{G} \textbf{u} = 0\). Hansen and Houle (2008) also defined the integration \(i(\varvec{\upbeta })\), which is the squared correlation just mentioned: \(i(\varvec{\upbeta }) = 1 - a(\varvec{\upbeta })\).
Respondability \(r(\varvec{\upbeta })\) is the magnitude of response standardized by that of selection, or the ratio of the norms of \(\Delta \bar{\textbf{z}}\) and \(\varvec{\upbeta }\):

\( r(\varvec{\upbeta }) = \frac{ | \Delta \bar{\textbf{z}} | }{ | \varvec{\upbeta } | } = \sqrt{ \frac{ \varvec{\upbeta }^T \textbf{G}^2 \varvec{\upbeta } }{ \varvec{\upbeta }^T \varvec{\upbeta } } } = \sqrt{ \textbf{u}^T \textbf{G}^2 \textbf{u} }. \)
The definition and terminology of r here follow Hansen and Houle (2008), but it should be noted that Kirkpatrick (2009) independently devised an equivalent measure as the (relative) selection response (R in the latter author’s notation) with a slightly different standardization.
Flexibility \(f(\varvec{\upbeta })\) quantifies similarity in direction between \(\Delta \bar{\textbf{z}}\) and \(\varvec{\upbeta }\) in terms of the vector correlation (cosine of the angle formed by the two vectors), and is also the ratio of \(e(\varvec{\upbeta })\) to \(r(\varvec{\upbeta })\) (Hansen and Houle 2008; Marroig et al. 2009):

\( f(\varvec{\upbeta }) = \frac{ \varvec{\upbeta }^T \Delta \bar{\textbf{z}} }{ | \varvec{\upbeta } | | \Delta \bar{\textbf{z}} | } = \frac{ \textbf{u}^T \textbf{G} \textbf{u} }{ \sqrt{ \textbf{u}^T \textbf{G}^2 \textbf{u} } } = \frac{ e(\varvec{\upbeta }) }{ r(\varvec{\upbeta }) }. \)
This quantity is undefined if \(|\Delta \bar{\textbf{z}} | = 0\), i.e., \(\varvec{\upbeta } \in N \left( \textbf{G} \right) \). Otherwise, \(0 < f(\varvec{\upbeta }) \le 1\) from the nonnegative definiteness of \(\textbf{G}\) and the definition of vector correlation. The term flexibility was coined by Marroig et al. (2009) at Hansen’s suggestion, although the use of the vector correlation or angle had been suggested in a few other works (Blows and Walsh 2009; Rolian 2009).
All the above indices are one-matrix measures that primarily concern characterization of a single \(\textbf{G}\) matrix. The following two indices instead concern pairwise comparisons between two \(\textbf{G}\) matrices in terms of the (dis)similarity of the response vectors given the same \(\varvec{\upbeta }\). The two \(\textbf{G}\) matrices and \(\Delta \bar{\textbf{z}}\) are denoted here with subscripts: \(\textbf{G}_i\), \(\Delta \bar{\textbf{z}}_i\). Response difference \(d(\varvec{\upbeta })\) is a measure of dissimilarity, defined as a standardized difference between the two response vectors (Hansen and Houle 2008):

\( d(\varvec{\upbeta }) = \frac{ | \Delta \bar{\textbf{z}}_1 - \Delta \bar{\textbf{z}}_2 | }{ | \varvec{\upbeta } | } = \sqrt{ \textbf{u}^T \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \textbf{u} }. \)
Response correlation \(\rho (\varvec{\upbeta })\) is a measure of similarity in direction, defined as the vector correlation between the two response vectors (e.g., Cheverud 1996; Cheverud and Marroig 2007):

\( \rho (\varvec{\upbeta }) = \frac{ \Delta \bar{\textbf{z}}_1^T \Delta \bar{\textbf{z}}_2 }{ | \Delta \bar{\textbf{z}}_1 | | \Delta \bar{\textbf{z}}_2 | } = \frac{ \textbf{u}^T \textbf{G}_1 \textbf{G}_2 \textbf{u} }{ \sqrt{ \left( \textbf{u}^T \textbf{G}_1^2 \textbf{u} \right) \left( \textbf{u}^T \textbf{G}_2^2 \textbf{u} \right) } }. \)
This quantity is undefined when at least one of \(|\Delta \bar{\textbf{z}}_1 |\) and \(|\Delta \bar{\textbf{z}}_2 |\) is 0. Otherwise, \(-1 \le \rho (\varvec{\upbeta }) \le 1\).
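Collecting the definitions above, a minimal Python/NumPy sketch of the fixed-\(\varvec{\upbeta }\) measures might look as follows. The helper names are hypothetical, and \(\textbf{G}\) is assumed nonsingular so that the ordinary inverse can stand in for the generalized inverse.

```python
import numpy as np

def measures(G, beta):
    """One-matrix evolvability measures for a fixed selection gradient.

    Follows the Hansen-Houle definitions; all are simple or multiple
    ratios of quadratic forms in beta (G assumed nonsingular here).
    """
    u = beta / np.linalg.norm(beta)
    Gu = G @ u
    e = u @ Gu                                # evolvability
    c = 1.0 / (u @ np.linalg.solve(G, u))     # conditional evolvability
    a = c / e                                 # autonomy
    r = np.linalg.norm(Gu)                    # respondability
    f = e / r                                 # flexibility = cos(angle)
    return dict(e=e, c=c, a=a, i=1 - a, r=r, f=f)

def response_difference(G1, G2, beta):
    u = beta / np.linalg.norm(beta)
    return np.linalg.norm((G1 - G2) @ u)

def response_correlation(G1, G2, beta):
    d1, d2 = G1 @ beta, G2 @ beta
    return (d1 @ d2) / (np.linalg.norm(d1) * np.linalg.norm(d2))

G = np.diag([3.0, 1.0, 0.5])
m = measures(G, np.array([1.0, 1.0, 1.0]))
# sanity checks implied by the definitions: c <= e and 0 < f <= 1
print(m["c"] <= m["e"], 0 < m["f"] <= 1)
```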
2.3 Ratio of quadratic forms
A key to deriving expressions for the average measures is that the measures defined above are simple or multiple ratios of quadratic forms in \(\varvec{\upbeta }\) or \(\textbf{u}\). Ratios of quadratic forms, especially those in normal random variables, play a pivotal role in statistics, as many practically important test statistics can be written as ratios of quadratic forms. Naturally, there is a vast body of literature on the distribution and moments of quadratic forms and ratios thereof (e.g., Magnus 1986; Jones 1986, 1987; Smith 1989, 1993; Mathai and Provost 1992; Hillier 2001; Forchini 2002; Hillier et al. 2009, 2014; Bao and Kan 2013).
Regarding the ratio of quadratic forms in normal variables, there are several equivalent ways to derive its moments. One starts from the joint moment-generating function of the quadratic forms (e.g., Cressie et al. 1981; Meng 2005) and typically results in integrals that do not have simple closed-form expressions (e.g., Magnus 1986; Jones 1986, 1987; Gupta and Kabe 1998). Another way is to expand the ratio into an infinite series, which in turn is integrated using the zonal and invariant polynomials (e.g., Smith 1989, 1993; Hillier 2001; Hillier et al. 2009, 2014). (See Bao and Kan (2013) for connections between these two approaches.) The latter approach typically yields an infinite series involving zonal or invariant polynomials, which can be evaluated with reasonable speed and accuracy with the aid of recent algorithmic developments (Hillier et al. 2009, 2014; Bao and Kan 2013); this is the approach followed in the present paper. The zonal polynomials are a special form of homogeneous polynomials in the eigenvalues of a matrix, which generalize powers of scalars to symmetric matrices (e.g., James 1960, 1964; Muirhead 1982, 2006; Gross and Richards 1987; Mathai et al. 1995). The invariant polynomials are a generalization of the zonal polynomials to multiple matrix arguments (Davis 1979, 1980; Chikuse 1980, 2003; Chikuse and Davis 1986b). A brief overview of these mathematical tools is provided in Appendix A.
A general result for the moments of a simple ratio of quadratic forms in normal variables is restated in the following proposition.
Proposition 1
(Smith 1989; Bao and Kan 2013) Let \(\textbf{x} \sim N_p \left( \varvec{\upmu }, \textbf{I}_p \right) \), \(\textbf{A}\) be a \(p \times p\) symmetric matrix, \(\textbf{B}\) be a \(p \times p\) nonnegative definite matrix, and m, n be positive real numbers. When the expectation of \(\frac{(\textbf{x}^T \textbf{A} \textbf{x})^m}{(\textbf{x}^T \textbf{B} \textbf{x})^n}\) exists, it can be written as
where \(\alpha _{\textbf{A}}\) and \(\alpha _{\textbf{B}}\) are any positive constants that satisfy \(0< \alpha _{\textbf{A}} < 2 / \lambda _{\max } \! \left( \textbf{A} \right) \) and \(0< \alpha _{\textbf{B}} < 2 / \lambda _{\max } \! \left( \textbf{B} \right) \), with \(\lambda _{\max } \! \left( \cdot \right) \) denoting the largest eigenvalue of the argument matrix, \(\left( a \right) _{k} = a (a + 1) \dots (a + k - 1)\), and \(C^{\left[ {i} \right] , \left[ {j} \right] , \left[ {k} \right] }_{\left[ {i+j+k} \right] } \! \left( \cdot , \cdot , \cdot \right) \) are the (i, j, k)-th top-order invariant polynomials (whose explicit form is presented in Appendix A).
Remark
Conditions for the existence of the moment are stated in Bao and Kan (2013, proposition 1 therein): when \(\textbf{B}\) is positive definite, the moment exists if and only if \(\frac{p}{2} + m > n\); and when \(\textbf{B}\) is positive semidefinite, slightly different conditions apply, depending on the rank of \(\textbf{B}\) and structures of \(\textbf{A}\) and \(\textbf{B}\). When m is an integer, the expressions can be simplified so that the summation for i disappears. Also, when \(\varvec{\upmu } = \textbf{0}_p\), they substantially simplify as all terms with \(k > 0\) are zero. If \(\textbf{A}\) or \(\textbf{B}\) is asymmetric, it can be symmetrized by using \(\textbf{x}^T \textbf{A} \textbf{x} = \textbf{x}^T \textbf{A}^T \textbf{x} = \textbf{x}^T (\textbf{A} + \textbf{A}^T) \textbf{x} / 2\).
In practice, the above series can be approximated by its partial sum, with recursive algorithms to calculate \(d_{i, j, k} \! \left( \textbf{A}_1 , \textbf{A}_2 , \textbf{A}_3 \right) = \frac{ \left( \frac{1}{2} \right) _{i + j + k} }{ i! j! k!} C^{\left[ {i} \right] , \left[ {j} \right] , \left[ {k} \right] }_{\left[ {i + j + k} \right] } \! \left( \textbf{A}_1 , \textbf{A}_2 , \textbf{A}_3 \right) \) (Chikuse 1987; Hillier et al. 2009, 2014; Bao and Kan 2013). General covariance structures for \(\textbf{x}\) can be accommodated under certain conditions by transforming the variables and quadratic forms, as described in Appendix C. Appendix C also provides a similar result for multiple ratios of the form \(\frac{(\textbf{x}^T \textbf{A} \textbf{x})^l}{(\textbf{x}^T \textbf{B} \textbf{x})^m (\textbf{x}^T \textbf{D} \textbf{x})^n}\).
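To illustrate the flavor of these recursions in the simplest case, consider the single-matrix coefficients \(d_k \! \left( \textbf{A} \right) = \frac{ \left( \frac{1}{2} \right) _k }{k!} C_{\left[ {k} \right] } \! \left( \textbf{A} \right) \), which are the Taylor coefficients of \(\det \left( \textbf{I} - t \textbf{A} \right) ^{-1/2}\) and satisfy the recursion \(d_k = \frac{1}{2k} \sum _{j=1}^{k} {{\,\textrm{tr}\,}} \left( \textbf{A}^j \right) d_{k-j}\) with \(d_0 = 1\). The following Python/NumPy sketch implements this recursion (it is illustrative only, not the qfratio implementation, which handles the full multi-matrix case in compiled code) and checks it against the closed form for \(\textbf{A} = \textbf{I}_p\), where \(\det \left( \textbf{I} - t \textbf{I}_p \right) ^{-1/2} = (1 - t)^{-p/2}\) gives \(d_k = \left( p/2 \right) _k / k!\).

```python
import numpy as np

def d_coefs(A, M):
    """Coefficients d_k(A) = ((1/2)_k / k!) C_[k](A) for k = 0..M.

    These are the Taylor coefficients of det(I - t A)^(-1/2),
    computed by the recursion
    d_k = (1/(2k)) * sum_{j=1}^{k} tr(A^j) d_{k-j},  d_0 = 1.
    """
    lam = np.linalg.eigvalsh(A)
    pow_tr = [np.sum(lam ** j) for j in range(M + 1)]   # tr(A^j)
    d = np.zeros(M + 1)
    d[0] = 1.0
    for k in range(1, M + 1):
        d[k] = sum(pow_tr[j] * d[k - j] for j in range(1, k + 1)) / (2 * k)
    return d

# Check against the known closed form for A = I_p: d_k = (p/2)_k / k!
p = 4
d = d_coefs(np.eye(p), 5)
expected = [1.0]
for k in range(1, 6):
    expected.append(expected[-1] * (p / 2 + k - 1) / k)
print(np.allclose(d, expected))
```

Only powers of traces of the argument matrix enter the recursion, which is why each additional term is cheap to compute even though the partial sums may need to run to high order.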
2.4 Average measures for uniformly distributed selection
With the above theories, average measures can be readily derived as expectations with respect to random \(\varvec{\upbeta }\) under the assumption of normality. Previous treatments of evolvability measures (Hansen and Houle 2008; Kirkpatrick 2009) and random skewers analysis (Revell 2007) typically assumed that \(\textbf{u}\) is uniformly distributed on the unit hypersphere in the p-dimensional space. This condition is entailed in spherical multivariate normal distributions of \(\varvec{\upbeta }\): \(\varvec{\upbeta } \sim N_p \left( \textbf{0}_p, \sigma ^2 \textbf{I}_p \right) \) for arbitrary positive \(\sigma ^2\). This section provides expressions of average measures under this condition. (Note that the normality of \(\varvec{\upbeta }\) is not necessary for \(\textbf{u}\) to be uniformly distributed; i.e., all results in this section hold as long as the distribution of \(\varvec{\upbeta }\) is spherically symmetric.) Expressions for a general normal case \(\varvec{\upbeta } \sim N_p \left( \varvec{\upeta }, \varvec{\Sigma } \right) \) are given in Appendix C. Most of the results below can be derived either by applying Proposition 1 or Proposition 4 in Appendix B, or directly as done in Appendix A.
The average evolvability \({\bar{e}}\) is straightforward to obtain:

\( {\bar{e}} = \frac{ {{\,\textrm{tr}\,}} \left( \textbf{G} \right) }{p}, \)
as derived by Hansen and Houle (2008). This is because \({{\,\textrm{E}\,}}_{\textbf{u}}\! \left( \textbf{u} \textbf{u}^T \right) = \textbf{I}_p / p\) under this condition.
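The expectation identity \({{\,\textrm{E}\,}}_{\textbf{u}}\! \left( \textbf{u} \textbf{u}^T \right) = \textbf{I}_p / p\) is easy to verify numerically; the following Python/NumPy snippet (illustrative only) estimates the second-moment matrix of uniformly distributed unit vectors by Monte Carlo.

```python
import numpy as np

# Numerical check that E(u u^T) = I_p / p for u uniform on the unit
# hypersphere, which immediately gives E(u^T G u) = tr(G)/p.
rng = np.random.default_rng(1)
p, n = 5, 200_000
x = rng.standard_normal((n, p))
u = x / np.linalg.norm(x, axis=1, keepdims=True)   # uniform on the sphere
second_moment = u.T @ u / n                        # estimates E(u u^T)
print(np.allclose(second_moment, np.eye(p) / p, atol=0.01))
```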
Conditional evolvability \(c(\varvec{\upbeta })\) and autonomy \(a(\varvec{\upbeta })\) can be nonzero only when \(\varvec{\upbeta } \in R \left( \textbf{G} \right) \) (see above). When \(\textbf{G}\) is singular, this happens with probability 0 when \(\varvec{\upbeta }\) is continuously distributed across the entire space. Hence, the average conditional evolvability \({\bar{c}}\) and the average autonomy \({\bar{a}}\) are 0 when \(\textbf{G}\) is singular. Although the following expressions can operationally be applied to singular \(\textbf{G}\) by using \(\textbf{G}^-\) in place of \(\textbf{G}^{-1}\), such a result is invalid because the ratio expressions do not hold when \(\varvec{\upbeta } \notin R \left( \textbf{G} \right) \).
When \(\textbf{G}\) is nonsingular, the average conditional evolvability \({\bar{c}}\) is:
where \(\alpha \) is any positive constant that satisfies \(0< \alpha < 2 / \lambda _{\max } \left( \textbf{G}^{-1}\right) \) (see Proposition 1), \( C_{\left[ {k} \right] } \! \left( \cdot \right) \) are the kth top-order zonal polynomials, and \({}_2F_1 \left( a, b; c; \cdot \right) \) is an alternative expression using the hypergeometric function of matrix argument with the three parameters a, b, and c (see Appendix A). When \(p = 2\), this expression simplifies to the geometric mean of the two eigenvalues of \(\textbf{G}\) as mentioned by Hansen and Houle (2008) (Appendix D).
The average autonomy \({\bar{a}}\) is, when \(\textbf{G}\) is nonsingular,
where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{G} \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{G}^{-1} \right) \). The average integration \({\bar{i}}\) is simply \(1 - {\bar{a}}\) (and equals 1 when \(\textbf{G}\) is singular).
The average respondability \({\bar{r}}\) is
where \(\alpha \) is to satisfy \(0< \alpha < 2 / \lambda _{\max } \! \left( \textbf{G}^2 \right) \) (note that \(\lambda _{\max } \! \left( \textbf{G}^2 \right) = \lambda _{\max } \! \left( \textbf{G} \right) ^2\) since \(\textbf{G}\) is nonnegative definite). Kirkpatrick (2009) discussed an equivalent quantity, the average (relative) selection response (\({\bar{R}}\) there), but did not provide a closed-form expression for arbitrary \(\textbf{G}\). It is possible to show that his expression for the special case with complete integration is entailed in the present result (Appendix D).
The average flexibility \({\bar{f}}\) is
where \(\alpha \) is to satisfy \(0< \alpha < 2 / \lambda _{\max } \! \left( \textbf{G}^2 \right) \). In general, \({\bar{f}}\) is well defined even if \(\textbf{G}\) is singular, unless \(\varvec{\upbeta } \in N \left( \textbf{G} \right) \) with nonzero probability (which of course does not happen here).
The average response difference \({\bar{d}}\) has a form essentially identical to \({\bar{r}}\) except for the argument matrix
where \(\alpha \) is to satisfy \(0< \alpha < 2 / \lambda _{\max } \! \left( \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \right) \).
The average response correlation \({\bar{\rho }}\) is
where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{G}_1^2 \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{G}_2^2 \right) \).
2.5 Truncation error
The successive terms in the above series decrease in magnitude, so a partial sum up to certain higher-order terms can in principle be used as an approximation of the exact value of the infinite series. Although each term can easily be evaluated with a recursive algorithm (Bao and Kan 2013; Hillier et al. 2014), accurate numerical evaluation involving higher-order terms can be computationally intensive in practice. The speed of numerical convergence and the computational time will be examined in the next section.
The choice of \(\alpha \) is arbitrary within the constraint \(0< \alpha < 2 / \lambda _{\max }\) with respect to the relevant matrix (see above), but influences the signs of the terms as well as the speed of convergence. When \(0 < \alpha \le 1 / \lambda _{\max }\), all the zonal and invariant polynomials in the above expressions are positive because the argument matrices have only nonnegative eigenvalues. Therefore, all terms in the summations in \({\bar{c}}\) (13), \({\bar{a}}\) (14), \({\bar{f}}\) (16), and \({\bar{\rho }}\) (18) are positive, so truncation results in (slight) underestimation. On the other hand, all terms in \({\bar{r}}\) (15) and \({\bar{d}}\) (17), except those for \(k = 0\), are negative in the same condition, so truncation results in overestimation. If \(1 / \lambda _{\max }< \alpha < 2 / \lambda _{\max }\), the signs of the polynomials possibly, though not always, fluctuate, rendering the behavior of the series less predictable.
It is possible to derive bounds for the truncation errors in \({\bar{c}}\), \({\bar{r}}\), \({\bar{f}}\), and \({\bar{d}}\) by applying the method of Hillier et al. (2009, theorem 6) with slight modifications, if \(\textbf{G}\) is nonsingular and the constants \(\alpha \) are taken within \(0 < \alpha \le 1 / \lambda _{\max }\). Let \({\bar{c}}_M\), \({\bar{r}}_M\), \({\bar{f}}_M\), and \({\bar{d}}_M\) be the relevant partial sums up to \(k = M\). The truncation error bounds are:
where the constants \(\alpha \) are as in the corresponding definitions (but must satisfy \(0 < \alpha \le 1 / \lambda _{\max }\)). Unfortunately, the same method is not applicable to \({\bar{a}}\) or \({\bar{\rho }}\). Also, if \(\textbf{G}\) (or \(\textbf{G}_1 - \textbf{G}_2\) in the case of \({\bar{d}}\)) is singular, these error bounds do not hold.
Empirically, a larger value of \(\alpha \) often improves the speed of convergence, as more eigenvalues of the argument matrix approach 0, but this should be used with caution since the above error bounds are not strictly applicable when \(\alpha > 1 / \lambda _{\max }\). To be precise, the error bounds should hold even under this condition provided that all the zonal/invariant polynomials above \(k = M\) are nonnegative, but it is not obvious when that condition can be shown to hold.
3 Numerical evaluation
3.1 Software implementation
There exist two Matlab programs distributed by Raymond M. Kan (https://www-2.rotman.utoronto.ca/~kan/) for evaluating the moments of simple ratios of quadratic forms in normal variables (Hillier et al. 2009, 2014; Bao and Kan 2013), which can be used to evaluate \({\bar{c}}\), \({\bar{r}}\), and \({\bar{d}}\) if a suitable environment is available. Alternatively, these average measures could be evaluated as hypergeometric functions of matrix argument, for which an efficient, general algorithm has been developed by Koev and Edelman (2006) and implemented in the R package HypergeoMat (Laurent 2022). However, this approach seems relatively inefficient for the present applications, where the terms corresponding to all non-top-order partitions can be omitted. None of these can be used to evaluate moments of multiple ratios like \({\bar{a}}\), \({\bar{f}}\), and \({\bar{\rho }}\).
In order to make all the above expressions available in the popular statistical software environment R, the present author has developed an independent implementation of the algorithms of Hillier et al. (2014) and Bao and Kan (2013) and published it as the package qfratio on the CRAN repository (Watanabe 2023). This package can be used to evaluate general moments of ratios of quadratic forms in normal variables with arbitrary mean and covariance (under certain constraints for singular cases, as discussed in Appendix C.1). It uses RcppEigen (Bates and Eddelbuettel 2013) to execute the computationally intensive recursive algorithms efficiently.
3.2 Numerical experiment: methods
Using the new R package, numerical experiments were conducted on a small set of artificial covariance matrices to evaluate the behavior and efficiency of the present series expressions. For each of \(p = 5, 20, 50\), and 200, three different eigenvalue conformations were used; two were set to have a single large eigenvalue with different magnitudes of integration, \(V_\textrm{rel} = 0.1, 0.5\)—corresponding to relatively weak and rather strong integration, respectively—and one was set to have a quadratically decreasing series of eigenvalues (for the algorithms, see Watanabe (2022) and the R package eigvaldisp (https://github.com/watanabe-j/eigvaldisp)). The matrices with these eigenvalue conformations are referred to as \(\textbf{G}_{0.1}\), \(\textbf{G}_{0.5}\), and \(\textbf{G}_\textrm{Q}\). For each of these, two eigenvector conformations were used, so that the resultant covariance matrices were either diagonal matrices of the eigenvalues or correlation matrices with the same eigenvalues constructed with Givens rotations (Davies and Higham 2000). For the one-matrix measures (\({\bar{c}}\), \({\bar{r}}\), \({\bar{a}}\), and \({\bar{f}}\)), the choice of eigenvectors is inconsequential. For the two-matrix measures (\({\bar{d}}\) and \({\bar{\rho }}\)), comparisons were made between pairs of matrices with identical and different eigenvectors. The results were also compared to those from the traditional Monte Carlo evaluation using the qfratio package and from the delta method approximation of Hansen and Houle (2008, 2009) using new code.
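For readers wanting to set up a broadly similar experiment, the Python/NumPy sketch below (hypothetical helper names) generates a covariance matrix with a single large eigenvalue tuned to a target relative eigenvalue variance, here taken as \(\textrm{Var}(\lambda ) / ({\bar{\lambda }}^2 (p - 1))\). Both the eigenvalue parameterization and the random-rotation step are simplifications assumed for illustration; the paper's experiments used eigvaldisp for the eigenvalue conformations and Givens rotations for the correlation matrices.

```python
import numpy as np

def vrel(G):
    """Relative eigenvalue variance: Var(lambda) / (mean(lambda)^2 (p-1))."""
    lam = np.linalg.eigvalsh(G)
    p = len(lam)
    return lam.var() / (lam.mean() ** 2 * (p - 1))

def single_large_eigenvalues(p, v):
    """Eigenvalues with one large value, tuned to V_rel = v (mean 1).

    A hypothetical conformation: lam_1 = 1 + (p-1)*sqrt(v) and the rest
    1 - sqrt(v); one simple way to hit a target V_rel, not necessarily
    the construction used in the paper's experiments.
    """
    d = np.sqrt(v)
    return np.concatenate([[1 + (p - 1) * d], np.full(p - 1, 1 - d)])

def random_rotation_cov(eigvals, seed=0):
    """Covariance matrix with given eigenvalues and random eigenvectors.

    Uses a random orthogonal matrix from QR (a simplification of the
    Givens-rotation construction of correlation matrices).
    """
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((len(eigvals),) * 2))
    return Q @ np.diag(eigvals) @ Q.T

lam = single_large_eigenvalues(20, 0.5)
G = random_rotation_cov(lam)
print(np.isclose(vrel(G), 0.5))
```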
Where possible, an average computational time from 10 runs was recorded for the present series evaluation and for the Monte Carlo evaluation with 10,000 iterations in each condition. The exception was \({\bar{\rho }}\) in certain conditions with \(p \ge 50\), where only a single run was timed for the series evaluation because it did not reach numerical convergence (below). Where a truncation error bound is available, the order of series evaluation M was chosen so that the error is below \(1.0 \times 10^{-8}\). Where an error bound is unavailable, the series was evaluated until the result, rounded to the \(10^{-9}\) digit, appeared stable. (These values were chosen arbitrarily for benchmarking purposes and are not intended as a guide for applications.) No formal timing was done for the delta method approximation, as it is not computer-intensive. All computational times reported are elapsed (wall-clock) times rather than CPU times. The calculations were executed on a regular desktop PC with an Intel® Core i7-8700 CPU and 16 GB RAM, running R version 4.2.3 (R Core Team 2023) with its built-in BLAS. For those problems that required larger computational memory for convergence (\({\bar{\rho }}\) with \(\textbf{G}_{0.5}\) and \(p \ge 20\); see below), another machine with 256 GB RAM was used to calculate accurate values. The code used in the numerical experiments is provided as Online Resource 1 and is also maintained in a GitHub repository (https://github.com/watanabe-j/avrevol).
3.3 Numerical experiment: results
The present series evaluations, as implemented in the R package qfratio, reached numerical convergence within reasonable computational times in all cases except \({\bar{\rho }}\) with large (\(p = 200\)) or highly integrated (\(\textbf{G}_{0.5}\)) matrices (detailed below). The order of evaluation M required to attain a given level of accuracy varied across evolvability measures and eigenvalue conformations, but in most cases increased with p (Figs. 2, 3, 4, 5, 6, 7; Tables 1, 2, 3, 4, 5, 6). The rate of scaling with respect to p tended to be more rapid for the series evaluation than for the Monte Carlo evaluation, when the required accuracy for the former and the number of iterations for the latter were pre-determined as described above (Fig. 8).
For \({\bar{c}}\), the computational time required for the error bound to numerically converge below \(1.0 \times 10^{-8}\) was at most on the order of \(\sim \)0.1 s and compared favorably with the Monte Carlo evaluation with 10,000 iterations across the entire range of p examined (Fig. 8; Table 1). For \({\bar{r}}\) and \({\bar{f}}\), the computational time varied from \(\sim \)1.5 ms at \(p = 5\) to several seconds at \(p = 200\), and in some eigenvalue conformations the series evaluation was outcompeted by the Monte Carlo evaluation with a fixed number of iterations in terms of plain computational time (Fig. 8; Tables 2, 3). However, it should be remembered that results from Monte Carlo evaluations always involve stochastic error, whose standard error (SE) scales only with the inverse square root of the number of iterations (i.e., 100 times as many calculations are required to reduce the SE by a factor of 10). It is also notable that the partial sums themselves typically appeared to reach asymptotes much earlier than the error bounds did for these measures (Figs. 2, 3, 4). It took substantially longer to evaluate \({\bar{a}}\) accurately than the other one-matrix measures, since it involves a double series, so that the computational burden scales roughly with \(M^2\). Consequently, the computational time required for \({\bar{a}}\) to numerically converge on the order of \(10^{-9}\) varied from \(\sim \)7 ms to \(\sim \)170 s (Fig. 8; Table 4). Technically, this large computational time was partly due to the necessity of scaling individual terms in the evaluation of \({\bar{a}}\) to avoid numerical overflow and underflow.
For two-matrix comparisons, \({\bar{d}}\) behaved similarly to \({\bar{r}}\), with which it shares the same functional form (Fig. 6; Table 5). In some comparisons between matrices with the same eigenvalues and different eigenvectors, the error bound of \({\bar{d}}\) could not be calculated because the argument matrix (\(\left( \textbf{G}_1 - \textbf{G}_2 \right) ^2\)) was singular, owing to the artificial conformations of eigenvalues in \(\textbf{G}_{0.1}\) and \(\textbf{G}_{0.5}\), whose trailing eigenvalues are identical in magnitude. Series evaluation of \({\bar{\rho }}\) took substantially more computational time and memory than the other measures. For \(p \ge 20\), the series could not be evaluated to numerical convergence in any comparison involving \(\textbf{G}_{0.5}\), because the memory requirement exceeded the 16 GB RAM of the machine used here for the main calculations, at least when accuracy on the order of \(10^{-9}\) is targeted (Fig. 7; Table 6 includes accurate values from a larger machine). This is apparently because \(\textbf{G}_{0.5}^2\) has rather small eigenvalues, which translate into extremely large magnitudes of the higher-order terms \(d_{i,j,1}\) in (18). For \(p = 200\), accurate series evaluation of \({\bar{\rho }}\) seemed infeasible in most combinations because of the excessive computational memory required. The only exception in the present test cases was the comparison between two \(\textbf{G}_\textrm{Q}\)’s with different eigenvectors, where the series converged fairly rapidly (Table 6).
As expected, the Monte Carlo evaluation always yielded unbiased estimates, with confidence intervals attaining the intended nominal coverage of the converged series values where the latter were available. As implemented in the qfratio package, iterations on the order of \(10^4\) can be executed within a second, so this approach might be favored over the new series expressions when the computational burden is a constraint and stochastic error is tolerable. Interestingly, the relative accuracies of the series and Monte Carlo evaluations showed a qualitatively similar pattern of variation across eigenvalue conformations for a given average measure and p; the Monte Carlo evaluation tended to yield a large SE when the series evaluation or its error bound was slow to converge (Figs. 2, 3, 4, 5, 6, 7). In other words, eigenvalue conformations seem to largely determine the accuracy at which average measures can be empirically evaluated.
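For concreteness, the Monte Carlo scheme can be sketched as follows for two of the average measures, using the standard definitions \(r(\textbf{u}) = \sqrt{\textbf{u}^T \textbf{G}^2 \textbf{u}}\) and \(f(\textbf{u}) = \textbf{u}^T \textbf{G} \textbf{u} / \sqrt{\textbf{u}^T \textbf{G}^2 \textbf{u}}\) for unit \(\textbf{u}\). This Python sketch is hypothetical illustration, not the qfratio implementation:

```python
import numpy as np

def mc_rbar_fbar(G, n_iter=100_000, seed=0):
    """Monte Carlo estimates (with SEs) of average respondability
    rbar = E[sqrt(u' G^2 u)] and average flexibility
    fbar = E[u' G u / sqrt(u' G^2 u)], with u uniform on the
    unit hypersphere. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    p = G.shape[0]
    u = rng.standard_normal((n_iter, p))
    u /= np.linalg.norm(u, axis=1, keepdims=True)  # uniform directions
    Gu = u @ G                                     # response vectors (G symmetric)
    r = np.linalg.norm(Gu, axis=1)                 # respondability per draw
    f = np.einsum("ij,ij->i", u, Gu) / r           # flexibility per draw
    est = {"rbar": r.mean(), "fbar": f.mean()}
    se = {"rbar": r.std(ddof=1) / np.sqrt(n_iter),
          "fbar": f.std(ddof=1) / np.sqrt(n_iter)}
    return est, se
```

Normalizing standard normal draws yields the uniform distribution on the unit hypersphere, so each iteration is one random skewer; the reported SE shrinks only as the inverse square root of the number of iterations, as noted above.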
The delta method approximations of Hansen and Houle (2008, 2009) can be evaluated almost instantaneously (from \(\sim \)0.1 to \(\sim \)7 ms for \(p = 5\) and 200, respectively), as they are straightforward arithmetic operations once the eigenvalues of the argument matrix are obtained. For \({\bar{c}}\) at large p, they showed notable empirical accuracy (Fig. 2; Table 1) (see also Hansen and Houle 2008). In many other cases, however, they yielded inaccurate values, often deviating from the converged series by >5%, and it is difficult even to discern a consistent trend in the directions of the errors (Figs. 3, 5, 6; Tables 2, 4, 5). Outside the former cases, it is rather questionable whether those approximations can be reliably used.
4 Discussion
The present paper derived new expressions for average evolvability measures using results on the moments of ratios of quadratic forms in normal variables. These series expressions are exact in theory, although for practical evaluation the summation must be truncated, yielding truncation error. A great advantage of the present approach is that a hard upper bound on the truncation error is known for some of the measures: namely, average conditional evolvability \({\bar{c}}\), average respondability \({\bar{r}}\), average flexibility \({\bar{f}}\), and average response difference \({\bar{d}}\) (above). This is unlike the confidence intervals from Monte Carlo evaluations, which always involve some uncertainty. In addition, empirically speaking, evaluation of most of the average measures, excepting average autonomy \({\bar{a}}\) and average response correlation \({\bar{\rho }}\), can be faster than Monte Carlo evaluation for modest-sized problems (e.g., \(p \le 50\)), depending on the accuracy aimed at. The accuracy and speed of the series evaluation will facilitate practical studies involving quantification and/or comparison of evolvability or integration using average measures, for which Monte Carlo evaluation or the delta method approximation has traditionally been used.
As a practical guide for calculating average measures, the series evaluation is recommended whenever numerical convergence can be reached within reasonable computational time and memory. Its feasibility largely depends on the eigenvalues of \(\textbf{G}\), but it is always easy to check whether convergence has been reached by inspecting the profile of the partial sums and by using error bounds where available. When computational time and/or memory is a constraint (for \({\bar{a}}\) and \({\bar{\rho }}\) at large p with many small eigenvalues), an alternative is Monte Carlo evaluation, which is adequate because the evolvability measures considered herein have finite second moments (Appendix B). The delta method approximations proposed by Hansen and Houle (2008, 2009) can yield rather inaccurate values with hardly predictable errors (above), and thus do not remain a viable option for estimating average measures, except perhaps when even the (typically sub-second) computational burden of the Monte Carlo evaluation is a bottleneck, e.g., when one has large Bayesian samples of \(\textbf{G}\). Even then, a primary problem with the delta method remains: although it empirically works well in certain situations (e.g., \({\bar{c}}\) at large p), there is no theoretical guarantee that it works well in general, so its accuracy always needs to be assessed by another method of evaluation.
It should be remembered that, in empirical analyses, \(\textbf{G}\) is almost always estimated with uncertainty, so that estimates of (average) evolvability measures involve error and potentially bias (Houle and Meyer 2015; Grabowski and Porto 2017). Although uncertainty may partially be incorporated via Bayesian sampling of \(\textbf{G}\) (Aguirre et al. 2014) or parametric-bootstrap-like sampling with a multivariate normal approximation (Houle and Meyer 2015), these methods will not by themselves deal with potential estimation bias. The present series expressions are not free from this problem, which needs to be addressed in future investigations. It might be possible to derive sampling moments of the (average) measures under simple sampling conditions (e.g., Wishartness), as was recently done for eigenvalue dispersion indices of covariance and correlation matrices (Watanabe 2022). A potentially promising fact is that the expectations of zonal and invariant polynomials of Wishart matrices are known (Constantine 1963; Davis 1980). It remains to derive equivalent results for polynomials in more complicated functions of random matrices, like those appearing in the present expressions. Apart from that, simulation-based approaches would also be useful for obtaining rough estimates of sampling error and bias (Haber 2011; Grabowski and Porto 2017). In any case, the present expressions will greatly facilitate future investigations into the sampling properties of average measures, as they provide quasi-exact values given (estimates of) \(\textbf{G}\), unlike the approximate methods used in previous studies.
Although the present paper dealt with six evolvability measures, this is not to endorse all of them. Some of these measures may be disfavored on biological or other theoretical grounds. For example, Hansen and Houle (2008) criticized the use of (average) response correlation \({\bar{\rho }}\) in the random skewers analysis for its insensitivity to potentially different magnitudes of response vectors, and recommended (average) response difference \({\bar{d}}\) as an alternative. (To be fair, Cheverud (1996) argued for the use of \({\bar{\rho }}\) by claiming that there tends to be a large sampling bias in the magnitude of response vectors from empirical covariance matrices, but that claim remains to be justified.) Rohlf (2017) also pointed out difficulties in interpreting \({\bar{\rho }}\) without knowledge of the eigenvalues and eigenvectors of the matrices compared, and recommended simply using those quantities for comparing matrices. The present author refrains from going into detail on the biological validity or utility of individual evolvability measures, because these will ultimately depend on the scope and context of individual studies as well as the nature of the data analyzed.
It seems worth repeating one of Rohlf’s (2017) insightful caveats here. Rohlf (2017) considered the angle between response vectors (the arc cosine of the response correlation \(\rho \)) from two \(\textbf{G}\) matrices for bivariate traits and empirically showed that the angle can have a bimodal distribution when \(\varvec{\upbeta }\) is distributed uniformly on the unit circle. The interpretation of the mean of such a multimodal distribution will not be straightforward, although, at least as empirically observed, the multimodality may not be a universal phenomenon (see Machado et al. 2018). At present, the possibility remains that other evolvability measures can have multimodal distributions as well. Generally speaking, it is known that the distribution of the ratio of quadratic forms \(\textbf{x}^T \textbf{A} \textbf{x} / \textbf{x}^T \textbf{B} \textbf{x}\), where \(\textbf{B}\) is positive definite, has different functional forms across intervals bordered by the eigenvalues of \(\textbf{B}^{-1} \textbf{A}\) (Hillier 2001; Forchini 2002, 2005), around which distinct density peaks can potentially be observed. Formal evaluation of the conditions for multimodality seems rather tedious (if tractable at all) and is not pursued herein. It should at least be borne in mind that the average measures are insensitive to this potential multimodality and that, generally speaking, a scalar summary index like these can mask such nuanced aspects of the underlying distribution and the multivariate covariance structure.
The present paper suggested slight modifications of some evolvability measures to accommodate potentially singular \(\textbf{G}\) matrices. As originally defined by Hansen and Houle (2008), conditional evolvability c and autonomy a (and integration i) could not be evaluated for singular \(\textbf{G}\), as they required matrix inversion. This restriction can be relaxed by using the generalized inverse, as done above. Conditional evolvability c and autonomy a are zero unless \(\varvec{\upbeta } \in R \left( \textbf{G} \right) \). The corresponding average measures (\({\bar{c}}\) and \({\bar{a}}\)) are also zero unless the same condition is satisfied with nonzero probability (i.e., when \(\varvec{\upbeta }\) is continuously distributed, its distribution is to be restricted to \(R \left( \textbf{G} \right) \)), as partially surmised by Hansen and Houle (2008). The new expressions enable calculation of c (4), a (6), i, and their averages (C30), (C31) even when \(\textbf{G}\) is singular, as long as \(\varvec{\upbeta } \in R \left( \textbf{G} \right) \). Biologically speaking, the singularity of \(\textbf{G}\) represents the presence of an absolute genetic constraint, where no genetic variance is available in certain directions. Although detection of absolute constraints in empirical systems requires different frameworks than evolvability measures (Mezey and Houle 2005; Hine and Blows 2006; Meyer and Kirkpatrick 2008; Pavlicev et al. 2009), the present development will facilitate investigations into systems with absolute constraints. Such systems also involve, e.g., shape variables that lack variation in certain directions of the full (nominal) trait space (see below). However, it is necessary to distinguish between the singularity of a true \(\textbf{G}\) matrix and that of its empirical estimates (\(\hat{\textbf{G}}\), say) due to small sample size.
Whether evolvability measures can be meaningfully applied to the latter should be viewed with much caution, because \(R \left( \hat{\textbf{G}} \right) \) fluctuates as a space under random sampling when the sample size is not large enough to span the entire \(R \left( \textbf{G} \right) \).
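The generalized-inverse convention above can be sketched as follows: with the Moore–Penrose inverse \(\textbf{G}^{+}\) and a unit \(\textbf{u}\), \(c = (\textbf{u}^T \textbf{G}^{+} \textbf{u})^{-1}\) when \(\textbf{u} \in R(\textbf{G})\), and \(c = 0\) otherwise. A minimal Python sketch (hypothetical illustration, not the paper's R code):

```python
import numpy as np

def conditional_evolvability(G, beta, tol=1e-10):
    """Conditional evolvability c = 1 / (u' G^+ u) for the unit
    vector u = beta/|beta|, with c = 0 when beta falls outside
    the range (column space) of G. Sketch of the generalized-
    inverse convention for possibly singular G."""
    u = beta / np.linalg.norm(beta)
    Gplus = np.linalg.pinv(G)
    # u lies in R(G) iff G G^+ u == u (G G^+ projects onto the range)
    if np.linalg.norm(G @ (Gplus @ u) - u) > tol:
        return 0.0
    return 1.0 / (u @ Gplus @ u)
```

For nonsingular \(\textbf{G}\) this reduces to the original \(c = (\textbf{u}^T \textbf{G}^{-1} \textbf{u})^{-1}\) of Hansen and Houle (2008).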
Previous treatments of evolvability measures (Hansen and Houle 2008; Kirkpatrick 2009; Marroig et al. 2009; Bolstad et al. 2014) and of random skewers (Revell 2007) almost exclusively considered the simple condition where selection gradients are spherically distributed, \(\varvec{\upbeta } \sim N_p (\textbf{0}_p, \sigma ^2 \textbf{I}_p)\), or uniformly distributed on a unit hypersphere. As detailed in Appendix C, the present results can be extended to the more general condition where the selection gradient has an arbitrary normal distribution, \(\varvec{\upbeta } \sim N_p (\varvec{\upeta }, \varvec{\Sigma })\). These extended results can potentially be useful when interest is in characterizing and comparing the evolvability of populations under a certain directional and/or correlated selection regime. Of course, the original expressions of evolvability measures for a fixed \(\varvec{\upbeta }\) can be used when the selection is considered deterministic, so the utility of the new extended expressions lies in cases where the selection is considered random but potentially directional and/or correlated.
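As a point of reference for this extended condition, the following Python sketch (illustrative only; the function name is hypothetical) evaluates an average measure by Monte Carlo under \(\varvec{\upbeta } \sim N_p (\varvec{\upeta }, \varvec{\Sigma })\), with the gradients projected onto the unit hypersphere before the measure is computed:

```python
import numpy as np

def mc_ebar_general_normal(G, eta, Sigma, n_iter=100_000, seed=0):
    """Average evolvability E[u' G u], u = beta/|beta|, under
    beta ~ N(eta, Sigma): a Monte Carlo sketch of the general
    (directional and/or correlated) normal selection regime."""
    rng = np.random.default_rng(seed)
    beta = rng.multivariate_normal(eta, Sigma, size=n_iter)
    u = beta / np.linalg.norm(beta, axis=1, keepdims=True)
    return np.einsum("ij,jk,ik->i", u, G, u).mean()
```

With \(\varvec{\upeta } = \textbf{0}_p\) and \(\varvec{\Sigma } = \sigma ^2 \textbf{I}_p\), this reduces to the spherical case; a strongly directional \(\varvec{\upeta }\) concentrates \(\textbf{u}\) around \(\varvec{\upeta } / \Vert \varvec{\upeta } \Vert \), pulling the average toward the evolvability in that direction.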
Chevin (2013) theoretically investigated the rate of adaptation (the change in log average fitness across generations) under the same condition, where \(\varvec{\upbeta }\) has a general normal distribution. He notably showed that, when stabilizing selection is assumed to be weak, the rate of adaptation is approximately equal to \(\varvec{\upbeta }^T \textbf{G} \varvec{\upbeta }\). He called this quantity “evolvability”, ascribing it to Hansen and Houle (2008), and discussed its interpretation at length. However, that quantity should be distinguished from the evolvability e as defined by Hansen et al. (2003b) and Hansen and Houle (2008), which is standardized by the norm of the selection gradient \(\varvec{\upbeta }\) (footnote 4). This standardization can also be seen as projection of the selection gradients onto a unit hypersphere (Mardia and Jupp 1999); directional and correlational selection can be expressed as concentration of density on the hypersphere. Accordingly, Chevin’s quantity and the present expressions for evolvability measures under general normal selection regimes have different interpretations and utility. The former would be more appropriate as a measure of the absolute rate of adaptation, whereas the latter seems more appropriate for characterizing \(\textbf{G}\) matrices independently of the magnitude of selection.
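The distinction can be stated explicitly in terms of the definitions already given. Writing \(\textbf{u} = \varvec{\upbeta } / \Vert \varvec{\upbeta } \Vert \) for the standardized selection gradient:

```latex
% Standardized (Hansen--Houle) evolvability vs. Chevin's quantity;
% u = beta / |beta| is the unit selection gradient.
e(\mathbf{u}) \;=\; \mathbf{u}^{T} \mathbf{G}\, \mathbf{u}
  \;=\; \frac{\boldsymbol{\beta}^{T} \mathbf{G}\, \boldsymbol{\beta}}
             {\boldsymbol{\beta}^{T} \boldsymbol{\beta}},
\qquad
\boldsymbol{\beta}^{T} \mathbf{G}\, \boldsymbol{\beta}
  \;=\; \Vert \boldsymbol{\beta} \Vert^{2}\, e(\mathbf{u}).
```

That is, Chevin's quantity differs from the Hansen–Houle evolvability exactly by the squared magnitude of selection, which is why the former tracks the absolute rate of adaptation while the latter characterizes \(\textbf{G}\) itself.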
The present expressions for the general normal case would be particularly useful when the selection gradients are assumed to be restricted to a certain subspace rather than distributed across the entire trait space, i.e., when \(\varvec{\Sigma }\) is singular. Such restrictions can potentially arise for biological reasons, but probably more often arise from structural constraints in certain types of traits, e.g., shape variables. When the support of the (nominal) traits is restricted to a certain subspace, it would be sensible to assume that selection is also restricted to the same subspace (e.g., when geometric shapes are concerned, it is senseless to consider the components of selection corresponding to geometric rotation or translation). Along with the accommodation of potentially singular \(\textbf{G}\) matrices, this extension allows for the application of average measures to various types of traits encountered in current evolutionary biology. Overall, the present development will facilitate theoretical and empirical investigations into the evolution of multivariate characters by providing accurate means to quantify and compare various aspects of evolvability, integration, and/or constraint.
Data availability
Not applicable.
Code availability
R code to reproduce the numerical experiments is provided as Online Resource 1.
Notes
The term evolvability bears various meanings in current evolutionary biology (e.g., Brown 2014; Crother and Murray 2019; Feiner et al. 2021; Jablonski 2022; Love et al. 2022; Riederer et al. 2022). In the present paper, the use of this term is restricted to certain mathematical aspects of the \(\textbf{G}\) matrix and response vectors—the Hansen–Houle evolvability—unless stated otherwise.
A potential alternative to the average measures in the above sense is the average of the evolvability measures calculated from the \(\varvec{\upbeta }\)’s aligned with all individual traits (Armbruster et al. 2009; Haber 2011; Pavlicev and Hansen 2011; Machado et al. 2018). Although that would be valid for descriptive purposes, its interpretability in the context of multivariate selection might be limited. In any case, the calculation of such averages is straightforward and is not considered herein.
Rohlf (2017, p. 544) asserted that \(\rho \) cannot be negative because he assumed, inappropriately in this context, that the response vectors have no polarity. Negative \(\rho \) can occur when, e.g., \(\varvec{\upbeta }\) bisects the wider of the two angles formed by the major axes of highly integrated \(\textbf{G}\) matrices. Except for this subtle fallacy, his criticisms of previous interpretations of \(\rho \) or \({\bar{\rho }}\) (e.g., Melo et al. 2016) seem largely pertinent (see also Discussion).
This confusion apparently arose from the custom of using the same symbol \(\varvec{\upbeta }\) for both standardized and unstandardized selection gradients (above). It seems advisable to use a different symbol for the former, e.g., \(\textbf{u}\) herein.
A few examples will clarify the indexing in (A14) (semicolons separate terms of the summation). For \((k_1, k_2) = (1, 1)\): \((i_1, i_2) = (1, 1), \nu _{1,1} = 1\); \((i_1, i_2) = (1, 0), (0, 1), \nu _{1,0} = 1, \nu _{0,1} = 1\). For \((k_1, k_2) = (2, 1)\): \((i_1, i_2) = (2, 1), \nu _{2,1} = 1\); \((i_1, i_2) = (2, 0), (0, 1), \nu _{2,0} = 1, \nu _{0,1} = 1\); \((i_1, i_2) = (1, 1), (1, 0), \nu _{1,1} = 1, \nu _{1,0} = 1\); \((i_1, i_2) = (1, 0), (0, 1), \nu _{1,0} = 2, \nu _{0,1} = 1\).
The recursive algorithm for evaluating \({\hat{h}}_{i; j}\) was said to be “very similar” to that for \(h_{i; j}\) but was not explicitly stated in Hillier et al. (2014). It is obtained by, in the definition and updating equation for \(g_{i,j}\) in Theorem 6 therein, replacing all subtractions of terms (but not of indices) with additions.
References
Agrawal AF, Stinchcombe JR (2009) How much do genetic covariances alter the rate of adaptation? Proc R Soc B 276:1183–1191. https://doi.org/10.1098/rspb.2008.1671
Aguirre JD, Hine E, McGuigan K, Blows MW (2014) Comparing G: multivariate analysis of genetic variation in multiple populations. Heredity 112:21–29. https://doi.org/10.1038/hdy.2013.12
Armbruster WS, Hansen TF, Pélabon C, Pérez-Barrales R, Maad J (2009) The adaptive accuracy of flowers: measurement and microevolutionary patterns. Ann Bot 103:1529–1545. https://doi.org/10.1093/aob/mcp095
Bao Y, Kan R (2013) On the moments of ratios of quadratic forms in normal random variables. J Multivar Anal 117:229–245. https://doi.org/10.1016/j.jmva.2013.03.002
Bates D, Eddelbuettel D (2013) Fast and elegant numerical linear algebra using the RcppEigen package. J Stat Softw 52(5):1–24. https://doi.org/10.18637/jss.v052.i05
Blows MW (2007) A tale of two matrices: multivariate approaches in evolutionary biology. J Evol Biol 20:1–8. https://doi.org/10.1111/j.1420-9101.2006.01164.x
Blows M, Walsh B (2009) Spherical cows grazing in flatland: constraints to selection and adaptation. In: van der Werf J, Graser HU, Frankham R, Gondro C (eds) Adaptation and fitness in animal populations: evolutionary and breeding perspectives on genetic resource management. Springer, Dordrecht, pp 83–101. https://doi.org/10.1007/978-1-4020-9005-9_6
Bolstad GH, Hansen TF, Pélabon C, Falahati-Anbaran M, Pérez-Barrales R, Armbruster WS (2014) Genetic constraints predict evolutionary divergence in Dalechampia blossoms. Philos Trans R Soc B 369:20130255. https://doi.org/10.1098/rstb.2013.0255
Brommer JE (2014) Using average autonomy to test whether behavioral syndromes constrain evolution. Behav Ecol Sociobiol 68:691–700. https://doi.org/10.1007/s00265-014-1699-6
Brown RL (2014) What evolvability really is. British J Philos Sci 65:549–572. https://doi.org/10.1093/bjps/axt014
Cheverud JM (1982) Phenotypic, genetic, and environmental morphological integration in the cranium. Evolution 36:499–516. https://doi.org/10.1111/j.1558-5646.1982.tb05070.x
Cheverud JM (1989) A comparative analysis of morphological variation patterns in the papionins. Evolution 43:1737–1747. https://doi.org/10.1111/j.1558-5646.1989.tb02623.x
Cheverud JM (1996) Quantitative genetic analysis of cranial morphology in the cotton-top (Saguinus oedipus) and saddle-back (S. fuscicollis) tamarins. J Evol Biol 9:5–42. https://doi.org/10.1046/j.1420-9101.1996.9010005.x
Cheverud JM, Marroig G (2007) Comparing covariance matrices: random skewers method compared to the common principal components model. Genet Mol Biol 30:461–469. https://doi.org/10.1590/S1415-47572007000300027
Chevin LM (2013) Genetic constraints on adaptation to a changing environment. Evolution 67:708–721. https://doi.org/10.1111/j.1558-5646.2012.01809.x
Chikuse Y (1980) Invariant polynomials with matrix arguments and their applications. In: Gupta RP (ed) Multivariate statistical analysis. North-Holland, Amsterdam, pp 53–68
Chikuse Y (1987) Methods for constructing top order invariant polynomials. Econom Theor 3:195–207. https://doi.org/10.1017/S026646660001029X
Chikuse Y (2003) Statistics on special manifolds. Springer, New York
Chikuse Y, Davis AW (1986) Some properties of invariant polynomials with matrix arguments and their applications in econometrics. Ann Inst Statist Math 38:109–122. https://doi.org/10.1007/BF02482504
Chikuse Y, Davis AW (1986) A survey on the invariant polynomials with matrix arguments in relation to econometric distribution theory. Econom Theor 2:232–248. https://doi.org/10.1017/S0266466600011518
Constantine AG (1963) Some non-central distribution problems in multivariate analysis. Ann Math Stat 34:1270–1285. https://doi.org/10.1214/aoms/1177703863
Costa e Silva J, Potts BM, Harrison PA (2020) Population divergence along a genetic line of least resistance in the tree species Eucalyptus globulus. Genes 11:1095. https://doi.org/10.3390/genes11091095
Cressie N, Davis AS, Folks JL, Policello GE II (1981) The moment-generating function and negative integer moments. Am Stat 35:148–150. https://doi.org/10.2307/2683982
Crother BI, Murray CM (2019) Early usage and meaning of evolvability. Ecol Evol 9:3784–3793. https://doi.org/10.1002/ece3.5002
Davies PI, Higham NJ (2000) Numerically stable generation of correlation matrices and their factors. BIT Numer Math 40:640–651. https://doi.org/10.1023/A:1022384216930
Davis AW (1979) Invariant polynomials with two matrix arguments extending the zonal polynomials: applications to multivariate distribution theory. Ann Inst Statist Math 31:465–485. https://doi.org/10.1007/BF02480302
Davis AW (1980) Invariant polynomials with two matrix arguments, extending the zonal polynomials. In: Krishnaiah PR (ed) Multivariate analysis–V. North-Holland, Amsterdam, pp 287–299
Delahaie B, Charmantier A, Chantepie S, Garant D, Porlier M, Teplitsky C (2017) Conserved G-matrices of morphological and life-history traits among continental and island blue tit populations. Heredity 119:76–87. https://doi.org/10.1038/hdy.2017.15
Feiner N, Jackson ISC, Van der Cruyssen E, Uller T (2021) A highly conserved ontogenetic limb allometry and its evolutionary significance in the adaptive radiation of Anolis lizards. Proc R Soc B 288:20210226. https://doi.org/10.1098/rspb.2021.0226
Forchini G (2002) The exact cumulative distribution function of a ratio of quadratic forms in normal variables, with application to the AR(1) model. Econom Theor 18:823–852. https://doi.org/10.1017/s0266466602184015
Forchini G (2005) The distribution of a ratio of quadratic forms in noncentral normal variables. Comm Stat Theory Methods 34:999–1008. https://doi.org/10.1081/STA-200056855
Grabowski M, Porto A (2017) How many more? Sample size determination in studies of morphological integration and evolvability. Methods Ecol Evol 8:592–603. https://doi.org/10.1111/2041-210X.12674
Gross KI, Richards DSP (1987) Special functions of matrix argument. I: algebraic induction, zonal polynomials, and hypergeometric functions. Trans Am Math Soc 301:781–811. https://doi.org/10.1090/S0002-9947-1987-0882715-2
Gupta AK, Kabe DG (1998) Moments of ratios of quadratic forms. Stat Probab Lett 38:69–71. https://doi.org/10.1016/S0167-7152(97)00155-7
Haber A (2011) A comparative analysis of integration indices. Evol Biol 38:476–488. https://doi.org/10.1007/s11692-011-9137-4
Haber A (2016) Phenotypic covariation and morphological diversification in the ruminant skull. Am Nat 187:576–591. https://doi.org/10.1086/685811
Hansen TF (2003) Is modularity necessary for evolvability? Remarks on the relationship between pleiotropy and evolvability. Biosystems 69:83–94. https://doi.org/10.1016/S0303-2647(02)00132-6
Hansen TF, Houle D (2008) Measuring and comparing evolvability and constraint in multivariate characters. J Evol Biol 21:1201–1219. https://doi.org/10.1111/j.1420-9101.2008.01573.x
Hansen TF, Houle D (2009) Corrigendum [of “Measuring and comparing evolvability and constraint in multivariate characters’’]. J Evol Biol 22:913–915. https://doi.org/10.1111/j.1420-9101.2009.01715.x
Hansen TF, Pélabon C (2021) Evolvability: a quantitative-genetics perspective. Annu Rev Ecol Evol Syst 52:153–175. https://doi.org/10.1146/annurev-ecolsys-011121-021241
Hansen TF, Voje KL (2011) Deviation from the line of least resistance does not exclude genetic constraints: a comment on Berner et al. (2010). Evolution 65:1821–1822. https://doi.org/10.1111/j.1558-5646.2011.01281.x
Hansen TF, Armbruster WS, Carlson ML, Pélabon C (2003) Evolvability and genetic constraint in Dalechampia blossoms: genetic correlations and conditional evolvability. J Exp Zool B Mol Dev Evol 296B:23–39. https://doi.org/10.1002/jez.b.14
Hansen TF, Pélabon C, Armbruster WS, Carlson ML (2003) Evolvability and genetic constraint in Dalechampia blossoms: components of variance and measures of evolvability. J Evol Biol 16:754–766. https://doi.org/10.1046/j.1420-9101.2003.00556.x
Hansen TF, Pélabon C, Houle D (2011) Heritability is not evolvability. Evol Biol 38:258–277. https://doi.org/10.1007/s11692-011-9127-6
Hansen TF, Solvin TM, Pavlicev M (2019) Predicting evolutionary potential: a numerical test of evolvability measures. Evolution 73:689–703. https://doi.org/10.1111/evo.13705
Hillier G (2001) The density of a quadratic form in a vector uniformly distributed on the \(n\)-sphere. Econom Theor 17:1–28. https://doi.org/10.1017/S026646660117101X
Hillier G, Kan R, Wang X (2009) Computationally efficient recursions for top-order invariant polynomials with applications. Econom Theor 25:211–242. https://doi.org/10.1017/S0266466608090075
Hillier G, Kan R, Wang X (2014) Generating functions and short recursions, with applications to the moments of quadratic forms in noncentral normal vectors. Econom Theor 30:436–473. https://doi.org/10.1017/S0266466613000364
Hine E, Blows MW (2006) Determining the effective dimensionality of the genetic variance-covariance matrix. Genetics 173:1135–1144. https://doi.org/10.1534/genetics.105.054627
Houle D (1992) Comparing evolvability and variability of quantitative traits. Genetics 130:195–204. https://doi.org/10.1093/genetics/130.1.195
Houle D, Meyer K (2015) Estimating sampling error of evolutionary statistics based on genetic covariance matrices using maximum likelihood. J Evol Biol 28:1542–1549. https://doi.org/10.1111/jeb.12674
Hubbe A, Machado FA, Melo D, Garcia G, Sebastião H, Porto A, Cheverud J, Marroig G (2023) Morphological integration during postnatal ontogeny: implications for evolutionary biology. Evolution 77:763–775. https://doi.org/10.1093/evolut/qpac052
Jablonski D (2022) Evolvability and macroevolution: overview and synthesis. Evol Biol 49:265–291. https://doi.org/10.1007/s11692-022-09570-4
James AT (1960) The distribution of the latent roots of the covariance matrix. Ann Math Stat 31:151–158. https://doi.org/10.1214/aoms/1177705994
James AT (1961) Zonal polynomials of the real positive definite symmetric matrices. Ann Math 74:456–469. https://doi.org/10.2307/1970291
James AT (1964) Distributions of matrix variates and latent roots derived from normal samples. Ann Math Stat 35:475–501. https://doi.org/10.1214/aoms/1177703550
Jones MC (1986) Expressions for inverse moments of positive quadratic forms in normal variables. Austral J Stat 28:242–250. https://doi.org/10.1111/j.1467-842X.1986.tb00604.x
Jones MC (1987) On moments of ratios of quadratic forms in normal variables. Stat Probab Lett 6:129–136. https://doi.org/10.1016/0167-7152(87)90086-1
Jones AG, Bürger R, Arnold SJ, Hohenlohe PA, Uyeda JC (2012) The effects of stochastic and episodic movement of the optimum on the evolution of the G-matrix and the response of the trait mean to selection. J Evol Biol 25:2210–2231. https://doi.org/10.1111/j.1420-9101.2012.02598.x
Kirkpatrick M (2009) Patterns of quantitative genetic variation in multiple dimensions. Genetica 136:271–284. https://doi.org/10.1007/s10709-008-9302-6
Koev P, Edelman A (2006) The efficient evaluation of the hypergeometric function of a matrix argument. Math Comp 75:833–846. https://doi.org/10.1090/S0025-5718-06-01824-2
Lande R (1979) Quantitative genetic analysis of multivariate evolution, applied to brain:body size allometry. Evolution 33:402–416. https://doi.org/10.1111/j.1558-5646.1979.tb04694.x
Lande R, Arnold SJ (1983) The measurement of selection on correlated characters. Evolution 37:1210–1226. https://doi.org/10.1111/j.1558-5646.1983.tb00236.x
Laurent S (2022) HypergeoMat: hypergeometric function of a matrix argument. https://CRAN.R-project.org/package=HypergeoMat
Love AC, Grabowski M, Houle D, Liow LH, Porto A, Tsuboi M, Voje KL, Hunt G (2022) Evolvability in the fossil record. Paleobiology 48:186–209. https://doi.org/10.1017/pab.2021.36
Machado FA (2020) Selection and constraints in the ecomorphological adaptive evolution of the skull of living Canidae (Carnivora, Mammalia). Am Nat 196:197–215. https://doi.org/10.1086/709610
Machado FA, Zahn TMG, Marroig G (2018) Evolution of morphological integration in the skull of Carnivora (Mammalia): changes in Canidae lead to increased evolutionary potential of facial traits. Evolution 72:1399–1419. https://doi.org/10.1111/evo.13495
Magnus JR (1986) The exact moments of a ratio of quadratic forms in normal variables. Ann Écon Stat 4:95–109. https://doi.org/10.2307/20075629
Mardia KV, Jupp PE (1999) Directional statistics. Wiley, Chichester
Marroig G, Shirai LT, Porto A, de Oliveira FB, De Conto V (2009) The evolution of modularity in the mammalian skull II: evolutionary consequences. Evol Biol 36:136–148. https://doi.org/10.1007/s11692-009-9051-1
Mathai AM, Provost SB (1992) Quadratic forms in random variables: theory and applications. Marcel Dekker, New York
Mathai AM, Provost SB, Hayakawa T (1995) Bilinear forms and zonal polynomials. Springer, New York
McGlothlin JW, Kobiela ME, Wright HV, Kolbe JJ, Losos JB, Brodie EDIII (2022) Conservation and convergence of genetic architecture in the adaptive radiation of Anolis lizards. Am Nat 200:E207–E220. https://doi.org/10.1086/721091
Melo D, Marroig G (2015) Directional selection can drive the evolution of modularity in complex traits. Proc Natl Acad Sci U S A 112:470–475. https://doi.org/10.1073/pnas.1322632112
Melo D, Garcia G, Hubbe A, Assis AP, Marroig G (2016) EvolQG—an R package for evolutionary quantitative genetics [version 3; referees: 2 approved, 1 approved with reservations]. F1000Research 4:925. https://doi.org/10.12688/f1000research.7082.3
Meng XL (2005) From unit root to Stein’s estimator to Fisher’s \(k\) statistics: if you have a moment, I can tell you more. Stat Sci 20:162–414. https://doi.org/10.1214/088342304000000279
Meyer K, Kirkpatrick M (2008) Perils of parsimony: properties of reduced-rank estimates of genetic covariance matrices. Genetics 180:1153–1166. https://doi.org/10.1534/genetics.108.090159
Mezey JG, Houle D (2005) The dimensionality of genetic variation for wing shape in Drosophila melanogaster. Evolution 59:1027–1038. https://doi.org/10.1111/j.0014-3820.2005.tb01041.x
Muirhead RJ (1982) Aspects of multivariate statistical theory. Wiley, Hoboken. https://doi.org/10.1002/9780470316559
Muirhead RJ (2006) Zonal polynomials. In: Kotz S, Balakrishnan N, Read C, Vidakovic B, Johnson NL (eds) Encyclopedia of statistical sciences, vol 15, 2nd edn. Wiley, Hoboken, pp 9225–9231
Opedal ØH, Hildesheim LS, Armbruster WS (2022) Evolvability and constraint in the evolution of three-dimensional flower morphology. Am J Bot 109:1906–1917. https://doi.org/10.1002/ajb2.16092
Opedal ØH, Armbruster WS, Hansen TF, Holstad A, Pélabon C, Andersson S, Campbell DR, Caruso CM, Delph LF, Eckert CG et al (2023) Evolvability and trait function predict phenotypic divergence of plant populations. Proc Natl Acad Sci U S A 120:e2203228120. https://doi.org/10.1073/pnas.2203228120
Pavlicev M, Hansen TF (2011) Genotype-phenotype maps maximizing evolvability: modularity revisited. Evol Biol 38:371–389. https://doi.org/10.1007/s11692-011-9136-5
Pavlicev M, Wagner GP, Cheverud JM (2009) Measuring evolutionary constraints through the dimensionality of the phenotype: adjusted bootstrap method to estimate rank of phenotypic covariance matrix. Evol Biol 36:339–353. https://doi.org/10.1007/s11692-009-9066-7
Pavlicev M, Cheverud JM, Wagner GP (2011) Evolution of adaptive phenotypic variation patterns by direct selection for evolvability. Proc R Soc B 278:1903–1912. https://doi.org/10.1098/rspb.2010.2113
Porto A, de Oliveira FB, Shirai LT, De Conto V, Marroig G (2009) The evolution of modularity in the mammalian skull I: morphological integration patterns and magnitudes. Evol Biol 36:118–135. https://doi.org/10.1007/s11692-008-9038-3
Puentes A, Granath G, Ågren J (2016) Similarity in G matrix structure among natural populations of Arabidopsis lyrata. Evolution 70:2370–2386. https://doi.org/10.1111/evo.13034
R Core Team (2023) R: a language and environment for statistical computing, version 4.2.3. https://www.R-project.org/
Revell LJ (2007) The G matrix under fluctuating correlational mutation and selection. Evolution 61:1857–1872. https://doi.org/10.1111/j.1558-5646.2007.00161.x
Riederer JM, Tiso S, van Eldijk TJB, Weissing FJ (2022) Capturing the facets of evolvability in a mechanistic framework. Trends Ecol Evol 37:430–439. https://doi.org/10.1016/j.tree.2022.01.004
Roberts LA (1995) On the existence of moments of ratios of quadratic forms. Econom Theor 11:750–774. https://doi.org/10.1017/S0266466600009725
Rohlf FJ (2017) The method of random skewers. Evol Biol 44:542–550. https://doi.org/10.1007/s11692-017-9425-8
Rolian C (2009) Integration and evolvability in primate hands and feet. Evol Biol 36:100–117. https://doi.org/10.1007/s11692-009-9049-8
Saltzberg CJ, Walker LI, Chipps-Walton LE, Costa BMA, Spotorno ÁE, Steppan SJ (2022) Comparative quantitative genetics of the pelvis in four species of rodents and the conservation of genetic covariance and correlation structure. Evol Biol 49:71–83. https://doi.org/10.1007/s11692-022-09559-z
Schluter D (1996) Adaptive radiation along genetic lines of least resistance. Evolution 50:1766–1774. https://doi.org/10.1111/j.1558-5646.1996.tb03563.x
Schott JR (2016) Matrix analysis for statistics, 3rd edn. Wiley, Hoboken
Smith MD (1989) On the expectation of a ratio of quadratic forms in normal variables. J Multivar Anal 31:244–257. https://doi.org/10.1016/0047-259X(89)90065-1
Smith MD (1993) Expectations of ratios of quadratic forms in normal variables: evaluating some top-order invariant polynomials. Austral J Stat 35:271–282. https://doi.org/10.1111/j.1467-842X.1993.tb01335.x
Steppan SJ, Phillips PC, Houle D (2002) Comparative quantitative genetics: evolution of the G matrix. Trends Ecol Evol 17:320–327. https://doi.org/10.1016/S0169-5347(02)02505-3
Takemura A (1984) Zonal polynomials, Lecture notes-monograph series, vol 4. Institute of Mathematical Statistics, Hayward
Teplitsky C, Robinson MR, Merilä J (2014) Evolutionary potential and constraints in wild populations. In: Charmantier A, Garant D, Kruuk LEB (eds) Quantitative genetics in the wild. Oxford University Press, Oxford, pp 190–208. https://doi.org/10.1093/acprof:oso/9780199674237.003.0012
Walsh B, Blows MW (2009) Abundant genetic variation + strong selection = multivariate genetic constraints: a geometric view of adaptation. Annu Rev Ecol Evol Syst 40:41–59. https://doi.org/10.1146/annurev.ecolsys.110308.120232
Watanabe J (2022) Statistics of eigenvalue dispersion indices: quantifying the magnitude of phenotypic integration. Evolution 76:4–28. https://doi.org/10.1111/evo.14382
Watanabe J (2023) qfratio: R package for moments of ratios of quadratic forms using recursion. https://cran.r-project.org/package=qfratio
Acknowledgements
The author would like to thank Thomas F. Hansen, Diogo Melo, the Associate Editor Reinhard Bürger, and an anonymous reviewer for constructive comments. This work was supported by the Newton International Fellowships by the Royal Society (NIF\R1\180520) and the Overseas Research Fellowships by the Japan Society for the Promotion of Science (202160529).
Ethics declarations
Conflict of interest
The author has no competing interests to declare that are relevant to the content of this article.
Supplementary Information
The online version of this article contains supplementary material.
Appendices
Appendix A Zonal and invariant polynomials
This paper evaluates the expectations of ratios of quadratic forms using the zonal and invariant polynomials, mainly following Smith (1989) and Hillier et al. (2009, 2014). These polynomials are widely used in the literature of multivariate distribution theory, primarily to integrate out components affected by orthogonal rotations from functions of random matrices. To be precise, they are not strictly required to derive results relevant to the present applications (see Bao and Kan 2013), but they allow for a conceptually elegant derivation and provide alternative interpretations of some of the results as hypergeometric functions of matrix argument. A brief introduction to these mathematical toolkits is provided in this section, as they have seen little use in the biological literature. For more details, readers are directed to, e.g., James (1964), Muirhead (1982, 2006), Takemura (1984), Gross and Richards (1987), Mathai et al. (1995), and Chikuse (2003, appendix A).
1.1 A.1 Notations
In this section, \(\textbf{Y}\) denotes a \(p \times p\) symmetric matrix with eigenvalues \(y_1, y_2, \dots , y_p\). \(\textbf{H}\) denotes a \(p \times p\) orthogonal matrix, and the space of \(p \times p\) orthogonal matrices is denoted by O(p).
For indexing polynomials, (ordered) partitions \(\kappa \) of positive integers k are used; \(\kappa {=} (k_1, k_2, \dots )\), where \(k_i\) are nonnegative integers that satisfy \(k_1 {\ge } k_2 \ge \dots \) and \(\sum _i k_i = k\). For instance, the possible partitions of \(k = 4\) are \((4), (3, 1), (2, 2), (2, 1, 1), \text {and } (1, 1, 1, 1)\). The partition of k with a single nonzero integer part is called the top-order partition of k, and is denoted by \(\left[ {k} \right] \).
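For illustration, the partitions of a small integer can be enumerated with a short recursive routine (a hypothetical helper written for this example, not part of any package cited here):

```python
def partitions(k, max_part=None):
    """Enumerate the ordered partitions of k, largest-first within each
    partition, in decreasing lexicographic order overall."""
    if max_part is None:
        max_part = k
    if k == 0:
        return [()]
    result = []
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            result.append((first,) + rest)
    return result

print(partitions(4))
# [(4,), (3, 1), (2, 2), (2, 1, 1), (1, 1, 1, 1)]
```

The top-order partition \(\left[ {k} \right] \) is simply the first element of this list.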
The Pochhammer symbol is
$$\left( a \right) _{k} = \frac{\Gamma (a + k)}{\Gamma (a)} = a (a + 1) \cdots (a + k - 1), \tag{A1}$$
where \({\Gamma (\cdot )}\) is the gamma function, with the understanding \(\left( a \right) _{0} = 1\). A multivariate extension \(\left( a \right) _\kappa \) (indexed by a partition) is defined as
$$\left( a \right) _{\kappa } = \prod _{i \ge 1} \left( a - \frac{i - 1}{2} \right) _{k_i}. \tag{A2}$$
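Both definitions translate directly into code; a minimal sketch, assuming the standard definitions \(\left( a \right) _{k} = a (a+1) \cdots (a+k-1)\) and \(\left( a \right) _{\kappa } = \prod _i \left( a - (i-1)/2 \right) _{k_i}\) (function names are mine):

```python
def pochhammer(a, k):
    """Rising factorial (a)_k = a (a + 1) ... (a + k - 1), with (a)_0 = 1."""
    result = 1.0
    for i in range(k):
        result *= a + i
    return result

def pochhammer_multi(a, kappa):
    """Multivariate Pochhammer symbol (a)_kappa = prod_i (a - (i - 1)/2)_{k_i},
    indexed by a partition kappa = (k_1, k_2, ...)."""
    result = 1.0
    for i, k_i in enumerate(kappa, start=1):
        result *= pochhammer(a - (i - 1) / 2.0, k_i)
    return result

print(pochhammer(3.0, 4))             # 360.0
print(pochhammer_multi(0.5, (1, 1)))  # 0.0: (1/2)_kappa vanishes off top order
```

The second printout illustrates the fact, used repeatedly below, that \(\left( 1/2 \right) _{\kappa } = 0\) whenever \(\kappa \) has more than one part.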
1.2 A.2 Definitions of zonal polynomials
The zonal polynomials are a certain class of polynomials in matrix entries with several useful properties. The primary definition of the zonal polynomials comes from group representation theory (James 1960, 1961, 1964), which tends to be too abstract for anyone new to the concept. There are alternative, more operational definitions of the zonal polynomials (Muirhead 1982; Takemura 1984), which would be more approachable to most biologists. However, those alternative definitions do not generalize into the invariant polynomials of multiple matrix arguments, which are also relevant to the present application. For this reason, the representation-theoretic definition is briefly outlined herein, omitting technical details. See Gross and Richards (1987) for a readable and sufficiently comprehensive introduction to the topic.
Consider the vector space \(V_k\) of homogeneous polynomials \(\varphi (\textbf{Y})\) of degree k in the elements of the symmetric matrix \(\textbf{Y}\). The transformation with a \(p \times p\) invertible matrix \(\textbf{L}\),
transforms the vector space by
(Technically, \(T(\textbf{L})\) here is a representation, which abstracts the transformation of the right-hand side by a linear transformation.) It is known that the vector space \(V_k\) decomposes into a direct sum of irreducible invariant subspaces \(V_\kappa \) as
where the direct summation runs over all partitions \(\kappa \) of k into not more than p parts. By \(V_\kappa \) being invariant, it is meant that \(T(\textbf{L}) V_\kappa \subset V_\kappa \) for any invertible \(\textbf{L}\), and by them being irreducible invariant, it is meant that they do not contain any proper invariant subspace by themselves. It is further known that each \(V_\kappa \) contains a unique one-dimensional subspace that is invariant under \(\textbf{Y} \rightarrow \textbf{H} \textbf{Y} \textbf{H}^T\) (restricting the above transformation to \(\textbf{L} \in O(p)\)). Then, \(\left[ {{\,\textrm{tr}\,}}\left( \textbf{Y} \right) \right] ^k \in V_k\) has a unique decomposition
into some polynomials \( C_{\kappa } \! \left( \textbf{Y} \right) \in V_\kappa \) which satisfy \( C_{\kappa } \! \left( \textbf{Y} \right) = C_{\kappa } \! \left( \textbf{H} \textbf{Y} \textbf{H}^T \right) \). These polynomials \( C_{\kappa } \! \left( \textbf{Y} \right) \) are called the zonal polynomials corresponding to the partitions \(\kappa \). In short, the zonal polynomials of degree k constitute a basis of the subspace of \(V_k\) that is invariant under the transformation \(\textbf{Y} \rightarrow \textbf{H} \textbf{Y} \textbf{H}^T\), each corresponding to the irreducible subspace represented by a partition \(\kappa \) of k. In terms of (A3), they can be regarded as a generalization of powers of scalars into matrices.
These polynomials can be scaled by arbitrary constants, and there exist several different normalizations in the literature. The normalization used in (A3), with all coefficients for \( C_{\kappa } \! \left( \textbf{Y} \right) \) being unity, is primarily for convenience of notation.
1.3 A.3 Basic properties of zonal polynomials
A consequence of the above definition with (A3) is that \( C_{\kappa } \! \left( \textbf{Y} \right) \) are polynomials in the eigenvalues \(y_1, \dots , y_p\) of \(\textbf{Y}\) (note that \({{\,\textrm{tr}\,}}\left( \textbf{Y} \right) = {{\,\textrm{tr}\,}}\left( \textbf{H} \textbf{Y} \textbf{H}^T \right) = \sum _i y_i\)), and that they are symmetric under permutation of the eigenvalues. Equivalently, they can be expressed as polynomials in the traces of matrix powers \({{\,\textrm{tr}\,}}\left( \textbf{Y}^j \right) \) (\(= \sum _i y_i^j\)). Unfortunately, no closed-form expressions are known for the zonal polynomials except for a few special cases, so the coefficients of these polynomials in general need to be determined recursively (see Muirhead 1982, 2006).
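For \(k = 2\), the zonal polynomials are among the few with known closed forms: in the normalization of (A3), \( C_{(2)} \! \left( \textbf{Y} \right) = (s_1^2 + 2 s_2)/3\) and \( C_{(1,1)} \! \left( \textbf{Y} \right) = 2 (s_1^2 - s_2)/3\), with \(s_j = {{\,\textrm{tr}\,}}\left( \textbf{Y}^j \right) \). A quick numerical sketch of the decomposition and the orthogonal invariance:

```python
import numpy as np

def zonal_k2(Y):
    """Zonal polynomials C_(2) and C_(1,1) in the normalization with
    C_(2) + C_(1,1) = [tr(Y)]^2."""
    s1 = np.trace(Y)
    s2 = np.trace(Y @ Y)
    return (s1**2 + 2.0 * s2) / 3.0, 2.0 * (s1**2 - s2) / 3.0

Y = np.array([[2.0, 1.0], [1.0, 3.0]])
C_2, C_11 = zonal_k2(Y)
assert np.isclose(C_2 + C_11, np.trace(Y) ** 2)   # decomposition into [tr(Y)]^2

# invariance under Y -> H Y H^T for an orthogonal H (here a plane rotation)
t = 0.7
H = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
assert np.allclose(zonal_k2(H @ Y @ H.T), (C_2, C_11))
```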
The zonal polynomials are originally defined for symmetric matrices, but the definition can be extended. If \(\textbf{X}\) is positive definite and \(\textbf{Y}\) is symmetric, the eigenvalues of \(\textbf{X} \textbf{Y}\) and \(\textbf{X}^{1/2} \textbf{Y} \textbf{X}^{1/2}\) are the same. Therefore, a zonal polynomial of \(\textbf{X} \textbf{Y}\) is defined as \( C_{\kappa } \! \left( \textbf{X} \textbf{Y} \right) = C_{\kappa } \! \left( \textbf{X}^{1/2} \textbf{Y} \textbf{X}^{1/2} \right) \).
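The equality of the eigenvalues of \(\textbf{X} \textbf{Y}\) and \(\textbf{X}^{1/2} \textbf{Y} \textbf{X}^{1/2}\) underlying this extension is easy to confirm numerically; a sketch using a symmetric square root obtained by eigendecomposition:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
X = A @ A.T + 4.0 * np.eye(4)          # positive definite
B = rng.standard_normal((4, 4))
Y = (B + B.T) / 2.0                    # symmetric

# symmetric square root of X via its eigendecomposition
w, V = np.linalg.eigh(X)
X_half = V @ np.diag(np.sqrt(w)) @ V.T

ev_XY = np.sort(np.linalg.eigvals(X @ Y).real)
ev_sym = np.sort(np.linalg.eigvalsh(X_half @ Y @ X_half))
assert np.allclose(ev_XY, ev_sym)
```

Note that \(\textbf{X} \textbf{Y}\) itself is generally asymmetric, which is why its zonal polynomial needs this definition.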
Properties of the zonal polynomials have been extensively investigated (Constantine 1963; James 1964; Muirhead 1982). Elementary properties include the following:
-
\( C_{\kappa } \! \left( a \textbf{Y} \right) = a^k C_{\kappa } \! \left( \textbf{Y} \right) \) holds for any constant a.
-
If the number of parts in \(\kappa \) exceeds the rank of \(\textbf{Y}\), \( C_{\kappa } \! \left( \textbf{Y} \right) = 0\) (Muirhead 1982, corollary 7.2.4(i)).
-
If \(\textbf{Y}\) is positive definite, \( C_{\kappa } \! \left( \textbf{Y} \right) > 0\) (Muirhead 1982, corollary 7.2.4(ii)).
-
\( C_{\kappa } \! \left( \textbf{Y} \right) = C_{\kappa } \! \left( {{\,\textrm{diag}\,}}\left( \textbf{Y}, \textbf{0} \right) \right) \), where the right-hand side is with respect to a block diagonal matrix with a square matrix of 0’s of arbitrary dimension.
One of the most important properties of the zonal polynomials is the following integral identity.
Proposition 2
(James (1960)) Let \(\textbf{A}\) and \(\textbf{Y}\) be \(p \times p\) symmetric matrices. Then,
$$\int _{O(p)} C_{\kappa } \! \left( \textbf{A} \textbf{H} \textbf{Y} \textbf{H}^T \right) (\textrm{d}\textbf{H}) = \frac{ C_{\kappa } \! \left( \textbf{A} \right) C_{\kappa } \! \left( \textbf{Y} \right) }{ C_{\kappa } \! \left( \textbf{I}_p \right) }, \tag{A4}$$
where \((\textrm{d}\textbf{H})\) is the normalized invariant measure on O(p) that satisfies \(\int _{O(p)} (\textrm{d}\textbf{H}) = 1\).
Accessible proofs are given in Muirhead (1982, theorem 7.2.5) and Gross and Richards (1987, proposition 5.5).
1.4 A.4 Top-order zonal polynomials
The zonal polynomials corresponding to the top-order partition, \([k] = (k)\), are called the top-order zonal polynomials and appear frequently in multivariate distribution theory. An explicit expression of the top-order zonal polynomials is (James 1964, (123)):
$$C_{\left[ {k} \right] } \! \left( \textbf{Y} \right) = \frac{k!}{\left( 1/2 \right) _{k}} \sum \prod _{j = 1}^{k} \frac{1}{i_j !} \left( \frac{s_j}{2 j} \right) ^{i_j}, \tag{A5}$$
where \(s_j = \sum _{i=1}^p y_i^j = {{\,\textrm{tr}\,}}\left( \textbf{Y}^j \right) \) and the summation is over all combinations of nonnegative integers \(i_1, \dots , i_k\) that satisfy \(\sum _{j = 1}^{k} j \, i_j = i_1 + 2 i_2 + 3 i_3 + \dots = k\). An alternative expression is given in Gross and Richards (1987, (6.8.1)).
For the identity matrix \(\textbf{I}_p\), the expression simplifies as:
$$C_{\left[ {k} \right] } \! \left( \textbf{I}_p \right) = \frac{\left( p/2 \right) _{k}}{\left( 1/2 \right) _{k}}. \tag{A6}$$
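In practice, the top-order zonal polynomials are evaluated recursively rather than from the combinatorial sum. A sketch using the short Newton-type recursion for the coefficients \(d_k\) of \(|\textbf{I}_p - t \textbf{Y} |^{-1/2}\) (cf. Sect. A.8) together with the relation \( C_{\left[ {k} \right] } \! \left( \textbf{Y} \right) = k! \, d_k / \left( 1/2 \right) _{k}\), which I take as given here:

```python
import math
import numpy as np

def pochhammer(a, k):
    out = 1.0
    for i in range(k):
        out *= a + i
    return out

def top_order_zonal(Y, k):
    """C_[k](Y) from the power sums s_j = tr(Y^j), via the recursion
    d_m = (1/(2m)) * sum_{j=1}^{m} s_j d_{m-j},  d_0 = 1,
    and the relation C_[k](Y) = k! d_k / (1/2)_k."""
    s = [0.0] + [np.trace(np.linalg.matrix_power(Y, j)) for j in range(1, k + 1)]
    d = [1.0]
    for m in range(1, k + 1):
        d.append(sum(s[j] * d[m - j] for j in range(1, m + 1)) / (2.0 * m))
    return math.factorial(k) * d[k] / pochhammer(0.5, k)

# check against the identity-matrix value (p/2)_k / (1/2)_k, which is 2k + 1
# when p = 3
print(top_order_zonal(np.eye(3), 4))
```

For \(p = 3\) and \(k = 4\), both the recursion and the identity-matrix formula give 9 (up to floating-point error).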
1.5 A.5 Hypergeometric functions of matrix argument
With the zonal polynomials as a generalization of powers into matrices, the hypergeometric functions of matrix argument are defined as
$${}_m F_n \left( a_1, \dots , a_m; b_1, \dots , b_n; \textbf{Y} \right) = \sum _{k=0}^{\infty } \sum _{\kappa } \frac{\left( a_1 \right) _{\kappa } \cdots \left( a_m \right) _{\kappa }}{\left( b_1 \right) _{\kappa } \cdots \left( b_n \right) _{\kappa }} \frac{ C_{\kappa } \! \left( \textbf{Y} \right) }{k!}, \tag{A7}$$
where the summation \(\sum _{\kappa }\) is across all ordered partitions \(\kappa \) of k, and \(\left( x \right) _{\kappa }\) is defined as in (A2). Conditions for convergence are as follows: (i) when \(m \le n\), the series converges for all \(\textbf{Y}\); (ii) when \(m = n + 1\), the series converges for all \(\Vert \textbf{Y} \Vert < 1\), where \(\Vert \textbf{Y} \Vert \) is the maximum of the absolute values of the eigenvalues; (iii) when \(m > n + 1\), the series diverges for all \(\textbf{Y} \ne 0\), unless the series terminates (when any \(a_i\) is a negative integer). The following paragraphs list a few results relevant to the present applications. More extensive treatments on this topic can be found in, e.g., Muirhead (1982), Gross and Richards (1987), and Chikuse (2003).
If any of \(a_i\) is 1/2, the series can be expressed as a weighted sum of top-order zonal polynomials, as \(\left( 1/2 \right) _{\kappa } = 0\) unless \(\kappa = [k]\) (see (A2)):
Analogous to the scalar equivalent, \({}_0F_0\) is an exponential series but about the trace:
$${}_0 F_0 \left( \textbf{Y} \right) = \sum _{k=0}^{\infty } \frac{\left[ {{\,\textrm{tr}\,}}\left( \textbf{Y} \right) \right] ^{k}}{k!} = \exp \left[ {{\,\textrm{tr}\,}}\left( \textbf{Y} \right) \right] . \tag{A9}$$
Provided that all eigenvalues of \(\textbf{Y}\) are less than 1 in absolute value, \({}_1F_0\) is a generalization of the binomial series (Muirhead 1982, corollary 7.3.5):
$${}_1 F_0 \left( a; \textbf{Y} \right) = \left| \textbf{I}_p - \textbf{Y} \right| ^{-a}. \tag{A10}$$
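Because \(\left( 1/2 \right) _{\kappa }\) vanishes off the top order, \({}_1F_0(1/2; \textbf{Y})\) reduces to a single series whose coefficients obey a short Newton-type recursion, and its partial sums can be checked against the closed form \(\left| \textbf{I}_p - \textbf{Y} \right| ^{-1/2}\); a sketch, assuming \(\Vert \textbf{Y} \Vert < 1\):

```python
import numpy as np

def onef0_half(Y, n_terms=200):
    """Partial sum of 1F0(1/2; Y) = sum_k d_k(Y), where
    |I - tY|^(-1/2) = sum_k d_k t^k and
    d_k = (1/(2k)) * sum_{j=1}^{k} tr(Y^j) d_{k-j},  d_0 = 1."""
    s = [0.0] + [np.trace(np.linalg.matrix_power(Y, j))
                 for j in range(1, n_terms + 1)]
    d = [1.0]
    for k in range(1, n_terms + 1):
        d.append(sum(s[j] * d[k - j] for j in range(1, k + 1)) / (2.0 * k))
    return sum(d)

Y = np.diag([0.3, -0.2, 0.1])                      # all |eigenvalues| < 1
closed = np.linalg.det(np.eye(3) - Y) ** (-0.5)
assert np.isclose(onef0_half(Y), closed)
```

The partial sums converge geometrically here; as \(\Vert \textbf{Y} \Vert \) approaches 1, many more terms are needed, mirroring the convergence issues discussed in the main text.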
1.6 A.6 Invariant polynomials of multiple matrix arguments
The invariant polynomials of multiple matrix arguments provide some extension of the zonal polynomials into multiple matrices (Davis 1979, 1980; Chikuse 1980; Chikuse and Davis 1986a). Accessible introductions to the topic are found in, e.g., Chikuse and Davis (1986b) and Mathai et al. (1995, appendix).
Let \(V_{k_1, \dots , k_r} = V_{k_1} \otimes \dots \otimes V_{k_r}\) be the vector space of polynomials of degree \(k_1, \dots , k_r\) in the elements of \(p \times p\) symmetric matrices \(\textbf{Y}_1, \dots , \textbf{Y}_r\), respectively. Consider the following simultaneous transformation with a \(p \times p\) invertible matrix \(\textbf{L}\):
First noting the subspaces in \(V_{k_1, \dots , k_r}\) that are invariant against this transformation and then by decomposing them into irreducible ones, one has the following decomposition:
Here, \(V_{\kappa _i}\) are analogous to \(V_\kappa \) discussed above but for \(\textbf{Y}_i\), and \(V_{\Phi }^{\kappa _1, \dots , \kappa _r}\) are irreducible invariant subspaces indexed by \(\Phi \) and \(\kappa _1, \dots , \kappa _r\). \(\Phi \) are certain ordered partitions of 2f, with \(f = k_1 + \dots + k_r\), into not more than p parts, whose range is determined by \(\kappa _1, \dots , \kappa _r\). \(\kappa _i\) run across partitions of \(k_i\), \(i = 1, \dots , r\), into not more than p parts. It is known that only those \(V_{\Phi }^{\kappa _1, \dots , \kappa _r}\) with partitions taking the form \(\Phi = 2 \phi \), where \(\phi \) is a partition of f, contain a subspace invariant under the simultaneous transformation \(\textbf{Y}_1 \rightarrow \textbf{H} \textbf{Y}_1 \textbf{H}^T, \dots , \textbf{Y}_r \rightarrow \textbf{H} \textbf{Y}_r \textbf{H}^T\), which is one-dimensional. The corresponding polynomials are called the invariant polynomials, \(C^{\kappa _1, \dots , \kappa _r}_{\phi } \! \left( \textbf{Y}_1, \dots , \textbf{Y}_r \right) \), indexed by \(\kappa _1, \dots , \kappa _r\) and \(\phi \in \kappa _1 \cdots \kappa _r\). (There is a potential issue of non-uniqueness, but this is not a concern in most applications.)
Explicit forms of the invariant polynomials are even less obvious than the zonal polynomials, but they are polynomials in matrix traces of the form \({{\,\textrm{tr}\,}}\left( \textbf{Y}_1^{a_1} \textbf{Y}_2^{a_2} \dots \textbf{Y}_r^{a_r} \right) \), so that the total degrees in elements of \(\textbf{Y}_1, \dots , \textbf{Y}_r\) are \(k_1, \dots , k_r\), respectively.
An important generalization of Proposition 2 enabled by the invariant polynomials is the following (see also Mathai et al. 1995, theorem A.3).
Proposition 3
(Davis (1980), Chikuse (1980)) Let \(\textbf{A}_1, \dots , \textbf{A}_r\) and \(\textbf{Y}_1, \dots , \textbf{Y}_r\) be \(p \times p\) symmetric matrices, and \(\kappa _1, \dots , \kappa _r\) be ordered partitions of \(k_1, \dots , k_r\) into not more than p parts. Then,
$$\int _{O(p)} \prod _{i = 1}^{r} C_{\kappa _i} \! \left( \textbf{A}_i \textbf{H} \textbf{Y}_i \textbf{H}^T \right) (\textrm{d}\textbf{H}) = \sum _{\phi } \frac{ C^{\kappa _1, \dots , \kappa _r}_{\phi } \! \left( \textbf{A}_1, \dots , \textbf{A}_r \right) C^{\kappa _1, \dots , \kappa _r}_{\phi } \! \left( \textbf{Y}_1, \dots , \textbf{Y}_r \right) }{ C_{\phi } \! \left( \textbf{I}_p \right) }, \tag{A11}$$
where the summation runs across all permissible partitions \(\phi \) of \(f = k_1 + \dots + k_r\) corresponding to \(\kappa _1, \dots , \kappa _r\).
One notable corollary from this result is (Davis 1980; Chikuse 1980)
$$C^{\kappa _1, \dots , \kappa _r}_{\phi } \! \left( \textbf{Y}, \dots , \textbf{Y} \right) = \frac{ C^{\kappa _1, \dots , \kappa _r}_{\phi } \! \left( \textbf{I}_p, \dots , \textbf{I}_p \right) }{ C_{\phi } \! \left( \textbf{I}_p \right) } \, C_{\phi } \! \left( \textbf{Y} \right) . \tag{A12}$$
The top-order invariant polynomials, in which \(\phi = \left[ {f} \right] \), the top-order partition of f, are of particular importance here. This partition occurs only when \(\kappa _i = \left[ {k_i} \right] \) for all i. An explicit form of the top-order invariant polynomials is (Chikuse 1987; Hillier et al. 2009):
Here, \(d_{k_1, \dots , k_r}\) is defined as:
where \(i_j\) run from 0 to \(k_j\) (but at least one of \(i_j\) is nonzero), and \(\nu _{i_1, \dots , i_r}\) are nonnegative integers; these collectively satisfy the r linear constraints \(\sum _{i_1 = 0}^{k_1} \dots \sum _{i_r = 0}^{k_r} i_l \, \nu _{i_1, \dots , i_r} = k_l\) for \(l = 1, \dots , r\). The \(p_{i_1, \dots , i_r}\) are the coefficients of \(\prod _{j=1}^r t_j^{i_j}\) in the expansion of the following:
Note that (A13) simplifies into (A5) when \(r = 1\). In addition, it holds that (Chikuse 1987)
$$C^{\left[ {k_1} \right] , \dots , \left[ {k_r} \right] }_{\left[ {f} \right] } \! \left( \textbf{I}_p, \dots , \textbf{I}_p \right) = C_{\left[ {f} \right] } \! \left( \textbf{I}_p \right) , \tag{A16}$$
regardless of individual values of \(k_i\) (see also (A6)).
1.7 A.7 Example application to evolvability measures
As a demonstration of how the above mathematical toolkits operate on the present problem of evaluating moments of (multiple) ratios of quadratic forms, an expression for the average autonomy \({\bar{a}}\) is derived here, under the simplest case in which \(\varvec{\upbeta } \sim N_p(\textbf{0}_p, \sigma ^2 \textbf{I}_p)\), where \(\sigma ^2\) is an arbitrary positive constant, and \(\textbf{G}\) is nonsingular. The derivation here is a simplified version of that in Smith (1989). When \(\textbf{G}\) is singular, the approach of Bao and Kan (2013) can be used instead.
As discussed in the text, \(\textbf{u}\) is uniformly distributed on the unit hypersphere \(S^{p - 1}\) under this assumption. Thus, the average autonomy \({\bar{a}}\), from the definition in (5), is to be evaluated as the following integral:
$${\bar{a}} = \int _{S^{p - 1}} \left( \textbf{u}^T \textbf{G} \textbf{u} \right) ^{-1} \left( \textbf{u}^T \textbf{G}^{-1} \textbf{u} \right) ^{-1} (\textrm{d} \textbf{u}), \tag{A17}$$
where \((\textrm{d} \textbf{u})\) denotes the normalized uniform measure on \(S^{p - 1}\) (see, e.g. Muirhead 1982). To evaluate this, consider the following transformation:
$$\left( \textbf{u}^T \textbf{G} \textbf{u} \right) ^{-1} = \alpha _1 \left[ 1 - \textbf{u}^T \left( \textbf{I}_p - \alpha _1 \textbf{G} \right) \textbf{u} \right] ^{-1} = \alpha _1 \sum _{k=0}^{\infty } \left[ \textbf{u}^T \left( \textbf{I}_p - \alpha _1 \textbf{G} \right) \textbf{u} \right] ^{k} = \alpha _1 \sum _{k=0}^{\infty } \sum _{\kappa } C_{\kappa } \! \left( \left( \textbf{I}_p - \alpha _1 \textbf{G} \right) \textbf{u} \textbf{u}^T \right) , \tag{A18}$$
where \(\alpha _1\) is a positive constant that satisfies \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{G} \right) \) (so that \(|\textbf{u}^T \left( \textbf{I}_p - \alpha _1 \textbf{G} \right) \textbf{u} |< 1\)), the second equality is from the fact that \((1 - x)^{-1} = \sum _{k=0}^\infty x^k\) when \(|x| < 1\), and the last equality is from (A3) and the fact that \(\textbf{u}^T \left( \textbf{I}_p - \alpha _1 \textbf{G} \right) \textbf{u} = {{\,\textrm{tr}\,}}\left( \left( \textbf{I}_p - \alpha _1 \textbf{G} \right) \textbf{u} \textbf{u}^T \right) \). Note that (A18) can be written as \({}_1F_0 \left( 1; \left( \textbf{I}_p - \alpha _1 \textbf{G} \right) \textbf{u} \textbf{u}^T \right) \) from (A10) and is uniformly convergent under the assumptions. (This series does not converge when \(\textbf{G}\) is singular.) By similar transformations, \(\left( \textbf{u}^T \textbf{G}^{-1} \textbf{u} \right) ^{-1} = \alpha _2 \sum _{l=0}^\infty \sum _{\lambda } C_{\lambda } \! \left( \left( \textbf{I}_p - \alpha _2 \textbf{G}^{-1} \right) \textbf{u} \textbf{u}^T \right) \), where \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{G}^{-1} \right) \). Then, the above integral becomes:
Here, term-by-term integration is permissible because the series is uniformly convergent (see also Bao and Kan 2013).
Note that integration of \(\textbf{u}\) over \(S^{p - 1}\) is equivalent to that of \(\textbf{H}\) over O(p) (e.g., Mardia and Jupp 1999), whence the change of variables \(\textbf{u} = \textbf{H} \textbf{e}\) is possible, where \(\textbf{e} = (1, 0, \dots , 0)^T\) (or \(\textbf{e}\) could be any unit vector). By denoting \(\textbf{e} \textbf{e}^T = \textbf{E}\), the integral is now
by applying Proposition 3. Furthermore, using (A12), \(C^{\kappa , \lambda }_{\phi } \! \left( \textbf{E}, \textbf{E} \right) = C^{\kappa , \lambda }_{\phi } \! \left( \textbf{I}_p, \textbf{I}_p \right) C_{\phi } \! \left( \textbf{E} \right) / C_{\phi } \! \left( \textbf{I}_p \right) \). However, \( C_{\phi } \! \left( \textbf{E} \right) \) is nonzero only if \(\phi = \left[ {k + l} \right] \), the top-order partition, because \(\textbf{E}\) has only one nonzero eigenvalue, which is unity. Hence, the summations for \(\kappa , \lambda , \phi \) disappear, leaving only the terms corresponding to the top-order partitions, \(\left[ {k} \right] , \left[ {l} \right] , \left[ {k + l} \right] \). Also note that, from (A16), \(C^{\left[ {k} \right] , \left[ {l} \right] }_{\left[ {k + l} \right] } \! \left( \textbf{I}_p, \textbf{I}_p \right) / C_{\left[ {k + l} \right] } \! \left( \textbf{I}_p \right) = 1\). The integral finally becomes
Inserting (A6) yields (14) in the text. The expressions for all average measures presented in the main text under the same assumption can be derived in a similar manner.
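For comparison with such exact series, the same sphere integral can be approximated by Monte Carlo sampling, essentially a random-skewers-style evaluation; a sketch (the helper name `average_autonomy_mc` is mine):

```python
import numpy as np

def average_autonomy_mc(G, n=100_000, seed=1):
    """Monte Carlo estimate of E[ (u' G u)^(-1) (u' G^(-1) u)^(-1) ]
    for u uniform on the unit sphere, i.e., the integrand of a-bar."""
    rng = np.random.default_rng(seed)
    p = G.shape[0]
    Ginv = np.linalg.inv(G)
    x = rng.standard_normal((n, p))
    u = x / np.linalg.norm(x, axis=1, keepdims=True)   # uniform on S^{p-1}
    e = np.einsum('ij,jk,ik->i', u, G, u)              # u' G u
    c = np.einsum('ij,jk,ik->i', u, Ginv, u)           # u' G^-1 u
    return np.mean(1.0 / (e * c))

# for G = I_p the autonomy is identically 1, so the average is exactly 1
assert np.isclose(average_autonomy_mc(np.eye(4)), 1.0)
```

Unlike the exact partial sums, such estimates carry random fluctuations, which is the motivation for the series expressions in the first place.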
1.8 A.8 Generating functions for top-order zonal and invariant polynomials
For practical evaluation of top-order zonal and invariant polynomials, it is more efficient to handle \(d_{k_1, \dots , k_r}\) in (A13) and (A14) than the polynomials themselves. Chikuse (1987) showed that the same \(d_{k_1, \dots , k_r} \! \left( \textbf{A}_1 , \dots , \textbf{A}_r \right) \) appear as coefficients in the following, which is the joint moment generating function of \(\textbf{x}^T \textbf{A}_1 \textbf{x} / 2, \dots , \textbf{x}^T \textbf{A}_r \textbf{x} / 2\) for \(\textbf{x} \sim N_p (\textbf{0}_p, \textbf{I}_p)\):
$$\left| \textbf{I}_p - t_1 \textbf{A}_1 - \dots - t_r \textbf{A}_r \right| ^{-1/2} = \sum _{k_1 = 0}^{\infty } \dots \sum _{k_r = 0}^{\infty } d_{k_1, \dots , k_r} \! \left( \textbf{A}_1, \dots , \textbf{A}_r \right) t_1^{k_1} \cdots t_r^{k_r}. \tag{A22}$$
Chikuse (1987) and Hillier et al. (2009, 2014) presented a few recursive algorithms to evaluate \(d_{k_1, \dots , k_r}\).
Bao and Kan (2013) and Hillier et al. (2014) presented further improvements for evaluating moments of ratios of quadratic forms in potentially noncentral normal variables, \(\textbf{x} \sim N_p (\varvec{\upmu }, \textbf{I}_p)\). Following those papers, the coefficients \(h_{k_1, k_2}, {\tilde{h}}_{k_1; k_2}, {\tilde{h}}_{k_1; k_2, k_3}\) that appear in the following expansions are used in the present paper:
(A23) and (A24) are from Bao and Kan (2013), and (A25) is a slight extension of (A24) used in Appendices B and C. All these simplify to (A22) when \(\varvec{\upmu } = \textbf{0}_p\). Note that the first arguments in \({\tilde{h}}_{k_1; k_2}\) and \({\tilde{h}}_{k_1; k_2, k_3}\) are not interchangeable with the other arguments, unlike those in \(d_{k_1, k_2}\) or \(h_{k_1, k_2}\).
Appendix B Moment of multiple ratio of quadratic forms
Among the average evolvability measures mentioned in the text, the expressions for \({\bar{e}}\), \({\bar{c}}\), \({\bar{r}}\), and \({\bar{d}}\) can be directly derived from Proposition 1. For \({\bar{a}}\), \({\bar{f}}\), and \({\bar{\rho }}\), an expression is required for the expectation of the form \(\frac{(\textbf{x}^T \textbf{A} \textbf{x})^l}{(\textbf{x}^T \textbf{B} \textbf{x})^m (\textbf{x}^T \textbf{D} \textbf{x})^n}\); it suffices here to consider integer-valued l. Smith (1989) derived such an expression for positive definite \(\textbf{B}, \textbf{D}\). However, it is possible to derive the following, equivalent expression under the slightly more general condition that \(\textbf{B}, \textbf{D}\) are nonnegative definite, following a similar line of argument as in Bao and Kan (2013).
Proposition 4
Let \(\textbf{x} \sim N_p \left( \varvec{\upmu }, \textbf{I}_p \right) \), \(\textbf{A}\) be a symmetric \(p \times p\) matrix, \(\textbf{B}, \textbf{D}\) be symmetric, nonnegative definite \(p \times p\) matrices, l be a positive integer, and m, n be positive real numbers. When the expectation of \( \frac{(\textbf{x}^T \textbf{A} \textbf{x})^l}{(\textbf{x}^T \textbf{B} \textbf{x})^m (\textbf{x}^T \textbf{D} \textbf{x})^n}\) exists, it is
where \(\alpha _{\textbf{B}}, \alpha _{\textbf{D}}\) are to satisfy \(0< \alpha _{\textbf{B}} < 2 / \lambda _{\max } \! \left( \textbf{B} \right) \) and \(0< \alpha _{\textbf{D}} < 2 / \lambda _{\max } \! \left( \textbf{D} \right) \).
The proof is omitted because it is almost identical to that of proposition 3 in Bao and Kan (2013, pp. 237–243), except that (A25) is used in place of (A23).
Remark
Conditions for the existence of the above moment were stated by Smith (1989) for positive definite \(\textbf{B}, \textbf{D}\); in the notation here, the moment exists if and only if \(\frac{p}{2} + l > m + n\). For positive semidefinite \(\textbf{B}, \textbf{D}\), dealing with their null spaces is critical (see Roberts 1995). Deriving general existence conditions in that case does not seem straightforward, so only the following simple cases are noted here. (i) When \(N \left( \textbf{D} \right) \subseteq N \left( \textbf{B} \right) \) (or \(N \left( \textbf{B} \right) \subseteq N \left( \textbf{D} \right) \)): In the former case, it is possible to find a positive constant \(\epsilon \) that always satisfies \((\textbf{x}^T \textbf{B} \textbf{x})^m (\textbf{x}^T \textbf{D} \textbf{x})^n \ge (\epsilon \textbf{x}^T \textbf{B} \textbf{x})^{m + n}\), so that \((\textbf{x}^T \textbf{A} \textbf{x})^l / \left[ (\textbf{x}^T \textbf{B} \textbf{x})^m (\textbf{x}^T \textbf{D} \textbf{x})^n \right] \le \epsilon ^{-(m + n)} (\textbf{x}^T \textbf{A} \textbf{x})^l / (\textbf{x}^T \textbf{B} \textbf{x})^{m + n}\). The desired moment of the left-hand side exists if and only if that of the right-hand side does, and the necessary and sufficient condition for the latter is given in the proposition 1 of Bao and Kan (2013), which encompasses the positive definite case above. The same argument applies when \(N \left( \textbf{B} \right) \subseteq N \left( \textbf{D} \right) \) with obvious modifications. (ii) When \(N \left( \textbf{D} \right) \nsubseteq N \left( \textbf{B} \right) \) and \(N \left( \textbf{B} \right) \nsubseteq N \left( \textbf{D} \right) \): Consider the space \(F = \lbrace \textbf{y}: \textbf{y} \perp N \left( \textbf{B} \right) , \textbf{y} \perp N \left( \textbf{D} \right) , \textbf{y} \in {\mathbb {R}}^p \rbrace \). 
Assuming \(F \ne \varnothing \), it is possible to find a positive semidefinite matrix \(\textbf{F}\) that always satisfies \((\textbf{x}^T \textbf{A} \textbf{x})^l / \left[ (\textbf{x}^T \textbf{B} \textbf{x})^m (\textbf{x}^T \textbf{D} \textbf{x})^n \right] \le (\textbf{x}^T \textbf{A} \textbf{x})^l / (\textbf{x}^T \textbf{F} \textbf{x})^{m + n}\), by choosing arbitrarily small positive numbers and a basis of \(F\) as its nonzero eigenvalues and eigenvectors, respectively. Proposition 1 of Bao and Kan (2013) applied to the right-hand side then yields a sufficient condition for the existence of the moment. This is admittedly a restrictive condition, and the existence condition in the case \(F = \varnothing \) is not always obvious.
In the present applications, the moment existence conditions can easily be confirmed for \({\bar{e}}\), \({\bar{c}}\), \({\bar{r}}\), and \({\bar{d}}\) by directly applying Proposition 1 of Bao and Kan (2013). \({\bar{a}}\) and \({\bar{f}}\) always fall within case (i) above, and the moments exist as long as they are defined (in the case of \({\bar{f}}\), \(\varvec{\upbeta }\) should not be restricted onto \(N \left( \textbf{G} \right) \)). On the other hand, \({\bar{\rho }}\) may fall within either (i) or (ii) depending on the structures of the two \(\textbf{G}\)'s. Even in the latter case, and even if \(F = \varnothing \), which is a pathological condition in this application, this specific moment exists as long as \(\rho \) is defined with probability 1, since \(-1 \le \rho \le 1\) whenever defined (above). In addition, the same criteria can be used to confirm that these evolvability measures have finite second moments, which formally validates the use of Monte Carlo evaluations.
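Since the finite second moments are what justify standard errors in Monte Carlo evaluation, a minimal sketch of such an evaluation may be illustrative. The following (with a hypothetical two-trait \(\textbf{G}\)) estimates \({\bar{e}}\) under \(\textbf{u}\) uniform on \(S^{p-1}\) and compares it with the known closed form \({{\,\textrm{tr}\,}}\left( \textbf{G} \right) / p\):

```python
import numpy as np

rng = np.random.default_rng(0)
G = np.array([[2.0, 0.5],
              [0.5, 1.0]])          # hypothetical G matrix
p = G.shape[0]

# Unit-length selection gradients uniform on the sphere S^{p-1}
u = rng.standard_normal((100_000, p))
u /= np.linalg.norm(u, axis=1, keepdims=True)

e = np.einsum('ij,jk,ik->i', u, G, u)   # e(u) = u^T G u, row-wise
e_bar_mc = e.mean()
# A finite second moment justifies the usual Monte Carlo standard error
e_bar_se = e.std(ddof=1) / np.sqrt(len(e))

print(e_bar_mc, e_bar_se)               # e_bar_mc ~ tr(G)/p = 1.5
```

The same scheme extends to the other measures by replacing the integrand, provided the corresponding second moment exists as discussed above.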
Appendix C Evolvability measures in general normal cases
C.1 Conditions considered
This section considers a general p-variate normality condition for the selection gradient, \(\varvec{\upbeta } \sim N_p \left( \varvec{\upeta }, \varvec{\Sigma } \right) \). Regarding ratios of quadratic forms, however, tractable results are available only for normal variables with a spherical covariance structure, i.e., \(N_q \left( \varvec{\upmu }, \textbf{I}_q \right) \) for some q and arbitrary \(\varvec{\upmu }\). Let \(\textbf{A}\) be an arbitrary \(p \times p\) symmetric matrix and \(\textbf{K}\) be a \(p \times q\) matrix that satisfies \(\textbf{K} \textbf{K}^T = \varvec{\Sigma }\), where q (\(\le p\)) is the rank of \(\varvec{\Sigma }\). Writing \(\varvec{\upbeta } = \varvec{\upeta } + \textbf{K} \varvec{\upbeta }_0\) with \(\varvec{\upbeta }_0 \sim N_q (\textbf{0}_q, \textbf{I}_q)\), the quadratic form in \(\varvec{\upbeta }\) with respect to \(\textbf{A}\) can be written as \(\varvec{\upbeta }^T \textbf{A} \varvec{\upbeta } = (\varvec{\upeta } + \textbf{K} \varvec{\upbeta }_0)^T \textbf{A} (\varvec{\upeta } + \textbf{K} \varvec{\upbeta }_0)\) (see also Mathai and Provost 1992, chapter 3). Certain assumptions are required to express this quadratic form as one in normal variables with a spherical covariance structure.
(1) When \(\varvec{\Sigma }\) is nonsingular In this case, \(p = q\) and \(\textbf{K}\) is invertible. It suffices to consider the transformation \(\varvec{\upbeta }_* = \textbf{K}^{-1} \varvec{\upbeta } \sim N_p \left( \varvec{\upmu }, \textbf{I}_p \right) \), where \(\varvec{\upmu } = \textbf{K}^{-1} \varvec{\upeta }\). The quadratic form can be rewritten as \(\varvec{\upbeta }^T \textbf{A} \varvec{\upbeta } = \varvec{\upbeta }_*^T \textbf{K}^T \textbf{A} \textbf{K} \varvec{\upbeta }_*\), a new quadratic form in \(\varvec{\upbeta }_*\) with respect to \(\textbf{K}^T \textbf{A} \textbf{K}\), to which results below apply.
(2) When \(\varvec{\Sigma }\) is singular In this case, \(\textbf{K}\) is not invertible, and at least one of the following additional conditions needs to be satisfied for the results below to be applicable. (i) \(\varvec{\upeta } \in R \left( \varvec{\Sigma } \right) \) (including \(\varvec{\upeta } = \textbf{0}_p\)): Under this condition, it holds that \(\varvec{\upeta } = \textbf{K} \textbf{K}^- \varvec{\upeta }\) (e.g., Schott 2016, theorem 5.25) since \(R \left( \varvec{\Sigma } \right) = R \left( \textbf{K} \right) \). It is then possible to consider \(\varvec{\upbeta }^T \textbf{A} \varvec{\upbeta } = \varvec{\upbeta }_*^T \textbf{K}^T \textbf{A} \textbf{K} \varvec{\upbeta }_*\) with \(\varvec{\upbeta }_* = \textbf{K}^- \varvec{\upbeta } \sim N_q \left( \varvec{\upmu }, \textbf{I}_q \right) \), where \(\varvec{\upmu } = \textbf{K}^- \varvec{\upeta }\), as a generalization of the nonsingular case. (ii) \(R \left( \textbf{A} \right) \subseteq R \left( \varvec{\Sigma } \right) \): From the symmetry of \(\textbf{A}\) and the fact that \(R \left( \textbf{K} \right) = R \left( {\textbf{K}^-}^T \right) \), it holds that \(\textbf{A} = {\textbf{K}^-}^T \textbf{K}^T \textbf{A} \textbf{K} \textbf{K}^-\) under this condition (Schott 2016, theorem 5.25). Then the same transformation as for (i) applies. (The utility of this condition in the present applications is relatively limited, since all evolvability measures except \(\rho \) involve at least one \(\textbf{I}_p\) in place of \(\textbf{A}\).) (iii) \(\textbf{A} \varvec{\upeta } = \textbf{0}_p\): Under this condition, the above quadratic form simplifies into \(\varvec{\upbeta }_0^T \textbf{K}^T \textbf{A} \textbf{K} \varvec{\upbeta }_0\), to which the results below apply with \(\varvec{\upmu } = \textbf{0}_q\).
In practice, condition (1) or (2)(i) above seems to cover most cases of biological interest. That is, if the variation of the selection gradient \(\varvec{\upbeta }\) is restricted to a certain subspace (condition (2)), then it is plausible that the mean selection gradient \(\varvec{\upeta }\) lies in the same subspace (condition (2)(i)). Otherwise, \(\varvec{\upbeta }\) would have to contain a nonzero deterministic component (\(\varvec{\upeta }\)) that is strictly unaffected by (linearly independent of) its potential variability (represented by \(\varvec{\Sigma }\)).
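As a numerical illustration of case (2)(i), a sketch under assumed values: \(\textbf{K}\) is built from the nonzero part of the eigendecomposition of a singular \(\varvec{\Sigma }\), the Moore–Penrose inverse plays the role of \(\textbf{K}^-\), and the identity \(\varvec{\upbeta }^T \textbf{A} \varvec{\upbeta } = \varvec{\upbeta }_*^T \textbf{K}^T \textbf{A} \textbf{K} \varvec{\upbeta }_*\) is verified directly.

```python
import numpy as np

rng = np.random.default_rng(1)
Sigma = np.diag([2.0, 1.0, 0.0])       # rank q = 2, singular case
eta = np.array([1.0, -1.0, 0.0])       # eta in R(Sigma): case (2)(i)
A = np.array([[1.0, 0.2, 0.0],
              [0.2, 2.0, 0.1],
              [0.0, 0.1, 0.5]])        # arbitrary symmetric A

# K with K K^T = Sigma, from the nonzero eigenpairs of Sigma
w, V = np.linalg.eigh(Sigma)
keep = w > 1e-12
K = V[:, keep] * np.sqrt(w[keep])      # p x q, full column rank

# beta = eta + K beta0, beta0 ~ N_q(0, I_q); beta* = K^- beta ~ N_q(mu, I_q)
beta0 = rng.standard_normal(K.shape[1])
beta = eta + K @ beta0
Kpinv = np.linalg.pinv(K)              # Moore-Penrose inverse as K^-
beta_star = Kpinv @ beta

lhs = beta @ A @ beta                  # quadratic form in beta
rhs = beta_star @ (K.T @ A @ K) @ beta_star
print(lhs, rhs)                        # equal, since eta = K K^- eta here
```

If \(\varvec{\upeta }\) had a component outside \(R \left( \varvec{\Sigma } \right) \), the two sides would generally differ, which is why one of the conditions (2)(i)–(iii) is needed.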
C.2 Average evolvability measures
Expressions for average evolvability measures for \(\varvec{\upbeta } \sim N_p \left( \varvec{\upeta }, \varvec{\Sigma } \right) \) or equivalently \(\varvec{\upbeta }_* \sim N_q \left( \varvec{\upmu }, \textbf{I}_q \right) \) can be derived from Propositions 1 and 4.
The average evolvability \({\bar{e}}\) is
where \(\alpha \) is a positive constant that satisfies \(0< \alpha < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{K} \right) \).
As discussed in the text (2.4), the average conditional evolvability \({\bar{c}}\) and average autonomy \({\bar{a}}\) can be nonzero only if the distribution of \(\varvec{\upbeta }\) is strictly within \(R \left( \textbf{G} \right) \). This happens (i) when \(\textbf{G}\) is nonsingular; or (ii) when \(\textbf{G}\) is singular, \(\varvec{\upeta } \in R \left( \textbf{G} \right) \) and \(R \left( \varvec{\Sigma } \right) \subseteq R \left( \textbf{G} \right) \).
Under these conditions, the following expression can be obtained for the average conditional evolvability \({\bar{c}}\):
where \(\alpha \) is to satisfy \(0< \alpha < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{G}^{-} \textbf{K} \right) \).
Under the same conditions, the average autonomy \({\bar{a}}\) is
where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{G} \textbf{K} \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{G}^{-} \textbf{K} \right) \). As before, the average integration \({\bar{i}}\) is \(1 - {\bar{a}}\).
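The rank conditions above, \(\varvec{\upeta } \in R \left( \textbf{G} \right) \) and \(R \left( \varvec{\Sigma } \right) \subseteq R \left( \textbf{G} \right) \), can be checked numerically with an orthogonal projector onto \(R \left( \textbf{G} \right) \). A sketch (function name and matrices hypothetical):

```python
import numpy as np

def within_range(G, eta, Sigma, tol=1e-10):
    """Check eta in R(G) and R(Sigma) subseteq R(G).

    P = G G^+ is the orthogonal projector onto R(G); a vector or column
    space lies in R(G) exactly when the projection leaves it unchanged.
    """
    P = G @ np.linalg.pinv(G)
    eta_ok = np.linalg.norm(eta - P @ eta) <= tol * max(np.linalg.norm(eta), 1.0)
    sigma_ok = np.linalg.norm(Sigma - P @ Sigma) <= tol * max(np.linalg.norm(Sigma), 1.0)
    return eta_ok and sigma_ok

G = np.diag([2.0, 1.0, 0.0])                     # singular G
ok = within_range(G, np.array([1.0, 2.0, 0.0]), np.diag([1.0, 1.0, 0.0]))
bad = within_range(G, np.array([0.0, 0.0, 1.0]), np.diag([1.0, 1.0, 0.0]))
print(ok, bad)
```

When the check fails, \({\bar{c}}\) and \({\bar{a}}\) are zero with probability 1, as discussed in the text (2.4).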
The average respondability \({\bar{r}}\) is
where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{G}^2 \textbf{K} \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{K}^T\textbf{K} \right) \).
The average flexibility \({\bar{f}}\) is
where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{K} \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{G}^2 \textbf{K} \right) \).
The average response difference \({\bar{d}}\) is
where \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \left( \textbf{G}_1 - \textbf{G}_2 \right) ^2 \textbf{K} \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{K}^T\textbf{K} \right) \).
The average response correlation \({\bar{\rho }}\) is
where \(\textbf{G}_{12} = \left( \textbf{G}_1 \textbf{G}_2 + \textbf{G}_2 \textbf{G}_1 \right) / 2\), and \(\alpha _1\) and \(\alpha _2\) are to satisfy \(0< \alpha _1 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{G}_1^2 \textbf{K} \right) \) and \(0< \alpha _2 < 2 / \lambda _{\max } \! \left( \textbf{K}^T \textbf{G}_2^2 \textbf{K} \right) \).
C.3 Practical evaluation and truncation error
The above series expressions for average measures in general cases can be evaluated by one of the recursive algorithms described by Hillier et al. (2009, 2014) and Bao and Kan (2013). When the mean \(\varvec{\upmu }\) is nonzero, the series can be most efficiently evaluated by the recursions for \(h_{i,j}\) and \({\tilde{h}}_{i,j}\) described in Hillier et al. (2014, section 5.4) and Bao and Kan (2013, section 5). It is recommended to inspect a profile of the partial sums across varying orders whenever possible, as the signs of the terms can in general fluctuate in these series.
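Such an inspection can be organized as follows; this is a generic sketch (the helper name is hypothetical, and it is not the Hillier et al. recursion itself). It accumulates partial sums \(S_M\) and declares convergence only when several consecutive terms are uniformly small relative to the current sum, guarding against the sign fluctuations noted above.

```python
def partial_sum_profile(terms, max_order=500, rtol=1e-10, window_len=5):
    """Accumulate partial sums of a series and judge numerical convergence.

    terms: iterable yielding the series terms in order k = 0, 1, 2, ...
    Convergence is declared only when `window_len` consecutive terms are
    all small relative to the current sum, since individual terms may
    alternate in sign and momentarily vanish without the tail being small.
    """
    sums, s, window = [], 0.0, []
    for k, t in enumerate(terms):
        if k >= max_order:
            return sums, False          # no convergence within max_order
        s += t
        sums.append(s)
        window.append(abs(t))
        if len(window) > window_len:
            window.pop(0)
        if len(window) == window_len and max(window) <= rtol * max(abs(s), 1.0):
            return sums, True
    return sums, False

# Toy usage: a geometric series, sum_{k>=0} 0.5^k = 2
sums, converged = partial_sum_profile(0.5 ** k for k in range(10_000))
print(converged, sums[-1])
```

In practice the `terms` generator would be driven by the recursions cited above, and the profile `sums` plotted against the order \(M\).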
At present, truncation error bounds for general cases are available only for \({\bar{e}}\) and \({\bar{c}}\). From the theorem 7 of Hillier et al. (2014), a truncation error bound for these average measures evaluated up to \(k = M\) is obtained from the following, by inserting \((\textbf{A}, \textbf{B}) = (\textbf{K}^T \textbf{G} \textbf{K}, \textbf{K}^T \textbf{K})\) and \((\textbf{K}^T \textbf{K}, \textbf{K}^T \textbf{G}^- \textbf{K})\) for \({\bar{e}}\) and \({\bar{c}}\), respectively:
Here, \(n = 1\), \(t = 2\), and \(S_M = \sum _{j=0}^M {\hat{h}}_{1; j} \! \left( \textbf{A} ; \textbf{I}_q - \alpha \textbf{B} \right) \), with \({\hat{h}}_{1; j} \! \left( \textbf{A}_1; \textbf{A}_2 \right) \) being the coefficient of \(t_1 t_2^j\) in the expansion
which can be evaluated by a recursive algorithm similar to the one presented in Hillier et al. (2014, theorem 6). This expression bounds the truncation error in absolute value, unlike those in (19)–(22), which are one-sided. This is because \({\tilde{h}}_{i,j}\) can take both positive and negative values when \(\varvec{\upmu }\) is nonzero, so that the partial sums are not always increasing.
Under the restriction \(\textbf{K}^T \textbf{K} = \textbf{I}_q\), a similar error bound can be derived for \({\bar{f}}\), by letting \(\textbf{A} = \textbf{K}^T \textbf{G} \textbf{K}\), \(\textbf{B} = \textbf{K}^T \textbf{G}^2 \textbf{K}\), \(n = 1/2\), \(t = 3\), and \(S_M = \sum _{k=0}^{M} \sum _{l=0}^{k} {\hat{h}}_{1; l, k - l} \! \left( \textbf{A} ; \textbf{I}_q - \alpha \textbf{B} , \textbf{0}_{q \times q} \right) \) in (C36). Here, \({\hat{h}}_{1; j, k} \! \left( \textbf{A}_1; \textbf{A}_2, \textbf{A}_3 \right) \) is the coefficient of \(t_1 t_2^j t_3^{k}\) in the expansion of
which can be evaluated by the same recursive algorithm for \({\hat{h}}_{i; j}\) with obvious modifications.
Appendix D Connections to previous results
D.1 Average conditional evolvability in two traits
Hansen and Houle (2008) stated that the average conditional evolvability under the uniform distribution of \(\textbf{u}\) on \(S^{p - 1}\) equals the geometric mean of the eigenvalues of \(\textbf{G}\) when \(p = 2\). It is possible to confirm that the present result is consistent with their claim as follows. From (13),
where in the second equality the denominator parameter cancels with one of the numerator parameters, and the third equality follows from (A10). Since the determinant equals the product of the eigenvalues, this confirms their claim.
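The two-trait identity is also easy to verify by simulation. A sketch with an arbitrary (hypothetical) positive definite \(\textbf{G}\), comparing the Monte Carlo average of \(c \left( \textbf{u} \right) = (\textbf{u}^T \textbf{G}^{-1} \textbf{u})^{-1}\) with \(\sqrt{\det \textbf{G}}\), the geometric mean of the two eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
G = np.array([[3.0, 1.0],
              [1.0, 2.0]])            # hypothetical p = 2, nonsingular G

# Uniform unit vectors on S^1
u = rng.standard_normal((400_000, 2))
u /= np.linalg.norm(u, axis=1, keepdims=True)

Ginv = np.linalg.inv(G)
c = 1.0 / np.einsum('ij,jk,ik->i', u, Ginv, u)   # c(u) = (u^T G^{-1} u)^{-1}
c_bar_mc = c.mean()
geom_mean = np.sqrt(np.linalg.det(G))            # sqrt(lambda_1 * lambda_2)
print(c_bar_mc, geom_mean)
```

The agreement holds only for \(p = 2\); for \(p > 2\) the average conditional evolvability generally differs from the geometric mean of the eigenvalues.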
D.2 Average respondability under complete integration
Kirkpatrick (2009) considered the average selection response \({\bar{R}}\), which equals \({\bar{r}} / \lambda _{\max } \! \left( \textbf{G} \right) \) here when \(\textbf{u}\) is uniformly distributed on \(S^{p - 1}\). This section shows that the present expression for \({\bar{r}}\) (15) entails his result for \({\bar{R}}\) in the case of complete integration, \(\textbf{G} = {{\,\textrm{diag}\,}}\left( \lambda _{\max }, 0, \dots , 0 \right) \), the only situation for which he gave an analytic expression. His expression is (Kirkpatrick 2009, (6) and (A7) therein)
Taking \(\alpha = \lambda _{\max }^{-2}\), (15) becomes
where the second equality follows from the fact \( C_{\left[ {k} \right] } \! \left( {{\,\textrm{diag}\,}}\left( 0, 1, \dots , 1 \right) \right) = C_{\left[ {k} \right] } \! \left( \textbf{I}_{p-1} \right) \) and (A6), and the last equality follows from the Gauss hypergeometric theorem, \({}_2F_1 \left( a, b; c; 1 \right) = \frac{\Gamma \left( c \right) \Gamma \left( c - b - a \right) }{\Gamma \left( c - b \right) \Gamma \left( c - a \right) }\). The equivalence to (D37) is obvious.
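Under complete integration the geometry is simple enough for a direct check: \(r \left( \textbf{u} \right) = \sqrt{\textbf{u}^T \textbf{G}^2 \textbf{u}} = \lambda _{\max } |u_1|\), so \({\bar{R}} = {\bar{r}} / \lambda _{\max } = E |u_1|\), which for \(\textbf{u}\) uniform on \(S^{p-1}\) has the closed form \(\Gamma \left( p/2 \right) / \left[ \sqrt{\pi } \, \Gamma \left( (p+1)/2 \right) \right] \). A Monte Carlo sketch (dimension and \(\lambda _{\max }\) hypothetical):

```python
import numpy as np
from math import gamma, pi, sqrt

rng = np.random.default_rng(3)
p, lam_max = 4, 2.5
G = np.diag([lam_max] + [0.0] * (p - 1))   # complete integration

# Uniform unit vectors on S^{p-1}
u = rng.standard_normal((400_000, p))
u /= np.linalg.norm(u, axis=1, keepdims=True)

# r(u) = sqrt(u^T G^2 u) = lam_max * |u_1| in this special case
r = np.sqrt(np.einsum('ij,jk,ik->i', u, G @ G, u))
R_bar_mc = r.mean() / lam_max              # average response R-bar

# Closed form for E|u_1| with u uniform on the sphere
R_bar_exact = gamma(p / 2) / (sqrt(pi) * gamma((p + 1) / 2))
print(R_bar_mc, R_bar_exact)
```

This reproduces the same quantity that the hypergeometric evaluation above yields analytically.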
Watanabe, J. Exact expressions and numerical evaluation of average evolvability measures for characterizing and comparing \(\textbf{G}\) matrices. J. Math. Biol. 86, 95 (2023). https://doi.org/10.1007/s00285-023-01930-8
Keywords
- Evolutionary constraint
- Phenotypic integration
- Quadratic form
- Quantitative genetics
- Random skewers
- Zonal polynomial