Illustrating the importance of meta-analysing variances alongside means in ecology and evolution

Meta-analysis is increasingly used in biology to both quantitatively summarize available evidence for specific questions and generate new hypotheses. Although this powerful tool has mostly been deployed to study mean effects, there is untapped potential to study effects on (trait) variance. Here, we use a recently published data set as a case study to demonstrate how meta-analysis of variance can be used to provide insights into biological processes. This data set included 704 effect sizes from 89 studies, covering 56 animal species, and was originally used to test developmental stress effects on a range of traits. We found that developmental stress not only negatively affects mean trait values, but also increases trait variance, mostly in reproduction, showcasing how meta-analysis of variance can reveal previously overlooked effects. Furthermore, we show how meta-analysis of variance can be used as a tool to help meta-analysts make informed methodological decisions, even when the primary focus is on mean effects. We provide all data and comprehensive R scripts with detailed explanations to make it easier for researchers to conduct this type of analysis. We encourage meta-analysts in all disciplines to move beyond the world of means and start unravelling secrets of the world of variance.


KEYWORDS
coefficient of variation, early-life effects, opportunity for selection, parental effects, variability, variance ratio

| INTRODUCTION
'Our preoccupation with averages has blinded us to biological realities' (Hogben & Sim, 1953). Despite the exponential increase in the use of meta-analysis in recent years (Stewart, 2009; Gurevitch et al., 2018), most meta-analyses have exclusively focused on the study of mean effects (using effect sizes such as response ratios and standardized mean differences between two groups; Nakagawa & Santos, 2012; Koricheva & Gurevitch, 2014). Meta-analysis is a powerful statistical tool for integrating and quantitatively summarizing findings (i.e. effect sizes) from multiple studies tackling a common research question, and for generating new hypotheses. Yet, meta-analysts may be neglecting important biological realities by focusing on means alone.
In biological systems, variation from the mean is important to ecological and evolutionary processes. Phenotypic variance is important for how we understand and predict responses to selection using quantitative genetics because phenotypic variance is a key component of heritability and, thus, of the breeder's equation (Arnold, 1992; Blows & Hoffmann, 2005; Walsh & Lynch, 2018). As such, phenotypic variance has been the focus of abundant research in biology, leading to the development of important evolutionary hypotheses (e.g. sex-chromosome hypothesis: James, 1973; reviewed in Reinhold & Engqvist, 2013), principles (Bateman's principles: Arnold, 1994; Bateman, 1948; reviewed in Janicke et al., 2016) and entire research fields (e.g. animal personality: reviewed in Réale et al., 2007, 2010). Meta-analysis of variance components such as repeatability and heritability has been possible for a long time, and a few studies have done so (repeatability: Bell et al., 2009; Holtmann et al., 2017; heritability: Dochtermann et al., 2019). However, only recent statistical advances in meta-analysis have made it possible to analyse differences in variance between groups, allowing us to test, for example, whether the opportunity for selection (variance in fitness) differs between groups. As a result, meta-analyses of variance are emerging (Supplementary material S1). For example, meta-analyses have shown that early-life dietary restriction not only affects mean longevity (English & Uller, 2016) but also increases variance in longevity (Senior et al., 2017); that poor condition increases mean risk-taking behaviour but does not generally affect total phenotypic variance (except in specific contexts; see Moran et al., 2020); and that sexual selection on males not only increases mean but also decreases variance in fitness-related traits (Cally et al., 2019).
Despite this, meta-analyses of variance are still rarely used.
In this study, we aim to promote the use of meta-analysis of variance in biology and other disciplines. We used a recently published meta-analytic data set of experimental studies (Eyck et al., 2019) as a case study to test the prediction that developmental stress not only negatively affects mean trait values, but also increases total trait variance. Furthermore, we used meta-regression to test whether mean and variance effects differ across traits (e.g. behaviour, morphology, reproduction). Our meta-analysis of variance revealed developmental stress effects on variance, mostly on reproduction, and highlighted the importance of shifting some of our meta-analytic attention to the raw material for natural selection: variation.

| Data analysed
Experimental data on the effects of developmental stress on phenotype and fitness were obtained from Eyck et al. (2019). Before the analyses, we made modifications to the data set (see details in Supplementary material S2).
To study mean effects, we calculated the log response ratio (lnRR, Hedges et al., 1999; also known as the ratio of means or ROM, Friedrich et al., 2008). We chose lnRR instead of the standardized mean difference Cohen's d (Cohen, 1988) as used in Eyck et al. (2019), because (a) lnRR is less affected by heteroscedasticity (see results), and (b) lnRR can be readily interpreted as the percentage of change between the two groups. Nonetheless, for comparison with the original study, we conducted an additional meta-analysis of means based on a standardized mean difference effect size (see Supplementary Material S3).
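As a minimal sketch of the calculation described above (written in Python for illustration, not the authors' R code), the log response ratio and its sampling variance can be computed from group means, SDs and sample sizes using the standard first-order formulas of Hedges et al. (1999). The function names and the input numbers are hypothetical.

```python
import math


def ln_rr(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (lnRR, sampling variance) for treatment vs. control.

    Uses the first-order formulas of Hedges et al. (1999).
    Both means must be positive (ratio scale data).
    """
    if mean_t <= 0 or mean_c <= 0:
        raise ValueError("lnRR requires positive group means")
    yi = math.log(mean_t / mean_c)
    vi = sd_t**2 / (n_t * mean_t**2) + sd_c**2 / (n_c * mean_c**2)
    return yi, vi


def percent_change(ln_ratio):
    """Back-transform a log ratio into a percentage change."""
    return (math.exp(ln_ratio) - 1.0) * 100.0


# Hypothetical example: the treatment mean is 13% lower than the control mean.
yi, vi = ln_rr(mean_t=8.7, sd_t=2.0, n_t=20, mean_c=10.0, sd_c=2.0, n_c=20)
print(yi, vi, percent_change(yi))
```

Back-transforming with `exp(lnRR) - 1` is what allows a meta-analytic mean on the log scale to be reported as a percentage difference between groups.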
To study variance effects, we calculated the log coefficient of variation ratio (lnCVR; Nakagawa et al., 2015; Senior et al., 2020). We chose lnCVR over the log variability ratio (lnVR; Nakagawa et al., 2015; Senior et al., 2020) because groups can simultaneously differ at both the mean and variance levels (mean-variance relationship, e.g. Taylor's Law; Cohen & Xu, 2015; Nakagawa & Schielzeth, 2012), such as that observed in our sample (see Supplementary Material S4), and lnCVR is designed to account for that. An alternative approach to account for mean-variance relationships would be to model group means and standard deviations (SD) using univariate random slope mixed-effects models or bivariate mixed-effects meta-analytic models (also called arm-based models). Compared with lnCVR, the latter approaches can have both advantages (e.g. incorporating interval scale data) and disadvantages (e.g. unknown sampling error covariances and complexity; Dias & Ades, 2016; Nakagawa et al., 2015). Although we note that such approaches are possible, and have been successfully implemented elsewhere (Simons, 2015; O'Dea et al., 2019), our choice of lnCVR was mostly driven by its simplicity and integration within established meta-analytic paradigms, and because lnCVR is easily comparable to lnRR.
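The difference between lnCVR and lnVR can be illustrated with a short Python sketch (illustrative only, not the authors' R code). It uses the point estimators of Nakagawa et al. (2015) with their small-sample bias correction. The sampling variance shown for lnCVR is the simplified version that assumes means and SDs are uncorrelated; the full estimator includes correlation terms that are omitted here. Function names and numbers are hypothetical.

```python
import math


def _bias_correction(n_t, n_c):
    # Small-sample correction shared by lnVR and lnCVR.
    return 1.0 / (2 * (n_t - 1)) - 1.0 / (2 * (n_c - 1))


def ln_vr(sd_t, n_t, sd_c, n_c):
    """Log variability ratio: compares SDs, ignoring the means."""
    return math.log(sd_t / sd_c) + _bias_correction(n_t, n_c)


def ln_cvr(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (lnCVR, simplified sampling variance).

    Compares coefficients of variation (SD / mean), so a change in SD
    that is proportional to a change in the mean yields lnCVR near 0.
    ASSUMPTION: the variance omits the mean-SD correlation terms.
    """
    if min(mean_t, mean_c, sd_t, sd_c) <= 0:
        raise ValueError("lnCVR requires positive means and SDs")
    cv_t, cv_c = sd_t / mean_t, sd_c / mean_c
    yi = math.log(cv_t / cv_c) + _bias_correction(n_t, n_c)
    vi = (sd_t**2 / (n_t * mean_t**2) + 1.0 / (2 * (n_t - 1))
          + sd_c**2 / (n_c * mean_c**2) + 1.0 / (2 * (n_c - 1)))
    return yi, vi


# Hypothetical case: mean and SD both drop by 20% under stress.
vr = ln_vr(sd_t=2.0, n_t=15, sd_c=2.5, n_c=15)
cvr, cvr_var = ln_cvr(mean_t=8.0, sd_t=2.0, n_t=15,
                      mean_c=10.0, sd_c=2.5, n_c=15)
print(vr, cvr, cvr_var)
```

In this hypothetical case lnVR is negative because the SD shrank, whereas lnCVR is zero because the SD shrank only in proportion to the mean. This is the sense in which lnCVR accounts for a mean-variance relationship.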
Multiple treatment groups shared a common control group in 23 studies (25.8% of all studies) involving 252 effect sizes (35.8% of all effect sizes), leading to nonindependence among effect sizes (Lajeunesse, 2011). To deal with this nonindependence, we divided the original sample size of each shared control group by the number of times that control group was compared with a treatment group.
For the meta-analysis of means, all the revised effect sizes that we calculated were coded such that negative values indicate that developmental stress negatively affects fitness. That is, effect sizes were coded based on the expected relationship between the trait and fitness. For example, since fitness is expected to positively associate with body mass and immune response, no change of sign was needed for those effect sizes. However, since fitness is expected to negatively associate with latency to reproduce and corticosterone levels, we inverted the sign of those effect sizes before the analyses (all decisions are available in the data accessibility section). For the meta-analysis of variance, effect sizes were left unchanged as we expected an increase in variance across traits.
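The two data-preparation steps described above can be sketched as follows. This is an illustrative Python example with made-up rows and hypothetical column names, not the authors' data or code. It assumes the adjusted control sample size is kept as a fraction (the text does not say whether it was rounded) and that the sign is inverted only for mean effect sizes.

```python
from collections import Counter

# Hypothetical rows: three treatments in study S1 share one control group.
rows = [
    {"control_id": "S1_c1", "n_control": 30, "trait": "body mass", "lnrr": -0.10},
    {"control_id": "S1_c1", "n_control": 30, "trait": "immune response", "lnrr": -0.05},
    {"control_id": "S1_c1", "n_control": 30, "trait": "corticosterone levels", "lnrr": 0.15},
    {"control_id": "S2_c1", "n_control": 12, "trait": "latency to reproduce", "lnrr": 0.20},
]

# Traits expected to associate negatively with fitness (examples from the text).
NEGATIVE_FITNESS_TRAITS = {"latency to reproduce", "corticosterone levels"}


def prepare(rows):
    """Adjust shared-control sample sizes and code the sign of lnRR."""
    uses = Counter(r["control_id"] for r in rows)
    prepared = []
    for r in rows:
        new = dict(r)
        # Divide the control n by the number of comparisons that reuse it.
        new["n_control_adj"] = r["n_control"] / uses[r["control_id"]]
        # Flip the sign so that negative always means "worse for fitness".
        flip = r["trait"] in NEGATIVE_FITNESS_TRAITS
        new["lnrr_coded"] = -r["lnrr"] if flip else r["lnrr"]
        prepared.append(new)
    return prepared


adjusted = prepare(rows)
for r in adjusted:
    print(r["control_id"], r["trait"], r["n_control_adj"], r["lnrr_coded"])
```

The adjusted control sample size would then replace the original one when computing the effect sizes and their sampling variances, which down-weights comparisons that reuse the same control animals.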

| Meta-analyses and meta-regressions
We ran two multilevel meta-analytic (i.e. intercept-only) models, one for each type of effect size, to test whether developmental stress generally affects phenotype and fitness both at the mean (lnRR) and variance (lnCVR) levels, and two multilevel meta-regression models to test whether developmental stress effects differed across different types of traits. For meta-analytic models, we investigated unexplained variation across studies (after accounting for sampling variance) by estimating total and separate relative heterogeneity for each random effect (I²; Nakagawa & Santos, 2012; more in R script: 009_results_figures_and_tables.R), and absolute heterogeneity (Q) using the R package 'metafor' v.2.1-0 (Viechtbauer, 2010). For meta-regressions, we estimated the percentage of variance explained by the moderators (marginal R²; Nakagawa & Schielzeth, 2013).
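A minimal Python sketch of how relative heterogeneity can be partitioned across random effects is given below. It assumes the multilevel I² described by Nakagawa & Santos (2012), in which each variance component is divided by the sum of all components plus a 'typical' sampling variance (Higgins & Thompson, 2002). The variance components and sampling variances are hypothetical, and the authors' own implementation is in the R script cited above.

```python
def typical_sampling_variance(sampling_variances):
    """'Typical' within-study variance of Higgins & Thompson (2002)."""
    weights = [1.0 / v for v in sampling_variances]
    k = len(weights)
    sum_w = sum(weights)
    sum_w2 = sum(w**2 for w in weights)
    return (k - 1) * sum_w / (sum_w**2 - sum_w2)


def i_squared(variance_components, sampling_variances):
    """Return I² (in %) for each random effect and in total."""
    s2_m = typical_sampling_variance(sampling_variances)
    between = sum(variance_components.values())
    denominator = between + s2_m
    result = {name: 100.0 * s2 / denominator
              for name, s2 in variance_components.items()}
    result["total"] = 100.0 * between / denominator
    return result


# Hypothetical variance estimates for the four random effects in the models.
components = {"observation": 0.04, "study": 0.03,
              "species": 0.01, "phylogeny": 0.00}
vi = [0.02, 0.02, 0.02, 0.02, 0.02]  # hypothetical sampling variances

i2 = i_squared(components, vi)
print(i2)
```

With these made-up inputs, 80% of the total variance is attributed to the random effects rather than to sampling error, of which 30 percentage points sit at the study level.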

| Publication bias
We assessed publication bias, specifically small-study bias, by running a variant of Egger's regression that uses the meta-analytic residuals as the response variable, and the precision (i.e. the square root of the inverse of the sampling variance) as the moderator (Nakagawa & Santos, 2012). Publication bias occurs when specific effect sizes are overrepresented in the literature, and it is normally indicated by an overrepresentation of large effect sizes of small precision (small-study bias; Jennions et al., 2013; Rothstein et al., 2005). Additionally, we assessed potential temporal trends in effect sizes that could indicate a time-lag bias or decline effect by running a multilevel meta-regression that included year of publication as a z-transformed moderator (Nakagawa & Santos, 2012; Sánchez-Tójar et al., 2018). A decline effect consists of decreasing support for a specific research hypothesis over time as evidence accumulates and is normally identified by effect sizes becoming smaller over time (Jennions & Møller, 2002; Koricheva & Kulinskaya, 2019).
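The logic of both checks can be sketched in a few lines of Python. This is a deliberately simplified illustration: it fits an unweighted ordinary least-squares line and ignores the random effects, so it does not reproduce the multilevel models used in the study and gives no measure of uncertainty. All function names and numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def simple_ols(x, y):
    """Return (intercept, slope) of an ordinary least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def egger_on_residuals(residuals, sampling_variances):
    """Regress meta-analytic residuals on precision (1 / SE).

    An intercept clearly different from zero suggests small-study bias.
    """
    precision = [1.0 / math.sqrt(v) for v in sampling_variances]
    return simple_ols(precision, residuals)


def z_transform(values):
    """Centre on the mean and scale by the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical residuals built so that imprecise studies have larger values.
variances = [0.25, 0.04, 0.01, 0.0025]
precisions = [1.0 / math.sqrt(v) for v in variances]
residuals = [0.5 - 0.1 * p for p in precisions]

intercept, slope = egger_on_residuals(residuals, variances)
years_z = z_transform([2000, 2010, 2020])
print(intercept, slope, years_z)
```

Z-transforming publication year puts the moderator on a standard-deviation scale, so the intercept of the temporal-trend model refers to the mean publication year.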

| Random effects
All models included the following random effects: (a) observation ID, which represents the observational or residual variance that needs to be explicitly modelled in a meta-analytic model, (b) study ID, which encompassed those estimates obtained within each specific study, (c) species ID, which encompassed those estimates obtained for each species, and (d) phylogeny, which consisted of a phylogenetic relatedness correlation matrix. To build the phylogeny, we searched for our species in the Open Tree Taxonomy (Rees & Cranston, 2017) and retrieved the phylogenetic relationships from the Open Tree of Life (Hinchliff et al., 2015) using the R package 'rotl' v.3.0.5 (Michonneau et al., 2016). We estimated branch lengths following Grafen (1989)

| Meta-analysis of variance
Overall, developmental stress increased variance by around 4% on average, albeit uncertainty was high (Table 1, Figure 1). The effect of developmental stress on variance differed depending on the trait studied, with reproduction showing the largest increase in variance (ca. 21% on average) (Figure 2). However, the percentage of variance explained by the trait moderator was less than 1% (Table 2), indicating that most heterogeneity remained unexplained.

TABLE 1 Results of the meta-analyses testing the effect of developmental stress on mean (lnRR) and variance (lnCVR) in phenotype and fitness. The results of the Egger's regression tests are also shown.

| Meta-analysis of mean
Our results showed that, on average, developmental stress negatively affected mean trait values by around 13% (Table 1, Figure 1). The meta-regression showed that developmental stress negatively affected all traits, with the strongest effects being on reproduction (ca. 21% on average) and behaviour (ca. 16% on average; Table 2, Figure 2). Nonetheless, heterogeneity remained high even after including the trait moderator (Table 2). Our additional meta-analysis based on a standardized mean difference effect size led to a meta-analytic mean that was very similar to that of the original study (i.e. Eyck et al., 2019; see Supplementary material S3). However, our phylogenetically corrected meta-analytic mean was much more uncertain (i.e. wider credible intervals overlapping zero). Additionally, relative (I²) and especially absolute (Q) heterogeneity were lower, indicating lower unexplained variation across studies in our meta-analysis (see Supplementary material S3).

| PUBLICATION BIAS
The intercepts of the Egger's regressions were clearly different from zero, suggesting the existence of publication bias (small-study bias) at both mean and variance levels (Table 1). The meta-regressions testing for temporal trends in effect sizes showed a small effect size reduction over time at both mean and variance levels, but the trends were uncertain and the percentage of variance explained by the moderator was essentially zero (Table 2).

| DISCUSSION
Combining a recent advance in meta-analytic methodology and a case study, we demonstrate how meta-analysis of variance can shed light on important biological processes. We showed that developmental stress not only negatively affects mean trait values, but can also increase total trait variance. Our results have also revealed that developmental stress affects reproduction most strongly, both at the mean and at the variance level. Overall, we encourage meta-analysts to focus on both mean and variance effects to unearth previously overlooked effects.

| Case study: developmental stress effects
Developmental stress effects on phenotype and fitness have been studied often. For example, studies have investigated the effects of different developmental stressors on morphology and coloration (Tschirren et al., 2009), attractiveness (Kahn et al., 2012), social network position (Boogert et al., 2014), telomere dynamics (Grunst et al., 2019) and fitness (Arbuthnott & Whitlock, 2018). Several reviews and meta-analyses have attempted to synthesize how different developmental stressors influence phenotype and fitness. However, the majority focused on mean effects (e.g. English & Uller, 2016; Eyck et al., 2019; Macartney et al., 2019), with only a few recent meta-analyses exploring the effects of specific developmental stressors on variance. For example, O'Dea et al. (2019) showed that experimentally increasing developmental temperature leads to an ~8% average increase in phenotypic variance across 43 fish species, which could facilitate adaptation to novel environments by increasing the number of rare phenotypes in the population. Dietary restriction during development has also been shown to lead to an ~9% average increase of variance in longevity across 14 animal species, which may affect the strength of selection on longevity (Senior et al., 2017). In contrast, lower quality/quantity diets during development have been shown to lead to an ~8% average decrease of variance in risk-taking behaviour across animal species, suggesting that individuals may converge on a high-risk behavioural phenotype under developmental diet stress (Moran et al., 2020). In all, these recent meta-analyses of variance provide good evidence that developmental stress can affect phenotypic variance and the opportunity for selection.
Our results first confirm that overall, developmental stress negatively affects mean trait values, with the strongest effects on reproduction (ca. 21%) and behaviour (ca. 16%). Furthermore, our meta-analysis of variance revealed that even when multiple developmental stressors (e.g. physiological, environmental, nutritional) are considered together, as in our study, developmental stress also leads to a small increase of around 4% on average in trait variance, with that effect being mostly driven by an increase in variance of around 21% on average in reproduction. Thus, the increase in variance observed in experimental versus control treatments is in agreement with the recent meta-analyses above (Senior et al., 2017; O'Dea et al., 2019). Furthermore, our results seem to agree with two other recent meta-analyses of variance showing that environmental stress (i.e. not only during development) measured as single-food diets and high temperature increase variance in fitness and reproductive success (García-Roa et al., 2018), respectively. Our results on variance in reproduction, specifically, confirm previous theoretical predictions (Martin & Lenormand, 2006) and recent experimental work (Martinossi-Allibert et al., 2017). Since a recent meta-analysis showed that environmental stress can increase both genetic and residual variances (i.e. not just total phenotypic variance; Rowiński & Rogell, 2017), developmental stress could have far-reaching evolutionary consequences.
Overall, our study shows that developmental stress may lead to increased opportunity for selection; however, these results should be interpreted carefully as most of the heterogeneity in our models remained unexplained.

FIGURE 1 Developmental stress negatively affects the mean and slightly increases variance in trait values. Points and associated error bars correspond to posterior modes and 95% highest posterior density intervals (HPDI) from the meta-analyses. The posterior distributions with vertical lines indicating the median are plotted on top of their respective modes and 95% HPDI.

| Promoting meta-analysis of variance
Our results show how meta-analysing variances alongside means can unearth otherwise overlooked effects and contribute to our understanding of biological processes. Indeed, all but one (93%) of the meta-analyses of variance performed in the field of ecology and evolution (see Supplementary material S1) revealed important variance effects that otherwise would have remained unknown.
Calculating lnCVR for a meta-analysis of variance requires essentially the same information needed to estimate commonly used effect size statistics for comparing means such as Hedges' g (Hedges, 1981) and lnRR (Hedges et al., 1999). Specifically, one simply needs the mean, SD and sample size for the two groups being compared. Since over 60% of published meta-analyses in ecology and evolution compare means (Nakagawa & Santos, 2012; Koricheva & Gurevitch, 2014), meta-analysis of variance could be applied to most meta-analytic data sets in the field, even retrospectively.
Nonetheless, there are some limitations that meta-analysts need to know when conducting a meta-analysis of variance. First, as in the case of lnRR, only ratio scale data can be used to calculate lnCVR, and equations to derive lnCVR from other statistics such as F or t statistics are not available. Furthermore, lnCVR cannot be calculated for group-level proportional data. Second, absolute error variance will generally be larger for lnCVR than for mean-based effect size statistics. This large sampling variance will generally lead to lower levels of heterogeneity in lnCVR compared with mean-based effect size statistics (Table 1), and overall highlights that meta-analysing variances will usually be more data-hungry than meta-analysing means. Despite these limitations, meta-analysis of variance is rather uncomplicated, making it easy for meta-analysts to shift some of their preoccupations with averages to more variance-driven hypothesis testing and development.
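To see why the sampling variance of lnCVR tends to exceed that of lnRR, it helps to compare the commonly used large-sample approximations (Hedges et al., 1999; Nakagawa et al., 2015). They are written here under the simplifying assumption that means and SDs are uncorrelated, which drops the correlation terms of the full lnCVR estimator. The subscripts T and C (treatment and control) are our notation for this illustration.

```latex
\begin{align}
s^2_{\ln RR}  &= \frac{s_T^2}{n_T\,\bar{x}_T^2} + \frac{s_C^2}{n_C\,\bar{x}_C^2} \\
s^2_{\ln CVR} &= s^2_{\ln RR} + \frac{1}{2\,(n_T - 1)} + \frac{1}{2\,(n_C - 1)}
\end{align}
```

Under this approximation the two extra terms depend only on sample size. For example, with ten individuals per group they add 2 × 1/18 ≈ 0.11 to the sampling variance regardless of the trait measured, which is one way to see why meta-analyses of variance tend to need more data.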
FIGURE 2 Developmental stress affects mean and variance differently across traits, with the strongest effects being on reproduction. Points and associated error bars correspond to posterior modes and 95% highest posterior density intervals (HPDI) from the meta-regressions. The posterior distributions with vertical lines indicating the median are plotted on top of their respective modes and 95% HPDI. Point size is proportional to the number of effect sizes (see Table 2).

Meta-analysis of variance not only can reveal important biological realities, but can also help with making informed methodological decisions. By identifying whether the compared groups show unequal variances (i.e. whether there is heteroscedasticity), meta-analysis of variance can help meta-analysts choose between effect sizes that assume homoscedasticity (e.g. Cohen's d, Cohen, 1988; Hedges' g, Hedges, 1981), and those that incorporate heteroscedasticity (e.g. standardized mean difference with heteroscedasticity or SMDH, Bonett, 2008, 2009; see Supplementary material S3). This is important because not accounting for heteroscedasticity can cause parameter misestimation in meta-analysis (Bonett, 2008, 2009). Overall, we suggest that even when variance-based hypotheses are of no interest to the researcher, meta-analysis of variance can still be used as a powerful methodological tool for helping to choose the most appropriate effect size statistic.
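The consequence of this choice can be illustrated with a small Python sketch (hypothetical numbers, not the authors' code). It first uses lnVR to flag unequal variances for a single comparison, then contrasts a standardized mean difference based on the pooled SD, which assumes homoscedasticity, with one based on the square root of the average of the two group variances, the standardizer used by SMDH-type estimators. Small-sample corrections and sampling variances are omitted for brevity.

```python
import math


def ln_vr(sd_t, n_t, sd_c, n_c):
    """Log variability ratio with small-sample bias correction."""
    return (math.log(sd_t / sd_c)
            + 1.0 / (2 * (n_t - 1)) - 1.0 / (2 * (n_c - 1)))


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Standardizer that assumes equal variances (Cohen's d, Hedges' g)."""
    return math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                     / (n_t + n_c - 2))


def average_sd(sd_t, sd_c):
    """Standardizer that does not assume equal variances."""
    return math.sqrt((sd_t**2 + sd_c**2) / 2.0)


# Hypothetical comparison: small, highly variable treatment group.
mean_t, sd_t, n_t = 8.0, 4.0, 10
mean_c, sd_c, n_c = 10.0, 2.0, 30

heteroscedasticity = ln_vr(sd_t, n_t, sd_c, n_c)
d_pooled = (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)
d_unpooled = (mean_t - mean_c) / average_sd(sd_t, sd_c)
print(heteroscedasticity, d_pooled, d_unpooled)
```

Here the pooled SD is dominated by the larger, less variable control group, so the pooled estimate is larger in magnitude than the unpooled one. In a real meta-analysis the evidence for heteroscedasticity would come from the meta-analytic mean of lnVR or lnCVR rather than from a single comparison.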

| CONCLUSION
Our analyses on the effects of developmental stress on both mean and variance in phenotype and fitness showcase how meta-analysing variances alongside means can help unravel crucial processes.
Importantly, meta-analysing variances is not limited to ecology and evolution, and can also advance disciplines such as agriculture (Knapp & Heijden, 2018), social sciences (O'Dea et al., 2018) and medicine (Senior et al., 2016). We have also shown how meta-analysis of variance can be used as a methodological tool to make informed decisions on how to choose effect size statistics for the study of mean effects. Overall, a holistic understanding of the world requires moving beyond the world of means to incorporate the world of variance.

ACKNOWLEDGMENTS
We are grateful to the authors of the original publication for sharing data with us and answering our questions about the data, especially Harrison J.F. Eyck and Tim S. Jessop. We are grateful to Pietro D'Amelio for enlightening discussions about data visualization, and the reviewers and the editor for their helpful comments on the manuscript.

AUTHORS' CONTRIBUTIONS
AST contributed to conceptualization, data curation, formal analysis, investigation, methodology, project administration, provision of resources, provision of software, validation, visualization, writing the original draft of the manuscript, and review and editing of the manuscript. NPM contributed to investigation, methodology, validation, and review and editing of the manuscript. REO contributed to conceptualization, methodology, and review and editing of the manuscript. KR contributed to funding acquisition, supervision, and review and editing of the manuscript. SN contributed to conceptualization, methodology, funding acquisition, supervision, and review and editing of the manuscript.

COMPETING INTERESTS
We declare no competing interests.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/jeb.13661.

DATA AVAILABILITY STATEMENT
All data and code are available at Zenodo (http://doi.org/10.5281/zenodo.3843018), and brms models are available as '.RData' at the Open Science Framework (https://doi.org/10.17605/OSF.IO/YJUA8).

TABLE 2 Results of the meta-regressions testing whether the effect of developmental stress on mean (lnRR) and variance (lnCVR) differs across traits. The results of the meta-regressions assessing temporal trends in effect sizes are also shown.