Ornaments indicate parasite load only if they are dynamic or parasites are contagious

Abstract
Choosing to mate with an infected partner has several potential fitness costs, including disease transmission and infection-induced reductions in fecundity and parental care. By instead choosing a mate with no, or few, parasites, animals avoid these costs and may also obtain resistance genes for offspring. Within a population, then, the quality of sexually selected ornaments on which mate choice is based should correlate negatively with the number of parasites with which a host is infected (“parasite load”). However, the hundreds of tests of this prediction yield positive, negative, or no correlation between parasite load and ornament quality. Here, we use phylogenetically controlled meta-analysis of 424 correlations from 142 studies on a wide range of host and parasite taxa to evaluate explanations for this ambiguity. We found that ornament quality is weakly negatively correlated with parasite load overall, but the relationship is more strongly negative among ornaments that can dynamically change in quality, such as behavioral displays and skin pigmentation, and thus can accurately reflect current parasite load. The relationship was also more strongly negative among parasites that can transmit during sex. Thus, the direct benefit of avoiding parasite transmission may be a key driver of parasite-mediated sexual selection. No other moderators, including methodological details and whether males exhibit parental care, explained the substantial heterogeneity in our data set. We hope to stimulate research that more inclusively considers the many and varied ways in which parasites, sexual selection, and epidemiology intersect.

tone, and reflectance) of plumage or skin patches that have a sexual signalling function; 4) extended ornaments such as nests or bowers; 5) sexual display behaviours. We considered both long-range display behaviours which may be expressed in the absence of members of the opposite sex (such as calling in frogs and crickets) as well as short-range courtship behaviours targeted at a specific individual. Following Dougherty (2021a, 2021b), we focused on behavioural traits that represent either the energy or time invested into signal production (e.g. display duration, rate or intensity) or the motivation to signal (e.g. display latency). We excluded measures of signal complexity (e.g. song repertoire size) or signal composition that do not clearly relate to differences in energetic investment.
We considered all parasite taxa, including viruses, bacteria, single-celled protists and fungi, nematodes, platyhelminths, arthropods, and other animals. As metrics of parasite load, we accepted any quantification of parasite number measured at the individual host level.
Because the impact of infection typically varies substantially between individual hosts, even with the same number of parasites, we excluded cases where a disease symptom was measured (e.g. virus-induced lesions (Kortet et al., 2003)) rather than actual parasite load, unless there was a convincing correlation between symptoms and number of parasite individuals. We included experimental studies (i.e., those that use experimental exposure, or removal of parasites from naturally infected hosts) and observational studies (i.e., those that examined natural variation in parasite prevalence). The majority of included studies measured sexual ornaments and parasite load at the same time. However, we also included studies where parasite load and ornament quality were measured at different times (López, 1998; Lindström & Lundström, 2000; Hill & Farmer, 2005; Dawson & Bortolotti, 2006; Stephenson et al., 2020), and studies in which hosts were assigned to different experimental infection groups, but individual parasite load was not subsequently measured (e.g. Milinski & Bakker, 1990; Simmons, 1994; Zuk et al., 1998; Gilbert et al., 2016).

Calculating effect sizes
We obtained correlations directly from the meta-analyses by Møller et al. (1999), Garamszegi (2005), and Dougherty (2021a), or from the original papers. Where correlations were not available in the text, we either: 1) converted the results of statistical tests into the correlation coefficient using the formulae in Koricheva et al. (2013) (pp. 200-201); 2) calculated them from raw data made available by the authors or extracted from figures using the online tool WebPlotDigitizer v4 (https://apps.automeris.io/wpd/); or 3) calculated them from summary statistics presented in the text or extracted from figures. When sexual trait expression was presented as continuous data, but parasite load was presented as categorical data, we first calculated the standardised mean difference (Hedges' d) between two groups. If parasite data consisted of more than two groups, we compared sexual trait expression between the highest and lowest parasite load categories. We then converted the standardised mean difference between the two groups into the correlation coefficient using the equation in Koricheva et al. (2013) (p. 201). When estimates came from a paired experimental design (for example using a test statistic from a paired t-test), we assumed that measurements from the same individual had a correlation of 0.5. In this case, the effect size calculations are identical to those derived from independent-measures tests (Koricheva et al., 2013). We extracted all relevant effect sizes from each paper. This often resulted in multiple correlations per paper, either because studies reported data relating to multiple experiments, populations, or host species, or because they presented correlations between multiple ornaments or parasite species.
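The conversions described above can be illustrated with a minimal Python sketch. It uses the standard textbook formulae for Hedges' d, for converting d to r, and for converting a t statistic to r. We assume, but have not verified, that these match the formulae on the cited pages of Koricheva et al. (2013). The function names and the sign convention (group 1 is the high parasite load group) are hypothetical choices made for illustration.

```python
import math

def hedges_d(mean1, mean2, sd1, sd2, n1, n2):
    """Standardised mean difference with the small-sample correction J.

    Assumed convention: group 1 = highest parasite load category,
    group 2 = lowest, so a negative d means poorer ornaments in
    heavily infected hosts.
    """
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    j = 1 - 3 / (4 * (n1 + n2 - 2) - 1)  # small-sample bias correction
    return j * (mean1 - mean2) / pooled_sd

def d_to_r(d, n1, n2):
    """Convert a standardised mean difference to a correlation coefficient."""
    a = (n1 + n2) ** 2 / (n1 * n2)  # correction for unequal group sizes
    return d / math.sqrt(d ** 2 + a)

def t_to_r(t, df):
    """Convert a t statistic to a correlation coefficient (sign preserved)."""
    return t / math.sqrt(t ** 2 + df)

# Illustrative (made-up) summary statistics, not data from any study.
d = hedges_d(mean1=8.0, mean2=10.0, sd1=2.0, sd2=2.0, n1=10, n2=10)
r = d_to_r(d, n1=10, n2=10)
```

With equal group sizes the correction term a equals 4, so the conversion reduces to r = d / sqrt(d² + 4).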

Testing for publication bias
We tested for publication bias in two ways. First, we tested for a temporal change in the average correlation using a meta-regression with study year as a continuous fixed effect (and the same four random effects as above). A significant change in effect size over time has been suggested to represent publication bias (Jennions & Møller, 2002; Koricheva et al., 2013), although other changes in research practices may also play a role. Second, we tested for funnel plot asymmetry, which can be caused by publication bias against studies with small sample sizes or non-significant results (Koricheva et al., 2013; Nakagawa et al., 2021), using a meta-regression with study precision (inverse standard error) as a fixed effect (Moran et al., 2021; Nakagawa et al., 2021), and the same four random effects as above.
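As a rough illustration of the two tests above, the following Python sketch regresses Zr on a single moderator (study year or precision) using inverse-variance weights. It is a deliberately simplified fixed-effect version: the four random effects used in the actual models are omitted, so it should not be read as the analysis performed here. The function names are hypothetical.

```python
import math

def precision(n):
    """Inverse standard error of Zr, given that var(Zr) = 1 / (n - 3)."""
    return math.sqrt(n - 3)

def weighted_meta_regression(zr, n, moderator):
    """Inverse-variance weighted regression of Zr on one moderator.

    Simplification (assumption): no random effects, so this is a
    fixed-effect sketch rather than a multilevel meta-regression.
    Returns (slope, standard error of the slope, z statistic).
    """
    weights = [ni - 3 for ni in n]  # 1 / variance of each Zr
    total_w = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, moderator)) / total_w
    y_bar = sum(w * y for w, y in zip(weights, zr)) / total_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, moderator))
    sxy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, moderator, zr)
    )
    slope = sxy / sxx
    se = math.sqrt(1.0 / sxx)  # known-variance (fixed-effect) standard error
    return slope, se, slope / se

# Illustrative (made-up) inputs: sample sizes and Zr values for five studies.
ns = [10, 20, 30, 50, 100]
zrs = [-0.40, -0.25, -0.20, -0.12, -0.08]
slope_prec, se_prec, z_prec = weighted_meta_regression(
    zrs, ns, [precision(k) for k in ns]
)
```

A slope on precision that differs clearly from zero would point to funnel plot asymmetry; passing study year as the moderator instead gives the analogous test for a temporal trend.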

Statistical Analysis
Prior to analysis, all correlations were converted into the Fisher's Z transformation of the correlation coefficient (Zr) (Koricheva et al., 2013). We used Zr as the response variable for all analyses, and converted results back to r for presentation. The associated variance for Zr was calculated as 1/(n - 3) (Borenstein et al., 2021), with n being the total number of animals used in the test. The mean correlation was considered significantly different from zero if the 95% confidence intervals did not overlap zero. We calculated heterogeneity across each dataset using the I² statistic (Higgins et al., 2003). We also partitioned heterogeneity with respect to each of the four random factors, following Nakagawa & Santos (2012). I² values of 25, 50 and 75% are considered low, medium and high respectively (Higgins et al., 2003).

N = 83 species). The host taxonomic class categories used during meta-regressions are denoted by the illustrations on the right, and tree branch colours: insects and arachnids (orange), fish (yellow), amphibians (green), reptiles (teal), birds (blue) and mammals (pink).

Fig. S4. Histogram showing the distribution of P values from the simulation testing how results from our analysis of only male morphological traits measured at the same time as parasite load (N=259 effect sizes) were affected by a reduction in statistical power compared to our analysis of the full dataset (N=424 effect sizes).
We re-ran the overall meta-analysis model after randomly removing 165 rows from the data set. We repeated this random removal and re-analysis 1000 times, and found that in 35.8% of cases the P value was greater than 0.05 (Fig. S4). Across these 1000 datasets, the overall mean correlation between parasite load and ornament quality was -0.083 (bootstrapped 95% confidence intervals: -0.119 to -0.017).
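The logic of this random-removal simulation can be sketched in Python. This is an illustration only: it replaces the full meta-analysis model with a simple inverse-variance weighted mean and a normal-approximation test, and the function names, default seed, and example data are hypothetical.

```python
import math
import random

def pooled_mean_and_p(zr, n):
    """Weighted mean Zr and two-sided P value (normal approximation).

    Assumption: fixed-effect pooling, standing in for the full model.
    """
    ws = [ni - 3 for ni in n]
    total_w = sum(ws)
    mean_z = sum(w * z for w, z in zip(ws, zr)) / total_w
    z_stat = mean_z * math.sqrt(total_w)           # mean / standard error
    p = math.erfc(abs(z_stat) / math.sqrt(2))      # two-sided P value
    return mean_z, p

def subsample_simulation(zr, n, n_remove, n_reps=1000, seed=1):
    """Randomly drop n_remove rows, re-analyse, and repeat n_reps times.

    Returns the fraction of replicates with P > 0.05 and the mean
    back-transformed correlation across replicates.
    """
    rng = random.Random(seed)
    indices = list(range(len(zr)))
    keep_size = len(zr) - n_remove
    non_significant = 0
    mean_rs = []
    for _ in range(n_reps):
        keep = rng.sample(indices, keep_size)      # rows retained this round
        mean_z, p = pooled_mean_and_p(
            [zr[i] for i in keep], [n[i] for i in keep]
        )
        mean_rs.append(math.tanh(mean_z))
        if p > 0.05:
            non_significant += 1
    return non_significant / n_reps, sum(mean_rs) / len(mean_rs)
```

For the scenario described above, the call would take the 424 Zr values and sample sizes with `n_remove=165`, leaving 259 effect sizes in each replicate.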

PRISMA-EcoEvo checklist
The PRISMA-EcoEvo extension was published in 2021. It consists of a 27-item checklist and guidance for reporting systematic reviews and meta-analyses of primary research in ecology and evolutionary biology. Within each item, sub-items are given a percentage score (calculated using the Shiny app: https://prisma-ecoevo.shinyapps.io/checklist/). Higher item scores thus indicate that a higher proportion of sub-items are reported in the manuscript.