Evolution of seed mass associated with mating systems in multiple plant families

Abstract In flowering plants, the evolution of self‐fertilization (selfing) from obligate outcrossing is regarded as one of the most prevalent evolutionary transitions. The evolution of selfing is often accompanied by various changes in genomic, physiological and morphological properties. In particular, a set of reproductive traits observed typically in selfing species is called the “selfing syndrome”. A mathematical model based on the kinship theory of genetic imprinting predicted that seed mass should become smaller in selfing species compared with outcrossing congeners, as a consequence of the reduced conflict between maternally and paternally derived alleles in selfing plants. Here, we test this prediction by examining the association between mating system and seed mass across a wide range of taxa (642 species), considering potential confounding factors: phylogenetic relationships and growth form. We focused on three plant families—Solanaceae, Brassicaceae and Asteraceae—where information on mating systems is abundant, and the analysis was performed for each family separately. When phylogenetic relationships were controlled, we consistently observed that selfers (represented by self‐compatible species) tended to have a smaller seed mass compared with outcrossers (represented by self‐incompatible species) in these families. In summary, our analysis suggests that small seeds should also be considered a hallmark of the selfing syndrome, although we note that mating systems have relatively small effects on seed mass variation.

Seed mass is a key trait that is central to many aspects of plant ecology, such as viability, dispersal, and seedling growth (Baskin & Baskin, 2014; Moles et al., 2005; Stanton, 1984). Given that there is a trade-off between seed number and seed mass (Smith & Fretwell, 1974), De Jong et al. (2005) proposed a mathematical model predicting that seed mass variation would also be influenced by the evolution of selfing. This model stems from Haig's kinship theory of genetic imprinting (Haig, 1997), which postulates that optimal seed mass differs for mothers and for offspring, given the conflict of interests between maternally and paternally derived alleles. Because the degree of conflict should depend on the selfing rate of maternal plants, De Jong et al. (2005) predicted that outcrossing plant species tend to produce larger seeds than selfers. They tested this prediction using data from 265 common British grassland plants (Grime et al., 1988) and found that the mean seed mass of selfers was smaller than that of outcrossers. Although this was consistent with the model's prediction, the test was a simple comparison of mean seed mass and was potentially confounded by phylogeny or growth form, both of which are interrelated with breeding systems.
Here we tested the prediction of De Jong et al. (2005) more formally, using a mixed model-based regression analysis. We explicitly considered the following factors. First, we took phylogenetic confounding into account. We focused on three plant families, Solanaceae, Brassicaceae and Asteraceae, where information on mating systems is available for hundreds of species, and the analysis was performed for each family separately. We treated the genus-level phylogenetic information as a random effect in the framework of a Bayesian linear mixed model (the details are presented in the Materials and Methods). Second, we took growth forms into consideration, which have been suggested to be most strongly associated with seed mass (Moles et al., 2005). Whilst most species are herbaceous in the three families studied, non-herbaceous plants (trees, vines or shrubs) are also common in the Solanaceae (Table 1). We included this factor as a fixed effect in the mixed model, which allowed us to quantify the relative contributions of mating system and growth form to seed mass variation. Through this analysis, we generally found that selfers (represented by self-compatible species) tended to have smaller seed mass compared with outcrossers (represented by self-incompatible species) in the three families studied, although the result varied between families and methods. Recently, Mazer et al. (2020) also reported that, after accounting for climatic variables, seed mass was correlated with the mating system in the genus Clarkia. Our study, using 642 species from three families, complements that of Mazer et al. (2020) and further supports the correlation between seed mass and mating systems in a wider range of plant taxa.

| Data collection
In this study, we focused on three plant families, Solanaceae, Brassicaceae and Asteraceae, because their mating systems have been studied extensively and information is available for hundreds of species in these families (e.g. Goldberg et al., 2010; Grossenbacher et al., 2017). We first collected information on the mating systems of the species. As a proxy for selfing rate, we used information on self-incompatibility (SI) and self-compatibility (SC), because the available data on selfing rates were limited compared with the accumulated knowledge of SI and SC. We drew on the data of two papers (Goldberg et al., 2010 for Solanaceae, and Grossenbacher et al., 2017 for Solanaceae, Brassicaceae and Asteraceae) and also collected data by searching the literature manually, especially for genera in which mating system evolution is well studied (e.g. Petunia and Brassica). Goldberg et al. (2010) classified the mating system as 0, SI; 1, SC; 2, SI + SC; 3, SI + SC + Dioecy; 4, Dioecy; and 5, SI + Dioecy. In this study, we only used the species classified as 0 (SI) or 1 (SC). We then collected seed mass data for the species for which we had obtained mating system data, mostly from Kew's Seed Information Database (SID; http://data.kew.org/sid/). The SID reports a mean 1000-seed mass for each species when multiple records are available; here, we used these mean values for analysis.
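The filtering step described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R used in the study; the record layout, the species names and the seed mass values are hypothetical, and the log10 transformation is the one applied to 1000-seed mass in the data analysis.

```python
import math

# Mating-system codes as listed in the text (Goldberg et al., 2010).
SI, SC = 0, 1

# Hypothetical records: (species, mating-system code, mean 1000-seed mass in g).
# The values are invented for illustration and are not taken from the study.
records = [
    ("Species A", 0, 2.50),
    ("Species B", 1, 0.80),
    ("Species C", 2, 1.20),   # SI + SC: excluded
    ("Species D", 4, 3.00),   # Dioecy: excluded
    ("Species E", 1, 10.0),
]


def prepare(records):
    """Keep only SI (0) or SC (1) species and log10-transform seed mass."""
    rows = []
    for species, code, mass_g in records:
        if code not in (SI, SC):
            continue
        rows.append({
            "species": species,
            "mating": "SI" if code == SI else "SC",
            "log10_mass": math.log10(mass_g),
        })
    return rows


prepared = prepare(records)
for row in prepared:
    print(row["species"], row["mating"], round(row["log10_mass"], 3))
```

Species with mixed or dioecious classifications are dropped before any transformation, so the later models only ever see a binary SI/SC predictor.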
To deal with confounding by growth form, we obtained information on whether the species are herbaceous, trees, vines or shrubs. These data were mostly obtained from Engemann et al. (2016) and the Global Biodiversity Information Facility (GBIF; https://www.gbif.org/), but when they were not available, we searched the literature directly.
We obtained the genus-level phylogenetic information to be incorporated as a random factor in the Bayesian linear mixed model (MCMCglmm). We used the Phylomatic platform (Webb & Donoghue, 2005) with the phylogeny of Zanne et al. (2014). This dataset did not cover all the species but covered almost all the genera for which we obtained the phenotypic data. We therefore first generated a genus-level phylogenetic tree for each family by randomly choosing one available species per genus (Figure S1). Then, we included all the species used in the tree, assuming the same phylogenetic distance (0.000001) within each genus. We excluded genera in which phylogenetic data were not available for any species.
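One way to picture the tree construction described above is as a rewrite of the genus-level Newick string, in which each genus tip becomes a polytomy of its species. The sketch below is an illustration in Python, not the authors' script. The function name, the toy tree and the species labels are hypothetical, and shortening the genus branch by the within-genus distance (to keep root-to-tip distances unchanged) is an assumption that the text does not state.

```python
import re

WITHIN_GENUS = 0.000001  # within-genus distance stated in the text


def expand_genus_tips(newick, species_by_genus, eps=WITHIN_GENUS):
    """Replace each genus tip 'Genus:len' with a polytomy of its species.

    Each species gets a terminal branch of length eps, and the branch leading
    to the genus is shortened by eps so root-to-tip distances are unchanged
    (an assumption of this sketch). Genera with one species are relabelled.
    """
    def replace(match):
        genus, length = match.group(1), float(match.group(2))
        species = species_by_genus.get(genus)
        if not species:
            return match.group(0)  # no trait data for this genus: keep as is
        if len(species) == 1:
            return f"{species[0]}:{length:.6f}"
        tips = ",".join(f"{name}:{eps:.6f}" for name in species)
        return f"({tips}):{length - eps:.6f}"

    # A tip label follows '(' or ','; internal branch lengths follow ')'.
    # Plain decimal branch lengths only (no scientific notation).
    return re.sub(r"(?<=[(,])([A-Za-z_]+):([0-9.]+)", replace, newick)


# Hypothetical genus-level tree and species lists.
genus_tree = "((Solanum:1.0,Capsicum:1.0):1.0,Petunia:2.0);"
species_by_genus = {
    "Solanum": ["Solanum_a", "Solanum_b"],
    "Capsicum": ["Capsicum_a"],
    "Petunia": ["Petunia_a", "Petunia_b", "Petunia_c"],
}

species_tree = expand_genus_tips(genus_tree, species_by_genus)
print(species_tree)
```

Because all species in a genus are separated by the same tiny distance, the resulting tree carries phylogenetic signal only at the genus level, which matches the genus-level random effect described in the text.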
In total, we used 345 species (117 genera) for Asteraceae, 153 species (56 genera) for Brassicaceae and 144 species (15 genera) for Solanaceae (Table 1). An overview of the relationship between seed mass, mating systems, and growth forms is summarized in Figure 1.
The numbers of SI and SC species and of herbaceous and non-herbaceous (trees, vines or shrubs) species for each family used in this study are shown in Table 1. All the data, including references, are available in Table S1.

| Data analysis
All data analyses were performed using R version 3.6.3 (R Core Team, 2020). For each family, we first performed an analysis without phylogenetic information: a simple analysis of variance (ANOVA) in which mating systems (SI/SC) and growth forms were the explanatory variables and log10(1000-seed mass [g]) was the response variable, using the lm and anova functions. We also performed a regression analysis in which genus was a fixed effect, to quantify the between-genera variance of seed mass.
To deal with confounding by phylogeny, we fitted a Bayesian linear mixed model using the R package MCMCglmm (Hadfield, 2010), in which mating systems (SI/SC) and growth forms were the explanatory variables and log10(1000-seed mass [g]) was the response variable. Growth forms were classified into two categories: herbaceous or non-herbaceous (tree, vine or shrub) (Table 1).
We included the phylogenetic information as a random effect. We specified priors V = 1 and nu = 0.02 for the R and G structures, and a Gaussian distribution was assumed. We ran 1,000,000 iterations with a burn-in of 1000 iterations and a thinning interval of 1 (no thinning, as recommended by Link and Eaton (2012)). The MCMC simulations generally generated well-converged posterior distributions. The trees generated from the Phylomatic platform were mostly ultrametric, but we slightly corrected them with the force.ultrametric function of phytools (Revell, 2012) to make them usable in MCMCglmm. The R script, phenotypic data, and phylogenetic trees in newick format are available at GitHub (https://github.com/tsuchimatsu/seed_mass).
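As a quick check on the MCMC settings above, the number of posterior draws retained per chain follows from the iteration count, burn-in and thinning interval. The helper below is a hypothetical Python sketch; its argument names mirror the nitt, burnin and thin arguments of MCMCglmm, and it assumes the usual convention that stored samples equal (nitt - burnin) / thin.

```python
def stored_samples(nitt, burnin, thin):
    """Number of posterior draws kept after burn-in and thinning."""
    return (nitt - burnin) // thin


# Settings reported in the text: 1,000,000 iterations, burn-in 1000, thin 1.
n_kept = stored_samples(nitt=1_000_000, burnin=1_000, thin=1)
print(n_kept)  # 999000

# For comparison only: thinning every 10th draw would keep far fewer samples.
print(stored_samples(nitt=1_000_000, burnin=1_000, thin=10))  # 99900
```

With no thinning, nearly every iteration contributes to the posterior summaries, which is the practice the text attributes to Link and Eaton (2012).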

| RESULTS
We first performed ANOVA with mating systems (SI/SC) and growth forms as the explanatory variables and seed mass as the response variable. We found that the trend varied between plant families: Seed mass tended to be smaller in SC species compared with SI species

| DISCUSSION
We performed a meta-analysis on the relationship between seed mass and mating systems in three plant families (Solanaceae, Brassicaceae and Asteraceae), by explicitly taking potential confounding factors into account: phylogenetic relationships and growth forms. We found that SC species generally show smaller seed mass compared with their SI congeners (Figure 1; Table 3). Whilst we obtained mixed support in ANOVA without the information of phylogeny (Table 2), we consistently detected significant effects of mating systems in the Bayesian linear mixed model, in which phylogenetic relationships were controlled as a random factor.
Several previous studies have reported that, in specific taxa, self-fertilizing species produce smaller seeds than their outcrossing congeners (e.g. Knies et al., 2004; Mazer et al., 2020; Mitchell-Olds, 2001; Sharma et al., 1999). Mazer et al. (2020) reported the correlation between mating system and seed mass in Clarkia while controlling for potential confounding factors such as climate variables.
Our study also found correlations between seed mass and mating system in a much wider range of plant taxa by controlling for possible confounding factors, thereby serving as a complementary study to Mazer et al. (2020). Our results support the emerging notion that small seed mass also constitutes a component of the selfing syndrome (Mazer et al., 2020; Ornduff, 1969; Shimizu & Tsuchimatsu, 2015; Sicard & Lenhard, 2011). One caveat, however, is that although the effect of mating systems was significant, its contribution to seed mass variation was relatively small. Rather, we found that a large portion of seed mass variation was due to between-genera variance. Given that the evolutionary transition from SI to SC occurs frequently within each genus (Goldberg et al., 2010; Igic et al., 2006; Shimizu & Tsuchimatsu, 2015), the portion of seed mass variation explained by mating systems would be relatively limited.
Nonetheless, our results are consistent with the theoretical model (De Jong et al., 2005), which predicted that outcrossing plant species tend to produce larger seeds compared with selfers according to Haig's kinship theory of genetic imprinting (Haig, 1997

TABLE 2 Summary of an analysis of variance for mating systems and growth forms

both attributes might increase the ability for colonization (Mazer et al., 2020). This is because selfing plants can reproduce without mates or pollinators (Baker, 1955; Darwin, 1876), and smaller seeds are expected to disperse farther than larger ones (Greene & Johnson, 1993; Tamme et al., 2014). Second, sex allocation theory predicts that the pollen-ovule (P/O) ratio should increase linearly with increasing seed mass amongst seed plants (Charnov, 1986; Götzenberger et al., 2006). Götzenberger et al. (2006) indeed found a positive correlation between the P/O ratio and seed mass through a meta-analysis. Because the P/O ratio tends to be lower in selfing species (Cruden, 2000), the correlation between mating system and seed mass could have arisen as a consequence.
These hypotheses are not mutually exclusive; thus, it is possible that these effects jointly led to the observed correlation between mating system and seed mass. Further analysis including these factors as covariates may help quantify the direct effect of mating systems. Willi (2013) tested the effect of parental conflict on seed mass variation by diallel crosses in Arabidopsis lyrata. That study showed that seeds were larger when pollen came from another outcrossing population than when pollen came from a selfing population or from the same population, providing support for the idea of parental conflict between male-derived selfish genes and female recognition genes. Cailleau et al. (2018) also tested the effect of parent-offspring conflict over resource allocation in seed development through diallel crosses in maize. Raunsgard et al. (2018) found that, in Dalechampia scandens, more outcrossed paternal populations produce larger seeds when crossed with less outcrossed maternal populations, and vice versa. Given these findings, it is possible that mating systems, which affect the degree of parental conflict, have important roles in determining seed mass variation across species.
Thus, the association between seed mass and mating systems may partly be influenced by polyploidy, but it would rather have an opposite effect, given that polyploid species tend to produce larger seeds (e.g. Bretagnolle et al., 1995; Eliášová & Münzbergová, 2014; Miller et al., 2012; Stevens et al., 2020). Whilst we did not distinguish diploid and polyploid species in this study, the association between seed mass and mating systems may even become stronger when the ploidy level is controlled. We note, however, that the mechanism of self-incompatibility varies between families, which also makes the effect of polyploidization on mating system transition different.
Specifically, whereas polyploidization almost always disrupts SI in gametophytic SI systems, in sporophytic SI systems, polyploidization does not necessarily induce the loss of SI (e.g. Mable et al., 2004). It would be important to extend the analysis to a broader scale beyond the three families studied.
In summary, we found evidence that selfing species tend to produce smaller seeds after controlling for possible confounding factors in multiple plant families, albeit with a relatively small effect. This suggests that small seeds could also be considered a hallmark of the selfing syndrome, which was previously not well recognized (Mazer et al., 2020). Whilst several traits have been reported as components of the selfing syndrome, more traits may be found to be typical features of selfing species, thanks to the increase in accumulated phenotypic and genomic data in many taxa (Ornduff, 1969; Shimizu & Tsuchimatsu, 2015; Sicard & Lenhard, 2011).

ACKNOWLEDGEMENTS
We thank Shigeto Dobata, Tom de Jong, and an anonymous reviewer for helpful comments on the manuscript. This work was partially supported by JSPS KAKENHI (grant no. 19H03271), MEXT KAKENHI (grant nos. 17H05833 and 19H04851) and the Inamori Research Grant.

CONFLICT OF INTEREST
The authors have no conflict of interest to declare.

AUTHOR CONTRIBUTIONS
TT conceived and designed the research. HT, KC and TT collected the data. HT and TT performed the analysis. HT and TT wrote the paper.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/jeb.13949.

DATA AVAILABILITY STATEMENT
All the data and scripts are available at GitHub (https://github.com/tsuchimatsu/seed_mass).