Evidence of spontaneous selfing and disomic inheritance in Geranium robertianum

Abstract
Knowing species' breeding system and the mating processes occurring in populations is important not only for understanding population dynamics, gene flow processes, and species' response to climate change, but also for designing control plans for invasive species. Geranium robertianum, a widespread biennial herbaceous species showing high morphological variation and wide ecological amplitude, can become invasive outside its distribution range. A mixed-mating system may be expected given the species' floral traits. However, autonomous selfing is considered a common feature. Genetic variation and structure, and thus population mating processes, have not been investigated in wild populations. We developed 15 polymorphic microsatellite markers to quantify genetic variation and structure in G. robertianum. To investigate whether selfing might be the main mating process in natural conditions, we sampled three generations of plants (adult, F1, and F2) for populations from the UK, Spain, Belgium, Germany, and Sweden, and compared open-pollinated with outcrossed hand-pollinated F2 progeny. The highly positive Wright's inbreeding coefficient (FIS) values in adults, F1, and open-pollinated F2 progeny and the low FIS values in outcross F2 progeny supported autonomous selfing as the main mating process for G. robertianum in wild conditions, despite the presence of attractive signals for insect pollination. Genetic differentiation among samples was found, showing some western-eastern longitudinal trend. Long-distance seed dispersal might have contributed to the low geographic structure. Local genetic differentiation may have resulted not only from genetic drift effects favored by spontaneous selfing, but also from ecological adaptation. The presence of duplicate loci with disomic inheritance is consistent with the hypothesis of allotetraploid origin of G. robertianum. The fact that most microsatellite markers behave as diploid loci with no evidence of duplication supports the hypothesis of ancient polyploidization. The differences in locus duplication and the relatively high genetic diversity across G. robertianum range despite spontaneous autonomous selfing suggest multiple events of polyploidization.


| INTRODUCTION
The breeding system in angiosperms can vary from autogamy (self-fertilization) to strict allogamy (obligate outcrossing). Strict allogamy may also evolve into a heteromorphic self-incompatibility system preventing selfing or into dioecy (Charlesworth, 2006; Richards, 1997). Autogamy can allow for purging deleterious recessive alleles by natural selection (Charlesworth & Charlesworth, 1987; Goodwillie et al., 2005) and facilitate colonization of new territories for pioneer species or occurrence in extreme or unpredictable habitats where pollinators are scarce or absent (Barrett, 2003; Hartfield et al., 2017; Kalisz & Vogler, 2003). However, it can reduce effective genome recombination and within-population genetic diversity (e.g., Bomblies et al., 2010; Jullien et al., 2019; Nordborg, 2000). Obligate outcrossing represents an advantage by mixing gene pools, increasing genetic diversity, and preventing inbreeding depression (Arista et al., 2017; Charlesworth, 2006), but it can require pollinating vectors, such as insects, birds, or bats, and a sufficient number of compatible mates or extensive gene flow between populations for ensuring reproductive success (Berjano et al., 2013; Menz et al., 2011). Retaining facultative self-pollination, in particular delayed autonomous selfing, can offer reproductive assurance when outcrossing has not occurred in case of limited pollinator service (Busch & Delph, 2012; Kalisz & Vogler, 2003). Pollinator service may be limited in fragmented habitats or in case of temporary unfavorable environmental conditions (Arista et al., 2017; Goodwillie & Weber, 2018). Therefore, many species are characterized by a mixed-mating system that guarantees seed production despite a risk of inbreeding depression in the progeny (Goodwillie et al., 2005, 2010; Kalisz et al., 2004).
Outcrossing species usually possess attractive floral traits for pollinators, for example, a high number of colored flowers and nectar reward, whereas autonomous selfers often have reduced floral display and nectar reward (Bartoš et al., 2020; Goodwillie et al., 2010; Sicard & Lenhard, 2011). Knowing species' breeding system and quantifying the mating processes (outcrossing and selfing rates) that occur in populations are important for understanding population dynamics, gene flow processes, and potential species' response to climate change (Charlesworth, 2006; Razanajatovo et al., 2020). They are also important for designing conservation recovery plans for endangered species and control plans for invasive exotic species (Barrett, 2010; Dudash & Murren, 2008). For instance, small populations of species with a self-incompatibility system require a high number of compatible mates for successful demographic and genetic restoration, whereas inbreeding issues may be found for species with a mixed-mating system, requiring genetic rescue of small populations (e.g., Menges, 2008; Olivieri et al., 2016; Van Rossum, Destombes et al., 2021). Autonomous selfers may easily produce seeds and naturalize, and may therefore become potentially invasive outside their distribution range (Antoń & Denisow, 2018; Razanajatovo et al., 2016). Exclusion and pollination experiments can give insights on whether species are self-compatible or self-incompatible (e.g., Bartoš et al., 2020), but genetic studies using molecular markers can allow for quantifying outcrossing rates, inbreeding levels, pollen dispersal processes, and genetic diversity and structure in wild populations (e.g., Arista et al., 2017; Bomblies et al., 2010; Charlesworth, 2006; Gelmi-Candusso et al., 2017; Jacquemart et al., 2021).
Geranium robertianum L. (Geraniaceae) is a common, biennial (annual), ruderal herb and is highly variable morphologically. The species shows a wide ecological amplitude, mainly occurring in woodlands and hedge banks, but also in various open habitats, such as grasslands, wastelands, railway banks, skeletal soils, and walls, on calcareous and acidic soils (Tofts, 2004; Vandelook & Van Assche, 2010; Wierzbicka et al., 2014). It is widely spread in its native distribution area in Europe, and naturalized in temperate regions of many other continents, where it can become invasive (Tofts, 2004). Individual plants bear between 10 and 300 pink flowers (12-17 mm diameter), usually slightly protandrous, sometimes homogamous or protogynous (Bertin, 2001; Tofts, 2004).
The dehiscing of the five inner anthers usually precedes the lengthening of the style and stigma receptivity. When the inner stamens wither, the five outer anthers move to the center of the flower around the style and dehisce (Knuth, 1908; Tofts, 2004).
Flowers stay open for two to five days (Tofts, 2004; F. Vandelook, personal observation), which is similar to other Geranium species (e.g., Willson et al., 1979). Generally, five seeds per fruit are produced (Tofts, 2004). Flowers produce nectar and are visited by insects, in particular butterflies, syrphid flies, wild bees, and honey bees (Endress, 2010; Tofts, 2004; Yeo, 1973), suggesting outcrossing. Self-fertilization is, however, possible, as stigmas during elongation can be covered with pollen of the inner whorl of stamens before possible outcrossing events, and when the stigmas standing above the dehiscing outer anthers recurve (Knuth, 1908; Tofts, 2004), allowing for prior and delayed autonomous selfing.
Autonomous selfing has been considered a common feature (Bertin, 2001; Yeo, 1973, 1985). Consequently, mixed mating likely occurs in G. robertianum. However, population mating processes have never been investigated in the field using codominant molecular markers to estimate genetic variation and inbreeding levels. Besides, plants only reproduce by seeds, which are dispersed not only at short distances by carpel projection but also at long distances by epizoochory (Tofts, 2004; Yeo, 1973). As a result, genetic variation and structure patterns may be contrasted according to mating processes and short- and long-distance seed dispersal (e.g., Bomblies et al., 2010; Gelmi-Candusso et al., 2017; Helsen et al., 2015; Jacquemart et al., 2021). Moreover, due to its wide distribution range combined with a wide ecological amplitude, G. robertianum appears as an interesting model for studying local adaptation and response to climate change (Hoffmann & Sgrò, 2011; Wierzbicka et al., 2014). Therefore, we developed polymorphic microsatellite markers to quantify genetic variation and structure in G. robertianum. To investigate whether selfing might be the main mating process in natural conditions, we sampled three generations of plants (adult, F1, and F2) for populations from the UK, Spain, Belgium, Germany, and Sweden, and progeny obtained from outcross hand pollination was compared with progeny obtained under open-pollinated conditions.

| Study populations and sampling
To cover a wide ecological amplitude and geographic range of G. robertianum, 43 populations were selected from various calcareous or acidic habitats (e.g., forests, forest edges, grasslands, railway banks, sandy and shingle beaches), from the UK, Spain, Belgium, Germany, and Sweden (Figure 1, Table 1). [...] where F2 seed progeny was obtained after outcrosses (F2c) between F1 plants [...]

TABLE 2 Summary of crosses between populations and number of genotyped seed progeny per cross (for population codes, see Table 1)

TABLE 3 Characteristics of 15 microsatellite markers developed in Geranium robertianum. For each marker (and duplicate loci in GER17, GER35, GER42, GER45, and GER47 indicated as A and B), the forward and reverse sequences, repeat type, size of the original fragment (bp), number of alleles (An), allele size range, multiplex number, fluorescent dye, primer amount used in the multiplex PCR (pmol), and null allele frequency (with their 95% highest posterior density intervals) are given


| Population genetic structure at a wide geographic scale
To investigate population genetic structure patterns, we performed a principal coordinate analysis (PCoA) based on a standardized distance matrix using GenAlEx 6.5 (Peakall & Smouse, 2012) and Bayesian clustering analyses using STRUCTURE version 2.3.4

| Loci and scored alleles
Out of the 15 primer pairs, 10 could be interpreted to amplify diploid loci. Five primer pairs (GER17, GER35, GER42, GER45, and GER47) showed two to four peaks ascribed to different alleles (Figure S1). From the genotyping of the F2 progeny obtained by outcrosses (F2c) and of their maternal and paternal plants, the amplified regions for each primer pair could be interpreted as corresponding to two duplicate loci (Table 3), not overlapping for GER42, but overlapping for the four other markers (Figure S1).
However, the higher size of the peak allowed us to identify when two overlapping alleles occurred in both loci. For GER35, only one (rare) allele was found in both loci, and separating the two loci was easy. For GER17, GER45, and GER47, it can be difficult to distinguish both loci in some genotypes without data on maternal and paternal plants together with their progeny, and so we recommend not using them unless performing paternity analyses or

and GER47 did not appear to be duplicated and some other markers did not amplify.
We scored two to 16 alleles in the 20 loci for a total of 133 alleles (Table 3). Five loci showed evidence for null alleles as 95% HPDI differed from 0, but only GER26 showed a high null allele frequency (0.133; 95% HPDI: 0.087-0.184; Table 3). There was significant genotypic disequilibrium between 15 and 6 of the 190 pairs of loci after sequential Bonferroni correction (p < .05) for adults and F1 progeny, respectively.  (Table 4), and significantly lower than F1 and
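The 190 locus pairs mentioned above correspond to all pairwise combinations of the 20 scored loci (20 × 19 / 2). As a minimal sketch of how a sequential Bonferroni correction works on such a set of tests, the following Python example applies the step-down rule (the i-th smallest p-value is compared with alpha divided by the number of tests still remaining). The function name and the p-values are hypothetical and purely illustrative; they are not data from this study, and the software used for the actual tests is not assumed here.

```python
from math import comb


def sequential_bonferroni(p_values, alpha=0.05):
    """Step-down (sequential) Bonferroni correction.

    Returns a list of booleans in the input order: True means the test
    stays significant after correction.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p-value is tested against alpha/m, the next against
        # alpha/(m - 1), and so on.
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break  # every larger p-value is non-significant as well
    return significant


n_loci = 20
n_pairs = comb(n_loci, 2)  # number of locus pairs to test

# Hypothetical p-values (illustration only, not results from the study).
example_p = [0.0001, 0.0002, 0.04, 0.30] + [0.5] * (n_pairs - 4)
flags = sequential_bonferroni(example_p)
n_significant = sum(flags)
print(n_pairs, n_significant)
```

In this made-up example, a p-value of 0.04 would pass an uncorrected 5% threshold but not the corrected one, which is the point of the procedure when many locus pairs are tested at once.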

| Genetic structure at wide geographic scale
The PCoA distinguished Spanish samples from the other populations that showed some continuous variation, although UK samples tended to be separated from Belgian, German, and Swedish samples that overlapped (Figure 2). Within each region, adult, F1, and/or F2o generations overlapped, suggesting similar mating processes in the three generations. The Bayesian clustering analysis gave an optimal number of clusters at K = 2. The UK samples showed high membership (Q) values for cluster 1 (≥80% for 91% of the individuals) and clustered together with a few German and Belgian samples (e.g., AAL, BOI, DIN, DR2, HE2, NIE, and RAU) (Figure 3a). A second peak was found for DeltaK at K = 4, further distinguishing the DI2 population from Belgium (from which there were 21 samples) and some longitudinal trend for the continental populations (Figure 3b). The clustering was not related to habitat differences (Figure 1, Table 1).
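To make the DeltaK criterion referred to above more concrete, the sketch below computes it in the form in which it is commonly implemented: the absolute second-order difference of the mean log-likelihood across successive K values, divided by the standard deviation of the log-likelihood among replicate runs at K. This is an assumption about the usual definition, not a description of the exact settings used here. The function name and all log-likelihood values are hypothetical, chosen only so that the toy output shows a main peak and a smaller secondary peak.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Compute DeltaK for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K to a list of log-likelihood values, one per replicate
    run. Runs at a given K must not all be identical, otherwise the
    standard deviation is zero and the ratio is undefined.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # DeltaK is undefined for the smallest and largest K
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result


# Hypothetical replicate log-likelihoods (illustration only).
example_runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4501.0, -4499.0],
    3: [-4450.0, -4460.0, -4440.0],
    4: [-4380.0, -4382.0, -4378.0],
    5: [-4370.0, -4390.0, -4350.0],
}
dk = delta_k(example_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because the statistic needs both neighbouring values, it cannot be evaluated at the smallest or largest K tested, which is worth keeping in mind when a secondary peak is interpreted.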

| DISCUSSION
The highly positive inbreeding coefficient (FIS) values found for adults, F1, and F2o progeny supported the earlier hypothesis (Bertin, 2001; Yeo, 1985) that autonomous self-pollination is the main mating process contributing to seed production in wild populations of G. robertianum, and that outcross pollination was limited, despite the presence of attractive signals for insect pollination such as nectar production (Endress, 2010) and reports of pollinator visits (Bertin, 2001; Tofts, 2004). However, given the high number of flowers per plant, geitonogamous self-pollination might also be possible in case of pollinators visiting several flowers on the same plant (Goodwillie et al., 2010; Richards, 1997).
Moreover, crosses between closely related individuals, such as full siblings with the same multilocus genotype, resulting in biparental inbreeding, might also contribute to high FIS values (Bomblies et al., 2010). This needs to be verified by investigating within-population genetic variation with more samples (Leipold et al., 2020). Spontaneous autonomous selfing is often observed in annuals, weeds, and pioneer species such as G. robertianum, whereas outcrossing is more common in perennials and species occurring in stable vegetation communities (Bartoš et al., 2020; Charlesworth, 2006). For predominantly selfing species, outcrossing rates can also vary along the flowering season, depending on pollinator and resource availability (Jullien et al., 2021).
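To illustrate why highly positive inbreeding coefficients point toward selfing, the following sketch computes the coefficient for a single diploid locus as 1 − Ho/He (observed over expected heterozygosity) and converts it into an equilibrium selfing rate with the textbook relation s = 2F/(1 + F). That relation assumes that selfing is the only source of inbreeding and ignores biparental inbreeding and null alleles, so it is an illustrative simplification rather than the estimation procedure of this study. The function names, allele sizes, and genotype counts are hypothetical.

```python
from collections import Counter


def inbreeding_coefficient(genotypes):
    """Single-locus inbreeding coefficient, 1 - Ho/He.

    genotypes is a list of (allele_1, allele_2) tuples for diploid
    individuals. He is the simple (uncorrected) expected heterozygosity.
    """
    n = len(genotypes)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    return 1.0 - h_obs / h_exp


def equilibrium_selfing_rate(f):
    """Selfing rate implied by F at inbreeding equilibrium, s = 2F / (1 + F)."""
    return 2.0 * f / (1.0 + f)


# Hypothetical sample of 20 plants: mostly homozygotes, two heterozygotes.
sample = [(150, 150)] * 9 + [(154, 154)] * 9 + [(150, 154)] * 2
f_value = inbreeding_coefficient(sample)
s_value = equilibrium_selfing_rate(f_value)
print(round(f_value, 3), round(s_value, 3))
```

With two equally frequent alleles and only two heterozygotes among 20 plants, the deficit of heterozygotes relative to random mating is large, which is the pattern expected under predominant selfing.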
Some genetic differentiation among samples was found, but with no pronounced geographic pattern except for the UK and Spanish (Mallorca) samples and some western-eastern longitudinal trend.
Long-distance seed dispersal (Tofts, 2004) might have contributed to the low geographic structure, as found for the bird seed-dispersed Juniperus communis (Jacquemart et al., 2021) and for species showing epizoochorous seed dispersal, such as Anthyllis vulneraria (Helsen et al., 2015) and Dianthus carthusianorum (Rico & Wagner, 2016), as well as accidental introduction of seeds along with anthropogenic activities and infrastructures (Wierzbicka et al., 2014). Moreover, no evidence of reproductive isolation was found between the UK and German populations assigned to separate clusters as viable seeds and healthy plants were obtained from outcrosses (F. Vandelook, unpublished data). Local genetic differentiation between populations may have resulted not only from genetic drift effects promoted by spontaneous selfing, but also possibly from local ecological adaptation (Bomblies et al., 2010;Hartfield et al., 2017;Wierzbicka et al., 2014). To get a comprehensive view of genetic structure patterns and of their shaping factors, we need to expand the sampling within populations and across species' distribution range.
The presence of duplicate loci suggests that the species might be of polyploid origin, which is consistent with the hypothesis that G. robertianum is an allotetraploid resulting from hybridization between G. purpureum and another unknown parental species, based on chromosome numbers, morphological similarities, cytological observations, and nectar composition (Baker & Baker, 1976; Widler-Kiefer & Yeo, 1987; Yeo, 1973, 2004). Tetrasomic inheritance, that is, random pairing of four homologous chromosomes, leading to all possible combinations of up to four alleles per locus, can be expected for autotetraploids (Soltis et al., 2014; Stift et al., 2008). Disomic inheritance, with two separate pairs of two homologous chromosomes, is usually found in allotetraploids, but it can also become established in autopolyploids when whole-genome duplication is ancient, through the action of genetic drift combined with selection (Guo et al., 2015; Le Comber et al., 2010; Soltis et al., 2014). The fact that most microsatellite markers developed in the present study behave as diploid loci with no remaining evidence of duplication supports the hypothesis of ancient polyploidization (Yeo, 1973) and evolution to fixation of disomic inheritance in the genome of G. robertianum. Genetic drift and selection processes might have been promoted by the short generation times for this annual-biennial species (Tofts, 2004) [...] et al., 2014). The differences in locus duplication and the relatively high genetic diversity (Table 1) across the range of G. robertianum despite spontaneous autonomous selfing suggest multiple events of polyploidization (Soltis et al., 2014).
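The contrast between disomic and tetrasomic inheritance described above can be made explicit by enumerating the gametes expected from one tetraploid parent. The sketch below is a simplified illustration: it assumes strict bivalent pairing within each duplicate locus for the disomic case, and random pairing of the four homologues without double reduction for the tetrasomic case. The allele labels and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def disomic_gametes(locus_a, locus_b):
    """Gametes under disomic inheritance.

    Each gamete receives exactly one allele from each of the two
    duplicate loci, so alleles of the same locus never travel together.
    """
    return Counter(tuple(sorted(g)) for g in product(locus_a, locus_b))


def tetrasomic_gametes(alleles):
    """Gametes under tetrasomic inheritance without double reduction.

    Any two of the four allele copies can end up in the same gamete.
    """
    return Counter(tuple(sorted(g)) for g in combinations(alleles, 2))


# Hypothetical parent carrying four distinct alleles at one primer pair.
locus_a = ("a1", "a2")
locus_b = ("b1", "b2")
disomic = disomic_gametes(locus_a, locus_b)
tetrasomic = tetrasomic_gametes(locus_a + locus_b)
only_tetrasomic = sorted(set(tetrasomic) - set(disomic))
print(len(disomic), len(tetrasomic), only_tetrasomic)
```

Under these assumptions, gametes that carry both alleles of the same duplicate locus arise only with tetrasomic inheritance, so their absence in progeny of known parents is the kind of evidence that points to two separate disomic loci.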
Further testing of the developed molecular markers on G. purpureum and a comprehensive study of the population genetic structure of both species might help shed light on speciation processes and on possible relationships between population genetic structure based on molecular markers and morphological and environmental variation across the species' distribution range.

ACKNOWLEDGMENTS
We thank S. Godefroid for help in collecting leaf material, S. Le Pajolec for help with seed germination and cross experiments, W. Baert and P. Asselman for DNA extraction, A. Destombes and S. Contreras (Genoscreen) for microsatellite development, and two anonymous reviewers and the associate editor for constructive comments on the manuscript. Most of the plant material was sampled when F. Vandelook was enrolled at Philipps-Universität Marburg (Germany), with the support of D. Matthies.

CONFLICT OF INTEREST
No conflict of interest.

DATA AVAILABILITY STATEMENT
Individual multilocus genotypes are available at Zenodo (https://doi.org/10.5281/zenodo.4698869).