Effective population size in a partially clonal plant is not predicted by the number of genetic individuals

Abstract Estimating effective population size (N e) is important for theoretical and practical applications in evolutionary biology and conservation. Nevertheless, estimates of N e in organisms with complex life‐history traits remain scarce because of the challenges associated with estimation methods. Partially clonal plants capable of both vegetative (clonal) growth and sexual reproduction are a common group of organisms for which the discrepancy between the apparent number of individuals (ramets) and the number of genetic individuals (genets) can be striking, and it is unclear how this discrepancy relates to N e. In this study, we analysed two populations of the orchid Cypripedium calceolus to understand how the rate of clonal versus sexual reproduction affected N e. We genotyped >1000 ramets at microsatellite and SNP loci, and estimated contemporary N e with the linkage disequilibrium method, starting from the theoretical expectation that variance in reproductive success among individuals caused by clonal reproduction and by constraints on sexual reproduction would lower N e. We considered factors potentially affecting our estimates, including different marker types and sampling strategies, and the influence of pseudoreplication in genomic data sets on N e confidence intervals. The magnitude of N e/N ramets and N e/N genets ratios we provide may be used as reference points for other species with similar life‐history traits. Our findings demonstrate that N e in partially clonal plants cannot be predicted based on the number of genets generated by sexual reproduction, because demographic changes over time can strongly influence N e. This is especially relevant in species of conservation concern in which population declines may not be detected by only ascertaining the number of genets.

The observed number of plants in partially clonal plants reflects the contribution of both sexual reproduction, which generates genetic individuals or genets, and clonal reproduction, which generates ramets that are replicates of the same genet (as in the most common type of clonal or vegetative growth, occurring in 80% of angiosperms; Klimeš et al., 1997; Vallejo-Marín et al., 2010). Just as N e is imperfectly predicted by population census size (N C ) (Frankham, 1995; Palstra & Ruzzante, 2008; Waples et al., 2016), the number of ramets is notoriously a poor surrogate for N e in a partially clonal plant population (Chung et al., 2004; Tepedino, 2012), as well as a poor indicator of levels of genetic diversity (Mandel, 2010; Raabová et al., 2015). It is worth noting that the number of ramets also encompasses juveniles, and thus may not equal N C , because the latter is most commonly defined as the number of mature individuals (e.g., Frankham, 1995; Luikart et al., 2010; Nunney, 1991, 1995). The number of genets, which can only be ascertained through genetic analysis, is generally expected to be closer to N e than N C , albeit not equivalent.

It has been suggested that the relationship between clonality and N e is not straightforward because of confounding factors linked to other life-history traits (Campbell & Husband, 2005), such as lifespan and generation time (Nunney, 1993; Yonezawa, 1997), rate of selfing, and especially the variance of clonal and sexual reproductive contributions of individuals (Orive, 1993; Yonezawa et al., 2004).
However, clonal reproduction alone should not cause any significant change in N e , unless it occurs at extremely high rates and generates fixed heterozygosity (Balloux et al., 2003).
A review based on 63 iteroparous (i.e., capable of reproducing multiple times in a lifetime) species showed that only two traits, namely age at maturity and adult lifespan, explained half of the variance in N e /N C , demonstrating that the evolutionary implications of these two traits are consistent across taxa (Waples et al., 2016; see also Lee et al., 2011). Most of the investigations of N e in partially clonal plants are based on demographic estimators of N e , whereas exhaustive empirical comparisons of the number of ramets, number of genets, and N̂e obtained through genetic analyses are rare (Chung et al., 2004). In general, genetic estimates of N e are expected to be lower than demographic estimates because they combine the influence of all demographic factors (Nunney & Elam, 1994; Palstra & Ruzzante, 2008), which are difficult to account for simultaneously.
The partially clonal orchid Cypripedium calceolus L. (lady's slipper orchid) is a good model system to investigate how N e changes depending on the balance between clonal and sexual reproduction.
Some populations of the species are characterized by a ratio of sexual reproduction to clonal reproduction equal to 1:200, mainly as a result of limitations in seed germination or the absence of pollinators (Devillers-Terschuren, 1999; Kull, 1998, 1999; but see García et al., 2010). Moreover, seedling survival is generally low, as in other terrestrial orchid populations (Shefferson et al., 2020), with a probability of seeds reaching maturity estimated as 10⁻⁷ in Polish populations (Nicolé et al., 2005). Nevertheless, genetic analyses employing traditional molecular markers (allozymes, AFLPs, and microsatellites) have shown moderately high levels of genetic diversity even in small populations (N C < 500), and this has been mainly attributed to vegetative growth, genet longevity (30-100 years; Kull, 1999; 110-350 years according to Nicolé et al., 2005; with an age at reproductive maturity: 6-10 years old; Kull, 1999), and mating system by outcrossing (Brzosko et al., 2002; Fay et al., 2009; Gargiulo, Adamo et al., 2021; Kull & Paaver, 1997; Minasiewicz et al., 2018). Tremblay et al. (2005) suggested that N e /N C in orchids is particularly low because of pollinator-related limitations, and this is consistent with the observation that an increase in the variance in reproductive success decreases N e (Frankham, 1995; Nunney, 1991, 1993; Waples, 2016a). All else (i.e., generation length and age at maturity) being equal within a single species, populations in which sexual reproduction is less limited by pollinators (i.e., populations with a higher rate of sexual reproduction) should have a higher N e /N C ratio than populations with little sexual reproduction. Although effective population size should not be significantly affected by clonal reproduction unless sexual reproduction is very rare (Balloux et al., 2003), the vegetative spread will imply that larger individuals may sexually reproduce more, thus increasing variance in reproductive success and lowering N e .
In the present study, we asked how clonal versus sexual reproduction affected the effective population size of two populations of C. calceolus with different demographic histories. We start from the theoretical expectation that clonal reproduction lowers N e by increasing variance in reproductive success among individuals, and the constraints on sexual reproduction also lower N e by causing only a few plants to reproduce. In two populations of the same species, we expect that when sexual reproduction is less constrained, N e /N C would be larger. We used an exhaustive sampling strategy and analysed microsatellites and SNPs derived from double-digest restriction site-associated DNA sequencing (ddRADseq) to compare different sets of genetic estimates (which are influenced by different mutation rates and errors associated with different molecular marker types). We first assessed whether genetic data support the observation of different rates of clonal and sexual reproduction in the two populations. We then estimated contemporary N e with the linkage disequilibrium method (Hill, 1981;Waples & Do, 2008) using both microsatellites and ddRADseq-SNPs. We improved the precision of our N̂e confidence interval by subsampling the number of loci, and we corrected the bias in N̂e point estimates due to physical linkage among loci following Waples et al. (2016).

| Population sampling
The two Estonian populations of C. calceolus selected for the present study have been monitored annually since 1978 and 1985, respectively (Hurskainen et al., 2017; Kull, 1995, 1998, 2003). The continental population, hereafter "Ussisoo," is characterized by a generally stable demography (Table S1) and little fruit set. The insular population, hereafter "Kõrgessaare," occurs in a coastal forest on the Baltic Island of Hiiumaa at the border of a lagoon system (1-2 m above sea level) and includes abundant seedlings. Kõrgessaare is thought to have originated more recently than Ussisoo, probably around 100 years ago or less, after changes in the habitat type (Gargiulo et al., 2018; Kull & Paaver, 1997), with substantial population growth in the last few decades (Table S1).
Clonal growth in C. calceolus follows a phalanx strategy, with all ramets from the same clone (i.e., a clump) close to each other in a rounded shape ( Figure 1). However, especially when understory vegetation is abundant, different clones may be difficult to distinguish; recruitment within a clump may also occur (Nicolé et al., 2005). All emerging ramets from every single clump were sampled for leaf tissue (nondestructive sampling) and stored in silica gel (Chase & Hills, 1991). In Ussisoo, we collected 451 ramets from 35 putative clumps (exhaustive sampling of all visible plants in the population), and in Kõrgessaare, we collected ~700 ramets from >40 putative clumps (exhaustive sampling of all visible plants at the random coordinates chosen). Sampling was carried out in June 2019, and the Ussisoo population was translocated in August 2019 due to the expansion of the adjacent road.

| Microsatellite genotyping and analyses of multilocus genotypes
Genomic DNA was extracted with a modified CTAB method (Doyle & Doyle, 1987) and purified with a QIAquick PCR purification kit (QIAGEN, Manchester, UK). All samples were genotyped for 11 nuclear microsatellite (or simple sequence repeat, SSR) loci (Gargiulo et al., 2018;Minasiewicz & Znaniecka, 2014)  of scoring errors/null alleles. Samples belonging to the same populations/clumps were randomized at different steps of the analysis (DNA extraction, polymerase chain reaction, and capillary electrophoresis) to avoid batch effects (Bonin et al., 2004;Meirmans, 2015).
The total SSR data set obtained (i.e., including all ramets) is hereafter indicated as "raw data set." Multilocus genotypes (MLGs) were analysed in the R v4.0.5 (R Core Team, 2020) packages poppr v2.8.6 (Kamvar et al., 2014, 2015) and adegenet v2.1.3 (Jombart, 2008; Jombart & Ahmed, 2011) to obtain indices of genotypic diversity. In poppr, we identified identical MLGs and kept one representative for each MLG to generate the "MLG-based clone-corrected data set." Before proceeding with further analyses, we performed some checks aimed to avoid the overestimation of either clonal or sexual reproduction in the two populations. In poppr, we assessed (1) whether all replicates of the same MLG (i.e., putative clones) were truly part of the same genet and not randomly generated by sexual reproduction (psex probability; Parks & Werth, 1993; Arnaud-Haond et al., 2007; Figure S2) and (2) whether each distinct MLG actually belonged to a distinct genet and was not an artefact deriving from scoring errors (Arnaud-Haond et al., 2007; Halkett et al., 2005). To assess the second point, we estimated a genetic distance threshold for collapsing MLGs potentially deriving from scoring errors, using the function cutoff_predictor based on Bruvo's distances (see details in Figure S3). After establishing the genetic threshold, we recomputed indices of genotypic diversity in the data set obtained by collapsing potentially identical MLGs in multilocus lineages (MLLs; the related data set is hereafter indicated as "MLL-based clone-corrected data set") and produced a minimum spanning network to visualize relationships among MLLs.
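As a rough illustration of the clone-correction step described above (not the poppr implementation), the following Python sketch collapses ramets that share an identical multilocus genotype and keeps one representative per MLG. The ramet names, the two-locus genotypes, and the function name `clone_correct` are hypothetical and serve only as an example.

```python
from collections import OrderedDict


def clone_correct(ramets):
    """Keep the first ramet seen for each distinct multilocus genotype (MLG).

    `ramets` maps a ramet ID to a tuple of per-locus genotypes, each genotype
    being a pair of allele sizes. Alleles are sorted within a locus so that
    (180, 176) and (176, 180) count as the same genotype.
    """
    representatives = OrderedDict()
    for ramet_id, genotype in ramets.items():
        mlg = tuple(tuple(sorted(locus)) for locus in genotype)
        # setdefault stores the ramet only if this MLG has not been seen yet
        representatives.setdefault(mlg, ramet_id)
    return representatives


# Hypothetical toy data: three ramets typed at two loci.
ramets = {
    "R1": ((176, 180), (210, 210)),
    "R2": ((180, 176), (210, 210)),  # same MLG as R1, alleles listed in reverse
    "R3": ((176, 176), (210, 214)),
}
mlgs = clone_correct(ramets)
n_mlgs = len(mlgs)
```

This only handles exact matches; collapsing near-identical MLGs into MLLs additionally requires a genetic distance threshold, which the sketch does not attempt.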
The Pareto distribution is used to describe the distribution of ramets into MLLs and is influenced by both genotypic (clonal) richness and evenness. For example, when MLLs have comparable sizes (high evenness), the Pareto plot will result in a steeper slope (high β) (see Arnaud-Haond et al., 2007;Stoeckel, Porro, & Arnaud-Haond, 2021).
We used exact tests in Genepop v4.5.1 (Raymond & Rousset, 1995; Rousset, 2008) with default parameters to assess deviations from Hardy-Weinberg proportions (potentially indicative of deviations from random mating), and we compared the departures towards heterozygosity excess and deficit and calculated locus-level F IS . The number of private alleles, observed heterozygosity (H O ) and unbiased expected heterozygosity (uH E ), were computed in GeneAlEx v6.5 (Peakall & Smouse, 2006, 2012), and private allelic richness was computed in HP-Rare (Kalinowski, 2004, 2005) on the MLG-based and the MLL-based clone-corrected data sets. To test the hypothesis that outbred (i.e., highly heterozygous) plants have higher fitness (Alberto et al., 2005; Hämmerli & Reusch, 2003; but see Shefferson et al., 2018), we evaluated the correlation between individual heterozygosity and the number of ramets representing each MLL, as we can assume that plants with more ramets have lived for a long time. Individual heterozygosity was computed as the proportion of typed loci for which an individual clone was heterozygous (Hämmerli & Reusch, 2003). The allelic richness and F IS values per population were computed in FSTAT v.2.9.3 (Goudet, 2001). To assess the occurrence of nonrandom associations among loci, we computed the index of association (r d ) in poppr.
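The definition of individual heterozygosity given above (the proportion of typed loci at which a clone is heterozygous) can be sketched in a few lines of Python. The encoding of missing loci as `None`, the example genotype, and the function name are assumptions made for illustration.

```python
def individual_heterozygosity(genotype):
    """Proportion of typed loci at which a genet is heterozygous.

    `genotype` is a list of (allele_a, allele_b) pairs; a locus with missing
    data is encoded as None and is excluded from the denominator.
    """
    typed = [locus for locus in genotype if locus is not None]
    if not typed:
        raise ValueError("no typed loci for this individual")
    heterozygous = sum(1 for a, b in typed if a != b)
    return heterozygous / len(typed)


# Hypothetical genet typed at three of four loci, heterozygous at two of them.
genet = [(176, 180), (210, 210), None, (95, 99)]
h = individual_heterozygosity(genet)
```

Excluding untyped loci from the denominator keeps individuals with some missing data comparable to fully typed ones.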
To check for the occurrence of recent migrants and internal population structure, which can bias N̂e, we analysed the "MLG-based clone-corrected data set" through Bayesian clustering in Structure v.2.3.4 (Pritchard et al., 2000) using the Admixture model and no prior on sampling sites. We ran the analysis with 10⁵ burn-in, 10⁵ MCMC replicates, and 20 iterations, and tested K values (number of genetic clusters) ranging from 1 to 5. We evaluated the most likely K using the LnPr(X|K) method (Pritchard & Wen, 2003) and the ΔK Evanno method (Evanno et al., 2005) in Structure Harvester (Earl & vonHoldt, 2012). The results were summarized in CLUMPAK (Kopelman et al., 2015).

| Analysis of double-digest RAD sequencing data
We used the double-digest RAD sequencing (ddRADseq) protocol and data set obtained as in Gargiulo, Kull, et al. (2021). The data set included 31 ramets from Ussisoo, each collected from a different clump, and 32 ramets from Kõrgessaare (Table S2), all representing different clumps except three pairs of putative "biological replicates." Each pair of biological replicates includes ramets at short distances that may belong to the same genet (pairs: EK308-EK549, EK333-EK336, and EK471-EK206). We used one sample from Ussisoo as a technical replicate throughout ddRADseq library preparation and sequencing. De novo locus assembly was conducted in Stacks v2.4 (Catchen et al., 2013; Rochette et al., 2019) as detailed in Gargiulo, Kull, et al. (2021). We used the populations program of Stacks to filter ddRADseq data depending on the different assumptions and software programs required in our downstream analyses, as detailed in Table S3, in addition to filtering mostly aimed at reducing the influence of repetitive and paralogous loci expected in the large genome of C. calceolus (Gargiulo, Kull, et al., 2021). We checked the occurrence of loci potentially under the effect of selection using BayeScan v2.1 (Foll et al., 2010; Foll & Gaggiotti, 2008), to exclude them from the subsequent analyses focused on neutral demographic processes. To prevent the reduction of informative sites (due to our filtering strategies) from leading to the detection of false positives (Lotterhos & Whitlock, 2014), we performed the analysis on the "r80 data set" (i.e., the data set including loci shared by 80% of the samples in each population; see Table S3). Only the first SNP at each locus was included in the analysis (option in the Stacks populations program: write-single-snp). We set the prior odds of neutrality at 1000, the false discovery rate at 0.05, and the chain parameters at default values.
Potential deviations from the Hardy-Weinberg proportions were evaluated in Stacks populations and in vcftools v0.1.16 (Danecek et al., 2011) using exact tests, after excluding F ST -outlier loci (see also Table S3). P-values for the multiple comparisons were corrected using the p.adjust function in R, using the false discovery rate method. After excluding the loci deviating from the Hardy-Weinberg proportions, we estimated the average nucleotide diversity (π), H O , H E , F IS , fixation index (F ST ), and the number of private alleles in Stacks.
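For readers unfamiliar with the correction mentioned above, the `fdr` method of R's `p.adjust` corresponds to the Benjamini-Hochberg procedure. The Python sketch below reimplements that adjustment for illustration only; the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the raw p-value decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-locus p-values from exact Hardy-Weinberg tests.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

Loci whose adjusted p-value falls below the chosen threshold would then be flagged as deviating from Hardy-Weinberg proportions.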
The model implemented in fineRADstructure assumes linkage disequilibrium (LD) among SNPs, although a possible limitation of our data set is the relatively large size of our loci (>250 bp), so we cannot exclude historical recombination. Samples with high percentages of missing data were removed from the data set, as they may

| Estimation of effective population sizes N e
We employed the software NeEstimator v2.1 (Do et al., 2014) to estimate contemporary N e using the linkage disequilibrium method (LDN e ; Hill, 1981;Waples & Do, 2008). Confidence intervals for N̂e were obtained by jackknifing over samples (Do et al., 2014;Jones et al., 2016;Waples et al., 2021). Both marker types (SSRs and SNPs) were analysed, on genet-level data only, because linkage disequilibrium at the ramet-level would be mainly affected by clonal reproduction and not by genetic drift. Below, we detail the analytical procedure for both marker types, including some caveats and the corrections aimed to improve both the precision and the accuracy on N̂e.

| SSRs
We analysed both the MLL-based and the MLG-based clone-corrected data sets by excluding singletons (i.e., alleles only occurring in one heterozygote), and using different thresholds to screen out rare alleles (p-crit for allele frequencies equal to 0.05, 0.02, 0.01, and 0) (Waples et al., 2016; Waples & Do, 2010). As we used an exhaustive sampling strategy (i.e., of all visible ramets/genets, except very young seedling stages and protocorm stages that are not visible overground), we assumed that the sampled cohorts approached the generation length of the species. Therefore, our N̂e estimated from microsatellites should be close to the true N e , although a downward bias of at least 10% associated with mixed-age adult sampling cannot be ruled out. Note that we consider the genet age in the case of a partially clonal plant such as C. calceolus.
TABLE 1 Genotypic parameters associated with the SSR data set for the two populations of Cypripedium calceolus analysed in this study.

The linkage disequilibrium method is robust to some population structure and to migration rates <0.1, whereas for higher migration rates, N̂e approaches the N e of the metapopulation (Waples & England, 2011; see also Gilbert & Whitlock, 2015). To evaluate the influence of potential migrants on N̂e, we removed the admixed individuals as detected in Structure for both K = 2 and K = 3 (displaying a proportion of admixture >20%, see Section 3) and recomputed N̂e.

| SNPs
The SNPs data set only included one SNP at each locus (option in the Stacks populations program: write-random-snp, see Table S3) to reduce the influence of physical linkage among sites. We used different p-crit values for allele frequencies (i.e., 0.05, 0.02, 0.01, and 0) and excluded singletons. The LDN e method implemented in NeEstimator is particularly robust with RADseq data when the number of samples is ≥30 (Nunziata & Weisrock, 2018). As nonrandom missingness due to allele dropout may bias the N e estimation (Marandel et al., 2020), we first evaluated the correlation between the proportion of missing data and F IS at each locus, using Spearman's rank correlation test.
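The missingness check described above relies on Spearman's rank correlation, which is the Pearson correlation computed on ranks. The sketch below, written in plain Python for illustration, computes the coefficient only (not the associated p-value); the per-locus values and function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-locus proportion of missing data and F_IS.
missing = [0.00, 0.05, 0.10, 0.20, 0.30]
f_is = [0.01, 0.03, 0.02, 0.08, 0.10]
rho = spearman_rho(missing, f_is)
```

A coefficient near zero would suggest that missing data are not systematically associated with heterozygote deficit across loci.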
To compare confidence intervals for N̂e generated by jackknifing over samples with parametric confidence intervals, which generally will be too narrow when thousands of loci are used (Waples, 2021;Waples et al., 2021), we subsampled the SNPs data set by generating 40 random whitelists of 800 loci in Stacks populations (with one SNP per locus, see Table S3) and analysed these subsets in NeEstimator.
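The subsampling design described above (40 random subsets of 800 loci) can be sketched as follows. This is only an illustration of drawing the locus subsets: the pool of 5000 locus IDs, the seed, and the function name are assumptions, and how the resulting lists are passed to Stacks is not shown.

```python
import random


def make_whitelists(locus_ids, n_lists=40, n_loci=800, seed=1):
    """Draw `n_lists` random subsets of `n_loci` locus IDs without replacement.

    Each subset is drawn independently, so the same locus may appear in
    several subsets but never twice within one subset.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    return [sorted(rng.sample(locus_ids, n_loci)) for _ in range(n_lists)]


# Hypothetical pool of locus IDs standing in for the filtered catalogue.
locus_pool = list(range(1, 5001))
whitelists = make_whitelists(locus_pool)
```

Each subset would then be analysed separately, and the spread of the resulting point estimates summarized across the 40 runs.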
In addition, to further reduce potential biases due to the physical linkage among loci, we divided the point N̂e by 0.098 + 0.219 × ln(10) = 0.60, where 10 is the haploid chromosome number (Waples et al., 2016). To account for the confounding effect of migrants and/or population structure, we removed a potentially admixed individual detected in fineRADstructure from the data set and recomputed N̂e.
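The linkage correction described above amounts to dividing a point estimate by 0.098 + 0.219 × ln(number of chromosomes). A minimal Python sketch is given below; the function names and the input estimate of 30.0 are hypothetical and chosen only to show the arithmetic.

```python
import math


def ld_linkage_factor(n_chromosomes):
    """Expected ratio of the LD-based estimate to true N_e due to linkage."""
    return 0.098 + 0.219 * math.log(n_chromosomes)


def correct_ne(ne_hat, n_chromosomes):
    """Divide a point estimate by the linkage factor to offset the bias."""
    return ne_hat / ld_linkage_factor(n_chromosomes)


# With a haploid number of 10 chromosomes the factor is approximately 0.60.
factor = ld_linkage_factor(10)
# Hypothetical uncorrected point estimate, for illustration only.
corrected = correct_ne(30.0, 10)
```

Because the factor is below 1 for 10 chromosomes, the corrected estimate is always larger than the uncorrected one.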

| Microsatellite genotyping and analyses of multilocus genotypes
The raw data set was composed of 1123 samples, including 451 samples from Ussisoo and 672 from Kõrgessaare, with a negligible percentage of missing data (Figure S1). When replicated MLGs were excluded, the MLG-based clone-corrected data set counted 66 MLGs in Ussisoo and 191 in Kõrgessaare (Table 1).
To avoid the overestimation of clones in the two populations, we evaluated whether all replicates of the same MLG (i.e., putative clones) were truly part of the same genet and not generated by chance due to sexual reproduction. The probability of encountering a genotype more than once by chance (single method) indicated that multiple MLGs are part of a single genet (p-value associated with psex ≪ 0.05), except in five cases in Kõrgessaare ( Figure S2a cal MLGs (i.e., truly clonal ramets; Figure 2). In a few cases, for example, when boundaries among clumps were less clear, different MLLs deriving from sexual recruitment were found at small distances (i.e., <50 cm; Figure 2). Clump sizes were similar between the two populations, ranging from 1 to ~40 ramets. The Pareto β obtained in RClone ( Figure S4) was low, reflecting the occurrence of clonal reproduction. Pareto plots showed a similar trend in both populations, reflecting lower genotypic richness/higher evenness in Ussisoo and higher genotypic richness/lower evenness in Kõrgessaare (Table 1).
After applying a Bonferroni correction for multiple testing, we found evidence for deviations from Hardy-Weinberg proportions and high variance in F IS , differentially occurring depending on the data set considered (i.e., raw data set, MLL-based clone-corrected data set, and MLG-based clone-corrected data set; see Table S4). When removing clones, most of these deviations disappeared. In particular, Ussisoo still exhibited an excess of heterozygotes at locus Ccal_25,

TABLE 2 Genetic diversity and differentiation in the two populations of Cypripedium calceolus analysed in this study. For SSRs, values for MLG-based and the MLL-based clone-corrected data sets are identical when not reported otherwise. Columns: H E (SE); H O (SE); private alleles (rarefied).

Table 2). There was no correlation between individual-genet heterozygosity and the number of ramets representing that genet (R = 0.13, p = 0.34 at Ussisoo and R = 0.061, p = 0.45 in Kõrgessaare; Figure S5). Allelic richness based on the minimum sample size (64 diploid individuals) was 5.9 in Ussisoo and 3.4 in Kõrgessaare; F IS was 0.04 and 0.06 when considering, respectively, the MLG-based and the MLL-based clone-corrected data sets, in both populations (Table 2). The r d revealed associations among loci in each population that did not disappear when removing clones and when considering MLLs (p-value = 0.001) (Figure S6), especially in Ussisoo. However, these associations mostly affected different loci across populations, suggesting that they are not related to physical linkage among loci.

The analysis of genetic differentiation carried out in Structure showed that the most likely number of genetic clusters was K = 2 (Figure S7), with a few admixed individuals occurring in Kõrgessaare (Figure S8). We also report the results for K = 3, in which some internal structure emerges in Kõrgessaare (Figure S8).

| Analysis of double-digest RAD sequencing data
The results of the ddRADseq data analysis using the different filtering strategies are summarized in Table S3. In BayeScan, we detected two F ST -outlier loci for which we did not recover any significant correspondence on the NCBI database ( Figure S9), and we excluded them from the subsequent analyses (  (Figure 3). In Kõrgessaare, we observed a further population subdivision which reflected only partially the spatial distribution of samples. In Ussisoo, many sample pairs were highly related (with a coancestry coefficient sensu Malinsky et al., 2018 > 500). In general, most of the closest coancestry coefficients occurred between samples at short distances. The biological replicates EK549-EK308 were among the sample pairs with the highest coancestry coefficients.
However, six pairs (five in Ussisoo and one in Kõrgessaare) had a coancestry coefficient higher than the biological replicates, possibly because of missing data affecting samples differently ( Figure S10; see also Gargiulo, Kull, et al., 2021). Among the sample pairs with

| Estimation of effective population sizes N e
Estimates of N e obtained in NeEstimator differed somewhat according to the marker type used, especially in Kõrgessaare (Table 3). The analysis of the microsatellite data set (after excluding singletons) in Ussisoo produced an N̂e equal to  for the MLG-based clone-corrected data set and 20.6 (CI: 13.5-32.3) for the MLL-based clone-corrected data set. In Kõrgessaare, N̂e was 24 (CI: 13.9-39.5) and 24.1 (CI: 13-42.9) when considering respectively the MLG-based clone-corrected data set and the MLL-based clone-corrected data set. When 16 admixed individuals were removed, N̂e was 32.8 (CI: 18-59.5). In addition, accounting for the putative internal population structure (as observed for the second most-likely K value, K = 3) determined different estimates: 6.1 (CI: 2.0-20.2) for the purple cluster (53 individuals) and 15.5 (CI: 8-28.7) for the blue cluster (74 individuals) ( Figure S8; suggesting that although allele dropout may be the cause for positive F IS values, nonrandom missing data do not strongly affect genetic indices and thus N̂e. The analysis of the total SNPs data set (after excluding singletons) produced an N̂e equal to 25.5 (CI: 15.8-49.1) in Ussisoo and 5.7 (CI: 2.9-9.8) in Kõrgessaare. The latter estimate was still very small after removing the admixed sample EK538: 5.2 (CI: 2.8-9.0) (Table 3). We also evaluated how precision changed, depending on the LD among different loci, and reported the results obtained by analysing 40 different subsets of 800 loci in NeEstimator (Figure 4). In Ussisoo, the analysis of the 40 subsets (after excluding singletons) produced a median N̂e equal to 25.7 (empirical CI: 21.1-29.3), whereas in Kõrgessaare, the analysis of the 40 subsets produced a median N̂e equal to 5.9 (empirical CI: 4.3-6.7) (Table 3). For all estimates based on SNPs, a 40% downward bias due to physical linkage among loci can be expected in a species with 10 chromosomes (haploid number), according to the formula in Waples et al. (2016), although we expect that using only the first SNP in the data set was already mitigating some of the downward bias (Table 3).

| Effective population size (N e ) in partially clonal plants
Our study represents the first exhaustive comparison among the number of ramets, number of genets, and genetic contemporary estimate of N e , based on two molecular marker types (microsatellites and ddRADseq-SNPs). Our findings show that N e in a partially clonal plant cannot reliably be predicted based on the number of genets because demographic events (e.g., bottlenecks or declines) strongly influence N e .
In particular, we observed that despite the smaller number of ramets and genets in Ussisoo, and its lower rate of sexual reproduction, this population has higher genetic diversity and higher N̂e than Kõrgessaare, and this is consistent with the stable demogra- undergone an expansion, and this is also suggested by the larger N̂e obtained from the microsatellites data set including juveniles.
The differences between Ussisoo and Kõrgessaare show that generalization at the species level is not possible without a full understanding of the demographic changes different populations undergo, and that reproductive patterns alone cannot explain a small N e . In different species, differences among N e /N C ratios can be dictated by generation length and age at maturity (the longer these are, the more N e increases) and lifetime variance in reproductive success (higher variance decreases N e ) (Frankham, 1995; Nunney, 1991, 1993; Waples, 2016a). Within the same species, with generation length and age at maturity being similar, differences in N e among populations will especially depend on variance in reproductive success and demographic events.
In terms of variance in reproductive success, larger clumps (genetic individuals with more ramets, which are not always older than other genetic individuals) may sexually reproduce more than smaller clumps, and may also inbreed via geitonogamy (i.e., mating among ramets of the same clone). These factors may reduce N e by favouring larger individuals with more flowers and by causing inbreeding, respectively. The behaviour of pollinators may partially prevent geitonogamy, as pollinators tend to abandon clumps when they discover sexual deception or food deception in orchids that do not offer such rewards (Jersáková et al., 2006; Tremblay et al., 2005; Whitehead et al., 2015), but this may not be the case in other species. Moreover, the advantage of larger or older clumps may be reduced if some of the ramets die over the years, and this would counterbalance the reduction of N e via lifetime variance in reproductive success. Another phenomenon that could potentially counterbalance the reduction of N e via lifetime variance in reproductive success is vegetative dormancy (Davison et al., 2013; Lesica & Steele, 1994; Shefferson et al., 2001, 2020), which may cause some plants to skip breeding in some years, spreading lifetime reproductive success among individuals.

FIGURE 4 Estimates of the effective population size in the populations of Cypripedium calceolus at (a) Ussisoo and (b) Kõrgessaare, obtained from 40 different subsets of ~800 SNPs in NeEstimator (results reported in Table 3 and filtering strategies explained in Table S3). Dots indicate point estimates for N e , ordered by effective degrees of freedom, ranging between 560 and 941 in Ussisoo and 221 and 391 in Kõrgessaare. Shading indicates jackknife confidence intervals (Jones et al., 2016), whereas dashed lines indicate empirical 95% confidence intervals. Dots with asterisks indicate estimates obtained from the full data set (27,136 SNPs), included for comparison. Note the different scales of y-axes for the two populations.

TABLE 3 N̂e for the two populations of Cypripedium calceolus analysed in this study. Note: N, sample size; N̂e, effective population size estimate, calculated after excluding singletons (p-crit depending on sample size). In parentheses, confidence intervals (CI) are obtained by jackknifing over samples, except when empirical CI are specified. Refer to Figure S8 for the Structure results (purple and blue clusters). Refer to Table S3 for the filtering strategies applied to the SNPs data set (only one SNP per locus was included in all NeEstimator analyses). Removing sites significantly deviating from the Hardy-Weinberg proportions did not produce any change in the N e estimates. LD-related bias correction refers to the formula 0.098 + 0.219 × ln(10) = 0.60, used to correct the potential bias caused by physical linkage among a high number of loci (see Waples et al., 2016): N̂e/0.60.
Although the long lifespan of C. calceolus implies more opportunities for migration to have an impact on the patterns of LD, ecological observations in orchids suggest that local recruitment, especially at short distances from parental plants, predominates over recruitment of nonlocal seeds (Chung et al., 2009; Duffy et al., 2020; Hedrén et al., 2021; Jacquemyn et al., 2009; Zhang et al., 2019), and that long-distance dispersal, although important in the colonization stage, contributes less to gene flow once a stable population has been established (Hedrén et al., 2018). Such observations are corroborated by our genetic results, as we found clusters of similar genets in Ussisoo (Figure 3) and most juveniles were related to local adults in Kõrgessaare (Figure 2b). Therefore, we believe migration does not substantially influence our N̂e (see also Table 3), although some internal population structure may influence N̂e in Kõrgessaare (Figure S8).

The influence of pseudoreplication on confidence intervals
Pseudoreplication is also important to consider when estimating N e for species of conservation concern, because it may lead to adopting measures that are too severe if confidence intervals are much narrower than the true ones (Waples, 2021; Waples et al., 2021). We investigated the influence of pseudoreplication (i.e., the lack of independence among thousands of loci, as they occur within a much smaller number of chromosomes) on N̂e confidence intervals by evaluating the differences in precision between a SNP data set including ~30 K SNPs and subsets of 800 SNPs. The empirical CI across the subsets of 800 loci was narrower than the jackknife CI, and this is because

N e /N C ratios and implications for conservation
Although a small N e /N C ratio may be perceived as a typical feature of orchids, even in the absence of clonal reproduction (Trapnell et al., 2022; Tremblay & Ackerman, 2001; Tremblay et al., 2005), and the equilibrium among life-history traits can buffer genetic drift, populations are still susceptible to environmental and genetic stochasticity (Palstra & Ruzzante, 2008). In the terrestrial orchid Cremastra appendiculata, Chung et al. (2004) concluded that genetic diversity is maintained despite a small local N e , possibly because of metapopulation dynamics or because genetic diversity reflected past levels of diversity, whereas their N̂e reflected a contemporary estimate.
Similarly, we observed that indices of genetic diversity (e.g., heterozygosity) contrast with the N̂e found (see also Siol et al., 2007). In particular, N̂e is significantly smaller than the thresholds signalling critical genetic erosion (N e < 50; Jamieson & Allendorf, 2012; Frankham et al., 2014). In Kõrgessaare, we have explained the extremely small N̂e as the legacy of a founder effect and, if the recent population expansion continues, N e will probably increase in a few generations and the genetic signal of the founder effect will disappear. In Ussisoo, it is possible that moderate levels of genetic diversity despite small N̂e persist because of the long lifespan of genets, and thus reflect past levels of gene flow. Such time lags between the occurrence of demographic changes and the attainment of new equilibrium values of genetic parameters (Epps & Keyghobadi, 2015) are also known in the literature as "genetic extinction debt" (Honnay et al., 2006; Vranckx et al., 2012). We found the ratio N e /N ramets <0.05 in the two populations of C. calceolus, whereas the ratio N e /N genets was <0.3 (Table 3, when considering the estimates based on microsatellites, as SNPs were analysed only in a subset of samples). We assume that, for comparisons among species, estimating N e /N C ratios using the number of genets (ideally flowering genets, to avoid including juveniles and immature individuals) as a surrogate for the number of mature individuals in a population is more appropriate than using the number of ramets. Obtaining multi-generational N e estimates in a species with a long generation time, such as C. calceolus, remains challenging, but our results may be used for this purpose if complemented with data available in the future. Including populations from further sites that may have different demographic histories and a different balance between clonal and sexual reproduction may also offer a more robust picture of N e in this species.
In summary, we used a single-sample, contemporary estimate of N e to understand how effective population size changes depending on the balance between clonal and sexual reproduction. We provided the first exhaustive comparison between the number of ramets, number of genets, and contemporary N̂e. The N e /N ramets and N e /N genets ratios we provide may be used as reference points by researchers and practitioners interested in the magnitude of these ratios in their partially clonal species of interest, and in general for meta-analyses across species (Frankham, 2019, 2021).
As N e is notoriously difficult to calculate, we considered the factors potentially affecting our estimates, and we showed how estimates can differ when using different molecular marker types and different sampling strategies. In addition, the influence of pseudoreplication on confidence intervals concerns modern data sets regardless of the study species, and we showed that N e point estimates obtained from ~30 K loci and from subsets of ~800 loci were comparable despite slightly narrower confidence intervals for the larger data set.
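One simple way to summarise the spread of N e point estimates across subsets of loci is a percentile interval. The sketch below is an illustration under that assumption and is not necessarily how the empirical confidence intervals in this study were computed; the function name and the input values are hypothetical.

```python
import statistics


def empirical_ci95(point_estimates):
    """Return the 2.5th and 97.5th percentiles of a set of point estimates.

    Assumption: the empirical interval is a plain percentile interval
    across subsets. With n=40, statistics.quantiles returns 39 cut points
    at 2.5%, 5%, ..., 97.5%, so the first and last are the bounds we need.
    """
    cuts = statistics.quantiles(point_estimates, n=40, method="inclusive")
    return cuts[0], cuts[-1]


# Hypothetical point estimates from repeated subsets of loci,
# not values from this study.
example_estimates = [21.4, 19.8, 23.1, 20.5, 22.7, 18.9, 24.0, 21.0, 20.2, 22.3]
lower, upper = empirical_ci95(example_estimates)
print(f"empirical 95% interval: {lower:.1f} to {upper:.1f}")
```

An interval of this kind describes only the variation among subsets of loci drawn from the same sampled individuals, so it need not match a jackknife interval computed for a single data set.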
Most importantly, we found that effective population size in partially clonal plants cannot be predicted based on the number of genetic individuals (or genets) because demographic events (i.e., changes in the number of individuals over time) strongly influence N e and this influence can last for decades, depending on the generation time of the species and other life-history traits. Our findings are especially relevant in partially clonal species of conservation concern, in which population declines may not be detected by only counting individuals or by only ascertaining the number of genets using genetic methods. Estimating contemporary effective population size in partially clonal species is therefore crucial for evaluating their conservation status, and results should be interpreted in light of population-specific demographic changes over time.

Clarke and the Linux Team are also acknowledged for their support in maintaining the HPC cluster at RBG Kew. We thank Gordon Luikart and two anonymous reviewers for providing valuable comments on the manuscript.

CONFLICT OF INTEREST
The authors declare no competing interests.

BENEFIT-SHARING STATEMENT
Benefits from this research accrue from the sharing of our data and results on public databases as described above.