Invasion history of Lycium ferocissimum in Australia: The impact of admixture on genetic diversity and differentiation

We investigated the invasion history of Lycium ferocissimum, a spine‐covered shrub native to South Africa that was introduced to Australia in the mid‐1800s, and has since developed into a damaging invasive plant of undisturbed landscapes and pastures. In addition to identifying the provenance of the Australian plants, we tested for evidence of admixture, and contrasted genetic diversity and structuring across the native and introduced ranges.


| INTRODUC TI ON
Invasive plants are a major threat to global biodiversity (Early et al., 2016). Understanding the introduction history of weeds is crucial, as it allows researchers to develop biosecurity strategies and implement management options such as biological control (Colautti & Lau, 2015;Cristescu, 2015). However, the introduction history of weeds is often complex and may involve multiple introductions from different localities, resulting in admixture or hybridization (e.g. Dlugosch et al., 2015;Encinas-Viso et al., 2022;Hirsch et al., 2021). Admixture may facilitate invasion success, as it can generate novel allelic combinations that are beneficial in the new environment (Dlugosch et al., 2015;Qiao et al., 2019;Rius & Darling, 2014;Smith et al., 2020). Although some recent genomic studies have detected admixture in invasive lineages (e.g. Dlugosch et al., 2015;Rius & Darling, 2014), little is known about how this admixture can shape genetic structure in the introduced range. Admixed invasive lineages often have a unique genetic makeup not found in the native range (Dlugosch et al., 2015;Encinas-Viso et al., 2022). As a result, these admixed populations may be more challenging to manage (Schierenbeck & Ellstrand, 2009;Williams et al., 2014). However, a thorough understanding of the invasion history can guide the development of management solutions (e.g. biological control).
Genomic approaches, such as genotyping-by-sequencing (GBS), allow researchers to identify the geographic sources of invasive populations with a far greater level of accuracy than previously (see Barker et al., 2017;Burrell et al., 2015;Encinas-Viso et al., 2022;McCulloch et al., 2021). Furthermore, population genomic data are ideal for demographic inference methods, such as coalescent analyses (Cornuet et al., 2014). These methods can be used to test alternative introduction scenarios (Barker et al., 2017;Guillemaud et al., 2010) and are particularly powerful in cases where there is incomplete lineage sorting or genetic admixture among founding lineages (Welles & Dlugosch, 2018).
In this study, we investigate the introduction history of Lycium ferocissimum in Australia. This long-lived spine-covered shrub is native to the coastal and inland parts of the Cape provinces of South Africa (Arnold & De Wet, 1993;Venter, 2000). It was introduced to Australia in the mid-1800s as a hedge plant, but subsequently developed into a damaging invasive plant across large areas of the country (Parsons & Cuthbertson, 2001). It is now considered a Weed of National Significance in Australia (Australian Weeds Committee, 2013), as its presence degrades pastures and natural vegetation and it also forms thickets that impede access of livestock to watercourses and provide refuge for feral pest animals (Noble & Rose, 2013).
Management with chemical and mechanical methods is challenging because the plant readily regrows from its tap root system (Noble & Rose, 2013). These limitations, and the spatial extent of the plant's distribution, mean biological control is being explored as a landscape-scale management approach (Adair, 2013;Ireland et al., 2019;Julien, 2006;van Klinken et al., 2016). A previous phylogenetic study, based on Sanger sequencing, identified little genetic diversity in the L. ferocissimum plants sampled across South Africa, with no genetic structuring detected. Hence, the provenance of the invasive L. ferocissimum plants in Australia could not be resolved (McCulloch et al., 2020).
We used a coalescent approach to identify the likely provenance of Australian L. ferocissimum plants and to explore whether the plants from Australia are admixed. We also tested from isolation-bydistance and isolation-by-environment in South Africa and Australia, to examine whether these differ across the native and introduced range. We then discuss the implications of our results for the management of L. ferocissimum across Australia.
However, vegetative reproduction via broken stem and root fragments does occur (Muyt, 2001). Controlled pollination experiments indicate that this species is strongly self-incompatible, with outcrossing resulting in 93-fold more seed production per flower than selfing (Miller et al., 2008). The pollinators of L. ferocissimum have not been studied, but it is most likely insect-pollinated like most of its congeners (Galetto et al., 1998). Although longevity data for this species are not recorded in the literature, several of its congeners live for more than 100 years (Bowers, 2005;Bowers et al., 1995).

| Sample collection and DNA extraction
For our GBS analysis, we used L. ferocissimum DNA that had previously been extracted from 24 localities across South Africa (eight from the Western Cape and 16 from the Eastern Cape), and from 26 localities in Australia [ Figure 1a,b; Table S1; McCulloch et al., 2020].
Plants included in this study were identified both morphologically and by Sanger sequencing as being L. ferocissimum (McCulloch et al., 2020). To increase the number of samples at some localities, genomic DNA was extracted from additional plants, following the protocols outlined by McCulloch et al. (2020). Where possible, we sequenced at least ten plants per locality (Table S1). However, at most Western Cape localities only a single plant was found (so only a K E Y W O R D S environmental weed, genetic structure, genotyping-by-sequencing, Lycium ferocissimum, multiple introductions single plant was sequenced), as L. ferocissimum is not abundant in this region. We sequenced 381 plants in total, with an average 6.8 plants per locality from the Eastern Cape (range 1-11 plants), 3.5 plants per locality from the Western Cape (range 1-11), and 9.4 plants per locality from Australia (range 3-22; Table S1).
Although the concentration of some samples was lower than 5 ng/ μL, these samples were still included to increase the total number of individuals.
We adapted a protocol and adaptor regime based on the methods of Elshire et al. (2011), Poland et al. (2012, and Peterson et al. (2012), with barcode designations based on those of Caporaso et al. (2012). Full details of the method can be found in Hereward et al. (2020). Briefly, DNA was double digested using the restriction enzymes MspI and PstI (New England Biolabs), followed by forward and reverse adapter ligation to provide a unique barcode for each sample. The digested samples from each plate were pooled and cleaned using spin columns and eluted into 100 μL. Each pool was size selected (300-400 bp) on a Pippin Prep (Sage Science). PCR was performed to complete the adaptors and add indexes. The amplified products were checked using a microchip electrophoresis system (Shimadzu Corporation) and Qubit
Single-nucleotide polymorphisms (SNPs) were extracted using F I G U R E 1 Significant genomic divergence among Lycium ferocissimum samples from the Western Cape, Eastern Cape and Australia (a) Localities where Lycium ferocissimum samples were taken for genetic analysis across South Africa (cyan circles = Eastern Cape province, magenta circles = Western Cape province). The cyan and magenta shaded regions (drawn as a minimum convex polygon of all collection records) show the native distribution of L. ferocissimum across these two provinces, based on 332 records (GBIF.org, 2019). STACKS 2.0 (Catchen et al., 2013). Briefly, reads were demultiplexed using the process_radtags module. As there is no reference genome, the reads were assembled de novo in STACKS, using the default parameters. We filtered the data further using the populations module.
We kept only SNPs that were genotyped in at least 80% of the samples. Loci containing SNPs with more than two alleles were removed, as these loci have a high chance of being erroneous. Only the first SNP in each locus was retained for subsequent analysis. We discarded any loci containing SNPs with a heterozygosity above 0.65, as these loci are potentially paralogs (O'Leary et al., 2018). Finally, we removed any individuals that had more than 50% missing data, as missing data can influence the analyses of population genetic structure (Yi & Latch, 2022).

| Outlier loci
We used PCADAPT version 4.0.5 (Privé et al., 2020) to identify outlier SNPs. PCADAPT identifies SNPs that are outliers with respect to population structure using a PCA approach. We used the first two principal components (K = 2) to estimate the test statistics. To account for multiple testing, we converted the p values outputted by PCADAPT into Q values using the R package QVALUE (Storey et al., 2020). SNPs with a Q value of <0.10 were considered potential outliers. These potential outliers were removed from the dataset prior to subsequent analysis.

| Genetic structure
We assessed neutral genetic structuring with principal component analysis (PCA) using adegenet. The relative contribution of each SNP to the PCA was evaluated using the R package PCADAPT. We further assessed population structure using the Bayesian clustering algorithm STRUCTURE (Pritchard et al., 2000). Preliminary STRUCTURE runs, with the number of genetic clusters (K) set from 1 to 20, indicated there was little additional structure beyond K = 3.
Final analyses were therefore run ten times for each genetic cluster between 1 and 5. Three separate STRUCTURE analyses were run: (1) the complete dataset, (2) only the Australian samples and (3) only the South African samples. We ran STRUCTURE analyses with an initial burn-in period of 50,000 generations, followed by two million Markov Chain Monte Carlo generations. The admixture model was used with allele frequencies correlated among populations.
Structure Harvester (Earl & vonHoldt, 2012) was used to select the best value of K using Evanno's Delta K versus K method (Evanno et al., 2005). We then used CLUMPAK (Kopelman et al., 2015) to permute and plot the ten independent runs from each value of K under default parameters.
We calculated pairwise F ST values (among populations with at least five samples) using the adegenet package in R Studio (R Core Team, 2020). The significance of pairwise F ST values was assessed with 1000 bootstrap replicates. We assessed whether sample size impacted our calculation of genetic structure, by testing for an association between sample size and mean pairwise F ST , and also between mean sample size and pairwise F ST values. These analyses were conducted using linear regression in R Studio.
We conducted isolation-by-distance and isolation-byenvironment analyses independently on the populations from Australia and South Africa, to test whether similar patterns were found in the native and introduced ranges. Both isolation-by-distance and isolation-by-environment in populations from Australia and the Eastern Cape were assessed with a Mantel test (Mantel, 1967) using adegenet, using linearized F ST . We calculated the geographic distance between sites using the Geographic distance matrix calculator v1.2.3 (Ersts, 2012).
Tests for isolation-by-environment involved bioclimatic variables downloaded from the WorldClim database (https://world clim. org/; Hijmans et al., 2005), using the 'layers' function in the Atlas of Living Australia spatial portal (https://spati al.ala.org.au/). This database includes 19 bioclimatic variables. However, many of these variables are collinear, so we excluded any environmental variables that were correlated with any other variable (as assessed on the Atlas of Living Australia website). The following five variables were retained for analyses: annual temperature, annual precipitation, annual temperature range, precipitation: seasonality and temperature: isothermality. We did not test for isolation-by-distance or isolationby-environment in the Western Cape, as most populations from this region had fewer than five samples, so pairwise F ST values were not calculated.

| Genetic diversity
Observed heterozygosity (H O ) values were calculated in the adegenet package in R Studio (Jombart, 2008;Weir & Cockerham, 1984). We assessed whether sample size impacted our calculation of genetic diversity, by testing for an association between sample size and mean H O values (using linear regression in R Studio). We likewise used linear regression to test for an association between genetic diversity and each of the bioclimatic variables. This allowed us to determine whether environmental conditions shaped the genetic diversity of the native or introduced lineages.
We used Arlequin 3.5.2.2 (Excoffier & Lischer, 2010) to identify alleles unique to either the Western Cape, Eastern Cape, or Australia. We applied a minor allele frequency of 0.05 to our dataset to exclude particularly rare alleles.

| Coalescent analyses
We explicitly tested alternative introduction histories using the Approximate Bayesian Computation (ABC) method in DIYABC v2.0.1 (Cornuet et al., 2014). Three potential demographic scenarios were considered: (a) the Australian lineage originated from the Western Cape, (b) the Australian lineage originated from the Eastern Cape and (c) the Australian lineage was formed by admixture between the Western Cape and Eastern Cape lineages. We set the minimum minor allele frequency to 0.10 to exclude any uniformative SNPs (as recommended by Cornuet et al., 2014).
In the absence of detailed demographic information to estimate generation time from life table analyses, we conservatively assumed the maximum divergence time between the Australian lineage and its putative source population to be 85 generations. This estimate is based on the plant having been present in Australia since about 1850, and plants becoming reproductive when they are 2 years old (Parsons & Cuthbertson, 2001). To assess whether the generation time used was influencing the results of our coalescent analyses, we also ran these analyses with different generation times (1, 5 and 10 years). As the specific divergence time between the Eastern Cape and Western Cape lineage is not known, we used a broad prior (10 2 -10 7 generations) with a uniform distribution. Likewise, we used broad priors (10 2 -10 6 individuals), with uniform distributions, for the effective population sizes. Mean pairwise F ST , mean Nei's distance and mean genetic diversity were used as summary statistics.
We calculated the posterior probability of each scenario by performing a logistic regression on the 1% of simulated data that was closest to the observed data (following Cornuet et al., 2008). The favoured scenario was that with the highest posterior probability value. We tested the goodness-of-fit of the model with the highest posterior probability through a PCA comparing the observed data to prior and posterior distributions of the summary statistics, using the 'model checking' option in DIYABC. The proportion of type I and type II errors in our best-supported model was calculated using the 'confidence in scenario choice' option in DIYABC, using 1000 pseudo-observed datasets (see Cornuet et al., 2014).

| RE SULTS
Illumina sequencing yielded an average of 1,555,370 reads per individual. The initial filtered dataset consisted of 3130 SNPs across 381 individuals. PCADAPT identified 13 SNPs as potential outliers.
These SNPs were removed from the dataset, leaving 3117 SNPs in the final dataset. The proportion of missing data per individual ranged from 2.5% to 49.5% (mean = 9.1%), with most samples (99%) having less than 30% missing data. The mean proportion of missing data per individual was similar across all three regions (Western Cape = 8.2%; Eastern Cape 10.4%; Australia 8.7%).

| Genetic structure
Principal component analysis separated the Eastern Cape and Western Cape plants into two distinct clusters (Figure 1c). The samples from Australia formed a largely separate cluster to the other two clusters (Figure 1c). Many of the samples from Australia were similar to samples from the Western Cape, particularly with samples from the north of the Western Cape (WC3, WC4 and WC8; Figure 1c). However, some samples from Australia (mainly from South Australia; AU32, AU30) clustered closely to the samples from the Eastern Cape (Figure 1c).   Figure S6). STRUCTURE analysis of this dataset likewise separated plants from these two regions, though plants from several F I G U R E 2 Australian Lycium ferocissimum plants are genomically most similar to plants from the Western Cape. STRUCTURE plot (K = 2) for L. ferocissimum samples collected from across its native distribution (South Africa) and introduced range (Australia). Each vertical bar represents an individual, with colour representing the inferred genomic cluster.
Principal component analysis also revealed significant population structure within the Eastern Cape plants ( Figure S7). Specifically, plants from the two westernmost localities of the Eastern Cape (EC24, EC25) formed a distinct cluster, as did plants from the southern coast near Port Alfred (EC14, EC15 and EC16; Figure S7). Significant isolation-by-distance was also evident across the Eastern Cape (R = 0.846, p < .001; Table 1). Likewise, an isolationby-environment pattern was identified, with significant associations detected between pairwise F ST and annual temperature range and between pairwise F ST and isothermality ( Table 1). Some geographic population structure was also evident in the Western Cape, with samples clustering by locality, and plants from the three northernmost populations (WC3, WC4 and WC8) all clustering together ( Figure S8). formed separate clusters when the next two principal components were compared (PC3 vs. PC4; Figure 3b). STRUCTURE analysis confirmed that plants from AU11 and AU13 were genetically distinct, with plants from these localities clearly differentiated from plants from other parts of Australia (and from one another) at K = 4 ( Figure S9).

| Genetic structure across Australia
No isolation-by-distance was evident across Australia (R = 0.045; p = .419; Table 1). However, we found evidence of isolation-byenvironment, with significant associations detected between pairwise F ST values and mean annual temperature, annual temperature range and mean annual precipitation ( Table 1).

| Genetic diversity
No association was detected between sample size and the heterozygosity at individual localities, indicating sample size is not influencing the assessment of genetic diversity. Mean heterozygosity was similar in plants across the Western and Eastern Cape (0.082 and 0.086, respectively). We detected a significant negative association between heterozygosity and mean annual precipitation across the Eastern Cape (i.e. heterozygosity was highest in localities with lower rainfall; Table 2). In addition, we detected a significant positive association between heterozygosity and annual temperature range in this region (suggesting plants in localities with a larger annual temperature range are more genetically diverse; Table 2). No association was detected between heterozygosity and any environmental variable across the plants from Australia ( Table 2).

Heterozygosity in plants in
We identified 120 alleles in the Eastern Cape plants that were not found in the Western Cape plants and 22 alleles in the Western Cape that were not in the Eastern Cape. However, almost all of these alleles were found in the Australian plants, with only four alleles from the Eastern Cape and three alleles from the Western Cape not found in Australia. Furthermore, we identified 40 alleles in Australia that were not found in the Eastern Cape and 137 alleles in Australia that were not found in the Western Cape. These results strongly support the suggestion that the L. ferocissimum lineages in Australia were formed by admixture between the Western Cape and Eastern Cape lineages. We did, however, identify 18 alleles in plants from Australia that were not found in either Eastern Cape or Western Cape plants.

| Coalescent analyses
Coalescent analyses revealed that the lineage of L. ferocissimum  Running the coalescent analyses using different generation times produced almost identical results (Table S2). These analyses suggest that most (76%) of the genetic material in the lineage in Australia is from the Western Cape, supporting the results of the PCA and STRUCTURE analyses. A PCA of the summary statistics for this scenario confirmed that the observed data were contained within the space of the prior and posterior distributions of the summary statistics ( Figure S10). The estimated proportion of Type I and type II errors for this scenario were low (0.057 and 0.012, respectively), suggesting that the results of the analyses are robust.

| DISCUSS ION
Our results indicate that genetic material from both of the genomically distinct Eastern Cape and Western Cape L.
ferocissimum lineages was introduced to Australia These results add to a growing body of literature that shows how genetic admixture could help invading species avoid the predicted detrimental effects of population genetic bottlenecks (Dlugosch et al., 2015;Dlugosch & Parker, 2008;Ellstrand & Schierenbeck, 2000;Hahn & Rieseberg, 2017;Lombaert et al., 2010). Furthermore, our study highlights how genetic admixture postintroduction can not only lead to the development of an invasive lineage that is genomically distinct from those in the native range but may also cause significant genetic differentiation to develop among plants in the introduced range. In addition, we demonstrate that genetic diversity and structuring may vary significantly across the native and introduced range of an invasive plant. Our results help not only to elucidate fundamental aspects of the genetics of invasion but also provide guidance for the management of invasive species (see below). This semiarid mountainous region appears to be a barrier to gene flow in several other plant species, including the Babiana lily and

F I G U R E 3 Weak genomic divergence among Australian
Nymania capensis (Potts, 2017;Schnitzler et al., 2011). However, our analyses suggest that plants from the most western localities in the Eastern Cape shared some genetic similarities to plants from the Western Cape (although they were still clearly differentiated on the PCA). It is therefore possible that there is some overlap between these two lineages in the unsampled regions of South Africa.
Clear genetic structuring was also evident within the Eastern and Western Cape regions, even though most of the sampled localities within these provinces are geographically close to one another (less than 100 km apart). Furthermore, an isolation-bydistance pattern was evident across the Eastern Cape. These results indicate that gene flow across localities in the native range is limited. The limited gene flow across geographically close localities is surprising, as outcrossing plants (such as L. ferocissimum ;Levin & Miller, 2005;Miller et al., 2008) typically have significantly less genetic structuring than selfing species (Hamrick & Godt, 1996).
Genetic structure in plants has also traditionally been thought to be linked to seed dispersal mode (Duminil et al., 2007), with those plants relying on animal-mediated seed dispersal (such as L. ferocissimum; Parsons & Cuthbertson, 2001) expected to have significantly less genetic structure than those with wind-dispersed seed (Hamrick et al., 1993;Hamrick & Godt, 1996). However, more recent studies suggest that pollination mode may play a greater role than seed dispersal mode in determining population genetic structure (Duminil et al., 2007). Plants pollinated by small insects (such as L. ferocissimum) typically have far greater levels of genetic structure than those pollinated by larger insects, bats or by wind (Gamba & Muchhala, 2020). Alternatively, the genetic structuring in L. ferocissimum across its native range may be shaped by geographic barriers that restrict dispersal (Cox et al., 2016) or may be influenced by environmental gradients (Wang & Bradburd, 2014). Indeed, we detected a pattern of isolation-by-environment across the Eastern Cape. In any case, our results indicate that L. ferocissimum does not disperse long distances in its native range, suggesting that anthropogenic processes have played a crucial role in the spread of this plant across Australia.

| Invasion history of L. ferocissimum
While our PCA and STRUCTURE analyses showed that L. ferocissimum plants in Australia are genetically close to plants from the Western Cape, they still have a unique genetic makeup not found in the native range. Such a pattern is often attributed to genetic admixture (see Dlugosch et al., 2015;Hudson et al., 2020;van Boheemen et al., 2017 We detected no evidence of isolation-by-distance between L.
ferocissimum populations across Australia, which is in stark contrast to the clear isolation-by-distance pattern found in South Africa.
However, we did detect evidence of isolation-by-environment in Australia, indicating plants from similar environments are more genetically similar than would be expected by chance (see Wang & Bradburd, 2014). Overall, we suggest that the contrast in genetic structuring across the native and introduced ranges reflects differences in the spread of L. ferocissimum across the two regions, with the lack of genetic structuring in Australia most likely the result of major anthropogenic spread of L. ferocissimum following its introduction, since it was widely grown as a hedge plant (Parsons & Cuthbertson, 2001).

Principal component analysis suggests that both the Eastern and
Western Cape lineages were initially introduced to South Australia, with no evidence that the distinct lineages were introduced to other parts of Australia. The high genetic diversity in South Australia (relative to plants from elsewhere in Australia) likewise supports the suggestion that plants were originally introduced to this region, as heterozygosity is often highest at the point of introduction (Bors et al., 2019;Excoffier & Ray, 2008;White et al., 2013).
We detected a significant association between population-level heterozygosity and several environmental variables across South Africa, with populations in localities with lower annual precipitation or a larger annual temperature range more genetically diverse.
However, no such association was detected in Australia. We suggest that the genetic diversity in the Australian populations is shaped by a combination of genetic bottlenecks (particularly in recently colonized localities) and the extent of admixture among the two genetically diverse native range lineages. Our results thus demonstrate how genetic diversity can differ across the native and introduced ranges of an invasive species, particularly when introduced lineages are formed by admixture.

| Management implications
In addition to improving our understanding of the genetic processes involved in biological invasions, our results have important implications for the management of L. ferocissimum in Australia. Admixed lineages, as found for L. ferocissimum in Australia, can be difficult to manage using traditional biological control approaches (Schierenbeck & Ellstrand, 2009;Williams et al., 2014). For instance, depending on the region where candidate biological control agents for L. ferocissimum are sourced in the native range, they may vary in their effectiveness against the Western Cape and Eastern Cape lineages (see Goolsby et al., 2006;Kniskern & Rausher, 2001). Biological control agents may also be less effective against the genetically admixed plants (such as the L. ferocissimum plants in Australia; see Gaskin, 2017;Manrique et al., 2008). Furthermore, our results indicate that there is significant genetic differentiation among plants from separate Australian localities (likely as a consequence of genetic admixture). Plants from these different localities may therefore not respond in the same manner to highly specific biological control agents. In addition to screening potential biological control agents against both the Western Cape and Eastern Cape L. ferocissimum lineages, our results suggest that candidate agents should also be screened against plants from a number of different (genetically admixed) populations from Australia, to test whether they are likely to provide effective control of the genetically distinct L. ferocissimum lineage in Australia.

ACK N O WLE D G E M ENTS
We thank Ambrocio (Din) Matias, Ludovic Dutoit and Dean Brookes for their statistical and conceptual input. We thank Andrea Bowden,