Hybridization and adaptive introgression in a marine invasive species in native habitats

Summary Hybridization of distinct populations or species is an important evolutionary driving force. For invasive species, hybridization can enhance their competitive advantage as a source of adaptive novelty by introgression of selectively favored alleles. Using single-nucleotide polymorphism (SNP) microarrays we assess genetic diversity and population structure in the invasive ctenophore Mnemiopsis leidyi in native habitats. Hybrids are present at the distribution border of two lineages, especially in highly fluctuating environments including very low salinities, while hybrids occur at lower frequency in stable high-saline habitats. Analyses of hybridization status suggest that hybrids thriving in variable environments are selected for, while they are selected against in stable habitats. Translocation of hybrids might accelerate invasion success in non-native habitats. This could be especially relevant for M. leidyi as low salinity limits its invasion range in western Eurasia. Although hybridization status is currently disregarded, it could determine high-risk areas where ballast water exchange should be prevented.

José Martin Pujolar, 1,* Denise Breitburg, 2 Joanna Lee, 3 Mary Beth Decker, 4 and Cornelia Jaspers 1,5,*

INTRODUCTION
Marine invasive species cause large biological and economic impacts worldwide. 1,2 Records of new non-indigenous species sightings are increasing without signs of saturation either now 3,4 or in projections. 5 It is still debated which traits allow species to invade ecosystems and which factors facilitate invasion success. 6 It is acknowledged, however, that hybridization with successful interbreeding of distinct populations or species in the recipient ecosystem can accelerate invasion success. 70][11] Invasion success has been shown to be higher for hybrid plant populations formed in the recipient habitat, 7 but the effect of hybridization within native populations, setting the stage for increased fitness and invasion potential, remains understudied, especially for marine species.
In native habitats, admixed populations can be found in hybrid zones, i.e., the area where two lineages meet naturally. The geographic extent of hybrid zones is normally narrow and maintained by balancing dispersal, selection, hybrid fitness, and ecological conditions. 9,12,13][16][17] Hybrids can show either increased (hybrid vigor) or decreased (outbreeding depression) fitness. 8,18 While in many cases hybrids are outperformed by pure lines, hybrid genotypes have been shown to perform better under novel environmental conditions or in extreme habitats, 19 as exemplified for alpine-adapted butterflies 20 and North Atlantic eels in Iceland. 21
The ctenophore Mnemiopsis leidyi provides a case to test the role of hybridization and introgression as a potential source of adaptive novelty in the context of biological invasion. The species is native to the Atlantic coasts of North and South America. 22 It was introduced from the east coast of the United States to western Eurasia, and is now invasive in large areas of the Black Sea, the Caspian Sea, and the Mediterranean, as well as NW Europe. 23 Previous genetic studies identified two distinct populations, or lineages, in the native range: a southern lineage occurring in Florida/Gulf of Mexico and a northern lineage occurring in New England. 24,25 Based on mitochondrial cytochrome b and six nuclear microsatellite loci, Bayha et al. 25 identified Cape Hatteras as the location of a genetic break between the two lineages driven by oceanographic features. 26 Using whole genome data, Jaspers et al. 27 reported high genetic differentiation between the two lineages (Figure 1), which is comparable to differentiation between named congeneric species. 28 Here, we use a diagnostic single-nucleotide polymorphism (SNP) microarray developed for the northern and southern M. leidyi lineages 29 to study spatial and temporal genetic diversity, as well as population structure, of M. leidyi in its native habitat along the US Atlantic coast.
Estuaries in the northwestern Atlantic experience differing salinity conditions, especially Chesapeake Bay, the largest estuary in the US, which has a large spatial and temporal variation in environmental conditions including extremely low salinity levels. 30 We investigate the level of hybridization and introgression in eight locations and link occurrence of hybrids to environmental conditions and oceanographic features/dispersal as well as potential evolutionary processes.

Population structure
SNP genotyping of 176 M. leidyi individuals from eight locations in two regions (New England and mid-Atlantic Coast) was conducted (Figure 1). For comparison, one location from Miami, FL was also included. Measures of genetic variability are summarized in Table S1. Observed heterozygosities ranged from 0.27 to 0.36 with no statistical differences observed among samples (p = 0.921). Similarly, no differences were found in expected heterozygosities (p = 0.948). Allelic richness ranged from 1.74 in Greenwich Cove to 1.91 in Sandwich, with no differences observed among samples (p = 0.696).
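To make the heterozygosity measures above concrete, the following is a minimal sketch of how observed (Ho) and expected (He) heterozygosity can be computed from biallelic SNP calls. The genotype coding (0, 1, 2, or missing), the function names, and the toy matrix are assumptions made for illustration; the small-sample correction used here is the standard unbiased gene-diversity estimator, and the estimators implemented in Arlequin may differ in detail.

```python
from typing import List, Optional, Tuple

# Assumed coding: count of one reference allele at a biallelic SNP.
# 0 or 2 = homozygote, 1 = heterozygote, None = missing call.
Genotype = Optional[int]


def locus_heterozygosity(calls: List[Genotype]) -> Optional[Tuple[float, float]]:
    """Return (Ho, He) for one locus, or None if fewer than two calls exist."""
    observed = [g for g in calls if g is not None]
    n = len(observed)
    if n < 2:
        return None
    ho = sum(1 for g in observed if g == 1) / n
    p = sum(observed) / (2 * n)  # frequency of the reference allele
    # Unbiased expected heterozygosity for 2n sampled allele copies.
    he = (2 * n / (2 * n - 1)) * 2 * p * (1 - p)
    return ho, he


def population_heterozygosity(matrix: List[List[Genotype]]) -> Tuple[float, float]:
    """Average Ho and He over loci; matrix[i][j] is individual i at locus j."""
    per_locus = []
    for j in range(len(matrix[0])):
        result = locus_heterozygosity([row[j] for row in matrix])
        if result is not None:
            per_locus.append(result)
    ho = sum(r[0] for r in per_locus) / len(per_locus)
    he = sum(r[1] for r in per_locus) / len(per_locus)
    return ho, he


# Hypothetical toy data: four individuals genotyped at three SNPs.
toy = [[0, 1, 2], [1, 1, 2], [2, 0, None], [1, 1, 2]]
ho, he = population_heterozygosity(toy)
```

Per-population values obtained this way could then be compared across sampling locations, for example with a one-way ANOVA as described in the methods.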
A highly significant genetic differentiation was detected between the two Virginia locations (Wachapreague on the Atlantic coast and Gloucester Point inside Chesapeake Bay) and all other samples, with no genetic structure detected for samples collected north of Virginia along the coast and estuaries of New England (Table 1). In accordance with the pairwise FST values, principal components analysis (PCA) indicated three clusters (Figure 2). One group included individuals from all northern sites: Sandwich, MA, Woods Hole, MA, and Esker Point, CT, as well as the Narragansett Bay locations of Fort Adams, Fort Wetherill, and Greenwich Cove, RI. A second group included all individuals from Chesapeake Bay (Gloucester Point) and east of Chesapeake Bay along the Atlantic coast (Wachapreague). A third distinct cluster included all individuals from Miami, FL. The first two coordinates explained 29.5% and 8.6% of the variance, respectively (p < 0.001), while the other axes were not significant (p > 0.05), following Tracy-Widom statistics.
STRUCTURE analysis suggested a scenario with two groups (K = 2) as the most likely (Figure 3), which correspond to the northern and southern lineages. We also conducted STRUCTURE analyses with K = 3 to 5, which did not result in any logical additional clustering (Figure S1). All individuals north of Chesapeake Bay could be assigned with high confidence to the northern lineage, while the baseline individuals from Miami could be assigned with high confidence to the southern lineage. All individuals north of Chesapeake Bay were suggested to be non-admixed, while the majority of individuals from the Chesapeake Bay area (coastal and Bay) were of admixed origin. At the Atlantic coast of Virginia (Wachapreague), 36.3% of individuals showed an admixture proportion >10%, while at the sample station inside Chesapeake Bay (Gloucester Point), 100% of individuals were admixed, with admixture proportions ranging from 21 to 49%. Inside Chesapeake Bay, M. leidyi has also been confirmed across a wide salinity range, including very low salinities (Figure 4).
A detailed classification of the hybrids identified in STRUCTURE was conducted in NEWHYBRIDS (Table S2), with potential hybridization scenarios outlined in Figure 5. All specimens collected at locations north of Chesapeake Bay were confirmed as pure northern lineage individuals, while those collected from Miami were confirmed as pure southern lineage individuals. At the Atlantic coast of Virginia (Wachapreague), samples consisted of 63.6% pure northern individuals, 4.6% F2 hybrids, 22.7% first-generation backcrosses, and 9.1% second-generation backcrosses. All first-generation backcrosses were classified as pure northern × F1 hybrid backcrosses. Second-generation backcrosses were all classified as backcrosses between first-generation backcrosses (pure northern × F1 hybrid) and F1 individuals.
In comparison, a higher percentage of admixture was detected inside Chesapeake Bay (Gloucester Point), where only hybrid individuals were identified, i.e., no pure northern or southern individuals. Individuals from Gloucester Point were classified as 30% F2 hybrids, 30% first-generation backcrosses (all with pure northern animals) and 40% second-generation backcrosses. Most individuals were classified with high confidence (>90%, see Table S2).
In all comparisons with the locations north of Chesapeake Bay (Sandwich, Woods Hole, Fort Adams, Fort Wetherill, Greenwich Cove, and Esker Point), introgression tests confirmed no significant deviation from the null expectation of D = 0, suggesting no shared ancestry between Miami and the northern locations and no indication of introgression (Table 2). However, we observed an excess of shared ancestry between the Atlantic coast station east of Chesapeake Bay (Wachapreague) and Miami, with a c. 2-fold excess of ABBA over BABA and an average D = 0.24 (p < 0.001). This suggests recent introgression from Miami into Wachapreague. Similarly, a highly significant amount of shared ancestry was found between Gloucester Point inside Chesapeake Bay and Miami, with a c. 3-fold excess of ABBA over BABA and an average D = 0.42 (p < 0.001).
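As a rough consistency check on these values, and assuming D is taken in its standard form as the normalized difference between ABBA and BABA counts (the symbols n_ABBA and n_BABA are introduced here only for illustration), the fold excess of ABBA over BABA follows directly from D:

```latex
D = \frac{n_{ABBA} - n_{BABA}}{n_{ABBA} + n_{BABA}}
\quad\Longrightarrow\quad
\frac{n_{ABBA}}{n_{BABA}} = \frac{1 + D}{1 - D}

D = 0.24:\ \frac{1.24}{0.76} \approx 1.6
\qquad
D = 0.42:\ \frac{1.42}{0.58} \approx 2.4
```

Because the reported D values are averages over several comparisons, the ratios implied by this identity need not coincide exactly with the approximate fold excesses quoted above.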

DISCUSSION
The increased resolution of genetic markers sheds new light on the status of hybrids and the nature of introgression between northern and southern lineages of M. leidyi in native habitats along the Atlantic coast of the US. We show that hybrids are present at the transition zone between northern and southern lineages, around Cape Hatteras, which represents an oceanographic front where the Gulf Stream begins its arc offshore. 26 Apart from such oceanographic features that lead to limited gene flow, marine systems are normally characterized by high dispersal and connectivity, which usually limit genetic differentiation. 31 M. leidyi is an example of a species with high dispersal capacity facilitated by its holoplanktonic life cycle and self-fertilizing reproduction mode. 32 Hence, high gene flow within areas characterized by high connectivity can be expected, as documented for the invasive range in NW Europe. 23 In line with this, we observe high gene flow and lack of population structure in the native range north of Chesapeake Bay, which is located north of Cape Hatteras. This holds irrespective of the diverse habitats sampled in our study, i.e., coastal, estuarine, and offshore waters, and across sampling years. This is concordant with the study of Bayha et al. 25 using cytochrome b, which only detected northern haplotype individuals north of Chesapeake Bay, in an extensive sampling scheme including the Long Island Sound and Rehoboth Bay, DE.
In contrast, two locations just north of Cape Hatteras around/in Chesapeake Bay were highly differentiated from the northern and southern lineages due to the presence of hybrids. In order to understand environmental drivers and document adaptive introgression in hybrid zones, we compare two locations within the hybrid zone with contrasting environments: first, a coastal area along the Atlantic coast characterized by high salinity and relatively stable environmental conditions, and second, Chesapeake Bay, an estuary characterized by high levels of spatial and temporal variability in environmental conditions. The proportion and nature of hybrids found during our investigation differed between the two locations, as outlined in the following sections.

Stable environmental conditions: Example of a hybrid tension zone
The Atlantic coast east of Chesapeake Bay is characterized by relatively stable environmental conditions, which are more similar to other areas along the northeast coast of the USA. 33 Salinity in this area, a key environmental parameter, is high (>30) and relatively constant. 33 The M. leidyi population at this sampling site is composed of pure northern individuals plus a few late-generation hybrids. The population structure found during our investigation is consistent with a hybrid tension zone. In a hybrid tension zone, the population structure is balanced between dispersal of parental individuals and selection against hybrids. This has been suggested to be due to intrinsic fitness differences of hybrid lines. 12,16 Even though we do not have experimental data to prove this hypothesis, our data suggest that hybrids are selected against in this area. On the other hand, the lack of hybrids in the more northern sampling locations of New England is probably dispersal-dependent, due to the prevailing current pattern. 26 Along the northeast coast of the US, the coastal current runs toward the south, 26 which limits dispersal of animals from the southern lineage to the north, especially dispersal beyond Chesapeake Bay. This likely explains why only northern haplotype individuals were detected north of Chesapeake Bay in the most comprehensive previous study using cytochrome b as marker gene. 25
As observed in our data, contact between the northern and southern populations can result not only in hybridization, but also in introgression because of backcrossing. However, in a hybrid tension zone, hybrid fitness is expected to be lower relative to the parental forms due to endogenous incompatibilities, including the loss through recombination of coadapted gene complexes involved in local adaptation. 34 This may explain the low number of hybrids observed in comparison with parental northern genotypes. The absence of pure southern individuals, both as pure lines and as backcross parents, suggests that even though they occasionally arrive from the south, their occurrence is sporadic. Northern lineage animals, in comparison, are expected to be continuously seeded into the Chesapeake Bay area due to the prevailing connectivity pattern, with ocean currents running to the south along the coast up to Cape Hatteras. 26 In support of this, all backcrosses were with northern individuals.

Under the tension zone model, hybrid zones are dynamic and free to move because they are maintained independent of the environment. 35,36 This suggests that the stable environment along the Atlantic coast is a hybrid tension zone where selection is likely to act against hybrids due to potentially lower fitness compared to parental lines.

Variable environmental conditions: Example of a hybrid swarm
A contrasting setting was found inside Chesapeake Bay, where M. leidyi is present in a spatially and temporally varying environment, especially when considering salinity, which ranges from >2 to 27.1 PSU (see Figure 4). 30,37,42 Our genetic analyses detected no parental forms of either northern or southern origin in Chesapeake Bay. All sampled individuals were of admixed origin, consisting of F2 (or later generation) hybrids as well as first- and second-generation backcrosses. This is indicative of a stable hybrid population, actively producing viable offspring and forming a hybrid swarm. 19,43 In contrast to the hybrid tension zone model, the prediction of a hybrid swarm is that hybrid fitness is enhanced under particular environmental conditions, in which hybrids have increased fitness compared to either of the parental genotypes. 9,18,44 Hybrid superiority can be the result of coadaptation of parental gene pools to distinct exogenous regimes. 8
Although hybrids might outperform either pure parental species in survival due to hybrid vigor, 45,46 hybrids are not preadapted to the restricted regions where they occur. Their success is rather related to parental forms being less adapted. 8 In this sense, hybrid swarms are commonly located in marginal habitats substantially different from those of either parental line. 47 For instance, natural hybrids between the two North Atlantic eel species are exclusively found in Iceland, which is characterized by extreme environmental conditions. 21 Consistent with this, the hybrid swarm found in our study occurs in Chesapeake Bay, a large estuary characterized by highly variable environmental conditions, especially considering salinity. 30 Chesapeake Bay has extensive low salinity (<10) regions where M. leidyi is known to form large populations, including in areas of very low salinity (<5). 37,42 Salinity has also been linked to differing mortality regimes of M. leidyi in Chesapeake Bay; regions of low salinity act as refugia for M. leidyi to avoid predation by the scyphomedusa Chrysaora quinquecirrha, which is less tolerant to low salinities. 38 Hence, predation might act as an additional selection pressure to maintain hybrid populations, especially in low saline regions of Chesapeake Bay. Parental forms of the northern lineage are not particularly well adapted to low saline conditions, as documented from experimental and field observations in the invasive range. 23,48 For the invasive northern population, salinity drastically impacts reproduction rates, and active recruitment in the field ceases at salinities <10. 48,49 Salinity effects on reproduction rates in native northern and southern populations have not explicitly been investigated. One mechanism that might explain the putative fitness advantage of hybrids in the fluctuating environment of Chesapeake Bay is beneficial reversal of dominance, as demonstrated experimentally in the invasive copepod Eurytemora affinis. 50 Dominance reversals are context-dependent, so that a beneficial allele is dominant in a favorable environment yet recessive in a non-favored environment. 51 As the fitness of alleles differs across conditions, beneficial reversals of dominance enable antagonistic selection to maintain high levels of genetic variation for fitness traits such as salinity tolerance, 52-54 which would in our case increase the ability to adapt to the fluctuating habitats of Chesapeake Bay.
Hybrid zones are predicted to be ephemeral since their existence is dispersal-dependent. 8,12 However, in Chesapeake Bay, hybrids have existed for 20+ years or at least 500 generations. First records date back to 1998, when hybrids were suggested to be present in Rhode River, MD, and then later when sampled in northern and southern Chesapeake Bay areas. 25 Introgression was observed at all locations, but low marker number did not allow for further hybrid characterization. 25 Ghabooli et al. 55,56 included animals collected in 2008 at York River (same as our study), which appeared highly differentiated in comparison with other native samples, but without testing for hybridization. Moreover, the recent study of Verwimp et al. 40 using genome-wide SNP data compared genetic variability between native and invasive populations and included a sample location from Chesapeake Bay (N = 7) at the mouth of the Potomac River. Sampling salinity was not provided, but likely ranged around 15. Re-examination of the data by Jaspers et al. 27 suggested that all individuals from that sampling site were hybrid backcrosses. This indicates that Chesapeake Bay represents a consistent hybrid zone in space and time. Our analyses further suggest that all second-generation backcrosses were between F1 hybrids and first-generation backcrosses with pure northern individuals. This indicates that pure northern animals must have been present at low frequencies or originated from nearby coastal areas, with subsequent selection for hybrid lines. Irrespective of the source of the animals observed in Chesapeake Bay, our data suggest the potential presence of a hybrid swarm in this habitat, which is characterized by high environmental variability, 30 including areas of very low salinity where M. leidyi has been found (Figure 4). Even though we cannot exclude that hybrids are intermediate in temperature tolerance, which might contribute to their fitness, salinity is the most extreme environmental parameter in Chesapeake Bay. Taken together, our data suggest that hybrids present in Chesapeake Bay might have a putative adaptive advantage over pure parental lines, but further spatiotemporal population samples and physiological experiments are needed to substantiate this hypothesis.

Potential role of hybridization in invasiveness
Hybridization in the introduced range as a driver of invasion success has gained considerable attention in invasion ecology. This was originally postulated for plants, 7 but applies to other organisms as well, including insects, 57 amphibians, 18 and fish. 58 Paradoxically, while recombination in hybrids can generate maladaptive individuals, it can also generate both novel genotypes and an overall increase in genetic variation, which can give hybrids a fitness advantage especially in novel environments. 7 Moreover, it has been suggested that environmental fluctuations in the native range could facilitate invasion success by imposing balancing selection on key fitness traits. 54 As a result of balancing selection, native populations from habitats with varying conditions would maintain high standing genetic variation and thereby an enhanced invasive potential. 54 For example, balancing selection on standing genetic variance for osmotic tolerance in the native range underlies freshwater adaptation in the invasive copepod E. affinis. 5956,60 Two scenarios might lead to the presence of M. leidyi hybrids in Europe in the future: (1) post-introduction hybridization in the recipient habitat; and (2) translocation of hybrids directly from native habitats to the non-native range. It is interesting to note that the northern and southern M. leidyi lineages seem to differ in their salinity tolerance as inferred from their observed distribution range in invasive populations. 23 While salinity restricts range expansion in Northern Europe, 48 the southern invasive population thrives at low salinities including the Caspian Sea and the Sea of Azov. 42 In our study, no pure individuals were genetically identified in Chesapeake Bay. Whether hybrid fitness at low salinity is enhanced over both parental lines needs further investigation via experiments. However, our data suggest that hybrid lines might have an adaptive advantage in low salinity environments and could potentially facilitate range expansion and contribute to the acceleration of their invasion success. We cannot dismiss that other mechanisms such as advection, hydrographic fronts and dispersal differences contribute to maintaining hybrid populations in the native range. However, our data suggest that hybrids represent a potential risk if translocated into low saline areas where M. leidyi has not been established, such as the Baltic Sea. 23 More generally, and unrelated to salinity, hybrid populations can pose a risk due to higher standing genetic variation. 54 Given the lag time of non-indigenous species in the novel habitat, which has been suggested to facilitate hybridization from distant source pools in the non-native habitat, 7 translocation from hybrid zones in native areas can be a matter of concern, due to their potential to increase colonization and invasion success in novel habitats. Even though physiological experiments and increased spatiotemporal sampling are needed to confirm the extent of the M. leidyi hybrid zone detected here in the native range, this study contributes to the general understanding of how hybridization in native populations might contribute to successful invasions in the marine environment. We encourage genomic monitoring of native populations of highly invasive species in order to identify hybrid populations and to prevent translocation of admixed individuals from hybrid zones in the native area to new, thus far uninvaded habitats.
[Table 2 legend: The test is based on the distribution of derived alleles and determines whether introgression has occurred and in which direction. We tested introgression from the southern lineage (Miami) into the northern populations. Significant p values in bold.]

Limitations of the study
We are the first to disentangle the status and proportion of admixed individuals and the nature of the hybrid zone in the native habitat of M. leidyi using a large number of genetic markers. We acknowledge that our spatial and temporal sampling of Chesapeake Bay is limited. At present, our study does not include the very low saline locations that would be needed to further support a putative advantage of hybrids in Chesapeake Bay, while temporal samples would allow the overall stability of the hybrid zones to be assessed. However, previous studies using a low number of markers 25,40,55,56 confirm that the hybrid zone in Chesapeake Bay has existed for at least 20 years. Future studies should include extended sampling using genome-wide markers, also in the area between New England and Chesapeake Bay, where hybridization is unlikely but might occur. As discussed, the observation of hybrids in Chesapeake Bay points to a putative hybrid advantage under the particularly variable environmental conditions found there. While other factors might contribute to the observed population structure (dispersal, temperature), we hypothesize that salinity, in combination with predation pressure by the jellyfish Chrysaora quinquecirrha, which is adapted to higher salinities, is the likely driver maintaining hybrids in Chesapeake Bay. Experiments should be conducted to substantiate a hybrid advantage at extreme salinity levels.

Sample collection
All permissions required to sample the comb jelly (ctenophore) Mnemiopsis leidyi in the native range along the US east coast were obtained, and all regulations were followed. No cultivation was needed for sample generation. M. leidyi is a simultaneous hermaphrodite, hence sex bias does not apply. Specifically, a total of 176 M. leidyi individuals were collected (see Table S3 for location and environmental details; Figure 1). Salinities encountered by M. leidyi in the northern population investigated here (US states MA, RI, and CT) are high, exceeding 19 even in Narragansett Bay, 65 while in Chesapeake Bay, M. leidyi has been confirmed from salinities as low as >2 (Figure 4). Samples were collected at all locations during late summer/early fall 2018 (N = 102); additional samples were collected at four locations in summer/early fall 2020 (N = 74) to allow for temporal genetic analysis. All samples were collected north of Cape Hatteras and correspond to the northern lineage identified in previous genetic studies. 25,27 For comparison, we also included samples collected at one site south of Cape Hatteras (Miami, FL, N = 15), which corresponds to the southern lineage previously analyzed in Jaspers et al. 27 The latter is the only southern native population analyzed using whole genome sequencing data up to now. Previous analyses using low genetic marker density showed differences between Gulf of Mexico and Florida M. leidyi populations for mitochondrial cytochrome b, but not for microsatellite markers. 25 However, connectivity between the locations previously analyzed along the US Gulf of Mexico coastline 25 and Miami, Florida is limited. We cannot exclude that cryptic diversity exists in the southern lineage, but so far no differentiation has been found within the southern Atlantic lineage from south of Cape Hatteras to Florida. 25

METHOD DETAILS

DNA extraction
M. leidyi were individually placed on coffee filters and dried for 48 hours at 60 °C. DNA was extracted using the DNeasy Blood & Tissue kit (Qiagen) following the manufacturer's protocol except for the sample processing and the elution steps. A 1 cm × 1 cm piece of dried tissue was cut from the coffee filter and placed into a 2 ml microcentrifuge tube. Then, 180 µl ATL buffer and 20 µl proteinase K were added, mixed by vortexing and incubated for 3 hours at 56 °C with occasional vortexing in between. After centrifuging at 10,000 rpm for 1 min, 200 µl AL buffer was added, mixed thoroughly by vortexing and incubated at room temperature for 10 min. In the final elution step, after discarding the collection tube and transferring the column into a new 1.5 ml Eppendorf tube, DNA was eluted by adding 50 µl AE buffer (pre-heated to 60 °C) to the center of the spin column membrane, incubating for 2 min and centrifuging at 10,000 rpm for 1 min. DNA concentration and purity were measured, and samples were afterwards diluted 1:100 for further analyses.

QUANTIFICATION AND STATISTICAL ANALYSIS

SNP chip genotyping
All individuals were genotyped at a total of 96 single nucleotide polymorphisms (SNPs) using a high-throughput low-density SNP microarray, developed from whole-genome re-sequencing data. 29 SNP genotyping was performed using Fluidigm 96.96 Dynamic Arrays (Fluidigm Corporation, San Francisco, CA, USA). The Fluidigm system uses nano-fluidic circuitry to allow for the simultaneous genotyping of up to 96 samples with 96 loci. 66 Genotypes were called and compiled using the Fluidigm SNP Genotyping Analysis software. Each assay was assessed for plot quality and expected clustering patterns. Northern and southern lineage individuals identified in Jaspers et al. 27 were used as positive controls.

Data analyses
Genetic diversity was estimated using observed (Ho) and expected (He) heterozygosities and standardized allelic richness (AR) per population, calculated in Arlequin v3.5.2.2. 61 Diversity values across populations were compared by one-way ANOVA using R. Standardized genetic differentiation statistics between sampling locations were calculated using Arlequin v3.5.2.2 in accordance with Weir and Cockerham. 67 First, pairwise genetic differentiation (FST) was calculated between all sample pairs. Second, a hierarchical AMOVA was conducted partitioning genetic differentiation into a geographical and a temporal component. All SNP data were used to conduct a Principal Component Analysis (PCA) to visualize population structure using smartPCA from the Eigensoft package, 62 with significance calculated using the Tracy-Widom statistic. 68 Population structure was further investigated using the Bayesian assignment approach implemented in STRUCTURE, 63 a model-based clustering algorithm that infers the most likely number of groups (K) in the data. The analysis was performed with K = 1-9, assuming an admixture model and correlated allele frequencies, and without population priors. A burn-in of 100,000 steps was followed by 1,000,000 additional Markov Chain Monte Carlo iterations. For each K, 10 independent runs were conducted to check the consistency of the results. The most likely K was inferred using the method of Evanno et al., 69 which measures the steepest increase of the ad hoc statistic ΔK based on the rate of change in the log probability of data between successive K values. STRUCTURE was also used to identify hybrids, estimating individual admixture proportions and their probability intervals.
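The choice of K described above can be illustrated with a minimal sketch of the ΔK calculation. It assumes the commonly used form in which the second-order difference of the mean log probability across replicate runs is divided by the standard deviation of the replicates at K; the function name, the use of the sample standard deviation, and the log-probability values are all hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev
from typing import Dict, List


def evanno_delta_k(lnp: Dict[int, List[float]]) -> Dict[int, float]:
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    lnp maps each K to the ln P(X|K) values of its replicate runs.
    """
    avg = {k: mean(v) for k, v in lnp.items()}
    delta: Dict[int, float] = {}
    for k in sorted(lnp):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # Delta K is undefined at the ends of the tested range
        sd = stdev(lnp[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        delta[k] = second_diff / sd
    return delta


# Hypothetical replicate log probabilities for K = 1..4 (three runs each).
runs = {
    1: [-9000.0, -9002.0, -8998.0],
    2: [-7000.0, -7001.0, -6999.0],
    3: [-6950.0, -6990.0, -6910.0],
    4: [-6940.0, -7000.0, -6880.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
```

In this toy example the largest ΔK falls at K = 2. Note that ΔK cannot be evaluated for the smallest or largest K tested, so the method cannot by itself support K = 1.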
Hybridization patterns were assessed using the framework of Bayesian model-based clustering implemented in NEWHYBRIDS, 64 which computes the posterior probability of belonging to each of the parental and distinct hybrid classes. The original genotype classes, i.e., parental northern, parental southern, F1 (parental northern × parental southern), F2 (F1 × F1) and first-generation backcrosses (F1 × parental northern, F1 × parental southern), were expanded to include all possible second-generation backcrosses (see Figure 5). The software was run for 100,000 iterations in the burn-in period, followed by one million Markov Chain Monte Carlo iterations in each analysis.
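The genotype classes listed above can be thought of as expected proportions of loci carrying two northern alleles, one allele of each lineage, or two southern alleles. The sketch below derives these proportions for the parental, F1, F2, first-generation backcross, and second-generation backcross classes. It assumes unlinked, fully diagnostic loci, and the class labels, function names, and the enumeration of second-generation backcrosses (each first-generation backcross mated to a northern, southern, or F1 individual) are illustrative assumptions rather than the exact class file used in the study.

```python
from fractions import Fraction
from typing import Dict, Tuple

# Expected proportions of loci with (two northern alleles,
# one northern and one southern allele, two southern alleles).
Freq = Tuple[Fraction, Fraction, Fraction]


def gamete_north(parent: Freq) -> Fraction:
    """Probability that a gamete from this class carries the northern allele."""
    nn, ns, _ss = parent
    return nn + ns / 2


def cross(a: Freq, b: Freq) -> Freq:
    """Expected genotype proportions in offspring of classes a and b."""
    pa, pb = gamete_north(a), gamete_north(b)
    return (pa * pb, pa * (1 - pb) + (1 - pa) * pb, (1 - pa) * (1 - pb))


def build_classes() -> Dict[str, Freq]:
    one, zero = Fraction(1), Fraction(0)
    c: Dict[str, Freq] = {"N": (one, zero, zero), "S": (zero, zero, one)}
    c["F1"] = cross(c["N"], c["S"])
    c["F2"] = cross(c["F1"], c["F1"])
    first_gen = {"F1xN": cross(c["F1"], c["N"]), "F1xS": cross(c["F1"], c["S"])}
    c.update(first_gen)
    # Second-generation backcrosses: each first-generation backcross
    # mated to a pure northern, pure southern, or F1 individual.
    for name, bc in first_gen.items():
        for mate in ("N", "S", "F1"):
            c[f"({name})x{mate}"] = cross(bc, c[mate])
    return c


classes = build_classes()
for label, freq in classes.items():
    print(label, [str(f) for f in freq])
```

Under these assumptions the enumeration yields twelve classes. A backcross of a (pure northern × F1) individual to an F1, the second-generation class reported for Wachapreague, is expected to carry two northern alleles at 3/8 of loci, one of each at 1/2, and two southern alleles at 1/8.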
Introgression was also investigated by testing for an excess of shared derived alleles using the ABBA-BABA test. 70 The test considers ancestral (A) and derived (B) alleles and, given three populations (P1, P2, P3) and an outgroup O with the relationship (((P1,P2),P3),O), counts the SNPs that match the ABBA and BABA genotype patterns. An excess of ABBA is indicative of recent introgression from P3 into P2, while an excess of BABA suggests excess shared ancestry between P1 and P3. Excess of ABBA or BABA patterns was tested by calculating Patterson's D statistic 71 using a jackknife method to test for a significant deviation from the null expectation of D = 0. In our case, we used the baseline population of Miami as P3 in order to infer the amount of shared ancestry between the northern populations and Miami and the direction of introgression. We included the North Sea population from Jaspers et al. 27 as an outgroup.
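The logic of the test can be sketched as follows, using the allele-frequency form of Patterson's D and a delete-one-block jackknife for the standard error. The software and block definition used in the study are not restated in this section, so the function names, the number of blocks, and the toy frequencies below are assumptions for illustration only.

```python
import math
from typing import List, Sequence, Tuple

# Each site holds derived-allele frequencies in (P1, P2, P3, outgroup).
Site = Tuple[float, float, float, float]


def abba_baba(site: Site) -> Tuple[float, float]:
    """Expected weights of the ABBA and BABA patterns at one site."""
    p1, p2, p3, po = site
    abba = (1 - p1) * p2 * p3 * (1 - po)
    baba = p1 * (1 - p2) * p3 * (1 - po)
    return abba, baba


def d_statistic(sites: Sequence[Site]) -> float:
    """Patterson's D: sum(ABBA - BABA) / sum(ABBA + BABA) over sites."""
    num = den = 0.0
    for s in sites:
        abba, baba = abba_baba(s)
        num += abba - baba
        den += abba + baba
    if den == 0:
        raise ValueError("no informative sites")
    return num / den


def jackknife_d(sites: Sequence[Site], n_blocks: int) -> Tuple[float, float, float]:
    """Return (D, standard error, Z) from a delete-one-block jackknife."""
    n = len(sites)
    edges = [i * n // n_blocks for i in range(n_blocks + 1)]
    leave_one_out: List[float] = []
    for i in range(n_blocks):
        kept = list(sites[:edges[i]]) + list(sites[edges[i + 1]:])
        leave_one_out.append(d_statistic(kept))
    centre = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((d - centre) ** 2 for d in leave_one_out)
    se = math.sqrt(var)
    d_all = d_statistic(sites)
    z = d_all / se if se > 0 else math.nan
    return d_all, se, z


# Hypothetical frequencies in which P2 shares derived alleles with P3.
sites = [
    (0.0, 0.5, 1.0, 0.0), (0.1, 0.4, 0.9, 0.0), (0.0, 0.3, 1.0, 0.0),
    (0.2, 0.2, 0.8, 0.0), (0.0, 0.6, 1.0, 0.0), (0.1, 0.1, 1.0, 0.0),
]
d, se, z = jackknife_d(sites, n_blocks=3)
```

A positive D in this setup reflects an excess of ABBA, that is, more derived alleles shared between P2 and P3 than between P1 and P3; swapping P1 and P2 reverses the sign. With real data, blocks would typically group neighbouring loci so that linked sites are removed together.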

Power of the markers to identify hybrids
To test the power of the markers to classify hybrids, a total of 120,000 individuals were simulated using all SNPs in the dataset for 12 categories, including first- and second-generation backcrosses, and reassigned blindly. Figure 5 shows all STRUCTURE plots for all simulated categories. Using NEWHYBRIDS, a correct assignment was made for all parental individuals, both northern and southern lineages, with 100% confidence (Table S4). Identification of F1 vs. F2 hybrids was more difficult since both classes shared similar admixture proportions and F1 hybrids presented no exclusive genotypes relative to F2 hybrids. However, F2 hybrids could be distinguished from F1 hybrids by the presence of recombinant genotypes and were correctly assigned with a confidence of 94%. Regarding backcrosses, a correct assignment with high confidence was obtained for first-generation backcrosses (on average 90.4%) and for second-generation backcrosses with the same parental type (on average 94.4%). The remaining second-generation backcrosses were correctly assigned with lower confidence (60.1-61.8%). Overall, results suggest that our SNP panel has enough discriminatory power to correctly identify parental northern, parental southern, F2, first-generation and second-generation hybrids. While simulated F1 hybrids could also be assigned as F2 hybrids, they could not be assigned as parental individuals or backcrosses. It should also be noted that we extended the original hybrid classes to include second-generation backcrosses, but we did not include later generation hybrids such as F3 or F4 hybrids, which would not be possible to distinguish from F2 hybrids; hence we refer to those hybrids as F2 (or later generation) hybrids in the discussion.

Figure 3. Admixture analysis visualized by STRUCTURE plots of Mnemiopsis leidyi sampled along the US east coast. Individuals were assigned on the basis of the most likely K, in this case K = 2. Locations as outlined in Figure 1 (SA = Sandwich, MA; WH = Woods Hole, MA; FA = Fort Adams, RI; FW = Fort Wetherill, RI; GC = Greenwich Cove, RI; EP = Esker Point, CT; WA = Wachapreague, VA; GP = Gloucester Point, Chesapeake Bay, VA; MI = Miami, FL).

Table 2. ABBA-BABA tests provide evidence of introgression