Stochasticity in space, persistence in time: genetic heterogeneity in harbour populations of the introduced ascidian Styela plicata

Spatio-temporal changes in genetic structure among populations provide crucial information on the dynamics of secondary spread for introduced marine species. However, temporal components have rarely been taken into consideration when studying the population genetics of non-indigenous species. This study analysed the genetic structure of Styela plicata, a solitary ascidian introduced in harbours and marinas of tropical and temperate waters, across spatial and temporal scales. A fragment of the mitochondrial gene Cytochrome Oxidase subunit I (COI) was sequenced from 395 individuals collected at 9 harbours along the NW Mediterranean coast and adjacent Atlantic waters (> 1,200 km range) at two time points 5 years apart (2009 and 2014). The levels of gene diversity were relatively low for all 9 locations in both years. Analyses of genetic differentiation and distribution of molecular variance revealed strong genetic structure, with significant differences among many populations, but no significant differences among years. A weak and marginally significant correlation between geographic distance and gene differentiation was found. Our results revealed spatial structure and temporal genetic homogeneity in S. plicata, suggesting a limited role of recurrent, vessel-mediated transport of organisms among small to medium-size harbours. Our study area is representative of many highly urbanized coasts with dense harbours. In these environments, the episodic chance arrival of colonisers appears to determine the genetic structure of harbour populations and the genetic composition of these early colonising individuals persists in the respective harbours, at least over moderate time frames (five years) that encompass ca. 20 generations of S. plicata.


INTRODUCTION
genetic drift, and adaptation to novel environments (Sakai et al., 2001;Strayer et al., 2006;Keller & Taylor, 2008) may yield important temporal changes in the genetic composition of these populations.
Despite the importance of temporal dynamics for understanding introduction processes in the sea (Rius et al., 2015), most studies to date only address spatial differentiation in the genetic structure of NIS. Among the few studies that have investigated temporal variation, some report marked decreases in genetic diversity over time (Pérez-Portela, Turon & Bishop, 2012), while others documented short-term changes in allele frequencies, within a context of sustained high genetic diversity (Paz et al., 2003;Guardiola, Frotscher & Uriz, 2012;Reem et al., 2013;Karahan et al., 2016;Pineda et al., 2016). To our knowledge, only Dupont, Bernas & Viard (2007b) reported temporal genetic homogeneity across age groups of an introduced gastropod. Goldstien et al. (2013) demonstrated how temporal sampling can provide a better understanding of introduction dynamics and improve the identification of sources of introduced populations.
In this study, S. plicata was used as a model to test the spatio-temporal dynamics of populations inhabiting small to medium-size harbours and marinas across > 1,200 km of Mediterranean coast (Iberian Peninsula) and adjacent Atlantic waters. Contrary to other introduced ascidians in this area (e.g., Microcosmus squamiger, Ordóñez et al., 2013), S. plicata does not occur outside of ports and confined environments, thus its dispersal among localities relies on human transport. We sampled the same populations at two time points separated by five years. Considering that S. plicata grows to maturity in three months and features a continuous reproductive cycle in the study area (Tursi & Matarrese, 1981;Pineda, López-Legentil & Turon, 2013), this temporal scale encompassed at least 20 generations. The investigated harbours have mostly local traffic and are likely seeded by occasional interchange of NIS among neighbouring harbours, as well as interchange with more distant, larger ports that act as entry points. Thus, we hypothesized that the high stochasticity of these seeding events and the low number of individuals that can be transported by small vessels at a given time will result in a patchwork-like distribution of genetic variability, which will in turn be highly dynamic in time as a result of bottlenecks, drift, and further occasional interchanges. Unravelling the genetic signatures of these processes will provide insight into the secondary spread of NIS in the area and contribute to the broader knowledge of introduced species dynamics in highly urbanized coasts.

Sampling
Nine localities along the Spanish coast (Iberian Peninsula) were sampled: seven on the Mediterranean side and two on the Atlantic shores, near the Strait of Gibraltar (Fig. 1). These localities consisted of small to medium-size harbours, mostly with only short-range fishing fleets and/or recreational vessels (Table 1). The harbours were sampled in 2009 and 2014 by collecting specimens attached to ropes and buoys (at least 5 m apart from each other), at 0-2 m depth. The ascidians were dissected immediately upon collection, and tissue close to the buccal siphon was preserved in absolute ethanol. Samples were kept at -20 C until processed.

DNA extraction and sequencing
DNA was extracted from muscle or branchial tissue with REDExtract-N-AmpTissue PCR Kit (Sigma-Aldrich, St. Louis, MO, USA). A fragment of the mitochondrial Cytochrome Oxidase I (COI) gene was amplified using the universal primers LCO1490 and HCO2198 (Folmer et al., 1994). Amplifications were performed in a final volume of 20 mL using 10 mL of REDExtract-N-amp PCR reaction mix (Sigma-Aldrich, St. Louis, MO, USA), 0.8 mL of each primer (10 mM), and 2 mL of template DNA. The PCR program consisted of an initial denaturing step at 94 C for 2 min, 30 amplification cycles (denaturing at 94 C for 45 s, annealing at 50 C for 45 s and extension at 72 C for 50 s), and a final extension at 72 C for 6 min, on a PCR System 9700 (Applied Biosystems). PCR products were purified using MultiScreen Ò filter plates (Millipore), labelled using BigDye Ò Terminator v.3.1 (Applied Biosystems) and sequenced on an ABI 3730 Genetic Analyser (Applied Biosystems) at the Scientific and Technological Centres of the University of Barcelona, Spain (CCiTUB). Other samples were directly sent for purification and sequencing to Macrogen Inc. (Seoul, South Korea). Sequences were edited and aligned using BioEdit Ò v.7.0.5.3 (Hall, 1999).

Genetic analyses
Number of alleles (Nh), haplotype diversity (Hd), and nucleotide diversity (π) were computed with DnaSP v.5 (Librado & Rozas, 2009). Two estimates of allelic differentiation between populations at each sampling year were used: the F ST estimator of Weir & Cockerham (1984), based on allele frequencies and calculated with Arlequin v 3.0 (Excoffier, Laval & Schneider, 2005); and the adjusted D est estimate described by Jost (2008), and obtained with SPADE (Chao & Shen, 2010). Both estimators varied between 0 (no differentiation) and 1 (complete differentiation). The use of F ST -like estimators has been criticized on the basis that it is dependent on the variability of the marker used (Jost, 2008;Jakobsson, Edge & Rosenberg, 2013), and it is advisable to use estimators independent of within population diversity, such as D est , in combination with traditional F ST -like statistics (Meirmans & Hedrick, 2011;Verity & Nichols, 2014).  Table 1. The significance of F ST values was calculated (permutation tests, 10,000 replicates), and a correction for multiple comparisons was applied following the Benjamini-Yekutieli false discovery rate (FDR) (Narum, 2006). For D est , the mean and SE values obtained with SPADE from 10,000 bootstrap replicates were used to calculate confidence intervals (using a normal approximation) with a FDR-corrected probability. A value of D est was deemed significant when the confidence interval around its mean did not contain 0.
In order to complement these differentiation methods based on allelic frequencies, we used two approaches that explicitly consider the spatial distribution of alleles across samples. First, we performed a spatial analysis of shared alleles (SAShA, Kelly et al., 2010), which tests the average geographic distance between co-occurrences of alleles against its expectation under panmixia. Based on results from the previous tests (see Results), this analysis was performed pooling the samples for both years. We used the SAShA 1.0 software and performed 10,000 permutations of data for assessing significance. Individual alleles were also analysed separately, which may be useful when common alleles mask a potential non-random distribution of less common alleles. In a second approach, estimates of gene flow (in number of migrants per generation) based on the frequency of private alleles (Slatkin, 1985; Barton & Slatkin, 1986) were performed between the localities studied at both years using the web version of the GENEPOP 4.2 software (Raymond & Rousset, 1995). Again, the analysis of private alleles may uncover patterns masked by abundant and widespread alleles.
For temporal comparisons, differentiation values (F ST and D est ) were computed for each population between sampling years. A correlation between these values among pairs of populations at both years was also calculated. Finally, an analysis of the molecular variance (AMOVA) was performed using haplotype frequencies with Arlequin, grouping the populations per sampled year, and its significance was tested by running 10,000 permutations of the data.
Mantel tests were also computed to correlate genetic and geographic distances separately at each year. The shortest distances by sea between points were calculated using Google Earth (Google Inc.). Mantel tests were performed with Arlequin for F ST and with the R package ade4 (function mantel-rtest) (Dray & Dufour, 2007) for D est , and their significance was tested by permutation (10,000 replicates).
The dataset was used to construct a median-joining network using Network v.4.5.1.6 (Bandelt, Forster & Röhl, 1999). A maximum likelihood tree was also computed using Styela gibsii as an outgroup (GenBank accession number HQ916447) with Mega v6.06 (Tamura et al., 2013), and the significance of the branches was tested by 10,000 bootstrap replicates. The General Time Reversible (GTR) model with proportion of invariable sites (+I) and rate variation among sites (+G) was selected for tree-building using jModelTest v.2.1.7 (Posada, 2008) and the Akaike Information Criterion.

RESULTS
Twenty to twenty-six individuals were sequenced per locality in 2009 and 2014 (Table 2), resulting in a total of 395 sequences with a final length after alignment and trimming of 580 bp. These sequences corresponded to 12 haplotypes featuring 28 variable positions. Half were private haplotypes (Table 2) from a single locality and time point. Between 1 and 4 haplotypes were found in each population. For consistency with the global dataset of Pineda, López-Legentil & Turon (2011), the same haplotype numbers presented in that study were used for identical sequences. New haplotypes were labelled from 23 onwards, as there were 22 haplotypes described in Pineda, López-Legentil & Turon (2011). Six of our sequences (haplotypes 1, 2, 3, 4, 5 and 9) were already identified by Pineda, López-Legentil & Turon (2011), of which 5 (haplotypes 1, 2, 3, 4 and 5) were also found by Maltagliati et al. (2015) in Italian waters (from both eastern and western shores). Haplotype frequencies per population and year are presented in Supplemental Information (Table S1). All new sequences obtained in this study have been deposited in GenBank (accession numbers KU878146-KU878151).
Overall, haplotype and nucleotide diversity were low (0.206 ± 0.042 and 0.0027 ± 0.0009, respectively, mean ± SD) ( Table 2). The highest haplotype diversity appeared in Conil (CON) in 2014 (0.594 ± 0.097), while a single haplotype (Hd = 0) was found in La Herradura (HER) in 2014 and in Torre Ladrones (TOR) and Chipiona (CHI) at both years. No clear geographic or temporal trend was apparent for haplotype diversity, with some populations showing higher values in 2009 and others in 2014 (Table 2). Haplotype 2 was the most abundant and widespread (only absent in Xàbia in 2009), followed by haplotype 1, which was dominant in Blanes (BLA), Arenys de Mar (ARE), and Xàbia (XAB). Haplotype 5 was the third most frequent and the dominant haplotype in Carboneras (CAR). The remaining haplotypes were found only in 6 or less individuals ( Fig. 1; Table S1). The haplotype frequencies in the two years were quite similar for the three common haplotypes (Fig. S1), while they varied in the less abundant alleles, of which three were found exclusively in 2009 and another three only in 2014 (Fig. S1).
The haplotype network revealed that haplotype 5 was most divergent, separated by 16 mutational steps from the remaining haplotypes, which were grouped more closely together, albeit with some divergent sequences (e.g. haplotypes 3 and 25, Fig. 2). Accordingly, the maximum likelihood tree (Fig. 3) showed a highly supported branch comprising all sequences except for haplotype 5, which was set apart in a different branch.
The results of the spatial differentiation analyses using F ST and D est are shown in Table 3. Mean values for both years were 0.481 ± 0.044 for F ST and 0.586 ± 0.052 for D est (mean ± SE). A majority of pairwise comparisons were significant (F ST : 64% in 2009, 72% in 2014; D est : 64% in 2009 and in 2014). Overall, Vilanova i la Geltrú (VIL) was the locality that showed fewer significantly different comparisons with other populations, while Carboneras (CAR) showed significant differentiation in all cases. In contrast, the values of differentiation between years for the nine localities were extremely low and non-significant in all cases (F ST = 0.007 ± 0.004, D est = 0.005 ± 0.003, mean ± SE) ( Table 3).
Population differentiation means per year and between years are plotted in Fig. 4, which clearly reflects the gap between differentiation measures among localities and those between years for the same locality. There is also a highly significant correlation between F ST and D est values for both years (correlation coefficients r > 0.95, Fig. 5), indicating that the differentiation level for any given population pair was maintained over the period studied. Figure 5 clearly shows two groups of pairwise comparisons separated by a gap. Some population pairs exhibited low differentiation values, including geographically close populations (for instance, most populations pairs around the Strait of Gibraltar) and also distant populations (for instance, Xàbia (XAB) with the two northernmost populations). A majority of population pairs, however, showed high differentiation values at both years (Fig. 5).
The spatial analysis of shared alleles (SAShA), combining information of both years, showed a significantly restricted spatial distribution of the alleles in our samples (expected mean distance between co-occurrences: 497.98 km, observed mean: 353.63 km, p < 0.001) (Fig. 6). These results imply that alleles occur more closely together than expected by chance, indicating an overall restriction to gene flow. A more detailed analysis allele per allele (only those distributed among at least two localities can be used in this approach) (Fig. S2) revealed that this pattern is consistent among alleles, with the exception of haplotype 4, which is more evenly distributed among localities. On the other hand, an analysis based on the frequency of private alleles revealed a non-negligible migration rate between localities, and the estimate was lower (1.139 migrants) for 2009 than for 2014 (2.201 migrants). This difference relates to the higher mean frequency of private alleles in 2009 (0.078) than in 2014 (0.053).
The results of the AMOVA grouping populations by sampling year showed that most genetic variability occurred between populations in a given year (74.26%, p < 0.001) and within populations (33.64%, p < 0.001) ( Table 4). In contrast, the variance explained by the factor year was negative and not significant.

DISCUSSION
Our setting is representative of many highly urbanized coasts along the Mediterranean Sea and elsewhere, with a dense network of harbours and marinas of different sizes. We focused specifically on small to medium-size harbours, which mainly host short-range fishing fleets and recreational boats. These are typically sites of secondary dispersal for introduced species (post-border dispersal sensu Forrest, Gardner & Taylor, 2009). In a study of populations of Styela clava in New Zealand, Goldstien, Schiel & Gemmell (2010) showed that large ports and marinas can have separate dynamics. Large ports with commercial and overseas cruise activities are interspersed among our sampling areas (Barcelona, Tarragona, Valencia, Cartagena, and Algeciras have the highest volumes of traffic) and can act as initial entry points of introduced species from other seas. Through the analysis of the genetic structure of populations of an introduced ascidian over ca. 1,200 km of coastline at two time points five years apart, we have found significant spatial differentiation, but negligible temporal variation. This was contrary to our prediction of high variability related to both space and time in this species that relies on human transport for dispersion between harbours. The system was less dynamic than anticipated, showing temporal persistence in the genetic composition of the populations over a period of time that encompassed at least 20 generations of the species. This time frame has also provided ample opportunities for interchange via vessel movements, considering that fishing activities are continuous and that recreational traffic is very active in the studied area, particularly in summertime.
In another population of Styela plicata, Pineda et al. (2016) documented a much more dynamic scenario with frequent gains and losses of microsatellite alleles. That population, however, was located in an unstable habitat in the Atlantic Intracoastal Waterway (North Carolina, USA), subject to periodic flooding and die-off episodes, which contrasts with the apparent stability of the harbours analysed here, at least over the temporal scale analysed. Importantly, Pineda et al. (2016) used both microsatellite and COI sequence   datasets, and the former was more informative than the latter. However, changes in low frequency COI haplotypes were detected in Pineda et al. (2016) after die-off episodes, suggesting that mitochondrial sequence information alone is enough to detect substantial temporal changes in genetic structure. Two major clades of COI have been described for Styela plicata (Pineda, López-Legentil & Turon, 2011) and most of our sequences belonged to haplogroup 1. The divergent haplotype 5 was the only representative of haplogroup 2, confirming its presence in the Mediterranean Sea (Maltagliati et al., 2015). Overall, some geographically close localities displayed very different haplotype composition  (e.g., Vilanova i la Geltrú and Arenys de Mar), while some widely separated populations were genetically similar. Other localities, like Carboneras, exhibited a haplotype composition completely different from all other populations. Therefore, the spatial pattern was complex and heterogeneous, albeit with a tendency towards the dominance of haplotype 1 in the North and haplotype 2 in the South of the studied area. This tendency explains the marginally significant Mantel tests, as North-South comparisons were also the most distant ones. The dominance of a few haplotypes resulted in populations being relatively similar or highly differentiated, depending on whether they shared one or several of these common haplotypes, with few populations showing intermediate levels of variability (Fig. 5). The analysis of statistics based on spatial distribution of alleles showed more subtle patterns. For instance, the spatial analysis of shared alleles indicated a geographic span of allele co-occurrences smaller than expected, which suggested some degree of stepping stone dispersal among localities (Kelly et al., 2010). The inference based on  private alleles, on the other hand, showed a small but non-negligible estimate of migration between localities, and changes in private alleles were also detected between years. Thus, the genetic composition of the populations may in fact change over time, but likely at temporal scales much larger than the one studied here.
Haplotype richness (1-4 haplotypes) and gene diversity values (mean Hd of ca. 0.2) were generally low in the sampled populations and lower than reported for the same marker in other introduced ascidians in harbours (e.g., Botryllus schlosseri, López-Legentil, Turon & Planes, 2006;Lejeusne et al., 2010;Microcosmus squamiger, Rius, Pascual & Turon, 2008;Diplosoma listerianum, Pérez-Portela et al., 2013). In other cases, evidence for bottlenecks and low genetic diversity was found associated with introduction events (e.g., Corella eumyota, Dupont et al., 2007a;Didemnum vexillum, Stefaniak et al., 2012;Perophora japonica, Pérez-Portela, Turon & Bishop, 2012). In further instances, a wide range of genetic diversities has been found among introduced populations of some species (e.g., Styela clava, Goldstien et al., 2011). In Styela plicata, low values of diversity, similar to those found here, were reported for populations from Australia and New Zealand (Torkkola, Riginos & Liggins, 2013), while in a study of 15 harbours in Italy (including some big ports with international traffic), widely different values of haplotype diversity were found (Maltagliati et al., 2015). The low genetic diversity observed here is consistent with the idea that the studied populations are seeded by small number of individuals, likely associated to local boating activities. However, the lack of significant changes in the genetic structure of all investigated populations over time is unexpected and suggests that interchange with other populations is sparse in time, and that genetic drift alone in inherently small populations does not suffice to modify allele frequencies, at least at the temporal scale surveyed (5 years).
In conclusion, our results confirm the stochastic nature of colonization of small harbours and marinas by introduced species traveling as ship fouling. Recurrent introductions do not seem to be frequent in these harbours, preventing genetic homogenization over space and enabling the persistence of haplotypes in a given location over time, barring major perturbations that result in die-offs (Pineda et al., 2016). Our study area is representative of many highly urbanized coasts with dense harbours. In these environments, the episodic chance arrival of early colonisers appears to determine the structure of the harbour populations and the genetic composition of these early colonising