Conservation Genomics Reveals Low Connectivity Among Populations of Threatened Roseate Terns in the Atlantic Basin

While the effects of barriers to dispersal such as population declines, habitat fragmentation, and geographic distance have been well-documented in terrestrial wildlife, factors impeding the dispersal of highly vagile taxa such as seabirds are less well understood. The roseate tern (Sterna dougallii) is a globally distributed seabird species, but populations tend to be both fragmented and small, and the species is declining across most of its range. Within the Atlantic Basin, past work has shown differentiation among roseate terns breeding on different continents, but these results were generated with a limited number of microsatellite markers. Relationships between breeding populations in the Northwestern Atlantic and the Caribbean have never been analyzed. We evaluated population structuring of roseate tern populations in North America and the Azores using both microsatellite markers and single-nucleotide polymorphisms generated through targeted sequencing of Ultra-conserved Elements. For both marker types, we found signi�cant genetic differentiation among all 3 populations and evidence for moderate contemporary unidirectional gene �ow from the Caribbean to the Azores, but not among other populations. Within the Caribbean metapopulation, we found high rates of unidirectional migration from the Virgin Islands to Florida, potentially indicative of movement from source population to sink or an artifact of dispersal among other unsampled populations in the Caribbean region. These observations have signi�cance for species persistence in the Atlantic, as our results suggest that loss of genetic diversity within populations is unlikely to be buffered by in�ow of new alleles from other breeding populations.


Introduction
Population connectivity, or migration of breeding individuals between populations, can be crucial for the persistence of species because periodic in uxes of new genetic material can help maintain adequate levels of genetic diversity in small and declining wildlife populations (Chesser 1991).Human activities in the Anthropocene have led to habitat fragmentation and widespread declines across wildlife taxa (Dirzo et al. 2014;McCauley et al. 2015), both of which can alter historical connectivity patterns (Hess 1996).
When connectivity is altered or severed, populations can become isolated, which can lead to subsequent erosion of genetic diversity (Johnson et al. 2004;Athrey et al. 2011;Tracy & Jamieson 2011;Leigh et al. 2019).This loss of diversity can compromise both population-wide tness (Reed & Frankham 2003) and the adaptive potential of populations in the face of environmental change (Pauls et al. 2013).Effects of isolation are exacerbated as populations decline, with smaller populations at greater risk of deleterious tness effects via inbreeding depression (Frankham 1996), which can contribute to further population declines (Crnokrak & Roff 1999;O'Grady et al. 2006).Quantifying connectivity between populations has therefore become increasingly recognized as a priority action for the conservation of threatened and endangered species (Pierson et al. 2016;Funk et al. 2018;Ralls et al. 2018;Hoban et al. 2020).
Although connectivity is sometimes inferred by a species' movement capabilities, empirical quanti cation of gene ow between populations is necessary both because dispersal is not always determined by movement potential (Lombal et al. 2020) and because gene ow is not always facilitated through dispersal (Bensch et al. 1998;Peterson et al. 2014).This disparity between potential and actual movement is exempli ed by highly mobile avian taxa such as seabirds.Most seabird species have the potential to disperse across vast distances.Accordingly, they typically exhibit an overall lack of genetic structure among populations within the same oceanic basin (Tigano et al. 2015), with dispersal among populations typically inhibited by geographic barriers such as land, but not by distance between breeding sites (Genovart et al. 2003;Bicknell et al. 2012;Mariano-Jelicich & Madrid 2014;Wojczulanis-Jakubas et al. 2015;Yannic et al. 2016;Quillfeldt et al. 2017).As many seabirds are migratory, population connectivity is often linked to wintering movements, with dispersal aided by movement along migratory routes (Szczys et al. 2017) or by mixing of geographically disjunct populations on shared wintering grounds, resulting in individuals following conspeci cs to new breeding populations (Sonsthagen et al. 2019).However, population structuring has also been detected among seabirds on relatively small spatial scales (Nuss et al. 2016;Sruoga & Butkauskas 2006).Restricted gene ow between seabird populations may indicate other isolating mechanisms at work, including philopatry (Hailer et al. 2011;Welch et al. 2012), assortative mating due to adaptive differences (Lombal et al. 2017;Nunes 2018), or seascape variables such as marine currents or winds (Nuss et al. 2016).Restricted dispersal can also result from anthropogenic in uences such as habitat loss, which can lead to population declines and range contractions that have the potential to alter historical dispersal regimes among populations (Cristofari et al. 2019;Peery et al. 2010).
The roseate tern (Sterna dougallii) is a globally distributed seabird species, but most populations are classi ed as threatened and the species is declining across its range.In North America, subspecies S. d. dougallii is separated into federally endangered Northwestern Atlantic (hereafter, Northwestern) and threatened Caribbean metapopulations (USFWS 2010).The Northwestern population is known to have experienced substantial declines in the 20th century.Long-term population dynamics within the Caribbean are not known, but extirpations and range contractions have been observed at multiple locations within the Caribbean, and monitored populations are declining (Zambrano et al. 2000;Soanes et al. 2020).There is thought to be no gene ow between the Northwestern Atlantic and Caribbean metapopulations based primarily on potential morphological differences, geographical separation, and a lack of band return evidence (Shealer & Saliva 1992;USFWS 2010).
The assumption that roseate tern metapopulations in North America do not interbreed requires validation.
Morphological differences among seabirds can be a product of environmental conditions and are not necessarily representative of genetic differentiation (Genovart et al. 2003;Lashko 2004;Wojczulanis-Jaukubas et al. 2015).Despite an extensive long-term research program in the Northwest, roseate tern dispersal within the Atlantic is still poorly understood due in part to low research effort in the tropics, and the degree of movement among regions is largely unknown (Mostello et al. 2014;Shealer & Saliva, 1992;Spendelow et al. 2010).Geographical separation is unlikely to inhibit dispersal as roseate terns have high movement potential, and banded individuals have been observed in colonies far from their natal region during the breeding season (Shealer & Saliva 1992;Hays et al. 1999).There are no physical barriers between the two populations, and Northwestern roseate terns use Caribbean islands as stopovers during their migration to South America (Nisbet 1984;Mostello et al. 2014), suggesting that seascape barriers are unlikely to compromise movement between regions.Additionally, the two populations are believed to overlap on their wintering grounds in South America (Nisbet 1984;Hays et al. 1999), which has been shown to promote interbreeding between distinct breeding populations in other seabirds (Friesen et al. 2007;Bicknell et al. 2012;Mariano-Jelicich & Madrid 2014).
Here, we conduct the rst investigation of genetic connectivity between both roseate tern populations in North America.To assess potential dispersal across the Atlantic, we also included samples from the Azores population, which is roughly equidistant from the Northwestern and Caribbean populations and can provide insight into large-scale cross-oceanic movement.Mitochondrial and microsatellite loci have been previously used to investigate roseate tern global connectivity, with populations found to be genetically differentiated between the Atlantic and Indo-Paci c oceanic basins and across the Atlantic (Lashko 2004), but with no genetic structuring found between colonies in the Northwest (Szczys et al. 2005;Dayton & Szczys 2021).The Caribbean metapopulation was not included as part of this earlier study, and we do not have a good understanding of movement within the Caribbean (Nisbet & Ratcliffe 2008).Additionally, roseate terns exhibit low genetic variability (Szczys et al. 2005;Dayton & Szczys 2021), as is common in terns (Faria et al. 2007;Mariano-Jelicich & Madrid 2014), which can make identifying genetic structuring challenging with low-quantity markers such as microsatellites.Due to this low genetic variation, we supplemented microsatellite markers with single nucleotide polymorphisms (SNPs) derived from capture and sequencing of Ultra-Conserved Elements (UCEs; Faircloth et al. 2012) .UCEs have successfully been applied to resolve phylogenetic relationships among closely related species complexes for a variety of taxa (Everson et al. 2019;Oswald et al. 2019;Tsai et al. 2019;Winker et al. 2018), but application within species and for more shallow timescales has been minimal (Parker et al. 2020).Using two sets of marker types enables comparison of results and validation of the effectiveness of UCEs for population-level questions.
Minimal genetic structuring among populations of roseate terns would be consistent with other studies of large-scale seabird connectivity in the absence of biogeographical impediments to movement, and would indicate that gene ow indeed occurs between these two populations.Alternatively, presence of genetic structuring between these populations may indicate disruption of historical dispersal patterns resulting from extensive population declines in the 20th century, a theory that could be supported by ndings of low genetic diversity and recent population bottlenecks in one or both metapopulations.

Sample Collection and DNA Preparation
We collected samples from roseate tern chicks over 2017-18 from Leduck Island in the U.S. Virgin Islands (USVI; n = 37) and from 3 sites in the Florida Keys (FL; n = 26; Fig. 1).Blood was collected by extracting < 25 µl from the tarsal veins of chicks.Samples were stored in Queen's lysis buffer.We collected blood samples only from chicks, which were captured by hand at nest sites, and sampled only chicks < 3 days old to facilitate rapid capture.Only one individual was sampled per nest.Because of high yearly site turnover and apparent mixing between colonies in the VI (Pierce 2009), genetic structuring is likely to be minimal, and the samples from Leduck Island were assumed to be representative of this population.We also opportunistically salvaged tissue from deceased chicks and eggs.Blood samples from Massachusetts (MA; n = 35) were collected in 2017 by the Massachusetts Division of Fisheries and Wildlife using similar methods.As the Northwestern population has been demonstrated to be panmictic (Szczys et al. 2005;Dayton & Szczys 2021), we assumed that the MA colonies would be representative of roseate tern genetic variation within the Northwestern Atlantic.Tissue samples were salvaged from chicks subsequent to depredation in the Azores (AZ; n = 10) in 2018 by researchers from the University of Azores.
We extracted DNA from all sample types using Qiagen DNeasy blood and tissue kits (Qiagen, Hilden, Germany) following recommended protocols and with a modi ed 24-hour digestive period for tissue samples.We quanti ed sample concentration using a Qubit Fluorometer (Invitrogen, Carlsbad, CA) and removed samples with DNA concentrations below a minimum of 0.05 ng/µl from further analysis.

Microsatellite Genotyping
We ampli ed 14 microsatellite markers previously optimized for roseate terns in Dayton and Szczys (2021; see Table S1 for microsatellite sequences and conditions).Independent triplicates were run for low-quality (≤ 2.0 ng/µl) samples, samples obtained from tissue, and samples with evidence of null alleles (i.e., failure to amplify at a speci c locus).PCR conditions followed Szczys et al. (2005) with an initial denaturation of 2 min at 94°C, 35 cycles of 30 s at 72°C, 30 s annealing at 50-58°C, 30 s touchdown at 72°C, and 2 min extension at 72°C.PCR product was loaded on an ABI PRISM 3100 (Applied Biosystems Foster City, CA, USA) to separate amplicons by size and visualize alleles.
We manually detected and scored microsatellite alleles using GENEMAPPER 3.7 (Applied Biosystems).We then checked microsatellite loci for evidence of null alleles and genotyping error via 10,000 Monte Carlo runs and 95% con dence intervals in MICRO-CHECKER (Van Oosterhout, Hutchinson, Wills, & Shipley, 2004), which tests alleles for evidence of excess homozygosity.We tested alleles for linkage disequilibrium and deviation from Hardy-Weinberg Equilibrium (HWE) using 10,000 Monte Carlo replicates in the R package 'adegenet' 2.1.3(Jombart 2011) and removed loci in disequilibrium.For this and all analyses involving multiple comparisons, we corrected α levels at the signi cance threshold α = 0.05 using a sequential Bonferroni correction (Rice 1989).Null alleles can lead to overestimation of genetic differentiation among sampling locations.We used the program FreeNA (Chapuis & Estoup 2007) to investigate the effect of potential null alleles on population differentiation in microsatellites.FreeNA estimates the null frequencies for each locus and population and applies a correction to account for the positive bias of null alleles on Weir's F ST (Chapuis & Estoup 2007).

SNP Genotyping
We prepared samples for genomic library construction by shearing DNA using a Biorupter with a target fragment size of < 500bp.Fragment size was visualized via 1% agarose gel.After shearing, we prepared genomic libraries using the Blunt-end Single-Tube (BEST) method (Carøe et al. 2018), with sheared DNA fragments ligated to barcoded adapters and P7 and P5 Illumina primers, ampli ed for sequencing with 9-12 cycles, and puri ed using AMPure XP beads (Beckman Coulter, Inc.).We then quanti ed DNA concentration in multiplexed libraries using a Qubit Fluorometer and assessed mean fragment size using an Agilent 2200 TapeStation (Agilent Technologies, Santa Clara, CA).For UCE enrichment, we combined multiplexed libraries into 750 (150ng/library) pools.We then enriched equimolar pools for 5,060 UCE loci using the myBaits UCE Tetrapods 5Kv1 kit (Arbor Biosciences, Ann Arbor, MI) following the recommended protocol and with 16 cycles of post-enrichment PCR.We quanti ed fragment size distribution of enriched pools with a Bioanalyzer High Sensitivity dsDNA kit (Agilent Technologies, Santa Clara, CA).Enriched pools were then combined at equimolar ratios and sequenced at GENEWIZ (South Plain eld, NJ, USA) as paired-end 150bp reads on an Illumina HiSeq system.
Sequenced genomic reads were demultiplexed using Bcl2fastq 1.8.4 (Illumina).Following demultiplexing, we processed reads using the Phyluce bioinformatics pipeline, a set of programs set up for UCE sequence analysis (Faircloth, 2016).First, we cleaned reads by removing adapter contamination and low-quality bases with Illumiprocessor (Faircloth 2013), which processes reads via the trimming tool Trimmomatic 0.32.1 (Bolger et al. 2014).For de novo contig assembly, we chose a single modern sample with an intermediary number of reads (FL1843), which we reasoned would maximize recovery of UCE contigs while minimizing the amount of duplicate contigs, and therefore processing time.To prepare our reference, we assembled the reference sample's Read 1 and 2 les de novo using the Trinity 2.0.6 assembler (Grabherr et al. 2013), mapped the output contigs to a FASTA le of the 5,060 UCE probes using Phyluce, discarding duplicate contigs and non-UCE loci in the process, and then converted the results to our nal reference FASTA le.We then aligned our raw, clean reads from each sample to the reference using the 'mem' algorithm in BWA 0.7.12 (Li & Durbin 2010; Li 2013) and ltered the output using BWA and PicardTools 1.113 (http://picard.sorceforge.net)within the Phyluce wrapper.We then merged all individual Binary Alignment/Map (BAM) les into a single BAM le with Picard and indexed the output merged BAM with Samtools 0.1.19(Li 2009).
To call SNPs and indels, we used HaplotypeCaller in GATK 4. 1.8.1 (McKenna et al. 2010).We then used VCFtools 0.1.12b(Danecek et al. 2011) to lter SNPs by rst removing indels and all loci below a Phredscaled minimum sequence quality of 30, a minor allele count of 3, a minimum read depth of 5, and a maximum read depth of 100.We then ltered out individuals with > 50% missing data and restricted loci to a maximum of 10% missing data.We removed loci out of HWE while controlling for a potential Wahlund effect by grouping individuals by sampling location, testing for deviation from HWE using 1000 Monte Carlo replicates in the package 'pegas', and removing loci out of HWE in any of the putative populations from all analyses using a sequentially Bonferroni-corrected signi cance threshold of α = 0.05.To create our nal dataset, we thinned loci randomly to one SNP per UCE locus to control for effects of linkage disequilibrium.

Population Genetics
To measure genetic diversity, we estimated observed and expected heterozygosity (respectively, H o and H e ) and rare ed allelic richness (A R ) for each of the four sampling locations in the R package diveRsity (Keenan et al. 2013).To evaluate differences in genetic diversity among populations, we used a randomized one-way ANOVA with locus as a blocking factor to test for signi cant differences in H o , H e , and A R among sampling locations and used Tukey testing to conduct multiple comparisons among population pairs if the ANOVA revealed signi cant interpopulation differences in a genetic diversity metric.We examined potential population differentiation with Weir & Cockerham's F ST (Weir & Cockerham 1984) in diveRsity, using 10 4 bootstrap permutations to calculate bias-corrected 95% con dence intervals and test pairwise F ST for signi cant differences among sampling locations.
Roseate tern yearly census sizes based on nest counts are not necessarily indicative of genetic changes in the population, given that individuals appear to skip breeding years; thus, effective population size (N e ) may provide a better estimate of number of breeding individuals within each population.To quantify N e , we used the bias-corrected Linkage Disequilibrium (LD) method in the program NeEstimator v. 2.01.Estimates of N e can be less precise with lower sample sizes and numbers of loci (Leberg 2005;England et al. 2006), and we therefore excluded the FL and AZ populations from the analysis and quanti ed N e only with the SNP dataset.We estimated N e separately for the NW and VI populations and report results from lowest allele frequency ≤ 02 following the recommendations of Waples and Do (2010) and with parametric 95% con dence intervals.
To test sampling locations for signals of recent (< 4 N e ) population bottlenecks we used BOTTLENECK 1.2.02 (Cornuet & Luikart 1996) to quantify differences between observed and expected heterozygosity under simulated mutation models.In BOTTLENECK, signi cant heterozygosity excess is taken as evidence of recent genetic bottlenecks under the assumption that allelic richness is lost from a population before heterozygosity with a rapid population decline.We removed AZ from the SNP analysis due to insu cient sample size (Cornuet & Luikart 1996).For both microsatellites and SNPs, we used the sign test to investigate signi cant differences between heterozygosity excess and expected heterozygosity excess under mutation-drift equilibrium.We examined microsatellites under both a stepwise mutation model (SMM) and an intermediate two-phase model (TPM) with 90% stepwise mutation, and examined SNPs under an in nite allele mutation (IAM) model, with 1000 iterations for all models.

Population Structure
We inferred population structuring or number of genetic clusters among our sampling locations for both microsatellite and SNP data using two methods.The rst, STRUCTURE v. 2.3.4.(Pritchard 2000), uses Bayesian clustering to assign individuals to the most likely number of genetic populations through multilocus genotypes by grouping genotypes to minimize deviations from HWE and linkage equilibrium.For both marker types, we ran STRUCTURE simulations for 1-5 possible clusters (K) using the admixture model with correlated allele frequencies and with sampling location both included and omitted as prior information.We ran 10 iterations each of K = 1-5 with 10 4 MCM repetitions after a burn-in of 10 4 generations.We used the online program STRUCTURE HARVESTER (Earl & von Holdt 2012) to infer the most likely value of K via the Evanno et al. ( 2005) method, which de nes the best supported value of K as that with the highest ΔK, the rate of change in log probability of the data between successive K values.The Evanno method does not return results from K = 1, and we therefore validated results from the Evanno method with the mean log likelihood from each iteration of K (LnP(K)), which can also be an effective means of identifying best-supported K (Pritchard 2000).We then aligned clusters, merged STRUCTURE runs by K value, and visualized output using the R package Pophelper 2.3.0 (Francis 2019).STRUCTURE models can be sensitive to deviations from HWE and linkage equilibrium, which we suspected might occur if populations were substructured or subject to inbreeding.We therefore complemented analysis of population structuring in STRUCTURE with a Discriminant Analysis of Principal Components (DAPC) for both marker types in adegenet.DAPC is a multivariate analysis method that uses principal components (PCs) to infer genetic clustering and assign of individuals to groups by minimizing within-group variation (Jombart et al. 2010) .DAPC has been empirically demonstrated to be better at detecting ne-scale clustering than STRUCTURE and, because it does not require a speci ed population genetics model for input, does not require loci to be in HWE (Jombart et al. 2010).Prior to implementing DAPC, we used the K-means clustering via the ' nd.clusters' function to identify the most likely number of genetic clusters in our dataset without a priori population information, retaining all PCs, and selecting the number of clusters based on the Bayesian Information Criterion (BIC) value.For DAPC analysis, retaining a suboptimal number of PCs can cause model under or over tting (Jombart et al. 2010).For both marker types, we used cross-validation with 1000 permutations to identify the optimal number of PCs as those that balanced the highest mean successful assignment of samples with the lowest mean squared error rate, and used this used this as input for the discriminate function analysis.We visualized results both without and with a priori genetic clusters to describe the relationships among putative populations.We used number of a priori clusters (K = 4) to assign individuals to most likely population of origin and to identify individuals that deviated from their prior group assignment based on retained discriminate functions.We cross-validated results for both marker sets by masking 25% of individuals from each population (but using 50% of individuals from AZ for SNP loci, as we only had 4 samples) to use as a test dataset in a DAPC analysis.We then used the predict.dapcfunction with the rest of the data as a training dataset, and evaluated assignment of the training data via posterior membership probabilities.For SNPs, we further investigated population assignment testing using Monte Carlo cross-validation in the R package assignPOP (Chen et al. 2018) with resampling iterations of 50%, 70% , and 90% of individuals with 50%, 75%, and 100% of SNP loci, and with 100 replicates run for each proportion of individuals.
We investigated contemporary gene ow between regions using BayesAss 3.0.4(Wilson & Rannala 2003) for both markers with modi cations for SNP analysis (Mussmann et al. 2019).BayesAss estimates recent (< 3 generations) unidirectional migration rate among populations as m ij , with m as the proportion of population i resulting from gene ow from population j.We ran the program for 3 × 10 6 iterations, with a burn-in of 1 × 10 6 iterations and a sampling frequency of 2,000.To evaluate convergence of MCMC values we performed the analysis 5 times using different starting seeds and compared the posterior mean parameter estimates for each run to check concordance.Delta values were adjusted with a goal of a 20-60% acceptance rate for each parameter (Wilson & Rannala 2005).

Microsatellite Genotyping
Fourteen samples were excluded based on low concentration or failure to amplify (Table 1).No loci showed signs of signi cant LD.One microsatellite marker (IVGUD103) was monomorphic for all samples and was removed from further analysis.No loci showed evidence of scoring errors or evidence for large allele dropout, but 3 loci (AAT-27, AAC-20, and RGB-27) exhibited homozygote excess.Although homozygote excess can indicate presence of null alleles, no loci showed consistent homozygote excess across all populations.We found no substantial differences for pairwise F ST calculated with and without the null correction in FreeNA, suggesting that null alleles were unlikely to have a substantial effect on our results.Additionally, no loci exhibited high frequency of null alleles across all sites, and none exceeded an estimated frequency of > 0.21 null alleles per locus per site (Table S2); therefore, we did not remove any loci based on null allele frequencies.When the 4 sampling regions were treated as separate populations, the only locus with signi cant HWE deviation across > 1 sampling location was Calbo-2 for both the USVI and NW Atlantic.We removed this locus from the analysis.SNP Genotyping 37 samples were excluded based on low concentration, failure during library preparation, or failure to sequence (Table 1).Assembly of the specimen used to create our reference (FL1843) resulted in 21,839 contigs with a mean length of 722 base pairs (bp), for a total of 4,520 UCE loci representing 3,265,327 bp .Following sample alignment and ltering, we were left with a nal SNP dataset of 3,385 total SNPs with an average per-site sequencing depth of 53 x (SD ± 37.91) per individual.With thinning by locus to remove loci out of linkage equilibrium and removal of 44 loci out of HWE, this resulted in 2,043 SNPs analyzed.

Population Genetics
For microsatellites, we found no evidence for differences among sampling locations (Table 1) for either H o (F 3 = 0.174, p = 0.60), H e (F 3 = 0.24, p = 0.91), or A R (F 3 = 0.18, p = 0.95).For SNPs, genetic diversity statistics (Table 1) differed signi cantly between sampling locations for H o (F 3 = 168, p < 0.001), H e (F 3 = 37.48, p < 0.001), and A R (F 3 = 31.62,p < 0.001).All 3 genetic diversity statistics were signi cantly lower for AZ than the other 3 sampling locations (p < 0.001).Both H o (p = 0.003) and H e (p = 0.04) were signi cantly higher for the USVI than for MA, but there were no differences between other populations.All pairwise values of F ST were signi cantly different from 0 among populations except for USVI and FL for both microsatellites and SNPs (Table 2), indicating signi cant barriers to dispersal between all 3 populations.Differences in N e between sampling regions largely corresponded to census population size, with a smaller effective population size for NW (N e = 903.9;95% CI: 935.1, 3730.3)versus VI (N e = 1171.9;95% CI: 1278.0,5038.7).
Results from BOTTLENECK did not agree between the two marker types.For microsatellites, no signi cant heterozygosity excess was detected in any regions under either the SMM or TPM models, suggesting absence of a bottleneck for all populations.Conversely, for SNPs, all regions exhibited signi cant heterozygosity excess at a level of p < 0.001 under the IAM model, indicating presence of genetic bottlenecking for all 4 populations sampled.

Population Structure
We found strong evidence for population structuring between the NW and CAR.STRUCTURE results for both markers returned K = 2 as the most likely number of genetic clusters among our four sampling locations (Fig. S1, a-d), with ΔK values supported by lower values of (LnP(K)) at K = 1 for all 4 treatments.For microsatellites, MA clustered separately from the other 3 sampling locations when location was included as a prior, and clustering was consistent with AZ as differentiated from MA at all levels of K (Fig. 2a).For SNPs, MA also clustered separately from FL and the USVI, but AZ showed high proportions of relatedness to MA at all values of K except K = 4 (Fig. 2b).When location was not included as a prior, results did not differ for SNPs (Fig. S2a), but microsatellites showed greater admixture among all 4 sampling locations at all values of K (Fig. S2b), indicating that microsatellites require location included as a prior to increase statistical power.
Structure analysis with DAPC also indicated a high degree of genetic structuring between the NW and CAR.Identi cation of optimal number of clusters with K-means clustering returned K = 4 for microsatellites, while for SNPS output from K-means clustering was consistent with STRUCTURE output, with 2 clusters identi ed as optimal.For DAPC by optimal clusters, 5 PCs were retained for microsatellites, representing 53% of conserved variance, and 15 PCs were retained for SNPs, representing 34% of conserved variance.For microsatellites, samples grouped optimally did not clearly cluster by sampling location (Fig. 3a), with mixed groupings of all 4 regions in each cluster.For SNPs, samples were largely differentiated by region (Fig. 3b), with samples clustering as MA versus FL and the USVI, and with AZ split equally between groups.
When plotted as K = 4 with a priori population groups as input, microsatellites were differentiated slightly by region (Fig. 3c), whereas SNPs clearly clustered individuals by region (Fig. 3d).DAPC from microsatellites was able to correctly assign samples to sampling region 67% of the time, with assignment accuracy varying by population (AZ: 0.70, FL: 0.13, MA: 0.91, USVI: 0.67; Fig. S3).By contrast, DAPC from SNPs correctly assigned samples to region 100% of the time.For microsatellites, cross-validation with a reduced dataset resulted in an assignment accuracy rate of 52%, with no population experiencing 100% accurate assignment (Fig. S4).Cross-validation of DAPC assignment for SNPs accurately assigned samples to sampling region 84% of the time (Fig. S5).Assignment accuracy of the reduced dataset was lower for FL (0.85) and the USVI (0.87), but 100% of samples were accurately assigned to prior population for AZ and MA.Results from Monte Carlo assignment testing based on SNPs were comparable to DAPC assignment testing for FL (0.67, with 0.32 assigning to USVI), MASS (0.99), and USVI (0.97), but accuracy was low for AZ (0.55), with 0.30 of AZ samples assigning to USVI, 0.06 to FL, and 0.06 to MASS.Overall assignment accuracy increased with proportion of individuals included in the training dataset (Fig. S5).
Consistent with population structuring results, we did not nd evidence for recent migration between the NW and CAR, although results were indicative of movement with CAR and potentially between CAR and AZ.Bayesian estimates of gene ow were run with nal delta values of 0.40 for migration rate and 0.90 for both allele frequency and inbreeding coe cient for microsatellites and migration rate of 0.27, allele frequency 0.39, and inbreeding coe cient 0.13 for SNPs.BayesAss inferred unidirectional rates of gene ow greater than 0 from the USVI into FL and AZ (Table 3).All other estimates of gene ow had con dence intervals that overlapped 0 (Table 3), suggesting that gene ow was negligible between other sampling locations.

Discussion
We found evidence for substantial genetic structure among roseate tern populations in the Atlantic Basin, with differentiation between metapopulations detected via signi cant F ST , low levels of admixture between metapopulations as determined through both STRUCTURE and DAPC, and minimal evidence for migration between regions.Contrary to our prediction, this included signi cant differentiation between the Northwestern Atlantic and Caribbean populations, with results from both microsatellite and SNP markers indicating a lack of dispersal among roseate terns breeding in MA and the Caribbean.We found evidence of population bottlenecks in all sampling locations with SNP, but not microsatellite, markers.These results are consistent with the nding of no evidence for bottlenecking in samples from Massachusetts analyzed with 16 microsatellite loci (Dayton & Szczys 2021).Evidence for bottlenecks via the SNP data suggests that differentiation between the two North American populations may be the recent result of genetic drift following population declines, although it is also possible that our results were due to ascertainment bias resulting from SNP ltering.Historically, the breeding range of roseate terns in the Northwestern Atlantic stretched down to the Carolinas (Soots & Parnell 1974) .Due to range contractions in the 20 th century, 90% of breeding roseate terns in the Northwestern Atlantic are now concentrated in only 3 primary colony sites in New England (USFWS 2020).It is possible that this range contraction, paired with extensive population declines, altered historical dispersal patterns of the Northwestern population and severed connectivity with the Caribbean.Lack of interbreeding between the two populations may also be the result of assortative breeding due to adaptive differences between the two breeding populations, as has been noted in other seabird species with overlapping winter distributions but differing breeding habitat conditions (Lombal et al. 2017;Nunes 2018) .Further work is needed to determine if behavioral and morphological differences between the two populations (Nisbet & Ratcliffe 2008) have a genetic basis.Such differences would provide additional support for designation of the two North American metapopulations of breeding roseate terns as Evolutionary Signi cant Units (Moritz 1994).Given that both metapopulations may represent unique pools of genetic diversity, the loss of either would represent a considerable reduction in roseate tern genetic diversity on a global scale.
Within the Caribbean, we did not nd evidence for signi cant differentiation between roseate terns in the USVI and FL via F ST , and STRUCTURE plots for both markers were indicative of high admixture between the two regions.Estimates of contemporary gene ow between these two populations indicate that this is likely due to unidirectional dispersal from the USVI into FL.This nding of movement from the larger population to the smaller may be indicative of a source-sink relationship, with asymmetrical movement of dispersing individuals from a higher productivity site towards a site that does not provide reciprocal migrants (Pulliam, 1988).Roseate terns in FL have declined considerably in the 20 th and 21 st centuries and are nearing extirpation from the state (Gore et al. 2007;USFWS 2010).Reproductive success in FL appears to be lower than in other sites in Caribbean or the Northwest, as evidenced by lower hatch success and lower chick weight across growth stages (Zambrano 2007), with low reproductive success likely attributable to poor habitat quality (Zambrano et al. 2000).Based on our results, the maintenance of this population is due in part to in uxes of breeders from other, more productive sites in the Caribbean, which has negative implications for the persistence of roseate terns in FL.However, these results may have been confounded by our inability to sample other breeding populations in the Caribbean.Roseate tern distribution in the greater Caribbean basin region is not well-known, but breeding colonies have been located in the Bahamas, and a large subpopulation is thought to be present in Cuba.It is probable that these subpopulations also share migrants with Florida and the Virgin Islands.Although existence of these unsampled "ghost populations" is unlikely to substantially in uence estimates of migration (Beerli 2004), we can't fully discount the effects of potential dispersal across the greater Caribbean region on our ndings.
Results were mixed for the relationship of roseate terns breeding in the Azores to those in the rest of the Atlantic.Both STRUCTURE and DAPC supported the presence of two genetic clusters in the Atlantic.
However, although STRUCTURE from microsatellites and DAPC from SNPs clearly clustered the Caribbean and AZ populations apart from MA, STRUCTURE analysis of SNP data indicated a high degree of admixture between the MA and AZ populations.Although SNPs in general had higher resolution than microsatellite markers, sample size for the AZ was lower for the UCE analysis, with only 4 individuals sequenced versus 10 for microsatellites.Discordance between these results may in part be due to this smaller sample size, as inferences of genetic clustering can be unreliable with sample sizes smaller than n = 5 (Fogelqvist et al. 2010).We found similarly con icting results between the two markers in analyses of migration rates, with microsatellites, but not SNPs, indicating signi cant rates of unidirectional movement from the USVI to AZ. Accuracy of Bayesian migration rate estimates increase with both sample size and number of markers (Wilson & Rannala 2003).Although SNP-based analyses should provide higher statistical power to resolve movement and population structure (Fischer et al. 2017;Haasl & Payseur 2011), the sample size for the SNP analysis was likely too low to effectively detect the presence of migrants within the AZ population.The fact that all other migration rates were largely concordant between SNP and microsatellite markers suggests that this is the case here, and that our microsatellite results may provide a more accurate depiction of levels of immigration from the USVI to the AZ.Despite this, the wide con dence intervals indicate that the actual proportion of Caribbean migrants within the AZ population is not well resolved due to the lower informativeness of microsatellite data.
Findings of migration from the USVI to the Azores provide the rst evidence of potential roseate tern gene ow across the Atlantic Basin.A roseate tern banded on the wintering grounds in Brazil was found breeding in the Azores (Hays et al. 2002), and transatlantic movement has also been detected between Ireland and Massachusetts via banding resights (Nisbet & Cabot 1995); however, it has not been veri ed if these movements are indicative of genetic exchange among populations.Past microsatellite analysis did not indicate evidence of gene ow between the AZ and the Northwestern population, but could not fully resolve relatedness due to low sample sizes and numbers of markers (Lashko 2004).Our results are consistent with prior ndings of high genetic differentiation between the Northwestern and AZ populations (Lashko 2004), but suggest the potential for genetic exchange across the Atlantic via the Caribbean population.Roseate terns breeding in the Azores are subject to high rates of nest predation from invasive species, and reproductive success can be low and highly variable (Neves 2006).
Unidirectional migration rates from the USVI to AZ may signal movement from a source population to a sink, with the Azores population unable to produce enough migrants to facilitate symmetrical dispersal, although it is again possible that our results are in uenced by gene ow from unsampled populations in the Caribbean that may also send migrants to AZ.Additionally, we would expect lower differentiation between USVI and AZ if migrants regularly moved from the Caribbean to Europe, suggesting again that our understanding of dispersal between these regions is not fully resolved.
Diversity metrics recovered by both markers, as represented by allelic richness and observed heterozygosity, were low compared to other super-abundant seabird species (Wojczulanis-Jakubas et al. 2015, Yannic et al. 2016), but were largely consistent with studies of other breeding gulls and terns (Perez et al. 2020) and were comparable to past assessments of roseate tern genetic diversity (Lashko 2004;Dayton & Szczys 2021).Results from SNP markers suggested that all 4 sampling locations showed evidence of recent declines, consistent with known population declines in both the early and mid-20 th century for the Northwestern population (USFWS 2020) and probable declines in the Florida and USVI populations.Assessments of genetic diversity within our four sampling sites were mixed, but results from SNP markers, which are considered more reliable for evaluating diversity metrics (Fischer et al. 2017), suggested signi cantly higher observed and expected heterozygosity in the USVI, the largest population sampled.This nding of lower genetic diversity in the smaller populations is suggestive of eroded genetic diversity resulting from population declines.Results from SNP markers suggested all 4 sampling locations showed evidence for recent declines, in line with known population declines in both the early and mid-20 th century for the Northwestern population (USFWS 2020) and probable declines in the Florida and USVI populations.
Results based on microsatellite and SNP markers were largely in agreement, with both markers able to detect signi cant differentiation among populations and agreeing on the optimal number of population genetic clusters via Bayesian cluster analysis.This concordance suggests that SNPs derived from UCEanchored target sequences can to resolve population genomic questions at a shallower timescale than has been previously applied, as they were able differentiate populations within the same species and to quantify parameters such as migration rates for populations with minimal genetic structuring.
Additionally, SNP markers derived from UCEs-anchored target sequences provided higher resolution data than microsatellites, based on their ability to more accurately assign samples to their breeding population, which provides support for the use of genomic data derived from next-generation sequencing in resolving population-level differentiation.

Conservation Implications
Population overlaps on wintering grounds are largely unknown for roseate terns on both sides of the Atlantic, with previous studies relying on band resights of individuals roosting on beaches at night (Hays et al. 1999).Our results suggest that movement between wintering grounds may contribute to crossoceanic dispersal in the Atlantic.A better understanding of roseate tern wintering ecology and migratory movements would contribute greatly to our understanding of dispersal among populations.The low success of tracking studies (Mostello et al. 2014) has previously made this challenging.Using population assignment to identify origin of individuals screened on migratory routes or wintering grounds has been investigated as a potential means of understanding overwintering dynamics in migratory birds, but ne scale assessment has largely been unsuccessful due to limited genetic structuring among populations (Connan et al. 2015;Gómez-Díaz et al. 2009) or use of uninformative genetic markers (Lovette et al. 2004).Higher-coverage genomic markers such as SNPs, which can better resolve ne-scale structure, can be used to more effectively identify unknown migratory birds to their natal regions (DeSaix et al. 2019).Given the high accuracy of population assignment with SNP markers, genetic analysis represents a promising tool for assigning individual roseate terns sampled on wintering grounds to population of origin, which can greatly expand our ability to understand population dynamics of nonbreeding roseate terns in the Atlantic Basin.
The lack of detectable dispersal between the Northwestern and Caribbean populations suggests that natural recolonization of either region is unlikely in the event of local extirpation.Similarly, dispersal from the Azores to the Western Atlantic appears to be rare, and past work has demonstrated high population differentiation between the UK and Azores population (Lashko, 2004), indicating that all 4 metapopulations of breeding roseate terns in the Atlantic are largely isolated despite apparent low levels of movement from the Caribbean to the Azores.This unexpected isolation has potential negative implications for their future persistence.If populations continue to decline, erosion of genetic diversity is unlikely to be offset by in uxes of new alleles via migrants.Low diversity has the potential to compromise the adaptability of roseate terns in the face of changing climatic conditions, which is emerging as a primary threat to seabird persistence globally (Daunt & Mitchell 2013;Oro 2014;Dias et al. 2019).Our study highlights the importance of continued conservation efforts for all breeding populations of roseate terns, with the ultimate goal of enhancing population sizes and encouraging recolonization of historically used breeding sites, thereby potentially enabling increased dispersal among breeding populations in the Atlantic Basin.

Declarations
Tables

Figures Figure 1
Figures

Table 1 .
Comparison of summary genetic diversity statistics (and 95% con dence intervals) for roseate terns from 4 sampling locations based on microsatellite and single nucleotide polymorphic (SNP) Bird Island, Massachusetts; FL = Florida Keys; USVI = Leduck Island, U.S. Virgin Islands; AZ = Praia Islet, Azores; A R = rare ed allelic richness; H o = observed heterozygosity; H e = expected heterozygosity Table2.F ST with 95% con dence intervals estimated from 10 4 bootstraps based on microsatellite (below diagonal) and SNP (above diagonal) markers.Bolded values represent F ST values with estimated 95% con dence intervals that do not overlap 0.

Table 3 .
estimates of contemporary (≤ 3 generations) gene ow among 4 roseate tern breeding locations with 95% con dence intervals.Bolded values represent migration rates with con dence intervals that do not overlap 0. The arrow signi es direction of dispersal between pairs of populations.