Discovery and characterization of single nucleotide polymorphisms in two anadromous alosine fishes of conservation concern

Abstract Freshwater habitat alteration and marine fisheries can affect anadromous fish species, and populations fluctuating in size elicit conservation concern and coordinated management. We describe the development and characterization of two sets of 96 single nucleotide polymorphism (SNP) assays for two species of anadromous alosine fishes, alewife and blueback herring (collectively known as river herring), that are native to the Atlantic coast of North America. We used data from high‐throughput DNA sequencing to discover SNPs and then developed molecular genetic assays for genotyping sets of 96 individual loci in each species. The two sets of assays were validated with multiple populations that encompass both the geographic range and the known regional genetic stocks of both species. The SNP panels developed herein accurately resolved the genetic stock structure for alewife and blueback herring that was previously identified using microsatellites and assigned individuals to regional stock of origin with high accuracy. These genetic markers, which generate data that are easily shared and combined, will greatly facilitate ongoing conservation and management of river herring including genetic assignment of marine caught individuals to stock of origin.


| INTRODUCTION
Genetic data are routinely used to inform ecological investigation and formulate conservation and management plans for fish and wildlife.
Elucidation of population structure and patterns of connectivity is often the first step in the use of genetic data to understand a species' biology. In addition, common applications for such data include reconstructing pedigree relationships, inferring historical demography, identifying individuals for mark/recapture-type analyses, evaluating patterns of natural selection, and assigning individuals to population of origin (Morin, Luikart, & Wayne, 2004; Narum et al., 2008).
Alewife (Alosa pseudoharengus) and blueback herring (Alosa aestivalis), collectively known as "river herring," are migratory sea-run (i.e., anadromous) fishes that reproduce in lakes and rivers along the east coast of North America, but typically migrate to the Atlantic Ocean as juveniles to grow and reach sexual maturity before returning to their natal freshwater spawning grounds to reproduce (Loesch, 1987). River herring once supported an important commercial fishery, but spawning adult abundances have declined by 93% since 1970, and many spawning populations (hereafter "populations") are now at historically low levels and are of increasing conservation concern (Hightower et al., 1996; Limburg & Waldman, 2009).
Recent genetic studies of alewife and blueback herring used polymorphic microsatellite markers to resolve the spatial scale of population genetic structure (McBride, Willis, Bradford, & Bentzen, 2014; Palkovacs et al., 2014), examine range-wide patterns of hybridization, assess the influence of stocking activities on genetic structure (McBride, Hasselman, Willis, Palkovacs, & Bentzen, 2015), and determine the origin of river herring bycatch in commercial fisheries (Hasselman et al., 2016). Palkovacs et al. (2014) used data from microsatellites to reveal that US alewife populations (n = 21) were nested within three regional genetic stocks (Northern New England, Southern New England, and Mid-Atlantic), whereas US blueback herring populations (n = 21) were nested within four regional genetic stocks (Northern New England, Southern New England, Mid-Atlantic, and South Atlantic), with similar but not identical boundaries. Hasselman et al. (2016) also found that these same data had sufficient statistical power to confidently assign river herring bycatch in commercial fisheries to regional genetic stocks.
Given their propensity for natal philopatry, the conservation and management of river herring requires a "population-level" approach, and there is a need for molecular tools that can resolve population genetic structure at spatial scales finer than regional genetic stock. Moreover, for anadromous fishes, such as river herring, that migrate substantial distances across jurisdictional boundaries and are subject to capture as bycatch in mixed-stock fisheries, a method that generates portable genetic data that can be easily shared and allows unambiguous assignment of individuals to population of origin is an important conservation and management tool (Clemento, Crandall, Garza, & Anderson, 2014; Morin et al., 2004; Starks, Clemento, & Garza, 2016).
Single nucleotide polymorphisms (SNPs) are bi-allelic markers, ubiquitous in the genome of most species (Morin et al., 2004), that are relatively simple to genotype and provide data that are easily portable between laboratories and instruments. Recent higher-throughput SNP genotyping technologies allow samples to be processed efficiently and in a cost-effective manner (Clemento, Abadía-Cardoso, Starks, & Garza, 2011; Larson, Seeb, Pascal, Templin, & Seeb, 2014; Seeb, Pascal, Ramakrishnan, & Seeb, 2009). SNP marker data have utility for a variety of ecological and evolutionary questions, and a suitable number of SNPs have been demonstrated to provide sufficient statistical power for resolving the spatial scale of population genetic structure in anadromous fishes (Clemento et al., 2014; Narum et al., 2008; Starks et al., 2016) and identifying pedigree relationships (Anderson & Garza, 2006). SNP data are also useful for assignment of individuals of unknown provenance to population of origin, often called genetic stock identification (GSI), and can be particularly informative when some of those SNPs have been affected by divergent selection between populations (Ackerman et al., 2011; Nielsen et al., 2012).
Here, we describe the development of two sets of 96 SNP assays, one specific to alewife and the other to blueback herring. These SNP panels are suitable for resolving range-wide population genetic structure and have applications for GSI, investigating patterns of hybridization and introgression, and addressing issues of ecological and evolutionary relevance in a conservation and fisheries management framework. We used samples collected from across the ranges of both species for SNP discovery to minimize ascertainment bias (Albrechtsen, Nielsen, & Nielsen, 2010; Clark, Hubisz, Bustamante, Williamson, & Nielsen, 2005) and assess the power of the SNP data to accurately resolve previously described genetic stock structure for both species. The SNPs described herein will provide more power for population genetic investigations, enable higher throughput genotyping than with microsatellites, and allow for more effective data sharing across laboratories and management agencies.

| Sample collection
Muscle plugs or fin tissue was obtained from alewife and blueback herring captured across the species' ranges (Table 1, Figure 1). All samples were obtained from adult fish and were collected in freshwater. Tissue was preserved in 95% ethanol until DNA extraction.

| SNP discovery and assay development
Tissue samples were removed from ethanol and air-dried before extracting genomic DNA using DNeasy 96 Blood and Tissue kits with a BioRobot 3000 (Qiagen, Inc.). To identify potential SNPs for alewife and blueback herring, we used double digest Restriction Associated DNA sequencing (ddRADseq), a genome-reduction technique that uses two restriction enzymes to create DNA fragments with identical fixed endpoints for annealing sequencing adapters (Peterson, Weber, Kay, Fisher, & Hoekstra, 2012). To ensure range-wide coverage and reduce the risk of ascertainment bias, samples chosen for SNP discovery included individuals from at least one population from each of the regional genetic stocks for alewife and blueback herring previously identified by Palkovacs et al. (2014).
We performed ddRADseq library construction, sequencing, and SNP identification separately for each species, following the same protocol. Undiluted genomic DNA from 48 alewife from 18 populations and 12 blueback herring from four populations (Table S1), representative of genetic lineages throughout the species' geographic ranges (Palkovacs et al., 2014), was digested using two restriction enzymes, SphI and EcoRI. Next, we performed size selection for 350-bp fragments using the Pippin Prep system (Sage Science, Inc.). Following the addition of adapters, sequencing was performed on a MiSeq instrument (Illumina Inc.). We used two 600-cycle sequencing reactions with paired-end reads for alewife and a single such sequencing reaction for blueback herring. Sequence data from each species were processed, with homologous reads identified and SNPs called, using Stacks (Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013). Loci were selected for assay design by identifying sequences with a single SNP that also met the following three criteria: (1) all three genotypes (both homozygotes and the heterozygote) were observed, (2) a minimum of 20 sequence reads per allele for alewife and 15 reads for blueback herring were detected, and (3) the sequence did not share high similarity (>80%) with any other sequence selected from Stacks when global alignment was evaluated using BLAST (Altschul, Gish, Miller, Myers, & Lipman, 1990). These criteria were used to choose unique and sufficiently polymorphic target loci for the development of SNPtype genotyping assays (Fluidigm Corporation).
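The locus selection criteria above can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `CandidateLocus` record and its field names are hypothetical, and in practice the values would have to be derived from the Stacks and BLAST outputs.

```python
from dataclasses import dataclass


@dataclass
class CandidateLocus:
    """Hypothetical summary of one candidate sequence (field names are assumptions)."""
    name: str
    n_snps: int               # number of SNPs found in the sequence
    genotypes_seen: set       # distinct genotypes observed, e.g. {"AA", "AG", "GG"}
    reads_per_allele: tuple   # (reads supporting allele 1, reads supporting allele 2)
    max_similarity: float     # highest similarity to any other candidate, 0 to 1


def passes_filters(locus, min_reads):
    """Apply the single-SNP requirement and the three criteria from the text.

    min_reads is 20 for alewife and 15 for blueback herring.
    """
    single_snp = locus.n_snps == 1
    all_genotypes = len(locus.genotypes_seen) == 3       # both homozygotes + heterozygote
    enough_reads = min(locus.reads_per_allele) >= min_reads
    unique = locus.max_similarity <= 0.80                 # reject >80% similarity
    return single_snp and all_genotypes and enough_reads and unique
```

Keeping the read-depth threshold as a parameter lets the same function serve both species.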
A total of 166 SNPs in alewife and 141 SNPs in blueback herring were chosen for SNPtype assay (Fluidigm) design. Assays were evaluated for consistency and polymorphism by genotyping 382 alewife samples from eight populations and 474 blueback herring from 10 populations throughout the species' ranges (Figure 1; Table 1). SNP genotyping was performed with 96.96 Dynamic Genotyping Arrays on an EP1 Genotyping system (Fluidigm), which combines 96 DNA samples with 96 assays for a total of 9,216 reactions on each nanofluidic array. SNP genotypes were called with the Fluidigm SNP Genotyping software package.
Assays were first evaluated for their ability to produce clearly and consistently distinct clusters of genotypes. Loci were excluded that produced ambiguous genotypes or for which all validation samples appeared to have either homozygote or heterozygote genotypes, indicating null alleles or a lack of Mendelian inheritance. Sets of 96 well-performing assays were then retained for the final alewife and blueback herring SNP panels. Details of these SNP genotyping assays, including target polymorphism, primer/probe sequences, and database accession numbers, are in Table S2 (alewife) and Table S3 (blueback herring).
Additionally, as there were more than 96 high-quality blueback herring loci remaining at this stage, minor allele frequencies in the validation populations were estimated and used as a proxy for the expected power of the markers in pedigree reconstruction applications (Anderson & Garza, 2006) and used to choose the final panel of 96 assays.
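As a rough sketch of the minor allele frequency step, the snippet below estimates MAF per locus from biallelic genotypes and keeps the most polymorphic loci. The text only states that MAF was used to choose the final panel; ranking by highest MAF, the 0/1/2 genotype coding, and the function names are all assumptions made for illustration.

```python
def minor_allele_frequency(genotypes):
    """Estimate MAF from genotypes coded as 0, 1, or 2 copies of one allele.

    Missing genotypes are given as None and ignored.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    p = sum(called) / (2 * len(called))   # frequency of the counted allele
    return min(p, 1 - p)


def rank_loci_by_maf(genotype_table, panel_size=96):
    """Return the names of the panel_size loci with the highest MAF.

    genotype_table maps locus name -> list of genotypes across samples.
    """
    maf = {locus: minor_allele_frequency(g) for locus, g in genotype_table.items()}
    return sorted(maf, key=maf.get, reverse=True)[:panel_size]
```

For example, `rank_loci_by_maf(table, panel_size=96)` would return at most 96 locus names ordered from most to least polymorphic.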
Following genotyping, samples with missing data for 10 or more loci were excluded from further analyses (Table 1). To evaluate the utility and performance of the two sets of SNP assays for GSI, self-assignment analyses were conducted with GeneClass 2.0 (Piry et al., 2004), and the proportion of accurately assigned individuals per population and regional reporting group was estimated. Model-based clustering was performed with structure (Pritchard, Stephens, & Donnelly, 2000) without prior location information and using the admixture and correlated allele frequencies model.
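The missing-data filter and the idea behind leave-one-out self-assignment can be illustrated with a simplified sketch. It is not the GeneClass algorithm: it assumes biallelic genotypes coded 0/1/2, Hardy-Weinberg genotype probabilities, a pseudocount of one copy per allele to avoid zero frequencies, and equal prior weight for every population. All function names are hypothetical.

```python
import math

MAX_MISSING = 9  # samples missing 10 or more loci are excluded


def keep_sample(genotypes):
    """True if the sample has fewer than 10 missing (None) genotypes."""
    return sum(g is None for g in genotypes) <= MAX_MISSING


def allele_freqs(members, n_loci, exclude=None):
    """Per-locus allele frequency, skipping the individual at index `exclude`."""
    freqs = []
    for j in range(n_loci):
        alt, total = 1.0, 2.0                  # pseudocount: one copy of each allele
        for i, ind in enumerate(members):
            if i == exclude or ind[j] is None:
                continue
            alt += ind[j]
            total += 2
        freqs.append(alt / total)
    return freqs


def log_likelihood(ind, freqs):
    """Log-probability of a multilocus genotype under Hardy-Weinberg proportions."""
    ll = 0.0
    for g, p in zip(ind, freqs):
        if g is None:
            continue
        q = 1 - p
        ll += math.log((q * q, 2 * p * q, p * p)[g])
    return ll


def self_assign(pops, threshold=None):
    """Leave-one-out assignment; returns (true population, assigned or None) pairs.

    With a threshold (e.g. 0.9), individuals whose best relative score falls
    below it are left unassigned (None).
    """
    n_loci = len(next(iter(pops.values()))[0])
    results = []
    for true_pop, inds in pops.items():
        for i, ind in enumerate(inds):
            lls = {}
            for name, members in pops.items():
                skip = i if name == true_pop else None   # remove the focal fish
                lls[name] = log_likelihood(ind, allele_freqs(members, n_loci, skip))
            best = max(lls, key=lls.get)
            rel = 1 / sum(math.exp(v - lls[best]) for v in lls.values())
            ok = threshold is None or rel >= threshold
            results.append((true_pop, best if ok else None))
    return results
```

Removing the focal individual before estimating its home population's frequencies is what keeps the accuracy estimate from being inflated by the fish's own alleles.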
There were 15 loci that deviated from HWE expectations in one alewife population, one locus that deviated in two populations (Aps_14730), and one locus in four populations (Aps_5844) (Table S4). The majority of both blueback herring and alewife individuals were accurately assigned to their population and reporting group of origin in the self-assignment analyses. For alewife, 67% of all fish were correctly assigned to population of origin and 93% to reporting group of origin (Table 2a) when no probability criterion was used, and 83% to population and 97% to reporting group, when a 90% probability criterion was applied (Table 2b). For blueback herring, assignment accuracy was similar, with 67% accurately assigned to population of origin and 96% assigned to reporting group of origin with no probability criterion (Table 2c) and 79% to population and 98% to reporting group, when a 90% probability criterion was applied (Table 2d).
Self-assignment for alewife was less accurate in northern populations than in the south, corresponding to increased geographic distance between rivers in the south, whereas there was no discernible pattern with blueback herring.
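For readers unfamiliar with how a deviation from HWE expectations is detected at a biallelic SNP, a chi-square goodness-of-fit test is one common approach. The sketch below is illustrative only; the text here does not state which test or significance correction was used, so this should not be read as the method behind Table S4.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with HWE expectations.

    With two alleles the test has 1 degree of freedom, so a value above
    3.841 corresponds to p < 0.05 before any multiple-test correction.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

A locus with a strong heterozygote deficit, such as one affected by a null allele, would produce a large statistic in several populations.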

| DISCUSSION
Molecular genetic data and analysis have become a critical component of biological investigation and conservation for migratory species, particularly anadromous fishes that are harvested and often subject to multijurisdictional management (Clemento et al., 2014; Hasselman et al., 2016; Palkovacs et al., 2014). We describe here validated SNP assays for alewife and blueback herring that provide power for multiple applications, including GSI across the species' ranges, as well as pedigree reconstruction and phylogeography.
Self-assignment analyses with both alewife and blueback herring populations demonstrated clear delineation between regional genetic stocks previously identified using microsatellite data. Alewife validation populations displayed differentiation with these SNP assays that mirrors the regional population genetic structure previously identified by Palkovacs et al. (2014) and expands the utility of genetic identification into the northern portion of the species range. Pairwise FST values revealed significant differentiation between all sets of populations except the Chowan and Alligator Rivers (Table 3), which are geographically proximate and tributaries of the same coastal estuary (Albemarle Sound). Two-thirds of alewife samples assigned to their correct population/river basin of origin, but the proportion of accurate assignments by population ranged between 49% and 84% (Table 2) when no probability criterion was applied. When such a criterion was applied, the overall proportion of accurate assignments increased substantially (to 82%), and the proportion per population ranged from 74% to 97%, but nearly half of the samples remained unassigned, emphasizing the lack of fine-scale differentiation and population structure in alewife. When self-assignment was evaluated at the scale of the previously reported regional genetic stocks, accuracy was much higher, with 93% of samples assigned accurately to regional stock of origin; the accuracy per population of assignment to reporting unit ranged from 82% to 100% without a probability criterion and from 91% to 100% with a probability criterion.
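To make the pairwise FST comparison concrete, the sketch below computes a simple heterozygosity-based FST between two populations from per-locus allele frequencies. It is a simplified illustration with equal weighting of the two populations and no sample-size correction; the estimator actually used for Table 3 is not stated in this section, and the function name is hypothetical.

```python
def fst_two_pops(freqs_a, freqs_b):
    """FST = (HT - HS) / HT, summed over biallelic loci (ratio of sums).

    freqs_a and freqs_b hold the frequency of the same allele at each locus
    in populations A and B.
    """
    num = den = 0.0
    for pa, pb in zip(freqs_a, freqs_b):
        p_bar = (pa + pb) / 2
        h_t = 2 * p_bar * (1 - p_bar)                        # total expected heterozygosity
        h_s = (2 * pa * (1 - pa) + 2 * pb * (1 - pb)) / 2    # mean within-population
        num += h_t - h_s
        den += h_t
    return num / den if den > 0 else 0.0
```

Two populations with nearly identical allele frequencies, as might be expected for geographically proximate tributaries of the same estuary, give a value close to zero.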
Model-based clustering results from structure were consistent with the self-assignment and DAPC results for alewife and found that populations within the same regional genetic stock generally clustered together (Figures 2a and 3a; Table 2a,b).
TABLE 2 Accuracy of leave-one-out self-assignment analyses to population and regional stock for alewife (a) without applying a probability criterion (i.e., all individuals assigned) and (b) with a 90% criterion. Blueback herring are also assigned (c) without applying a probability criterion and (d) with a 90% criterion.
The [...] (Table 3). This is consistent with movement between the Delaware River and Chesapeake Bay (likely via the Chesapeake and Delaware Canal), as has been shown by otolith microchemistry (Turner, Limburg, & Palkovacs, 2015). The proportion of blueback herring correctly self-assigned to population of origin was similarly variable, ranging from 33% to 100% without a probability criterion (Table 2). Using a probability criterion increased overall accuracy of assignments (to 72%), with a range of 43%-100% per population. Geographic proximity and population connectivity appear to explain many of the misassignments. For example, a high frequency of misassignments involved the Delaware and Rappahannock Rivers, which are connected via the Chesapeake and Delaware Canal, and the Savannah and Altamaha, which are geographically proximate (Figure 1). As with alewife, assignment to previously reported regional stocks was much more accurate, with overall assignment of 96% without a probability criterion and 97% with a probability criterion (Table 2).
The structure clustering results with blueback herring populations again mirrored the self-assignment and DAPC results (Figures 2b and 3b; Table 2c,d). Although the validation samples encompass most of the species range, the spatial distribution of populations is uneven, and the proximate populations consistently grouped together. At K = 2, the southernmost populations, in the Altamaha and Savannah Rivers, clustered separately (Figure S1), whereas at K = 3, the proximate Southern New England populations separated. At K = 4, the four regional genetic stocks within the US range, previously identified by Palkovacs et al. (2014), were recovered (Figure 2b); however, the northernmost populations (Margaree and Petitcodiac) grouped with the Northern New England populations (Kennebec and East Machias). At K = 5, the Petitcodiac separated, whereas the Margaree continued to group with Northern New England (Figure S1).
The 96-locus SNP set for blueback herring provided clustering concordant with the genetic stocks previously identified with microsatellite markers, and the four northernmost populations, which had not previously been evaluated together, formed a distinct cluster. The alewife SNP panel extends the geographic range for GSI and also recovers clusters consistent with the regional genetic stocks found with microsatellites, yet discriminating population structure at small spatial scales with these assays, especially among rivers exchanging frequent migrants, may prove difficult.
Data from SNP assays are unambiguous and easily portable between laboratories. Generating these markers using high-throughput sequencing of genomic DNA enhances our ability to confidently distinguish populations of alewife and blueback herring that are genetically distinct across both species' ranges. This ability to identify stock of origin for fish caught at sea is critical for management of populations experiencing diminished spawning returns and of increasing conservation concern (Hasselman et al., 2016). Using these new tools, samples from mixed-stock assemblages can be quickly and efficiently genotyped, allowing new insights into marine movement patterns and impacts of marine fisheries.