Seascape genomics reveals limited dispersal and suggests spatially varying selection among European populations of sea lamprey (Petromyzon marinus)

Abstract Sea lamprey Petromyzon marinus is an anadromous and semelparous fish without homing behaviors. Despite being a freshwater, free‐living organism for a large part of their life cycle, its adulthood is spent as a parasite of marine vertebrates. In their native European range, while it is well‐established that sea lampreys comprise a single nearly‐panmictic population, few studies have further explored the evolutionary history of natural populations. Here, we performed the first genome‐wide characterization of sea lamprey's genetic diversity in their European natural range. The objectives were to investigate the connectivity among river basins and explore evolutionary processes mediating dispersal during the marine phase, with the sequencing of 186 individuals from 8 locations spanning the North Eastern Atlantic coast and the North Sea with double‐digest RAD‐sequencing, obtaining a total of 30,910 bi‐allelic SNPs. Population genetic analyses reinforced the existence of a single metapopulation encompassing freshwater spawning sites within the North Eastern Atlantic and the North Sea, though the prevalence of private alleles at northern latitudes suggested some limits to the species' dispersal. Seascape genomics suggested a scenario where oxygen concentration and river runoffs impose spatially varying selection across their distribution range. Exploring associations with the abundance of potential hosts further suggested that hake and cod could also impose selective pressures, although the nature of such putative biotic interactions was unresolved. Overall, the identification of adaptive seascapes in a panmictic anadromous species could contribute to conservation practices by providing information for restoration activities to mitigate local extinctions on freshwater sites.


| INTRODUC TI ON
Freshwater systems have played a key role in the evolution of human society by providing reliable sources of potable water and food, as well as communication and trading routes (Ormerod et al., 2010).
To facilitate these services, rivers have been heavily modified in order to facilitate navigation and provide more efficient hydropower generation through the building of, for example, weirs and dams (Dudgeon et al., 2006;Grill et al., 2019). For anadromous fishes, including some species of salmonids and lampreys, these modifications can place considerable pressure on their freshwater life phases through blocking access to environmentally stable reproductive and nursery habitats (e.g., Branco et al., 2012;Mateus et al., 2012;Moser et al., 2020). Maintaining sustainable diadromous fish populations is vital both from a social-economic perspective-as many of these species are economically relevant-and from a biological standpoint, as spawning and foraging migrations ensure energy and biomass transfer between marine and freshwater environments (Limburg & Waldman, 2009).

Despite local freshwater extinctions, anadromous populations
can still be maintained across large spatial scales through their use of the marine habitat, while using available rivers for spawning and nursery habitats (Moser et al., 2020). In this context, characterizing gene flow and inferring whether populations are locally adapted is essential for the sustainable replenishment of extant populations.
Genetic and genomic resources have greatly contributed to understanding connectivity patterns (Asaduzzaman et al., 2020;Perrier et al., 2014), defining conservation units (Mateus et al., 2011;Turbek et al., 2023;Waples & Lindley, 2018), and pinpointing evolutionary signatures related to migration, maturation and/or local adaptation in several anadromous species, such as in the Atlantic salmon, brown trout or shad (Antognazza et al., 2021;Kapralova et al., 2011;Leitwein et al., 2017;Prince et al., 2017). Genome-wide information is nevertheless vital to ensure the effectiveness of conservationdriven translocation activities, as it would permit to devise genetic rescue strategies to mitigate the species' biodiversity losses and increase population resilience without compromising local adaptation (Whiteley et al., 2015).
The sea lamprey Petromyzon marinus is an anadromous, ancient jawless vertebrate whose native distribution is limited to the Western and Eastern coasts of the North Atlantic and inside the Mediterranean, where individual abundances fluctuate regionally and locally from one freshwater system to another (Guo & Britton, 2017;Boulêtreau et al., 2020). The sea lamprey has a complex lifecycle consisting of several developmental stages, where the major morphological changes are intrinsically associated with feeding behavior (Youson & Potter, 1979). Posthatching larvae, or ammocoetes, are free-living organisms that feed on detritus deposited on river sediment beds or through filtration of the water column (Guo et al., 2016). The metamorphosis into young adults marks the onset of the species' parasitic behavior, as sea-dwelling lampreys mandatorily feed on blood of other vertebrates (Quintella et al., 2021;Silva et al., 2013). Adult sea lampreys spend their life at sea, a stage that is expected to last between 10 and 28 months preceding sexual maturation (Beamish, 1980;Renaud & Cochran, 2019;Silva et al., 2013). Spawning migration towards freshwater, however, is not philopatric regarding spawning locations; contrary to other anadromous fishes, sea lampreys do not return to the natal rivers to spawn (Almada et al., 2008;Bergstedt & Seelye, 1995;Spice et al., 2012). Instead, it has been suggested that sea lampreys are steered by a migratory pheromone released by stream-resident larvae (Sorensen et al., 2005;Vrieze et al., 2011), although natural recolonization of freshwater locations without ammocoete presence has been recently noted (Reid & Goodman, 2020). Less understood is how selection acts across the sea lamprey life cycle, despite its numerous life-history events (such as the transition to parasitism linked to adulthood, onset of anadromous and catadromous migrations) providing substantial opportunities for selection to act. For instance, it has been recently shown that the parasitic adult stage imposes a selective pressure on the host, as suggested by evidence showing the transcriptional response of genes involved in the regulation of inflammation and cellular damage in parasitized lake trout (Goetz et al., 2016). Indeed, recent work investigating the molecular pathway of the renin-angiotensin system on lampreys has suggested that its evolution is linked with mimicry of the teleost-like angiotensin while parasitizing, in order to recreate host's environment and avoid rejection-referred to as endocrine mimicry (Goetz et al., 2016). Though plausibly expected, a putative relationship between the distribution of potential hosts and signatures of selection on the sea lamprey genome-which could indicate adaptation to the host community-has not been investigated at any spatial scale.
Transitions from fresh to salt water (and vice-versa) do impose an energetic cost associated with physiological and molecular re-arrangements to cope with the chemical composition of the different environments (Ferreira-Martins et al., 2016). Reduced tolerance to high salinities has been reported in sea lamprey adults performing spawning migrations; failure to locate suitable spawning locations in freshwater due to connectivity loss by dams and weirs requires re-entering estuaries and the marine environment to access adjacent river systems (Sunga et al., 2020). Assuming that tolerance to those transitions would be an adaptive trait, it could be hypothesized that ocean salinity would play a role in limiting adult sea lamprey dispersal. Within a certain geographic area, individuals could be adapted to fresh-saltwater gradients to minimize failure of reproduction by spending time searching for suitable freshwater spawning locations. Given the regional and local extinctions of sea lamprey populations in the European range (Hansen et al., 2016;Mateus et al., 2012), the absence of homing behavior might facilitate conservation practices based on, for example, the translocation of ammocoetes to maintain local biodiversity levels, as performed on the Columbia river basin with Pacific lampreys (Ward et al., 2012). Undertaking similar conservation measures to mitigate the local extinctions of European lamprey would thus demand an upscale of extant knowledge on the spe-cies´ connectivity and the identification of signatures associated with adaptive processes underlying their evolutionary potential (Eizaguirre & Baltazar-Soares, 2014;McMahon et al., 2014). While traditional molecular tools have been extensively applied to investigate genetic structure or historical demography (Genner et al., 2012;Mateus et al., 2012), next-generation sequencing has been surprisingly lacking among the repertoire of studies dedicated to natural populations of sea lamprey. By harnessing the power of genome-wide information, we intend to revisit genetic structuring among populations of sea lamprey across the North Eastern Atlantic coast-hypothesizing panmixia-while exploring molecular signatures of the selective pressures proposed above.
Inferring signatures of selection on a highly migratory species, such as sea lamprey, would help the understanding of how microevolutionary processes may act on complex life-history strategies.
Concomitantly, characterizing the connectivity among spawning grounds and the evolutionary pressures that could be delimiting the marine dispersal of eastern Atlantic Sea lampreys is fundamental in establishing conservation measures befitting the species' wide distribution range. Migrating sea lamprey were captured using unbaited two-funnel eel pots. Captured individuals were held in water-filled containers (100 L) prior to general anesthesia (MS-222) that enabled a fin biopsy to be taken. All the sampled individuals were adults, except for those from the Frome, where, due to the unavailability of adults during the sampling period, ammocoetes were sampled. Ammocoete sampling was performed by dredging, with a dipnet, soft sediment areas of the river Frome. All the tissue samples were preserved in ethanol (96%) prior to extraction.

| DNA extraction, library preparation, and double-digest restriction-associated DNA tag sequencing
Genomic DNA was extracted from the fin or muscle tissues with Qiagen DNeasy Blood and Tissue kit (Hilden, Germany) following the manufacturer's instructions. Library preparation, sequencing, and bioinformatic processing of raw reads (demultiplexing of individual barcodes, removal of adaptors and barcodes, and trimming) were performed at the GENEWIZ© in Takeley, United Kingdom. Library preparation followed the double digest protocol of Peterson et al. (2012). Libraries were constructed by pooling 48 samples after individual barcoding. Genomic DNA was digested with EcoR1/Msp1 restriction enzymes. Two 96-well plates were sequenced on an Illumina HiSeq2500 with a 2 × 125bp paired-end (PE) configuration. Final library was size-selected between 300 and 400 bp inserts. Only reads with Q > 30 were kept for downstream analysis.

| Processing paired-end reads
Reads were aligned against the available sea lamprey genome (Smith et al., 2013) using the bwa-mem algorithm (Li, 2013) implemented in bwa (Li & Durbin, 2009). Bam files were imported into Stacks, version 2.2 (Catchen et al., 2013) for further analyses and variant calling. Only bi-allelic loci that were present in 80% of the F I G U R E 1 Geographic information and distribution of sampled locations. Here represented are the sampled freshwater locations (dots) and respective correspondence to FAO's major fishing areas in the Northeast Atlantic (italicized numbers), delineated by dashed lines. Specifically, 3 corresponds to Skagerrak, Kattegat, Sound, Belt Sea, and Baltic Sea, 4 to the North Sea, 7 to Irish Sea, West of Ireland, Porcupine Bank, Eastern and Western English Channel, 8 to the Bay of Biscay and 9 to the Portuguese waters. In parenthesis, the number of individuals collected for each location was utilized in this study.

Severn -UK
populations were kept, as well as one SNP per tag (Rochette & Catchen, 2017). Because the sampling spanned a wide geographic area, a minor allelic frequency of 0.01 (overall dataset) was filtered out to avoid losing unique information. We also kept only loci with the maximum observed heterozygosity of 0.5 (Roesti et al., 2012).
To mitigate a potential effect of the high repeatability found on this species' genome, all the reads whose coverage was above 100× were also filtered out, along with loci that mapped to more than one catalog.

| Genetic diversity indices across sampled locations
All statistical analyses were computed in R v 3.4.1. (R Core Team, 2013).
For each location, we estimated the heterozygosity (Hs), and inbreeding coefficient (F) with the adegenet R package (Jombart, 2008) and the number of private alleles (pA), nucleotide diversity (pi) with Stacks.
Absolute values of pA were log-transformed to normalize the distribution prior to statistical analyses. The distribution of genome-wide diversity was first compared across sampled locations, where we investigated the relationship of each index with latitude and longitude.
Relationships were tested between diversity indices and geographic distance using linear models. Because we collected ammocoetes from the Frome, we utilized PLINk2 (Chang et al., 2015) to perform a relatedness analyses (KING) across the overall dataset and utilizing a cut-off of 0.1 following the convention to cut off first-degree relationships as it could be expected if collected ammocoetes would be originated from the same breeding pairs (Chang et al., 2015).

| Population structure and connectivity among freshwater locations
Multiple approaches were used to investigate the population connectivity between the sampled locations. A PCA was performed in the R package pcadapt v 4 (Privé et al., 2020) in a first attempt to verify if sampling locations harbored genetically similar individuals.
Pairwise F ST was estimated in Arlequin v3.5 (10,000 permutations) (Excoffier & Lischer, 2010) to understand overall genome-wide diversity. Bayesian clustering methodology explored the distribution of genome-wide variation across the sampled sites, running faststructure (Raj, Stephens, et al., 2014) with three independent iterations for K values of 1-8. The most likely K number was assessed using chooseK (Raj, Rothamel, et al., 2014). Admixture proportions were visualized with compoplot function in the adegenet. Lastly, discriminant analysis of principal component (DAPC) was used to assess individual clustering based on genetic similarities in adegenet.
We performed DAPCs assuming both the original sampling location, that is, predefined populations, as well as exploring the configuration of clusters with the function find. clusters. Putative clusters were identified with Bayesian Information Criteria (BIC), and the assignment of individuals was compared with that of sampling locations.

| Investigating the relationship between genomic variation and environmental variables
Environmental and genetic variation tend to be interpreted together under the assumption that co-variation suggests signatures of evolutionary responses to environmental conditions (Günther & Coop, 2013 Relationships between the above-mentioned variables and genetic variation were first tested for the full SNP dataset in Bayenv2 (Günther & Coop, 2013). To increase the robustness of any observed gene-environment association, we imposed a double filter for statistical significance. Specifically, only the gene-environment relationships with a Bayes Factor > 10 (Jeffreys, 1998) and an Spearman's rho statistic above 0.2 (Günther & Coop, 2013) were considered as significant. In addition, we utilized latent factor mixed models (LFMM) to further inspect for environment correlations with the genetic data (Frichot et al., 2013). Contrary to the island model of neutral allelic frequency distribution to identify outliers assumed in Bayenv2, LFMMs require a priori information on population structure to infer the background noise of neutral differentiation (Frichot et al., 2013). We ran LFMM on the R package LEA (Frichot & François, 2015). Because the methodology is sensitive to missing data, we performed the imputation of null alleles based on estimated ancestry coefficients and ancestral genotype frequencies, as suggested by Frichot and François (2015) and considering the "mode" of each allele. We set the K to assume all possible populations, that is, 1-8, we replicated each run 10 times and applied a random set of 3000 SNPs to initialize the cross-entropy algorithm. Outliers were identified considering a z-score distribution of correlation values after p-value adjustment, as suggested by Frichot et al. (2013). This approach has been shown to be potentially complementary to other correlation-based strategies, though with a higher-than-average number of positives (Nielsen et al., 2020). As such, we further controlled for false discovery rate (FDR) according to the Benjamini-Hochberg procedure with a q = 0.1 (Benjamini & Hochberg, 1995).
To control for the spatial heterogeneity of both abiotic and biotic factors, we added latitude and longitude as variables in Bayenv2 and LFMM. Loci associated with latitude and longitude were removed from the candidate's list, independently of being associated with any other factor.
Lastly, we utilized pcadapt to screen for outlier loci. The number of PCs to retain was chosen accordingly to the scree plot curve and p-values were subject to Bonferroni correction with α = 0.001.
Only loci identified as outliers in all detection methods were classified as candidates under selection and were used in further analyses. We calculated diversity estimates and inferred population structure with the same methodology applied to the full loci dataset.

| Candidate loci annotation
Genomic sequences flanking candidate loci were retrieved from the sea lamprey genome (P.marinus_7.0) at Ensembl's database by extending the size of the RADtag 1000 bp upstream and downstream the coordinates to which they mapped against. Extending the RADtag alongside the reference genome was performed with the intent of capturing regions potentially linked to the candidate loci.
Fragments were blasted against the NCBI database.

| Genetic diversity estimates
Genetic diversity estimates were similar among sampled locations when considering all sequenced sites, with Hs varying between 0.10 for the Frome 0.12 for other 6 locations ( Table 1). The number of private alleles had higher variation, with the highest number found in the Rolfsan (228) and the lowest in Gironde (77), with the difference being significant according to geographic distance (R 2 = 0.18, p = 0.01) (Figure 2a), where the higher the distance between sampling areas, the greater the differential of private allele numbers between the sampling areas. There was also a significant and positive linear relationship between the number of private alleles and latitude (R 2 = 0.72, p = 0.04) (Figure 2b), suggesting an overall pattern of increased unique genomic diversity at higher latitudes. A total of 2 individuals, one from Rolfsan and other from Gironde, did not pass the 0.1 kinship cut-off but were still maintained in the analyses as those do not belong to the same population.  (Table S2).

| Gene-environment associations and gene ontology of putative candidate loci
There were 118 loci with a Bayes factor > 10; among these, 42 also had an absolute value of ρ > 0.20. The pcadapt approach, which accounts for population differentiation while identifying outliers, detected 629 candidate loci across variables. LFMMs detected candidate loci, per variable, in the range of those detected by pcadapt.
However, decreasing the adopted q (from the suggested q = 0.1 to q = 10 −3 ) to adjust for the false discovery rate did not significantly In relation to the putative abiotic pressures, candidate locus 26,013 was associated with both maximum phosphate concentration and maximum dissolved oxygen concentration and mapped to a genomic region coding identified as the par-3 family cell polarity regulator beta (pardb3 gene), a highly conserved feature across metazoan genomes involved in cell polarity and adhesion (Thompson, 2022); loci 13,513 and 14,785, which associated with maximum phosphate concentration and maximum temperature respectively, mapped against P. marinus´ inositol-3-phosphate synthase 1 (isyna-1 gene) a molecule involved in the production of myo-inositol in a wide range of organisms including fish (Agranoff, 2009) and to P. marinus´ homeodomain HOXW10A-like gene and Tcl-1 like transposon, both previously identified to be associated with long-term plasticity of spinal cord locomotor network in sea lamprey (Bevan et al., 2008).

| DISCUSS ION
Despite being a key species in the understanding of the evolution of vertebrates (Smith et al., 2013;Sower, 2018), there remains a paucity of information on how populations of sea lamprey evolved in natural habitats. With their abundance declining in freshwater locations alongside the eastern Atlantic shore (Hansen et al., 2016), understanding connectivity among spawning locations, as well as the evolutionary potential of the species, is a pressing issue towards

| Sea lampreys constitute a large eastern Atlantic metapopulation limited by northward dispersal
The absence of homing behavior towards natal spawning grounds renders sea lamprey an exception among diadromous species. Thus, the migratory behavior associated with the mandatory marine phase of sea lamprey would suggest a homogenization of genetic variance across the species' distribution range. It is not surprising then that patterns of panmixia have been reported among populations of sea lamprey on the western Atlantic coast (Waldman et al., 2008), on the eastern Atlantic coast (Almada et al., 2008;Lança et al., 2014;Mateus et al., 2012;Rodríguez-Muñoz et al., 2004), as well as among populations of Pacific lamprey Entosphenus tridentatus (Spice et al., 2012).
Despite the upscale to a genome-wide representation of molecular signatures, we observed a strikingly similar pattern of panmixia supported both by homogenized genetic diversity and almost complete absence of differentiation among sampled locations. However, our results did suggest that Atlantic sea lampreys may have limits to their dispersal, as also identified in their Pacific counterpart (Spice et al., 2012). Support for a limited dispersal scenario was provided by both the discriminant analyses on the whole SNP dataset, which pinpointed the north easternmost location of Rolfsan with some degree of differentiation in relation to all other locations, as well as a significant and positive correlation between the number of private alleles and latitude, which may be interpreted as a signature of limited connectivity with southern populations. The barrier imposed by the salinity gradient to other marine species across the Kattegat-Skagerrak F I G U R E 2 Relationships between location-specific alleles and geography. Panel a represents the relationship between the absolute number of private allele differences (x-axis) as a function of geographic distance (y-axis) between population pairs. Panel b investigates the relationship between latitude (x-axis) and the log-scaled number of private alleles (y-axis) with sampled locations marked in the dots. Smooth bands of the linear model log(pa) ~ latitude were applied to aid the visualization of the trends.

| Investigating dispersal limitations under the magnifying glass of molecular data
With generally low genome-wide differentiation over large spatial areas, allelic frequencies deviating from the neutral spectrum in marine populations have been increasingly associated with environmental cues characteristic of a heterogenous oceanic environment (Gagnaire & Gaggiotti, 2016). For example, seascape genomics have revealed that despite high levels of gene flow, temperature could explain distribution patterns in abalone (Haliotis laevigata), American lobster (Homarus americanus) (Benestan et al., 2016) and cod (Clucas et al., 2019;Therkildsen et al., 2013), while salinity extended associations to herring (Clupea harengus) and cod (Berg et al., 2015;Guo et al., 2016). Sea lampreys have a very specialized set of life-history traits whereby the freshwater phase hinders a full-scale open ocean dispersal, as shown by the stark differentiation between Eastern and Western Atlantic coasts-AMOVA FCT = 79.5% between regions, p < 0.001 (Genner et al., 2012), and their obligate parasitism could render their dispersal to be passively dependent on that of the hosts.
Thus, it has been argued that limits to marine dispersal might be imposed both by abiotic and biotic factors (Guo et al., 2016).
By utilizing three different approaches to identify candidate loci under selection, we expected to mitigate issues related to high frequencies of false positives (type II errors), (Capblancq et al., 2018;Lotterhos & Whitlock, 2015;Narum & Hess, 2011;Nielsen et al., 2020), with the logical trade-off of increasing the rate of false negatives (Dalongeville et al., 2018).
Here, we report a relationship between the abundance of confirmed host species´ such as the Atlantic cod, and the twaite shad, and allelic frequencies of loci located within the vicinity (maximum 1000 bp) of genomic regions annotated to immune-related Other genomic regions identified either on these loci or in other loci associated with these same biotic variables further support regulatory functions to be under selection in our target species. These results are suggestive of a hypothetical scenario of local adaptation where sea lampreys have evolved immune responses against the antigens of the most likely hosts that they will encounter at sea. It is known that sea lampreys elicit immune response from their hosts (Bullingham et al., 2022;Goetz et al., 2016), and thus it is plausible, at the light of host-parasite co-evolution, to expect adaptive signatures on the sea lamprey genome (Ebert & Fields, 2020;Eizaguirre et al., 2012). To the best of our knowledge, this is the first time that genomic signatures of putative adaptation to host species have been shown in sea lamprey and consequently, at such an early stage of vertebrate adaptive immune system evolution (Boehm et al., 2012;Finstad & Good, 1964;Li et al., 2013). A plausible caveat to these observations relies on the accuracy of host abundances collected for this study. Food and Agricultural Organization fisheries datasets are based on capture and aquaculture data, and despite being extensively utilized to understand trends in fish abundance in the context of sustainable fisheries (Froese et al., 2012;Garibaldi, 2012;Naylor et al., 2021), their interpretation has historically experienced criticism (Daan et al., 2011;Pauly & Zeller, 2016). Alternatively, it could be the case that putative adaptive signatures reflect hostparasite co-evolution where the sea lamprey is the host. However, such scenario would imply the frequency distribution of a hypothetical sea lamprey parasite to co-vary with an abundance of sea  R o l f s a n -S w e F I G U R E 5 Venn diagram of candidate loci identified by multiple outlier detection techniques. Shown here is the number of candidate loci identified by each software. LFMM approach implemented in the R package lea is colored in yellow for the abiotic variables and in green for the biotic variables.
lamprey hosts (here investigated) across spatial scales. As in the case of sea lamprey being the parasite, data on sea lamprey's parasites is scarce and the functional link to fitness remains elusive (Shavalier et al., 2021).
From an abiotic perspective, we find it interesting that loci associated with maximum phosphate concentration and maximum temperature relate to a genomic region coding for the isyna-1 gene, involved in the myo-inositol (formerly referred to as vitamin B 8 ) pathway. The biosynthesis of myo-inositol is an evolutionary conserved pathway (Majumder et al., 2003), known to modulate several vital physiological functions in living organisms, such as growth, immune efficiency, osmoregulation, and exposure to heavy metal concentrations (Cui et al., 2022). Interestingly, this locus has been identified as a candidate under selection also in the sand goby (Pomatoschistus minutus) in the context of adaptations to less saline conditions after postglacial expansion (Leder et al., 2021). Because the same candidate loci also identified other genomic features such as the HOXW10Alike gene and the Tcl-1-like transposon (both involved in highly conserved mechanisms of homeostasis and plasticity of spinal cord locomotor network, respectively) it is suggested that the selection for organismal balance is plausible in the context of the heterogenous environment that sea lamprey experiences during its life cycle.
While the fresh-to saltwater transition is generally important, our results suggest also the importance of the abiotic environment and putative anthropogenic impacts, such as phosphate concentration (Farmer, 2018), which along with dissolved oxygen concentration, is critical for the thermal regulation and metabolic processes of aquatic ectotherms (Pörtner & Knust, 2007). Temperature has also been shown multiple times to be a selective pressure in the marine environment (Conover et al., 2006;Pörtner et al., 2001). Nevertheless, we are unable to provide evidence to support the functionality of the associations reported in our study. In a broader context, we argue that these patterns are likely to relate to spatially varying selection, a scenario previously reported in the American eel (Anguilla rostrata), a catadromous species with an equally diverse distribution range and complex life cycle (Gagnaire et al., 2012).

| Concluding remarks
We show that sea lampreys in the Eastern Atlantic coast constitute a nearly-panmictic population, with North Sea imposing some limits to the species' dispersal capacity. Within the limitations of our sampling strategy, we support the idea that the North Sea-Baltic Sea transition can be a reservoir of unique sea lamprey diversity (whether it be genetic, life-history-based, etc), and thus monitoring and conservation of this anadromous species should be integrated into activities that aim to preserve biodiversity in this region. Regarding the factors mediating the dispersal ability of a species without homing behavior, we showed one possibility would be spatially varying selection shaping the adaptive seascape across sea lamprey distribution.
While we find particularly interesting the association between candidate loci mapping to jawless-specific immune genes and host abundances, we acknowledge more work is necessary to further validate these results. Nevertheless, these observations could provide the background to investigate how host-parasite co-evolution could have mediated the evolution of earlier versions of vertebrate adaptive immune repertoire under natural contemporary conditions. Söderman for the donation of samples.

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare no conflict of interests.

DATA AVA I L A B I L I T Y S TAT E M E N T
Raw RADseq reads will be deposited in NCBI's Short Read Archive