The Genomic Basis of the Genetic Differentiation and Local Adaptive Evolution of Pampus Echinogaster based on SLAF-seq

Background: Factors such as climate change (especially ocean warming) and overshing have led to a decline in the supply of Pampus echinogaster and a trend of decreasing age. Exploring the genetic structure and local adaptive evolutionary mechanisms is crucial for the management of P. echinogaster. Results: This population genomic study of nine geographical populations of P. echinogaster in China was conducted by specic-locus amplied fragment sequencing (SLAF-seq). A total of 935,215 SLAF tags were obtained, and the average sequencing depth of the SLAF tags was 20.80×. After ltering, a total of 46,187 high-consistency genome-wide single nucleotide polymorphisms (SNPs) were detected. Based on all SNPs, the overall genetic diversity among the nine P. echinogaster populations was high. The Shantou population had the lowest genetic diversity, and the Tianjin population had the highest. Meanwhile, the population genetic structure based on all SNPs revealed signicant gene exchange and insignicant genetic differentiation between the nine P. echinogaster populations. Based on pairwise genetic differentiation (F ST ), we further screened 1,852 outlier SNPs that might have been affected by habitat selection and annotated SLAF tags containing these 1,852 outlier SNPs using Blast2GO. The annotation results showed that the genomic sequences at the outlier SNPs were mainly related to material metabolism, ion transport, breeding, stress response, and inammatory reactions, which may be related to the adaptation of P. echinogaster to different environmental conditions (such as water temperature and salinity) in different sea areas. Conclusions: The high genetic similarity of nine P. echinogaster populations may have been caused by the population expansion after the last glacial period, the lack of balance between migration and genetic drift, and the long-distance diffusion of eggs and larvae. We suspected that variation of these genes associated with material metabolism, ion transfer, breeding, stress reactions, and inammatory reactions were critical for adaptation to spatially heterogeneous temperatures in natural P. echinogaster populations.


Background
Widely distributed organisms inevitably face diverse habitats, and due to habitat heterogeneity-induced differences, natural selection can improve the habitat adaptation and utilization e ciency of geographically separated populations by changing their phenotypic and genomic characteristics. Ultimately, different geographical groups can undergo adaptive differentiation, even leading to new species [1,2]. The effective population size (Ne) of marine organisms is usually large, and seawater can ow with minimal physical barriers, which may result in a high level of gene exchange between marine biogeographical populations and make them less susceptible to genetic drift, generally limiting their genetic differentiation [3][4][5]. The different life history characteristics (such as spawning type, length of the pelagic larval period, and migration mode) of marine biogeographical populations and long-term heterogeneity in environmental factors can promote adaptive differentiation of populations in different habitats [6]. At the same time, over shing, habitat loss, and climate change have reduced some population sizes, leading to losses of genetic diversity. The resultant genetic drift and inbreeding may further reduce population sizes, thus affecting the adaptation of populations to their habitats [7,8]. Therefore, understanding the genetic mechanism of adaptive differentiation related to habitat heterogeneity in different populations can not only reveal the evolutionary history of species but also effectively de ne protection units in the context of climate change in order to achieve the rational management and protection of resources.
Pampus echinogaster belongs to Stromateidae, Perciformes, and is widely distributed in the northwestern Paci c Ocean, including the waters of Russia, Japan, the Korean Peninsula, China's Bohai Sea, the Yellow Sea, the East China Sea, and the northern South China Sea [9][10][11][12][13]. The complex paleogeographic dynamics of the northwestern Paci c Ocean may affect the formation of different geographical populations of marine organisms, including P. echinogaster, through events such as geographical isolation caused by glaciers and postglacial recolonization, while the hydrological conditions and environmental speci city of different habitats may further cause habitat-based adaptive differentiation of different geographical populations of marine organisms [14]. P. echinogaster is a seasonal migratory sh species. Its inner spawning cycle of the gonad development cycle is short (about 15 days) during somewhat long reproductive periods [11], and it can produce buoyant eggs with a pelagic larval period of at least one month [15]. Thus, it can spread over long distances. The complex oceanic current system in the northwestern Paci c Ocean results in a mix of recruited populations and overwintering populations of juvenile P. echinogaster, and these factors may eventually shape the genetic homogeneity between different geographical populations of this species [16]. In summary, the complex life history characteristics of this species have made it di cult to study its population differentiation. In recent years, factors such as climate change (especially ocean warming) and over shing have led to a decline in the supply of P. echinogaster and a trend of decreasing age [16]. In this context, it is necessary to evaluate the population differentiation of P. echinogaster in order to achieve the rational management of this resource.
In previous studies, researchers mainly divided P. echinogaster into the Yellow Sea population, Bohai Sea population, and East China Sea population based on their seasonal migratory characteristics [17] found the same population structure of P. echinogaster based on microsatellite molecular markers. However, based on mitochondrial control region sequences and six pairs of microsatellite loci, the results of our previous study were different from the population division of traditional sheries; that is, we found that the seven P. echinogaster populations in the waters of the Yellow Sea, the Bohai Sea, and the East China Sea may belong to one free-mating group, and there was no signi cant genetic differentiation between the populations [16]. These discrepant research results could be explained by the lack of methods for genome-wide genotyping.
In addition, since they have been limited to neutral markers, few studies have analyzed the habitat adaptation mechanisms of P. echinogaster populations. However, it is undeniable that environmental factors such as temperature, salinity, and pathogens vary greatly across the widely distributed range of P. echinogaster, which must have led to adaptive habitat differentiation between different geographical populations [18,19].
In this context, genome sequencing by high-throughput sequencing technologies can increase the number of genetic markers, including both neutral and adaptive markers, to help achieve the ne assessment of population-genetic parameters in different P. echinogaster populations and to clarify the genetic mechanisms of its habitat adaptation. For speci c-locus ampli ed fragment sequencing (SLAF-seq), the length of an effective read is 2 × 100 bp, and more than 100,000 tags can be developed at one time, so genome-wide scans of genetic markers can be completed for almost any species, contributing to a better understanding of the genetic structure of the species and its population-genomic characteristics under habitat selection [20].
In this study, SLAF-seq was used for the rst time to scan the genome-wide genetic markers of nine P. echinogaster populations off the coast of China. We used a population-genomic method to quantify the genetic variation in P. echinogaster populations and revealed the genetic differentiation between them. Then we predicted the sizes of different geographical P. echinogaster populations under climate change and human activity. Finally, we determined the regulatory mechanism related to the habitat adaptation of geographical P. echinogaster populations. In short, the results of this study could improve our understanding of the population structure and habitat-based adaptive differentiation of P. echinogaster and provide basic information for the accurate determination of its protection units. These contributions will be of great signi cance for maintaining the sustainability of P. echinogaster resources against the background of climate change and human activity.

SLAF-seq results
After ltering, a total of 448.17 million high-quality paired-end reads were obtained from SLAF-seq of 135 P. echinogaster samples. The average Q30 of the reads was 95.59%, and the average GC content was 43.17%. In addition, for the control sequence of Nipponbare, which was used to evaluate the accuracy of the established library, 1.79 M reads were obtained. After clustering high-quality reads, we obtained a total of 935,215 SLAF tags, and the average sequencing depth of the SLAF tags was 20.80×. Among the SLAF tags, 461,932 were polymorphic. With GATK and SAMTOOLS, 6,152,196 SNPs were further obtained. After the removal of low-quality SNPs, we obtained 46,187 highly consistent SNPs for subsequent genetic structure analysis.

Population genetic diversity and population structure
The π and Tajima's D values of the 46,187 SNPs exhibited similar uctuating patterns among the nine P. echinogaster populations (Fig. 2). The Tajima's D values of most SNPs in the nine P. echinogaster populations were less than 0, indicating that many rare alleles existed at a high frequency, which may have been due to the large population size.
Statistical results of genetic diversity at the 46,187 SNPs showed that the average H O , average H E , and π of the nine P.  Table. 1). Among the nine populations, the Shantou population had the lowest genetic diversity, as it had the lowest average H O , average H E , and π. These metrics were the highest in the Tianjin population, meaning that the genetic diversity level in this population was the highest. The statistical results for genetic diversity also showed that the percentage of polymorphic SNPs in each of the nine P. echinogaster populations was relatively high (86.52-93.28%), suggesting that the relatively large effective population sizes caused this result. The nine P. echinogaster populations had signi cantly low values of genetic differentiation, ranging from -0.00121 to 0.00125 (Table. 2). ADMIXTURE analysis of 46,187 SNPs revealed that an optimum K value of 2 with a minimum CV error. The clustering results from ADMIXTURE based on K=2 showed that all the individuals in the nine populations were clustered into one group, and the clustering results from ADMIXTURE based on K = 3 to 7 showed a similar result (Fig. 3A). The clustering pattern from ADMIXTURE was validated by analysis of NJ trees (Fig. 3B) and the PCA results ( Fig. 3C) for all 46,187 SNPs.
In addition, NetView P software revealed ne genetic structure among the nine P. echinogaster populations. In this study, the optimal KNN value was determined to be 38 based on various algorithms, and the network topological results showed that all the individuals in all nine populations were clustered together (Fig. 3D). The AMOVA results (Table. 3) also showed that the mean value representing the genetic differentiation of the nine populations was 0.03753, which was statistically signi cant, while statistically nonsigni cant genetic differentiation existed between the three groups (F CT = 0.00020).
Gene exchange between the nine P. echinogaster populations divMigrate-Online software was used to analyze the gene exchange between the nine P. echinogaster populations, and the results showed that there was high-intensity gene exchange between them (Fig. 4), providing good support for the above results on genetic differentiation.

Analysis of local adaptation
After running the Lositan program ve times, a total of 1,852 outlier SNPs were screened from the 46,187 SNPs (Fig. 5), while only four outlier SNPs were screened by BayeScan software. Thus, these two F ST -based methods successfully uncovered 1,852 outlier SNPs, which were used for subsequent analysis of the local adaptation of P. echinogaster.
Blast2GO was used to annotate the SLAF tags containing the 1,852 outlier SNPs. A total of 604 SLAF tags containing outlier SNPs were successfully annotated. GO functional classi cation was further applied using Blast2GO, in which 453 SLAF tags containing outlier SNPs could be assigned to 30 GO terms. The successfully annotated SLAF tags were related to multiple biological processes, molecular functions, and cellular components (Fig. 6). The biological processes mainly included cell processes, metabolic processes, and biological regulatory processes. The molecular functions mainly included molecular binding, catalytic activity, and transport activity. The cellular components mainly included cellular components, cells, and organelles. Moreover, 194 SLAF tags containing outlier SNPs were assigned to 249 metabolic pathways (Table. S1). These pathways were primarily related to material metabolism, ion transport, breeding, stress reactions, and in ammatory reactions.

Discussion
Compared with traditional molecular markers, SNPs can be typed across the whole genome with more simpli ed genome sequencing technology, and they can reveal more sophisticated genetic information in evolutionary biology, especially for species with insigni cant genetic differences. Previous studies on the genetic structure of P. echinogaster populations were mainly based on ecological characteristics [35,36], mitochondrial DNA sequences [16], or small numbers of microsatellite loci [16,17]. The development of population genetics research in P. echinogaster was restricted by the limited number of genetic markers, and therefore studies on the genetic structure of some populations have yielded different results. It is di cult to detect adaptive characteristics related to the habitats of P. echinogaster populations by traditional molecular markers [37]. Therefore, in this study, we used genome-wide SNPs obtained from SLAF-seq to analyze and evaluate the genetic structure and habitat adaptation characteristics in regionally represented samples of P. echinogaster we collected from coastal waters of China.
Gene ow between the nine P. echinogaster populations This is the rst study exploring the genetic structure and the genetic diversity of different geographical populations of P. echinogaster at the genomic level. The results showed that in the nine P. echinogaster populations, H O , H E , and π were relatively high, and the average F IS was relatively low. We speculate that the rapid population expansion of this species in the Pleistocene and the heterogeneous habitats in its wide distribution provided a basis for the maintenance of high genetic diversity in natural populations. Previous research has also shown that in coastal waters of China, the existing resources and effective female population sizes of P. echinogaster are large and stable, the male:female ratio in the spawning period is 1:1, and the species exhibits fast growth, a short gonadal development cycle, and batch spawning. These life history characteristics can bolster the size of recruit populations, thus facilitating the accumulation of more genetic mutations and rich genetic diversity between the populations.
Although P. echinogaster occupies a wide variety of habitats, this study con rmed that there was no signi cant genetic differentiation between P. echinogaster populations based on multiple analytical methods. Gene ow analysis also showed frequent gene exchange between P. echinogaster populations, which can be explained by random mating and suggests that high genetic homogeneity exists between P. echinogaster populations. These results are consistent with our previous results based on mitochondrial DNA sequences and microsatellite loci [16]. We speculate that population expansion of this species after the last glacial period and lacking balance between migration and genetic drift may be two main reasons for their unclear genetic structure [16,38]. Similar genetic structure is also found in species with closed expansion times, such as the horse mackerel [39], Sebastes schlegeli [40], the spotted-tail goby Synechogobius ommaturus [41], and the small yellow croaker [42,43]. At the same time, the open marine environment lacking obvious physical barriers; the strong diffusion ability of eggs, larvae, and adults; and the high randomness of diffusivity provided by ocean currents are key reasons for the genetic homogeneity of marine organisms [44]. P. echinogaster is a seasonal migratory sh species that lays eggs in batches. The spawning period can last for more than two months, and the number of eggs per mother can be 117,000-218,000 [35]. In addition, the eggs of P. echinogaster are buoyant, and the larvae have a pelagic period of at least 1 month. These life history characteristics may be conducive to the long-distance diffusion of eggs and larvae and can promote gene exchange between P. echinogaster populations, ultimately resulting in very low population genetic differentiation within a very wide distribution range [45]. In addition, we speculate that the complex ocean current system along the coast of China may further affect the population size and genetic connectivity of this species by increasing the diffusion of eggs and larvae. Currently, P. echinogaster populations form a continuous distribution and present a population-wide genetic pattern [16]. In summary, the life history characteristics of P. echinogaster and the in uence of marine environmental factors may cause the mixing of P. echinogaster populations, thus resulting in frequent gene exchange and high genetic homogeneity between them.

Local adaptation of P. echinogaster populations
The annotation of SLAF tags containing outlier SNPs indicated the local adaptation of different P. echinogaster populations. P. echinogaster is a seasonal migratory sh with long-distance diffusion [15,16], so aquatic environmental conditions (such as salinity and temperature) vary among geographical populations of this species. The GO annotation results showed that outlier SNPs mainly participate in metabolic processes and cellular processes, and their functions mainly involve protein binding and catalytic activity. These results suggest that driven by certain environmental factors unique to each P. echinogaster population, some regions of the genome corresponding to adaptive differentiation exist, leading to differences in physiological function. The KEGG results showed that multiple outlier SNPs were related to material metabolism. Fish use substances (amino acids, fat, and sugar) stored in their bodies to adapt to changes in habitat factors and for the energy required for life activities, such as migration [46][47][48]. On the other hand, sh also regulate the fatty acid composition of their bodies to increase membrane uidity, thereby adapting to changes in habitat temperature [13]. Therefore, genes related to material metabolism may play an important role in the adaptation of P. echinogaster to the temperatures of different geographical environments. Ion (such as calcium) transport signaling pathways also play important roles in the process of local adaptive evolution in P. echinogaster. For example, calcium ions are involved in sh reproduction, development, learning and memory, mitochondrial function, muscle contraction, and other functions [49]. Calcium signaling is also thought to play an important role in the regulation of ion exchange and osmoregulation, which makes this pathway a reasonable target of spatially differentiated selection under osmotic pressure in P. echinogaster because this species feeds on a wide variety of prey. Differences in environmental factors, such as temperature and salinity, may also lead to different breeding times in different P. echinogaster populations. In this study, the oocyte meiosis pathway was enriched at the molecular level, also providing evidence of this phenomenon. Gene mutations related to in ammatory reactions may provide evidence for the speci c resistance of different geographical groups to different habitats. In fact, a correlation between the immune response and the adaptive evolution to habitat temperature has been con rmed in populations of many other marine organisms, such as Syngnathus scovelli [50], Haliotis laevigata [19], and Trachidermus fasciatus [13]. It is undeniable that local adaptation of P. echinogaster populations may be affected by many factors, such as water quality, heavy metals, water temperature, salinity, parasites, and predators. The mechanism of local adaptation may be very complex, and various factors probably interact to drive this process.

Conclusion
In this study, genome-wide information was obtained from nine P. echinogaster populations, and the genetic structure of the populations and the genomic characteristics of local adaptation were explored. Genetic structure analysis showed that the nine populations had high genetic similarity, which may be due to population expansion after the last glacial period, the lack of balance between migration and genetic drift, and the long-distance diffusion of eggs and larvae. Different habitat conditions might have caused the maintenance of many genetic mutations in the different populations. The annotation of SLAF tags containing outlier SNPs indicated that genes related to material metabolism, ion transfer, breeding, stress reactions, and in ammatory reactions were essential to the habitat adaptation of P. echinogaster populations.

Sample collection and SLAF-seq
To ensure the accuracy of the sample source, a total of 135 P. echinogaster samples (Fig. 1) were collected at nine different locations along the coast of China (15 individuals at each location) from October 2017 to December 2017. External morphological identi cation of the samples was performed mainly by referring to Nakabo [12] and Li et al. [13]. For each fresh P. echinogaster sample, sterile scissors and tweezers were used to obtain back muscle tissues. All muscle tissues were stored in a cryogenic freezer at -80°C for future experiments. The phenol-chloroform method was used to extract the genomic DNA from each sample. Then, 1% agarose gel electrophoresis and an Invitrogen Qubit uorometer were used to assess the degradation degree and concentration of the genomic DNA. Quali ed DNA was submitted to Biomarker Technologies Corporation (Beijing, China) for library construction and sequencing.
According to the genome size and guanine+cytosine (G+C) content of P. argenteus, enzymatic digestion prediction was performed. The analysis software for SLAF enzymatic digestion prediction independently developed by Biomarker Technologies Corporation [20] was used to predict the digestion of the reference genome, and the optimal digestion scheme was selected according to the following principles: (1) the percentage of enzymatic fragments located in the repeat sequences was as low as possible; (2) the enzymatic fragments were distributed as evenly as possible across the genome; (3) the length of each enzymatic fragment was consistent with that of the speci c experimental system; and (4) the number of obtained enzymatic fragments (SLAF tags) agreed with the expected number of tags. With the HaeIII restriction enzyme, the genomic DNA of each quali ed sample was digested separately. Each obtained digestion fragment (SLAF tag) had a poly (A) tag added to the 3' end, the dual-index sequencing adaptors were attached [21], and this new sequence was ampli ed by polymerase chain reaction (PCR), puri ed, and mixed. The target fragments were taken out by cutting the gel. After the samples were quali ed by the library, they were sequenced using the Illumina HiSeq 2500 platform. To evaluate the accuracy of enzyme digestion, Nipponbare was selected as the control for sequencing.

SNP detection and screening
The raw data obtained from sequencing were characterized using the dual-index sequencing adaptor. The raw reads of each sample were obtained, and reads with a quality score lower than 30 were excluded. Based on sequence similarity, the remaining high-quality reads of each sample were clustered, and the reads with a similarity greater than 98% were considered to have clustered as a single SLAF tag. A SLAF tag with sequence differences between samples can be de ned as a polymorphic SLAF tag. The sequence with the greatest sequencing depth for each SLAF tag was taken as the reference sequence. The Burrows-Wheeler alignment tool was used [22] to align the reads to the SLAF tags, both GATK [23] and SAMTOOLS [24] were used to perform single nucleotide polymorphism (SNP) calling, and the overlapping SNPs in the GATK and SAMTOOLS results were used as the nal SNP dataset. The generated SNPs were saved in a variant call format (VCF) le. To ensure the accuracy of subsequent analyses, VCFtools [25] was used to screen SNPs with the following parameters: -MAF 0.01 (minimum allele frequency> 0.01); --max-missing 0.1 ( ltering out the genotypes with less than 90% data); --min-meanDP 150 (minimum mean depth of coverage > 150); --min-alleles 2 --max-alleles 2 (only two alleles); --minGQ 98 (quality score> 98); --minQ 30 (retaining the loci with a quality score> 3); --remove-indels (excluding the loci containing indels); and -HWE 0.05 (excluding the loci with P < 0.05 in the Hardy-Weinberg equilibrium test).

Genetic diversity and population genetic structure
To describe the genetic diversity levels of all genome-wide SNPs of the nine P. echinogaster populations, with TASSEL software (version 5.2.31) [26], the nucleotide diversity (π) and Tajima's D value of each SNP in each P. echinogaster population were estimated. With Circos software [27], the π, expected heterozygosity (H E ), and Tajima's D of each SNP were visualized.
Using the "populations" module of Stacks software (version 1.34) [28], the genetic diversity levels of the nine populations, including the polymorphic loci, π, observed heterozygosity (H O ), H E , and inbreeding coe cient (F IS ), were statistically analyzed. Arlequin software (version 3.5.2.2) [29] was used to estimate the pairwise genetic differentiation (F ST ) of P.
echinogaster, and 10,000 permutations were used to analyze the signi cance of F ST .
Based on all SNPs, four methods were used to estimate population structure and individual clustering within a population.
(1) PGDSpider software (version 2.0.5.2) [30] was used to convert the VCF le of all SNPs to STR format, and then ADMIXTURE software, based on the maximum likelihood method (version 1.3.0) [31], was used to evaluate ancestors. This software uses a fast numerical optimization algorithm to achieve a fast analysis rate. During analysis, the range of genetic clusters (K) was set to 2-7, and each analysis was repeated 10 times. Finally, based on the cross-validation (CV) error corresponding to the optimal K value, the clustering results were plotted. (2)    Statistics on the genetic diversity of a genome-wide set of SNP loci in the nine P. echinogaster populations