The swimming crab, Portunus trituberculatus, is an economically important crab in the Chinese marine fisheries and mariculture industry. It is widely distributed in the coastal areas of South-East Asia and has been farmed for more than 30 years1,2,3. Over the past few decades, consumption of swimming crab has gradually increased owing to its delicious taste and rich nutrient content4. Among the main producers, China ranked first with an annual production of 559,796 tons according to the China Fisheries Statistical Yearbook (2022) published by the Ministry of Agriculture, China. However, with the development of intensive farming and the marine fishing industry in recent years, germplasm resources of P. trituberculatus have declined dramatically due to over-exploitation and environmental deterioration5,6. In addition, the heavy demand for wild parents for artificial propagation has reduced the genetic diversity of natural populations7. These declines underscore the importance of monitoring the genetic diversity of P. trituberculatus populations to protect germplasm resources and facilitate molecular marker-assisted selection (MAS).

Investigating the genetic diversity of a species is a prerequisite for the effective exploration and utilization of its germplasm8. A high level of genetic diversity indicates strong biological survivability and environmental adaptation, which is required for sustained genetic improvement and stable inheritance of desirable traits9. Conversely, low genetic diversity can lead to reduced adaptability and viability, and ultimately to the degradation of the species10. In aquaculture, genetic diversity constitutes a fundamental resource for improving the quality of stock11. However, in breeding populations of P. trituberculatus, long-term artificial directional selection eventually leads to a decline in genetic diversity12. Moreover, genetic diversity lost through overfishing is difficult to recover13. To formulate an effective conservation strategy, it is therefore necessary to evaluate the genetic diversity and population structure of P. trituberculatus. In our previous study, SNP markers determined by genotyping-by-sequencing (GBS) revealed a low level of genetic diversity in P. trituberculatus along the coastal waters of China14. To evaluate the impact of massive releases on natural populations, previous studies monitored temporal variation in genetic diversity and structure in Panjin and Yingkou using microsatellite markers, and suggested that large-scale stock enhancement of P. trituberculatus poses potential genetic risks to wild populations15,16. In contrast, hatchery stock enhancement resulted in no reduction in genetic diversity in wild populations of P. trituberculatus in the Yangtze Estuary17.

The development of high-throughput sequencing technologies has made it much easier to identify DNA molecular markers for genetic research. Among known DNA molecular markers, simple sequence repeats (SSRs) offer the advantages of co-dominant inheritance, high polymorphism, and wide distribution throughout the genome18,19,20. RNA-seq has become a popular high-throughput sequencing technology for developing SSR markers because of its wide dynamic range, high accuracy, and strong sensitivity21. In addition, compared with genome-derived SSRs, transcriptome-derived SSRs are characterized by high efficiency, strong transferability, and association with potential genes22. Cao et al.23 first analyzed the transcriptome of Crassadoma gigantea using RNA-seq technology, identified 12 polymorphic SSRs, and found several genes related to the growth and immunity of C. gigantea. These results should facilitate future studies of population structure and conservation genetics in that species. In aquatic crustaceans, Zhang et al.24 conducted transcriptome sequencing on the male and female gonads of Portunus sanguinolentus and detected 93,196 SSR loci. In Pachygrapsus marmoratus, 43,915 SSRs were identified by RNA-seq, providing a reliable resource for investigating biological responses to pollution in intertidal and marine populations25. Lv et al.6 identified 22,673 SSRs by transcriptome analysis of P. trituberculatus, which provided a material basis for genetic linkage and quantitative trait loci analyses. The objective of the current study is to evaluate the genetic diversity and population structure of P. trituberculatus in the Bohai Sea with transcriptomic SSRs. The findings will contribute to understanding the population genetic structure of P. trituberculatus in the Bohai Sea and will be useful for improving management and conservation strategies for this species.

Material and methods

Sample collection and DNA extraction

A total of seven populations were collected from the Bohai Sea (Fig. 1, Table 1). The six wild populations were Dalian (DL), Huludao (HLD), Qinhuangdao (QHD), Huanghua (HW), Dongying (DY), and Penglai (PL). One cultured population (HC), originating from the Bohai Sea, was sampled from the national breeding farm of swimming crabs in Huanghua (Hebei, China). The claws of all individuals were collected, immediately preserved in 95% ethanol, and stored at −20 °C. Genomic DNA was isolated from claw muscle using the TIANamp Marine animal DNA extraction kit (TIANGEN, Beijing, China) following the manufacturer's recommended protocols. After extraction, the quality and concentration of the DNA samples were determined using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific). The samples were then diluted to 100 ng/μL and stored at −20 °C.

Figure 1

Swimming crab sampling locations. Note: This figure was created with DIVA-GIS 7.5 software (http://www.diva-gis.org/).

Table 1 Sampling information of seven P. trituberculatus populations from the Bohai Sea.

PCR amplification and capillary electrophoresis

Forty pairs of SSR primers were obtained from the transcriptome data in our previous study26 (Table 2). All forward primers were labeled with the fluorescent dye 6-carboxy-fluorescein (FAM). Polymerase chain reaction (PCR) amplification was performed in 20 μL reaction volumes containing 2 μL of template DNA, 2 μL of each primer (2.5 μmol/L each), 10 μL of 2 × Es Taq Master Mix (CWBIO, Beijing, China) and 4 μL of ddH2O. The thermal profile consisted of an initial denaturation (5 min at 95 °C), followed by 35 cycles of denaturation (30 s at 94 °C), annealing (30 s) and extension (30 s at 72 °C), and a final extension (10 min at 72 °C). After amplification, PCR products were diluted tenfold with sterile water. Capillary electrophoresis (CE) was conducted on an ABI 3730XL Genetic Analyzer (Applied Biosystems, Foster City, CA) following the manufacturer's instructions. The pooled sample was composed of 20 μL Hi-Di formamide and 0.2 μL GeneScan 500 ROX size standard, and each CE sample contained 1 μL of diluted PCR product and 15 μL of pooled sample. Allele sizes (in base pairs) were determined with GeneMarker® Fragment Analysis Software (SoftGenetics LLC®, State College, PA, USA) by comparing the peak positions of each sample with the positions of the internal size standard in each lane.

Table 2 Characteristics of 40 SSR loci for P. trituberculatus.

Data analysis

Genetic diversity within P. trituberculatus populations was estimated from the number of alleles (Na), the effective number of alleles (Ne), Shannon’s diversity index (SI), observed heterozygosity (Ho) and expected heterozygosity (He), all calculated using POPGENE version 1.327. Polymorphism information content (PIC) was estimated from allele frequencies with PIC-CALC software28. Null allele frequencies (Fna) for the SSR loci were calculated using GenePOP29. Conformity to Hardy–Weinberg equilibrium (HWE) at each locus was assessed from P values calculated with POPGENE version 1.3. Genetic differentiation and variation were inferred from Nei's genetic distance (D)30 and genetic identity (I), calculated with POPGENE version 1.3, and from F-statistics (Fst, Fis), calculated by analysis of molecular variance (AMOVA) in GenAlEx 6.531 with 999 permutations. Gene flow (Nm) was inferred from the formula Nm = (1 − Fst)/(4Fst)32.
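As an illustration of the per-locus statistics named above, the following minimal Python sketch computes Na, Ne, He, SI and PIC from a list of allele frequencies, and Ho from a list of genotypes. It assumes the textbook definitions (Ne = 1/Σp², He = 1 − Σp², SI = −Σp ln p, and the Botstein form of PIC) without any small-sample correction, so the values may differ slightly from those reported by POPGENE or PIC-CALC. The function names and the example input are hypothetical and are not taken from the study.

```python
import math


def diversity_indices(freqs):
    """Per-locus diversity statistics from allele frequencies that sum to 1.

    Uses uncorrected textbook formulas (an assumption of this sketch).
    """
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    p = [f for f in freqs if f > 0]          # ignore alleles that were not observed
    homozygosity = sum(f * f for f in p)     # sum of squared frequencies
    ne = 1.0 / homozygosity                  # effective number of alleles
    he = 1.0 - homozygosity                  # expected heterozygosity
    si = -sum(f * math.log(f) for f in p)    # Shannon's diversity index
    # Botstein correction term: sum over allele pairs i < j of 2 * p_i^2 * p_j^2
    cross = sum(
        2.0 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    pic = 1.0 - homozygosity - cross         # polymorphism information content
    return {"Na": len(p), "Ne": ne, "He": he, "SI": si, "PIC": pic}


def observed_heterozygosity(genotypes):
    """Fraction of individuals whose two alleles differ at a locus."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    heterozygotes = sum(1 for a, b in genotypes if a != b)
    return heterozygotes / len(genotypes)


if __name__ == "__main__":
    # Hypothetical locus with two equally frequent alleles.
    print(diversity_indices([0.5, 0.5]))
    # Hypothetical genotypes given as allele sizes in base pairs.
    print(observed_heterozygosity([(100, 102), (100, 100)]))
```

For a locus with two equally frequent alleles, these definitions give Ne = 2, He = 0.5 and PIC = 0.375, which shows why PIC is always somewhat lower than He for the same locus.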

A phylogenetic tree was constructed from Nei’s genetic distance in MEGA733 and used to test population grouping. Principal component analysis (PCA) was carried out using Canoco 4.5 to elucidate genetic relationships within and among P. trituberculatus populations. Based on the 40 polymorphic SSR loci, Bayesian model-based population genetic structure was inferred using STRUCTURE version 2.3.434. The putative number of populations (K) was set from 1 to 10, with three replicate simulations for each K value, each using 100,000 MCMC (Markov Chain Monte Carlo) iterations after a burn-in period of 100,000 iterations. The STRUCTURE output was entered into Structure Harvester35,36 to determine the optimum K value from the log probability of data (LnP(D)) and the ad hoc statistic ΔK, which is based on the rate of change in LnP(D) between successive K values. The replicate runs at the best K value were processed with CLUMPP37 and visualized with Distruct 1.1 software38.
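The ΔK statistic described above can be sketched in a few lines of Python. This is a simplified illustration, not the Structure Harvester implementation: it assumes ΔK is taken as the absolute second-order difference of the mean LnP(D) across successive K values, divided by the standard deviation of LnP(D) among replicates at that K. The function name and the LnP(D) values in the example are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp maps each K to the list of LnP(D) values from its replicate runs.
    """
    means = {k: mean(values) for k, values in lnp.items()}
    result = {}
    for k in sorted(lnp):
        if k - 1 not in means or k + 1 not in means:
            continue  # delta K is undefined at the smallest and largest K
        sd = stdev(lnp[k])  # sample standard deviation among replicates
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_diff = abs(means[k + 1] - 2.0 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


if __name__ == "__main__":
    # Hypothetical LnP(D) values, three replicates per K (not data from this study).
    runs = {
        1: [-1000.0, -1002.0, -998.0],
        2: [-900.0, -901.0, -899.0],
        3: [-890.0, -892.0, -888.0],
        4: [-885.0, -886.0, -884.0],
    }
    dk = delta_k(runs)
    best_k = max(dk, key=dk.get)
    print(dk, best_k)
```

Because ΔK needs both neighbouring K values, it cannot be evaluated at the ends of the tested range (here K = 1 and K = 10), which is one reason LnP(D) itself is also inspected.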

Results

Genetic diversity within populations

The genetic parameters of the 40 SSR loci are presented in Table 3. A total of 217 alleles were found, with an average of 5.425 per locus. The effective number of alleles (Ne) ranged from 1.785 to 10.271 with a mean of 4.264. Shannon’s diversity index (SI), observed heterozygosity (Ho) and expected heterozygosity (He) ranged from 0.885 to 2.404 (mean: 1.482), 0.405 to 0.950 (mean: 0.639) and 0.440 to 0.903 (mean: 0.725), respectively. PIC values ranged from 0.415 (TRAN1) to 0.895 (TRAN20) with an average of 0.685. Five SSRs (TRAN1, TRAN3, ZL05, DX14, and TRAN13) showed moderate polymorphism (0.25 < PIC < 0.5), and the remaining 35 SSRs showed high polymorphism (PIC > 0.5). Null allele frequencies (Fna) and the fixation index (Fis) varied from 0.029 (DX19) to 0.564 (TRAN13) and from −0.207 (DX19) to 0.478 (TRAN21), respectively, indicating the existence of null alleles and a heterozygosity deficit. Additionally, nine SSR loci conformed to HWE (P > 0.05), whereas the remaining 7 and 24 loci deviated from HWE at the P < 0.05 and P < 0.01 levels, respectively.

Table 3 Genetic parameters for 40 SSR loci.

The mean values of Na, Ne, SI, Ho, He, and PIC of seven P. trituberculatus populations ranged from 5.225 to 5.375, 3.794 to 4.103, 1.374 to 1.449, 0.624 to 0.654, 0.687 to 0.714, and 0.643 to 0.673, respectively (Table 4), revealing a relatively low level of genetic diversity in the cultured population (SI = 1.374, He = 0.687, and PIC = 0.643) in comparison with wild populations (SI ≥ 1.399, He ≥ 0.692, and PIC ≥ 0.651).

Table 4 Genetic diversity indices of seven populations of P. trituberculatus from the Bohai Sea.

Population genetic structure

Genetic structure analysis of all 420 P. trituberculatus individuals was performed, and the optimal K value was inferred with the ΔK method. ΔK peaked at K = 4 (Fig. 2), which indicated that the seven populations were divided into four subpopulations (Fig. 3). The populations of Dalian (DL), Dongying (DY), and Huludao (HLD) formed one subpopulation (blue). Similarly, the populations of Huanghua (HW), Penglai (PL), and Qinhuangdao (QHD) formed another subpopulation (red). In the cultured population (HC), the genetic composition of most individuals was homogeneous, but the individuals formed two subpopulations (green and yellow). The phylogenetic tree at the individual level, based on Nei's genetic distances, provided supplementary evidence: HC individuals were scattered across different branches, whereas DY individuals clustered together (Fig. 4).

Figure 2

Relationships between the number of clusters (K) and the corresponding Delta K statistics from structure analysis.

Figure 3

Population genetic structure based on the Bayesian clustering model among 420 P. trituberculatus individuals at K = 4.

Figure 4

The phylogenetic tree based on Nei's unbiased genetic distance (Nei, 1978) among 420 P. trituberculatus individuals.

The population clustering results showed that the seven populations of P. trituberculatus formed two main groups (Fig. 5). Group I included four populations: HC, QHD, PL, and HW. The HC and QHD populations clustered first, then joined the PL population, and finally the HW population. Group II included three populations: HLD, DL, and DY. Overall, DY and HC had the largest genetic distance, which suggested that the genetic structure of P. trituberculatus populations in the Bohai Sea was not significantly related to their geographical distribution. In addition, PCA showed that the first two principal components explained 3.94% (PC1) and 3.68% (PC2) of the total variation and could distinguish cultivated individuals from wild populations (Fig. 6). In summary, no obvious geographical distribution pattern was found, which indicated high genetic mixing and gene flow between individuals of different populations.

Figure 5

The phylogenetic tree based on Nei's unbiased genetic distance (Nei, 1978) among seven P. trituberculatus populations.

Figure 6

Genetic relationships of 420 P. trituberculatus individuals as revealed by principal component analysis (PCA) with 40 SSR loci.

Population differentiation and variation

Low differentiation (Fst = 0.001) and high gene flow (Nm = 249.750) were observed between the PL and QHD populations, whereas comparatively high differentiation (Fst = 0.060) and low gene flow (Nm = 3.917) were observed between the HC and DY populations (Table 5). Nei's genetic distance (D) and genetic identity (I) showed the same pattern: the HC and DY populations were the more distant pair (D = 0.177, I = 0.838) and the PL and QHD populations the closer pair (D = 0.025, I = 0.975) (Table 6). AMOVA revealed that only 4% of the genetic variation was partitioned among populations, while 96% was found within populations (Table 7).

Table 5 Genetic differentiation coefficient (Fst, below diagonal) and gene flow (Nm, above diagonal) among seven P. trituberculatus populations from the Bohai Sea.
Table 6 Nei's genetic distance (D, below diagonal) and genetic identity (I, above diagonal) among seven P. trituberculatus populations from the Bohai Sea.
Table 7 Analysis of molecular variance (AMOVA) from seven P. trituberculatus populations.
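The pairwise gene-flow values reported above can be reproduced from the corresponding Fst values with the formula Nm = (1 − Fst)/(4Fst) given in the Data analysis section. The short Python sketch below does this for the two population pairs quoted in the text. It also checks the quoted identity values against the standard relation I = exp(−D) between Nei's genetic distance and genetic identity, which is assumed here rather than stated in the text. The function names are hypothetical.

```python
import math


def gene_flow(fst):
    """Gene flow Nm inferred from Fst as (1 - Fst) / (4 * Fst)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("Fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


def identity_from_distance(d):
    """Genetic identity I from Nei's genetic distance D, assuming D = -ln(I)."""
    return math.exp(-d)


if __name__ == "__main__":
    # Fst and D values quoted in the text for the two population pairs.
    pairs = {"PL-QHD": (0.001, 0.025), "HC-DY": (0.060, 0.177)}
    for name, (fst, d) in pairs.items():
        print(name, round(gene_flow(fst), 3), round(identity_from_distance(d), 3))
```

Because Nm depends on 1/Fst, very small Fst values such as 0.001 translate into very large Nm estimates, so small changes in a near-zero Fst change Nm substantially.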

Discussion

Genetic diversity is a crucial criterion for estimating the adaptability of a species to changing environments. A better understanding of the genetic diversity of a species is therefore vital for evaluating its population structure and evolutionary dynamics39. Genetic diversity is susceptible to artificial selection, genetic drift, migration, and breeding systems40 and is normally evaluated by genetic parameters such as polymorphism information content (PIC), Shannon’s diversity index (SI), and heterozygosity (H). Of these, expected heterozygosity (He) reflects the genetic diversity of a species better than observed heterozygosity (Ho)41.

In the current study, the PIC values of the 40 SSR loci ranged from 0.415 to 0.895, indicating that the loci are polymorphic and suitable for assessing genetic diversity in the seven P. trituberculatus populations. Genetic analysis revealed that the genetic diversity of the wild populations (He ≥ 0.692) was higher than that of the cultivated population (He = 0.687), which was consistent with our previous report42. A similar result was found in Eriocheir sinensis43. In general, genetic drift, selection, and inbreeding result in low genetic variability in farmed stocks44. In addition, many SSR loci deviated significantly from HWE (P < 0.05), which might be attributed to null alleles and heterozygote deficiency (Fis > 0). Null alleles might be explained by insufficient sampling45 and variation in microsatellite flanking sequences46. Loss of heterozygosity might be explained by migration, artificial selection, and inbreeding47,48, and is common in aquatic species such as Scylla paramamosain49,50,51, Pinctada margaritifera52, and Hypophthalmichthys nobilis53. Chen et al.54 used ten SSRs to investigate the effect of artificial selection on the genetic structure of two abalone lines and found a loss of heterozygosity (Ho = 0.650 < He = 0.711). These studies indicate the negative impact of heterozygote deficiency on population genetic diversity. Therefore, it is necessary to maintain a high level of genetic diversity in aquatic animals to reduce the loss of heterozygosity and prevent germplasm degradation.

In terms of expected heterozygosity, this study showed lower genetic diversity of P. trituberculatus in the Bohai Sea (He = 0.725) than in the Yellow Sea47 (He = 0.814) and the East China Sea55 (He = 0.916), which was consistent with the results revealed by SNP markers14. It has been shown that, in genetic diversity analyses of aquatic animals, the number of SSR loci should be greater than 20 and the sample size should be greater than 4556. The number of loci and the sample size in this study meet this standard, which supports the reliability of the finding of low genetic diversity of swimming crabs in the Bohai Sea. The Bohai Sea is a semi-enclosed and shallow body of water that limits the dispersal of P. trituberculatus, leading to a decline in genetic diversity47. In an SSR investigation of Exopalaemon carinicauda, Zhang et al.57 found that the Binzhou population in the Bohai Sea had the lowest level of genetic diversity, which suggested that the Bohai Sea might hinder gene flow. Moreover, marine pollution, aquaculture pollution, and reclamation also reduce genetic diversity58. Therefore, it is necessary to carry out long-term genetic monitoring of P. trituberculatus in the Bohai Sea for the full protection and utilization of the germplasm resources of this species.

A stable genetic structure is central to the survival of a species. Its disintegration leads to a reduction or even extinction of the population. Given the economic significance of P. trituberculatus, genetic monitoring of population structure is essential for the development of effective management strategies13. The results of the current study showed that all P. trituberculatus individuals were divided into four subpopulations (Fig. 2). The DY population showed relatively low gene flow with other populations, which might be related to its geographical location. Dongying is located in the relatively closed Laizhou Bay, which restricts gene exchange between P. trituberculatus there and other populations in the Bohai Sea. The phylogenetic tree supported this result. The individuals from the HC population were located in different clades of the phylogenetic tree, which indicated strong genetic mixing between cultured and wild individuals. We speculate that the frequent gene flow between cultured and wild populations resulted from releases and from artificial breeding that used wild-caught crabs as parents. For example, different regions shared juvenile crabs of a full-sibling family from the Huanghua farm for artificial breeding and releases, resulting in gene flow between the HC population and different wild populations. Therefore, reasonable management measures are needed to monitor the impact of releases on wild populations and to maintain the genetic integrity of cultivated populations. However, the phylogenetic tree differed considerably from the PCA results, which might be due to the indistinct genetic differentiation and the close genetic distance between individuals. Additionally, the phylogenetic tree and PCA rely on different calculation methods59,60. Further research is needed into the reasons for this difference.

The genetic differentiation index (Fst), an essential gauge of genetic differentiation among populations, is crucial for understanding genetic relationships. Values of 0 < Fst < 0.05, 0.05 < Fst < 0.15, 0.15 < Fst < 0.25, and Fst > 0.25 indicate negligible, moderate, high, and strong genetic differentiation, respectively61. In this study, the HC and DY populations showed moderate differentiation (Fst = 0.060 > 0.05), which might be related to the geographical location of the two groups. Huanghua and Dongying are located on Bohai Bay and Laizhou Bay, respectively, on either side of the Yellow River estuary. The ecological environment, species distribution, and organic pollution in the Yellow River estuary lead to geographical differences between the two sea areas62,63, which lead to differences in the activity range and habitat preference of P. trituberculatus and ultimately to the comparatively high genetic differentiation between the HC and DY populations. In addition, geographic isolation also leads to lower gene exchange between cultivated and wild populations than among wild populations in the open sea, as shown by the genetic differentiation index: the average Fst between the HC and wild populations was 0.031, whereas that between wild populations was 0.017 (Table 5). Moreover, the average values of gene flow (Nm = 31.289), genetic distance (D = 0.08), and genetic identity (I = 0.924) also demonstrated low genetic differentiation and strong genetic admixture among the seven P. trituberculatus populations.
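The Fst thresholds quoted above can be written as a small lookup, sketched here in Python. The source gives only strict inequalities, so the handling of boundary values is an assumption of this sketch: a value exactly on a threshold is placed in the higher class, and zero or slightly negative estimates are treated as negligible. The function name is hypothetical.

```python
def classify_fst(fst):
    """Map an Fst value to the differentiation class quoted in the text.

    Boundary handling (a value exactly on a threshold goes to the higher
    class, values at or below zero count as negligible) is an assumption.
    """
    if fst > 1.0:
        raise ValueError("Fst cannot exceed 1")
    if fst < 0.05:
        return "negligible"
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "high"
    return "strong"


if __name__ == "__main__":
    # Fst values quoted in the text.
    quoted = {
        "HC vs DY": 0.060,
        "PL vs QHD": 0.001,
        "mean, HC vs wild": 0.031,
        "mean, among wild": 0.017,
    }
    for label, value in quoted.items():
        print(label, value, classify_fst(value))
```

Applied to the values quoted in the text, only the HC and DY pair (Fst = 0.060) falls in the moderate class; the PL and QHD pair and both averages fall in the negligible class.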

Conclusions

In summary, this study provides useful insights into the population structure of P. trituberculatus throughout the coastal areas of the Bohai Sea. Forty microsatellite loci revealed a low level of genetic diversity in the seven P. trituberculatus populations in the Bohai Sea. A low level of genetic differentiation and frequent gene flow among these seven populations were found, suggesting high genetic connectivity. The structure analysis identified four subpopulations, but the clustering pattern was not related to geographical location. To increase the genetic diversity of P. trituberculatus, practical and effective protective measures should be taken to prevent the degeneration of germplasm resources. This study also provides a theoretical basis for selecting parents from different geographical populations in artificial breeding programs.