GWAS Enhances Genomic Prediction Accuracy of Caviar Yield, Caviar Color and Body Weight Traits in Sturgeons Using Whole-Genome Sequencing Data

Caviar yield, caviar color, and body weight are crucial economic traits in sturgeon breeding. Understanding the molecular mechanisms behind these traits is essential for their genetic improvement. In this study, we performed whole-genome sequencing on 673 Russian sturgeons, renowned for their high-quality caviar. With an average sequencing depth of 13.69×, we obtained approximately 10.41 million high-quality single nucleotide polymorphisms (SNPs). Using a genome-wide association study (GWAS) with a single-marker regression model, we identified SNPs and genes associated with these traits. Our findings revealed several candidate genes for each trait: caviar yield: TFAP2A, RPS6KA3, CRB3, TUBB, H2AFX, morc3, BAG1, RANBP2, PLA2G1B, and NYAP1; caviar color: NFX1, OTULIN, SRFBP1, PLEK, INHBA, and NARS; body weight: ACVR1, HTR4, fmnl2, INSIG2, GPD2, ACVR1C, TANC1, KCNH7, SLC16A13, XKR4, GALR2, RPL39, ACVR2A, ADCY10, and ZEB2. Additionally, using the genomic feature BLUP (GFBLUP) method, which combines linkage disequilibrium (LD) pruning markers with GWAS prior information, we improved genomic prediction accuracy by 2%, 1.9%, and 3.1% for caviar yield, caviar color, and body weight traits, respectively, compared to the GBLUP method. In conclusion, this study enhances our understanding of the genetic mechanisms underlying caviar yield, caviar color, and body weight traits in sturgeons, providing opportunities for genetic improvement of these traits through genomic selection.


Introduction
Sturgeons, with 27 species distributed across the Northern Hemisphere, are ancient fish that represent a remarkable evolutionary relic, often referred to as "living fossils" [1]. All of these species are listed as Appendix II species under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). Sturgeon breeding programs focus on three key traits: caviar yield, caviar color, and body weight. Caviar yield correlates directly with caviar and fry production, so enterprises have a strong demand for sturgeon breeding populations with high caviar yields. Among caviar products, golden caviar is more costly than caviar of other colors, possibly due to the association of gold with luxury and quality. Additionally, sturgeon is highly valued for its meat quality, and cultivating a fast-growing breeding population can expedite sturgeon production for the meat market. Therefore, caviar yield, caviar color, and body weight should be the primary breeding objectives of sturgeon cultivation in China. Because sturgeons reach sexual maturity late, typically at about 6-8 years, the breeding cycle is prolonged, and traditional pedigree-based methods suffer from lengthy generation intervals and low efficiency. Molecular marker-based breeding, especially genomic selection [2], therefore emerges as an effective approach to expedite genetic progress in sturgeon breeding. However, no reports currently exist on comprehensive genome-wide scanning, at the population scale, for key molecular markers associated with all three traits.
With the rapid development of whole-genome sequencing technology and the reduction of costs, GWAS has gradually become a mainstream strategy for genetic analysis and identification of important candidate genes related to economic traits in livestock [3], plants [4], and aquatic animals [5]. In aquaculture, GWAS has been used for genetic dissection of meat quality in common carp [6], Atlantic salmon [7], and large yellow croaker [8]; growth in catfish [9,10]; and disease resistance in Atlantic salmon [11] and large yellow croaker [12,13]. However, there have been no reports on GWAS for traits such as caviar yield, caviar color, and body weight in sturgeons.
The concept of genomic selection (GS) was initially proposed by Meuwissen et al. [2] in 2001. This method involves deriving genomic estimated breeding values (GEBV) from high-density markers across the entire genome, premised on the assumption that at least one SNP is in LD with each quantitative trait locus (QTL) affecting the target trait. In recent years, a growing body of research on GS in aquaculture animals has positioned it as a cutting-edge technology in aquaculture breeding, highlighting its potential to expedite breeding cycles and reduce associated costs. Enhancing the accuracy of genomic prediction is a prevalent challenge in GS, and incorporating prior information from GWAS to enhance this accuracy has been reported in rainbow trout [14], dairy cattle [15], and pigs [16]. However, many studies have found that incorporating GWAS prior information does not improve the accuracy of genomic prediction [17-19]. Therefore, this study implemented a strategy that combines LD-pruned markers and GWAS prior information to improve the accuracy of genomic prediction for caviar yield, caviar color, and body weight traits in sturgeons.

Whole-Genome Sequencing and SNP Calling
Whole-genome sequencing was conducted for 673 fish, yielding a total of 67.84 billion reads with an average of 0.10 billion reads per individual. Among these, 92.60% of reads successfully aligned to the reference genome, resulting in an average sequencing depth of 13.69× for the 673 individuals (ranging from 5.06× to 25.85×). After stringent quality control, a total of 10.41 million SNPs were identified. Figure 1 illustrates the histogram of SNP distribution and SNP density plots across all chromosomes. The number of high-quality SNPs per chromosome varied from 385 (Chr60) to 794,151 (Chr1) (Figure 1B), with an average density of 5345.31 SNPs/Mb (Figure 1A).

Population Structure Analysis
The first three principal components (Figure 1C) show that individuals in the Russian sturgeon population have similar genetic backgrounds, which is beneficial for conducting GWAS. The pattern of LD, as depicted in Figure 1D, indicates that the average genome-wide LD ($r^2$) between adjacent pairs of markers was 0.049 and that LD decayed to $r^2 = 0.05$ within 20 kb, suggesting that candidate genes can be effectively mapped in GWAS results by searching the region 20 kb upstream and downstream of significant SNPs.



Phenotype Statistics and Heritability Estimation
Descriptive statistics for the traits analyzed in the Russian sturgeon population are shown in Table 1. The mean (standard deviation) caviar yield, caviar color, and body weight were 0.19 (0.057), 2.453 (0.653), and 19.933 (4.029), respectively. Coefficients of variation were high for caviar yield, caviar color, and body weight: 30.00%, 26.62%, and 20.21%, respectively. In addition, as shown in Table 2, SNP-based heritability was estimated through genome-wide association analysis, with heritabilities for caviar yield, caviar color, and body weight being 0.497, 0.614, and 0.627, respectively, indicating moderate to high levels of heritability for each trait, which is advantageous for selective breeding programs.
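The coefficients of variation quoted above follow directly from the reported means and standard deviations (CV = SD / mean). A minimal sketch of that arithmetic, using only the values given in the text (the dictionary keys and function name are illustrative):

```python
# Coefficient of variation (CV) = standard deviation / mean, as a percentage.
# Means and SDs are the values reported in the text for Table 1.
traits = {
    "caviar_yield": (0.19, 0.057),
    "caviar_color": (2.453, 0.653),
    "body_weight": (19.933, 4.029),
}


def coefficient_of_variation(mean, sd):
    """Return the CV in percent."""
    return 100.0 * sd / mean


cv = {name: round(coefficient_of_variation(m, s), 2) for name, (m, s) in traits.items()}
print(cv)
```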

Identification of Candidate Genes
GWAS based on whole-genome sequencing data was used to detect candidate functional genes. Based on the functional annotation analysis, candidate genes were searched within 20 kb upstream and downstream of each significant or suggestive SNP. As shown in Table 3, 29 genes were found for caviar yield, of which 10 were potential candidate genes. For caviar color (Table 4), 22 genes were detected, and 6 had functions related to caviar color. For body weight (Table 5), 77 genes were detected, of which 15 were potential candidate genes.

Genomic Prediction Performance
To assess the effect of incorporating GWAS results on genomic prediction, the accuracy of genomic prediction for caviar yield, caviar color, and body weight traits was evaluated using the GBLUP, GLDBLUP, and GFBLUP methods, as shown in Figure 3. GBLUP and GLDBLUP produced similar predictive accuracy, demonstrating that reducing SNP density to 50 K by LD pruning can yield prediction accuracy comparable to utilizing all markers. Additionally, GFBLUP produced the highest predictive accuracy in all cases, improving by 2%, 1.9%, and 3.1% over GBLUP for caviar yield, caviar color, and body weight, respectively. For prediction bias, as shown in Figure 3, GFBLUP produced similar or lower prediction bias compared to the GBLUP and GLDBLUP methods; e.g., for the body weight trait, the prediction biases of GFBLUP, GBLUP, and GLDBLUP were 0.269, 0.325, and 0.323, respectively. For MSE, GLDBLUP and GFBLUP produced lower values than GBLUP for caviar yield, while for the other two traits, all three methods produced similar MSE. Additionally, all three methods produced similar MAE in all cases.


Discussion
In this study, we conducted GWAS and identified several candidate genes related to caviar yield, caviar color, and body weight in Russian sturgeon. Furthermore, to verify the reliability of the GWAS results, we evaluated the accuracy of genomic prediction for the three traits by combining LD-pruned markers and GWAS prior information. The results showed that combining LD-pruned markers and GWAS prior information could improve the accuracy of genomic prediction for caviar yield, caviar color, and body weight traits in sturgeons.

Potential Candidate Genes for Caviar Yield
For caviar yield, a number of candidate genes located within 20 kb of genome-wide significant and suggestive SNPs were identified. Among them, the TFAP2A gene plays a vital role in mouse oocyte maturation [20]. Overexpression of TFAP2A may upregulate p300, increasing levels of histone acetylation and lactylation, which in turn impede spindle assembly and chromosome alignment, ultimately hindering nuclear meiotic division in mouse oocytes [20]. Niu et al. [21] reported that the RPS6KA3 gene was associated with reproduction pathways in Xiang pigs. The presence of CRB3 in many organs and its distribution pattern during mouse embryonic development suggest that CRB3 plays a significant role in establishing and maintaining polarity in mouse embryos [22]. For the TUBB gene, Zhao et al. [23] reported that TUBB regulates spindle assembly and chromosome dynamics during mouse oocyte maturation. A study showed a role for the H2AFX gene in germ cell loss: histone H2AFX links meiotic chromosome asynapsis to prophase I oocyte loss in mammals [24]. The morc3 gene was related to the regulation of animal reproduction, and the deletion of morc3 reduced the pregnancy rate of male mice and led to low fertility [25]. The BAG1 gene was found to have potential efficacy in ameliorating oocyte maturation [26]. A study showed that RANBP2 acts as an inhibitor of premature maturation-promoting factor activation and the untimely degradation of securin in oocyte maturation, thereby preserving the accurate timing of the resumption of maturation and meiotic progression in mouse oocytes [27]. The PLA2G1B gene was suggested to be a newly discovered component affecting the efficacy of horse IVM/IVF [28]. A study observed that NYAP1 plays a key role in ovarian development by regulating target genes related to the oxytocin signaling pathway, and its differential expression level in Han sheep may contribute to improving fecundity [29].

Potential Candidate Genes for Caviar Color
For caviar color, within 20 kb of the genome-wide significant and suggestive SNPs, only two genes, SRFBP1 and INHBA, have been reported to be directly associated with pigment formation. The SRFBP1 gene was reported to be associated with skin pigmentation in an Ogye × White Leghorn F2 chicken population [30]. The INHBA gene strongly controls skin pigmentation and also influences serum vitamin D levels in African Americans [31]. Surprisingly, the functions of the other genes identified are directly related to immunity rather than pigment formation. Among them, NFX1 was found to encode a repressor of gene expression, suggesting that NFX1 limits the immune response following infection [32]. Fiil et al. [33] reported that OTULIN restricts Met1-Ub formation after immune receptor stimulation to prevent unwarranted proinflammatory signaling. The PLEK gene was related to the immune system, suggesting an inactive immune regulation [34]. The NARS gene plays a role in oxidative stress/hypoxia and endoplasmic reticulum stress/unfolded protein response, and its mutation leads to melanoma susceptibility [35]. This suggests that the immune response, as a protective mechanism, may indirectly lead to the formation of pigment. Similar results have been reported in a large number of studies; e.g., Linher-Melville and Li [36] demonstrated that melanocytes could swallow exogenous beads and then recruit immune cells to protect from injury in zebrafish (Danio rerio). Similarly, the INHBA gene participates in biological processes related to pigmentation [31] and also in biological processes significantly related to hematopoiesis and the immune system [34].

Potential Candidate Genes for Body Weight
For body weight, 15 potential candidate genes were identified within 20 kb of the genome-wide significant and suggestive SNPs. The ACVR1 gene was identified in multiple regions and belongs to the transforming growth factor (TGF)-β superfamily, which can inhibit muscle differentiation [37]. Zhao et al. [38] reported that the ACVR1 gene might contribute to later myogenesis and more muscle fibers in Landrace (LR, lean) than Lantang (LT, obese) pig breeds. A study detected a synonymous mutation, g.101220 C > T, located on the fifth intron of the ovine HTR4 gene, and association analysis showed that this mutation was significantly associated with growth traits in sheep [39]. The fmnl2 gene is a candidate gene responsible for facioscapulohumeral muscular dystrophy, and it is critical for muscle development [40]. Polymorphism of the INSIG2 gene is associated with increased subcutaneous fat in women and poor resistance training response in men [41]. The GPD2 gene could catalyze the esterification of fatty acids to triglycerides [42]. ACVR1C is one of the type I transforming growth factor-β (TGF-β) receptors and can be used as an adipocyte developmental marker [43]. The TANC1 gene is essential for mammalian myoblast fusion [44]. Xie et al. [45] reported that KCNH7 is a candidate gene related to growth in the Licha Black Pig. A study showed that loss of the SLC16A13 gene increases mitochondrial respiration in the liver, leading to reduced hepatic lipid accumulation and increased hepatic insulin sensitivity in high-fat diet-fed SLC16A13 knockout mice [46]. The XKR4 gene is related to feed intake and average daily gain of cattle [47]. SNPs near the XKR4 gene are also associated with subcutaneous fat thickness, and the gene has been considered a candidate for carcass traits [48]. The GALR2 gene is a regulator of insulin resistance, and activation of GALR2 represents a promising strategy against obesity-induced insulin resistance [49]. RPL39 is a crucial candidate gene associated with growth in farm animals [50]. Goh et al. [51] reported that ACVR2A directly and negatively regulates bone mass in osteoblasts through activin receptor signaling. Dong et al. [52] reported that the ADCY10 gene could be one of the key regulating switches for energy metabolism in the Yili goose. The ZEB2 gene was also reported to be associated with body weight in Hu sheep [53].

Genomic Prediction Incorporating GWAS Prior Information
Whole-genome sequencing data include most causal mutations that affect traits of interest, making genomic prediction less limited by the LD between SNPs and causal mutations. Simulation studies have shown that whole-genome sequencing data can improve the accuracy of genomic prediction within populations by 40% [54]. However, a substantial amount of empirical data suggests that whole-genome sequencing does not always provide greater prediction accuracy than SNP chips [17]. The primary reason is the presence of a large number of noisy loci in the genome, which adversely affect the accuracy of genomic prediction. Therefore, some studies have reported that LD pruning of whole-genome sequencing data can reduce the number of noisy loci and improve the accuracy of genomic prediction [17,55,56]. However, our previous research has shown that using LD pruning to reduce SNP density to different levels does not necessarily improve the accuracy of genomic prediction [57]. One possible reason is that while noisy loci are removed, functional loci may also be inadvertently eliminated, so prediction accuracy does not improve. In addition, there have been reports on using GWAS priors to improve the accuracy of genomic prediction [14-16]; e.g., Yoshida and Yáñez [14] reported that the accuracy of genomic prediction can be improved using preselected variants from GWAS for growth under chronic thermal stress in rainbow trout. However, many studies have reported that utilizing prior information from GWAS does not improve the accuracy of genomic prediction [17-19]. This may be because, although functional sites are included, noisy sites have not been effectively removed, so prediction accuracy does not improve. Therefore, this study combined the advantages of both methods. First, noisy loci were removed by performing LD pruning on the whole-genome sequencing data. Then, functional loci were screened using GWAS based on whole-genome sequencing and combined with the LD-pruned loci. The results showed that all three traits (caviar yield, caviar color, and body weight) achieved improved accuracy in genomic prediction, further supporting the reliability of the GWAS results in this study. This study provides a new approach for enhancing the accuracy of genomic prediction based on whole-genome sequencing data.

Population and Phenotyping Measurement
The Russian sturgeons used in this study were from Hangzhou Qiandaohu Xunlong Sci-tech Co., Ltd. (Hangzhou, China). Details regarding fish rearing and phenotyping procedures have been provided in our previous study [58]. In 2012, 6 dams and 26 sires were artificially inseminated to create 26 full-sib families. At the age of 8, the developmental status of fish roe was assessed using in vitro puncture. Fish with an average roe diameter exceeding 2.8 mm were individually tagged with passive integrated transponder (PIT) electronic markers, and a fin sample was collected and preserved in absolute ethanol. Subsequently, these tagged fish were processed for caviar production at Hangzhou Qiandaohu Xunlong Sci-tech Co., Ltd. The body weight (BW), total caviar weight (CW), and caviar color (CC) of each fish were recorded. Caviar yield (CY) was calculated relative to the female body weight using the formula CY = CW/BW. A subjective color score for the caviar was assigned based on color depth, ranging from 1 to 4, with gold scored as 4, light as 3, middle as 2, and black as 1. All caviar color scores were recorded by the same operator, who used a reference image as a guide for classification. In total, 673 fish with phenotype records were selected for subsequent analysis. The descriptive statistics of phenotypes are presented in Table 1.
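The two derived phenotypes described above can be written down in a few lines. The sketch below only restates the formula CY = CW/BW and the 1 to 4 color scale from the text; the record values and the names `COLOR_SCORE` and `caviar_yield` are invented for illustration.

```python
# Color scores as described in the text: gold = 4, light = 3, middle = 2, black = 1.
COLOR_SCORE = {"gold": 4, "light": 3, "middle": 2, "black": 1}


def caviar_yield(caviar_weight, body_weight):
    """CY = CW / BW; both weights must be in the same unit."""
    if body_weight <= 0:
        raise ValueError("body weight must be positive")
    return caviar_weight / body_weight


# Hypothetical fish record (values invented for illustration only).
record = {"bw": 20.0, "cw": 3.8, "color": "gold"}
cy = caviar_yield(record["cw"], record["bw"])
cc = COLOR_SCORE[record["color"]]
print(round(cy, 3), cc)
```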

Genotype Imputation and Population Structure Analysis
Imputation of missing genotypes in the whole-genome sequencing data was performed with Beagle (version 4.1) [62]. Variants with a minor allele frequency (MAF) lower than 0.05 and variants deviating from Hardy-Weinberg equilibrium (HWE) ($p < 10^{-7}$) were excluded using the PLINK software (version 1.90) [63]. Furthermore, because of the high level of LD in the genome, most SNPs are redundant; LD pruning was performed using PLINK [63] to remove variants in high LD ($r^2 > 0.9$). After LD pruning, 10,409,793 SNPs were retained for the whole-genome sequencing data. Principal component analysis (PCA) was performed on the genomic relationship matrix using GCTA software (version 1.25.3) [64]. This resulted in a matrix of eigenvectors in descending order that represented principal components (PCs), where PC1 had the largest eigenvalue. The overall structuring of genetic variation was visualized in a scatterplot of the top few PCs. LD between a pair of SNPs was measured as $r^2$, and LD decay analysis based on $r^2$ was conducted using PopLDdecay (version 3.42) [65] to assess LD patterns.
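To make the MAF filter and the PCA step concrete, the following is a small NumPy sketch on simulated genotypes. It is not the PLINK/GCTA pipeline used in the study: the genomic relationship matrix is built with the commonly used first VanRaden formulation, and the function names, simulated data, and matrix sizes are assumptions for illustration.

```python
import numpy as np


def maf_filter(genotypes, threshold=0.05):
    """Drop SNP columns whose minor allele frequency is below the threshold."""
    p = genotypes.mean(axis=0) / 2.0          # allele frequency per SNP
    maf = np.minimum(p, 1.0 - p)
    return genotypes[:, maf >= threshold]


def vanraden_grm(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes (individuals x SNPs)."""
    p = genotypes.mean(axis=0) / 2.0
    w = genotypes - 2.0 * p                   # center each SNP by twice its allele frequency
    return w @ w.T / (2.0 * np.sum(p * (1.0 - p)))


# Simulated genotypes, for illustration only.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.01, 0.5, size=500)
geno = rng.binomial(2, freqs, size=(40, 500)).astype(float)

geno = maf_filter(geno, threshold=0.05)
G = vanraden_grm(geno)

# PCA on G: eigenvectors sorted by descending eigenvalue, so PC1 comes first.
eigvals, eigvecs = np.linalg.eigh(G)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
pcs = eigvecs[:, :3]                          # top three PCs for a scatterplot
```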

Genome-Wide Association Study
A single-marker regression model was implemented to detect the association of SNPs with caviar yield, caviar color, and body weight traits. The model includes a random polygenic effect to account for shared genetic effects of related individuals and to control population stratification. The statistical model is described below:

$$y = 1\mu + xb + Zg + e$$

in which $y$ is the vector of phenotypes; $1$ is a vector of ones; $\mu$ is the overall mean; $b$ is the average effect of the gene substitution of a particular SNP; $x$ is a vector of the SNP genotype (coded as 0, 1, or 2); $g$ is a vector of random polygenic effects with a normal distribution $g \sim N(0, G\sigma_a^2)$, in which $\sigma_a^2$ is the polygenic variance and $G$ is the genomic relationship matrix, constructed using all markers following VanRaden [66]; $Z$ is an incidence matrix relating phenotypes to the corresponding random polygenic effects; and $e$ is a vector of residual effects with a normal distribution $e \sim N(0, I\sigma_e^2)$, in which $\sigma_e^2$ is the residual variance. The software GCTA (version 1.25.3) [64] was used to fit the model.
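As a rough illustration of how a single SNP is tested under this kind of mixed model, the sketch below fits the fixed effects by generalized least squares with the phenotypic covariance $V = G\sigma_a^2 + I\sigma_e^2$ and computes a Wald test for $b$. It is a simplification, not the GCTA implementation: it assumes one record per individual ($Z = I$), treats the variance components as known rather than estimating them by REML, and all names and simulated data are hypothetical.

```python
import math

import numpy as np


def single_marker_test(y, x, G, sigma2_a, sigma2_e):
    """GLS estimate of the SNP effect b and a two-sided Wald p-value."""
    n = len(y)
    V = sigma2_a * G + sigma2_e * np.eye(n)   # phenotypic covariance, Z = I assumed
    X = np.column_stack([np.ones(n), x])      # intercept (mu) and SNP genotype
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    XtVinvX = X.T @ Vinv_X
    beta = np.linalg.solve(XtVinvX, X.T @ Vinv_y)
    cov_beta = np.linalg.inv(XtVinvX)
    b = beta[1]
    se = math.sqrt(cov_beta[1, 1])
    p_value = math.erfc(abs(b / se) / math.sqrt(2.0))  # normal approximation
    return b, se, p_value


# Simulated example, for illustration only.
rng = np.random.default_rng(1)
n = 50
markers = rng.binomial(2, 0.3, size=(n, 200)).astype(float)
p = markers.mean(axis=0) / 2.0
w = markers - 2.0 * p
G = w @ w.T / (2.0 * np.sum(p * (1.0 - p)))

x = np.tile([0, 1, 2, 1, 0], 10)              # genotype of the tested SNP
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)
b_hat, se_hat, p_val = single_marker_test(y, x, G, sigma2_a=0.5, sigma2_e=1.0)
print(b_hat, se_hat, p_val)
```

With $\sigma_a^2 = 0$ the estimate reduces to ordinary least squares, which is a convenient sanity check for the function.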

Functional Genomic Analysis
Functional annotation of all coding genes of Acipenser ruthenus was performed using eggNOG-mapper (version 2) [69], which offers higher accuracy than traditional sequence similarity search methods such as BLAST, as it avoids transferring annotations from collateral homologs. Genes located in the region between 20 kb upstream and 20 kb downstream of the significant and suggestive SNPs were retrieved for data mining.
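Retrieving genes within 20 kb of a SNP amounts to an interval-overlap query. A minimal sketch follows, assuming gene coordinates are available as simple records; the `Gene` class, the gene names, and the coordinates are hypothetical and not taken from the annotation used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # bp
    end: int     # bp


def genes_near_snp(genes, chrom, pos, flank=20_000):
    """Return names of genes overlapping [pos - flank, pos + flank] on chrom."""
    lo, hi = pos - flank, pos + flank
    return [g.name for g in genes
            if g.chrom == chrom and g.start <= hi and g.end >= lo]


# Hypothetical annotation records, for illustration only.
genes = [
    Gene("geneA", "Chr1", 100_000, 105_000),
    Gene("geneB", "Chr1", 130_000, 140_000),
    Gene("geneC", "Chr2", 100_000, 105_000),
    Gene("geneD", "Chr1", 200_000, 210_000),
]
hits = genes_near_snp(genes, "Chr1", 115_000)
print(hits)
```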

Genomic Prediction Incorporating GWAS Prior Information
In order to evaluate the genomic prediction effect of caviar yield, caviar color, and body weight traits, the genomic best linear unbiased prediction (GBLUP) based on the genomic relationship matrix and genomic feature BLUP (GFBLUP) including GWAS prior information were implemented to predict GEBV for each genotyped individual.

GBLUP
The GBLUP [66] model was used to predict the GEBV of all genotyped individuals:

$$y = 1\mu + Zg + e$$

where $y$ is the vector of phenotypes, $\mu$ is the overall mean, $1$ is a vector of ones, and $g$ is the vector of genomic breeding values, following a normal distribution $N(0, G\sigma_g^2)$, where $\sigma_g^2$ is the additive genetic variance and $G$ is the marker-based genomic relationship matrix [66]. $Z$ is an incidence matrix linking $g$ to $y$, and $e$ is the vector of random errors, following a normal distribution $N(0, I\sigma_e^2)$, where $\sigma_e^2$ is the residual variance. For GBLUP, $G$ was constructed using whole-genome sequencing markers. According to our previous study [57], reducing SNP density to 50 K through LD pruning yielded similar prediction accuracy to using all markers. The GBLUP model with $G$ built from these 50 K LD-pruned markers is termed GLDBLUP.
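For readers who want to see the GBLUP prediction step spelled out, the sketch below computes BLUPs of genomic breeding values for all individuals from the phenotypes of a reference subset. It assumes one record per phenotyped individual and variance components that are already known (in practice they are estimated, typically by REML); the function name and the toy relationship matrix are illustrative, not the software used in the study.

```python
import numpy as np


def gblup_predict(G, y_train, train_idx, sigma2_g, sigma2_e):
    """Return (mu_hat, GEBV for every individual in G)."""
    train_idx = np.asarray(train_idx)
    y_train = np.asarray(y_train, dtype=float)
    n_t = len(train_idx)
    G_tt = G[np.ix_(train_idx, train_idx)]
    V = sigma2_g * G_tt + sigma2_e * np.eye(n_t)      # covariance of training phenotypes
    ones = np.ones(n_t)
    # Generalized least squares estimate of the overall mean.
    mu_hat = (ones @ np.linalg.solve(V, y_train)) / (ones @ np.linalg.solve(V, ones))
    # BLUP: g_hat = sigma2_g * G[:, train] * V^{-1} (y - 1 mu_hat)
    alpha = np.linalg.solve(V, y_train - mu_hat)
    gebv = sigma2_g * G[:, train_idx] @ alpha
    return mu_hat, gebv


# Toy relationship matrix: two pairs of related individuals (illustrative values).
G_toy = np.array([
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])
# Individual 3 has no phenotype; its GEBV comes only through its relative (individual 2).
mu_hat, gebv = gblup_predict(G_toy, [1.0, 2.0, 3.0], [0, 1, 2], sigma2_g=1.0, sigma2_e=1.0)
print(mu_hat, gebv)
```

Under this sketch, GLDBLUP would differ from GBLUP only in which markers are used to build `G`.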

GFBLUP
The GFBLUP [70] model, which uses prior information about genomic features, is based on a linear mixed model with two random genomic effects:

$$y = 1\mu + Zf + Zr + e$$

where $y$, $1$, $\mu$, and $e$ are the same as in the GBLUP model; $f$ is the vector of genomic values captured by genetic markers associated with a genomic feature of interest, following a normal distribution $N(0, G_f\sigma_f^2)$; $r$ is the vector of genomic effects captured by the remaining set of genetic markers, following a normal distribution $N(0, G_r\sigma_r^2)$; and $Z$ is an incidence matrix that links $f$ and $r$ to $y$. Matrices $G_f$ and $G_r$ were constructed similarly to $G$, with $G_f$ based on the significant genetic markers determined at a false discovery rate (FDR) threshold of 0.05. $G_r$ uses the 50 K SNPs obtained through LD pruning, excluding the markers used in $G_f$. It should be noted that the GWAS was performed using only the reference data.
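The marker partition behind $G_f$ and $G_r$ can be sketched as follows. The text states only that markers were selected by FDR at 0.05; the Benjamini-Hochberg procedure used here is an assumption, as are the function name, the p-values, and the marker indices.

```python
import numpy as np


def benjamini_hochberg(pvalues, alpha=0.05):
    """Boolean mask of p-values declared significant at FDR level alpha."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    significant = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])     # largest rank meeting its threshold
        significant[order[:k + 1]] = True     # all smaller p-values are also kept
    return significant


# Hypothetical GWAS p-values for five markers (indices 0..4).
pvals = np.array([0.6, 0.001, 0.041, 0.008, 0.039])
mask = benjamini_hochberg(pvals, alpha=0.05)

feature_idx = np.nonzero(mask)[0]             # markers used to build G_f
pruned_idx = np.array([0, 1, 2, 4])           # hypothetical LD-pruned marker set
remaining_idx = np.setdiff1d(pruned_idx, feature_idx)  # markers used to build G_r
print(feature_idx, remaining_idx)
```

In a cross-validation setting, the p-values would come from a GWAS run on the reference folds only, as the text notes.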
To assess prediction efficiency, genomic prediction was carried out through 10-fold cross-validation (CV). The genotyped individuals were randomly split into ten folds; phenotypes from one fold (validation population) were removed from the dataset, and the remaining folds (reference population) were used to predict the GEBV in the validation population. This 10-fold CV was replicated 20 times, resulting in 20 average accuracies of genomic prediction. The validation population was the same in each replicate of 10-fold CV for all three methods: GBLUP, GLDBLUP, and GFBLUP. Prediction accuracy was calculated as Pearson's correlation between phenotypic values $y$ and GEBV for the validation individuals, i.e., $r(y, \mathrm{GEBV})$. The regression coefficient of $y$ on GEBV was used to evaluate the bias of predictions, and the bias was expressed as the absolute value of one minus the regression coefficient, i.e., $|1 - b(y, \mathrm{GEBV})|$. In addition, mean squared error (MSE) and mean absolute error (MAE) were used to compare model performance. MSE (MAE) was the average squared (absolute) difference between $y$ and GEBV, each centered on zero.
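The evaluation metrics can be expressed compactly as below. This is one reading of the definitions in the text: "centered on zero" is taken to mean that both $y$ and GEBV are mean-centered within the validation set before the errors are computed, which is an assumption. The function names are illustrative.

```python
import numpy as np


def kfold_indices(n, k, rng):
    """Randomly split indices 0..n-1 into k folds of near-equal size."""
    return np.array_split(rng.permutation(n), k)


def prediction_metrics(y, gebv):
    """Accuracy, bias, MSE and MAE for one validation set."""
    y = np.asarray(y, dtype=float)
    gebv = np.asarray(gebv, dtype=float)
    accuracy = np.corrcoef(y, gebv)[0, 1]                  # r(y, GEBV)
    b = np.cov(y, gebv)[0, 1] / np.var(gebv, ddof=1)       # regression of y on GEBV
    bias = abs(1.0 - b)
    diff = (y - y.mean()) - (gebv - gebv.mean())           # both centered on zero
    return {
        "accuracy": accuracy,
        "bias": bias,
        "mse": np.mean(diff ** 2),
        "mae": np.mean(np.abs(diff)),
    }


# Fold sizes for the 673 fish in the study with 10-fold CV.
folds = kfold_indices(673, 10, np.random.default_rng(0))
print(sorted(len(f) for f in folds))

# Tiny worked example with invented values: y is exactly twice the GEBV.
metrics = prediction_metrics([1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0])
print(metrics)
```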

Conclusions
In this study, GWAS based on whole-genome sequencing was performed for caviar yield, caviar color, and body weight in Russian sturgeon. Combining the results of GWAS and bioinformatics annotation analysis, 10 genes were identified as potential candidate genes associated with the caviar yield trait; 6 genes were considered potential candidate genes related to the caviar color trait; and 15 genes were detected as potential candidate genes related to the body weight trait. In addition, combining LD-pruned markers and GWAS prior information could improve the accuracy of genomic prediction for caviar yield, caviar color, and body weight traits in sturgeons. These findings provide valuable insights into the genetic mechanisms underlying these important traits and demonstrate the potential for their genetic improvement through advanced genomic selection methods. Future studies could further enhance this understanding by integrating advanced microscopical techniques, such as transmission electron microscopy, to provide a comprehensive morpho-functional analysis of Russian sturgeons.

Figure 1. SNP distribution and population structure of Russian sturgeon. (A) Distribution of SNPs in 10 Mb windows across the genome; (B) Number of SNPs on each chromosome; (C) Principal component analysis for the first to the third principal components (PCs); (D) Genome-wide LD decay.


Figure 2. Manhattan and QQ plots of genome-wide association studies for caviar yield, caviar color, and body weight in the Russian sturgeon population. (A,B) Caviar yield; (C,D) Caviar color; (E,F) Body weight. In the Manhattan plots, the dashed and solid lines indicate the genome-wide and suggestive significance thresholds, respectively. Different colors represent individual chromosomes; each dot corresponds to a SNP, and its color indicates its chromosomal location.


Table 1. The descriptive statistics of caviar yield, caviar color, and body weight.

Table 2. Estimated variance components and heritability for caviar yield, caviar color, and body weight.

Table 3. Genome-wide significant and suggestive SNPs associated with the caviar yield trait using whole-genome sequencing data.
Chr, chromosome. SNP_R, range of the significant and suggestive SNP region. SNP_N, number of significant and suggestive SNPs. Position_Top, the position (bp) of the top SNP in the significant and suggestive SNP region. p value_Top, p value of the top SNP. The bolded text shows the potential candidate genes associated with caviar yield, identified through functional annotation with eggNOG-mapper.

Table 4. Genome-wide significant and suggestive SNPs associated with the caviar color trait using whole-genome sequencing data.
Chr, chromosome. SNP_R, range of the significant and suggestive SNP region. SNP_N, number of significant and suggestive SNPs. Position_Top, the position (bp) of the top SNP in the significant and suggestive SNP region. p value_Top, p value of the top SNP. The bolded text shows the potential candidate genes associated with caviar color, identified through functional annotation with eggNOG-mapper.

Table 5. Genome-wide significant and suggestive SNPs associated with the body weight trait using whole-genome sequencing data.
Chr, chromosome. SNP_R, range of the significant and suggestive SNP region. SNP_N, number of significant and suggestive SNPs. Position_Top, the position (bp) of the top SNP in the significant and suggestive SNP region. p value_Top, p value of the top SNP. The bolded text shows the potential candidate genes associated with body weight, identified through functional annotation with eggNOG-mapper.