A high-density SNP-based genetic map and several economic traits-related loci in Pelteobagrus vachelli

A high-density genetic linkage map is essential for QTL fine mapping, comparative genome analysis, identification of candidate genes and marker-assisted selection in aquaculture species. Pelteobagrus vachelli is a very popular commercial species in Asia. However, some specific characters hindered achievement of the traditional selective breeding based on phenotypes, such as lack of large-scale genomic resource and short of markers tightly associated with growth, sex determination and hypoxia tolerance related traits. By making use of 5059 ddRAD markers in P. vachelli, a high-resolution genetic linkage map was successfully constructed. The map’ length was 4047.01 cM by using an interval of 0.11 cm, which is an average marker standard. Comparative genome mapping revealed that a high proportion (83.2%) of markers with a one-to-one correspondence were observed between P. vachelli and P. fulvidraco. Based on the genetic map, 8 significant genome-wide QTLs for 4 weight, 1 body proportion, 2 sex determination, and 1 hypoxia tolerance related traits were detected on 4 LGs. Some SNPs from these significant genome-wide QTLs were observably associated with these phenotypic traits in other individuals by Kompetitive Allele Specific PCR. In addition, two candidate genes for weight, Sipa1 and HSD11B2, were differentially expressed between fast-, medium- and slow-growing P. vachelli. Sema7a, associated with hypoxia tolerance, was induced after hypoxia exposure and reoxygenation. We mapped a set of suggestive and significant QTLs as well as candidate genes for 12 growth, 1 sex determination and 1 hypoxia tolerance related traits based on a high-density genetic linkage map by making use of SNP markers for P. fulvidraco. Our results have offered a valuable method about the much more efficient production of all-male, fast growth and hypoxia tolerance P. vachelli for the aquaculture industry.


Background
High-resolution linkage maps have become indispensable to many genetic studies, such as fine-scale quantitative trait locus (QTL) mapping, mapping molecular markerassisted selection (MAS), genome scaffolding and assembly (GSA) and comparative genomic analysis (CGA). For the construction of high-density genetic linkage maps, single-nucleotide polymorphism (SNP) markers representing the most abundant sources of variation in the genome are increasingly utilized [1]. Newly developed genotyping methods, such as reduced-representation sequencing, SNP arrays, and re-sequencing, have allowed for the discovery and simultaneous scoring of thousands of SNP markers from a single sequencing run for dozens of individuals. These techniques have already been applied in some fish species, such as Ictalurus punctatus [2], Siniperca chuatsi [3], Scophthalmus maximus [4], Paralichthys olivaceus [5], and Megalobrama amblycephala [6]. In addition, significant QTLs for weight, gender and disease resistance in these fish species have been detected. MAS using QTL analysis is the most effective breeding method for many animals and plant species and it is a direct choice for genotyping the control trait loci.
In Pelteobagrus genus, adult individuals of Pelteobagrus vachelli are the largest [7]. In Asia, the reason why it is a very popular commercial species is that it has relatively high yield together as well as an affordable price for consumers to buy [8]. This species is also known as the male parent of the new aquatic species 'Huangyou-1 catfish' (P. vachelli ♂ × P. fulvidraco ♀), which has advantages such as faster growth, a high feeding rate, and greater immunity when compared to its parent species [9]. As with a lot of other cultured fish species, the production of P. vachelli seedlings by catching wild parents and carrying out multiple generations of inbreeding has given rise to varying qualities and the degradation of economic characteristics, e.g., body length, weight, head length and condition factors. These problems result in farming production efficiencies and economic benefits that are difficult to guarantee, in addition to the declining competition for these goods in fish markets, which have seriously affected the sustainable development of the P. vachelli industry. Identification of molecular markers from important QTLs for growth-related characters could contribute to efficient breeding varieties, similar to several other aquaculture species, including Lates calcarifer [10], Hypophthalmichthys nobilis [11], Oncorhynchus nerka [12], and O. tshawytscha [13].
Fish species often exhibit significant sexual dimorphism [14]. Commercial fish species, such as Dicentrarchus labrax [15], Hippoglossus hippoglossus [16], and Oreochromis niloticus [17], display significant variations in size and growth rates between female and male individuals, substantially affecting their commercial value.
Bagridae fish, the Pelteobagrus genus in particular, also exhibit significant sexual dimorphism [18,19]. Females tend to grow more slowly than males and lower body weights (up to 30% less) have been reported at harvest time compared to the body weights of identical cohorts of males [20]. It's worth noting that Pelteobagrus fulvidraco has achieved all-male fish lines [21], which is contributed by sex-associated markers for successfully breeding YY-male and YY-female [22]. Therefore, screening for sex-associated markers will shorten the time required for the development of all-male fish lines for aquaculture and will be helpful in elucidating the mechanisms of sex chromosome differentiation and sex determination [23].
Exposure to hypoxia induces both acute and chronic stress responses, which play important roles in the health of cultured organisms, such as growth, reproduction, immunity, and other energy demanding activities [24]. Many environmental factors (e.g., elevated temperatures, decomposition, algal blooms, and organic matter accumulation via faeces) will lead to rising biological oxygen demand and make the hypoxia status more serious for animals in the system. Living in hypoxic conditions can place selection pressure on genotypes which could tolerate hypoxia at the genomic level more [25], thus developing lines by MAS that are more tolerant to hypoxia is an economical as well as sustainable solution to the problem about aquaculture. However, progress in improving hypoxia tolerance in fish, including Bagridae fish, has been very slow due to the shortage of genetic and physiological knowledge of hypoxia tolerance traits. With fish, especially aquaculture species, genetic analysis of the QTLs associated with hypoxia tolerance is important for efficient MAS in fish, as has been demonstrated in I. punctatus [26,27] and O. niloticus [28].
Although considerable work has been carried out in P. vachelli to increase production, some specific characteristics have continued to hinder the development of traditional phenotype-based selective breeding techniques, such as lacking in large-scale genomic resources as well as few markers that are tightly associated with growth-, sex determination-and hypoxia tolerance-related traits. MAS technology, which uses a series of selected markers that are connected to economical traits tightly, can be used to develop stress tolerant and fast growing lines more effectively [29]. As a result, it is of great necessity to start a selective breeding programme for P. vachelli to promote these economical traits. Comparing to existing RAD-Seq approaches, as a reduced-representation sequencing method, double digest restriction-site associated DNA sequencing (ddRAD-seq) is a way for de novo SNP discovery and genotyping in non-model species that is efficient and cost-saving [30]. This technique has already been applied in some fish, such as Oncorhynchus kisutch [31] and Polyprion oxygeneios [32]. For the specific characteristics of P. vachelli, the ddRAD-seq technique is a very effective and appropriate method for the MAS of a selective breeding programme.
In this study, a high-resolution genetic linkage map of P. vachelli was constructed using the ddRAD-seq technique for comparative genome mapping with assembled genomes of I. punctatus and P. fulvidraco, as well as for the fine mapping of economical QTL traits, including 13 growth-, 1 sex determination-, and 1 hypoxia tolerancerelated trait. Kompetitive Allele Specific PCR (KASP) was used to test tight chain SNPs for conducting MAS and the candidate genes screened across significant genome-wide (GW) QTL localization intervals were further analysed by qRT-PCR. Our results will facilitate the elucidation of the molecular mechanisms that determine growth kinetics, sex determination and hypoxia tolerance-related traits. They will also be useful in future MAS and de novo GSA in P. vachelli and in CGA in Siluriformes fish. We provide a good case study demonstrating the utility of genomic data for investigating the genetic basis of important phenotypic traits using reduced-representation sequencing.

Characteristics of several phenotypic traits
The mapping family in this study consisted of 200 P. vachelli progeny and their phenotypic growth (e.g., W, TL, HL, BL/HL, BH, BL, BW/ED, BL/SL) and hypoxia tolerance-related traits were in concordance with the normal distribution (Kolmogorov-Smirnov test, asymptotic significance > 0.05; Table S1). All growth-related trait values are shown in Table S1, as well as their phenotypic sexes and time to balance loss.
With regards to the growth-related traits, BL/HL, CF, BW/ED, BH/ED, and BL/SL were related to body type proportions and W, BL, HL, TL, BW, BH, CL, and CH were directly related to weight, with these eight traits showing a strong correlation with each other (r = 0.593-0.967, P < 0.001 for all) ( Table 1). The highest correlation value (r = 0.967) was observed between BL and TL. There was a strong correlation between W, TL and BL, with the W being strongly correlated with TL (r = 0.927) and BL (r = 0.934). HL was highly correlated with TL (r = 0.813), BL (r = 0.813) and W (r = 0.823). By phenotype sexing of the mapping family, 101 and 99 individuals were identified as males and females, respectively, with a sex ratio of 1.02:1.  (Table S2).

SNP discovery and genotyping
A total of 461,909 raw polymorphic markers were detected using the STACKS pipeline. After stringent selection, 5165 polymorphic markers were successfully genotyped in both parents and offspring. These SNPs were classified into three categories: maternal heterozygous (1881 SNPs), paternal heterozygous (2405 SNPs), and heterozygous in both (879 SNPs). All SNPs are listed in Table S3.

High-resolution genetic map construction
A high-resolution ddRAD-based linkage map of P. vachelli was constructed using a pseudo-testcross strategy (a mapping population is developed by hybridizing two unrelated, highly heterozygous parents to produce a set of F1 progeny). The resulting integrated map consisted of 26 linkage groups (LGs), including 5059 segregating SNPs (97.9% of all 5165 polymorphic markers). The number of LGs was congruent with the number of P. vachelli chromosomes (2n = 52) [33] The total map length was 4047.01 cM with an average interlocus distance of 0.11 cM. The genetic length of each LG ranged from 122.47 cM (LG1) to 189.03 cM (LG24) with an average interlocus distance of 0.03-0.83 cM ( Table 2 and Fig. 1). The locus names and SNP positions in the 26 LGs of the integrated genetic map are listed in Table S4.

Comparative genome mapping
Within the comparative genome mapping between the LGs of P. vachelli and the chromosomes of P. fulvidraco, LGs and were used for synteny analysis (Fig. 2a). Overall, a one-to-one correspondence was observed between the LGs of P. vachelli and the chromosomes of P. fulvidraco. Notably, for each LG of P. vachelli, approximately 83.2% of the 3879 synteny markers were located on a single chromosome of P. fulvidraco (Fig. 2b), while the remaining were dispersed into various LGs (Table S5). A total of 871 (17.2% of 5059) markers in the linkage map of P. vachelli were uniquely aligned to the chromosomes of I. punctatus (Table S5 and Fig. 2c).

QTL mapping of significant economic traits
In this study, we conducted 15 QTL location analyses, including 13 growth (5 body proportion-and 8 weightrelated traits), sex determination-, and hypoxia tolerancerelated traits. For body proportion-related traits, the CW and GW LOD significance thresholds varied from 3.1 to 3.3 and from 4.9 to 5.7, respectively, based on the permutation test. Using the interval mapping algorithm, no significant QTLs were associated with BW/ED. A total of 11 QTLs that associated with weight-related traits were detected in 6 LGs, including 1 significant GW QTL (qBL/ HL5-a; Fig. 3d) and 10 significant CW QTLs, with LOD scores ranging from 3.1 to 5.1 (Table 3). For weight-related traits, the CW and GW LOD significance thresholds varied from 3 to 3.6 and from 5 to 5.1, respectively, based on the permutation test. A total of 36 QTLs associated with weight-related traits were detected in 10 LGs using the interval mapping algorithm, including 4 significant GW QTLs (qW14-a, qTL3-a, qTL14-a and qHL14-a; Fig. 3a, b, c) and 19 significant CW QTLs, with LOD scores ranging from 3.26 to 5.52 (Table 4). Five QTLs (qW14-a/b, qW24-a and qW3-a/b) associated with W were located at 38 Table S4.  LG Linkage group, CI Confidence interval, GW Genome-wide, CW chromosome-wide, PVE phenotypic variance explained. The numbers in bold represent significant GW QTLs correlation between W, TL and BL, QTLs for BL and TL were located within the same confidence intervals along three LGs (  (Fig. 3d). Among all QTLs, a total of 18 confidence intervals within LG3 and LG14 were associated with weight-related traits and in addition, all significant GW QTLs were mapped to these two LGs (Table 4). For the detection of sex determination loci, the CW and GW LOD significance thresholds varied from 3.3-3.6 and 80.4, respectively, based on the permutation test. A total of 6 QTLs associated with sex-related traits were detected within 3 LGs, including 2 significant GW QTLs (qSXE14-a/b) and 4 significant CW QTLs, with LOD scores ranging from 3.46 to 111.33 ( Fig. 3e and Table 5).
For hypoxia tolerance-related traits, the CW and GW LOD significance thresholds varied from 3.2-3.7 and 5.1, respectively, based on the permutation test. A total of 11 QTLs associated with hypoxia tolerance-related traits were detected within 7 LGs using the interval mapping algorithm, including 1 significant GW QTL (qHT12-a) and 10 significant CW QTLs, with LOD scores ranging from 3.34 to 5.22 ( Fig. 3f and Table 5).

SNP association with significant traits
To confirm and refine the position of these significant GW QTLs, the 200 remaining individuals and the two parents of the 400 offspring, as well as 48 independent wild individuals (24 ♀:24 ♂) were used for SNP phenotypic association analysis by Kompetitive Allele Specific PCR (KASP) assays (Table S6). Our results showed that growth-related traits, including body proportion (BL/HL; un_16957051) and weight (W, BL, TL, HL, and CH) were closely related to several markers (un_15322259, un_40276574, un_ 36488182, un_47032303, un_34909841, and un_54585281) with a K* of 3.884-9.534 and a significance of 0.002-0.049. The un_28380227 marker showed the highest association LG Linkage group, CI Confidence interval, GW Genome-wide, CW Chromosome-wide, PVE Phenotypic variance explained. The numbers in bold represent significant GW QTLs with the hypoxia locus from 1 significant GW QTL (qHT12-a) with a K* of 4.72 and a significance of 0.03. All of the genotypic types (hkxhk, lmxll, and nnxnp) of the 200 remaining individuals were verified and in accordance with their parents. For these 12 sex-associated SNPs, the loci with hkxhk and nnxnp were found to be completely related to the gender phenotype in the 200 remaining individuals (over 97%). For example, in each case the remaining individuals were homozygous (nn) male and heterozygous (np) female when the parents were a homozygous ♂ and a heterozygous ♀ (nnxnp) for these markers. If both parents were heterozygous (hkxhk), the offspring were homozygous with one exclusively male (hh) and the other exclusively female (kk), and the heterozygous offspring (hk) were either male or female. Sexassociated SNPs could not be determined with the lmxll marker (approximately 50-64.7%; Fig. 4 and Table S6). Most loci in the wild population showed no gender difference (approximately 50%), but there were several potentially valuable loci to consider as a marker for potential  Table S6. nnxnp: homozygous ♂ and heterozygous ♀; hkxhk: heterozygous ♂ and heterozygous ♀; lmxll: homozygous ♀ and heterozygous ♂ LG Linkage group, CI Confidence interval, GW Genome-wide, CW Chromosome-wide, PVE Phenotypic variance explained. The numbers in bold represent the significant GW QTLs gender determination, with up to 85.2% association (un_43077153, un_13604800, and un_63006763).

Potential candidate genes for the 8 significant GW QTLs
Based on the non-redundant annotation information of the P. fulvidraco genome, 72 genes in total were located within the 8 significant GW QTLs (4 weight, 1 body proportion, 2 sex determination, and 1 hypoxia tolerance; Table S7). Among them, 2 genes from 1 body proportionrelated QTL (qBL/HL5-a) are related to minus enddirected microtubule transport and sphingolipid synthesis. A total of 14 genes from 4 weight-related QTLs (qW14-a, qTL3-a, qTL14-a, qHL14-a) play important roles in the genetic regulation of development, cell proliferation, immunity, and energy metabolism. The expression of the rapid growth-related genes, Signal-induced proliferationassociated 1-like protein 1 (Sipa1) and Corticosteroid 11beta-dehydrogenase isozyme 2 (HSD11B2), was significantly higher in fast-growing fish than in slow-growing fish in both the liver and the brain (P < 0.05, two-tailed ttest) and Sipa1 expression in the muscle of slowgrowing fish was significantly lower than in fast-and medium-growth fish (Fig. 5).
A total of 55 candidate QTL genes from 2 sex-related QTLs (qSXE14-a/b) were identified, however, no genes associated with sexual differentiation or determination were identified within this region based on the nonredundant annotation information of the P. fulvidraco genome (Table S7). The gene Semaphorin-7A (Sema7a) from 1 of the hypoxia tolerance-related traits (qHT12-a) has been reported to play an important role in hypoxia adaption (Table S7). The temporal expression of Sema-7a mRNA changed in the liver and brain after hypoxia and reoxygenation. Sema-7a mRNA expression peaked in both the liver and the brain after 1.5 h of hypoxia and remained high throughout the recovery.

Discussion
High-resolution genetic map construction using ddRADbased SNPs To date, most catfish genetic maps have been constructed using expressed sequence tags [34], AFLP [35], and SSR [36]. The density of these linkage maps in catfish is still low, with an average density of 1.40-15.20 cM. Currently, there have been several attempts to construct high-density genetic maps of catfish, but thus far, only channel catfish (I. punctatus) [37] and southern catfish (Silurus meridionalis) [38] maps have been constructed. No high-density linkage maps for any Bagridae catfish species have been generated. In the present study, a high-resolution genetic linkage map of dark barbell catfish (P. vachelli) was constructed with 5059 SNPs using the ddRAD-seq technique, which represents the most saturated linkage map to date. This is the first high-density Bagridae fish genetic map reported. A highdensity linkage map, such as the one we have constructed, provides detailed information on genomic structure and is a valuable resource for comparative genomics and fine-scale QTL mapping in catfish.

Comparative genome analysis
Comparative mapping with model species is an efficient way to evaluate the accuracy of genetic maps of non-model species. In this research, syntenic analyses were performed between our map and the reference genome of P. fulvidraco, a species belonging to the same genus as P. vachelli. The number of LGs was congruent with the number of P. fulvidraco chromosomes (2n = 52) [18]. A near 1:1 relationship was observed between our map and the draft genome of P. fulvidraco and a high level of conserved genomic synteny (76.7%) and syntenic boxes (83.2%) were presented, which indicates a high genetic relationship between P. vachelli and P. fulvidraco. A similar phenomenon was also detected for the previous linkage map of Cyprinus carpio [39]. However, only 17.2% of the mapped markers were uniquely aligned to the chromosomes of I. punctatus (2n = 58), indicating the presence of extensive intra-chromosomal rearrangements between the two catfish species or issues with the assembly of the draft genome [40].

Fine scale QTL mapping and SNPs phenotypic association
Fine scale QTL mapping is an efficient approach to identify genetic loci and candidate genes underlying quantitative traits of interest. To date, growth-related QTL studies have been reported in various aquatic species, but with the exception of those in I. punctatus [26,27], very few studies including fine scale QTL mapping have been performed in catfish. QTL fine mapping and candidate gene identification were difficult to achieve in P. vachelli due to the absence of a high-density linkage map. Within this study, our map contains abundant high confidence QTLs related to weight, body proportion, sex determination and hypoxia tolerance. These findings represent a promising platform for MAS and GAS in catfish and will help to clarify the molecular mechanisms of the above phenotypic traits.
Validating the results of the QTL association analysis is of the utmost importance. The fact that the phenotype-associated SNPs showed strong association with weight, body proportion (BL/HL), sex determination, and hypoxia tolerance traits when tested in the 200 remaining individuals, two parents and 48 independent wild individuals using KASP assays is promising. Interestingly, we found that several markers were associated with multiple traits (W, BL, TL, HL, CH) at the same time (Table 6), which also reflects the strong correlation between these weight-related traits. For the 12 sex-associated SNPs, several loci, including hkxhk and nnxnp, showed great relevance to the gender phenotypes (over 97% association in the remaining individuals and over 85.2% association in the wild population), but most loci showed no gender differences (approximately 50% association) in the wild population. The above results also demonstrate the limitations of using SNPs as a sex marker, because they are susceptible to the influence of the parents within the family, which has also been observed in sex association analyses of O. niloticus L. [17], H. hippoglossus [16], and P. oxygeneios [32]. Nonetheless, the use of these markers is possible to conduct genetic gender identification and all-male breeding based on the P. vachelli family line selection. MAS could also be conducted using these SNPs, providing a valuable tool towards the more efficient production of all-male, fast growing and hypoxia tolerant P. vachelli for the aquaculture industry.

Growth-related traits and candidate genes
Strong correlations among the growth-related traits (BW, TL, BL, BH and HL) have been reported in some aquaculture species [39]. Similarly, these five growthrelated traits were also highly correlated with each other in this study ( Table 1). As expected, based on the relatively high correlation value (mean r = 0.943) between W, TL and BL, QTLs for W, TL and BL had the same distribution patterns within the three LGs (LG3, LG14, and LG24). Additionally, all the significant GW QTLs associated with weight-related traits were detected within LG3 and LG14, indicating that these two major LGs are involved in rapid growth and that the candidate genes associated with weight may also be located within these LGs. Among the significant GW QTLs, it is worth noting that several genes related to T cell development (thymocyte selection-associated high mobility group box protein, TOX), T cell antigen recognition (T cell receptor beta variable region), and the inducible expression of T cell cytokine genes (nuclear factor of activated T cells, cytoplasmic 3) were identified, indicating that strong disease resistance may contribute to individual growth [41,42]. In addition, two genes, Sipa1 and HSD11B2, were the most likely to be associated with growth, given their important roles in cell proliferation and development. Sipa1 stimulates the GTPase activity of the Ras-related protein Rap-2a, which may well regulate cytoskeletal rearrangements, cell migration, cell adhesion and cell spreading [43,44], as well as promote the reorganization of the actin cytoskeleton and the recruitment of disks large homologue 4 to F-actin [45]. In mammals, glucocorticoid is very decisive in cell proliferation and differentiation, with the two glucocorticoid metabolic enzymes, HSD11B1 and HSD11B2, considered as anti-proliferation and pro-proliferation switches in cells, respectively. During embryonic development, HSD11B2 prevents cells from the growth-inhibiting and/or pro-apoptotic effects of cortisol [46]. In fish, HSD11B2 was considered to have an activity that is similar to that of mammalian HSD11B2 in preventing the cortisol activation of the glucocorticoid receptors, which could mediate cell development and stress responses. Additionally, HSD11B2 catalyses 11b-hydroxytestosterone converses to 11-keto-testosterone, a fish androgen. Previous studies have reported that androgen-related genes (e.g., 17α-Methyltestosterone) could improve the GH/IGF axis and growth in P. fulvidraco [47]. It was also found that HSD11B2 increases gradually during the period of individual growth in two catfish species [48,49]. Interestingly, the expression of Sipa1 and HSD11B2 in both the liver and brain of the fast-growing fish was significantly higher than that in the medium-or slow-growth fish. This is consistent with the function of these genes in cell proliferation and development, supporting the notion that Sipa1 and HSD11B2 are the two good candidate genes for growth. However, with the current data, we cannot rule out the involvement of other genes within the significant QTLs for growth and further detailed studies on gene function are required to understand the underlying mechanisms of differential growth among individuals.

Sex determination-related traits and candidate genes
The male P. vachelli exhibit > 30% faster growth than females and therefore, in both the P. vachelli farming and academic fields, the development of all-male P. vachelli for aquaculture has been a topic of great concern. The sex determination of P. vachelli is XX/XY because sex ratios in conventional diploid offspring approximate 1:1 [19,33] and gynogenetic offspring are all female (unpublished data), which is the same in P. fulvidraco. However, similar to other fish species, it is not easy to distinguish between the sex chromosomes (X and Y), as well as between the sex and autosomal chromosomes based on current cytogenetic techniques. Screening of sexassociated markers will make the time required for allmale P to develop much shorter [50]. vachelli. In this study, two significant GW QTLs for sex determination in P. vachelli were identified on LG14 (with 92.3% of PVE). There were also some significant CW QTLs within LG21 and LG24, which is very similar to S. maximus. The most sex-related QTLs in S. maximus are found within LG21 (with 99.9% of PVE), with a linkage span of 70.882 cM. In LG7 and LG14, a small quantity of QTLs with high feasibility are discovered too [4]. In some other fish species, the similar phenomena have also been found. In H. hippoglossus, four markers which can be located in LG13 were found to be related to sex determination significantly, with 82% of PVE associated with sex [16]. Additionally, in Sparus aurata L. [3] and S. chuatsi [51], the significant QTLs affecting sex determination were found to be located in LG21 and LG23, respectively. It is suggested that, in some fish species, a main LG or chromosome is involved in sex determination and that in this LG, the sex determination genes may be located.
Among the significant GW QTLs of sex determination, not a single gene associated with sexual differentiation or determination was identified in this area, which may be due to the limitations of our simplified genome sequencing data or to the limitation that the P. fulvidraco genome came from an XX individual. Among the candidate genes, thrombospondin-3b was the most likely to be associated with gender development. Thrombospondin (TSP) proteins are a major component of cortical rods in mature oocytes in invertebrates [52]. A previous study has reported that two types of TSPs from O. niloticus involved in the formation of gonads and oogenesis also play an important role in the 14 day spawning cycle [53].

Hypoxia tolerant traits and candidate genes
The hypoxia tolerant trait was found to be a quantitative trait that is controlled by many genes in animals [54]. Mapping QTLs for hypoxia traits associated with tolerance is the first step in improving the stress tolerance of breeding fish, but few QTL mapping studies on hypoxia traits in fish have been conducted. Only recently, two genome-wide association studies were carried out to identify QTLs for hypoxia tolerance using 250 K SNP arrays and ddRAD-seq with channel catfish families [26,27] and O. niloticus [28], respectively. It is worth noting that we identified a marker, un_28380227, that could be directly used in the MAS of tolerant lines in P. vachelli. Additionally, our findings of the markers located on significant GW QTL intervals in LG12 provide a useful resource. Semaphorin-7A (Sema7a) was located in the significant GW hypoxia QTL. Sema7a plays an important role in integrin-mediated signalling and functions in regulating both cell migration and immune responses. Coimmunoprecipitation assays showed that hypoxiainducible factor-1 (hif-1) had a structure that interacts with Sema7a and that overexpression of hif-1 induces Sema7a production even in normoxic conditions [55]. Previous reports suggested that hypoxia-induced Sema7a could promote neutrophil migration and a protumourigenic mesenchymal phenotype [55,56], which also correlated with the severity of inflammation and osteoclastogenesis [57]. Furthermore, Sema7a was highly expressed in the early stages of hypoxic stimulation and remained high in various tissues throughout the recovery period. These results suggest that Sema7a plays a vital role during hypoxia and reoxygenation, indicating that Sema7a may be a good candidate gene for the hypoxic response.

Conclusion
We constructed a high-resolution genetic linkage map with a length of 4047.01 cM and an average marker interval of 0.11 cM using 5059 ddRAD markers in P. vachelli. Comparative genome mapping revealed that a high proportion (83.2%) of markers with a one-to-one correspondence were observed between P. vachelli and P. fulvidraco. Several SNPs from 8 significant genome-wide QTLs (4 weight, 1 body proportion, 2 sex, and 1 hypoxia) were found to be associated with these phenotypic traits in other individuals by KASP assays. In addition, two candidate genes for weight, Sipa1 and HSD11B2, were differentially expressed between fast-, medium-and slow-growing P. vachelli. Sema7a, which is associated with hypoxia tolerance, was induced after hypoxia exposure and reoxygenation. We mapped a set of suggestive and significant QTLs, as well as candidate genes for growth-, sex determination-and hypoxia tolerance-related trait based on the P. fulvidraco genome. Our results provide a valuable tool for more efficient production of all-male, fast growing and hypoxia tolerant P. vachelli for the aquaculture industry.

Mapping population and phenotypic data
The male parent was selected from a group of P. vachelli derived from a wild population from the Pearl River in Guangdong province and the female parent was chosen from a cultured population from the ChenQiang fish farm in Changzhou, Jiangsu Province, China. Fish breeding and cultivation were performed at the Nanjing Normal University. The juvenile fish were fed live Artemia nauplii to satiation twice daily until the mouth gape sizes matched the size of the artificial compound feed. The feed was slightly over-supplied based on previous consumption.
Fish were acclimated at an ambient temperature of 26 ± 1°C in aerated flow-through water with a biofiltered water recirculation system for 1 week, followed by feed restriction for 2 days. For phenotypic data collection, a total of 400 offspring from a full-sib family at 105 days post-hatch were randomly selected and stocked in an aquarium (580 L of volume). The dissolved oxygen level was reduced gradually from 8.23 to 1.51 mg/L over 2.8 h by bubbling pure nitrogen gas. After that, the surface of the water was enclosed by a plastic film and fish were monitored for signs of balance loss due to hypoxic stress as the dissolved oxygen level was reduced from 1.51 to 0.32 mg/L over 2.6 h. The time and sequence of balance loss for each fish was recorded for the analysis of hypoxia tolerance.
Fish were stopped breathing after anaesthetization in a eugenol bath (1:10,000) for 5.5 min, then dissection directly. Growth related traits including weight (W), body length (BL), condition factor (CF), head length (HL), BL/ HL, total length (TL), body width (BW), body height (BH), BW/ED (eye distance), BH/ED (eye diameter), BL/ SL (snout length), caudal peduncle length (CL), and caudal peduncle height (CH) were measured by electronic scales or Vernier calliper for all 400 offspring, at which time the sexes were also identified by the stained squash technique. After the measurements were taken, a portion of the fins and muscle tissues were frozen in liquid nitrogen and stored at − 80°C. This family exhibited a high variation in their growth-and hypoxia tolerancerelated traits and therefore 200 randomly selected P. vachelli offspring (F1) from the above 400 offspring and their parents (F0) were used for the development of a genetic linkage map and QTL analysis. The 200 remaining individuals were used to test SNP markers in KASP assays.

Construction and sequencing of ddRAD libraries
Genomic DNA was extracted from the fin tissue of each of the 200 randomly selected offspring (F1) and two parents (F0). ddRAD libraries for all F1 lines were constructed as described in a previous study [40]. Briefly, 500 ng of DNA template from each individual was double-digested using the restriction enzymes EcoRI and NlaIII (20 U/ reaction; New England Biolabs, Ipswich, MA, USA) in a combined reaction for 30 min at 37°C. Subsequently, each fragmented sample was purified using a Qiagen MinElute Reaction Cleanup Kit (Qiagen, Valencia, CA, USA) and eluted in 20 μL of elution buffer (EB). The fragments were then ligated to P1 (includes a unique 4-to 8-bp multiplex identifier used to distinguish each individual) and P2 adapters that bound to the EcoRI and NlaIII overhangs, respectively. In each 40-μL reaction, 500 ng of DNA, 1 μL of P1 adapter (10 mM), 1 μL of P2 adapter (10 mM), 1 μL of T4 ligase (1000 U/ mL), 4 μL of 10X T4 ligation buffer, and double-distilled water were mixed. The ligation was performed in a PCR machine using the following conditions: 37°C for 30 min, 65°C for 10 min, followed by a decrease in temperature to 20°C at a rate of 1.3°C/min. The samples were pooled and size-selected (400-600 bp) from an agarose gel. The DNA product was subsequently purified using a Qiagen MinElute Gel Purification Kit and eluted in 10 μL of EB. The paired-end (150 bp) sequencing of the ddRAD products from the 202 individuals was performed using the Illumina HiSeqXten sequencing platform (Illumina, Inc., San Diego, CA, USA). The sequencing data for each individual was extracted according to the specific multiplex identifier.

SNP discovery and genotyping
We first filtered out Illumina short reads lacking sample-specific multiplex identifiers and the expected restriction enzyme motifs. Thereafter, the reads were filtered on the basis of their quality scores using Trimmomatic (v0.32) [58] in three steps: (1) removal of adapters; (2) removal of reads with bases from the start or end of a read, if below the quality threshold of 3; and (3) scanning of the reads with a 4-bp sliding window, removing reads with an average Phred quality per base below 15.
The STACKS pipeline (Version 1.32) [59] was used to assemble the loci, de novo, from the sequencing data for SNP calling. The USTACKS, CSTACKS, SSTACKS, and GENOTYPE programmes were then used to create libraries of the loci, i.e., one for each individual and one for all the loci shared among the individuals. The minimum depth of coverage required to create a stack was 3; the maximum distance (in nucleotides) allowed between stacks was 3; 2 mismatches were allowed between sample loci when the catalogue was built. The detailed parameters used were as follows: USTACKS: -t gzfastq -i - Only the SNPs with miss rates (number of samples with no genotype information/number of total samples) less than 10% along with biallelic SNPs were selected using custom PERL scripts (https://github.com/Niuyongchao/Fish_linkage_map) to avoid sequencing errors.

Linkage map construction
A linkage map was constructed using JoinMap 4.1 [60]. The linkage group assignments were made under a logarithm of odds (LOD) score limit of 5.0-7.0. The regression mapping algorithm and Kosambi's mapping function were used for map construction with the following settings: Rec = 0.4, LOD = 1.0, Jump = 5. The resulting linkage maps were drawn using a custom PERL script (https:// github.com/Niuyongchao/Fish_linkage_map).

Comparative genome analysis
Using a method from a previous study [61], comparative mapping was performed between our genetic map and the reference genomes of three Siluriformes fish, including the channel catfish (I. punctatus) and the yellow catfish (P. fulvidraco). In brief, the ddRAD tag markers were mapped to the two catfish genomes using the BLAST software suite (NCBI BLAST+ version 2.6.0) with an e-value cut-off of 10 − 10 . In cases where the search of a query sequence hit two or more loci, the hit with the smallest e-value was considered significant [38]. Finally, markers with a single unique position in the three reference genomes were used to perform comparative mapping and the genomic synteny was visualized using Circos software.

QTL mapping
The QTLs were identified using MapQTL 6.0 with an interval mapping algorithm [62]. Automatic cofactor selection (backward elimination, P < 0.05) was used for the detection of significantly associated markers as cofactors. The LOD significance threshold levels were determined by a permutation test on the basis of 1000 permutations at a significance level of P < 0.05. The location of each QTL was determined according to its LOD peak location and surrounding region. The percentage of the phenotypic variance explained by a QTL (R 2 ) was estimated at the highest probability peak. The QTL results were drawn using a custom PERL script (Genepioneer Biotechnologies, Nanjing, China). QTLs with LOD scores exceeding the chromosome-wide (CW) LOD threshold at P < 0.05 were considered as suggestive, while those with LOD scores exceeding the genome-wide (GW) LOD threshold at P < 0.05 were considered as significant.

Kompetitive allele specific PCR assay
To confirm and refine the positions of the significant GW QTLs detected in the F1 mapping population, the 200 remaining individuals from the 400 total offspring of a full-sib family at 105 days post-hatch were used. In addition, in order to avoid the restriction of the remaining population on the genotyping of the parental fish, the sex marker associations were tested in the two parents of the 400 offspring (1 ♀:1 ♂) and in 48 independent wild individuals (24 ♀:24 ♂) originating from the Yangtze River, Yellow River, Amur River and Pearl River. Marker phenotype association was tested using KASP assays based on SNPs that were commonly found in the mapping families and spanned the regions of 8 significant GW associations with weight (un_15322259, un_ 40276574, un_36488182, un_47032303, un_34909841, un_ 54585281), body proportion (un_16957051), sex determination (un_42441442, un_40145170, un_43077153, un_ 47032303, un_36488182, un_15322259, un_40276574, un_ 34909841, un_13604800, un_25544718, un_63006763, un_ 55377625), and hypoxia tolerance (un_28380227).
The KASP assay was carried out according to the manufacturer's recommendations (LGC Genomics, Beverly, MA, USA) and a previous study [63]. Marker sequences and KASP assay primers are listed in Table  S8. Amplification was carried out starting with 15 min at 94°C, followed by 10 touchdown cycles of 20 s at 94°C and 60 s at 65-57°C and 26-35 cycles of 94°C for 20 s and 60°C for 1 min. Endpoint genotyping was done using the CFX Manager 3.1 software. The specificity and sensitivity of all tested markers are listed in Table S2. Afterwards, an association analysis was performed using the Kruskal-Wallis test with SPSS 22.0. The Kruskal-Wallis test statistic (K*) and asymptotic significance (P < 0.05) were applied in order to test the magnitude of associations between the SNP genotypes and phenotype.

Identification of potential candidate genes
For those markers that were located in the confidence intervals of GW QTLs and mapped at a single position on the assembled genome of P. fulvidraco, their sequences were extended by adding 500 nucleotide sequences from each side of the genome. The extended markers were used to identify conserved regions in the genome of P. fulvidraco and the potential candidate genes were identified in the conserved regions based on the annotation information.
We identified a number of candidate genes including Semaphorin-7A (Sema7a), Signal-induced proliferationassociated 1-like protein 1 (Sipa1) and Corticosteroid 11-beta-dehydrogenase isozyme 2 (HSD11B2), which were possibly responsible for the rapid growth and hypoxia tolerance traits (see below). The relative expression levels of Sipa1 and HSD11B2 were detected between fast-, medium-and slow-growing P. vachelli (fast: 13.2 ± 1.7 g; medium: 9.2 ± 1.1 g; slow: 4.7 ± 0.6 g) by qRT-PCR. Each group had nine individuals and three fish were pooled to create a sample. The materials for the hypoxia experiment were the same as our previously published articles [64]. In brief, control fish were removed from three aquaria for direct organ dissection. Afterwards, the water was deoxygenated for 25-30 min by bubbling pure nitrogen gas in order to decrease the oxygen concentration from 6.8 mg/L to 0.7 mg/L. After the fish were subjected to hypoxia for 1.5, 4 and 6.5 h, fish were quickly removed for organ dissection. After hypoxia for 6.5 h, the bubbling of nitrogen gas was replaced by bubbling with air, and the oxygen concentration was returned to a normal level (6.8 mg/L) in the three aquaria after 25-30 min. During the reoxygenation for 1.5, 4 and 6.5 h, fish were quickly removed for organ dissection.
The harvested organs were immediately frozen in liquid nitrogen and stored at − 80°C. To evaluate the role of Sema7a in response to hypoxia and reoxygenation in P. vachelli, the sampling of liver and brain tissues from nine fish of the control (C) and challenged groups (hypoxia for 1.5, 4 and 6.5 h: H1.5, H4 and H6.5, respectively; reoxygenation for 1.5, 4 and 6.5 h: R1.5, R4 and R6.5, respectively) in the three aquaria was also conducted. Three fish were pooled to create a sample from an aquarium. qRT-PCR with β-actin as an internal control was used to explore mRNA expression levels. qRT-PCR was performed with the SYBR Green Master kit according to the manufacturer's instructions (Roche, Basel, Switzerland). The primers for qRT-PCR are listed in Table S8. The experiments were carried out in triplicate in an ABI Stepone™ Plus (Applied Biosystems, USA) with a total volume of 20 μL containing 10 μL of SYBR green mastermix, 4 μL of cDNA (500 ng), and 3 μL each of forward and reverse primers (2 μmol/L). The qRT-PCR was carried out as follows: 95°C for 10 min, followed by 40 cycles of 95°C for 15 s, and 55°C for 1 min. The expression levels were calculated by the 2 -△△CT method and subjected to statistical analysis. During the analysis of mRNA expression, fold change was determined by comparing to the expression of a reference gene.