Development of genomic markers associated to growth-related traits and sex determination in lumpfish ( Cyclopterus lumpus )

Cleaner fish species have gained great importance in the control of sea lice, among them, lumpfish ( Cyclopterus lumpus ) has become one of the most popular. Lumpfish life cycle has been closed


Introduction
Sea lice infestation remains the most pressing issue affecting salmon aquaculture worldwide. Losses linked to sea lice were estimated at €700 million worldwide in 2015 and continue to increase [1]. These losses result not only from reduced production due to sea lice-associated mortalities, decreased fish growth, and reduced flesh quality, but also from the cost of treatment against sea lice. Treatment often involves parasiticide chemicals or mechanical methods that have limited effectiveness or carry other health risks [2].
To address this issue, biological control of sea lice infection in Atlantic salmon cages has become an important alternative for tackling one of the most important diseases affecting salmon aquaculture. This strategy has gained increasing popularity mainly due to its effectiveness and environmental safety [3]. In Norway, for example, about 0.7 million cleaner fish were deployed in salmon cages in 2006, a figure that rose drastically to 43 million in 2019 before decreasing slightly in the last couple of years (Norwegian Directorate of Fisheries, 2022 [4]). Estimated figures indicate that approximately half of the cleaner fish used in the Atlantic salmon industry are lumpfish (Cyclopterus lumpus), and the remainder includes diverse species of wrasse [5,6] such as ballan wrasse (Labrus bergylta), corkwing wrasse (Symphodus melops) and goldsinny wrasse (Ctenolabrus rupestris). However, current cleaner fish production relies heavily on wild-caught broodstock, and production from farmed broodstock remains minimal [5,7].
In recent years, hatchery production technologies for cleaner fish, particularly lumpfish and ballan wrasse, have been under intense development. Among these species, lumpfish hatchery production has proven to be more straightforward, with relatively high and stable survival rates [1] being achieved, making it a promising candidate for extensive use as a cleaner fish [8]. Lumpfish is a sub-Arctic species commonly found along the Icelandic, Norwegian, and British coastlines, as well as the East coast of North America [9]. Crucially, the lumpfish life cycle has been closed, and hatchery reproduction is now possible. However, there are only a limited number of hatcheries producing lumpfish in Europe, and breeding programmes are notably lacking [1]. Selective breeding is an effective strategy for improving the production of aquaculture species by enhancing economically important traits [10]. Lumpfish hatchery production, which imitates natural conditions, takes only about seven months before the fish are ready for deployment. This is significantly shorter than the roughly 1.5 years required for ballan wrasse deployment, making the production cycle much more cost-effective [1]. Contrary to most aquaculture species, lumpfish grow much faster than would be preferred, leading to problems associated with delousing behaviour [11]. There is a marked decline in delousing activity in lumpfish upon reaching a large body size (over 300 g in 6-10 months), likely due to their slow movement, feeding on salmon pellets, and increasingly aggressive (and territorial) behaviour, mainly triggered by the onset of sexual maturation [11]. Therefore, the establishment of breeding programmes that allow the production of stocks with a more favourable growth rate and other desirable traits would greatly benefit lumpfish production.
Selective breeding of commercially important traits has been applied in the improvement of numerous aquaculture species [12]. Growth-related traits in particular are common targets for selection in aquaculture breeding programmes due to their impact on production and profitability, and have shown great improvement after a few generations [13,14]. This success in growth improvement can be attributed to the medium to high heritability values that most aquaculture species show for these types of traits, ranging from 0.3 to 0.6 [12]. Contrary to the prevailing trend towards increasing the growth rate of livestock species, lumpfish selection aims to reduce the growth of the animal in order to extend its usefulness as a cleaner fish.
Recent developments in genomic technologies have transformed selective breeding programmes for aquaculture species. This has mostly been driven by continuous advances in sequencing technologies that enable high-throughput discovery and screening of genetic markers, in particular single nucleotide polymorphisms (SNPs), which are highly abundant and widely distributed through the genome [12]. Screening of thousands of SNPs via genotyping-by-sequencing (GBS) techniques (such as ddRAD) or by SNP arrays has become common practice in devising and managing selective breeding schemes for many commercial aquaculture species, including emerging ones [12,15]. GBS techniques are based on a reduced representation of the genome, achieved by using restriction enzymes that cut the DNA at specific sites and create a library of sequences dispersed across the genome. The main advantage of GBS tools is that they can sequence numerous individuals at enough depth to identify SNPs confidently. While high-coverage whole genome sequencing can provide a much better representation of the genome, the associated sequencing cost remains high, considering the hundreds of animals needed for GWAS or breeding programme applications [15]. These technologies have also helped in the development of reference genome sequences for numerous aquaculture species [16], including the recent release of a reference genome for lumpfish that serves as a valuable genomic resource for advancing production of this species [17]. All these genomic resources have provided a keystone for the development of selection schemes through the application of marker assisted selection (MAS), the adoption of accurate genomic relationship matrices (GRM) for breeding value calculations, and the advancement of genomic selection in most aquaculture species [12].
To support the establishment of sustainable selective breeding programmes for lumpfish, this study aimed to develop genetic resources for the species and identify genomic regions associated with growth traits and sex. These efforts will contribute to the production of stocks with favourable growth rates and other traits of interest, ultimately leading to the development of an effective and sustainable solution to the sea lice infestation issue in salmon aquaculture.

Family creation
Wild broodstock were obtained from Skjerneset Fisk at Averøy, Norway. A total of 14 independent C. lumpus full-sib families (1Q to 14Q) were created within 3 h of each other on the same day in October 2018, and reared from fertilisation to final sampling at the NOFIMA Cleaner Fish Unit at Sunndalsora, Norway. All families were reared in discrete incubation units/tanks, as outlined below, that were supplied by a common water source to ensure comparable environmental conditions. During egg incubation water temperature averaged 7.4 °C ± 0.8 °C, while during larval rearing water temperature averaged 11.5 °C ± 0.5 °C. Following hatching, temperatures were increased from the incubation temperature to the larval rearing temperature over a period of one week. From fertilisation to just prior to hatching, egg masses were held within individual perforated trays (20 × 30 × 10 cm) submerged within a communal trough. Just prior to hatching these trays were transferred to individual 250 L cylindroconical tanks. Hatching in all families initiated within three calendar days of each other, no later than 300 degree-days post fertilisation. Larvae were fed following routine commercial practice, first with live feed (Artemia), before weaning to a commercial formulated feed (Otohime, PTAqua, Norway). At 90 days post hatch, when fish reached an average of 0.58 g wet weight, the number of families being reared was reduced to four because rearing capacity was limited. The four families were selected based on survival performance up to this point, to ensure a suitable number of study animals to complete the study. Stock numbers were balanced to an average of 3250 juveniles, which were selected at random within each family. At this stage the four selected families were transferred into 1500 L cylindroconical tanks for the remainder of the study. The final four families (3Q, 7bQ, 10Q and 11Q) were reared following normal commercial practice until days post hatch, when the final phenotyping sampling was performed.

Phenotype measurement and sampling
For all four families the same sampling regime was followed. A total of 100 individuals were selected at random and culled by lethal anaesthesia; then, for each individual, total length (±1 mm), standard length (±1 mm), weight (±0.01 g) and sex (where identifiable) were recorded. Thereafter, a family-specific upper and lower size threshold was calculated (bottom 10% of the population curve "small" and top 10% of the population curve "big") as shown in Table 1. Then, a further 100 "small" and 100 "big" individuals within each family were sampled. For all fish, body condition was measured using Fulton's condition factor (K = 100 × weight/length³; [18]). The subsequent genomic analysis was based on 125 individuals (50 "big", 50 "small" and 25 randomly selected individuals) from each family. The sex of juveniles within each family was balanced where possible. A total of 36 wild broodfish (including parental pairs) were also sampled by collecting fin clips.
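As a minimal sketch of the condition factor calculation described above, the function below evaluates K = 100 × weight/length³. The choice of grams and centimetres is an assumption (the text records lengths in mm and does not state the units or which length measure enters the formula), and the example fish is hypothetical.

```python
def fulton_k(weight_g: float, length_cm: float) -> float:
    """Fulton's condition factor K = 100 * W / L^3.

    Units (g and cm) are an assumption; the source does not state them.
    """
    if length_cm <= 0:
        raise ValueError("length must be positive")
    return 100.0 * weight_g / length_cm ** 3


# Hypothetical fish: 20 g, 80 mm long (converted from mm to cm).
k_example = fulton_k(20.0, 80 / 10)
```

Because length is cubed, converting mm to cm before applying the formula matters: using mm directly would shrink K by a factor of 1000.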

DNA extraction
Fin clips for all parents and offspring were stored in 99% ethanol at 4 °C until DNA extraction. Genomic DNA was extracted using a salt extraction method as described before [19]. Total nucleic acid content and quality (260 nm/230 nm and 260 nm/280 nm ratios) were determined by spectrometry (Nanodrop; Thermo Scientific, Hemel Hempstead, UK) before measuring double-stranded DNA concentrations using a Qubit dsDNA Broad Range Assay Kit and Qubit Fluorometer (Invitrogen, Paisley, UK).

Library preparation and sequencing
The ddRAD libraries were prepared using an adapted version of an existing protocol [19]. Briefly, DNA from each sample was digested at 37 °C for 75 min with restriction enzymes PstI and NlaIII (New England Biolabs, UK), followed by heat inactivation at 65 °C for 25 min. The DNA samples were then individually barcoded through the ligation of specific P1 and P2 adapters, each containing a unique five or seven base nucleotide sequence. After addition of pre-mixed adaptors (PstI:NlaIII 1:16) and incubation of samples at 22 °C for 10 min, T4 ligase (2000 ceU/μg DNA), rATP (100 mM) and CutSmart buffer (1×) were added and samples incubated for 90 min at 22 °C, followed by heat inactivation (65 °C, 20 min). Libraries were column purified (PCR MinElute, Qiagen, Manchester, UK), size selected by gel electrophoresis (550-650 bp) and amplified by PCR (15 cycles). Sequencing was performed by Novogene (UK) Co. Ltd. (Cambridge, UK) using an Illumina Novaseq 6000 platform (150-base paired-end reads).

Marker assembly and genotyping
The sequence data from the 536 individuals (4 families of 125 offspring each, and 36 broodfish; Supplementary Table S1) were preprocessed to discard low-quality reads (i.e., with an average quality score <20). Sequences lacking the restriction site or having ambiguous barcodes were discarded during the sample demultiplexing stage. Retained reads were then aligned against the genomic assembly of C. lumpus (NCBI Assembly accession GCA_009769545.1) using bowtie2 v2.3.5.1 [20] and assembled using gstacks from Stacks v2.60 [21] (using the marukilow model, and ignoring the read pairing).
All loci that were common to at least two individuals, with no further filtering, were exported from Stacks. As the analysis was based on a ddRAD sequencing method, many SNPs were sequenced in a very limited number of individuals, creating many missing genotypes. To avoid introducing errors or unreliable results in the downstream analysis, we did not apply an imputation step to such a large matrix, in which a substantial portion of the genotypes is missing. Using PLINK v1.9 [22], groups of variants that shared the same coordinates were identified, and only the first marker was retained (--list-duplicate-vars suppress-first), to avoid duplicated or ambiguous markers. Moreover, SNPs with unknown position or located in partial chromosomes were excluded from the analysis. For each dataset (parents and offspring, corresponding to two separate sampling campaigns), SNPs and individuals were further filtered for quality control in a two-step process, again using PLINK. First, SNP inclusion was confined to those with minor allele frequency ≥ 0.005 and a p-value of the χ² test for Hardy-Weinberg equilibrium ≥ 10⁻⁶. Then SNPs and animals with a call rate ≥ 0.9 were selected. Quality control was performed on the two datasets (parents and offspring) independently. Filtered genotypes were then combined into one dataset, keeping only shared SNPs.
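The SNP-level part of the quality control described above can be illustrated with a small stand-alone sketch. It is not the PLINK implementation: it applies a plain chi-square Hardy-Weinberg test (1 degree of freedom) and evaluates the three thresholds in a single pass rather than in two steps; the function names and the toy genotype vectors are hypothetical. Filtering of individuals by call rate is not shown.

```python
import math


def snp_stats(calls):
    """Return (call_rate, maf, hwe_p) for one SNP.

    calls: genotypes coded 0/1/2 (allele copies); None marks a missing call.
    """
    obs = [g for g in calls if g is not None]
    n = len(obs)
    if n == 0:
        return 0.0, 0.0, 1.0
    call_rate = n / len(calls)
    p = sum(obs) / (2 * n)          # frequency of the counted allele
    maf = min(p, 1 - p)
    if maf == 0:
        return call_rate, maf, 1.0  # monomorphic: HWE test not defined
    expected = [n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2]
    observed = [obs.count(0), obs.count(1), obs.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    hwe_p = math.erfc(math.sqrt(chi2 / 2))  # chi-square upper tail, 1 d.f.
    return call_rate, maf, hwe_p


def passes_qc(calls, min_maf=0.005, min_hwe_p=1e-6, min_call_rate=0.9):
    """Apply the MAF, HWE and call-rate thresholds quoted in the text."""
    call_rate, maf, hwe_p = snp_stats(calls)
    return maf >= min_maf and hwe_p >= min_hwe_p and call_rate >= min_call_rate


balanced = [0, 1, 2, 1, 0, 1, 1, 2, 0, 1]       # well-behaved SNP
all_heterozygous = [1] * 40                     # strong HWE departure
sparse = [0, 1, None, None, None, 2, 1, None, None, None]  # call rate 0.4
```

In this toy data only `balanced` is retained: `all_heterozygous` fails the Hardy-Weinberg threshold and `sparse` fails the call-rate threshold.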

Multidimensional scaling analysis
R v4.2.0 [23] was used to carry out Multidimensional Scaling Analysis on the dataset using the package Bioconductor/SNPRelate v1.30.1 [24] to calculate the Identity-By-State (IBS) proportion for each sample.
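To make the Identity-By-State proportion concrete, the sketch below computes it for pairs of individuals as the mean share of alleles in common across SNPs genotyped in both. This is an illustrative pure-Python version under that common definition, not the SNPRelate code; the function names and toy genotypes are hypothetical. Multidimensional scaling would then be run on a distance such as 1 − IBS, which is not shown here.

```python
def ibs_proportion(a, b):
    """Mean proportion of alleles shared IBS between two individuals.

    a, b: genotype vectors coded 0/1/2; None marks a missing call.
    Only SNPs called in both individuals contribute.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")
    return sum(2 - abs(x - y) for x, y in shared) / (2 * len(shared))


def ibs_matrix(genotypes):
    """Symmetric matrix of pairwise IBS proportions."""
    n = len(genotypes)
    return [[ibs_proportion(genotypes[j], genotypes[k]) for k in range(n)]
            for j in range(n)]


toy = [
    [0, 1, 2, 2],
    [0, 1, 2, 2],     # identical to the first individual
    [2, 1, 0, 0],     # opposite homozygote at three of four SNPs
]
ibs = ibs_matrix(toy)
```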

Identification of trait associated markers
Using the recorded phenotypic data (total length, standard length, weight, condition factor and sex), association analyses were performed with the package R/SNPassoc v2.0-11 [25] for R v4.2.0 [23], using the "log-additive" model (except for sex, where the "co-dominant" model was used). We used a p-value threshold of 0.001 and a corrected p-value for multiple tests of 0.001/number of tests. The model used for the analysis was based on Interval Mapping. The algorithm used considers the phenotype to follow a mixture of Bernoulli distributions and uses a form of the EM algorithm for obtaining maximum likelihood estimates [26]. Two-way and multiple quantitative trait locus (QTL) models were also run with this package. R/qtl2 v0.28 [27] was used to identify the confidence interval of the QTLs on each chromosome. After calculating the kinship matrix with calc_kinship, we performed univariate QTL mapping using the scan1 function for each trait.
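The multiple-testing correction quoted above (0.001/number of tests) is simple arithmetic, sketched below. Taking the number of tests to be the 7193 SNPs retained after quality control is an assumption made for illustration; the text does not state the exact count used.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed number of tests: one per QC-filtered SNP (7193).
threshold = bonferroni_threshold(0.001, 7193)
# Height of the corresponding significance line on a Manhattan plot.
neg_log10_threshold = -math.log10(threshold)
```

Under this assumption the corrected threshold is roughly 1.4 × 10⁻⁷, i.e. a −log10(p) of just under 7.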

Genetic relationship matrix
A genetic relationship matrix (GRM) was used to estimate the genetic relationships between individuals from the SNPs. The genetic relationship value assigned to two individuals (j and k) forms the off-diagonal values of the GRM and can be estimated by the following equation:

$$A_{jk} = \frac{1}{N}\sum_{i=1}^{N}\frac{(x_{ij}-2p_i)(x_{ik}-2p_i)}{2p_i(1-p_i)}$$

where x_ij and x_ik are the number of copies of the reference allele (0, 1 or 2) for the i-th SNP in individuals j and k respectively, N is the number of SNPs in common (for both individuals j and k) and p_i is the reference allele frequency of the i-th SNP. For each SNP the estimate is summed (Σ) and weighted equally (1/N).
The diagonal values of the GRM represent one plus the inbreeding coefficient and can be calculated with the following equation:

$$A_{jj} = 1 + \frac{1}{N}\sum_{i=1}^{N}\frac{x_{ij}^{2}-(1+2p_i)\,x_{ij}+2p_i^{2}}{2p_i(1-p_i)}$$
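The GRM calculation can be sketched in a few lines of plain Python. The version below assumes the estimator documented for GCTA (off-diagonal: mean of (x_ij − 2p_i)(x_ik − 2p_i)/(2p_i(1 − p_i)); diagonal: one plus the mean of (x_ij² − (1 + 2p_i)x_ij + 2p_i²)/(2p_i(1 − p_i))), estimates allele frequencies from the sample itself, and skips monomorphic or missing SNPs. The function name and toy genotypes are hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """GRM from genotypes[j][i] = allele copies (0/1/2) of SNP i in individual j.

    None marks a missing call. Assumes the GCTA-style estimator described
    in the lead-in; allele frequencies are estimated from the sample.
    """
    n_ind, n_snp = len(genotypes), len(genotypes[0])
    freqs = []
    for i in range(n_snp):
        calls = [row[i] for row in genotypes if row[i] is not None]
        freqs.append(sum(calls) / (2 * len(calls)) if calls else 0.0)

    grm = [[0.0] * n_ind for _ in range(n_ind)]
    for j in range(n_ind):
        for k in range(j, n_ind):
            total, n_used = 0.0, 0
            for i, p in enumerate(freqs):
                xj, xk = genotypes[j][i], genotypes[k][i]
                if xj is None or xk is None or p in (0.0, 1.0):
                    continue  # missing call or monomorphic SNP
                denom = 2 * p * (1 - p)
                if j == k:
                    total += 1 + (xj * xj - (1 + 2 * p) * xj + 2 * p * p) / denom
                else:
                    total += (xj - 2 * p) * (xk - 2 * p) / denom
                n_used += 1
            value = total / n_used if n_used else float("nan")
            grm[j][k] = grm[k][j] = value
    return grm


# Three hypothetical individuals, three SNPs (all with allele frequency 0.5).
toy_grm = genetic_relationship_matrix([[0, 1, 2], [2, 1, 0], [1, 1, 1]])
```

In this toy example the first two individuals carry opposite homozygotes at two SNPs, so their relationship estimate is negative (−4/3), while each has a diagonal value of 4/3, reflecting excess homozygosity.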

Estimation of the variance explained by genome-wide SNPs by REML
A restricted maximum likelihood estimation (REML) analysis was used to partition the variance of each phenotypic trait between a group of observed values (fixed effects) and the genomic relatedness (random effect based on the GRM). The analysis was performed in GCTA v1.94.1 [28] using the following equation:

$$Y = X\beta + Z\gamma + \varepsilon$$

where Y is the phenotype, X is the incidence matrix for fixed effects, β is the fixed variable effect size, Z is the GRM, γ is the vector of random effects from the GRM, and ε is the residual random effect (representing environmental, non-genetic effects).

Table 1
Summary of population statistics. Mean weight (g) and length (mm) distributions as well as maximum and minimum sizes observed, along with threshold sizes (as defined by individual total length) which demarcated population-specific upper (90%) and lower (10%) size thresholds for selective sampling.

Estimating components of heritability
With the fixed and random (GRM) effects specified, the model is fitted using REML to generate an estimate of the variance explained by the GRM, adjusted for all the fixed effects. By default, GCTA calculates a likelihood ratio test (LRT) to examine the significance of the GRM random effect on the fit of the model (p-value). SNP-based heritability estimates were obtained as the proportion of genotypic to phenotypic variance estimated from the GRM in the analysis.
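For orientation, the quantities described above can be written out explicitly. The notation below is an assumption introduced for illustration (σ²_g for the variance attributed to the GRM, σ²_e for the residual variance, and L for the maximised restricted likelihood); it follows the usual definitions of SNP-based heritability and of a likelihood ratio test comparing the model with and without the GRM random effect.

```latex
h^2_{\mathrm{SNP}} = \frac{\sigma^2_g}{\sigma^2_g + \sigma^2_e},
\qquad
\mathrm{LRT} = 2\left[\ln L(\text{model with GRM}) - \ln L(\text{model without GRM})\right]
```

A heritability close to 1 therefore means that almost all of the phenotypic variance remaining after the fixed effects is attributed to the genomic relationships.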

QTL strength model
The effect of all SNP markers for each QTL was analysed using WEKA v3.8 [29], which contains a variety of machine-learning algorithms, including "REPTree" [29], a fast decision tree learner that builds a decision/regression tree using information gain/variance and reduced-error pruning with backfitting. "REPTree" considers all the markers, then derives for each individual a phenotype prediction (lengths, weight and sex) based on its genotypes for the markers considered. To train the model, we used a minimum total weight of the instances in a leaf of 3 (minNum), and the amount of data used for pruning was 4 (numFold). The most predictive SNP markers for each QTL were selected and used to produce a reduced SNP panel with the same prediction power as the full set of markers. In a leave-one-out procedure, individuals were removed one by one from the training set, with the algorithm subsequently predicting their phenotypic values.
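The leave-one-out evaluation described above can be sketched independently of WEKA. In the example below the predictor is a deliberately simple nearest-neighbour rule on genotype vectors, swapped in for REPTree purely to keep the sketch self-contained; the function names, genotypes and sex labels are all hypothetical.

```python
def nearest_neighbour_predict(train_x, train_y, query):
    """Stand-in predictor (not REPTree): label of the closest training genotype."""
    distances = [sum(abs(a - b) for a, b in zip(x, query)) for x in train_x]
    return train_y[distances.index(min(distances))]


def leave_one_out_accuracy(genotypes, labels, predict=nearest_neighbour_predict):
    """Hold out each individual in turn, predict it from the rest, score hits."""
    hits = 0
    for i in range(len(genotypes)):
        train_x = genotypes[:i] + genotypes[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, genotypes[i]) == labels[i]:
            hits += 1
    return hits / len(genotypes)


# Hypothetical genotypes at two sex-linked SNPs and recorded sex.
toy_genotypes = [[0, 0], [0, 1], [2, 2], [2, 1], [0, 0], [2, 2]]
toy_sex = ["F", "F", "M", "M", "F", "M"]
loo_accuracy = leave_one_out_accuracy(toy_genotypes, toy_sex)
```

Any model with the same `predict(train_x, train_y, query)` shape could be passed in, so the same loop can be used to compare a full marker set against a reduced subset.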

Low density SNP panel
To develop an extensive SNP panel able to capture all genomic regions of interest, as well as maximising the estimation of diversity, all SNP markers associated with the phenotypes of interest (lengths, weight, and sex) were selected, as well as markers with the highest Fst values (elevated genetic differentiation). Fst values were calculated using the function gl.fst.pop from dartR v2.7.2 [30], based on all available allele frequencies for each family and the broodfish. After several tests run by LGC Genomics (Teddington, UK), SNP markers that were successfully amplified by SeqSNP were used for the final panel.
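To illustrate how a per-marker differentiation score can be used to rank SNPs, the sketch below computes a simple heterozygosity-based Fst, (H_T − H_S)/H_T, from the allele frequency of one SNP in each group. This is a simplified, equal-weight estimator chosen for clarity and is not claimed to be the one implemented in dartR; the function name and example frequencies are hypothetical.

```python
def fst_per_snp(pop_freqs):
    """Simplified Fst = (H_T - H_S) / H_T for one biallelic SNP.

    pop_freqs: allele frequency of the SNP in each population/family.
    Populations are weighted equally (an assumption of this sketch).
    """
    p_bar = sum(pop_freqs) / len(pop_freqs)
    h_total = 2 * p_bar * (1 - p_bar)            # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                               # SNP fixed everywhere
    h_within = sum(2 * p * (1 - p) for p in pop_freqs) / len(pop_freqs)
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies for three SNPs across two families.
fst_values = {
    "snp_same": fst_per_snp([0.5, 0.5]),
    "snp_divergent": fst_per_snp([0.2, 0.8]),
    "snp_fixed_diff": fst_per_snp([0.0, 1.0]),
}
# Rank markers from most to least differentiated.
ranked = sorted(fst_values, key=fst_values.get, reverse=True)
```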

Panel validation
The usefulness of the SNP panel was validated to confirm the association of the selected markers with the analysed traits. For this purpose, additional members of the four families previously used in the genome-wide association study (GWAS) analyses, as well as parents, were genotyped, selecting the rest of the "big" and "small" individuals from each family (398 samples [~100 samples from each family], as well as a further 59 broodfish and 20 duplicated samples for control; Supplementary Table S2). In total, tissue samples from 477 fish were shipped and genotyped by LGC Genomics (Teddington, UK).

Library sequencing
High-throughput sequencing of 536 individuals produced 3,260,920,744 paired-end reads in total. After the removal of low-quality and incomplete reads, 78.9% of the total raw reads were retained (2,571,378,028 PE-reads; Supplementary Table S1). The C. lumpus genome was used to map the reads and generate ddRAD-tags. A total of 3,048,066 unique loci were detected, with 477,421 shared by at least two samples.

SNP identification and quality control
From the 477,421 SNPs initially identified between the two groups (36 parents and 500 offspring), the filtering process left 35 parents with 19,227 SNPs passing the threshold, and 499 offspring with 8186 SNPs, as shown in Table 2. A total of 7193 common informative markers were identified (covering the remaining 534 individuals) and used in subsequent analyses (Supplementary Data S1 and Supplementary Table S3).

Sample structure
A Multidimensional Scaling Analysis (IBS) was used to capture the complex structure of the samples and separate the individuals into clusters based on their genetic distance [31]. This process grouped individuals of the same origin (families) together, while positioning prior family assignment errors or poor-quality samples as outliers (Fig. 1). Five distinct clusters were separated using the first two components (67.3% of cumulative variance). Families and the parental/wild generation were clearly clustered. There was one exception: individual 11Q-212 did not cluster with any of the families, most likely due to wrong assignment during sampling or handling issues during the transfer into family tanks.

Association analysis
Making use of the 7193 QC-filtered and informative SNP markers, R/SNPassoc and R/qtl2 were used to conduct a QTL/GWAS analysis for both sex and morphometric measurements. Genome-wide association was detected for both sex and morphometric measurements after Bonferroni correction (Supplementary Table S3). A single major QTL (57 SNPs) was identified for sex determination (Fig. 2A), whereas a total of nine QTL regions (120 SNPs) were detected for weight (Fig. 2B), and three QTL regions (23 SNPs) for standard length (Fig. 2C). On the other hand, no significant association was detected when using Fulton's condition factor as a trait (Fig. 2D). All SNP markers associated with standard length were also significantly associated with weight (Fig. and Table 3). The SNP-based heritability estimates for all traits showed high values, with an h² of 0.57 (±0.04) for length, 0.49 (±0.06) for condition factor and 0.67 (±0.04) for weight; sex showed a heritability of 1.

Prediction and validation
The combined prediction power of these 177 unique SNP markers (from the sex and weight QTL) was tested by building machine-learning models and using an instance-based k-nearest neighbours algorithm [32] based on the additive effect of the genotypes at each locus considered. Using these 177 SNPs provided 99% accuracy for the selection of a desired phenotype (Table 4). The SNP markers defining the QTLs for weight and standard length were further investigated to provide a small subset of markers fit for a quick SNP assay. This approach produced a robust combined subset of 29 SNP markers (10 associated with sex, 14 with weight and 18 with standard length, with weight and standard-length markers overlapping; Supplementary Table S4). When applied to all individuals, the combined prediction power remained over 90% (Table 4).

Low density SNP panel
A total of 177 SNP markers of interest (from QTL regions) and top Fst SNP markers were combined in a low-density panel to test its usefulness in quantifying and maintaining genetic diversity within the tested population, along with potential use for selection purposes in the future. The inclusion of the high-Fst SNPs also allows for detecting local adaptation, and merits inclusion in future genome scans and methodologies. The final panel of 576 markers that successfully delivered informative genotypes was selected (Supplementary Data S2) and evaluated on the previously mentioned 477 samples, showing its usefulness in providing genotypes that can be used for selection purposes and family structure (Fig. 4), although it lacked enough resolution to determine population structure.

Importance of generating genomic resources for emerging species
The development of comprehensive genomic resources for important and emerging aquaculture species is crucial for gaining deeper insights into their biological characteristics, optimising breeding programmes, and enhancing selective breeding for desirable traits [12,15]. Genomic tools can be particularly advantageous for emerging species such as lumpfish, as they can expedite and improve the accuracy of the selection process for important traits, in addition to supporting the establishment of breeding programmes.
In this study, genomic markers were developed for a lumpfish stock, with 7193 informative SNPs being identified following a thorough QC filtering process. This is a significant achievement and represents a valuable starting point for future genomic research on the species. Genomic resources, such as DNA markers, have become an essential component of successful aquaculture production. As a result, many relevant aquaculture species have been targeted for SNP marker development via genotyping-by-sequencing (GBS) or whole genome sequencing [15,16].
The development of SNP markers has been extensively researched in Atlantic salmon over the last two decades, starting with the screening of a few hundred SNPs and progressing to the development and application of numerous SNP arrays containing up to 900K markers [33][34][35][36][37]. Sea bream and sea bass are additional examples of species with successful genomic marker development, transitioning from GBS-based SNP identification to the development of medium-density SNP arrays [38][39][40]. Low-density SNP panels have emerged as a cost-effective alternative, providing access to genotyping resources at a fraction of the cost of high-density panels, especially when dealing with hundreds or thousands of animals. These panels have been successfully described for numerous aquaculture species and have demonstrated their application in genomic selection through accurate genomic relationship matrices (GRM) or imputation into high-density panels [41].
This progress exemplifies how commercial interest and production needs can positively stimulate research advancement in aquaculture species, which could serve as a model for lumpfish research given the current demand for cleaner fish.

GWAS on growth and sex in other species
The increased accessibility of sequencing technologies has made GWAS (and QTL) analyses involving thousands of markers the norm for studying aquaculture and livestock species. This has enabled the identification of significant associations between genomic markers and particular phenotypes, such as growth, sex, disease resistance, and colour, among others, which is a fundamental step towards the selective improvement of stocks. Aquaculture species have been extensively researched for the identification of QTL regions and markers associated with important traits [42]. The present study aimed to identify genomic regions associated with the analysed traits, and was successful in this regard (Table 3). The analysis of sex showed the strongest association, with a single major QTL located on chromosome 13 (Fig. 2A and Fig. 3) being identified. This result is in line with what was recently reported for another lumpfish stock, where chromosome 13 was also identified as the sex chromosome, and the Amh gene was suggested as the sex determining gene [17]. The Amh gene, short for "Anti-Müllerian hormone" gene, plays a significant role in sex determination and differentiation in various vertebrate species, including several species of fish, either as a primary determinant, a secondary influencer, or as part of a broader network of genes and molecular pathways. However, the QTL peak position does not exactly match the position of the Amh gene in the lumpfish genome. This could be due to several reasons; most notably, the previous study used a 70K SNP array for lumpfish, whereas our analyses were based on 7193 SNPs. Therefore, even though our results agree on the chromosome location, the lower marker density did not provide enough resolution to identify the specific location of the candidate gene. Nevertheless, a set of 10 SNPs located within this region accurately predicted sex in all samples, giving evidence that the sex determining locus is shared between populations.
Growth rate is a significant trait for improvement in newly domesticated species, and it has been extensively studied in most aquaculture species [42]. Analysis of weight and length in lumpfish showed polygenic involvement, identifying significant associations across many chromosomes, and showing evidence of overlap in QTL regions in chromosomes 5, 14, 19, and 22 (Fig. 2B-C, Table 3, and Fig. 3). In contrast, the analysis of condition factor (K) did not identify significant associations, most likely due to the round morphology of lumpfish, which makes this index uninformative for this species [43]. The polygenic nature of growth traits is not surprising, as most aquaculture species show this pattern. Contrary to most reported aquaculture goals, where growth rate QTLs have been largely exploited to increase growth rate, selection for slower growth rate and longer deployment time may be feasible for lumpfish. Grazing efficacy has been negatively correlated with the size of lumpfish and linked to parental/family effects, suggesting that the genetic component can play a significant role in improving growth and grazing [44]. Accordingly, the heritability estimates obtained (0.57, 0.49, 0.67 and 1 for length, condition factor, weight and sex respectively) are in accordance with values estimated for growth traits in other aquaculture species [12]. Therefore, improvement in growth rate (for better grazing efficiency) can be anticipated, considering that the genetic component plays an important role in the expression of these traits.
The markers identified in this study show promise for the selection of slow-growing fish using a low number of markers (Table 4), and therefore, they have the potential to improve the grazing efficiency of selected stocks.

Usefulness of findings. MAS and genomic selection applications
The QTL markers identified in this study have great potential to significantly enhance the analysed traits, particularly growth, which has shown average genetic gains of over 10% per generation in some aquaculture species [45]. Selection to obtain the opposite outcome (slower growth) should be possible at similar rates, particularly with the introduction of genomic resources into the selection process. Our analysis of 177 markers associated with the traits revealed 99% accuracy in predicting the animal's phenotype, and a selection of only 29 SNPs achieved similar accuracy, thus opening the possibility of using low-density SNP panels, such as the one described in this study, to provide practical genomic resources at a lower cost without sacrificing selection power.
The results of this study demonstrate that a panel of 576 markers can determine family structure and accurately predict slow growth phenotypes, and sex-associated markers can accurately distinguish the sex of

Table 3
QTL detected in this study and the genomic regions harbouring them. The peaks and confidence intervals (CI) can be visualised in Fig. 2, while the QTLs are reported in Fig. 3. *Small peaks not reported in Fig. 3.

Fig. 2. Markers associated with phenotypes. (A) Manhattan plot of the association with phenotypic sex. (B) Manhattan plot of the association with fish weight. (C) Manhattan plot of the association with fish standard length. (D) Manhattan plot of the association with fish condition factor. The -log10(p-value) values for association of directly genotyped SNPs are plotted as a function of position on the physical map. Each chromosome is represented with a different colour.

Fig. 3. QTL map. Chromosomal locations of highlighted genomic regions for QTLs in this study, including standard length, fish weight, phenotypic sex and condition factor. The peak locations are marked with white circles.

Fig. 4. Multidimensional Scaling Analysis results of the validation panel (477 samples) dataset. The first and second components explain 31.7% and 29.1% of the variation found. Based on 576 SNP markers.
Fig. 1. Multidimensional Scaling Analysis results of the full dataset. The first and second components explain 34.6% and 32.7% of the variation found. Based on 7193 SNP markers. The black dots clustering with each family are the two broodstock/wild parents. Families 7bQ and 10Q are more clearly separated in component 3 (not shown in this figure).
A.P. Gutierrez et al.

Table 4
Details of the phenotypic variation explained and prediction accuracy for the full SNP dataset and reduced subset. For each trait tested, the subset of SNPs is reported between brackets. The marker subsets overlap. Sex is a binary character, for which correct prediction is provided; weight and standard length are continuous variables, for which precision (correlation) is specified. The subset list is provided in Supplementary Table S4 (29 unique SNP markers).