Chromosome-level assembly reveals a putative Y-autosomal fusion in the sex determination system of the Greenland Halibut ( Reinhardtius hippoglossoides )

Despite the commercial importance of Greenland Halibut ( Reinhardtius hippoglossoides) , important gaps still persist in our knowledge of this species, including its reproductive biology and sex determination mechanism. Here, we combined single-molecule sequencing of long reads (Paciﬁc Sciences) with chromatin conformation capture sequencing (Hi-C) data to assemble the ﬁrst chromosome-level reference genome for this species. The high-quality assembly encompassed more than 598 Megabases (Mb) assigned to 1594 scaffolds (scaffold N50 ¼ 25Mb) with 96% of its total length distributed among 24 chromosomes. Investigation of the syntenic relationship with other economically important ﬂatﬁsh species revealed a high conservation of synteny blocks among members of this phylogenetic clade. Sex determination analysis revealed that similar to other teleost ﬁshes, ﬂatﬁshes also exhibit a high level of plasticity and turnover in sex determination mechanisms. A low-coverage whole-genome sequence analysis of 198 individuals revealed that Greenland Halibut possesses a male heterogametic XY system and several putative candidate genes implied in the sex determination of this species. Our study also suggests for the ﬁrst time in ﬂatﬁshes that a putative Y-autosomal fusion could be associated with a reduction of recombination typical of the early steps of sex chromosome evolution.


Introduction
The mechanisms underlying sex determination (SD) are known to affect developmental processes, the evolution of genomes, and are responsible for determining sex ratio, a key demographic parameter for population viability and stability (Bull 1983;Uller et al. 2007).In recent years, genetic analyses in nonmodel species have revealed a greater degree of sex chromosome diversity and turnover than previously appreciated (Abbott et al. 2016).In fact, many case studies in different taxonomic groups, including lizards, fish, amphibians, insects, and plants show frequent changes in the location of SD genes and high rates of turnover of sex chromosomes, suggesting that exceptions are the rules in sex determination (Yano et al. 2013;McGrath 2020, for review).Therefore, a more cyclical than linear conceptual framework could be used to explain the evolution of sex chromosomes (Bachtrog 2006;Furman et al. 2020).In such a cycle, a new master sex-determining locus arises on an autosome, leading to sex chromosome formation, generally characterized by recombination suppression between the Y and X, or the Z and W chromosomes.As under a classic sex chromosome evolution model, this situation may progress toward increased divergence between male and female chromosomes, leading to chromosome heteromorphism characterized by genomic degradation, including rearrangements and loss of genetic material (Charlesworth et al. 2005;Bachtrog 2013).However, sex chromosomes and sexdetermining regions can also undergo turnover with either the evolution of a new sex-determining gene or the transposition of the sex-determining gene to a new location in the genome, leading to another cycle of sex chromosome evolution (Bachtrog 2006).The increasing availability of high-quality genome assemblies in nonmodel taxonomic groups is now providing exciting opportunities to revisit the evolution of genetic SD (Whibley et al. 2021).
Fish exhibit a remarkable plasticity in gonad development and a high evolutionary SD turnover, making them a particularly relevant group to investigate the evolutionary dynamics of sex chromosomes (Kitano and Peichel 2012).Fish display highly diverse chromosome SD systems, including models analogous to the contiguous reference genomes annotated for functional elements and documented for SD.
Greenland Halibut (Reinhardtius hippoglossoides) is one of the main exploited demersal fish in Eastern Canada (Gulf of St. Lawrence, Newfoundland, and the Arctic), representing around 4000 tons of landings per year across the two last decades (DFO 2018).However, the management of Greenland Halibut fisheries would benefit from improved genomic resources.Indeed, no karyotype, linkage map nor genome assembly has been developed for this species so far.Given that juvenile growth potential and size are sexually dimorphic in this species (Ghinter et al. 2019), it is valuable to better understand the sex-determining mechanisms to control sex imbalance in catches.Moreover, the recent genome assemblies of other flatfish species (Figure 1, Table 1; Chen et al. 2014;Figueras et al. 2016;Robledo et al. 2017;Shao et al. 2017;Einfeldt et al. 2021) together with studies on sexdetermining loci in this taxon (Chen et al. 2014;Drinan et al. 2018;Liang et al. 2018;Einfeldt et al. 2021) provide an opportunity to investigate the evolutionary turnover of sex chromosomes and SD systems.Flatfishes studied so far are characterized by a variable heterogametic system and different sex-determining genes.For instance, a similar female heterogametic system (ZZ/ZW) is found both in the Tongue Sole (Cynoglossus semilaevis) and the Turbot but the candidate SD drivers differ, being sox2 in the latter (Martinez et al. 2021; Table 1), and dmrt1 in the former species (Chen et al. 2014).Japanese Flounder, Paralichthys olivaceus (also called the Olive Flounder) has an XX/XY system (Liang et al. 2018) but no sex determining gene has been found (Shao et al. 2017) and the environment is known to alter sex differentiation in this species (Ospina-Alvarez and Piferrer 2008).Within the righteye flounder family (Pleuronectidae), the Hippoglossinae subfamiliy harbors variable genetic SD systems.The Pacific Halibut (Hippoglossus stenolepis) has a ZZ/ZW system and several sexlinked loci aligned in three different linkage groups (Drinan et al. 2018) while its sister-species, the Atlantic Halibut (H.hippoglossus), has a XX/XY system and a putative SD factor called gsdf on chromosome 12 (Palaiokostas et al. 2013;Einfeldt et al. 2021).
In this study, we used an array of genomic technologies to provide high-quality genomic resources and to investigate the genetic basis of sex determination in Greenland Halibut.We produced the first chromosome-level genome assembly for this species by combining single-molecule sequencing of long reads (Pacific Sciences) with chromatin conformation capture sequencing (Hi-C) data.Furthermore, we also identified and characterized the molecular SD system by mapping whole genome sequencing data of 198 individuals onto the assembly to screen for genetic markers that diverged between sexes.Finally, in order to provide an estimate of the extent of rearrangements between species, we analyzed our results in a comparative genomic framework by assessing synteny (i.e., blocks of genes found in the same order in the genome) with other species of flatfish and by targeting candidate sex-determining genes in this group.

Reference genome assembly
We built a reference genome assembly for the Greenland Halibut in two steps.First, we created a draft assembly using the PacBio long-reads sequencing approach (Pacific Biosciences, Menlo Park, CA, USA) and we then applied Dovetail TM Hi-C (Lieberman-Aiden et al. 2009), from Dovetail Genomics to increase contiguity, break up mis-joins, and orient and join scaffolds into chromosomes.

Sampling
The individual whose DNA was used to assemble the reference was a juvenile female caught in the St. Lawrence River Estuary (48 39 0 11 00 N, 68 28 0 37 00 W) with a Commando-type bottom trawl aboard the CCGS Leim [Fisheries and Oceans Canada (DFO) survey] at the end of May 2016 and kept alive in controlled environmental conditions at the Maurice Lamontagne Institute (IML), Department of Fisheries and Oceans Canada (Ghinter et al. 2019).The female was anesthetized for 5 min in a 0.18 g L À1 MS-222 solution (tricaine methane sulfonate; Sigma -Aldrich Co, MI, USA).Blood was then sampled from the caudal artery using a 23-gauge needle in a 1 ml TB syringe (Becton, Dickinson & Co., NJ, USA), both of which were previously heparinized in a 100 U ml À1 heparin solution.Blood was immediately flash-frozen in liquid nitrogen and then stored at À80 C until DNA extraction.Multiple tissues including liver, muscle, heart, brain, eye, gonads, gall bladder, and kidney were then excised from the fish, flash-frozen, and stored at À80 C until further transcriptomic analyses.

DNA extraction, sequencing, and assembly
High molecular weight DNA was extracted from flash-frozen blood using the Qiagen MagAttract kit, (Valencia, CA, USA).Then a library was created using the SMRTbell Template Prep kit 1.0 (Pacific Biosciences, Menlo Park, CA, USA), which incorporates size selection and DNA damage repair steps.We obtained fragments of >10 kb using size selection with a Blue Pippin instrument (Sage Science, Beverly, MA, USA) according to the manufacturer's recommended protocol for 20 kb template preparation.Five micrograms of concentrated DNA was then used as input for the library preparation reaction.Library quality and quantity were assessed using the Pippin Pulse field inversion gel electrophoresis system (Sage Science), as well as with the dsDNA Broad Range Assay kit and Qubit Fluorometer (Thermo Fisher).Single-Molecule Real Time sequencing was performed on the Pacific Biosciences RS II instrument at McGill University using an on-plate concentration of 125 pM, P6-C4 sequencing chemistry, with magnetic bead loading and 360-min movies.A total of 14 SMRT Cells were run, generating a mean sequence coverage of Figure 1 Phylogenetic relationships and syntenic distances between important flatfish species in fisheries and aquaculture.Phylogenetic trees were retraced according to previous studies, Betancur-R and Orti (2014) for the top one and Vinnikov et al. (2018) for the zoom into the sub-family Hippoglossinae framed in red.Only the names of relevant species compared in our studies are reported as well as the SD system, either "XY" for a male heterogametic system (XX/XY) or "ZW" for a female heterogametic system (ZZ/ZW).The XY system for Greenland Halibut has been assessed in the present study.Circos plots showing synteny between the Greenland Halibut (displayed in low half circle) and the Tongue Sole, Turbot, Japenese Flounder, Atlantic, and Pacific Halibut (displayed in upper half circle).Sex chrosomosomes or the chromosomes in which sex-related markers have been described in the literature have been increased in front size and adequately colored according to the relative species.See Table 1 and its caption for references about other genome assemblies and SD systems in related flatfish species.The same color code as the one displayed in Supplementary Figure S2 is used.
$70-fold, with half of the sequences contained in reads longer than 10.6 kb.An initial assembly was carried out on the PacBio long reads using Wtdbg v2.5 (-x sq gg 0.7 g; Ruan and Li 2020), resulting in 2037 contigs (scaffold N50 ¼ 3.17 Mb), for a total assembly size of 598.5 Mb.One Hi-C proximity ligation library was generated from a flash-frozen piece of muscle of the same individual using the Dovetail's TM Hi-C Kit v.1.0.This kit employs the restriction enzyme DpnII which recognizes the sequence 5'/GATC 3'.The long-range library was then sent to Dovetail Genomics to be sequenced and run with the initial PacBio assembly through Dovetail's HiRiseTM scaffolding pipeline.The completeness of the final assembly was assessed with BUSCO v 4.0.5 (Simão et al. 2015), the eukaryota_odb10 database (created in October 2020) and the -m genome option.The occurrence of repeat elements in our assembly as well as in Atlantic and Pacific Halibut was assessed using RepeatMasker v.4.0.6 (Smit et al. 2013(Smit et al. -2015)).

Synteny with other flatfish species
Synteny blocks between the genomes of Greenland Halibut and other flatfish species for which a reference genome is available (Tongue Sole, Turbot, and Japanese Flounder as well as Atlantic and Pacific Halibut) were computed using SyMAP v4.2 (Soderlund et al. 2011).Genomic sequences were first aligned using promer/ MUMmer (Kurtz et al. 2004).Raw anchors resulting from MUMmer were clustered into (putative) gene anchors, filtered using a reciprocal top-2 filter and used as input to the synteny algorithm (Soderlund et al. 2006).The algorithm constructs maximalscoring anchor chains based on a given gap penalty.It also searches a range of gap penalties to generate the longest chains subject to several quality criteria, which are based on the Pearson correlation coefficient applied to the anchors in the chain as well as the anchors in its bounding box.The chains are not required to be entirely colinear and may incorporate local inversions relative to the overall chain orientation.Results stemming from the synteny analysis were visualized using the chromosome explorer and the circle view options in SyMAP v4.2.Finally, we named scaffolds by mapping our assembly to the Pacific Halibut assembly (IPHC_HiSten_1.0,GenBank accession: GCA_013339905.1).

Transcriptome assembly and annotation of the assembly
In order to annotate the reference assembly, total RNA from eight different tissues (liver, muscle, heart, brain, eye, gonads, gall bladder, and kidney) from the same individual was extracted using the RNeasy Mini kit from Qiagen then treated using DNase I, Amplification Grade (1 unit per ll; Invitrogen, Carlsbad, CA, USA) following the manufacturer's protocol.RNA concentration of each tissue was quantified with Quant-iT TM RiboGreen V R RNA Assay Kit (Invitrogen) and fragment size distribution was estimated with an Agilent BioAnalyzer before being shipped to the Ge ´nome Que ´bec Centre d'Expertise et de Services (Montre ´al, Canada) to prepare libraries and perform sequencing.One RNAseq library was constructed for each tissue using the NEBNext V R Ultra TM II Directional RNA Library Prep Kit (Illumina, San Diego, CA, USA) and sequenced with a 2 Â 100bp (paired-end) read module using the 6000, generating 50M reads.
no gene has been specifically identified ( 2) Homolog of sex chromosome in Greenland Halibut -GH06, GH12, G20 Toward this end, we performed the low coverage whole genome sequencing approach proposed by Therkildsen and Palumbi (2017).Briefly, genomic DNA was extracted from a finclip using a salt-extraction protocol (Aljanabi and Martinez 1997) with a RNase A treatment (Qiagen).DNA quality of each extract was evaluated with Nanodrop 2000 (ThermoFisher scientific) and migration on a 1% agarose gel electrophoresis.Following Therkildsen and Palumbi (2017), we removed DNA fragments shorter than 1 kb by treating each extract with Axygen magnetic beads in a 0.4:1 ratio, and eluted the DNA in 10 mM Tris-Cl, pH 8.5.We measured DNA concentrations with the Accuclear ultra high sensitivity dsDNA quantification kit (Biotium) and normalized all samples at a concentration of 5 ng/ll.Then, sample DNA extracts were randomized, distributed in plates (96 -well see details about randomization and number of plates used in Supplementary Material) and re-normalized at 2 ng/ll.
Whole-genome high-quality libraries were prepared for each sample according to the protocol described in (Baym et al. 2015;Therkildsen and Palumbi 2017).Briefly, a tagmentation reaction using enzyme from the Nextera DNA sample preparation kit (Illumina), which simultaneously fragments the DNA and incorporates partial adapters, was carried out in a 2.5 ll volume with approximately 2 ng of input DNA.Then, we used a two-step PCR procedure with a total of 12 cycles (8 þ 4) to add the remaining Illumina adapter sequence with dual index barcodes and to amplify the libraries.The PCR was conducted with the KAPA Library Amplification Kit and custom primers derived from Nextera XT barcode sets A, B, C, and D (total of 384 possible combinations).Amplification products were purified from primers and sizeselected with a two-steps Axygen magnetic beads cleaning protocol, first with a ratio 0.5:1, keeping the supernatant (medium and short DNA fragments), second with a ratio 0.75:1, keeping the beads (medium fragments).Final concentrations of the libraries were quantified with the Accuclear ultra high sensitivity dsDNA quantification kit (Biotium) and fragment size distribution was estimated with an Agilent BioAnalyzer for a subset of 10-20 samples per plate.Finally, libraries were pooled (see details in Supplementary Material) for sequence lanes of paired-end 150 bp reads on an Illumina HiSeq 4000 at the Norwegian Sequencing Center at the University of Oslo.Given our multiplexing (up to 96 individuals per lane) and genome size ($ 600 Mb), we targeted a low sequencing coverage around 1.26 X.

Sequencing filtering and processing
Raw reads were trimmed, filtered for quality, mapped to the reference genome (obtained above), cleaned for duplicate reads and mapping quality and then re-aligned using the pipeline available at https://github.com/enormandeau/wgs_sample_preparation,inspired by Therkildsen and Palumbi (2017) and fully described in Me ´rot et al. (2021).Given the high percentage of the assembly comprised into the anchored 24 chromosomes (>96%, see results section), raw reads were mapped against a reduced version of the genome including only the 24 chromosomes and excluding the unassembled scaffolds.A preliminary analysis, in which individual reads were mapped to the full genome did not change the outcome of the results.
For low coverage whole genome sequencing (lcWGS) data, the recommended practice is to avoid basing downstream analysis on called genotypes (Nielsen et al. 2011) and to instead use a probabilistic approach based on Genotype Likelihoods.Several models for computing genotype-likelihood-based on read data have been implemented in the program ANGSD (Korneliussen et al. 2014) and it is currently the most widely used and versatile software package for the analysis of lcWGS (Lou et al. 2021).Therefore, ANGSD v0.931 was used for most of our subsequent analyses according to the pipeline documentation available at https:// github.com/clairemerot/angsd_pipeline.For all analyses, input reads were filtered to remove reads with a samtools flag above 255 (not primary, failure and duplicate reads, tag -remove_bads ¼ 1), with mapping quality below 30 (-minMapQ 30) and to remove bases with quality below 20 (-minQ 20).We also filtered in order to keep only SNPs covered by at least one read in at least 30% of individuals (-minInd 60) and remove SNPs in putative repeated regions allowing a maximum depth of 3 times the number of individuals (-setMaxDepth 594).Finally, for most of the subsequent analyses (unless mentioned otherwise) we kept SNPs with minor allele frequency above 5%.
First, we ran ANGSD to estimate genotype likelihoods (GL) with the GATK model (-doGlf 2 -GL 2 -doCounts 1), the spectrum of allele frequency (-doSaf 1) and the minor allele frequency (-doMaf 1) options.The major allele was based on the genotype likelihood and was the most frequent allele (-doMajorMinor 1).From this first analysis, we generated (1) a beagle file with GL estimates and (2) a list of variants passing those filters and their respective major and minor alleles that were used for most subsequent analysis.The R program (R Core Team 2020) was employed for graphic output in subsequent analyses, via the package ggplot (Wickham 2016).

Genome-wide coverage difference between sexes
Total depth along the entire genome was calculated using the function -doDepth 1 in ANGSD, excluding high coverage (-setMaxDepth 594 and -maxDepth 1000) and low coverage regions (-minInd 60), along with the following arguments for depth: -dumpCounts 2 -doCounts 1. Depth was then averaged across sliding windows of 25 Kb with 5 Kb steps using the winScan function of the R package windowscanr (Tavares 2021).The sliding approach allowed us to visualize individual coverage along the genome, while a ratio of total depth of females above total depth of males was also plotted using ggplot.This ratio is expected to deviate from 1 in genomic regions in which a sex-bias in coverage exists.

Genome wide variation between sexes
Genome-wide variation across samples was explored using PCAangsd (Meisner and Albrechtsen 2018) on the genotype likelihoods.This program extracts a covariance matrix that is then decomposed into the principal component analysis (PCA) with R, using a scaling 2 transformation adding an eigenvalues correction, to obtain the individuals PC scores (Legendre and Legendre 1998).

Identifying genomic regions implied in sex determination
In order to identify the genomic regions involved in sex differentiation, two analyses were performed.First, we ran local PCAs (Li and Ralph 2018) with PCAangsd on genotype likelihoods in nonoverlapping windows of 1000 SNPs in each chromosome to extract local covariance matrices and obtained local PCAs of genomic variation (as detailed above).For each local PCA, we analyzed the correlation between the PC1 score and PC scores from the global PCA performed on the entire genome.The resulting correlation indices were then plotted across their relative position in each chromosome.Second, genome-wide Fst was estimated between males and females.To do this, allele frequency spectrum (-doSaf 1) and minor allele frequencies were calculated for each sex with the previous list of variant positions (-sites) and their polarization as major or minor alleles (-doMajorMinor 3).Genome-wide Fst was estimated using the realSFS function in ANGSD between males and females and then summarized across sliding-windows of 25 Kb with a step of 5 Kb.Observed heterozygosity was computed for each sex and each SNP using the function -doHWE in ANGSD and then averaged across sliding-windows of 25 Kb with a step of 5 Kb using the winScan function of the R package windowscanr.

LD estimation
For the two chromosomes associated with sex determination (see results below), intra-chromosomal as well as inter-chromosomal linkage disequilibrium (LD) was calculated within each sex for a reduced number of SNPs, filtered with more stringent criteria (MAF > 10%, at least one read in 75% of the samples).Pairwise R 2 were calculated using ngsLD (Fox et al. 2019) based on genotype likelihood reported in the beagle file obtained previously.The 10th percentile of pairwise R 2 values was averaged across windows of 250 Kb and plotted for each sex using custom scripts available at https://github.com/clairemerot/angsd_pipeline.Then changes in LD along the sex identified chromosomes were analyzed by calculating mean R 2 for males and females separately.

Mapping known sex locus
To localize orthologous genes of previously identified sexassociated genes in our assembly, we aligned all sex-associated markers from Drinan et al. (2018) as well as others previously described to have a SD effect in flatfish and some other teleost species, to our reference assembly using LAST v.1179 (Kiełbasa et al. 2011; see Supplementary Table S1 for details about those genes and their accession number).

Chromosome-level reference genome assembly
We generated a final genome assembly of 598.5 Mb with high contiguity (contig N50 ¼ 3.1 Mb, scaffold N50 ¼ 25 Mb, scaffold N90 ¼ 20 Mb), of which 96.35% is comprised in 24 chromosome-length scaffolds (GenBank accession: GCA_006182925.3).These 24 scaffolds are consistent with the number of diploid chromosomes (2n ¼ 48) for Atlantic Halibut (Einfeldt et al. 2021) and Pacific Halibut (GCA_013339905.1).The Hi-C data supported a high degree of accuracy in the overall assembly into these 24 scaffolds, as indicated by the strong concentration of data points along the diagonal rather than elsewhere in the contact maps (Supplementary Figure S1).Altogether, this suggests that these 24 scaffolds can be considered as chromosomes.The completeness of the assembly based on single-copy orthologs across eukaryotes was 98.04%, with 96.47% complete and single copy orthologs and 1.57% duplication.
There were 60,313 transcripts detected from multi-tissue transcriptomics data, with 97.7% of them (58,908 transcripts) within the 24 anchored chromosomes.A total of 19,252 genes and pseudogenes were annotated, of which 19,053 (99%) were located within the 24 chromosomes (Supplementary Table S2).Repeated elements accounted for 8.20% of chromosome lengths (49.1 Mb).These measures of completeness and assessments of repeat elements in Greenland Halibut were highly concordant with the statistics obtained from Atlantic and Pacific Halibut as well as the three other flatfish species with reference genome and important for fisheries and/or aquaculture production (Table 1).SyMap revealed shared syntenic blocks with other flatfish species (Table 2 and Figure 1).Most of the chromosomes harbor a single syntenic block compared with the most closely related Hippoglossinae, the Pacific Halibut (16 out of 24 chromosomes), and the Atlantic Halibut (15/24).In contrast, mapping with the three other flatfish species displayed several syntenic blocks for all chromosomes (Table 2).As a result, syntenic blocks were larger with Pacific and Atlantic Halibut than with the three other flatfishes (Table 2 and Supplementary Figure S2).Supplementary Figure S2 displays this negative relationship between the size of syntenic blocks and phylogenetic distance.
By generating this chromosome-level genome assembly for Greenland Halibut, among the most contiguous of flatfish assemblies to date, we provide a genomic resource for this unique species of the genus Reinhardtius.This reference genome will now become a fundamental tool to understand the genomic variability of Greenland Halibut in the context of comparative genomics and fisheries applications.

XY sex determination system with a putative autosomal-Y fusion
After mapping the reads of 99 male and 99 female Greenland Halibut samples to the reference genome, cleaning for quality and processing for SNP identification with genotype likelihoods, we identified 7,301,470 single nucleotide polymorphisms (SNPs) with a minor allelic frequency (MAF) above 5%.These SNPs were then used for further analyses pertaining to sex differentiation.
Genome-wide variation analyzed by a global PCA displayed two clusters corresponding to males and females on PC1, which explained 5.7% of the genetic variance (Supplementary Figure S4).Local PCAs on windows of 1000 SNPs along the genome revealed that this sex differentiation was mainly explained by genetic variation on two chromosomes (Chr10 and Chr21, Figure 2A).The correlation between the 1st PC of the global PCA and each local PCA was very low at the very beginning of each of those two chromosomes but increased rapidly along both chromosomes.A large portion of each of these two chromosomes explains the total genetic variation observed between the two sexes in PC1.Particularly, the last 66.5% of Chr10 (16.5 Mb out of 24.8 Mb) and 52.2% of Chr21 (10.5 Mb out of 20.1 Mb) explained more than 90% of the sex differentiation observed on global PC1 (Figure 2B).Genome-wide Fst estimations between males and females corroborate this result revealing high differentiation on Chr10 and Chr21 (Figure 2C).While the mean genome-wide Fst between the sexes is as low as 0.01, Chr10 and Chr21 harbored a large region with higher Fst values with a mean Fst of 0.21 and 0.15, respectively (Figure 2D).Few localized peaks of sex differentiation were also observed on Chr9, Chr11, Chr12, Chr13, and Chr15 (Figure 2B).
The observed proportion of heterozygotes followed the same pattern, with similar values between males and females across most of the genome (Figure 2E), but with a much higher fraction of heterozygotes in males than females in the two major regions previously described.This difference between males and females was more pronounced in Chr10 than in Chr21.Both Chr10 and Chr21 were also characterized by strong LD, which was markedly higher in males than in females (Figure 3), suggesting lower recombination rate in males than females for these genomic regions.
In contrast with differentiation and heterozygosity, coverage did not differ between males and females, neither genome-wide nor in regions of sex differentiation (Supplementary Figure S5), suggesting that Greenland Halibut is not harboring sex chromosome heteromorphism.Moreover, the putative sex chromosomes Chr10 and Chr21 have apparently not undergone degradation, which could be either typical of early SD systems or the result of an older system that has sex chromosome turnover as found in other fish species (see below).
Altogether, these results point toward a major role of Chr10 and Chr21 in genetic sex determination.The higher heterozygosity and LD in males suggest that Greenland Halibut is characterized by a XX/XY sex determination mechanism.The putative male Y chromosome exhibits two characteristics of sex chromosome evolution: it is divergent from the X chromosome and exhibits a reduction of recombination.However, it does not  appear degraded, suggesting either very nascent sex chromosome at an early evolutionary stage or resulting from the high evolutionary lability of SD in teleost fish.This is corroborated by the syntenic relationships with closely related flatfish species which indicate that homologs of Chr10 and Chr21 are autosome chromosomes.Chr10 and Chr21 neither correspond to locations where sex-determining regions have been defined in other flatfish species (Figure 1, Table 1, and Supplementary Figure S3).Overall, the resolved sex-determining system in Greenland Halibut is highly consistent with the rapid turnover of sex chromosomes in fish, particularly in flatfish, given that the Atlantic Halibut has also undergone a recent evolution of an XY SD mechanism (Einfeldt et al. 2021).
Observing high sex differentiation and characteristics of sex chromosome evolution on two different chromosomes was unexpected and puzzling.The Hi-C contact map very strongly supported that Chr10 and Chr21 are distinct chromosomes in the specimen used for the assembly, a juvenile female (Supplementary Figure S1).Inter-chromosomal LD was also low in females, confirming the independent segregation of Chr10 and Chr21.Conversely, LD between Chr10 and Chr21 was very high in males (Figure 3), suggesting that the male haplotypes of those two chromosomes segregate together.One possible mechanism explaining such pattern would be a male-restricted fusion between the Y chromosome and one autosome.Chromosome fission and fusion events are frequent in the evolution of vertebrates, particularly in fish, in which they could explain the enormous species richness and rapid evolution of karyotypes (Cheng et al. 2020;Sutherland et al. 2017).Fusions may also be polymorphic in fish and play an adaptive role, as reported in Atlantic salmon (Salmo salar), in which fused and unfused karyotypes have been shown to coexist (Lehnert et al. 2019;Wellband et al. 2019).Moreover, fusions between a Y chromosome and an autosome are known to occur more frequently than autosomal fusions or W-autosome fusions, particularly in fishes (Pennell et al. 2015).The most probable explanation for the preponderance of Y-autosome fusions involves both selection and sex biases, as the smallest effective size (Ne) of the Y makes it more likely to fix rearrangements, including those with partially deleterious effects.This process may be even more prevalent in species with polygynous mating system like fish, which increases the variance of male reproductive success, as well as in species with skewed sex-ratios, such as the Greenland Halibut (Ghinter et al. 2019).Because of its significant impact on recombination, chromosomal fusion may be one of the key factors underlying the evolution of differentiated sex chromosomes.Fusions are a peculiar class of rearrangements that not only link chromosomes that previously segregated independently but also reduce recombination rates within each arm and prevent recombination between fused and unfused homologous chromosomes in polymorphic populations (Dumas and Britton-Davidian 2002).This typically leads to gradual differentiation from the fusion point as observed here for Greenland Halibut or for the young sex chromosomes found in three-spined stickleback (Gasterosteus aculeatus) (Kitano et al. 2009).This represents a typical step in the birth and establishment of new sex chromosomes (McAllyster 2003;Kirkpatrick 2010), often mediated via large chromosomal rearrangements such as inversions or fusions.One of the main determinants of this process is selection against recombination, because linkage between SD loci and alleles beneficial to one sex has the potential to resolve sexual conflict (Charlesworth et al. 2005).As discussed below, we identified three sex-associated loci distributed on two chromosomes and one may speculate whether epistatic interactions may occur between them and putative other candidates not identified in this study.Further work would be needed to better understand gene interactions in this region and more formally confirm the putative male-limited fusion (Guerrero and Kirkpatrick 2014;Wright et al. 2016).

Several gene candidate drivers of Greenland Halibut SD
Alignment of DNA or mRNA sequences of previously described sex-associated markers revealed that several putative gene candidate drivers of SD were distributed throughout Greenland Halibut assembly (on seven chromosomes, Supplementary Table S1).Most of those genes are located in regions of low differentiation between males and females: amhr2, cyp19a1A on chr05; cyp11b2 and rspo1 on chr08; dmrt1 on chr09, foxl2 on chr10 and gsdf on chr18, making them unlikely to underlie sex differentiation in Greenland Halibut.However, three candidate genes mapped to the putative sex chromosomes: gdf6 and sox2 on Chr10, and sox9a2 on Chr21 (Figure 2 and Supplementary Table S1) and are thus candidate drivers of SD in Greenland Halibut.
Growth differentiation factor 6 (Gdf6) is a member of the TGFb family involved in vertebral segmentation and cell differentiation.While no gonadal function was reported previously, gdf6Y emerged recently as a putative SD gene in the Turquoise Killifish (Nothobranchius furzeri) (Reichwald et al. 2015) because sex-linked allelic variations were found to affect the Gdf6 protein function.The Y chromosome allele, gdf6Y, also exhibits sex-linked expression during testicular differentiation in the Turquoise Killifish, supporting its SD role but functional evidence of its role as a SD gene is yet not available.
Sox2 is a member of the SOX (SRY-related HMG-box) family with a high-mobility-group (HMG) DNA-binding domain (Zhang and Cui 2014).It is an important transcription factor, which is involved in developmental processes of vertebrates, such as embryogenesis, maintenance of stem cells, and neurogenesis.In the Rohu Carp (Labeo rohita), sox2 mRNA is expressed in various organs, as well as in the culture proliferating spermatogonial stem cells (Patra et al. 2015).Sox2 is also a candidate SD gene in two aquaculture species, the mollusk Zhikong Scallop (Chlamys varia) (Liang et al. 2019) and the crustacean Black Tiger Shrimp (Penaeus monodon) (Guo et al. 2019).In flatfishes, sox2 is also a strong candidate for SD.For instance, in Japanese Flounder sox2 is expressed in gonadal tissues and the transcript abundance in ovaries is higher than in testis (Gao et al. 2014).In Turbot, sox2 is defined as a consistent candidate gene putatively driving SD (Martinez et al. 2021).
The Sox9 gene has been related to sexual differentiation in several teleost species: Rainbow Trout, (Oncorhynchus mykiss, Takamatsu et al. 1997;Vizziano et al. 2007) Zhou et al. 2003).In the Rainbow Trout, two sox9 genes have been identified: sox9a1 and sox9a2.Sox9a1 shows higher expression levels in males than in females before sexual differentiation (Vizziano et al. 2007).In Tilapia, sox9 expression is similar in XX and XY gonads, but, after sexual differentiation, it is positively regulated in XY gonads (Ijiri et al. 2008).In Medaka, testicular sox9 (sox9a2) expression in somatic cells is similar in both sexes during early gonadal differentiation and becomes positively regulated in males during the testicular lobe formation stage (Nakamoto et al. 2005).Overall, these observations support the hypothesis of a significant role for sox9 in either testicular differentiation or testicular development.This gene may also be performing a similar function in flatfish.In Atlantic Halibut, sox9a2 is included in a cascade of gene activation starting with the dmrt1 gene which is activated by the putative SD factor, the gsdf gene (Norris and Carr 2013;Einfeldt et al. 2021).
In Greenland Halibut, sox2, gdf6, and sox9a2 represent three possible candidate genes for sex determination.One can think that these three genes may be interacting in shaping the determination of sex and/or the subsequent development of sex organs.In addition, it is not excluded that some genes located in the highest peaks of differentiation but currently not included in the search neither in the annotation, and thus not identified by our study, could also have an effect on sex differentiation in Greenland Halibut and/or interact with current identified candidates.
SD genes identified in the two most closely related flatfish species, Atlantic and Pacific Halibut (Table 1), do not appear to differentiate male and female Greenland Halibut.This is somewhat unsurprising given that even within the same genus (Hyppoglossus), not only do different species have different genes driving SD, but they also harbor a different heterogametic sex chromosome system (XY for Atlantic Halibut and ZW for Pacific Halibut, Drinan et al. 2018;Einfeldt et al. 2021).
The identified SD system with low differentiation between sex chromosomes revealed in Greenland Halibut might reflect the observed complex and still enigmatic sex maturation.For example, in some Greenland Halibut individuals the maturation cycle can be interrupted by a sudden degeneration of the oocytes (Fedorov 1971).More recently, it has been suggested that this species utilizes a very unusual oocyte development pattern containing two simultaneously developing cohorts of vitellogenic oocytes, not seen in any related species (Rideout et al. 2012;Dominguez-Petit et al. 2018).Also, Greenland Halibut present high densities and regular occurrence of rodlet cells in male and female gonads, which is unusual and still unexplained (Rideout et al. 2015).Furthermore, environmental factors such as temperature, pH, population density, and social interactions have all been found to influence the sex-ratio in fish (Crews 1976;Nakamura et al. 1998;Baroiller et al. 2009) and we cannot rule out a possible role of such factors.Clearly, more research is needed to better disentangle the relative contributions of environmental and genetic factors in SD in Greenland Halibut.

Conclusion
Despite the commercial importance of Greenland Halibut, major knowledge gaps still persist for this species, namely pertaining to its reproduction cycle and sex determination mechanisms.In this study, we used a compendium of recent technologies [singlemolecule sequencing of long reads (Pacific Sciences) with chromatin conformation capture sequencing (Hi-C) data and wholegenome sequencing of hundreds of individuals] to provide the very first chromosome-level genome reference and explore the sex determination mechanism of this demersal fish.The chromosome-level assembly and the investigation of its syntenic relationships with other economically important flatfish species highlight the high conservation of synteny blocks within the flatfish phylogenetic clade.In fish, the early evolutionary stage of most sex chromosomes and the subsequent small differences between those chromosomes make it extremely difficult to find diagnostic differences between them.Our sex determination analysis revealed that flatfishes do not escape this rule and exhibit a high level of lability and turnover in SD mechanisms.A whole-genome sequencing approach of 198 individuals allowed us to unravel the molecular SD system in Greenland Halibut.Specifically, our results support the view that Greenland Halibut has a male heterogametic XY system, with several putative candidate genes, also described as SD drivers in other flatfish species like the Turbot (S. maximus) and the Japanese Flounder.Interestingly, our study also suggested for the first time in flatfish the presence of a putative Y-autosomal fusion that could be associated with a reduction of recombination typical of the early stages of sex chromosome evolution.

Figure 2
Figure 2 Two genomic regions implied in sex differentiation.(A) Correlation between PC1 scores of the local PCAs performed on windows of 1000 SNPs and PC1 scores of the global PCA along with their genomic position in each chromosome.(C) Fst differentiation estimated between males and females in sliding-windows of 25 Kb within each chromosome.(E) Mean observed heterozygosity in each sex smoothed for visualization.Panels (B), (D), and (F) are zooms on the two differentiated chromosomes (Chrs 10 and 21) of the previous values, respectively (A), (C), and (E).Vertical dashed gray lines indicate the positions of detected sex-related candidate genes.

Figure 3
Figure 3 Intra-and Interchromosomal LD among males and females, revealing a putative Y-autosomal fusion.The color scale corresponds to the 10th highest percentile of the R 2 value of each SNP pairwise summarized by windows of 250 Kb.Male values are displayed above the diagonal and female values are displayed below the diagonal.Positions are represented in megabases.

Table 1
Statistics of genome assemblies of the important flatfish species in fisheries and aquaculture Downloaded from https://academic.oup.com/g3journal/article/12/1/jkab376/6428537 by Bibliotheque de l'Universite Laval user on 20 May 2022The 1,048,307 assembled expressed sequences were used to annotate the assembled genome using the GAWN pipeline v0.3.4 (https://github.com/enormandeau/gawn).The annotations were then simplified by using the simplify_genome_annotation_table.pyscript (included in the GAWN pipeline), which removes consecutive almost-identical annotations from the genome annotation table.