Evolution of gene dosage on the Z-chromosome of schistosome parasites

XY systems usually show chromosome-wide compensation of X-linked genes, while in many ZW systems, compensation is restricted to a minority of dosage-sensitive genes. Why such differences arose is still unclear. Here, we combine comparative genomics, transcriptomics and proteomics to obtain a complete overview of the evolution of gene dosage on the Z-chromosome of Schistosoma parasites. We compare the Z-chromosome gene content of African (Schistosoma mansoni and S. haematobium) and Asian (S. japonicum) schistosomes and describe lineage-specific evolutionary strata. We use these to assess gene expression evolution following sex-linkage. The resulting patterns suggest a reduction in expression of Z-linked genes in females, combined with upregulation of the Z in both sexes, in line with the first step of Ohno's classic model of dosage compensation evolution. Quantitative proteomics suggest that post-transcriptional mechanisms do not play a major role in balancing the expression of Z-linked genes.


Introduction
In species with separate sexes, genetic sex determination is often present in the form of differentiated sex chromosomes (Bachtrog et al., 2014). A sex-specific chromosome can be carried by the male (such as the Y of mammals and fruit flies, in male heterogamety) or by the female (such as the W of birds, in female heterogamety). These sex chromosomes originally arise from pairs of autosomes, which stop recombining after they acquire a sex-determining region (Charlesworth, 1991;Charlesworth et al., 2005). The loss of recombination between X/Z and Y/W chromosomes is likely driven by selective pressures to link the sex-determining gene and alleles with sexually antagonistic effects, and often occurs through inversions on the sex-specific chromosome (Rice, 1987;Bergero and Charlesworth, 2009). The inverted Y/W-linked region stops recombining entirely, which hampers the efficacy of selection and leads to its genetic degeneration (Charlesworth et al., 2005;Engelstädter, 2008). The appearance of further sexually antagonistic mutations can restart the process and select for new non-recombining regions, creating sex chromosome 'strata' of different ages (Ellegren, 2011;Wang et al., 2012;Vicoso et al., 2013a;Vicoso et al., 2013b). Eventually, this suppression of recombination can extend to most of the chromosome, leading to genepoor, mostly heterochromatic sex chromosomes such as the Y chromosome of mammals (Lemaitre et al., 2009).
The loss of one gene copy on the Y/W is predicted to result in a two-fold reduction of expression in the heterogametic sex, as gene expression is correlated with gene copy number (Guo et al., 1996). This can cause imbalances in gene networks composed of both X/Z-linked and autosomal genes (Wijchers and Festenstein, 2011). Such imbalances can drive the appearance of dosage compensation mechanisms, which target X/Z chromosomes and regulate their expression to restore optimal dosage (Ohno, 1967;Gartler, 2014). While X/Z upregulation in the heterogametic sex is required to re-establish balanced levels of expression, global downregulation in the homogametic sex is also observed (e.g. X-inactivation in mammals). Ohno suggested a two-step mechanism, in which the initial upregulation of expression is not sex-specific. This leads to an excess of dosage in the homogametic sex, and secondarily selects for further repressing mechanisms ('Ohno's hypothesis' of dosage compensation Ohno, 1967]). How relevant this model is to the evolution of mammalian dosage compensation is still under debate (e.g. Lin et al., 2012;Nguyen and Disteche, 2006;Vicoso and Bachtrog, 2009a;Mank, 2013). Independent of the underlying mechanisms, balanced gene expression between males and females in species with differentiated sex chromosomes was used as diagnostic of a chromosome-wide (also referred to as 'global', or 'complete') mechanism of dosage compensation in many different clades (Vicoso and Bachtrog, 2009a;Mank, 2013;. In ZW systems, the loss of genes on the sex-specific W chromosome is generally accompanied by unequal expression levels of the Z-chromosome between ZZ males and ZW females, as well as reduced expression of the Z relative to the autosomes in females. This has generally been interpreted as a lack of chromosome-wide dosage compensation (also referred to as 'partial ', or 'incomplete'), with individual dosage-sensitive genes being independently regulated instead. Incomplete dosage compensation was described in a wide range of species, including birds (Itoh et al., 2007;Ellegren et al., 2007;Arnold et al., 2008;Wolf and Bryk, 2011), fishes (Chen et al., 2014) and snakes (Vicoso et al., 2013a). So far, Lepidoptera are the only exception to this observation Huylmans et al., 2017). Why many ZW systems should fail to acquire a global mechanism of dosage compensation is not entirely clear, although several and non-mutually exclusive hypotheses have been put forward (see Discussion, and Gu and Walters [2017] for a review). Another possibility is that the male-bias of the Z is instead caused by an accumulation of eLife digest The DNA inside cells is organized in structures called chromosomes, some of which can control whether individuals develop as males or females. For instance, female mammals have two X chromosomes, whereas male mammals have one X and one Y chromosome. A mechanism called 'dosage compensation' makes sure that females do not produce double the number of transcripts from genes on the X-chromosome as males.
In other organisms, including the parasitic flatworms called Schistosomes, females have ZW sex chromosomes, whereas males have two Z chromosomes. In these parasites, males do create more transcripts from genes on the Z chromosome than females do, suggesting they do not have the same kind of compensation mechanisms as mammals.
Among Schistosome parasites, the Z chromosome has only been studied in detail in the model organism Schistosoma mansoni. Investigating other closely related species can shed light on how the Z and W chromosomes evolved.
Picard et al. studied the Z chromosome in two additional species of Schistosome parasites: the African S. haematobium and the Asian S. japonicum. Using a technique called DNA sequencing, Picard et al. were able to analyse their genes, focusing on a part of the Z chromosome known to have been lost from the W chromosome. The results revealed that this region was different in the African and Asian species. In addition, females of both species expressed genes on their single Z chromosome at fairly high levels. The males did not need to express these genes at a high level because they have two copies -but they did so anyway. This could be because this high expression is a by-product of the way the females have evolved to boost their Z chromosome gene expression.
A next step will be to investigate the molecular mechanisms underlying this regulation. Schistosomiasis -a disease caused by this type of flatworm parasite -is one of the deadliest neglected tropical diseases, according to the US Centers of Disease Control. It kills more than 200,000 people a year. Better understanding of the reproductive biology of this parasite could eventually help to develop ways to control it by interfering with its reproduction. genes with male functions due to the male-biased transmission of the Z, which may favor the fixation of sexually antagonistic male-beneficial mutations on this chromosome.
While the direct comparison of male and female expression of X/Z-linked and autosomal genes has provided an overview of dosage compensation in many clades, it suffers from several drawbacks . First, chromosome-wide dosage compensation can lead to strongly sexbiased expression, if only the initial upregulation of expression in both sexes has occurred (but not the secondary downregulation of Ohno's hypothesis [Ohno, 1967]). This has been suggested for the flour beetle (Prince et al., 2010) and for the young sex chromosomes of the threespine stickleback (Schultheiß et al., 2015). Biases in expression levels between sexes and/or chromosomes may also have been present ancestrally, before the present sex-chromosomes evolved, and using a proxy for ancestral expression can yield insights into the direct consequences of sex-linkage (Julien et al., 2012;Vicoso and Bachtrog, 2015;. Finally, the vast majority of studies relied only on microarray or RNA-seq data and did not consider any post-transcriptional regulation that might affect gene dosage at the protein level, but not at the transcript level (whereas protein dosage is in most cases the functionally relevant measure). For instance, a proteomic analysis in birds found that several genes appeared to be partially equalized at the protein level despite being strongly malebiased at the transcript level (Uebbing et al., 2015). In humans, post-transcriptional regulation does not appear to play a major role in dosage compensation (Chen and Zhang, 2015).
Here, we combine comparative genomics, transcriptomics and quantitative proteomics to obtain a complete overview of the evolution of gene dosage on the Z-chromosome of parasites of the genus Schistosoma. Schistosomes are a group of blood parasites that can cause schistosomiasis in humans (Chitsulo et al., 2004). Their complex life cycle is characterized by a phase of clonal multiplication in an intermediate mollusk host, and a phase of sexual reproduction in the final warm-blooded host. Unlike the other 20,000 species of hermaphroditic platyhelminths, schistosomes have separate sexes: sexual reproduction occurs immediately after the primary development of males and females in their definitive host, and mating is compulsory for the sexual maturation of females (Loker and Brant, 2006;Kunz, 2001). Sex determination is genetic, and relies on a pair of cytogenetically welldifferentiated ZW chromosomes (Grossman et al., 1981a). All schistosomes are thought to share the same ancestral pair of ZW sex chromosomes, but differences in their morphology and in the extent of heterochromatization of the W suggest that different strata were acquired independently by different lineages (Grossman et al., 1981a;Lawton et al., 2011).
The model blood fluke Schistosoma mansoni was one of the first ZW clades to be evaluated for the presence of global dosage compensation, through the comparison of male and female microarray data derived from several tissues (Vicoso and Bachtrog, 2011). It showed reduced expression of Z-linked genes in females relative (i) to the autosomes and (ii) to males, consistent with a lack of chromosome-wide dosage compensation. Interestingly, the reduction of Z-expression in females was less than two-fold, and the Z:autosome ratio of expression was slightly, but consistently, greater than one in males. Our combined genomic, transcriptomic, and proteomic approaches allow us to fully probe the evolution of the male-biased expression of the Z, and suggest a more complex scenario than previously proposed. We discuss this in light of the different hypotheses put forward to account for the evolution of gene dosage on Z chromosomes.

Genomic differentiation of ZW sex chromosomes in Asian and African lineages
The difference in morphology of the ZW pair in African and Asian schistosomes suggests that the two lineages may differ in their gene content (Grossman et al., 1981a). We compared the gene content of the Z-chromosomes of three different species: S. mansoni and Schistosoma haematobium, which belong to African schistosomes, and Schistosoma japonicum, an Asian schistosome ( Figure 1). We first identified syntenic blocks between the S. mansoni genome and the S. haematobium and S. japonicum scaffolds. To this end, we mapped all S. mansoni protein coding sequences to the genome assemblies of the two other species and selected only the hits with the highest scores, yielding 9504 S. mansoni/S. haematobium orthologs and 8555 S. mansoni/S. japonicum orthologs ( Table 1). Scaffolds were then assigned to one of the S. mansoni chromosomes, based on their ortholog content ( Figure 1-source data 1).
We further performed a comparative coverage analysis to define the Z-specific regions of the three species. Z-derived sequences are expected to display half the genomic coverage in ZW females as in ZZ males, and as the autosomes. We thus mapped male and female genomic reads (or only female reads in the case of S. haematobium) to the reference genome of each species (Protasio et al., 2012a;Criscione et al., 2009;Young et al., 2012;Zhou et al., 2009). Publicly available raw reads were used for S. mansoni and S. haematobium (Wellcome Trust Sanger Institute . Female coverage is shown for S. haematobium scaffolds (C). All species share an ancestral Z-linked stratum S0 (marked in orange). The stratum S1jap (in green) is specific to the Asian lineage represented by S. japonicum. The stratum S1mans (in blue) is specific to the African lineage, represented by S. mansoni and S. haematobium. Dot color is attributed depending on the window/scaffold location within each species: Z-specific regions in orange, pseudoautosomal regions in grey, and ambiguous regions in beige. DOI: https://doi.org/10.7554/eLife.35684.003 The following source data is available for figure 1: Bioprojects PREJB2320 and PREJB2425), whereas male and female S. japonicum were sequenced for this study. We then estimated the per base genomic coverage. Median coverage values were 18.40 and 18.99 for S. mansoni male and female libraries; 23.5 and 7.43 for S. haematobium female#1 and female#2 libraries; 23.77 and 20.53 for S. japonicum male and female libraries. Z-specific genomic regions were defined by a maximum value of the female:male ratio of coverage (S. mansoni: log2(female:male)=À0.4; S. japonicum: log2(female:male)=À0.84), or a maximum value of female coverage (S. haematobium: log2(female)=4.41). Details of how these cutoff values were obtained are provided in the Materials and methods and Appendix 1. This analysis resulted in 285 newly described Z-specific genes in S. mansoni that were previously located on 19 unplaced scaffolds longer than 50 kb, and to a refined pseudoautosomal/Z-specific structure of the published ZW linkage group (Protasio et al., 2012a;Criscione et al., 2009) (Figure 1-source data 2). It further allowed us to define 379 Z-specific scaffolds (containing 1409 annotated genes with orthologs in S. mansoni) in S. haematobium ( Figure 1-source data 3 for exhaustive list) and 461 Z-specific scaffolds (containing 706 annotated orthologs) in S. japonicum ( Figure 1-source data 4 for exhaustive list).
While the content of the Z was largely shared between the African S. mansoni and S. haematobium (Table 1, Figure 1), large differences were found between the African and Asian lineages: only 476 Z-specific genes were shared by S. mansoni and S. japonicum, while 306 were only Z-specific in S. mansoni and 137 only in S. japonicum ( Table 1). Of all these Z-specific genes, 613 were already mapped to the S. mansoni ZW linkage group (Protasio et al., 2012a;Criscione et al., 2009) and, when plotted along the Z-chromosome, outlined three different evolutionary strata: one shared ancestral stratum (S0: 367 genes) and two lineage-specific strata (S1mans, specific to the African schistosomes, with 180 genes; and S1jap, specific to S. japonicum, with 66 genes) (Table 1, Figure 1, and Figure 1-source data 1). The presence of pseudoautosomal regions throughout the S0 (Figure 1) is likely due to errors in the genome assembly. All further analyses were run using all newly identified Z-specific genes, but hold when only Z-specific genes that were previously mapped to the ZW linkage group are considered (Appendix 1).

Consistent patterns of expression in S. mansoni and S. japonicum
In order to test for dosage compensation, the median expression of Z-specific genes in ZW females can be compared to the median autosomal expression (Z:AA ratio) and/or to the Z-specific gene expression in ZZ males (F:M ratio). Z:AA or F:M ratio of~1 supports global dosage compensation, while a ratio between 0.5 and 1 suggests partial or local dosage compensation. We performed this analysis in S. mansoni and in S. japonicum, using publicly available RNA-seq reads derived from a sexually undifferentiated stage (schistosomula, [Picard et al., 2016;Wang et al., 2017]) and a sexually mature stage (adults, Wellcome Trust Sanger Institute Bioproject PRJEB1237, [Wang et al., 2017]). The inclusion of a sexually undifferentiated stage (which lack primary spermatocytes or eggs) is important, as much of the expression obtained from adults will necessarily come from their welldeveloped gonads. Sex-linked genes are often sex-biased in the germline, even in organisms that have chromosome-wide dosage compensation (e.g. due to sex-chromosome inactivation during gametogenesis), and the inclusion of gonad expression has led to inconsistent assessments of the status of dosage compensation in other clades Huylmans et al., 2017). Reads were mapped to their respective genomes, and expression values in Reads Per Kilobase Million (RPKM) were calculated for each gene ( Figure 2, Figure 2-source data 1 and 2); only genes with a minimum RPKM of 1 in both sexes were considered. We consistently observed a strong male bias in the expression of Z-specific genes in both stages and species (F:M ratio between 0.58 and 0.69, Figure 2 and Supplementary file 1), consistent with local or incomplete dosage compensation. While this was generally supported by the lower expression levels of Z-specific genes in females when compared to the autosomes (Z:AA ratio between 0.73 and 0.85; Figure 2, Supplementary file 1), this difference was only apparent for some filtering procedures ( , and even then was not sufficient to fully account for the strong male-bias of the Z. Instead, the higher expression of the male Z in both stages and species (ZZ:AA ratio between 1.25 and 1.46, Supplementary file 1) appeared to also contribute to the male-bias of Z-linked genes. These patterns were qualitatively robust to changes in the methods used to estimate expression (RPKM or TPM [Transcripts Per Kilobase Million]), in the filtering procedure (RPKM > 0, RPKM > 1, TPM >0 or TPM >1), and when only genes that were previously mapped to the ZW linkage group were considered. These analyses were further performed independently in the S0, S1mans and S1jap strata, which showed no significant difference in the extent of their male bias. All the resulting plots are shown in

Convergent upregulation of the Z in both sexes
The previous patterns are consistent with an upregulation of the Z-chromosome in both sexes after the degeneration of the W-specific region, and could represent the intermediate step in the evolution of dosage compensation originally postulated by Ohno. However, they could also be due to high expression of the ancestral proto-Z in both sexes, before sex chromosome divergence. To exclude this, we identified one-to-one orthologs between genes annotated in both species using a reciprocal best hit approach (7382 orthologs, Figure 3-source data 1). All genes that were classified as Z-specific in one species but as autosomal in the other were considered to be part of the S1 strata (S1jap if they were Z-specific in S. japonicum or S1mans if they were Z-specific in S. mansoni). We then used the pseudoautosomal expression of these lineage-specific strata as a proxy for the ancestral level of expression. For instance, in S. japonicum, we estimated the S1jap:AA ratio, after normalizing the expression data by their respective (pseudo)autosomal level in S. mansoni ( Figure 3A and B). The reversed analysis was performed for S1mans ( Figure 3C and D). Figure 3 confirms that the male-biased expression of Z-specific genes is a consequence of their sex-linkage, and that the Z-chromosome has become under-expressed in females relative to the ancestral expression. However, a full two-fold reduction in female expression is not observed, consistent with partial upregulation, and/or full upregulation of a subset of dosage-sensitive genes (Z:AA ranging from 0.68 to 0.83, Supplementary file 1) (Sangrithi et al., 2017;Pessia et al., 2012). , and (iiii) we consider the schistosomula stage (with or without the ancestral expression and independent of the classification), this is likely due to noise in the sample and not to a true biological difference (only 58 genes were tested). Figure 3 shows the distributions for all genes with a minimum RPKM value of 1 in males and females of both species. We repeated the analysis using the same filters as before (minimum RPKM of 0, TPM of 0, TPM of 1), and with a publicly available list of 1:1 orthologs (obtained from the Wormbase Biomart, see Methods). The resulting plots are shown in  Adult expression patterns (RPKM>0, exhaustive strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.011 Figure supplement 3. Adult expression patterns (RPKM>1, stringent strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.012 Figure supplement 4. Adult expression patterns (RPKM>0, stringent strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.013 Figure supplement 5. Adult expression patterns (RPKM>1, exhaustive strata, Wormbase orthologs) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.014 Figure supplement 6. Adult expression patterns (TPM>1, exhaustive strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.015 Figure supplement 7. Adult expression patterns (TPM>0, exhaustive strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.016 Figure supplement 8. Schistosomula expression patterns (RPKM>1, exhaustive strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.017 Figure supplement 9. Schistosomula expression patterns (RPKM>0, exhaustive strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes.

Male-biased protein dosage of Z-specific genes
We tested for putative post-transcriptional mechanisms by assessing the dosage compensation pattern at the proteomic level in adult S. mansoni, using a somatic tissue (head region) as well as the gonads; three replicates were used for each tissue and sex. Heads and gonads were chosen as they allowed us to compare Z-specific gene dosage in tissues with widespread functional sex-specificity (ovary and testis), and in a tissue where most dosage imbalances are likely to be deleterious. We used a label-free quantitative mass spectrometry approach to obtain a relative quantification of the protein levels in each tissue depending of the sex (Figure 4-source data 1 to 5). Post-transcriptional dosage compensation mechanisms would be detectable by (i) an equalization of the Z expression between sexes at the protein level (F:M close to one for both the Z and the autosomes); (ii) a different correlation between F:M obtained from mRNA and from proteins for Z-linked and autosomal genes. We used publicly available head and gonad microarray data (Nawaratna et al., 2011) as the transcriptomic reference ( Figure 4-source data 6). A significant and positive correlation was found between the F:M ratio derived from the microarray and from the proteomic data ( Similar to what was observed using RNA-seq, the expression of Z-specific genes was strongly male-biased compared to that of autosomal genes in both heads (F:M of 0.68 for the Z chromosome versus 0.92 for the autosomes; Figure 4A, Supplementary file 1) and gonads (F:M of 0.78 versus 0.99; Figure 4B, Supplementary file 1). These F:M ratios are closer to each other than in our RNAseq analysis (Figure 2, Supplementary file 1), or than the microarray data (Supplementary file 1), which could suggest a potential contribution of post-transcriptional regulation to dosage equalization. However, Figure 4B shows that Z-linked and autosomal genes show a similar correlation between the F:M ratios found for mRNAs and proteins (p>0.05 with a Fisher r-to-z transformation of the correlation coefficients, Figure 4C and D), which argues against a major role of post-transcriptional regulation to balance expression. This similarity between Z-linked and autosomal genes holds when only genes with male-biased expression in the microarray data are considered (Figure 4  patterns (RPKM>1, stringent strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.019 Figure supplement 11. Schistosomula expression patterns (RPKM>0, stringent strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.020 Figure supplement 12. Schistosomula expression patterns (RPKM>1, exhaustive strata, Wormbase orthologs) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes. DOI: https://doi.org/10.7554/eLife.35684.021 Figure supplement 13. Schistosomula expression patterns (TPM>1, exhaustive strata) of genes located in the different strata of the Z (S0, S1man and S1jap), as well as pseudoautosomal and autosomal genes.

Discussion
Schistosome sex chromosome evolution in the age of genomics S. mansoni, S. haematobium and S. japonicum are the main species responsible for human schistosomiasis and have been the subject of many molecular and genomic studies. Despite the availability of extensive genomic and transcriptomic resources (e.g. a genome assembly at the near-chromosome level for S. mansoni (Protasio et al., 2012a;Criscione et al., 2009), or sex-and stage-specific transcriptomes (Picard et al., 2016;Lu et al., 2016;Grevelding et al., 2018;Lu et al., 2017), many basic questions remain regarding their reproduction and biology. For instance, the master sex-determining gene (and whether it is located on the W or Z) is still a mystery (Lepesant et al., 2012;Portela et al., 2010). This is partly due to the inherent challenges of assembling genomes from sequencing data, especially for regions rich in heterochromatin and repetitive sequences, such as sex chromosomes. For instance, 416 scaffolds, including 3893 genes (29% of the annotated nuclear genes), are still unplaced. By basing our analysis on genomic coverage, we were able to detect a further 285 Z-specific genes in S. mansoni; their role in sex determination can be investigated further. Our comparative approach can also reduce the number of candidates, as any gene involved in sex determination should in principle be found in the ancestral Z-specific stratum; similar analyses in Figure 3. Convergent changes in the expression of Z-linked genes in S. japonicum and S. mansoni after sex chromosome differentiation. The Z-linked and autosomal gene expression patterns (normalized by ancestral pseudoautosomal or autosomal expression, to show changes since the appearance of the two S1 strata) are shown for S1jap in S. japonicum (A-D) and S1mans in S. mansoni (E-F). Fem-S1j and Male-S1j refer to the normalized expression levels of genes in the stratum S1jap in females and males, Fem-S1m and Male-S1m refer to the normalized expression levels in S1mans, and Fem-A and Male-A refer to the normalized expression levels of autosomal genes in females and in males, respectively. The level of significance of each comparison (Wilcoxon rank sum test with continuity correction) is denoted with asterisks: *p-value<0.05, **p-value<0.001, ***p-value<0.0001. DOI: https://doi.org/10.7554/eLife.35684.029 The following source data and figure supplements are available for figure 3: Source data 1. One-to-one orthology S. mansoni vs S. japonicum.  other species can in the future refine the candidate region. Another advantage of basing our Z-assignment purely on coverage patterns is that our results should be largely independent of potential biases in the current version of the genome. It should, however, be noted that many genes are likely still missing from the current assembly (which has a BUSCO score of 76% complete plus fragmented genes; https://parasite.wormbase.org/index.html [Howe et al., 2016;Howe et al., 2017]) and that repeating these analyses using future improved assemblies will be necessary to obtain the full set of sex-linked genes.
A gradient of ZW heteromorphism between schistosome species was revealed by cytogenetic studies (Grossman et al., 1981a;Grossman et al., 1981b;Short and Grossman, 1981); in particular, African schistosomes were found to have much more extensive ZW differentiation and W heterochromatinization than Asian species (Hirai et al., 2000). Our results generally support these cytogenetic data: we confirm the acquisition of independent evolutionary strata in the sex chromosomes of S. mansoni and S. japonicum, and detect a larger number of Z-specific genes in the African species (8% to 11% of all annotated orthologs, respectively, in S. mansoni and S. haematobium) than in S. japonicum (5.5% of all annotated orthologs). Interestingly, although the sex chromosomes of the African S. mansoni and S. haematobium differ morphologically, they are largely similar in their gene content (Figure 1), consistent with their much closer phylogenetic relationship (the median synonymous divergence between the two species is around 17%, compared to 65% for S. mansoni/S. japonicum, Appendix 1). This may be comparable to snakes, where ZW pairs with vastly different morphologies were all equally differentiated at the genomic level (Vicoso et al., 2013a), and highlights the contribution of other factors, such as differential transposable element accumulation, to the large-scale morphology of sex chromosomes.

ZW systems and incomplete dosage compensation: gene-by-gene or partial shift?
ZW systems (aside from Lepidoptera) consistently show male-biased expression of the Z chromosome . While female-biased expression of the X occurs in a few young XY systems Schultheiß et al., 2015;Howe et al., 2017;Grossman et al., 1981b;Short and Grossman, 1981;Hirai et al., 2000;Mank and Ellegren, 2009), well-established X chromosomes generally show full equalization of gene expression between the sexes. This difference has often been framed as the acquisition of global mechanisms of dosage compensation, which affects the whole X/Z, versus the acquisition of local compensation, in which dosage-sensitive genes become individually regulated (Mank and Ellegren, 2009). Several parameters should influence this, and favor local compensation in ZW systems: (i) The speed of the heterochromosome degeneration: when only a few genes are lost at a time (because the region of suppressed recombination is small, or because degeneration is slow), the establishment of a gene-by-gene dosage compensation may be favored; on the other hand, the loss of many genes at once could favor global mechanisms of dosage compensation Vicoso and Charlesworth, 2009b). Since more mutations occur during spermatogenesis than oogenesis, female-specific W chromosomes will generally have lower mutation and degeneration rates than male-specific Ys, favoring local compensation; (ii) The effective population size of Z (NeZ): NeZ is decreased when the variance in reproductive success of ZZ males is larger than that of ZW females (e.g. in the presence of strong sexual selection). This will impair the adaptive potential of the Z (Mank, 2009;Mullon et al., 2015), such that only strongly dosage-sensitive genes can become upregulated in the heterogametic sex, while the others remain uncompensated; (iii) More efficient purging of mutations that are deleterious to males: strong sexual selection can also increase the strength of purifying selection on males, by preventing all but the fittest males from contributing to the next generations. If mutations that compensate for the loss of Y/W-linked genes overexpress the X/Z copy in both sexes, they will be under negative selection in the homogametic sex, and may be more efficiently selected against when males are the homogametic sex (Mullon et al., 2015).
Schistosomes are unusual among female-heterogametic clades in that they appear to have a chromosome-wide upregulation of the Z in both sexes; such an increase in males was not detected in birds (Julien et al., 2012) or snakes (Vicoso et al., 2013a), even when ancestral expression was taken into account. They therefore likely represent an intermediate between ZW species with true local compensation, and the chromosome-wide compensation of the ZW Lepidoptera. These results further show that, even if mutations that upregulate gene expression in both sexes are more easily fixed on an evolving X-chromosome than on an evolving Z (Mullon et al., 2015), this is not an absolute barrier to the evolution of global dosage compensation. It is however still unclear why the evolutionary dynamics appear to differ between schistosomes and most other ZW clades, as the demographic and population genetics parameters of this group are largely unknown. The observed male biased sex-ratio in adults, combined with a largely monogamous mating system (Beltran and Boissier, 2009;Beltran and Boissier, 2008), may increase the reproductive variance of males and could reduce the effective population size of the Z. This should also lead to stronger sexual selection in males than in females (Beltran and Boissier, 2008;Steinauer et al., 2009), suggesting similar evolutionary dynamics as in other ZW systems. A detailed characterization of the population genetics of the Z chromosome and autosomes will therefore be crucial for understanding what may have driven the evolution of this unusual system.
The relevance of the Ohno's hypothesis in the high-throughput sequencing era Ohno's hypothesis predicts that the heterochromosome is initially overexpressed in both sexes, then secondarily downregulated in the homogametic sex (Ohno, 1967). This theoretical scenario was first formulated to account for the inactivation of the X in mammals. Since then, similar molecular mechanisms to downregulate the X/Z chromosome have been characterized in nematodes and moths (Kiuchi et al., 2014;Meyer, 2010). If an initial upregulation of the X did occur in both sexes, then inactivation in the homogametic sex should simply restore the ancestral expression levels, a hypothesis that has been tested in many empirical studies in mammals. Most of them assumed that the X and autosomes must have had similar ancestral levels of expression, and simply compared their expression (Julien et al., 2012;Nguyen and Disteche, 2006;Xiong et al., 2010;Chen and Zhang, 2015;Graves, 2016;Deng et al., 2011;Kharchenko et al., 2011;Yildirim et al., 2011;Lin et al., 2012;Pessia et al., 2014). These yielded mixed results, with some (Xiong et al., 2010) finding reduced expression of the X, while others (e.g. Deng et al., 2011;Kharchenko et al., 2011;Yildirim et al., 2011) found similar levels of expression for X-linked and autosomal genes, in agreement with Ohno's predictions. Taking ancestral gene expression into account, Julien et al. (2012) found evidence of an Ohno-like mechanism in the marsupials but not in placental mammals (Julien et al., 2012). Pessia et al. (2012) recently found that while individual dosage-sensitive genes do show evidence of upregulation, the majority does not. The evolution of X-inactivation may therefore have involved a complex scenario under which a few dosage-sensitive genes first became individually upregulated in both sexes (gene-by-gene compensation), followed by the establishment of a chromosome-wide mechanism to downregulate expression in females (global compensation) (Sangrithi et al., 2017;Pessia et al., 2012).
Our results, which consider ancestral expression and do not indicate a major influence of posttranscriptional regulation, suggest a scenario closer to Ohno's original hypothesis, with the male Z showing a consistent increase in expression. A similar pattern has been observed in Tribolium castaneum (Coleoptera, Prince et al., 2010), where the female X has been found to be over-expressed relative to the autosomes, and to the male X-chromosome. However, an RNA-seq analysis in the same species did not detect this (Mahajan and Bachtrog, 2015), so it is at this point unclear whether it truly represents an example of Ohno's model in action. The youngest evolutionary stratum of the young XY pair of threespine sticklebacks also shows overexpression in females (Schultheiß et al., 2015), even when ancestral expression is accounted for (White et al., 2015). However, the interpretation of these patterns is complicated by the fact that such an overexpression is also detected for the pseudoautosomal region, and that the oldest evolutionary stratum appears to lack dosage compensation altogether. Schistosomes may therefore not only represent an ideal system in which to investigate the evolution of dosage compensation in a ZW system, but also an unparalleled system for understanding the relevance of the model and predictions originally made by Ohno.

Materials and methods
A detailed description of the computational analyses, as well as all the scripts that were used, are provided in Appendix 1.

DNA sequencing of S. japonicum males and females
Male and female worms preserved in ethanol of S. japonicum were provided by Lu Dabing from Soochow University (Suzhou, China). DNA was extracted from 28 pooled males and 33 pooled females. The worms were lysed using the Tissue Lyser II kit (QIAGEN) and DNA was isolated using the DNeasy Blood and Tissue Kit (QIAGEN). DNA was then sheared with Covaris Focused-ultrasonicator. Library preparation and sequencing (HiSeq 2500 v4 Illumina, 125 bp paired-end reads) were performed at the Vienna Biocenter Next Generation sequencing facility (Austria). Reads have been deposited at the NCBI Short Reads Archive under accession number SRP135770.
Orthology and assignment to the S. mansoni chromosomes S. mansoni coding sequences and their respective chromosomal locations were obtained from the WormBase Parasite database (https://parasite.wormbase.org/index.html, [Howe et al., 2016;Howe et al., 2017]). This gene set was mapped to the S. haematobium and S. japonicum genome assemblies using Blat (Kent, 2002) with a translated query and dataset (-dnax option), and a minimum mapping score of 50; only the genome location with the best score was kept for each. When more than one gene overlapped by more than 20 base pairs, only the gene that had the highest mapping score was kept. Finally, each scaffold was assigned to one of the S. mansoni chromosomes, depending on the majority location of the genes that mapped to it, or on their total mapping scores if the same number of genes mapped to two separate chromosomes. The final chromosomal assignments are provided in Figure 1-source data 1.

DNA read mapping and estimation of genomic coverage
For the S. japonicum DNA reads, adaptors were removed using Cutadapt (v1.9.1 [Martin, 2011]) and the quality of the reads was assessed using FastQC (v0.11.2, https://www.bioinformatics.babraham.ac.uk/projects/fastqc/); no further quality trimming was deemed necessary. For the S. mansoni and S. haematobium reads, potential adaptors were systematically removed, and reads were trimmed and filtered depending on their quality, using Trimmomatic (v0.36 ). The resulting read libraries of each species were mapped separately to their reference genomes using Bowtie2 (-end-to-end -sensitive mode, v2.2.9 [Langmead and Salzberg, 2012]). The resulting alignments were filtered to keep only uniquely mapped reads, and the male and female coverages were estimated from the filtered SAM files with SOAPcoverage (v2.7.7., http://soap.genomics.org. cn/index.html). Coverage values were calculated for each scaffold in S. haematobium and S. japonicum, and for each 10 kb non-overlapping window in S. mansoni. The coverage values for each library are provided in Figure 1-source data 3 and 4.

Detection of Z-specific sequences
For each species, we calculated the log2(female:male) coverage of each scaffold or, in the case of S. mansoni, of each 10 Kb window along the genome. Since only female DNA data was available for S. haematobium, the log2(female1 +female2) was used instead for this species.
In order to determine the 95% and 99% percentile of log2(female:male) of Z-linked sequences, which we use as cutoff values for assignment to Z-specific regions, we first excluded scaffolds/windows that fit an autosomal profile. To do so, the 1st and 5th percentile of log2(female:male) were estimated using all 10 kb windows found on the annotated autosomes of S. mansoni (Protasio et al., 2012b); in S. haematobium and S. japonicum, all scaffolds that mapped to the S. mansoni autosomes were used for this purpose.
By plotting the distribution of log2(female:male) for the scaffolds/windows that map to each chromosome (Appendix 1), we determined that: (i) for S. mansoni the 1st percentile (log2(female:male) =À0.26) discriminates effectively between autosomal and Z-specific windows; (ii) for S. haematobium and S. japonicum, which have noisier coverage, the 1st percentile included a significant fraction of Z-derived genes, and the 5st percentile was used instead (respectively log2(female) = 4.57 and log2 (female:male)=À0.40). All scaffolds with higher log2(female:male) or log2(female) were excluded.
The 95th and 99th quantiles of coverage were then calculated for the remaining, putatively Z-linked, sequences. By plotting all the coverage values along the S. mansoni Z-chromosome (Appendix 1 and Figure 1), we determined that: (i) for S. mansoni and S. haematobium, the 95th quantile (log2(female:male)=À0.4 and log2(female) = 4.41, respectively) was an effective cut-off for discriminating pseudoautosomal and Z-specific sequences; (ii) for S. japonicum, using the 95th percentile lead to the exclusion of many genes in Z-specific regions, and the 99st percentile (log2(F:M) =À0.84) was used instead. The Z-specific or autosome assignation was finally attributed as follows: (i) in S. mansoni, windows were classified as Z-linked if they displayed log2(F:M)<À0.4 and as autosomal if log2(F:M)>À0.4; (ii) for S. haematobium, scaffolds with log2(femalesum) <4.41 were classified as Z-linked, scaffolds with log2(femalesum) >4.57 as autosomal, and all others as ambiguous; (iii) for S. japonicum, scaffolds with log2(F:M)<À0.84 were classified as Z-linked, scaffolds with log2(F:M) >À0.40 as autosomal, and all others as ambiguous. For S. haematobium and S. japonicum, we considered only scaffolds with at least a coverage of 1 in each library (n.a.). In S. mansoni, unplaced scaffolds shorter than 50 kb were excluded; five consecutive 10 kb windows with consistent coverage patterns were required for a region to be classified as either Z-specific or autosomal. Smaller regions, as well as the two 10 Kb windows surrounding them, were excluded (see Appendix 1). The final classifications for the three species are provided in Figure 1-source data 2, 3 and 4, and summarized for orthologs in Figure 1-source data 1.

Definition of Z-specific strata in S. mansoni and S. japonicum
The Z-specific gene content of S. mansoni and S. japonicum was compared in order to define Z-chromosome strata. Genes that were located on Z-specific scaffolds/windows in both species were assigned to the shared stratum 'S0'. Genes that were assigned to Z-specific regions in one species but not in the other were assigned to lineage-specific strata: 'S1mans' genes were Z-specific in S. mansoni and autosomal in S. japonicum, while 'S1jap' genes were Z-specific in S. japonicum and autosomal in S. mansoni. While the main figures consider all the genes that were classified as Z-linked or autosomal based on coverage (referred to as the 'exhaustive classification' in Appendix 1 and Figures), independent of their original genomic location, we repeated the analyses using only the Z-specific genes that were already assigned to the ZW linkage map of S. mansoni (the 'stringent classification' in Appendix 1). All genes belonging to the categories 'excluded', 'ambiguous', 'n.a.' or that did not have orthologs on S. japonicum scaffolds were not further considered ( Table 1). 'PSA_shared' and 'Aut_shared' are common to the two classifications and correspond to genes that were classified as autosomal in both species using coverage and that were previously mapped to the ZW linkage group or to the autosomes of S. mansoni, respectively (Protasio et al., 2012b).
Publicly available RNA reads and estimation of gene expression S. mansoni and S. japonicum RNA-seq libraries were obtained from SRA (NCBI). Accession numbers are: S. mansoni adult females: ERR506076, ERR506083, ERR506084; S. mansoni adult males: ERR506088, ERR506082, ERR506090; S. mansoni schistosomula females: SRR3223443, SRR3223444; S. mansoni schistosomula males: SRR3223428, SRR3223429; S. japonicum adult females: SRR4296944, SRR4296942, SRR4296940; S. japonicum adult males: SRR4296945, SRR4296943, SRR4296941; S. japonicum schistosomula females: SRR4279833, SRR4279491, SRR4267990; S. japonicum schistosomula males: SRR4279840, SRR4279496, SRR4267991. Raw reads were cleaned using trimmomatic (v 0.36 , and the quality of the resulting reads was assessed using FastQC (v0.11.2,https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Reads were mapped to their respective reference genomes used Tophat2 (Trapnell et al., 2009). Read counts were obtained with H (Anders et al., 2015) and expression values (in Reads Per Kilobase of transcript per Millionmapped reads, RPKM) were calculated for each gene in each of the RNA-seq libraries (Figure 2-source data 1). TPM (Transcripts Per Kilobase Million) values were also calculated using Kallisto (Bray et al., 2016) against set of coding sequences of the respective species. All expression values are provided in Figure 3-source data 2 and 3. A Loess Normalization (R library Affy) was performed on the schistosomulum data and on the adult data separately and all analyses were performed using different thresholds (RPKM > 0, RPKM > 1, TPM >0 or TPM >1 in all the libraries of the studied stage). The Loess normalization was applied to all conditions at once when we filtered for minimum expression in all stages and sexes (RPKM > 1 and RPKM > 3, Figure 2-figure supplements 15-17, Figure 3- figure supplements 1 and 2). Correlation analyses were performed for each developmental stage, considering libraries from both males and females, and both species. As shown in Appendix 1, two S. mansoni libraries (ERR506076 and ERR506082) were not well correlated with the other samples, and were excluded from our study. Expression values were averaged for each stage and sex. The significance of differences between medians of expression was tested with Wilcoxon rank sum tests with continuity correction.
Detection of S. mansoni and S. japonicum one-to-one orthologs S. japonicum coding DNA sequences and their respective location on the genome scaffolds were obtained from the WormBase Parasite database (https://parasite.wormbase.org/index.html [Howe et al., 2016;Howe et al., 2017]). The S. mansoni set of coding sequences (see above) was mapped to the S. japonicum gene set using Blat (Kent, 2002) with a translated query and dataset (dnax option), and a minimum mapping score of 50; only reciprocal best hits were kept. This reciprocal best hit ortholog list is provided in Figure 3-source data 1 and 2. A second list of orthologs was obtained from the Biomart of WormBase Parasite, excluding paralogues, and requiring a gene stable ID for both S. mansoni (PRJEA36577) and S. japonicum (PRJEA34885) (in Figure 3-source data 1 and 3). In subsequent transcriptomic analyses, each list was used independently to ensure that the results were independent of the method used to assign orthology.

Microarray analysis
Microarray data for male and female heads and gonads (Nawaratna et al., 2011) were obtained from the Gene Expression Omnibus (GEO) database (NCBI, ftp://ftp.ncbi.nlm.nih.gov/geo/series/ GSE23nnn/GSE23942/matrix/). A Loess Normalization (R library Affy) was performed on the head and gonad data separately. When different probes corresponded to one gene, their expression values were averaged. Gene expression was available for a total of 6925 genes. The normalized data are available in Figure 4-source data 6.

Protein extraction and proteomic analysis
Male and female adult S. mansoni gonads were sampled using the whole-organ isolation approach described previously (Hahnel et al., 2013). Twenty ovaries and 20 testes, as well as five heads of each sex, were sampled, in triplicate, from paired worms. All biological samples were resuspended in Laemmli buffer, denatured and frozen at À20˚C until further processing. Subsequent protein treatment and analyses were performed at the 'EDyP-service' -proteomic platform (Grenoble, France). The extracted proteins were digested by modified trypsin (Promega, sequencing grade). The resulting peptides were analyzed by nanoLC-MS/MS (Ultimate 3000 RSLCnano system coupled to Q-Exactive Plus, both Thermo Fisher Scientific). Separation was performed on a 75 mm x 250 mm C18 column (ReproSil-Pur 120 C18-AQ 1.9 mm, Dr. Maisch GmbH) after a pre-concentration and desalting step on a 300 mm Â 5 mm C18 precolumn (Pepmap, Thermo Fisher Scientific).
MS and MS 2 data were acquired using Xcalibur (Thermo Fisher Scientific). Full-scan (MS) spectra were obtained from 400 to 1600 m/z at a 70,000 resolution (200 m/z). For each full-scan, the most intense ions (top 10) were fragmented in MS 2 using high-energy collisional dissociation (HCD). The obtained data were processed in MaxQuant 1.5.8.3 against the database loaded from Uniprot (taxonomy Schistosoma mansoni, October 26th, 2017, 13.521 entries) and the MaxQuant embedded database of frequently observed contaminants. The resulted iBAQ values (Tyanova et al., 2016) were loaded into ProStaR (Wieczorek et al., 2017) for statistical analysis. Contaminant and reverse proteins were removed and only the proteins with three quantified values in at least one condition were taken into account.
After log2 transformation, the iBAQ values were normalized by overall-wise median centering followed by imputation using detQuantile algorithm with quantile set to 1 (Figure 4-source data 1 and 2). An alternative set of data without imputed values is available in Figure 4-source data 3 and 4. 1988 and 2750 Schistosoma mansoni proteins were identified in heads and in gonads, respectively (see Figure 4-source data 1 and 2 for statistical testing of differential abundance between male and female samples). Among them, 1741 and 2516 could be attributed unambiguously to a Schistosoma mansoni gene and were represented by more than one peptide; these were subsequently analyzed (See Figure 4-source data 5).
Data availability DNA reads of male and female S. japonicum are available on the SRA database under study number SRP135770. Sex and tissue-specific S. mansoni label-free proteomic data are provided in Figure 4source data 1 to Figure 4-source data 4.

Code availability
The full bioinformatic pipeline used in this study is provided in Appendix 1. The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Read alignment on reference genome
The same pipeline was applied to all the species. Example is shown here for the S. japonicum species.
Reads were mapped to the respective reference genomes using Bowtie2 (Langmead and Salzberg, 2012

Genomic coverage
The same pipeline was applied to all the species. Example is shown here for the S. japonicum species.

Comparative analysis of gene expression
We perform all the expression analyses in R (Script9_Transcriptomics) with the input files (i) InputFile7_Transcriptomics1 corresponding to the newly identified one-to-one orthologs (see 1.2); and (ii) InputFile10_Transcriptomics2 corresponding to the list obtained on WormBase parasite. In adults, a correlation analysis revealed an inconsistency between replicates, corresponding to the S. mansoni female and male adults: ERR506076 and ERR506082. The corresponding heatmaps are shown in Appendix 1-figure 4. These two libraries were excluded from further analyses.
They can be used directly in the script Script9_Transcriptomics at the step C.