Long non-coding RNAs are essential for Schistosoma mansoni pairing-dependent adult worm homeostasis and fertility

The trematode parasite Schistosoma mansoni causes schistosomiasis, which affects over 200 million people worldwide. Schistosomes are dioecious, with egg laying depending on the females’ obligatory pairing with males. Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides with low or no protein-coding potential that have been implicated in reproduction, stem cell maintenance, and drug resistance in other species. In S. mansoni, we recently showed that the knockdown of one lncRNA affects the pairing status of these parasites. Here, we re-analyzed public RNA-Seq data from paired and unpaired adult male and female worms and their gonads, obtained from mixed-sex or single-sex cercariae infections, and found thousands of differentially expressed pairing-dependent lncRNAs among the 23 biological samples that were compared. The expression levels of selected lncRNAs were validated by RT-qPCR using an in vitro unpairing model. In addition, the in vitro silencing of three selected pairing-dependent lncRNAs showed that their knockdown reduced cell proliferation in adult worms and their gonads, and that these lncRNAs are essential for female vitellaria maintenance, reproduction, and/or egg development. Remarkably, in vivo silencing of each of the three selected lncRNAs significantly reduced worm burden in infected mice by 26 to 35%. Whole mount in situ hybridization experiments showed that these pairing-dependent lncRNAs are expressed in reproductive tissues. These results show that lncRNAs are key components of S. mansoni adult worm homeostasis, affecting pairing status and survival in the mammalian host, and thus have great potential as new therapeutic target candidates.


Supplementary Data
To check whether the in vitro mimetic model has characteristics similar to the mixed-sex/single-sex in vivo infection model used by Lu et al., 2016 [1], we performed RT-qPCR assays to measure protein-coding genes known to be differentially expressed (DE) in the mixed-sex/single-sex in vivo infection model. We first screened for genes that could be used as reference genes for the RT-qPCR analyses, selecting the genes with the most stable expression in the re-analysis of (1) all RNA-Seq libraries from Lu et al., 2016 [1] and (2) all adult worm RNA-Seq libraries from Lu et al., 2016 [1], together with (3) the three best reference genes found in a previous analysis in the literature [2].
The efficiencies of RT-qPCR primers for all candidate reference genes are shown in Table D in S1 Appendix. The Cq values obtained in the RT-qPCR analyses of the candidate reference genes measured in all paired and unpaired male and female in vitro cultured samples are shown in Table J in S1 Appendix. The expression stability of all candidate reference genes was assessed with GeNorm [3] (Table K in S1 Appendix) and NormFinder [4] (Table L in S1 Appendix). Our analysis identified Smp_099690 (Protein RER1) and Smp_023150 (Serine/threonine-protein phosphatase 6 catalytic subunit) as the two most stable genes in our 14 different samples from adult worms cultured in vitro for up to 8 days, either paired or unpaired.
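To illustrate the kind of stability ranking referred to above, the following minimal sketch computes a geNorm-style M value: for each candidate gene, the average, over all other candidates, of the standard deviation of the log2 expression ratio across samples (lower M means more stable). This is a sketch of the published measure and not the GeNorm software used in the analysis; the gene names, Cq values, and the default amplification efficiency of 2.0 are hypothetical and are not taken from Tables D or J.

```python
import math
from statistics import mean, stdev


def cq_to_quantity(cq_values, efficiency=2.0):
    """Convert Cq values to relative quantities, scaled to the lowest Cq.

    efficiency=2.0 (perfect doubling per cycle) is an assumption.
    """
    min_cq = min(cq_values)
    return [efficiency ** (min_cq - cq) for cq in cq_values]


def genorm_m(rel_quantities):
    """Return a dict gene -> geNorm-style stability value M.

    rel_quantities maps each gene to a list of relative quantities,
    one per sample, in the same sample order for every gene.
    """
    genes = list(rel_quantities)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of gene j to gene k in each sample
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(rel_quantities[j], rel_quantities[k])
            ]
            pairwise_sd.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_sd)
    return m_values


# Hypothetical Cq values for three candidate genes in four samples.
example_cq = {
    "gene_a": [20, 21, 22, 23],
    "gene_b": [22, 23, 24, 25],  # co-varies perfectly with gene_a
    "gene_c": [20, 24, 19, 26],  # varies independently of the others
}
example_q = {g: cq_to_quantity(cq) for g, cq in example_cq.items()}
m = genorm_m(example_q)
ranking = sorted(m, key=m.get)  # most stable candidates first
print(ranking)
```

In this toy data set, gene_a and gene_b keep a constant ratio across samples and therefore obtain a lower M than gene_c.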
Then, we measured by RT-qPCR the expression of 14 protein-coding genes as controls, including genes that are differentially expressed between females and males (see next paragraph). The efficiencies of RT-qPCR primers for all protein-coding genes used are shown in Table E in S1 Appendix, and the Cq values obtained from this analysis are shown in Table M in S1 Appendix.
Fig I. Expression validation of protein-coding genes known to be related to the reproductive system and measured by RT-qPCR in S. mansoni cultured in vitro for 2, 4 or 8 days as paired couples or unpaired males and females. Four protein-coding genes were selected for RT-qPCR assays, namely (A) Smp_316140, Protein p14; (B) Smp_032990, Calcium binding protein CML11; (C) Smp_051920, Nanos-type domain-containing protein; and (D) Smp_307900, female specific protein 800 (fs800). Male-related results are shown on the left (green) and female results on the right (orange). Paired couples (P) or unpaired (U) parasites were obtained by perfusion of hamsters infected for 42 days with S. mansoni cercariae. After perfusion, males and females were cultured in vitro for 2, 4 or 8 days as paired (P) couples or unpaired (U) worms. RT-qPCR results (solid-colored graphs) are normalized to the geometric mean of reference genes Smp_099690 and Smp_023150. Expression values from 4 different biological replicates are shown. Standard error of the mean (SEM) is shown in the error bars. (**) = p < 0.01; (***) = p < 0.001, Student's t test. N.D.: Not detected. ns: p-value > 0.05. For comparison, RNA-Seq data from the re-analysis of Lu et al., 2016 [1] are shown (plaid-colored graphs), with expression measured in TPM (transcripts per million); RNA-Seq data were retrieved from males (M), females (F), testes (T) or ovaries (O) from either a mixed-sex (b) or a single-sex (s) infection; (#) = FDR < 0.005. The fold-change differences between the compared groups are shown under the brackets.
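The normalization to the geometric mean of two reference genes mentioned in the Fig I legend can be sketched as follows. This is a minimal, efficiency-corrected relative quantification under stated assumptions: the function name, the Cq values, and the efficiencies of 2.0 are hypothetical, and the text does not state whether an efficiency correction of this form was applied.

```python
import math


def relative_expression(cq_target, eff_target, references):
    """Expression of a target gene relative to several reference genes.

    references is a list of (cq, efficiency) tuples, one per reference
    gene. Each quantity is efficiency ** (-Cq); the target quantity is
    divided by the geometric mean of the reference quantities.
    """
    target_quantity = eff_target ** (-cq_target)
    ref_quantities = [eff ** (-cq) for cq, eff in references]
    # Geometric mean computed in log space for numerical stability.
    norm_factor = math.exp(
        sum(math.log(q) for q in ref_quantities) / len(ref_quantities)
    )
    return target_quantity / norm_factor


# Hypothetical Cq values: target at 24 cycles, references at 20 and 22.
value = relative_expression(24.0, 2.0, [(20.0, 2.0), (22.0, 2.0)])
print(value)  # about 0.125, i.e. 2 ** -(24 - 21)
```

Averaging the reference genes geometrically rather than arithmetically keeps one highly abundant reference gene from dominating the normalization factor.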
The p14 gene was expressed mainly in females rather than males, and mostly in females obtained from mixed-sex infections. Our in vitro mimetic model analysis showed that the p14 gene was differentially expressed between paired and unpaired females cultured in vitro for 2, 4, and 8 days. Notably, even though the ABC medium was developed to adequately sustain female egg deposition in vitro, p14 gene expression was sensitive to in vitro culturing of the parasites (Fig I, panel A).
On the other hand, when looking at the expression pattern of the Calcium-binding protein CML11 gene (Smp_032990) (Fig I, panel B), another gene mostly expressed in females, we observed an in vitro culturing time-dependent differential expression in unpaired females when compared with paired females. When looking at the Smp_051920 Nanos-type domain-containing protein (Fig I, panel C), a 3.2-fold higher expression in unpaired females was observed when compared to paired females cultured in vitro for 8 days; this is in line with its higher expression in the male and female gonads when compared to the whole worm in the re-analysis of Lu et al., 2016 [1] data, and with the significantly higher expression in the ovaries retrieved from single-sex infections compared with mixed-sex infections (Fig I, panel C). Further, another gene that is mostly expressed in females was measured, the female-specific protein 800 (fs800, Smp_307900), and the differential expression increased in culture in a time-dependent manner in the paired females when compared with the unpaired ones (Fig I, panel D).
Other genes such as Egg Shell Protein (Smp_000430), Nucleotide

Identification of lincRNAs enriched in mixed-sex females, males, and their reproductive organs
To identify a subset of lincRNA candidates to be tested for their possible involvement in adult worm pairing, we devised the filtering pipeline described below to single-out the lincRNAs enriched in mixed-sex females, males, and their reproductive organs.
First, we took each of the mixed-sex female, male, ovary, and testes samples and performed the most relevant pairwise comparisons between samples (Table N).
The intergenic lncRNAs (lincRNAs) were chosen because interfering with their expression levels with knockdown approaches (such as using double-stranded RNAs) does not suffer from the problem of a simultaneous artifactual knockdown of a protein-coding message from the same locus. In fact, targeting a sense or an antisense lncRNA with long double-stranded RNAs (dsRNAs) may unwantedly reduce the expression of the protein-coding gene that is expressed from the locus, because the long dsRNA will share a stretch of its sequence with an intron of the immature pre-mRNA of the protein-coding gene message transcribed in the locus. An intergenic lncRNA is, by definition, away from the locus of a protein-coding gene, thus it will not share any stretch of sequence with the pre-mRNA message of a protein-coding gene.

A subset of lncRNAs show higher expression in mixed-sex worms or gonads compared with single-sex worms
Next, to further focus on a reduced list of differentially expressed lincRNAs that could play roles in S. mansoni worm sexual development and pairing, we excluded the lincRNAs with TPM values lower than 2 (lowly expressed), and cross-referenced the lists of lincRNAs found as significantly more expressed in mixed-sex females (bF) compared with single-sex females (sF), with mixed-sex ovaries (bO) and with mixed-sex males (bM), and found 31 bF enriched lincRNAs in common among these three pairwise comparisons (Fig N, red circle).
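The cross-referencing step described above amounts to an expression filter followed by a set intersection across pairwise comparisons. A minimal sketch is given below, assuming that the differential expression results are already available as sets of identifiers; the function name, the data layout, and the lincRNA identifiers and TPM values are hypothetical and serve only as illustration.

```python
def enriched_in_all(de_up, tpm, group, comparators, min_tpm=2.0):
    """Return lincRNAs enriched in `group` against every comparator.

    de_up maps (group, comparator) to the set of lincRNA ids that are
    significantly more expressed in `group` than in `comparator`.
    tpm maps a group name to a dict of lincRNA id -> TPM value.
    lincRNAs with TPM below min_tpm in `group` are excluded first.
    """
    common = {ident for ident, v in tpm[group].items() if v >= min_tpm}
    for other in comparators:
        common &= de_up.get((group, other), set())
    return common


# Hypothetical identifiers and values, for illustration only.
tpm = {"bF": {"lincA": 10.0, "lincB": 1.5, "lincC": 7.0, "lincD": 3.0}}
de_up = {
    ("bF", "sF"): {"lincA", "lincB", "lincC"},
    ("bF", "bO"): {"lincA", "lincB", "lincD"},
    ("bF", "bM"): {"lincA", "lincB", "lincC", "lincD"},
}
bf_enriched = enriched_in_all(de_up, tpm, "bF", ["sF", "bO", "bM"])
print(sorted(bf_enriched))
```

Here lincB is removed by the TPM filter, while lincC and lincD are each missing from one comparison, so only lincA remains in the intersection.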
Similarly, when looking at lincRNAs significantly more expressed in mixed-sex males (bM), we found four bM lincRNAs in common in comparison with sM, bT, and bF, with an additional 44 lincRNAs in common in comparison with bT and bF alone, and 4 lincRNAs significantly more expressed in bM than in sM (Fig O, red circles).
When the lincRNAs significantly more expressed in mixed-sex ovaries (bO) were analyzed, we found 64 lincRNAs in common in the pairwise comparisons with bT, bM, bF, and sO (Fig P, red circle). Looking at lincRNAs significantly more expressed in bT, we found 141 in common in the pairwise comparisons with bO, bM, and bF and not with sT (Fig Q, red circle), with only one additional lincRNA being more expressed in bT compared with sT and present in the other comparisons (Fig Q, red circle).
We further extended the search to include all lincRNAs expressed at levels of TPM > 2 in mixed-sex females (bF) and in ovaries from mixed-sex females (bO). We separated the lincRNAs that had epigenetic marks at their TSS (Fig R) from those without the marks (Fig S) and identified the number of lincRNAs that were detected as significantly more expressed in bF in the pairwise comparisons with sF and bM or as
We narrowed the above selected sets by excluding the lincRNAs with more than one isoform in their genomic locus, resulting in a total of 33 lincRNA candidates containing a chromatin epigenetic mark at their transcription start site (TSS), an indication of lincRNA regulation.
Adult worms retrieved from perfusion of 42 days-infected Syrian hamsters were cultivated in 6-well plates with 5 mL of ABC media for eight days without dsRNA, with dsRNA targeting mCherry (control dsRNA), or with dsRNA targeting SmLINC101519, SmLINC110998, SmLINC175062 or SmLINC130991, as indicated at left. Note that SmLINC130991 is an unrelated control lincRNA, not detected as enriched in mixed-sex samples in the mixed-sex/single-sex in vivo model comparison. Eight days after treatment, the eggs were counted and collected for image acquisition and further characterization. Egg area, autofluorescence (AUF), and the ratio of proliferating cells (EdU+) to total cell number (DAPI, nuclei) were measured using Fiji/ImageJ as described in Methods.
For the EdU and DAPI labeling, EdU was added to the collected eggs and incubated in ABC medium for another day; DAPI was added 4 h before image acquisition. After that, the eggs were processed and EdU (red) and DAPI (gray) fluorescence were measured as described in Methods. Representative images from 1 out of 4 experiments with n > 10 eggs. Note that for each dsRNA that is shown, a different egg in the field is shown at each of the different assays that are presented. Scale bars: 20 µm for egg area and autofluorescence (AUF) and 25 µm for DAPI+ and EdU+/DAPI+ eggs.
Paired adult worm couples were cultured in vitro for 8 days in ABC media and were pulse labelled with EdU that was added on the 7th day and cultured for an additional 24 h, as described in Fig. 6. EdU fluorescence images were acquired and quantified with Fiji/ImageJ as described in Methods, and quantification is shown as the number of EdU+ cells per m 2 in panels A to D. Culture media were supplemented with 30 µg/mL of dsRNA targeting each of the indicated lincRNAs, namely SmLINC101519, SmLINC110998, SmLINC175062, or SmLINC130991, with a negative control dsRNA targeting mCherry (a gene that is not present in S. mansoni), or with no dsRNA. Note that SmLINC130991 is an unrelated control lincRNA, not detected as enriched in mixed-sex samples in the mixed-sex/single-sex in vivo model comparison.