An ancient germ cell-specific RNA-binding protein protects the germline from cryptic splice site poisoning

Male germ cells of all placental mammals express an ancient nuclear RNA binding protein of unknown function called RBMXL2. Here we find that deletion of the retrogene encoding RBMXL2 blocks spermatogenesis. Transcriptome analyses of age-matched deletion mice show that RBMXL2 controls splicing patterns during meiosis. In particular, RBMXL2 represses the selection of aberrant splice sites and the insertion of cryptic and premature terminal exons. Our data suggest a Rbmxl2 retrogene has been conserved across mammals as part of a splicing control mechanism that is fundamentally important to germ cell biology. We propose that this mechanism is essential to meiosis because it buffers the high ambient concentrations of splicing activators, thereby preventing poisoning of key transcripts and disruption to gene expression by aberrant splice site selection.


Introduction
It has been a long recognised hallmark of mammalian gene expression patterns that there are extremely high levels of transcription and transcriptome complexity in testicular cells (de la Grange et al., 2010;Licatalosi, 2016;Soumillon et al., 2013;Pan et al., 2008;Wang et al., 2008;Clark et al., 2007;Grosso et al., 2008;Yeo et al., 2004). These high gene expression levels are thought to result from epigenetic changes that favour relaxed patterns of gene expression during meiosis -the unique form of division used to generate sperm and eggs (Soumillon et al., 2013). Many nuclear RNA-binding proteins are differentially expressed during and immediately after meiosis (Grellscheid et al., 2011;Schmid et al., 2013). These include the nuclear RNA binding protein RBMXL2 (also known as hnRNP GT) that is expressed only during and immediately after male meiosis, but not in the preceding spermatogonial cells (Ehrmann et al., 2008) ( Figure 1A). Consistent with an important function in germ cell development, genetic studies have identified point mutations within infertile men in the human chromosome 11 RBMXL2 gene (Westerveld et al., 2004). A further connection to male infertility is that RBMXL2 belongs to the same gene family as RBMY, which was historically the first human Y chromosome infertility gene identified in the search for the AZF (AZOO-SPERMIA FACTOR) gene (Ma et al., 1993).
RBMXL2 evolved~65 million years ago via retrotransposition of an mRNA from the X chromosome located RBMX gene ( Figure 1A). An RBMXL2 gene is found in all placental mammals, which is consistent with a fundamental role in germ cell biology. This role remains to be identified, but RBMXL2 protein has an N-terminal RNA Recognition Motif (abbreviated RRM, Figure 1B). RBMX and RBMY proteins are also nuclear RNA binding proteins that are very similar to RBMXL2 (73.2% and 36.8% overall identity to RBMX and RBMY, respectively; 93.7% and 77.2% identity within the RRMs, Figure 1-figure supplement 1). The RNA binding specificity of RBMXL2 protein is unknown, but both RBMX and RBMY proteins bind to AA dinucleotide-containing RNA sequences (Cléry et al., 2011;Moursy et al., 2014;Nasim et al., 2003). The RBMXL2, RBMX and RBMY proteins interact with and modulate the splicing activity of Tra2b and SR proteins in vitro ( Figure 1B) (Cléry et al., 2011;Moursy et al., 2014;Liu et al., 2009;Nasim et al., 2003;Elliott et al., 2000a), suggesting a role in splicing control. Maintaining proper ratios of mRNA splice isoforms can be critical in normal development (Kalsotra and Cooper, 2011), where changes in isoforms can have effects on encoded proteins ranging from major to subtle. Alternative splicing is known to be critical for germ cell development. For example, deletion of the splicing regulator protein PTBP2 within eLife digest In humans and other mammals, a sperm from a male fuses with an egg cell from a female to produce an embryo that may ultimately grow into a new individual. Sperm and egg cells are made when certain cells in the body divide in a process called meiosis. Many proteins are required for meiosis to happen and these proteins are made using instructions provided by genes, which are made of a molecule called DNA.
The DNA within a gene is transcribed to make molecules of ribonucleic acid (or RNA for short). The cell then modifies many of these RNAs in a process called splicing before using them as templates to make proteins. During splicing, segments of RNA known as introns are discarded and other segments termed exons are joined together. Some exons may also be removed from RNAs in different combinations to create different proteins from the same gene.
A protein called RBMXL2 is able to bind to RNA molecules and is only made during and after meiosis in humans and most other mammals. RBMXL2 can also bind to other proteins that are known to be involved in controlling splicing of RNAs, but its role in splicing remains unclear.
To address this question, Ehrmann et al. studied the gene that encodes the RBMXL2 protein in mice. Removing this gene prevented male mice from being able to make sperm. Further experiments using a technique called RNA sequencing showed that the RBMXL2 protein helps to ensure that splicing happens correctly by preventing bits of exons and introns in mouse genes from being rearranged. These findings suggest that the gene encoding RBMXL2 is part of a splicing control mechanism that is important for making sperm and egg cells.
The work of Ehrmann et al. could eventually help some couples understand why they have problems conceiving children. Male infertility is poorly understood, and not knowing its causes can harm the mental health of affected men. Furthermore, these findings may help researchers to understand the role of a closely related protein called RBMY that has also been linked to infertility in men, but is much more difficult to study. germ cells affects mRNA isoforms important for cell-cell communication with Sertoli cells (Hannigan et al., 2017). Some alternative splicing events in the testis are conserved between humans and mice so may control fundamental aspects of germ cell biology (Schmid et al., 2013). However many alternative splicing patterns are not conserved between humans and mice (Kan et al., 2005). RBMX protein is also reported to control transcription (Takemoto et al., 2007), affect DNA double strand repair and mitotic sister chromatid cohesion (Adamson et al., 2012;Matsunaga et al., 2012), and to bind to m6A methylated RNA (Liu et al., 2017).
The function of RBMXL2 and why this RNA binding protein has been conserved across placental mammals are not known. A major factor limiting understanding of endogenous RBMXL2 functions has been the absence of a reliable mouse model. Development of a mouse model is also critical to test the importance of this wider family of RNA binding proteins in germ cell development. Men carrying the AZFb deletion on their Y chromosomes are missing RBMY genes and undergo meiotic arrest. However, it is unclear if RBMY loss is causing this phenotype, because the deletion interval encompasses several other genes which could contribute to male infertility (Elliott, 2000;Vogt et al., 1996). Within the meiotic and immediately post-meiotic germ cells which express RBMXL2, the RBMX and RBMY genes are transcriptionally inactivated within a heterochromatic structure called the XY body (Wang, 2004). Meiosis thus provides a genetically tractable window to probe RBMXL2 function where there should be no redundancy effects possible with either RBMX or RBMY. Hence to discover what RBMXL2 does in the germline we have made a conditional Rbmxl2 gene knockout mouse. Analysis of this knockout mouse reveals that RBMXL2 protein is essential for meiosis and has a major role in protecting the meiotic transcriptome from aberrant selection of cryptic splice sites that are normally ignored by the spliceosome. Our data suggest this fundamentally important process operates so efficiently in meiosis that it has been previously undetected, yet is critical to avoid male infertility caused by aberrant splicing of key meiotic transcripts.

RBMXL2 protein is essential for male fertility
To test how important RBMXL2 protein is for male germline development we made a conditional mouse model in which we flanked the Rbmxl2 open reading frame with LoxP sites. Since Rbmxl2 is only expressed in the testis (Elliott et al., 2000b), we chose to delete the entire Rbmxl2 open reading frame to create a null allele. We achieved this by crossing our conditional model with a mouse strain expressing Cre recombinase under control of the ubiquitous Pgk promoter (experimental details are provided in the Materials and methods). We confirmed deletion of this genomic region in homozygous Rbmxl2 gene knockout (Rbmxl2 -/-) mice by Southern blotting (Figure 1-figure supplement 2) and the specific absence of the 50 KDa RBMXL2 protein from knockout testes by Western blotting ( Figure 1C).
Rbmxl2 -/mice developed comparably to their wild type littermates but their testes were much smaller (Figure 2A and B). This small testis phenotype correlated with a severe disruption of testicular histology. Adult Rbmxl2 -/mice contained cells undergoing meiosis but almost no post-meiotic cells ( Figure 2C, meiotic spermatocytes are abbreviated Spc, and post-meiotic round spermatids are abbreviated Rtd). The epididymis dissected from Rbmxl2 -/mice were completely devoid of sperm. Four wild type mice tested had an average of 7.07 ± 1.39Â10 6 epididymal sperm ml À1 , compared with zero in four Rbmxl2 -/mice ( Figure 2D and E). No effect on female fertility was observed in Rbmxl2 -/mice (not shown).
Very rarely a few round spermatids were observed in adult Rbmxl2 -/testis sections (Figure 2figure supplement 1A), although no elongated spermatids were detected. The presence of these round spermatids indicates that meiosis can occasionally complete in the absence of RBMXL2 protein. This is consistent with loss of RBMXL2 causing either (1) a developmental block in meiosis from which a few cells can escape; or alternatively (2) a slow attrition effect, in which the Rbmxl2 -/testis phenotype is caused by post-meiotic stages dying in the adult testis. To differentiate between these two possibilities we histologically analysed testes at 21 days postpartum (21dpp), at which point germ cells in the wild type (Rbmxl2 +/+ ) mice have just started to enter the post-meiotic round spermatid stage. At 21dpp there were still significantly fewer round spermatids in the Rbmxl2 -/sections compared to wild type (Figure 2-figure supplement 1B), even though there would have been less time for round spermatid cell death compared with the adult. This result thus supports a strong meiotic block in Rbmxl2 -/mice, with occasional completion of meiosis rather than a gradual attrition of round spermatids.
Most Rbmxl2 -/germ cells do not progress past meiotic diplotene The above data demonstrate that an Rbmxl2 gene, which is conserved in all placental mammals and specifically expressed in male meiosis, is essential for mouse spermatogenesis somewhere within meiotic prophase. Staining of nuclear spreads with antibodies specific to the meiotic chromosome proteins SYCP1 and SYCP3 more precisely showed that Rbmxl2 -/mice arrest germ cell development during the diplotene substage of meiotic prophase ( Figure 2F and  Analysis of diplotene nuclear spreads revealed further abnormalities in the Rbmxl2 -/testis. There was a decreased number of H3K9me3-marked centromere clusters (Takada et al., 2011), with each individual cluster also containing more centromeres than in wild type testis (Figure 2-figure supplement 4A-C). H3K9me3 staining was also present at the sex body in a proportion (82%) of wild type diplotene spermatocytes, but essentially absent in mutant diplotene spermatocytes (1.5% of diplotene spermatocytes stained, 66 Rbmxl2 -/and 66 wild type nuclei scored, n = 3 for both) ( In summary, the above mouse phenotype showed that deletion of the ancient RNA binding protein RBMXL2 induces progressive defects in male mouse meiotic prophase. These defects culminate with a major block during diplotene that prevents entry into metaphase I, but with defects becoming already apparent during pachytene. Gene expression analysis of age matched wild type and Rbmxl2 -/testes Since RBMXL2 is a nuclear RNA binding protein we predicted that the above phenotype could be associated with a primary molecular defect in generating or processing RNAs in the Rbmxl2 -/testis. To test this we analysed wild type and Rbmxl2 -/testes using RNAseq. Based on the knockout phenotype above, and because the adult wild type testis contains additional more advanced germ cells that are missing from the adult Rbmxl2 -/testis, we analysed testes at 18 days post partum (18dpp) during the first synchronised wave of mouse spermatogenesis. Such 18dpp wild type mouse testes contain germ cells between spermatogonia all the way through to diplotene, with 60% of cells engaged in meiotic prophase, but no post-meiotic cells (Bellvé et al., 1977).
Gene expression analysis of this RNAseq data (Anders and Huber, 2010) showed overall patterns of transcription were similar between wild type and Rbmxl2 -/-18dpp testes ( Figure 3A, and mice stained with haemotoxylin and eosin. Asterisk indicates lumen (note lumen is empty in Rbmxl2 -/section indicative of azoospermia, but contains multiple cells in the wild type). Scale bar = 20 um. (E) Epididymal sperm counts of wild type and Rbmxl2 -/mice (n = 4 of each genotype). The error bar represents SEM. (F) Meiotic prophase I stages detected in wild type and Rbmxl2 -/testis chromosome spreads stained for SYCP3 (pseudocoloured red) and SYCP1 (pseudocoloured green); and DAPI and SYCP3 (pseudocoloured blue and red respectively). No metaphase I nuclei were identified in the Rbmxl2 -/testes. Quantitative data are provided in Figure   18dpp testis transcriptomes and how they change between wild type and Rbmxl2 -/mice. Genes with more than a 2-fold change in gene expression, and an adjusted p value of less than 0.05 are shown as red dots. All other genes are shown as black dots. This data given in full within Figure 3source data 1. (B) Scatterplot showing splicing changes between the wild-type and Rbmxl2 -/testis detected by RNA-seq. The scatterplot shows expected percentage of exonic segment inclusion, or E(PSI) for wild-type and Rbmxl2 -/testis detected by RNA-seq using the bioinformatics programme MAJIQ (Vaquero-Garcia et al., 2016). Splicing changes in some genes are named and arrowed. High-confidence splicing changes (P(|D PSI| > V)>95%) are marked in dark red for a predicted change of V = 20%, and dark yellow for a predicted change of V = 10%, and are also listed in  Only 45 genes showed a fold change greater or equal to two between the wild type and Rbmxl2 -/testes, with an adjusted p value of less than 0.05, and only 23 of these changes were for known protein coding genes (Figure 3-source data 1). The strongest difference in overall gene expression between the wild type and Rbmxl2 -/backgrounds was for the Rbmxl2 -/gene itself, as expected since the Rbmxl2 -/gene is deleted from the knockout mouse. We also detected strong expression changes within the Fsip2 gene, particularly for Fsip2 exon 16 and downstream exons that were expressed only in the wild type background (Figure 3-figure supplement 1A and B). Mutations in Fsip2 correlate with defects in human sperm flagella -cellular structures which develop after meiosis (Martinez et al., 2018). RNAseq analysis detected more subtle expression changes in Cul4a, Slc9c1 and Tex15, each of which are required for mouse male fertility   (Figure 3-source data 1) (Kopanja et al., 2011). Genes encoding transcription factors (Myf6 and Nxn) and a signalling protein (Cyr61) that controls apoptosis (Jun and Lau, 2011) also changed expression. Mouse phenotype information at the Mouse Genome Database (MGD)  indicate that each of these latter three genes have important roles in normal development but not specifically of the testis (Figure 3-source data 2). Sixteen (36% of the total detected) gene expression changes in the Rbmxl2 -/testis were for predicted non coding RNAs of unknown function (Figure 3-source data 1).

RBMXL2 protein controls splicing patterns during meiosis
We carried out further bioinformatic analysis to specifically search for mis-regulated splicing events in the Rbmxl2 -/testes (Vaquero-Garcia et al., 2016). A total of 237 high-confidence, mis-regulated local splicing variations were identified in 186 genes ( Figure 3B, and Figure 3-source data 3). Using RT-PCR to distinguish splice isoforms 27 of these splicing changes were experimentally tested, validating 23/27 splice isoform switches ( Figure 3-source data 4). Some genes had more than one splicing event controlled by RBMXL2 (e.g. the Catsperb gene had three events). Gene Ontology (GO) analysis showed that a number of the genes regulated at the splicing level by RBMXL2 have established roles in spermatogenesis, meiosis and germ cell development ( Figure 3-source data 5). However, amongst the complete set of regulated genes there was no significant enrichment of particular GO terms. This is consistent with RBMXL2 regulating splicing of a functionally diverse group of genes. Analysis of knockout phenotypes provided by the MGD     indicated that whole gene deletion of 25/186 RBMXL2-regulated genes cause male infertility. These latter target genes must thus have a key role in germ cell development ( Figure 3C, Figure 3source data 6, Figure 3-source data 7). Genetic deletion of some other RBMXL2 target genes cause either embryonic or neonatal lethality (33/186 genes), developmental (19/36 genes) or pleiotropic defects (43/186 genes). Cell type mis-splicing of these latter genes during meiosis could thus cause severe phenotypic effects on germ cell biology.
Fourteen (33%) of the genes originally identified to have changed overall expression levels between wild type and Rbmxl2 -/testes also changed splicing patterns ( Figure 3C, and Figure 3source datas 1 and 3). These included splice variants in Esco1 (encoding a protein involved in chromatid cohesion), Slc39a8 (that encodes the transporter protein responsible for cadmium toxicity in the testis) (Dalton et al., 2005); and the Slc9c1 gene (annotated on the MGD as essential for male fertility ).
The RNA binding specificity of RBMXL2 protein was unknown. Thus we used high throughput sequencing cross linking immunoprecipitation (HITS-CLIP) to enable us to correlate splicing changes detected within the Rbmxl2 -/-18dpp mouse testis with global RBMXL2 protein-RNA interactions (Grellscheid et al., 2011). Antibodies specific to mouse RBMXL2 immuno-precipitated a radiolabelled RNA protein adduct of the known size of RBMXL2 protein (50 KDa) after treatment with high concentrations of RNase (Figure 3-figure supplement 2A) (Elliott et al., 2000b). Lower concentrations of RNase were used to retrieve an average tag length of 40 nucleotides. Enriched motif analysis of the sequenced RBMXL2 CLIP tags showed that each of the top 10 5-mers contained the dinucleotide AA ( Figure 3D). Interestingly, AA is also the dinucleotide bound by RBMX (Moursy et al., 2014), which is consistent with over 90% shared sequence identity within the RRMs of these proteins (Figure 1-figure supplement 1). These intragenic cross-linked sites mapped to the mouse genome most frequently within introns, consistent with RBMXL2 being involved in nuclear RNA-processing events (Figure 3-figure supplement 2B). Next we searched for CLIP tags mapping to regions near the splicing events controlled by RBMXL2, which would suggest direct regulation. We observed enrichment of RBMXL2 binding both within alternative exon sequences and in regions proximal to regulated splice sites compared to non-regulated events (

RBMXL2 protein represses splicing of exons that would compromise meiotic gene expression
Overall, the above bioinformatics analysis indicated an accumulation of defective splice isoform patterns in the Rbmxl2 -/testis. MAJIQ captures local splicing variations (LSVs) that involve both known and un-annotated (de-novo) splice sites, junctions and exons. These LSVs can correspond to both classical binary splicing events and more complex events involving three or more junctions. Since classical binary events (e.g. skipped exons) are easier to inspect and visualise with the RNAseq reads on the UCSC genome browser, we focused on the binary events for further investigation. This showed that 60% of the 87 most easily visualised classical splicing events controlled by RBMXL2 involve the altered selection of de-novo splice sites (defined as not currently annotated on the most recent build of the Mus musculus genome GRCm38/mm10, Figure 3-source data 3). These denovo splicing variations controlled by RBMXL2 are putatively cryptic events as they insert novel internal exons that were flanked by consensus GT-AG 5' and 3' splice sites. Moreover, these de-novo splice sites showed low expected percent spliced in (E(PSI)) values in wild type 18 dpp testes and across a panel of twelve mouse tissues, further suggesting they are cryptic events only included in the absence of RBMXL2 and not in wild type mice (Figure 4-figure supplement 1). Detailed investigation indicated 84% of such cryptic exons were either not multiples of three or introduced stop codons into their mRNAs, meaning their insertion into mRNAs would disrupt protein reading frames and interfere with meiotic protein expression.
Splicing inclusion of cryptic terminal exons could also severely impact patterns of meiotic gene expression. We detected high inclusion of a cryptic terminal exon within the 5 0 UTR of the Kdm4d gene ( Figure 4A and B, note also increased splicing of an already annotated upstream Kdm4d alternative exon within the Rbmxl2 -/testis). We confirmed splicing inclusion of this Kdm4d cryptic exon in the Rbmxl2 -/testes using RT-PCR (primer positions are shown in Figure 4A). Kdm4d encodes a  Figure 4 continued on next page histone demethylase protein that is important for normal patterns of germ cell apoptosis in the testis (Iwamori et al., 2011). Exon two contains the entire CDS (Coding DNA Sequence, Figure 4A) of the Kdm4d gene. Analysis of exon junction read numbers using Sashimi plots confirmed different nuclear processing pathways are used for Kdm4d in wild type and knockout testes (Figure 4-figure supplement 2A). There were 14-fold more exon junction reads connecting Kdm4d exon one to the cryptic exon (140 reads), when compared to exon junction reads joining the cryptic exon to Kdm4d coding exon 2. This pattern is consistent with splicing inclusion of the cryptic exon being connected with use of an associated polyA site. Consistent with this, some individual RNAseq reads extended past the 5 0 splice site of the Kdm4d cryptic exon and then terminated downstream following a consensus polyadenylation site (AATAAA) sequence ( Figure 4C). Both bioinformatics analysis (Figure 3source data 1) and qPCR (Figure 4-figure supplement 2B) detected reduced Kdm4d gene expression levels in each of 3 replicate Rbmxl2 -/testes compared to their wild type equivalents, but these trends were not statistically significant due to biological variability between individual mice. We also detected increased selection in the Rbmxl2 -/testes of another 17 splicing events that mapped to terminal exons (Figure 3-source data 3). These included an already annotated upstream terminal exon in the Lrrcc1 gene (annotated on the MGD as essential for mouse fertility) ; and a terminal exon in the Slc39a8 gene (Figure 4-figure supplement 3A and  B).
In addition to splicing of entire cryptic exons, loss of RBMXL2 protein also activated splicing selection of individual cryptic 5 0 and 3 0 splice sites within introns that increased the length of already annotated exons (Figure 3-source data 3). This kind of changed splicing pattern was observed for the Gm5134 mRNA that encodes a solute transporter important for normal metabolism ( Figure 4D and E). The aberrant splice isoform of the Gm5134 transcript made in the Rbmxl2 -/testis uses a cryptic 3 0 splice site within intron 10, increasing the size of exon 11 by 1353nt.
RBMXL2 protein is important to prevent mis-splicing of long exons in key genes during meiosis As well as cryptic splice sites within introns, some exon-located cryptic splice sites were activated within the Rbmxl2 -/testes, including within genes important for testis development (Figure 3source data 6). Amongst the genes known to be important for meiosis, we observed a dip in RNAseq density for the interior of Meioc (meiosis specific gene with coiled coil domain) exon five in the Rbmxl2 -/testis transcriptome ( Figure 5A) (Abby et al., 2016;Soh et al., 2017). This dip in RNAseq read density was caused by high levels of cryptic splice site activation within Meioc exon 5. Sashimi plots showed that in the Rbmxl2 -/-testes~28% of the splice junctions joined Meioc exon four to an exon-internal cryptic 3 0 splice site within Meioc exon 5 ( Figure 5-figure supplement 1A). A further cryptic 5 0 splice site within Meioc exon five was also used in 22% of the Rbmxl2 -/testis transcripts. Use of both cryptic splice sites converted Meioc exon five into an exitron (an exon with internal splice sites) (Marquez et al., 2015;Sibley et al., 2016). RT-PCR analysis confirmed both these defective Meioc exon five splicing patterns within the Rbmxl2 -/testes ( Figure 5B-C).
The cryptic splice sites within Meioc exon five are hardly ever used in the wild type 18dpp testis transcriptome. In silico analysis showed that the internal cryptic 5 0 splice site within Meioc exon five is weak compared to 5 0 splice sites for other known alternative exons (7.5 th percentile by SROOGLE, compared to 57 th percentile for the downstream annotated 3 0 splice site ( Figure 5-figure supplement 1B) (Schwartz et al., 2009). The first two nucleotide positions downstream of the Meioc exon 5-cryptic internal 5 0 splice site are GC, so diverge from the GT consensus which is found flanking almost all eukaryotic introns. Similarly, the cryptic internal 3 0 splice site within Meioc exon five is relatively weak (8 th percentile of the average 3 0 splice site strength for an alternative exon, compared to 41 st percentile for the upstream annotated 3 0 splice site). Meioc exon five is also directly bound by   Figure 5A). These include a CLIP tag directly overlapping the Meioc exon 5 internal cryptic 5' splice site, as well as CLIP tags upstream of the internal 3 0 splice site.
Meioc exon five is highly conserved across species suggesting it encodes a functionally important part of Meioc protein ( Figure 5A). Exitrons frequently modify protein coding capacity of mRNAs by removing peptide coding segments from mRNAs (Marquez et al., 2015;Sibley et al., 2016). Both the cryptic splicing events in Meioc exon five remove RNA segments within the Meioc CDS that are multiples of 3. These events shorten the CDS just upstream of but largely not including the coding information for the MEIOC domain itself (http://pfam.xfam.org/family/PF15189). We thus predicted that a shorter Meioc protein isoform may be produced in the Rbmxl2 -/testis. Consistent with this, Western blotting with a Meioc specific antiserum (Abby et al., 2016) detected a 40-50 KDa Meioc protein within the Rbmxl2 -/testis, but not in wild type or heterozygote Rbmxl2 ± testes ( Figure 5D). This 40-50 KDa molecular weight corresponds to an expected Meioc protein size following complete exitron removal. Although shorter Meioc protein isoforms were only detected in the Rbmxl2 -/testes, in each genotype we detected a protein doublet just above 100 KDa corresponding to the reported size of full length Meioc protein (Abby et al., 2016) ( Figure 5D).
We further analysed splicing patterns during the first wave of mouse spermatogenesis to test whether appearance of the Meioc splicing defects corresponded with the known expression window of RBMXL2 protein in meiotic prophase (Ehrmann et al., 2008). Aberrant splice isoforms of Meioc exon five only appeared within Rbmxl2 -/testes from 14dpp, and were barely visible at any time point within similar age wild type testes. We performed similar experiments to analyse the dynamics of Brca2 splicing in the first wave of mouse spermatogenesis. The RNAseq data predicted utilisation of an upstream cryptic 5 0 splice site within the 4809 nucleotide Brca2 exon 11 (Figure 3-source data 3) that removes 1263 nucleotides from Brca2 exon 11 in the Rbmxl2 -/testes ( Figure 5-figure  supplement 2A). RT-PCR analysis during the first wave of spermatogenesis showed Brca2 exon 11 splicing was normal at 13dpp in the Rbmxl2 -/testes, with defects appearing only during and after 14dpp ( Figure 5-figure supplement 2B). These time-course data indicate that splicing defects in both Meioc and Brca2 appear and become progressively worse during meiotic prophase in the Rbmxl2 -/testes within genes that are expressed and processed normally in earlier germ cell types. An implication of this is that the aberrant splice isoforms detected in 18dpp whole testis RNA must be significantly diluted by signals from germ cells from earlier developmental stages.
Other unusually long exons in genes important for germ cell development were also mis-spliced in the Rbmxl2 -/testes including exons in the Alms1 and Esco1 genes (Figure 3-source data 3). The Alms1 gene encodes a protein involved in microtubule organisation during cell division, and its knockout disrupts spermatogenesis resulting in a small testis phenotype . In the Rbmxl2 -/testis an upstream 5 0 splice site is selected within Alms1 exon 5, which is is unusually long Figure 5 continued PCR validation of the changing Meioc splicing pattern. This RT-PCR analysis used three primers in multiplex (MeiocF-MeiocR1 detects splicing of exon 4 to exon 5, and MeiocF-MeiocR2 detects splicing from exon four to the cryptic 3 0 splice site within exon 5, or splicing of exon 4 to exon five after complete exitron removal. Detection of full length Meioc exon five inclusion by this latter primer combination is precluded by the large size of the amplicon. (C) Chart showing percentage splicing inclusion (PSI) of different splice variants of Meioc exon 5. PSI data were generated by RT-PCR from three wild type (WT) and 3 Rbmxl2 -/mice, with the individual samples shown in the capillary gel electrophoretogram in part (B) indicated as red dots. Error bars represent the SEM, and p values were calculated using t tests. (D) Western blot of 18dpp mouse testes protein probed with an antibody specific to Meioc protein (Abby et al., 2016). A non-specific protein that was also detected in mouse spleen as well as 18dpp testis is indicated by an asterisk. (E) Capillary gel electrophoretogram showing patterns of Meioc exon 5 splicing between 10dpp and 21dpp in the first wave of mouse spermatogenesis. (F) UCSC screenshot (Karolchik et al., 2014) showing that splicing of Alms1 exon five is disrupted in Rbmxl2 -/testes via selection of an internal 5 0 splice site. DOI: https://doi.org/10.7554/eLife.39304.026 The following figure supplements are available for figure 5: at 1546nt ( Figure 5F). This splice switch removes 1407 nucleotides from the Alms1 mRNA, and coding information for 469 amino acids from the predicted Alms1 reading frame.

Ectopic expression of RBMXL2 protein represses cryptic exon splicing in vitro
The above data indicate that loss of RBMXL2 protein within spermatocytes activates utilisation of cryptic splice sites. We next tested if cryptic splice site selection would be reciprocally repressed by ectopic expression of RBMXL2 protein.  its flanking intron sequences were cloned into a minigene ( Figure 6A). We detected~40% splicing inclusion of the cryptic Kdm4d exon after co-transfection of this minigene into HEK293 cells (that do not normally express RBMXL2) with an expression vector encoding GFP ( Figure 6B lane 1). Levels of Kdm4d cryptic exon splicing inclusion were repressed 6-fold upon co-transfection of a construct encoding an RBMXL2-GFP fusion protein. We also tested the activity of an RBMXL2 fusion protein without the RNP1 motif of the RRM (RBMXL2 DRNP1). Consistent with RNA-protein contacts not being critical for splicing repression of this cryptic exon, RBMXL2 DRNP1 could still repress Kdm4d cryptic exon splicing, although slightly less efficiently than the full length RBMXL2 fusion protein.
Interestingly, splicing of the Kdm4d cryptic exon was also efficiently silenced by ectopic co-expression of an RBMX-GFP fusion protein ( Figure 6B lane 7). The above results show that RBMXL2 represses Kdm4d cryptic exon splicing, and is consistent with RBMXL2 providing a direct functional replacement for RBMX during meiosis. iCLIP experiments in adult mouse testis further identified the Kdm4d cryptic exon as directly binding the splicing activator protein Tra2b (Dalgliesh and Elliott, unpublished data). Consistent with this, Tra2b over-expression in HEK293 cells strongly increased Kdm4d cryptic exon splicing inclusion ( Figure 6B lane 4). Ectopic activation by Tra2b indicated that although this Kdm4d exon cryptic exon is effectively ignored by the spliceosome in wild type testis, cryptic exon splicing is activated in response to increased expression of a splicing activator protein. Importantly, splicing activation by Tra2b was efficiently supressed when the minigene was co-transfected at the same time with both RBMXL2 and Tra2b ( Figure 6B, lanes 5 and 6).
We addressed if other cryptic exons repressed by RBMXL2 might also possess the capacity to be activated by splicing regulators that bind to their associated splicing enhancer sequences. In silico analysis (Schwartz et al., 2009) of a panel of cryptic exons that were activated in the Rbmxl2 -/testes showed that these exons had significantly weaker 3 0 splice sites, but similar Exonic Splicing Enhancer (ESE) sequence content compared to their immediately adjacent downstream exons ( Figure 6C). These other cryptic exons that are repressed by RBMXL2 may be likewise poised for splicing during meiosis, because they bind to splicing activator proteins such as Tra2b.

Discussion
This study finds that loss of the ancient RBMXL2 protein completely prevents sperm production in the mouse, largely because of a developmental block during meiosis. Rbmxl2 -/mouse germ cells develop as far as meiosis, thus escaping the major developmental checkpoints that operate during meiotic pachytene (de Rooij et al., 2003), but then undergo cell death by apoptosis after reaching diplotene. Our data further show that RBMXL2 protein is important to protect the meiotic transcriptome from a spectrum of splicing defects ( Figure 7A). These include classic manifestations of cryptic splicing that would be strongly deleterious to gene expression (Marquez et al., 2015;Sibley et al., 2016), such as insertion of novel cryptic exons and premature polyadenylation sites that would disrupt protein coding sequences, and the selection of cryptic splice sites that would shorten or extend exon lengths. Previous work reporting poor conservation of splicing patterns between species suggested that alternative splicing might not control fundamental aspects of germ cell biology (Kan et al., 2005). In contrast, the work presented here suggest a generic requirement for cryptic splice site suppression during meiosis that could explain the conservation of an Rbmxl2 gene in all placental mammals, even if individual cryptic splice sites are not conserved between species' genomes (Elliott et al., 2000b).
Cells in meiotic prophase might be particularly susceptible to cryptic splice site poisoning because of their high levels of transcription, altered splicing regulator expression and relaxed chromatin environments (de la Grange et al., 2010;Licatalosi, 2016;Soumillon et al., 2013;Pan et al., 2008;Wang et al., 2008;Clark et al., 2007;Grosso et al., 2008;Yeo et al., 2004). In silico data suggest that the cryptic exons repressed by RBMXL2 would be poised for splicing inclusion in conditions of high concentrations of splicing activators because of their strong ESE contents. We suggest a model where RBMXL2 protein may buffer the activity of splicing activator proteins to prevent such cryptic splice site selection from happening in meiosis ( Figure 7B). This model is also consistent with previously reported data showing that RBMXL2, RBMX and RBMY proteins physically interact with and antagonise the splicing activity of Tra2b and SR proteins (Liu et al., 2009;Nasim et al., 2003;Venables et al., 2000;Elliott et al., 2000a) ( Figure 7B). Data presented in this study also show that Kdm4d cryptic exon splicing activation by Tra2b is blocked in vitro by co-expression of RBMXL2. As an alternative and not mutually exclusive model, RBMXL2 protein-RNA binding may sterically block recognition of some cryptic splice sites during meiosis, including the cryptic 5 0 splice site within Meioc exon five which was utilised in the Rbmxl2 -/testis ( Figures 5A and 7B). However, no HITS-CLIP tags for RBMXL2 were detected near the Kdm4d cryptic exon. Furthermore, a version of RBMXL2 deleted for the key RNP1 motif involved in RNA-protein interactions could still repress Kdm4d cryptic exon splicing (albeit slightly less efficiently than the wild type RBMXL2 protein). Thus buffering protein interactions between RBMXL2 and proteins like Tra2b may be potentially more important in repressing cryptic splicing in meiosis than direct RBMXL2 RNA-protein interactions. Loss of RBMXL2 disrupts expression of many downstream target genes in parallel, both at splicing and transcriptional levels. This makes it difficult to resolve whether the phenotype detected in the Rbmxl2 -/mouse testes is caused by aberrant expression of a single target RNA, or is a compound phenotype involving multiple genes. Some genes affected by RBMXL2 protein deletion are annotated as important for spermatogenesis on the MGD . These include Alms1 (Arsov et al., 2006;Collin et al., 2005); Brca2 which is required for meiotic prophase (Sharan et al., 2004); Kdm4d, which helps control apoptosis during male germ cell development (Iwamori et al., 2011); and Cul4a that encodes an important ubiquitin ligase needed for DNA damage response. Two factors complicate direct comparison of MGD phenotypes with those after RBMXL2 knockout. Firstly, conventional knockouts will report a phenotype the first time a gene is needed in a developmental pathway. In contrast RBMXL2 is only expressed from the onset of meiosis, so any defects in target gene expression that occur without RBMXL2 could result in a later phenotype. For example, Meioc knockout phenotype analysis has concentrated on entry into meiosis (Abby et al., 2016;Soh et al., 2017). Our data show that Meioc exon five is still spliced normally on meiotic entry in the Rbmxl2 -/mouse (at 12dpp), but becomes compromised later on during meiotic prophase (when RBMXL2 is expressed). It is possible that Rbmx and Rbmy will provide an equivalent function to Rbmxl2 in spermatogonia and early spermatocytes, and that it is only when meiotic sex chromosome inactivation initiates during pachytene that germ cells will depend on Rbmxl2. Consistent with this, in transfected cells RBMX was able to suppress a cryptic splice exon in the Kdm4d with similar efficiency to the RBMXL2 protein. Secondly, the MGD phenotype data  is after gene knockout, whereas Rbmxl2 deletion changes patterns of splice isoforms.
High levels of gene expression and alternative splicing have also been observed in the nervous system, and this is another anatomic site where pathological cryptic exon inclusion has been reported. Neuronal cryptic exon inclusion occurs following depletion of the splicing repressor protein TDP43, and might contribute to neuron death in neurological diseases like ALS and Alzheimer's disease (Sun et al., 2017). Cryptic splicing has also been observed in cultured human cells after depletion of the splicing repressor proteins hnRNP C, hnRNPL and PTB (Ling et al., 2016;McClory et al., 2018;Zarnack et al., 2013), and in mouse oocytes depleted for the splicing factor SRSF3 (Do et al., 2018). Taking all these results into consideration we suggest that RBMXL2 has a key role controlling the meiotic transcriptome, and suggest that human male infertility caused by loss of RBMXL2 or its paralog RBMY may be associated with germ cell type-specific cryptic splicing.

Continued on next page
Rbmxl2 reading frame (Figure 1-figure supplement 1C). Genetic structures of the wild type and targeted alleles were confirmed by Southern blotting. Genomic DNA was prepared from mouse tails, cut with BamHI, run on an agarose gel and blotted onto a Hybond N nylon membrane (GE Healthcare, Little Chalfont, UK). Blots were probed using an internal probe Enp (generated using primers enpF 5 0 -ACTGTTGATTCCCCTTCCAAC-3 0 and enp R 5 0 -ACTCCTGCCTGTGATTGGTC-3 0 , and a-32 P labelled by random priming). The correct insertion of the targeting vector in the genome was confirmed by cutting genomic DNA with PshA1 and EcoRV and hybridizing with 5 0 (generated from genomic DNA using 5 0 -agcattcagcaaaggctcac-3 0 and 5 0 -ttaaaactgagggagactgc-3 0 ) and 3 0 (generated from genomic DNA using 5 0 -actgcatagttgtagccatc-3 0 and 5 0 -tgcattctctttaggctcatttc-3 0 ) probes respectively.

Animal work
Animal research was carried with the approval of the Newcastle University animal research ethics committee and the UK Government Home Office (Home Office project Licence Number PIL 60/ 4455).

Analysis of mouse germ cell development in vivo
Testis/body weight ratios, and sperm counts were measured on a C57Bl6 background. In order to determine sperm counts, the cauda epididymis was dissected in PBS and the sperm were counted in a haemocytometer.

Preparation of RNA for RNA-Seq
Mice were back crossed onto the C57Bl6 background for eight generations, and we used back crossed male mice for subsequent analysis. Testes were homogenized in RLTplus buffer from Qiagen before purification with the Rneasy plus kit (Qiagen, 74134). The RNA was then re-purified with the kit before Dnase digestion (Ambion).

RNA-Seq analysis
Paired-end sequencing was done for six samples in total (three biological replicates of 18dpp wildtype and Rbmxl2 -/testis) using 75 bp reads. Libraries were prepared using TruSeq Stranded mRNA Library Prep Kit (Illumina) and sequenced; 75 bp single reads were sequenced on a NextSeq 500 (Illumina, using the mid output v2 150 cycles kit). The base quality of the raw sequencing reads were checked using FastQC. Trimmomatic (v0.32) was used to remove adapters and to trim the first twelve bases and the last base at position 76 (Bolger et al., 2014). Reads were aligned to the UCSC D3c. 2011 (GRCm38/mm10) assembly of the mouse genome using STAR (v2.5.2b) (Dobin et al., 2013). Alternative splicing events were assessed using MAJIQ and VOILA software packages (Vaquero-Garcia et al., 2016). Briefly, uniquely mapped, junction-spanning reads were used by MAJIQ to construct splice graphs for transcripts from a custom Ensembl transcriptome annotation and to quantify PSI (within conditions) and DPSI (between conditions) for all local splicing variations (LSVs). The captured LSVs include classical alternative splicing events (e.g. cassette exons, alternative 5' splice sites, etc.) as well as more complex variations (Vaquero-Garcia et al., 2016). LSVs with an expected change of greater than 10% were then visualized using VOILA to produce splice graphs, violin plots representing PSI and DPSI quantifications, and interactive HTML outputs for changes between wild type and Rbmxl2 -/-.

Detection of splice isoforms in mouse testis RNA
The levels of different mRNA splice forms were detected in DNAse treated total RNA isolated from 18dpp mouse testes. cDNA was synthesised using superscript Vilo (Invitrogen) and PCR amplified, followed by either capillary gel electrophoresis (Qiaxcel) or agarose gel electrophoresis (Grellscheid et al., 2011). Gene-specific primers specific for different splice isoforms are given in Figure 3-source data 4.

Hits-clip
HITS-CLIP was performed as previously described (Licatalosi et al., 2008) using an antibody specific to RBMXL2 (Ehrmann et al., 2008). In short, for the CLIP analysis a single mouse testis was sheared in PBS and UV crosslinked. After lysis, the whole lysate was treated with DNase and RNase, followed by radiolabelling and linker ligation. After immunoprecipitation with purified antisera specific to RBMXL2 (Ehrmann et al., 2008), RNA bound RBMXL2 was separated on SDS-PAGE. A thin band at the size of 65 kDa (RBMXL2 migrates at around 50 kDa and MW of 50 nt RNA is about 15 kDa) was cut out and subject to protein digestion. RNA was recovered and subject to Illumina sequencing, and the reads were aligned to the mouse genome. HITS-CLIP tags mapped to 22,438 protein coding genes. HITS-CLIP data were further analysed to extract most frequently occurring pentamer nucleotide sequences within genes (ncRNA, ORF, 3'UTR, 5'UTR and intron). A window between +50 and À50 of the cross link site was analysed (to avoid crosslinking bias), and after correction for average pentamer occurrence in control sequences (as described in Wang et al., 2010). Each pentamer was only counted once in the analysed window.

Detection of proteins in mouse testis
For protein detection by immunohistochemistry, testes were fixed in Bouins, embedded in paraffin wax, sectioned and attached to glass slides. Sections were prepared and immunohistochemistry carried out as previously described (Grellscheid et al., 2011). Primary antibodies were specific to caspase 3, p17-specific (Proteintech Europe Ltd). Protein detection by Western was as previously described (Grellscheid et al., 2011) using primary antibodies specific to RBMXL2 (Ehrmann et al., 2008), Meioc (Abby et al., 2016), and control antibodies specific to tubulin (Sigma T6793) and GAPDH (Abgent,San Diego CA, AP7873b).

Spermatocyte spreading
The tunica was removed from testes, and seminiferous tubules macerated with razor blades in a small volume of DMEM media. DMEM was added to 4 ml, and debris allowed to settle for 10 min. Cells were then pelleted from the suspension at 233 g for 5 min. The pellet was resuspended in 1-2 ml DMEM per animal. 5 drops of 4.5% sucrose were added to the centre of each clean glass slide using a Pasteur pipette. A single drop of the cell suspension was added to each slide from a height of 15-20 cm, followed by one drop of 0.05% Triton-X-100. The slides were incubated in a humid chamber for 10 min, then eight drops of fixative (2% PFA, 0.02% SDS, pH 8) were added to each slide, and incubated in humid chamber for 20 min. Slides were dipped in water to wash and allowed to air dry before storing at À80˚C (Crichton et al., 2017).
Nuclei were staged by immunostaining for the axial/lateral element marker SYCP3. Pachytene nuclei were identified in this analysis by complete co-localisation of SYCP3 and either the transverse filament marker SYCP1 on all nineteen autosomes, or by the presence of nineteen bold SYCP3 axes in addition to two paired or unpaired sex chromosome axes of unequal length. Asynapsed pachytene nuclei were judged to be nuclei containing at least one fully synapsed autosome, and at least one autosome unsynapsed along >50% of its length (Crichton et al., 2017).

Spermatocyte staining analysis
All scoring of immunostained chromosome spreads was performed blind on images after randomisation of filenames by computer script. Three Rbmxl2 -/and three wild type control animals were analysed for each experiment. Meiotic substaging analysis involved 179 nuclei from control animals, and 138 from Rbmxl2 -/mice. Diplotene phospho-H3 analysis involved 128 control and 157 Rbmxl2 -/nuclei, and the proportion of nuclei with bold centromeric signal was analysed using a Fisher's test. Pachytene asynapsis was measured in 108 control and 92 Rbmxl2 -/nuclei, and the data analysed using a Fisher's test. Centromere positioning and H3K9me3 staining was assessed across 73 control and 65 Rbmxl2 -/pachytene nuclei, and 66 diplotene control and 66 Rbmxl2 -/diplotene nuclei. These data were analysed using Wilcoxon Rank Sum test.