Repression of interrupted and intact rDNA by the SUMO pathway in Drosophila melanogaster

Ribosomal RNAs (rRNAs) are essential components of the ribosome and are among the most abundant macromolecules in the cell. To ensure high rRNA levels, eukaryotic genomes contain dozens to hundreds of rDNA genes; however, only a fraction of the rRNA genes seems to be active, while others are transcriptionally silent. We found that individual rDNA genes have a high level of cell-to-cell heterogeneity in their expression in Drosophila melanogaster. Insertion of heterologous sequences into rDNA leads to repression associated with reduced expression in individual cells and a decreased number of cells expressing rDNA with insertions. We found that SUMO (Small Ubiquitin-like Modifier) and the SUMO ligase Ubc9 are required for efficient repression of interrupted rDNA units and variable expression of intact rDNA. Disruption of the SUMO pathway abolishes discrimination of interrupted and intact rDNAs and removes cell-to-cell heterogeneity, leading to uniformly high expression of individual rDNA in single cells. Our results suggest that the SUMO pathway is responsible for both repression of interrupted units and control of intact rDNA expression.


Introduction
Ribosomal RNAs (rRNAs) are the main structural and central enzymatic components of ribosomes. The constant production of ribosomal RNA is essential for cell growth and division. The high demand for rRNA production is addressed by a two-prong strategy. First, cells use a dedicated transcription machinery composed of RNA polymerase I and associated unique transcription factors to transcribe rDNA genes. Second, genomes contain multiple identical rDNA genes that are transcribed simultaneously.
In most eukaryotes rRNA genes encoding the 18S, 28S and 5.8S rRNAs are transcribed under the control of a single promoter producing pre-rRNA (Mandal, 1984). After co-transcriptional processing and modification, the pre-rRNA is converted to mature 18S, 28S and 5.8S rRNAs, which together with the additional 5S rRNA and ribosomal proteins are assembled into pre-ribosomal subunits (Henras et al., 2015). Actively transcribed rDNA genes and pre-ribosomal subunits at different steps of maturity form the nucleolus, a membraneless compartment in the nucleus characterized by a tripartite morphological ultrastructure.
In eukaryotic genomes dozens to hundreds of nearly identical rDNA units are arranged head-to-tail, forming one or several clusters. Studies in organisms as diverse as plants, yeast, fruit flies, mice and humans showed that only a fraction of available rDNA genes is transcriptionally active (Bird et al., 1981; Coffman et al., 2005; Conconi et al., 1989; Dammann et al., 1993; Flavell et al., 1988). In organisms that have several rDNA clusters located on different chromosomes, whole clusters might be transcriptionally inactive and positioned outside of the nucleolus. For example, in Drosophila, where rDNA clusters are located on the X and Y chromosomes, in certain genotypes only the Y chromosome cluster is active and forms the nucleolus (Greil and Ahmad, 2012; Zhou et al., 2012). In mammals, which carry several ribosomal loci on different chromosomes, some clusters are constantly active, while the activity of others depends on the tissue type, presence of nutrients and stress (Ali et al., 2008, 2012; de Capoa et al., 1985a; de Capoa et al., 1985b; Tseng et al., 2008; Young et al., 2007; Zhang et al., 2007). Direct observation of rDNA transcription by electron microscopy showed that individual rDNA units within a single cluster might also have different transcriptional activity and are subject to all-or-none regulation: any given rDNA unit is either actively transcribed, with multiple RNA pol I associated with nascent pre-rRNA positioned along the body of the unit, or is inactive (Foe, 1978; McKnight et al., 1976).
Due to recombination-driven homogenization, individual rDNA units are similar in sequence (Eickbush et al., 2007), arguing against differences in sequence leading to differential expression. In fact, repressed rDNA units might become active when the genetic environment or demand for rRNA production changes. For example, in Drosophila X-linked rDNA genes that are inactive in males are active in females. The molecular mechanism for differential activity of rDNA units was studied in several organisms. In mammals repressed rDNA units are enriched in DNA methylation and repressive histone marks (Coffman et al., 2005; Earley et al., 2006; Li et al., 2006; Santoro et al., 2001; Santoro et al., 2002; Zhou et al., 2002). The chromatin remodeling complexes NoRC and NuRD, in cooperation with noncoding RNAs derived from rDNA loci, were shown to be involved in rDNA repression (Strohner et al., 2001; Xie et al., 2012; Zhou et al., 2002; Bierhoff et al., 2014; Mayer et al., 2006; Santoro et al., 2010). However, a deep understanding of the molecular mechanisms establishing and maintaining rDNA repression is lacking.
Genomes of many arthropod species including Drosophila melanogaster harbor transposable elements that integrate into rDNA, creating distinct rDNA units. R1 and R2 belong to the non-long terminal repeat (non-LTR) retrotransposons and encode a sequence-specific endonuclease that is responsible for integration of these elements into 28S rDNA (Burke et al., 1987; Eickbush et al., 2015; Jakubczak et al., 1990; Xiong et al., 1988; Yang et al., 1999).
It is believed that R1 and R2 lack their own promoters and instead rely on the rDNA transcription machinery for their expression: they are transcribed as part of the pre-rRNA and then excised by the transposon-encoded ribozyme (Eickbush et al., 2010; Eickbush et al., 2013). The fraction of rDNA units with R1 and R2 insertions varies between different Drosophila melanogaster strains, but was estimated to reach up to 80% of units in some strains (Jakubczak et al., 1992). Despite the abundance of R1 and R2, their expression level is usually low. Indeed, electron microscopy and run-on analysis of nascent transcripts revealed that rDNA units with transposon insertions are transcriptionally silent, indicating a mechanism that can distinguish damaged rDNA copies and repress their transcription (Jamrich et al., 1984; Ye and Eickbush, 2006). Interestingly, R1 and R2 expression is increased if the total number of rDNA units in the genome is low, suggesting that repression is sensitive to cellular rRNA demand (Eickbush et al., 2003; Long et al., 1981; Terracol, 1986). The molecular mechanism for repression of interrupted rDNA units and its relationship to the general mechanism of rDNA silencing remained unknown.

SUMO (Small Ubiquitin-like Modifier) is a small protein related to ubiquitin that is covalently attached to other proteins in sequential reactions mediated by E1, E2 and, for some targets, E3 SUMO ligases. Unlike ubiquitination, SUMOylation typically is not linked to protein degradation and instead changes protein-protein interactions by facilitating recruitment of new binding partners or masking existing binding sites. The majority of SUMOylated proteins reside in the nucleus and SUMOylation was shown to be involved in regulation of transcription, mRNA processing, chromatin organization, replication and repair (Geiss-Friedlander et al., 2007) as well as rRNA maturation and nucleolus function (Finkbeiner et al., 2011; Haindl et al., 2008; Yun et al., 2008; Westman et al., 2010). SUMOylation was shown to be essential for growth and viability of Saccharomyces cerevisiae (Johnson et al., 1997), Drosophila melanogaster (Lehembre et al., 2000) and mice (Nacerddine et al., 2005). Mammalian genomes encode four distinct SUMO genes with partially redundant functions. In contrast, only one SUMO gene is present in the Drosophila melanogaster genome, making it a good model to understand the diverse functions of SUMOylation (Guo et al., 2004; Melchior, 2000; Yang et al., 1999).
Here we have established a new model to study regulation of rDNA expression using a molecularly marked single-unit rDNA transgene. We found that this model faithfully recapitulates repression of interrupted rDNA units. We discovered that repression of rDNA units damaged by transposon insertions, as well as of intact rDNA units, is controlled by the SUMO pathway, indicating that the same molecular mechanism is responsible for epigenetic inactivation of intact and interrupted rDNA.

Results
A single-unit rDNA transgene is expressed from a heterologous genomic site
The D. melanogaster genome contains numerous (50-200) rDNA units arranged in tandem repeats in the heterochromatin of the X and Y chromosomes (Figure 1A). Many rDNA units contain insertions of R1 and R2 transposons that lack their own promoters and are co-transcribed with pre-rRNA (Ye and Eickbush, 2006). However, expression of R1 and R2 elements is low in the majority of Drosophila strains, indicating that units with insertions are specifically repressed (Ye and Eickbush, 2006). The large number of identical rDNA units organized in large arrays makes it impossible to study expression of individual rDNA units by conventional methods. To circumvent this problem and study regulation of rDNA expression, we created flies with a transgene that carries a molecularly marked single rDNA unit inserted into a heterologous genomic locus (Figure 1A). It harbors the non-transcribed spacer (NTS), which includes elements that regulate RNA polymerase I (Pol I) transcription, and the complete transcribed portion, which generates the 47S pre-rRNA. The pre-rRNA contains a 5' external transcribed spacer (5' ETS) and internal transcribed spacers (ITS), which are removed from the pre-rRNA during its processing into mature 18S, 5.8S, 2S and 28S rRNAs. To discriminate transcripts generated from the transgene from endogenous rRNA, we inserted a 21 bp unique identification sequence (UID) into the ETS downstream of the transcription start site (Figure 1A). Along with the rest of the ETS, the UID sequence is removed from pre-rRNA during its processing to mature rRNAs in the nucleolus; however, it allowed us to monitor transgene transcription. In contrast to endogenous rDNA arrays, which are located in heterochromatin of the X and Y chromosomes, the single-unit rDNA transgene was integrated into a euchromatic site on the 2nd chromosome (chr 2L: 1,582,820) using site-specific integration.
First, we tested if the rDNA transgene is expressed in the heterologous location. RT-qPCR with one primer specific to the UID sequence detected expression of the rDNA transgene in transgenic flies, while no PCR product was produced in the parental strain used for transgenesis (Figure 1B). Thus, RT-qPCR is able to discriminate expression of the marked rDNA transgene from endogenous rDNA, and insertion of the UID sequence does not disrupt the Pol I promoter and enhancer elements.
Next, we employed fluorescent in situ hybridization (FISH) to detect expression of the rDNA transgene in individual cells. Nascent transgene transcripts differ from the more abundant endogenous pre-rRNA only by the 21 nt UID sequence, and the standard FISH protocol using a short probe against the UID failed to detect expression of the transgene. To circumvent the low sensitivity of standard FISH, we used a method based on the mechanism of the hybridization chain reaction (HCR-FISH) that offers a combination of high sensitivity and quantitation (Choi et al., 2018). HCR-FISH allowed specific detection of nascent transgene transcripts and revealed that transgene RNA is present exclusively in the nucleus (Figure 1C), consistent with the ETS portion being processed out of pre-rRNA and degraded soon after transcription. Thus, the marked rDNA transgene enables expression analysis of an individual rDNA unit inserted in a heterologous genomic location.

Insertion of heterologous sequences into rDNA leads to decreased expression
In Drosophila a significant fraction of rDNA units contain insertions of R1 and R2 retrotransposons; however, units damaged by transposon insertions are usually silent (Long and Dawid, 1979; Jamrich et al., 1984). R1 and R2 each integrate into a single unique site positioned in 28S rRNA at 2711 nt and 2648 nt of 28S (Jakubczak et al., 1990), respectively. To understand how insertion of transposons into rDNA influences its expression, we generated a series of additional transgenes. The full-length R1 (5.3 Kb) and R2 (3.6 Kb) transposon sequences were inserted in their respective integration sites within the 28S rDNA (Figure 2A). We also generated transgenes that contain an insertion of an unrelated sequence (promoterless CFP) in the same positions as R1 and R2 (Figure 2A). In the R1' and R2' transgenes, the respective transposon sequences were flanked by additional UIDs. All transgenes were integrated into the same genomic location, which allows us to directly compare their expression.
Expression analysis of transgenes by RT-qPCR in ovary and carcass (Figure 2B) showed that transposon insertions lead to a 3- to 5-fold reduction in pre-rRNA level compared to the intact transgene. R1 insertion caused a slightly stronger reduction in expression compared to R2 insertion. Importantly, insertions of a non-transposon CFP sequence led to a comparable decrease in expression, indicating that the repressive effect is not dependent on a specific transposon sequence and can be triggered by a heterologous sequence. A 29 bp insertion caused a 2-fold decrease in transgene expression, demonstrating that short insertions can affect rDNA expression, though to a lesser degree. Thus, expression of the single-copy rDNA transgene recapitulates the behavior of endogenous rDNA and demonstrates repression of units damaged by transposon insertions.
To analyze the cell-to-cell heterogeneity we first measured the fraction of nurse cells in each chamber that expressed the transgene. The intact rDNA transgene is expressed in the majority (10±1 out of 15, 67%), but not all, cells of the egg chamber (Figure 2D). The fraction of individual nuclei with detectable FISH signal drops to 27% for transgenes with R1 and R2 insertions (Figure 2D). Next, we measured the level of HCR-FISH signal intensity in individual nuclei that had detectable signal.

Cell-to-cell variability in rDNA transgene expression
The intact rDNA transgene showed a high level of cell-to-cell variability in expression (Figure 2E). The signal intensity was ~30-fold lower for transgenes containing R1 and R2 insertions compared to the intact rDNA transgene (Figure 2E). These experiments revealed an unexpected cell-to-cell variability in expression of rDNA transgenes that was not present for the control protein-coding gene vasa. rDNA units with insertions are expressed in less than one-third of all nuclei, and even the intact unit is not expressed in all cells. Thus, repression of rDNA transgenes with insertions differs from cell to cell: in the majority of cells repression is complete, and in the remaining cells expression is strongly reduced.
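The two single-cell readouts used above, the fraction of nuclei with detectable signal and the signal intensity in those nuclei, can be illustrated with a minimal sketch. The intensity values, the detection threshold and the function name below are hypothetical and were chosen only so that the numbers resemble those reported (about 67% versus 27% of nuclei, and a ~30-fold difference in intensity); they are not the measured data or the analysis pipeline used in the study.

```python
from statistics import mean

def summarize_nuclei(intensities, threshold):
    """Return (fraction of nuclei above threshold, mean intensity of those nuclei)."""
    expressing = [x for x in intensities if x > threshold]
    fraction = len(expressing) / len(intensities)
    mean_signal = mean(expressing) if expressing else 0.0
    return fraction, mean_signal

# Hypothetical per-nucleus HCR-FISH intensities (arbitrary units)
# for one egg chamber containing 15 nurse cell nuclei.
intact = [0] * 5 + [60, 120, 180, 240, 300, 300, 360, 420, 480, 540]
interrupted = [0] * 11 + [4, 8, 12, 16]

THRESHOLD = 1  # hypothetical detection threshold

frac_intact, signal_intact = summarize_nuclei(intact, THRESHOLD)
frac_interrupted, signal_interrupted = summarize_nuclei(interrupted, THRESHOLD)
fold_lower = signal_intact / signal_interrupted

print(f"intact: {frac_intact:.0%} of nuclei, mean signal {signal_intact}")
print(f"interrupted: {frac_interrupted:.0%} of nuclei, mean signal {signal_interrupted}")
print(f"signal is {fold_lower:.0f}-fold lower with an insertion")
```

Separating the two quantities matters here because an insertion reduces both the number of expressing nuclei and the signal within the nuclei that still express the transgene.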

The SUMO pathway is required for repression of transposons integrated into native rDNA loci
In the Drosophila germline many transposon families are repressed by the piRNA pathway. We recently found that SUMO, encoded by the single smt3 gene in Drosophila, and the E3 SUMO ligase Su(var)2-10/dPIAS are involved in piRNA-mediated transcriptional silencing of transposons in the germline (Ninova et al., 2020a). SUMO is covalently attached to many nuclear proteins in a conserved pathway that consists of the E1 (Uba2/Aos1) and E2 (Ubc9) SUMO ligases (Figure 3A). SUMOylation of target proteins requires transfer of SUMO from Ubc9, which may occur directly or be aided by specific E3 SUMO ligases.
In agreement with the function of smt3 and Su(var)2-10 in piRNA-guided transposon repression, germline-specific knockdown (GLKD) of either gene causes similar activation of many TE families (Ninova et al., 2020a; Ninova et al., 2020b). Notably, GLKD of Su(var)2-10 had a modest effect on R1 and R2 compared to many other TE families that were upregulated more strongly.
However, knockdown of smt3 has a disproportionately strong effect on expression of R1 and R2 transposons. RNA-seq analysis showed that the levels of R1 and R2 transcripts increased ~1000- and ~300-fold, respectively, and they become among [...] (Figure 3D). Analysis of RNA-seq data demonstrated that such strong upregulation is exclusive to R1 and R2 and is not seen for any other protein-coding or ncRNA genes: the vast majority of other transposons and genes whose expression was affected by smt3 GLKD changed less than 10-fold. Thus, SUMO seems to be involved in a specific process of R1/R2 repression that is independent of the Su(var)2-10 SUMO ligase.
To further explore the role of SUMO and piRNA in repression of R1 and R2, we knocked down smt3 and the single E2 SUMO ligase, Ubc9, in S2 cells that lack an active piRNA pathway. RT-qPCR showed strong and specific upregulation of R1 and R2 upon both smt3 and Ubc9 knockdowns in S2 cells (Figure 3E). In contrast, knockdown of Su(var)2-10 in S2 cells did not cause activation of R1 and R2.
To gain further insight into expression of intact and damaged rDNA transgenes upon SUMO depletion, we employed HCR-FISH to study expression in individual nurse cells (Figure 4B). Confirming the RT-qPCR results, Smt3 GLKD led to a marked increase in expression of both intact and damaged rDNA transgenes. First, in contrast to wild-type flies, expression of intact and interrupted rDNA was detected in almost all nuclei upon Smt3 KD (Figure 4C). Second, depletion of SUMO increased the signal intensity per nucleus: after Smt3 KD the signal increased 35- to 37-fold for damaged transgenes and ~2.5-fold for the intact transgene (Figure 4D). Consistent with the RT-qPCR results, intact and damaged transgenes showed similarly high expression levels upon SUMO KD. Thus, while damaged units are preferentially silenced in wild-type cells, loss of SUMO eliminates differential repression.
To test the role of SUMO in expression of endogenous rDNA units, we measured pre-rRNA levels by RT-qPCR using primers that amplify the external transcribed spacer (ETS) portion. Depletion of Smt3 led to a 2.5-fold increase in pre-rRNA levels both in ovarian germ cells and in S2 cells (Figure 4E), demonstrating that SUMO controls expression of endogenous rDNA arrays.
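The text does not state how RT-qPCR fold changes were calculated. One widely used approach is the comparative Ct (2^-ΔΔCt) method, sketched below under the assumption of roughly 100% amplification efficiency. The function name, the choice of a reference gene and all Ct values are hypothetical and serve only to show the arithmetic.

```python
def fold_change_ddct(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target in knockdown vs control by the 2^-ddCt method.

    Assumes ~100% amplification efficiency for both target and reference amplicons.
    """
    delta_kd = ct_target_kd - ct_ref_kd        # normalize target to reference (knockdown)
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl  # normalize target to reference (control)
    delta_delta = delta_kd - delta_ctrl
    return 2 ** (-delta_delta)

# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the knockdown sample, while the reference gene is unchanged.
example = fold_change_ddct(
    ct_target_kd=18.0, ct_ref_kd=18.0,
    ct_target_ctrl=20.0, ct_ref_ctrl=18.0,
)
print(example)  # a two-cycle shift corresponds to a 4-fold increase
```

Under the same assumption, a 2.5-fold increase would correspond to a ΔΔCt of about -1.3 cycles, since log2(2.5) is approximately 1.32.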

The role of heterochromatin in SUMO-dependent rDNA repression
SUMO-dependent repression of rRNA might be caused by co-transcriptional degradation of pre-rRNA or by suppression of Pol I transcription. To discriminate between these possibilities, we employed ChIP to measure the presence of RNA pol I on intact and damaged transgenes. We generated transgenic flies that express GFP-tagged Rpl135, an essential subunit of Pol I. As expected, GFP-[...]
rDNA arrays are embedded in heterochromatin on the X and Y chromosomes, a genomic compartment characterized by a high level of the H3K9me3 mark, which is linked to repression of protein-coding genes transcribed by RNA pol II. H3K9me3 was proposed to play a role in rDNA silencing. To study whether the SUMO pathway regulates rDNA expression through deposition of the H3K9me3 mark, we performed H3K9me3 ChIP-seq in the fly ovary upon Smt3 knockdown. We found that endogenous rDNA units and R1 and R2 sequences are indeed enriched in the H3K9me3 mark in wild-type flies; however, no change in H3K9me3 occupancy was observed upon SUMO depletion (Figure 5B). We used independent ChIP-qPCR to compare the levels of the heterochromatin mark on R1 and R2 transposons to other hetero- and euchromatin regions. These experiments confirmed that R1 and R2 sequences are heterochromatic; however, the level of H3K9me3 was not affected by Smt3 KD on R1 and decreased by only 36.7% on R2 (Figure 5C).
The level of the H3K9me3 mark was ~4-fold lower on transgenic compared to endogenous rDNA copies and only slightly higher compared to a control euchromatic region (Figure 5D). Furthermore, heterochromatin mark levels were similar on intact and damaged rDNA transgenes. Finally, H3K9me3 levels on intact or damaged transgenes did not decrease upon SUMO knockdown, despite the strong activation of their expression. Taken together, these results indicate that the level of the H3K9me3 mark on endogenous and transgenic rDNA does not correlate with silencing.
To further explore the role of heterochromatin marks in repression of rDNA, we used RNAi to knock down three histone methyltransferases, SetDB1, Su(var)3-9 and G9a, that are together responsible for all mono-, di- and trimethylation of H3K9 in Drosophila (Supplementary File 4). Depletion of SetDB1 and G9a had no effect on R1 and R2 expression, while depletion of Su(var)3-9 caused a very mild ~2.5-fold change (compared to ~1,000-fold derepression upon smt3 and Ubc9 RNAi) (Figure 5E). Simultaneous knockdown of all three histone methyltransferases had a similar effect to depletion of Su(var)3-9 alone, indicating no redundancy. Thus, our results suggest that H3K9me1/2/3 marks are not essential for silencing and that SUMO represses rDNA by an H3K9me-independent mechanism.

SUMOylation of nucleolar proteins involved in rDNA expression
Proteomic studies in Drosophila identified several hundred SUMOylated proteins (Handu et al., 2015). However, the list is likely to be even longer considering the frequency of the SUMOylation consensus (ΨKxE/D) in the fly proteome and the technical difficulties of detecting SUMOylation, which often affects only a small fraction of target protein molecules (Hay, 2005). Many chromatin proteins, including histones, are substrates of SUMOylation (Shiio and Eisenman, 2003; Nathan et al., 2003), suggesting a possibility that rDNA repression might be caused by SUMOylation of chromatin proteins on rDNA sequences. To explore this possibility, we analyzed previously published SUMO ChIP-seq data (Gonzales et al., 2014). This analysis revealed enrichment of SUMO at the rDNA unit and R2 sequences as well as at the 5' regions of the R1 transposon (Figure 6A), indicating that SUMOylated proteins are indeed enriched on rDNA chromatin.
To find if specific proteins involved in rDNA transcription and nucleolar function are SUMOylated, we employed a sensitive SUMOylation assay. GFP-tagged candidate proteins were co-expressed together with FLAG-SUMO in S2 cells, followed by their immunoprecipitation and detection of SUMOylation. The assay successfully detected SUMOylation of CTCF, a conserved Zn-finger protein involved in higher-order chromatin organization that is a known target of SUMOylation (MacPherson et al., 2009; Guerrero and Maggert, 2011). Out of 7 other candidates we detected SUMOylation of three proteins, including Udd and CG3756, two proteins involved in RNA pol I-dependent transcription of rDNA (Figure 6B). As expected from previous observations (Hay, 2005), only a small fraction of target proteins was modified. Thus, the limited screen of selected nucleolar proteins suggests that potentially many more proteins involved in rDNA transcription might be SUMOylated.
To further explore this possibility, we retrieved all D. melanogaster proteins associated with the Gene Ontology term "nucleolus" (GO:0005730 and child terms, n=243 Flybase-annotated genes) and searched for the SUMOylation consensus (ΨKxE/D) in their sequences. This analysis showed that 77% of nucleolar genes have products harboring a SUMOylation consensus. In addition, 68% (19 out of 28) of genes associated with the biological process "transcription by RNA polymerase I" (GO:0006360) have a SUMOylation consensus. A proteomic approach was previously used to comprehensively identify SUMOylated proteins in Drosophila (Handu et al., 2015). Ontology analysis revealed that 27 out of 823 SUMOylated proteins are associated with the GO term "nucleolus", demonstrating significant [...]
To find proteins involved in SUMO-dependent rDNA repression, we used RNAi to knock down candidate genes and monitor expression of R1 and R2 transposons in S2 cells. We composed a list of proteins that have nucleolar function or are involved in rDNA transcription and are SUMOylated according to published (Handu et al., 2015) and our own (Ninova, unpublished) mass-spec data (Supplementary File 4).
We also selected proteins involved in the SUMO pathway, such as E3 SUMO ligases and SUMO isopeptidases. Out of 25 tested genes, knockdown of five (Ulp1, CG13773, CG3756, Fib, mbm) caused an increase in expression of R1 and R2, including the SUMO isopeptidase Ulp1 (Figure 6C). None of the tested E3 SUMO ligases scored positive, suggesting that SUMO-dependent rDNA repression requires a yet unknown E3 ligase or involves direct transfer of SUMO by the E2 ligase Ubc9, which is indispensable for the repression. The RNAi screen also revealed that repression requires two proteins that contribute to RNA polymerase I activity: CG13773 and CG3756. However, the magnitude of R1/R2 upregulation was rather modest (3- to 30-fold) compared to the dramatic (>1,000-fold) derepression observed upon SUMO knockdown. Overall, the results of the RNAi screen further support the role of the SUMO pathway in rDNA regulation and suggest that multiple proteins, including several SUMOylated components of the RNA pol I complex, are involved in this process.
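A search for the SUMOylation consensus (ΨKxE/D) of the kind applied to the nucleolar protein set above can be sketched with a regular expression. This is only an illustration: the exact definition of Ψ used in the analysis is not given in the text, so the hydrophobic residue set below (V, I, L, M, F) is an assumption, and the protein names and sequences are invented for the example.

```python
import re

# Assumption: Psi is taken as one of V, I, L, M or F; the text does not
# specify which hydrophobic residues were allowed at this position.
# The lookahead lets overlapping motifs be reported.
CONSENSUS = re.compile(r"(?=([VILMF]K[A-Z][ED]))")

def find_sumo_sites(sequence):
    """Return (1-based position of the lysine, matched motif) for each consensus hit."""
    return [(m.start() + 2, m.group(1)) for m in CONSENSUS.finditer(sequence.upper())]

# Hypothetical sequences, not real Drosophila proteins.
proteins = {
    "protA": "MSTLKEEA",  # contains LKEE
    "protB": "MAAGKRS",   # no consensus
}

hits = {name: find_sumo_sites(seq) for name, seq in proteins.items()}
fraction_with_consensus = sum(1 for sites in hits.values() if sites) / len(proteins)

print(hits)
print(f"{fraction_with_consensus:.0%} of sequences carry at least one consensus")
```

A motif this short is expected to occur by chance in many proteins, which is in line with the caution expressed in the text that the presence of a consensus indicates only potential SUMOylation.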
To directly test if local SUMOylation of chromatin proteins in the vicinity of the rDNA promoter leads to repression, we recruited the E2 SUMO ligase Ubc9 to the rDNA transgene. The Ubc9 enzyme catalyzes the transfer of SUMO that is covalently linked to it to many proteins that harbor a simple SUMOylation motif (Johnson and Blobel, 1997) (Figure 3A). We generated transgenic flies with intact rDNA harboring a 16 nt hairpin sequence that can be irreversibly bound by an inactive version of the bacterial Csy4 nuclease (Lee et al., 2013) (Figure 6D). Expression of Ubc9 fused to Csy4 led to a 4.7-fold decrease in transgene expression compared to control flies, suggesting that an increase in local SUMOylation leads to further rDNA repression (Figure 6D).

Discussion
To satisfy the high demand for rRNAs, essential components of ribosomes, genomes of most organisms contain multiple identical rDNA genes. However, studies in many eukaryotic species paradoxically demonstrated that only a fraction of available rDNA genes are expressed, while other rDNA units with apparently identical sequence are inactive (Conconi et al., 1989; Morgan et al., 1983; Sogo et al., 1984). rDNA repression was linked to rDNA stability, prevention of recombination and preservation of nucleolar structure (Espada et al., 2007; Sinclair et al., 1997). Differential expression of ribosomal RNA genes represents an ultimate case of epigenetic regulation: identical DNA sequences have drastically different expression levels within a single cell and these expression states are propagated through multiple cellular divisions.
Studies of rDNA repression are hampered by the fact that hundreds of rDNA units are present in the genome with almost identical sequence, which cannot be reliably discriminated (Ganley and Kobayashi, 2007). To circumvent this problem and understand regulation of rDNA expression, we used molecularly tagged rDNA transgenes inserted in a heterologous locus in the D. melanogaster genome.
Insertion of a short unique sequence into the 5'-external transcribed spacer (ETS) of rDNA did not interfere with its expression and allowed us to monitor expression and chromatin structure of the rDNA transgenes and to discriminate them from the native rDNA units.
In many organisms, including Drosophila, native rDNA clusters are located within constitutive heterochromatin, repeat-rich and gene-poor regions that are transcriptionally silent (Németh and Längst, 2011; Yoshida, 1998). Previous studies suggested that rDNA units that are damaged by insertion of R1 and R2 retrotransposons, which integrate into specific sites in 28S rDNA in Drosophila, are selectively repressed (Eickbush and Eickbush, 2003; Jolly and Thomas, 1980; Kidd and Glover, 1981; Long et al., 1981a, 1981b). Variable levels of repression were also observed upon integration of a non-transposon sequence in 28S rDNA in a native rDNA cluster (Eickbush and Eickbush, 2003). The molecular mechanism by which damaged rDNA units are repressed and its link to silencing of intact rDNA remained poorly understood. Our results indicate that integration of a sequence other than transposons can induce rDNA repression. Interestingly, extremely truncated R2 copies were shown to be actively expressed, indicating that rDNA with very short insertions can escape silencing (Eickbush and Eickbush, 2003). However, although CFP (720 bp) is shorter than the full-length R1 and R2 transposons (5.4 and 3.6 kb, respectively), it is sufficient to induce repression. Integrating a heterologous sequence into rDNA also induced silencing in mammals, which lack transposons that specifically integrate into rDNA.
Integration of the human growth hormone gene into a rat ribosomal locus caused deletion of the ribosomal sequence and silencing of rDNA (Henderson and Robins, 1982). Thus, repression of rDNA copies damaged by insertions of heterologous sequences seems to be an evolutionarily conserved process that prevents production of aberrant rRNA, which might interfere with proper ribosome assembly.
Detection of transcripts from intact and damaged rDNA transgenes in individual cells revealed large cell-to-cell variability in their expression. Although intact transgenes are expressed in a larger fraction of cells than damaged transgenes, they are still not expressed in all cells, indicating that intact rDNA units also undergo silencing, albeit less frequently than damaged units. In search of the molecular mechanism of rDNA repression, we found that SUMO depletion led to a dramatic increase in the expression of R1 and R2 transposons, which integrate in native rDNA clusters (Figure 3B-E). Potent derepression of R1 and R2 was observed only upon SUMO depletion, but not in piRNA pathway mutants (Ninova et al., 2020a) or upon depletion of the E3 SUMO ligase Su(var)2-10 (Ninova et al., 2020a), indicating that SUMO plays a special role in repression of rDNA-targeting transposons that is independent of the piRNA pathway. Furthermore, depletion of SUMO also activates rDNA transgenes, indicating that they are repressed by the same mechanism that silences native rDNA units (Figure 4A). Surprisingly, SUMO KD increased expression of intact transgenes and released their repression in individual cells, indicating that the same SUMO-dependent pathway is responsible for repression of both damaged and intact rDNA.

Mechanism of SUMO-dependent rDNA repression
SUMO-dependent repression correlates with the levels of nascent pre-rRNA (Figure 4B-D) and decreased Pol I occupancy at rDNA promoters (Figure 2C), suggesting that repression acts at the level of transcription. How the presence of an insertion within the body of the rDNA unit leads to decreased transcription at the promoter remains to be understood. In mammals repression of rDNA units was shown to correlate with the presence of CpG DNA methylation and repressive histone marks near the rDNA promoter (Bird et al., 1981; Coffman et al., 2005; Conconi et al., 1989; Dammann et al., 1993; Flavell et al., 1988; Earley et al., 2006; Li et al., 2006; Santoro and Grummt, 2001; Santoro et al., 2002; Zhou et al., 2002).
Furthermore, the chromatin remodeling complexes NoRC and NuRD were shown to be involved in rDNA repression, probably by altering accessibility of the rDNA promoter to chromatin repressors such as DNA methyltransferases, histone methyltransferases and deacetylases (Strohner et al., 2001; [...]). [...] methylation (Urieli-Shoval et al., 1982), so this mechanism plays no role in rDNA repression in flies. Our results indicate that native (but not transgenic) rDNA units are enriched in H3K9me3, a histone mark associated with repression of genes transcribed by Pol II. However, the level of H3K9me3 remained high on rDNA units upon SUMO depletion (Figure 5), indicating that the presence of this mark is not sufficient for repression. Furthermore, rDNA transgenes inserted in a heterologous genomic locus are repressed by a SUMO-dependent mechanism despite the fact that they lack the H3K9me3 mark.
A large number of nuclear proteins are SUMOylated, and SUMO is required for many nuclear processes (Nacerddine et al., 2005; Zhao and Blobel, 2005), including different stages of ribosome maturation (Heun, 2007; Takahashi et al., 2008; Finkbeiner et al., 2011) and transcriptional repression of protein-coding genes (Gill, 2005; Verger et al., 2003). Therefore, SUMO might play two different (though not necessarily mutually exclusive) functions in rDNA silencing. First, repression might depend on SUMOylation of proteins directly involved in rDNA silencing. Alternatively, depletion of SUMO might influence rDNA expression indirectly by changing expression of genes involved in repression. In agreement with the known role of SUMO in Pol II transcription, its depletion in the female germline caused changes in gene expression. We observed statistically significant >2-fold up-regulation of ~4% (323) and down-regulation of ~1.5% (138) of the analyzed genes (qval<0.05, likelihood ratio test). However, the vast majority of affected genes changed within 2-10 fold, in stark contrast to the dramatic ~1000- and ~300-fold up-regulation of R1 and R2 elements upon SUMO depletion (Figure 3B). We did not find any significantly enriched GO terms related to nucleolar function and/or rRNA transcription among the gene sets up- and down-regulated upon SUMO KD (BH-adjusted p-value cutoff 0.05). Manual inspection showed that the only gene associated with the cellular component GO term 'nucleolus' ("GO:0005730") and offspring terms among down-regulated genes was smt3 itself.
Only two of the 323 up-regulated genes (ph-p and CG9123) are associated with the 'nucleolus' GO term; however, neither of them has a known function in rDNA expression. We did not find any matches to the biological process GO:0006360 (transcription by RNA polymerase I) and offspring terms. Therefore, while we cannot completely exclude the possibility of secondary effects of SUMO depletion, there is no direct indication that expression of genes involved in rDNA transcription and nucleolar function is affected by SUMO depletion.
Several lines of evidence suggest that SUMO plays a direct role in rDNA silencing through modification of one or several proteins involved in rDNA expression. First, many chromatin proteins, including histones, are substrates of SUMOylation (Shiio and Eisenman, 2003; Nathan et al., 2003), and ChIP-seq analysis revealed enrichment of SUMO at R1/R2 and rDNA sequences (Figure 6A). Second, our analysis of the SUMOylated proteome using both targeted and unbiased approaches indicates that many nucleolar proteins, including several components of the RNA Pol I machinery and the Udd protein previously implicated in rDNA expression (Zhang et al., 2014), are SUMOylated in vivo (Figure

(B) rDNA transgene expression is detected by RT-PCR in fly ovaries.
The RT-PCR amplicon of the UID ETS region is detected in transgenic flies, but not in wild-type flies or in the absence of reverse transcriptase (-RT). rDNA transgene expression in ovaries was measured by RT-qPCR and normalized to rp49 mRNA.
Error bars indicate standard deviation of three biological replicates. Statistical significance is estimated by two-tailed Student's t-test; *** p<0.001.

(C) rDNA transgene expression is detected by HCR FISH in fly ovaries.
Nascent transcripts of the pre-rRNA transgene (arrowhead) were detected in nurse cell nuclei using a probe against the UID sequence (red). Control wild-type flies lack the UID sequence. Scale bar: 5 µm.

The sequences of the R1 and R2 transposons were inserted into their natural integration sites within the 28S rRNA. Transgenes R1' and R2' are identical to the R1 and R2 transgenes, respectively, but contain a second UID flanking the retrotransposon sequence. The promoterless CFP sequence was inserted into the same R1 (CFP-1 transgene) and R2 (CFP-1 transgene) integration sites. A 29-nt sequence was inserted into the R2 site. All constructs were integrated into the same genomic att site (chr 2L: 1,582,820) on the 2nd chromosome using ΦC31-mediated recombination.

(B) Transgenic rRNA expression is decreased upon insertion of foreign sequence.
Expression of rDNA transgenes in ovaries (top) and carcasses (bottom) measured by RT-qPCR and normalized to rp49 mRNA. Error bars indicate standard deviation of three biological replicates. Statistical significance is estimated by two-tailed Student's t-test; *** p<0.001.

(C) R1 or R2 insertion decreases transgenic rRNA transcripts as detected by HCR FISH
Endogenous pre-rRNA is detected in stage 7-8 nurse cell nuclei using a probe against the ETS (green), while transgenes are detected with a probe against the UID (red). Vasa mRNA is detected as a control (green). Scale bar: 10 µm.

(D) Transgenic rDNA is expressed in fewer nuclei upon R1/R2 insertion
Shown is the number of nurse cell nuclei with positive HCR FISH signal per egg chamber (out of 15 nuclei per chamber). Each circle represents data from one egg chamber. Error bars indicate standard deviation from 5 egg chambers. Statistical significance is estimated by two-tailed Student's t-test; *** p<0.001.

(E) Transgenic rDNA HCR FISH signal is reduced upon R1/R2 insertion
The total intensity of HCR FISH signal was measured in individual nuclei that have positive signal. Error bars indicate standard deviation from 5 nuclei. Statistical significance is estimated by two-tailed Student's t-test; *** p<0.001.

(D) Transgenic rDNA HCR FISH signal is increased upon SUMO KD
The total intensity of HCR FISH signal was measured in individual nuclei that have positive signal. Error bars indicate the standard deviation of 5 nuclei. Statistical significance is estimated by two-tailed Student's t-test; *** p<0.001.

(E) SUMO knockdown increases pre-rRNA expression.
Expression of pre-rRNAs was measured by RT-qPCR using primers that target the ETS region and normalized to rp49 mRNA. Germline-specific knockdown of SUMO (shSmt3) or a control gene (shW) was induced by a small hairpin driven by the maternal-tubulin-Gal4 driver. In S2 cells, knockdown of SUMO (dsSmt3) or control (dsGFP) was induced by double-stranded RNA. Error bars indicate the standard deviation of three biological replicates. Statistical significance is estimated by two-tailed Student's t-test; ** p<0.01, *** p<0.001.

(A) Pol I occupancy increases over rDNA transgenes upon SUMO KD
Germline-specific knockdown of SUMO (shSmt3) or a control gene (shW) was induced by a small hairpin driven by the maternal-tubulin-Gal4 driver. ChIP of Rpl135-GFP using a GFP antibody for pull-down was followed by qPCR analysis using UID-specific primers. Data were normalized using a sequence mapping to a gene-poor region. Error bars indicate standard deviation of three biological replicates.

(B-C) H3K9me3 enrichment over native rDNA and the R1 and R2 transposons sequences is unaffected by SUMO KD.
Germline knockdown was induced by a small hairpin driven by the maternal-tubulin-Gal4 driver. (B) H3K9me3 ChIP-seq signal and corresponding input coverage across the R1 and R2 consensus sequences (RepBase) and the rDNA unit (Stage et al., 2007) from control (shW) and SUMO-depleted (shSmt3) ovaries. Data are normalized to total reads mapping to the genome. (C) H3K9me3 ChIP-qPCR using primers to native 28S rDNA interrupted with R1 and R2 insertions as in Figure 3D.
ChIP-to-input enrichment is normalized to a region that has a high level of the H3K9me3 mark (chr2R: 4,141,405-4,141,502). Error bars indicate the standard deviation of three biological replicates. Statistical significance is estimated by two-tailed Student's t-test; ** p<0.01.
Rpl135-GFP (green) and the nucleolus marker Fibrillarin, detected by IF (red), co-localize in nurse cells of the fly ovary. Scale bar: 10 µm.

(B) Rpl135-GFP ChIP-qPCR.
Fold enrichment of Rpl135-GFP as measured on the ETS and 18S of endogenous rDNA in ovaries by ChIP-qPCR, normalized using a sequence mapping to a gene-poor region (Kalahari). Error bars indicate standard deviation of three biological replicates. Statistical significance is estimated by two-tailed Student's t-test; * p<0.05, ** p<0.01.

GFP-tagged proteins were co-expressed with FLAG-SUMO in S2 cells.
SUMOylation was detected after immunoprecipitation with anti-GFP antibodies.
Several proteins were tagged at either the N- or C-terminus. In the case of CG3756, only the C-terminally tagged protein appears to be SUMOylated.

(C) RNAi screen for genes involved in R1 and R2 repression in S2 cells.
After knock-down of gene expression using RNAi, expression of the R1 and R2 transposons was measured by RT-qPCR and normalized to rp49 mRNA. Shown is the fold up-regulation of R1 and R2 expression upon knock-down compared to control (double-stranded RNA against the eGFP gene). Expression levels were measured in three biological replicates. Statistical significance is estimated by two-

Construction of rDNA transgenes and generation of transgenic flies
To make an rDNA unit marked with a unique sequence (UID), the full-length rDNA unit including the IGS (10.5 kb) was amplified by overlapping PCR from the DmrY22 plasmid (Long and Dawid, 1979)

RNA HCR FISH and image analysis
The previously described HCR FISH protocol (Choi et al., 2018)  Samples were incubated with the hairpin solution for 12-16 h in the dark at room temperature, followed by washing with 500 µl 5x SSCT at room temperature in the following order: 2x 5 min, 2x 30 min and 1x 5 min. Samples were preserved on glass slides with mounting medium and imaged using a ZEISS LSM880 microscope. Probe sequences are listed in Supplementary File 2.
FISH signal intensity was analyzed using Fiji. For each nucleus, z-stack images were taken at the same interval, and the total FISH signal was calculated as the sum of the signals from all images in the z-stack.
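The per-nucleus quantification described above can be sketched in a few lines of Python. This is an illustration only, not the Fiji procedure used in the study: the function name `total_fish_signal`, the nested-list image representation, and the optional `background` parameter are assumptions introduced for this example.

```python
from typing import Sequence


def total_fish_signal(
    z_stack: Sequence[Sequence[Sequence[float]]],
    background: float = 0.0,
) -> float:
    """Sum pixel intensities over every image of one nucleus' z-stack.

    z_stack is a list of 2D images (rows of pixel intensities), one per
    z-slice, already cropped to the nucleus of interest.
    background is a hypothetical per-pixel offset subtracted before
    summing; the default of 0.0 reproduces a plain sum.
    """
    total = 0.0
    for image in z_stack:
        for row in image:
            for pixel in row:
                # Clip at zero so background subtraction never adds negative signal.
                total += max(pixel - background, 0.0)
    return total


# Two toy 2x2 slices standing in for a cropped nucleus.
example_stack = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
print(total_fish_signal(example_stack))
```

Because the slices are summed rather than averaged, the value is comparable between nuclei only when the stacks are acquired at the same z-interval, as stated above.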

Immunofluorescence microscopy
Immunofluorescence microscopy was performed as previously described (Hur et al., 2013). Anti-Fibrillarin antibody (Abcam ab5821) was added to fixed ovaries at 1:500 dilution and incubated at 4°C overnight, followed by incubation with a 1:500 dilution of secondary Alexa Fluor 568 antibody. Images were acquired using the ZEISS LSM880.

RNAi in S2 cells
The RNAi protocol was described previously (Rogers and Rogers, 2008

RT-qPCR
About 20 fly ovaries were dissected and homogenized in 1 ml TRIzol (Invitrogen), and RNA was extracted and precipitated according to the manufacturer's instructions. Reverse transcription was performed using SuperScript III (Invitrogen) with random hexamers. RT-qPCR target expression was normalized to rp49 mRNA expression.
All qPCR primers are listed in Supplementary File 2.
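As an illustration of normalizing a target to rp49, the sketch below uses the common 2^-ΔCt calculation. The text does not state which formula was used, so the method, the assumption of 100% primer efficiency, and the function names `relative_expression` and `summarize_replicates` are all assumptions made for this example.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Target abundance relative to the reference gene (2^-dCt).

    Assumes perfect doubling per cycle for both primer pairs.
    """
    return 2.0 ** (ct_reference - ct_target)


def summarize_replicates(
    ct_target: Sequence[float],
    ct_reference: Sequence[float],
) -> Tuple[float, float]:
    """Mean and sample standard deviation across paired biological replicates."""
    values = [relative_expression(t, r) for t, r in zip(ct_target, ct_reference)]
    return mean(values), stdev(values)


# Hypothetical Ct values for three biological replicates.
transgene_ct = [20.0, 21.0, 22.0]
rp49_ct = [18.0, 18.0, 18.0]
print(summarize_replicates(transgene_ct, rp49_ct))
```

The mean and standard deviation returned here correspond to the bar height and error bar reported for three biological replicates in the figure legends.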

ChIP-seq and ChIP-qPCR
All ChIP experiments followed the protocol described previously (Chen et al., 2016) using anti-H3K9me3 antibody from Abcam (ab8898) and anti-GFP antibody from ThermoFisher (A11122) for Rpl135-GFP. qPCR signal using primers to the respective regions was normalized to Kalahari as described previously (Sienski et al., 2015). All qPCR primers are listed in Supplementary File 2. ChIP-seq libraries were generated using the NEBNext ChIP-Seq Library Prep Master Mix Set. All libraries were sequenced on the Illumina HiSeq 2500 platform.
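The normalization of ChIP-qPCR signal to a control region can be illustrated as a ratio of ChIP-over-input values. This is a sketch under stated assumptions rather than the exact calculation used here: the Ct-based formula, the assumption of 100% primer efficiency, and the function names `chip_over_input` and `normalized_enrichment` are hypothetical.

```python
def chip_over_input(ct_chip: float, ct_input: float) -> float:
    """ChIP signal relative to input for one region (assumes doubling per cycle)."""
    return 2.0 ** (ct_input - ct_chip)


def normalized_enrichment(
    target_chip_ct: float,
    target_input_ct: float,
    control_chip_ct: float,
    control_input_ct: float,
) -> float:
    """Enrichment at a target region divided by enrichment at a control region.

    Any constant input dilution factor cancels in the ratio, provided the
    same ChIP and input samples are used for both regions.
    """
    target = chip_over_input(target_chip_ct, target_input_ct)
    control = chip_over_input(control_chip_ct, control_input_ct)
    return target / control


# Hypothetical Ct values: target region vs. a gene-poor control region.
print(normalized_enrichment(24.0, 26.0, 27.0, 26.0))
```

A value above 1 means the target region is more enriched in the ChIP sample than the control region.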

RNA-seq
Total RNA was extracted from fly ovaries using TRIzol according to the manufacturer's instructions.  (Bray et al., 2016) with the following parameters: '--single -t 4 -l 200 -s 50 -b 30 --rf-stranded'. Subsequent differential expression analysis was performed with sleuth using the gene analysis option (Pimentel et al., 2017). Fold changes in gene expression were calculated from the average TPM between replicates in knockdown versus control ovaries.
The R code for this analysis is available on GitHub at https://github.com/mninova/smt3_KD. Gene ontology (GO) analysis was performed on the genes that were significantly up- and down-regulated upon smt3 KD (log2FC>1 and qval<0.05, likelihood ratio test in sleuth), using the clusterProfiler R package (Yu et al., 2012) as well as the FlyMine web server, with either all ovary-expressed genes that were not filtered out by sleuth or all genes as background. In either case, no enrichment for rRNA- or nucleolar-related function was found (BH-adjusted p-value cutoff 0.05).
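The selection of differentially expressed genes described above can be sketched as follows. This is not the authors' R code (which is in the linked repository). It is a minimal Python illustration in which the gene names, the pseudocount added to TPM values, and the symmetric cutoff for down-regulated genes (log2FC < -1) are assumptions.

```python
import math
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def log2_fold_change(
    kd_tpm: Sequence[float],
    ctrl_tpm: Sequence[float],
    pseudocount: float = 1.0,
) -> float:
    """log2 ratio of average TPM in knockdown versus control replicates.

    The pseudocount is an assumption that avoids division by zero.
    """
    return math.log2((mean(kd_tpm) + pseudocount) / (mean(ctrl_tpm) + pseudocount))


def classify_genes(
    tpm: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    qvals: Dict[str, float],
    lfc_cutoff: float = 1.0,
    q_cutoff: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Return (up-regulated, down-regulated) gene lists passing both cutoffs."""
    up: List[str] = []
    down: List[str] = []
    for gene, (kd, ctrl) in tpm.items():
        q = qvals.get(gene)
        if q is None or q >= q_cutoff:
            continue  # not significant, or no q-value reported for this gene
        lfc = log2_fold_change(kd, ctrl)
        if lfc > lfc_cutoff:
            up.append(gene)
        elif lfc < -lfc_cutoff:
            down.append(gene)
    return sorted(up), sorted(down)


# Hypothetical genes: (knockdown TPMs, control TPMs) and q-values.
tpm_example = {
    "geneA": ([30.0, 34.0], [7.0, 7.0]),
    "geneB": ([1.0, 1.0], [9.0, 9.0]),
    "geneC": ([10.0, 10.0], [10.0, 10.0]),
    "geneD": ([100.0, 100.0], [1.0, 1.0]),
}
qval_example = {"geneA": 0.01, "geneB": 0.02, "geneC": 0.001, "geneD": 0.5}
print(classify_genes(tpm_example, qval_example))
```

The two resulting lists are the kind of input that would then be passed to a GO enrichment tool together with the chosen background gene set.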
For genomic enrichment histograms and heatmaps in Supplemental Figure 1, RNA-seq and ChIP-seq data were aligned to the dm3 assembly using Bowtie

Jakubczak, J. L., Xiong, Y., & Eickbush, T. H. (1990). Type I (R1) and type II (R2) ribosomal DNA insertions of Drosophila melanogaster are retrotransposable elements closely related to those of Bombyx mori. Journal of Molecular Biology, 212(1)