Abstract
Interactions between the splicing machinery and RNA polymerase II increase protein-coding gene transcription. Similarly, exons and splicing signals of enhancer-generated long noncoding RNAs (elncRNAs) augment enhancer activity. However, elncRNAs are inefficiently spliced, suggesting that, compared with protein-coding genes, they contain qualitatively different exons with a limited ability to drive splicing. We show here that the inefficiently spliced first exons of elncRNAs as well as promoter-antisense long noncoding RNAs (pa-lncRNAs) in human and mouse cells trigger a transcription termination checkpoint that requires WDR82, an RNA polymerase II–binding protein, and its RNA-binding partner of previously unknown function, ZC3H4. We propose that the first exons of elncRNAs and pa-lncRNAs are an intrinsic component of a regulatory mechanism that, on the one hand, maximizes the activity of these cis-regulatory elements by recruiting the splicing machinery and, on the other, contains elements that suppress pervasive extragenic transcription.
Data availability
The complete list of datasets used in this study is reported in Supplementary Table 10. Datasets generated in this study are available in the Gene Expression Omnibus (GEO) database under the accession number GSE133109. Source data are provided with this paper.
Acknowledgements
We thank B. Amati (IEO) and S. Monticelli (IRB, Bellinzona) for critical comments on the manuscript. This work was supported by the European Research Council (Advanced ERC grant no. 692789 to G.N.).
Contributions
L.M.I.A., V.P., M.R. and G.N. conceptualized the study. L.M.I.A. and M.R. generated all data with contributions from E.P., D.P., S.G. and G.R.D. V.P. analyzed all data with contributions from I.B. S.P. generated and processed sequencing libraries. G.N. wrote the manuscript with contributions from all authors. G.N. supervised the study. G.N. was responsible for funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Structural & Molecular Biology thanks Christopher Glass and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Anke Sparmann was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Extragenic transcription in cells depleted of WDR82 or other transcription terminators.
a, The effects of the depletion of known termination factors on extragenic transcription were measured by 4sU labeling and sequencing in mouse bone marrow-derived macrophages. We considered the n=2,870 extragenic regions whose transcription was increased in macrophages depleted of WDR82 at 45 min after LPS stimulation and measured their 4sU labeling in macrophages depleted of the indicated proteins. Each transcript was assigned to the nearest annotated enhancer, Transcription Start Site (TSS) or Transcription End Site (TES). The log2-transformed fold change (sh vs. scramble) for each depletion experiment is shown. Statistical significance was assessed using the two-tailed Wilcoxon signed rank test and a p-value ≤ 0.01 was considered significant. p-values for transcripts assigned to enhancers: Exosc3 p-value=2.2e-208, Ars2 p-value=2.9e-206, Ints11 p-value=1.4e-217, CFIm25 p-value=4.4e-189, Xrn2 p-value=9.2e-239. p-values for transcripts assigned to TSS: Exosc3 p-value=4.6e-69, Ars2 p-value=3.7e-73, Ints11 p-value=1.6e-72, CFIm25 p-value=7.1e-72, Xrn2 p-value=7.8e-101. p-values for transcripts assigned to TES: Exosc3 p-value=1.5e-14, Ars2 p-value=5.8e-10, Ints11 p-value=6.7e-26, CFIm25 p-value=0.149775, Xrn2 p-value=1.0e-81. ***=p-value <0.01. ns: not statistically significant. Inside the boxplot, the median value for each fold change is shown with a horizontal black line. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median. b, Comparison of the effects of the depletion of WDR82 and INTS11 on transcription termination at snRNA genes. c, A representative genomic region on mouse chromosome 11 containing multiple snRNA genes. d, Snapshots of genomic regions showing the effects of the depletion of WDR82 and other termination factors on extragenic transcription.
Extended Data Fig. 2 Interaction of WDR82 with the zinc finger protein ZC3H4.
a,b, Immunoprecipitations were carried out either with an anti-Flag antibody on extracts of HEK-293 cells transduced with a Flag-mouse ZC3H4 expression vector (a) or with an anti-ZC3H4 rabbit polyclonal antibody on extracts from Raw264.7 mouse macrophages (b). Different parts of the western blot membrane were probed with the indicated antibodies. Data are representative of n=4 independent experiments. The position of molecular weight markers (kDa) is shown on the right. Uncropped images are available online as Source Data. c, Upper panel: Schematic representation of (A) the full-length human ZC3H4 protein and (B to E) its deletion mutants used in transfection and co-immunoprecipitation experiments. The ZC3H4 domains annotated in UniProt are shown. Bottom panel: lysates from HEK-293 cells, either untransfected (-) or transduced with the indicated Flag-ZC3H4 expression vectors (A-E), were used in co-immunoprecipitation experiments with an anti-Flag antibody. Inputs (left) and immunoprecipitates (right) were immunoblotted and probed with an anti-Flag (top) or an anti-WDR82 (bottom) antibody as indicated. The position of molecular weight markers (kDa) is shown on the right. Uncropped images are available online as Source Data. d, The Flag-tagged ZC3H4 C-terminal fragment (804–1303) was expressed in HeLa cells. Lysates were immunoprecipitated with an anti-ZC3H4 antibody directed against amino acids 677–765 and blotted with anti-Flag or anti-WDR82 antibody. Inputs are shown on the left and molecular weight markers (kDa) on the right. Uncropped images are available online as Source Data.
Extended Data Fig. 3 Effects of WDR82 and ZC3H4 co-depletion on extragenic transcription in HeLa cells.
a, The effects of WDR82, ZC3H4 or their combined depletion by siRNA transfection were measured on selected extragenic transcripts, as indicated. In co-depletion experiments, a double amount of siRNA was used, as indicated. The bar plots show the mean ± SD of n=4 biological replicates. The data were normalized to the housekeeping gene CDC25B. Light grey columns: 30 pmol siRNA; dark grey columns: 60 pmol siRNA. b, Depletion efficiency of WDR82 (left) and ZC3H4 mRNA (right) in individual and combined depletions. The bar plots show the mean ± SD of n=4 independent experiments.
Extended Data Fig. 4 Distribution of WDR82, ZC3H4 and RNA Pol II ChIP-seq peaks.
a, Classification of WDR82 and ZC3H4 ChIP-seq peaks based on their genomic location. TSS: Transcription Start Site; TES: Transcription End Site. Data are from n=2 independent experiments. b, Transcribed protein-coding genes (n=10,917) were divided into quartiles of increasing RNA Pol II occupancy. The heatmaps show WDR82, ZC3H4 and RNA Pol II ChIP-seq signals at genes of the 1st and 4th quartiles.
Extended Data Fig. 5 Analysis of spliced and unspliced lncRNAs suppressed by WDR82.
Extragenic transcripts upregulated upon WDR82 depletion in mouse macrophages (n=2,870; top) or in HeLa cells (n=1,509; bottom) were first divided into spliced (left) and unspliced RNA species (right). Then within each of these two groups they were further divided based on their overlap with lncRNAs in the NONCODE v5 database of non-coding RNAs, classified into single exon and multi-exonic ncRNAs.
Extended Data Fig. 6 Analysis of splice efficiency and splice site sequences of lncRNAs suppressed by ZC3H4-WDR82 in HeLa cells.
a, Splicing efficiency at WDR82-suppressed lncRNA junctions (n=3,717) (top panel) and at a randomly selected set of pre-mRNA junctions (n=4,000) (bottom panel) in HeLa cells. A window of ±10 nucleotides centered on the 5′ splice sites was used to measure read counts in polyA RNA-seq data. b, log2-transformed ratio of polyA RNA-seq reads in a 20 nt window centered on the 5′ splice sites of WDR82-suppressed lncRNAs or of a randomly selected set of mRNAs with at least one splice junction. c, Analysis of 5′ (left) and 3′ (right) splice site strength (measured as MaxEnt scores) at WDR82-suppressed lncRNAs. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test for both the 5′ (p-value=1.2e-21) and the 3′ (p-value=2.6e-06) splice sites. ***=p-value <0.01. Nucleotide frequencies at splice sites are shown as sequence logos. Donor and acceptor splice sites are indicated as black triangles. d, Effects of the depletion of WDR82-ZC3H4 on transcription of protein-coding genes in HeLa cells. Expressed protein-coding genes (n=8,804) were divided into deciles based on their sensitivity to the depletion of WDR82 in 4sU-seq data, with the 10th decile including the most upregulated genes. Log2-transformed RNA fold changes (polyA and 4sU RNA-seq data) and log2-transformed reads ratio across the first exon-intron junction, as annotated in GENCODE, are shown for the 10th, 5th and 1st deciles. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test (p-value=2.0e-21). ***=p-value <0.01. Data are from n=3 independent experiments. In the boxplots in panels b, c and d, the median value for each group is shown with a horizontal black line. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median.
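The decile grouping described in panel d can be illustrated with a minimal Python sketch. It assumes each gene already has a single log2 fold change value from the 4sU-seq data; the function name `assign_deciles` and the tie-handling by sort order are illustrative assumptions, not the authors' published pipeline.

```python
def assign_deciles(log2_fold_changes):
    """Assign each gene a decile from 1 to 10 by ascending log2 fold change.

    Decile 10 holds the most upregulated genes, matching the caption.
    The rank-based binning used here is an assumption for illustration.
    """
    n = len(log2_fold_changes)
    # Indices of genes sorted from most downregulated to most upregulated
    order = sorted(range(n), key=lambda i: log2_fold_changes[i])
    deciles = [0] * n
    for rank, gene_index in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 onto bins 1..10
        deciles[gene_index] = rank * 10 // n + 1
    return deciles


example = [3.0, -1.0, 0.5, 2.0, 1.0, 0.0, -0.5, 1.5, 2.5, -2.0]
print(assign_deciles(example))
```

With real data, genes in bins 10, 5 and 1 would then be compared, as in the panel.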
Extended Data Fig. 7 Relationship between gene transcript expression and splicing efficiency.
a, Genes were ranked into deciles of decreasing expression based on 4sU-seq data in macrophages (left) and HeLa cells (right). In both panels, expression of the lncRNAs upregulated in WDR82-depleted cells is shown in the red boxes on the right. Data are from n=3 independent experiments. b, Splicing efficiency of the 1st exon of the ranked genes was measured by dividing the sequencing reads in the 10 nt upstream by those in the 10 nt downstream of the 5′ splice junction in polyA RNA-seq data. Left: macrophages (n=6,280 junctions); right: HeLa (n=8,804 junctions). Data are from n=3 independent experiments. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median.
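The splicing efficiency measure in panel b is a ratio of read counts on either side of the 5′ splice junction. The Python sketch below shows that calculation under stated assumptions: the read counts for the two 10 nt windows are taken as given, and the `pseudocount` parameter is a hypothetical addition to avoid division by zero, which the caption does not mention.

```python
import math


def splicing_ratio(upstream_reads, downstream_reads, pseudocount=1.0):
    """Ratio of reads in the 10 nt upstream (exonic) window to reads in the
    10 nt downstream (intronic) window of a 5' splice junction.

    Higher values indicate fewer unspliced reads across the junction.
    The pseudocount is an assumption added for numerical safety.
    """
    return (upstream_reads + pseudocount) / (downstream_reads + pseudocount)


def log2_splicing_ratio(upstream_reads, downstream_reads, pseudocount=1.0):
    """log2-transformed version of the same ratio."""
    return math.log2(splicing_ratio(upstream_reads, downstream_reads, pseudocount))


# Efficiently spliced junction: many exonic reads, few intronic reads
print(splicing_ratio(99, 9))
# Poorly spliced junction: similar read counts on both sides
print(log2_splicing_ratio(20, 20))
```

A ratio near 1 (log2 near 0) means reads continue into the intron at the same density as in the exon, which is the pattern expected for an inefficiently spliced first exon.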
Extended Data Fig. 8 Exonic splice enhancer (ESE) sequences in exons of WDR82-suppressed lncRNAs and in mRNAs.
a, Number of ESEs per exon in lncRNAs and in mRNAs suppressed by WDR82 in macrophages. Data are from n=3 independent experiments. b, Distance between ESEs and 5′ splice sites in lncRNAs and in mRNAs suppressed by WDR82. Data are from n=3 independent experiments. c, Number of ESEs per exon recognized by individual SRSF proteins in lncRNAs and in mRNAs suppressed by WDR82.
Extended Data Fig. 9 Characterization of splicing efficiency and splice site quality of extragenic transcripts not affected by WDR82 depletion.
a, log2-transformed ratio of polyA RNA-seq reads upstream and downstream of the 5′ splice sites of WDR82-suppressed and WDR82-insensitive lncRNAs in HeLa cells. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test (p-value=1.8e-175 for the controls and p-value=1.3e-199 in WDR82-depleted cells). ***=p-value <0.01. Data are from n=3 independent experiments. b, Analysis of 5′ (left) and 3′ (right) splice site strength at WDR82-suppressed and WDR82-insensitive lncRNAs in HeLa cells. MaxEnt scores for both donor and acceptor splice sites were measured. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test for both the 5′ (p-value=1.3e-20) and the 3′ (p-value=3e-04) splice sites. ***=p-value <0.01. Data are from n=3 independent experiments. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median.
Extended Data Fig. 10 First exon deletions in protein coding genes.
a, Schematic representation of the deletion of the first exons of protein-coding genes. sgRNAs were designed to remove a genomic sequence that included the first exon from 30–50 nt downstream of the TSS to the intronic sequences just downstream of the 5′ splice site. b, Expression of the indicated gene mRNAs was measured by qRT-PCR in bulk populations of wild-type or first exon-deleted HeLa cells after transfection of the indicated siRNAs. Primers used were specific for spliced mRNAs and were designed on downstream exons (Methods). The plot shows the mean ± s.d. of n=3 independent experiments. *P < 0.05; **P < 0.01, by two-tailed t-test. The data were normalized to the housekeeping gene NRSN2. P-values for COG2: WT vs. DEL siCtl = 1.75E-07; DEL siCtl vs. siWdr82 = 0.78 (n.s.); DEL siCtl vs. siWdr82 = 0.80 (n.s.). P-values for FAM174A: WT vs. DEL siCtl = 0.0005; DEL siCtl vs. siWdr82 = 0.85 (n.s.); DEL siCtl vs. siWdr82 = 0.15 (n.s.). P-values for RRP15: WT vs. DEL siCtl = 0.013; DEL siCtl vs. siWdr82 = 0.70 (n.s.); DEL siCtl vs. siWdr82 = 0.65 (n.s.). c, First exon deletion efficiency at the three genes tested was analyzed by genomic PCR. The quantification of the wild-type allele gel band in wild-type cells and in cells in which the first exon was deleted using sgRNAs+Cas9 is shown on the right. Uncropped images are available online as Source Data.
Supplementary information
Supplementary Information
Supplementary Figs. 1–5.
Supplementary Table 1
Extragenic transcription changes in primary mouse bone-marrow-derived macrophages depleted of WDR82 or other transcription termination factors. 4sU–seq datasets were generated in mouse bone-marrow-derived macrophages depleted of the indicated factors by retroviral delivery of shRNAs. Two different shRNAs per target were used as indicated. Data for each depleted factor are shown in different sheets. Data are from n = 4 independent experiments for WDR82 and n = 2 independent experiments for the other factors. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.
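The Benjamini–Hochberg correction named in the supplementary table descriptions can be sketched in a few lines of Python. This is a generic textbook implementation for illustration, assuming a plain list of raw p-values as input; it is not the authors' code, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR) in input order.

    Each p-value is scaled by m / rank, then a running minimum is taken
    from the largest rank downwards so adjusted values stay monotonic.
    """
    m = len(p_values)
    # Indices sorted by ascending raw p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Regions whose adjusted value falls below a chosen FDR threshold would then be called as significantly changed; the threshold itself is not stated in this section.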
Supplementary Table 2
Deregulation of extragenic transcription in primary mouse bone-marrow-derived macrophages depleted of ZC3H4. 4sU–seq datasets were generated in macrophages transduced with retroviral vectors expressing ZC3H4-targeting shRNAs. Data are from n = 3 independent experiments. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.
Supplementary Table 3
Extragenic transcription changes in HeLa cells upon disruption of the WDR82–ZC3H4 complex. Tabs 1 and 2: HeLa cells were transfected with siRNAs targeting WDR82 or ZC3H4, and 4sU–seq datasets were generated. Data are from n = 3 independent experiments. Tab 3: HeLa cells were transfected with an expression vector encoding ZC3H4(804–1303) and, 48 h later, 4sU–seq datasets were generated. Data are from n = 3 independent experiments. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.
Supplementary Table 4
WDR82, ZC3H4 and RNA Pol II ChIP–seq datasets from mouse macrophages. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.
Supplementary Table 5
Overlap of annotated lncRNAs in the NONCODE v.5 database with extragenic transcripts upregulated in mouse macrophages upon WDR82 or ZC3H4 depletion. Extragenic regions showing increased transcription in 4sU–seq datasets from macrophages depleted of WDR82 or ZC3H4 were overlapped with annotated lncRNAs from the NONCODE v.5 database. The identity of the annotated lncRNA corresponding to each region upregulated in WDR82- or ZC3H4-depleted cells is shown.
Supplementary Table 6
Noncoding RNA junctions overlapping with extragenic transcripts upregulated in macrophages upon depletion of WDR82. polyA-RNA-seq datasets were used to identify splice junctions in the extragenic regions detected as upregulated in 4sU–seq data from WDR82-depleted macrophages. Exonic splice enhancers are also indicated. Data are from n = 3 independent experiments.
Supplementary Table 7
Overlap of annotated lncRNAs in the NONCODE v.5 database with extragenic transcripts upregulated in HeLa cells upon WDR82 or ZC3H4 depletion. Extragenic regions showing increased transcription in 4sU–seq datasets from HeLa cells depleted of WDR82 or ZC3H4 were overlapped with annotated lncRNAs from the NONCODE v.5 database. The identity of the annotated lncRNA corresponding to each region upregulated in WDR82- or ZC3H4-depleted cells is shown.
Supplementary Table 8
Noncoding RNA junctions overlapping with extragenic transcripts upregulated in HeLa cells upon depletion of WDR82. PolyA-RNA-seq data were used to identify splice junctions in the extragenic regions detected as upregulated in 4sU–seq data obtained in WDR82-depleted HeLa cells.
Supplementary Table 9
Sequences of the primers used in this study.
Supplementary Table 10
List of the datasets generated in this study.
Source data
Source Data Extended Data Fig. 2
Uncropped western blots.
Source Data Extended Data Fig. 10
Uncropped agarose gels.
Cite this article
Austenaa, L.M.I., Piccolo, V., Russo, M. et al. A first exon termination checkpoint preferentially suppresses extragenic transcription. Nat Struct Mol Biol 28, 337–346 (2021). https://doi.org/10.1038/s41594-021-00572-y
DOI: https://doi.org/10.1038/s41594-021-00572-y