A first exon termination checkpoint preferentially suppresses extragenic transcription

Abstract

Interactions between the splicing machinery and RNA polymerase II increase protein-coding gene transcription. Similarly, exons and splicing signals of enhancer-generated long noncoding RNAs (elncRNAs) augment enhancer activity. However, elncRNAs are inefficiently spliced, suggesting that, compared with protein-coding genes, they contain qualitatively different exons with a limited ability to drive splicing. We show here that the inefficiently spliced first exons of elncRNAs as well as promoter-antisense long noncoding RNAs (pa-lncRNAs) in human and mouse cells trigger a transcription termination checkpoint that requires WDR82, an RNA polymerase II–binding protein, and its RNA-binding partner of previously unknown function, ZC3H4. We propose that the first exons of elncRNAs and pa-lncRNAs are an intrinsic component of a regulatory mechanism that, on the one hand, maximizes the activity of these cis-regulatory elements by recruiting the splicing machinery and, on the other, contains elements that suppress pervasive extragenic transcription.

Fig. 1: Effects of ZC3H4 depletion on extragenic transcription.
Fig. 2: Recruitment of the WDR82–ZC3H4 complex to genomic sites with high RNA Pol II occupancy in mouse macrophages.
Fig. 3: Control of lncRNA production by WDR82–ZC3H4.
Fig. 4: lncRNAs suppressed by WDR82–ZC3H4 contain inefficiently spliced exons.
Fig. 5: A first exon transcription termination checkpoint.

Data availability

The complete list of datasets used in this study is reported in Supplementary Table 10. Datasets generated in this study are available in the Gene Expression Omnibus (GEO) database under the accession number GSE133109. Source data are provided with this paper.

Acknowledgements

We thank B. Amati (IEO) and S. Monticelli (IRB, Bellinzona) for critical comments on the manuscript. This work was supported by the European Research Council (Advanced ERC grant no. 692789 to G.N.).

Author information

Contributions

L.M.I.A., V.P., M.R. and G.N. conceptualized the study. L.M.I.A. and M.R. generated all data with contributions from E.P., D.P., S.G. and G.R.D. V.P. analyzed all data with contributions from I.B. S.P. generated and processed sequencing libraries. G.N. wrote the manuscript with contributions from all authors. G.N. supervised the study. G.N. was responsible for funding acquisition.

Corresponding authors

Correspondence to Liv M. I. Austenaa or Gioacchino Natoli.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Structural & Molecular Biology thanks Christopher Glass and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available. Anke Sparmann was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Extragenic transcription in cells depleted of WDR82 or other transcription terminators.

a, The effects of depleting known termination factors on extragenic transcription were measured by 4sU labeling and sequencing in mouse bone marrow-derived macrophages. We considered the n=2,870 extragenic regions whose transcription was increased in macrophages depleted of WDR82 at 45 min after LPS stimulation and measured their 4sU labeling in macrophages depleted of the indicated proteins. Each transcript was assigned to the nearest annotated enhancer, Transcription Start Site (TSS) or Transcription End Site (TES). The log2-transformed fold change (sh vs. scramble) for each depletion experiment is shown. Statistical significance was assessed using the two-tailed Wilcoxon signed rank test and a p-value ≤ 0.01 was considered significant. p-values for transcripts assigned to enhancers: Exosc3 p-value = 2.2e-208, Ars2 p-value = 2.9e-206, Ints11 p-value = 1.4e-217, CFIm25 p-value = 4.4e-189, Xrn2 p-value = 9.2e-239. p-values for transcripts assigned to TSS: Exosc3 p-value = 4.6e-69, Ars2 p-value = 3.7e-73, Ints11 p-value = 1.6e-72, CFIm25 p-value = 7.1e-72, Xrn2 p-value = 7.8e-101. p-values for transcripts assigned to TES: Exosc3 p-value = 1.5e-14, Ars2 p-value = 5.8e-10, Ints11 p-value = 6.7e-26, CFIm25 p-value = 0.149775, Xrn2 p-value = 1.0e-81. *** = p-value < 0.01. ns: not statistically significant. Inside the boxplot, the median value for each fold change is shown with a horizontal black line. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median. b, Comparison of the effects of the depletion of WDR82 and INTS11 on transcription termination at snRNA genes. c, A representative genomic region on mouse chromosome 11 containing multiple snRNA genes. d, Snapshots of genomic regions showing the effects of the depletion of WDR82 and other termination factors on extragenic transcription.
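The quantities plotted in panel a (a log2 fold change per region, and a notched box around the median) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the pseudocount, the function names and the notch formula (median ± 1.57 × IQR / √n, the convention used by common plotting libraries) are assumptions made here for illustration.

```python
import math
import statistics


def log2_fold_change(knockdown, scramble, pseudocount=1.0):
    """log2(knockdown / scramble) for one region.

    The pseudocount avoids division by zero; its value is an assumption,
    not something stated in the caption.
    """
    return math.log2((knockdown + pseudocount) / (scramble + pseudocount))


def notch_interval(values):
    """Approximate 95% confidence interval for the median of a boxplot.

    Uses the common notch convention: median +/- 1.57 * IQR / sqrt(n).
    Returns (lower, median, upper).
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / math.sqrt(len(values))
    return median - half_width, median, median + half_width


if __name__ == "__main__":
    # Hypothetical read counts (knockdown, scramble) for a few regions.
    counts = [(30, 10), (12, 11), (50, 9), (7, 7), (22, 5)]
    fold_changes = [log2_fold_change(kd, ctl) for kd, ctl in counts]
    print(notch_interval(fold_changes))
```

If the notches of two boxes do not overlap, their medians are usually taken to differ at roughly the 5% level, which is why notches complement the Wilcoxon p-values reported in the caption.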

Extended Data Fig. 2 Interaction of WDR82 with the zinc finger protein ZC3H4.

a,b, Immunoprecipitations were carried out either with an anti-Flag antibody on extracts of HEK-293 cells transduced with a Flag-mouse ZC3H4 expression vector (a) or with an anti-ZC3H4 rabbit polyclonal antibody on extracts from Raw264.7 mouse macrophages (b). Different parts of the western blot membrane were probed with the indicated antibodies. Data are representative of n=4 independent experiments. The position of molecular weight markers (kDa) is shown on the right. Uncropped images are available online as Source Data. c, Upper panel: Schematic representation of (A) the full-length human ZC3H4 protein and (B to E) its deletion mutants used in transfection and co-immunoprecipitation experiments. The ZC3H4 domains annotated in UniProt are shown. Bottom panel: lysates from HEK-293 cells, either untransfected (-) or transduced with the indicated Flag-ZC3H4 expression vectors (A-E), were used in co-immunoprecipitation experiments with an anti-Flag antibody. Inputs (left) and immunoprecipitates (right) were immunoblotted and probed with an anti-Flag (top) or an anti-WDR82 (bottom) antibody as indicated. The position of molecular weight markers (kDa) is shown on the right. Uncropped images are available online as Source Data. d, The Flag-tagged ZC3H4 C-terminal fragment (804–1303) was expressed in HeLa cells. Lysates were immunoprecipitated with an anti-ZC3H4 antibody directed against amino acids 677–765 and blotted with anti-Flag or anti-WDR82 antibody. Inputs are shown on the left and molecular weight markers (kDa) on the right. Uncropped images are available online as Source Data.

Extended Data Fig. 3 Effects of WDR82 and ZC3H4 co-depletion on extragenic transcription in HeLa cells.

a, The effects of WDR82, ZC3H4 or their combined depletion by siRNA transfection were measured on selected extragenic transcripts, as indicated. In co-depletion experiments, twice the amount of siRNA was used, as indicated. The bar plots show the mean ± SD of n=4 biological replicates. The data were normalized to the housekeeping gene CDC25b. Light grey columns: 30 pmol siRNA; dark grey columns: 60 pmol siRNA. b, Depletion efficiency of WDR82 (left) and ZC3H4 mRNA (right) in individual and combined depletions. The bar plots show the mean ± SD of n=4 independent experiments.

Extended Data Fig. 4 Distribution of WDR82, ZC3H4 and RNA Pol II ChIP-seq peaks.

a, Classification of WDR82 and ZC3H4 ChIP-seq peaks based on their genomic location. TSS: Transcription Start Site; TES: Transcription End Site. Data are from n=2 independent experiments. b, Transcribed protein-coding genes (n=10,917) were divided into quartiles of increasing RNA Pol II occupancy. The heatmaps show WDR82, ZC3H4 and RNA Pol II ChIP-seq signals at genes of the 1st and 4th quartiles.
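The quartile split in panel b (and the decile splits used in later Extended Data figures) amounts to ranking genes by a score and cutting the ranked list into equal-sized groups. The sketch below shows one way to do that in Python; the function name, the input dictionary and the handling of ties and remainders are assumptions for illustration, not the procedure described by the authors.

```python
def split_into_quantile_groups(scores, n_groups=4):
    """Rank genes by score and split them into n_groups of near-equal size.

    scores: dict mapping gene name -> occupancy (or expression) value.
    Returns a list of lists of gene names, lowest-scoring group first.
    Ties are broken by the sort order, which is an arbitrary choice here.
    """
    ranked = sorted(scores, key=scores.get)
    groups = []
    for g in range(n_groups):
        start = g * len(ranked) // n_groups
        end = (g + 1) * len(ranked) // n_groups
        groups.append(ranked[start:end])
    return groups


if __name__ == "__main__":
    # Hypothetical Pol II occupancy values for eight genes.
    occupancy = {f"gene{i}": float(i) for i in range(1, 9)}
    quartiles = split_into_quantile_groups(occupancy, n_groups=4)
    print("1st quartile:", quartiles[0])
    print("4th quartile:", quartiles[-1])
```

Calling the same function with `n_groups=10` gives deciles.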

Extended Data Fig. 5 Analysis of spliced and unspliced lncRNAs suppressed by WDR82.

Extragenic transcripts upregulated upon WDR82 depletion in mouse macrophages (n=2,870; top) or in HeLa cells (n=1,509; bottom) were first divided into spliced (left) and unspliced RNA species (right). Then within each of these two groups they were further divided based on their overlap with lncRNAs in the NONCODE v5 database of non-coding RNAs, classified into single exon and multi-exonic ncRNAs.

Extended Data Fig. 6 Analysis of splice efficiency and splice site sequences of lncRNAs suppressed by ZC3H4-WDR82 in HeLa cells.

a, Splicing efficiency at WDR82-suppressed lncRNA junctions (n=3,717) (top panel) and at a randomly selected set of pre-mRNA junctions (n=4,000) (bottom panel) in HeLa cells. A window of ±10 nucleotides centered on the 5′ splice sites was used to measure read counts in polyA RNA-seq data. b, log2-transformed ratio of polyA RNA-seq reads in a 20 nt window centered on the 5′ splice sites of WDR82-suppressed lncRNAs or of a randomly selected set of mRNAs with at least one splice junction. c, Analysis of 5′ (left) and 3′ (right) splice site strength (measured as MaxEnt scores) at WDR82-suppressed lncRNAs. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test at both the 5′ (p-value = 1.2e-21) and the 3′ (p-value = 2.6e-06) splice sites. *** = p-value < 0.01. Nucleotide frequencies at splice sites are shown as sequence logos. Donor and acceptor splice sites are indicated as black triangles. d, Effects of the depletion of WDR82-ZC3H4 on transcription of protein-coding genes in HeLa cells. Expressed protein-coding genes (n=8,804) were divided into deciles based on their sensitivity to the depletion of WDR82 in 4sU-seq data, with the 10th decile including the most upregulated genes. Log2-transformed RNA fold changes (polyA and 4sU RNA-seq data) and log2-transformed read ratios across the first exon-intron junction, as annotated in GENCODE, are shown for the 10th, 5th and 1st deciles. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test (p-value = 2.0e-21). *** = p-value < 0.01. Data are from n=3 independent experiments. In the boxplots in panels b, c and d, the median value for each group is shown with a horizontal black line. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median.

Extended Data Fig. 7 Relationship between gene transcript expression and splicing efficiency.

a, Genes were ranked into deciles of decreasing expression based on 4sU-seq data in macrophages (left) and HeLa cells (right). In both panels, expression of the lncRNAs upregulated in WDR82-depleted cells is shown in the red boxes on the right. Data are from n=3 independent experiments. b, Splicing efficiency of the 1st exon of the ranked genes was measured by dividing the sequencing reads in the 10 nt upstream by those in the 10 nt downstream of the 5′ splice junction in polyA RNA-seq data. Left: macrophages (n=6,280 junctions); right: HeLa (n=8,804 junctions). Data are from n=3 independent experiments. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median.
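The splicing-efficiency measure in panel b, reads in the 10 nt upstream of the 5′ splice junction divided by reads in the 10 nt downstream, can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: the per-base coverage input, the function name, the log2 transform and the pseudocount are all assumptions.

```python
import math


def first_exon_splicing_ratio(coverage, splice_site, window=10, pseudocount=1.0):
    """log2 ratio of exonic to intronic coverage around a 5' splice site.

    coverage: per-base read depth along the transcribed strand (list of numbers).
    splice_site: 0-based index of the first intronic base.
    Higher values mean fewer reads continue into the intron, which under
    this reading corresponds to more efficient splicing.
    """
    if splice_site < window or splice_site + window > len(coverage):
        raise ValueError("window extends beyond the coverage array")
    upstream = sum(coverage[splice_site - window:splice_site]) / window
    downstream = sum(coverage[splice_site:splice_site + window]) / window
    return math.log2((upstream + pseudocount) / (downstream + pseudocount))


if __name__ == "__main__":
    # Hypothetical coverage: 10 exonic bases at depth 30, 10 intronic at depth 2.
    coverage = [30] * 10 + [2] * 10
    print(first_exon_splicing_ratio(coverage, splice_site=10))
```

The pseudocount keeps the ratio defined when no reads cross into the intron; with real data its value should be chosen relative to sequencing depth.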

Extended Data Fig. 8 Exonic splice enhancer (ESE) sequences in exons of WDR82-suppressed lncRNAs and in mRNAs.

a, Number of ESEs per exon in lncRNAs and in mRNAs suppressed by WDR82 in macrophages. Data are from n=3 independent experiments. b, Distance between ESEs and 5′ splice sites in lncRNAs and in mRNAs suppressed by WDR82. Data are from n=3 independent experiments. c, Number of ESEs per exon recognized by individual SRSF proteins in lncRNAs and in mRNAs suppressed by WDR82.
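Counting ESEs per exon and measuring their distance from the 5′ splice site can be pictured as a hexamer scan over the exon sequence. The sketch below is only an illustration under stated assumptions: the hexamer set is a placeholder rather than the ESE catalogue used in the study, overlapping matches are each counted, and distance is defined here as the number of bases between the end of the motif and the 3′ end of the exon (where the 5′ splice site lies).

```python
def find_ese_hits(exon_seq, ese_hexamers):
    """Scan an exon for ESE hexamers.

    exon_seq: exon sequence, 5' to 3', ending at the 5' splice site.
    ese_hexamers: set of 6-nt motifs (placeholder values in the example).
    Returns a list of (start_index, distance_to_5prime_splice_site) tuples.
    """
    exon_seq = exon_seq.upper()
    k = 6
    hits = []
    for start in range(len(exon_seq) - k + 1):
        if exon_seq[start:start + k] in ese_hexamers:
            distance = len(exon_seq) - (start + k)
            hits.append((start, distance))
    return hits


if __name__ == "__main__":
    placeholder_eses = {"GAAGAA"}  # hypothetical motif set for illustration
    hits = find_ese_hits("ttGAAGAAtt", placeholder_eses)
    print(len(hits), "ESE hit(s):", hits)
```

The number of ESEs per exon is then `len(hits)`, and the second element of each tuple gives the distances summarized in panel b under the definition assumed above.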

Extended Data Fig. 9 Characterization of splicing efficiency and splice site quality of extragenic transcripts not affected by WDR82 depletion.

a, log2-transformed ratio of polyA RNA-seq reads upstream and downstream of the 5′ splice sites of WDR82-suppressed and WDR82-insensitive lncRNAs in HeLa cells. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test (p-value = 1.8e-175 for the controls and p-value = 1.3e-199 in WDR82-depleted cells). *** p-value < 0.01. Data are from n=3 independent experiments. b, Analysis of 5′ (left) and 3′ (right) splice site strength at WDR82-suppressed and WDR82-insensitive lncRNAs in HeLa cells. MaxEnt scores for both donor and acceptor splice sites were measured. Statistical significance was assessed using the two-tailed Wilcoxon rank sum test at both the 5′ (p-value = 1.3e-20) and the 3′ (p-value = 3e-04) splice sites. *** p-value < 0.01. Data are from n=3 independent experiments. Boxes show values between the first and the third quartile. The lower and upper whiskers show the smallest and the highest value, respectively. Outliers are not shown. The notches correspond to the ~95% confidence interval for the median.

Extended Data Fig. 10 First exon deletions in protein coding genes.

a, Schematic representation of the deletion of the first exons of protein-coding genes. sgRNAs were designed to remove a genomic sequence that included the first exon from 30–50 nt downstream of the TSS to the intronic sequences just downstream of the 5′ splice site. b, Expression of the indicated gene mRNAs was measured by qRT-PCR in bulk populations of wild-type or first exon-deleted HeLa cells after transfection of the indicated siRNAs. Primers used were specific for spliced mRNAs and were designed on downstream exons (Methods). The plot shows the mean ± s.d. of n=3 independent experiments. * P < 0.05; ** P < 0.01, by two-tailed t-test. The data were normalized to the housekeeping gene NRSN2. P-values for COG2: WT vs. DEL siCtl = 1.75E-07; DEL siCtl vs. siWdr82 = 0.78 (n.s.); DEL siCtl vs. siWdr82 = 0.80 (n.s.). P-values for FAM174a: WT vs. DEL siCtl = 0.0005; DEL siCtl vs. siWdr82 = 0.85 (n.s.); DEL siCtl vs. siWdr82 = 0.15 (n.s.). P-values for RRP15: WT vs. DEL siCtl = 0.013; DEL siCtl vs. siWdr82 = 0.70 (n.s.); DEL siCtl vs. siWdr82 = 0.65 (n.s.). c, First exon deletion efficiency at the three genes tested was analyzed by genomic PCR. The quantification of the wild-type allele gel band in wild-type cells and in cells in which the first exon was deleted using sgRNAs+Cas9 is shown on the right. Uncropped images are available online as Source Data.

Supplementary information

Supplementary Information

Supplementary Figs. 1–5.

Reporting Summary

Peer Review Information

Supplementary Table 1

Extragenic transcription changes in primary mouse bone-marrow-derived macrophages depleted of WDR82 or other transcription termination factors. 4sU–seq datasets were generated in mouse bone-marrow-derived macrophages depleted of the indicated factors by retroviral delivery of shRNAs. Two different shRNAs per target were used as indicated. Data for each depleted factor are shown in different sheets. Data are from n = 4 independent experiments for WDR82 and n = 2 independent experiments for the other factors. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.
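The Benjamini–Hochberg correction named in this and the following table descriptions converts raw p-values into FDR-adjusted values. A minimal pure-Python sketch of the standard procedure is shown below; the function name and example p-values are illustrative, and the tables themselves may have been produced with a different implementation.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by n / rank (rank 1 = smallest p-value), and a
    running minimum from the largest rank downward keeps the adjusted
    values monotonic and capped at 1.
    """
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f}  FDR = {q:.3f}")
```

Regions whose adjusted value falls below a chosen threshold are then called significant; the threshold itself is not stated in these table descriptions.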

Supplementary Table 2

Deregulation of extragenic transcription in primary mouse bone-marrow-derived macrophages depleted of ZC3H4. 4sU–seq datasets were generated in macrophages transduced with retroviral vectors expressing ZC3H4-targeting shRNAs. Data are from n = 3 independent experiments. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.

Supplementary Table 3

Extragenic transcription changes in HeLa cells upon disruption of the WDR82–ZC3H4 complex. Tabs 1 and 2: HeLa cells were transfected with siRNAs targeting WDR82 or ZC3H4, and 4sU–seq datasets were generated. Data are from n = 3 independent experiments. Tab 3: HeLa cells were transfected with an expression vector encoding ZC3H4(804–1303) and, 48 h later, 4sU–seq datasets were generated. Data are from n = 3 independent experiments. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.

Supplementary Table 4

WDR82, ZC3H4 and RNA Pol II ChIP–seq datasets from mouse macrophages. False discovery rate (FDR) was calculated based on the Benjamini–Hochberg correction.

Supplementary Table 5

Overlap of annotated lncRNAs in the NONCODE v.5 database with extragenic transcripts upregulated in mouse macrophages upon WDR82 or ZC3H4 depletion. Extragenic regions showing increased transcription in 4sU–seq datasets from macrophages depleted of WDR82 or ZC3H4 were overlapped with annotated lncRNAs from the NONCODE v.5 database. The identity of the annotated lncRNA corresponding to each region upregulated in WDR82- or ZC3H4-depleted cells is shown.

Supplementary Table 6

Noncoding RNA junctions overlapping with extragenic transcripts upregulated in macrophages upon depletion of WDR82. polyA-RNA-seq datasets were used to identify splice junctions in the extragenic regions detected as upregulated in 4sU–seq data from WDR82-depleted macrophages. Exonic splice enhancers are also indicated. Data are from n = 3 independent experiments.

Supplementary Table 7

Overlap of annotated lncRNAs in the NONCODE v.5 database with extragenic transcripts upregulated in HeLa cells upon WDR82 or ZC3H4 depletion. Extragenic regions showing increased transcription in 4sU–seq datasets from HeLa cells depleted of WDR82 or ZC3H4 were overlapped with annotated lncRNAs from the NONCODE v.5 database. The identity of the annotated lncRNA corresponding to each region upregulated in WDR82- or ZC3H4-depleted cells is shown.

Supplementary Table 8

Noncoding RNA junctions overlapping with extragenic transcripts upregulated in HeLa cells upon depletion of WDR82. PolyA-RNA-seq data were used to identify splice junctions in the extragenic regions detected as upregulated in 4sU–seq data obtained in WDR82-depleted HeLa cells.

Supplementary Table 9

Sequences of the primers used in this study.

Supplementary Table 10

List of the datasets generated in this study.

Source data

Source Data Extended Data Fig. 2

Uncropped western blots.

Source Data Extended Data Fig. 10

Uncropped agarose gels.

About this article

Cite this article

Austenaa, L.M.I., Piccolo, V., Russo, M. et al. A first exon termination checkpoint preferentially suppresses extragenic transcription. Nat Struct Mol Biol 28, 337–346 (2021). https://doi.org/10.1038/s41594-021-00572-y

  • DOI: https://doi.org/10.1038/s41594-021-00572-y
