aCPSF1 in synergy with terminator U-tract dictates archaeal transcription termination efficiency via the KH domains recognizing U-tract

Recently, aCPSF1 was reported to function as the long-sought global transcription termination factor of archaea, but its working mechanism remains elusive. This work, through analyzing transcript-3′end-sequencing data of Methanococcus maripaludis, found positive correlations of both the terminator uridine(U)-tract and aCPSF1 with hierarchical transcription termination efficiencies (TTEs) at the genome-wide level. In vitro assays determined that aCPSF1 specifically binds to the terminator U-tract, with binding ability related to the number of U-tracts, and in vivo assays demonstrated that the two are indispensable in dictating high TTEs, revealing that aCPSF1 and the terminator U-tract in synergy determine high TTEs. The N-terminal KH domains confer on aCPSF1 specific binding to the terminator U-tract and the in vivo aCPSF1-terminator U-tract synergism; aCPSF1’s nuclease activity was also required for TTEs. aCPSF1 also functioned as a back-up termination mechanism for transcripts with weak intrinsic terminator signals. aCPSF1 orthologs from Lokiarchaeota and Thaumarchaeota exhibited similar U-tract synergy in dictating TTEs. Therefore, aCPSF1 and the intrinsic U-rich terminator could work in a noteworthy two-in-one termination mode in Archaea, which could be widely employed by archaeal phyla; using one factor to recognize the U-rich terminator signal and cleave the transcript 3′-end, the archaeal aCPSF1-dependent transcription termination could display a simplified archetypal mode of the eukaryotic RNA polymerase II termination machinery.


Transcription termination is an essential and highly regulated process in all forms of life, which not only determines the accurate 3′-end boundary of a transcript and transcription-related regulatory events, but is also important in shaping programmed [...]

[...] Methods) were analyzed. In total, 2357 TTSs, including the 998 previously identified primary and 1359 newly identified secondary TTSs, were obtained (Datasheet S1). Multiple consecutive TTSs were found in more than 50% of transcription units (TUs) of M. maripaludis (Figure 1-figure supplement 1), which could produce multiple isoforms of a transcript with varying 3′-UTRs, as found in M. mazei and S. acidocaldarius (Dar et al., 2016a). Nevertheless, compared with the primary TTSs, which have the highest reads among the Term-seq captured transcript 3′-ends of each TU, much lower median read abundances, TTEs and motif scores were identified for the secondary TTSs (Figure 1-figure supplement 2). This indicates that TUs are mainly terminated at the primary TTSs, which were therefore used for further investigation.

[...] in a step-wise manner at the four nucleotides, which was therefore defined as the TTS quadruplet. Through pairwise comparison of the reads of each nt in a TTS quadruplet, the maximal decrease was found between sites −2 nt (upstream) and +2 nt (downstream) relative to the TTS in the majority of TTSs (Figure 1B) [...] Figure 1C). Sequence motifs generated from −30 nt to +5 nt flanking TTSs using Weblogo showed characteristic U-rich tracts, each containing four consecutive uridine nucleotides (U4), preceding TTSs among the overrepresented TUs in all the three groups (Figure 1C). Noticeably, a positive correlation appeared between TTE and the terminator U-tract length: two or more U4-tracts were found overrepresented in the high TTE group, while the U4-tract was underrepresented in [...]
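The U4-tract feature described above (four consecutive uridines preceding a TTS) can be illustrated with a minimal sketch. The function name `count_u4_tracts`, the choice to count non-overlapping runs, and the conversion of T to U for DNA-style input are assumptions made for illustration; the exact counting rule used in the analysis is not stated in this section.

```python
import re


def count_u4_tracts(upstream_seq: str) -> int:
    """Count non-overlapping runs of four consecutive uridines (U4).

    Hypothetical helper: the counting rule (non-overlapping, left to
    right) is an assumption, not the authors' stated procedure.
    """
    # Treat T as U so that DNA-style sequences can be passed as well
    # (a convenience assumption).
    rna = upstream_seq.upper().replace("T", "U")
    # re.findall returns non-overlapping matches scanned left to right.
    return len(re.findall(r"U{4}", rna))


# Example upstream window (an invented sequence, not taken from the data).
example = "AUCGAUUUUAGCAUUUUUCGA"
print(count_u4_tracts(example))
```

Under this rule a run of five uridines counts as one U4-tract and a run of eight counts as two; a different convention (for example, overlapping windows) would give different counts.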
[...] Those TUs having TQRR <1, i.e., in which the read ratio between +2 and −2 nt in the TTS quadruplet is reduced due to aCPSF1 depletion, were identified as aCPSF1-dependent [...] but also found a significant positive correlation between the aCPSF1-dependency and the number of U4-tracts preceding TTSs, i.e., TU groups with higher aCPSF1 dependency have more featured U4-tracts (Figure 2C). Additionally, we evaluated the relationship between the aCPSF1-dependency and the above four U4-tract TU groups analyzed in Figure 1D. Interestingly, we found that 94.5% (736/779) of TUs in the ≥1 [...]

[...] containing RNAs, but not in that without U-tract at the same aCPSF1 contents (Figure 3-figure supplement 1). Next, 12 additional RNA fragments in a consensus 36 nt [...]

[...] The attempt to use fluorescence to indicate transcript levels failed due to intensity fluctuations during the assay. Using quantitative reverse-transcription PCR (RT-qPCR), transcript abundances of the two reporter genes were assayed, which determined that TTEs were highest for the terminators with two U-tracts (T1149 and T0204), lower for those with one U4-tract (T0911 and T0229), and lowest for that without a U-tract (T1710) in the wild-type strain S2 (Figure 5B). Therefore, the reporter [...], which is very similar to that for TQRR calculation.
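The TQRR <1 criterion for aCPSF1 dependency mentioned above can be sketched as follows. This section does not give the TQRR formula, so the sketch assumes that TQRR is the TTS-quadruplet read ratio in the aCPSF1-depleted strain divided by the same ratio in the wild type, which is one reading consistent with "TQRR <1 means the read ratio is reduced due to aCPSF1 depletion". The function names are hypothetical, and the orientation of the per-strain read ratio (which of the +2 and −2 positions is the numerator) is left to the caller.

```python
def tqrr(ratio_depleted: float, ratio_wild_type: float) -> float:
    """Ratio of the quadruplet read ratio in the depleted strain to that
    in the wild type (assumed definition, not quoted from the source)."""
    if ratio_wild_type <= 0:
        raise ValueError("wild-type read ratio must be positive")
    return ratio_depleted / ratio_wild_type


def is_acpsf1_dependent(tqrr_value: float) -> bool:
    """A TU is called aCPSF1-dependent when TQRR < 1, as stated in the text."""
    return tqrr_value < 1


# Invented read ratios for one TU, for illustration only.
value = tqrr(ratio_depleted=2.0, ratio_wild_type=8.0)
print(value, is_acpsf1_dependent(value))
```

Passing precomputed per-strain ratios keeps the sketch independent of how the +2 and −2 read counts are normalized, which this section does not specify.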

TA ratios of 24%, 24%, 56%, 61%, and 99% were determined for the terminators [...]

[...] TTSs of TUs (Figures 1 and 2) and 76.6% (736/961) of TUs were determined to have both U4-tract terminators and aCPSF1 dependency (Figure 1-figure supplement 4A), and also revealed that terminators containing more U4-tracts have higher aCPSF1 dependency (Figure 2C and 2D). Accordingly, the in vitro biochemical assays clearly demonstrated that aCPSF1 binds more strongly to the terminators containing more U-tracts and that two U4-tracts could be the minimum specific sequence for efficient binding of aCPSF1 (Figures 3, 4, Figure 3 [...]

[...] terminator could also cause aRNAP pausing and disturb the TEC, thereby further synergistically contributing to transcription termination. Therefore, the termination factor aCPSF1, through recognizing and specifically binding to the terminator U-tracts and cleaving at the transcript 3′-ends, in synergy with the U-tract intrinsic terminator signal dictates the genome-wide TTEs in Archaea.

The archaeal aCPSF1-dependent termination mechanism involving transcript 3′-end cleavage resembles the eukaryotic 3′-end processing/cleavage-triggered RNAP II termination mode, in which an aCPSF1 [...] Archaea. Therefore, we propose that the aCPSF1-dependent archaeal termination mechanism reported here could be a simplified evolutionary predecessor of the eukaryotic transcription termination machinery.

In conclusion, this work reports that the intrinsic terminator U-tract cis-element

The rEMSA assay was performed as described previously with some modifications [...] Table S3.