N-terminal tagging of RNA Polymerase II shapes transcriptomes more than C-terminal alterations

Summary

RNA polymerase II (Pol II) has an unstructured C-terminal domain (CTD) that consists of a large number of heptad repeats and whose precise function remains unclear. Here, we investigate how altering the CTD's length and fusing it with protein tags affect transcriptional output on a genome-wide scale in mammalian cells at single-cell resolution. Transcription generally appears to occur in bursts, with RNA predominantly made during short periods of activity that are interspersed with periods of transcriptional silence. The CTD's role in shaping these dynamics seems gene-dependent, and global patterns of bursting appear mostly robust to CTD alterations. Introducing protein tags with defined structures at the N terminus, however, causes transcriptome-wide effects. We find that the type of tag dominates the characteristics of the resulting transcriptomes. This is possibly due to Pol II-interacting factors, including non-coding RNAs, whose expression correlates with the tags. Proteins involved in liquid-liquid phase separation appear prominently.

in grey on the left of each sequence, and the repeat in the wildtype human CTD that each repeat originates from is listed in black on the right.
S13).
quantities to the normalized expression levels according to scRNA-seq. A, all QC-passed cells, corresponding to the results shown in Fig. 1A. B, as A, for cells that have non-zero expression of Dendra2 or HaloTag, corresponding to the results shown in Fig. 1B.
Table S3, related to Figure 4 - Wilcoxon rank-sum test results comparing mean expression between samples. Paired and unpaired Wilcoxon rank-sum tests were performed on the fifteen pairwise combinations of the six samples. W and V are the test statistics, and the estimator is the median difference between the mean expression of a gene from each sample in the pair. Extremely small P values are recorded as 0.
Table S4, related to Figure 4 - as Table S3, for burst frequencies.
Table S5, related to Figure 4 - as Table S3, for CV².
Table S7, related to Figure 4 - as Table S3, for burst size.
Table S8, related to Figure 4 - as Table S3, for mean expression of cells with non-zero expression of either tag.
Table S9, related to Figure 4 - as Table S8, for burst frequency.
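
The pairwise comparisons described for Table S3 can be illustrated with a minimal sketch. It is not the analysis code used for the tables: the function name `pairwise_wilcoxon`, the input layout (one array of per-gene mean expression per sample, in a shared gene order) and the use of SciPy are assumptions made for illustration. SciPy's paired statistic is also defined differently from the V reported by some other packages, and the median of the paired differences is used here as a simple stand-in for the estimator.

```python
from itertools import combinations

import numpy as np
from scipy.stats import mannwhitneyu, wilcoxon


def pairwise_wilcoxon(mean_expr):
    """Compare every pair of samples with paired and unpaired Wilcoxon tests.

    mean_expr: dict mapping a sample name (hypothetical labels) to a 1D array
    of per-gene mean expression, with genes in the same order for all samples.
    """
    results = []
    # Six samples give 6 * 5 / 2 = 15 unordered pairs.
    for a, b in combinations(sorted(mean_expr), 2):
        x = np.asarray(mean_expr[a], dtype=float)
        y = np.asarray(mean_expr[b], dtype=float)
        paired = wilcoxon(x, y)  # signed-rank test on the per-gene differences
        unpaired = mannwhitneyu(x, y, alternative="two-sided")  # rank-sum test
        results.append({
            "pair": (a, b),
            "stat_paired": paired.statistic,
            "p_paired": paired.pvalue,
            "stat_unpaired": unpaired.statistic,
            "p_unpaired": unpaired.pvalue,
            # Assumption: plain median of paired differences as the estimator.
            "median_difference": float(np.median(x - y)),
        })
    return results
```

Applied to a dictionary with six entries, the function returns fifteen result records, one per pair of samples. The same routine could be applied to burst frequency, burst size or CV² values in place of mean expression.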

Figure S3, related to Figure 1 - Read coverages at the exo- and endogenous versions of the POLR2A reference sequences (corresponding to RPB1). Shown for all cell lines other than D25 (shown in Fig. 1A). RPM, reads per million total reads. Coverage is computed in 25 bp bins.
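
As an illustration of the normalization named in this caption, the following minimal sketch computes binned coverage in RPM. It assumes reads are given as 0-based, half-open (start, end) intervals on a single reference sequence and that bin values are the mean per-base coverage. The function name `binned_rpm_coverage` and these conventions are hypothetical and may differ from the pipeline used to produce the figure.

```python
import numpy as np


def binned_rpm_coverage(reads, region_length, total_reads, bin_size=25):
    """Mean per-base coverage in fixed-size bins, scaled to reads per million.

    reads: iterable of (start, end) intervals, 0-based and half-open (assumed).
    total_reads: total number of reads in the library, used for RPM scaling.
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = np.zeros(region_length + 1)
    for start, end in reads:
        s = min(max(start, 0), region_length)
        e = min(max(end, 0), region_length)
        diff[s] += 1
        diff[e] -= 1
    per_base = np.cumsum(diff[:-1])

    # Pad to a whole number of bins; a partial last bin is averaged with zeros.
    n_bins = -(-region_length // bin_size)
    padded = np.zeros(n_bins * bin_size)
    padded[:region_length] = per_base
    binned = padded.reshape(n_bins, bin_size).mean(axis=1)

    return binned * 1e6 / total_reads
```

Scaling by the total read count is what makes coverage tracks comparable between cell lines sequenced to different depths.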

Figure S12, related to Figure 5 - Elbow plot showing the variation each PC contributes to the dataset.
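
The quantity displayed in an elbow plot can be sketched as follows. This is a generic illustration using a singular value decomposition of a centered cells-by-genes matrix. The function name `pc_variance_explained` and the choice to report the fraction of variance, rather than, for example, the standard deviation per PC, are assumptions and not necessarily what was plotted in the figure.

```python
import numpy as np


def pc_variance_explained(X, n_pcs=20):
    """Fraction of total variance captured by each of the first n_pcs PCs.

    X: 2D array with cells in rows and genes (features) in columns (assumed).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # Singular values come back in descending order.
    s = np.linalg.svd(Xc, compute_uv=False)
    variances = s ** 2 / (X.shape[0] - 1)
    return (variances / variances.sum())[:n_pcs]
```

Plotting the returned values against the PC index gives the elbow plot. PCs before the point where the curve flattens are the usual candidates for downstream analysis.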

Figure S15, related to Figure 5 - Top genes of significant PCs. Highlighted genes are cell cycle phase markers according to 1 (Table S13).

Table S2, related to Figure 1 - Statistics for cell line comparisons regarding expression of Dendra2, HaloTag, GAPDH and POLR2A. 'count' refers to the numbers of cells, the other

Table S12, related to Figure 4 - as Table S8, for burst size.