Pervasive isoform‐specific translational regulation via alternative transcription start sites in mammals

Abstract

Transcription initiated at alternative sites can produce mRNA isoforms with different 5ʹUTRs, which are potentially subject to differential translational regulation. However, the prevalence of such isoform‐specific translational control across mammalian genomes is currently unknown. By combining polysome profiling with high‐throughput mRNA 5ʹ end sequencing, we directly measured the translational status of mRNA isoforms with distinct start sites. Among 9,951 genes expressed in mouse fibroblasts, we identified 4,153 with significant initiation at multiple sites, of which 745 exhibited significant isoform‐divergent translation. Systematic analyses of the isoform‐specific translation revealed that isoforms with longer 5ʹUTRs tended to be translated less efficiently. Further investigation of cis‐elements within 5ʹUTRs not only provided novel insights into regulation by known sequence features, but also led to the discovery of novel regulatory sequence motifs. Quantitative models integrating all these features explained over half of the variance in the observed isoform‐divergent translation. Overall, our study demonstrated extensive translational regulation through the use of alternative transcription start sites and offered a comprehensive understanding of translational regulation by diverse sequence features embedded in 5ʹUTRs.

Figure EV2. Polysome profiling measured mRNA translational efficiency.
A For each TSS isoform, the number of ribosomes per mRNA was plotted against its corresponding ORF length.
B For each TSS isoform, the abundance ratio between the monosome fraction and the sum of the polysome fractions was plotted against its corresponding ORF length. Short ORFs (≤ 450 nt) were more enriched in the monosome fraction.
C TE values for each gene calculated based on published ribosome footprinting data were plotted against the TE values calculated based on polysome profiling data in this study.
D TE values for each gene calculated based on published proteomics/genomics data were plotted against the TE values calculated based on polysome profiling data in this study.
E Log2-transformed TE fold change values for each pair of alternative TSS isoforms calculated based on data from all seven fractions were compared to those calculated based on data with one of the seven fractions left out.

Figure EV3. GO enrichment analyses and the relationship between transcription and translation.
A GO enrichment for single/multi-TSS genes over all expressed genes.
B Boxplots showing the distribution of TE divergence between alternative TSS isoforms grouped by their abundance differences. Box edges represent quantiles, whiskers represent extreme data points.
C Boxplots showing the distribution of TE at the gene level grouped by mRNA abundance. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.

Figure EV4. Quantitative validation of TE divergence between TSS isoforms.
The ratio of relative isoform abundance between fractions was calculated according to the formula $\frac{T_{\mathrm{dist,poly}} / T_{\mathrm{prox,poly}}}{T_{\mathrm{dist,nonribo}} / T_{\mathrm{prox,nonribo}}}$, where $T_{\mathrm{dist,poly}}$ and $T_{\mathrm{prox,poly}}$ represented the isoform abundance of distal and proximal TSSs in the polysomal fraction, respectively; $T_{\mathrm{dist,nonribo}}$ and $T_{\mathrm{prox,nonribo}}$ represented the isoform abundance in the non-ribosomal fraction. The ratio determined based on agarose gel image (y-axis) was plotted against that estimated based on 5ʹ end sequencing (x-axis).

The description of the two genes can be found in Table EV3.

C, D Same as Fig 5A and B, but based on EFE to define stable RNA structures. ***P < 0.001; Mann-Whitney U-test. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.
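
As an illustration of the isoform abundance ratio defined in the Figure EV4 legend, the following minimal Python sketch computes the distal-to-proximal ratio in the polysomal fraction divided by the same ratio in the non-ribosomal fraction. The function name, the input checks, the log2 transformation, and the abundance values are assumptions made for this example only; they are not taken from the study's analysis code or data.

```python
import math


def isoform_ratio_between_fractions(t_dist_poly, t_prox_poly,
                                    t_dist_nonribo, t_prox_nonribo):
    """Return (T_dist,poly / T_prox,poly) / (T_dist,nonribo / T_prox,nonribo).

    All four inputs are isoform abundances and must be strictly positive;
    this check is an assumption of the sketch, not a documented step.
    """
    values = {
        "t_dist_poly": t_dist_poly,
        "t_prox_poly": t_prox_poly,
        "t_dist_nonribo": t_dist_nonribo,
        "t_prox_nonribo": t_prox_nonribo,
    }
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"{name} must be positive, got {value}")
    poly_ratio = t_dist_poly / t_prox_poly
    nonribo_ratio = t_dist_nonribo / t_prox_nonribo
    return poly_ratio / nonribo_ratio


# Hypothetical abundances (arbitrary units), chosen only for illustration.
ratio = isoform_ratio_between_fractions(
    t_dist_poly=120.0, t_prox_poly=60.0,
    t_dist_nonribo=50.0, t_prox_nonribo=100.0,
)
log2_ratio = math.log2(ratio)  # log2 scale is an assumed presentation choice
print(ratio, log2_ratio)
```

Under this definition, a ratio above 1 means the distal-TSS isoform is relatively more abundant in the polysomal fraction than in the non-ribosomal fraction, compared with the proximal-TSS isoform.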

Molecular Systems Biology
Translational control via alternative TSSs Xi Wang et al

Figure EV7. Performance of the combinatory nonlinear regression model.
Same as Fig 6B, but in addition, we marked the six genes that were tested by luciferase reporter assay (Fig 2C) and that contained unambiguously determined 5ʹUTR sequences (see Materials and Methods). The TE divergence values estimated based on 5ʹ end sequencing data are shown in cyan, and those based on reporter assay are shown in yellow.

Figure EV8. Sequence features associated with translational regulation conferred significant impact in genes with single or multiple 3ʹ ends.
All the multi-TSS genes were separated into two groups based on whether only one or more than one 3ʹ end was identified in the study from Spies et al (2013). Similar to boxplots in Figs 4 and 5, but all comparisons were performed separately for the two groups (columns): one group contained the genes with only one 3ʹ end and the other group contained the genes with more than one 3ʹ end. Sequence features including uORF, out-of-frame uAUG, 5ʹ cap RNA structure, and 5ʹ TOP sequence (rows) were analyzed. Box edges represent quantiles, whiskers represent extreme data points no more than 1.5 times the interquartile range.

Table EV5.
A Downstream TSSs could lead to N-terminally truncated proteins. Two examples are shown here. The description of the two genes can be found in Table EV3.
B Alternative TSSs could also lead to N-terminally extended proteins. One example is shown here. The description of this gene can be found in Table EV3.