Abstract
Approximately half of human genes generate mRNA isoforms that differ in their 3′UTRs while encoding the same protein. 3′UTR and mRNA length are determined by 3′ end cleavage sites (CS). Here, we mapped and categorized mRNA 3′ end CS in more than 200 primary human and mouse cell types, resulting in a 40% increase in CS annotations relative to the GENCODE database. We incorporated these annotations into a novel computational pipeline, called scUTRquant, for rapid, precise, and accurate quantification of gene and 3′UTR isoform expression from single-cell RNA sequencing (scRNA-seq) data. Applying scUTRquant to data from 474 cell types and 2,134 perturbations, we discovered extensive 3′UTR length changes across cell types that are as widespread and dynamically regulated as gene expression changes. Our data indicate that mRNA abundance and mRNA length are two independent axes of gene regulation that together determine the amount and spatial organization of protein synthesis.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
The annotation pipeline was reworked to identify cleavage sites at the cell-type level rather than in aggregate. Annotations were also added for human. Several quality checks were added for the novel cleavage sites. All previous analyses were rerun with the updated annotations. New analyses of Tabula Sapiens and a Perturb-seq dataset were added, including extensive characterization of perturbations that impact UTR isoforms. LUI statistics were replaced with WUI statistics. Sibylle Mitschka, who worked on the Perturb-seq analysis, was added as an author; Gang Zhen was removed because his contribution to the previous version is no longer included.