Abstract
The role of transposable elements (TEs) in regulating diverse biological processes, from early development to cancer, is becoming increasingly appreciated. However, unlike for other biological processes, next-generation single-cell sequencing technologies are ill-suited to assaying TE expression: the highly repetitive nature of TEs means that short cDNA reads cannot be unambiguously mapped to a specific locus. Consequently, it is extremely challenging to understand the mechanisms by which TE expression is regulated and how TEs might themselves regulate protein-coding genes. To resolve this, we introduce CELLO-seq, a novel method and computational framework for performing long-read RNA sequencing at single-cell resolution. CELLO-seq allows for full-length RNA sequencing and enables measurement of allelic, isoform and TE expression at unique loci. We use CELLO-seq to assess the widespread expression of TEs in 2-cell mouse blastomeres as well as human induced pluripotent stem cells (hiPSCs). Across both species, old and young TEs showed evidence of locus-specific expression, with simulations demonstrating that only a small number of very young elements in the mouse could not be mapped with high confidence. Exploring the relationship between the expression of individual elements and putative regulators revealed surprising heterogeneity, with TEs within a class showing different patterns of correlation, suggesting distinct regulatory mechanisms.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Abbreviations
- CELLO-seq: CELl LOng read RNA sequencing
- PacBio: Pacific Biosciences
- MaLRs: Mammalian apparent LTR-retrotransposons
- SNP: Single nucleotide polymorphism
- sc: Single cell
- TE: Transposable element
- ERV: Endogenous retrovirus
- LINE: Long interspersed element
- SINE: Short interspersed element
- RNAseq: RNA sequencing
- hiPSCs: Human induced pluripotent stem cells
- RT: Reverse transcription
- LTRs: Long terminal repeat elements
- NGS: Next-generation sequencing
- ONT: Oxford Nanopore Technologies
- UMIs: Unique molecular identifiers
- TSO: Template switch oligo
- ZNFs: Zinc finger proteins
- nt: Nucleotides
- bp: Base pairs
- TSS: Transcription start site
- TES: Transcription end site
- ERCC: External RNA Controls Consortium