RppH can faithfully replace TAP to allow cloning of 5′-triphosphate carrying small RNAs

Graphical abstract


Specifications
Area Biochemistry, Genetics and Molecular Biology More specific subject area: Small RNA biology Method name: RppH treatment of small RNAs prior to sequencing library preparation Name and reference of original method

TAP treatment
Resource availability https://www.neb.com/products/m0356-rna-5-pyrophosphohydrolase-rpph#Product%20Information Method details RNA interference (RNAi), the process whereby small RNAs and their cognate Argonaute proteins regulate gene expression, was initially discovered in the nematode Caenorhabditis elegans [1]. In C. elegans there are several endogenous small RNA species and 27 genomically-encoded Argonautes [2]. Two classes of endogenous small RNAs are transcribed by RNA-dependent RNA Polymerases (RdRPs), the 22G-and 26G-RNAs. The 22G-RNAs exist in large numbers and, consensually, have 22 nucleotides with a 5 0 bias for guanine [2]. RdRP transcription leaves a triphosphate group at the 5 0 end of the small RNA and consequently, mature 22G-RNAs still have this RdRP signature at their 5 0 end [2]. Mature 26G-RNAs do not have a 5 0 -triphosphate group, but it is currently unclear how this is achieved.
The 5 0 triphosphate group decreases clonability of 22G-RNAs in standard small RNA library preparations. A common practice in the field to improve the yield of 22G-RNA reads in deepsequencing experiments is to treat samples, prior to library preparation, with Tobacco Acid Pyrophosphatase (TAP) [3]. TAP hydrolyzes several types of pyrophosphate bonds, including the eukaryotic 5 0 Cap, but it does not hydrolyze polymerized RNA and DNA [4]. TAP can also hydrolyze pyrophosphate groups from small RNAs, for example 22G-RNAs, increasing the likelihood of cloning 22G-RNAs during library preparation. This type of treatment is essential to address aspects of 22G-RNA biology. For example, 22G-RNAs provide transcriptome-wide non-self-and self recognition platforms that silence foreign gene expression or license bona-fide gene expression, respectively [5][6][7][8][9][10][11][12][13]. Also, several studies focused on understanding the dynamics of inheritance and establishment of silencing by these small RNA species [14,15].
Recently, TAP production was discontinued, so we sought a commercially available alternative. We chose E. coli RNA 5 0 Pyrophosphohydrolase (RppH). RppH primes mRNA degradation in E. coli by converting the 5 0 triphosphate into a 5 0 monophosphate [16]. In practical terms, RppH is an enzyme analogous of TAP, allowing for a direct replacement of TAP in the established protocols, and it is available commercially at a comparable price. Here, we treated small RNAs from C. elegans with TAP and RppH and compared their performance in regard to 22G-RNA enrichment.
Worm culture, RNA extraction and enrichment for small RNAs Caenorhabditis elegans populations were grown according to standard conditions [17]. Wild-type N2 worms were grown at 20 C on nematode growth medium (NGM) plates seeded with E. coli OP50. For synchronization, gravid adult worm populations were bleached and eggs were allowed to hatch overnight in M9 buffer, at 20 C. Synchronized L1 larvae were plated the next day on OP50-seeded plates and grown at 20 C for 60-63 h until adulthood. Gravid adults were washed off plate with M9 buffer and lysed in 250 mL of Worm Lysis Buffer (0.2 M NaCl, 0.1 M Tris pH 8.5, 50 mM EDTA, 0.5% SDS) with 150 mg Proteinase K (Sigma-Aldrich #P2308) for at least 30 min at 65 C. 750 mL of Trizol LS (Life Technologies, 10296-028) were added per sample and extraction followed manufacturer's instructions, with the exception that phase-lock tubes were used to facilitate phase-separation. Total RNA extraction was followed by TURBO DNase treatment (Life Technologies #AM1907) according to manufacturer's instructions. Small RNAs were isolated from DNase-treated total RNA using MirVana (Life Technologies #AM1561).

RppH and TAP treatments and library preparation
Pre-treatment A single RNA sample was divided into 6 technical replicates. Three of these technical replicates were treated with Tobacco Acid Phosphatase (TAP, Epicentre) and the other three, with RNA 5 0 Pyrophosphohydrolase (RppH, New England Biolabs, #M0356S). For TAP-treated samples, 1 mg of total RNA was incubated for 2 h at 37 C, with 5 units of TAP enzyme and 10X TAP buffer. For RppH treated samples, 1 mg of RNA was incubated for 1 h at 37 C, with 5 units of RppH enzyme and 10X NEB Buffer 2. To stop the RppH treatment, 500 mM EDTA was added and samples were incubated for 5 min at 65 C.

Library preparation
Small RNAs (16-30 nt) were enriched by performing size selection of the RNA prior to library preparation. RNA samples were run on a 15% TBE-Urea polyacrylamide gel (BioRad), and purified with sodium chloride/isopropanol precipitation. Library preparation was performed using the NEBNext Multiplex Small RNA Library Prep Set for Illumina (New England BioLabs) as recommended by the manufacturer (protocol version v2.0 8/13), but replacing the RNA ligation adapters with adapters with 4 additional random bases at their 3 0 ends to allow identification of PCR clonal artefacts (synthesized by Bioo Scientific). Fourteen PCR cycles were used for library amplification. The PCR-amplified cDNA was purified using AMPure XP beads (Beckman Coulter). Size selection of the small RNA library was done on LabChip XT instrument (Perkin Elmer) using DNA 300 assay kit extracting the 135-170 bp fraction.

Next generation sequencing
The 135-170 bp library fractions of the RppH-and TAP-treated libraries were pooled in equal molar ratio. The resulting 4 nM pool was denatured and diluted to 9 pM with 5% PhiX spike-in and sequenced as single-read on MiSeq (Illumina) for 68 cycles. Libraries for untreated samples, originating from a different biological replicate, were also prepared using the protocol above, but were sequenced as single-read on HiSeq 2500 (Illumina). Untreated samples were originally used in [18].

Bioinformatics analysis
Read processing and mapping The quality of raw sequenced reads was accessed with FastQC, illumina adatpers were then removed with cutadapt (-O 8 -m 26 -M 38), reads with low-quality calls were filtered out with fastq_quality_filter (-q 20 -p 100 -Q 33). Using information from unique molecule identifiers (UMIs) added during library preparation, reads with the same sequence (including UMIs) were collapsed to removed putative PCR duplicates using a custom script. Prior to mapping, UMIs were trimmed (seqtk trimfq) and library quality re-assessed with FastQC. Reads were aligned against the C. elegans genome assembly WBcel235 with bowtie v0.12.8 (-tryhard -best -strata -chunkmbs 256 -p 8 -v 1 -M 1). For the analysis, only reads mapping to a single genomic location were used. Except where indicated, reads used for comparison are those whose sequence is exactly 22 nucleotides long and have a guanine at their 5 0 .

RppH and TAP sample correlation
To determine the similarity between libraries, two approaches were used: (1) the genome was divided into equally sized bins and reads mapping to each bin were counted (DeepTools, multiBamSummary bins); or (2) reads mapping to a set of published 22G-RNA targets (DeepTools, multiBamSummary BED-file). Using either of those counts, the correlation coefficient (spearman and Pearson) between samples was calculated and plotted (DeepTools, plotCorrelation).

Differential 22G-RNA targeting
We used featureCounts (v1.4.6) with options -s 2 -ignoreDup to count the number of 22G-RNAs mapping antisense to the annotated C. elegans genes (Ensembl, WBcel235). Read counts were then used in DESeq2 to determine differential expression between RppH and TAP treatments. Since the goal was to determine the experimental (technical) differences between both treatments, we considered each technical replicate as a biological replicate to perform the analysis.

21U-RNA precursors
To identify putative 21U-RNA precursors we selected reads that mapped to annotated 21U-RNA loci (bedtools intersect -F 0.5). The length and 5 0 nucleotide frequency of these reads was then summarized.

Method validation
To address if RppH treatment enriches for 22G-RNAs as efficiently as TAP, we sequenced small RNAs of wild-type adult worms with prior treatment with TAP or RppH. To properly compare both treatments, we sequenced three technical replicates of TAP treatment and three technical replicates of RppH treatment, all derived from a unique biological sample. We first determined if RppH treatment would affect the quality of the sequenced reads. QC reports generated by FastQC shows that the per base sequence quality of RppH-treated libraries is at least as good as that of TAP-treated libraries (Supplementary Fig. 1). Other key indicators such as the average quality per read and sequence duplication levels are also similar in libraries treated with either method. Importantly, the three RppH-treated technical replicates show that these libraries are consistently of high quality.
The goal of both Rpph-and TAP-treatment is to enrich the library for 22G-RNAs. This class of small RNAs has two features that can be readily used to determine if and how successful the enrichment was: (i) most sequences are 22 nucleotides in length and; (ii) high proportion of sequences with a guanine at their 5 0 end. After adapter and UMI removal we plotted the distribution of read length in the sequenced libraries (Fig. 1A). Reads were normalized to total number of sequenced reads. For comparison purposes, we added samples that were neither treated with TAP nor RppH, to set the baseline for enrichment. Untreated libraries have two abundant populations of reads with lengths of 21 and 23 nucleotides, accounting for 21U-RNAs and miRNAs, respectively, whereas in the RppH-and TAP-treated libraries there is a clear enrichment of reads with 22 nucleotides. Furthermore, in sequences originating from untreated libraries most reads contain a 5 0 Uracil, but approximately 50% of reads from RppH and TAP-treated libraries have a guanine at this position (Fig. 1B). This shows that RppH treatment is as effective as TAP in enriching for reads with 22G-RNAs.
For both treatments, reads were mapped to the C. elegans genome with equal efficiency, and similarly to untreated libraries ( Fig. 2A). Despite achieving a high enrichment of 22G-RNA sequences (Fig. 1), it is possible that the TAP and RppH treatments select for different subpopulations of these   [3,24]. For visualization purposes, the three replicates of each library preparation method were collapsed. small RNAs. To exclude this possibility, we selected, in the RppH-and TAP-treated libraries, only reads that are 22 nucleotides in length and have a guanine at the 5 0 (22G-RNA reads), and tested if these reads map to the same genomic locations. All the follow-up analysis in this manuscript is done using these 22G-RNA reads. The overall 22G-RNA genomic coverage appears to be similar between RppH and TAP-treated libraries (Fig. 2B, Supplementary Fig. 2). To confirm this observation the genome was split in bins of equal lengths, we counted the number of reads mapped to genomic bins, and compared these for all samples (Fig. 2C). The mapped positions of 22G-RNAs from TAP-and RppH-treated libraries are highly correlated (Spearman's correlation >0.86), and as expected are very different from those of the untreated libraries (Spearman's correlation <0.43). Technical replicates of the RppH treatment are also consistently highly correlated, indicating that the RppH treatment is reproducible.
Even though the overall genomic location of RNA recovered from RppH or TAP treatment is very similar (Fig. 2B, Supplementary Fig. 2), there might be subtle differences between the treatments, possibly in genes known to be targeted by 22G-RNAs. To exclude this possibility, we determined if the RppH treatment recovered small 22G-RNAs that map to known 22G-RNA targets. We counted the number of reads mapping to lists of published target genes of certain small RNA pathway factors [3,7,[19][20][21][22] and plotted sample correlations (Fig. 3A). Replicates from TAP-and RppH-treated libraries are highly correlated (Spearman's correlation > = 0.82), whereas untreated samples show very different patterns, (Spearman's correlation < = 0.48 versus TAP or RppH). Furthermore, RppH and TAP libraries are enriched in 22G-RNAs in these regions, compared to untreated libraries ( Supplementary  Fig. 3A). We have also looked in detail into the coverage of known 22G-RNA targets, and confirmed that the level of genic reads resulting from either treatment show the same patterns (Fig. 3B). This strongly suggests that RppH and TAP treatment enriches for 22G-RNAs that target the same genes. To further determine if there is some bias in either treatment, we performed differential gene expression analysis comparing TAP vs RppH and found only 1 gene differentially enriched between TAP and RppH libraries (F38E11.21, a non-coding RNA, FDR < 0.1, Supplementary Fig. 3B). Importantly, F38E11.21 is not a known 22G-RNA target. From this, we conclude that RppH treatment faithfully recapitulates TAP treatment.
Lastly, TAP and RppH phosphatase activity also results in 5 0 cap removal. Since 21U-RNA precursors are 5 0 capped, decapping may affect 21U-RNA cloning during library preparation. The number of 21U-RNAs and potential precursors sequenced in TAP-and RppH-treated samples are not significantly different ( Supplementary Fig. 3C).
The unavailability of TAP from commercial sources eliminated a common tool used in the field of small RNA research, where, amongst other uses, TAP was used to enrich samples for 22G-RNAs in C. elegans. Whilst no consensus alternative for TAP has been found, others have used other alternatives to TAP and RppH [15,22], however, no systematic comparison of these enzymatic treatments was performed. Here, we show that another decapping enzyme, RppH, is a suitable alternative to TAP. Small RNA-seq libraries of samples treated with RppH not only have similar number of 22G-RNA reads, but these also originate from the same genomic locations as that of TAP-treated libraries. While preparing this manuscript, we sequenced small RNAs of various other RppH-treated libraries (data not shown). Analysis of those small RNA sequencing datasets attested for the reproducibility of RppH treatment (data not shown). Overall, we can recommend RppH as a direct replacement of TAP as a small RNA pyrophosphatase to enrich RNA samples for 22G-RNAs with almost no changes to existing protocols or added costs.