A fast and cost-effective microsampling protocol incorporating reduced animal usage for time-series transcriptomics in rodent malaria parasites

Background The transcriptional regulation that occurs in malaria parasites during the erythrocytic stages of infection can be studied in vivo with rodent malaria parasites propagated in mice. Time-series transcriptome profiling commonly involves the euthanasia of groups of mice at specific time points followed by the extraction of parasite RNA from whole blood samples. Current methodologies for parasite RNA extraction involve several steps and when multiple time points are profiled, these protocols are laborious, time-consuming, and require the euthanization of large cohorts of mice. Results A simplified protocol has been designed for parasite RNA extraction from blood volumes as low as 20 μL (microsamples), serially bled from mice via tail snips and directly lysed with TRIzol reagent. Gene expression data derived from microsampling using RNA-seq were closely matched to those derived from larger volumes of leucocyte-depleted and saponin-treated blood obtained from euthanized mice with high reproducibility between biological replicates. Transcriptome profiling of microsamples taken at different time points during the intra-erythrocytic developmental cycle of the rodent malaria parasite Plasmodium vinckei revealed the transcriptional cascade commonly observed in malaria parasites. Conclusions Microsampling is a quick, robust and cost-efficient approach to sample collection for in vivo time-series transcriptomic studies in rodent malaria parasites. Electronic supplementary material The online version of this article (10.1186/s12936-019-2659-4) contains supplementary material, which is available to authorized users.

and in other human malaria parasite species 11,12 . Such time-series transcriptome studies, including perturbation experiments [13][14][15] can be performed with human malaria parasites but only in in vitro or ex vivo cultures. A few studies have profiled gene expression in vivo in clinical field isolates [16][17][18] to infer gene function but gene expression changes due to a particular environmental condition or gene knockout need to be studied under controlled experimental settings.
Rodent malaria parasites (RMPs) allow for tractable in vivo model systems for the study of the biology of malaria parasites [19][20][21] . RMPs can be propagated in mice and mosquitoes under laboratory conditions, thus providing easy access to all the developmental stages of the parasite's complex life cycle. Stage-specific transcriptional control has been shown in RMPs during their IDC [22][23][24] , vector 22,[25][26][27] and liver stages 28 .
Extraction of parasite RNA from blood stages of RMPs involves several steps.
Peripheral, parasitized whole blood from infected mice is collected at a desired time point during the course of infection, through terminal sampling methods 39 involving exsanguination. In the case of profiling life-stage specific gene expression in RMPs that exhibit asynchronous parasite development in the blood (Plasmodium berghei and P. yoelii), the parasite stages are typically separated via a density gradient column by centrifugation after blood collection or are maintained as synchronous ex vivo cultures 23,24 .
In order to study stage-specific gene expression in synchronous parasites such as P. chabaudi and P. vinckei, and more importantly, gene expression changes in vivo during the course of infection or in response to perturbations, transcriptomic profiling can be performed directly after blood collection. Blood samples are first enriched for parasite RNA by removing host leukocytes using either commercially available Plasmodipur filters (EuroProxima CAT#8011) or custom-made CF11 cellulose columns 40 . In addition to leukocyte depletion, host globin RNA within the parasitized erythrocytes is also removed by releasing the parasites from the RBCs via saponin lysis prior to RNA isolation. RNA is extracted from the purified parasites using guanidinium thiocyanatephenol-chloroform method 41,42 (with TRIzol Reagent) or with column-based RNA extraction kits. RNA transcripts are then identified and their levels measured through microarray hybridization 22 or next-generation sequencing of RNA-seq libraries 23,24 .
While profiles from both methods correlate well 3,43 , RNA-seq involves direct, deep sequencing of RNA transcripts and offers an unbiased and highly resolved picture of the parasite's transcriptional landscape by providing information on genes expressed at low levels, alternative splicing events and antisense transcription at base-pair resolution 3,4,43,44 .
As these experiments require exsanguination of mice, large cohorts of mice are required for each time point (see Figure 1A). Thus, the number of animals that need to be euthanized directly increases with the number of biological replicates and time points, imposing ethical and cost constraints on the study design. Sampling different animals at different time points also increases the probability that inter-animal variation will increase the variability of the results.
Exsanguination involves deep terminal anesthesia of the mouse, and the performance of surgical procedures. This, along with the leukocyte depletion and saponin lysis steps, makes the entire procedure time-consuming and requires considerable technical expertise. Thus, multiple sampling at short time intervals requires significant cost, time-investment and high level of technical expertise. We have, therefore, devised a simplified protocol for time-series transcriptomics of RMPs that uses a serial blood microsampling approach for sample collection (see Figure 1A).
Microsamples are usually blood volumes less than 50 μL and which can be collected at multiple time points from a single mouse using less invasive procedures such as tail snip or tail vein sampling. Microsampling techniques are quicker, cause less stress to the animal, allow multiple samples from the same animal through time and have been shown to reduce animal usage in pharmacokinetic studies [45][46][47][48] . Here, we have evaluated the feasibility of sequencing parasite RNA transcripts from blood volumes as low as 20 μL and assessed whether data thus obtained would reflect the true global gene expression in the parasite. We have also assessed the impact of raw processing of blood samples without leukocyte depletion.

Results
Two sets of experiments were designed to assess the transcriptomic read-outs from the microsampling approach. First, we compared the gene expression profile obtained through microsampling with that of the routinely applied technique of terminal bleeding, in mice infected with P. chabaudi parasites. Twenty microlitres of blood was collected via tail snip from P. chabaudi-infected mice, washed with PBS and immediately lysed with 0.5 mL TRIzol reagent. Following microsample collection, mice were anaesthetized and exsanguinated via incision of the brachial artery. Around 0.5 mL of blood was collected from each mouse, washed twice with PBS and passed through home-made CF11 cellulose columns to remove leukocytes. The RBCs were then gently lysed using saponin and the harvested parasite pellet was immediately lysed with 1 mL TRIzol reagent.
In the second experiment, microsampling was done from mice infected with P.
vinckei parasites to assess whether the microsampling protocol can identify gene expression changes that occur during P. vinckei's intraerythrocytic developmental cycle.
Twenty microlitres of samples (microsamples) were taken via tail snip from three mice infected with P. vinckei at four time points -6h, 12h, 18h and 24h during the 4th day post infection, and lysed with TRIzol, as before.
Firstly, RNA yield and quality obtained from microsamples were assessed. The P. chabaudi samples were taken at a low parasitaemia of 4-8% and the P. vinckei samples at a higher parasitaemia of around 25%. This was reflected in the total RNA yield from these samples, with most of the P. chabaudi and P. vinckei samples yielding around 1 μg and 10 μg respectively (see Table 1). Quality assessment of the total RNA using Bioanalyser found no evidence of degradation and also, as expected, showed rRNA electropherogram peaks of both mouse and Plasmodium origin (see Figure 1C).
Upon performing Illumina RNA sequencing and mapping RNA-seq reads onto the relevant RMP reference genomes, it was found that a low percentage of the reads (4-12%) were of parasite origin in the P. chabaudi samples, resulting in low fragment depth (most genes had less than 50 fragments mapped onto them) while higher percentages (55-75%) and fragment depth (most genes with more than 424 mapped fragments) were obtained for P. vinckei (see Table 1).

Comparison of microsampling and terminal sampling methods
There was strong correlation between the normalized gene-wise FPKM values (fragments per kilobase of transcript per million mapped reads) of microsamples and terminal bleed samples (see Figure 1B Figure 1D). Of a total of 5,073 genes in P. vinckei, 4,328 genes were significantly differentially expressed (p-value less than 0.05) in at least one time point (616 genes were not differentially expressed and only 129 genes had 0 FPKM value in at least one timepoint). As in other Plasmodium species 1,3,10,11,23,24 stage-specific gene expression was inferred in P. vinckei by constructing a phaseogram, where the differentially expressed genes are ordered according to the time point at which their expression peaks (see Figure 2).
Next, we compared P. vinckei gene expression with the transcriptional cascade shown in P. falciparum 51 . Two thousand, four hundred and eighty (2,480) P. vinckei genes were ordered according to the expression values of their one-to-one orthologues in P. falciparum (from a total of 2,712 P. falciparum genes profiled in 51 ), and a similar temporal expression cascade as in 51 was obtained (see Figure 2).

Minimum sequencing depth and cost estimation
The absence of any host depletion step reduces the proportion of sequencing reads of parasite origin and the final read coverage of parasite transcripts. In order to assess the impact of host contamination, we first estimated the minimum amount of sequencing required per sample to gain robust results during gene expression analysis.
Random subsampling of different sizes (1, 3.16, 10, 31.6 and 100% of the total reads) was performed for the 12 P. vinckei samples and differentially expressed genes were inferred in each case at a significance level (q-value) of 0.05. It was observed that the number of differentially expressed genes and their expression values (in all pairwise comparisons between the four time points) did not change drastically in subsamples of and above 31.6% of the total reads, which is equivalent to around 3 million paired-end reads per sample (see Supplementary Figure S1). Setting this as the target sequencing depth for an RMP transcriptome, sequencing and animal costs alone were calculated for different host contamination levels (10 to 80%) for a microsampling experiment and compared with a terminal blood sampling experiment. As the number of time points or biological replicates increase in the study design, microsamples with host contamination levels less than 70% would cost the same or less than terminal blood sampling (see Supplementary Figure S1). Differences in other costs incurred for animal housing and laboratory reagents between the two protocols were assumed to be negligible. Our estimates were also made without considering the substantial manpower costs associated with the terminal blood sampling procedure. Our protocol allows for expression profiling at multiple time points from the same host, thus reducing animal-to-animal variation.

Discussion
As host RNA depletion steps can be skipped, high proportions of host-derived reads in the sequencing data is the main limitation of this protocol, especially at low levels of parasitaemia. More sequencing data is therefore required per sample to compensate for host contamination and to achieve a suitable sequencing depth of the parasite's transcriptome. By randomly reducing the number of reads in our dataset, we estimated that only 3 million paired-end reads are required for robust differential expression analyses. Increasing the number of replicates could further reduce this minimum sequencing depth.
Around 20% of the genes did not have fragment coverage in microsamples with P. chabaudi at low parasitaemias suggesting that host contamination at parasitaemias of below 7% would be unfeasible, requiring extra-large amount of sequencing to achieve sufficient sequencing depth for parasite transcripts. The protocol is well-suited for higher parasitaemias as shown in the P. vinckei microsamples that yielded sufficient fragment coverage for almost all of the genes.
Based on current animal and sequencing costs, we estimate that any study with host contamination as high as 70% would still be economically viable, especially if the significant reduction in manpower costs is considered. Host reads would of course be informative in studies that profile both host and parasite transcriptomes simultaneously to study host response to infection.
In conclusion, RNA extraction, sequencing and expression analysis can be performed with <20 microliters of malaria parasite infected blood in a robust, reproducible and costefficient way. Our protocol can also be adapted to profile the in vivo transcriptome of other blood-borne pathogens like Trypanosomes in rodent models. Blood collection and TRIzol lysis can be performed within 5 minutes allowing snapshots of gene expression to be taken quickly, at more frequent time points, and using less manpower. Serial bleeding of the same mice throughout the study reduces the number of animals used and animalto-animal variation.

Plasmodium chabaudi chabaudi AS strain and Plasmodium vinckei vinckei CY strain
were used to initiate infections in mice. In each case, one million parasites were intravenously inoculated into each CBA mouse.

Comparison of microsampling and terminal sampling methods
In order to compare microsampling with terminal bleed sampling, blood sampling was performed in mice infected with either wild-type P. chabaudi parasites or genetically modified P. chabaudi parasites (PCHAS_1433600 gene knockout). On the fourth day post infection, each mouse was restrained and 1-2 mm of the distal portion of the tail was excised with sanitized scissors. Twenty microlitres of blood was subsequently collected from the tail by pipette and deposited in 500 μL of phosphate buffered saline (PBS) solution. Whole blood was briefly spun down in a tabletop microcentrifuge, supernatant removed and the RBC pellet resuspended in 500 μL TRIzol reagent (ThermoFischer Cat#15596026). TRIzol lysates were temporarily stored at 4°C (for periods up to 48hrs), or for longer periods at -80°C.
Thin blood smears on glass slides were taken just before blood collection of blood, fixed with methanol and stained with Giemsa's solution for estimating parasitaemia.
Following this, mice were immediately anaesthetized with an intraperitoneal injection of 0.2 mL of 10% sodium pentobarbital solution in PBS solution. Once completely sedated, a vertical incision was made from the bottom of the rib cage to the right shoulder, forming a cavity. The brachial artery was cut and around 0.5-0.6 mL of blood was collected into 3 mL citrate saline solution (8.5g of NaCl,15g trisodium citrate in 1L of distilled water, pH 7.2) on ice with a sterile Pasteur pipette. The complete procedure was carried out with the mouse under isoflurane anaesthesia via inhalation.
Mice were euthanized by cervical dislocation. Strand-specific mRNA sequencing was performed from total RNA using a TruSeq Stranded mRNA Sample Prep Kit LT (Illumina Cat#RS-122-2101) according to the manufacturer's instructions. Briefly, polyA+ mRNA was purified from total RNA using oligo-dT dynabead selection. First strand cDNA was synthesized using randomly primed oligos followed by second strand synthesis where dUTPs were incorporated to achieve strand-specificity. The cDNA was adapter-ligated and the libraries amplified by PCR.
Libraries were sequenced in an Illumina Hiseq2000 with paired-end 100 bp read chemistry.

Subsampling and cost estimation
Gene-wise fragment counts were inferred using featureCounts 62 and subsampling analysis was performed using subSeq v1.4.1 63 . Random subsampling of different sizes (1, 3.16, 10, 31.6 and 100% of the total reads) was performed for the 12 P. vinckei samples with up to 10 replications at each subsampling step. For each subsampling step, differential expression analysis was performed for each pairwise comparison among the four time points with a q-value cutoff of 0.05 using two tools, DESeq2 64 and edgeR 65 . Cost estimation was done with the following conditions-i) target sequencing depth of 3,000,000 paired-end 100bp reads per sample, ii) sequencing cost per gigabasepair is $22 66 and iii) Cost of one six-weeks old female CBA/J mouse is $31.68 (https://www.jax.org/strain/000656).

Author's contributions
AR, RC and AP designed the methodology. AR collected, analyzed and interpreted the data. AKS designed and performed real-time qPCR experiments. AR wrote the manuscript and all authors contributed to it.

Declarations
The authors declare that they have no competing interests. All raw sequencing fastq files are available through the European Nucleotide Archive study accession number: PRJEB27301.

Figure Legends
Figure-1. Microsampling protocol design and reproducibility. A) In terminal blood sampling, at each time point, groups of mice are exsanguinated to get 0.5 -0.6 mL blood volumes, which is then subject to leukocyte depletion and saponin lysis before TRIzol treatment. Thus, the number of mice increases proportionally to number of time points and biological replicates in the study design (NT). On the other hand, microsampling involves obtaining sample volumes as low as 20 μL from the same mouse at different time points, thus confining the number of mice to just biological replicates (N) and significantly lowering costs and biological variability due to individual animals. Leukocyte depletion and saponin lysis are also not performed on the low volume samples, thus saving time and manpower. B) High Pearson correlations were observed between gene expression profiles from microsampling and terminal blood sampling methods. C) Bioanalyser electrophoregrams of total RNA from P. vinckei microsamples show that high quality RNA could be extracted consistently from 20 μL microsamples. D) Heatmap shows pair-wise Pearson correlation coefficients and the inset shows multidimensional scaling to visualize the level of similarity between the P. vinckei microsamples. Microsamples show low degree of variability and are highly reproducible as proved by tight correlations between biological replicates.

Figure-2. Time-series transcriptome of P. vinckei vinckei CY.
Heat maps showing gene expression in P. vinckei at 6 hour time points during the 24 hour asexual cycle, each corresponding to a dominant population of rings (R), early trophozoites (ER), late trophozoites (LT) and schizonts (S) respectively. All significantly regulated P. vinckei genes (4,328 genes) were ordered according to their phase of expression (left). P. vinckei genes with one-to-one orthologs in P. falciparum (2,480 genes) were ordered based on P. falciparum gene expression pattern shown in 1 (right). Gene-wise FPKM values can be found in Supplementary tables S1 and S2.  Table 1. Microsample characteristics and RNA-seq mapping statistics. Microsamples from P. chabaudi had low parasitaemia and therefore, a low percentage of reads mapping to P. chabaudi genome. In contrast, P. vinckei microsamples had lesser host contamination resulting in higher median fragment coverage across its transcripts. Transcript integrity number (TIN) was calculated using RSeQC 56 and all samples showed a high TIN value, indicating little to no evidence of RNA degradation.