RNA-Seq analysis validates the use of culture-derived Trypanosoma brucei and provides new markers for mammalian and insect life-cycle stages

Trypanosoma brucei brucei, the parasite causing Nagana in domestic animals, is closely related to the parasites causing sleeping sickness, but does not infect humans. In addition to its importance as a pathogen, the relative ease of genetic manipulation and an innate capacity for RNAi extend its use as a model organism in cell and infection biology. During its development in its mammalian and insect (tsetse fly) hosts, T. b. brucei passes through several different life-cycle stages. There are currently four life-cycle stages that can be cultured: slender forms and stumpy forms, which are equivalent to forms found in the mammal, and early and late procyclic forms, which are equivalent to forms in the tsetse midgut. Early procyclic forms show coordinated group movement (social motility) on semi-solid surfaces, whereas late procyclic forms do not. RNA-Seq was performed on biological replicates of each life-cycle stage. These constitute the first datasets for culture-derived slender and stumpy bloodstream forms and early and late procyclic forms. Expression profiles confirmed that genes known to be stage-regulated in the animal and insect hosts were also regulated in culture. Sequence reads of 100–125 bases provided sufficient precision to uncover differential expression of closely related genes. More than 100 transcripts showed peak expression in stumpy forms, including adenylate cyclases and several components of inositol metabolism. Early and late procyclic forms showed differential expression of 73 transcripts, a number of which encoded proteins that were previously shown to be stage-regulated. Moreover, two adenylate cyclases previously shown to reduce social motility are up-regulated in late procyclic forms. This study validates the use of cultured bloodstream forms as alternatives to animal-derived parasites and yields new markers for all four stages. 
In addition to underpinning recent findings that early and late procyclic forms are distinct life-cycle stages, it could provide insights into the reasons for their different biological properties.


Background
Virtually all unicellular parasites, particularly those that depend on two hosts, progress through a series of different life-cycle stages that can differ radically in their morphology, metabolic capabilities and surface architecture. One case in point is the African trypanosome, T. brucei brucei, which causes the disease Nagana in cattle, and is closely related to the parasites causing human sleeping sickness.
T. b. brucei cycles between vertebrates and tsetse flies, the latter being the definitive hosts where sexual reproduction can take place. There are at least two morphologically distinct life-cycle stages in the mammal: the slender form, which has the capacity to replicate, and the stumpy form, which is non-dividing. Until recently it was assumed that these forms were restricted to the bloodstream, but they are now known to occur in adipose tissue and skin as well [1][2][3]. When parasites are taken up by a tsetse fly in the course of a blood meal, slender forms are eliminated, but stumpy forms differentiate to procyclic forms that colonise the midgut. There are two populations of procyclic forms: early procyclic forms are found for up to a week after transmission, and are positive for the surface protein GPEET procyclin, while late procyclic forms, which are responsible for persistent infection of the midgut, are GPEET-negative. These two forms cannot be distinguished by their morphology. To complete the life cycle trypanosomes must undergo several more rounds of differentiation, culminating in the delivery of infectious metacyclic forms to a new mammalian host when the tsetse takes a blood meal.
Pleomorphic stocks of T. b. brucei, which produce both slender and stumpy bloodstream forms, can be cultured in the presence of an extracellular matrix such as methylcellulose. In vitro, differentiation to procyclic forms is induced by the addition of citrate and/or cis-aconitate to the medium and a reduction in temperature from 37°C to 27°C. Procyclic culture forms initially express GPEET; depending on the culture medium they continue to grow as early procyclic forms or differentiate into late procyclic forms [4]. Glycerol, glucose, oxygen concentration and an uncharacterised midgut factor can influence GPEET expression via its 3' UTR [4,5]. Other factors such as serum concentration and cell density may also influence expression, but these have not been investigated systematically. In addition to GPEET, a recent study identified several transcripts and proteins that were differentially expressed in early and late procyclic forms [6]. The two life-cycle stages also showed differences in behaviour when plated on semi-solid media. Early procyclic forms exhibited social motility (SoMo), a form of coordinated group movement, while late procyclic forms replicated at the inoculation site but did not migrate [6]. In tsetse flies the progression from early to late procyclic forms is strictly unidirectional. In culture, however, differentiation/dedifferentiation can occur in both directions [5] and it is not predictable when and why this occurs.
Several previous studies have analysed the transcriptomes of different life-cycle stages of T. b. brucei [7,8]. It is difficult to make comparisons, however, since some employed microarrays, while others used spliced leader trapping or classical RNA-Seq. For the most part, it is also not clear whether the procyclic forms used in these studies were early, late or a mixture of the two. To obtain a more comprehensive overview of differentially expressed genes we performed RNA-Seq on defined cultures of slender bloodstream forms, stumpy bloodstream forms and early and late procyclic forms. In addition, we analysed the expression of several multigene families and found that individual members were expressed in a stage-specific manner.

Parasite cultivation
Pleomorphic bloodstream forms of Trypanosoma brucei brucei EATRO 1125, clone AnTat 1.1 [9] were originally obtained from Dr. Erik Vassella, University of Bern. Bloodstream forms were cultivated in HMI-9 supplemented with 1.1% methylcellulose [10]. At densities < 10⁶ ml⁻¹, the majority of cells are slender forms; 24 h after reaching a density of 5 × 10⁶ ml⁻¹, the cells are essentially pure stumpy forms [11]. Early procyclic forms were obtained as described by Knüsel and Roditi [12]. They were maintained in SDM79 supplemented with 10% heat-inactivated foetal bovine serum in the presence of 20 mM glycerol. To differentiate them into late procyclic forms, glycerol was removed from the culture medium [4]. Real-time PCR showed that the early procyclic forms expressed 26-fold more GPEET mRNA than the corresponding late procyclic forms used for RNA-Seq.

RNA isolation and RNA-Seq analysis
Total RNA was isolated as described previously [12] and subjected to DNase treatment to remove residual genomic DNA contamination. Illumina cDNA libraries were prepared from poly(A)-selected RNA using TruSeq RNA sample preparation. Sequencing of cDNA libraries was performed at Fasteris, Geneva, using Illumina HiSeq sequencing systems with 100 or 125 bp read lengths and sequence depths of > 40 million reads per sample. Reads were mapped to the T. b. brucei 927 reference genome version 5 (either coding sequences or putative 3' UTRs), using the Bowtie tool available in the Galaxy interface (usegalaxy.org) with default parameters that allow a maximum of 2 mismatches per 28 bp seed (Galaxy version 1.1.2). Sequencing depth and mapping coverage are provided in Additional file 1. Mapping to the genome was used to visualise the data on Gbrowse; to estimate transcript abundance, reads were first mapped to coding sequences and unmapped reads were re-mapped to 3' UTRs. Read counts for the annotated genes or 3' UTRs were extracted using SAMTools pileup and RPM values were calculated. The Bioconductor package DESeq [13] was used to identify differentially expressed genes from biological replicates.
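The RPM (reads per million) normalisation mentioned in the methods can be illustrated with a minimal sketch. The function name, the feature names and the counts below are hypothetical, and the choice of denominator (all reads assigned to the listed features) is an assumption; the exact denominator used in the study is not stated.

```python
def reads_per_million(counts):
    """Scale raw read counts so that the library total is one million.

    `counts` maps a feature name to its raw read count. Using the sum of
    the listed features as the denominator is an assumption of this sketch.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("library contains no assigned reads")
    return {feature: n * 1_000_000 / total for feature, n in counts.items()}


# Hypothetical counts for three features in one library.
example_counts = {"feature_A": 500, "feature_B": 1500, "feature_C": 8000}
example_rpm = reads_per_million(example_counts)
```

Scaling each library to the same total makes read counts comparable between samples that were sequenced to different depths.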

Results
Faithful expression of stage-specific markers in culture-derived trypanosomes
To place our analysis of the transcriptomes of early and late procyclic forms in a wider context, we compared them to culture-derived slender and stumpy bloodstream forms of a tsetse-transmissible pleomorphic strain of T. b. brucei. RNA-Seq was performed on biological replicates. Pearson's correlations for replicates of slender, stumpy, early and late procyclic forms were 0.86, 0.85, 0.95 and 0.97, respectively. When consecutive life-cycle stages were compared, the greatest differences were observed between stumpy and early procyclic forms, presumably reflecting the adaptation to a different host, followed by slender/stumpy forms (Fig. 1). We first validated our data by examining the expression profiles of a panel of genes that are known to be stage-regulated. As shown in Fig. 2 and Additional file 2, bloodstream-specific genes such as invariant surface glycoproteins ISG75, ISG65 and ISG64 [14,15], GPI-phospholipase C (GPI-PLC) [16] and haptoglobin-haemoglobin receptor (HpHbR) [17] are upregulated in bloodstream forms compared to procyclic forms. Furthermore, two transcripts previously shown to accumulate in rodent-derived stumpy forms (PAD1 and PAD2) [18] showed increased expression in culture-derived stumpy forms (Fig. 2). Differentiation to the procyclic form is accompanied by expression of procyclic-specific surface proteins and development of a fully functional mitochondrion. In addition to the procyclins, which are the most abundant procyclic-specific transcripts and proteins [19], our data showed strong upregulation of transcripts encoding the surface protein PSSA-2 [20], two voltage-dependent anion-selective channels (VDAC1 and 2) [21,22] and cytochrome oxidases [23] (Fig. 2). Finally, transcripts that are differentially expressed by early and late procyclic forms in both culture and tsetse [6] gave the same expression profile in our RNA-Seq data (Fig. 3a).
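For readers who wish to reproduce the replicate comparison, Pearson's correlation between two replicates can be computed as in the following sketch. The function name and the example values are hypothetical; whether the reported correlations were calculated on raw or log-transformed expression values is not stated, so that choice is left to the caller.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expression values for the same four genes in two replicates.
replicate_1 = [10.0, 250.0, 40.0, 900.0]
replicate_2 = [12.0, 230.0, 55.0, 870.0]
r = pearson(replicate_1, replicate_2)
```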
The transcriptomes of culture-derived slender and stumpy forms
Pleomorphic bloodstream forms grown in HMI-9 in the presence of 1.1% methylcellulose replicate as slender forms, and differentiate into stumpy forms once they reach densities > 5 × 10⁶ ml⁻¹ [11]. Stumpy forms from these cultures can differentiate synchronously to procyclic forms [10], have increased levels of PAD1 mRNA, are infectious for tsetse and can complete the life cycle [11]. As shown in Additional file 2, DESeq analysis of biological replicates identified 497 genes as differentially regulated ≥2-fold between slender forms and stumpy forms with a p value < 0.05. GO term analyses indicate significant decreases (Bonferroni adjusted p value < 10⁻⁵) in transcripts in the categories of macromolecule synthesis, metabolism, chromatin assembly and locomotion (Additional file 3). Among them are 230 transcripts that were down-regulated ≥2-fold in stumpy forms compared to slender forms. As shown for rodent-derived bloodstream forms, these reflect the fact that stumpy forms are cell-cycle arrested [24], with clear decreases in histone transcripts (Additional files 2 and 4). Tubulins and flagellar components are also down-regulated relative to slender forms. Once again this is consistent with the cells being quiescent, as the flagellum is only duplicated at the onset of mitosis [25]. Translation is also reduced in stumpy forms to about one-fifth of the rate in slender forms and there is a decrease in the number of polysomes [26]. Interestingly, transcripts for ribosomal proteins were not reduced significantly, but transcripts for the two versions of elongation factor 1A were down-regulated 2.5-fold and two Alba-domain proteins, Alba1 and Alba3, were reduced 3-fold. Alba proteins have previously been linked to initiation of translation in trypanosomes [27].
Transcripts encoding glycosomal proteins and glycolytic enzymes were also strongly down-regulated (Fig. 3b, Additional file 2), suggesting a reduced reliance on glucose as an energy source.
[Figure caption: Comparative transcriptomes of (a) slender vs stumpy forms, (b) stumpy vs early procyclic forms, (c) early vs late procyclic forms]
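The thresholds applied to the slender/stumpy comparison (a change of at least 2-fold with a p value < 0.05) can be expressed as a simple filter. The sketch below is not the DESeq procedure itself; it only shows how such thresholds could be applied to a table of fold changes and p values. The function name is hypothetical, and the orientation of the fold change (stumpy relative to slender) is an assumption.

```python
import math


def classify(fold_change, p_value, min_fold=2.0, max_p=0.05):
    """Label a gene 'up', 'down' or 'unchanged'.

    `fold_change` is assumed to be the stumpy/slender ratio (> 0), so a
    value of 0.5 corresponds to a 2-fold decrease in stumpy forms.
    """
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    if p_value >= max_p:
        return "unchanged"
    # Work on the log2 scale so increases and decreases are symmetric.
    log2_fc = math.log2(fold_change)
    threshold = math.log2(min_fold)
    if log2_fc >= threshold:
        return "up"
    if log2_fc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical (fold change, p value) pairs.
labels = [classify(fc, p) for fc, p in [(3.0, 0.001), (0.4, 0.01), (1.5, 0.001)]]
```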
A large number of transcripts were more highly expressed in stumpy forms than in slender forms (Additional files 2, 4 and 5), probably reflecting pre-adaptation for transmission to the fly. This is reminiscent of a recent study of in vitro-derived metacyclic forms; these are cell-cycle arrested, and are poised to translate a bloodstream form proteome on transmission to the mammalian host [28]. GO term analysis [29] indicated that a preponderance of transcripts in the signalling and cyclic nucleotide categories were significantly regulated (Additional file 2); all of these were adenylate cyclases. However, visual inspection revealed that 12 genes encoding components of inositol metabolism were also up-regulated (Table 1) [30][31][32]. These included all four target of rapamycin (TOR) orthologues [33,34]. Depletion of TOR4 has previously been shown to lead to the differentiation of monomorphic bloodstream forms to stumpy-like forms [34], so it is somewhat surprising that its expression is maximal in the latter.
Many of the proteins that were shown to be differentially expressed by SILAC [6] reflected regulation at the level of mRNA (Table 2), including calflagins, prostaglandin F synthase, PTP1-interacting protein 39 (PIP39), pteridine reductase, adenylate cyclases and hexokinase 1 (HK1). HK1 and HK2 are virtually identical in their coding sequences, but differ in their 3' UTR sequences. In addition, receptor-type adenylate cyclases Tb927.5.330 (AC330) and Tb927.5.320 (AC320) have extremely similar coding sequences but can be distinguished by their 3' UTRs. Mapping coverage to 3' UTRs was taken into account to identify the differential regulation of these genes between early and late procyclic forms (Fig. 4a). As shown in Additional file 6 and Fig. 4b, HK1 is up-regulated in early procyclic forms whereas HK2 is increased in late procyclic forms.
[Table footnote: Numbers denote fold changes. Two biological replicates were performed in each case. The data for SILAC are derived from Imhof et al. [6]. ND: not detected.]

Individual members of multigene families show stage-specific expression
With longer sequences, it is possible to assign reads to individual members of multigene families and demonstrate stage-specificity, even when coding regions share ≥96% identity. Of a cluster of 5 cation transporter genes on chromosome 11 (Tb927.11.8990-9030; Fig. 5), Tb927.11.8990 shows maximum expression in procyclic forms (both early and late), whereas Tb927.11.9000 and 9010 show maximum expression in stumpy forms. The remaining two copies are expressed at low levels in bloodstream and procyclic forms, but are up-regulated in the salivary glands [8]. Likewise, a cluster of 9 amino acid transporter genes on chromosome 8 (Tb927.8.7600-7700) shows differential expression. For example, Tb927.8.7610 and 7650 are most highly expressed in bloodstream forms and Tb927.8.7600 is most highly expressed in procyclic forms (Fig. 6). Tb927.8.7640, which is 96% identical to Tb927.8.7610/7620/7630, is expressed at moderate levels in all 4 life-cycle stages that we analysed, but is up-regulated in the salivary glands, together with Tb927.8.7610 [8].
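The benefit of longer reads for separating near-identical genes can be illustrated with a rough calculation. The sketch below computes the percent identity of two pre-aligned sequences and, under the simplifying assumption that differences are spread uniformly and independently along the sequence, the chance that a read of a given length spans no distinguishing position. Both function names and the example sequences are hypothetical, and real differences between paralogues are often clustered, so the second function gives only an order-of-magnitude guide.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gap characters ('-') are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def prob_read_uninformative(identity, read_length):
    """Chance that a read covers no differing position.

    Assumes uniformly and independently distributed differences, which is
    a simplification; `identity` is a fraction between 0 and 1.
    """
    return identity ** read_length


# Hypothetical 10-base aligned fragments differing at the last position.
identity_pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
# Compare a short read with a 100-base read for genes sharing 96% identity.
short_read = prob_read_uninformative(0.96, 36)
long_read = prob_read_uninformative(0.96, 100)
```

Under these assumptions, the proportion of reads that cannot be assigned to one gene copy falls steeply as read length increases.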

Discussion
We have obtained comprehensive transcriptome data from cultures of four different life-cycle stages. It is highly encouraging that the expression profiles of all known stage-regulated genes identified in previous studies using different parasite strains, different sources and different methods are confirmed in our analysis. We conclude that slender and stumpy forms cultured in the presence of methylcellulose are excellent substitutes for parasites isolated from animals. Furthermore, this study, which provides the first RNA-Seq analysis of the transcriptome of stumpy forms, shows that many more genes are stage-regulated than was previously realised, with genes involved in inositol metabolism taking a prominent place. A number of genes that show peak expression in stumpy forms are expressed at similar levels in long slender and procyclic forms, and would therefore have been missed in earlier analyses. Our findings also underline that stumpy forms are not merely non-dividing bloodstream forms with some degree of pre-adaptation for transmission to tsetse, but are likely to have unique functions in the mammalian host. A first comparison of the transcriptomes of early and late procyclic forms shows that these are more closely related to each other than to other life-cycle stages, but they are clearly distinct. Most of the differential regulation of proteins described in a previous study of early and late procyclic forms [6] can be attributed to differences in steady-state mRNA, suggesting that translational control plays a relatively minor role at this point.
When the analysis was extended to ~1200 proteins identified in the 2 SILAC datasets, the overall correlation coefficients (fold changes RNA:fold changes protein) were 0.46 and 0.59, respectively [6]. Comparing our RNA-Seq data to the proteomics data sets from Dejung et al. [39], the correlation for RNA:protein in slender bloodstream forms was in the same range, at 0.48, while the correlation for stumpy forms was only 0.28. This is likely to reflect RNAs that are present, but not translated until the parasites begin to differentiate to procyclic forms. We could not perform a comparison between RNA and proteome for procyclic forms as the data from Dejung et al. [39] do not specify whether their cultures are early or late procyclic forms (based on various markers we suspect that they are a mixture of the two). However, of 99 proteins downregulated 24-48 h after triggering differentiation from stumpy to procyclic forms, 87 mRNAs were downregulated in early procyclic forms.
[Figure caption fragment: The coding region of Tb927.8.7610 is ≥96% identical to Tb927.8.7630/7640, but their expression profiles are distinct. Sl: slender bloodstream forms; St: stumpy bloodstream forms; Ea: early procyclic forms; La: late procyclic forms]
In addition to providing new markers for all four life-cycle stages, these data also offer clues about metabolism. For example, genes encoding glycerol-uptake proteins are upregulated in stumpy forms, while glycerol kinases are upregulated in early procyclic forms. Unexpectedly, the THT2 hexose transporters are transiently upregulated in early procyclic forms. This may reflect a need for active acquisition of glucose in a sugar-poor environment, the insect midgut, and provide a window for maturation of the mitochondrion. Differentially regulated ion transporters and amino acid transporters presumably allow the parasites to sense and respond to their environment. Finally, the discovery of a relatively small number of differentially regulated genes between early and late procyclic forms may enable us to elucidate the signals and mechanisms involved in SoMo.

Conclusions
This study provides the first transcriptomic data from cultures of four consecutive life-cycle stages of Trypanosoma brucei. As well as validating the use of cultured slender and stumpy bloodstream forms as alternatives to animal-derived parasites, in compliance with 3R principles, it provides the first comparison of the transcriptomes of early and late procyclic forms and identifies new stage-regulated transcripts. Long reads enabled us to distinguish between closely related members of multigene families, and show that these are differentially expressed during the life cycle. Finally, this study delivers insights into the metabolic activities of the different life-cycle stages.