Cell Surface Proteomics Provides Insight into Stage-Specific Remodeling of the Host-Parasite Interface in Trypanosoma brucei

African trypanosomes are devastating human and animal pathogens transmitted by tsetse flies between mammalian hosts. The trypanosome surface forms a critical host interface that is essential for sensing and adapting to diverse host environments. However, trypanosome surface protein composition and diversity remain largely unknown. Here, we use surface labeling, affinity purification, and proteomic analyses to describe cell surface proteomes from insect-stage and mammalian bloodstream-stage Trypanosoma brucei. The cell surface proteomes contain most previously characterized surface proteins. We additionally identify a substantial number of novel proteins, whose functions are unknown, indicating the parasite surface proteome is larger and more diverse than generally appreciated. We also show stage-specific expression for individual paralogs within several protein families, suggesting that fine-tuned remodeling of the parasite surface allows adaptation to diverse host environments, while still fulfilling universally essential cellular needs. Our surface proteome analyses complement existing transcriptomic, proteomic, and in silico analyses by highlighting proteins that are surface-exposed and thereby provide a major step forward in defining the host-parasite interface.

Parasitic protozoa afflict nearly one billion people worldwide and constitute a substantial global public health burden. Owing to infection of livestock and crop plants, protozoa also cause economic hardship and limit development in some of the most impoverished regions of the world. A critical but poorly understood aspect of parasite-host interactions is the parasite cell surface, which is the direct interface with the host environment. Parasite surface proteins function in attachment and invasion of host tissues, defense against host attack, and uptake of essential nutrients (1). For many parasites, transmission between human hosts occurs through invertebrate vectors and intermediate hosts, requiring that the surface proteome be sufficiently flexible to accommodate diverse extracellular environments.
The protozoan parasite Trypanosoma brucei causes lethal sleeping sickness in humans and nagana in cattle, which together impose a tremendous medical and economic burden across sub-Saharan Africa. There is no vaccine for sleeping sickness and current treatments are antiquated, toxic, and increasingly ineffective (2). Transmission between mammalian hosts occurs through the bite of a tsetse fly vector and T. brucei is extracellular throughout all stages of its life cycle. Therefore, a dynamic and multifunctional surface proteome is paramount for T. brucei survival, transmission, and pathogenesis (3)(4)(5).
In the mammalian host, T. brucei replicates indefinitely in the bloodstream, where surface proteins must continuously protect against attacks from the host immune system and simultaneously compete with host cells for uptake of essential nutrients (6). The surface of bloodstream-form parasites is dominated by a dense coat of variant surface glycoprotein (VSG), which shields other parasite surface proteins from host antibodies and allows evasion of the immune system through antigenic variation (7)(8)(9). Upon uptake during a tsetse fly bloodmeal, the parasites undergo a dramatic differentiation, marked by pronounced changes in cell morphology and metabolism and replacement of VSG with a surface coat of procyclin (10). The resulting procyclic-form parasites establish an infection in the fly midgut. Parasites then migrate from the midgut to the salivary glands and undergo several further differentiations, including modification of procyclin isoforms and acquisition of a surface coat of "brucei alanine-rich proteins" (BARPs) (11)(12)(13)(14). Upon establishing a salivary gland infection, parasites undergo a final differentiation into mammalian-infectious forms that reacquire a VSG coat in preparation for transmission to a new host. Remodeling of major surface proteins between life cycle stages reflects the parasite's need to adapt to varied and generally hostile host environments (4, 15). In addition to the major surface proteins discussed above, less abundant surface proteins are responsible for sensing signals that direct tissue-specific differentiation events crucial for infection chronicity and transmission (16-20).
Beyond their role in host-parasite interaction, T. brucei surface proteins are directly relevant for therapeutic intervention, impacting nearly all currently available drug treatments. For example, cell surface transporters mediate uptake of pentamidine and melarsoprol, two of the frontline drugs used to treat bloodstream and central nervous system infections, respectively (21)(22)(23). Mutations in these surface transporters cause naturally occurring drug resistance in field isolates (24-26), underscoring the clinical relevance of parasite surface proteome composition and function.
Despite its importance to transmission, pathogenesis, and therapeutic intervention, the T. brucei surface protein repertoire remains largely unknown, presenting a major gap in our understanding of parasite biology and host-parasite interactions. Here, we utilize surface biotinylation, coupled with affinity purification and shotgun proteomics (27,28), to obtain cell surface proteomes from procyclic (insect midgut) and bloodstream-form T. brucei. We identify many proteins not previously known to be surface exposed, indicating great diversity of proteins that function at the host-parasite interface. Comparison of surface proteomes from procyclic and bloodstream forms reveals extensive stage-specific surface protein remodeling that includes individual paralogs within protein families. As such, our studies suggest that the surface proteome is larger and more diverse than generally appreciated and that fine-tuned remodeling enables adaptation to different host environments, while still accommodating universally essential cellular needs.

EXPERIMENTAL PROCEDURES
Cell Lines-Bloodstream-form trypanosomes derived from strain 427 (221 single marker cell line) and procyclic-form trypanosomes (29-13 cell line) were cultivated as described (29).
Purification of Biotinylated Proteins and VSG Depletion-Surface biotinylation with sulfo-NHS-SS-biotin (Life Technologies, Grand Island, NY) and purification of biotinylated proteins were performed as described (27,28), with the exception that flagella were not removed. Cells (1-5 × 10^8) were washed twice in phosphate buffered saline (PBS) and resuspended in PBS + 0.5 mg/ml sulfo-NHS-SS-biotin for 10 min on ice. Unreacted biotin was blocked by addition of 100 mM Tris for 10 min on ice, followed by two washes in PBS + 100 mM Tris. Purification of biotinylated proteins was performed as described (28). Briefly, cells were lysed in PBS + 0.5% Nonidet P-40 + SigmaFAST EDTA-free protease inhibitors (Sigma-Aldrich, St. Louis, MO) for 10 min on ice. Soluble and insoluble proteins were separated by centrifugation at 13,000 rpm for 10 min at 4°C. The supernatant containing soluble proteins was incubated with Streptavidin Sepharose High Performance beads (GE Healthcare, Buckinghamshire, UK) for 30 min at 4°C. Beads were collected by centrifugation and washed as described (28).
VSG Depletion-VSG was removed by activation of GPI-PLC (30). Surface-biotinylated cells were hypotonically lysed at 8 × 10^8 cells/ml in ice-cold H2O + SigmaFAST EDTA-free protease inhibitors (Sigma-Aldrich, St. Louis, MO) for 5 min on ice. Samples were pelleted at 3000 × g for 10 min at 4°C and pellets were resuspended in 10 mM sodium phosphate, pH 8 + protease inhibitors at 37°C for 5 min. After chilling briefly on ice, VSG-depleted samples were pelleted at 12,000 × g for 10 min at 4°C. Purification of surface-biotinylated proteins from VSG-depleted pellets was performed as described above.
Shotgun Proteomic Analysis of Surface Proteomes-Proteomic analyses were performed essentially as described (28). TCA precipitates and on-bead samples were mixed with digestion buffer (100 mM Tris-HCl, pH 8.5, 8 M urea). The samples were reduced and alkylated by sequential treatment with 5 mM tris(2-carboxyethyl)phosphine (TCEP) and 10 mM iodoacetamide as described earlier (31,32). Afterward, samples were sequentially digested with Lys-C and trypsin proteases as previously described (32). The digestion was stopped by addition of 5% formic acid and peptide digests were analyzed by mass spectrometry. An initial set of surface proteome samples was prepared from both BSF (n = 3) and PCF (n = 5) stage cultures and analyzed by 2D-LC-MS/MS on a Thermo Fisher LTQ-Orbitrap XL as described (27,28). Additional surface proteome samples were subsequently prepared in order to either: (1) assess the relative abundance of putative surface proteins in the Input versus streptavidin-bound fractions or (2) perform label-free MS1-based quantitation of surface proteins that were differentially identified in the initial BSF or PCF samples. These subsequent surface proteome preparations were desalted and analyzed by LC-MS/MS on a Thermo Fisher Q-Exactive. For Q-Exactive experiments, desalted peptide digests were separated online using reversed-phase chromatography on a 75 μm inner diameter fritted fused silica capillary column with a 5 μm pulled electrospray tip that was packed in-house with 15 cm of Luna C18(2) 3 μm reversed-phase particles. An EASY-nLC 1000 ultra-high pressure liquid chromatography (UHPLC) system (Thermo Scientific, Waltham, MA) was used to deliver a linear acetonitrile gradient from 3% to 30% solvent B (Buffer A: 0.1% formic acid, Buffer B: acetonitrile/0.1% formic acid) at a flow rate of 200-300 nl/min. MS/MS spectra were collected on a Q-Exactive mass spectrometer (Thermo Scientific, Waltham, MA) as described (33,34).
Raw data files were converted to MS2 files using RawExtractor v.1.8 and v.1.9.9.2 for LTQ-Orbitrap XL and Q-Exactive data, respectively. Data analysis was performed using ProLuCID for database searching and DTASelect2 for probabilistic filtering as implemented in the Integrated Proteomics Pipeline v.2-IP2 (Integrated Proteomics Applications, Inc., San Diego, CA) (35)(36)(37). MS/MS spectra were searched against a protein FASTA database obtained from TriTrypDB (downloaded from tritrypdb.org on February 9, 2012) appended with sequences for ESAGs from the 221 VSG expression site (GI numbers 189094616-189094632) and concatenated to a decoy database in which the amino acid sequence of each entry was reversed (19,686 total entries). The search parameters (31) for LTQ-Orbitrap XL data were as follows: (1) precursor ion mass tolerance of ±20 ppm, (2) fragment ion mass tolerance of ±400 ppm, (3) only peptides with fully tryptic ends and unlimited missed cleavages were considered as candidates, and (4) a static modification of +57.02156 Da on cysteine residues resulting from carbamidomethylation. The search parameters for analysis of Q-Exactive data were identical except that precursor ion mass tolerance and fragment ion mass tolerance were each set to ±10 ppm.
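The reversed-sequence decoy construction and the ppm mass tolerances described above can be illustrated with a minimal Python sketch. The function names, the "Reverse_" prefix, and the toy sequences are hypothetical and chosen only for illustration; this is not the ProLuCID or IP2 implementation.

```python
def add_reversed_decoys(targets, prefix="Reverse_"):
    """Return the target entries plus one reversed-sequence decoy per entry."""
    combined = dict(targets)
    for name, seq in targets.items():
        combined[prefix + name] = seq[::-1]  # decoy = reversed amino acid sequence
    return combined


def within_ppm(observed_mz, theoretical_mz, tol_ppm):
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    error_ppm = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return abs(error_ppm) <= tol_ppm


# Hypothetical two-protein database, for illustration only.
toy_db = {"protA": "MKTAYIAK", "protB": "MSLLNR"}
db = add_reversed_decoys(toy_db)

# A 12 ppm error passes a 20 ppm tolerance but fails a 10 ppm tolerance.
passes_20ppm = within_ppm(500.006, 500.0, 20)
passes_10ppm = within_ppm(500.006, 500.0, 10)
```

Because every decoy is as long as its target and has the same amino acid composition, decoy hits give an estimate of how often random matches pass the same filters.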
Protein and peptide identifications were filtered using DTASelect and required at least two unique peptides per protein and a spectra-level false positive rate of less than 5% as estimated by a decoy database strategy (38). When protein-level false positive rates are estimated using a decoy database approach, these filtering criteria give a protein-level FDR of <1% for all data sets included in the manuscript. Normalized spectral abundance factor (NSAF) values including shared peptides were calculated as described and multiplied by 10^5 to improve readability (39). Proteins that could not be distinguished by available peptides in any given replicate were considered as a group. The numbers in the text refer to the number of protein groups, corresponding to the minimum number of proteins present. See supplemental Tables S8-S13. Proteomic mass spectrometry data have been deposited to the ProteomeXchange Consortium via the MassIVE partner repository with the data set identifier PXD001946 (40).
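For readers unfamiliar with NSAF, a minimal Python sketch of the basic calculation follows: each protein's spectral count is divided by its length, normalized to the sum over all proteins in the run, and scaled by 10^5. The function name and the toy counts are hypothetical, and the distribution of shared peptides used in the cited method (39) is deliberately omitted.

```python
def nsaf(spectral_counts, lengths, scale=1e5):
    """Basic NSAF: (SpC / L) for each protein, divided by the sum of SpC / L."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: scale * value / total for p, value in saf.items()}


# Hypothetical spectral counts and protein lengths (amino acids).
counts = {"protA": 30, "protB": 10}
lengths = {"protA": 300, "protB": 100}
values = nsaf(counts, lengths)
```

In this toy case both proteins have the same count per residue, so they receive equal NSAF values even though their raw spectral counts differ threefold.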
MS1 Analysis-To validate the PCF- and BSF-specific classifications made based on the initial qualitative data sets, additional surface proteomic analyses were performed for both BSF and PCF samples. These additional data sets were acquired on the Q-Exactive instrument and subjected to MS1-based, label-free quantitative analysis. Data were acquired as two technical replicates from single biological preparations for both BSF and PCF stages. Proteins were identified and filtered as above, and the identifications were used to generate spectral libraries within the Skyline v2.6 proteomic mass spectrometry software suite (41). Identifications were filtered within Skyline to only include fully tryptic, uniquely mapping peptides with no missed proteolytic cleavage sites. Peaks were picked in an automated fashion using the default Skyline peak picking model. Integrated peak areas were generated from extracted ion chromatograms for each peptide's [M], [M+1], and [M+2] isotopic precursor masses. The calculated peak areas were exported for statistical analysis using the linear mixed-effects model provided within the R package MSstats v2.3.4 (42). Settings for the group comparison within MSstats were as follows: (1) peak intensities were Log2 transformed, (2) intensity normalization between runs was accomplished by means of quantile normalization, (3) the scope of conclusions for biological and technical replication was set to restricted and expanded, respectively, and (4) settings for inclusion of interference transitions and assumption of equal feature variance were both set to "TRUE." p values were corrected within MSstats via Benjamini-Hochberg correction. See supplemental Table S13.
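The Benjamini-Hochberg adjustment mentioned above can be sketched in a few lines of Python. This is the standard step-up procedure written out for illustration, with a hypothetical function name and toy p values; it is not the MSstats code itself.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p value so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p values for four proteins.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```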
Volcano plots of Log2(PCF/BSF) ratios and significance were plotted using Microsoft Excel. Plots showing distributions of Log2(PCF/BSF) ratios in cell surface and whole-cell proteomes were plotted using GraphPad Prism. Proteins were binned by Log2(PCF/BSF) ratio using a bin width of 0.7, and proteins with Log2(PCF/BSF) ratios ≤ -5.25 or ≥ 5.25 were consolidated into the first and last bins, respectively.
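One way to reproduce this binning is sketched below in Python. It assumes the limits are symmetric at ±5.25, which gives 15 bins of width 0.7, and it clips out-of-range ratios so that they fall into the first or last bin. The function name and the example ratios are hypothetical, and the exact bin edges used in GraphPad Prism may differ.

```python
def bin_log2_ratios(ratios, width=0.7, limit=5.25):
    """Count ratios per bin; values beyond +/- limit go to the end bins."""
    n_bins = round(2 * limit / width)  # 15 bins for width 0.7 and limit 5.25
    counts = [0] * n_bins
    for r in ratios:
        clipped = max(-limit, min(limit, r))  # consolidate extreme values
        idx = min(int((clipped + limit) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical Log2(PCF/BSF) ratios.
counts = bin_log2_ratios([-8.0, -5.25, 0.0, 0.1, 5.25, 9.3])
```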
Enrichment in the Bound versus Input or Unbound Fractions-For bound versus input or unbound analyses, surface biotinylation was performed as described above. Half of the soluble Nonidet P-40 extracted supernatant was reserved for shotgun proteomic analysis (Input) and the other half was incubated with streptavidin beads to obtain the Bound and Unbound fractions. Bound fractions were analyzed by on-bead digestion as described above. Input and Unbound fractions were TCA precipitated and analyzed using shotgun proteomics as described above. See supplemental Table S12. IP2 software (Integrated Proteomics, San Diego, CA) was used to compare protein identification between samples (supplemental Table S4). The relative abundance of proteins within each fraction was determined by dividing the number of spectra for each protein over the total number of spectra for all proteins in that fraction. The Bound/Input and Bound/Unbound ratios for each protein were then determined based on relative abundance in the corresponding fractions.
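The enrichment calculation described above amounts to two steps: convert spectral counts to a fraction of all spectra in each sample, then take the ratio between fractions. A minimal Python sketch follows; the function names, protein names, and counts are hypothetical.

```python
def relative_abundance(spectra):
    """Spectra per protein divided by the total spectra in that fraction."""
    total = sum(spectra.values())
    return {p: n / total for p, n in spectra.items()}


def fraction_ratio(numerator, denominator):
    """Ratio of relative abundances for proteins found in both fractions."""
    num = relative_abundance(numerator)
    den = relative_abundance(denominator)
    return {p: num[p] / den[p] for p in num if p in den and den[p] > 0}


# Hypothetical spectral counts for one surface and one intracellular protein.
bound = {"surfaceX": 40, "ribosomalY": 10}
input_fraction = {"surfaceX": 10, "ribosomalY": 90}
ratios = fraction_ratio(bound, input_fraction)
```

A ratio above 1 indicates enrichment in the Bound fraction and a ratio below 1 indicates depletion. Proteins absent from the denominator fraction have no defined ratio and are skipped in this sketch.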
Bioinformatics Analyses-Prediction of membrane-targeting domains, reciprocal best BLAST analysis, and DAVID/GO classification are described in supplemental Experimental Procedures.
Comparison to Whole-cell SILAC Proteomes-Three stage-specific whole-cell proteomes have been published (43)(44)(45). A detailed comparison to the Urbaniak et al. (44) and Butter et al. (43) studies is shown in Fig. 5. The third study utilized a different trypanosome strain and analyzed parasites extracted from mice instead of culture (45) (supplemental Table S6), so was excluded from the comparison. The distribution of Log 2 (PCF/BSF) protein ratios from MS1 quantification of proteins in the cell surface proteome or SILAC quantification of proteins in the whole cell proteomes are plotted (Fig. 5B). For Butter et al. (43), replicate protein ratios from supplemental Table S3 in (43) were converted to Log 2 (PCF/BSF) and averaged. Urbaniak et al. (44) reported 10.6% of their whole cell proteome to be fivefold differentially expressed between the BSF and PCF life cycle stages. Our analysis of supplemental Table S3 in Butter et al. (43) indicates that 8.6% of the proteome was fivefold differentially regulated in both replicates. Proteins fivefold differentially regulated were considered stage-specific and proteins less than fivefold differentially regulated were considered constitutively expressed for analyses in Fig. 5C.
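The fivefold criterion translates to an absolute Log2(PCF/BSF) ratio of at least log2(5), roughly 2.32. A minimal Python sketch of that classification is given below; the function name and category labels are hypothetical, and treating exactly fivefold as stage-specific is an assumption.

```python
import math

FIVEFOLD = math.log2(5)  # about 2.32 on the Log2 scale


def classify(log2_pcf_over_bsf):
    """Label a protein by its Log2(PCF/BSF) ratio using a fivefold cutoff."""
    if log2_pcf_over_bsf >= FIVEFOLD:
        return "PCF-specific"
    if log2_pcf_over_bsf <= -FIVEFOLD:
        return "BSF-specific"
    return "constitutive"


# Hypothetical ratios for three proteins.
labels = [classify(3.0), classify(-2.5), classify(1.0)]
```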
Mass Spectrometry Analysis of Stage-enriched Cell Surface Family Members-The list of proteins comprising the T. brucei cell surface phylome (46) was kindly provided by Dr. Andrew Jackson. Families represented in the cell surface proteomes were analyzed with respect to stage-specificity of individual members. Categorization as BSF-specific or PCF-specific was strictly defined as exclusive expression in one or the other life cycle stage (Fig. 3). Owing to challenges presented by proteins with closely related sequences, we additionally examined the distribution of unambiguous peptides mapping to individual proteins (supplemental Table S5). Proteins detected by at least two unambiguous peptides in two different samples were categorized as putatively BSF-enriched, PCF-enriched, or constitutive. Phylogenetic trees (FigTree, http://tree.bio.ed.ac.uk/software/figtree/) display MUSCLE alignments of the protein sequences in each family. In the case of Fam51, four additional family members were added: two ESAG4s from the 221 VSG expression site (GI# 189094619, 1890946250), ACP4 (Tb927.10.13040) (47), and ACP2 (Tb927.10.16190) (48). For Fam57, the 221 expression site does not contain an ESAG10, so an ESAG10 from an alternate expression site was included (Tb427.BES15.1; GI# 189094656) to show the approximate relationship between PCF-specific and BSF-specific isoforms. Peptide data are included in supplemental Tables S10-S11.
Quantitative Real-time PCR (qRT-PCR)-Total RNA was extracted using Qiagen's RNeasy kit. DNase treatment was followed by reverse transcription using oligo(dT) primers for first-strand cDNA synthesis. qRT-PCR was performed as described (49) using iQ SYBR Green Supermix (Bio-Rad, Hercules, CA) on a CFX96 Touch Real-Time PCR Detection System (Bio-Rad, Hercules, CA). All analyses were performed in duplicate on two independent RNA preparations and values were normalized to PFR2 and TERT2 (50) using the 2^(-ΔΔCT) method (51). Gene-specific primers were designed using NCBI Primer-BLAST to amplify a region of 150-200 base pairs (see supplemental Experimental Procedures).
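As a worked illustration of the ΔΔCT calculation, the Python sketch below computes a fold change for one target gene between a sample and a calibrator condition. Averaging the CT values of the two reference genes is an assumption made here for simplicity; the function name and the CT values are hypothetical.

```python
def fold_change(ct_target_sample, ct_refs_sample, ct_target_calib, ct_refs_calib):
    """Relative expression by the 2^(-ddCT) method with averaged reference CTs."""
    # dCT: target CT minus the mean reference CT in the same condition.
    d_sample = ct_target_sample - sum(ct_refs_sample) / len(ct_refs_sample)
    d_calib = ct_target_calib - sum(ct_refs_calib) / len(ct_refs_calib)
    # ddCT: difference between conditions; lower CT means more transcript.
    return 2 ** -(d_sample - d_calib)


# Hypothetical CT values: the target crosses threshold 3 cycles earlier
# in the sample than in the calibrator, with unchanged reference genes.
fc = fold_change(22.0, [18.0, 20.0], 25.0, [18.0, 20.0])
```

Here the sample ΔCT is 3 and the calibrator ΔCT is 6, so ΔΔCT is -3 and the fold change is 2^3 = 8.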

RESULTS AND DISCUSSION
Purification of Cell Surface Proteins from T. brucei-We used a combination of cell surface biotinylation, affinity purification, and shotgun proteomic mass spectrometry (Fig. 1A) to determine the protein composition of the parasite surface from insect stage procyclic culture-form (PCF) and mammalian bloodstream form (BSF) T. brucei. Live cells were incubated with a cell-impermeant biotin conjugate to label surface-exposed proteins. Immunofluorescence against intracellular and cell surface markers confirmed that cells remained intact during surface biotinylation (supplemental Fig. S1A). Labeled cells were lysed with nonionic detergent and the detergent-soluble fraction was incubated with streptavidin beads to purify biotinylated proteins. The vast majority of proteins fractionated with unbound material, whereas biotinylated proteins were quantitatively purified with the streptavidin-bound fraction (Fig. 1B). Known surface proteins VSG and procyclin copurified with the biotinylated fraction, whereas known intracellular proteins such as BiP were almost exclusively in the unbound fraction (Fig. 1C). As a control, streptavidin purification was also performed on unbiotinylated samples from each life cycle stage. No biotinylated proteins were detected by anti-biotin staining (supplemental Fig. S1B) and in the absence of surface biotinylation, VSG and procyclin were restricted to the unbound fraction. Thus, surface biotinylation and streptavidin purification enabled effective enrichment of T. brucei surface proteins.
VSG Depletion Improves Detection of Other Surface Proteins-Abundance of the major surface proteins (5-10 million copies/cell), procyclin on PCF cells and VSG on BSF cells, poses a potential barrier to a comprehensive analysis of the parasite surface proteome. Shotgun proteomic analysis of the streptavidin-bound fraction from PCF cells revealed very few spectra for procyclin (Fig. 2A), presumably because of its lack of tryptic cleavage sites. In contrast, the overwhelming majority of mass spectra (nearly 80%) from BSF samples mapped to VSG, presenting a potential dynamic range challenge for reliable detection of low abundance proteins. To overcome this, we took advantage of endogenous GPI-specific phospholipase C (GPI-PLC) activity to release VSG from the cell surface (30) prior to streptavidin purification of surface proteins (Fig. 2B). Using this strategy we obtained a streptavidin-bound fraction that was relatively free of VSG and intracellular marker proteins EIF4AI and BiP based on Western blotting (Fig. 2C). To determine whether VSG depletion improved detection of non-VSG proteins, we compared proteomic analyses of VSG-depleted and nondepleted samples. VSG depletion increased the total number of proteins identified by >50% and reduced the proportion of mass spectra corresponding to VSG from ~80% to 15% of total spectra (Fig. 2A). Spectra from another GPI-anchored surface protein, transferrin receptor, were also reduced, though not absent (Fig. 2D). In contrast, spectra derived from other surface proteins increased markedly, whereas spectra from known intracellular contaminants remained relatively constant. Therefore, VSG depletion dramatically increased sensitivity of detection for non-VSG surface proteins.

Fig. 2 legend (partial): The majority of VSG is removed in the soluble VSG fraction (V), leaving no detectable VSG in the VSG-depleted pellet fraction (P*). Intracellular markers EIF4AI (cytoplasmic) and BiP (ER lumen) are removed in the cytosolic (C) or unbound (U) fractions. Numbers below the blots indicate relative number of cell equivalents loaded. D, Relative abundance of individual known surface proteins is shown as a fraction of total mass spectra. Black bars represent the mean from VSG-depleted samples, whereas white bars represent the mean from samples without VSG depletion (n = 3 for each). Known surface proteins are transferrin receptor (TfR), invariant surface glycoproteins (ISGs), glucose transporter (THT1), adenylate cyclases (ACs), aquaporins (AQPs), and nucleoside transporters (TbNTs). The dashed lines (Bkgrd) indicate the average relative abundance of known intracellular proteins (e.g. proteins annotated as ribosomal proteins and histones) from VSG-depleted (black line) and nondepleted (gray line) samples.

Surface Proteomes Provide High-confidence Data Sets of Surface Protein Candidates-Having established a method for effective enrichment of T. brucei surface proteins, we performed surface biotinylation, purification, and proteomic analysis on multiple biological replicates for each life cycle stage. To minimize the effects of sample variation, only proteins reproducibly identified in three of three VSG-depleted BSF samples or at least four of five PCF samples were considered high-confidence surface protein candidates (Fig. 3). This yielded data sets of 239 BSF proteins and 198 PCF proteins, referred to as BSF and PCF surface proteomes, respectively (supplemental Table S1), for a combined surface proteome of 372 nonredundant proteins. As expected, the surface proteomes contain most currently known and suspected classes of surface proteins (supplemental Table S2) and are enriched for proteins with predicted membrane-association domains (supplemental Fig. S2A).
Notably, despite use of GPI-PLC to deplete VSG, proteins with predicted GPI anchors were still identified. Endogenously biotinylated proteins were not a significant confounding factor, as proteomic analyses of two unbiotinylated samples from each life cycle stage identified very few proteins (supplemental Table S3).
Rigorous assessment of surface location demands a great deal of effort for any single protein, let alone a large set of randomly selected proteins, which would be necessary to use localization as a means of evaluating the surface proteome data set. We therefore evaluated the data set as a whole by determining the relative abundance of known surface and intracellular proteins in streptavidin-bound versus unbound and input fractions. In BSF cells, most surface protein controls (19 of 24) were enriched in the bound fraction (Bound/Input > 1), whereas most intracellular controls (78 of 87) were depleted (Bound/Input < 1) (Fig. 4, supplemental Table S4). Strikingly, among the controls examined, an enrichment ≥2.0 was observed almost exclusively for bona fide surface proteins. Likewise, in PCF cells, a fold-enrichment of ≥2.5 was observed for 15 of 20 known surface proteins, but only three of 96 intracellular controls (Fig. 4, supplemental Table S4). Notably, 46 proteins meeting this enrichment threshold among BSF or PCF proteomes are annotated as hypothetical (supplemental Table S1), with no clear conserved domains, indicating novel functionalities for trypanosome surface proteins.
As with any biochemical purification or analysis of the scale presented here, we expect to miss some surface proteins and to identify some false positives. Nonetheless, several independent lines of evidence show that the surface proteomes reported here represent high-confidence data sets of candidate surface proteins. These include high quality of the purified samples, multiple independent biological replicates to minimize the impact of prep-to-prep variability, excellent coverage of known surface proteins, and label-free quantitation of intracellular and surface protein controls in the Bound versus Input fractions.
The Parasite Surface Proteome is Enriched for Life Cycle Stage-specific Proteins, Including Stage-specific Paralogs within Protein Families-As different hosts present varied environments and challenges, stage-specific surface proteins are primary contributors to host-parasite interactions. We therefore examined cell surface proteomes for proteins that were exclusively identified in only one life cycle stage. We identified 72 BSF-specific proteins and 74 PCF-specific proteins (Fig. 3, supplemental Table S1). Note that proteins present in both life cycle stages, but upregulated or downregulated between life cycle stages, would be missed by this approach. However, the stage-specific proteins identified constitute the most prominent and robust changes between the surface proteomes during the parasite's transmission between hosts. To test the robustness of these stage-specific assignments, we completed additional independent surface proteome analyses using MS1 label-free quantitation, with two technical replicates for each life cycle stage. Nearly all proteins assigned as stage-specific could be quantitated in MS1 analyses (supplemental Table S1), and of these, the majority showed significant (p < 0.01) up-regulation in the assigned life cycle stage (Fig. 5A). Therefore, these analyses provided strong statistical support for stage-specific assignments based on exclusive detection in one or the other life cycle stage. In addition, MS1 analyses revealed stage-specific enrichment of several proteins in the surface proteome that were not exclusive to one stage (supplemental Fig. S3, supplemental Table S1).

Fig. 3 legend: The BSF and PCF cell surface proteomes were defined as proteins reproducibly identified in three of three VSG-depleted BSF or at least four of five PCF replicates, respectively. Proteins that could not be distinguished by available peptides in any given replicate were considered as a group (see supplemental Tables S1 and S8-S11). The numbers listed are the number of protein groups identified, corresponding to the minimum number of proteins present. The number of nonredundant protein groups identified in the combined surface proteome from either life cycle stage was 372. Stage-specific proteomes were defined as proteins or protein groups that were exclusive to one life cycle stage (supplemental Table S1).
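The two filters used to define the surface proteomes and the stage-specific subsets can be expressed as simple set operations. The Python sketch below is one reading of those definitions: a protein enters a surface proteome if it is found in a minimum number of replicates, and it is called stage-specific only if it was never identified in any replicate of the other stage. The function name and the single-letter protein identifiers are hypothetical, and protein grouping is ignored.

```python
def reproducible(replicates, min_hits):
    """Proteins identified in at least min_hits of the replicate sets."""
    counts = {}
    for rep in replicates:
        for protein in set(rep):
            counts[protein] = counts.get(protein, 0) + 1
    return {p for p, n in counts.items() if n >= min_hits}


# Hypothetical identifications: three BSF and five PCF replicates.
bsf_reps = [{"A", "B", "G"}, {"A", "B", "G"}, {"A", "B", "G", "D"}]
pcf_reps = [{"B", "E", "G"}, {"B", "E"}, {"B", "E", "F"}, {"B", "E"}, {"B"}]

bsf_proteome = reproducible(bsf_reps, 3)  # three of three replicates
pcf_proteome = reproducible(pcf_reps, 4)  # at least four of five replicates
combined = bsf_proteome | pcf_proteome

# Stage-specific: in one proteome and absent from every replicate of the other stage.
bsf_specific = bsf_proteome - set().union(*pcf_reps)
pcf_specific = pcf_proteome - set().union(*bsf_reps)
```

In this toy example protein G is reproducible in BSF but appears once in PCF, so it belongs to the BSF surface proteome without being BSF-specific. This is why the stage-specific counts are smaller than a plain difference between the two proteomes would give.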
The cell surface is the direct interface with the host. We therefore asked whether the cell surface proteome is enriched for stage-specific proteins compared with the whole-cell proteome. The distribution of relative protein abundance in BSF and PCF surface proteomes was plotted against the distributions in published BSF and PCF whole cell proteomes (43,44). The data show that the cell surface proteome is enriched for proteins that are differentially regulated between BSF and PCF stages (Fig. 5B). Likewise, among proteins identified in whole-cell proteome analyses (43,44), roughly 25% of the stage-specific proteins are present in the cell surface proteome, as compared with only 5% of constitutive proteins (Fig. 5C). Therefore, a substantial fraction of developmentally regulated gene expression changes in T. brucei are devoted to remodeling the cell surface.
We noticed that paralogs from large protein families (46) were well-represented among stage-specific surface proteins (Fig. 6, supplemental Fig. S4, and supplemental Table S5). Although stage-specificity has been described for some families, e.g. adenylate cyclases (47), glucose transporters (52), and nucleoside transporters (53), stage-specific regulation was not previously known for others, e.g. hypothetical proteins (Fam72 and Fam79) (Fig. 6, supplemental Tables S1 and S5). We therefore used qRT-PCR to independently assess stage-specific expression for several of these protein families (Fig. 6 and supplemental Fig. S4). The qRT-PCR results fully supported the stage-specificity determined by mass spectrometry, validating the mass spectrometry analysis and suggesting that dedicated use of distinct paralogs in different environments is a common feature of T. brucei host adaptation.
Implications for Expanded Gene Families and Host Adaptation-One of the largest protein families encoded by the T. brucei genome is a family of ~75 receptor-type adenylate cyclases (ACs) (Fam51 (46), Fig. 6). The canonical AC is the BSF-specific ESAG4 (54,55), which contributes to parasite virulence by modulating the immune response of the mammalian host (48). Recent work described five ACs that are specifically upregulated in PCF cells (47), one of which regulates social behavior (56), showing that both BSF-specific and PCF-specific ACs mediate parasite interaction with the environment. In addition to these previously identified ACs, we identify two additional PCF-specific and BSF-specific ACs (supplemental Tables S1 and S5, Fig. 6), further supporting an important role for ACs in host-specific adaptation. We speculate that individual ACs not identified in our studies are optimized for host environments that are not recapitulated in cultured PCF and BSF forms.
Gene expansion is a well-known phenomenon in T. brucei and other kinetoplastid parasites (57). The reasons for gene expansion are not clear, though it has been suggested to reflect redundant gene duplication. Based on the size of the T. brucei AC family and the identification of PCF-specific ACs, we have previously proposed that the greatly expanded repertoire of T. brucei ACs reflects the relative complexity of the life cycle within the insect vector, as compared with other kinetoplastids (47,56). Our discovery in the present work that stage-specific expression of distinct paralogs is common across multiple families argues against simple redundancy as the explanation for gene expansion. Moreover, some families are described to have undergone faster diversification since the divergence of African trypanosomes from other trypanosomatids (46), indicating that gene expansion confers a selective advantage. Thus, rather than redundancy, our results suggest surface protein families have expanded in response to the need for alternately expressed isoforms that are each tuned to a specific host environment, as has been suggested for the nucleoside transporter family (53). This would allow the parasite to continually meet essential cellular needs that are common in all environments, such as uptake of an essential nutrient, while still responding to unique pressures imposed by hostile and markedly different environments in each host. Given that adapting to a multitude of different extracellular environments is intrinsic to a parasitic lifestyle, we speculate this is a paradigm that applies broadly to surface proteome remodeling in other parasites.

Figure legend: Input, Unbound, and Bound fractions from a single independent PCF or VSG-depleted BSF surface purification were analyzed by shotgun proteomics. The relative abundance of individual proteins was determined as a percentage of the total mass spectra in each fraction, and the ratio of relative abundance in the Bound versus Input fractions is plotted for several known surface proteins (black) and intracellular proteins (gray) in ascending order (see columns in supplemental Table S4). Intracellular controls included alpha and beta tubulin, two dynein heavy chains, BiP, and proteins annotated as ribosomal proteins or histones. A ratio >1 indicates enrichment, whereas a ratio <1 indicates depletion. Proteins identified in the Bound but not Input fraction were plotted as >10 (BSF) or >40 (PCF). Note that the graph excludes proteins not identified in the Bound fraction.
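The Bound/Input enrichment ratio described in the figure legend can be illustrated with a minimal Python sketch. It assumes that per-protein spectral counts are available for each fraction. The function names, protein labels, and counts are hypothetical and are not taken from the study. Proteins found in Bound but not Input are given an infinite ratio here, which a plot would display as capped (for example, >10 or >40).

```python
from typing import Dict


def relative_abundance(spectral_counts: Dict[str, int]) -> Dict[str, float]:
    """Percentage of the total spectra in one fraction contributed by each protein."""
    total = sum(spectral_counts.values())
    if total == 0:
        raise ValueError("fraction contains no spectra")
    return {protein: 100.0 * n / total for protein, n in spectral_counts.items()}


def bound_input_ratios(bound: Dict[str, int], input_: Dict[str, int]) -> Dict[str, float]:
    """Bound/Input ratio of relative abundance for every protein identified in Bound.

    Proteins not identified in Bound are excluded, mirroring the figure legend.
    Proteins identified in Bound but not Input get float('inf').
    """
    bound_pct = relative_abundance(bound)
    input_pct = relative_abundance(input_)
    ratios = {}
    for protein, pct in bound_pct.items():
        if pct == 0.0:
            continue  # zero spectra in Bound: treated as not identified
        in_pct = input_pct.get(protein, 0.0)
        ratios[protein] = pct / in_pct if in_pct > 0.0 else float("inf")
    return ratios


# Hypothetical spectral counts, for illustration only.
input_counts = {"surface_A": 10, "tubulin": 90}
bound_counts = {"surface_A": 60, "tubulin": 30, "surface_B": 10}

ratios = bound_input_ratios(bound_counts, input_counts)
enriched = sorted(p for p, r in ratios.items() if r > 1.0)  # ratio > 1: enrichment
depleted = sorted(p for p, r in ratios.items() if r < 1.0)  # ratio < 1: depletion
```

In this toy example the surface proteins have ratios above 1 and tubulin falls below 1, which is the pattern the legend describes for known surface and intracellular controls.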
Cell Surface Proteomes Represent the First Direct Analysis of Stage-specific Surface Proteome Remodeling-Aside from major surface proteins, the parasite surface proteome has remained largely uncharacterized. Previous efforts to define life cycle stage-specific proteomes have almost exclusively focused on the whole-cell proteome (43-45) (supplemental Table S6). The flagellum surface proteome has been examined (28,58). However, efforts to define the T. brucei whole-cell surface proteome have been limited to in silico prediction (46) and a subtractive proteomic analysis of crude fractions containing plasma membranes and cytoskeletons from bloodstream form parasites (59) (supplemental Table S6). In silico predictions are powerful, but they do not distinguish between cell surface and intracellular membranes. Likewise, though subtractive proteomics can be a useful approach, no attempt was made to directly separate surface-exposed proteins from intracellular contaminants and no comparable analysis was performed in procyclic-stage parasites. Therefore, the cell surface proteomes described here present the first analyses to directly isolate and define the cell surface proteome and to do so from both the insect and mammalian life cycle stages of T. brucei (supplemental Table S6).

Functional Implications of the Cell Surface Proteomes
Proteomic Analyses Identify a Large and Diverse Surface Proteome Important for Host-Parasite Interaction and Therapeutics-An important finding to come from our studies is that the T. brucei cell surface proteome is larger and more diverse than generally appreciated. Surface proteins identified encompass a broad range of molecular functionalities (supplemental Table S7), including activities anticipated for cell surface function, such as receptors, transporters and proteases, as well as numerous proteins of completely unknown function. Approximately 10-20% of the BSF and PCF surface proteomes are restricted to kinetoplastids, with nearly half of those being exclusive to T. brucei (supplemental Fig. S2).
Only a handful of T. brucei surface proteins have been studied in depth and virtually all perform critical functions, particularly in the context of host-parasite interaction and therapeutics. Examples include VSG and transferrin receptor in BSF cells (60-62), as well as surface proteins modulating resistance to human serum (63,64) and recent studies showing virulence functions for ESAG4 and calflagin (48,65). In procyclic forms, procyclin is required for robust infection and transmission through the tsetse fly (66). From a clinical perspective, parasite genes impacting the efficacy of all five currently available drugs have been identified and the majority of these encode surface-exposed proteins (67-71). Importantly, nearly all of these proteins were identified in the cell surface proteome, emphasizing the functional relevance of the surface proteome. Thus, our studies define a cohort of proteins important for host-parasite interaction with potential to impact success of therapeutic interventions.

Figure legend: A, Stage-specific proteins exclusively identified in either the cell surface proteome from BSF cells (red circles) or PCF cells (blue circles) were analyzed by MS1 label-free quantitation in two technical replicates each. The Log2(PCF/BSF) fold change measured for each protein is plotted on the x axis. Statistical significance is plotted on the y axis, with p < 0.01 above the dashed line. B, Graph shows the distribution of Log2(PCF/BSF) protein ratios from MS1 analysis of the cell surface proteome (shaded area), relative to whole-cell proteomes determined by SILAC analyses (*Urbaniak et al. (44), black dashes; **Butter et al. (43), black line). C, Chart shows the relative fraction of constitutive and stage-specific proteins identified in published whole-cell proteome SILAC analyses (43,44) that were also identified in the cell surface proteome. Proteins at least fivefold differentially regulated between PCF and BSF in published SILAC studies were defined as stage-specific (43,44).
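The fivefold criterion for stage specificity and the Log2(PCF/BSF) scale used in the figure legend can be sketched in Python as follows. This is only an illustration of the arithmetic, assuming a PCF/BSF abundance ratio is already available for each protein. The function name and the example ratios are hypothetical.

```python
import math
from typing import Tuple

FOLD_THRESHOLD = 5.0  # "at least fivefold" cutoff stated in the figure legend


def classify_stage(pcf_over_bsf: float,
                   fold_threshold: float = FOLD_THRESHOLD) -> Tuple[float, str]:
    """Return (log2 fold change, label) for one protein's PCF/BSF ratio."""
    if pcf_over_bsf <= 0:
        raise ValueError("ratio must be positive")
    log2_fc = math.log2(pcf_over_bsf)
    if pcf_over_bsf >= fold_threshold:
        label = "PCF-specific"      # at least fivefold higher in PCF
    elif pcf_over_bsf <= 1.0 / fold_threshold:
        label = "BSF-specific"      # at least fivefold higher in BSF
    else:
        label = "constitutive"
    return log2_fc, label


# Hypothetical ratios, for illustration only.
examples = {ratio: classify_stage(ratio) for ratio in (8.0, 0.125, 2.0)}
```

On the log2 scale the fivefold cutoff corresponds to about ±2.32, so a protein with Log2(PCF/BSF) of 3 is classed as PCF-specific and one with a value of 1 as constitutive.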
The Surface Proteome Prioritizes Proteins Important for Fitness and Host-Parasite Interaction-Our surface proteome analyses advance systems-level studies of protein function in T. brucei. For example, unbiased high-throughput RNAi target sequencing (RIT-seq) analysis has identified nearly 4500 genes that impact parasite fitness or differentiation from bloodstream to procyclic forms in culture (72). Although these 4500 genes are interesting for their potential as drug targets, there are far too many candidates for effective prioritization. We conducted a meta-analysis of the surface proteome versus the RNAi target data set. Our analyses substantially pare down the number of proteins required for parasite fitness from 4500 (72) to ~65 that are surface-exposed based on enrichment in Bound versus Input fractions (supplemental Table S1) and thus accessible to small molecules added to live cells. These genes can therefore be prioritized for investigation as therapeutic targets that may circumvent the need for cell-permeant drugs. Moreover, nearly half of these proteins are of unknown function, emphasizing novel features of the cell surface proteome and suggesting host-interaction functions for ~30 of the ~5000 T. brucei genes annotated as hypothetical (73). Relevance to host-parasite interaction also comes from considering stage-specific expression, as our analyses distinguish surface proteins, which would directly interface with the host, from intracellular proteins, such as those uncovered in whole proteome stage-specific analyses (43-45), that may simply reflect downstream metabolic or structural consequences of differentiation.

Figure legend: … (46) for which we identified life cycle stage-specific paralogs. Families are adenylate cyclases (ACs), amino acid transporters (AATs), and two families annotated as hypothetical proteins. Proteins are colored according to the predicted expression profile inferred from mass spectrometry data (supplemental Table S5). (Red = BSF-enriched; Blue = PCF-enriched; Green = Constitutive; Gray = Not identified; Black = identified but expression profile could not be reliably predicted). The total number of proteins in each family is shown along with the number of BSF-enriched and PCF-enriched proteins identified in the cell surface proteomes. If two or more proteins could not be distinguished based on peptides identified they are counted as a single entry, and the group is indicated by a bracket in the phylogenetic tree. ‡ Proteins for which stage-specific expression has been previously described (see supplemental Table S5 for references). *Proteins whose stage-specific expression was examined by qRT-PCR in panel B. B, Expression levels for the indicated genes were determined by qRT-PCR analysis using RNA from BSF (red) and PCF (blue) cells. Data are the mean ± S.D. of two independent biological replicates and expression is normalized to the stage having higher expression. **Groups of closely related genes detected by the primer pair used for qRT-PCR analysis (189094619/189094625; Tb927.7.6050/Tb927.7.6060/Tb927.7.6070; Tb927.4.4820/Tb927.4.4840/Tb927.4.4860).
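The meta-analysis of the surface proteome against the RIT-seq fitness data set amounts to an intersection filter, sketched below in Python. This is a minimal sketch under stated assumptions, not the authors' pipeline: the gene identifiers and ratios are hypothetical, and the cutoff of Bound/Input > 1 is an assumption based on the legend's statement that a ratio >1 indicates enrichment.

```python
from typing import Dict, Iterable, List


def prioritize_surface_targets(fitness_genes: Iterable[str],
                               bound_input_ratio: Dict[str, float],
                               min_ratio: float = 1.0) -> List[str]:
    """Keep fitness-associated genes whose products are enriched in the Bound fraction.

    min_ratio=1.0 is an assumed cutoff (ratio > 1 taken to mean surface enrichment).
    Genes without a measured Bound/Input ratio are dropped.
    """
    return sorted(
        gene for gene in set(fitness_genes)
        if bound_input_ratio.get(gene, 0.0) > min_ratio
    )


# Hypothetical gene identifiers and ratios, for illustration only.
fitness_hits = ["gene_A", "gene_B", "gene_C", "gene_D"]
surface_ratios = {"gene_A": 12.0, "gene_B": 0.4, "gene_C": 2.3, "gene_E": 8.0}

candidates = prioritize_surface_targets(fitness_hits, surface_ratios)
```

Only genes that appear in both data sets and pass the enrichment cutoff remain, which is how a list of thousands of fitness-associated genes can be reduced to a short list of surface-exposed candidates.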

Cell Surface Proteome Analyses Distinguish between Cell Surface and Intracellular Membranes-Our data provide a valuable resource for resolving outstanding questions regarding individual proteins. As an example, conflicting evidence for surface exposure of GPI-PLC has been reported (74,75). We observe a Bound/Input ratio <1 for this protein (supplemental Table S1), supporting intracellular localization. Similarly, a Bound/Input ratio of <1 for the virulence factor metacaspase 4 (supplemental Table S1) supports previous data suggesting it is predominantly intracellular (76). In contrast, we see strong surface enrichment for flagellar calflagins (65) and a flagellum tip-localized calpain-like protease (77) (supplemental Table S1), suggesting these proteins function at the flagellum surface. Our Bound/Input analyses also provide insight into the large family of T. brucei ABC transporters. Out of ~20 annotated ABC transporters in the genome, only four have Bound/Input ratios of ≥2.3, indicating they function in substrate exchange with the host environment, while the remainder function intracellularly (supplemental Table S1). Another interesting discovery is a group of three phospholipid-transporting ATPases that are enriched ≥12-fold in the Bound fraction compared with Input (supplemental Table S1). A role for phospholipid transporters at the cell surface has not been shown, but could be important for host-parasite interactions by maintaining distinct lipid compositions of the cell surface membrane and critical subdomains, such as the flagellum and flagellar pocket membranes (78). These proteins will be interesting targets for further functional studies.
Summary and Perspective-The surface of parasitic protozoa remains vastly understudied relative to its importance for parasite biology, pathogenesis, transmission, and clinical influence. Here we find that the T. brucei surface proteome is larger and more diverse than generally recognized and includes proteins known or predicted to be important for host-parasite interaction and drug action. Additionally, our studies reveal extensive remodeling of the surface proteome, indicating the parasite strives to accommodate specific requirements of hostile host environments while continuing to maintain fundamental processes essential for viability.