Immunomic identification of malaria antigens associated with protection in mice


Summary:
Developing a vaccine against malaria is a major research goal. Two observations suggest that a protective vaccine is feasible: 1) sterile protection can be obtained when the host is exposed to live parasites, and 2) immunity against blood stage parasites is principally mediated by protective antibodies. However, only a small number of proteins have been investigated so far, and most of the Plasmodium proteome has yet to be explored. To date, only a few immunodominant antigens have emerged for testing in clinical trials, and no formulation has led to substantial protection in humans. The nature of the parasite molecules associated with protection remains elusive.
Here, immunomic screening of mouse immune sera with different protection efficiencies against the whole parasite proteome allowed us to identify a large repertoire of antigens, which was validated by screening a library expressing antigens. Weighted scores reflecting the likelihood of protection of each antigen, calculated using five predictive criteria derived from immunomic and proteomic datasets, highlighted a priority list of protective antigens. Altogether, the approach sheds light on antigens conserved across Plasmodium that are amenable to targeting by the host immune system upon merozoite invasion and blood stage development. Most of these antigens have preliminary protection data but have not been widely considered as candidates for vaccine trials, opening new perspectives beyond the limited choice of immunodominant, poorly protective vaccines that are currently the focus of malaria vaccine research.
To date, there is no reliable approach for predicting vaccine candidates implicated in parasite elimination and control. The current selection of vaccine targets is based on a variety of approaches which, while not unreasonable, are not systematic. For example, immunodominant antigens have been explored because sera were easily generated against them and because numerous lines of experimental evidence suggested a possible protective effect.
However, the efficacy of these antigens in clinical trials was unsatisfactory due to antigenic polymorphisms, variations across parasite strains and species, and the need to maintain a high antibody titer for substantial protection (24-28).
In addition, the accuracy of such approaches is affected by multiple factors such as the coverage of the protein in the library, the folding of the antigens, the absence or presence of post-translational modifications and, in the case of variant proteins, the polymorphism between parasite clones/isolates (reviewed in (29,30)). Thus, the repertoire of antigens derived from antigen libraries remains incomplete. Alternatively, immunoprecipitation (IP) coupled to MALDI-TOF analysis was used to expand the size of the proteome screened. Such an approach was used to identify parasite antigens recognized by mouse immune sera from an internal parasite lysate (31). However, with only four antigens identified, further improvements are required.
Although experimental and epidemiological data have clearly demonstrated that a protective immune response can develop against malaria parasites, no vaccine formulation has been able to induce a sufficient level of protection. RTS,S, the most clinically advanced vaccine, confers only ~30% protection against P. falciparum in children aged from 6 to 12 weeks and ~50% protection in children aged from 5 to 17 months (32,33). Furthermore, the protection was undetectable 3 years post vaccination (34). Thus, more information is needed to develop a vaccine able to yield life-long sterile immunity. It is likely that protection against blood stage parasites results from a strong humoral response targeting a set of non-immunodominant antigens that are yet to be identified.
Here, combining multiple immunomic and proteomic approaches, we developed a strategy to determine the whole repertoire of antigens associated with protective humoral immunity in mice against a murine malaria parasite. For this, sera conferring different levels of protection against erythrocytic P. yoelii parasites were generated and screened for reactivity against the whole parasite proteome. Reactive parasite antigens were then categorized according to their likelihood of mediating protection using a range of predictive criteria. This combined approach allowed the prediction of a novel set of immune protective proteins. The data generated here can now serve as a valuable resource for the rational development of a malaria blood stage vaccine.
pooled sera contained sera collected from 11 mice challenged three times and one mouse challenged four times. The blood was collected by cardiac puncture after confirmation of the presence of parasites six weeks post immunization (None-protected mice) or two weeks after the last challenge following validation of parasite clearance (Self-cured, Chloroquine-treated and Attenuated mice). The samples were stored overnight at 4°C prior to centrifugation for 10 min at 16,100 × g to collect the serum.
To assess the protective efficiency of the immune sera, 500 µl of naïve or immune pooled sera were passively transferred into naïve BALB/c mice. The mice were then immediately infected with 10^6 Plasmodium yoelii 17X 1.1 parasites and monitored by thin tail blood smear until a parasitaemia of 1% was observed.
All inoculations of parasites and mouse sera were performed by intravenous injection into the retro-orbital sinus vein. Chloroquine and pyrimethamine were administered by the intraperitoneal route.
Isolation of antigens using non-denaturing condition.
BALB/c mice were infected by intraperitoneal injection with P. yoelii 17X. Highly purified samples of trophozoite and schizont stage parasites were obtained using a 50-60% Nycodenz gradient (Sigma) as previously described (35). For each biological replicate, an aliquot of ~100 μl of infected RBC was solubilized using the ProteoExtract® Native Membrane Protein Extraction Kit (Merck) following the manufacturer's instructions to obtain parasite proteins extracted in their native conformation and enriched in transmembrane proteins. The native parasite extracts were then incubated overnight at 4°C with protein A/G agarose beads crosslinked to immune sera using the Crosslink IP Kit (Pierce) following the manufacturer's instructions. The whole procedure was repeated until four (None-protected and Chloroquine-treated sera) or three (Self-cured and Attenuated sera) biological replicates were obtained. The IP samples were analyzed by label-free quantitative LC-MS/MS.

Isolation of antigens using denaturing condition.
Denatured samples of trophozoite and schizont stage parasites were obtained by solubilizing 500 µl of infected RBC using the ProteoExtract® Complete Mammalian Proteome Extraction Kit (Merck) following the manufacturer's instructions. The samples were then quantified using the 2-D Quant Kit (GE) and divided into aliquots of 500 µg of protein to be separated on 2D gels as described previously (38). Briefly, five aliquots were loaded on 24 cm Immobiline DryStrips (pH 3-7 NL, nonlinear pH gradient) (GE Healthcare) for the first-dimension separation using an Ettan IPGphor 3. Next, the strips were placed on top of 11% SDS-PAGE gels (25.5 × 20 cm) for the second-dimension electrophoresis using an Ettan Daltsix Electrophoresis System (GE Healthcare). After separation of the five infected RBC samples, the proteins from two gels were transferred onto low fluorescence PVDF membranes (Biorad) using the Trans-Blot® Cell (Biorad) for western blotting. The remaining gels were stained either fluorescently (Pierce™ Krypton fluorescent gel stain) (n=1) or with silver (Pierce™ Silver Stain for Mass Spectrometry) (n=2). Following blocking using Odyssey™ Blocking Buffer (Licor), the first membrane was probed with None-protected sera. The second membrane was probed with Chloroquine-treated sera, then stripped and re-probed using Attenuated sera. All the sera used were diluted 1/1000 in Odyssey™ Blocking Buffer. The membranes were then revealed using Alexa 649 (None-protected and Chloroquine-treated) or Alexa 488 (Attenuated) secondary antibody (Jackson laboratory) diluted at 1/3000. The probed membranes were imaged using a GE Typhoon Trio scanner prior to staining with Coomassie blue. The whole proteome pattern of the fluorescently labeled gel was revealed using the GE Typhoon Trio scanner.
The areas containing the antigens probed by the immune sera were manually identified by superimposing the 2D western blot, Coomassie blue and fluorescently stained gel images. The counterparts of these areas were excised from the silver stained gels and analyzed by LC-MS/MS shotgun analysis.
Validation of the repertoire of antigens.
Ninety antigens were expressed in mammalian CHO-K1 cells. For this, 170 open reading frames (ORFs) containing the full coding sequence or overlapping parts (for genes >1.2 kb) of these antigens, excluding the hydrophobic regions, were cloned into the pDisplay vector (Thermo) to be expressed in fusion with a Myc and an HA tag (Table S3, antigen validation). The ORFs were then transiently expressed in CHO-K1 cells cultivated in 96-well plates using Lipofectamine LTX transfection reagent (Thermo). Following fixation, the transfected CHO-K1 cells were screened by immunofluorescence assay (IFA) with an Olympus IX71 fluorescent microscope, using anti-tag antibodies (Abcam) and mouse immune sera (None-protected, Self-cured, Chloroquine-treated or Attenuated) diluted at 1/800. The location and size of the antigen part expressed are summarized in Table S2, antigen validation.
The recognition patterns determined after observation of the cells immunolabeled with immune sera are reported as 'yes' or 'no', standing for positive or negative labeling, respectively.

Proteome analysis of blood stage parasite.
Aliquots of 50 µl of purified P. yoelii at ring, trophozoite and schizont stage were prepared using a 50%-80% Nycodenz (Sigma) gradient as previously described (35). To obtain the merozoite sample, schizont parasites were first cultivated until maturity in RPMI1640 media containing 20% FBS with gentle shaking at 37°C. Then, an ISOPORE membrane filter (Millipore) was pre-wetted with incomplete RPMI1640, and 500 µl of mature schizonts were passed through the membrane to release ~50 µl of free merozoites that were collected and washed with RPMI1640 media and PBS. All samples were quantified using the 2-D Quant Kit (GE) following the manufacturer's instructions. One hundred micrograms of each sample were separated by SDS-PAGE and analyzed using label-free quantitative LC-MS/MS. To identify proteins associated with the merozoite stage, we searched for proteins characterized by an increasing abundance pattern over the erythrocytic stages and a peak of abundance in the merozoite stage (i.e. proteins with a relative abundance in the ring ≤ trophozoite ≤ schizont < merozoite stage).
In addition, proteins without signal peptide and/or transmembrane domain were also filtered out to exclude internal proteins most likely not associated with protection.
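The two selection rules above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout, the protein identifiers and the abundance values are hypothetical, and the sketch assumes that the proteins filtered out are those lacking both a signal peptide and a transmembrane domain.

```python
from dataclasses import dataclass


@dataclass
class StageProfile:
    """Relative abundance of one protein at each blood stage (hypothetical layout)."""
    protein_id: str          # made-up identifier, for illustration only
    ring: float
    trophozoite: float
    schizont: float
    merozoite: float
    has_signal_peptide: bool
    has_tm_domain: bool


def is_merozoite_associated(p: StageProfile) -> bool:
    # Abundance pattern stated in the text: ring <= trophozoite <= schizont < merozoite
    rising = p.ring <= p.trophozoite <= p.schizont < p.merozoite
    # Assumption: only proteins lacking BOTH features are filtered out
    secreted_or_membrane = p.has_signal_peptide or p.has_tm_domain
    return rising and secreted_or_membrane


# Toy values, not taken from the paper
profiles = [
    StageProfile("prot_A", 0.1, 0.2, 0.5, 2.0, True, False),   # kept
    StageProfile("prot_B", 0.1, 0.2, 0.5, 2.0, False, False),  # no signal peptide or TM domain
    StageProfile("prot_C", 1.0, 0.5, 0.7, 3.0, True, True),    # abundance not monotonic
    StageProfile("prot_D", 0.0, 0.0, 1.0, 1.0, False, True),   # no peak in the merozoite
]
selected = [p.protein_id for p in profiles if is_merozoite_associated(p)]
print(selected)
```

The strict inequality on the last comparison is what enforces the peak of abundance in the merozoite stage; ties between earlier stages are tolerated.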
Isolation of the RBC membrane proteome.
Highly purified samples of trophozoite parasite were prepared using Nycodenz gradient,

LC-MS/MS analysis
The IP, blood stage parasite and RBC membrane samples were separated on 12% SDS-PAGE at 50 V and protein bands were visualized by staining with Imperial Coomassie blue (Pierce).
The gel lanes were cut into 10 separate slices, then destained, and the proteins were reduced using dithiothreitol (DTT) and alkylated with iodoacetamide (IAA). The gel areas excised from the 2D gels were destained using the instructions and reagents provided in the Pierce™ Silver Stain for Mass Spectrometry kit. The proteins were cleaved by overnight digestion in porcine trypsin

The MS/MS spectra of peptide ion fragments generated by collision-induced dissociation were acquired in the linear ion trap. The 10 most intense ions above a 500-count threshold were selected for fragmentation by collision-induced dissociation.

Database searching
The LC-MS/MS raw data were processed using Proteome Discoverer 2.1 (PD2.1; ThermoFisher Scientific), and peptide identification was performed by both Mascot search engine (V 2.4.1, Matrix Science, Boston, MA) and Sequest-HT search engine.
A database including 22270 mouse proteins (Mus musculus reference proteome, https://www.ebi.ac.uk/reference_proteomes), 6092 P. yoelii 17X proteins (PlasmoDB release 34, http://plasmodb.org/common/downloads/release-34/Pyoeliiyoelii17X/fasta/data/) and two sets of 245 and 116 common contaminant proteins (http://www.coxdocs.org/doku.php?id=maxquant:start_downloads.htm and ftp://ftp.thegpm.org/fasta/cRAP, respectively) was used for the searches. The protease was specified as trypsin with a maximum of two missed cleavage sites. The mass tolerance was 10 ppm for precursor ions and 0.8 Da for fragment ions. Cysteine alkylation by iodoacetamide (carbamidomethyl) was set as a fixed modification; methionine oxidation and deamidation at asparagine or glutamine were selected as variable modifications. The Proteome Discoverer workflow included an automatic target-decoy search strategy along with Percolator to score peptide spectral matches from both the Mascot and Sequest HT searches and to estimate the false discovery rate (FDR). The Percolator parameters were set to maximum delta Cn = 0.05, target FDR (strict) = 0.01 and target FDR (relaxed) = 0.05, with validation based on q-value (40). Protein abundance was estimated using the exponentially modified Protein Abundance Index (emPAI) values reported by PD2.1 for each sample (41). The emPAI is defined as emPAI = 10^PAI - 1, with PAI = Nobserved/Nobservable, where Nobserved is the number of experimentally observed peptides and Nobservable is the calculated number of observable peptides for each protein. The emPAI values reported here were calculated using the data obtained from each sample. Only proteins identified by at least one unique peptide were retained. Finally, the presence of a signal peptide and/or transmembrane domain in the identified proteins was checked using PlasmoDB (www.plasmodb.org).
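As an illustration of the emPAI definition given above, the following Python sketch computes the index from peptide counts. The function name and the example counts are hypothetical; in the study the values were reported directly by PD2.1, so this only makes the formula concrete.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10**PAI - 1, with PAI = n_observed / n_observable."""
    if n_observable <= 0:
        raise ValueError("n_observable must be a positive peptide count")
    if n_observed < 0:
        raise ValueError("n_observed cannot be negative")
    pai = n_observed / n_observable
    return 10 ** pai - 1


# Toy peptide counts, not taken from the paper
no_peptides = empai(0, 12)     # a protein with no observed peptide
half_observed = empai(6, 12)   # half of the observable peptides detected
all_observed = empai(12, 12)   # every observable peptide detected
print(no_peptides, half_observed, all_observed)
```

Because the index is exponential in the fraction of observable peptides detected, small differences in peptide coverage translate into large differences in estimated abundance, which is consistent with emPAI values spanning several orders of magnitude.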
The localization of the proteins identified was deduced by searching in the literature using PubMed (https://www.ncbi.nlm.nih.gov/pubmed), RMgmDB (http://www.pberghei.eu/) and Phenoplasm (http://phenoplasm.org/) databases and analysis of the annotations available in PlasmoDB.

Experimental Design and Statistical Rationale
The dataset of native antigens identified following immunoprecipitation using immune sera was derived from n = 3 (Self-cured and Attenuated mice sera) or n = 4 (Chloroquine-treated and None-protected sera) biological replicates, each conducted with parasite extracts obtained from 50 µl of independently prepared infected RBC. For the analysis of the IP sample abundance pattern (criterion 2), the mean emPAI was obtained by averaging the abundance values of the biological replicates obtained with the same immune sera. For the 2D western blot dataset, the antigens recognized by non-protective and protective sera in their denatured form were identified by screening technical replicates of 2D gels obtained from the same parasite extract.
For the proteomic analysis of the blood stage parasite, 100 µg of independently prepared purified samples of ring, trophozoite, schizont and merozoite stage P. yoelii were analyzed by LC-MS/MS. The emPAI values were normalized according to the mean value of the total abundance of all proteins found in each sample.
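The normalization step is described only briefly, so the sketch below shows one possible reading of it in Python: each sample is rescaled so that its total emPAI equals the mean of the per-sample totals. This interpretation, the function name and the toy values are all assumptions and may differ from the procedure actually used.

```python
def normalize_empai(samples: dict) -> dict:
    """Rescale each sample so its summed emPAI equals the mean of all sample totals.

    `samples` maps a sample name to a {protein_id: emPAI} dictionary.
    Assumption: this is one interpretation of the normalization in the text.
    """
    totals = {name: sum(values.values()) for name, values in samples.items()}
    if any(total <= 0 for total in totals.values()):
        raise ValueError("every sample needs a positive total abundance")
    mean_total = sum(totals.values()) / len(totals)
    return {
        name: {pid: value * mean_total / totals[name] for pid, value in values.items()}
        for name, values in samples.items()
    }


# Toy values, not taken from the paper
raw = {
    "ring": {"prot_A": 1.0, "prot_B": 3.0},        # total 4
    "merozoite": {"prot_A": 2.0, "prot_B": 6.0},   # total 8
}
normalized = normalize_empai(raw)
print(normalized)
```

After this rescaling every sample has the same total abundance, so a protein's value can be compared across stages without being dominated by differences in overall signal between runs.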
Results
1) Characterization of sera from mice immunized with live blood stage P. yoelii parasites.
Four groups of immune sera from mice immunized with live blood-stage P. yoelii 17X 1.1 were obtained using three experimental protocols that provide increasing exposure to the parasite (Figure 1A). The first group of sera, named "Chloroquine-treated" sera and characterized by a brief exposure time, was obtained from mice infected with a high dose of live parasites (10^6 parasites) and immediately treated with chloroquine to rapidly eliminate all the injected parasites (12). The second group, named "Attenuated" sera and characterized by a moderate exposure time, was obtained from mice infected with a high dose of parasites transfected with a plasmid containing a drug selection cassette (Toxoplasma DHFR) and subsequently treated with pyrimethamine to select for the transfected parasites. Using this protocol, a low parasitaemia (< 8%) was detected in the blood of the mice for ~two weeks before clearance by the immune system (Figure S1). Both the third and fourth groups were obtained from mice infected with a low dose of live erythrocytic parasites (10^3 P. yoelii 17X 1.1) and exposed to the parasite for a longer time. The sera obtained from mice able to eliminate the parasite within six weeks were named "Self-cured" sera, whereas sera from mice unable to clear the initial immunization were designated "None-protected" sera. To minimize the differences between the sera and to obtain a sufficient quantity of material, 12 immune sera obtained from each group of mice were pooled in equal volumes.
The pooled immune sera were passively transferred into naïve mice (n=5) to assess their ability to protect against a subsequent challenge with 10^6 P. yoelii 17X 1.1. The results were compared to those of mice passively transferred with pooled naïve mouse sera (Figure 1B). Mice that had received naïve sera reached 1% parasitaemia three days post infection, and none survived beyond 20 days post infection. The None-protected sera showed a low protective effect, with parasites detected five days post infection and one mouse out of five recovering 28 days post challenge. The transfer of Self-cured, Chloroquine-treated and Attenuated sera induced a higher protective effect, with parasites observed seven days post infection and an increased survival rate, with two, three and four mice out of five recovering 18 to 35 days post infection, respectively. These three pooled sera were therefore considered protective, with Attenuated sera showing the highest protective effect, followed by Chloroquine-treated and then Self-cured sera. Finally, the animals that recovered following the challenge were fully protected against a subsequent challenge (Figure 1B). Altogether, these results indicate that the sera generated herein were able to protect against a blood stage challenge with various efficacies and thus contained antibodies directed against protective antigens.
2) Identification of antigens recognized by protective sera using non-denaturing condition.
Immunoprecipitation was used to isolate antigens in their native conformation. P. yoelii infected erythrocyte protein extracts prepared under non-denaturing conditions were separately incubated with the four pooled immune sera covalently linked to protein A/G agarose beads (see Figure 2A and Experimental procedures). The immunoprecipitated samples were then submitted to label-free quantitative LC-MS/MS analysis. The proteomic data were derived from n = 3 (Self-cured and Attenuated mice sera) or n = 4 (Chloroquine-treated and None-protected sera) independent experiments, each conducted with independently prepared parasite extracts.
After aligning the peptide hits in the immunoprecipitated samples and retaining proteins identified in at least one biological replicate, 358, 407, 695 and 427 proteins were identified in the None-protected, Self-cured, Chloroquine-treated and Attenuated samples, respectively (Figure 2B, Table S1). Of the antigens detected by the immune sera, 248 were shared across the four datasets, 243 overlapped between two or three datasets and 294 were unique to one dataset (Figure 2B). The analysis of the antigens recognized by the protective sera revealed a dataset of 763 antigens that was subsequently used for prioritization, 336 of which were also identified in the None-protected samples (Figure 2B and Table S2, IP dataset). The abundances of these proteins were calculated using the emPAI index (36,41-43). The median emPAI value in these datasets was ~0.35, with abundances spanning six orders of magnitude (emPAI values ranging from ~0.004 to ~750) (Table S1).
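The overlap categories reported above (shared by all four datasets, shared by two or three, unique to one) and the union of antigens recognized by the protective sera can be derived with plain set operations. The Python sketch below is a minimal illustration on made-up protein identifiers; the dataset keys mirror the sera names, but the contents and the function name are hypothetical.

```python
from collections import Counter


def overlap_summary(datasets: dict) -> dict:
    """Count proteins found in all, in some (2 to n-1), or in only one dataset."""
    membership = Counter(pid for ids in datasets.values() for pid in ids)
    n = len(datasets)
    return {
        "shared_by_all": sum(1 for c in membership.values() if c == n),
        "partial_overlap": sum(1 for c in membership.values() if 1 < c < n),
        "unique": sum(1 for c in membership.values() if c == 1),
    }


# Toy identifiers, not taken from the paper
ip_datasets = {
    "None-protected": {"p1", "p2"},
    "Self-cured": {"p1", "p2", "p3"},
    "Chloroquine-treated": {"p1", "p3", "p4"},
    "Attenuated": {"p1", "p5"},
}
summary = overlap_summary(ip_datasets)

# Antigens recognized by at least one protective serum, and the subset
# also recognized by the None-protected sera
protective = (ip_datasets["Self-cured"]
              | ip_datasets["Chloroquine-treated"]
              | ip_datasets["Attenuated"])
also_in_non_protected = protective & ip_datasets["None-protected"]
print(summary, len(protective), len(also_in_non_protected))
```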
As the antigens identified here include proteins previously characterized either in the endogenous parasite or as orthologues or paralogues in other Plasmodium spp., we searched the literature for experimental evidence pertaining to protection and subcellular location. The locations of the remaining proteins were examined using the annotations available in the malaria databases and the recently published P. yoelii proteome mapping dataset (36). Overall, a localization was assigned to 704 antigens (~92% of the repertoire) (Table S1 P. yoelii antigens column AS), including exported proteins (beyond the PVM or in the RBC membrane) (n=62).
Twenty-nine proteins are localized in the parasite periphery, some of them also being reported in the merozoite (n=6). The antigen dataset also includes 10 PIR proteins known to be exported to the parasite periphery and/or the host-cell cytosol (37,44). Another 44 proteins were previously reported on the merozoite surface or at the apical end, some of them having an additional localization in the parasite periphery (n=3), the RBC membrane (n=4) or both the parasite periphery and the host-cell cytosol (n=1). Finally, 559 proteins (~73% of the repertoire) are likely localized inside the parasite. The large proportion of internal parasite proteins suggests that the host generates antibodies against internal parasite proteins, likely released after parasite clearance or egress, and that these antibodies possibly contribute to misdirecting the immune response. As is inherent to all proteomic analyses, some of these proteins might be contaminants detected because of their high abundance, as shown previously (45-47).
Supporting the notion that the overall dataset is enriched in protective antigens, the dataset included 39 antigens with evidence of protection and two antigens that have been shown to be non-protective (pyRON4 (48), pfUB05 (49)) (Table S1 P. yoelii antigens column AQ).
Altogether, this suggests that while the repertoire of antigens is enriched with protective antigens, it is not devoid of false positives due to their abundance or immune reactivity unrelated to a protective immune response. To identify the protective antigens, the IP dataset was validated and prioritized using information obtained from proteomic approaches and malaria databases.
The antigenicity of the immunomics dataset was investigated using a mammalian cell library. For this, 90 antigens were expressed in fusion with a Myc and an HA tag (Table S3, P. Integrative analysis column AO and Table S3 (Table S1 antigens validation sheet)) following transfection into CHO-K1 cells. The cells were then screened by immunofluorescence assay (IFA) using anti-tag antibodies and mouse immune sera (None-protected, Self-cured, Chloroquine-treated or Attenuated). Screening using anti-tag antibodies revealed that 49 ORFs representing 10 proteins (11% of the library) were not expressed by mammalian cells (Table 1A, Table S2, antigen validation). Validating the immunomic approaches, ~81% of the library (73 antigens) were recognized by at least one protective serum, and ~51% of the library (46 antigens) were also recognized by the None-protected sera (Table 1A, Table S2, antigen validation). This rate is higher than those observed during the screening of the largest protein array currently available (1204 malarial proteins or ~23% of the proteome), for which a maximum of 22% of the screened proteins were recognized by immune sera (15,50,51). Importantly, the analysis of the immunoreactivity pattern did not reveal any bias toward a specific group of antigens, with 37 out of the 38 internal parasite proteins probed by the protective sera. Similarly, many of the merozoite, parasite periphery and exported proteins screened were recognized by the None-protected sera. The expressed antigens that were not probed by immune sera were mostly large merozoite proteins, possibly due to the limited coverage of these proteins in the library, improper folding of the chimera or, in the case of the variant Py235, the expression of different variants during the infection (52).
Overall, the screening of the mammalian library highlighted the diversity of the antigens recognized by protective sera, many of them also being recognized by None-protected sera, emphasizing the need for an unbiased prioritization system. Several proteomic approaches were performed to assist with the selection of protective antigens.
3) Identification of antigens recognized by immune sera using denaturing condition.
2D western blotting was used to identify antigens in their denatured form. P. yoelii infected erythrocyte protein extracts prepared under denaturing conditions were fractionated on 2D gels and incubated with None-protected, Chloroquine-treated and Attenuated sera (see Figure 3 and Experimental procedures). The gel areas recognized by the immune sera were analyzed by LC-MS/MS. Overall, 290 antigens were identified in the gel areas recognized by both the Attenuated and the Chloroquine-treated sera, including 84 antigens also probed by None-protected sera and most likely not associated with protection (Table S2,
Immunity against blood stage parasites has been associated with merozoite antigens (18-20). This is further supported by the observation that the IP dataset includes 44 merozoite proteins. Thus, the identification of the parasite proteins enriched in the merozoite stage would assist in the selection of protective antigens. For this, a differential analysis of the P. yoelii parasite proteome during the blood stage was performed to identify proteins enriched in the merozoite stage. Highly purified parasite samples at ring, trophozoite, schizont and merozoite stage were prepared and analyzed using label-free quantitative LC-MS/MS, leading to the identification of 1146 proteins, 1035 of them being included in the merozoite sample (Table S2, P. yoelii blood stage proteome). Proteomic datasets obtained from wild type merozoite samples have recently become available in P. berghei, with 781 proteins identified (53), and in P. falciparum, with the quantitative analysis of wild type and genetically modified merozoites that reported 981 proteins (54). From a technical point of view, these datasets are comparable to the 1035 proteins identified in the P. yoelii merozoite, as supported by the large overlap between the three datasets (Figure S2). However, the dataset is likely not enriched in merozoite proteins, with only 240 proteins specific to the merozoite, the remaining proteins also being detected throughout the erythrocytic stage (Figure S3). The dataset was further filtered by 1) retaining the proteins enriched in the merozoite stage (i.e. proteins with a relative abundance in the ring ≤ trophozoite ≤ schizont < merozoite stage) and 2) excluding the proteins without signal peptide and/or transmembrane domain, as they represent mostly internal proteins not associated with protection. Overall, 181 proteins were considered as associated with the merozoite stage (Table S2, P. yoelii blood stage proteome column I), including 109 antigens shared with the IP dataset (Table S2, P. yoelii blood stage proteome column J). The pre-filtered set of 1035 proteins included 20 proteins with experimental evidence pertaining to merozoite localization, most of them having protection evidence annotations (Table S2, P. yoelii blood stage proteome column K and L). Seventeen of these proteins were retained in the filtered set and shared with the IP dataset, supporting the relevance of the approach (Table S2, P. yoelii blood stage proteome column I and J). Altogether, this indicates that the filtered dataset is enriched in protective merozoite proteins and can be used to refine the immunomics datasets.

5) Identification of P. yoelii proteins located on the red blood cell surface.
Proteins expressed on the surface of the infected erythrocyte are known targets of protective immunity (55,56), and this is supported by the high number of exported proteins identified here using immune sera. However, little is known about the rodent malaria proteins localized on the RBC surface. Therefore, their identification using a proteomic approach could aid the refinement and selection of protective antigens.
For this, an infected RBC sample at trophozoite stage was incubated with Sulfo-NHS-SS-Biotin, a non-permeable cleavable biotin that attaches to molecules containing primary amines. A second sample was processed in parallel without biotin as a control. Following extraction of the proteins and isolation of the biotinylated proteins, both samples were analyzed by label-free quantitative LC-MS/MS. Despite numerous precautions taken to avoid alteration of the RBC membrane integrity, numerous intracellular proteins were identified (Table S2, P. yoelii RBC membrane proteome). As in other proteomic studies of the malaria infected RBC membrane (46,57), the subproteome includes exported proteins such as the P. yoelii orthologs of SBP1, MAHRP1a and IBIS, and the tryptophan rich protein pypAg-1 (58-60).
It also contains 12 Pyst/Fam proteins known to be exported into the host cell as soluble proteins, targeted to specialized organelles or associated with the RBC membrane (36,37,46). Importantly, the dataset includes the few rodent malaria proteins currently known to be exported to the RBC surface. The most abundant protein detected in the dataset is PY17X_0840300 (emPAI value ~162), the P. yoelii ortholog of the Pyst/Fam protein PbEMAP1 (46). Two other RBC membrane proteins were also detected, albeit at a lower abundance level: PY17X_0317500 (emPAI value ~1.3) and PY17X_1001900 (emPAI value ~3), the P. yoelii orthologs of PbEMAP2 (46) and PcEMA1 (61), respectively.
To overcome the presence of contaminating proteins, the dataset was further filtered to retain proteins 1) enriched in the biotinylated sample (ratio >2 as compared to the abundance in the control), 2) with a signal peptide or transmembrane domain, 3) not identified as associated with the merozoite stage in the previous section, and 4) identified with two or more unique peptides.
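The four filtering rules can be written as a single predicate, as in the minimal Python sketch below. The dictionary keys, the protein identifiers and the abundance values are hypothetical. Treating a protein that is absent from the control (control abundance of zero) as enriched is an assumption of this sketch, since the text does not state how such cases were handled.

```python
def passes_surface_filter(protein, merozoite_ids):
    """Apply the four RBC-surface filtering rules to one protein record."""
    control = protein["control_abundance"]
    biotin = protein["biotin_abundance"]
    # Rule 1: enriched in the biotinylated sample (ratio > 2 versus control).
    # Assumption: absent from the control but present with biotin counts as enriched.
    enriched = biotin > 0 if control == 0 else biotin / control > 2
    return (
        enriched
        # Rule 2: predicted signal peptide or transmembrane domain
        and (protein["signal_peptide"] or protein["tm_domain"])
        # Rule 3: not already assigned to the merozoite-associated set
        and protein["id"] not in merozoite_ids
        # Rule 4: identified with at least two unique peptides
        and protein["unique_peptides"] >= 2
    )


# Invented records, for illustration only
merozoite_ids = {"prot_M"}
records = [
    {"id": "prot_S", "biotin_abundance": 9.0, "control_abundance": 3.0,
     "signal_peptide": True, "tm_domain": False, "unique_peptides": 4},
    {"id": "prot_M", "biotin_abundance": 9.0, "control_abundance": 1.0,
     "signal_peptide": True, "tm_domain": True, "unique_peptides": 5},
    {"id": "prot_C", "biotin_abundance": 4.0, "control_abundance": 2.0,
     "signal_peptide": True, "tm_domain": False, "unique_peptides": 6},
    {"id": "prot_P", "biotin_abundance": 5.0, "control_abundance": 0.0,
     "signal_peptide": False, "tm_domain": True, "unique_peptides": 1},
]
surface_ids = [r["id"] for r in records if passes_surface_filter(r, merozoite_ids)]
```

In this example, `prot_C` is rejected because its ratio is exactly 2 and the threshold is strict.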
To date, few data are available on the rodent parasite proteins targeted to the infected RBC surface. Two methods have been used to analyze RBC membrane samples obtained from P. berghei infected RBC (46,53). Hypotonic lysis of trophozoite and schizont parasites with or without a cytoadherence defect, followed by analysis of their membrane fractions, generated an RBC membrane proteome of 613 unique proteins after merging all the datasets (46,53). Alternatively, surface shaving of similar infected RBC samples using trypsin and analysis of the surface-released proteins led to the identification of a merged proteome of 96 unique proteins (46,53).
From a technical point of view, the data presented here are more comparable to the surface shaving approach, since both approaches aimed to isolate RBC surface proteins, whereas hypotonic lysis generated membrane fractions that include both RBC surface membranes and other internal membrane components such as the PVM. However, a comparison of the surface biotinylation dataset of 86 proteins with the surface shaving dataset revealed an overlap of only 6 proteins, while 30 proteins are shared with the hypotonic lysis dataset (Figure S4). In the absence of extensive knowledge of the proteins expressed on the RBC surface by rodent parasites, it is difficult to assess the value of each dataset. However, the low level of overlap may be partially due to the nature of the samples analyzed, as the P. berghei datasets were derived from trophozoites and schizonts with various genetic backgrounds while the P. yoelii dataset was obtained solely from trophozoites. Importantly, the filtered P. yoelii dataset included the three rodent malaria RBC membrane proteins currently known, while the P. berghei datasets identified only two surface proteins. EMAP1 was found in the hypotonic lysis and the surface shaving datasets, while EMAP2 was included only in the hypotonic lysis dataset (Figure S4). This dataset constitutes the first quantitative analysis of the rodent malaria infected RBC membrane proteome and suggests that EMAP1 is likely the dominant rodent malaria protein targeted to the RBC membrane. Importantly, the set of 86 proteins might not include all the parasite RBC membrane proteins. Proteins such as RhopH2 and RhopH3, originally found in the merozoite rhoptry and recently suggested to form a complex at the RBC periphery to allow nutrient uptake (62,63), were filtered out due to their dual localization pattern. Out of the 86 proteins considered as enriched in surface molecules, 70 were shared with the repertoire of antigens and used for prioritization (Table S2, P. yoelii RBC membrane proteome, column H).

6) Integrative analysis of proteomic data.
The repertoire of antigens immunoprecipitated by protective sera is enriched in putative protective antigens, but is not devoid of antigens unrelated to a protective response or of contaminant proteins abundantly found in the infected RBC. We therefore prioritized the antigens from 0 (least protective) to 20 (most protective) using a weighted scoring system based on five predictive criteria (Figure 4). The score attributed to each criterion reflected its assumed importance for protection.
-Criterion 1, Reproducibility -Antigens associated with protection would be expected to be found more consistently in protective sera samples. Thus, antigens identified only in the protective samples and those found with a mean abundance at least two-fold higher than the mean abundance in the None-protective samples were scored according to their conservation across the protective replicates and the protection efficiency of the cognate immune sera. Based on this, the antigens found in all the Self-cured (less protective, n=3), the Chloroquine-treated (n=4), and the Attenuated (most protective, n=3) replicates were scored +1, +2 and +3, respectively. Those identified in n-1 experiments were scored +0.5 (Self-cured), +1.5 (Chloroquine-treated) and +2 (Attenuated). Finally, those recognized in n-2 experiments were scored +0.5 (Self-cured), +1 (Chloroquine-treated) and +1 (Attenuated). Using these criteria, the antigens enriched and conserved across all the protective samples received a maximal score of 3+2+1 = +6 (Table S3, Integrative analysis, criterion 1, columns L, R and W).
-Criterion 2, Immunoreactivity -Antigens most strongly recognized by the sera with the highest protection efficacy are expected to be more important for protective immunity. The relative abundance of each antigen was considered to be a quantitative measurement of the cognate antibody specificity and affinity. Thus, protective antigens are expected to have a lower abundance in IP samples obtained using a less protective immune serum and a higher abundance in samples obtained using a more protective serum. Based on this, the antigens found with a ratio >1.2X (i.e. with a mean abundance more than 1.2-fold higher) in the Self-cured samples as compared to the None-protective samples were scored +1. Similarly, the antigens with an abundance ratio >1.2X in the Chloroquine-treated samples as compared to the Self-cured samples and in the Attenuated samples as compared to the Chloroquine-treated samples were scored +2 and +3, respectively. Overall, the antigens with an abundance pattern in line with the protection pattern of the immune sera were scored with an additional 3+2+1 = +6 (Table S3, Integrative analysis, criterion 2, columns M, S and X).
-Criterion 3, Recognition by 2D western blot -Out of the 206 proteins identified by 2D western blot as probed with Chloroquine-treated and Attenuated sera, 96 antigens were shared with the IP dataset and represent high confidence antigens that were scored with an additional +4 (Table S3, Integrative analysis, criterion 3, column AA).
-Criterion 4, Proteins enriched in merozoite or RBC membrane -The merozoite and RBC membrane proteins are known targets of protective immunity; therefore, antigens enriched in the merozoite proteome (n=109) or the RBC membrane proteome (n=70) were scored with an additional +2 (Table S3, Integrative analysis, criterion 4, columns AE and AJ).
-Criterion 5, Predicted signal or TM domain -Antigens with a predicted signal peptide and/or transmembrane domain(s) are more likely to be involved in the host-pathogen interaction and to be exposed to the host. Therefore, the proteins with a predicted signal peptide and/or transmembrane domain(s) were scored with an additional +2 (Table S3, Integrative analysis, criterion 5, column AM).
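The five criteria above add up to a maximum of 6 + 6 + 4 + 2 + 2 = 20. The Python sketch below shows one way such a weighted score could be computed. It is a hedged illustration, not the authors' implementation: the `Antigen` record, the group keys and the example abundances are hypothetical. Two further assumptions are made: the two-fold enrichment test of criterion 1 is applied to each protective group separately, and an antigen absent from the reference group but present in the compared group is treated as enriched.

```python
from dataclasses import dataclass

# Criterion 1: group -> (replicates n, scores for detection in n, n-1, n-2 replicates)
REPRODUCIBILITY = {
    "self_cured": (3, (1.0, 0.5, 0.5)),
    "chloroquine": (4, (2.0, 1.5, 1.0)),
    "attenuated": (3, (3.0, 2.0, 1.0)),
}
# Criterion 2: (more protective group, reference group, score if ratio > 1.2)
IMMUNOREACTIVITY_STEPS = (
    ("self_cured", "none_protective", 1.0),
    ("chloroquine", "self_cured", 2.0),
    ("attenuated", "chloroquine", 3.0),
)


@dataclass
class Antigen:
    """Hypothetical record for one immunoprecipitated antigen."""
    name: str
    mean_abundance: dict  # group -> mean abundance in the IP samples
    detected_in: dict     # group -> number of replicates with a detection
    in_2d_western: bool = False
    in_merozoite_or_rbc_membrane: bool = False
    has_sp_or_tm: bool = False


def fold_exceeds(high, low, threshold, strict):
    # Assumption: absent from the reference but present here counts as enriched
    if low == 0:
        return high > 0
    ratio = high / low
    return ratio > threshold if strict else ratio >= threshold


def criterion1(a):
    ref = a.mean_abundance.get("none_protective", 0.0)
    score = 0.0
    for group, (n, points) in REPRODUCIBILITY.items():
        if not fold_exceeds(a.mean_abundance.get(group, 0.0), ref, 2.0, strict=False):
            continue
        missing = n - a.detected_in.get(group, 0)
        if 0 <= missing <= 2:
            score += points[missing]
    return score


def criterion2(a):
    return sum(
        pts for high, low, pts in IMMUNOREACTIVITY_STEPS
        if fold_exceeds(a.mean_abundance.get(high, 0.0),
                        a.mean_abundance.get(low, 0.0), 1.2, strict=True)
    )


def total_score(a):
    return (criterion1(a) + criterion2(a)
            + 4.0 * a.in_2d_western
            + 2.0 * a.in_merozoite_or_rbc_membrane
            + 2.0 * a.has_sp_or_tm)


# Invented example: abundance rises with protection, all replicates positive
ideal = Antigen(
    "hypothetical_antigen",
    {"none_protective": 1.0, "self_cured": 2.0, "chloroquine": 4.0, "attenuated": 8.0},
    {"self_cured": 3, "chloroquine": 4, "attenuated": 3},
    in_2d_western=True, in_merozoite_or_rbc_membrane=True, has_sp_or_tm=True,
)
ideal_score = total_score(ideal)
```

Keeping the point values in lookup tables rather than in branching code makes the relative weight given to each serum group easy to inspect and to change.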
The analysis of the prioritized antigens using the information available in the literature indicated that the antigens prioritized with a higher score were more likely to be linked to protection. In detail, ~55% of the 18 antigens scored ≥16 had protection information, as compared to ~25% of the 51 antigens scored ≥12 and <16 and ~2.5% of the 694 antigens scored <12 (Table 1B). Most of the antigens prioritized with a lower score are internal parasite proteins, as these proteins represent ~84% of the antigens scored <12 (Table 1B). In fact, >90% of the internal parasite proteins identified in the IP dataset were scored <12 (523 proteins out of 559), corroborating that internal parasite proteins are unlikely to be associated with protection. In contrast, the antigens with a higher score were enriched in merozoite, RBC membrane, parasite periphery and exported proteins, many of them having a dual localization pattern that includes the merozoite (apical end or surface) and the RBC membrane or the parasite periphery (Table 1B). Supporting a role in protection, the antigens with such a dual localization represent 39% of the molecules scored ≥16 and 12% of those scored ≥12 and <16, while none were found among the proteins scored <12 (Table 1B and Table S1, protective antigens). Altogether, these observations validate the prioritization system used to select antigens with protection potential: the antigens with the highest scores are more likely to be important for protective immunity. They also suggest that humoral protection against the blood stage parasite may be obtained by targeting antigens shared between the merozoite and the infected RBC.
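The grouping used in this comparison (scores ≥16, ≥12 and <16, and <12) and the proportion of antigens with protection information in each group can be computed as in the short Python sketch below. The group labels and the input values are invented for illustration and are not taken from Table 1B.

```python
def priority_group(score):
    """Assign a score to one of the three score ranges used in the text."""
    if score >= 16:
        return "high"          # score >= 16
    if score >= 12:
        return "intermediate"  # 12 <= score < 16
    return "low"               # score < 12


def summarize_groups(antigens):
    """antigens: iterable of (score, has_protection_information) pairs.

    Returns group -> (number of antigens, percent with protection information).
    """
    counts = {"high": [0, 0], "intermediate": [0, 0], "low": [0, 0]}
    for score, has_info in antigens:
        entry = counts[priority_group(score)]
        entry[0] += 1
        entry[1] += 1 if has_info else 0
    return {
        group: (n, 100.0 * k / n if n else 0.0)
        for group, (n, k) in counts.items()
    }


# Invented scores, for illustration only
example = [(20, True), (17, False), (13, True), (12, False),
           (12, False), (12, False), (5, False)]
summary = summarize_groups(example)
```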

Discussion
In summary, this study reports a holistic approach allowing the identification of an extensive dataset of antigens recognized by mouse immune sera and the prioritization of this dataset to identify potential protective antigens. The immunomic approaches, validated by assessing the immunoreactivity of 80 antigens expressed in mammalian cells, provided an overview of the repertoire of antigens recognized by sera able to confer protection, while minimizing the influence of the contaminations inherent to all proteomic studies. Similar to the previous protein array study (15), this dataset includes a large fraction of internal parasite proteins (559 out of the 763 antigens or 73% of the repertoire), suggesting the activation of a large set of non-protective immune responses. These proteins, along with other parasite molecules localized in the merozoite, the parasite periphery or the host cytosol that are released following parasite egress or death during the immunization procedure and the subsequent challenges, may also confuse the host immune system. Thus, the immunoreactivity against these proteins may not be a good indicator of their vaccine potential. The higher parasite clearance permitted by a more protective serum is likely to translate into a higher immunoreactivity for truly protective antigens, which in turn would be picked up by the immunomics approaches. The expanded antigen dataset indicates that the host immune response targets a wide range of parasite proteins rather than focusing on a few immunodominant antigens. The large overlap between the None-protective and protective immunomic datasets precludes the identification of protective antigens solely on the basis of their presence in the protective datasets, stressing the need for a prioritization system. To assist prioritization, datasets enriched in antigens with a cellular localization linked to protective antigens were used along with the presence of predicted SP/TM domain(s) in the antigen.
However, such criteria may create a selection bias, as they could lead to the prioritization of antigens containing SP/TM domain(s) and localized in the parasite periphery, the HCC, the RBC membrane or the merozoite that are released following parasite clearance/death and are unrelated to protection. To overcome this, a scoring system was used in which each criterion is weighted according to its perceived importance. Here, the immunomic properties of the antigens (reproducibility, immunoreactivity and recognition in denaturing conditions) were given a higher weight than their localization (inclusion in the RBC membrane or merozoite datasets) or their protein features (presence of SP/TM domain(s)).
Although the localization of the antigens was not especially emphasized, most of the antigens highlighted by the scoring system are found in the merozoite, supporting previous findings that this invasive stage constitutes one of the main targets of the protective humoral response (18-20). The antigens localized solely in the merozoite and those identified in both the merozoite and the RBC or the parasite periphery represented 17% + 39% = 56% of the antigens scored ≥16 and 18% + 12% = 30% of those scored ≥12 and <16 (Table 1B). Interestingly, neither of the two leading immunodominant vaccine candidates, MSP1 and AMA1, was ranked high (score 13 out of 20), as they failed to be enriched in the Attenuated IP samples when compared to the Chloroquine-treated samples. In fact, the weighted scoring system highlighted mostly antigens that had not been evaluated for vaccination purposes despite the existence of encouraging preliminary data, most of them being conserved in P. falciparum and P. vivax (Table S3, Protective antigens summary, column J).
Merozoite surface proteins are important targets of human immune responses and antibodies are acquired to most, if not all, merozoite surface proteins (64,65). However, the polymorphism inherent to these antigens has prevented the development of an efficient vaccine formulation.
Here, MSP8 (score 17), MSP9 (score 18) and to some extent P113 (score 15) were highlighted among the 6 merozoite surface proteins found in the protective antigen dataset (Figure 5). To date, the diversity of these antigens remains uncharacterized at the population level, and we cannot exclude that a reduced level of polymorphism, surmountable by current vaccine approaches, might enable the development of vaccine combinations including such antigens.
In addition to merozoite surface proteins, the prioritization strategy highlighted the components of the moving junction that allows the internalization of the merozoite into a nascent PV (66) (Figure 5). Here, RON2, the member of the RON complex that interacts with AMA1 (67-69), was particularly emphasized (score 20), followed by RON4 (score 17), while RON3, AMA1 and RON5 were scored 14, 13 and 12, respectively. The essentiality of RON2 and RON4 (RMgmDB), along with the reduced polymorphism of these proteins compared to AMA1 and the importance of the interaction between RON2 and AMA1 (70,71), suggests that those two antigens could constitute promising vaccine targets in combination with other synergistic targets. However, the protective efficiency of RON2 and RON4 remains to be fully characterized, as only a small fraction of these antigens has been investigated so far (48,72).
Several merozoite antigens recently proposed to be localized at the RBC periphery were also prioritized (Figure 5). The RhopH complex antigens were predicted to be protective, with RhopH1, RhopH2 and RhopH3 scored 16, 12 and 16, respectively. RhopH1 is encoded by a highly diverse multigene family, which limits vaccine intervention (73). However, the low polymorphism detected for RhopH2 and RhopH3 (73), along with recent data supporting that the RhopH complex contributes to both invasion and channel-mediated nutrient uptake at the RBC periphery (62,74), suggests that these RhopH proteins could constitute promising vaccine targets. Other parasite antigens localized both in the merozoite secretory organelles and the parasite periphery were also highlighted (Figure 5). These antigens are released upon invasion, where they could be amenable to targeting by the host. Such antibody-antigen complexes could potentially be carried into the nascent parasite PV/PVM during invasion, as previously suggested for MSP proteins (75-77). This is the case for the PTEX complex proteins EXP2 (score 13) and PTEX150 (score 15.5) and the EPIC protein PV1 (score 12), which are stored in merozoite dense granules and translocated to the nascent PVM/PV upon invasion to mediate protein translocation beyond the PVM (78-80). Similarly, the RAP complex proteins, which include RAP1 (score 17) and RAP2/3 (score 13), were also shown to be released into the nascent PV during invasion and peripherally associated with the PVM to maintain its structure and facilitate the survival of the parasite (81). Finally, antigens belonging to the SERA family were also highly emphasized, with SERA1 having the maximal score (20 out of 20) while SERA2 and SERA3 were scored 16 and 14, respectively. In rodent Plasmodium, the role of these dispensable antigens, which are expressed in late-stage schizonts where they localize in the PV, remains unknown (82). However, P. falciparum SERA5, the most extensively studied member of the SERA family, is known to be involved in merozoite egress (83) and is peripherally associated with merozoite surface proteins, allowing exposure to the host immune system (84-86). Finally, the existence of data showing that antibodies targeting SERA5 mediate agglutination of the merozoite and inhibit merozoite egress (86-89) supports the view that a large part of the protection is mediated by antibodies targeting multiple aspects critical for the parasite life cycle.
Plasmodium parasites express antigens at the surface of the infected RBC that can be targeted by the immune system. In P. falciparum, these antigens are highly polymorphic and encoded by multigene families such as PfEMP1, which generate a substantial antigenic diversity allowing immune evasion (55,90-93). In other Plasmodium spp., the large number of variant PIR (conserved across the genus with the exception of P. falciparum) and PYST/FAM (conserved in all the rodent parasites) proteins suggests a critical role in parasite-host interaction. However, evidence that these proteins are targeted to the erythrocyte surface remains elusive (36,37,46,94-96).
Among the RBC membrane proteins found in the dataset, only the PYST/FAM protein EMAP1 PY17X_0840300 (46) appears to be protective (score 15), possibly due to its high abundance in the RBC membrane sample (Figure 5). However, the lack of PYST orthologues in human Plasmodium precludes its use as a vaccination target. Ten PIR were also found in the dataset.
However, none of these were predicted as protective (scores between 2 and 9.5). This supports the view that PIR are mostly targeted to the parasite periphery or the host cytosol (37,44) and that the immune response against those variant proteins is likely due to the release of PIR proteins after parasite clearance or schizont burst. In addition to EMAP1, the immunomics approaches also identified 12 (V)-H+-ATPase proteins (Figure 5). Among those, three were predicted as protective: the (V)-H+-ATPase subunit A PY17X_1227000 (score 16), the (V)-H+-ATPase subunit E PY17X_0838700 (score 15.5) and the (V)-H+-ATPase subunit B PY17X_1005200 (score 12.5). These proteins are part of a highly conserved, large, membrane-bound multi-subunit complex involved in the regulation of the intracellular pH (97,98). While malaria-encoded vacuolar (V)-H+-ATPases were first reported within the infected RBC upon invasion (99), the recent evidence that (V)-H+-ATPase is also found at the RBC periphery (100) suggests that (V)-H+-ATPase subunits might constitute valuable therapeutic targets.
However, the high conservation of those proteins with the host (V)-H+-ATPases might complicate their use.
The identification of antigens responsible for mediating protective immunity against malaria parasites constitutes a significant challenge in the development of an effective malaria vaccine.
Here, the combination of systemic immunomic screening and a weighted scoring system provides a powerful way to identify key protective antigens. Out of the 69 antigens prioritized with a score ≥12, 63 were shared with P. falciparum and/or P. vivax, increasing the potential value of the refined dataset (Table S3,
A. Flowchart of the immunization protocol used to generate three protective sera and one none-protective serum. Mice were subjected to increasing exposure time to the parasite depending on the immunization protocol used (left to right). The mice able to clear the parasite after the immunizations were challenged 2-4 times with 10^6 P. yoelii in order to boost their immunity. The sera obtained from 10 mice immunized with the same protocol were pooled in equal volumes to constitute the four pooled immune sera used in the course of the study (None-protected, Self-cured, Chloroquine-treated or Attenuated). B. Evaluation of the protection efficiency of the immune sera. Mice (n=5) were inoculated with 500 µl of pooled sera and challenged with 10^6 live P. yoelii. The number of days required to reach a parasitaemia of 1% and the survival of the mice were monitored over 45 days. The results were compared to those observed in mice that received sera from naïve mice.
were systematically selected following five criteria assumed to be associated with protection. The maximal score given to each criterion reflects its assumed importance. From left to right: 1) The antigens enriched in the protective IP samples as compared to the non-protective samples were scored according to their conservation across the protective replicates and the protection efficiency of the cognate immune sera. The score for each antigen is calculated by adding the scores obtained in the protective IP samples. 2) The antigens were scored based on their immunoreactivity pattern. The antigens strongly recognized by the sera with the highest protection efficacy are likely more important for protective immunity. Protective antigens are expected to have a lower immunoreactivity/relative abundance when probed with a less protective serum and a higher immunoreactivity/relative abundance when isolated with a more protective serum. The total score for each antigen is calculated by adding the scores obtained in the protective IP samples. 3) The antigens recognized in 2D western blot by the protective sera but not by the none-protective serum were scored +4. 4) The antigens found enriched in the merozoite stage or in the RBC membrane proteomes were scored +2. 5) The antigens with a predicted signal peptide or TM domain(s) were scored +2.
Table 1. Validation of the antigenicity of the IP dataset and analysis of the literature information pertaining to antigens following prioritization using the weighted scoring system. A. Summary of the screening of the mammalian cell library expressing 90 antigens identified by immunoprecipitation. The immunolabeling patterns observed after incubation with None-protective / protective sera were compiled from the information available in Table S2, antigen validation, and expressed as a percentage calculated based on the 90 antigens included in the library. Not expressible indicates the proportion of antigens for which no staining using anti-tag antibody was detected in the CHO-K1 cells after transfection. Negative staining / positive staining indicate the proportion of antigens for which the absence or the presence of specific staining was observed by immunofluorescence, respectively. B. Compilation of the protection and localization data associated with prioritized antigens. The antigens were grouped into 3 groups based on their likelihood of protection as calculated by the weighted scoring system and analyzed using the information available in Table S3, integrative analysis. From top to bottom: number of parasite antigens found in each group; proportion of antigens with information pertaining to protection in each group. The subcellular localization of the antigens identified in each group was compiled from the information available in Table S1 and expressed as a percentage calculated based on the total number of antigens in each group for which annotations are available.