Primary Structure of a Plasmodium falciparum Malaria Antigen Located at the Merozoite Surface and within the Parasitophorous Vacuole


DNA encoding an antigen of 101,000 apparent molecular weight from the human malaria parasite Plasmodium falciparum was cloned and sequenced. Genomic DNA from the Camp strain covering the complete coding region along with cDNA from the FCR3 strain covering 81% of the coding region were obtained. The cloned DNA specified a full-length protein of 743 amino acids which included two tandemly repeated regions, one near the amino terminus containing eight hexapeptide repeats of sequence :NDEED, and the second near the carboxyl terminus containing primarily KE and KEE repeats. The latter repeated region is encoded by a 174-base stretch of mRNA containing only a single pyrimidine. Except for a putative leader sequence located at the amino terminus of the protein, the protein is hydrophilic and highly charged with a calculated isoelectric point of 5.6. Sequences from the Camp and FCR3 strains are very close and are also nearly identical to the partial cDNA sequence of the acidic basic repeated antigen (ABRA) protein from the FC27 strain (Stahl, H. D., Bianco, A. E., Crewther, R. F., Anders, R. F., Kyne, A. P., Coppel, R. L., Mitchell, G. F., Kemp, D. J., and Brown, G. V. (1986) Mol. Biol. Med. 3, 351-368). ABRA was previously shown to be located at the merozoite surface and in the parasitophorous vacuole. Because of its location and because it becomes complexed to merozoites when schizonts rupture in the presence of immune serum, ABRA is a candidate component of a malaria vaccine.
Malaria is still an enormous health problem in the tropical regions of the world. A malaria vaccine is urgently needed to augment the effectiveness of antimalarial drugs and insecticides. Because human malaria parasites cannot be obtained in sufficient quantity and purity to prepare vaccines directly, malaria vaccine development has relied on cloning and sequencing genes encoding parasite antigens (1).
Unlike malaria sporozoites, which are coated with a single immunodominant protein, malaria blood stage merozoites express many antigens which are recognized by serum from immune individuals. Among the best blood stage vaccine candidates are antigens which appear to be located at the merozoite surface and are enriched in immune clusters of merozoites that form when schizonts rupture in the presence of immune serum (2-4). In this report, we describe the primary structure of one of these antigens, previously called p101 because it migrates on SDS-polyacrylamide gels with an apparent molecular weight of 101,000 (4).
* This work was supported in part by the Marshfield Clinic and by National Institutes of Health Grant 1 U41 RR-01685-02 to BIONET. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBank/EMBL Data Bank with accession number(s) J03902.
Comparison of the p101 sequence to a partial cDNA sequence reported by Stahl and co-workers (5) revealed that p101 was identical to the acidic basic repeat antigen (ABRA). To avoid confusion, we have used ABRA to describe the p101 antigen throughout this paper.

EXPERIMENTAL PROCEDURES
Culture of Parasites-Cloned Camp strain Plasmodium falciparum parasites (Malaysia) were cultured and synchronized by double sorbitol treatment as described (6). Parasites were harvested by banding the schizont-infected erythrocytes on Percoll gradients (7).
Gene Cloning and Sequencing-Construction of the genomic DNA expression library using mung bean nuclease and screening of the library with monkey immune serum have been described (8,9). The EcoRI/TaqI fragment from the 5' end of clone a95 (Fig. 1) was eluted from a preparative agarose gel using the GENECLEAN system (Bio 101, La Jolla, CA) and subcloned between the EcoRI and ClaI sites of pBR322. After amplification of the plasmid DNA and reisolation of the insert, the fragment was nick-translated and hybridized (10) to Southern blots containing Camp strain genomic DNA digested with various restriction enzymes. In this way, genomic restriction enzyme sites were mapped within and about the ABRA gene. The same parasite DNA fragment was then used to screen an FCR3 strain (West Africa) cDNA library (11) and also to screen a minilibrary comprising Camp strain genomic DNA EcoRI/BglII fragments, 3.5-3.9 kilobases in length, ligated into pUC13 (see Fig. 1). A portion of the resulting ABRA cDNA clone from the 5' end of the clone to the EcoRI site was then isolated, labeled, and used to screen a minilibrary containing 2.8-3.2 kilobase TaqI/EcoRI genomic fragments ligated into pUC13.
Complete sequences of both strands of the Camp genomic DNA were determined by the Sanger dideoxy chain termination method using the Klenow fragment of DNA polymerase I and [35S]dATP (8). Subclones for sequencing were produced using T4 DNA polymerase as an exonuclease (12) or by subcloning restriction enzyme fragments into M13. Gaps were filled by using 17-18 base synthetic oligodeoxynucleotides as primers. The region around the polypurine/polypyrimidine segment (bp 2020-2193) required that reactions be carried out with deaza-dGTP at 50 °C (13). About 95% of both strands of the FCR3 cDNA clone were similarly sequenced; in the two regions where only one strand was sequenced, the FCR3 sequence was identical to the Camp sequence. Sequences were analyzed with software from IBI (New Haven, CT) and BIONET (Intelligenetics, Mountain View, CA).
The abbreviations used are: SDS, sodium dodecyl sulfate; ABRA, acidic basic repeat antigen; bp, base pairs; HEPES, 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid.

Immunoblotting Experiments with Parasite Antigens-Schizont antigens were fractionated by electrophoresis on SDS-polyacrylamide gels and electroblotted onto nitrocellulose filters (4). Filter strips were reacted with various antibody preparations, and after washing, the bound antibodies were detected using 125I-labeled secondary antibodies (9).
ABRA was purified from [3H]isoleucine-labeled P. falciparum schizonts cultured with protease inhibitors and extracted with 1% Triton X-100 (14). Antigens from 10^9 parasites were loaded onto a 3.5-ml column of monoclonal antibody 3D5 (4) coupled to Affi-Gel 10 and washed with phosphate-buffered saline, pH 7.4. Bound ABRA was eluted with 0.1 M sodium citrate, pH 2.8, and neutralized with HEPES (85 mM final concentration) and NaOH. All buffers contained chymostatin (100 μg/ml) to inhibit proteolysis. As assessed by laser densitometry of autoradiographs after SDS-polyacrylamide gel electrophoresis, ABRA accounted for 4% of the total metabolically labeled antigens applied to the affinity column and was the major metabolically labeled antigen eluted from the column. A 2-ml aliquot containing 25% of the eluted radioactive antigen, or a control containing 2 ml of bovine serum albumin (10 μg/ml), was incubated overnight with a 1-inch diameter 0.2-μm nitrocellulose filter; more than 90% of the radioactivity bound to the filter. The filters were then blocked with phosphate-buffered saline containing 3% bovine serum albumin and 0.3% Tween 20 and used to affinity purify antibodies (9).
Antibodies recognizing expression fusion proteins from λgt11 lysogens of both clone a95 and a circumsporozoite protein clone (15) were affinity-purified from monkey immune serum as described (9).
A synthetic peptide with sequence NVVPPTQSKKKNKNETVSGMDE based on the deduced amino acid sequence of ABRA (amino acids 516-537) was synthesized by Peninsula Laboratories, Inc. (Belmont, CA). The peptide was coupled to keyhole limpet hemocyanin using 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide hydrochloride (16). Mice were immunized subcutaneously with the peptide carrier emulsified in Freund's complete adjuvant and boosted intraperitoneally with the peptide carrier in saline.
Immunoblotting Experiments with Fusion Proteins-Escherichia coli strain Y1089 was lysogenized (17) with λgt11 bacteriophage without an insert or with the ABRA gene insert (clone a95). Lysogens were grown at 32 °C in nutrient broth supplemented with 50 μg/ml ampicillin until the absorbance at 550 nm reached 0.5. The cultures were then incubated in a 44 °C water bath for 20 min before adding 5% volume of 100 mM isopropyl-β-D-thiogalactopyranoside or water and shaking for an additional 60 min at 37 °C. E. coli cells were pelleted by centrifugation at 12,000 × g for 15 min. Pellets were resuspended in 2% culture volume of 50 mM Tris (pH 8.0), 150 mM NaCl, 0.2 mM phenylmethylsulfonyl fluoride, and the cells were lysed by two cycles of freezing and thawing. Lysates were diluted with SDS sample buffer, and protein from 100 μl of the original culture was loaded into each lane of an SDS-polyacrylamide gel. Electrophoresis, electroblotting, and reaction with antibodies were as described above.

RESULTS AND DISCUSSION
Immune monkey serum was used to screen a P. falciparum Camp strain genomic DNA λgt11 expression library with the result that about 150 positive clones were identified (9,18). Antibodies recognizing expression proteins from individual clones were affinity-purified from the immune serum and used to probe immunoblots of schizont antigens. One clone, a95, was identified as encoding at least a portion of ABRA. Through immunoblot, hybridization, and sequence analysis, clone a95 was found to include approximately the 3'-half of the ABRA gene coupled to an interspersed repetitive element (Fig. 1A). The ABRA gene and the repetitive element are not adjacent to each other on the parasite chromosome but rather were joined through an E. coli recombination event (19). Overlapping portions of the Camp strain genomic ABRA DNA and an FCR3 strain cDNA clone were then isolated by hybridization to the ABRA portion of a95 (Fig. 1A).
Evidence that the cloned DNA encodes ABRA came from reaction of schizont antigens with four antibody preparations: antibodies affinity-purified from total immune serum using the clone a95 expression protein; antibodies affinity-purified using ABRA isolated from parasites; monoclonal antibody 3D5, which recognizes ABRA (4); and mouse antiserum against a synthetic peptide based on the deduced amino acid sequence of ABRA. All four antibody preparations detected a protein of identical electrophoretic mobility on the schizont antigen immunoblots (Fig. 2).
Monoclonal 3D5 and anti-ABRA antibodies also recognized the β-galactosidase fusion protein expressed by clone a95 (Fig. 3). Sequencing of the 5' end of the clone a95 insert showed that the ABRA gene was in frame with the vector β-galactosidase gene. The apparent molecular weight of the fusion protein detected on the immunoblots was also consistent with the size of the peptide encoded by the segment of the gene from the EcoRI site to the stop codon. However, the fusion protein is apparently unstable in E. coli as it is not detectable by Coomassie Blue staining, and several protein fragments, smaller than the fusion protein, are recognized by the anti-β-galactosidase and anti-101 antibodies (Fig. 3).
Fig. 3 (legend fragment). Molecular weight markers are indicated on the left; the position of the ABRA-β-galactosidase fusion protein is indicated by the arrow on the right. Note that monoclonal antibody 3D5 strongly recognizes a bacterial protein in addition to ABRA.
A composite gene sequence from the overlapping Camp strain clones is shown in Fig. 4.
The first methionine codon is located 48 bp after the start of the ABRA open reading frame. This appears to be the start site for ABRA translation as there are no significant open reading frames preceding the main open reading frame nor are there sequences which match the conserved malaria intron/exon boundary sequences (21). The ABRA protein sequence begins with a probable signal peptide containing an 11-amino acid hydrophobic core (underlined in Fig. 4) and a likely signal peptidase cleavage site after amino acid 22 (cysteine) (22). ABRA has two regions of tandem peptide repeats (hatched boxes in Fig. 1B). Eight hexapeptide repeats of sequence ;NDDED are encoded by bp 676-819. Near the carboxyl terminus, tandem repeats mostly of sequence KE and KEE are encoded by a polypurine segment of mRNA.
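The coordinates quoted for the repeat regions can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the coordinates are assumed to be 1-based and inclusive.

```python
def span_to_repeats(start_bp, end_bp, residues_per_repeat):
    """Return (length in bp, codons, whole repeats) for a 1-based inclusive span."""
    length_bp = end_bp - start_bp + 1
    codons, remainder = divmod(length_bp, 3)
    if remainder:
        raise ValueError("span is not a whole number of codons")
    return length_bp, codons, codons // residues_per_repeat


# Hexapeptide repeat region, bp 676-819
print(span_to_repeats(676, 819, 6))  # (144, 48, 8)
```

Under these assumptions, bp 676-819 spans 144 bp, or 48 codons, which is exactly eight hexapeptides. Applied to bp 2020-2193, the same function gives 174 bp (58 codons), matching the stated length of the polypurine segment.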
The protein sequence deduced from the long open reading frame has a calculated molecular weight of 86,595 (83,931 for the mature protein), which is considerably less than the apparent molecular weight of 101,000-102,000 estimated from mobility on SDS-polyacrylamide gels (4,5). True molecular weights for malaria antigens have repeatedly been found to be lower than the apparent molecular weights (15,23,24), although the explanation is unknown. There are nine potential N-glycosylation sites of sequence N-X-T (X stands for any amino acid) within ABRA, but we have been unable to detect [3H]glucosamine incorporation into ABRA, and, therefore, the high apparent molecular weight is probably not due to glycosylation.
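A scan for potential N-glycosylation sites of the form N-X-T can be sketched as follows. This is a minimal illustration, not the authors' software: the function name and the toy sequence are hypothetical, and the pattern follows the definition used in the text (X is any amino acid). The commonly used sequon definition is narrower in one respect and broader in another (N-X-S/T, with X not proline), so counts may differ under that rule.

```python
import re

# A lookahead is used so that overlapping sites (e.g. "NNTT") are all reported.
NXT = re.compile(r"(?=N[A-Z]T)")


def find_nxt_sites(protein):
    """Return the 1-based position of the Asn in every N-X-T tripeptide."""
    return [m.start() + 1 for m in NXT.finditer(protein.upper())]


toy = "MKNVTAANNTTE"  # hypothetical sequence, NOT the ABRA sequence
print(find_nxt_sites(toy))  # [3, 8, 9]
```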
ABRA is a hydrophilic (Fig. 1C) and acidic protein. The calculated pI from the amino acid sequence of the mature protein is 5.6, in agreement with the acidic mobility of ABRA on two-dimensional protein gels. There are six cysteine residues in ABRA; all are located near the NH2 terminus (Fig. 1C). Except for the putative signal peptide, there are no protein segments which could obviously anchor ABRA to a membrane. This is consistent with the observation that ABRA is an exoantigen secreted into the parasitophorous vacuolar space (4,5). The absence of [3H]myristic acid incorporation into ABRA suggests that ABRA is not coupled to lipid after translation, as has been found for gp195, the precursor to several merozoite surface proteins (25).
Searches of version 48 of the GenBank and version 10 of the NBRF data bases revealed no sequences homologous to ABRA. A few proteins from the NBRF data base, notably the yeast heat shock protein 90 (26) and the mammalian neurofilament L protein (27), had short regions rich in lysine and especially glutamic acid, but outside of these regions there was no obvious sequence similarity to ABRA.
ABRA appears to be a conserved malaria antigen. The Camp and FCR3 strains are known to differ significantly in the molecular weights of antigens recognized by growth inhibitory immune serum (28,29), and in the sequence of a 126,000 molecular weight parasite antigen (4,18). In contrast, the sequence of a 1.8-kb FCR3 strain cDNA clone revealed only four differences from the Camp strain genomic sequence (Fig. 4). One was the deletion of an adenine nucleotide within a stretch of 10 adenines (bp 583-592). Deletions or insertions of adenines in runs of adenines have been seen previously in cDNA clones of malaria antigens (30,31). The other three differences resulted in amino acid changes (N to S at bp 935, E to Q at bp 1078, and D to N at bp 1126). Likewise, the partial FC27 strain (Papua New Guinea) ABRA cDNA sequence (5) is almost identical to the Camp sequence with the only differences being an extra base near the 5' end of the FC27 clone and some rearrangements within the carboxyl terminal repeat region. Additionally, the mobility of ABRA on SDS-polyacrylamide gels is nearly invariant for seven different parasite strains (4,5).
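A strain-to-strain comparison of this kind amounts to listing the mismatched columns of two aligned sequences. The sketch below assumes the sequences have already been aligned to equal length, with "-" marking a deleted base. The function name and both toy sequences are hypothetical and are not taken from the Camp or FCR3 data.

```python
def list_differences(ref, other):
    """Return (1-based position, ref base, other base) for each mismatched column."""
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(ref, other), start=1)
        if a != b
    ]


# Hypothetical aligned fragments: one substitution and one deletion in a run of A's
ref_toy = "GATAAAAATGAA"
alt_toy = "AATAAAA-TGAA"
print(list_differences(ref_toy, alt_toy))  # [(1, 'G', 'A'), (8, 'A', '-')]
```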
The most unusual feature of the ABRA gene is the 174-bp polypurine segment (bp 2020-2193). Polypurine-polypyrimidine segments are common in eukaryotes, but previous examples have all been outside of coding regions (32-34). The same DNA mechanisms which generate the polypurine-polypyrimidine segments in other organisms may have generated the analogous region in the ABRA gene, but selection for amino acid sequence must also have been involved since several codons, especially GAG (glutamic acid) and AGA (arginine), which would have been found in a random polypurine region, are not found within the ABRA repeat region. The function of this portion of the ABRA molecule is unknown. An immunological decoy or smokescreen role (35) for this region seems unlikely since human immune serum was found to not react with a synthetic peptide containing KE and KEE repeats (5).
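The codon argument above can be made concrete by enumerating the eight codons that contain only purines. Under the standard genetic code they specify just four amino acids (K, R, E, and G), so a random polypurine stretch would be expected to use all eight. The sketch below is illustrative: the helper names are hypothetical, and the toy sequence is an invented KE/KEE-like stretch, not the ABRA sequence.

```python
from itertools import product

# Standard genetic code entries for the eight codons built only from A and G.
PURINE_CODONS = {
    "AAA": "K", "AAG": "K", "AGA": "R", "AGG": "R",
    "GAA": "E", "GAG": "E", "GGA": "G", "GGG": "G",
}


def all_purine_codons():
    """Enumerate every codon that can occur in a pure A/G sequence."""
    return ["".join(c) for c in product("AG", repeat=3)]


def count_pyrimidines(seq):
    """Count C, T and U bases in a DNA or RNA string."""
    seq = seq.upper()
    return sum(seq.count(base) for base in "CTU")


def split_codons(seq):
    """Split a sequence into frame-0 codons, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def translate_purine(seq):
    """Translate a pure A/G sequence; raises KeyError on any other codon."""
    return "".join(PURINE_CODONS[c] for c in split_codons(seq.upper()))


def unused_purine_codons(seq):
    """List the all-purine codons that never appear in frame 0 of seq."""
    used = set(split_codons(seq.upper()))
    return [c for c in all_purine_codons() if c not in used]


toy = "AAAGAAAAGGAAGAA"  # hypothetical polypurine stretch, NOT the ABRA sequence
print(translate_purine(toy))      # KEKEE
print(count_pyrimidines(toy))     # 0
print(unused_purine_codons(toy))  # ['AGA', 'AGG', 'GAG', 'GGA', 'GGG']
```

Running `unused_purine_codons` on a real repeat region would show which of the eight codons are absent, which is the observation the text uses to argue for selection on amino acid sequence.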
The ABRA gene maps to one of the largest P. falciparum chromosomes (36).
ABRA is found within the parasitophorous vacuole and is released from the parasite by extraction with Triton X-100 or through natural schizont rupture (4,5). On this basis, Stahl and co-workers (5) concluded that ABRA probably functions as an immunological smokescreen, helping to prevent the host from mounting a strong immune response against antigens which are crucial to parasite survival. However, ABRA has also been located at the schizont and merozoite surface, and the protein is found in clusters of merozoites that have been aggregated with immune serum (4), suggesting that ABRA functions as more than just a smokescreen. ABRA may bind to other integral membrane proteins on the parasite surface (42,43) and could become a component of a malaria vaccine. Determination of the complete primary structure of ABRA has made rigorous testing of these alternatives possible.