Demonstration of alternative splicing of a pre-mRNA expressed in the blood stage form of Plasmodium falciparum.

By screening a lambda gt11 library of Plasmodium falciparum genomic DNA with an antiserum raised against a 41-kDa protein band, which was shown to confer protective immunity to monkeys, the phage clone 41-3 was identified. The entire 41-3 gene was isolated, and its coding regions were determined by amplification and sequencing of 41-3 specific mRNA fragments. The 41-3 gene has a complex structure consisting of nine exons, encoding 375 amino acids in total with a calculated molecular weight of 43,400. Provided that the N-terminal hydrophobic residues function as a signal sequence which is cleaved off, the molecular weight of the 41-3 protein decreases to 41,200, and the protein could therefore be considered to be a component of the protective Mr = 41,000 protein band. Indeed, a 41-kDa protein could be detected by Western blot analysis using antisera raised against different recombinant expression products of the 41-3 gene. We furthermore demonstrate an alternative splice process for the mRNA precursor transcribed from the 41-3 gene to yield at least three distinct mRNAs. The major splice product carries all exons E1 to E9, whereas at least two minor 41-3 mRNA species can be identified which show deletions in the region between exons E5 and E7. The possible role of this differential splice process for the parasite is discussed.

Several proteins have been considered to be candidate antigens for a vaccine against the blood stage of Plasmodium falciparum. A 41-kDa protein band was shown to confer protective immunity to Saimiri monkeys (1). Attempts to isolate the genetic information for the corresponding antigen resulted in the isolation of a gene coding for the P. falciparum aldolase which was shown to be the main component of the protective 41-kDa protein band (2, 3). However, the recombinant aldolase failed to protect monkeys (4) which seems to rule out this protein as a vaccine candidate for malaria.
Here we report on the isolation of a gene encoding an additional 41-kDa protein which seems to be a minor component of the protective 41-kDa protein band. This gene has a complex structure, and its transcript is alternatively spliced to at least three different mRNAs encoding different isoforms of a P. falciparum blood stage antigen.
*This work was supported by the Bundesministerium für Forschung und Technologie. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) M59961.

MATERIALS AND METHODS
Preparation of Biological Material-The P. falciparum strain FCBR (Columbia) was cultivated by standard methods, and schizonts were enriched as described previously (5). The preparation of antigens for Western blot analysis as well as the isolation of genomic DNA and poly(A)+ RNA for Southern and Northern blot analysis were performed using described protocols (6, 7).
Construction and Screening of a Genomic Mung Bean Nuclease Library-Mung bean nuclease digestion of DNA from the FCBR strain was performed as described (8). Ten micrograms of genomic DNA was digested at 50 °C for 30 min with 5 units of mung bean nuclease (Biolabs) in 40% formamide containing 50 mM NaCl, 1 mM ZnCl2, 5% glycerin, and 30 mM sodium acetate, pH 5.0. The mung bean nuclease-digested DNA was sized by agarose gel electrophoresis, and the fraction between 0.5 kb and 5.0 kb was eluted by standard methods (9). The sized DNA was ligated to EcoRI adaptors (Pharmacia LKB Biotechnology Inc.), introduced into the EcoRI-digested, phosphatase-treated λgt11 DNA (Pharmacia), and encapsidated using the Gigapack Gold packaging extract (Promega) to produce a library of 3.1 × 10^5 recombinant phages. The screening of this library with the 32P-labeled insert DNA of the λgt11 clone 41-3 revealed one positive phage clone carrying an insert DNA of 2.3 kb which was introduced into the Bluescript vector pKS (Stratagene) to obtain plasmid p41-3MN. The phage clone 41-3 was previously isolated by screening of a λgt11 genomic library prepared from DNA of the P. falciparum strain T9.96 with an antiserum raised against the protective 41-kDa protein band of P. falciparum as has been reported (3, 6).
Cloning of the 5' and 3' Gene Regions-To isolate additional genetic information in the 5' and 3' direction of the insert DNA of plasmid p41-3MN, the inverse polymerase chain reaction (10) was applied essentially as described previously (11). A 5' 300-bp HaeIII-PstI fragment and a 3' 500-bp HindIII-TaqI fragment were used to identify 1.6-kb TaqI and 1.5-kb HinfI genomic fragments by Southern blot analysis, respectively.
To extend the 5' region of the insert DNA of plasmid p41-3MN, TaqI-digested genomic DNA in the range of 1.5-1.7 kb was isolated, ligated, and digested with PstI, which resulted in an inversion of the known sequences to the ends of the 41-3 specific DNA fragment. This DNA probe and oligonucleotides p1 and p2 (Table I) were used for PCR carried out under standard conditions using the Gene-Amp™ kit of Perkin-Elmer Cetus Instruments. The amplified 960-bp fragment was digested with TaqI and EcoRI and introduced between the AccI and EcoRI sites of the Bluescript vector pKS to obtain plasmid p41-3PCR5'.
In order to extend the 3' end of the 41-3 gene, genomic DNA of 1.4-1.6 kb digested with HinfI was isolated, used for the fill-in reaction by the Klenow fragment, self-ligated, digested with TaqI to convert known sequences to the ends of the 41-3 specific DNA fragment, and amplified by PCR using the oligonucleotides p3 and p4 (Table I). The amplified 1210-bp fragment was treated with Klenow enzyme to fill in the sticky ends and subcloned into the SmaI site of the vector pKS to obtain plasmid p41-3PCR3'.
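The inversion step of inverse PCR described above can be illustrated with a toy string model. This is only a sketch, not the authors' protocol: it assumes an idealized blunt cut, ignores restriction-site overhangs and strand orientation, and uses made-up sequences. The function name `inverse_pcr_template` and all sequence values are hypothetical.

```python
def inverse_pcr_template(fragment: str, cut_index: int) -> str:
    """Model self-ligation of a linear fragment into a circle, followed by
    linearisation at an internal site `cut_index` (0-based, idealised blunt cut).

    The circle is opened at the cut, so the sequence downstream of the cut
    comes first and the sequence upstream of the cut comes last.
    """
    if not 0 < cut_index < len(fragment):
        raise ValueError("cut_index must lie strictly inside the fragment")
    return fragment[cut_index:] + fragment[:cut_index]


# Hypothetical genomic restriction fragment: unknown flanks (lower case)
# surround a known region (upper case) that contains the internal cut site.
five_prime_flank = "ttatatta"
known_left = "GGATCA"
known_right = "CTGACC"
three_prime_flank = "aatattat"

fragment = five_prime_flank + known_left + known_right + three_prime_flank
cut = len(five_prime_flank) + len(known_left)  # cut between the two known halves

template = inverse_pcr_template(fragment, cut)
print(template)
```

In the resulting template the known sequences sit at the two ends and the previously unknown flanks lie between them, which is why primers designed from the known region can amplify the flanking DNA.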
DNA Sequencing-The insert DNAs of plasmids p41-3MN, p41-3PCR5', and p41-3PCR3' were sequenced from both strands by the chain termination procedure using the Sequenase system from U.S. Biochemical Corp. Convenient DNA fragments were subcloned into the Bluescript vector pKS, and large fragments were shortened by the exonuclease III/S1 nuclease method according to described techniques (9). The sequencing data obtained were analyzed using the University of Wisconsin Genetics Computer Group (UWGCG) programs (12).
PCR Amplification of 41-3 Specific cDNA or mRNA-Poly(A)+ RNA was prepared as described previously (6, 7) and freed from contaminating genomic DNA, which was not detectable by gel electrophoresis but was detected by PCR, by passage through a Qiagen-tip 5 column as described by the supplier (Diagen, Düsseldorf). One microgram of this purified RNA was converted to single-stranded cDNA by murine leukemia virus reverse transcriptase according to standard procedures (9). The single-stranded cDNA or 1 µg of the poly(A)+ RNA directly (13) was used for PCR performed according to the protocol of the Gene-Amp™ kit of Perkin-Elmer Cetus. The 5' and 3' oligonucleotides used for the reactions were p7 and p9, p6 and p9, p5 and p9, p10 and p11, p10 and p12, p8 and p4, p7 and p4, p5 and p11, as well as p5 and p12, respectively. These oligonucleotides are listed in detail in Table I. The different PCR fragments obtained were digested at the restriction sites given in Table I and introduced into the corresponding sites of the Bluescript vector pKS for DNA sequencing.
Expression of Partial Sequences-The insert DNA of the original λgt11 clone 41-3 was excised by digestion with the restriction enzyme EcoRI and subcloned into the vector pEX31b (14), giving rise to plasmid pEX41-3a.
To express a fragment which extends the original 41-3 sequence, the complete exon 4 region was amplified using the oligonucleotides p13 and p14 (see Table I) and 10 ng of the plasmid DNA p41-3MN. The resulting 340-bp fragment was digested with SacI and HindIII and introduced into the expression vector pEX32b (11), giving rise to plasmid pEX41-3b.
The PCR reaction using poly(A)+ RNA and the oligonucleotides p8 and p4 yielded three 41-3 specific DNA fragments different in size: a 630-bp fragment, a 470-bp fragment, and a 360-bp fragment. These DNA fragments were introduced between the EcoRI and KpnI sites of the vector pEX32c (11), giving rise to plasmids pEX41-3c1, pEX41-3c2, and pEX41-3c3, respectively.
The polymerase chain reaction of poly(A)+ RNA using the oligonucleotides p6 and p9 revealed a 400-bp fragment which was digested with EcoRI and HindIII and introduced into the vector pEX32a (11) giving rise to plasmid pEX41-3d.
The 630-bp and 400-bp insert DNAs of plasmids pEX41-3c1 and pEX41-3d were isolated using the restriction enzymes EcoRI and KpnI or EcoRI and HindIII, respectively. About 100 ng of both DNA fragments and the oligonucleotides p6 and p4 were used for PCR which was carried out under standard conditions. The two fragments were thereby fused into a single 920-bp fragment which was digested with EcoRI and KpnI and introduced into the expression vector pEX32a, giving rise to plasmid pEX41-3e.
The plasmids constructed for expression of 41-3 specific DNA fragments were transformed into the Escherichia coli strain POP 2136 (Stratagene) which was induced to express the malarial antigens as MS2-polymerase fusion proteins by a temperature shift (5).
The expression products were partially purified, and antisera in rabbits were prepared as described previously (5).

RESULTS
Isolation of a λgt11 Phage Clone Recognized by an Antiserum Raised against the Protective 41-kDa Protein Band-As previously described, screening of a genomic λgt11 library from P. falciparum with an antiserum raised against the 41-kDa protein band which was shown to confer protective immunity to Saimiri monkeys (1) yielded 16 phage clones carrying insert DNAs coding for different proteins (3, 7). One of the phage clones, 41-3, was chosen for further characterization.
Isolation of the Complete 41-3 Gene-The isolation of genomic sequences flanking the sequence of the insert DNA of phage clone 41-3 that covers the residues from position 1715 to 1850 is shown in Fig. 1. A λgt11 library prepared from genomic DNA of the P. falciparum strain FCBR digested with mung bean nuclease was screened with the insert DNA of clone 41-3. One phage clone, designated 41-3MN, was isolated which contains a genomic fragment of 2.3 kb covering the nucleotide residues 871 to 3167. In order to extend the genomic sequence in the 5' and 3' direction, a 960-bp fragment and a 1210-bp fragment were amplified by inverse PCR from 1.6-kb TaqI and 1.5-kb HinfI genomic fragments identified by Southern blot analysis, respectively (Fig. 1). The amplified DNA fragments further extend the genomic sequence for 870 bp in the 5' direction and for another 677 bp at the 3' end of the sequence of 41-3MN. The three genomic fragments, designated 41-3MN, 41-3PCR5', and 41-3PCR3', cover 3844 bp of the genomic DNA flanking the original 41-3 sequence. The nucleotide sequences determined from these DNA fragments reveal only short open reading frames (ORFs), suggesting that the 41-3 gene carries several intervening sequences.
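The coordinates quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below only uses numbers stated in the text and assumes 1-based, inclusive coordinates; the variable names are illustrative.

```python
# Coordinates (assumed 1-based, inclusive) of the 41-3MN insert, as stated in the text.
mn_start, mn_end = 871, 3167

# Lengths of new sequence contributed by the inverse PCR products, as stated in the text.
extension_5_prime = 870
extension_3_prime = 677

# Length of the 41-3MN insert (described as a 2.3-kb fragment).
mn_length = mn_end - mn_start + 1

# Total length of the contiguous genomic sequence.
total_length = extension_5_prime + mn_length + extension_3_prime

# Length of the original phage clone 41-3 insert (positions 1715 to 1850).
clone_41_3_length = 1850 - 1715 + 1

print(mn_length, total_length, clone_41_3_length)
```

Under these assumptions the 41-3MN insert is 2297 bp and the three fragments together span the 3844 bp reported, while the original 41-3 insert covers 136 bp.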
Determination of the Coding Regions of the 41-3 Gene-To determine the coding regions of the genomic sequence, two different cDNA libraries were screened, but no 41-3 specific clones could be isolated. Therefore, the polymerase chain reaction was used to amplify 41-3 specific sequences from cDNA or from mRNA directly. For this purpose, the oligonucleotides p4 to p12 (Table I, Fig. 1) were constructed which should correspond to exon regions of the 41-3 gene. As shown in Table II, we succeeded in the amplification of overlapping 41-3 specific mRNA or cDNA fragments using different combinations of oligonucleotides. However, it was not possible to amplify larger cDNA fragments using the oligonucleotides p5 and p11 or p5 and p12, respectively. The sequence data of these isolated cDNA fragments covering 2.27 kb of the genomic sequence reveal a rather complex genomic structure for the 41-3 gene. Nine exons are interrupted by eight short intervening sequences (Figs. 1 and 2).
All eight introns start with GT and end with AG, upstream of which pyrimidine residues are enriched (Table III). This sequence pattern is in agreement with the splice site consensus sequences found at the 5' and 3' boundaries of eukaryotic introns (15). The eight intervening sequences are short, ranging from 81 bp to 208 bp, as described for introns of other P. falciparum genes (16).
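The GT...AG rule with a pyrimidine-enriched stretch upstream of the 3' AG can be expressed as a small check. The following is a minimal sketch, not the analysis used in the paper: the function name `looks_like_intron`, the 15-nt window, and the 60% pyrimidine threshold are arbitrary assumptions chosen for illustration, and the example sequence is invented.

```python
def looks_like_intron(seq: str, window: int = 15, min_pyr_fraction: float = 0.6) -> bool:
    """Return True if `seq` starts with GT, ends with AG, and the `window`
    nucleotides immediately upstream of the terminal AG are pyrimidine-rich.

    `window` and `min_pyr_fraction` are illustrative assumptions, not values
    taken from the paper.
    """
    s = seq.upper()
    if len(s) < window + 4:
        return False
    if not (s.startswith("GT") and s.endswith("AG")):
        return False
    tract = s[-(window + 2):-2]  # the window directly upstream of the AG
    pyrimidines = sum(base in "CT" for base in tract)
    return pyrimidines / len(tract) >= min_pyr_fraction


# Invented 83-nt example: GT start, A-rich body, pyrimidine tract, AG end.
toy_intron = "GTAAGT" + "A" * 60 + "TTTCTTTTTTTCTTT" + "AG"
print(len(toy_intron), looks_like_intron(toy_intron))
```

A sequence that lacks the GT start or the AG end, or whose 3' region is purine-rich, is rejected by the same function.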
Since it was not possible to amplify mRNA fragments further in the 5' direction, the ATG start codon is proposed for positions 908 to 910, which is the only ATG codon found in this region. This is supported by the following facts. Upstream, the most proximal ATG codon which is found at residues 779 to 781 is immediately followed by a stop codon.
A TAA stop codon which is situated 10 triplets upstream to the proposed initiation codon additionally confirms the predicted translation start position. In this region, there is no evidence for another intron as no further 3' splice site consensus sequence was found. Furthermore, the 4 nucleotides preceding the initiation codon are in agreement with the 5' consensus sequence 5' AAAA 3' determined from 22 P. falciparum sequences (17).
At the 3' end, there is no evidence either for additional exons. The TAG stop codon in position 3040 to 3042 was also shown to terminate the coding region for a sequence amplified from mRNA using the oligonucleotides p10 and p12. Furthermore, the 3' noncoding region ranging from position 3040 to 3844 is very rich in A + T (87.5%), a value which is in agreement with the A + T content of noncoding regions from other P. falciparum genes (18).
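An A + T percentage such as the one quoted for the 3' noncoding region is a simple base count. The sketch below shows the calculation on an invented 8-nt sequence; the function name `at_content` is hypothetical and the real 41-3 sequence is not reproduced here.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T among the unambiguous bases of `seq`.

    Characters other than A, C, G, and T (for example N or gaps) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * sum(b in "AT" for b in bases) / len(bases)


# Invented example: 7 of 8 bases are A or T.
toy_sequence = "AATTAATG"
print(at_content(toy_sequence))
```

Applied to the region from position 3040 to 3844 of the deposited sequence, the same count would be expected to reproduce the reported value.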
[Table II: Exons resulting from PCR on cDNA or mRNA using the oligonucleotides indicated in Table I.]
At the 5' end, 537 bp upstream to the proposed initiation codon of the 41-3 gene, the isolated genomic sequence carries a further ORF coding for 123 amino acids which include 18 C-terminal serine-rich degenerate tetrapeptides based on the sequence DSQS (Fig. 2). An ORF is also located at the 3' end 575 bp downstream to the stop codon of the 41-3 gene which codes for 75 amino acids starting with a hydrophobic region of 17 residues. A 560-bp AccI fragment covering the 5' ORF and a 295-bp DraI-HinfI fragment containing the 3' ORF (compare Fig. 1) were prepared for Northern blot analysis.
Using both probes, we were not able to detect specific mRNAs (results not shown) and therefore could not assign the ORFs to expressed genes. However, the results also indicate that these sequences do not belong to the 41-3 gene. The deduced amino acid sequence of the nine exons was determined (Fig. 2). The N terminus starts with a hydrophobic sequence of 22 amino acids, including all 4 cysteine residues of the protein, which might function as a signal sequence. Assuming that the leader sequence is cleaved off, the molecular weight of the 41-3 protein decreases to 41,200. The amino acid sequence of the 41-3 protein was compared with the NBRF and Swissprot protein data banks: no significant homology with published protein sequences was found. The 41-3 antigen is a very hydrophilic protein enriched in Glu (11.7%) and Lys (10.7%) residues with regions of high surface probability. Four putative N-glycosylation sites were found.
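The molecular weight figures can be cross-checked against each other. The sketch below uses only numbers stated in the paper (375 residues, 43,400 for the full-length product, 41,200 after removal of the leader, a 22-residue hydrophobic N terminus) and assumes that the cleaved leader corresponds exactly to those 22 residues, which the text proposes but does not demonstrate.

```python
# Values stated in the paper.
total_residues = 375
full_length_mw = 43_400
mature_mw = 41_200

# Assumption: the cleaved leader is exactly the 22-residue hydrophobic N terminus.
signal_residues = 22

# Mass attributed to the leader and the implied mean mass per leader residue.
signal_mw = full_length_mw - mature_mw
mean_signal_residue_mw = signal_mw / signal_residues

# Mean residue mass over the whole protein, for comparison.
mean_residue_mw = full_length_mw / total_residues

# Length of the coding region in nucleotides, excluding the stop codon.
coding_bp = total_residues * 3

print(signal_mw, mean_signal_residue_mw, round(mean_residue_mw, 1), coding_bp)
```

The implied mean of 100 Da per leader residue is somewhat below the whole-protein mean of about 116 Da, which is plausible for a stretch rich in small hydrophobic residues, and 375 codons correspond to the 1125-bp coding region given in the Discussion.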
The 41-3 Gene Gives Rise to Different mRNAs-Chromosomal DNA from the P. falciparum strain FCBR was digested with different enzymes known to have restriction sites on the 41-3 gene for Southern blot analysis with a 32P-labeled 1040-bp PstI-HindIII fragment containing the exons E3 to E6. The pattern obtained (results not shown) corresponds to that expected for a single gene in the P. falciparum genome.
Using the oligonucleotides p13 and p14 (Table I) for PCR, a 340-bp fragment was obtained containing the largest exon, E4, of the 41-3 gene. This probe detects a low copy mRNA of 2.4 kb and two minor bands of 2.1 kb and 1.4 kb by Northern blot analysis (Fig. 3).
Alternative mRNA Splicing of the 41-3 Gene-Using the oligonucleotides p8 and p4 for PCR on mRNA, three distinct fragments could be isolated and were shown to originate from the 41-3 gene by DNA sequencing (Table II and Fig. 4A). The predominant fragment is 630 bp in size and carries the exons E4 to E8; a distinct minor amplified fragment of 360 bp [...] exon E5 and the second containing exon E7, were spliced out.
The DNA shown in Fig. 4A was blotted onto a membrane and hybridized with a 1040-bp PstI-HindIII fragment. The Southern blot (Fig. 4B) reveals the three splice products described above and additionally a DNA fragment of 580 bp which may represent a further splice product. A further PCR reaction performed with the oligonucleotides p7 and p4 yielded two DNA bands of 760 bp and 845 bp visible on ethidium bromide-stained gels (Table II). The sequencing data of these fragments revealed that both contain the exon regions E3 to E8, but the larger one additionally carries the intervening sequence between exons E3 and E4. No differential splice products could be observed for exons E1 to E4 as shown by PCR on mRNA using oligonucleotides p5 and p9 (Table II).
Identification of the Protein(s) Encoded by the 41-3 Gene-In order to identify one or more isoforms of the 41-3 protein encoded by the predominant splice products of the 41-3 precursor mRNA, rabbit antisera were raised against the different fusion proteins encoded by the vectors pEX41-3a, -b, -c1, -c2, -c3, -d, and -e. Table IV shows the regions of the 41-3 protein expressed in the pEX vectors as MS2-polymerase fusion proteins. Using antisera against the different fusion proteins, no single malarial antigen could be detected specifically by Western blot analysis of schizont proteins. The antisera tested show a complex banding pattern with some typical bands of 110 kDa, 70 kDa, and 41 kDa (Fig. 5). The latter band migrates at the same position as the 41-kDa aldolase of P. falciparum (Fig. 5e). A protein of this size is in agreement with the molecular size deduced from the amino acid sequence. Proteins above 50 kDa are not expected from the deduced amino acid sequence of the 41-3 protein, and therefore may be polypeptides cross-reacting with the 41-3 specific antisera. Proteins smaller than 41 kDa may be either cross-reacting antigens, isoforms of the 41-3 protein, or processing products thereof.
[Fig. 5 legend, partial: antisera against the fusion proteins encoded by pEX41-3b (a), pEX41-3c1 (b), pEX41-3c2 (c), pEX41-3c3 (d), pEX41-3e, or against a MS2-...]

DISCUSSION
A 41-kDa protein band was shown to confer protective immunity to monkeys (1). As the main component of this protein band, the P. falciparum aldolase has been identified (2, 3), which was expressed in E. coli and tested for its potential as a vaccine candidate. However, the recombinant aldolase was not able to protect monkeys from a challenge infection with P. falciparum parasites (4).
Here we report on the isolation of a P. falciparum gene which seems to encode another constituent of the protective 41-kDa protein band. The molecular weight calculated from the deduced amino acid sequence amounts to 41.2 kDa, assuming that a hydrophobic N-terminal region functions as a leader sequence and is cleaved off. Indeed, among other polypeptides, a protein that migrates at the same position as the parasite aldolase was detected by Western blot analysis using antisera raised against different expression products of the 41-3 protein (Fig. 5). Therefore, the 41-3 protein might well be a component of the protective 41-kDa protein band and could be a candidate for the development of an anti-blood stage malaria vaccine. The biological function of the 41-3 protein is as yet unknown. No significant homology was observed to known protein sequences. However, the 41-3 protein seems to be a soluble antigen exported from the parasite. Indeed, antisera raised against a fusion protein expressing exons E2 to E8 detect an antigen localized mainly within the erythrocyte cytoplasm as shown by immunoelectron microscopy (results not shown). The only very hydrophobic region of the 41-3 protein is localized at the N terminus and probably functions as a signal sequence. Most of the residual sequence is highly hydrophilic.
In contrast to the P. falciparum aldolase, the 41-3 protein is expressed at a low level as demonstrated by the different exposure times necessary for the detection of the aldolase or the 41-3 specific mRNAs on Northern blots. The aldolase mRNA was detected within 1 h, whereas the 41-3 specific mRNA needed an exposure time of several weeks using nucleotide probes of similar specific radioactivity. Therefore, not only the mRNA coding for the 41-3 protein but also the protein itself should be a very minor component of P. falciparum blood stages. The low abundance of the 41-3 protein may explain the difficulty of specifically detecting this antigen by Western blot analysis.
Northern blot analysis using a 41-3 specific probe reveals three mRNA bands of 2.4 kb, 2.1 kb, and 1.4 kb, respectively. The 5' region (position 1 to 570) of the isolated sequence as well as the 3' region (position 3550 to 3844) were not identified as part of the 41-3 specific mRNA(s) by Northern blot analysis using the corresponding fragments as probes. As both fragments carry ORFs coding for 123 and 75 amino acids, respectively, these may represent parts of adjacent genes. The complete coding region of the 41-3 gene spans only 1125 bp, and the regions upstream and downstream of the coding region to the 5' and 3' ORFs are 537 bp and 578 bp in size, respectively. As these sequences should also carry the noncoding regions of the flanking genes as well as the promoter regions, we conclude that the 1.4-kb mRNA is the predominant transcript of the 41-3 gene. The 2.4-kb and 2.1-kb mRNAs do not originate from homologous genes as no homology was observed by Southern blot analysis, but they may represent highly abundant mRNAs cross-hybridizing with the 41-3 specific probe.
The 41-3 protein is encoded by a gene with a complex structure in which the coding region is interrupted by eight intervening sequences. All the introns have a high A + T content and are very small in size ranging from 81 bp to 208 bp similarly as was reported for intervening sequences of other P. falciparum genes (16,18). Up to now, no further P. falciparum gene has been isolated carrying more than four exons. Genes containing either four or three exons have been described coding for the serine stretch protein SERP (6) as well as its homologue SERP H (11) and the exp-1 antigen (19), respectively. All other P. falciparum genes described so far either contain a continuous coding sequence or carry just one single intron (16).
The complex exon-intron structure gives rise to a feature so far unique for P. falciparum genes: the antigen encoded is variable due to differential splicing of the precursor mRNA. By PCR on mRNA using the oligonucleotides p8 and p4, three different splice products which appear in different abundance within schizonts could be identified containing either the exons E4 to E8, E4 and E8, or E4, E6, and E8, respectively. The occurrence of further splice products cannot be excluded as four DNA bands were detected by Southern blot analysis of PCR amplified mRNA sequences specific for the exon regions E4 to E8. No alternative splice products were detected for the exon regions E1 to E4 and E8 + E9 so that differential splicing seems to occur only at exons E5, E6, and E7. These data suggest that the 41-3 precursor mRNA is spliced to at least three different products encoding different isoforms of a P. falciparum blood stage antigen assembled as shown in Fig. 6. A mRNA species of low abundance could be detected which may carry the exons E1 to E4, E6, E8, and E9, whereas a more abundant mRNA species has deleted exons E5 to E7. The predominant mRNA species carries all the exons E1 to E9 specifying a 41.2-kDa antigen, the only isoform encoded by the 41-3 gene which unambiguously could be identified by Western blot analysis of schizont proteins using antisera prepared against different fusion proteins.
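The three p8/p4 products can be related to one another by their exon content. The sketch below assumes that the 470-bp fragment is the E4, E6, E8 product and the 360-bp fragment the E4, E8 product; this assignment is an inference from the nesting of the exon sets (a product with more exons must be longer), not an explicit statement in the text. The fragment sizes are gel estimates, so the derived lengths are approximate.

```python
# Exons lying between primers p8 and p4.
REGION_EXONS = ("E4", "E5", "E6", "E7", "E8")

# Fragment size in bp -> exon content. The 470/360 assignment is inferred
# from the nesting of the exon sets, not stated explicitly in the text.
products = {
    630: ("E4", "E5", "E6", "E7", "E8"),
    470: ("E4", "E6", "E8"),
    360: ("E4", "E8"),
}


def skipped_exons(exons):
    """Return the exons of the p8-p4 region that are missing from a product."""
    return tuple(e for e in REGION_EXONS if e not in exons)


# Size differences between nested products estimate the skipped sequence.
approx_e6_bp = 470 - 360          # E6 alone
approx_e5_plus_e7_bp = 630 - 470  # E5 and E7 together
approx_e5_to_e7_bp = 630 - 360    # E5, E6, and E7 together

for size, exons in sorted(products.items(), reverse=True):
    print(size, exons, "skipped:", skipped_exons(exons))
print(approx_e6_bp, approx_e5_plus_e7_bp, approx_e5_to_e7_bp)
```

Under these assumptions exon E6 would be roughly 110 bp long, and exons E5 and E7 together roughly 160 bp.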
Furthermore, by PCR amplification of mRNA, one product was observed which carries in addition to the exons E3 to E8 the intervening sequence between exons E3 and E4. This product may represent an intermediate of one of the spliced mRNAs. Another possibility is that this product originates from a spliced mRNA which carries the intervening sequence between exons E3 and E4 as part of the 5' noncoding region. This mRNA should initiate translation at the second or third codon of exon 4. To our knowledge, the 41-3 gene is the first parasite gene for which an alternative splice process has been observed. Differential splicing has been described for many genes of higher eukaryotes encoding hormones like calcitonin (20), enzymes like phosphofructokinase (21) or decarboxylase (22), receptors like the dopamine D2 receptor (23, 24) or the insulin receptor (25), growth factors like the insulin-like growth factor II (26) or the nerve growth factor (27), cytochrome P-450 (28), the cell-surface glycoprotein fibronectin (29, 30), or structural proteins like collagen (31), myosin (32), troponin (33), or tropomyosin (34). In higher eukaryotes, the same transcript can be processed differentially in different tissues to yield mRNAs with different coding potentials, which is the case for most of the mentioned proteins (20-28, 32, 34). Alternatively, the pre-mRNA is differentially spliced at different developmental stages of the same tissue (33). Considering both aspects, the protozoan parasite P. falciparum could use differential splicing only to express different processing products of the same gene during different developmental stages, a hypothesis which needs further investigation.
We have identified at least three different splice products of the 41-3 precursor mRNA expressed in vitro during only one developmental stage, the late blood stage form of P. falciparum. Expression of different splice products in one tissue was also reported for the LC1 and LC3 forms of the myosin light chain found in skeletal muscle (32) or for the insulin receptor expressed as two different forms in liver exhibiting different affinities (25).
We can only speculate on the possible biological function of the mechanism creating different isoforms of one antigen for the parasite. Different but very similar proteins may play a role in evasion of the host immune response. Among the isoforms produced, only one minor expressed form may have an essential function for the parasite. The residual molecules could divert the immune response of the host, thus protecting the essential protein by an excess of nonessential homologous proteins. Alternatively, some exons may encode epitopes which may be spliced out upon immunological pressure. This hypothesis could be tested by epidemiological studies which might yield not only a distribution of the described splice products different to the in vitro situation reported here, but