Distinct genetic difference between the Duffy binding protein (PkDBPαII) of Plasmodium knowlesi clinical isolates from North Borneo and Peninsular Malaysia

Plasmodium knowlesi is one of the monkey malaria parasites that can cause human malaria. The Duffy binding protein of P. knowlesi (PkDBPαII) is essential for the parasite’s invasion into human and monkey erythrocytes. A previous study on P. knowlesi clinical isolates from Peninsular Malaysia reported a high level of genetic diversity in the PkDBPαII. Furthermore, 36 amino acid haplotypes were identified and these haplotypes could be separated into allele group I and allele group II. In the present study, the PkDBPαII of clinical isolates from the Malaysian states of Sarawak and Sabah in North Borneo was investigated, and compared with the PkDBPαII of Peninsular Malaysia isolates. Blood samples from 28 knowlesi malaria patients were used. These samples were collected between 2011 and 2013 from hospitals in North Borneo. The PkDBPαII region of the isolates was amplified by PCR, cloned into Escherichia coli, and sequenced. The genetic diversity, natural selection and phylogenetics of PkDBPαII haplotypes were analysed using the MEGA5 and DnaSP ver. 5.10.00 programmes. Forty-nine PkDBPαII sequences were obtained. Comparison at the nucleotide level against P. knowlesi strain H as reference sequence revealed 58 synonymous and 102 non-synonymous mutations. Analysis of these mutations showed that PkDBPαII was under purifying (negative) selection. At the amino acid level, 38 different PkDBPαII haplotypes were identified. Twelve of the 28 blood samples had mixed haplotype infections. Phylogenetic analysis revealed that all the haplotypes were in allele group I, but they formed a sub-group that was distinct from those of Peninsular Malaysia. Wright’s FST fixation index indicated high genetic differentiation between the North Borneo and Peninsular Malaysia haplotypes. This study is the first to report the genetic diversity and natural selection of PkDBPαII of P. knowlesi from Borneo Island. The PkDBPαII haplotypes found in this study were distinct from those from Peninsular Malaysia.
This difference may not be attributed to geographical separation because other genetic markers studied thus far such as the P. knowlesi circumsporozoite protein gene and small subunit ribosomal RNA do not display such differentiation. Immune evasion may possibly be the reason for the differentiation.


Background
Plasmodium knowlesi, a malaria parasite of macaque monkeys, was reported to cause a large number of human infections in the Malaysian state of Sarawak, North Borneo, in 2004 [1]. Subsequent to this landmark report, human knowlesi malaria cases have been documented in other parts of Borneo Island, Peninsular Malaysia, and in many other countries in Southeast Asia [2]. In Malaysia, P. knowlesi has now overtaken Plasmodium vivax as the main cause of human malaria [3].
The invasion of a malaria parasite into its host erythrocyte depends on the interaction between the parasite's protein and its corresponding receptor on the surface of the erythrocyte. Plasmodium knowlesi uses the Duffy blood group antigen as a receptor to invade erythrocytes [4]. The Duffy binding proteins of P. knowlesi (PkDBP) are located on their merozoites and occur as three distinct forms: α, β and γ. These are large proteins and each can be divided into seven regions (I-VII). Region II contains the critical motifs for binding to the erythrocyte. Region II of PkDBPα (designated as PkDBPαII) binds to Duffy-positive human erythrocytes and macaque erythrocytes. PkDBPβII and PkDBPγII, however, bind only to macaque erythrocytes and not to the Duffy antigen of human erythrocytes [5].
It has been observed that antibodies raised against PkDBPαII could inhibit P. knowlesi invasion of human and macaque erythrocytes in vitro [6]. Therefore, like PvDBPII for vivax malaria, PkDBPαII may be a candidate vaccine antigen against knowlesi malaria. Any design of a vaccine against malaria must take into consideration the nature and genetic polymorphism of the candidate antigen. In a recent study, a high level of genetic diversity was found in the PkDBPαII of 20 P. knowlesi clinical isolates from Peninsular Malaysia [7]. At the amino acid level, 36 haplotypes were identified and these haplotypes could be separated into allele group I and allele group II. In the present study, the PkDBPαII of clinical isolates from the Malaysian states of Sabah and Sarawak in North Borneo was investigated.

Blood samples
The 28 human blood samples used in this study were collected from knowlesi malaria patients at government hospitals in Sabah (n = 16) and Sarawak (n = 12) (

Extraction of DNA
Total DNA of the P. knowlesi was extracted from each blood sample using the QIAGEN Blood DNA Extraction kit (QIAGEN, Hilden, Germany). In each extraction, 100 μl of blood was used. The extracted DNA was suspended in water to a final volume of 50 μl.

Purification of PCR products and DNA cloning
PCR products were purified using the QIAquick PCR Purification Kit (QIAGEN, Hilden, Germany) following the manufacturer's instructions. The purified PCR products were then ligated into cloning vector pGEM-T® (Promega Corp, USA) and transformed into Escherichia coli TOP10F'. Plasmids of recombinant clones harbouring the PkDBPαII fragment were sent to a commercial laboratory for DNA sequencing. To detect the possibility of multiple haplotypes infecting a patient, plasmids from four to six recombinant clones from each transformation mixture were sequenced.

Analysis of PkDBPαII sequences
Multiple sequence alignment of PkDBPαII was performed using the CLUSTAL-Omega programme, which was available online [8]. Both the nucleotide and the deduced amino acid sequences were aligned and analysed. A phylogenetic tree was constructed using the Neighbour Joining method in MEGA5 [9]. Bootstrap analysis with 1,000 replicates was used to test the robustness of the tree.

PkDBPαII sequence polymorphism analysis
DnaSP ver. 5.10.00 [10] was used to perform polymorphism analysis on the PkDBPαII sequences. Information such as the number of segregating sites (S), haplotype diversity (Hd), nucleotide diversity (π), and average number of pair-wise nucleotide differences within the population (K) was generated. The π was also calculated on a sliding window of 100 bp, with a step size of 25 bp, to estimate the step-wise diversity across PkDBPαII. The rates of synonymous (Ks) and non-synonymous (Kn) mutations were estimated and compared by the Z-test (P <0.05) in MEGA5 using Nei and Gojobori's method [11] with Jukes and Cantor correction. Under purifying (negative) selection, non-synonymous mutations are usually not advantageous, so Kn will be less than Ks (Kn/Ks <1). Under positive selection, however, non-synonymous mutations can be advantageous and Kn will exceed Ks (Kn/Ks >1). For testing the neutral theory of evolution, Tajima's D [12] and Fu and Li's D and F [13] tests were carried out using DnaSP 5.10.00. In Fu and Li's tests, P. vivax PvDBPII (GenBank Accession No. M90466) was used as outgroup. The Wright's FST fixation index [14] in DnaSP 5.10.00 was used to measure genetic differentiation between the PkDBPαII of North Borneo and Peninsular Malaysia.
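As a rough illustration of the Kn/Ks comparison described above, the following sketch applies the Jukes and Cantor correction, d = −(3/4) ln(1 − 4p/3), to observed proportions of differences and classifies the resulting ratio. The function names and the input proportions are hypothetical; the counting of synonymous and non-synonymous sites by Nei and Gojobori's method and the Z-test are not reproduced, and this is not the MEGA5 implementation.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def classify_selection(kn, ks):
    """Return the Kn/Ks ratio with a coarse interpretation of its direction."""
    ratio = kn / ks
    if ratio < 1:
        label = "purifying"
    elif ratio > 1:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label


# Hypothetical proportions of differences per non-synonymous and synonymous site
p_n, p_s = 0.010, 0.030
kn, ks = jukes_cantor(p_n), jukes_cantor(p_s)
ratio, label = classify_selection(kn, ks)
print(f"Kn={kn:.5f} Ks={ks:.5f} Kn/Ks={ratio:.3f} ({label})")
```

The ratio alone only indicates the direction of selection; whether the difference between Kn and Ks is significant still depends on a test such as the Z-test used in this study.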

Results
The nested PCR amplification on the human blood samples produced DNA fragments of 1,053 bp in size. The sequence of each fragment was trimmed to 921 bp, according to the PkDBPαII region described by Singh et al. [15]. The trimmed sequence encoded an amino acid sequence of 307 residues. A final total of 49 sequences (GenBank Accession No. KM926563-KM926611) were obtained. DNA sequence analyses were conducted to determine nucleotide diversity and genetic differentiation. The average number of pair-wise nucleotide differences (K) for the PkDBPαII was 11.261. The overall haplotype diversity (Hd) and nucleotide diversity (π) for the 49 PkDBPαII sequences were 0.999 ± 0.004 and 0.012 ± 0.002, respectively. Detailed analysis of π, with a sliding window plot (window length 100 bp, step size 25 bp), revealed that diversity ranged from 0.003 to 0.022. The highest peak of nucleotide diversity was within nucleotide positions 125-250, whereas the most conserved region was within nucleotide positions 625-700 (Figure 1).
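The sliding-window calculation of π can be sketched as follows. This is a minimal illustration that assumes equal-length, gap-free aligned sequences; the function names and the toy alignment are hypothetical, and DnaSP's handling of gaps and missing data is not reproduced.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (pi) for aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def sliding_pi(seqs, window=100, step=25):
    """Return (first_position, last_position, pi) per window, positions 1-based."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + 1, start + window, nucleotide_diversity(chunk)))
    return results


# Toy alignment (hypothetical data), scanned with a 4-bp window and 2-bp step
toy = ["AAAAAAAA", "AAAAAAAT", "AAAATAAA"]
windows = sliding_pi(toy, window=4, step=2)
for first, last, pi in windows:
    print(f"{first}-{last}: pi = {pi:.3f}")
```

For the 921-bp region analysed here, the call would use the defaults window=100 and step=25.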
Analysis and comparison at the nucleotide level against P. knowlesi strain H as reference sequence (GenBank Accession No. M90466) showed mutations at 160 positions among the North Borneo isolates. Fifty-eight of these mutations were synonymous and 102 were non-synonymous. To determine whether natural selection contributed to the diversity in the PkDBPαII, the ratio of the rate of non-synonymous mutations (Kn) to that of synonymous mutations (Ks) was estimated. Kn (0.00900) was lower than Ks (0.02723) and the Kn/Ks ratio was 0.331, suggesting that purifying (negative) selection may be occurring in the PkDBPαII of the North Borneo isolates. Similarly, the Z-test (Ks > Kn; P <0.05) also indicated purifying selection on PkDBPαII. In the tests of departure from neutrality, Tajima's D was −2.459 (P <0.01), indicating expansion in population size and/or purifying selection. This is further supported by Fu and Li's D and F test statistics (−3.713 and −3.917, respectively; P <0.02). Comparison at the amino acid level against the reference P. knowlesi strain H revealed high polymorphism across the PkDBPαII of the North Borneo isolates (Figure 2). Thirty-eight haplotypes (H37-H74) were identified, with haplotype H47 having the highest frequency (10/49). Twelve of the 28 blood samples had mixed haplotype infections (Table 1).
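For readers unfamiliar with Tajima's D, the statistic compares the mean number of pairwise differences (k) with the number of segregating sites (S) scaled by the sample size (n). The sketch below follows the standard formula of Tajima [12]; the function name and the example inputs are hypothetical, and the values are not taken from this data set, for which S is not stated above.

```python
import math


def tajimas_d(n, S, k):
    """Tajima's D from sample size n, segregating sites S, mean pairwise differences k."""
    if n < 4 or S < 1:
        raise ValueError("need n >= 4 sequences and at least one segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Numerator: difference between the two estimators of theta (k and S/a1)
    return (k - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


# Hypothetical example: 10 sequences, 16 segregating sites, k = 3.888889
d_example = tajimas_d(10, 16, 3.888889)
print(f"Tajima's D = {d_example:.3f}")
```

A negative D arises when k is smaller than S/a1, that is, when there is an excess of low-frequency variants, which is the pattern expected under population expansion or purifying selection.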
A phylogenetic tree comprising these 38 North Borneo and the 36 Peninsular Malaysia haplotypes reported previously [7] showed interesting features (Figure 3). Overall, the haplotypes are still separated into allele group I and allele group II. All the North Borneo haplotypes are in allele group I. However, they form a sub-group which is distinct from allele group I members from Peninsular Malaysia. The Wright's FST value between the PkDBPαII of North Borneo and Peninsular Malaysia was 0.621, indicating high genetic differentiation between these two groups.
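To make the meaning of the fixation index more concrete, the sketch below computes a sequence-based FST of the form 1 − Hw/Hb, where Hw is the mean number of pairwise differences within populations and Hb is the mean number between populations. This is one common estimator and is assumed here purely for illustration; it may differ from the calculation performed by DnaSP in this study, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Mean pairwise differences among sequences of one population."""
    pairs = list(combinations(pop, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop1, pop2):
    """Mean pairwise differences between sequences of two populations."""
    pairs = list(product(pop1, pop2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def fst(pop1, pop2):
    """FST as 1 - Hw/Hb, with Hw averaged over the two populations."""
    hw = (mean_within(pop1) + mean_within(pop2)) / 2
    hb = mean_between(pop1, pop2)
    return 1 - hw / hb


# Toy populations (hypothetical data)
borneo = ["AAAA", "AAAT"]
peninsular = ["TTAA", "TTAT"]
print(f"FST = {fst(borneo, peninsular):.2f}")
```

Values close to 1 mean that most of the variation lies between the two groups rather than within them.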

Discussion
The P. knowlesi PkDBPαII plays an essential role in the invasion of the parasite by mediating binding with its corresponding receptor, the Duffy antigen receptor for chemokines (DARC), on the surface of erythrocytes [16].
The PkDBPαII elicits immune response in humans and therefore has been suggested to be a vaccine candidate antigen [6]. The genetic diversity and haplotype groups of PkDBPαII among Peninsular Malaysia P. knowlesi clinical isolates were recently reported [7]. The present study found distinct differences in the PkDBPαII of North Borneo upon comparison with those from Peninsular Malaysia.
Previous studies on P. vivax isolates from different geographical regions such as Colombia, South Korea, Papua New Guinea, Thailand, Iran, and Myanmar reported numerous haplotypes and allele groups of PvDBPII [17][18][19][20][21][22]. Interestingly, some of these PvDBPII haplotypes were grouped with those from outside their geographic origins. For example, haplotypes from Iran were grouped with those from Brazil, Papua New Guinea (PNG) and Thailand [21], haplotypes from Myanmar grouped with haplotypes from South Korea [22], and haplotypes from PNG grouped with those from South Korea and Thailand [18,20]. This, however, is not observed in the PkDBPαII in the present study. The phylogenetic analysis (Figure 3) showed a sub-group consisting solely of haplotypes from North Borneo, although these haplotypes were still categorized under allele group I. Geographical separation of Borneo Island from Peninsular Malaysia and subsequent genetic drift of the P. knowlesi populations may not be the reason for this unique PkDBPαII separation. This is because other genetic markers studied thus far, such as the P. knowlesi circumsporozoite protein (csp) gene and the small subunit ribosomal RNA (ssu rRNA), do not display such geographical-based separation [1,23,24]. The PkDBPαII analysed in this study is based on the region defined by Singh et al. [15]. In their analysis, 12 C residues (positions 16, 29, 36, 45, 99, 176, 214, 226, 231, 235, 304, 306), which form six disulphide bridges, have been shown to be involved in the folding of PkDBPαII for interaction with DARC.
Multiple alignment of the PkDBPαII amino acid sequences (Additional file 1) in this study revealed that these 12 residues were conserved in the PkDBPαII of North Borneo. Apart from these conserved C residues, the Y94, N95, K96, R103, L168, and I175 residues are required for recognition of DARC on human erythrocytes [15]. The multiple sequence alignment showed high conservation of these residues except at position 95. The N (asparagine) residue at this position was substituted with D (aspartic acid) in the PkDBPαII of North Borneo. However, this N → D substitution may not affect the overall structure and biological function of PkDBPαII, as N is the amide derivative of D.
The PkDBPαII of North Borneo (K = 11.261; Hd = 0.999; π = 0.012) was as diverse as that of Peninsular Malaysia (K = 11.736; Hd = 0.986; π = 0.013). Like the PkDBPαII of Peninsular Malaysia [7], the PkDBPαII of North Borneo was found to be under purifying (negative) selection. A possible reason for this purifying selection is population expansion of P. knowlesi in Borneo Island, as evidenced by Tajima's D, as well as Fu and Li's D and F test statistics. Mitochondrial DNA analysis also suggests recent population expansion of P. knowlesi in Southeast Asia [25].
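The haplotype diversity (Hd) values compared above can be understood through Nei's estimator, Hd = n/(n − 1) × (1 − Σp²), where p is the frequency of each haplotype among n sequences. The sketch below is a minimal illustration with hypothetical haplotype labels; it is not the DnaSP implementation and does not compute the reported standard deviations.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity for a list of haplotype labels or sequences."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    sum_p_squared = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_p_squared)


# Hypothetical sample: four sequences falling into three haplotypes
sample = ["H1", "H1", "H2", "H3"]
print(f"Hd = {haplotype_diversity(sample):.3f}")
```

Hd approaches 1 when almost every sequence in the sample is a distinct haplotype, which is the situation reported for both populations.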
Further evidence of difference between the PkDBPαII of North Borneo and Peninsular Malaysia was shown by the Wright's FST fixation index, which measures population differentiation due to genetic structure [14]. As a rule of thumb, populations with FST values of > 0.25 are considered highly differentiated. The FST obtained in this study was 0.61, indicating extremely high genetic difference between the PkDBPαII of North Borneo and Peninsular Malaysia. The amino acid substitutions in the PkDBPαII, which most likely contribute to this genetic difference, were at positions 47-57, 95 and 224 (Figure 2).
PkDBPαII plays a critical role in the invasion of P. knowlesi merozoite into human and monkey erythrocytes.
It is crucial for PkDBPαII to conserve its structure for precise interaction with DARC in the invasion process. The discovery in this study of highly differentiated PkDBPαII in North Borneo and Peninsular Malaysia may seem puzzling. However, it has been observed that the P. vivax PvDBPII is highly diverse, in some instances even within a population of a particular region [26]. DBPII amino acid residues can be variable, and these polymorphisms usually map to non-functional regions of the protein and may therefore serve as a mechanism of immune evasion for the parasite. In such a mechanism, polymorphic residues near the binding site escape binding of host inhibitory antibodies. This protects the crucial functional site on the interacting DBPII domain.
A recent phylogenetic study on the relationships of Macaca fascicularis, the natural monkey host of P. knowlesi, showed a clear separation between Borneo's and Peninsular Malaysia's populations [27]. This phylogeny was based on the cytochrome b gene sequences of the monkeys. It is, therefore, worthwhile in future studies to determine whether a similar genetic separation occurs in the DARC of the monkey populations, and to associate it with the PkDBPαII haplotype groups observed in this study.

Conclusions
This study is the first to report the genetic diversity and natural selection of PkDBPαII of P. knowlesi from Borneo Island. The PkDBPαII haplotypes found in this study were distinct from those from Peninsular Malaysia. This difference may not be attributed to geographical separation because other genetic markers studied thus far such as the P. knowlesi circumsporozoite protein gene and small subunit ribosomal RNA do not display such differentiation. Immune evasion may possibly be the reason for the differentiation.

Additional file
Additional file 1: Full amino acid sequence alignment of PkDBPαII from Peninsular Malaysia and North Borneo. Amino acid residues identical to those of the reference sequence (strain H) are indicated by dots. The twelve conserved cysteine (C) residues are marked in yellow. The conserved Y94, N95, K96, R103, L168 and I175 residues required for recognition of DARC on human erythrocytes are highlighted in green. Note that the N95 residue was substituted by D in the North Borneo sequences.