Inter-Population Genetic Diversity and Clustering of Merozoite Surface Protein-1 (pkmsp-1) of Plasmodium knowlesi Isolates from Malaysia and Thailand

The genetic diversity of pkmsp-1 of Malaysian Plasmodium knowlesi isolates was studied recently. However, the study only included three relatively older strains from Peninsular Malaysia and focused mainly on the conserved blocks of this gene. In this study, the full-length pkmsp-1 sequence of recent P. knowlesi isolates from Peninsular Malaysia was characterized, along with Malaysian Borneo and Thailand pkmsp-1 sequences that were retrieved from GenBank. Genomic DNA of P. knowlesi was extracted from human blood specimens and the pkmsp-1 gene was PCR-amplified, cloned, and sequenced. The sequences were analysed for genetic diversity, departure from neutrality, and geographical clustering. The pkmsp-1 gene was found to be under purifying/negative selection and grouped into three clusters via neighbour-joining tree and neighbour-net inferences. Of the four polymorphic blocks in pkmsp-1, block IV was the most polymorphic, with the highest number of insertion–deletion (indel) sites. Two allelic families were identified in block IV, thereby highlighting this block as a promising genotyping marker for multiplicity of infection studies of P. knowlesi malaria. A single-locus marker may provide an alternative, simpler method to type P. knowlesi in a population.


Introduction
Malaria remains a major public health challenge. In 2021, 247 million people were afflicted by this disease globally, with more than 600,000 deaths reported. Since 2018, Malaysia has had no reports of indigenous malaria cases caused by human malaria parasite species. However, zoonotic malaria caused by Plasmodium knowlesi has become the main cause of malaria cases in Malaysia, with a notable increase in the number of cases from 1600 to over 4000 between 2016 and 2018. In 2021, 13 deaths were reported due to this zoonotic infection [1], with 409 cases in Peninsular Malaysia and 3166 cases in Malaysian Borneo (Ministry of Health Malaysia, unpublished data). The situation is similar in other countries in Southeast Asia (SEA), as an upsurge of P. knowlesi cases in humans was also reported in neighbouring countries such as Brunei [2], Thailand [3], and Indonesia [4]. By 2018, Laos had also documented its first human knowlesi case [5], making all countries in SEA endemic for this zoonotic infection, except for Timor-Leste [6]. This quotidian apicomplexan parasite has the shortest asexual life cycle (24 h) of the Plasmodium species that infect humans [7], which results in the parasitaemia level increasing rapidly upon infection. High parasitaemia in patients with knowlesi malaria has been correlated with the development of debilitating symptoms and severe cases with potentially fatal outcomes. The fatality rate is similar to that of malignant malaria caused by P. falciparum, at approximately 6% to 9% [8]. Hyperparasitaemia, jaundice, and severe acute kidney injuries are among the clinical presentations and known predictors of severe human knowlesi infection [9,10]. Recently, a histological examination also demonstrated the ring stage of P. knowlesi on renal

Ethical Clearance
The study received ethical clearance from the Medical Research Subcommittee of the Malaysian Ministry of Health (NMRR-15-67223975) for the use of blood specimens obtained from hospitals and district health offices in Peninsular Malaysia.

Plasmodium Species Confirmation
Twenty human blood specimens previously confirmed as P. knowlesi-positive by microscopy were re-confirmed via nested PCR targeting the 18S rRNA gene, as described previously [36,37]. The specimens were also screened for the presence of the four human Plasmodium species, and none of the specimens had a mixed infection. The details of the specimens are listed in the Supplementary File (Table S1).

DNA Extraction
Plasmodium genomic DNA was extracted from 100 µL of human knowlesi malaria patient blood using Qiagen DNeasy Blood and Tissue Kits (Qiagen, Hilden, Germany) according to the manufacturer's protocol and eluted with 50 µL of the AE buffer supplied in the kit, for a total of two eluates. The eluates were pooled, and the DNA was stored at −20 °C until further use.

Amplification of pkmsp-1 Gene
The pkmsp-1 gene was amplified using the gene-specific primer pair PkMSP1-F: 5′-CGTTGGCCACTTTTAAG-3′ and PkMSP1-R: 5′-AATGTGCAGCCAAAGCC-3′ with the following composition: 4.0 µL DNA template, 1× GoTaq® Long PCR Master Mix (Promega, Madison, WI, USA), and 0.4 µM of each primer in a 25 µL final reaction volume. The cycling conditions were set at 95 °C for 3 min, then 35 cycles at 94 °C for 30 s, 58 °C for 30 s, and 68 °C for 7 min, followed by a final extension at 72 °C for 10 min. Amplicons were electrophoresed on a 0.7% agarose gel pre-stained with SYBR® Safe DNA gel stain (Invitrogen, Eugene, OR, USA) at 90 V for 40 min and visualised using the Gel Doc XR+ System (BioRad, Hercules, CA, USA).
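The reaction set-up and cycling programme described above reduce to simple arithmetic, which can be sketched as follows. This is only an illustrative calculation: the 10 µM primer working stock (`PRIMER_STOCK_UM`) and the function names are assumptions that do not come from the protocol, and the run time ignores thermal ramping between steps.

```python
def stock_volume_ul(final_conc_um: float, final_volume_ul: float,
                    stock_conc_um: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    if stock_conc_um <= 0 or final_conc_um > stock_conc_um:
        raise ValueError("stock must be positive and more concentrated than the target")
    return final_conc_um * final_volume_ul / stock_conc_um


def program_seconds(initial_s: int, cycles: int,
                    step_seconds: list[int], final_s: int) -> int:
    """Total hold time of a PCR programme, excluding ramp times."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Assumption: primers are kept as 10 uM working stocks (not stated in the text).
PRIMER_STOCK_UM = 10.0

# 0.4 uM of each primer in a 25 uL reaction.
primer_ul = stock_volume_ul(0.4, 25.0, PRIMER_STOCK_UM)

# 95 C for 3 min; 35 cycles of 30 s, 30 s, 7 min; final extension of 10 min.
hold_minutes = program_seconds(3 * 60, 35, [30, 30, 7 * 60], 10 * 60) / 60

print(f"primer volume per reaction: {primer_ul:.2f} uL each")
print(f"programme hold time: {hold_minutes:.0f} min")
```

Under these assumptions, the long 7 min extension step dominates the run time, which is consistent with amplifying a full-length gene with a long-range polymerase mix.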

Cloning and Sequencing of pkmsp-1
The amplicons were purified using the QIAquick PCR purification kit (Qiagen, Hilden, Germany) per the manufacturer's protocol, and the purified amplicons were ligated into the pGEM-T® TA cloning vector (Promega, Madison, WI, USA). The ligation products were then transformed into One Shot™ Escherichia coli TOP10F' competent cells (Invitrogen, Eugene, OR, USA). After a 16-h incubation at 37 °C, colony PCR was conducted on the transformants using the universal M13 forward (40 mer) and M13 reverse (48 mer) primers. Plasmids containing the pkmsp-1 insert were harvested from positive recombinant clones using the QIAprep Spin Miniprep kit (Qiagen, Hilden, Germany) and sent to a commercial laboratory (First BASE Laboratories Sdn. Bhd., Seri Kembangan, Malaysia) for nucleotide sequencing. Four primer pairs (Table 1) were used to sequence the pkmsp-1 insert via the Sanger dideoxy sequencing method.

Phylogenetic Inference
The deduced pkmsp-1 amino acid sequences were used to construct a neighbour-joining tree using the Nei-Gojobori (Jukes-Cantor correction) method with 1000 bootstraps in MEGA X. The P. falciparum MSP-1 (GenBank accession number: XP_0013152170) was used as the outgroup. The robustness of the tree was re-appraised using the neighbour-net method via SplitsTree6 ver 0.1.2-alpha [39]. The allelic clustering for each polymorphic block was conducted using a similar approach without the outgroup.
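The Jukes-Cantor correction mentioned above converts an observed proportion of differing sites into an estimated number of substitutions per site. The sketch below illustrates only that correction step on a plain nucleotide p-distance; it is not the codon-based procedure run in MEGA X, and the function names and the pairwise treatment of gaps are assumptions made for illustration.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions with a gap ('-') in either sequence are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3).

    The correction is undefined once p reaches 0.75 (saturation).
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy aligned fragments, not taken from the study.
p = p_distance("ACGTACGT", "ACGTACGA")
print(f"p = {p:.3f}, JC distance = {jukes_cantor(p):.4f}")
```

A matrix of such pairwise distances is the input that a neighbour-joining algorithm would then cluster; the corrected distance always exceeds the raw p-distance because it accounts for multiple substitutions at the same site.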

Genetic Diversity and Natural Selection of pkmsp-1
The organization of the full-length pkmsp-1 gene obtained in this study was identical to that in previous reports [27,35], with five conserved blocks interspersed by four polymorphic blocks. There were 640 polymorphic sites, of which 127 were singletons and 456 were parsimony-informative sites (Table 2). As a result of the extensive size variation within the polymorphic blocks, 805 sites with indels were identified. The sliding window plot (Figure 1) revealed high nucleotide diversity at the 5′ end, specifically at block II and block IV. Although pkmsp-1 was more diverse (π = 0.026, Hd = 0.996) than other invasion-related proteins of P. knowlesi, the protein was under strong negative/purifying selection (dN − dS = −5.87, p < 0.0001). Of the four polymorphic blocks, block IV contained the highest number of sites with indels (326 sites), followed by block VIII (244 sites), block VI (147 sites), and block II (74 sites), thus indicating that there were more gaps within blocks IV and VIII than in the other polymorphic blocks. However, between the two polymorphic blocks with the most indels, block IV (π = 0.174) was more polymorphic than block VIII (π = 0.100). Block II (π = 0.162) had a slightly lower nucleotide diversity than block IV, while block VI had the lowest nucleotide diversity of the four polymorphic blocks (π = 0.051). Overall, the polymorphic blocks were shown to be under positive/diversifying selection, whereas the conserved blocks were under negative/purifying selection. Tajima's D, Fu and Li's D*, and Fu and Li's F* tests revealed that, overall, pkmsp-1 did not depart significantly from neutrality.
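For readers less familiar with the statistic, nucleotide diversity (π) is the average number of nucleotide differences per site between all pairs of sequences. The following is a minimal sketch of that definition, assuming that alignment columns containing a gap in any sequence are excluded (complete deletion); the function name and the toy alignment are hypothetical, and the software used for the values reported above may treat gaps differently.

```python
from itertools import combinations


def nucleotide_diversity(alignment: list[str]) -> float:
    """Average pairwise differences per site (pi) for aligned sequences.

    Columns with a gap ('-') in any sequence are dropped before counting
    (complete deletion, an assumption of this sketch).
    """
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only gap-free columns.
    kept = [i for i in range(length) if all(seq[i] != "-" for seq in alignment)]
    if not kept:
        raise ValueError("no gap-free sites")

    pairs = list(combinations(alignment, 2))
    total_differences = 0
    for a, b in pairs:
        total_differences += sum(1 for i in kept if a[i] != b[i])

    return total_differences / (len(pairs) * len(kept))


# Toy alignment, not taken from the study.
toy = ["ACGT", "ACGA", "ACGT"]
print(f"pi = {nucleotide_diversity(toy):.4f}")
```

Applying the same function to windows of columns stepped along the alignment would give a sliding window profile of the kind summarised in Figure 1.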

Phylogenetic Analysis of pkmsp-1
The neighbour-joining phylogenetic tree of the pkmsp-1 sequences revealed three clusters (Figure 2). Cluster 3 comprised an admixture of Peninsular Malaysia and Malaysian Borneo sequences with no apparent sub-clustering. In contrast, cluster 2 bifurcated from cluster 3 and contained Thailand sequences only. A few Peninsular Malaysia sequences clustered together with some Thailand sequences in cluster 1. A similar clustering pattern was observed when the neighbour-net method was employed (Figure 3). Further investigation of the polymorphic blocks (Figures 4-7) revealed three allelic clusters for block II, two allelic clusters for block IV, three allelic clusters for block VI, and four allelic clusters for block VIII. In each block, the clusters obtained via the neighbour-joining and neighbour-net methods were similar. It is noteworthy that, in block IV, the two clusters bifurcated distinctly based on the neighbour-net method, with shorter branches within the two clusters when compared with the neighbour-net trees of the other polymorphic blocks. Meanwhile, the bifurcation of block VIII was the most extensive, with long branches linking the sequences, resulting in the highest number of clusters. As for block II, the bifurcation was as distinct as that of block IV, with cluster 3 forming a larger cluster upon bifurcation from cluster 2, thereby making cluster 3 the major allele in block II. The branch length between the clusters in block VI was the shortest, thus indicating minute differences in sequences within the same cluster.

Discussion
This is the first study describing the genetic diversity of the polymorphic blocks of pkmsp-1 with the inclusion of recent isolates from Peninsular Malaysia. Block IV had the highest nucleotide diversity in this study, in contrast to a previous report [27] in which block VIII had the highest nucleotide diversity (π ≈ 0.290). The difference in nucleotide diversity distribution between these studies may be attributed to the number of sequences analysed, i.e., 65 in this study versus 11 in the previous study [27]. The polymorphic blocks of pkmsp-1 could serve as useful markers to genotype circulating P. knowlesi strains in a locality [35]. In addition to the two allelic families identified (based on the phylogenetic inference), block IV had the highest number of indels among the polymorphic blocks with the highest number of nucleotides. The indels are indicative of gaps within the block; hence, block IV may serve as a promising size polymorphism marker for the MOI of P. knowlesi malaria.
It is well established that there are three allelic families designated for pfmsp-1 (RO33, MAD20, and K1) and two for pfmsp-2 (FC27 and 3D7/IC) [31]. We propose designating cluster 1 of block IV as the T1 allelic family and cluster 2 of block IV as the T2 allelic family, because genotyping the parasite alleles circulating in a population and, subsequently, calculating the multiplicity of infection (MOI) are paramount. The MOI serves as a surrogate determinant of transmission intensity [40][41][42], where a higher mean MOI indicates high transmission intensity in the locality and vice versa. Genotyping is also useful for characterising the parasite population. Previously, the K1 family of pfmsp-1 and the 3D7/IC family of pfmsp-2 were found to be predominantly circulating in Bobo-Dioulasso, Burkina Faso [41]. This suggests that P. falciparum carrying either of those two alleles adapts better in the population and is less virulent. Conversely, a study in Aceh, Indonesia revealed that multiclonal infection by K1 + RO33 allelic families was strongly associated with a severe form of falciparum malaria [43]. In addition to testing the usefulness of block IV of pkmsp-1 as a genotyping marker for P. knowlesi, future studies may also investigate the association of the allelic families in block IV with disease severity. T1 was the major allele, or predominant allelic family, of block IV found in this study. Although speculative, it can be postulated that P. knowlesi harbouring a T1 allelic family adapts better in the population. Further investigations with an adequate sample size are necessary to validate this.
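To make the MOI concept above concrete, the sketch below counts the distinct alleles detected per isolate at a single locus and averages them across isolates. It is a hypothetical illustration only: the isolate names, the fragment sizes, and the idea of scoring a block IV allele as a (family, fragment size) pair are assumptions, not data or a protocol from this study.

```python
from statistics import mean

# An allele is scored here as (allelic family, fragment size in bp).
Allele = tuple[str, int]


def sample_moi(alleles: list[Allele]) -> int:
    """MOI of one isolate: number of distinct alleles detected."""
    return len(set(alleles))


def mean_moi(samples: dict[str, list[Allele]]) -> float:
    """Mean MOI across isolates with at least one allele detected."""
    mois = [sample_moi(alleles) for alleles in samples.values() if alleles]
    if not mois:
        raise ValueError("no genotyped isolates")
    return mean(mois)


def multiclonal_fraction(samples: dict[str, list[Allele]]) -> float:
    """Fraction of genotyped isolates carrying more than one allele."""
    mois = [sample_moi(alleles) for alleles in samples.values() if alleles]
    if not mois:
        raise ValueError("no genotyped isolates")
    return sum(1 for m in mois if m > 1) / len(mois)


# Hypothetical genotyping results for three isolates.
genotypes = {
    "isolate_A": [("T1", 900)],
    "isolate_B": [("T1", 900), ("T2", 750)],
    "isolate_C": [("T2", 750), ("T2", 750)],  # duplicate call, counted once
}

print(f"mean MOI = {mean_moi(genotypes):.2f}")
print(f"multiclonal fraction = {multiclonal_fraction(genotypes):.2f}")
```

With a single-locus marker, the MOI per isolate is simply the allele count at that locus; multi-locus schemes typically take the highest count across loci instead.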
Three clusters were observed based on the full-length pkmsp-1. Interestingly, some of the Peninsular Malaysian pkmsp-1 sequences clustered with those from Thailand in cluster 1, which may suggest a common ancestral origin of P. knowlesi. The Thailand sequences within cluster 1 (n = 7) were from six P. knowlesi human isolates and one Macaca nemestrina isolate, all from provinces (Yala and Narathiwat) neighbouring northern Peninsular Malaysia. The transmission of P. knowlesi is thus not confined by a man-defined border. Recently, human knowlesi malaria cases were reported across the Laos-Vietnam border [52], and this could perhaps occur in a similar manner as in this study. The clustering of the Peninsular Malaysia isolates with the Thailand isolates corroborates the allopatric speciation of P. knowlesi in Malaysia, as proposed by Divis and colleagues (2017) [53]. A previous study on the clustering of P. knowlesi based on its normocyte binding protein xa (pknbpxa) also showed the bifurcation of Peninsular-Malaysia-derived isolates from two other clusters [54]. This could be due to the submergence of the Sunda plate at the end of the last ice age, which separated Malaysian Borneo from Peninsular Malaysia [48]. The admixture of Peninsular Malaysia and Malaysian Borneo isolates within cluster 3 mirrors the clustering of P. knowlesi reported in previous studies [49,50,54].
With the escalation of human knowlesi cases in Malaysia, vaccine development should be pursued more actively as an alternative approach, in addition to the rigorous vector control and monitoring efforts implemented by the government. Over the last decade, ample studies investigating the natural selection acting on the invasion-related proteins of P. knowlesi have been conducted to gain insight and rudimentary data for better vaccine development and formulation [44][45][46][47][48][49][50][51]. The pkama-1 gene was previously found to be under purifying/negative selection [45,55], which supports it as a vaccine candidate for human knowlesi malaria. Recently, Ng et al. (2023) found that rabbit-raised antibodies against region II of pkama-1 displayed significant differences when compared to the negative control group, with close to 50% invasion inhibition [56]. The findings by Muh and colleagues (2018) are similar but were obtained with a lower antibody concentration [57]. Similar investigations of other invasion-related proteins of P. knowlesi should be conducted and compared, as antibodies are critical in developing immunity to malaria. Of note, RTS,S/AS01, the first malaria vaccine, specifically for falciparum malaria, has been approved for use in pilot areas of Ghana, Kenya, and Malawi as part of the routine immunization for children [58]. A vaccine for P. knowlesi could perhaps be designed in a similar way, by fusing the recombinant protein with an immunogen such as the Hepatitis B surface antigen. Alternatively, a multistage or multiantigen vaccine is another promising approach [59].
MSP-1 is a promising candidate not only for P. knowlesi [60], but also for P. vivax [61] and P. falciparum [62]. However, vaccine studies are hampered by the extensive polymorphism across the gene. This is evident in the present study, in which the high nucleotide diversity of the full-length pkmsp-1 was a consequence of the four polymorphic blocks, highlighting the value of investigating the genetic diversity and polymorphism of vaccine candidates. Instead, the conserved blocks, particularly the 42-kDa subunit of PkMSP-1 (PkMSP-1₄₂), have been studied as potential vaccine candidates. Recent findings revealed novel binding peptides against the C-terminal PkMSP-1₁₉ [63,64]. This further aids vaccine development or peptide-based antimalarial drug formulation, and it is in line with the objectives of the One Health initiative to intervene in and mitigate the endemic zoonotic infection caused by P. knowlesi via activities that include vaccine development. This would subsequently provide better health and wellbeing to vulnerable communities, as environmental components are also taken into account to mitigate the spread of the infection.

Conclusions
Three clusters of pkmsp-1 were observed across Thailand and Malaysia, with one cluster unique to Thailand. The pkmsp-1 gene was under purifying/negative selection, although it was more diverse than several genes encoding invasion-related proteins of P. knowlesi. Block IV of pkmsp-1 may serve as a promising genotyping marker for MOI studies of P. knowlesi malaria. Therefore, further investigation should be conducted on block IV to test its feasibility as an MOI genotyping marker in humans, as well as in the macaque population in Southeast Asia.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/tropicalmed8050285/s1, Table S1: Details of Peninsular Malaysia specimens included in the study.