Longitudinal genomic surveillance of Plasmodium falciparum malaria parasites reveals complex genomic architecture of emerging artemisinin resistance in western Thailand

Background Artemisinin-based combination therapies are the first line of treatment for Plasmodium falciparum infections worldwide, but artemisinin resistance (ART-R) has risen rapidly in in Southeast Asia over the last decade. Mutations in kelch13 have been associated with artemisinin (ART) resistance in this region. To explore the power of longitudinal genomic surveillance to detect signals in kelch13 and other loci that contribute to ART or partner drug resistance, we retrospectively sequenced the genomes of 194 P. falciparum isolates from five sites in Northwest Thailand, bracketing the era in which there was a rapid increase in ART-R in this region (2001–2014). Results We evaluated statistical metrics for temporal change in the frequency of individual SNPs, assuming that SNPs associated with resistance should increase frequency over this period. After Kelch13-C580Y, the strongest temporal change was seen at a SNP in phosphatidylinositol 4-kinase (PI4K), situated in a pathway recently implicated in the ART-R mechanism. However, other loci exhibit temporal signatures nearly as strong, and warrant further investigation for involvement in ART-R evolution. Through genome–wide association analysis we also identified a variant in a kelch-domain–containing gene on chromosome 10 that may epistatically modulate ART-R. Conclusions This analysis demonstrates the potential of a longitudinal genomic surveillance approach to detect resistance-associated loci and improve our mechanistic understanding of how resistance develops. Evidence for additional genomic regions outside of the kelch13 locus associated with ART-R parasites may yield new molecular markers for resistance surveillance and may retard the emergence or spread of ART-R in African parasite populations.


Introduction
Artemisinin-based combination therapy (ACT) is the first-line treatment for Plasmodium falciparum malaria infection in most of the world [1,2].
Resistance to ACT treatment, manifested as delayed clearance of parasites following treatment, was first documented in Cambodia in 2009 [3,4] and has since spread throughout SE Asia [5,6]. Mutations in the BTB/POZ or propeller domain of the kelch13 gene (PF3D7_1343700) have been associated with artemisinin resistance (ART-R), as evidenced by in vitro selection [7] and transfection experiments [8], and are associated with reduced cure rates following ACT [9][10][11]. Surveys have documented the rapid increase in frequency of kelch13 mutations in SE Asia over the last ten years, driven by a combination of de novo mutation creating new resistance alleles and natural selection favoring the spread of existing alleles [5,[12][13][14]. Though kelch13 resistance mutations are becoming prevalent in SE Asia, and have been observed at low frequency in Africa, South America, and other parts of the world, to date they have not been observed to spread in any location outside of SE Asia [13,15].
There are several hypotheses to explain the failure of kelch13mediated resistance to spread outside of SE Asia via de novo mutations or human migration, as has previously occurred with resistance to chloroquine [16] and pyrimethamine [17]. One is that the selective pressure of ACTs on parasite populations is more intense in SE Asia due to lower disease endemicity, commensurately less acquired immunity to disease, and therefore a greater likelihood that infected individuals will become symptomatic and be treated with drugs. This hypothesis is difficult to exclude, but malaria endemicity is highly variable across sub-Saharan Africa and many other regions, and access to ACTs is high in many regions outside of SE Asia [18], making it unlikely that SE Asian drug selection pressure is unique. An alternative hypothesis, to be explored in more detail in this manuscript, is that kelch13 mutations induce a fitness cost in parasites lacking an appropriate genetic background. In many pathogens, mutations conferring resistance to drugs also confer deleterious fitness effects that are usually suppressed or mitigated by co-segregating compensatory mutations, a phenomenon well documented in bacteria [19,20], yeast [21], and in P.
If background mutations that abrogate a fitness cost of kelch13 mutations or provide resistance to partner drugs are required for the spread of kelch13 mutations, SE Asia is a favorable location for these mutations to arise and be selected in association with kelch13 mutations. Beyond the fact that ACTs have been in use for much longer in SE Asia relative to Africa [25], offering a longer window of time for requisite background mutations to occur and rise in frequency, SE Asian parasites experience a lower rate of sexual outcrossing than parasites in most African populations. This is because P. falciparum is an obligately sexual but facultatively outcrossing eukaryotic parasite. Meiosis occurs following the union of parasite gametes in the mosquito midgut, but mosquitoes that feed on humans infected by a single genotype of P. falciparum will result in self-fertilization of male and female gametes from the same genotype, rather than outcrossing between unrelated genotypes. In low-transmission settings like SE Asia, a majority of human infections are caused by a single parasite genotype [26], leading to infrequent outcrossing relative to high transmission regions like Africa, where human infections may contain multiple parasite genotypes. In addition, there is reduced competition between parasites in low-transmission settings, because most infected humans and mosquitoes harbor only a single parasite genotype. SE Asia, therefore, may be an ideal setting for kelch13 mutations to maintain association with a favorable genetic background and spread via natural selection.
There is some existing evidence for the involvement of additional loci in ART-R. Two groups have identified a region of chromosome 14 as being associated with slow parasite clearance [27,28]. Miotto et al [29] have suggested that variants in several loci outside of kelch13 are associated with ART-R.
We hypothesized that background mutations providing compensatory fitness for ART resistance mutations in kelch13 or mutations at other loci conferring partner drug resistance should rise in frequency over time with kelch13 resistance mutations. We performed whole genome sequencing of samples collected between 2001 to 2014 from Northwestern Thailand, a period spanning the emergence and spread of ACT resistance in this region [6], using hybrid selection to enrich parasite DNA in early clinical samples collected as dried blood spots without leukocyte depletion. In conjunction with genotype-phenotype association tests and scans for signals of natural selection, we used longitudinal changes in allele frequency to identify a list of candidate mutations that may provide a suitable background for kelch13 resistance mutations. These markers give insight into the mechanism of kelch13-based ART-R, clarify the genomic architecture of this trait, and suggest other loci in the genome that could be informative targets of future ACT resistance surveillance efforts.

Results
We sequenced a total of 194 isolates distributed among four time intervals between 2001 and 2014 from five clinics situated along the northwestern border of Thailand (Fig. 1). The median clearance rate half-life of the samples sequenced during each time interval exhibits a sharp increase in this region after 2008 [6], indicating that our collection window spans the emergence of ACT resistance in this region (Fig. 1). Although kelch13 mutations are strongly associated with slow clearance (Additional file 1: Figure S1), clearance rate half-life ranges from 3.0 to 9.6 hours for parasites with kelch13 mutations, and 16 out of 68 samples with kelch13 mutations exhibit a clearance rate half-life less than 5 hours, suggesting that resistance may vary according to the nature of different kelch13 mutations and/or parasite genomic background.

Dataset filtration -
We performed an analysis of identity by descent (IBD) among pairwise comparisons of samples within sampling intervals to ascertain changes in the degree of clonality due to recent common ancestry over time, using a hidden Markov Model-based approach that makes use of SNP calls [30]. High levels of IBD among samples impedes the identification of individual variants subject to selection due to increased LD among variants within IBD blocks. Analysis of pairwise IBD distributions within each time interval showed only a modest amount of recent common ancestry during the first three sampling intervals, but a high level of clonality among isolates collected in 2014 (Fig. 2). This phenomenon resembles the previously documented increase in parasite clonality in Cambodia, [31], and most likely stems from decreasing disease transmission in this region [26] coupled with the selective sweeps of multiple kelch13 resistance mutations. The increased IBD among the 2014 samples elevates the LD between SNPs, making it difficult to identify signals associated with individual background mutations. Therefore, the 2014 samples were excluded from subsequent analyses of temporal frequency trends. Other isolates were discarded due to low sequencing coverage depth (see Methods), ultimately resulting in a collection of 134 sequenced isolates for the temporal analysis (Additional file 7: Table   S1).
The dataset was also filtered based on location and nature of polymorphic sites (see Methods). The resultant 15,117 SNPs were analyzed using both a conventional genotype/phenotype association analysis aimed at detecting variants associated with a low parasite clearance rate under artesunate therapy, as well as a phenotype-agnostic approach to identify variants with a temporal trend and other features suggestive of ACT selection.
Frequency trajectory of kelch13 mutations -Twelve distinct nonsynonymous kelch13 mutations were found among sequenced isolates (Additional file 2: Figure S2, Additional file 3: Figure S3A). Figure 3A shows the frequency of kelch13 mutations exhibiting at least 5% allele frequency in at least one of the first three sampled time intervals. Although other kelch13 mutations exhibit a higher frequency before 2011, C580Y overtakes those in the 2011-2012 sample. Two independent origins of the C580Y mutation were inferred by the observation of shared haplotypes in the vicinity of that mutation (Additional file 4: Figure S4), consistent with previous SNP genotyping analyses of parasites from the Thai-Myanmar border [32] and other SE Asian locations [12].
GWAS results -A GWAS using the filtered set of SNPs identified during the first three sampling intervals identified the SNP underlying the kelch13 C580Y mutation as the variant most significantly associated with slow parasite clearance (P = 7.0e-06; Q = 0.09, Benjamini-Hochberg) [33]. There were 15 other candidate SNPs that, in spite of having q -values greater than 0.10, had p-values lower than expected assuming a uniform distribution of p-values (P < 1e-3; Additional file 5: Figure S5A). This list of SNPs will subsequently be referred to as the GWAS set. Of this set, 11 are nonsynonymous coding mutations and four are synonymous coding mutations (Additional file 7: Table   S2). The coding SNPs are located in 13 distinct genes, including three conserved proteins with unknown function and three conserved membrane proteins with unknown function. Three genes containing candidate SNPs encode proteins involved in protein degradation via the proteasome complex, a pathway hypothesized to be associated with ART resistance [34,35]: putative sentrin-specific protease (SENP2) (PF3D7_0801700); ApiAP2 transcription factor (PF3D7_1342900); and a putative ubiquitin protein ligase (PF3D7_1448400). The derived nonsynonymous mutation in the sentrinspecific protease is negatively associated with artemisinin resistance (negative beta value on Additional file 7: Table S2), meaning that the reference allele is associated with slow clearance. For the mutations in the ApiAP2 transcription factor and ubiquitin protein ligase, the derived alleles are positively associated with slow clearance.
An additional association analysis was performed with only those samples containing kelch13 mutations in a search for other variants that could be potentiating slow clearance rates mediated by kelch13 mutations.
No variants exhibited a statistically significant association with clearance rate after correction for multiple testing. However, the most significant (P = 7.1e-4) non-synonymous SNP positively associated with artemisinin resistance (positive beta-value on Additional file 7: Table S3) is located on chromosome 10 in a gene functionally annotated as "kelch protein, putative" (PF3D7_1022600; kelch10). kelch10 has limited sequence similarity to kelch13 (Additional file 3: Figure S3B,C), restricted to a few amino acid positions between one of its instances of "Kelch-type beta-propeller" domain (IPR015915) and the "Galactose oxidase, beta-propeller" domain (IPR015916) of kelch13 [36]. Both domains were defined based on tertiary structure, imported by InterPro from CATH, a protein structure classification database [37], and they represent beta-propellers with 6 and 7 blades in kelch10 and kelch13, respectively, explaining the lack of amino acid sequence similarity between the loci. The mutation in kelch10 induces a proline to threonine amino acid substitution at position 623 (P623T) and is located between instances of beta-propeller domains ( Fig S7B). P623T exhibits variable impact on parasite clearance rate half-life in the presence of different kelch13 mutations (Additional file 3: Figure S3D). It significantly increases parasite clearance rate half-life in the presence of the kelch13 E252Q mutation (Wilcoxon rank sum test, P = 8.7e-3) and mildly affects it in the presence of C580Y or other common kelch13 mutations (P range = 0.23 -1). The kelch10 mutation does not impact clearance rate on a wildtype kelch13 background (P = 0.07; Additional file 3: Figure S3D). We used PCR-based Sanger dideoxy sequencing to genotype the kelch10 mutation in 68 additional samples with clearance rate data and the kelch13 E252Q mutation from the same geographic region. SNP genotyping data (not shown) indicates that 52 distinct parasite clones are represented within these 68 additional samples, with four clonal groups having the P623T mutation on kelch10. Clonal groups containing the E252Q (kelch13) and P623T (kelch10) mutations exhibit a significantly higher clearance rate relative to those with only E252Q mutation (Wilcoxon rank sum test, P = 3.21e-3; Fig. 4). Given that E252Q is the only common kelch13 mutation in SE Asia that occurs outside the BTP-POZ and propeller domains of kelch13, this association may be evidence of an epistatic relationship with kelch10 that potentiates the resistance phenotype of E252Q. Further functional work will be necessary to elucidate the relationship between these variants.
A paucity of isolates exhibiting wildtype kelch13 and slow clearance also compromised the power of a GWAS restricted to wildtype kelch13 isolates. No statistically significant associations were observed after correction for multiple testing.
Temporal screen for kelch13 background SNPs -The set of 15,117 genome-wide SNPs meeting the quality and frequency filters described in the Methods section was screened for positions exhibiting a NRAF trajectory similar to C580Y, under the hypothesis that any variants contributing to a fit genomic background for kelch13 mutations should increase in frequency in tandem with the most successful kelch13 mutation. Two approaches were utilized to identify variants with NRAF trajectories similar to C580Y.
The stricter approach, requiring an NRAF of 0% in the first two sampling intervals and an NRAF ≥ 5% in the third interval, yielded 779 SNPs, a set that we designate C580Y-like. An alternative approach, utilizing a threedimensional vector-distance metric not involving hard NRAF thresholds, yielded 382 SNPs, 158 of which also belong to the C580Y-like set. After removing shared SNPs we designated this set as C580Y-vector-like. Variants segregating non-independently as part of linked haplotype blocks within these two variant sets were identified as described in Methods, and single 'tag SNPs' were selected to represent blocks of variants that routinely co- In the C580Y-like set, the lowest quartile of candidate background variants (5% < NRAF ≤ 6%) exhibits no significant difference from control SNPs with regard to LD with mutant kelch13 (P = 0.47; Wilcoxon rank sum test). The higher frequency quartiles all exhibit enriched LD with mutant kelch13 relative to frequency-matched control SNPs, however, suggesting they may contain legitimate background variants for resistance mutations at that locus (6% > NRAF ≤ 8%; P = 3.49E-3; 8% > NRAF ≤ 10%, P = 1.14E-3; 10% < NRAF ≤28%, P = 2.23E-8). LD analysis of the C580Yvector-like set yields qualitatively similar results ( Fig S6C).
Both the C580Y-like and C580Y-vector-like variant sets were also analyzed for signatures of natural selection using the iHS statistic [38], assuming that variants exhibiting upward NRAF trajectories as a result of neutral genetic drift or sampling error would be unlikely to exhibit independent evidence of selection. Similar to the mutant kelch13 analysis, candidate variants were binned into quartiles according to their 2011-2012 NRAF and compared to control SNPs exhibiting similar NRAF in that sampling interval. Fig. 5B and Additional file 6: Figure S6D illustrate the results of this analysis for the C580Y-like and C580Y-vector-like sets, respectively. As with the analysis for LD with mutant kelch13, in the C580Y-like set variants in the lowest 2011-2012 NRAF quartile (5% < NRAF ≤ 6%) exhibited no significant difference in iHS distribution relative to frequency-matched control SNPs (P = 0.37; Wilcoxon rank sum). Once more, however, C580Y-like variants in the three higher NRAF quartiles exhibited significantly lower iHS distributions than control SNPs, suggesting they may be subject to recent natural selection (6% > NRAF ≤ 8%; P = 7.6E-7; 8% > NRAF ≤ 10%, P=1.74E-6; 10% > NRAF ≤28%, P = 3.22E-10; Wilcoxon rank sum test). iHS analysis of the C580Yvector-like set for natural selection again yields qualitatively similar results (Additional file 6: Figure S6D).
The observation of enhanced LD with mutant kelch13, as well as evidence of enriched signals of natural selection in candidate SNPs exhibiting high NRAF in 2011-2012, indicates that some of these variants may constitute part of a genomic background required for parasite fitness in the presence of kelch13 mutations. An alternative hypothesis is that high-NRAF candidate background SNPs could exhibit these LD and selection signatures as a result of selective sweeps targeting the kelch13 resistance mutations, and that some or all of these candidate background variants are themselves selectively neutral. Though these candidate markers reside on distinct chromosomes (Additional file 7: Table S4 and S5), reduced opportunities for sexual outcrossing in a low-transmission setting like Thailand could result in selective sweeps that impact the whole genome, rather than just the immediate vicinity of the selected kelch13 variants on chromosome 13.
To explore this hypothesis, candidate SNPs were examined for the richness of their association with different kelch13 mutations, under the hypothesis that if these mutations are neutral and were dragged up in frequency due to chance co-occurrence with a kelch13 mutation undergoing a selective sweep, they should be primarily associated with a single kelch13 mutation rather than co-occur with multiple different kelch13 mutations. This analysis indicates that many of the candidate background SNPs with low NRAF exhibit low allelic richness in their kelch13 mutation association, but that candidate background SNP exhibiting medium to high NRAF (> 8%) are found in association with multiple different kelch13 mutations, similar to a set of frequency-matched control variants (Fig. 5C), reducing the likelihood that their NRAF increase is due to a trivial kelch13 hitchhiking scenario. Additional file 7: Table S4 lists all nonsynonymous candidates in the C580Y-like set with NRAF greater than 8%. Additional file 7: Table S5 list nonsynonymous candidates in the C580Y-vector-like set that are disjunct from the C580Y-like set. The Shannon entropy index of the association with distinct kelch13 mutations was calculated for each candidate. SNPs with a higher Shannon entropy index are less likely to have been detected due to hitchhiking with a single kelch13 mutation, given their association with multiple distinct kelch13 mutations (Additional file 7: Table S4 and S5). Table 1 contains a list of some of the most promising candidate background SNPs that pass these filters in the C580Y-like C580Y-vector-like sets, and it includes selected candidates from the GWAS sets. Ranked either by 2011-2012 NRAF or Shannon index, the top candidate in the C580Y-like set is a nonsynonymous mutation in a putative phosphatidylinositol 4-kinase (PI4K) locus (PF3D7_0419900). A distinct Plasmodium PI4K beta locus has been identified as a potential Plasmodium drug target [39,40], and another component of the phosphatidylinositol pathway (PI3P) has been implicated in the mechanism of kelch13-mediated artemisinin resistance [34]. Further functional work will be required to evaluate the potential compensatory role of this and other candidate background mutations, including nonsynonymous mutations in a gene encoding a Sec14-domain-containing protein (PF3D7_0626400), a member of a family of proteins also associated with vesicle trafficking, previously reported as having orthologs in yeast functioning as regulators of PI4K [41] and ranked among the top 14 candidates in C580Y-like set when ranking by Shannon index and 2011-2012 NRAF. Several other genes encoding proteins associated with vesicle trafficking and/or the endoplasmic reticulum were also in the list (Table 1). Table 1 -Selection of non-synonymous mutations detected by C580Y-like, C580-vector-like method and SNPs associated with lower clearance rate according to GWAS.

Discussion
Assuming a conventional eukaryotic mutation rate of approximately 3 x 10 -9 mutations per base pair generation [42,43], and assuming that a typical P. falciparum infection results in approximately 10 11 parasites during its peak, [44] it is likely that virtually all of the 23 million nucleotide positions in the parasite genome mutate into all possible alternative states during the course of a single infection within an individual. The observation that drug GWAS as a means of identifying loci associated with a phenotype subject to natural selection and has the advantage of not requiring phenotype data, which can be much more difficult to collect than genomic data [48][49][50][51][52]. This These results also highlight the potential shortcomings of in vitro resistance selection studies. While kelch13 was discovered by sequencing parasites selected in vitro for resistance to artemisinin [7], such studies cannot recapitulate the complex fitness landscapes influencing the evolution of parasites exposed to drugs in the field. Mutations observed when sequencing culture-adapted parasite lines selected for resistance to an antimalarial compound may or may not represent changes that are eligible for spread in parasite populations in the field, because cultured parasites are not exposed to the same fitness-determining challenges as wild parasites. And different wild populations of parasites may assemble a fit resistance genotype in different ways. Notably, we failed to identify any mutations with our GWAS and temporal analyses in NW Thailand that were previously found to be associated with ART resistance in Cambodia [29]. This lack of replication does not impugn the results from the Cambodian study, because there may legitimately be different determinants of fit genomic backgrounds in the two parasite populations. Ideally, markers for resistance and genomic backgrounds that enable resistance should be identified from parasites collected as close as possible to the geographic setting in which surveillance will take place.
The variants we have identified as candidate background mutations required for the spread of kelch13 resistance mutations occur in some genes that belong to pathways already hypothesized to contribute to ART resistance in P. falciparum. Mbengue and colleagues [34] found elevated levels of phosphatidylinositol-3-phosphate (PI3P) to be associated with ART resistance in vitro. Elevated levels of PI3P can result from polyubiquitination of phosphatidylinositol-3-kinase (PI3K). We observed no polymorphisms in PI3K, but our top hit in the C580Y-like set was a nonsynonymous mutation in a PI4K locus (PF3D7_0419900), which may impact PI3P or other relevant members of the phosphoinositol pathway in a manner conducive to evolutionarily fit resistance. Dogovski and colleagues [57] identified the proteasome/ubiquitination pathway as associated with ART resistance via that pathway's involvement in the cellular stress response. They hypothesize that an enhanced stress response, manifested via lower levels of ubiquitination, delays cell death and confers resistance to ART, and observe enhanced resistance in vitro when ART is co-administered with proteasome inhibitors. In our GWAS and temporal analyses we find several loci associated with the proteasome/ubiquitination pathway, including a putative sentrinspecific protease (SENP2; PF3D7_0801700) and a putative ubiquitin protein ligase (PF3D7_1448400). These loci, as well as the kelch domain-containing protein from chromosome 10 (PF3D7_1022600) that may be epistatically associated with the E252Q mutation at the kelch13 locus, constitute a list of potential genetic surveillance targets in other regions of the world where kelch13 mutations and/or phenotypic resistance are observed, such as Guyana [58]. 2010-2011; 2014) were selected for sequencing following SNP genotyping to avoid sequencing multiple isolates exhibiting the same parasite genotype [26]. Following treatment with mefloquine-artesunate combination therapy [6], parasite density was monitored at six hour intervals until patients were slide negative, at which point combination therapy was administered.

Conclusions
Parasite clearance data were used to estimate clearance half-life as described previously [14]. Ethical approval for this work was given by the Genotype filtering -Polymorphic sites meeting any of the following criteria were removed from the analysis using VCFTools [63] and custom scripts: heterozygous sites, QUAL < 60, GQ < 30, indels, sites lacking genotype calls in 10% or more of the samples within one or more time intervals and sites with non-reference allele frequency (NRAF) lower than 5% in all time intervals.
NRAF was calculated using custom scripts. Polymorphic sites located in pericentromeric, subtelomeric and hypervariable regions (listed in Additional file 7: Table S6) were removed, as were those occurring in genes belonging to large antigenic gene families (Additional file 7: Table S7).

Identifying regions identical by descent (IBD) -A previously described
Hidden Markov Model (HMM) was utilized to identify Identical by Descent (IBD) genomic regions between pairwise comparisons of samples [30].

Genome wide association scan (GWAS) -Genotype calls were converted
from VCF format to PLINK format using VCFTools [63]. GEMMA [64] was run with default parameters to evaluate the degree of association between genotype calls and the clearance rate half-life under artemisinin treatment.
Q-Q and Manhattan plots were rendered using the qqman R package [65].
P-values were corrected for multiple testing using the Benjamini-Hochberg method [33] implemented in the standard R package [66].
Projections connecting regions of similarity between the linear representation of kelch13 and kelch10 sequences (Additional file 1: Figure S1) were rendered by Kablammo [69]. Boxplots in Additional file 1: Figure S1, as others boxplots in this work, were rendered using ggplot2 R package [70].
Temporal Analysis -Two methods were employed to identify SNP sets with temporal frequency signatures concordant with selection for ACT resistance.
First, SNPs were identified that exhibited a sample frequency history strictly similar to the C580Y kelch13 mutation ('C580Y-like'), which has prevailed as the most successful ART resistance mutation in the region [14]. To be defined as C580Y-like, SNPs were required to exhibit a NRAF of 0 during the first two Defining haplotype blocks -Linkage disequilibrium was measured by calculating the Pearson correlation coefficient (r) between all pairs of SNPs within the C580Y-like and C580Y-vector-like sets using custom scripts. Pairs of SNPs exhibiting r greater than or equal to 0.8 were classified as belonging to the same haplotype block.
Extended haplotype homozygosity -Genotypes were imputed by Beagle [71] using the recombination map described on [72] and default parameters.
Linkage disequilibrium with mutant kelch13 -'Mutant kelch13' was defined as any kelch13 genotype containing one or more non-synonymous differences relative to the 3D7 reference sequence (Additional file 9: Figure   S9A). Linkage disequilibrium (r) was calculated between every candidate SNP in the C580Y-like and C580Y-vector-like variant sets and mutant kelch13.

Consent for Publication
Not applicable

Availability of Data and Material
The dataset supporting the conclusions of this article has been submitted to EuPathDB (http://eupathdb.org) is available via the NCBI BioProject database, accession PRJNA262567.

Competing Interests
Not applicable     with NRAF less than 8% generally exhibit association with fewer kelch13 alleles than control SNPs, indicating they may be more likely to be products of genetic hitchhiking than C580Y-like SNPs with higher NRAF.

Additional Files
Additional file 1: Figure S1