Evidence for both sequential mutations and recombination in the evolution of kdr alleles in Aedes aegypti

Background Aedes aegypti is a globally distributed vector of human diseases including dengue, yellow fever, chikungunya, and Zika. Pyrethroid insecticides are the primary means of controlling adult A. aegypti populations to suppress arbovirus outbreaks, but resistance to pyrethroid insecticides has become a global problem. Mutations in the voltage-sensitive sodium channel (Vssc) gene are a major mechanism of pyrethroid resistance in A. aegypti. Vssc resistance alleles in A. aegypti commonly have more than one mutation. However, our understanding of the evolutionary dynamics of how alleles with multiple mutations arose is poorly understood. Methodology/Principal findings We examined the geographic distribution and association between the common Vssc mutations (V410L, S989P, V1016G/I and F1534C) in A. aegypti by analyzing the relevant Vssc fragments in 25 collections, mainly from Asia and the Americas. Our results showed all 11 Asian populations had two types of resistance alleles: 1534C and 989P+1016G. The 1534C allele was more common with frequencies ranging from 0.31 to 0.88, while the 989P+1016G frequency ranged from 0.13 to 0.50. Four distinct alleles (410L, 1534C, 410L+1534C and 410L+1016I+1534C) were detected in populations from the Americas. The most common was 410L+1016I+1534C with frequencies ranging from 0.50 to 1.00, followed by 1534C with frequencies ranging from 0.13 to 0.50. Our phylogenetic analysis of Vssc supported multiple independent origins of the F1534C mutation. Our results indicated the 410L+1534C allele may have arisen by addition of the V410L mutation to the 1534C allele, or by a crossover event. The 410L+1016I+1534C allele was the result of one or two mutational steps from a 1534C background. Conclusions/Significance Our data corroborated previous geographic distributions of resistance mutations and provided evidence for both recombination and sequential accumulation of mutations contributing to the molecular evolution of resistance alleles in A. aegypti.


Methodology/Principal findings
We examined the geographic distribution and association between the common Vssc mutations (V410L, S989P, V1016G/I and F1534C) in A. aegypti by analyzing the relevant Vssc fragments in 25 collections, mainly from Asia and the Americas. Our results showed all 11 Asian populations had two types of resistance alleles: 1534C and 989P+1016G. The 1534C allele was more common with frequencies ranging from 0.31 to 0.88, while the 989P+1016G frequency ranged from 0.13 to 0.50. Four distinct alleles (410L, 1534C, 410L+1534C and 410L+1016I+1534C) were detected in populations from the Americas. The most common was 410L+1016I+1534C with frequencies ranging from 0.50 to 1.00, followed by 1534C with frequencies ranging from 0.13 to 0.50. Our phylogenetic analysis of Vssc supported multiple independent origins of the F1534C mutation. Our results indicated the 410L+1534C allele may have arisen by addition of the V410L mutation to the 1534C allele, or by a crossover event. The 410L+1016I+1534C allele was the result of one or two mutational steps from a 1534C background.
-indicates the mutation was investigated, but it did not exist.
A black box indicates the mutation was not investigated.
The number of studies that found the mutation/the number of studies that examined the mutation.
https://doi.org/10.1371/journal.pntd.0008154.t001 [2,16,17,20]. In 2017, the V410L mutation was first found in A. aegypti from Brazil [12]. Subsequently, the V410L mutation was detected in Colombian and Mexican populations [13,14]. Given the history of the discovery of the different mutations, several studies have only reported on a subset of the mutations that are now known, giving an incomplete picture of the distribution of each mutation and of the alleles present (Table 1). For example, several studies have focused on the V1016I mutation, without considering the V410L and F1534C mutations [6,[21][22][23], even though the V410L+V1016I+F1534C allele can be common [13,14]. Such an incomplete listing of mutations makes understanding the evolution of resistance alleles particularly challenging [4]. Different studies have attempted to reconstruct the evolution of kdr alleles in A. aegypti. Saavedra-Rodriguez et al [14] examined three mutations (V410L, V1016I and F1534C) in populations of A. aegypti collected in Mexico from 2002 to 2012 and proposed that each arose independently and the 410L+1016I+1534C allele arose by two subsequent recombination events. A recent paper [4] proposed sequential evolution of the 989P+1016G+1534C allele, in which the 1016G allele was selected first, followed by 989P +1016G, and finally 989P+1016G +1534C. The 989P+1016G+1534C allele may have arisen from a crossing-over event [24], although sequential accumulation of mutations may also have played a role. The goal of the current study is to improve the understanding of how multiple mutations in Vssc have arisen and evolved. We examined nine Vssc mutations (V410L, L982W, S989P, A1007G, I1011V/M, V1016G/I and F1534C) and pattern types (see Materials and methods) of individuals from various geographical areas to investigate whether or not Vssc alleles with multiple mutations occur by (a) sequential accumulation of individual mutations, (b) by recombination between individual alleles or (c) some combination of the two mechanisms. Our results indicate the 410L+1534C allele could have evolved by both recombination and by addition of a second mutation to an existing resistance allele. This information will help to inform the evolution of kdr resistance in this important vector of human diseases.

Rationale
Vssc in A. aegypti is >450 kb in length and contains >30 exons [25]. To investigate the evolution of kdr mutations in Vssc, we needed to explore the three regions containing mutations (V410L, S989P ± V1016G/I and F1534C) that are common in A. aegypti (Table 1). In addition, we wanted to capture sequence information near the 5 0 and 3 0 ends of Vssc to maximize the chances of detecting recombination events. We selected exons 6-8 and 31-32 for this purpose because their introns/exons were of a size appropriate for reliable PCR amplification and sequencing. Preliminary PCR and sequencing in the V410L or F1534C regions showed very limited sequence diversity (based on sequencing PCR products of about 1000 bp), beyond the presence or absence of the resistance mutations. Thus, for these regions we simply identified the presence or absence of the resistance mutation by allele-specific PCR (ASPCR) (see below). The L982W, S989P, A1007G, I1011M/V and V1016G/I mutations were investigated by sequencing a PCR product spanning exons 20-21. Thus, for each individual mosquito we sequenced three regions of Vssc (exon 6-8, exon 20-21 and exon 31-32) and genotyped for V410L and F1534C by ASPCR, giving us data across Vssc and providing details about which of the known resistance mutations were present (Fig 1).

A. aegypti collections
We evaluated the Vssc sequences of 25 populations of A. aegypti. Eleven of these were from Asia, thirteen were from the Americas and one was from Africa. Our goal was to identify the most common Vssc alleles in each population. At least eight individuals from each population were used, except for California populations (LA, SD, TE and MD) in which four individuals were used. The Costa Rica and Puerto Rico strains were provided by BEI Resources (https:// www.beiresources.org/Home.aspx). Details about populations were shown in Table 2. Mosquitoes were reared at 27˚C (± 1˚C) with 70-80% relative humidity, and a photoperiod of 14 L:10 D. Females were blood fed using membrane-covered water-jacketed glass feeders with cow blood (Owasco Meat Co., Moravia, NY).

DNA extraction
Genomic DNA was isolated from three legs of individual mosquitoes from each population using an alkali extraction method as follows. The legs were placed in individual wells of a 96-well PCR plate (BioRad, Hercules, CA, USA) containing three 2.3-mm diameter zirconia/ silica beads (BioSpec Products, Bartlesville, OK, USA) and 10 μL 0.2 M NaOH. The samples were homogenized on a vortex mixer for 2 min and then incubated at 70˚C for 10 min. Then 10 μL of neutralization buffer (360 mM Tris-HCl, pH 7.5 and 10 mM EDTA) and 80 μl of ddH 2 O were added to each well.

Genotyping
PCR was carried out using 1 μL gDNA, 12.5 μL GoTaq Green Master Mix 2x (Promega, Madison, WI, USA), 9.5 μL ddH 2 O, and 2 μL of 10 μM forward and reverse primer mix. The primers and thermocycler conditions are shown in Table 3. The genotypes for Vssc exons 6-8, 20-21 and 31-32 were obtained by Sanger sequencing. PCR products were treated with Exo-SAP (5 μL PCR product/1 μL of 1 u/μL ExoSAP mix), and sequenced (6 μL ExoSAP treated PCR product, 1 μL primer, 11 μL ddH 2 O) at the Cornell University Biotechnology Resource Center. PCR products with ambiguous sequences were cloned using the pGEM-T Vector System (Promega). Positive colonies were confirmed to have the expected size insert by PCR with primers T7 and SP6 designed from the plasmid sequence, and the PCR products were directly sequenced using the primer T7. Sequences were confirmed by analyzing two or more clones.
ASPCR was used to determine the V410L and F1534C genotypes. Each ASPCR was evaluated on a 1% agarose gel and was scored as homozygous susceptible (ASPCR band only with susceptible primers), homozygous resistant (ASPCR band only with kdr primers) or heterozygous (ASPCR band with both susceptible and kdr primers). Samples were always run alongside DNA from individuals with known genotypes determined by Sanger sequencing. At least one random sample from each population was also sequenced (see above) to validate the ASPCR results.

Sequence analysis
Electropherograms were inspected with Chromas Lite (Technelysium Pty. Ltd., Tewantin, Queensland). We used the term haplotype to refer to the unique allele from sequencing each of the three (E6-8, E20-21 and E31-32) regions. Unknown haplotypes were identified by aligning sequences with known haplotypes using the MegAlign and EditSeq applications of Lasergene 7 (DNAStar, Madison, WI). Haplotypes present in heterozygous samples were identified either by cloning and sequencing (a minimum of two clones) or were inferred by comparison the known haplotypes in same populations on the electropherograms. All inferred haplotypes were observed in more than one individual.
We used the term "pattern sequences" to refer to specific concatenated sequences that were assembled from the sequence of E6-8, the codon from V410L (GTA for V, TTA for L), the sequence of E20-21, the codon from F1534C (TTC for F, TGC for C) and the sequence of E31-32 in each individual mosquito (Fig 1). Pattern sequences from heterozygous individuals were identified by comparison of known pattern sequences from the homozygous samples in the same population. Unique pattern sequences were numbered in the order they were discovered. Population structure analysis STRUCTURE 2.3 [26] was used to infer the co-ancestry of individuals within the 24 populations from Asia and the Americas (Africa was not included because only a single population was available) using their pattern sequences. The STRUCTURE algorithm was run using a 100,000 burn-in period and 500,000 MCMC replications under the admixture model with correlated allele frequencies. This was repeated five times for each K, ranging from 1 to 5. The optimal number of K clusters was determined both following the guidelines of Pritchard et al [26]

Phylogenetic analyses
We used Vssc pattern sequences to create a phylogenetic tree based on maximum likelihood (ML . We assessed support for relationships on each phylogeny by performing 500 bootstrap replicates. In addition, we constructed the parsimony network to identify genealogic relations among patterns across sampling sites based on a TCS network in PopART [34,35]. The probability of parsimony is calculated for the number of mutational steps between patterns until the probability exceeds 0.95 in TCS network.

Frequency and distribution of kdr alleles
There was considerable variation in the Vssc resistance alleles found in populations from Asia vs. the Americas. All Asian populations had only two types of resistance alleles: 1534C and 989P+1016G (Fig 2). The 1534C alleles were more common in each population, except  (Fig 3). Overall, the most common resistance allele was 410L+1016I+1534C, which was found in seven out of 10 populations, with allele frequencies ranging from 0.50 to 1.00. The second most common resistance allele was 1534C which was found in four out of 10 populations, with frequencies ranging from 0.13 to 0.50. The 410L allele was only detected in the SD population with a frequency of 0.13. The 410L+1534C allele was found only in the NO population at a frequency of 0.06. The resistant alleles 1534C and 410L+1016I+1534C were found in all populations from the USA, except AT. In the Colombian populations, the 410L+1016I+1534C allele was fixed in LM, whereas the MN had only the 1534C kdr allele (frequency of 0.50). No Vssc resistance alleles were detected in the AT and CR populations. The L982W, A1007G and I1011M/V mutations were not found in any mosquitoes, nor was the S989P+V1016G+F1534C allele which has been found in Myanmar [10], Saudi Arabia [18] and Malaysia [38]. All of the sequences have been deposited at GenBank (accession nos. MK977825-MK977851, S1 Table).

Phylogenetic analyses
Phylogenetic analyses of 33 determined pattern sequences revealed multiple clades (with bootstrap support �90%)) ( Fig 5). There are at least two independent clades containing 1534C alleles, suggesting multiple evolutionary origins of the F1534C mutation. Other resistance alleles were found in only a single pattern type (410L and 410L+1534C) or were found in two lineages (989P+1016G and 410L+1016I+1534C), making predictions of single or multiple evolutionary origins uncertain. The pattern sequences genealogy network illustrates frequency, relationships and number of mutations between the different pattern types (Fig 6). These results indicate pattern sequence P27 (1534C) and P28 (410L) were the result of one and two mutational steps, respectively, from their common susceptible ancestor P26. Similarly, P24 (1534C) and P25 (410L+-1534C) are the results of one and two mutational step from P23, respectively. Also, the 410L pattern sequence (P28) and 410L+1534C pattern sequence (P25) were distributed in different lineages (Fig 5). These results suggest the 410L+1534C allele arose from in individual already having 1534C. The most common pattern sequences (P29 and P30) with 410L+1016I+1534C arose from P27 (1534C). However, neither the 410L+1534C nor 1016I+1534C were found in the lineage. The 410L+1016I+1534C pattern sequences and 410L+1534C pattern sequence were located in different lineages (Fig 5). This suggested the 410L+1016I+1534C allele more likely originated from a 1534C background by two additional mutations, but was not the result of additional one mutation (1016I) from a 410L+1534C background. In order to determine if recombination may contribute to the evolution of multiple mutations in Vssc alleles, we analyzed the observed pattern sequences associated with kdr mutations. We found sequences consistent with recombination events giving rise to the 410L+1534C allele (P25) from alleles having a 410L (P28) and a 1534C allele (P20) (Fig 7).

Discussion
There are multiple kdr alleles in A. aegypti; some with global distribution and others that are found in more limited areas of the world (Fig 2 and Fig 3). The alleles containing only the F1534C mutation were widely distributed in the populations from Asia and the Americas. Alleles containing the S989P+V1016G mutations were found only in Asia and alleles with the V1016I mutation were only found in the Americas. These results are consistent with previous reports [2]. In 2017, the V410L mutation was detected in populations from Brazil [12]. It was subsequently found in A. aegypti from Colombia [13] and Mexico [14]. We also detected alleles with this mutation in mosquitoes from USA, Colombia and Puerto Rico (Fig 3). This suggests

Evolution of kdr alleles in Aedes aegypti
that Vssc alleles with the V410L mutation are widely distributed in A. aegypti from the Americas. We found a novel Q1835R mutation in the PR strain, but its association with the resistance requires further study.
The frequencies of the 989P+1016G and the 1534C alleles found in the Asian populations are intriguing. The S989P+V1016G allele confers pyrethroid resistance [36,37] and the F1534C allele has been shown to strongly correlate with pyrethroid resistance in A. aegypti. A field collected strain having both the 989P+1016G and 1534C alleles was selected with permethrin and the resulting strain had only the S989P+V1016G allele, suggesting that this allele  Table). The size of the circle represents the number of individuals having that pattern. Short hash-marks perpendicular to pattern branches indicate the number of base pair differences between pattern sequences. Black balls indicate intermediate missing (i.e. unsampled) steps. Geographic regions are coded by colors (Asia is yellow, America is blue and Africa is green). The different kdr alleles are indicated by different symbols.
https://doi.org/10.1371/journal.pntd.0008154.g006 gives greater protection to permethrin than the F1534C allele [4,36]. In all 11 of our Asian populations both the 1534C and 989P+1016G alleles were detected, and the 1534C allele was generally more common with frequencies ranging from 0.31 to 0.88 compared to 989P+1016G allele frequency that ranged from 0.13 to 0.50 (Fig 2). This suggests that the mosquitoes with the F1534C allele may have a lower fitness cost relative to the mosquitoes with the 989P +1016G allele. This finding is supported by Plernsub et al who reported that a strain having the F1534C mutation had no fitness reductions [39]. These results appear to be similar to the findings of Vssc alleles in house flies where kdr-his is common in some populations, even though it provides less resistance than other alleles [40][41][42][43].
Our results have both similarities and differences to what has previously been reported on Vssc alleles in the Americas. The populations we sampled from the USA had high frequencies of the 410L+1016I+1534C allele, and this is similar with Cornel et al. 2016 who reported that two populations from California were homozygous for V1016I mutation, although the 410 and 1534 positions were not examined [23]. Our mosquitos from Florida (AT) had no kdr alleles, and this is not consistent with Estep et al 2018 who found most A. aegypti strains from 62 locations in Florida were fixed or nearly fixed for the F1534C and V1016I mutations with frequencies from 0.11 to 1 [44]. Additional studies in Florida, across sites, seasons and years would be helpful to understand this discrepancy. In our populations from Colombia, the kdr allele types and frequencies are variable, and this is consistent with previous reports [13,45]. Generally, we detected four putative kdr alleles (410L, 410L+1534C, 410L+1016I+1534C and 1534C) in populations from Americas. The most common kdr allele was 410L+1016I+1534C. The PR strain was originally reported to have the 1016I+1534C allele [46], but the V410L position was not examined. Our results indicate that the PR population has the V410L+V1016I+ F1534C allele. Thus, it is likely that the 1016I+1534C allele previously reported in A. aegypti is most likely the 410L+1016I+1534C allele. This also suggested that the high frequencies of 1016I or 1016I+1534C in previous studies are most likely high frequencies of 410L+1016I+1534C. This point is supported by Granada et al 2018 [13] and Saavedra-Rodriguez et al 2018 [14] in which the 1016I and 1016I +1534C alleles were rare, and the 410L+1016I+1534C allele was more common. Our primer sets of E20-21 are also capable of detecting previously reported mutations L982W, A1007G and I1011M/V [5,47], and none of the mutations were detected in any of the mosquitoes we analyzed, indicating they are rare or are geographically constrained.
It is well established that A. aegypti originated in Africa, invaded the Americas and later Asia from the Americas, likely in the 1890s [48][49][50]. Our population structure analysis showed that the populations from Asia and the Americas have similar background (Fig 4), although some kdr alleles have different geographic distributions (Fig 6). This supports the admixture of the populations from Asia and the Americas. For the population structure of mosquitoes from the Americas, it is important to consider that this species was eradicated from much of this area in the 1950s and 1960s, with re-colonization starting in the 1970s [49]. Therefore, it is possible that the mosquitoes with the V410L or V1016I mutations in populations from the Americas were from re-colonization by African populations. This idea is supported by previous reports [17,20] in which the V1016I (no S989P and V1016G) mutation was detected in African populations, unfortunately V410L was not examined in these two studies. Also, the 989P +1016G allele confers a >2000-fold resistance to DDT [37], so it is unlikely that the S989P+-V1016G was ever present and then eradicated in American populations owing to the DTT use, and it is likely that the S989P and V1016G mutations occurred in Asian populations after the A. aegypti invaded Asia.
The intron between exons 20 and 21 is highly polymorphic in some species and being in close proximity to important kdr mutations has facilitated phylogenetic analyses of alleles in Myzus persicae [51], Bemisia tabaci [52], Musca domestica [53] and Leptinotarsa decemlineata [54]. While we did identify nine haplotypes from this region (E20-21) in A. aegypti (S2 Fig), this is far fewer than have been found for the species mentioned above. Thus, our ability to determine a single or multiple origin of the mutations (S989P, V1016G/I) in this region was unsuccessful. As additional E20-21 haplotype sequences become available, this analysis should be revisited.
A previous study on the E20-21 region of Vssc from A. aegypti identified two intron length polymorphisms: group A (250 bp) and B (234 bp) [55]. In our nine E20-21 haplotypes, introns of V4, V8 and V9 belong to intron group A, introns of V2, V3, V5, V6 and V7 belong to intron group B, and the intron of V1 differs from group B by the deletion of a single nucleotide (S2 Fig). The intron of group A was found to be strongly linked with S989P, V1016G/I and D1763Y mutations [6,56]. This agrees with our results in which the haplotypes containing S989P and V1016G/I mutations have the intron of group A (S2 Fig, V4, V8 and V9). Also, our results showed the F1534C mutation had no strong link to either the group A or group B. Specifically, the F1534C mutation was found with both intron A (V4-F1534C and V9-F1534C) and B (V2-F1534C and V1-F1534C), and they were found in populations from both Asia (V1-F1534C, V2-F1534C and V4-F1534C) and Americas (V2-F1534C, V4-F1534C and V9-F1534C) (i.e. showed no geographical distribution) (S1 Table). The findings are not consistent with previous reports in which the F1534C mutation was strongly linked with the intron of group A in Ghana [17] and intron of group B in Taiwan [17,56]. This is most likely that the two studies used A. aegypti populations from limited areas.
Previous studies on single vs. multiple evolutionary origins of kdr mutations have all suggested that multiple origins are more common [51][52][53]57]. Similarly, our results for F1534C suggest that this mutation had multiple evolutionary origins. However, our data could not differentiate between single and multiple evolutionary origins for the 989P+1016G and 410L +1016I+1534C allele. Additional sequences from other strains/populations will be needed before more concrete conclusions can be drawn.
Vera-Maloof et al [58] and Saavedra-Rodriguez et al [14] detected the frequencies of mutations in A. aegypti collected in Mexico from 2000-2012, and proposed the V410L, V1016I, and F1534C mutations arose independently, and then by two recombination events, gave rise to the 410L+1016I+1534C allele. In order to examine if alleles with multiple mutations arose by sequential accumulation of mutations, recombination or a combination of both processes, we analyzed our 33 pattern sequences. Our data equally support the ideas that the 410L+1534C allele could have arisen by recombination (Fig 7) or by addition of the 410L mutation to an individual having the F1534 allele (Fig 6). Without accurate knowledge of mutation and recombination rates it is difficult to decide which model is most likely. Our data suggests that other kdr alleles having multiple mutations could have arisen by sequential accumulation of mutations, but we found no evidence to support that they arose via recombination. However, a larger dataset of alleles would be needed before solid conclusions can be made. Although no 1016I+1534C allele was found in our samples, it has been reported in mosquitoes from Colombia [13] and Mexico [14]. Therefore, based on the sequence polymorphisms associated with the mutations, we put forward possible evolutionary scenarios of how resistance alleles with multiple mutations may have evolved (Fig 8). It appears the 410L+1534C allele occured by addition of a V410L mutation to an individual already having the F1534C allele or by recombination (see above), ii) the 410L+1016I+1534C allele arose in an individual having the 1016I +1534C allele which originates from an individual with a1534C allele. It should be mentioned that we analyzed the relationship between the pattern sequences, not considering geographyical distribution of alleles (because our analysis focused on identification of the most common alleles in each population, not on a deep sampling). Although the 989P+1016G+1534C allele has been reported in A. aegypti populations [18,19,38], it was not detected in our samples. Therefore we cannot propose how this allele arose.
In summary, we examined nine Vssc mutations (V410L, L982W, S989P, A1007G, I1011V/ M, V1016G/I and F1534C) and 33 pattern sequences of individual A. aegypti from 25 geographic collections. Our results show that 1534C and 989P+1016G alleles are widely distributed in Asian populations. The distinct 410L, 410L+1534C, 410L+1016I+1534C and 1534C alleles were found in populations from the Americas where the most common kdr allele was 410L+1016I+1534C, followed by 1534C. The phylogenetic analysis of Vssc suggest multiple independent origins of the 1534C allele. Our data suggest the 410L+1534C allele could have arisen by accumulation of the V410L mutation in an individual already having the 1534C allele or by the crossover of the 410L and 1534C alleles. More kdr pattern sequences (i.e. 1016I+ 1534C and 989P+1016G+1534C) would allow for understanding of how multiple mutations in Vssc have arisen and evolved. Overall, our data corroborate previous geographic distributions of kdr mutations and provides evidence for both recombination and sequential accumulation of mutations contributing to the molecular evolution of kdr mutations in A. aegypti.
Supporting information S1