Genetic profiles of ten Dirofilaria immitis isolates susceptible or resistant to macrocyclic lactone heartworm preventives

For dogs and cats, chemoprophylaxis with macrocyclic lactone (ML) preventives for heartworm disease is widely used in the United States and other countries. Since 2005, cases of loss of efficacy (LOE) of heartworm preventives have been reported in the U.S. More recently, ML-resistant D. immitis isolates were confirmed. Previous work identified 42 genetic markers that could predict ML response in individual samples. For field surveillance, it would be more appropriate to work on microfilarial pools from individual dogs with a smaller subset of genetic markers. MiSeq technology was used to identify allele frequencies with the 42 genetic markers previously reported. Microfilaria from ten well-characterized new isolates called ZoeKY, ZoeMI, ZoeGCFL, ZoeAL, ZoeMP3, ZoeMO, ZoeAMAL, ZoeLA, ZoeJYD-34, and Metairie were extracted from fresh blood from dogs. DNA were extracted and sequenced with MiSeq technology. Allele frequencies were calculated and compared with the previously reported susceptible, LOE, and resistant D. immitis populations. The allele frequencies identified in the current resistant and susceptible isolates were in accordance with the allele frequencies previously reported in related phenotypes. The ZoeMO population, a subset of the ZoeJYD-34 population, showed a genetic profile that was consistent with some reversion towards susceptibility compared with the parental ZoeJYD-34 population. The Random Forest algorithm was used to create a predictive model using different SNPs. The model with a combination of three SNPs (NODE_42411_RC, NODE_21554_RC, and NODE_45689) appears to be suitable for future monitoring. MiSeq technology provided a suitable methodology to work with the microfilarial samples. The list of SNPs that showed good predictability for ML resistance was narrowed. Additional phenotypically well characterized D. immitis isolates are required to finalize the best set of SNPs to be used for large scale ML resistance screening.


Background
Dirofilaria immitis is the causative agent of heartworm disease, which can produce life-threatening morbidity that affects dogs, cats, and wild canids [1][2][3][4][5][6]. This filarial nematode is distributed in North and South America, Southern Europe, Japan, Australia, India, and China [7,8]. The macrocyclic lactones (ML) milbemycin oxime, ivermectin, moxidectin, and selamectin are the available prophylactic drugs in the U.S. veterinary marketplace that prevent the establishment of L3-L4 larval D. immitis stages in dogs and cats [9]. The first ML loss of efficacy (LOE) report was published in 2005 [10]. Reports of LOE dogs in the United States have persisted over the years. Some of these suspected LOE cases are no doubt due to lack of full compliance with recommended chemoprophylaxis regimens [11]. Nevertheless, recently ML resistance has been confirmed in the U.S. [12,13]. Because ML resistance has a genetic origin [14], whole genome analysis has been performed on well characterized susceptible D. immitis isolates from the U.S., Italy, Gran Canaria, and Grenada and LOE isolates from the U.S. to identify genetic differences that could correlate with evidence of LOE and resistance [13]. One hundred eighty-six single nucleotide polymorphisms (SNP) showed highly significant differences between pools of susceptible and LOE D. immitis. Based on these 186 SNPs, Sequenom® SNP frequency analyses were conducted on 663 individual parasites (adult worms and microfilariae) which were phenotypically characterized as susceptible (SUS), confirmed ML treatment survivors/resistant (RES), or suspected resistant/loss of efficacy (LOE) parasites. This approach identified 42 SNPs that appeared to differentiate ML-susceptible from LOE and resistant D. immitis isolates [13]. It is highly desirable to reduce the number of marker SNPs using additional ML phenotypically characterized D. immitis isolates. Previously, only a small number of confirmed resistant isolates had been genetically characterized. Ultimately, the goal is to build a robust protocol for field application that could be used to monitor for resistant D. immitis isolates in the field. Such genetic markers for susceptibility/resistance may also be useful in developing protocols for managing drug resistance in D. immitis and for establishing improved scientifically based protocols for registration of new heartworm preventives.

Samples
Ten isolates were provided by Zoetis Animal Health for analysis. The detailed information on the isolates has been published elsewhere [15,16]. The ML phenotype response was assessed in efficacy studies with a dose of 3 μg/kg of moxidectin (MOX), and the origin of the isolates are presented in Table 1. In total, five isolates were susceptible to MOX while five were resistant to heartworm preventive to varying degrees.

Sample manipulation and DNA extraction
Microfilariae (MF) were shipped to McGill University in fresh blood collected in EDTA tubes from untreated dogs infected with each isolate. A filtration procedure [13] was used to extract and clean MF from a 15-to 20-mL blood sample. Polycarbonate membrane filters (3.0 μm; 25 mm; Sterlitech® Corporation, Kent, WA, USA) were used for the filtration. A 1:1 dilution from venous blood with NaHCO 3 (SIGMA®, Aldrich Co., Oakville, ON, Canada) solution (2 g/L) was made before filtration (5-20 mL per filter). DNA extraction of pooled MF was achieved using QIAamp® DNA Micro kit (Qiagen Inc., Toronto, ON, Canada). DNA concentrations for all samples were evaluated using the Quant-iT™ PicoGreen DNA Assay Kit (Invitrogen®, Life Technologies Inc., Burlington, ON, Canada).

SNP markers
Based on previous work, 42 SNPs out of the 186 identified from the whole genome [13] were evaluated as they seemed to better differentiate the ML-susceptible phenotype from the LOE and resistant phenotype. The list of SNP positions investigated in the current study is available in Additional file 1 at the end of this article.

Genetic analysis
Ten DNA pools of MF were sequenced with MiSeq® at Génome Québec Innovation Centre (McGill University). Libraries were prepared following two successive thermocycler  [15]. The efficacy study was based on 3 μg/kg of oral moxidectin. Efficacy studies described in McTier et al. [15] steps for tagging with CS1 and CS2 primers [17] and barcoding the fragments. The lists of primers are available in Additional file 2 at the end of this article. Then the fragments were pooled and purified using AMPure XP beads (Beckman Coulter, Inc.) [18], and library quality control was performed. Libraries were then run on a MiSeq® sequencing system using paired end read of 250 base pairs (PE250).

Data analysis
Reads obtained from MiSeq® were trimmed from the 3' end to have a Phred score of at least 30. Illumina sequencing adapters were removed from the reads, and all reads were required to have a length of at least 50 bp. Trimming and clipping were performed using Trimmomatic (http://www.usadellab.org/cms/?page=trimmomatic) [19]. The filtered reads were aligned to the nDi.2.2. D. immitis genome (http://www.nematodes.org/genomes/diro filaria_immitis/). Each read set was aligned using BWA (http://bio-bwa.sourceforge.net/) [20], which has a low error rate (<3%), and which a Binary Alignment Map file (.bam) created. Then, all read set BAM files from the same sample were merged into a single global BAM file using Picard (http://broadinstitute.github.io/ picard/). BVATool (https://bitbucket.org/mugqic/bva tools/src) was then used to extract from the BAM files the read frequencies at each of the 42 SNPs. The read frequencies were assimilated to the allele frequencies.
Allele frequencies from the ten isolates were compared with those of isolates described previously [13].

Statistical analysis
The identification of the best SNPs to predict ML resistance from these samples was assessed visually by plotting the allele frequencies using GraphPad Prism Software (Version 5). Any allele frequency difference between groups was assessed using Chi-squared tests. The limit of significance was p value = 0.05.

Predictive model
The Random Forest algorithm [21] as implemented in the "Biomarker analysis" module in MetaboAnalyst 3.0 (http://www.metaboanalyst.ca) [22][23][24][25][26] was used to build classification models and to evaluate their performance in predicting ML phenotypes in D. immitis. Random Forest is a well-established algorithm based on ensemble learning using a multitude of decision trees. It has been successfully used in building predictive models from SNP data [27,28]. The tool allowed the identification of different SNP combinations that could best distinguish the two groups. In this case, a score of zero was the optimal value for ML susceptibility prediction; and a score of one was the optimal value for ML resistance prediction. Using a cut-off of 0.5, any sample with a predicted class probability less than 0.5 was considered ML susceptible while any sample with a predicted class probability higher than 0.  further insight, a heat map was constructed with the percentage of the alternative allele that characterized resistance and the phenotype response from the ML efficacy study of each isolate. The Metairie isolate was not used in the heat map, as the percentage efficacy was not known, although the isolate was classified as resistant. The susceptible samples from a previous study [13] were assumed to give 100% reduction in an efficacy trial, while RES-1 and RES-2, a resistant isolate [13], were known to have 21.6% and 71.3% efficacy, respectively, with ivermectin.

Genetic analysis
The mean depth sequencing coverage of the region, including the SNP, was~2000X. The percentage frequencies of the alternative alleles of the 42 SNPs previously associated with LOE and resistance [13] are presented in Fig. 1. The differences of the percentage alternative allele frequencies between the new resistant isolates ZoeRES (ZoeMO, ZoeAMAL, ZoeLA, ZoeJYD-34, Metairie) and the new susceptible isolates ZoeSUS (ZoeKY, ZoeMI, ZoeGCFL, ZoeAL, ZoeMP3) showed that 40 of the 42 SNP positions had a higher percentage of the alternative alleles in ZoeRES compared with ZoeSUS (Fig. 1). The allele frequencies identified in the ZoeSUS and in ZoeRES populations are in accordance with the allele frequencies previously reported [13]. Two SNP positions (NODE_42411_RC and NODE_21554_RC) are presented as examples in Fig. 2 which shows the percentage allele frequencies of these two SNPs in all of the characterized D. immitis isolated collected so far. This current study allowed the number of SNPs that can best predict ML response in D. immitis to be narrowed.
In addition, ZoeMO and ZoeJYD-34 isolates were compared. The difference between the two samples was that ZoeMO was the ZoeJYD-34 isolate re-passaged to a recipient dog 1.5 years later, with no intervening drug exposure. Thus, ZoeMO D. immitis population was a subset of the Zoe-JYD34 population (from a 50 L 3 inoculum). So, ZoeMO and ZoeJYD-34 parasite populations are related but may not be genetically identical. ZoeMO and ZoeJYD-34 isolates had 82 and 19% percentage MOX efficacy, respectively [15] (Table 1). Interestingly, in Fig. 3, out the 42 SNPs described, 28 SNPs showed a higher frequency of the alternative allele (resistance associated) in ZoeJYD-34 compared with ZoeMO, six SNPs shared similar genetic profiles while nine SNPs had showed a higher frequency of the alternative allele in ZoeMO compared with ZoeJYD-34.

Predictive model
Using the Random Forest algorithm as implemented in the biomarker module from MetaboAnalyst 3.0, a series of predictive models were generated using any combination of two, three, five, and ten SNPs to differentiate the ML-susceptible isolates from the resistant isolates (see Additional file 3 at the end of this article). The results showed that using just a few SNPs, all models perform well with >90% sensitivity and 100% specificity. Given the relatively small the sample size (n = 17), it is expected that more robust performance estimate will be obtained when results of more isolates become available. For practical reasons for future field application, it was decided to test SNP combinations with the three-and five-SNP models, the three and five SNPs that gave, individually, the best performance (Fig. 4). Caution is necessary as they may be only the best individual SNP markers for this current dataset, and not necessarily for new isolates. Thus, there is a risk that performance evaluation was overoptimistic due to small sample size. With this in mind, only SNPs labeled NODE_42411_RC, NODE_21554_RC, and NODE_45689 were used in the three-SNP model. NODE_42411_RC, NODE_21554_RC, NODE_45689, NODE_20587_RC, and NODE_9400 were used in the five-SNP model. The results are presented in Fig. 4. Interestingly, the three-SNP model, using the SNPs NODE_42411_RC, NODE_21554_RC, and NODE_45689, better differentiated the ML-susceptible samples from the ML-resistant samples compared with the five-SNP model. In both cases, however, none of the samples were misclassified. When the allele frequencies of some of the SNPs used in the best three-SNP and five-SNP models were plotted against the percentage resistance (100 -% efficacy) for the nine isolates for which efficacy data are available, significant regressions were obtained (Fig. 5).
The heat map presented in Fig. 6 allowed the phenotype ML response of the isolates to be mapped against  The difference between the two analyses was that for the individual performance, the ML response was categorical (0 for susceptible and 1 for resistant), while in the heat map, phenotype was a continuous variable as the percentage reduction efficacy was used. Details of the estimation of efficacy are described in detail elsewhere [15]. Although the genetic data and the biological data were in accordance, the data should be  Linear regressions with some of the SNPs retained in the three-SNP and five-SNP models (Fig. 4). Each regression is based on the nine isolates for which efficacy data was available. RC and alt stand for reverse complement and alternative nucleotide, respectively treated with caution, as the efficacy studies and samples for genetic analysis (using untreated control dogs only) were related but not from the same dogs (in the efficacy studies, obviously the treated dogs were in the treatment group, whereas the MF for genetic analysis were from untreated dogs, from the same isolate). In general, the heat map showed significant differences in the frequencies of the alternative alleles between ML-susceptible and MLresistant isolates.

Discussion
At a time when ML resistance in D. immitis has become a concern [12,13], identifying reliable genetic markers to predict ML response is important. The current study allowed us to determine the percentage of alternative alleles, previously identified as putative markers, in additional phenotypically well-characterized D. immitis isolates. ZoeRES isolates showed similar genetic profiles to LOE and confirmed resistant isolates previously reported [13], which was encouraging when seeking universal genetic markers to predict ML response in D. immitis. In addition, MiSeq technology appeared to be a suitable technology to work with MF pool samples.
Genetic analysis of ZoeMO and ZoeJYD-34 isolates was consistent with some reversion towards susceptibility in ZoeMO compared with parental ZoeJYD-34. Previous work in Onchocerca volvulus [29], a closely related filarial parasite, showed that female worms that carried an IVM selected genotype were less fertile than unselected worms. Thus, a possible difference in fitness between susceptible and resistant parasites could be considered. With no additional drug pressure on ZoeMO, the more susceptible female worms in the population may have produced more susceptible offspring (microfilariae) that could change the genetic profile and the resistance phenotype of ZoeMO compared with ZoeJYD-34. These fitness possibilities need investigation.

Conclusion
Predictive models based on the Random Forest algorithm offer a promising approach to investigate which SNP combinations would best predict ML response. At the current stage, due to a limited sample size (n = 17), the combination of SNPs identified with the mathematical model may not be yet the final optimal set of markers. However, they will provide useful tools to consider when additional isolates can be added to the current models. This should help to identify a small number of the SNP for field monitoring for resistance. A new study supported by the American Heartworm Society to further evaluate the SNP markers in isolates coming from veterinary clinics in U.S., should provide additional information and increase confidence in using these SNPs for resistance identification and monitoring.