Genomic Diversity of Azole-Resistant Aspergillus fumigatus in the United States

ABSTRACT Azole resistance in pathogenic Aspergillus fumigatus has become a global public health issue threatening the use of medical azoles. The environmentally occurring resistance mutations, TR34/L98H (TR34) and TR46/Y121F/T289A (TR46), are widespread across multiple continents and emerging in the United States. We used whole-genome single nucleotide polymorphism (SNP) analysis on 179 nationally represented clinical and environmental A. fumigatus genomes from the United States along with 18 non-U.S. genomes to evaluate the genetic diversity and foundation of the emergence of azole resistance in the United States. We demonstrated the presence of clades of A. fumigatus isolates: clade A (17%) comprised a global collection of clinical and environmental azole-resistant strains, including all strains with the TR34/L98H allele from India, The Netherlands, the United Kingdom, and the United States, and clade B (83%) consisted of isolates without this marker mainly from the United States. The TR34/L98H polymorphism was shared among azole-resistant A. fumigatus strains from India, The Netherlands, the United Kingdom, and the United States, suggesting the common origin of this resistance mechanism. Six percent of azole-resistant A. fumigatus isolates from the United States with the TR34 resistance marker had a mixture of clade A and clade B alleles, suggestive of recombination. Additionally, the presence of equal proportions of both mating types further suggests the ongoing presence of recombination. This study demonstrates the genetic background for the emergence of azole resistance in the United States, supporting a single introduction and subsequent propagation, possibly through recombination of environmentally driven resistance mutations.

IMPORTANCE Aspergillus fumigatus is one of the most common causes of invasive mold infections in patients with immune deficiencies and has also been reported in patients with severe influenza and severe acute respiratory syndrome coronavirus 2 (SARs-CoV-2). Triazole drugs are the first line of therapy for this infection; however, their efficacy has been compromised by the emergence of azole resistance in A. fumigatus, which was proposed to be selected for by exposure to azole fungicides in the environment [P. E. Verweij, E. Snelders, G. H. J. Kema, E. Mellado, et al., Lancet Infect Dis 9:789-795, 2009, https://doi.org/10.1016/S1473-3099(09) . Isolates with environmentally driven resistance mutations, TR 34 /L98H (TR 34 ) and TR 46 /Y121F/ T289A (TR 46 ), have been reported worldwide. Here, we used genomic analysis of a large sample of resistant and susceptible A. fumigatus isolates to demonstrate a single introduction of TR 34 in the United States and suggest its ability to spread into the susceptible population is through recombination between resistant and susceptible isolates.
KEYWORDS azole resistance, Aspergillus fumigatus, whole-genome sequencing, population genomics, population structure, TR 34 /L98H, drug resistance mechanisms, population genetics A spergillus fumigatus is a globally distributed environmental fungus and the primary cause of aspergillosis, a serious and often fatal infection that mainly affects patients with impaired immune systems or lung function but can also occur in people without these underlying conditions (1,2). Recent studies demonstrate that invasive pulmonary aspergillosis is a lethal and frequently misdiagnosed complication of severe influenza and severe acute respiratory syndrome coronavirus 2 (SARs-CoV-2) infections that can cause illness and death in previously healthy individuals (2)(3)(4)(5)(6)(7)(8)(9). Medical azoles, such as itraconazole, posaconazole, and voriconazole, are the first line of therapy against aspergillosis. However, numerous clinical reports and surveillance studies document the emergence of azole-resistant A. fumigatus (ARAf) over the past decade (10).
Azole-resistant A. fumigatus isolates are characterized by a wide variety of genomic resistance mechanisms. However, the predominant azole resistance mechanism found among ARAf isolates is one or more substitutions in the sterol 14a-demethylase gene (cyp51A) involved in the biosynthesis of ergosterol, the main component of the fungal cell membrane; medical azoles target the product of this gene. Numerous reports document a variety of single nucleotide substitutions (SNPs) in the cyp51A gene that arise in patients that undergo prolonged antifungal therapy, such as patients with cystic fibrosis and chronic pulmonary aspergillosis (11)(12)(13). In contrast, ARAf isolates with duplications in the cyp51A gene promoter coupled with specific amino acid substitutions, known as TR 34 /L98H and TR 46 /Y121F/T289A, are becoming widespread in the environment and are frequently isolated from patients without previous exposure to antifungal drugs (14)(15)(16)(17)(18)(19). Clinical ARAf isolates harboring these environmentally derived mutations are emerging as the predominant cause of resistance in patients without azole exposure (20).
It is now widely accepted that the emergence of TR 34 /L98H and TR 46 /Y121F/T289A substitutions in A. fumigatus is associated with selection pressure from azole fungicides that target the cyp51A gene product. In recent years, the use of these fungicides in crop agriculture in the United States has increased; therefore, it is not surprising that reports of ARAf isolates in patients and the environment harboring these substitutions has also increased (20,21). The presence of clinical or environmental ARAf with TR 34 / L98H and TR 46 /Y121F/T289A substitutions has now been documented on six continents (14,17,(22)(23)(24)(25)(26). Up to 15% of clinical isolates in Europe carry these substitutions, which dramatically decrease treatment options for patients infected with ARAf strains (27,28). Recent genomic analysis demonstrated that azole-resistant isolates from patients and the environment are genetically related (29).
Although still uncommon in the United States, ARAf with environmentally derived substitutions in cyp51A have emerged in at least 3 states (30). While the initial survey of 1,026 A. fumigatus isolates conducted by the CDC from 2011 to 2013 did not identify isolates with TR 34 /L98H or TR 46 /Y121F/T289A substitutions, subsequent surveillance reported the emergence of these polymorphisms among clinical and environmental ARAf isolates in the United States (15,(31)(32)(33)(34)(35). However, it remains unknown whether isolates with these mutations were introduced into the United States or emerged independently. To address this question and to gain deeper insight into the genetic diversity and the emergence of ARAf isolates in the United States, we applied whole-genome sequencing on a subset of A. fumigatus isolates from a collection of 1,736 clinical and environmental isolates from various regions in the United States (15,31). The aim was to (i) characterize the genetic relationship among ARAf isolates from the United States and global isolates, (ii) identify the genetic substitutions in the cyp51A gene that may influence antifungal susceptibility, and (iii) evaluate the extent of genetic recombination in the A. fumigatus population to test the hypothesis that recombination may have contributed to the distribution of the TR 34 /L98H allele in the United States.

RESULTS
Azole resistance and substitutions in cyp51A. A total of 179 A. fumigatus isolates from 36 states were included in the analysis (see Table S1 in the supplemental material). This sample included (i) 164 representative clinical isolates selected from 1,736 A. fumigatus isolates submitted to the CDC as part of the ongoing passive surveillance for azole resistance from 2015 to 2017, (ii) 13 previously reported environmental isolates from a peanut farm in the state of Georgia, and (iii) two historical type isolates from the CDC strain collection (15). We included all isolates with elevated MICs to itraconazole and voriconazole available at the time as well as susceptible isolates from seven geographic areas in the United States (West, Midwest, Mountain, Central, Northeast, Mid-Atlantic, and Southeast) (Fig. 1). In addition, publicly available genomes of 18 isolates from The Netherlands, India, and the United Kingdom were included for comparison. Of the 179 U.S. samples, 46 (26%, 33 clinical and all 13 environmental) were deemed resistant to triazoles using an MIC cutoff value of 2 mg/ml for either itraconazole or voriconazole (Table S1).
To investigate the potential mechanisms of resistance, we downloaded cyp51A sequences and searched for nonsynonymous substitutions in this region. Forty-four isolates (25%) had nonsynonymous cyp51A substitutions, three of which, G448S, TR 34 /L98H, and TR 34 /L98H/S297T/F495I, are known to confer resistance to one or more triazole (Table 1;  Table S1). Three clinical isolates with TR 34 /L98H substitution were identified in the  Mid-Atlantic region. In addition, eight isolates with TR 34 /L98H/S297T/F495I were identified: two clinical (one from the Mid-Atlantic region one from the West), and six environmental isolates from a peanut farm in the Southeast. Other substitutions in cyp51A were also identified, but they were present in both resistant and susceptible isolates and were not linked to resistance (Table 1). An identical wild-type cyp51A allele was found in 135 isolates; of those, 120 were susceptible and 15 were resistant to azoles (Table S1). Additionally, we also searched for variants within the 3-hydroxy-3-methyl-glutaryl-coenzyme A (HMG-CoA) reductase-encoding gene, hmg1: all sequences were wild type. Phylogenetic analysis using the neighbor-joining algorithm showed little genetic variation in the cyp51A locus, a total of 27 SNPs were identified in a 2,049-bp region. All sequences from ARAf isolates with the TR 34 /L98H substitution from U.S. and non-U. S. isolates formed a single monophyletic clade on the phylogenetic tree (see Fig. S1).
Genetic diversity of A. fumigatus isolates. We performed whole-genome sequencing to investigate genetic diversity among isolates. All genomes had greater than 100Â sequencing coverage and mapped to .95% of the publicly available reference genome AF293 (Table S1). Genomes representing the United States and the non-U.S. genomes shared a total of 228,546 SNPs. Phylogenetic analysis using the neighborjoining algorithm showed the presence of two major clades (Fig. 2). Clade B represented azole-susceptible and -resistant isolates without the TR 34 /L98H resistance marker (N = 164 [83%]); resistant isolates belonging to this clade were geographically spread across the United States and can be further divided into two subclades. Clade A comprised 24 (12%) isolates with the TR 34 /L98H resistance marker from the United States, India, The Netherlands, and United Kingdom as well as 9 isolates without this marker from the United States, The Netherlands, and United Kingdom (Fig. 2). Notably, 7 of the 18 U.S. isolates clustering with clade A were located on the long branches at  (Fig. 2). Six environmental isolates with TR 34 /L98H/S297T/F495I from a peanut farm in Georgia were genetically related and likely came from a clonal population.
Population structure of A. fumigatus. To explore the genetic relationships among isolates, we performed principal-component analysis. Analysis of the first two principal components, representing the majority of the variation (77.6% of the total variance) in this data set, demonstrated the presence of two populations, A and B, that corresponded to the A and B clades on the phylogenetic tree; in addition, 14 isolates were located between the two populations ( Fig. 2 and 3) (17). A collection of globally represented A. fumigatus isolates harboring the TR 34 /L98H resistance marker clustered along the first principal component, capturing much of the uniqueness in this data set (62% variation) (Fig. 3). The majority of U.S. and non-U.S. isolates with the TR 34 /L98H resistance marker were found along this axis (population A). Isolates mainly representing azole-susceptible and ARAf isolates without the TR 34 /L98H marker clustered along the second principal component (population B) (Fig. 3). Less genomic variation was observed among A. fumigatus isolates within the United States (15.6%); however, two subpopulations within population B which correspond to the two subclades in clade B can be differentiated. In addition, 3 genotypes occupied an intermediate position between the two populations (Fig. 3).
We used ADMIXTURE to further investigate the population structure and assess the extent of genetic recombination between A and B populations as well as between isolates with and without the TR 34 /L98H resistance marker (36). When criteria were set to two populations (K = 2), ADMIXTURE assigned 164 (83%) isolates from the United Genomic Basis of Azole-Resistant A. fumigatus ® States to population B, while isolates with TR 34 /L98H markers from India, The Netherlands, and the United Kingdom were assigned to population A. In addition, six isolates (18%), representing the Midwest, Mountain, Southeast, and West regions of the United States were also assigned to population A (Fig. 4). Furthermore, 12 (6%) isolates from the United States, including those with the TR 34 /L98H markers, had a mixture of population A and population B alleles, indicative of recombination between the populations. Notably, all isolates with the TR 34 /L98H resistance marker shared more than 50% of their alleles with the isolates from population A, which suggests they may be a product of recombination between the two populations (Fig. 4). Similar results were observed when the ADMIXTURE criteria were set to 3 populations (K = 3) (see Fig. S2), in which case, the majority of isolates from the United States contained a mixture of alleles from two subpopulations corresponding to the two subclades within the B clade on the phylogenetic tree, while isolates with TR 34 /L98H marker were assigned to population A and possessed more than 50% of their alleles with the isolates from population A. Although a K value of .3 was not supported by principal-component analysis (PCA) or phylogenetic analysis, ADMIXTURE results for K values of 4 to 10 are shown in Fig. S3.
Distribution of mating types. Aspergillus fumigatus isolates carrying a mixture of alleles from the two different populations, which suggests the presence of recombination and sexual reproduction. To further test whether A. fumigatus in the United States is capable of sexual reproduction, we mined genome assemblies for mat1-1 and mat1-2 genes. The mat1-1 gene was present in 55% of genomes, while the mat1-2 gene was found in 45% of A. fumigatus genomes ( Fig. 4; see also Fig. S4). While the mat1-2 gene was more frequent among ARAf isolates, the difference was not significant (P . 0.05). This analysis demonstrates a nearly equal proportion of both mating types present in this sample and allele sharing between different mating types (Fig. 4).

DISCUSSION
The environmentally associated polymorphism TR 34 /L98H occurs globally but only recently emerged in the United States among clinical and environmental ARAf. Azole resistance in A. fumigatus is primarily linked to substitutions in the cyp51A gene. Of the 46 resistant isolates from the United States in our collection, nearly two-thirds had nonsynonymous substitutions in the cyp51A gene, three of which, including the fumigatus isolates without the TR 34 /L98H were widespread across multiple regions of the United States and shared less than 50% of their alleles with non-U.S. isolates. Azole resistant A. fumigatus isolates with the TR 34 /L98H carrying the mat1-2 idiomorph were commonly seen in U.S. isolates, while mat1-1 was seen in non-U.S. isolates. environmentally derived TR 34 /L98H mutation, are known to cause resistance. Other substitutions in cyp51A were also observed in both resistant and susceptible isolates; however, their role in conferring resistance to azoles requires additional investigation. In addition, we identified 16 resistant isolates with a wild-type copy of cyp51A, indicating the presence of other mechanisms of resistance in the U.S. population of A. fumigatus. No isolates with TR 46 /Y121F/T289A resistance polymorphisms were identified in our collection, although detection of isolates with these alleles in the United States was reported by others (33,34).
Recent studies utilizing high-resolution whole-genome sequencing and microsatellite typing have identified two genetically distinct populations of A. fumigatus (17,19,29,34). These studies indicate that most ARAf isolates harboring TR 34 /L98H form a single clade that is genetically distinct from other clades and includes isolates with other resistance mechanisms and wild-type genotypes. Consistent with these observations, phylogenetic analysis using whole-genome sequencing data and PCA identified the presence of two major populations in the United States, which, according to the nomenclature used by Sewell et al., were designated clade A and clade B (17).
Phylogenetic analysis demonstrated 83% of U.S. isolates clustered with clade B, all of which were either susceptible to azoles or had other mechanisms of azole resistance; no isolates with the TR 34 /L98H marker were detected in this population. Conversely, all U.S. isolates with the TR 34 /L98H marker clustered with clade A; however, most of them occupied an intermediate position between the two populations on the PCA plot and were placed on long branches at the base of clade A that suggested that they might have been products of recombination between the two populations. However, the main limitation of these analyses is that neither PCA nor phylogenetic analysis considers a possibility of recombination between isolates.
We used ADMIXTURE that considers recombination and estimates the contributions of alleles from different populations in a recombining population (36). This analysis confirmed that azole-resistant isolates with the TR 34 /L98H resistance marker from The Netherlands, India, and the United Kingdom belonged to population A. In addition, two isolates from the United States, one with TR 34 /L98H from the West and another susceptible isolate from the Midwest, possessed 100% of the alleles assigned to population A. Furthermore, 7% of the isolates from the United States contained a mixture of alleles from both populations, and all U.S. isolates with the TR 34 /L98H marker had at least 50% of the alleles from population A, suggesting ongoing recombination between the populations and a possible spread of TR 34 /L98H into population B in the United States (Fig. 3). Sexual reproduction was previously reported in A. fumigatus (37)(38)(39)(40). Our analysis shows the nearly 1:1 distribution of the mating types in our sample that is consistent with a recombining population. Isolates with both mating types were identified among TR 34 /L98H strains in the United States.
To further investigate the origin of the TR 34 /L98H marker in the United States, we interrogated cyp51A gene genealogy, a much smaller region that is less likely to be affected by recombination than the entire genome. Unfortunately, only a small number of single nucleotide polymorphisms were detected in this locus, limiting the extent of phylogenetic inference that could be conducted with this locus. However, these limited data are consistent with a single monophyletic origin of TR 34 /L98H sequences. Interestingly, isolates with TR 34 /L98H/S297T/F495I were closely related to TR 34 /L98H sequences. Consistent with phylogenetic analysis, isolates with this marker also formed a separate phylogenetic subgroup on the genome phylogeny within clade A and consisted mainly of isolates from the United States, which was indicative of the genetic relatedness among these isolates.
Because of the environmental occurrence and the rapid spread in Europe and Asia, the emergence of TR 34 /L98H and TR 46 /Y121F/T289A mechanisms of resistance is a serious concern to public health. One of the major questions about the emergence of these isolates in the environment is whether these markers can arise independently in multiple populations or whether these substitutions have arisen once and are spreading through migration and recombination. In addition to an academic interest, the answer to this question has important public health implications. Although they do not provide direct proof, the combined results of PCA, ADMIXTURE, whole-genome, and single-locus phylogenetic analysis are consistent with the hypothesis of a single introduction of the TR 34 /L98H mutations in population A and the ongoing spread of this marker throughout clade A possibly through recombination. Since the majority of the U.S. isolates of A. fumigatus are represented in population B, the current low prevalence of the TR 34 /L98H mutations among the U.S. isolates is not surprising. However, the ability of isolates from two populations to recombine is suggestive of future spread of this mutation among U.S. A. fumigatus populations.
Our results demonstrate that isolates with this resistance mechanism have been introduced into the United States (21). The use of azole antifungals in agriculture in the United States is rapidly increasing; therefore, the proportion of resistance isolates in the environment and clinics is also expected to increase, underscoring the importance of public health, environmental surveillance, and the development of prevention measures (21). Therefore, the CDC and AR Laboratory network have launched surveillance for ARAf in the United States (https://www.cdc.gov/drugresistance/laboratories/ AR-lab-network-testing-details.html).

MATERIALS AND METHODS
Isolates. Species identification and antifungal susceptibility testing were previously reported on a combined collection of 1,736 clinical and environmental isolates in the United States (15,31). A total of 179 clinical isolates, 46 azole resistant and 133 randomly selected azole susceptible, collected in 2015 to 2017 from 7 geographic areas (36 states) were included (see Table S1 in the supplemental material); the origin of the isolates was assigned based on the submitter address. Thirteen azole-resistant environmental isolates were also included. Two historical isolates, which later were identified as different aliquots of the same NRRL 163 (ATCC 1022) strain, were obtained from the Mycotic Diseases culture collection, Centers for Disease Control and Prevention, Atlanta, Georgia.
Growth, DNA extraction, and whole-genome sequencing. All isolates were grown on Sabouraud dextrose agar (SDA) slants at 37°C for 24 h; conidia were harvested and preserved in 20% Tween 80-sterile water. Conidial suspensions were serially diluted; 1 ml of the suspension was placed in a 25-ml canonical tube containing yeast-peptone-dextrose (YPD) broth. This mixture was grown for 12 to 18 h in a 37°C centrifugal shaker at 300 Â g. Fungal hyphae were transferred to a 2-ml centrifuge tube and spun at 13,000 Â g to remove residual broth. For DNA extraction, a Zymo research fungal/bacterial miniprep kit (Zymo Research, USA) coupled with a Zymo research large-fragment DNA recovery kit was used per manufacturer's guidelines. Genomic library preparation and sequencing were performed as previously described (41).
Gene analysis. All sequence data were assembled using default parameters in SPAdes version 3.9 (42). Assemblies were evaluated using QUAST version 2.3 (43). Nucleotide BLAST databases were generated using the cyp51A gene (accession AF338569) and mating type genes mat1-1 and mat1-2 (44). De novo genome assemblies were compared to the cy51A gene using the blastn algorithm of BLAST1 (45). This process was repeated for the mating type genes. Genomic regions that covered the entire length of reference sequences and produced greater the 98% sequence similarity were designated that particular gene. Genes were analyzed using the MUSCLE algorithm in MEGA version 7, and neighbor-joining trees were plotted using iTOL and Microreact (46)(47)(48).
Variant calling and phylogenetic analysis. Raw Illumina reads from U.S. and non-U.S. Aspergillus fumigatus genomes (BioProject PRJEB8623) were evaluated for quality using FastQC version 0.11.2 (49). All sequence data were mapped to AF293 (GCF_000002655.1) using BWA version 0.7.7 (50). Postalignment processing was performed using SAMtools version 1.7 and Picard tools version 2.5 (51). Single nucleotide polymorphisms were identified using the Genome Analysis Toolkit Haplotype Caller (GATK)  (53). Allele sharing distance is defined as 1 2 identity by state (IBS), where IBS is defined as the probability of drawing the same allele from two different individuals. The neighbor-joining algorithm, as implemented in MEGA-CC, was used to construct an unrooted phylogenetic tree and visualized with the Interactive Tree Of Life (iTOL) and Microreact (46)(47)(48).
Population analysis. Pairwise average nucleotide identities were estimated using a publicly available script (54). Principal-component analysis for the first two components was performed with resulting pairwise distance matrix using the prcomp function in the R programming language. To evaluate the structure of A. fumigatus isolates, an unsupervised analysis was executed for a K of 2 to a K of 10 populations in ADMIXTURE (36). All graphs shown in this paper were plotted using the ggplot2 library in R.
Data availability. All raw Illumina reads in this study have been submitted to NCBI under BioProject accession number PRJNA632561.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.

ACKNOWLEDGMENT
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.