Genomic and Phenotypic Analysis of COVID-19-Associated Pulmonary Aspergillosis Isolates of Aspergillus fumigatus

ABSTRACT The ongoing global pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for coronavirus disease 2019 (COVID-19), first described in Wuhan, China. A subset of COVID-19 patients has been reported to have acquired secondary infections by microbial pathogens, such as opportunistic fungal pathogens from the genus Aspergillus. To gain insight into COVID-19-associated pulmonary aspergillosis (CAPA), we analyzed the genomes and characterized the phenotypic profiles of four CAPA isolates of Aspergillus fumigatus obtained from patients treated in the area of North Rhine-Westphalia, Germany. By examining the mutational spectrum of single nucleotide polymorphisms, insertion-deletion polymorphisms, and copy number variants among 206 genes known to modulate A. fumigatus virulence, we found that CAPA isolate genomes do not exhibit significant differences from the genome of the Af293 reference strain. By examining a number of factors, including virulence in an invertebrate moth model, growth in the presence of osmotic, cell wall, and oxidative stressors, secondary metabolite biosynthesis, and the MIC of antifungal drugs, we found that CAPA isolates were generally, but not always, similar to A. fumigatus reference strains Af293 and CEA17. Notably, CAPA isolate D had more putative loss-of-function mutations in genes known to increase virulence when deleted. Moreover, CAPA isolate D was significantly more virulent than the other three CAPA isolates and the A. fumigatus reference strains Af293 and CEA17, but similarly virulent to two other clinical strains of A. fumigatus. These findings expand our understanding of the genomic and phenotypic characteristics of isolates that cause CAPA. IMPORTANCE The global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the etiological agent of coronavirus disease 2019 (COVID-19), has already killed millions of people. COVID-19 patient outcome can be further complicated by secondary infections, such as COVID-19-associated pulmonary aspergillosis (CAPA). CAPA is caused by Aspergillus fungal pathogens, but there is little information about the genomic and phenotypic characteristics of CAPA isolates. We conducted genome sequencing and extensive phenotyping of four CAPA isolates of Aspergillus fumigatus from Germany. We found that CAPA isolates were often, but not always, similar to other reference strains of A. fumigatus across 206 genetic determinants of infection-relevant phenotypes, including virulence. For example, CAPA isolate D was more virulent than other CAPA isolates and reference strains in an invertebrate model of fungal disease, but similarly virulent to two other clinical strains. These results expand our understanding of COVID-19-associated pulmonary aspergillosis.

O n 11 March 2020, the World Health Organization declared the ongoing pandemic caused by SARS-CoV-2, which causes COVID-19, a global emergency (1). Similar to other viral infections, patients may be more susceptible to microbial secondary infections, which can complicate disease management strategies and result in adverse patient outcomes (2,3). For example, approximately one quarter of patients infected with the H1N1 influenza virus during the 2009 pandemic were also infected with bacteria or fungi (4,5). Among COVID-19 patients, one study found that ;17% of individuals also have bacterial infections (6) and another that ;40% of patients with severe COVID-19 pneumonia were also infected with filamentous fungi from the genus Aspergillus (7). A third study reported that ;26% of patients with acute respiratory distress syndrome-associated COVID-19 were also infected with Aspergillus fumigatus and had high rates of mortality (8). Other studies from around the world have also reported high incidences of Aspergillus infections among patients with COVID-19 (9)(10)(11)(12). Taken together, these findings have prompted some to suggest routine clinical testing for secondary infections of Aspergillus fungi among COVID-19 patients (13,14). Despite the prevalence of microbial infections and their association with adverse patient outcomes, these secondary infections are only beginning to be understood.
To address this question and gain insight into the pathobiology of A. fumigatus CAPA isolates, we examined the genomic and phenotypic characteristics of four CAPA isolates obtained from four critically ill patients of two different centers in Cologne, Germany (8) ( Table 1). All patients were submitted to intensive care units due to moderate to severe respiratory distress syndrome (ARDS). Genome-scale phylogenetic (or phylogenomic) analyses revealed CAPA isolates formed a monophyletic group closely related to reference strains Af293 and A1163. Examination of the mutational spectra of 206 genes known to modulate virulence in A. fumigatus (which are here referred to as genetic determinants of virulence) revealed several putative loss-of-function (LOF) mutations. Notably, CAPA isolate D had the most putative LOF mutations among genes whose null mutants are known to increase virulence. The profiles of pathogenicity-related traits and of secondary metabolites of the CAPA isolates were similar to those of reference A. fumigatus strains Af293 and CEA17 or CEA10, which are parental strains of A1163 (41). One notable exception was that CAPA isolate D was significantly more virulent than other strains in an invertebrate model of disease, but on par with two other clinical strains of A. fumigatus. These results suggest that the genomes of A. fumigatus CAPA isolates contain nearly complete and intact repertoires of genetic determinants of virulence and have phenotypic profiles that are broadly expected for A. fumigatus clinical isolates. However, we did find evidence for genetic and phenotypic strain heterogeneity. These results suggest the CAPA isolates show similar phenotypic profiles as A. fumigatus clinical strains Af293 and A1163 and expand our understanding of CAPA.

RESULTS AND DISCUSSION
CAPA isolates belong to A. fumigatus and are closely related to reference strains Af293 and A1163. To confirm that the CAPA isolates belong to A. fumigatus, we sequenced, assembled, and annotated their genomes (Table S1 in the supplemental material; see figshare, 10.6084/m9.figshare.13118549). Phylogenetic analyses conducted using tef1 (Fig. S1) and calmodulin (Fig. S2) sequences suggested that all CAPA isolates are A. fumigatus. Phylogenomic analysis using 50 Aspergillus genomes (the four CAPA isolates, 43 A. fumigatus genomes that span the known diversity of the species,  including strains Af293 and A1163 (18,29,(42)(43)(44)(45)(46)(47), Aspergillus fischeri strains NRRL 181 and NRRL 4585, and Aspergillus oerlinghausenensis strain CBS 139183 T ) (40,45) confirmed that all CAPA isolates are A. fumigatus (Fig. 2). Phylogenomic analyses also revealed the CAPA isolates formed a monophyletic group closely related to reference strains Af293 and A1163. CAPA isolates are inferred to be closely related, which may be due to the fact they are all from the same geographic area. CAPA isolate genomes contain polymorphisms in genetic determinants of virulence and biosynthetic gene clusters. An extensive literature and database search identified 206 genetic determinants of virulence (File S1) (23,(48)(49)(50)(51). We define genetic determinants of virulence as genes that alter virulence in an animal model of disease when deleted or that are required for biosynthesis of secondary metabolites known to affect virulence. This definition resulted in a list of genes distinct from those previously published, which include genes that contribute to allergy-related phenotypes and genes that are computationally predicted to contribute to virulence but have yet to be validated in an animal model of fungal disease (52)(53)(54)(55).
To determine if the 206 genetic determinants of virulence are conserved in CAPA isolates, we conducted sequence similarity searches of gene sequences in the genomes of the CAPA isolates. We found that all 206 genes were present in the genomes of the CAPA isolates. Furthermore, none of the 206 genetic determinants of virulence showed any copy number variation among CAPA isolates. Examination of single nucleotide polymorphisms (SNPs) and insertion/deletion (indel) polymorphisms coupled with variant effect prediction in these 206 genes ( Fig. 3; File S2) showed that all CAPA isolates shared multiple polymorphisms resulting in early stop codons or frameshift mutations suggestive of loss of function (LOF) in NRPS8 (AFUA_5G12730), a nonribosomal peptide synthetase gene that encodes an unknown secondary metabolite (42). LOF mutations in NRPS8 are known to result in increased virulence in a mouse model of disease (56). Putative LOF mutations were also observed in genes whose null mutants decreased virulence. For example, all CAPA isolates shared the same SNPs resulting in early stop codons that likely result in LOF or partial LOF in pptA (AFUA_2G08590), a putative 49-phosphopantetheinyl transferase, whose deletion results in reduced virulence in a mouse model of disease (57). In light of the close evolutionary relationships among CAPA isolates, we hypothesize that these shared mutations likely occurred in the genome of their most recent common ancestor.
In addition to shared polymorphisms, analyses of CAPA isolate genomes also revealed isolate-specific polymorphisms affecting genetic determinants of virulence (File S2). For example, SNPs resulting in early stop codons, which likely lead to LOF, were observed in CYP5081A1 (AFUA_4G14780), a putative cytochrome P450 monooxygenase, in CAPA isolates B and C. CYP5081A1 LOF is associated with reduced virulence of A. fumigatus (58). Other SNPs were found only in single isolates. CAPA isolate B has a frameshift mutation in a putative fatty acid oxygenase (AFUA_4G00180). CAPA isolate D has a mutation resulting in the loss of the start codon in fleA (AFUA_5G14740), a gene that encodes an L-fucose-specific lectin. Mice infected with FLEA null mutants have more severe pneumonia and invasive aspergillosis than wild-type strains. FLEA null mutants cause more severe disease because FleA binds to macrophages and therefore is critical for host recognition, clearance, and macrophage killing (59). The only evidence of pseudogenization among the genetic determinants of virulence in the reference strains was observed in mybA (AFUA_3G07070), a transcription factor involved in conidiation and conidial viability (60), in strain A1163. MybA null mutants have reduced virulence compared to wild-type strains (60).
Examination of three additional clinical strains of A. fumigatus (IFM61407, CN-CM7555, and Afs35) revealed that CAPA isolates shared some, but not all, polymorphisms present in the 206 genetic determinants of virulence. For example, similar putative LOF mutations were observed in NRPS8 (AFUA_5G12730). Polymorphisms that were not shared between CAPA isolates and the three clinical strains include the loss of a stop codon in aspA, a septin (61), in strain Afs35; an early stop codon in noc3, a nuclear export protein (62), in strain CN-CM7555; and a lost stop codon in cat2, a  Examination of the presence of biosynthetic gene clusters (BGCs) revealed that all CAPA isolates had BGCs that encode secondary metabolites known to modulate host biology ( Table 2). For example, all CAPA isolates had BGCs encoding the toxic secondary metabolite gliotoxin (Fig. 4). Other intact BGCs in the genomes of the CAPA isolates include fumitremorgin, trypacidin, pseurotin, and fumagillin, which are known to modulate host biology (64-66); for example, fumagillin is known to inhibit neutrophil function (67,68). More broadly, all CAPA isolates had similar numbers and classes of BGCs (Fig. S3).
In summary, we found that CAPA isolates were closely related to one another and had largely intact genetic determinants of virulence and BGCs. However, we observed FIG 3 Mutational spectra among genetic determinants of virulence. Genome-wide SNPs, indels, and CN variants were filtered for those present in genetic determinants of virulence. Then, the number of genetic determinants of virulence with high-impact polymorphisms was identified. The number known to increase or decrease virulence in null mutants was determined thereafter.  Fig. S6 in the supplemental material for relative abundance of secondary metabolite production.) strain-specific polymorphisms in known genetic determinants of virulence in CAPA isolate genomes, which raises the hypothesis that CAPA isolates differ in their virulence profiles.
CAPA isolates display strain heterogeneity in virulence and in a few virulencerelated traits. Examination of virulence and virulence-related traits revealed the CAPA isolates often, but not always, had similar phenotypic profiles compared to reference A. fumigatus strains Af293 and a CEA17 DakuB KU80 pyrG 1 derivative of CEA17 akuB KU801 , pyrG 2 (which is here referred to as CEA17 for simplicity [41]). For example, virulence in the Galleria moth model of fungal disease revealed strain heterogeneity among CAPA isolates Af293, CEA17, and a panel of three clinical strains of A. fumigatus, namely, Afs35, IFM61407, and CN-CM7555 (41, 69, 70) (P , 0.001; log-rank test; Fig. 5A). Pairwise examination revealed the observed strain heterogeneity was primarily driven by CAPA isolate D, which was significantly more virulent than all other CAPA isolates, reference strains Af293 and CEA17, and clinical strain Afs35 (Benjamini-Hochberg adjusted P value , 0.05 when comparing CAPA isolate D to another isolate; log-rank test; File S3). However, the virulence of the CAPA isolate D was on par to those of clinical strains IFM61407 and CN-CM 7555 (P = 0.085 and P = 0.386, respectively) (Fig. 5A). These results reveal the CAPA isolates have generally similar virulence profiles compared to the reference strains Af293 and CEA17 with the exception of the more virulent CAPA isolate D. Furthermore, the virulence profiles of all CAPA isolates are within the known range of A. fumigatus clinical strains. Determining the association between virulence and the genetic polymorphisms described in the section above, in addition to other polymorphisms identified in this study (Fig. 3), is an important future direction.
Examination of growth in the presence of osmotic, cell wall, and oxidative stressors revealed the phenotypic profiles of CAPA isolates were similar to the profiles of Af293 and CEA17 strains (Fig. 5B to D and Fig. S4). The sole exception was growth in the presence of calcofluor white, where we observed that the CAPA isolates were more sensitive than reference strains Af293 and CEA17 (P , 0.001; Tukey's honestly significant difference test; Fig. 5C). Last, antifungal drug susceptibility profiles for amphotericin B, voriconazole, itraconazole, and posaconazole were similar between the CAPA isolates Gliotoxin is known to contribute to virulence of A. fumigatus. The genomes of CAPA isolates of A. fumigatus contain biosynthetic gene clusters known to encode gliotoxin. Note, the BGC of CAPA isolate A was split between two contigs and, therefore, the BGC is hypothesized to be present. and reference strains Af293 and CEA17 (Table 3). Following the guidelines of the Clinical and Laboratory Standards Institute (71), the CAPA isolates are not considered multidrug resistant.
Secondary metabolites can impact host biology and virulence (30,72). Examination of secondary metabolite production revealed strain heterogeneity among CAPA isolates and reference strains Af293 and CEA10, a pyrG 1 and akuB KU801 strain that CEA17 is derived from (41). For example, principal-component analysis of chromatogram features revealed that CAPA isolate A was substantially different from other CAPA isolates along the first and second principal components, which capture 82.47% of the total variance, whereas the CAPA isolate D was substantially different from other CAPA isolates along the second and third principal components, which capture 38.64% of total variance (Fig. S5). Examination of the loadings plot, which identifies the individual secondary metabolites that contribute to the observed variation across strains, revealed gliotoxin and fumitremorgin as the largest contributors (Fig. S6). Measurement of relative abundance of biosynthesized gliotoxin and fumitremorgin, two secondary metabolites known to modulate host biology (30), showed the largest amount of fumitremorgin was biosynthesized by CAPA isolate A, and the largest amount of gliotoxin was biosynthesized by the Af293 strain, followed by CAPA isolate C ( Fig. S6; Table 2).
In summary, we found the CAPA isolates have similar phenotypic profiles, with the exception of growth in the presence of calcofluor white and secondary metabolite $20 replicates per strain). Pairwise examinations revealed CAPA isolate D was significantly more virulent than all other strains (Benjamini-Hochberg adjusted P , 0.01 when comparing CAPA isolate D to another isolate; logrank test) with the exception of clinical strains IFM61407 and CN-CM 7555 (P = 0.085 and P = 0.386, respectively). Growth of CAPA isolates and references strains Af293 and CEA17 in the presence of osmotic (B), cell wall (C), and oxidative stressors (D). Growth differences between CAPA isolates and reference strains Af293 and CEA17 were observed across all growth conditions (P , 0.001; multifactor ANOVA). Pairwise differences were assessed using the post hoc Tukey's honestly significant difference (HSD) test and were only observed for growth in the presence of CFW at 25 mg/ml (P , 0.001; Tukey HSD test) in which the CAPA isolates did not grow as well as the reference isolates. To correct for strain heterogeneity in growth rates, radial growth in centimeters in the presence of stressors was divided by radial growth in centimeters in the absence of the stressor (MM only). Abbreviations of cell wall stressors are as follows: CFW, calcofluor white; CR, Congo red; CSP, caspofungin. Growth in the presence of other stressors is summarized in Fig. S4. Error bars in panels B to D represent the average of 6 one standard deviation across three replicates.
biosynthesis, compared to reference strains, and virulence on par with the known range of A. fumigatus clinical strains.
In conclusion, the effects of secondary fungal infections in COVID-19 patients are only beginning to be understood. Our results revealed that CAPA isolates are generally, but not always, similar to A. fumigatus clinical reference strains. Notably, CAPA isolate D was significantly more virulent than the other three CAPA isolates and two reference strains examined, but on par with other clinical strains. Taken together, these results are important to consider in the management of fungal infections among patients with COVID-19, especially those infected with A. fumigatus, and broaden our understanding of CAPA.

MATERIALS AND METHODS
Patient information and ethics approval. Patients were included into the FungiScope global registry for emerging invasive fungal infections (www.ClinicalTrials.gov, NCT 01731353). The clinical trial is approved by the Ethics Committee of the University of Cologne, Cologne, Germany (study ID: 05-102) (73). Since 2019, patients with invasive aspergillosis were also included.
DNA quality control, library preparation, and sequencing. Sample DNA concentration was measured by Qubit fluorometer and DNA integrity and purity by agarose gel electrophoresis. For each sample, 1 to 1.5 mg of genomic DNA was randomly fragmented by Covaris and fragments with average sizes of 200 to 400 bp were selected by Agencourt AMPure XP-Medium kit. The selected fragments were endrepaired, 39 adenylated, ligated to adapters, and amplified by PCR. Double-stranded PCR products were recovered by the AxyPrep Mag PCR cleanup kit, and then heat denatured and circularized by using the splint oligonucleotide sequence. The single-strand circlular DNA (ssCir DNA) products were formatted as the final library and went through further QC procedures. The libraries were sequenced on the MGISEQ2000 platform.
Genome assembly and annotations. Short-read sequencing data of each sample were assembled using MaSuRCA, v3.4.1 (74). Each de novo genome assembly was annotated using the MAKER genome annotation pipeline, v2.31.11 (75), which integrates three ab initio gene predictors: AUGUSTUS, v3.3.3 (76), GeneMark-ES, v4.59 (77), and SNAP, v2013-11-29 (78). Fungal protein sequences in the Swiss-Prot database (release 2020_02) were used as homology evidence for the genome annotation. The MAKER annotation process occurs in an iterative manner as described previously (79). In brief, for each genome, repeats were first soft-masked using RepeatMasker v4.1.0 (http://www.repeatmasker.org) with the library Repbase library release-20181026 and the "-species" parameter set to "Aspergillus fumigatus." GeneMark-ES was then trained on the masked genome sequence using the self-training option ("-ES") and the branch model algorithm ("-fungus"), which is optimal for fungal genome annotation. On the other hand, an initial MAKER analysis was carried out where gene annotations were generated directly from homology evidence, and the resulting gene models were used to train both AUGUSTUS and SNAP. Once trained, the ab initio predictors were used together with homology evidence to conduct a first round of full MAKER analysis. Resulting gene models supported by homology evidence were used to retrain AUGUSTUS and SNAP. A second round of MAKER analysis was conducted using the newly trained AUGUSTUS and SNAP parameters, and once again the resulting gene models with homology supports were used to retrain AUGUSTUS and SNAP. Finally, a third round of MAKER analysis was performed using the new AUGUSTUS and SNAP parameters to generate the final set of annotations for the genome. The completeness of de novo genome assemblies and ab initio gene predictions was assessed using BUSCO, v4.1.2 (80) using 4,191 preselected "nearly" universally single-copy orthologous genes from the Eurotiales database (eurotials_odb10.2019- [11][12][13][14][15][16][17][18][19][20] in OrthoDB, v10.1 (81).
Polymorphism identification. To characterize and examine the putative impact of polymorphisms in the genomes of the CAPA isolates, we identified single nucleotide polymorphisms (SNPs), insertiondeletion polymorphisms (indels), and copy number (CN) polymorphisms. To do so, reads were first quality-trimmed and mapped to the genome of A. fumigatus Af293 (RefSeq assembly accession: GCF_000002655.1) following a previously established protocol (82). Specifically, reads were first qualitytrimmed with Trimmomatic v0.36 (83) using the following parameters: leading, 10; trailing, 10; slidingwindow, 4:20; minlen, 50. The resulting quality-trimmed reads were mapped to the A. fumigatus Af293 genome using the Burrows-Wheeler Aligner (BWA) v0.7.17 (84) with the mem parameter. Thereafter, To identify SNPs and indels, mpileup files were used as input into VarScan v2. 3.9 (86), with the mpi-leup2snp and mpileup2indel functions, respectively. To ensure only confident SNPs and indels were identified, a Fisher's exact test P value threshold of 0.05 and minimum variant allele frequency of 0.75 were used. The resulting Variant Call Format files were used as input to snpEff v.4.3t (87), which predicted their functional impacts on gene function as high, moderate, or low. To identify CN variants, the sorted bam files were used as input into Control-FREEC v9.1 (88,89). The coefficientOfVariation parameter was set to 0.062 and window size was automatically determined by Control-FREEC. To ensure high confidence in CN variant identification, a P value threshold of 0.05 was used for both the Wilcoxon rank sum and Kolmogorov Smirnov tests.
To identify evidence of putative pseudogenization between reference strains A1163 and Af293, we used a previously established approach (19,90). More specifically, we compared lengths of gene pairs as a proxy for pseudogenization. A gene was considered a putative pseudogene in one of the strains if the gene was 70% the length of its reciprocal best blast hit in the other strain.
Maximum likelihood molecular phylogenetics. To taxonomically identify the species of Aspergillus sequenced, we conducted molecular phylogenetic analysis of two different loci and two different data sets. In the first analysis, the nucleotide sequence of the alpha subunit of translation elongation factor EF-1, tef1 (NCBI accession: XM_745295.2), from the genome of A. fumigatus Af293 was used to extract other fungal tef1 sequences from NCBI's fungal nucleotide reference sequence database (downloaded July 2020) using the blastn function from NCBI's BLAST1, v2.3.0 (91). Tef1 sequences were extracted from the CAPA isolates by identifying their best BLAST hit. Sequences from the top 100 best BLAST hits in the fungal nucleotide reference sequence database and the four tef1 sequences from the CAPA isolates were aligned using MAFFT v7.402 (92) using previously described parameters (93) with slight modifications. Specifically, the following parameters were used: -op 1.0 -maxiterate 1000 -retree 1 -genafpair. The resulting alignment was trimmed using ClipKIT v0.1 (19) with default "gappy" mode. The trimmed alignment was then used to infer the evolutionary history of tef1 sequences using IQ-TREE2 (94). The best-fitting substitution model-TIM3 with empirical base frequencies, allowing for a proportion of invariable sites, and a discrete Gamma model (95,96) with four rate categories (TIM31F1I1G4)was determined using the Bayesian information criterion. In the second analysis, the same process was used to conduct molecular phylogenetic analysis using calmodulin nucleotide sequences from Aspergillus section Fumigati species and Aspergillus clavatus, an outgroup taxon, using sequences from NCBI that were made available elsewhere (97). For calmodulin sequences, the best-fitting substitution model was TNe (98) with a discrete Gamma model with four rate categories (TNe1G4). Bipartition support was assessed using 5,000 ultrafast bootstrap support approximations (99).
To determine which strains of A. fumigatus the CAPA isolates were most similar to, we conducted phylogenomic analyses using the 50 Aspergillus proteomes. To do so, we first identified orthologous groups of genes across all 50 Aspergillus strains using OrthoFinder 2.3.8 (100). OrthoFinder takes as input the proteome sequence files from multiple genomes and conducts all-versus-all sequence similarity searches using DIAMOND v0.9.24.125 (101). Our input included 50 total proteomes, where 47 were A. fumigatus, two were A. fischeri, and one was A. oerlinghausenensis (40,42,45). OrthoFinder then clusters sequences into orthologous groups of genes using the graph-based Markov Clustering Algorithm (102). To maximize the number of single-copy orthologous groups of genes found across all input genomes, clustering granularity was explored by running 41 iterations of OrthoFinder that differed in their inflation parameter. Specifically, iterations of OrthoFinder inflation parameters were set to 1.0 to 5.0 with a step of 0.1. The lowest number of single-copy orthologous groups of genes was 3,399 when using an inflation parameter of 1.0; the highest number was 4,525 when using inflation parameter values of 3.8 and 4.1. We used the groups inferred using an inflation parameter of 3.8.
Next, we built the phylogenomic data matrix and reconstructed evolutionary relationships among the 50 Aspergillus genomes. To do so, the protein sequences from 4,525 single-copy orthologous groups of genes were aligned using MAFFT v7.402 (92), with the following parameters: -bl 62 -op 1.0 -maxiterate 1000 -retree 1 -genafpair. Next, nucleotide sequences were threaded onto the protein alignments using function thread_dna in PhyKIT v0.0.1 (103). The resulting codon-based alignments were then trimmed using ClipKIT v0.1 (104), using the gappy mode. The resulting aligned and trimmed alignments were then concatenated into a single matrix with 7,133,367 sites using the PhyKIT function create_concat. To reconstruct the evolutionary history of the 50 Aspergillus genomes, a single best-fitting model of sequence substitution and rate heterogeneity was estimated across the entire matrix using IQ-TREE2 v.2.0.6 (94). The best-fitting model was determined to be a general time reversible model with empirical base frequencies and invariable sites with a discrete Gamma model with four rate categories (GTR1F1I1G4) (96,(105)(106)(107) using the Bayesian information criterion. During tree search, the number of candidate trees maintained during maximum likelihood tree search was increased from five to ten. Five independent searches were conducted and the tree with the best log-likelihood score was chosen as the "best" phylogeny. Bipartition support was evaluated using 5,000 ultrafast bootstrap approximations (99).
Biosynthetic gene cluster prediction. To predict BGCs in the genomes of A. fumigatus strains Af293 and the CAPA isolates, gene boundaries inferred by MAKER were used as input into antiSMASH v4.1.0 (108). Using a previously published list of genes known to encode BGCs in the genome of A. fumigatus Af293 (42), BLAST-based searches using an expectation value threshold of 1 Â 10 210 were used to identify BGCs implicated in modulating host biology using NCBI's BLAST1 v2.3.0 (91). Among predicted BGCs that did not match the previously published list, we further examined their evolutionary history if at least 50% of genes showed similarity to species outside the genus Aspergillus, which is information provided in the antiSMASH output. Using these criteria, no evidence suggestive of horizontally acquired BGCs from distant relatives was detected.
Characterization of biosynthesized secondary metabolites. (i) General experimental procedures. HRESIMS experiments utilized a Thermo LTQ Orbitrap XL mass spectrometer equipped with an electrospray ionization source. A Waters Acquity UPLC (Waters Corp.) was utilized using a BEH C 18 column (1.7 mm; 50 mm Â 2.1 mm) set to a temperature of 40°C and a flow rate of 0.3 ml/min. The mobile phase consisted of a linear gradient of CH 3 CN-H 2 O (both acidified with 0.1% formic acid), starting at 15% CH 3 CN and increasing linearly to 100% CH 3 CN over 8 min, with a 1.5 min hold before returning to the starting condition.
(ii) Growth and extraction of fungal cultures. To identify the chemical differences between the various A. fumigatus strains and isolates (Af293, CEA10, CAPA isolates A, B, C, and D), they were grown in a clinically relevant growth condition (37°C) and extracted for chemometric analysis. Czapek Dox agar (Sigma-Aldrich) petri plates were inoculated with the asexual spores of each strain in biological triplicates. Subsequently, the plates were incubated at 37°C in the dark for 3 days. The cultures were extracted by chopping and transferring the agar to 20-ml scintillation vials, adding 10 ml of acetone, thoroughly shaking, and then letting the samples sit for 4 h. Last, the cultures were filtered and evaporated to dryness under nitrogen gas.
(iii) Metabolomics analyses. Principal-component analysis (PCA) was performed on the ultraperformance liquid chromatography-mass spectrometry (UPLC-MS) data. Untargeted UPLC-MS data sets for each sample were individually aligned, filtered, and analyzed using MZmine 2.20 software (https:// sourceforge.net/projects/mzmine/) (109). Peak detection was achieved using the following parameters: noise level (absolute value), 1.5 Â 10 5 ; minimum peak duration, 0.05 min; m/z variation tolerance, 0.05; and m/z intensity variation, 20%. Peak list filtering and retention time alignment algorithms were used to refine peak detection. The join algorithm integrated all sample profiles into a data matrix using the following parameters: m/z and retention time balance set at 10.0 each, m/z tolerance set at 0.001, and RT tolerance set at 0.5 min. The resulting data matrix was exported to Excel (Microsoft) for analysis as a set of m/z-retention time pairs with individual peak areas detected in quadruplicate analyses. Samples that did not possess detectable quantities of a given marker ion were assigned a peak area of zero to maintain the same number of variables for all sample sets. Ions that did not elute between 2 and 8 min and/or had an m/z ratio less than 100 or greater than 1,200 Da were removed from analysis. Relative standard deviation was used to quantify variance between the technical replicate injections, which may differ slightly based on instrument variance. A cutoff of 1.0 was used at any given m/z-retention time pair across the technical replicate injections of one biological replicate and, if the variance was greater than the cutoff, it was assigned a peak area of zero (110). Final chemometric analysis, including data filtering and PCA, was conducted using Python. The PCA plots were generated using data from the averaged biological replicates from the petri dish cultures. Each biological replicate was plotted using averaged peak areas obtained across four replicate injections (technical replicates). The principal components (PC) were generated and processed via Scikit Learn decomposition and Pandas v0.25.3, Python libraries. The PCA data were plotted using Altair v4.1.0, Python graphing libraries. These data were converted into a dataframe via Pandas, and the PCs were created from the dataframe using Scikit Learn decomposition. The PCA scores and loadings plots were then plotted using the PCs dataframe that was generated from Scikit Learn.
Infection of Galleria mellonella. Survival curves (n $ 20/strain) were generated for Galleria mellonella infected with CAPA isolates A, B, C, and D. Phosphate-buffered saline (PBS) without asexual spores (conidia) was administered as a negative control. A log-rank test was used to examine strain heterogeneity, followed by pairwise comparisons with the Benjamini-Hochberg multitest correction (111). All the selected larvae of Galleria mellonella were in the final (sixth) instar larval stage of development, weighing 275 to 330 mg. Fresh conidia from each strain were harvested from minimal medium (MM) plates in PBS solution and filtered through a Miracloth (Calbiochem). For each strain, the spores were counted using a hemocytometer and the stock suspension was done at 2 Â 10 8 conidia/ml. The viability of the administered inoculum was determined by plating a serial dilution of the conidia on MM medium at 37°C. A total of 5 ml (1 Â 10 6 conidia/larva) from each stock suspension was inoculated per larva. The control group was composed of larvae inoculated with 5 ml of PBS to observe the killing due to physical trauma. The inoculum was performed using Hamilton syringe (7000.5KH) via the last left proleg. After infection, the larvae were maintained in petri dishes at 37°C in the dark and were scored daily. Larvae were considered dead by presenting the absence of movement in response to touch.
Growth assays. To examine growth conditions of the CAPA isolates and reference strains Af293 and CEA17, plates were inoculated with 10 4 spores per strain and allowed to grow for 5 days on solid MM or MM supplemented with various concentrations of osmotic (sorbitol, NaCl), cell wall (Congo red, calcofluor white, and caspofungin), and oxidative stress agents (menadione and t-butyl) at 37°C. MM had 1% (wt/vol) glucose, original high-nitrate salts, trace elements, and a pH of 6.5; trace elements, vitamins, and nitrate salt compositions follow standards described elsewhere (112). To correct for strain heterogeneity in growth rates, radial growth in centimeters in the presence of stressors was divided by radial growth in centimeters in the absence of the stressor.
To determine the MICs of antifungal drugs in the CAPA isolates and reference strains Af293 and CEA17, strains were grown in 96-well plates at a concentration of 10 4 spores/well in 200 ml of RPMI 1640 supplemented with increasing concentrations of amphotericin B, voriconazole, itraconazole, and posaconazole, according to the protocol elaborated by the Clinical and Laboratory Standards Institute (71).
The MIC was defined as the lowest concentration of drugs that visually inhibited 100% fungal growth. Three independent experiments were carried out for each antifungal drug.
Data availability. Newly sequenced genomes assemblies, annotations, and raw short reads have been deposited to NCBI's GenBank database under BioProject accession PRJNA673120.
Supplementary data, including tables, figures, and files, additional copies of genome assemblies, annotations, and gene coordinates, raw data, including the genome assembly and annotations of all analyzed Aspergillus genomes, the aligned and trimmed phylogenetic and phylogenomic data matrices, polymorphisms identified in the present project, and predicted BGCs have been uploaded to figshare (10.6084/m9.figshare.13118549).