Tracing the Evolutionary History and Global Expansion of Candida auris Using Population Genomic Analyses

In less than a decade, C. auris has emerged in health care settings worldwide; this species is capable of colonizing skin and causing outbreaks of invasive candidiasis. In contrast to other Candida species, C. auris is unique in its ability to spread via nosocomial transmission and its high rates of drug resistance. As part of the public health response, whole-genome sequencing has played a major role in characterizing transmission dynamics and detecting new C. auris introductions. Through a global collaboration, we assessed genome evolution of isolates of C. auris from 19 countries. Here, we estimated the timing of the expansion of each C. auris clade and of fluconazole resistance, characterized the discrete phylogeographic population structure of each clade, and compared genome data to sensitivity measurements to describe how antifungal resistance mechanisms vary across the population. These efforts are critical for a sustained, robust public health response that effectively utilizes molecular epidemiology.

Initial studies suggested that C. auris emerged simultaneously and independently in four global regions, as phylogenetic analyses revealed four major clades of C. auris wherein isolates clustered geographically (13). These clades are referred to as the South Asian, East Asian, African, and South American clades or clades I, II, III, and IV, respectively (13,14). The isolates from these clades are genetically distinct, differing by tens to hundreds of thousands of single nucleotide polymorphisms (SNPs), with nucleotide diversity nearly 17-fold higher between clades compared to within clades (14). All the clades, except clade II, have been linked to outbreaks of invasive infections; uniquely, clade II appears to have a propensity for ear infections (15). The need for increased global efforts to understand the population structure of C. auris was recently highlighted by the discovery of the first Iranian C. auris case that yielded a single isolate representing a fifth major clade (16).
Molecular epidemiological investigations of C. auris outbreaks generally show clusters of highly related isolates, supporting local and ongoing transmission (7,17,18). The analysis of outbreaks and individual cases has also revealed genetic complexity, with isolates from different clades detected in Germany (19), the United Kingdom (20), and the United States (21), suggesting multiple introductions into these countries, followed by local transmission. To date, each of the clades appears to have undergone clonal expansion; while C. auris genomes have conserved mating and meiosis genes, only one of the two fungal mating types is present in a given clade. Specifically, MTLa is present in clades I and IV, and the other mating type, MTLα, is found in clades II and III (14). Understanding whether mating and recombination between clades are occurring is critical, especially in those countries where isolates from different clades and opposing mating types overlap in time and space. This information could help contextualize complex epidemiologic findings or transmission dynamics.
In addition to its transmissibility, C. auris is concerning because of its high rates of drug resistance. Three major classes of antifungal drugs are currently approved for systemic use: azoles, polyenes, and echinocandins. More than 90% of C. auris isolates have been reported to be resistant to fluconazole (azole), although resistance levels vary markedly between the clades (13,22). Elevated minimum inhibitory concentrations (MICs) to amphotericin B (polyene) have been reported in several studies, and resistance to echinocandins is emerging in some countries (22). Numerous mechanisms of antifungal resistance have been described for C. auris. Echinocandin resistance has been linked to a single mutation, S639P/F, in FKS1, the gene that encodes the echinocandin target 1,3-beta-D-glucan synthase (23). Most isolates display a mutation linked to fluconazole resistance in C. albicans; three mutations, Y132F, K143R, and F126L, have been identified in ERG11, the gene that encodes the azole target lanosterol 14-α-demethylase. These mutations have been shown to associate by clade: Y132F and K143R are predominantly found in clades I and IV, and F126L is found exclusively in clade III (13,24). Additionally, there have been suggestions that increased copy number of ERG11 may be a mechanism for fluconazole resistance in C. auris (14).
To better understand C. auris emergence and population structure, we engaged in a global collaboration involving 19 countries to produce a large data set of C. auris whole-genome sequences from hundreds of cases and associated environmental samples from healthcare facilities. Our goal was to generate a comprehensive genomic description of a global C. auris population to provide a population genetic framework for the molecular epidemiologic investigations.

RESULTS
Geographic distribution of Candida auris major clades. We first performed a phylogenetic analysis to characterize the global distribution of C. auris clades. By including isolates from previous studies (13,18), we observed that all 304 isolates in this sample collection clustered in one of the four major C. auris clades (Fig. 1a). In this collection, 126 (41%) were classified as clade I, 7 (2%) as clade II, 51 (17%) as clade III, and 120 (39%) as clade IV. Globally, clade I was the most widespread and found in 10 countries (Fig. 1b). Multiple clades were found in Canada (clades I, II, and III), Kenya (clades I and III), and the United States (clades I, II, III, and IV). No clade V isolates were present in our population sample.
In contrast to an initial report (13), we observed a weaker phylogeographic substructure as isolates from countries of most global regions appeared interspersed in phylogenies, although there was notable clustering by country within clade IV (Fig. 1c). Within clade I, there were three predominant subclades, each including isolates from India and Pakistan. The smallest subclade included B8441 (reference genome) from Pakistan and three other isolates. The other two subclades were more closely related to each other and included groups of highly related isolates from outbreaks in Kenya, United Kingdom, and United States. Additionally, clade I isolates from countries in Europe (France and Germany) and the Middle East (Saudi Arabia and United Arab Emirates) appeared interspersed in the phylogeny, suggesting multiple introductions of C. auris into these countries. Consistent with this observation, patients from France and Germany had travel to or contact with regions that have reported cases of C. auris. One patient in France (CNRMA17-624 isolate) had travelled to India and Iran for health care, and the second patient (CNRMA15-337 isolate), diagnosed in Réunion Island, had reported links to India and Saudi Arabia (25,26). Patients in Germany had travelled to the Middle East, India, or Russia (19). Travel histories were not available for patients from the Middle Eastern countries. Clade II was rarely observed and consisted of seven diverse isolates from Canada, Japan, South Korea, or United States, and six of the isolates were from cases involving ear infections. Other examples of phylogeographic mixing included isolates from Australia and Spain that clustered with clade III, and isolates from Israel that clustered with clade IV, clades originally described as the South African and South American clades, respectively.
Evolutionary rate and molecular dating. To better understand the emergence of this species, we next estimated the divergence times of the four major clades. We utilized collection dates for clinical isolates and associated environmental samples, such as swabs from health care facilities, which ranged from 2004 to 2018; most (98%) were collected from 2012 to 2018 (Fig. 2a). We confirmed that the divergence level of these isolates is temporally correlated, supporting use of molecular clock analyses. We observed the highest correlation in clade III isolates from a single health care facility in Kenya experiencing an outbreak of ongoing transmission (27), and calculated a mutation rate of 1.8695e-5 substitutions per site per year (see Fig. S1 in the supplemental material). As we observed low rate variation between the clades, we used this rate for a Bayesian approach for molecular dating of a phylogeny for all four clades using a strict clock coalescent model (Materials and Methods). We estimated that the time to most recent common ancestor (TMRCA) for each clade occurred within the last 360 years (Fig. 2b and c). Clade IV emerged most recently with a TMRCA of 1982 (95% highest posterior density [HPD], 31.5 to 40.5 years ago), while clade II was the oldest with a TMRCA of 1658 (95% HPD, 317.2 to 400.4 years ago). Lastly, we observed divergence within the 19th century for the two most closely related clades, clades I and III, in 1869 (95% HPD, 131.5 to 167.6 years ago) and 1833 (95% HPD, 162.8 to 207.5 years ago), respectively (Fig. 2b and c). These dates are impacted by the inclusion of divergent isolates in both clades, which notably do not have ERG11 resistance mutations and are often drug susceptible: one isolate from Canada in clade III, and two isolates from Pakistan and one from the United States in clade I (Fig. 1c and 2b).
Excluding these drug-susceptible outliers for clades I and III, the TMRCA estimates are 1983 for clade I (95% HPD, 37.7 to 38.5 years ago) and 1984 for clade III (95% HPD, 31.2 to 37.5 years ago); these more recent estimates are closer to that estimated for clade IV. The estimated dates of TMRCA for each individual clade support the recent expansion during the ongoing outbreak. Together, these results suggest an older separation of the four clades and a recent diversification of each clade in the years before the detected outbreaks.
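The TMRCA values above are reported both as calendar years and as HPD intervals in years before present. The following minimal sketch shows how the two scales relate. The reference year of 2018 is an assumption (the latest collection year in the data set), and the function names are hypothetical.

```python
# Assumption: "years ago" is counted back from the most recent sampling year.
REFERENCE_YEAR = 2018.0


def years_ago_to_calendar(years_ago, reference_year=REFERENCE_YEAR):
    """Convert an age in years before the reference date to a calendar year."""
    return reference_year - years_ago


def hpd_to_calendar(hpd_low_ago, hpd_high_ago, reference_year=REFERENCE_YEAR):
    """Convert an HPD interval in 'years ago' to (earliest, latest) calendar years.

    The larger age corresponds to the earlier calendar year, so the bounds swap.
    """
    return (reference_year - hpd_high_ago, reference_year - hpd_low_ago)


# HPD intervals quoted in the text for clades IV and II.
print("clade IV:", hpd_to_calendar(31.5, 40.5))
print("clade II:", hpd_to_calendar(317.2, 400.4))
```

Under this assumption, the reported point estimates for clades IV (1982) and II (1658) fall inside the converted intervals.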
Candida auris population structure. We next examined the global genomic data set for evidence of population substructure and recent admixture. Principal component analysis (PCA) identified four well-separated populations corresponding to clades I, II, III, and IV, and the tight clustering of isolates within each clade suggests there is no recent admixture in any isolates (Fig. 3a). To assess genetic diversity at the clade level, we compared population genetic statistics, including nucleotide diversity (π), Tajima's D (TD), fixation index (F_ST), and pairwise nucleotide diversity (D_XY) (Fig. 3b to e). Overall, clades I and III showed the lowest genetic diversity (π = 1.51e-5 and π = 1.42e-5, respectively); clade IV exhibited nearly three times these levels (π = 4.23e-5), and clade II presented the highest genetic diversity (π = 1.29e-4), nearly nine times higher than clades I and III (Fig. 3b). Clade II was also the only clade that exhibited positive TD (TD = 1.153 [Fig. 3b and c]), suggesting demographic changes as expected from the long branches observed in clade II phylogeny and isolate geography (Fig. 1c). In clades I, III, and IV, we observed negative TD values consistent with recent population expansions and the shorter phylogenetic branches (Fig. 3b and c and Fig. 1c); however, clade IV exhibited a highly variable distribution of TD relative to clades I and III (Fig. 3c; Fig. S2), which suggests that these clades have experienced distinct evolutionary processes, such as different degrees of population bottlenecks.
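To make the two diversity statistics concrete, the sketch below computes nucleotide diversity (π) and Tajima's D for a small alignment using the standard textbook definitions. It is an illustration under stated assumptions, not the software used in this study: the sequences are invented toy data, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def nucleotide_diversity(seqs):
    """Per-site nucleotide diversity (pi)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def segregating_sites(seqs):
    """Number of alignment columns with more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def tajimas_d(seqs):
    """Tajima's D; returns None when it is undefined (n < 4 or no variation)."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the pairwise and segregating-site estimates of theta.
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Toy data (hypothetical): an excess of singletons, as after a recent expansion.
expanding = ["AAAAAAAAAA", "TAAAAAAAAA", "ATAAAAAAAA", "AATAAAAAAA", "AAATAAAAAA"]
# Toy data (hypothetical): two deeply split haplotypes at intermediate frequency.
structured = ["AAAAAAAA", "AAAAAAAA", "TTTTAAAA", "TTTTAAAA"]

print(nucleotide_diversity(expanding), tajimas_d(expanding))
print(nucleotide_diversity(structured), tajimas_d(structured))
```

In the first toy set the rare variants pull D below zero, the pattern reported for clades I, III, and IV; in the second, intermediate-frequency variants give a positive D, as reported for clade II.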
Genome-wide F_ST analysis highlighted substantial interspecific divergence and reproductive isolation between C. auris clades (average genome-wide F_ST > 0.94 in all interclade comparisons; Fig. 3d; Fig. S2). Comparison of the two most closely related clades (clades I and III) revealed 86 small regions (≤5 kb) with F_ST values close to zero; these regions of identity were distributed across the genome (3.5% of the genome; 256 genes; Fig. 3e; Fig. S2 and S3). Phylogenetic analysis revealed that these regions in isolates from clades I and III are intermixed in a monophyletic clade (Fig. S3). Comparison of D_XY values across the genome highlighted regions of population divergence between C. auris clades. Even between these clades with substantial interspecific divergence, we detected large genomic tracts that exhibit either high or low D_XY values.
[Fig. 2: panels a, b, and c; clades I, II, III, and IV; time axis in 2-year steps from Jan 2004.]
For D_XY, we observed that all chromosomes exhibited a bimodal distribution of regions of both high and low levels of D_XY; scaffolds 8 and 10, which correspond to chromosomes 1 and 3, respectively (14), show only low D_XY values (Fig. 3e; Fig. S2). D_XY is expected to be elevated in regions of limited gene flow, which could have arisen in C. auris due to chromosomal rearrangements between the clades (28,29), whereas D_XY is unchanged or decreased in regions under recurrent background selection or selective sweeps.
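The two between-clade statistics discussed above can be illustrated on a single genomic window. The sketch below computes D_XY as the mean per-site difference between sequences drawn from two populations, and F_ST with Hudson's estimator (1 minus within-population diversity over between-population diversity). The choice of estimator is an assumption, since the one used in this study is not stated here; the sequences are invented toy data.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def pi_within(pop):
    """Per-site diversity within one population (needs at least 2 sequences)."""
    pairs = list(combinations(pop, 2))
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * len(pop[0]))


def dxy(pop_x, pop_y):
    """Mean per-site difference over all between-population sequence pairs."""
    pairs = list(product(pop_x, pop_y))
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * len(pop_x[0]))


def fst_hudson(pop_x, pop_y):
    """Hudson-style F_ST for one window (assumed estimator, for illustration)."""
    between = dxy(pop_x, pop_y)
    if between == 0:
        return 0.0
    within = (pi_within(pop_x) + pi_within(pop_y)) / 2
    return 1 - within / between


# Hypothetical 10-bp window with five fixed differences between two clades.
clade_x = ["AAAAAAAAAA", "AAAAAAAAAT"]
clade_y = ["TTTTTAAAAA", "TTTTTAAAAC"]

print("D_XY:", dxy(clade_x, clade_y))
print("F_ST:", fst_hudson(clade_x, clade_y))
```

In a genome scan, the same calculation would be repeated over sliding windows, so that tracts of high or low D_XY and F_ST can be located along each scaffold.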
We observed a single C. auris mating type in each clade. Isolates in clades I and IV had MTLa, and those in clades II and III had MTLα (Fig. S4); this confirms prior findings from a smaller data set (14) in this larger global survey. Countries with multiple clades (i.e., Canada, Kenya, and United States) had isolates of opposite mating types; however, based on the PCA, there is no evidence of hybridization between clades within these countries or even between isolates of opposite mating types that were observed contemporaneously in a single health care facility in Kenya. Together, these findings suggest that the C. auris clades have been genetically isolated and that variation across the genome was likely impacted by karyotype variation that prevented equal chromosome mixing.
Antifungal drug resistance and mechanisms of resistance. To examine resistance levels, we performed antifungal susceptibility testing (AFST) to fluconazole, amphotericin B, and micafungin, drugs representing each of the major classes. Of the 296 isolates tested, 80% were resistant to fluconazole, 23% to amphotericin B, and 7% to micafungin (Table 1 and Fig. 4). Clade II had the greatest percentage (86%) of susceptible isolates, including only one isolate resistant to fluconazole, and clade I had the greatest percentage of isolates resistant to fluconazole (97%) and amphotericin B (47%). Additionally, clade I had the highest rates of multidrug resistance (two antifungal classes; 45%) and was the only clade to have extensive drug resistance (3%) to all three major classes of antifungals, including isolates from two geographic regions (United Arab Emirates and Kenya) that cluster together (Table 1 and Fig. 4). Amphotericin B resistance appeared only in clades I and IV; it was dispersed across the phylogeny in clade I and detected in a clade IV cluster of isolates from Colombia. Clade IV also had the highest percentage (9%) of isolates resistant to micafungin, all from Venezuela. Micafungin resistance appeared sporadically in the phylogenies of clades I and III.
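The resistance categories used above (resistant to one class, multidrug resistant for two classes, extensively drug resistant for all three) can be sketched as a simple classification of MIC values. The fluconazole threshold (MIC ≥ 32 µg/ml) is the one reported in this article; the amphotericin B and micafungin thresholds in the sketch are assumptions for illustration, and the isolates are hypothetical.

```python
# Fluconazole breakpoint (MIC >= 32 ug/ml) is the value quoted in the article.
# The amphotericin B and micafungin breakpoints are assumed for illustration.
BREAKPOINTS_UG_PER_ML = {
    "fluconazole": 32.0,    # azole
    "amphotericin_b": 2.0,  # polyene (assumed)
    "micafungin": 4.0,      # echinocandin (assumed)
}

CATEGORIES = [
    "susceptible",
    "resistant to one class",
    "multidrug resistant",
    "extensively drug resistant",
]


def resistant_drugs(mics):
    """Return the drugs whose MIC is at or above the breakpoint."""
    return sorted(
        drug for drug, breakpoint in BREAKPOINTS_UG_PER_ML.items()
        if mics[drug] >= breakpoint
    )


def resistance_category(mics):
    """Each drug represents one class, so the count of drugs equals classes."""
    return CATEGORIES[len(resistant_drugs(mics))]


# Hypothetical isolates with MICs in ug/ml.
isolates = {
    "isolate_A": {"fluconazole": 2.0, "amphotericin_b": 0.5, "micafungin": 0.125},
    "isolate_B": {"fluconazole": 256.0, "amphotericin_b": 1.0, "micafungin": 0.25},
    "isolate_C": {"fluconazole": 128.0, "amphotericin_b": 4.0, "micafungin": 0.25},
    "isolate_D": {"fluconazole": 256.0, "amphotericin_b": 2.0, "micafungin": 8.0},
}

for name, mics in isolates.items():
    print(name, resistance_category(mics), resistant_drugs(mics))
```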
We next determined the genotypes of specific drug mutations in the ERG11 gene that have been associated with azole resistance (Y132F, K143R, and F126L). The most widespread mutation was Y132F, which spanned 11 countries and was found in 53% of isolates from clade I and 40% of isolates from clade IV (Fig. 4). ERG11 K143R was predominantly found in a subclade within clade I (43%) and in one isolate from clade IV. F126L was found only in clade III, in nearly all isolates (96%) (Fig. 4); all isolates with F126L also carried the adjacent mutation V125A. Nearly all of the isolates with these changes in ERG11 were resistant to fluconazole; 99% of the isolates with Y132F or K143R and 100% of the isolates with F126L/V125A appeared resistant to fluconazole (MIC ≥ 32 µg/ml). We also identified polymorphisms at S639 in hot spot 1 of the FKS1 gene in 90% of the isolates with decreased susceptibility to micafungin. The most frequent mutation was S639P in 13 isolates from clade IV (11 resistant to micafungin), and S639F and S639Y were found in micafungin-resistant isolates from clades I and III (Fig. 4).
Analysis of the distribution of ERG11 copy number variation (CNV) revealed that of 304 isolates, 18 (6%) had either two or three copies. Of those 18 isolates, all were resistant to fluconazole and 17 (94%) were from clade III (Fig. S5). Isolates within clade III with two and three copies of ERG11 had significantly higher MICs (P ≤ 0.05; Mann-Whitney test) to fluconazole than isolates with one copy (Fig. S5). Along with CNVs in ERG11, we found a total of six large regions (>40 kb) that showed increased copy number. Unlike CNVs in ERG11, these CNVs appeared in single isolates even in highly clonal clusters, with two isolates in each of clades I, II, and IV (Fig. S6). While genes in these regions (between 23 and 125 genes in each region) have no direct relation with antifungal resistance, they might play a role in microevolution and C. auris adaptation to host stress. This includes genes associated with response to oxidative stress (AOX2 and HSP12), iron assimilation (FET33, FTR1, CHA1, and FLC1), cell wall and membrane integrity (MNN2, ERG5, and ERG24), a transcription factor (ZCF16), and oligopeptide transporters associated with metabolic and morphologic adaptation and adherence (see Table S2 in the supplemental material). These data provide insight into the underlying molecular mechanisms of antifungal resistance and suggest that CNV could be a mechanism of strain variation in C. auris. Further exploration and monitoring of these traits are crucial to improve our understanding of C. auris diversity and control the expanding outbreak.
[Table 1 (partial), values are % (no.) of isolates: Clade II (7): 86 (6), 14 (1), 0 (0), 0 (0), 0 (0), 0 (0); Clade III (51): 2 (1), 98 (50), 0 (0), 8 (4), 8 (4), 0 (0); Clade IV (120): 31 (37), 59 (71).]
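The comparison of fluconazole MICs between isolates with one copy and with multiple copies of ERG11 relies on a Mann-Whitney test. The sketch below illustrates the idea with the U statistic and an exact permutation P value. The MIC values are invented for illustration, and the one-sided alternative is an assumption, since the sidedness of the test is not stated here.

```python
from itertools import combinations


def u_statistic(x, y):
    """U for x: count of (x_i, y_j) pairs with x_i > y_j; ties count as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_one_sided_p(x, y):
    """Exact permutation P value for the alternative 'x tends to exceed y'.

    Every way of splitting the pooled values into groups of the original
    sizes is enumerated, so this is only practical for small samples.
    """
    pooled = list(x) + list(y)
    observed = u_statistic(x, y)
    indices = range(len(pooled))
    as_extreme = 0
    total = 0
    for chosen in combinations(indices, len(x)):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if u_statistic(group_x, group_y) >= observed:
            as_extreme += 1
    return as_extreme / total


# Hypothetical fluconazole MICs (ug/ml), not data from this study.
multi_copy = [128, 256, 256, 128]
single_copy = [32, 64, 64, 32, 64]

print("U =", u_statistic(multi_copy, single_copy))
print("P =", exact_one_sided_p(multi_copy, single_copy))
```

A rank-based test suits MIC data because MICs are measured on a doubling-dilution scale and are not expected to be normally distributed.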

DISCUSSION
In this study, we used whole-genome sequencing to describe a global collection of C. auris isolates collected from patients and health care facilities between 2004 and 2018. We found that the four predominant clades are genetically distinct with strong geographic substructure in clade IV. Using collection dates to estimate a molecular clock, we dated the origins of the four clades and confirmed the recent emergence of C. auris. Furthermore, we characterized mutations associated with antifungal resistance by clade, which varied between clades and country of isolation. While the clades appear largely clonal in species phylogenies and represent a single mating type, we found that they have distinct evolutionary histories and genome-wide patterns of variation. We provided a browsable version for C. auris genomic epidemiology through Microreact (30) to explore phylogeny, geographic distribution, timeline, and drug resistance mutations (https://microreact.org/project/Candidaauris).
In contrast to previous reports (13), we observed more phylogeographic mixing for C. auris. While we found that isolates from additional global regions can be clearly assigned to one of the four previously reported clades, we observed that three countries (Canada, Kenya, and United States) had isolates corresponding to multiple C. auris clades (Fig. 1b). Additionally, isolates from multiple clades have been previously reported in Germany (19) and the United Kingdom (20). As travel has been previously shown to play a major role in the spread of C. auris (21), global travel of persons with prior health care exposures to C. auris has likely contributed to the observed phylogeographic mixing. Our analysis of the likely geographic origin of infections observed in new geographic regions is limited by incomplete travel history for most patients in this study set. We noted that the strongest geographic substructure was observed in clade IV for isolates from Colombia, Panama, and Venezuela, with additional distinct clades of isolates from Israel and United States (Fig. 1c). This finding further supports evidence of rapid localized transmission in some of these countries (18,21).
These results have confirmed prior findings from the analysis of a smaller data set where isolates in clades I and IV had MTLa, and those in clades II and III had MTLα (14).
Although mating between C. auris clades has not been reported, it is concerning that the majority of countries reporting multiple C. auris clades have clades of opposite mating types. This is especially concerning in Kenya, where opposite mating types were observed in a single health care facility experiencing ongoing transmission. In such a situation, it could be possible to have mixed infections of opposite mating types. If mating occurred, this would lead to increased genetic diversity and the possibility for enhanced virulence and exchange of drug resistance alleles. Continued efforts to characterize C. auris infections at the genomic level provide the most sensitive approach for the detection of potential C. auris hybrids.
Assessment of C. auris population structure by PCA and genome-wide F_ST analysis yielded no evidence for admixture between the major clades. The close relationship of clades I and III is highlighted by the detection of regions with very low F_ST values, which suggests recent divergence or genetic exchange between these clades. Given that these regions were short and spread across the genome, we hypothesized that they are a result of incomplete lineage sorting rather than recent introgression events. We also observed variation in the average divergence between clades (D_XY) along each chromosome. This may be due to genome rearrangements between the clades, whereby genomic areas exhibiting high D_XY levels, or low gene flow, arose in C. auris due to chromosomal rearrangements, which prevent recombination and support high rates of genetic differentiation. Variation in chromosome number and size as measured by electrophoretic karyotyping as well as deletions, inversions, and translocations detected by comparing genome assemblies of different C. auris clade isolates have been reported for C. auris (14,28,29).
This global survey has provided a wider perspective of the mechanisms and frequency of mutations associated with resistance to antifungal drugs. The presence of both resistant and susceptible isolates in the same populations along with the presence of genetically related isolates with different alleles of resistance genes indicates that the resistance in C. auris is not intrinsic and has been recently acquired. The most common mutation associated with azole resistance in clades I and IV was ERG11 Y132F; however, both clades also included genetically related isolates with ERG11 K143R. In contrast, all fluconazole-resistant isolates in clade III carried the ERG11 F126L substitution. In addition to mutations in genes associated with drug resistance, we found that increase in copy number of ERG11 is predominantly observed in clade III, again suggesting clade-specific variation in mechanisms of azole resistance. Recently, frequent nonsynonymous mutations in the TAC1B transcription factor were reported among azole-resistant isolates in clades I and IV and also detected during experimental evolution in the presence of drug (31), providing additional evidence of candidate clade-specific profiles of azole resistance. All but three isolates with micafungin resistance had FKS1 S639Y/P/F mutations. Taken together, these observations suggest recent emergence of antifungal resistance in C. auris populations, most likely in response to some unknown environmental change, such as increased use of azole antifungals in clinical practice, agriculture, or both.
By using a molecular clock, we estimated the ages of the four clades by calculating time to the most recent common ancestor (TMRCA) of each clade. Our estimates demonstrated that clade II was the oldest clade with TMRCA of 360 years, while clade IV was the youngest with TMRCA of 38 years. Clade I and III isolates coalesce 149 and 186 years ago, respectively; however, in both clade I and clade III, the clusters of isolates associated with ongoing drug-resistant outbreaks worldwide, which display increased resistance to fluconazole and harbor mutations in ERG11 associated with drug resistance, have emerged within the last 37 years. These results are consistent with other population characteristics: even with the smallest sample set, clade II had the highest genetic diversity compared to the three other clades and a positive TD, characteristics of an older population. Conversely, the three other clades had low genetic diversity and negative TD consistent with the rapid emergence. Notably, the oldest C. auris isolate was collected from a patient in South Korea in 1996 (2), and no other strains were identified by searching the historic Candida culture collections. The absence of C. auris in culture collections prior to 1996 and a rapid emergence after 2012 suggest that this organism only recently emerged as a human pathogen and likely occupied a different ecological niche.
Other notable fungal outbreaks have also been estimated to be of recent origin. For example, the BdGPL lineage of the amphibian pathogen Batrachochytrium dendrobatidis was estimated to have arisen only ~100 years ago (32). The dispersal of Cryptococcus gattii into the Pacific Northwest also appears to have occurred within the last 100 years (33). While our reported mutation rate of 1.87e-5 substitutions per site per year is consistent with that (5.7e-5; R² = 0.37) reported in a previous study (17), the mutation rate over longer time spans than we sampled is likely lower. We used collection dates spanning from 2004 to 2018 to inform our estimate, and rates of molecular evolution measured over short time scales tend to be overestimated, as some sites will be removed over time by natural selection (34). Therefore, the rate is more similar to a spontaneous mutation rate than to an evolutionary substitution rate. If our mutation rate is substantially overestimated, the exact times of C. auris emergence and clade divergence would be older than we have estimated. We also acknowledge that utilizing only currently known isolates, which are highly similar within clades, provides a limited sampling of a larger source population, which may also be undergoing sexual recombination. The identification and characterization of a wider population sample of C. auris will provide a higher-resolution view of the nodes separating these major clades. However, as there is only speculation thus far about potential associations or locations of such a source population, we suggest that the dates reported be used as a rough estimate that will need further evaluation when sources of additional diversity are identified.
Our molecular clock estimates demonstrate that nearly all outbreak-causing clusters from clades I, III, and IV originated 36 to 38 years ago in 1982 to 1984. Such recent origin and nearly simultaneous detection of genetically distinct clades suggest that anthropogenic factors might have contributed to its emergence. First, azole drugs became widely used in clinical practice in the 1980s. The first azole topical antifungal drug, miconazole, was approved in 1971, followed by clotrimazole in 1972; both became widely used for treatment of superficial fungal infections in the late 1970s. In 1981, the first oral azole drug, ketoconazole, was released for treatment of systemic fungal infections (35). Second, in agriculture, the first azole fungicides, triadimefon and imazalil, were introduced in 1973, and by the early 1980s, 10 different azole pesticide formulations were available. It has been demonstrated that azoles from agricultural use can penetrate ground water and accumulate in soils (36,37). Third, our predicted emergence of C. auris as a human pathogen coincided with the early stages of the AIDS epidemic; however, the wide use of antifungal drugs, such as fluconazole, for treatment of secondary fungal infections did not start until the late 1980s (38)(39)(40). Other anthropogenic factors might also have brought C. auris into contact with humans (41). Although the emergence of C. auris may be due to multiple factors, the coincidence between the introduction of azoles and emergence of C. auris is intriguing and requires further investigation, including the key question of identifying the source population. Understanding processes that led to the emergence of C. auris in humans is important to prevent emergence of other drug-resistant fungi and pathogens.
Although a recent study reported an isolate from a fifth clade isolated from a patient in Iran (16), all isolates in our collection were assigned to the previously described clades I, II, III, and IV. This is noteworthy because isolates from neighboring Pakistan, Saudi Arabia, and United Arab Emirates were represented in the analysis. Indeed, this highlights the unique nature of the divergent Iranian C. auris case and advocates for increasing diagnostic capacity worldwide and continued phylogenetic studies to understand C. auris diversity.
While we have included a set of diverse isolates, they likely differ from a random sample of the C. auris population. The isolates were obtained by convenience sampling, and therefore, our findings do not represent country-specific characteristics of C. auris molecular epidemiology. Wider sampling including identifying and collecting environmental isolates may also change the population structure and antifungal susceptibility profiles. Notably, at the time of this analysis, clade V had not yet been discovered, highlighting the importance of further sampling and genomic characterization. Finally, since the environmental reservoir of C. auris remains unknown, our analysis is based solely on the analysis of clinical isolates; higher genetic diversity, deeper divergence times, and different population structure are likely to occur in the natural populations of this fungus.
In conclusion, we have provided a comprehensive genomic description of a global C. auris survey representing 19 countries on six continents. Given that C. auris is a transmissible multidrug-resistant organism causing outbreaks of invasive infections in health care settings, an understanding of how C. auris is spreading, evolving, and acquiring resistance to antifungal drugs is essential for robust public health responses. Continued efforts to characterize the C. auris population, additional mechanisms of antifungal resistance, and environments conducive for mating between clades are critical.

MATERIALS AND METHODS
Sample collection. We performed genomic analyses on sequences from 304 C. auris isolates. This collection included C. auris isolates from 19 countries on six continents and isolates from both C. auris cases and environmental surfaces from health care facilities where ongoing transmission was occurring (see Table S1 in the supplemental material). Samples from C. auris cases were derived from a variety of specimen source sites, including sterile sites, such as blood, and noninvasive sites, such as respiratory tract or urine. All samples were a result of convenience sampling. For four countries (Colombia, Kenya, United States, and Venezuela) where more than 50 samples were available, 50 representative samples were selected by proportional random sampling: samples from each country were stratified by city, and then a subset was randomly selected proportionally from each stratum.
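The proportional random sampling described above can be sketched as follows. This is an illustration only: the record layout, the city labels, the largest-remainder rounding rule, and the fixed random seed are all assumptions and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def proportional_stratified_sample(samples, key, k, seed=0):
    """Select k samples so that each stratum is represented proportionally.

    samples: list of dicts; key: field used for stratification (e.g., city).
    Quotas are rounded with the largest-remainder rule so they sum to k.
    """
    if k >= len(samples):
        return list(samples)
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in samples:
        strata[sample[key]].append(sample)

    total = len(samples)
    exact = {name: k * len(members) / total for name, members in strata.items()}
    quota = {name: int(value) for name, value in exact.items()}
    leftover = k - sum(quota.values())
    # Give the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - quota[name], reverse=True)
    for name in by_remainder[:leftover]:
        quota[name] += 1

    chosen = []
    for name, members in strata.items():
        chosen.extend(rng.sample(members, quota[name]))
    return chosen


# Hypothetical country with 80 available samples from three cities.
available = (
    [{"id": f"A{i}", "city": "city_A"} for i in range(40)]
    + [{"id": f"B{i}", "city": "city_B"} for i in range(24)]
    + [{"id": f"C{i}", "city": "city_C"} for i in range(16)]
)

selected = proportional_stratified_sample(available, key="city", k=50)
print(Counter(sample["city"] for sample in selected))
```

With these toy counts, the 50 selected samples keep the 40:24:16 ratio of the three cities (25, 15, and 10 samples).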
Sample preparation and whole-genome sequencing (WGS). The sample collection comprised both publicly available sequences generated from previous studies and newly sequenced isolates (Table S1). For newly sequenced isolates, except those from France, DNA was extracted using the ZR Fungal/Bacterial DNA MiniPrep kit (Zymo Research, Irvine, CA, USA). For isolates from France, DNA was extracted using NucleoMag plant kit extraction (Macherey-Nagel, Germany) in a KingFisher Flex system (Thermo Fisher Scientific). Genomic libraries were constructed and barcoded using the NEBNext Ultra DNA Library Prep kit for Illumina (New England Biolabs, Ipswich, MA, USA) and were sequenced on either the Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA) using the HiSeq Rapid SBS kit v2 for 500 cycles or the MiSeq platform using the MiSeq reagent kit v2 for 500 cycles. For the two isolates from France, libraries were constructed using the Illumina Nextera Flex protocol and sequenced on an iSeq 100 to generate paired 150-bp reads. We estimated that 98.4% of the genome is uniquely mappable with reads of 250 bases using gemtools version 1.6 (kmer of 250, approximation threshold of 8, max mismatches of 10, max big indel length of 15, minimum matched bases of 200) (42). Repetitive regions, including microsatellites, represent a very low fraction of the C. auris genome, with 1.3% repetitive sequences and 1.1% microsatellites. A combined library of de novo repeats identified using RepeatModeler v1.0.11 (www.repeatmasker.org/RepeatModeler/) and fungal sequences from RepBase (43) was mapped using RepeatMasker v.4.0.5 (www.repeatmasker.org/). Tandem repeats were determined using Tandem Repeats Finder (https://tandem.bu.edu/trf/trf.download.html) using the basic search option.
Phylogenetic and phylodynamic analyses. For phylogenetic analysis, sites with an unambiguous SNP in at least 10% of the isolates (n = 222,619) were concatenated. Maximum likelihood phylogenies were constructed using RAxML v8.2.4 (48) using the GTRCAT nucleotide substitution model and bootstrap analysis based on 1,000 replicates. Phylogenetic analysis was also performed for each clade using subsets of the entire VCF and visualized using iTOL (49).
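As an illustration of the site selection step, the following sketch keeps positions at which at least 10% of isolates carry an unambiguous non-reference base and concatenates them into one alignment row per isolate. This is one reading of the criterion, not the authors' pipeline: the per-isolate string representation, the use of "N" for ambiguous calls, and the function names are all assumptions.

```python
def select_snp_sites(reference, calls, min_fraction=0.10):
    """Return positions where at least min_fraction of isolates carry an
    unambiguous SNP (an A/C/G/T base that differs from the reference).

    reference: reference sequence as a string.
    calls: dict mapping isolate name -> called sequence of the same length,
           with 'N' marking ambiguous or missing calls (assumed encoding).
    """
    n_isolates = len(calls)
    keep = []
    for pos, ref_base in enumerate(reference):
        n_snp = sum(
            1
            for seq in calls.values()
            if seq[pos] in "ACGT" and seq[pos] != ref_base
        )
        if n_snp / n_isolates >= min_fraction:
            keep.append(pos)
    return keep


def concatenate_sites(calls, positions):
    """Build one concatenated string of the selected sites per isolate."""
    return {
        name: "".join(seq[p] for p in positions)
        for name, seq in calls.items()
    }
```

The concatenated strings could then be written out as a FASTA or PHYLIP alignment for input to a maximum likelihood program.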
For phylodynamic analysis, we assessed temporal signal using a set of isolates from the United States (clade I), Kenya (clade III), or Venezuela (clade IV) with TempEst v1.5.3 (50) to estimate an initial mutation rate and R² value for each clade. The rates were 4.65e−5 (R² = 0.55), 1.87e−5 (R² = 0.56), and 1.54e−5 (R² = 0.21) for clades I, III, and IV, respectively, supporting strong temporal correlation and low rate variation between clades. These rates are consistent with the rate previously reported for isolates from an outbreak in the United Kingdom (clade I; 5.7e−5; R² = 0.37) (17). As mutation rates were similar between the clades, Bayesian phylogenies were generated using BEAST v1.8.4 (51) under a strict molecular clock (both lognormal and exponential priors). In addition, we applied both the Bayesian Skyline and the Coalescent Exponential models, together with a general time reversible (GTR) nucleotide substitution model. We obtained similar results using the molecular rate estimated for a C. auris outbreak in the United Kingdom (17). Specimen collection dates (month and year) were used as sampling dates; the month of June (year midpoint) was assigned for samples where the month was unknown. Bayesian Markov chain Monte Carlo (MCMC) analyses were run for 500 million steps using an unweighted pair-group method with arithmetic mean (UPGMA) tree as a starting tree, and samples were drawn every 10,000 MCMC steps. The MCMC convergence was explored by inspection of posterior samples (effective sample size, 215) using Tracer v.1.7.1 (52). We generated a maximum clade credibility tree with TreeAnnotator v1.8.4 after discarding 10% as burn-in, and we visualized phylogenies using FigTree v1.4.4.
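The temporal signal assessment rests on a root-to-tip regression: the genetic distance of each isolate from the root is regressed on its sampling date, the slope gives an initial rate estimate, and R² measures how clock-like the data are. The sketch below shows that calculation with ordinary least squares. It is not TempEst itself; the function names are hypothetical, and placing each sample at mid-month is an assumption (the text specifies only that June was assigned when the month was unknown).

```python
def decimal_date(year, month=None):
    """Convert a (year, month) sampling date to a decimal year.

    An unknown month is replaced by June, as described in the text.
    Using the middle of the month is an assumption of this sketch.
    """
    if month is None:
        month = 6
    return year + (month - 0.5) / 12.0


def root_to_tip_regression(dates, distances):
    """Least-squares regression of root-to-tip distance on sampling date.

    Returns (slope, intercept, r_squared); the slope is the initial rate
    estimate in distance units per year.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

The root-to-tip distances would come from a rooted maximum likelihood tree; extracting them is outside the scope of this sketch.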
Antifungal susceptibility testing (AFST). Antifungal susceptibility testing was performed on 296/304 (97%) isolates (Table S1). The majority (n = 271; 90%) of isolates were tested at the U.S. Centers for Disease Control and Prevention (CDC) as outlined by Clinical and Laboratory Standards Institute guidelines. Custom prepared microdilution plates (Trek Diagnostics, Oakwood Village, OH, USA) were used for fluconazole and the echinocandin micafungin. Interpretive breakpoints for C. auris were defined based on a combination of those breakpoints which have been established for other closely related Candida species, epidemiologic cutoff values, and the biphasic distribution of MICs between the isolates with and without known mutations for antifungal resistance. Resistance was set at ≥32 μg/ml for fluconazole and at ≥4 μg/ml for micafungin. Amphotericin B was assessed by Etests (bioMérieux), and resistance was set at ≥2 μg/ml. For isolates not tested at the CDC, similar methods were employed, as described previously (17,19,56). As there are no currently approved breakpoints for C. auris, for this study, the breakpoints were set at ≥32 μg/ml for fluconazole, >1 μg/ml for amphotericin B, and ≥4 μg/ml for micafungin. These MIC values were based on a combination of the wild-type distribution (those isolates with no mutations) and pharmacokinetic (PK)/pharmacodynamic (PD) analysis in a mouse model of infection (57).
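Applying these study breakpoints to a table of MICs is a simple threshold comparison, sketched below. The dictionary layout and function names are hypothetical. The text gives the amphotericin B threshold both as ≥2 μg/ml and as >1 μg/ml; the sketch uses ≥2 μg/ml, which is an assumption that matters only for Etest readings between 1 and 2 μg/ml.

```python
# Study breakpoints in micrograms per milliliter; an isolate is called
# resistant when its MIC is greater than or equal to the value listed.
# The amphotericin B entry uses the ">= 2" form of the threshold (assumption).
BREAKPOINTS_UG_PER_ML = {
    "fluconazole": 32.0,
    "amphotericin B": 2.0,
    "micafungin": 4.0,
}


def is_resistant(drug, mic):
    """Return True if the MIC (in ug/ml) meets or exceeds the breakpoint."""
    return mic >= BREAKPOINTS_UG_PER_ML[drug]


def resistance_profile(mics):
    """Classify one isolate.

    mics: dict mapping drug name -> MIC in ug/ml; drugs that were not
    tested are simply absent from the input and from the result.
    """
    return {drug: is_resistant(drug, mic) for drug, mic in mics.items()}
```

For example, an isolate with a fluconazole MIC of 64 μg/ml and a micafungin MIC of 0.25 μg/ml would be classified as fluconazole resistant and micafungin susceptible.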
Ethics. This project was reviewed by the CDC institutional review board (IRB) as part of the broader human subjects protocol for the Mycotic Diseases Branch, CDC.
Data and resource availability. All Illumina sequence data generated by this project are available in the NCBI SRA under BioProject accession numbers PRJNA328792, PRJNA470683, PRJNA493622, and PRJNA595978. The phylogenetic tree has been deposited in Microreact (https://microreact.org/project/Candidaauris). A set of isolates representing each of the five clades is available from the CDC and FDA Antimicrobial Resistance (AR) Isolate Bank (https://www.cdc.gov/drugresistance/resistance-bank/index.html).

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.

ACKNOWLEDGMENTS
We thank Anuradha Chowdhary, Joveria Farooqi, Nelesh Govender, Mathew Fisher, and Koichi Makimura for their continued support and for sharing isolates and Mathew Fisher and Angela Early for comments on the manuscript. We also thank Jacques Meis for coordinating isolate acquisition; Javier Peman and Olga Rivero-Menendez for support with the Spanish isolates; Erika Santiago, Jovanna Borace, and Angel Cedeño from Hospital Santo Tomas; and Soraya Salcedo, Adriana Marín, Carmen Varón, Nohora Villalobos, Jairo Perez, and Julian Escobar for participating in the collection of strains and data from Colombian institutions. We also thank Daniel Park and Christopher Tomkins-Tinch for cloud computing support for BEAST analysis.
This project has been funded in part with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under award U19AI110818 to the Broad Institute. C.A.C. is a CIFAR fellow in the Fungal Kingdom Program. This work was also made possible through support from the Advanced Molecular Detection (AMD) initiative at CDC.
The use of product names in this manuscript does not imply their endorsement by the U.S. Department of Health and Human Services. The findings and conclusions in this article are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention.