Chromosomal rearrangements and loss of subtelomeric adhesins linked to clade-specific phenotypes in Candida auris

Candida auris is an emerging fungal pathogen of rising concern due to its increasing incidence, its antifungal resistance, and its ability to cause healthcare-associated outbreaks. Genomic analysis revealed that early cases of C. auris that were detected contemporaneously were geographically stratified into four major clades. Clade II, also termed the East Asian clade, includes the initial isolates described from cases of ear infection, is less frequently resistant to antifungal drugs, and to date has not been associated with outbreaks. Here, we generate nearly complete (“telomere-to-telomere”) genomes of an isolate of this clade and of an isolate of the more widespread Clade IV. By comparing these to genome assemblies of the other two clades, we find that the Clade II genome appears highly rearranged, with 2 inversions and 9 translocations resulting in a substantially different karyotype. In addition, large subtelomeric regions have been lost from 10 of 14 chromosome ends in the Clade II genomes. We find that shorter telomeres and genome instability might be a consequence of a naturally occurring loss-of-function mutation in DCC1 exclusively found in Clade II isolates, resulting in a hypermutator phenotype. We also determine that deleted subtelomeric regions might be linked to clade-specific adaptation, as these regions are enriched in Hyr/Iff-like cell surface proteins, novel candidate cell surface proteins, and an ALS-like adhesin. The presence of these cell surface proteins in the clades responsible for global outbreaks causing invasive infections suggests an explanation for the different phenotypes observed between clades.

IMPORTANCE Candida auris was unknown prior to 2009 and has since spread quickly around the world, causing outbreaks in healthcare facilities and representing a high fraction of candidemia cases in some regions. The emergence of C. auris is a major concern, since it is often multidrug-resistant, spreads easily between patients, and causes invasive infections. While isolates from three global clades cause invasive infections, isolates from Clade II primarily cause ear infections and have not been implicated in outbreaks, though cases of Clade II infections have been reported on different continents. Here, we describe genetic differences between Clade II and Clades I, III and IV, including a loss-of-function mutation in a gene associated with telomere length maintenance and genome stability, and the loss of cell wall proteins involved in adhesion and biofilm formation. Together, these differences may help explain the lower virulence and lower transmission potential of Clade II isolates.



Initial genomic analysis of the outbreak identified four major genetic groups corresponding to these geographic regions or Clades I, II, III, and IV (8). Clades I, III, and IV are responsible for the ongoing and difficult to control outbreaks in healthcare facilities worldwide (9). Clade II, also termed the East Asia clade, is predominantly associated with cases of ear infection and appears to be less resistant to antifungals than other clades (10). While a reference genome assembly of a Clade I isolate is commonly used for SNP analyses, the karyotype is known to vary based on whole genome alignment with an assembly of a Clade III isolate (11) and wider analysis of chromosomal sizes (12). To better understand the emergence of this species and phenotypic differences between clades, here we leverage complete reference genomes for isolates from Clades II and IV. We find that the genome of Clade II is highly rearranged and is missing large

To investigate genomic differences between clades, we generated complete chromosome scale assemblies for isolates from Clades II and IV. Genome assemblies of B11245 (Clade IV) and B11220 (Clade II) consisted of 7 nuclear contigs corresponding to complete chromosomes with telomeres at both ends, excluding one end that corresponds to rDNA in each assembly and one additional end in B11220 (Supplementary Table 1

The subtelomeric regions deleted in Clade II likely contribute to the phenotypic differences of this clade, most notably by the loss of fourteen candidate adhesins present in Clades I, III and IV. These include two sets of genes that contain predicted GPI anchors and secretion signals, one set sharing sequence similarity to C. albicans adhesins from the Hyr/Iff family and a second set of clustered genes only found in C. auris and the closely related species C. haemulonii and C. duobushaemulonii (Figure 2a; Supplementary Table 2). The Hyr/Iff gene family was previously noted to be the most highly enriched family in pathogenic Candida species and has been associated with pathogenicity and virulence (15). Six of eight Hyr/Iff proteins found in C.