Candida albicans selection for human commensalism results in substantial within-host diversity without decreasing fitness for invasive disease

Candida albicans is a frequent colonizer of human mucosal surfaces as well as an opportunistic pathogen. C. albicans is remarkably versatile in its ability to colonize diverse host sites with differences in oxygen and nutrient availability, pH, immune responses, and resident microbes, among other cues. It is unclear how the genetic background of a commensal colonizing population can influence the shift to pathogenicity. Therefore, we examined 910 commensal isolates from 35 healthy donors to identify host niche-specific adaptations. We demonstrate that healthy people are reservoirs for genotypically and phenotypically diverse C. albicans strains. Using limited diversity exploitation, we identified a single nucleotide change in the uncharacterized ZMS1 transcription factor that was sufficient to drive hyper invasion into agar. We found that SC5314 was significantly different from the majority of both commensal and bloodstream isolates in its ability to induce host cell death. However, our commensal strains retained the capacity to cause disease in the Galleria model of systemic infection, including outcompeting the SC5314 reference strain during systemic competition assays. This study provides a global view of commensal strain variation and within-host strain diversity of C. albicans and suggests that selection for commensalism in humans does not result in a fitness cost for invasive disease.

Introduction with macrophages, but this variation was not explainable by sample origin site. We found that although the commensal isolates had a reduced capacity to cause macrophage cell death, the majority of the strains showed increased competitive fitness in invertebrate models of systemic candidiasis compared with the reference SC5314 strain and retained their capacity to cause disease during monotypic infections. Together, these data suggest that the selective pressures experienced by C. albicans during commensal colonization do not necessarily result in decreased pathogenic potential.

Phenotypic characterization of commensal C. albicans strains
We sourced C. albicans from oral and fecal samples from undergraduate student donors, using these as a representative sample of colonizing C. albicans strains from healthy individuals. In this population of students, 29% (29/98) were positive for oral Candida colonization and 10% (10/98) were positive for fecal colonization, including 4 students who exhibited both oral and fecal colonization (Fig 1A). From each host and site, we collected every individual colony present on the BD ChromAgar plates and confirmed species identity through ITS amplicon sequencing. Overall, we obtained 910 C. albicans isolates (fecal: n = 84 colonies, oral: n = 826 colonies) (Fig 1B).
We then performed growth assays on all 910 isolates and the SC5314 reference strain in rich medium (yeast peptone) with dextrose, galactose, or glycerol as the carbon source. We observed primarily unimodal distributions with a long tail of slow-growing strains for both exponential growth (μ max ) and saturating density (K) in rich media (Fig 1C). Although the 5 slowest-growing isolates were all obtained from oral samples, we did not observe a significant difference in growth rate between oral and fecal samples (S1A Fig), and overall, the growth rates between the different carbon sources were not correlated (S1B Fig).
As filamentation has been tightly associated with virulence, and because previous work in murine models has suggested that gut adapted strains may lose their ability to filament [21,36,38], we examined each isolate in the collection for their ability to form hyphae using 10% serum and febrile temperatures as 2 inducing cues. Additionally, we examined the morphology of each strain under the non-inducing condition of yeast extract peptone dextrose (YPD) at 30˚C to identify constitutive filamentation, as has been observed from isolates collected from sputum samples from cystic fibrosis patients [45]. We observed that while none of the strains were constitutively filamentous, several strains aggregated at YPD 30˚C (Fig 1D; represented by 882-46). These aggregative isolates were more likely to reach a lower carrying capacity; however, it is possible that the aggregation interfered with the accuracy of the OD reads.
All strains were able to filament in response to the standard laboratory inducing conditions of serum and high temperature, albeit with some variation in the number of hyphal cells, shape, and aggregation of the cells (Fig 1D). Under hyphal-inducing conditions, the aggregating strains formed filaments while the mother yeast cells remained connected, resulting in a star-like pattern. Some isolates showed different filamentation patterns under the different inducing conditions. For example, isolate 813-61 formed filaments that resembled those of SC5314 under 37˚C + 10% serum, but under the 42˚C inducing cue, this isolate formed fewer filaments with many yeast cells aggregating around the filament. The ability of all our commensal isolates to form filaments may be consistent with the hypothesis that interaction with bacteria in the gut maintains selection for the hyphal program [36,46]. Although there were isolates, such as 813-61, that showed more of a mixture of yeast and hyphae than the SC5314 parent that are potentially consistent with the yeast-locked phenotypes observed in Tso and Significance determined by Wilcoxon rank sum test. (C) Strains varied in maximum growth rate and carrying capacity in response to different carbon sources. Growth curves were performed on each isolate under 3 carbon sources at 30˚C for 24 h for dextrose and 48 h for glycerol and galactose in biological duplicate. Rate and carrying capacity were determined using the GrowthcurveR analysis package. Error bars represent median and interquartile range of the growth parameter. (D) All strains retained the capacity to filament in liquid inducing cues, but some demonstrated altered morphology and aggregation. Strains were incubated in the indicated conditions and imaged at 40× magnification. Scale = 10 μM. (E) Strains varied in their capacity to invade into solid agar. Colonies were incubated on the indicated conditions for 5 days before gentle washing and imaging for invasion. (F) UMAP plot for strain phenotypic similarity. All growth conditions and invasion phenotypes were nonlinearly projected into 2D space and colored by donor. colleagues (2018) [36], the majority of our commensal strains retained full capacity to filament in our in vitro assays. This may suggest that the antibiotic or germ-free mouse models are not fully recapitulating important features of C. albicans-human interactions.
Filamentation programs under solid and liquid growth can involve distinct genetic programs [47], and the human host can present a variety of substrates for C. albicans to utilize. Therefore, we also tested each isolate for its capacity to invade into solid agar media under multiple conditions: YPD agar at 37˚C under anaerobic conditions as a minimal inducing cue, YPD agar at 30˚C as the baseline condition, YPD agar at 37˚C as an intermediate inducing condition, and Spider agar at 37˚C as a strong inducing condition ( Fig 1D). As expected for a strong inducing cue, the highest degree of invasive growth was observed on Spider agar at 37˚C [47]. Interestingly, although all strains showed the ability to form hyphae under liquid culture conditions, many strains failed to invade into solid agar, giving scores of 0 or 1 under multiple conditions ( Fig 1E). We also observed substantial phenotypic heterogeneity among strains isolated from the same host, including in strains isolated from the same site, consistent with each individual being colonized by multiple, phenotypically distinct, strains of C. albicans.
To examine the relationship between the strains, observed phenotypes, and donors, we generated a uniform manifold approximation projection (UMAP) embedding, which will plot strains with similar phenotypes closer together and strains with dissimilar phenotypes farther apart based on the Euclidean metric ( Fig 1F). Using this projection, we saw significant overlap in strain phenotypes from many donors, although there were some donors that had strains that clustered away from others, such as strain 946.
Additionally, some donors had 2 separate clusters of strains, such as those from donor 814, 872, and 882. We also observed that some strains from multiple donors clustered together in this projection, including the aggregating strains from donors 811 and 882. However, we did not observe clustering based on sample origin site (S1C Fig). Together, these experiments demonstrate a large variation in phenotypes between commensal isolates, even among traits, such as filamentation and invasion, that are often correlated with virulence.

Genomic variability
Due to the extensive variation in observed phenotypes among the isolates, we next wanted to characterize the genomic variability in these strains, at both a sequence and structural variation level. To carry out these analyses, we selected a set of 45 commensal isolates, hereafter referred to as the "condensed set." Isolates were chosen for inclusion in the condensed set where we had matched pairs from oral and fecal isolates from the same donor or isolates from the same donor that exhibited multiple phenotypes based on growth, filamentation, or invasion. For these analyses, we compared each isolate to the SC5314 reference strain. We also included the human isolate, CHN1, which has been previously phenotypically and genotypically characterized as a bona fide commensal and is able to stably colonize the murine gastrointestinal tract for long periods of time, in contrast to SC5314 [21,48].
Previous descriptions of population genomic variation in C. albicans have largely relied on short-read sequencing approaches, which are limited in their ability to resolve structural variation between strains. Therefore, we performed whole-genome sequencing on all 45 commensal strains, SC5314, and CHN1 using the Transposase Enzyme Linked Long-read Sequencing (TELL-Seq) method for library preparation [49]. This approach uses barcode linked-reads to produce synthetic long reads with Illumina quality sequence, thus allowing us to capture both single-nucleotide variants (SNVs) and structural variants.
To identify SNVs and compare our strains to the existing set of sequenced C. albicans isolates, we collected 388 previously published C. albicans genomes and, after filtering, we mapped them, and our 45 newly generated genomes, to the SC5314 reference genome with BWA-MEM [50]. We called variants using GATK HaplotypeCaller and after filtering, we obtained final set of 112,136 high-quality SNVs across 431 remaining samples. Of these, 90,675 (80.86%) were represented in at least 1 member of our set of newly sequenced strains. The population diversity captured in this analysis was consistent with the largest previous analysis of C. albicans genomic variation, which called 589,255 SNPs from 182 genomes [25]. Although our final SNV set was significantly smaller than that identified in past work, our filtering criteria were significantly more stringent and robust for inferring population level patterns.
We then wanted to place our newly sequenced isolates in the phylogenetic context of previous work on C. albicans strains [19,25,51]. To remove redundancy and focus on natural C. albicans diversification, we removed samples corresponding to resequenced strains (e.g., multiple SC5314 samples present in full data set) and those sequenced as part of experimental evolution studies (i.e., [36,52,53]). Following removal of these samples, we were left with 324 sample SNV profiles, which were then used to cluster the samples into a dendrogram of relationships. Despite the reduced size of our data set, our SNV-based clustering recovers all major accepted clades of C. albicans (Fig 2) [22][23][24][25][26]. The fact that >80% of the 112,136 highquality SNVs we identified are represented in the genome sequences of the 45 new C. albicans isolates we sequenced, in addition to their clustering near 8 major clades, asserts the high degree of diversity captured in our study of commensal C. albicans strains actively colonizing humans. In line with past work and underpinning the validity of our SNV-based clustering, we did not collect isolates from clades of C. albicans known to exhibit a high degree of geographic specificity (e.g., Clade 13) (Fig 2) [25].
The majority of the new isolates (26/45) belonged to Clade 1, of which the reference strain SC5314 is also a member. We identified cases where isolates from the same donor clustered tightly together, such as donor 814, whose 6 strains included in the condensed set clustered in Clade 1. Consistent with previous reports on microevolution in the host [53], we observed primarily SNVs and short-tract loss of heterozygosity between these 6 isolates, perhaps consistent with clonal expansion and diversification during colonization. In the 4 donors colonized at both the oral and fecal sites (811, 814, 831, and 851), both oral and fecal isolates were closely genetically related. However, we also identified donors with colonizing strains from multiple clades, such as donor 882, whose 4 strains came from Clade 1 and Clade C, or donor 811, whose 4 strains came from Clade 1, Clade C, and Clade 4 (Fig 2). Interestingly, some isolates from multiple donors clustered within one another, such as those the donor pairs 838 & 833 and 882 & 811, potentially indicating transmission events between individuals, due to the phylogenetic closeness of the strains. Together, this suggests that variation in an individual's colonizing C. albicans strains may come from both within-host diversification and between-host transmission.
To characterize genomic variation at a structural level, we performed pulsed-field gel electrophoresis to separate the chromosomes of our condensed set of commensal C. albicans isolates, including SC5314 as a reference (Figs 3A and S3A). The commensal C. albicans strains show between 7 and 10 chromosome bands, ranging from approximately 0.8 MB to approximately 3.2 MB in size [54]. We observed many size differences in chromosomes, especially among the smaller chromosomes (corresponding with Chr 5, 6, and 7 in SC5314), but we also observed size variation in large chromosomes at approximately~1 MB,~1.5 MB, and~2 MB, potentially indicating variation in Chrs 2, 4, and R. Size variation of larger chromosomes may be mostly associated with allelic expansion or contraction of the rDNA repeat array on ChrR.
When comparing the chromosomal structural variations to our SNV tree, we identified several unique patterns and observed that a phylogenetic ordering of the isolates did not encompass the structural variation. For example, a subset of isolates within Clade 1 from donors 833 and 838 all contained an additional band below Chr 4, with the exception of isolate 833-19, despite being within the same phylogenetic cluster based on SNV analyses (Fig 3A). We also observed clade-level variation in karyotypes: Isolate 871-1 was the only strain in the condensed set from Clade 9, and this strain showed unique banding patterns at Chrs 2, 3, and 4. The 3 strains from Clade C, which originated from donors 882 and 811, all contained a band between

PLOS BIOLOGY
Chrs 5 and 6. Finally, the 2 closely related strains, 859-2 and 806-1, both contained a band between Chrs 2 and 3 ( Fig 3A). The observed altered band sizes could represent either loss of genome content or chromosomal fusion events. An advantage of our TELL-seq-based

PLOS BIOLOGY
platform was the opportunity to resolve the differences between the SNV-level analysis and the structural variants that we identified through the CHEF gels. We used the synthetic long-reads to assemble contigs for each strain, using Universal Sequencing's TELL-seq pipelines: Tell-Read and Tell-Link ( Fig 3B). We could observe variation in copy number across the chromosomes (Figs 3B and S3B), as well as potential inversions, duplications, or deletions (S3C Fig) [55,56]. We also see some peaks in copy number that are associated with MRS and rDNA sequences that were not resolved in our mapping. However, there may be some additional technical limitations of the TELL-seq approach, as we identified some inversions, duplications, and deletions in the resequenced SC5314 (S3C Fig). Therefore, we focused on those that were unique to the commensal isolates.
Notably, we were able to identify structural rearrangements and putative chromosome fusions that occurred in a set of the commensal isolates ( Fig 3C and S4 Table) and used PCR to test for the presence of the fusion event. In strain 880-2, we observed a fusion event between chromosomes 3 and 4, connected by a 1.3 kb intervening sequence ( Fig 3C). This intervening sequence had 94% sequence identity to an intergenic sub-telomeric region of chromosome 6. To determine whether this was a true event or a sequencing artifact, we designed primers to span the junction and performed PCR to amplify the fusion ( Fig 3C). Using this approach, we observed that in strain 880-2, there is a bona fide structural rearrangement that links chromosomes 3 and 4. In strain 859-2, we observed a fusion event between chromosomes 1 and 3, connected by an approximately 7 kb intervening sequence with no obvious sequence identity to the SC5314 reference strain, but instead had 99.76% sequence identity to a region on chromosome 2 from C. albicans strain TIMM 1768, a highly virulent strain originally isolated from the feces of a candidiasis patient [57] (Fig 3C). TIMM 1768 is closely related to CHN1, a member of Clade A [56]. We were again able to use PCR to span both junctions observed the presence of the fusion between chromosomes 1, 2, and 3 ( Fig 3C). This fusion event may correspond to the additional chromosomal band between chromosomes 2 and 3 that we observed in the karyotype for this strain. Importantly, neither of these fusion events were present in the SC5314 reference strain, indicating that the fusions were unique to the specific isolate ( Fig 3C). These structural variations were not captured in the SNV analysis and may play important roles in gene regulation or phenotypic variation between the strains.

Deep phenotyping of commensal isolates
The set of isolates for sequencing were initially chosen based on variation in growth rate in rich medium and alterations in invasion into agar. However, we hypothesized that we may identify site-specific adaptations, as host sites commonly colonized by C. albicans vary dramatically in environmental cues. Additionally, we hypothesized we may identify phenotypes associated with specific C. albicans clades, as we were able to identify structural variants shared between closely related isolates. To test this, we performed a set of growth analyses under multiple environmental conditions, including pH stresses, nutrient limitation, cell wall stressors, and antifungal drugs ( Fig 4A). These analyses produced a dense array of quantitative phenotypic information for each strain.
From these data, we identified 3 strains, 882-60, 882-46, and 811-7, that consistently reached a lower carrying capacity than the wild type under multiple conditions; this is consistent with the aggregative phenotype exhibited by these isolates at 30˚C (Fig 1D). These strains all belonged to Clade C and were closely related, despite arising from 2 donors (Fig 4A). Growth rates in the nutrient limitation conditions were generally correlated with each other. However, we did not observe a correlation between body site and growth rate, even in response to cues that would appear to be specific for a particular body site, such as anaerobic For phagocytosis and cell death rates, significant differences from the SC5314 reference strain were determined by one-way ANOVA, with Dunnett's multiple correction testing. Asterisks indicate P < 0.05 (*), P < 0.005 (**), P < 0.001 (***), and P < 0.0001 (****). https://doi.org/10.1371/journal.pbio.3001822.g004

PLOS BIOLOGY
growth. Across the commensal isolates, we noted the most variation in growth in response to caffeine and the antifungals fluconazole and rapamycin. In addition to growth, we measured each of the strains for their ability to form biofilms on plastic surfaces [58]. Although we observed variation between the strains, there was no correlation between isolation site or clade with the propensity of isolates to form biofilms (S4 Fig).
Our dense array of phenotypic data across 45 C. albicans isolates and 8 clades reveal that commensal isolates largely retain the plasticity to grow efficiently under diverse environmental cues, even those not immediately relevant to their colonizing site, as we did not observe growth enrichment in cues specific to isolation sites.
A major stress condition and environmental factor impacting C. albicans in the host is the immune response. Therefore, we moved from pure growth assays to measuring host-microbe interactions, using macrophages as representative phagocytes. We first hypothesized that the oral strains may show decreased recognition by macrophages, as persistent oral isolates were recently shown to result in reduced immune recognition and inflammation in both an OPC model of infection and in cell culture [59]. We tested this by measuring phagocytosis of each strain by immortalized bone marrow-derived macrophages and determining the ratio of internalized to external cells by differential staining and microscopy ( Fig 4B) [60]. Although most isolates were not significantly different from the SC5314 reference, the isolates generally had a lower phagocytic rate than SC5314. Additionally, there was no correlation between sample origin site or clade with phagocytosis rate.
As phagocytosis was not a major differentiating factor between strains, we then wanted to examine whether the strains would induce different levels of macrophage cell death. We primed bone marrow-derived macrophage for 2 h with LPS before infecting with each of our isolates for 4 h. Following infection, we stained the cells with propidium iodide (PI) as a measure of cell death ( Fig 4C). Strikingly, many of the commensal isolates induced less cell death than SC5314. Our results are consistent with previous reports that commensal oral isolates of C. albicans are characterized by a reduced capacity to damage host cells [61]. To test whether this increased killing capacity of SC5314 was shared with other bloodstream isolates, we measured the phagocytosis and cell death rates for 12 bloodstream isolates originally described in Hirakawa and colleagues (2015) [19]. These results indicated a higher rate of phagocytosis for the bloodstream isolates ( Fig 4B). However, for most of the bloodstream isolates, we also observed decreased capacity for macrophage killing (Fig 4C). The bloodstream isolate P94015, which was identified by Hirakawa and colleagues as having a loss of function of the transcription factor EFG1, behaved similarly to the other bloodstream isolates in these assays, despite having a decrease in virulence in their animal model [19]. These results indicate that the ability to kill host cells is not conserved across bloodstream isolates.
Recently, we showed that C. albicans mutants that filament in serum are not always filamentous within macrophages [62]. As filamentation is linked, but not required, for inflammasome activation within host phagocytes [62][63][64], and clinical isolates show variability in induction of host inflammatory responses [65,66], we examined the morphology of the commensal isolates after incubation for 4 h with macrophages. We observed considerable variation in the extent of filamentation among the natural isolates (Fig 4D), including strains that completely failed to filament and those that filamented more than the SC5314 reference strain. The aggregative strains, 882-60, 882-46, and 811-7 failed to filament during incubation with macrophages. Notably, the extent of filamentation did not correlate with colony morphology or invasion on agar, with many strains showing invasion into agar but no filamentation inside the macrophage, and vice versa (S5 Fig). Additionally, oral and fecal isolates both demonstrated defects in filamentation in macrophages, and filamentation in macrophages was not predictive of the phagocytic rate or cell death rate.
Using individual phenotypic measures, we were unable to identify associations between strains based on body site or donor. However, it is possible that the combined phenotypic and genotypic profile would identify clusters of strains with similar distinct phenotypes or reveal connections between isolates. Therefore, we turned to UMAP embedding, which will plot strains with similar phenotypes closer together and strains with dissimilar phenotypes farther apart based on the cosign metric. We did not observe any obvious clusters that segregated by isolation site, clade, or participant (S6 Fig). In sum, all of the commensal isolates showed extensive phenotypic variation, but this was not dependent on the body site or participant from which they were collected.

Limited diversity exploitation
Genome-wide association studies have been a powerful tool for identifying the genetic basis of variation in phenotypes of interest in humans and other recombining species. However, the generally clonal and asexual reproduction of C. albicans creates a population structure that confounds traditional GWAS methods. By sampling multiple isolates from each individual, we were able to obtain phenotypically diverse strains with a limited set of unique SNPs between isolates, allowing us to identify causative variants associated with a particular phenotype.
We focused on the 6 strains from donor 814 included in the condensed set, which clustered tightly in Clade 1; we hypothesized this would allow us to identify causative variants associated with particular phenotypes that were divergent between strains. Our agar invasion analysis revealed that isolate 814-168 demonstrated hyper invasion into Spider agar at 37˚C, whereas the other 5 isolates from the condensed set were less invasive (Fig 5A). Moreover, from this donor's 163 total isolates (144 oral isolates and 19 fecal isolates), only this single isolate exhibited the hyper invasive phenotype into Spider agar at 37˚C (Fig 5B); this phenotype was the motivation for initially including this strain in the condensed set.
Variant analysis identified 12 genes with unique SNVs in the 814-168 strain compared with the other 5 sequenced 814 strains, including a heterozygous adenine to thymine SNV in the transcription factor Zms1, resulting in a change in amino acid 681 from a threonine to a serine. Moreover, co-expression analysis [67] of ZMS1 revealed that it is highly correlated with genes involved in regulating the yeast-to-hyphal morphogenic transition (Fig 5C). To test whether this SNV can drive invasion, we generated complementation plasmids encoding each of the alleles from the 2 strain backgrounds and performed allele swap experiments between the high and low invasion strains. In the minimally invasive 814-183 background, replacing 1 copy of the endogenous ZMS1 allele with the ZMS1-T681S allele was sufficient to drive hyper invasion into both YPD and Spider agar (Fig 5D). Similarly, replacing the invasive ZMS1 allele with the ZMS1-S681T allele was sufficient to reduce invasion into both YPD and Spider agar (Fig 5D).
Previous work on the function of Zms1 via deletion mutant analysis had not revealed a phenotype [68]; however, this was in the SC5314 genetic background and the impact of a specific transcription factor on a given phenotype can vary depending on the strain [18]. Therefore, we deleted ZMS1 from both 814 backgrounds and tested the strains for invasion and filamentation. On YPD agar, as before, deletion of ZMS1 in both genetic backgrounds had minimal effects, with the mutant strains behaving similarly to their parent strains (Fig 5D). However, on Spider agar, ZMS1 deletion changed the colony morphology from wrinkly to smooth in the 814-168 background, although it did not decrease overall invasion. In contrast, deletion of ZMS1 increased invasion in the otherwise minimally invasive 814-183 background, highlighting the differential impact of ZMS1 mutation in the different genetic backgrounds (Fig 5D). Our results demonstrate that a single SNV changing amino acid 681 to a serine is a dominant

PLOS BIOLOGY
negative allele that is sufficient to drive a hyphal invasion program into Spider agar. We also identified natural variation that was distinct from deletion phenotypes. This approach, which we have termed "limited diversity exploitation," highlights how deep phenotypic analysis of a limited set of natural isolates from a single host can be exploited to identify causative variants and identify new functions for under-characterized genes.

Virulence
We next examined the fitness and virulence of the commensal isolates relative to the SC5314 reference strain; we hypothesized that the commensal isolates would have decreased virulence compared to SC5314, a clinical isolate. To test this, we turned to the Galleria mellonella model of systemic infection as this insect model is significantly correlated with the murine systemic infection model [19,69,70]. We first examined competitive fitness by infecting with a 1:1 mixture of fluorescently marked SC5314 and each sequenced commensal isolate [71]. After 3 days, the worms were homogenized and CFUs were plated to determine the competitive index, calculated as the log2 ratio of fluorescent to nonfluorescent colonies [37]. When comparing the marked and unmarked SC5314 strains, we obtained a competitive index of 0, indicating that both strains are equally fit and that the fluorescence does not impose a fitness cost. In contrast, most commensal strains had a competitive index >2, suggesting that these strains have a competitive advantage over SC5314, even during systemic infection ( Fig 6A). Notably, the 3 strains previously identified from Clade C, 882-60, 882-46, and 811-7, consistently demonstrated a competitive index of less than 0, indicating these strains are less fit than SC5314 in this model of infection. This is consistent with the aggregation exhibited by these strains in many growth conditions (Fig 4A).
The striking increased competitive fitness of the other isolates motivated us to test whether this increased ratio was correlated with increased disease. Here, we examined the survival of G. mellonella after performing monotypic infections. We started by using the 6 isolates from donor 814 to test the hypothesis that increased invasion is associated with increased disease. Three of the 6 isolates from 814 had increased competitive fitness compared to SC5314. However, the majority of the strains from donor 814 were not significantly different from the SC5314 reference strain (p > 0.05, log-rank test), including the hyper-invasive 814-168 isolate ( Fig 6B). Isolates 814-42 and 814-45 had a slight defect in virulence (p < 0.05 log-rank test). We additionally tested 2 other clusters of strains for their ability to cause systemic disease in the insect model. However, the human commensal isolates were again not significantly decreased in virulence from the SC5314 reference strain (S7 Fig), despite the increased ability of the SC5314 strain to cause host immune cell death (Fig 4C). Together, this suggests that there is not a consistent selection for avirulent behavior in the commensal human isolates in this model of systemic infection. Although we observed wide variation in in vitro host response phenotypes, the isolates generally retained their pathogenic potential, indicating that our in vitro assays may not capture the complex stresses experienced during whole-organism infections.

Discussion
Intraspecies analyses of microbial strains can allow for the identification of variants that are associated with specific clinical outcomes, as demonstrated widely in bacterial virulence [72,73] and recently in the human fungal pathogen Cryptococcus neoformans [74,75]. Understanding the differences between strains can allow for insights into mechanisms of colonization and pathogenesis [18]. However, assessing the underlying genetic variation responsible for differences in virulence in C. albicans is challenging in the absence of clear candidate genes because C. albicans is a primarily clonal yeast that does not generally undergo meiosis [76,77]. Additionally, the genetic basis underlying the ability of commensal strains of C. albicans to transition to pathogenic behavior is not well defined. Moreover, C. albicans exhibits high rates of structural mutation and ploidy variation [43,78] as well as high heterozygosity [25]. Previous descriptions of population genomic variation in C. albicans have largely relied on short-read sequencing [25] or molecular typing methods [28] at relatively few loci to define genetic variation in the species. Here, we have greatly expanded the number of available long-read genomes for C. albicans and have generated a catalog of structural variants that was largely absent from previous descriptions of natural diversity. We were able to leverage clonal variation within a Competition assays in G. mellonella demonstrate increased fitness of many commensal isolates compared to SC5314 reference. Isolates were competed against a fluorescent SC5314 isolate, starting at a 1:1 initial inoculum. Competitive fitness was calculated as the ratio between fluorescent and nonfluorescent colonies, normalized to the inoculum, and log2 transformed. Black data points indicate an oral isolate and pink data points indicate a fecal isolate. Significant differences from the SC5314 reference strain were determined by one-way ANOVA, with Dunnett's multiple correction testing. (B) Survival assays in G. mellonella, comparing the SC5314 reference to 6 isolates from donor 814. Each strain was standardized to 2 × 10 6 cells/mL before inoculating 20 G. mellonella larvae per strain with 50 μL of prepared inoculum. Larvae were monitored daily for survival. Statistical differences were determined using a Mantel-Cox log-rank test. Asterisks indicate P < 0.05 (*), P < 0.005 (**), P < 0.001 (***), and P < 0.0001 (****) compared with SC5314.
https://doi.org/10.1371/journal.pbio.3001822.g006 single donor to identify variation within a host, potentially including variation that arose during colonization. Overall, we performed a systematic phenotypic analysis of commensal isolates from healthy donors, thus allowing us to examine C. albicans genotypic and phenotypic diversity.
Strikingly, we observed higher competitive fitness of the commensal strains during systemic competition assays, and when we tested 15 isolates for their ability to cause disease, we saw similar kinetics of mortality. This suggests that our commensal strains generally maintained their capacity to cause disease in our Galleria model of infection, despite SC5314 being able to cause more cell death in a macrophage model of infection for all but 7 commensal isolates. Importantly, other bloodstream isolates were also unable to cause host cell death. Additionally, our commensal strains were still able to filament in response to the inducing cues of 10% serum or high temperature, in contrast to work demonstrating that strains passaged through the mammalian gut resulted in a decrease in systemic virulence and defects during in vitro filamentation assays [36]. Potentially, the Galleria model does not recapitulate the disease process that would be caused by these isolates in a mammalian host. Although we observed significant variation in the ability of strains to invade into agar, recent intravital imaging approaches suggest that filamentation in response to serum matched that seen in vivo more than filamentation in response to solid Spider agar [79,80], although this may not recapitulate the environment seen in the gut. Potentially, the selective pressures that occur during mouse models of colonization may not recapitulate the selection that occurs during human colonization.
Although we were not able to associate a particular phenotype with increased systemic disease, we observed extensive phenotypic and genotypic variation between the commensal isolates. This variability is also consistent with recent work from clinical, disease-associated strains [19]. Moreover, our commensal strains were able to proliferate on a range of different environmental conditions, and we did not observe significant differences in phenotypes between isolates obtained from oral or fecal sites, except for a slight increase in invasion into Spider medium for oral samples. This work highlights the striking ability of C. albicans to adapt to a wide range of environmental conditions and indicates that colonization at a specific body site does not necessarily predict pathogenic potential.
Previous work has suggested that the C. albicans population within a given individual is clonal [71,[81][82][83][84] and that the fungus is acquired during birth as a part of the normal microbiota [44]. In these studies, samples often came from patients with active disease; this may suggest that there is selection for the ability to cause disease, resulting in repeated isolation of representative samples of a clonal population. In contrast, we identified disparate individuals that appeared to be colonized by strains that were nearly identical, suggesting that there was some transmission between individuals. Whether these transmission events allow for longterm colonization, and how they affect the initial C. albicans colonizing strains, is still not fully understood. An important limitation of our study is that the undergraduate student donors were anonymous; therefore, we are unable to provide specific demographic information including age, sex, immune status, and recent antibiotic use that may allow us to expand our conclusions.
We also observed diversification within hosts, with multiple instances of closely related strains showing variation in phenotypes, such as invasion into agar. In one representative example, we were able to use comparative genome analysis coupled with co-expression, which we term "limited diversity exploitation," to identify a candidate transcription factor that regulates invasion. Previous work on a ZMS1 knockout strain of C. albicans did not show any differences in phenotype compared with the parent strain [68]. However, we observed that a single amino acid substitution in the predicted fungal transcription factor regulatory middle homology region was sufficient to drive hyper-invasive growth. Moreover, this phenotype was distinct from the deletion phenotypes in these 2 genetic backgrounds, again contrasting with the SC5314 reference strain. It is likely that populations of colonizing C. albicans will vary in other important clinically relevant traits, including filamentation in macrophages or intrinsic drug tolerance and resistance, and our results suggest that studying more than a single representative isolate gives opportunities to discover new biology.

Human subjects and sample collection
Study volunteers were recruited through the Authentic Research Sections of the introductory biology laboratory course at the University of Michigan (BIO173). All participants provided written, informed consent for sample collection as well as isolation and characterization of microbes from feces. This study was approved by the Institutional Review Board of the University of Michigan Medical School (HUM00094242 and HUM00118951) and was conducted in compliance with the Helsinki Declaration. Oral samples were obtained by self-administered cotton swabbing, and fecal samples were collected and diluted in PBS+DMSO before plating; in each case, BD ChromAgar was used to identify C. albicans colonies. Individual C. albicans colonies were picked from all plates with a toothpick into 100 μL of YPD in a 96-well plate. The cultures were then inoculated onto BD Chromagar to differentiate between yeast species and ITS regions were amplified to confirm C. albicans strains. The C. albicans colonies were individually inoculated into fresh 96-well plate in YPD and after 24 h of incubation at 30˚C, 50% glycerol was added to generate the stock plates. For each oral sample, individual C. albicans colonies were picked from the CHROMagar plates and incubated overnight in 100 μL of YPD in a 96-well plate. After 24 h incubation at 30˚C, 50% glycerol was added to the 96 wells to generate the stock plates. All strains were maintained in −80˚C cryoculture.

Reference strains
SC5314 reference strain was obtained from the gene replacement and conditional expression (GRACE) library [85]. CHN1 human isolate was obtained from the Huffnagle laboratory group [48].

Media and culture conditions
Media for growth curves is described in S1 Table.

Growth curve analysis
Overnight cultures of each C. albicans isolate were grown in 200 μL of YPD in 96-well plates at 30˚C, then subcultured into fresh media using a sterile pinner. Growth curves were performed on a BioTek 800 TS Absorbance Reader in the indicated conditions for 24 h, without shaking. For the glycerol and galactose growth curves, the strains were incubated for 48 h due to the slower growth. Maximum growth rate and carrying capacity were determined using the GrowthcurveR analysis package [86]. A minimum of 2 biological replicates were performed per isolate per growth condition. All media and conditions are listed in S1 Table. Summary statistics from GrowthcurveR for the entire collection of isolates are included in S2 Table, raw data in S1 Information. Summary statistics from GrowthcurveR for the condensed set in multiple environmental conditions are included in S5 Table, raw data in S1 Information.

Filamentation analysis
Overnight cultures of each C. albicans isolate were grown in 100 μL of YPD in 96-well plates at 30˚C, then subcultured into 100 μL of YPD at either 30˚C or 42˚C for 3 h. After 3 h, the plates were imaged on an Olympus IX80 microscope at 20× magnification. To test the response to serum, overnight cultures were subcultured into 5 mL of YPD with 10% serum and rotated in a 37˚C incubator for 3 h before vortexing and imaging on glass slides at 20× magnification.

Agar invasion methods
Isolates were taken from frozen glycerol stocks, incubated overnight in yeast extract peptone dextrose (YPD) in 96-well plates and inoculated onto solid YPD and Spider media using a 96-pin replicator tool (Singer Instruments). Plates were then incubated at the indicated temperature and oxygen conditions for 5 to 7 days. Colonies on plates were first imaged on a Biorad Gel Doc XR. Colonies were then gently washed off with deionized water. Plates were imaged again to capture invasion images. The invasiveness of each isolate was determined through a rubric scale of 0 to 5, as indicated in Fig 1. A 0 indicates no invasion, 1 to 2 indicates minimal invasion, a 3 indicates distinct circular hyphal invasion, while a 4 indicates an even larger circular hyphal invasion. A 5 indicates the most aggressive invasion, with a "halo" of more hyphae surrounding the initial growth. A minimum of 3 biological replicates was performed for each isolate and the median score was determined. A summary of invasion scores is in S2 Table, raw data in S1 Information.

High molecular weight genomic DNA extraction and sequencing
Genomic DNA was extracted from the 45 commensal isolates, SC5314, and CHN1. High molecular weight (>20 kb fragment) genomic DNA was prepared using the Qiagen MagAttract HMW DNA Kit (Qiagen 67563), following the manufacturer's protocol, with minor modification. Fungal cell walls were first digested using Zymolyase (Zymo E1005) before DNA extraction. TELL-seq [49] libraries were prepared at the University of Michigan Advanced Genomics facility. Libraries were sequenced on an Illumina NovaSeq SP 300 cycle flow cell. All sequences are uploaded to NCBI SRA at PRJNA875200.

Sequence variation analysis
To identify SNVs across all published isolates of C. albicans and the newly sequenced commensal isolates, we mapped 435 read libraries to the SC5314 reference genome with BWA--MEM [50]. We used various samtools utilities to convert alignment files, remove PCR duplicates, and assign read groups [50]. Variants were called with GATK HaplotypeCaller and individual sample GVCFs were combined with GATK CombineGVCFs. Finally, we genotyped our combined GVCF across samples and loci with GATK GenotypeGVCFs [87]. GATK identified 851,355 total variable loci. Of the 741,027 diallelic loci, GATK identified 60,626 indelderived alleles (8.18%) and 680,401 SNV alleles (91.82%). To refine our raw variant set to only include high-confidence SNVs, we removed low-quality variant calls at the (i) individual genotype call; (ii) whole locus; and (iii) whole sample levels. First, we removed individual sample genotype calls if their genotype quality (i.e., GQ) was <0.99 (out of 1.0) or the results of the map quality rank sum test (i.e., MQRS) differed from 0.0 (i.e., the perfect score). Second, we removed all variant calls with more than 1 alternative allele (i.e., not diallelic), an alternate allele longer than 1 nucleotide (i.e., indels), or with >5% missing genotype calls across samples. Finally, we removed entire samples from the combined VCF if their genotype calls across all loci were more than 90% missing, which resulted in the removal of 3 entire samples: SRR6669899, SRR6669970, and SRR1103579. Filtering our variant set in this way yielded a final set of 112,136 high-quality SNVs across 431 remaining samples. Of these, 90,675 (80.86%) were represented in at least 1 member of our set of newly sequenced strains.
To remove redundancy and focus on natural C. albicans diversification, we removed additional samples corresponding to resequenced strains (e.g., multiple SC5314 samples present in full data set) and those sequenced as part of experimental evolution studies (i.e., [36,52,53]). Following removal of these samples, we were left with 324 sample SNV profiles (280 published and 44 new genome strains) (S3 Table). To generate an input matrix for distance calculation and NJ clustering, we coded genotypes as homozygous for the reference allele I, heterozygous (H), homozygous for the alternative allele (A), or missing (N). Raw distance between 2 sample genotypes at a particular locus (i.e, GT i and GT j ) were calculated as 0.5 per alternative allele, with respect to the SC5314 reference such that D(H, A) = 0.5, D(R, A) = 1.0, etc. Missing genotypes (i.e., coded as N) were ignored. Using these raw distances, we calculated the pairwise distance between all sample pairs across all 112,136 loci according to the generalized distance function for sampIe i and sampIe j.
Where nSNPs is the total number of high-quality SNPs in our data set (i.e., 112,136), nN is the number of uncalled genotypes between each pair (i.e., coded as N), GT ix is the genotype of sample i at locus x, and GT jx is genotype of sample j at locus x. The resulting distance matrix was clustered with the nj function in phytools in R [88]. We visualized the resulting tree structure and associated data with ggtree in R [89][90][91].

Bootstrapping
One hundred bootstrap replicates were generated by randomly sampling rows from the SNP

Structural variant analysis from TELL-seq
To identify structural variation genome wide in the condensed set of C. albicans isolates, we mapped the read libraries to the SC5314 reference genome with BWA-MEM [50]. We visualized the read depth across the genome by breaking the reference genome into 500 bp bins. Each read was assigned to a single bin based on the first coordinate in the reference genome to which the read aligned. The read counts were normalized across the samples by the average read coverage for each sample. The value for each bin (v b;s ) for each sample was calculated using the following formula: v b;s ¼ bin read count * read length bin size * sample coverage Additionally, we performed structural variant calling using lumpy-sv [92] and genotyping with svtyper [93] using smoove (version 0.2.5). We visualized the variant calls by breaking the reference genome into 500 bp bins and identified the number of variant calls of a give type that overlapped with each bin.
To identify potential translocations from the TELL-seq data, we used the Tell-Link pipeline to assemble the read libraries into contigs and aligned the assembled contigs to the SC5314 reference genome using MiniMap2 [94]. We identified candidate junctions where consecutive aligned segments longer than 10,000 bp on the same contig aligned to different chromosomes. To experimentally validate the predicted junctions, we designed primers from uniquely mapped contig segments on either side of the junction, i.e., one from each of the 2 chromosomes predicted to be joined together. Primers were selected from the 500 bp directly upstream and downstream from the predicted junction location unless the junction was directly flanked by non-uniquely mapping segments in which case, the 500 bp of the closest uniquely mapping segments were used. Primers are in S6 Table. CHEF gel electrophoresis Agarose plugs for CHEF gel electrophoresis were prepared as in Selmecki 2005 [95]. Electrophoresis was performed as in Chibana [96], with minor modifications to improve chromosome separation. Plugs were run on 0.9% megabase agarose (Biorad 1613108), with run conditions of 60 to 300 s, 4.5 V/cm, 120 angle for 36 h, followed by 720 to 900 s, 2.0 V/cm, 106 for 24 h. Sizes were determined based on approximations from Selmecki and a CHEF DNA Size Marker, 0.2-2.2 Mb, S. cerevisiae Ladder (Biorad 1703605).

Biofilm formation
Biofilm formation was assessed as previously described [58]. Briefly, overnights of C. albicans isolates were incubated and diluted to 0.5 OD 600 . In a 96-well plate, isolates were added to 200 μL of RPMI media, covered with a Breathe-Easy film, and incubated at 37˚C shaking at 250 rpm. After 90 min, RPMI media was removed and wells were washed once with 200 μL of 1× PBS and 200 μL of fresh RPMI media was added. Plates were covered in a Breathe-Easy film and left to incubate at 37˚C with shaking at 250 rpm for 24 h before a final read on the plate reader at OD 600 .

Phagocytosis
Phagocytosis assays were performed as previously described [60]. Briefly, iBMDM macrophages were prepared for infection in RPMI media containing 3% FBS, diluted to 3 × 10 6 cells/ mL and incubated overnight at 100 μL/well in a 96-well plate. Overnight cultures of C. albicans isolates were incubated and diluted to 4 × 10 6 cells/mL into RPMI media with 3% FBS and used to infect macrophages. Inoculated plates were centrifuged for 1 min at 1,000 rpm to synchronize. After 30 min, media was removed, and cells were fixed with 4% paraformaldehyde (PFA) for 10 min. The cells were washed 3 times in 1× PBS. Cells were then stained with 50 μL of FITC-Concanavalin (5 μg/mL Sigma-Aldrich C7642) at room temperature, rocking for 30 min, wrapped in foil. The plates were then washed 3× with 100 μL of 1× PBS, and then 50 μL of 0.05% Triton-X100 was added to permeabilize the cells. Cells were then washed 3× with 1× PBS and a final stain of 50 μL of calcofluor white (100 μg/mL, Sigma-Aldrich C7642) was added to cells to incubate for 10 min. The cells were then washed 3× with 100 μL of 1× PBS and then maintained in 100 μL of 1× PBS at 4˚C before imaging on the microscope at 20× magnification using the DIC, FITC, and DAPI channels. Images were analyzed with a CellProfiler pipeline to determine the percentage of internalized Candida. The total number of Candida was determined through calcofluor white staining. Next, the number of external Candida was determined through FITC-ConA staining. (1 -external cells)/total cells = % of internalized Candida.

Macrophage filamentation assay
To assess filamentation of isolates in macrophages, iBMDM macrophages were prepared for infection in RPMI media containing 3% FBS, diluted to 3 × 10 6 cells/mL and incubated overnight at 100 μL/well in a 96-well plate. Overnights of C. albicans isolates were incubated and diluted to 4 × 10 6 cells/mL into RPMI media with 3% FBS and used to infect macrophages at an MOI of approximately 1:1. The inoculated plate was centrifuged for 1 min at 1,000 rpm to synchronize phagocytosis. After 2 h, the media was removed, and cells were fixed with 4% PFA for 10 min, permeabilized with 0.05% Triton-X100, and stained with 50 μL of calcofluor white (100 ug/mL, Sigma-Aldrich C7642) before imaging at 40× magnification using the DIC and DAPI channels. The extent of filamentation for each isolate was scored on a rubric from 0 to 5. A score of 0 indicates an isolate that exhibited only yeast morphology with no filamentation during macrophage infection. A score of 1 indicates an isolate that remains primarily in the yeast form, with some hyphae or pseudohyphae, while a score of 2 indicates an isolate that remains primarily in the yeast form, with more hyphae and pseudohyphae present than a score of 1. A score of 3 indicates an isolate that had approximately equal numbers of yeast and true hyphae. A score of 4 indicates an isolate with more true hyphae than yeast during infection, and a score of 5 indicates an isolate in which the majority of cells formed hyphae with very few yeast remaining. A minimum of 3 biological replicates was performed for each isolate and the median score was determined. A summary of macrophage filamentation scores is in S5 Table, raw data in S1 Information.

Cell death assay
To assess macrophage cell death, J2-iBMDM macrophages were prepared for infection in RPMI media containing 3% FBS, diluted to 4 × 10 5 cells/mL and incubated overnight at 100 μL/well in a 96-well plate. Macrophages were primed with 200 ng/mL LPS in RPMI media with 3% FBS for 2 h. Overnights of C. albicans isolates were incubated and diluted to 5 × 10 5 cells/mL into RPMI media with 3% FBS and used to infect macrophages at an MOI of approximately 1:1. The plate was centrifuged for 1 min at 1,000 rpm to synchronize phagocytosis. After 4 h, the media was removed and cells were stained with Hoechst (20 μM, Cayman Chemical 33342) to mark nuclei and propidium iodide (1 μg/mL, Acros Organics 440300250) to mark cell death. Plates were then imaged on an Cellomics ArrayScanVTI Objective Module at 20× with at least 5 images taken per well. Images were analyzed with a CellProfiler pipeline (available by request) to determine the percent of dead macrophages. Macrophages were identified through Hoescht nuclear staining to determine the total number of cells in each field of view. Next, cell death events were determined based on propidium iodide staining. (# of propidium iodide events)/(# of total cells) = % of dead macrophages.

ZMS1 strain construction
To make the ZMS1 allele swap strains, we used the NEBuilder HiFi assembly kit to clone the endogenous ZMS1S681T allele from the 814-183 strain background and the ZMS1T681S allele from the 814-168 strain background into the pUC19 cloning vector, along with the nourseothricin resistance cassette, to generate plasmids pTO274 and pTO192. ZMS1 and 500 bp of putative terminator was amplified using oTO771+oTO734. NAT was amplified using oTO18 + oTO735. pUC19 was amplified using oTO590 + oTO591. The plasmids were Sanger sequenced to confirm the presence of the specific ZMS1 allele and the absence of additional mutations using oTO736. These plasmids were transformed into the 814-168 and 814-183 strain backgrounds using a transient CRISPR approach [97]. NAT-resistant transformants were tested for the presence of specific ZMS1 alleles by Sanger sequencing using oTO736.
To generate the deletion strains, the ZMS1::NAT cassette was amplified from the NAT flipper plasmid [98] using primers oTO1215 and oTO1216, and transformed into the 814-168 and 814-183 strain backgrounds using a transient CRISPR approach [97]. Integration was tested using oTO5 and oTO1218 and loss of the wild-type ZMS1 gene was tested using oTO736 and oTO1218.

Galleria mellonella infections
Each isolate was examined for competitive fitness in a one-to-one ratio with a fluorescent wild-type SC5314 C. albicans strain in G. mellonella larvae. Infections were performed as previously described [98]. Briefly, G. mellonella larvae were purchased from speedyworm.com and maintained in sawdust at room temperature. Overnights were prepared for each isolate and wild-type strain in YPD at 30˚C, with rotation.
To measure competitive fitness, a 1:1 ratio of fluorescent wild type to unlabeled isolate was prepared in 1× PBS at 5 × 10 5 cells/mL. 10 larvae/strain were randomly chosen and infected via the last right proleg with 50 μL of the 1:1 inoculum using an exel veterinary U-40 diabetic syringe (0.5CC × 29G × ½). After injection, larvae were maintained at room temperature for 3 days before harvesting using a Benchmark D1000 homogenizer. Larvae were homogenized in 0.5 mL of 1× PBS, diluted 1:10 in PBS, and plated onto YPD plates containing gentamicin, ampicillin, and ciprofloxacin. Plates were left at 30˚C for 48 h before imaging on a Typhoon FLA 9500 biomolecular Imager. The ratio of fluorescent to unmarked strains was compared with the inoculum to determine competitive index.
Supporting information S1  Red arrows indicate large chromosome banding patterns that differ from the SC5314 reference strain. Chromosome 5 showed especially extensive variability in size between isolates. (B) Heatmaps of read coverage across the condensed set of isolates for each chromosome. Each column represents a 500 bp bin of the reference genome and each row is an isolate from the condensed set. Values greater than 1 (red) suggest potential duplications while values less than 1 (blue) suggest potential deletions. Each row represents an isolate, with SC5314 represented by the top row. The rows are ordered, top to bottom, in the phylogenetic order used in Fig 3. (C) Heatmaps of Lumpy Structural variation calls for the condensed set of isolates. Each column represents a 500 bp bin of the reference genome and each row of the heat map for a given variant class is an isolate from the condensed set. The value for each bin indicates the number of SV calls overlapping the 500 bp window. (TIF)

S4 Fig. Biofilm formation.
Each condensed set isolate was tested for its ability to form a biofilm on a plastic surface. Isolate labels were colored based on the nearest defined cluster from Nonlinear embedding dimensionality reduction was performed on the phenotypic data on the condensed set of isolates using UMAP. Clusters did not segregate by sample site, donor, or clade. Data used to generate the UMAP are included in S5 Table, raw data in S1 Information. Clade information is indicated by color. Pink or black outlines indicate fecal of oral isolate, respectively. (TIF) S7 Fig. Virulence in G. mellonella. (A) Survival assays in G. mellonella, comparing the SC5314 reference to 4 isolates from donors 882 and 811. (B) Survival assays in G. mellonella, comparing the SC5314 reference to 5 isolates from donors 833 and 838. Each strain was standardized to 2 × 10 6 cells/mL before inoculating 20 G. mellonella larvae per strain with 50 μL of prepared inoculum. Larvae were monitored daily for survival. Statistical differences were determined using a Mantel-Cox log-rank test. ** indicates P-value < 0.01, * indicates Pvalue < 0.05. (TIF) S1 Information. Raw data underlying figures.