Comparative genomics of bacteria from amphibian skin associated with inhibition of an amphibian fungal pathogen, Batrachochytrium dendrobatidis

Chytridiomycosis, caused by Batrachochytrium dendrobatidis (Bd), is a skin disease associated with worldwide amphibian declines. Symbiotic microbes living on amphibian skin interact with Bd and may alter infection outcomes. We completed whole genome sequencing of 40 bacterial isolates cultured from the skin of four amphibian species in the Eastern US. Each isolate was tested in vitro for the ability to inhibit Bd growth. The aim of this study was to identify genomic differences among the isolates and generate hypotheses about the genomic underpinnings of Bd growth inhibition. We identified sixty-five gene families that were present in all 40 isolates. Screening for common biosynthetic gene clusters revealed that this set of isolates contained a wide variety of clusters; the two most abundant clusters with potential antifungal activity were siderophores (N=17 isolates) and Type III polyketide synthases (N=22 isolates). We then examined various subsets of the 22 isolates in the phylum Proteobacteria for genes encoding specific compounds that may inhibit fungal growth, including chitinase and violacein. We identified differences in Agrobacterium and Sphingomonas isolates in the chitinase genes that showed some association with anti-Bd activity, as well as variation in the violacein genes in the Janthinobacterium isolates. Using a comparative genomics approach, we generated several testable hypotheses about differences among bacterial isolates from amphibian skin communities that could contribute to variation in the ability to inhibit Bd growth. Further work is necessary to explore and uncover the various mechanisms utilized by amphibian skin bacterial isolates to inhibit Bd.

: Summary of whole genome bacterial isolates.
Bd inhibition values were obtained from in vitro challenge assays .
The inhibition values refer to the mean inhibition. 100 = complete Bd inhibition, 0 = no effect, <0 = facilitation of Bd growth. 34 Table 2: Summary of BUSCO databases used to assess quality of genome assemblies. 35 Table 3: Summary of gene families identified by PIRATE found in all members of each subset. 36 Table 4: Summary of shared genes within biosynthetic gene clusters within the four Bacillus isolates. 36 Table 5: Summary of shared genes within biosynthetic gene clusters within the five Pedobacter isolates. 37 Table 6: Summary of shared genes within biosynthetic gene clusters within the three Microbacterium isolates. 37 Table 7: Summary of shared genes within biosynthetic gene clusters within the five Sphingomonas isolates 38

Introduction
In 1977, one of the first DNA sequencing methods, Sanger sequencing, was developed (Sanger, Nicklen and Coulsen,1977). Using this method Sanger and his team were able to sequence the genome of bacteriophage φX174, which was 5386 bases long, and revolutionize biology in the process . Now, less than 50 years later, the largest genome sequenced is the Australian lungfish, with a genome of more than 43 billion bases (Meyer et al.,2021). The lungfish genome was sequenced using a combination of newer "next-generation" sequencing platforms (Nanopore and Illumina) that allow for rapid production of terabytes of sequencing data. New computational approaches have accompanied these technological advances in DNA and RNA sequencing, and together, they have led to an "omics" revolution.
Many new biological insights have resulted from these advances in recent decades, including in the fields of ecology and conservation. For example, DNA metabarcoding has expanded our ability to assess and define ecological communities, and the use of environmental DNA surveys have allowed us to identify habitats that should be prioritized for conserving endangered species (Beng et al.,2020). These approaches can also be used to identify "hidden" species interactions, for instance, in pollinator networks (Pornon et al.,2017), and can be especially useful in systems that are difficult to sample (e.g. soils, Deceans,2020), or for systems where cryptic diversity may be common (e.g. reef ecosystems, . In microbial ecology, next-generation sequencing has provided an opportunity to assess microbial communities with much higher resolution and to a much greater depth than was previously possible with culture-based methods. The primary methods used to assess microbial communities using next-generation sequencing are amplicon sequencing and metagenomics. For amplicon sequencing, a small segment of DNA is amplified, typically 200-300 base pairs from the fungal ITS or bacterial 16S region, for each sample. These are amplified with barcoded primers so they can be massively multiplexed, and the resulting data provide relative abundance estimates of microbial communities. Metagenomes are produced by cutting up all the DNA in a sample into 300-500 bp fragments, sequencing it, and then reconstructing genomes of all of the organisms from the sample. As with amplicon sequencing, this can provide a relative abundance estimate of the various organisms in the sample (depending on sampling depth), but also provides further information, as the genes that are present can sometimes tell us about the functional roles of the organisms in the sample. For example, Candidatus Accumulibacter phosphatis is a bacterial species that was identified through metagenomic data (it was not cultured), but which, based on its genome, appears to play an important role in removing phosphate from wastewater (Martín et al.,2006).
While important advances have occurred via next-generation sequences, bacterial cultures can also be important, especially if we wish to use bacteria to address real-world issues. For instance, if we wanted to use Candidatus Accumulibacter phosphatis in wastewater treatment plants, we would need to be able to grow it in culture. As next-generation sequencing has resulted in much more rapid generation of data, the ability to produce whole bacterial genomes from isolates is also significantly easier now. Combining cultures with whole genome data can be a powerful approach for developing bacterial applications.
The same advances that have improved our ability to assess environmental microbial communities have also improved our knowledge of host-associated systems. The bacterial communities found on and within an organism are known as the microbiome (Berg et al.,2020). The gut microbiome can vary in size depending on the study system. The human gut microbiome contains trillions of different cells, honey bees on the other hand have eight characteristic bacterial species, and the fruit fly gut microbiome is composed of only two genera (Ursel et al.,2012;Engel et al.,2012;Wong et al.,2011). The microbiome can play an important role in pathogen resistance and maintaining homeostasis and can even impact host behavior Harris et al.,2009;Pearson-Leary,2019). Changes in the microbiome, known as dysbiosis, can have detrimental effects on the host (Maes et al.,2016;Haran and McCormick,2021). Furthering our understanding of the microbiome could lead to novel therapeutic approaches and improve conservation efforts.
Chytridiomycosis is an amphibian skin disease caused by Batrachochytrium dendrobatidis (Bd) and Batrachochytrium salamandrivorans . It has caused over 6% of the world's amphibian species to experience rapid population declines and in some cases, has caused species extinctions . This disease along with climate change, introduction of invasive species and habitat loss makes amphibian conservation a great concern (Stuart et al.,2004). Bd infects the skin of amphibians. The amphibian skin microbiome can impact the severity of Bd infection . The skin microbiome of amphibians shifts depending on life stage, seasonality and other factors Douglas et al.,2021). The interactions among bacteria within the microbiome can produce beneficial metabolites that are not produced when the bacteria are grown in pure culture . The lethality of chytridiomycosis has led to many studies addressing how Bd affects the amphibian skin microbiome, and studies investigating the role of the skin microbiome in prevention of Bd infection Flechas et al.,2018;Bates et al.,2018).
Beneficial bacteria from microbiomes can be isolated and used as probiotics to mitigate the impact of disease in various species, including amphibians, but also in bats and other species (Kueneman et al.,2016;Hoyt, et al.,2019;Harrison et al.,2020). Bacterial wilt is a disease that affects many plant species including tomatoes. For example bacterial wilt is caused by a variety of pathogens such as Ralstonia solanacearum (Schell,2000). The addition of a Flavobacterium species that was more abundant in wilt resistant plants to the root system of non-resistant plants increased their resistance to bacterial wilt (Kwak et al.,2018). Augmentation of the gut microbiome can be beneficial as well. The use of antibiotics in managed honey bee hives can be necessary to combat bacterial infection; however, antibiotic treatment can lead to reduced diversity of the gut microbiota. The addition of three beneficial Lactobacillus strains to the pollen puck fed to honey bee hives that were exposed to an antibiotic led to recovery of the gut microbiome, as well as improved immune function (Daisley et al.,2020). Probiotics can also have beneficial effects on humans. (Maity et al.,2020).
Understanding the genetics of these probiotic bacteria is important as it provides key information about the mechanisms of action, and could also lead to the identification of new probiotic candidates. Work by  highlighted the importance of understanding the genetics of host-associated bacteria by using a metagenomic approach to compare the skin microbiomes of frogs from a site with Bd present and a site where Bd was absent using a metagenomic approach. They found variation in the dominant members of the skin microbiome and differences in functional classes of genes that were present at the two sites. They found a higher abundance of genes related to secondary metabolism, cellular communication, antimicrobial drug resistance and membrane transport in the microbiome of frogs from the site where Bd was present. They also found several genes related to the production of secondary metabolites that inhibit Bd growth . Other studies have worked with bacterial cultures to isolate antifungal compounds, such as violacein and 2,4-diacetylphloroglucinol (Brucker et al.,2008a;2008b;Woodhams et al.,2018). By identifying the presence of the genes and pathways responsible for producing these compounds, bacteria can more readily be selected as possible probiotic candidates.
Two important things to consider when selecting probiotic candidates are their abilities to be cultured under laboratory conditions and understanding how the addition of the bacteria affects the microbiome. Most bacteria from the amphibian skin microbiome can be cultured .  investigated how the addition of a probiotic bacterium affects the skin microbiome of cricket frogs, Acris blanchardi. The frogs were inoculated with wild type Serratia marcescens or a knockout strain of S. marcescens that lacked a functional prodigiosin pathway, and then the frogs were exposed to Bd. They found that the addition of the knockout strain of S. marcescens lead to increased species richness in the skin microbiome but also lead to higher host mortality rates. The addition of wild type S. marcescens had no effect on species richness . Prodigiosin is known for having both antifungal and antimicrobial properties (Lapenda et al.,2015). The authors propose that the lack of prodigiosin allowed other bacteria to grow on the frogs that were inoculated with the knockout S. marcescens strain. The addition of a probiotic bacteria to the amphibian skin microbiome may inhibit the growth of other bacteria. The bacteria that are already present may produce compounds that inhibit Bd growth better than the introduced bacteria.
Sequencing whole bacterial genomes is important, as even closely related bacteria can vary in gene content, and thus function differently within a community (Martínez-Carranza et al.,2018). Bacteria have the ability to uptake DNA from their environment, which can contribute to this genomic variation (Sun,2018). Bacterial genomes can be divided into the core genome and the pangenome. The core genome consists of the genes that are present in all isolates of a taxon. The pangenome consists of genes that are taxon specific or are only found in a fraction of the members of a species (Tettelin et al.,2005). For example, bacteria belonging to the family Lachnospiraceae that were isolated from human fecal samples varied in genes related to sugar metabolism and butyrate production, which could have an impact on the overall health of the gut microbiome (Sorbara et al.,2020). A more extreme example is the genus Pseudomonas, which contains some species, such as Pseudomonas psychrotolerans CS51, which promotes plant growth, and Pseudomonas syringae, which is a known plant pathogen . Pseudomonas has a large pangenome. By sequencing whole genomes of bacterial isolates, their gene content can be examined, compared to other members of the species, and predictions can be made about their role in the environment.
For my thesis, I built on prior work based on bacteria that were isolated from the skin of four Virginia amphibian species and tested for their ability to inhibit Bd growth in an in vitro challenge assay . From those amphibian skin isolates, closely related sets of isolates that varied in their ability to inhibit Bd growth, were selected for whole genome sequencing. The goal of my thesis was to analyze these whole genome sequences, identify genomic differences in them and form hypotheses about these differences that could potentially explain the isolates' variation in Bd inhibition.
Chapter 1: Comparative genomics of bacteria from amphibian skin associated with inhibition of an amphibian fungal pathogen Batrachochytrium dendrobatidis

Abstract
Chytridiomycosis is a fungal skin disease in amphibians that is primarily caused by Batrachochytrium dendrobatidis (Bd). We analyzed whole genome sequences of 40 bacterial isolates that had been previously cultured from the skin of four amphibian species from Virginia, USA, and tested for their ability to inhibit Bd growth via an in vitro challenge assay. These 40 isolates spanned 11 families and 13 genera. The aim of this study was to identify genomic differences among the amphibian skin bacterial isolates and generate hypotheses about possible differences that could contribute to variation in their ability to inhibit the growth of Bd. We identified sixty-five gene families that were present in all 40 isolates. We also looked for the presence of biosynthetic gene clusters. While this set of isolates contained a wide variety of biosynthetic gene clusters, the two most abundant clusters with potential anti-fungal activity were siderophores (N=17) and Type III polyketide synthases (N=22). We then analyzed the isolates belonging to the phylum Proteobacteria. We identified 197 gene families that were present in all 22 Proteobacteria. We examined various subsets of the Proteobacteria for genes for specific compounds with known activity against fungi, including chitinase and violacein. We identified a difference in the number, as well as amino acid sequences, of predicted chitinases in two isolates belonging to the genus Agrobacterium that varied in their inhibition of Bd. After examining the annotated genomes, we identified a predicted chitinase in a Sphingomonas isolate that inhibited the growth of Bd that was absent from the five Sphingomonas isolates that did not inhibit Bd growth. The genes vioA, vioB, vioC, vioD and vioE are necessary to produce violacein, a compound which inhibits the growth of Bd. Differences in these genes were identified in three out of the four Janthinobacterium isolates. Of these three isolates, two showed strong inhibition of Bd growth, while the third inhibited Bd growth to a lesser extent. Using comparative genomics, we generated additional hypotheses about differences among bacterial isolates that could contribute to variation in their ability to inhibit Bd growth. Further work is necessary to experimentally test these hypotheses of how the amphibian skin bacterial isolates may inhibit Bd.

Introduction
Vertebrates serve as hosts for a wide range of microorganisms, collectively termed the "microbiome". These microorganisms form communities that inhabit many different body regions, including the gut, mouth, and skin (Jia et al.,2008;Wade,2013;Grice and Segre,2011). The microbiome plays a role in host health; however, many of the specific mechanisms that mediate these responses are unknown (Cénit et al.,2014). In many hosts, these microbial communities, appear to have a critical role in defending against pathogens, as many microbes line mucosal surfaces or the skin, where they can provide a critical first line of defense against pathogen entry (Findley and Grice,2014;Swaney and Kalan,2021).
Chytridiomycosis is a fungal skin disease in amphibians that was discovered in 1998 and is primarily caused by Batrachochytrium dendrobatidis, or Bd , although a second species was recently described in this genus (B. salamandrivorans) that can also cause disease in some amphibians . Bd has contributed to large-scale declines of amphibian populations across the globe (Skerratt et al.,2007;Lips,2016). More specifically, estimates suggest Bd has contributed to the decline of 6.5% of the described amphibian species, which is approximately 500 species. Of these, 18% are assumed to be extinct in the wild, while another 25% show declines in population sizes greater than 90% . Tropical species are particularly vulnerable due to small population sizes, limited habitat ranges, and the rapid spread of the disease (Lips, 2016).
Amphibian skin bacterial communities are of particular interest because microbes residing on the skin could prevent or limit Bd infection by producing anti-fungal compounds (Brucker et al, 2008a). The composition of these bacterial communities can sometimes predict host responses following exposure to Bd (Becker et al.,2015a;Walke et al.,2015a;Jani and Briggs,2014). These communities, as well the gene networks within them, can be altered by the presence of Bd . One example of a bacterium that produces anti-fungal compounds is Janthinobacterium lividum which has been isolated from red-backed salamanders and produces indole-3-carboxaldehyde, as well as violacein. These compounds can inhibit Bd growth in in vitro challenge assays (Brucker et al., 2008b). It is likely that many other compounds that could interact with Bd are also produced in these bacterial communities on amphibian skin.
Advances in whole genome sequencing have made it cheaper and easier than ever before to obtain genetic information from a wide variety of organisms. Prokaryote genomes are of interest due to the wide variety of bioactive secondary metabolites that have been isolated from them, which could have a wide array of applications (Franke et al.,2012;Lautru et al.,2005). The use of bioinformatic tools allow scientists to narrow the search for genes that encode for novel bioactive secondary metabolites. Development of bioinformatic tools allow scientists to more easily assemble and compare genomes of species to determine their evolutionary relationships, as well as to examine the genetic basis for phenotypic differences. For example, the development of tools, such as antiSMASH (Blin et al.,2019) has allowed for the prediction of biosynthetic gene clusters from genomic sequence data. Tools like this allow for the discovery of novel compounds from metagenomes without having to assemble whole genomes for each member of the community.
In prokaryotes, genes related to the synthesis of secondary metabolites are often located near each other in regions described as biosynthetic gene clusters (BGC) (Blin et al., 2019). These gene clusters have been linked to the production of antibiotics, siderophores, and other medically relevant drugs (Newman and Cragg, 2012). The bacterial genus Streptomyces is one of the most commonly used in genome mining, as many useful natural products have been isolated from Streptomyces species. Some examples of these compounds include the antibiotic erythromycin, the anti-fungal nystatin and the anti-cancer drug saframycin (Quinn et al.,2020). While many bacteria collected from amphibian skin have anti-fungal properties against Bd in vitro , less is known about the diversity of anti-fungal pathways in these communities of amphibian skin bacteria, or whether common genes tend to contribute to Bd inhibition across these bacterial taxa.
Metagenome studies have been done to try and identify mechanisms of Bd inhibition.  used shotgun metagenome sequencing of DNA samples from skin swabs of frogs from a site where Bd was endemic and a site that was naive to Bd. They found gene classes involved in biosynthesis of secondary metabolites, cellular communication and membrane transport enriched in the samples from the Bd endemic site. They also found genes involved in the production of anti-fungal compounds, such as prodigiosin, indole-3carboxaldehyde, and 2,4-diacetylphloroglucinol in both the bd endemic bd naive sites . Interestingly these genes were found in different genera depending on the site which indicates that there are multiple species capable of inhibiting the growth of Bd.
In previous work, we collected 719 bacterial isolates from four species of Virginia amphibians and tested for their ability to inhibit Bd in vitro . Based on their ability to inhibit the growth of Bd, 47 of these isolates, spanning 12 families and 15 genera, were selected for whole genome sequencing. The aim of this study was to look for genomic differences that could potentially explain the variation in Bd growth inhibition among sets of related isolates.

Isolate Collection
The bacteria used in this study were originally isolated from the skin of four species of Virginia amphibians (Walke et al.,2015b): bullfrogs (Lithobates catesbeianus), spring peepers (Pseudacris crucifer), American toads (Anaxyrus americanus), and Eastern spotted newts (Notophthalmus viridescens). These four species are all common amphibians that are native to the Eastern US. They all breed in ponds and wetlands during the spring and summer and have an aquatic larval stage that metamorphoses into a more terrestrial juvenile or adult.
A total of 719 bacterial isolates were collected per methods in Walke et al.(2015b). Briefly, all animals were rinsed with sterile water and then swabbed with two sterile rayon swabs. One swab was then plated on to R2A media to assess the culturable bacterial community composition of the individual animals. After incubating at room temperature for 14 days, individual colonies were isolated from the plates based on margin, form, whole colony color, elevation and substance. These were serially plated until pure isolates were obtained. The pure isolates were then initially identified via Sanger sequencing of the total length 16s rRNA gene (primers 8F and 1492R). The other swab was used to determine the bacterial community via 16s rRNA gene amplicon sequencing. Cultured isolates were then tested for the ability to inhibit Bd growth in vitro, using a quantitative 96-well plate assay (Bell et al., 2013;. In the assay, optical density values were used to score isolates on a range of completely inhibitory (100) to those with no effect on Bd growth (0), and even a few with negative inhibition which indicates facilitation of Bd growth (<0). Results comparing the full set of culturable and amplicon-based skin communities were previously published in Walke et al. (2015b), and results examining the bacterial skin community in relation to Bd growth inhibition were published in .
For the present study, a subset of 47 of the cultured isolates in 14 genera (Table 1) were selected for whole genome sequencing from the complete set of 719 isolates. Our approach was to select sets of related isolates that varied in their ability to inhibit Bd, so that we could identify genomic differences that could potentially contribute to variation in Bd inhibition.

DNA Extraction
DNA from 20 of the 47 isolates sent for sequencing had been previously extracted via methods in . The other 27 isolates used in this study needed to be cultured from freezer stocks and have their DNA re-extracted. For these, prior to DNA extraction, isolates were cultured at room temperature in 750 μl of 1% tryptone broth, shaking at 150 rpm for 24 hours. After 24 hours, we centrifuged the cultures at 7500 RPM for 10 minutes. The resulting pellet was resuspended in 180 μl of lysis buffer. DNA extraction was completed using a DNeasy blood and tissue kit (Qiagen Inc., Valencia, CA) following the gram-positive bacteria protocol provided by the manufacturer. DNA was eluted in 150 μl of sterile Milli-Q water.
Sequencing and genome assembly Extracted DNA was sent to the Duke Center for Genomic and Computational Biology for library construction and sequencing on the Illumina Hi-Seq 4000 platform, 2 x 150bp. This generated an average of 2.2 Gbp of reads per sample library with an average Q score of 38 and 90% of reads >Q30 (Supplemental Table 1). Raw reads were adapter trimmed using Trimmomatic v. 0.35 (Bolger et al.,2014) with default settings including standard Illumina adapters and visually checked for quality using fastqc (Andrews, 2010). Processed raw reads were de novo assembled using Minia version 2.0 (Chikhi and Rizk,2013) with the command line arguments, -kmer 131 -abundance -min 3. The kmer choice was based on comparative assembly of a subset (3) of data for kmer lengths 21-141 with a step of 10. For the average assembly, the total length was 5.2 Mbp (range 2.8:8.9 Mbp), with 45 total contigs(range 11:211). The average N50 was 357 kbp (range 73:1, 105 kbp). The average L50 was 6.9 (2:22), and GC content was 55% (range 34:72%) (Supplemental Table 2). Minia generated contig files were used for downstream analyses.

Annotation
To verify the identities of the isolates prior to annotation, we used BBTools version 38.86 (Bushnell, 2018) to obtain average nucleotide identity scores between individual isolates and their most closely related bacteria. Of the 47 genome assemblies, 40 unambiguously matched the initial isolate 16S identity and were used for subsequent analyses. To assess assembly completeness (accuracy of assembled orthologs) we analyzed the genome assemblies for Benchmarking Universal Single-Copy Orthologs, which are genes that are expected to be present in closely related bacteria. To do this we used BUSCO version 4.1.3 (Simão et al.,2015), with version 10 BUSCO databases (Table 2). BUSCO does not have databases for every genus so assemblies were evaluated with the lowest taxonomic database available for that isolate. Nine assemblies were evaluated using class level databases, 29 were evaluated using order level databases and two were evaluated using genus level databases. All genome assemblies contained at least 97% of expected Benchmarking Universal Single-Copy Orthologs, suggesting the genomes were complete, and thus usable for subsequent analyses. To generate the General Feature Format (GFF) files necessary for analyses with PIRATE (Bayliss et al.,2019), the 40 genomes were annotated using Prokka version 1.14.6 (Seemann, 2014). To find orthologous groups and to functionally annotate the genomes, the fasta files containing the amino acid sequences of the predicted coding sequences were annotated using eggNOG-mapper version 2.0 with eggNOG database version 5.0. (Huerta-Cepas et al.,2017; Comparative Genomics Our general approach was to look for unique genes and metabolic pathways, and for differences in those shared genes and pathways, that may contribute to variation in the ability of isolates to inhibit Bd growth. Due to the taxonomic diversity of the isolates, this was done at various scales. First, all 40 isolates were compared, then subsets of the isolates were compared based on phylum, class and genus. Shared and unique gene families were identified, and we then looked for differences in shared biosynthetic gene clusters. In addition, for subsets of genera, we looked at genes for specific compounds, such as chitinase that are known to play a role in anti-fungal activity. Gene families are sets of genes that originated through duplication from an ancestral gene and have similar functions. To identify both shared and unique gene families among all 40 isolates, the GFF files produced by Prokka were run through PIRATE version 1.0.3. Based on user defined amino acid sequence similarity thresholds, PIRATE identifies orthologous gene families in bacterial genomes. This was also done with subsets of the genomes that were divided based on phylum, class, order and genus. After running PIRATE on all 40 isolates, we ran PIRATE on only the Proteobacteria (N=22), Bacteroidetes (N=8), Actinobacteria (N=6) and Firmicutes (N=4). The 22 Proteobacteria were further divided into Alphaproteobacteria (N= 7), Betaproteobacteria (N= 7) and Gammaproteobacteria (N= 8) and PIRATE was run on these subsets. The Proteobacteria were further divided by class, order and genus for analysis. The Bacteroidetes and Actinobacteria phyla were divided by genus because the isolates within them only differed at the genus level. To look at gene families present at lower taxonomic levels, PIRATE was run on the Burkholderiales order and a total of 8 genera.
To identify predicted biosynthetic gene clusters potentially associated with anti-fungal activity, all 40 genome assemblies, as well as their respective GFF files, were run through antiSMASH version 5.0 (Blin et al.,2019) using the strictest settings. We identified 28 different biosynthetic gene clusters in our dataset of 40 isolates. Terpene synthesis clusters were the most common and were identified in 33 of our isolates, followed by Type III polyketide synthases in 22 isolates and siderophores in 17 isolates. The structural and functional diversity of terpenes makes them difficult to compare across genera, so we compared genes within terpene synthesis clusters only within genera, as described below. We focused more on the Type III polyketide synthases and the siderophores, which have previously been associated with anti-fungal activity (Sulochana et al.,2013;Risdian, Mozef and Wink, 2019).
To examine shared and unique genes in these two clusters, individual fasta files containing nucleotide sequences of the predicted Type III polyketide synthase clusters and siderophore clusters were imported into Geneious Prime version 2020.2.1(Biomatters Ltd). Genes that were shared among multiple isolates were then aligned using the Mafft plugin version 1.4.0 (Katoh,2002) in Geneious Prime. These alignments were then concatenated and exported in phylip format and a maximum likelihood tree was generated using RAxML HPC version 8.2.12 (Stamatakis, 2014). RAxML was run using a random number seed for parsimony inferences, rapid bootstrapping with 1000 replicates and the GTRCAT nucleotide substitution model.
Only a single gene, iucC, was present in all predicted siderophore clusters across the various genera. Only one gene that encoded a Type III polyketide synthase was found in all Type III polyketide synthase clusters across genera. Because there were few genes shared at higher taxonomic levels, we then started looking for common genes in shared biosynthetic gene clusters within genera. Four genera were examined. Bacillus isolates (N=4) shared a predicted terpene synthesis cluster, a predicted siderophore cluster and a predicted Type III polyketide synthase cluster. Pedobacter isolates (N=5) shared a predicted siderophore, Type III polyketide synthase and a terpene synthesis cluster. Microbacterium isolates(N=4) shared a Type III polyketide synthase and a terpene synthesis cluster. Sphingomonas isolates (N=5) shared a Type III polyketide synthase cluster and a terpene synthesis cluster. RAxML (Stamatakis, 2014) was run using the same parameters mentioned previously and Bd inhibition values were mapped to the nodes of the resulting tree. By observing which nodes had the highest Bd inhibition value we generated hypotheses about which genes to take a closer look at for single nucleotide polymorphisms.
To further explore potential links to anti-fungal activity, four genera were selected, Bacillus (N=4; Bd inhibition : -8, 8, 15, 90), Flavobacterium (N=3; Bd inhibition: 58, 62, 92), Microbacterium (N=4; Bd inhibition: -21, 4, 97, 100) and Sphingomonas (N=5; Bd inhibition: -30, -11, -9, -2 and 93). These genera were selected for further analysis because the isolates within them showed the greatest difference in their ability to inhibit Bd growth. For each of these four genera, the genomes of the 10 most closely related bacteria were downloaded from the National Center for Biotechnology Information (NCBI) and annotated using Prokka. These annotations were used for analysis in PIRATE. The output files were analyzed to determine shared and unique genes families among the isolates within each genus. Only those protein predicted by Prokka and eggNOG were retained for further analysis. Predicted proteins from the unique gene families identified from the isolates with greater Bd growth inhibition scores were characterized via BLASTp on the NCBI website.
Annotated genomes from individual isolates were then analyzed for genes that could potentially explain variation in Bd growth inhibition assays. Two isolates in the genus Agrobacterium (Bd inhibition: 10 and 91), contained a shared predicted chitinase gene. This gene was then translated, and the resulting amino acid sequences were aligned using BLASTp from the NCBI website. We also found a predicted chitinase in the only Sphingomonas isolate that was able to inhibit Bd; it was lacking from the other four isolates in that genus. Violacein inhibits the growth of Bd (Woodhams et al.,2018). Genes that that code for the enzymes necessary to produce the compound violacein were identified in the genomes of three out of the four Janthinobacterium isolates (Bd inhibition: 36, 57, 98 and 101). These five genes, vioA, vioB, vioC, VioD and vioE were translated and aligned using Clustal Omega (Sievers, et al,2011).

Results
PIRATE identified a total of 67,039 gene families across the 40 genomes analyzed. Only sixty-five (<1%) of those gene families were found in all isolates. In the Proteobacteria, there were 31419 distinct gene families across the twenty-two isolates, in the Bacteroidetes there were 14561 gene families across the eight isolates, in the Actinobacteria there were 15267 gene families across the six isolates, and in the Firmicutes there were 8784 gene families across the four isolates. The number of shared gene families in each subset can be found in table 3.
The most common biosynthetic gene cluster we found was for terpene synthesis, which was identified in 33 of the isolates. However, this is a very large group of compounds and pathways, and there were zero shared genes found among our isolates, so we could not pursue further genomic analysis for these in the full dataset. Siderophore biosynthetic gene clusters were identified in 17 of the isolates, and Type III polyketide synthase biosynthetic gene clusters were identified in 22 of the isolates. The iucC gene from the siderophore cluster from each isolate was aligned and used to create a phylogenetic tree. Bd inhibition values of each isolate were mapped to the branches of this tree. We found no pattern relating to Bd inhibition. This was also done for the Type III polyketide synthase gene and we did not find any patterns related to Bd inhibition. There were some shared genes in these pathways for several genera that we were able to analyze further.
The four Bacillus isolates shared a predicted siderophore cluster, a Type III polyketide synthase cluster and a terpene synthesis cluster. The predicted siderophore clusters had 6 genes in common. The nucleotide sequences of these genes in two Bacillus isolates that varied in their ability to inhibit Bd (Bd inhibition score = 90,15), were identical. The type III polyketide synthase clusters shared 7 genes and the terpene synthase clusters only had a single gene in common. The nucleotide sequences of these 7 genes in two of the Bacillus isolates that varied in their ability to inhibit Bd growth, were identical. Predicted siderophore gene clusters were found in three out of the five Pedobacter isolates. Two of these isolates HP3C (no Bd inhibition) and HP6J (strong Bd inhibition) shared 7 genes (Figure 1). While there were differences in these shared genes, it is unclear if those differences contribute to variation in ability to inhibit Bd growth. Predicted Type III polyketide synthase clusters were found in all Pedobacter isolates; these clusters shared 7 genes. All Pedobacter isolates only shared a single gene within their predicted terpene BGC. A Type III polyketide synthase BGC was found in all four Microbacterium isolates. Within this BGC, they all shared two genes. All four isolates also had a terpene BGC in common which shared 6 genes. Predicted Type III polyketide synthase BGCs were found in all five Sphingomonas isolates. These BGCs had 7 genes in common. They also shared a predicted terpene BGC however only three genes within this cluster were common to all five isolates. A summary of the shared genes can be found in tables 4-7.
Chitinase was an additional potentially anti-fungal compound we examined. Of the 40 isolates, two, were in the genus Agrobacterium, with inhibition scores of 10 and 91. We found that these isolates shared a potential chitinase gene that was 639 base pairs (213 amino acids) long, and differed by four amino acids (Figure 2). We identified the following substitutions between the less and more inhibitory isolates: Val45Ala, Arg79Ile, Gln99Arg, and Arg195Lys. In addition to the differences seen in this shared predicted chitinase gene, the more inhibitory isolate contains a second predicted chitinase gene that is 897 base pairs long.
Five isolates out of the 40 were in the genus Sphingomonas, but only one of these was strongly Bd inhibitory (inhibition score = 93, others = -30, -11, -9, -2). This inhibitory Sphingomonas isolate also contained a predicted chitinase that was absent in the other four Sphingomonas isolates.
We also examined the genes that contribute to the production of violacein in Janthinobacterium isolates. Three of the four Janthinobacterium isolates in our dataset (inhibition scores = 37,98,101), contained all five genes necessary to produce the compound violacein. The fourth Janthinobacterium isolate (inhibition score = 57) contained only the vioA gene. The four amino acids Arg 64 , His 163 , Lys 269 , and Tyr 309 are critical for the functionality of VioA (Füller et al.,2016), these amino acids are present in the VioA protein of all three of our Janthinobacterium isolates. The vioA gene in the least inhibitory Janthinobacterium isolate of these three (score =37) is 1308 base pairs long, and after translation it was found that it differed from the two more inhibitory isolates by 8 amino acids (Figure 3). There were also differences in the other violacein genes in this less inhibitory isolate. The vioB gene differed by twenty-four amino acids (Figure 4), the vioC gene by six amino acids (Figure 5), the vioD gene by thirteen amino acids (Figure 6), and the vioE gene by eight amino acids (Figure7). Tyr 17 , Ser 19 , Phe 50 , Asn 51 , Glu 66 , and Arg 172 are important for VioE to catalyze reactions (Hirano et al.,2008). All of these amino acids except for Glu 66 are present in the VioE protein sequences in our isolates.

Discussion
The 40 isolates we examined contained over 26 distinct biosynthetic gene clusters, which is not surprising given the taxonomic breadth of our samples. Of these 26 gene clusters, we looked more closely at the three that were present in the most isolates and had some evidence of potentially being important for bacteria-fungi interactions, which were gene clusters for terpenes, Type III polyketide synthases and siderophores.
Terpenes are a very broad class of molecules that are structurally and functionally diverse, and include antimicrobial compounds (Can Baser and Bachbauer,2015). Most terpenes that have been characterized are produced by plants, fungi and some bacteria, including the Actinomycetes and Streptomycetes (Citron et al.,2012). While all terpenes consist of the same general chemical formula, their structural diversity makes it difficult to predict their function based solely off of genetic information. We found that while there were many terpene gene clusters present in our dataset, there were very few shared genes among our isolates, such that we could not speculate on function as related to Bd inhibition for genes in these clusters.
Polyketides sometimes referred to as polyketide natural products are secondary metabolites that can also serve a variety of functions, including as antimicrobials (Pandith et al.,2020). Many important human antimicrobials are produced in bacteria via these gene clusters, including the antibiotics streptomycin and erythromycin (Flores-Sanchez and Verpoorte 2009). Depending on their structure, polyketide synthases can be classified as Type I, Type II or Type III. Type I polyketide synthases contain multiple active sites that are specific for certain reactions necessary to produce the end product, erythromycin for example (Wang et al.,2020). Type II polyketide synthases consist of multiple enzymes and produce polycyclic aromatic polyketides, such as tetracenomycin C. What differentiates Type III polyketide synthases are that they consist of a single enzyme that does not require an acyl-carrier protein and can catalyze reactions of acyl-CoA substrates directly (Pandith et al.,2020).
While we found Type III polyketide synthase clusters to be fairly common, we did not see clear evidence that they were likely to be linked to Bd inhibition in the genera we examined. For instance, our two Bacillus isolates that varied in Bd inhibition had 100% sequence similarity in the 7 shared genes within their Type III polyketide synthase cluster. Even for the genera that had some differences in genes present in the Type III polyketide synthase clusters, like in Microbacterium, the genes involved seemed unlikely candidates for driving variation in antifungal activity.
The final biosynthetic gene cluster that we focused on were the siderophores. Iron is a necessary nutrient, however, soluble forms of iron are scarce in the environment (Andrews et al.,2003). Siderophores are compounds produced by bacteria that are necessary to sequester iron from the environment. Siderophores have been linked to inhibition of plant fungal pathogens (Sulochana etal.,2012), which is why we chose to explore these biosynthetic gene clusters in our dataset. It is thought that fungal inhibition occurs due to the bacteria sequestering the limited iron, which prevents fungal growth (Scher and Baker, 1982). The nucleotide sequences of the 6 shared genes were 100% identical in the siderophore cluster from two of the Bacillus isolates that differed in their ability to inhibit Bd growth (Bd inhibition scores 15 and 90). This indicates that the siderophore biosynthetic genes clusters are most likely not associated with variation in Bd inhibition in our Bacillus isolates. The variation in the shared genes within the siderophore cluster of the two Pedobacter isolates, may contribute to their variation in ability to inhibit Bd growth.
We also considered genes for several individual compounds in our analysis, namely chitinase and violacein. Chitin is a major component of the fungal cell wall and is a polysaccharide composed of N-acetylglucosamine linked through beta 1,4 linkages (Lenardon et al.,2010). Chitinases are enzymes that can cleave the beta 1,4 glycosidic linkages which results in degradation of fungal cell walls (Le and Yang,2019).
The two Agrobacterium isolates in our dataset varied in Bd growth inhibition. Both of them contained a predicted chitinase, but it differed in four amino acids. It is possible that these differences in amino acids could alter the structure, and thus functionality, of this predicted chitinase enzyme. In addition, the more inhibitory of the two isolates also contained a second predicted chitinase. The only inhibitory Sphingomonas isolate, which had a Bd inhibition score of 93 also contained a predicted chitinase. The other four Sphingomonas isolates lacked predicted chitinases. These results suggest that chitinase may be a compound worth investigating more in terms of Bd growth inhibition by symbiotic bacteria.
Violacein is a purple pigment produced by Janthinobacterium lividum that can inhibit growth of both Bd and B. salamandrivorans (Woodhams et al.,2018). Three of our Janthinobacterium isolates contained all of the genes vioA-E, which are necessary to produce the compound violacein. Interestingly, the fourth Janthinobacterium isolate contained only vioA, and thus is unlikely to be producing violacein, even though it still had some ability to inhibit Bd. Two of the isolates that contained all the violacein genes were strongly inhibitory, while the third was much less inhibitory. We found that many amino acid substitutions were only present in the less inhibitory isolate. While there were amino acid substitutions in both the VioA and VioE protein sequences, both of sequences contained the amino acids that are critical for their functionality. This indicates that the differences seen in the vioA and vioE genes most likely do not contribute to variation in the ability to inhibit Bd growth. Less is known about the structure of VioB, VioC and VioD. We hypothesize that these differences within the vioB, vioC and vioD genes could explain the variation of these isolates ability to inhibit Bd.
It is interesting to note the variation in Bd inhibition, and genomic variation, that occurs within bacterial genera from amphibian skin swabs collected in the same population of amphibians. Both Agrobacterium isolates were isolated from the same population of Spring Peepers on the same day, however, one showed a stronger ability to inhibit Bd growth, and concordant changes in putative chitinase genes. This was also seen in the five Sphingomonas isolates, which were also all isolated from the same population of Spring Peepers. Similarly, three of the Janthinobacterium isolates were cultured from the same population of American Toads, and showed variation in both their ability to inhibit Bd and the genes necessary to produce violacein. This suggests that genus-level taxonomy is not adequate for predicting bacterial function against Bd in these symbiotic communities. Bacteria in the same genus can have vastly different functions. For example, the genus Pseudomonas, which contains some species such as Pseudomonas psychrotolerans CS51 which promotes plant growth and Pseudomonas syringae which is a known plant pathogen . Another example is Bacillus subtilis which is a widely used model organism and Bacillus anthracis which causes anthrax (Errington and van der Aart,2020;. In addition, this variation in function within skin symbiont community also raises the interesting question of how selection acts in these systems. All of these results stem from computational predictions, and therefore can only serve to generate hypotheses about the function and importance of these genes. To validate these results, laboratory studies need to be completed on these isolates and/or with these genes. We would recommend both chitinase and violacein genes for key follow up laboratory studies, but there are likely many other compounds and pathways important for anti-fungal activity in these skin bacterial communities. For example, a metagenomic study done by  found several gene classes that were more abundant in bacterial communities from frogs that lived in an area infected with Bd than frogs who had not been exposed to Bd. These gene classes were involved in biosynthesis of secondary metabolites, cellular communication, and membrane transport . They also found genes that are associated with known Bd inhibitory compounds such as prodigiosin, 2,4-diacetylphloroglucinol and indole-3carboxaldehyde. Another study has also highlighted the potential role of prodigiosin, produced by Serratia marcescens, in mediating interactions in the amphibian skin microbiome . These studies, and our findings, suggest that combining genomic and metagenomic studies with experimental approaches may be the best way to begin to unravel the complex interactions and functions of the amphibian skin microbiome.            The aim of my thesis was to generate hypotheses about genomic differences between bacterial isolates collected from amphibian skin that could explain their variation in ability to inhibit the growth of an important fungal pathogen, Batrachochytrium dendrobatidis. After functional annotation of whole genome sequences, our most interesting results were that we identified differences in the amino acid sequences of two shared predicted chitinases as well as the presence of a second predicted chitinase in two Agrobacterium isolates. We also noted a predicted chitinase in the Sphingomonas isolate that had the only strong Bd inhibition of the isolates in that genus. Differences in the genes needed to produce the compound violacein were also found in three Janthinobacterium isolates. Twenty three out of the 40 isolates had a greater ability to inhibit Bd than the other isolates (Bd inhibition score ≥ 50). We were able to elucidate two of the potential mechanisms for Bd inhibition: the production of chitinase enzymes and the production of violacein. As these genes were found in only six of the isolates, this indicates that there are other mechanisms that are potentially linked with Bd inhibition. Further experiments need to be done to test the hypotheses about the predicted chitinases and the differences in the vioA-E genes contributing to Bd inhibition ability.
A chitinase assay could be performed to confirm the presence of the predicted chitinase enzymes in our isolates. This consist of culturing the bacteria on agar plates that contain chitin. If the bacteria produce chitinase, a clearance zone will be visible in the medium (Shahbaz and Yu,2020). Once the presence of a chitinase gene is confirmed, a test that could be performed to see if the chitinase gene contributes to variation in ability to inhibit Bd growth, is cloning the predicted chitinase gene into a model bacterium, such as Escherichia coli using a plasmid and testing for protein expression as seen in previous publications (Fuchs et al.,1986;Robbins et al.,1988;Lobo et al.,2013). Once the gene has been cloned into the bacterium, an in vitro Bd challenge assay can be performed as in . A similar experiment could be conducted for the vioA -E genes that were found in three of the Janthinobacteirum isolates. Gene constructs could be made by cloning the vioA, vioB, vioC, vioD and vioE genes from each isolate into separate vectors. A competent strain of E. coli could then be transformed with these constructs and tested for Bd inhibition. If it is found that the differences in the violacein producing genes results in differed Bd inhibition, then follow up experiments can be conducted to determine which gene or genes are responsible.
As mentioned previously, there are likely many other mechanisms that contribute to Bd inhibition. An experiment that could be performed to try and elucidate some of these mechanisms is obtaining bacteria from various culture collections such as the ATCC, that are closely related to the genera in our data set. These bacteria could then be tested for their ability to inhibit Bd growth. By comparing a greater number of closely related genomes that have been shown to inhibit Bd growth, the gene family or families responsible may be elucidated more easily.
More broadly, more research should be done to investigate the interactions between members of the amphibian skin microbiome. As shown previously, interactions between bacteria can produce novel metabolites that are inhibitory to Bd . This indicates that the most effective probiotic strategy may involve multiple bacteria rather than a single strain that can inhibit Bd. This artificial microbiome should consist of members Proteobacteria, Actinobacteria, Bacteroidetes and Firmicutes as those are some of the most common phyla found on amphibian skin Hughey et al.,2017). Increased bacterial diversity on amphibian skin can lead to greater inhibition of Bd . Mock communities could be created in the lab by selecting isolates that are known to be inhibitory to Bd from various taxonomic groups. These communities can be grown in the lab and tested for their ability to inhibit Bd. Once the most inhibitory community has been found it could be used to inoculate wild amphibians to prevent infection of Bd. Amphibians can successfully acquire probiotic bacteria from the soil (Muletz et al.,2012). The skin microbiome also goes through a transitionary period as tadpoles undergo metamorphosis (Kueneman et al.,2014;Bataille et al.,2018). Frogs could either be inoculated in the lab and released back into the wild or the probiotic microbes could be inoculated in the soil near breeding ponds in the hopes that juvenile frogs will acquire those bacteria as they make the transition from water to land. Further research could be done to analyze the compounds produced by these mock communities under normal conditions and when challenged with Bd. This, along with whole genome sequencing could identify novel genes that are associated the production of Bd inhibitory compounds.
The interactions between the microbiome and the host, and the interactions between the amphibian skin microbiome and pathogens are complex and only somewhat understood. More work needs to be done to try and tease apart the various mechanisms so that we can gain a better understanding of how these complex symbiotic communities work. Once more is understood about the mechanisms behind community assembly, host-community interactions and Bd inhibition within bacterial communities, probiotic communities can be more readily generated and used as part of amphibian conservation efforts.