Comparative Genomic and Metabolomic Analysis of Termitomyces Species Provides Insights into the Terpenome of the Fungal Cultivar and the Characteristic Odor of the Fungus Garden of Macrotermes natalensis Termites

ABSTRACT Macrotermitinae termites have domesticated fungi of the genus Termitomyces as food for their colony, analogously to human farmers growing crops. Termites propagate the fungus by continuously blending foraged and predigested plant material with fungal mycelium and spores (fungus comb) within designated subterranean chambers. To test the hypothesis that the obligate fungal symbiont emits specific volatiles (odor) to orchestrate its life cycle and symbiotic relations, we determined the typical volatile emission of fungus comb biomass and Termitomyces nodules, revealing α-pinene, camphene, and d-limonene as the most abundant terpenes. Genome mining of Termitomyces followed by gene expression studies and phylogenetic analysis of putative enzymes related to secondary metabolite production encoded by the genomes uncovered a conserved and specific biosynthetic repertoire across strains. Finally, we proved by heterologous expression and in vitro enzymatic assays that a highly expressed gene sequence encodes a rare bifunctional mono-/sesquiterpene cyclase able to produce the abundant comb volatiles camphene and d-limonene.
IMPORTANCE The symbiosis between Macrotermitinae termites and Termitomyces is obligate for both partners and is one of the most important contributors to biomass conversion in Old World tropical ecosystems. To date, research efforts have predominantly focused on acquiring a better understanding of the degradative capabilities of Termitomyces to sustain the obligate nutritional symbiosis, but our knowledge of the small-molecule repertoire of the fungal cultivar mediating interspecies and interkingdom interactions has remained fragmented. 
Our omics-driven chemical, genomic, and phylogenetic study provides new insights into the volatilome and biosynthetic capabilities of the evolutionarily conserved fungal genus Termitomyces, which allows matching metabolites to genes and enzymes and, thus, opens a new source of unique and rare enzymatic transformations.

clusters, including an extraordinary number of genes annotated as terpene cyclases, many of which were differentially transcribed in different biological samples. Finally, we verified the enzymatic origin for the most abundant terpenes in vitro, thereby linking the obtained metabolomic and transcriptomic data sets.

RESULTS AND DISCUSSION
Volatilome of the fungus comb. In a first experimental setup, we collected different types of biosamples (fungus comb, fungal nodules, and termite workers, with soil and air samples serving as controls) from six different M. natalensis colonies (see Table S1 at https://doi.org/10.6084/m9.figshare.16702471). Volatiles were collected from preweighed biosamples (n = 3) using solid-phase microextraction (SPME) and then analyzed by GC-MS. Obtained data sets were dereplicated using the National Institute of Standards and Technology mass spectral library (NIST 2017), and peak intensities (with a match quality of at least 90%) were quantified relative to the measured sample weight and averaged signal intensities (see Fig. S1 to S3 and Tables S2 and S3 at https://doi.org/10.6084/m9.figshare.16702471).
Notably, several monoterpenes were detected in the volatile blends released by both the fungus comb and the nodules. While D-limonene and camphene were emitted from both sample types, α-pinene was detected predominantly in comb samples. Interestingly, β-pinene and 3-carene, common plant metabolites of pine trees, were detected in soil and air samples and might have originated from the predigested plant material of the comb or from surrounding trees (21). However, most of the detected chemical features from comb and nodule samples are known fungal infochemicals emitted during fungal growth and differentiation (22), some of which were also detected in previous volatile studies of Termitomyces and, thus, may contribute to the characteristic scent of the fungal cultivar (18,23). While isoamyl alcohol is known to induce morphological changes in yeast (24,25), phenylethyl alcohol serves as an antifungal agent (26) and might have additional morphogenic functions during the fungal life cycle, as shown in the yeast Candida albicans (27). Both α-pinene and D-limonene are known microbial volatiles with antimicrobial properties (9,16,28) and serve as communication signals in other termite species (29,30), while camphene is reported to have antioxidant activities (31). Only a few of the detected chemical features were deduced to be of anthropogenic or unknown origin (e.g., 2-ethylhexyl salicylate, cyclohexane, and trichloromethane) (29). To corroborate these findings and to pinpoint major termite volatiles, we also measured the volatile organic compounds (VOCs) of 30 major workers (Tables S71 to S73). Here, several known sesquiterpenes were detectable, with β-gurjunene, gymnomitrene, and aristolene as the most abundant features, as well as smaller amounts of four terpenes (trans-α-bergamotene, α-pinene, α-barbatene, and β-chamigrene) that were also produced by, e.g., Termitomyces sp. 
strain J132 and could be due to insects feeding on fungus biomass (see Tables S4 to S71).
In silico analysis of biosynthetic gene clusters in Termitomyces sp. To correlate detected volatiles to the biosynthetic capacity of Termitomyces, we analyzed the draft genomes of eight Termitomyces strains using the platform fungiSMASH (v6.0) and the web service NRPSpredictor2 (see Table S73 at https://doi.org/10.6084/m9.figshare.16702471) (32,33). Overall, each fungal genome encoded more than 20 detectable biosynthetic features related to secondary metabolism, including predominantly fungal terpene cyclases (TCs), one nonreducing iterative type I polyketide synthase (NR-PKS) with a conserved domain architecture (SAT-KS-AT-PT-ACP-ACP-TE), and a nonribosomal peptide synthetase (NRPS) with a conserved domain architecture (A1-T1-C1-T2-C2-T3-C3) and an adjacent putative cytochrome P450 monooxygenase (34). Phylogenetic analysis of the NR-PKS sequences revealed their close relationship to known orsellinic acid-producing synthases described from other basidiomycetes (35,36), while an analysis of the detected NRPS sequences indicated their close relationship to one of the most abundant type VI basidiomycete siderophore synthetases that is assumed to produce the trimeric siderophore basidioferrin, derived from N5-acylated N5-hydroxy-L-ornithine (L-AHO) (see Table S73 and Fig. S5 and S6 at https://doi.org/10.6084/m9.figshare.16702471) (37). Thus, we deduced that both PKS and NRPS are unlikely to be accountable for any of the observed chemical features in the volatilome.
Consequently, we focused on the analysis and categorization of identified putative fungal terpene cyclase (TC) sequences based on the following general classification: (i) monoterpene cyclases (MTCs), which cyclize geranyl pyrophosphate (GPP), yielding monoterpenes; (ii) sesquiterpene cyclases (STCs), which produce sesquiterpenes from farnesyl pyrophosphate (FPP); (iii) diterpene cyclases (DTCs), which use geranylgeranyl pyrophosphate (GGPP) for product formation; and (iv) triterpene cyclases (Tri-TCs), also known as oxidosqualene cyclases (OSCs)/lanosterol synthases, which catalyze cyclization of oxidosqualene into triterpenes. Both MTCs and STCs are known to initiate terpene formation by a metal-dependent ionization and diphosphate release mechanism and are identified by the characteristic active-site motifs (DDXXD and NSE) that stabilize and guide the reaction pathway (9, 10). However, in silico predictions of fungal MTCs are still ambiguous due to their high similarity to fungal STCs (similar active-site motifs, catalytic pocket, and sequence length) and the low number of biochemically characterized MTCs in basidiomycetes (38). In contrast, numerous fungal STCs have been characterized in basidiomycetes, and comparative studies uncovered a correlation between their phylogenetic placement and cyclization mechanism (clades I to V) (Fig. 3) (16, 39-41). In short, STCs belonging to clade I have been found to cyclize FPP via the formation of a germacradienyl cation, while STCs belonging to clade II induce a 1,10-cyclization mechanism of nerolidyl diphosphate (NPP), resulting either in bicyclogermacrene (clade IIa) or the germacradienyl cation (clade IIb), which undergoes further transformations. In contrast, enzymes belonging to clade III generally produce humulyl cation intermediates via 1,11-cyclization of FPP. Finally, enzymes of clade IV catalyze a 1,6- or 1,7-cyclization of NPP, which leads to an intermediate bisabolyl or cycloheptenyl cation, while clade V putatively includes TCs that favor only a 1,6-cyclization mechanism.
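The clade-to-mechanism relationships summarized above can be captured in a small lookup table. The following is only an illustrative sketch: the dictionary layout and the helper name `lookup_clade` are assumptions introduced here, and entries that the text does not state explicitly are left as `None` rather than filled in.

```python
# Illustrative lookup of the STC clade/mechanism relationships described in the text.
# The data layout and function name are assumptions made for this sketch.
STC_CLADES = {
    # cyclization position for clade I is not given in the summary paragraph
    "I":   {"substrate": "FPP", "cyclization": None,
            "outcome": "germacradienyl cation"},
    "IIa": {"substrate": "NPP", "cyclization": "1,10",
            "outcome": "bicyclogermacrene"},
    "IIb": {"substrate": "NPP", "cyclization": "1,10",
            "outcome": "germacradienyl cation"},
    "III": {"substrate": "FPP", "cyclization": "1,11",
            "outcome": "humulyl cation"},
    "IV":  {"substrate": "NPP", "cyclization": "1,6 or 1,7",
            "outcome": "bisabolyl or cycloheptenyl cation"},
    # clade V is described only as putatively favoring 1,6-cyclization
    "V":   {"substrate": None, "cyclization": "1,6", "outcome": None},
}


def lookup_clade(clade):
    """Return the mechanism summary for a clade label, or None if unknown."""
    return STC_CLADES.get(clade)


if __name__ == "__main__":
    for name in ("IIa", "III", "V"):
        print(name, lookup_clade(name))
```

Such a table gives a first-pass prediction only; the characterized clade V enzyme that cyclizes FPP via a germacradienyl cation shows that phylogenetic placement does not always match the expected mechanism.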
To the best of our knowledge, only 10 DTCs have been biochemically characterized so far from basidiomycetes and were found to be either mono- or bifunctional enzymes (16). While monofunctional DTCs cyclize GGPP via a type I ionization-dependent mechanism centered at the N terminus, triggering the second cyclization step by diphosphate removal (ionization-triggered reaction) (16,39), bifunctional enzymes use both class I and class II mechanisms to form diterpenes. Here, class II activity is located at the C-terminal site and catalyzes the cyclization of GGPP by protonation of the substrate. Only recently, a third class of fungal DTC was discovered that contains proteins from the UbiA family, which can have class I diterpene cyclase activity besides prenylation activity (42,43). Tri-TCs, which are key enzymes in ergosterol biosynthesis, carry a QW motif that determines their cyclization specificity in the transformation of squalene epoxide via a type II cyclization mechanism (9, 10). So far, only four Tri-TC candidates involved in the biosynthesis of triterpenoid natural products have been characterized from different fungal species (44). Based on these considerations, we extended our fungiSMASH survey by dedicated HMM profile searches of predicted protein sequences of TCs (45). Overall, we identified, on average, 22 distinct putative MTC/STC protein sequences per strain, which were predicted from the open reading frames in Termitomyces genomes, while slightly lower numbers of protein sequences for putative DTCs and Tri-TCs were identified (Table 1 and Tables S74 to S82 at https://doi.org/10.6084/m9.figshare.16702471). The number of predicted TCs varied across Termitomyces species, which is likely a result of differences in sequencing quality and annotation approaches as well as a lack of transcriptomic data, which impairs the prediction of putative proteins. Despite these ambiguities, it was important to note that genomes of free-living basidiomycetes generally encoded lower numbers of terpene cyclases, a finding suggestive of the importance of terpene-based communication mechanisms in a termite-associated lifestyle.
Phylogenetic analysis and prediction of chemical scaffolds. To more precisely predict the putative cyclization mechanism and product scope of the identified TCs, we performed a phylogenetic analysis of all candidate sequences identified from the nine Termitomyces spp., as depicted in Fig. 4.
Eight terpene cyclases (TTCs; TTC21 to TTC28) showed high homology to sequences belonging to STCs of clade I, with TTC21 to TTC26 appearing to build a Termitomyces-specific subclade. TTC28 from Termitomyces sp. strain J132 (KNZ77998; previously named STC15) (19) was biochemically characterized in previous studies and found to cyclize FPP in a 1,10-cyclization fashion via the (E,E)-germacradienyl cation to (+)-germacrene D-4-ol. We also identified two groups of TCs belonging to clade II (TTC1 and TTC13), with TTC13 being more closely related to members of clade IV, while nine TTCs (TTC2 to TTC10) were categorized as representatives of clade III and again appeared to belong to a Termitomyces-specific subclade. Earlier studies revealed that one representative of TTC13 (previously named STC9, from Termitomyces strain J132, KNZ74377) acts as a (−)-γ-cadinene synthase by cyclizing NPP at positions 1 and 10, a typical cyclization mode for clade II terpene cyclases (38). TTC13 is phylogenetically related to CpSTS18 (Clitopilus pseudopinsitus) and ShSTS5 (Stereum hirsutum), both known producers of γ-cadinene. Seven TTCs (TTC14 to TTC20) were located in clade IV, and sequence alignments of all representatives showed that the NSE motif is well conserved between Cop6, Omp9, Omp10, and all Termitomyces sequences, while the aspartate-rich motif (DDXXD/E; Tables S82 to S86) varied in Termitomyces enzymes. Based on their phylogenetic relatedness, six TTCs were assigned to clade V, forming their own subclade (TTC11, TTC12, and TTC30 to TTC32). Here, it was intriguing to note that one member of the TTC31 group (STC4; from Termitomyces strain J132, KNZ72568) was previously characterized to cyclize FPP via germacradienyl cation formation to the main product, (+)-intermedeol, although these enzymes were predicted to guide FPP through a 1,6-cyclization pathway (19). This example shows once more that the outlined relation between phylogenetic relatedness and cyclization mode has several exceptions and that such predictions usually require biochemical verification (see Fig. S4 at https://doi.org/10.6084/m9.figshare.16702471).
The six identified putative DTC genes were grouped as TTC33 and TTC34 (Fig. 5A) and classified according to the classification in Li et al. (42,43). We tentatively categorized the first group, TTC33, as UbiA-type DTCs (clade III), while TTC34 was assigned to an unclassified phylogenetic branch. Lastly, we also identified seven homologous Tri-TC candidate sequences based on our similarity searches (Fig. 5B), which were grouped as TTC35 and found to be closely related to the OSC from Ganoderma lucidum, which is known to produce lanosterol (46), the precursor for ergosterol (47) and other triterpene-based metabolites (e.g., ganoderic acids) (10,48).
RNA-seq analysis. Subsequently, we analyzed the relative expression levels of gene sequences related to natural product biosynthesis in transcriptome sequencing (RNA-seq) data obtained from cultured Termitomyces sp. strains 153 and J132 as well as fresh and old fungus comb and nodules on which young workers feed (Fig. 6 and Fig. S7 and S8 at https://doi.org/10.6084/m9.figshare.16702471) (49). Intriguingly, a differential expression pattern of TTCs was observed, with some candidate sequences mostly expressed in comb and lab cultures but not in nodules, while others were expressed in all samples. Notably, one candidate sequence (TTC15) was predominantly expressed in fungal nodules, while only moderate to low expression levels were detectable within the comb and agar cultures. In contrast, NRPS- and NR-PKS-related gene sequences were expressed with similar abundance under all conditions, which was confirmed by RT-PCR of a fungal monoculture (Fig. S9 at https://doi.org/10.6084/m9.figshare.16702471).
Comparative analysis of TTC15-T153 and related sequences from clade TTC15 enzymes revealed that the first metal-binding motif differs from the common DDXXD/E amino acid composition, which might allow TTC15-T153 to accept both GPP and FPP as substrates and thereby lead to the formation of the different terpene blends.
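A motif comparison of this kind can be approximated with a simple pattern search. The sketch below is not the procedure used in the study. It assumes the aspartate-rich motif can be written as the pattern DDXX[D/E] and uses the commonly cited NSE/DTE consensus (N,D)D(L,I,V)X(S,T)XXXE, which is an assumption not taken from this text. The toy sequences are invented for illustration and are not Termitomyces proteins.

```python
import re

# Assumed motif patterns ("X" = any residue); see the hedges in the lead-in.
ASP_RICH = re.compile(r"DD..[DE]")             # canonical DDXXD/E
NSE_DTE = re.compile(r"[ND]D[LIV].[ST].{3}E")  # (N,D)D(L,I,V)X(S,T)XXXE


def find_motifs(seq):
    """Return 1-based positions and matched text for both motif patterns."""
    seq = seq.upper()
    return {
        "asp_rich": [(m.start() + 1, m.group()) for m in ASP_RICH.finditer(seq)],
        "nse_dte": [(m.start() + 1, m.group()) for m in NSE_DTE.finditer(seq)],
    }


# Hypothetical toy sequences: one with a canonical first motif, one without.
CANONICAL = "MKTAYDDLFDAGSWLLNDLYSYEKEQ"
VARIANT = "MKTAYDNLFDAGSWLLNDLYSYEKEQ"

if __name__ == "__main__":
    print(find_motifs(CANONICAL))
    print(find_motifs(VARIANT))
```

A sequence that retains the second motif but returns no hit for the first pattern would be flagged as deviating from the DDXXD/E composition; whether such a deviation changes substrate acceptance still has to be tested biochemically.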
Conclusions. In this study, we elaborated on the hypothesis that the fungal cultivar of Macrotermitinae termites emits specific volatiles (odor) that allow Termitomyces to orchestrate its complex life cycle within the fungal garden and its symbiotic relation with the termite host. To test the hypothesis, we combined volatile studies of fungal nodules and comb, in which the fungal cultivar Termitomyces resides, with phylogenetic analyses, gene expression, and molecular biological studies.
Our finding that Termitomyces thriving in fungus comb material emits a specific volatilome with α-pinene, camphene, and D-limonene among the most abundant terpenes (5-10) correlated with the identification of an above-average repertoire of different types of terpene cyclases with conserved candidate sequences across Termitomyces species. While TCs from different Termitomyces strains showed high homology to each other, they grouped into distinct phylogenetic branches compared to TCs reported from other fungal genera. The detailed comparative analysis also uncovered that phylogenetic relatedness was not always correlated with the predicted cyclization mode, and more detailed biochemical studies on fungal TCs are required to solidify the correlations or to uncover the characteristics of their deviations. Based on RNA-seq data analysis, we verified that a congener of the highly expressed enzyme TTC15-T153 accepts GPP as well as FPP as substrates, causing the formation of more than 20 different terpenes, most of which were also detected in the volatile blends of collected biosamples and may contribute significantly to the characteristic odor of Termitomyces. From a chemical-ecological perspective, these findings were highly intriguing, as the two major GPP-derived products of TTC15-T153, D-limonene and camphene, were identified within the fungus comb volatilome and are known to be metabolites exchanged within symbiotic systems (14,58). This study represents, to the best of our knowledge, the second characterized example of a bifunctional mono-/sesquiterpene cyclase in basidiomycetes. The high expression level of the coding sequence TTC15 in comb and nodules as well as the natural abundance of these characteristic monoterpenes points toward a central role in termite symbiosis. 
Overall, our combined studies allowed us, for the first time, to link and verify produced metabolites within the complex community with their genetic basis, transcriptional pattern, and conserved biosynthetic origin and improve our current understanding of the chemical language orchestrating this complex farming symbiosis.

MATERIALS AND METHODS
Volatile analysis. For volatile sampling, different parts of M. natalensis nests from different areas of Pretoria (South Africa) were collected in preweighed amber glass vials directly in the field. Soil, air, and fungus comb samples were collected in 40-ml amber glass vials, and nodules were separated from the fungus comb using clean forceps and collected in 1-ml amber glass vials, both equipped with a cap containing a silicone septum. The vials were closed tightly and kept at 4°C until volatile extraction. Each sample was collected in triplicate, if not stated differently. Headspace sampling was performed using solid-phase microextraction (SPME). After penetrating the silicone septum of the vial cap using an SPME fiber holder, a conditioned SPME fiber coated with 100 μm polymethylsiloxane (Supelco) was exposed to the headspace in the vial for 20 min at room temperature. Room temperature and sample weight were recorded each time before measuring. The SPME fiber was then directly injected into the inlet of an Agilent 7890B gas chromatograph coupled to a 7977 MSD quadrupole mass spectrometer. Thermal desorption was achieved at 250°C. Compounds were separated on a DB wax GC column (length, 20 m; inner diameter, 0.18 mm; and film thickness, 0.18 μm [J&W]) using a 5-min hold at 50°C, followed by a linear temperature increase of 10°C/min to 250°C. The column was reconditioned with a 2-min hold at 250°C. Data analysis and peak integration were performed using the program MSD ChemStation (version F.01.03.2357). Metabolites were tentatively identified based on comparison of mass spectra and retention times to the National Institute of Standards and Technology mass spectral library (NIST 2017). After peak detection and integration, metabolites with a database match quality of at least 90% were taken for further analysis. Thereafter, the peak area was normalized to the sample weight for each metabolite (area per gram), and the normalized values were used to generate a heat map of averaged values (see the supplemental material at https://doi.org/10.6084/m9.figshare.16702471).
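The filtering and normalization steps described above (match quality of at least 90%, peak area divided by sample weight, averaging across replicates) can be sketched as follows. This is a minimal illustration and not the authors' script: the record layout, the function name `area_per_gram`, and all numeric values are hypothetical, and the sketch averages only over replicates in which a compound passed the match filter, which is an assumption about how missing detections are handled.

```python
from collections import defaultdict
from statistics import mean


def area_per_gram(peaks, sample_weights_g, min_match=90):
    """Average weight-normalized peak areas per compound.

    peaks: list of dicts with keys "sample", "compound", "area", "match".
    sample_weights_g: dict mapping sample name to its weight in grams.
    Peaks below the library match threshold are discarded.
    """
    normalized = defaultdict(list)
    for peak in peaks:
        if peak["match"] < min_match:
            continue  # drop low-confidence library matches
        weight = sample_weights_g[peak["sample"]]
        normalized[peak["compound"]].append(peak["area"] / weight)
    # Assumption: average only over replicates where the compound was retained.
    return {compound: mean(values) for compound, values in normalized.items()}


# Hypothetical example values, invented purely for illustration.
WEIGHTS = {"comb_1": 2.0, "comb_2": 3.0}
PEAKS = [
    {"sample": "comb_1", "compound": "D-limonene", "area": 1000.0, "match": 95},
    {"sample": "comb_2", "compound": "D-limonene", "area": 1500.0, "match": 93},
    {"sample": "comb_1", "compound": "camphene", "area": 400.0, "match": 80},
]

if __name__ == "__main__":
    print(area_per_gram(PEAKS, WEIGHTS))
```

The resulting compound-by-sample averages are the kind of values that would feed a heat map; treating non-detections as zeros instead would lower the averages and is an equally defensible choice.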
Identification of putative NRPS and PKS and phylogenetic analysis. Identified enzymes were aligned to characterized fungal sequences from the same class in MEGA X using the MUSCLE algorithm. Phylogenetic trees were generated with IQ-TREE, calculating 1,000 bootstrap replicates (see the supplemental material at https://doi.org/10.6084/m9.figshare.16702471).
Identification of putative terpene cyclases and phylogenetic analysis. Putative terpene cyclases in Termitomyces species were identified by searching HMM profiles of characterized fungal terpene cyclases against the predicted proteins from different Termitomyces species using hmmsearch. Identified terpene cyclases that matched specific criteria for each class of terpene cyclases were aligned to characterized fungal sequences from the same class in MEGA X using the MUSCLE algorithm. Phylogenetic trees were generated with IQ-TREE, calculating 1,000 bootstrap replicates. Bioinformatic analyses were performed on the Galaxy EU webserver (59) (see the supplemental material at https://doi.org/10.6084/m9.figshare.16702471).
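Candidate selection from an HMM search can be illustrated by parsing the tabular per-target output that hmmsearch writes with its `--tblout` option. The sketch below is an assumption-laden example rather than the pipeline used here: the E-value cutoff of 1e-5, the profile name `stc_profile`, and the protein identifiers are hypothetical, and the class-specific criteria mentioned above are not reproduced.

```python
def parse_tblout(text, max_evalue=1e-5):
    """Parse hmmsearch --tblout text and keep hits at or below an E-value cutoff.

    Columns used (whitespace separated): 0 = target name, 2 = query name,
    4 = full-sequence E-value, 5 = full-sequence bit score.
    Returns (target, query, evalue, score) tuples sorted by E-value.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target, query = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue <= max_evalue:
            hits.append((target, query, evalue, score))
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical tblout excerpt; identifiers and scores are invented.
SAMPLE = """\
# target name  accession  query name   accession  E-value  score  bias
prot_002       -          stc_profile  -          0.5      10.2   0.0
prot_001       -          stc_profile  -          1.2e-40  135.0  0.1
prot_003       -          stc_profile  -          3.0e-12  48.7   0.0
"""

if __name__ == "__main__":
    for hit in parse_tblout(SAMPLE):
        print(hit)
```

Hits retained in this way would then be aligned with characterized reference sequences before tree building; the cutoff should be chosen per profile rather than taken from this example.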