Genome sequences of Rhizopogon roseolus, Mariannaea elegans, Myrothecium verrucaria, and Sphaerostilbella broomeana and the identification of biosynthetic gene clusters for fungal peptide natural products

Abstract In recent years, a variety of fungal cyclic peptides with interesting bioactivities have been discovered. For many of these peptides, the biosynthetic pathways are unknown and their elucidation often holds surprises. The cyclic and backbone N-methylated omphalotins from Omphalotus olearius were recently shown to constitute a novel class (borosins) of ribosomally synthesized and posttranslationally modified peptides, members of which are produced by many fungi, including species of the genus Rhizopogon. Other recently discovered fungal peptide macrocycles include the mariannamides from Mariannaea elegans and the backbone N-methylated verrucamides and broomeanamides from Myrothecium verrucaria and Sphaerostilbella broomeana, respectively. Here, we present draft genome sequences of four fungal species: Rhizopogon roseolus, Mariannaea elegans, Myrothecium verrucaria, and Sphaerostilbella broomeana. We screened these genomes for precursor proteins or gene clusters involved in mariannamide, verrucamide, and broomeanamide biosynthesis, and additionally performed a general screen for borosin precursor proteins. While our genomic screen for potential ribosomally synthesized and posttranslationally modified peptide precursor proteins of mariannamides, verrucamides, broomeanamides, and borosins remained unsuccessful, antiSMASH predicted nonribosomal peptide synthase gene clusters that may be responsible for the biosynthesis of mariannamides, verrucamides, and broomeanamides. In M. verrucaria, our antiSMASH search led to a putative NRPS gene cluster with a predicted peptide product of 20 amino acids, including multiple nonproteinogenic isovalines. This cluster likely encodes a member of the peptaibols, an antimicrobial class of peptides previously isolated primarily from the genus Trichoderma. The nonribosomal peptide synthase gene clusters discovered in our screenings are promising candidates for future research.


Introduction
Borosins, a class of backbone N-methylated ribosomally synthesized and posttranslationally modified peptides (RiPPs), were defined in 2017 following the discovery of the biosynthesis pathway of the founding member omphalotin A (Ramm et al. 2017; Van Der Velden et al. 2017). This nematotoxic peptide macrocycle and its variants are produced by the fungus Omphalotus olearius via the self-modifying precursor protein OphMA. OphMA contains an N-terminal α-N-methyltransferase domain that methylates the precursor's C-terminal core peptide, followed by cleavage, cyclization, and release of omphalotin (Van Der Velden et al. 2017). Backbone N-methylations were previously found exclusively in nonribosomal peptides and were even considered a hallmark of this type of peptide. Therefore, it was a surprise to find them in RiPPs (Vogt and Künzler 2019). Genome mining led to the discovery of many other potential OphMA-like peptide precursors in fungi, including Dendrothele bispora and Lentinula edodes (Quijano et al. 2019). The genomes of these fungi contain biosynthetic gene clusters with a composition and organization similar to those of the omphalotin cluster. In addition, the encoded OphMA homologs contain core peptides with high sequence similarity to omphalotin A. Analysis of fungal tissue samples confirmed the production of the corresponding peptides, termed dendrothelins and lentinulins (Matabaro et al. 2021). Recent publications demonstrated the presence of borosin clusters with trans-acting α-N-methyltransferases in bacteria (Cho et al. 2022; Imani et al. 2022). Based on these findings, we were interested in investigating previously discovered, backbone N-methylated cyclic peptides that were hypothesized to be of nonribosomal origin, to determine whether they instead represent novel members of the borosin class of RiPPs.
Recently discovered cyclic, backbone N-methylated peptides include verrucamides A-D, tetradecapeptides that are produced by the ascomycete Myrothecium verrucaria and contain two D-configured amino acids (Zou et al. 2011), and the octapeptides broomeanamides A-C from the mycoparasitic ascomycete Sphaerostilbella broomeana, in which all eight amino acids are L-configured (Fig. 1) (Ekanayake et al. 2021). Another class of cyclic peptides are the octapeptides mariannamides A and B, isolated from the filamentous ascomycete Mariannaea elegans, which are also composed entirely of L-amino acids, including three proline residues, but do not contain any backbone N-methylations (Fig. 1) (Ishiuchi et al. 2020). Both verrucamides and mariannamides were shown to possess antibacterial properties (Zou et al. 2011; Ishiuchi et al. 2020). The mode of synthesis of all three peptide classes is unknown; the structural similarity of the verrucamides and broomeanamides to the cyclic, backbone N-methylated borosins indicated that they may be RiPPs, although the presence of D-amino acids in the verrucamides rather suggested a nonribosomal origin. Only one fungal RiPP class with a residue in D-configuration has been identified so far (phallotoxins, Hallen et al. 2007).
Here, we report the genome sequences of M. verrucaria, M. elegans, Rhizopogon roseolus, and S. broomeana. We mined the genomes of M. verrucaria, M. elegans, and S. broomeana for potential RiPP precursor proteins of the verrucamides, mariannamides, and broomeanamides, respectively. In addition, we performed an antiSMASH search to screen for nonribosomal peptide (NRP) biosynthetic gene clusters that might encode genes for verrucamide, mariannamide, and broomeanamide synthesis. We sequenced the genome of the agaricomycete R. roseolus, as the genomes of two species of the genus Rhizopogon were shown in BLAST searches to encode multiple OphMA homologs each (Quijano et al. 2019). Finally, we performed screens to find new OphMA homologs in R. roseolus, M. verrucaria, M. elegans, and S. broomeana.

Strains and cultivation
The sequenced strains of M. verrucaria, M. elegans, and S. broomeana are the authentic producers of the verrucamides, mariannamides, and broomeanamides, as described by Zou et al. (2011), Ishiuchi et al. (2020), and Ekanayake et al. (2021), respectively.

Sample preparation and sequencing
The fungi were cultivated on cellophane-covered agar plates before their mycelia were harvested. Myrothecium verrucaria, M. elegans, and R. roseolus mycelia were harvested after 14, 9, and 40 days, respectively, and lysed by grinding with a mortar and pestle in the presence of liquid nitrogen. S. broomeana mycelium was harvested after 7 days, mixed in an Eppendorf tube with 0.5 mm glass beads, frozen in liquid nitrogen, and then lysed by vigorous shaking in a FastPrep machine for 2 × 45 s at level 6. Genomic DNA was extracted using the QIAGEN DNeasy Plant Mini Kit, DNA concentration was measured using a Qubit dsDNA kit, and DNA quality was confirmed by running a fraction of the DNA on an agarose gel. The DNA was sent to Novogene, United Kingdom, for shotgun sequencing on an Illumina NovaSeq, producing paired-end 150 bp reads, aiming for approximately 100× coverage.
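As a rough illustration of what the 100× target implies, the number of read pairs needed scales with genome size divided by the bases delivered per pair (2 × 150 bp). The sketch below is a back-of-the-envelope calculation only; the 45 Mb genome size is a hypothetical placeholder and is not a value reported here.

```python
def read_pairs_for_coverage(genome_size_bp, coverage=100, read_length_bp=150):
    """Approximate number of paired-end read pairs needed for a target coverage.

    Each pair contributes 2 * read_length_bp bases; the result is rounded up.
    """
    bases_needed = genome_size_bp * coverage
    bases_per_pair = 2 * read_length_bp
    # Integer ceiling division, so no floating-point rounding is involved.
    return (bases_needed + bases_per_pair - 1) // bases_per_pair


# Hypothetical 45 Mb fungal genome (placeholder value for illustration).
example_pairs = read_pairs_for_coverage(45_000_000)
print(example_pairs)
```

The estimate ignores reads lost to adapter and quality trimming, so the raw yield requested from a sequencing provider would normally be somewhat higher.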

Quality control
BBDuk (v38.87, Joint Genome Institute) was first used in right-trimming mode with a kmer length of 23 down to 11 and a hamming distance of 1 to filter out sequencing adapters. A second pass with a kmer length of 31 and a hamming distance of 1 was used to filter out PhiX sequences. A third and final pass performed quality trimming on both read ends with a Phred score cutoff of 14 and an average quality score cutoff of 20, with reads under 45 bp or containing Ns subsequently rejected.
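The three passes could be expressed roughly as the following BBDuk invocations. This is a minimal sketch that only builds the command lines and does not run them; the file names, the reference FASTA paths, and the use of outs= to collect singleton reads are assumptions, and the exact options used for this study may have differed.

```python
def bbduk_commands(r1, r2, adapters_fasta, phix_fasta, prefix):
    """Build argument lists for the three BBDuk passes (nothing is executed)."""

    def out(step, mate):
        # Hypothetical naming scheme for intermediate files.
        return f"{prefix}.{step}.{mate}.fastq.gz"

    # Pass 1: right-trim adapters, k=23 shrinking to 11 at read ends, hamming distance 1.
    adapter_trim = [
        "bbduk.sh", f"in1={r1}", f"in2={r2}",
        f"out1={out('trimmed', 'R1')}", f"out2={out('trimmed', 'R2')}",
        f"ref={adapters_fasta}", "ktrim=r", "k=23", "mink=11", "hdist=1",
    ]
    # Pass 2: discard reads matching PhiX, k=31, hamming distance 1.
    phix_filter = [
        "bbduk.sh", f"in1={out('trimmed', 'R1')}", f"in2={out('trimmed', 'R2')}",
        f"out1={out('nophix', 'R1')}", f"out2={out('nophix', 'R2')}",
        f"ref={phix_fasta}", "k=31", "hdist=1",
    ]
    # Pass 3: trim both ends to Q14, require mean quality 20,
    # drop reads shorter than 45 bp or containing any N.
    quality_trim = [
        "bbduk.sh", f"in1={out('nophix', 'R1')}", f"in2={out('nophix', 'R2')}",
        f"out1={out('clean', 'R1')}", f"out2={out('clean', 'R2')}",
        f"outs={prefix}.clean.singletons.fastq.gz",
        "qtrim=rl", "trimq=14", "maq=20", "minlen=45", "maxns=0",
    ]
    return [adapter_trim, phix_filter, quality_trim]


# Hypothetical input names, for illustration only.
commands = bbduk_commands(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz", "adapters.fa", "phix.fa", "sample"
)
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) on a system where BBTools is installed.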

Assembly
The paired-end and singleton reads of each read set were assembled using SPAdes (v3.14.0) (Nurk et al. 2013) in isolate mode with otherwise default parameters.
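An assembly call of this kind might look like the sketch below, which passes the cleaned read pairs and the singletons to SPAdes in isolate mode. The file and directory names are hypothetical, and the command is only constructed, not run.

```python
def spades_isolate_command(r1, r2, singletons, outdir):
    """Build a SPAdes command for paired-end plus singleton reads in isolate mode."""
    return [
        "spades.py", "--isolate",
        "-1", r1,          # forward reads of the pairs
        "-2", r2,          # reverse reads of the pairs
        "-s", singletons,  # unpaired reads whose mate was discarded during QC
        "-o", outdir,      # output directory for the assembly
    ]


# Hypothetical file names, for illustration only.
spades_cmd = spades_isolate_command(
    "sample.clean.R1.fastq.gz",
    "sample.clean.R2.fastq.gz",
    "sample.clean.singletons.fastq.gz",
    "sample_spades",
)
print(" ".join(spades_cmd))
```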

Quality assessment
Completeness of the genome assemblies was assessed using BUSCO (v5.0.0) (Simão et al. 2015) in genome mode with the --auto-lineage-euk parameter to automatically assess the likely lineage of each strain (M. elegans, M. verrucaria, S. broomeana: hypocreales; R. roseolus: boletales). To test for bacterial contamination, the tool mOTUs (v3.0.0) (Milanese et al. 2019) was run on the reads for each sample. M. elegans had 2 inserts that could not be assigned to a specific mOTU; M. verrucaria had 1 insert corresponding to "Phyllobacterium species incertae sedis"; R. roseolus and S. broomeana returned no hits. These very low hit counts indicate that bacterial contamination of the samples is very unlikely.
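For orientation, a BUSCO run in genome mode with automatic eukaryotic lineage selection could be assembled as in the following sketch. The assembly path, run name, and CPU count are hypothetical placeholders, and the command is only built here, not executed.

```python
def busco_command(assembly_fasta, run_name, cpus=8):
    """Build a BUSCO genome-mode command with automatic eukaryotic lineage selection."""
    return [
        "busco",
        "-i", assembly_fasta,   # assembled scaffolds or contigs
        "-o", run_name,         # name of the output folder
        "-m", "genome",         # assess a genome assembly rather than proteins
        "--auto-lineage-euk",   # let BUSCO pick the best-fitting eukaryotic lineage
        "-c", str(cpus),
    ]


# Hypothetical paths, for illustration only.
busco_cmd = busco_command("sample_spades/scaffolds.fasta", "sample_busco")
print(" ".join(busco_cmd))
```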

Taxonomic analysis
The ssu_finder function of CheckM (v1.0.13) (Parks et al. 2015) was used to extract 16S and 18S rRNA gene sequences from the assemblies. 18S sequences of 1,726, 1,725, and 1,726 bp were found for M. elegans, M. verrucaria, and S. broomeana, respectively, but no such sequence was found for R. roseolus, likely because its assembly was highly fragmented. The sequences were aligned with the SILVA taxonomy database (v138) (Quast et al. 2013) using the provided software SINA (v1.6.1) (Pruesse et al. 2012). The internal transcribed spacer (ITS) region of each strain was extracted from its assembly using ITSx (Bengtsson-Palme et al. 2013) and its Fungi profile set. The sequences were then analyzed with the UNITE database (Nilsson et al. 2019; Kõljalg et al. 2020).

Genome mining
To search for all possible arrangements of the cyclic peptides of interest, a custom Python script generated a FASTA file containing all possible variants of linearized peptide sequences for the verrucamides, mariannamides, and broomeanamides. For broomeanamide A, for example, these sequences would be VPFAVLIL, PFAVLILV, FAVLILVP, and so on. First, the predicted protein sequences were searched for all peptides with blastp, then the assemblies were searched for all peptides with tblastn, both part of the BLAST+ suite (v2.11.0) (Camacho et al. 2009). As a positive control for the functionality of our mining method, we screened the genome of O. olearius using the peptide sequence of the cyclic RiPP omphalotin (Ramm et al. 2017; Van Der Velden et al. 2017). We found the omphalotin precursor protein OphMA, thus confirming that our method works.
Myrothecium verrucaria, M. elegans, and R. roseolus had an average nucleotide identity (ANI) of 78.8%, 79.9%, and 90.2% with their reference fungi M. inundatum, Mariannaea sp. strain PMI_226, and Rhizopogon vulgaris, respectively (Table 1). No ANI value could be generated for S. broomeana and its reference strain T. reesei, meaning that the sequence similarity between the 2 strains is too low (<70%). For a deeper taxonomic analysis of the sequenced species, the 18S rRNA sequences were extracted and run against the SILVA ribosomal RNA taxonomy database (Quast et al. 2013). Myrothecium verrucaria was identified as a member of the genus Myrothecium, M. elegans as a member of the order Hypocreales, R. roseolus as a member of either the order Boletales or Agaricales, and S. broomeana as a member of the genus Trichoderma.
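The linearization step can be sketched as follows. This is not the original script, only a minimal illustration of the idea: a head-to-tail cyclic peptide of n residues can be opened at n positions, giving n linear query sequences. The function names and the FASTA header format are assumptions.

```python
def circular_permutations(cyclic_peptide):
    """Return every linearization of a head-to-tail cyclic peptide.

    The ring is opened once before each residue, so a peptide of n
    residues yields n sequences (duplicates can occur for repetitive rings).
    """
    n = len(cyclic_peptide)
    return [cyclic_peptide[i:] + cyclic_peptide[:i] for i in range(n)]


def to_fasta(peptides):
    """Format all linearizations of the named peptides as FASTA text."""
    records = []
    for name, sequence in peptides.items():
        for offset, linear in enumerate(circular_permutations(sequence)):
            # Header format is a hypothetical convention: peptide name plus ring offset.
            records.append(f">{name}_rot{offset}\n{linear}")
    return "\n".join(records) + "\n"


# Broomeanamide A, using the sequence given in the text.
queries = to_fasta({"broomeanamide_A": "VPFAVLIL"})
print(queries)
```

The resulting text could be written to a file and used as the query for blastp against the predicted proteins and for tblastn against the assemblies.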

Screenings for RiPP precursors and NRP biosynthetic gene clusters
All 4 genomes were screened for potential RiPP precursor proteins: M. verrucaria for verrucamide precursors, M. elegans for mariannamide precursors, S. broomeana for broomeanamide precursors, and all 4 genomes, including R. roseolus, for OphMA homologs. Screens were performed using all circular permutations of verrucamide, mariannamide, and broomeanamide sequences. In addition, all genomes were screened for the N-terminal methyltransferase domain of OphMA. These searches yielded no hits, indicating that the cyclic backbone N-methylated verrucamides and broomeanamides and the cyclic mariannamides are not directly encoded in the genome as precursor peptides and therefore may indeed be NRPs, and that R. roseolus, unlike many of its relatives from the genus Rhizopogon, does not contain any OphMA homologs.
Following the unsuccessful search for RiPP precursors of verrucamides, mariannamides, and broomeanamides, an additional search was performed using the fungal version of the "antibiotics and secondary metabolite analysis shell" antiSMASH (Blin et al. 2019) with the goal of finding NRP biosynthetic gene clusters that might direct the biosynthesis of the isolated peptide natural products. antiSMASH currently uses an ensemble prediction method integrating several algorithms to predict the substrate specificity of adenylation domains (Blin et al. 2017). In M. verrucaria, 1 NRP cluster was predicted to produce a verrucamide-like peptide with the correct length and several N-methylated residues, whereas in M. elegans, 1 cluster was predicted to produce a peptide of the same length as the mariannamides, containing several leucines and at least 1 proline (Table 2). In S. broomeana, cluster NRPS 77.1 was predicted to produce a 6-residue-long peptide containing 1 isoleucine, 1 leucine, and a total of 4 N-methylations. The broomeanamides are longer (8 residues), but contain 4 N-methylated residues, 2 leucines and, in the case of broomeanamide A, 1 isoleucine (Table 2).
Another NRP biosynthetic gene cluster of M. verrucaria was predicted to encode a 20-residue peptide with 11 residues of the nonproteinogenic amino acid isovaline (Table 3). This peptide is likely a peptaibol. Peptaibols are a class of antimicrobial NRPs from fungi that are 5-20 residues long, linear, N- and C-terminally modified with amino alcohol groups, and defined by the presence of the nonproteinogenic amino acids α-aminoisobutyric acid (Y) and/or isovaline (X) (de la Fuente-Núñez et al. 2013). The predicted peptide from M. verrucaria does not contain α-aminoisobutyric acid, but 11 residues of isovaline. There are no known peptaibols with such a high content of isovaline, so it is likely that some of these predicted isovalines are in fact α-aminoisobutyric acids or other residues. To date, over 1,000 peptaibols have been characterized in various members of the order Hypocreales, with the vast majority produced by members of the genus Trichoderma (de la Fuente-Núñez et al. 2013). Our antiSMASH search suggested 2 additional isovaline-containing NRPs in M. verrucaria (NRPS 3.4, X?X?L?Q?X and NRPS 15.2, XQ?X???) and 1 in S. broomeana (NRPS 31.1, XXX?X?QX??). Peptaibols have been previously reported in Sphaerostilbella toxica (Perlatti et al. 2020), but to our knowledge no peptaibols have been described in the genus Myrothecium.
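The logic of screening for any circular permutation of a cyclic peptide can be illustrated with a simplified exact-match search. This is a stand-in for the BLAST-based searches, not a replacement for them: it tolerates no mismatches, and the toy protein sequence below is hypothetical. It relies on the fact that every rotation of a ring of n residues appears as a window of length n in the sequence concatenated with its own first n-1 residues.

```python
def find_cyclic_core(protein, cyclic_peptide):
    """Find exact occurrences of any circular permutation of a cyclic peptide.

    Returns a list of (start_index, matched_window) tuples, 0-based.
    """
    n = len(cyclic_peptide)
    # This string of length 2n-1 contains each of the n rotations exactly
    # as its windows of length n.
    doubled = cyclic_peptide + cyclic_peptide[:-1]
    hits = []
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        if window in doubled:
            hits.append((start, window))
    return hits


# Hypothetical toy protein carrying one rotation of broomeanamide A (VPFAVLIL).
toy_protein = "MKT" + "FAVLILVP" + "GG"
print(find_cyclic_core(toy_protein, "VPFAVLIL"))
```

An empty result for a real proteome would only rule out exact matches, which is why similarity searches such as blastp and tblastn are the more sensitive choice.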
In conclusion, we present the draft genome sequences of the fungi M. verrucaria, M. elegans, R. roseolus, and S. broomeana. While our screens of the genomes for genes encoding RiPP precursor proteins of verrucamides, mariannamides, broomeanamides, and borosins did not yield any hits, we discovered 3 candidate NRP biosynthetic gene clusters that may control verrucamide, mariannamide, and broomeanamide biosynthesis, as well as multiple clusters predicted to produce peptaibol-like peptides. These gene clusters will be interesting targets for future research, particularly

Funding
This work was supported by the Swiss National Science Foundation (Grant No. 31003A-173097) and ETH Zürich.

Conflicts of interest
None declared.

A comparison between the predicted product sequence and the known sequences of verrucamide A-D, mariannamide A-B and broomeanamide A-B is given (Zou et al. 2011; Ishiuchi et al. 2020). Red letters indicate N-methylated residues, underlined letters represent D-amino acids. Question marks indicate non-specified residues.