Blocks in the pseudouridimycin pathway unlock hidden metabolites in the Streptomyces producer strain

We report a metabolomic analysis of Streptomyces sp. ID38640, a soil isolate that produces the bacterial RNA polymerase inhibitor pseudouridimycin. The analysis was performed on the wild type and on ten mutant strains, three newly constructed and seven previously reported, each disabled in a different gene required for pseudouridimycin biosynthesis. The results indicate that, in addition to the previously reported lydicamycins and desferrioxamines, Streptomyces sp. ID38640 is able to produce the lasso peptide ulleungdin, the non-ribosomal peptide antipain and the osmoprotectant ectoine. The corresponding biosynthetic gene clusters were readily identified in the strain genome. We also detected the known compound pyridindolol, for which we propose a previously unreported biosynthetic gene cluster, as well as three families of unknown metabolites. Remarkably, the levels of most metabolites varied strongly among the mutant strains, an observation that enabled detection of metabolites unnoticed in the wild type. Systematic investigation of the metabolites accumulated in the ten pum mutants shed further light on pseudouridimycin biosynthesis. We also show that several other Streptomyces strains able to produce pseudouridimycin differ from ID38640 in both phylogenetic position and metabolic profile.

www.nature.com/scientificreports/
[…] also includes formycin, malayamycin and ezomycin 15,16. In previous work, we analyzed the PUM biosynthetic pathway through knockouts of several pum genes within the PUM BGC, providing the first elucidation of a biosynthetic pathway for a C-nucleoside antibiotic 17. This work also showed that the pseudouridine synthase PumJ, the key biosynthetic enzyme in the PUM pathway, is present in diverse, taxonomically unrelated microorganisms, suggesting a widespread distribution of yet-to-be-discovered C-nucleoside antibiotics 17.
Blocking PUM biosynthesis in the producer strain Streptomyces sp. ID38640 led to altered production of the siderophore desferrioxamine and of the polyketide lydicamycin, two unrelated specialized metabolites 17. Here, we extend these findings by systematically evaluating the MS profiles of available and purpose-built pum mutants and correlating metabolites with the corresponding BGCs. Overall, we were able to define seven metabolite-BGC pairs, including a previously unreported BGC that we propose for a known metabolite. We also show that PUM can be easily detected in diverse Streptomyces strains harboring the pum BGC. Finally, our analyses provide additional insights into the PUM biosynthetic pathway.

Results and discussion
Metabolomic analysis of pum mutants. The functions of seven genes in the PUM gene cluster were previously assigned through bioinformatic analysis and the intermediates detected in knockout mutants 17, as summarized in Table 1. Briefly, the pseudouridine synthase PumJ catalyzes N- to C-nucleoside isomerization to yield pseudouridine or a derivative thereof, which is then converted into 5′-amino-5′-deoxypseudouridine (APU) by the action of the oxidoreductase PumI and the aminotransferase PumG. In a converging step, guanidinoacetate (GAA) is produced by the amidinotransferase PumN. Next, the amide ligases PumK and PumM sequentially add glutamine and GAA to APU, while PumE catalyzes N-hydroxylation (Table 1).
To analyze the metabolome of the pum knockout mutants globally, we cultivated the wild-type producer strain, Streptomyces sp. ID38640 (WT), along with the seven previously reported knockout mutant strains (ΔpumE, ΔpumG, ΔpumI, ΔpumJ, ΔpumK, ΔpumM, and ΔpumN) and three newly constructed ones (ΔpumF, ΔpumH and ΔpumL; see below), all unable to produce PUM but accumulating different intermediates (Table 1). Each strain was cultivated in two media and the metabolite distribution was analyzed by LC-MS at 24-h intervals over four days. Samples were analyzed after solvent extraction of the whole culture (FE samples; usually hydrophobic metabolites) and also by direct analysis of the cleared broth (SN samples; both hydrophobic and hydrophilic metabolites). The LC-MS/MS data from 44 samples, derived from 11 strains cultivated in two different media with two samples per culture, were subjected to Global Natural Products Social (GNPS) molecular networking analysis 11 and visualized using Cytoscape 18. This analysis clusters spectra having identical MS2 patterns, forming nodes (rectangles in Fig. 1), and connects nodes whose fragmentation patterns share at least 4 fragments (lines in Fig. 1). The result is a set of networks, each representing a family of potentially related metabolites. Analysis with Cytoscape provides an intuitive visualization of metabolite distribution according to strain, medium or sample.
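The grouping of connected nodes into molecular families can be illustrated with a minimal sketch. It assumes that a family is simply a connected component of the node-edge graph, and it ignores the cap on family size that GNPS applies; the function name and the node labels are hypothetical and are not taken from the GNPS workflow.

```python
from collections import defaultdict


def molecular_families(nodes, edges):
    """Group network nodes into families, i.e. connected components.

    nodes: iterable of hashable node labels (hypothetical identifiers).
    edges: iterable of (node_a, node_b) pairs that passed the edge filter.
    Returns a list of sorted member lists, largest family first.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the representative, halving the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    families = defaultdict(list)
    for n in nodes:
        families[find(n)].append(n)
    return sorted((sorted(m) for m in families.values()),
                  key=lambda m: (-len(m), m))


# Hypothetical example: three linked nodes and two unconnected self-loops.
example = molecular_families(["A", "B", "C", "D", "E"],
                             [("A", "B"), ("B", "C")])
```

In this sketch a node without any edge forms a single-member family, which corresponds to the self-loop features mentioned in the text.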
The molecular network of Fig. 1 […] translationally modified peptides, four terpenes, as well as at least nine regions classified as "other" (Table 2). Overall, 11 BGCs match related BGCs in the MIBiG database. Consistent with our previous work 17, we detected molecular families corresponding to the lydicamycins and desferrioxamines in all tested samples from each fermentation medium (Fig. 1, Table 3). In particular, high levels of lydicamycins and desferrioxamines were detected in the ΔpumI and ΔpumN mutants (Table 3; Supplementary Fig. 1).
We also detected a molecular family with a node at m/z 303 [M + 2H]2+ (Fig. 1), the HR-MS fragmentation pattern and UV spectrum of which (Supplementary Fig. 2) match those of the ureylene-containing oligopeptide antipain, a family identified using an authentic standard in previous work 19. This family of compounds is produced by non-ribosomal peptide synthetases in numerous bacteria and functions as a protease inhibitor 20. Consistently, we located a corresponding BGC in the ID38640 genome that exhibits 61-78% gene sequence identity with antipain BGCs in the MIBiG database 21 (Table 2). The antipain molecular family is found in the WT strain and in most pum mutants, in both media, with the highest amounts detected in the WT strain (Table 3; Supplementary Fig. 1).
A self-loop feature (Fig. 1 […] Supplementary Fig. 3) encoding a predicted core peptide identical to ulleungdin. Ulleungdin was detected in all tested strains, in both fermentation media, with the WT strain showing trace levels (Table 3; Supplementary Fig. 1).
The molecular networking analysis of Fig. 1 was carried out using a cosine score above 0.7. This value was also used to form MS clusters. This filter excluded ectoine, a methylated tetrahydropyrimidinecarboxylic acid that protects many bacterial species from osmotic stress, since this metabolite shows very poor fragmentation. Nonetheless, we were able to identify a peak, eluting at 1. […] Supplementary Fig. 4). Consistently, we located a BGC in the ID38640 genome corresponding to the ectoine BGC (Table 2). Ectoine is detected in all tested strains (Table 3; Supplementary Fig. 1).
Additionally, the molecular network of Fig. 1 […] Table 2, are shown next to each metabolite. […] Fig. 6). The associated LC peaks show a UV-Vis spectrum with absorption maxima at 254, 304 and 370 nm. These properties match those of pyridindolol and pyridindolol glucoside, produced by Streptomyces alboverticillatus and Streptomyces parvulus, respectively 25,26. […] Fig. 6). These metabolites are detected in almost all strains, with the ΔpumN mutant showing the highest levels (Table 3; Supplementary Fig. 1). Pyridindolol biosynthesis has not been studied previously and no BGC possibly linked to this metabolite could be found in the antiSMASH output. However, it has been reported that the β-carboline moiety present in pyridindolol is formed by a "Pictet-Spenglerase" (PSase), an enzyme that joins an amino group with an aldehyde 27. The enzyme StnK2 has been shown to function as a PSase in streptonigrin biosynthesis in streptomycetes 28. Accordingly, we searched the ID38640 genome for StnK2 homologs and identified QIK04791.1 as having 49% sequence identity with StnK2 (Fig. 2; Table 4). The QIK04791.1-encoding region also encodes a FAD-binding oxidoreductase, a long-chain fatty acid-CoA ligase, an aldehyde dehydrogenase, a histidine phosphatase and an F420-dependent oxidoreductase (Table 4). While most of these sequences have no homologs in the streptonigrin BGC, we found that a syntenic region with over 90% protein-to-protein identity is present in the genome of the pyridindolol producer Streptomyces alboverticillatus (MUFU00000000; Fig. 2; Table 4).
Table 2. BGCs identified in ID38640 and presence of these BGCs in the four PUM producers. a Regions identified by antiSMASH. Note that "Other0" was identified as explained in the text. BGCs with detected metabolites are color-coded as in Fig. 1. b Presence of conserved BGCs in the genomes of the four analyzed PUM producers.
Based on our observations, we hypothesize that pyridindolol formation entails condensation of tryptophan with a C-3 unit, possibly glyceraldehyde (phosphate), by the PSase; aromatization of the newly formed ring by the FAD-binding and/or the F420-dependent oxidoreductase; and reduction of the carboxyl group by the aldehyde dehydrogenase (Fig. 2). The order in which the hypothesized reactions occur awaits further analysis, as does the possible role of the long-chain fatty acid-CoA ligase and histidine phosphatase present in the conserved segment. This BGC, which lies at one end of the genome sequence in Streptomyces sp. ID38640, has been added to Table 2 and designated "Other0". Overall, our work led to the identification of seven metabolite-BGC pairs (Table 2). This leaves 21 BGCs orphan of their product, including those for geosmin and methylisoborneol, volatile metabolites unlikely to be detected under our conditions, and hopene, unlikely to be present in our samples because of its lipophilicity. Thus, it remains to be determined whether these three metabolites are actually produced by Streptomyces sp. ID38640. In summary, 18 BGCs await matching metabolites and 3 identified metabolites are missing a matching BGC.
Table 3. Relative amounts of the identified metabolites in the different pum mutants. Amounts are expressed as ratios to those observed in the WT strain in the same medium. Note that for NK1 through NK3, which are not detected in the WT, their presence is indicated with an "X". The highest relative amounts of metabolites are in bold type.
Notably, this work has demonstrated that changes in a small region of the genome facilitate the detection of additional metabolites. It has been previously reported that blocking biosynthesis of a specialized metabolite can facilitate detection of novel chemistry 29. However, we are not aware of studies showing that different blocks in a single pathway can significantly alter the levels of biosynthetically unrelated metabolites. Accumulation of a particular PUM intermediate does not appear to be the reason for the altered metabolic profiles: for example, ectoine levels are enhanced tenfold in the ΔpumE, ΔpumJ and ΔpumK mutants, which accumulate different PUM intermediates (Table 1). At the same time, not all mutants accumulating the same PUM intermediate show similar increases in ectoine levels. Thus, the observed modulation of specialized metabolite levels could result from the introduction of the apramycin resistance cassette with its strong promoter, from increased availability of precursors and/or from altered transcription by PUM. The mechanism(s) leading to altered metabolic profiles are currently unknown and further work will be necessary to establish whether this phenomenon is an oddity of the PUM pathway. Nonetheless, the generation of distinct mutants from a single BGC might be useful not only for elucidating the corresponding biosynthetic pathway (see below), but also for detecting chemistry hidden in the wild-type strain.

Additional insights into PUM biosynthesis.
The functions of most of the pum genes and the general pathway of PUM biosynthesis have been defined from our previous in vivo experiments and bioinformatic analyses 17, as summarized in Table 1. In this work, we generated knockout mutants in three additional pum genes: pumF, pumH and pumL (Table 1). PumF shows 42-45% identity to SsaA and its orthologues NpsM and PacA, which are regulators of the BGCs for the structurally related uridyl-peptide antibiotics sansanmycins, napsamycins and pacidamycins, respectively 30. PumH, annotated as an adenylate kinase, shares 42% identity with PolQ2 and MalE from the polyoxin and malayamycin biosynthetic pathways, respectively 31. PumL shares 65% identity with a NocH-like protein belonging to the major facilitator superfamily.
Replacement of pumF with the apramycin resistance gene abolished PUM production and led to the accumulation of pseudouridine (PU), amino pseudouridine (APU) and guanidinoacetate (GAA) (Table 1; Supplementary Fig. 7). These results indicate that PumF is a positive regulator of PUM production that controls the conversion of APU into Gln-APU. The ΔpumL knockout mutant produced very low yields of PUM and accumulated several PUM intermediates, consistent with a role of PumL in exporting the final pathway product (Table 1; Supplementary Fig. 7).
The phenotype of the ΔpumH mutant was more complex: it accumulated no PUM-related metabolite except for GAA (Table 1; Supplementary Fig. 7) and, unlike in the ΔpumJ strain 17, PUM production could not be rescued by adding PU to the production medium. Thus, the ΔpumH phenotype was identical to that of the previously reported ΔpumI mutant 17, which likewise accumulated no intermediate except for GAA and could not convert PU […]
We also assessed PUM and PUM precursors accumulated in the WT strain and in the ten pum knockout mutants in the molecular network. PUM (m/z 487 [M + H]+) appears as a single loop detected only in the WT strain and, to a lesser extent, in the ΔpumJ and ΔpumL mutants, irrespective of the cultivation medium (Fig. 1 and Table 1 […] (Fig. 1). As expected, Gln-APU is found in samples from the ΔpumE, ΔpumL, ΔpumN and ΔpumM strains, with the latter strain accumulating the highest level. Deoxy-PUM was detected in samples from the WT, ΔpumE and ΔpumL strains, with the highest level present in the WT, in agreement with previous results (Fig. 1, Table 1). In a different portion of the same network we observed a signal at m/z 388 ([M + H]+) consistent with N-hydroxy-Gln-APU (OH-Gln-APU), which corresponded to a hydrophilic peak with the pseudouridine-characteristic UV maximum at 263 nm. This species, which had not been noted in previous work because of its low abundance (Table 1), was detected in samples from the WT and, in lower amounts, from the ΔpumL, ΔpumM and ΔpumN strains. The levels of the PU-containing intermediates sharing the same chromophore were quantified as reported in Table 1.
The results presented here enable us to confirm and extend the previously proposed biosynthetic pathway for PUM 17 (Fig. 3). During the early biosynthetic steps, the substrate for the kinase PumH may be either uridine, PU or PU aldehyde (Fig. 3), with PumJ, PumI and PumG acting sequentially for C-isomerization, alcohol oxidation and amine formation, respectively. By analogy with the nikkomycin/polyoxin pathways 32, phosphorylation is likely to occur at the 2′ position, as shown in Fig. 3. Subsequently, the phosphate group is removed by PumD or by a housekeeping phosphatase. A key step in the pathway appears to be the conversion of APU into Gln-APU by PumK, a conversion controlled by the regulator PumF. The detection of OH-Gln-APU suggests that N-hydroxylation by PumE precedes addition of GAA by PumM, consistent with the well-known facilitated hydroxylation of amines with respect to amides 33. In the absence of PumE, PumM uses Gln-APU as substrate, leading to deoxy-PUM as a shunt metabolite. In the absence of PumM, by contrast, Gln-APU preferentially accumulates, suggesting either that conversion of Gln-APU into its hydroxy derivative is inefficient or that expression of pumE is altered in this context. Finally, PumL appears to be a transporter for PUM.
PUM production by other streptomycetes. In previous work, we identified pumJ-related sequences linked to putative BGCs in numerous microbial genomes and we predicted these BGCs specify biosynthesis of PUM or closely related metabolites 17 . Production of PUM has been previously demonstrated only for Streptomyces spp. ID38640 and ID38673, from the NAICONS collection 14 , and for Streptomyces albus DSM 40763 34,35 .
To investigate whether the strains harboring a PUM BGC did produce PUM and other metabolites shared with Streptomyces sp. ID38640, we examined four Streptomyces strains: S. rimosus ATCC 10970, producer of oxytetracycline; S. mobaraensis DSM 40847, producer of the NADH reductase inhibitor piericidin; S. eurocidicus ATCC 27428, producer of the antifungal polyene eurocidin; and S. flocculus DSM 40313, producer of the aminoquinone antibiotic streptonigrin. [S. flocculus has been recently reclassified 36 and will be referred to as S. albus DSM 40313 hereafter.] These compounds have been known for several decades and the producer strains have been investigated by several laboratories but, to our knowledge, PUM production has not been observed. When grown in a single medium and analyzed at three different time points, each Streptomyces species produced PUM in addition to the expected metabolite, oxytetracycline, piericidin, eurocidin D or streptonigrin (Fig. 4a). PUM levels were comparable to those of Streptomyces sp. ID38640 (around 200 μM) for S. rimosus and S. mobaraensis, and approximately half these levels for S. albus and S. eurocidicus. These results indicate that the PUM BGC in these species is actively expressed and that PUM can be easily detected when properly looked for.
We were interested in establishing the phylogenetic relationship and the extent of shared metabolites of the five PUM producers. We thus applied autoMLST 37 to construct a high-resolution species tree, which revealed ten major clades and three branches formed by a single strain each (Supplementary Fig. 8). Streptomyces sp. ID38640 belongs to clade 2 while, among the PUM producers reported above, only S. rimosus (clade 7) and S. mobaraensis (single-strain branch) were picked up. A phylogenetic tree of the five PUM producers showed that Streptomyces sp. ID38640 clustered with S. rimosus, while S. eurocidicus and S. mobaraensis formed a separate clade ( Supplementary Fig. 9).
The ID38640 genome does not harbor BGCs for oxytetracycline, eurocidin, piericidin or streptonigrin. Apart from the PUM BGC, only three BGCs are shared by the five strains: those for the frequently encountered Streptomyces metabolites geosmin and hopene, and the BGC labeled Other5 (Table 2). The latter BGC, consisting of a syntenic region of 8 conserved ORFs, is of unknown function and has been identified by antiSMASH as siderophore.
We next investigated whether the other PUM-producing strains shared other metabolites with Streptomyces sp. ID38640, using procedures similar to those described above. The resulting molecular network, represented in Fig. 4b, contains 630 features, including media components, of which 385 (61%) are organized in 62 molecular families. As highlighted by a red rectangle, a two-member family containing PUM and deoxy-PUM is found in samples from all strains. Of the families annotated in Fig. 1, lydicamycins, pyridindolol, ulleungdin and NK1 through NK3 remain ID38640-specific. Desferrioxamines were detected in S. rimosus and S. albus, while antipain was detected in S. mobaraensis. As described above, manual inspection of the LC-MS profiles showed ectoine in extracts from S. rimosus and S. albus. Additional metabolites were dereplicated in the samples and the corresponding BGCs were identified in the strain genomes (M.I., unpublished observations), but none of these additional molecules matched unannotated metabolites detected in Streptomyces sp. ID38640. Taken together, the above results indicate that the PUM BGC is not restricted to a specific Streptomyces clade and that PUM is not regularly co-produced with other metabolites. It will be interesting to establish whether knockouts in the PUM pathway of the other PUM producers can also alter their metabolic profiles.

Conclusions
Streptomyces sp. ID38640 is a prolific and versatile producer of different metabolites, many of which could be detected only after selective blocks in the PUM pathway. While we do not yet understand why production of unrelated metabolites is significantly enhanced in different pum mutants, the approach used here might be a simple way of "catching two birds with one stone", simultaneously elucidating a biosynthetic pathway of interest and observing alterations in the metabolome. Possible targets for this approach might include the other PUM producers reported in this study. In any case, the work presented here, along with our previous studies 38,39, indicates that a metabolomic look at "old strains" can unveil previously overlooked chemistry, including novel metabolites. Analyses of this sort will undoubtedly be facilitated by the growing Paired Omics Data Platform (https://pairedomicsdata.bioinformatics.nl) 40.

Materials and methods
Bacterial strains and growth conditions. Streptomyces sp. ID38640, S. flocculus DSM 40313, S. rimosus ATCC 10970, S. mobaraensis DSM 40847, S. eurocidicus ATCC 27428 and the pum mutants were cultured as described 14. Briefly, mycelium from BTT plates was inoculated into a 50-mL Erlenmeyer flask containing 15 mL of seed medium (20 g/L dextrose monohydrate, 2 g/L yeast extract, 8 g/L soybean meal, 1 g/L NaCl, and 4 g/L CaCO3, pH 7.3), and incubated for 72 h at 28 °C. The production media were M8 41 and PumP1 14, which were inoculated with a 10% volume of the seed culture.
Construction of knockout mutants. The generation of ΔpumF, ΔpumH and ΔpumL strains followed described procedures 17 , which involved amplification of two ~1.0-kbp fragments (A and B) from genomic DNA using primers containing EcoRI and XbaI (fragment A) and XbaI and BamHI (fragment B) tails (Supplementary Table 1), that were cloned into the EcoRI-BamHI sites of the vector pWHM3-oriT-ΔXba. In the resulting plasmid, the apramycin resistance gene was inserted at the XbaI site within the PCR-amplified pum segments to generate the knockout plasmid. The knockout plasmids were then introduced into E. coli ET12567/pUB307, whence they were conjugated into spores of Streptomyces sp. ID38640 as described 17 . Double-crossover mutants were identified through PCR with diagnostic primers.
Genome sequence and bioinformatic analyses. Genome sequencing was performed by CeBiTec, Bielefeld University (Germany) using Illumina MiSeq/Genome Analyzer IIx/HiSeq 1000. BGCs were identified using antiSMASH 5.0 with default settings 3. BLAST analysis of individual CDSs was performed against the MIBiG database of known BGCs 21 and against the Protein Data Bank. Multilocus sequence analysis was performed with autoMLST in "denovo mode" with default settings 37.

Samples for LC-MS analysis.
For PUM-related metabolite analysis, 0.5 mL of the culture was centrifuged at 13,200 rpm for 2 min and the supernatant was filtered through a 0.2-μm membrane (EuroClone), generating the SN sample. Full extracts (FE) were prepared by transferring a 0.5-mL sample from cultures into a 2-mL Eppendorf tube containing 0.5 mL MeOH. After 1 h at 55 °C under constant shaking, the sample was centrifuged for 10 min at 13,200 rpm and the supernatant was recovered and transferred into a 1.5-mL glass vial.
Metabolite analysis. LC-MS analyses were performed on a Dionex UltiMate 3000 coupled with an LCQ Fleet (Thermo Scientific) mass spectrometer equipped with an electrospray interface (ESI) and a tridimensional ion trap. The column was an Atlantis T3 C18 5 μm × 4.6 mm × 50 mm maintained at 40 °C, with a flow rate of 0.8 mL/min. Phases A and B were 0.05% trifluoroacetic acid in water and acetonitrile, respectively. SN samples were analyzed using the following gradient: 0 to 25% phase B in 4 min, followed by a 2-min wash at 90% and a 3-min re-equilibration at 0% phase B. The gradient used for FEs was a 14-min multistep program that consisted of 10, 10, 95, 95, 10 and 10% phase B at 0, 1, 7, 12, 12.5 and 14 min, respectively. UV-VIS signals (190-600 nm) were acquired using the diode array detector. The m/z ranges were set at 120-1500 and 200-2000 for SNs and FEs, respectively, with ESI conditions as follows: spray voltage of 3500 V, capillary temperature of 275 °C, sheath gas flow rate at 35 units and auxiliary gas flow rate at 15 units. High resolution mass spectra were acquired as described previously 42.
Metabolomic analysis. For the metabolomic analysis the Metabolomics-SNET-V2 (release_23) workflow was used. Parameters were adapted from the GNPS documentation: MS2 spectra were filtered so that all MS/MS fragment ions within ±17 Da of the precursor m/z were removed. The MS/MS fragment ion tolerance and the precursor ion mass tolerance were set to 2.0 and 0.5 Da, respectively. Edges of the created molecular network were filtered to have a cosine score above 0.7 and at least 4 matched peaks between the connected nodes. The maximum size of molecular families in the network was set to 100. The MS2 spectra in the molecular network, filtered in the same manner as the input data, were searched against our internal library of 480 annotated metabolites. Reported matches between network and library spectra were required to have a score above 0.75 and at least 5 matching peaks. The molecular networks were visualized using Cytoscape.
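The spectral filtering and edge criteria described above can be sketched in a few lines. This is a simplified illustration and not the GNPS implementation: it assumes a plain cosine computed on raw intensities with greedy one-to-one peak matching, whereas the actual workflow applies its own intensity scaling and scoring. All function names and the example spectrum are hypothetical.

```python
import math


def remove_precursor_window(peaks, precursor_mz, window=17.0):
    """Drop fragment peaks lying within +/- window Da of the precursor m/z."""
    return [(mz, inten) for mz, inten in peaks
            if abs(mz - precursor_mz) > window]


def cosine_score(peaks_a, peaks_b, fragment_tol):
    """Return (cosine score, number of matched peaks) for two peak lists.

    Peaks are (m/z, intensity) tuples. Two peaks can be paired when their
    m/z values differ by at most fragment_tol; each peak is used once.
    """
    norm_a = math.sqrt(sum(i * i for _, i in peaks_a))
    norm_b = math.sqrt(sum(i * i for _, i in peaks_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    # All candidate pairs within tolerance, highest intensity product first.
    candidates = sorted(
        ((ia * ib, x, y)
         for x, (ma, ia) in enumerate(peaks_a)
         for y, (mb, ib) in enumerate(peaks_b)
         if abs(ma - mb) <= fragment_tol),
        reverse=True,
    )
    used_a, used_b = set(), set()
    dot, matched = 0.0, 0
    for product, x, y in candidates:
        if x in used_a or y in used_b:
            continue
        used_a.add(x)
        used_b.add(y)
        dot += product
        matched += 1
    return dot / (norm_a * norm_b), matched


def keep_edge(score, matched, min_cosine=0.7, min_matched=4):
    """Edge criteria from the text: cosine above 0.7 and at least 4 matches."""
    return score > min_cosine and matched >= min_matched


# Hypothetical spectrum compared with itself.
spectrum = [(100.0, 10.0), (150.0, 20.0), (200.0, 5.0), (250.0, 8.0)]
score, matched = cosine_score(spectrum, spectrum, fragment_tol=0.5)
```

The same `keep_edge` check, called with `min_cosine=0.75` and `min_matched=5`, would mirror the thresholds stated for library matches.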
Metabolite quantification. Pseudouridine-containing intermediates were quantified by HPLC, assuming the same chromophore as PUM, against a purified pseudouridimycin internal standard. Relative amounts of the other metabolites were estimated as the ratio of their peak intensity to that observed in the WT strain.
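The relative quantification can be expressed as a short sketch, assuming peak intensities are available per strain and medium. The function name, data layout and numbers are hypothetical; the "X" marker follows the convention of Table 3 for metabolites not detected in the WT.

```python
def relative_to_wt(intensities, wt="WT"):
    """Express peak intensities as ratios to the WT value in the same medium.

    intensities: dict mapping (strain, medium) to a peak intensity.
    Returns a dict for the non-WT strains. When the metabolite is absent in
    the WT, a detected metabolite is marked "X" and an undetected one None.
    """
    ratios = {}
    for (strain, medium), value in intensities.items():
        if strain == wt:
            continue
        reference = intensities.get((wt, medium), 0.0)
        if reference > 0:
            ratios[(strain, medium)] = value / reference
        else:
            ratios[(strain, medium)] = "X" if value > 0 else None
    return ratios


# Hypothetical intensities for one metabolite in the two production media.
example = relative_to_wt({
    ("WT", "M8"): 2.0,
    ("ΔpumN", "M8"): 10.0,
    ("WT", "PumP1"): 0.0,
    ("ΔpumN", "PumP1"): 3.0,
})
```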
Nucleotide sequence accession number and Paired Omics Data Platform project identifier. The genome sequence has been deposited in GenBank under the accession CP049782 as BioProject