Genome Mining of Endophytic Streptomyces wadayamensis Reveals High Antibiotic Production Capability

The actinobacterium Streptomyces wadayamensis A23, an endophytic strain, was recently sequenced, and previous work showed qualitatively that the strain inhibits the growth of some pathogens. Herein we report the genome analysis of S. wadayamensis, which reveals several antibiotic biosynthetic pathways. Using mass spectrometry, we were able to identify desferoxamines, several antimycins and candicidin, as predicted. Additionally, comparing the biosynthetic machinery of the strain with the known metabolites identified confirmed that its biosynthetic potential is far underestimated. As suggested by qualitative biochemical tests, the genome-encoded information reveals that strain A23 has a high capability to produce antibiotics.


Introduction
The genus Streptomyces carries an extremely versatile group of biosynthetic machineries capable of producing complex molecules of medical, agricultural and economic significance. Its members are recognized as the most resourceful producers of molecules with antibiotic activity. 1 The mutualism existing among several actinobacteria and plants, ants and other organisms also seems to be a key element in the capability to produce such small molecules. 2,3 Since the pioneering work in partial genome sequencing and the more recent and wide investment in whole genome sequencing, it has become clear that the ability of Streptomyces and other actinobacteria to biosynthesize promising metabolites is underestimated. 1,4 Recent developments in next-generation sequencing revealed a diverse plethora of gene clusters encoding cryptic targets. It became clear that by accessing genome-encoded sequences it is possible to predict enzymatic functions and the role of an entire gene cluster involved in secondary metabolite biosynthesis.
Genome-mining tools have therefore opened a new and innovative avenue for natural products research. 5 When combined with bioinformatic tools, genome mining helps to envisage a global map of biosynthetic gene clusters (BGC). 11-18 Driven by the current status of genome-to-natural-products programs, we invested in the Whole Genome Shotgun (WGS) project of Streptomyces wadayamensis A23, an endophytic strain isolated from Citrus reticulata (tangerine). 19 Qualitative biochemical tests showed that strain A23 inhibits the growth of the pathogen Candida albicans, Bacillus megaterium, Neisseria meningitidis strains, and multiresistant Staphylococcus aureus. 20 Herein we report our preliminary results generated after genome annotation and mining of the WGS project. Annotation was critical to predict that S. wadayamensis has a high capability to biosynthesize antibiotics. Experimental evaluation combined with powerful analytical tools revealed the production of several bioactive metabolites, as anticipated by mining.

Chemicals and samples
Water was purified using a Milli-Q water purification system (Millipore). Beef extract, yeast extract, peptone and agar were purchased from Oxoid (Hampshire, UK). HPLC grade methanol was purchased from Tedia Brazil (Rio de Janeiro, Brazil). FeCl

Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) sample preparation
A single colony of Streptomyces wadayamensis was selected and added to 1 mL of vegetative N medium (seed). The cells were grown for 24 h at 30 °C and 250 rpm. After this time, the seed culture was added to 100 mL of the culture medium described above. Streptomyces wadayamensis was cultivated for 7 days in 500 mL Erlenmeyer flasks on a reciprocal shaker under the same conditions. The medium was centrifuged and the mycelium was separated. The broth was extracted three times with ethyl acetate. The organic layer was evaporated. The culture extract (1 mg) was dissolved in 1 mL of methanol and centrifuged for 10 min at 13,000 rpm to remove any precipitate. This solution was used as the stock solution, from which aliquots of 100 µL were removed and diluted in 1 mL of methanol.

FT-ICR-MS analysis
The equipment was an LTQ FT Ultra 7T (Thermo Scientific) equipped with a nano-electrospray ionization (ESI) source (TriVersa NanoMate 100 system). Ionization was carried out in both the negative and positive ion modes over an m/z range of 100 to 2000. The spectra generated were analyzed with the Xcalibur software.

Desorption electrospray ionization imaging mass spectrometry (DESI-IMS) sample preparation
For microbial IMS sample preparation, a 0.25 µL sample from a Streptomyces spore solution was inoculated on thin-layered agar. The antibiotic assay medium 2 (1.5 g L⁻¹ beef extract, 3.0 g L⁻¹ yeast extract, 6 g L⁻¹ peptone and 1.5% agar) was prepared by first placing a sterile microscope slide in a Petri dish, followed by pouring of the agar medium. The inoculated medium was incubated at 30 °C, and the Petri dish was sealed with parafilm to minimize premature dehydration. After the incubation period (48 h), the microscope slides were removed from the Petri dish, photographed and placed in a vacuum desiccator for complete agar dehydration at room temperature.

DESI-IMS analysis
IMS was performed using a Prosolia DESI source (Model OS-3201) connected to a Q Exactive Hybrid Quadrupole-Orbitrap. The DESI configuration was set with an emitter height of 2.5 mm, mass spectrometer inlet height of 0.1 mm, inlet-to-emitter distance of 3.8 mm, 58° spray angle, 5.0 kV spray voltage, inlet capillary temperature of 320 °C, 100 V S-lens, 160 psi ultra-pure nitrogen nebulizing gas pressure and methanol as the spray solvent at a flow rate of 3.0 µL min⁻¹. Images were collected by scanning from m/z 133 to 1500 with a step size of 200 µm, a scan rate of 741 µm s⁻¹ and a pixel size of 200 × 200 µm.
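The relationship between the reported pixel size and scan rate can be made explicit with a short calculation. The sketch below assumes that the stage moves continuously along each row and that one pixel corresponds to 200 µm of travel; the dimensions of the imaged area are hypothetical and serve only to illustrate how total acquisition time scales.

```python
def dwell_time_s(pixel_width_um: float, scan_rate_um_per_s: float) -> float:
    """Time the stage spends crossing one pixel along the scan direction."""
    return pixel_width_um / scan_rate_um_per_s


def acquisition_time_min(width_mm: float, height_mm: float,
                         pixel_um: float, scan_rate_um_per_s: float) -> float:
    """Rough total time for a rectangular area, ignoring stage fly-back between rows."""
    rows = round(height_mm * 1000 / pixel_um)            # one row per pixel height
    row_time_s = width_mm * 1000 / scan_rate_um_per_s    # time to traverse one row
    return rows * row_time_s / 60


# Pixel size and scan rate are the values reported in the text.
per_pixel = dwell_time_s(200, 741)
# The 20 mm x 10 mm area is a hypothetical example, not taken from the paper.
total = acquisition_time_min(20, 10, 200, 741)
print(f"{per_pixel:.2f} s per pixel, about {total:.0f} min for the example area")
```

With the reported settings each pixel corresponds to roughly 0.27 s of acquisition, so the number of scans averaged per pixel depends on the scan speed of the mass analyzer at the chosen resolution.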

MS and tandem mass spectrometry (MS/MS) for network analyses
For analysis of Streptomyces wadayamensis A23 metabolite profiles, cell cultures and medium were removed from the microscope slides and extracted with 0.5 mL methanol after DESI-IMS analysis. The extracts were centrifuged to remove solids, and the remaining solution was analyzed by direct infusion ESI in a Thermo Scientific Q Exactive Hybrid Quadrupole-Orbitrap mass spectrometer. The MS/MS experiments were all conducted with 3.5 kV spray voltage, an inlet capillary temperature of 320 °C, 100 V S-lens and a syringe flow rate of 3.0 µL min⁻¹.

Data analysis
The DESI-IMS data were converted into image files using the Firefly data conversion software (version 2.1.05) 21 and viewed using the BioMAP software (version 3.8.04). 22 For molecular network creation, the tandem mass data (.raw) were converted to mzXML using msconvert from ProteoWizard and submitted to the GNPS automated platform 23 with default settings. 24 In addition, Cytoscape 2.8 25 was used for molecular network visualization.
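The conversion step can be scripted when several .raw files need to be processed. The following is a minimal sketch, assuming that msconvert is installed and available on the PATH; the file and folder names are hypothetical, and no additional filters (such as peak picking) are applied because the text does not mention any.

```python
import subprocess
from pathlib import Path


def msconvert_command(raw_file: str, out_dir: str) -> list[str]:
    """Build the msconvert call that writes an mzXML copy of raw_file into out_dir."""
    return ["msconvert", str(Path(raw_file)), "--mzXML", "-o", str(Path(out_dir))]


def convert_all(raw_dir: str, out_dir: str) -> None:
    """Convert every .raw file found in raw_dir (requires msconvert on the PATH)."""
    for raw_file in sorted(Path(raw_dir).glob("*.raw")):
        subprocess.run(msconvert_command(str(raw_file), out_dir), check=True)


# Hypothetical file name, shown only to illustrate the resulting command line.
print(" ".join(msconvert_command("A23_extract.raw", "mzxml")))
```

The resulting mzXML files are what would then be uploaded to GNPS for network construction.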

Results and Discussion
As previously described, the draft genome of S. wadayamensis A23 has nearly 7 million base pairs, encoding 6006 protein-coding sequences, with 73.5% GC content. 19 The Illumina shotgun library produced 7.8 million reads, totaling 2.4 GB of data, which were assembled into 180 contigs. Genome annotation was carried out using a customized pipeline and revealed several putative secondary metabolites encoded by S. wadayamensis.
Submitting the genome data to the antibiotics & Secondary Metabolite Analysis SHell (antiSMASH) server (version 3.0.5) 26 resulted in an output predicting the presence of 32 gene clusters encoding secondary metabolite biosynthesis. Biosynthetic systems such as terpene, nrps, bacteriocin-terpene, t1-pks, t3-pks, bacteriocin, nrps-t1pks, thiopeptide-lantipeptide, lantipeptide-nrps-t1pks, siderophore, ectoine, lassopeptide, lantipeptide and other clusters are apparently encoded in the S. wadayamensis genome (Table S1). By crossing the antiSMASH output 27 with the MIBiG (Minimum Information about a Biosynthetic Gene cluster) pipeline, 28 it was possible to predict more accurately certain metabolites and/or the most likely class to which each gene cluster belongs, as presented in Table 1. However, it is important to emphasize that, although the antiSMASH platform predicts 32 possible BGC, the actual number could be underestimated, as the genome is organized in 180 contigs and some clusters are truncated (for example, clusters 4 and 16 are complementary and both correspond to the BGC of FR-008).
After in silico analysis, the metabolic profile of the ethyl acetate extracts from cultivation in several media was evaluated by FT-ICR-MS and Orbitrap in both the positive and negative ion modes using an m/z range between 100 and 2000. To identify the most common ions present in the extracts, we performed a manual search in the Dictionary of Natural Products (DNP), 29 which contains nearly 226,000 structures from wide-ranging sources.
Although there is a difference in intensity for the signals in the chromatographic profile (Figure S1), almost the same set of ions was found in the conditions tested for fermentation (data not shown).
A careful evaluation of A medium pointed to an ion of m/z 601.3552 with the attributed formula C27H48N6O9. Among the structures retrieved from the DNP database, we identified nocardamine, also known as desferoxamine E (Figure 1), a siderophore with high affinity for iron, which forms a complex that is reabsorbed by the cell via the ATP-dependent transport system. 30 We had already observed the production of siderophores by qualitative tests on agar plates containing chrome azurol sulfonate. 20 Additionally, we performed direct DESI-IMS on an agar culture and observed another siderophore of the same family, desferoxamine B, of m/z 561.3640 (Figure 1). 31 Both siderophores predicted by gene cluster 22 were therefore characterized by chemical/biological tests (chrome azurol) and MS/MS fragmentation patterns. The fragmentation of protonated desferoxamine E was mainly characterized by the neutral loss of a molecule of succinylcadaverine (200 Da, Figure S2). 32 Desferoxamine (MIBiG BGC0000941) is encoded by a highly conserved gene cluster in the genus Streptomyces (Figure 2), with essential functions for siderophore biosynthesis: siderophore biosynthetic enzyme (5), acetyltransferase (6), monooxygenase (7), pyridoxal-dependent decarboxylase (8) and siderophore-interacting protein (9).
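The formula attribution for the ion of m/z 601.3552 can be checked by computing the theoretical m/z of the protonated molecule and the mass error in ppm. The sketch below assumes a singly charged [M+H]+ ion and uses standard monoisotopic atomic masses; the function names are illustrative and are not part of any software used in this work.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; proton mass for [M+H]+.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def parse_formula(formula: str) -> dict[str, int]:
    """Turn a simple formula such as 'C27H48N6O9' into element counts."""
    counts: dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def protonated_mz(formula: str) -> float:
    """Theoretical m/z of the singly protonated molecule [M+H]+."""
    neutral = sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())
    return neutral + PROTON


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


theoretical = protonated_mz("C27H48N6O9")   # desferoxamine E, formula from the text
print(f"theoretical m/z {theoretical:.4f}, "
      f"error {ppm_error(601.3552, theoretical):.2f} ppm")
```

For the values reported here the error is below 1 ppm in absolute terms, which is consistent with the accuracy expected from FT-ICR and Orbitrap analyzers.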
A total of 6 product ions related to antimycins were also identified. Antimycins are a family of polyketide-nonribosomal peptides already isolated from Streptomyces sp., exhibiting confirmed antifungal activity, and are predicted by gene cluster 17. 33,34 The antimycins are biosynthesized by a mixed nrps-t1pks cluster, and the main differences among them are located at the R1 and R2 positions, which correspond to an aliphatic and a fatty acid chain, respectively, and characterize the main losses observed by MS/MS (Table 2). 33,34 Most of the product ions from antimycin fragmentation are common to all isoforms, e.g., m/z 263, 245, 219, 191 and 161; fortunately, however, there is often one fragment (R1, Figure 3A) that is characteristic of each antimycin variant. Therefore, based on these product ions, the production of antimycin variants such as A11, A13, A14, A16 and A20 could be excluded from our samples. As some variants have the same precursor ions and the same characteristic product ions, the fragmentation pattern may correspond, for instance, to A1a or A1b, A12 or A19, A8 or A17 (where R1 corresponds to distinct acyl chains in the structural isomers). It is therefore not possible to distinguish between them by mass spectrometry, and additional analysis, such as NMR, would be necessary to discriminate the structural isomers.
The ) of m/z 575.2994 were accordingly confirmed by A23 strain in the cultivation conditions (Figure 3 and Figure S3).
The gene cluster encoding antimycins in S. wadayamensis is 100% identical to the one described by Seipke et al. 35 and deposited in the MIBiG platform (BGC0000958), comprising genes 1 to 16 in Figure 4. The modular polyketide synthase gene antD (gene number 5, module 3, Figure 4) carries KS, AT, ACP and TE domains. According to the differences observed in the R1 chains, it is possible to infer that the AT specificity is highly atypical, allowing the incorporation of several acyl-CoA subunits. The mining also indicated a possible metabolite described by clusters 4 and 16, which belong to the FR-008 BGC, a complex polyene with high similarity to the candicidin gene cluster found in Streptomyces griseus IMRU3570 (Figure 5). 36 Chen et al. 36 originally described this metabolite and its gene cluster in 2003, and through that study it was possible to conclude that candicidin and FR-008 are identical compounds.
A meticulous search led us to find candicidin (m/z 1109.5784, C59H84N2O18, Figure 6) in the crude extract, which is most likely the compound responsible for C. albicans growth inhibition. 36 An indication of candicidin production in S. wadayamensis was already anticipated in previous studies 20 by the use of metabolic probes to promote dereplication. The KS-AT domains from FscB and FscD (gene numbers 17 and 20, respectively, Figure 7) were identified using this methodology. The candicidin gene cluster organization in the endophytic strain is 100% identical to the gene cluster described for FR-008 (BGC0000061 in the MIBiG repository).
To enhance the A23 dereplication, we performed a molecular networking analysis of A23 cultivation extracts, but no further known molecules were found. Interestingly, we were able to cross-reference several findings from FT-ICR-MS and DESI-IMS. As molecular networking clusters families of structurally related natural products by MS/MS fragmentation pattern analysis, and DESI-IMS is able to correlate molecules with the same spatial distribution during microorganism growth, it is therefore possible to establish a correlation within a related set of compounds that could share chemical similarities. 24,37 Using molecular networking dereplication, we were able to find desferoxamine B (described in cluster 22) and its iron complex (Figure 8), 31 and we also tagged three other clusters of potentially unknown natural products (cluster 1: m/z 700.645; cluster 2: m/z 1193.38; cluster 3: m/z 912.627).
IMS experiments revealed that metabolites possibly described by the gene clusters from Table 1 and Table S1 could be visualized over the agar medium around the colony. A preliminary MS/MS analysis showed that the ion of m/z 912 seems to be a linear nonribosomal peptide (NRP) not yet described (Figure S4). We failed to find a similar pattern in the networking data or in the Dictionary of Natural Products. In total, we found three different compounds (m/z 884, 898, and 912) differing by 14 u (CH2) from each other, which show a similar spatial distribution over the agar medium in the IMS data.
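The 14 u spacing between m/z 884, 898 and 912 is the signature of a homologous series, and such series can be flagged automatically in a peak list. The sketch below is illustrative only: it assumes singly charged ions, uses the exact mass of CH2 (14.01565 u) and a tolerance wide enough to accommodate the nominal m/z values quoted in the text, and the function name is hypothetical.

```python
CH2 = 14.01565  # monoisotopic mass of a methylene unit (u)


def homologous_series(mz_values: list[float], tol: float = 0.02) -> list[list[float]]:
    """Group m/z values into chains whose consecutive members differ by one CH2."""
    ordered = sorted(mz_values)
    used: set[float] = set()
    series = []
    for start in ordered:
        if start in used:
            continue
        chain = [start]
        while True:
            # Look for the next homologue, one CH2 heavier than the last member.
            nxt = next((m for m in ordered if abs(m - chain[-1] - CH2) <= tol), None)
            if nxt is None:
                break
            chain.append(nxt)
        if len(chain) > 1:
            series.append(chain)
            used.update(chain)
    return series


# m/z values as quoted in the text (nominal for the 884/898/912 series).
peaks = [700.645, 884.0, 898.0, 912.0, 1193.38]
print(homologous_series(peaks))
```

With accurate-mass data the tolerance can be tightened considerably, which reduces the chance of grouping unrelated ions that happen to differ by about 14 u.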
Finally, mining the genome revealed gene cluster 21, related to an NRPS responsible for the biosynthesis of mannopeptimycin (48% of the genes show similarity), which is produced by several strains of Streptomyces hygroscopicus. Mannopeptimycins are glycosylated cyclic hexapeptides (glycopeptides) first described as a complex mixture of compounds named AC98, which showed effective activity against Gram-positive bacteria (Figure 9). 38 Their activity was also confirmed against methicillin-resistant staphylococci and vancomycin-resistant enterococci. 39 The biosynthetic pathway for mannopeptimycins was described by Magarvey et al. 40 in 2006. Looking closely and performing BLAST and protein function comparisons, it is possible to verify that the main genes mppA and mppB, responsible for the hexapeptide core, are 100% identical in S. wadayamensis A23 (genes 23 and 24 in Figure 10A) and S. hygroscopicus NRRL 30439, revealing a similar operon-type organization (Figure 10B). Most of the genes on the right-hand side (except for mppM, mppN, mppX, mppY, mppZ, and mppZ1) are also identical in both strains. mppM and mppN correspond to isovaleryl transferases, responsible for isovaleryl incorporation in the terminal O-linked mannose in mannopeptimycins γ, δ, and ε. The absence of these genes in the BGC encoded by S. wadayamensis suggests that this strain could produce mannopeptimycins α or β. The genes mppX, mppY and mppZ have unknown functions, and mppZ1 is a transcriptional regulator. On the left-hand side, Streptomyces wadayamensis encodes a single major facilitator transporter (gene 22), while in the mannopeptimycin gene cluster used for comparison there are two genes (E and F) encoding this function. In the MIBiG entry, mppI and mppH encode polyprenyl phospho-mannosyl transferases, mppJ a methyltransferase, mppG a polyprenyl mannose synthase, mppD a sugar-binding protein and mppC an acetyltransferase. For the mannopeptimycin BGC from S. wadayamensis, it is possible to describe several genes of unknown function (labelled in grey) and two glucose-specific phosphotransferases (genes 12 and 21), among others. Under the fermentation conditions used, we could not identify the production of mannopeptimycin, but the information is genome encoded and is under investigation.
All the molecules identified so far show that Streptomyces wadayamensis A23 has a high capability to produce antibiotics. Still, we need to correlate the chemical entities with the observed biochemical activity against pathogens. Studies on the unknown metabolites produced by strain A23 are in progress.

Conclusions
The genome sequencing of Streptomyces wadayamensis A23, combined with gene annotation and mass spectrometry pattern evaluation, revealed that the strain is a promising producer of several secondary metabolites. Because this combination improves the dereplication approach, it was possible to observe the production of the siderophores desferoxamines B and E, several antimycin antibiotics and the polyene candicidin. These results open a wide field of research encompassing "from genome to natural products" as a promising strategy to find new biological and pharmacological targets.

Figure 8. (A) Annotated molecular network of Orbitrap MS/MS fragmentation data acquired from crude extract of Streptomyces A23 at 3 different growth times: 40 h (red nodes), 60 h (green nodes) and 90 h (yellow nodes). Compounds present in all experimental conditions are represented with blue nodes and other combinations with gray nodes; (B) superposed image showing the spatial distribution of m/z 912 over the medium around the microbial colony.

Table 1. Most significant gene clusters encoded in the Streptomyces wadayamensis genome (antiSMASH-MIBiG output) a
a Not all annotated gene clusters are represented in this chart.