Unveiling the Bacterial Sesquiterpenome of Streptomyces sp. CBMAI 2042 Discloses Cyclases with Versatile Performances

Terpenes are the most abundant class of natural products. They have a myriad of industrial applications, including pharmaceuticals, perfumery and flavors, bulk chemicals, and fuels. Intriguingly, the vast majority of terpenoids characterized to date have been isolated from plants and fungi, and only in recent years have bacteria been found to be a representative reservoir of terpenoid molecules. Mining the genome of Streptomyces sp. CBMAI 2042 revealed five terpene cyclase genes. Chemical analysis of the mycelium extract of this bacterial strain unveiled at least 28 volatile terpene molecules, and three sesquiterpene cyclase (STC)-encoding genes are apparently responsible for their biosynthesis. The cyclic products obtained by incubating the three purified recombinant STCs with farnesyl diphosphate (FPP) were analyzed by gas chromatography-mass spectrometry (GC-MS) and identified using the Van den Dool and Kratz equation.


Introduction
Owing to their great chemical and structural diversity, studies of specialized metabolites (SMs) are highly relevant and help us understand how organisms interact with their surroundings. In addition, the development of new drugs inspired by SMs, mainly to control human infectious diseases, has improved society's quality of life. Research on natural products remains an abundant source of new chemical entities with great chemical diversity and industrial applications. 1,2
Among the SMs, terpene metabolites are the most abundant. Terpene SMs have a range of applications and are part of daily life. About 80,000 terpenoids have been identified and cataloged in the Dictionary of Natural Products to date. 3 These metabolites have a myriad of applications, including pharmaceuticals (taxol and artemisinin), perfumery and flavors (valencene, menthol and linalool), bulk chemicals (isoprene) and fuels (farnesene, bisabolane). 4,5
Despite the variety of applications and the high demand for terpene products, this class of molecules is very difficult to access through chemical synthesis. Their production requires numerous synthetic steps and, consequently, incurs high production costs. 6 Furthermore, the extraction of terpene compounds from natural sources is not only laborious but also results in low yields and excessive consumption of valuable resources. To overcome these problems, biotechnological approaches have progressed over the last two decades and are proving to be a more sustainable and profitable alternative for producing terpenes with industrial applications. 4
Terpene molecules are derived from the C5 precursors isopentenyl diphosphate (IDP) and dimethylallyl diphosphate (DMADP), which are assembled by prenyltransferase (PT) enzymes to produce elongated polyisoprenoid chains. Enzymes known as terpene cyclases (TCs) convert these linear oligoprenyl diphosphates, via cationic intermediates, into a multitude of (poly)cyclic hydrocarbon molecules. 7,8
Interestingly, only in recent years have bacteria been recognized as important terpene producers; consequently, the vast majority of terpenoids characterized to date are derived from plants and fungi. 7,9,10 This breakthrough is due to the extensive mining of bacterial genomes in the last decade, which showed that genes encoding terpene cyclases are abundant among these organisms, including the family Streptomycetaceae. 11
Yet, studies involving bacterial terpene cyclases (BTCs) have proven to be an even greater challenge. The amino acid sequences of these enzymes show no overall similarity to TCs of plant or fungal origin, and they share low sequence identity and similarity with other BTCs. To investigate cryptic TCs in bacterial genomes, Yamada et al. 9 applied an alternative genome mining procedure based on hidden Markov models (HMMs), 12 which, in addition to aligning conserved regions in proteins, also searches the Pfam (protein family) database, 13 making it possible to correlate tertiary structure with characterized BTCs. 14 This approach expanded the number of known BTCs from 140 to 262, broadening the knowledge of this class of enzymes. 9,15
Certainly, unraveling the bacterial terpenome can be a huge challenge. Genome mining approaches can guide the search for terpene compounds; nonetheless, it is not yet possible to predict the chemical nature of the (poly)cyclic terpenoid from genome information alone.
In this genome-guided study, three recombinant sesquiterpene cyclase (STC) enzymes were produced in E. coli Rosetta(DE3), purified, and incubated with farnesyl pyrophosphate (FPP). Enzyme-substrate assays revealed that all three recombinant enzymes are active and that their incubation products are cyclic sesquiterpenes. In addition, analysis of the mycelium oil extract from Streptomyces sp. CBMAI 2042 unveiled the presence of at least 28 cyclic sesquiterpenes. [18]
The E. coli DH10B strain was used for maintenance and replication of the constructed plasmids. The E. coli Rosetta(DE3) strain was used for production of the recombinant terpene cyclase proteins. All E. coli strains were cultivated in lysogeny broth (LB) medium (LB: 5 g L⁻¹ yeast extract; 10 g L⁻¹ tryptone; 10 g L⁻¹ NaCl).

Transcription analysis assay
For transcription analysis, a pre-inoculum was prepared in 5 mL of TSBY medium containing spores of Streptomyces sp. CBMAI 2042 and the bacteriostatic agent nalidixic acid (25 μg mL⁻¹) and cultured for 2 days at 200 rpm and 28 °C. The seed culture (1%) was then added to 50 mL of TSBY containing nalidixic acid (25 μg mL⁻¹), and fermentation was conducted for 5 days at 200 rpm and 28 °C. Aliquots were collected at 48, 72, 96 and 120 h of fermentation. Total RNA was isolated using the RNeasy Mini Kit QIAGEN (São Paulo, Brazil) according to the manufacturer's protocol. The total RNA was treated with deoxyribonuclease (DNase) and then reverse transcribed with the commercial SuperScript III First-Strand Synthesis kit Invitrogen (São Paulo, Brazil). The resulting cDNA (complementary deoxyribonucleic acid) was amplified by polymerase chain reaction (PCR) with specific primers.

DNA extraction and manipulation
Genomic DNA of Streptomyces sp. CBMAI 2042 was isolated according to a protocol adapted from Sharma and Singh. 20 Plasmid pET28b(+) was extracted from the E. coli strain DH10B/pET28b(+) and purified using the commercial QIAprep Spin Miniprep kit QIAGEN (São Paulo, Brazil) following the manufacturer's protocol. Phusion GC buffer Master Mix from Thermo Scientific (São Paulo, Brazil) was used in PCR for plasmid construction. For colony PCR and screening, the enzyme My RedTaq polymerase Bioline (São Paulo, Brazil) was employed. The FastDigest® restriction endonucleases (NdeI, NheI, EcoRI, HindIII, XbaI) and T4 DNA ligase were acquired from Thermo Scientific (São Paulo, Brazil).

Recombinant STCs productions and purification
The constructed plasmids were transformed into E. coli Rosetta(DE3). A single colony was used to inoculate a pre-inoculum in LB liquid medium containing the selective antibiotics kanamycin (50 µg mL⁻¹) and chloramphenicol (34 µg mL⁻¹), which was shaken at 250 rpm and 37 °C for 16 h. Then, 1 L of LB liquid medium with selective antibiotics (5 × 1 L Erlenmeyer flasks, each containing 200 mL of LB) was inoculated with 1% of the overnight pre-inoculum and shaken at 250 rpm and 37 °C until the optical density at 600 nm (OD₆₀₀) reached 0.6. The flasks were then placed on ice, and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to 0.4 μmol L⁻¹. The flasks were shaken at 250 rpm for 16 h at 18 °C. The harvested cells were resuspended in lysis buffer (20 mM Tris-HCl; 20 mM imidazole; 1 mM MgCl₂; pH 7) and lysed by sonication (Eco-Sonics, São Paulo, Brazil). The filtered supernatant containing the recombinant His₆-tagged STC enzyme was purified by affinity chromatography on an ÄKTA pure system (GE Healthcare, Umeå, Sweden) coupled with a 1 mL HisTrap FF crude or 1 mL HisTrap HP column, using wash buffer (20 mM Tris-HCl; 20 mM imidazole; 1 mM MgCl₂; pH 7) and elution buffer (20 mM Tris-HCl; 500 mM imidazole; 1 mM MgCl₂; pH 7). Buffer exchange was accomplished by applying the purified enzyme to a 5 mL HiTrap Desalting column and eluting with reaction buffer (50 mM Tris-HCl; 1 mM MgCl₂; pH 7), which is suitable for the enzymatic assays. The Bradford method was used to quantify the purified proteins. The steps from production to purification were visualized by SDS-PAGE (sodium dodecyl sulfate-polyacrylamide gel electrophoresis) stained with commercial Coomassie Bio-Safe BioRad (São Paulo, Brazil). All centrifugation steps were carried out in a 5810R centrifuge (Eppendorf, Hamburg, Germany).

Enzyme assays
The enzymatic assay was based on the protocols described by Rabe et al. 21 and Felicetti and Cane. 22 Enzyme-substrate incubation experiments were performed using purified recombinant enzymes. Typically, 0.1 mg of enzyme was incubated in buffer (50 mM Tris-HCl; 1 mM MgCl₂; pH 7) supplemented with 50 μM farnesyl pyrophosphate ammonium salt from Cayman Chemical (Paulínia, Brazil) for 16 h at 30 °C. The reaction was overlaid with 1 mL of HPLC grade hexane, extracted, and analyzed by gas chromatography-mass spectrometry (GC-MS).

GC-MS analysis
The extracts obtained were analyzed in a mass spectrometer coupled to an Agilent 7890 gas chromatograph (Agilent, São Paulo, Brazil) equipped with an HP-5MS column, with an injector temperature of 220 °C, auxiliary temperature of 240 °C, He flow of 1 mL min⁻¹, injection volume of 1 μL, split ratio from 300:1 to 20:1 and a mass range of m/z 50-300. The oven was programmed from 60 to 165 °C at 3 °C min⁻¹ and from 165 to 280 °C at 20 °C min⁻¹, and was then held at 280 °C for 4 min.

Extraction of terpenes produced by Streptomyces sp. CBMAI 2042
After the fermentation, the culture was centrifuged. The mycelium was washed twice with autoclaved Milli-Q H₂O and centrifuged, and the harvested cell mass was kept in a Biofreezer (Sanyo, Osaka, Japan) at -80 °C for at least 1 h. It was then extracted with 200 mL of methanol in an ice bath for 90 min under magnetic stirring. Subsequently, the mycelium was removed by filtration through a Büchner funnel. The methanol extract (200 mL) was transferred to a separatory funnel and extracted with HPLC grade hexane (2 × 100 mL) to obtain the fraction rich in terpenes and other volatile hydrocarbons. To remove free fatty acids, the hexane extract was washed with 2 × 50 mL of 0.5 M NaOH in 50% methanol solution. Finally, the organic phase was washed with brine, dried over Na₂SO₄ and concentrated at 130 mbar and 20 °C to obtain 1 mL of a yellowish oil.
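As a quick check on the GC oven program given in the GC-MS analysis section (60-165 °C at 3 °C min⁻¹, 165-280 °C at 20 °C min⁻¹, then 4 min at 280 °C), the total run time can be estimated with a few lines of Python. This is only an illustrative sketch: the function name `oven_program_minutes` is hypothetical, and it assumes there is no initial hold at 60 °C, since none is stated in the text.

```python
def oven_program_minutes(ramps, final_hold_min):
    """Return the total oven time in minutes.

    ramps: list of (start_temp_C, end_temp_C, rate_C_per_min) tuples.
    final_hold_min: isothermal hold at the final temperature, in minutes.
    Assumption: no initial hold before the first ramp.
    """
    total = float(final_hold_min)
    for start_c, end_c, rate in ramps:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        total += (end_c - start_c) / rate
    return total


# Ramp values taken from the GC-MS analysis section.
ramps = [(60, 165, 3), (165, 280, 20)]
total_min = oven_program_minutes(ramps, final_hold_min=4)
print(total_min)  # 44.75
```

Under these assumptions the first ramp takes 35 min, the second 5.75 min, and the hold 4 min, so the n-alkane standards and all sesquiterpene peaks must elute within roughly 45 min.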

Results and Discussion
As previously described, 16 the actinobacterium Streptomyces sp. CBMAI 2042, an endophyte isolated from Citrus sinensis, was fully sequenced using Illumina technology (MiSeq). The linear sequence was assembled into three contigs, and the genome size was estimated at approximately 8 Mbp.
The genome sequencing data of Streptomyces sp. CBMAI 2042 were uploaded to the bioinformatics platform antiSMASH (version 6.0.1), 23 and the output revealed the presence of 35 biosynthetic gene clusters (BGCs) linked to specialized metabolite production. The annotated clusters are involved in the biosynthesis of butyrolactone, t1-polyketide synthase (PKS), t3PKS, non-ribosomal peptide synthetase (NRPS), 24,25 isorenieratene, terpene, lanthipeptide, lasso peptide, melanin, siderophore, ectoine, and the hybrid BGCs NRPS-transATPKS-t1PKS, 17 melanin-NRPS, t3PKS, thiopeptide-t1PKS-NRPS-, t2PKS-butyrolactone (Table S1, Supplementary Information (SI) section). 18
The antiSMASH platform cross-references genome information with several in silico analysis tools, including hidden Markov model analysis (HMMER 3). 26 The analysis highlighted the presence of five bacterial terpene clusters (BTCs) related to the production of bacterial terpene metabolites. Analysis of the genes contained in the terpene clusters with the NCBI BLAST tool 27 identified a core gene encoding a terpene cyclase enzyme in each BTC. Of the five BTCs predicted, three (ts-1, ts-2 and ts-3) displayed high similarity to sesquiterpene cyclase-encoding genes. Careful bioinformatics analysis indicated that the ts-1 gene was probably associated with the production of the degraded terpene geosmin and its precursor germacradienol; ts-3 could be related to caryophyllene alcohol-type metabolites; and no metabolites could be correlated with the ts-2 gene, although it was associated with a 1-10 cyclization pattern of terpene-derived metabolites.
Following the in silico study, transcriptional analysis was performed to determine whether the genes encoding sesquiterpene biosynthesis in Streptomyces sp. CBMAI 2042 were active under normal growth conditions in TSBY medium. To assess the functionality of the ts-1, ts-2, and ts-3 genes, total RNA was extracted at different times of bacterial culture in TSBY medium, and the transcription of the selected biosynthetic genes was profiled (Figure S1, SI section). The assay showed that the ts-1 gene was not transcribed under the conditions described above, whereas transcription of the ts-2 and ts-3 genes was observed between 72 and 120 h (Figure 1).
Consequently, the endophytic Streptomyces sp. CBMAI 2042 was cultured in different media, including TSBY, ISP2, and A media, and the volatile organic profile was evaluated (Figure 2). The analysis revealed that the oils extracted from the mycelium of CBMAI 2042 had a greater diversity of peaks associated with terpene metabolites when the strain was cultured in ISP2 and A media (Figure S2, SI section). In contrast, when it was cultured in TSBY, the analyzed oil showed a smaller range of terpenes, possibly due to the absence of peaks corresponding to metabolites associated with ts-1 gene expression. This result is consistent with the data observed during transcriptional analysis in TSBY medium (Figure 2).
Using GC-MS analysis, we identified at least 28 sesquiterpenes in the mycelial extracts when the bacterium was cultured in A medium, and 21 sesquiterpenes when it was cultured in ISP2 medium. Although the terpene peaks identified in the chromatogram had different retention times, they had similar fragmentation patterns, making it difficult to classify their chemical signature by searching only the NIST11 and FFNSC2 libraries.
To characterize the sesquiterpene molecules, it was necessary to obtain the arithmetic index (AI). As the Kováts index 28 is calculated under isothermal conditions, the method described by Van den Dool and Kratz, 29 equation 1, was adopted. 30 In this method, a temperature ramp was used to separate the terpene compounds, and the AI of each unidentified terpene was calculated through an arithmetic relationship between its retention time and the retention times of the bracketing n-alkane standards (Table 1).
\[ \mathrm{AI} = 100 \left[ P_z + \frac{t_R(a) - t_R(P_z)}{t_R(P_{z+1}) - t_R(P_z)} \right] \qquad (1) \]
where $t_R(a)$ represents the sample retention time, $t_R(P_z)$ the retention time of the n-alkane standard peak that precedes the sample peak, $t_R(P_{z+1})$ the retention time of the n-alkane standard peak that follows the sample peak, and $P_z$ the carbon number of the n-alkane with retention time $t_R(P_z)$.
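The Van den Dool and Kratz relationship can be illustrated with a short calculation. The following is a minimal sketch in Python, not the authors' actual workflow: the function name `arithmetic_index`, the dictionary layout, and the n-alkane retention times are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
from bisect import bisect_right


def arithmetic_index(t_sample, alkane_rts):
    """Arithmetic (linear retention) index after Van den Dool and Kratz.

    t_sample:   retention time of the unknown peak (min).
    alkane_rts: dict mapping n-alkane carbon number -> retention time (min),
                measured under the same temperature program.
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if times != sorted(times):
        raise ValueError("alkane retention times must increase with carbon number")

    # Index of the last n-alkane eluting at or before the sample peak.
    i = bisect_right(times, t_sample) - 1
    if i < 0 or i >= len(carbons) - 1:
        raise ValueError("sample peak is not bracketed by two n-alkane standards")
    if carbons[i + 1] != carbons[i] + 1:
        raise ValueError("bracketing n-alkanes must differ by one carbon")

    pz = carbons[i]
    t_pz, t_pz1 = times[i], times[i + 1]
    return 100 * (pz + (t_sample - t_pz) / (t_pz1 - t_pz))


# Hypothetical retention times (min) for C14-C16 n-alkanes.
alkanes = {14: 20.0, 15: 24.0, 16: 28.0}
print(arithmetic_index(22.0, alkanes))  # 1450.0
```

A peak eluting exactly halfway between the C14 and C15 standards therefore receives an index of 1450. Experimental indices computed this way can then be compared with literature values alongside the mass spectral library matches.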
To relate the mining information to the terpene chemotypes, all identified STC core genes ts-(1-3) were then amplified by PCR and cloned into the pET28b(+) system (Figures S3 and S4, SI section). For production of the recombinant terpene cyclases enzTS-(1-3), the pET28b(+)::ts-(1-3) plasmid constructs were transformed into E. coli strains BL21(DE3) and Rosetta(DE3) (Table S2, SI section). Enzyme production experiments were performed using the E. coli Rosetta(DE3) system, which carries the chloramphenicol-resistant pRARE plasmid encoding tRNAs for codons rarely used in E. coli, because in silico analysis via the ATGme platform 33 confirmed that the ts-(1-3) genes contain a high percentage of codons that are rare in E. coli. This system allowed us to obtain appreciable amounts of the recombinant enzymes in the soluble fraction.
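The idea behind the rare codon check can be sketched in a few lines of Python. This is not the ATGme algorithm; it is a simplified illustration that assumes a fixed set of six codons (AGG, AGA, AUA, CUA, CCC, GGA, written here in DNA form) commonly cited as rare in E. coli and supplemented by pRARE-type plasmids. The function name `rare_codon_fraction` and the toy sequence are hypothetical.

```python
# Assumed set of codons that are rare in E. coli (DNA alphabet).
RARE_CODONS = {"AGG", "AGA", "ATA", "CTA", "CCC", "GGA"}


def rare_codon_fraction(cds):
    """Return the fraction of codons in an in-frame coding sequence
    that belong to RARE_CODONS."""
    cds = cds.upper().replace("U", "T")
    if not cds or len(cds) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 nt")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    rare = sum(codon in RARE_CODONS for codon in codons)
    return rare / len(codons)


# Toy sequence (not a real ts gene): ATG AGG CCC GGA TAA
print(rare_codon_fraction("ATGAGGCCCGGATAA"))  # 0.6
```

For a real gene, a high fraction would suggest that a host supplying the corresponding tRNAs, such as Rosetta(DE3), may improve expression; high-GC Streptomyces genes often fall into this category.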
The purified enzTS-(1-3) enzymes (Figure S5, SI section) were conditioned in reaction buffer and incubated with the elongated precursor FPP; the blank assay was performed without the addition of recombinant enzymes to the reaction mixture. Examination of the chemotype profile by GC-MS confirmed that the enzymes enzTS-(1-3) actively converted FPP to cyclic terpene derivatives (Figures S6-S8, SI section). After searching the NIST11 and FFNSC2 mass spectral libraries, all biotransformation products of the terpene cyclases predicted in this study showed fragmentation patterns characteristic of sesquiterpenes (Figure S9, SI section). Similarly, AIs were calculated for the diagnostic cyclic products (Table 2). As predicted by blastp, the incubation of enzTS-1 with FPP resulted in the production of six cyclic terpenes, including the degraded terpenes 8,10-dimethyl-1-octalin (2), 8,10-dimethyl-1(9)-octalin (3), and geosmin (7), and the sesquiterpenes germacrene D (17) and 1(10)E,5E-germacradien-11-ol (29), plus an unidentified product (18). According to these results, the terpene cyclase enzTS-1 can be classified as a "germacradienol/geosmin synthase". 32,34,35
Analysis of the GC-MS results for the reaction between enzTS-2 and FPP revealed at least 12 volatile sesquiterpene products, including nonpolar terpenes such as β-gurjunene (12) and trans-cadina-1,4-diene (24) and the alcohols germacrene D-4-ol (26) and 1-epi-cubenol (28). The sesquiterpenes β-elemene (6) and δ-elemene (30) were possibly produced via Cope rearrangement due to the injection temperature used (above 200 °C). 36
As the main products of the biotransformation reaction are alcohols, enzTS-2 can be described as a "germacrene D-4-ol/1-epi-cubenol synthase", with germacrene D-4-ol (26) being the main product, followed by 1-epi-cubenol (28), in the in vitro enzyme-substrate assays. As predicted by in silico analysis, all products formed are derived from type 1-10 cyclization. Nevertheless, compound 26 was observed at low concentration in the Streptomyces sp. CBMAI 2042 mycelium extract, and only when the strain was cultivated in A medium, which indicates that the formation of this sesquiterpene in vitro might be related to reaction conditions such as culture supplements, pH and temperature. 37 Furthermore, enzTS-2 proved to be a terpene cyclase with promiscuity towards the formation of cyclic products, making it a suitable and promising target enzyme for biotechnological applications.
On the other hand, the incubation of enzTS-3 with FPP showed only one peak, with β-caryophyllene (10) being the major product of the reaction. However, caryophyllene alcohol (25), with a calculated arithmetic index of 1565, was also expected, as it was observed in vivo. In spite of that, enzTS-3 can be classified as a "β-caryophyllene synthase". 38 Interestingly, among the sesquiterpene cyclases studied, enzTS-3 produces a single product, β-caryophyllene (10), with 1-11 type cyclization.
The in vitro assays with recombinant terpene cyclase enzymes supported the identification of 18 sesquiterpenes, 10 fewer than observed in the analysis of mycelial extracts of the wild-type strain. In vitro experiments cannot always reproduce exactly the behavior of the native organism, owing to the absence of auxiliary enzymes or to the different physiological environment characteristic of heterologous expression. For instance, the in vitro experiments showed the absence of caryophyllenyl alcohol (25) and the formation of germacrene D-4-ol (26) at higher concentrations than in the in vivo experiments, supporting this hypothesis. This bias, although not fully elucidated, may have been influenced by a subtle difference in the folding of the recombinant proteins in the in vitro assays as well as by a different chemical environment. 11,37,39,40
The current results reinforce the fundamental importance of studying STCs, since their promiscuity toward the natural substrate qualifies such enzymes for industrial applications. In a study by Oberhauser et al., 41 eight recombinant STCs were tested against six unnatural heteroatom-containing FPP analogues; the assays resulted in the production of six unnatural heteroatom-modified cyclic terpenoids, one of which is a tricyclic terpenoid with interesting olfactory properties.
Additionally, studies performed with the recombinant enzyme germacrene D synthase (GDS) in the presence of FPP analogues generated unnatural SMs, including (S)-14,15-dimethylgermacrene D. Interestingly, unlike germacrene D, which repels aphids, (S)-14,15-dimethylgermacrene D acted as an attractant for these insects. 42
Unfortunately, the generation of unnatural terpenoids has been limited by the complexity of the chemical synthesis and subsequent purification of the products. However, Johnson et al. 43 recently reported the production of FPP analogues using a cell-free system. In this approach, promiscuous kinase enzymes were used to obtain pyrophosphate derivatives of prenol and isoprenol, and the isoprene chain was elongated using prenyl transferases. The use of prenol analogues in different proportions made it possible to obtain FPP analogues in excellent yields, avoiding complex synthetic steps and difficulties with purification.
Advances in this area have expanded the terpenoid repertoire, allowing these compounds to be used as starting materials in semi-synthesis. Improvements in the use of cell-free systems to obtain unnatural precursors enable this strategy to be applied on larger scales in the flavor and food, agrochemical, and pharmaceutical industries. In this sense, the terpenome of the recombinant STCs from Streptomyces sp. CBMAI 2042 introduced in this study can be further expanded by performing incubation assays with unnatural precursors, as well as by site-directed mutagenesis for enzyme optimization, making it possible to obtain new derivatives of cyclic terpene SMs.

Conclusions
The genome of the endophytic bacterium Streptomyces sp. CBMAI 2042 was fully sequenced, revealing 35 biosynthetic gene clusters involved in the biosynthesis of secondary metabolites, five of which are involved in the biosynthesis of terpene specialized metabolites. In silico analysis revealed the presence of three genes encoding sesquiterpene cyclase enzymes, qualifying the strain as a potential producer of volatile sesquiterpene SMs.
To gain deeper insight into the biosynthetic machinery of this strain and to expand the knowledge of bacterial sesquiterpene cyclases, we performed fermentation experiments in different culture media and extracted the volatile fraction. As a result, we observed the production of at least 28 sesquiterpenes by the native strain. Moreover, the mapped recombinant STC enzymes enzTS-(1-3) were produced in their active form and evaluated in vitro in the presence of the natural substrate farnesyl pyrophosphate. Using calculated arithmetic indices, mass spectrometry and comparison with literature data, we identified the sesquiterpene chemotypes produced in vivo and in vitro.
The technical and scientific knowledge generated by unveiling the bacterial terpenome of Streptomyces sp. CBMAI 2042 contributes in several ways to the development of this large area of biotechnology. In this work, we gained knowledge about the functioning of three bacterial terpene cyclases; in addition, through sequencing and careful mining of STCs in the bacterial genome, we contributed enzyme sequences to databases, where they can be further explored.
Moreover, with these recombinant enzymes in hand, incubation assays with unnatural precursors can be performed, allowing the generation and expansion of new terpenoid backbones. Finally, deepening the knowledge of STC enzymes will enable the development of sustainable systems with applications in several areas, including the agrochemical and medicinal sectors.

Figure 1. Transcription analysis of genes (a) ts-1; (b) ts-2 and (c) ts-3 at 48, 72, 96 and 120 h; as a positive control, genomic DNA of Streptomyces sp. CBMAI 2042 was used as the template for the PCR reaction.

Figure 2. Chromatographic profile of the hexane extract from Streptomyces sp. CBMAI 2042 mycelium analyzed by GC-MS in (a) A medium; (b) ISP2 medium; and (c) TSBY medium. Compound numbers were assigned in ascending order of retention time as shown in Table 1.

Table 1. Bacterial terpenes uncovered in the GC-MS analysis of the hexane extracts of Streptomyces sp. CBMAI 2042 mycelium cultivated in A, ISP2 and TSBY media.
a Compound numbers were assigned in ascending order of retention time; b retention time; c experimental arithmetic index; d arithmetic index in literature; e internal standard; f generated from germacrene A by Cope rearrangement in the injection port of the GC apparatus.

Table 2. GC-MS analysis of in vitro incubation of recombinant enzymes enzTS-(1-3) with farnesyl pyrophosphate.
a Experimental arithmetic index; b terpene cyclase recombinant enzyme; c arithmetic index in literature; d generated from germacrene A by Cope rearrangement in the injection port of the GC apparatus.