Discovery and functional characterization of two diterpene synthases for sclareol biosynthesis in Salvia sclarea (L.) and their relevance for perfume manufacture

Background Sclareol is a diterpene natural product of high value for the fragrance industry. Its labdane carbon skeleton and its two hydroxyl groups also make it a valued starting material for semisynthesis of numerous commercial substances, including production of Ambrox® and related ambergris substitutes used in the formulation of high-end perfumes. Most commercially produced sclareol is obtained by extraction from cultivated clary sage (Salvia sclarea). In clary sage, sclareol mainly accumulates in essential oil-producing trichomes that densely cover flower calyces. Manool is also a minor diterpene of this species and the main diterpene of related Salvia species.
Results Based on general knowledge of diterpene biosynthesis in angiosperms, and on mining of our recently published transcriptome database obtained by deep 454-sequencing of cDNA from clary sage calyces, we cloned and functionally characterized two new diterpene synthase (diTPS) enzymes for the complete biosynthesis of sclareol in clary sage. A class II diTPS (SsLPPS) produced labda-13-en-8-ol diphosphate as major product from geranylgeranyl diphosphate (GGPP) with some minor quantities of its non-hydroxylated analogue, (9S,10S)-copalyl diphosphate. A class I diTPS (SsSS) then transformed these intermediates into sclareol and manool, respectively. The production of sclareol was reconstructed in vitro by combining the two recombinant diTPS enzymes with the GGPP starting substrate and in vivo by co-expression of the two proteins in yeast (Saccharomyces cerevisiae). Tobacco-based transient expression assays of green fluorescent protein-fusion constructs revealed that both enzymes possess an N-terminal signal sequence that actively targets SsLPPS and SsSS to the chloroplast, a major site of GGPP and diterpene production in plants.
Conclusions SsLPPS and SsSS are two monofunctional diTPSs which, together, produce the diterpenoid specialized metabolite sclareol in a two-step process. They represent two of the first characterized hydroxylating diTPSs in angiosperms and generate the dihydroxylated labdane sclareol without requirement for additional enzymatic oxidation by activities such as cytochrome P450 monooxygenases. Yeast-based production of sclareol by co-expression of SsLPPS and SsSS was efficient enough to warrant the development and use of such technology for the biotechnological production of sclareol and other oxygenated diterpenes.


Background
Diterpenoids constitute a large class of chemically diverse metabolites that is widely distributed throughout the plant kingdom with more than 12,000 known compounds, the majority of which derives from bicyclic 'labdane-related' diterpene intermediates [1]. These include the gibberellin phytohormones as part of general (i.e. primary) plant metabolism with essential roles in plant growth and development [2][3][4] and a plethora of specialized (i.e. secondary) metabolites with essential functions in ecological interactions of plants with other organisms, including attraction of pollinators or defense against pests or pathogens [5][6][7][8].
Because of their various biological activities in humans, diterpenoids of plant origin are of substantial economic relevance as bioproducts for a variety of applications, for example as pharmaceuticals or as fragrance components [9,10].
Sclareol (Figure 1) is a labdane-type diterpene alcohol, which has been reported in four plant species of four different families: Salvia sclarea (Lamiaceae) [11], Cistus creticus (Cistaceae) [12], Nicotiana glutinosa (Solanaceae) [13] and Cleome spinosa (Brassicaceae) [14]. Although sclareol has been suggested to possess anti-bacterial, antifungal and growth regulating activities, its function in planta is unclear [12,15-19]. A major use of sclareol is in the fragrance industry. Sclareol is the most common starting material for the synthesis of Ambrox® [20], which serves as a valuable and sustainable substitute for ambergris [21], a waxy substance secreted by sperm whales. Ambergris has historically been appreciated for its musky and sweet earthy odor and has been used for many years as a fixative in high-end perfumes. However, its origin from an endangered and protected animal species made the use of ambergris in the fragrance industry controversial.
Clary sage (Salvia sclarea) is the plant species predominantly used for production and isolation of sclareol. It is a native species of the Mediterranean basin, Southern Europe and Iran, and is commercially grown mostly in Europe (France, Hungary, Bulgaria) and North America for its essential oils [22]. Despite successful cultivation of clary sage, annual production and availability of sclareol varies substantially due to uncertain environmental factors.
Figure 1 Proposed biosynthetic pathway of sclareol and related diterpenes in Salvia sclarea. The suggested biosynthetic pathway of sclareol 4 as the predominant diterpene in S. sclarea and other minor constituents detected in planta, such as manool 6 as well as manoyl oxide 7 and 13-epi-manoyl oxide 8, requires the activity of at least two monofunctional diTPSs. A class II enzyme catalyzes the protonation-initiated cyclization of (E,E,E)-geranylgeranyl diphosphate (GGPP) 1 to form labda-13-en-8-ol diphosphate (LPP) 3 or (9S,10S)-copalyl diphosphate (CPP) 5 (i.e., CPP of normal or (+)-stereochemistry) via a labda-13-en-8-yl diphosphate carbocation 2. Catalyzed by class I diTPS activity, ionization of the diphosphate ester of LPP 3 or CPP 5 results in the formation of sclareol 4 and manool 6, respectively. In addition, manoyl oxide 7 and 13-epi-manoyl oxide 8 may occur as products of this biosynthetic pathway.
The development of a cost-efficient and scalable enzymatic production platform would improve the reliability of sclareol production. However, the genes and enzymes responsible for sclareol biosynthesis have not been described.
Based on the established general patterns of diterpene biosynthesis in angiosperms, we propose that biosynthesis of sclareol in clary sage may involve the activity of two monofunctional diTPSs (Figure 1). In a plausible sequence of diTPS activities, a class II diTPS may first catalyze the bicyclization of GGPP (1) and water capture at C-8 to afford labda-13-en-8-ol diphosphate (LPP, 3), similar to the function of copal-8-ol diphosphate synthase (CcCLS) from C. creticus [25]. Subsequently, a class I diTPS may convert LPP through cleavage of the diphosphate group and may also catalyze the additional hydroxylation at C-13 to form sclareol (4). A recent patent [36] reported two genes [GenBank: AET21247, GenBank: AET21246] coding for similar enzymatic activities in S. sclarea, but lacked a complete description of the enzyme activities. Hydroxylation reactions in the class I active site of diTPSs have been reported for bifunctional class I/II diTPSs outside of the angiosperms, namely copalyl diphosphate/kaurene synthases (CPS/KS) from the nonvascular plants Physcomitrella patens and Jungermannia subulata [28,37], labda-7,13-dien-15-ol synthase from the lycophyte Selaginella moellendorffii [27], and levopimaradiene/abietadiene synthase from Picea abies (PaLAS) [26].
Using previously established transcriptome sequence resources for clary sage calyces [38], we describe here the isolation of full-length (FL)-cDNAs of a class II diTPS (SsLPPS) and two class I diTPSs (SsSS and SsdiTPS3). We show that the enzymes encoded by SsLPPS and SsSS catalyze the direct formation of sclareol without the requirement of a P450-mediated hydroxylation. We demonstrate the subcellular localization of both sclareol-biosynthetic diTPSs in plastids. Initial efforts to engineer sclareol biosynthesis in yeast established promising leads for the future development of microbial production systems for sclareol using plant enzymes.

Results
Transcriptome mining and discovery of SsLPPS, SsSS and SsdiTPS3 cDNAs
We hypothesized that sclareol is synthesized from GGPP through a two-step mechanism involving a pair of class II and class I monofunctional diTPSs (Figure 1). Given the high abundance of sclareol in metabolite extracts of clary sage calyces, this tissue was subjected to 454 pyrosequencing, which revealed six different diTPS candidate sequences [38]. Additional data mining of the 454 sequences allowed the retrieval of two additional sequences with homology to known diTPSs. Full-length sequencing of the cDNAs of these eight candidate sequences recovered by 5'- and 3'-RACE (Additional file 1: Figure S1) revealed that they were independent parts of three separate diTPS genes: a class II diTPS (SsLPPS) containing the characteristic DxDD motif, and two class I diTPSs (SsSS and SsdiTPS3) carrying the conserved DDxxD and NSE/DTE functional motifs.
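The motif-based class assignment described above can be illustrated with a small sequence scan. This is only a sketch under stated assumptions: the regular expression used for the NSE/DTE motif is a commonly quoted consensus, (N/D)Dxx(S/T)xxxE, rather than a pattern given in this paper, and the two peptides are hypothetical toy strings, not SsLPPS or SsSS sequence.

```python
import re

# Motif names follow the text; "x" stands for any residue.
# The NSE/DTE regular expression is an assumed consensus, (N/D)Dxx(S/T)xxxE,
# and is not taken from this paper.
MOTIFS = {
    "DxDD": r"D.DD",                # class II signature
    "DDxxD": r"DD..D",              # class I signature
    "NSE/DTE": r"[ND]D..[ST]...E",  # class I signature (assumed pattern)
}


def find_motifs(protein):
    """Map each motif found in `protein` to its 1-based start positions."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # The lookahead reports overlapping occurrences too.
        starts = [m.start() + 1 for m in re.finditer(f"(?={pattern})", protein)]
        if starts:
            hits[name] = starts
    return hits


def classify(protein):
    """Assign a candidate diTPS class from motif content alone."""
    hits = find_motifs(protein)
    class_ii = "DxDD" in hits
    class_i = "DDxxD" in hits and "NSE/DTE" in hits
    if class_ii and class_i:
        return "bifunctional class I/II candidate"
    if class_ii:
        return "class II candidate"
    if class_i:
        return "class I candidate"
    return "unassigned"


# Hypothetical toy peptides, not real SsLPPS or SsSS sequence.
toy_class_ii = "MAAADIDDTAMK"
toy_class_i = "MKDDFFDVAANDLATGKYEL"

print(classify(toy_class_ii), find_motifs(toy_class_ii))
print(classify(toy_class_i), find_motifs(toy_class_i))
```

Motif content only nominates candidates; as the functional assays show for SsdiTPS3, a sequence can carry class I motifs and still lack detectable activity.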

Phylogenetic analysis
Phylogenetic comparison of the translated FL-cDNA sequences confirmed the assignment of SsLPPS to the TPS-c subfamily of angiosperm class II diTPSs [39,40] (Figure 2). Specifically, SsLPPS is more closely related to class II diTPSs that are involved in specialized metabolism, such as (9S,10S)-CPP (i.e. CPP of normal stereochemistry) synthase from Salvia miltiorrhiza [24] and copal-8-ol diphosphate synthase from C. creticus (CcCLS) [25]. SsSS and SsdiTPS3 can be assigned to the TPS-e/f and TPS-e families, respectively, which contain KS-like enzymes involved in general or specialized diterpene metabolism [39].
Interestingly, SsSS lacks the internal γ-domain, which is characteristic of the archetype three-domain structure of plant diTPSs [41][42][43][44]. While the γ-domain is essential during class II diTPS catalysis, the active site of a monofunctional class I diTPS is located in the α-domain. SsSS is closely related to a recently reported class I diTPS from S. miltiorrhiza that produces miltiradiene and exhibits a similar loss of the γ-domain [24,43]. Together, the phylogenetic relation and domain structure suggested that SsSS encodes a monofunctional class I diTPS involved in specialized metabolism. In contrast, SsdiTPS3 exhibits the common αβγ architecture and shares only 23.5% identity with SsSS. Its closer relation to ent-kaurene synthases within the TPS-e subfamily may suggest a function in general rather than specialized metabolism.
In summary, SsLPPS, SsSS and SsdiTPS3 represent the three different subfamilies of angiosperm diTPSs involved in the biosynthesis of labdane-type diterpenoids ( Figure 2).

Functional characterization of S. sclarea diTPSs and discovery of sclareol synthase
While the FL-cDNA of SsLPPS was directly amplified from calyx cDNA, the obtained FL-cDNAs of SsSS and SsdiTPS3 were subjected to codon-optimization for expression of the synthesized genes in E. coli. To investigate the catalytic activity of SsLPPS, SsSS and SsdiTPS3, N-terminally truncated constructs were generated that lack putative plastidial targeting peptides (Additional file 1: Figure S1). Recombinant proteins were expressed in E. coli and Ni2+-affinity purified, resulting in soluble proteins of the expected molecular weight of 83 kDa for SsLPPS, 61 kDa for SsSS, and 85 kDa for SsdiTPS3. Initial in vitro enzyme assays were carried out with GGPP as substrate to test the three enzymes for diTPS activity. For the characterization of SsLPPS, the reaction products were dephosphorylated prior to GC-MS analysis. By comparison to reference mass spectra databases (NIST, Wiley W9N08L) and the product of CcCLS [25], the major product of SsLPPS was identified, after dephosphorylation, as labda-13-en-8,15-diol (labdenediol) (Figure 3 and Additional file 2: Figure S2). Labdenediol, which was absent from the product profile without enzymatic dephosphorylation, is the dephosphorylated form of labda-13-en-8-ol diphosphate (LPP, Figure 1). Additional minor components of the SsLPPS product profile were epi-manoyl oxide, manoyl oxide, traces of sclareol, copalol, 13(16),14-labdien-8-ol, and an unidentified diterpene compound, with the latter three compounds being of too low abundance to allow unambiguous identification (Figure 3 and Additional file 2: Figure S2). Additional LC-MS analysis on non-dephosphorylated reaction products confirmed LPP as the major product of SsLPPS (Figure 4), which identified SsLPPS as an LPP synthase.
SsSS and SsdiTPS3 were not active with GGPP as substrate. We further tested SsSS and SsdiTPS3 in coupled enzyme assays with SsLPPS. In assays with SsLPPS and SsSS, sclareol was the major product with minor amounts of manool, manoyl oxides and epi-manoyl oxide as secondary products ( Figure 5 and Additional file 2: Figure S2). These products were identified by comparison to the authentic compounds and reference mass spectra. According to these results from in vitro enzyme assays, SsSS was identified as a monofunctional class I sclareol synthase, which converts LPP produced by SsLPPS to sclareol, as the second diTPS-catalyzed reaction in sclareol biosynthesis ( Figure 1).
Combination of SsLPPS and SsdiTPS3 yielded no additional product peaks as compared to SsLPPS alone, suggesting that SsdiTPS3 is not able to use LPP as a substrate (Figure 5). Additional coupled assays with Zea mays ent-CPS [45] and a protein variant of Picea abies PaLAS producing (9R,10R)-CPP (i.e. ent-CPP) and (9S,10S)-CPP as alternative substrates, respectively, also did not reveal activity of SsdiTPS3 with CPP (data not shown).
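The outcome of the single and coupled assays can be summarized as a small substrate-to-product map. The sketch below is an illustrative model only: the function name `end_products` and the data layout are assumptions made for this example, minor products such as the manoyl oxides are left out, and the CPP-to-manool step is the route proposed in this paper rather than one assayed with isolated CPP.

```python
# Major conversions reported for each enzyme (minor products omitted).
REACTIONS = {
    "SsLPPS": {"GGPP": ["LPP", "CPP"]},                # class II diTPS
    "SsSS": {"LPP": ["sclareol"], "CPP": ["manool"]},  # class I diTPS (CPP route proposed)
    "SsdiTPS3": {},                                    # no activity detected
}


def end_products(enzymes, substrate="GGPP"):
    """Follow the reaction map until no enzyme in `enzymes` accepts a compound."""
    frontier = [substrate]
    terminal = set()
    while frontier:
        compound = frontier.pop()
        converted = [p for e in enzymes for p in REACTIONS[e].get(compound, [])]
        if converted:
            frontier.extend(converted)
        else:
            terminal.add(compound)
    return terminal


for combo in (["SsSS"], ["SsLPPS"], ["SsLPPS", "SsSS"], ["SsLPPS", "SsdiTPS3"]):
    print(" + ".join(combo), "->", sorted(end_products(combo)))
```

In this model, SsSS alone leaves GGPP untouched, SsLPPS alone stops at the diphosphate intermediates, and only the SsLPPS plus SsSS pair reaches sclareol and manool, which mirrors the assay results described above.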

Sclareol production in engineered yeast
To substantiate our results of SsLPPS and SsSS functions determined in in vitro assays, we used heterologous expression in yeast (S. cerevisiae) for additional in vivo assays. Metabolically-engineered yeast would also be a suitable biological system for scalable, and potentially industrial, production of sclareol. In a modular engineering approach, yeast cells were co-transformed with a S. cerevisiae GGPP synthase (ScGGPPS) [29] and SsLPPS alone, or in combination with SsSS or SsdiTPS3. Yeast strains expressing only ScGGPPS or carrying an empty vector were used as controls. After induction with galactose, both culture media and yeast cell pellets were collected and extracted with pentane. GC-MS analysis of the resulting pentane extracts showed similar results to the in vitro enzyme assays described above. Only the combination of SsLPPS and SsSS afforded sclareol as the major product, while expression of SsLPPS alone resulted in only trace amounts of sclareol (Figure 6). Co-expression of SsLPPS with SsdiTPS3 did not yield any additional product formation compared to expression with SsLPPS or ScGGPPS alone. Even though sclareol yield and purity in the culture media were not quantitatively measured, its accumulation in the medium suggests an active or passive release from the engineered yeast cells.
GC-MS analysis was performed on an Agilent HP5ms column with electron impact ionization at 70 eV. Results were confirmed with three independent experiments. Identification of reaction products was achieved by comparison to authentic standards or reference mass spectra from the National Institute of Standards and Technology MS library searches (Wiley W9N08L): peak a, epi-manoyl oxide 8; peak b, manoyl oxide 7; peak c, putative 13(16),14-labdien-8-ol; peak d, putative copalol; peak e, sclareol 4; peak f, unknown compound; peak g, labda-13-en-8,15-diol; peak h, geranylgeraniol.
These findings outline a promising basis for developing microbial production systems as an alternative to production in plants, where a clary sage field will produce from 10 to 15 kg of inflorescence per hectare every other year (biennial plant) with a sclareol content of 1 to 2.5%. Sclareol is then extracted with an organic solvent and purified by a physical process to 95% purity. The overall extraction and purification yield can be up to around 35%.
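As a rough illustration of what the plant-based figures imply, the arithmetic can be chained as biomass times sclareol content times recovery. This sketch takes the numbers quoted above at face value and treats the 35% recovery as an upper bound; the function name is an assumption made for this example.

```python
def recovered_sclareol_kg(inflorescence_kg, sclareol_fraction, recovery_fraction):
    """Purified sclareol obtained from one harvest of one hectare."""
    return inflorescence_kg * sclareol_fraction * recovery_fraction


# Lower and upper ends of the quoted ranges, with recovery fixed at 35%.
low = recovered_sclareol_kg(10, 0.01, 0.35)
high = recovered_sclareol_kg(15, 0.025, 0.35)

print(f"{low * 1000:.0f} to {high * 1000:.0f} g of sclareol per hectare per harvest")
```

Because the crop is harvested every other year, the annualized figure would be half of this range.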

Subcellular localization of SsLPPS and SsSS
The biosynthesis of sclareol is believed to occur in plastids as the substrate GGPP is predominantly derived from the plastidial 2-C-methyl-D-erythritol-4-phosphate (MEP) pathway [40]. To validate this hypothesis individually for SsLPPS and SsSS, we evaluated their subcellular distribution by transient expression of individual green fluorescent protein (GFP)-fusion proteins and confocal laser scanning microscopy. For this purpose, the putative plastidial signal peptides (SP) of SsLPPS and SsSS were fused in frame with the N-terminus of GFP and transiently expressed in N. benthamiana leaves. Confocal microscopy two to four days after transformation demonstrated a plastidial localization of the SP-GFP fusions for SsLPPS and SsSS (Figure 7).

Discussion
Naturally occurring diterpenol metabolites such as manool, cis-abienol or sclareol are of high value to the fragrance industry. For example, sclareol is commercially produced from clary sage plantations and used as a primary material in perfume manufacture.
In this study, we demonstrated that sclareol is biosynthesized through a two-step cyclization of GGPP, by two monofunctional diTPSs isolated from clary sage, namely SsLPPS and SsSS. We showed that the introduction of oxygen functionalities in sclareol biosynthesis is catalyzed by diTPSs and does not require activity of, for example, cytochrome P450 monooxygenases.
Similar to a class II diTPS from C. creticus (CcCLS) [25], SsLPPS catalyzes the formation of LPP through a protonation-initiated cyclization of the substrate GGPP, followed by capture of a hydroxyl ion at C-8, as previously suggested. After the recent report of CcCLS [25], SsLPPS is only the second monofunctional class II diTPS reported to facilitate the formation of an oxygen-containing diterpene core structure. The close phylogenetic relationship to other class II diTPSs within the TPS-c family indicates that SsLPPS arose from a CPS gene potentially involved in general gibberellin metabolism via gene duplication and neo-functionalization, resulting in a diTPS for the formation of LPP as the major product in specialized metabolism.
Results were confirmed with three independent experiments. Identification of reaction products was achieved by comparison to authentic standards or reference mass spectra from the National Institute of Standards and Technology MS library searches (Wiley W9N08L): peak a, epi-manoyl oxide 8; peak b, manoyl oxide 7; peak c, putative 13(16),14-labdien-8-ol; peak d, putative copalol; peak e, sclareol 4; peak f, unknown compound; peak g, labda-13-en-8,15-diol; peak i, manool 6.
The subsequent SsSS-catalyzed class I reaction proceeds via the ionization of the diphosphate ester of LPP and hydroxylation at C-13, generating the diterpenediol product, sclareol. In contrast to the newly identified monofunctional SsSS from an angiosperm plant species, all of the previously reported diTPSs that catalyze hydroxylations during class I reactions were bifunctional class I/II diTPSs from non-vascular or gymnosperm plants. These include CPS/KS from P. patens, which produces ent-16α-hydroxy-kaurene [37], CPS/KSL from S. moellendorffii, which forms labda-7,13-dien-15-ol [27], and PaLAS, recently shown to produce the tertiary diterpenol 13-hydroxy-8(14)-abietene [26]. With the additional hydroxylation at C-13, SsSS adds a novel function to the portfolio of diTPSs that introduce hydroxy functionality to the hydrocarbon backbone of diterpenes and deepens our understanding of the catalytic space of diTPSs that contributes to the remarkable diversity of naturally occurring plant diterpenoids.
Our results support independently the claims of a recent, non-peer-reviewed, patent report [36] of a class II diTPS ([GenBank: AET21247]; 98.9% amino acid identity to SsLPPS) and a class I diTPS ([GenBank: AET21246]; 99.7% identity to SsSS) from S. sclarea. The patent also reported LPP and manoyl oxides as the primary products of the class II diTPS and formation of sclareol when class I and class II diTPS were fused. Our data provide, nevertheless, a more complete functional characterization of the substrate and product specificities of these genes.
In our work, GC-MS analysis of reaction products of the combined activities of SsLPPS and SsSS with GGPP as the starting substrate demonstrated the formation of minor amounts of manool in addition to sclareol, which could originate from the conversion of CPP as a side product of the SsLPPS-catalyzed reaction. While sclareol is the most abundant diterpene found in the essential oils of S. sclarea with manool as a minor diterpene, other Salvia species such as S. oligophylla [46] and S. argentea [47] exhibit a high abundance of manool but not sclareol. Recent studies in rice and wheat have demonstrated that class I diTPSs can act on substrates of different size and stereochemistry [23,43]. The small quantities of CPP detected in the product profile of SsLPPS as well as the presence of manool as a minor product of the coupled reaction with SsSS suggest that SsSS can convert both (9S,10S)-CPP and LPP to form manool and sclareol, respectively. This hypothesis will be tested in future work. In general, the cloning and functional characterization of SsLPPS and SsSS from S. sclarea will enable the discovery of the potentially orthologous diTPSs in other Salvia species, which will shed light on the molecular processes that determine the selective formation of sclareol versus manool as major products in the different species. Such knowledge will be useful for the targeted molecular breeding of Salvia species for the fragrance industry.
Identification of reaction products was achieved by comparison to authentic standards or reference mass spectra from the National Institute of Standards and Technology MS library searches (Wiley W9N08L): peak a, epi-manoyl oxide 8; peak b, manoyl oxide 7; peak c, putative 13(16),14-labdien-8-ol; peak d, putative copalol; peak e, sclareol 4; peak f, unknown compound; peak g, labda-13-en-8,15-diol; peak i, manool 6.
Subcellular localization studies supported the conclusion that SsLPPS and SsSS are targeted to plastids. This result further suggests that sclareol biosynthesis occurs in the plastids of flower calyces where the corresponding diTPS transcripts were highly abundant [38] and that precursors are most likely derived from the plastidial MEP pathway.
Both SsLPPS and SsSS are phylogenetically related to other diTPSs involved in specialized diterpene metabolism. SsLPPS is closely related to CcCLS [25], which has the same enzymatic function, and to a CPS from S. miltiorrhiza involved in tanshinone biosynthesis [24]. SsSS groups with KS-like enzymes and shares almost 60% identity with a miltiradiene synthase from S. miltiorrhiza [48]. Interestingly, these two enzymes do not exhibit the common αβγ-domain structure of archetype plant diTPSs [41,44], indicating a loss of the γ-domain in a common ancestor. Such events of domain loss may ultimately have resulted in the evolution of mono- and sesqui-TPSs from ancestral αβ-domain diTPS enzymes [43].
SsdiTPS3 appears phylogenetically more closely related to ent-KS involved in general metabolism. However, neither in vitro nor yeast in vivo assays revealed diTPS activity with ent-CPP, (9S,10S)-CPP, or LPP as a substrate. In addition, SsdiTPS3 did not exhibit class II activity, as no conversion of GGPP was observed. The lack of diTPS activity of SsdiTPS3 may be due to a mutation of the conserved Asn of the NSE/DTE functional motif to His, since an Asn in this position has previously been shown to be critical for the class I reaction by coordinating the Mg2+ cluster in the class I active site [42,49,50].
The two hydroxyl groups in sclareol are responsible for most of the market value of this substance because most of the harvested sclareol is chemically modified to generate high-value commercial hemisynthetic products such as Ambrox®. Without such hydroxyl groups, the labdane hydrocarbon backbone would be unreactive. Because SsLPPS and SsSS catalyze distinct position-specific hydroxylation reactions during the cycloisomerization of GGPP via LPP to sclareol, these new enzymes are of great significance for the metabolic engineering of heterologous production systems for sclareol and potentially other oxygenated diterpene bioproducts. It is important to note that the introduction of oxygen functionalities by diTPSs, without requirement for P450 activities, provides a substantial advantage for metabolic engineering of both prokaryotic and eukaryotic host systems, since TPS enzymes are inherently more amenable to expression in a range of heterologous host systems than P450 enzymes. Indeed, engineering of SsLPPS and SsSS into yeast provided independent evidence, in addition to in vitro assays, for the enzymatic production of sclareol by SsLPPS and SsSS without the requirement of a S. sclarea P450 enzyme for oxidation of the diterpene.
The use of engineered microbial platforms for industrial-scale production of high-value terpenes has emerged as a viable approach, especially for the manufacture of pharmaceutical agents, such as artemisinin and taxol [9,48,51,52]. Formation of sclareol through co-expression of SsLPPS and SsSS with GGPPS in yeast, shown here, represents a proof-of-concept for future efforts to develop a simple and reliable sclareol production system that is independent of environmental factors in agricultural production. It should be noted that sclareol accumulated in both the cell pellets and the culture media to approximately similar levels, yet with a higher purity in the media, which may allow for efficient extraction of sclareol from fermentation systems. Interestingly, a recent study on the closely related miltiradiene-producing diTPSs from S. miltiorrhiza demonstrated the interaction of the class II and class I enzymes, and further application of these findings allowed the optimization of microbial miltiradiene production through fusion of both proteins, bona fide precluding dilution of potentially short-lived intermediates [48]. In the case of sclareol biosynthesis, efficient production was observed when SsLPPS and SsSS were supplied as separate proteins during in vitro and in vivo assays. Yet, their uniform subcellular localization supports an interaction of both enzymes, and future studies may reveal if such metabolite channelling can be implemented to accelerate sclareol production.

Conclusions
In conclusion, the new knowledge of diTPSs of sclareol biosynthesis and their successful expression for sclareol formation in yeast provides a robust foundation for the development of a scalable and sustainable production system, applicable in the fragrance industry.

Isolation and cloning of FL-cDNAs
Sequences for putative diTPS transcripts representing members of the TPS-c or TPS-e/f clades were previously identified in the transcriptome resource developed from S. sclarea calyces [38]. For PCR amplification of the corresponding cDNAs, total RNA was extracted from S. sclarea calyces using the Tri reagent kit (Euromedex, www.euromedex.com), and first strand cDNAs were obtained from 2 μg of total RNA using the M-MLV Reverse Transcriptase (Promega, www.promega.com). Unique, target-specific oligonucleotide primers were designed using Primer3Plus (Additional file 3: Table S1). PCRs were carried out with the GoTaq DNA polymerase (Promega) in a final volume of 50 μl including 0.8 μM of each primer, 0.2 mM of each dNTP, and 4 μl of a 5-fold dilution of first strand cDNAs. The reactions were heated for 2 min to 95°C, followed by 30 cycles of 95°C for 30 sec, 55°C for 30 sec and 72°C for 75 sec, and by a final extension at 72°C for 5 min. To obtain FL-cDNAs of the three candidate diTPS genes, 5' and 3' RACE-PCR were performed with the Marathon cDNA amplification kit (Clontech, www.clontech.com) according to the manufacturer's instructions. The cDNA template was made from 1 μg of total RNA. Gene-specific primers used are listed in Additional file 3: Table S1. All PCR products were cloned into the pGEM-T Easy vector (Promega).
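For bench planning, the PCR conditions above translate into per-reaction amounts and a minimum cycling time. The following is a minimal sketch of that arithmetic; it ignores instrument ramp times, and the helper names are assumptions made for this example.

```python
REACTION_VOLUME_UL = 50.0


def amount_pmol(concentration_um, volume_ul):
    """Amount of a component in a reaction: 1 uM in 1 ul equals 1 pmol."""
    return concentration_um * volume_ul


primer_pmol = amount_pmol(0.8, REACTION_VOLUME_UL)  # each primer at 0.8 uM
dntp_pmol = amount_pmol(200.0, REACTION_VOLUME_UL)  # each dNTP at 0.2 mM = 200 uM

# Thermal program as (number of cycles, [hold times in seconds]).
PROGRAM = [
    (1, [120]),          # initial denaturation, 95 C
    (30, [30, 30, 75]),  # 95 C, 55 C, 72 C
    (1, [300]),          # final extension, 72 C
]


def hold_time_s(program):
    """Total programmed hold time, excluding temperature ramps."""
    return sum(cycles * sum(steps) for cycles, steps in program)


print(f"primer: {primer_pmol:.0f} pmol each, dNTP: {dntp_pmol / 1000:.0f} nmol each")
print(f"programmed hold time: {hold_time_s(PROGRAM) / 60:.1f} min")
```

The actual run will be longer than the programmed hold time, by an amount that depends on the ramp rate of the thermocycler used.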
The EST library that was mined for diTPS sequences [38] was normalized before sequencing, which prevented estimation of SsdiTPS transcript abundances. EST contigs Salvi_c6071 and Salvi_c2272 had a combined length of 1455 bp and covered 84.3% of the 1725 bp of SsSS. The 454 read FE21XKK02HIAG9 contained a coding sequence of 189 bp which covered only 8.1% of the 2322 bp of SsdiTPS3. The combined EST contigs Salvi_c1504, Salvi_c10842, Salvi_c12957 and Salvi_c17648 and the 454 read FE21XKK02JPKR covered 1284 bp of the 2355 bp of the coding sequence of SsLPPS, i.e., 54.4%.
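The coverage values quoted above are simple ratios of covered length to coding sequence length, which can be recomputed as a check. In this sketch the dictionary layout and function name are assumptions made for illustration; the lengths are those given in the text. Note that the SsLPPS ratio evaluates to about 54.5%, marginally different from the quoted 54.4%.

```python
# (bp covered by ESTs or reads, bp of full coding sequence), as given in the text.
COVERAGE_BP = {
    "SsSS": (1455, 1725),
    "SsdiTPS3": (189, 2322),
    "SsLPPS": (1284, 2355),
}


def percent_covered(covered_bp, total_bp):
    """Share of the coding sequence represented in the transcriptome data."""
    return 100.0 * covered_bp / total_bp


for gene, (covered, total) in COVERAGE_BP.items():
    print(f"{gene}: {covered}/{total} bp = {percent_covered(covered, total):.1f}%")
```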
The cDNA sequences described in this paper have been submitted to the GenBank™/EBI Data Bank with accession numbers: JQ478434 (SsLPPS), JQ478435 (SsSS) and JQ478436 (SsdiTPS3).
For protein expression in Escherichia coli, the FL-cDNA for SsLPPS was cloned from calyx cDNA. Codon-optimized FL-cDNAs of SsSS and SsdiTPS3 for protein expression in E. coli were synthesized at GeneArt (www.geneart.com; sequences in Additional file 4: Figure S3). N-terminal truncations of SsLPPS (Δ65), SsSS (Δ53) and SsdiTPS3 (Δ29), lacking the predicted putative transit peptides, were established by PCR amplification using the FL constructs as template and cloned into the pET28b(+) expression vector (EMD Biosciences, www.emdbiosciences.com) in frame with the N-terminal hexahistidine tag.
For protein expression in yeast, the N-terminal truncated version of SsLPPS was sub-cloned into the first multiple cloning site of the expression vector pESC-Leu (Stratagene, www.genomics.agilent.com), resulting in pESC-Leu:SsLPPS. Truncated cDNAs of SsSS and SsdiTPS3 were then individually sub-cloned into the second multiple cloning site of pESC-Leu:SsLPPS, resulting in pESC-Leu:SsLPPS/SsSS and pESC-Leu:SsLPPS/SsdiTPS3, respectively. These constructs were individually co-transformed with the previously described plasmid pESC-His:ScGGPPS [29], containing a GGPP synthase from Saccharomyces cerevisiae, into the yeast strain BY4741.

Phylogenetic analysis
Multiple protein sequence alignments were performed using the DIALIGN web server (http://bibiserv.techfak.uni-bielefeld.de/dialign/). Phylogenetic analyses were conducted on the basis of the maximum likelihood algorithm using PhyML 3.0 [53] with four rate substitution categories, LG substitution model, BIONJ starting tree and 100 bootstrap repetitions, and displayed as an unrooted tree using treeview32 1.6.6 [54].

In vitro enzyme assays
Enzyme assays were carried out as described before [56,57] in 50 mM HEPES (pH 7.0), 10 μM MgCl2, 5% glycerol, using 20 μg of purified protein (20 μg each for coupled assays) and 20 μM of GGPP (Sigma, www.sigmaaldrich.com) as substrate. Reactions were allowed to proceed for 1 h at 30°C with gentle shaking. For the detection of diphosphate intermediates, reaction products were dephosphorylated by incubation with 7 U of calf intestinal alkaline phosphatase (Invitrogen, www.invitrogen.com) for 16 h at 37°C. Extraction of reaction products was achieved by vortexing the samples with 500 μl of pentane twice for 20 sec and subsequent centrifugation for 15 min at 1,000 × g, 4°C.
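Because the assays use equal masses of enzymes with different molecular weights, the molar amounts differ. The sketch below converts 20 μg of each protein to picomoles using the molecular weights reported in the Results for the recombinant proteins (83, 61 and 85 kDa); the function name is an assumption made for this example, and the substrate amount is not computed because the reaction volume is not stated.

```python
# Expected molecular weights (kDa) of the purified recombinant proteins.
MW_KDA = {"SsLPPS": 83.0, "SsSS": 61.0, "SsdiTPS3": 85.0}


def micrograms_to_pmol(mass_ug, mw_kda):
    """Convert a protein mass to picomoles: ug / kDa gives nmol, times 1000 gives pmol."""
    return mass_ug / mw_kda * 1000.0


for enzyme, mw in MW_KDA.items():
    print(f"{enzyme}: 20 ug = {micrograms_to_pmol(20.0, mw):.0f} pmol")
```

In a coupled SsLPPS plus SsSS assay with 20 μg of each protein, the smaller SsSS is therefore present in roughly a 1.4-fold molar excess over SsLPPS.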

Production of sclareol in engineered yeast cells
pESC-HIS:ScGGPPS was transformed into S. cerevisiae (BY4741) in combination with either pESC-LEU:SsLPPS, pESC-LEU:SsLPPS/SsSS or pESC-LEU:SsLPPS/SsdiTPS3. Yeast transformation, growth media, and culture conditions were described previously [29,30,58]. Cells were grown in 50 mL of 2% dextrose and Leu/His dropout selective medium up to an OD600 of 0.6 to 0.8, at which point yeast cells were transferred to 50 mL of 2% galactose and Leu/His dropout selective medium for 20-24 h. Yeast cells were separated from the medium and 500 μL of medium was extracted with 500 μL of pentane. The harvested cells were extracted twice with 5 mL of pentane, and extracts were concentrated under N2 to 500 μL prior to gas chromatographic-mass spectrometric (GC-MS) analysis.

Gas chromatographic-mass spectrometric (GC-MS) analysis
GC-MS analysis was performed with electron ionization (EI) at 70 eV after injection of 1 μL of the pentane overlay into an Agilent 6890A GC with a 7683B series autosampler (vertical syringe position of 7), combined with a 5973 N Inert XL MS detector. Compound separation was achieved on an SGE SOLGEL-WAX column (30 m × 250 μm i.d., 0.25 μm film thickness) with He as carrier gas at 1 mL min⁻¹, using the following GC temperature program: 40°C for 2 min, ramp at 25°C min⁻¹ to 250°C, hold for 5 min. The pulsed splitless injector was held at 250°C. Compounds were identified by comparison to authentic standards and reference mass spectra databases (Wiley W9N08L and the National Institute of Standards and Technology MS library searches). The authentic sclareol standard was purchased from Sigma.
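The oven program above fixes the run time and the oven temperature at any point of the run, which helps when relating retention times to elution temperatures. A minimal sketch of that arithmetic follows; the class and method names are illustrative only and do not correspond to any instrument software.

```python
from dataclasses import dataclass


@dataclass
class OvenProgram:
    """Single-ramp GC oven program: initial hold, linear ramp, final hold."""
    start_c: float
    initial_hold_min: float
    ramp_c_per_min: float
    final_c: float
    final_hold_min: float

    def ramp_min(self) -> float:
        """Duration of the linear ramp in minutes."""
        return (self.final_c - self.start_c) / self.ramp_c_per_min

    def total_min(self) -> float:
        """Total run time in minutes."""
        return self.initial_hold_min + self.ramp_min() + self.final_hold_min

    def temperature_at(self, t_min: float) -> float:
        """Oven temperature (deg C) at time t_min after injection."""
        if t_min <= self.initial_hold_min:
            return self.start_c
        if t_min <= self.initial_hold_min + self.ramp_min():
            return self.start_c + self.ramp_c_per_min * (t_min - self.initial_hold_min)
        return self.final_c


# Values taken from the temperature program described in the text.
program = OvenProgram(start_c=40.0, initial_hold_min=2.0,
                      ramp_c_per_min=25.0, final_c=250.0, final_hold_min=5.0)
print(f"ramp: {program.ramp_min():.1f} min, total run: {program.total_min():.1f} min")
```

With the stated settings the ramp from 40°C to 250°C takes 8.4 min, giving a total run time of about 15.4 min per injection.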

Liquid chromatographic (LC)-MS Analysis
LC-MS analysis was conducted on an Agilent 1100 LC/MSD Trap XCT Plus mass spectrometer. Compound separation was achieved on an Agilent Zorbax SB-C18 column (50 mm × 4.6 mm ID, 1.8 μm) with isocratic elution of CH3CN/NH4HCO3 (5%/95%), pH 7.95, as mobile phase at 35°C and a flow rate of 1.2 mL min⁻¹. Mass spectrometric analysis was performed after electrospray ionization (ESI) in negative mode with a scan range of m/z 100-600.

Subcellular localization of SsLPPS and SsSS
The GFP fusion constructs for transient expression in Nicotiana benthamiana were prepared as follows. The predicted N-terminal signal peptides of SsLPPS [SsLPPS(SP); 1-102 bp] and SsSS [SsSS(SP); 1-99 bp] were amplified with gene-specific primers and cloned into pENTR/D-TOPO (Invitrogen). The resulting clones were transferred by Gateway LR reactions as recommended by the manufacturer (Invitrogen) into the destination vector pMDC83, in-frame with a C-terminal green fluorescent protein (GFP).
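For an in-frame C-terminal GFP fusion, the amplified fragment must contain a whole number of codons. The sketch below checks this for the two fragment lengths given above and reports the corresponding peptide lengths. It assumes that each fragment starts at the first base of the start codon and that the vector junction itself preserves the reading frame; the function name is illustrative.

```python
def signal_peptide_codons(length_bp: int) -> int:
    """Return the number of codons in a fragment, or fail if it is out of frame."""
    if length_bp <= 0:
        raise ValueError("fragment length must be positive")
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a multiple of 3; fusion would be out of frame")
    return length_bp // 3


# Fragment lengths as stated in the text.
fragments_bp = {"SsLPPS(SP)": 102, "SsSS(SP)": 99}
codons = {name: signal_peptide_codons(bp) for name, bp in fragments_bp.items()}

for name, n in codons.items():
    print(f"{name}: {fragments_bp[name]} bp = {n} amino acids")
```

Both fragments are multiples of three and correspond to 34 and 33 amino acids, respectively.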
N. benthamiana seeds were disinfected with chlorine fumes for 6 h and placed on Murashige and Skoog basal salt medium (Sigma) containing 0.7% agar. After incubation for 2 weeks at 24°C, seedlings were transferred to soil and the plants were grown in a chamber with a 16 h photoperiod and a temperature range of 22°C (night time low) to 25°C (day time high). Plants that were 4 to 6 weeks old were used for transformation.
Each expression vector was introduced into Agrobacterium tumefaciens strain C58 (pMP90) by chemical transformation as reported elsewhere [59]. Plants were transfected with the fluorescent constructs as described previously [60]. Five mL of overnight A. tumefaciens cultures grown in LB containing antibiotics were pelleted for 15 min at 4,000 × g and 22°C, washed once and resuspended in sterilized distilled water to an OD600 of 0.5 before being infiltrated into the abaxial side of the leaf using a 1 mL syringe (without needle) and gentle pressure. To avoid post-transcriptional gene silencing, A. tumefaciens strains carrying the constructs of interest were co-infiltrated with another strain transformed with a binary vector for expression of the viral p19 silencing suppressor protein [61].
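The resuspension volume needed to reach the target OD600 of 0.5 follows from a simple proportionality. The sketch below assumes that OD600 scales linearly with cell density in the measured range and that no cells are lost during pelleting and washing; the overnight culture density of 2.0 is a hypothetical example value, not one reported in the text.

```python
def resuspension_volume_ml(culture_od600: float, culture_volume_ml: float,
                           target_od600: float = 0.5) -> float:
    """Volume of water in which the pellet must be resuspended to reach target_od600."""
    if culture_od600 <= 0 or culture_volume_ml <= 0 or target_od600 <= 0:
        raise ValueError("all inputs must be positive")
    # Total cell amount (OD x volume) is conserved between culture and suspension.
    return culture_od600 * culture_volume_ml / target_od600


# Hypothetical example: a 5 mL overnight culture measured at OD600 = 2.0.
volume = resuspension_volume_ml(culture_od600=2.0, culture_volume_ml=5.0)
print(f"resuspend pellet in {volume:.1f} mL water")
```

In practice the OD600 of the suspension would be measured again after resuspension and adjusted, since dense cultures often fall outside the linear range of the spectrophotometer.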
Two to four days post-infiltration, leaf discs were excised and mounted between slides and coverslips for observation by confocal microscopy. Cell imaging was performed using an Olympus FV-1000 confocal microscope coupled to an Olympus BX61WI microscope stand.
Leaf samples were gently squeezed between a coverslip and a glass slide with a drop of distilled water. Images were recorded using an Olympus UPLFLN 40X objective lens. Excitation/emission wavelengths were 473 nm/485-520 nm for GFP, and 561 nm/600-700 nm for intrinsic fluorescence of the chloroplasts. The images shown in the results are single focal sections acquired using the Olympus FluoView software and directly analyzed with ImageJ [62].

Additional files
Additional file 1: Figure S1. Protein sequence alignments. Amino acid sequence alignments of SsLPPS
Additional file 2: Figure S2. Mass spectra of assay products as compared to reference spectra of authentic standards and relevant databases. Illustrated are characteristic mass spectra of enzymatic reaction products as compared to authentic standards or reference mass spectra from the National Institute of Standards and Technology MS library searches (Wiley W9N08L): peak a, 13-epi-manoyl oxide 8; peak b, manoyl oxide 7; peak c, putative 13(16)-14-labdien-8-ol; peak d, putative copalol; peak e, sclareol 4; peak f, unknown compound; peak g, labda-13-en-8,15-diol; peak i, manool 6.
Additional file 3: Table S1. Oligonucleotides used for the amplification of cDNA sequences.
Additional file 4: Figure S3. Codon optimized sequences of SsSS and SsdiTPS3. Codon optimized sequences of SsSS and SsdiTPS3 that have been used for Escherichia coli and yeast-based heterologous protein expressions are shown in FASTA format.

Authors' contributions
ACa and PZ conducted the recombinant protein purification, the in vitro enzyme assays, the engineering of yeast cells, the tobacco transient expression assays for the subcellular localization of the three diTPS and the phylogenetic analyses. SL and NV isolated and cloned the full-length cDNAs of the three diTPS characterized in this article. SL also contributed to SsLPPS recombinant protein purification and enzyme assay. ACo and J-LM helped in the setting-up of the tobacco transient expression assays. JB and LL designed the study and contributed to the data analysis. LL conceived of the study. ACa, PZ, JB and LL wrote the manuscript with contributions from the coauthors. All partners have read and approved the final manuscript.