Towards synthesis of monoterpenes and derivatives using synthetic biology

Synthetic biology is opening up new opportunities for the sustainable and efficient production of valuable chemicals in engineered microbial factories. Here we review the application of synthetic biology approaches to the engineering of monoterpene/monoterpenoid production, highlighting the discovery of novel catalytic building blocks, their accelerated assembly into functional pathways, general strategies for product diversification, and new methods for the optimization of productivity to economically viable levels. Together, these emerging tools allow the rapid creation of microbial production systems for a wide range of monoterpenes and their derivatives for a diversity of industrial applications.


Synthetic biology for the production of monoterpenes and monoterpenoids
Synthetic biology is a powerful combination of multiple scientific disciplines, including biochemistry, molecular biology, systems biology, computational biology, and engineering, for the controlled design and construction of biological systems with new functionalities. One economically attractive application is the development of microbial factories for the biosynthesis of high-value chemical commodities such as pharmaceuticals, flavours, fragrances, fuels and many more. In order to achieve optimal biosynthetic production of these molecules, genes encoding enzymes involved in a desired biochemical pathway are collected from various source organisms (microbes, plants and fungi), modified and improved, and finally introduced into engineered production hosts (chassis) that are most suitable for production. The most famous synthetic biology example of high-value chemical production is artemisinic acid, the precursor of the antimalarial drug artemisinin, which was produced in engineered Escherichia coli and baker's yeast, Saccharomyces cerevisiae, reaching economically viable production levels after 10 years of iterative optimization [1,2].
Artemisinic acid is just one of thousands of potentially high-value terpenoids, and synthetic biology approaches towards versatile and robust biosynthetic production of additional members of this highly diverse class of chemicals have attracted considerable interest in recent years. Here, we specifically discuss recent developments towards a general synthetic biology toolbox for the production of monoterpenes/monoterpenoids, a particularly interesting subset of this family of molecules, with over 55,000 different compounds and many applications (e.g. as drugs, food flavourings, fragrances, biofuels and cleaning agents) [3]. Traditionally, monoterpenes and their derivatives are extracted from natural sources (generally plants), but this extraction process can be low yielding, costly, and sometimes highly dependent on raw material availability [4]; a synthetic biology approach to their synthesis provides a sustainable route to production and opens new possibilities for diversification and discovery.

The terpene precursor pathways
The biosynthesis of all terpenes is dependent on the two C5 isoprene precursors isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP), which are synthesized via either the methylerythritol 4-phosphate (MEP) pathway, also known as the 1-deoxy-D-xylulose 5-phosphate (DXP) pathway, or the mevalonate-dependent (MVA) pathway (Figure 1). IPP and DMAPP are condensed to form the terpene precursors, with the order of the terpene being defined by the number of isoprene units incorporated (monoterpenes, C10; sesquiterpenes, C15; and so on). The universal precursor of monoterpenes is geranyl pyrophosphate (GPP), combining two C5 units, which is then further processed by monoterpene synthases/cyclases (mTS/C) to produce a vast array of chemical structures [5,6].
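The relationship between the number of condensed C5 units and the terpene class can be written as a small lookup. The following is a minimal illustrative sketch: the function and dictionary names are hypothetical, and only the C10 and C15 classes are stated in the paragraph above; the C20 (diterpene) and C40 (tetraterpene) entries are added from standard terpene nomenclature.

```python
# Illustrative only: all names below are hypothetical, not from the review.
ISOPRENE_CARBONS = 5  # IPP and DMAPP are both C5 units

# Number of C5 units -> terpene class (standard nomenclature; the text
# above states only the C10 and C15 cases explicitly).
TERPENE_CLASSES = {
    2: "monoterpene",    # C10, derived from GPP
    3: "sesquiterpene",  # C15
    4: "diterpene",      # C20
    8: "tetraterpene",   # C40
}


def terpene_class(n_units: int) -> tuple[int, str]:
    """Return (carbon count, class name) for n condensed C5 units."""
    if n_units < 1:
        raise ValueError("n_units must be a positive integer")
    carbons = ISOPRENE_CARBONS * n_units
    return carbons, TERPENE_CLASSES.get(n_units, "class not listed here")


for n in (2, 3, 4, 8):
    carbons, name = terpene_class(n)
    print(f"{n} isoprene units -> C{carbons} ({name})")
```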
All organisms possess at least one route towards terpenoid production, either an MVA or an MEP pathway. The predominant source of monoterpenes/monoterpenoids is plants, which possess both a cytosolic MVA and a plastidial MEP pathway [7]. Typically, yeast, animals and archaea use the MVA pathway, whereas bacteria predominantly employ the MEP pathway; however, some species of bacteria can use an MVA pathway, whilst others use both [8].
Early engineering efforts to create monoterpene/oid and sesquiterpene/oid production systems in bacteria aimed to improve the availability of precursors by increasing the intracellular production of IPP and DMAPP [6]. This was achieved by the insertion of the 1-deoxy-D-xylulose-5-phosphate synthase (DXS) and IPP isomerase (IPPHp) genes, which encode key enzymes in the DXP/MEP pathway, thereby supplementing the endogenous E. coli pathway. When these biosynthetic pathways were expressed alongside monoterpene and sesquiterpene synthases, initial titres were in the low mg/L range. The subsequent efforts to improve the terpene titres have been extensively reviewed by Paddon and Keasling [1].

Current Opinion in Chemical Biology
Figure 1. An overview of monoterpene/monoterpenoid production pathways.

Monoterpene synthases
Monoterpene synthases/cyclases (mTS/C) produce a plethora of chemicals from a single substrate (GPP) and provide a powerful opportunity for the production of diverse chemical libraries (Figure 1). They are a metal-dependent family of enzymes that typically catalyse the cyclisation of GPP via an α-terpinyl cation intermediate, or elimination and addition reactions from the linear geranyl cation intermediate, resulting in a diverse selection of monoterpene products (Figure 1). mTS/C are most commonly found in plants; however, recent genome mining efforts have demonstrated that terpene synthases also commonly occur in bacteria [9,10].

Synthetic biology production of monoterpenes/monoterpenoids
Over the last decade, numerous monoterpenes/monoterpenoids have been produced by engineered bacteria and yeast. A specialised limonene (and perillyl alcohol) production system was created in E. coli by introducing heterologous, codon-optimized Staphylococcus aureus and S. cerevisiae MVA pathway genes alongside the Abies grandis GPP synthase and Mentha spicata limonene synthase genes. Optimization of gene regulation and growth conditions resulted in a limonene titre of 400 mg/L [11]. Following this work, principal component analysis (PCA) was used in an effort to further improve the previously obtained limonene titres [12]. The authors of this study created a total of 27 production 'scenarios', in which the nine enzymes of the MVA pathway were present in different copy numbers under different promoters, and tested these at three different cell densities and three inducer concentrations. Proteomics (LC-MS/MS) and limonene production (GC-MS) data were obtained for each of these scenarios. Surprisingly, no single enzyme level showed a clear correlation with improved production, as tested by univariate statistics. However, the application of PCA, a multivariate statistical method, allowed the identification of combinations of proteins that needed to be optimized in order to achieve improved production. The results indicated that low and balanced expression of the early steps of precursor production, alongside overexpression of limonene synthase, would yield the optimal product titre. This was subsequently confirmed by constructing a production strain with these characteristics, which attained a maximal titre of 605 mg/L of limonene, a 40% improvement over the original pathway [12] (Table 1).
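The idea of using PCA to find combinations of protein levels that track titre can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the analysis performed in the cited study: the data are randomly generated placeholders (27 scenarios by 9 enzymes, matching only the dimensions described above), the function name `pca` is hypothetical, and PCA is computed here by singular value decomposition of the standardized data matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# SYNTHETIC placeholder data, NOT the published measurements:
# 27 scenarios (rows) x 9 pathway enzymes (columns) of protein levels.
n_scenarios, n_enzymes = 27, 9
protein_levels = rng.lognormal(mean=0.0, sigma=0.5, size=(n_scenarios, n_enzymes))
# Synthetic limonene titres (mg/L), one per scenario.
titres = rng.uniform(50.0, 600.0, size=n_scenarios)


def pca(data: np.ndarray, n_components: int):
    """PCA via SVD on column-standardized data.

    Returns (scores, loadings, explained_variance_ratio).
    """
    centred = data - data.mean(axis=0)
    scaled = centred / data.std(axis=0, ddof=1)
    u, s, vt = np.linalg.svd(scaled, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # scenarios x components
    loadings = vt[:n_components].T                   # enzymes x components
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


scores, loadings, explained = pca(protein_levels, n_components=2)

# Correlate each component's scores with titre. A component that tracks
# titre points to a combination of enzyme levels, described by its loadings.
for k in range(scores.shape[1]):
    r = np.corrcoef(scores[:, k], titres)[0, 1]
    print(f"PC{k + 1}: explained variance {explained[k]:.2f}, r with titre {r:+.2f}")
```

With random inputs the correlations are meaningless; the point is only the shape of the workflow, in which the loadings of a titre-correlated component indicate which enzymes should be raised or lowered together.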
The recent development of improved combinatorial design approaches for the assembly and characterization of large multi-gene operons further facilitates optimization strategies [13-15]. Using these approaches, which depend on the design of standardized re-usable bioparts and improved methods for their rapid assembly, it is possible to quickly test a large number of pathway variants that differ, for example, in their promoter strengths, ribosomal binding sites, gene order, orientation and operon structure, to identify the most productive combination.
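The size of such a combinatorial design space can be illustrated by enumerating a toy library. The sketch below is an assumption-laden example rather than any of the cited assembly methods: the part names, gene labels and the function `enumerate_designs` are hypothetical, and only promoter choice, gene order and one ribosomal binding site per gene are varied.

```python
from itertools import permutations, product

# Hypothetical part names for illustration; not taken from the cited studies.
promoters = ["P_weak", "P_medium", "P_strong"]
rbs_sites = ["RBS_low", "RBS_high"]
genes = ["gpps", "ls"]  # placeholder labels: a GPP synthase and a limonene synthase


def enumerate_designs(promoters, rbs_sites, genes):
    """Yield every operon design: one promoter, one gene order, one RBS per gene."""
    for promoter in promoters:
        for order in permutations(genes):
            for rbs_choice in product(rbs_sites, repeat=len(order)):
                yield {
                    "promoter": promoter,
                    "cassettes": list(zip(rbs_choice, order)),
                }


designs = list(enumerate_designs(promoters, rbs_sites, genes))
print(len(designs))  # 3 promoters x 2 gene orders x 2**2 RBS choices = 24
print(designs[0])
```

Even this two-gene toy case gives 24 variants; adding genes, orientations or operon structures multiplies the count quickly, which is why standardized parts and rapid assembly matter for testing such libraries.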
Alternative strategies in E. coli focussed on the MEP pathway, over-expressing the dxs and isopentenyl diphosphate isomerase (idi) genes, which had previously been identified as encoding rate-limiting enzymes in the endogenous E. coli MEP pathway; however, the resulting strains provided a poor titre of 35.8 mg/L limonene [16]. Willrodt and colleagues subsequently demonstrated that the choice of bacterial production chassis, feedstock and fermentation approach has a major influence on achievable titres in engineered microbes. They showed increased production of limonene in E. coli grown on glycerol in minimal media, due to a prolonged growth and production phase [17]. Moreover, they were able to further improve limonene production by limiting magnesium sulphate availability [18]: in these nutrient-limited minimal media the cells enter a 'resting' state, in which cellular resources are no longer consumed for biomass and by-product formation, thereby increasing resource availability for limonene production.

Synthetic biology for monoterpenes. Zebec et al.
Table 1. Diversity of monoterpene/monoterpenoid production strains engineered to date.
While most of the synthetic biology of monoterpenes so far has focused on limonene production as a test case, E. coli has also been engineered to produce a variety of other monoterpenes, including α-pinene, myrcene, geraniol and sabinene, by the assembly and optimization of biosynthetic pathways containing a heterologous MVA or MEP pathway, a GPP synthase and the monoterpene synthase of interest (Table 1).
As an alternative to E. coli, yeast has proven to be a successful chassis for monoterpene/monoterpenoid production, with strains capable of sabinene, limonene, linalool [19] and cineole production obtained to date (Table 1). In the case of sabinene and limonene production, an engineered farnesyl pyrophosphate synthase (FPP synthase) Erg20 enzyme functioning as a GPP synthase was implemented. In addition to functioning as a GPP synthase, the engineered enzyme was unable to perform the sequential FPP synthesis reaction seen for the wild type (WT) S. cerevisiae FPP synthase Erg20 enzyme, thus removing a potentially competing pathway that had been identified as an important factor limiting monoterpene titres [20]. In addition, the authors reasoned that the fusion of the Erg20 enzyme and sabinene synthase would help to direct GPP to the sabinene synthase, rapidly sequestering GPP at its source. Furthermore, the deletion of one Erg20 allele, thus reducing the gene dosage of WT Erg20 and shifting the balance in favour of the engineered Erg20 overexpressed from a plasmid, resulted in a 340-fold improvement of sabinene titre (17.5 mg/L) compared with the original WT Erg20 (Table 1). In comparison to these relatively low titres, Ignea et al. [21] had previously successfully engineered a yeast system capable of producing cineole on a much larger scale, eventually reaching titres of >1000 mg/L. This was achieved using recyclable integration cassettes that facilitated unlimited sequential integration of genetic elements and was applied to the sterol biosynthetic genes HMG2, ERG20 and IDI1.

Diversification of monoterpenes/monoterpenoids
To date, the majority of studies have reported the conversion of GPP to monoterpenes/monoterpenoids in a single step, but much of the natural diversity is created by subsequent tailoring by isomerisation or hydroxylation, among others. For example, limonene is a key intermediate of the mint pathways (leading, among others, to the valuable flavour and fragrance compounds originally derived from spearmint and peppermint). Recent work reported the use of a complementary cell-free synthetic biology strategy for the production of these tailored products using extracts from engineered E. coli containing biosynthetic genes from Nicotiana tabacum, including a double bond reductase (NtDBR), (-)-menthone:(-)-menthol reductase (MMR) and menthone:(+)-neomenthol reductase (MNMR) pathways [22]. This one-pot biocatalytic approach suggests new opportunities for the modular combination of reactions to generate libraries of derived monoterpenes/monoterpenoids, for example, for use in high-throughput screening for new functionalities.
A particular strength of synthetic biology is the ability to produce non-natural compounds by co-expression of enzymes, sourced from a variety of different organisms, in new combinations not found in nature. One recently published example of this exploited the modularity of class I and II diterpene synthases (diTPSs) by systematically co-expressing diTPSs in heterologous hosts. Hamberger and colleagues constructed a library of 51 diTPS combinations, 41 of which were described as 'new-to-nature', resulting in a significant increase in product diversity [23]. Further efforts to improve diversity of monoterpenes included the incorporation of non-natural tailoring enzymes into pathways (e.g. cytochrome P450s or glycosyltransferases). The inclusion of such 'non-natural' enzyme combinations could be successful in providing access to new chemical space [11]. For example, the incorporation of a Mycobacterium sp. cytochrome P450 into an engineered E. coli limonene producer resulted in the production of perillyl alcohol.
Alternative efforts to improve monoterpene/monoterpenoid titres include the editing and optimization of enzymes used in the biosynthetic pathways via directed evolution strategies [24]. In this approach, mutant libraries are created by systematically varying the specific amino acid residues within an enzyme that are expected to affect substrate specificity, product purity or catalytic efficiency, and high-throughput screening and selection identifies optimal variants that produce the desired products faster and more selectively, sometimes even accepting non-natural substrates not suitable for the original native enzyme. Directed evolution for enzyme optimization is important for monoterpene production, as mTS/Cs invariably produce multiple monoterpenes in addition to the desired main product, which is not ideal for commodity chemical production. Sequence analysis has shown that even mTS/Cs sharing close sequence identity can produce distinct monoterpene profiles [25,26]. The rational engineering or directed evolution of mTS/C for altered or cleaner product profiles is therefore a main ambition for monoterpene production using synthetic biology. These approaches may also be exploited as a means of introducing further diversity into monoterpene/monoterpenoid production. The ability to alter the substrate specificity of monoterpene synthases, such that new suites of small, structurally diverse natural product libraries may be obtained, alongside the ability to 'reprogram' monoterpene synthase activity, may be of significant interest to the fine chemical and pharmaceutical industries.

Outlook for future pathway design
Establishing the genetic parts needed for the production of secondary metabolites, like monoterpenes/monoterpenoids, is the first challenge faced by synthetic biologists and is commonly tackled with computational tools [27]. Predicting bacterial terpene synthases is very challenging, but extensive HMM analysis of the Pfam [28] database can be applied to identify new terpene synthases [10] and test them in production systems [29]. Once the monoterpene synthase of interest has been identified, it must be brought into genomic context by choosing the appropriate chassis, usually yeast or E. coli. Other host organisms engineered for the production of monoterpenoids include Corynebacterium glutamicum [30] and Pseudomonas putida [31], which were developed for the production of pinene and geranic acid, respectively. In addition, the Gram-positive bacterium Bacillus subtilis, which is already widely used in biotechnological applications, has recently been promoted as a potential platform for the general production of terpenoids, although to date there are no published instances of mTS/C production in this species [32].
The next step is the design of intrinsic regulation within the engineered biosynthetic gene cluster, where regulatory parts need to be selected carefully in order to reach the maximal efficiency of the selected parts [33]. It has been demonstrated for limonene-producing E. coli strains that production is highly dependent on the number of plasmids per cell, which can be modulated by changing the selective pressure using different antibiotic concentrations [11]. In yeast, inserting pathways on the chromosome has been shown to increase diterpenoid production up to threefold, and similar effects would be expected for monoterpenes/monoterpenoids [23]. In addition, genomic insertion would help in reducing biological variation, making the whole system more productive, which was also demonstrated in E. coli, where a threefold increase of production levels was observed for the tetraterpene β-carotene [34]. With the emergence of the CRISPR-Cas9 technology, genome editing on a large scale has become more timely and affordable [35,36]. This technology allows biosystems engineers to insert de novo synthesized genes of up to 8 kbp and produce knock-outs of up to 18 kbp on the E. coli chromosome [37-39]. Other production chassis, such as S. cerevisiae [40], C. glutamicum [41] and Streptomyces sp. [42] can be CRISPR-Cas9 genome edited in a similar fashion. Additionally, various conventional methods of genome editing (using selection markers) can be employed in Pseudomonas putida and many other potential microbial production hosts [42]. The new opportunities created by the CRISPR/Cas technology have been strikingly demonstrated by engineering yeast for the production of farnesol, a sesquiterpene, which could not be produced if the pathway was encoded on a plasmid [44].
For E. coli it has been demonstrated that limonene is converted spontaneously to its toxic hydroperoxide form, causing severe growth retardation [45]. A natural point mutation in the gene for alkyl-hydroperoxidase (AhpC) decreased the formation of limonene hydroperoxide, resulting in improved limonene tolerance. Targeted genome editing will play a considerable role in engineering tolerant strains for improved production. Another strategy to overcome general cytotoxic effects of chemicals produced in a production host is the compartmentalization of the pathway, thus reducing the active concentration and intrinsic toxicity of the produced chemical or the pathway intermediates. Suitable compartments that are being explored for this purpose include peroxisomes in yeast and proteinaceous micro-compartments in bacteria [46,47].

Conclusion
The synthetic biology of monoterpene/monoterpenoid production has already made substantial progress in recent years, promising sustainable and economically viable new routes to industrial-scale production of these valuable chemicals. However, this is only the beginning: in the near future, we expect to see new computational tools identifying even more genes to add to the monoterpene/monoterpenoid diversification toolbox; advances in metabolomics and proteomics that will more rapidly identify bottlenecks in engineered biosynthetic pathways; progress in directed protein evolution that will increase product purity and chemical diversity; and ever faster and more robust genome editing techniques that will facilitate the rapid and automated introduction and combinatorial assembly of biosynthetic pathway variants into tailor-made high-performance industrial chassis strains. Together, these tools will enable a profound transformation in the bio-industrial production of an increasingly diverse range of monoterpenes and their derivatives.