Complex scaffold remodeling in plant triterpene biosynthesis

Triterpenes with complex scaffold modifications are widespread in the plant kingdom. Limonoids are an exemplary family: they are responsible for the bitter taste in citrus (e.g., limonin) and are the active constituents of neem oil, a widely used bioinsecticide (e.g., azadirachtin). Despite the commercial value of limonoids, a complete biosynthetic route has not been described. Here, we report the discovery of 22 enzymes, including a pair of neofunctionalized sterol isomerases, that catalyze 12 unique reactions in the total biosynthesis of kihadalactone A and azadirone, products that bear the signature limonoid furan. These results enable access to valuable limonoids and provide a template for discovery and reconstitution of triterpene biosynthetic pathways in plants that require multiple skeletal rearrangements and oxidations.


One-Sentence Summary:
Discovery of 22 enzymes responsible for the production of bioactive limonoids with complex scaffold rearrangements from Citrus and Meliaceae species.
Among the numerous complex triterpenes found in the plant kingdom, limonoids are particularly notable given their wide range of biological activities and the structural diversity that stems from extensive scaffold modifications (1,2). Produced mainly by two families in the Sapindales, Rutaceae (citrus) and Meliaceae (mahogany) (3), these molecules bear a signature furan and include over 2,800 unique structures (4,5). Azadirachtin, a well-studied limonoid, exemplifies the substantial synthetic challenge for this group of molecules, with 16 stereocenters and 7 quaternary carbons. Notably, few synthetic routes to limonoids have been reported (6-8). More generally, complete biosynthetic pathways to triterpenes with extensive scaffold modifications have remained elusive. This lack of production routes limits the utility and biological investigation of clinical candidates from this diverse compound class (9).
Around 90 limonoids have been reported to have anti-insect activity (2), and several have also been found to target mammalian receptors and pathways (4). For example, azadirachtin (Fig. 1), the main component of biopesticides derived from the neem tree (Azadirachta indica), is a potent antifeedant, active against >600 insect species (9). Perhaps related to antifeedant activity, Rutaceae limonoids such as nomilin, obacunone and limonin (Fig. 1) that accumulate in Citrus species at high levels (3) are partially responsible for the "delayed bitterness" of citrus fruit juice, which causes serious economic losses for the citrus juice industry worldwide (10). In mammalian systems, several limonoids have shown inhibition of HIV-1 replication (11) and anti-inflammatory activity (12). Some limonoids of pharmaceutical interest have also been associated with specific mechanisms of action: gedunin (Fig. 1) and nimbolide (fig. S1) exert potent anti-cancer activity through Hsp90 inhibition (13) and RNF114 blockade (14,15), respectively.
Limonoids are unusual within the triterpene class due to their extensive biosynthetic scaffold rearrangements. They are referred to as tetranortriterpenoids because their signature tetracyclic triterpene scaffold (protolimonoid) loses four carbons during the formation of a signature furan ring to give rise to the basic C26 limonoid structure (Fig. 1). A diversity of modifications can then occur to the basic limonoid scaffold through the cleavage of one or more of the four main rings (16,17) (fig. S1). Radioactive isotope labeling studies suggest that most Rutaceae limonoids are derived from a nomilin-type intermediate (seco-A,D ring scaffolds) whereas Meliaceae limonoids are derived from an azadirone-type intermediate (intact A ring) (Fig. 1) (4,5,18,19). It is proposed that at least two main scaffold modifications are conserved in both plant families: a C-30 methyl shift of the protolimonoid scaffold (apo-rearrangement) and the conversion of the hemiacetal ring of melianol (1) to a mature furan ring with a concomitant loss of the C-25~C-28 carbon side chain (Fig. 1) (20). Additional modifications specific to Rutaceae and Meliaceae would then yield the nomilin- and azadirone-type intermediates. The diversity and array of protolimonoid structures isolated beyond melianol (1) (fig. S1) hint at a series of possible conserved biosynthetic transformations, including hydroxylation and/or acetoxylation on C-1, C-7 and C-21, which suggests involvement of cytochrome P450s (CYPs), 2-oxoglutarate-dependent dioxygenases (2-ODDs) and acetyltransferases.
Despite extensive interest in the biology and chemistry of complex plant triterpenes over the last half century, few complete biosynthetic pathways have been described. A notable exception is the disease resistance saponin from oat, avenacin A-1, whose pathway consists of 4 CYP-mediated scaffold modifications and 6 side-chain tailoring steps (21). Significant barriers to pathway reconstitution of complex triterpenes include a lack of knowledge of the structures of key intermediates, order of scaffold modification steps, instability of pathway precursors, and the challenge of identifying candidate genes for the anticipated >10 enzymatic transformations required to generate advanced intermediates. Limonoids are no exception; to date, only the first three enzymatic steps to the protolimonoid melianol (1) from the primary metabolite 2,3-oxidosqualene have been elucidated ( Fig. 1) (20). In this work, we used systematic transcriptome and genome mining, phylogenetic and homologous analysis, coupled with N. benthamiana as a heterologous expression platform, to identify suites of candidate genes from Citrus sinensis and Melia azedarach that can be used to reconstitute limonoid biosynthesis.

Identification of candidate limonoid biosynthetic genes
One genome of Rutaceae plants (C. sinensis var. Valencia) and several transcriptome resources, including from Citrus and Meliaceae plants (two from A. indica and one from M. azedarach), were previously used to identify the first three enzymes in the limonoid pathway (20). These included an oxidosqualene cyclase (CsOSC1 from C. sinensis, AiOSC1 from A. indica, and MaOSC1 from M. azedarach), and two CYPs (CsCYP71CD1/MaCYP71CD2 and CsCYP71BQ4/MaCYP71BQ5) that complete the pathway to melianol (20). To identify enzymes that further tailor melianol (1), we expanded our search to include additional sources. For Rutaceae enzyme identification, we included publicly available microarray data compiled by the Network inference for Citrus Co-Expression (NICCE) (22). For Meliaceae enzyme identification, we generated additional RNA-seq data and a reference-quality genome assembly and annotation.
Of the publicly available microarray data for Citrus, fruit datasets were selected for in-depth analysis because CsOSC1 expression levels were highest in the fruit, which has been implicated as the site of limonin biosynthesis and accumulation (19). Gene co-expression analysis was first performed on the Citrus fruit dataset using only CsOSC1 as the bait gene. This revealed promising candidate genes exhibiting highly correlated expression with CsOSC1 (fig. S2). As we characterized more limonoid biosynthetic genes (as described below), we also included these as bait genes to enhance the stringency of co-expression analysis and further refine the candidate list. The top-ranking candidate list is rich in genes typically associated with secondary metabolism (Fig. 2A). The list specifically included multiple predicted CYPs, 2-ODDs and acetyltransferases, consistent with the proposed biosynthetic transformations.
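The bait-gene ranking described above can be illustrated with a minimal sketch. It assumes Pearson correlation averaged over the bait genes as the co-expression score; the text does not state which metric was used, and the function name, gene names and expression values below are hypothetical toy data rather than values from the Citrus dataset.

```python
import numpy as np

def rank_by_coexpression(expr, genes, baits):
    """Rank non-bait genes by mean Pearson correlation with the bait genes.

    expr  : 2-D array-like, rows = genes, columns = samples
    genes : list of gene names, one per row of expr
    baits : list of bait gene names (each must appear in genes)
    Assumption: Pearson correlation; rows with zero variance are not handled.
    """
    expr = np.asarray(expr, dtype=float)
    corr = np.corrcoef(expr)                    # gene-by-gene correlation matrix
    bait_idx = [genes.index(b) for b in baits]  # rows/columns of the baits
    mean_r = corr[:, bait_idx].mean(axis=1)     # average correlation to all baits
    ranked = [(g, float(r)) for g, r in zip(genes, mean_r) if g not in baits]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked

# Hypothetical toy expression matrix (4 genes x 4 samples)
genes = ["bait_OSC", "cand_A", "cand_B", "cand_C"]
expr = [
    [1.0, 2.0, 3.0, 4.0],  # bait
    [2.1, 3.9, 6.2, 8.0],  # closely tracks the bait
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with the bait
    [1.0, 3.0, 2.0, 2.5],  # weakly correlated with the bait
]
ranking = rank_by_coexpression(expr, genes, ["bait_OSC"])
for gene, r in ranking:
    print(f"{gene}\t{r:.3f}")
```

Adding newly characterized genes to the `baits` list averages the score over more reference profiles, which is one simple way to mimic the increased stringency described in the text.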
Efforts to identify and clone candidate genes from M. azedarach have previously been limited by the lack of a reference genome with high-quality gene annotations and by the lack of suitable transcriptomic data for co-expression analysis (i.e. multiple tissues, with replicates). Therefore, in parallel to our search in Citrus, we generated genomic and transcriptomic resources for M. azedarach. A pseudochromosome-level, reference-quality M. azedarach genome assembly was generated using PacBio long-read and Hi-C sequencing technologies (table S1, fig. S3). Although the assembled genome size (230 Mbp) is smaller than available literature predictions for this species of 421 Mbp (23), the chromosome number (1n=14) matches literature reports (23) and was confirmed by karyotyping (fig. S4). The genome assembly annotation predicted 22,785 high-confidence protein coding genes (Fig. 2B, table S1). BUSCO assessment (24) of this annotation confirmed the completeness of the genome, as 93% of expected orthologs are present as complete single copy genes (comparable to 98% in the gold standard Arabidopsis thaliana) (Fig. 2B, table S1).
Illumina paired-end RNA-seq reads were generated for three different M. azedarach tissues (7 different tissues in total, with four replicates of each tissue, table S2), previously shown to differentially accumulate and express limonoids and their biosynthetic genes (20). Read counts were generated by aligning RNA-seq reads to the genome annotation, and EdgeR (25) was used to identify a subset of 18,151 differentially expressed genes (P-value < 0.05). The known melianol biosynthetic genes MaOSC1, MaCYP71CD2 and MaCYP71BQ5 (20) were used as bait genes for co-expression analysis across the sequenced tissues, and the resulting ranked list was filtered by InterPro domain annotations to enrich for relevant biosynthetic enzyme-coding genes. This informed the selection of 17 candidate genes for functional analysis along with Citrus candidates (Fig. 2C).
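The domain-based filtering step can be sketched as follows. The six InterPro identifiers are those listed in the heatmap legend of this manuscript; everything else (function name, gene identifiers, rank cutoff, the annotation table) is hypothetical and serves only to illustrate the idea of keeping top-ranked genes that carry a domain of biosynthetic interest.

```python
# InterPro domains of biosynthetic interest, as listed in the heatmap legend
DOMAINS_OF_INTEREST = {
    "IPR005123": "Oxoglutarate/iron-dependent dioxygenase",
    "IPR020471": "Aldo/keto reductase",
    "IPR002347": "Short-chain dehydrogenase/reductase SDR",
    "IPR001128": "Cytochrome P450",
    "IPR003480": "Transferase",
    "IPR007905": "Emopamil-binding protein",
}

def filter_candidates(ranked_genes, annotations, top_n):
    """Keep genes ranked within top_n that carry at least one domain of interest.

    ranked_genes : gene IDs ordered from most to least co-expressed
    annotations  : dict mapping gene ID -> iterable of InterPro IDs
    Returns a list of (rank, gene, matching_domains) tuples.
    """
    wanted = set(DOMAINS_OF_INTEREST)
    kept = []
    for rank, gene in enumerate(ranked_genes[:top_n], start=1):
        hits = sorted(wanted & set(annotations.get(gene, ())))
        if hits:
            kept.append((rank, gene, hits))
    return kept

# Hypothetical ranked list and annotation table
ranked_genes = ["gene1", "gene2", "gene3", "gene4"]
annotations = {
    "gene1": ["IPR001128"],               # predicted CYP
    "gene2": ["IPR000719"],               # unrelated domain, filtered out
    "gene3": ["IPR003480", "IPR000719"],  # predicted transferase
    "gene4": ["IPR007905"],               # below the rank cutoff used here
}
candidates = filter_candidates(ranked_genes, annotations, top_n=3)
for rank, gene, hits in candidates:
    print(rank, gene, [DOMAINS_OF_INTEREST[h] for h in hits])
```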

Citrus CYP88A51 and Melia CYP88A108 act with different melianol oxide isomerases (MOIs) to form distinct proto-limonoid scaffolds
Top-ranking genes from both the Citrus and Melia candidate lists (Fig. 2A, 2C) were tested for function by Agrobacterium-mediated transient expression in N. benthamiana with the previously reported melianol (1) biosynthetic enzymes CsOSC1, CsCYP71CD1, and CsCYP71BQ4 or AiOSC1, MaCYP71CD2, and MaCYP71BQ5. LC/MS analysis of crude methanolic extracts from N. benthamiana leaves revealed that the expression of either CsCYP88A51 or MaCYP88A108, in combination with their respective melianol biosynthesis genes, led to the disappearance of melianol (1) and the accumulation of multiple mono-oxidized products (Fig. 3A, fig. S5 to S6). This result suggested that, while these CYP88A enzymes accept melianol as a substrate, the resulting products could be unstable or undergo further modification by endogenous N. benthamiana enzymes.
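As a worked illustration of how product masses are interpreted in such screens, the short sketch below computes the monoisotopic mass shifts expected for the transformations discussed in this manuscript: mono-oxidation (net gain of O), acetylation (net gain of C2H2O) and loss of a C4H10O fragment. The helper name and dictionary layout are hypothetical; the atomic masses are standard monoisotopic values, and no adduct or charge state is modeled.

```python
# Standard monoisotopic atomic masses (Da)
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}

def formula_mass(counts):
    """Monoisotopic mass of an elemental composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * n for element, n in counts.items())

# Net elemental changes for transformations mentioned in the text
oxidation = formula_mass({"O": 1})                    # mono-oxidation: +O
acetylation = formula_mass({"C": 2, "H": 2, "O": 1})  # O-acetylation: +C2H2O
c4h10o_loss = -formula_mass({"C": 4, "H": 10, "O": 1})  # C4H10O fragment loss

print(f"mono-oxidation : {oxidation:+.4f} Da")
print(f"acetylation    : {acetylation:+.4f} Da")
print(f"C4H10O loss    : {c4h10o_loss:+.4f} Da")
```

A product described as an isomer of a mono-oxidized intermediate would show the same oxidation shift relative to melianol but a different retention time, which is why chromatographic separation is needed alongside the mass measurement.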
Despite the accumulation of multiple related metabolites, we continued to screen additional co-expressed candidate genes for further activity. This screen included homologs of A. thaliana HYDRA1, an ER membrane protein known as a sterol isomerase (SI) (two from the Citrus candidate list, and one from the Melia list). SIs are exclusively associated with phytosterol and cholesterol biosynthesis, where they catalyze double bond isomerization from the C-8 to the C-7 position. They are present in all domains of life and are required for normal development of mammals (26), plants (27) and yeast (28). Testing of these putative SIs through transient Agrobacterium-mediated gene expression in N. benthamiana resulted in a marked change of the metabolite profile, with the accumulation of a single mono-oxidized product with no mass change (Fig. 3A, fig. S7). We suspected that these enzymes were able to capture unstable intermediates and promote isomerization of the C-30 methyl group required to generate mature limonoids. These sterol isomerases are therefore re-named melianol oxide isomerases, CsMOI1-3 and MaMOI2, because of their ability to generate isomers of mono-oxidized melianol products.
SIs are typically found as single copy genes in a given plant species. Surprisingly, we found additional putative SI genes in the C. sinensis and M. azedarach genomes, four and three, respectively (fig. S8). Phylogenetic analysis of SIs across a set of diverse plant species revealed that SIs from C. sinensis and M. azedarach fall into two distinct sub-clades (Fig. 3B). The more conserved of these clades contained one sequence from each species (CsSI and MaSI), whilst the more divergent clade contained the remaining SIs (CsMOI1-3 and MaMOI1,2). This suggested that CsSI and MaSI are the conserved genes involved in phytosterol biosynthesis. Comparison of all C. sinensis and M. azedarach SI/MOI protein sequences showed that CsMOI2 is ~93% identical at the protein level to CsMOI3 and ~83% to MaMOI2, but only ~54% and ~60% similar to CsMOI1 and CsSI, respectively (Fig. 3C). While CsMOI1, CsMOI2, and MaMOI2 ranked among the top 100 genes in our co-expression analysis lists (Fig. 3D), CsSI, MaMOI1 and MaSI do not co-express with limonoid biosynthetic genes. The absence of CsMOI3 from this list is attributed to the lack of specific microarray probes required for expression monitoring. Notably, screening of CsSI in the N. benthamiana expression system did not change the product profile of CsCYP88A51, consistent with its predicted involvement in primary metabolism based on the phylogenetic analysis (Fig. 3A).
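Pairwise protein identity values such as those quoted above are computed from sequence alignments. The sketch below shows one common convention, assumed here for illustration: identical residues divided by the number of aligned columns in which neither sequence has a gap. The text does not state which convention or alignment tool was used, and the sequences below are hypothetical toy strings, not SI or MOI sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are excluded
    from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Hypothetical toy alignment: 5 ungapped columns, 4 identical residues
identity = percent_identity("MKT-LV", "MKSALV")
print(f"{identity:.1f}% identity")
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give different percentages for the same alignment, so reported values are only comparable when the same definition is used throughout.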
To determine the chemical structures of the isomeric products formed through the action of these MOIs, we carried out large-scale expression experiments in N. benthamiana and isolated 13.1 mg of pure product. NMR analysis revealed the product of MaMOI2 to be the epimeric mixture apo-melianol (3), bearing the characteristic limonoid scaffold with a migrated C-30 methyl group on C-8, a C-14/15 double bond, and C-7 hydroxylation (Fig. 3E, table S3) (29). While the structure of the direct product of CsMOI2 was not determined until after the discovery of two additional downstream tailoring enzymes, NMR analysis also confirmed C-8 methyl migration (table S4). These data indicate that, as predicted by sequence analysis, CsMOI2 and MaMOI2 are indeed functional homologs and catalyze a key step in limonoid biosynthesis by promoting an unprecedented methyl shift. Analysis of the product formed with expression of CsMOI1 indicated the presence of a metabolite with a different retention time relative to apo-melianol (3) (Fig. 3A). Isolation and NMR analysis of (4'), a metabolite derived from (4) after inclusion of two additional tailoring enzymes (table S5), indicated C-30 methyl group migration to C-8 and cyclopropane ring formation via bridging of the C-18 methyl group to C-14.
Based on the characterized structures, we proposed that in the absence of MOIs, the CYP88A homologs form the unstable C-7/8 epoxide (2), which may either spontaneously undergo a Wagner-Meerwein rearrangement via C-30 methyl group migration and subsequent epoxide ring opening or degrade through other routes to yield multiple rearranged products (2a), (2b), (2c) and (3) (Fig. 3E). MOIs appear to stabilize the unstable carbocation intermediate and isomerize it to two types of limonoids: CsMOI2, CsMOI3 and MaMOI2 form the C-14/15 double bond scaffold (classic limonoids) while CsMOI1 forms the cyclopropane ring scaffold (glabretal limonoids). Glabretal limonoids have been isolated from certain Meliaceae and Rutaceae species before but are less common (30,31). Together, our results suggest that CsCYP88A51, MaCYP88A108 and two different types of MOIs are responsible for rearrangement from melianol (1) to either (3) or (4) through an epoxide intermediate (2). These MOIs represent neofunctionalization of sterol isomerases from primary metabolism in plants.

Characterization of conserved tailoring enzymes L21AT and SDR
Having identified enzymes for the methyl shift characteristic of limonoids, we continued screening other candidate genes (Fig. 2A, 2C) for activity on (3) towards downstream products. BAHD-type acetyltransferases (named CsL21AT or MaL21AT, limonoid 21-O-acetyltransferase) and short-chain dehydrogenase reductases (CsSDR and its homolog MaSDR) result in the loss of compound (3) and the accumulation of acetylated and dehydrogenated products, respectively (fig. S9 to S12). While the sequence of events can be important for some enzymatic transformations in plant biosynthesis, L21AT and SDR homologs appear to have broad substrate specificity. Our data suggest that L21AT can act on (1) or (3), and SDR is active on all intermediates after the OSC1 product (fig. S13 to S14), suggesting a flexible reaction order in the early biosynthetic pathway.
Furthermore, the products formed from the modification of (3) by both Citrus and Melia L21AT and SDR homologs were purified by large-scale N. benthamiana expression and structurally determined by NMR to be 21(S)-acetoxyl-apo-melianone (6) (Fig. 4A, table S4, table S6 (table S3). Overall, our results indicated that L21AT acetylates the C-21 hydroxyl and SDR oxidizes the C-3 hydroxyl to the ketone on early protolimonoid scaffolds.

Citrus and Melia cytochrome P450s catalyze distinct limonoid A-ring modifications
Further Citrus and Melia candidate screens (Fig. 2A, 2C) revealed two Citrus CYPs, CsCYP716AC1 and CsCYP88A37, that are each capable of oxidizing (6) directly to (7) and (8) or consecutively to (9) (Fig. 4A, fig. S17 to S19), and one CYP from Melia (MaCYP88A164, a homolog of CsCYP88A37) that is also capable of oxidizing (6) to (8) (Fig. 4A, fig. S20). Purification and NMR analysis of the downstream product (9) revealed it to be 1-hydroxy-luvungin A, which bears an A-ring lactone (table S8). Additional NMR product characterization suggests that CsCYP716AC1 is responsible for A-ring lactone formation and CsCYP88A37 is responsible for C-1 hydroxylation (table S9). Although the exact order of oxidation steps to (9) appeared to be interchangeable for CsCYP716AC1 and CsCYP88A37, incomplete disappearance of (6) by CsCYP88A37 suggested that oxidation by CsCYP716AC1 takes precedence (fig. S19).
Interestingly, in the absence of CsSDR, neither CsCYP716AC1 nor CsCYP88A37 results in an oxidized protolimonoid scaffold, suggesting the necessary involvement of the C-3 ketone for further processing (fig. S21). These results, in combination with NMR characterization, indicated that CsCYP716AC1 is likely responsible for Baeyer-Villiger oxidation to the A-ring lactone structure signature of Rutaceae limonoids. Comparative transcriptomics in M. azedarach revealed the lack of an obvious CsCYP716AC1 homolog. The closest Melia enzyme to CsCYP716AC1 is truncated, not co-expressed with melianol biosynthetic genes, and shares only 63% protein identity (table S10). These results highlight a branch point between biosynthetic routes in the Rutaceae and Meliaceae families.

Acetylations complete tailoring in both Citrus and Melia protolimonoid scaffolds and set the stage for furan ring biosynthesis
Subsequent Citrus and Melia gene candidate screens (Fig. 2A, 2C) revealed further activity of BAHD acetyltransferases. CsL1AT and its homolog MaL1AT (named limonoid 1-O-acetyltransferase) appear to be active on (9) and (8), respectively (fig. S22 to S23). When CsL1AT was co-expressed with the biosynthetic genes for (9), a new molecule (11) with mass corresponding to acetylation of (9) was observed. When CsCYP88A37 was omitted, acetylation of (7) was not observed (fig. S24), suggesting that CsL1AT acetylates the C-1 hydroxyl of (9) to yield (11). Surprisingly, when CsCYP716AC1 was omitted from the Citrus candidates or when MaL1AT was tested, the dehydration scaffold (10) accumulated (fig. S23 to S24). Large-scale transient plant expression, purification, and NMR analysis of the dehydration product showed that the structure (10) (table S11 to S12) contains a C-1/2 double bond and is an epimer of a previously reported molecule from A. indica (33). (10) also accumulates in M. azedarach extracts (fig. S16). Two more co-expressed Citrus and Melia acetyltransferase homologs, CsL7AT and MaL7AT (named limonoid 7-O-acetyltransferase), were found to result in acetylated scaffolds (12) and (13); modification at the C-7 hydroxyl was confirmed by the purification and NMR analysis of (13) and its degradation product (13') (Fig. 2A, 2C, fig. S25 to S26, table S13 to table S14).
Taken together, these data suggest that three acetyltransferases (L1AT, L7AT, and L21AT) act in the biosynthesis of the tri-acetylated 1,7,21-O-acetyl protolimonoid (13) (Fig. 4A). However, we also observed the accumulation of two di-acetylated intermediates, (11) (1,21-O-acetyl) and (11a) (1,7-O-acetyl), when testing gene sets that lead to accumulation of (13) (fig. S27). This observation hints at the possibility of multiple sequences for enzymatic steps that comprise a metabolic network, at least in the context of pathway reconstitution in the heterologous host N. benthamiana.

Downstream enzymes complete the biosynthesis to the furan-containing products azadirone (18) and kihadalactone A (19)
With acetylation established, the key enzymes involved in the C4 scission implicated in furan ring formation remained elusive. It was unclear which enzyme classes could catalyze these modifications. We screened gene candidates via combinatorial transient expression in N. benthamiana as previously described and ultimately identified three active candidate pairs (one from each species): the aldo-keto reductases (CsAKR/MaAKR), the CYP716ADs (CsCYP716AD2/MaCYP716AD4), and the 2-ODDs (named limonoid furan synthase, CsLFS/MaLFS) (Fig. 2A, 2C). Systematic testing of these gene sets resulted in the accumulation of the furan-containing molecules azadirone (18) and kihadalactone A (19), two limonoids present in the respective native species. When CsAKR/MaAKR was tested alone in our screens, we identified the appearance of a new peak with mass corresponding to reductive deacetylation of (12) or (13) (fig. S28 to S29). The product generated by expression of the Melia gene set in N. benthamiana was purified and characterized via NMR analysis to be the 21,23-diol (14) (Fig. 4A, table S15). Thus, the corresponding CsAKR product (15) was proposed to share the same diol motif.
Unexpectedly, transient expression of MaCYP716AD4 or CsCYP716AD2 with the biosynthetic genes for (14) or (15) (16a and 17a) and a C4H10O fragment loss (16b and 17b) from their respective precursors (Fig. 4A, fig. S30 to S31). It is unclear whether these observed masses correspond to the true products of CYP716ADs or whether these are further modified by endogenous N. benthamiana enzymes. CYP716AD products are proposed to contain C-21 hydroxyl and C-23 aldehyde functionalities (16c and 17c), which could also spontaneously form the five-membered hemiacetal ring (16d and 17d) (Fig. 4A, fig. S32). A new peak with a mass equivalent to (16c or 16d) is identifiable alongside (16a and 16b) when transiently expressing MaCYP716AD4 with the biosynthetic genes required for accumulation of (14) (fig. S31). We found that additional co-expression of LFS with the characterized genes that result in (16) and (17) yields accumulation of products (18) and (19) (fig. S33 to S34). Based on the predicted chemical formula, MS fragmentation pattern, and NMR analysis (fig. S33, table S16), we proposed the product of CsLFS to be kihadalactone A (19), a known furan-containing limonoid (34) previously identified in extracts from the Rutaceae plant Phellodendron amurense. We detected the presence of (19) Taken together, we have discovered the 10- and 11-step biosynthetic transformations that enable a reconstitution of the biosynthesis of two known limonoids, azadirone (18) and kihadalactone A (19), as well as an enzyme catalyzing the formation of the alternative glabretal scaffold (CsMOI1). Sequential introduction of these enzymes into N. benthamiana transient co-expression experiments demonstrates step-wise transformations leading to (18) and (19) (Fig. 4B). All of the enzymes involved in the biosynthesis of (18) and (19), except CsCYP716AC1, are homologous pairs, and show a gradually decreasing trend in protein identity from 86% for the first enzyme pair CsOSC1/MaOSC1 to 66% for CsLFS/MaLFS.
Intriguingly, despite the varied protein identities (Fig. 4B), these homologous enzymes from Melia or Citrus can be used to create functional hybrid pathways comprising a mix of species genes, supporting a promiscuous evolutionary ancestor for each of the limonoid biosynthetic enzymes (fig. S37).

Author Manuscript

Discussion
A major challenge in elucidating pathways that involve many (e.g. >10) enzymatic steps is to determine whether the observed enzymatic transformations in a heterologous host are "on-pathway" and, if so, in what order they occur. It is important to note that while all enzymes described in Fig. 4 play a role in the production of final limonoid products, the sequence of enzymatic steps shown by the arrows is proposed based on the accumulation of observed metabolites after addition of each enzyme in the N. benthamiana heterologous expression system, and other sequences of steps are possible. For example, we have shown that CsAKR likely does not accept hemi-acetal (13) directly as a substrate (fig. S38) despite our observation that it accumulates as a major metabolite when all upstream enzymes are expressed. Intriguingly, while one expects a pathway without CsL21AT to still be functional, as the C-21 acetal product (11a) appears to undergo reduction by CsAKR to yield (15), attempts to drop out CsL21AT led to significantly reduced yield of (19) (fig. S39), suggesting that CsL21AT might have other unexpected roles in the pathway. In addition, reconstitution of several partial pathways indicates that some pathway enzymes can accept multiple related substrates. For example, each step after apo-melianol can diverge into multiple pathways, likely due to the promiscuity of these enzymes. Taken together, these data indicate that enzymes in limonoid biosynthesis might collectively function as a metabolic network (fig. S40). Further study of each individual enzyme in vitro with purified substrate will be required to quantify substrate preference. This metabolic network observed in N. benthamiana suggests one possible strategy for how Rutaceae species access such a diverse range of limonoids; we anticipate that additional enzymes will further expand the network, e.g. for the oxidative cleavage of ring C, ultimately resulting in the most extensively rearranged and modified limonoid scaffolds isolated to date, e.g. azadirachtin (Fig. 1).
Among the 12 chemical transformations catalyzed by the 22 enzymes characterized in this study, several were not previously known in plant specialized metabolism. For example, MOI1 and MOI2, which appear to have evolved from sterol isomerases, are capable of catalyzing two different scaffold rearrangements despite their conserved active site residues (fig. S41). The co-localization of the limonoid biosynthetic gene MaMOI2 with two other non-limonoid SI genes in the M. azedarach genome is consistent with the origin of MaMOI2 by tandem duplication and neofunctionalization (fig. S42); this genomic arrangement is conserved in Citrus on chromosome 5 as well. Furthermore, recent findings demonstrate a similar role of these enzymes in quassinoid biosynthesis (35). Other noteworthy enzymatic reactions in the limonoid pathway include C-4 scission and furan ring installation, which generate an important pharmacophore of the limonoids. Although furan-forming enzymes have been reported from other plants (36-38), the AKR, CYP716AD and 2-ODD module described here represents a new mechanism of furan formation via the oxidative cleavage of a C-4 moiety. Along with the sterol isomerases (MOIs), the AKR and 2-ODDs add to the growing pool of enzyme families (39,40) associated with primary sterol metabolism that appear to have been recruited to plant secondary triterpene biosynthesis, likely due to the structural similarities between sterols and tetracyclic triterpenes.
Limonoids are only one of many families of triterpenes from plants with complex scaffold modifications. Other examples include the Schisandra nortriterpenes (41), quinonoids (42), quassinoids (43), and dichapetalins (42); each represents a large collection of structurally diverse terpenes that contains several members with potent demonstrated biological activity but no biosynthetic route. Despite the value of these complex plant triterpenes, individual molecular species are typically only available through multi-step chemical synthesis routes or isolation from producing plants, limiting drug development (15) and agricultural utility (9). Many are only easily accessible in unpurified extract form that contains multiple chemical constituents; for example, azadirachtin, one of the most potent limonoids, can only be obtained commercially as a component of neem oil. Our results demonstrate that pathways to triterpenes with complex scaffold modifications can be reconstituted in a plant host, and the gene sets we describe enable rapid production and isolation of naturally occurring limonoids. We anticipate that bioproduction of limonoids will serve as an attractive method to generate clinical candidates for evaluation, and that stable engineering of the limonoid pathway could be a viable strategy for sustainable crop protection.

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.
(20), shown in red) and biosynthetic annotation. Heatmap (constructed using Heatmap3 V1.1.1 (44), with scaling by row (gene)) includes genes that are ranked within the top 87 for co-expression and are annotated with one of six InterPro domains of biosynthetic interest (IPR005123 (Oxoglutarate/iron-dependent dioxygenase), IPR020471 (Aldo/keto reductase), IPR002347 (Short-chain dehydrogenase/reductase SDR), IPR001128 (Cytochrome P450), IPR003480 (Transferase) and IPR007905 (Emopamil-binding protein)). Asterisks indicate the following: (*) full-length gene identified in transcriptomic rather than genomic data via sequence
(A) Gene sets that lead to the production of azadirone (18) and kihadalactone A (19) in N. benthamiana leaves. Genes from Citrus are shown in blue and those from Melia are shown in green. The arrow reflects accumulation of the metabolites after addition of the associated enzyme as shown in Panel B rather than true enzymatic substrate-product relationship. In addition, limonoid biosynthesis likely proceeds as a network; other possible reaction sequences are shown in fig. S40. Diamonds represent intermediates whose structures were supported either by NMR analysis of the purified product or comparison with an authentic standard (18). (3), (6), (9), (10), (13) and (14) were purified from N. benthamiana leaf extracts expressing the respective biosynthetic gene sets and analyzed by NMR; the structures of (7) and (19) are supported by partial NMR. Additionally, a side product (20), formed in experiments with all pathway enzymes up to and including MaCYP716AD4 but without MaL7AT (fig. S44), was purified and confirmed by NMR (table S20).