Allylic hydroxylation of triterpenoids by a plant cytochrome P450 triggers key chemical transformations that produce a variety of bitter compounds

Cucurbitacins are highly oxygenated triterpenoids characteristic of plants in the family Cucurbitaceae and responsible for the bitter taste of these plants. Fruits of bitter melon (Momordica charantia) contain various cucurbitacins possessing an unusual ether bridge between C5 and C19, not observed in other Cucurbitaceae members. Using a combination of next-generation sequencing and RNA-Seq analysis and gene-to-gene co-expression analysis with the ConfeitoGUIplus software, we identified three P450 genes, CYP81AQ19, CYP88L7, and CYP88L8, expected to be involved in cucurbitacin biosynthesis. CYP81AQ19 co-expression with cucurbitadienol synthase in yeast resulted in the production of cucurbita-5,24-diene-3β,23α-diol. A mild acid treatment of this compound resulted in an isomerization of the C23-OH group to C25-OH with the concomitant migration of a double bond, suggesting that a nonenzymatic transformation may account for the observed C25-OH in the majority of cucurbitacins found in plants. The functional expression of CYP88L7 resulted in the production of hydroxylated C19 as well as C5-C19 ether-bridged products. A plausible mechanism for the formation of the C5-C19 ether bridge involves C7 and C19 hydroxylations, indicating a multifunctional nature of this P450. On the other hand, functional CYP88L8 expression gave a single product, a triterpene diol, indicating a monofunctional P450 catalyzing the C7 hydroxylation. Our findings of the roles of several plant P450s in cucurbitacin biosynthesis reveal that an allylic hydroxylation is a key enzymatic transformation that triggers subsequent processes to produce structurally diverse products.


Terpenoids have been used as a source for medicines, cosmetics, materials, and food additives, making them one of the most important groups of compounds from natural sources. The diversity of terpenoid structures originates from variation in both the carbon skeleton and the modifications of the core carbon structure, mostly through oxidation and glycosylation. Although terpene synthases have been well-documented (1), studies on terpene modification enzymes are relatively limited. Oxidative modifications are often carried out by cytochrome P450 monooxygenases (P450s), which comprise one of the largest oxygenase families in biological systems. In plants, numerous P450 genes exist in the genome and are involved in various processes, such as herbicide metabolism and the biosynthesis of plant hormones and secondary metabolites (2,3). Owing to the large number of genes, identifying the specific P450 gene involved in a particular process of interest has been difficult. Furthermore, the number of P450s involved in a particular transformation step is often unknown.
We have initiated studies on one of the Cucurbitaceae plants, Momordica charantia, known as bitter melon. M. charantia is known for its bitter taste and is used as a vegetable worldwide, being especially popular in Asian countries. Like other Cucurbitaceae plants, M. charantia possesses numerous cucurbitacins, but it also contains the characteristic cucurbitacins goyaglycosides and momordicosides, which contain an ether linkage between C5 and C19 that is not present in other cucurbitacins (Fig. 1) (24). Some of these goyaglycosides exhibit anti-diabetic activity and are promising natural agents for the treatment of diabetes (5). Here, we report the identification of three novel P450 genes from M. charantia and, through functional expression in yeast, reveal an unprecedented transformation in cucurbitacin biosynthesis.

Selection of candidate genes involved in cucurbitacin biosynthesis using RNA-Seq data sets
To identify the modification enzymes involved in cucurbitacin biosynthesis, we utilized the expression profiling data sets from 10 different tissues of M. charantia obtained in our previous RNA-Seq analysis (25). We focused particularly on P450s, which mainly contribute to the oxygenation reactions of triterpenoid backbones. Our previous studies revealed that oxidosqualene cyclase (OSC) genes showed characteristic tissue-dependent expression patterns (25). In particular, the M. charantia McCBS gene, encoding cucurbitadienol synthase, showed the highest expression in leaves rather than in fruits. Therefore, P450 genes involved in cucurbitacin biosynthesis were expected to show a similar expression pattern. First, 27,127 total contigs in the RNA-Seq data sets were annotated with a BLASTX search. Next, contigs that showed an expression pattern highly similar to that of the McCBS gene were selected using the ConfeitoGUIplus software (version 1.2.3), which was designed to detect correlation networks (26). Eighteen contigs were obtained, which contained three P450 genes, namely M01465 (CYP81AQ19), M04110 (CYP88L7), and M00873 (CYP88L8) (Fig. 2 and Table S2). Therefore, we decided to determine the function of these candidate P450s using heterologous co-expression with McCBS in yeast cells.

Cucurbitacin biosynthesis in M. charantia
Among the other contigs obtained, one was annotated as squalene monooxygenase, the enzyme that converts squalene to 2,3-oxidosqualene, a substrate for McCBS. Moreover, two contigs that exhibited high correlations with McCBS were annotated as transporters. These candidates were expected to be involved in triterpene biosynthesis and transport.

Functional analysis of CYP81AQ19
Heterologous expression of the candidate P450 genes was carried out in Saccharomyces cerevisiae strain GIL747. This budding yeast strain accumulates 2,3-oxidosqualene, which is a ubiquitous substrate for OSC, due to the lack of an endogenous lanosterol synthase. S. cerevisiae strain GIL747 has been used for functional analysis of genes involved in triterpene biosynthesis (27). Strain GIL747 was transformed to express McCBS, each candidate P450, and a cytochrome P450 reductase from Lotus japonicus (LjCPR, GenBank™ accession no. AB433810). First, we examined the function of CYP81AQ19. The transformed yeast cells were cultured, expression was induced by galactose, and the collected cells were extracted with hexane and ethyl acetate (3:1) and analyzed by HPLC/tandem MS (HPLC/MS-MS). As a result, a specific product 2 was detected in extracts of yeast expressing CYP81AQ19, McCBS, and LjCPR (Fig. 3 and Fig. S1). The mass spectrum of this product did not show any ion peak that corresponded to the expected [M + H]+ at m/z 443 (M + H + 16) that would arise from the introduction of an oxygen atom due to hydroxylation. Instead, it showed an ion peak at m/z 425, which may derive from the dehydration of a newly introduced hydroxyl group (M + H + 16 − 18) (Figs. S2 and S3). If so, this hydroxyl group seems to be rather labile. An additional dehydrated ion peak at m/z 407 was seen, which was most likely derived from the dehydration of the 3β-hydroxyl group. Assuming that the hydroxylation of cucurbitadienol took place, a large-scale (3-liter) yeast culture was carried out, and product 2 was extracted and purified by silica gel column chromatography for NMR analysis. The 1H NMR spectrum of 2 exhibited eight methyl signals, including one secondary methyl and two vinylic methyls, a hydroxymethine signal at δ 3.47, and two olefinic signals at δ 5.19 and 5.58, all of which largely coincided with those of authentic cucurbitadienol (Figs. S5 and S6) (7). Most notably, however, the appearance of a δ 4.46 (td) signal suggested the presence of an allylic alcohol moiety. Correspondingly, the olefinic signal for C24 (δ 5.09) and the two vinylic methyl signals for C26 and C27 (δ 1.68 and 1.60) of cucurbitadienol were shifted to a lower magnetic field (δ 5.19, 1.70, and 1.68, respectively).
These results suggested that the newly introduced hydroxyl group was located at C23. In support of this idea, the 13C NMR spectrum showed a shift of the C23 signal from δ 24.8 in cucurbitadienol to δ 65.9 (Fig. S7). Therefore, the structure of 2 was determined to be cucurbita-5,24-diene-3β,23-diol and was further verified by HMBC and HMQC (Figs. S8 and S9).
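The ion assignments above follow from simple nominal-mass arithmetic, which can be sketched as below. This is only an illustrative calculation, not part of the original analysis: it assumes the molecular formula C30H50O for cucurbitadienol and uses integer (nominal) atomic masses, and the function names are hypothetical.

```python
# Nominal (integer) atomic masses; sufficient for the unit-resolution m/z values quoted in the text.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}

# Assumption: cucurbitadienol has the molecular formula C30H50O.
CUCURBITADIENOL = {"C": 30, "H": 50, "O": 1}


def nominal_mass(formula):
    """Sum nominal atomic masses over an element-count mapping."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


def expected_ions(parent, n_hydroxylations=1, max_water_losses=2):
    """Return nominal m/z values for [M + H]+ of the hydroxylated product,
    followed by the ions formed by successive losses of water (18 each)."""
    hydroxylated = nominal_mass(parent) + 16 * n_hydroxylations  # +O per hydroxylation
    protonated = hydroxylated + 1                                # +H for [M + H]+
    return [protonated - 18 * k for k in range(max_water_losses + 1)]


if __name__ == "__main__":
    # Monohydroxylated cucurbitadienol: intact ion, then one and two dehydrations.
    print(expected_ions(CUCURBITADIENOL))
```

Under these assumptions the three values correspond to the expected but unobserved ion (m/z 443) and the two dehydrated ions that were observed (m/z 425 and 407).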

Structural conversion of 2 under nonenzymatic conditions
During our analysis of the absolute configuration of the C23 hydroxyl group of 2, the reaction with MTPA chloride under acidic conditions converted 2 to an unexpected product rather than the desired MTPA ester. To confirm the high reactivity of the allylic hydroxyl group under mild acidic conditions, 2 was treated with 0.1 N HCl/MeOH or kept in an NMR sample tube in CDCl3 for a few days. Tirucallane triterpenes having a similar 23α allylic hydroxyl group were previously shown to be prone to dehydration, resulting in a 23,25-diene under mild acidic conditions (29). As a result, we observed that 2 was converted to 3 and 4 in HCl/MeOH and to 4 in CDCl3 without any enzymes. In the 1H NMR spectrum of 3, the allylic hydroxymethine (δ 4.46, C23) and olefinic (δ 5.18, C24) signals disappeared, and a new exo-methylene signal (δ 4.85, C26) and two other olefinic signals (δ 5.62, C23 and δ 6.11, C24) were observed (Figs. S13-S15). Through comparison with published data on related tirucallane triterpenes (29), the structure of 3 was determined to be cucurbita-5,23,25-trien-3β-ol. On the other hand, the 1H and 13C NMR spectra of 4 indicated the loss of the allylic hydroxyl group and the presence of three olefinic protons overlapping at δ 5.58 (Figs. S16-S18). The presence of two olefins was supported by the 13C NMR spectrum (δ 121.4/141.2 for C5-C6 and δ 125.5/139.3 for C23-C24). In addition, a hydroxyl-bearing carbon signal was observed at δ 70.8, whereas two methyls attached to a carbon bearing a hydroxyl group were observed at δ 1.31 in the 1H NMR spectrum. All of these data pointed to a structure with Δ23-C25-ol and, by comparison with published data on related compounds with the same partial structure, the structure of 4 was determined to be cucurbita-5,23-diene-3β,25-diol. Taken together, these results demonstrated that the C23-hydroxylated cucurbitadienol 2 produced by CYP81AQ19 was converted to the C25-modified cucurbitadienols 3 and 4 under mild acidic conditions. Therefore, the C23 allylic alcohol was shown to be rather unstable and likely to either dehydrate or isomerize into a C25 tertiary alcohol with a concomitant migration of the double bond from C24(25) to C23(24) (Fig. 4).

Functional analysis of CYP88L7
Next, we examined the function of CYP88L7 by co-expressing it with McCBS and LjCPR in yeast GIL747. The extracts from the culture showed mainly two peaks in the LC/MS-MS analysis (Fig. 3). These products exhibited ion peaks that corresponded to an expected , suggesting that a hydroxylation and a formal dehydrogenation had taken place (Figs. S3 and S4). A sample was prepared from a large-scale culture (12 liters) and purified by silica gel column chromatography to give products 5 and 6. The 1H NMR spectrum of 5 was similar to that of cucurbitadienol. However, an AB system derived from the geminal protons of a hydroxymethyl group was seen at δ 3.36 and 3.53 (J = 10.5 Hz), whereas the C19 methyl signal (δ 0.92 in cucurbitadienol) disappeared (Figs. S19 and S20). The 13C NMR spectrum also showed the existence of a hydroxymethyl carbon at δ 68.9 (Fig. S21). These results led to the determination of the structure of 5 as cucurbita-5,24-diene-3β,19-diol, which was further confirmed by HMBC and HMQC (Figs. S22 and S23). On the other hand, the 1H NMR spectrum of 6 exhibited similar but slightly shifted AB system geminal protons at δ 3.51 and 3.67, whereas 3α-H also showed a slight shift from a typical δ 3.47 to δ 3.40 compared with cucurbitadienol (Figs. S24 and S25). The C29 methyl signal also showed a downfield shift from a typical δ 1.12-1.13 to δ 1.20. Furthermore, a drastic change was observed in the olefinic region, where the C6 proton at δ 5.59 disappeared and signals at δ 5.64 and 6.04 appeared, whereas the C24 proton at δ 5.09 remained unchanged. In the 13C NMR spectrum, two carbinol carbons at δ 79.8 and 87.5 newly appeared (Fig. S26). These results suggested an ether bridge between C5 and C19, which is the most characteristic structural feature of the cucurbitacins of M. charantia. Through comparison with data from goyaglycosides (24) and from the HMBC and HMQC spectra (Figs. S27 and S28), the structure of 6 was confirmed to be 5β,19-epoxycucurbita-6,24-dien-3β-ol. Therefore, CYP88L7 functions as a cucurbitadienol C19 hydroxylase and, surprisingly, also catalyzes the formation of an ether bridge between C5 and C19 (Fig. 5).
We also tested the co-expression of CYP88L7 and CYP81AQ19. The yeast GIL747 expressing McCBS, CYP81AQ19, CYP88L7, and LjCPR was cultured, and the extracts were analyzed by LC/MS-MS (Fig. 3). Several products were observed, one of which gave the expected ion peak [M + H]+ at m/z 459 (M + H + 16 × 2), which would arise from the introduction of two oxygen atoms due to hydroxylations (Figs. S3 and S4). A large-scale culture (6 liters) was carried out to isolate these products. After silica gel column chromatography, two fractions were obtained, each of which was subjected to preparative HPLC separation. From the more polar fraction, three products, 7, 8, and 9, were obtained. Preliminary NMR spectra of the fraction containing the mixture of these compounds indicated the presence of both species, those having a C23 allylic hydroxyl group and those having Δ23-C25-ol in the side chain (Figs. S29-S32). In addition, the presence of a C19 hydroxyl group (an AB system at δ 3.38 and 3.55) was observed. Moreover, an unassigned carbinol methine proton and an olefinic proton were seen. Most notably, the presence of a δ 9.73 signal pointed to an aldehyde moiety. The 1H NMR spectrum of isolated 7 was very similar to that of 5, indicating the presence of a C19 hydroxyl group (Figs. S33 and S34). The only difference was the presence of a δ 4.46 signal for C23, which indicated the presence of a C23 allylic hydroxyl group. The 13C NMR, as well as the HMBC spectra, confirmed the structure (Figs. S35 and S36), and 7 was determined as cucurbita-5,24-diene-3β,19,23α-triol. Compound 8 showed the presence of an aldehyde (1H: δ 9.70, 13C: δ 187.8), a C23 allylic hydroxyl group (δ 4.46), and Δ24 (δ 5.19) (Figs. S37 and S38). Furthermore, the C6 olefinic proton was shifted downfield (δ 5.89), and a new oxymethine proton at δ 3.97 appeared. From the HMBC spectrum obtained from the mixture sample (Fig. S31), a correlation from the aldehyde proton to C9 (δ 50.1) was observed, indicating that the aldehyde was attached to C9. Also, correlations were observed between the new oxymethine proton (δ 3.97) and C9 and C5 (δ 145.7), suggesting that the hydroxyl group was attached to C7. Therefore, the structure of 8 was determined as 3β,7,23α-trihydroxycucurbita-5,24-dien-19-al. The stereochemistry of the C7 hydroxyl was undetermined. Compound 9 showed a spectrum similar to that of 8, but with a side chain having Δ23-C25-OH, as evidenced by olefinic protons at δ 5.58 and methyls attached to a carbinol carbon at δ 1.31 (Figs. S39 and S40). Therefore, compound 9 was determined as 3β,7,25-trihydroxycucurbita-5,23-dien-19-al. The 13C NMR peaks of compounds 8 and 9 were tentatively assigned based on the spectrum obtained for the mixture.
On the other hand, from the less polar fraction on the silica gel column, an inseparable mixture of 10 and 11 was obtained. The 1H NMR spectrum of the mixture indicated the presence of a C5-C19 ether bridge with a double bond located between C6 and C7 (Figs. S41-S43). The signals at δ 4.47 and 5.58 also indicated the presence of both Δ24-C23-OH and Δ23-C25-OH structures in the side chain. The 13C NMR, as well as the HMBC and HMQC spectra of the mixture (Figs. S44-S46), confirmed the structures as 5β,19-epoxycucurbita-6,24-diene-3β,23α-diol for 10 and 5β,19-epoxycucurbita-6,23-diene-3β,25-diol for 11. Collectively, when expressed together with the C23 hydroxylase CYP81AQ19, CYP88L7 was shown to be a multifunctional enzyme catalyzing the C19 oxidation, not only to a hydroxyl or an ether but also to an aldehyde, along with the C7 hydroxylation (Fig. 5).

Functional analysis of CYP88L8
Finally, we examined the function of CYP88L8 by co-expressing it with McCBS and LjCPR in yeast GIL747. From the HPLC analysis of the extracts from the yeast culture, the specific product 12 was detected (Fig. 6A). A large-scale culture (8 liters) was carried out, and the compound was isolated by silica gel column chromatography. The 1H NMR spectrum showed a signal for a new oxymethine proton at δ 3.94, similar to those of 8 and 9, which suggested the presence of a hydroxyl group at C7 (Figs. S47 and S49). Other signals were very similar to those of cucurbitadienol. The 13C NMR spectrum also confirmed a new hydroxymethine carbon at δ 68.1 (Figs. S50 and S52). HMBC correlations were observed between this proton and C5, C6, and C9, supporting the structure having a hydroxyl group at C7 (Figs. S47 and S51). The stereochemistry of C7-OH was determined by NOE measurements, where NOE effects were observed between C7-H and C6-H, C7-H and C14-Me, and C14-Me and C6-H (Figs. S48, S53, and S54). Therefore, C7-H was determined to have the α configuration, indicating the 7β-OH stereochemistry. Thus, the structure of 12 was confirmed as cucurbita-5,24-diene-3β,7β-diol (Fig. 6B).

Expression analysis of P450s in M. charantia
The expression profiles of McCBS, CYP81AQ19, and CYP88L7 were examined by quantitative RT-PCR to verify the data from the RNA-Seq analysis. Consistent with the previous data, all three genes showed the highest expression in leaves, whereas moderate expression was seen in the roots for McCBS and CYP81AQ19 (Fig. 7). All of the genes showed very low expression in fruits.

Discussion
Our selection of candidate P450 genes using the newly developed ConfeitoGUIplus software (26), which is based on total gene value expression patterns across different organs from RNA-Seq analysis that are synchronized with that of a core skeleton-forming enzyme, CBS, successfully identified three genes responsible for cucurbitacin biosynthesis. The combination of RNA-Seq and gene-to-gene correlation analyses using the ConfeitoGUIplus software has the potential to become a standardized method for mining new enzymatic and regulatory genes involved in secondary metabolite biosynthesis in plants whose genomes have not been sequenced, such as many useful medicinal plants.
The three P450s that we identified belonged to the CYP81AQ and CYP88L families and were similar to the P450s reported for cucumber, CYP81Q58 and CYP88L2 (8), validating our selection method (Fig. S55). CYP81Q58 and CYP88L2 were reported to catalyze C25 and C19 hydroxylation, respectively. In contrast, our study demonstrated that M. charantia CYP81AQ19 catalyzed the C23 hydroxylation at the allylic position of cucurbitadienol. This was in sharp contrast to the function of CYP81Q58 in cucumber, which was reported to catalyze C25 hydroxylation (8). To our surprise, the C23 hydroxyl group was readily converted to a Δ23-C25-OH structure under mild acidic conditions, such as treatment with HCl/MeOH. The dehydration into a 23,25-diene was also observed. In fact, a similar dehydroxylation that triggered a double bond migration and a C25 methoxylation was previously reported (30). However, we did not observe any methoxylation at C25. Instead, only a hydroxylation was seen. In general, the allylic hydroxyl group is labile to dehydroxylation, which in this case resulted in a C23 allylic cation that underwent isomerization into a C25 tertiary cation accompanied by a double bond shift (Fig. 8A). Finally, a capture of the cation by a hydroxyl ion would produce C25-OH. This isomerization step should shift toward the C25 cation, as the tertiary cation is more stable than the C23 secondary cation. Based on these considerations, we concluded that 2 was a direct enzymatic product and not the result of the isomerization of C25-OH, and that the true function of CYP81AQ19 was C23 hydroxylation and not C25 hydroxylation. The isomerization of C23-OH was also observed for products derived from the co-expression of CYP81AQ19 and CYP88L7, such as in 8 and 9 and in 10 and 11. This indicated that the isomerization could take place during the extraction and purification steps.
These observations raised an important question regarding whether naturally occurring cucurbitacins truly have a C25 hydroxyl group. If such an isomerization readily takes place, it is difficult to deduce when and where it occurred. Most of the cucurbitacins isolated from M. charantia possessed a C25-OH group, with the exception of goyaglycoside-f, which had a C23-OH group (24). Moreover, the majority of cucurbitacins found in other Cucurbitaceae plants or other natural sources possessed a C25-OH group (6). Only a few were found to have C23-OH. It is still early to speculate, but finding a C23-hydroxylase in M. charantia strongly suggests that the P450s responsible for hydroxylation at these positions initially introduce a hydroxyl group at C23, which then isomerizes to C25 nonenzymatically. Whether this isomerization takes place inside the plant or during the extraction procedure is unknown. It also raises the possibility that the structures of the majority of cucurbitacins isolated so far might be artifacts produced during extraction procedures and that the original structures could have possessed a C23 allylic hydroxyl group. If a P450 introduces a hydroxyl group directly at C25, a rational mechanism of C25 hydroxylation would assume that the P450 abstracts an H radical from C23, generating an allylic radical, which then requires the migration of a double bond to give the C25 radical, which the Fe(IV)-oxo complex (compound I) attacks. Compared with this C25 scenario, hydroxylation at C23 seems more straightforward and does not require a double bond migration. In the case of C25, a question arises regarding what factor favors a double bond migration before the attack of compound I on the radical center. It also requires the substrate radical intermediate to shift with regard to the iron center on the distal side of the heme, right after the H abstraction, in order for compound I to attack a carbon different from the one from which it originally abstracted the H radical. In this regard, the function of CYP81Q58 should be carefully reexamined to see whether it truly hydroxylates the C25 position with a double bond migration from Δ24 to Δ23, or whether the observed C25 product is an artifact of the isomerization of C23-OH. In either case, the reaction of CYP81AQ19 offers a unique opportunity to study the allylic hydroxylation catalyzed by a P450, which seems to be a rare case among P450 reactions.
The function of M. charantia CYP88L7 was shown to be cucurbitadienol C19-hydroxylation, similar to that of cucumber CYP88L2 (8). However, to our surprise, M. charantia CYP88L7 also catalyzed the formation of the ether bridge between C5 and C19. A rational mechanism for the formation of the ether bridge can be envisioned as follows (Fig. 8B). Both C19 and C7 hydroxylations are followed by a dehydroxylation at C7, generating an allylic cation, which then triggers a double bond migration to C6(7). Finally, the capture of the resulting C5 cation by the C19 hydroxyl produces the ether bridge. Therefore, unlike CYP88L2, M. charantia CYP88L7 also catalyzed the C7 hydroxylation in addition to the C19 hydroxylation. This was also evident from the deduced structures of 8 and 9, which have a hydroxyl at C7. Moreover, CYP88L7 also catalyzed the oxidation of the C19 hydroxyl to give the C19 aldehyde. Such C19 aldehyde species were found in M. charantia (e.g., momordicoside K) (24). In fact, the structure of 9 corresponded to the aglycone of this compound. Therefore, the presence of unique cucurbitacins having an ether bridge between C5 and C19 in M. charantia is due to the presence of a multifunctional P450 that hydroxylates both C7 and C19. The absence of cucurbitacins having an ether bridge in cucumber is consistent with the presence of a monofunctional CYP88L2 that can only hydroxylate C19. Whether the ether bridge formation by CYP88L7 is enzymatically catalyzed is unknown at this moment. Unlike in the CYP81AQ19 case, no intermediate possessing the C7 and C19 diol has been isolated. A nonenzymatic ether bridge formation after a P450 hydroxylation is known in the CYP707A case, where a P450 catalyzes the 8′-hydroxylation of the plant hormone abscisic acid, which is followed by the concomitant Michael addition of the resulting hydroxyl to produce phaseic acid (31,32). 
Further studies are needed to see whether this ether bridge formation takes place in the active-site cavity of CYP88L7 or not. Similarly, the key to the ether bridge formation is the allylic hydroxylation at C7, which triggers the dehydroxylation that leads to the subsequent transformations.
On the other hand, the function of CYP88L8 was shown to be cucurbitadienol C7β hydroxylase. Although CYP88L8 shares 79% amino acid sequence identity with CYP88L7 (Fig. 9), it was a monofunctional P450 catalyzing a hydroxylation only at C7 and not at C19. This functional difference from CYP88L7, despite such high sequence identity, is especially intriguing. Structural studies would certainly illuminate these important points in the future. The functional difference between CYP88L8 and CYP88L2 is also noteworthy: although they share a sequence identity of 54%, they hydroxylate different carbons. Cucurbitacins having only the C7β hydroxyl group have been isolated from M. charantia (33), and CYP88L8 should be responsible for the production of such cucurbitacins.
The expression of two P450 genes was confirmed to be higher in leaves than in fruits (Fig. 7). Their synchronized expression with McCBS indicated common regulation of the core skeleton-forming OSC and the modifying enzymes. A transcriptional regulator similar to that previously identified in cucumber might be responsible for the regulation of the cucurbitacin pathway in M. charantia as well (8). This also suggested that cucurbitacins, at least the oxidized forms rather than the precursor cucurbitadienol, are transported from leaves to fruits, where large amounts of cucurbitacins accumulate and produce strong bitterness. However, whether glycosides are transported remains unknown, and answering this question requires the identification of glycosyltransferase genes.
Our current findings show that an allylic hydroxyl group in the cucurbitane skeleton is highly reactive, triggering nonenzymatic transformations. When complex structures are encountered in natural products, it is often not clear how many enzymes are needed to construct such complexity. In this case, the ether bridge between C5 and C19 represents an ambiguous case. At most, three or four enzymes could be predicted to participate in the formation of this structure. However, only one enzyme was needed to produce it. The number of genes required to construct a molecule might be smaller than the number of chemical steps predicted rationally. Organisms seem to take advantage of nonenzymatic transformations to construct complex structures that otherwise would require multiple enzymes. In these cases, allylic hydroxylation was a key step that triggered the subsequent nonenzymatic processes to construct complex structures. Furthermore, P450-catalyzed allylic oxidation might play a pivotal role in constructing complex structures in the biosynthesis of other natural products as well. So far, the supply of cucurbitacins by chemical synthesis or extraction from plants has been difficult from the viewpoint of cost, effort, and quantity. Our findings are expected to lead to the production of pure bioactive cucurbitacins using yeasts expressing these biosynthetic genes.

Plant materials
All 10 different types of tissues of M. charantia, namely old leaves, young leaves, stems, tendrils, male flowers, female flowers, fruits, seedling leaves, seedling stems, and seedling roots, were harvested in a vegetable field (Tateyama, Chiba) of the Southern Prefectural Horticulture Institute in the Chiba Prefectural Agriculture and Forestry Research Center and in a greenhouse (Kazusa DNA Research Institute) from 2012 to 2015. All tissues were cut into small pieces, frozen in liquid nitrogen, and stored at −80 °C prior to RNA extraction.

RNA isolation and library preparation for RNA-Seq analysis
Total RNA extraction, library construction, and Illumina sequencing were performed according to a previous report (25).

De novo transcriptome assembly
Reads were obtained using Illumina's next-generation sequencing instruments, Genome Analyzer IIx (in 2012), HiSeq 1000 (in 2012), and HiSeq 1500 in Rapid mode (in 2013), as 100-bp paired-end reads. The reads were assembled using the commercially available CLC Genomics Workbench version 5.5.1 (CLC bio Japan, Tokyo, Japan), with a minimum contig length of 800 and with scaffolding. A total of 27,127 contigs were obtained. The assembled contigs were also used as queries against the nonredundant protein database using the BLASTX program (E-value threshold, e-5). The total gene values were recalculated using CLC Genomics Workbench version 11.0. The raw RNA-Seq reads obtained in this study have been submitted to the DDBJ Sequence Read Archive (DRA) under accession number DRA007507.

Gene-to-gene correlation analysis using the ConfeitoGUIplus software
To identify the modification enzymes involved in cucurbitacin biosynthesis, we employed the commercial software Con-feitoGUIplus version 1.2.3 developed in the Kazusa DNA Research Institute. The ConfeitoGUIplus version 1.2.3 is standalone software to detect network modules from a correlation network composed of molecular biology multivariate data, based on the Confeito algorithm (26), which allows the adjustment of the network module sizes by modifying a single parameter and can detect elements specifically related to the network modules even when they are weakly correlated. Before the gene-to-gene correlation analysis, we calculated mean averages against the Total Gene values of RNA-Seq total genes from 10 different tissues (old leaves, young leaves, stems, tendrils, male flowers, female flowers, fruits, seedling leaves, seedling stems, and seedling roots) obtained by the three sequencers (GAIIx, HiSeq, and Rapid) for use as an input file for the ConfeitoGU-Iplus software. The correlation network analysis using Con-feitoGUIplus version 1.2.3 was performed with the following parameters: cosine similarity, 4; cosine minimum correlation, 0.5; minimum elements, 2; maximum elements, 50; and solid bold, 0.9 (on false-positive-out (FPO) analysis) and vertex spec- Cucurbitacin biosynthesis in M. charantia ificity threshold, 0.5; cosine correlation threshold, 0.5; maximum elements, 1000; and dots bold, 0.9 (on false-negative-in (FNI) analysis), to obtain gene-to-gene correlation network modules for including the McCBS gene and related genes involved in cucurbitacin biosynthesis. First, we performed an FPO analysis of the Confeito algorithm, resulting in 15 selective contigs ( Fig. 2 and Table S2) from a total of 7,127 contigs. The FPO module contained two P450 genes, two transporter genes, a squalene monooxygenase gene, and 1-deoxy-D-xylulose-5phosphate synthase gene highly correlated with the McCBS gene. 
Next, we performed an FNI analysis of the Confeito algorithm, resulting in an additional five contigs (Fig. 2 and Table S2) from the total 7,127 contigs. The FPO and FNI modules contained three P450 genes in total.
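
The general idea of screening for genes whose expression profiles correlate with a bait gene can be illustrated with a minimal sketch. This is not the Confeito algorithm or the ConfeitoGUIplus implementation. It only shows averaging of per-tissue values across sequencing runs and a plain cosine-similarity cutoff (0.5, matching the correlation threshold quoted above). The function names, gene labels, and expression values are hypothetical.

```python
import math


def mean_profile(runs):
    """Average per-tissue expression values across several sequencing runs.

    `runs` is a list of equal-length lists, one list per run.
    """
    n = len(runs)
    return [sum(values) / n for values in zip(*runs)]


def cosine(u, v):
    """Cosine similarity of two expression profiles (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def correlated_with(bait, profiles, threshold=0.5):
    """Return sorted gene names whose cosine similarity to the bait meets the threshold."""
    reference = profiles[bait]
    return sorted(
        gene
        for gene, profile in profiles.items()
        if gene != bait and cosine(reference, profile) >= threshold
    )


# Hypothetical toy data: three genes measured in four tissues.
toy_profiles = {
    "bait": [10.0, 0.0, 5.0, 0.0],
    "geneA": [20.0, 0.0, 10.0, 0.0],  # same tissue pattern as the bait
    "geneB": [0.0, 8.0, 0.0, 3.0],    # expressed in the other tissues only
}
candidates = correlated_with("bait", toy_profiles)
```

In this toy example only `geneA` passes the cutoff, because its tissue pattern is proportional to that of the bait. A dedicated module-detection method additionally considers the correlations among all genes, which a single-bait cutoff does not.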

Purification and analysis of triterpene products from yeast
The products from transformant GIL747 were extracted with hexane and ethyl acetate (3:1) as described previously (25). The resulting extracts were analyzed by liquid chromatography (LC)-LTQ-FT-ICR-MS (LC: Agilent 1100 series, Agilent, Japan; LTQ-FT-ICR-MS: Thermo Fisher Scientific, Japan). The products were separated using a C18 column (TSKgel ODS-100V; 4.6 × 250 mm, 3 μm; TOSOH Bioscience, Japan) and analyzed in the positive-ionization mode of APCI at high resolution (full mass scans, 100,000), as described previously (25). HPLC analysis was carried out with an LC-2010A HT (Shimadzu, Japan) equipped with a TSKgel ODS-80TM column (4.6 × 150 mm, 5 μm; TOSOH Bioscience, Japan). A low-pressure gradient of acetonitrile (50-100%) over 35 min was followed by 100% acetonitrile for 20 min, at a flow rate of 1.0 ml/min, with the column temperature at 40°C and UV detection at 210 nm. All samples were dissolved in acetone, and 10 μl was injected. To isolate the product triterpenes obtained by the heterologous expression of candidate genes, a large-scale culture was carried out. The yeast cells were collected and disrupted with 20% KOH/50% EtOH under reflux conditions for 30 min, and the products were extracted with hexane and ethyl acetate (3:1) and evaporated to dryness. Compound 2 was extracted from a 3-liter culture of yeast cells expressing McCBS, CYP81AQ19, and LjCPR. The crude extract was applied onto a silica gel column (volume 50 ml) and eluted with hexane and ethyl acetate (7:1) as a solvent. The desired fractions containing compound 2 (5 mg) were collected and evaporated to dryness. A large-scale culture (12 liters) expressing McCBS, CYP88L7, and LjCPR was similarly extracted, and the crude extract was applied onto a silica gel column (volume 50 ml) and eluted with hexane and ethyl acetate (3:1) as a solvent. The desired fractions containing compounds 5 (4 mg) and 6 (3 mg) were collected and evaporated to dryness. 
The crude extract from a large-scale culture (6 liters) expressing McCBS, CYP81AQ19, CYP88L7, and LjCPR was subjected to a silica gel column (volume 50 ml) and eluted with hexane and ethyl acetate (4:1) as a solvent. Two semipurified fractions (fraction 1 (5 mg) containing compounds 10 and 11 and fraction 2 (6 mg) containing compounds 7, 8, and 9) were obtained. Further purification was performed with a preparative HPLC (SSC-3461 pump, Senshu Scientific, Tokyo, Japan) using a Cosmosil 5C18-PAQ column (10 × 250 mm, 5 μm, Nacalai Tesque, Japan) with 70% aqueous acetonitrile (for fraction 1) or 85% aqueous acetonitrile (for fraction 2) as a solvent, with a flow rate of 15 ml/min and UV detection at 210 nm, to obtain 4 mg of 10 and 11 from fraction 1 and 3 mg of 7, 1 mg of 8, and 1 mg of 9 from fraction 2. Compound 12 was purified from a crude extract of an 8-liter culture of yeast cells expressing McCBS, CYP88L8, and LjCPR. The crude extract was applied onto a silica gel column (volume 50 ml) and eluted with hexane and ethyl acetate (6:1) as a solvent. The desired fractions containing compound 12 (4 mg) were collected and evaporated to dryness. Each purified compound was analyzed by NMR using a JEOL ECP-500 (1H at 500 MHz, 13C at 125 MHz) with CDCl3 (99.8% atom 2H, Kanto Chemical, Tokyo, Japan) as a solvent, with the solvent signals at δ 7.26 ppm for 1H and δ 77.0 ppm for 13C as references for chemical shifts.
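
For readers who want to reproduce the analytical HPLC program, the gradient described in this section can be written as a small function. This is a sketch under one assumption that the text does not state explicitly: the 50-100% acetonitrile ramp is taken to be linear over the 35 min. The function name is hypothetical.

```python
def acetonitrile_percent(t_min):
    """Percent acetonitrile at time t_min (minutes) of the analytical run.

    Assumed program: linear ramp from 50% to 100% over 0-35 min,
    then a hold at 100% until 55 min (35 min ramp + 20 min hold).
    """
    ramp_min = 35.0
    hold_min = 20.0
    if t_min < 0 or t_min > ramp_min + hold_min:
        raise ValueError("time is outside the 55 min program")
    if t_min <= ramp_min:
        return 50.0 + (100.0 - 50.0) * t_min / ramp_min
    return 100.0


# Solvent composition at the start, ramp midpoint, ramp end, and end of the hold.
checkpoints = [acetonitrile_percent(t) for t in (0, 17.5, 35, 55)]
```

Under the linearity assumption, the mobile phase reaches 75% acetonitrile at the midpoint of the ramp (17.5 min).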

Determination of absolute configuration of compound 2 using a modified Mosher's method
Two μl of DMAP and (S)-MTPA or (R)-MTPA was added to 2 mg of compound 2 in dichloromethane. After 30 min at room temperature, 5 μl of diisopropylamine was added to quench the reaction. The (S)- or (R)-esterified products were purified by preparative TLC, and 1H NMR spectra were measured. The absolute configuration of 2 was determined by calculating the difference in chemical shifts on 1H NMR spectra between the (S)- and (R)-esterified products.
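
The shift-difference calculation at the core of this analysis can be sketched as follows. In the modified Mosher's method, the difference Δδ = δ(S-ester) − δ(R-ester) is computed for each assigned proton, and the distribution of positive and negative values on either side of the carbinol carbon is then used to assign the configuration. The code below only performs the arithmetic and the sign grouping. The proton labels and chemical shifts are hypothetical placeholders, not data from this study.

```python
def delta_delta(shifts_s, shifts_r):
    """Return delta(S) - delta(R) in ppm for protons assigned in both esters."""
    return {
        proton: round(shifts_s[proton] - shifts_r[proton], 3)
        for proton in shifts_s
        if proton in shifts_r
    }


def split_by_sign(differences):
    """Group proton labels by the sign of their shift difference."""
    positive = sorted(p for p, v in differences.items() if v > 0)
    negative = sorted(p for p, v in differences.items() if v < 0)
    return positive, negative


# Hypothetical 1H chemical shifts (ppm) for two protons flanking the carbinol carbon.
s_ester = {"Ha": 1.50, "Hb": 5.10}
r_ester = {"Ha": 1.45, "Hb": 5.18}

differences = delta_delta(s_ester, r_ester)
positive, negative = split_by_sign(differences)
```

In this made-up example, `Ha` has a positive and `Hb` a negative Δδ, that is, the two protons fall on opposite sides of the MTPA plane in the conformational model.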

Isomerization of compound 2 under acidic conditions
Two mg of 2 was dissolved in 0.5 ml of 0.12 N HCl/MeOH and left to stand overnight at 4°C. After the reaction, the products were evaporated to dryness and purified by silica gel column chromatography.

Expression analysis in M. charantia
cDNAs derived from the six organs of M. charantia described above were synthesized by the SuperScript III First-Strand Synthesis System for RT-PCR. The quantitative PCR was performed according to the automatic Ct method of the 7900HT Fast Real-Time PCR System (Applied Biosystems, Japan) using DyNAmo HS SYBR Green quantitative PCR kits (Thermo Fisher Scientific). Gene-specific primers are shown in Table S1. Transcription levels of target genes were normalized