Engineering of Yarrowia lipolytica for the production of plant triterpenoids: Asiatic, madecassic, and arjunolic acids

Several plant triterpenoids have valuable pharmaceutical properties, but their production and use are limited because extraction from plants can burden natural resources and results in low yields and purity. Here, we engineered the oleaginous yeast Yarrowia lipolytica to produce three valuable plant triterpenoids (asiatic, madecassic, and arjunolic acids) by fermentation. First, we established the recombinant production of the precursors, ursolic and oleanolic acids, by expressing plant enzymes in free or fused versions in a Y. lipolytica strain previously optimized for squalene production. Engineered strains produced up to 11.6 mg/g DCW ursolic acid or 10.2 mg/g DCW oleanolic acid. The biosynthetic pathway from ursolic acid was extended by expressing the Centella asiatica cytochrome P450 monooxygenases CaCYP716C11p, CaCYP714E19p, and CaCYP716E41p, resulting in the production of trace amounts of asiatic acid and 0.12 mg/g DCW madecassic acid. Expressing the same C. asiatica cytochromes P450 in an oleanolic acid-producing strain resulted in the production of oleanane triterpenoids. Expression of CaCYP716C11p in the oleanolic acid-producing strain yielded 8.9 mg/g DCW maslinic acid. Further expression of a codon-optimized CaCYP714E19p resulted in 4.4 mg/g DCW arjunolic acid. Lastly, arjunolic acid production was increased to 9.1 mg/g DCW by swapping the N-terminal domain of CaCYP714E19p with the N-terminal domain from a Kalopanax septemlobus cytochrome P450. In summary, we have demonstrated the production of asiatic, madecassic, and arjunolic acids in a microbial cell factory. The strains and fermentation processes need to be improved further before the production of these molecules by fermentation can be industrialized.


Introduction
Some natural plant triterpenoids possess anticancer, immunomodulatory, or antimicrobial effects (Bishayee et al., 2011; Ríos, 2010). Extraction from plants can be complicated by the low content of the triterpenoids or by co-extraction of impurities, and it strains natural resources (Englund et al., 2015; Idris and Mohd Nadzir, 2021; Mora-Pale et al., 2014). Furthermore, the application or ingestion of plant extracts may cause adverse effects in some people (Gomes et al., 2010; Jorge and Jorge, 2005). An alternative method for manufacturing these compounds is fermentation of engineered microbes that express the specific plant biosynthetic enzymes. Triterpenoids are produced mainly through the mevalonate pathway, which converts acetyl-CoA to the phosphorylated five-carbon (C5) units IPP and DMAPP (Liao et al., 2016). The condensation of IPP and DMAPP molecules leads to the formation of C15 farnesyl diphosphate, which can be enzymatically condensed and dephosphorylated to form C30 squalene. Squalene is converted by squalene epoxidases (SQEp) into 2,3-oxidosqualene, a common precursor of sterols and triterpenoids. Specialized triterpenoids produced by plants may confer resistance to predators or inhibit the growth of competing plants (González-Coloma et al., 2011; Wang et al., 2014). Many triterpenoids also have beneficial effects on humans and animals. For example, the aglycones asiatic and madecassic acid and their glycosylated derivatives asiaticoside and madecassicoside are commonly found in extracts of Centella asiatica (gotu kola), which are widely used in cosmetics (Bylka et al., 2013). The extracts of C. asiatica can positively affect conditions like diabetes, obesity, neurological and cardiovascular diseases, and skin wounds, and the effects are often attributed to the aforementioned triterpenoids (Sun et al., 2020).
Asiatic and madecassic acid are formed by cyclization of 2,3-oxidosqualene into α-amyrin, which is probably then carboxylated and hydroxylated by cytochromes P450 (Fig. 1) (Andre et al., 2016; Kim et al., 2018; Miettinen et al., 2017). Likewise, β-amyrin is formed by cyclization of 2,3-oxidosqualene by terpenoid synthases (Hayashi et al., 2001). Several rounds of oxygenation by cytochromes P450 can convert β-amyrin into hederagenin, maslinic acid, and arjunolic acid, which all possess useful properties (Kim et al., 2018). Arjunolic acid, found in the arjun tree Terminalia arjuna, has antidiabetic, cardioprotective, and anti-inflammatory properties (Ghosh and Sil, 2013; Hemalatha et al., 2010; Ramesh et al., 2012).
We aimed to produce valuable α- and β-amyrin-derived triterpenoids in the oleaginous yeast Yarrowia lipolytica. This yeast has an inherently high acetyl-CoA flux, a sequenced genome, a comprehensive toolkit for genetic modification, and a history of safe use, with several Y. lipolytica strains having received GRAS status (Christen and Sauer, 2011; Dujon et al., 2004; Holkenbrink et al., 2018; Turck et al., 2019). Furthermore, Y. lipolytica accumulates high amounts of lipids, which can be useful for storing hydrophobic triterpenoids (Beopoulos et al., 2009). Moreover, Y. lipolytica has already been used to produce many terpenoids such as astaxanthin, betulinic acid, oleanolic acid, and gibberellin phytohormones (Jin et al., 2019; Kildegaard et al., 2017, 2021). Therefore, we sought to leverage the benefits of Y. lipolytica and modern engineering tools to construct triterpenoid yeast cell factories.

Strains and media
The strains used in this study are listed in Supplementary Table S1. A pre-engineered Y. lipolytica strain with a high flux towards 2,3-oxidosqualene biosynthesis was used to construct the strains in this study (Arnesen et al., 2020). The strain was previously constructed from the W29-derived strain ST6512 (MATa ku70Δ::PrTEF1-Cas9-TTef12::PrGPD-DsdA-TLip2) that expressed Cas9p for CRISPR-based DNA integration (Holkenbrink et al., 2018; Marella et al., 2020). In turn, ST6512 was based on the Y. lipolytica strain Y-63746 (MATa), a kind gift from the ARS Culture Collection, NCAUR, USA. Plasmid construction was done with the Escherichia coli DH5α strain, which was cultivated on lysogeny broth (LB) media with 100 mg/L ampicillin at 37 °C and 300 rpm shaking. The Y. lipolytica cells were cultivated at 30 °C on media containing 10 g/L yeast extract, 20 g/L peptone, and 20 g/L glucose (YPD), with 20 g/L agar added for solid media. Hygromycin (400 mg/L) or nourseothricin (250 mg/L) was added to the media for yeast cell selection. For yeast cultivations, YPD media with 80 g/L glucose (YPD80) instead of 20 g/L was used. All chemicals were purchased from Sigma-Aldrich, unless otherwise noted. Nourseothricin was obtained from Jena BioScience GmbH (Germany).

Plasmids
Supplementary Tables S2, S4, and S5 list the plasmids, BioBricks, and primers, respectively, used in this study. Primer3 was used to design some primers (Untergasser et al., 2012). The BioBricks were PCR-amplified with Phusion U Polymerase (Thermo Scientific), and USER cloning was used to assemble the EasyCloneYALI plasmids that were transformed into E. coli (Holkenbrink et al., 2018). Correct plasmid assembly was verified by sequencing. The synthetic genes encoding GgBAS (GenBank: Q9MB42.1), MdOSC1m (GenBank: ACM89977.1) with the mutations N11T/P250H/P373A based on Yu et al. (2020), MtCYP716A12 (GenBank: CBN88268.1), KsCYP72A397 (GenBank: ALO23113.1), CaCYP716C11 (GenBank: AOG74835.1), CaCYP714E19 (GenBank mRNA sequence: KT004520.1), and CaCYP716E41 (GenBank: AOG74834.1) were ordered from Thermo Fisher Scientific. The genes were codon-optimized for Y. lipolytica, and the algorithms used for codon-optimization are listed in Supplementary Table S3. The DNA sequences of the synthetic genes can be found in the supplementary materials. For some of the genes, we used a codon-optimization algorithm based on codon usage of highly expressed genes (IhOP). The algorithm uses a custom codon usage table calculated from the 100 most highly expressed genes in the previously published Y. lipolytica RNA-seq dataset (Dahlin et al., 2019). The algorithm was designed to preferentially select the most abundant codon for a given amino acid, and to allow the selection of alternative codons to work around sequence constraints, such as avoiding restriction sites, polynucleotide stretches, and high or low GC content. The IhOP algorithm and the codon usage table are available via GitHub (https://github.com/CfB-YME/YALI_opt). The sequence of AtATR2 was described in a previous study (Kildegaard et al., 2021). The EMBOSS Needle global pairwise sequence alignment algorithm was used to align nucleotide sequences (Madeira et al., 2019).
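The codon-selection principle described above (prefer the most abundant codon, fall back to an alternative when a sequence constraint is violated) can be illustrated with a minimal sketch. This is not the IhOP implementation from the YALI_opt repository: the codon frequencies, the forbidden site, and the function names below are hypothetical, and only one kind of constraint (a restriction site) is checked.

```python
# Toy illustration only. The frequencies are hypothetical and are NOT the
# codon usage table derived from the Y. lipolytica RNA-seq data.
CODON_USAGE = {
    "M": {"ATG": 1.00},
    "L": {"CTG": 0.45, "CTC": 0.30, "CTT": 0.15, "TTG": 0.10},
    "Q": {"CAG": 0.80, "CAA": 0.20},
    "K": {"AAG": 0.85, "AAA": 0.15},
    "*": {"TAA": 0.60, "TAG": 0.40},
}

# Hypothetical constraint: avoid creating a PstI recognition site.
FORBIDDEN_SITES = ("CTGCAG",)


def optimize(protein, usage=CODON_USAGE, forbidden=FORBIDDEN_SITES):
    """Greedily pick the most frequent codon that creates no forbidden site."""
    dna = ""
    for aa in protein:
        # Codons for this amino acid, most abundant first.
        ranked = sorted(usage[aa], key=usage[aa].get, reverse=True)
        choice = ranked[0]  # fallback if every codon violates a constraint
        for codon in ranked:
            candidate = dna + codon
            # A new site must end inside the codon just added, so only the
            # tail of the candidate sequence needs to be inspected.
            if not any(site in candidate[-(len(site) + 2):] for site in forbidden):
                choice = codon
                break
        dna += choice
    return dna


def translate(dna, usage=CODON_USAGE):
    """Back-translate with the same toy table to check the protein is unchanged."""
    codon_to_aa = {c: aa for aa, codons in usage.items() for c in codons}
    return "".join(codon_to_aa[dna[i:i + 3]] for i in range(0, len(dna), 3))


optimized = optimize("MLQK*")
print(optimized, translate(optimized))
```

In this toy case the preferred codons for L and Q (CTG and CAG) would together form the forbidden site, so the sketch falls back to the second-ranked codon for Q while leaving the encoded protein unchanged. A greedy pass like this cannot look ahead, which is one reason a real optimizer would need a more elaborate search.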

Strain construction
Supplementary Table S1 lists the yeast strains used in this study. The integrative plasmids were NotI-digested prior to transformation, which was based on a previously described lithium acetate protocol (Holkenbrink et al., 2018). Colony PCR with primers complementary to the plasmid and adjacent genomic region was used to confirm genomic integration.

Cultivation and metabolite extraction
Precultures were made by inoculating 2.5 mL YPD media in 24-deep-well plates with an air-penetrable lid (EnzyScreen, NL) with a yeast colony and growing the cultures overnight at 30 °C with shaking at 300 rpm. Then, 2.5 mL of YPD80 was inoculated from the precultures to an OD600 of 0.1, and the cultures were left to grow at 30 °C with 300 rpm shaking for 72 h. At the end of cultivation, dry cell weight (DCW) was measured by transferring 1 mL culture broth to a pre-weighed 2 mL microcentrifuge tube (Sarstedt), which was centrifuged, and the supernatant was removed. The remaining cell pellets were then dried at 60 °C for a minimum of 72 h and weighed.
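As a minimal sketch of the arithmetic behind this measurement, the DCW per litre follows from the tube masses, and a volumetric titer can then be converted into a biomass-specific yield. The tube masses and function names are hypothetical; the titer of 182.0 mg/L and the yield of 9.1 mg/g DCW are the values reported for the best arjunolic acid strain, which together imply a DCW of about 20 g/L.

```python
def dry_cell_weight_g_per_l(tube_empty_mg, tube_dried_mg, sample_volume_ml=1.0):
    """DCW from a pre-weighed tube and the same tube with the dried pellet."""
    pellet_mg = tube_dried_mg - tube_empty_mg
    return pellet_mg / sample_volume_ml  # mg/mL is numerically equal to g/L


def specific_yield_mg_per_g(titer_mg_per_l, dcw_g_per_l):
    """Convert a volumetric titer (mg/L) into a biomass-specific yield (mg/g DCW)."""
    return titer_mg_per_l / dcw_g_per_l


# Hypothetical tube masses chosen to give 20 g/L DCW from a 1 mL sample.
dcw = dry_cell_weight_g_per_l(tube_empty_mg=1000.0, tube_dried_mg=1020.0)
yield_mg_per_g = specific_yield_mg_per_g(titer_mg_per_l=182.0, dcw_g_per_l=dcw)
print(f"DCW = {dcw:.1f} g/L, specific yield = {yield_mg_per_g:.1f} mg/g DCW")
```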
For metabolite extraction, 1 mL of culture broth was transferred to a 2 mL microcentrifuge tube, which was centrifuged, and the supernatant was removed. The cell pellets were washed twice with water before 500 μL of 0.212-0.3 mm acid-washed beads and 1 mL of 99% ethanol were added to the cell pellet. The tubes were then bead-bashed at 6500 rpm three times for 45 s with a 15 s pause on a Precellys®24 homogenizer (Bertin Corp.). The tubes were then centrifuged, and the ethanolic supernatant was sampled for analysis.

Analytical methods
For detection and quantification of ursolic acid, asiatic acid, and madecassic acid, 200 μL ethanolic sample extract and 10 μL of internal standard (cholesterol dissolved in ethanol) were combined and evaporated. Then, 50 μL N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) and 50 μL pyridine were added to the dried samples, which were incubated at 80 °C for 30 min with shaking at 650 rpm in a tabletop heat block. After derivatization, 900 μL hexane was added for quantification of ursolic acid and qualitative analysis of asiatic and madecassic acid. No hexane was added for quantification of madecassic acid.
Quantitative and qualitative analysis of ursolic, asiatic, and madecassic acid was performed with a Thermo Scientific™ ISQ Series single quadrupole GC-MS system with a Thermo Scientific™ TRACE™ 1300 GC system equipped with the Thermo Scientific™ Instant Connect Split/Splitless (SSL) Injector and a CTC Analytical CombiPAL Autosampler.
For quantification of ursolic acid, 1 μL of the hexane-diluted sample was injected into a Thermo Scientific™ TraceGOLD TG-5MS column (30 m × 0.25 mm × 0.25 μm) in splitless mode with a constant helium flow of 1.5 mL/min. For quantification of madecassic acid, 1 μL of undiluted sample was injected, and for qualitative analysis of asiatic and madecassic acid, 10 μL of hexane-diluted sample was injected. The injector was set to 300 °C, and the oven was set to 80 °C, which was held for 1.5 min post-injection. Then, the oven temperature was ramped to 300 °C at 20 °C/min, and the temperature was held for 25.5 min, resulting in a total run time of 38 min. The MS transfer line was set to 250 °C and the MS ion source to 300 °C. The acquisition mode was full scan with an m/z range of 50-800 and a solvent delay of 4 min. The ion source was ExtractaBrite (Thermo Fisher Scientific), the EI ionization was 70 eV, and the acquisition rate was 0.2 s. The data were processed with the Chromeleon 7.2.9 and 7.2.10 software (Thermo Fisher Scientific), and the compound concentrations were calculated from authentic calibration standards.
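The stated total run time can be checked from the oven program itself: the initial hold, the time needed to ramp from 80 °C to 300 °C at 20 °C/min, and the final hold. The helper below is a hypothetical convenience function written for this check, not part of any instrument software.

```python
def run_time_min(initial_temp_c, initial_hold_min, ramps):
    """Total GC oven program time.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) tuples.
    """
    total = initial_hold_min
    temp = initial_temp_c
    for rate, target, hold in ramps:
        total += abs(target - temp) / rate  # time spent ramping
        total += hold                       # hold at the target temperature
        temp = target
    return total


# 80 °C held 1.5 min, then 20 °C/min to 300 °C, held 25.5 min.
total = run_time_min(80, 1.5, [(20, 300, 25.5)])
print(total)  # 1.5 + 11.0 + 25.5 = 38.0 min
```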
Oleanolic acid concentrations were quantified using a Dionex 3000 HPLC system connected to an Orbitrap Fusion Mass Spectrometer (Thermo Fisher Scientific, San Jose, CA). This method was also used to confirm the identity of arjunolic and maslinic acids, and hederagenin in our samples. The chromatographic separation was achieved using a Waters ACQUITY BEH C18 (10 cm × 2.1 mm, 1.7 μm) column equipped with an ACQUITY BEH C18 guard column kept at 40 °C with a flow rate of 0.35 mL/min. The mobile phases consisted of Milli-Q water + 0.1% formic acid (A) and acetonitrile + 0.1% formic acid (B). The initial composition was 2% B, held for 0.8 min, followed by a linear gradient to 5% B over 3.3 min; afterward, 100% B was reached over 10 min and held for 2 min before going back to initial conditions. Re-equilibration time was 2.7 min. The injection volume was set at 1 μL, and underivatized ethanolic cell extracts were injected. The MS(MS) measurement was done in positive heated electrospray ionization (HESI) mode with a voltage of 2500 V, acquiring full MS/MS spectra (Data Dependent Acquisition-driven MS/MS) with an m/z range of 70-1000. The MS1 resolution was set at 120,000 and the MS2 resolution at 30,000. Precursor ions were fragmented by stepped Higher-Energy C-trap dissociation (HCD) using collision energies of 20, 40, and 55 eV. Authentic standards were used to quantify oleanolic acid concentrations and to confirm the identity of arjunolic and maslinic acids, and hederagenin in our samples.
Hederagenin, maslinic acid, and arjunolic acid concentrations were quantified with a Dionex 3000 HPLC system coupled to a diode array detector. To achieve separation, 10 μL of underivatized ethanolic sample was injected into an Agilent Zorbax Eclipse Plus C18 4.6 × 100 mm, 3.5 μm column (Agilent Technologies, Santa Clara, CA, USA) heated to 30 °C. The mobile phase consisted of 0.05% acetic acid (A) and acetonitrile (B). The gradient started at 5% B and followed a linear gradient to 95% B over 8 min. This solvent composition was held for 2 min, after which it was changed immediately to 5% B and held until 12 min. The elution of the compounds was detected at a wavelength of 210 nm. HPLC data were processed using Chromeleon 7.2.9 software, and compound concentrations were calculated from authentic calibration standards.
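The HPLC-DAD gradient described above can be written as a small timetable and interpolated to give the mobile phase composition at any time point. This is a sketch under the assumption that the return to 5% B at 10 min is an instantaneous step, as the wording suggests; the table layout and function name are illustrative only.

```python
# (time in min, % B) break points; the duplicated 10.0 min entry encodes the
# immediate step from 95% B back to 5% B.
GRADIENT = [(0.0, 5.0), (8.0, 95.0), (10.0, 95.0), (10.0, 5.0), (12.0, 5.0)]


def percent_b(t_min, table=GRADIENT):
    """Linearly interpolate % B at time t_min within the gradient table."""
    if t_min < table[0][0] or t_min > table[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(table, table[1:]):
        if t0 <= t_min < t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return table[-1][1]  # t_min equals the final time point


for t in (0.0, 4.0, 9.0, 10.0, 12.0):
    print(t, percent_b(t))
```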

Production of asiatic and madecassic acids
We then extended the biosynthetic pathway from ursolic acid to asiatic acid. Centella asiatica cytochromes P450 CaCYP716C11p and CaCYP714E19p can catalyze C-2 and C-23 hydroxylation of ursolic acid, respectively, to produce asiatic acid (Fig. 1) (Kim et al., 2018; Miettinen et al., 2017). Therefore, CaCYP716C11 and CaCYP714E19 were expressed in the HiUrs strain. GC-MS analysis of cell extracts confirmed the production of asiatic acid based on retention time and MS spectral comparison of the sample peak with an authentic standard (Fig. 3 and Fig. S1). We could not calculate the asiatic acid concentration due to the presence of co-eluting compounds, which were tentatively identified as triterpenoid-type products based on their MS spectra (data not shown). MdOSC1p is known to catalyze the formation of lupeol and β-amyrin, which may also be hydroxylated by CaCYP716C11p and CaCYP714E19p (Andre et al., 2016). Therefore, it is possible that oxygenated lupane and oleanane side products were produced as well. Next, we extended the pathway towards madecassic acid by additional expression of CaCYP716E41p, which can catalyze C-6 hydroxylation of asiatic acid (Miettinen et al., 2017). We were able to confirm the production of madecassic acid in the strain expressing CaCYP716C11, CaCYP714E19, and CaCYP716E41 by GC-MS analysis, based on comparison of the retention time and MS spectrum of the sample and reference standard (Fig. 3 and Fig. S1). Because no tentative triterpenoid-type compounds co-eluted with madecassic acid, its production could be quantified at 3.1 ± 0.03 mg/L or 0.12 ± 0.005 mg/g DCW. To further demonstrate the potential of Y. lipolytica for the production of valuable triterpenoids, we decided to also construct oleanane triterpenoid cell factories, since the same cytochromes P450 can oxygenate oleanane triterpenoid backbones.
Oleanolic acid production by expression of GgBAS and YlSQE, or the fusion construct GgBAS-GSG-TrYlSQEΔ1-37, in a platform strain optimized for triterpenoid production expressing MtCYP716A12 and AtATR2. Averages and standard deviations are based on triplicate cultivations in 24-deep-well plates with YPD80 media. Asterisks indicate statistically significant differences compared to the expression of GgBAS and YlSQE (t-test, critical two-tailed, p < 0.05). n.s., not significant.

Production of arjunolic acid
To extend the pathway from oleanolic acid to other valuable oleananes, we expressed CaCYP716C11p in the HiOle strain. CaCYP716C11p can catalyze the formation of maslinic acid by hydroxylating the C-2 position of oleanolic acid (Miettinen et al., 2017). The new strain (HiMas) produced 162.0 ± 10.0 mg/L or 8.9 ± 0.5 mg/g DCW maslinic acid (Fig. 4A). The presence of maslinic acid in HiMas was confirmed by LC-MS analysis with authentic standard comparison (Fig. S2). We then attempted to extend the pathway towards arjunolic acid, which necessitated the C-23 hydroxylation of maslinic acid. However, while CaCYP714E19p can catalyze the C-23 hydroxylation on oleanane triterpenoids, it can also generate a carboxyl group at the same position. Since the leakage of intermediates into C-23 carboxylated side-products would be undesirable, we attempted to express a cytochrome P450 from Kalopanax septemlobus (KsCYP72A397p) in the HiMas strain. KsCYP72A397p was shown only to produce the C-23 hydroxyl group, which could lead to a more efficient production of arjunolic acid (Han et al., 2018). Surprisingly, the expression of KsCYP72A397 in the HiMas strain did not result in arjunolic acid production (Fig. 4B). Of note, the expressed version of KsCYP72A397 was codon-optimized for Y. lipolytica by an algorithm from an external source (ExOP) (Swainston et al., 2014). The previous report showed that expressing the native KsCYP72A397 sequence cloned from K. septemlobus cDNA in an oleanolic acid-producing S. cerevisiae strain resulted in the C-23 hydroxylation of oleanolic acid into hederagenin (Han et al., 2018). Therefore, we expressed the native sequence (KsCYP72A397_UnOP) that only shared 74.9% nucleotide sequence identity with KsCYP72A397_ExOP. We also tried to optimize the sequence with an in-house developed codon-optimization algorithm based on codon usage of highly expressed genes (KsCYP72A397_IhOP). 
The IhOP-algorithm applies a strict codon usage, based on the codon usage of the 100 highest expressed genes from Y. lipolytica (Dahlin et al., 2019). Using highly favored codons could potentially improve the expression of heterologous genes such as KsCYP72A397. Neither the expression of KsCYP72A397_UnOP nor KsCYP72A397_IhOP in HiMas resulted in arjunolic acid production (Fig. 4B). We then attempted to extend the pathway from maslinic acid to arjunolic acid by expression of CaCYP714E19.
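Sequence identities such as the 74.9% between the native and externally optimized KsCYP72A397 genes come from a global pairwise alignment. The sketch below shows the underlying Needleman-Wunsch idea with a simple match/mismatch score and a linear gap penalty; these scoring values are assumptions for illustration and do not reproduce the substitution matrix and affine gap penalties that EMBOSS Needle uses by default, so percentages from this sketch would not necessarily match the reported ones. The example sequences are made up.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two sequences; returns the two aligned strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            aligned_a.append(a[i - 1]); aligned_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1]); aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-"); aligned_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(aligned_a)), "".join(reversed(aligned_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Made-up toy sequences, not taken from the genes in this study.
print(percent_identity("ATGGCTCTG", "ATGGCCCTG"))
```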
To discern the effect of codon-optimization of CaCYP714E19 on arjunolic acid production, we expressed CaCYP714E19_UnOP, CaCYP714E19_IhOP, which was also used for asiatic and madecassic acid production, or CaCYP714E19_ExOP, which was optimized by another external algorithm (Fath et al., 2011), in HiMas. The expression of CaCYP714E19_UnOP in HiMas resulted in detectable amounts of arjunolic acid (>1 mg/L) (Fig. 4B). While expression of CaCYP714E19_ExOP in HiMas resulted in 29.1 ± 5.5 mg/L and 1.7 ± 0.2 mg/g DCW arjunolic acid, the expression of CaCYP714E19_IhOP led to 75.6 ± 18.8 mg/L and 4.4 ± 0.5 mg/g DCW arjunolic acid (biomass specific yield, p < 0.05). The identity of arjunolic acid was confirmed by LC-MS analysis with an authentic standard (Fig. S3). Interestingly, all maslinic acid was converted in the CaCYP714E19_ExOP- and CaCYP714E19_IhOP-expressing strains, while small amounts of hederagenin could be detected for these strains (Figs. S4 and S5). Furthermore, expression of the KsCYP72A397 and CaCYP714E19 codon variants in HiOle gave a similar pattern for hederagenin: no KsCYP72A397 codon variant led to hederagenin production, whereas all CaCYP714E19 codon variants did (Figs. S6 and S7).
Swapping of cytochrome P450 N-terminal segments has been shown to sometimes increase expression in S. cerevisiae (Cabello-Hurtado et al., 1999). Codon-optimization of only the 5′-end of cytochrome P450 genes has also increased expression in S. cerevisiae (Batard et al., 2000). Therefore, we attempted to replace the NTM-domain of KsCYP72A397_UnOP and KsCYP72A397_ExOP with the one from CaCYP714E19_IhOP or KsCYP72A397_IhOP (Fig. S8). The domains were predicted with InterProScan (Jones et al., 2014). Yet, none of the NTM-swapped KsCYP72A397 variants resulted in arjunolic acid production when expressed in HiMas. Conversely, we investigated whether swapping the NTM-domain of CaCYP714E19_UnOP and CaCYP714E19_ExOP with the NTM-domain from either CaCYP714E19_IhOP or KsCYP72A397_IhOP would affect arjunolic acid production. Replacing the NTM-domain of CaCYP714E19_UnOP with the NTM-domain from either CaCYP714E19_IhOP (Chimera1) or KsCYP72A397_IhOP (Chimera2) seemingly improved arjunolic acid titers compared to CaCYP714E19_UnOP when expressed in HiMas (Fig. 5). More strikingly, replacing the NTM-domain of CaCYP714E19_ExOP with the NTM-domain from either CaCYP714E19_IhOP (Chimera3) or KsCYP72A397_IhOP (Chimera4) gave rise to 105.2 ± 16.0 mg/L and 4.0 ± 0.16 mg/g DCW, or 182.0 ± 19.0 mg/L and 9.1 ± 0.3 mg/g DCW arjunolic acid, respectively. While expression of Chimera3 in HiMas produced arjunolic acid at a level similar to the expression of CaCYP714E19_IhOP (biomass specific yield, p > 0.05), expression of Chimera4 in HiMas produced significantly more arjunolic acid than both the Chimera3- and CaCYP714E19_IhOP-expressing strains (biomass specific yield, p < 0.05). Small amounts of maslinic acid were detected for the Chimera4-expressing strain, but not for the strains expressing CaCYP714E19_ExOP or Chimera3, which suggested that Chimera4p had a 'pulling' effect on the triterpenoid pathway when expressed in HiMas (Fig. S9).
UnOP, unoptimized nucleotide sequence from native host. ExOP, nucleotide sequence codon-optimized with an algorithm from an external source. IhOP, nucleotide sequence codon-optimized with an algorithm based on codon usage of highly expressed genes. (C) Nucleotide sequence identities between the cytochrome P450 genes based on Needleman-Wunsch global alignment. Averages and standard deviations are based on triplicate cultivations in 24-deep-well plates with YPD80 media. Asterisks indicate statistically significant differences (t-test, critical two-tailed, p < 0.05). D, detected.
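The two-tailed comparisons reported above can be approximated from the published summary statistics alone. The sketch below assumes an equal-variance (Student's) t-test on triplicates, with the reported ± values taken as sample standard deviations; the text does not state which t-test variant was used, so this is an assumption. The critical value of 2.776 is the tabulated two-tailed value for 4 degrees of freedom at α = 0.05.

```python
import math

# Tabulated two-tailed critical t for df = 4 (two groups of n = 3), alpha = 0.05.
T_CRIT_DF4 = 2.776


def pooled_t(mean1, sd1, mean2, sd2, n=3):
    """Student's t statistic from summary statistics, equal group sizes assumed."""
    pooled_var = ((n - 1) * sd1 ** 2 + (n - 1) * sd2 ** 2) / (2 * n - 2)
    standard_error = math.sqrt(pooled_var * 2 / n)
    return (mean1 - mean2) / standard_error


def is_significant(t_stat, t_crit=T_CRIT_DF4):
    return abs(t_stat) > t_crit


# Biomass-specific arjunolic acid yields (mg/g DCW) as reported in the text.
ihop = (4.4, 0.5)      # CaCYP714E19_IhOP
chimera3 = (4.0, 0.16)
chimera4 = (9.1, 0.3)

print("Chimera3 vs IhOP:", is_significant(pooled_t(*chimera3, *ihop)))
print("Chimera4 vs IhOP:", is_significant(pooled_t(*chimera4, *ihop)))
print("Chimera4 vs Chimera3:", is_significant(pooled_t(*chimera4, *chimera3)))
```

Under these assumptions the outcome agrees with the text: Chimera3 is not distinguishable from CaCYP714E19_IhOP, whereas Chimera4 differs from both.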

Discussion
Previous studies have demonstrated high production of ursane and oleanane triterpenoids in yeast. For example, 1107.9 mg/L α-amyrin or 384 mg/L maslinic acid was achieved in engineered S. cerevisiae, and 540.7 mg/L oleanolic acid in engineered Y. lipolytica (Dai et al., 2014, 2019; Yu et al., 2020). Like the C30-platform strain used in this study, the above studies featured overexpression of SQS and HMG versions, while ERG20 overexpression was used for oleanolic acid and α-amyrin production, and SQE overexpression was used for α-amyrin and maslinic acid production. This suggests that alleviating rate-limiting MVA-pathway steps and increasing flux towards 2,3-oxidosqualene are potent strategies for developing yeast triterpenoid cell factories. It was also demonstrated that expression of Cucurbita pepo SQE in S. cerevisiae and Nicotiana benthamiana could improve the production of 2,3-oxidosqualene and triterpenoids derived from it. Furthermore, our previous study showed that although the C30-platform strain produced improved amounts of 2,3-oxidosqualene, it still accumulated 262.7 mg/L squalene (Arnesen et al., 2020). Therefore, increasing the copy number of SQE was an obvious strategy to improve triterpenoid production in the C30-platform strain, since the aforementioned studies also utilized multi-expression of triterpenoid and MVA-biosynthetic genes (Dai et al., 2014, 2019; Yu et al., 2020).
Several studies have investigated squalene production in engineered Y. lipolytica strains, demonstrating titers in the range of 205-531.6 mg/L (Arnesen et al., 2020; Gao et al., 2017; Wei et al., 2021). Interestingly, deletion of the peroxisomal biogenesis factor 10 gene (ΔPEX10) or overexpression of the diacylglycerol acyltransferase gene (DGA1) both increased squalene and lipid accumulation considerably. Manipulating fatty acid synthesis and degradation may also be useful for triterpenoid production. Notably, an S. cerevisiae squalene overproducing strain was constructed by targeting the MVA- and squalene biosynthesis pathways to the peroxisomes and improving the peroxisomal ATP, NADPH and acetyl-CoA pools, resulting in 1312.8 mg/L and 284.5 mg/g DCW squalene (G. S. . Hybridization of the peroxisomally engineered strain with a cytoplasmically engineered strain, and fed-batch fermentation of the resulting strain provided 11 g/L squalene. Compartmentalization engineering could also prove advantageous for complex triterpenoid production in Y. lipolytica.
Fusing proteins can improve their expression, folding, and enzymatic activity (Guo et al., 2017). Yet, the fusion of MdOSCmp with trYlSQEΔ1-37p did not benefit ursolic acid production, but fusing GgBASp and trYlSQEΔ1-37p greatly improved oleanolic acid production, which showed that fusion of similar enzymes can lead to different outcomes. SQEp is a flavin adenine dinucleotide (FAD)-containing monooxygenase, and it localizes to the ER and lipid droplets in S. cerevisiae (Leber et al., 1998). Little is known about the exact subcellular location of GgBASp, but the locations of other oxidosqualene cyclases have been investigated. The Panax ginseng dammarenediol synthase and A. thaliana cycloartenol synthase associate mainly with lipid particles and, to a lesser extent, the ER when expressed in S. cerevisiae, despite their lack of transmembrane domains (Liang et al., 2012; Milla et al., 2003). The A. thaliana marneral synthase was shown to localize to the ER (Go et al., 2012). One study indicated physical interaction between Cucurbita pepo cucurbitadienol synthase and the ER-localized C. pepo squalene epoxidase 2 in planta. Likewise, the Ononis spinosa α-onocerin synthase was shown to solubilize in the cytoplasm and interact with the ER-localized O. spinosa squalene epoxidases 1 and 2 in planta. These studies suggest that fusing triterpenoid synthases and SQEps could mimic natural interactions. It would be interesting to investigate how the protein fusion strategies from this study influence the enzymes' subcellular location, expression, and catalytic rate. The length of the linker can also affect the activity of the fusion protein. For example, the short GSG linker was found to be most effective for fusing A. thaliana 4-coumaroyl-CoA ligase and Vitis vinifera stilbene synthase, while using four GSG linkers was best for linking Streptomyces clavuligerus 1,8-cineole synthase and the Citrobacter braakii CYP176A1 (Guo et al., 2017; Wang et al., 2021). Therefore, optimization of linker length could potentially improve the activity of the fusion proteins described in this study.
Expression of C. asiatica cytochromes P450 in the ursolic acid-producing strain led to asiatic and madecassic acid production. To our knowledge, this is the first report of heterologously produced asiatic and madecassic acids. However, since only 3.1 ± 0.03 mg/L madecassic acid was produced, further optimization is needed before heterologous production becomes an industrially viable option.
We also applied the same and similar cytochromes P450 to produce oleananes, initially developing oleanolic acid- and maslinic acid-producing strains. However, pathway extension by expression of any KsCYP72A397 codon variant in HiMas did not lead to the formation of arjunolic acid. KsCYP72A397 PCR-amplified from cDNA had previously been expressed in S. cerevisiae, where it catalyzed the C-23 hydroxylation of oleanolic acid to produce hederagenin (Han et al., 2018). The S. cerevisiae strain was based on the WAT21 strain, which expressed the cytochrome P450 reductase AtATR2p (Urban et al., 1997). Our Y. lipolytica strains also expressed AtATR2p, so the lack of KsCYP72A397p activity is probably not due to insufficient reductase partnering. Furthermore, it may be that some cytochromes P450 work in S. cerevisiae but not in Y. lipolytica. In one report, Vitis vinifera CYP716A17p could catalyze the C-28 carboxylation of β-amyrin into oleanolic acid when expressed in S. cerevisiae microsomes (Fukushima et al., 2011). However, the expression of VvCYP716A17p in Y. lipolytica did not lead to the C-28 carboxylation of lupeol into betulinic acid (Jin et al., 2019). Therefore, the lack of KsCYP72A397p activity could be due to inherent differences between Y. lipolytica and S. cerevisiae.
We improved the production of arjunolic acid by codon optimizing CaCYP714E19 with an algorithm that heavily favored highly expressed Y. lipolytica codons, which outperformed an algorithm from an external source. Other studies have also found that different codon-optimization algorithms lead to variable outcomes in protein expression (Ranaghan et al., 2021). Furthermore, we demonstrate that optimizing the 5' end of CaCYP714E19 with highly favored codons can lead to improved arjunolic acid production. N-terminal codon optimization has previously been used to increase the expression of plant cytochromes P450 in yeast. For example, optimization of the first 18, 39, or 111 nucleotides of the wheat CYP73A17-DNA sequence led to an expression increase in S. cerevisiae proportional to the optimized nucleotide segment length. At the same time, the non-optimized version did not seem to express at all (Batard et al., 2000). Likewise, optimization of the 120 first nucleotides of wheat CYP86A5 also led to increased expression in S. cerevisiae.
Microsomal eukaryotic cytochromes P450 are typically anchored in the ER by an NTM-domain, while the heme catalytic site occurs further towards the C-terminus (Poulos and Johnson, 2015). While the sequence and architecture of the heme catalytic site are important for enzyme function, several studies have demonstrated that some cytochromes P450 retain catalytic activity upon N-terminal truncation (Cosme and Johnson, 2000; Cullin, 1992; Von Wachenfeldt et al., 1997). Therefore, NTM-domain engineering could increase cytochrome P450 expression and activity without altering its basic function. Indeed, N-terminal modifications of eukaryotic cytochromes P450, often featuring truncations, tagging, or substitutions, are common strategies for improving expression in prokaryotic systems (Vazquez-Albacete et al., 2017). However, examples of N-terminal modifications improving cytochrome P450 expression in yeast are scarce. One report showed that when the N-terminus of wheat CYP51 was swapped with the N-terminus from either sorghum or S. cerevisiae CYP51 and both chimeras were expressed in S. cerevisiae, the sorghum-wheat CYP51 chimera had improved expression compared to the yeast-wheat CYP51 chimera (Cabello-Hurtado et al., 1999). Another report demonstrated increased artemisinic aldehyde production in yeast when the first 15 amino acid residues of the Artemisia annua cytochrome P450 and reductase fusion sequence (AaCYP71AV1-CPRp) were replaced with a short N-terminal hydrophilic bovine protein tag. Likewise, our results show that exchanging the NTM-domain of CaCYP714E19_ExOP with that of KsCYP72A397_IhOP led to highly improved arjunolic acid production. This result suggests that N-terminal modification of eukaryotic cytochromes P450 is a potential strategy for increasing the production of small oxygenated molecules in yeast. Expression of the Chimera4 sequence in HiMas produced 182.0 ± 19.0 mg/L arjunolic acid.
Arjunolic acid can be extracted from the bark of Terminalia arjuna with yields ranging from 0.1 to 0.524% w/w (Kalola and Rajani, 2006;Ramesh et al., 2012). Simple ethanolic extraction of our final arjunolic acid-producing strain yielded 9.1 ± 0.3 mg/g DCW (0.91% w/dw), which suggests that engineering Y. lipolytica cell factories may be a promising way to produce arjunolic acid in the future.
In summary, we report the first heterologous production of the triterpenoids asiatic, madecassic, and arjunolic acids. Further improvement of the yeast chassis and fermentation conditions could result in economically viable and sustainable industrial processes for producing complex oleanane and ursane triterpenoids.

Author statements
Jonathan

Funding
The research was funded by the Novo Nordisk Foundation (grant agreements NNF15OC0016592, NNF20CC0035580, and NNF20OC0060809) and by the European Union's Horizon 2020 research and innovation programme under grant agreement No. 760798 (OLEFINE).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.