Combinatorial pathway assembly in yeast

With the emergence of synthetic biology and the vast knowledge about individual biocatalytic reactions, the challenge nowadays is to implement whole natural or synthetic pathways in microorganisms. Beyond simple functional gene expression, this requires balanced enzyme activities throughout the pathway to avoid bottlenecks and to obtain high titers of the desired product. As the optimization of pathways in a specific biological context is often hard to achieve by rational design, combinatorial approaches have been developed to address this issue. Here, current strategies and proofs of concept for combinatorial pathway assembly in yeasts are reviewed. Because it joins multiple DNA fragments efficiently and easily, the yeast Saccharomyces cerevisiae is not only an attractive host for heterologous pathway expression but also a tool for assembling pathways by recombination in vivo.


Introduction
Metabolic engineering and synthetic biology provide powerful tools to modify existing metabolic pathways and to extend them with new functions [1,2]. The aim is to produce valuable molecules in a more economical and ecological manner than, e.g., synthetic organic chemistry. Moreover, they embrace sustainable strategies for the synthesis of entirely new compounds. While the classical metabolic engineering approach has focused on modifying single reactions in an existing pathway or on transferring natural pathways to heterologous industrial hosts, the emergence of sophisticated recombinant DNA technologies and the strong influence of synthetic biology have steered the discipline toward altering the metabolic performance of an organism to a much larger extent [3].
In recent years, a broad variety of DNA assembly technologies has been developed that enable the generation of large pathways from small DNA fragments [4]. In general, these assembly technologies can be divided into two main classes. On the one hand, there are methods such as BioBrick [5] and Golden Gate cloning [6] that build on traditional, but cleverly designed, restriction digestion and ligation. On the other hand, there are methods that harness sequence homology to assemble parts. Examples of these overlap-directed methods are Gibson cloning [7] and in vivo recombination in yeast [8,9]. The latter offers several advantages. In comparison to other methods, a large number of fragments can be assembled in yeast in an efficient and reliable manner. Pathways also constitute rather large constructs; thus, assembly in vivo is beneficial, as transformation efficiency decreases with increasing construct size. In addition, S. cerevisiae can not only be employed as a tool for pathway assembly, but also be exploited directly as a cell factory for the heterologous product.
Nowadays, the technological possibilities to assemble large pieces of DNA from synthetic oligonucleotides are almost unlimited. However, simply putting together multiple genes to build pathways is often not enough to obtain microbial cell factories suitable for industrial applications, since synthetic pathways lack evolved regulation. Thus, the far more important step, which requires a vast amount of effort, time and often also innovation, is to optimize and balance the generated pathways [2,10]. The aim is to maximize the flux through the pathway without the accumulation of intermediates or side products in order to obtain the desired product in industrially relevant amounts with suitable rates and yields. Combinatorial approaches to pathway optimization are of interest for tackling this issue. There are various genetic parts (e.g., genes, promoters,…) that can be mixed and matched into new pathways but, owing to the complexity of biological systems, little knowledge with which to rationally design the best combination. Combinatorial pathway library construction has therefore become an essential tool for metabolic engineering.
In the current review we explore how the efficient homologous recombination machinery of S. cerevisiae can be exploited for combinatorial pathway assembly.

Mechanisms of homologous recombination in yeast
Homologous recombination can be defined as the repair of double-strand breaks using homologous sequences as template. This process has evolved to maintain genome integrity and has been extensively studied using S. cerevisiae as model organism. Currently, two main pathways have been proposed to explain how homologous recombination repairs DNA lesions [11]. The double-strand break repair (DSBR) pathway is used to describe recombination in meiotic cells, predominantly creating crossover products that ensure the accurate segregation of homologs [12]. The synthesis-dependent strand annealing (SDSA) pathway, in contrast, typically generates non-crossovers and describes mitotic DNA repair. Both pathways have been reviewed extensively [11,13,14] and are briefly described in the following.
In the DSBR pathway, the repair process starts with the degradation of the 5´-ends of the double-strand break (DNA resection). The resulting single-stranded DNA ends then invade homologous DNA sequences, thereby forming heteroduplex DNA, and serve as primers for DNA synthesis. After ligation of the newly synthesized DNA to the resected 5´ strands, an intermediate containing two Holliday junctions is formed. Resolution of this joint molecule and mismatch repair results in crossover or non-crossover products, depending on how the double Holliday junction is cut. The first steps of the SDSA pathway are identical to those of the DSBR pathway (i.e., DNA resection, invasion, DNA synthesis). Instead of forming a double Holliday junction, the invading strand is displaced from the D-loop structure. Afterwards, the extended single strand anneals with the other side of the break, yielding a non-crossover repaired DNA product.
Homologous recombination is not the primary repair mechanism in all yeast species. For example, in the methylotrophic yeast Pichia pastoris (Komagataella phaffii) nonhomologous end joining (NHEJ) is predominant, as indicated by the high fraction of randomly integrated gene expression cassettes [15]. In NHEJ, free DNA ends are simply ligated without the need for a homologous template, thus often resulting in illegitimate recombination and chromosomal rearrangements [16]. This difference in recombination mechanisms is especially apparent when the efficiency of targeted integration is compared in these two yeasts. In S. cerevisiae, homologous sequences as short as 38-50 bp on both sides of the integration cassette are sufficient to efficiently target it to the desired locus [17,18]. In contrast, in P. pastoris targeting occurs with an efficiency of <0.1% if the total length of the homologous targeting sequence is <500 bp and with an efficiency of up to 30% if large (>1 kb) regions of homology are used [19,20]. This demonstrates the clear advantages of S. cerevisiae for gene and pathway assembly by homologous recombination compared to other yeasts, where in vitro assembly before yeast transformation is mostly necessary. On the other hand, the ease of homologous recombination on the basis of short homologous regions implies possible challenges with respect to strain stability, especially if identical DNA parts are used repeatedly for strain design and construction. Nevertheless, the efficiency of recombination can be exploited for combinatorial assembly to find the optimal expression construct. Once such a construct has been identified, full synthesis avoiding sequence similarities is a possible way to obtain stable production strains.

Exploiting homologous recombination for pathway assembly in yeast
The efficient recombination machinery of S. cerevisiae has already been exploited for plasmid construction and in vivo cloning applications [21,22] as well as for the assembly of whole bacterial genomes [8]. More recently, methods have been developed that also allow the construction of biochemical multi-gene pathways directly in yeast. Zhao and coworkers established the "DNA assembler" method, which generates large pathways in one step [9]. In this approach, expression cassettes for each individual pathway gene are designed such that they carry overlaps (>40 bp) to the neighboring cassettes and/or overlaps to the vector backbone for plasmid-borne expression. If chromosomal integration of the pathway is desired, the overlaps of the first and last expression cassette share sequence homology with a helper fragment carrying the selection marker and with the targeted integration locus, respectively. Thus, the design of the overlaps determines the order of the expression cassettes in the final construct and would allow combinatorial approaches. S. cerevisiae is then co-transformed with the resulting expression cassettes and the respective vector backbone or helper fragment for in vivo assembly. Using DNA assembler, pathways consisting of three genes (xylose utilization pathway), five genes (zeaxanthin biosynthetic pathway) and eight genes (combination of the xylose utilization and zeaxanthin biosynthetic pathways) have been successfully assembled on a plasmid or on yeast chromosomes [9,23]. Constructs encoding polyketide synthesis pathways were also assembled in yeast in this way and subsequently transformed into a bacterial production host [24]. However, the efficiency of the method dropped significantly with increasing pathway size, depending on the length of the overlap used. This issue was addressed by the work of Kuijpers et al. [25]. In this study, the meganuclease I-SceI was used during pathway assembly to introduce site-specific double-strand breaks at the desired target locus. These breaks were found to be essential for improved integration. The efficiency of assembly and targeted chromosomal integration of a ten-fragment 22 kb construct was thereby increased from 5% to 95%.
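The way terminal overlaps fix cassette order in a DNA assembler-style design can be illustrated with a small sketch. It is a toy string model, not the published protocol: the sequences are random, the 40 bp overlap length is an illustrative value, and the function names (random_dna, add_overlaps, assemble) are hypothetical. Each fragment is given the first 40 bases of its right-hand neighbour as a 3´ tail, the fragments are shuffled, and the construct is rebuilt purely by matching ends.

```python
import random

OVERLAP = 40  # bp shared between neighbouring fragments (illustrative value)


def random_dna(length, rng):
    """Return a random DNA string; stands in for a real expression cassette."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def add_overlaps(cores, left_flank, right_flank):
    """Append to each cassette the first OVERLAP bases of its right-hand neighbour.

    The first fragment is prefixed with left_flank and the last is suffixed with
    right_flank; both flanks stand in for homology to a vector backbone.
    """
    fragments = []
    for i, core in enumerate(cores):
        prefix = left_flank if i == 0 else ""
        tail = cores[i + 1][:OVERLAP] if i + 1 < len(cores) else right_flank
        fragments.append(prefix + core + tail)
    return fragments


def assemble(fragments, left_flank):
    """Rebuild the construct by matching the 3' end of the growing construct
    to the 5' end of one of the remaining fragments."""
    pool = list(fragments)
    construct = next(f for f in pool if f.startswith(left_flank))
    pool.remove(construct)
    while pool:
        end = construct[-OVERLAP:]
        nxt = next(f for f in pool if f.startswith(end))
        pool.remove(nxt)
        construct += nxt[OVERLAP:]  # skip the shared overlap
    return construct


rng = random.Random(0)
cores = [random_dna(200, rng) for _ in range(5)]
left_flank = random_dna(OVERLAP, rng)
right_flank = random_dna(OVERLAP, rng)

fragments = add_overlaps(cores, left_flank, right_flank)
rng.shuffle(fragments)  # co-transformation supplies fragments in no particular order
construct = assemble(fragments, left_flank)
expected = left_flank + "".join(cores) + right_flank
print(construct == expected)
```

In this model the order of the cassettes is encoded entirely in the overlaps, which is why supplying several alternative cassettes with the same pair of overlaps would yield a combinatorial library at that position.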
Another concept for pathway assembly in yeast is represented by "Reiterative Recombination" [26]. In contrast to the methods described above, multi-gene pathways are not generated in one step, but rather by elongating a construct of interest in a stepwise fashion. Such a sequential assembly ensures the proper incorporation of each fragment, but is more time-consuming and, thus, less appealing for combinatorial approaches.

Combinatorial Pathway Assembly
Pathway optimization by the combinatorial assembly of pathway parts can be achieved by several means in yeast (see Figure 2). One possibility is to generate pathway libraries that contain random combinations of homologous genes from different organisms. As the corresponding enzymes often exhibit different properties such as catalytic activity, substrate specificity and stability, a balanced flux through a metabolic pathway can be achieved by finding the optimal enzyme combination. Modulating the transcriptional expression of the pathway genes by employing regulatory elements of varying strength is another option to obtain balanced pathways. Pathways can also be evolved in vivo to obtain mosaic pathways with new or improved properties. The different approaches and proofs of concept are discussed in the following.

Combinatorial assembly of pathway enzymes
In order to produce desired compounds heterologously, a set of metabolic pathway enzymes has to be transferred and properly reconstructed in the expression host. This task might be straightforward if the set of corresponding genes is predefined. However, in terms of pathway optimization it would be of interest to simultaneously test combinations of genes from several different sources to find the optimal combination. In addition, the random assembly of various enzyme activities in a pathway might also lead to the production and discovery of new natural compounds of potential interest.
One strategy to assemble a large number of genes into pathways in a combinatorial fashion is based on expressible Yeast Artificial Chromosomes (eYACs) (see Figure 3) [27]. In this approach, the genes of interest (e.g., homologous pathway genes from different organisms) are cloned into entry vectors in a first step. Restriction enzymes then release linear expression cassettes with compatible sticky ends from the vector backbone, and these are subsequently concatenated by ligation. The resulting cassette concatemers are then extended at both the 5´- and 3´-ends with YAC arms, which carry all elements necessary for chromosomal function as well as auxotrophic selection markers, and are used to transform S. cerevisiae. Naesby et al. employed eYACs to generate a variety of flavonoid pathways [27]. The end product of the seven-step pathway was detected in 8 out of 24 analysed clones. In addition, clones predominantly producing an intermediate of the pathway or flavonoids with an unexpected hydroxylation pattern were identified. These findings indicate the assembly of incomplete pathways and the possibility of the combined action of yeast endogenous and heterologous enzymes, respectively. The average size of the generated eYACs was shown to be 130 kb, corresponding to approximately 50 expression cassettes. This is also one advantage of eYACs in comparison to smaller vector systems: multiple copies of a single gene and/or whole gene families coding for an individual pathway step can be introduced into yeast in a single step. Thereby, the number of clones that need to be screened to identify a first functional pathway can be reduced.
In the eYAC approach the actual random assembly of the pathway genes is performed in vitro. However, this step can also be performed directly in yeast. In this context, Kim et al. employed the concept of the DNA assembler method to generate combinatorial libraries of a fungal xylose utilization pathway in S. cerevisiae [28]. A total of 20 homologous xylose reductases, 22 xylitol dehydrogenases and 19 xylulose kinases were randomly assembled and expressed on plasmids. Thereby, strains were generated that were not only able to use xylose as their sole carbon source, but that also displayed a balanced pathway flux, indicated by a complete reversal of the major product from xylitol to ethanol.
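The scale of such a library follows from simple counting, sketched below. The variant numbers (20, 22 and 19) are taken from the text; the coverage estimate is an added assumption, namely that every combination is equally likely and that clones are sampled independently, which real libraries only approximate. The function names are hypothetical.

```python
import math


def library_size(variants_per_step):
    """Number of distinct pathways when one variant is chosen per step."""
    return math.prod(variants_per_step)


def clones_for_coverage(size, probability):
    """Clones to screen so that a given combination is sampled at least once
    with the stated probability, assuming uniform, independent sampling."""
    return math.ceil(math.log(1 - probability) / math.log(1 - 1 / size))


# 20 xylose reductases x 22 xylitol dehydrogenases x 19 xylulose kinases
size = library_size([20, 22, 19])
print(size)  # 8360
print(clones_for_coverage(size, 0.95))
```

Under these assumptions, roughly three times the library size has to be screened to sample a particular combination with 95% probability, which illustrates why an efficient screening system is needed alongside the assembly method.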
Besides using genes from various sources, combinatorial pathway libraries can also be constructed from genes that were subjected to error-prone PCR or other mutagenesis techniques [29]. In this study, which also employed the DNA assembler strategy, cellobiose utilization in S. cerevisiae was improved by simultaneously evolving two proteins in the pathway.

Figure 3. Random pathway assembly employing expressible Yeast Artificial Chromosomes (eYACs). Genes of interest (red) are first cloned into entry vectors that contain appropriate promoters (dark green) and terminators for yeast expression. Expression cassettes are released from the vector backbone (black) via double digestion, leaving cassettes with compatible sticky ends. Cassettes are randomly concatenated by in vitro ligation. Subsequently, YAC arms (blue) supplying all necessary elements for the functional assembly of an artificial chromosome are added.
For pathway optimization it might also be of interest to have the individual genes of the pathway present in different copy numbers. Having multiple copies of genes that are involved in rate-limiting steps can circumvent bottlenecks in the flux [30,31]. In the eYAC approach it is possible to have multiple copies of a gene present in the pathway, while it is difficult to adjust the gene copy number employing DNA assembler. To address this issue, strategies that exploit δ-integration for combinatorial pathway expression have been developed [32,33,34]. Targeting expression cassettes to sites of retrotransposon elements can result in multiple integration events, as these sequences are highly abundant in the S. cerevisiae genome (>100 copies) [35]. Thus, a pathway can be assembled by simultaneous and multiple integrations of expression cassettes harbouring the individual pathway genes. In the strategy developed by Yuan et al., the frequency of integration events can be modulated by simply changing the concentration of the antibiotic used for selection [34].

Combinatorial assembly of regulatory elements
After having defined a set of pathway enzymes whose concerted action results in the synthesis of the desired product, an important task is to express the corresponding genes at appropriately balanced levels. Thus, one can avoid metabolic burdens due to the overexpression of certain genes, the accumulation of toxic or unstable pathway intermediates or other bottlenecks that otherwise result in growth inhibition and/or suboptimal product yields. Fine-tuning pathway gene expression can be achieved by adjusting the corresponding gene copy numbers as described above. Employing different regulatory elements such as promoters with varying strength and regulatory profiles is another strategy to balance expression. In this context, combinatorial approaches are desirable to find the best promoter-gene combinations for each constituent of the pathway, as it is nearly impossible to predict them.
Du et al. developed a method named "customized optimization of metabolic pathways by combinatorial transcriptional engineering (COMPACTER)" that allows the simultaneous optimization of multiple genes in a given pathway in S. cerevisiae (see schematic representation in Figure 2, panel B) [36]. In this approach, each pathway gene is first linked with a distinct set of promoters that display varying levels of expression strength. Using the DNA assembler method, pathway libraries are then generated that theoretically contain all possible combinations of expression levels for each individual pathway step. COMPACTER was successfully used to optimize pathways for biofuel production. For example, fine-tuning a cellobiose utilization pathway consisting of two enzymes resulted in an engineered yeast strain that displayed a 5.4-fold faster cellobiose consumption rate and 5.3-fold higher ethanol productivity in comparison to a strain that harbored the same pathway enzymes under the control of the wild-type promoters. Interestingly, performing COMPACTER on the same pathway but in different S. cerevisiae strains resulted in different optimized versions, indicating a context dependency of heterologous pathways.
In a very recent work, Boeke and coworkers performed a combinatorial fine-tuning of pathway expression by a method called VEGAS (versatile genetic assembly system) (see Figure 4) [37]. In this approach, transcription units (i.e., coding sequences flanked by promoter and terminator regulatory elements) are homologously recombined in yeast via so-called VEGAS adapters (VAs). These adapters consist of 57 distinct base pairs that are orthogonal in sequence with respect to the yeast genome. The actual transcription units with the VAs are assembled beforehand in vitro employing yeast Golden Gate (yGG) cloning [38]. For this purpose, each part of the pathway first needs to be cloned into a vector suitable for type IIS restriction cloning. The capacity of this method for combinatorial assembly was demonstrated for the β-carotene biosynthetic pathway comprising four enzymes. Using yGG, a library for each transcription unit was constructed by randomly combining ten different promoters and five different terminators with the corresponding pathway gene. Recombining these transcription unit libraries in yeast yielded a broad variety of strains that all displayed different titers of β-carotene and its precursors [37].
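The combinatorial space opened up by this design can be sketched as follows. Only the numbers (ten promoters, five terminators, four pathway genes) come from the text; the part and gene names are hypothetical placeholders, and the random draw is a simplification that assumes each transcription unit variant is equally likely to end up in an assembled pathway.

```python
import itertools
import math
import random

# Placeholder part names; the actual promoters, terminators and genes are not listed here.
promoters = [f"P{i}" for i in range(1, 11)]    # ten promoters
terminators = [f"T{i}" for i in range(1, 6)]   # five terminators
genes = ["geneA", "geneB", "geneC", "geneD"]   # four pathway genes

# One transcription unit (TU) library per gene: every promoter-terminator pairing.
tu_libraries = {
    gene: [(p, gene, t) for p, t in itertools.product(promoters, terminators)]
    for gene in genes
}

tus_per_gene = len(tu_libraries["geneA"])
n_pathways = math.prod(len(lib) for lib in tu_libraries.values())
print(tus_per_gene)  # 50
print(n_pathways)    # 6250000

# Model one assembled pathway as one random TU variant per gene.
rng = random.Random(1)
pathway = [rng.choice(tu_libraries[gene]) for gene in genes]
print(pathway)
```

With 50 variants per transcription unit, four genes already give 50^4 = 6,250,000 possible pathways, so any screen samples only a small part of the theoretical library.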

Mosaic pathways
Employing only naturally occurring parts for pathway construction might not always result in the desired product and/or in optimal product yields. Besides the use of mutagenized promoter libraries or individually evolved enzymes, it is also possible to exploit the recombination machinery of yeast to generate highly diverse pathway libraries.
The ability of S. cerevisiae to generate mosaic libraries of single genes has already been shown [39,40,41]. Luque and colleagues went one step further and developed a method in which DNA repair deficient yeast strains are employed to assemble and simultaneously recombine whole pathways [42]. Intragenic mosaic pathways were generated by providing non-identical genes as the sole substrate for recombination. Thus, in contrast to the other methods described here, homeologous recombination was exploited for pathway generation. The feasibility of this approach was demonstrated by the generation of mosaic libraries of a flavonoid pathway. For seven out of the eight pathway enzymes, homeologous counterparts sharing a sequence identity of 75-91% were identified and set up as expression cassettes for recombination. The resulting libraries showed a high degree of diversity, the chimeric gene pattern being different among all the analyzed strains. Up to 30 intergenic sequence exchanges were identified within one pathway. The length of DNA exchanges caused by the recombination events between the two similar sequences ranged from single nucleotides up to 900 nucleotides per mosaic gene. All analyzed clones exhibited a functional pathway, indicating that the structure of the open reading frames and the functionality of the resulting proteins were conserved. In addition, mosaic pathways were identified that resulted in higher concentrations of flavonoid metabolites in comparison to pathways that were assembled from the corresponding wild-type genes. Thus, this method constitutes an exciting tool for the in vivo evolution of pathways.
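Two notions used above, sequence identity between homeologous genes and a mosaic gene built from alternating parental segments, can be made concrete with a toy example. This is only a string-level illustration under strong assumptions: the parents are already aligned and of equal length, the crossover positions are given rather than produced by any repair mechanism, and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)


def mosaic(seq_a, seq_b, crossovers):
    """Build a chimera that starts with seq_a and switches parent at each crossover."""
    parents = (seq_a, seq_b)
    current, previous, pieces = 0, 0, []
    for position in sorted(crossovers):
        pieces.append(parents[current][previous:position])
        current ^= 1  # switch to the other parent
        previous = position
    pieces.append(parents[current][previous:])
    return "".join(pieces)


parent_a = "AAAAAAAAAA"
parent_b = "AAAACAAAAC"
print(percent_identity(parent_a, parent_b))  # 80.0
print(mosaic(parent_a, parent_b, [3, 7]))    # AAAACAAAAA
```

Because the parents are aligned and switching happens at matching positions, the chimera keeps the parental length, which mirrors the observation that the open reading frames of the mosaic genes were conserved.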

Conclusions and Outlook
Combinatorial pathway assembly is a powerful tool to generate and identify pathways with improved flux that allow the production of valuable molecules in recombinant microorganisms. Besides the availability of a reliable high-throughput screening system, efficient means to generate such combinatorial pathway libraries are a prerequisite. Many examples show that the latter can be achieved by exploiting the efficient recombination machinery of S. cerevisiae. Table 1 summarizes the features of currently available methods for combinatorial pathway assembly in yeast. Most of them have only been developed in recent years, reflecting the novelty and timeliness of this research topic. Successful examples obtained by these approaches will promote the development of further strategies and concepts.
Recently, it has been shown that a polycistronic organization of multi-gene pathways is a feasible strategy to express heterologous pathways in yeasts [43,44]. Coordinated expression was achieved by employing self-processing 2A peptides that enable the production of distinct proteins from a single transcript (for a comprehensive review see [45]). The 2A peptides have been used for simultaneous co-expression of the four genes of the β-carotene pathway as well as the five genes of the violacein pathway. The order of the genes within the polycistron had a strong effect on pathway efficiency [40]. This finding might be exploited for pathway optimization in the future. By employing the short DNA sequences coding for the 2A peptides as universal linkers, shuffled libraries containing the pathway genes in variable order and quantity can be generated (see schematic representation in Figure 2, panel D). Grouping pathway genes into operons also opens the possibility to fine-tune gene expression by generating libraries of tunable intergenic regions (TIGRs). The combinatorial recombination of post-transcriptional control elements was shown to balance the expression of a heterologous mevalonate biosynthesis pathway, thereby increasing the mevalonate production in E. coli by a factor of seven [46].

Table 1. Features of currently available methods for combinatorial pathway assembly in yeast.
Method | Description | Application | Advantages (+) and disadvantages (-) | Reference
… | … | … | … | [27]
COMPACTER | Promoter variants are assembled with pathway genes into a library of pathways using the DNA assembler method | Tuning of gene expression in pathways | + Simultaneous optimization of gene expression levels | [36]
VEGAS | Combination of yeast Golden Gate with in vivo recombination to assemble pathways from transcription units | Tuning of gene expression in pathways | + Simultaneous optimization of gene expression levels; - Time and labour intensive for first assemblies | [37]
Mosaic pathways | Assembly and recombination of similar, but not identical genes in DNA repair deficient strains by homeologous recombination | In vivo pathway evolution | + High pathway diversity; - Additional crossing step with DNA repair proficient yeast is required to obtain a genetically stable production strain | [42]

Figure 1. Schematic illustration of DNA double-strand break (DSB) repair mechanisms. DSBs can be repaired by double-strand break repair (DSBR) and synthesis-dependent strand annealing (SDSA), both being homologous recombination-mediated pathways. Another DSB repair mechanism is nonhomologous end joining (NHEJ). In NHEJ the break ends are directly ligated without the need for a homologous template.

Figure 2. Possibilities of combinatorial pathway assembly by in vivo recombination in yeast. (a) Pathways can be assembled combinatorially by employing several genes originating from different sources for each catalytic step. (b) Regulatory elements, e.g., promoters of varying strength, can also be shuffled during the generation of pathways, allowing fine-tuning of the pathway flux via transcriptional modulation. (c) Pathways can also be assembled and simultaneously recombined in S. cerevisiae, resulting in intragenic mosaic pathways. (d) In principle, pathways might also be designed in a polycistronic format for expression in yeast by employing self-processing 2A peptides. This strategy provides new opportunities for future pathway optimization by randomly arranging the individual pathway genes in a polycistron.

Figure 4. Pathway assembly employing the versatile genetic assembly system (VEGAS). In a first step, transcription units (TUs) are assembled using yeast Golden Gate (yGG). Each TU consists of a coding sequence (CDS), a promoter (PR) of varying strength (indicated by the size of the promoter) and a terminator (TER) and is flanked by a left and right VEGAS adapter (LVA and RVA, respectively). The four base-pair overhangs generated by BsaI restriction digest and used for TU assembly are indicated. The acceptor vector contains an expression cassette coding for a fluorescent protein that is replaced when a TU assembles correctly. In addition, it harbors a resistance marker (RM) distinct from the ones present in the vectors harboring the individual parts. The combinatorial library of TUs is then either released from the vector backbone via BsmBI digestion or amplified via PCR using VEGAS adapter (VA) specific primers. The final pathway is then assembled in yeast by homologous recombination between the VAs that flank the TUs.