Universal loop assembly: open, efficient and cross-kingdom DNA fabrication

Abstract Standardized type IIS DNA assembly methods are becoming essential for biological engineering and research. These methods are becoming widespread and more accessible due to the proposition of a ‘common syntax’ that enables higher interoperability between DNA libraries. Currently, Golden Gate (GG)-based assembly systems, originally implemented in host-specific vectors, are being made compatible with multiple organisms. We have recently developed the GG-based Loop assembly system for plants, which uses a small library and an intuitive strategy for hierarchical fabrication of large DNA constructs (>30 kb). Here, we describe ‘universal Loop’ (uLoop) assembly, a system based on Loop assembly for use in potentially any organism of choice. This design permits the use of a compact number of plasmids (two sets of four odd and even vectors), which are utilized repeatedly in alternating steps. The elements required for transformation/maintenance in target organisms are also assembled as standardized parts, enabling customization of host-specific plasmids. Decoupling of the Loop assembly logic from the host-specific propagation elements enables universal DNA assembly that retains high efficiency regardless of the final host. As a proof-of-concept, we show the engineering of multigene expression vectors in diatoms, yeast, plants and bacteria. These resources are available through the OpenMTA for unrestricted sharing and open access.


Introduction
Methods for DNA assembly have been instrumental in the pursuit of understanding the genetic code and our capacity to engineer biological functions. Standardized methods for DNA assembly have gained popularity in recent years, due to their accessibility, reliability and amenability to automation (1)(2)(3). Among these, Type IIS restriction endonuclease-based assembly systems have been adopted in different fields of biology due to their high efficiency and versatility (4)(5)(6)(7)(8)(9)(10)(11)(12).
The development of Type IIS assembly systems has enabled the establishment of a 'common syntax' that allows basal DNA parts from different Type IIS assembly systems to be interoperable (8,13). This has led to a vast library of domesticated DNA parts available for use. As a consequence of the increased need for sharing and exchange, the universal biological material transfer agreement has been revised (14) and an open material transfer agreement is being proposed to facilitate and expedite the sharing and exchange of plasmids (15). Furthermore, the need for IP-free DNA assembly systems to enable open access to DNA fabrication has been addressed, as in the case of the Loop assembly method developed previously by the authors (10).
Loop assembly is a recursive DNA assembly system, based on Golden Gate (5), that enables reliable assembly of DNA fragments of exponentially increasing size. The use of inverted orientations of BsaI and SapI recognition sites in receiver vectors for odd and even levels of assembly (i.e. L1, L2, L3 . . .) allows the product of one level to become the substrate for following level. This design permits the use of a compact number of plasmids (two sets of four odd and even vectors), which are utilized repeatedly in alternating steps. Thus, Loop assembly provides a simple system for the parallel composition of four genetic modules per vector and assembly level. Despite its efficiency and simplicity, Loop assembly was originally developed in binary vector backbones used for Agrobacterium-mediated transformation of plants, thus its use has been largely restricted to plant biology. Similarly, most Type IIS assembly systems have been restricted to particular organisms or taxonomic groups due to the presence of host-specific elements in the plasmid backbones required for their propagation in the host; e.g. yeast (16), plants (6,11), mammalian (17) and microalgae (18).
Historically, elements required for transformation, conjugation and transfection of DNA into organisms of choice have been treated as integral components of plasmid backbones used for cloning and assembly. These elements are refactored from natural systems, such as the T-DNA from the tumor-inducing (Ti) plasmid of Agrobacterium tumefasciens, the origin of transfer (oriT) from the RK2 plasmid for bacterial conjugation, or autonomous replicating sequences (ARS) that enable extrachromosomal replication of plasmids in some eukaryotes. While well-established model systems such as yeast and plants can be manipulated with these engineered vectors from seminal work, new model organisms often require the development of new vector systems. For example, diatom transformation was implemented by the combination of conjugation machinery from the RK2 plasmid and ARS from yeast in order to enable bacterial conjugation and posterior replication of the conjugated DNA as an extrachromosomal episome (19). The use and reuse of plasmids are not generally possible without significant alteration due to the lack of modularity and standards in plasmid engineering.
Thus, elements required for transformation into organisms of choice and sequences designed for cloning and plasmid assembly have remained coupled in DNA assembly systems.
While no distinction has been made between these two plasmid functions, now it is possible to address them separately. Decoupling of the DNA assembly logic from elements required for DNA transfer and maintenance in the organism of choice can yield more compact vectors, which can be engineered and characterized for efficient multipart assembly in Escherichia coli. Elements required for transformation can be modularized and included during the assembly routine alongside with other transcription units (TUs) to generate species-specific plasmids capable of transfer and propagation in the organism of interest. This strategy would enable the use of the same assembly vectors in different organisms, already engineered and characterized for multipart assembly.
Here, we describe Universal Loop (uLoop) assembly, a system for open, efficient and cross-kingdom DNA fabrication. We have decoupled the assembly logic of Loop assembly from the hostspecific propagation elements to enable universal DNA assembly that retains high efficiency regardless of the final host. We utilized three vectors broadly used in the synthetic biology community as well as the pCAMBIA vector (https://cambia.org) in order to implement the Loop assembly logic (10). In this strategy, elements for propagation in the host are included in the assembly routine for their subsequent use in the organism of interest. We demonstrate the capability of uLoop to generate vectors containing elements for conjugation and show their function through multi-spectral fluorescent protein expression in the diatom Phaeodactylum tricornutum, the yeast Saccharomyces cerevisiae, protoplasts of Arabidopsis thaliana and in E. coli.

Plasmid construction
The uLoop vector kits were generated using Gibson assembly (20). First, vectors pCAMBIA, pJT170 (21), pSB4K5 (22) and pAN3945 (23) were re-annotated to determine the position of the ORI and the antibiotic resistance cassette. Then, additional elements not related to basic plasmid propagation were removed in silico, with the exception of pSB4K5 where prefix and suffix flanking terminator sequences were kept, and pJT170 where the rop gene was maintained. Sequences were screened for BsaI and SapI sites and then assemblies were designed to remove these restriction sites and other non-essential elements by Gibson assembly. Resulting plasmids were then used to generate a secondary version with an alternate antibiotic resistance marker (spectinomycin resistance for pCA and pCO vectors in even levels; chloramphenicol for pSB and pAN vectors in even levels). Finally, the Loop assembly schema was introduced into each vector backbone using gBlocks through Gibson assembly, generating the uLoop vector kits. uLoop vector kits were then used for Loop assembly.
DNA parts were domesticated into L0 using the pL0R-lacZ vector, an entry vector generated to domesticate sequences using SapI.
These resources are available from the corresponding authors on request. They are also available at Addgene.

Type IIS assembly
The Loop Type IIS assembly protocol was adapted from Pollak et al. (10) and can be found at https://www.protocols.io/view/ loop-and-uloop-assembly-yxnfxme. An aliquot of 15 fmol of each part to be assembled was mixed with 7.5 fmol of the receiver plasmid in a final volume of 5 ml with distilled H2O (dH2O). The reaction master mix for odd-level reactions was prepared using 3 ml of dH2O, 1 ml of T4 DNA ligase buffer 10Â (New England Biolabs), 0.5 ml of 1 mg ml À1 purified bovine serum albumin (1:20 dilution in dH2O of BSA, Molecular Biology Grade 20 mg ml À1 , New England Biolabs), 0.25 ml of T4 DNA ligase at 400 U ml À1 (New England Biolabs) and 0.25 ml of BsaI (New England Biolabs) at 10 U ml À1 , on ice. For even-level reactions, the reaction master mix was prepared using 3.5 ml of dH2O, 0.5 ml of T4 DNA ligase buffer 10Â, 0.5 ml of CutSmart buffer 10Â (New England Biolabs), 0.25 ml of T4 DNA ligase at 400 U ml À1 and 0.25 ml of SapI (New England Biolabs) at 10 U ml À1 . Then, 5 ml of the reaction mix was combined with 5 ml of DNA mix for a reaction volume of 10 ml by pipetting and incubated in a thermocycler as described previously (10). Next, 1 ml of the reaction mixture was added to 50 ml of chemically competent TOP10 cells (Thermo Fisher Scientific), incubated at 42 C for 30 s and left on ice for 5 min. A volume of 250 ml of Super Optimal broth with Catabolite repression (SOC) medium was added and cells were left incubating at 37 C for 1 h. Finally, cells were plated onto selective Lysogeny broth (LB)-agar plates supplemented with 1 mM Isopropyl b-D-1-thiogalactopyranoside (IPTG) (Sigma-Aldrich) and 50 mg ml À1 of 5-bromo-4-chloro-3-indolyl-b-D-galactopyranoside (X-Gal) (Sigma-Aldrich).

Escherichia coli strain construction
Strain AKR1 was created using the GeneBridges 'Quick & Easy Conditional Knockout Kit' following the manufacturer's protocol. In order to make the linear DNA for chromosomal insertion, the FRT-PGK-gb3-hyg-FRT cassette, containing hygromycin resistance, was cloned adjacent to a constitutively expressed mRFP1, and the resulting DNA was further flanked by insulating terminators. The final insertion cassette was amplified by PCR with 75-bp DNA primers containing 50 bp ends designed to be homologous to a chromosomal region within the arsB locus, a previously characterized insertion site (24).
For the chromosomal insertion, E. coli TOP10 cells were transformed with plasmid pRedET, which harbored inducible Lambda Red machinery. The cells were induced and electroporated with purified DNA containing the insertion cassette. After recovery, transformants were plated onto LB plates containing 100 lg ml À1 hygromycin (Duchefa Biochemie). Chromosomal insertion was verified by Sanger sequencing of linear DNA amplified from 200 bp up-and downstream of the insertion site (Source Biosciences). Hygromycin resistance was subsequently removed with the pCl-FLPe plasmid to produce the final strain AKR1, which showed constitutive expression of mRFP1. Correct antibiotic resistance removal was verified by Sanger sequencing of the genomic region.

Productivity of assembly
Productivity of assembly was obtained as the number of resulting colonies from transforming 1 ml of reaction into 50 ml of AKR1 (E. coli TOP10-derived) homemade chemically competent cells with a competence of 2.5 Â 10 8 CFU mg À1 of pUC19. After heat shock, 250 ml of SOC media was added to each transformation and incubated for 1 h at 37 C in a shaking incubator at 200 RPM. A volume of 10 ml of recovered cells was plated into LB plates supplemented with antibiotics, X-Gal and IPTG, and productivity of assembly was obtained by multiplying the number of colonies in the plate by 30.

Escherichia coli flow cytometry
Cells were inoculated overnight in M9 with antibiotics. A 1:50 dilution into fresh M9 media with antibiotics was performed in 96 well plates and samples were left incubating for 3 h at 37 C in a shaking incubator at 200 RPM. Samples were then measured in a BD FACS Aria II device (BD Biosciences) using a blue laser (ex. 488 nm) and a 530/30 nm filter for sfGFP and a 616/23 nm filter for mRFP1, recording 100 000 events for each sample.

Escherichia coli time-course plate fluorometry
Cell cultures of each plasmid transformed into AKR1 cells were grown for 8 h in M9-glucose media with kanamycin, spectinomycin or chloramphenicol antibiotics according to the resistance cassette of each plasmid. The cultures were diluted in 1:1000 in M9-glucose medium and six wells of flat bottom 96well black plates (Nunc) were loaded with 200 ml for each one. In the same way, six wells were loaded with AKR1 (GFP background), M9-glucose (OD600 background) and the plasmid pCAL1-sfGFP transformed into TOP10 cells (RFP background). Three experimental replicates were performed on different days (i.e. 18 wells per strain: 3 days with six technical replicates).
Plates were incubated at 37 C in a Synergy HTX Plate Reader with continuous orbital shaking (282 cycles per minute) to perform the growth kinetics assays. Measurements of OD600 and fluorescence (RFP ex: 585 em: 620/GFP ex: 485 em: 516) were made every 10 min for 24 h.
The data were analyzed using Python routines available in our repository (https://github.com/Prosimio/Plate_reader_analy sis) to obtain relevant parameters such as fluorescent expression and growth rate (see 'Plate fluorometry data analysis' on Supplementary data for further details).

Large-scale DNA assembly and CHEF gel
Four L3 fragments (L3-1_allx4 L3-2_allx4, L3-3_allx4, L3-4_sfGFP) were assembled in the presence of SapI and T4 DNA ligase under multiple conditions. A control reaction excluded T4 ligase to test for oligomerization. Other reaction conditions tested included all four fragments with full SapI/T4 DNA ligase mix, with variations in numbers of 16 C/37 C cycles (25Â versus 50Â) and differing amounts of T4 DNA ligase (1Â, 10 U ml À1 or 2Â, 20 U ml À1 ). A reaction 1Â master mix (0.5 ml T4 DNA ligase buffer 10Â, 0.5 ml CutSmart buffer 10Â, 3.25 ml ddH20, 0.5 ml of SapI at 10 U ml À1 , and 0.25 ml or 0.5 ml of T4 DNA ligase at 400 U mL À1 ) was combined with a mixture of 1 ml ddH20 and 1 ml of each L3 plasmid for a total volume of 10 ml. Pulsed-field electrophoresis was carried out using a 1% agarose gel in 0.5Â TBE buffer on the CHEF-DR III system (Bio-Rad). The entire 10 ml of each above-described reaction was mixed with 10 ml of molten (55 C) 1% low-melting point agarose (Thermo Fisher Scientific) and quickly pipetted into the appropriate lane after the gel was placed in the CHEF-DR III system and submerged in excess 0.5Â TBE. One mm slices of the Midrange PFG markers (New England Biolabs) were placed into the flanking lanes as size standards. Electrophoresis was carried out with the following parameters: initial switch time 1 s, final switch time 12 s, run time 12 h, 6 V cm À1 and a 120 angle. After electrophoresis, the gel was stained with a 0.5 mg ml À1 ethidium bromide solution in ddH20 for 30 min before imaging.

Diatom transformation
Sub-strains CCMP632 and RCC2967 of P. tricornutum were obtained, respectively, from the National Center for Marine Algae and Protozoa (Bigelow Laboratories, Maine, USA) and Roscoff Culture Collection (Station Biologique de Roscoff, CNRS, Roscoff, France) and maintained in L1 medium prepared according to the NCMA recipe (https://ncma.bigelow.org). Conjugation of episomes testing de novo assembly methods into P. tricornutum was carried out using the multi-well method described in Diner et al. (25). In brief, chemically competent E coli DH5a cells harboring the pTA-MOB conjugation plasmid (26) were transformed with 10 ng of L2 episomes by heat shock at 42 C, then plating outgrowths on LB medium containing 50 mg ml À1 kanamycin and 20 mg ml À1 gentamicin. Plates were incubated overnight at 37 C. Colonies from each plate were picked the next day and grown at 37 C overnight in liquid LB þ 50 mg ml À1 kanamycin and 20 mg ml À1 gentamicin. These outgrowths were used to make glycerol stocks for storage at À80 C. Four days prior to conjugation, wild-type P. tricornutum cells were plated on 12-well tissue culture plates (Corning) filled with 3.5 ml of 1 =2 L1 þ 5% LB agar (250 ll of 1 Â 10 8 cells ml À1 cell resuspension, final cell count of 2.5 Â 10 7 cells/well) and incubated for 96 h on a 14:8 diel cycle at 22 C with a light intensity of $50 mmol photons m 2 s À1 . The day before the conjugation, tubes containing LB þ 50 mg ml À1 kanamycin and 20 mg ml À1 gentamicin were inoculated with scrapings from the glycerol stock tubes and allowed to grow overnight at 37 C. The conjugation, incubation and selection plating on 1 =2 L1 þ 20 mg ml À1 phleomycin (InvivoGen) or 50 mg ml À1 zeocin were carried out exactly as described in Diner et al. (25). Exconjugants were observed on selective plates after 10 days incubation at 22 C or 20 days at 20 C and picked directly into liquid L1 medium supplemented with 20 mg ml À1 phleomycin or 50 mg ml À1 zeocin for outgrowth of diatom colonies.
Episomes for testing the stability and repeatability of expression of uLoop plasmids (Section 6) were assembled with uLoop parts and transformed into PTA-MOB E coli via heat shock at 42 C as described above. Selection of colonies employed appropriate antibiotics (depending on receiver backbone used in assembly) and 20 mg ml À1 gentamicin and glycerol stocks were generated as described before. Conjugation into Phaeodactylum strains was carried out using the method described in Karas et al. (19) and plated on selective 1 =2 L1 medium containing 50 mg ml À1 zeocin and 50 mg ml À1 ampicillin. Exconjugant colonies were visible on 1 =2 L1-Zeo50 plates after 10-12 days incubation at 20 C under constant light.

Yeast transformation
Saccharomyces cerevisiae cells were transformed following the lithium acetate/single-stranded carrier DNA/polyethylene glycol method (28). The transformation was made according to the specifications for single plasmid by the addition of PEG 4000 50% (w/v), LiAC 1.0 M, single-stranded carrier DNA (2.0 mg ml À1 ) and plasmid DNA plus sterile water (400 ng total DNA). Prior to the heat shock, cells were incubated at 30 C for 30 min. The heat shock was performed at 42 C for 30 min. Next, cells were plated on CSM-URA plates and incubated at 30 C for 2-3 days.

Confocal microscopy
Multispectral fluorescence microscopy was performed on a Leica TCS SP5 confocal microscope. mTagBFP2 was excited with a 405 nm laser, mTurquoise2 was excited with a 458 nm laser and Venus was excited with a 514 nm laser, collecting fluorescence in sequential mode using appropriate emission windows for each fluorophore (mTagBFP2, 450-470 nm; mTurquoise2, 470-490 nm; Venus, 520-540 nm). Images were then loaded into ImageJ and composed into a single multi-channel image 8-bit image. Contrast was auto-adjusted for each independent channel to span the range of the pixel intensity histogram.

Flow cytometric analysis and cell sorting
Flow cytometric analysis of P. tricornutum grown in liquid culture was performed with an InFlux Cell Sorter (BD Biosciences) equipped with 405, 488 and 640 nm lasers. Forward and side scatter and red fluorescence (692/40 nm filter) was excited by the 488 nm laser while blue fluorescence (460/50 nm filter) excited by the 405 nm laser. Sheath fluid was 3% NaCl, using an 85-mm nozzle, with sheath and sample pressures at 16 and 17 psi, respectively. Laser alignment was checked using 3 lm Ultra Rainbow fluorescent particles, RCP-30-5A (Spherotech). Cytometer settings were maintained between WT and exconjugate P. tricornutum colonies analyzed on the same day. Between weeks, gains for fluorescence detectors were sometimes changed, but the relationship between signal and gain was determined using fluorescent particles and fluorescence was adjusted accordingly. Also, fluorescence of mTagBFP2-expressing exconjugates was always normalized with respect to exconjugates expressing the nonfluorescent protein LuxR and grown in the same conditions and run in parallel, to control for any potential variability in blue autofluorescence (from photosynthetic pigments, their degradation products, or other metabolites) related to episome maintenance and growth in antibiotics. In sorting experiments, 1 million blue-fluorescent cells were sorted into 3 ml of L1 media with zeocin and transferred to fresh medium approximately once a week.

uLoop system and library
uLoop was developed by decoupling plasmid propagation elements from the Loop assembly logic ( Figure 1A). These propagation elements were converted and used as L0 parts during the uLoop assembly routine to create species-specific vectors from backbones optimized for efficient assembly ( Figure 1B). Four uLoop vectors kits were constructed (see Methods section) and used for implementing the Loop assembly logic ( Figure 1C, Supplementary Figure S1) in derivatives of pCAMBIA, pJT170 (21), pSB4K5 (22) and pAN3945 (23). These vectors were modified by removing BsaI and SapI sites through a single nucleotide mutation of the recognition sequences. Vectors were minimized by removing sequences not related to plasmid replication and maintenance functions in E. coli (Supplementary Figure S2). Then, the Loop assembly schema was introduced into each vector backbone to create four versions of uLoop vectors ( Figure 1D). uLoop vector kits were named according to the vector of provenance from which they were derived or their origin of replication: pCA (pCAMBIA-derived), pCO (ColE1 ori), pSB (pSB4K5-derived) and pAN (pAN3945-derived). Each kit version contains two sets of four vectors: four receiver plasmids of the odd levels (e.g. pCAo-1, pCAo-2, pCAo-3 and pCAo-4) and four receiver plasmids of the even levels (e.g. pCAe-1, pCAe-2, pCAe-3 and pCAe-4). All odd levels of uLoop plasmids use kanamycin resistance for selection, and even levels use spectinomycin (pCA and pCO) or chloramphenicol resistance (pAN and pSB).
The Loop assembly schema (10) was introduced to uLoop vectors with the only exception that in uLoop, all plasmids are flanked by the unique nucleotide sequences UNS1 (upstream) and UNSX (downstream) as described in Torella et al. (21), to enable PCR verification and sequencing of all plasmids with two standard oligonucleotides.

Efficiency and productivity of assembly
Vector kits were tested for their capability to assemble DNA through the Loop assembly method. An assay using a sfGFP bacterial expression cassette was used to provide a visible phenotypic marker for tracking colonies likely to contain the assembled construct. The presence of lacZ expressing colonies (negative selection marker) would indicate undigested template while colonies not expressing sfGFP would report misassemblies. Based on this, we determined that efficiency of assembly, calculated as the percentage of sfGFP over total colonies not expressing the lacZ marker (white colonies), would provide a simple metric to compare the vector sets. We also counted the number of colonies present in each assembly, to compare productivity of assembly between the vector kits (Supplementary Figure S3).
Test assemblies were performed to characterize efficiency of assembly as the number of TUs per vector was increased.
Assemblies were performed for a bacterial expression cassette of sfGFP in level 1 (L1-4_sfGFP), for a construct composed of three plant expression cassettes plus L1-4_sfGFP in level 2 (L2-4_sfGFP) and for a construct containing 12 plant expression cassettes plus the 4TU L2-4_sfGFP construct in level 3 (L3-4_sfGFP). Assemblies were performed into the fourth position of odd and even receivers of each kit (Figure 2A). For the L1-4_sfGFP construct, L0 parts were used (AB_J23101, BC_B0034m, CE_sfGFP and EF_B0015); for the L2-4_sfGFP construct, the L1-4_sfGFP plasmid was used along with three plant expression cassettes (pL1-1_35SmR3, pL1-2_35SmT2 and pL1-3_35SVe) used in our previous work for testing assembly efficiency (10); and the L3-4_sfGFP was assembled from three constructs containing four plant expression cassettes each (pL1-1_35SmR3, pL1-2_35SmT2, pL1-3_35SVe and pL1-4_35SmR3) and the L2-4_sfGFP construct (Figure 2A). After transformation, plates were scored for presence of colonies exhibiting blue coloration due to LacZ activity and fluorescence when imaged under UV light (Supplementary Figure S3). Levels L1 (3-4 kb) and L2 (9-10 kb) assemblies exhibited sfGFP expressing colonies predominantly over LacZ expressing colonies and no colonies lacking either sfGFP or lacZ expression were observed in any vector kit. For L3 (16 TUs, 32-33 kb), efficiency of assembly was 91% for pSB, 92% for pAN, 95% for pCA and 100% for pCO ( Figure 2B). Greater differences were observed in the total number of non-negative colonies per plate where pSB and pCA vector kits showed substantially more colonies than pCO and pAN vectors (i.e. $2.2-fold more pSB colonies than pAN colonies in L1 and $2.5-fold in L2). Productivity in L3 was also reduced, averaging only 2.8% of the total number of colonies produced in L1 assemblies.

Assembly integrity
Colonies showing a sfGFP phenotype were further tested to evaluate assembly integrity. Eight colonies per vector kit and level of assembly were analyzed by means of restriction digest profiling. Resulting profiles from agarose gels were then scored against the expected restriction pattern for correct assemblies ( Table 1, Supplementary Figure S4A-C). L1 assemblies showed 87.5% of expected profiles for pCA and pCO vectors and 100% for pSB and pAN vectors. L2 assemblies showed 100% correct profiles for all vector kits. L3 assemblies evidenced a higher level of variability between vector kits, where the lowest percentage was exhibited by pCO and pSB (62.5%), followed by pCA (75%) and the highest percent by pAN (87.5%). L3 assemblies were repeated using BsaI-HFv2 and restriction profiles for sfGFP expressing colonies were assessed. Assembly integrity for BsaI-HFv2 did not seem to vary substantially from what was observed with BsaI, except for a slight improvement obtained for pCA, pCO and pAN (Table 1, Supplementary Figure S4D).
We also tested different enzymes concentrations and reaction lengths for BsaI and SapI in an in vitro reaction assay that involved the assembly of four fragments. This was performed in the absence of a receiver vector to distinguish a full-length linear fragment by agarose gel electrophoresis. We used 0.25 U ml À1 or 0.5 U ml À1 of restriction endonuclease concentration in either a short cycle (1 min at 37 C and 1.5 min at 16 C) or a regular cycle (3 min at 37 C and 4 min at 16 C) for the 25 cycle reaction incubation. Resulting gels were analyzed and the expected bands were quantified to determine formation of product in evaluated conditions. For BsaI, the condition that generated the highest amount of product was 0.5 U ml À1 of BsaI with the regular cycling conditions. For SapI, use of 0.5 U ml À1 increased the formation of product, but no discernible differences were obtained from the short and regular cycle (Supplementary Figure S5).

Large-scale DNA assembly
In vitro reactions for large-scale DNA assembly were performed to assemble a 126 kb fragment (64 TUs) from four L3 parts, three containing 16 plant expression cassettes measuring $32 kb each and L3-4_sfGFP. The reaction was performed using either 25 or 50 cycles of the assembly protocol (3 min at 37 C and 4 min at 16 C), varying the concentration of T4 DNA ligase to evaluate if this would enhance the formation of the intended product. Pulsed-field gel electrophoresis was conducted to visualize the reaction products, where it was possible to observe the 126-kb tetramer fragment in all reactions, including the monomeric, dimeric and trimeric conformations of the parts (Figure 3). Excess ligase did not seem to improve the reaction. This reaction was also conducted in the presence of a pCA even receiver vector and transformed into E. coli; however, it was not possible to recover the intact plasmid from transformant colonies (data not shown).

Performance of plasmids in E. coli
Performance of plasmids was evaluated in an AKR1 E. coli, a TOP10-derived strain harboring a chromosomally integrated mRFP1 under constitutive expression, by means of time-course plate fluorometry and flow cytometry in mid-exponential phase. Plasmid-borne sfGFP expression was measured against mRFP1 expression as a proxy for plasmid versus genomic expression. L1, L2 and L3 levels for each vector set were contrasted to determine variability of expression in relation to plasmid size. Plasmid size varied from 3-4 kb in level 1 (only one sfGFP cassette) to 32-33 kb in level 3 (15 eukaryotic TUs and 1 sfGFP cassette), representing a $9-fold difference in plasmid size.
Plasmid-borne sfGFP to chromosomal mRFP1 ratio of expression measured by plate fluorometry and cytometry was consistent qualitatively (changes were in agreement in both assays). At a quantitative level, ratios varied according to detection BsaI overhangs follow the common syntax (8,13), and SapI overhangs are the same as those described in Loop assembly (10), a(ATG), b(GCA), c(TAC), e(CAG) and x(GGT). (D) Source of uLoop vector kits. Vectors from the synthetic biology community and the pCAMBIA vector were domesticated for BsaI and SapI and 'minimized' by removing elements not related to basic plasmid function. Two antibiotic resistance versions were generated and then the Loop schema was incorporated to generate odd and even plasmids for each version of vector kits. parameters and thus cannot be directly compared. Although some of the vector kits showed significant variation between levels, these were not correlated with plasmid size ( Figure 4A and B, Supplementary Table S1). Average sfGFP/mRFP1 fluorescence ratios for fluorometry and flow cytometry were highest for pSB vectors (4.51, 1.67), followed by pCA vectors (4.27, 1.50), then pCO vectors (2.37, 1.37) and finally pAN with the lowest average ratios (0.22, 0.57). A one-way ANOVA was conducted to compare the effect of plasmid size on plate fluorometry greento-red fluorescent ratios of level L1, L2 and L3 (all vector kits grouped together). This analysis showed no statistically significant differences among L1, L2 and L3 (Supplementary Table S2). However, an ANOVA test comparing levels within each kit showed statistically significant differences across levels of the pCA kit (Supplementary Table S3). A Tukey's honestly significant difference post hoc test showed that plate fluorometry green-to-red fluorescent ratios of level 3 were significantly lower than those of L2 and L1 for the pCA plasmid kit (Supplementary Table S4). Similar results were found for pCA comparing values obtained from flow cytometry experiments (Supplementary Tables S5 and S6). In addition, similar analysis on flow cytometry data showed statistically significant higher ratios for level 2 of pAN with respect to level 1 and 3 of the same kit (Supplementary Tables S5 and S6, Figure 4A and B). We calculated the coefficient of variability (CV) for plate fluorometry and flow cytometry readings for each kit as a measure of internal variation. The lowest CV was found in pSB plasmids with a 15% and 21%, followed by pCO plasmids with a 23% and 19%, pCA plasmids with 36% and 41%, and finally pAN plasmids with

Performance of plasmids in P. tricornutum
To test the stability and repeatability of expression of uLoop plasmids in an eukaryotic background, a plasmid with mTagBFP2 under the control of histone H4 (pH4) promoter in the pSB vector, including oriT, the yeast centromere CEN-ARS (which allows plasmid maintenance in diatoms, (19)), and the pleomycin-resistance gene shBle (pSBL2-1_Pt-B). Separate conjugations were performed on P. tricornutum sub-strains CCMP632 and RCC2967 (both originating from strain CCAP1055/1). Eleven exconjugate colonies (six from CCMP632 and five from RCC2967) were tested from each conjugation and found to retain blue fluorescence for over 4 months while maintained in liquid culture. At all times tested, exconjugates displayed multi-modal histograms of blue fluorescence, indicating different expression phenotypes co-occurred in culture. The lowest mode had blue fluorescence that was indistinguishable from wild-type P. tricornutum and exconjugates containing plasmids without mTagBFP2, while the highest modes showed over 50Â greater blue fluorescence (Supplementary Figure S6). Interestingly, the proportion of blue-fluorescence cells increased over time in four of the six exconjugates where <50% of cells showed blue fluorescence in the initial test (Table 2). Moreover, when high bluefluorescent cells were separated by cell sorting and cultured, they retained the high blue-fluorescence phenotype when retested up to 6 weeks later (Supplementary Figure S7, Table 2).

Library of parts and nomenclature
A core library of parts was generated to enable the use of uLoop plasmids in different organisms. The library was generated using the common syntax for standardized DNA components (13).
Parts domesticated for this work include promoters, coding sequences and terminators for the composition of TUs for bacteria, yeast, P. tricornutum and A. thaliana. Other parts not functionally described as standardized components (e.g. origins of transfer, centromeres, TUs encoding resistance genes for selection) were domesticated using common syntax overhangs for their composition. L0 parts, as well as L1 and L2 assemblies, used in this work along with meta-information are listed in Supplementary Table S10-S12, respectively. Furthermore, a virtual repository is currently under development to access information and parts' documentation in https://www.uloop.org.
To denominate L0 parts and their composability, we established a simple nomenclature describing overhangs flanking each part. Parts' names are preceded by letters to depict the overhangs that flank them (e.g. AC_CEN-ARS-HIS, CE_oriT, EF_PtBle), being A & F terminal overhangs. We used a constrained implementation of the common syntax at the part level to limit redundancy in the library. For promoters, the A and C overhangs were used, while A and B overhangs were used when a N-terminal fusion was required. B and C overhangs were used for N-terminal tags. For CDSs, C and D overhangs were used. The STOP codon (if present) was removed and GC nucleotides were included to encode alanine and glycine with the D overhang. For C-terminal tags, D and E overhangs were used except when no C-terminal tag was required, in which case a DE_3xSTOP codon part was used. For 3 0 UTRs and terminators, E and F overhangs were used.

Use of uLoop vectors across biological kingdoms
To demonstrate the universality of uLoop across different kingdoms, vectors containing transformation/transfection/conjugation elements were generated to allow transfer of DNA into different target organisms. Elements used in current transformation methods for each specific target organism were included in the vectors during the assembly routine along with TUs encoding fluorescent proteins. Then, vectors were transferred to the target organisms using described protocols and their function was assessed by fluorescence microscopy. A pCAL2-1_FPrep vector for transformation into P. tricornutum was constructed by hierarchical assembly. A module containing the origin of transfer (oriT), the yeast centromere CEN-ARS-HIS and a bleomycin resistance cassette (shBle) was assembled along with two fluorescent proteins tagged with localization signals and a cytoplasmic fluorescent reporter. The fluorescent reporters corresponded to a pH4-driven (29) mTurquoise2 fluorescent protein fused to a mitochondrial localization tag (30), a pNRdriven (31) Venus fluorescent protein fused to a peroxisomal localization tag (32) and a p49202-driven (V. Bielinski and C. Dupont, unpublished results) cytoplasmic mTagBFP2 fluorescent protein. The native terminator corresponding to each promoter was used. The plasmid was transformed into an E. coli strain containing pTA-MOB and then conjugated into P. tricornutum, using 20 mg ml À1 phleomycin as selection. Exconjugant colonies were propagated into liquid culture and then imaged using laser-scanning confocal microscopy. Microscopy inspection showed expression of all fluorescent reporters where Venus fluorescence was localized in discrete puncta in the vicinity of the nucleus; mTurquoise2 was found in elongated structures across the cell and mTagBFP2 was localized throughout the cytoplasm ( Figure 5A, Supplementary Figure S8A).
To evaluate the functionality of a multi-TU construct in a plant organism, a transient gene expression system using A. thaliana mesophyll protoplasts was used (27). Protoplasts were transfected with a construct encoding four constitutive fluorescent reporters in pCA. Reporters used correspond to a cytoplasmic mVenus, mTagBFP2 fluorescent protein fused to a Lti6b membrane localization tag (33), an mTurquoise2 fused to an N7 nuclear localization tag (33) and a cytoplasmic mRuby3. In all cases, fluorescent proteins were driven by the A. thaliana UBQ10 promoter, which provides stable expression of fluorescent reporters in plants (34). Transfected protoplasts were examined through fluorescence microscopy where mRuby3 fluorescence was found in the cytoplasm, mTurquoise2 fluorescence was concentrated only in the nucleus, and mVenus and mTurquoise2 fluorescence had similar localization patterns in the area surrounding the nucleus and in the membrane ( Figure 5B, Supplementary Figure S8B).
Next, we tested uLoop vectors in yeast. Two vectors (pCAL2-1_Yeast-mT2 and pSBL2-1_Yeast-B) were constructed in pSB and pCA receivers to evaluate their functionality in S. cerevisiae. These vectors contained an origin of replication for yeast (2 u or CEN), a selection marker (URA3) and a single fluorescent protein (mTurquoise2 or mTagBFP2). The expression of the fluorescent markers was confirmed by fluorescence microscopy ( Figure 5C). Finally, we tested uLoop vectors in E. coli for multi-spectral fluorescent protein expression. We used two sets of four constitutively driven fluorescent proteins: mBeRFP, EYFP, mTagBFP2, and either sfGFP or mTurquoise2 assembled on a pCA or a pSB vector. These four L2 constructs (pCAL2-1_RYBG, pCAL2-1_RYBmT2, pSBL2-1_RYBG and pSBL2-1_RYBmT2) containing different combinations of constitutively driven fluorescent proteins were inspected by fluorescence microscopy, where expression of all fluorescent reporters was observed ( Figure 5D, Supplementary Figure S8C).

Discussion
Here, we describe the uLoop suite of vectors for universal DNA assembly. These vector kits, derived from plasmids widely used in synthetic biology, provide a repertoire of ori options for plasmid copy-number control in E. coli, as well as a modular mechanism for incorporating propagation elements for the customization of vectors depending on the destination organism. The use of pCambia vectors was due to their freedom-tooperate policy while the use of pSB4K5 backbone was motivated by the prospect of providing resources to the international Genetic Engineering Machine (iGEM) community, which incentivizes the advancement of synthetic biology through a worldwide student competition and open access to genetic resources. In this respect, the pSB vector kit could facilitate the exploration of new model organisms for iGEM projects (e.g. diatoms). Furthermore, we have evaluated the capacity for assembly for each vector kit and showed that all are capable of reliably assembling constructs containing up to 16 TUs. All vector kits showed very high efficiency, productivity and integrity of assembly for plasmids up to 4 TUs, showing slight variation between the measures evaluated. Assembly of L3 constructs up to 16 TUs (54 DNA parts) showed a lower rate of efficiency, productivity and integrity of assembly, yet complete constructs were Fluorescent phenotypes of 11 pSBL2-1_Pt-B exconjugate colonies from conjugations of two separate sub-strains (CCMP632 and RCC2967) were tested 3 months apart.
The highest sub-populations of blue-fluorescent cells were sorted on two independent days at month 2.5 (sort A and sort B) from one exconjugate colony of each conjugation and re-tested at month 4 (see Supplementary Figure S17). Reported are the % of cells showing blue fluorescence above the 95th percentile for control (non-BFP) exconjugates, the geometric mean of blue fluorescence (normalized to control exconjugates), and the CV of blue fluorescence within each exconjugate. Fluorescence was measured logarithmically.
still easily obtained. Notably, even in the worst case, half of the putative positive colonies screened showed the correct assembly of 16 TUs. In Type IIS assembly methods, a correct pattern of assembly is expected to be free of errors due to its high fidelity.
In previous work, we showed through high-throughput sequencing that errors found in L2 and L3 constructs were attributable to junction misassembly rather than mutations being introduced during the Golden Gate (GG) reaction (10). The decrease in efficiency and productivity measures for L3 constructs could be due to the size of the construct, since transformation efficiency decreases substantially past a certain size (35,36). At that point, the un-cut template has an advantage for transformation, influencing both the efficiency and the productivity of assembly. To test uLoop's capacity for large-scale DNA construction, assemblies were conducted up to a level 4 linear fragment measuring 126 kb (64 TUs). Although we evidenced the assembled full-length product by electrophoresis, it was not possible to obtain transformants harboring the complete construct. The failure to recover this construct could be due to the high content of direct repeats of the assembly which could be lost in vivo, which could also explain the lesser integrity of assembly observed for L3 constructs. The L4 construct composed of 64 TUs contained 63 CaMV35S promoters, 31 mRuby3 CDSs, 16 mTurquoise2 CDSs, 16 Venus CDSs, 32 N7 tags and 63 nos terminators, aside from the sfGFP expression cassette. Direct repeats are known to induce deletions in E. coli (37). Repeats could be lost during replication if the construct exerts a selection pressure for DNA replication. In a pBR322 origin containing-plasmid with an average copy number of 20, the $128-kb plasmid (126 kb þ backbone) would amount to roughly half of an E. coli's genome in terms of DNA content. Presence of the complete fragment demonstrates that uLoop enables the construction of large DNA; however, it remains unclear if absence of direct repeats could allow recovery of such DNA in E. coli due to the metabolic burden demanded by the construct size and copy number of the vector used. Alternatively, a lower copy-number origin of replication (e.g. pDestBAC, (21)) could lead to higher stability enabling such DNA to be replicated. Nevertheless, the capability for large-scale assembly of linear DNAs can be of use when replication is not required such as in cell-free experiments and other in vitro methods (i.e. assembly of small genomes). For cases where replication is required, L3 constructs containing up to 16 TUs fulfill most current cloning needs in genetic engineering and synthetic biology.
Characterization of plasmid performance in bacteria showed that the effects of construct size on sfGFP expression levels can vary for each plasmid kit. In our experiments, variability was higher between vector kits than between plasmid levels. Furthermore, pSB and pCO kits showed a low level of variability of expression between levels according to coefficients of variation, suggesting that these vector kits can provide more consistent expression in E. coli regardless of their size (Supplementary  Table S7).
uLoop plasmids performed efficiently in each eukaryote system tested, demonstrating the versatility and broad applicability of the system. The target organisms were selected to represent three eukaryotic superkingdoms which diverged at the base of the eukaryotic tree of life (38), with the yeast S. cerevisiae from the Opisthokonts (animals and fungi), A. thaliana from the Archaeplastida (plants, green algae, red algae) and the diatom P. tricornutum from the SAR (Stramenopile, Alveolate, Rhizaria) superkingdom. In all these organisms, multispectral fluorescence imaging showed that multiple TUs are successfully expressed from uLoop plasmids. Special attention was devoted to testing uLoop in P. tricornutum, recognizing the special needs for genetic tools in such emerging model organisms and the need for new marine microbial model systems (39). The uLoop plasmids were efficiently introduced in P. tricornutum, in which exconjugants exhibited variable levels of fluorescent protein expression. Nevertheless, high-level expression phenotypes were maintained and even increased in relative abundance with time and, furthermore, the high-expression phenotype was very stable after cell sorting. This high stability over time is consistent with their design to function as episomes in P. tricornutum and suggests that they do not present an excessive metabolic burden to cells.
A cross-kingdom DNA assembly method has also been described recently (40); however, this system has been shown to assemble up to 5 TUs. uLoop integrates cross-kingdom capabilities with recursive assembly features (4,12,41,42) in order to deliver an all-in-one tool for the assembly of up to 16 TUs, for potentially any species of choice. Moreover, Chiasson et al. (40) described several vector backbones with different propagation elements for specific hosts, which indicates that the number of plasmids would further increase if new target species were to be used. uLoop decouples assembly from host-propagation, delivering plasmids optimized for assembly in E. coli that can be customized through basic (level 0) components. This yields a reduced number of vectors and avoids the need for creating de novo backbones and re-characterizing their assembly efficiency.
Together with the development of the uLoop vector kits, we generated a basic set of DNA parts (promoters, fluorescent proteins and terminators, among others) for this work. We expect uLoop to boost the growth of species-specific level 0 part collections for different hosts. To address the problem of traceability of DNA parts, we are developing a virtual repository (https:// www.uloop.org) to collect information such as sequences, descriptions, submitter, source of DNA, laboratory of origin in order to help with accessibility of parts' information and requests for materials. The establishment of digital tools and a crowdsourced repository seeks to promote a distributed growth of libraries of openly shared and well-documented parts for different organisms. While a limited number of parts were tested and validated in this study, we believe that input from the community will help in identifying sources of variability of gene expression, aid in developing strategies for construct design to achieve better performance and provide insights regarding stability of uLoop constructs in different hosts and the generality of the approach. Furthermore, adoption of uLoop in different model systems will provide the repository with an influx of DNA parts from diverse phylogenetic origin (such as transcription factors and repressors) and other genetic tools specific to model organisms. These parts would be useful for synthetic biology and genetic circuit engineering, and their availability would encourage further exchange between fields of biology. Finally, the work described here was conducted in several laboratories around the globe, where the simplicity of the method facilitated its rapid implementation and experimental work. A key element for rapid and wide distribution of the uLoop vector kits in the future is the Open Material Transfer Agreement, a legal tool that enables seamless sharing and use of biological materials (15). Open access to materials and freedom to operate will be crucial to maximize interoperability and enable the growth of distributed uLoop libraries and toolkits in different countries around the world.

SUPPLEMENTARY DATA
Supplementary Data are available at SYNBIO Online, including [reference citations].