Rational optimization of tolC as a powerful dual selectable marker for genome engineering

Selection has been invaluable for genetic manipulation, although counter-selection has historically exhibited limited robustness and convenience. TolC, an outer membrane pore involved in transmembrane transport in E. coli, has been implemented as a selectable/counter-selectable marker, but counter-selection escape frequency using colicin E1 precludes using tolC for inefficient genetic manipulations and/or with large libraries. Here, we leveraged unbiased deep sequencing of 96 independent lineages exhibiting counter-selection escape to identify loss-of-function mutations, which offered mechanistic insight and guided strain engineering to reduce counter-selection escape frequency by ∼40-fold. We fundamentally improved the tolC counter-selection by supplementing a second agent, vancomycin, which reduces counter-selection escape by 425-fold compared to colicin E1 alone. Combining these improvements in a mismatch repair-proficient strain reduced counter-selection escape frequency by 1.3E6-fold in total, making tolC counter-selection as effective as most selectable markers, and adding a valuable tool to the genome editing toolbox. These improvements permitted us to perform stable and continuous rounds of selection/counter-selection using tolC, enabling replacement of 10 alleles without requiring genotypic screening for the first time. Finally, we combined these advances to create an optimized E. coli strain for genome engineering that is ∼10-fold more efficient at achieving allelic diversity than previous best practices.


Assessing the Dysfunctional Phenotype Associated with tolC Counter-selection Escape
To understand more about tolC counter-selection escape, we grew monocultures of EcNR2.ΔtolC and EcNR2.tolC+ strains in colE1. Whereas EcNR2.ΔtolC cultures were resistant to colE1 and grew without delay, 12/16 EcNR2.tolC+ cultures exhibited growth after a significant delay. For example, in one experiment, 4/6 EcNR2.tolC+ replicates grew with a delay of 749.2±111.0 (mean±stdev) minutes compared to EcNR2.ΔtolC controls. Upon passaging these cultures into a 2nd counter-selection, the 4 EcNR2.tolC+ replicates that had grown after a delay during the 1st colE1 selection now exhibited growth kinetics in colE1 that were indistinguishable from EcNR2.ΔtolC, implicating mutational escape rather than loss of colE1 activity (Fig. S3B).
To ascertain whether escape was due to mutation of the tolC coding region, we tested whether tolC counter-selection escape clones could grow in SDS. Replica plating onto SDS demonstrated that 47.3%±10.0% (mean±stdev, n = 6) of escape clones were sensitive to SDS, and likely harbored inactivating mutations in the tolC coding region. While SDS-sensitive escape mutants are readily removed during any subsequent SDS selection, dysfunctional clones (the ~53% that are both SDS-resistant and colE1-resistant) can survive both selections and rapidly take over upon repeated cycles of SDS and colE1 selections. For example, in a typical tolC CoS-MAGE cycle, the recombination frequency for the selectable oligo is ~10^-2 to 10^-1 recombinants per viable clone. In contrast, SDS-resistant/colE1-resistant cells pass both selections with a frequency of ~1; thus, dysfunctional escape clones rapidly take over a CoS-MAGE lineage, as the relative abundance between desired clones and escapees shifts by a factor of 10^1 to 10^2 per cycle. It is noteworthy that increasing the mole fraction of the selectable oligo had a statistically significant impact on delaying complete breakdown of the CoS-MAGE workflow (Fig. 1E), suggesting that improving recombination frequency through increasing oligo concentration significantly reduces the shift in relative abundance between desired clones and escapees from cycle to cycle. When escapees are a minor fraction of the population, they are more likely to be left behind upon dilution into fresh media.
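The takeover dynamics described above can be illustrated with a short calculation. This is a minimal sketch rather than the model used in this work: it assumes that in each cycle desired clones pass the selection with a probability equal to the selectable-oligo recombination frequency, that dysfunctional escapees pass with a probability of ~1, and it ignores growth-rate differences and dilution bottlenecks. The function name and the starting escapee fraction are hypothetical.

```python
def escapee_fraction_by_cycle(initial_escapee_fraction, recombination_frequency, cycles):
    """Track the escapee fraction of a lineage over repeated selection cycles.

    Illustrative assumptions: desired clones survive a cycle only if they
    recombine the selectable oligo (probability `recombination_frequency`),
    whereas SDS-resistant/colE1-resistant escapees always survive.
    """
    escapees = initial_escapee_fraction
    desired = 1.0 - initial_escapee_fraction
    fractions = [escapees]
    for _ in range(cycles):
        desired *= recombination_frequency  # only recombinants pass the selection
        total = desired + escapees          # escapees pass unchanged (~1)
        desired, escapees = desired / total, escapees / total
        fractions.append(escapees)
    return fractions


if __name__ == "__main__":
    # Hypothetical starting point: 1 escapee per million cells.
    for frequency in (1e-2, 1e-1):
        trajectory = escapee_fraction_by_cycle(1e-6, frequency, cycles=8)
        print(frequency, [round(x, 4) for x in trajectory])
```

Under these assumptions the escapee-to-desired odds grow by a factor of 1/(recombination frequency) per cycle, which is the 10^1 to 10^2 shift quoted in the text.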

Examples of Mutations that are Associated, but Unrelated to tolC Counter-selection Escape
Only one non-tolQRA mutation, a frameshift in ogt (XM_ogt_1398047_T_TC, 0.56±0.01), was putatively causal based on Normalized Culture Time. Three additional XM oligos targeting recR (0.73±0.02), treB (0.79±0.01), and a putative intergenic site (nt 752378, 0.69±0.14) were also found to have borderline influence on counter-selection escape. These 4 loci have not been previously implicated in tolC function.
We further explored the nature of the SNV in the intergenic region at nt 752378, between ybdG and gltA. We hypothesized that polar effects may affect ybdG or gltA. Therefore, we tested whether either ybdG or gltA played a role in colE1 resistance by inactivating each gene with MAGE oligos ybdG_mut and gltA_mut.
Notably, the ybdG_mut (1.146±0.265) and gltA_mut (1.292±0.471) mutations exhibited Normalized Culture Times in the unrelated range (≥ 1.0), suggesting that any effect of XM_Intergenic_752378_T_G on colE1 resistance is not due to reducing/abolishing expression of surrounding coding regions.
To further study all 4 mutations, we performed additional biological replicates in a simpler experimental setup (Table S3) than the high-throughput validation (Table S2). These replicates were largely confirmatory, except for ogt, which was only weakly causal, consistent with a Borderline mutation. We then tested the mutant genotypes in isolation for their counter-selection escape phenotype. Whereas the previous validations counter-selected from mixed populations containing the parental genotype, we used mascPCR to identify each of the four monoclonal mutations, then tested their colE1 phenotypes compared to EcNR2.ΔtolC and EcNR2.tolC+ clonal controls. Under these conditions, the ogt, recR, and treB SNVs, as well as the intergenic SNV at nt 752378, grew like the negative controls (Table S4), and thus are likely unrelated to counter-selection escape.
Despite a cutoff for Borderline Causal as Normalized Culture Time < 1, we excluded XM_Rep321e_4294083_C_T from further analysis due to a relatively large error of measurement (0.933±0.116). Moreover, rep321e is a member of palindromic repeat elements between gltP and yjcO; thus, oligos targeting the mutation seen in our sequencing data set contain significant off-target homology to other repetitive elements, which obfuscates interpretation of the results.
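The Normalized Culture Time categories applied above (defined in Assessing Causality of Alleles Identified via Sequencing) can be sketched as follows. This is an illustrative reading, not the analysis code used in this work: the linear rescaling between the two controls is an assumption consistent with defining the tolC-r.null_mut control as 0 and the water control as 1, and the function names and the label for values at or below 0 are hypothetical.

```python
def normalized_culture_time(t_sample, t_null_mut, t_water):
    """Rescale a culture time so the tolC-r.null_mut control maps to 0 and
    the water control maps to 1 (assumed linear interpolation)."""
    return (t_sample - t_null_mut) / (t_water - t_null_mut)


def classify(nct):
    """Apply the cut-offs stated in the text to a Normalized Culture Time."""
    if nct <= 0:
        # The text attributes values < 0 to highly recombinogenic oligos;
        # this label is a placeholder, not a category from the paper.
        return "at or below positive control"
    if nct < 0.6:
        return "causal"
    if nct < 1.0:
        return "borderline"
    return "unrelated"


if __name__ == "__main__":
    # Mean values quoted in the text for four oligos.
    for name, value in [("ogt", 0.56), ("recR", 0.73),
                        ("ybdG_mut", 1.146), ("Rep321e", 0.933)]:
        print(name, classify(value))
```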
Given previous literature implicating BtuB as the OM receptor for colE1 (1) and despite finding 3 unique mutations in BtuB (Table S5), we were surprised that btuB loss-of-function mutations were not classified as causal (Table S2). We duplicated btuB in a tolQRA duplicated strain (EcNR2.1255700::tolQRA) and found that 25/32 btuB duplication replicates escaped counter-selection, with extremely variable kinetics (OD600 = 0.4 at 1846.7 ± 662.9 minutes, mean ± stdev), whereas only 1/32 EcNR2.1255700::tolQRA parental replicates escaped (OD600 = 0.4 at 919.5 minutes). As a result, we did not pursue btuB duplication further; we hypothesize that increasing btuB copy number leads to off-target effects such as increasing the number of colE1 binding sites on the OM (1).
However, EcNR2.nuc5- exhibits a growth phenotype after recombination that resolves upon subsequent passage into fresh media (2). To explore this phenotype further, we isolated steps in the recombination workflow (42 °C Induction of λ Red proteins, water Washes on ice to reduce salt content, Electroporation with or without oligos), then monitored growth to determine which step(s) underlie the phenotype. Figure S4A compares EcNR2 (top panel) with EcNR2.nuc5- (bottom panel), and shows a clear association of the slower growth rate of EcNR2.nuc5- with Induction (filled shapes). Vmax quantifies this effect as ~50% longer doubling times in induced EcNR2.nuc5- compared to induced EcNR2 controls (Fig. S4C). The association of the Induction (42 °C heat shock for 15 min) with the EcNR2.nuc5- recovery phenotype suggests two possible explanations: (1) the inactivated nucleases may be important in the heat shock response; or (2) the nucleases may be important in minimizing the transcriptional burden of λ prophage expression (which contains genes beyond exo, bet, and gam). To exclude the former, we used MAGE to replace the entire λ prophage with a zeocin resistance cassette (Δλ::zeoR) in EcNR2 and EcNR2.nuc5-. Then we performed similar subsets of the recombination workflow and monitored growth for the recovery phenotype (Fig. S4B, S4C). Interestingly, EcNR2.nuc5-.Δλ::zeoR did not exhibit a re-growth phenotype, despite Induction, suggesting that the EcNR2.nuc5- phenotype is likely due to expression of the λ Red operon.
We chose to avoid reverting xseA, as its inactivation is associated with improved mutation inheritance at the 3' end of ssDNA oligos (2). We chose to avoid reverting exoX because this reversion strain exhibited the worst post-recombination recovery phenotype (Fig. S4D), although it was not statistically significantly slower than EcNR2 (p = 0.1). To decide between the remaining two nucleases (recJ and xonA), we used allele conversion in a single round of CoS-MAGE as the criterion for our decision (Fig. 3B). Both the recJ (reported as Mean Allele Conversion ± SEM, 1.77±0.13) and the xonA (1.72±0.12) reversion strains yielded statistically similar mean allele conversions per clone (p = 0.16 by Kruskal-Wallis ANOVA) to each other and to the parent strain (1.55±0.13), suggesting that neither recJ reversion nor xonA reversion detracts from the improved MAGE performance seen in EcNR2.nuc5-. The xonA reversion strain exhibited a sporadic growth phenotype on carbenicillin-containing solid media, so we chose to instead revert recJ in MAGE-optimized strain EcM1.0.

tolQRA Duplication Enables Stable CoS-MAGE Cycling (continued from main text)
Our computational model suggested that a CoS-MAGE lineage would incrementally approach a completely modified genotype over 10 cycles (Fig. 1C, Fig. S2), whereas our empirical data showed that the lineages moved much more erratically towards the completely modified genotype (Fig. 3CD). The model did not take into account the selection library size when extrapolating CoS-MAGE performance across multiple cycles, and these results indicate that genetic bottlenecks play an important role in fixing minor variants in a mixed population. However, it is not obvious why the tolC counter-selection would consistently fix minor variants (clones harboring additional mutations to the parental genotype), rather than fixing the major variant (the parental genotype). It seems unlikely that counter-selection (colE1) on a tolC+ strain would have an intrinsically greater capacity to recombine oligos at neighboring loci than the selection (SDS), and it will be interesting to see if this observation is borne out through future empirical data with larger sample sizes.
In terms of improvements over previous best practices, CoS-MAGE cycling in EcM2.0 yielded tenfold more changes per cycle than MAGE in EcNR2. This translates into less calendar time needed to attain a defined set of modifications. Although the total number of cycles needed to attain a defined set of modifications is an important metric for evaluating the power of MAGE vs. CoS-MAGE, another important consideration is the number of cell divisions required to achieve a desired genotype. Defective mismatch repair (mutS::cat) in EcNR2-based lineages is an important enabling technology for MAGE, but comes with an ~10-100-fold increased mutation frequency (primarily transition mutations) (4). For example, we used MAGE to change all 321 TAG stop codons to TAA synonyms genome-wide (5). During strain engineering, this strain acquired 355 off-target mutations over ~7340 doublings due to defective mismatch repair (5). One way to reduce off-target mutation frequency is to minimize the total number of cell divisions that a population will proceed through during genome editing. Although CoS-MAGE fixes ~6-fold more variants per cycle (Fig. 1AB), a CoS-MAGE cycle involves more cell divisions than a MAGE cycle, as the subpopulation of cells that recombine the selection oligo must undergo additional divisions during selection in order to repopulate the culture (only a subset of viable cells gain the selectable allele). By taking into account both workflows, we estimate that cells will go through 9 divisions in two cycles of MAGE (assuming a singleplex AR frequency of 10^-1), while cells will go through 25 to 31 (uncertainty due to estimates of AR frequency for the tolC inactivation/reversion oligos, estimated at 10^-1 and 10^-2, respectively) divisions in two cycles of tolC CoS-MAGE. Thus, to completely modify 10 loci in a majority of the population, we estimate that MAGE will need ~860 cell divisions (90 cycles), whereas CoS-MAGE will need around 245 to 312 divisions (10 cycles). This represents a 64-72% reduction in the total number of cell divisions, and should reduce the off-target mutation burden in kind.
Calendar time is another important consideration for evaluating the power of MAGE vs. CoS-MAGE. To compare, we assume that the process cycle for MAGE is 1/8 day and that the process cycle for CoS-MAGE, which takes into account additional time needed for selection/counter-selection growth, is 1/3 day. Based on these process cycles, we estimate that MAGE will take 11.25 days, whereas CoS-MAGE will take 3.33 days, to completely modify 10 loci in a majority of a population. This corresponds to a ~70% reduction in total time needed, similar to the savings in mutation burden above.
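The savings quoted above follow directly from the stated estimates. The short calculation below simply reproduces that arithmetic from the numbers given in the text (cycle counts, total divisions, and process-cycle durations); the variable and function names are illustrative.

```python
from fractions import Fraction

# Estimates taken from the text.
MAGE_CYCLES = 90
COS_MAGE_CYCLES = 10
MAGE_DIVISIONS = 860
COS_MAGE_DIVISIONS = (245, 312)          # lower and upper estimates
MAGE_DAYS_PER_CYCLE = Fraction(1, 8)
COS_MAGE_DAYS_PER_CYCLE = Fraction(1, 3)


def percent_reduction(baseline, improved):
    """Percent saved when going from `baseline` to `improved`."""
    return 100.0 * (1.0 - float(improved) / float(baseline))


# Reduction in total cell divisions (best case, worst case).
division_savings = tuple(percent_reduction(MAGE_DIVISIONS, d)
                         for d in COS_MAGE_DIVISIONS)

# Calendar time for each workflow and the resulting reduction.
mage_days = MAGE_CYCLES * MAGE_DAYS_PER_CYCLE
cos_mage_days = COS_MAGE_CYCLES * COS_MAGE_DAYS_PER_CYCLE
time_savings = percent_reduction(mage_days, cos_mage_days)

if __name__ == "__main__":
    print("division savings (%):", [round(s, 1) for s in division_savings])
    print("MAGE days:", float(mage_days), "CoS-MAGE days:", round(float(cos_mage_days), 2))
    print("time savings (%):", round(time_savings, 1))
```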

Appending the ssrA Degradation Tag to tolC Increases Protein Turnover, but Reduces Fitness in SDS.
One suboptimal characteristic of TolC is its slow turnover time from the OM, which necessitates long recovery times between the inactivation recombination and the corresponding colE1 counter-selection. To increase TolC turnover rate, we appended the 11 amino acid ssrA tag (6,7) to the C-terminus of the tolC coding sequence. To assess and optimize emergent phenotypes associated with ssrA tagging tolC, we used several ssrA tag variants reported to effect different protein half-lives.
Tested variants included ssrA FULL (AANDENYALAA (6)
We attempted to optimize expression of the tolC_ssrA FULL cassette by using MAGE to explore promoter mutations that would increase expression and offset increased turnover. Although we optimized the promoter toward the canonical E. coli σ70 promoter, we were unable to alleviate the observed SDS phenotype. We also attempted to mutate the ssrA FULL tag with degenerate MAGE oligos to randomize the 11 amino acid ssrA sequence. After each round of MAGE, we selected against the growth phenotype using 1x SDS. After 6 rounds of MAGE, we performed a tolC-r.null_mut recombination, followed by a colE1 selection after a 30 minute recovery to ensure tag function. Sequencing showed that the tag was surprisingly resistant to mutation, and subsequent SDS phenotyping showed that the growth phenotype remained. These results suggested that the window between the non-selective dose and the maximum tolerated efficacious dose of SDS was too small for tolC_ssrA FULL strains (~4-fold, based on Fig. S5B, compared to ≥ 64-fold for untagged tolC). One potential explanation for our lack of success is that different proteases are responsible for degrading ssrA-tagged proteins in the cytosol (clpA, clpX, lon, etc.) versus the OM/PP (prc) (8). Our results are consistent with prc requiring the C-terminus of the ssrA tag to end in LAA (9), which was only the case for ssrA FULL. Thus, it should be recognized that different proteases prefer distinct, yet overlapping specificities within the ssrA tag sequence, making modular application of some ssrA tag variants problematic.

Validating Colicin Agar Plates to Quantify tolC Counter-selection Escape Frequency
We generated vancomycin (LB L CV), colE1 (LB L CCo), and colE1/vancomycin (LB L CCoV) agar plates as described in the Methods. To validate these plates, we plated EcNR2.tolC+ onto LB L CV (2E3 cells), LB L CCo (1E7 cells), and LB L CCoV (1E7 cells). These inocula were chosen to yield a countable number of clones on the given media. As shown in Figure S6A, EcNR2.tolC+ readily escapes selection on the LB L CV plates (8.3E-2 ± 1E-1), while the LB L CCo plates are significantly more stringent (3.4E-5 ± 1.7E-5). As expected, the colE1/vancomycin (LB L CCoV) plates are significantly more stringent (4.5E-8 ± 9E-9) than colE1 (LB L CCo) alone. To demonstrate that these plates can be used for tolC counter-selection, we streaked 10^6 EcNR2.tolC+ and EcNR2.ΔtolC cells onto LB L CV, LB L CCo, and LB L CCoV plates. As shown in Figure S6B, EcNR2.ΔtolC grows without apparent phenotype, while EcNR2.tolC+ yields markedly fewer colonies despite streaking the same number of cells. As in Figure S6A, we observed more EcNR2.tolC+ growth on LB L CV than LB L CCo, while LB L CCoV permitted the least growth. This again suggests that the LB L CCoV plates are more stringent than either counter-selection agent used alone.
To test whether these plates are bactericidal rather than bacteriostatic, we performed a CoS-MAGE cycle on EcM2.0.tolC+ cells using 1 μM tolC-r.null_mut, recovered for 7 hours, then plated the recovery cultures onto LB L CCoV plates to counter-select. After overnight incubation, we picked 96 clones from the LB L CCoV plates into LB L + Carb and LB L + Carb + SDS. As shown in Fig. S6C, these clones grew normally in LB L + Carb, but none grew in LB L + Carb + SDS. This demonstrates that the LB L CCoV plates successfully kill the parental (tolC+) genotype, and indicates that clones picked from the plate-based counter-selection do not carry viable tolC+ contamination.

Supplemental Discussion of Future Candidate Dual Selectable Markers for Optimization
It will be informative to apply this unbiased workflow to diagnose selection escape with other dual selectable markers, such as the sugar utilization kinases (malK, galK, etc.) (10), the multi-functional tetracycline efflux protein tetA (11), Herpes simplex virus thymidine kinase (hsvTK) and other thymidine-based markers (thyA).
The sugar utilization kinases are required for growth in media containing the respective sugar as a sole carbon source, while rendering the cell sensitive to toxic analogues. Selections based on sugar utilization must be performed in minimal media to maintain selection stringency; thus, they require significant growth time. Of note, tetA confers tetracycline resistance through efflux, but also confers sensitivity to Ni- and Cd-salts. The striking similarity of the selection mechanisms associated with the tetA efflux protein and those of tolC suggests that tetA may be amenable to the workflow described here to make tetA a more robust dual-selectable marker.
Herpes thymidine kinase is an interesting case because selective pressure for selection escape would lead to mutations in off-target alleles, whereas selective pressure for counter-selection escape would lead to mutations in the marker itself (12). In the selection, the marker rescues thymidine deficiency in the context of inhibiting endogenous thymidine metabolism (thyA). In a scenario that is analogous to tolC, one would predict loss-of-function mutations (that interfere with thyA inhibition) would permit selection escape, and further predict that the selection escape frequency would be reduced by duplicating thyA. On the other hand, the counterselection exploits the substrate promiscuity of the marker to incorporate deleterious nucleoside analogues into the genome. Thus, one would predict selective pressure for deleterious mutations within the marker coding region itself. It remains to be seen how biology escapes these selection mechanisms and how, if at all, we are able to engineer a strain to avoid that escape.

Lambda Red Recombinations, MAGE, & CoS-MAGE
Lambda Red recombineering is the basis for MAGE and CoS-MAGE, which were carried out as described previously (2,13,14).

Notes on tolC-based Selections
The TolC protein is a multi-functional pore that provides a route for efflux of a wide variety of small molecules, salts, and macromolecules. As such, loss of tolC leads to a pleiotropic phenotype, including impaired cell division and morphology, and a broad sensitivity to environmental agents, including antibiotics and common metabolites (15). Although the tolC selection used in this work is based on efflux of a membrane-compromising detergent (SDS), we also noted tolC-dependent sensitivity to antibiotics with intracellular mechanisms of action. For example, the mismatch repair gene mutS was replaced with chloramphenicol (Cm) acetyl-transferase (mutS::cat) in EcNR2-based strains. Despite the presence of the resistance gene, EcNR2.tolC- exhibits reduced fitness in Cm with respect to EcNR2.tolC+ (Fig. S1D). As the tolC-dependent Cm phenotype suggests that tolC plays a role in Cm efflux regardless of the presence of cat, we perform tolC selections in SDS plus Cm to take advantage of added selection robustness. A similar tolC-dependent phenotype has been observed for kanamycin, X-gal/IPTG, and MacConkey media, agents that also gain access to the intracellular space for their respective actions. It is important to consider these observations when designing experimental protocols using tolC as a selectable marker, especially in strains bearing many selectable markers where antibiotic use can cause unexpected tolC selection/counter-selection failure.
Although published protocols (16) call for a single 5 hour recovery before inoculating the counter-selection using ~2*10^7 stationary phase cells, we have observed that inoculating the counter-selection using ~5*10^5 early mid-log cells is crucial for optimal counter-selection performance, possibly by allowing additional time for TolC turnover.
We used tolC_null_mut/tolC_null_revert oligos to inactivate/re-activate tolC genes located on the plus strand of replichore 1 and on the minus strand of replichore 2, while we used tolC-r_null_mut/tolC-r_null_revert oligos to inactivate/re-activate tolC genes located on the minus strand of replichore 1 and the plus strand of replichore 2 (including the endogenous MG1655 tolC). Sequences for these oligos are given in Table S6.

CoS-MAGE Modeling
The input data for CoS-MAGE modeling were the genotypes of the 10 targeted loci from 92 clones of a population of EcNR2.nuc5-.dnaG_Q576A cells that had been subjected to one cycle of CoS-MAGE.
Only subtle differences can be seen between the uncorrelated (Fig. 1C) and correlated (Fig. S2) distributions: broadly speaking, the uncorrelated distributions, which treat each allele as independent, tend to be narrower and more regular than the distributions that take correlations into account. The two sets of distributions approach each other after many cycles, suggesting that the effect of correlations diminishes as cycles increase. This is reasonable because MAGE edits accumulate, so that as cycles increase there is less opportunity for the correlated and uncorrelated processes to yield different results.
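To make the uncorrelated case concrete, the sketch below shows one plausible way to propagate independent per-allele conversion across cycles. It is an assumption-laden illustration, not the model used in this work: it assumes every locus shares a single per-cycle conversion probability, that converted alleles never revert, and that loci are independent. The function name and the probability used in the example are hypothetical placeholders, not measured frequencies.

```python
from math import comb


def conversion_distribution(p_per_cycle, n_loci, cycles):
    """Return P(k of n_loci converted) for k = 0..n_loci after `cycles`,
    treating each locus as an independent, irreversible conversion."""
    # Probability that a given locus has been converted at least once.
    p_converted = 1.0 - (1.0 - p_per_cycle) ** cycles
    return [
        comb(n_loci, k) * p_converted ** k * (1.0 - p_converted) ** (n_loci - k)
        for k in range(n_loci + 1)
    ]


if __name__ == "__main__":
    # Hypothetical per-locus, per-cycle conversion probability.
    for cycle in (1, 5, 10):
        dist = conversion_distribution(0.2, n_loci=10, cycles=cycle)
        print(cycle, "P(10/10) =", round(dist[-1], 3))
```

Because conversions only accumulate in this sketch, the probability mass moves steadily toward the fully converted genotype as cycles increase, which is the regime in which correlated and uncorrelated treatments are expected to converge.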

Assessing Causality of Alleles Identified via Sequencing
To determine whether the alleles identified by sequencing resulted in the dysfunctional tolC phenotype, we used Normalized Culture Time, defined as 0 for EcNR2.tolC+ recombined with tolC-r.null_mut and 1 for EcNR2.tolC+ recombined with water, which enables direct replicate-to-replicate comparison. Causal mutations were defined as those exhibiting 0 < Normalized Culture Time < 0.6. This cut-off was chosen to have a high confidence of covering all obvious candidate mutations/genes. Borderline mutations were defined as 0.6 ≤ Normalized Culture Time < 1.0, whereas unrelated mutations exhibited Normalized Culture Time ≥ 1.0. While this approach enables high-throughput assessment of causal mutations, Normalized Culture Time is affected by both mutation causality and oligo recombination frequency, which is known to vary significantly from oligo to oligo (13). Normalized Culture Times < 0 likely reflect oligos that are more recombinogenic (not "more causal") than tolC.null_mut; nonetheless, this metric provides a quantitative strategy for identifying candidate alleles that can be easily validated synthetically.
Table S1. Fixed Variants in EcNR2. Deep sequencing of the dysfunctional clone set revealed a number of fixed polymorphisms that deviated from the E. coli K12 MG1655 reference genome. Since this clone set was derived from modified MG1655-based strain EcNR2, we report these polymorphisms as specific to that lineage and not related to mutations arising due to colE1 killing pressure.
Table S2. Assessing Causality of Alleles Identified via High Throughput Sequencing. To investigate the causality of the high-incidence mutations seen in the deep sequencing data set of dysfunctional tolC selection clones, we designed oligos to generate the exact mutations ("XM") seen with the highest incidence, and designed knockout oligos to cover all of the mutations seen in the coding regions mutated with the highest incidence ("cvrALL"). We performed a singleplex recombination with each oligo on EcNR2 (tolC+), followed by a colE1 selection, to test the ability of each mutation in isolation to generate the dysfunctional phenotype. As controls, we recombined EcNR2 with 1, 2, & 5 µM tolC-r.null_revert oligo or water only. The former three constitute an important positive control that phenotypically defines "dysfunction" in this assay. The latter control (water) defines normal selection bleedthrough. Thus, mutations that contribute to dysfunction will lead to growth in the colE1 selection before the water control and likely after the tolC-r.null_mut controls. To distill growth data, we presented "Normalized Lagtime", "Normalized t @ Vmax" (time where d^2(OD600)/dt^2 = 0), and "% Change in Vmax" (where Vmax (EcNR2.ΔtolC) was used as reference). In the case of the normalized metrics, the average of tolC-r.null_revert wells was defined as 0, while the average of water control wells was defined as 1. Thus, normalized metrics for oligos bearing causal mutations fall between 0 and 1. All data are presented as the average & standard deviation of two independent replicates. All controls were performed for each replicate to account for assay-to-assay variability.
Table S3. Assessing Causality of Alleles Identified via High Throughput Sequencing. As in Table 1, except that recombinations to test these 4 mutations were done as a low-throughput singleplex recombination into EcNR2 before colE1 selection. To test whether XM_intergenic_752378_T_G acts through its proximal coding regions, ybdG and gltA, we also included oligos ybdG_mut and gltA_mut. As a positive control, we included a single replicate of XM_tolA_775912_G_GA. Assay controls included EcNR2 recombined with tolC-r.null_mut as well as with water, which defined Normalized Lagtimes of 0 and 1, respectively. The data are based on three biological replicates.
and the y-axis is OD600.
These data are single, representative replicates from an experiment that has been performed at least three times.
(dnaG Q576A, Nuc5-, CoS+) were used to model the allele conversion distribution through 10 cycles of CoS-MAGE, to understand how a mixed population will approach the isogenic modified population (e.g., ~100% of cells exhibiting 10/10 of desired mutations). These data were derived from the model that accounted for empirical, positional dependence for conversion of certain pairs of alleles. Both models predict that ~50% of cells will carry 10/10 mutations after 10 cycles of CoS-MAGE, suggesting that CoS-MAGE will need only ~10% of the cycles that MAGE would need to similarly convert the same set of 10 mutations (13). The data are reported as a stacked bar graph where each color indicates the frequency of clones bearing that number of allele conversions.
Theoretical selection kinetics are presented to explain the possible results of typical selections and how we quantified Normalized Selective Advantage (NSA). In all panels, we present the selection kinetics of a population recombined with an oligo to change the state of a selectable marker (e.g., tolC- → tolC+, 'CoS-MAGE Recombinants', blue circles), and a control population recombined with water only ('Negative Control', red squares). Thus, the selection kinetics of the negative control is indicative of the background selection escape. At left, we present an ideal selection, where the negative control never grows, defined as Selection Advantage = 1. At center, we present a more commonly-observed selection, in which the desired population (blue circles) exhibits a non-zero growth advantage (Δ), which is the relative advantage that the desired population exhibits over the negative control (red squares). This advantage is translated into a NSA between 0 and 1, using NSA = 1 - [t_RS * (t_CNS / t_RNS)] / t_CS (as discussed in tolC-based Selections of the Methods). At right, we present a broken selection, where the negative control grows at the same rate as the recombinants (NSA = 0).
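The NSA expression in the legend above can be checked against its two limiting cases with a few lines of code. This is a minimal sketch under a stated assumption: the legend does not define the subscripts here, so the reading used below (R = recombinants, C = negative control, S = selective medium, NS = non-selective medium, each t being a culture time) is an interpretation, and the function and argument names are hypothetical.

```python
import math


def normalized_selective_advantage(t_rs, t_rns, t_cs, t_cns):
    """NSA = 1 - [t_RS * (t_CNS / t_RNS)] / t_CS.

    Assumed meaning of the arguments (not defined in this section):
    t_rs / t_rns: recombinant culture time with / without selection;
    t_cs / t_cns: negative-control culture time with / without selection.
    """
    return 1.0 - (t_rs * (t_cns / t_rns)) / t_cs


if __name__ == "__main__":
    # Ideal selection: the negative control never grows under selection.
    print(normalized_selective_advantage(400.0, 300.0, math.inf, 300.0))
    # Broken selection: control and recombinants grow identically.
    print(normalized_selective_advantage(400.0, 300.0, 400.0, 300.0))
    # Intermediate case with arbitrary illustrative times (minutes).
    print(normalized_selective_advantage(400.0, 300.0, 800.0, 300.0))
```

The ratio t_CNS / t_RNS acts as a correction for any growth difference between the two populations in the absence of selection, so that the NSA reflects only the advantage conferred by the selection itself.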

SUPPLEMENTAL TABLES AND LEGENDS
In dysfunctional tolC selections, the population is genotypically tolC+, but phenotypically SDS-resistant and colE1-resistant. B. To test whether counter-selection escape in colE1 is due to loss of colE1 activity over time
Figure S4. CoS-MAGE Strain Improvements. To probe the post-recombination phenotype of strains based on Nuc5-, we performed mock recombinations using modified protocols that eliminated certain components of the recombination workflow and allowed us to isolate the effects of each component. The standard recombination protocol includes heat shock-based recombinase induction ("Induced"), icing the culture and performing 2 ice cold water washes ("Washed", blue squares), and electroporation, either in water ("Electroporated (w/ H2O)", red squares) or with a multiplexed oligo pool ("Electroporated (w/ 5.2 µM oligo)", dark red squares). A. Using post-recombination growth as a metric, we compared these conditions using EcNR2 and EcNR2.Nuc5-. Nuc5- strains exhibit reduced fitness, which is associated with λ Red induction (bottom panel, filled squares).
Figure S6. Validating ColE1 Agar Plates to Quantify tolC Counter-selection Escape Frequency. A. We made vancomycin (LB L CV), colE1 (LB L CCo), and colE1/vancomycin (LB L CCoV) plates as described in the Methods. To validate these plates, we plated 2*10^3, 10^7, and 10^7 EcNR2.tolC+ cells onto LB L CV, LB L CCo, and LB L CCoV plates, respectively. These inocula were chosen for demonstration to yield discrete colonies for counting. Replicates of EcNR2 counter-selection escape frequency (Fig. 4C) are reported here, ordered from least to most robust: LB L CV, 8.3E-2 ± 1E-1; LB L CCo, 3.4E-5 ± 1.7E-5; LB L CCoV, 4.5E-8 ± 9E-9. B. We streaked 10^6 EcNR2.tolC+ and EcNR2.ΔtolC cells onto LB L CV, LB L CCo, and LB L CCoV, respectively, which mirrors the data in A showing that the LB L CCoV plates produce the most striking reduction in counter-selection escape. C. We performed a tolC inactivating CoS-MAGE cycle on EcM2.0.tolC+ using tolC-r.null_mut (0.2 µM) plus a multiplexed oligo pool (5 µM). After recovery, we plated onto LB L CCoV to perform the tolC counter-selection, then picked 96 clones into LB L supplemented with Carb and LB L supplemented with Carb and SDS to test if the plates efficiently kill the parental genotype. Growth (dark wells) was seen for 95/96 wells in LB L supplemented with Carb, whereas none (light wells) grew in LB L supplemented with Carb and SDS, demonstrating that the plates efficiently kill tolC+ strains and can therefore be used to directly isolate monoclonal tolC- clones.