Manufacturing DNA in E. coli yields higher-fidelity DNA than in vitro enzymatic synthesis

Biotechnologies such as gene therapy have brought DNA vectors to the forefront of pharmaceuticals. The quality of starting material plays a pivotal role in determining final product quality. Here, we examined the fidelity of DNA replication using enzymatic methods (in vitro) compared to plasmid DNA produced in vivo in E. coli. Next-generation sequencing approaches rely on in vitro polymerases, which have inherent limitations in sensitivity. To address this challenge, we introduce a novel assay based on loss-of-function (LOF) mutations in the conditionally toxic sacB gene. Our findings show that DNA production in E. coli results in significantly fewer LOF mutations (80- to 3,000-fold less) compared to enzymatic DNA replication methods such as polymerase chain reaction (PCR) and rolling circle amplification (RCA). These results suggest that using DNA produced by PCR or RCA may introduce a substantial number of mutation impurities, potentially affecting the quality and yield of final pharmaceutical products. Our study underscores that DNA synthesized in vitro has a significantly higher mutation rate than DNA produced traditionally in E. coli. Therefore, utilizing in vitro enzymatically produced DNA in biotechnology and biomanufacturing may entail considerable fidelity-related risks, while using DNA starting material derived from E. coli substantially mitigates this risk.


INTRODUCTION
2][3] These applications encompass DNA and mRNA vaccines, gene-editing templates, and gene therapies employing both viral and non-viral vectors.Globally, over 400 million people suffer from rare diseases and could benefit from gene therapies. 4owever, high prices and complication risks are hurdles that must be overcome before gene therapy can reach its full potential. 5,6Using high-quality materials in manufacturing could increase potency (lower doses), improve production yield and efficiency (reduce cost), and reduce the potential risk of complications due to impurities in the product.
Recently, there has been increasing interest in the use of in-vitro-synthesized DNA.DNA made by polymerase chain reaction (PCR) or rolling circle amplification (RCA) can be produced rapidly and without the requirement of a living cell or cell bank.Nonetheless, even though the polymerase enzymes used in these systems are advertised as high fidelity, a lingering question persists: can they uphold the same degree of accuracy as DNA produced in a living organism?
To avoid damaging mutations during reproduction, all living organisms have evolved mechanisms to ensure that DNA is copied correctly.This replication fidelity begins with accurate base selection by DNA polymerase, followed by proofreading capability where highfidelity polymerases reverse their progression to remove and replace incorrect bases. 7DNA polymerases used in vitro for PCR-or RCAbased DNA production include these two mechanisms. 8However, in addition to base selection and proofreading, living cells such as E. coli that have been used for decades in the manufacturing of pharmaceutical products have additional mechanisms to ensure replication fidelity.Most prominent is mismatch repair (MMR), which recognizes mismatched bases produced on the unmethylated daughter DNA strand, excises the inaccurate copy, and replaces it with the correct sequence dictated by the methylated parental strand. 9,10[13] In this work, we aimed to directly compare the sequence fidelity and quality of DNA starting material produced by in vivo (E.coli) and in vitro (DNA polymerase) procedures.Polymerases are traditionally tested by replicating lacZ, cloning it into a plasmid vector, and using blue/white screening to identify inactivating mutations in lacZ. 14owever, the sensitivity of this test is limited; at high fidelity, the few white colonies cannot be accurately identified from among the blue background.Next-generation sequencing (NGS) methods of examining error rates are also limited in sensitivity since they rely on in vitro polymerases.These polymerases have their own inherent error rates, which are similar to those used for in vitro DNA synthesis. 8,15A sensitive NGS approach, PacBio's SMRT HiFi sequencing, uses repeated reads of the same DNA molecule to achieve Q60 data. 16This allows detection of mutation rates of approximately 10 À6 -10 À7 -excellent for most applications.]15 To complement existing technologies, we introduce an innovative and highly sensitive approach to assess the accuracy of DNA synthesis.Our "sucrose toxicity" (SuTox) fidelity assay leverages positive selection for loss-of-function (LOF) mutants, effectively eliminating background and achieving exceptional sensitivity.Using the SuTox assay, we provide compelling evidence of the superior fidelity associated with DNA produced in vivo compared to DNA synthesized by in vitro enzymatic techniques.

Design of the SuTox fidelity assay
We designed a screen for LOF mutations that generate a quantifiable phenotype in bacterial colonies.We chose the conditionally lethal gene sacB as our template for replication.SacB is harmless when E. coli are grown in standard media, avoiding selective pressure during in vivo replication.However, in the presence of sucrose, the SacB protein (levansucrase) produces a compound (levan) that accumulates in the periplasm and is toxic to E. coli. 17Therefore, faithful replication of sacB prevents bacterial growth on agar plates containing sucrose.Colonies with LOF mutations in sacB survive, allowing simple visualization and quantification.This approach allows us to plate billions of bacteria without significant background, increasing sensitivity.
We employed a construct containing sacB and chloramphenicol resistance (chloramphenicol acetyltransferase) genes (Figure 1A).We cloned this cassette into a precursor plasmid used to produce double-stranded linear covalently closed DNA minivectors (ministring DNA [msDNA]). 18,19Additionally, we added inverted terminal repeats (ITRs) to flank each end of the cassette to assess if these problematic palindromic sequences-used in recombinant Adeno-Associated Virus (rAAV) manufacturing-would influence replication fidelity. 20,21We replicated the ITR-sacB-cat-ITR cassette in vivo (by amplifying in E. coli) or in vitro using PCR or RCA DNA polymerization reactions (Figure 1B).We then cloned the synthesized DNA into a pUC19 vector, transformed highly competent cells, and plated on selective agar with or without sucrose.We selected for both the sacB-cat insert cassette (chloramphenicol) and the pUC19 vector (ampicillin).Notably, the ampicillin also counter-selects for the starting plasmid template, eliminating potential background from any carryover during cloning.Since survival on sucrose implies a LOF mutation(s) in the sacB gene or its expression, colonies growing in the presence of sucrose can be quantified to assess the accuracy of DNA production processes.
DNA manufactured in E. coli has higher fidelity than any enzymatic method tested Using the SuTox fidelity assay, we compared the accuracy of DNA synthesis by multiple E. coli strains (in vivo) and three different DNA polymerases (in vitro).We observed far fewer colonies on sucrose plates with E. coli transformed by DNA replicated in vivo in E. coli compared to any of the three enzymatic in vitro processes tested (Figure 2A).We quantified sacB mutants (colony-forming units [CFUs] on sucrose), total transformants (CFUs without sucrose), and the number of DNA doublings during synthesis (see materials and methods).By normalizing sacB mutants to total transformants and DNA doublings, we were able to accurately calculate relative mutation rates comparing each synthesis method.Compared to in vivo replication strain MBI, we observed 2,983-, 737-, and 84-fold higher mutation rates for Taq, Phi29, and Q5, respectively The SacB protein is toxic to bacteria in the presence of sucrose, allowing for positive selection of mutants: faithful replication of the sacB gene results in bacterial cell death, whereas if sacB is mutated (LOF) during DNA synthesis, then a colony will grow on sucrose.Palindromic ITRs were included on the ends to investigate if they influence replication accuracy.(B) Overview of the SuTox process.First, the sacB-cat DNA construct is replicated in vivo in bacterial cells or in vitro using PCR or rolling circle amplification (RCA).Second, the product is cloned into a vector and transformed into ultracompetent cells.Third, the transformation is plated with or without sucrose to quantify sacB mutants or total transformants (respectively).
(Figure 2B).Notably, the same experiment conducted with an ITR-free sacB-cat cassette yielded similar data (Figure S1), suggesting that ITRs did not influence gene of interest (GOI) fidelity.

In vivo DNA production reduces mutation rate per kilobase by several orders of magnitude
We inferred that bacterial growth on media containing sucrose could be attributed to a LOF mutation occurring within the sacB open reading frame (ORF) or its associated promoter, encompassing a total length of 1,533 bp.Due to the uncertainty surrounding the specific mutation sites responsible for LOF mutations, we calculated the mutation rate per nucleotide incorporated by dividing mutation rate by the entire 1,533 bp.This calculation provides only an estimate, as it remains unknown which specific sites are mutated to cause LOF.Our data revealed that replicating DNA in the MBI E. coli strain yields an estimated LOF error rate of approximately 7.9 Â 10 À9 /bp, or 0.00079% per kb (Table 1).For context, Phi29 (applied in RCA of DNA) showed an estimated error rate of 0.58% per kb.Notably, even Q5, advertised as the highest-fidelity PCR polymerase, 8 was more accurate than Phi29 but still exhibited a significantly higher error rate than any of the E. coli strains tested.

DISCUSSION
In this work, we have introduced the SuTox fidelity assay for the sensitive quantification of LOF mutations in DNA, whether generated in vivo or in vitro.The assay's sensitivity primarily hinges on ligation efficiency and competency of the transformed cells.In one instance, a replicate using DH5a fell below the detection limit (BDL), yielding no colonies on the sucrose plate.The other replicates for DH5a were comparable to other in vivo strains, yielding few but quantifiable CFUs on sucrose plates.The BDL replicate contributes to a lower average, but larger variance (Figure S1), and highlights that increasing the efficiency of competent cells could further improve the detection limit of the SuTox assay.The detection limit is also influenced by the number of DNA replication cycles.Since in vivo samples originated from a single transformed plasmid, whereas the enzymatic methods started with nanograms of template DNA, the in-vivo-produced DNA underwent additional replication cycles before assessment, enhancing their detection limit.
Notably, the SuTox assay is template dependent, as the sacB gene is a vital component of the test construct.It is conceivable that some enzymes or strains will exhibit different fidelity for other template sequences.Additionally, it will be interesting in future work to examine fidelity rates of template-free de novo gene synthesis approaches using the SuTox assay (to de novo synthesize the sacB construct), though we expect that those error rates will be high enough to detect by other methods as well. 22,23portantly, the SuTox fidelity assay does not detect mutations that do not impede SacB activity (e.g., silent mutations).Additionally, as not all possible LOF sites within sacB are known, we calculated mutations per nucleotide using the entire length of the promoter and sacB ORF rather than solely the LOF subset.Consequently, the overall mutation rate is likely higher than the LOF mutation rate.This holds true across all tested conditions, and the relative comparisons between DNA synthesis methods remain accurate.Given that the impact on biotechnology primarily stems from mutations affecting  functionality, the LOF mutation rate may be a more useful metric of assessment: LOF reveals functional data influencing outcomes from using the DNA to express protein while ignoring less critical silent mutations.
The data obtained using the SuTox fidelity assay closely align with mutation rates documented in existing literature: particularly, our estimated mutation rate per kb per doubling data (Table 1) is similar to findings using SMRT sequencing for Q5 8 and radioactive nucleotide misincorporation for Phi29. 24Interestingly, our assessment reveals that Taq polymerase shows 5-fold higher fidelity compared to previously reported data. 8,14This discrepancy may stem from potential improvements in enzyme or buffer conditions offered by different Taq polymerase vendors.6][27] These works have demonstrated even higher fidelity for in vivo DNA replication than what we present here.While this could partly be attributed to variations in mutation rates between plasmid and genomic DNA, it is more likely attributable to distinctions in detection limits between the SuTox assay and long-term evolution assays, which involve more DNA replication cycles and employ a large screening template (chromosomal DNA).However, it is important to note that the SuTox fidelity assay offers the advantage of directly comparing replication fidelity between in vivo and in vitro methods, using plasmid templates commonly used in biotechnology.In this direct comparison, the difference in fidelity of in vivo over in vitro replication (approximately 2-3 orders of magnitude) aligns well with previous findings that quantified the benefit of MMR, 7,[11][12][13] lending further credence to our results.
To illustrate the practical significance of these findings, consider an rAAV cis (ITR-GOI-ITR) construct that is 4.7 kb long.Companies using Phi29 to manufacture their rAAV construct could estimate that approximately 2.7% of their final GOI products contain LOF mutations and that more may have mutations with potential consequences other than LOF of the GOI.These non-functional but packaged impurities could potentially increase immunogenicity and reduce dose potency of the final rAAV product.Synthesis of rAAV also requires two additional constructs, "RepCap" (5.5 kb) and "Helper" (13 kb).
Using DNA synthesized with Phi29, the odds of a triply transfected cell having no LOF mutations in any of the three constructs is only 86.5%, compared to 99.98% using DNA produced in E. coli.Similarly, consider a large non-viral gene therapy construct expressing the 12 kb dystrophin gene.Companies using Phi29 could expect approximately 7% of their products to be defective due to a LOF mutation, compared to only about 0.01% for companies using DNA produced in vivo in E. coli.This highlights that-at least in certain instances involving large DNA constructs-producing DNA in vivo provides a significant advantage due to its higher replication fidelity.
Finally, besides the occurrence of LOF mutations, there exists the potential for dominant negative gain-of-function mutations in certain instances.To illustrate, consider a gene therapy scenario involving expression of the tumor-suppressor gene p53.p53 has numerous known dominant-negative mutations that can promote cancerous growth. 28,29Drawing from our LOF data, it becomes evident that the likelihood of encountering dominant-negative mutations and their associated complications is remarkably higher when using DNA starting material synthesized using enzymatic in vitro methods.
Current guidelines for manufacturing plasmids and other DNA materials primarily address their use as therapeutic agents. 30However, these DNA forms (including traditional plasmids, miniplasmids, and linear DNA vectors) also serve as pivotal starting materials in the production of viral vectors and mRNA.As regulatory requirements become increasingly stringent, it becomes imperative to consider diverse quality, and testing needs to harmonize DNA production and procurement practices.Depending on the intended role of DNA in drug development, whether as a starting material, intermediate, drug substance, or drug product, various regulatory agencies could introduce distinct quality and fidelity standards.The SuTox assay described here provides a sensitive, cost-effective method of measuring mutations that influence functional outcomes.It is independent of NGS and can be used in parallel or as an alternate sensitive approach to measure mutation rates.
In summary, using the SuTox assay to directly compare replication fidelity, we have established that DNA produced in vivo in E. coli cells significantly surpassed the accuracy of PCR or RCA methods.While in vitro methods remain invaluable for many applications, for certain instances involving long DNA constructs and industrial scale up to quantities necessary for drug production, in vitro synthesis may introduce cumulative mutation risk.Employing E. coli-based in-vivogenerated DNA, such as plasmid or msDNA, can effectively reduce mutations, thereby mitigating risk and enhancing the overall quality of the final product.

Bacterial strains and plasmids
All bacterial strains were E. coli derivatives.The MBI strain could not be disclosed but yielded similar results to the other commonly used strain backgrounds presented.The starting plasmid, containing the ITR-sacB-cat-ITR construct (and the version without ITRs), was initially generated in Mediphage's proprietary precursor plasmid for manufacturing msDNA, a linear double-stranded DNA molecule with covalently closed ends. 18,19msDNA is bacterial-sequence free but is produced by an in vivo process originating from plasmid DNA.Following replication, the cassette was inserted into pUC19 for transformation on sucrose plates.

DNA synthesis by in vivo or in vitro methods
For in vivo DNA replication, we transformed the starting plasmid into E. coli strains and grew the transformations overnight on selective LB agar plates.Single colonies were inoculated into 2 mL liquid LB cultures, grown shaking for 16 h at 37 C, and then treated with a commercial miniprep kit (OmegaBioTek D6945-02) to obtain in-vivosynthesized plasmid DNA.Since each colony originates as a single bacterium transformed with a single molecule of plasmid, replication from the starter plasmid began from that single plasmid molecule.Therefore, the DNA input was calculated as the mass of one plasmid molecule.The DNA output was calculated as the concentration of the miniprep (obtained using a nanodrop spectrophotometer) multiplied by the elution volume.
For in vitro replication, we designed primers binding just outside of the sacB-cat construct that added either a SacI or SalI restriction enzyme site.Using these primers, we amplified the cassette by PCR using Taq (Froggabio T-500) or Q5 (NEB M0491) polymerases, applying their respective manufacturer-suggested buffers and thermocycling conditions.The same primers were used to guide RCA with Phi29 polymerase (NEB M0269).Following PCR or RCA, the enzymes and buffer reagents were removed using a commercial PCR purification kit (Thermo Scientific K0702) to obtain in-vitrosynthesized DNA.This purification step will help remove unwanted PCR products (e.g., due to non-specific priming, polymerase run off, etc.).Since Phi29 produces multimers that are difficult to purify, we digested the completed Phi29 reaction with SacI and SalI restriction enzymes prior to the PCR purification protocol.The DNA input was calculated as the quantity of starter plasmid added to the reaction as the template multiplied by the length of the amplified region as a fraction of the total plasmid size.The DNA output was calculated as the concentration of the PCR purification (obtained using a nanodrop spectrophotometer) multiplied by the elution volume.

Transformation of synthesized DNA and calculation of mutation rate
The in-vivoand in-vitro-synthesized DNAs were digested with SacI and SalI restriction enzymes (NEB R3156, R3138), as was a pUC19 vector with ampicillin resistance.We used a commercial gel extraction kit (Thermo Scientific K0691) to isolate the sacB-cat fragment and ligated each insert:vector combination overnight with T4 ligase (NEB M0202).We also included a no-insert reaction as a negative control.The ligations were transformed into high-efficiency (1-3 Â 10 9 CFU/mg pUC19 DNA) competent cells (NEB C3040).The transformations were serially diluted and plated on LB with ampicillin (100 mg/mL), chloramphenicol (25 mg/mL), and either 1% NaCl (standard Miller LB) or 6% sucrose.The plates were grown at 37 C for 16-24 h.
CFUs were counted for each sample with and without sucrose.Any CFUs counted from the negative control transformation were subtracted as background for all samples.Samples with no CFUs on sucrose after background subtraction were treated as BDL and calculated as if 0.5 CFUs were on the sucrose plate.CFUs from the sucrose plates were counted as sacB mutants.CFUs on standard LB plates were counted as total transformants.The number of DNA doublings was calculated as log 2 (DNA output/input).Finally, the mutation rate was calculated as the sacB mutants/total transformants/DNA doublings.To paraphrase, this calculates the fraction of DNA molecules that contain a LOF sacB mutation, normalized to how many times the original template was replicated.
The mutation rate calculated from the SuTox method describes LOF mutations in the sacB gene.To obtain an estimate of the number of mutations per bp replicated, we divided the mutation rate by the length of the sacB ORF and promoter (1,533 bp).We then multiplied by 1,000 bp to get mutations/kb replicated, followed by multiplying by 100% to express the values as a percentage.

Figure 1 .
Figure1.Measuring DNA replication fidelity by the sucrose toxicity (SuTox) assay (A) Overview of the template DNA construct.Chloramphenicol resistance (cat) allows for selection of successful transformants.The SacB protein is toxic to bacteria in the presence of sucrose, allowing for positive selection of mutants: faithful replication of the sacB gene results in bacterial cell death, whereas if sacB is mutated (LOF) during DNA synthesis, then a colony will grow on sucrose.Palindromic ITRs were included on the ends to investigate if they influence replication accuracy.(B) Overview of the SuTox process.First, the sacB-cat DNA construct is replicated in vivo in bacterial cells or in vitro using PCR or rolling circle amplification (RCA).Second, the product is cloned into a vector and transformed into ultracompetent cells.Third, the transformation is plated with or without sucrose to quantify sacB mutants or total transformants (respectively).

Figure 2 .
Figure 2. DNA manufactured in E coli has fewer LOF mutations than any enzymatic method tested (A) Representative images of sucrose plates with transformations of sacB-cat DNA generated by PCR (Taq and Q5), RCA (Phi29), or in vivo in E. coli (DH5a).For Taq, 1 / 4 of the volume was plated relative to the other images.(B) LOF mutation rates identified using the SuTox method with a sacB-cat cassette containing ITRs.Data are shown as LOF mutant colonies/total transformant colonies/number of DNA doublings.DNA was synthesized in vivo (in two different E. coli strains) or in vitro (PCR with Taq or Q5 polymerases or RCA with Phi29 polymerase).Bars show the average of three biological replicates, and error bars show one standard deviation.One-way ANOVA with Dunnett's test (compared to DH5a): ****p < 0.0001.

Table 1 .
Approximate LOF mutation rates per kilobase synthesized aAs described in the materials and methods, estimated mutation rate/kb was calculated as sacB mutation rate (from SuTox data in Figure2) divided by the length of the sacB ORF and promoter (1,533 bp), multiplied by 1,000 bp/kb, multiplied by 100%.