Genome editing is induced in a binary manner in single human cells

Summary Even when precise nucleotide manipulations are intended, the outcomes of genome editing can be diverse, often including random insertions and deletions. The combinations and frequencies of these different outcomes in single cells are critical not only in the generation of genetically modified cell lines but also in the evaluation of the clinical effects of genome editing therapies. However, current methods only analyze cell populations, not single cells. Here, we utilized the Single Particle isolation System (SPiS) for the efficient isolation of single cells to systematically analyze genome editing results in individual human cultured cells. As a result, we discovered that genome editing induction has a binary nature, that is, the target alleles of cells tend to be all edited or not edited at all. This study enhances our understanding of the induction pattern of genome editing and provides a new strategy to analyze genome editing outcomes in single cells.


INTRODUCTION
Genome editing allows us to manipulate genetic information in basically any type of cell, and has been revolutionary in basic science, agriculture, and medicine. [1][2][3] Genome editing tools were originally designed to cleave target sequences in the genome DNA in the cell, so that genetic manipulations can be introduced into the genome via activated DNA repair pathways at the target sites. 4 These DNA repair pathways include non-homologous end-joining (NHEJ) and homology-directed repair (HDR). 5 Each pathway gives distinct genome editing outcomes. Although base editing and prime editing technologies do not require DNA double-strand breaks to manipulate the genome DNA sequence, they can still never produce only one type of editing. [6][7][8][9] Despite significant progress in the prediction of genome editing outcomes, [10][11][12][13] it has been impossible to produce a sole genetic manipulation. In other words, genome editing outcomes are always mixtures of different modifications of DNA sequences, such as insertions or deletions of different sizes and targeted recombination events. Therefore, for the application of genome editing, it is important to precisely measure its outcomes.
Various types of techniques have been used to analyze genome editing outcomes, 14 including sequencingbased methods (e.g., amplicon sequencing and TIDE 15 ), denaturation-based methods (e.g., T7E1 16 and single-stranded conformational polymorphism [SSCP] assays 17 ), and digital PCR-based methods. 18,19 However, all of these methods are designed to analyze cell populations, not single cells. The combination of fluorescent reporter systems, such as traffic light reporter 20 and flow cytometry, can visualize genome editing results in individual cells, but this setting cannot be applied to endogenous genes. To fully exploit the potential of genome editing, it is critical to grasp the editing outcomes in individual cells. For example, in a hypothetical situation where 50% of a population of diploid cells are WT/HDR heterozygotes and the other 50% are NHEJ/NHEJ homozygotes (Population 1 in Figure 1A), and the other population consists of 50% of WT/NHEJ and 50% of HDR/NHEJ heterozygotes (Population 2 in Figure 1A), the two populations would-as a whole-have identical allelic frequencies of WT, HDR, and NHEJ, whereas the cells would show a totally different composition ( Figure 1A). Therefore, there is a strong demand for an efficient strategy to investigate genome editing outcomes in single cells.
The main reason why analyzing genome editing results in single cells has been so difficult is the limitation in the number of target molecules (i.e., a diploid cell has only two copies of genomic DNA). Even karyotypically abnormal cell lines only have several copies of target DNA sequences per cell at most. This is in clear contrast to the recent advancement of the single cell transcriptome, 21 where a typical mammalian cell has 10 5 mRNA molecules to analyze. 22 Recently, the Tapestri system to deep-sequence genomic DNA of single cells was applied to analyze genome editing outcomes. 23 Although this is an efficient strategy, sequencing with genomic DNA of single cells as a starting material is always at risk of losing alleles, as their copy number is so limited. Therefore, it is extremely challenging to directly analyze the genomic DNA sequences in single cells. However, there are two clear advantages in the analysis of genomic DNA: one is that cells can replicate their own genome DNA as long as they are alive and proliferate; the other is that its sequence does not change, in contrast to the repertory of mRNA that changes dynamically depending on the cell state. Therefore, we decided to systematically isolate clones from a pool of genome edited cells. To accomplish this task, we utilized the Single Particle isolation System (SPiS) (On-chip Biotechnologies). The SPiS is an automated single cell dispenser that can gently and accurately dispense single cells in multi-well plates using image recognition technology to monitor the number of cells in aliquots. 24,25 The SPiS enabled us to isolate clones derived from genome edited cells with unprecedented efficiency. Our new pipeline to analyze genome editing based on the SPiS and findings from it will greatly contribute to improving the understanding of how genome editing occurs in the cell. (C) A CGH analysis of HEK293T cells in comparison to diploid human iPS cells. The relative CGH signal of HEK293T cells normalized to that of diploid iPS cells is shown throughout the genome. (D) CGH copy number peak assignment. In the CGH analysis, there were several peaks of the CGH signal ratio between HEK293T cells and diploid iPS cells based on the chromosomal numbers. The highest peak corresponded to three copies per cell, and was set as the baseline of the CGH log ratio between HEK293T cells and iPS cells. (E) Line of fit between the copy number and the CGH log ratio based on the peak assignment shown in (D). The copy number and the CGH log ratio showed a clear linear correlation. (F) Scattered plot of relative CGH signal of HEK293T cells in comparison to diploid human iPS cells around the RBM20 gene. The relative CGH signals of HEK293T cells normalized by that of iPS cells are represented by +. No microduplications or microdeletions were detected around the RBM20 locus.

Measurement of the copy number of RBM20 in HEK293T cells
First, we measured the copy number of our target gene, RBM20, per HEK293T cell to comprehend genome editing outcomes in single cells, because cell lines often have abnormal karyotypes. Therefore, we combined karyotyping and a comparative genomic hybridization (CGH) analysis to precisely estimate the copy number. We first karyotyped a total of 10 HEK293T cells, and found extensive chromosomal abnormalities ( Figure 1B). The most frequent chromosomal number was three (Table S1). We also found that five cells each had three and four chromosome 10s where RBM20 is located, respectively (Figures 1B and S1A).
Next, we analyzed the genome DNA of HEK293T cells by a CGH analysis. We compared HEK293T cells with WTC11 iPS cells that had been confirmed to be diploid. 19 The CGH analysis also demonstrated the chromosomal abnormalities of HEK293T cells, as we revealed by karyotyping ( Figures 1B and 1C). We detected peaks of the CGH signal corresponding to the chromosome number in HEK293T cells ( Figure 1D). Because the most frequent chromosome number was three (as determined by karyotyping), the most frequently observed peak of the CGH signal corresponded to three copies. We were able to incrementally assign copy numbers to these peaks to draw a line of fit between the relative CGH signal intensity and the copy number ( Figures 1D and 1E). By applying the relative median signal intensity of HEK293T cells to WTC11 iPS cells around the RBM20 gene (Figures 1F, S1B, and S1C) to this line of fit, we estimated the average copy number of RBM20 to be 3.55 (Table S2), which matched the karyotyping results ( Figure 1D).

Efficient isolation of HEK293T cell clones driven by the SPiS
To systematically analyze the genome editing outcomes in individual cells, we established an efficient pipeline to isolate cell clones that had gone through the genome editing process. Using CRISPR-Cas9 and a single-strand DNA donor in HEK293T cells, as we described previously, 18 we introduced the RBM20 R636S mutation (c.1906C>A, chr.10), which causes inherited dilated cardiomyopathy 26 ( Figure 2). We used pX458 (Addgene plasmid #48138) to express EGFP via the T2A peptide fused to Cas9 in combination with gRNA. Therefore, the expression of EGFP guaranteed the expression of the CRISPR components in the same cell ( Figures 2B and S2A). To minimize damage in the sorting of these EGFP-positive cells, we used a microfluidic cell sorter, On-chip Sort. 27 We confirmed that the sorting of EGFP-positive cells successfully enriched genome-edited cells ( Figure S2B). These sorted cells were then subjected to the single cell isolation process using the SPiS, in which 384 cells were individually plated into four 96-well plates. The SPiS aspirates diluted cell suspension into a microtip and analyzes the images (taken by a CCD camera) of the contents. The camera takes two images with a 1-s interval; thus, the contents in the microtip go down by gravity. The SPiS analyzes the two images to measure the velocity of precipitation and the size of the contents ( Figure 2C). By doing this, the SPiS can discriminate cells from other objects (e.g., cell debris and dust). The SPiS dispenses contents of a microtip into a culture dish, only when the system recognizes one cell. After two to four weeks of culture, genomic DNA of HEK293T cell clones derived from single cells was harvested for amplicon sequencing to analyze the outcomes of genome editing ( Figure 2B).

Successful analysis of genome editing outcomes in single HEK293T cells
We conducted single cell isolation processes using the SPiS three times (Cas9 No.1 to No. 3 experiments), and obtained >90 clones in each attempt ( Figure 3A). We performed amplicon sequencing on the RBM20 target site and analyzed the outcomes in isolated clones using CRISPResso2 28 ( Figure S3A). We defined the clean RBM20 R636S (c.1906C>A) substitution and insertions or deletions as ''HDR'' and ''NHEJ'' events, respectively, whereas combinations of both in the same allele were defined as ''HDR + NHEJ'' events ( Figures 2A and S3B). HDR + NHEJ events occurred at a relatively low frequency in comparison to NHEJ, and the HDR frequency was even lower ( Figure 3A). Based on these HEK293T cell clones with edited alleles, we were able to estimate the overall allelic frequencies of each genome editing event (i.e., WT, NHEJ, HDR, and HDR + NHEJ) ( Figure 3B). Thus, our SPiS-based approach enables us to systematically analyze genome editing outcomes in individual cells. iScience Article Cas9 and gRNA ( Figure 3A). These observations were quite surprising considering the fact that these cells have three or four copies of RBM20 on average ( Figure 1).

Binary induction of genome editing in single cells
To address whether this trend is significant, we mathematically calculated the number of HEK293T cell clones with different genome editing outcomes assuming that these events occurred randomly. Because the copy number of RBM20 was 3.55 ( Figure 1), we built two models with three or four copies of RBM20 ( Figures 3C and S4). In these models, we calculated how many cells with three or four copies of RBM20 should have specific genotypes based on the overall frequencies of WT, NHEJ, HDR, and HDR + NHEJ alleles. We compared the proportions of clones that are expected to be unedited to remain as WT or fully edited by NHEJ between the two models and the actual observation ( Figures 3A and 3C). As we expected, the proportions of actually isolated clones were significantly higher than those based on the models for both WT and full NHEJ clones ( Figure 3D). In addition, among 330 clones, we observed one clone in which iScience Article all target alleles were modified by HDR ( Figure 3A). No such clone was expected in the mathematical models ( Figures 3C and S4). These results suggested that NHEJ is induced in a binary manner, in which cells show either no editing at all or full editing.

HypaCas9 induces HDR more efficiently than Cas9 in single cells
Because the frequency of HDR was so low with Cas9, we were not able to investigate how HDR is induced in single cells ( Figure 3). HDR is often more desirable than NHEJ, as HDR can introduce precise manipulation of the DNA sequence. Therefore, we decided to apply our SPiS-based analysis to genome editing conditions that are more favorable for the induction of HDR. We found that Cas9 with improved proof-reading (e.g., HypaCas9, 29 Cas9-HF1, 30 and eSpCas9 31 ) induced more HDR than Cas9 in our previous cell population-based assay; 32 thus, we targeted the same RBM20 R636S mutation in HEK293T cells using HypaCas9, iScience Article and conducted the SPiS-based single cells assay three times. We confirmed that HypaCas9 produced more clones with HDR than Cas9 ( Figure 4A). We calculated the allelic frequencies of WT, NHEJ, HDR, and HDR + NHEJ ( Figure 4B), and compared these between Cas9 and HypaCas9. We found that the HDR allelic frequency was higher with HypaCas9 than with Cas9; however, the WT, NHEJ, and HDR + NHEJ allelic frequencies were comparable ( Figure 4C). Based on these overall allelic frequencies, we built mathematical models of HEK293T cell clones with genome editing by HypaCas9 in the same way as for Cas9 ( Figures 4D  and S5). The binary manner of genome editing induction was also observed with HypaCas9, as the WT and full NHEJ clone proportions were significantly higher than the mathematical models ( Figure 4E). Moreover, binary induction was also observed in HDR. Among 422 clones, we isolated a total of six clones with all alleles edited by HDR ( Figure 4A). Because the overall frequency of HDR was 5.89% ( Figure 4C), no such clone was expected to be isolated if genome editing events occurred randomly in single cells (1 out of 4894 cells would be expected to have having three HDR alleles and 1 out of 83,088 cells would be expected to have four HDR alleles). To exclude the possibility that different genome editing outcomes affected the growth or survival of the cells, we induced the RBM20 R636S mutation in HEK293T cells by HypaCas9 and investigated how the overall frequencies of the WT, NHEJ, and HDR alleles changed after 25 days of continuous passaging and culture. If NHEJ or HDR gave advantage or disadvantage in growth or survival to the cells, we would observe a significant increase or decrease in the frequency of the NHEJ or HDR alleles. However, we observed only slight increase in the WT allelic frequency (75.8%-81.6%, Figure S6). Both the NHEJ (21.8-17%) and HDR (2.3%-1.2%) frequencies slightly decreased, the difference was not statistically significant ( Figure S6). Therefore, we concluded that the influence of different genome editing outcomes on cell growth or survival was marginal if any. Thus, we confirmed the binary induction of genome editing, even with HypaCas9, which induces more HDR than Cas9. The binary mode was observed in the induction of both HDR and NHEJ.
HDR is more often accompanied by NHEJ than WT As described above, we identified more clones fully edited by HDR than mathematically expected if the induction of HDR had been random ( Figure 4). However, we noticed that clones partially edited by HDR often had NHEJ alleles together. Therefore, we calculated the proportions of clone that would be expected to have only HDR and WT alleles, as these clones represent partial editing by HDR. Even though 2.5-3.5% of the cells were expected to have only HDR and WT alleles, we isolated no such clones ( Figures 4A and 4F). We further investigated the expected and observed the proportions of clones with both HDR and NHEJ events, including any clones with the HDR + NHEJ alleles, as they represent HDR and NHEJ events induced in the same alleles. As we expected, the proportions of clones with both HDR and NHEJ events that were observed and those that were predicted by the models were comparable ( Figure 4F). These results coincide with the fact that cells tend to have all target alleles edited when genome editing was induced, so partial editing is rare ( Figure 4E). Moreover, HDR and NHEJ are often induced together in single cells ( Figure 4F).

Genome editing is also induced in a binary manner in ATP7B and GRN
We next addressed whether the observed binary fashion of genome editing stands in other target sites. Therefore, we targeted ATP7B and GRN to introduce R778L (c.2333G>T, chr.13) 33 and R493X (c.1477C>T, chr.17) 34 mutations, respectively into HEK293T cells as described previously. 18 HypaCas9 and single stranded donor DNA were used in the same way as for RBM20 R636S mutagenesis ( Figures S7A and S7B). We isolated clones and performed amplicon sequencing to quantify the frequency of genome editing ( Figures 5A and 5D). To build models for genome editing of ATP7B and GRN, we quantified the copy numbers of these genes, which were determined to be 2.88 for ATP7B and 3.54 for GRN (Figures 1, S7C, and S7D). We then calculated the expected clone numbers with the assumption that genome editing events occurred randomly in the cells to build models for genome editing of ATP7B and GRN ( Figures S8A and S8B), and compared them to the observed results. Similar to the RBM20 R636S mutagenesis, the number of clones in which all target alleles were edited by NHEJ or not edited at all was significantly higher in comparison to the models ( Figures 5B and 5E). We also isolated clones in which all target alleles were edited by HDR (8/280 and 2/276 for ATP7B and GRN, respectively) ( Figures 5A and 5D). This was particularly noteworthy for GRN, as the overall efficiency of HDR induction was only 1.02% ( Figure 5D). Collectively, these results demonstrate that the binary nature of genome editing is not restricted to a specific target site. Moreover, we did not isolate any clones with partial HDR editing in ATP7B or GRN; however, HDR and NHEJ were often induced together in single cells, similarly to in RBM20 ( Figures 5A, 5C,  and 5D). We were able to apply our SPiS-based system to isolated HeLa cell clones that had gone through the genome editing process, although HeLa cells exhibited lower cell survival in comparison to HEK293T cells (Figures 6A-6G). We measured the copy numbers of these target genes by combining the karyotyping and CGH analyses in HeLa cells ( Figure S9 and Table S3). Based on these copy numbers, mathematical models of clones with different genotypes were built in the same way as for HEK293T cells ( Figure S10). We found that the observed proportions of clones that were unedited or fully edited by NHEJ were significantly higher than in the calculated models based on the overall allelic frequencies in all three genes ( Figures 6B, 6D, and 6G). The HDR frequency was higher in ATP7B than in RBM20 or GRN in HeLa cells, and we isolated 10 clones fully edited by HDR out of 234 clones ( Figure 6C). In ATP7B, the proportion of clones with partial editing by HDR was significantly lower in comparison to the model, whereas that of clones with HDR accompanied by NHEJ was comparable to the model ( Figure 6E). These results indicate that the binary nature of genome editing induction is shared between HeLa cells and HEK293T cells.

Binary genome editing is less evident in PC9 cells
Finally, we investigated the binary nature of genome editing in another cell line, PC9 cells. We introduced the same three mutations into PC9 cells and conducted our SPiS-based analysis ( Figures 7A-7F). We noticed that the genome editing efficiency in PC9 cells was generally lower than that in HEK293T cells or HeLa cells, and that HDR was barely induced ( Figures 7A, 7C and 7E). We measured the copy numbers of the target genes in PC9 cells ( Figure S11 and Table S4). Based on these copy numbers, the mathematical models of clones with different genotypes were built for PC9 cells ( Figure S12). We compared the proportion of clones that were unedited or fully edited by NHEJ between the observations and the models. We found that the trend regarding the binary induction of genome editing was noticeable but less evident, as the proportions of clones fully edited by NHEJ in RBM20 and clones unedited in ATP7B were the only significant differences between the observed clones and the models ( Figures 7B, 7D, and 7F). Therefore, binary induction is a general feature of genome editing; however, the extent of binary induction can vary in different contexts.

DISCUSSION
Most of the current methods used to analyze genome editing outcomes deal with cell populations. In this study, we were able to isolate single cell clones with an unprecedented efficiency by the SPiS. The SPiS not only increased the number of clones that we were able to isolate, but also ensured that the isolated clones were derived from single cells based on the image analysis, which avoided plating multiple cells together. This original system allowed us to investigate genome editing outcomes in single cells.
We were surprised to observe that most HEK293T cells were either unedited WT or had all targeted alleles fully manipulated by Cas9. The number of partially edited cells that had both unedited WT alleles and edited alleles was limited ( Figure 3). We previously reported that Cas9 variants with enhanced proof-reading are more efficient at inducing HDR than regular Cas9. 32 We confirmed that HypaCas9 produced more HDR in single cells than Cas9, and found that HypaCas9 also induced genome editing in a binary fashion ( Figure 4D). This trend was shared among all RBM20, ATP7B, and GRN genes in both HEK293T cells and HeLa cells, and to a lesser extent in PC9 cells (Figures 4, 5, 6, and 7). Because we sorted cells that expressed EGFP together with Cas9 and gRNA, these WT clones escaped from DNA cleavage by Cas9, or precisely repaired the genomic DNA after DNA cleavage. We speculate that each cell has an intracellular environment that is favorable for a particular genome editing outcome, iScience Article so that all target alleles in one cell tend to have the same fate. The average copy numbers of RBM20, ATP7B, and GRN in HEK293T cells, HeLa cells, and PC9 cells were 2.88-3.71 (Figures 1, S1, S9, and S11). The fact that some cells have full HDR or HDR + NHEJ events in roughly 3 to 4 copies despite the much lower overall frequencies also suggests that target alleles in one cell tend to have the same fate (Figures 4, 5, and 6). This binary nature of genome editing can be beneficial if homozygous mutagenesis is necessary. Indeed, we previously reported the isolation of a human iPS cell line with a homozygous PHOX2B Y14Xmutation generated by TALENs, even though the overall induction efficiency of HDR was just approximately 1%. 19 At the same time, however, it makes heterozygous mutagenesis extremely difficult, as we observed that HDR was more often accompanied by NHEJ rather than unedited WT alleles (Figures 4, 5, and 6). A similar trend was reported in mouse Ba/F3 cells, 23 so the binary manner of genome editing induction may be shared among different species, at least among mammals.
At this point, however, the factors that determine which DNA repair pathway a single cell takes remain unknown. One possibility is that the cells respond differently to Cas9 cleavage at different points in the cell cycle. It is known that the frequency of HDR and NHEJ increases in the S/G2 and G1 phases, respectively. 35,36 Therefore, it would be interesting to address whether the cell cycle also influences this binary choice between full editing and no editing in single cells.
PC9 cells showed a much lower efficiency of overall induction of genome editing than HEK293T cells and HeLa cells (Figure 7). The binary fashion of induction of genome editing was also less evident in PC9 cells than in HEK293T cells and HeLa cells. This could be due to the difference in the expression of HypaCas9 protein. We observed a relatively low gene transduction efficiency and low expression level in PC9 cells in comparison to the other two cell lines based on the expression level of EGFP, which was co-expressed with Cas9 via the T2-peptide ( Figure S13). Therefore, it is possible that certain levels of Cas9 expression and activity are necessary for the binary induction of genome editing. Another possibility is that different DNA repair pathways are active in different cell types to yield different genome editing outcomes. In PC9 cells, HDR was barely induced even with HypaCas9, which could be because the HDR pathway is not very active in PC9 cells (Figures 7A, 7C, and 7E). An investigation of the active DNA repair pathways in these cell lines would be an interesting way to address the observed differences in genome editing outcomes.
We are currently applying the same analytic procedure to other cell types to further address the generality of our findings. Our target cells types include human iPS cells, because we are highly interested in the genome editing outcomes of normal diploid human cells. However, the biggest challenge is the low survival rate of iPS cells, especially in single cell culture; 37 thus, we are optimizing the culture conditions. We can also apply different transduction methods, including lipofection, electroporation of plasmids and RNPs, and viral infection. We will test these various parameters by our SPiS-based strategy to figure out general characteristics of genome editing outcomes in single cells, and also best conditions to induce specific types of genome editing.
Our study reveals the previously unknown but fundamental features of genome editing in single cells including the ''binary nature'' of genome editing induction. This was only possible with the SPiS. Our findings contribute to the better understanding of the underlying mechanism of induction of genome editing. Moreover, NHEJ often results in gene disruption; thus, the gene functions in a cell may be lost in clones fully edited by NHEJ. This can be very important in genome editing therapy that is dependent on HDR, as byproducts of NHEJ might hamper the therapy if the function of a certain gene is completely lost in some cells. Therefore, the analysis of genome editing results in single cells is critical in precise evaluation of iScience Article therapeutic effects. Our SPiS also provides researchers with a versatile platform to study genome editing in single cells.

Limitations of the study
In this study, we found that genome editing is induced in a binary manner in single human cells. However, further study is required to clarify whether normal diploid cells (e.g., iPS cells and human primary cultured cells) have the same binary trend.

STAR+METHODS
Detailed methods are provided in the online version of this paper and include the following:  Tables S6 and S7, respectively. All the single strand DNA donors and oligonucleotides for gRNA cloning, which were purified by standard desalting, were purchased from FASMAC, Japan.
Calculation of the copy number of target genes by karyotyping and CGH analysis The karyotyping analysis of the cell lines was performed by Nihon Gene Research Laboratories. Karyotypes were analyzed in 10 cells for each cell line. The CGH assays were performed using the SurePrint G3 Human CGH Microarray Kit (Agilent) and the genomic DNA of WTC11 cells as a reference of the diploid cells. The genomic DNA of the cell lines used in this study for the CGH analysis was extracted using the DNeasy Blood and Tissue Kit (QIAGEN) according to the manufacturers' instructions. The CGH analysis gave peaks of the frequently observed signal intensity, which correspond to the number of chromosomes. Therefore, the chromosomal number most frequently observed in the karyotype analysis was assigned to the most frequent peak in the CGH array (for example, because the most frequently observed chromosomal number was 3 by karyotyping of HEK293T cells, the most frequent peak of the signal intensity in the CGH assay corresponded to 3 copies). Then, we were able to incrementally assign copy numbers to the peaks in the CGH assay around the highest peak to draw a line of fit between the copy number and the CGH peak values. The precise copy number of the target gene was calculated by this line of fit and the signal intensity around the target genes measured by the CGH assay (see also Table S2).

Transfection and cell sorting
For transfection, 2310 5 cells were plated in a well of a 12-well plate coated with 80 mg/mL Growth Factor Reduced Matrigel Basement Membrane Matrix (BD Biosciences). One day later, the cells were transfected with 720 ng of pX458 or pX458-HypaCas9 with a gRNA targeting RBM20, ATP7B, or GRN, and 80 ng of oligonucleotide donor DNA using 2.4 mL of Lipofectamine 2000 (Thermo Fisher Scientific) for HEK293T cells and HeLa cells, and 2.0 mL of Lipofectamine 3000 (Thermo Fisher Scientific) for PC9 cells, respectively, per well, according to the manufacturers' instructions. After 24 hours, cells were detached by 0.25% Trypsin-EDTA (Thermo Fisher Scientific) and EGFP-positive cells were sorted using On-ship Sort (On-chip Biotechnologies). The isolated EGFP-positive cells were dispensed one by one into Matrigel-coated 96-well plates using the On-chip Single Particle isolation System (SPiS). Subsequently, 100 mL/well of conditioned medium was added to the plates with dispended cells. One week after single cell dispensing by SPiS, the medium was changed. Two weeks after dispensing, for HEK293T cells and HeLa cells, the surviving colonies were first detached by Trypsin-EDTA, then the cells were resuspended in DMEM supplemented with 20% FBS and 1% P/S to evenly re-distribute the cells within the wells of a 96-well plate. The cells were cultured in this media until they reached confluence.

Preparation of multiplexed amplicon sequencing library
Genomic DNA was extracted from the cells in 96-well plates as described previously. 32 The DNA was resuspended in 30 mL/well of water. Targeted sites were amplified by the first PCR round using primers with homology to the region of interest and the Illumina forward and reverse adapters (Table S8). The first PCR round consisted of 0.3 mL each of 100 mM forward and reverse primer, 1.0 mL of genomic DNA, 2.0 mL of 2 mM dNTPs, 5.0 mL of 23 PCR Buffer KOD FX (Toyobo), 0.1 mL of KOD FX enzyme (Toyobo), 2 mL of betaine (FUJIFILM Wako Pure Chemical), and 0.3 mL of water. The thermal cycling settings were as follows: 95 C for 2 min, then 30 cycles of (95 C for 30 sec, 60 C for 30 sec, and 72 C for 1 min), followed by a final 72 C extension for 3 min. The first PCR products were diluted by adding 90 mL of water. Then, the DNA barcodes and illumina adaptors were added to the amplicons in the second PCR round, which consisted of 0.3 mL each of 100 mM unique forward and reverse index barcoding primers 38 (Table S9), 0.5 mL of the diluted first PCR product, 2 mL of 2 mM dNTPs, 5.0 mL of 23 PCR Buffer KOD FX, 0.1 mL of KOD FX enzyme, and 1.8 mL of water. The thermal cycling settings were as follows: 98 C for 3 min, then 30 cycles of (98 C for 30 sec, 57 C for 30 sec, and 72 C for 1 min), followed by a final 72 C extension for 3 min. The PCR products were electrophoresed in 2% agarose gel. NucleoSpin Gel and PCR Clean-up Midi kit (MACHEREY-NAGEL) were used to extract the pooled PCR products from 24 samples in 200 mL of water. The DNA concentrations of these library mixtures of 24 samples were quantified using the GenNext NGS Library Quantification Kit (Toyobo iScience Article Amplicon sequencing data analysis Fastq files generated by MiSeq were imported into the CLC Genomics Workbench (QIAGEN). Adapter sequences were removed and demultiplexed using the DNA Index. The data were then analyzed by CRISPResso2 28 (https://github.com/pinellolab/CRISPResso2) in the CRISPResso Batch mode. CRISPResso2 was installed as recommended using a Docker containerization system. The commands for the CRISPResso2 analysis used in this study are listed in Table S5. Alleles with %5% frequency were excluded from the analysis as they were expected to be either sequence errors or very minor populations generated during cell proliferation from single cells. In the CRISPResso2 analysis, any alleles with deletions spanning the nucleotides for single nucleotide substitutions were characterized as "ambiguous". Therefore, we classified these ''ambiguous'' alleles as NHEJ alleles in this study.

Time course analysis of allelic frequencies
The RBM20 gene in HEK293T cells was edited induce the R636S mutation by HDR in the method described above. The edited cell population was passaged every 4-5 days at 1:30 ratio, and the WT, NHEJ, and HDR allelic frequencies were analyzed by droplet digital PCR (ddPCR) on day 0 (D0) and day 25 (D25). The ddPCR assay for RBM20 used in this study was described in our previous publications in detail. 18,19 Creating a model in which alleles were evenly distributed Models assuming that the WT, NHEJ, HDR, and HDR+NHEJ alleles were randomly induced in all target alleles were built by distributing these edited alleles at their observed overall frequencies. The number of target alleles per cell was calculated by karyotyping and a CGH analysis, as described above. For example, when ATP7B was targeted in HEK293T cells, one cell has three ATP7B alleles. If the overall frequencies of WT, NHEJ, HDR, and HDR+NHEJ were 40%, 30%, 20%, and 10%, respectively, and 100 HEK293T cell clones were isolated, the expected number of clones with WT, NHEJ, and HDR alleles would be two (1003(0.4 3 0.330.2) = 2.4, rounded up to 2). We calculated the expected clone numbers of all the possible combinations of WT, NHEJ, HDR, and HDR+NHEJ alleles, and combined them to build the models.

QUANTIFICATION AND STATISTICAL ANALYSIS
The transfection and single cell dispensing experiments were performed in triplicate (three biological replicates). Clones isolated by the SPiS that showed no amplification of target sites by PCR or no successful alignments of amplicon sequencing reads by CRISPResso2 28 were excluded from the analyses. Values are displayed as Mean G Standard Error (S.E.). Statistical significance between two groups was assessed by a non-paired two-tailed Student's t-test and displayed on the figures with asterisks as follows: *p < 0.05; **p < 0.01. NS: not significantly different (p > 0.1).