New additions to the CRISPR toolbox: CRISPR-CLONInG and CRISPR-CLIP

CRISPR has proven to be one of the most versatile molecular tools of our time, predominantly as a precision genome engineering tool. Here, two additions to the ever-expanding repertoire of CRISPR applications are demonstrated for the procurement of donor DNA templates, an indispensable component of precision genome editing. (i) CRISPR-CLONInG (CRISPR-Cutting & Ligation Of Nucleic acid In vitro via Gibson) is devised as a digestion tool that enables efficient cut-and-paste of DNA sequences to construct donor templates (dsDNA and AAV vectors) via Gibson Assembly. Cas9 was used to cut out an undesired DNA segment from an existing plasmid with complex sequences, yielding a vector backbone suited for Gibson Assembly, which ligates multiple DNA fragments without cloning scars or incorrect insert orientation. (ii) CRISPR-CLIP (CRISPR-Clipped Long ssDNA via Incising Plasmid) is devised as a DNA clipping tool to efficiently procure long single-stranded DNA (lssDNA) of up to 2.2 kbases from a donor-carrying plasmid, which can be supplied as a donor template for genome editing in mouse zygotes via Easi-CRISPR. I utilized two different Cas types, namely Cpf1 and Cas9n (the D10A mutant nickase of Cas9), to create two types of incisions at the respective ends of the lssDNA cassette junctions on the plasmid, yielding three independent single-stranded DNA units of different sizes that are amenable to strand separation treatment, followed by clip-out of the target strand through gel extraction. The lssDNA donor is retrieved directly from the plasmid without restriction enzymes or DNA polymerase-based steps; hence it not only retains sequence fidelity but also carries virtually no restriction on sequence composition, further mitigating the limitations of the current Easi-CRISPR method.
In addition, by placing a universal add-on DNA tag sequence carrying a Cpf1-Cas9 duo-PAM on the plasmid at the two sites flanking the lssDNA cassette, the design of CRISPR-CLIP can be further simplified and made applicable to constructing lssDNA templates targeting any genomic sequence. Altogether, the two CRISPR-based methods presented herein offer solutions to technical challenges frequently encountered in procuring donor DNA for genome editing.


Introduction
The versatility of the CRISPR system is attributable to the simplicity of a Cas nuclease guided by a single programmable RNA (1), coupled with a unique spacer sequence for precise target navigation. The commonly used CRISPR protein, SpCas9, recognizes an NGG PAM, which occurs once in every 42 bases in the human genome (2). Mutant versions of Cas9, such as those from the xCas9 group, recognize the even shorter PAM 'NG' (3), relaxing the PAM stringency to allow binding at every G (or, on the complementary strand, every C). Alternatively, for AT-rich sequences, Cpf1 can be used instead (4), so that the available PAMs collectively cover all four nucleotides of the DNA sequence. Together with Cas9's stability and its ATP-independent catalysis of DNA cleavage (5), these attributes have spurred dynamic and wide-ranging applications of the CRISPR system, including its use as a robust tool for precision genome engineering in mammalian cells (6,7).
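To make the PAM requirements described above concrete, the following is a minimal sketch that scans both strands of a sequence for candidate PAM sites. The 'NGG' and 'NG' patterns come from the text; the 'TTTV' pattern for Cpf1 is an assumption (the text only states that Cpf1 suits AT-rich sequences), and the function names are hypothetical.

```python
import re

# PAM patterns as regular expressions with lookahead so overlapping hits are found.
# 'NGG' and 'NG' are taken from the text; 'TTTV' for Cpf1 is an ASSUMPTION.
PAM_PATTERNS = {
    "SpCas9_NGG": r"(?=([ACGT]GG))",
    "xCas9_NG": r"(?=([ACGT]G))",
    "Cpf1_TTTV_assumed": r"(?=(TTT[ACG]))",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pams(seq, pattern):
    """List (strand, start) for each PAM hit.

    'start' is the 0-based position on the strand being read 5'->3':
    the input for '+', its reverse complement for '-'.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in re.finditer(pattern, s):
            hits.append((strand, match.start()))
    return hits


if __name__ == "__main__":
    toy = "ATTTAGGCC"  # toy sequence for illustration only
    for name, pattern in PAM_PATTERNS.items():
        print(name, find_pams(toy, pattern))
```

This only lists PAM positions; choosing an actual guide would additionally require checking the adjacent protospacer and off-target sites, which is outside the scope of this sketch.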
CRISPR-mediated genome editing harnesses innate DNA repair mechanisms by using the CRISPR/Cas9 system, which can be easily programmed to induce a DNA break at virtually any locus of interest in the genome. When an exogenous DNA template carrying the desired sequence flanked by arms homologous to the CRISPR cut site is supplied, the desired genetic changes can be precisely and efficiently integrated into the genome of the target organism through the homology-directed repair (HDR) pathway. For creating genetically engineered mouse models, the donor template can be supplied as either a single-stranded oligodeoxynucleotide (ssODN) or a double-stranded DNA (dsDNA) vector (8). Chemically synthesized ssODNs function as efficient donors in zygotes, yet technical difficulties limit the length of sequence they can carry, making them suitable only for minor genetic alterations (1-50 bp). In contrast, dsDNA donor-mediated editing can accommodate larger-scale genetic modifications (50 bp to 10 kb or even longer), but has routinely been carried out via the mouse embryonic stem cell (mESC) route because of its poor efficiency in zygotes. The recent development of Easi-CRISPR has extended single-stranded donors to ~2 kbases, termed long single-stranded DNA (lssDNA), to introduce modifications over much larger genomic regions in mouse and rat zygotes (9,10). This length suffices for most genetically engineered animal models commonly used in biomedical research (e.g. gene replacement or insertion, humanization).
To improve current methods for constructing donor templates, I developed two CRISPR-based methods, termed CRISPR-CLONInG (CRISPR-Cutting & Ligation Of Nucleic acid In vitro via Gibson) and CRISPR-CLIP (CRISPR-Clipped Long ssDNA via Incising Plasmid), for generating dsDNA (targeting and AAV vectors) and lssDNA templates, respectively. Unless chemically synthesized, the lssDNA template is procured through PCR- or restriction enzyme (RE)-based methods from an assembled dsDNA donor (or a gene-synthesized donor-carrying plasmid), whereas assembly of the dsDNA donor commonly relies on BAC recombineering or multi-step cloning to ligate a vector backbone and multiple DNA fragments that often must be acquired from various existing plasmids. Seamless DNA cloning methods such as Gibson Assembly (11) enable multiple DNA fragments to be ligated and assembled into a vector backbone in one step in vitro without leaving any footprint, providing an ideal alternative to lengthy and laborious conventional methods for rapid assembly of donor vectors. While PCR is routinely used to amplify each DNA component for seamless cloning, the process can be challenging, especially when the source plasmid is long (e.g. amplification of a vector backbone) or carries complex sequences that are highly repetitive or palindromic (e.g. multiple Lox sites in a FLEx vector or ITR sequences in an AAV vector). During PCR amplification, such sequences are prone to forming secondary structures and acquiring mutations, thereby causing amplification and cloning failures or compromising sequence fidelity in the assembled donor. Alternatively, REs can be used to acquire the vector backbone and DNA fragments of interest by excising undesired DNA segments from source plasmids; nonetheless, the chance of finding an RE that cuts exclusively at the precise location is very slim, despite the >3000 REs discovered so far.
The aforementioned technical difficulties apply equally to the lssDNA procurement process. Here, I demonstrate the use of the CRISPR system to circumvent such hurdles in generating dsDNA donors, AAV vectors, and lssDNA templates. The donor templates thus generated can be delivered via various delivery systems into cells, zygotes, or living organisms for genome editing.

CRISPR-CLONInG (CRISPR-Cutting & Ligation Of Nucleic acid In vitro via Gibson)
- Replacing an undesired DNA segment on a FLEx vector with a new DNA sequence
To construct a FLEx (flip-excision) donor vector, I devised CRISPR-CLONInG to customize an existing vector through efficient cut-and-paste of undesired and desired DNA sequences, assembling a new donor of interest. First, CRISPR/Cas9 was used to excise the unwanted Luciferase gene from the existing FLEx vector, yielding a 7.5 kb backbone (Fig. 1A, 1D left) with complex sequences that could not be obtained via inverse PCR amplification (data not shown). Next, PCR primers carrying the respective 20-30 bp complementary Gibson overhangs were used to amplify the DNA sequences of interest, FRT-Neo-FRT (~1.87 kb) and tdTomato (~1.43 kb), from different plasmids (Fig. 1B, 1D right) for insertion into the backbone. The three DNA components thus acquired were ligated together via Gibson Assembly (Fig. 1B, 1C) with a 75% success rate (15 out of 20 clones screened by diagnostic RE digestion showed correct assembly; sequence integrity was validated for three clones). Moreover, the use of CRISPR/Cas9 allowed the Luciferase gene to be excised precisely at its junction sites with the flanking 3'-UTR and IRES sequences (Fig. 1A), with a deviation of only 1 nt that was easily remedied by including 1 extra nt in the primer overhangs. The assembled FLEx vector was delivered into mESCs via transfection and successfully achieved gene editing.
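As an illustration of how primers carrying Gibson overhangs for such an insert could be laid out, here is a minimal sketch. It is not the primer design actually used (those primers are listed in Table 2); the function name, the 20 nt annealing length, and the toy sequences are assumptions, and only the 20-30 bp overhang range comes from the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gibson_primers(upstream, insert, downstream, overhang=25, anneal=20):
    """Build a primer pair that adds Gibson overhangs to 'insert'.

    upstream   : top-strand sequence of the fragment to the left of the junction
    downstream : top-strand sequence of the fragment to the right of the junction
    overhang   : bases copied from each neighbour (20-30 bp per the text)
    anneal     : bases of each primer that anneal to the insert (ASSUMED value)
    """
    # Forward primer: end of the upstream fragment + start of the insert.
    forward = upstream[-overhang:] + insert[:anneal]
    # Reverse primer: reverse complement of (end of insert + start of downstream).
    reverse = reverse_complement(insert[-anneal:] + downstream[:overhang])
    return forward, reverse


if __name__ == "__main__":
    # Toy sequences for illustration only, with shortened overhang/anneal lengths.
    fwd, rev = gibson_primers("AAAACCCC", "GGGGTTTTACGT", "CATGCATG",
                              overhang=4, anneal=4)
    print(fwd, rev)
```

A 1 nt deviation at a Cas9 cut site, as described above, could be handled in such a scheme by adding the missing base between the overhang and the annealing portion of the relevant primer.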
- Replacing the cargo sequence on an AAV vector with a novel donor template
Viral delivery systems serve as potent gene-delivery vehicles and have been the cornerstone of gene therapy. The adeno-associated virus (AAV) system is preferred over other viruses because of its nominal level of adverse immunogenicity in humans (12). In addition, AAV was exploited for targeted gene modification (vs. adeno- and lentivirus systems for transgenic purposes) (13) long before the advent of programmable nucleases, mainly in cell lines, owing to its ability to effectively transduce donor DNA into cells that typically show poor recombination efficiency for gene targeting (14). Combining AAV delivery with the CRISPR system has synergistically increased the capacity of genome engineering to manipulate genetic contexts in living organisms, whereby disease models can be generated within several weeks and made ready for biomedical study (15).
To construct an AAV vector for CRISPR-mediated gene editing in a cell line, I customized an existing AAV vector by replacing part of its cargo sequence with a novel donor DNA template. Because the AAV vector has limited cargo capacity (~4.4 kb) and its backbone contains unusual repeats with palindromic ITR sequences that are prone to PCR amplification failure, an alternative to DNA polymerase-based approaches is crucially needed for efficient cut-and-paste of DNA segments on an AAV vector (suitable RE recognition sites scarcely exist). The source AAV vector consists of a U6-driven CRISPR guide with an sgRNA cloning site, plus Cre and other components (denoted Cre-Comp) that are not needed for the gene editing purpose here. CRISPR-CLONInG was thus devised to use CRISPR/Cas9 to excise Cre-Comp from the AAV backbone (Fig. 2A, 2D left), followed by PCR amplification of the DNA sequence of interest (0.8 kb) from a custom gBlock using primers carrying 30-35 bp Gibson overhangs (I empirically observed that direct use of a gBlock in Gibson Assembly tends to result in poor cloning efficiency). The amplified gBlock fragment, which carries the desired modification (a 12 bp knock-in sequence) flanked by homology arms to the target genomic site (Fig. 2D right), was inserted into the customized AAV backbone (Fig. 2B, 2C) through Gibson Assembly with 50% efficiency (10 out of 20 clones confirmed by diagnostic RE digestion, followed by Sanger sequencing validation of 3 clones). Upon viral packaging of the assembled AAV vector, the recombinant AAV (rAAV) was used to infect a Cas9-expressing mouse neuronal cell line (N2A), where the rAAV serves as a gene delivery vehicle to introduce its cargo into the host cell line for targeted modification. I observed bi-allelic knock-in at approximately 5% editing efficiency (2 out of 50 clones screened by Sanger sequencing), while more than 60% of clones showed heterozygous knock-in (data not shown).

CRISPR-CLIP (CRISPR-Clipped Long ssDNA via Incising Plasmid): for procurement of lssDNA donor
The recent development of Easi-CRISPR (Efficient additions with ssDNA inserts-CRISPR), which uses lssDNA as a donor template to insert large segments of novel DNA sequence (~1.5 kb) or to replace endogenous genes at precise locations in the genome, has enabled CRISPR-assisted genome editing to make strides toward a simpler and more rapid workflow (9,10,16). Building on the observation that short single-stranded DNA oligos (ssODNs) serve as efficient donors for HDR-mediated genome editing in mouse zygotes, Easi-CRISPR extends this approach to lssDNA donors, which shortens the timeline for creating most types of genetically engineered mouse models (F0) to as little as two months.
Various approaches have been proposed for constructing single-stranded DNA templates longer than 200 bases (lssDNA): ivTRT (in vitro transcription and reverse transcription), as demonstrated in the original Easi-CRISPR protocol (17); a dsDNA plasmid-retrieval-based method using REs (BioDynamics Laboratory kit); a PCR-based method that uses a phosphorylated primer to label the undesired DNA strand for degradation (Takara Bio kit); and chemical synthesis provided by commercial vendors (e.g. Megamer by IDT). These methods enable the procurement of lssDNA with sequence fidelity (except ivTRT) and lengths of up to ~2 kb (comprising a ~1.5 kb insert of the desired genetic modifications plus homology arms), which suffices for most genome editing purposes that used to be carried out via the mESC route. However, constraints can arise from the use of REs (efficacy, availability, or undesired RE cuts within the donor sequence) as well as from the technical difficulty of procuring complex sequences by PCR-based methods or chemical synthesis (e.g. donor sequences in which specific nucleotides are present at too high or too low a percentage), any of which can cause lssDNA construction to fail. For such cases, I devised the CRISPR-CLIP (CRISPR-Clipped Long ssDNA via Incising Plasmid) method to retrieve lssDNA directly from the donor plasmid DNA, bypassing the use of PCR and REs.
To create a conditional knock-out (CKO) mouse model for GENE-Y* via Easi-CRISPR, I first obtained the corresponding dsDNA template through gene synthesis (Genewiz). The template (2.2 kb in length), anchored in a default plasmid (pUC57), consists of a floxed cassette of exon 2 flanked by homology arms (HA) encompassing the genomic sequences upstream and downstream of the two respective LoxP sites, where various types of unusual repeats are present (Fig. 3A). To procure the lssDNA donor through the CRISPR-CLIP method, I used Cpf1 and Cas9n (D10A) to make two types of incisions on the plasmid DNA. One dsDNA cleavage and one nick were induced, respectively, at the two junction sites flanking the lssDNA cassette (Fig. 3B), resulting in three stand-alone single-stranded DNA units of different sizes, which were treated with denaturing gel-loading buffer (DGLB) and separated by agarose gel electrophoresis (Fig. 3C). The target strand of interest (i.e. the lssDNA donor) can hence be identified and clipped out through gel extraction.
The integrity (intact length, single-strandedness) of the acquired lssDNA (2.2 kbases) was verified through size separation by gel electrophoresis and by digestion with a dsDNA-specific RE. Specifically, the 2.2 kbase lssDNA was resolved on an agarose gel for size comparison with its 2.2 kb dsDNA counterpart carrying the lssDNA cassette (Fig. 4A). Cas9 (wild-type) and Cpf1 were used to digest the lssDNA-carrying plasmid (comprising the 2.2 kb dsDNA with the lssDNA cassette plus the 2.7 kb pUC57 backbone), resulting in two DNA fragments that migrated at their respective sizes on the agarose gel (lane 'b'), while the 2.2 kbase lssDNA donor migrated at around 1.2-1.3 kb, indicating intact length (lane 'd'). The single-stranded nature of the lssDNA was verified using the dsDNA-specific RE BamHI to digest the lssDNA donor, which carries a BamHI cut site (Fig. 4B): upon BamHI digestion, the lssDNA-carrying plasmid, serving as a positive control, yielded three digested fragments (lane 'c'), whereas the acquired lssDNA yielded no digestion product (lane 'e'), confirming that it is single-stranded. Furthermore, sequence integrity was validated by Sanger sequencing using complementary primers (Fig. 4C). As a negative control, non-complementary primers were used to sequence the lssDNA and failed to produce a correct read, reflecting the purity of the lssDNA (not mixed with dsDNA). The lssDNA yield was ~50% (e.g. digesting 100 µg of plasmid carrying the 2.2 kb dsDNA in the 2.7 kb pUC57 backbone would yield >10 µg of lssDNA). The lssDNA thus procured was supplied as a donor to mouse zygotes via pronuclear microinjection and successfully generated the CKO mouse model (data not shown).
*To maintain confidentiality of the research, the name of the gene was altered.
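The fragment sizes and the yield example quoted above can be checked with a short calculation. The sketch below is an illustration under stated assumptions: DNA mass is taken to be proportional to length (base composition ignored), the few-nucleotide stagger of the Cpf1 cut is ignored, and "~50%" is interpreted as recovery relative to the theoretical maximum mass of one cassette strand. The function names are hypothetical.

```python
def single_strand_units_kb(cassette_kb, backbone_kb):
    """Sizes of the three ssDNA units after one dsDNA cut and one nick.

    The double-strand cut at one junction linearizes the plasmid; the nick at
    the other junction splits only one strand into cassette and backbone
    pieces, while the un-nicked strand stays full length.
    """
    return sorted([cassette_kb, backbone_kb, cassette_kb + backbone_kb])


def lssdna_yield_ug(plasmid_ug, cassette_kb, backbone_kb, recovery):
    """Return (theoretical maximum, expected) lssDNA mass in micrograms.

    ASSUMPTION: one cassette strand accounts for cassette / (2 * plasmid)
    of the total plasmid mass, and 'recovery' is a fraction of that maximum.
    """
    plasmid_kb = cassette_kb + backbone_kb
    theoretical = plasmid_ug * cassette_kb / (2 * plasmid_kb)
    return theoretical, theoretical * recovery


if __name__ == "__main__":
    print(single_strand_units_kb(2.2, 2.7))
    theoretical, expected = lssdna_yield_ug(100, 2.2, 2.7, recovery=0.5)
    print(f"theoretical max: {theoretical:.1f} ug, at 50% recovery: {expected:.1f} ug")
```

Under these assumptions, 100 µg of the 4.9 kb plasmid contains at most about 22 µg of the target strand, so a recovery of about half is consistent with the >10 µg figure given in the text.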

Discussion
The two CRISPR-based methods showcased in the three cases above offer effective approaches to a variety of technical challenges routinely encountered in genome editing. In CRISPR-CLONInG, Gibson Assembly was chosen for seamless cloning (instead of Golden Gate Assembly, which typically requires PCR to add type IIs RE sites) in order to minimize the involvement of PCR, which tends to stumble over long and complex DNA sequences, a common scenario especially for vector backbone amplification. In the two examples presented, I used CRISPR/Cas9 as an excision tool only to acquire the vector backbones and not the vector inserts; nevertheless, CRISPR/Cas9 can certainly be applied to acquiring the latter should PCR amplification fail. In that case, the lack of Gibson overhangs on the vector inserts can be remedied by additionally supplying a short gBlock that carries the complementary sequences spanning the backbone-insert or insert-insert junction for the Gibson Assembly reaction.
CRISPR-CLIP demonstrates a PCR-free and RE-free strategy that imposes no restrictions on sequence complexity for procuring lssDNA donor templates encoding fairly large genetic modifications for rapid generation of gene-modified mice. That said, pre-existing PAMs for CRISPR target sites at which to make incisions on the dsDNA template plasmid may not always be available, especially for Cpf1. To address this issue, an add-on sequence carrying a Cas9-Cpf1 duo-PAM can be placed on the plasmid at the exact junction sites flanking the lssDNA cassette, which streamlines the donor design process. The duo-PAM is also designed to allow either the top or the bottom strand of the plasmid to be chosen as the preferred lssDNA donor. A kinetic study reported a long residency time of Cas9 at the DNA double-stranded break target site, with asymmetric dissociation of the four broken strands: the 3' end of the cleaved DNA strand that is not complementary to the sgRNA (the nontarget strand) is released first, while the other three strands remain tethered to the Cas9-sgRNA complex (18). Such a scenario implies a lag in accessibility among the cleaved DNA strands for initiating strand complementation in the DNA repair process. I hypothesize that recombination could be aided by an lssDNA donor that is complementary to the nontarget strand, exploiting the immediate accessibility of that strand's PAM-distal (3') end. In such a design, the HA homologous to the PAM-proximal end could also be made longer than the HA complementary to the PAM-distal end, to offset possible ssDNA exonuclease degradation of the former. Accordingly, the duo-PAM feature makes it feasible to choose the optimal strand for the lssDNA donor when suitable. As shown in Figure 5A and Table 1, two duo-PAMs (A and B) were created so that the DNA strand of the chosen polarity can be clipped out using the suitable Cas types: the top strand can be incised using Cas9n-A and Cpf1-B (Fig. 5B, 5D), and the bottom strand using Cpf1-A and Cas9n-B (Fig. 5C, 5D). Overall, CRISPR-CLIP with the duo-PAM add-on further simplifies the lssDNA generation process and broadens its applicability to nearly any genomic sequence of interest. Together with CRISPR-CLONInG, the two methods provide platforms for efficient construction of highly customizable donor templates for precision genome engineering.

PCR amplification of vector insert for CRISPR-CLONInG
All primers (Table 2) were ordered from IDT and Eurofins Genomics. Primer pair Neo-F & Neo-R was used to amplify FRT-Neo-FRT from the pL451 plasmid (19); tdTom-F & tdTom-R was used to amplify the tdTomato gene from an existing plasmid. Primer pair AAV-F & AAV-R was used on a custom-synthesized gBlock that carries the 12 bp knock-in sequence flanked by two HAs (~400 bp each). Two different PCR systems were adopted: Herculase II Fusion DNA polymerase (Agilent, part# 600679) and the AccuPrime Pfx DNA polymerase PCR system (Thermo Fisher Scientific, cat# 12344-024). The PCR conditions were 95 ℃ for 3 min; 30 cycles of 95 ℃ for 30 sec, 60 ℃ for 30 sec, and 72 ℃ for 1 min/kb; and 72 ℃ for 5 min. PCR-amplified DNA fragments were subjected to 0.9% agarose gel electrophoresis, and DNA of the expected size was gel purified.
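Because the extension step above is specified per kilobase, the cycling program differs for each amplicon. The following minimal sketch computes it from the stated conditions; the function names are hypothetical, extension times are rounded up to whole seconds as an assumption, and the amplicon lengths are the approximate insert sizes given earlier in the text (the added overhangs are ignored).

```python
import math


def extension_seconds(amplicon_bp, sec_per_kb=60):
    """Extension time at 72 C for 1 min/kb, rounded up to a whole second."""
    return math.ceil(amplicon_bp * sec_per_kb / 1000)


def cycling_program(amplicon_bp, cycles=30):
    """Return the program as (step, temperature in C, seconds, repeats)."""
    return [
        ("initial denaturation", 95, 180, 1),
        ("denaturation", 95, 30, cycles),
        ("annealing", 60, 30, cycles),
        ("extension", 72, extension_seconds(amplicon_bp), cycles),
        ("final extension", 72, 300, 1),
    ]


if __name__ == "__main__":
    # Approximate amplicon sizes from the text, in bp.
    amplicons = {"FRT-Neo-FRT": 1870, "tdTomato": 1430, "gBlock donor": 800}
    for name, length_bp in amplicons.items():
        print(name, extension_seconds(length_bp), "s extension per cycle")
```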

Gibson (HiFi) Assembly for CRISPR-CLONInG
The CRISPR-digested vector backbones (~50 ng each for FLEx and AAV) and the PCR-amplified inserts carrying Gibson overhangs (FRT-Neo-FRT and tdTomato for FLEx; the novel donor template for AAV) were assembled at a ratio of 1:2 with Gibson (HiFi) DNA Assembly Master Mix (NEB, cat# E2621S) following the manufacturer's protocol. The reaction mix was incubated at 50 ℃ for 1 h, and 2 µl of the assembled mix (~5 ng of vector backbone) was transformed into competent cells (NEB, cat# C2987H), which were then spread on antibiotic-selective LB agar plates. Mini-prep (Qiagen, cat# 27104) DNA was digested with appropriate RE(s) as a diagnostic test, followed by Sanger sequencing validation.
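For setting up such an assembly, the insert mass corresponding to a given vector amount can be estimated from fragment lengths. The sketch below assumes that the 1:2 ratio is a vector:insert molar ratio, which the text does not state explicitly, and that mass scales with length; the function name is hypothetical. Only the FLEx case is shown because the AAV backbone length is not given.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector=2.0):
    """Insert mass (ng) giving the requested insert:vector MOLAR ratio.

    ASSUMPTION: the ratio is molar and DNA mass is proportional to length,
    so moles are proportional to mass / length.
    """
    return vector_ng * insert_to_vector * insert_bp / vector_bp


if __name__ == "__main__":
    vector_ng, vector_bp = 50, 7500  # ~50 ng of the 7.5 kb FLEx backbone
    for name, insert_bp in {"FRT-Neo-FRT": 1870, "tdTomato": 1430}.items():
        mass = insert_mass_ng(vector_ng, vector_bp, insert_bp)
        print(f"{name}: {mass:.1f} ng")
```

If the ratio was instead applied by mass, the required insert amounts would simply be twice the vector mass; the calculation above should be read as one possible interpretation.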
The plasmid bearing the three incisions (~10 µg) was column purified (Invitrogen, cat# K220001) to check for digestion; 1 µg, 2 µg and 3 µg of the eluate were each mixed with a 3-fold volume of denaturing gel-loading buffer (DGLB) (Diagnocine, cat# DS611), heated at 70 ℃ for 5 min, flash cooled on ice for 1 min, and resolved by 0.9% agarose gel electrophoresis at a constant 100 volts until the desired migration distance was attained. A sample double digested with Cpf1 and WT Cas9 was included as a control reference. Once separation of the lssDNA of interest on the agarose gel, which requires at least 30 min of staining with EtBr (>0.5 µg/ml), indicated successful digestion, the remaining 90 µg of the digested plasmid was precipitated following a standard protocol. The loading amount of 2 µg per lane, which gave the best separation, was scaled up for extraction using the QIAquick Gel Extraction Kit (Qiagen, cat# 28704).

Validation of lssDNA
The dsDNA plasmid carrying the donor cassette (~400 ng) was cleaved with ctRNPCas9(WT) and cRNPCpf1 (~3 pmole of protein and ~6 pmole of gRNA) at 37 ℃ for >2 h, followed by protein inactivation and gRNA degradation using the same method described in the CRISPR-CLONInG section. A replicate reaction was identical except for the addition of 20 U of BamHI (NEB, cat# R0136S). The CRISPR-cleaved samples with or without BamHI digestion, along with lssDNA (~200 ng) with or without BamHI, were subjected to 0.9% agarose gel electrophoresis at a constant 100 volts. The sense lssDNA (top strand) was sequenced with reverse primers.
... depending on certain factors, such as buffer conditions and potential secondary structure formation due to sequence composition, in which case additional DGLB treatment of the acquired lssDNA is needed for more precise size separation. The integrity of the procured lssDNA is considered good as long as the majority of the lssDNA resolves close to the predicted size.
Table 1. CRISPR guide RNAs and their binding sites. PAMs are in bold. *Partial sequence changed to maintain research confidentiality.