Dear Editor,

The CRISPR/Cas9 gene editing method has been successfully applied to modify genomes in many organisms [1]. However, several critical issues remain unresolved and have become major hurdles to its broad application [1]. First, editing efficiency varies widely at different genetic loci and some targeted sites are resistant to editing for unknown reasons. Even for the same gene, editing efficiency differs greatly at different positions. Second, generation of undesirable insertions or deletions (InDels) at the target sites constitutes a major issue in precise genome editing and gene therapy. Third, one-step homozygosity in precise gene editing is extremely rare but highly desirable for genetic and functional analyses, especially in systems where genetic manipulations are not possible or are tedious and time-consuming, such as cultured cells and mammals.

A Cas9-based technique has been established to generate precise nucleotide changes in the C. elegans genome using single-stranded oligonucleotides as donor templates [2,3,4,5,6,7,8]. In this system, the eft-3 gene promoter is used to drive Cas9 protein expression in the C. elegans germline [9,10]. An SV40 Large T-antigen nuclear localization signal (NLS) is added to the C-terminus of Cas9 to usher it to the nucleus. Precise nucleotide alterations were achieved at multiple target sites with varying efficiencies, but not at some other loci tested [2,3,4,5,6,7].

To investigate the causes of varying editing efficiencies, we examined the editing efficiency at four different target sites in the ben-1 gene by screening for ben-1 loss-of-function mutations, which are dominant suppressors of the Uncoordinated (Unc) phenotype induced by benomyl treatment [2]. We attempted to alter nucleotides at positions 22, 153, 1340 and 1499 of the ben-1 gene to create a stop codon (Supplementary information, Figure S1A and Data S1). Using a construct, Peft-3::Cas9-SV40_NLS::tbb-2 3′ UTR [9], which expresses Cas9 with a C-terminal SRAD-NLS tag (named Cas9 I for simplicity; Figure 1A), we obtained non-Unc F1 progeny only from P0 animals injected with the DNA mixture expressing Cas9 I and sgRNA targeting the ben-1(153) position along with the cognate oligonucleotide template (Supplementary information, Figure S1B and Data S1). Approximately 20% of the non-Unc animals contained precise ben-1(153) knock-ins, constituting 3% of the F1 progeny (Supplementary information, Figure S1B and S1C). Despite multiple attempts, we could not obtain any non-Unc F1 progeny using sgRNAs targeting the ben-1(22), ben-1(1340) and ben-1(1499) positions (Supplementary information, Figure S1B).

Figure 1

A new combination of Cas9 and sgRNA greatly improves editing efficiency and fidelity and drives one-step homozygosity. (A) A schematic diagram of Cas9 proteins with different C-terminal tags. The black boxes depict the exons of the Cas9 coding region and the blue lines depict the introns. (B) The editing efficiencies at the ben-1(153) position mediated by seven different Cas9 proteins shown in A. The normalized percentage of precise knock-ins identified from the total F1 animals screened is also shown. (C) A diagram of sgRNA(F+E) and its target DNA. The sequences shared by regular sgRNA and sgRNA(F+E) are in blue. An A-U base-pair flip and an extension of the first stem-loop in the scaffold of sgRNA(F+E) are highlighted in yellow and red, respectively. (D) Editing efficiencies of different combinations of Cas9 I or Cas9 II with regular sgRNA or sgRNA(F+E) at the indicated positions. For editing experiments at the ben-1(22) position, editing results were scored and presented as in B. For editing experiments at the drp-1(118) position and the ced-9 locus, a co-CRISPR method, which includes a driver sgRNA proven to work well in editing, was used to facilitate the identification of edited F1 animals [3,4]. The ben-1(153) sgRNA was used as a co-CRISPR driver. (E) Comparison of the editing efficiencies in generating F1 homozygotes with the indicated edits using different combinations of Cas9 I or Cas9 II with sgRNA or sgRNA(F+E). The co-CRISPR method was used to enrich F1 edited animals, as in D.

We then tested another Cas9 construct, pDD162 (named Cas9 II) [10], which uses the same eft-3 promoter to express a Cas9 protein with the identical protein sequence but a slightly different C-terminal tag (Figure 1A). This Cas9 II construct, injected at the same concentration in an otherwise identical injection mixture, produced non-Unc F1 progeny at all four ben-1 positions (Supplementary information, Figure S1B). Sequencing analyses confirmed precise editing at all four targeted sites (Supplementary information, Figure S1C-S1F), indicating that Cas9 II has a much better editing ability. Importantly, Cas9 II not only notably enhanced the efficiency of editing at the ben-1(153) position, but also greatly improved the fidelity of editing at this position, from 20% with precise knock-ins among sequenced non-Unc F1 progeny using Cas9 I to 90% with precise knock-ins using Cas9 II (Figure 1B). 70% of Cas9 I-edited animals (38/54) at the ben-1(153) site did not have the designed edit and instead contained InDels, including large InDels (Supplementary information, Figure S1C). By contrast, 90% of Cas9 II-edited animals at the ben-1(153) site contained precise knock-ins and the rest had 1-bp InDels, with 3 also containing the designed edit (Supplementary information, Figure S1C). Altogether, a 20-fold increase in F1 precise editing efficiency was achieved at the ben-1(153) position and precise editing was also obtained at the other three ben-1 sites with Cas9 II, indicating that Cas9 II is a superior nuclease for gene editing.

The two Cas9 expression constructs have the same promoter, the same Cas9 protein sequence, and the same 3′ UTR sequence, but their Cas9 coding sequences differ in four aspects: codon usage, the intron number, the intron sequence, and the C-terminal tag. In Cas9 I, a “SRAD” linker sequence is inserted between Cas9 and NLS, whereas in Cas9 II a more flexible “GGSGP” linker is inserted between Cas9 and NLS, with one extra HA tag placed after NLS (Figure 1A). To determine whether one or multiple different elements of these two Cas9 coding sequences affect their editing ability, we generated five additional Cas9 coding sequences (III to VII), with different introns and C-terminal tags (Figure 1A). Gene editing results from using these five new Cas9 constructs along with the same ben-1(153) sgRNA and oligonucleotide template indicate that the intron number or sequence does not alter the editing efficiency of Cas9 (Figure 1A and 1B). The C-terminal tags on Cas9, however, do have a profound effect on both its editing efficiency and accuracy (Figure 1B and Supplementary information, Figure S1G). For example, Cas9 II and Cas9 IV with the same C-terminal tag showed 66% and 76% editing efficiency (benomyl-resistant animals/F1), respectively, which is 4-5-fold higher than those of Cas9 I and Cas9 V with a Cas9 I C-terminal tag (Figure 1B). Importantly, Cas9 II and Cas9 IV showed high editing fidelity (90% and 79% precise knock-ins/sequenced benomyl-resistant F1, respectively), whereas Cas9 I and Cas9 V showed only 20% and 35% editing accuracy, respectively (Figure 1B and Supplementary information, Figure S1C and S1G). Interestingly, removal of the HA motif from the Cas9 II C-terminal tag reduced the editing efficiency of the two resulting Cas9 variants, but not their editing fidelity (Cas9 VI and VII; Figure 1A and 1B), whereas addition of the HA motif to the Cas9 I C-terminal tag markedly decreased the editing efficiency of the resulting Cas9 protein (Cas9 III, Figure 1A and 1B).
These results indicate that the C-terminal tag attached to Cas9 unexpectedly plays a critical role in determining both the editing efficiency and accuracy of Cas9 and that a flexible GGSGP linker between NLS and Cas9 is important for robust and accurate editing.
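The relationship between editing efficiency, fidelity and the overall rate of precise knock-ins can be checked with simple arithmetic. The Python sketch below is only an illustration: it assumes that the fraction of F1 animals carrying a precise knock-in can be approximated as efficiency multiplied by fidelity, the function name is hypothetical, and the input values are the percentages quoted in the text (66% and 90% for Cas9 II, 76% and 79% for Cas9 IV, and 3% precise knock-ins per F1 for Cas9 I).

```python
def precise_knockin_rate(efficiency, fidelity):
    """Approximate fraction of all F1 animals carrying a precise knock-in.

    efficiency: benomyl-resistant F1 / total F1 screened
    fidelity:   precise knock-ins / sequenced benomyl-resistant F1
    Assumption: the sequenced animals are representative of all
    benomyl-resistant F1, so the two fractions can be multiplied.
    """
    return efficiency * fidelity


# Values quoted in the text for the ben-1 position 153 experiments.
cas9_ii_rate = precise_knockin_rate(0.66, 0.90)
cas9_iv_rate = precise_knockin_rate(0.76, 0.79)

# For Cas9 I the text reports the overall rate directly (3% of F1 progeny).
cas9_i_rate = 0.03

# Fold change in precise knock-ins per F1, Cas9 II relative to Cas9 I.
fold_increase = cas9_ii_rate / cas9_i_rate

print(f"Cas9 II precise knock-ins per F1: {cas9_ii_rate:.1%}")
print(f"Cas9 IV precise knock-ins per F1: {cas9_iv_rate:.1%}")
print(f"Cas9 II vs Cas9 I fold increase:  {fold_increase:.1f}")
```

Under these assumptions the product for Cas9 II is close to 59%, and its ratio to the 3% reported for Cas9 I is close to 20, which agrees with the roughly 20-fold increase in precise editing reported for this position.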

We then examined whether placement of an NLS at the N-terminus of Cas9 also affects its editing ability. Placing an NLS immediately before Cas9 II (Cas9 VIII) drastically reduced its editing efficiency (from 66% to 4%) at the ben-1(153) position (Supplementary information, Figure S1H-S1J). Addition of an NLS immediately before Cas9 (Cas9 IX), which has no tag at its C-terminus, resulted in an even worse editing efficiency (Supplementary information, Figure S1H-S1J). Insertion of a flexible linker, GGSGP, or its reversed version (PGSGG), between NLS and Cas9 in Cas9 VIII, marginally improved the editing efficiency of the two resulting Cas9 proteins (Cas9 X and Cas9 XI) over Cas9 VIII (Supplementary information, Figure S1H-S1J). Together, these results indicate that an N-terminal NLS tag on Cas9, even with a flexible linker, is detrimental to Cas9 activity.

A structurally optimized sgRNA, sgRNA(F+E), was designed to improve imaging of genomic loci in cells by a GFP-tagged, nuclease-defective Cas9 protein [11]. Two modifications are made in sgRNA(F+E), in which “F” is an A-U base pair flip that destroys a potential polymerase III terminator (UUUU) and “E” is a 5-bp extension of the Cas9-binding hairpin structure that likely improves the assembly of the sgRNA/Cas9 complex (Figure 1C). We tested whether sgRNA(F+E) improves gene editing at two different loci, the ben-1(22) position and the drp-1(118) position (Figure 1D). Editing at these two sites was unsuccessful using Cas9 I and occurred at a low-to-moderate frequency using Cas9 II (Figure 1D). sgRNA(F+E), when used with Cas9 I, did not produce successful editing at either target site. By contrast, using sgRNA(F+E) with Cas9 II led to a 32-fold increase in editing efficiency at the ben-1(22) position and a 1.8-fold increase at the drp-1(118) position compared with the sgRNA/Cas9 II combination (Figure 1D and Supplementary information, Figure S2A and S2B). Moreover, using sgRNA(F+E)/Cas9 II, we precisely inserted a 24-bp FLAG-coding sequence right after the initiation codon of the ced-9 gene at a reasonable frequency (4%), which was not achieved in our multiple attempts using the sgRNA/Cas9 I, sgRNA/Cas9 II, or sgRNA(F+E)/Cas9 I combination (Figure 1D and Supplementary information, Figure S2C). We could also precisely insert a large tag, such as GFP, at targeted loci using sgRNA(F+E)/Cas9 II and circular DNA templates with varying efficiencies (data not shown). Therefore, sgRNA(F+E) has the ability to greatly enhance the efficiency of precise editing when combined with Cas9 II.
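The rationale for the "F" modification, removing a UUUU run that could act as a polymerase III terminator, can be illustrated with a short check. The Python sketch below is an assumption-laden illustration rather than the authors' procedure: the function name is hypothetical, and the two 12-nucleotide fragments are illustrative stand-ins for the start of an sgRNA scaffold before and after a single U-to-A change, not sequences taken from this report.

```python
import re


def has_pol3_terminator(sequence, min_run=4):
    """Return True if the sequence contains a run of at least min_run U residues.

    DNA input is accepted as well: T is converted to U before searching.
    """
    rna = sequence.upper().replace("T", "U")
    return re.search("U{%d,}" % min_run, rna) is not None


# Illustrative fragments only (assumed, not from the source).
# The second differs from the first by one U-to-A change, which is one half
# of an A-U base-pair flip; the partner base on the opposite side of the
# stem would change from A to U and is not shown here.
regular_fragment = "GUUUUAGAGCUA"
flipped_fragment = "GUUUAAGAGCUA"

print(has_pol3_terminator(regular_fragment))
print(has_pol3_terminator(flipped_fragment))
```

In this toy case the single-base change breaks the four-U run, so the first fragment is flagged and the second is not. The same check could be applied to a full scaffold or to a guide sequence before use.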

Remarkably, the sgRNA(F+E)/Cas9 II combination also produced a low but significant frequency (3%) of drp-1(D118A) homozygous mutants in the F1 progeny of the injected animals, wherein both chromosomes of the F1 animals were precisely edited (Figure 1E and Supplementary information, Figure S2D). This one-step homozygosity in template-mediated precise gene editing is very rare and thought to be extremely difficult to achieve [3,4]. It occurs rarely in other systems [1] and is highly desirable in experimental systems not amenable to genetic manipulations.

We tested whether sgRNA(F+E)/Cas9 II can generate precisely edited homozygous F1 mutants at two other loci, the fis-2 and ubr-1 genes (Figure 1E), and were able to obtain a reasonable frequency (6%-7%) of precise F1 homozygous mutants (Figure 1E, Supplementary information, Figure S2E and S2F). As we failed to obtain F1 homozygous drp-1(D118A) and ubr-1(M80ochre) mutants using sgRNA/Cas9 II, despite getting substantial percentages of F1 heterozygous drp-1(D118A) and ubr-1(M80ochre) mutants (Figure 1E), sgRNA(F+E), which works in multiple organisms [11], appears to be a key driver for one-step homozygosity in gene editing. This unique sgRNA(F+E)/Cas9 II combination should enable generation of homozygous F1 mutants at most genetic loci in C. elegans and may be applicable to other experimental systems.

Finally, we used database analysis and molecular dynamics (MD) simulation to investigate how different C-terminal tags affect the editing efficiency of Cas9. Both analyses suggest that the GGSGP linker is flexible and capable of adopting various conformations, while the SRAD linker is more rigid and tends to take a locally bent structure due to a stable electrostatic interaction between its Arg residue and Asp residue (Supplementary information, Figure S2G, S2H and Data S1). Since all Cas9 crystal structures lack a C-terminal tag [12,13,14], we modeled the structures of Cas9 I and Cas9 II by taking the representative conformations of SRAD and GGSGP from simulation trajectories and the structures of NLS and HA from structural databases (Supplementary information, Figure S2G, S2H and Data S1). In the modeled structures, the structurally flexible GGSGP linker allows the highly positively charged NLS sequence in Cas9 II to interact favorably with negatively charged nucleic acids (Supplementary information, Figure S2I), which likely reinforces the interaction between Cas9 and DNA and/or sgRNA and thus enhances the cleavage activity and specificity of Cas9 [12,13,14] (Figure 1B). By contrast, the shorter and locally bent SRAD linker in Cas9 I does not provide sufficient flexibility to facilitate the interaction between the NLS tag and nucleic acids (Supplementary information, Figure S2J). Moreover, the HA sequence at the C-terminus of Cas9 II may further stabilize the formation of the Cas9/sgRNA/DNA ternary complex, resulting in a further increase in Cas9 activity (Figure 1B and Supplementary information, Figure S2I).
Interestingly, a Pro-to-Ala substitution (Cas9 XII), along with substitutions of two negatively charged Asp residues in the HA tag of Cas9 II with two positively charged Lys residues (Cas9 XIII) or two neutral Ala residues (Cas9 XIV) to generate a more flexible HA tag, markedly reduced rather than enhanced the editing efficiency of Cas9 (Supplementary information, Figure S2K-S2M). These findings, and the observations that addition of the HA tag to Cas9 I and removal of the HA tag from Cas9 II and Cas9 IV both compromise the editing efficiency (Figure 1A and 1B), indicate that a C-terminal HA tag also has an important impact on the editing efficiency of Cas9.

The ability to alter at will any site in the genome with high fidelity is the ultimate goal of genome engineering. Multiple studies have reported improved CRISPR/Cas9 systems that significantly increase the efficiency of precise gene editing in C. elegans through improved screening methods [3,4,15], modification of sgRNAs [6,7], inactivation of genes involved in non-homologous end joining [6], and direct injection of in vitro assembled Cas9/sgRNA ribonucleoprotein complexes [8]. In this study, we report a Cas9/sgRNA combination that greatly improves the editing efficiency and fidelity and enables precise editing at all genetic loci tested in C. elegans. Importantly, this robust system also permits one-step generation of precise homozygous mutations at multiple tested target sites with a reasonable success rate, which has not been achieved before. This important technical advance will greatly facilitate genetic and functional analysis in C. elegans, for instance, by allowing effortless construction of double or triple mutations in closely linked genes and facile assembly of homozygous mutants in multiple genes of interest. Surprisingly, the N-terminal or C-terminal addition of short polypeptide sequences to Cas9 has a profound effect on its editing efficacy and fidelity. This finding suggests the possibility of further improving Cas9 editing efficiency and fidelity in C. elegans and other systems by altering or substituting N-terminal or C-terminal tags or sequences.