Specificity profiling of CRISPR system reveals greatly enhanced off-target gene editing

To explore the editing specificity of CRISPR/Cpf1 system, effects of target mutation were systematically examined using a reporter activation assay, with a set of single-nucleotide mutated target site. Consistent with our previous study performed with CRISPR/Cas9, a “core” sequence region that is highly sensitive to target mutation was characterized. The region is of 4-nucleotide long, located from +4 to +7 position of the target site, and positioned within a positively charged central channel when assembled into Cpf1 endonuclease. Single-nucleotide mutation at the core sequence could abolish gene editing mediated by a however active sgRNA. With a great majority of the target sites, a kind of ‘super’ off-target gene editing was observed with both CRISPR/Cpf1 and CRISPR/Cas9. For a given target site, mutation at certain positions led to greatly enhanced off-target gene editing efficacy, even up to 10-fold of that of the fully-matched target. Study further found that these effects were determined by the identity of target nucleotide, rather than the nucleotide of crRNA. This likely suggests that the interactions between target nucleotide and the endonuclease are involved in this process.


Results
To explore gene editing specificity of CRISPR/Cpf1, a modified reporter activation assay was developed. In addition to a reference reporter gene, three other components are engaged in the assay, including a Cpf1 endonuclease, a crRNA and a target DNA. Cpf1 endonuclease is encoded by pCpf1 vector, crRNA is encoded by pcrRNA vector, target sequence within a split Firefly luciferase gene is encoded by pTarget vector, reference gene Renilla luciferase is encoded by pRL-TK vector. Serving as a target of gene editing, coding sequence of Firefly luciferase is split into two parts by the target sequence of a crRNA, rendering the reporter inactive. At their adjacent ends, each contains an identical sequence of 1000 bp. This allows for a homologous recombination after Cpf1 cleavage, and generates an active Firefly luciferase gene (Fig. 1). Therefore, activation of Firefly luciferase gene will occur after target editing and homologous recombination. In relative to the originally inactive luciferase, activation of the reporter is quantitatively determined by examining the increase of luminescence in CRISPR/Cpf1-treated samples.
Specificity profile of CRISPR/Cpf1. Using this sensitive assay, gene editing efficacy of three active crR-NAs was examined. In order to make the study comparable to that of Cas9, the three crRNAs were selected from our earlier study with CRISPR/Cas9 7 . For a given crRNA, single-nucleotide mutations were made at every position of the target site. Mutations were also made at PAM region, as well as the adjacent positions downstream of PAM (Fig. 1). Therefore for each crRNA, 81 synthesized DNA fragments carrying the fully matched or single-nucleotide mutated targets were cloned into pTarget plasmid, generating a library of single-nucleotide mutated targets. The recombinant vectors were transfected individually into HEK293 cells, together with pCpf1, pcrRNA and reference vector. Two days after transfection, activation of Firefly luciferase was quantitatively determined using dual-luciferase reporter assay.
With a reference of the fully-matched target, all the tested crRNAs mediated efficient gene editing, resulting in a boost of luciferase activity up to 11-fold. To facilitate data analysis, editing efficacy of the fully-matched target was set as 100%, for each crRNA. Then, the relative gene editing efficacy was determined for each single-nucleotide mutated target (Fig. 2). Positions across the target sequence were numbered as position (+1) to position (+23), starting from the 3′ end of PMA region. PAM positions were accordingly numbered as position (−4) to position (−1). Background luciferase activities were determined with scrambled target sequences.
In the process of CRISPR-mediated gene editing, PAM recognization is a critical step 10 . In agreement with the dominant PAM sequence of TTTN 11 , mutations at position (−1) were partly tolerated, while the mutations at position (−2), (−3) and (−4) were not tolerated at all, leading to greatly decreased editing efficacies. For mutations made on the target sequence, editing efficacies were found to vary with the identity and the location of the mutated nucleotide. Among them, positional effects were the most evident. In relative to the modest effects of  Characterization of a highly sensitive "core" sequence. Despite the varying editing efficacies across the target site, the most evident influence was revealed within a 4-bp "core" sequence, located from position (+4) to position (+7) downstream of PAM (Fig. 3). Similar to that of CRISPR/Cas9, gene editing was abolished by most of the mutations made at the "core" sequence. Also similar to CRISPR/Cas9, this 'core' sequence was located within a positively charged central channel of Cpf1 endonuclease 14 . Therefore for both CRISPR/Cas9 and CRISPR/Cpf1, a common "core" sequence was demonstrated to serve as a major determinant of targeting specificity.
In addition to the positional effects, varying editing efficacies were also revealed for the mutations made at the same target position. For example at position (+15) of ALKBH5 target, rU:dG mismatch was the most tolerated, while rU:dT or rU:dC mismatches led to a sharp decrease in editing efficacy (Fig. 2C).
In addition to the molecular components, host cells may also contribute to the gene editing as a background system. To address their potential influence, gene editing of a few targets was examined parallel with both HEK293 and Hela cells (Fig. 4). This led to comparable editing profiles. With EGFP target as an example, target mutations were barely tolerated at position (+5) within the "core" sequence, while mutations at position (+17) caused less compromised editing efficacies. It is of interesting to observe a greatly enhanced off-target gene editing at position (+18) of AAVS1 target and position (+8) of ALKBH5 target, for both HEK293 and Hela cells.   www.nature.com/scientificreports www.nature.com/scientificreports/ Characterization of a 'super' off-target gene editing. In the assays, greatly enhanced off-target editing was revealed at several target positions. For AAVS1 target, when nucleotide at position(+18) was mutated to adenine, gene editing efficacies were increased for an 8.8-fold. The effect was however not observed at position(+18) of the other targets, rendering it a sequence-dependent event. Similarly for ALKBH5 target, rA:dC mismatch at position(+8) led to greatly enhanced off-target editing (4.8-fold). Taken with other sporadically reported cases 7,15,16 , a 'super' off-target gene editing was uncovered.
To further explore this phenomenon, orthogonal mutations were made with the deoxyribonucleotide of target DNA (dN) and the ribonucleotide of crRNA (rN). Reporter activation assays were performed with different combinations of mutated crRNA and target DNA. In addition to confirming the enhanced off-target gene editing, results showed that occurring of this event was determined by the identity of target nucleotide, rather than the nucleotides of crRNA. For position (+18) of AAVS1 target, enhanced gene editing was observed when target nucleotide was dA, regardless of the identity of crRNA nucleotide. Similarly, for position(+8) of ALKBH5 target, enhanced editing was observed when target nucleotide was dA, regardless of the identity of rN.
We then examined our previous data with CRISPR/Cas9 7 , and identified enhanced off-target gene editing at a few target positions. Orthogonal assays performed with 4 target sequences showed that, the enhanced off-target gene editing was determined by the target nucleotide, rather than the nucleotide of crRNA (Fig. 5B). At position (+11) of CDK11 target, when target nucleotide is dA, off-target gene editing was increased for more than 8-fold. Taken these data together, a kind of 'super' off-target gene editing was characterized with CRISPR systems, representing a universal property of this widely applied gene editing platform.
Strategy to optimize targeting specificity of CRISPR systems. To the mechanism of this phenomenon, we speculated that target nucleotides may form a certain interaction with protein domains of the endonuclease, and thereby contribute to targeting specificity in on-target and off-target gene editing. To explore this possibility, a sequence-modified Cas9 protein was examined with the reporter activation assay. The modified Cas9 carries a K855A point mutation and was shown to have improved target specificity 17 . As shown in Fig. 5C, the most profound effects were found with FTO target, in which the enhanced gene editing was almost abolished. While for CLTA and CDK11 targets, the off-target editing was largely maintained, with decreased levels (Fig. 5C). For example at position (+3) of CLTA target, the efficacies of off-target gene editing decreased from 600% to ~400%.
Therefore, despite the general increase in targeting specificity, a position-specific improvement was observed with the sequence-modified Cas9. This, on the one hand, lends support to the involvement of specific interaction between target nucleotide and protein domains; on the other hand, it proposes that by optimizing protein sequence and structure, targeting specificity of CRISPR system may be further improved.

Discussion
Although strict targeting specificity is desirable for gene therapy, the systems we borrow from nature however exhibit a prominent property of redundancy, which is definitely of benefit to their hosts. For CRISPR systems borrowed from bacteria and archaea, targeting redundancy enables them to defend a batch of relevant virus species, instead of targeting only the virus from which the crRNAs are derived. Therefore, tolerance to certain off-target effects is theoretically preferred, as this may valuably extend their target ranges. In the present study, CRISPR/ Cpf1 system was shown to have a similar specificity profile as that of CRISPR/Cas9, tolerating point mutation at most target positions. Also similar to that of CRISPR/Cas9, a 4 bp-long "core" sequence was identified to be the major determinant of targeting specificity in CRISPR/Cpf1.
In addition to the specificity profile, a 'super' off-target gene editing was surprisingly revealed for both CRISPR/Cpf1 and CRISPR/Cas9, depending on the specific target sequences. Target mutation at certain positions led to greatly enhanced gene editing efficacy, up to as high as 10-fold of that of the fully-matched target. When the effects were examined in terms of mutation position, no clear rule could be drawn. This suggests that sequence-independent factors, such as methylation level, GC content, surrounding chromatin content, nucleosome location, etc. may be involved in the phenomenon.
Besides our study, gene editing of single mutated crRNA/target sites has been previously investigated 8,9,15,16 . In their studies, sequence-and position-dependent tolerance profiles have been reported. To the aspect of enhanced 'off-target' effect, it was observed with some target mutation, but not reported for the crRNA mutation.
To explore the mechanism underlying the 'super' off-target gene editing, we speculated that target nucleotides may form certain interactions with protein domains of Cpf1, and thereby contribute to targeting specificity of the system. To this aspect, a sequence-modified Cas9 protein was tested with the reporter activation assay. Results showed that both the general and 'super' off-target gene editing got improved, in a position-specific manner. Although K855A mutation of Cas9 protein ameliorated off-target editing at positions +3, +6 and +11, the most profound effect was found with position +9. Identification of this position-specific improvement supports the involvement of specific interaction between target nucleotide and protein domains.
Targeting specificity is a long-term focus of our group. In previous studies performed with RNA interference (RNAi) [18][19][20] , a reporter-based assay was performed to examine the specificity of gene silencing. Although position-specific, rather than sequence-specific profiles were revealed with both RNAi and CRISPR systems, the 'super' off-target editing has never been found with RNAi, showing it is unique to CRISPR systems. Considering their potential harm to gene therapy 21 , we believe that more attention needs to be paid to the off-targeting gene editing, in particular the 'super' off-target gene editing of CRISPR systems.