Elimination of infectious HIV DNA by CRISPR–Cas9

Current antiretroviral drugs can efficiently block HIV replication and prevent transmission, but do not target the HIV provirus residing in cells that constitute the viral reservoir. Because interruption of drug therapy will cause viral rebound from this reservoir, HIV-infected individuals face lifelong treatment. Therefore, novel therapeutic strategies are being investigated that aim to permanently inactivate the proviral DNA, which may lead to a cure. Multiple studies have shown that CRISPR–Cas9 genome editing can be used to attack HIV DNA. Here, we focus not only on how this endonuclease attack can trigger HIV provirus inactivation, but also on how virus escape occurs and how it can be prevented.

the viral DNA, such as zinc finger nucleases (ZFN), transcription activator-like effector nucleases (TALEN) and homing endonucleases [6-10]. More recently, CRISPR-Cas9 has become a very popular endonuclease to attack HIV DNA. This tool is derived from the CRISPR-Cas system that detects and cleaves nucleic acids from invading viruses and plasmids in bacteria and archaea [11-13]. The CRISPR-associated endonuclease Cas9 of Streptococcus pyogenes (spCas9) was developed into a genome editing tool that cleaves double-stranded (ds) DNA in eukaryotic cells. Sequence specificity is mediated by a 20-nucleotide (nt) sequence in the guide RNA (gRNA) that directs Cas9 to a complementary DNA target (Figure 2). Only complementary sequences flanked by a protospacer adjacent motif (PAM; NGG for spCas9) can be cleaved [14-17]. The dsDNA breaks resulting from Cas9 cleavage are repaired by cellular DNA repair mechanisms. These mechanisms include classical non-homologous end-joining (NHEJ), which ligates the DNA ends with frequent introduction of insertions or deletions (indels), and microhomology-mediated end-joining (MMEJ), in which short matching sequences present at the DNA ends anneal, eventually resulting in deletion of intervening nucleotides [18,19].
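The targeting rule described above (a 20-nt protospacer immediately followed by an NGG PAM) can be sketched as a simple sequence scan. This is a minimal illustration, not a gRNA design tool: the function names are hypothetical, the input is assumed to be an unambiguous A/C/G/T string, and no activity or off-target scoring is attempted.

```python
import re

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_spcas9_targets(dna: str, protospacer_len: int = 20):
    """List candidate spCas9 sites as (strand, start, protospacer, pam).

    A site is a protospacer of `protospacer_len` nt directly followed by
    an NGG PAM. `start` is the 0-based offset on the scanned strand; for
    the "-" strand it refers to the reverse-complemented sequence.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for match in pattern.finditer(seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits

if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTACGT" + "AGG" + "TT"  # toy sequence, not HIV
    for hit in find_spcas9_targets(toy):
        print(hit)
```

Both strands are scanned because a PAM on either strand can support cleavage of the double-stranded target.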
Simultaneous targeting of two distant loci can result in mutations at both gRNA target sites, but also in excision or inversion of the intervening sequences [20-22,23 • ]. For example, Canver et al. [23 • ] observed excisions and inversions in approximately 27% and 13% of sequences, respectively, when analyzing the Cas9-induced mutations resulting from several dual-gRNA combinations with an intervening region ranging from 2 to 20 kilobases (kb). Such excisions and inversions are frequently accompanied by indels at the cleavage sites. The CRISPR-Cas toolbox has expanded in recent years and now includes systems originating from diverse bacterial species with distinct gRNA and PAM characteristics and different target specificities. The high specificity and efficiency of the CRISPR-Cas9 system have led to its widespread application, including in anti-HIV strategies [24].
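The two rearrangements that dual cleavage can produce, excision and inversion of the intervening fragment, can be illustrated on a toy sequence. The sketch below is idealized: the cut positions are supplied directly (hypothetical inputs), the ends are joined perfectly, and the indels that frequently accompany these events are ignored.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def dual_cut_outcomes(dna: str, cut1: int, cut2: int) -> dict:
    """Return idealized products of two simultaneous double-strand breaks.

    A cut position c splits the sequence into dna[:c] and dna[c:].
    """
    if not 0 <= cut1 < cut2 <= len(dna):
        raise ValueError("cut positions must satisfy 0 <= cut1 < cut2 <= len(dna)")
    left, middle, right = dna[:cut1], dna[cut1:cut2], dna[cut2:]
    return {
        # Intervening fragment lost, outer ends ligated.
        "excision": left + right,
        # Intervening fragment re-inserted in the opposite orientation.
        "inversion": left + reverse_complement(middle) + right,
    }

if __name__ == "__main__":
    print(dual_cut_outcomes("AAACCGTTT", 3, 6))
```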

CRISPR-Cas9-targeting of HIV infection
The CRISPR-Cas9 system can be used to modify host cells in such a way that they are no longer susceptible to HIV infection. Inspired by the successful cure of the 'Berlin patient', who underwent allogeneic stem cell transplantation with donor cells lacking the CCR5 coreceptor, several studies focused on targeting the CCR5 gene [25 • ]. However, CCR5 inactivation may trigger envelope mutations that shift viral receptor usage from CCR5 to CXCR4 [26,27]. CRISPR-Cas9 can also target CXCR4 [28] or other cellular factors involved in HIV replication [29,30], but this may have undesirable side effects on cell physiology.

Inhibition of virus replication and viral escape
Initial CRISPR-Cas9 studies involving replication-competent HIV demonstrated efficient inhibition of virus replication in short-term cell culture experiments. However, these studies did not address virus escape, even though HIV-1 is well-known for its capacity to develop resistance against inhibitors [31-33]. We and others therefore tested HIV replication in long-term cultures of T cells stably expressing Cas9 and an antiviral gRNA [34 •• ,35 •• ,36,37-40]. These studies confirmed that CRISPR-Cas9 can potently inhibit HIV replication, but also showed that the virus frequently escapes from this inhibition, which was due to acquired mutations clustering around the Cas9 cleavage site [34 •• ]. Intriguingly, indels of variable size were observed in poorly conserved targets (e.g. in the LTR promoter region), whereas mostly substitutions and 3-nt insertions were found in conserved targets (e.g. in protein-coding domains). This mutational pattern differs strikingly from that observed upon virus escape from other inhibitors like ART drugs or RNA-interference therapeutics, where predominantly nucleotide substitutions are observed that are generated during the error-prone reverse transcription process [41]. The frequent indels that cluster at the Cas9 cleavage site implicate the cellular DNA repair mechanism that acts on the Cas9-generated dsDNA breaks in the generation of HIV escape viruses [42]. Upon Cas9 cleavage, the proviral DNA is repaired by cellular DNA repair pathways that introduce mutations (mostly indels, but also substitutions) at the cleavage site upstream of the PAM (Figure 2). Because this target domain is very important for gRNA/Cas9 binding, most mutations will prevent further Cas9 cleavage. In addition, they can also inactivate the virus (e.g. due to a frameshift mutation or inactivation of an essential RNA or protein domain), but some mutations will be compatible with virus replication, yet prevent gRNA binding, thus resulting in escape viruses.
Wang et al. [35 •• ] demonstrated that the escape mutations indeed originate from the cleaved and repaired proviral DNA pool, indicating that cellular DNA repair facilitated viral escape. However, a minor contribution of regular RT-generated mutations in virus escape cannot be excluded as some nucleotide substitutions were detected further away from the Cas9 cleavage site [34 •• ,36]. Moreover, a poorly replicating escape virus resulting from Cas9 cleavage and subsequent DNA repair may accumulate additional RT-produced mutations in the target site or compensatory mutations elsewhere to increase its replication capacity.
In our study, the period of virus suppression varied for different gRNAs but did not correlate with the capacity of the gRNAs to induce HIV DNA cleavage and to suppress gene expression [34 •• ]. Instead, the time to escape strongly correlated with the evolutionary conservation of the target sequence. Rapid viral escape was observed when poorly conserved HIV sequences were targeted, while escape was delayed when strongly conserved viral sequences were targeted. Poorly conserved HIV sequences correspond to non-essential regions, which can relatively easily accommodate the indels that are introduced during DNA repair. In contrast, highly conserved sequences correspond to essential viral regions, which will tolerate only specific mutations that are generated less frequently during DNA repair (e.g. nt substitutions and nt-triplet insertions that do not destroy the open reading frame).
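The distinction drawn above between indels that destroy the open reading frame and nt-triplet changes that preserve it comes down to whether the length change is a multiple of three. A minimal sketch, assuming the reference and mutated target sequences are already available as strings (the function name and labels are hypothetical); it does not check for newly created stop codons or loss of essential residues:

```python
def classify_repair_outcome(ref: str, alt: str) -> str:
    """Classify a repaired target sequence by its effect on the reading frame."""
    delta = len(alt) - len(ref)
    if delta == 0:
        return "unchanged" if ref == alt else "substitution"
    kind = "insertion" if delta > 0 else "deletion"
    # Length changes that are multiples of 3 keep downstream codons in frame.
    frame = "in-frame" if delta % 3 == 0 else "frameshift"
    return f"{frame} {kind}"

if __name__ == "__main__":
    print(classify_repair_outcome("ATGAAA", "ATGCCCAAA"))  # 3-nt insertion
    print(classify_repair_outcome("ATGAAA", "ATGCAAA"))    # 1-nt insertion
```

In a protein-coding target only the "substitution" and "in-frame" classes leave a chance of producing a functional protein, which is consistent with the escape mutations reported for conserved targets.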

Combinatorial CRISPR-Cas9 attack prevents viral escape and triggers inactivation of the viral genome
These studies demonstrate that single gRNA/Cas9 targeting of HIV-1 can potently inhibit virus replication, but subsequent DNA repair facilitates virus escape. As previously shown when treating patients with antiviral drugs and when testing RNAi antivirals in cell culture experiments [43], combining antivirals does not only increase the magnitude of virus inhibition (because of additive or possibly even synergistic antiviral effects), but also the genetic threshold for development of resistance, as multiple mutations at different positions in the viral genome will be required.
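The notion of a raised genetic threshold can be illustrated with a deliberately simplified calculation. Assume, purely for illustration, that one round of Cas9 cleavage and repair yields a replication-competent, gRNA-resistant sequence at target $i$ with probability $p_i$, and that the repair outcomes at two targets are independent. Neither the symbols nor the numerical values below are taken from the cited studies.

```latex
\[
  P_{\text{escape}}^{\text{single}} = p_1 ,
  \qquad
  P_{\text{escape}}^{\text{dual}} = p_1 \, p_2
\]
% Hypothetical example values:
\[
  p_1 = p_2 = 10^{-2}
  \;\Longrightarrow\;
  P_{\text{escape}}^{\text{dual}} = 10^{-4}
\]
```

Under these assumptions, adding a second gRNA lowers the per-genome escape probability by a factor of $1/p_2$. Repeated cleavage, sequential repair and ongoing virus replication will make the real relationship more complex.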
To test whether gRNA combinations can similarly prevent viral escape, we and others evaluated HIV replication in T cells harnessed with CRISPR-Cas9 and different combinations of two gRNAs [38,44 •• ]. Indeed, combinations inhibited viral replication more effectively than the corresponding single gRNAs, but viral escape was eventually apparent for most combinations due to acquisition of mutations with the typical Cas9/DNA repair signature in both targets. However, some gRNA combinations targeting highly conserved HIV sequences were found to completely block virus replication for the duration of our experiment, which lasted over four months [44 •• ]. In these cultures, we observed the gradual disappearance of wild-type and point-mutated HIV sequences and gradual accumulation of indels and multiple-nucleotide substitutions at both target sites, indicating repeated CRISPR-Cas9 attack on point-mutated targets. Attempts to rescue replication-competent virus from the infected dual-gRNA protected cells by co-culturing with susceptible cells failed after some incubation period. These results demonstrated that the infected cells were functionally cured through mutation of both antiviral target sites, leaving the cells with a graveyard of inactivated HIV proviruses. These studies provide the proof of principle that CRISPR-Cas9 can be used to cure HIV-infected cells [44 •• ].

Mutation versus excision
Excision of integrated HIV proviruses with CRISPR-Cas9 and two gRNAs, or with a single gRNA targeting both LTRs, has previously been proposed (Figure 3) [45-48,49 • ,50-53]. Some studies focused exclusively on this goal, apparently assuming that provirus excision is the major mechanism behind HIV inactivation [49 • ,53]. Excision will require simultaneous cleavage of both targets, followed by 'ligation' of the ends. As the cleavage kinetics may differ for different Cas9 targets, this timing requirement may be more easily fulfilled with a single gRNA targeting the identical sequence in the 5′ and 3′ LTR, although the chromosomal environment will differ and possibly influence the cleavage and repair processes. Several studies suggested efficient excision of the proviral genome [48,49 • ,53]. However, the PCR-based strategy used to detect excision strongly favors detection of the short excision product over the longer non-excised product. In fact, some studies did also detect a non-excised product, which may represent inactivated genomes with mutated target sites, but could also correspond to wild-type genomes. Unfortunately, the experimental systems used in these studies did not support massive HIV replication and did not allow testing for complete and permanent virus inactivation or virus escape. Such necessary assay conditions were met in our study and, although we could also detect a low level of provirus excision, we demonstrated that complete virus inactivation coincided with mutation at both target sites (Figure 3). Thus, hypermutation seems a major mechanism for HIV inactivation, which is in agreement with the high frequency of dual-site mutations observed upon dual-gRNA cleavage in genome editing studies [20-22,23 • ]. Fragment inversion, detected at a low frequency in these genome editing studies, may also contribute to HIV inactivation.

Countering HIV sequence variation
HIV demonstrates considerable genetic variation, with four phylogenetic groups (M, N, O, and P), multiple subtypes and much inter- and intra-patient sequence diversity. Sequence variation in the gRNA target site may affect the Cas9 cleavage efficiency and thereby compromise the antiviral strategy. Single nucleotide mismatches between the gRNA and DNA target, in particular mismatches in the PAM-proximal region and non-consensus PAM nucleotides, did indeed reduce Cas9 cleavage activity in studies that yielded algorithms to predict the activity of mismatching gRNAs [17,54].
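The features named above (mismatch position relative to the PAM and deviation from the NGG consensus) can be extracted with a short comparison routine. This is a minimal sketch with hypothetical names: the gRNA is written in DNA letters for direct comparison, positions are numbered 1 to 20 from the PAM-distal end, and the length of the "PAM-proximal region" (`seed_len = 10`) is an arbitrary assumption rather than a value from the cited studies.

```python
def mismatch_profile(grna: str, protospacer: str, pam: str, seed_len: int = 10) -> dict:
    """Describe how a viral target variant deviates from a gRNA.

    Positions are 1-based, counted from the PAM-distal (5') end, so the
    highest positions lie next to the PAM.
    """
    grna, protospacer, pam = grna.upper(), protospacer.upper(), pam.upper()
    if len(grna) != len(protospacer):
        raise ValueError("gRNA and protospacer must have the same length")
    positions = [i + 1 for i, (g, t) in enumerate(zip(grna, protospacer)) if g != t]
    seed_start = len(grna) - seed_len + 1  # first PAM-proximal position
    return {
        "mismatch_positions": positions,
        "pam_proximal_mismatches": [p for p in positions if p >= seed_start],
        "consensus_pam": len(pam) == 3 and pam[1:] == "GG",
    }

if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"    # toy guide, not an HIV sequence
    variant = "ACGTACGTACGTACGTACGA"  # mismatch at position 20
    print(mismatch_profile(guide, variant, "TGG"))
```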
Dampier et al. used such an algorithm to calculate the activity of published gRNAs against diverse HIV isolates [55] and to design personalized and broad-spectrum gRNA combinations based on within-patient sequence variants and consensus sequences from multiple patients, respectively [56]. This in silico analysis, for example, suggested that the gRNAs in our sterilizing dual-gRNA combinations were effective against 82-95% of all HIV-1 subtype B variants [56]. However, this estimation uses an arbitrary level of cleavage activity required for virus inactivation and the algorithm is based on experimental data from single-nt mismatches only and assumes that dual-nt mutations have a multiplicative effect. Roychoudhury et al. demonstrated that there was only a 'trend to weak positive correlation' between the in silico predicted and experimentally measured activity of gRNAs, when testing the knockdown activity of 59 LTR-targeting gRNAs in an LTR-GFP reporter assay [57]. By modeling the reservoir depletion during CRISPR-Cas9 therapy, these authors illustrate that reduced gRNA activity and limited coverage of the patient's viral quasispecies will reduce the efficacy of the CRISPR-Cas9 therapy. However, our long-term virus escape experiments demonstrated that durable virus inhibition does not correlate with gRNA/Cas9 cleavage activity but rather with sequence conservation of the target sequence, which correlates inversely with the mutational escape options for the virus [34 •• ]. In the CRISPR-Cas9 therapy, escape variants can be instantly produced due to Cas9 cleavage and subsequent DNA repair at the gRNA target site. Although continuation of ART treatment during CRISPR-Cas9 therapy will block virus replication and prevent reverse transcription-driven evolution, it will not prevent the generation of Cas9-induced mutations and thereby the possible formation of gRNA/Cas9-resistant variants. Such escape variants may lead to virus rebound upon discontinuation of ART.
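The multiplicative assumption mentioned above can be made concrete with a toy scoring function. This sketch does not reproduce the published algorithms [17,54]: the per-position values in `retained_activity` are invented placeholders standing in for measured single-mismatch data, and the function name is hypothetical.

```python
from math import prod

def predicted_activity(grna: str, protospacer: str, retained_activity: list) -> float:
    """Predict relative cleavage activity under a multiplicative model.

    retained_activity[i] is the fraction of activity retained when only
    position i (0-based, PAM-distal first) is mismatched. With several
    mismatches the single-mismatch fractions are multiplied.
    """
    if not len(grna) == len(protospacer) == len(retained_activity):
        raise ValueError("inputs must have equal length")
    return prod(
        retained_activity[i]
        for i, (g, t) in enumerate(zip(grna.upper(), protospacer.upper()))
        if g != t
    )

if __name__ == "__main__":
    # Placeholder weights: mild penalty PAM-distal, stronger PAM-proximal.
    weights = [0.9] * 10 + [0.5] * 10
    guide = "ACGTACGTACGTACGTACGT"
    print(predicted_activity(guide, guide, weights))                   # perfect match
    print(predicted_activity(guide, "TCGTACGTACGTACGTACGA", weights))  # two mismatches
```

Because such a model extrapolates from single-mismatch measurements, its predictions for targets with two or more mismatches rest entirely on the multiplicative assumption and would need experimental confirmation.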
It is thus critically important that the gRNAs used in the CRISPR-Cas9 therapy not only inactivate most, preferably all, replication-competent proviral genomes in the latent reservoir, but also impose a high genetic threshold to escape. Combining gRNAs that simultaneously target multiple highly conserved sequences seems the best strategy.
We recently evaluated the impact of HIV genetic diversity on CRISPR-Cas9 antiviral activity and viral escape by testing the most effective dual-gRNA combinations against distinct HIV-1 isolates, including different subtypes [58 • ]. Although the gRNAs were designed to target highly conserved viral sequences, these sites could mismatch at 1 or 2 nt positions. Replication of nearly all isolates could be prevented by at least one gRNA combination, which caused inactivation of the proviral genomes and the gradual loss of replication-competent virus over time. Comparison of the gRNA targets in viruses that could be blocked with those in viruses that could not shed light on the sequence requirements for an effective gRNA attack. Most 1-nt mismatches did not significantly affect gRNA/Cas9 inhibition, but the gRNA lost activity when the mismatch was positioned at the Cas9 cleavage site. In contrast, two mismatches, independent of their position in the target, significantly reduced the antiviral effect. Inclusion of such a non-effective gRNA turned the dual-gRNA therapy essentially into a single-gRNA therapy from which the virus was able to escape. This study demonstrates that even minor sequence variation in conserved viral targets can affect the efficacy of the combinatorial CRISPR-Cas9 therapy. Unfortunately, the in silico predicted cleavage activity of the mismatching gRNAs, based on the algorithms described above [17,54], did not correlate with their capacity to durably inhibit virus replication and is thus a poor predictor. Successful HIV cure attempts may therefore require elaborate testing of gRNAs.

Future directions
CRISPR-Cas9 attack of the HIV proviral DNA in infected cells can lead to permanent inactivation of the virus when gRNA combinations are used that target essential, highly conserved viral domains. Besides coping with HIV genetic diversity, several other issues need to be addressed for the development of a safe and effective CRISPR-Cas9 HIV therapy. First, off-target CRISPR-Cas9 effects need to be excluded. Although in silico design tools can predict off-target sites and optimized CRISPR-Cas9 systems with increased sequence-specificity have been developed [59,60], experimental validation seems necessary to exclude mutation of non-target sequences in the human genome. Large deletions extending over many kb and complex genomic rearrangements have also been detected in Cas9 studies [22,61]. The frequency and potential harmful effects of such dramatic genome rearrangements (e.g. oncogene induction or tumor-suppressor gene disruption) need further investigation.
A sterilizing cure will require delivery of Cas9 and the gRNAs to all HIV-1 reservoir cells. Several methods are available for transient delivery of these components (e.g. gRNA-Cas9 ribonucleoprotein particles and virus-like particles [62-64]), but their in vivo delivery efficiency is likely suboptimal. Vectors based on adeno-associated virus (AAV) and HIV (lentiviral vector, LV) will facilitate prolonged Cas9 and gRNA activity and a more sustained therapeutic effect, but may also increase the risk of off-target effects. Although animal experiments show that HIV sequences can be targeted in vivo through AAV- and LV-mediated delivery of the CRISPR reagents [49 • ,53,65 • ], the efficiency of current viral vectors is likely too low to reach all reservoir cells. A significant constraint is the restricted packaging capacity of the AAV and LV vectors, especially given the large size of the Cas9 gene. This problem may be reduced by the use of smaller Cas9 variants (e.g. Staphylococcus aureus Cas9 [saCas9]), truncated Cas9 proteins lacking non-essential domains or smaller gRNA/Cas9 cassettes [50,53,66,67]. The Cpf1 (Cas12a) system forms an interesting alternative as it combines increased specificity with a small size, which could alleviate both the delivery and off-target problems [68,69]. The viral vector should preferably only target HIV reservoir cells, but development of such a vector is complicated by the fact that the viral reservoir is still poorly defined. Immune responses against the non-human Cas9 protein and the viral particles may also complicate this in vivo inactivation strategy [70,71].
CRISPR-Cas9 can be combined with other anti-HIV therapeutics such as antiviral drugs or RNAi molecules. The combinatorial approach will further reduce the level of virus replication, but also increase the genetic threshold for virus escape to occur. A combined CRISPR-Cas9 and RNAi attack on HIV, targeting both the viral DNA and RNA, did indeed inhibit HIV replication more durably than the corresponding monotherapies [72].

Figure 1. HIV-1 replication cycle and antiviral therapy. (a) The HIV particle contains two genomic RNA copies. The Env protein exposed at the viral membrane mediates attachment to the CD4 receptor and CCR5 or CXCR4 co-receptor of target T cells. Upon membrane fusion and virus entry, the viral RNA genome is reverse transcribed into DNA with a complete LTR at both ends. Upon integration into the cellular genome, this proviral DNA can be transcribed by the cellular RNA polymerase II transcription complex. RNA transcripts are processed by the cellular capping, polyadenylation and splicing machinery and subsequently translated. Genomic RNA dimers are packaged into new virus particles that assemble and bud at the cellular membrane. Antiviral drugs are grouped in six classes. Fusion inhibitors bind Env during the membrane fusion process, thus inhibiting virus entry. Entry inhibitors (CCR5 antagonists) bind CCR5 and inhibit entry of virus isolates that use the CCR5 coreceptor. Nucleoside reverse transcriptase inhibitors (NRTIs) and non-nucleoside reverse transcriptase inhibitors (NNRTIs) inhibit the viral RT enzyme. Integrase strand transfer inhibitors (INSTIs) target the viral integrase enzyme that is essential for the integration of the proviral DNA copy into the cellular genome. Protease inhibitors (PIs) inhibit the viral

Figure 3. Excision or mutation of the HIV DNA. CRISPR-Cas9 attack of the HIV DNA with two gRNAs that target different viral domains (or with a single gRNA that targets both the 5′ and 3′ LTR domain) can result in excision (left panel) or dual-site mutation (right panel) of the viral DNA. Simultaneous cleavage at both targets and subsequent ligation of the free DNA ends will result in excision of the intervening fragment. Otherwise, for example when a DNA break is repaired before the second target is cleaved, both targets will be mutated. Wang et al. [44 •• ] identified gRNA combinations targeting highly conserved essential sequences that durably blocked virus replication in infected T cell cultures. These gRNA combinations resulted in hypermutation of the viral DNA, that is, major indels and multiple-nucleotide substitutions at both targets increased over time at the expense of wild-type and point-mutated HIV sequences, which is likely due to repeated CRISPR-Cas9 attack on point-mutated targets (red diamond: mutation due to error-prone DNA repair).