Non-RVD mutations that enhance the dynamics of the TAL repeat array along the superhelical axis improve TALEN genome editing efficacy

Transcription activator-like effector (TALE) nuclease (TALEN) is widely used as a tool in genome editing. The DNA-binding part of TALEN consists of a tandem array of TAL-repeats that form a right-handed superhelix. Each TAL-repeat recognises a specific base through the repeat variable diresidue (RVD) at positions 12 and 13. A TALEN comprising TAL-repeats with periodic mutations to the residues at positions 4 and 32 (non-RVD sites) in each repeat (VT-TALE) exhibits increased efficacy in genome editing compared with a counterpart without the mutations (CT-TALE). The molecular basis for the elevated efficacy is unknown. In this report, comparison of the physicochemical properties of CT- and VT-TALEs revealed that VT-TALE has a larger-amplitude motion along the superhelical axis (superhelical motion) than CT-TALE. The greater superhelical motion in VT-TALE enabled more TAL-repeats to engage in target sequence recognition than in CT-TALE. The extended sequence recognition by the TAL-repeats improves site specificity by limiting the spatial distribution of FokI domains, which facilitates their dimerization at the desired site. Molecular dynamics simulations revealed that the non-RVD mutations alter inter-repeat hydrogen bonding to amplify the superhelical motion of VT-TALE. The TALEN activity is associated with the inter-repeat hydrogen bonding among the TAL-repeats.


Modelling CT- and VT-TALE structures
The CT- and VT-TALE structures used in the all-atom MD simulations were built from chain A of the DNA-bound dHax3 TALE crystal structure (PDB ID: 3V6T) by changing the amino acid residues at positions 4 and 32 to those of CT- and VT-TALEs, respectively (Supplementary Fig. S2). In modelling the structures, we used VMD 1.9.1¹ and the Mutator 1.3 plugin to substitute atoms in the residues at the mutation sites, and subsequently applied the Automatic PSF Builder to add missing atoms. The CT- and VT-TALEs used in the experiments comprise four units, each composed of four TAL-repeats, and have 16.5 TAL-repeats in total (Fig. 1c). The modelled CT- and VT-TALEs in the simulations, however, contain only 11.5 TAL-repeats, because the dHax3 TALE crystal structure used as the modelling template has 11.5 TAL-repeats. In the analyses of inter-repeat hydrogen bonding using the MD trajectories, we considered these 11.5 repeats (Supplementary Fig. S2). Note that the last TAL-repeat in the model TALEs was built from the half-repeat in dHax3, whose sequence at residues 16-34 differs from the corresponding regions of the other TAL-repeats (Supplementary Fig. S2). The sequence and the spatial structure of the N-terminal half of this last half-repeat are the same as those of the other TAL-repeats. Therefore, the N-terminal segment of the last TAL-repeat (the 12th repeat) was also included in the analyses of inter-repeat hydrogen bonding.

MD simulation of the TALEs starting from the compressed DNA-bound form
The MD simulations of the CT- and VT-TALEs in the absence of DNA were started from the compressed structures found in the complex with DNA, to mimic their unbinding from DNA.
We used the models generated from the crystal structure of dHax3, as described above. The modelled TALEs were soaked in a 90 Å × 90 Å × 150 Å TIP3P explicit water box with 0.15 mol/L KCl added. The CHARMM36 force field and the NAMD 2.9 program² were used, with the NPT ensemble (300 K, 1 atm, Langevin), particle-mesh Ewald electrostatics (with a 12 Å cutoff, also applied to van der Waals interactions with switching at 10 Å) and a 2 fs time step with SHAKE/SETTLE. After energy minimisation, the MD simulations were run for 50 ns, in triplicate for each of the TALEs.
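The setup above implies a few quantities that can be checked by simple arithmetic. The following is a minimal sketch, not the authors' setup script: the function names are hypothetical, and the ion estimate assumes the whole box volume is available to solvent (it ignores the volume of the protein and any counterions added for neutralisation), so the ion numbers actually used may differ.

```python
# Back-of-the-envelope numbers implied by the simulation setup.
# Assumption: the full box volume is solvent (protein volume ignored).

AVOGADRO = 6.02214076e23      # particles per mole
LITRES_PER_A3 = 1e-27         # 1 cubic angstrom = 1e-30 m^3 = 1e-27 L


def ion_pairs_for_box(lx, ly, lz, molar):
    """Approximate number of salt ion pairs for a box (sides in angstrom)."""
    volume_litres = lx * ly * lz * LITRES_PER_A3
    return molar * volume_litres * AVOGADRO


def n_steps(total_ns, timestep_fs):
    """Number of integration steps for a run of total_ns nanoseconds."""
    return round(total_ns * 1e6 / timestep_fs)


def n_frames(total_ns, stride_ps):
    """Number of saved structures when sampling every stride_ps picoseconds."""
    return round(total_ns * 1e3 / stride_ps)


if __name__ == "__main__":
    print(f"KCl pairs (approx.): {ion_pairs_for_box(90, 90, 150, 0.15):.1f}")
    print(f"steps for 50 ns at 2 fs: {n_steps(50, 2)}")
    print(f"frames for 50 ns at 1 ps: {n_frames(50, 1)}")
```

Under these assumptions, 0.15 mol/L KCl in this box corresponds to roughly 110 ion pairs, and each 50 ns run to 25 million integration steps.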
To monitor the conformational changes, the distance between the α-carbon atoms of residues 303 and 675 was measured (Supplementary Fig. S3b). These residues are separated by ca. 1 turn around the DNA; the distance is 35.76 Å in the original DNA-bound crystal structure and 60.54 Å in the DNA-free crystal structure (PDB ID: 3V6P). The root-mean-square deviations (RMSD) from the initial (DNA-bound) and DNA-free crystal structures were also calculated (Supplementary Fig. S3c and S3d). In the RMSD calculation, only the backbone atoms (C, Cα, and N) of the residues common to both crystal structures (residues 303 to 675) were considered, excluding Arg448, whose coordinates are undetermined in 3V6P; global translation and rotation were removed. Both the CT- and VT-TALE models elongated substantially during the 50 ns simulation.
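The two measures described above can be sketched as follows. This is an illustrative sketch only: it uses the Kabsch algorithm (via NumPy's SVD) to remove global translation and rotation, which is one standard way to do so and is not necessarily the procedure the authors used. The function names are hypothetical, and the inputs are assumed to be matched (N, 3) coordinate arrays of the selected backbone atoms.

```python
import numpy as np


def ca_distance(a, b):
    """Euclidean distance between two atoms (e.g. two alpha-carbons)."""
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))


def kabsch_rmsd(P, Q):
    """RMSD between matched coordinate sets after optimal superposition.

    P and Q are (N, 3) arrays listing the same atoms in the same order.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove global translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and compute the residual deviation.
    diff = Pc @ R.T - Qc
    return float(np.sqrt(np.mean(np.sum(diff ** 2, axis=1))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.normal(size=(30, 3))          # synthetic test coordinates
    t = 0.7
    rot = np.array([[np.cos(t), -np.sin(t), 0.0],
                    [np.sin(t), np.cos(t), 0.0],
                    [0.0, 0.0, 1.0]])
    moved = ref @ rot.T + np.array([5.0, -3.0, 2.0])
    print(kabsch_rmsd(ref, moved))          # close to zero: rigid-body move only
```

A purely rigid-body move gives an RMSD near zero, so any remaining deviation reflects internal conformational change such as elongation of the repeat array.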

The inter-repeat hydrogen bonds transiently formed during the dynamics of CT- and VT-TALEs
In exploring the inter-repeat hydrogen bonding efficiencies in CT- and VT-TALEs, we considered all possible donor-acceptor pairs between the 5th residue in each TAL-repeat (residues 293 to 633) and the 4th residue in the next repeat (residues 326 to 666). A hydrogen bond is counted if the donor-acceptor distance is less than 3 Å and the donor-hydrogen-acceptor angle is less than 20 degrees. To capture the labile hydrogen bonding, we sampled the structure every 1 ps over the designated period. The probability of forming a hydrogen bond was calculated for each considered pair of residues, and then averaged (Supplementary Table S1). Although we considered all possible donor-acceptor pairs, almost all the hydrogen bonds were between sidechains, mostly between Nε2-Hε2 of Q-5 and Oδ of D-4 (or Oε of E-4). The contribution of the backbone atoms was negligible (the hydrogen bonding probability was less than 10⁻⁵). Ala-4 essentially cannot form these hydrogen bonds (observed probability < 10⁻⁵). In addition, E-4 in VT-TALE formed inter-repeat hydrogen bonds with much lower (ca. 1/20) efficiency than D-4. The overall probability of inter-repeat hydrogen bonding was therefore markedly decreased in VT-TALE, owing to the mutations of D-4 to A-4 or E-4.
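The counting procedure described above can be sketched in a few lines. This is a hedged illustration with hypothetical function names, not the authors' analysis script. Two assumptions are made: (i) the 20-degree angle cutoff is read as the deviation of the donor-hydrogen-acceptor angle from linearity (180 degrees), a convention common in hydrogen-bond analysis tools but not stated explicitly in the text; and (ii) consecutive repeats are offset by 34 residues, which is consistent with the residue ranges given.

```python
import math

REPEAT_LENGTH = 34  # residues per TAL-repeat (assumed from the residue ranges)


def inter_repeat_pairs(first_res5=293, first_res4=326, n_pairs=11):
    """Residue-number pairs: (5th residue of repeat i, 4th residue of repeat i+1)."""
    return [(first_res5 + REPEAT_LENGTH * i, first_res4 + REPEAT_LENGTH * i)
            for i in range(n_pairs)]


def is_hbond(donor, hydrogen, acceptor, max_dist=3.0, max_angle_deg=20.0):
    """Geometric hydrogen-bond test on three (x, y, z) positions in angstrom.

    ASSUMPTION: the angle cutoff is the deviation of D-H...A from 180 degrees.
    """
    if math.dist(donor, acceptor) >= max_dist:
        return False
    hd = [d - h for d, h in zip(donor, hydrogen)]      # vector H -> D
    ha = [a - h for a, h in zip(acceptor, hydrogen)]   # vector H -> A
    norm = math.hypot(*hd) * math.hypot(*ha)
    cos_angle = sum(x * y for x, y in zip(hd, ha)) / norm
    cos_angle = max(-1.0, min(1.0, cos_angle))         # guard rounding error
    dha_angle = math.degrees(math.acos(cos_angle))
    return (180.0 - dha_angle) < max_angle_deg


def hbond_probability(frames):
    """Fraction of sampled frames in which the bond criterion is satisfied.

    frames: sequence of (donor, hydrogen, acceptor) coordinate triples.
    """
    frames = list(frames)
    return sum(is_hbond(*f) for f in frames) / len(frames)


if __name__ == "__main__":
    print(inter_repeat_pairs()[0], inter_repeat_pairs()[-1])
    linear = ((0, 0, 0), (1, 0, 0), (2.8, 0, 0))   # synthetic geometry
    bent = ((0, 0, 0), (1, 0, 0), (1, 1.8, 0))     # synthetic geometry
    print(hbond_probability([linear, bent]))
```

Sampling every 1 ps and taking the fraction of frames that satisfy the criterion gives the per-pair probability; averaging over the 11 residue pairs then gives the overall value.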

As seen in the MD trajectories, each inter-repeat hydrogen bond is labile, with a short lifetime on the picosecond scale (Fig. 3b). The hydrogen bonding probability reflects the number of inter-repeat connections that can be formed simultaneously at a given moment. An inter-repeat hydrogen bond can form at any inter-repeat position in the TAL-repeat array, without preference for specific sites; the low hydrogen bonding probability implies that the inter-repeat hydrogen bonds cannot all be formed simultaneously, and that only a fraction of the pairs form hydrogen bonds at any one time.
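The link between the bonding probability and the number of simultaneous connections can be made concrete: the expected number of bonds present at a given moment is the sum of the per-pair probabilities, that is, the number of pairs times the overall (pair-weighted) probability. The sketch below uses the residue compositions given in the Table S2 footnote (11 Asp in CT-TALE; 5 Asp, 3 Glu and 3 Ala in VT-TALE). The per-residue probabilities are hypothetical placeholders, not measured values: P_ASP is arbitrary, Glu is set to 1/20 of Asp following the approximate ratio reported in the text, and Ala is set to zero.

```python
# HYPOTHETICAL per-residue-type bonding probabilities (not measured values).
P_ASP = 0.10                 # arbitrary placeholder for D-4
P_GLU = P_ASP / 20           # ca. 1/20 of D-4, as reported in the text
P_ALA = 0.0                  # reported as < 1e-5, treated as zero here

PROBS = {"D": P_ASP, "E": P_GLU, "A": P_ALA}
CT_COUNTS = {"D": 11}                    # residues at the 4th positions
VT_COUNTS = {"D": 5, "E": 3, "A": 3}


def expected_simultaneous_bonds(counts, probs):
    """Expected number of inter-repeat bonds present at one instant."""
    return sum(n * probs[res] for res, n in counts.items())


def overall_probability(counts, probs):
    """Average probability weighted by the number of residue pairs."""
    return expected_simultaneous_bonds(counts, probs) / sum(counts.values())


if __name__ == "__main__":
    for name, counts in (("CT-TALE", CT_COUNTS), ("VT-TALE", VT_COUNTS)):
        print(name,
              f"overall p = {overall_probability(counts, PROBS):.4f}",
              f"expected bonds = {expected_simultaneous_bonds(counts, PROBS):.3f}")
```

With any choice of P_ASP, replacing 6 of the 11 Asp residues by Glu or Ala lowers the expected number of simultaneous inter-repeat bonds by roughly half under these assumptions.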
To show the dynamic engagement of atoms in the hydrogen bonding, we monitored the distances between Oδ of Asp-4 (or Oε of Glu-4) and Hε2 of Gln-5 in the neighbouring repeats in the MD trajectories, which were sampled every 10 ps.

Table S2. The probability of forming inter-repeat hydrogen bonds in the compressed and extended forms of CT- and VT-TALEs.
a The inter-repeat hydrogen bonding probability was calculated from the structures in the MD trajectories, sampled every 1 ps during 0 ns - 10 ns (compressed form) and 40 ns - 50 ns (extended form), 3 trials each. A hydrogen bond is counted if the donor-acceptor distance is less than 3 Å and the donor-hydrogen-acceptor angle is less than 20 degrees; all possible donor-acceptor pairs between the 5th residue in each repeat (residues 293 to 633) and the 4th residue in the next repeat (residues 326 to 666) were considered. Note that most hydrogen bonds were between sidechains, and the contribution of the backbone atoms was negligible (bonding probability < 10⁻⁵).
b The CT-TALE model has 11 Asp residues at the 4th positions, while the VT-TALE model has 5 Asp (D), 3 Glu (E) and 3 Ala (A) residues there. The overall probability is the average of the probabilities weighted by the numbers of inter-repeat residue pairs.

[Sequence figure: column header "Position", 1 to 34; row labels include "N-tail" and "TAL-repeat".]