Effect of single pyrrole replacement with β-alanine on DNA binding affinity and sequence specificity of hairpin pyrrole/imidazole polyamides targeting 5′-GCGC-3′

N-Methylpyrrole (Py)–N-methylimidazole (Im) polyamides are small organic molecules that can recognize predetermined DNA sequences with high sequence specificity. As many eukaryotic promoter regions contain highly GC-rich sequences, it is valuable to synthesize and characterize Py–Im polyamides that recognize GC-rich motifs. In this study, we synthesized four hairpin Py–Im polyamides 1–4, which recognize 5′-GCGC-3′, and investigated their binding behavior by surface plasmon resonance assay. Py–Im polyamides 2–4 contain two, one, and one β-alanine units, respectively, replacing the Py units of 1. The binding affinities of 2–4 to the target DNA increased 430, 390, and 610-fold, respectively, over that of 1. The association and dissociation rates of 2 to the target DNA were improved by 11 and 37-fold, respectively, compared with those of 1. Interestingly, the association and dissociation rates of 3 and 4 were higher than those of 2, even though the binding affinities of 2, 3, and 4 to the target DNA were comparable to each other. The binding affinity of 2 to DNA with a 2 bp mismatch was reduced by 29-fold, compared with that to the matched DNA. Moreover, the binding affinities of 3 and 4 to the same mismatched DNA were reduced by 270 and 110-fold, respectively, indicating that 3 and 4 have greater specificities than 2 and are suitable as DNA-binding modules for engineered epigenetic regulation. © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.


Introduction
N-Methylpyrrole (Py)–N-methylimidazole (Im) polyamides are small molecules that bind to DNA with sequence specificity and can be used as DNA-binding modules. [1][2][3][4][5] Py-Im polyamides can recognize specific DNA sequences in the minor groove of B-form DNA according to DNA recognition rules. 6,7 Py favors T, A and C bases, excluding G; Im is a G-reader. The lone electron pair of N3 in Im forms a hydrogen bond with the 2-amino hydrogen of guanine. Anti-parallel pairing of Im/Py specifies G·C, whereas Py/Im specifies C·G. Anti-parallel pairing of Py/Py specifies A·T or T·A degenerately. 6,7 Recently, we measured the K_D, k_a and k_d values of Py-Im polyamides with different numbers of Im rings. These data indicated that the association rate of the Py-Im polyamides with their target DNA decreased as the number of Im rings in the polyamide increased, even though their dissociation rates were comparable with each other. 8 We calculated the model structures of four-ring Py-Im polyamides by density functional theory. 8 These data suggested that an increase in planarity of the Py-Im polyamide induced by the incorporation of Im reduced the association rate of Py-Im polyamides.
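The pairing rules above can be illustrated with a minimal Python sketch. The function and variable names are hypothetical, and the treatment of β-alanine (written `b`) as reading like Py within a pair is an assumption made for illustration, consistent with the statement that the β-containing polyamides target the same 5′-GCGC-3′ site. The sketch pairs the N-terminal strand of a hairpin with the C-terminal strand in anti-parallel fashion and reads off the site recognized in the 5′ to 3′ direction.

```python
# Pairing rules as stated in the text: Im/Py -> G·C, Py/Im -> C·G,
# Py/Py -> A·T or T·A (written W). Other pairings are not covered here.
PAIR_TO_BASE = {
    ("Im", "Py"): "G",
    ("Py", "Im"): "C",
    ("Py", "Py"): "W",
}


def normalize(unit):
    # Assumption for illustration: beta-alanine ("b") reads like Py in a pair.
    return "Py" if unit == "b" else unit


def hairpin_target(n_strand, c_strand):
    """Return the 5'->3' site read along the N-terminal strand of a hairpin.

    n_strand: units listed from the N-terminus to the turn.
    c_strand: units listed from the turn to the C-terminus.
    """
    if len(n_strand) != len(c_strand):
        raise ValueError("both strands of the hairpin must have equal length")
    # Anti-parallel pairing: first N-strand unit pairs with last C-strand unit.
    partners = list(reversed(c_strand))
    return "".join(
        PAIR_TO_BASE[(normalize(a), normalize(b))]
        for a, b in zip(n_strand, partners)
    )


# ImPyImPy-(turn)-ImPyImPy and its analogue with two Py replaced by "b"
print(hairpin_target(["Im", "Py", "Im", "Py"], ["Im", "Py", "Im", "Py"]))
print(hairpin_target(["Im", "b", "Im", "Py"], ["Im", "b", "Im", "Py"]))
```

Both calls give the same core site, which is why affinity and kinetics, rather than the nominal recognition code, are what distinguish the compounds.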
Many eukaryotic genes contain highly GC-rich sequences in their promoter regions. [9][10][11][12] One representative DNA-binding domain that binds to GC-rich sequences is the zinc finger, which is common among many transcription factors. 13 As Py-Im polyamides can bind to DNA with sequence specificity comparable to that of DNA-binding proteins, it is potentially useful to synthesize Py-Im polyamides that recognize GC-rich promoter regions and thereby regulate gene expression. However, as described above, GC-rich sequences such as 5′-GCGC-3′ and 5′-CGCG-3′ have been identified as difficult recognition sequences for Py-Im polyamides because of the aforementioned planarity of Im-containing polyamides. 8,14,15 Interestingly, as reported previously, replacement of two Py with aliphatic β-alanine can increase binding affinity and provide flexibility in the polyamide structures. The binding affinity of Im-β-ImPy-γ-Im-β-ImPy-β-Dp, which recognizes 5′-GCGC-3′, was 100-fold greater than that of ImPyImPy-γ-ImPyImPy-β-Dp 14 (β = β-alanine, γ = γ-aminobutyric acid, and Dp = ((dimethylamino)propyl)amide). However, the corresponding association and dissociation rate constants have not been determined. Furthermore, we lack information about Py-Im polyamides containing only one replacement of Py with β-alanine. As the design of Py-Im polyamides with higher DNA binding affinities and sequence specificities is crucial for the production of synthetic DNA-binding modules, detailed molecular characterization of Py replacement with β-alanine is valuable not only from a practical standpoint, but also for the elucidation of the DNA recognition mechanism of Py-Im polyamides. In this study, we synthesized four hairpin Py-Im polyamides 1-4 that recognize 5′-GCGC-3′. Py-Im polyamide 2 contains two Im/β pairs, and 3 and 4 each contain one Im/β pair. We measured the K_D, k_a and k_d values of the Py-Im polyamides for matched and mismatched binding sites by surface plasmon resonance (SPR) assay.
Our SPR data suggest that one replacement of Py with β is sufficient to increase the binding affinity of the Py-Im polyamide targeting 5′-GCGC-3′. However, the k_a and k_d values of 3 and 4 for the matched DNA were higher than those of 2. We also measured the DNA binding affinities of 1-4 to various mismatched DNA targets to determine their sequence specificities. These data provide valuable information for the design of Py-Im polyamides that recognize GC-rich sequences and help elucidate the target DNA recognition mechanism.

Hairpin Py-Im polyamide synthesis and DNA preparation
As reported previously, our SPR data and calculated model structures of 4-ring Py-Im polyamides suggested that an increase in planarity of the Py-Im polyamide induced by the incorporation of Im reduced the association rate of Py-Im polyamides. 8 However, two replacements of Py by β in the hairpin Py-Im polyamide that recognizes 5′-GCGC-3′ can increase the binding affinity. 14 In this study, to characterize the effect of Py replacement with β in more detail, we synthesized four hairpin Py-Im polyamides 1-4 (Fig. 1) by the Fmoc-chemistry solid-phase synthesis method. Py-Im polyamides 1, 2, 3, and 4 contain zero, two, one and one internal Im/β pairs, respectively. As shown in Figure 1, Py-Im polyamide 2 contains two Im-β-Im units, and 3 and 4 contain one Im-β-Im unit at the C-terminal side and N-terminal side, respectively. Based on the recognition rule of Py-Im polyamides, the target DNA sequence of 1-4 is WGCGCW. 6,7 However, because of the symmetry of the target DNA sequence, the two single-mismatched target DNA sequences WGNGCW and WGCNCW are not distinguishable. Previously, we synthesized five Py-Im polyamides containing two β units at the N-terminus and demonstrated that the β-Dp at the C-terminus of the Py-Im polyamides had a slight steric preference for A·T or T·A relative to G·C or C·G. 8 Therefore, in this study, we synthesized the aforementioned four hairpin Py-Im polyamides 1-4 containing two β units at the N-terminus, and prepared four hairpin DNAs, ODN1-4, to characterize the binding affinity of the Py-Im polyamides containing Im/β pairs and the relationship between the position of the β substitution for Py and a mismatched DNA site (Fig. 1). ODN2 and 3 each contain one mismatched DNA site and ODN4 contains two mismatched DNA sites (Fig. 1b). After synthesis of 1-4, we purified them by reverse-phase HPLC and confirmed by analytical HPLC and ESI-TOFMS that the purity of 1-4 was more than 95%.

Aliphatic β-alanine increases the DNA association rate constant and decreases the DNA dissociation rate constant
To characterize the DNA binding affinity of 1-4, SPR experiments were performed as described in the experimental section, and the respective dissociation equilibrium constant (K_D), association rate constant (k_a) and dissociation rate constant (k_d) of 1-4 for the target DNAs were measured. ODN1, which includes the target DNA sequence of 1-4, was immobilized on a streptavidin-coated sensor chip and the Py-Im polyamide solutions were injected. The SPR sensorgrams are shown in Figure 2, and the kinetic binding parameters K_D, k_a and k_d are listed in Table 1. The K_D, k_a and k_d of 1 were 2.7 × 10⁻⁷ M, 8.2 × 10⁴ M⁻¹ s⁻¹, and 0.022 s⁻¹, respectively, and these data were consistent with previous data. 8 The K_D of 2 was determined to be 6.3 × 10⁻¹⁰ M, so the binding affinity of 2 was 430-fold higher than that of 1, which is consistent with previous data. 14 The k_a and k_d of 2 were determined to be 9.4 × 10⁵ M⁻¹ s⁻¹ and 5.9 × 10⁻⁴ s⁻¹, respectively, and interestingly, the k_a and k_d of 2 were improved by 11 and 37-fold, respectively, compared with those of 1.
In this study, we also measured the K_D, k_a and k_d of 3 and 4 (Fig. 2c and d, and Table 1). The K_D values of 3 and 4 were determined to be 7.0 × 10⁻¹⁰ and 4.4 × 10⁻¹⁰ M, respectively. The binding affinities of 3 and 4 were 390 and 610-fold higher, respectively, than that of 1, and comparable to that of 2. However, the k_a values of 3 and 4 were determined to be 1.3 × 10⁷ and 1.5 × 10⁷ M⁻¹ s⁻¹, respectively, and the k_d values of 3 and 4 were determined to be 9.1 × 10⁻³ and 6.5 × 10⁻³ s⁻¹, respectively (Table 1). Compared with those of 2, the k_a values of 3 and 4 were increased by 14 and 16-fold, respectively; however, the k_d values of 3 and 4 were also increased by 15 and 11-fold, respectively, indicating that 3 and 4 bind to and dissociate from the target DNA ODN1 faster than 2.
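The relationship between the three parameters can be checked with a short Python sketch. It assumes the standard 1:1 binding relation, in which the dissociation equilibrium constant is the ratio of the dissociation rate constant to the association rate constant, and uses only the rate constants for ODN1 quoted above. The dictionary and function names are hypothetical.

```python
# Rate constants for ODN1 quoted in the text (Table 1):
# association rate constant in M^-1 s^-1, dissociation rate constant in s^-1.
RATES = {
    1: (8.2e4, 0.022),
    2: (9.4e5, 5.9e-4),
    3: (1.3e7, 9.1e-3),
    4: (1.5e7, 6.5e-3),
}


def equilibrium_constant(k_a, k_d):
    """Dissociation equilibrium constant (M) for a 1:1 interaction."""
    return k_d / k_a


def fold_change(value, reference):
    """How many times larger `value` is than `reference`."""
    return value / reference


for compound, (k_a, k_d) in RATES.items():
    print(compound, f"{equilibrium_constant(k_a, k_d):.1e} M")

# Compounds 3 and 4 relative to 2: both rate constants rise by a similar
# factor, so their ratio (the equilibrium constant) stays nearly unchanged.
for compound in (3, 4):
    ka_fold = fold_change(RATES[compound][0], RATES[2][0])
    kd_fold = fold_change(RATES[compound][1], RATES[2][1])
    print(compound, round(ka_fold), round(kd_fold))
```

The second loop makes the point of the paragraph explicit: faster association is offset by faster dissociation, which is why 2, 3 and 4 have comparable affinities despite different kinetics.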

DNA binding affinities of Py-Im polyamides 1-4 to mismatched DNA
As described above, we demonstrated that a single replacement of Py with β can increase the binding affinity of the Py-Im polyamide just as well as two replacements with β. Interestingly, even though the K_D values of 3 and 4 were comparable to that of 2, the k_a and k_d values of 3 and 4 were higher than those of 2. These results indicated that the properties of 3 and 4 are different from those of 2, and that further characterization of these Py-Im polyamides would clarify not only the effect of Py replacement with β-alanine, but also the target DNA recognition mechanism of Py-Im polyamides. Therefore, we measured the K_D, k_a and k_d of 1, 2, 3 and 4 using ODN2, 3 and 4 (Supplementary Figs. S1-S3 and Tables 2-4) to characterize the sequence specificity of 1-4. The binding affinities of 1 to ODN2, 3 and 4 were significantly decreased by 250, 230 and 300-fold, respectively, compared with that to ODN1. On the other hand, the binding affinities of 2 to ODN2, 3 and 4 were decreased by 17, 17 and 29-fold, respectively, compared with that to ODN1 (Fig. 2, Supplementary Figs. S1-S3 and Tables 1-4). The k_a and k_d values of 2 for ODN2, 3 and 4 were impaired moderately, by 2.2 to 11-fold, compared with those for ODN1 (Tables 1-4). By contrast, the binding affinities of 1 to ODN2, 3 and 4 were significantly reduced, and the k_a values in particular were reduced by 40 to 60-fold, compared with ODN1 (Tables 1-4).
In this study, we also determined the K_D, k_a and k_d of 3 and 4 using ODN2, 3 and 4 (Supplementary Figs. S1-S3 and Tables 2-4). The K_D values of 3 for ODN2, 3 and 4 were 1.1 × 10⁻⁸, 8.3 × 10⁻⁸ and 1.9 × 10⁻⁷ M, respectively (Tables 2-4). Interestingly, the binding affinity of 3 to ODN2 was reduced by 16-fold relative to ODN1, which was comparable to the reductions observed for 2 with ODN2, 3 or 4; however, the binding affinities of 3 to ODN3 and 4 were significantly reduced, by 120 and 270-fold, respectively. For Py-Im polyamide 4, the K_D values for ODN2, 3 and 4 were 8.9 × 10⁻⁸, 1.1 × 10⁻⁸ and 4.7 × 10⁻⁸ M, respectively (Tables 2-4), and the binding affinity of 4 to ODN3 was reduced by a rather modest 25-fold, comparable to the reduction for 3 with ODN2. However, the binding affinities of 4 to ODN2 and 4 were reduced as significantly as those of 3 to ODN3 and 4.
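The position dependence described above can be made explicit by ranking each mismatched DNA by its penalty relative to ODN1. The following Python sketch uses only the dissociation equilibrium constants quoted in the text for 3 and 4; the names are hypothetical, and because the inputs are rounded to two significant figures, the computed ratios may differ slightly from the fold values reported.

```python
# Dissociation equilibrium constants (M) quoted in the text for 3 and 4.
K_D = {
    3: {"ODN1": 7.0e-10, "ODN2": 1.1e-8, "ODN3": 8.3e-8, "ODN4": 1.9e-7},
    4: {"ODN1": 4.4e-10, "ODN2": 8.9e-8, "ODN3": 1.1e-8, "ODN4": 4.7e-8},
}


def mismatch_penalties(compound):
    """Fold loss of affinity for each mismatched DNA relative to ODN1."""
    matched = K_D[compound]["ODN1"]
    return {
        odn: value / matched
        for odn, value in K_D[compound].items()
        if odn != "ODN1"
    }


def best_tolerated(compound):
    """Mismatched DNA with the smallest loss of affinity."""
    penalties = mismatch_penalties(compound)
    return min(penalties, key=penalties.get)


for compound in (3, 4):
    rounded = {odn: round(p) for odn, p in mismatch_penalties(compound).items()}
    print(compound, rounded, "best tolerated:", best_tolerated(compound))
```

The mirror-image outcome (3 tolerates ODN2 best, 4 tolerates ODN3 best) reflects the opposite placement of the single β substitution in the two compounds.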

DNA binding affinities to the target DNA
In this study, to characterize the substitution of β for Py, especially the difference between two β substitutions and a single replacement of Py with β in hairpin Py-Im polyamides targeting 5′-GCGC-3′, we measured the K_D, k_a and k_d values of four hairpin Py-Im polyamides 1-4 for matched and mismatched DNAs. The binding affinity of 2 was 430-fold higher than that of 1, and the k_a and k_d of 2 were improved by 11 and 37-fold, respectively, compared with those of 1. Previously, Dervan and co-workers reported the NMR structures of Py-Im polyamides containing β (Im-β-Im and Py-β-Im) bound to the target DNA, and the dihedral angles between the two Im rings in Im-β-Im, and between the Py ring and the Im ring in Py-β-Im, were determined to be 50° and 33°, respectively. 16 They suggested that the increase in dihedral angle for Im-β-Im is due to the need for proper orientation to maximize hydrogen bonding between N3 in Im and the 2-amino hydrogen of G. 16 In this study, we showed that the k_a and k_d of 2 were improved significantly, compared with those of 1. These results suggest that the structural flexibility conferred on 2 by replacement of two Py with β results in more stable complex formation for 2/ODN1 than for 1/ODN1, and that 2 requires less energy to change its structure for binding to the target DNA.
The K_D values of 2, 3 and 4 for ODN1 were comparable to each other; interestingly, however, the k_a and k_d values of 3 and 4 were higher than those of 2. To obtain insight into the differences among the observed k_a and k_d values, we calculated the model structure of ImβImPy by density functional theory, without the target DNA. The structure was folded (data not shown), but it was not curved like the model structure of ImPyImPy reported previously, 8 suggesting that ImβImPy is considerably more flexible than ImPyImPy and may form a folded structure that is unfavorable for DNA binding. However, the ImPyImPy in 3 and 4 may restrict the flexibility of ImβImPy and suppress the unfavorable folded structure, resulting in the observed higher k_a values of 3/ODN1 and 4/ODN1, compared with that of 2. At the same time, because 3 and 4 contain one ImPyImPy, 3/ODN1 and 4/ODN1 are less stable than 2/ODN1, resulting in higher k_d values for 3/ODN1 and 4/ODN1, compared with that of 2.

Application of the Py-Im polyamides to DNA binding modules for genetic regulation
Because Py-Im polyamides conjugated with the chromatin-modifying suberoylanilide hydroxamic acid (SAHA) have recently been used as artificial transcriptional activators for epigenetic regulation, [17][18][19] both higher DNA binding affinity and stronger sequence specificity are crucial in the design of Py-Im polyamides to suppress nonspecific DNA binding. As described above, the binding affinities of 2, 3 and 4 to the matched DNA were comparable to each other. In this study, we also measured the K_D, k_a and k_d values of 1, 2, 3, and 4 for the mismatched DNAs ODN2, 3 and 4. The binding affinities of 1 to ODN2, 3 and 4 were decreased by 250, 230 and 300-fold, respectively, compared with that to ODN1, and the k_a values in particular were reduced by 40 to 60-fold (Tables 1-4), whereas the binding affinities of 2 to ODN2, 3 and 4 were decreased by 17, 17 and 29-fold, respectively, compared with that to ODN1 (Tables 1-4). To evaluate the sequence specificities of 2-4, we also determined the free energy change (ΔG°, kcal/mol) from the K_D upon the formation of the Py-Im polyamide 1-4/DNA complexes (Table 5). The differences in ΔG° between 1/ODN1 and 1/ODN2, 1/ODN3, or 1/ODN4 were −3.3, −3.3, and −3.4 kcal/mol, respectively (Table 5), whereas the differences in ΔG° between 2/ODN1 and 2/ODN2, 2/ODN3, or 2/ODN4 were −1.8, −1.7, and −2.1 kcal/mol, respectively (Table 5). These results suggest that, owing to the flexibility of the two ImβImPy units at the N- and C-terminal sides of 2, the structure of 2 can change easily and its curvature may closely match the minor groove of 1 bp or 2 bp mismatched DNA sites, resulting in only a moderate impairment of the k_a and k_d values of 2 for ODN2, 3 and 4, compared with those for ODN1.
On the other hand, because of its planar structure and lack of structural flexibility, 1 cannot change its structure as much as 2 can. Therefore, the k_a values of 1 for mismatched DNA targets were significantly reduced, even for a 1 bp mismatched DNA target such as ODN2 or 3. Thus, even though the DNA binding affinity of 1 is much lower than that of 2, 1 has a stronger sequence specificity than 2.
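The free energy differences quoted above can be approximated from the fold changes in affinity. The Python sketch below assumes the standard relation between the free energy of complex formation and the dissociation equilibrium constant (free energy = RT ln K_D) and a temperature of 25 °C, matching the SPR measurement temperature; the text does not state the exact formula or temperature used for Table 5, so both are assumptions, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1
T_KELVIN = 298.15     # assumed: 25 degrees C, the SPR measurement temperature


def binding_free_energy(k_d_molar):
    """Standard free energy of complex formation (kcal/mol) from K_D in M."""
    return R_KCAL * T_KELVIN * math.log(k_d_molar)


def free_energy_difference(fold_loss):
    """Matched minus mismatched free energy for a given fold loss of affinity.

    Negative values mean the matched complex is more stable.
    """
    return -R_KCAL * T_KELVIN * math.log(fold_loss)


# Fold losses of affinity quoted in the text for ODN2, ODN3 and ODN4
for compound, folds in {1: (250, 230, 300), 2: (17, 17, 29)}.items():
    print(compound, [round(free_energy_difference(f), 1) for f in folds])
```

Values computed this way from the rounded fold changes agree with the reported differences to within about 0.1 kcal/mol; the residual discrepancy is expected because Table 5 was presumably derived from unrounded K_D values.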

Polyamide polymers at the N-terminal and C-terminal sides in hairpin Py-Im polyamide assist and coordinate each other for matched and mismatched DNA binding
As described above, the K_D values of 3 and 4 for the matched DNA were comparable to that of 2, whereas the k_a and k_d values of 3 and 4 were higher than those of 2, indicating that further characterization of these Py-Im polyamides would clarify not only the effect of Py replacement with β, but also the detailed target DNA recognition mechanism of Py-Im polyamides.
In contrast to 3/ODN2 or 4/ODN3, in the case of 3/ODN3 and 4/ODN2 the ImβImPy is placed at the 1 nt mismatched 5′-GCAC-3′ and the ImPyImPy is placed at the matched 5′-GTGC-3′, and the binding affinities of 3/ODN3 and 4/ODN2 were significantly reduced. In the case of 3/ODN4 and 4/ODN4, the binding affinities were comparable to those of 3/ODN3 or 4/ODN2, and interestingly, these affinities were better than that of 1/ODN1 (Tables 1-3). However, because the complex of the ImβImPy with the 1 nt mismatched DNA site is less stable than its complex with a matched DNA site, the binding affinities of 3/ODN3, 3/ODN4, 4/ODN2, and 4/ODN4 were reduced, compared with those of 3/ODN2 or 4/ODN3.
As described above, the binding affinities of 2 to ODN2, 3 and 4 were comparable to each other. Because 2 contains two ImβImPy units, at the N-terminal and C-terminal sides, in the case of ODN4 both ImβImPy units face the 1 nt mismatched DNA, 5′-GTAC-3′; however, the binding affinity of 2/ODN4 was comparable to those of 2/ODN2 or 3. Furthermore, as discussed above, Dervan and co-workers have demonstrated that the binding affinities of the Py-Im polyamide Im-β-ImPy-γ-Im-β-ImPy-β-Dp for the 2 bp mismatched DNA targets 5′-TGGCCA-3′ and 5′-TGGGGA-3′ were decreased by 26 and 34-fold, respectively, compared with the affinity for the matched DNA 5′-TGCGCA-3′. 14 In the case of 5′-TGGCCA-3′, both ImβImPy units face 2 nt mismatched DNA, 5′-GGCC-3′. In the case of 5′-TGGGGA-3′, one ImβImPy faces 2 nt mismatched DNA, 5′-GGGG-3′, while the other ImβImPy faces 2 nt mismatched DNA, 5′-CCCC-3′. These results suggest that even though both ImβImPy units face 2 nt mismatched DNAs, the flexibility of ImβImPy allows both units to interact with the mismatched DNA, and the N-terminal and C-terminal polyamides in the Py-Im polyamide assist each other in binding to the target DNA.
In this study, using SPR assay, we demonstrated the effect of replacing Py with β on the association and dissociation rate constants and on the sequence specificity of the Py-Im polyamide. Even though the target DNA sequence of the Py-Im polyamide is determined by the anti-parallel pairings of the Py-Im polyamide, 6,7 our data suggest that the structural flexibilities of the N-terminal and C-terminal sides of the polyamide are substantially involved in the association and dissociation rate constants of the respective Py-Im polyamide, and consequently in its binding affinity to mismatched DNA. We are now planning to synthesize Py-Im polyamides with different numbers of Im and different substitutions of β for Py. Further analysis of these Py-Im polyamides will provide valuable information about the effect of β substitution and the target DNA recognition mechanism.
Table 5. Free energy change (ΔG°, kcal/mol) on the formation of 1-4/ODN1-4 complexes
Electrospray ionization time-of-flight mass spectrometry (ESI-TOFMS) was conducted with a BioTOF II (Bruker Daltonics) mass spectrometer to determine the molecular weight of Py-Im polyamides 1-4.

Polyamide synthesis
Py-Im polyamides 1-4 were synthesized in a stepwise reaction using a previously described Fmoc solid-phase protocol. 15 Syntheses were performed using a Pioneer Peptide Synthesizer (PSSM-8, Shimadzu) with a computer-assisted operation system on a 36 µmol scale (100 mg of Fmoc-β-alanine Wang resin). After the synthesis, Dp was mixed with the resin and the mixture was shaken at 550 rpm for 4 h at 55 °C to detach the Py-Im polyamides from the resin. Purification of Py-Im polyamides 1-4 was performed using a high-performance liquid chromatography (HPLC) PU-2080 Plus series system (JASCO), with a 10 mm × 150 mm ChemcoPak Chemcobond 5-ODS-H reverse-phase column in 0.1% TFA in water with acetonitrile as eluent, at a flow rate of 3 mL/min, and a linear gradient from 20% to 60% acetonitrile over 20 min, with detection at 254 nm. Collected fractions were analyzed by ESI-TOFMS.

Surface plasmon resonance (SPR) assay
All SPR experiments were performed on a BIACORE X instrument at 25 °C as described previously. 15,20 The sequences of the biotinylated hairpin DNAs containing target sequences are shown in Figure 1b. The hairpin DNAs were immobilized on a streptavidin-coated SA sensor chip at a flow rate of 20 µL/min to obtain the required immobilization level (up to approximately 1400 resonance units (RU) rise). Experiments were carried out using HBS-EP buffer (10 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), 150 mM NaCl, 3 mM ethylenediaminetetraacetic acid (EDTA), and 0.005% Surfactant P20) with 0.1% DMSO at 25 °C, pH 7.4. A series of sample solutions were prepared in HBS-EP buffer with 0.1% DMSO and injected at a flow rate of 20 µL/min. To determine the dissociation equilibrium constant (K_D) and the association and dissociation rate constants (k_a and k_d), data processing was performed with an appropriate fitting model using the BIAevaluation 4.1 program. The sensorgrams of all data were fitted using the 1:1 binding model with mass transfer. The values of K_D, k_a, and k_d for all data are summarized in Tables 1-4.
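For readers unfamiliar with kinetic fitting of sensorgrams, the shape of the curves being fitted can be sketched in Python. This is a simplified illustration of an ideal 1:1 binding model only: it omits the mass-transfer term included in the model used here, it is not the BIAevaluation implementation, and the maximum response, analyte concentration and function names are hypothetical.

```python
import math


def association_response(t, conc, k_a, k_d, r_max):
    """Response (RU) at time t (s) after injection starts, ideal 1:1 model.

    Solves dR/dt = k_a * conc * (r_max - R) - k_d * R with R(0) = 0.
    """
    k_obs = k_a * conc + k_d              # observed rate constant (s^-1)
    r_eq = r_max * k_a * conc / k_obs     # plateau response at this conc
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_d):
    """Response (RU) at time t (s) after the injection ends."""
    return r_start * math.exp(-k_d * t)


# Rate constants quoted in the text for polyamide 2 with ODN1
K_A, K_D_RATE = 9.4e5, 5.9e-4
R_MAX = 100.0   # hypothetical maximum response (RU)
CONC = 10e-9    # hypothetical analyte concentration (M)

r_end = association_response(180.0, CONC, K_A, K_D_RATE, R_MAX)
print(round(r_end, 1), round(dissociation_response(300.0, r_end, K_D_RATE), 1))
```

In a fit, the rate constants and maximum response would be adjusted so that curves of this form match the measured sensorgrams across the injected concentration series.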