Sequence-Dependent T:G Base Pair Opening in DNA Double Helix Bound by Cren7, a Chromatin Protein Conserved among Crenarchaea

T:G base pairs arising from spontaneous deamination of 5mC or from polymerase errors pose a great challenge for DNA repair in hyperthermophilic archaea, especially Crenarchaea. Most strains in this phylum lack homologues of the proteins responsible for mismatch recognition in DNA repair pathways. To investigate whether Cren7, a highly conserved chromatin protein in Crenarchaea, plays a role in the repair of T:G mispairs, the crystal structures of the Cren7-GTAATTGC and Cren7-GTGATCGC complexes were solved at 2.0 Å and 2.1 Å, respectively. In our structures, binding of Cren7 to the AT-rich DNA duplex (GTAATTGC) induces opening of the T2:G15 but not the T10:G7 base pair. By contrast, both T:G mispairs in the GC-rich DNA duplex (GTGATCGC) retain the classic wobble type. Structural analysis also showed changes in the helical parameters of GTAATTGC, especially in the steps around the open T:G base pair, as compared with GTGATCGC or the matched DNAs. Surface plasmon resonance assays revealed a 4-fold lower binding affinity of Cren7 for GTAATTGC than for GTGATCGC, which was mainly due to a decrease in the association rate. These results suggest that binding of Cren7 to DNA leads to T:G mispair opening in a sequence-dependent manner, and point to a potential role of Cren7 in DNA repair.


Introduction
Thymine-guanine (T:G) and uracil-guanine (U:G) base pairs are generated in DNA either by the spontaneous hydrolytic deamination of 5-methylcytosine (5mC) and cytosine (C), respectively [1], or by misincorporation during DNA replication [2]. Deamination of pyrimidines occurs at approximately 200-300 events per cell per day, a rate about 50-fold higher than that for purines [3]. If not repaired before DNA replication, these mismatches cause C to T transition mutations in 50% of the progeny DNA [4]. The U:G mispair poses little problem, since uracil, which is not a natural DNA base, is efficiently removed in a base-excision repair pathway initiated by uracil DNA-glycosylase (UDG) [5,6]. The T:G mispair presents a greater challenge, however, as the thymine arising from deamination of 5mC is indistinguishable from other thymines in the genomic DNA.
Organisms have evolved DNA repair pathways to process T:G mispairs, including the mismatch repair (MMR), base-excision repair (BER) and nucleotide excision repair (NER) pathways. Crenarchaea, a major lineage of hyperthermophilic archaea, lack the critical components of the highly conserved NER and MMR pathways [7,8]. Although the structure-specific nucleases of eukaryotic NER are present, the homologues of the damage-recognition proteins are missing. Similarly, Crenarchaeota do not encode orthologues of the MMR-specific proteins MutS and MutL, which initiate the MMR pathway [9]. In the BER pathway, thymine DNA-glycosylase (TDG) plays a central role. Among Crenarchaea, however, enzymes with TDG activity have been identified only in Pyrobaculum aerophilum [10] and Aeropyrum pernix [11].
Chromatin proteins have been reported to be directly involved in DNA repair. In bacteria, the histone-like protein HU from Escherichia coli was found to play roles in the repair of abasic sites and closely opposed lesions [12,13]. Eukaryotic histones also participate in various DNA repair pathways, including MMR and DNA double-strand break repair, through regulation of their methylation and acetylation [14,15]. Crenarchaea, one of the two major phyla of cultured archaea, possess a number of small, basic chromatin proteins, including Sul7d, CC1 and Cren7 [16,17]. Of these, Cren7 is the only one conserved at the kingdom level. In Sulfolobus solfataricus P2, a model strain of Crenarchaeota, Cren7 is synthesized in abundance, constituting 1% of the total cellular protein [1]. This protein preferentially binds dsDNA over ssDNA and changes the DNA geometry in vitro [1]. Moreover, the protein is methylated dynamically at multiple sites in vivo [18].
Here we report the crystal structures of Cren7 in complex with two DNA sequences containing T:G mispairs. In the crystals, binding of Cren7 to the AT-rich DNA duplex (GTAATTGC) induced opening of the T2:G15 but not the T10:G7 base pair. By contrast, both T:G mispairs in the GC-rich DNA duplex (GTGATCGC) retained the classic wobble type. SPR assays showed a 4-fold lower binding affinity of Cren7 for GTAATTGC than for GTGATCGC. These results suggest that binding of Cren7 to DNA leads to T:G mispair opening in a sequence-dependent manner.

Protein expression and purification
The recombinant Cren7 protein was expressed and purified using a modification of the method described previously [1]. Briefly, cell lysates were prepared by sonication in 20 mM Tris-Cl, pH 6.8, and 1 mM EDTA (buffer A). After centrifugation, the supernatant was heat-treated at 80°C for 20 min and centrifuged again. The sample was applied to a HiTrap SP Sepharose XL column (5 ml, GE) equilibrated in buffer A, and eluted successively with 200 and 500 mM KCl. Proteins eluted in 500 mM KCl were dialyzed against buffer A, concentrated to above 20 mg/ml and stored at -80°C.

Preparation of oligonucleotides
All the oligonucleotides used in the present work were commercially synthesized at Sangon BioTech (Shanghai, China). The octamers, 5'-GTAATTGC, 5'-GTGATCGC and 5'-GCGATCGC, were dissolved in 20 mM Tris-Cl, pH 6.8, 100 mM NaCl and 1 mM EDTA to a final concentration of 10 mM oligonucleotide. Duplex DNA fragments for crystallization were prepared by allowing each octamer to self-anneal: the sample was heated at 95°C for 5 minutes in a water bath and then cooled to room temperature over two hours. For SPR analysis, DNA duplexes with a 5'-(dA)8 overhang were prepared by annealing each octamer to the 5'-biotin-labeled oligomer containing its sequence (5'-biotin-AAAAAAAAGTAATTGC, 5'-biotin-AAAAAAAAGTGATCGC and 5'-biotin-AAAAAAAAGCGATCGC) in a molar ratio of 1.5:1.
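The positions of the mispairs that result from self-annealing can be checked with a short sketch. It assumes that residues of the first strand are numbered 1-8 and those of the second copy 9-16, which is consistent with the labels used in the text (T2:G15, T10:G7); the function names are hypothetical and serve only as an illustration.

```python
# Watson-Crick partners; any other combination is treated as a mispair.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def self_anneal_pairs(seq):
    """Pair an oligomer with an antiparallel copy of itself.

    Residues of the first strand are numbered 1..n and those of the
    second copy n+1..2n (an assumed convention, see the lead-in).
    Returns a list of (label_strand1, label_strand2, is_matched).
    """
    n = len(seq)
    pairs = []
    for i in range(n):
        j = n - 1 - i  # index of the partner base within the second copy
        label_1 = f"{seq[i]}{i + 1}"
        label_2 = f"{seq[j]}{n + j + 1}"
        matched = (seq[i], seq[j]) in WATSON_CRICK
        pairs.append((label_1, label_2, matched))
    return pairs


def mispairs(seq):
    """Return only the non-Watson-Crick pairs of the self-annealed duplex."""
    return [(a, b) for a, b, matched in self_anneal_pairs(seq) if not matched]


if __name__ == "__main__":
    for octamer in ("GTAATTGC", "GTGATCGC", "GCGATCGC"):
        print(octamer, mispairs(octamer))
```

Under this numbering, each of the two mismatched octamers gives two T:G mispairs per duplex, and GCGATCGC is fully self-complementary.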

Crystallization, data collection and structure determination
Crystals of Cren7 in complex with d(GTAATTGC)2 or d(GTGATCGC)2 duplexes were grown at 20°C using sitting-drop vapor diffusion. The protein and the duplex DNA were mixed at a molar ratio of 1:1.5 to a final concentration of 1.5 mM for the protein. The sample (1 μl) was mixed with the reservoir solution (1 μl) and equilibrated with 30% PEG1500 at 20°C. The crystals were mounted on nylon loops and immediately frozen in liquid nitrogen. Diffraction data were collected at the BL17U beamline of the Shanghai Synchrotron Radiation Facility (Cren7-GTGATCGC, 2.1 Å) and with in-house Cu Kα X-rays generated by a Rigaku MicroMax-007 rotating-anode X-ray source (Cren7-GTAATTGC, 2.0 Å). Both data sets were integrated and scaled with HKL2000 [19].
The structures of Cren7-d(GTAATTGC)2 and Cren7-d(GTGATCGC)2 were determined by molecular replacement with the program Phaser [20] from the CCP4 program suite [21], using the crystal structure of Cren7-d(GTGATCAC)2 (PDB code: 3LWH) as the initial model. The software packages Refmac [22] and Coot [23] were used to complete the model. MolProbity was used to validate the structure [24]. The Ramachandran plots showed that 98.2% of the residues were in the most favorable region, and that no residues were in the disallowed region. Statistics for data collection and refinement are given in Table 1. DNA conformations were analyzed using the program X3DNA [25]. All images were prepared using Pymol (http://www.pymol.org).

Surface plasmon resonance assays (SPR)
SPR experiments were carried out at 25°C using the BIAcore 3000 instrument (BIAcore AB, Uppsala, Sweden). The running buffer contained 50 mM Tris-Cl, pH 7.5, 100 mM NaCl, 1 mM EDTA and 0.005% (v/v) Tween 20. The DNA duplexes with a 5'-(dA)8 overhang on one of the strands (described under Preparation of oligonucleotides) were immobilized on the SA sensor chip to 90-100 response units (RU). The overhang was intended to avoid steric hindrance during Cren7 binding to DNA on the surface of the chip, and was not expected to affect the SPR results because of the low affinity of Cren7 for single-stranded DNA. A blank flow cell was used as the reference to correct for instrumental and concentration effects. Cren7 at concentrations spanning the KD value for the interaction of the protein with the DNA was injected over the DNA surface for 2 min at a flow rate of 30 μl/min. After the dissociation phase (2-4 min), bound protein was removed by a 30-sec wash with 0.01% SDS, followed by a 60-sec buffer injection. For each experiment, the measurement was repeated once at the protein concentration of 1.25 μM. Equilibrium and kinetic constants were calculated by a global fit to a 1:1 Langmuir binding model (BIAevaluation 4.1 software).
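The 1:1 Langmuir model used for the global fit can be illustrated with a minimal sketch of the idealized sensorgram it predicts. The rate constants, the maximal response and all function names below are hypothetical placeholders, not fitted values from this work; only the 1.25 μM concentration and the 2-min injection are taken from the protocol above. The sketch does not reproduce the fitting procedure of the BIAevaluation software.

```python
import math


def dissociation_constant(k_a, k_d):
    """Equilibrium dissociation constant K_D = k_d / k_a (in M)."""
    return k_d / k_a


def equilibrium_response(conc, k_a, k_d, r_max):
    """Steady-state response R_eq = R_max * C / (C + K_D) for 1:1 binding."""
    return r_max * conc / (conc + dissociation_constant(k_a, k_d))


def association_response(t, conc, k_a, k_d, r_max):
    """Response during injection, starting from an empty surface.

    Solution of dR/dt = k_a * C * (R_max - R) - k_d * R with R(0) = 0.
    """
    k_obs = k_a * conc + k_d  # observed rate constant (1/s)
    r_eq = equilibrium_response(conc, k_a, k_d, r_max)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r_start, k_d):
    """Response after the injection ends: exponential decay from r_start."""
    return r_start * math.exp(-k_d * t)


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to illustrate the model.
    k_a = 1.0e5      # association rate constant, 1/(M s)
    k_d = 0.1        # dissociation rate constant, 1/s
    r_max = 100.0    # maximal response, RU
    conc = 1.25e-6   # protein concentration, M (1.25 uM, as in the protocol)

    r_end = association_response(120.0, conc, k_a, k_d, r_max)  # 2-min injection
    for t in (0.0, 5.0, 10.0, 30.0):
        print(f"dissociation t={t:5.1f} s  R={dissociation_response(t, r_end, k_d):.2f} RU")
```

In a global fit, a single pair of rate constants is required to describe the curves at all protein concentrations simultaneously, which is why concentrations spanning the KD value are informative.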

Overall structures of Cren7-DNA complexes
To investigate the structural details of the DNA containing T:G base pairs, the crystal structures of Cren7-d(GTAATTGC)2 (PDB code: 5K07) and Cren7-d(GTGATCGC)2 (PDB code: 5K17) were determined and refined at 2.0 Å and 2.1 Å, respectively. A summary of data collection and final refinement statistics is given in Table 1. The two complexes have different crystal packing patterns. In the crystal of Cren7-GTAATTGC, one protein-DNA complex is found in the asymmetric unit. By contrast, two complexes are found in the asymmetric unit of Cren7-GTGATCGC; they are related by a perfect non-crystallographic 2-fold symmetry axis, with a root-mean-square deviation (RMSD) of 0.018 Å between them.
The two Cren7-DNA complexes share a similar overall structure, with an RMSD of 0.305 Å over the backbone atoms (Fig 1A-1C). A notable distinction is the flipping of the base of T2 in GTAATTGC (Fig 1C). As in other Cren7-DNA complexes reported previously (Fig 1D), Cren7 binds to the mismatched DNA duplex in the minor groove as a monomer, covering almost all of the eight base pairs. The bound Cren7 induces a sharp kink (~50°) toward the major groove by intercalating the side chain of Leu28 into the A3A4 step of d(GTAATTGC)2 (S1A Fig), while a smaller kink (~48°) is observed at the G3A4 step of d(GTGATCGC)2 (S1B Fig). At the intercalation site, the minor groove of the DNA duplex is drastically widened by ~5.7 Å, with the major groove unchanged. Structural comparison of the Cren7 molecules in different DNA complexes showed few conformational changes (Fig 1E), suggesting that the protein retains its overall conformation when binding to different DNA sequences. All DNA-interacting residues are located on the triple-stranded β-sheet and loop β3-β4 of Cren7 (Fig 1E). The side-chain conformations of these amino acid residues exhibit small variations. The patterns of protein-DNA contacts of the Cren7-GTGATCGC and Cren7-GTAATTGC complexes are almost identical to those of

T:G base pairs
Among the most notable structural features of the Cren7-GTAATTGC complex is the flipping of the pyrimidine ring of T2 (Fig 1C). To learn more about the atomic details of the unusual T2:G15 base pairing, the structures of all the T:G mispairs in the Cren7-DNA complexes were compared (Fig 2). Both T:G base pairs in the Cren7-GTGATCGC complex form a classic wobble pair with hydrogen bonds from G-O6 to T-N3 (2.74 Å for G7:T10 and 2.72 Å for G15:T2) and from G-N1 to T-O2 (2.91 Å for G7:T10 and 2.83 Å for G15:T2) (Fig 2A and 2B). In the Cren7-GTAATTGC complex, the G7:T10 base pair is also of the wobble type, with normal hydrogen bonds between G-O6 and T-N3 (2.77 Å) and between G-N1 and T-O2 (2.91 Å) (Fig 2C). By contrast, the G15:T2 base pair in this complex adopts a totally different conformation, with an unusual hydrogen bonding geometry instead of the wobble type (Fig 2D). The hydrogen bond from G-O6 to T-N3 is broken, whereas the G-N1 to T-O2 hydrogen bond (2.75 Å) remains intact. Intriguingly, the hydrogen bond to T-O2 is bifurcated, as an additional hydrogen bond is formed from G-N2 to this atom (2.58 Å). Although it has been reported that forming the bifurcated O2-N2 hydrogen bond is not the discriminating factor for the opening of G:U/G:T base pair [26],

DNA deformation
The mismatched DNA sequences bound by Cren7 resemble the matched ones in global conformation, except for an obvious translocation of the base of T2 in GTAATTGC (Fig 1F). The distortion of the T2:G15 base pair in GTAATTGC leads to a B to A transformation of the G1pT2pA3 steps in addition to the A3pA4 step, whereas the latter is the only step undergoing this type of transformation in other DNA sequences bound by Cren7.
Detailed DNA conformations of the Cren7-DNA complexes are listed in Table 3. The average conformational parameters of the base pair steps in GTGATCGC resemble those in GCGATCGC bound by Cren7 [27], suggesting that the T:G mispairs have little effect on the conformation of GC-rich DNA bound by Cren7. However, the situation is quite different for GTAATTGC, an AT-rich mismatched DNA duplex, as its DNA helical parameters are significantly changed compared with those of other DNA sequences. The average roll angle of GTAATTGC (~12.1°) is much larger than that of GTGATCGC (~9.9°). In fact, the value is the largest among all the DNA duplexes bound by Cren7 (~10.5° for GTAATTAC, 9.9° for GCGATCGC and ~9.6° for GTGATCAC, respectively) [2,3,28], indicating that GTAATTGC is generally more curved than other DNA duplexes bound by Cren7. However, the angle of the sharp kink that occurs at the A3pA4 step in GTAATTGC (~50°) is even lower than that in GTAATTAC (~53°). The increase in DNA curvature in the Cren7-GTAATTGC complex is mainly due to the obvious positive rolls at the G1pT2pA3 steps (about 2.5° and 16.1°, respectively), since a slightly negative roll of about -0.4° and a small positive roll of 8.6° are observed at the corresponding steps in the Cren7-GTAATTAC complex. Binding by Cren7 also induces undertwisting of the DNA helix. Intriguingly, unlike in the matched DNA sequences in complex with Cren7, the smallest twist in GTAATTGC is observed at the T2pA3 step (~10.1°) instead of the site of intercalation (A3pA4 step, ~17.3°). A similar case is found for GTGATCGC, with the values of the corresponding steps being about 15.4° and 18.8°, respectively. The reduction of twist at the T2pA3 step in GTAATTGC is the main reason why its average twist per step is the smallest among the DNA sequences bound by Cren7.
Table 3. DNA helical parameters of the Cren7-DNA complexes.

GTAATTGC
In addition to DNA bending and unwinding, the opening of the T2:G15 base pair in GTAATTGC appears to affect other aspects of DNA conformation. The DNA helical parameters of the T2pA3 step, which is adjacent to the T2:G15 base pair, are significantly changed in Cren7-GTAATTGC (Table 3). The helical rise (H-rise) at the T2pA3 step (1.83 Å) is much smaller than that of the T2pG3 step in GTGATCGC bound by Cren7 (2.81 Å). A much larger inclination (~57.4°) is observed at this step in GTAATTGC than in GTGATCGC (~38.5°), and a negative tip of about -11.7° is seen here, in contrast to the positive tip in the latter (~2.6°).

Kinetic analysis of the interaction of Cren7 with mismatched DNA
To learn more about the effects of T:G mispairs on the interactions between Cren7 and DNA, we analyzed the binding kinetics of Cren7 to three DNA sequences (GTAATTGC, GTGATCGC and GCGATCGC) by SPR (Fig 3). In all the experiments, a 5'-(dA)8 overhang on the short DNA fragments was used to avoid steric hindrance during Cren7 binding to DNA on the surface of the chip. The single-stranded region was not expected to affect the SPR results because of the 10-fold lower affinity of Cren7 for ssDNA [1]. Despite the low melting temperatures (10-20°C) of the DNA duplexes used in the SPR assays, more than 85% of each type of DNA fragment remained in the dsDNA form, as revealed by polyacrylamide gel electrophoresis (data not shown). In the SPR assays, the baseline, which reflects the mass of the DNA strands immobilized on the chip surface, showed no visible decrease throughout each experiment, suggesting that melting of the dsDNA barely occurred. In addition, the two curves derived from the repeated running cycles at the protein concentration of 1.25 μM in each experiment shown in Fig 3 almost overlapped, which further indicated that the immobilized DNA duplexes remained unchanged during the measurements. Therefore, these results reflect the different binding affinities of Cren7 for matched or mismatched dsDNA fragments rather than differences in the thermal stabilities of the DNA duplexes. Table 4 lists the kinetic parameters of Cren7 binding to the different DNA sequences. Cren7 showed a 1.3-fold reduction in binding affinity for GTGATCGC as compared with that for GCGATCGC. This resulted from the combined effects of a slightly decreased association rate (ka) and a slightly increased dissociation rate (kd), indicating that the presence of T:G wobble base pairs had limited effects on both Cren7-DNA contacts and the stability of the Cren7-DNA complex.
By contrast, the binding affinity of Cren7 for GTAATTGC was 4.6-fold lower than that for the matched DNA. This reduction was due mainly to the decrease in the association rate for GTAATTGC as compared with that for GTGATCGC. These results suggest that the opening of one of the T:G mispairs largely hinders the initial contact of Cren7 with the DNA rather than reducing the stability of the Cren7-DNA complex.
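The reasoning above rests on the relation KD = kd/ka for a 1:1 interaction: a fold change in affinity is the product of a fold decrease in the association rate and a fold increase in the dissociation rate. A minimal sketch of this decomposition follows. The rate constants in it are hypothetical placeholders and are not the values in Table 4; the function names are likewise illustrative.

```python
def dissociation_constant(k_a, k_d):
    """Equilibrium dissociation constant K_D = k_d / k_a."""
    return k_d / k_a


def fold_change_decomposition(reference, test):
    """Split a fold change in K_D into kinetic contributions.

    `reference` and `test` are (k_a, k_d) tuples. Returns
    (association_factor, dissociation_factor, total_factor), where
    total_factor = K_D(test) / K_D(reference)
                 = association_factor * dissociation_factor.
    """
    ka_ref, kd_ref = reference
    ka_test, kd_test = test
    association_factor = ka_ref / ka_test    # > 1 means slower association
    dissociation_factor = kd_test / kd_ref   # > 1 means faster dissociation
    total_factor = association_factor * dissociation_factor
    return association_factor, dissociation_factor, total_factor


if __name__ == "__main__":
    # Hypothetical rate constants (k_a in 1/(M s), k_d in 1/s), for illustration only.
    matched = (3.0e5, 0.10)
    mismatched = (1.0e5, 0.12)
    a, d, total = fold_change_decomposition(matched, mismatched)
    print(f"association contributes {a:.2f}-fold, dissociation {d:.2f}-fold, total {total:.2f}-fold")
```

When the association factor accounts for most of the total, as described for GTAATTGC, the loss of affinity reflects slower complex formation rather than a less stable complex.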

Discussion
The maintenance of genomic integrity is a crucial task for all living cells, as many environmental factors contribute to DNA damage. Of the environmental extremes accommodated by Crenarchaeota, high temperature has particular significance, as it cannot be excluded from the interior of microbial cells. High temperature directly accelerates reactions such as the hydrolytic deamination of nucleotide bases, which generates U:G or T:G base pairs. Despite the lack of crucial protein homologues in the MMR, NER or BER pathways, the rate of mutation in Sulfolobus, a model genus of Crenarchaea, is not higher than that in Escherichia coli [29], indicating that DNA damage is repaired efficiently.
In the present work, we solved the crystal structures of the Cren7-GTAATTGC and Cren7-GTGATCGC complexes. In the former complex, the T2:G15 base pair shows a characteristic opening angle of ~45°, indicating that it is in an open state. The hydrogen bonding geometry of this base pair also resembles that of the open U:G base pair in the TGT/AUA sequence [26]. Although predictions based on the pairing of complementary bases and the stacking of pairs suggested no obvious sequence dependence of base pair opening probabilities [30], Cren7 may preferentially induce the opening of T:G base pairs in AT-rich DNA sequences, given that both T:G base pairs in the Cren7-GTGATCGC complex retain the classic wobble type. Cren7 might therefore readily induce the opening of T:G base pairs in the genomic DNA of Sulfolobus cells, owing to its high AT content. Opening of the T:G base pair induces more DNA bending toward the major groove and more DNA undertwisting than in other DNA duplexes bound by Cren7. These severe DNA distortions may in turn affect the protein-DNA interactions, as evidenced by the 4-fold lower binding affinity of Cren7 for GTAATTGC than for GTGATCGC. Taken together, these results suggest that opening of the T:G base pair in DNA bound by Cren7 might promote the activity of some DNA repair enzymes in recognizing the mispairs and excising the thymine.