The Role of Methylation in the Intrinsic Dynamics of B- and Z-DNA

Methylation of cytosine at the 5-carbon position (5mC) is observed in both prokaryotes and eukaryotes. In humans, DNA methylation at CpG sites plays an important role in gene regulation and has been implicated in development, gene silencing, and cancer. In addition, the CpG dinucleotide is a known hot spot for pathologic mutations genome-wide. CpG tracts may adopt left-handed Z-DNA conformations, which have also been implicated in gene regulation and genomic instability. Methylation facilitates this B-Z transition but the underlying mechanism remains unclear. Herein, four structural models of the dinucleotide d(GC)5 repeat sequence in B-, methylated B-, Z-, and methylated Z-DNA forms were constructed and an aggregate 100 nanoseconds of molecular dynamics simulations in explicit solvent under physiological conditions was performed for each model. Both unmethylated and methylated B-DNA were found to be more flexible than Z-DNA. However, methylation significantly destabilized the BII, relative to the BI, state through the Gp5mC steps. In addition, methylation decreased the free energy difference between B- and Z-DNA. Comparisons of α/γ backbone torsional angles showed that torsional states changed marginally upon methylation for B-DNA, and Z-DNA. Methylation-induced conformational changes and lower energy differences may contribute to the transition to Z-DNA by methylated, over unmethylated, B-DNA and may be a contributing factor to biological function.


Introduction
DNA methylation is one of the main epigenetic modifications contributing to gene regulation and a considerable amount of scientific effort has been devoted to understanding the mechanisms, roles, and effects of methylation in healthy and diseased states [1]. DNA methylation has been reported to play a role in development [2], gene silencing [3], and carcinogenesis [4]. In addition, the CpG dinucleotide, which represents the main target for methylation in humans, is a known hot spot for pathological mutations [5,6].
DNA has a flexible backbone structure characterized by fluctuations in the torsional angles a, c, e, f, and sugar pucker (d) (See Figure 1). The a, c angles are associated with canonical/ non-canonical backbone states and e, f are associated with BI and BII substates. B-DNA has been observed to prefer the BI state over the BII state in crystals and solution [7,8]. These two states are defined by the backbone torsional angles e and f (Figure 1), where e2f,0 defines the BI state and e2f.0 defines the BII state. These conformational ensembles are important since they play a direct role in shaping the DNA backbone, and consequently underlie protein-DNA recognition [7,8].
Most repetitive DNA sequences can assume different structures apart from canonical B-DNA, including cruciforms, triplexes, hairpins, quadruplexes and Z-DNA [9,10,11]. These ''noncanonical'' structural forms have been implicated in genomic instability and disease [12]. Z-DNA is a left-handed helical structure where the alternating purine (generally guanine) and pyrimidine (generally cytosine) base pairs form a zigzagging pattern [13]. The transition from right-handed B-DNA to lefthanded Z-DNA is accompanied by a shift from the anti to the syn conformation of the alternating purines and has been shown to involve base extrusion at both B-Z junctions [14,15]. Although both unmethylated and methylated CpG runs in B-DNA can switch to Z-DNA forms, methylation greatly favors this transition under physiological salt concentrations [16,17,18] or minute negative supercoiling [19].
Molecular dynamics (MD) simulations have been used extensively to study DNA structure and dynamics [20,21,22]. With the emergence of improved force fields [23,24] and increasing computer power, simulations of B-DNA in the microsecond timescales have been conducted [22,23,25,26]. However, despite the large number of simulations available, only 10-15 ns simulations of methylated B-DNA [27,28] or unmethylated Z-DNA have been reported [23]. Since a detailed understanding of the effects of CpG methylation on the dynamics of both B-and Z-DNA at the molecular level is still lacking, the factors underlying methylation-assisted B-to-Z DNA transition remain largely unknown.
In an attempt to bridge this knowledge gap, structural models of a d(GCNGC) 5 repeat sequence were constructed in ideal B-, methylated B-, Z-, and methylated Z-DNA forms. Two 50 nanoseconds MD simulations in explicit solvent under physiological conditions were performed for each, reaching a 0.4 microsecond aggregate simulation time. An analysis of the trajectories was then performed to examine the effect of methylation on the BI/BII stability in B-DNA and the sampling of non-canonical a/c backbone torsional states (g+/t (gauche+/trans), g2/t (gauche2/ trans), and g+/g2 (gauche+/gauche2)) during the B-and Z-DNA simulations. The results indicate that methylation lowered the free energy difference between B-and Z-DNA. This suggests a lower energetic barrier, which may help explain the more facile change to Z-DNA by the methylated, rather than unmethylated, B-DNA.

B-factors and thermal flexibility
In this study, the effects of cytosine methylation on DNA dynamics are examined, including flexibility, base pair step parameters, base pair geometry and backbone torsion angles ( Figure 1) in both the right handed canonical B form, and the lefthanded non-canonical Z form. The flexibility of the structures is analyzed from the calculated B-factors by estimating the thermal mobility for each system (Figure 2) over the whole trajectory using the average structure of each trajectory as the reference. Note that terminal bases (residues 1, 10, 11 and 20) are excluded from the analysis. In addition, both methylated and unmethylated B-DNA display higher B-factors than Z-DNA (compare black, B-DNA, with red, Z-DNA, lines). Methylation is not seen to affect the fluctuations of Z-DNA, whereas it appears to slightly lower the Bfactors in B-DNA (compare solid and dashed lines for B-DNA (black) and Z-DNA (red)).
The average structures of the heavy atoms of non-terminal bases obtained from the trajectories are used to calculate root mean square deviations (RMSD) ( Figure S1). The results show that the two independent runs for B-and 5mCB-DNA yield very similar values (compare the black with red lines). A detailed analysis of the B-factors and RMSD values indicates that Z-DNA simulations ( Figure S1C and D) have slightly lower values (,1.0-1.3 Å (0.1 nm) RMSD) than B-DNA simulations (,1.5-1.8 Å (0.15 nm)) ( Figure S1A and B), implying that Z-DNA is overall less mobile than B-DNA.

Effects of CpG methylation on B-DNA base pair and base pair step geometry
Base pair geometry. The sequence averaged conformational parameters from the eight independent simulations for base pairs (Table S1), base pair steps (Table S2), helix (Table S3) and backbone torsion angles (Table S4) show that, overall, the B-DNA simulations yield similar values to those recently reported by the Ascona B-DNA consortium [22] and those seen in crystal structures (Table S5). There is a marginal difference (1 degree) between the experimental averages and the simulations in the base pair parameter buckle (Tables S1 and S5). The distributions of individual geometric parameters from representative trajectories for B and 5mCB-DNA are displayed in Figures S2, S3, S4. Inspection of the data ( Figure S2) indicates that base pair parameters for B and 5mCB-DNA form rather similar distributions and fall within one standard deviation of the crystallographically observed means (Table S5). Methylation does not appear to cause any large effect on the Watson-Crick base pairing in B-DNA as seen from base pair parameters shear, stretch, and stagger (Table S1). Nevertheless, the differences observed for buckle, propeller, opening, shear, stretch and stagger between the unmethylated and methylated B-DNA simulations (Table S1 and Figure S2), although marginal, are statistically significant, with -log(P) values of 16 (two sample paired t-test). This significance is likely revealed by the large number of data points used (1.6 million). Likewise, no differences are observed in base pair parameters when CNG and GNC base pairs were separated (data not shown).
Base pair step geometry. Base pair step parameters are within one standard deviation of the experimentally observed values (vertical dashed lines in Figure S3). The differences from the crystallographic data are in slide, roll and twist. Slide was positive (0.19 Å ) in the crystallographic data but is negative (with values of 20.3 and 20.44 Å ) in these simulations. The experimental mean of roll was 0.61u, whereas it is around 4u in the simulations.  Finally, twist was 36u in the crystallographic data, but 33u in the simulations, similar to the value reported by the Ascona B-DNA Consortium [22]. Except for shift (P value = 0.11), the geometrically confined differences observed between the methylated and unmethylated simulations (Table S2) are also statistically significant, with -log(P) values of 16. The reason for the differences noted between methylated and unmethylated base pair step parameters becomes clear when the distributions for the GpC and CpG steps are separated ( Figure S4). In this case, methylation is seen to narrow the base pair step distributions shift, tilt, roll, and to shift the distributions of twist and slide for the GpC steps relative to unmethylated B-DNA. On the other hand, the CpG steps are minimally affected, with the exception of slide.
Z-DNA base pair and base pair step geometry. As with B-DNA, the sequence averaged conformational parameters for the Z-and 5mCZ-DNA forms are similar on a gross scale (Tables S1, S2, S3, S4 and Figures S5, S6, S7, S8). However, statistically significant differences (2log(P) = 16) are noticed on a finer scale upon methylation for every geometric parameter, except for shear. Thus, the mean angle of base pair propeller decreased from 0.19 and 20.14 in Z-DNA to 20.50 and 20.28 in 5mCZ-DNA in the four independent simulations (Table S1). Similarly, base pair opening increased from 0.08 to 0.57 degrees upon methylation, whereas the standard deviation and the range of opening decreased (Table S1). Finally, in both independent runs, methylation increased the means of base pair step parameters tilt and roll, and decreased the standard deviations of tilt and roll with respect to the unmethylated simulations (Table S2). Interestingly, inspection of the distributions of the geometric parameters ( Figures S5, S6, S7, S8) indicates that the mean buckle of the GNC base pairs decreased from 23 to 28 degrees, whereas that of the CNG base pairs increased from 3 to 8 degrees upon methylation ( Figure S6). A similar shift is also observed in the distribution of the base pair step parameter rise ( Figure S8), where the GpC step shifts to the left while the CpG step shifts to the right. Finally, the zigzagging nature of Z-DNA (CpG vs. GpC steps) results in bimodal distributions for the base pair step parameters slide and twist. In summary, methylation is seen to constrain fluctuations for a number of geometric parameters in Z-DNA.
B-DNA backbone dynamics. The effects of methylation on DNA backbone torsional angles and puckering are also analyzed. Figure S9 shows the representative density distributions of DNA backbone torsional angles from the B-and 5mCB-DNA simulations. Methylation modifies the distributions of the torsional angles d, e, and f to a certain extent. The a/c and e/f dynamics ( Figure 1) will be discussed in more detail below. With regards to the puckering of the sugar backbone, Figure S10 shows the phase and amplitude distributions of the sugar pucker, as well as the distributions of torsional angles d and x. Overall, the x angle and amplitude distributions display very similar profiles for the methylated and unmethylated forms. Minor differences are noted for the distributions of d ( Figure 1) and the phase of the sugar, which are attributed for the most part to a shift in the populations of C29 Endo (from 35% in unmethylated B-DNA to 30% in methylated B-DNA) and O19 Endo (from 16% in unmethylated B-DNA to 21% in methylated B-DNA) ( Table S6). All four puckering parameters are in close agreement with the previously reported MD simulation averages [22]. In summary, methylation resulted in minute changes in sugar pucker dynamics for the B-DNA backbone.
Z-DNA backbone dynamics. In Z-DNA, changes are observed upon methylation in the distributions of the backbone angles a, c, e and d (Figure 1), whereas no changes are detected for the b and f distributions ( Figure S11). The zigzagging nature of Z-DNA results in two separate distributions for cytosine and guanine bases. Thus, the x angle exists in syn conformation in guanine bases, whereas it is in the anti conformation in cytosines ( Figure  S12). Methylation also affects the distributions of the torsional angle d for the guanine backbones.
BI-BII states. Crystallographic analyses have shown that canonical B-DNA comprises two conformational sub-states, BI and BII [7]. In the BI state, which is more common, the backbone phosphate adopts a rather symmetric position between the major and minor grooves, whereas in the BII state the phosphate groups are closer to the minor groove as a result of coupled changes in the two dihedral angles e and f ( Figure 1). Indeed, in the BI state e and f are in the t/g2 conformation, whereas in the BII state they switch to g2/t. In the present study, e2f,0 and e2f.0 are used as the cut-offs between the BI and BII states, respectively. Figure 3 shows the time evolution of the fraction of nucleotides in the BI state (Panel A) for B-(black) and 5mCB-DNA (red). The simulations for B-DNA are on average 84% in the BI state and 16% in the BII state, in agreement with previously reported results from a one microsecond B-DNA simulation [25] and with X-ray and NMR measurements [29,30]. The simulations for 5mCB-DNA, on the other hand, show a BI population of 92% and, correspondingly, a BII population of 8%. The results of the time evolution of e2f for B-and 5mCB-DNA (Figures S13 and S14) and the cumulative averages of the BI states from the simulations ( Figure S15, top panel) support the conclusion that methylation stabilizes the BI state [28]. Using a similar approach to Rauch et al. [28], the free energy profiles for the BI/BII transitions were calculated from the B-and 5mCB-DNA simulations. Since no major differences were seen between the two independent runs, for either the B-or 5mCB-DNA cases, the data for the reaction coordinate e2f were combined, resulting in 1.6 million observations. These data were used to plot the histograms for the e2f distributions using 18 degree bins (20 total bins). Assuming the simulations were long enough to cover the available angle space, the partition function Z was calculated and the free energy at each bin was obtained using G = 2RTln(Z) [28]. Figure 4A shows the overall free energy profiles for B-DNA (black) and 5mCB-DNA (red). The free energy differences between the BI and BII states (DG BI-BII ) are 1.6 kcal/mol in B-DNA and 2.08 kcal/mol in 5mCB-DNA. Interestingly, when the free energy profiles for the CpG versus GpC steps are separated, an increase in DG BI-BII from 1.02 kcal/mol (B-DNA) to 1.98 kcal/mol (5mCB-DNA) is revealed for the GpC steps, whereas no effect is observed for the CpG steps. (DG BI-BII = 0.16 between B-DNA and 5mCB- DNA) ( Figure 4B). Also, the barrier height of the BI-BII transition is virtually unchanged after methylation, so the BII-BI transition is 0.46 kcal/mol lower for 5mCB-DNA compared to unmethylated B-DNA. The differences observed in the free energies for the BI and BII states (Figure 4 a/c transitions in B-DNA. The effects of methylation on the BI/BII states suggest that methylation caused significant backbone torsional rearrangements. In solution, free B-DNA is mostly found in the canonical (g2/g+) a/c states; in protein-DNA complexes, on the other hand, DNA exhibits a higher percentage (,15%) of non-canonical states [31,32], which are believed to assist protein-DNA interactions [31,32]. Because the simulations used the parmbsc0 parameter set [23], which corrects the non-canonical conformers of B-DNA, unmethylated B-DNA is seen to sample canonical conformations over 99% of the time ( Figure 3B, black lines), as expected [25]. For the methylated 5mCB-DNA simulations ( Figure 3B, red lines), all nucleotides sample canonical states similarly to unmethylated DNA ( Figure S15, bottom panel). The distribution of the backbone torsions over the a/c space is shown in Figure 5. Both B-DNA and 5mCB-DNA almost exclusively prefer the canonical (g2/g+) conformations ( Figure 5A-B and Table S7) with the exception of g+/t (1% in 5mCB-DNA) and t/g+ space (1% in 5mCB-DNA). To address the question as to whether methylation affected a/c state sampling to a similar extent for the two types of base pair steps (CpG and GpC), separate plots were generated for the a/c states based on the CpG (a/c angles of G) and GpC (a/c angles of C) steps for Band 5mCB-DNA simulations ( Figure 6). In unmethylated B-DNA, the GpC steps sample the non-canonical conformations g+/g2, g+/t, and g+/t (less than 1% of simulation time, Figure 6A), whereas CpG steps spend all of the simulation time in canonical g2/g+ conformations ( Figure 6B). Upon methylation, Gp5mC steps sample an increased number of g+/t (greater than 1% of simulation time, Figure 6C) and t/g+ states ( Figure 6C). Thus, although methylated and unmethylated DNA spend most of the time in the canonical g2/g+ conformational state (Table S7), methylation causes the GpC step to sample an increased number of the non-canonical states g+/t and t/g+. In summary, GpC steps contribute to the difference in sampling of non-canonical conformers induced by methylation in 5mCB-DNA.
a/c transitions In Z-DNA. Although on a gross scale the a/ c torsions show similar distributions in the Z-and 5mCZ-DNA simulations ( Figure 5C and D), a number of differences are observed upon methylation. These include a shift from g2/g+ to t/g+ and an increase in the number of conformations in the g+/t state ( Figure 5C and D). The increase in the g+/t populations could unambiguously be attributed to the GpC step, whereas the shift from g2/g+ to t/g+ is mainly caused by the CpG step ( Figure  S16). The a torsions consistently lost the trans conformations in the cytosine backbone (GpC steps) after methylation, thereby switching to g+ ( Figure S16A and C). The c torsions, on the other hand, lost g+ states and increased the trans population, thus compensating for the changes in the a torsion angle. In the CpG steps, the a torsions shift to trans, whereas the c torsions remain unchanged ( Figure S16B and D). In summary, methylation caused shifts in backbone torsional preferences in Z-DNA.

Discussion
In this study, MD simulations are conducted to assess the role of methylation in the intrinsic dynamics of a 10 base paired d(GCNGC) 5 repeat, both in the canonical B-DNA and noncanonical left-handed Z-DNA forms. The work is motivated by the critical role that 5mCpG methylation plays in human development and cancer [2,3,4] and by the fact that methylated d(GCNGC) n sequences facilitate B-to Z-DNA structural transitions [17,33] by mechanisms that are not fully understood, leading to genetic instability [10,12]. Our results confirm that unmethylated B-DNA almost exclusively samples canonical a/c states and methylation only marginally increases sampling of non-canonical states, particularly for the Gp5mCp steps. By contrast, unmethylated and methylated Z-DNA sample non-canonical conformations extensively (,30-50% of the time) ( Figure S15 and Table S7). It has been shown that Z-DNA exists in substates, ZI and ZII, using FT-IR spectroscopy [34]. ZI and ZII states are defined mainly by a and f torsional angles. In our simulations, we did not see any effect of methylation on the structure or free energy difference of these states (data not shown).
MM/PBSA [35,36,37] methods have been extensively used to study conformational stability of nucleic acids [36,37], and ligand-DNA interactions [35]. Here, simple MM/PBSA analyses were performed to infer the free energies and stability of the systems studied (Table 1). To have a second estimate for configurational entropy we used ACCENT-MM [38], but 50 nanoseconds of simulation time was not sufficient for convergence for the B-and Z-DNA models (data not shown). Both methylated and unmethylated B-DNA are more stable than their Z-DNA counterparts. The calculated free energy difference between B-and Z-DNA is 21.6 kcal/mol and between methylated B-and Z-DNA is 14.4 kcal/mol. A recent targeted molecular dynamics study of a B-Z junction has reported a barrier of 13 kcal/mol and a free energy difference of 4.7 kcal/mol for a 10 base pair DNA sequence, proposing a sequential zipping mechanism for Z-DNA formation [39]. Although our numbers are higher than those of Lee et al. [39], they are close to the experimental free energy range for B-Z (12-17 kcal/mol) and methylated B-Z (9 kcal/mol) transitions [19,40]. The differences in the calculated free energies are found to be statistically significant using a two sample t-test with -log(P).23, implying that the B-Z transition barrier is lower for the methylated than the unmethylated system, in agreement with experimental observations [19,40].
MD simulations have been extensively employed to study nucleic acids dynamics [41]. Although these analyses can now reach millisecond time scales [25], artifacts have been found in the Amber parm99 force field in the form of extensive a/c transitions [42,43,44]. The Parmbsc0 force field [23] was introduced to correct for these artifacts. Thus, despite perceived limitations to large-scale   conformational predictions [22], our results support the use of MD simulation studies as a means to predict DNA dynamics. The overlapping conformational states by concerted phosphate backbone torsional angle switches observed in both methylated Band unmethylated/methylated Z-DNA ( Figure 6) agree with recent NMR [45] and single molecule fluorescence data [18].
In summary, MD simulations were used to further understand the role of cytosine methylation on both the canonical B-DNA and non-canonical left-handed Z-DNA structures. The results show that methylation lowers the free energy difference between B and Z-DNA resulting in the increased population of Z-DNA. We suggest that methylation-induced differences in the CpG and GpC steps' backbone dynamics may facilitate the initial step in the mechanism of B to Z transitions.

Model Building
Four structural models of 10 base paired d(GCNGC) 5 repeats were built using the canonical B-and Z-DNA settings in w3DNA [46]. The cytosine bases of two model structures (one Z-and one B-DNA) were then manually methylated at the 5-carbon position using UCSF Chimera [47] at all positions except the terminal bases. Therefore, the procedure resulted in an overall 80% methylation of the cytosines (8 out of 10) simulating a hypermethylated state [48]. The model structures are named B, 5mCB, Z, and 5mCZ for B-DNA, methylated B-DNA, and Z-DNA and methylated Z-DNA, respectively.

MD Simulations
MD simulations were performed using the AMBER 10 simulation package [49]. The AMBER parm99 [24] with parmbsc0 corrections [23] and TIP3P water molecules [50] were used to represent molecular interactions. Parameters for 5-methyl-cytosine (5mC) were taken from Rauch et al., 2005 [27], who also employed parm99 [24]. Each system was neutralized with 18 Na + ions and solvated with approximately 5000 water molecules for the B-DNA models and 5450 water molecules for the Z-DNA models in truncated octahedral boxes. Additional Na + and Cl 2 ions were added by randomly replacing water molecules, to bring the system to 150 mM salt concentration (34 and 32 ions to the Z-DNA and B-DNA models, respectively). The radial distribution functions of counterions around DNA for all simulations (data not shown) agree well with the previously published distributions [51,52,53]. A 10 Å cut-off was used for non-bonded interactions, along with the Particle Mesh Ewald [54] method. SHAKE [55] was used for hydrogen atoms. A 2 femtoseconds time step was used for the simulations. The systems were energy minimized and then heated to 300 K in 20 picoseconds (ps) at constant volume with 100 kcal/ mol/Å 2 harmonic restraints on all solute atoms. The harmonic restraints were then reduced to 50, 10, 5, and 1 kcal/mol/Å 2 in 20 ps intervals, followed by 380 ps of unrestrained equilibration at constant pressure and temperature. Two independent 50 ns-long production runs were performed for each system starting with different initial velocities. The aggregate simulation time was 400 nanoseconds. Trajectories were analyzed using the MM/ PBSA and PTRAJ modules of AMBER 10, as well as Curves+ [56], and custom R [57] scripts. To avoid end effects, the terminal base pairs were removed from the analyses of geometric parameters and torsional angles. The density distributions of the various parameters were calculated using the kernel density function of R, which estimates the probability density function of the variables.

MM/PBSA analysis
Conformational free energies were calculated by the MM/ PBSA method using the perl scripts available within the AMBER 10 [49] simulation package. Snapshots for the MM/PBSA analysis were extracted from all eight simulations in 100 ps intervals, yielding 500 snapshots per independent trajectory. Absolute free energies were calculated using the equation G = E MM +E PB + E SA 2TS, where E MM is the molecular mechanics energy, E PB is Poisson Boltzmann energy, E SA is the nonpolar solvation free energy and TS is the entropic contribution. E SA was assumed to be proportional to the solvent accessible surface area (SA), i.e. E SA = cSA+b, with the coefficients set to default c = 0.00542 kcal/ Å 2 mol and b = 0.92 kcal/mol. The AMBER nmode program was used to estimate the vibrational entropies after one thousand step energy minimization [58]. The results were averaged over the 500 snapshots for each system.  Table S5). Top row xaxes (buckle, propel, opening) are in degrees and bottom row xaxes (shear, stretch, stagger) are in Angstroms. Y-axes show the densities. (PNG) Figure S3 Representative distributions of base pair step parameters in B-DNA (black) and 5mCB-DNA (red) simulations. Vertical dashed lines indicate the mean 6 one standard deviation of the crystallographically determined values (see Table S5). Top