Selective Chemical Labeling of Natural T Modifications in DNA

We present a chemical method to selectively tag and enrich thymine modifications, 5-formyluracil (5-fU) and 5-hydroxymethyluracil (5-hmU), found naturally in DNA. Inherent reactivity differences have enabled us to tag 5-fU chemoselectively over its C modification counterpart, 5-formylcytosine (5-fC). We rationalized the enhanced reactivity of 5-fU compared to 5-fC via ab initio quantum mechanical calculations. We exploited this chemical tagging reaction to provide proof of concept for the enrichment of 5-fU containing DNA from a pool that contains 5-fC or no modification. We further demonstrate that 5-hmU can be chemically oxidized to 5-fU, providing a strategy for the enrichment of 5-hmU. These methods will enable the mapping of 5-fU and 5-hmU in genomic DNA, to provide insights into their functional role and dynamics in biology.


Section 3: ODN sequences, reactions and characterisation
Source of fU-ODN, fC-ODN and hmU-ODN. fU-ODN (see Table S1) was synthesized using a protected 5-formyldeoxyuridine phosphoramidite. The identity of the product was confirmed by LC-MS analysis. fC-ODN (see Table S1) was obtained from Eurogentec and subjected to further HPLC purification on an Agilent Technologies 1200 series HPLC to remove impurities. A Pursuit C18 column (5 μm, 150 × 10.0 mm, Agilent) was used (solvent A = 50 mM NH4OAc, solvent B = MeCN, flow rate 4 mL/min, 3% B for 5 min, then a gradient of 3-10% B over 25 min).
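The HPLC program above can be written out as a simple piecewise function. The following is a minimal sketch, assuming that the 3-10% B ramp is linear, that it starts immediately after the 5 min hold, and that the composition stays at 10% B afterwards; the function name `percent_b` is illustrative only and is not part of the reported method.

```python
def percent_b(t_min: float) -> float:
    """Return the percentage of solvent B (MeCN) at time t_min (minutes).

    Assumed program: 3% B held for 5 min, then a linear ramp from
    3% to 10% B over the next 25 min, then held at 10% B.
    """
    hold_end = 5.0             # end of the isocratic hold (min)
    ramp_length = 25.0         # duration of the gradient (min)
    b_start, b_end = 3.0, 10.0

    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= hold_end:
        return b_start
    if t_min >= hold_end + ramp_length:
        return b_end           # assumption: composition is held after the ramp
    fraction = (t_min - hold_end) / ramp_length
    return b_start + (b_end - b_start) * fraction


if __name__ == "__main__":
    for t in (0, 5, 17.5, 30):
        print(f"t = {t:>4} min: {percent_b(t):.1f}% B")
```

Under these assumptions the ramp slope is 7% B per 25 min, i.e. 0.28% B per minute.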

Synthesis of fU-DNA, fC-DNA and GCAT-DNA by polymerase chain reaction (PCR).
fU-DNA (see Table S1) was synthesized using template 1, forward primer 1 and reverse primer 1 (see Table S8) in the presence of dATP, dCTP, dGTP and dfUTP.
GCAT-DNA was synthesized using template 3, forward primer 3 and reverse primer 3.
Gaussian smoothing (7 points) was applied.
Table S2.
Scheme S5: Selective biotinylation of hmU-ODN with 2. ESI signals correspond to fU-ODN functionalized with NH2OMe and hmU-ODN that has been oxidized and subsequently functionalized with 2 to form a hydrazone.
b) fU-DNA (500 ng) and fC-DNA (500 ng) were incubated with sodium phosphate buffer pH = 6 (40 mM) and 1 (0.4 mM) in a final reaction volume of 50 μL, at RT for 4 h;
c) fU-DNA (500 ng) and fC-DNA (500 ng) were incubated with sodium phosphate buffer pH = 7 (40 mM) and 2 (10 mM) in a final reaction volume of 50 μL, at RT for 4 h;
d) fU-DNA (500 ng) and fC-DNA (500 ng) were incubated with sodium phosphate buffer pH = 7 (40 mM) and 3 (5 mM) in a final reaction volume of 50 μL, at RT for 4 h;
e) fU-DNA (500 ng) and GCAT-DNA (500 ng) were incubated with sodium phosphate buffer pH = 7 (40 mM) and 2 (10 mM) in a final reaction volume of 50 μL, at RT for 4 h.
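For orientation, the reagent amounts implied by conditions b) to e) can be worked out from concentration × volume. This is a minimal sketch, assuming the quoted concentrations are final concentrations in the 50 μL reaction; the helper `amount_nmol` and the dictionary keys are illustrative names, not taken from the original procedure.

```python
def amount_nmol(conc_mM: float, volume_uL: float) -> float:
    """Amount of substance in nmol: mM x uL = nmol (1e-3 mol/L x 1e-6 L)."""
    return conc_mM * volume_uL


REACTION_VOLUME_UL = 50.0

# Assumed to be final concentrations (mM) in the 50 uL reaction.
reagents_mM = {
    "sodium phosphate": 40.0,
    "reagent 1 (condition b)": 0.4,
    "reagent 2 (conditions c, e)": 10.0,
    "reagent 3 (condition d)": 5.0,
}

amounts = {name: amount_nmol(c, REACTION_VOLUME_UL) for name, c in reagents_mM.items()}

# Total DNA per reaction: two inputs of 500 ng each.
dna_ng_per_uL = (500.0 + 500.0) / REACTION_VOLUME_UL

if __name__ == "__main__":
    for name, n in amounts.items():
        print(f"{name}: {n:.0f} nmol")
    print(f"total DNA: {dna_ng_per_uL:.0f} ng/uL")
```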
DNA enrichment procedure. A reported DNA enrichment protocol was used with some modifications [3]. MagneSphere streptavidin magnetic beads (50 μg, Promega) were washed with 1 × binding buffer (5 mM Tris pH 7.5, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween 20) (3 × 500 μL) and then resuspended in 50 μL 2 × binding buffer (10 mM Tris pH 7.5, 1 mM EDTA, 2 M NaCl, 0.1% Tween 20). Input DNA (10 μL, 1000 pg/ODN) and salmon sperm DNA (10 μg, Invitrogen) were mixed, made up to a final volume of 50 μL, and then added to the magnetic beads before incubation for 15 min at RT. The beads were washed with 1 × binding buffer (6 × 500 μL), resuspended in 100 μL elution buffer (95% formamide, 10 mM EDTA) and heated to 95 °C for 5 min. The eluent was then removed from the beads and placed on ice. The elution step was repeated using 50 μL elution buffer to remove residual DNA from the magnetic beads. The eluent was diluted with water (350 μL) and purified by filtration using Amicon Ultra-0.5 mL Centrifugal Filters 10K (Millipore), following a wash with water (450 μL) and centrifugation for 15 min. The Amicon filters were washed with water (2 × 450 μL), with centrifugation for a further 15 min each time. DNA was then recovered from the Amicon filter (25 μL).
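The bead-binding step mixes 50 μL of 2 × binding buffer with 50 μL of DNA solution, which brings the buffer to 1 × during incubation. The short sketch below checks that arithmetic, assuming the DNA solution contributes none of the buffer components; the function name `dilute` is illustrative.

```python
def dilute(stock: dict, stock_uL: float, added_uL: float) -> dict:
    """Scale every component of `stock` by the dilution factor on mixing."""
    factor = stock_uL / (stock_uL + added_uL)
    return {component: value * factor for component, value in stock.items()}


# 2 x binding buffer as stated in the procedure.
two_x_binding_buffer = {
    "Tris (mM)": 10.0,
    "EDTA (mM)": 1.0,
    "NaCl (M)": 2.0,
    "Tween 20 (%)": 0.1,
}

# 50 uL beads in 2 x buffer + 50 uL DNA mix (assumed buffer-free).
during_binding = dilute(two_x_binding_buffer, stock_uL=50.0, added_uL=50.0)

if __name__ == "__main__":
    for component, value in during_binding.items():
        print(f"{component}: {value:g}")
```

The result matches the 1 × binding buffer composition used for the washes (5 mM Tris, 0.5 mM EDTA, 1 M NaCl, 0.05% Tween 20).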
Enrichments were carried out in either duplicate or triplicate from each reaction.
qPCR analysis for chemical enrichment studies. qPCRs were performed using a CFX96 Real-Time System (BioRad), and data were processed using the CFX Software manager (BioRad).
Enriched DNA (1 μL) was added to a mixture of Brilliant III Ultra-Fast SYBR Green qPCR Master Mix (5 μL) (Agilent Technologies), forward primer 1, 2 or 3 (1 μM) (see Table S8) and reverse primer 1, 2 or 3 (1 μM) (see Table S8), and diluted with water to give a final volume of 10 μL. The mixture was subjected to qPCR according to the protocol outlined by the manufacturer.
All templates and primers were sourced from either Invitrogen or Sigma Aldrich.

Section 5: Ab initio study on 5-fU and 5-fC reactivity
To gain theoretical insight into what might facilitate the increased reactivity of 5-fU compared to 5-fC in the Schiff base, oxime or hydrazone formation reactions described in this work, we performed ab initio quantum mechanical calculations on reduced model systems.
The models and the level of theory used in the computational study. The models used in the computational study are shown in Figure S5a, where the N-glycosidic bonds in the 5-fU and 5-fC nucleotides are truncated and capped with methyl groups. Since the reactive ends of all the tagging reagents studied bear -NH2 groups, methylamine was taken as the model reactant.
Under our experimental conditions and with the reactants used, the rate-determining step is most likely the nucleophilic addition to the carbonyl. We therefore considered this stage of the reaction (Figure S5b) to reveal the possible stationary points along the pathway.

Natural bond orbital analysis of 5-fUm and 5-fCm. The formation of a transient C-N bond is among the key stages in our reactions (see Figure S5b). In this process, the aldehyde carbon in 5-fU or 5-fC should become more tetrahedral, which is expected to occur more easily if the conjugation of the nucleobase ring extends less to the aldehyde group. To find out whether such a connection may be the core electronic cause of the increased reactivity of 5-fU compared to 5-fC, we performed a natural bond orbital (NBO) analysis [13,14] of the 5-fUm and 5-fCm models, focusing on the properties of the C(ring)-C(aldehyde) bonding orbital. The calculated orbital energies are listed below (Table S9). The data show that, regardless of the conformational state of the nucleobases, 5-fU features a weaker C(ring)-C(aldehyde) bond (the bonding orbital energies are higher), owing to which the aldehyde carbon can gain its pyramidality and form the transient C-N bond much more easily than in 5-fC. If we consider only the anti conformations, the C(ring)-C(aldehyde) bonding orbital in 5-fCm is more stable by 7.71 kcal/mol than that in 5-fUm (1 a.u. = 627.509 kcal/mol). For the ground-state conformations (syn for 5-fCm and anti for 5-fUm), the stability difference is 18.37 kcal/mol (Table S9). Such a difference can reflect the core electronic reason for the increased reactivity of 5-fU over 5-fC. The same trend is observed when we consider the effects of the ring on the aldehyde C=O bonding orbital (Table S10). However, the influence is less pronounced owing to the greater distance of the C=O moiety from the ring and the much lower basal energy of the C=O bonding orbital. For the ground-state conformations, the C=O bonding orbital in 5-fCm is more stable by 1.71 kcal/mol than that in 5-fUm (Table S10).
Intermediates along the pathway of the hemiaminal formation.
We attempted to characterise the stationary points in the energy landscape of the hemiaminal formation reaction of 5-fUm and 5-fCm with methylamine (Figure S5a), where the first stage (Figure S5b) is expected to be the rate-limiting one that defines the reactivity of 5-fUm and 5-fCm with our tagging reagents. The pathways were initially explored via four techniques: a) we constructed multiple initial geometries (with methylamine and both anti and syn conformers of the modified bases) close to the expected intermediate state (the product of the scheme in Figure S5b) and performed a geometry optimisation to a transition state; b) we constructed two geometries in two directions from the expected transition states and performed a synchronous transit-guided quasi-Newton search [9] between the two structures; c) and d) we repeated techniques a and b without the constraints on the number of imaginary force constants, hence enabling convergence to any stationary point.
All the above attempts, using only the modified base and methylamine as reactants, did not locate a transition state or a local minimum differing from just a set of initial molecules with no [...] Figure S8. We applied the above-described techniques a-d, which unfortunately only broke the system into the individual molecules with little interaction between them. In reference [15], the Schiff base formation reaction was studied on different model molecules via DFT, and a whole water cage was necessary to stabilise the transition states.
However, when considering similar interactions involving 5-fUm, we quickly located a stationary point upon the interaction of its minimum-energy anti conformer with methylamine and just one water molecule (Figure S9a). Subsequent vibrational analysis verified the stationary point to be an energy minimum, which was found to be more stable (ΔE = -13.79 kcal/mol) than the system of non-interacting 5-fUm, water and methylamine molecules.
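A stabilisation energy of this kind is typically obtained as the energy of the complex minus the sum of the energies of the isolated fragments, converted from atomic units with the factor quoted above (1 a.u. = 627.509 kcal/mol). The following is a minimal sketch of that bookkeeping. The function names are illustrative, the example energies are arbitrary placeholders rather than values from this study, and no counterpoise or zero-point correction is applied here.

```python
HARTREE_TO_KCAL_PER_MOL = 627.509  # conversion factor quoted in the text


def hartree_to_kcal(energy_hartree: float) -> float:
    """Convert an energy from atomic units (hartree) to kcal/mol."""
    return energy_hartree * HARTREE_TO_KCAL_PER_MOL


def kcal_to_hartree(energy_kcal: float) -> float:
    """Convert an energy from kcal/mol to atomic units (hartree)."""
    return energy_kcal / HARTREE_TO_KCAL_PER_MOL


def interaction_energy_kcal(e_complex_hartree: float,
                            fragment_energies_hartree: list) -> float:
    """Delta E = E(complex) - sum of E(isolated fragments), in kcal/mol.

    A negative value means the complex is more stable than the
    non-interacting fragments.
    """
    delta_hartree = e_complex_hartree - sum(fragment_energies_hartree)
    return hartree_to_kcal(delta_hartree)


if __name__ == "__main__":
    # Placeholder energies (hartree), NOT taken from the calculations
    # reported here; they only illustrate the arithmetic.
    e_complex = -10.50
    e_fragments = [-4.00, -3.00, -3.48]  # e.g. base model, water, methylamine
    print(f"Delta E = {interaction_energy_kcal(e_complex, e_fragments):.2f} kcal/mol")

    # The reported stabilisation expressed in atomic units.
    print(f"-13.79 kcal/mol = {kcal_to_hartree(-13.79):.6f} a.u.")
```

The same conversion applies to the orbital energy differences discussed in the NBO analysis, which are computed in atomic units and reported in kcal/mol.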

The stationary structures reported in this work. All the calculations were done at the RMP2/cc-pVDZ level of theory as described above.

Figure S27: LC-MS profile for the reaction of fU-ODN and hmU-ODN with NH2OMe (pH = 6, 3 h), followed by oxidation with KRuO4 and reaction with 2 (pH = 7, 4 h). No peak corresponds to fU-ODN + 2, indicating that 5-hmU can be tagged selectively in the presence of 5-fU.