Nucleophilic substitution at di- and triphosphates: leaving group ability of phosphate versus diphosphate

Adenosine triphosphate (ATP) is the universal energy carrier in biochemical processes. Herein, we aim for a better understanding of the origin of the high-energy content of the triphosphate moiety involved, the influence of various physicochemical factors thereon, and implication for the actual SN2@P-induced hydrolysis, which drives uphill biochemical processes, such as, DNA replication. To this end, we have investigated the SN2@P-induced hydrolysis of triphosphate (PPP) versus that of diphosphate (PP) using density functional theory (DFT) at COSMO(H2O)-ZORA-OLYP/TZ2P. We find that SN2@P-induced hydrolysis of PPP is favored over that of PP, both kinetically and thermodynamically. The energetic advantage of PPP over PP is slightly diminished by the coordination of Mg2+ counterions. Our activation strain and energy decomposition analyses reveal that the activation barrier for PPP hydrolysis is lower compared to that for PP due to a weaker Pα–O leaving group bond.

nucleophiles, where both the number and position of Mg 2+ counterions and the degree of protonation were systematically varied [8]. The reactions were found to proceed via pentacoordinate transition states that generally are more stable if the nucleophile (ROH) is deprotonated (RO − ) [5], in line with the suggested proton hopping to the enzyme surroundings of the primer 3′ hydroxyl proton in the DNA polymerase active site [34][35][36][37][38][39][40][41][42][43][44][45][46]. Deprotonation of, and thus buildup of negative charge on, the triphosphate substrate resulted in an increase in barrier height, whereas inclusion of Mg 2+ counterions (counteraction of negative charge) on the deprotonated triphosphate stabilizes the transition state, resulting in a lower S N 2@P barrier.
Despite the importance of DNA replication, the exact details of the mechanism are still not fully understood. Inspired in part by Westheimer's work on why nature chose phosphates [30], we now address why nucleotide triphosphates, rather than diphosphates, are the energy carriers for the ubiquitous DNA replication. We accomplish this by extending upon our work on solvated triphosphates in S N 2@P [8] with equivalent reactions involving diphosphates. Our analysis addresses the leaving group ability of phosphate and diphosphate, assesses the kinetics and thermodynamics of S N 2@PP and S N 2@PPP, and investigates the effect protonation and Mg 2+ counterions on these model hydrolysis reactions.

Computational details
Density functional and basis set All calculations are based on density functional theory (DFT) [47][48][49][50][51][52][53] and have been carried out using the Amsterdam Density Functional program (ADF 2017.111) [54][55][56]. Geometries, single points and vibrational analyses were computed with a generalized gradient approximation (GGA) of DFT. All computations use the OLYP functional, which consists of the optimized exchange (OPTX) functional proposed by Handy and coworkers [57,58], and the Lee-Yang-Parr (LYP) correlation functional [59]. Previously we have shown that OLYP reproduces S N 2 barriers from highly correlated ab initio benchmarks within a few kcal mol −1 [60,61]. Stationary points were confirmed as either minima (no imaginary frequencies) or transition states (one imaginary frequency) through a vibrational analysis. The transition states were connected to their associated minima by following the intrinsic reaction coordinate (IRC) [62]. The molecular orbitals (MOs) were expanded in a large TZ2P-type basis set consisting of uncontracted Slater-type orbitals (STOs). The TZ2P basis set is of triple-ζ quality and has been augmented with two polarization functions: 2p and 3d on hydrogen; 3d and 4f on carbon, oxygen, phosphorus and magnesium [63]. The core shells of carbon, oxygen, phosphorus and magnesium (1s) were treated with the frozen-core approximation. Scalar relativistic effects were taken into account by means of the zero order regular approximation (ZORA) [64][65][66][67]. The accuracy parameter of both the Becke grid integration and ZLMfit density fitting of the IRCs were set to GOOD, whereas geometry optimizations and frequencies analyses were calculated at VERYGOOD [68]. Spurious imaginary frequencies were set to 20 cm −1 , amounting to a correction of −2.0 and 0.6 kcal mol −1 to −T * S vib and U vib (298.15 K, 1 atm), respectively. Structures were illustrated using CYLview [69].

Solvation model
Aqueous solvation was taken into account in all calculations using the COSMO implicit solvation model [70][71][72]. Solvent effects were explicitly used in the solving of the SCF equations and during the optimization of Scheme 1. S N 2@P α substitution leads to elongation of the primer strand in DNA-replication.
Electron. Struct. 1 (2019) 024001 the geometry and the vibrational analysis. The solvent radius (R s ) for water was taken from experimental data for the macroscopic density (ρ) and molecular mass (M m ) with the formula R 3 s = 2.6752·M m /r, leading to a R s value of 1.9 Å for water; a value of 78.4 was used for the dielectric constant of water. Atomic radii values were taken from the MM3 van der Waals radii [73], which are available for almost the whole periodic system, and scaled by 0.8333 (the MM3 radii are 20% larger than the normal van der Waals radii due to the specific form for the van der Waals energy within the MM3 force field). The surface charges at the Delley surface [74] were corrected for outlying charges. A finer grid was employed for the construction of the surface to minimize numerical noise, increasing the maximum number of points per atom for hydrogen from 194 to 302. This setup provides a 'non-empirical' approach to including solvent effects with a dielectric continuum, and works well for solvation processes [75]. Trends in triphosphate hydrolysis reactions are unchanged when either modeled in implicit solvation or with the inclusion of discrete water molecules to the coordination sphere of the Mg 2+ counterions [9].

Solution-phase activation strain analyses
The activation strain model (ASM) [76][77][78][79][80][81], also known as the distortion/interaction model [82,83], was used to obtain quantitative insight into the activation barriers of the studied S N 2 reactions in aqueous solution. This analysis involves decomposing the solution-phase potential energy surface ΔE solution (ζ) along the reaction coordinate ζ into the strain ΔE solution-strain (ζ), associated with deforming the reactants from their equilibrium geometry, and interaction ΔE solution-int (ζ) terms, accounting for interaction between the deformed reactants (equation (1)).
The destabilizing ΔE solution-strain (ζ) is determined by the rigidity of the reactants in solution and by the extent to which they must distort in order to achieve their geometry at a given point along the reaction coordinate.
The ΔE solution-int (ζ) is related to the electronic structure of the reactants in solution and how they are mutually oriented over the course of the reaction. These strain and interaction energy terms were calculated in the potential of the (implicit) solvent. The solution-phase potential energy surface, ΔE solution (ζ), was further decomposed into the ΔE solvation (ζ), which accounts for the interaction between solute and solvent, and the ΔE solute (ζ), which is the reaction system in vacuum with the solution-phase geometry (equation (2)) [9].
The ΔV elstat (ζ) term corresponds to the classical electrostatic interaction between unperturbed charge distributions ρ A (r) + ρ B (r) of the deformed fragments A and B and is usually attractive. The Pauli repulsion ΔE Pauli (ζ) involves destabilizing interactions between occupied orbitals and is responsible for steric repulsion. The orbital interaction ΔE oi accounts for charge transfer (interaction between occupied orbitals on one fragment with unoccupied orbitals of the other fragment, including HOMO-LUMO interactions) and polarization (empty-occupied orbital mixing on one fragment due to the presence of another fragment).
In the activation strain and energy decomposition plots, the IRC is projected onto the stretching of the P α -O α bond. This critical reaction coordinate ζ undergoes a well-defined change over the course of the reaction from the reactant complex to the transition state and product complex [89]. The activation strain analysis was performed with the aid of the PyFrag program along the intrinsic reaction coordinate [90].

Thermochemistry
Enthalpies at 298.15 K and 1 atm (ΔH°) were calculated from the electronic bond energies and vibrational frequencies by using the thermochemistry relation for an ideal gas (equation (5)) [91,92].
Here, ΔE trans,298 , ΔE rot,298 , ΔE vib,0 are the differences between the fully protonated substrate (1dR and 1tR) and the fragments that result from the heterolytic cleavage of the P α -O α bond (i.e. the P α cation and the anionic leaving group) in translational, rotational, and zero-point vibrational energy, respectively. Δ(ΔE vib,0 ) 298 is the change in the vibrational energy difference as one goes from 0 to 298.15 K. The vibrational energy corrections are based on our frequency calculations. The molar work term Δ(pV) is (Δn)RT; Δn = +1 for one substrate dissociating into two fragments. Thermal corrections for the electronic energy are neglected. The change in the Gibbs free energy (ΔG) was calculated at 298.15 K at 1 atm (ΔG°) (equation (6)).

Results and discussion
The S N 2@P reactions and substrates investigated in the present study are summarized in Unlike previously suggested mechanisms [34][35][36][37][38][39][40][41][42][43][44][45][46], our simple model system, based on a relatively small methoxide nucleophile, highlights a preferential interaction of the Mg 2+ ion(s) with the phosphate chain to produce tridentate phosphate-(Mg 2+ ) n complexes [8]. In the absence of the surrounding enzyme active site, the mutual stabilization of both the phosphate negative charge and magnesium positive charge is preferred, contrasting to the catalytic nucleophile-bridging function adopted by one of the Mg 2+ counterions in e.g. DNA replication [35]. Nevertheless, the placement of Mg 2+ counterions mimics the positions in the reported crystal structure of DNA polymerase [46].
We show how deprotonated and therefore negatively charged substrates give rise to higher central reaction barriers under attack by a deprotonated organic nucleophile, methoxide, and how Mg 2+ counterions can effectively reduce these barriers again. Whereas in the gas phase, the pentacoordinate intermediates can potentially be stable minima, solvation changes the nature of the central species to a labile transition state for all reactions. As previously shown, stable transition complexes (TC) in the gas phase can turn into labile transition states (TS) in solution, the transition from TC to TS depending on the solvent polarity and spatial extent of the substrate LUMO [9]. All reactions investigated in the current study proceeds via a labile central TS in the aqueous phase. Only reactions involving the fully protonated species (1d and 1t) proceed from a reactant complex (RC), to the transition state and then to a product complex (PC). While still lacking an RC, Mg 2+ (3d, 3t and 4t) gives rise to highly stable product complexes.
Nucleophilic attack by CH 3 O − at the neutral PP (1d) and PPP (1t) substrates must compete with proton transfer from the α-proton to the nucleophile, rendering these reactions unviable from a biochemical standpoint. The barrierless and highly exothermic proton transfer (−44.8 and −46.6 kcal mol −1 for 1d and 1t, respectively) converts the reactive methoxide nucleophile into methanol, forming a stable hydrogen bonded RC with the deprotonated substrate. No RC is observed for reactions lacking a protonated substrate (i.e. 2d, 2t, 3d, 3t, and 4t).
The pentavalent transition states (TS) are greatly influenced by the presence of Mg 2+ counterions, systematically lowering barriers for S N 2@PPP from 34.9 [2t − (Mg 2+ ) 0 ], to 23.3 [3t − (Mg 2+ ) 1 ], to 9.9 [4t − (Mg 2+ ) 2 ] kcal mol −1 . Earlier TSs on the reaction coordinate (longer O nuc −P α and shorter P α −O α distances) are observed for reactions with lower activation barriers (figure 1). The catalytic effect of Mg 2+ was previously traced back to (i) an enhanced interaction between nucleophile and triphosphate with more stabilizing electrostatic and orbital interactions and (ii) weakening of the leaving-group bond leading to a reduced activation strain [8]. Mg 2+ has the same catalytic effect on PP as it does on PPP, decreasing the S N 2@PP barriers from 42.1 [2d − (Mg 2+ ) 0 ] to 27.4 [3d − (Mg 2+ ) 1 ] kcal mol −1 . All transition states, products and reactant/product complexes associated with PP are destabilized with respect to its PPP counterpart, an effect which is most pronounced in the two (Mg 2+ ) 0 systems (2d and 2t), barriers and reaction energies respectively being 8.2 and 10.0 kcal mol −1 lower for PPP.
Weakly bound product complexes (PC) are observed for the neutral PP (1d) and PPP (1t) systems, stabilized by hydrogen bonding between dimethylphosphate and the protonated mono-/diphosphate leaving group. Just as is observed in the RCs, phosphate deprotonation (2d and 2t) causes the disappearance of the PC by removing the phosphates capability as a hydrogen bond donor. In contrast, the introduction of Mg 2+ (3d, 3t, and 4t) gives rise to highly stabilized PCs, chelation of the Mg 2+ counterion by both dimethylphosphate and the mono-/ diphosphate leaving group yielding PC energies as low as −63.5 kcal mol −1 (4t). Note that this stable PC is likely bypassed completely in the enzyme active site (e.g. DNA polymerase), as the interaction of Mg 2+ with key amino acid residues, an interaction unaccounted for in our simplified model system, facilitates the stabilization and retention of Mg 2+ .

Activation strain and EDA
In order to shed light on the factors leading to the different diphosphate and triphosphate S N 2 reactivities, we next performed an activation strain analysis for all reactions. The resulting activation strain diagrams (figures 3(2d, 2t, 3d, 3t, and 4t) and S1(1d and 1t) (stacks.iop.org/EST/1/024001/mmedia)) decompose the solutionphase PES (ΔE solution ) into the energy associated with distorting the reactants from their equilibrium geometry (ΔE solution-strain ) and subsequent interaction between the deformed reactants (ΔE solution-int ) in (implicit) solvation. The activation strain analysis is performed along a reaction coordinate defined by the stretching of the P α −O α bond [89]. The effect of Mg 2+ on the PES ( figure 3(a)) is similar for both PPP and PP. Coordination of Mg 2+ counterions to the substrate facilitates a more stabilizing interaction between nucleophile and the di-/triphosphate substrate by lowering the di-/triphosphate LUMO and depleting its negative charge ( figure 3(a)) [8].
Turning our attention to the reactivity differences between PP and PPP (2d and 2t) without Mg 2+ counterions, we find that 2d proceeds with a higher barrier than 2t, due to more destabilizing activation strain along the entire reaction coordinate ( figure 3(b)). The interaction curves are similar and, thus, reactivity differences can be attributed to differences in strain curves. Introduction of a single Mg 2+ leads to lower, but more similar barriers for reactions 3d and 3t. The kinetic preference for the triphosphate is thus diminished ( figure 3(c)). The counterion stabilizes the negative charge of the leaving group for both cases. 3t goes with a slightly more stabilizing interaction curve compared to 3d due to the more stabilized LUMO of [PP-3H] 3− (Mg 2+ ) 1 leading to more favorable charge transfer, but the differences in strain curves are again responsible for the differences in barrier height.
Our activation strain analysis reveals that the primary factor leading to the enhanced reactivity of triphosphates compared to diphosphates, both with or without the Mg 2+ counterion, is the less destabilizing strain along the entire reaction coordinate. The origin of the more destabilizing activation strain for S N 2@PP was traced to Table 1. Model S N 2@P reactions in aqueous solution with relative energies of stationary points (in kcal mol −1 ) and overall charge Q (in a.u.). Gibbs free energies (298.15 K, 1 atm) are bolded a .

Rxn
Designator the stronger bond between the electrophilic P α and the monophosphate leaving group. The computed heterolytic bond dissociation energies (BDE), which are energetically favored over homolytic bond cleavage in solution [93], between P α and the mono-/diphosphate leaving group for systems 1dR and 1tR are shown in tables 2 and S1. The BDE for the leaving group bond of PP is, indeed, larger (ΔH solution = 49.9 kcal mol −1 ) than for PPP (ΔH solution = 40.2 kcal mol −1 ). The EDA in the gas phase was used to understand and rationalize the differing bond strengths of 1d and 1t (table 3). The interaction between P α and the leaving group is more stabilizing for PP compared to PPP in reactants 1d and 1t. Analysis of the interaction energy at every point along the reaction coordinate also reveals a more stabilizing interaction for PP (figure S1). Taken together, both the interaction and bond dissociation energy point to the same conclusion: PP has a stronger P α −O α bond than PPP. Additionally, the trends observed for ΔE solute-int (table 3) also hold for the interaction energy in solution ΔE solution-int (table S2).
By decomposing the ΔE solute-int , the prominent role of electrostatics and, to a lesser extent, orbital interactions in contributing to the relative strength of the P α −O α bonds becomes apparent. A relationship was found between the degree of charge localization on the leaving group and the magnitude of the electrostatic interaction (see scheme S1 for details). Negative charge localization on the leaving group results in an increasingly stabilizing electrostatic interaction with P α and thus a stronger bond. The stabilizing electrostatic interaction associated with PP (1d), and by extension its stronger P α −O α bond, are rationalized by its localized negative charge on the small monophosphate leaving group. Conversely, a more delocalized negative charge, as observed for the larger diphosphate leaving group of S N 2@PPP (1t), leads to a less stabilizing electrostatic interaction and a weaker leaving group bond.
Moreover, we have also analyzed the bond strength of the O γ −P γ bond in PPP using the EDA method. The O γ −P γ bond is the key phosphodiester bond that is broken during S N 2@P γ hydrolysis reactions in ATPase  and GTPase [13,14,94]. Comparable trends in O γ −P γ and P α -O α bond dissociation energies (table S3) and  interaction energies (table S4) are found, highlighting the applicability of our findings for these other ubiquitous enzymatic reactions.

Conclusions
We have computationally investigated the potential energy surfaces of seven aqueous S N 2@P reactions involving di-and triphosphates, substrates which play a key role in biological processes such as DNA replication. Our DFT study reveals that the hydrolysis S N 2@PPP reactions proceed with lower activation free energies and are more exergonic than the corresponding S N 2@PP reactions. These hydrolysis reactions all proceed via a pentavalent transition state, which is stabilized by the coordination of Mg 2+ counterions.
The enhanced reactivity of PPP compared to PP was traced back to the weaker P α -O α bond of PPP. This reduced bond strength lowers the strain associated with breaking the P α -O α bond over the course of the reaction and thus the overall reaction barrier. The weaker P α -O α bond in PPP originates from a less stabilizing electrostatic interaction between the leaving group and electrophilic phosphorus center P α , compared to PP, caused by the more delocalized negative charge of the PPP leaving group. Mg 2+ counterions diminish the reactivity differences between PP and PPP by stabilizing the negative charge of the leaving group. Stabilization of the LUMO by Mg 2+ , which is prominent in PPP, yields strain and interaction terms that still clearly favor PPP over PP in Mg 2+ -catalyzed systems, although differences in reactivity slightly diminish when compared to the uncatalyzed S N 2@P reactions.
In conclusion, we find that both S N 2@PP and S N 2@PPP are exothermic and exergonic. Importantly, however, our results underscore both the favorable kinetics and thermodynamics of the S N 2@PPP reactions. Thus, PPP can fulfill its role as an energy-rich species that stores energy and fuels up-hill reactions in biological systems. This energetic preference is a factor that contributes to making PPP the universal energy storage molecule in biological processes, including DNA replication. Additionally, by also computing the bond dissociation energy along with the P γ −O γ (for PPP) and P β −O β (for PP) interaction energies, we find that our previous conclusions also hold, that is, PP is a better leaving group than P, due to a weaker leaving-group bond in these corresponding S N 2@ P γ hydrolysis reaction central to ATPases and GTPases.