Effect of Protonation on the Molecular Structure of Adenosine 5′-Triphosphate: A Combined Theoretical and Near Edge X-ray Absorption Fine Structure Study

The present work combines near edge X-ray absorption mass spectrometry of protonated adenosine 5′-triphosphate (ATP) isolated in an ion trap with (time-dependent) density functional theory calculations. Our study unravels the effect of protonation on the ATP structure and its spectral properties, providing structure–property relationships at atomistic resolution for protonated ATP (ATPH) isolated under gas-phase conditions. Notably, the present C and N K-edge X-ray absorption spectra of isolated ATPH closely resemble those previously reported for solvated ATP at low pH. Therefore, the present work should be relevant for further investigation and modeling of structure–function properties of protonated adenine and ATP in complex biological environments.


Figure S1. N K-edge NEXAFS. Experimental photoabsorption spectra of isolated protonated ATP (light blue) and solvated ATP at pH=2.5 (dark green).
In the main text, Figure 2a compares NEXAFS spectra at the N K-edge collected using different techniques. In particular, we discuss here the comparison between the two spectra shown again in Figure S1, namely, ATP from an acidic solution at pH = 2.5 (ATPaq), measured using a liquid microjet apparatus (dark green plot in Figure S1) [1], and the present sample, produced using an electrospray ionization source and stored in a linear ion trap (light blue plot). Both techniques are expected to yield samples mainly containing protonated ATP (i.e., ATPH+), and we have already remarked in the main text on the striking similarity between the spectra, which points to a close similarity of the samples. We add here a few technical details on the apparatus that can explain minor discrepancies between the plots. The spectra were measured differently: the spectrum of solvated ATP [1] was recorded as a total electron yield using a positively biased metal electrode placed close to the liquid jet/photon beam intersection, whereas the present one was recorded as a total ion yield (TIY) summed within the mass/charge (m/z) detection limit. It should be noted that, owing to the low m/z cut-off of the quadrupole ion trap (m/z 110) in the present experiment, some fragment ions are not detected. With increasing photon energy, particularly above the core ionization potential, the formation of lower-m/z ions is favored, which could lead to an apparent decrease of the TIY. Also, the present NEXAFS of dilute ionic targets was recorded with somewhat lower resolution than that of ATPaq, so some spectroscopic structures are less well resolved. Finally, the small blue shift of the present spectrum with respect to the previous one cannot be interpreted, because the precision of the photon energy calibration for the ATPaq NEXAFS was not reported; the discrepancy could therefore be of experimental origin.
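The effect of the low m/z cut-off on the TIY can be illustrated with a minimal sketch. The fragment lists below are hypothetical numbers chosen only to show the trend, and treating the cut-off as a sharp, inclusive threshold at m/z 110 is an assumption; the function is not part of the acquisition software used in the experiment.

```python
from typing import Iterable, Tuple

def total_ion_yield(peaks: Iterable[Tuple[float, float]],
                    mz_cutoff: float = 110.0) -> float:
    """Sum the counts of all ions whose m/z is at or above the cut-off.

    Assumption: the trap cut-off acts as a sharp threshold; ions below
    it are lost entirely and ions above it are detected with equal efficiency.
    """
    return sum(counts for mz, counts in peaks if mz >= mz_cutoff)

# Hypothetical (m/z, counts) lists at two photon energies.
below_edge = [(410.0, 30.0), (136.0, 50.0), (97.0, 5.0)]
above_edge = [(410.0, 10.0), (136.0, 40.0), (97.0, 60.0)]

for label, peaks in (("below edge", below_edge), ("above edge", above_edge)):
    detected = total_ion_yield(peaks)             # what the trap records
    produced = total_ion_yield(peaks, 0.0)        # all ions actually formed
    print(f"{label}: detected TIY = {detected}, produced = {produced}")
```

In this toy case more ions are produced above the edge, yet the detected TIY decreases because the yield moves to fragments below the cut-off, which is the apparent decrease described above.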

S2. Detailed theoretical methods.
In order to ensure the full reproducibility of our calculations, we report here a complete account of the computational methods employed. The properties of adenine, ATP and ATPH+ have been investigated using a multilevel computational protocol based on tight-binding and (time-dependent) density functional theory atomistic simulations. A preliminary wide screening of the molecular configurations of ATP (and adenine) in the gas phase has been performed using a semiempirical tight-binding method based on the GFN2-xTB Hamiltonian, as implemented in the xTB suite of programs [2]. The global energy minimum and an ensemble of low-energy conformers have been identified using the conformer-rotamer ensemble search tool (CREST), which uses GFN2-xTB as the engine to perform a complex combination of (meta)dynamics simulations and geometry optimizations, blended by genetic sorting and mixing of structures [3]. In the case of ATP, a specialized algorithm has been used to evaluate all possible protonation sites and sort their energies in order to find the most stable protomers [4]. This procedure indicated the four N atoms labeled N1-N4 in Figure S2 as possible candidates for protonation. The CREST algorithm has then been applied to the four resulting protomers to find their most stable conformers. The 10 lowest-energy conformers found by CREST for each molecule have then been investigated using DFT-based simulations in a localized-basis-set framework, as implemented in the ORCA suite of programs [5]. CREST structures have been fully re-optimized using the M06-2X functional [6], the D3 dispersion correction [7] and the def2-TZVPP Gaussian basis set [8] to calculate electronic energies, and using the r2SCAN functional together with its tailored mTZVPP basis set [9] to calculate the zero-point vibrational energy (ZPE). The conformers have been sorted according to their E+ZPE values to find the most stable configuration of each system, which was used for the calculation of spectroscopic properties. Total
energies of these most stable conformers have also been recalculated using a very accurate wavefunction-based coupled-cluster approach. In detail, the calculations have been performed at the CCSD(T) level of theory using the domain-based local pair natural orbital (DLPNO) approximation and the large def2-QZVPP basis set, as also implemented in ORCA [10]. The results confirm the ordering of protomers indicated by DFT calculations, reinforcing the slight preference for protomer 3: the relative energies in eV are Protomer 3 = 0, Protomer 1 = +0.09, Protomer 2 = +0.15, Protomer 4 = +0.52. Finally, time-dependent DFT (TD-DFT) calculations, restricted to core levels of O, N and C as occupied orbitals, have been performed to calculate NEXAFS spectra using the BH&HLYP functional [11] and the aforementioned def2-TZVPP basis set, computing up to 400 electronic transitions for each atomic species. The NEXAFS vibrational structures discussed below have been calculated using the independent-mode displaced harmonic oscillator (IMDHO) model, as implemented in the ORCA advanced spectral analysis (ASA) module. We do not present in this study the results of NEXAFS calculations at the P L-edge, because they cannot currently be performed using the same TD-DFT framework used for K-edges, owing to theoretical complications in the treatment of the spin-orbit interaction. This would have forced us to introduce and assess a different calculation method, and we preferred to refrain from introducing further complexity in this letter.

Numeric labels of the N and C atoms of adenine and of the adenine units contained in ATP and in the ATPH+ protomers are defined in Figure S2 and used hereafter. As the excess proton is attached to N atoms, we start the fine analysis of NEXAFS excitations based on TD-DFT calculations from the N spectra. The lowest five bound excitations of adenine, ATP and the three ATPH+ protomers discussed in the main text, corresponding to the first two features of the N NEXAFS, are highlighted as colored bars in
Figure S3 and also listed in Table S-I. As anticipated in the main text, the most prominent feature of the plots is the blue shift of around 2 eV, upon protonation, of the transition arising from the protonated N atom (cyan stars *) with respect to the same line in ATP, accompanied by a slight reduction of its oscillator strength. Such a blue shift is consistent with the relative reduction of intensity of the first peak of the spectrum, which in adenine and ATP is formed by three lines (N1, N2 and N3) close in energy, with respect to the second peak, which acquires the contribution of the protonated N. Moreover, as also discussed in the main text, the first core excitation arising from N4 is systematically red-shifted in ATPH+ (and ATP) with respect to adenine, contributing to the reduction of the gap between the first and second peaks measured for ATPH+. Minor rearrangements of the N4, N5 and protonated-N lines in the second peak produce differences in the calculated vibronic spectra, reported in Figure S5 as Voigt convolutions of vibrational progressions. It can be noted that the spectra of all protomers are slightly red-shifted with respect to adenine. More interestingly, protomer 1 is characterized by a broad distribution of the second block of transitions involving N1, N4 and N5, which appears to agree less well with the measurements than the spectra of protomers 2 and 3.
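As an illustration of how a set of calculated lines is turned into a continuous spectrum, the following minimal sketch applies a rigid energy shift to each (energy, oscillator strength) pair and sums one line shape per transition. It uses a pseudo-Voigt profile (a weighted sum of a Gaussian and a Lorentzian of equal width) as a simple stand-in for a true Voigt convolution. The stick positions, the width, the mixing parameter, and the positive sign of the 2.7 eV shift are assumptions made for illustration, not values taken from the calculations.

```python
import math

def pseudo_voigt(x: float, x0: float, fwhm: float, eta: float) -> float:
    """Area-normalized pseudo-Voigt: eta * Lorentzian + (1 - eta) * Gaussian."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))  # Gaussian std. dev.
    gamma = fwhm / 2.0                                      # Lorentzian HWHM
    gauss = math.exp(-0.5 * ((x - x0) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    lorentz = (gamma / math.pi) / ((x - x0) ** 2 + gamma ** 2)
    return eta * lorentz + (1.0 - eta) * gauss

def broaden(sticks, grid, fwhm=0.4, eta=0.5, shift=2.7):
    """Shift every stick rigidly by `shift` (eV) and sum the line shapes on `grid`.

    fwhm, eta and the sign of the shift are illustrative assumptions.
    """
    return [sum(f * pseudo_voigt(x, e + shift, fwhm, eta) for e, f in sticks)
            for x in grid]

# Hypothetical N 1s sticks: (unshifted energy in eV, oscillator strength).
sticks = [(397.0, 0.02), (398.5, 0.03)]
grid = [399.0 + 0.01 * i for i in range(401)]   # 399.00 to 403.00 eV
spectrum = broaden(sticks, grid)
```

Because the same shift is applied to every line, relative line positions and the first-to-second peak gap discussed above are unaffected by it.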

Figure S4. C K-edge NEXAFS electronic excitations and corresponding oscillator strengths, reported as bars for adenine, ATP, and three ATPH+ protomers. The lowest-energy transitions arising from the five nonequivalent C atoms of the adenine block of the molecules, listed in Table S-II, are reported as colored bars. The same shift of 2.6 eV has been applied to all results.
C K-edge NEXAFS spectra present less obvious differences among them than the N K-edge ones. However, the proximity between the adenine C atoms and the neighboring protonation sites on N atoms in a π-conjugated system is responsible for a modulation of the C transitions, shown in Figure S4 and in Table S-II, that may provide useful hints for a sounder assignment of the preferred protonation site. First of all, the first broad feature of the spectrum between 286 and 288 eV arises entirely from transitions originating in the 1s orbitals of the adenine C atoms. This is expected, because core holes are better screened in conjugated systems than in saturated ones such as the ribose unit. Apart from a weak feature clearly visible above 290 eV in the adenine spectrum, which is assigned to further electronic transitions involving the adenine C6-C10 atoms, the steady increase of the spectrum above 289 eV in the larger molecules is almost entirely due to the contribution of ribose C atoms, with negligible differences between ATP and ATPH+.

Figure S5. Calculated N and C 1s NEXAFS spectra of adenine, ATP, and three ATPH+ protomers, including the contribution of vibrational states (harmonic approximation). All N (C) spectra have been shifted by 2.7 eV (2.6 eV) to facilitate the comparison with measurements.
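The vibrational contribution in the harmonic approximation can be pictured with a minimal sketch of a displaced-harmonic-oscillator progression at zero temperature: for each independent mode with Huang-Rhys factor S, the 0→n line carries a Poisson weight exp(−S) Sⁿ/n! and is displaced by n vibrational quanta. This is a textbook illustration, not the ORCA ASA implementation, and the mode energies and Huang-Rhys factors below are hypothetical.

```python
import math
from itertools import product

def progression(e00, strength, modes, n_max=4):
    """Vibronic sticks for independent displaced harmonic modes at 0 K.

    e00      : energy of the 0-0 line (eV)
    strength : electronic oscillator strength shared among the vibronic lines
    modes    : list of (vibrational quantum in eV, Huang-Rhys factor S)
    n_max    : highest number of quanta considered per mode (truncation)
    """
    sticks = []
    for quanta in product(range(n_max + 1), repeat=len(modes)):
        energy, weight = e00, strength
        for n, (hw, s) in zip(quanta, modes):
            energy += n * hw                                      # add n quanta
            weight *= math.exp(-s) * s ** n / math.factorial(n)  # Poisson factor
        sticks.append((energy, weight))
    return sticks

# Hypothetical electronic line with two active modes.
lines = progression(e00=401.0, strength=0.03, modes=[(0.20, 0.5), (0.05, 0.2)])
total = sum(w for _, w in lines)   # close to 0.03 if n_max is large enough
```

The sum of the vibronic weights approaches the electronic oscillator strength as n_max grows, so the progression redistributes intensity toward higher energy without creating it.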
If we take a closer look at the first feature of the C spectra, assigned to the five lowest-energy strong electronic transitions of the C6-C10 atoms, we note that the colored bars in Figure S4 are ordered in the same way (C9, C7, C6, C10, C8) in all species but protomer 3, whose ordering is C9, C7, C10, C8, C6. The effect of the additional proton on the conjugated structure of the adenine unit is responsible for this inversion. To shed light on this mechanism, we show in Figure S6 density-difference maps of four NEXAFS electronic transitions involving the C6 and C8 atoms in protomer 3 and protomer 1. In detail, the maps show the electronic density difference between the ground state and the excited state corresponding to a given electronic transition in the NEXAFS spectrum; in other words, charge is displaced from blue regions to red regions upon excitation of the molecule.
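The construction of such a map can be sketched in a few lines, assuming both densities are available on the same real-space grid (flattened here to plain lists). The sign convention (excited minus ground, so that positive values mark accumulation, i.e. red regions, and negative values mark depletion, i.e. blue regions), the grid values, and the voxel volume are assumptions for illustration only.

```python
def density_difference(rho_excited, rho_ground):
    """Point-by-point difference of two densities sampled on the same grid."""
    if len(rho_excited) != len(rho_ground):
        raise ValueError("densities must be sampled on the same grid")
    return [e - g for e, g in zip(rho_excited, rho_ground)]

def lobe_charges(delta_rho, voxel_volume):
    """Integrate the accumulation (positive) and depletion (negative) lobes."""
    gained = sum(d for d in delta_rho if d > 0.0) * voxel_volume
    lost = sum(d for d in delta_rho if d < 0.0) * voxel_volume
    return gained, lost

# Hypothetical four-point grid: density leaves point 0 and appears at point 2.
rho_ground = [0.5, 0.25, 0.0, 0.25]
rho_excited = [0.25, 0.25, 0.25, 0.25]

delta = density_difference(rho_excited, rho_ground)
gained, lost = lobe_charges(delta, voxel_volume=2.0)
```

For an excitation that conserves the number of electrons, the two lobe integrals should cancel, which offers a quick consistency check on a computed map.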
Preliminarily, we note that blue regions are, as expected, strictly confined to atomic-like 1s orbitals, visible in transparency inside the red regions surrounding the C6 and C8 atoms (gray arrows in the figure). Then, let us focus on the role of the excess proton in the transitions. When the core hole and the excess proton are in a mutual ortho position (as in the case of C6*/N3-H+ and C8*/N1-H+), the repulsion between the positive charges draws the delocalized π-conjugated cloud in to screen them. This process blocks the transfer of the excited electron through the C9=N1 (in the case of protomer 3) or C9=N3 (in the case of protomer 1) double bond, as shown by red arrows in Figure S6. This results in less delocalized final states for the excited electron and, in turn, in a higher energy for the corresponding NEXAFS lines. When the core hole and the excess proton are in a mutual para position (as in the case of C6*/N1-H+ and C8*/N3-H+), on the contrary, the excited state is more delocalized across the whole six-membered pyrimidine ring, in order to better screen the more distant positive charges. This allows C9 to participate in the charge accumulation of the excited states (green arrows) and results in lower energies for the corresponding NEXAFS lines and, in turn, in the inversion of the C6 and C8 lines in the spectra of protomers 1 and 3. Even if there is no conclusive argument to fully support the assignment, the cross comparison of the calculated N and C 1s vibronic NEXAFS spectra suggests that protonation of N3 yields spectra in slightly closer overall agreement with measurements than protonation of N1 or N2. This also agrees with the energetics of the three protomers, which slightly favor protomer 3 over protomer 1 and protomer 2.
However, the coexistence of several protomers cannot be excluded. Moreover, the rearrangement of each protomer among slightly different low-energy rotamers (as found by CREST and reoptimized at the DFT level, within 1 eV of the most stable rotamer of protomer 3) has been considered. The calculated N 1s vibronic NEXAFS spectra of all eight rotamers considered, shown in Figure S7, are very similar to one another within each protomer and clearly distinguishable between different protomers, but they can nevertheless contribute to blending the results, preventing an unambiguous assignment of protomers in the measured sample.
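To give a feeling for what the relative energies quoted in Section S2 (0, +0.09, +0.15 and +0.52 eV) would imply for the coexistence of protomers, the following sketch converts them into Boltzmann populations. This is only an order-of-magnitude illustration under explicit assumptions: thermal equilibrium among protomers, a temperature of 298.15 K (the effective temperature of the trapped ions is not stated here), and the use of the quoted relative energies in place of free energies.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def boltzmann_populations(rel_energies_ev, temperature_k=298.15):
    """Fractional populations from relative energies (eV) at thermal equilibrium.

    Assumption: equilibrium at `temperature_k`; entropic contributions ignored.
    """
    kt = K_B_EV * temperature_k
    weights = {name: math.exp(-e / kt) for name, e in rel_energies_ev.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}

# Relative energies in eV as quoted in Section S2.
relative_energies = {
    "Protomer 3": 0.00,
    "Protomer 1": 0.09,
    "Protomer 2": 0.15,
    "Protomer 4": 0.52,
}

populations = boltzmann_populations(relative_energies)
for name, p in sorted(populations.items(), key=lambda item: -item[1]):
    print(f"{name}: {100.0 * p:.2f} %")
```

Under these assumptions protomer 3 dominates, while protomer 1 retains a population of a few percent; ions produced by electrospray are not necessarily equilibrated, so the actual mixture may differ.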
S4. Fine analysis of structure-properties relationships.
In order to provide further insight into structure-properties relationships, atoms-in-molecules (AIM) charges have been calculated using the Bader partitioning method [12]. In this case, 9-methyladenine has also been added to the investigated molecules, with the purpose of demonstrating that its charge density distribution is much more similar to that of ATP than the distribution of adenine is. It could therefore be considered a more reliable reduced model, even for further experimental studies. The results are summarized in Table S-III. Some interesting trends can be highlighted in these results and cross-compared with the calculated spectra (see above) and the protonation maps (see below). First of all, protonation clearly affects not only the charge distribution on the N atoms but also the charge distribution on the C atoms, as already indicated by the fine analysis of the C 1s NEXAFS spectra. Moreover, protonation induces only a slight increase of charge on the adenine unit of the ATPH+ protomers with respect to ATP (around 0.1 a.u., localized on the newly formed N-H moieties), which is necessary to screen the positive charge of the added proton and is accompanied by some charge displacement inside the same adenine block. The main relocation of charge induced by protonation is discussed with the help of Table S-III and of the protonation maps in Figure S8. For the sake of clarity, the latter maps have been obtained for 9-methyladenine, which has a charge distribution inside the adenine unit close to that of ATP. A single proton has been added to the structure without relaxing the rest of the structure, and the maps display the charge density difference between the protonated and unprotonated structures, with the charge density displaced from blue regions to red regions upon protonation. We stress that only the added proton is relaxed after its addition to the structures; this is necessary to calculate the maps, which can therefore only show tendencies, while the charges in Table S-III correspond to fully relaxed systems, so that the
information must be cross-compared. In protomer 1 the main displacement involves a net charge transfer from N3 to C6. This can be interpreted using resonance structures typical of the reaction intermediates of electrophilic aromatic substitution reactions [13]: the positive charge induced on N1 by protonation can be partly delocalized to the ortho (C8 and C9) and para (C6) positions of the six-membered aromatic ring, where it is screened through delocalization of the aromatic charge. This is particularly visible in the case of the C6 atom (the blue spot, having an evident π character; see the green arrow in Figure S8). However, the C6 atom is surrounded by two electron-rich neighbors, namely the N3 aza nitrogen and the terminal N4-H2 group, both bearing a lone pair. N4 reacts to the lack of charge by assuming an sp2 hybridization that does not alter its total charge. The C6-N4 bond shortens from 1.34 Å in ATP to 1.31 Å in protomer 1, with a significant increase of the Mayer bond order from 1.19 (almost a single bond) to 1.76 (almost a double bond). On the other hand, the C6-N3 bond elongates from 1.33 Å in ATP to 1.35 Å in protomer 1, corresponding to a decrease of the Mayer bond order from 1.41 to 1.24, suggesting a partial relocation of the aromatic bond on C6. As a final remark, it can be noted that protomer 3, so far the best ATPH+ candidate, is characterized by the distribution of electronic charge closest to that of ATP.

S5. Cartesian coordinates of optimized molecular structures.
We make available the Cartesian coordinates of the five molecular structures (adenine, ATP and three ATPH+ protomers) for which NEXAFS spectra have been calculated. The optimized structures of ATP and of the ATPH+ protomers are also shown in Figure S9.

Figure S9. Geometries of ATP and of the three ATPH+ protomers investigated in this study, optimized using the dispersion-corrected M06-2X functional and the def2-TZVPP basis set. Additional protons are enclosed in green circles. H-bonds, shown as dashed lines, promote the formation of a coiled molecular structure in the gas phase.
The close similarity between the N 1s NEXAFS spectra of protonated ATPH+ obtained using different techniques (discussed in detail in Section S1) has stimulated further theoretical studies to unravel to what extent the presence of water molecules can affect the present structural and spectroscopic results. There is no doubt that the present near edge X-ray absorption mass spectrometry technique is able to select only protonated ATPH+, with no contribution from water molecules. The same is not necessarily true of the pressurized-jet technique used by Kelly et al. [1], which uses a water solution at controlled pH as the starting sample. Moreover, it must also be considered that ATP is generally found as a dianion in water solution at neutral pH, with deprotonation of the phosphate units [14], and that at lower pH the adenine unit can be protonated even before the phosphate units. Finally, water might heavily alter the strong tendency of neutral and protonated ATP to form coiled structures through strong H-bonds between the adenine and phosphate units, discussed in the main text and also shown in Figure S9.
To shed light on these hypotheses, we have performed a series of calculations that address the multiple roles played by water, comparing the effects of adding different numbers of H2O molecules to neutral ATP and to the most stable ATPH+ protomer 3. In detail, we have studied step by step the transition from the clustering of a few water molecules around the biomolecules in the gas phase to the complete solvation of the same molecules in a water solution. The addition of water molecules has been controlled using a special version of the CREST algorithm discussed in Section S2, termed quantum cluster growth (QCG) [15], which has been used to surround ATP and ATPH+ with a fixed number of water molecules and to explore minimum-energy structures, which are shown in the upper panel of Figure S10. Regarding the structural properties, we note first of all that, irrespective of the number of water molecules, ATP and ATPH+ retain the coiled structure found in the gas phase. Water molecules preferentially interact with the polar phosphate units without entering the biomolecule pocket. In all structures, single water molecules generally act as bridges between phosphate and adenine, probably because this position reduces the strain required to bend the structures and form head-tail coils. We already find at this stage a difference between ATP and ATPH+: 10 water molecules in the gas phase are sufficient to induce the deprotonation, and ensure the screening, of one of the phosphate groups in neutral ATP. The released proton is then preferentially localized on N2 in these conditions (red arrow in Figure S10), suggesting that gas-phase interactions may lead to an unexpected localization of excess protons that cannot be predicted even on the grounds of well-established properties of solutions. No shifting of protons is induced in the positively charged ATPH+ in the gas phase in the presence of 5 or 10 water molecules. When a realistic water solution of ATP and ATPH+ is simulated by
raising the amount of explicit water to 130 molecules embedded in an implicit water solution, the stable structures found are phosphate dianions, in agreement with the catalytic role of divalent cations such as Mg2+ in the release of phosphate units in living organisms [14,16]. However, the protons released in solution from the phosphate units are not captured by nitrogen in the case of ATP, and do not alter the protonation of N3 in the case of ATPH+ protomer 3. Remarkably, in both solvated systems the molecules retain a similar coiled structure, as shown in Figure S10 in the two insets enclosed in green circles. Such results suggest that the protonation of the adenine unit in low-pH solutions is not affected by the transition from solution to the gas phase obtained using different methods such as liquid microjet or electrospray ionization.
The analysis based on the properties of low-energy structures is strengthened by the results of N and C 1s NEXAFS calculations for the six gas-phase ATP and ATPH+ structures shown in the upper panel of Figure S10. The spectra are shown in the lower panels of the same figure. Regarding the N K-edge, in the case of the neutral ATP molecule the addition of five water molecules partially in contact with N atoms through H-bonds does not significantly alter the N 1s NEXAFS spectrum: it induces a red shift of 0.1 eV of the main, sharp peak, which is smaller than the uncertainty of the photon energy scales used in experiments (see Section S1). As discussed above, the addition of 10 water molecules promotes the formation of ATPH+ in the gas phase through deprotonation of phosphate, inducing a dramatic change in the shape of the spectrum and a larger red shift of 0.3 eV of the main peak, in line with the results obtained for the conventional ATPH+ protomers in the absence of water, shown in Figure 2d of the main text. The high compatibility of our calculated N 1s NEXAFS spectra of ATP in the gas phase with the measured spectrum shown in Figure 2c, corresponding to a jet-spray sample from a pH = 7.5 solution, suggests a secondary role of water in the spectroscopic measurements at the N K-edge, and excludes the presence of larger amounts of water interacting with the molecule and capable of inducing the migration of protons from phosphate to adenine through water molecules. The presence of water is even less significant in the case of the N 1s NEXAFS spectra of the protonated ATPH+ protomer 3.
The spectrum is almost insensitive to the addition of 5 water molecules, and only moderately blue-shifted (0.2 eV) upon addition of 10 molecules, in both cases without strong alterations of its shape. Even fewer differences are found in the C K-edge spectra upon addition of the same numbers of water molecules; the spectra of hydrated ATP and ATPH+ are not shifted and only slightly modified in shape. Putting together all the information extracted from these calculations and from their comparison with measurements, we can also rationalize the close similarity between the two spectra shown in Figure S1, concluding that both ATP and ATPH+ have a strong tendency to form similar coiled or folded structures in the gas phase and in solution, and that the presence of small numbers of water molecules connected to ATP and ATPH+ through H-bonds does not significantly affect this tendency or the electronic environment of the N and C atoms, as reflected by the calculated and measured NEXAFS spectra.

Figure S2. Ball-and-stick model of the adenine unit of the investigated molecules. Numerical labels are assigned to nitrogen and carbon atoms to facilitate the discussion of results.

Figure S6. Density-difference maps of four NEXAFS transitions arising in protomer 1 and protomer 3 from the two carbon atoms labeled C6 and C8 in Figure S2. Charge density is displaced from blue regions to red regions upon excitation. Green and red arrows indicate allowed and blocked conjugation, respectively, of the C9 site, as discussed in the text. Yellow and cyan stars indicate the positions of core holes and excess protons, respectively.

Figure S7. Calculated N 1s NEXAFS spectra of the eight lowest-energy rotamers of the three ATPH+ protomers, including the contribution of vibrational states (harmonic approximation). All spectra have been shifted by 2.7 eV to facilitate the comparison with measurements.

Figure S10. Upper panel: Lowest-energy structures of ATP and of the ATPH+ protomer 3 in gas-phase clusters containing no, five or ten water molecules, and of the same molecules in water solution, surrounded by 130 explicit water molecules and immersed in an implicit solvent. In these last cases, the coiled structures of the molecules alone are shown inside the green circles. A red arrow indicates the protonation of the N2 site in the ATP+10w system, as discussed in the text. Hydrogen bonds in gas-phase structures are represented by black dashed lines. Lower panels: Calculated C and N 1s NEXAFS spectra, including the contribution of vibrational states, of the six gas-phase structures shown in the upper panel.

Table S-I. Lowest-energy NEXAFS transitions arising from the 1s orbitals of the N1-N5 atoms labeled in Figure S2. The table cells report transition energies, assignment to the N 1s donor orbital, and oscillator strength (in a.u.). A * indicates the protonation site. The same shift of 2.7 eV has been applied to all results to facilitate the comparison with measurements.

N K-edge Adenine ATP Protomer 1 Protomer 2 Protomer 3
Figure S3. N K-edge NEXAFS electronic excitations and corresponding oscillator strengths, reported as bars for adenine, ATP, and three ATPH+ protomers, where a * indicates the protonation site. The lowest-energy transitions arising from the five nonequivalent N atoms of the molecules, listed in Table S-I, are reported as colored bars. The same shift of 2.7 eV has been applied to all results.