How to understand quantum chemical computations on DNA and RNA systems? A practical guide for non-specialists
Introduction
Owing to rapid advances in computer hardware and software over the last two decades, computational methods have become a common tool for studying selected aspects of the structural dynamics of nucleic acids [1], [2]. We provide a short summary of modern electronic structure quantum-chemical (QM) calculations of the basic energy contributions in nucleic acids. We do not review other applications of QM methodology, such as computations of chemical reactions, proton- and electron-transfer processes, or excited states.
In 1995, the first stable classical (force field) molecular dynamics (MD) simulations of nucleic acids in explicit solvent were achieved [3], [4] thanks to the new particle-mesh Ewald (PME) treatment of long-range electrostatics [5]. Without PME, the highly charged nucleic acids fell apart because errors accumulated along the trajectories. The first PME simulations were ∼1 ns long, whereas contemporary simulations often reach the microsecond time scale, with the millisecond time scale in sight [6], [7], [8], [9]. Longer simulations are unmasking problems with the accuracy of the empirical molecular mechanics (MM) force fields [8], [10], [11], [12], i.e., the potential energy functions relating molecular energy to molecular geometry. The fundamental MM approximations represent the crucial limit of MM simulations [13], [14]. The magnitude of the MM limitations is yet to be determined, and the problems are often underreported in the literature.
Likewise, around 1995, the first modern electronic structure (quantum-chemical) calculations of the basic interactions (base stacking and base pairing) in nucleic acids were published [15], [16], [17], [18], [19], [20], [21]. These calculations included the correlation energy of electrons for the first time and achieved semiquantitative accuracy. All basic conclusions of these studies, such as the clarification of the nature of base stacking and the revelation of the intrinsic nonplanarity of the exocyclic amino groups of nucleic acid bases, remain entirely valid to this day, as summarized in Ref. [22] or in a review for nonspecialists [23].
Later QM calculations achieved quantitative accuracy and serve as benchmarks, i.e., accurate structure/energy data about the studied systems that are used in a similar manner to experimental data [24], [25], [26] (for a review see Ref. [27]). QM calculations fill a gap in experimental methods, as contemporary physical chemistry lacks appropriate tools to measure the true (intrinsic) energies of base pairing, base stacking or different backbone rotameric states [28], [29].
Traditional wave function theory vs. modern density functional theory
Wave function theory (WFT) solves the Schrödinger equation by applying various levels of physically justified approximations (Fig. 1) [1], [23], [30]. A basic Hartree–Fock (HF) calculation neglects, by definition, the correlation of opposite-spin electrons, i.e., the electron correlation energy. The electron correlation energy gives rise to, e.g., the attractive part of the van der Waals interactions (the dispersion term).
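The relation between the HF energy and the correlation energy can be written compactly. The following is the standard textbook definition, not a formula taken from this article; the symbols E_exact (the exact nonrelativistic energy) and ΔE (the interaction energy of, e.g., a stacked or paired base dimer) are introduced here for illustration only.

```latex
E_{\mathrm{corr}} = E_{\mathrm{exact}} - E_{\mathrm{HF}} \le 0,
\qquad
\Delta E = \Delta E^{\mathrm{HF}} + \Delta E^{\mathrm{corr}}
```

In this decomposition the dispersion attraction is contained entirely in the correlation part of the interaction energy, which is why HF alone cannot describe it.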
The most complete WFT method is full configuration interaction (full-CI) that represents
Can we straightforwardly compare QM calculations with experiments?
The primary goal of nucleic acid computations is not to reproduce experimental data, since the design of computations differs from experimental methods in many important respects. We should not expect the same numbers to come from the computations and from a given experiment. Making computations exactly comparable with experiment is often impossible. Similarly, we do not have experiments that could provide the information that is reachable by computations. Computations provide analytical
QM computations and experiments
A typical QM computation is equivalent to a hypothetical energy-measuring “single-geometry” experiment carried out in the gas phase (in vacuo) at a temperature of 0 K [38]. The energies are determined either for a set of configurations of the nuclei (a set of fixed geometries) or with the use of gradient geometry optimization (energy minimization). Thus, QM calculations investigate the potential energy surface (PES). The PES assigns energies to all geometries of the studied system (Fig. 3). PES of two
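The two ways of exploring a PES mentioned above, a scan over fixed geometries and a gradient-driven energy minimization, can be illustrated on a toy one-dimensional surface. The sketch below is only an illustration under stated assumptions: a Morse curve stands in for a real QM energy, and the parameters D_E, A and R_0 as well as the function names are hypothetical, not taken from the article or from any QM program.

```python
import math

# Hypothetical Morse parameters (illustrative only, not fitted to a real system).
D_E = 0.17   # well depth
A = 1.0      # width parameter
R_0 = 1.4    # equilibrium distance


def energy(r):
    """Model PES: energy as a function of a single geometric coordinate r."""
    x = math.exp(-A * (r - R_0))
    return D_E * (1.0 - x) ** 2 - D_E


def gradient(r):
    """Analytical derivative dE/dr of the model PES."""
    x = math.exp(-A * (r - R_0))
    return 2.0 * D_E * A * (1.0 - x) * x


def scan(r_values):
    """Single-point energies for a set of fixed geometries (a rigid scan)."""
    return [(r, energy(r)) for r in r_values]


def minimize(r_start, step=1.0, tol=1e-10, max_iter=10000):
    """Steepest-descent minimization: follow the negative gradient downhill."""
    r = r_start
    for _ in range(max_iter):
        g = gradient(r)
        if abs(g) < tol:
            break
        r -= step * g
    return r, energy(r)


if __name__ == "__main__":
    for r, e in scan([1.0, 1.2, 1.4, 1.6, 2.0]):
        print(f"r = {r:.2f}  E = {e:+.5f}")
    r_min, e_min = minimize(2.0)
    print(f"minimum near r = {r_min:.4f}, E = {e_min:+.5f}")
```

Note that the minimizer returns only the minimum closest to the starting point along the downhill path; on a real multidimensional PES different starting geometries may therefore end in different minima.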
Molecular simulations and experiments
For completeness, let us also comment briefly on the MD simulation technique. MD, too, is not equivalent to any existing experimental method. MD simulation is a single-molecule solution technique. The studied molecule initially assumes a certain xyz geometry (the starting structure), which often has a major impact on the subsequent simulation [1], [2], [14], [68]. The simulation then investigates the genuine thermal motions of the molecule on a very short (compared to most real processes) time scale
Inclusion of solvent
QM calculations have traditionally been done in the gas phase, i.e., with completely isolated systems [22], [23]. However, the negative charges associated with multiple phosphate groups necessitate inclusion of solvent screening effects for larger nucleic acid systems. This can be done using approximate QM continuum solvent models [75], [76] analogous to the classical Poisson–Boltzmann (PB) theory used in molecular modeling [77], [78].
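A rough feeling for why screening matters can be obtained from Coulomb's law alone. The sketch below is a deliberately crude illustration, assuming two point charges in a uniform dielectric; this is far simpler than the continuum QM or PB models cited above. The 7 Å separation and the relative permittivity of 78.4 for water are illustrative assumptions, and the function name is hypothetical.

```python
# Coulomb constant in kcal*Angstrom/(mol*e^2), a conversion factor
# commonly used in molecular modeling.
COULOMB_KCAL = 332.06371


def coulomb_energy(q1, q2, r_angstrom, eps_r=1.0):
    """Coulomb energy (kcal/mol) of two point charges in a uniform dielectric.

    q1, q2      charges in units of the elementary charge
    r_angstrom  separation in Angstrom
    eps_r       relative permittivity (1.0 corresponds to the gas phase)
    """
    if r_angstrom <= 0.0 or eps_r <= 0.0:
        raise ValueError("distance and permittivity must be positive")
    return COULOMB_KCAL * q1 * q2 / (eps_r * r_angstrom)


if __name__ == "__main__":
    # Two phosphate-like -1 charges; the 7 Angstrom distance is an assumption.
    gas = coulomb_energy(-1.0, -1.0, 7.0)
    water = coulomb_energy(-1.0, -1.0, 7.0, eps_r=78.4)
    print(f"gas phase: {gas:.2f} kcal/mol, uniform dielectric: {water:.2f} kcal/mol")
```

Under these assumptions the repulsion between the two charges drops by the factor eps_r, which shows qualitatively why gas-phase energies of systems with several phosphates are dominated by unscreened electrostatics.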
The continuum solvent calculations are a major obstacle in QM
Should we use QM or MM descriptions?
MM approaches allow incomparably better sampling of the conformational space than QM, the study of bigger systems, and explicit inclusion of solvent. QM computations are justified when we need a more accurate description than is achievable by MM.
Part of the simulation literature may give the impression that the force fields are almost perfect. QM is viewed as an exotic tool that could bring a somewhat better description of some non-essential details. In reality, there is a fundamental difference
QM and force field parameterization
QM computations can help in the parameterization of nucleic acid force fields [67]. The most straightforward way is to use benchmark QM data as target values for the force field. We can calculate reference energy profiles associated with a given torsion using QM and using MM without the MM torsion term. The QM–MM difference can then be used to fit the desired MM term [105]. Although the procedure looks simple, it is not. First, it is not easy to make sampling of the QM and MM potential energy
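The basic fitting step described above can be sketched as follows. This is a minimal illustration, not the procedure of Ref. [105]: it assumes the torsion is scanned on a uniform grid covering the full 360°, in which case the least-squares Fourier coefficients of the QM–MM difference follow from simple sums. All function names are hypothetical and the energy profiles are synthetic, made up only to exercise the code.

```python
import math


def fit_torsion_term(phis_deg, e_qm, e_mm_no_torsion, n_max=3):
    """Fit c0 + sum_n [a_n cos(n*phi) + b_n sin(n*phi)] to the QM - MM difference.

    Assumes the angles are uniformly spaced over the full 360 degrees; under
    that assumption these sums are the exact least-squares solution.
    """
    n_pts = len(phis_deg)
    if n_pts <= 2 * n_max:
        raise ValueError("need more than 2*n_max scan points")
    diff = [q - m for q, m in zip(e_qm, e_mm_no_torsion)]
    phis = [math.radians(p) for p in phis_deg]
    c0 = sum(diff) / n_pts
    terms = []
    for n in range(1, n_max + 1):
        a = 2.0 / n_pts * sum(d * math.cos(n * p) for d, p in zip(diff, phis))
        b = 2.0 / n_pts * sum(d * math.sin(n * p) for d, p in zip(diff, phis))
        terms.append((n, a, b))
    return c0, terms


def torsion_energy(phi_deg, c0, terms):
    """Evaluate the fitted torsion term at a given dihedral angle."""
    p = math.radians(phi_deg)
    return c0 + sum(a * math.cos(n * p) + b * math.sin(n * p) for n, a, b in terms)


# Synthetic profiles (made-up numbers): scan every 30 degrees.
PHIS = list(range(0, 360, 30))
E_MM = [0.5 * math.cos(2 * math.radians(p)) for p in PHIS]
E_QM = [
    m + 2.0 + 1.5 * math.cos(math.radians(p))
    + 0.3 * math.sin(2 * math.radians(p))
    + 0.8 * math.cos(3 * math.radians(p))
    for m, p in zip(E_MM, PHIS)
]

if __name__ == "__main__":
    c0, terms = fit_torsion_term(PHIS, E_QM, E_MM)
    print(f"c0 = {c0:.3f}")
    for n, a, b in terms:
        print(f"n = {n}: a = {a:+.3f}, b = {b:+.3f}")
```

The cosine and sine pair for each periodicity n is equivalent to the amplitude and phase form commonly used for MM torsion terms. The sketch sidesteps the practical difficulties the text goes on to discuss, since it takes the QM and MM profiles as given on identical geometries.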
Calculations of chemical reactions, a place where QM meets MM
QM methods make it possible to study bond-breaking and bond-making processes. They are widely used to estimate reaction barriers and to gain detailed insights into reaction mechanisms. However, QM methods cannot be applied directly to whole enzymes because of their size. On the other hand, MM does not allow bond breaking and formation, which occur during catalysis, but is computationally efficient enough to describe a complete biomolecule with surrounding ions and explicit waters. In 1976, Warshel
Perspectives and outlook
Accurate QM computations have so far been limited to a few dozen atoms, using idealized as well as experimental geometries [121], [122], [123], [124], [125], [126], [127]. With the latest advances in fast, accurate QM methods [35], [36], [128], [129] we see the prospect of high-quality QM calculations on systems with hundreds of atoms and still sufficient accuracy. Sufficient accuracy is the key point, since cheap QM methods are quite unreliable. The best lower-cost methods capable of realistically
Acknowledgements
This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic: projects “CEITEC – Central European Institute of Technology” (CZ.1.05/1.1.00/02.0068) and “RCPTM – Regional Centre of Advanced Technologies and Materials” (CZ.1.05/2.1.00/03.0058) from European Regional Development Fund, and “RCPTM-TEAM”, (CZ.1.07/2.3.00/20.0017, M.O., P.B.) from the Operational Program Education for Competitiveness – European Social Fund and by the Grant Agency of the Czech Republic [
References (133)
- et al., Curr. Opin. Struct. Biol. (1996)
- Curr. Opin. Struct. Biol. (2004)
- et al., Biophys. J. (1997)
- et al., Methods (2009)
- et al., J. Mol. Biol. (1999)
- et al., Curr. Opin. Struct. Biol. (2006)
- et al., J. Mol. Biol. (2004)
- et al., Biophys. J. (2001)
- et al., Methods (2012)
- et al., Biophys. J. (2007)