Molecular dynamics simulations data of the twenty encoded amino acids in different force fields

We present extensive all-atom molecular dynamics (MD) simulation data of the twenty encoded amino acids in explicit water, simulated with different force fields. The termini of the amino acids were capped so that the dynamics of the Φ and ψ torsion angles are analogous to the dynamics within a peptide chain. We use representatives of each of the four major force field families: AMBER ff99SB-ILDN [1], AMBER ff03 [2], OPLS-AA/L [3], CHARMM27 [4] and GROMOS43a1 [5,6]. Our data represent a library and test bed for method development for MD simulations and for force field development. Part of the data set has previously been used to compare the dynamic properties of force fields [7] and to construct peptide basis functions for the variational approach to molecular kinetics [8].


The data can be used to construct peptide basis functions for the variational approach to molecular kinetics [10,11], as described in Ref. [8].
Our data set allows the user to probe and compare the properties of the force fields [7] and to make an informed decision when choosing a force field for the simulation of larger systems.

Data
The public repository (ftp://bdg.chemie.fu-berlin.de/Ac-X-NHMe/) is structured as follows. The main folder contains a README.txt file that describes the simulation details common to all set-ups, which are also described in Experimental Design, Material and Methods A. A GROMACS-specific [9] simulation parameter file (nvt_production_1mus.mdp) is also included for clarity.
The data is sorted by force field. Each force-field subfolder contains twenty folders, one per amino acid. The folders are named Ac-X-NHMe, where X is replaced by the one-letter code of the amino acid. Within each amino-acid-specific folder, another README.txt summarizes the number and length of the independent runs. Each independent run has its own sub-folder, which contains the initial configuration Ac-X-NHMe_run0.gro and the trajectory in the GROMACS binary format (.xtc). In addition, a topology file (Ac-X-NHMe.top) is provided, which contains the atom types and the bonded and non-bonded parameters of the chosen force field, and lists the constraints. The initial configuration (.gro), the topology file (.top) and the simulation parameter file (.mdp) are sufficient to re-run the simulations.
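The folder layout described above can be sketched in a few lines of Python. This is a minimal illustration only: the force-field folder name passed in the usage example is a hypothetical placeholder (the actual subfolder names are not given in the text), and the names of the run sub-folders are left out for the same reason.

```python
from pathlib import PurePosixPath

# One-letter codes of the twenty encoded amino acids.
ONE_LETTER_CODES = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_folders(force_field_folder, root="Ac-X-NHMe"):
    """Return the twenty Ac-X-NHMe folder paths below one force-field folder.

    `force_field_folder` is an assumption of this sketch: the real folder
    names must be looked up in the repository itself.
    """
    base = PurePosixPath(root) / force_field_folder
    return [base / f"Ac-{code}-NHMe" for code in ONE_LETTER_CODES]


def topology_path(amino_acid_folder):
    """Return the path of the topology file inside an amino-acid folder."""
    return amino_acid_folder / f"{amino_acid_folder.name}.top"


if __name__ == "__main__":
    # "some_force_field" is a hypothetical placeholder name.
    for folder in amino_acid_folders("some_force_field")[:3]:
        print(folder, "->", topology_path(folder))
```

The run sub-folders listed in each README.txt would be appended to these paths to reach the .gro and .xtc files.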

MD simulations
We performed all-atom MD simulations in explicit solvent of terminally blocked amino acids, acetyl-X-methylamide (Ac-X-NHMe), where X stands for any of the twenty encoded amino acids. All twenty amino acids were simulated with five different force fields: AMBER ff99SB-ILDN [1], AMBER ff03 [2], OPLS-AA/L [3], CHARMM27 [4] and GROMOS43a1 [5,6]. The water model was chosen to match the one used for the validation of the force field, i.e. TIP3P [12] for AMBER ff99SB-ILDN, AMBER ff03, OPLS-AA/L and CHARMM27, and SPC [13] for GROMOS43a1. Simulations were performed with the GROMACS 4.5.5 simulation package [9]. The number of particles and the volume were fixed during the simulations. The temperature was maintained at 300 K using the V-Rescale thermostat [14]. Each initial set-up was minimised using the steepest descent algorithm and equilibrated in the NVT ensemble for 100 ps. Subsequently, two independent production runs of 1 μs each were carried out for each amino acid/force field combination (exception: aliphatic amino acids A, G, I, L and P in the ff99SB-ILDN [1] force field, production runs of 200 ns each). This yields a total simulation time of 2 μs per simulation set-up (exception: aliphatic amino acids A, G, I, L and P in the ff99SB-ILDN [1] force field, 4 μs; A, V in ff03 [2], OPLS-AA/L [3], CHARMM27 [4] and GROMOS43a1 [5,6], 4 μs). The integration time-step was 2 fs, and atom positions of the solute were written to file every 1 ps.
Table 1. Simulation parameters per amino acid and force field: number of water molecules, size of simulation box, number of independent runs and total simulation time.
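As a quick consistency check on the numbers above, the following sketch converts a run length, the 2 fs time-step and the 1 ps output interval into integration steps and saved frames. The function name and its arguments are hypothetical helpers introduced for this illustration; the frame count ignores the initial frame at time zero, which a trajectory file may additionally contain.

```python
FS_PER_PS = 1_000
FS_PER_NS = 1_000_000


def run_counts(run_length_ns, timestep_fs=2, output_interval_ps=1):
    """Return (integration steps, steps between frames, saved frames).

    Integer arithmetic in femtoseconds avoids floating-point rounding.
    """
    run_fs = run_length_ns * FS_PER_NS
    interval_fs = output_interval_ps * FS_PER_PS
    if run_fs % timestep_fs or interval_fs % timestep_fs:
        raise ValueError("run length and output interval must be multiples of the time-step")
    n_steps = run_fs // timestep_fs
    steps_per_frame = interval_fs // timestep_fs
    n_frames = n_steps // steps_per_frame
    return n_steps, steps_per_frame, n_frames


if __name__ == "__main__":
    for length_ns in (1000, 200):  # 1 μs and 200 ns production runs
        steps, stride, frames = run_counts(length_ns)
        print(f"{length_ns} ns: {steps} steps, one frame every {stride} steps, {frames} frames")
```

Under these assumptions, a 1 μs run corresponds to 500,000,000 steps and 1,000,000 saved frames, and a 200 ns run to 100,000,000 steps and 200,000 frames.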
In the production runs, the leap-frog integrator was used and bonds to hydrogen atoms were constrained using the LINCS algorithm [15] (lincs_iter = 1, lincs_order = 4). A cut-off of 1 nm was used for Lennard-Jones interactions. Electrostatic interactions were treated with the Particle-Mesh Ewald (PME) algorithm [16] in combination with a real-space cut-off of 1 nm, a grid spacing of 0.16 nm, and an interpolation order of 4. Periodic boundary conditions were applied in all three dimensions. For further details refer to Table 1.
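For orientation, the settings named in this section can be collected and rendered in GROMACS .mdp syntax. This is a hedged sketch, not a copy of the repository file: the mapping of the prose onto GROMACS 4.5 option names is our reading of the text, options the text does not specify (for example the thermostat coupling groups and time constant) are left out, and the nvt_production_1mus.mdp file in the repository remains the authoritative source.

```python
# Settings taken from the text, expressed with GROMACS 4.5 option names.
# The mapping is an assumption of this sketch; the fragment is incomplete.
PRODUCTION_SETTINGS = {
    "integrator": "md",              # leap-frog integrator
    "dt": 0.002,                     # time-step in ps (2 fs)
    "nstxtcout": 500,                # write positions every 500 steps (1 ps)
    "tcoupl": "v-rescale",           # V-Rescale thermostat
    "ref_t": 300,                    # reference temperature in K
    "constraints": "h-bonds",        # constrain bonds to hydrogen atoms
    "constraint_algorithm": "lincs",
    "lincs_iter": 1,
    "lincs_order": 4,
    "rvdw": 1.0,                     # Lennard-Jones cut-off in nm
    "coulombtype": "PME",
    "rcoulomb": 1.0,                 # real-space cut-off in nm
    "fourierspacing": 0.16,          # PME grid spacing in nm
    "pme_order": 4,                  # PME interpolation order
    "pbc": "xyz",                    # periodic in all three dimensions
}


def render_mdp(settings):
    """Render a settings dictionary as aligned 'key = value' lines."""
    width = max(len(key) for key in settings)
    lines = [f"{key:<{width}} = {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_mdp(PRODUCTION_SETTINGS), end="")
```

Keeping the settings in a dictionary makes it easy to compare them against the values parsed from the distributed parameter file before re-running a simulation.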

Ramachandran plots
Backbone dihedral angles Φ and ψ are good reaction coordinates for the dynamics of amino acids and short peptides. Using the GROMACS command g_rama we extracted the Φ and ψ time-series from the MD trajectories. The space spanned by the {Φ-ψ}-combinations of a single amino acid (capped or within a peptide chain) has a well-defined distribution and can be represented in a two-