Assessment of GAFF and OPLS Force Fields for Urea: Crystal and Aqueous Solution Properties

Molecular simulations such as Monte Carlo, molecular dynamics, and metadynamics have been used to provide insight into crystallization phenomena, including nucleation and crystal growth. However, these simulations depend on the force field used, which models the atomic and molecular interactions, to adequately reproduce relevant material properties for the phases involved. Two widely used force fields, the General AMBER Force Field (GAFF) and the Optimized Potential for Liquid Simulations (OPLS), including several variants, have previously been used for studying urea crystallization. In this work, we investigated how well four different versions of the GAFF force field and five different versions of the OPLS force field reproduced known urea crystal and aqueous solution properties. Two force fields were found to have the best overall performance: a specific urea charge-optimized GAFF force field and the original all-atom OPLS force field. It is recommended that a suitable testing protocol involving both solution and solid properties, such as that used in this work, is adopted for the validation of force fields used for simulations of crystallization phenomena.


A.1.1
Bond interactions GAFF, OPLS-AA, OPLS-AA-N and OPLS-AA-D use a harmonic bond potential, given in Eq. 1, which is implemented with bond_style harmonic in LAMMPS.This is also used by the SPC/E water model.OPLS-S and OPLS-G use the gromos bond potential, given in Eq. 2, which is implemented with bond_style gromos in LAMMPS.
E bond = K r (r − r 0 ) 2 (1) K r is the bond constant, r 0 is the equilibrium distance and r is the distance between the two atoms.
A.1.2Angle interactions GAFF, OPLS-AA, OPLS-AA-N and OPLS-AA-D use a harmonic angle potential, given in Eq. 3, which is implemented with angle_style harmonic in LAMMPS.This is also used by the SPC/E water model.OPLS-S and OPLS-G use a squared cosine angle potential, given in Eq. 4, which is implemented with angle_style cosine/squared in LAMMPS.
K θ is the angle constant, θ 0 is the equilibrium angle and θ is the angle between the three atoms.
OPLS-S and OPLS-G use a harmonic dihedral potential, given in Eq. 7, which is implemented with dihedral_style harmonic in LAMMPS.
K ϕ,i and V 1−4 are the force constants, ϕ is the dihedral angle, n is the periodicity of torsion, δ is the phase angle and d = cos(δ).These three equations are describing the same potential in dierent ways: 7 is simply a re-arranged version of Eq. 5 and Eq. 6 is a re-arranged and expanded version of Eq. 5.
A.1.4Improper dihedral interactions GAFF and OPLS-AA, OPLS-AA-N and OPLS-AA-D use a fourier improper potential, given in Eq. 8, which is implemented with improper_style cv in LAMMPS.This is the potential that was used for the GAFF dihedrals.OPLS-S and OPLS-G uses a harmonic improper potential, given in Eq. 9, which is implemented with improper_style harmonic in LAMMPS.
K ξ is the force constant, ξ is the improper angle, n is the periodicity of torsion and d = cos(δ) where δ is the phase angle. A.1.5

Non-bonded interactions
All the force elds use the Coulomb potential for the non-bonded electrostatic interactions, this is given in Eq. 10.
q is the charge, i and j are the two dierent atoms, r is the distance between atoms i and j and ϵ 0 is the dielectric constant with a default value of 1.0 for a vacuum.
All the force elds use the Lennard-Jones potentials for the non-bonded dispersion interactions, this is given in Eq. 11.
Again, i and j are the two atoms interacting and r is the distance between these atoms.ε is the minimum energy between the two atoms and σ is the distance required between the two atoms for the energy to be zero.
The ε and σ values are specied for the interactions between atoms of the same type (e.g.

C -C interactions)
. Mixing rules are used to calculate these values for any atom pairing (e.g.

C -O and C -N).
For this, all the GAFF force elds use the arithmetic (Lorentz Berthelot) mixing rules given in Eqs. 12 and 14 and all the OPLS force elds use the geometric mixing rules given in Eqs. 13 and 14.For water, the mixing rules were chosen to match the other force eld being used.
A.2 GAFF parameters for urea The GAFF1 and GAFF2 parameters are taken from GAFF version 1.81 and 2.11 respectively, these were obtained using the Antechamber software which is part of AmberTools21. 1 The GAFF-D1 and GAFF-D3 parameters were taken from Özpnar et al. 2 .*The H-N-C-N dihedral is not used in GAFF-D1 and GAFF-D3. 2,3hedrals The bond and angle parameters used with OPLS for urea.K r in kcal mol −1 Å −4 , r 0 in Å, K θ in kcal mol −1 rad −2 and θ 0 in °.The K r and K θ values have been multiplied by OPLS-G, 9,10 OPLS-S 9,10 OPLS-AA, 7,11 H -N -C -N 0.000 4.900 0.000 0.000 H -N -C -O 0.000 4.900 0.000 0.000 Table 9: The dihedral and improper dihedral parameters used with OPLS for urea.K ϕ in kcal mol −1 , d and n are unit less, K ξ in kcal mol −1 rad −2 and ξ 0 in °.The K ξ value has been multiplied by 1 2 and 180°has been added to ξ 0 to match the LAMMPS formatting convention.

C Further Radial Distribution Functions
The OH W and ON RDFs have been discussed in the main paper, here we also provide Literature data is taken from Burton et al. 14 and Soper et al. 15 .
The HO W RDFs in Figure 1 are similar for all the force elds and concentrations, except that OPLS-AA-N has greatly extended peaks and troughs, which also appear earlier.Like with OH W there is the rst peak at < 2 Å, indicating that strong hydrogen bonding is present.There are also weaker second and third peaks appearing at 3.5 Å and > 5.0 Å respectively.The HO W RDFs are similar to those obtained by Burton et al. 14 and Soper et al. 15 , and also from Ishida et al. 16 and Duy et al. 8 but these had separate RDFs for the urea H molecules in both syn and anti arrangements, which were not shown for clarity.The RDFs of Duy et al. 8 indicate that the broader third peak is made up of two pairs of smaller peaks from these (syn peak, anti peak, syn peak, anti peak).For the OO W RDFs, shown in in Figure 2, there is a strong rst peak at > 2.5 Å followed by broader second and third peaks for all the force elds apart from OPLS-AA-N.For the remaining OPLS force elds a small second peak occurs at > 3.5 Å, which plateaus before the third peak appears at 5 Å.For the GAFF force elds a slow rise leads to the second peak which appears later at > 4.0 Å, this peak is higher that that for the OPLS force elds.
This later, taller second peak runs into the third peak at 5.0 Å.This shape is similar to that obtained by Duy et al. 8 and Weerasinghe and Smith 17 .These peaks are very similar for both the dilute and concentrated systems.The overlapping second and third peaks, could be a combination of water molecules not directly bonded to the O on the urea molecule and water molecules hydrogen bonded to the other parts of the urea molecule.OPLS-AA-N has rst and third peaks very similar to those of the other OPLS force elds.However, the second OPLS-AA-N peak is much stronger and more well dened than those of the other force elds, and it Speaks earlier at 3.5 Å.This is not just due to the OO W partial charges (and also those of OH W ), since the charge dierences for OPLS-AA-N lies between those of the other OPLS force elds and GAFF force elds.The NH W RDF, in Figure 3, has an extra peak at < 2 Å for GAFF-D1 and OPLS-AA-N, and which is very weak for GAFF-D3 and absent from the other force elds.This matches a very clear peak in Ishida et al. 16 , whilst there is only a small bump in Soper et al. 15 and no peak in Duy et al. 8 and Burton et al. 14 .This indicates that there are only strong N• • •H W hydrogen bonds presents in GAFF-D1 and OPLS-AA-N, with a few present in GAFF-D3.
These additional hydrogen bonds can be related to the higher density of these force elds.
The rst main peak appears at > 3.0 Å, which also appears in all the reference RDFs.There is a second peak at > 5.0 Å, this not noticeable in the higher concentrations of OPLS or reference solutions, with the exception of OPLS-AA-N where this is enhanced in the higher concentrations.
C.4 NO W Interestingly, the charge on the N atom is more negative than that of the O W atom for GAFF-D1, GAFF-D3 and OPLS-AA-N, but this is the opposite for the remaining force elds.Despite this there is no signicant dierence between the NO W RDF of GAFF-D3 (for which the N and OW charges are very similar) compared to GAFF1 and GAFF2, as shown in Figure 4.The OPLS RDF peaks are generally lower and shifted to the right compared to those of the GAFF force elds.Again, OPLS-AA-N provides the exception, with signicantly taller rst and second peaks, and the rst peak further to the left than that of the other force elds.The GAFF-D1 RDF sits somewhere between that of OPLS-AA-N and the other GAFF force elds, getting more like the GAFF RDFs at higher concentrations, where the water-to-urea ration decreases.In the dilute solutions neither Ishida et al. 16 or Duy et al. 8 match the rst peaks obtained here.However, at the higher concentrations, the Weerasinghe and Smith 17 curve matches the GAFF RDFs well, the Burton et al. 14 curve has a taller rst peak but otherwise is similar to the OPLS RDFs and Soper et al. 15 has similarities to GAFF-D1.The CO W RDFs in Figure 5 have one main peak at > 3.5 Å and a lower broad peak at ≥ 7.5 Å as it is levelling o, the RDFs are similar for both the dilute and concentrated solutions.OPLS-AA-N and GAFF-D1 have signicantly taller rst peaks than the other force elds.In the dilute systems OPLS-AA-N and GAFF-D1 both have an additional peak at ∼ 5.5 Å, this also occurred for Duy et al. 8 this is also present at the higher concentrations for OPLS-AA-N only.The RDFs of Weerasinghe and Smith 17 matches well, so does Burton et al. 14 although the second peak is shifted slightly earlier and this is signicantly earlier in Ishida et al. 16 .

1 4 and 1 2
respectively, to match the LAMMPS formatting convention.* The N -C -N bond is not used in OPLS-AA.
Figure 1: HO W RDFs for (a) the dilute solutions and (b) the concentrated solutions.Literature data is taken from Burton et al. 14 and Soper et al. 15 .

C. 2 Figure 2 :
Figure 2: OO W RDFs for (a) the dilute solutions and (b) the concentrated solutions.Literature data is taken from Ishida et al. 16 , Duy et al. 8 , and Weerasinghe and Smith 17 .

C. 3 Figure 3 :
Figure 3: NH W RDFs for (a) the dilute solutions and (b) the concentrated solutions.Literature data is taken from Ishida et al.16 , Duy et al.8 , Burton et al.14 , and Soper et al.15

Figure 4 :
Figure 4: NO W RDFs for (a) the dilute solutions and (b) the concentrated solutions.Literature data is taken from Ishida et al. 16 , Duy et al. 8 , Burton et al. 14 , Soper et al. 15 , and Weerasinghe and Smith 17 .

C. 5 Figure 5 :
Figure 5: CO W RDFs for (a) the dilute solutions and (b) the concentrated solutions.Literature data is taken from Ishida et al. 16 , Duy et al. 8 , Burton et al. 14 , and Weerasinghe and Smith 17 .

Table 1 :
GAFF parameters for mass in Da and charge in e.

Table 2 :
The GAFF Lennard-Jones parameters used for urea.ε in kcal mol −1 and σ in Å.

Table 4 :
The GAFF dihedral and improper dihedral parameters.K ϕ in kcal mol −1 , n is unit less and d in °.These parameters are the same for all the force elds 13 with one exception.

Table 5 :
OPLS parameters for mass in Da and charge in e.

Table 8 :
The dihedral and improper dihedral parameters used with OPLS for urea.V in kcal mol −1 , K ξ in kcal mol −1 rad −2 and n and d are unit less.

Table 10 :
SPC/E parameters for mass, charge and Lennard-Jones parameters for water.Mass in Da, charge in e, ε in kcal mol −1 and σ in Å.

Table 12 :
Crystal lattice parameters after minimisation of the experimental form I crystal structure.

Table 13 :
Crystal lattice parameters after minimisation of the experimental form IV crystal structure.

Table 14 :
Crystal lattice parameters from the NPT simulations starting in form I. The mean and standard deviation are calculated with a sampling frequency of 0.1 ns. ) Mean St. dev.Mean St. dev.Mean St. dev.Mean St. dev.

Table 15 :
Crystal lattice parameters from the NPT simulations starting in form IV. The mean and standard deviation are calculated with a sampling frequency of 0.1 ns. ) Mean St. dev.Mean St. dev.Mean St. dev.Mean St. dev.

Table 16 :
Crystal potential energy in kJ mol −1 after minimisation of the experimental form IV crystal structure.

Table 17 :
Crystal potential energy in kJ mol −1 for the NPT simulations.The mean value and standard deviation are calculated with a sampling frequency of 0.1 ns.

Table 18 :
Crystal cohesive energy in kJ mol −1 for the NPT simulations.The mean value and standard error are calculated with a sampling frequency of 0.1 ns.

Table 19 :
Solution density mean and standard deviation, calculated with a sampling frequency of 0.1 ns.

Table 21 :
Solution density mean and standard deviation, calculated with a sampling frequency of 0.1 ns.

Table 24 :
Solution diusion coecient, with standard error, for the GAFF force elds.The gradient and associated standard error of the MSD was calculated over the rst 10 ns of the MSD.Trajectory data was used every 0.01 ns for calculating the multi-time-origin MSD.