Plane Waves Versus Correlation-Consistent Basis Sets: A Comparison of MP2 Non-Covalent Interaction Energies in the Complete Basis Set Limit

Second-order Møller–Plesset perturbation theory (MP2) is the most expedient wave function-based method for considering electron correlation in quantum chemical calculations and, as such, provides a cost-effective framework to assess the effects of basis sets on correlation energies, for which the complete basis set (CBS) limit can commonly only be obtained via extrapolation techniques. Software packages providing MP2 energies are commonly based on atom-centered bases with innate issues related to possible basis set superposition errors (BSSE), especially in the case of weakly bonded systems. Here, we present noncovalent interaction energies in the CBS limit, free of BSSE, for 20 dimer systems of the S22 data set obtained via a highly parallelized MP2 implementation in the plane-wave pseudopotential molecular dynamics package CPMD. The specificities related to plane waves for accurate and efficient calculations of gas-phase energies are discussed, and results are compared to the localized (aug-)cc-pV[D,T,Q,5]Z correlation-consistent bases as well as their extrapolated CBS estimates. We find that the BSSE-corrected aug-cc-pV5Z basis can provide MP2 energies highly consistent with the CBS plane wave values with a minimum mean absolute deviation of ∼0.05 kcal/mol without the application of any extrapolation scheme. In addition, we tested the performance of 13 different extrapolation schemes and found that the X–3 expression applied to the (aug-)cc-pVXZ bases provides the smallest deviations against CBS plane wave values if the extrapolation sequence is composed of points D and T, while performs slightly better for TQ and Q5 extrapolations. Also, we propose as a reliable alternative to extrapolate total energies from the DTQ, TQ5, or DTQ5 data points. In spite of the general good agreement between the values obtained from the two types of basis sets, it is noticed that differences between plane waves and (aug-)cc-pVXZ basis sets, extrapolated or not, tend to increase with the number of electrons, thus raising the question of whether these discrepancies could indeed limit the attainable accuracy for localized bases in the limit of large systems.

(eq 9 of the main text) that uses the existing mixed distributed/shared (MPI/OpenMP) parallelization strategy 31,32 of CPMD (CP GROUPS are currently not supported).The work load is divided into blocks.Within a block, a list of summand indices is created.Then, the partial two-electron integrals are saved in an array, and summed across tasks only once the loop has been completed.This allows for the OpenMP parallelization of the outermost loop, leading to a much smaller shared-memory overhead from thread creation.Additionally, inter-and intra-task-communication becomes much cheaper, as one big array is distributed once per block, instead of a single number being distributed for every ijab tuple.This saves a lot of overhead.Then, the summed partial integrals are used to calculate the final MP2c energy.
Note that at the Γ-point, the orbital coefficients can be chosen to be real, introducing the symmetry φi,a(G) = φ * i,a (−G) which can be exploited to speed-up the code.
In order to extrapolate the MP2 correlation energy at the basis set limit (eq 11), E MP2 c,n values are printed out whenever n is incremented by bincr virtual orbitals until nmax.This increment is user-defined; the larger the value, the more efficient the calculation, which however becomes more memory intensive and provides less extrapolation points.In this work, bincr = 100 was found as a good compromise.In addition, nmax is defined sufficiently large as to reach a reliable extrapolation regime according to eq 11 that is valid at high virtual index.Values between nmax = 10000-20000 were used herein for relative energies.A RESTART mechanism is available in order to diagonalize supplementary virtual orbitals and continue the computation of E MP2 c,n for larger nmax.This implementation corresponds to the one used in our recent study on the acceleration of the MP2c energy calculation by stochastic sampling of the virtual space integrands based on Monte Carlo summation. 47In that case, the stochastic sampling is carried out by selecting which ijab tuples will be effectively included in the list for later evaluation and proper renormalization.
for n mod b incr = 0 Wavefunction optimization with HF → E HF , φi (G), ε i ; Diagonalization of the nmax lowest virtuals → φa(G), εa; /* The G vectors are distributed among the MPI tasks for all i, a orbitals */ [ φi,a (G)];       The solid red lines correspond to the smallest maximum deviation obtained with plain Q and 5 zeta basis sets reported in Table 2.The legend is given in Figure 2. The solid red lines correspond to the smallest maximum deviation obtained with plain T, Q or 5 zeta basis sets respectively, as reported in Table 2.The legend is given in Figure 2.   The dashed(solid) red lines correspond to the smallest MAE(maximum deviation) obtained with plain Q or 5 zeta basis sets respectively, as reported in Table 2.The legend is given in Figure 2.   2. The legend is given in Figure 2. Note that the small deviations due to the HF contribution would not counterbalance the results of MP2c.
(a)  2. The legend is given in Figure 2. Note that the small deviations due to the HF contribution would not counterbalance the results of MP2c.Table S3: Quality of interpolation per GTO extrapolation.The mean absolute error (MAE) and root-mean-square deviation (RMSD), as well as the coefficient of determination R 2 are calculated between fitting curves and (aug-)cc-pVXZ data points, always evaluated up to the 5 zeta values (like shown in Figure S8) to reflect the predictive power at larger Xs.Averages on all test systems are reported and include errors on both dimer and monomer energies that are fitted separately.MAE and RMSD are in [kcal/mol].For simplicity and due to the small Helgaker /HF errors, Helgaker /MP2 results are based on the MP2c energies only.3 and 4.

Points
for extrapolation at consecutive b incr end Algorithm 1: Pseudocode for the calculation of the MP2c energy in CPMD.35

Figure S1 :
Figure S1: HF energy of the (a) NH3 monomer and (b) (NH3)2 dimer for different exchange (Coulomb) potentials.Ω is the volume of expanding cubic or orthorombic supercells around the dimer electron density.

FigureFigure S3 :
Figure S2: Formamide dimer -Convergence of the HF and MP2c energy contributions to the total MP2 interaction energy for the (aug-)cc-pVXZ basis sets and different treatments of the BSSE.The CBS PW value is also indicated.

Figure S4 :
Figure S4: Box plots of the differences ∆E MP2GTO − ∆E MP2 PW between extrapolated GTO and PW MP2 interaction energies of the S22* test systems.Medians are shown as horizontal black lines and yellow lines stand for the mean signed deviation (MSD).The solid red lines correspond to the smallest maximum deviation obtained with plain T, Q or 5 zeta basis sets respectively, as reported in Table2.The legend is given in Figure2.

Figure S5 :
Figure S5: Box plots of the differences ∆E MP2 GTO − ∆E MP2 PW between extrapolated GTO and PW MP2 interaction energies of the S22* test systems.If applicable, HF and MP2c contributions to the total MP2 energies have been extrapolated separately.Absolute differences are given in (a) while (b) reports signed values.Medians are shown as horizontal black lines and yellow lines stand respectively for the mean absolute error (MAE) in (a) and the mean signed deviation (MSD) in (b).The dashed(solid) red lines correspond to the smallest MAE(maximum deviation) obtained with plain Q or 5 zeta basis sets respectively, as reported in Table2.The legend is given in Figure2.

Figure S6 :
Figure S6: Box plots of the differences ∆E MP2 c,GTO − ∆E MP2 c,PW between extrapolated GTO and PW MP2c interaction energies of the S22* test systems.Absolute differences are given in (a) while (b) reports signed values.Medians are shown as horizontal black lines and yellow lines stand respectively for the mean absolute error (MAE) in (a) and the mean signed deviation (MSD) in (b).The dashed(solid) red lines correspond to the smallest MAE(maximum deviation) obtained with plain T, Q or 5 zeta basis sets respectively, as reported in Table2.The legend is given in Figure2.Note that the small deviations due to the HF contribution would not counterbalance the results of MP2c.

Figure S7 :
Figure S7: Box plots of the differences ∆E MP2 c,GTO − ∆E MP2 c,PW between extrapolated GTO and PW MP2c interaction energies of the S22* test systems.Absolute differences are given in (a) while (b) reports signed values.Medians are shown as horizontal black lines and yellow lines stand respectively for the mean absolute error (MAE) in (a) and the mean signed deviation (MSD) in (b).The dashed(solid) red lines correspond to the smallest MAE(maximum deviation) obtained with plain D, T, Q or 5 zeta basis sets respectively, as reported in Table2.The legend is given in Figure2.Note that the small deviations due to the HF contribution would not counterbalance the results of MP2c.

Figure S8 :
Figure S8: Examples of fitting curves on aug-cc-pVXZ/CP-corrected data points for the ethene-ethine complex.Including the D zeta point in the sequence for (a) Helgaker and (b) Martin4 deteriorates the interpolation while these are better for (c) Rovibi34 and (d) Rovibi45.

Figure S9 :
FigureS9: Deviations between the cc-pVXZ and PW MP2 interaction energies as a function of the number of electrons Ne in the dimer system.(a) for plain basis sets, (b) for energies extrapolated to the CBS limit with best schemes of Tables3 and 4.

Table S1 :
HF and MP2c contributions to the MP2 interaction energy for some S22 systems and wavefunction cutoff energy E ϕ cut .Energies are in [kcal/mol].rx,y,z are the respective x, y, z ratios of the orthorhombic supercell dimensions with respect to the HF electron density measured at an isosurface of 0.002 a.u., while Ω is the volume of the supercell.σ MP2

Table S2 :
MP2 interaction energies in [kcal/mol] of the S22* test systems, uncorrected and with CP correction for the 5 zeta GTO basis sets.The PW values are given with the standard deviation σ resulting from the two consecutive extrapolations: with respect to the virtual orbitals (eq 38) and the supercell volume (eq 43).Mean signed deviations (MSD) and mean absolute errors (MAE) against PWs are indicated, respectively for each dominant interaction type and over all systems.