Many-Body Methods for Surface Chemistry Come of Age: Achieving Consensus with Experiments

The adsorption energy of a molecule onto the surface of a material underpins a wide array of applications, spanning heterogeneous catalysis, gas storage, and many more. It is the key quantity where experimental measurements and theoretical calculations meet, with agreement being necessary for reliable predictions of chemical reaction rates and mechanisms. The prototypical molecule–surface system is CO adsorbed on MgO, but despite intense scrutiny from theory and experiment, there is still no consensus on its adsorption energy. In particular, the large cost of accurate many-body methods makes reaching converged theoretical estimates difficult, generating a wide range of values. In this work, we address this challenge, leveraging the latest advances in diffusion Monte Carlo (DMC) and coupled cluster with single, double, and perturbative triple excitations [CCSD(T)] to obtain accurate predictions for CO on MgO. These reliable theoretical estimates allow us to evaluate the inconsistencies in published temperature-programed desorption experiments, revealing that they arise from variations in employed pre-exponential factors. Utilizing this insight, we derive new experimental estimates of the (electronic) adsorption energy with a (more) precise pre-exponential factor. As a culmination of all of this effort, we are able to reach a consensus between multiple theoretical calculations and multiple experiments for the first time. In addition, we show that our recently developed cluster-based CCSD(T) approach provides a low-cost route toward achieving accurate adsorption energies. This sets the stage for affordable and reliable theoretical predictions of chemical reactions on surfaces to guide the realization of new catalysts and gas storage materials.

We provide additional supporting data as well as contextual information to the main text here.All output files are provided on GitHub, which contains a Jupyter Notebook file that analyzes the data.This data can also be viewed and analyzed on the browser with Colab.
S1 Discussion of past CO on MgO literature

S1.1 Theory
Most early work applying many-body methods (Fig. 1a of the main text) to the CO on MgO system have employed cluster approaches as it requires smaller system sizes than periodic approaches.Cluster approaches work by placing a finite-cluster within appropriate embedding environments, of which many flavors are possible.For example, Ugliengo et al.S1 embedded the finite clusters mechanically into DFT via a hybrid high-level:low-level quantum mechanical (HL:LL) method (in the fashion of Morokuma's ONIOM S2 ), where second-order Møller Plesset perturbation theory (MP2) was chosen as the high-level.On the other hand, Herschend et al.S3 and Qin et al.S4 embedded the cluster electrostatically into a field of point charges and Staemmler S5 performed a many-body expansion of the correlation energy via the method of increments.These early works were confined to methods with well-known deficiencies, such as the lack of higher-order dispersion effects in MP2.As a result, reliable estimates of E ads cannot be guaranteed even when the electronic structure quality and models are converged -already a challenging task in itself.
The high-level method of choice is coupled cluster (CC) with single, double and perturbative triple particle-hole excitation operators [CCSD(T)].It is considered the 'gold-standard' of quantum chemistry and is capable of reaching sub-chemical accuracy within its domain of applicability.The first study to incorporate CCSD(T) for CO on MgO adsorption was by Boese et al., S6 where it formed a correction (performed on a small cluster) on top of MP2 in the HL:LL approach.This study came to a surprising E ads value of −220 meV, different from earlier experimental and simulation estimates.It was reproduced again in a later study from the same group by Alessio et al., S7 which also incorporated periodic local MP2 S8 calculations to validate the cluster-based approach.Unfortunately, no other studies S9-S13 since have been able to produce such a value.In fact, the newer works involving more sophisticated CC techniques cover a range of almost 500 meV (12 kcal/mol), with one end being a periodic CCSD calculation in 2022 by Mitra et al.S13 and the other end being a cluster CCSD (T)   calculation in 2016 by Mazheika and Levchenko.S11

S1.2 Experiment
While experimental estimates to the CO on MgO E ads have frequently been used to assess the quality of past theoretical work, the large discrepancies between experiments over the years can make reliable comparisons challenging.These past experimental estimates have ranged from weakly physisorbed S14,S15 (∼ −130 meV) to moderately chemisorbed (∼ −400 meV).S14,S15 While the variations in recent estimates (Fig. 1a) have largely settled to between −140 and −240 meV, this range is too large to enable reliable benchmarks.For example, the E ads of the CO on MgO system was considered largely resolved to around −130 meV between 2003 and 2012, with simulations at the time (Fig. 1a) agreeing with estimates by Wichtendahl et al.S16 (alongside several other studies).The agreement was actually fortuitous because the E ads computed from theory was being incorrectly compared to the adsorption enthalpy H ads

S2 CO on MgO E ads estimates from the literature
The adsorption energy (E ads ) and adsorption enthalpy (H ads ) for both simulation and experimental work from the past years are presented in Table S1.Experiments provide either H ads or activation energies E act (for the case of TPD).We have converted these quantities from experiment into an E ads with the formula: H ads = E ads + E cor , where E cor is estimated to be 19 meV (see Sec. S9).These experimental estimates are plotted in Fig. 1a of the main text.
It should be distinguished that in Sec.S9.1, we will also include an additional correction for the pre-exponential factor on top of the E ads currently given in Table S1.For the E ads from simulations, we also performed the (opposite) conversion to H ads .Table S1: The adsorption energy E ads and adsorption enthalpy H ads (the activation energy E act is given instead for TPD) of previous computational and experimental work in meV.Additional details (including method and publication year) are given for each study.

Reference
E ads H ads /E act Method Year Details Experiment Furuyama et al.S14 -184 -165 Isothermal adsorption 1978 Paukshtis et al.S15 -174 -155 IR spectroscopy 1981 Henry et al.S18 -439 -420 Auger spectroscopy 1991 He et al.S19 -448 -429 Isothermal adsorption 1992 Wichtendahl et al.S16 -156 -140 TPD 1999 pre-exponential log(ν) = 13 Dohnálek et al.We used the adsorption energy E ads determined in Sec.S6 for the cluster CCSD(T) technique as the reference.Alongside this, we also calculated binding curves with cluster CCSD(T) of the CO molecule as a function of distance from the surface, both frozen at their revPBE-D4 geometries in the combined CO adsorbed on MgO system, to determine a reference Mg -C distance for comparison to DFT.The reference lattice parameter was taken from experiment.S25 As shown in Table S2, we find that revPBE-D4 is the most appropriate DFT functional as it provides the best agreement to the reference E ads and lattice parameter alongside reasonable Mg -C distance, all at a low computational cost.S5 Computational details and convergence

S5.1 Cluster CCSD(T)
Our cluster calculations use an electrostatic embedding approach.The overall system (Fig. 1 of the main text) consists of a central quantum(-mechanically treated) cluster placed within an embedding environment.This embedding environment models the long-range electrostatic interactions of the surface by placing point charges within a 60 Å radius from the central CO molecule on the periodic 4L slab.These point charges are placed at the Mg and O ion crystallographic positions, taking formal oxidation values, with an additional outer layer of (optimized) point charges that enable the Madelung potential to be accurately represented at the quantum cluster.In the vicinity of the quantum cluster (<4 Å), the +2 point charges on the Mg ion sites are capped with the ECP10SDF effective core potential from the Stuttgart/Cologne group.S26 It prevents electron leakage from the dangling bonds of the O ions in the quantum cluster boundary.These were all constructed using py-ChemShell.S27 The SKZCAM protocol program suite.The very tight or "vtight" LNO local correlation thresholds were employed to closely approach conventional MP2 and CCSD(T) results.In particular, recent advancements in LNO-CCSD(T), such as its redundancy free local MP2 S33 and (T) S29 expressions, and a CCSD(T) code S32 redesigned specifically for the LNO method make it uniquely efficient and memory economic for large systems S31,S36 including also ionic and surface interactions.S28,S37 We included the sub-valence correlation of the 2s and 2p electrons on the Mg atom with the corresponding cc-pwCVnZ S38,S39 basis set.For all other atoms, we used the aug-cc-pVnZ S40 basis set.The def2-QZVPP-RI-JK basis set was used as the auxiliary basis function for the Hartree-Fock (HF) computations.The RI auxiliary basis sets S41,S42 corresponding to the AO basis sets were used for the local cWFT calculations but the automatic auxiliary basis functions of Stoychev et al.S43,S44 were generated for the Mg basis sets.Complete basis set (CBS) extrapolation parameters for the TZ and QZ pair, CBS(TZ/QZ), taken from Neese and Valeev, S45 were used for the HF and correlation energy components of the cWFT total energy.

S5.2 Periodic CCSD(T)
The periodic coupled cluster theory calculations are performed using the Cc4s code, S46 which is interfaced to the Vienna ab initio simulation package (VASP).S47 The calculations are performed in several steps involving Hartree-Fock and MP2 theory to obtain corresponding S-9 energies and optimized approximate natural orbitals.S48 Once the natural orbitals have been computed, the Cc4s interface to VASP is used to compute intermediate quantities S49 that are needed for the subsequent coupled cluster energy calculations including the corresponding finite size S50 and basis set corrections.S51 As discussed in Sec.S6, the final E ads with periodic CCSD(T) is dominated mainly by the CCSD(T) interaction energy on a 2L MgO slab (E ).In Table S4, we show the key contributions that make up . Besides a HF (E HF int,2L ), CCSD correlation (E CCSD int,2L ), and perturbative triples correlation (E interaction energy, there are also finite size corrections and basis set incompleteness error correction, denoted as ∆ CCSD FS and ∆ CCSD FPC respectively.In Ref. S52, all individual steps are described when combined with an embedding approach (which was not employed for the present system) .We chose a plane-wave cutoff parameter of ENCUT = 500 eV.If not stated otherwise, a Γ-centered 1×1×1 k-mesh is used to sample the first Brillouin zone.These contributions are obtained using 10 unoccupied natural orbitals per occupied orbital.For the present system containing the CO molecule adsorbed on the 2L MgO surface, this corresponds to 117 occupied spatial orbitals and 1170 unoccupied spatial orbitals.The computational cost for a CCSD(T) calculation of the adsorbed CO system including an MP2, CCSD and (T) step is about 2 kCPUh, 6 kCPUh and 80 kCPUh, respectively.The memory requirements are also significant; the CCSD and (T) calculations require about 3 TB of distributed memory.It should be noted that ∆ CCSD FPC depends on estimates of the MP2 pair energies, which are computed using 100 natural orbitals per occupied state and an automated extrapolation to the CBS limit.
We have checked convergence of the computed interaction energies with respect to all employed parameters and found that the only significant uncertainty originates from the basis set incompleteness error in the estimates of E CCSD corr.

S5.3 Periodic DMC
All calculations were performed in CASINO.S53 We used ccECP pseudopotentials S54,S55 for all of the atoms.The Jastrow factor included two-body electron-electron (e-e) term, twobody electron-nucleus (e-n) terms, and three-body electron-electron-nucleus (e-e-n) terms.
These variational parameters were optimized by minimizing the variance for the bound structure (either 2L for the 10 valence electron Mg ccECP or 4L for the 2 valence electron Mg ccECP), with the Jastrow subsequently used for all other systems, whether unbound or bound.PWSCF S56 was used to initialize the wavefunction using the LDA functional.We ensured the convergence of the timestep τ for both the 10 electron and 2 electron Mg ccECP.
As seen in Table S6, the 2 electron pseudopotential was already converged at τ = 0.1.On the other hand, the 10 electron ccECP requires a smaller timestep of τ = 0.03.We incorporate the model periodic coulomb (MPC) correction S57-S59 for a shorter duration than with plain Ewald interaction because it is very expensive (around 6× more).Fortunately, its correction up from Ewald requires significantly fewer steps to reach the same statistical errors.We note that the results in Table S6 do not include this correction (and other contributions discussed in Sec.S6), hence the lower value than shown in the main paper.All quoted statistical errors are given as two standard deviations (2σ) in the main text.
Table S6: The convergence of the DMC interaction energy E int in meV with timestep (0.1, 0.03 and 0.01 au) for the 2 valence electron Mg ccECP and 10 valence electron Mg ccECP on the 4L and 2L slabs respectively.These calculations were performed without the MPC correction.

S6 Details on E ads estimates and computational costs
As shown in Table S8, the final E ads for each of the three techniques [cluster CCSD(T), periodic CCSD(T) and periodic DMC] is composed of several contributions, with ∆ geom having been discussed previously.Here, we have attempted to make highly conservative estimates of the error bars to the most important contributions for all three approaches.
The root sum square of all these errors were then taken as the final error for each method, as the individual error contributions are assumed to be uncorrelated.
For similar reasons to the cluster CCSD(T) technique (discussed in the main text and Sec.S7), this decomposition has been made because it is not possible to directly compute E int accurately at the converged 4L (4 × 4) slab with either DMC or CCSD(T).For both methods, we have only been able to apply them to a smaller 2L slab, which was cleaved from the original 4L slab.This contribution forms the majority of the final E ads for both CCSD(T) and DMC, given by E

CCSD(T) int,2L
and E DMC int,2L respectively.In the vein of Pople's model chemistry, S61 we have computed the remaining (much smaller) contributions with more computationally economical methods.A pseudopotential (PP) is utilized for both periodic calculations, being either a PAW potential for periodic CCSD(T) or an effective core potential (from the ccECP family S54,S55 ) for periodic DMC.Here, the number of valence electrons associated with the Mg PP is another controllable factor which affects the accuracy and cost.In particular, we show in Sec.S8 that besides the 3s, inclusion of 2s and 2p electrons to form a 10 valence electron Mg PP is necessary to match all-electron calculations.
For DMC, we used a more economical form of DMC as the lower level, hereafter referred to as LL-DMC, that uses a 2 valence electron Mg PP, which (i) results in fewer electrons and (ii) allows for a larger timestep of τ = 0.1 that remains converged (as shown in Sec.S5.3).
The 10 valence electron Mg PP DMC calculation on the 2L MgO slab E DMC int,2L was corrected up to a 4L MgO slab in the ∆ LL-DMC 2L → 4L term with LL-DMC.We also used LL-DMC account for finite size errors (FSE) with the ∆ LL-DMC FSE term using a Model Periodic Coulomb (MPC) correction S57-S59 in the 4L MgO slab.The final (small) ∆ LDA IPFSE contribution is a correction Table S8: Comparison of the final E ads estimates between the cluster CCSD(T), periodic CCSD(T) and periodic DMC techniques.We give the walltime in kCPU-hours and maximum RAM usage in GB.No RAM usage has been given for DMC because it uses a negligible amount relative to CCSD(T).The components making up the final estimates are also shown (see text for description).Here, conservative estimates have been made for the errors of key components, which we discuss in Secs.S5, S6 and S7.The final error in E ads is taken as the root sum square of these errors.
Cluster CCSD(T) Contribution (meV) Cost (kCPUh) RAM (GB) for Independent Particle FSE S62 (IPFSE) computed at the LDA level going from a 1 × 1 × 1 Gamma-centered k-point mesh to a 3×3×1 mesh.Besides the final term, which is very small, statistical (2σ) error bars can be estimated for all of the above terms, which are expected to be much bigger than the systematic errors arising from using a lower level theory.
In periodic CCSD(T), we use MP2, CCSD and Hartree-Fock (HF) as lower level theories.
As discussed in Sec. .
We discuss the details of the individual terms to the cluster CCSD(T) E ads in the next section.

S7 SKZCAM protocol and its convergence
As discussed in the main text, the final estimated adsorption energy E ads arising from the cluster CCSD(T) technique was composed of an MP2 contribution E MP2 int,bulk-lim followed by a correction up to CCSD(T), termed ∆CC.These quantities can be computed accurately and in a cheap manner using the SKZCAM protocol and LNO-CCSD(T).S28 Here, this protocol produces a set of clusters of systematically increasing size, for which we can compute the interaction energy E int at several levels of theory (MP2, CCSD(T) and their local variants) at different basis sets.In Table S9, we compute the E int for several levels of theory along some of the clusters generated by the SKZCAM protocol and describe in the next two subsections how E MP2 int,bulk-lim and ∆CC are estimated.

S7.1 Extrapolation to MP2 bulk limit
The smooth convergence with cluster size (see Fig. 2b of the main text) allows for the bulk limit to be estimated via an extrapolation.We used the following formula for the relationship Table S9: The interaction energy E int (in meV) computed along the series of clusters produced from the SKZCAM protocol S28 for the CO on MgO system.We provide estimates for MP2 with double-zeta (DZ), triple-zeta (TZ) and quadruple-zeta (QZ) basis functions (see Sec. S5).The complete basis set (CBS) limit for the TZ and QZ pair is computed for MP2 and canonical CCSD(T), while the CBS limit for the DZ and TZ pair is computed for local MP2 (LMP2) and LNO-CCSD(T).

MP2
where A, B and γ are parameters to be fit via linear regression, with A being the bulk limit E int, bulk-lim .In Table S10, we show the estimate of E int, bulk-lim as more clusters are included in the linear regression fit at the DZ and CBS(TZ/QZ) basis sets at the MP2 level; QZ calculations were only feasible up to 5 clusters if we wanted to stay within a 2 day walltime on a single node.An estimate on the errors for only including 5 clusters in the MP2 bulk limit extrapolation can be estimated from observing the convergence of the DZ basis set, for which calculations can be performed on much larger clusters.As seen in Table S10, the difference for using 5 clusters w.r.t. the converged value (from extrapolating 8 clusters) is only 5 meV, which we take as an estimate of its error in E int, bulk-lim in the main text; this error is expected to be a conservative estimate because the convergence with the DZ basis set is slower than CBS(TZ/QZ).
Table S10: The extrapolated bulk limit interaction energy E int, bulk-lim at the MP2 level for the DZ and CBS(TZ/QZ) basis sets as we include more clusters (from the SKZCAM protocol) in the extrapolation.The error w.r.t. to the largest set of clusters ( 8) is also computed for the DZ basis set.All energies are given in meV. Cluster

S7.2 ∆CC correction
The steep cost of canonical CCSD(T)/CBS means that it can only be performed on the smallest one or maybe two clusters generated by the SKZCAM protocol.With local approximations, such as with LNO-CCSD(T), the feasible cluster sizes which can be studied are significantly extended.Additionally, the computation of accurate bulk-limit (extrapolated) CCSD(T) estimates can be formulated efficiently by adding the difference between the CCSD(T) and MP2 E int estimates (∆CC correction), to E MP2 int,bulk-lim .Since well-converged ∆CC corrections can be obtained using only the first few clusters of the SKZCAM protocol, the corresponding individual LNO-CCSD(T) computations become very economical.For example, it required less time on a single (many-core) computer node than the typical 1-2 days wall time limits accessible in commodity high-performance computing clusters.
As seen in Table S11, the chosen vtight LNO settings gives LMP2 estimates that are within 1 meV w.r.t.canonical MP2 for the first three clusters.This was also confirmed between LNO-CCSD(T) and canonical CCSD(T) for the first cluster.Importantly, we find that we do not need to go beyond the first three clusters of the SKZCAM protocol with LNO-CCSD(T) or LMP2 because the ∆CC local difference between LNO-CCSD(T) and LMP2 stays consistent to within 3 meV across these three clusters.We have chosen to make ∆CC the average ∆CC local from these three clusters with an error bar of 3 meV (in Sec.S6) corresponding to the largest observed deviation across these clusters.
Table S11: The interaction energy E int at the local MP2, LNO-CCSD(T), (canonical) MP2 and (canonical) CCSD(T) levels.The ∆CC correction is computed for both the local and canonical levels of theory.The complete basis set (CBS) limit for the TZ and QZ pair is computed for MP2 and canonical CCSD(T), while the CBS limit for the DZ and TZ pair is computed for local MP2 (LMP2) and LNO-CCSD(T).All energies are given in meV.

S8 Analysis of prior CO on MgO simulations
The differences in our estimated values of E ads w.r.t.previous simulations can be identified to arise from factors such as: (i) frozen core size, (ii) basis set size, (iii) basis set superposition error (BSSE) and (iv) use of unconverged cluster sizes.If all of these settings are accounted for, then any differences can be attributed to the inadequacy of the chosen level of theory.
Table S9 shows that using a smaller (ii) basis set leads to a weaker binding (i.e. a less negative E int ) compared to the CBS limit, assuming counterpoise corrections.Assuming no counterpoise corrections (CPC) to fix for (iii) BSSE, we find that the E int strongly overbinds, predicting a MP2 CBS(DZ/TZ) E int of −294 meV for the 5 th cluster relative to −196 meV when CPC is included.For this 5 th cluster, we have also found that a larger (i) frozen core (i.e. only correlating valence 3s but not 2s or 2p electrons) on the Mg causes a weaker binding, giving E int of −165 meV.Finally, using an unconverged cluster size also typically causes weaker binding, as already demonstrated in Table S9, where E int becomes more negative (i.e.exothermic) with larger clusters.Many embedded cluster studies specifically use the cubic 3 × 3 × 2 (Mg 9 O 9 ) cluster in their studies and it also provides weaker binding, giving a CBS(TZ/QZ) binding energy of −184 meV at the MP2 level relative to the final estimated value of −200 meV in the CBS(TZ/QZ) bulk limit.
We have assessed many of the previous simulations against these four factors in Table S12 and have been able to understand their trends relative to our converged E ads estimates.For example, the three early studies by Ugliengo et al., S1 Herschend et al.S3 and Qin et al.S4 all do not include 2s and 2p electrons in their correlation treatment of the Mg atom while using small unconverged basis sets, leading to a weakening of the binding.The study by  There are fewer assumptions involved with the three TPD estimates by Wichtendahl et al., S16 Dohnálek et al.S17 and Sterrer et al.S22 As a result, there is excellent agreement between the three TPD experiments and our cluster CCSD(T) estimate (Fig. S1).This agreement is made possible by converting the TPD activation energy E act quoted in the original literature to an adsorption energy E ads (with an intermediate step to convert it to an adsorption enthalpy H ads ) as shown in Fig. S1b and discussed in more detail below.
We also explain why we expect the studies by Wichtendahl et al.  and Sellers S66 These three terms form a correction term which we call E cor .The ∆ terms represent differences in ZPV or thermal energy contributions between the molecule from the combined molecule-surface system (with the surface degrees of freedom frozen) and the gas-phase molecule.
We used the quasi-rigid rotor harmonic oscillator approximation (quasi-RRHO) to compute both ∆ ZPV and E th (T).These were first proposed by Grimme et al.S68 for entropies and then adapted for enthalpies by Li et al.S69 We leave the description of the details to these two references.In Table S13, we have calculated the ∆ ZPV , ∆ th (T) and RT contributions to the E cor for the set of 6 DFT functionals (PBE-D2 [Ne], revPBE-D4, vdW-DF, rev-vdW-DF2, PBE0-D4 and B3LYP-D2 [Ne]) that we have used throughout this study.On the MgO surface, the surface degrees of freedom have been fixed.The average E cor value was the average taken from all 6 DFT functionals.There was only a standard deviation of 1 meV, suggesting E cor is not particularly dependent on the DFT functional, despite E ads depending significantly (Table S2).The resulting E cor of 19 meV is in agreement with Boese et al.S6 As discussed in their work, the anharmonic effects are expected to be relatively small compared to the errors intrinsic in experimental estimates.

S9.3 Comparing final experimental estimates
For clarity, we give the final TPD estimates (with ν corrected) in Table S14 below.
measured from experiments.As discussed in Sec.S2, removing temperature and zero-point effects contribute a −19 meV shift to convert H ads to E ads , making the agreement much worse (see Fig. 1a of the main text).On the other hand, later work in 2013 by Boese et al.S6 with the MP2+∆CC:PBE-D2 approach found excellent agreement to a separate TPD estimate by Dohnálek et al.S17 in 2001, while agreement was unsatisfactory against Wichtendahl et al..
S28  provides the framework to generate a series of quantum clusters with rapid and systematically converging properties.For each cluster, it uses robust and chemically intuitive rubrics to first select the Mg cations, followed by the O anions.The Mg cations arrange as shells, containing symmetry-related equidistant cations around the C atom of the CO molecule.Clusters (of only Mg cations) can be constructed by progressively incorporating more of these shells.The O anions are then placed to fully coordinate all of the Mg cations.The resulting set of clusters are always negatively charged as there will be more O anions than Mg cations; each Mg and O ion contributes +2 and −2 to the charge of the cluster.Localized orbital correlated wave-function theory (cWFT) calculations on the quantum cluster were performed with the local natural orbital (LNO) CCSD(T) S29-S32 and local Møller Plesset perturbation theory (LMP2) S33 implementations of Nagy et al. in the Mrcc S34,S35 Mg PP) −142 ± 18 −144 ± 19 2L (10 electron Mg PP) −140 ± 22 −159 ± 14 −161 ± 25 S5.4 DFT Periodic supercell calculations with DFT were performed in the Vienna Ab initio Simulation Package (VASP).S47,S60 The (001) MgO surface calculations employed an asymmetric fourlayer slab with the top two layers allowed to relax to form the pristine surface for a (4 × 4) supercell, with 15 Å of vacuum between the two surfaces and (2×2×1) Γ-centred Monkhorst-Pack k-point sampling.We chose a plane-wave cutoff parameter of ENCUT = 600 eV., with the 8 valence electron Mg pv PAW potential used for Mg, while standard PAW potentials were used for all other atoms.As shown in Table.S7, we confirmed that all of these (a) chosen settings are converged to within 3 meV by comparing against (b) a larger supercell, (c) more slab layers and (d) more expensive electronic structure settings, as shown in Table.S7.
and Dohnálek et al. to be more accurate than the one by Sterrer et al.. Hence, this is why the average of these two experiments forms the (best) experimental estimate given in Fig. 1 of the main text (−198 ± 19 meV).

Figure S1 :
Figure S1: (a) Comparison of recent temperature programmed desorption (TPD) and Fourier-transform infrared (FTIR) spectroscopy experiments on the adsorption energy of CO on MgO.These values were compared to the reference cluster CCSD(T) calculation from this work and we find that the TPD experiments from Dohnálek et al.S17 and Wichtendahl et al.S16 obtained the closest agreement.(b) This agreement required accounting for several effects to convert the original activation energies (from TPD experiments) to appropriate adsorption energies which we discuss in the text.
et al. and Dohnálek et al. to give the most accurate E ads that reproduce the dilute limit.The final experimental E ads value in Fig. 1 of the main text (−198 ± 19 meV) is taken to be the average of the two studies, with the error bar taken to arise from the 2σ error in ν discussed in Sec.S9.1.

Table S2 :
Comparison of the adsorption energy E ads , lattice parameter and Mg -C distance between DFT and reference values (discussed in text).As discussed in the main text, the adsorption energy E ads is partitioned into an interaction energy E int and geometrical relaxation term ∆ geom .This partition has been chosen to enable

Table S3 :
Comparison of the (true) adsorption energy E True ads from using the appropriate DFT functional geometry and (approximate) adsorption energy E Approx.

Table S4 :
The contributions making up E

Table S5 :
The change in the CCSD interaction energy combined with basis set corrections (E CCSD corr.int,2L-flat + ∆ CCSD FPC, 2L-flat ) against number of natural orbitals per occupied state (NOs/occ) for CO adsorbed on a flat (non-corrugated) 2L MgO surface.The difference w.r.t. the largest 15 NOs/occ calculation is given for the smaller basis sets.All energies given in meV.
occupied state for a perfect flat surface (i.e.bulk truncated) in the initial phase of this project.Our findings in TableS5show that this changes the interaction energy by 22 meV.Due to the observed rapid convergence of E CCSD corr.+∆CCSD FPC with respect to the number of natural orbitals reported for a wide range of molecular benchmark systems in Ref.S51,we conclude that ±22 meV is a very conservative estimate of the CCSD contributions to the interaction energy reported in this work.The other potential source of error is that the CO molecule in the unbound structure has only been shifted by 5 Å w.r.t. the bound structure, but at the rev-vdW-DF2 level, we find that this changes E int,4L by only 5 meV as opposed to the first definition of E int in Eq. 2 of the main text, equivalent to an infinite shift.

Table S7 :
The change in E ads of the revPBE-D4 DFT adsorption energy E ads for the (a) 4×4 supercell with 4L slab as the (b) supercell size is increased, (c) layer number is increased and (d) energy cutoff, grid precision and k-point density is increased.
4L .For the latter correction, we have employed the POTCAR file labeled Mg sv GW

Table S12 :
Staemmler et al.S5so far seem to have converged all four factors, probably suggesting that the CEPA method used is inadequate; it is formally based CCSD, which is a method that produces weaker binding than CCSD(T) for CO on MgO.S6The studies by Boese et al. and Alessio et al. both agrees with our estimated values to within about 20-30 meV because their studies have confirmed the sufficient convergence of all four factors.The study by Mitra et al. as well as by Heuser et al.S12 strongly overbinds the E ads because they have not included CPC to account for the BSSE.Finally, the studies by Li et al.S10 and Mazheika and Levchenko S11 predict weaker binding because they use a smaller unconverged basis set, with Mazheika and Levchenko predicting the weakest binding (in fact repulsion), because it uses a small cluster and does not include 2s and 2p electrons in the correlation treatment.Tableindicatingwhether the (i) Frozen core, (ii) Basis set, (iii) Basis superposition error and (iv) Cluster size have been converged or corrected in the past simulation work.We also include here their values of E ads from TableS1, which can be compared to our fully converged cluster CCSD(T) value of −199 ± 11 meV.
S21,S63-S65As seen in Fig.S1, the TPD experiments can all reach closer agreement to the cluster CCSD(T) value than FTIR, with both FTIR E ads estimates (given by orange markers), performed by Spoto et al.,S20,S21significantly underestimating the cluster CCSD(T) numbers.This underestimation can be attributed to two uncontrolled approximations used in the FTIR analysis: (i) IR intensities are proportional to CO coverage and (ii) H ads is independent of coverage.Due to dipole-dipole interactions between CO molecules, H ads decreases with CO coverage on many surfaces (including MgO) and the FTIR assumptions would thus result in smaller (i.e. less negative) H ads .

Table S13 :
The contributions (∆ ZPE , ∆ th and RT) that make up E cor for the 6 different functionals.All energies are given in meV.∆ZPE ∆ th (T) RT E cor

Table S14 :
The adsorption eS17gy E ads for the temperature programmed desorption experiments with ν corrected.The CO surface coverage of the separate studies is also given in terms of monolayer (ML) coverage .Both the works of Wichtendahl et al. (coverage of ∼0.3 monolayer (ML)); and Dohnálek et al. (coverage of ∼0.3 ML) reach sufficiently low coverage where the adsorption energy isexpected to be close to the dilute limit.As shown in TableS15, a coverage of 0.25 ML is only 1 meV away from the result of 0.125 ML.We note here that the reason why Sterrer et al. has an E ads that is less negative (i.e.weaker binding) than the other two TPD studies is because it quotes a value at relatively high CO coverage, close to a full monolayer (1 ML).As shown byDohnálek et al.,S17this is expected to cause a strong weakening in E ads the closer the CO coverage gets to 1 ML.For this reason, we consider the TPD studies by Wichtendahl

Table S15 :
The change in CO on MgO adsorption energy E ads with CO coverage.These values have been computed at the rev-vdW-DF2 level and are quoted in meV.