Analyzing the Local Basis Set Superposition Error -- a case study of CO adsorption on rutile(110)

The aim of this letter is to introduce a new method to analyze the local basis set superposition error (LBSSE) using local counterpoise corrections (CP) in order to converge a basis set for a given compound. Using this approach, we are able to define a basis set composition for a CO molecule adsorbed on rutile(110), which can be regarded as complete. In addition to a new LBSSE analysis, adsorption energies of CO on rutile(110) at the MP2 and CCSD level of theory are presented.


Introduction:
Nowadays, most of theoretical surface science studies use density functional theory (DFT) to calculate adsorption energies or reaction barriers. In general, DFT works surprisingly well in most cases. Therefore, we also "jumped" on the non-stoppable "DFT train" for most of our studies. Nevertheless, in sense of the famous quote "Finding the right answer for the right reason" it is necessary to benchmark these DFT studies at a higher and especially more systematic level of theory. The ultimate goal or also referred to as "holy grail of quantum chemistry" is full configuration interaction (full CI) in a complete basis set (complete CI). Unfortunately, for systems with more than ~20 electrons this method or even just the full CI is computationally much too demanding to be used on modern high performance computers 1 . Therefore, other high level methods that cover electron correlation are utilized. Two of the most famous ones are i) coupled cluster theory or more precisely CCSD(T) which is referred to as the "gold standard of quantum chemistry" and ii) 2 nd order Moller-Plesset perturbation theory (MP2). The later one might not be as accurate, but covers some aspects of electron correlation for a very low price (in sense of computation time).
A problem that most of standard quantum chemical correlation methods share is the computational cost. The problem is much worse than known from text books. In these books, most times just the scaling of correlation methods is mentioned. The formal scaling for some of the previous mentioned methods is ( ) where is the formal system size and = 4 for Hartee Fock, = 5 for MP2, = 6 for CCSD and = 7 for CCSD(T). Note that this is just a formal textbook scaling. The real scaling will depend on the implementation and will most likely be dependent on the number of occupied and the number of virtual orbitals. Another problem, which is not too often mentioned in textbooks, is that in order to give accurate results for these correlation methods, bigger basis sets need to be considered. Thus, not just gets bigger, if a higher level is considered, but additionally has to be increased. Therefore, it might be a clever idea to increase the basis set just at centers, where an increase is necessary and to use for chemical non-interesting parts of the system a smaller basis set. If adsorption of a molecule on a surface is investigated and adsorption energies should be calculated, a common approach is to use a bigger basis set for the adsorbate and the adsorption center and decrease the size of the basis set, when moving away from this center. This way, one hopes to converge the basis set where it is necessary, but have a reduced basis set at regions that are chemically not of interest.
In this letter, we will discuss how well the commonly used approach works for converging the basis set composition for a given system. We will show how one can utilize the counterpoise correction (CP) 2 of Boys and Bernadi to analyze the local BSSE. The CP here is just used in order to achieve convergence of the so-called basis set incompleteness error (BSIE) without using CP in the end. In addition, we will very briefly discuss the method of increments 3-4 and will report MP2 and CCSD adsorption energies for CO on rutile(110).

Basis set superposition and basis set incompleteness errors
In general, a complete basis set (CBS) is a basis with an infinite number of one-particle functions. Since we would not be able to compute an infinitely large space, one commonly reduces the size to a finite basis. The error that occurs by introducing the truncation of our space is called basis set incompleteness error (BSIE). The magnitude of the BSIE can be defined as the difference of the energy of a system using a complete basis set or a finite basis set .
Nowadays, there are different methods available accounting for the BSIE. Examples are the CBS methods of Petersson and coworkers 5 or Feller 6 . A benefit of calculations of adsorption energies of molecules on surfaces is, that the BSIE is partially accounted for by error cancelation, as briefly mentioned in the introduction. Basis functions that are located further away from the adsorption center do contribute less or at some point even do not contribute to the adsorption energy. Therefore, in a good approximation, the BSIE can be neglected for these centers.
But when considering the interaction of two separated fragments, e.g. an adsorbate A and a substrate B, another basis set related problem comes into play. When both systems interact, the basis functions localized at one fragment contribute to the description of the other fragment and vice versa. Thus, the basis set for the interacting systems is virtually increased.
There are different methods available that try to compensate for this error, but the widely applied one is the counterpoise correction by Boys and Bernadi 2 which can be written in its geometry independent form as follows: Here, the counterpoise corrected energy is calculated in terms of the difference of the entire system in its basis set ( ) and the energies of the fragments in the basis set of the entire system ( / ) in contrast to the non-corrected energy where the energy of the fragments in their own basis / ( / ) is considered. Although this method is widely used, there is still a debate, whether the CP correction is responsible for introducing other errors 7-8 . In addition, correcting for the BSSE might not completely compensate for the BSIE, and therefore a CP corrected adsorption energy might be as far away from the correct solution as a non-corrected value for too small basis sets. Therefore, the goal of this letter is to present a method where the CP correction is utilized in order to converge to a basis set, where no CP correction is needed.

Short excursion: The method of local increments
Before we introduce our scheme how to analyze the local BSSE and utilize this technique to converge our system almost BSIE free, let us first introduce the method of local increments [3][4]9 . We do so for two reasons. First of all, the method of local increments is a fragment based method utilizing a many-body expansion (MBE). The influence of the BSSE using a MBE has been extensively discussed within the past years [10][11][12] . The second reason is that this method inspired our analysis method introduced in the next chapter.
The method of increments, first introduced by Stoll 3-4 , is a MBE that helps reducing calculation time when high level energies need to be achieved. One way to use the method is to first calculate the energy of the entire system at the HF level of theory . Then just the correlation part of the energy is treated using the MBE.
is defined in terms of one-body ε , two-body ε , threebody ε and higher order contributions according to Equations 6 and 7. A one-body term can be understood as a group of localized orbitals. In substrates, often all orbitals localized at one atom of the surface is chosen as a one-body increment, where it might be useful to define all orbitals localized at the adsorbed molecule as a single one-body increment.
In several publications about adsorption energies of small molecules on oxide surfaces [13][14][15] it has been mentioned that the method of increments can be truncated either after the second or the third order to achieve adsorption energies that are almost identical to those calculated the conventional way. We tested this for CO adsorbed on rutile(110) which we investigated previously [16][17] . In this underlying work, we found Eads = -1.61 eV (Eads(CP) = -0.72 eV) with the method of increments and Eads = -1.65 eV (Eads(CP) = -0.73 eV) the conventional way at the MP2 level of theory (see supporting information (SI)). Since the scope of this letter is not to introduce this method, all further computational details can be found in the SI. In addition, we were able to predict the CCSD adsorption energy of CO on rutile(110) with Eads = -1.60 eV (Eads(CP) = -0.70 eV) using the method of local increments. All information about the computational setup can be found in the supporting information.
At his point it is worth to mention that it might not be beneficial to use the method of local increments at the MP2 level of theory for system sizes comparable to our example. A conventional calculation will be faster because of the computational scaling.
For equal-sized orbital groups/increments, the method of local increments has a formal scaling of: Where is the abstract system size, is the number of orbital groups (number of one-body increments), is the maximal considered order (e.g. if =2 then just one-and 2-body increments will be considered) and is a correlation method dependent exponential factor. E.g. =5 for MP2, =6 for CCSD or =7 for CCSD(T). Therefore, the method will formally just save computational time, if Equ. 9 is fulfilled.
This consideration totally neglects the real scaling of the computational methods, which depend on the implementation. Nevertheless it gives a good starting point for considering weather it is worth using the method of increments or not.
In recent years, the method of increments and additionally the more general MBE were quite frequently used in different studies e.g. adsorption on surfaces [13][14][15] or interaction of molecules like water monomers 18 and polymers [19][20] . Not all studies show that MBE based methods will by definition work well [10][11] , but show up limitations and problem while dealing with solving some of the issues especially the impact of the BSSE 10, 21 . In addition, Manby and co-workers showed how to utilize the method for a two-body energy correction scheme [22][23] . In their method, they do not use the method of increments as a MBE, but utilize twobody increments for energy corrections of lower level computational methods. In a similar fashion, the method of local increments inspired us to create a simple LBSSE analyzation tool as introduced in the next section.

Local Basis set superposition error analysis
Similar to conventionally calculated adsorption energies, the ones calculated with the method of local increments will exhibit a BSSE. In order to account for this, the previous mentioned CP correction can be used. We redefine Equation 4 and define the local BSSE energy (LBSSE) for a given orbital group as follows: With: Here Δ and Δ / are differences between one-body increments for the interacting systems ( ) and the noninteracting systems ( ( ) or ( )) in the basis set of the entire system AB or basis set of the fragments A/B, respectively. One could define a similar error for two-or higher-body increments, but there is no benefit in this. In our study, we define all orbitals localized at a substrate atom or all orbitals localized at the (CO) molecule as a one-body increment. When analyzing the LBSSE, one finds the BSSE for a local atomic environment or local molecular fragment, depending on the definition of the one-body increment or orbital group, respectively. Since quantum chemists are used to define basis functions atom wise in most modern quantum chemistry program packages, this leads to an intrinsic understanding, when also analyzing the LBSSE atom wise. In addition, in the method of local increments (or more general in a MBE) the one-body contribution is the most crucial, since its appearance dominates the expansion (see Equ. 6 and 7). Note, in the case of adsorption of a small molecule like CO on a surface, it might be useful to define all orbitals localized at all atoms of the molecule as one local environment. See Figure 1 for atom labels. In order to test the analyzation method for the local BSSE, we used the previous mentioned system of CO adsorbed on a rutile(110) embedded cluster model. Information about the model and computational setup can be found in the SI. In order to be able to converge to bigger basis set compositions, we used MP2 as correlation method, but similar results could be achieved at other levels of theory, e.g. CCSD(T). Note however, that the MP2 results just reflect the LBSSE at MP2 level. If switching to another level of theory, a new analysis might be necessary.
In Table 1 we listed values for different basis set compositions. In contrast to our previous studies [16][17] and to the calculations shown in the SI, we used Dunnings correlation consistent basis sets cc-pVXZ (with X = D, T, Q or 5) in order to understand the convergence behavior more rigorously.
For the first basis set composition (all cc-pVDZ), adsorption energies differ by 0.61 eV, depending, whether pure energies or counterpoise corrected values are considered. A standard procedure therefore is to trust in the CP corrected value. But as seen from the biggest basis set composition used in this study, this is not completely converged and will go down from -0.41 eV to -0.82 eV. The non-corrected energies are additionally effected. Here, Eads varies between -1.27 eV to -0.91 eV.
For the basis set composition with least basis functions (A), Eads is in good agreement with the final adsorption energy (F) in the basis set limit, whereas the counterpoise corrected energy shows the biggest error. This result is surprising, but can be explained by a partial compensation of the BSIE by a completely overestimated BSSE. In addition, it is worth to mention that as basis set convergence is not always monotonous but might oscillate, especially for too small basis sets. Nevertheless, after increasing the size of the basis set at the adsorption center (Ti1) and the adsorbate (CO) to triple-quality, a monotone convergence is achieved from basis set compositions B to F. Information about basis set compositions A to F can be found in Table 1. The corresponding LBSSE values for compositions A to E can be found in Table 2. In the beginning (A) the LBSSE of Ti1 and CO (compare Figure 1) are the largest, at least one order of magnitude larger than the others. Both decrease, when increasing the basis set size, but are still dominant at quadruple quality (E). As seen from Table 2, sometimes we decided to increase the basis set size, although the LBSSE of a chosen center was smaller compared to another center. But in addition to the pure value it needs to be considered, that these centers a present more often in the material due to the C2v symmetry (presented by the factor given in Table 2). Therefore, these group of symmetrical equivalent centers contribute more to the total BSSE than others.
An obvious result derived from Table 2 is that especially atoms or orbital groups close to the adsorption center exhibit a high LBSSE. This result is neither surprising, nor new, but it is commonly known that a basis set composition has to be chosen in such a way, that the adsorption center is described by most basis functions. Then the number of basis functions may decrease for further distances to the adsorbate. Nevertheless, now we are able to determine the influence of each center individually using the method described in this letter. This way, one can determine, whether a bigger basis set for a specific center needs to be considered, or if it has been chosen large enough. Thereby an unnecessary large amount of basis function can be avoided. This can drastically reduce the computation time, due to the unfortunate scaling of correlation methods with the basis set size.

Conclusion
We have defined a way of determining the local basis set superposition error (LBSSE) using localized orbitals at a specific atomic environment or for molecular fragments. Using this analysis, a convergence of the basis set for a specific subsystem in a complex environment can be achieved.
We were able to show that counterpoise (CP) corrected energies for too small basis set compositions might be less correct than non-corrected. On the other hand, as soon as a moderate basis set size is reached, a faster basis set convergence for CP corrected values is achieved than for non-corrected values.
Next to the LBSSE analysis method, we were able to present a CBS MP2 adsorption energy and a CCSD adsorption energy for CO on rutile(110). Here, the CCSD energy has been calculated using the method of local increments.

SI 1: Computational Methods
All calculations have been performed using Molpro 2010 1 or Gaussian16 2 . All adsorption energies listed in Table 1 of the main text have been calculated at the MP2 level of theory using Gaussian. The adsorption energies listed in SI 2 for our basis set composition used in previous works [3][4] have been calculated in Molpro. Here, first, HF energies have been calculated for the entire system. Then orbitals have been localized using the Foster-Boys method 5-6 . The localized orbitals have been sorted in orbital groups according to the nearest atom center. These orbital groups are so-called one-body increments ε as mentioned in Eq. 6 and 7 in the manuscript. Then, MP2 or CCSD correlation energies have been calculated for all one-and two-body increments, and the total correlation energy has been calculated according to Eq. 6. Note that the LMP2 module of Molpro has been used for MP2 calculations. All thresholds of LMP2 have been chosen in such a way that a LMP2 calculation for the entire system gives exactly the same value as canonical MP2.

SI 2: MP2 and CCSD adsorption energies for CO adsorbed on rutile(110) calculated with the method of increments
In two previous works, we presented adsorption energies of CO on rutile(110) at the MP2 level of theory [3][4] . Using the same basis set as presented in this works, we calculated a PES for the linear adsorption/desorption of CO on top of the 5-fold low coordinated Ti atom. Adsorption energies at the HF level and MP2 level have been calculated conventionally. Additionally energies at the MP2 level of theory have been calculated using the method of local increments are shown in Figure S1. All centers of the Ti 9 O 18 Mg 7 14+ -cluster except the Mg 2+ ions have been considered as one-body increments. We tested to include the Mg 2+ ions, but no significant change in adsorption energies has been achieved (see Table S1). Since we just considered the linear adsorption of CO in this study, we can make use of the C 2v symmetry of the system. Thus, we have to calculate much less incremental energies compared to the entire system. This way, the total number of calculations can be reduced from 406 to 131 for CO on Ti 9 O 18 .  Figure S1 clearly shows that incremental energies and conventional energies are in very good agreement even when just one-and two-body increments are considered. In Table S1 all counterpoise and not counterpoise corrected CO adsorption energies are listed at both MP2 and CCSD level of theory. Including just one-and two-body increments will result in an error smaller that 2.5% compared to the conventional calculated MP2 energy. An interesting result of this study is that CCSD and MP2 energies are virtually identical. Nevertheless, compared to the basis set analysis presented in the main text, all energies shown in Table S1 seem to be significantly affected by BSIE and BSSE. For this basis set composition, the error for the counterpoise corrected adsorption energy is less than 0.15 eV whereas the non-corrected value overestimates the binding strength by almost 100%. From a LBSSE analysis (see Table S2) as presented in the manuscript it is obvious that besides the basis sets chosen for adsorption center and CO, especially O27 and O10-O13 a poorly described, which results in a large BSSE.