Numeric atom-centered-orbital basis sets with valence-correlation consistency from H to Ar

We present a series of numerically tabulated atom-centered orbital (NAO) basis sets with valence-correlation consistency (VCC), termed NAO-VCC-nZ. Here the index ‘nZ’ refers to the number of basis functions used for the valence shell with n = 2, 3, 4, 5. These basis sets are constructed analogous to Dunning's cc-pVnZ, but utilize the more flexible shape of NAOs. Moreover, an additional group of (sp) basis functions, called enhanced minimal basis, is established in NAO-VCC-nZ, increasing the contribution of the s and p functions to achieve the valence-correlation consistency. NAO-VCC-nZ basis sets are generated by minimizing frozen-core random-phase approximation (RPA) total energies of individual atoms from H to Ar. We demonstrate that NAO-VCC-nZ basis sets are suitable for converging electronic total-energy calculations based on valence-only (frozen-core) correlation methods which contain explicit sums over unoccupied states (e.g. the RPA or second-order Møller–Plesset perturbation theory). The basis set incompleteness error, including the basis set superposition error, can be gradually reduced with the increase of the index ‘n’, and can be removed using two-point extrapolation schemes.

In physical terms, this slow convergence is due to the difficulty of using smooth orbital product expansions to describe the electron-electron Coulomb cusps in real space [24,25]. In this work, we assess the performance of different types of atom-centered-orbital basis sets in RPA calculations, which include the widely used analytical Gaussian-type orbital (GTO) and the flexible numeric atom-centered orbital (NAO) basis functions. Over the years, several methods have been pursued that aim to address the slow convergence of the correlation energy, including: • Explicitly correlated wave function methods (i.e. R12 and F12) [24,[26][27][28], the transcorrelated method [29,30], diffusion quantum Monte Carlo [31][32][33], density functional perturbation theory together with plane-wave basis sets [34,35] and Nakatsuji's local Schrödinger equation method [36]. All these methods attempt to accelerate the convergence behavior directly by addressing the electron-electron cusp.
• Complete-basis-set (CBS) extrapolation schemes, which are currently the most popular method to eliminate the basis set incompleteness error in quantum chemistry. These CBS extrapolation schemes are based on standardized basis sets. These basis sets provide a step-wise systematic convergence of the basis set incompleteness error, which enables an analytic extrapolation to the CBS limit particularly for the terms containing unoccupiedstate sums [6,7,21,22,[37][38][39][40][41].
• CBS extrapolation in plane-wave basis sets. Plan-wave basis sets are another popular choice for electronic-structure calculations, because they provide an intrinsically and systematically improvable resolution of v in real or reciprocal space by increasing the momentum cutoff parameters. Plane waves are particularly suitable when combined with 'pseudoization' strategies that remove the atomic nuclei and core electrons from the explicit parts of the calculation. The plane-wave basis set error convergence of the RPA correlation energy has been investigated by Harl et al [16] and Björkman et al [42]. A recent development by Shepherd et al [40] suggests that, by introducing a new kind of energy cutoff based on the momentum transfer vector, it is possible to obtain accurate extrapolated correlation energies with lower computational cost. In the closely related GW method [43][44][45][46] the completeness relation is used to collapse the sum over unoccupied states [47,48].
The most widely used basis sets in quantum chemistry for CBS extrapolations are Dunning's (augmented) polarized valence-correlation consistent GTO basis sets (aug)-cc-pVnZ [49][50][51], with n = 2, 3, 4, 5 and higher. In early 2001, Furche investigated the basis set dependence of the frozen-core RPA method for the atomization energy of N 2 using the cc-pVnZ basis sets [22].
Comprehensive studies were carried out in recent years by Eshuis et al [21], reporting accurate benchmark data of the frozen-core RPA method, including various covalent and non-covalent interactions using the two-point extrapolation scheme from quadruple-ζ (4Z) and quintupleζ (5Z) together with Dunning's (aug)-cc-pVnZ basis sets. Fabiano et al [37] investigated the basis set convergence behavior of all-electron RPA calculations together with Dunning's correlation consistent GTO basis sets augmented with core and core-valence basis functions (VnZcv with n = 4, 5, 6 and 7) for H, C, N, O, F and Ne. It was found that the all-electron RPA method suffers from serious basis set incompleteness errors. Neither the extrapolation from 4Z and 5Z nor bare septuple-ζ (7Z) calculations without extrapolations suffice to provide converged results for the all-electron RPA method. It was suggested that accurate results can be achieved by extrapolating from 5Z and sextuple-ζ (6Z), if globally optimized two-point extrapolation schemes are employed. These observations reveal that the valence and core correlation contributions converge differently in RPA calculations, indicating that the basis set parts accounting for core and valence correlation can be constructed separately [52,53]. We here explore the use of NAO basis functions of the form as a recipe to obtain an efficient, systematic 'correlation consistent' convergence of the unoccupied-state sums in equation (2) for RPA, MP2 and related explicitly correlated approaches. Here, u i (r ) is a radial function and Y lm ( ) denotes the real parts (m = 0, . . . , l) and imaginary parts (m = −l, . . . , −1) of complex spherical harmonics. Analytic prescriptions such as Slater-type orbitals (STOs) [54,55] or GTOs (including the Dunning basis sets) [49][50][51]53] are obviously consistent with this shape, but the radial function u i (r ) may also be chosen independently of any analytical restriction [56][57][58][59][60][61][62]. Previous investigations [57,58] have shown that CBS total energies can be obtained with NAO basis sets for conventional DFAs and HF. For instance, efficient, hierarchical basis sets for DFT calculations for elements 1-102 were developed in 2009 within the Fritz Haber Institute 'ab initio molecular simulations' (FHI-aims) electronic-structure package [57], FHI-aims-2009 for short. The radial functions of these basis sets are organized in groups (so-called 'tiers') of functions in addition to a minimal basis of the occupied orbitals (core and valence) of spherically symmetric free atoms. The additional 'tiers' (n = 1-4) were found by a step-wise minimization of the LDA total energies of symmetric dimers for all elements from light to heavy. The FHIaims-2009 basis sets were found to be accurate and transferable for the total energies of all local, semilocal and hybrid DFAs and the HF method [57,63].
However, the unoccupied state sums of equation (2) present a different problem: Obtaining systematic convergence for a spectrum of unoccupied states, the number of which is not bounded by the number of electrons in the system but rather by the spatial resolution needed to approximate the effect of the electron-electron cusp in a systematic way. Ren et al [63] have investigated the convergence properties of the FHI-aims-2009 basis sets for explicit-correlation methods such as RPA and MP2. In essence, the convergence of energy differences (e.g. binding energies) is satisfactory, but only if a counterpoise (CP) correction scheme is applied.
In this paper, we develop a new sequence of NAO basis sets with valence-correlation consistency for light elements (H-Ar), in order to extend the reach of this prescription to absolute total energies for explicitly correlated methods and to CBS extrapolation schemes for cases where a direct CP correction is not practical (e.g., energy differences between very different conformations of a single molecule). As outlined below and in the appendix, using NAOs for this purpose offers both physical and numerical advantages in (i) the radial function shape near the nucleus, (ii) their tails toward large distances and (iii) the fact that their spatial extent can be controlled by a single criterion (smooth cutoff distance, which can be large) in a practical calculation. We stress the fact that these new NAO basis sets are not intended to supersede the FHI-aims-2009 basis sets, but instead serve as an independent complement for methods which require a direct treatment of the electron-electron cusp by way of unoccupiedstate sums.
Dunning analyzed the contribution of basis functions to the correlation energy on the theoretical level of the frozen-core configuration interaction method with single and double excitations (CISD) [49]. He found that the first d-type GTO function has the largest effect on the correlation energy, but the second d-type GTO function has approximately the same effect as adding the first f-type GTO function. Likewise, the third d-type, second f-type and first g-type GTO function compose the third grouping. Based on these observations, Dunning generated a series of basis sets by introducing batches of basis functions that give a (approximately) consistent amount of correlation energy from large to small. These GTO basis sets are called 'correlation consistent'. In this work, we are interested in whether this strategy also works for the NAO basis functions.
Correlation consistent basis sets can provide energies and other ground-state and excitedstate properties that converge smoothly toward the CBS limit, especially for explicit-correlation methods, such as RPA, MP2 and CCSD(T). This observation has resulted in many empirical extrapolation formulae that can be used to estimate the CBS limit [64][65][66]. Using the He ground state energy, Hill [67] found that the energy increments due to partial waves of a given angular momentum number l in a CI calculation are proportional to (l + 1 2 ) −4 , which immediately yields an inverse relation between the correlation energy and the highest angular momentum l max in a specific basis set [68][69][70]: This relation can be used to extrapolate the energy for atoms, but can only be used approximately for molecules. More importantly, Klopper et al [70] argued that this relation is not fulfilled until the function spaces with lower l than l max are saturated. Dunning's 'correlation consistent' strategy introduces a group of s and p basis functions on top of the minimal basis (named the (sp) correlation set) to facilitate the convergence of the valence correlation energy. Generally, these basis functions mainly affect the description of the valence electrons, and are used to define the index 'n' of Dunning's basis sets, i.e. the cc-pVnZ basis sets contain n s (and n p) basis functions for the valence shell (1 from the minimal basis, and n − 1 from the (sp) correlation set). The s and p function spaces in cc-pVnZ basis sets have been saturated further in several subsequent studies [50,52,53,71]. Core-correlation consistency requires augmented basis functions with much more compact radial shape, as, e.g. in cc-pCVnZ [53] or cc-pwCVnZ [52]. The calculation of electronic affinities and other properties associated with anions requires augmented diffuse basis functions, i.e. aug-cc-pVnZ [50]. Ahlrichs and co-workers proposed a family of def2 basis sets [71], that have become one of the default choices in the TURBOMOLE program (see footnote 7) [71,72]. For the first-and second-row elements, the def2 basis sets use the same polarization sets as Dunning's cc-pVnZ basis sets, but the former generally contains more s and p functions than the latter. These s and p functions were determined by minimizing the HF energies of each element. Distinct from these works, in this paper we investigate the influence of saturating s and p function spaces for valence-correlation consistency. We propose an additional group of sp functions, named enhanced minimal basis, which greatly improves the valence correlation convergence (VCC) behavior of our 'correlation consistent' basis sets (see section 3 for details). Dunning's 'correlation consistent' GTO basis sets can be incorporated into the NAO framework of FHI-aims [57] and other NAO codes without any additional implementation effort. However, our investigations show that these GTO basis sets require more expensive real-space integration grids to achieve a numerical integration accuracy that is comparable to that of the NAO functions in FHI-aims. As shown in the appendix, this is in fact not only just a numerical but also a physical problem of any contracted GTO function with very high exponents near the nucleus. Regardless of the integration method used, the actual integrand in expressions such as (Ĥ scf denotes the self-consistent field Hamiltonian of standard or generalized Kohn-Sham theory and ϕ i denotes a basis function) may become unphysical very close to the nucleus due to the inclusion of very sharp primitive GTOs. In contrast, NAO basis functions do not have this problem as one can use the exact occupied orbitals of free atoms as the minimal basis. Similar numerical integration difficulties also occur for GTOs far from the nucleus, where the analytical tails of diffuse GTOs decay slower than those of the NAO functions. Here, it proves extremely convenient that NAO functions can be rigorously localized by a confining potential [57]. Even if the confinement radius is large, the relevant integration volume is still under direct and unambiguous control (see the appendix for a detailed discussion).
In this work, we demonstrate the advantage of NAO functions for compact basis sets with valence-correlation consistency. As a first step, we focus on the valence electrons and ground-state total energies only. The description of core and core-valence correlation requires significantly different shapes of basis functions [52,53] and will be discussed in future work.

Basis set extrapolation
As mentioned above, the slow convergence behavior of RPA, MP2 and CCSD(T) total energies with basis set size is often addressed by basis set extrapolation schemes [64][65][66]. Since we face the same underlying problem (the two-electron cusp), we show below that our valencecorrelation consistent NAO total energies can be successfully extrapolated as well. Therefore, we first introduce the two extrapolation strategies to be used in this work.
The first one is the simplest but popular two-point extrapolation scheme [69] where 'n 1 ' and 'n 2 ' are the indices of the valence-correlation consistent basis sets. This 1/n 3 formula was originally proposed for the correlation energy [68,69] but was also used directly for the total energy [38,73]. Taking frozen-core CCSD(T) atomization energies of 51 small molecules, Feller et al [38] suggested that it is not necessary to extrapolate the correlation energies separately, because the error in the rest is significantly smaller than the error in the correlation component. Considering that the converged total energies can be easily obtained with NAO basis sets for conventional DFAs and HF [57], this two-point extrapolation is used only for total energies. We denote this scheme CBS-TE. Eshuis and Furche [21] stated that they use this two-point formula to extrapolate frozen-core RPA correlation energies. However, they did not say explicitly how to determine the CBS limit of the remaining components. With the same (aug)-cc-pVnZ basis sets, we can reproduce their results using the CBS-TE scheme.
In order to fully exploit the power of correlation consistent basis sets for frozen-core RPA calculations, we also consider another popular two-point extrapolation scheme with two global optimized parameters α and d, [68,69] denoted CBS-OPT: In previous investigations [37,38,41,68,69], the parameters α and d were separately optimized for n 1 = 2, n 2 = 3 (CBS-OPT [23]), n 1 = 3, n 2 = 4 (CBS-OPT [34]) and n 1 = 4, n 2 = 5 (CBS-OPT [45]). However, in this work, we prefer to use the same global parameters for each level of extrapolation, since a sequence of correlation consistent basis sets should provide consistent convergence for the basis set incompleteness error. We will determine the global parameters for (aug)-cc-pVnZ and our new NAO basis set in section 4.1.

Construction
The sequence of basis sets, constructed in this work, are composed of a varying number of primitive NAO basis functions {ϕ i (r)} in the form of equation (3). These functions with spherical harmonics include the radial terms {u i (r )} generated by the exact solutions of radial Schrödinger equations for different radial potentials {v i (r )} within a confining potential v cut (r ): where l denotes the angular momentum quantum number related to the spherical harmonics Y lm ( ). In this work, we choose the default v cut (r ) used in FHI-aims [57] to confine the radial extent, which is turned on smoothly at a distance from the nucleus (called the onset radius r onset ), and reaches infinity at r onset + w (w = 2.0 Å throughout this work). While the confining potential v cut (r ) could also be used to provide extra basis flexibility at a given basis size [59,74,75], we use it as a purely technical quantity, which should not impact any physical results. Because we only focus on the 18 light elements from H to Ar, our experience in this work (see section 3.4 below) suggests that r onset = 4 Å suffices to provide well-converged results for most of these elements. However, the basis set optimization for the metal elements (Li, Na, Mg and Al) is apt to include more diffuse basis functions, which exhibit a slower convergence with respect to the confining potential. Therefore, a slightly larger value (r onset = 5 Å) is required for these metal elements.
The FHI-aims-2009 basis sets take the self-consistent free-atom radial potential v i (r ) ≡ v free atom (r ) as default to generate the radial functions {u i (r )} used in the minimal basis, which consists of the occupied orbitals from a non-spinpolarized, spherically symmetric free atom calculation. This minimal basis naturally captures the wave function oscillations near the nucleus not only in bare atoms but also in bonded structures [57] (because the nuclear Coulomb singularity here dominates the potential). We therefore select this minimal basis as the starting point of our new basis set. As mentioned above, this minimal basis is less demanding on the radial grids than the GTO minimal basis of Dunning's correlation consistent basis set (see also the appendix).
In addition to the minimal basis, our new basis sets include three other subsets named polarization set, (sp) correlation set and (sp) enhanced minimal basis, where (sp) indicates that Table 1. The character of the radial functions included in the NAO-VCC-nZ (n = 2, 3, 4, 5) basis sets for the first-row elements. In brackets: number of radial functions of each angular momentum channel. In addition, each basis set includes the minimal basis set of numerically computed radial functions of occupied free atoms. only s and p type orbitals are used. We will establish them in the following paragraphs. All these subsets employ only radial functions u i (r ) constructed from hydrogen-like atoms. We use the notation 'H(l,z i )' to define each of these hydrogen ('H')-like functions unambiguously in the context of equation (8). Here, l is the chosen angular momentum channel. The radial functions are generated by hydrogen-like radial potentials v i (r ) = z i /r (see equation (8)). For a unique definition of u i (r ), it remains to specify the principal quantum number n. n is always chosen to be the minimal one, n = l + 1, leading to nodeless radial functions similar to Slater-type orbitals [54,55]. Each radial function thus defined leads to 2l+1 basis functions via equation (3). z i acts as an a priori free parameter that controls the extent of the resulting radial functions u i (r ), analogous to the role of the exponent in a GTO. By a suitable choice of z i for individual subsets of radial functions {H(l,z i )}, we can emulate the even-tempered relation [76] z i = αβ i−1 i = 1, . . . , N orb (9) to expand the nuclear charges {z i } of hydrogen-like orbitals which belong to the same subset of N orb individual radial functions. (α, β) are the optimization parameters used to expand the z i within individual subsets {H(l,z i )}. The number of such optimization parameters increases when new basis functions with new l are introduced in each subset. Table 1 summarizes which subsets of radial functions and how many individual radial functions per angular momentum channel l are used to define our new series of basis sets. We follow the concept of valence-correlation consistency proposed by Dunning [49] to construct the polarization set by adding numeric hydrogen-like orbitals with increasing l as the basis set quality increases. We call our NAO basis sets with valence-correlation consistency 'NAO-VCC-nZ' (n = 2, 3, 4, 5). The index 'n' equals the highest angular momentum number l max in the specific basis sets for the first-and second-row elements. Taking the first-row elements, the last column in table 1 ('polarization set') lists the polarization functions for each index of the basis set. NAO-VCC-2Z utilizes only one d-type hydrogen-like orbital H(d,z i ) in its polarization set, labeled as (1d) (i.e. N orb = 1, l = 2 in this case). However, the size of the polarization set increases quickly. At the highest quality level (NAO-VCC-5Z), we then have 4 H(d,z i ), 3 H(f,z i ), 2 H(g,z i ) and 1 H(h,z i ) hydrogen-like orbitals (4d 3f 2g 1h). We note here that the second-row elements share the form of the polarization set with the first-row elements. However, for H and He, the index 'n' is not l max but l max − 1, which again emphasizes the empirical nature of the extrapolation schemes used for the CBS limit.
In order to guarantee VCC, the continued saturation in (sp) basis function space is equally important. According to Dunning's strategy, we therefore also build (sp) correlation sets of increasing size. As discussed in the introduction, the index 'n' of Dunning's (aug)-cc-pVnZ sets was defined as the basis functions used for the valence shell, counted by the 'n − 1' s (or p) functions in the (sp) correlation sets plus the other one from the minimal basis. This definition of 'n' is also valid for our new NAO-VCC-nZ basis sets. Like in the polarization set, we group basis functions with the same angular momentum l in the (sp) correlation set, e.g. (4s) or (4p) in NAO-VCC-5Z, following the even-tempered relation. The parameters (α, β) are different for each element, and optimized separately for double-ζ (2Z) to quintuple-ζ (5Z), respectively.
Further investigation in this work reveals that it is beneficial to introduce additional s-and p-type basis functions on top of the (sp) correlation set. We address this point in more detail in the next subsection, observing that these (sp) type functions are especially important to guarantee VCC in small basis sets. The result is that our basis sets generally include what we call a '(sp) enhanced part' of the basis set (see table 1), containing 1 H(s,z i ) for H and He (1s); 2 H(s,z i ) and 1 H(p,z i ) for the first-row elements (2s1p); and 3 H(s,z i ) and 2 H(p,z i ) for the second-row elements (3s2p). Because the basis sizes are the same as those of the minimal basis, we also call this (sp) enhanced set 'enhanced minimal basis'. We again employ the even-tempered relation to expand the basis functions with the same angular momentum for the enhanced minimal basis. The corresponding parameters are optimized for each level of our basis sets. Concerning the optimization procedure, we first introduce and optimize the (sp) correlation set based on the minimal basis. Then we add the (sp) enhanced set and optimize parameters (α,β) for that, and finally the same for the polarization set.
Regarding the optimization target, we select the total energies of spherically symmetric free atoms from H to Ar in the frozen-core RPA based on the orbitals from a self-consistent Perdew-Burke-Ernzerhof [77] (PBE) DFT calculation. At the DFT-PBE level, all orbitals (core and valence) are self-consistently determined. Frozen-core RPA here implies that the index i in equation (2) only runs over occupied valence orbitals. We denote this RPA procedure RPA@PBE in the following. Unless otherwise stated, frozen-core RPA@PBE calculations are carried out for the rest of the discussion.
The parameters (α,β) are optimized to minimize the RPA@PBE total energies for each element. The optimization is carried out with a Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [78] based on three-point fitted numerical gradients. The corresponding iterations are terminated if the energy deviation in each BFGS step falls below 10 −6 eV. In order to reach the global minimum as best as we can, a set of initial guesses is generated globally in well-chosen parameter windows with tabulated grids, i.e. 0.0 < α < 50.0 and 1.0 < β < 3.0.
As an illustration, table 2 reports the optimized parameters for the carbon atom. In the development of correlation consistent GTO basis sets [49], it was found that the minimal exponent of GTOs with the same angular momentum in the polarization set decreases. Meanwhile, the maximal exponents increase with increasing index. The equivalent parameter in hydrogen-like orbitals is the nuclear charge {z i }. The aforementioned observation also partially holds for the NAO basis sets, where the maximal nuclear charges z max for the same angular momentum always grow monotonically from NAO-VCC-2Z to 5Z. Taking H(d, z max ) Table 2. Optimized parameters (α, β) and expanded nuclear charges of hydrogen-like orbitals for the (sp) correlation set, the polarization set, and the enhanced minimal basis in the NAO-VCC-nZ (X = 2, 3, 4, 5) basis sets for the carbon atom. The NAO-VCC-nZ basis sets employ the same minimal basis for all indices 'n'. The minimal basis is not shown in the table.  The enhanced minimal basis in this work is originally designed to saturate the (sp) basis functions on top of the (sp) correlation set. Table 2 indicates that the optimization is generally apt to produce more compact basis functions with larger nuclear charges, which demonstrates the importance of saturating the (sp) inner space for describing valence correlation. For the other elements up to the second row, the NAO-VCC-nZ basis sets are listed in the supporting information 5 . Figure 1 plots the averaged basis sizes of (aug)-cc-pVnZ, NAO-VCC-nZ and FHI-aims-2009. For most of these basis sets, the average counts 18 elements from H to Ar. However, the averaged basis size of aug-cc-pV5Z only covers 14 elements, because the basis sets are unavailable for the four metal elements (Li, Be, Na and Mg). In FHI-aims-2009, no tier-4 basis set was optimized for H, He, Li, Ne and Ar, because the tier-3 basis sets are sufficient to provide converged LDA total energies for symmetric dimers of these five elements. The averaged basis size of tier-4, therefore, takes the remaining 13 elements into account. Unless otherwise stated, the smaller basis sets will be used automatically if '(aug)-cc-pV5Z' and 'tier-4' are unavailable. As shown in figure 1, the basis sizes of NAO-VCC-nZ are slightly larger than those of cc-pVnZ due to the (sp) enhanced minimal basis. Both are significantly more compact than the aug-cc-pVnZ sequence, which augments cc-pVnZ with a set of diffuse functions. The basis size of tier-3 is still similar to that of NAO-VCC-4Z and cc-pV4Z, but one level up tier-4 becomes the smallest among the four atom-centered orbital basis sets.
With the aid of the resolution-of-identity (RI) technique, the most expensive step of the RPA method is the construction of the independent-particle response function. The number of required operations is proportional to N 2 aux N occ N vir , where N aux corresponds to the number of auxiliary basis functions used to expand the response function, and (N occ and N vir ) are the numbers of the occupied and virtual single-particle states, respectively. The memory scaling is dominated by the storage of three-center integrals, which is proportional to N aux N 2 bs , where N bs is the basis size of a specific basis set (see [11,12] for a detailed discussion). For a given system, for which N occ is then a constant, both N vir = N bs − N occ and N aux N occ (N bs − N occ ) are proportional to the basis size N bs , resulting in a cubic scaling of the RPA method in terms of N bs (O(N 3 bs )) for both time and memory. Our discussion of figure 1 demonstrates that our new NAO-VCC-nZ basis sets are more compact (i.e. smaller) than the Gaussian orbital aug-cc-pVnZ sequence while retaining the accuracy of aug-cc-pVnZ (see section 4). The NAO-VCC sets therefore facilitate a significant computational speed-up that is comparable to the one of switching from aug-cc-pVnZ to cc-pVnZ.

Basis set convergence for total energy
As a particular example, we first examine the performance of NAO-VCC-nZ for the carbon atom shown in table 3. The calculated RPA@PBE total energies become lower monotonically for all three sequences of basis sets with valence-correlation consistency. However, comparing to the extrapolated values, the basis set incompleteness errors in quintuple-ζ (5Z) are still larger than 120 meV. Table 3 reveals that, while diffuse functions help to notably lower the total energies for finite basis sizes, the contribution seems to be negligible when extrapolating to the CBS limit. The CBS value extrapolated from NAO-VCC-4Z and 5Z (CBS-TE [45]) is −1030.7145 eV, which is about 21 meV higher than the extrapolated value from aug-cc-pV4Z and 5Z. However, the discrepancy reduces to only 6 meV by taking the extrapolated value from aug-cc-pV5Z and 6Z (CBS-TE [56]) as the reference, validating the power of NAO-VCC-nZ basis sets to describe valence correlation for explicit-correlation methods. We further show the convergence behavior of four types of atom-centered orbital basis sets for several first-and second-row elements. In figure 2, the basis set errors of RPA@PBE ground-state total energies are plotted for increasing basis size. Figure 3 presents the same data logarithmically to better illustrate the convergence behavior. The cc-pVnZ basis sets show a consistent convergence for all elements examined here. Adding a set of diffuse functions to cc-pVnZ does not change the convergence behavior considerably for aug-cc-pVnZ.
The FHI-aims-2009 basis sets were established using the ground-state LDA total energy of symmetric dimers. The minimal basis of the NAO basis sets is by construction exact for a spherically symmetric, non-spin polarized free atom and a given DFT functional [57]. Thus, total energies for free atoms of any kind are already very close to converged when using the same DFT functional. Beyond the minimal basis, the 'tiers' in FHI-aims-2009 were by construction optimized to capture the DFT total energy contributions arising from the bonding environment. However, in RPA, the absolute energies of free atoms are not yet converged (unlike in the DFT case). As a result, NAO basis sets constructed based on semilocal or hybrid DFT do not necessarily provide the flexibility to converge atomic total energies in RPA, and this is exactly what is shown in figures 2 and 3.
The NAO-VCC-nZ basis sets provide a consistent convergence behavior for these elements.   of (aug)-cc-pVnZ, the basis set convergence curves are quite close for C and Si. However the deviation increases for the pair of N and P. And the maximum discrepancy occurs for F and Cl. Figure 3 presents the behavior of the basis set errors of figure 2 on a logarithmic scale. Among the cases shown, NAO-VCC-nZ exhibits excellent VCC for RPA calculations. The NAO-VCC-nZ basis sets extend the 'valence-correlation consistency' strategy upon including the enhanced minimal basis. Considering that the only difference of the elements in the same periodic table groups comes from the core shells, the equally good convergence behavior of NAO-VCC-nZ for these elements confirms the importance of saturating s and p functions for valence-correlation consistency. The quality of NAO-VCC-nZ is further demonstrated in table 4, which lists the absolute RPA total energies of cc-pV2Z, and the corresponding energy lowerings of aug-cc-pV2Z, NAO-VCC-nZ for elements from C to F, and Si to Cl. Comparing to the RPA total energies of cc-pV2Z, consistent energy lowerings can be achieved by introducing a set of diffuse basis functions, (one s, p and d primitive GTO each for first-and second-row elements). The augcc-pV2Z basis sets aim to saturate the basis set in the outer space. In contrast, the enhanced minimal basis sets within NAO-VCC-nZ are designed to saturate the (sp) basis functions in the inner space. Table 4 shows that dramatically larger energy lowerings can be gained by NAO-VCC-2Z, indicating that a large contribution to the error of double-ζ basis sets can be removed by saturating the inner (sp) basis function space. More importantly, these errors are not simply a function of the number of valence electrons. For example, the energy lowering of NAO-VCC-2Z for F is −2.2165 eV, which is more than double that of Cl. The elimination of these errors is the key to the excellent performance of NAO-VCC-nZ in figures 2 and 3, where we only focus on the issue of valence correlation.
MP2 is the simplest finite-order perturbation theory method, counting only corrections at second-order. As a systematic extension of RPA, renormalized second-order perturbation theory (rPT2) [12,13] adds second-order screened exchange [6,10,79] and renormalized singleexcitation (rSE) corrections [14] to the standard RPA method. To demonstrate the transferability of NAO-VCC-nZ, we also show frozen-core MP2 and rPT2@PBE results. The corresponding basis set errors are plotted as a function of the index 'n' in figures 4 and 5. Since all these methods are explicit correlation methods which share the same difficulty in approaching the CBS limit, our basis sets also provide a performance similar to RPA for MP2 and rPT2@PBE. In particular, NAO-VCC-nZ performs similarly well for systems with the same number of valence electrons, highlighting once again the importance of the enhanced minimal basis to saturate the (sp) basis function space for valence correlation.

Basis set convergence for energy differences
We next focus on the homonuclear diatomic molecules N 2 and F 2 to illustrate the performance of NAO-VCC-nZ for covalent binding energies. In figure 6, the RPA binding energies of N 2 Upon increasing the basis size, the computed RPA binding energies of N 2 and F 2 converge quickly to the CBS limits for the sequences of (aug)-cc-pVnZ and NAO-VCC-nZ. The CP correction decelerates the basis set convergence. However, most importantly, the extrapolated values (CBS-TE [45]) with or without CP correction converge to almost the same CBS limit, demonstrating that the notorious basis set superposition error (BSSE) is eliminated naturally without having to resort to CP corrections. This is the strategy of VCC. In fact, the maximum deviation is only about 20 meV or 0.2% (1.5%) of the N 2 (F 2 ) binding energies, among six CP corrected and uncorrected CBS limit values extrapolated by (aug)-cc-pVnZ and NAO-VCC-nZ for both N 2 and F 2 . For comparison, the FHI-aims-2009 basis sets ('tier-n'), which were not constructed to converge unoccupied-state sums, require a CP correction to obtain qualitatively correct energy differences. Without such a correction, systematic error cancelation between the unoccupied-state sums entering the energy difference would not be guaranteed.
To further inspect the basis set convergence, figure 7 plots the basis set errors of RPA binding energies on a logarithmic scale. We take CBS-TE [56]/aug-cc-pVnZ values as the reference. The basis set error is defined as the absolute deviation between the calculated binding energy (BE) in a finite basis set and the reference |BE − BE Ref |. The diffuse (augmentation) functions in aug-cc-pV nZ are very helpful for the description of the chemical bond. The sequence of aug-cc-pV nZ yields fast basis set convergence, especially for the F 2 case. As shown in figure 7, NAO-VCC-nZ also provides a very good convergence behavior. The logarithmic scale reveals that the NAO-VCC-nZ basis set errors decrease nearly exponentially with the cardinal number, n. We argue that the (sp) enhanced minimal basis makes an essential contribution to this excellent convergence behavior of NAO-VCC-nZ, which highlights again the importance of saturating the (sp) basis function space for valence correlation. Figure 7 also illustrates that the CP corrections can significantly improve the convergence quality of the sequences of (aug)-cc-pVnZ, even though they decelerate the basis set convergence.

Role of the confining potential
As discussed in the context of equation (8), the default confining potential v cut (r ) is employed with the onset radius r onset = 4 Å in the optimization of NAO-VCC-nZ basis sets for most of the elements. For Li, Na, Mg and Al, a slightly larger value r onset = 5 Å is used. We next quantify the influence of the onset radii on the results computed using NAO-VCC-nZ. Taking the calculated total energies for atoms and binding energies for dimers (no CP correction applied) together with larger onset radii (r onset = 6 Å for all elements) as reference, table 5 lists the absolute errors computed by the default onset radii.
The NAO-VCC-nZ basis sets are optimized for the corresponding RPA@PBE total energies of the 18 elements. Table 5 reveals that the default onset radii are a reasonable choice for the basis set optimization. The mean absolute errors (MAE) are within 2 meV. The largest deviations compared to r onset = 6 Å are ∼6 meV (Li) and ∼10 meV (Na). This deviation does not change with increasing the basis size.
In table 5, we also show the influence of the onset radii on the calculation of binding energies of 18 homonuclear diatomic molecules from H 2 to Ar 2 . In all cases, the change in binding energy with onset radius is of the order of a few meV only. The MAEs are also within 2 meV. The errors in the total energy calculations of Li and Na are mostly canceled out for the binding energies. Only the case of Si 2 is an exception. Here, the more confined double-ζ basis set (NAO-VCC-2Z, r onset = 4 Å) apparently describes the dimer total energy better (lower total energy by ≈6 meV) than the same set of radial functions with the larger confinement radius  (r onset = 6 Å). For an incomplete basis set, residual effects of this magnitude are possible (there is no variational principle for a fixed basis set size and change of basis function shape with increasing confinement radius). Importantly, the effect is small and decreases with increasing basis set size, as it should. Thus, for the quintuple-ζ basis set (NAO-VCC-5Z), the maximum error (MAX) amongst 18 binding energies is only about 5 meV. This observation is consistent with the previous work by Ren et al [63] that the onset radii only affect the binding energies for small basis sets. Since the onset radius of the cutoff is always available as an explicit convergence parameter in a practical calculation, this issue is not particularly difficult to address. Nevertheless, even in small basis sets, such numerical uncertainty is negligible compared to the basis set incompleteness error for explicitly correlated methods, e.g. RPA and MP2.

Homonuclear diatomic molecules from H 2 to Ar 2 , HDM18
We choose homonuclear diatomic molecules from H 2 to Ar 2 (HDM18) as the first test set to examine the performance of the new basis set NAO-VCC-nZ. The experimental equilibrium  (57) 17 (80) 15 (53) 13 (41) 14 (51) 18 (83)  /cc-pVnZ, because corresponding basis sets with higher quality are not available for these elements. The MAEs and unsigned MAX of the basis set incompleteness error in the RPA@PBE binding energies for HDM18 are listed in table 6. Confirming the above observation, both MAEs and MAXs present a systematic convergence with increasing basis size for the (aug)-cc-pVnZ and NAO-VCC-nZ sequences. For the FHI-aims-2009 tier-n sequence, the MAX values show that energy differences without a CP corrections should not be used. However, with CP correction, a gradual decrease of the basis set incompleteness error is found also for tier-n.
To converge their RPA calculations, the authors of [22] and [21] extrapolated correlation consistent basis sets. As shown in table 6, when extrapolating from 4Z and 5Z (45), the CBS-TE scheme yields the CBS limit in good agreement with other sequences of basis sets. The statistical uncertainty is within 16 meV.
As observed in section 3.3, the CP correction can improve the regularity of the basis set convergence for RPA energy differences. One would therefore expect that better extrapolated values from 2Z and 3Z should be obtained using the CP correction. Unfortunately, the performance of the CBS-TE scheme from 2Z and 3Z becomes dramatically worse when doing CP corrections. A possible remedy is the more sophisticated extrapolation scheme, Table 7. The global parameters α, d in the CBS-OPT extrapolation scheme, A/(n + d) α , for (aug)-cc-pVnZ and NAO-VCC-nZ. These parameters are used for each level of extrapolation. CBS-OPT, that has two global parameters α and d (see equation (7)) [68,69]. We optimized the parameters by minimizing the CP-corrected MAEs of CBS-OPT [23] in table 6. The same parameters are then applied to CBS-OPT [34] and CBS-OPT [45] without any reoptimization. The optimized α, d for (aug)-cc-pVnZ and NAO-VCC-nZ are listed in table 7, and the corresponding performance in table 6.
We see that the CBS-OPT [23] scheme performs well for the three correlation consistent basis set prescriptions if CP corrections are applied. The MAEs of CBS-OPT [23] are around 35 meV, comparable to those of the bare 5Z basis sets. While the parameters α, d are only determined in CBS-OPT [23], they are also suitable for high levels of the extrapolation. A substantial improvement can be found for CBS-OPT [34], and the performance of CBS-OPT [45] is comparable to the corresponding CBS-TE [45]. Inspecting table 6 also reveals that this CBS-OPT scheme optimized with CP corrections works for the calculations without CP corrections. For NAO-VCC-nZ, the non-CP corrected MAE obtained with the CBS-OPT [23] scheme is about 37 meV. In particular, a small MAE combined with a small MAX shows that both CBS-TE [45]/NAO-VCC-nZ and CBS-OPT [45]/NAO-VCC-nZ are accurate and robust methods to approach the CBS limit of RPA binding energies.

Reaction energies, G2RC
The so-called G2RC test set consists of 25 reactions of small organic molecules, one of which includes the element Li. Because the basis sets of (aug)-cc-pV6Z are unavailable for Li, we examine the basis sets using the remaining 24 reactions in this work. Figure 8 presents the mean error (ME), MAE and MAX in RPA@PBE reaction energies of the G2RC set. In this case, the reference reaction energies and geometries are the experimental(!) values taken from [80]. Thus, the errors do not go to zero: we are simultaneously exploring the basis set incompleteness error and the error of the chosen approximation to the calculated energy, RPA@PBE. A key message from figure 8 is that the basis set incompleteness error can be eliminated systematically using the NAO valence-correlation consistent NAO-VCC-nZ sequence. The three overall measures of the reaction energy error examined here exhibit a systematic convergence behavior toward the values that represent the intrinsic errors of the RPA@PBE method itself.
The MAE is an important criterion to judge the performance of a specific method. As shown in the upper right panel of figure 8, the MAE value given by cc-pV2Z-the smallest one in the cc-pVnZ sequence-suffers from a large basis set incompleteness error. This error dies away quickly with increasing basis size, and the corresponding MAEs approach the converged value. With a similar basis size, NAO-VCC-nZ basis sets provide a better convergence behavior for this test set, and achieve convergence comparable to that of aug-cc-pVnZ basis sets.
The lower right panel of figure 8 shows the MAE values for two kinds of extrapolation schemes for the (aug)-cc-pVnZ and NAO-VCC-nZ sequences. As the benchmark reference, the dotted line marks the MAE of the CBS-TE [56]/aug-cc-pVnZ, which is 110 meV. The six CBS-limits are in good agreement with this reference. In particular, the NAO-VCC-nZ basis sets provide almost the same MAE as the reference (111 meV) for both CBS-TE [45] and CBS-OPT [45]. Therefore, we report an intrinsic MAE of 120 meV in RPA@PBE reaction energies of the whole G2RC set (including 25 reactions). This MAE is obtained by CBS-TE [45]/NAO-VCC-nZ.
Our results confirm the conclusion of Eshuis and Furche [21] that the reaction energies are converged well at the basis set level of aug-cc-pV4Z (MAE = 119 meV). While the MAE of the cc-pV4Z basis set is 143 meV, NAO-VCC-4Z gives a better performance (MAE = 115 meV) albeit at the size of cc-pV4Z. The basis set incompleteness contribution to the MAE is about 40 meV for NAO-VCC-3Z and aug-cc-pV3Z. Our results suggest that this triple-ζ error can be reduced efficiently to less than 20 meV using an extrapolation from 2Z and 3Z (both CBS-TE [23] and CBS-OPT [23]) without a substantial increase of computational cost, especially for the NAO-VCC-nZ sequence shown in the lower right panel of figure 8.

Isomerization energies, ISO34
Isomerization is a well-defined reaction process in organic chemistry, and isomerization energies have been used to validate the performance of theoretical methods [81][82][83]. Table 8 presents the basis set convergence of RPA@PBE results for 34 isomerization energies of small organic molecules of the ISO34 set [81]. Recently, Eshuis and Furche [21] reported ME, MAE and MAX, of RPA@PBE isomerization energies for Dunning's (aug)-cc-pVnZ (n = 3, 4, 5) using a development version of the TURBOMOLE program package 7 . In this work, we reproduce these values with a discrepancy of less than 1 meV using our own numerical framework (FHI-aims) [12,57].
We again extrapolate the isomerization energies using CBS-TE [45] and CBS-OPT [45] with the (aug)-cc-pVnZ and NAO-VCC-nZ sequences. The converged MAE of RPA@PBE has an uncertainty of 1 meV and is reported in table 8. The MAX error is more sensitive to the basis set incompleteness error than ME and MAE. Even from the '34' to the '45' extrapolation levels, the investigated correlation consistent basis set types still show a change of 20-30 meV. In short, the convergence of the MAX error is comparatively poor, leaving an uncertainty of the order of 10 meV or more for all basis sets and extrapolation schemes shown in table 8. Left panels: basis set convergence of the RPA@PBE binding energies for NH 3 · · · NH 3 , C 6 H 6 · · · CH 4 , and C 6 H 6 · · · H 2 O. Right panel: basis set convergence of the ME in RPA@PBE interaction energies of the S22 set with and without CP corrections. On the x-axis, n = 2, 3, 4, 5 list the indices of (aug)-cc-pVnZ and NAO-VCC-nZ, while the number n = 1, 2, 3, 4 given in parentheses are those of FHI-aims-2009 tier-n. The CBS-TE [45] extrapolated values are labeled '45'. Geometries are taken from [84]. The ME is the averaged deviation of BE RPA -BE Ref , where BE Ref is the CCSD(T) reference binding energy (BE) taken from [85]. Negative MEs imply systematic underbinding. The dotted line marks the CP-corrected CBS-TE [45]/aug-cc-pVnZ extrapolated data (in meV).
Nevertheless, our results demonstrate that for these isomerization energies, the quadruple-ζ basis sets will generally suffice, not only for Dunning's (aug)-cc-pVnZ sequences, but also for the numeric NAO-VCC-nZ basis sets developed in this work. The bare tier-n basis sets of FHI-aims-2009 perform similarly well for the isomerization case, indicating that some of the basis set incompleteness error cancels here.

Non-covalent interactions, S22
The S22 set is the most widely used benchmark database for non-covalent interactions [84]. It includes 22 binding energies of seven hydrogen bonds, eight dispersion bonds and seven weak bonds with mixed bonding character. Molecular geometries in the S22 set are taken from [84], and the reference binding energies are CCSD(T) values in the CBS limit taken from [85]. The right panel of figure 9 shows the ME in RPA@PBE binding energies of the S22 test set using different sequences of basis sets. BSSE has the effect to overestimate the non-covalent interactions systematically. The CP correction removes the BSSE. The remaining basis set convergence error (BSCE) now underestimates the non-covalent interactions [86,87]. For three kinds of valence-correlation consistent basis sets, the non-CP and CP corrected MEs converge to the CBS limit from opposite directions. In the CBS-TE [45] extrapolation scheme, the extrapolated MEs with (without) CP correction are −31 meV(−33 meV) for NAO-VCC-nZ, −32 meV(−36 meV) for cc-pVnZ, and −34 meV(−33 meV) for aug-cc-pVnZ respectively.
A negative value implies an underestimation of non-covalent bonds. Because the RPA@PBE method consistently underestimates all interactions in the S22 set, the extrapolated MAEs are simply the absolute values of the MEs.
For non-covalently bonded systems, even quintuple-ζ basis sets are not converged to better than 10 meV, regardless of whether CP corrections are applied or not. The non-CP corrected MEs are −21 meV (NAO-VCC-5Z), −25 meV (cc-pV5Z) and −21 meV (aug-cc-pV5Z), and the CP-corrected MEs are −45 meV, −46 meV and −39 meV, respectively. Furthermore, it is also insufficient to extrapolate from the smaller 2Z and 3Z basis sets (extrapolation not shown), even though a considerable improvement is found. The extrapolation from 3Z and 4Z is enough to produce converged MEs and MAEs (CBS-TE [34] values) for the aug-cc-pVnZ sequence, but not for cc-pVnZ and NAO-VCC-nZ. Therefore, we report the intrinsic RPA@PBE errors for the S22 set as CBS-TE [45]/aug-cc-pVnZ extrapolated ME and MAE of −33 meV and 33 meV with an uncertainty of ±2 meV.
Our results confirm the observation of Blum et al [57] and Ren et al [11] that CP corrections are always necessary for the FHI-aims-2009 basis sets in RPA or MP2 calculations. The CPcorrected ME of tier-4 is −47 meV, similar to that of cc-pV5Z and NAO-VCC-5Z. Upon adding a set of diffuse functions obtained from aug-cc-pV5Z, the CP-corrected ME of 'tier-4+a5Z-d' is −39 meV reported by Ren et al [11], which is equal to that of aug-cc-pV5Z.
In the left panels of figure 9, we chose the interactions of NH 3 · · · NH 3 , C 6 H 6 · · · CH 4 , and C 6 H 6 · · · H 2 O as illustrations for hydrogen bonding, dispersion bonding and mixed bonding. Dunning's aug-cc-pVnZ sequence performs well for the three types of non-covalent interactions. Especially, the CP-corrected values of aug-cc-pVnZ present the fastest convergence for the three valence-correlation consistent basis sets. However, without the CP correction, the best performance is given by the NAO-VCC-nZ basis sets. For the NH 3 · · · NH 3 case, the converged values are approached quickly by NAO-VCC-4Z. We also find a similar behavior for the C 6 H 6 · · · H 2 O case. Taking the CP-corrected CBS-TE [45]/aug-cc-pVnZ values as reference, the extrapolated NAO-VCC-nZ values have a tendency (within 4 meV or 3% of the binding energy) to overestimate mixed bonds. For NH 3 · · · NH 3 and C 6 H 6 · · · CH 4 , the deviations between the extrapolated values of NAO-VCC-nZ and aug-cc-pVnZ are less than 1 meV. In comparison, we can see that the CBS-TE [45]/cc-pVnZ scheme without CP corrections underestimates hydrogen bonds (8 meV for NH 3 · · · NH 3 ) and mixed bonds (3 meV for C 6 H 6 · · · H 2 O).

Conclusions and outlook
In this work, we introduce a sequence of numeric atom-centered basis sets with valencecorrelation consistency, called NAO-VCC-nZ (n = 2, 3, 4, 5), for the elements from H to Ar. We demonstrate that NAO-VCC-nZ basis sets are suitable for explicit correlation methods, e.g. RPA or MP2. The basis set incompleteness error, including the notorious BSSE, is reduced gradually by increasing the index 'n', and can be removed directly using two-point extrapolation schemes for the total energy. Generally, the NAO-VCC-4Z basis sets are sufficient to produce satisfactory results for covalent bonds and isomerization energies. However, for non-covalent bonds, it is always necessary to extrapolate the NAO-VCC-nZ sequence from 4Z and 5Z.
The NAO-VCC-nZ basis sets developed here are only designed for valence correlation. We expect them to fail for core correlation and extended cases, like anions. Core and core-valence correlation effects generally become important when relatively small errors in energetics or spectroscopic constants are required [88][89][90]. Also, more diffuse functions might be required for the calculation of electronic affinities and other properties associated with anions [50,51]. These aspects were not the target of our present work, but it is clear that suitable core or diffuse functions could be added to NAO-VCC-nZ if needed. Furthermore, at present the NAO-VCC-nZ basis sets only provide access to the elements from H to Ar. In fact, so far there is no valence-correlation consistent basis set that covers all elements of the periodic table [90]. Now that we have established a solid basis with NAO-VCC-nZ, work to address these remaining shortcomings is ongoing in our group.

Appendix. Accuracy of numerical integrations
In FHI-aims, there is a hierarchy of numerical integration settings 'light', 'tight' and 'really tight', catering to various numerical precision requirements. These settings simultaneously increase the accuracy of the following aspects of the code: (i) number of radial functions; (ii) confinement radius; (iii) Hartree potential expansion; and (iv) the grids used for threedimensional integrations. The technical aspects behind these choices are described in [57]. What is important for the present development, however, is that the appropriate choice of integration grid can of course depend on the choice of basis set. Here, we encounter a substantial difference between standard quantum-chemical GTO basis sets, and NAO basis sets. Specifically, our experience suggests that Dunning's (aug)-cc-pVnZ basis sets demand denser radial grids near the nucleus and denser angular grids on faraway shells than would be required for NAOs.
In FHI-aims, spherical shells of integration grid points ('radial grid shells') are positioned around each atom. In practice, these shells are determined as follows: and s = 1, ..., N r . Note that the hypothetical shell indices s = 0 and N r + 1 correspond to r = 0 and ∞, respectively. Typical values for the parameters are r outer = 7 Å (FHI-aims 'tight' and 'really tight' species defaults) and N r = 1.2 × 14 × (atomic number + 2) 1/3 , again according to Baker et al [91]. In practice, this leads to between 24 (hydrogen) and approx. 80 grid shells, but in principle, any other combination (r outer ,N r ) is possible. 2. A simple way to increase the radial grid density uniformly is to place additional grid shells according to equation (A.1) at fractional values s, e.g, s = 1/2, 3/2,. . . , (N r + 1)/2. The grid is thus extended both toward r = 0 and ∞. The fractional spacing is called radial multiplier; in the example, radial multiplier=2, which is employed in standard FHI-aims 'tight' settings. Examples of the location of grid shells for the C atom and radial multiplier=2, 4 and 6 are shown in the topmost panel of figure A.1.
For each radial grid shell, the angular integration points are distributed based on so-called Lebedev grids [91][92][93][94][95][96][97]. These grids are tabulated to integrate successively higher angular momentum orders around a center exactly on the unit sphere. In FHI-aims, the number of angular grid points increases from the innermost to the outermost grid shells. The number of angular grid points on faraway shells is 434 in the 'tight' settings and 590 in the 'really tight' settings. We denote this maximum number of angular grid points by 'outer grid' in the following. The minimal basis of Dunning's 'correlation consistent' basis sets is composed of combinations of primitive GTOs. These combinations are called the contracted GTOs. Using the 1s orbital of carbon as an illustration, the contracted GTO in Dunning's cc-pV5Z consists of ten primitive GTOs. The exponential factors (EFs) from maximum to minimum are 96 770.0000, 14 500.0000, 3300.0000, 935.8000, 306.2000, 111.3000, 43.9000, 18.4000, 8.0540 and 3.6370. Figure A.1 shows the radial shapes of the 1s contracted GTO of cc-pV5Z, and the sharpest primitive GTO (E F = 96 770) in this contracted GTO on a logarithmic scale. The contracted GTOs in cc-pVnZ are constructed to approximate the exact solutions of the free atoms. In contrast, accurate numerical versions of the free-atom solutions are used as the minimal basis in NAO-VCC-nZ. In figure A.1, the 1s orbital from NAO minimal basis for the PBE functional is also plotted for comparison. Table A.1 lists the numerical error in the DFT-PBE and RPA@PBE total energies of the carbon dimer. Using the default 'tight' setting with radial multiplier=2, the numerical error of cc-pV5Z is more than 300 meV for both DFT-PBE and RPA@PBE. The magnitude of this error is surprising, all the more since already radial multiplier=4 (i.e. doubling the grid density) reduces this error to below 1 meV.
In figure A.2, we trace the origin of the discrepancy to the 1s function of the minimal basis of Dunning's cc-pV5Z GTO basis set. In fact, practically the entire error can be traced to the 1s Kohn-Sham eigenvalue already in DFT-PBE. At a first glance, the GTO 1s function and the   dimer as example, table A.2 presents the sensitivity of DFT-PBE, HF@PBE, and RPA@PBE to outer grid. Three quintuple-ζ (5Z) basis sets are examined here. Since increasing the angular grids in the near-nuclear shells has no effect on the accuracy, calculations are performed with the 'tight' grids and the default numerical thresholds in FHI-aims. For the first-and second-row elements, outer grid is set to 434 in 'tight'. We can see that RPA@PBE gives a nonsensical binding energy of 4.574 eV for the default tight setting and the most diffuse aug-cc-pV5Z set investigated in this work. For cc-pV5Z and NAO-VCC-5Z, the default 'tight' setting offers plausible results, but still deviate by 117 and 38 meV, respectively. Such errors in cc-pV5Z can be reduced to about 4 meV with tighter grid setting of outer grid=590 with which a numerically converged result (error within ±1 meV) is achieved for NAO-VCC-5Z. outer grid=770 is a safe setting for cc-pV5Z, and outer grid=974 for augcc-pV5Z. The tier-n basis sets behave similar to NAO-VCC-nZ.
Consistent with previous work [11,57], these results demonstrate that the default 'outer grid=434' is good enough for conventional DFAs and HF. The PBE and HF@PBE values do not change (uncertainty within ±1 meV) by increasing the angular grids on faraway shells.