Linear-Scaling Local Natural Orbital CCSD(T) Approach for Open-Shell Systems: Algorithms, Benchmarks, and Large-Scale Applications

The extension of the highly optimized local natural orbital (LNO) coupled cluster (CC) with single-, double-, and perturbative triple excitations [LNO-CCSD(T)] method is presented for high-spin open-shell molecules based on restricted open-shell references. The techniques enabling the outstanding efficiency of the closed-shell LNO-CCSD(T) variant are adopted, including the iteration- and redundancy-free second-order Møller–Plesset and (T) formulations as well as the integral-direct, memory- and disk use-economic, and OpenMP-parallel algorithms. For large molecules, the efficiency of our open-shell LNO-CCSD(T) method approaches that of its closed-shell parent method due to the application of restricted orbital sets for demanding integral transformations and a novel approximation for higher-order long-range spin-polarization effects. The accuracy of open-shell LNO-CCSD(T) is extensively tested for radicals and reactions thereof, ionization processes, as well as spin-state splittings, and transition-metal compounds. At the size range where the canonical CCSD(T) reference is accessible (up to 20–30 atoms), the average open-shell LNO-CCSD(T) correlation energies are found to be 99.9 to 99.95% accurate, which translates into average absolute deviations of a few tenths of kcal/mol in the investigated energy differences already with the default settings. For more extensive molecules, the local errors may grow, but they can be estimated and decreased via affordable systematic convergence studies. This enables the accurate modeling of large systems with complex electronic structures, as illustrated on open-shell organic radicals and transition-metal complexes of up to 179 atoms as well as on challenging biochemical systems, including up to 601 atoms and 11,000 basis functions. While the protein models involve difficulties for local approximations, such as the spin states of a bounded iron ion or an extremely delocalized singly occupied orbital, the corresponding single-node LNO-CCSD(T) computations were feasible in a matter of days with 10s to 100 GB of memory use. Therefore, the new LNO-CCSD(T) implementation enables highly accurate computations for open-shell systems of unprecedented size and complexity with widely accessible hardware.


INTRODUCTION
The accurate modeling of open-shell species remains challenging due to their potentially complicated electronic structure.Among these, the systems of interest here exhibit a high-spin open-shell ground-state wave function, which can usually be described by single-reference quantum chemical methods.Such a wide range of systems include radicals appearing as stable species or intermediates and transition states of reactions, products of ionization, or electron attachment processes, etc., often relevant in redox, atmospheric, polymer, combustion, and astro-and electrochemistry, just to name a few representative fields. 1 While density functional theory (DFT)-based methods remain the workhorse of computational chemistry due to their relatively affordable cost, cases posing challenges for DFT approaches occur more often for open-shell systems than for closed-shell systems.−4 While the second-order Møller−Plesset (MP2) 5 approach remains popular due to its cost-efficiency and relative simplicity, coupled-cluster (CC) methods represent a faster converging series to the chemically accurate (within 1 kcal/mol) description of processes of singlereference species. 6,7In particular, the "gold standard" CC model with single, double, and perturbative triple excitations [CCSD(T)] 8 offers a good compromise between computational cost and robust accuracy.
Still, the steep ( ) 4 -and ( ) 7 -scaling data storage and operation count complexity of CCSD(T) with system size limits its applicability range to molecules of up to 20−25 atoms.This is an even more severe problem for open-shell species, where unrestricted CC formalisms require the solution of about three times as many equations as their restricted counterparts.Moreover, the higher technical complexity of the unrestricted CC methods also slows the adaptation of new approaches frequently proposed only for the spin-restricted case.−17 However, even the combination of these techniques with additional advancements in our recent integral-direct, parallel DF-CCSD(T) implementation pushes the limits of conventional CCSD(T) computations only up to 30 atoms. 13his motivates the introduction of reduced-scaling approximations, such as the robust frozen natural orbital (NO) approach, 18,19 which can extend the applicability range of NO-CCSD(T) somewhat further. 20,21Alternatively, one can also accelerate the basis set convergence via explicitly correlated (F12) CC methods, 22−24 leading to more compact atomic orbital (AO) basis set requirements.However, even our recent combination of the NO approach and an optimized F12 implementation 25 allowed us to approach the complete basis set (CBS) limit for closed-shell species of only up to 50 atoms with an even smaller open-shell limit of about 30−35 atoms. 26−29 Current methods still build on fundamental techniques pioneered by Pulay and Saebø, 30,31 such as the approximation of the energy contribution of distant localized molecular orbital (LMO) pairs (pair approximation) and the restriction of the correlating orbital spaces to a spatially compact list surrounding the strongly interacting LMOs (domain approximation).One group of methods then approximates or neglects the distant pair interactions leading to a number of decoupled subsystem MP2 or CC equations to be solved; but here, it remains challenging to deal with the significant overlap of the subsystems. 27,28,32,33−53 Among these, the LMO pair-specific pair natural orbitals (PNOs) are employed in the domain-based local PNO (DLPNO) method of Neese, Valeev, and coworkers, 47,48,54−56 but PNO-based approaches were also developed up to the CCSD(T) level by Werner and Ma, 49,50,57 as well as by Schmitz and Haẗtig. 51−62 Due to the considerable challenges associated with openshell systems and unrestricted CC formalisms, fewer local CCSD(T) methods are available for systems other than those of a closed-shell singlet electronic structure.The incremental scheme was extended to both unrestricted Hartree−Fock (UHF) 63 as well as restricted open-shell Hartree−Fock (ROHF) 64 references.Additionally, the high-spin open-shell variants of the PNO-L methods by Werner and Ma 50,57 as well as the DLPNO method by Neese, Valeev, Hansen, Saitow, Guo, Kumar, and co-workers 47,48,55,56 were also introduced recently, while here, we present the high-spin open-shell extension of our LNO-CCSD(T) approach.
To that end, here, we combine the attractive properties of two lines of recent developments within our restricted LNO-CCSD(T) 52,53,62 as well as our open-shell local MP2 (LMP2) 65 schemes.The outstanding efficiency of these approaches originates from the Laplace-transform-based, redundancy-free evaluation of the amplitudes at both the LMP2 and the (T) levels. 62,65Moreover, the CCSD contribution is also obtained in the compact LNO space, which was further accelerated via our highly optimized CCSD implementation designed also for the unconventional ratio of the occupied and virtual orbital dimensions occurring in the LNO basis. 13The resulting LMP2 and LNO-CCSD(T) algorithms are fully ab initio, i.e., free from empirical or distance-based cutoff parameters, manual fragment definitions, bond cutting and capping, etc. usually associated with local correlation approaches.In fact, the LNO approximations are defined completely automatically and adopt to the complexity of the wave function of the systems.Exploiting this property, we designed an LNO approximation hierarchy of threshold combinations (e.g., Normal, Tight, ...), which form a systematically convergent series usually also suitable for extrapolation toward conventional CCSD(T) and for providing a conservative LNO error estimate. 53Further unique features of our LMP2 and LNO-CCSD(T) methods include the exceptionally small memory and disk use, checkpointing, utilization of point group symmetry (even non-Abelian), and treatment of near-linear-dependent AO basis sets. 52,53,66These capabilities were all required in our largest local CCSD(T) computation so far performed for extended supramolecular complexes 67 as well as for a protein of 1023 atoms with almost 45,000 AOs in a quadruple-ζ basis set. 53Even larger systems can be targeted via various embedding approaches combining LNO-CC-based chemically active regions embedded into lower-level LNO-CC, LMP2, DFT, and/or molecular mechanics (MM) environments. 68,69n important goal in the generalization of the above spinrestricted local correlation methods to open-shell systems is to retain as much as possible the computational efficiency of the closed-shell scheme.Therefore, in our open-shell LMP2 method 65 as well as here for the LNO-CCSD(T) case, we employ a restricted open-shell (RO) reference determinant, RO LMO set, and RO intermediate basis sets used for the costly integral transformation steps.Interestingly, this strategy is implemented in three completely different manners in the two PNO-based and our approaches.The DLPNO method employs an n-electron valence state perturbation theory Ansatz, 47 the PNO-L methods utilize a spin-adapted MP2 formulation, 57 while we employ a spin-restricted integral transformation combined with a simpler ROHF-based but unrestricted MP2 Ansatz. 70,71Building on that, here, we construct a restricted LNO basis.However, due to the properties of the perturbative triples corrections, in the end,

Journal of Chemical Theory and Computation
both the PNO and LNO methods have to use unrestricted formulas for the CCSD(T) part.
To enable the cost of the unrestricted CCSD(T) calculations of the LNO scheme to approach that of the closed-shell method, at least in the asymptotic limit, we utilize an additional approximation that we developed for our openshell LMP2 approach. 65Briefly, we can exploit the fact that the long-range spin-polarization effects of localized singly occupied MOs (SOMOs) can be taken into account at the mean-field and approximated MP2 level. 65Then, the efficient closed-shell formulas and algorithms can be utilized for the LMP2 and LNO-CCSD(T) correlation energy contributions of the LMOs that are not interacting strongly with any SOMOs.Interestingly, here, we also find that for large systems of about 200 atoms or more, up to 50−90% of the LMO correlation energy contributions can be safely evaluated with this approach at practically the cost of the closed-shell counterpart.
The capabilities of the resulting open-shell LNO-CCSD(T) code are illustrated on three-dimensional transition-metal complexes of up to 179 atoms as well as on protein models of 565 and 601 atoms.The protein models involve about twice as many (ca.11,000) AOs as the largest open-shell local CCSD(T) computations in the literature so far. 48,56In addition to the more complicated electronic structure of the studied metal complexes, the largest protein model also exhibits a highly delocalized SOMO, posing a challenge to any local correlation approach.Nevertheless, these LNO-CCSD(T) computations were still feasible within wall times of about 2−4 days using a single CPU with 20 physical cores and mostly a few tens to at most 100 GBs of memory and comparable disk use.
It is even more important to retain the accuracy of restricted LNO-CCSD(T) for more challenging open-shell applications.The closed-shell LNO-CCSD(T) method was extensively benchmarked by us 52,53,67,72−75 as well as independently, 76−83 revealing highly competitive accuracy and efficiency compared to other local CCSD(T) approaches, e.g., for organic thermochemistry, 53,78 noncovalent complexes, 67,74,81 conformational and isomerization energies, 82,83 ionic interactions, 73,80,84,85 as well as for organometallic- 77,79 and extended π-systems 67,76 exhibiting moderate nondynamic correlation.Here, we extend these benchmarks to radical stabilization energies, ionization potentials, and spin-state energies of smallto medium-sized systems and up to triple-ζ basis sets.On average, at the range where we can compare against the canonical CCSD(T) reference (that is, ca.20−30 atoms), we find the open-shell LNO-CCSD(T) correlation energies to be 99.9−99.95%accurate, which translates into a few tenths of kcal/mol average energy difference deviations for the investigated systems.This is in accord with the accuracy of the closed-shell method, where, however, one should point out that the local errors somewhat grow with increasing system size and wave function complexity. 52,53,67,72−75 The discussion of the corresponding details is organized as follows.Sections 2 and 3 collect the theoretical and algorithmic details of the new LNO-CCSD(T) approach, focusing on the technicalities emerging specifically for the open-shell case.The computational details and benchmark molecules are introduced in Section 4. The accuracy of the individual and combined local approximations is assessed in Sections 5 and 6.Finally, large-scale applications for systems of 175−601 atoms and the corresponding computational requirements are discussed in Section 7.

THEORETICAL BACKGROUND
Throughout the presented derivations, restricted open-shell (RO) reference determinants are assumed, consisting of SOMOs and doubly occupied molecular orbitals (DOMOs).Since the LNO method makes use of multiple orbital types, the notation of these is summarized in Table 1.The conventional and the LNO correlation energy expressions also employ unrestricted, semicanonical (also known as pseudocanonical) MOs.The lower (upper) case indices label orbitals with spin up (down) occupation, while i, j, k, ..., I, J, K, ... and a, b, c, ..., A, B, C, ... indices are used for the occupied and virtual subsets, respectively.Local approximations are introduced in the basis of LMOs obtained from a restricted open-shell reference, which will be denoted in general by indices , , , ..., while these LMOs will be labeled as i′, j′, k′, ... (I′, J′, K′, ...), respectively, when occupied by spin up (spin down) electrons.
To introduce the orbital-specific correlation energy contributions of the LNO approach, first the open-shell CCSD correlation energy expression is rewritten as a sum of contributions from the occupied spin up and spin down orbitals, δE i CCSD and δE where , and t denotes the CCSD singles and doubles cluster amplitudes.Additionally, ⟨ab∥ij⟩ stands for antisymmetrized electron repulsion integrals (ERIs), constructed as ⟨ab∥ij⟩ = ⟨ab|ij⟩ − ⟨ab|ji⟩ using the Dirac notation.
Analogously, the energy formula for the open-shell perturbative triples correction of CCSD(T) can be written as where t ijk abc denotes the triple excitation amplitude of the (T) method and W ijk abc can be written as t ijk abc D ijk abc , that is, the triple excitation amplitude multiplied by the canonical orbital energy differences collected into the denominator D ijk abc . 7,8ombining eqs 1 and 2, we can also write the full CCSD(T) correlation energy as a sum of orbital contributions To introduce the local CCSD(T) Ansatz, let us transform the separated occupied indices of eq 3 to the restricted openshell LMO basis where in the last term, denotes a spatial orbital occupied by either one or two electrons in the RO LMO basis, while i′ (I′) refers to orbitals with the same spatial component as LMO , but occupied by at most one spin up (spin down) electron.
The explicit expressions used for the LNO method will be introduced in Section 3.7.
The scaling of CCSD(T) can be made asymptotically linear with respect to the system size if only a domain of asymptotically constant number of orbitals is required to evaluate the correlation energy contribution of a given orbital , that is, E CCSD(T) .To construct the domain around each LMO, which is then called the central LMO of its own domain, first, the LMOs are collected that strongly interact with the central LMO.The selection of strongly interacting LMO pairs is based on multipole approximated MP2 pair energies [see the E CCSD(T) term of eq 6] evaluated in LMO pair-specific domains (PDs, ). LMO pairs with very small E ( )

MP2
pair correlation energy estimates are considered distant and, unlike the strong pairs, they do not enter to the higher-level computations.Next, the virtual space of the domain of LMO is constructed from local projected atomic orbitals (PAOs) surrounding the central LMO and its strongly interacting LMO pairs.The domain obtained in this way is referred to as the extended domain (ED, ) of LMO , which is sufficiently compact to efficiently perform MP2 computations exploiting our open-shell LMP2 implementation. 65The resulting local MP2 correlation energy contribution of the ED E ( )

MP2
[ ] is utilized as part of the correction employed to decrease the effect of the remaining approximations in the ED at the CCSD(T) level (see eq 6).To further compress the orbital spaces of the ED, LNOs are constructed using the density built from the first-order Møller−Plesset (MP1) amplitudes of the ED, yielding the local interacting subspace (LIS, ) of LMO .Consequently, the LNO-CCSD(T) correlation energy reads as where the energy correction E MP2 is calculated at the MP2 level of theory as Therefore, correlation energy terms are included in the final LNO-CCSD(T) expression for all orbital pairs of the entire molecule.The largest and most important component, which corresponds to the correlation of the strong pairs, is included at the complete CCSD(T) level in the LISs.The contribution of the frozen LNOs and the correlation energy contribution of distant LMO pairs are included at the MP2 level in the extended and pair domains, respectively.

ALGORITHM
The algorithm of the restricted open-shell LNO-CCSD(T) method is summarized in Figure 1 and described in this section step by step in detail.

Self-Consistent Field and Orbital Localization.
The considerations regarding the reference selection are analogous to those of our local RO MP2 scheme. 65The ROHF or RO Kohn−Sham (KS) reference determinant can be obtained via a restricted open-shell self-consistent field (SCF) calculation (ROHF/ROKS) or via unrestricted (UHF/UKS) computations followed by constructing quasi-restricted orbitals (QROs). 65,87he computational cost of (RO/U)HF can become high above the few hundred atom range even with DF approaches.−90 In the present study, local DF is employed only for the ROHF and UHF computations in the 500-to 600-atom range.
The occupied orbitals of the reference determinant are localized with the algorithm proposed by Boys, 91 but Pipek− Mezey, 92 intrinsic bond orbital, 93 and generalized Boys methods with higher orbital variance power 94 are also implemented.The localization is carried out in a spin restricted manner, and the doubly occupied and open-shell MOs are not mixed, resulting in doubly and singly occupied LMOs.The clear advantage of the restricted formalism is that the correlation energy contribution of the spin up and down electrons of a doubly occupied LMO can be obtained in a single domain corresponding to an RO LMO.The alternative way of computing E i CCSD(T) and E I CCSD(T) of eq 4 in separate, spin case-dependent domains would be approximately twice as costly.The potential drawback of keeping the LMO space restricted is that in systems with only one or a few SOMOs (or when the mixing of the SOMOs is prohibited, e.g., due to symmetry), it may not be possible to (sufficiently) localize the SOMO(s), leading to singly occupied orbitals in the LMO basis that are (potentially) still delocalized (see below an example in Figure 7).

PAO Construction and Pair Energy Calculation.
The PAO construction and pair energy calculation algorithms follow closely those of the open-shell LMP2 scheme 65 with minor extensions needed for the LNO-CC methods discussed here in more detail.To identify the strongly interacting LMO pairs, their approximate MP2 pair correlation energies are computed in LMO PDs.The PDs contain the corresponding occupied LMO pair and PAOs as virtual functions that are centered on atoms surrounding the two LMOs.
To assemble the virtual space, first, the PAOs of the entire molecule are constructed by projecting out the restricted LMOs (both DOMOs and SOMOs) from all AOs of the molecule (|μ⟩) The AO |μ⟩ and the corresponding atom can be considered as the center of the resulting PAO |a μ ⟩.The PAOs of eq 7 are restricted and span the virtual subspace of the spin-up electrons.To span the virtual subspace of the spin down electrons, we use the union of all SOMOs and PAOs.
The pair domain of a given LMO pair is constructed as the union of the primary domains of the two LMOs.To define the primary domains, a set of atoms is assigned to each LMO and PAO using a modified 52,66 Boughton−Pulay (BP) algorithm. 95he BP atom list of an LMO or PAO is compiled such that the projection of the LMO/PAO onto the AOs of its BP atom list has an overlap value of at least T with the original, unprojected LMO/PAO.Thus, 1 − T gives an upper bound for the truncation error of the projection onto the BP atom lists.For the primary domain construction, BP atom lists are assembled using T PDo = 0.999 for LMOs and T PDv = 0.98 for PAOs.The occupied subspace of the primary domain contains a single LMO, while its virtual subspace includes those PAOs that are centered on any of the atoms in the LMO's BP atom list.Additionally, if a BP list of an SOMO constructed with completeness criterion T PDv overlaps with the BP set of the LMO, the SOMO is also included in the spin-down virtual subspace of the primary domain.Then, the atom list of the PD contains the union of the BP atom lists of all of the orbitals in the primary domain.Finally, the orbitals of the primary domain are projected onto the AOs of the primary domain, the projected orbitals are orthogonalized, and the spin-up and spin-down orbitals are separately canonicalized.
The multipole approximated MP2 pair energy of an LMO pair is then evaluated in the primary domain bases of the LMO pair Here, the pseudocanonical orbital energy of the virtual orbitals is denoted by ε, while F i′i′ (F I′I′ ) indicates a diagonal element of the spin-up (spin-down) Fock matrix.The LMO subscript of the virtual orbitals indices indicates that the corresponding summations include virtual orbitals only from the primary domain of the given LMO.The ERIs, denoted here in Mulliken notation by a i b j ( ) 4

| [ ]
, are calculated using a multipole expansion up to the fourth order, including dipole− dipole, dipole−quadrupole, quadrupole−quadrupole, and dipole−octupole moment terms as discussed in ref 66.
Utilizing the E ( ) MP2 pair energies, the LMO pair can be classified as a strongly interacting pair if , where ε w is the strong pair energy threshold and f w is a factor of 1, 1

2
, or 1 4 for DO−DO, DO−SO, and SO−SO LMO pairs, respectively.The f w scaling factor is introduced so that pairs between LMOs of different occupations are handled on an equal footing since SO LMOs have half as many pair energy terms in eq 8 as DO LMOs.A more detailed discussion and numerical benchmarks regarding the benefits of using the f w factor are provided in our RO LMP2 study. 65Note that the approximated pair energy of those pairs that are not classified as strong is added to the MP2 correlation energy corrections as the last term of eq 6.
The pair energies of eq 8 have been extensively tested on a large number of (mainly organic) molecules, 52,53,65,66 with consistently satisfactory performance.However, the open-shell LNO-CCSD(T) approach is expected to be applied more frequently also for transition-metal complexes (see, e.g., Sections 5 and 7), exhibiting electronic structures potentially more complex than those of a typical organic molecule.In a few complicated cases, we have found that the multipole approximated MP2 pair energies can underestimate the approximation-free MP2 pair energies.This can lead to LMO pairs that are classified as distant instead of strong on the border of the two categories.
LMO pairs were classified as distant instead of strong on the border of the two categories.To remedy this issue, an additional mechanism is introduced here to extend the strong pair list.In the present approach, we investigate more closely the orbital pairs characterized originally as distant that are on the border of the distant and strong categories.More precisely, the orbital pairs with E f ( ) ered with parameter g w defining the range of pair energies where potentially important LMO pairs may still appear.For these pairs, an additional measure is computed where C A is an orbital coefficient of truncated LMO for AOs on atom A and S μν A is the overlap matrix of these AOs.

The C S C A A A A on
part is thus the Mulliken charge of (the truncated) LMO on atom A. Therefore, measure M sums the products of LMO and LMO Mulliken charges for all atoms; hence, it can be interpreted as a discretized measure of the overlap of the two LMOs.Orbital pairs for which M is large are more likely to be localized on the same or nearby atoms and consequently may interact strongly.The strong pair list is therefore extended with those pairs, which exhibit a considerable M measure.In practice, it is preferable to include the pairs that are indicated to be more strongly interacting compared to their pair correlation energy.The selection of the additional strong pairs is thus controlled by the h w parameter as relative to the pair correlation energies.In practice, we found that g w = 5 and h w = 50 E h −1 are reasonable choices also for particularly challenging cases, while the strong pair list extension introduces (practically) no changes to our previous strong pair list definition for organic molecules.

Local MP2
Energy in the Extended Domains.The MP2 correlation energy contribution of each LMO is evaluated in its ED analogously to our RO LMP2 method, 65 so we focus on the steps required in the EDs for LNO-CC computations.The occupied subspace of the ED consists of its central LMO and the strong LMO pairs of the central LMO.The atoms of the ED are defined to be the union of the extensive BP atom lists of all LMOs in the ED.These large BP atom lists are constructed with a completeness criterion of T EDo = 0.9999 by default.Then, each LMO is projected onto the AOs of its respective BP atom list, ensuring at most 1 − T EDo (that is, here, below 0.01%) projection error for accurate computations with these projected LMOs in the ED.The projected LMOs are reorthogonalized by a specific combination of the Gram− Schmidt and Loẅdin symmetric (GSL) orthogonalization algorithms. 96,97In the Gram−Schmidt step, the central LMO and all SOMOs are projected out from the DOMOs of the ED to ensure that they are not changed until the LNO construction step (Section 3.4).Subsequently, the projected DOMOs are Loẅdin-orthogonalized, and all occupied orbitals are semicanonicalized.For the latter, the spin-up and spindown Fock matrix blocks of the ED are diagonalized separately in the spin-up and spin-down occupied orbital bases.Next, the virtual subspace of the ED is constructed from the restricted PAOs centered on the atoms of the PAO center domain (PCD) of the ED.The PCD of an ED is defined as the union of the compact BP atom lists of the LMOs in the ED, constructed with T 0 = 0.985.The selected PAOs are subsequently projected onto the AO basis of the ED.To span the spin-down virtual subspace of the ED, the SO LMOs of the ED are appended to the spin-down PAO list of the ED.Finally, the occupied subspace is projected out from the virtual space of the ED; the virtual orbitals are Loẅdin canonicalorthogonalized among themselves and are semicanonicalized in a spin-unrestricted manner.
The ERIs of the ED are transformed to the MO bases of the ED using the DF approximation 98,99 and highly optimized AO integral 100 and local integral-transformation algorithms. 65,66As demonstrated previously, 66 the set of auxiliary basis functions of the DF approximation can also be restricted to include those residing on atoms of the PCD.The three-center DF integrals (μν|X) are therefore computed only for a significantly restricted AO list (μ) of the ED and for auxiliary functions (X) in the PCD.The transformation to the MO bases of the ED is started with an intermediate transformation step to the restricted occupied LMO basis of the ED.Since this first step is the most demanding, the overall cost of the open-shell DF integral transformation is kept comparable to that of our closed-shell LMP2 algorithm. 65he four-center antisymmetrized ERIs are then only assembled for the single central LMO, e.g., a single i′ index value, retaining the formally at most fourth-power-scaling operational complexity in the ED in terms of the dimensions of the ED bases where the K tensors are computed from the DF integrals according to In the above equation, I contains three-center DF integrals transformed to the ED MO basis aj X I ( ) . V XY = (X|Y) is the two-center auxiliary integral, whose inverse is computed using Cholesky-decomposition: Similar to the four-center ERIs, the assembly of the MP1 amplitudes is only required for the fixed i′ or I′ values; therefore, it is advantageous to compute them using either Cholesky-decomposition or Laplace-transform.This way, the amplitudes can be written in closed form also in the noncanonical basis used here to avoid the redundant computation for other occupied index values. 65,66Accordingly, the MP1 amplitudes of the ED are computed as )   where the index ω runs over the Cholesky-vectors or the integration quadrature of the Laplace-transform.The bar over integrals I and J denotes that these have been multiplied with the factor corresponding to the factorized energy denominator.For example, J J c BJ X BJ X BJ , ,

=
, where c BJ is either a Choleskyvector element or if Laplace-transform is used, with t ω and w ω as a quadrature point and the corresponding weight, respectively.It is important to note that, for example, the Laplace-quadrature is also determined in a spinindependent manner, 65 enabling the assembly also of the mixed spin MP1 amplitudes, such as in eq 12. Building the four-center integrals and MP1 amplitudes for all required spin cases leads to the ED MP2 energy contribution of the central LMO as Here, it is important to note how the contributions from single excitations to the ED MP2 energy originate from two sources.First, as we employ the restricted open-shell determinant, the unrestricted Fock matrices are not selfconsistent and thus have off-diagonal elements even on the unrestricted canonical MO basis of the entire molecule.Second, as a consequence of truncating the occupied and virtual MOs of the ED, a small portion of the exact virtual space is mixed with the occupied ED MOs and vice versa.The latter contribution to the singly excited MP1 amplitudes and hence to the MP2 energy of the ED is discarded in our closedshell LMP2 variant, which, however, would undesirably discard the first contribution as well in the open-shell case.Since the first contribution originates from the off-diagonal block of the MO Fock matrix, while the second one appears mostly in the diagonal part of the MO Fock, the two parts can be separated. 65For that purpose, we also need to separate the off-diagonal Fock matrix blocks in the AO basis as where F OD and F are the off-diagonal and the full Fock matrices in the AO basis, respectively, while C is the matrix of the unrestricted MO coefficients, and ϵ is a diagonal matrix with the corresponding orbital energies.Then, we use F OD to evaluate off-diagonal Fock matrix elements in the MP2 energy of the ED as shown in eq 14.This F OD matrix vanishes if the reference orbitals of the molecules are exact eigenfunctions of the full Fock matrix, such as in the case of our closed-shell methods where a self-consistent HF reference is employed.Finally, we also note that the complete local MP2 energy can be constructed as a side product of an LNO-CC computation as This is beneficial as this local MP2 energy can be used, for example, for various composite energy expressions, such as local MP2 level basis set corrections.

Local Natural Orbitals.
Due to the extensive cost of CCSD(T), the orbital spaces of the EDs should be further decreased after the MP2 part of the ED computation is

Journal of Chemical Theory and Computation
completed.To that end, LNOs are constructed as the eigenvectors of the second-order density matrix blocks of the ED built from the MP1 amplitudes of eq 12.The so-obtained LNO list is truncated by retaining the most important occupied and virtual LNOs for the domain correlation energy contribution with occupation numbers above a threshold.
For this purpose, first, the occupied−occupied density matrix block contribution of central LMO is obtained in the semicanonical ED basis for both spin cases as In order to reduce the computational cost of the integral transformation step (see Section 3.5), it is beneficial to first construct an intermediate spin-restricted orbital set also for the LNOs.To obtain spin-restricted LNOs, we transform the spinup and spin-down density matrix blocks of eq 17 to the restricted occupied basis of the ED and add together the resulting terms Here, C l J C ( ) contains the spin-up (spin-down) semicanonical to restricted LMO transformation coefficients of the ED.
Note that the restricted occupied basis contains both the central LMO and the SOMOs without mixing them with the rest of the truncated LMOs of the ED.This is the intentional result of using the GSL orthogonalization as described in Section 3.3 since the central LMO has to be kept exactly in the LIS, and we also want to keep the DO and SO subspaces separated after the truncation of the LNO basis.Therefore, before diagonalizing the restricted density matrix, its D m and D l elements are replaced with zeros to avoid mixing the central LMO with the rest of the occupied ED orbitals upon diagonalization.If the ED contains any SOMOs, the SOMO− DOMO blocks of the density matrix are also overwritten with zeros in order to obtain a single set of restricted occupied LNOs by compressing only the DO subspace of the ED.After diagonalizing this modified density matrix, the (restricted) occupied basis of the LIS ({i̅ }) consists of the central LMO, all SOMOs, and those doubly occupied LNOs that exhibit occupation numbers higher than the occupied LNO threshold, ε o .Then, the retained orbitals are semicanonicalized using the spin-up and spin-down Fock matrices, yielding the final, unrestricted semicanonical occupied LNO basis ({i̲ },{I̲ }) in which the CCSD(T) contribution of the LIS is evaluated.
Notice that in domains where the central LMO is singly occupied, all MP1 amplitudes with an occupied index of I′ are zero because there is no spin-down electron on such orbitals.Thus, for SO central LMOs, two of the four terms of eq 17 also vanish, which approximately halves the number of nonzero density matrix terms compared to the case of DO central LMOs.To treat the LNO construction with both SO and DO central LMOs on the same footing, we scale the density matrix corresponding to SO central LMOs by a factor of 2. With this setting, we found that the same ε o -occupied LNO threshold provides balanced accuracy for both domain types.
Having the occupied LNO space at hand, we recommended to take advantage of the fact that the virtual LNOs only need to describe the correlation energy contribution of the retained occupied LNOs. 52To achieve this, the occupied indices of the MP1 amplitudes in the ED are transformed to the retained LNO basis before virtual density matrix construction.This decreases the number of terms contributing to the virtual− virtual density matrix elements and thus leads to a smaller number of virtual LNO occupation numbers being above the corresponding threshold, ε v . 52t is worth noting a difference in the definition of the density matrix contribution of the central LMO to its occupied (eq 17) and virtual (eq 19 below) blocks.Namely, for the occupied− occupied block, only a single occupied index can be selected to be the central LMO as the other two indices (j ̃and k ̃of eq 17) are the indices of the density matrix block.In contrast to that, virtual indices (ãand b ̃of eq 19 below) label the elements of the virtual−virtual density matrix block; thus, there is some freedom of choice regarding, which occupied index is set to be the central LMO and at what point of the derivation should one introduce the restriction for the central LMO index.This leads to three different local virtual density matrix fragment variants, for which we provide derivations and further analysis in the Appendix.Out of the three choices, the definition with the most balanced contribution for all spin cases was selected Let us note that for closed-shell systems, this virtual density matrix expression does not completely match the density matrix in our closed-shell LNO-CC method 52,53,58,59 as our previous definition reduces to one of the other two expressions derived in the Appendix.Therefore, the virtual density matrix expression of the closed-shell algorithm is updated to match the open-shell form defined here in the limit of applying the open-shell code to closed-shell molecules.This introduces a very small change to the closed-shell LNO-CCSD(T) energies, much below the uncertainties corresponding to the LNO approximations.
In order to define an intermediate restricted LNO basis, the virtual density matrices of eq 19 built in the canonical ED basis are transformed to the restricted PAO basis of the ED.Then, the spin-up and spin-down components are added together, analogously to the occupied case in eq 18.Similar to the occupied density matrix fragment, half of the terms of eq 19 vanish in domains with an SO central LMO.Therefore, in these domains, the virtual density matrix fragment is also scaled by a factor of 2. If SOMOs are present in the ED, the SOMO-unoccupied block of the restricted density matrix is replaced with zeros, enabling the computation of spinrestricted virtual LNOs upon the diagonalization of the density matrix modified in this way.The restricted virtual basis of the LIS ({a ̅ }) then consists of the SOMOs and the retained virtual LNOs with occupation numbers larger than the threshold ε v .The resulting restricted virtual LNO basis is then semicanonicalized separately with the spin-up and spin-down Fock matrices to obtain the unrestricted, semicanonical virtual LNO basis of the LIS ({a̲ },{A̲ }).
3.5.Two-External Integral Transformations.For the CCSD(T) computations in the LNO basis of the LIS, the integral lists (i̅ j̅ |X), (a ̅ i̅ |X), and (a ̅ b̅ |X) are required.By utilizing intermediate restricted LNO bases, this demanding step can be performed at a cost very close to that of our closed-shell LNO-CC method.The (a ̅ b̅ |X) two-external integral list poses a larger computational challenge due to the fact that a typical LIS contains about 4−5 times as many or more virtual LNOs as occupied ones.To overcome a potential bottleneck for large systems corresponding to naive AO (→ ED PAO) → LNO transformation algorithms, for our closed-shell LNO-CC approaches, we implemented a significantly more efficient solution introducing approximate intermediate basis sets denoted as PAO′ and LNO′. 52That approach is generalized here to the open-shell case focusing on the algorithmic details, while we refer to our original discussion regarding the theoretical introduction and justification of the PAO′ and LNO′ functions. 52riefly, an intermediate PAO′ basis is introduced in each ED designed to provide a much more compact expansion of the virtual LNOs than the AO basis of the ED.For that purpose, we project out the restricted DOMOs and SOMOs of the ED from the AOs of its PCD (μ PCD ) where O j PCD denotes the overlap integral ⟨j ̂|μ PCD ⟩.The resulting PAO′ orbitals are not identical to the PAOs of the ED, but the difference is small for our purposes and well controlled by the strong pair energy threshold (ε w ) and the ED occupied BP completeness criterion (T EDo ) as discussed in detail for the closed-shell case. 52Therefore, the compact expansion in the PAO′ basis instead of the AO basis of the ED can also be written for approximated virtual LNOs as Here, A collects the orbital coefficients of the LNO′ functions in the PAO′ basis, while B = OA.Then, the corresponding two-external integrals can be written in the LNO′ basis as The main benefit is that the summations over the AOs in the above equation run over the PCD, which is usually about 2−3 times smaller than the (complete) AO basis of the ED spanning the original virtual LNOs.We showed previously that this approach can lead to about a factor of 8−9 speedup for large systems with saturated ED sizes with negligible difference compared to the use of the original LNO basis.
Compared to the two-external integrals, the (i̅ j̅ |X) and (i̅ b̅ ′|X) integrals are obtained with relatively low computational costs by transforming, e.g., the half-transformed integrals in the ED LMO basis to the LIS LNOs.To complete the two-external integral lists, one additionally needs to evaluate the SOMO−virtual and SOMO−SOMO blocks of (a′b ̃′|X), which will contribute to the two-external integrals in the spindown unrestricted basis.However, since we so far worked in a restricted basis, where the SOMOs play the dual role of occupied spin-up and virtual spin-down orbitals, the required blocks are equivalent to the SOMO−virtual and the SOMO− SOMO block of the spin-up (i̅ b̅ ′|X) and (i̅ j̅ |X) integrals, respectively.The complete set of restricted two-external integrals of the LIS can therefore be efficiently compiled from the three sets of integrals obtained so far.As the last step of the two-external integral transformations, the restricted (a ̅ ′b ̃′|X) integrals are transformed to the semicanonical spin-up and spin-down bases of the LIS, which represents a negligible additional cost compared to the closed-shell algorithm.For the sake of simplifying the notation, the prime distinction of the virtual LNOs is omitted in the remaining sections.
3.6.Natural Auxiliary Functions.Before utilizing the above three-center integrals to assemble the four-center ERIs of the LIS, their auxiliary function index is compressed using the natural auxiliary function (NAF) technique. 20,101NAFs are constructed to find the best low-rank approximation of the J pq X , and J PQ X , three-center integrals of the LIS.For efficiency considerations, instead of preforming the singular value decomposition of the three-center integral tensors, we can build the following W intermediates It is important to note that for the assembly of four-center ERIs with mixed spin, the auxiliary basis should be the same for both the spin-up and the spin-down DF integrals.Therefore, we diagonalize the spin-averaged W = (W ↑ + W ↓ )/2 matrix 20,101 and retain its eigenvectors as NAFs with eigenvalues above ε NAF .We also note that the NAFs obtained in this manner are equivalent to those of the closed-shell LNO-CCSD(T) algorithm in the closed-shell limit due to the factor of 1 2 in the definition of W.

Correlation Energy Calculation in the LIS.
Having the DF integrals transformed to the LNO and NAF bases of the LIS, one can proceed with writing the CCSD and (T) correlation energy contributions of eqs 1 and 2 specifically for the LNO approximations in the LIS.
First, the CCSD correlation energy contribution is evaluated in the LIS as

Journal of Chemical Theory and Computation
The CCSD amplitudes are obtained by solving the CCSD equations in the basis of the LIS, relying on our handcoded, highly optimized, and semi-integral direct unrestricted CCSD(T) implementation in MRCC. 102,103The computational cost of a CCSD iteration scales in total as the sixth power of the occupied (n̲ o ) and virtual (n̲ v ) LIS basis dimension [e.g., n n ( ) 4 ], which may take a significant portion of the total wall time of an LNO-CCSD(T) computation.In comparison with the closed-shell alternative, the computational cost of the LIS CCSD part is roughly three times higher, stemming from the three times as many terms in the open-shell CCSD equations.
Regarding the (T) correction of the LIS, its naive implementation would scale with the seventh power of the LIS dimensions [i.e., n n ( ) ].This can be reduced to sixthpower scaling in the LIS if we recognize that we only need the triples amplitudes for a fixed central LMO index to evaluate the δE I (T) correlation energy contribution since the perturbative (T) triples amplitudes are not coupled, at least in a semicanonical basis.For that purpose, we utilize an orbital invariant form of the (T) expressions using the Laplacetransform in our LNO methods. 62In this way, the triples energy denominators in the LIS, D ijk abc , are factorized as utilizing n q pieces of quadrature points s q , with quadrature weights ω q .Writing the exponential of the energy denominators as products of exponentials of single orbital energies, they can be absorbed in the intermediates contributing to the triple excitation amplitudes.Then, the latter can be directly computed in a basis that contains the central LMO, enabling an algorithm with a much more favorable scaling in the LIS [i.e., n Utilizing this Laplace-transform approach, introduced in detail in ref 62, the (T) contribution of central LMO is given as < i k j j j j j j j j j j j j y { z z z z z z z z z z z z Here, the overbar of t ̅ and W̅ indicates that these quantities contain the Laplace energy denominator factors absorbed as defined in ref 62.The formal scaling of evaluating eq 26 is similar to that of its closed-shell analogue, 62 with a larger prefactor of about 3 due to the increased number of spin cases.The number of quadrature points is determined systemspecifically and, matching our closed-shell experience, 62 the resulting 3(−4) points provide sufficiently high accuracy for the (T) contribution.

Comparison with the Closed-Shell LNO-CCSD(T) Method.
Here, we compare the computational requirements of the most important steps of the restricted open-shell LNO-CCSD(T) algorithm to those of their closed-shell counterparts.The computationally demanding part of the MP2 pair energy evaluation is the transformation and contraction of the multipole approximated integrals, of which there are four times as many in the open-shell case.However, even for the largest systems of 500−600 atoms considered here, the pair energy evaluation takes only a few percent of the total run time.The next important algorithmic step is the three-center integral transformation in the ED basis.Even though there are twice as many integrals to transform as in the closed-shell case, all transformation steps of considerable demand are kept at the same cost as in the closed-shell case by utilizing the intermediate restricted ED basis.Next, the MP1 amplitudes, the MP2 energy, and the density matrices in the ED are computed from the available ERIs of the ED.As these steps have to be carried out in the unrestricted semicanonical basis, their operation count requirement is roughly 2−3 times as large as that of the closed-shell analogue.As the next step, the ERIs necessary for the LIS CCSD(T) calculations are transformed to the LNO and NAF bases.Again, all transformation steps in the LIS with considerable cost are performed in the intermediate restricted virtual LNO basis to keep the operation count close to that of the closed-shell case.Finally, the correlation energy contribution of the central LMO is computed in the LIS using an unrestricted CCSD(T) formalism requiring about three times as many operations as closed-shell CCSD(T).Since the LIS CCSD(T) part usually takes about half of the total LNO-CCSD(T) wall time, and our unrestricted CCSD(T) implementation is less well parallelized than the restricted one, 13 one can expect at least twice as high costs for the open-shell LNO-CCSD(T) compared to its closed-shell counterpart for molecules of similar wave function complexity.
3.9.Approximate Long-Range Spin Polarization.While for small to medium-sized systems, the additional expense of using unrestricted CCSD(T) formalism is affordable with the present as well as other local CCSD(T) methods, 48,50 the same might not be the case for demanding, large-scale local CCSD(T) computations.Recently, we introduced the idea of approximating the long-range spin polarization at a lower level of theory, such as our distant pair MP2 model, which turned out to be very effective in our local MP2 implementation. 65o introduce this approach also at the LNO-CCSD(T) level, let us consider an extended system with a limited amount of SOMOs, which are often located around a well-defined, narrow region of the molecule.We activate this long-range spin polarization approximation domain specifically when the central LMO of a given domain does not couple strongly to any SOMOs, i.e., when there are only DOMOs and no SOMO within the ED (and LIS).Thus, when this approach is Journal of Chemical Theory and Computation activated, the small, spin-polarized correlation effects between the distant LMO-SOMO pairs are taken into account in the final LNO-CCSD(T) correlation energy via the multipoleapproximated MP2 pair energies.In other words, if there are no SOMOs in the EDs of the LMOs that form distant pairs with all of the SOMOs, then the only spin-polarization effect to take into account in those EDs comes from the spindependence of the domain Fock matrices.This manifests in the slight splitting of the spin-up and spin-down LMOs and PAOs of the ED due to their semicanonicalization with the corresponding spin-dependent Fock matrices.However, since the central LMO of such EDs forms distant pairs with all SOMOs, the AOs around the SOMOs are mostly not part of the ED.Then, one can expect that the blocks of the spindependent Fock matrices corresponding to AOs far away from the SOMOs are moderately spin-polarized and that the magnitude of this secondary spin-polarization effect in the ED energy contribution is small.Therefore, if we approximate this spin polarization effect by replacing the spin-dependent ED Fock matrix blocks with their spin-averaged counterpart, then all orbital sets of the ED and LIS become spinindependent.Then, we can utilize the restricted local MP2 and LNO-CCSD(T) algorithms to evaluate the ED and LIS correlation energy contributions for the domains far away from the SOMOs.As a consequence, in large systems where the SOMOs are well localized to a single region of the molecule, we can take advantage of the more efficient restricted formalism and thus approach the computational effectiveness of a completely closed-shell calculation. 65.10.Scaling of the Algorithm.The presented open-shell LNO-CCSD(T) method, just as its closed-shell analogue, achieves asymptotically linear scaling with the system size for its rate-determining steps.Since the size of the EDs saturates for sufficiently large systems, all computations performed in these domains exhibit asymptotically linear scaling. 65On the other hand, computations performed on the whole system, such as the occupied orbital localization, the PAO construction, and pair energy computation, scale as the third, third, and second power of the system size, respectively.However, the cumulative run time of these computations is negligible compared to the total run time, even for the largest systems considered here.Moreover, especially for molecules of several hundreds of atoms, the SCF computation can potentially take a considerable portion of the run time.The implementation for the HF exchange term in MRCC is in principle also asymptotically linear scaling 66,89 (alongside an efficient cubic-scaling DF-based Coulomb term).However, the HF exchange computations reach the linear scaling system size range for much larger molecules than our local correlation methods, which is typically above a few thousand atoms for three-dimensional structures.Below that size range, about cubic-scaling can be expected, as the local DF capability of our HF exchange implementation can effectively reduce the scaling already above a few hundred atoms.
In terms of memory requirements, the open-shell LNO-CCSD(T) algorithm requires the storage of six matrices with dimensions equal to the total number of AOs of the entire molecule, while the preceding SCF procedure requires eight such matrices.These quadratic-scaling arrays are relevant only above tens of thousands of AOs.Furthermore, the remaining arrays allocated within the EDs are asymptotically constant in size, and their memory requirement is also highly optimized following the ideas exploited in our restricted LNO-CCSD(T) code. 52,53The shared-memory parallelization model used within a single node also contributes to the highly memoryeconomical nature of our implementation.

Technical Details.
The open-shell LNO-CCSD(T) method as presented here is implemented in the development version of the MRCC quantum chemical program suite 102,103 and will be made available in a forthcoming release of the MRCC package.The default or Normal truncation thresholds of the local approximations and their corresponding keywords are collected in Table 2.These default values are equivalent to those utilized in the most recent closed-shell formulation of LNO-CCSD(T). 52,53 all HF and reference canonical CCSD(T) calculations, the DF approximation was employed using the MRCC program.To accelerate the HF calculations for the largest systems (containing more than 500 atoms), we evaluated the HF exchange contribution in local fitting domains, as described in Section 3.1.The core electrons (including the subvalence electrons of the iron and cobalt atoms) were kept frozen in all correlated computations.The remaining occupied orbitals were localized with the algorithm of Boys 104 separately for the DO and SO subspace.
The calculations presented make use of the triple-ζ valence basis set including polarization functions (def2-TZVP) developed by Weigend and Ahlrichs 105 and Dunning's (augmented) correlation-consistent polarized valence basis sets [(aug-)cc-pVXZ, X = D, T and Q], 106 with the revised aug-cc-pV(X+d)Z basis sets used for second-row atoms. 107dditionally, the triple-ζ, weighted core−valence basis of Balabanov and Peterson (cc-pwCVTZ) was utilized for the iron atom of the Fe(III) complex investigated in Section 5.1, following the work of Radon. 108The appropriate auxiliary basis sets of Weigend et al. were used for all AO bases. 109The extrapolations of the HF 110 and the correlation energies 111 toward the CBS limit were performed according to the standard expressions.
The relative energy deviations with respect to some reference energy (E ref ) are obtained as (100%) To characterize the achieved accuracy on various test sets, the following statistical measures were utilized: the maximum absolute error (MAX), the mean absolute error (MAE), and, to measure the consistency of errors, the standard deviation of the absolute errors (STD).The presented timing measurements were performed using 2.5 GHz Intel Xeon Gold 6148 processors with 20 physical cores and at most 192 GB of total memory in dual-socket nodes.The accuracy of the open-shell LNO-CCSD(T) correlation energies and energy differences are benchmarked on three test sets containing small-to medium-sized molecules.The first test set contains the radical stabilization energies (RSEs) of 30 small organic molecules (RSE30). 112These structures were selected from the RSE43 compilation 113  Lastly, the 4′ Fe(III) complex of ref 108 is also investigated, constructed by replacing the methyl groups of [Fe(acac 2 trien)] + with hydrogens (where H 2 acac 2 trien is the Schiff base obtained from the 1:2 condensation of triethylenetetramine with acetylacetone).The structures of these medium-sized systems are depicted in Figure 2.
To demonstrate the performance of the open-shell LNO-CCSD(T) method, large-scale calculations were also carried out on four systems of 175−601 atoms (see Figure 3).Among these is the 175-atom Fe(II) complex taken from ref 115, both in its triplet and quintet spin states.Moreover, the Cob II alamin (Cbl) radical containing 179 atoms, a product of the homolytic bond breaking of the coenzyme B 12 (5′-deoxyadenosylcobalamin, dAdoCbl), was also considered. 116The largest systems investigated are the models of photosystem II (PSII) bicarbonate 47 containing 565 atoms and of the D-amino-acid oxidase (DAAO) 117 made up of 601 atoms.The singlet and triplet spin states of PSII bicarbonate were calculated by utilizing the def2-TZVP basis set and QRO reference determinants.The particular complexities and technicalities of this reference computation are discussed in ref 65.In the case of DAAO, two steps along the oxidative half-reaction of D- alanine oxidation are taken from Kiss et al., 117 where the reduced flavin adenine dinucleotide (FAD) moiety of DAAO is oxidized by O 2 − to produce the oxidized form of FAD and H 2 O 2 .The corresponding reactant and product structures labeled by O1 T and O3 CSS in ref 117 are also depicted in Figure 5 of ref 65.

CONVERGENCE OF THE LOCAL APPROXIMATIONS
In this section, the effect of the local approximations of the open-shell LNO-CCSD(T) method on its accuracy is investigated individually.To this end, the thresholds associated with such approximations are systematically varied, and the resulting energy deviations from the approximation-free DF-CCSD(T) or other reference methods are observed.Almost all of the employed approximations were extensively benchmarked in our related open-shell LMP2 and closed-shell LNO-CCSD(T) studies. 52,53,60,65,66Consequently, the presented tests focus on the parameters with the largest impact on accuracy: the strong pair energy threshold ε w and the LNO occupation limits ε o and ε v .Additionally, approximations that were previously not employed in the context of either any open-shell local CCSD(T) implementation or our open-shell LMP2 method, such as the distant spin polarization approximation discussed in Section 3.8 and the NAFs introduced in Section 3.6, are also investigated.For the remaining truncation thresholds, affecting closed-and openshell systems similarly, such as the BP parameters of the LMO atom list definition, the previously benchmarked values are adapted, which will be tested in combination with all approximations in Section 6.

LNO Selection.
First, the errors introduced by the truncation of the LNO basis of the LISs are illustrated on three systems.The system sizes were chosen to be as large as possible so that the LNO truncation is already active in all domains (with up to 50% of the occupied and 85% of virtual LNOs discarded), while still keeping the possibility of using DF-CCSD(T) energies as reference.The one-particle basis set of cc-pVTZ is chosen for the description of the hydrogen addition of the triphenylmethyl radical, while cc-pVQZ is used for the calculation of the singlet−triplet gap of diphenylcarbene.A composite basis set composed of cc-pwCVTZ for the iron atom, cc-pVTZ for the atoms connected to the iron atom, and cc-pVDZ for all other atoms is chosen to investigate the doublet-sextet gap of the Fe(III) complex, following Radon. 108ince our previous studies found that setting the ratio of the virtual and occupied LNO selection thresholds (ε o /ε v ) to 10 gives satisfactory results, 52,53,58,59 this ratio was kept in this work.
As can be seen from the left panel of Figure 4, the correlation energy errors decrease monotonically and converge to zero when the LNO selection thresholds are tightened (the approximations other than the LNO truncation are turned off in these tests).Moreover, the 99.9% accuracy approached already with the default setting of ε o = 10 −5 and ε v = 10 −6 is highly satisfactory for such nontrivial systems.Considering the right panel of the same figure, the three energy differences of the associated chemical processes also exhibit good convergence.The singlet−triplet energy gap of diphenylcarbene with almost perfect error cancellation already with the loosest thresholds is probably an exception, representative only for very local properties, like spin state differences, localized mostly on a few atoms.Nonetheless, the deviations introduced by the LNO truncation for this system remain below 0.3 kcal/mol for all investigated ε o values, and the trityl RSE and the doublet-sextet gap of the Fe(III) complex also drop below the satisfactory 0.5 kcal/mol value after ε o = 10 −5 .Thus, the default LNO threshold values of ε o = 10 −5 and ε v = 10 −6 are selected, which are recommended also based on our closedshell LNO-CCSD(T) benchmarks 52,53 as both the correlation energy and energy difference deviations are sufficiently converged (with at most 0.14% and 0.53 kcal/mol deviations, respectively, for the examples of this section).We note in passing that this value is not directly comparable to the frozen NO 20,21 or pair NO 48 threshold settings as only the strong pair LMOs are included in our orbital specific LNO density matrices (Section.3.4), while the summation is not restricted for the frozen NO and even more restricted for the orbital pair specific pair NO density matrices.

Strong Pair Classification.
The convergence of the LNO-CCSD(T) energies with the tightening of the pair energy threshold (ε w ) is also investigated.The restriction of the strong pair LMO list, controlled by this threshold, only starts to take effect for somewhat larger systems, preventing one from utilizing canonical DF-CCSD(T) calculations as reference.To  overcome this difficulty, first, we recall that this approximation is the main source of error in the LMP2 correlation energy, which we benchmarked against DF-MP2 for a number of larger open-shell molecules containing 42−81 atoms.Namely, the singlet−triplet gap of artemisinin, the vertical ionization energy of testosterone, and the hydrogen addition to vitamin E succinate were investigated using the aug-cc-pVTZ basis set. 65here, we found that the energy differences converged already at the default ε w = 10 −5 E h LMP2 setting with at most 0.03% relative correlation energy and 0.05 kcal/mol energy difference deviations.
To also assess the accuracy of LNO-CCSD(T) against an affordable reference on the same systems, we set all thresholds to their default value except for the ε w parameter, which we scan (see Figure 5).Therefore, as a reference, the converged calculation with the very tight ε w = 10 −6 E h can be taken.With that, we find the relative correlation energy and absolute energy difference deviations to converge quickly to zero also for LNO-CCSD(T) as the pair energy threshold is tightened.Note that the absolute energy difference considered for vitamin E succinate is about four (two) times larger than that for artemisinin (testosterone), meaning that the slightly larger energy difference deviations observed for vitamin E succinate correspond to similarly high-quality relative deviations for all systems.The value of ε w = 10 −5 E h , yielding highly satisfactory LNO-CCSD(T) results of less than 0.04% correlation energy deviations and highest energy difference deviations of just above 0.1 kcal/mol, can be again safely chosen as default.Our strong pair threshold can be compared to the settings of other local approaches, and for example, matches the value employed in the TightPNO settings of the DLPNO methods. 48This value matches our closed-shell setting as well, where the restriction of the strong pair LMO list was successfully benchmarked for even larger molecules of 92−260 atoms against DF-MP2 reference energies. 66.3.Approximate Long-Range Spin Polarization.The accuracy of the distant spin polarization approximation introduced in Section 3.8 is evaluated on the basis of both local correlation energies and energy differences.Here, the reference energies are obtained from default LNO-CCSD(T) calculations without activating the distant spin polarization approximation.Since the approximation is only active in the EDs that do not contain SOMOs, reasonably large systems should be chosen to activate and test this approach.Results for three medium to large test systems of 81−179 atoms are collected in Table 3.As shown in the last column of Table 3, already at this relatively moderate system size, one-half to twothirds of the EDs without SOMOs can be treated with the more efficient closed-shell formalism.At the same time, all relative correlation energy errors are well below 10 −4 %, indicating only a marginal loss of accuracy.The absolute and relative errors in the computed energy differences of about 0.1−0.4cal/mol or up to 5.4 × 10 −4 % are similarly negligible.It is also worthwhile noting that here the RSE and bond breaking processes in the first and third lines also involve large closed-shell species, where this approximation does not have an analogue eliminating the possibility of error cancellation.In conclusion, the errors introduced by the distant spin polarization approximation are substantially smaller than those caused by the other local approximations employed.These results are in line with the performance of the analogous approximation in our open-shell LMP2 approach, where we have also performed such tests up to 500−600 atoms. 65otably, in that size range, 80−90% of the EDs are free of SOMOs and can be treated with the more efficient restricted algorithms.
5.4.Natural Auxiliary Functions.In this section, the effect of NAF truncation is investigated on three systems of up to 81 atoms with various basis sets.These include the vertical ionization energy of testosterone (cc-pVTZ), the hydrogen addition process of vitamin E succinate (aug-cc-pVTZ), and the singlet−triplet gap of diphenylcarbene (cc-pVQZ).Since the NAF approximation was found very effective in the analogous closed-shell context, 52,53 and the only difference compared to the latter is that here, both spin-up and spin-  down integrals have to be expanded in the same NAF basis, we expect very similar accuracy.
To evaluate the performance of NAFs, LNO-CCSD(T) calculations with default settings except for the scanned ε NAF parameter are compared against the reference obtained with ε NAF = 0 E h .Figure 6 shows the convergence of the correlation energy (left) and energy difference (right) deviations.The correlation energy deviations converge almost completely already with ε NAF = 10 −2 E h , while energy difference deviations approach the reference almost completely monotonically.Since the largest energy difference deviation is already below 0.01 kcal/mol with ε NAF = 10 −2 E h , this value is chosen as the default, again matching the default setting of closed-shell LNO-CCSD(T).

BENCHMARKS FOR SMALL-AND MEDIUM-SIZED SYSTEMS
In this section, the cumulative effects of all local truncation thresholds employed at their default values are benchmarked against canonical DF-CCSD(T) references.Statistical measures for the correlation energy and energy difference deviations are presented for three test sets containing radical stabilization energies, ionization potentials, and spin-state energies.The reference data utilized in this section is available in the Supporting Information.

Accuracy of Local Correlation Energies.
Here, the accuracy of the open-shell LNO-CCSD(T) correlation energies is evaluated against reference DF-CCSD(T) values.The investigated systems include 30 radicals (from the RSE30 test set), 21 cations from IP21, and the 12 aryl carbenes in their triplet state in the AC12 test set.
The statistical measures of the correlation energy deviations of the three test sets are collected in the third columns of Tables 4−6.For all test sets, the MAEs of the relative correlation energy deviations with the triple-ζ quality basis sets are below 0.05% with all errors not exceeding 0.11%.The highest triple-ζ deviation of 0.11% is observed for the oxalic acid cation, where a large (0.35) single excitation amplitude of the reference DF-CCSD(T) solution suggests some multireference character.Apart from this challenging system lying at the boundary of the single reference CC's applicability, all triple-ζ local correlation energy deviations are below 0.1%, which is highly satisfactory already with the default LNO-CCSD(T) settings.When moving to the quadruple-ζ and triple-ζ−quadruple-ζ extrapolated [CBS(T,Q)] results of the RSE30 test set, one observes a slight increase in the MAE and MAX local correlation energy deviations.It is worth noting that the largest relative error is observed in both cases for the methyl radical.Due to its small size, the reference correlation energy values for this system are tiny, greatly increasing the relative correlation energy deviations even for the smallest of absolute errors.Compared to the high quality of MAEs for the correlation energies, it is important to point out that their STD where R• stands for the different radicals containing C, N, O, F, P, and S atoms. 57The deviations of LNO-CCSD(T) radical stabilization energies from reference DF-CCSD(T) ones are collected in the last column of Table 4.With MAEs of at most 0.05 kcal/mol and MAX errors below 0.2 kcal/mol, even for the CBS extrapolated energy differences, the results are highly satisfactory.Here, the 0.050 and 0.063% STDs of the quadruple-ζ and CBS(T,Q) correlation energies, respectively, are important indicators to explain that the triple-ζ, quadruple-ζ, and CBS(T,Q) RSE values are of the same high quality.One might also compare the results obtained with the aug-cc-pV(T+d)Z basis to those of the PNO-UCCSD(T)-F12 50 method, calculated for the same RSE30 test set.Naturally, PNO-UCCSD(T)-F12 and LNO-CCSD(T) are not directly comparable as the former contains additional explicitly correlated terms.Nonetheless, the LNO-CCSD(T) results appear to be comparable to or slightly better than the PNO-UCCSD(T)-F12 results, with default settings providing 0.076 kcal/mol RMS, 0.064 kcal/mol MAE, and 0.192 kcal/ mol MAX errors with the aug-cc-pVTZ basis set. 50It is always important to keep in mind in such comparisons that both the PNO-UCCSD(T)-and the LNO-CC-type methods can be converged to the same local approximation free limit, and here, we compare only the results obtained with the default settings of each method determined according to considerations that are not identical for the two methods.

Vertical Ionization Potentials.
The statistical measures of the vertical ionization potentials of the IP21 test set are shown in Table 5. Considering the fact that the IPs represent quite large energy differences in the range of 8− 14 eV (184 to 323 kcal/mol), the mean average error of 0.007 eV (0.16 kcal/mol) is excellent.The MAX IP deviation of 0.027 eV can be attributed to the benzoquinone cation, the reference DF-CCSD(T) solution of which exhibits a moderately large maximum singles amplitude of 0.15.Removing this outlier, the MAX IP error drops to below 0.02 eV (0.04 kcal/mol), and the standard deviation also decreases to 0.007 eV (0.16 kcal/mol).The PNO-UCCSD-(T)-F12 method achieves comparable or slightly more accurate results for this test set with its default settings and the aug-cc-pVTZ basis set (0.17 kcal/mol RMS, 0.11 kcal/mol MAE, and 0.50 kcal/mol MAX). 50.4.Singlet−Triplet Energy Gaps.Finally, the singlet− triplet spin state energy gaps of the aryl carbenes are benchmarked, and the results are collected in the last column of Table 6.The MAE of 0.24 kcal/mol for the cc-pVTZ basis is somewhat larger than those for the other two test sets but is still only a fraction of a kilocalorie per mole marking chemical accuracy.The slightly increased errors are at least partly explained by the fact that the system sizes of this test set are larger than those of the other two, with the average AC12 system containing more than twice as many heavy atoms as either RSE30 or IP21.To put the results in perspective, let us consider that the basis set incompleteness error of canonical CCSD(T) for the AC12 test set were found to be 2.74 and 0.98 kcal/mol, respectively, for the cc-pVDZ and cc-pVTZ bases, 114 which are significantly larger than the MAX local errors reported here.

PERFORMANCE AND COMPUTATIONAL REQUIREMENTS FOR LARGER SYSTEMS
The current capabilities of the open-shell LNO-CCSD(T) method are illustrated on large-scale calculations on four threedimensional systems containing 175−601 atoms (see Table 7).Of these systems, the Cbl radical and the FeC 72 N 2 H 100 complex represent the higher end of the typical size range, e.g., in homogeneous catalysis applications (ca.175 atoms), while the bicarbonate or DAAO models of 565−601 atoms are in the typical size range of quantum systems in biochemical applications, for example, when modeling proteins with active centers in a quantum mechanics/molecular mechanics (QM/ MM) framework.The Cbl radical and the FeC 72 N 2 H 100 complex pose a formidable challenge for local correlation methods due to the spin-polarized transition-metal atoms near their centers.This leads not only to high strong pair ratios of 22−26% but also to LISs that are somewhat more extended than those of the two protein models.It is therefore not completely surprising to see that the wall clock run times of the domain CCSD(T) calculations for the transition-metal complexes somewhat exceed those for PSII bicarbonate or the singlet DAAO species.Another reason for the unusually long CCSD(T) run time for the FeC 72 N 2 H 100 system is its moderate multireference character, 115 which hinders the convergence of the CCSD iterations, resulting in up to 27 iterations performed in particular domains, compared to the usual 9−11 iterations.
Regarding the other contributions to the run time, it is reassuring that the formally cubic scaling orbital localization and the quadratically scaling pair energy evaluation take negligible time, even for the largest systems.However, the cubic-scaling local DF-HF computations can take time comparable to the LNO-CCSD(T) correlation energy computation for the large protein models, in accord with our experience with even larger proteins of 1000−2000 atoms and our restricted LNO-CCSD(T) implementation. 52,53Moreover, as observed on the singlet and triplet DAAO models, the time required for the integral transformation to the LNO basis increases by only up to 30% compared to the closed-shell algorithm, which increase is mainly attributed to the larger domain sizes within the triplet system.As discussed in Section 3.8, an unrestricted implementation of the open-shell integral transformation would have twice the computational cost of the closed-shell analogue.However, due to the restricted intermediate basis and the distant spin polarization approximation, the computational costs of the most expensive integral transformation steps approach those of the closed-shell algorithm.This small overhead of the open-shell integral transformations means that while this step is one of the ratedetermining parts of the closed-shell algorithm (30% of the singlet DAAO run time), its significance is much reduced in the open-shell case (12% of the triplet DAAO run time).
Looking at the average (maximum) ED sizes of 116−137 (169−353) atoms, at least for the large protein models, the individual domain sizes are saturated, and we are getting close to the linear scaling regime of the ED and LIS computation parts of the LNO-CCSD(T) method.It is also clear that performing hundreds of MP2 energy computations for these domains is well beyond the capabilities of any conventional MP2 implementation.These ED MP2 computations are only feasible with a thoroughly optimized, Cholesky-decomposition-or Laplace-transform-based algorithm and extensive utilization of local approximations. 65,66In these (necessarily) large EDs, we can exploit the locality of the LMOs, use local AO and DF domains, and use redundancy free MP1 amplitude expressions.As a result of these enhancements, the MP2 energy computation in the largest ED with over 180 LMOs and almost 3000 PAOs takes less than an hour.Turning to the average LIS sizes, the necessity and efficiency of the LNO compression is evident.The LISs contain somewhat over half as many occupied and around 6−8 times fewer virtual orbitals after truncation of the ED basis dimensions.Without this compression of the correlated subspaces, the evaluation of hundreds of domain CCSD(T) energies would clearly be unfeasible.
It is interesting to inspect the DAAO systems more closely due to the availability of very similar closed-shell and openshell computations.Here, one immediately notices the surprisingly large maximum domain size of the triplet DAAO model in the last column of Table 7. Containing over 350 atoms, 100 occupied, and 449 virtual LNOs, the size of this domain is unprecedented in LNO-CCSD(T) calculations performed with the default threshold values.The central LMO of this domain is one of the SO LMOs of the system, depicted in the left panel of Figure 7.This orbital spreads across the entire FAD moiety and, to a small extent, even spills onto the peroxide molecule.The large spatial extent of this orbital means that it forms an unusually large number of strong pairs with other LMOs.This in turn leads to a large number of ED atoms and, finally, to large LIS orbital dimensions.
Beyond the inherently delocalized electronic structure of the reduced FAD, another reason for the extreme spatial extent of this SO LMO is found in the restricted nature of the employed orbital localization scheme.During localization, the SOMOs are mixed only among themselves in order to avoid the splitting of the spin-up and spin-down occupied orbitals.The triplet state of the DAAO model contains merely two SOMOs, but at the same time, the singly occupied subspace spreads across the entire FAD moiety and the peroxide molecule.Due to the large spatial extent of the SO subspace and the small number of orbitals spanning it, it is not possible to localize the SO LMOs better in a restricted scheme.A (partial) remedy could be to employ a spin-unrestricted localization scheme, where the two SOMOs would be allowed to mix with all other spin-up orbitals.This could probably lead to orbitals that are more localized than the highly delocalized SOMO in question at the cost of introducing spin splitting to the LMOs, doubling the number of domains in all computations.Despite the complex electronic structure and extreme domain size, the LNO-CCSD(T) energy contribution of this delocalized central LMO can still be evaluated on a single CPU.This would not be possible without efficient local approximations, such as the compact NAF and LNO orbital spaces, the highly optimized CCSD(T) code featuring integral direct evaluation of fourcenter more-than-two-external ERIs, and the redundancy free Laplace-transform (T) formulation of LNO-CCSD(T).The broader significance of the present triplet DAAO LNO-CCSD(T) computation is that it represents (i) the largest CCSD(T) computation in the literature with over 600 atoms and more than twice as many orbitals as in the def2-SVP level DLPNO-based PSII bicarbonate computation of Kumar et al. 56 and (ii) probably the most complicated system studied so far concerning the issues with delocalized SOMOs, illustrating that such problematic situations can also be handled with highly optimized implementations.
It is also encouraging to see the comfortably manageable minimal memory requirements in the last line of Table 7 despite the complexity of the investigated systems and the employed triple-ζ quality basis set.Less than 20 GB of minimal memory is required for most systems in Table 7, while with about 90 GB, LNO-CCSD(T) was feasible, even for the most challenging triplet DAAO system.Such low-memory requirements are the result of the highly optimized integral evaluation and transformation routines and the tiled computation of the most memory intensive CCSD(T) terms, which removes the need to store any four-center ERI arrays with more than two external indices in the memory.Additionally to the small minimal memory requirements, the wall-clock run times of 1.5−4.5 days also indicate that LNO-CCSD(T) calculations for open-shell systems containing several hundred atoms, at least with reasonable triple-ζ quality basis sets and the default settings, have become routine tasks even on a single many-core CPU with easily accessible amount of memory.
Lastly, the reaction energies and energy differences obtained with the LNO-CCSD(T) method for the four large examples discussed above are collected in Table 8.Comparing the estimates of the energy differences obtained at the LMP2 and LNO-CCSD(T) levels of theory, one observes significant differences for all applications.The deviation of the relative energy estimates between these two methods is smallest for the DAAO oxidation and dAdoCbl formation processes; however, with 2.1−2.4 kcal/mol, this deviation is still clearly outside of the range of chemical accuracy typically defined to be 1 kcal/ mol.Moreover, for the quintet−triplet energy differences of the two iron complexes, the disagreement between the two methods increases to 6.6−6.8 kcal/mol.These results are in accord with the observations emerging more and more often in the literature, enabled by large-scale CCSD(T) benchmarks, namely, that second-order perturbation theory can often be insufficient to accurately describe such challenging extended systems and processes.This also verifies the need for higherorder and more reliable methods.

SUMMARY AND CONCLUSIONS
A high-spin open-shell extension of the asymptotically linearscaling LNO-based CCSD(T) method is presented.The efficiency of the open-shell algorithm approaches that of our closed-shell LNO-CCSD(T) method 52 for large molecules due to the utilization of ROHF or ROKS reference determinants, the use of restricted open-shell intermediate basis sets, and a novel approximation of long-range spin-polarization effects at the CCSD(T) level.For compact molecules, where the domain CCSD(T) computations are rate-determining, the open-shell LNO-CCSD(T) method requires at least 2−3 times more operations and data than the closed-shell variant, just as for alternative local CCSD(T) approaches.The presented method is identical to its closed-shell counterpart when closed-shell systems are considered, enabling the combined use of these algorithms to efficiently compute consistent energy differences between open-and closed-shell species.
The proposed approach efficiently extends and generalizes the optimized algorithms developed for our other local correlation methods to the open-shell LNO-CCSD(T) case, 52,53,62,65,66 for example, the algorithms for domain constructions, auxiliary basis compression, integral transformation, and LNO construction.Its implementation is integral-direct, memory and disk use-economic, and OpenMP-parallel, although the scaling of the current version above 10−20 cores should be improved.The local approximations are defined without relying on empirical parameters, such as real-space cutoffs of fragment definition, and adapt to the complexity of the wave function by construction.Additionally, a highly accurate open-shell local MP2 correlation energy is also obtained without additional computational cost as part of the LNO-CCSD(T) calculation.The open-shell LNO-CCSD(T) code will be made available open-access for academic use in a forthcoming release of the MRCC program suite. 102,103he effect of the local approximations and the corresponding thresholds is profiled on both energy differences and a ΔE HF �DF-HF, ΔE LMP2 �LMP2 correlation contribution, ΔE LNO-CCSD(T) �LNO-CCSD(T) correlation contribution, ΔE total LNO-CCSD(T) �total LNO-CCSD(T).
correlation energies.Systematically tightening these thresholds, satisfactory convergence toward the exact CCSD(T) energies is observed.The errors caused by the local approximations using the default threshold values are benchmarked on diverse test sets of radical stabilization energies, vertical ionization potentials, and singlet−triplet energy gaps.For systems where the exact CCSD(T) reference is accessible, mean absolute errors are found to be below 0.25 kcal/mol, while maximum errors do not exceed 0.7 kcal/mol and are below 0.5 kcal/mol for all but the most challenging systems.
The performance of the open-shell LNO-CCSD(T) algorithm is demonstrated on systems of 175−601 atoms using triple-ζ quality basis sets.Among these, a triplet state of the O 2 reduction process via a D-amino acid oxidase model of 601 atoms is especially challenging, not only due to its unprecedented size with over 11,000 basis functions but also because of a singly occupied molecular orbital delocalized over about 20 atoms.The complicated triplet electronic structure of the 565-atom model of the bicarbonate protein in photosystem II is also investigated using about twice as many basis functions as in a previous local CCSD study. 47Each of these calculations is feasible using less than 100 GB of memory on a single node of 20−40 cores within ca.2.5−4.5 days of wall clock run time.These results demonstrate that the presented LNO-CCSD(T) algorithm extends the applicability range of open-shell CCSD(T) calculations up to biochemical systems of 500− 600 atoms using reasonable triple-ζ basis sets and commodity hardware.Having the ROHF/ROKS/QRO reference at hand, which can become relatively costly in this size range, the openshell LNO-CCSD(T) method should scale to even larger systems/basis sets, which approaches the capabilities of closedshell LNO-CCSD(T), currently standing at about 1000−2000 atoms and 45,000 basis functions. 52,53APPENDIX

Derivation of Virtual Density Matrix Fragments
In this Appendix, the orbital notation of the main text is simplified using i, j, k... and a, b, c... indices for the occupied and virtual (semi)canonical orbitals, respectively.Again, lowercase letters stand for spin-up, uppercase letters label spin-down orbitals, while MP1 amplitudes are denoted by t.Moreover, expressions only for the spin-up density matrix fragments are shown since their spin-down counterparts can be obtained by mirroring all spins of the expressions.
The spin-up virtual second-order density matrix reads as The other two forms can be obtained by recognizing that the last two terms of eq 28 can be combined together in two ways.One can either switch the order of the occupied indices of the last term of eq 28 and rename them to I → J, j → i, yielding   Here, we exploited that exchanging the occupied indices of the antisymmetric MP1 amplitudes introduces two sign changes that cancel each other (t Ij aC = −t jI aC ).Transforming the i index to the LMO basis and restricting its summation to the central LMO, one obtains a virtual density matrix expression different from the first one  If this form were to be used after restricting the summations to the central LMO index, both the central LMO with spin-up and spin-down occupation would contribute to the spin-up density matrix.However, the contribution of the spin-down central LMO would be weighted twice as high as that of the spin-up central LMO.
■ ASSOCIATED CONTENT

Figure 2 .
Figure 2. Structures of the six systems of intermediate size used to investigate convergence of the local approximations.Top row: trytil radical and triplet artemisinin; middle row: diphenylcarbene in its triplet state and testosterone cation; bottom row: vitamin E succinate radical and the 4′ Fe(III) complex of Radoń1 08 considered both with doublet and sextet multiplicities.

Figure 3 .
Figure 3. Structures of the four largest systems considered.Top row: the 5′-deoxyadenosylcobalamin (209 atoms, out of which the Cob II alamin (Cbl) radical part has 179 atoms) and the 175-atom iron(II) complex FeC 72 N 2 H 100 .Bottom row: a 565-atom model of photosystem II bicarbonate and the O1 T structure of D-amino-acid oxidase containing 601 atoms.

Figure 4 .
Figure 4. Deviations of the LNO-CCSD(T) energies from reference DF-CCSD(T) values as a function of the LNO selection thresholds.Left panel: relative correlation energy differences; right panel: deviations in energy differences (s.s. = spin state).The ratio of the occupied and virtual LNO selection thresholds were kept constant, ε o /ε v = 10.For a description of the systems and basis sets utilized, see Section 5.1.

Figure 5 .
Figure 5. Relative LNO-CCSD(T) correlation energy (left) and absolute energy difference (right) deviations.A converged LNO-CCSD(T) calculation with ε w = 10 −6 was used as reference.See Section 5.2 for a description of the utilized systems and basis sets.

Figure 6 .
Figure 6.Errors of LNO-CCSD(T) energies and energy differences computed with various NAF truncation thresholds.The reference energies are obtained from LNO-CCSD(T) calculations with ε NAF = 0 E h .Left panel: relative correlation energy differences; right panel: absolute deviations in energy differences.

Figure 7 .
Figure 7. D-amino-acid oxidase model with its two singly occupied molecular orbitals depicted with blue and yellow surfaces.The left panel shows the highly delocalized orbital spreading over the entire flavin adenine dinucleotide, while the other singly occupied orbital localized almost exclusively on the peroxide radical can be seen on the right.
of the occupied orbitals, say the i and I indices, to the LMO basis, one has three different choices to continue the derivation.If we first restrict the summations to the central LMO index, we arrive at the virtual density matrix expression used in the presented open-shell LNO-CCSD(T) algorithm (eq 19).

Table 2 .
Default Threshold Values Used in This Study 102,103of the corresponding keyword in the MRCC program package.102,1034.2.Benchmark Sets and Test Systems.
and reoptimized in ref 57.Next, a test set of adiabatic ionization potentials for 21 organic molecules (IP21) is considered as defined in ref 57 (with structures reoptimized in ref 65).Lastly, singlet−triplet energy gaps of aryl carbenes (AC) are also investigated invoking the AC12 test set as compiled in ref 114.To investigate the convergence of the local approximations as a function of the corresponding thresholds, six larger openshell systems containing 23−80 atoms are also selected.The structures of vitamin E succinate, trityl radical, and artemisinin are obtained from ref 47, testosterone is taken from ref 57, whereas diphenylcarbene comes from the AC12 test set of ref 114.

Table 3 .
Accuracy of the Long-Range Spin Polarization Approximation Compared to Reference LNO-CCSD(T) Correlation Energies and Energy Differences Obtained without This Approximation

Table 4 .
Relative Correlation Energy Deviations and Absolute Errors of the LNO-CCSD(T) Reaction Energies for Radical Stabilization Energies of the RSE30 Test Set Using Default Thresholds

Table 6 .
Relative Correlation Energy Deviations and Absolute Errors of the LNO-CCSD(T) Singlet−Triplet Energy Gaps of the AC12 Test Set by Using the Default Thresholds

Table 7 .
Average (Maximum) Domain Sizes, Orbital Space Dimensions, DF-HF and Correlation Energies (in E h ), Wall-Clock Times (in h) a , and Memory Requirements (in GB) for LNO-CCSD(T) Computations of Large Molecules Using two 20-core, 2.4 GHz Intel Xeon Gold 6148 CPUs.b Using the default local fitting domain size.The final iteration with larger fitting domains took about 3.5−4.8times longer.c DF-HF energies calculated with semicanonical QRO orbitals. a

Table 8 .
Contributions to Reaction Energies and Spin-State Gaps in kcal/mol for the Four Largest Examples Evaluated with the def2-TZVP Basis a However, in this form, the central LMO with spin-down occupation does not contribute to the spin-up virtual density matrix fragment at all.For the last form, one can perform the exchanging and renaming of the occupied indices on the second term of eq 28, yielding