Effective Reconstruction of Expectation Values from Ab Initio Quantum Embedding

Quantum embedding is an appealing route to fragment a large interacting quantum system into several smaller auxiliary “cluster” problems to exploit the locality of the correlated physics. In this work, we critically review approaches to recombine these fragmented solutions in order to compute nonlocal expectation values, including the total energy. Starting from the democratic partitioning of expectation values used in density matrix embedding theory, we motivate and develop a number of alternative approaches, numerically demonstrating their efficiency and improved accuracy as a function of increasing cluster size for both energetics and nonlocal two-body observables in molecular and solid state systems. These approaches consider the N-representability of the resulting expectation values via an implicit global wave function across the clusters, as well as the importance of including contributions to expectation values spanning multiple fragments simultaneously, thereby alleviating the fundamental locality approximation of the embedding. We clearly demonstrate the value of these introduced functionals for reliable extraction of observables and robust and systematic convergence as the cluster size increases, allowing for significantly smaller clusters to be used for a desired accuracy compared to traditional approaches in ab initio wave function quantum embedding.


I. INTRODUCTION
Quantum chemical methods to describe explicit correlations in an ab initio many-electron system can be highly accurate, though their applicability is often stymied by a steep computational scaling with respect to system size which (despite significant recent progress) limits their use for extended systems 1-6 .To combat this, the locality of this correlated physics is increasingly exploited, enabling a reduction in scaling to be competitive compared to mean-field or density functional approaches, whilst remaining free from empiricism 7,8 .The field of 'local correlation' methods in quantum chemistry generally build these locality constraints in the particle-hole excitation picture of the system, localizing each of these spaces separately [9][10][11] .While highly related, 'quantum embedding' approaches from condensed matter physics are also increasingly coming to the fore as an alternative paradigm and applied to quantum chemical and ab initio systems 12 .
A loose (and necessarily imperfect) characterization of a key difference in these approaches could be that quantum embedding does not build this locality from a particle-hole picture-rather, a fully local set of 'atomicorbital-like' degrees of freedom are chosen initially (which will in general have neither fully occupied or unoccupied mean-field character), which we will call the 'fragment' space, though is also often called the 'impurity' space for historical reasons in traditional quantum embedding literature.A larger space is then constructed by augmenting these fragment orbitals with additional orbitals (often called 'bath' orbitals).These are designed to reproduce the quantum fluctuations, entanglement and/or hybridization between the fragment and the rest of the a) Electronic mail: george.booth@kcl.ac.uk system, as characterized by some tractable (generally mean-field) level of theory which can be performed on the full system.These individual local quantum problems of the fragment and bath orbitals define a 'cluster', which is then solved to provide the correlated properties of the original fragment space, potentially with a subsequent self-consistency then applied to update the original mean-field/low-level theory on the full system.
The general algorithm in most quantum embeddings is therefore summarized as a) fragment the system, b) for each fragment, construct a bath space describing the coupling to the wider system, c) solve an interacting problem in the cluster space of each fragment via a 'high-level' correlated method, d) extract properties of the system, e) optionally, a self-consistency is performed to embed the correlated effects from the cluster model back into the low-level full system method to update the coupling between fragment and environment.There are a large number of choices and variations within this general framework, including (but not limited to) how the bath space is defined (including the choice of 'lowlevel' theory) [13][14][15] , how the interacting cluster Hamiltonian is constructed and solved [16][17][18][19] , and the choice of self-consistent requirements [20][21][22] .Furthermore, the fundamental quantum variables by which these quantities over the different spaces are characterized can vary, with dynamical mean-field theory and its variants working in a Green's function (dynamical) formalism [23][24][25][26][27][28][29] , while density matrix embedding theory (DMET) and its variants work in a wave function (static) formalism [30][31][32][33][34][35][36][37][38][39][40][41][42][43][44] , although these two approaches are not fundamentally distinct, and can also be rigorously connected via a common framework [45][46][47] .
Much recent progress has been made in these various quantum embeddings and their application to ab initio systems, including the use of quantum computation as a high-level solver [48][49][50][51][52] .The key point in all of these embedding approaches however, is that the scaling with respect to the full system size is defined by only the scaling of the low-level (often mean-field) method, given the local nature of the auxiliary cluster problems.Furthermore, the choice of high-level solver and arbitrary atomicorbital-like fragmentation allows for spaces which capture strong (albeit still local) correlation effects, beyond the traditional constraints of the particle-hole picture of most local quantum chemistry.
In this work we focus on a critical aspect of quantum embedding which we believe has received less attention, but which has substantial ramifications for its accuracy and applicability.This concerns how non-local properties (including the total energy) of the full system can be reconstructed from the independent cluster solutions of each fragment.We will assess the effect of the inherent locality approximation of quantum embedding on the convergence of different functionals of these non-local expectation values, and motivate and demonstrate new approaches which substantially accelerate convergence with respect to fragment and bath size in the embedding.While this is quite a technical work, the outcome is that general design principles by which different functionals can be devised become clear, including the N -representability of these estimates.
Here, we focus on wave function-based quantum embedding (we believe that the ability and approach for constructing appropriate functionals in a Green's function perspective is clearer).We start from the density matrix embedding theory as the parent wave function approach to quantum embedding 30,33 .We always consider fragments consisting of a single atom only and, where we seek to systematically improve expectation values, enlarge the cluster spaces by adding additional interacting bath orbitals.We believe that this is an efficient and 'black-box' approach, and avoids the ambiguities and non-monotonic improvements in the alternative of defining a systematic expansion in the fragment sizes for ab initio systems, which also can suffer from reducing the symmetry of the problem 53 .
The expansion of the bath space is defined from the approximate interacting density matrix (or instantaneous hybridization) between the fragment and environment at a simple approximate second-order perturbation theory (MP2), and is controlled by a single cutoff parameter as detailed in Ref. 53.This provides cluster-specific bath natural orbitals (BNOs) as a controllable, reliable and well-defined expansion of the bath space (and hence overall cluster space) of a given fragment.Furthermore, in this work we will neglect considerations of self-consistency of the original mean-field (beyond where self-consistency is required for meaningful extraction and comparison of expectation values, e.g. to control total electron number).More extensive self-consistency to qualitatively change the original full-system mean-field will be considered in future work 20,21 , but is unlikely to change the conclusions of this work, especially as convergence with cluster size (either via expansion of the fragment or, as in this work, an interacting bath) obviates the effect of self-consistency.
We start the paper recapitulating the original DMET 'democratic partitioning' for expectation values, which can be computed via fragment-projection of the reduced density matrices of each fragment 33 .We then describe an improved approach for the total energy based around a cumulant functional for the two-body effects.We move on to an approach based around direct projection of wave function amplitudes, rather than density matrices, formally satisfying N -representability conditions (not satisfied in the aforementioned density-matrix approaches) and substantially improving expectation values.Finally, practical approaches and further approximations to these will be described to define an efficient protocol for arbitrary expectation values and high-level wave function descriptions.These different approaches are benchmarked for energetics and other non-local properties (spin correlation functions) on both molecular systems via the W4-11 test set 54 , and periodic systems, finding an efficient approach for reconstruction of non-local observables in quantum embedding of general fragmented systems.) Bond length (Å)  Γ x ] corresponding to the democratically partitioned energy expression of traditional DMET 33 .Different energies correspond to different total energy functionals, constructed from the same solutions to the two DMET embedding problems (with fragment spaces of the (Löwdin-orthogonalized) core and valence orbitals of each Chlorine atom).The two cluster spaces are defined by 10 orbitals (9 fragment and 1 bath), with the full space comprising 18 orbitals.The motivation and definition of these different energy functionals from the embedded wave functions are given in the rest of this work, with more details on this system (and others like it) discussed in section IV.
In Figure 1, we show results from the different total energy functionals described in this work, as a representative illustration of the significant difference that this choice can make in a simple system (Chlorine dimer, min-imal basis, two atomic fragments).For all the different total energy functionals shown in Fig. 1, the same identical fragment and DMET cluster space is used, and the same (exact) solver and wave function are found from each cluster 30,31 .The only difference is the choice of energy functional to reconstruct the total energy from the two cluster solutions, with E[γ x , Γ x ] representing the traditional 'democratically partitioned' DMET energy used primarily in the literature to date 33 .
As a summary regarding the reconstruction of expectation values from cluster solutions in this work, we find the main conclusions to be the following: 1.It is advantageous to separate factorizable products of lower-rank contributions to expectation values where possible.In this way, we can construct e.g. the two-electron energy from the one-body density matrix and two-body cumulant, rather than the two-body density matrix directly.We demonstrate that this improves estimators via the inclusion of implicit cross-cluster contributions to the expectation values.Many of the improvements in this work arise from implicitly building in contributions to expectation values (or the wave function itself) from products of different cluster contributions, coupling the individual cluster solutions in a non-local fashion, and minimizing the local approximations inherent to the embedding framework.
2. N -representability of estimators can be used as an effective guiding principle, i.e. that they can be derived from a valid 'global' wave function from the combined cluster wave function solutions.We show how this can be achieved exactly.Defining this 'global' wave function ensures conservation of many quantum numbers e.g. total electron number, which removes the necessity for costly (and sometimes ill-defined) self-consistency conditions, and can allow for variational estimators 40 .Beyond this, enabling convergence with respect to bath size of each fragment can render self-consistency entirely unnecessary.
3. It is possible to construct estimators from this welldefined global wave function by casting the expectation value as a functional of the wave function amplitudes of each cluster, rather than combining density matrices or observables directly.Each factor of the wave function amplitudes in the expectation value has their occupied indices (symmetrically) projected onto the fragment spaces of each cluster to avoid double-counting, and a sum over all cluster solutions for each factor of the wave function parameters is introduced.

4.
Along with dramatically improved estimates, this wave function approach furthermore avoids the requirement for optimizations of chemical potentials, as the global electron number is strictly conserved, and the condition of the union of the fragment spaces is only that they span the occupied, rather than the full orbital space of the original system in order to converge to exact results (e.g. as the (interacting) bath is expanded to completeness for each cluster), vastly reducing the burden of large fragment spaces spanning virtual orbitals for calculations in realistic basis sets.
5. If the approach above results in an intractable scaling with respect to number of fragments in the system, then a principled approximation can be made, which we show can fortuitously lead to effective cancellation of errors and an even faster convergence of the ground state energy to the exact global expectation values.
We motivate and evidence these conclusions in the following sections, resulting in our final recommendation for an approach to one-and two-body expectation values from DMET and related wave function quantum embeddings in ab initio systems, while retaining at most an O(N 3 ) scaling with system size in the evaluation of these observables.We detail the practical implementation of these schemes for both an exact high-level solver, and an (arbitrary order) coupled-cluster framework.All results can be reproduced from our recently released Vayesta codebase for quantum embedding 55 , which interfaces with the PySCF code 56,57 , with scripts to generate many of the results of this work (including the input required for Fig. 1) found in the Supplementary Information (SI).

A. Democratic partitioning of density matrices
Expectation values in DMET derived from operators which span more than one fragment are calculated from 'democratically partitioned' reduced densitymatrices (RDMs) 33 .These can be written as where γ and Γ are one-and two-body reduced density matrices formed from the high-level solution of each cluster problem, as where the high-level cluster wave function, |Ψ x , includes the contribution from the doubly occupied environmental orbitals of each cluster.P x is an operator, which introduces a projection onto the fragment subspace of the cluster x in a symmetrically averaged fashion, e.g.
In these, x labels both the N frag fragments and clusters, for which there is an unambiguous one-to-one correspondence, and P x pq is the fragment projection matrix acting over the whole molecular orbital (MO) space, as with C representing the MO coefficients, S the atomic orbital (AO) overlap matrix, and C x f the coefficients of the fragment orbitals in cluster x.Note that a post meanfield cluster solver will only modify the 1-RDM in the cluster-cluster part, whereas the 2-RDM also acquires contributions in off-diagonal cluster-environment parts.As a result, the projection in Eq. ( 1) can be performed in the respective cluster spaces, whereas in Eq. ( 2) it has to be performed in the full system space, in order to take these changes in the off-diagonal parts into account.
The projection onto the fragment space of each cluster is required since the bath spaces overlap significantly between different clusters, and we must project out any double counting arising from contributions from overlapping bath spaces.In contrast, the partitioning of the system into fragment spaces is considered to be a disjoint fragmentation, where the set of all fragment orbitals are an orthonormal set with no overlapping fragment spaces in different clusters.From the democratically partitioned density-matrices, the total energy can be calculated as Note that in practice one can avoid forming the full system density-matrices to calculate the total energy and instead calculate energy contributions directly from the individual cluster density-matrices over purely the cluster degrees of freedom.The contribution from the doubly occupied environmental orbitals can be integrated out, by forming the Coulomb-and exchange potential of the unentangled occupied orbitals of each DMET cluster via an effective one-electron potential.This leads to the more common expression for the DMET energy 33 , equivalent to Eq. ( 8) via construction of democratically partitioned density matrices of Eq. ( 5), as where γ x and Γ x refer to the cluster reduced density matrices, with the projector purely acting in this cluster space, and N x cl denoting the number of orbitals in the cluster x.The energetic effect of these states (static Coulomb and exchange contributions) is then included via the construction of the hx , which includes the potential from these unentangled states to the one-body hamiltonian as 1 2 mn (pq||mn)γ x core,mn where γ x core is the density matrix from these core states.We can exploit fragments that are (by symmetry) in identical chemical environments by only computing the cluster solutions and energy contributions of symmetry-unique fragments.In the rest of this work, the expression E[γ x , Γ x ] will denote this democratically partitioned energy functional, shorthand for E[{γ x }, {Γ x }], denoting that the energy is computed from the set of individually constructed cluster one-and two-body RDMs.
This energy expression is exact when two conditions are satisfied.First, it requires that the fragmentation of the full system is complete, i.e. that the union of the fragment spaces spans all degrees of freedom of the system.This condition ensures that the trace of the sum of the different fragment projectors is exactly equal to the total number of orbitals in the system, or alternatively, that While it is a relatively mild condition to ensure that the combined fragment spaces span the generally localizable occupied space, ensuring that they span the (generally much larger) high-energy virtual space is harder to achieve and leads to much larger fragment spaces.This has required DMET simulations in realistic basis sets to augment fragment spaces with projected atomic orbitals (PAOs) 58 to ensure this condition is fulfilled for reasonable results 35 .The second criteria which must be fulfilled, is that the individual cluster density matrices must be exact, which in general will require the clusters of each fragment themselves to be enlarged to completeness, either by increasing the size of the fragment or (interacting) bath space.This ensures that |Ψ x → |Ψ and the projected density matrices of the clusters (Eq.5) are equivalent to the projections of the exact density matrix over the whole system.Combined with the completeness of the projector (Eq.10), this will lead to the exact energy from Eq. ( 9).Away from this exact limit, there are a number of drawbacks to this approach to compute properties from the DMET solutions.Firstly, the reconstructed fullsystem density matrices (Eqs.(1) and (2)) from the projected cluster solutions are not N -representable, meaning that they cannot be derived from a valid wave function.This can be seen as the democratically partitioned 1-RDM of Eq. (1), cannot be rewritten as a simple expectation value Ψ|c † p c q | Ψ of some wave function | Ψ .As a specific consequence, this can result in eigenvalues (occupation numbers) becoming negative or greater than two (in a restricted basis), violating the Pauli principle, and removing any variational guiding principle in the method 40 .Furthermore, conserved quantities and good quantum numbers in the individual cluster solutions such as electron number (N ), spin and its z-projection (S z , S 2 ) and other symmetries are not maintained in these composite full system descriptions.
To mitigate some of these effects, a global chemical potential (or potentially a fragment-specific chemical potential) is almost universally optimized in DMET to ensure that at least an exact, integer number of electrons is recovered in these democratically partitioned density matrices 33,59 .This can move electrons between the fragment and bath of each correlated cluster solution, such that the known global electron number is maintained as a constraint.While this corrects one known global quantum number in the density matrices, it does not correct others, and does not in general restore N -representability of the full system RDMs.Furthermore, this necessitates costly additional self-consistent loops over the high-level calculations.Furthermore, this requirement of a chemical potential optimization to a known total electron number further underlines the importance of the constraint of Eq. (10), ensuring that the full (occupied and virtual) space is spanned by the fragments, as all can be partially occupied in the correlated state and the democratically partitioned density matrices must trace to the correct electron number.Numerical demonstration of the breaking of these N −representability constraints in the democratically partitioned RDMs will be given in Sec.IV (with and without a global chemical potential optimization), also showing the deleterious effect on properties and computed energetics of the system that result.

B. Democratic partitioning of cumulants
In the following, we propose a simple alternative for the construction of democratically partitioned two-body density-matrices from DMET clusters, from which expectation values such as the energy can be calculated.Instead of partitioning the 2-RDM directly as in Eq. (2), we partition the two-body cumulant, K, defined (in a restricted basis) 60 via The non-cumulant (disconnected) contributions to the 2-RDM can then be reconstructed from the democraticallypartitioned one-body density-matrix, given by Eq. (1), such that Eq. ( 2) is replaced by The difference between Eq. ( 2) and ( 13) lies purely in the non-cumulant contribution to the two-body density matrix, which can be written as the product of democratically partitioned 1-RDMs.In the standard DMET partitioning of Eq. (2) these are taken from a single embedding problem at a time, whereas the partitioning of Eq. ( 13) contains 'cross-cluster' contributions, which can be seen by inserting Eq. (1) into the first term of Eq. ( 13): In this way, the non-local (correlated) one-body physics of two distinct clusters, x = y, contribute to global two body expectation values; the same is not possible in Eq. (2).This is expected to be important in cases where the orbitals p and q are far from the orbitals r and s, and are not spanned together in any single DMET cluster.In this case, the conventional DMET partitioning of Eq. (2) will not account for the relaxation of the external Coulomb-and exchange potential of a fragment due to the (potentially correlation-induced) density changes within the other orbital set.In contrast, this will be implicitly included in the partitioning according to Eq. ( 13).This is the key physics where we expect a partitioning of cumulants to be superior for two-body physics to the traditional democratic partitioning approach of Sec.II A for non-local expectation values.We will denote any total energies resulting from the democratically partitioned cumulant approach described in this section as E[γ x , K x ] in the rest of this work.We present a simple example showing the difference between the partitioned density matrices and the partitioned cumulant approach in Fig. 2, for a representative system of N 2 in a minimal basis set.The fragment space consists of the five symmetrically (Löwdin) orthogonalized atomic orbitals of a single atom (the 1s, 2s, 2p x , 2p y , 2p z spaces) with the DMET bath consisting of an additional three bath orbitals consistent with the bond order of the dimer in a minimal basis.The DMET cluster in this example therefore contains 10 electrons in 8 orbitals, which is compared to the full system of 14 electrons and 10 orbitals, and is solved with exact diagonalization (FCI).It is found that regardless of whether the fragment chemical potential is optimized or not, the democratically partitioned cumulant results in a substantial improvement in the energy, with the partitioned cumulant being almost unaffected by this chemical potential optimization.This is further consistent with the Cl 2 dimer shown in Fig. 1 and analyzed in more detail in Sec.IV, while the improvement in non-energetic properties (the spin-spin correlation function) from the cumulant partitioning will be shown in Sec.VI C. A minor technical detail to mention is that we use a slightly different partitioning in practice than that of Eq. ( 12) throughout this work.The correlated cluster 1-RDM can be decomposed as γ = γ 0 + ∆γ, where γ 0 is the reference mean-field density-matrix.We then work with a slightly modified cumulant definition, K, as The approximate cumulant K of Eq. ( 15) therefore contains the (∆γ) 2 -terms which are not present in the true cumulant K, with the relation between the two given as The final expression for the 2-RDM is then Γ pqrs = γ 0 pq γ 0 rs + γ 0 pq ∆γ rs + ∆γ pq γ 0 rs − 1 2 (γ 0 pr γ 0 sq + γ 0 pr ∆γ sq + ∆γ pr γ 0 sq ) where (in contrast to the democratically partitioned 2-RDM of Eq. 2), the projection of the cumulants can be performed in the cluster space only, since the correlated cluster solver will only lead to a non-zero cumulant in this space.This allows the total energy functional in terms of γ x and K x to be written as where E HF [γ 0 ] and F [γ 0 ] are the Hartree-Fock energy and Fock matrix corresponding to the reference meanfield density-matrix γ 0 , and ∆γ is the correlated part of the democratically-partitioned 1-RDM, formed from the cluster density matrices, γ x , as shown in Eq. (1).Note that Eq. ( 18), in contrast to the conventional DMET energy functional (9) which it replaces, does not involve a cluster-specific effective core-Hamiltonian.We find that democratically partitioning the approximate cumulant instead of the true cumulant gives almost identical results for all systems tested in this paper.However, in the case of the true cumulant, the Fock matrix of the correlated 1-RDM, F [γ 0 + ∆γ], would need to be constructed to calculate the energy, a step that scales as O(N 4 ) with respect to the system size N .For this reason, we use the democratically-partitioned approximate cumulant and calculate the total energy according to Eq. ( 18), referred to as E[γ x , K x ] in the results of this paper.

III. GLOBAL EXPECTATION VALUES FROM CLUSTER WAVE FUNCTIONS
While the democratically-partitioned cumulant described in the previous section can dramatically improve two-body properties (as will be further shown later), it still suffers from the same fundamental problem discussed in Sec.II A. This is that the resulting density matrices are not generally N -representable, i.e. they do not correspond to a physical fermionic wave function over the system.In this section, we go a step further and present an alternative paradigm of directly combining the cluster wave functions rather than RDMs to overcome this representability issue.This was first proposed in a correlation energy functional in Ref. 53 (described in Sec.III C), but is now generalized and expanded.
The basic idea is to consider an implicit full-system wave function reconstructed from the cluster wave functions themselves, as From this global state, expectation values can be defined which resolves N -representability issues and results in improved estimators.For instance, the 1-RDM can be written as As long as all projected cluster wave functions P x |Ψ x are physical, fermionic wave function in their own right, then the linear superposition of Eq. ( 19) is a physical, fermionic wave function as well.As a result, the 1-RDM of Eq. ( 20) is trivially N -representable, as it is simply Ψ|c † p c q |Ψ with the "global" wave function |Ψ given by Eq. ( 19).Higher order density-matrices can be constructed in the same way and are thus also Nrepresentable.Furthermore, the full system energy computed from RDMs constructed in this fashion will be variational if the cluster solvers are themselves variational methods.It is not difficult to rationalize that expectation values would also be improved from the formulation as sketched for the 1-RDM in Eq. ( 20) compared to the democratically partitioned RDMs.This is because a large number of 'cross-cluster' contributions to these expectation values are included, where wave function amplitudes from different clusters are combined beyond their mean-field contributions found in Eq. (11).

A. Partitioning of general wave functions
So far we have not specified what we mean by the projection of a cluster wave function as P x |Ψ x in practice.Since the DMET bath orbitals guarantee that the mean-field reference determinant |Φ is represented exactly in each cluster (ignoring the unentangled environment orbitals) 30 , it is convenient to define P x in terms of its action on the correlated part of the wave function only, i.e. our full system wave function can be written as where |∆Ψ x = |Ψ x − |Φ and intermediate normalization is assumed.
We then choose to represent the correlated cluster wave function part |∆Ψ in the basis of particle-hole excitations around the reference determinant in a linear wave function ansatz where we use i, j, . . .(a, b, . . . ) to represent general occupied (virtual) orbitals in this work.We choose the fragment projection of this wave function to be defined by its symmetric action on the occupied coefficient indices, for example in the case of the single and double excitation coefficients in the cluster, C x 1 and C x 2 , with generalization to higherorder excitation levels straightforward.Indices in these expressions correspond to the cluster canonicalized orbitals, with a critical consequence of the DMET (and BNO) bath construction being that the n-fold excitations of the cluster space are entirely spanned by the n-fold excitations of the full system due to the coincidence of their reference states.The fragment projection operators acting in the occupied-only space of the cluster can be formulated as where C x f represents the AO coefficients of the fragment orbitals (x) of fragment x, and C x represents the coefficients of the occupied cluster orbitals (in contrast to the projector defined in the full-system MO space of Eq. 7), with i, k representing occupied orbitals of cluster x.We note that this projection operator is not diagonal in the occupied orbital basis of the cluster.Furthermore, we have crucially chosen to only apply the projection to the occupied dimensions of the wave function coefficients in Eqs.(23, 24).This is an important difference compared to the projection of density matrices according to Eq. ( 5), where the projector always needs to act on general indices, which enumerate both occupied and virtual orbitals.The significance of this is that the initial fragmentation of the system now only needs to ensure that the entire occupied space is spanned by the union of all fragments, in order for the projector to be complete.If this is satisfied, the only remaining approximation results from the deviation of each cluster wave function from exactness (which can be systematically resolved via increasing the cluster/bath space).This is a significant advantage, ensuring that the expectation values become exact as the bath space of each cluster is increased, without requiring the fragment spaces to span the virtual space of the system.The requirement for completeness of the fragment projectors is now relaxed from Eq. ( 10), and can be written as where N e is the number of electrons.This quality is especially impactful in larger basis sets required for quantitative accuracy, as fragment sizes are now independent of the size of this basis, with the required virtual space captured instead by the bath expansion.A fragmentation spanning the occupied space is for example guaranteed by the choice of intrinsic atomic orbitals (IAOs) for the fragment spaces 61 , used in the main results of this work unless otherwise specified.We note that neither the representation of the wave function amplitudes in a particle-hole basis, nor the choice of projection of just the occupied space in itself precludes the solver from capturing strong correlation, since we do not assume a truncation of the excitation levels which are represented in the wave function.The ability to capture strong correlation effects (for an exact high-level solver) is determined by the suitability of the cluster space, and is invariant to the choice of representation of the resulting wave function.It could however be possible that a more efficient projection operator could be formulated for e.g.strongly correlated lattice models, where the requirement of the fragmentation spanning the virtual space is not difficult to fulfil.However, for ab initio quantum embeddings, we do not anticipate that this would be beneficial.

B. Partitioning of exponential wave function forms
While all wave functions can be written in the linear form of Eq. ( 22), we now show that there are benefits in casting the wave function to an exponential parameterization for partitioning in this space.This form (common to coupled-cluster methods 62 ) can be written as The amplitudes can be converted between the linear and exponential forms more generally 63 via the relations . . . .
For ground state wave functions, the norm of the Tamplitudes of the exponential ansatz generally decays more quickly with respect to the excitation rank than the C-amplitudes of the linear ansatz.The exponential wave function ansatz thus allows for an alternative way to partition the wave function by projecting the Tamplitudes, rather than C-amplitudes, which by analogy with Eqs.(23, 24) can be written as To illustrate the difference between the partitioning of the global wave function via its C-and T -amplitudes, we can compare the wave function ansätze truncated at the double-excitation level.For the partitioning of the linear C-amplitude representation of the wave function (Sec.III A), we obtain while for the exponential form of the wave function, we achieve While the first two terms in Eqs.(33, 34) are equivalent and involve only a single summation over clusters, the last term of Eq. ( 34) (representing disconnected double excitations) has a double loop over pairs of clusters.As a result, a single excitation in cluster x and a different single excitation in cluster y can contribute together to form a double excitation of the global wave function, even if the two clusters are far apart and do not overlap.Similarly, for wave functions truncated at the triple excitation level, two independent clusters (e.g. in T x 2 T y 1 ), or three independent clusters (e.g. in T x 1 T y 1 T z 1 ) can combine to contribute disconnected triple excitations to the global wave function.The same is not possible for the linear wave function partitioning, for which both connected and disconnected contributions must always come from a single cluster only.
This partitioning of the global wave function at the level of the T -amplitudes therefore results in non-local 'cross-cluster' information being built into the solution from the quantum embedding -this time on the level of the reconstructed global wave function itself, rather than similar cross-cluster information being built into the 2-RDM (see Sec. II B) or other expectation values (see Eq. 20).The beneficial impact of these cross-cluster contributions is a key tenet of this work, and serves to mitigate the impact of the local approximations inherent to quantum embedding by not treating the cluster solutions as contributing independently to final expectation values.
Finally, we should stress that while this framework of partitioning the global T -amplitudes of the wave function naturally fits with the use of coupled-cluster as a high-level solver, we are not inherently constrained to this.If we avoid truncating the rank of the T -operator, then any wave function can be cast into the exponential form of Eq. ( 27) for the high-level solution of the cluster problems, and benefit in this way.In particular, a linear wave function obtained from a FCI cluster solver can be converted to the exponential representation via Eqs.(28-30), projected, and finally recombined into a global exponential wave function to benefit from these cross-cluster contributions.Nevertheless, the partitioning of the global T -amplitudes of the wave function also particularly suited to coupled-cluster high-level solvers, which are increasingly being used as a compact and accurate approach which enable access to larger fragment and bath spaces, and have recently been shown to be accurate in a variety of embedding contexts 53,[64][65][66][67] .

C. Expectation values from linear functionals
In this section, we consider the use of this implicit 'global' wave function to compute expectation values of operators which commute with the Hamiltonian, and which can therefore be calculated from functionals which are linear in the wave function.A trivial example of this is the total energy, as its associated operator is the Hamiltonian itself.Projecting the time-independent Schrödinger equation from the left with the Hartree-Fock determinant Φ| and assuming intermediate normalization Φ|Ψ = 1, it follows that which is linear in Ψ.This is the traditional energy functional of e.g.coupled-cluster theory, but holds for any state.
Since the Hamiltonian operator of an electronic structure problem involves up to two-body interactions, only the double excitations contribute to the correlation energy where the single excitations do not contribute for a Hartree-Fock reference state, due to Brillouin's theorem (though a single-body contribution can easily be included if using a non-Hartree-Fock reference).Projecting the C 2 -amplitudes in Eq. ( 36) according to Eq. ( 24) enables us to partition the correlation energy into cluster contributions as These contributions can be formulated from the individual cluster solutions, where the orbital indices run over the occupied or virtual states of each cluster, and where the projection operator of Eq. ( 25) can be constructed in this cluster orbital space, to evaluate the energy efficiently in a cost linear in the number of clusters.This is the energy functional introduced and used in Ref. 53, and will be denoted by E[Ψ x ] throughout this work, to indicate that it is constructed directly from the cluster wave function amplitudes without requiring intermediate cluster density matrices.The use of this functional for energies is common to quantum chemical local correlation methods, for example within the PNO 68 or cluster-in-molecules approaches 69 .However, the 'fragments' in these methods are then generally defined in terms of (localized) purely occupied orbitals, instead of the general 'atomic-orbital-like' fragment spaces which are used in the context of quantum embedding and which in this work require the use of explicit (non-diagonal) projection operators as described.
This functional can also be used within a fragment projection of the T -amplitudes of an exponential representation of the high-level cluster wave function (see Sec. III B).In this case, inter-cluster contributions arise from ( P x T x 1 )( P y T y 1 ) terms.This physics is neglected in the C-amplitude projection of above in the case of no overlap between clusters x and y.However, in the examples in this work, the projection of the Cand T -amplitudes gave almost indistinguishable resultsunsurprising given the small contribution of the overall T 2 1 contribution to the energy and the fact that the local portion of this is still included.For this reason, and to reduce the computational cost of this energy functional and keep consistency with Ref. 53, all results in this work denoted E[Ψ x ] will correspond to the projection of the Camplitudes of the cluster state (even for coupled-cluster high-level solvers) as shown in Eq. ( 37), though the projection of T -amplitudes may be preferred in the future.

D. Partitioned wave function density matrices
General expectation values, with operators that do not commute with the Hamiltonian, require a quadratic functional of the wave function, of the form Ψ| Ô|Ψ .In the case of one-and two-body expectation values, O 1 and O 2 , these can also be expressed in terms of the one-and two-body reduced density-matrices (which in turn depend quadratically on the wave function) as The one-and two-body RDMs are thus important quantities to be able to compute for expectation values beyond the energy.Of course, with knowledge of the density matrices one can also calculate the total energy via Eq.( 8).For most quantum chemical methods, where a globally stationary state is found, the result will be identical to the projected energy of Eq. ( 35).However, within the quantum embedding framework the wave function is partitioned into fragment contributions which are then calculated in an approximate fashion and we do not expect this equivalency to hold.Therefore, while the individual cluster wave functions are invariant to this choice of energy functional, the reconstructed full system total energy may not be.Indeed, there is likely benefit in calculating the total energy as a symmetric (quadratic) expectation value (or equivalently from the RDMs), since the error in the energy will reduce quadratically with the error in the wave function, rather than linearly as in the energy functional of Eq. ( 35).
In sections III A and III B we showed how global wave functions can be defined in terms of contributions of projected cluster wave functions.In principle, these global wave functions then fully and uniquely determine their corresponding full system reduced density-matrices.For example, the 1-RDM of a partitioned linear wave function can be written as x,y λ∈x µ∈y where λ and µ enumerate the excitations (to all included orders) within the respective cluster in the particle-hole basis.Note that in contrast to the democratically partitioned 1-RDM of Eq. ( 11), the 1-RDM of a partitioned wave function contains a summation over cluster pairs, thus incorporating simultaneous excitations within two different clusters, as indicated above and in Eq. (20).For a partitioned exponential wave function the global state itself already contains nested summations over projected cluster wave function T -amplitude contributions (see Eq. ( 34)).The resulting 1-RDM will thus have even higher order contributions than the C-amplitude projection (going up to three simultaneous cluster contributions for a high-level wave function represented up to T 2 , and up to five simultaneous clusters for the 2-RDM).Both approaches result in explicitly N -representable RDMs, but again the T -amplitude projection can be argued to have a larger number of physical non-local 'cross-cluster' contributions and therefore represents a better approximation.In Sec.V we will detail the technical aspects of how all of these cross-cluster contributions can be efficiently computed while retaining no more than DFT scaling in the full system size (or number of clusters).Total energies computed from these T -amplitude projected RDMs as described in this section will be denoted E[(γ, Γ)[Ψ x ]] throughout this work, to signify that these are density matrices derived from the global partitioned wave function.

IV. A SIMPLE EXAMPLE: CHLORINE DIMER
Before describing specific implementational details in Sec.V, we consider a simple example to corroborate the claims in the work so far, before a more extensive set of test systems is considered in Sec.VI.We consider the binding of Cl 2 in a minimal basis.The choice of a minimal basis is intentional, since this means that we can easily compare to exact FCI results, and that the same (Löwdin orthogonalized) atomic orbital space can be used for the fragments in all approaches.For larger basis sets, the democratically partitioned density matrix and cumulant approaches require all orbitals to be spanned by the fragment spaces.However the projected wave function approaches require only (at least) the occupied space to be assigned to fragments for the projector to be complete.Therefore for more realistic basis sets, the fragment spaces would generally be chosen according to different criteria, with the fragment spaces of the former needing to grow with basis size.Using a minimal basis therefore avoids this issue and allows the different methods to be compared on the same footing.The DMET cluster space in this example therefore consists of 18 electrons in 9 fragment orbitals plus a single DMET bath orbital (due to the minimal basis) in each of the two symmetrically equivalent clusters, while the full space consists of 34 electrons in 18 orbitals.
The interacting cluster Hamiltonian is solved exactly via FCI 70 , and we compare the four different energy expressions outlined to far in Fig. 3: the democraticallypartitioned RDM energy common to DMET approaches to date (E[γ x , Γ x ] from Sec. II A), the democratically partitioned cumulant energy (E[γ x , K x ] from Sec. II B), the linear energy functional (E[Ψ x ] from Sec. III C) and the energy from the RDMs of the T -amplitude projected global wave function (E[(γ, Γ)[Ψ x ]] from Sec. III D).We include calculations with and without a global chemical potential optimization to ensure that the right number of electrons are conserved for the full system 1-RDM, but no further optimization of a correlation potential is considered here.When used (right column of Fig. 3), this fragment chemical potential, µ frag , is added Bond length (Å) to the one-electron cluster Hamiltonian within the fragment space only, and optimized such that the densitymatrix traces to the correct number of electrons 33 (these µ frag -optimized results are the same as shown in Fig. 1).
All results are subject to identical locality approximation of DMET, whereby the full system is represented only by the two wave functions in the cluster subspaces, allowing for a faithful comparison of the quality of the reconstructed expectation values.An example Python script, showing how these calculations can be performed using PySCF and Vayesta can be found in the SI.We find in Fig. 3 that the democratically partitioned RDM energy is changed significantly by the optimization of the fragment chemical potential, but in neither case are the results reasonable, with unphysical results without µ frag -optimization, and substantial overbinding of the dimer when this is included.These results are also found in many other systems (barring hydrogen dimers and chains, where we find the democratically partitioned density matrix energy accurate, as has been noted elsewhere 21,31,33,46 ).The democratically partitioned cumulant approach is also changed by the µ fragoptimization, but is much less sensitive to this, with both results already significantly improved over the democratically partitioned RDM energy.The chemical potential optimization further drops the non-parallelity error of the democratically partitioned cumulant approach from ∼ 6 mE H to just over 1 mE H in this system.
We also show the energies derived from partitioned wave functions over the system, either calculating the energy from a linear or quadratic functional (via the RDMs) of the wave function probability amplitudes.It is shown that these are also very accurate, which we can ascribe to two properties of these functionals.The first is that they are intrinsically N -representable, ensuring that they fulfil the physical constraints of being derivable from a wave function.As a consequence, they necessarily correspond to the correct, integer number of electrons, and therefore obviates the necessity of chemical potential optimization which has no effect (therefore also substantially reducing the cost of the calculations).The second rationalization for the particularly good performance of the partitioned wave function RDM energy is due to the introduction of contributions from products of wave function amplitudes from different clusters, coupling the cluster solutions together.Furthermore, for the E[(γ, Γ)[Ψ x ]] energy, these contributions result in an error in the energy which is rigorously quadratic rather than linear in the wave function error, and introduces a variationality (for a variational method) into the results which results in the smallest non-parallelity error of 0.5mE H . Representability error Bond length (Å) In order to analyze the different approaches in more detail, in Fig. 4 we compare the 1-and 2-RDMs resulting from the democratically partitioned RDMs (γ[γ x ] for the 1-RDM, and Γ[Γ x ] or Γ[γ x , K x ] for the 2-RDM from the direct democratic partitioning and the cumulant partitioning respectively).We also compare to the RDMs constructed from the partitioned wave function expansion (γ[Ψ x ] and Γ[Ψ x ]).We consider the norm of the errors compared to the exact full system 1-and 2-RDMs, as well as the electron number error from the trace of the 1-RDMs.We also introduce a measure of the Nrepresentability error of the 1-RDM, Ω, which sums the deviation from allowable occupations of the system orbitals as where n i are the eigenvalues of γ.
While the overall norm error of the 1-RDMs seem relatively similar between the approaches (especially after chemical potential optimization), it can be seen that there is still a significant electron number error (without chemical potential optimization), and N -representability error in this quantity.While the chemical potential optimization fixes the electron number error, it is actually found to increase the N -representability error.In contrast, the partitioned wave function approach ensures that the electron numbers are exactly fulfilled even without a chemical potential, and that the N -representability error is strictly eliminated.For the 2-RDM errors, it is clear that significant improvements are found by going to a democratically partitioned cumulant compared to the 2-RDM directly, further indicating that it is the noncumulant error in the 2-RDM which is contributing to the significant energy errors in the standard DMET energy functional.

V. FROM CONCEPTS TO PRACTICE
While the previous sections describe the principle and numerical advantages behind the reconstruction of a global wave function, in practice we will often want this to be implicit as the reconstruction of the global correlated wave function will in general be prohibitive in cost.Instead, we want to directly compute the RDMs or expectation values of interest from this state in low-polynomial time with respect to system size, and without combining the wave function amplitudes into an explicit global state.This allows for the cluster wave functions to remain distributed in their fragmented cluster representations.We focus in this section on the efficient computation of the 1and 2-RDMs, from which all two-body properties can be computed, with the principle of an implicit global wave function demonstrated in Sec.III D underlying this construction.
We restrict ourselves here to a projection of the Tamplitudes of the global state, represented in an exponential form.This is consistent with the approach in the previous section, will maximize the number of nonlocal combinations of cluster solutions, and is a natural choice for coupled-cluster high-level solvers (although as explained in Sec.III B, other solvers can be cast in this form).Furthermore, we focus on an efficient implementation of the cluster solver truncated at the T 2amplitudes, i.e.CCSD (or MP2), which is a cost-effective high-level cluster solver for ab initio systems 53,[64][65][66][67] , yet is sufficiently computationally inexpensive for us to rigorously demonstrate the convergence of these functionals with respect to bath size in Sec.VI.Extensions to other high-level cluster solvers (e.g.FCI) can proceed via Eq.40, and will be described in future work.We stress here that this approach to compute expectation values from a global wave function with alternative (i.e.strongly correlated) solvers would not lead to an increase in scaling with respect to full system size compared to the algorithm shown.Indeed, the fact that coupled-cluster RDMs involve high-degree polynomials of the cluster solution variables (i.e.T n amplitudes) results in the formal scaling with system size being generally larger than solvers such as FCI (where the RDMs are just quadratic in the cluster C n variables).
Explicit reconstruction of the global CCSD Tamplitudes (which we call the 'global T 2 algorithm') would require a memory overhead scaling as O(N 4 ) and computational scaling of O(N 5 ) and O(N 6 ) for the 1and 2-RDMs respectively (where N is a measure of full system size) via the usual CCSD equations 71 .We reduce this scaling down to O(N 2 ) cost in memory and O(N 3 ) time for the 1-RDM construction via direct use of the cluster amplitudes and construction of appropriate intermediates without introducing any additional approximations.This is described in Sec.V A, which we denote the 'distributed T 2 algorithm', and is crucial to ensure that this step is a sub-leading scaling compared to the initial mean-field calculation and for applicability to large systems.For the construction of the 2-RDM and properties derived from it, the requirement of a similar quadratic scaling with system size necessitate the introduction of a further approximation, which is described in Sec.V B, with rigorous validation of this further approximation demonstrated in Sec.VI.

A. One-RDM from cluster wave functions
In order to construct density-matrices at the CCSD level, both wave function T -amplitudes and the Lagrange multipliers (or Λ-amplitudes) are required.These are optimized within each cluster after the T -amplitudes are found, giving rise to a set of Λ x amplitudes for each cluster, x.The principle of reconstruction of a 'global' set of Λ-amplitudes follows symmetric projection of the occupied indices of each cluster Λ x -amplitudes onto the fragment space (defined analogously to Eqs. 31 and 32), and a summation over the cluster solutions, as All expectation values in coupled cluster (including the 1-RDM considered here) involve sums of expressions which are polynomials in the T -amplitudes, and up to linear in the Λ-amplitudes.The central idea of the efficient 'distributed' approach is to avoid the explicit reconstruction of the global T -and Λ-amplitudes entirely, and instead iterate over tuples of cluster solutions, directly summing in their corresponding contribution to the 1-RDM.The number of clusters which needs to be looped over in order to include all cross-cluster contributions to the desired expectation value (or 1-RDM in this instance) depends on the maximum order of the polynomial in the expression, i.e. the total number of T -or Λ-amplitudes which are contracted together.
For the 1-RDM, this maximum order is formally three, corresponding to contractions of the type Λ 1 T 2 1 .This corresponds to the maximum rank of simultaneous 'crosscluster' information in the construction of this quantity (i.e.triplets of cluster solutions are required).However, we can reduce this by first explicitly constructing the global T 1 -and Λ 1 -amplitudes from the cluster solutions according to Eqs. (31, 43), since this will only require a scaling of O(N 2 ) and O(N 3 ) in memory and time respectively, which is an acceptable overhead compared with the initial mean-field calculation.This is in contrast to the analogous explicit construction of the global T 2amplitudes, which we have already argued is prohibitive.The fragment projection for T 1 and Λ 1 is therefore performed in each cluster, and then projected back into the full system MO space and summed over all clusters.Furthermore, we store these global T 1 -and Λ 1 -amplitudes on every memory partition of a distributed memory parallel calculation.The key result is that we now only need to loop over cluster tuples up to the maximum rank of distributed (i.e.T 2 or Λ 2 ) amplitudes in the 1-RDM expression, which is only quadratic, resulting from an L 2 T 2 contraction.This requires only an O(N 2 ) double summation over pairs of cluster amplitudes (which can be further reduced asymptotically down to O(N ) as described later).If certain contractions require a T 1 to be contracted with a cluster distributed T 2 , then the global T 1 amplitudes can be projected into the required cluster space of the T 2 .
We demonstrate this distributed amplitude algorithm for the example of the CCSD contribution to the occupied-occupied block of the 1-RDM, given as The first term of this contribution can readily be calculated from the explicitly combined global singles amplitudes with O(N 3 ) scaling.For the second term, we can calculate the contribution arising from a specific cluster pair (x, y), by using the Λ 2 -and T 2 -amplitudes corresponding to these clusters, as where we use the notation i x (a x ) to indicate an occupied (virtual) orbital of cluster x, and S xy to represents the overlap matrix between the occupied (or virtual, as indicated by the subscript indices) orbitals of cluster x with cluster y, given by These S xy matrices can be precomputed in O(N 3 ) for all cluster pairs, where the columns of C x are the relevant orbitals of cluster x in the AO representation, and S is the AO overlap matrix.We then sum the 1-DM contributions of Eq. ( 46) over all cluster pairs, according to ∆γ where R represents the overlap between cluster MOs and MOs of the full system, i.e.
To analyse the computational scaling of the distributed amplitude algorithm, we note that the number of clusters, AOs, and full system MOs grows linearly with the system size, whereas the number of cluster orbitals per cluster remains constant.As a result, the calculation of all cluster pair contributions (46) scales as O(N 2 ), while Eqs.(47-49) can be evaluated in O(N 3 ) time, if intermediates are formed as indicated by the brackets.An analogous loop over cluster pairs can also be used to calculate the virtual-virtual and mixed occupied-virtual blocks of the 1-RDM, resulting in an overall O(N 3 ) scaling algorithm.
In Fig. 5 we compare the scaling of the global amplitude and distributed amplitude algorithms, for a series of alkanes between hexane (C 6 H 14 ) and C 22 H 46 in a cc-pVDZ basis and atomic IAO fragmentation.To extend beyond a minimal DMET bath size, we augment the bath expansion with 'bath natural orbitals' (BNO), which can be systematically enlarged in order to describe the beyond-mean-field coupling of the fragment to the environment at the level of approximate MP2 theory 53 .This bath construction is similar to pair natural orbitals (PNO)-though traditional PNOs are applied only for the virtual coupling of a localized electron pair 68,72,73 .The completeness of this bath space is controlled by a threshold η, where the bath becomes complete as η → 0, and is limited to just the traditional DMET bath as η becomes large.More details on the construction and motivation of this bath expansion can be found in Ref. 53.From the timing data for the 1-RDM construction in Fig. 5 for two different bath space sizes, we fit the polynomial t(N ) = aN b to the last six data points of each curve to determine the overall computational scaling.For the global amplitude algorithm, this is close to the expected scaling with b ∼ 5, whereas for the distributed amplitude algorithm we find an exponent close to 2 indicating a quadratically-scaling algorithm, instead of the expected exponent of 3.This indicates the small prefactor of the efficient O(N 3 ) operations of Eqs.(47-49), whereas the evaluation of Eq. ( 46), although scaling as O(N 2 ), has a significant larger prefactor and dominates the computational costs for the tested system sizes.
Despite this favorable scaling, we employ a number of additional techniques, to further improve the efficiency of this algorithm.These include a singular value decomposition of each cluster overlap space, in order to find the most compact domain for describing the inter-cluster physics.In the large system limit, this also naturally leads to a O(N ) scaling in the 1-RDM construction, due to the fact that each cluster will only have appreciable overlap with O(1) other clusters.In addition, we consider efficiency gains that can be made from k−point sampled periodic systems, effective distributed memory parallelism, and a compressed representation of the projected amplitudes.These technical improvements in the 1-RDM construction are detailed in Appendix A.

B. 'In-cluster' approximation to the two-RDM
In principle, an analogous approach to above can be used to recover any expectation value from the implicit partitioned wave function, with the scaling determined by the maximum number of products of T n or Λ n in any of the terms, where n > 1.For the 2-RDM, this increases to three (from two for the 1-RDM), which would necessitate looping over triplets of cluster solutions (noting that a similar SVD approach to screen the overlap would still asymptotically reduce to linear scaling).Note that including the T 1 and Λ 1 contributions, the 2-RDM would include simultaneous contributions of up to five cluster amplitudes, but by explicitly reconstructing these T 1 and L 1 amplitudes globally, only looping over three clusters (x, y, z) would be required.However, including the projections back to the full space and noting the O(N 4 ) storage of the global 2-RDM, these scalings are likely prohibitive if we aim for the quantum embedding to maintain a scaling with full system size which is no more than mean-field theory.
To return to (at most) a O(N 3 ) scaling approach, an additional approximation is thus required.As a first step, we split the 2-RDM into products of 1-RDMs and an (approximate) cumulant, K, defined according to Eq. ( 15).The non-cumulant part can be treated without further approximations, using the efficient calculation of the 1-RDM presented in the previous section.This ensures that all the cross-cluster contributions in the noncumulant part are exactly included from an implicit valid wave function.The cumulant part, on the other hand, will be treated in an 'in-cluster approximation', meaning that all products of wave function amplitudes are taken within a single cluster at a time, rather than explicitly including triplets of clusters.
In order to illustrate this, let us consider a contribution to the cumulant, δK, which involves a (Λ 2 T 2 2 ) triple product of double (de)excitation amplitudes.If evaluated exactly according to the implicit partitioned wave function, this would require contributions from all triplets (x, y, z) of clusters, as In the in-cluster approximation, this contribution is re-placed by which only contains amplitudes from cluster x.Note that Eq. ( 51) only contains a single fragment-space projector, which is applied to the Λ-amplitude, compared to Eq. ( 50), where every amplitude is projected.In general, the number of projectors in a term has to agree to the number of nested loops over clusters the given term is summed over.This ensures preservation of a core principle of the embedding, that we guarantee an exact result, free from double-counting, as the bath space increases to completeness, as long as the sum of the fragment spaces spans the entire occupied space (Eq.26).Instead of projecting the Λ-amplitude, we could have also chosen to project the first or second T -amplitude in Eq. ( 51), resulting in a different in-cluster approximation, which nevertheless becomes exact in the full-bath limit.However, projecting the Λ-amplitudes yields a simple recipe, since all contributions to the 2-RDM cumulant are at most linear in Λ 1 or Λ 2 .Some limited experimentation with other options for the projected amplitudes did not seem to significantly change the results.The only term this prescription does not work for is the Λ 0 terms of coupled-cluster expressions, which results in a MP2-like contribution in which case we use projected T -amplitudes, according to We note here that the 'global' T 1 -and Λ 1 -amplitudes as defined in the last section are not used in this 'incluster' cumulant approximation, ensuring that only a single cluster summation is used for all terms.Relaxation of this constraint will be investigated in the future.This 'in-cluster' approximation to the two-body cumulant reduces its computation to sums over single cluster wave function contributions, avoiding the computational effort associated with the fully non-local approach (as used for the 1-RDM in Sec.V A).The computational effort to compute the two-body cumulant contributions is now negligible compared to the 1-RDM construction, which requires contributions from all pairs of clusters.However, we do not want to build the whole full-system cumulant (which would require a prohibitive O(N 4 ) scaling in memory), but rather calculate expectation values directly from this sum over the independent cluster wave functions.We can therefore directly construct the energy functional with this 'in-cluster' approximated cumulant, which we denote E[(γ, K * )[Ψ x ]) in this work (the asterisk denoting the 'in-cluster' approximation to the two-body cumulant), as where K * [ P x Ψ x ] indicates that each term in the expression for the (approximate) cumulant of Eq. ( 15) is computed from products of contributions from the same cluster solution (the 'in-cluster approximation').This allows the two-body contributions from each cluster to be computed entirely independently, as opposed to ∆γ[Ψ x ] which is assumed to contain all relevant products of different cluster solutions (see Sec. V A).Other static twobody expectation values can be derived using this 'incluster' two-body cumulant analogously.In Sec.VI C and Appendix B, we go beyond energetics to consider the twopoint instantaneous spin correlation function, which can be computed in this same framework via reconstruction of a global 1-RDM, with cluster-local two-body contributions via the 'in-cluster' approximated cumulant.While this approach bears similarities to the democratically partitioned cumulant for two-body properties of Sec.II B (specifically that the disconnected part of the 2-RDM is treated separately to include cross-cluster contributions), it has significant advantages.In particular, the 1-RDM part is treated in a fully N -representable way, with contributions from all triplets of different cluster wave functions, as well as the fact that the projections acting on the occupied spaces mean that fragments only need to span the complete occupied space to ensure convergence to 'in-method' exactness as the bath spaces of each fragment are enlarged.We consider this a critical aspect to allow for practical extension of the embedding to realistic basis sets.
Although this 'in-cluster' approximation is justified primarily by numerical expediency, it is still a controllable approximation in the same fashion as the embedding approximation itself, via an expansion of the bath space, guaranteeing systematic improvability of the approximation to exactness.However, it is also clear that the 'in-cluster' approximation to the two-body cumulant does result in a loss of strict N -representability of the resulting 2-RDM.This is because it is no longer derivable from a full system wave function, since there is no single set of 'global' T -and Λ-amplitudes used for all expressions.This can be seen in the first set of results in Fig. 1 which presents all energy functionals used in this work.While the error in the energy functional which builds the RDMs from the explicit full system partitioned wave function (E[(γ, Γ)[Ψ x ]]) is variational with respect to the true solution, this is lost when the low-scaling 'incluster' approximation is used for the two-body cumulant (E[(γ, K * )[Ψ x ]]).
In the next section, we will rigorously benchmark these different approaches to computing energies and non-energetic spin-spin correlation functions, and analyze their convergence in realistic molecular and extended systems with systematic expansion of the bath space of each cluster.Surprisingly, we find that energies derived with the 'in-cluster' cumulant approximation exhibit an improved convergence to the exact complete bath limit of the embedding theory, due to more favorable cancellation of errors.

VI. RESULTS
We now turn to a larger selection of systems, to consider the convergence of these different energy estimators to their 'in-method' exact limit, as the size of the interacting bath expansion of each cluster is enlarged.For this expansion, we consider the 'cluster-specific bath natural orbitals' (BNO) introduced in Ref. 53, which can be used as a systematic way to increase the orbital space controlled by a single threshold parameter η, and we will use CCSD as a cluster solver throughout.As η is reduced towards zero, the bath space increases in size towards completeness, capturing longer-ranged and higher-energy coupling of the fragment to its environment.The use of CCSD as a solver allows us to access bath sizes sufficient to converge to 'in-method' exactness, and we will move to consider stronger-correlation solvers in future work.We will initially compare the energies from the three introduced energy functionals which are derived from reconstructed wave function forms.As a recap, these are: 53 and Sec.III C, this is the cheapest approach to energy computation of Eq. (37), requiring no communication between independent additive cluster contributions.
• E[(γ, Γ)[Ψ x ]]: Described in Sec.III D, this computes the energy from the exact density matrices from the reconstructed wave function partitioned over the clusters.This scales worse than the meanfield for the exact reconstructed Γ, and rapidly becomes the dominant cost in the calculation.It will primarily be used for reference.
• E[(γ, K * )[Ψ x ]]: Described in Sec.V B, this approximates the cumulant contribution to the 2-RDM via the 'in-cluster' approximation, which ensures lowscaling with system size for this non-factorizable two-body energy contribution.However, the 2-RDM loses strict N -representability.The 1-RDM contributions are still N -representable from the partitioned wave function according to the efficient algorithm of Sec.V A.
We do not compare the energies from the density matrix derived expectation values of Sec.II, due to the different constraints on the fragment spaces imposed.For the wave function derived expectation values above, all of these will converge to in-method exactness as the bath space increases as long as the combined fragment spaces span the occupied space, while the density matrix expectation values require the combined fragment spaces to span all degrees of freedom.This allows us to consider simple minimal-size atomic intrinsic atomic orbital fragments, with the only convergence criteria now being the bath size controlled by η.

A. Molecular results: W4-11 test set
We first consider the comparison of these energy functionals for the W4-11 test set of 152 small molecules consisting of first and second-row elements, which covers a broad range of bonding character, with static as well as dynamic correlated physics present, as well as a mix of high-spin, low-spin and open-shell systems 54 .In a cc-pVDZ basis, canonical CCSD calculations on the full test set is tractable, to obtain a 'ground-truth' total energy for all molecules, and we use ROHF/UHF and UCCSD for open-shell cases.Using the information from the corresponding Hartree-Fock calculation, we then fragment each molecule into a minimal basis set of IAOs on each atom, which we use as the fragments of each clusters, which ensures that the total occupied space of the full system is spanned by the fragments 61 (we omit the Beryllium dimer test system, due to a lack of virtual orbitals in the default IAO basis).The choice of single atom fragments simplifies their selection, but future work to consider larger (or smaller) disjoint fragmentation causes no conceptual or practical difficulties.To complete the cluster space of each atom, we augment this with the DMET bath space (which must be at most equal in size to the fragment space), and then with the BNOs defined from a given threshold, η.Smaller η values correspond to larger bath spaces, with η > 1 just returning to the smallest (DMET) bath size.Each cluster is then solved independently at the level of CCSD (restricted or unrestricted), and the total energies reconstructed from the three wave function-based energy functionals provided above, with no self-consistency or chemical potential optimization.
These results are shown in Fig. 6, with their absolute per electron energy errors aggregated for different values of η in Fig. 7.It is clear that the bath expansion of these IAO atomic fragments results in a rapid, monotonic and systematic convergence of all energy estimates to inmethod exactness, obtaining agreement with canonical CCSD calculations to within tight errors.The largest of these errors, in both MAE and standard deviation over the different molecules in the test set, arises from the E[Ψ x ] functional, which is to be largely expected, since this functional does not have any 'cross-cluster' contributions to the energy, relying on a sum over independent energy contributions from the fragments.A significant improvement in the rate of convergence can be found from the E[(γ, Γ)[Ψ x ]] functional, which contains many prod- FIG. 7. Violin plots showing the distribution of total energy errors per electron over the W4-11 test set of molecules compared to canonical CCSD calculations, for the different partitioned wave function energy estimators.This is performed with a bath expansion of each IAO atomic fragment for different values of η, which corresponds to a maximum cluster size as indicated in Fig. 6.Total distribution ranges show the minimum and maximum errors over the test set.
ucts of contributions between different cluster solutions in its construction.Furthermore, this energy is rigorously Ψ-derivable, and is therefore expected to exhibit an error which is quadratic in the implicitly reconstructed wave function error.Though the energy is not strictly variational due to the non-variationality of coupled-cluster theory, we also find it to be fully variational compared to the canonical CCSD in all cases.
However, somewhat surprisingly, the E[(γ, K * )[Ψ x ]] energy functional converges to in-method exactness at the fastest rate, obtaining the least energy error across the test set for each bath truncation.This approach was motivated as an efficient approximation to the E[(γ, Γ)[Ψ x ]] energy, but due to favourable cancellation of errors between the approximated cumulant energy contribution and the inherent locality approximation in the bath truncation, appears to be the preferred approach (as well as being particularly efficient).For intermediate cluster sizes, the reduction in per-electron MAE can be nearly an order of magnitude compared to the E[Ψ x ] energy.Due to the approximation in this cumulant non-variational total energies can result (compared to the canonical result), however the appearance of these non-variational results are very rare, and only observed for already tightly converged energies.This is because the bath truncation itself is a variational approximation (since this defines a subspace of the full variational freedom of the cluster), which explains the favourable cancellation of errors, and maintenance of generally variational results, even in this approximation which breaks strict N -representability in the 2-RDM.

B. Solid state results: Diamond
A key feature of the embedding approach proposed is the applicability to both molecular and periodic systems, enabled by ensuring all steps scale no worse than the initial Hartree-Fock calculation.We therefore also consider the convergence of structural properties of crystalline fcc diamond in an all-electron cc-pVTZ basis and a 5×5×5 k-point sampling.At 7500 orbitals and 1500 electrons, this is well beyond the capabilities of canonical CCSD for a direct comparison, but we can observe the relative convergence of the different energy estimators, and the resulting equation of state.
The supercell is split into 250 individual embedded cluster problems, comprising of the minimal basis IAOs of a single Carbon atom as the fragment (6 orbitals), the DMET bath space (also 6 orbitals), and the BNOs whose size depends on the η parameter.The size of the resulting clusters vary between 50 and ∼ 250 orbitals.Due to the translational and rotational symmetry, all embedded problems are symmetry-equivalent, and therefore only a single cluster needs to be solved, with the linear energy functional trivially exploiting this symmetry, while a slightly more involved exploitation of the symmetry to avoid explicitly solving multiple cluster problems in the accumulation of the long-range RDMs for other expectation values is described in Appendix A. We note that exploitation of this symmetry information is not required and only used here for numerical efficiency, with identical results achieved by computing all cluster contributions independently.
Results are presented in Fig. 8 for the linear energy functional E[Ψ x ] and the 'in-cluster' cumulant func-tional, E[(γ, K * )[Ψ x ]], for a range of lattice parameters of the unit cell.The E[(γ, Γ)[Ψ x ]] functional is omitted due to its unfavourable scaling to these system sizes.The input to our Vayesta code to generate these results is given in the SI, where periodic integrals for the system are generated from PySCF 5,74 .Changing the lattice vectors leads to an effective change in the basis set coverage at different cell sizes (since the same number of basis functions span different volumes of the supercell, changing the effective completeness of the overall basis).To compensate for this, we correct all energies for this 'basis set superposition error' (BSSE) via an adaptation of the counterpoise correction method 75 as described in Ref. 53.The BSSE per atom is estimated by computing two independent molecular sytems, the first for single carbon atom (30 basis functions) and the second for a single atom in addition to the basis functions of all carbon atoms within a 3×3×3 supercell (1620 basis functions in total), sufficient to describe the changing basis set coverage for the carbon atom under consideration.The difference between the two energies is the estimated BSSE error, with this correction changing across the shown lattice range by ∼ 7.2 mE H , which slightly lowers the energy at large lattice parameters relative to the more compressed cells, leading to a small expansion of the equilibrium volume.
The equation of state for the two different energy functionals and BNO bath threshold (η) values are shown in Fig. 8.These demonstrate the systematic improvement in the total energy as η decreases.However, while the total energy is not fully converged, we can fit these equations of state to a Birch-Murnaghan form 76 about approximately ±6% of the equilibrium value, in order to estimate structural properties for each of these curves, and gauge their convergence.We consider the convergence of the equilibrium lattice parameter and bulk modulus in Fig. 9, showing convergence to experimental values, corrected for zero-point vibrational effects 5,77 .We find a small yet consistently improved convergence for both of these properties from the in-cluster approximated cumulant energy functional at each cluster size compared to the linear energy functional derived properties, in keeping with the conclusions from the convergence of the energy in the molecular test set of Fig. 7. Our most accurate result (η = 10 −7.5 , resulting from a single CCSD calculation on ∼ 330 orbitals), gives a bulk modulus which agrees with experiment by less than 10GPa, while the equilibrium lattice parameter is in error by only ∼ 0.002 Å.By way of comparison, literature values from commonly used exchange-correlation functionals of DFT for structural properties give equilibrium lattice constants in error by between +0.02 Å(PBE) and −0.015 Å (HSESol) compared to experimental values 77 , an order of magnitude greater than these results.While accurate energetics are important, they are far from the only non-local expectation values of interest, and we should also gauge the validity of the cumulant-approximated 2-RDM approach for other two-body expectation values.We therefore evaluate instantaneous spin-spin correlation functions between pairs of atoms (A, B), given by where P A is a projector onto an appropriately chosen subspace representing the atom A. We define these atomic projectors via symmetrically (Löwdin) orthogonalized atomic orbitals (SAO) for each atom, as given in Appendix B. Note that while the total spin-spin correlations of the full system, S 2 z = AB ŜA z ŜB z , are necessarily zero in a spin-restricted formalism (though not in unrestricted), the local contribution from a given atom pair is generally non-zero, indicating the conditional spindensity between different points in the system.
We adapt the fragmentation and BNO bath expansion procedure as detailed in Sec.VI A to use a spinbroken UHF reference, spin-dependent bath orbitals, a UCCSD solver, and generalizations of all expectation value accumulation in a unrestricted formalism.We can then consider the convergence of the spin-spin correlation function with bath size for the n-propyl radical (shown in Fig. 10) in the cc-pVTZ basis between the radical C 1 position (atom A) and other atoms in the system (atom B).In order to compare results from the traditional democratically partitioned 2-RDM of Sec.II A (Γ[Γ x ]), the democratically partitioned cumulant of Sec.II B (Γ[γ x , K x ]) and the in-cluster cumulant approximated 2-RDM derived from the partitioned wave function of Sec.V B (Γ[(γ, K * )[Ψ x ]]), we choose a complete fragmentation of the system via atomic IAO ⊕ PAO fragments, ensuring that the full virtual space of the system is also spanned by the fragmentation.This is a necessary condition for the democratically partitioned approaches to approach exactness as the bath space becomes complete, but is a sufficient but not necessary condition for the Γ[(γ, K * )[Ψ x ]], which only requires fragments to span the occupied space.However, choosing this common fragmentation will allow for results to be compared on the same footing, despite requiring a larger fragment space than would be most efficient for the wave partitioned RDMs.In appendix A, we detail how an efficient in-cluster approximated form for Γ[(γ, K * )[Ψ x ]] can be utilized for this spin-spin correlation function, ensuring that all four-index contractions are performed within each cluster, to maintain cubic scaling with system size.Figure 11 plots the convergence of these spin correlation functions from the various functionals as the average cluster (bath) size increases.Beyond just convergence of any single expectation value, these results also allow us to consider the accuracy of these two-body spin-correlators as a function of their range, indicating the ability of each to compensate for the locality approximation inherent in the embedding for smaller cluster sizes, and converge highly non-local two-point observables.While there is only a small difference between the traditional partitioning of the 2-RDM and cumulant partitioning schemes for the on-site correlation function (top left plot), this changes for the longer-ranged correlations in the lower rows of the figure, where the partitioned cumulant approach of Sec.II B converges significantly quicker and more smoothly to the full system UCCSD limit, where the direct democratic partitioning of the 2-RDM can become highly erratic.It is particularly noteworthy that the democratically partitioned 2-RDM results can be worse than then uncorrelated UHF values, even when going to average cluster sizes as large as ≈100, at which point the partitioned cumulant results are almost indistinguishable from full system CCSD values.However, as expected, the wave function derived expectation values are by far the most reliable across the range of cluster sizes, even with the 'in-cluster' approximated cumulant, with an improved and more systematic convergence even for the fully local spin correlator.As such, our results support the use of this approach for both energetics and other one-and two-body properties where possible in this wave function embedding context.

VII. CONCLUSIONS AND OUTLOOK
In this work, we have critically assessed approaches to reconstruct expectation values from wave function quantum embedding methods.In doing so, we have motivated and compared a number of approaches for calculating total energies across the fragmented system, as well as considering the two-point spin correlation function as an example non-energetic and non-local quantity via the reconstructed reduced density matrices.A fundamental difference exists between two classes of approaches, where expectation values are computed from either partitioned density matrices, or partitioned wave functions, particularly due to the difference these approaches impose on the choice of fragmentation if a bath expansion of the cluster is to tend to exactness.In the former case, it was found that partitioning the cumulant rather than the 2-RDM directly gave rise to significantly improved two-body properties and energies, and is likely to find applications in solvers where access to wave function amplitudes is difficult (e.g. the increasing use of quantum computing algorithms as high-level solvers [48][49][50][51][52] ).
In the latter case where an implicit partitioned wave function is considered, total energy expressions are motivated and derived from a simple linear energy functional, and from density matrices constructed directly from this wave function.If obtained via these RDMs, the total energy from the embedding is variational (for a variational solver) and N -representable, but incurs significant overhead in the 2-RDM construction.A rigorous and efficient quadratically-scaling algorithm is developed for this N -representable 1-RDM, and an 'incluster' approximation for the 2-RDM is motivated, ne-glecting many of the cross-cluster contributions in the 2-cumulant.Taken together, these two developments are found to provide the most efficient convergence of energetics and non-local expectation values across a range of systems via a systematic bath expansion of the cluster, while only requiring an atomic IAO fragmentation of the system.We believe that this approach will prove important going forward in the development of robust and accurate wave function embedding techniques for both chemical and extended systems.
There are a number of further directions we aim to explore as a result of these insights.In systems where the electronic state is required to qualitatively change (e.g.quantum phase transitions), it is possible that a bruteforce bath expansion (while necessarily still improvable to exactness) is not going to be the most efficient route in order to capture this changing physics, and feedback from the correlated state to the underlying mean-field reference is required.The identification of an implicit wave function, as well as a formally N -representable 1-RDM opens new avenues for robust and accurate selfconsistency conditions.For instance, the self-consistency could maximize the overlap between this implicit partitioned wave function and a mean-field state by minimizing the partitioned T 1 amplitudes 78 .Alternatively, coupling between wave function amplitudes of different cluster are also being formulated.Additionally, the approach of 'projected'-DMET could be reconsidered with a full-system 1-RDM which contains more of the nonlocal physics and is fully N -representable compared to the standard democratic partitioning 20 .Finally, the application of a broader range of solvers which can treat significantly strongly correlated systems, as well as excited states, spectra and analytic gradients or forces with these new perspectives of an implicit embedded wave function are ongoing avenues of investigation.

APPENDIX
A. Further algorithmic efficiency in one-RDM construction In Sec.V A we detail the specifics of the 1-RDM construction from the implicit global T -amplitudes.There are a number of further efficiency gains which we make to further speed up the generation of this 1-RDM, and which can also be used for other non-local quantities which rely on combining different cluster solutions.
1) We perform a compact singular-value decomposition of the overlap matrices S xy in Eq. ( 47) and remove all left and right singular vectors corresponding to singular values smaller than a cutoff (which we choose to be 10 −3 ).For every pair of clusters, the Λ-and T -amplitudes are then rotated into the resulting SVD basis (scaled by the corresponding singular value), except those open indices which are not contracted with an overlap between the cluster pairs (e.g.i x and j y in Eq. ( 46)).These singular values characterize the overlap between the occupied or virtual orbitals of clusters x and y.If no singular value is above the threshold, the clusters are considered to have negligible overlap and and the contribution ∆γ xy is assumed zero.In the asymptotic large system limit, each cluster will only have an appreciable overlap with O(1) other clusters due to their local nature.This effectively screens the loop over cluster pairs to eventually only loop over O(N ) overlapping clusters.
2) When periodic boundary conditions are used, we always perform the embedding calculations at the Γ-point of the N 1 ×N 2 ×N 3 Born-von Karman supercell, corresponding to original Monkhorst-Pack k-point mesh of the same dimensions.In this case the summation over clusters x is only performed over fragments within the primitive cell at R 0 , whereas the summation over clusters denoted y is unrestricted over all primitive cells within the supercell (R i ).Note that even though density matrix contributions between x in the primitive cell and y over all clusters in the supercell are explicitly constructed, individual clusters only in the primitive cell R 0 need to be solved, as the resulting wave function amplitudes can be transformed into the AO basis and translated (corresponding to a permutation of the AO indices) to any other, symmetry-equivalent position within the supercell.The remaining density matrix contributions from clusters x outside of the primitive cell R 0 can be simply reconstructed via translation, as where T (R) is an operator which translates all atomic orbitals in ∆γ xy to the right by R, with ∆γ xy denoting the contribution to the correlated density matrix from the cluster pair x and y in the AO basis.If desired, the complete supercell density matrix can be readily Fouriertransformed to k-space according to where the AOs of the supercell, α, were enumerated in terms of the composite label (R i , α), with α representing an AO in the primitive cell, R 0 .
3) Embedding problems in Vayesta can be solved in parallel using the Message Passing Interface (MPI) over distributed memory.Symmetry unique fragments are assigned to each MPI rank (generally a node), with their clusters built and solved independently on these MPI processes, and OpenMP threading used within each rank for the dense linear algebra on a shared memory basis.However, when constructing the 1-RDM, the cluster T -and Λ-amplitudes need to be communicated.Classical pointto-point or collective communications do not naturally fit the existing algorithm.For this reason, we employ onesided communication (also called remote memory access, RMA), which allow each MPI process assigned to a given cluster to access the cluster amplitudes of all other MPI processes, without the need to halt or synchronize with the sending process.
4) When projecting T 2 -(or Λ 2 -) amplitudes, the projector is applied in a symmetrically averaged fashion between the first and second occupied index, as shown in Eq. (32).When contracting (for example) a symmetrically projected T 2 -amplitude with a symmetrically projected Λ 2 -amplitude, the result can be thought of in terms of four contributions (each weighted by 1/4): first, the T 2 -amplitude projected in the first index contracted with the L 2 -amplitude projected in the first index, second, the T 2 -amplitude projected in the first index contracted with the L 2 -amplitude projected in the second index, etc.In contrast to the symmetrically projected amplitudes, which require (N cl ) 4 memory to be stored, the amplitudes projected in a single index can be stored in N f (N cl ) 3 memory, where N f is the number of fragment orbitals in the cluster, when the projected index is rotated to be represented in the fragment-bath basis rather than the particle-hole basis of the cluster.More crucially, contractions between two such amplitudes, such as the one in Eq. ( 46), only scale as N f (N cl ) 4 , instead of (N cl ) 5 .Since the number of fragment orbitals is often significantly smaller than the number of cluster orbitals (sometimes by a factor of 100 or more), it is computationally more efficient to perform this contraction four times with the amplitudes projected in different indices and then average the result, instead of first performing the averaging of the projected amplitudes and then performing a single contraction.Furthermore, inter-node MPI communication of these 'single-index'-projected T 2 and Λ 2 cluster amplitudes can be significantly reduced by a factor of N f /N cl utilizing this compressed representation.

B. Local spin-spin correlation functions
In Eq. 55, the two-point instantaneous spin-spin correlation function is defined, ŜA z ŜB z , between two local subspaces denoted by A and B. We choose these atomic spaces to be the set of symmetrically (Löwdin) orthogonalized atomic orbitals (SAO) for each atom, resulting in a projector of the form where C SAO is the coefficient matrix of the SAOs and s runs over the N A SAO orbitals associated with atom A. However, for an efficient evaluation of this expectation value via the wave function partitioned approach, we require the two-body contributions via the 'in-cluster' approximated cumulant to be constructed in each cluster independently, to avoid the construction and subsequent contraction over rank-4 quantities in the full space which would lead to high computational scaling and memory footprints.This can be achieved in a similar manner to the computation of the E[(γ, K * )[Ψ x ]] energy in 54.The resulting working equations in a restricted basis are where the factors of the type 2γ σ qp − γ σ 0 qp arise due to the specific definition of K, as given in Eq. ( 15).

FIG. 1 .
FIG.1.Binding energy of the Chlorine dimer in a STO-6G basis compared to FCI, with E[γ x , Γ x ] corresponding to the democratically partitioned energy expression of traditional DMET33 .Different energies correspond to different total energy functionals, constructed from the same solutions to the two DMET embedding problems (with fragment spaces of the (Löwdin-orthogonalized) core and valence orbitals of each Chlorine atom).The two cluster spaces are defined by 10 orbitals (9 fragment and 1 bath), with the full space comprising 18 orbitals.The motivation and definition of these different energy functionals from the embedded wave functions are given in the rest of this work, with more details on this system (and others like it) discussed in section IV.

FIG. 4 .
FIG.4.Analysis of the 1-and 2-RDMs from direct democratic partitioning (dotted lines) compared to RDMs constructed from partitioning of the wave function (solid line), for the same Cl2 system as Fig.3.The results are shown without (left) and with (right) chemical potential optimization over the DMET clusters.Top to bottom rows show the errors in the norm, trace and N -representability conditions (Eq.42) of the 1-RDM for the different approaches, with the bottom row showing the norm error in the 2-RDM including the democratically partitioned cumulant approach.

t N 2. 01 FIG. 5 .
FIG.5.Timings for the construction of the full system 1-RDM from the partitioned wave function for two different bath truncation thresholds, η = 10 −4 and η = 10 −6 , for the 'global' and 'distributed' T2 algorithms.Increasing the bath size has little effect on the global T2 algorithm, whereas it leads to an increase in the prefactor (but not the exponent) of the distributed T2 algorithm.

FIG. 6 .
FIG. 6. Convergence of the reconstructed system energy from individual cluster solutions to the canonical CCSD with respect to the maximum cluster size for any atomic fragment in any molecule in the W4-11 test set.Cluster sizes are controlled by the bath threshold (η), starting from just the DMET bath, for the wave function derived energy functionals.Points indicate the mean absolute error (MAE) per electron in the total energy, while error bars show the standard deviation over the test set.Inset shows a magnified version of the larger bath threshold results.

5 EFIG. 8 .FIG. 9 .
FIG.8.Equation of state for diamond, using both the linear energy functional (E[Ψ x ], purple) and the RDM energy functional with in-cluster approximated cumulant (E[(γ, K * )[Ψ x ]], red).Data from different lattice parameters are fit to a Birch-Murnaghan equation of state for different values of η, systematically expanding the size of the cluster space.5 × 5 × 5 k-points are used in the supercell, with an all-electron cc-pVTZ basis, while small energy corrections for the basis set superposition error are also included.The calculated energies are shown as circles, the fits are solid lines and the equilibrium lattice parameter from the fit is indicated by pentagons.

FIG. 10 .
FIG. 10.Labels of the atoms in the n-propyl radical used in this work.
FIG. 11.S 2 z -correlation functions in the n-propyl radical between the radical position (C 1 ) and other carbon (first column) and hydrogen atoms (second column) in the molecule (see Fig.10).Spin correlators are derived from the 2-RDM computed in different ways; traditional democratic partitioning of Sec.II A (denoted Γ[Γ x ], green), democratic partitioning of the cumulant of Sec.II B (denoted Γ[γ x , K x ], yellow) and the wave function derived 2-RDM with in-cluster approximated cumulant of Sec.V (denoted Γ[(γ, K * )[Ψ x ]], red).The calculation was performed in the cc-pVTZ basis set (188 orbitals) with SAO projectors to evaluate the spin correlation functions, and IAO ⊕ PAO atomic fragmentation.

e
wf = emb.get wf energy () # P a r t i t i o n e d WF, d e n s i t y −matrix f u n c t i o n a l : e dm = emb.get dm energy (dm2='global−wf') # P a r t i t i o n e d WF, d e n s i t y −matrix f u n c t i o n a l , w i t h in−c l u s t e r approximation o f cumulant : e dm incluster cumulant = emb.get dm energy ()# I t does not make s e n s e t o e v a l u a t e t h e wave−f u n c t i o n based f u n c t i o n a l s w i t h a fragment c h e m i c a l p o # o p t i m i z e d from t h e d e m o c r a t i c a l l y −p a r t i t i o n e d 1−DM.I n s t e a d , one c o u l d o p t i m i z e t h e e l e c t r o n numbe # 1−DM o f t h e p a r t i t i o n e d wave f u n c t i o n , b u t t h i s d e n s i t y −matrix has t h e r i g h t numbero f e l e c t r o n s b # such t h a t t h i s o p t i m i z a t i o n would be w i t h o u t e f f e c t .else: e wf = e dm = e dm incluster cumulant = np.nanwith open('energies .txt', 'a') as f: f. write('%.2f%16.8f %16.8f %16.8f %16.8f %16.8f %16.8f\n' % ( d, e fci , e dmet part 2dm , e dmet part cumulant , e wf , e dm , e dm incluster cumulant )) Input for a single molecule in the W4-11 test set of Sec.VI A for a range of bath natural orbital thresholds import numpy as np import pyscf import pyscf .gtoimport pyscf .scfimport pyscf .ccimport vayesta import vayesta .ewf#−−− Setup m o l e c u l e w i t h PySCF : mol = pyscf .gto.Mole () mol.atom = [( 'N', [0.0 , 0.0, 1.855199]) , ('C', [0.0 , 0.0, 0.693922]) , ('C', [0.0 , 0.0, −0.693922]), ('N', [0.0 , 0.0, −1.855199])] mol.basis = 'cc−pvdz ' mol.output = 'pyscf.txt' mol.build () # −−− Obtain Hartree−Fock s o l u t i o n w i t h PySCF : mf = pyscf .scf.RHF(mol) mf.conv tol = 1e−10 mf.kernel () # −−− Reference CCSD cc = pyscf .cc.CCSD(mf) cc.kernel () e cc = cc.e tot etas = ['DMET ', 1e−3, 1e−4, 1e−5, 1e−6] for eta in etas: # −−− Embedded CCSD u s i n g BNO b a t h o r b i t a l s : if eta == 'DMET ': emb = vayesta .ewf.EWF(mf , solver ='CCSD ', bath options =dict( bathtype ='dmet ')) else: emb = vayesta .ewf.EWF(mf , solver ='CCSD ', bath options =dict( bathtype ='mp2 ', threshold =eta )) # Fragment t h e system i n terms o f i n t r i n s i c atomic o r b i t a s (IAOs) w i t h one fragment f o r each atom : with emb.iao fragmentation () as f: f. add all atomic fragments () # C o n s t r u ct b a t h spaces , c l u s t e r s , and s o l v e embedding problems : emb.kernel () # P a r t i t i o n e d WF, l i n e a r f u n c t i o n a l , energy : e wf = emb.get wf energy () # I t can s a v e r e s o u r c e s t o j u s t update t h e b a t h o p t i o n s , r a t h e r than running new c a l c u l a t i o n s emb.change options ( bath options =dict( threshold =eta )) emb.reset ( reset bath =False) emb.kernel () emb es [0].append (emb.e corr + emf) # Linear energy f u n c t i o n a l emb es [1].append (emb.get dm energy () + emf) # P a r t i t i o n e d WF w i t h in−c l u s t e r cumulant # OPTIONAL: Repeat c a l c u l a t i o n s f o r s i n g l e atom and s i n g l e atom i n p r e s e n c e o f a l l b a s i s f u n c t i o n s , # t o e s t i m a t e BSSE c o r r e c t i o n print (" Calculating BSSE Correction ") emb bsse = [] for rmax in [False , np.inf ]: bsse = [] # Remove p e r i o d i c boundary c o n d i t i o n s from c e l l , i n c l u d e b a s i s f u n c t i o n s f o r 3 x3x3 s u p e r c e l l cell = vayesta .misc.counterpoise .make cp mol (cell , 0, rmax , nimages =1, unit='B') # Run m o l e c u l a r Hartree Fock kmf = pyscf.scf.RHF(cell) kmf = kmf.density fit ( auxbasis ='cc−pvtz−ri') kmf.kernel () emf = kmf.e tot # C a l c u l a t e embedded CCSD c o n t r i b u t i o n f o r a range o f BNO t h r e s h o l d s emb = vayesta .ewf.EWF(kmf , solver ="CCSD", bno threshold =1e−4, solve lambda =True) with emb.iao fragmentation () as f: f. add atomic fragment (0) for j, eta in enumerate (etas ): vayesta .new log ("vayesta−N=% d a =%.3 f eta =%.2e.log" % (Nk , a, eta )) emb.change options ( bath options =dict( threshold =eta )) emb.reset( reset bath =False) emb.kernel () bsse.append (emf + emb.e corr ) bsse.append (emf + emb.get dm energy ()) emb bsse .append (bsse) emb es = np.array( emb es ) emb bsse = np.array(emb bsse ) # Add BSSE c o r r e c t i o n ( f a c t o r o f 2 due t o two atoms per u n i t c e l l ) es = emb es + 2 * ( emb bsse [0] − emb bsse [1]) FIG. 3. Comparison of different proposed DMET energy ex-pressions for the binding of Cl2 in a STO-6G basis.Each cluster consists of the atomic orbitals of one atom, with a single DMET bath orbital on the other, with all clusters solved exactly.Right column results include a fragment chemical potential optimization to constrain the total number of electrons, while this is omitted in the left column.