Application of fermionic marginal constraints to hybrid quantum algorithms

Many quantum algorithms, including recently proposed hybrid classical/quantum algorithms, make use of restricted tomography of the quantum state that measures the reduced density matrices, or marginals, of the full state. The most straightforward approach to this algorithmic step estimates each component of the marginal independently without making use of the algebraic and geometric structure of the marginals. Within the field of quantum chemistry, this structure is termed the fermionic $n$-representability conditions, and is supported by a vast amount of literature on both theoretical and practical results related to their approximations. In this work, we introduce these conditions in the language of quantum computation, and utilize them to develop several techniques to accelerate and improve practical applications for quantum chemistry on quantum computers. We show that one can use fermionic $n$-representability conditions to reduce the total number of measurements required by more than an order of magnitude for medium sized systems in chemistry. We also demonstrate an efficient restoration of the physicality of energy curves for the dilation of a four qubit diatomic hydrogen system in the presence of three distinct one qubit error channels, providing evidence these techniques are useful for pre-fault tolerant quantum chemistry experiments.

The rapid development of quantum hardware in recent years has spurred interest in the development of practical algorithms that do not require fault-tolerant quantum computers. It has been conjectured that the leading candidates for first demonstrations of practical algorithms on a pre-threshold device are so-called quantum/classical hybrid algorithms, such as the variational quantum eigensolver (VQE) for chemistry [44][45][46][47][48][49][50][51][52] and the quantum approximate optimization algorithm (QAOA) for optimization problems [53]. Recent experimental results [50] have shown that the quantum-classical feedback does provide robustness against noise in the device, and simple extensions are possible that allow one to further damp the noise through additional measurements [54,55].
A common step in almost all near-term implementations of these algorithms is the determination of an operator's expected value through a form of partial tomography of the quantum state. The statistical criteria relating the number of measurements from the distribution and the accuracy of the expected value are relatively straightfoward to determine [23,46,52]. When studying chemical systems, the general benchmark for accuracy is H ± Hartree where = 1.6 × 10 −3 is known as chemical accuracy and ± expresses a standard confidence interval. This stringent absolute accuracy is required for matching experimentally determined thermochemical properties such as heats of formation or ionization potentials [56]. Naturally, this accuracy requirement on the estimator for the Hamiltonian may require a large number of independent preparations of a state and measurements using current strategies.
Some routes to reduce the number of experiments by collecting Hamiltonian terms into commuting groups and dropping Hamiltonian terms with small coefficients have already been proposed [23,46,51,57]. Wecker et al. [48], estimated that for simulating the energy of ferrodoxin using this approach would require 10 19 total measurements. Though this estimate was made with pessimistic assumptions, the sheer number of measurements motivates one to pursue techniques to accelerate the operator averaging step of VQE and other hybrid algorithms.
The strategy we adopt in order to ameliorate the burden of excessive measurements is to leverage known structure in the marginals of the fermionic density matrices which have not yet been taken advantage of within the field of quantum computation. The set of necessary conditions on marginals of density matrices are known as n-representability conditions and were originally developed to use the reduced density matrices of fermionic and bosonic systems as the main computational variable instead of the wavefunction [58][59][60][61]. While in this work we focus on fermionic systems with at most pair-wise interactions and thus use constraints on the 2-marginal 2 D, these ideas are more generally applicable to problems with local Hamiltonian objectives, such as many instances of QAOA. Measuring the marginals of a state ρ, specifically the fermionic two-particle reduced-density-matrix (2-RDM), provides a powerful extension to VQE. These quantities are useful for connecting VQE to other quantum chemical techniques such as multiconfigurational self-consistent field (MCSCF), embedding procedures such as density matrix embedding theory, and augmenting the accuracy of electronic structure methods with perturbation theory, which is thought to be required to apply these methods to complex systems larger than one may fit in the quantum computer alone.
In this work we explore the utility of measuring marginals instead of directly measuring the Hamiltonian, and how the n-representability constraints on the marginals provide i ) a program for reducing the norm of the Hamiltonian directly leading to fewer measurements and ii ) using the set of necessary, but not sufficient constraints, on the fermionic marginals, known as p-positivity constraints, to propose two computational procedures for projecting measured marginals into the set of allowed marginals. The variance reduction that is developed is examined for linear hydrogen chains and generally shows an order of magnitude reduction in the required number of measurements.
We also introduce a series of polynomial time post-processing techniques for certifying and projecting measured marginals. This computational procedure is similar to maximum likelihood tomography except on a reduced density matrix space, thus making it efficient. Similar techniques have been applied to corrupted process tomography [62] and reduced state tomography on the RDMs of the system [63]. We compare four projection techniques that balance enforcing n-representability with computational efficiency. The first two techniques are based on positive projection of the 2-RDM matrix-which should be positive semidefinite-but does not include any constraints beyond positivity of the 2-RDM and fixing the trace. The third and fourth techniques add approximate n-representability constraints implemented through a more expensive iterative procedure and semidefinite program.
The paper is structured as follows: Sections I and II describe fermionic marginals and the n-representability conditions used in this work, Sections III and VI review the utility of marginals as they pertain to perturbation theory and a concentration bound for 2-RDMs that suggests a structured ansatz is required when parameterizing the space for hybrid quantum classical algorithms, Section IV discusses the optimal bound on the number of samples required to measure the expected value of a Hamiltonian and variance reduction from equality n-representability constraints, Section V discusses the 2-RDM projection procedures and how they can reduce the number of measurements required to measure to fixed accuracy and also restore physicality after state corruption by common error channels. We close with an outlook as to how n-representability techniques can further improve hybrid algorithms.

I. MARGINALS OF THE DENSITY OPERATOR
Here we introduce our terminology and set notation with respect to reduced density matrices, or marginals of the full density, in both the qubit and fermionic setting. Generally speaking, a marginal of a multivariable probability distribution is the partial trace, or integration, of a subset of the variables leaving a distribution on a smaller set of variables. Given a general quantum state ρ on n qubits where |ψ i are pure states of n qubits and i w i = 1, the set of p-qubit reduced-density matrices, or p-marginals, of the state are determined by integrating out q-qubits (such that n − q = p) of the joint distribution as resulting in n p different marginals each of dimension 2 p × 2 p . The coefficients n 1 , ..., n q on the trace operator indicate which qubits are integrated out of ρ and coefficients m 1 , ..., m p label the subsystem marginal. The result of marginalization is a distribution on the state space of p-qubits. We will interchangeably refer to these objects as marginals and reduced-density matrices (RDMs) alluding to the fact that like the von Neumann density matrix the marginals can be expressed as a sum where |φ j is a pure state on the subsystem of qubits that has not been traced out.
It is well known that this n p set of marginals is sufficient for calculating the expected value of a p-local Hamiltonian or other observable [64,65]. A p-local Hamiltonian is one where any term in the Hamiltonian involves no more than p-qubits interacting. Naturally, such a description of a quantum system is an attractive polynomial size representation. However, in order to use the set of marginals as computational objects (e.g. minimize energy with respect to them) rather than simply measuring them, the RDMs must satisfy certain constraints to ensure physicality. These constraints are termed "consistency" and are the requirement that all marginals satisfy Eq. (2) for the same initial state ρ. Despite the considerable structure on p ρ m1,...,mp , confirming a set of marginals are consistent was demonstrated to be QMAcomplete [64]. Naturally, one may ask if there is some alternative form of consistency, or approximation, that makes working with marginal distributions a computationally attractive approach.
Analogous to the case of qubits, the requirement for calculating expected values of k-local operators describing interactions of indistinguishable particles, such as fermions or bosons, is that only the k-marginal is needed. Note that as a point of clarification, while p-local qubit operators refer to operators acting on at most p qubits, k-local fermionic operators refer to interactions that derive from k-body interactions, and generically act on 2k fermionic modes. This does not complicate the methods to be discussed, but is a common point of confusion when working between chemistry and quantum computation. As an example, in chemical systems the energy is a functional of the 1-and 2-local fermionic operators and the 2-marginal of the system [59,[66][67][68]. Consider a Fock space constructed with a single particle basis associated with a Hilbert space H of size m. We can represent a general state on this space where each fermionic creation operator {a † m } is associated with a single particle basis state |φ m and |vac is the vacuum. We may consider a state ψ with fixed particle number n enforced by restricting the coefficients i m in Eq (4) to satisfy m i m = n. As noted in Reference [69], once an ordering of fermions is selected the Fock space states can be mapped to a Hilbert space of m-distinguishable spin- 1 2 particles. Marginals of the fermionic n-particle density matrix n D = |ψ ψ| = n D i1,...,in j1,...,jn |i n · · · i 1 j n · · · j 1 | involve integrating out particles by a trace operation. For example, the 2-RDM 2 D is obtained from n D by integrating out particles 3 to n.
By convention, in the quantum chemistry community the normalization constant for p D is typically scaled to n p when i 1 < i 2 < ... < i p or n!/(n − p)! when i k is allowed to be range over all values in [1, m] [59,61,70]. In this work we choose the later of the normalizations for ease of computation. From the study of n-representability theory, a number of efficiently implementable and necessary constraints on the one-and two-particle marginals ( 1 D and 2 D) are known. Specifically, defining where the simplest of these constraints may be enumerated as: 1. Hermiticity of the density matrices 3. The (p − 1)-marginal is related to the p-marginal by contraction-e.g. the 2-marginal can be contracted to the 1-marginal 4. The trace of each marginal is fixed by the number of particles in the system Tr 5. The marginals are proportional to density matrices and are thus positive semidefinite Additional constraints based on the quantum numbers of S 2 and S z operators can be derived for each marginal [71]. A short description of the form of the linear constraints inspired by fixed particle number n , fixed total angular momentum S 2 , and fixed projected total angular momentum S z are described in Appendix B.

II. THE n-REPRESENTABILITY PROBLEM
In the previous section we provided a brief introduction to the notation and setup of problems formulated through their marginal distributions. Here, we review a concise and elegant theoretical framework that allows one to derive the full set of representability conditions for fermionic systems, from which one may select a subset to form efficient approximations. The polynomial size of the p-marginals makes them attractive candidates for use as the representation of quantum systems. This was originally noticed by Coulson and Coleman [72,73] in the context of quantum chemical Hamiltonians where the energy operator is 2-local and thus a linear functional of the 2-RDM. The characterization and structure of valid 2-marginals arising from the integration of a fermionic density matrix led to the field of nrepresentability [59,73].
A significant amount of progress has been made in tackling the n-representability problem; most notably, the original works by Erdahl [74], Percus and Garrod [75,76], and Mazziotti [67,[77][78][79] formalize the n-representability problem by specification of an approximate set of constraints by parameterizing the polar cone of the set of 2-RDMs. Recently, Mazziotti formalized the complete structure of the ensemble n-representability constraints [79] and gave the structure of pure-state constraints for states with fixed particle number [78]. n-Representability has also been greatly influenced by quantum information: Liu [64] demonstrated that the n-representability problem is QMA-complete and Bravyi [65] and Klyacho [80] enumerated n-representability constraints for marginals of a pure-state.

A. n-Representability by Characterizing the Polar Cone
We begin with a geometric picture of the constraints within the space of fermionic density matrices. The formal characterization of the n-representable set of 2 D operators relies on characterizing the polar cone of the 2-marginals [59,60,68]. Consider the convex set of 2-marginals 2 D acting on the anti-symmetric two-fermion space ∧ 2 H, which may be defined in terms of its basis vectors {a ∧ b = a ⊗ b − b ⊗ a} for a, b ∈ H. The polar cone is defined as the subset of Hermitian operators that satisfy the positive projection condition, Operators of the polar cone 2 B ∈ 2D are positive operators with respect to the 2-RDM, which implies their nonnegativity with respect to all fermionic density matrices N D when lifted to the n-particle space. In quantum information this lifting procedures is accomplished by taking the tensor product of the operator with identities. For fermions, the lifting procedure involves the tensor product with the appropriate antisymmeterization operations.
The bipolar theorem states that elements of 2 D are completely characterized by the polar cone 2D Though specification of inequalities with elements of the polar cone provides a characterization of 2 D, it has the major drawbacks that a) there are infinitely many possible 2 B operators to check and b) checking 2 B requires checking if an exponentially large operator is positive [81].
The key approximation within n-representability is constructing a polynomial size approximation to the polar cone 2D a and deriving conditions on the 2-marginal through Eq (19). This is achieved by selecting a kth-order (where k < n) operator basis for 2D a . Given that the 2D a ⊂ 2D any representability conditions derived from 2D a will be an approximate set of representability conditions. By duality, the polar of 2D a implies 2 D ⊂ 2 D a . This naturally explains why variational calculations using the reduced-density matrix and approximate n-representability constraints are strictly a lower bound to the true energy [70,77,[82][83][84][85][86][87].
In the n-representability literature related to simulating fermions, fermionic operators up to a particular order-e.g.
. . }-are used as the operator basis for the approximate polar cone. Given a rank-k operator basis for the polar cone, we can define a real linear space of Hermitian that when constrained to be non-negative (implied by Eq. (19)) form a necessary set of conditions on the p-marginals of the von Neumann density matrix.
As an example, we will derive the famous 2-positivity conditions by restricting the rank of the monomials in the operator basis to rank less than 2 as setting and requiring that M 2 0. The {c} coefficients in Eq. (22) specify an arbitrary element of the approximate polar cone 2D a in a similar fashion to how a sum-of-squares polynomial can be expressed as a quadratic form c T Ac where elements of A represent various products of monomials. Considering symmetries of the system, such as fixed particle number, reduces the large matrix M 2 to a block diagonal matrix [82]. In this work we consider spinless fermionic Hamiltonians that commute with the number operator of our system and thus we can decompose M 2 into blocks where monomials correspond to number preserving operators-i.e {a † i a j , a † j a i , ...}. Restricting the operator M 2 to be non-negative against the 1-RDM and 2-RDM for all values of c yields the following inequalities where ψ is an arbitrary state. These conditions imply that the following matrices are positive semidefinite The positivity of these matrices canonically known as { 1 D, 1 Q, 2 D, 2 Q, 2 G} form a set of necessary conditions that the 2-marginal must obey. Clearly, as the 2-marginal is included in our set, the positivity of this operator naturally appears when building constraints starting from a polar cone picture. Though these constraints are formulated with pure states it is simple to show these conditions hold for mixed states as well.
The positivity of the operators on H-{ 1 D, 1 Q}-and ∧ 2 H-{ 2 D, 2 Q, 2 G}-are constrained to live in the space defined by equalities obtained by rearranging the fermionic ladder operators according to their anticommutation rules. A full enumeration of the equality constraints is contained in Appendix C. In this work we use the positivity constraints and the linear constraints from the anticommutation relationships as a set of constraints that 2-RDMs measured from a quantum resource must satisfy. This enables us to enhance the accuracy of estimation of quantities through employing basic physical relations.

III. CONCENTRATION OF MEASURE IN p-RDMS
Hybrid quantum-classical schemes depend both on the ability to perform the partial tomography that has been discussed as well as some parameterization of the quantum state space. Recent experimental proposals have considered the use of quantum states that are constructed from unitaries that are uniformly random with respect to the Haar measure acted upon a well defined initial state in order to demonstrate so-called "quantum supremacy" over classical devices [88]. A natural question is to ask whether these states can be harnessed as a resource within hybrid schemes. However, while these states are highly entangled, they demonstrate a number of surprising properties related to results on concentration of measure in high dimensional spaces. From the discussion above, we know it is sufficient to characterize a local fermionic system by its reduced marginals, so we investigate these states in that setting. In a colloquial sense, concentration results for random quantum states show that for many local observables, typical measurement results will yield the average value with overwhelming probability. Here, specifically we investigate the implications of this for p-particle reduced density matrices.
A key result we will leverage from the theory of measure concentration is Levy's lemma [89], which relates to the concentration of Lipshitz-continuous functions. Levy's lemma is as follows: consider a Lipschitz-continuous function f : || is the Euclidean norm in the surrounding space R 2n ⊃ S (2n−1) , and S (2n−1) is the unit sphere in R 2n which would correspond to a quantum state of log 2 (n) qubits. Drawing a point x ∈ S (2n−1) at random with respect to the uniform measure on the sphere yields We will leverage this lemma together with the fact that the expectation value on a normalized quantum state of any Hermitian operator A with bounded spectrum is Lipschitz continuous, with a Lipschitz constant that may be bounded by the norm of the operator. Consider first the case of 1-RDMs on a space consisting of any number of particles between 0 and M in M spinorbitals. We are interested in the average value of a matrix element [ 1 D] ij = ψ| a † i a j |ψ where |ψ is a randomly selected pure state. These states may be represented by the density matrix of all possible equally likely occupations, where I is the identity matrix on the space of 2 M possible occupations and d is the dimension of this matrix, d = 2 n . To evaluate a trace in this case, it suffices to choose the basis of all determinants ranging from 0 to M occupied spin-orbitals, created in a standardized ordering |ψ S = k∈S a † k | where | is the standard fermi vacuum state. To compute the average value of this operator, we thus need to evaluate On expanding this trace in the determinant basis, one realizes that the terms vanish unless i = j, unless this index appears in the wavefunction, and there is at least 1 particle in the wavefunction. The trace is given by Thus the average 1-RDM is a diagonal matrix with entries 1/2, corresponding to a system with an average number of particle of M/2. Note that for the operator a † i a j , the Lipschitz constant can be safely bounded by 1, as independent of the distance in space, this expectation value can differ by at most 1 due to the spectrum of the operator. As a result, Levy's lemma informs us how large we expect typical deviations to be from this average matrix element as Examining the case of 2-RDMs now using the same techniques, we find that From this, we see that the 2−RDM average also represents a generalization of a diagonal matrix to a 4-tensor with signs reflecting the anti-symmetry properties of the electrons. Moreover, it is easy to see from the operator 1 2 a † i a † j a l a k that the same concentration results hold for each individual matrix element as in the 1-RDM case, except that the Lipzschitz constant is modified by the normalization factor 1/2. Using the same process, one may derive similar results for higher particle reduced density matrices, and conclude that all higher RDMs concentrate towards diagonal matrices at a similar rate, with modified Lipschitz constants due to normalization. Now we consider the case where one restricts to random states within an n-particle subspace. Such a state can be represented by a density matrix of the form where I R is the identity operator on the full space of 2 M spin-orbitals subject to the restriction R to the space of n particles, and d is the dimension of that space. To evaluate this trace, we consider the basis of n-particle determinants. Using the same machinery as above, but restricting the sum to the case of only n-particles, we find Thus the average 1-RDM in a space of randomly generated n-particle states in M orbitals is the diagonal matrix with equally probable occupations on all sites. The exact convergence results of Levy's lemma depend on the spherical geometry of states, so direct application of the concentration results would require a modification of the lemma. However, we may embed the allowable quantum states into the space of log 2 (d R ) qubits and leveraging the fact that the norm of the operator of interest, and thus Lipschitz constant will remain the same. If one generates random quantum states within this embedded space, we can see that the concentration result holds with the same Lipschitz constant but modified dimension M n .
A consequence of these results is that random quantum states generate p-particle marginals that are effectively trivial, concentrating exponentially quickly to their average value as a function of system size. This would mean that one can evaluate the expectation value to specified precision of any p-particle observable (where p is held fixed as system size grows) efficiently on a classical computer for a random quantum state. Thus, we conclude meaningful explorations of the space of quantum states must be structured, whether it be in the path of time evolution or the design of a variational ansatz. If an ansatz and method used for hybrid quantum-classical methods cannot easily exit the space of Haar random states, the above analysis dictates it is doomed to give trivial observables for fermionic systems at relatively small system sizes.

A. Optimal Operator Averaging
Any L-sparse Hermitian operator on Hilbert space can be expressed as where w are real scalars and H are 1-sparse self-inverse operators which act on qubits. In second quantized formulations of electronic structure, the H are typically a special case of 1-sparse operators that have particularly convenient properties for measurement, namely they are strings of Pauli operators. Very often one is interested in estimating H by making measurements on M independent copies of a state |ψ . For instance, the typical procedure in variational algorithms is to estimate the energy H by repeatedly preparing a state and performing projective measurements onto the eigenstates of Pauli operators H . Since the H are self-inverse, the intrinsic variance of these projective measurements is computed as As sample variance is due to statistical fluctuations which are uncorrelated from sample to sample, the total variance of H scales as The real question is how to choose the number of samples for each term in the Hamiltonian M in order to minimize for the fewest overall measurements M = M . In [48], it is suggested that one choose M ∝ |w | with no guarantee of optimality. Here we prove that this choice is optimal by application of the Lagrange conditions (no proof was provided in [48]). We start with the Lagrangian where the constant λ is the Lagrangian multiplier. Our goal will be to solve the following expression for M , Accordingly, we take the derivative of L with respect to M to find, Plugging this back into Eq. (44), we find exactly that Therefore, If we insist on getting an asymptotic bound then we assume σ = O(1) and this leads us to confirm the optimality of the suggestion of [48].

B. Reducing Variance Using n-Representability
After mapping RDM elements to Pauli matrices (e.g. by using fermionic transforms such as the Jordan-Wigner or Bravyi-Kitaev transformation), one can express all equality n-representability constraints in the notation of Eq. (42). Assuming a list of K equality constraints, we will express the k th constraint C k as where we can always choose to have a constant term in the set (e.g. H 0 = 1 1) so that the equality sums to zero 1 . These constraints provide extra information about the relationships between expectation values which, in principle, should allow us to make fewer measurements. One very straightforward way to exploit this information in order to make fewer measurements is to add these constraints to the operator of interest in order to minimize the associated Λ from Eq. (49). Specifically, we have that where the relation H = H follows from the observation that C k = 0 for n-representable states due to the definition of C k in Eq. (50). When we do this, from Eq. (51) and Eq. (49) we can see that the number of measurements required is expected to scale as In order to minimize measurements then, the strategy is to compute depending on whether or not one has any meaningful prior on the expectation values H (which would provide a meaningful prior on σ via Eq. (43)). We can easily recast this optimization problem in a form amenable to efficient solution by common numerical methods. To do this, we think of the original Hamiltonian H as being expressed as a vector v H where each element of a vector represents a different fermionic operator; for example, we could map term a † p a q to vector element 1 + p + q N and map a † p a † q a r a s to 1 + N 2 + p + q N + r N 2 + s N 3 . The coefficients of the vector correspond to the coefficients of the term. Likewise, we can represent all of the constraints in a matrix C of dimension K × L where each constraint C k is a row of the matrix vectorized in the same way as v H . Then, we see that the optimization task at hand can be expressed as where β is a vector of dimension K. We can see now that this is a convex L 1 minimization. Such minimizations can be solved efficiently using simplex methods. We can cast L 1 minimization as the linear program: . In all plots, the blue circles correspond to the value of Λ 2 prior to applying the techniques of this section and the orange crosses correspond to the value of Λ 2 after applying the techniques of this section. In Figure 1a we demonstrate our technique on single atom calculations in the minimal basis. We see a consistent improvement of about one order of magnitude with a jump in values between the second and third rows of the periodic table. In Figure 1b we show a progression of hydrogen rings in the minimal basis of increased size where the distance between adjacent hydrogen atoms is fixed at the H 2 bond length of 0.7414Å. In Figure 1c we show how geometry affects these techniques by studying a square H 4 ring in the minimal basis as the spacing between hydrogens in the square is changed from 0.1 A to 1.8Å. Finally, in Figure 1d we examine how these techniques are effected as one increases the active space of an H 4 ring with atom spacing of 0.7414Å from four spin-orbitals to twenty spin-orbitals with calculations performed in a double zeta (cc-pVDZ) basis.
where q is an auxiliary vector variable. We provide freely available source code that generates the equality constraints and performs this optimization in the open source project OpenFermion [90]. Our code uses GLPK (GNU Linear Programming Kit) for the linear programming component via a Python wrapper known as CVXOPT. We show results of several numerical experiments which demonstrate the effectiveness of our method in Figure 1. These numerical experiments involved linear n-representability constraints coming from the 1-RDM trace, 1-RDM Hermiticity, 2-RDM trace, 2-RDM Hermiticity, 2-RDM to 1-RDM contraction, and the mappings between the 2-RDM and the other marginals in the 2-positive set. The set of mappings are described in Appendix C. These techniques often reduce the required measurements by an order of magnitude or more. Note that the method discussed here is actually quite a bit more general than presented. In particular, we have found an interesting method for transforming the Hamiltonian in a way that leaves its spectrum invariant in the fixed particle number sector. One might postulate that this method could also be used to optimize other simulation metrics, for instance, to reduce Trotter errors which are well known to be related to the norm of the Hamiltonian. Note that all constraints C k will be either Hermitian or anti-Hermitian operators. In particular, the Hermiticity constraints take the form of constraining anti-Hermitian components of the density matrix to be zero (thus those C k are themselves anti-Hermitian operators). So after applying the procedure here, one may end up with a H that is not Hermitian. Fortunately, one can restore Hermiticity without changing the value of Λ by creating a new Hamiltonian, H = ( H + H † )/2. H will be isospectral to H in the n-electron manifold and will have the same Λ as H.
It remains an open question if this variance reduction technique can be applied to other hybrid algorithms such as QAOA or quantum spin Hamiltonians. In the QAOA case, spin-marginals with significantly less structure than fermionic marginals must be considered. Linear constraints outside of the consistency of the marginals with overlapping support are generally unknown for an arbitrary Hamiltonian encoding a combinatorial optimization problem as is done in QAOA. For spin Hamiltonians constraints generated by fixed values of S 2 can also be considered. Further investigation of the consistency constraints and eigenvalue constraints in the form of pure-state constraints may provide additional variance reduction for problems described by fermionic and spin Hamiltonians.

V. n-REPRESENTABILITY INFORMED PROJECTION OF 2-RDMS
In this section we discuss the possibility of using n-representability conditions to improve 2-RDMs sampled from a quantum device. Errors in the 2-RDM measured from a quantum state can appear in multiple ways: 1) stochastic errors associated with the operator averaging techniques used to measure expected values and 2) device errors such as unexpected measurement correlations. We explored the utility of 2-marginal reconstruction schemes using nrepresentability rules to remove stochastic errors associated with sampling and state errors corresponding to noise of the device corrupting the intended state.
The general strategy is to design a process that projects a 2-RDM back into the set of n-representable 2-RDMs while balancing data-collection time and classical post-processing time. In this section we first discuss two simple purification procedures: positive-semidefinite projection of the measured 2-RDM with and without fixed-trace. These simple projection techniques are compared against procedures involving projections with knowledge of representability constraints.

A. Positive-Semidefinite Projection and Positive-Semidefinite Projection with Fixed Trace
The simplest of the n-representability rules enforce the 2-RDM to be Hermitian and non-negative with fixed trace. Given a measured 2-RDM we can define a computational procedure that determines the closest positive-semidefinite matrix.
The normalization is fixed by the particle number of the system. Without the trace condition the 2-RDM that minimizes the objective in Eq. (56) is the marginal constructed from the non-negative eigenvalues and eigenvectors of 2 D measured [91]. The procedure for finding a fixed-trace positive-semidefinite projection have appeared in contexts such as tomography [92], iterative purification of 2-RDMs from response theory [93], and finding positive-semidefinite correlation matrices [91,94]. This projection procedure benefits from computational simplicity but suffers from the lack of information about representability conditions. Therefore, given a sufficiently corrupted 2 D measured , physicality is not guaranteed after projection.

B. RDM Reconstruction with Approximate Representability Constraints
In order improve the projection criteria we add additional n-representability constraints to the minimization procedure outlined in Eq. (56). Given a collection of 2-RDM elements at some unknown precision, or possibly missing crucial elements, our reconstruction scheme seeks to minimize the Frobenius norm of the difference between the reconstructed 2-RDM and the set of known measurements subject to approximate n-representability constraints. Denoting E to be the difference between the reconstructed 2-RDM and 2 D measured the minimization procedure can be formulated as the following non-convex optimization problem: where A i is the map from one marginal to another required by the fermionic ladder operator algebra. The details of these mappings can be found in Appendix A. The squared Frobenius norm of the error ||E|| F is quadratic in 2-RDM. The optimization problem specified in Eq. (58) can be relaxed to a semidefinite program (SDP) by taking the Schur complement in the identity block of the large matrix M constrained to be positive-semidefinite. In M , I is the identity matrix, F is a matrix of free variables, and E is the error between the reconstructed 2 D and the 2 D measured . Taking the Schur complement in the identity block of M gives or Noting that the Frobenius norm of a matrix A, ||A|| F , is given as Tr[A † A], taking the trace of Eq. (62) gives the semidefinite relaxation of minimizing the Frobenius norm We can now formulate the non-convex RDM reconstruction scheme in Eq. (58) in terms of a semidefinite program: where A i are the linear maps between 2 D and the other matrices { 1 D, 1 Q, 2 D, 2 Q, 2 G, M } along with the trace constraint and antisymmetry constraint on 2 D. These maps are described explicitly in Appendix C and are used in the Section V D for the SDP reconstruction program.
C. Iterative Procedure for Projecting Noisy 2-RDMs Into the Approximate n-Representable Subspace Although the SDP projection procedure can be extended to include better approximate n-representability conditions, it suffers from the requirement of solving a semidefinite program. Despite the fact that an SDP can be solved in polynomial time with respect to the total number of variables and constraints, the high-order polynomial scaling of SDP algorithms makes the SDP-based project method infeasible for on-the-fly or online projections. An alternative to the SDP projection combines the faster projection techniques based on fixed-trace positive projection with augmented n-representability conditions. The projection technique is an iterative procedure that was originally developed to enforce approximate n-representability on 2-RDMs obtained through a response formalism [93]. The algorithm involves sequentially mapping 2 D to 2 Q to 2 G and enforcing the positivity and trace constraints at each of the operators. The algorithm's main drawback is that any representable 2-RDM is a valid fixed point. As a result, linear constraints on the 2-RDM preserving projected spin and total spin expectation values are not enforced and there is no guarantee that the 2 D obtained from the iterative procedure is any closer to the true 2-RDM. Therefore, it is likely required that the input 2-RDM measured from the quantum resources is sufficiently close to the true 2-RDM for this procedure to be most successful.
The algorithm starts by enforcing Hermiticity of the given 2-RDM matrix by averaging followed by a positive projection with fixed trace according to the procedure in Reference [94]. A detailed description of the algorithm for fixed-trace positive projection can also be found in the appendix of [93]. Given a system with r spin orbitals, n particles, and η = r − n holes, the iterative projection algorithm is as follows: 1 Enforce Hermiticity of 2 D and project to positive set with trace n(n − 1) 2 Map 2 D to the 2 Q 3 Enforce Hermiticity of 2 Q and project to positive set with trace η(η − 1) where η is the number of holes 4 Map 2 Q to 2 G 5 Enforce Hermiticity of 2 G and project to the positive set with trace n(η + 1) 6 Check the stopping condition associated with fixed trace for 2 D, 2 Q, and 2 G and positivity of their eigenvalues The iterative procedure is considered converged when the largest negative eigenvalue of any marginal in the 2-positive set is below a set threshold. The total algorithm is depicted in Fig. 2   FIG. 2: The iterative procedure for 2-positive approximate n-representability constraints. Starting with a noisy 2-RDM the flow diagram is followed until the largest non-negative eigenvalue falls below a set threshold. Eigenvalues are considered converged when the absolute value of the largest negative eigenvalue is less than 1.0 × 10 −7 . This stopping criteria was used for all numerical experiments with the 2-positive iterative scheme.

Reconstruction of small systems
To probe the utility of the n-representability inspired reconstruction schemes, we examined the accuracy of the energy and chemical properties obtained from 2-RDMs with simulated sampling noise for diatomic hydrogen and a linear four-hydrogen chain. All experiments involved corrupting the elements of a pure-state 2-RDM with Gaussian noise proportional to the amount of samples used in operator averaging, followed by reconstructing the corrupted marginal with the four projection procedures outline above. The accuracy and precision of the reconstructed energies, particle-number, projected spin expectation S z , and total spin S 2 are compared to provide the noise tolerance and precision of the various reconstruction schemes. We obtained the Hamiltonian and ground-state wavefunction for diatomic hydrogen and a linear four-hydrogen chain with a bond length of 0.75Å described with an STO-3G basis using the OpenFermion [90] and the OpenFermion-Psi4 plugin [95]. One hundred different corrupted RDMs were constructed by applying zero-mean Gaussian noise with variance 2 2 D pq rs measured = 2 D pq rs + N (0, 2 ).
This error model is bias-free because the energy is linearly proportional to the 2-RDM and has variance proportional to the error added to the 2-RDM elements. For each of the one hundred corrupted density matrices we solve for a projected 2-RDM with the positive projection, positive projection with fixed-trace, SDP n-representability reconstruction, and iterative n-representability projection techniques. For each method we find the mean-square-error (MSE) of the aforementioned observables over the projected 2-RDMs as a function of the noise parameter . Figure 3 contains a plot of the MSE of the energy estimator for H 2 decomposed into its variance and bias components, and a plot of the average trace distance of the reconstructed 2-RDMs from the true 2-RDM of H 4 . The solid bars in the MSE plot are the squared bias component of the MSE while the transparent bars are the variance component. In general, the n-representability inspired projection techniques decrease the variance of the energy estimator but introduce a bias. Similar MSE plots are shown for S 2 , S z , and n in Appendix E. The expected value for S z , S 2 , and n shows zero mean-squared-error for the SDP-based projection technique because these values are added as constraints to the semidefinite program. We refer to the correction of the three aforementioned expected values as restoration of physicality-i.e. the particle number expectation is what is expected for an isolated system. The reduced trace distance for the SDP projected 2-RDMs for H 4 indicates that the physicality constraints are important for removing errors from 2-RDMs measured from a quantum resource.

Reconstruction of marginals from the variational channel state model
One appealing application of projection techniques based on n-representability is purification of states corrupted by an error channel. The SDP n-representability reconstruction procedure ensures physicality of the states by ensuring known symmetries are preserved by formulating the projection as a constrained optimization. To verify this we used the SDP n-representability method in conjunction with the variational channel state error models presented in Reference [54] to demonstrate the restoration of physicality. The variational channel state model is implemented by corrupting a pure state |ψ with a channel described in Kraus operator form. We consider uniform uncorrelated single-qubit error channels associated with dephasing, amplitude damping and dephasing, and depolarizing noise. The dephasing and amplitude damping channels are parameterized with the assumption that 5% of the coherence time has elapsed with respect to T 1 and T 2 . For the depolarizing channel, the Kraus operators are constructed assuming 5% of the dephasing time T 2 has elapsed. For each point along the binding curve of diatomic hydrogen the action of the channel on the pure-state is calculated as Each ensemble 2-RDM is then reconstructed with 2-positive n-representability conditions using the SDP projection technique where the error matrices are set as the spin-adapted components of the 2-RDM associated with ρ channel . Spin adapting eliminates the need to explicitly enforce the antisymmetry of the 2-RDM elements and thus reduces the total number of constraints in the SDP. The energy of the the H 2 system under the action of each separate channel is plotted in Figure 4 along with the energy computed as a functional of the reconstructed 2-RDM. The kinks in the dephasing and amplitude + dephasing curves are associated with a spin-symmetry breaking, where the channels produced a mixed state dominated by a triplet state. The discontinuity in the depolarizing channel curve is associated with the channel state switching to be a mixed state with a large component of singlet character. The markers without a line in Figure 4 are the reconstructed 2-RDM with S z = 0 and S 2 = 0 imposed by linear constraints on the 2-RDM associated with Srepresentability [96]. Naturally the binding curves are now smooth as a function of bond distance. Though physicality is recovered by projecting onto the closest marginal with fixed symmetries, the energy increases at distances greater than 1.5 Angstroms for each error model and the potential energy minimum is shifted by -0.03Å for the dephasing channel, 0.062Å for the dephasing and relaxation channel, and 0.186Å for the depolarizing channel. The energy increase is due to constraining the projected 2-RDM to have the correct spin-symmetry when the the error channel, applied through the variational channel state model, has switched the ground state symmetry from a singlet to a triplet [54]. The increase of energy upon restoration of a symmetry is a well known effect in chemical systems and condensed matter systems. However, the qualitative improvement in the nuclear potential energy surface and the implications for forces derived from such a surface are more important than the energy increase that is incurred. The relative error between corrected 2-RDM and the uncorrected 2-RDM at the maximum bond distance considered in this work (3.0Å) is 18 percent for the dephasing channel, 8.7 percent for the dephasing and relaxation channel, and 9.9 percent for the depolarizing channel.

VI. USE CASES OF RDMS IN AUGMENTING ENERGY EXPECTATION THROUGH PERTURBATION THEORY
In this section, we briefly mention some of the additional use cases within chemistry for correcting the reduced density matrices beyond simple estimation of the energy, and review techniques that allow one to utilize them with only the 2-RDM. In traditional approaches to electronic structure on classical computers, the solution of the electronic structure problem within an active space often lacks the contributions from so-called dynamical correlation. This part of the correlation stems largely from the electronic cusp contributions and represents a low-rank interaction in many high-lying orbitals. Multi-reference perturbation theory techniques have been found to offer a good balance between cost and accuracy for including these contributions, and many approaches have been developed in this regard including complete active space second order perturbation theory (CASPT2) [97][98][99], multi-reference Moller-Plesset theory [100], n-electron valence perturbation theory (NEVPT2) [101][102][103], canonical transformation theory (CT) [104][105][106], perturbative explicitly correlated corrections ([2] R12 ) [107,108], and corrected multi-reference CI (MRCI+Q) [109,110].
In understanding how these techniques may be utilized within the quantum domain, one is most interested in those that are compatible with the measurements one expects to be able to feasible make on a quantum computer. This means that explicit knowledge of the determinant decomposition of a wavefunction cannot be required, and instead one prefers methods that require only the k-RDM of the electronic system, where hopefully k is small. Of those discussed, NEVPT2, [2] R12 , and CT can be applied in this way.
For example, in NEVPT2, the effective equations may be derived entirely with the use of the 4-RDM, and it does not require knowledge of determinant decomposition or full electronic wavefunction. However the 4-RDM is a relatively expensive quantity to estimate, and thus cumulant based approximations to the 4-RDM have been developed such that only the 2-RDM is required to correct the energy perturbatively [106,111].
FIG. 4: Energy curves for molecular hydrogen after action of three error channels. The uniform uncorrelated single qubit error channels applied in the variational channel state model assume the entire circuit is executed within 5% of the total coherence time. The solid lines are the curves without n-representability reconstruction while the markers of the same color indicate the reconstruction under exact n-representability conditions. The black curve, depicted by exact is the true ground state energy curve. The label amp refers to the single-qubit amplitude damping channel associated with T 1 time, phase refers to the single-qubit dephasing channel associated with T 2 time, and depolarizing is the associated with the single-qubit depolarizing channel.
The cumulant expansions decompose the reduced density matrices into their non-separable (connected) components and separable unconnected components, and are quite useful for both developing approximations and enhancing understanding. A convenient notation for expressing these expansions is given by the Grassmann wedge product defined generally by where π and σ are permutations on the upper and lower indices of the tensor a ⊗ b and denotes the parity of each permutation. As an example one might consider the wedge product of a cumulant matrix with itself With this notation, the reduced density matrices up to k = 4 may be expressed in terms of the cumulant expansions: These methods that neglect contributions from the 3-and 4-cumulant dramatically reduce the number of samples that would be required to estimate the energy, however the approximations introduce some error in the corrections. An additional factor is the consideration of the impact of measurement noise on the RDM in these numerical procedures.
While the purification techniques suggested in this draft are expected to mitigate some problems in this regard, the effect on an iterative procedure may be dramatic. For this reason, NEVPT2 and its approximations may be the preferred method for use with quantum computers. The strongly contracted equations that explicitly define the corrections to the energy are given in Appendix A of Ref. [102], and cumulant reconstruction methods may be used directly to form approximations as dictated in Ref. [111]. The above equations can be used to derive cumulant based approximations for up to the 4-RDM from using only the 2-RDM by setting the irreducible three-and four-particle components ∆ 3 and ∆ 4 to 0. Alternatively, more sophisticated approximation schemes have been developed in the context of RDM theory within traditional quantum chemistry [112,113] VII. CONCLUSION Reducing the number of experiments required in the partial tomography step of VQE and other hybrid algorithms is necessary for hybrid classical/quantum algorithms to become useful simulation tools. In this work we proposed using representability conditions on fermionic marginals as a route towards reducing the number of measurements required in a VQE operator averaging step. Directly measuring the marginals provides the information required to integrate the results from hybrid quantum algorithms with classical quantum simulation methods.
Fermionic representability conditions were used in two majors ways: 1) re-expressing the Pauli sum Hamiltonian in a form where the total variance is minimized given a fixed state and 2) designing projection techniques based on necessary conditions on 2-marginals. Both techniques have shown significant promise toward minimizing the total number of measurements and reducing stochastic noise seen from sampling the quantum resources.
The SDP projection techniques are especially attractive because they are constructed to return physical states. We have observed the restoration of physicality when a pure-state is corrupted by single-qubit error channels. More significant representability conditions based on positivity of elements of the 3-RDM may provide more accurate reconstruction under noise. Another area that is especially exciting is the application of pure state constraints in the reconstruction procedure. The conditions discussed in this work do not constrain the marginal to be integrated from a pure state. As a result, the positivity constraints likely do little for the preservation of physicality as compared to the equality representability constraints, such as fixed number operator and constrained total spin. Pure-states could potentially further reduce systematic errors associated with gates.
We fully expect that further investigation of representability conditions with realistic systems, more realistic error models, and performant numerical implementations will demonstrate the utility of measuring marginals for quantum simulation using hybrid classical/quantum algorithms.
For the calculations depicted in Figure 4 the total angular momentum S 2 , projected angular momentum S z , and the particle number n operators are used as linear constraints in the semidefinite program. Just like the energy, these operators are linear functionals of the 1-and 2-RDMs. To see this we express each component of the aforementioned operators as sums of fermionic operators resulting in polynomials of rank-4 and rank-2.
m is the total number of spin orbitals, and α (β) denotes the two eigenfunctions of the z-angular momentum operator for a single fermion. The expected value of each operator can be determined by summing over the indicated elements of the 2-RDM and 1-RDM.
FIG. 5: The mean-squared-error (MSE) in the estimators for energy H , total spin S 2 , projected spin S z , and particle number n for H 2 over one-hundred samples. MSE is decomposed into variance (clear bars) and bias (solid bars).