Local, Expressive, Quantum-Number-Preserving VQE Ansatze for Fermionic Systems

We propose VQE circuit fabrics with advantageous properties for the simulation of strongly correlated ground and excited states of molecules and materials under the Jordan-Wigner mapping that can be implemented linearly locally and preserve all relevant quantum numbers: the number of spin up ($\alpha$) and down ($\beta$) electrons and the total spin squared. We demonstrate that our entangler circuits are expressive already at low depth and parameter count, appear to become universal, and may be trainable without having to cross regions of vanishing gradient, when the number of parameters becomes sufficiently large and when these parameters are suitably initialized. One particularly appealing construction achieves this with just orbital rotations and pair exchange gates. We derive optimal four-term parameter shift rules for and provide explicit decompositions of our quantum number preserving gates and perform numerical demonstrations on highly correlated molecules on up to 20 qubits.


I. INTRODUCTION
Hybrid quantum classical variational algorithms, including those of the variational quantum eigensolver (VQE) type [1,2], are among the leading candidates for quantum algorithms that may yield quantum advantage in areas such as computational chemistry or machine learning already in the era of noisy intermediate scale quantum (NISQ) computing [3]. A foundational issue in VQE [1,2], and in many of its extensions and alternatives [4][5][6][7][8][9][10][11], is finding a "good" definition of the entangler circuit. Here the qualifier "good" has many facets, possibly including: (1) Providing an efficient approximate representation of the target quantum states in the limit of an intermediate (ideally polynomial-scaling) depth (2) Consisting of a low number of distinct physically realizable gate elements (3) Exhibiting a simple pattern of how these gate elements are applied (4) Exhibiting sparse spatial locality that is further compatible with device connectivity (5) Exhibiting simple analytical gradient recipes and robust numerical convergence behavior during optimization of the VQE energy, e.g., by mitigating the effects of barren plateaus [12] (6) Respecting exactly the natural particle and spin quantum number symmetries of the target quantum states, e.g., as notably explored in [13] (7) Providing an exact representation of the target quantum states in the limit of sufficient (usually exponential-scaling) depth.
In this work we develop a VQE entangler circuit recipe for fermions in the Jordan-Wigner representation and show, or at least provide evidence, that it obtains all facets, with facet (5) partially left to future numerical studies. Perhaps the most notable property of our fabrics is the exact preservation of all relevant quantum numbers the individual gate elements of the fabric, which is why we refer to them as quantum number preserving (QNP). This property may be critical for employment of VQE in larger systems, where contaminations from or even variational collapse onto states with different particle or spin quantum numbers can severely degrade the quality of the VQE wavefunction.
Note that after we posted the first version of our manuscript, we became aware of the generalized swap network reformulation of k-UpCCGSD of [17]. This paper refactors k-UpCCGSD to use nearest-neighbor connectivity, yielding a circuit fabric that could be written in terms of four-qubit gates containing diagonal pair exchange and orbital rotation elements in a very similar manner as ourQ-type QNP gate fabric discussed below. There are some tactical differences in the qubit ordering and the generalized swap network paper does not emphasize the role of quantum number symmetry as much as the present manuscript. Moreover, the origin of thê Q-type QNP gate fabric as a simplification of our more-completeF -type QNP gate fabric of Appendix D provides a markedly different approach to developing this gate fabric. In any case, we encourage any readers interested in the present manuscript to also explore [17]. arXiv:2104.05695v2 [quant-ph] 10 May 2021 SU (4) SU (4) SU (4) SU (4) SU (4) SU (4) SU (4) SU (4) SU (4) SU (4) . . . ∼ = SU (2 6 ) SU (4) := exp(X) :X = −X † , Tr(X) = 0,X ∈ C 4 × C 4 FIG. 1: Sketch for N = 6 of a gate fabric universal for SU(2 N ) providing inspiration for the fermionic quantum number preserving gate fabrics developed here. The gate fabric is a 2-local-nearest-neighbor tessellation of alternating 15parameter, 2-qubit SU (4) gates. The SU (2) gate in the SU (4) gate decomposition on the bottom line is the 3-parameter universal gate for the 1-qubit Bloch sphere. The indicated 24-parameter decomposition of SU (4) is overcomplete for the 15-parameter SU(4) group.

II. GATE FABRICS
Our VQE entangler circuit recipe draws inspiration from the well known fact that the qubit Hilbert space (without any fermionic symmetries) SU(2 N ) can be spanned by a tessellation of 2-qubit gates universal for SU(4) in alternating layers (see Figure 1). This tessellation can formally be repeated to infinite depth. However, one finds that after some finite, N -dependent critical depth of order O(2 2N ), additional gate layers do not increase the expressiveness of the circuit, as formal completeness (denoted "universality") in SU(2 N ) is achieved. In practice usually shorter circuit depths are of interest. For instance, one may consider the case where the tessellation is restricted to be polynomial scaling in N , in which case universality cannot be exactly achieved. However, a good approximation of specialized (e.g., physically relevant) parts of some subgroup may still be achievable in a way that is tractable to compute even on a NISQ computer but intractable to compute with a classical device.
Particularly striking in Figure 1 is the locality (alternating nearest neighbor connectivity) and simplicity (single gate element) of the circuit, properties of what we call a "gate fabric." More precisely, throughout this manuscript, we define a gate fabric for a subgroup of SU(2 N ) to be a tessellation of gates over N -qubits with the following properties: 1. Simplicity: Composed of a single type of k-qubit, l-parameter gate element (with a known decomposition into elementary gates), where k and l are independent of N . 2. Linear Locality: When the qubits are thought of as arranged on a vertical line the gate elements are arranged in layers and connect up to k contiguous qubits. 3. Universality: Achieving universality within the target subgroup of SU(2 N ) within a finite number of layers depending on N . 4. Symmetry: commuting with all symmetry operators used to define the subgroup of SU(2 N ), i.e., [Û ,N ] = 0, whereÛ is the circuit unitary for any set of parameters andN is the symmetry operator. Depending on the subgroup of SU(2 N ) of interest it can be more or less difficult to find fabrics akin to the one shown in Figure 1. In Appendix A we discuss the trivial restriction to SO(2 N ) and the less-trivial restriction to subspaces of definite Hamming weight.

III. GATE FABRICS FOR FERMIONS UNDER THE JORDAN-WIGNER MAPPING
The focus of this work is the construction of gate fabrics for the subgroup F(2 2M ) ∈ SU(2 N ) constrained to spin-restricted fermionic symmetry under the Jordan-Wigner representation. To make this more precise, we define M real orthogonal spatial orbitals {|φ p } M p=0 . For each spatial orbital, we define corresponding α(β) spin orbitals |ψ p := |φ p |α (|ψp := |φ p |β ) for a total of N := 2M spin orbitals in a spin-restricted formalism. We associate the occupation numbers of these spin orbitals with the occupation numbers of N qubits. We number the qubits in "interleaved" ordering . . . |1 β |1 α |0 β |0 α . The fermionic creation/annihilation operators are defined in terms of the qubit creation/annihilation operators via the Jordan-Wigner mapping in "α-then-β" ordering, p ± := (X p ∓ iŶ p )/2 p−1 q=0Ẑ q andp ± := (Xp ∓ iŶp)/2 p−1 q=0Ẑq M −1 q=0Ẑ q . We note that for the majority of applications in the space of spin-1/2 fermions, the governing Hamiltonians are real (e.g., for non-relativistic electronic structure theory), and so we restrict from complex to real unitary operators, i.e., SU(2 N ) → SO(2 N ). The spin-restricted fermionic subgroup is then defined as the subgroup ofÛ ∈ SO(2 N ) that respect the commutation relations [Û ,N α ] = 0, [Û ,N β ] = 0, and [Û ,Ŝ 2 ] = 0. Here the α(β) number operator isN α : The spin-squared operator isŜ 2 := pq pq †p †q + (N α − N β )/2+(N α −N β ) 2 /4, and does not admit a local description in terms of Pauli operators in the Jordan-Wigner basis (we provide further details in Appendices B and C). We denote this real subgroup, preservingN α ,N β , and S 2 , as F(2 2M ). Naively one might expect that there should not be any local gate fabric exactly preserving all three fermionic quantum numbers, since theŜ 2 operator is non-local. The crux of this work is thus the simple quantumnumber-preserving gate fabric of Figure 2. This gate fabric is composed of 2-parameter 4-qubit gate ele-mentsQ, each composed of a 1-parameter 4-qubit spin-adapted spatial orbital rotation gate QNP OR (ϕ) and a 1-parameter 4-qubit diagonal pair exchange gate QNP PX (θ).
We describe further related quantumnumber-preserving gate fabrics for F(2 2M ) in Appendix D -these were the progenitors of the simpler gate fabrics shown in the main text, and may have advantageous properties in specific realizations of VQE entangler circuits.
Facets (2-4) of gate fabrics are manifestly fulfilled for all these proposals and facet (6) holds by construction as all gates individually preserve all quantum numbers. For facets (1) and (7) we provide numerical evidence below and in Appendix G. It is worth noting that these tests numerically indicate that our gate fabrics are universal in the vast bulk of quantum number irreps (with the exception of a few high-spin edge cases for theQ-type gate fabrics of the main text, see Appendix G), i.e., that they may be used for cases where S = 0 and/or where N α = N β (including both even and odd spin cases). We believe that the methods from [30,31] or [23] can be used to rigorously show universality in the (bulk of the) quantum number sectors of F(2 2M ) in all cases (as well as that our circuits are polynomial depth -approximate unitary t-designs and form -nets). Working out the details of a rigorous proof, which we believe has to be done spin sector by spin sector in some cases, is however beyond the scope of this work.
For QNP OR , QNP PX (and all other parametrized quantum number preserving gates introduced in Appendix D) we provide explicit decompositions into elementary gates in Appendix E. We further provide generalized parameter shift rules [32][33][34][35] for theses gates in Appendix F 2, enabling a computation of the gradients with respect to their circuit parameters with a maximum of four distinct circuits and without increasing circuit depth, gate count, or qubit number. We compare this gradient recipe to the generalization presented in [36], extend the variance minimization technique from [37] to the QNP gate gradients and note that our new rule can be applied to a large variety of other gates.

IV. NUMERICAL DEMONSTRATIONS
To numerically investigate the properties of the gate fabric from Figure 2 and to collect evidence that it satisfies all facets of a "good" entangler circuit, we consider two prototypical examples of highly correlated molecular ground states: The first is p-benzyne, which exhibits a biradical open-shell singlet ground state, with two unpaired electrons indicated by significant deviations from Hartree-Fock natural orbital occupation numbers, and four other moderate deviations from Hartree-Fock natural orbital occupation numbers. We use the geometry from [38], build the orbitals at RHF/cc-pVDZ, and construct a (6e, 6o) active space Hamiltonian with the orbitals ranging from HOMO-2 to LUMO+2. This corresponds to a case of M = 6 spatial orbitals (i.e., N = 12 qubits), and we focus on the ground state irreducible representation (N α = 3, N β = 3, S = 0). For a larger test case, we consider naphthalene, which while not intrinsically biradical, has multiple natural orbitals with significant deviations from Hartree-Fock natural orbital occupation numbers. We build the orbitals at RHF/STO-3G, and then construct a (10e, 10o) active space Hamiltonian consisting of the π and π * orbitals. This corresponds to a case of M = 10 spatial orbitals (i.e., N = 20 qubits), and we focus on the ground state irreducible representation (N α = 5, N β = 5, S = 0). In both cases, we consider VQE gate fabrics of the form of Figure 2.
Our final VQE circuit starts with the preparation of an uncorrelated product state by applying local PauliX gates to appropriate qubits of an all-zero state depending on the number of alpha and beta electrons. The qubits are chosen such that for all parameters equal to zero in the following fabric the state is transformed to the state with the energetically lowest orbitals occupied. We then consider two parameter initialization strategies: (A) The fabric is initialized with all θ = 0 and all ϕ = π/2 and Π = QNP OR (π) (solid lines in 3) (B) The fabric is initialized with all θ = 0 and all ϕ = π andΠ =Î (dotted lines in 3).
In both cases we optimize the VQE ground state energy with respect to the VQE gate fabric parameters via L-BFGS. As the purpose of this study is to explore the expressive power of this gate fabric, we consider neither shot noise nor decoherence. This restriction permits the use of analytical expressions for the Hamiltonian expectation values and VQE parameter gradients thereof, greatly accelerating the classical statevector simulation of the VQE. Figure 3 shows the salient results of this study. For the case of p-benzyne, Figure 3a shows the VQE ground state energy vs. full configuration interaction (FCI) with respect to the depth of the gate fabric. The first notable point is that the fabric is able to provide higher accuracy than either Hartree-Fock (HF) (a fabric of QNP OR gates -redundant here due to the use of Hartree-Fock orbitals in the active space) or doubly-occupied configuration interaction (DOCI) (a fabric of QNP PX gatesequivalent to the pUCCD ansatz from [22]). Focusing on the early convergence behavior on the left side of the plot, even with only a few layers of the VQE gate fabric, e.g. ∼ 50 − 80 parameters, absolute accuracy of 1 kcal mol −1 is achieved, which is commonly referred to as chemical accuracy. As the gate fabric depth is increased, roughly geometric (exponential) convergence of the absolute energy is achieved, modulo some minor aberrations due to difficulties in tightly converging the L-BFGS-based numerical optimizations of the VQE gate fabric parameters. Focusing on the later convergence behavior on the right side of the plot, as the number of parameters in the VQE gate fabric approaches the number of parameters in the FCI problem (note that in this irrep there are 175 configuration state functions (CSFs), see Appendix C.3), the error convergence turns sharply downward. At 180 parameters c := cos(λ/2) s := sin(λ/2) FIG. 2: Proposed gate fabric for F(2 2M ) (sketched for M = 4). The spin orbitals in Jordan-Wigner representation are in "interleaved" ordering with even (odd) qubit indices denoting α (β) spin orbitals. The Jordan-Wigner strings are taken to be in "α-then-β" order. The gate fabric is a 4-local-nearest-neighbor-tessellation of alternating even and odd spatial-orbitalpair 2-parameter, 4-qubitQ gates. EachQ gate has two independent parameters and contains a 1-parameter, 4-qubit spatial orbital rotation gate QNPOR(ϕ) and a 1-parameter, 4-qubit diagonal pair exchange gate QNPPX(θ). The order of QNPOR and QNPPX (note [QNPOR, QNPPX] = 0) does not seem to substantially change expressiveness at intermediate depths. QNPOR(ϕ) implements the spatial orbital Givens rotation |φ0 = c|φ0 + s|φ1 and |φ1 = −s|φ0 + c|φ1 , with the same orbital rotation applied in the α and β spin orbitals. QNPPX(θ) implements the diagonal pair Givens rotation, |0011 = c|0011 + s|1100 and |1100 = −s|0011 + c|1100 . The real 1-parameter, 1-qubit rotation gate isRy(λ) := e −iλŶ /2 . InQ, we include the optional constantΠ gate, for which natural choices include the 4-qubit identity gate, i.e.,Π =Î, or the fixed spin-adapted orbital rotation gateΠ = QNPOR(π). In the latter case, the gate fabric with all parameters {θ = 0} and {ϕ = 0} promotes exchange of orbitals. We find that the choice ofΠ ∈ {Î, QNPOR(π)} does not appear to affect the expressiveness of the quantum circuit, but the latter choice has turned out to be advantageous during gradient-based parameter optimization. Regardless of the choice ofΠ, this gate fabric exactly preserves the real nature of the subgroup, exactly commutes with theNα,N β , andŜ 2 symmetry operators, and numerically appears to provide universality at sufficient parameter count.
we are able to achieve very tight convergence to errors of ∼ 10 −10 E h relative to FCI, numerically indicating the onset of universality. Figure 3b shows the sorted power spectra of the computational basis state (determinant) amplitudes of the various VQE gate fabrics and the FCI state. The exact zeros in the FCI state amplitudes are an artifact of the D 2h spatial point group symmetry of this molecule, which our VQE gate fabric was not optimized to capture. Even for low VQE gate fabric circuit depths, we see that all determinants are populated by nonzero amplitudes, with a compromise apparently being made to allow for some nonzero error in all amplitudes to provide for the best variational energy. As more layers are added to the gate fabrics, the precision of the amplitude spectra increases, as indicated by, e.g., significant attenuation of the symmetry-driven zero block of the amplitude spectrum. The tail of amplitudes exactly zero in FCI is exactly extinguished in the VQE state only when numerical universality is achieved at a 180-parameter VQE gate fabric. This behavior is reminiscent of the nonzero but structured tensor factorized representation of the determinant amplitudes in coupled cluster theories, where here the tensor structure is provided by the local quantum gate fabric.
Moving to the larger test case of naphthalene, Figure 3c tells a similar story as the corresponding plot for p-benzyne. Here we see similar and roughly geometric convergence of energy error vs. VQE gate fabric depth and parameter count, albeit with a smaller prefactor. As with p-benzyne, the VQE gate fabric rather quickly outstrips both the HF and DOCI ansätze, which its primitive gates are constituted from, and achieves chemical accuracy of ∼ 1 kcal mol −1 in absolute energy at just ∼ 800 − 1000 parameters (there are 19404 CSFs in this irrep, so universality is not reached for any of the depth explored here). Figure 3d considers convergence of the energy with respect to the L-BFGS epoch for a number of different VQE gate fabric depths. A first key finding is that making the gate fabric deeper decreases the epoch count needed to converge to chemical accuracy. A second key insight is that, while initialization strategy (B) has shallower circuits and ultimately achieves lower energy error at very high epoch count, plateaus are visible during the optimization with the L-BFGS optimizer ( Figure 3). Strategy (A), which exchanges orbitals by means of the non trivial choiceΠ = QNP OR (π), appears to circumvent the plateaus entirely and for deeper circuits speeds up (the power-law like) convergence to below chemical accuracy.
The fabric presented here has favorable properties for implementation on NISQ hardware: The 12 qubit ansatz at 110 parameters is without (with)Π gates decomposable into elementary gates (2-qubit controlled Pauli and 1-qubit gates) with resulting depth of 507 (617). The 20 qubit ansatz at 1080 parameters without (with)Π gates has depth 2761 (3361) in such decomposition. To put this into perspective, a single trotter step of a naive UCCSD circuit has gate depth ≈ 6600 (12 qubits), respectively ≈ 57600 (20 qubits). Another considerable advantage is that only N −2 unique 4-qubit gates have to be calibrated on hardware as the structure is repetitive after the first two layers.

V. COMPARISON WITH OTHER ENTANGLER CIRCUITS
Having numerically demonstrated the features of the VQE gate fabric, it is worth considering the relationship of this gate fabric to other proposed VQE entangler circuits. There has been substantial prior work along these lines in the past few years.
For one instance, the hardware efficient ansatz [19, 20] is manifestly a local gate fabric, using essentially SU(4) entangler elements or subsets thereof from the native gate set of the underlying quantum circuit architecture. However, this gate fabric does not respect the particle or spin quantum number symmetries, and therefore is likely to encounter substantial difficulties in locating low-lying states within a target quantum number irrep, particularly in larger active spaces.
In another direction, there are myriad proposed entangler circuit constructions which are either already explicitly or in principle could be adapted to real amplitudes and strict commutation with the number and/or spin-squared symmetry operators, but which are either nonlocal circuits or composed of heterogeneous gate layers. For instance, UCCSD [1,14,15], (here referring to the Trotterized version thereof) and its sparse and/or low-rank cousins k-UpCCGSD [9,16], , and Jastrow-Factor VQE [18] all may have the power to achieve universality at sufficient depth, e.g., as proved in a recent analysis of distangled UCC [23] and have been either partially or completely symmetrized already. However, as written, all of these ansätze require nonlocal gate elements that, e.g., mediate excitations among non-adjacent spin orbitals in UCCSD, and thus are not gate fabrics. Moreover, many of these constructions involve heterogeneous gate layers. For a canonical instance, Jastrow-factor VQE [18] involves alternating circuit layers of orbital rotations and substitutions (with the last of these being nonlocal and complex-valued in the usual formulation). Of all methods discussed in the prior literature, k-UpCCGSD is likely closest to our proposed method, with products of single and diagonal double substitution operators comprising the method. k-UpCCGSD as described in [16] involves nonlocal pair substitutions and therefore does not yield a local gate fabric. Note however that the generalized swap network reformulation of k-UpCCGSD described in and around Equation 12 and Figure 7 of [17] (noticed after the first version of this manuscript was posted) appears to realize k-UpCCGSD by means of a local circuit composed of 4-qubit gates that is essentially a gate fabric.
Yet another interesting direction to consider is previously proposed true gate fabrics that preserve quantum number symmetry, but do not achieve universality. Orbital rotation fabrics [39-41], (i.e., Hartree-Fock) are clearly one example here, but so too is doubly occupied configuration interaction (DOCI), for which a gate fabric was developed with the pUCCD ansatz [22]. Both of these ansätze have the interesting property that they can be mapped into gate fabrics requiring only M qubits, but both fail to reach FCI universality as the parameter depth is increased.
Another interesting gate fabric construction is the "gate-efficient ansatz" presented in [24], which presents as a gate fabric that preserves total particle number N α + N β , but does not appear to respect high-spin particle numberN α −N β orŜ 2 symmetry. Yet another interesting entangler is the "qubit coupled cluster" approach presented in [25], which essentially implements a partial spin adaption of UCCSD to preserveN α andN β symmetry within the single and double excitation operations, but neither preservesŜ 2 symmetry nor attains the structure of a local gate fabric. Another related approach presented in [26] constructs fermion-adapted excitation operators which preserve particle number symmetry but not spin symmetry, and additionally are aimed at optimizing the number of CNOT gates in non-gate-fabric UCCSD methods. An entangler that has the potential to preserve all quantum number symmetries with additional spin-adaption work is the QAOA-inspired Pauliterm approach, presented in [27], but this approach yields highly nonlocal circuits which do not resemble gate fabrics. Another approach which has some intersection with the present work is the correlating antisymmetric geminal power (AGP) approach explored in [28], which first implements classically-tractable APG to provide state preparation in quantum circuits and then augments AGP with an anti-Hermitian pair hopping entangler which resembles our pair exchange gate. However, the correlating AGP is not written in the form of a local gate fabric.
Another interesting direction to explore that we propose here is alternative local gate fabrics that fully preserve quantum number symmetry, but which exhibit different gate constructions than the QNP PX and QNP OR gates used in Figure 2. Examples of such gate fabrics using generic 4-qubit 5-parameter FCI gates and decompositions of these gates into QNP PX , QNP OR , 1hole/particle substitution, and pair-break up/down gates are described in Appendix D.
One additional interesting direction is the "symmetry preserving state preparation circuits" of [13]. This work primarily focuses on total number symmetry, but does introduce a four-qubit gate that preserves particle and spin quantum numbers via a hyperspherical parametrization.
Note that an alternative approach to the exact symmetry preservation explored in this manuscript is symmetry projection [42,43], which often requires ancilla qubits and extra measurements due to the necessarily non-unitary nature of the projection operation.

VI. SUMMARY AND OUTLOOK
In this work, we set out to construct doppelgängers of the well known gate fabric (i.e., a potentially infinitely repeatable, simple and geometrically local pattern of gate elements that span the parent group at sufficient depth) for the unrestricted qubit Hilbert space SU(2 N ) consisting of simple 2-qubit gate elements SU (4). Our major result is the construction of a gate fabric for the important special case of spin-1/2 fermionic systems in the Jordan-Wigner representation F(2 2M ) consisting of simple 4-qubit gate elementsQ. Each 2-parameterQ gate comprises a 1-parameter spatial orbital rotation gate QNP OR (ϕ) and a 1-parameter diagonal pair exchange gate QNP PX (θ). A fabric made of either of these gate elements alone does not achieve FCI universality with sufficient parameter depth, but our VQE gate fabric, being an amalgamation of the two appears to be able to do so. Moreover, at intermediate depths, the VQE gate fabric appears to be pragmatically expressive as evidenced by tests of the ground state energy convergence in strongly correlated molecular systems. It is worth emphasizing that these properties seem to hold in the vast bulk of quantum number irreps, i.e., that these fabric circuits can be applied for cases where S = 0 (including even or odd spin cases) and/or where N α = N β (see Appendix G for details on specific high-spin edge cases that are not universal with theQ-type QNP gate fabrics of the main text, but that can be addressed with elements of theFtype QNP gate fabrics of Appendix D). Many important questions remain regarding our quantum number preserving gate fabrics. These include: (1) How does the numerical optimization of parameters for such gate fabrics behave in the presence of shot and/or decoherence noise? (2) How can numerical optimization algorithms be adapted to exploit the knowledge that the VQE entangler circuit is a gate fabric? (3) Is the fixedΠ gate construction or an extension thereof an effective way to mitigate barren plateaus during numerical optimization? (4) How does the VQE gate fabric perform for relative properties, for properties at different nuclear geometries, and for properties in different quantum number irreps? (5) Is the construction of the VQE gate fabric in terms ofQ gates optimal, or do more elaborate constructions, e.g., using theF gates of Appendix D provide additional benefits? (6) What is the scaling behavior of the error in absolute and/or relative properties as a function of parameter depth for representative interesting molecular systems? (7) Can the gate fabric be adapted to additionally exploit external symmetries such as spatial point group symmetries, e.g., as explored in [44]? Taken together, the results of this work might provide an interesting guide for the required symmetries and limiting simplicities when constructing more elaborate VQE entanglers for fermionic systems.
Acknowledgements: RMP is grateful to Dr. Edward Hohenstein for many discussions on the structure of thê S 2 operator. The authors further thank Fotios Gkritsis for discussions. QC Ware Corp. acknowledges generous research funding from Covestro Deutschland AG for this project. Covestro acknowledges funding from the German Ministry for Education and Research (BMBF) under the funding program quantum technologies as part of project HFAK (13N15630 [29] Note that in a field-free lab frame, only the total particle number N T ≡Nα +N β and the total spin Ŝ 2 are distinguishable quantum numbers. The high-spin projection particle number N∆ ≡ Nα − N β ∝Ŝz is only resolvable in presence of external fields. However, as Nα and N β are relatively easy to separately constrain, we will require that the entangler circuits respect all three quantum number symmetries N α , N β , and Ŝ 2 (and thereby also NT and N∆ ) throughout this manuscript. This section discusses some of the technical hurdles encountered in developing universal gate fabrics for certain subgroups of SU(2 N ), using well-known literature results for restriction to real operators SO(2 N ) and further restriction to Hamming-weight-preserving operators H(2 N ).
The imposition of specific symmetries which constrain the subgroup of SU(2 N ) may or may not present considerable difficulties in constructing gate fabrics of the type defined above. For an example that does not introduce significant difficulty, consider the case where we restrict the Hilbert space operators to have real value, i.e., a restriction to SO(2 N ). In this case, one may simply substitute SU(4) → SO(4) in the gate fabric of Figure 1 to construct the desired gate fabric sketched in Figure 4 for SO(2 N ).
For an example that does introduce significant difficulty, consider the case where we restrict SO(2 N ) Hilbert space operators to preserve Hamming weight, i.e., to respect the commutation constraint [Û ,P ] = 0 wherê P := p (Î −Ẑ p )/2 is the Hamming weight or "particle counting" operator. We denote this subgroup as H(2 N ). Here, we might be tempted to continue restricting the 2-qubit SO(4) gates to preserve Hamming weight, mandating that we substitute SO(4) → H(4), where H(4) implements a Givens rotation between configurations |01 and |10 while acting as the identity in |00 and |11 . This is sketched in Figure 5. However, a tessellation of 2-qubit Givens gates is not a gate fabric for H(2 N ), as it does not provide universality for this subgroup. In fact, it can be shown that a tessellation of Givens gates amounts to a one-particle rotation of the qubit creation and annihilation operatorsp ± := q V qpq ± for V qp ∈ SO(N ) andq ± := (X q ∓ iŶ q )/2, and thus after exactly N layers and N (N − 1)/2 gates the part of Hilbert space reachable with the fabric does no longer increase anymore, and in fact the fabric is classically simulable in polynomial time via techniques such as the match gate formalism or direct implementation with classical photons and beamsplitters. Note that H(2 N ) has irreducible representations of dimension up to N N/2 , so failure to reach universality can be shown by simply parameter counting. Speaking more practically, this proposed gate fabric has very limited expressive power for most irreps of H(2 N ), and does not provide a good approximation to most desired actions within this space.
In fact, no gate fabric for H(2 N ) is possible with 2qubit gate elements. One possible construction of a gate fabric for H(2 N ) with 3-qubit gate elements is sketched in Figure 6. Note that this might not be a minimal representation -we will see examples of fermionic systems shortly where a much simpler gate element than the fully explicitly universal k-minimal qubit gate provides a gate fabric.

Appendix B: Additional Details on the Jordan-Wigner Mapping
This section is included to enumerate the expansion of same-spin number operators and theŜ 2 operator into Pauli operators in the Jordan-Wigner mapping defined in the main text.

a. Same-Spin Occupation Number Operators
The same-spin occupation number operators are whereby p + p − is a "particle occupation number operator" (counts 1s). p − p + is a "hole occupation number operator" (counts 0s). Note that the Jordan-Wigner strings cancel for these operators.

b. Same-Spin Substitution Operators
The same-spin one-particle substitution operator is Here Z ↔ p+1,q−1 := r=q−1 r=p+1Ẑ r . For completeness Technically, p + q − is the "one-particle substitution operator" and p − q + is the "one-hole substitution operator." With some algebra, one can show that, (The formula for p > q is the same except for the indices on Z ↔ p+1,q−1 .) Here, the Jordan-Wigner strings only cancel partially. However, the α-then-β Jordan-Wigner ordering does provide the advantage that the remaining Z ↔ p,q strings are supported only on the intermediate α(β) spin orbital indices for α(β) substitution operators.

a. Alpha Number Operator
The α number operator iŝ The eigenvalues of the α number operator are N α ∈ [0, 1, . . . M ] with degeneracy M Nα 2 M . The determinants are eigenfunctions of the α number operator, with eigenvalues given by the α population count, and popcount( I α ) counts the number of ones in I α .

b. Beta Number Operator
The β number operator is, The eigenvalues of the β number operator are N β ∈ [0, 1, . . . M ] with degeneracy M N β 2 M . The dets are eigenfunctions of the β number operator, with eigenvalues given by the β population count,

c. Total Spin Squared Operator
The total spin squared operator is, with the spin lowering operator, the spin raising operator, and the z-spin,Ŝ After some algebra, under the chosen Jordan Wigner mapping, this resolves toŜ The eigenvalues of theŜ 2 operator can be written as S/2(S/2 + 1) with S ∈ [0, 1, 2, . . .] (singlet, doublet, triplet, etc). This section explicitly enumerates, for the special case of M = 2, the characteristics of the Jordan-Wigner computational basis functions (representing Fock space Slater determinants), S 2 -pure configuration state function (CSF) linear combinations thereof, and arbitrary linear combinations thereof which we will refer to as full configuration interaction (FCI) states. This section is useful to develop the beginnings of a picture for the arbitrary M -spatial-orbital Fock space, and also directly leads to the FCI gate fabric of the next full section. Table I enumerates the Slater determinants in the M = 2 case along with their corresponding Jordan-Wigner computational basis states, provides the N α and N β eigenvalues of each determinant (all determinants are proper eigenstates of the particle number operators), identifies which determinants are configuration state functions (CSFs) (only some determinants are also CSFs), and if a CSF, provides the S eigenvalue of the determinant/CSF.

Quantum Number Operators
TheN α andN β operators are diagonal (as is always true in the Jordan-Wigner representation), and their diagonal values are depicted in the fourth and fifth columns of Table I, respectively. TheŜ 2 operator is not diagonal (as is generally true in the Jordan-Winger representation). Instead, only 14 out of the 16 rows/columns of this operator are diagonal, and their diagonal entries are given in the seventh column of Table I. The non-diagonal contributions arise from the non-CSF determinants

Configuration State Functions (CSFs)
Configuration state functions (CSFs) are defined as sparse linear combinations of Slater determinants that I: Enumeration of characteristics of M = 2 Fock space in Jordan-Wigner representation. First column: Base-2 qubit occupation string (i.e., qubit computational basis state). Second column: Base-10 qubit occupation string (i.e., base-10 index for vector and matrix quantities). Third column: Slater determinantal configuration represented by this qubit computational basis state. Fourth column: number of α electrons in this configuration (always a proper eigenstate). Fifth column: number of β electrons in this configuration (always a proper eigenstate). Sixth column: is this configuration a valid configuration state function (CSF), i.e., a proper eigenstate of S 2 ? Seventh column: if yes to the previous question, S eigenvalue for this simultaneous Slater determinant/CSF (S = 0singlet, S = 1 -doublet, S = 2 -triplet, . . . ).
Base 2 Base 10 Determinant Nα N β Is CSF? S provide proper eigenstates ofŜ 2 . The 14 spin-pure Slater determinants discussed above are also CSFs, with corresponding eigenvalues S. In the seniority-2 coupling set of the non-spin-pure Slater determinants |#6 and |#9 , the eigenvectors of theŜ 2 operator are, and the corresponding eigenvalues are, E.g., the + combination yields an S = 0 singlet CSF, while the − combination yields an S = 2 triplet CSF. Thus the symmetry-adapted configuration state func-tions (CSFs) for this seniority coupling set are, and, Therefore, we have a complete real, orthonormal set of 16 CSFs for F(2 2 * 2 ): 5 singlets, 8 doublets, and 3 triplets. These CSFs are proper eigenfunctions ofN α ,N β , and S 2 .

Quantum Number Irreps
Valid solutions to the time-dependent or timeindependent Schrödinger equation for spin-1/2 fermions governed by spin-free Hamiltonian operators must be definite simultaneous eigenstates of the quantum number operators (N α ,N β ,Ŝ 2 ) with definite target eigenvalues (N α , N β , S). We refer to the set of valid simultaneous eigenstates for a given set of target quantum numbers (N α , N β , S) as a quantum number irrep. Table II enumerates the dimensionality and our particular convention for the CSF basis for each definite (N α , N β , S) irrep of the M = 2 Fock space. An arbitrary special orthogonal rotation within each irrep would also provide a faithful representation of the basis for that irrep.

Full Configuration Interaction (FCI) States
The restriction of physically valid solutions of the timedependent or time-independent Schrödinger equation to a given target quantum number irrep severely constrains, but does not exactly determine the valid solution for most irreps. For instance, in the (N α = 1, N β = 0, S = 1) irrep, the 15-parameter generic solution, is invalid because it does not respect the quantum number symmetries, but the 1-parameter solution, is valid due to the fact that the dimension of the target irrep is D = 2 > 1.
We generically refer to states which exactly lie within a given target quantum number irrep, but where the remaining flexibility in the state is determined by solving an auxiliary equation such as the time-dependent or timeindependent Schrödinger equation, as "full configuration interaction" (FCI) states. The motivation for this naming is the set of states that emerge from exactly diagonalizing the spin-free electronic Hamiltonian within a given quantum number irrep, i.e., the classical FCI method, though the usage within this work should be understood to be generalized to solving any linear auxiliary equation governed by a spin-free operator which is simultaneously diagonalized by the three quantum number operators.
The question that arises at this point is how to construct special orthogonal operators that respect the quantum number symmetry but have the power to move from an arbitrary quantum-number-pure trial state to an FCI state within the same quantum number irrep. The simple answer is to construct complete special orthogonal operators acting on the CSF basis of each irrep, with the property that these operators commute with all three quantum number operators. This leads to the construction of 4× 1-parameter SO(2) operators (simple Givens rotation matrices) acting within the 4× S = 1 doublet irreps, and 1× 3-parameter SO(3) operator acting within the (N α = 1, N β = 1, S = 0) irrep. This seems to imply that the parameter dimension of F(2 2 * 2 ) is 7. However, further analysis reveals that to preservê S 2 symmetry, the same operator must be applied in the (N α = 1, N β = 0, S = 1) and (N α = 0, N β = 1, S = 1) irreps and that the same operator must be applied in the (N α = 1, N β = 2, S = 1) and (N α = 2, N β = 1, S = 1) irreps. This is related to the fact that the spin-free Schrödinger equation is invariant under permutation of the α and β labels in the working equations. This reduces the total number of parameter of F(2 2 * 2 ) to 5, and yields the highly structured special orthogonal operators that will be encountered as M = 2 FCI gate operators in the next section. An early iteration of the gate fabric described in the main text was developed by constructing a gate fabric comprising a 5-parameter 4-qubitF gate universal for M = 2 FCI as detailed in Figure 7. A fabric of thesê F gates was found to exactly preserve quantum number symmetry, to provide universality for F(2 2M ) for sufficient parameter depth, and to yield an expressive approximate representation at intermediate depths. The representation power and numerical convergence was found to be similar betweenF gate fabrics and theQ gate fabrics, and the latter is conceptually simpler, so we have elected to focus on the latter in the main text. In the following we describe this alternative gate fabric and additional variants and refer to and use the concepts and notation introduced in Appendices B and C.

Decomposition ofF into Simple Gate Elements
There are many different possible implementations of F into products of simpler (e.g., 1-parameter) gate elements. However, the block diagonal nature and configuration constituency ofF suggests the following pragmatic choice, leading to a decomposition with a simple decomposition all the way down to a standard two-qubit gate library. The text refers to theF gate matrix in the second line of Figure 7: The red block (matrix entries c 1p := cos(θ 1p /2) and s 1p := sin(θ 1p /2)) corresponds to a Givens rotation between the one particle (N α = 1, N β = 0, S = 1) CSFs |0001 =: • = |#1 and |0100 =: • = |#4 and the same Givens rotation between the (N α = 0, N β = 1, S = 1) CSFs |#2 and |#8 (to preserveŜ 2 symmetry). We call this operation the QNP 1p gate (quantum-numberpreserving 1-particle gate).
2-qubit gate operations is provided in Appendix E.

Simplifications of theF Gate Fabric
A natural question at this point is whether there exist gate fabrics for F(2 2M ) which are simpler than theFgate fabric described above. E.g., a simpler gate fabric might have fewer parameters per gate element, and/or fewer QNP product gates per gate element, while still preserving quantum number symmetry and numerical efficiency. For one explicit example, it is clear that the 3× QNP product gates in (N α = 1, N β = 1, S = 0) irrep are redundant, as QNP PX and QNP PBU (or QNP PBL ) are sufficient to attain any desired action in the irrep. Further, repeated application of pairs of QNP PX and QNP PBU (or QNP PBL ) gates is sufficient to attain any desired operator in the irrep. So we can already reduce from a 5-parameterF gate fabric to a 4-parameter mod-ifiedF gate fabric.
Next, we can consider the 1-hole and 1-particle spaces. Depending on the target irrep, only one of these rotations is generally needed, e.g., for most irreps, a gate fabric of QNP 1p , QNP PX , and, QNP PBU is universal. So, for most irreps, a 3-parameter modifiedF gate fabric is sufficient. A technical detail here is that the choice of QNP 1p vs. QNP 1h required for universality is contingent on whether there are more particles or holes in the desired irrep of F(2 M ) for extreme edge case irreps.
As we show in the main text, it is possible to reduceF even further to a 2-parameterQ gate fabric, where theQ fabric symmetrizes the rotations between the 1-particle and 1-hole irreps, and additional mixes rotations between the 1-particle/hole irreps and the (N α = 1, N β = 1, S = 0) irrep. To that end we consider an alternative quantum number preserving gate which is already well-known in the literature, the spatial orbital rotation gate, which we describe in the following section.

Orbital Rotations
A well-known operation in both classical and quantum electronic structure methods is the spin-adapted spatial orbital rotation gate, which implements, for V pq ∈ SO(M ), and for the particular case of M = 2 adjacent spatial orbitals. If we take V pq to be a 2 × 2 special orthogonal matrix, i.e., a Givens rotation matrix with parameter ϕ, then this 1-parameter, 4-qubit QNP OR gate (qantum-number-preserving orbital rotation gate) is a special case of the 5-parameter, 4-qubit F gate from Figure 7 with, c := c p1 = c h1 = cos(θ/2), s := s p1 = s h1 = sin(θ/2), and c 2 cs cs +s 2 −cs c 2 −s 2 cs −cs −s 2 c 2 cs +s 2 −cs −cs c 2 This gate can be viewed as a simultaneous and symmetrical application of the QNP 1p and QNP 1h gates which also acts in a direct product manner in the (N α = 1, N β = 1, S = 0) irrep. The explicit action of the QNP OR gate is depicted in Figure 8.
It is well-known that a fabric of M 2 QNP OR gates arranged in a rectangular or triangular gate fabric pattern can exactly implement an arbitrary orbital rotation within M spatial orbitals, with a classically tractable relationship between the V pq orbital rotation matrix and the parameters {φ d } of the fabric being possible through the QR decomposition of the orbital rotation matrix.

Variants of the QNP fabrics
As with the gate fabric in the main text it can be interesting to prepend the parametrized gate elementsF orF with a fixed gate like theΠ gate as this may improve trainability of the fabric or even expressiveness at intermediate depths. In the main text we have explored the optionsΠ ∈ {Î, QNP OR (π)}. Another natural option, inspired by the concept of fermionic swap networks, would be to takeΠ to be an orbital wise fermionic swap gate. This gate is also quantum number preserving and we introduce it in the end of the following section.
c := cos(ϕ), s := sin(ϕ) FIG. 8: Spin-adapted spatial orbital rotation gate between two adjacent spatial orbitals. The parameter ϕ is the argument of the Givens rotation between orbitals |φ0 and |φ1 , with the same Givens rotation applied in the α and β spaces.
Due to cancellations when expanding out the controlled Y rotations, a decomposition in terms of only standard gates has only slightly higher depth (16 → 18) and requires less two-qubit gates (18 → 14) even if the controlled Y rotation is a native operation:

Two orbital Givens rotation gate QNPOR
We describe the construction of the Givens rotation gate in more detail. A Givens rotation is generally a rotation in a two dimensional subspace of the form where c := cos(θ/2) and s := sin(θ/2) for a continuous parameter θ. Under the Jordan-Wigner mapping a Givens rotation between the orbital bases can be implemented as as pair of parallel Givens gates as follows: In the two qubit space, the Givens rotation gate G(θ) has the action and can be decomposed into elementary gates as follows: In the 4-qubit Hilbert space, the two orbital Givens rotation gate QNP OR (θ) has the action When applied to two neighboring spatial orbitals, this gate also preserves all three quantum numbers and has the following decomposition with gate depth 5 and just 4 CNOT gates: An alternative decomposition into controlled Y rotation gates is: Note that if these gates are native, the 2-qubit gate count is raised to 6 while reducing the depth to 3. Expanding out the controlled Y rotations yields: Of course the two Givens rotations of the gate commute and can be performed at the same time, giving a gate depth of 6 and a CNOT count of 8. Depending on the preceding and following gates, it may furthermore be favourable to substitute the doubled CNOT gates with a single CNOT and a SWAP gate.

Single particle and single hole gates
In the following we abbreviate R := RY (θ/8). The QNP A1B0 (θ) gate can be decomposed as follows: The QNP A0B1 (θ) gate can be decomposed as follows: The QNP A2B1 (θ) gate can be decomposed as follows: Finally, the QNP A1B2 (θ) gate can be decomposed as follows:

Pair breaking gates
For the pair breaking gates we present decompositions into standard gates and controlled Y rotations. The pair break low gate QNP PBL has the decomposion: While the Pair break up gate QNP PBU has the decomposition:

Fermionic orbital swap gate
Finally, the OFSWAP gate is an orbital wise fermionic swap gate, i.e., two fermionic swap gates (a SWAP gate followed by a controlled Z gate) with action where the −1 in the lower right corner takes into account the sign of the fermionic anti-communtation relations, applied between the alpha and beta wires respectively. Curiously OFSWAP is only up to phases representable by an orbital rotation as we have OFSWAPẐ 0Ẑ0 = QNP 2OGR (π).

Shift tuning
We briefly recap the derivation of the standard parameter-shift rule without fixing the shift angle, leading to a free parameter in the rule.
Consider a parametrized gate of the form where P 2 = 1, as is the case for example for Pauli rotation gates. In a circuit with an arbitrary number of parameters, let's single out the parameter of the gate U above and write our cost function of interest as where the part of the circuit preparing |ψ (θ) from some initial state applied before the gate U has been absorbed into |φ and the part after U is absorbed in to B. Then the derivative is, by the product rule, given by Now look at the conjugation of B by U at an arbitrary shift angle ±α: Subtracting U(−α)(B) from U(α)(B) and excluding multiples of π as values for α, we obtain the generalized twoterm parameter-shift rule where the original parameter-shift rule corresponds to choosing α = π/2. We note that the concept of shifttuning was independently discovered in [37] and introduced in the quantum computing software package Pen-nyLane [35].

a. Reducing the gate count
In particular, the general form of Eq. (F5) allows usprovided that θ is not a multiple of π -to choose α = −θ, making the first of the cost function evaluations f (0) and therefore reducing the gate count because U (0) = 1 can be skipped in the circuit. This may lead to an additional gate count reduction if the neighbouring gates on both sides of U can be merged, which is true for example in circuits for the Quantum Approximate Optimization Algorithm (QAOA).

Four-term parameter-shift rule
Here we derive a four-term parameter-shift rule for gates that do not fulfil the two-term rule, e.g. controlled rotation gates like CR Z (θ) or many of our QNP gates with one parameter.
To this end, consider a gate with Q 3 = Q but not necessarily Q 2 = 1, as is true for any gate with spectrum {−1, 0, 1}. Then the exponential series can be rewritten as and a computation similar to the one above leads to We can then obtain the commutator by linearly combining this difference with itself for a second angle ±β, so that which holds true if the angles α, β and the prefactors d 1,2 satisfy Therefore, we get the four-term parameter-shift rule where we again can choose α or β such that one of the function evaluations skips the gate U . A particularly symmetric solution of Eqns. (F12) and (F13) is In general, any gate for which the spectrum of the generator is {−a+c, c, a+c} obeys the four-term parametershift rule as the shift c can be absorbed into a global phase that does not contribute to the gradient and a can be absorbed into the variational parameter of the gate.
As an example, the four-term rule is applicable to (multi-)controlled Pauli rotations CR P (ϕ) for which Q is the zero matrix except for the Pauli operator P on the target qubit. For multiple control qubits and our QNP gates, this will lead to less circuit evaluations using the chain rule and applying the two-term rule to the gate decomposition.
In order to find out whether a n-qubit single-parameter gate U satisfies the four-term rule, one can compute and test if there is an a ∈ R such that Q 3 = a 2 Q, which is a sufficient condition, as the only thing we needed for the four term rule to apply was this assumptions about the generator spectrum.
a. Relation to other four-term rule Previous work showed the existence of a four-term parameter-shift rule [36] for gates of the form (F7), which is implemented with only one shift angle but requires the two additional gates There are four relevant aspects when comparing this rule to the one in (F14): First, our four-term rule does not require any additional gates like V ± , which add overhead to the gradient evaluation circuits. While the authors bound the additional cost by the cost of the differentiated gate itself, it might more crucially be non-trivial to construct V ± for gates that do not have an obvious fermionic representation like the gates considered in [36]. Second, the shift tuning technique for gate count reduction in F 1 can easily be extended to both, our fourterm rule the rule derived in [36], provided one has access to the parametrized versions of V ± . As the construction of V ± for fermion-based gates is based on rotations, this access can be assumed for these gates whenever V ± can be implemented.
Third, it was shown in [36] that their four-term rule reduces to a standard two-term rule up to the insertions of the V ± operators whenever both the circuit of interest and the measured observables are purely real-valued. This is the case for virtually all molecular Hamiltonians and most of the circuits proposed for quantum chemistry problems -including the fabrics in this work -such that gradients of highly complex gates may be computed with just two circuit executions including the gates V ± using the rule in [36].
Fourth, the variances of the derivative estimators given by the two rules can be minimized to the same value by choosing the shift angles optimally, as shown in Sec. F 3. This means that for a given budget of circuit executions, the quality of the estimated derivative is the same, even though the number of distinct circuits differs.
In summary, the specialized two-term parameter-shift rule in [36] is preferable if the following three criteria hold: Firstly, the circuit and observable need to be realvalued. Secondly, the auxiliary gates V ± have to be available. Thirdly, the computation must happen on a simulation level in which the number of distinct circuits instead of the measurement budget is relevant, so that the reduction from four to two terms provides an advantage which is larger than the overhead of adding V ± . In all other scenarios the four-term rule Eqn. (F14) with the optimal parameters in Eqns. (F24-F25) requires slightly fewer gates and the same number of circuit executions, making it preferable in particular on quantum computers.
b. Impossibility of some further shift rules One may wonder whether a three shift rule is possible for gates whose generators have just three distinct eigenvalues and whether shift rules exist for gates with more distinct eigenvalues. We present some insights on these questions in the following.
During the derivation of the four-term parameter-shift rule we chose to first linearly combine U(±α)(B) and U(±β)(B) with the same prefactors, respectively. Alternatively one may try to combine U(α i )(B) at three shift angles {α i } i∈{1,2,3} linearly and demand the result to fulfil This leads to the system of equations with c i = cos αi 2 and s i = sin αi 2 , which we conjecture to not have a solution.
Considering the generalization of the (standard) twoterm shift rule to the four-term rule in (F14) and their requirement on the gate generator, i.e. Q 2 = 1 and Q 3 = Q, it seems a natural question whether further generalization is possible to gates that, e.g., fulfil Q 5 = Q. We show next that this is not the case.
Consider the generalized condition Q m = Q n , m = n for the generator of a d-dimensional one-parameter gate. We recall that we may absorb shifts and scaling prefactors of the spectrum of Q into a global phase gate and the variational parameter, respectively, which may be used to obtain gates satisfying the generalized condition Q m = Q n . In the eigenbasis of the Hermitian matrix Q, this condition becomes λ m i = λ n i ∀1 ≤ i ≤ d, which only ever is solved by −1, 0 and 1 over R (in which the spectrum of Q must be contained) with the additional condition m − n mod 2 = 0 for λ i = −1. This means that Q already satisfies Q 3 = Q, allowing for the fourterm rule to be applied.
Consequently, a direct generalization of the four-term rule is not possible. Note that this does not exclude the existence of other schemes to compute the derivative of an expectation value w.r.t. parametrized states that are based on linear combinations of shifted expectation values.

Minimizing the variance
If we approximate the physical variance of the expectation value, V , to be independent of θ, the variance of measuring f at a given parameter for sufficiently many measurements N is V /N . The resulting variance of the two-term shift rule derivative for a budget of N measurements is where we chose the optimal allocation of N/2 measurements to each of the two terms in the shift rule. We may optimize the shift angle in the two-term rule w.r.t. this variance which yields the standard choice π/2 for the shift because The variance can be reduced further by introducing a multiplicative bias to the estimator, as presented in [37]; The optimal choice of the prefactor depends on the value and the variance of the derivative and is given by (F20) Note that λ * has to be estimated because V and ∂ θ f are not known exactly. The optimal choice of the shift parameter remains π 2 .
For the four-term rule in Eqn. (F14), the optimal shot allocation is proportional to the prefactors d 1,2 and leads to the variance  Figure 3b for a VQE with 110 parameters (orange curve) and the blue shaded FCI probabilities at a vertical cut off of 10 −7 with consistent ordering between the FCI and VQE computational basis states. The numbers above the columns indicate the seniority of the computational basis state at the respective index, e.g. the first column at index 0 is the Hartree Fock determinant with seniority 0.

Numerical universality demonstration for Haar random states
The test cases in real molecular systems in the main text are somewhat complicated by the specifics of the electronic structure Hamiltonian and especially by the spatial point group symmetry of the test molecules. One notable artifact is that some of the left-most gates in our gate fabrics in real molecules are "dead," as they perform orbital rotations and diagonal pair exchanges in the occupied or virtual subspaces of the Hartree-Fock starting state. The point group symmetry also seems to adversely affect the numerical convergence behavior of the VQE gate fabric parameter optimization, e.g., suggesting theΠ-gate pre-mixing initialization adopted in the main text. Noticeably better convergence behavior was observed when the molecules were perturbed from D 2h symmetry to C 1 by random Gaussian perturbations in XYZ coordinates.
This section is included to demonstrate the numerical universality properties of our proposed gate fabric for the artificial case of Haar random statevectors. Specifically, for a number of test case irreps (M, N α , N β , S), we form the full CSF basis, and then generate Haar random statevectors |A and |B within this irrep of F(2 2M ) by Gaussian random sampling and normalization of the statevector in the CSF basis, and then backtransformation to the standard Jordan-Wigner computational basis. We then optimize the VQE gate fabric parameters of the VQE entangler circuit U to maximize | A|Û |B | 2 via L-BFGS with noise-free analytical gradients. Note that we do not performΠ-based pre-mixing convergence enhancement in this section.
The results are shown for the half-filled cases for M = 4 and M = 6 in Figure 10. The top panels show the bulk convergence properties with respect to circuit depth D / number of parameters N parameter (roughly linearly proportional). The general finding here is roughly geometric convergence at low parameter depths, followed by a sharp drop to near the machine epsilon as the number of parameters crosses over the number of CSFs in the irrep, indicating the onset of universality in the action of the VQE entangler circuit. Quantum number symmetries are preserved to at least the machine epsilon for all intermediate and final parameter values. The lower panels show the numerical convergence behavior of the L-BFGS optimization procedure for each point in the top panel. There are several salient features in these plots: (1) The earliest convergence behavior appears to be roughly geometric, and self-similar between gate fabrics with different numbers of parameters (2) fabrics with smaller numbers of parameters deviate earlier from this geometric convergence and eventually "flatline" at their non-universal terminal values (3) some minor plateaus are observed in the convergence behavior for small numbers of parameters (4) there is a distinct phase change as universality is crossed, with circuits with larger numbers of parameters than needed for universality exhibiting strongly geometric convergence behavior all the way to the machine epsilon.
Such tests are assuredly artificial, but are free from the external artifacts present in the molecular test cases, and serve to more-strongly indicate that the gate fabric developed in Figure 2 are universal and quantum-numbersymmetry-preserving for F(2 2M ).  Figure 2 for Haar random states |A and |B within F(2 2M ). The VQE gate fabric parameters were optimized via L-BFGS with noise-free analytical gradients to maximize | A|Û |B | 2 , whereÛ is the VQE entangler circuit operator. Top row: Convergence of overlap | A|Û |B | with respect to respect to gate fabric depth D / number of parameters Nparameter (roughly linearly proportional). Bottom row: Convergence behavior of L-BFGS optimization procedure for each gate fabric depth D / number of parameters Nparameter (roughly linearly proportional). Left column: Results for (M = 4, Nα = 2, N β = 2, S = 0) 8-qubit example. Right column: Results for (M = 6, Nα = 3, N β = 3, S = 0) 12-qubit example. Each colormapped line in the lower panel corresponds to the L-BFGS numerical convergence behavior of a single point in the upper panel. Results are single random instances within each test case, and are wholly representative of generic random instances and test cases in other quantum number irreps.

Non-universal edge cases
It is important to note that while theQ-type QNP gate fabrics of main text are numerically universal for the vast majority of quantum number irreps in the "bulk" of the Hilbert space, there are a limited number of edge cases for which these gate fabrics are not universal. These cases constitute systems where, after high-spin constraints are accounted for, there are only holes or particles left in the remaining orbitals. In these cases, the QNP PX gates have trivial action in the wholly hole or particle space, and are unable to explore new configurations within the space. More tangibly, for an irrep with dimensions (M, N α , N β , S), we first compute the "unconstrained" irrep (M − S, N α , N β , 0) where N α + N β + S = N α + N β and the larger of N α := N α or N β := N β is decremented first until N α = N β , and then both N α and N β are decremented together (in this line, := is read as "initialized to"). The resulting unconstrained irrep will always have N α = N β . If the unconstrained irrep is all holes (N α = N β = 0) or all particles (N α = N β = M − S), then theQ-type QNP gate fabric is not universal. A trivial exception is if only a single orbital with all holes or all particles remains in the unconstrained irrep, in which case universality is still preserved.
Note that the number of irreps in the Hilbert space grows roughly as O(M 3 ), while the required constraints N α = N β = 0 or M − S seem to indicate that the number of non-universal irreps indicate that the number of irreps which are not universal withQ-type QNP gate fabrics will grow as roughly O(M ). Moreover, the non-universal irreps appear at the "edge" of the Hilbert space, and consist of cases with severe high-spin constraints which are likely to be either polynomially tractable classically, physically uninteresting, or both. Interesting cases with roughly half-and-half filling of holes and particles and moderately low total spin number will almost surely fall into irreps which are universal withQ-type QNP gate fabrics. Finally, it is worth noting that any issues with these edge cases can be completely obviated by instead working with the 5-parameterF gate fabrics discussed in Section D -these do not appear to exhibit any edge case non-universalities, and are numerically universal for all cases we have tested.
Tables III and IV show explicitly the irreps for M = 4 and M = 6 that were found numerically to be non-universal withQ-type QNP gate fabrics via numerical studies of the same type as the previous section. The non-universality behavior was immediately apparent as discrepancies of overlap of 1 − | A|Û |B | 2 of order of 10 −2 , while the universal irreps exhibited maximum discrepancies of overlap of order of < 10 −13 . TABLE III: Quantum number irreps for M = 4 for which theQ-type QNP gates of the main text are not universal. Overall there are 35 unique irreps for M = 4 with total dimension D ≡ 2 2M = 256. The 6 irreps with total dimension 36 listed below are not universal due to high-spin constraints. All other irreps are numerically found to be universal to the essentially machine precision.
Nα N β S Dimension 0 2 2 6 1 1 2 6 2 0 2 6 2 4 2 6 3 3 2 6 4 2 2 6 TABLE IV: Quantum number irreps for M = 6 for which theQ-type QNP gates of the main text are not universal. Overall there are 84 unique irreps for M = 6 with total dimension D ≡ 2 2M = 4096. The 24 irreps with total dimension 400 listed below are not universal due to high-spin constraints. All other irreps are numerically found to be universal to the essentially machine precision.