Reference-State Error Mitigation: A Strategy for High Accuracy Quantum Computation of Chemistry

Decoherence and gate errors severely limit the capabilities of state-of-the-art quantum computers. This work introduces a strategy for reference-state error mitigation (REM) of quantum chemistry that can be straightforwardly implemented on current and near-term devices. REM can be applied alongside existing mitigation procedures, while requiring minimal postprocessing and only one or no additional measurements. The approach is agnostic to the underlying quantum mechanical ansatz and is designed for the variational quantum eigensolver. Up to two orders-of-magnitude improvement in the computational accuracy of ground state energies of small molecules (H2, HeH+, and LiH) is demonstrated on superconducting quantum hardware. Simulations of noisy circuits with a depth exceeding 1000 two-qubit gates are used to demonstrate the scalability of the method.

In this study, we report on a chemistry-inspired error-mitigation strategy that can be combined with any variant of the variational quantum eigensolver 8,23 (VQE).Our approach, reference-state error mitigation (REM), relies on post-processing that can be readily performed on a classical computer.The method is applicable across a wide range of noise intensities and is low-cost in that it requires an overhead of at most one additional VQE energy evaluation.REM can readily be employed together with other error mitigation methods and throughout this work we additionally use readout mitigation, which corrects for hardwarespecific non-ideal correlation between prepared states and measured states. 16

Reference-state Error Mitigation (REM)
The goal of the VQE algorithm is to minimize the electronic energy with respect to a set of quantum circuit parameters, i.e., +Ψ'( ⃗ *-./ -Ψ'( ⃗ *0, (1)   where ( ⃗ = [( ' , ( ( , … ( ) ] and ./ is the molecular Hamiltonian.The VQE energy, !!"# (( ⃗ ) , can therefore be thought of as living on an 8-dimensional surface in parameter space.A one-dimensional representation of such a surface is shown in orange at the top of Figure 1.The ! !"# (( ⃗ ) surface is associated with some degree of systematic and random noise that can only be partially removed by readout mitigation.Our theoretical discussion of errors below assumes that we measure the expectation value given by equation ( 1) exactly, i.e., assuming no random error in the evaluation.In practice we are limited to a finite number of measurements (samples), resulting in a spread in the measured energy values around the exact expectation value.
Figure 1: (a) One-dimensional representation of the electronic energy E as a function of quantum circuit parameters ( ⃗ .The REM approach can be explained as a four-step process: (1) A computationally tractable reference solution (such as Hartree-Fock) is computed on a classical computer.(2) A quantum measurement of the VQE surface (orange) is made using parameters that exactly correspond to the reference solution.(3) The difference in energy calculated for the reference solution on classical and quantum hardware defines an error, Δ! *#+ .(4) The error estimate is assumed to be systematic and is used to correct the VQE surface in the proximity of the reference coordinate.The resulting REM corrected surface and the exact (noise-free) solution are represented by solid green and dashed red lines, respectively.Blue dots and diamonds represent the coordinates of the reference calculations and minima of the different energy landscapes, respectively (b) Insert illustrates the how the error after application of REM is a function of the parameter dependence of noise.
As the name suggests, the REM method rests on an appropriate choice of a reference wavefunction, or reference state.We recommend the reference state is a) chemically motivated, i.e., likely physically similar to the sought state, and b) fast (or at least viable) to evaluate using a classical computer.The Hartree-Fock state is a practical example of an often-suitable reference wavefunction that is based on a computationally efficient mean-field description of the electronic potential.We rely on Hartree-Fock in this work but note that other reference states can be used.Because an appropriate reference state usually also makes for a good initial guess for the VQE algorithm, it is often possible to perform REM without incurring any additional measurement cost.

Results and Discussion
To assess the reliability of REM we have implemented it for VQE computation of small molecules on two current NISQ devices, the Quito of IBMQ and the Särimner of Chalmers.Details of hardware, circuits and measurements are provided in the SI.Table 1 shows an amalgamation of our measurement results for the ground state energy of the hydrogen molecule (H2), helium hydride (HeH + ), and lithium hydride (LiH).The ansätze 3 used for the H2 and HeH + molecules are chemistry-inspired and based on unitary coupled cluster theory, whereas a hardware-efficient ansatz is used for LiH.A1).Our simulations of BeH2 contains 26 variational parameters (Table A1).S3 and S6).The examples in Table 1 are sorted by increasing circuit depth (details of which are provided in Table S1) which together are suggestive of both robustness and a scalability of the approach.The remaining error after mitigation is consistently on the order of millihartree, and the magnitude of the REM correction grows with the complexity of the quantum circuit (cf.H2 vs. BeH2 in Table 1).
We note that REM consistently provides energies that are lower than those of the reference Hartree-Fock energies in our calculations, even at relatively high noise levels (Figure 3).In principle, unsuitable choices of reference states combined with significant parameter dependence of noise, Δ! 3 '( ⃗ * ≉ 0, might result in over-correction of the measured VQE energy, taking the energy below the true minimum, as can be seen in the results for HeH + . The major challenge when running VQE calculations with deep circuits on NISQ hardware is converging the optimization.This challenge of VQE is one of classical optimization, mostly due to vanishing gradients with increasing noise in a large parameter space.Our method does not help in this task since it conserves the shape of the energy landscape.However, REM does (given a proper reference state) guarantee a more computationally accurate measurement of the total energy upon VQE convergence.
Table 1 demonstrates the effectiveness of REM when applied to molecules in their equilibrium geometry, for which the degree of electron correlation is relatively low and the Hartree-Fock reference is suitable.Figure 2 shows how the REM method performs far from equilibrium for the bond dissociation of H2 and HeH + .As expected, the effectiveness of our implementation of REM decreases somewhat in regions where the Hartree-Fock state offers a poor description, such as the stretched H2 bond.However, the method consistently provides a substantial improvement across the potential energy surface.A more suitable description of the partially broken bond of H2 should ideally account for the near degeneracy of multiple states, i.e., it would require a multi-reference wavefunction.The bond dissociation of HeH + proceeds to He and an isolated proton, a state well described by a single-reference Hartree-Fock description.Figure 2 also illustrates the effect of readout mitigation, 16 which we use per default in all measurements and that we recommend together with REM.Other mitigation strategies may, in principle, also be combined with REM.The robustness of REM was further evaluated by performing simulations of H2 and HeH + while varying the noise level.Noise was introduced in these simulations by modelling imperfect gate-fidelities as single (D) and two-qubit (E) depolarizing errors, connected through a linear relationship, D = 0.1 * E (see the SI).This kind of noise modelling enables straightforward comparison with error rates on physical quantum devices (Figure 3).REM is shown to be effective despite the steady increase in single-and twoqubit depolarizing error rates, as expected.The evaluation of computational accuracy in this work should not be confused with chemical accuracy. 24e here use the term computational accuracy specifically when comparing results of a quantum calculation with the exact solution at that same level of theory.Computational accuracy then, in the context of VQE calculations, refers to how exact a given VQE problem is solved with respect to the given Hamiltonian and ansatz.This accuracy can only be quantified so long as it is possible to solve the problem without noise, e.g., using conventional quantum chemistry (which we can still do; discussing the limits of conventional quantum chemistry methods is outside the scope of this work).Practical implementations of VQE on real hardware currently suffer from drastic deficiencies in level of theory, basis set size, 25 proper consideration of the physical environment (e.g., solvent effects), and dynamical effects.These limitations keep quantum computation (including our own) from accurately predicting real chemical processes.On the other hand, chemical accuracy is the correct term for what is required to make realistic predictions and is commonly defined as an error of 1 kcal/mol (~1.6 millihartree) from chemical experiments.No quantum computation has yet predicted molecular energies, reaction barrier heights, or the thermodynamics of a reaction to this accuracy.Despite this, there are several claims of "chemical accuracy" using NISQ devices. 8,13,26,27We encourage the community to use the appropriate terminology.The hunt for chemical accuracy in the NISQ era is far from over.

Conclusions
In this work we demonstrate an error mitigation strategy applicable to quantum chemical computations on NISQ devices.The REM method relies on determining the exact error in energy due to hardware and environmental noise for a reference wavefunction that can be feasibly evaluated on a classical computer.
The most important underlying assumption of REM is negligible dependence of noise on circuit parameters in the vicinity of this reference wavefunction.In this work Hartree-Fock references are used, which describes the physics of the molecular states well enough for REM to perform effectively.The REM method is shown to drastically improve the computational accuracy at which total energies of molecules can be computed using current quantum hardware.How significant an improvement REM provides depends on the noise level.In our experiments, the computation accuracy is improved by two orders of magnitude using REM.With increasing noise, the procedure will return an energy that is equal to or below the classically evaluated reference.However, in the presence of substantial quantum circuit parameter dependence of noise it cannot be ruled out that REM may underestimate the true energy.The nonvariational nature of error mitigation strategies remains a problem to solve.This mitigation strategy does not incur meaningful additional classical or quantum computational overhead and can be used to reduce errors on near-term devices by orders of magnitude when running VQE calculations.Because error rates vary both between NISQ devices and between circuits, it is not currently productive to rely on error cancellation, i.e., systematic errors inherent in quantum chemical levels of theory, when evaluating relative energies of chemical transformations.By enabling more precise evaluations of molecular total energies, REM moves us toward meaningful relative comparisons and toward chemical accuracy.

Notation
The following notation is used in the SI:

Chalmers Device Details
We executed the quantum algorithm at Chalmers on a superconducting three-qubit quantum processor named Särimner, of which we only use two qubits (Q0, Q1).This device is shown in Figure S1 and consists of three transmon qubits, 1 coupled using a single tunable coupler, C1.Single-qubit gates are implemented using on-chip drive lines to individually control each qubit with microwave pulses.Each qubit can be individually measured using its readout resonator with readout performed simultaneously using frequency multiplexed pulses on the common readout feedline.Two-qubit gates are activated via an AC flux-pulse applied to the coupler to modulate its frequency. 2The coupler is itself a frequency-tunable transmon qubit, however, it only serves to mediate the interaction between pairs of qubits and itself never enters the computational space during operation.A full list of parameters of the device can be found in Table S1.
The AC flux-pulse which activates the coupling takes the form Φ(,) = Φ : + Ω(,)cos (5 ; ,), where Φ : is the DC flux bias of the drive, Ω(,) is the envelope of the pulse which consists of a 25 ns cosine rise and fall and a 310 ns flat top, and 5 ; is the carrier frequency.The carrier frequency is chosen such that it drives a controlled-Z (CZ) transition between Q0 and Q1.This is on resonance with the transition |20⟩ ↔ |11⟩ and occurs at 5 ; = |5 < + < < − 5 = |.A full oscillation between these states brings about a conditional phase on the |11⟩ state which can be calibrated to implement a CZ gate.
The parametric gate is useful in this higher connectivity architecture as the interaction between pairs can be selectively chosen.Different gates are activated between qubits when the frequencies between transitions for pairs of qubits are off-resonant from one another as they are in this device.The third qubit can also be detuned such that all transitions are far off-resonant and there is no risk of a frequency collision when operating the device with just Q0 and Q1.

Readout Error Mitigation
Readout error mitigation is performed by constructing a calibration of the confusion matrix @.The entries of this matrix, @ -,> , are the probabilities of preparing the state |A⟩ and then measuring the state |B⟩, i.e., @ -,> = C( B | A ).
The matrix @ can then be used to correct a set of Pauli string measurements D ((⃗ = [C(0), … , C(B), … , C(H)] T by either multiplying by the inverse confusion matrix, @ ?= D ((⃗ = D ((⃗′, performing a least-squares fit to reconstruct the most likely outcome, or by a procedure known as 'Bayesian Unfolding'. 4 In this work, we mitigate our results by implementing a least-square fit to a quadratic cost function, where K(L ⃗) = (D ((⃗ − @L ⃗) @ with the constraint that the sum of the resulting vector must be 1 and each element itself is bounded in the interval [0,1].This avoids some issues resulting from matrix inversion arising from small offdiagonal elements, which can lead to unphysical results.The Sequential Least Squares Programming 5 (SLSQP) optimizer was used to find the L ⃗ that minimize the cost for each set of measured Pauli strings.The confusion matrix of the Chalmers Särimner device is reported in Figure S2.
Both options for readout mitigation are also available in Qiskit and can be straightforwardly implemented for calculations on real devices and simulations.

Hydrogen -H2 Ansatz, circuit and computation details
The H2 wavefunction can be represented with four qubits, where each qubit corresponds to one molecular spin orbital in minimal STO-3G basis.Parity mapping 6 was here used to further reduce the problem by two qubits, utilizing the particle and spin conservation.PySCF 7 was used to generate the initial Hartree-Fock state.The circuit ansatz was constructed using a parameterized wavefunction based on unitary coupled cluster theory 6 as implemented in Qiskit 0.21. 8Since single excitations do not contribute to the final ground state energy of H2, only double excitations were included in the ansatz which reduced the complexity further, to a single parameter, q.The resulting circuit, compiled to Qiskit's native U1, U2 and U3 gates, is depicted in Figure S2.
Figure S3: Quantum circuit for H2 implementing double excitation of the unitary coupled-cluster operator with respect to the Hartree-Fock reference state.The circuit is compiled to the native gates of the IBMQ-Quito device.
The circuit shown in in Figure S2 was then shortened further by removing a repeated entangling step.A single entangling step was found sufficient to explore the Hilbert space of the problem.The resulting circuit was transpiled to gates native to the Särimner device (Figure S4) and is similar to the one used by Kandala et.Al. 9 The Särimner gate set consists of the set of single-qubit gates, MN " (±P), N A (±P), N " Q± B @ R , N A Q± B @ R , N C (#)S, and two-qubit gate set {CZ}.Both circuits return the same energy up to the eighth decimal point of a hartree numerically, justifying our circuit design.Since the circuit contains only one parameter, it was varied in the interval [-p, p] to obtain energy as a function of the variational parameter at several geometries.The obtained energies were fit using the lmfit package 10 into a cosine function, A cos(# − U), where V and U are fit parameters, to give the minima at different geometries (Figure S5).All experimental calculations were run on the Chalmers Särimner device with 5000 shots.The details of the Hamiltonian operator to be minimized are given in Table S2 for different geometries of H2.The exact solutions for the given circuit and Hamiltonians were calculated using QuTiP.

Ansatz, circuit and computation details
Similar to H2, the HeH+ wavefunction can be represented with two qubits using parity mapping 6 in a minimal basis.PySCF 7 was used to generate the initial Hartree-Fock state and the circuit ansatz was constructed as a parameterized wavefunction based on unitary coupled cluster theory 6 as implemented in Qiskit 0.21. 8The circuit consists of three parameters -two single excitation parameters, q[0] and q [1], and a double excitation parameter, q [2].These parameters are optimized using the VQE algorithm implemented in Qiskit and run on the IBMQ-Quito device with 8192 shots.The details of the Hamiltonian operator to be minimized are given in Table S6 for different geometries of HeH + .
Figure S6: Quantum circuit for HeH + implementing the UCCSD operator with respect to the Hartree-Fock reference state.The circuit is compiled to the native gates of IBMQ-Quito device.

HeH + Hamiltonian
The details of the Hamiltonian operator to be minimized are given in Table S6 for different geometries of HeH + .

Lithium hydride -LiH Ansatz, circuit and computation details
A LiH wavefunction can be represented with twelve qubits where each qubit corresponds to one molecular spin orbital in a minimal STO-3G basis.Symmetries in parity mapping 6 was used to further reduce the problem size by two qubits to ten qubits.As shown by Kandala et al 9 , removing the orbitals that do not participate in bonding can bring down the problem size by four qubits.Coupled with the frozen core approximation, the final LiH circuit can be represented by just four qubits in and around equilibrium geometry.LiH Hamiltonian details at 1.5949 Å As shown in the previous section, LiH wavefunction can be represented with four qubits.Making similar approximations as shown by Kandala et al 9 , the BeH2 wavefunction can be represented by six qubits when using the frozen core approximation and orbital reduction.PySCF 7 was used to generate the initial Hartree-Fock state.Qiskit's UCCSD module is used to construct the problem ansatz.The circuits are too large to be represented here but the details of the circuit can be found in Table A1.A noise-model from IBMQ-Athens has been added to the simulations to replicate real-world noisy behavior.All simulations have been run using 20,000 shots and repeated 5 times to ensure a high number of samples.The sampling noise is expressed in terms of the standard deviations of our errors.Readout mitigation has been applied for all the simulations.

Depolarizing Noise Model
Depolarizing noise channels are a common way to model decoherence and gate errors in quantum devices. 13,14fter observing that our two-qubit (T) gate-error is approximately an order of magnitude larger than our singlequbit (S) gate error, a noise model was constructed that provides S as a linear function of T, != 0.1 * (.This noise model was used to evaluate the effect of REM, corresponding to a range of depolarizing errors.The depolarizing noise was modelled in Qiskit as a probability )′ of applying each of the Pauli gates +, ,, orafter running a single-qubit gate, and as probability )′′ of applying any combination of .! ⨂. " where .! , ." ∈ {3, +, ,, -} after running a two-qubit gate.

Figure 2 :
Figure 2: Top: Potential energy surfaces for the dissociation of H2 and HeH + .Exact noise-free solutions from state-vector simulations are represented by black lines.Regular VQE energies obtained using a quantum computer are shown as blue dots.Results following readout mitigation and REM are shown as orange crosses and green squares, respectively.The combination of both readout mitigation and REM is shown as red stars.Measurements for H2 and HeH + were performed on Chalmers Särimner and IBMQ-Quito, respectively.Bottom: error of the different approaches relative to the exact solution in the given minimal basis set.The gray region corresponds to an error of 1.6 millihartree (1 kcal/mol) with respect to the corresponding noise-free calculations.

Figure 3 :
Figure 3: Absolute errors as a function of increasing depolarizing errors for ground state calculations of H2 and HeH + .The two-qubit depolarizing error of the Chalmers Särimner device is indicated by a vertical line for reference.Bottom insert: Final energies following application of REM are largely independent of the noise level, and the computational accuracy is consistently close to or below 1.6 millihartree (1 kcal/mol) for these molecules.

Figure S1 :
Figure S1: Microscope image of the bonded 3-qubit Särimner device from Chalmers.Only Q0 and Q1 are used in this work.The coupler, C1, consists of a flux-tunable transmon qubit and serves to mediate interactions between pairs.Each qubit has individual readout and drive lines which are used to control the device.The readout lines are coupled to a shared feedline which is used for multiplexed readout.

Figure S2 .
Figure S2.Confusion matrix for the Chalmers Särimner device.The matrix consists of probabilities of measuring a state given a specific preparation.The confusion matrix is measured before each run of the VQE algorithm with 1000 shots and repeated 100 times to give an estimate for fluctuations in the readout.

Figure
Figure S4: A compact quantum circuit for H2 implementing double excitation of the coupled-cluster operator with respect to the Hartree-Fock reference state.The circuit is compiled to the native gates of Chalmers Särimner device.

Figure S5 :
Figure S5: Measurement results for a single sweep of # between −P and P. The same measurement results have been used with different electronic Hamiltonians to generate the energies using QuTiP 11 for the different geometries.A cosine function of the form, A cos(# − U), is used to generate the fits for the data.
PySCF 7 was used to generate the initial Hartree-Fock state.A hardware efficient ansatz, inspired by Qiskit's two-local circuit class is used to construct a compact circuit representing LiH.It consists of alternating rotating layers of entanglement layers and is chosen to utilize the connectivity of IBMQ-Quito's connectivity as shown in Figure S7.

Figure S7 :
Figure S7: Quantum circuit for LiH utilizing a hardware-efficient ansatz.
Table 1 also includes results of simulations of LiH and beryllium hydride (BeH2).The latter circuits are substantially larger than what is feasible on current devices as they would incur insurmountable errors due to noise.Combined, this test set ranges from a 2-qubit circuit with just one two-qubit gate for H2, to a 6-qubit circuit with 1096 two-qubit gates for BeH2 (Table

Table 1 :
Total ground state energies of molecules at experimental equilibrium distances, without and with the application of REM.Readout mitigation has been applied for all the VQE calculations.All energies are given in hartree.
a Calculations refer to experimental bond distances from the National Institute of Standards and Technology (NIST).b Run on Chalmers Särimner with 5000 samples.c Run on IBMQ-Quito with 8192 samples.d Simulated results using a noise model from IBMQ-Athens.

Table A1 :
Comparison of circuit complexities for H2, HeH + , LiH, and BeH2 listed in order of increasing complexity.Details are provided in the SI.

Table S1 :
Experimental parameters for the 3-qubit Chalmers device Särimner.We report our average gate error achieved through randomized benchmarking for single qubit gates, as our gate set differs from that of IBM, as well as interleaved randomized benchmarking for our 2-qubit CZ gate.The single-and two-qubit gate fidelities during the execution of the quantum algorithm were 99.95% and 98.2%.The device design, fabrication methods, measurement setup, gate implementation on hardware, and tune-up methods are described in detail elsewhere.3

Table S2 :
The electronic Hamiltonian for H2 expressed as weighted Pauli string operators after parity mapping for various geometry.

Table S3 :
Optimized parameters, before and after readout mitigation, that minimize the Hamiltonian operator on the Chalmer Särimner device for calculations of H2.

Table S5 :
Exact, regular VQE and mitigated electronic energies and errors of H2 at various geometries, !.All energies are in hartrees.

Table S6 :
The electronic Hamiltonian expressed as weighted Pauli string operators after parity mapping for various geometry of HeH + .

Table S7 :
The optimized parameter values that minimize the Hamiltonian operator on the IBMQ-Quito device for various geometry of HeH + .The angles are in radians.

Table S9 :
The exact, uncorrected and mitigated electronic energies and errors of HeH + at various geometries, !.All energies are in hartrees.

Table S10 :
The electronic Hamiltonian of LiH can be expressed as weighted Pauli string operators with the following coefficients:

Table S12 :
The optimized parameter values that minimize the Hamiltonian operator on the IBMQ-Quito device for LiH at 1.5949 Å.The angles are in radians.

Table S13 :
Total ground state energies of molecules(simulated) at experimental equilibrium distances, without and with the application of REM.Bond distances have been obtained from the National Institute of Standards and Technology (NIST).A noise model from IBMQ-Athens has been added to all the simulations.Readout mitigation has been applied for all the VQE calculations.All energies are given in Hartree.The sampling error of the simulations is represented as the standard deviation.