Quantum optimization using variational algorithms on near-term quantum devices

Universal fault-tolerant quantum computers will require error-free execution of long sequences of quantum gate operations, which is expected to involve millions of physical qubits. Before the full power of such machines becomes available, near-term quantum devices will provide several hundred qubits and limited error correction. Still, there is a realistic prospect of running useful algorithms within the limited circuit depth of such devices. Particularly promising are optimization algorithms that follow a hybrid approach: the aim is to steer a highly entangled state on a quantum system to a target state that minimizes a cost function via variation of some gate parameters. This variational approach can be used both for classical optimization problems and for problems in quantum chemistry. The challenge is to converge to the target state given the limited coherence time and connectivity of the qubits. In this context, the quantum volume is discussed as a metric to compare the power of near-term quantum devices. With focus on chemistry applications, a general description of variational algorithms is provided and the mapping from fermions to qubits is explained. Coupled-cluster and heuristic trial wavefunctions are considered for efficiently finding molecular ground states. Furthermore, simple error-mitigation schemes are introduced that could improve the accuracy of determining ground-state energies. Advancing these techniques may lead to near-term demonstrations of useful quantum computation with systems containing several hundred qubits.


Introduction
Recent advances in the field of quantum computing have boosted the hope that one day complex problems can be solved efficiently on quantum computers. The ultimate goal is a universal fault-tolerant quantum computer that runs arbitrary algorithms much faster than a classical computer. However, millions of physical qubits and high-fidelity gate operations are required to implement such a machine, which currently cannot be built. Yet, quantum devices with a couple of hundred physical qubits and limited or no error correction are likely to become available in the near future. This raises the question of how to exploit these devices for useful calculations. In this paper, we discuss how the variational quantum eigensolver can be run on near-term quantum devices to tackle optimization problems that are exponentially hard on classical computers.
We differentiate between two types of optimization problems. The first kind are quantum optimization problems, such as finding the ground state of a complex molecule or the simulation of its dynamics. In this case, optimization typically involves minimization of the total energy as described by the energy expectation value of a nontrivial Hamiltonian as a function of some molecular parameters, such as interatomic distances. The second kind are classical optimization problems, which can usually be mapped onto a relatively simple Ising-type Hamiltonian. In both cases, exponential scaling of the required computational resources with the problem size can make the problems hard to solve or even intractable on classical computers.
Generally, optimization problems are solved by finding the extremum of an objective function, such as cost, energy, profit or error. As the cost function typically depends on a large set of parameters, finding a solution involves searching a high-dimensional parameter space, which quickly makes a brute-force approach unfeasible. A quantum computer operates on Hilbert space, whose dimension grows exponentially as $2^N$ with the number of qubits $N$. The idea is to use this vast state space with the help of quantum entanglement, and thus boost the efficiency in finding the right solution, ideally with exponential speed-up [1,2,3,4,5]. A more careful analysis shows, however, that the speed-up for classical optimization problems is in many cases rather modest [6,7,8]. In contrast, one can benefit from quantum speed-up in problems that are directly related to the quantum-mechanical description of nature itself. A prominent example is finding the many-electron wavefunction of a molecular system. Classical computers fail to solve such problems exactly for more than a few tens of electrons because of the exponential increase of Hilbert space with the number of electrons. The large state space of a quantum computer can be used to simulate a chemical system and calculate its properties, including correlations and reaction rates, once the challenge of efficiently mapping the fermionic problem to the available qubit hardware is overcome.
In fact, on a quantum device the natural way is to solve the chemical system in second quantization [3,4,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33], formulated in terms of fermionic annihilation and creation operators. Because fermions and qubits obey different statistics, there is no direct one-to-one mapping: each fermion operator must be represented by a string of qubit operators, which induces long-range qubit-qubit correlations in the system and places demanding requirements on the connectivity and the number of gates (see Section 4.1). To compute the quantum evolution of chemical systems on a digital quantum computer, the evolution must be decomposed into discrete time steps, which requires correspondingly long gate sequences [3,14,34].
On current quantum devices, gate errors and decoherence restrict the number of sequential gate operations that can be performed while keeping a meaningful, coherent quantum state. Moreover, connectivity between qubits is limited by the physical routing of the wires on a qubit chip. This is why a new class of hybrid quantum-classical algorithms, called the variational quantum eigensolver (VQE) [35,36,37,34,38,39,40,33,41,42,43], holds great promise for near-term quantum-computing systems (see Fig. 1). These algorithms work with short-depth circuits and yield approximate results when the number of qubits, their coherence and their connectivity are large enough. These requirements on the quantum system can be quantified by the quantum volume [44], a hardware-independent figure of merit for the power of a quantum computer. The VQE can be used both for classical optimization problems and for fermionic Hamiltonians describing, e.g., quantum chemistry. In quantum chemistry the variational quantum eigensolver is used to calculate ground states [35,36,37,34,38,39,33] of chemical systems. The high-dimensional trial wavefunctions, which are costly to represent on a classical computer, are generated on the quantum computer using parametrized single-qubit and entangling gates. The optimization of the gate parameters is performed on a classical computer by summing expectation values of the qubit operators measured on the quantum device and thereby calculating the total energy as a cost function. This can in principle lead to very short-depth circuits, which ideally run in a time that is shorter than the coherence time of the quantum computer. The same variational quantum eigensolver can be applied to other physical systems in condensed matter such as the Fermi-Hubbard model [2,45,12,17,46,47,48] and spin systems [49,50,51,52].
Hybrid algorithms are, however, not resilient against decoherence and gate errors, which may lead to inaccurate estimates of the expectation values. Currently available error-correction schemes, such as those based on surface codes [53], require a significant number of qubits, rendering quantum simulations of practical systems challenging in the near future. Still, novel schemes that do not require ancillas or code qubits can help mitigate induced errors, enabling longer and bigger quantum computations. Such error mitigation schemes [54,55] need to be developed further and tested to improve accuracy without the full overhead of error-correction codes for universal quantum computing.
The paper is structured as follows: The quantum volume is discussed in Section 2 before we explain the variational quantum eigensolver in Section 3 and its application to quantum chemistry problems in Section 4. After a brief discussion of the prospects of solving classical optimization problems with near-term quantum devices in Section 5, we elaborate on the choice of suitable optimizers for the classical feedback in the VQE in Section 6 and discuss the prospects of fighting back decoherence in near-term quantum devices without full error correction in Section 7. Finally, we conclude in Section 8.

Quantum volume, a metric for near-term quantum devices
For current quantum processors, various architectures and physical qubit realizations are being considered. While quantum systems based on superconducting qubits [56,57,58,34,59,60,61] at the moment seem to be leading the way, ion-trap-based systems [62,63] are close competitors. Furthermore, semiconductor-based spin qubits [64,65,66] and other quantum architectures [67,68,69] may still become important in the future. Given the different hardware implementations it is often difficult to benchmark the usefulness or power of quantum systems, which is why a hardware-independent measure is required. To define a suitable metric, we first note that a quantum computer's performance depends on five main hardware parameters:
(i) Number of physical qubits $N$
(ii) Connectivity between qubits
(iii) Number of gates that can be applied before errors or decoherence mask the result
(iv) Available hardware gate set
(v) Number of operations that can be run in parallel
With the goal of quantifying a quantum computer's power with a single parameter, we would like to consider a metric based on the question 'can this device run a given algorithm?'. For any given instance of a quantum algorithm, there is a lower bound on the number of qubits $N$ required to run the algorithm, as well as on the necessary number of steps (or circuit depth) $d$. We therefore define a quantum volume $V_Q$ [44] that takes into account both the number of qubits $N$ and the allowable depth $d$ of quantum circuits that can be run on a near-term quantum device. In the simplest case, we could just choose the quantum volume to be $d \cdot N$; however, this has the undesirable property that it can be gamed in various ways. For example, in many cases the smallest error rates and therefore the largest circuit depth will result from very few qubits, even $N = 2$, as in this case there will be less connectivity and parallelization overhead and fewer issues with crosstalk between qubits. 
However, clearly N = 2 is a completely uninteresting limit. Also the other extreme, where a device has many qubits but little coherence, i.e. d ≈ 1, is not interesting because such a system cannot use entanglement as a resource and calculations become effectively classical.
We therefore conceptually define the quantum volume as

$$\tilde{V}_Q = \min[N, d(N)]^2. \qquad (1)$$

Here, the number of qubits $N$ is an easily accessible hardware parameter; however, the achievable circuit depth $d(N)$ needs further specification in terms of the hardware parameters given in the list above. We start by considering one step of a quantum algorithm (a depth-one circuit) on $N$ qubits. Such a step is expressed as a unitary operator that can be written as a tensor product of randomly chosen arbitrary two-qubit gates on disjoint pairs of qubits (see step 1 in Fig. 2(a)). Here, we allow any unitary two-qubit operation in the SU(4) group, which may consist of a combination of one- and two-qubit gates on the actual hardware. Then an effective error rate $\epsilon_{\mathrm{eff}}$ is defined as the error rate per two-qubit gate averaged over many realizations of such depth-one circuits. Therefore, $\epsilon_{\mathrm{eff}}$ depends on the gate overhead required when all-to-all connectivity, full parallelism and a suitable gate set are not available. It thereby also encapsulates the errors of both single- and two-qubit gates. If the hardware supports all possible two-qubit gates directly (requiring all-to-all connectivity) with identical error rate $\epsilon$, and in addition allows unlimited gate parallelism, then $\epsilon_{\mathrm{eff}} = \epsilon$. If the connectivity is limited, then it will be necessary to insert additional SWAP gates to permute the qubits in order to implement the random two-qubit gates, leading to $\epsilon_{\mathrm{eff}} > \epsilon$. A planar nearest-neighbor qubit coupling would lead to an effective error rate of $\epsilon_{\mathrm{eff}} \propto \sqrt{N}$, and a linear chain of qubits would yield an effective error rate of $\epsilon_{\mathrm{eff}} \propto N$. On the other hand, hardware that directly supports more complex gates such as the Toffoli gate, or the use of a compiler that efficiently compresses the gates of a test circuit, could also lead to a situation with $\epsilon_{\mathrm{eff}} < \epsilon$. Other special features and limitations of the hardware must be dealt with in a similar manner.
The error rate of a single circuit step scales with the number of simultaneous two-qubit gates, $\epsilon_{\mathrm{1step}} \propto N \epsilon_{\mathrm{eff}}$. In other words, we can estimate the circuit depth in which, on average, a single error occurs as $d \simeq 1/(N \epsilon_{\mathrm{eff}})$, linking the effective error $\epsilon_{\mathrm{eff}}$ to the previous definition of the quantum volume using the circuit depth. As an example, if an effective error rate $\epsilon_{\mathrm{eff}} = 10^{-4}$ is experimentally achievable, depth $d = 10$ algorithms could be run on a 1000-qubit device, and $d = 100$ algorithms on a 100-qubit device.
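The depth estimate above can be checked numerically. The following is a minimal sketch, not a calibrated hardware model: the function names are hypothetical, and the proportionality constants of the connectivity scalings are assumed to be 1, which the text does not specify.

```python
import math


def effective_error(base_error: float, n_qubits: int,
                    connectivity: str = "all-to-all") -> float:
    """Effective two-qubit error rate for an assumed connectivity scaling.

    Assumption: proportionality constants are set to 1 for illustration.
    """
    overhead = {
        "all-to-all": 1.0,               # eps_eff = eps
        "planar": math.sqrt(n_qubits),   # eps_eff grows like sqrt(N)
        "linear": float(n_qubits),       # eps_eff grows like N
    }[connectivity]
    return base_error * overhead


def achievable_depth(n_qubits: int, eff_error: float) -> float:
    """Depth at which one error occurs on average: d ~ 1 / (N * eps_eff)."""
    return 1.0 / (n_qubits * eff_error)


# The two cases quoted in the text, with eps_eff = 1e-4.
depth_1000 = achievable_depth(1000, 1e-4)
depth_100 = achievable_depth(100, 1e-4)
```

With these assumptions, `depth_1000` and `depth_100` evaluate to approximately 10 and 100, matching the example in the text.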
However, the effective error rate $\epsilon_{\mathrm{eff}}$ will depend not only on the gate error rates and the connectivity but, more generally, on the complexity of the quantum system, which grows with the number of qubits, for example, because of crosstalk. The effective error rate $\epsilon_{\mathrm{eff}}(N)$ will therefore likely be a function of $N$ even if full connectivity is available. Moreover, $\epsilon_{\mathrm{eff}}$ also depends on the sophistication of the scheduling algorithm responsible for mapping the quantum algorithm considered to the hardware. Both hardware and software improvements will thus impact the effective error rate $\epsilon_{\mathrm{eff}}(N)$.
Finally, we note that with this definition the allowable circuit depth $d \simeq 1/(N \epsilon_{\mathrm{eff}})$ decreases with $N$ at constant effective error $\epsilon_{\mathrm{eff}}$, which means that a system's quantum volume decreases if more qubits with the same fidelity are made available on the hardware. However, a given algorithm does not necessarily need all $N$ available qubits. It could even be beneficial for an algorithm that requires $n < N$ qubits to run on an $N$-qubit machine if a subset of qubits with good connectivity is selected. We therefore further refine the definition of the quantum volume in Eq. (1):

$$V_Q = \max_{n \le N} \min\left[n, \frac{1}{n\,\epsilon_{\mathrm{eff}}(n)}\right]^2, \qquad (2)$$

where the maximum is taken over an arbitrary choice of $n$ qubits to maximize the quantum volume that can be obtained with such a subset. To illustrate this, we plot an example quantum circuit with two circuit steps and the functional dependence of the quantum volume in Fig. 2. We also see that the usefulness of current quantum devices is likely limited by the typical effective error rates, which are $\epsilon_{\mathrm{eff}} > 10^{-3}$. To improve $\epsilon_{\mathrm{eff}}$ we will have to start encoding quantum states in logical qubits with an overhead in the number of physical qubits. This will eventually lead to fault-tolerant quantum computing. The quantum volume is therefore an architecture-neutral metric that characterizes the capability of a chosen quantum computing architecture to run useful quantum circuits. It enables the comparison of hardware with widely different performance characteristics and quantifies the complexity of algorithms that can be run on such a system. An important conclusion that we can draw for the usefulness of near-term quantum devices is that, when increasing the number of qubits, the power of the quantum device will increase only if the effective error rate is improved at the same time.
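The conclusion that additional qubits do not help at fixed error rate can be illustrated with a short calculation. This sketch assumes that the refined quantum volume takes the form $V_Q = \max_{n \le N} \min[n, 1/(n\,\epsilon_{\mathrm{eff}}(n))]^2$, following Ref. [44]; the function name and the constant error model are hypothetical choices for illustration.

```python
def quantum_volume(n_qubits, eff_error):
    """Maximize min(n, d(n))**2 over subsets of n <= N qubits.

    eff_error is a callable n -> eps_eff(n); d(n) = 1 / (n * eps_eff(n)).
    """
    best = 0.0
    for n in range(1, n_qubits + 1):
        depth = 1.0 / (n * eff_error(n))
        best = max(best, min(n, depth) ** 2)
    return best


def constant_error(n):
    """Assumed toy model: eps_eff = 1e-4 independent of n."""
    return 1e-4


vq_50 = quantum_volume(50, constant_error)      # limited by qubit number
vq_100 = quantum_volume(100, constant_error)    # n and d(n) balanced
vq_1000 = quantum_volume(1000, constant_error)  # extra qubits add nothing
```

In this toy model the quantum volume saturates once $n \approx 1/\sqrt{\epsilon_{\mathrm{eff}}}$: going from 100 to 1000 qubits leaves $V_Q$ unchanged, whereas lowering the error rate would raise it.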

Exploring Hilbert space with the variational quantum eigensolver
To exploit near-term quantum devices, applications and algorithms have to be tailored to current quantum hardware with only tens or hundreds of qubits and without full quantum error correction. One main constraint is the limited quantum volume that restricts the depth of meaningful quantum circuits. Still, a small-scale quantum computer with a hundred qubits can process quantum states that cannot even be stored in any classical memory. A natural way to make use of this quantum advantage is via a hybrid quantum-classical architecture: A quantum co-processor prepares multi-qubit quantum states $|\Psi(\theta)\rangle$ parametrized by control parameters $\theta$. The subsequent measurement of a cost function $E_q(\theta) = \langle\Psi(\theta)|H_q|\Psi(\theta)\rangle$, typically the energy of a problem Hamiltonian $H_q$, allows a classical computer to find new values of $\theta$ in order to minimize $E_q(\theta)$ and find the ground-state energy

$$E_q^{\min} = \min_{\theta} \langle\Psi(\theta)|H_q|\Psi(\theta)\rangle. \qquad (3)$$

This variational quantum eigensolver approach to Hamiltonian-problem solving has been recently applied in different contexts [70,37,34,40,71,72]. In fact, the Hamiltonian $H_q$ can take many forms, the only requirement being that it can be mapped to a system of interacting qubits with a non-exponentially increasing number of terms. Here we distinguish two relevant cases: Hamiltonians that describe fermionic condensed-matter or molecular systems (Section 4) and Hamiltonians that describe a classical optimization problem (Section 5).
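The statement that a hundred-qubit state cannot be stored classically follows from a simple count. The sketch below assumes a dense state-vector representation with one double-precision complex amplitude (16 bytes) per basis state; both the representation and the function name are illustrative assumptions.

```python
def statevector_bytes(n_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Memory needed to store all 2**N complex amplitudes of an N-qubit state."""
    return bytes_per_amplitude * 2 ** n_qubits


memory_30 = statevector_bytes(30)    # 16 GiB, feasible on a workstation
memory_100 = statevector_bytes(100)  # more than 10**31 bytes
```

Every additional qubit doubles the requirement, so the step from 30 to 100 qubits corresponds to a factor of $2^{70}$ in memory.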

Variational quantum eigensolver method
In detail, the variational quantum eigensolver method consists of four main steps as shown in Figure 3. First, on the quantum processor a tentative variational eigenstate, a trial state $|\Psi(\theta)\rangle$, is generated by a sequence of gates parameterized by a set of control parameters $\theta$. In the ideal case, this trial state depends on a small number of classical parameters $\theta$, whereas the set of gates is chosen to efficiently explore Hilbert space.
In particular, the class of states forming the solution to the minimization problem in Eq. (3) has to lie within the set of possible trial states. Suitable gate sets that provide a good approximation to the target state, i.e. the state that minimizes the cost function, have been found for both classical optimization problems [41] (Section 5) and quantum chemistry problems (Section 4). Aside from these considerations, it is also essential that hardware constraints be taken into account. As not all gates are directly realizable in hardware, decomposing them into those available in the quantum hardware adds extra overhead in circuit depth. An alternative is, therefore, to use a heuristic approach based on gates that are readily available in hardware [72] as discussed below. Second, once the trial state has been prepared, the expectation value of the problem Hamiltonian $H_q$ is determined. The problem Hamiltonian can be decomposed into Pauli strings,

$$H_q = \sum_{\alpha} h_{\alpha} P_{\alpha}, \qquad P_{\alpha} \in \{\mathbb{1}, \sigma^x, \sigma^y, \sigma^z\}^{\otimes N}, \qquad (4)$$

where $N$ denotes the number of qubits. To determine the expectation value of each Pauli operator in $P_{\alpha}$, each single qubit's population is measured repeatedly for a given number of experiments with identical trial state preparation $|\Psi(\theta)\rangle$. This corresponds to measuring $\sigma^z_j$ for each qubit; other Pauli operators can be determined by applying a prerotation on the qubit before the measurement that effectively rotates the measurement axis. To determine the expectation value of the Pauli strings, the measurement outcomes are multiplied for each run of the experiment and then averaged.
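The averaging procedure for a Pauli string can be sketched as follows. The function name and the representation of measurement records as bitstrings are assumptions made for illustration; each record is taken to be the outcome of one run after the appropriate prerotations, with outcome 0 mapped to eigenvalue +1 and outcome 1 to eigenvalue -1.

```python
def pauli_string_expectation(shots, support):
    """Estimate the expectation value of a Pauli string from repeated runs.

    shots:   list of bitstrings such as "0110", one per run, measured after
             prerotation into the z basis.
    support: indices of the qubits on which the Pauli string acts
             non-trivially (identity factors are skipped).
    """
    total = 0
    for bits in shots:
        product = 1
        for q in support:
            product *= 1 if bits[q] == "0" else -1  # single-qubit eigenvalue
        total += product                             # product for this run
    return total / len(shots)                        # average over runs


# Hypothetical records from a perfectly correlated two-qubit state.
records = ["00", "11", "00", "11"]
zz = pauli_string_expectation(records, support=[0, 1])
z0 = pauli_string_expectation(records, support=[0])
```

For these records the two-qubit correlator evaluates to 1 while the single-qubit expectation value averages to 0, which shows why the outcomes have to be multiplied run by run before averaging.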
In a third step, the cost function is calculated by summing up the expectation values of $P_{\alpha}$ with corresponding coefficients $h_{\alpha}$. Finally, the value of $E_q(\theta)$ is minimized as a function of the parameters $\theta$. A classical optimization algorithm processes $E_q(\theta)$ and provides new parameters $\theta$. For each parameter set, a new set of gates for trial state preparation has to be loaded onto the quantum processor. As this requires rather time-consuming re-programming of the quantum hardware, it is important that only a minimal number of queries be made to the quantum processor. Moreover, the calculated expectation values will be noisy because of the limited sampling statistics of the qubit state. Therefore, robust classical optimizers have to be used that can handle the noise on the measured expectation values and scale favorably with the number of parameters, as described in Section 6. The procedure ends when the minimum of $E_q(\theta)$ in Eq. (3) is reached within a given accuracy and the optimal parameters $\theta^*$ are found.
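The four steps can be traced in a toy example. The sketch below is an illustration under strong assumptions rather than the procedure used on hardware: it uses a hypothetical single-qubit Hamiltonian $H = Z + 0.5\,X$, a one-parameter trial state $R_y(\theta)|0\rangle$, exact expectation values without sampling noise, and plain finite-difference gradient descent as a stand-in for the noise-robust optimizers of Section 6.

```python
import math


def prepare_trial_state(theta):
    """Step 1: trial state R_y(theta)|0> as real amplitudes (a0, a1)."""
    return (math.cos(theta / 2), math.sin(theta / 2))


def measure_paulis(state):
    """Step 2: exact expectation values <Z> and <X> (no sampling noise)."""
    a0, a1 = state
    return {"Z": a0 * a0 - a1 * a1, "X": 2 * a0 * a1}


def energy(theta, coeffs):
    """Step 3: cost function as the weighted sum of Pauli expectation values."""
    expectations = measure_paulis(prepare_trial_state(theta))
    return sum(h * expectations[p] for p, h in coeffs.items())


def run_vqe(coeffs, theta=0.1, step=0.2, iterations=200, delta=1e-4):
    """Step 4: classical feedback loop (finite-difference gradient descent)."""
    for _ in range(iterations):
        grad = (energy(theta + delta, coeffs)
                - energy(theta - delta, coeffs)) / (2 * delta)
        theta -= step * grad
    return theta, energy(theta, coeffs)


coeffs = {"Z": 1.0, "X": 0.5}   # hypothetical toy Hamiltonian H = Z + 0.5 X
theta_opt, e_min = run_vqe(coeffs)
exact_ground_energy = -math.sqrt(1.0 ** 2 + 0.5 ** 2)
```

For this Hamiltonian the exact ground-state energy is $-\sqrt{1.25}$, and the loop converges to it because the trial state family contains the ground state. Each call to `energy` stands for one query to the quantum processor, so the 400 evaluations used here also indicate why optimizers with few queries are desirable.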

Quantum chemistry with qubits
To demonstrate the potential of a quantum processor with limited quantum volume, one needs to consider quantum algorithms that provide a large scaling advantage compared with their classical counterparts. The solution of the electronic structure problem in quantum chemistry belongs to this class: Because of the exponential scaling of the problem, it is impossible to find an exact solution to the Schrödinger equation of systems with more than a few tens of electrons on a classical computer. Several approximations have been introduced to access the properties of large-scale systems with more than 1000 electrons on high-performance computers. The aim is to reach the required accuracy for chemical energies (∼ 50 meV). One approach is to approximate the many-electron Hamiltonian itself using, for example, density-functional theory [73]. There, the original system of interacting electrons is replaced by a fictitious one of non-interacting electrons moving in a modified external potential that allows, at least in principle, the original exact solution to be recovered.
An alternative approach starts from the exact Hamiltonian and attempts to find suitable approximations for the system wavefunction in the many-electron Hilbert space. This calculation can, in principle, be performed either within the first or the second quantization formalism. In first quantization, all spatial integrals have to be evaluated on the quantum computer. For this reason, approaches based on second quantization are better suited to first-generation quantum devices. In this case, all spatial integrals are evaluated beforehand on a classical computer, whereas the sampling of the Hilbert space is performed in the orbital configuration space spanned by molecular Slater determinants. This approach maps naturally to the variational method described above (Section 3). It starts from the one-electron basis states that are obtained by solving the Hartree-Fock equation. These Hartree-Fock orbitals are then used to construct an anti-symmetrized product wavefunction, the Slater determinant, which is used as a starting point for a perturbative expansion. In this expansion a controlled series of excited configurations is added until a sufficiently accurate approximation of the ground state is found.

Mapping fermions to qubits
The electronic Hamiltonian in second quantization is given by

$$H = \sum_{i,j} t_{ij}\, a_i^{\dagger} a_j + \frac{1}{2} \sum_{i,j,k,l} u_{ijkl}\, a_i^{\dagger} a_k^{\dagger} a_l a_j, \qquad (5)$$

where the operators $a_i^{\dagger}$ and $a_i$ create and annihilate electrons in the $i$-th orbital. The parameters $t_{ij}$ and $u_{ijkl}$ describe the one- and two-electron interactions and can be efficiently computed classically as the overlap integrals of the orbitals in the basis set [74]. The number of two-electron terms scales at most with the number of orbitals to the fourth power [4,75] and does not grow exponentially, which would prohibit efficient computation even on a quantum computer.
Because $a_i$ and $a_i^{\dagger}$, unlike the Pauli spin operators, obey fermionic anticommutation relations, simulating this Hamiltonian on a qubit-based quantum processor is not feasible without a mapping from fermionic to Pauli operators. The fermionic nature of electrons implies that many-electron wavefunctions must be anti-symmetric with respect to particle exchange. This is reflected in the way fermionic creation and annihilation operators act on state vectors:

$$a_i^{\dagger} |f_{N-1} \dots f_i \dots f_0\rangle = \delta_{f_i,0}\, p_i\, |f_{N-1} \dots 1_i \dots f_0\rangle, \qquad a_i |f_{N-1} \dots f_i \dots f_0\rangle = \delta_{f_i,1}\, p_i\, |f_{N-1} \dots 0_i \dots f_0\rangle. \qquad (7)$$

Here $p_i = (-1)^{\sum_{k=0}^{i-1} f_k}$ denotes the parity and $f_i \in \{0, 1\}$ the occupation number of the fermionic orbital $i$. The naive replacement of the fermionic operators $a_i^{(\dagger)}$ by Pauli ladder operators $\sigma_i^{\pm} = (\sigma^x \pm i\sigma^y)/2$ does not, however, reproduce Eqs. (7) because $\sigma_i^{\pm}$ describe distinguishable particles with no special symmetries. A variety of mappings have been developed that guarantee that the fermion statistics are captured on a system of qubits [76,77,78]. Among those, the Jordan-Wigner mapping [79] is particularly intuitive: It is based on a one-to-one mapping of fermionic to qubit occupations, i.e. the occupancy information is stored locally. To take into account the parity information $p_i$ in Eqs. (7), each fermionic operator $a_i^{(\dagger)}$ is translated into the ladder operator $\sigma_i^{\pm}$ on qubit $i$, multiplied by $\sigma^z$ operators on all qubits $j < i$ and by identities on the remaining $N - i - 1$ qubits [Eq. (9)], where $N$ is the total number of qubits considered. Calculating the parity when acting on qubit $i$ evidently requires knowledge of all state occupations $j < i$, which is accomplished by the $\sigma^z$ terms in Eq. (9). However, this introduces a nonlocality in the mapping and, when inserted into the Hamiltonian in Eq. (5), gives rise to long strings of $\sigma^z$ operators between the $\sigma^{\pm}$ operators; a term acting on $k$ qubits is known as a $k$-local term. This means that a fermionic wavefunction is spread out over $O(N)$ qubits, posing fidelity issues in the readout process of the expectation value of the Hamiltonian. 
Recent schemes for tapering off qubits in mapped fermionic Hamiltonians [80,78], based on fermionic symmetries, can partially alleviate the hardware requirements necessary for performing simulations of fermionic systems. These second-quantized tapering schemes exploit symmetries in the mapped qubit Hamiltonian to reduce the simulation space needed to host the mapped fermionic system.
The Jordan-Wigner transformation [79] consists of a local occupancy map and a non-local, O(N ), parity function, whereas the binary-tree transformation encodes both operations on maps that scale O(log(N )) with the number of qubits [76,77,78], which is a clear advantage compared with the Jordan-Wigner transformation.
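The Jordan-Wigner construction can be checked against the fermionic action on occupation-number states. The following sketch fixes one convention as an assumption, since the sign convention is not recoverable from the text here: qubit state 1 means "occupied", so the creation operator on qubit $i$ is $(\sigma^x - i\sigma^y)/2$, preceded by $\sigma^z$ on all qubits $j < i$. All function names are hypothetical.

```python
from itertools import product


def jw_creation(i, n_qubits):
    """Creation operator a_i^dagger as a list of (coefficient, Pauli string).

    Assumed convention: character q of the string acts on qubit q, and
    bit value 1 means the orbital is occupied.
    """
    def string(op):
        return "".join("Z" if q < i else (op if q == i else "I")
                       for q in range(n_qubits))
    return [(0.5, string("X")), (-0.5j, string("Y"))]


def apply_pauli_string(pauli, bits):
    """Apply a Pauli string to a basis state; return (phase, new basis state)."""
    phase, out = 1, list(bits)
    for q, op in enumerate(pauli):
        if op == "Z":
            phase *= -1 if bits[q] else 1
        elif op == "X":
            out[q] = 1 - bits[q]
        elif op == "Y":
            phase *= -1j if bits[q] else 1j
            out[q] = 1 - bits[q]
    return phase, tuple(out)


def apply_terms(terms, bits):
    """Apply a sum of Pauli strings to a basis state; drop zero amplitudes."""
    amplitudes = {}
    for coeff, pauli in terms:
        phase, new_bits = apply_pauli_string(pauli, bits)
        amplitudes[new_bits] = amplitudes.get(new_bits, 0) + coeff * phase
    return {b: a for b, a in amplitudes.items() if abs(a) > 1e-12}


def fermionic_creation(i, bits):
    """Reference action: parity sign from the occupations of orbitals k < i."""
    if bits[i] == 1:
        return {}
    sign = (-1) ** sum(bits[:i])
    return {bits[:i] + (1,) + bits[i + 1:]: sign}


def mapping_matches(n_qubits):
    """Compare the qubit and fermionic actions on every basis state."""
    for bits in product((0, 1), repeat=n_qubits):
        for i in range(n_qubits):
            qubit = apply_terms(jw_creation(i, n_qubits), bits)
            fermi = fermionic_creation(i, bits)
            if qubit.keys() != fermi.keys():
                return False
            if any(abs(qubit[b] - fermi[b]) > 1e-12 for b in fermi):
                return False
    return True
```

The string returned by `jw_creation(2, 4)` begins with two `Z` factors, which makes the $O(N)$ growth of the operator length with the orbital index explicit.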

Coupled cluster trial wavefunctions
Once a mapping of fermions to qubits has been chosen, suitable trial states for the VQE have to be prepared on the quantum processor. At best, these trial states incorporate the structure of the problem Hamiltonian and known properties of the solution state, such as the total number $N$ of fermions. While one could aim to find a gate set that allows one to generate all possible excited Slater determinant configurations, which is known as the full configuration interaction (FCI) approach, the number of states scales factorially with the number of electrons, a clear obstacle for computing larger molecules. One way to improve the efficiency is to use a coupled-cluster approach for creating the trial states, which allows a systematic sampling of all relevant excited Slater determinants up to a given excitation degree. In conventional quantum chemistry, these coupled-cluster expansions are used as a benchmark for all other approaches.
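In the unitary coupled-cluster (UCC) approach [81], which is a variational version of the commonly used coupled-cluster method [82], the unitary operator $U(\theta)$ that is used to generate a trial wavefunction $|\Psi(\theta)\rangle$ from the reference state $|\Phi\rangle$ is given by

$$|\Psi(\theta)\rangle = U(\theta)|\Phi\rangle = e^{T(\theta) - T^{\dagger}(\theta)}|\Phi\rangle. \qquad (10)$$

It is constructed by exponentiation of the cluster operator $T(\theta)$, defined as

$$T(\theta) = T_1(\theta) + T_2(\theta) + \dots, \qquad (11)$$

$$T_1(\theta) = \sum_{i \in \mathrm{occ},\, m \in \mathrm{virt}} \theta_i^m\, a_m^{\dagger} a_i, \qquad T_2(\theta) = \sum_{i,j \in \mathrm{occ},\, m,n \in \mathrm{virt}} \theta_{ij}^{mn}\, a_m^{\dagger} a_n^{\dagger} a_j a_i. \qquad (12)$$

Here, the coefficients $\theta$ form a vector of parameters that will be optimized using the VQE. A common choice for the reference state $|\Phi\rangle$ is the ground-state Slater determinant made up of the lowest-energy molecular orbitals obtained from the solution of the Hartree-Fock equation.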
The coefficients $\theta$ of the cluster operators are not independent and their value decreases with the order of the excitation. Therefore, this expansion is typically truncated at the double (UCCSD) or triple level (UCCSDT) of excitation without significantly reducing the accuracy. In fact, the exponentiation of the cluster operator $T(\theta)$ introduces higher uncorrelated excitations at each level of truncation; e.g., for $T = T_1 + T_2$ the expansion $e^{T} = 1 + T_1 + T_2 + T_1^2/2 + T_1 T_2 + T_2^2/2 + \dots$ produces triple and quadruple excitations in the first few terms (fifth and sixth terms, respectively). Despite the compactness of this expansion, the number of coefficients $\theta$ increases already in UCCSD with the number of orbitals to the fourth power, which impacts the efficiency of the classical optimization of the trial state $|\Psi(\theta)\rangle$. In practice, in the case of large molecular systems the limited achievable circuit depth in current quantum devices requires a further truncation of the series in Eq. (12). Thus, while the coupled-cluster method guarantees in principle an efficient convergence towards the exact ground state, its implementation in state-of-the-art quantum computers requires further studies of how different approximations (truncations) affect the accuracy of the solution.
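The fourth-power growth of the number of UCCSD coefficients can be made concrete by counting excitations. The sketch below counts spin-orbital single and double excitations from occupied to virtual orbitals and, as a simplifying assumption, ignores any reduction from spin or point-group symmetry; the function name is hypothetical.

```python
from math import comb


def uccsd_parameter_count(n_occupied: int, n_virtual: int):
    """Number of single and double excitation amplitudes in spin orbitals.

    Assumption: no symmetry restrictions are applied, so this is an upper
    bound on the number of independent parameters.
    """
    singles = n_occupied * n_virtual
    doubles = comb(n_occupied, 2) * comb(n_virtual, 2)
    return singles, doubles


# Hydrogen molecule in a minimal basis: 2 electrons in 4 spin orbitals.
h2_singles, h2_doubles = uccsd_parameter_count(2, 2)

# Doubling a half-filled system from 20 to 40 spin orbitals.
_, doubles_20 = uccsd_parameter_count(10, 10)
_, doubles_40 = uccsd_parameter_count(20, 20)
growth = doubles_40 / doubles_20
```

Doubling the number of spin orbitals in this example multiplies the number of double-excitation amplitudes by roughly 18, close to the asymptotic factor of 16 expected from fourth-power scaling.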

Hardware-efficient trial states suitable for near-term quantum hardware
A much simpler approach is, therefore, the heuristic generation of the trial state with unitary operations that are more suited to the available quantum hardware [72]. Independently of the particular problem to be solved, one may choose trial states that can be efficiently generated in current quantum hardware and at the same time allow the generation of highly entangled states that are close to the target state.
This approach is showcased in the examples provided in Sections 4.4 and 5.1. As shown in Fig. 4, the preparation of the heuristic trial states comprises two types of quantum gates: single-qubit Euler rotations $U(\theta)$ determined by the rotation angles $\theta$, and an entangling drift operation $U_{\mathrm{ent}}$ acting on pairs of qubits. The $N$-qubit trial states are obtained by applying a sequence of $D$ entanglers $U_{\mathrm{ent}}$ alternating with the Euler rotations on the $N$ qubits to the initial ground state $|00 \dots 0\rangle$,

$$|\Psi(\theta)\rangle = \prod_{q=1}^{N} \left[U^{q,D}(\theta)\right] U_{\mathrm{ent}} \prod_{q=1}^{N} \left[U^{q,D-1}(\theta)\right] \cdots U_{\mathrm{ent}} \prod_{q=1}^{N} \left[U^{q,0}(\theta)\right] |00 \dots 0\rangle. \qquad (13)$$

This gate sequence has a total number of $p = N(3D + 2)$ independent angles. To be more specific, the single-qubit operations are decomposed into rotations about the $x$- and the $z$-axes, $U^{q,i}(\theta) = Z_q(\theta_1^{q,i})\, X_q(\theta_2^{q,i})\, Z_q(\theta_3^{q,i})$, with $X_q(\theta_j^{q,i}) = \exp(-i\theta_j^{q,i} \sigma^x_q / 2)$ (and similarly for $Z_q(\theta_j^{q,i})$ and $Y_q(\theta_j^{q,i})$) denoting the unitary operation acting on qubit $q$ at the $i$-th position in the gate sequence. The heuristic approach does not rely on the accurate implementation of specific two-qubit gates and can be used with any $U_{\mathrm{ent}}$ that generates sufficient entanglement. A natural choice is the cross-resonance gate [83,84], a two-qubit gate suited for the fixed-frequency superconducting qubit architecture as used, for example, for the IBM Q experience [61].
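The angle count $p = N(3D + 2)$ can be reproduced by laying out the gate sequence explicitly. The sketch below assumes that each later rotation layer applies a Z-X-Z Euler decomposition per qubit and that the first layer omits its leading $z$ rotation because it only adds a phase to $|0\rangle$; this is one reading consistent with the stated count, and the function name and data layout are hypothetical.

```python
def hardware_efficient_schedule(n_qubits: int, depth: int):
    """Return (schedule, number of angles) for D entanglers and D + 1 rotation layers.

    Each rotation entry is (axis, qubit, angle index); entangler layers carry
    no variational angles.
    """
    schedule, angle_index = [], 0

    def rotation_layer(axes):
        nonlocal angle_index
        layer = []
        for q in range(n_qubits):
            for axis in axes:
                layer.append((axis, q, angle_index))
                angle_index += 1
        return ("rotations", layer)

    # Assumed: the first layer needs only two angles per qubit.
    schedule.append(rotation_layer(("X", "Z")))
    for _ in range(depth):
        schedule.append(("entangler", None))
        schedule.append(rotation_layer(("Z", "X", "Z")))
    return schedule, angle_index


schedule, n_angles = hardware_efficient_schedule(n_qubits=6, depth=2)
```

For six qubits and two entanglers this yields 48 angles, so the classical optimizer faces a parameter space that grows only linearly in both $N$ and $D$.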

Small molecules calculated with the variational quantum eigensolver
As an application of the method described above, we present the calculation of the ground-state energy of simple molecules such as the hydrogen molecule. The starting point is the Hamiltonian in second quantization in Eq. (5) with the one-body terms $t_{ij}$, representing the kinetic energy of the electrons and the potential energy that they experience in the presence of the nuclei, and the Coulomb repulsion terms $u_{ijkl}$ [Eqs. (14) and (15)]. Here, $Z_n$ are the nuclear charges ($n = 1, 2$), and each orbital $\phi_i(x_1)$ is a 1s orbital centered on one of the hydrogen atoms. We assume that the system is in its spin singlet state. After reduction [78], a two-qubit Hamiltonian [Eq. (16)] is obtained with coefficients $f_0 = -1.0524$, $f_1 = 0.01128$, $f_2 = 0.3979$, $f_3 = 0.3979$, and $f_4 = 0.1809$. These coefficients are calculated at the equilibrium distance of 0.74 Å using Eqs. (14) and (15). We evaluate the ground state of the Hamiltonian in Eq. (16) on an ideal quantum simulator [61] using a heuristic trial wavefunction approach (Section 4.3) with an increasing number of entangling steps (one, two and four). Here, the single-qubit rotations of the heuristic trial wavefunctions were implemented as $U^{i}(\theta) = Y(\theta_0^{i}) Z(\theta_1^{i})$ and the entanglement was introduced via controlled-phase gates [85]. Figure 5 shows that a single entangling step is not sufficient to converge towards the correct energy value, whereas two or more entanglers can reproduce the expected results within a few tens of optimization steps in the rotation-angle space $\theta$.
This method can be extended to larger molecules. For lithium hydride (LiH) and beryllium dihydride (BeH$_2$) the second-quantized fermionic Hamiltonian is constructed using a minimal set of atomic orbitals [72] (labelled by the conventional hydrogenic quantum numbers). In beryllium dihydride the basis is composed of the 1s, 2s and 2p$_x$ orbitals associated with beryllium, and the 1s orbital associated with each hydrogen atom. This results in a total of ten spin orbitals. The two innermost 1s spin orbitals of beryllium are assumed to be completely filled. The remaining eight spin orbitals of beryllium dihydride are reduced to six by exploiting spin-parity symmetries [78]. Similarly, lithium hydride is mapped onto four qubits. It is demonstrated numerically that in the absence of noise, $D = 8$ and $D = 28$ entangling steps are required to achieve chemical accuracy for lithium hydride and beryllium dihydride, respectively, for the given experimental connectivity. However, the combined effect of decoherence and finite sampling limits the optimal depth for optimizations on current quantum hardware to between zero and two entanglers, which results in deviations of the simulated bond-dissociation energies from the real values. Decreasing the effective error rates or applying error-mitigation schemes as discussed in Section 7 will improve the accuracy of the simulations.

Figure 5 (caption): At the equilibrium geometry and with no entangler block in the circuit, the energy converges to a state with an energy that is about 50% too high. With two or more entanglers, the exact energy is obtained. The inset shows the entire dissociation profile for a hydrogen molecule calculated with four entangling steps.

Classical optimization with qubits
The complex Hamiltonians of quantum chemistry problems give quantum computers an inherent advantage over classical hardware. For classical optimization the advantage is not as obvious because many of the relevant problems can be mapped to a relatively simple Ising-spin Hamiltonian. It is diagonal in the computational basis and can be tackled by a range of classical methods. One of the issues with classical solvers is to avoid solutions in local minima of the cost function. In this context simulated annealing [86] is an approach that makes use of thermal fluctuations to escape such local minima. Quantum annealing [87] additionally exploits quantum tunneling and can potentially reach a ground state faster especially for problems with very corrugated cost functions [6,7]. The potential for quantum speed-up with this approach is heavily debated in the community; however, because of the tremendous application space even a modest speed-up for a selected number of problems might have a significant impact. Moreover, understanding the detailed evolution of the optimization process and the potential role of entanglement is critical even for improving algorithms that run on classical hardware. This is why the application of the VQE for solving classical optimization problems on gate-based near-term quantum devices is especially interesting.
To run the variational quantum eigensolver we again consider two different ways to create trial wavefunctions. First, the quantum approximate optimization algorithm (QAOA) [41] is discussed, which is a polynomial-time algorithm for finding an approximate solution to a classical optimization problem with a desired accuracy. It is related to the quantum adiabatic algorithm [88], but has shorter circuit-depth requirements. Second, we give a short example of how heuristic trial states can be used to solve a MaxCut problem on a real quantum device using the variational quantum eigensolver.

Quantum approximate optimization algorithm with short depth
Similarly to the approach described in Section 4.3, the trial wavefunction in the QAOA is guided towards the solution by repeated unitary evolution according to two Hamiltonians. The first one is the Hamiltonian $H_C$, which encodes the classical cost function $C(x)$ of a binary constrained optimization problem. The second one is a mixing Hamiltonian $H_M$, which helps guide the optimization in Hilbert space towards the ground state of $H_C$. The number of times that both Hamiltonians are applied in the optimization process defines the level $D$ of the circuit and determines the complexity of the algorithm. Without loss of generality, it is assumed that an optimal solution $x$ minimizes the cost function $C(x)$, which is a polynomial in the binary components $x_i \in \{0, 1\}$ of the variable $x$. Encoding the cost function $C(x)$ into a Hamiltonian $H_C$ requires shifting the binary variables $x_i \to (1 - z_i)/2$ with $z_i \in \{-1, 1\}$ and then substituting $z_i \to \sigma^z_i$ to obtain an Ising-type Hamiltonian. We choose the same notation as in Eq. (4) but consider only diagonal terms $\sigma^j_i \in \{1, \sigma^z_i\}$, which gives
$$H_C = \sum_\alpha h_\alpha P_\alpha = \sum_\alpha h_\alpha \prod_{i_\alpha} \sigma^z_{i_\alpha}.$$
Here the index $i_\alpha$ runs over all $\sigma^z_{i_\alpha}$ in $P_\alpha$, which constitutes a $k$-local term (many-body interaction term among $k \le N$ qubits), matching the polynomial terms in the cost function $C$ with corresponding real coefficients $h_\alpha$. The second Hamiltonian $H_M$ is just a global transverse field, i.e.
$$H_M = \sum_{i=1}^{N} \sigma^x_i.$$
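The substitution $x_i \to (1 - z_i)/2$ can be checked on a small example. The following is a minimal sketch in plain Python, not part of the original work: the helper names `binary_to_ising` and `evaluate` and the toy cost function are illustrative assumptions. A multilinear polynomial in the binary variables is stored as a dictionary from index sets to coefficients, and each monomial is expanded into the real coefficients $h_\alpha$ of products of $z_i$.

```python
from itertools import combinations, product


def binary_to_ising(poly):
    """Rewrite a polynomial in x_i in {0, 1} as a polynomial in z_i in {-1, +1}.

    `poly` maps frozenset(indices) -> coefficient. Each monomial
    prod_{i in S} x_i becomes prod_{i in S} (1 - z_i) / 2, which expands into
    a sum over all subsets T of S with weight (-1)^|T| / 2^|S|.
    """
    ising = {}
    for index_set, coeff in poly.items():
        indices = tuple(sorted(index_set))
        for r in range(len(indices) + 1):
            for subset in combinations(indices, r):
                key = frozenset(subset)
                ising[key] = ising.get(key, 0.0) + coeff * (-1) ** r / 2 ** len(indices)
    return ising


def evaluate(poly, values):
    """Evaluate a polynomial given as {frozenset(indices): coefficient}."""
    total = 0.0
    for index_set, coeff in poly.items():
        term = coeff
        for i in index_set:
            term *= values[i]
        total += term
    return total


# Hypothetical toy cost function: C(x) = x0 + 2 x1 x2 - 3 x0 x2
cost = {frozenset({0}): 1.0, frozenset({1, 2}): 2.0, frozenset({0, 2}): -3.0}
h = binary_to_ising(cost)

# Both forms agree on every assignment, with z_i = 1 - 2 x_i.
for x in product((0, 1), repeat=3):
    z = [1 - 2 * xi for xi in x]
    assert abs(evaluate(cost, x) - evaluate(h, z)) < 1e-12
```

Replacing each $z_i$ by $\sigma^z_i$ in the resulting dictionary gives the terms $h_\alpha P_\alpha$ of the cost Hamiltonian; the empty index set carries the constant offset.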
To find the ground state of the problem Hamiltonian $H_C$, one proceeds by applying the evolution operator
$$U(\beta, \gamma) = \prod_{l=1}^{D} e^{-i \beta_l H_M}\, e^{-i \gamma_l H_C}$$
to a starting state $|\psi_0\rangle$ that can easily be generated on the quantum computer, e.g. a uniform superposition state. Using the VQE, the parameters of the final state $|\beta, \gamma\rangle = U(\beta, \gamma)|\psi_0\rangle$ are then adjusted so as to minimize the expectation value $\langle \beta, \gamma | H_C | \beta, \gamma \rangle$. Measurement of the final state $|\beta, \gamma\rangle$ directly yields the solution of the classical optimization problem with a probability that approaches unity as $D$ increases. However, with increasing $D$ the circuit depth required will reach the decoherence limits of available quantum hardware, and the fidelity of the result will again decrease. Also, the number of classical parameters that need to be optimized for large $D$ will result in slower convergence. Instead of using the VQE, choosing a fine interpolation $(\beta_l, \gamma_l) = (1 - l/D, l/D)$ with $l = 0, \ldots, D$ would be equivalent, to first order, to a Trotterized version of the adiabatic quantum algorithm [1,88]. By letting the VQE select optimal parameters $(\gamma_l, \beta_l)$, a more direct path to the target state becomes possible and the algorithm can reach the ground state with high accuracy even for relatively small values of $D$. The QAOA has been generalized and successfully applied to MaxCut with analytical and numerical studies [89].
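The alternating evolution can be illustrated with a small state-vector simulation. The sketch below is an illustration under stated assumptions rather than the implementation used in the work: it assumes a four-qubit ring with $H_C = \sum_{(i,j)} \sigma^z_i \sigma^z_j$, the mixer $H_M = \sum_i \sigma^x_i$, a uniform superposition as starting state, and a coarse grid search in place of the classical optimizer. All function names are hypothetical.

```python
import cmath
import math
from itertools import product

N = 4                                      # number of qubits (assumed toy size)
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]   # assumed ring-shaped problem graph


def cost_energy(bits):
    """Eigenvalue of H_C = sum over edges of sigma^z_i sigma^z_j for a basis state."""
    z = [1 - 2 * b for b in bits]
    return sum(z[i] * z[j] for i, j in EDGES)


BASIS = list(product((0, 1), repeat=N))    # basis states, first qubit most significant
ENERGIES = [cost_energy(b) for b in BASIS]


def apply_cost(state, gamma):
    """Apply exp(-i gamma H_C); H_C is diagonal, so this is a phase per amplitude."""
    return [a * cmath.exp(-1j * gamma * e) for a, e in zip(state, ENERGIES)]


def apply_mixer(state, beta):
    """Apply exp(-i beta sum_i sigma^x_i) as a product of single-qubit rotations."""
    c, s = math.cos(beta), -1j * math.sin(beta)
    for q in range(N):
        mask = 1 << (N - 1 - q)            # flips qubit q in the basis index
        state = [c * state[idx] + s * state[idx ^ mask] for idx in range(len(state))]
    return state


def qaoa_energy(betas, gammas):
    """Expectation value of H_C in the level-D trial state, D = len(betas)."""
    state = [1 / math.sqrt(2 ** N)] * 2 ** N   # uniform superposition
    for beta, gamma in zip(betas, gammas):
        state = apply_cost(state, gamma)
        state = apply_mixer(state, beta)
    return sum(abs(a) ** 2 * e for a, e in zip(state, ENERGIES))


# Level D = 1: coarse grid search standing in for the classical optimizer.
grid = [k * math.pi / 16 for k in range(16)]
best_energy, best_beta, best_gamma = min(
    (qaoa_energy([b], [g]), b, g) for b in grid for g in grid
)
```

Passing longer lists of angles to `qaoa_energy` raises the level $D$; the grid search then becomes impractical and would be replaced by one of the optimizers discussed in Section 6.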

Variational quantum eigensolver applied to the MaxCut problem
To give an example of a classical optimization problem, we discuss an instance of the maximum-cut (MaxCut) problem with four qubits. Instead of generating trial states with the QAOA, we again use the hardware-efficient approach explained in Section 4.3 to run the algorithm on a real quantum device. The MaxCut problem is an NP-complete binary optimization problem, with applications in clustering, network science, and statistical physics. It aims at grouping the nodes of a graph into two subgroups by cutting across the links between them. The cut is to be made in such a way that the sum of the weights of the links (edges) that are cut is maximized.
The formal definition of this problem is the following: Consider an $n$-node undirected graph with edge weights $w_{ij} > 0$, $w_{ij} = w_{ji}$, where $(i, j)$ enumerate the nodes linked by the corresponding edge [90]. The profit function to be maximized is the sum of the weights of the edges connecting nodes in the two different subsets. By assigning a subset label $x_i = 0$ or $x_i = 1$ to each node $i$, one tries to maximize
$$C(x) = \sum_{i,j} w_{ij}\, x_i (1 - x_j).$$
We can then use the mapping described in Section 5.1 to obtain the Ising Hamiltonian
$$C(\sigma^z) = \sum_{i<j} \frac{w_{ij}}{2} \left(1 - \sigma^z_i \sigma^z_j\right) = -\frac{1}{2} \sum_{i<j} w_{ij}\, \sigma^z_i \sigma^z_j + \mathrm{const}.$$
In other words, the weighted MaxCut problem is equivalent to finding the ground state of the Ising Hamiltonian
$$H_C = \sum_{i<j} w_{ij}\, \sigma^z_i \sigma^z_j.$$
For exploring the solution space of $H_C$ we use the approach from Section 4.3 to define a hardware-efficient heuristic trial wavefunction
$$|\psi(\theta)\rangle = \left[U_{\mathrm{single}}(\theta)\, U_{\mathrm{ent}}\right]^{D} |\psi_0\rangle,$$
where $U_{\mathrm{single}}(\theta)$ is the layer of single-qubit rotations, $U_{\mathrm{ent}}$ is a collection of fully entangling gates that are diagonal, and the number of entanglers $D$ defines the level of the quantum circuit. The single-qubit rotations [...] coefficients, while still exploiting entanglement to potentially converge faster to the solution. Evaluation of the energy expectation value for a specific trial wavefunction is especially simple in this case, as it is sufficient to measure all four qubits and extract the pairwise $\sigma^z_i \sigma^z_j$ correlators. Figure 6(a) shows two different cuts through a problem instance with four nodes (qubits). The lower of the two solves the problem if all non-zero weights in $w_{ij}$ are assumed to be equal. When we implement this on an ideal quantum simulator [61,85] and use the VQE to optimize the parameters of the trial state in 100 trial steps, we get the state probabilities shown in Fig. 6(b). For this simple simulation, the solution is found with a probability that is higher than 95%.
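The equivalence between maximizing the cut and minimizing the Ising energy can be verified by brute force for a small graph. The sketch below uses a hypothetical four-node instance with unit weights; it is not the instance of Fig. 6(a), whose edges are not specified in the text, and the function names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical four-node instance: each key (i, j) with i < j is an undirected edge.
WEIGHTS = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0, (1, 3): 1.0, (2, 3): 1.0}
N_NODES = 4


def cut_value(x):
    """Profit C(x): total weight of edges whose end points carry different labels."""
    total = 0.0
    for (i, j), w in WEIGHTS.items():
        # Both orderings of the pair, as in the double sum over i and j.
        total += w * (x[i] * (1 - x[j]) + x[j] * (1 - x[i]))
    return total


def ising_energy(x):
    """Energy of H_C = sum_{i<j} w_ij z_i z_j with z_i = 1 - 2 x_i."""
    z = [1 - 2 * xi for xi in x]
    return sum(w * z[i] * z[j] for (i, j), w in WEIGHTS.items())


total_weight = sum(WEIGHTS.values())
assignments = list(product((0, 1), repeat=N_NODES))

# C(x) = (W - E(x)) / 2 for every assignment, so max cut <=> min energy.
for x in assignments:
    assert abs(cut_value(x) - (total_weight - ising_energy(x)) / 2) < 1e-12

best_cut = max(assignments, key=cut_value)
ground_state = min(assignments, key=ising_energy)
```

Since only the pairwise products $z_i z_j$ enter the energy, the same quantity can be estimated on hardware from the measured $\sigma^z_i \sigma^z_j$ correlators of the trial state.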

Classical robust optimizers for measured expectation values
The optimization cycle of the VQE (see Section 3) involves evaluation of the cost function on a real quantum device, e.g., a superconducting quantum processor, and adjustment of the variational parameters using classical optimization algorithms. In the latter, several important aspects need to be considered for successful application of the VQE.
First, the optimization could get stuck in a local minimum that would correspond to an excited state of the system. Using a suitable optimization routine can prevent finding such false minima. Gradient-descent methods may be combined with simulated annealing steps or strategies that involve starting from multiple initial points. In this context, in [38] a greedy search with multiple starting points is alternated with a Powell search, showing good performance on Hubbard lattices of up to twelve sites.
Second, because of the limited number of samples of the Hamiltonian terms on the quantum computer, one only has access to a noisy energy (cost) value. The error in the energy estimation goes as $O(1/\sqrt{s})$, with $s$ the number of samples taken. Grouping Pauli operators into commuting sets [40,72] that can be measured with the same state preparation and post-rotations reduces the number of separate measurements and enables more averages and better sampling statistics. Still, the choice of the optimizer must take into account that the cost function is affected by stochastic fluctuations. In fact, while unitary coupled-cluster methods and other analytical variational circuits in principle support the use of gradient-based methods that increase the efficiency of the optimization [91], an imperfect knowledge of the unitary gates implemented in a given quantum device and statistical noise render gradient-based approaches less useful. Derivative-free methods, such as Nelder-Mead and the TOMLAB method, have been tested for optimization of the hydrogen molecule, resulting in a superior performance of the latter method in the presence of stochastic noise [40]. Third, time overheads due to repeated sampling and the number of function evaluations to update the variational parameters will affect the performance of the optimization. In this spirit, the use of a simultaneous perturbation stochastic approximation (SPSA) [92], used in [72] for molecular structure problems, provides both a constant overhead in terms of the number of variational parameters and robustness with respect to stochastic fluctuations. Extensions of the SPSA method that include approximations to the Hessian matrix can be explored to improve the speed of the optimization in the final steps, where estimating second derivatives helps achieve faster convergence [93].
In contrast, additional savings in time overhead in SPSA optimizations that rely on just one evaluation of the cost function per update step [94] could further improve the performance in large-scale quantum problems where sampling is particularly difficult. While simultaneous perturbation methods can be very useful in the optimization of fermionic problems, for classical problems, such as instances of MaxCut, the ease of evaluating the cost function may favor standard gradient-descent or derivative-free routines.
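The constant overhead of SPSA can be made concrete with a short sketch. The code below is a generic textbook-style SPSA loop in plain Python, not the optimizer configuration used in [72]: the gain schedules, the toy quadratic cost, and the Gaussian noise standing in for finite sampling are all assumptions chosen for illustration. Each update perturbs all parameters simultaneously along a random ±1 direction and uses exactly two cost evaluations, independent of the number of parameters.

```python
import random


def spsa_minimize(cost, theta0, iterations=200, a=0.2, c=0.1, A=10.0, seed=0):
    """Minimize a noisy cost function with SPSA.

    Returns the final parameters and the number of cost evaluations used.
    The decay exponents 0.602 and 0.101 are commonly used default values.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    evaluations = 0
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** 0.602      # step size
        c_k = c / (k + 1) ** 0.101          # perturbation size
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = cost(plus) - cost(minus)     # two evaluations per update
        evaluations += 2
        # Gradient estimate for component i: diff / (2 c_k delta_i)
        theta = [t - a_k * diff / (2.0 * c_k * d) for t, d in zip(theta, delta)]
    return theta, evaluations


noise = random.Random(1)


def exact_cost(theta):
    """Toy stand-in for the energy landscape (assumed, not a molecular Hamiltonian)."""
    return sum(t * t for t in theta)


def sampled_cost(theta):
    """Exact cost plus Gaussian noise mimicking a finite number of samples."""
    return exact_cost(theta) + noise.gauss(0.0, 0.05)


theta_opt, n_evals = spsa_minimize(sampled_cost, [1.0, 1.0, 1.0, 1.0])
```

A finite-difference gradient would need two evaluations per parameter and update, whereas here the count stays at two per update however many rotation angles the trial state contains.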
Another critical aspect is the improvement of the classical control hardware for running the VQE on a quantum device: measurement of the cost function with sufficient accuracy requires repeated sampling of the output state and thereby also repeated cycles of qubit initialization, application of the quantum gates and qubit measurement. The speed of the execution of the optimization can be improved on the hardware side by using integrated active reset techniques. In the case of superconducting qubits this is true for both qubits and resonators [95,96]. Moreover, the costly time overhead in synthesizing and loading control pulses onto the quantum processor for trial-state preparation can be reduced by short-latency field-programmable gate-array-based control and measurement architectures such that time overheads are solely related to the execution of the quantum gates and the readout of the qubits.

Prospects of fighting decoherence without full error correction
The hardest challenge for practical near-term quantum devices is their sensitivity to noise. Any computation that has the potential to leverage quantum effects and to provide a quantum speed-up over classical algorithms needs sufficiently coherent qubits. It was realized early on [97] that the coupling to the environment sets both a time and size limit for a quantum computation. Hence, the strength of this coupling determines how large a computation can be performed. This constant limit has to be contrasted with the improvements that are gained from the asymptotic scaling advantages of quantum algorithms. This limitation was, at least in theory, remedied with the advent of quantum error correction [98,99,100]. However, in spite of rapid experimental progress, the resource requirements for fully fault-tolerant operations with current codes [53] seem prohibitively large [101,102]. In turn, hopes were raised that non-error-corrected devices will soon become available that reach a regime of reasonably long coherence times and give rise to dynamics too complex to be simulated on a classical computer [103,43]. In light of these developments, the question arises which computational tasks can be accomplished with quantum devices that have only limited or no error correction. Depending on the form of the actual physical noise, it is expected that the production of entropy in any quantum circuit that is subject to noise will set a limit to this approach [104], and error correction is indispensable for any advanced form of quantum information processing. However, the full computational power of even short-depth circuits is not yet fully understood, and on complexity-theoretic grounds it can be argued that even finite-depth circuits lie beyond the computational power of a classical computer [105,43].
Recent experiments in which the quantum simulation of small molecules was performed [72] showed that even for very short-depth circuits the effects of decoherence become apparent. For the simulation to be of value, the effect of this error needs to be mitigated, and several proposals have been made to deal with the effects of decoherence in short-depth quantum computation [106,39,55,54].
For a large fraction of applications, the computational task can be abstracted to estimate the expectation value of some observable after the application of a short-depth quantum circuit. This estimation must be accurate enough to achieve a simulation precision that outperforms approximate classical simulation tasks. Techniques to mitigate the error in the estimation of expectation values were introduced in [55]. It is shown that the estimate can be improved in the presence of noise with only a modest time overhead. This approach requires no additional hardware resources such as fresh ancilla or code qubits.
In this scheme, the estimation of an expectation value is improved by an extrapolation to the limit of zero noise as originally proposed by Richardson [107]. The method requires no a priori knowledge about the noise source, except that the noise is weak and time-independent. To understand this approach it is useful to choose a more physically motivated description of the computation rather than the gate-based quantum circuits. It is more convenient to consider a time-dependent Hamiltonian dynamics $H(t) = \sum_\alpha J_\alpha(t) P_\alpha$ that implements the circuit, where $J_\alpha(t)$ are coupling coefficients and $P_\alpha$ are $N$-qubit Pauli operators. In this model the coherent evolution is subject to a noise contribution $\mathcal{L}$ that is effectively constant in time and acts on a time scale much larger than the time-dependent Hamiltonian implementing the quantum circuit. The time evolution up to some time $T$ of the open system with initial state $\rho_0$ can be described by a Lindblad master equation
$$\frac{\partial}{\partial t} \rho(t) = -i\,[H(t), \rho(t)] + \lambda\, \mathcal{L}(\rho(t)).$$
The expectation value $E(\lambda)$ of some observable $A$ is then obtained from the final state $\rho_\lambda(T)$ and can be written as a power series in the noise rate $\lambda$,
$$E(\lambda) = E^*(0) + \sum_{k=1}^{n} a_k \lambda^k + O(\lambda^{n+1}),$$
where $E^*(0)$ corresponds to the noise-free expectation value. Richardson proposed a so-called deferred approach to the limit to estimate an expectation value such as $E^*(0)$ with high accuracy [107,108]. For this purpose, the expectation value $E(\lambda_j)$ is measured for different noise rates $\lambda_j = c_j \lambda$, where $c_j$ is a rescaling factor and $\lambda$ the actual noise rate in the experiment. The noise-free expectation value can then be estimated by [55]
$$\hat{E}^{n}(\lambda) = \sum_{j=0}^{n} \gamma_j\, E(c_j \lambda),$$
where $\sum_{j=0}^{n} \gamma_j = 1$ and $\sum_{j=0}^{n} \gamma_j c_j^k = 0$ for $k = 1, \ldots, n$. In this way the largest terms in the error up to $O(\lambda^n)$ are cancelled, thus leading to an estimation of the noise-free expectation value with very high accuracy. In practice, however, the noise rate $\lambda$ is fixed.
To still obtain an experimental estimate of the expectation values $E(\lambda_j)$, the following trick can be applied: the quantum circuit $H(t)$ can be run for a time $c_j T$ and with a reduced coupling $J_\alpha / c_j$. As the noise $\mathcal{L}$ is assumed to be constant in time, it can be shown that the state resulting from a rescaled dynamics is identical to the state obtained from the dynamics with an effectively rescaled noise parameter. Depending on the nature of the noise, relative errors for the noise-free expectation value range from $10^{-6}$ to $10^{-11}$ [55].
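The two conditions on the coefficients $\gamma_j$ form a small linear system whose solution can be written in closed form as the Lagrange interpolation weights at zero noise, $\gamma_j = \prod_{m \neq j} c_m / (c_m - c_j)$. The following minimal sketch uses exact rational arithmetic; the quadratic noise model and the function names are assumptions for illustration only and do not represent data from [55].

```python
from fractions import Fraction


def richardson_coefficients(c):
    """Return gamma_j with sum_j gamma_j = 1 and sum_j gamma_j c_j^k = 0, k = 1..n.

    Uses gamma_j = prod_{m != j} c_m / (c_m - c_j), i.e. the Lagrange basis
    polynomials through the nodes c_j evaluated at zero.
    """
    gammas = []
    for j, cj in enumerate(c):
        g = Fraction(1)
        for m, cm in enumerate(c):
            if m != j:
                g *= Fraction(cm) / (Fraction(cm) - Fraction(cj))
        gammas.append(g)
    return gammas


def extrapolate(measured, c):
    """Zero-noise estimate sum_j gamma_j E(c_j lambda) from measured values."""
    return sum(g * e for g, e in zip(richardson_coefficients(c), measured))


def noisy_expectation(lam):
    """Assumed toy model: E(lambda) = -1 + (3/10) lambda - (1/5) lambda^2."""
    return Fraction(-1) + Fraction(3, 10) * lam - Fraction(1, 5) * lam ** 2


lam = Fraction(1, 10)          # hypothetical base noise rate
c = [1, 2, 3]                  # rescaling factors c_j (n = 2)
gammas = richardson_coefficients(c)
estimate = extrapolate([noisy_expectation(cj * lam) for cj in c], c)
```

Because the toy model is a polynomial of degree $n = 2$, the extrapolation recovers its zero-noise value exactly; for a real device the remainder is of order $\lambda^{n+1}$, and statistical errors in the measured values are amplified by the magnitudes of the $\gamma_j$.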

Conclusion
Current and near-term quantum processors will most likely be limited to a few hundred, maybe a thousand qubits, and operate without quantum error correction. If the qubits and their control were ideal, the computational power of quantum devices with a couple hundred qubits would already dwarf that of any classical computer and could show quantum advantage. However, errors in the quantum operations reduce their computational power.
In this paper it is argued that a proper metric, such as the quantum volume, should be used to assess the computing power of a quantum processor and to compare different prototypes on a fair basis. With this metric, it becomes clear that not only the qubit number has to be increased, but also and even more importantly, the effective error rate needs to be significantly reduced before practical applications come within reach.
Simple estimates show that running an algorithm with a depth of one hundred on a hundred-qubit device requires an effective error rate of 0.01%. This number is not completely unrealistic, but shows the necessity to construct algorithms with short depth. Moreover, error-mitigation schemes using no or only a small number of extra ancilla qubits will be important to compensate systematic deviations in the computed result.
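The quoted number can be reproduced with a one-line estimate, assuming the achievable depth scales as $d \simeq 1/(n\,\epsilon_{\mathrm{eff}})$, i.e. that roughly one error occurs among the $n \cdot d$ gate locations of the circuit. This scaling is an assumption made here for illustration; the symbols $n$, $d$ and $\epsilon_{\mathrm{eff}}$ denote the qubit number, the circuit depth and the effective error rate.

```latex
\epsilon_{\mathrm{eff}} \lesssim \frac{1}{n\,d}
  = \frac{1}{100 \times 100}
  = 10^{-4}
  = 0.01\,\%
```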
Besides enlarging the quantum volume and reducing the effect of errors, it is essential to find suitable methods and algorithms to use quantum effects efficiently. We have discussed that a promising way forward is to consider hybrid quantum-classical architectures in which the quantum processor is used to generate trial quantum states that could not be stored in conventional memory. The variational quantum eigensolver method can be used to solve any type of problem that can be cast into a physical Hamiltonian. Constrained binary optimization problems can be described by an Ising-type Hamiltonian, whereas problems from the field of quantum chemistry or material science map into a more general spin Hamiltonian with more than longitudinal interactions among the spins.
For Ising-type Hamiltonian problems, it is not clear how much quantum speed-up can be expected, because many fast classical algorithms have already been developed [41]. In contrast, the Hamiltonian for chemistry and materials-related problems contains so-called non-stoquastic terms, which makes it difficult to solve these problems exactly on a classical computer. It is, therefore, believed that using a quantum processor will lead to exponential speed-up. The current state of the art encompasses proof-of-concept simulations of small molecules: In the context of superconducting qubits the hydrogen molecule has been simulated with two qubits [34,33,72] and larger molecules such as lithium hydride and beryllium dihydride have been simulated with up to six qubits [72]. As the size of the systems under study grows in electron number, so does the required number of qubits; for example, the simulation of the electronic structure of small organic molecules such as benzene and ethane [13] already requires tens to hundreds of qubits. In the case of strongly correlated electrons, even the simplest systems made of a few atoms, such as the chromium dimer [109], quickly become intractable for classical computers when accurate numerical solutions are required. To address strongly correlated problems of practical relevance such as the nitrogen fixation catalytic center in bacteria [27] or the iron-sulphur clusters in the respiratory chain protein complexes [110,111] (see Figure 7), quantum processors with a significantly increased quantum volume are needed. To achieve this, the capabilities of next-generation quantum processors have to improve along several directions:
(i) Improvement of coherence and qubit control, as well as development of error-mitigation schemes.
(ii) Hardware-efficient and problem-specific trial state preparation when using the variational quantum eigensolver.
(iii) Efficient circuit optimization by code optimizers and improved mappings from fermions to qubits.
(iv) Classical parameter optimization methods suited for the variational quantum eigensolver.
As for (i), the current best error rates of $\sim 10^{-4}$ for single-qubit and $\sim 10^{-3}$ for two-qubit gates in superconducting qubit architectures do not provide sufficient accuracy for more complex quantum calculations. The coherence time of qubits has to be improved, e.g., by improving fabrication techniques or chip designs. The control pulses for qubits and their interaction have to be optimized to avoid systematic gate errors. Any remaining errors have to be compensated by error-mitigation strategies.
As for (ii), trial states that require only a variation of a few parameters to prepare the targeted solution state are required. It is an open question how to construct suitable trial states for a general problem set. One may speculate that some combination of heuristic and problem-specific approaches is best suited for the variational quantum eigensolver, e.g., hardware-efficient trial wavefunctions that obey certain physical constraints, such as conserving the particle number in the quantum chemistry context. Moreover, enlarging the set of available gates, e.g. by exploring coupling primitives that allow different types of interactions between two or more qubits to be realized [112,113], is being considered as a way to create problem-specific trial states and render the VQE efficient.
As for (iii), different fermion-to-qubit maps have been proposed which do not require the creation of entanglement over the entire qubit space. Among the different variants of the Jordan-Wigner and binary-tree methods, one can envisage approaches that perform better in the presence of system-specific noise. Moreover, it may be possible to identify new maps into qubits that are especially suited for variational quantum eigensolvers and that can exploit, for instance, the use of additional ancilla qubits to further reduce the number and the complexity of the gates. Of particular interest is also the possibility to optimize quantum circuits using post-processing tools at compilation [27]. The use of high-level languages for the generation and the manipulation of quantum circuits will indeed offer the possibility to rationalize the qubit resources, thus reducing the circuit depth and therefore the time to solution.
As for (iv), specialized classical optimizers that can deal with large stochastic fluctuations resulting from queries to the quantum processor in the VQE are required. The possibility that optimization routines get trapped in false local minima, together with the effect of high noise, makes the robustness of optimizers critically important for near-term applications. Even the use of quantum-enhanced optimization schemes may be envisaged.
In conclusion, several promising approaches to make use of near-term devices with hundreds of qubits and limited coherence times have been developed. Overcoming the remaining challenges will allow us to solve tangible problems, most likely in quantum chemistry, material science or classical optimization.