Finding spin glass ground states using quantum walks

Quantum computation using continuous-time evolution under a natural hardware Hamiltonian is a promising near- and mid-term direction toward powerful quantum computing hardware. We investigate the performance of continuous-time quantum walks as a tool for finding spin glass ground states, a problem that serves as a useful model for realistic optimization problems. By performing detailed numerics, we uncover significant ways in which solving spin glass problems differs from applying quantum walks to the search problem. Importantly, unlike for the search problem, parameters such as the hopping rate of the quantum walk do not need to be set precisely for the spin glass ground state problem. Heuristic values of the hopping rate determined from the energy scales in the problem Hamiltonian are sufficient for obtaining a better quantum advantage than for search. We uncover two general mechanisms that provide the quantum advantage: matching the driver Hamiltonian to the encoding in the problem Hamiltonian, and an energy redistribution principle that ensures a quantum walk will find a lower energy state in a short timescale. This makes it practical to use quantum walks for solving hard problems, and opens the door for a range of applications on suitable quantum hardware.


Introduction
Optimization problems need to be solved in a broad range of areas, such as scheduling, route planning, supply chains, finance. This is often computationally intensive, so the prospect of quantum enhanced solution methods is an important research direction for practical quantum computing. One way to tackle optimization in a quantum setting is to use a device which realises an Ising Hamiltonian with a transverse field. Computing using the Ising Hamiltonian works as follows: the optimization problem is encoded into the Ising HamiltonianĤ Îˆˆˆ( on n qubits, such that the solution corresponds to the ground state ofĤ I . In our notation, the operatorẐ j on the full Hilbert space applies the single qubit Pauli-Z operatorẐ to the jth qubit, where  2 is the identity operator on a single qubit. The (real) values of the coupling strengths J jk and fields h j define the optimization problem, and efficient methods are known for expressing optimization problems in terms of these coupling and field strengths (e.g. Choi 2010). The transverse field termĤ T H X, 3 T j n j 0 1 drives transitions between states, where Γ is a real-valued transverse field strength, andX j is the operator on the full Hilbert space that applies the single qubit Pauli-X operator to the jth qubit, defined by analogy withẐ j in (2). The qubits are initialised in the ground state ofĤ T , this is easy to do by applying a strong transverse field to align all the qubits in the state | (| | ) +ñ = ñ + ñ -2 0 1 1 2 . Then, the computation is carried out by applying the full transverse Ising Hamiltonianˆ( where t is time and A(t), B(t) are real-valued control functions. To obtain a candidate solution to the optimization problem, the qubit register is measured after a time t f . For some problems, sampling from the distribution of low energy states provides the required solution-this can be done by repeating the computation, which will in general not produce the lowest energy state with certainty. The Ising Hamiltonian is a natural choice for encoding problems for two reasons. First, it is proven to be universal for classical problems (De las Cuevas and Cubitt 2016). There are efficient methods for mapping NPhard optimization problems to the Ising model (Choi 2010, Lucas 2014, providing a practical route to quantum algorithms. Since many optimization problems are NP-hard, an exponential speed up is not expected, but even modest polynomial improvements are useful for practical applications. There is increasing interest in how to obtain polynomial advantages through quantum algorithms (Moylett et al 2017, Montanaro 2018, Ambainis et al 2019. Interesting results have been presented for a wide range of applications, such as mathematics (Bian et al 2013, Li et al 2017, computer science (Chancellor et al 2016), computational biology (Perdomo-Ortiz et al 2012), finance (Marzec 2016), and aerospace (Coxson et al 2014). Second, the Ising Hamiltonian can be implemented in a range of different physical systems. The quantum Ising Hamiltonian is the basic interaction Hamiltonian in the D-Wave Systems Inc.programmable devices (Johnson et al 2011, Boixo et al 2013. Implementations in other promising architectures include Rydberg systems (Bernien et al 2017) and trapped ions (Kim et al 2011). The Ising Hamiltonian is also the basic tool for specialised optimization hardware, such as coherent Ising machines (Inagaki et al 2016, McMahon et al 2016. Optimization using the Ising Hamiltonian can be implemented in digital quantum architectures by using the quantum approximate optimization algorithm (QAOA) (Farhi et al 2014a, 2014b, Marsh and Wang 2019 or quantum alternating operator ansatz (Hadfield et al 2019). Studies by Zhou et al (2018) show how to exploit non-adiabatic effects in QAOA on early quantum hardware.
There are several known methods for driving the quantum system from its initial state into the ground state of a Hamiltonian defining the problem to be solved. These methods correspond to different choices for the control functions A(t) and B(t) in (4). Adiabatic quantum computing (Kadowaki and Nishimori 1998, Farhi et al 2000) keeps the quantum system in the ground state while the initial Hamiltonian is slowly changed into the problem Hamiltonian. Quantum annealing (Finnila et al 1994) takes advantage of open quantum systems effects to cool the system towards the ground state. Continuous-time quantum walks evolve the system under a time-independent Hamiltonian for a suitable time before measurement of the final state. Computation by continuous-time quantum walk and adiabatic quantum computing are end points of a family of continuoustime protocols that use the same Hamiltonian terms but are applied with different time dependent modulation (Morley 2019). In this work, we focus on computation by quantum walk using time-independent transverse Ising Hamiltonians.
Quantum walks can solve the search problem (Childs and Goldstone 2004), achieving the same quadratic O(N 1/2 ) quantum speed up as is obtained by Grover's algorithm (Grover 1996). We describe the search problem further in section 2.4. For particular graphs, quantum walks can solve problems exponentially faster (e.g. Childs et al 2003), and quantum walks are now widely used as subroutines in more complex quantum algorithms. However, in the continuous-time setting, the application of quantum walks to optimization problems has not been studied in detail. There is increasing interest in quenches (Amin et al 2018) or pauses (Marshall et al 2019, Passarelli et al 2019 in quantum annealing, which effectively run an open-system version of a quantum walk during part of the computation. Thermal relaxation effects dominate in the regime currently accessible by flux qubit quantum annealers, which is the focus of these works. An algorithm which is essentially a quantum walk on a spin glass, although presented using different terminology, has been analysed by Hastings (2019). Along with the same energy conservation arguments we describe in section 6.2, Hastings' findings suggest that quantum walks on spin glasses will be interesting to explore. Given that quantum walks provide a better performance for searching than adiabatic quantum computing, especially when limited coherence time and other practical factors, such as precision of control settings, are considered (Morley 2019), it is important to understand how they perform for a wider range of problems.
In this work, we tackle the question of if, and how, a quantum walk can be useful for practical quantum optimization. We present a detailed numerical investigation of continuous-time quantum walks applied to solving combinatorial optimization problems, using the Sherrington-Kirkpatrick spin glass ground state problem as a prototypical example. Finding the ground state of a frustrated Sherrington-Kirkpatrick spin glass (Sherrington and Kirkpatrick 1975) is known to be not only NP-hard, but also uniformly-hard, as suggested by its finite-temperature spin glass transition. Without a finite temperature spin glass transition, a problem cannot be uniformly hard, since the lack of a transition implies that typical cases will be easy for the Monte Carlo family of algorithms, as discussed in Katzgraber et al (2014). As has been shown for a random problem type used in early benchmarks of quantum annealing hardware (Katzgraber et al 2014), uniform hardness is crucial: without this property, randomly generated instances of NP-hard problems are not necessarily hard to solve (Beier and Vöcking 2003, Krivelevich and Vilenchik 2006, Lucas 2014. We use a random energy model (Derrida 1980) for comparisons, to draw out the effects of the correlations between energy difference and Hamming distance in the spin glass. A problem with perfect correlations is easy to solve, like finding the ground state of a spin system with only local fields, no couplings. A completely random problem, such as finding the ground state of a random energy model instance, has no correlation to exploit and so is very hard to solve, essentially requiring random guessing. However, a completely random model is fully characterised by average values of its properties, and finding exact ground states of specific instances is typically not interesting. Intermediate problems with some correlations are both hard and interesting, with complex behaviour and phase diagrams, like spin models with frustration and spin glass phases. Real optimization problems typically have correlations; they are often hard to solve but also produce interesting solutions. The inherent complexity of a problem comes from the structures of the problem and its correlations, not the structure of the solution itself. One illustration of this is the construction of hard benchmarking problems with 'planted' solutions defined at the time of construction, which therefore have no special structure related to the problem's hardness, see for example Hen (2019), Hamze et al (2019).
The paper is structured as follows: in section 2, we review the setting for computation by continuous-time quantum walk encoded into qubits, including application to the search problem. In section 3, we introduce the Sherrington-Kirkpatrick spin glass model, and the random energy model we use for comparison. In section 4, we describe the numerical methods used in this investigation. In section 5, we present the main results showing how quantum walks can find spin glass ground states more effectively than a quantum search algorithm. In section 6, we identify the computational mechanisms and important aspects of the problem structure that contribute to the effectiveness of quantum walk computation. Finally, in section 7, we summarize and conclude.

Computing with quantum walks
Both discrete (coined) quantum walks (Aharonov et al 2001, Shenvi et al 2003 and continuous-time quantum walks Gutmann 1998, Childs et al 2003) are used for computation. This work only uses the continuous-time quantum walk, and also only as an encoded quantum walk, in which qubits are used to store the binary labels of the positions of the quantum walker (see figure 1 for a simple example).

Continuous-time quantum walks
A continuous-time quantum walk is defined on an undirected graph G(V, E), with 1 the set of N vertex labels and E the set of label-pairs ( j, k) associated with edges. The vertices correspond to the positions of the walker, and the edges indicate the allowed transitions between vertices. This is conveniently encoded in the adjacency matrix A of the graph, which has entries and A jk =0 otherwise. The Laplacian Figure 1. A 3-dimensional hypercube (a cube) graph in which the vertices are labeled by the 2 3 =8 computational basis states of 3-qubits, and the edges connect the states with Hamming distance 1 (single spin flips).
of G is = -L A D, where D is a diagonal matrix formed from the degree of each vertex, j deg is the number of edges connected to vertex j. Both the adjacency matrix A and Laplacian L are symmetric matrices which can thus be used to define a quantum Hamiltonian for the dynamics of the continuous-time quantum walk on the graph. In this work, we only need regular graphs, for which ( ) j deg is constant with respect to j. For regular graphs, the only difference between using the adjacency matrix A or Laplacian L is an irrelevant global phase (Childs and Goldstone 2004). We use the Laplacian form of the Hamiltonian for consistency with prior work. We thus define the quantum walk HamiltonianĤ G for a quantum walk on graph G by where γ is the hopping rate between connected vertices per unit time. The states | | ñ ñ j k , for Î j k V , are associated with the vertices of G and form a basis for a Hilbert space of dimension N. In the Ising model context, the dimension of the Hilbert space is = N 2 n where n is the number of qubits, and {| } ñ = j j N 0 1 is the computational basis. For a quantum walk starting in state | ( ) y ñ 0 , the state of the walker evolves according to the Schrödinger equation, with formal solution Ht exp i 0 , 6 G using units in which =  1.

Computing using a quantum walk
The task is to solve an optimization problem whose N=2 n candidate solutions j are represented in the computational basis , where j is a bit string corresponding to the state of n qubits. The problem is encoded in an Ising HamiltonianĤ P , of the form described byĤ I in (1) and whose eigenbasis is the computational basis. We write the basis state with eigenvalue Next, we choose a suitable walk graph G. The main requirement is that the ground state of the quantum walk HamiltonianĤ G coincides with the initial state, either biased or unbiased (see section 6.2). A simple way to achieve a biased starting state would be to 'tilt' the driver fields so they are no longer completely transverse. We only treat the unbiased case in this work, so our initial state will be | ( ) y ñ 0 throughout. The full Hamiltonian ( ) g H is defined by adding the quantum walk HamiltonianĤ G to the problem HamiltonianĤ P where the key parameter is the hopping rate γ inĤ G , see (5). The computation is performed by evolving the initial state (7) under the full Hamiltonianˆ( ) g H for a time t f , then measuring the qubit register in the computational basis. The intuition, based on the faster spreading of quantum walks over classical found in prior work (Farhi and Gutmann 1998), is that the quantum walk dynamics provide rapid exploration of the basis states, while the energy structure of the problem HamiltonianĤ P causes localisation around low-energy states.
The success probability ( ) | | ( )| ( ) y = á ñ P t E t f P f 0 2 of finding the solution state when measuring will not in general be unity. It will typically be necessary to repeat the protocol multiple times to obtain a high probability of success over all the repeats. In general, it will be best to use different measurement times t f for each repeat. Different measurement times will produce different success probabilities P(t f ), and varying the measurement time avoids repeatedly measuring at a time for which the probability P(t f ) happens to be atypically small. More precisely, we choose the measurement time t f uniformly at random in an interval [ ] , and define an average single run success probability¯( t t t f f Operationally, choosing the measurement time t f randomly in the interval [ ] samples success probabilities from the distribution with¯( ) D P t t , as its mean. Sampling measurement times in this way means that the protocol typically needs to be repeated¯( ) D M P t t 1 , rep times to achieve an overall O(1) success probability. Note that it is not generally possible to check whether the state measured is indeed the ground state ofĤ P . However, it is easy to calculate the energy of the state measured in each repeat. If only the lowest energy state is accepted, it is only necessary for the ground state ofĤ P to be measured once out of all the repeats. The more repeats, the more confidence is gained that the lowest energy state found is the ground state. And studying the distribution of the sampled energies can provide more information about the problem.
The procedure described in this subsection does not in general provide an optimal quantum algorithm, because the repeats do not use information gained from the outcomes of previous runs. We will discuss this further in section 7; for most of this paper we are concerned with understanding the average single run success probability, as an essential prerequisite to building optimal algorithms.
In the limit of small interval width Δt, the average success probability defined in (9) reduces to the single time probability ( )¯( The long time limit of this average, lim 0, , 10 t is particularly useful, because it can be calculated via a numerical diagonalization of the Hamiltonian (see section 4) and it predicts the short time average well (see section 5.3). In this paper, we will often use the long time average ¥ P as an indication of the success probability achievable in a single run, and thus the number of repeats required to achieve O(1) success probability overall. We will separately address the timescale required to reach this probability in each run.

Graph choice for quantum walk computing
There are many graph-based Hamiltonians with the initial state | ( ) y ñ 0 defined in (7) as the ground state. A common choice is the complete graph K, in which every vertex is connected to every other. This graph has the quantum walk HamiltonianĤ K that couples every computational basis state | ñ j state to every other, The complete graph is useful because it makes some algorithms analytically tractable (see, e.g. Childs and Goldstone 2004). However, for implementation on qubit-based hardware, the complete graph is not in general practical, requiring higher order interaction terms than the transverse Ising term (3). In this qubit setting, an implementation of the complete graph requires a sum over every one-body term (e.g.X j ), every two-body term (e.g.ˆX X j k ), every three-body term (e.g.ˆˆX X X j k l ) ... up to the n-body term = -X j n j 0 1 , a total of N terms. One-and two-body terms are relatively easy to implement, since they correspond to Hamiltonians found naturally. Terms in three or more Pauli-X operators are much more difficult and generally require extra qubits to engineer in real physical systems.
A more natural choice of graph for qubits is the hypercube. The n-bit labels are associated with the vertices of the graph such that the edges correspond to flipping one bit, as illustrated in figure 1. The hypercube quantum walk HamiltonianĤ h on n qubits is composed of single-body terms WithĤ h as the graph Hamiltonian, the full quantum walk computational Hamiltonianˆ( ) g H defined in (8) is a transverse Ising Hamiltonian in the form ofĤ TI in (4), with the control functions A(t) and B(t) kept constant throughout the computation.
In this work, we predominantly use the hypercube graph, with some comparisons made with the same problems on the complete graph.

Solving the search problem using quantum walks
The simplest example of an algorithm in this continuous-time quantum walk setting is the search problem. The problem is to find the marked state, a single bit-string { } Î m 0, 1 n out of N=2 n possible bit strings. Finding a marked state was shown to have a quantum algorithm with a speed up over classical algorithms by Grover (1996). To map this problem to the continuous-time Hamiltonian setting, the marked basis state | ñ m is given one less unit of energy than all the rest of the basis states, by defining the problem HamiltonianĤ S aŝ By construction, the problem HamiltonianĤ S has the marked state | ñ m as its ground state. The continuous-time quantum walk search problem has been analytically solved (Childs and Goldstone 2004) for several different walk graphs. For the complete graph and the hypercube graph, a quantum speed up is obtained for carefully chosen optimal values of the hopping rate γ. For the complete graph HamiltonianĤ K , the optimal value is ( ) , while for the hypercube Hamiltonian,Ĥ h , the optimal hopping rate ( ) g h opt is given by = -n r n r n r is the binomial coefficient. For a quantum speed up, the hopping rate must be set to ( ) g h opt as defined by (14) with high precision. It has been shown (Morley 2019) that the fractional tolerance to misspecification of the optimal hopping rate ( ) g h opt falls as ( ) -O N 1 2 . The measurement time must also be chosen appropriately. In the limit of large problem size N, the marked state can be found with unit success probability, 1 2 . This corresponds to a quadratic speed up compared to the best classical algorithm. Due to the absence of structure in the search problem specifically, such a quadratic speed up has been proven to the best possible quantum speed up (Bennett et al 1997).
The variation of P(t f ) with t f is shown in figure 2(a) for search on hypercube graphs of size N=2 30 (i.e. n=30 qubits) and N=2 11 (i.e. n=11 qubits), using the optimal hopping rate ( ) g h opt . The sinusoidal oscillations of the probability P(t f ) occur because the quantum walk is performing Rabi oscillations between the initial state and the marked state. The two lowest energy levels of the full Hamiltonianˆ( ) g H with varying γ undergo an avoided level crossing at ( ) g h opt and the associated eigenstates | ( ) are approximately the orthogonal equal superpositions of the starting state and marked state, (Childs and Goldstone 2004). These simple, two-level dynamics describe the quantum walk solution to the search problem well for large problem size N: the oscillations in the N=2 30 case have no visible irregularities. For smaller sizes, finite-size effects due to population of higher energy levels are apparent: the oscillations in the = N 2 11 case have lower probability peaks and show some irregular behaviour, such as the small dip on the first peak. These finite-size effects are further illustrated in figure 2(b), which shows the instantaneous success probability P(t f ) at the asymptotically optimal and numerically determined best times, as well as the infinite-time average success probability ¥ P defined in (10). All three probabilities show a pronounced dip around n=8 qubits, with smooth behaviour only settling in for n > 12 qubits. Figure 2(b) also shows that the infinite-time probability ¥ P asymptotes to a half. Hence, a quantum walk search with a random measurement time should on average only need to be repeated twice to locate the marked state; knowing the exact time to measure for the optimal success probability is not necessary for the success of the algorithm. Fixed point quantum search algorithms (Yoder et al 2014, Dalzell et al 2017 are another approach that avoids the need to know how long to run the algorithm for.
The search problem in the continuous-time quantum computing setting has two important drawbacks. Firstly, implementing the problem HamiltonianĤ S directly on n qubits requires O(2 n ) terms of products of up  (14). (a) The probability P(t f ) that a measurement at time t f results in successfully finding the marked state | ñ m for two different numbers n=30 (red, dashed line) and n=11 (orange, solid line) of qubits (i.e. problem sizes N = 2 30 and N = 2 11 respectively). (b) Comparison of instantaneous success probabilities P(t f ), at the asymptotically optimal (blue squares, dashed line) and numerically determined best (red circles, solid line) measurement times t f , and the infinite time average success probability ¥ P (green triangles, dotted line) defined in (10). to n Pauli-Z operators, similar to the problem with implementing the complete-graph HamiltonianĤ K , defined in (11), on qubits. Implementing higher order Pauli-Z terms can be done using extra qubits as 'gadgets', e.g. Jordan and Farhi (2008). An alternative type of gadget, specifically for permutation-symmetric problems like search, is given in Dodds et al (2019), building on classical problem mapping techniques in Chancellor et al (2016Chancellor et al ( , 2017. Secondly, it is impossible to map the problem Hamiltonian to qubits without specifying the solution outright. Hence, the search problem serves as a useful toy problem, especially in contexts where having analytic, computational, and physical implementations available for comparisons facilitates benchmarking and other testbed procedures.

Spin glass problem definitions
In this work we focus on spin glass problems that have features in common with real life hard optimizations problems and, unlike the search problem, do not admit analytic solutions. The search problem solved by quantum walk provides useful comparisons with these spin glass problems.

Sherrington-Kirkpatrick spin glass
The Sherrington-Kirkpatrick (SK) spin glass Hamiltonian H SK (Sherrington and Kirkpatrick 1975) is defined on n spins as ) and the couplings J jk are drawn independently from the normal distribution ( ) m s  , SK 2 with mean μ and variance s SK 2 . Finding the ground state of this Hamiltonian is NP-hard (Choi 2010), and uniformly hard, due to its finite-temperature phase transition (Sherrington and Kirkpatrick 1975).
It is computationally convenient to break the spin inversion symmetry by adding single-body field terms of the form å = h S j n j k 0 1 , where h j are the field strength values. Like the couplings J jk , the fields h j are also drawn independently from ( ) m s  , SK 2 . When the fields strengths h j are drawn from the same distribution as the coupling strengths J jk , the hardness of finding the ground state follows directly from the hardness of the h j =0 case. The SK spin glass with such fields is mathematically equivalent to a zero field spin glass with one more spin which is 'fixed' in one orientation. This is not true in general for different distributions of field strength h j . There are known examples in which fields can destroy spin glass behaviour (see, e.g. Young andKatzgraber 2004, Feng et al 2014). In particular, if the field strengths are much larger than the coupling strengths (| | | |  h J j j k for all j, k), then the energy is minimized trivially when all the spins each minimize the energy with respect to their individual fields. While the distribution of field strengths could be used to tune the problem hardness, we do not use it in this way here, and only consider cases where the field and coupling strengths are drawn from the same distribution.
An astute reader will notice that if one effectively un-fixes the spin which corresponds to the fields (thus making all states two fold degenerate and converting the system to a double cover of the orignal system), these couplings will effectively be on average stronger by a factor of 2 . As this increase in coupling strength does not scale with the number of spins, it is going to become less and less significant as the size of the system is scaled up the hardness will be preserved.
The mapping into the quantum Ising model is almost trivial: the classical spin variables S j are simply mapped to Pauli-Z operators. Thus, the problem HamiltonianĤ SK becomeŝˆˆˆ( The SK problem Hamiltonian differs from the search problem by having structure, produced by theˆẐ Z j k terms As a result, the covariances between the energies of two basis states depends on the Hamming-distance between them (Baldwin and Laumann 2018). Knowing the energy of one state gives some information about the energy of states that differ by a small number of bit-flips. This results in a distribution of the eigenenergies that is almost normal (as can be seen by plotting the distributions and numerically calculating moments), but which deviates from normal in the tails of the distribution.

Random energy model
To isolate the effect of the correlations in the SK problem, we compare it with the random energy model (REM) (Derrida 1980), in which the eigenenergies themselves are independently drawn from a normal distribution. The problem HamiltonianĤ REM for REM is . REM has a similar energy level distribution to that of SK, apart from the tails. By definition it lacks the correlations: knowing the energy of one state gives no information about the energies of other states. Comparison between these two models highlights the effect of the pairwise structure in the SK model.

Numerical methods
The main tool used for the investigations in this work is numerical simulation. We are studying computationally hard problems for which there are no tractable analytical solutions except in special cases.
For each number of qubits 5n20 we generated 10 000 random instances of the SK spin glass Hamiltonian, defined in (16), with the couplings J jk and fields h j drawn with a standard deviation s w = SK SK , where ω SK is an arbitrary energy unit. The value ω SK =5 was used for computational convenience. We also generated 10 000 random instances of the REM Hamiltonian, defined in (17), for each number of qubits 5n15, with normally-distributed energies F j drawn with a standard deviation σ REM =ω REM . The value ω REM =1 was used for computational convenience. Note that choosing any arbitrary constant for ω will only affect overall time and energy scales by a constant factor, and the energy unit ω SK has been scaled out of the plots where relevant.
The key quantity to determine numerically is the probability that the ground state is found by running a quantum walk computation on each spin glass instance. It is particularly convenient to compute the infinitetime probability ¥ P given by (20), for sizes where full diagonalization is possible. Writing the spectral expansion of the full computational quantum walk Hamiltonian aŝ and | ( ) g ñ E a the eigenstate with eigenvalue E a (γ), we can write the instantaneous probability in terms of the spectral expansions as Assuming no degeneracy (that is, all gaps E a -E b are nonzero), which is justified for the randomized nature of the SK and REM problems, the oscillatory terms cancel in the infinite limit (because ( ) , and the plotting has been done using matplotlib (Hunter 2007). The dynamical simulations have been performed by computing the action of the propagator (ˆ( )) g -tH exp i on the initial state | ( ) y ñ 0 , using the sparse matrix functions within SciPy when possible. For the more computationally demanding analyses, we were limited to n11 by the computational resources available. Where relevant, figures in this paper have error bars included. However, in most cases the error bars are much smaller than the size of the marker symbols used and so are not visible. This is due to the size of the data sets (10k instances per value of n), which provides a good level of accuracy for the average quantities.
Simulations were run on the Imperial and Durham University high performance computing facilities. The data for all the instances used is available on a permanent data archive (Chancellor et al 2019).

Quantum walks with spin glasses
In order to implement a quantum walk algorithm for finding the ground states of the spin glasses defined in section 3, we follow the procedure described in section 2.2: choose a quantum walk graph G and associated HamiltonianĤ G , and add the spin glass Hamiltonian to get the full computational quantum walk Hamiltonian ( )ˆĝ = + H H H G P , whereĤ P refers toĤ SK orĤ REM as appropriate. Since the hypercube is the natural choice of graph for qubit implementations, we use this graph, with quantum walk HamiltonianĤ h defined in (12), unless otherwise indicated. For the initial state | ( ) y ñ 0 , we use the equal superposition (7), which is the ground state of the hypercube HamiltonianĤ h .

Setting the hopping rate
In contrast to the search problem, for SK and REM it is impossible to efficiently calculate the optimal hopping rate ( ) g h opt that maximizes the success probability. It is not even clear which measure of success probability should be maximized because, unlike the search problem, there will be no efficient way to find the optimal measurement time ( ) t f opt for any choice of hopping rate γ. To bootstrap the investigation, we choose to define the optimal hopping-rate ( ) g h opt with respect to one of the average probabilities defined in (9); in particular, we choose the hopping rate that maximizes the infinite-time average probability ¥ P defined in (10). We make this choice because the infinite-time average probability ¥ P is numerically convenient to calculate, and because it has been seen to be a relevant measure of probability in the search example, see figure 2(b). We will see in section 5.3 that the probability ¥ P typically agrees well with probabilities averaged over shorter and more practical time windows.
Some plots of the infinite-time probability ¥ P against hopping rate γ for typical 11-qubit examples of the SK and REM are shown in figure 3. Note that the maximal success probability varies by an order of magnitude between the two problem-types, with REM highest and SK lowest. While the optimal hopping rate ( ) g h opt is instance-dependent, these plots show that the dependence of infinite-time probability ¥ P on hopping rate γ is typically characterised by broad, bumpy peaks for SK, and by narrow, well-defined peaks for REM. This implies that a precise value of the hopping rate γ is needed for REM, while there is some tolerance to non-optimal values of the hopping rate γ for SK for the sizes that we have studied.
To investigate the success probability more systematically, we performed a brute-force numerical search to find the optimal hopping rate ( ) g h opt that maximizes the success probability ¥ P for each spin glass instance from the data sets of 10k random instances for 5n11. This gives a baseline maximum average single run success probability for the quantum walk algorithm.
The optimal hopping rates ( ) g h opt correspond to the best a quantum walk algorithm on the hypercube can possibly do in a single run. For practical algorithms, we need a heuristic method for choosing the hopping rate that can be calculated from the known parameters. For the quantum walk search algorithm, the optimal hopping rate balances the energy between the two components of the Hamiltonian,Ĥ P andĤ G . Guided by this, we define the heuristic hopping rate ( ) g h heur for SK and REM such that it balances these overall energyscales on average. We match the energy-spread ( )  heuristic hopping rate ( ) g h heur by To demonstrate that this heuristic is sensible, we compare in figure 4 the distributions of optimal hopping rates ( ) g h opt for SK (blue) and REM (red), as well as the heuristic hopping rates (black, dashed and dotted lines for SK and REM respectively) calculated according to (21), for the 11-qubit data set. For both SK and REM, the heuristic hopping rate ( ) g h heur falls in the centre of the ( ) g h opt distributions. Note that the SK distribution is much broader than for REM: not only are the individual peaks for ( ) g h opt for SK much broader than for REM (figure 3), but the distribution of the maxima of those peaks is also much broader ( figure 4). This may seem to be a problem for specifying a heuristic value for γ for SK from average energies, but as we will show, it is actually REM that fails for the heuristic γ, while SK works well.
For a normal distribution of energy levels, the average problem energy spread can be estimated as  (22) is accurate for REM (which has normally-distributed energy levels by definition) but, as already noted, the distribution of the eigenenergies in SK deviates from normal, especially in the tails. Numerically, we find that there is a multiplicative constant factor of approximately 0.887 that corrects the formula in (22) for SK for the effects of the non-normal tails. For the numerical analysis, we use the numerically calculated average energy-spread at each number of qubits n. Figure 5(a) compares the heuristic hopping rate ( ) g h heur and average optimal hopping rate ( ) g á ñ h opt at different numbers of qubits 5n11. The full width at half maximum (FWHM) has also been calculated for each instance, to estimate the tolerance ( ) g D h opt to deviations from the optimal hopping rate ( ) g h opt (illustrated in figure 3). The width of the shaded regions in figure 5(a) corresponds to the average tolerance range ( ) g áD ñ h opt at each number n of qubits. While the heuristic hopping rate differs slightly from the the average optimal hopping rate for SK, the average tolerance range opt is much broader, and does not shrink with increasing number of qubits n. For REM, however, while we see close agreement on average, the tolerance range shrinks quickly with the number of qubits n as the peaks (as in figure 3, right) become narrower. This means that the heuristic hopping rate ( ) g h heur is more likely to lie further than opt outside of the actual probability peak for each instance, even though it agrees well with the average optimal hopping rate ( ) g á ñ h opt . Consequently, a quantum walk with the heuristic hopping rate ( ) g h heur does not perform well for most REM instances. It is instructive to quantify this sensitivity to deviations from the optimal hopping rate ( ) g h opt . Figure 5(b) shows log-linear and log-log plots of the average fractional tolerance range  ) of the numerically-found optimal hopping rates ( ) g h opt scaled by the energy unit w P for the 10 000 11-qubit instances of SK (blue) and REM (red). The dashed and dotted lines show the heuristic hopping rate ( ) g h heur , calculated according to (21), for SK and REM respectively (also scaled by ω P ).
Thus, we see that REM behaves like the search problem in a quantum walk setting. For a precisely optimal hopping rate ( ) g h opt , the success probability is high, but this instance-dependent hopping rate is hard to predict, unlike for the analytically tractable quantum walk search algorithm. Without this precise hopping rate, quantum walks perform no better than guessing for the search problem and for REM. In contrast, quantum walks applied to SK give a better-than-guessing success probability > ¥ P N 1 for the heuristic hopping rate ( ) g h heur calculated according to (21).
With the conditions under which we can achieve a better-than-guessing success probability characterised for the three problem types, SK, REM, and search, we turn to the scaling of this success probability with problem size N. Figure 6 shows how the single-time success probability P(t f ) varies with the measurement time t f for two typical 11-qubit examples of SK and REM. In the REM case, the behaviour is similar to that shown in figure 2(a) for search: an oscillatory nature indicating the dominance of a two-level avoided-crossing feature, but with evidence of the population of other energy-levels that lead to finite-size effects in search. For REM, these finite-size effects are more pronounced, and are instance-dependent. The random nature of the REM problems means there is not such a clear cut off size, as there is for the search problem, above which finite size effects are negligible. In any case, based on search, we expect finite size effects to be significant at n=11. For SK, the behaviour is quite different from search or REM. There is no indication of dominant oscillatory behaviour; instead, these plots show unpredictable, highly instance-dependent fluctuating dynamics for all the sizes we are using. This indicates that for SK, the behaviour is determined by the excitation of many energy levels.

Success probability
As with finding a suitable hopping-rate γ, both REM and SK differ from the search problem in that there is no practical way to find the optimal measurement time ( ) t ; f opt a different approach must be taken instead. As already noted for the search problem, this can be handled by using the time averaged probabilities defined in (9). We first consider the infinite-time probability ¥ P , as defined in (10), since it is easy to calculate (see section 4). Figure 7 shows the average infinite-time success probability á ñ ¥ P against the number n of qubits for the two problems using both the optimal ( ) g h opt and heuristic ( ) g h heur hopping rates. For SK, this gives exponential decay with the number of qubits n in both cases: the average probability á ñ ¥ P changes with n according to whereÕ may neglect factors logarithmic in its argument. That is, using the heuristic hopping rate ( ) g h heur instead of the optimal hopping rate ( ) g h opt has only a minor impact on the average success probability á ñ ¥ P .
For REM, the behaviour is quite different. With the optimal hopping rate ( ) g h opt we see a success probability ¥ P of constant order but with a pronounced dip. This behaviour is similar to that seen for the search problem, where the dip seen in figure 2(b) is a finite-size effect. This similarity is expected, given the similarity between the dynamical behaviour shown in figure 2(a) for search and in figure 6(b) for REM. With the heuristic hopping rate ( ) g h heur for REM, we see a significantly reduced success probability ¥ P compared to the optimal case. That is, the heuristic is performing poorly, despite the good agreement shown in figure 5(a).
The clear difference in behaviour between SK and REM can be explained by the different tolerances ( ) g D h opt to deviations from the optimal hopping rate ( ) g h opt shown in figures 5(a) and (b). For SK, the tolerance range is broad enough for the heuristic to lie within it, while for REM the heuristic hopping rate ( ) g h heur almost always misses this range entirely even though it is close to the average optimal hopping rate ( ) g á ñ h opt .  respectively. Red, right: log-linear plot of the same quantities for REM. In this case, the probability stays at constant order for the optimal rate and decays for the heuristic rate.

Mixing times
We have thus numerically determined an average success probability scaling with problem size of˜( ) -O N 0.42 for a quantum walk finding SK spin glass ground states, using the heuristic hopping rate ( ) g h heur . This is based on the infinite time-success probability ¥ P , i.e. uniform sampling from the distribution of all possible run times. We now investigate the time dependence in more detail: can we sample from a finite run time and still obtain the same speed up? Since P(0)=1/N corresponds to random guessing, there must be a minimum time before which it is not effective to measure.
We define a mixing-time ( ) t  mix to be the latest time, t, for which the time averaged probabilities¯( ) P t 0, and ( ) P t 0, 2 at the two times t and 2t differ by a fraction greater than the fluctuation parameter ò, This definition of ( ) t  mix is based on similar definitions found in prior work (Aharonov et al 2001), with modifications for computational convenience. We numerically estimated the mixing-time ( ) t mix 0.05 for each SK instance up to n=11 qubits, using the optimal hopping rate ( ) g h opt for each instance. We simulated the quantum walk computation dynamics for a successively-doubling duration until a time at which the condition is met was reached. The fluctuation parameter ò=0.05 corresponds to a deviation of 5%. To verify that the mixing-time ( ) t mix 0.05 correctly captures the relevant dynamical timescale, we also numerically estimated it for the search problem at each system size from n=5 to n=30 qubits. The search problem using continuous-time quantum walks can be mapped to the symmetric subspace, allowing larger sizes to be analysed. The mixing-time ( ) t mix 0.05 for search exhibits the expected exponential timescale: the solid green line of best fit in figure 8(a) has the expected scaling with problem size N of˜( ) 1 2 . For search, the scaling is dominated by the run time, the success probability is O(1). However, this behaviour only emerges clearly above n∼20. Below this, the behaviour is influenced by the finite-size effects that arise due to population of higher energy levels. This means it is not useful to analyse the behaviour of the REM time scaling, finite size effects mask the scaling behaviour for computationally tractable sizes. However, unlike search and REM, the SK behaviour is influenced by higher energy levels at all sizes, through the frustration provided by the random couplings between the spins. Hence, we do not expect to see such finite-size effects in SK; the behaviour is already dominated by the frustration at small sizes. Figure 8 Thus it contributes a logarithmic factor to the overall scaling. We emphasise that while this single-run timescale is polynomial in the number of spins n, the overall timescale is still exponential in n due to the exponential number of repeats required to achieve O(1) success probability.
To confirm the subsidiary nature of the time scaling for each SK run, we show in figure 9 a log-plot comparing, for the heuristic hopping rate ( ) g h heur , the success probability ¥ P in the infinite-time case (as in figure 7) and in the case of an early, logarithmically-scaling (with respect to N) measurement window . This n 0.5 scaling of the window is even shorter than the fitted scaling of n 0.75 , although at these sizes the difference is not significant. This finite-time probability This should be compared with (23), where the value of the exponent for the average infinite time success probability ¥ P with the heuristic hopping rate ( ) g h heur is given by −0.417±0.002. As the dominant factor in the total runtime comes from the required number of repeats, and because the single-run timescale contributes only a logarithmic factor, these results constitute good numerical evidence for an average total runtime which scales with problem size N as˜( )O N 0.41 for using quantum walks to find spin glass ground states, over the range of N in our data sets. This scaling is a better than the best possible (quadratic) speed up achievable for quantum walk search algorithms. Moreover, it comes without the requirement for exponential precision in setting the hopping rate that renders practical use of quantum walk searching difficult for large problems. We now present some insights into where the improvement over search comes from.
6. Computational mechanisms 6.1. Role of correlations in SK To investigate whether the energy correlations with Hamming distance in SK play a significant role in the computational process of finding the ground state with a quantum walk, we performed three additional sets of numerical tests.
Firstly, we used the same SK instances but performed the quantum walk using a complete graph HamiltonianĤ K , defined in (11), instead of the hypercube graph HamiltonianĤ h . This removes the correspondence of Hamming-distance between classical states with the distance between those states on the graph-for the complete graph, every state is one unit (edge) away from every other state. In terms of the Hamiltonian, the transverse Ising term is replaced by sums of products of up to n Pauli-X operators that flip up to n qubits at the same time, in all possible combinations. For each SK instance up to n = 11, we estimated the optimal hopping rate ( ) g K opt for the complete graph, and then used it to calculate the infinite-time probability ¥ P . Secondly, we constructed 'scrambled SK' instances, denoted sSK, by randomizing which state corresponds to which energy in the SK instances. In doing so, we arrive at Hamiltonians with identical energy spectra to the SK instances, but without the correlations between energy difference and Hamming distance on the hypercube graph. This approach has similarities with previous work , Hen 2014. For each sSK instance, we estimated the optimal hopping rate ( ) g h opt , which is different from that used for the ordinary SK versions. This hopping rate was then used to calculate ¥ P . Thirdly, we sorted the eigenenergies of each REM instance in increasing size and assigned them to the computational basis states in the order of a binary-reflected Gray code on their bitstrings, to arrive at a problem denoted REMGC. In doing so, we added some amount of Hamming-distance structure by ensuring that the closest energies are assigned to states that differ by only a single bit-flip. For each REMGC instance, we estimated an optimal hopping rate ( ) g h opt , which is different from that used for the ordinary REM problem. This was used to calculate the infinite-time probability ¥ P . While REMGC is not a hard problem as defined, it provides a useful example to compare with how the quantum walk finds the ground state of a SK spin glass.
These three variants provide separate tests of the influence of the graph structure (choice of quantum walk Hamiltonian) and problem structure (pairwise correlations in SK). Figure 10 shows how the infinite-time probability ¥ P varies with the number of qubits n for these three variants, alongside SK and REM on a hypercube graph from figure 7. The variation of ¥ P with the number of qubits n for the five variants is clearly split into two groups, behaviour like REM and search on the one hand, and behaviour like SK on the other. Removing the correlations from SK by scrambling the energies (sSK) results in behaviour like REM and search. Moreover, removing the correspondence between distance and Hamming weight by using the complete graph instead of the hypercube also changes the SK problem behaviour to be like REM and search. In the opposite direction, inserting pairwise correlations into REM via a Gray code (REMGC) results in problems that are much more like SK than like the REM problems on a hypercube graph.
From this, we infer that the problem structure-in this case the pairwise correlations in SK-needs to be matched by a compatible driver Hamiltonian-in this case the hypercube/transverse Ising-to obtain better than quadratic scaling. This type of local structure in the solution space is exploited in many classical algorithms. For example, classical Monte Carlo optimizations that use a single bit flip update rule are naturally using this hypercube structure. Using a complete graph instead would correspond to flipping a random number of bits, which is equivalent to guessing at each step.

Energy conservation dynamics
Continuous-time quantum walk time evolution is unitary, and there is no time dependence in the Hamiltonian that can lead to energy gain or loss by the system. Hence, it is important to consider how it can find a lower energy state than it starts in (with respect toĤ P ) with any better-than-guessing probability. For the search problem, this happens through an analog of Rabi flopping (see figure 2), cycling between the initial and solution states. However, the dominant avoided level crossing structure is not present in the spin glasses to provide this mechanism.
We now show that there is a very generic mechanism (also described independently by Hastings 2019) that relies on starting in the ground state of the quantum walk part of HamiltonianĤ G . Letˆ (  As | ( ) y ñ 0 is chosen to be the ground state ofĤ G , the lhs must be non-negative. Furthermore, as | ( ) y ñ 0 is not an eigenstate ofˆ( ) g H , some dynamics are guaranteed to occur and so the LHS must become positive at early times. Therefore, the rhs must also be non-negative always and positive at early times. Thus, taking any final time t f , we get the inequalityˆˆ( (30) shows that performing time evolution under the computational quantum walk Hamiltonian from the initial state | ( ) y ñ 0 is guaranteed to lower the energy of the system with respect toĤ P (the expectation valueˆ( ) á ñ y H P t ). This implies that the overlap with low energy eigenstates ofĤ P will increase, at least for short times. A measurement in the computational basis will thus be on average more likely than a random guess to produce a low energy state.
Starting in a low energy state is thus important for the success of the quantum walk algorithm (we have checked this numerically). It also implies that encoding prior information into the initial state will help, provided this is given in the form of a lower energy state than the uniform superposition state. This could be the final state from a previous run, for example, which will be explored further in Nita et al (2020). It is also necessary to bias the quantum walk Hamiltonian so that its ground state matches this biased initial state. Since this starting state is a known computational basis state, it is possible to do this biasing for suitably designed hardware.
For many optimization problem applications, it is helpful to find a low energy state, even if it is not actually the true ground state. From this point of view, that quantum walks necessarily lower the expectation energy with respect to the problem Hamiltonian is very appealing as a computational mechanism. This argument by itself does not provide a guaranteed scaling or quantum speed up, but it does explain how the quantum walk dynamics work in this setting, where there is no way to lose (or gain) energy. It is possible to generalise these arguments beyond time-independent Hamiltonians (Callison et al 2020a), to include monotonic functions A(t) and B(t) in (4).
To illustrate this energy redistribution mechanism, the plots in figure 11 show how the expectation valuê ( ) á ñ y H G t of the quantum walk Hamiltonian (green solid-line) and the expectation valueˆ( ) á ñ y H P t of the problem Hamiltonian (red solid-line) vary during a quantum walk. We have included the instantaneous success probability P(t) (faint grey) to show that the timescale used is long enough for significant dynamics to take place. A typical 10-qubit SK example is shown in figure 11(a) and a typical 10-qubit REM example is shown in figure 11(b), both on the hypercube using their respective optimal hopping rates ( ) g h opt . Also shown is the ground state eigenvalueˆ( ) á ñ H P E P 0 of the problem Hamiltonian (red, dashed-dotted line) and the ground state eigenvaluê ( ) á ñ y H G 0 of the quantum walk Hamiltonian (green, dashed line). In both SK and REM, the initial evolution takes the state away from theĤ G ground state, raising theĤ G expectation value, and thereby lowering theĤ P Figure 11. The expectation valueˆ( ) á ñ y H G t of the quantum walk Hamiltonian (green, thin solid line) and the expectation valueˆ( ) á ñ y H P t of the problem Hamiltonian (red, thick solid line) for a typical 10 qubit (a) SK and (b) REM instance. The ground state energy eigenvalues of the quantum walk Hamiltonian (green, dashed line) and problem Hamiltonians (red, dashed-dotted line) are also shown. To illustrate that significant dynamics take place over the timescales used, the instantaneous probabilities P(t) are also shown (grey, faint line). The energy values are on the left axes, while probability values are on the right axes. expectation value to a point around which it fluctuates for the duration simulated. This clearly shows the energy redistribution mechanism at work, and the short time scale over which it appears.

Summary and outlook
In this work, we have shown numerically that continuous-time quantum walks are a viable computational method for finding ground states of hard spin glass problems. We have produced strong numerical evidence for a better-than-search polynomial quantum speed up over random guessing, with a scaling of the average single run success probability˜( ) -O N 0.41 , using data sets of size 5n20 spins (32N1048 576). Moreover, and importantly, this is obtained without the need to set parameters exponentially precisely, as is required for quantum walk search algorithms. The hopping rate γ, that determines the relative strengths of the quantum walk and problem Hamiltonians, can be estimated from the overall energy scales, which are determined by the hardware and encoding of the problem.
To explain why quantum walks are able to do better than quantum searching in this case, we compared variants on the spin glass problems that remove or add pairwise correlations, and compared the hypercube graph quantum walk Hamiltonian with the complete graph quantum walk Hamiltonian. This showed that the combination of pairwise correlations in the encoding of the problem and a matching single spin flip quantum walk Hamiltonian is required to exploit the correlations. The single spin-flips driven by the transverse field termsX j in the hypercube quantum walk Hamiltonian are the correct operators for the pairwise interaction termsˆẐ Z j k in the spin glass Hamiltonian. A single spin flip on either qubit j or k changes the energy for that term from high to low, or vice versa. Since we can choose how to encode the problems into the Hamiltonians, and there are known methods to convert higher order terms to pairwise terms (Bremner et al 2002, Dattani 2019, we can arrange to use this mechanism both for its computational advantages and practicality for hardware implementation as the transverse Ising Hamiltonian.
To explain how quantum walks are able to find low energy states when the closed quantum dynamics have no mechanism for losing energy, we showed how starting in the ground state of the quantum walk part of the Hamiltonian guarantees dynamics that decrease the expectation value of the energy with respect to the problem Hamiltonian. This also ensures that prior information can be provided by starting in lower energy states, from which improved solutions can be found. Exploiting this process will allow an optimal quantum algorithm to be built from multiple quantum walk runs that use the information gained from prior runs. Performing multiple quantum walk runs in early, noisy quantum hardware is a more viable approach than maintaining coherence for sufficiently accurate adiabatic algorithms. Quantum walks may also be simpler to implement since they do not require time dependent controls. This work thus provides a significant advance in understanding how to exploit quantum walks in practical hardware for optimization problems.
It is likely that further insights into the computational effectiveness of quantum walks in this transverse Ising Hamiltonian setting are to be found in current knowledge of spin glass phases in the presence of transverse fields. The spin glass transition itself is not fully understood, in neither the quantum nor classical case (see, e.g. Parisi 1980, Fisher and Huse 1987, 1988, Thirumalai et al 1989, Larson et al 2013, Magalhaes et al 2017, Young 2017. However, the phases of interest for computation are not the spin glass phases themselves, but the phases where transitions between states are still occurring at a rapid enough rate to find solution states. Extremely long equilibration timescales are a defining property of all glass phases, including spin glasses (Bouchaud et al 1998, Cugliandolo 2002. Since the equilibration (mixing) times ( ) t  mix we find in section 5.3 for the SK spin glass only scale polynomially with the number of spins, it is most likely that at the optimal hopping rates ( ) g h opt , our quantum walks are not in a finite size precursor to a spin glass phase, but rather in a precursor to a paramagnetic phase, for which equilibration times can be fast. Given that the system should localize more in lower energy states for smaller transverse fields, it is reasonable that our optimal hopping rates ( ) g h opt occur near the edge of the precursor to the spin glass phase. Furthermore, the mild scaling of the width ( ) g D h opt of the peak around the optimal hopping rate ( ) g h opt suggests that the regime where quantum walks performs well may correspond to the second paramagnetic phase observed in Magalhaes et al (2017). Polynomial gaps have been found around the spin glass-paramagnetic phase transition in a related model in Knysh (2016).
A numerical study such as this inevitably leaves open questions regarding the asymptotic scaling of the problems. In particular, we observed a range of hardness in the SK data sets and future work will investigate what fraction of the instances are actually hard for classical algorithms. Forthcoming work applying similar techniques to Max2SAT (Callison et al 2020b) will characterise the hardness of small random instances in more detail, and establish quantum walks as an effective tool for hard optimization problems more generally. While general methods are known to speed up the best classical algorithms (Hartwig et al 1984) for this type of problem (Montanaro 2018, 2019), further work is required to determine whether an optimal continuous-time quantum walk algorithm can be devised that fully leverages the advantage from the correlations. Nonetheless, our work represents a significant advance in developing continuous-time quantum walk computation for hard optimization problems, and provides key insights into the computational mechanisms that can be exploited over short timescales, well-suited to the limited coherence times of noisy, intermediate scale quantum hardware.