Tight Quantum Depth Lower Bound for Solving Systems of Linear Equations

Since Harrow, Hassidim, and Lloyd (2009) showed that a system of linear equations with $N$ variables and condition number $\kappa$ can be solved on a quantum computer in $\operatorname{poly}(\log(N), \kappa)$ time, exponentially faster than any classical algorithm, improvements and applications of their algorithm have been extensively investigated. The state-of-the-art quantum algorithm for this problem is due to Costa, An, Sanders, Su, Babbush, and Berry (2022), with optimal query complexity $\Theta(\kappa)$. An important remaining question is whether parallelism can bring further optimization. In this paper, we study the limitation of parallel quantum computing on this problem. We show that any quantum algorithm for solving systems of linear equations with time complexity $\operatorname{poly}(\log(N), \kappa)$ has a lower bound of $\Omega(\kappa)$ on the depth of queries, which is tight up to a constant factor.


Introduction
Quantum linear systems problem. Since the discovery of the celebrated quantum algorithm for solving systems of linear equations by [1], it has been applied in various fields, e.g., machine learning [2], quantum chemistry [3], and finance [4].
The Quantum Linear Systems Problem (QLSP) is to prepare an $\varepsilon$-approximation to the quantum state $|x\rangle$ that is proportional to $A^{-1}|b\rangle$, given access to matrix $A$ and vector $|b\rangle$. Since hard instances for QLSP are known by taking $|b\rangle = |0\rangle$ in [1], we formally state (the special form of) QLSP as follows for simplicity.
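Problem 1 (QLSP). Suppose that $A \in \mathbb{C}^{N \times N}$ is an Hermitian matrix with known condition number $\kappa > 0$ such that $I/\kappa \le A \le I$. Let $|x\rangle = A^{-1}|0\rangle / \|A^{-1}|0\rangle\|$. Given quantum query access to $A$, the goal is to prepare a quantum state $|\tilde{x}\rangle$ such that $\||\tilde{x}\rangle - |x\rangle\| \le \varepsilon$ with probability at least $2/3$. We use $\mathrm{QLSP}(N, \kappa, \varepsilon)$ to denote the problem with the chosen parameters.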
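As a classical reference point for Problem 1, the target state can be computed directly with dense linear algebra when $N$ is tiny. The following is only an illustrative sketch: the helper names `random_instance` and `qlsp_target`, and the random construction of $A$, are assumptions made for this example and do not appear in the paper.

```python
import numpy as np

def random_instance(N, kappa, seed=0):
    """Hypothetical helper: a random Hermitian A with spectrum in [1/kappa, 1]."""
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
    U, _ = np.linalg.qr(G)                    # a random unitary
    eigs = np.linspace(1.0 / kappa, 1.0, N)   # enforces I/kappa <= A <= I
    return (U * eigs) @ U.conj().T            # U diag(eigs) U^dagger

def qlsp_target(A):
    """Normalized solution |x> = A^{-1}|0> / ||A^{-1}|0>||."""
    b = np.zeros(A.shape[0], dtype=complex)
    b[0] = 1.0                                # |b> = |0>
    x = np.linalg.solve(A, b)
    return x / np.linalg.norm(x)

A = random_instance(N=8, kappa=10.0)
x = qlsp_target(A)
```

This dense computation costs time polynomial in $N$, whereas the quantum algorithms discussed here aim for $\operatorname{poly}(\log(N), \kappa)$ and return the solution only as a quantum state.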
Here, two types of quantum query access to a matrix are often considered in the literature: quantum query access to sparse matrices and to block-encoded matrices. The former assumes a quantum oracle $O_A$ that computes each entry of an $O(1)$-sparse matrix $A$ (given the row and column indices) and a quantum oracle $O_s$ that computes the index of each non-zero entry in each row (given the row index); the latter assumes a quantum oracle $U_A$ that is a block-encoding of a matrix $A$ (not necessarily sparse), i.e., roughly speaking, $A$ is encoded in the upper left corner of the unitary operator $U_A$. Let $Q_{\mathrm{sparse}}(N, \kappa, \varepsilon)$ and $Q_{\mathrm{block}}(N, \kappa, \varepsilon)$ denote the quantum query complexity for $\mathrm{QLSP}(N, \kappa, \varepsilon)$ with access to sparse and block-encoded matrices, respectively.
Since quantum access to sparse matrices can be converted to quantum access to block-encoded matrices as noted in [5], it naturally holds that $Q_{\mathrm{sparse}}(N, \kappa, \varepsilon) = O(Q_{\mathrm{block}}(N, \kappa, \varepsilon))$. For simplicity, here we only consider the quantum query complexity for QLSP in terms of sparse matrices. The first quantum algorithm for QLSP proposed in [1] is based on quantum phase estimation [6,7,8], resulting in a query complexity of $\widetilde{O}(\kappa^2/\varepsilon)$ and a time complexity of $\operatorname{poly}(\log(N), \kappa, 1/\varepsilon)$, where $\widetilde{O}(\cdot)$ suppresses logarithmic factors; they also gave an $\Omega(\kappa^{1-\delta} \operatorname{polylog}(N))$ quantum time lower bound for QLSP. Shortly after, the query upper bound was improved to $\widetilde{O}(\kappa/\varepsilon^3)$ in [9] by variable-time amplitude amplification, with an almost optimal dependence on $\kappa$. An exponential improvement in the dependence on $\varepsilon$ was obtained in [10] via the Linear-Combinations-of-Unitaries (LCU) technique [11,12] combined with quantum walks for Hamiltonians, achieving a query complexity of $O(\kappa \operatorname{polylog}(\kappa/\varepsilon))$. Subsequent works then focused on optimizing the logarithmic factors in the complexity. The query complexity for QLSP was improved to $O(\kappa \log(\kappa)/\varepsilon)$ in [13] based on the adiabatic randomization method; later, it was improved to $O(\kappa \log^2(\kappa) \log^4(\log(\kappa)/\varepsilon))$ in [14] based on the time-optimal adiabatic method, and to $O(\kappa(\log(\kappa)/\log(\log(\kappa)) + \log(1/\varepsilon)))$ in [15] based on Zeno eigenstate filtering. In [16], it was shown that $\Omega(\kappa)$ queries are required to solve QLSP even if the matrix is positive definite; they also identified a class of positive definite matrices for which efficient quantum algorithms with query complexity $O(\sqrt{\kappa})$ exist. Recently, the logarithmic factors of $\kappa$ were finally removed in [17], resulting in a query complexity of $O(\kappa \log(1/\varepsilon))$, which is optimal according to the lower bound $\Omega(\kappa \log(1/\varepsilon))$ claimed in the forthcoming work [18].
We summarize the developments of QLSP in Table 1.
Table 1: Developments on the quantum query complexity for QLSP.

Quantum query-depth lower bound: $0.249\kappa$ (this work).
Parallel quantum computation. The power of parallelism in quantum computing first attracted researchers' attention due to a fast parallel circuit of the quantum Fourier transform [19], which can be used to implement Shor's factoring algorithm [20] with logarithmic quantum depth and classical polynomial-time pre- and post-processing. This breakthrough was later implemented in 2D nearest-neighbor quantum architecture [21], and was extended to finding discrete logarithms on ordinary binary elliptic curves [22]. The parallel version of Grover search [23] and its extensions were also studied in [24,25,26,27,28]. Inspired by this, an optimal parallel quantum query algorithm for element distinctness [29] was proposed in [30]. Recently, an unconditional quantum advantage with constant-depth circuits was discovered in [31]. This surprising result was further enhanced in [32,33,34,35,36,37]. A constant-depth quantum circuit for multivariate trace estimation was given in [38]. A low-depth Hamiltonian simulation was proposed in [39] for a class of Hamiltonians; also, a depth lower bound $\Omega(t)$ was recently shown in [40] for Hamiltonian simulation for time $t$. Optimal space-depth trade-offs were found for CNOT circuits [41] and quantum state preparation [42,43,44,45].
In complexity theory, the complexity classes $\mathsf{QNC}$ and $\mathsf{QACC}$, the quantum analogs of classical classes $\mathsf{NC}$ and $\mathsf{ACC}$ (for problems efficiently computable in parallel), were first defined in [46] and [47], respectively, and they were further studied in the literature [48,49,50,51,52]. Inspired by the discovery of the classical-quantum hybrid algorithm for factoring [19], Jozsa [53] and Aaronson [54] raised the open problem of whether any polynomial-time quantum algorithm can be simulated by a polynomial-time classical algorithm interleaved with low-depth quantum computation. This problem aims to compare the computational power of $\mathsf{BQP}$, $\mathsf{BPP}^{\mathsf{BQNC}}$, and $\mathsf{BQNC}^{\mathsf{BPP}}$. Recently, this problem was answered by [55,56], giving an oracle separation of $\mathsf{BQP}$ from either $\mathsf{BPP}^{\mathsf{BQNC}}$ or $\mathsf{BQNC}^{\mathsf{BPP}}$, which was further improved by [57,58,59].

Main results
As mentioned above, a number of quantum algorithms turned out to have low-depth circuit implementations, which have the potential to be realized in the NISQ era [65]. One may wonder if systems of linear equations can be solved in parallel on a quantum computer. We study the limitation of quantum parallelism on this problem. We use $Q^{\parallel}_{\mathrm{sparse}}(N, \kappa, \varepsilon)$ and $Q^{\parallel}_{\mathrm{block}}(N, \kappa, \varepsilon)$ to denote the quantum query-depth complexity for QLSP with quantum access to sparse matrices and to block-encoded matrices, respectively. Here, the quantum query-depth complexity for QLSP means the minimal depth (with respect to queries) of the quantum circuit that solves QLSP in quantum time $\operatorname{poly}(\log(N), \kappa, 1/\varepsilon)$ (see Section 2.3 for the formal definition).
Our main result is as follows.
We compare our result with the known bounds in Table 1. As a corollary, we can also obtain a tight depth lower bound for QLSP with quantum access to block-encoded matrices.
Regarding the range of the constant ε in Theorem 2 and Corollary 1, it is worth noting that if one can solve QLSP to a constant precision ε 0 = Θ(1), then one can improve this to arbitrarily small precision ε at an additional cost of O(κ log(1/ε)) via "eigenstate filtering" [15].
Constant factor in quantum complexity. It was shown in [1] that any quantum query algorithm for QLSP has time complexity $\Omega(\kappa^{1-\delta} \operatorname{polylog}(N))$ for any $\delta > 0$. Then, it was shown in [16] that $\Omega(\kappa)$ queries are necessary for QLSP even if the matrix is positive definite. Recently, a matching quantum query lower bound $\Omega(\kappa \log(1/\varepsilon))$ for QLSP was claimed in the forthcoming work [18]. These results did not consider the explicit constant factor hidden in their lower bounds. As a corollary, our quantum depth lower bounds for QLSP imply quantum query lower bounds with an explicit multiplicative constant factor as follows.
Proof. This is straightforward by noting that any depth lower bound is also a query lower bound, e.g., $Q_{\mathrm{sparse}}(N, \kappa, \varepsilon) \ge Q^{\parallel}_{\mathrm{sparse}}(N, \kappa, \varepsilon)$. Taking the bounds in Theorem 2 and Corollary 1 leads to the conclusion.

For comparison, a quantum query upper bound for block-QLSP, given as Equation (1), was recently derived in [17,66]. Combining it with the lower bound given in Corollary 2, we conclude that the quantum query complexity $Q_{\mathrm{block}}(N, \kappa, \varepsilon)$ grows linearly in $\kappa$ as $\kappa \to \infty$, with a constant factor bounded from both sides. Here, we bound the ratio $Q_{\mathrm{block}}(N, \kappa, \varepsilon)/\kappa$ from both sides with explicit constants, though the range still remains large. The right-hand side of Equation (1) can be improved to $56.0\kappa + 1.05 \log(1/\varepsilon)$ due to the very recent work [67].

Techniques
The idea for proving the lower bounds is to reduce to QLSP a computational problem that is hard to solve in parallel on a quantum computer. That is, the reduction means that any low-depth quantum algorithm for QLSP can be used to solve this hard problem in parallel. Previously, the quantum time lower bound $\Omega(\kappa^{1-\delta} \operatorname{polylog}(N))$ for QLSP was derived in [1] by reducing the simulation of $\mathsf{BQP}$ quantum circuits to QLSP. In our case, we choose to reduce the permutation chain problem, which was introduced in [40] and recently shown to be hard for parallel quantum computing. The permutation chain problem is to find the final element after a chain of permutations is applied to the initial element 0; that is, to find $\Pi_q(0) = (\pi_q \circ \cdots \circ \pi_2 \circ \pi_1)(0)$, given quantum query access to the (0-indexed) permutation $\pi_j$ of size $N = 2^n$ for every $1 \le j \le q = \operatorname{poly}(n)$ (see Section 2.4 for the formal definition). It was shown in [40] that any quantum algorithm for permutation chain with time complexity $\operatorname{poly}(n)$ has depth $\Omega(q)$. To obtain a depth lower bound for QLSP, we need to encode the permutation chain into a system of linear equations (see Section 3.1 for the explicit construction). Our construction is based on that of [1]. In [1], the condition number $\kappa$ of the constructed matrix is bounded by the size of the simulated quantum circuit, whereas in our case $\kappa$ is bounded by the length $q$ of the permutation chain. The difference is that since we only consider the query complexity but not gate complexity here, our construction of the system of linear equations is stand-alone and does not directly relate to quantum computing.
Let $A$ be the Hermitian matrix that encodes the permutation chain as above. It can be shown that measuring the normalized solution $|x\rangle \propto A^{-1}|0\rangle$ can produce the desired result $\Pi_q(0)$ with a constant probability of, say, at least 0.01. Using the standard success probability amplification, we can obtain $\Pi_q(0)$ with probability at least $2/3$ by $O(1)$ repetitions of a QLSP solver. Therefore, any QLSP solver should have depth $\Omega(q)$ due to the hardness of the permutation chain, where $q = \Theta(\kappa)$, so the two parameters can be used interchangeably.
We note that the permutation chain problem was also used in [40] to derive the quantum depth lower bound for Hamiltonian simulation, but our idea and techniques are different from theirs.For comparison, they used a graph-to-Hamiltonian reduction which solves the permutation chain using a quantum walk on a line [68]; in our case, we construct a QLSP encoding of a permutation chain, which modifies the construction of [1].
Towards explicit constant factors. To derive an explicit (multiplicative) constant factor in the complexity, we first observe that the constructed matrix $A$ has condition number $\kappa \le 2.001q$ for sufficiently large $\kappa$. Second, each of the quantum oracles $O_s$ and $O_A$ for sparse matrix $A$ can be implemented using only 1 query to the quantum oracle $O_\pi$ for the permutations $\pi_1, \pi_2, \ldots, \pi_q$. Finally, combining the lower bound $q/2$ for the permutation chain [40] (see Theorem 3) with the above construction, we conclude that any QLSP solver should have depth $\ge \kappa/4.002 \ge 0.249\kappa$ as stated in Theorem 2 for sparse-QLSP.
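The arithmetic behind this constant can be spelled out as follows. It only restates the facts quoted in the paragraph above (condition number $\kappa \le 2.001q$, and depth at least $q/2$ for the permutation chain); the symbol $D$ is an ad hoc name, introduced here, for the query depth of a sparse-QLSP solver.

```latex
D \;\ge\; \frac{q}{2}
  \;\ge\; \frac{1}{2}\cdot\frac{\kappa}{2.001}
  \;=\; \frac{\kappa}{4.002}
  \;\approx\; 0.24988\,\kappa
  \;\ge\; 0.249\,\kappa .
```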
For block-QLSP, the depth lower bound is obtained through the construction of block-encoding from the sparse input model given in [5]. Specifically, our constructed matrix is 2-sparse, and thus its (scaled) unitary block-encoding can be implemented using 4 queries to the sparse oracles $O_s$ and $O_A$ in total. From this, we can establish a connection from the complexity of block-QLSP to that of sparse-QLSP with explicit multiplicative constant factors. The constant factor in Corollary 1 is then obtained after some detailed error analysis.

Discussion
We study the limitation of parallel quantum computing for solving systems of linear equations and give a matching quantum depth lower bound $\Omega(\kappa)$. This means that quantum algorithms for QLSP cannot be parallelized in general if one hopes to retain the quantum exponential speedup in the dimension $N$ of the matrix. We conclude by mentioning some open questions regarding the quantum complexity for solving systems of linear equations.
1. Since the constant factor of $\kappa$ still remains in a large range as shown in Equation (1), an immediate question is whether we can tighten the range of the constant factor. The constant factor is important in the near future as QLSP is one of the most promising applications of quantum computing, with detailed running costs very recently analyzed in [66].
2. In this paper, we only obtain a depth lower bound in terms of the condition number $\kappa$. Can we derive a (joint) depth lower bound in terms of the precision $\varepsilon$ (and condition number $\kappa$)?
For reference, a tight quantum query lower bound $\Omega(\kappa \log(1/\varepsilon))$ for QLSP has been claimed in the forthcoming work [18].
3. In Theorem 2, the depth lower bound holds when the dimension of the matrix $A$ is exponential in $\kappa$. This requirement is due to the reduction from the permutation chain problem (Problem 4) to QLSP in Section 3.3, where the negligibility of the probability in Equation (9) requires a symmetric group of exponentially large degree. Given this drawback, the lower bound in Theorem 2 holds only for $\kappa = O(\operatorname{polylog} N)$ (see Remark 1). On the other hand, the quantum query lower bound $\Omega(\kappa)$ for QLSP given in [16] holds for all $\kappa \le N$. An interesting question is whether we can broaden the range of $\kappa$ in the quantum depth lower bound for QLSP.

Organization
We will introduce the necessary preliminaries in Section 2. The quantum depth lower bounds for sparse-QLSP and block-QLSP will be derived in Section 3 and Section 4, respectively.

Preliminaries
We first introduce basic notations, then define the quantum complexity for QLSP, and, finally, include some useful lemmas.

Block-encoding
Block-encoding is a useful concept to describe quantum operators encoded as blocks in other quantum operators.
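Definition 1 (Block-encoding). Suppose that $A$ is an $n$-qubit operator, $\alpha, \varepsilon \ge 0$ and $a \in \mathbb{N}$. An $(n+a)$-qubit unitary operator $U$ is said to be an $(\alpha, a, \varepsilon)$-block-encoding of $A$ if
$$\left\| A - \alpha \left(\langle 0|^{\otimes a} \otimes I\right) U \left(|0\rangle^{\otimes a} \otimes I\right) \right\| \le \varepsilon,$$
where $\|\cdot\|$ is the operator norm.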
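To make the definition concrete, the following NumPy sketch builds a one-ancilla unitary whose upper-left block is a given Hermitian matrix $A$ with $\|A\| \le 1$, and then evaluates the block-encoding error with $\alpha = 1$ and $a = 1$. The dilation $U = \begin{pmatrix} A & \sqrt{I - A^2} \\ \sqrt{I - A^2} & -A \end{pmatrix}$ is a standard textbook construction chosen for illustration. It is not the construction of [5] used later, and the function names are assumptions of this example.

```python
import numpy as np

def dilation(A):
    """One-ancilla unitary with A in its upper-left block (A Hermitian, ||A|| <= 1)."""
    w, V = np.linalg.eigh(A)
    # sqrt(I - A^2) in the eigenbasis of A; clip guards against tiny negative rounding.
    S = (V * np.sqrt(np.clip(1.0 - w**2, 0.0, None))) @ V.conj().T
    return np.block([[A, S], [S, -A]])

def block_encoding_error(U, A, alpha=1.0):
    """Operator norm of A - alpha * (<0| (x) I) U (|0> (x) I), ancilla as leading qubit."""
    n = A.shape[0]
    return np.linalg.norm(A - alpha * U[:n, :n], ord=2)

A = np.array([[0.5, 0.2],
              [0.2, -0.3]])
U = dilation(A)
err = block_encoding_error(U, A)   # numerically zero: U is a (1, 1, 0)-block-encoding of A
```

Because $A$ and $\sqrt{I - A^2}$ commute, $U$ is Hermitian with $U^2 = I$, hence unitary.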

Quantum query access
Let $A \in \mathbb{C}^{N \times N}$ be an Hermitian matrix with $\|A\| \le 1$. There are two main types of input models for quantum query access to $A$.
Sparse input model. Suppose $A$ is an $s$-sparse Hermitian matrix, i.e., there are at most $s$ non-zero entries in each row and column of $A$. The sparse input model consists of two quantum oracles $O_s$ and $O_A$. Here, $O_s$ computes the column index $l_{j,k}$ of the $k$-th non-zero entry in the $j$-th row of $A$ for $1 \le j \le N$ and $1 \le k \le s$, i.e.,
$$O_s |j\rangle|k\rangle = |j\rangle|l_{j,k}\rangle, \qquad (2)$$
and $O_A$ computes the entry $A_{j,k}$ in the $j$-th row and $k$-th column of $A$ for $1 \le j \le N$ and $1 \le k \le N$, i.e.,
$$O_A |j\rangle|k\rangle|0\rangle = |j\rangle|k\rangle|A_{j,k}\rangle. \qquad (3)$$
Here, we assume that $A_{j,k}$ is in a binary representation and, for simplicity, we assume that the binary representation is exact.
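Classically, the two oracles of the sparse input model amount to the two lookups below. This is a plain illustration under the assumption that the matrix is stored as a dense array; the names `oracle_s` and `oracle_A` are hypothetical, indices are 0-based rather than 1-based, and a genuine quantum oracle would act reversibly on basis states instead of returning values.

```python
import numpy as np

def oracle_s(A, j, k):
    """Column index of the k-th non-zero entry in row j (0-based j and k)."""
    cols = np.flatnonzero(A[j])
    return int(cols[k])

def oracle_A(A, j, k):
    """Entry of A in row j and column k."""
    return A[j, k]

# A 2-sparse Hermitian example matrix.
A = np.array([[0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])

# Enumerate the non-zero entries of row 0 as (column, value) pairs.
row0 = [(oracle_s(A, 0, k), oracle_A(A, 0, oracle_s(A, 0, k))) for k in range(2)]
```

Reading one non-zero entry thus takes one call to each oracle, which is the access pattern that query counts in this model refer to.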
Block-encoded input model. The block-encoded input model consists of a quantum oracle $U_A$ and its inverse and controlled versions. Here, $U_A$ is a $(1, a, 0)$-block-encoding of $A$ for some $a = \operatorname{polylog}(N)$. In the following, we give a construction of block-encoding from the sparse input model.

Quantum query complexity for QLSP
A quantum query algorithm $\mathcal{A}$ given access to certain quantum oracles is described by a quantum circuit
$$\mathcal{A} = G_Q U_Q \cdots G_1 U_1 G_0,$$
where each $U_j$ is a (controlled-)oracle and each $G_j$ is a quantum gate independent of the input oracles. The quantum query complexity of $\mathcal{A}$ is defined to be $Q(\mathcal{A}) = Q$, and the quantum time complexity of $\mathcal{A}$ is defined to be $T(\mathcal{A}) = Q + \sum_{j=0}^{Q} C(G_j)$, where $C(G_j)$ denotes the number of one- and two-qubit quantum gates to implement $G_j$. Now that quantum query algorithms for QLSP with time complexity $\operatorname{poly}(\log(N), \kappa, 1/\varepsilon)$, and even $\operatorname{poly}(\log(N), \kappa, \log(1/\varepsilon))$, are known, e.g., [1,10], achieving an exponential speedup in $N$ over classical algorithms, we are only concerned with those quantum algorithms with such exponential speedup. We consider two variants of QLSP as follows.
Problem 2 (Sparse-QLSP). Let $A \in \mathbb{C}^{N \times N}$ be an $O(1)$-sparse Hermitian matrix with $I/\kappa \le A \le I$. Given quantum oracles $O_s$ and $O_A$ for quantum query access to $A$, the task is to solve $\mathrm{QLSP}(N, \kappa, \varepsilon)$ for $A$.
Problem 3 (Block-QLSP). Let $A \in \mathbb{C}^{N \times N}$ be an Hermitian matrix with $I/\kappa \le A \le I$. Given quantum oracle $U_A$ that is a block-encoding of $A$, the task is to solve $\mathrm{QLSP}(N, \kappa, \varepsilon)$ for $A$.
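If a quantum query algorithm $\mathcal{A}$ given quantum oracle $O$ can be described as
$$\mathcal{A} = F_D V_D \cdots F_1 V_1 F_0,$$
where $V_j = O^{\otimes k} \otimes I$ for some $k$ (we also call such $V_j$ a $k$-parallel query) or its controlled version, and $F_j$ is a quantum gate independent of the oracle $O$, then the quantum query-depth complexity (namely, the quantum depth complexity with respect to queries) of $\mathcal{A}$ is defined to be $Q^{\parallel}(\mathcal{A}) = D$, and the quantum time complexity is defined to be $T(\mathcal{A}) = kD + \sum_{j=0}^{D} C(F_j)$. The quantum query-depth complexity for QLSP is defined as the minimum of $Q^{\parallel}(\mathcal{A})$ over all quantum query algorithms $\mathcal{A}$ that solve QLSP in quantum time $\operatorname{poly}(\log(N), \kappa, 1/\varepsilon)$.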

Permutation chain
A formal definition of the permutation chain problem is given as follows.
Problem 4 (Permutation chain). Suppose that $N$ is a positive integer and $S_N$ is the symmetric group of degree $N$. Let $q$ be a positive integer such that $q = \operatorname{polylog}(N)$. Let $\pi_1, \pi_2, \ldots, \pi_q \in S_N$ be (0-indexed) permutations of size $N$. Suppose we are given the quantum oracle $O_\pi$, defined by Equation (4) for every $1 \le j \le 2q$ and $0 \le x < N$. The goal is to compute $\Pi_q(0)$, where $\Pi_q = \pi_q \circ \cdots \circ \pi_2 \circ \pi_1$. We use $\mathrm{PermChain}(N, q)$ to denote the problem with the chosen parameters.
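A classical evaluation of the permutation chain is a single loop in which every lookup depends on the previous answer, which is the sequential structure the depth lower bound exploits. The sketch below assumes the composition order $\pi_1$ first and $\pi_q$ last, matching the description of a chain of permutations applied to the initial element 0; the function name `perm_chain` and the example permutations are made up for illustration.

```python
def perm_chain(perms, start=0):
    """Return pi_q(...pi_2(pi_1(start))...) for 0-indexed permutations given as lists."""
    x = start
    for p in perms:
        x = p[x]   # one lookup per step; step j needs the output of step j - 1
    return x

# Three permutations of {0, 1, 2, 3}, i.e., N = 4 and q = 3.
perms = [[1, 2, 0, 3],
         [0, 3, 1, 2],
         [2, 0, 3, 1]]
answer = perm_chain(perms)   # 0 -> 1 -> 3 -> 1
```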
Recently, the quantum hardness of the permutation chain in terms of circuit depth was shown in [40] for the average case.
Theorem 3 (Permutation chain, [40,Theorem 5.2]).Suppose that N is a positive integer.Let q = O(polylog(N )) and k = O(polylog(N )) be positive integers.For any quantum query algorithm A for PermChain(N, q) using ⌊(q − 1)/2⌋ k-parallel queries to O π , we have That is, in the average case, A finds the answer Π q (0) with probability O(q k/N ).
Quantum Depth Lower Bound for Sparse-QLSP
In this section, we will derive the quantum depth lower bound for sparse-QLSP stated as follows.
Theorem 4. For every constant $0 < \varepsilon < \frac{e^{-4} - e^{-6}}{1 - e^{-6}} \approx 0.015$, we have
In other words, Theorem 4 means that for every $\delta > 0$, there is a large enough real number $\kappa$ such that
Taking $\delta = 0.001$ will produce the introductory Theorem 2. We will prove Theorem 4 in the following subsections. Here, we sketch the main idea of the proof.
1. We first encode the problem $\mathrm{PermChain}(N, q)$ as a system of linear equations $A|x\rangle = |0\rangle$ in Section 3.1.
2. Then, we construct the quantum oracles O s and O A (in the sparse input model) for A using the quantum oracle O π for the permutation chain in Section 3.2.
3. Finally, we reduce PermChain(N, q) to QLSP with certain parameters in Section 3.3.

Encoding permutation chain by linear systems
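Let $\pi_1, \pi_2, \ldots, \pi_q \in S_N$ be (0-indexed) permutations of size $N$. Then, we consider the system of linear equations described by an Hermitian matrix $A$ such that
$$A = |0\rangle\langle 1| \otimes \left(I - e^{-1/q} P\right) + |1\rangle\langle 0| \otimes \left(I - e^{-1/q} P^{-1}\right), \qquad (5)$$
where $P$ is a $3qN \times 3qN$ permutation matrix defined by
$$P|j\rangle|x\rangle = \begin{cases} |(j+1) \bmod 3q\rangle|\pi_{j+1}(x)\rangle, & 0 \le j < q, \\ |(j+1) \bmod 3q\rangle|x\rangle, & q \le j < 2q, \\ |(j+1) \bmod 3q\rangle|\pi^{-1}_{3q-j}(x)\rangle, & 2q \le j < 3q, \end{cases}$$
for $0 \le j < 3q$ and $0 \le x < N$. The construction of $A$ in Equation (5) is inspired by [1], where linear systems were used to describe the simulation of quantum computation in the quantum circuit model. For comparison, our construction encodes a concrete problem, $\mathrm{PermChain}(N, q)$, which is not directly related to quantum computing. It can be shown that $A$ satisfies the following basic properties.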
Proof. It is easy to see that $A$ is a 2-sparse $6qN \times 6qN$ Hermitian matrix. Then, we consider how to bound $\|A\|$ and $\|A^{-1}\|$, using $\|A - B\| \le \|A\| + \|B\|$. For every $\delta > 0$ and sufficiently large $q \ge 1$, the resulting bounds show that the condition number of $A$ is at most $(2 + \delta)q$.
Now we consider the solution to $\mathrm{QLSP}(6qN, (2 + \delta)q, \varepsilon)$ with respect to $A$, which is
$$|\psi\rangle = \frac{A^{-1}|\bar{0}\rangle}{\left\|A^{-1}|\bar{0}\rangle\right\|},$$
where we denote $|\bar{0}\rangle = |0\rangle|0\rangle|0\rangle$ to avoid ambiguity and each $|0\rangle$ corresponds to one of the three subsystems of $A$ defined by Equation (5). After measuring $|\psi\rangle$ on the computational basis, the outcome satisfies the following properties.
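Lemma 3. Let $(b, j, x)$ be the measurement outcome of $|\psi\rangle$ on the computational basis. Then, $\Pr[q + 1 \le j \le 2q] \ge \frac{e^{-4} - e^{-6}}{1 - e^{-6}} > 0.015$. Moreover, $x = x_j$, where $x_j$ denotes the value of $x$ corresponding to $j$; in particular, $x_j = \Pi_q(0)$ for $q + 1 \le j \le 2q$.
Proof. It is easy to verify the relation between $x$ and $j$, so we only have to compute $\Pr[q + 1 \le j \le 2q]$ as follows. Let $M = \sum_{j=q+1}^{2q} I \otimes |j\rangle\langle j| \otimes I$ be the projector onto the subspace where the second register $j$ is between $q + 1$ and $2q$ (inclusive). Then, $\Pr[q + 1 \le j \le 2q] = \|M|\psi\rangle\|^2$. To this end, we first compute $A^{-1}|\bar{0}\rangle$ by Equation (6); evaluating $\|M A^{-1}|\bar{0}\rangle\|^2$ and $\|A^{-1}|\bar{0}\rangle\|^2$ with this expression gives the stated bound.
By Lemma 3, we can obtain the solution $\Pi_q(0)$ to $\mathrm{PermChain}(N, q)$ with a constant probability by measuring $|\psi\rangle$ on the computational basis with outcome $(b, j, x)$: it holds with probability $> 0.015$ that $q + 1 \le j \le 2q$ and thus $x = x_j = \Pi_q(0)$.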
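The measurement statistics above can be checked numerically on a toy instance. The sketch below assumes that $A$ has the block form with off-diagonal blocks $I - e^{-1/q}P$ and $I - e^{-1/q}P^{-1}$, and that $P$ advances the register $j$ cyclically while applying $\pi_{j+1}$, the identity, or $\pi^{-1}_{3q-j}$ on the three segments of length $q$, as described in the proof of the oracle construction. The basis ordering $(b, j, x)$, the function names, and the example permutations are choices made for this illustration only.

```python
import numpy as np

def build_A(perms):
    """Dense A = [[0, I - c P], [I - c P^T, 0]] with c = exp(-1/q)."""
    q, N = len(perms), len(perms[0])
    inv = [np.argsort(p) for p in perms]          # inverse permutations
    dim = 3 * q * N
    P = np.zeros((dim, dim))
    for j in range(3 * q):
        for x in range(N):
            if j < q:
                y = perms[j][x]                   # pi_{j+1}(x)
            elif j < 2 * q:
                y = x                             # identity segment
            else:
                y = inv[3 * q - j - 1][x]         # pi_{3q-j}^{-1}(x)
            P[((j + 1) % (3 * q)) * N + y, j * N + x] = 1.0
    M = np.eye(dim) - np.exp(-1.0 / q) * P
    Z = np.zeros((dim, dim))
    return np.block([[Z, M], [M.T, Z]])

def outcome_distribution(A, q, N):
    """Outcome probabilities {(b, j, x): p} when measuring A^{-1}|0>|0>|0> normalized."""
    rhs = np.zeros(A.shape[0])
    rhs[0] = 1.0
    y = np.linalg.solve(A, rhs)
    prob = y**2 / np.sum(y**2)
    dist = {}
    for idx in np.flatnonzero(prob > 1e-12):
        b, rest = divmod(int(idx), 3 * q * N)
        j, x = divmod(rest, N)
        dist[(b, j, x)] = float(prob[idx])
    return dist

perms = [np.array(p) for p in ([1, 2, 0, 3], [0, 3, 1, 2], [2, 0, 3, 1])]
q, N = len(perms), len(perms[0])
A = build_A(perms)
dist = outcome_distribution(A, q, N)
target = 1                                        # pi_3(pi_2(pi_1(0))) for these permutations
mass = sum(p for (b, j, x), p in dist.items() if q + 1 <= j <= 2 * q)
```

On this instance every outcome with $q + 1 \le j \le 2q$ carries $x$ equal to `target`, and `mass` exceeds the threshold 0.015 of Lemma 3.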

Constructing quantum query oracles
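As shown in Lemma 2, $A$ is a 2-sparse $6qN \times 6qN$ Hermitian matrix. In this subsection, we will explicitly construct the quantum oracles $O_s$ and $O_A$ for quantum query access to $A$.
Proof. The matrix representation of $A$ by Equation (5) is
$$A = \begin{pmatrix} 0 & I - e^{-1/q} P \\ I - e^{-1/q} P^{-1} & 0 \end{pmatrix}.$$
We first consider how to compute the entries in $I - e^{-1/q} P$, and the case for $I - e^{-1/q} P^{-1}$ is similar. For $0 \le j < 3q$ and $0 \le x < N$, we have
$$P|j\rangle|x\rangle = \begin{cases} |(j+1) \bmod 3q\rangle|\pi_{j+1}(x)\rangle, & 0 \le j < q, \\ |(j+1) \bmod 3q\rangle|x\rangle, & q \le j < 2q, \\ |(j+1) \bmod 3q\rangle|\pi^{-1}_{3q-j}(x)\rangle, & 2q \le j < 3q. \end{cases}$$
By noting that $A|1\rangle|j\rangle|x\rangle = |0\rangle \otimes (I - e^{-1/q} P)|j\rangle|x\rangle$, we have
$$A|1\rangle|j\rangle|x\rangle = |0\rangle|j\rangle|x\rangle - e^{-1/q} |0\rangle P|j\rangle|x\rangle. \qquad (7)$$
Similarly, we have
$$A|0\rangle|j\rangle|x\rangle = |1\rangle|j\rangle|x\rangle - e^{-1/q} |1\rangle P^{-1}|j\rangle|x\rangle. \qquad (8)$$
We can use Equation (7) and Equation (8) to find the non-zero entries of $A$, and thus we can construct each of the quantum oracles $O_s$ and $O_A$ for $A$ using only 1 query to $O_\pi$.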
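The following sketch mirrors this argument classically: the two non-zero entries of any column of $A$ (equivalently any row, since $A$ is real symmetric) are found with at most one lookup of a permutation or of its inverse. It assumes that lookup tables for both $\pi_i$ and $\pi_i^{-1}$ are available, standing in for the oracle $O_\pi$; the function names and example permutations are hypothetical.

```python
import math

def invert(p):
    """Inverse of a 0-indexed permutation given as a list."""
    inv = [0] * len(p)
    for x, y in enumerate(p):
        inv[y] = x
    return inv

def column_entries(perms, inv_perms, b, j, x):
    """Non-zero entries of column |b>|j>|x> of A as {(b', j', x'): value}."""
    q = len(perms)
    c = -math.exp(-1.0 / q)
    if b == 1:                                   # A|1>|j>|x> = |0>(I - e^{-1/q} P)|j>|x>
        if j < q:
            y = perms[j][x]                      # pi_{j+1}(x)
        elif j < 2 * q:
            y = x
        else:
            y = inv_perms[3 * q - j - 1][x]      # pi_{3q-j}^{-1}(x)
        return {(0, j, x): 1.0, (0, (j + 1) % (3 * q), y): c}
    i = (j - 1) % (3 * q)                        # P^{-1} undoes the step taken from layer i
    if i < q:                                    # A|0>|j>|x> = |1>(I - e^{-1/q} P^{-1})|j>|x>
        y = inv_perms[i][x]
    elif i < 2 * q:
        y = x
    else:
        y = perms[3 * q - i - 1][x]
    return {(1, j, x): 1.0, (1, i, y): c}

perms = [[1, 2, 0, 3], [0, 3, 1, 2], [2, 0, 3, 1]]
inv_perms = [invert(p) for p in perms]
q, N = len(perms), len(perms[0])

# Consistency check: entry (r, col) must equal entry (col, r) because A is symmetric.
labels = [(b, j, x) for b in (0, 1) for j in range(3 * q) for x in range(N)]
symmetric = all(
    column_entries(perms, inv_perms, *r).get(col) == v
    for col in labels
    for r, v in column_entries(perms, inv_perms, *col).items()
)
```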
Now if $t \le (q-1)/2$, then $\mathcal{A}$ can be considered as a quantum query algorithm that makes $t$ $k$-parallel queries to $O_\pi$. Then, Equation (9) implies Equation (10), which leads to a contradiction since Equation (10) does not depend on the choices of $\pi_1, \pi_2, \ldots, \pi_q$. Therefore, it must be the case that $t > (q-1)/2$, and thus $t \ge q/2$ (note that $t$ and $q$ are positive integers). This means that $Q^{\parallel}_{\mathrm{sparse}}(6qN, (2 + \delta)q, \varepsilon) \ge q/2$.
By letting $\kappa = (2 + \delta)n$, this bound translates into a lower bound in terms of $\kappa$. Because $\delta$ is arbitrary, this yields the proof of Theorem 4.
Remark 1. The choice of $N$ can be loosened to $N = 2^{n^c}$ for any constant $c > 0$, where the above proof chooses $c = 1$ for simplicity. With such a choice of $N$, the statement of Theorem 4 can be strengthened accordingly.

Quantum Depth Lower Bound for Block-QLSP
In this section, we extend the proof of Theorem 4 to the case of block-QLSP. The idea is to solve the specific instance of sparse-QLSP in the proof of Theorem 4 by quantum algorithms for block-QLSP.
The difference is that due to imperfect implementations of the block-encoding of sparse matrices (caused by Lemma 1), we have to analyze the perturbation of linear systems (see Lemma 5) in order to ensure the success probability of the quantum algorithms.The quantum depth lower bound for block-QLSP is formally stated as follows.
Because $\delta$ can be chosen arbitrarily large and the term $1 - 16\kappa^{-4}$ tends to 1 when $\kappa$ is large, these yield the proof.

Lemma 1 (Block-encoding of sparse matrices, [5, Lemma 48 in the full version]). Given quantum query access to $s$-sparse matrix $A \in \mathbb{C}^{N \times N}$ in the sparse input model by $O_s$ and $O_A$ with $|A_{j,k}| \le 1$, we can implement a quantum circuit that is an $(s, \lceil \log_2(N) \rceil + 3, \varepsilon)$-block-encoding of $A$, using 2 queries to $O_s$, 2 queries to $O_A$, and $O(\log(N) + \log^{2.5}(s/\varepsilon))$ one- and two-qubit quantum gates.

Lemma 4. Let $O_s$ and $O_A$ be quantum oracles, defined by Equation (2) and Equation (3), respectively, for sparse matrix $A$ defined by Equation (5). Then, each of $O_s$ and $O_A$ can be implemented using 1 query to $O_\pi$ defined by Equation (4).
where $\operatorname{negl}(n)$ means a negligible function of $n$. Using the quantum oracles $O_s$ and $O_A$ constructed by $O_\pi$ in Lemma 4, the algorithm $\mathcal{A}$ can produce a quantum state $|\tilde{\psi}\rangle$ such that $\||\psi\rangle - |\tilde{\psi}\rangle\| \le \varepsilon$, using $t = Q^{\parallel}_{\mathrm{sparse}}(6qN, (2 + \delta)q, \varepsilon)$ $k$-parallel queries to $O_s$ and $O_A$. Then, by Lemma 3, with probability at least e −4 −e −6