The detectability lemma and its applications to quantum Hamiltonian complexity

Quantum Hamiltonian complexity, an emerging area at the intersection of condensed matter physics and quantum complexity theory, studies the properties of local Hamiltonians and their ground states. In this paper we focus on a seemingly specialized technical tool, the detectability lemma (DL), introduced in the context of the quantum PCP challenge (Aharonov et al 2009 arXiv:0811.3412), which is a major open question in quantum Hamiltonian complexity. We show that a reformulated version of the lemma is a versatile tool that can be used in place of the celebrated Lieb–Robinson (LR) bound to prove several important results in quantum Hamiltonian complexity. The resulting proofs are much simpler, more combinatorial and provide a plausible path toward tackling some fundamental open questions in Hamiltonian complexity. We provide an alternative simpler proof of the DL that removes a key restriction in the original statement (Aharonov et al 2009 arXiv:0811.3412), making it more suitable for the broader context of quantum Hamiltonian complexity. Specifically, we first use the DL to provide a one-page proof of Hastings' result that the correlations in the ground states of gapped Hamiltonians decay exponentially with distance (Hastings 2004 Phys. Rev. B 69 104431). We then apply the DL to derive a simpler and more intuitive proof of Hastings' seminal one-dimensional (1D) area law (Hastings 2007 J. Stat. Mech. (2007) P8024) (both these proofs are restricted to frustration-free systems). Proving the area law for two and higher dimensions is one of the most important open questions in the field of Hamiltonian complexity, and the combinatorial nature of the DL-based proof holds out hope for a possible generalization. Indeed, soon after the first publication of the methods presented here, they were applied to derive exponential improvements to Hastings' result (Arad et al 2011, Aharonov et al 2011) in the case of frustration-free 1D systems. 
Finally, we also provide a more general explanation of how the DL can be used to replace the LR bound.


Introduction
A striking difference between quantum and classical systems is the number of parameters required to describe them. Whereas a classical system of $n$ particles can generally be described by $O(n)$ parameters, the state of a similar quantum system would generally require $2^{O(n)}$ parameters. A fundamental question is the following: for what class of states is this exponential gap inherent? This is the central issue in quantum Hamiltonian complexity, an emerging field at the intersection of condensed matter physics and quantum computational complexity theory.
Structurally, this exponential gap is directly related to the phenomenon of entanglement, which is the fundamental obstacle in the quest to simulate quantum systems on a classical computer, and has far-reaching implications for the power of quantum computers. The central object of study in quantum Hamiltonian complexity is ground states of local Hamiltonians. What are the conditions under which these ground states can be described efficiently, and what are the conditions under which such description does not exist? There is a beautiful sequence of papers using structures based on tensor networks: MPS [6][7][8][9], PEPS [10], TN [11] and MERA [12], which provide efficient descriptions in certain cases. Indeed, a necessary condition for these descriptions to be efficient is that the quantum states must have small entanglement.
Area laws constitute one of the most important tools for bounding entanglement in such systems. Consider the interaction graph (hypergraph) associated with a local Hamiltonian: it has a vertex for each particle and an edge for each term of the Hamiltonian. Intuitively and very roughly, an area law says that entanglement is local in this interaction graph in the following sense: consider a subset of particles $L$ and its complement $\bar{L}$. Then the entanglement between $L$ and $\bar{L}$ in the ground state is locally 'concentrated' along the edges between $L$ and $\bar{L}$; more precisely, the area law states that the entanglement entropy across the cut is big-Oh of the number of edges crossing between $L$ and $\bar{L}$. This is clearly a very strong restriction on the entropy, which in the general case would be of the order of the number of particles (nodes) in $L$. Proving area laws for typical classes of Hamiltonians is thus a holy grail in quantum Hamiltonian complexity.
A few years ago, in a seminal paper [3], Hastings proved that the area law holds for one-dimensional (1D) systems (i.e. when the interaction graph is a path), for gapped Hamiltonians, that is, Hamiltonians whose overall spectral gap is of order $O(1)$. In this case, the area law says that ground state entanglement across any contiguous cut is bounded by a constant. From this, one can deduce that the ground state of such systems can be described efficiently (by an MPS of polynomial bond dimension; see [3]). The question of whether ground states in two and higher dimensions obey an area law is still open.
Hastings' proof of the 1D area law, and many other proofs related to entanglement and correlations in ground states, use sophisticated analytic methods. Perhaps the most important of these is the famous Lieb-Robinson bound (LR bound) [2,13], which bounds the velocity at which disturbances propagate in quantum local systems; Fourier analysis, and other techniques are important players too.
In this paper, we introduce a combinatorial tool for analyzing correlations and entanglement in ground states of local Hamiltonians. This is a simple, basic version of the detectability lemma (DL) of [1]. We demonstrate that when the system is frustration-free, some of the important results that rely on the traditional analytic tools can be obtained in a much simpler, direct and intuitive way using this tool; we argue that the DL in this form constitutes a useful tool for the study of local Hamiltonians and their ground states.

The detectability lemma (DL).
Our starting point is the DL introduced in [1]. The DL was developed in order to sensibly make a statement of the form 'If the ground state energy is at least k then the probability that it violates (or frustrates) at least ck terms of the Hamiltonian is bounded below by a constant'. The motivation for developing such a tool in [1] was quite specific: to help translate certain classical results about constraint satisfaction problems to quantum results about local Hamiltonians. In particular, it was used to prove a quantum analogue of gap amplification, a component of Dinur's proof of the PCP theorem [14], which is a cornerstone in classical computational complexity. The work [1] holds under the assumption that each particle participates in a bounded number of terms of the Hamiltonian (an assumption that holds for most systems of interest, including interactions on a lattice embedded in any constant dimension). The work [1] also required an additional technical assumption, that the number of distinct types of terms of the Hamiltonian is bounded. As we shall see, this latter assumption is no longer needed.
Here, we reformulate the DL and put it into a different context, that of understanding local properties of ground states. More precisely, our reformulation of the DL asks the following question. Consider a gapped frustration-free local Hamiltonian $H = \sum_{i=1}^{m} H_i$ with $0 \le H_i \le 1$, in which the ground energy is 0 and the spectral gap is $\epsilon = O(1)$. Denote the ground state by $|\Omega\rangle$. We would like to understand the properties of the ground state, such as the behavior of correlations between distant particles in the ground state and the propagation in space of perturbations. To this end, we would like to approximate the projection on the ground state, $\Pi_{gs} = |\Omega\rangle\langle\Omega|$, by a 'local' operator (for some notion of locality to be clarified soon). One way of approaching this question is to ask: is there a local operator that fixes the ground state but shrinks all other eigenvectors significantly? A natural first guess at such an operator is the operator $G = 1 - \frac{1}{m}H$. It fixes the ground space and shrinks its orthogonal complement by a factor that can be as large as $1 - \epsilon/m$. As $m$ increases, the factor approaches 1 and the shrinking becomes negligible. This is not strong enough for various purposes, as we will see; our challenge is to do better. Suitably reformulated, the DL provides such an improvement: an operator that fixes the ground state but shrinks all other eigenstates by a constant factor, independent of $m$.
For simplicity of the current discussion, let us provide the exact statement of the DL, and in particular the definition of the DL operator, under the assumption that the particles are on a 1D chain and that the $H_i$ are projections $Q_i$ that act on adjacent particles, so that $H = \sum_{i=1}^{n-1} Q_i$. Set $P_i$ to be the projection into the local ground space of $H_i$, i.e. $P_i = 1 - Q_i$. The $\{Q_i\}$ terms can be partitioned into two sets, the even and odd terms, which we call layers (see figure 1), and the projections into the common ground spaces of these layers are given by $\Pi_{odd} \stackrel{\mathrm{def}}{=} P_1 P_3 P_5 \cdots$ and $\Pi_{even} \stackrel{\mathrm{def}}{=} P_2 P_4 P_6 \cdots$. The DL states that the operator
$$A \stackrel{\mathrm{def}}{=} \Pi_{odd}\,\Pi_{even} \qquad (1)$$
is the 'local' operator we want.
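The layer structure can be illustrated with a short sketch. The Python snippet below is only an illustration under simple assumptions: the function names `layers_1d` and `is_commuting_layer` are hypothetical, and each term $Q_i$ is represented only by the pair of sites $(i, i+1)$ on which it acts.

```python
def layers_1d(n):
    """Split the terms Q_1..Q_{n-1} of an n-particle chain into two layers.

    Term i is identified with the pair of sites (i, i+1) it acts on.
    Returns (odd_terms, even_terms) as lists of term indices.
    """
    terms = range(1, n)
    odd = [i for i in terms if i % 2 == 1]
    even = [i for i in terms if i % 2 == 0]
    return odd, even


def is_commuting_layer(layer):
    """True if no two terms in the layer share a site, so they trivially commute."""
    used = set()
    for i in layer:
        sites = {i, i + 1}
        if used & sites:
            return False
        used |= sites
    return True


odd, even = layers_1d(8)
print("odd layer:", odd, "even layer:", even)
print(is_commuting_layer(odd), is_commuting_layer(even))
```

Within each layer the terms act on disjoint pairs of sites, which is why the product of the corresponding $P_i$ is itself a projection.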

Lemma 1.1 (DL in 1D). Let $A$ be the operator defined above in (1), and let $\mathcal{H}'$ be the orthogonal complement of the ground space. Then
$$\left\|A|_{\mathcal{H}'}\right\|^2 \le \frac{1}{(\epsilon/2+1)^{2/3}}. \qquad (2)$$
This statement follows from the DL of [1]; we give here a new simple proof of this reformulation of the DL, in the process dropping the assumption of [1] about the number of distinct types of terms of the Hamiltonian.
We can now explain what we mean by 'locality' of $A$. When $A$ is applied $\ell$ times to some local perturbation $B$ that acts on the ground state $|\Omega\rangle$, there is a pyramid-shaped 'causality cone' of projections that is defined by $B$. These are simply all terms that are graph-connected to the operator in question (see figure 2). All the projections outside that cone commute with $B$ and can therefore be absorbed in the ground state, leaving us with a local operator of size $O(\ell)$.
The DL implies that $A^\ell$ can be viewed as an excellent approximation to the ground state projection $\Pi_{gs}$. Indeed, as $A$ fixes the ground space and shrinks all the states that are perpendicular to it by a constant factor, it follows that
$$\left\|A^\ell - \Pi_{gs}\right\| \le \frac{1}{(\epsilon/2+1)^{\ell/3}}.$$
Using $A^\ell$, we can deduce various properties of the ground state.
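To get a feeling for the difference between the naive operator $G$ and the DL operator $A$, the following Python sketch counts how many applications are needed to shrink the component orthogonal to the ground space below a target $\delta$. It assumes the worst-case factor $1 - \epsilon/m$ for $G$ and a factor $(\epsilon/2+1)^{-1/3}$ per application of $A$ in 1D; the function names and the sample values of $\epsilon$, $\delta$ and $m$ are hypothetical choices made for illustration.

```python
import math


def naive_factor(eps, m):
    """Worst-case shrinking factor of G = 1 - H/m on the orthogonal complement."""
    return 1.0 - eps / m


def dl_factor(eps):
    """Assumed 1D bound on the norm of A restricted to the orthogonal complement."""
    return (eps / 2.0 + 1.0) ** (-1.0 / 3.0)


def applications_needed(factor, delta):
    """Smallest integer l such that factor**l <= delta (requires 0 < factor < 1)."""
    return math.ceil(math.log(delta) / math.log(factor))


eps, delta = 1.0, 1e-3
for m in (10, 100, 1000):
    print(m,
          applications_needed(naive_factor(eps, m), delta),
          applications_needed(dl_factor(eps), delta))
```

The count for $G$ grows linearly with the number of terms $m$, whereas the count for $A$ does not depend on $m$ at all.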

Exponential decay of correlations in the ground state.
As a first example of how the DL can be used to handle correlations in the ground state, we provide a one-page proof of Hastings' result [2] that the correlations in the ground states of gapped Hamiltonians decay exponentially with distance. This applies to D-dimensional grids for any D. More precisely, consider two local observables X and Y , which act on sets of particles that are of distance m on the grid; the decay of correlations means that the expectation value of their product is (almost) equal to the product of their expectation, up to an error that decays exponentially with the grid distance between those operators.

Simplifying the proof of Hastings' one-dimensional (1D) area law.
Our main application of the new formulation of the DL is to give a significant simplification of the proof of Hastings' area law in 1D [3]. The proof uses the DL in two key points, completely bypassing the analytic methods, which are now replaced by combinatorial constructions. By this we hope to make this important result accessible to a wider audience, as well as possibly extendable to higher dimensions. The outline of the proof still follows that of Hastings and, indeed, it helps to clarify and shed light on the essence of this important proof; we defer the explanation of the proof and how the DL enters the picture to section 5. We mention that at first sight, one might connect the exponential decay of correlations to an intuition that entanglement between a region $L$ and its surrounding is 'located' only close to the boundary of $L$, and thus scales as the area rather than as the volume. Although this is an appealing intuition, such an implication from exponential decay of correlations to area laws is not known, and indeed quantum expanders provide a counter-example to a naïve application of this logic [15].

The DL versus the LR bound.
In both the applications described above, the DL replaces a combination of the LR bound and other analytic tools. In fact, there is a more general outline for which both applications can be viewed as examples. We illustrate how exactly the DL is related to the analytic methods and how it can be used to replace them, with a more general toy application comparing the use of the LR bound to the alternative route offered by the DL. This is done in section 6.

Further improvements to the 1D area law.
Soon after the first publication of this paper on the arXiv [16], we used the DL to provide a completely new proof for the 1D area law [5]. Whereas the 1D proof in this paper follows the same outline as Hastings' proof, though with the DL replacing the analytical tools, the proof in [5] follows a very different route. There, the main idea is to start with the DL operator $A$ and 'dilute' it to achieve an operator that remains a good approximation to the ground state projection, yet creates much less entanglement. As a result, the bound it achieves on the entanglement entropy is exponentially lower (i.e. better) than the bound in Hastings' 1D proof (and the version presented here). To be specific, for a 1D system $H = \sum_{i=1}^{n-1} H_i$ of $n$ particles of dimension $d$ with a spectral gap $\epsilon > 0$ and an interaction strength $\|H_i\| \le J$, one defines the parameter $X \stackrel{\mathrm{def}}{=} \frac{J \log d}{\epsilon}$. Then Hastings' entropy bound (reproved here) is given by $S \le e^{O(X)}$, whereas the proof in [5] gives $S \le e^{O(\log^2 X)}$.
Shortly after the submission of [5], a major simplification in the proof of [5] was discovered. This led to a third paper [4] that is not only much simpler than [5], but also gives a much better bound: $S \le O(1) \cdot X^3 \log^8 X$. We mention that this bound is only within a polynomial factor of the best known lower bounds for $S$.

The frustration-free restriction.
Our results hold in the frustration-free case. We note, however, that there are various frustration-free systems that are interesting from a physics and a computational point of view: for example the ferromagnetic XXZ model, the AKLT model [20] and stabilizer codes such as the Toric code [21]. In addition, many of the quantum phenomena are revealed already in the context of frustration-free Hamiltonians, and the major open problems in quantum Hamiltonian complexity (e.g. quantum PCP and 2D area law) are open already for this case. Much has to be learned from studying frustration-free Hamiltonians before we proceed to the more general case. Moreover, Hastings' original proof [3] essentially reduces the frustrated case into an approximately frustration-free system by coarse graining, and a similar approach may work to extend the results presented here to the general case.

Other related works.
The DL seems to be connected to various diverse scientific areas. The connection to the LR bound and other analytic tools used in condensed matter physics is discussed extensively in section 6; another connection is to view the DL operator $A$ as a special instance of the general method of alternating projections (MAP), which was first studied by von Neumann [22]. In that method, one applies a fixed sequence of projections in order to approach the intersection subspace. In the general setting, the projections are not assumed to be local, nor is the Hilbert space assumed to be of finite dimension. In recent results [23], the convergence rate is given as a function of the Friedrichs angle, which is not easily related to a physical quantity. The DL, on the other hand, is a MAP under the special assumption that the projections are local, associated with a frustration-free $k$-local Hamiltonian, with a convergence rate that is given as a function of the spectral gap. It would be interesting to see whether more insights can be derived from these connections.
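The general method of alternating projections can be illustrated in a toy setting. The sketch below is not the DL operator: it alternates projections onto two planes in $\mathbb{R}^3$ (an arbitrary choice made here for illustration) whose intersection is the $x$-axis, and the helper names `project` and `alternate` are hypothetical.

```python
import math


def project(v, basis):
    """Orthogonal projection of v onto the span of an orthonormal basis."""
    out = [0.0] * len(v)
    for b in basis:
        c = sum(x * y for x, y in zip(v, b))
        out = [o + c * y for o, y in zip(out, b)]
    return out


s = 1.0 / math.sqrt(2.0)
U = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # the xy-plane
V = [[1.0, 0.0, 0.0], [0.0, s, s]]      # a plane tilted by 45 degrees about the x-axis


def alternate(v, rounds):
    """Apply the product (projection onto U)(projection onto V) `rounds` times."""
    for _ in range(rounds):
        v = project(project(v, V), U)
    return v


v0 = [1.0, 1.0, 0.0]
for rounds in (1, 5, 20):
    print(rounds, alternate(v0, rounds))
```

In this example the component along the intersection (the $x$-axis) is untouched, while the remaining component is halved in every round, in line with the $\cos^2$ of the 45 degree angle between the two planes.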

In recent years, much attention has been given to a quantum algorithm that, given a local Hamiltonian, uses a process involving random measurements of the energies of the local terms to approach the ground state efficiently (for certain cases) [24,25]. The algorithm discussed in those papers carries similarities to the situation we are handling here, despite the fact that measurements are applied rather than projections, and also that the terms are chosen randomly, rather than in some fixed order. It seems that the DL, and the norm-energy trade-off, could potentially also be useful for the analysis of such algorithms. In particular, it would be very interesting to see a version of the DL that applies to the case in which the terms are chosen randomly.

Paper organization.
We start with notations and preliminaries in section 2. We then provide the statement and proof of the new version of the DL in section 3. We proceed to the one-page proof of the exponential decay in section 4 and then provide the 1D area law proof in section 5. In section 6, we conclude with the toy example comparing the LR bound approach to the DL one.

Notations and preliminaries
We consider a $k$-local Hamiltonian $H$ acting on $\mathcal{H} = (\mathbb{C}^d)^{\otimes n}$, the space of $n$ particles of dimension $d$. $H = \sum_i H_i$, where each $H_i$ is a non-negative and bounded operator that acts nontrivially on a constant number $k$ of particles (hence the term local Hamiltonian). We assume that $H$ has a ground space of energy 0, which must therefore also be a common zero eigenspace of all terms $H_i$. This means that $H$ is frustration-free. We also assume that $H$ is 'gapped', meaning that its lowest eigenvalue is 0 (the ground energy) and all other eigenvalues are equal to or larger than some constant $\epsilon > 0$. We denote by $\mathcal{H}' \subset \mathcal{H}$ the orthogonal complement of the ground space of $H$. Thus, $\mathcal{H}'$ is an invariant subspace for $H$, and
$$H|_{\mathcal{H}'} \ge \epsilon\,\mathbb{1}.$$
Most of these assumptions, except for perhaps the frustration-free assumption, are very often used in condensed matter physics.
Throughout this paper, we further assume that the $H_i$ are projections, and hence denote them by $Q_i$. We define $P_i$ to be the projection on the ground space of $Q_i$, $P_i \stackrel{\mathrm{def}}{=} 1 - Q_i$. The assumption that $H$ is made of projections is not actually a restriction because we can reduce any frustration-free, bounded and gapped system into that case. Specifically, for $H = \sum_i H_i$ with $\|H_i\| \le K$ and a spectral gap $\tau > 0$, we first add an appropriate constant to each $H_i$ such that their ground energy is 0. Then, for every $i$ we define $Q_i$ as the projection into the space where the energy of $H_i$ is greater than 0 and $P_i \stackrel{\mathrm{def}}{=} 1 - Q_i$ as the projection to the ground space of $H_i$. Finally, we define the auxiliary Hamiltonian $H' = \sum_i Q_i$. This system is frustration-free because the original ground states would also be ground states of $H'$ with a vanishing energy. Moreover, for any state $|\psi^\perp\rangle \in \mathcal{H}'$ and every $H_i$,
$$\langle\psi^\perp|Q_i|\psi^\perp\rangle \ge \frac{1}{K}\langle\psi^\perp|H_i|\psi^\perp\rangle,$$
and therefore the gap in $H'$ is at least $\tau/K$. It follows that all of our results can be applied to bounded frustration-free Hamiltonians by replacing the gap $\epsilon$ in the DL with the scaled version $\tau/K$.
Given a state $|\phi\rangle$ and a partition of the particles into two nonintersecting sets, $R$ and $L$, with the corresponding Hilbert spaces $\mathcal{H}_L$, $\mathcal{H}_R$, we can consider the Schmidt decomposition of the state along this cut: $|\phi\rangle = \sum_j \alpha_j |L_j\rangle \otimes |R_j\rangle$. Here $\alpha_1 \ge \alpha_2 \ge \cdots$ are the Schmidt coefficients. Their squares are equal to the nonzero eigenvalues of the reduced density matrices to either side of the cut, $\rho_L(\phi)$ and $\rho_R(\phi)$, which we denote by $\lambda_1 \ge \lambda_2 \ge \cdots$. The Schmidt rank of $|\phi\rangle$ is then the number of nonzero eigenvalues $\lambda_j$ (or Schmidt coefficients $\alpha_j$), and the entanglement entropy is the entropy of the set $\{\lambda_j\}$ or, equivalently, the von Neumann entropy of the matrix $\rho_L(\phi)$. A straightforward corollary of the Eckart-Young theorem [26] is then that the truncated Schmidt decomposition provides the best approximation to a vector in the following sense.
Fact 2.1. Let $|\phi\rangle$ be a vector on $\mathcal{H}_L \otimes \mathcal{H}_R$ and let $\lambda_1 \ge \lambda_2 \ge \cdots$ be the eigenvalues of its reduced density matrix. The largest inner product between $|\phi\rangle$ and a norm-one vector with Schmidt rank $r$ is $\sqrt{\sum_{j=1}^{r} \lambda_j}$.
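As a small worked instance of Fact 2.1, consider a two-qubit state that is already written in Schmidt form; the state and its coefficients are chosen here purely for illustration.

```latex
\[
|\phi\rangle = \sqrt{0.8}\,|0\rangle\otimes|0\rangle + \sqrt{0.2}\,|1\rangle\otimes|1\rangle,
\qquad \lambda_1 = 0.8,\quad \lambda_2 = 0.2 .
\]
% r = 1: the best product-state approximation is |0>|0>.
\[
\max_{\text{Schmidt rank } 1} |\langle v|\phi\rangle| = \sqrt{\lambda_1} = \sqrt{0.8} \approx 0.894 ,
\]
% r = 2: the state itself has Schmidt rank 2, so the overlap is 1.
\[
\max_{\text{Schmidt rank } \le 2} |\langle v|\phi\rangle| = \sqrt{\lambda_1 + \lambda_2} = 1 .
\]
```

Read in the other direction, a Schmidt-rank-$r$ vector with large overlap with $|\phi\rangle$ certifies that the $r$ largest eigenvalues $\lambda_j$ carry most of the weight, which is how the fact is used in section 5.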
Throughout the paper, the function log(·) denotes the logarithmic function of base 2.

The detectability lemma (DL)
For clarity of presentation, we will prove the DL in the case stated in the introduction, where the particles are set on a line and the local terms are two-local, involving nearest neighbors. This proof contains all the necessary ingredients for the proof of the more general DL in the case where the Hamiltonian has $k$-local terms that can be partitioned into $g$ layers; we make the precise statement of the more general case at the end of this section.
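We begin with a simple lemma that quantifies the norm-energy trade-off in the simple case of two projections $X, Y$: we show that if the application of $XY$ does not move a vector very much, then the energy of that vector with respect to $1 - Y$ must be small.

Lemma 3.1 (norm-energy trade-off). Let $X, Y$ be two projections and let $|v\rangle$ be a vector of norm 1. Then
$$\|(1-Y)XYv\|^2 \le \|XYv\|^2\left(1 - \|XYv\|^2\right).$$

The proof consists of a few lines of simple algebra.

Proof. If $Y|v\rangle = 0$ the claim is trivial. Otherwise, set $|w\rangle \stackrel{\mathrm{def}}{=} Y|v\rangle / \|Yv\|$, so that
$$XY|v\rangle = \|Yv\|\,X|w\rangle.$$
By definition, $|w\rangle$ is a normalized vector inside the support of $Y$ and therefore for every vector $|\psi\rangle$, we have $\|(1-Y)\psi\| \le \|(1-|w\rangle\langle w|)\psi\|$. Plugging this into the equality above, we find that
$$\|(1-Y)XYv\|^2 \le \|Yv\|^2\,\|(1-|w\rangle\langle w|)Xw\|^2 = \|Yv\|^2\,\|Xw\|^2\left(1-\|Xw\|^2\right) = \|XYv\|^2\left(1-\|Xw\|^2\right) \le \|XYv\|^2\left(1-\|XYv\|^2\right),$$
where the last inequality follows from the fact that $\|Xw\| \ge \|XYv\|$.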
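The trade-off can be checked numerically in the simplest nontrivial setting. The sketch below assumes the inequality takes the form $\|(1-Y)XYv\|^2 \le \|XYv\|^2(1-\|XYv\|^2)$ and tests it for rank-one projections in the plane; the parametrization by the angles `theta` and `alpha` and the helper names are hypothetical choices, and a grid check of this kind is of course not a proof.

```python
import math


def proj_onto(direction, v):
    """Project v onto the line spanned by the unit vector `direction`."""
    c = direction[0] * v[0] + direction[1] * v[1]
    return (c * direction[0], c * direction[1])


def norm_sq(v):
    return v[0] ** 2 + v[1] ** 2


def tradeoff_gap(theta, alpha):
    """Right-hand side minus left-hand side of the assumed inequality."""
    y_dir = (1.0, 0.0)                                    # Y projects onto the x-axis
    x_dir = (math.cos(theta), math.sin(theta))            # X projects onto a rotated line
    v = (math.cos(alpha), math.sin(alpha))                # unit vector
    xyv = proj_onto(x_dir, proj_onto(y_dir, v))           # X Y v
    y_part = proj_onto(y_dir, xyv)
    residual = (xyv[0] - y_part[0], xyv[1] - y_part[1])   # (1 - Y) X Y v
    lhs = norm_sq(residual)
    rhs = norm_sq(xyv) * (1.0 - norm_sq(xyv))
    return rhs - lhs


worst = min(tradeoff_gap(0.01 * i, 0.01 * j)
            for i in range(315) for j in range(315))
print("inequality holds on the grid:", worst >= -1e-12)
```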
Figure 3. Decomposing $A$ as the product $A = \Delta_1 \Delta_2 \cdots \Delta_m R$. The pyramids $\Delta_i$ are represented by the gray ovals and $R$ by the empty ovals $P_4, P_8$. (Panel labels: Layer 1 (even), Layer 2 (odd).)
Let us now proceed to prove the DL in 1D.
Proof of lemma 1.1. Suppose $|\psi\rangle \in \mathcal{H}'$ is a norm-1 state that is orthogonal to the ground space, and define $|\phi\rangle \stackrel{\mathrm{def}}{=} A|\psi\rangle$. Note that for every ground state $|\Omega\rangle$, $\langle\Omega|A|\psi\rangle = 0$, and so $|\phi\rangle$ is orthogonal to the ground space. We would like to show that
$$\|\phi\|^2 \le \frac{1}{(\epsilon/2+1)^{2/3}}. \qquad (6)$$
We will find both a lower and an upper bound for the energy of $|\phi\rangle$, $\langle\phi|H|\phi\rangle$, which will give us an inequality for $\|\phi\|^2$, from which (6) will follow. The lower bound is straightforward since $|\phi\rangle$ is orthogonal to the ground space, and so $\langle\phi|H|\phi\rangle \ge \epsilon\|\phi\|^2$.
We shall now upper bound the energy $\langle\phi|H|\phi\rangle$ by carefully upper bounding the contributions of the individual terms $\langle\phi|Q_i|\phi\rangle$. We begin by noting that these terms are equal to 0 for $i$ odd since $A = \Pi_{odd}\Pi_{even}$ and $Q_i\Pi_{odd} = 0$ for any odd $i$ (recall that $\Pi_{even}$, $\Pi_{odd}$ are products of the projections $P_i = 1 - Q_i$). We now want to bound the contributions coming from the even terms.
For this purpose, we present $A$ in a convenient form, by reordering its terms. We call the triplet products of projections $(P_1P_3P_2), (P_5P_7P_6), \ldots$ pyramids and denote them by $\Delta_i = P_{4i-3}P_{4i-1}P_{4i-2}$. The remaining terms are combined into the operator $R \stackrel{\mathrm{def}}{=} P_4P_8\cdots$. See figure 3 for an illustration of this structure. Note that by just using the fact that $P_i$ and $P_j$ commute when $i$ and $j$ are not consecutive, we can write
$$A = \Delta_1\Delta_2\cdots\Delta_m R,$$
where $m$ is the number of pyramids, which is approximately $n/4$.
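The bookkeeping behind this reordering can be sketched in a few lines of Python. The snippet assumes, for simplicity, that the number of terms is a multiple of 4 (so that there are no incomplete pyramids at the boundary); the function names are hypothetical and each projection $P_i$ is represented only by its index.

```python
def sites(term):
    """Sites on which the nearest-neighbour projection P_term acts."""
    return {term, term + 1}


def pyramid_decomposition(num_terms):
    """Group the indices 1..num_terms into pyramids (4i-3, 4i-1, 4i-2) and R = (4, 8, ...)."""
    if num_terms % 4 != 0:
        raise ValueError("this sketch assumes the number of terms is a multiple of 4")
    m = num_terms // 4
    pyramids = [(4 * i - 3, 4 * i - 1, 4 * i - 2) for i in range(1, m + 1)]
    remainder = [4 * i for i in range(1, m + 1)]
    return pyramids, remainder


def pyramids_are_disjoint(pyramids):
    """True if different pyramids act on disjoint sets of sites (hence commute)."""
    seen = set()
    for pyramid in pyramids:
        support = set().union(*(sites(t) for t in pyramid))
        if seen & support:
            return False
        seen |= support
    return True


pyramids, remainder = pyramid_decomposition(12)
print(pyramids, remainder, pyramids_are_disjoint(pyramids))
```

Every index appears exactly once, either in a pyramid or in $R$, and distinct pyramids act on disjoint sites, so they can be applied in any order.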
We will use this reordering to bound the energy contribution of the terms $Q_2, Q_6, \ldots$; a symmetric argument will bound the remaining even terms $Q_4, Q_8, \ldots$. The energy contribution of $Q_{4i-2}$ will be related to the amount of movement produced by the $\Delta_i$ portion of the operator $A$.
The key point in providing this bound is this. We view the transformation $|\psi\rangle \to A|\psi\rangle = |\phi\rangle$ as a series of steps given by the application of the pyramids $\Delta_i$. Specifically, letting $|v_i\rangle \stackrel{\mathrm{def}}{=} \Delta_i\Delta_{i-1}\cdots\Delta_1 R|\psi\rangle$, we consider the transformation $|\psi\rangle \to R|\psi\rangle \to |v_1\rangle \to |v_2\rangle \to \cdots \to |v_m\rangle = A|\psi\rangle$. The square of the norm of the first state, after applying $R$, is $a_1 \stackrel{\mathrm{def}}{=} \|R\psi\|^2$. Let $a_i \stackrel{\mathrm{def}}{=} \|v_i\|^2/\|v_{i-1}\|^2$ be the 'shrinkage' of the square norm (or movement) resulting from the application of the $i$th pyramid, for $1 \le i < m$.
We shall now prove, using lemma 3.1, that the shrinkage $a_i$ is related to the energy of the operator $Q_{4i-2}$, which is at the top of the pyramid $\Delta_i$. Recall that $\Delta_i = P_{4i-3}P_{4i-1}P_{4i-2}$, which is of the form $XY$ with the projections $X = P_{4i-3}P_{4i-1}$ and $Y = P_{4i-2}$, so that $1 - Y = Q_{4i-2}$. We can therefore apply lemma 3.1, which bounds the energy of $Q_{4i-2}$ in terms of $1 - a_i$. Using this upper bound gives an upper bound for the energy contribution of the terms $Q_i$, $i \in \{2, 6, 10, \ldots\}$, subject to the constraint on the norm $\|\phi\|^2 = \prod_i a_i$. The bound is maximized when all the $a_i$ are equal to each other, i.e. $a_i = \|\phi\|^{2/m}$, and therefore we are left with an upper bound on the energy $\langle\phi|(Q_2 + Q_6 + \cdots)|\phi\rangle$ that depends only on $\|\phi\|$ and $m$; the dependence on $m$ is then removed using an elementary inequality that holds for every $x \in [0, 1]$ and can easily be verified.
For the energy of $Q_4 + Q_8 + \cdots$, a similar decomposition $A = (P_3P_5P_4)(P_7P_9P_8)\cdots(P_2P_6\cdots)$ can be made, thereby upper bounding the total energy by $2\,\frac{1-\|\phi\|^2}{\|\phi\|}$. Combining the energy upper and lower bounds we therefore obtain
$$\epsilon\|\phi\|^2 \le 2\,\frac{1-\|\phi\|^2}{\|\phi\|},$$
and so
$$\frac{\epsilon}{2} \le \frac{1-\|\phi\|^2}{\|\phi\|^3} \le \frac{1}{\|\phi\|^3} - 1,$$
which gives
$$\|\phi\|^2 \le \frac{1}{(\epsilon/2+1)^{2/3}}.$$
The above proof can easily be generalized to other geometries. In the general case, in accordance with section 2, we assume that we have a $k$-local, frustration-free Hamiltonian $H = \sum_i Q_i$ that is made of projections and has a spectral gap $\epsilon > 0$. We further assume that each particle participates in a constant number of projections, and therefore the $Q_i$ can be partitioned into a constant number $g$ of layers; each layer is made of projections that do not intersect each other and are therefore commuting.
Then for each layer we define the projection $\Pi_i$ as the product of all $P_j = 1 - Q_j$ that are in the layer, and define the DL operator $A$ by
$$A \stackrel{\mathrm{def}}{=} \Pi_g\cdots\Pi_2\Pi_1.$$
Finally, we define $f(k, g)$ to be the number of sets of pyramids that are necessary to estimate the energy contribution of all the $Q_i$ terms. In the 1D case that we proved, we had $f(k, g) = 2$, because only the even layer contributed energy and we needed two sets of pyramids to cover that layer. In the general case it is easy to see that $f(k, g)$ can be crudely bounded by $f(k, g) \le (g-1)k^g$. Using the above definitions, the general DL is as follows.

Lemma 3.2 (the DL). Let $A$ be the DL operator defined above and let $\mathcal{H}'$ be the orthogonal complement of the ground space. Then
$$\left\|A|_{\mathcal{H}'}\right\|^2 \le \frac{1}{(\epsilon/f(k,g)+1)^{2/3}}.$$

Consider now a Hamiltonian $H = \sum_i Q_i$ which is $k$-local and set on a $d$-dimensional grid. Once again, in accordance with section 2, we assume that the $Q_i$ are projections and that $H$ is frustration-free with a unique ground state $|\Omega\rangle$ (i.e. $Q_i|\Omega\rangle = 0$) and a spectral gap $\epsilon > 0$. We wish to show the following.

Theorem 4.1 (decay of correlations in ground states of gapped Hamiltonians on a d-dim grid). Consider a setting as described above. Let $X, Y$ be two local observables whose distance on the grid from each other is $m$. Then the correlation $\langle\Omega|XY|\Omega\rangle - \langle\Omega|X|\Omega\rangle\langle\Omega|Y|\Omega\rangle$ decays exponentially with $m$.
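Proof. Let us now consider two operators, $P_{in}$ and $P_{out}$. $P_{in}$ is defined by applying the DL operator $\ell$ times to $Y$ and discarding all projections outside the causality cone of $Y$; $\ell$ is chosen such that the resulting cone will not overlap with $X$ (see figure 4). Therefore $\ell \propto m$, with a proportionality constant that is a geometrical factor. $P_{out}$ is the complement of $P_{in}$, i.e. it is the product of projections one gets by applying the DL operator $\ell$ times, excluding the projections that are inside the causality cone of $Y$. Together, we have $P_{in} \cdot P_{out} = A^\ell$ (see figure 4 for an illustration in 1D). $P_{in}$ and $P_{out}$ leave the ground state invariant. In addition, they commute with $X$ and $Y$, respectively; hence
$$\langle\Omega|XY|\Omega\rangle = \langle\Omega|X P_{in} P_{out} Y|\Omega\rangle = \langle\Omega|X A^\ell Y|\Omega\rangle.$$
We now recall that $A^\ell$ is in fact an approximation of the ground-state projection $\Pi_{gs}$ (see (26) in section 6), with an error $\|A^\ell - \Pi_{gs}\|$ that decays exponentially with $\ell$, and so
$$\left|\langle\Omega|XY|\Omega\rangle - \langle\Omega|X\Pi_{gs}Y|\Omega\rangle\right| \le \|X\|\,\|Y\|\,\|A^\ell - \Pi_{gs}\|.$$
Assuming that the ground state is unique, $\Pi_{gs} = |\Omega\rangle\langle\Omega|$, and therefore
$$\left|\langle\Omega|XY|\Omega\rangle - \langle\Omega|X|\Omega\rangle\langle\Omega|Y|\Omega\rangle\right| \le \|X\|\,\|Y\|\,\|A^\ell - \Pi_{gs}\|,$$
which decays exponentially with $m$ since $\ell \propto m$.

Figure 4. An illustration of the statement $\langle\Omega|XY|\Omega\rangle = \langle\Omega|X P_{out} P_{in} Y|\Omega\rangle$. The operator $P_{out}$ is drawn in blue and $P_{in}$ is in red. Note that the number of projection layers is proportional to the distance between $X$ and $Y$.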

The area law in one dimension using the DL
We consider a two-local Hamiltonian on a chain of $n$ particles of dimension $d$, $H = \sum_{i=1}^{n-1} Q_i$ and, as before, we assume that the $Q_i$ are projections and that $H$ is frustration-free with a unique ground state $|\Omega\rangle$ and a spectral gap $\epsilon$. Then according to (2),
$$\Delta \stackrel{\mathrm{def}}{=} \frac{1}{(\epsilon/2+1)^{2/3}} \qquad (14)$$
is the DL shrinking exponent of the square of the norm. In order to keep the presentation simple we will assume that $\epsilon \le 3$, and therefore $\frac{1}{2} < \Delta < 1$. We prove the following version of a 1D area law.

Theorem 5.1 (area law for frustration-free Hamiltonians in 1D).
For any contiguous cut along the chain, the entanglement entropy $S$ of the ground state $|\Omega\rangle$ across the cut is bounded by a constant which depends on the dimensionality of the particles $d$ and on the spectral gap $\epsilon$; specifically, $S \le e^{O(X)}$ for $X \stackrel{\mathrm{def}}{=} \frac{\log d}{\epsilon}$.
We note that in accordance with section 2, for 1D frustration-free systems $H = \sum_i H_i$ with a spectral gap $\epsilon > 0$ and a bounded interaction strength $\|H_i\| \le J$, one should replace $\epsilon \to \epsilon/J$ and consequently $X \to \frac{J \log d}{\epsilon}$.
The proof relies on two main lemmas. The first shows that for any cut along the line, there is a product state |φ 1 ⊗ |φ 2 that has a constant inner product with the ground state.

The second lemma shows that if there exists a product state with a constant overlap with the ground state $|\Omega\rangle$, then $|\Omega\rangle$ has finite entanglement entropy.

Lemma 5.3 (constant overlap with a product state implies finite entropy). If for some cut there exists a product state $|\phi_1\rangle \otimes |\phi_2\rangle$ such that $|\langle\phi_1 \otimes \phi_2|\Omega\rangle| \ge \mu$, then the entanglement entropy of $|\Omega\rangle$ across the cut is bounded by a quantity that depends only on $\mu$, $d$ and $\epsilon$.

Theorem 5.1 then follows easily by combining the two lemmas, using the overlap $\mu$ guaranteed by lemma 5.2. We prove lemma 5.2 in section 5.2 and lemma 5.3 in section 5.1.

Constant overlap implies finite entropy (proof of lemma 5.3)
In this section we prove lemma 5.3. The DL is clearly the right tool for the task, since it provides a 'local' operator that can be repeatedly applied to the promised product state |φ 1 ⊗ |φ 2 without increasing its entanglement rank much, while exponentially decreasing its distance from the ground state.
The only thing that is not entirely clear is how to get a constant bound on the entanglement entropy of the ground state, since a straightforward argument would mean applying the operator a non-constant number of times to get arbitrarily close to the ground state. The key is to observe that after $\ell$ applications of the DL we get a state with a bounded Schmidt rank that is close to the ground state, and by Fact 2.1, this gives us a bound on the sum of the largest $d^{2\ell}$ eigenvalues of the reduced density matrix of the ground state. With these bounds we can find a pessimistic constant upper bound on the entanglement entropy. We can now proceed to the more detailed proof.
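Consider then a cut in the line between the particles $i_0$ and $i_0 + 1$, and let $Q_{i_0}$ be the local term in $H$ that involves $i_0, i_0 + 1$. Assume that along that cut, the product state $|\phi_0\rangle \stackrel{\mathrm{def}}{=} |\phi_1\rangle \otimes |\phi_2\rangle$ has a constant projection $\mu$ on the ground state $|\Omega\rangle$:
$$|\phi_0\rangle = \mu|\Omega\rangle + (1-\mu^2)^{1/2}|\Omega^\perp\rangle,$$
where $|\Omega^\perp\rangle \in \mathcal{H}'$. We now apply the DL operator $A$ $\ell$ times on $|\phi_0\rangle$. We obtain
$$A^\ell|\phi_0\rangle = \mu|\Omega\rangle + \delta_\ell|\Omega^\perp_\ell\rangle,$$
where by the DL, $|\Omega^\perp_\ell\rangle \in \mathcal{H}'$ and $\delta_\ell^2 \le \Delta^\ell$. Let $|v_\ell\rangle$ be the normalized version of $A^\ell|\phi_0\rangle$. Then
$$|\langle\Omega|v_\ell\rangle|^2 = \frac{\mu^2}{\mu^2+\delta_\ell^2} \ge \frac{\mu^2}{\mu^2+\Delta^\ell}. \qquad (18)$$
This means that the states $|v_\ell\rangle$ are exponentially close to the ground state, as a function of $\ell$. How entangled are those states? We note that at each application of $A$, the entanglement rank of the state can only increase by a multiplicative factor of $d^2$: for every $i \ne i_0$, the projection term $P_i$ in $A$ works entirely left of the cut or entirely right of the cut, thereby not increasing the Schmidt rank of the state. The only projection in $A$ that may increase the rank is $P_{i_0}$, and as it is a two-local projection that works on $d$-dimensional particles, it can at most increase it by a factor of $d^2$ (see footnote 5). Consequently, the Schmidt rank of $|v_\ell\rangle$ is at most $d^{2\ell}$.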
We have therefore obtained a family of states $\{|v_\ell\rangle\}$ with Schmidt ranks bounded by $d^{2\ell}$, which are closer and closer to $|\Omega\rangle$. Then using Fact 2.1, together with (18), it follows that the eigenvalues of the reduced density matrix of $|\Omega\rangle$ along the cut, $\lambda_1 \ge \lambda_2 \ge \cdots$, must satisfy
$$\sum_{j=1}^{d^{2\ell}} \lambda_j \ge \frac{\mu^2}{\mu^2+\Delta^\ell},$$
which implies the following series of inequalities: for every $\ell \ge 1$,
$$\sum_{j > d^{2\ell}} \lambda_j \le \mu^{-2}\Delta^\ell. \qquad (21)$$
From here, the desired upper bound on the entropy can be deduced by choosing the distribution of maximal entropy which still satisfies the inequalities in (21). The following lemma, whose proof can be found in appendix A, gives one such bound.
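The interplay between the growing Schmidt rank and the shrinking tail can be tabulated with a short sketch. It assumes the 1D shrinking factor $\Delta = (\epsilon/2+1)^{-2/3}$, the rank bound $d^{2\ell}$ and a tail bound of the form $\mu^{-2}\Delta^\ell$; the function names and the sample parameters ($d = 2$, $\epsilon = 1$, $\mu = 1/2$) are hypothetical.

```python
def shrink_factor(eps):
    """Assumed DL shrinking factor for the squared norm in 1D."""
    return (eps / 2.0 + 1.0) ** (-2.0 / 3.0)


def rank_and_tail(d, eps, mu, ell):
    """Schmidt-rank bound d^(2*ell) and tail-mass bound mu^-2 * Delta^ell, capped at 1."""
    rank = d ** (2 * ell)
    tail = min(1.0, shrink_factor(eps) ** ell / mu ** 2)
    return rank, tail


for ell in (1, 5, 10, 20, 40):
    rank, tail = rank_and_tail(d=2, eps=1.0, mu=0.5, ell=ell)
    print(f"l={ell:2d}  rank <= {rank:.3e}  tail mass <= {tail:.3e}")
```

For small $\ell$ the tail bound is vacuous (it exceeds 1), and it only starts to bite once $\Delta^\ell$ drops below $\mu^2$; this is the regime that a bound such as lemma 5.4 has to handle.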
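Lemma 5.4. Consider a probability distribution $\{\lambda_j\}$ whose values are ordered in a nonincreasing fashion, $\lambda_1 \ge \lambda_2 \ge \cdots$, and let $D \ge 2$ be an integer and $K \ge 4$, $1/2 < \Delta < 1$ some constants such that for every $\ell \ge 1$:
$$\sum_{j > D^\ell} \lambda_j \le K\Delta^\ell.$$
Then the entropy of $\{\lambda_j\}$ is upper bounded by an explicit function of $D$, $K$ and $\Delta$.
Substituting $D = d^2$ and $K = \mu^{-2}$ gives the corresponding bound on the entanglement entropy of $|\Omega\rangle$. But looking at (14), we see that as long as $\epsilon \le 3$ (which is our working assumption), $\log \Delta^{-1} \ge \epsilon/5$, and so the bound can be expressed in terms of $d$, $\mu$ and $\epsilon$ alone.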
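The two numerical facts used here, namely that $\epsilon \le 3$ implies $1/2 < \Delta < 1$ and $\log \Delta^{-1} \ge \epsilon/5$ (with base-2 logarithms, as in section 2), can be sanity-checked on a grid. The sketch assumes $\Delta = (\epsilon/2+1)^{-2/3}$; the grid spacing is an arbitrary choice and the check is not a substitute for the short analytic argument.

```python
import math


def shrink_factor(eps):
    """Assumed DL shrinking factor for the squared norm in 1D."""
    return (eps / 2.0 + 1.0) ** (-2.0 / 3.0)


grid = [0.001 * k for k in range(1, 3001)]  # eps in (0, 3]
in_range = all(0.5 < shrink_factor(e) < 1.0 for e in grid)
log_bound = all(math.log2(1.0 / shrink_factor(e)) >= e / 5.0 for e in grid)
print(in_range, log_bound)
```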

A product state having constant overlap with $|\Omega\rangle$ (proof of lemma 5.2)
A natural candidate for a tensor product state with a constant overlap with $|\Omega\rangle$ is the mixed state $\rho_L \otimes \rho_R$, where $\rho_L$ is the reduced density matrix of $|\Omega\rangle$ to the left of the cut, and $\rho_R$ is the reduced density matrix to the right.
Footnote 5: Given a vector $|v\rangle = \sum_{j=1}^{r} \alpha_j |L_j\rangle \otimes |R_j\rangle$ of Schmidt rank $r$ and $i < i_0$, $P_i|v\rangle = \sum_{j=1}^{r} \alpha_j P_i|L_j\rangle \otimes |R_j\rangle$ and thus has Schmidt rank bounded by $r$ (the symmetric argument shows the same result for $i > i_0$). For $i = i_0$, we can decompose $P_{i_0} = \sum_{k=1}^{d^2} X_k \otimes Y_k$ with $X_k$ acting on the $i_0$th particle and $Y_k$ acting on the $(i_0+1)$th particle. Consequently, $P_{i_0}|v\rangle = \sum_{k=1}^{d^2}\sum_{j=1}^{r} \alpha_j X_k|L_j\rangle \otimes Y_k|R_j\rangle$ has Schmidt rank of at most $r d^2$.
Let us assume for contradiction that the overlap between $|\Omega\rangle$ and $\rho_L \otimes \rho_R$, and in fact with any tensor product state along a certain cut, is less than $\Delta^{\ell/8}$ for some sufficiently large constant $\ell$. If the overlap is small, then there is a measurement that distinguishes $|\Omega\rangle$ from $\rho_L \otimes \rho_R$ with a probability of at least $1 - \Delta^{\ell/4}$; this is simply the projection on the ground state, $\Pi_{gs}$.
The challenge is to show that there is such a local measurement, i.e. a measurement confined to a local window, which distinguishes these two states almost as well. Using the DL, we shall now find such a local measurement that distinguishes with a slightly worse probability $1 - 2\Delta^{\ell/4}$. Let us denote by $\rho_L$ (respectively $\rho_R$) the reduced density matrix of $|\Omega\rangle$ restricted to the $\ell$ particles on the left (respectively right) of the cut. Also, let $\rho_{2\ell}$ be $\rho = |\Omega\rangle\langle\Omega|$ restricted to the $2\ell$ particles, $\ell$ on each side of the cut. We refer to the state $\rho_L \otimes \rho_R$ as the 'disentangled' version of the state $\rho_{2\ell}$. The following lemma shows that under the assumption that $|\Omega\rangle$ has low overlap with every product state $|\phi_1\rangle \otimes |\phi_2\rangle$ (along a given cut), there exists a measurement confined to the window of $2\ell$ particles around the cut, that with high probability distinguishes $\rho_{2\ell}$ from $\rho_L \otimes \rho_R$.

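Lemma 5.5 (existence of a distinguishing measurement). Assuming that the overlap of the ground state with any product state satisfies $|\langle \Omega|\phi_1 \otimes \phi_2\rangle| \le \Delta^{\ell/8}$, there exists a $2\ell$-local measurement that distinguishes $\rho_{2\ell}$ from $\rho_L \otimes \rho_R$ with a probability of at least $1 - 2\Delta^{\ell/4}$.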
The DL ensures that by applying the layers one by one, we converge to the projection on the ground state quickly, and it is this projection that is exactly the distinguishing measurement we want to approximate. We can thus apply $A$ only $\ell/2$ times, approximating the projection on the ground space. Now, following the intuition explained in the introduction (and in the example of section 6), only the causality cone of the cut should be used in this measurement, and the rest of the operators in those layers are swallowed by the state being measured. This amounts to a measurement which is restricted to the $[-\ell, \ell]$ interval and still distinguishes well enough. The detailed proof can be found in the appendices.
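The geometry of the causality cone can be sketched in a few lines. The sketch assumes a 1D chain in which $P_i$ acts on particles $i$ and $i+1$, and in which each application of $A$ consists of two layers (even and odd terms), so that $\ell/2$ applications give $\ell$ layers. The helper names `pyramid_layers` and `support` are hypothetical and serve only to illustrate the counting.

```python
def pyramid_layers(k, n_layers):
    """Indices i of the projectors P_i (acting on particles i, i+1) in each
    layer of the pyramid centred at the cut k; layer 0 is applied first."""
    return [list(range(k - t, k + t + 1, 2)) for t in range(n_layers)]


def support(layers):
    """Sorted list of particles touched by any projector in the given layers."""
    particles = set()
    for layer in layers:
        for i in layer:
            particles.update((i, i + 1))
    return sorted(particles)


if __name__ == "__main__":
    ell = 4
    # ell/2 applications of A, two layers each, give ell layers in total.
    layers = pyramid_layers(k=0, n_layers=ell)
    for t, layer in enumerate(layers):
        print(t, layer)
    print(support(layers))
```

With $\ell$ layers the pyramid touches exactly $2\ell$ particles, $\ell$ on each side of the cut, which is the window to which the measurement is confined.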
The fact that such a measurement exists, distinguishing the original state confined to the $2\ell$ window from its 'disentangled' version with high probability, must somehow indicate that there is a lot of entanglement along the cut, whose disentanglement caused this distinguishability. This can be made precise using an information-theoretical argument.
Lemma 5.6 (distinguishing measurement implies large difference in entropies). If there is a $2\ell$-local measurement that distinguishes $\rho_{2\ell}$ from $\rho_L \otimes \rho_R$ with a probability of at least $1 - 2\Delta^{\ell/4}$, then
$$S(\rho_L) + S(\rho_R) - S(\rho_{2\ell}) \ge \frac{\epsilon\ell}{20} - 1.$$
The lemma implies that the entropy $S(\rho_L) + S(\rho_R)$ is significantly larger than $S(\rho_{2\ell})$, implying that disentangling along the cut has introduced a lot of new entropy. The proof is simple, based on relative entropy; essentially, all it uses is the fact that a measurement that distinguishes two states with high probability implies high relative entropy between the results of the measurements. Once again, details can be found in the appendices.
To finish the proof of lemma 5.2, we now need to derive a contradiction. Denote by $S(2\ell)$ the value of $S(\rho_{2\ell})$, with $2\ell$ being the segment centered around the cut that provides our contradictory assumption (namely, that any tensor product state has less than $\Delta^{\ell/8}$ inner product with the ground state). Under these conditions, lemma 5.5 applies, and hence also the conditions of lemma 5.6 apply to this segment. Applying lemma 5.6 we conclude that $S(2\ell) \le 2S(\ell) - \frac{\epsilon}{20}\ell + 1$. We now want to recursively apply this inequality for the $\ell$-long segments on both sides of the cut and then for the $\ell/2$-long segments within those segments and so on. The problem is that the cuts now move to different locations within the $2\ell$-long window, and so our assumption no longer applies for these cuts. However, if the inner product with any tensor product state is small along the original cut, it can be shown to be quite small also along nearby cuts, and so all the above arguments can be applied for those cuts too. This can be formalized in the following claim, whose easy proof can once again be found in the appendices.
We therefore assume by contradiction that the inner product along a given cut is smaller than $\mu$, such that $\mu d^{\ell} = \Delta^{\ell/8}$, and so along all the cuts in the $2\ell$ window we have that the inner product of the states is at most $\Delta^{\ell/8}$, and hence our assumptions apply. We can therefore use the same argument recursively. Since $S(1) \le \log(d)$, we get (for $\ell$ a power of 2), $S(\ell) \le \ell\log(d) - \frac{\epsilon}{20}\ell\log\ell + \ell \le \ell(\log(d) + 1) - \frac{\epsilon}{20}\ell\log\ell$. Choosing $\ell = d^{40/\epsilon}$ then yields $\frac{\epsilon}{20}\log\ell \ge \log d + 1$, giving us the contradiction $S(\ell) < 0$.
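To see the mechanism of the contradiction concretely, the following sketch iterates the inequality $S(2\ell) \le 2S(\ell) - \frac{\epsilon}{20}\ell + 1$ numerically, starting from $S(1) \le \log_2 d$, and reports the first power of two at which the resulting upper bound turns negative. The function names and the sample inputs are hypothetical. The threshold comes from iterating the inequality literally and is not meant to reproduce the constants of the closed-form choice of $\ell$ in the text.

```python
import math


def unrolled_bound(k, d, eps):
    """Upper bound on S(2**k) obtained by iterating
    S(2*l) <= 2*S(l) - (eps/20)*l + 1, starting from S(1) <= log2(d)."""
    bound, length = math.log2(d), 1
    for _ in range(k):
        bound = 2 * bound - (eps / 20) * length + 1
        length *= 2
    return bound


def first_negative_k(d, eps, k_max=200):
    """Smallest k for which the iterated bound on S(2**k) is negative,
    or None if that does not happen for k <= k_max."""
    for k in range(k_max + 1):
        if unrolled_bound(k, d, eps) < 0:
            return k
    return None


if __name__ == "__main__":
    d, eps = 2, 3.0  # hypothetical sample values
    k = first_negative_k(d, eps)
    print(k, unrolled_bound(k, d, eps))
```

Since an entropy cannot be negative, any segment length at which the bound drops below zero contradicts the assumption of uniformly small overlap with product states.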

Comparison of the Lieb-Robinson bound approach and the DL approach
In this section, we compare the DL with a standard method used in many of the seminal results in quantum Hamiltonian complexity, such as Hastings' area law for 1D gapped systems [3], and Hastings' proof of the exponential decay of correlations [2]. The method combines the use of the Lieb-Robinson bound (LR bound) with a Fourier analysis and the existence of a gap, to reveal the locality properties of the ground state. Specifically, the method uses these tools to approximate expressions that involve the projection operator to the ground state, $\Pi_{gs}$, by local operators.
To understand how this is done, let us concentrate on a simple example of locality in the ground state, and derive it using both the DL and the LR bound.
We focus on $\Pi_{gs}$, the projection on the ground space of $H$. On the surface, this projector seems very far from being local in any sense. Nevertheless, in gapped systems it does possess some locality properties that are crucial to the analysis of correlations and entanglement in the ground state. To see this, one standardly considers an approximation of $\Pi_{gs}$ by another operator:
$$P_q \stackrel{\mathrm{def}}{=} \frac{1}{\sqrt{2\pi q}} \int_{-\infty}^{\infty} e^{-t^2/2q}\, e^{-iHt}\, dt.$$
Here $q$ is a free parameter to be chosen as appropriate. For an eigenvector $|E\rangle$ of $H$ with eigenvalue $E$, we have $P_q|E\rangle = e^{-qE^2/2}|E\rangle$. Consequently, if the system has a constant spectral gap $\epsilon > 0$ (and zero ground energy), $P_q$ approximates $\Pi_{gs}$ well:
$$\|P_q - \Pi_{gs}\| \le e^{-q\epsilon^2/2}. \qquad (23)$$
We now want to argue regarding the local nature of $P_q$ in various contexts. Let us illustrate the LR bound approach with a simple example: consider the expression $\Pi_{gs}B|\Omega\rangle \approx P_qB|\Omega\rangle$, where $|\Omega\rangle$ is a ground state of the system with zero energy, and $B$ is some local perturbation. It is easy to see that
$$P_qB|\Omega\rangle = \frac{1}{\sqrt{2\pi q}} \int_{-\infty}^{\infty} e^{-t^2/2q}\, B(t)|\Omega\rangle\, dt, \qquad (24)$$
where $B(t) = e^{-iHt} B e^{iHt}$ is the time evolution of the perturbation $B$. The key point now is to use the famous LR bound to approximate $B(t)$ by a 'local' operator, i.e. an operator that acts only on the 'neighborhood' of the particles on which $B$ acts. The following is an immediate corollary of the original LR bound, which we omit for the sake of brevity. The full statement of the LR bound, together with the proof of this corollary, can be found in [27].
Given a length scale $\ell > 0$, we may now set $q = \ell$ in (24) and obtain
In the above series of approximations, $\approx$ implies an approximation of up to an error of $e^{-O(\ell)}$. The first approximation follows from the assumption of the constant gap and (23). The second approximation is due to the exponential decay of the filter function $e^{-t^2/2q}$ and the third one is due to the LR bound. We therefore obtain an exponentially (in $\ell$) good approximation to $\Pi_{gs}B$ in the expression $\Pi_{gs}B|\Omega\rangle$ by an operator which is $\ell$-local.
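The action of the Gaussian filter on an eigenvector can be checked numerically. The sketch below assumes the filter takes the standard form $(2\pi q)^{-1/2}\int e^{-t^2/2q} e^{-iEt}\,dt$ on an eigenvalue $E$, and evaluates it with a plain trapezoid rule; the sine part is dropped because it integrates to zero by symmetry. The function name, the truncation at twelve standard deviations and the step count are illustrative assumptions.

```python
import math


def filtered_eigenvalue(E, q, steps=20000):
    """Trapezoid estimate of (2*pi*q)**-0.5 * integral over t of
    exp(-t**2 / (2*q)) * cos(E*t), truncated to |t| <= 12*sqrt(q)."""
    t_max = 12 * math.sqrt(q)
    h = 2 * t_max / steps
    total = 0.0
    for n in range(steps + 1):
        t = -t_max + n * h
        weight = 0.5 if n in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-t * t / (2 * q)) * math.cos(E * t)
    return total * h / math.sqrt(2 * math.pi * q)


if __name__ == "__main__":
    for E, q in ((0.0, 1.0), (1.0, 2.0), (0.5, 4.0)):
        print(E, q, filtered_eigenvalue(E, q), math.exp(-q * E * E / 2))
```

The two printed columns should agree closely, reflecting that the filter suppresses an eigenvalue $E$ by $e^{-qE^2/2}$ and therefore damps everything above the gap exponentially in $q$.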
Let us now derive the same result for the frustration-free case using the DL. First, we approximate the ground space projection $\Pi_{gs}$ by applying the DL operator $A$ for $m$ times. By (12), $A$ leaves the ground space invariant while shrinking the orthogonal space by a constant factor. Therefore
$$\|A^m - \Pi_{gs}\| \le e^{-O(m)}.$$
We now write
$$\Pi_{gs}B|\Omega\rangle \approx A^mB|\Omega\rangle$$
and consider the expression $A^mB|\Omega\rangle$. By assumption, the system is frustration-free, and therefore every local projection operator $P_i$ that appears in $A$ leaves $|\Omega\rangle$ invariant: $P_i|\Omega\rangle = |\Omega\rangle$. We now consider the 'causality cone' of projections in $A^m$ that are defined by $B$. These are simply the projections that are graph-connected to $B$ when all the projections in $A^m$ are arranged in consecutive $gm$ layers (see figure 2). The main observation is that all the projections outside this causality cone commute with $B$, and can therefore be absorbed by $|\Omega\rangle$. We are therefore left only with the projections of the causality cone, whose support size is proportional to $m$. In other words, with $m$ proportional to $\ell$, just as in the LR bound method, we found an exponentially (in $\ell$) good approximation to $\Pi_{gs}B$ in the expression $\Pi_{gs}B|\Omega\rangle$ by an operator which is $\ell$-local.
To simplify this expression, note that for any integer $k \ge 1$, we can use only the inequalities for $\ell = k, 2k, 3k, \ldots$. This amounts to considering the original problem with $(D, \Delta)$ replaced by $(D^k, \Delta^k)$. Then choosing $k =$
Proof of lemma 5.5. Let $Q = \{Q_i : Q_i \text{ acts only on particles in the } 2\ell \text{ interval}\}$. Let $\Pi$ be the projection onto the ground space of all the operators in $Q$. We will show that $\{\Pi, 1 - \Pi\}$ is the desired distinguishing measurement.
The first equality holds since $\Pi$ only acts on the $2\ell$ particles in $\rho_L \otimes \rho_R$ and the last inequality is trivial; it is the middle equality which uses the structure of $A^{\ell/2}$. Indeed, let us write $A^{\ell/2} = A_M A_L A_R$ where $A_M = \cdots (P_{k-2} P_k P_{k+2})(P_{k-1} P_{k+1})(P_k)$ is a 'pyramid' of terms centered at the cut $k$, and $A_L$ and $A_R$ are the terms to the left and right of the pyramid, respectively, as in figure B.1. Then $\Pi A^{\ell/2}(\rho_L \otimes \rho_R) = \Pi A_M A_L A_R(\rho_L \otimes \rho_R)$. But since we applied $A$ for exactly $\ell/2$ times, every $P_i$ projection in $A_M$ is also in the $2\ell$ window of $\Pi$. Therefore $\Pi P_i = \Pi$ and consequently $\Pi A_M = \Pi$. Similarly, $A_L A_R(\rho_L \otimes \rho_R) = \rho_L \otimes \rho_R$, and therefore $\Pi A^{\ell/2}(\rho_L \otimes \rho_R) = \Pi(\rho_L \otimes \rho_R)$, implying the desired equality.
In this case, we have $X = 1$ with probability 1 and $Y$ is 1 with probability $\alpha \le 2\Delta^{\ell/4}$. Thus, using straightforward analysis, $\sum_i x_i \log\frac{x_i}{y_i} = \log\left(\frac{1}{\alpha}\right) \ge \log(1/2) + \frac{\ell}{4}\log(\Delta^{-1})$. But by (14), as long as $\epsilon \le 3$ (which is our working assumption), $\log(\Delta^{-1}) \ge \epsilon/5$, and so $\sum_i x_i \log\frac{x_i}{y_i} \ge \frac{\epsilon\ell}{20} - 1$, and the result follows.
Appendix D. Proof that close cuts behave similarly: claim 5.7
Proof of Claim 5.7. Assume for contradiction that $|\chi_1\rangle \otimes |\chi_2\rangle$ is a product state across the cut $(k+j, k+j+1)$ with $j > 0$ such that $|\langle \chi_1 \otimes \chi_2|\Omega\rangle| > \mu d^{j}$. Schmidt-decompose $|\chi_1\rangle = \sum_{i=1}^{d^j} \alpha_i |L_i\rangle \otimes |R_i\rangle$, where the cut is between the first $k$ particles and the $j$ particles between $k+1$ and $k+j$. Since there are at most $d^j$ terms and each $|\alpha_i| \le 1$, by simple algebra there exists at least one $i$ such that $|\langle L_i \otimes R_i \otimes \chi_2|\Omega\rangle| > \mu$, which violates the hypothesis.
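The relative-entropy computation in this extremal case is easy to reproduce. The sketch below evaluates the classical relative entropy $\sum_i x_i \log_2 (x_i/y_i)$ for a binary outcome and compares it with $\log_2(1/\alpha)$. The function name and the sample values $\Delta = 0.8$, $\ell = 40$ are hypothetical, and base-2 logarithms are assumed so that $\log(1/2) = -1$.

```python
import math


def relative_entropy(x, y):
    """Classical relative entropy sum_i x_i * log2(x_i / y_i),
    with the convention that terms with x_i = 0 contribute nothing."""
    total = 0.0
    for xi, yi in zip(x, y):
        if xi > 0:
            total += xi * math.log2(xi / yi)
    return total


if __name__ == "__main__":
    delta, ell = 0.8, 40               # hypothetical sample values
    alpha = 2 * delta ** (ell / 4)     # probability that Y equals 1
    x = [1.0, 0.0]                     # X equals 1 with certainty
    y = [alpha, 1.0 - alpha]
    print(relative_entropy(x, y), math.log2(1 / alpha))
```

Both printed numbers equal $-1 + \frac{\ell}{4}\log_2(\Delta^{-1})$, the quantity that is then bounded from below using the gap.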