Concentration bounds for quantum states with finite correlation length on quantum spin lattice systems

We consider the problem of determining the energy distribution of quantum states that satisfy exponential decay of correlation and product states, with respect to a quantum local Hamiltonian on a spin lattice. For a quantum state on a $D$-dimensional lattice that has correlation length $\sigma$ and has average energy $e$ with respect to a given local Hamiltonian (with $n$ local terms, each of which has norm at most 1), we show that the overlap of this state with the eigenspace of energy $f$ is at most $\exp\left(-\frac{((e-f)^2\sigma)^{\frac{1}{D+1}}}{n^{\frac{1}{D+1}} D\sigma}\right)$. This bound holds whenever $|e-f| > 2^D\sqrt{n\sigma}$. Thus, on a one-dimensional lattice, the tail of the energy distribution decays exponentially with the energy. For product states, we improve the above result to obtain a Gaussian decay in energy, even for quantum spin systems without an underlying lattice structure. Given a product state on a collection of spins which has average energy $e$ with respect to a local Hamiltonian (with $n$ local terms and each local term overlapping with at most $m$ other local terms), we show that the overlap of this state with the eigenspace of energy $f$ is at most $\exp\left(-\frac{(e-f)^2}{nm^2}\right)$. This bound holds whenever $|e-f| > m\sqrt{n}$.


Introduction
A question of primary interest for local hamiltonian spin systems is to determine the energy distribution of natural classes of states with respect to a given local hamiltonian. The knowledge of the energy distribution reveals a lot of information about the nature of the state itself. As we shall discuss below, a gaussian distribution of energy can be associated to a product state. On the other hand, the well known entangled state $\frac{1}{\sqrt{2}}|0\rangle^{\otimes n} + \frac{1}{\sqrt{2}}|1\rangle^{\otimes n}$ (also termed the 'cat state') has an energy distribution peaking at opposite ends of the spectrum of the hamiltonian $\sum_{i=1}^{n} |1\rangle\langle 1|_i$. Moreover, the knowledge of the energy distribution plays an important role in the study of thermalization of quantum systems.
The aforementioned question has been well studied in the classical setting, important examples of which are the Chernoff bound [Che52] and the Central limit theorem (which applies to the asymptotic regime). The Chernoff bound can be informally stated as follows. Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables taking values in $[0, 1]$ and each having average value $A$. Then $\Pr(|X_1 + X_2 + \ldots + X_n - nA| > \varepsilon) \le e^{-c\varepsilon^2/n}$, where $c$ is a constant that depends on $A$.
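The informal statement above can be checked numerically in a simple case. The following is a minimal sketch, assuming Bernoulli random variables and using the Hoeffding form $2e^{-2\varepsilon^2/n}$ of the bound (an explicit choice of constant made here for illustration; the text leaves $c$ unspecified). The function names are hypothetical.

```python
from math import comb, exp

def two_sided_tail(n, p, eps):
    """Exact Pr(|S - n*p| > eps) for S a sum of n i.i.d. Bernoulli(p) variables."""
    mean = n * p
    return sum(
        comb(n, s) * p**s * (1 - p) ** (n - s)
        for s in range(n + 1)
        if abs(s - mean) > eps
    )

def hoeffding_bound(n, eps):
    """Hoeffding's two-sided bound for sums of n variables in [0, 1].
    Assumption: this plays the role of e^{-c eps^2 / n} with an explicit constant."""
    return 2.0 * exp(-2.0 * eps**2 / n)

if __name__ == "__main__":
    n, p = 100, 0.3
    for eps in (5, 10, 20):
        print(eps, two_sided_tail(n, p, eps), hoeffding_bound(n, eps))
```

The exact tail falls below the bound for every deviation, and both decay as a Gaussian in $\varepsilon$ at fixed $n$.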
One interpretation of this bound (which was the original motivation in [Che52]) is that it provides a recipe for distinguishing between two probability distributions $P \overset{\mathrm{def}}{=} \sum_x p(x)|x\rangle\langle x|$ and $Q \overset{\mathrm{def}}{=} \sum_x q(x)|x\rangle\langle x|$ with expectation values $A$ and $B$ respectively. Given $n$ independent samples $x_1, x_2, \ldots, x_n$ from either of these distributions, the sum $\sum_i x_i$ is highly likely to be concentrated around $nA$ if the underlying distribution is $P$ and around $nB$ if the underlying distribution is $Q$. A more precise formulation of this idea requires characterizing the trace distance between $P^{\otimes n}$ and $Q^{\otimes n}$ as $n$ becomes large, and it has been generalized to the quantum setting in [ACMnT+07].
Another interpretation of the Chernoff bound, which is the focus of the present work, lies in the setting of 'classical' local hamiltonian systems. Consider a product state $\rho^{\otimes n}$ on $n$ sites, where $\rho \overset{\mathrm{def}}{=} \sum_x p(x)|x\rangle\langle x|$. Let $H$ be a 1-local hamiltonian $H \overset{\mathrm{def}}{=} \sum_i h_i$, such that $h_i = \sum_x x|x\rangle\langle x|$ acts non-trivially only on the site $i$ and is the same for each site. If $A \overset{\mathrm{def}}{=} \mathrm{Tr}(\rho h_i)$ is the expectation value of $\rho$ with respect to $h_i$, then $nA$ is the average energy of $\rho^{\otimes n}$ with respect to the hamiltonian $H$. Let $\Pi_{\ge nA+\varepsilon}$ be the projector onto eigenstates of $H$ with energy at least $nA + \varepsilon$. Then the Chernoff bound implies that $\mathrm{Tr}(\rho^{\otimes n}\Pi_{\ge nA+\varepsilon}) \le e^{-c\varepsilon^2/n}$. Thus, the energy distribution of $\rho^{\otimes n}$ is highly concentrated around the average energy $nA$.
The energy distribution of a product state for a quantum lattice system with infinitely many sites was considered in [GV89] (for translationally invariant systems) and in [HMH04a] (for non-translationally invariant systems). These results can be regarded as a generalization of the Central limit theorem to quantum systems. A non-asymptotic version of the Central limit theorem is the Berry-Esseen theorem ([Ber41], [Ess45]), which provides an upper bound on the trace distance between the energy distribution of a product state and the normal distribution as a function of lattice size. This upper bound goes to zero as the lattice size approaches infinity, thus recovering the Central limit theorem. For quantum states with finite correlation length (which includes product states) on a finite sized lattice, a quantum version of the Berry-Esseen theorem was recently shown to hold in [BC15], [BCG15].
These results give a strong indication that states satisfying exponential decay of correlation behave similarly to product states, even when their energy distributions are measured with respect to the eigenspectrum of a non-commuting (but local) hamiltonian. The work [KLW15] goes even further to show that non-commuting local hamiltonians themselves have an energy spectrum that resembles that of a 1-local hamiltonian (although, quite curiously, the same work shows that almost all eigenvectors of non-commuting local hamiltonians are highly entangled, in contrast with the eigenvectors of 1-local hamiltonians).
The above mentioned results have added to the growing body of research on general properties of local hamiltonian systems, such as the Lieb-Robinson bound [LR72], exponential decay of correlation [Has04], the area laws [Has07, ALV12, AKLV13] and local reversibility [KAAV17], to name a few. They have also found several applications in the problem of thermalization of many body systems. To start with, one of the first steps towards the problem of locality of temperature$^1$ was taken in [HMH04b]. Crucially using the Central limit theorem obtained in [HMH04a], the authors characterized a set of conditions under which a given thermal state of a quantum local hamiltonian on a lattice would be close to a tensor product of thermal states on local subsystems of the lattice.
The work [Cra12] considered the problem of thermalization under random hamiltonians, where the hamiltonian was generated via a random unitary on a fixed local hamiltonian $H$. One of the main technical problems in this work was the study of the characteristic function $\mathrm{Tr}(e^{iHt}\,\frac{I}{d})$, where $\frac{I}{d}$ is the maximally mixed state (which is also a product state on the lattice). The techniques were inspired by the proof of the Central limit theorem in the works [GV89], [HMH04a], where the characteristic function $\mathrm{Tr}(e^{iHt}\rho)$ of a product state $\rho$ had been investigated in detail.
The quantum version of the Berry-Esseen theorem [BCG15] was used to show in [BC15] that the Gibbs state of a local hamiltonian $H$ at sufficiently high temperature (high enough to ensure a clustering of correlation) is indistinguishable, over sufficiently large regions of the lattice (that scale sub-linearly with lattice size), from the microcanonical ensemble of eigenstates of $H$ which have eigenvalues close to the average energy of the Gibbs state. This result bears close resemblance to the Eigenstate Thermalization Hypothesis [Sre94, Deu91], which is a stronger conjecture stating that every eigenstate of $H$ with eigenvalue close to the average energy of the Gibbs state of $H$ is locally indistinguishable from this Gibbs state.
In the present work, we provide further details on the energy distribution of states satisfying exponential decay of correlation and product states. Our main results can be seen as an analogue of the Chernoff bound for quantum lattice systems.
Our first result concerns states that satisfy exponential decay of correlation on a $D$-dimensional lattice. Well known examples of such states include the ground states of gapped local hamiltonians [Has04] and Gibbs states above a finite temperature [KGK+13]. In fact, it has been shown in [FWB+15] that for local hamiltonians exhibiting many body localization and having a non-degenerate energy spectrum, all eigenvectors satisfy exponential decay of correlation. Thus our result provides information about the structure of eigenvectors of such hamiltonians and may have applications in the phenomenon of many body localization.
Fix a $D$-dimensional lattice with spins of arbitrary local dimension sitting on each lattice site. Consider a local hamiltonian $H$ on the lattice with $n$ local interaction terms, such that each local term has operator norm at most 1 and its support is a hyper-cube of side length $2k$, hence containing $(2k)^D$ lattice sites (see Section 2 and Figure 2 for a detailed description of $H$).
Theorem 1.1 (Informal). Let $\rho$ be a quantum state with correlation length $\sigma$ and $\langle H\rangle_\rho \overset{\mathrm{def}}{=} \mathrm{Tr}(\rho H)$ be the average energy of $\rho$. Let $\Pi_{\ge f}$ ($\Pi_{\le f}$) be the projection onto the subspace which is the union of eigenspaces of $H$ with eigenvalues $\ge f$ ($\le f$).
The formal statement of the theorem is given in Theorem 4.2. Thus in a one-dimensional spin chain (with $D = 1$), our upper bound decays exponentially with energy, rather than as a gaussian. The bound becomes weaker in higher dimensions and is depicted in Figure 1.
Our second result concerns product states over a collection of spins and does not require any underlying lattice arrangement of these spins. It does impose, however, a locality constraint on the hamiltonian that acts on these spins. Consider a hamiltonian $H$ which is a sum of $n$ terms, each term being $k$-local (that is, it acts non-trivially on at most $k$ spins) and having operator norm at most 1. Let $m$ be the maximum number of neighbours of any local term, where two local terms are neighbours if there is a spin on which both act non-trivially (see Section 5 and Figure 3 for a detailed description of $H$). We show the following.
The formal statement of the theorem is given in Theorem 5.3. The energy distribution is depicted in Figure 1. The bound is not only independent of any underlying lattice structure, but is also independent of the locality $k$. This is not surprising, since the quantity $n$ that appears in the bound is the number of local terms in $H$, rather than the number of spins on which $H$ acts. The following corollary is a restatement of the above bound, in terms of the number of spins (which we call $N$) and the maximum number of local terms that act on any given spin (which we call $g$). In the following, we also assume that each local term is exactly $k$-local.

Corollary 1.3. Consider a product state ρ with average energy
The formal statement of the corollary appears as Corollary 5.4. It shows a gaussian decay for the tail of the energy distribution of product states in the scenario where $g, k$ are constants$^2$ independent of $N$.

Related recent works
A recent work [Kuw16] has obtained a similar concentration result for product states (Lemma 4 therein). The key idea is to split the hamiltonian $H$ as $H = H_1 + H_2 + \ldots$, where each of $H_1, H_2, \ldots$ is composed of local terms that are non-overlapping. Then from the classical Chernoff bound, the product state exhibits a Gaussian decay in energy distribution for each of the hamiltonians $H_1, H_2, \ldots$. The final step (which is also the main argument of the paper) is to combine these tail bounds to obtain a final bound for the energy distribution with respect to the original hamiltonian $H$. Unfortunately, the techniques do not extend to states satisfying exponential decay of correlation. To establish a bound for the energy distribution with respect to $H$, one needs the knowledge of bounds for the energy distribution with respect to each of the 'classical hamiltonians' $H_1, H_2, \ldots$. But even for these classical hamiltonians, no bound is known for states that satisfy exponential decay of correlation (apart from Theorem 1.1, to the best of the author's knowledge). We have provided further comparison of the bound in [Kuw16] and Theorem 1.2 in Subsection 5.1.

Figure 1: The tail bounds according to Theorem 1.1 (right hand side) and Theorem 1.2 (left hand side). The x-axis is the energy $e$ and the y-axis is the weight $\mathrm{Tr}(\rho\Pi_e)$, where $\Pi_e$ is the projector onto the eigenspace of $H$ with energy $e$. The shaded region depicts the part of the energy distribution with overall weight at most $\frac{1}{n}$. The discontinuous part of the curve is where our results provide no information. We have ignored $O(1)$ constants in the figure.

A concentration result has been noted in [KAAV17] (Section 5 in this reference) for ground states of gapped local hamiltonians on finite dimensional quantum lattice systems, which also exhibit exponential decay of correlation ([Has04]). In that work, the probability distribution has been shown to be concentrated about the median of the distribution, with the weight of the distribution above energy $\varepsilon$ decaying as $e^{-|\varepsilon - f|/O(1)\sqrt{n\sigma}}$ ($f$ being the median of the distribution, $n$ being the number of local terms in the local hamiltonian and $\sigma$ being the correlation length of the ground state). In comparison, we show a concentration about the mean of the distribution for all states satisfying exponential decay of correlation. While our bounds are weaker than those of [KAAV17] in higher dimensions, it may be noted that we have considered a larger class of states that might possess weaker properties than the ground states of gapped local hamiltonians. This behaviour appears in the context of area laws as well: ground states of gapped local hamiltonians are known to have very good scaling of area laws with correlation length [AKLV13], whereas a recent observation of Hastings [Has15] suggests that states satisfying exponential decay of correlation may have a much weaker dependence of the area law on correlation length [BH13a].

Our technique and organisation
The idea behind our approach is straightforward, to compute the moment generating function

The paper is organized as follows. We state basic facts and describe the physical set-up needed for Theorem 1.1 in Section 2. We prove our combinatorial lemma in Section 3. In Section 4 we prove our bounds for states satisfying exponential decay of correlation. In Section 5, we introduce the physical set-up required for Theorem 1.2 and provide the proof of the theorem. This proof also requires a variant of the combinatorial lemma (Lemma 3.1), which we prove in Appendix A. We conclude in Section 6 and address some questions left open by this work.

Physical set-up and basic facts
In this section, we introduce the physical set-up required for Theorem 1.1. For simplicity of presentation, we shall assume that the spins are arranged on a square lattice, with a local interaction term acting between only those spins that are the vertices of a common 'unit-hypercube'. We shall introduce the notion of a 'dual lattice' below, to formally and concisely represent these local interactions between the spins. It can be observed that more general local interactions on a square lattice can be put in this form by sufficient coarse-graining of lattice sites. The physical set-up for Theorem 1.2 is relatively simple, and shall be introduced directly in Section 5.
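Consider a $D$-dimensional real vector space $\mathbb{R}^D$. For a vector $v = (v_1, v_2, \ldots, v_D) \in \mathbb{R}^D$, define its 1-norm as $\|v\| \overset{\mathrm{def}}{=} \sum_{i=1}^{D} |v_i|$. It satisfies the triangle inequality: given $v, v', v'' \in \mathbb{R}^D$, we have $\|v - v'\| \le \|v - v''\| + \|v'' - v'\|$. For brevity, we shall refer to the 1-norm distance simply as distance.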
For an integer $L > 0$, define a lattice $\mathcal{L}_{D,L}$ as the set of all vectors $v \in \mathbb{R}^D$ that satisfy the following: for all $i \in \{1, 2, \ldots, D\}$, it holds that $v_i$ is an integer and 0

Henceforth, the vectors belonging to $\mathcal{L}_{D,L}$ shall be referred to as sites.
For each site $v \in \mathcal{L}_{D,L}$, we associate a $d$-dimensional Hilbert space $\mathcal{H}^d_v$ and define the full Hilbert space as $\mathcal{H} \overset{\mathrm{def}}{=} \bigotimes_{v \in \mathcal{L}_{D,L}} \mathcal{H}^d_v$.

A local hamiltonian system is conveniently represented using the notion of a dual lattice. Let $\tilde{\mathcal{L}}_{D,L}$ be the set of vectors $w$ such that for all $i \in \{1, 2, \ldots, D\}$, $0 < w_i < L$ and $w_i$ is a half integer (that is, $w_i = k + \frac{1}{2}$, for $k$ an integer). For a fixed $w \in \tilde{\mathcal{L}}_{D,L}$ and an integer $k$, let $S(w, k)$ be the set of all sites $v \in \mathcal{L}_{D,L}$ such that: for all $i \in \{1, 2, \ldots, D\}$, $|v_i - w_i| \le k + \frac{1}{2}$. A local hamiltonian on $\mathcal{L}_{D,L}$ is defined as $H = \sum_{w \in \tilde{\mathcal{L}}_{D,L}} h_w$, where $h_w$ is a '$(2k+2)^D$-local' term that acts non-trivially only on sites in $S(w, k)$ and acts as identity on the rest of the sites. The number of sites in $S(w, k)$ is at most $(2k+2)^D$, justifying $h_w$ as a '$(2k+2)^D$-local' interaction. Following the physical motivation, we shall refer to vectors in $\tilde{\mathcal{L}}_{D,L}$ as interactions. Figure 2 illustrates the notions introduced above for the case when $D = 2$. Without loss of generality, we assume that the local terms $h_w$ are positive semi-definite matrices and $\|h_w\|_\infty \le 1$, where $\|\cdot\|_\infty$ is the operator norm.

Given an operator $A$, the support of $A$ (called $\mathrm{supp}(A)$) is the set of sites in $\mathcal{L}_{D,L}$ on which $A$ acts non-trivially. We define the distance between two operators $A, B$ to be the minimum distance between their respective supports, that is, $\min_{v \in \mathrm{supp}(A),\, v' \in \mathrm{supp}(B)} \|v - v'\|$.

For a quantum state $\rho \in \mathcal{H}$, the reduced density matrix of $\rho$ on a set $T$ of sites is represented as $\rho_T$. We define the average energy of $\rho$ to be $\mathrm{Tr}(\rho H)$, and represent it as $\langle H\rangle_\rho$. For every local term $h_w$, let $\langle h_w\rangle_\rho \overset{\mathrm{def}}{=} \mathrm{Tr}(\rho_{S(w,k)} h_w)$. Then we have $\langle H\rangle_\rho = \sum_{w \in \tilde{\mathcal{L}}_{D,L}} \langle h_w\rangle_\rho$.

A state $\rho \in \mathcal{H}$ satisfies $(C, l_0, \sigma)$-decay of correlation if for any two operators $A, B$ such that the distance between $A, B$ is $l \ge l_0$, it holds that

Define $\Pi_f$ to be the projector onto the eigenspace of $H$ with eigenvalue (energy) equal to $f$. Let $\Pi_{\ge f}$ ($\Pi_{\le f}$) be the projection onto the subspace which is the union of eigenspaces of $H$ with eigenvalues greater (less) than $f$. The following fact follows from Markov's inequality.

A combinatorial lemma
In this section, we shall prove a combinatorial lemma, which we shall use in Section 4 to prove Theorem 1.1. A slight variant of this lemma shall be proved in Appendix A and used in Section 5 to prove Theorem 1.2. We recall the definition of the set $\tilde{\mathcal{L}}_{D,L}$ of interactions (the dual lattice) from Section 2 and let the number of interactions in $\tilde{\mathcal{L}}_{D,L}$ be $n$. It is easily seen that $n = (L-1)^D$.
Fix an integer $l$. An ordered set $\{w_1, w_2, \ldots, w_r\}$ of $r$ interactions in $\tilde{\mathcal{L}}_{D,L}$ is said to satisfy a property P($l$) if the following holds: for all $w_i$, there exists a $w_j$ with $j \ne i$ such that $\|w_i - w_j\| \le l$. Let the number of such ordered sets be $N_D(n, r, l)$.
The rest of the section is devoted to the proof of the following lemma.
We start with the following definition, which we shall use extensively.
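For very small lattices, $N_D(n, r, l)$ can be computed by brute force. The following is a minimal sketch under stated assumptions: the dual lattice is modelled as half-integer points with a chosen number `side` of positions per axis (a hypothetical parameter, so that $n = \texttt{side}^D$), the index $j$ is taken to be different from $i$, and the count is compared with the quantity $(4(4l)^D nr)^{r/2}$ that appears in the proof of Lemma 3.1. All function names are illustrative.

```python
from itertools import product

def interactions(D, side):
    """Half-integer points with `side` positions per axis (assumed dual lattice)."""
    coords = [j + 0.5 for j in range(side)]
    return list(product(coords, repeat=D))

def dist(w, w2):
    """1-norm distance between two points."""
    return sum(abs(a - b) for a, b in zip(w, w2))

def satisfies_P(ws, l):
    """Property P(l): every w_i has some other w_j (j != i) within distance l."""
    r = len(ws)
    return all(
        any(j != i and dist(ws[i], ws[j]) <= l for j in range(r))
        for i in range(r)
    )

def count_P(D, side, r, l):
    """Brute-force N_D(n, r, l): number of ordered r-tuples satisfying P(l)."""
    pts = interactions(D, side)
    return sum(1 for ws in product(pts, repeat=r) if satisfies_P(ws, l))

def lemma_bound(D, n, r, l):
    """The quantity (4 (4l)^D n r)^(r/2) from the proof of Lemma 3.1."""
    return (4 * (4 * l) ** D * n * r) ** (r / 2)

if __name__ == "__main__":
    D, side, r, l = 1, 6, 2, 1
    n = side ** D
    print(count_P(D, side, r, l), n ** r, lemma_bound(D, n, r, l))
```

The enumeration costs $n^r$ steps, so it is only practical for toy sizes; its purpose is to make the definition of P($l$) concrete.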

Definition 3.2. A selection is an ordered set $\{(b_1, x_1), (b_2, x_2), \ldots, (b_r, x_r)\}$, where $b_i \in \{0, 1\}$ and $x_i \in \tilde{\mathcal{L}}_{D,L}$, that satisfies the following constraints:

1. If $b_i = 0$, then $x_i$ can be any interaction in $\tilde{\mathcal{L}}_{D,L}$, and if $b_i = 1$, $x_i$ has to satisfy $\|x_i - x_j\| \le 2l$ for some $j < i$.

2. The number of $i$ for which $b_i = 0$ is at most $\frac{r}{2}$.
We show the following lemma, from which the proof of Lemma 3.1 shall follow immediately.
Lemma 3.3. Every ordered set $\{w_1, w_2, \ldots, w_r\}$ that satisfies property P($l$) can be mapped to a selection in such a way that for any two distinct sets satisfying P($l$), the corresponding selections are distinct.
Proof. We assign a selection to an ordered set $\{w_1, w_2, \ldots, w_r\}$ satisfying P($l$) using the algorithm below.

Initialization
• Set $i = 1$ and $b_i = 0$, $x_i = w_i$.

Pointer creation
• Set $i = 1$. While ($i \le r$), do: If $b_i = 0$, find the smallest $j > i$ such that $b_j = 1$ and $\|x_j - x_i\| \le l$ (such a $j$ exists due to property P($l$)). Set $R(i) = j$. Set $i \to i + 1$.

Update
• Let $\mathcal{S}$ be the set of all subsets of $\{1, 2, \ldots, r\}$ which have cardinality at least 2.
• For each element $S \in \mathcal{S}$, do:
  • Let $s$ be the cardinality of $S$ and $i_1, i_2, \ldots, i_s$ be its elements arranged in increasing order.
• End For.
We show that the above algorithm terminates and assigns a selection to each ordered set satisfying property P($l$).
1. Consider the running of the algorithm during the step Initialization. Condition 1 of a selection holds: for every $i$ for which there is a $j < i$ such that $\|x_i - x_j\| \le l$, we have set $b_i = 1$. But we have not constructed a selection yet, since condition 2 may not be satisfied.
2. After the step Pointer creation, it may be possible that there exist indices $i_1, i_2, \ldots, i_s$ (for some $s < r$) In this case, we find using the triangle inequality that Thus, the step Update sets $b_{i_2} = b_{i_3} = \ldots = b_{i_s} = 1$, recognizing the fact that each of the points $w_{i_2}, w_{i_3}, \ldots, w_{i_s}$ is at a lattice distance of at most $2l$ from $w_{i_1}$. This ensures that condition 1 of a selection is still satisfied.
3. After the step Update terminates, condition 2 of a selection is now satisfied as well. We now have that for every $i$ with $b_i = 0$, there is no other $i'$ such that Thus, the number of $i$ with $b_i = 0$ is at most as large as the number of $j$ with $b_j = 1$.
The lemma follows as two distinct ordered sets satisfying P($l$) are not assigned the same selection. Now we prove Lemma 3.1.
Proof of Lemma 3.1. For $n \le r(4l)^D$, we clearly have $N_D(n, r, l) \le n^r \le ((4l)^D nr)^{\frac{r}{2}} < (4(4l)^D nr)^{\frac{r}{2}}$. So we assume $n > r(4l)^D$. We bound the number of selections, which gives the desired upper bound on $N_D(n, r, l)$ using Lemma 3.3.
Consider those selections for which the number of $i$ such that $b_i = 0$ is $u$. For each $i$ with $b_i = 0$, the number of possible choices of $x_i$ is $n$. For each $i$ with $b_i = 1$, the number of possible choices of $x_i$ is at most $(4l)^D r$ (as there are at most $(4l)^D$ points $x_j \in \tilde{\mathcal{L}}_{D,L}$ that satisfy $\|x_i - x_j\| \le 2l$ for a given $x_i$ $^3$). Hence the total number of such selections is at most $\binom{r}{u} n^u ((4l)^D r)^{r-u}$. Since $u \le \frac{r}{2}$, the total number of selections is at most This proves the lemma.
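The counting step above can be checked with exact integer arithmetic. The following sketch sums the per-$u$ count $\binom{r}{u} n^u ((4l)^D r)^{r-u}$ over $0 \le u \le r/2$ and compares it with $(4(4l)^D nr)^{r/2}$, the quantity from the first case of the proof. That this sum is the intended intermediate expression is an assumption made here; the function names are hypothetical. The comparison is expected to hold when $n > r(4l)^D$, because each summand is then at most $\binom{r}{u}((4l)^D nr)^{r/2}$ and $\sum_u \binom{r}{u} \le 2^r$.

```python
from math import comb

def selections_upper_count(D, n, r, l):
    """Sum over u <= r/2 of C(r,u) * n^u * ((4l)^D r)^(r-u), for even r."""
    c = (4 * l) ** D
    return sum(comb(r, u) * n**u * (c * r) ** (r - u) for u in range(r // 2 + 1))

def target_bound(D, n, r, l):
    """(4 (4l)^D n r)^(r/2) as an exact integer, for even r."""
    return (4 * (4 * l) ** D * n * r) ** (r // 2)

if __name__ == "__main__":
    D, n, r, l = 1, 100, 4, 1          # satisfies n > r (4l)^D = 16
    print(selections_upper_count(D, n, r, l), target_bound(D, n, r, l))
```

Using integers avoids floating-point rounding when $r$ is large; the restriction to even $r$ matches the even moments used later in the paper.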

Energy distribution of states that satisfy an exponential decay of correlation
Consider a state $\rho$ that satisfies $(C, l_0, \sigma)$-decay of correlation and the hamiltonian $H = \sum_{w \in \tilde{\mathcal{L}}_{D,L}} h_w$, where each term $h_w$ is $(2k+2)^D$-local, that is, it acts non-trivially only on sites in $S(w, k)$. Let $\mathrm{Tr}(\rho g_{w_1} g_{w_2} \ldots g_{w_r})$ For every ordered set $\{w_1, w_2, \ldots, w_r\}$ define the quantity $D(w_1, w_2, \ldots, w_r) \overset{\mathrm{def}}{=} \max_i (\min_{j \ne i} \|w_i - w_j\|)$. This is the distance of the farthest interaction from the rest of the interactions in the ordered set.
For an integer $l > 0$, define $T(l)$ as the collection$^4$ of all sets $\{w_1, w_2, \ldots, w_r\}$ that satisfy $D(w_1, w_2, \ldots, w_r) = l$. Now, fix a set $\{w_1, w_2, \ldots, w_r\} \in T(l)$. Without loss of generality, suppose that $w_1$ is an interaction at the distance $l$ from the rest of the interactions. The distance between the operators $g_{w_1}$ and $g_{w_i}$, for any $i \ne 1$, is at least $l - 2Dk$, as the distance from $w_i$ to any site in $S(w_i, k)$ is at most $Dk$. Then from $(C, l_0, \sigma)$-decay of correlation and the relation $\mathrm{Tr}(\rho g_{w_1}) = 0$, it holds that $\mathrm{Tr}(\rho g_{w_1} g_{w_2} \ldots g_{w_r}) \le \mathrm{Tr}(\rho g$

$^3$ This is a very crude upper bound and can be found as follows. The number of non-negative integers $\{a_1, a_2, \ldots, a_D\}$ such that $\sum_i a_i \le 2l$ is at most $(2l)^D$. Thus, the number of integers $\{a_1, a_2, \ldots, a_D\}$ such that $\sum_i |a_i| \le 2l$ is at most $2^D (2l)^D$.
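The quantity $D(w_1, \ldots, w_r)$ is easy to evaluate directly. The sketch below computes it for tuples of points (represented as coordinate tuples, an illustrative choice) and groups all ordered $r$-tuples of a toy one-dimensional dual lattice by its value. The helper names and the toy lattice are assumptions for illustration only.

```python
from collections import Counter
from itertools import product

def dist(w, w2):
    """1-norm distance between two points."""
    return sum(abs(a - b) for a, b in zip(w, w2))

def farthest_isolation(ws):
    """D(w_1..w_r) = max_i min_{j != i} ||w_i - w_j||."""
    r = len(ws)
    return max(
        min(dist(ws[i], ws[j]) for j in range(r) if j != i)
        for i in range(r)
    )

def histogram_by_D(points, r):
    """Count ordered r-tuples of `points` grouped by their value of D."""
    return Counter(farthest_isolation(ws) for ws in product(points, repeat=r))

if __name__ == "__main__":
    print(farthest_isolation([(0.5,), (1.5,), (4.5,)]))   # third point is isolated
    pts = [(j + 0.5,) for j in range(4)]                   # toy 1D dual lattice
    print(sorted(histogram_by_D(pts, 2).items()))
```

With this definition, an ordered set satisfies property P($l$) exactly when $D(w_1, \ldots, w_r) \le l$, which is why the tuples can be organized by the value of $D$.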
Theorem 4.2. Consider a quantum state $\rho$ that satisfies $(C, l_0, \sigma)$-decay of correlation and has average energy $\langle H\rangle_\rho$.

Dσn
, or equivalently (4l 0 +8Dk) D+1 Dna 2 σ < 1

Then we set $r = 2\left\lceil \left(\frac{na^2}{8e(D\sigma)^D}\right)^{\frac{1}{D+1}} \right\rceil$, where $\lceil \cdot \rceil$ denotes the ceiling operation (rounding to the nearest larger integer), to obtain The last inequality follows from the assumption: Last inequality follows from the assumption:

For the second part of the theorem, consider the hamiltonian $\Pi'_{\ge f}$ be the projector onto the subspace with eigenvalues of $H'$ larger than $f$. The same analysis as above for $H'$ in place of $H$, along with the relation $\Pi'_{\ge f} = \Pi_{\le n-f}$, completes the proof.

Energy distribution of a product state
In this section, we introduce the physical set-up for Theorem 1.2 and also provide its proof.We shall continue using the notations H and h for the hamiltonian and its local term, as this notation is restricted only to this section.
Consider a collection $C$ of spins, such that a $d$-dimensional Hilbert space $\mathcal{H}^d_s$ is associated to each spin $s \in C$. Let the full Hilbert space $\mathcal{H}$ be defined as $\mathcal{H} = \bigotimes_s \mathcal{H}^d_s$. For an integer $k > 0$, let $S_k$ be the set of all subsets of $C$ of size at most $k$. For an integer $m > 0$, let $W_{k,m}$ be a subset of $S_k$ defined as follows (note that $W_{k,m}$ is also a set of subsets of $C$): for each $w \in W_{k,m}$, the number of $w' \in W_{k,m}$ such that $|w' \cap w| > 0$ is at most $m$. For each $w \in W_{k,m}$, let $N(w)$ be the set of all $w' \in W_{k,m}$ such that $|w \cap w'| > 0$. Elements of $N(w)$ shall be referred to as neighbours of $w$. The set-up has been depicted in Figure 3. Let the hamiltonian $H$ be defined as $H = \sum_{w \in W_{k,m}} h_w$, where $h_w$ acts non-trivially only on spins in $w$ and acts trivially on the rest of the spins. Further, we assume that $\|h_w\|_\infty \le 1$. The definition of $W_{k,m}$ thus translates to the assumption that:

1. Each 'local term' $h_w$ acts non-trivially on at most $k$ particles, and hence is $k$-local.

2. For each $h_w$, the number of $h_{w'}$ such that the supports of $h_w$ and $h_{w'}$ intersect is at most $m$.
Let $\rho \in \mathcal{H}$ be a product state, that is, $\rho = \prod_{s \in C} \rho_s$ and the support of each $\rho_s$ is exactly the spin $s$. Let the reduced density matrix of $\rho$ on a subset $T \subseteq C$ of spins be denoted in the usual way as $\rho_T$.
We bound the moment function $\mathrm{Tr}(\rho(H - \langle H\rangle_\rho)^r)$ for an even $r$ to be chosen later and use it to prove Theorem 1.2. Define $g_w \overset{\mathrm{def}}{=} h_w - \langle h_w\rangle_\rho I$. We shall prove the following lemma.
Using $\mathrm{Tr}(\rho g_w) = \mathrm{Tr}(\rho_w g_w) = 0$, we observe that the term $\mathrm{Tr}(\rho g_{w_1} g_{w_2} \ldots g_{w_r})$ is non-zero only if the ordered set $\{w_1, w_2, \ldots, w_r\}$ satisfies the following property Q: for every $w_i$, there exists a $w_j$ with $j \ne i$ such that $|w_i \cap w_j| > 0$. In other words, there is a $w_j \in W_{k,m}$ such that $w_i \in N(w_j)$. Let the number of ordered sets $\{w_1, w_2, \ldots, w_r\}$ that satisfy the above property be $N_{k,m}(n, r)$. This gives us $\mathrm{Tr}(\rho($

Proof. Lemma 5.1 gives the following upper bound on the $r$-th moment: Using Fact 2.1, we have For the second part of the theorem, consider the hamiltonian Let $\Pi'_{\ge f}$ be the projector onto the subspace with eigenvalues of $H'$ larger than $f$. The same analysis as above for $H'$ in place of $H$ gives This completes the proof since $\Pi'_{\ge f} = \Pi_{\le n-f}$.

Restatement of Theorem 5.3 in terms of number of spins
Proof. We set $\varepsilon \overset{\mathrm{def}}{=} na$ as the energy with respect to $H$. Then the bound in Theorem 5.3 can be restated as: The relation between $N$ and $n$ can be computed as follows. To each local term $h_w$, one can associate exactly $k$ spins on which $h_w$ acts non-trivially. On the other hand, to each spin $s$, one can associate at most $g$ local terms that contain $s$ in their support. From the first argument, the number of associations is exactly $k \cdot n$, whereas from the second argument, the number of associations is at most $g \cdot N$. Thus, $g \cdot N \ge k \cdot n$, which implies $n \le \frac{gN}{k}$. Also, $m \le k \cdot g$, since each local term is supported on $k$ spins, and each of these spins is in the support of at most $g$ other local terms. Collectively we obtain $nm^2 \le Ng^3k$ and our bound takes the form: This completes the proof.
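The double-counting argument in this proof can be illustrated on a concrete collection of local terms. The sketch below represents each local term by the set of spins it acts on (a hypothetical chain of four 3-local terms, in the spirit of Figure 3), computes $n$, $N$, $k$, $g$ and $m$, and checks the three inequalities. Neighbours are counted excluding the term itself, as in Figure 3; this convention and the function name are assumptions made for illustration.

```python
def locality_parameters(terms):
    """Return (n, N, k, g, m) for local terms given as collections of spin labels."""
    terms = [frozenset(t) for t in terms]
    spins = set().union(*terms)
    n = len(terms)                                   # number of local terms
    N = len(spins)                                   # number of spins
    k = max(len(t) for t in terms)                   # locality
    # g: maximum number of local terms acting on any single spin
    g = max(sum(1 for t in terms if s in t) for s in spins)
    # m: maximum number of other terms whose support intersects a given term
    m = max(
        sum(1 for j, t2 in enumerate(terms) if j != i and t & t2)
        for i, t in enumerate(terms)
    )
    return n, N, k, g, m

if __name__ == "__main__":
    # Hypothetical example: four triangles sharing one corner with the next.
    terms = [{0, 1, 2}, {2, 3, 4}, {4, 5, 6}, {6, 7, 8}]
    n, N, k, g, m = locality_parameters(terms)
    print(n, N, k, g, m)
    print(k * n <= g * N, m <= k * g, n * m**2 <= N * g**3 * k)
```

The first inequality uses that every term is exactly $k$-local, as assumed in the corollary; for terms of mixed sizes only the weaker count with the actual support sizes would apply.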
The above upper bound may be compared to Theorem 7 in [Kuw16]. In this reference, the notion of $g'$-extensivity has been introduced (Definition 2, [Kuw16]), which is analogous to the locality parameter $g$ defined above. It is defined as follows: a local hamiltonian $H$ is $g'$-extensive if for every spin $s$, we have $\sum_{w \in W_{k,m} : s \in w} \|h_w\| \le g'$. Using this, the following theorem has been shown in [Kuw16]:

Theorem 5.5 (Informal version of Theorem 7, [Kuw16]). Given a $g'$-extensive local hamiltonian with locality $k$, it holds that where $c$ is an $O(1)$ constant that depends only on $k, g'$.

We observe that Equation (6) achieves a marginally better bound whenever the norm of each local term $h_w$, that is $\|h_w\|$, is a constant independent of $w$. In such a case, $g'$ and $g$ are the same up to the norm of the local terms. In case the normalizations of the local terms are different, it is not clear how $g, g'$ are related to each other. In such a case Equation (6) and Theorem 5.5 may be viewed as complementary results.

Conclusion
We have shown upper bounds on the tail of the energy distribution of states that satisfy exponential decay of correlation and of product states, with respect to a local hamiltonian. The main technical tool we use is a combinatorial lemma that gives a non-trivial upper bound on the moments of the energy distribution. The results may have applications in the study of thermalization of many body quantum systems and also for many body localization, as noted in the Introduction. The main questions that we leave open are connected to the tightness of our bounds, as we discuss below.
The bounds presented in Theorem 1.2 can only be improved up to constants, since the classical Chernoff bound also exhibits a Gaussian decay, which is known to be tight. A more interesting situation occurs with the bounds presented in Theorem 1.1. In a one-dimensional spin chain, our bound decays exponentially with the energy. For gapped ground states, this is very similar to the behaviour noted in [KAAV17] (Section 5) using completely different techniques. This suggests that gapped ground states (such as the ground state of the transverse field Ising model, which is exactly solvable) are strong candidates for the study of tightness of the above results. Our result for higher dimensions appears to be much weaker than those obtained in [KAAV17] (Section 5) for gapped ground states, and we expect further improvement using better combinatorial arguments.
Another interesting question concerns Matrix product states (with constant bond dimension), which are defined on a one-dimensional spin chain. It is well known that under reasonable assumptions (see Section 5.1.1, [Oru14]) Matrix product states satisfy exponential decay of correlation. Furthermore, it has already been shown in [Oga10] that given a Matrix product state $\rho$, if $n$ is large enough and the energy $\varepsilon \approx O(n)$, it holds that $\mathrm{Tr}(\rho\Pi_{\ge \langle H\rangle_\rho + \varepsilon}) \le e^{-O(n)}$. It is a strong indication that our bound (which applies for all energies $\varepsilon > O(\sqrt{n})$) may be considerably improved for this special, but important, class of states.

Figure 2: A lattice $\mathcal{L}_{2,4}$ in dimension $D = 2$ with $L = 4$. Black dots represent the lattice sites. Blue dots represent the vectors of the dual lattice $\tilde{\mathcal{L}}_{2,4}$, which we call interactions. For the interaction $w$, the set $S(w, 0)$ is $\{v_1, v_2, v_3, v_4\}$. The distance between sites $v$ and $v_3$ is $\|v - v_3\| = 3$. The distance between interactions $w$ and $w'$ is $\|w - w'\| = 3$.

Figure 3: Collection of spins (blue dots) with local terms (gray triangles). There is no underlying lattice structure. Each local term is 3-local, thus $k = 3$, and each local term has at most 2 neighbours, thus $m = 2$. The set $W_{3,2}$ in the above figure is $\{w_1, w_2, w_3, w_4, w_5, w_6\}$, where each $w_i$ is the set of spins that form the vertices of the corresponding triangle. Note that for fixed $k, m$ (here $k = 3$, $m = 2$), there can be several choices of the set $W_{k,m}$ and each choice gives a different hamiltonian $H$. Neighbours of $w_4$ are $N(w_4) = \{w_3, w_5\}$. Each spin in the figure is in the support of at most 2 local terms. Thus we have $g = 2$, where $g$ is defined in Subsection 5.1.

Theorem 1.2 (Informal). Consider a product state $\rho$ with average energy $\langle H\rangle_\rho$. Let $\Pi_{\ge f}$ ($\Pi_{\le f}$) be the projection onto the subspace which is the union of eigenspaces of $H$ with eigenvalues $\ge f$ ($\le f$). It holds that $\mathrm{Tr}(\rho\Pi_{\ge \langle H\rangle_\rho + na}) \le e$

$\mathrm{Tr}(H^r\rho)$ of the energy distribution and then use Markov's inequality to upper bound the desired probability. Without loss of generality, we can assume that $H = \sum_w h$

Lemma 3.1) which answers the following question: if we expand $H^r$ as a sum of products of local terms, that is $H^r = \sum_{w_1, w_2, \ldots, w_r} h_{w_1} h_{w_2} \ldots h_{w_r}$, how many terms $h_{w_1} h_{w_2} \ldots h_{w_r}$ make a non-negligible (or non-zero) contribution to the moment generating function? We observe that the terms that make a non-negligible contribution possess a common property: there is no $h_{w_i}$ which is supported 'far' from all of $h_{w_1}, h_{w_2}, \ldots, h_{w_{i-1}}, h_{w_{i+1}}, \ldots, h_{w_r}$. Making the notion of 'far' precise, we compute the number of such terms in Lemma 3.1 and use it to bound the moment generating function.
We introduce a new parameter that captures the number of local terms that act on any given spin: $g \overset{\mathrm{def}}{=} \max_s g_s$, where $g_s$ is the maximum number of local terms that act non-trivially on spin $s$. Now, we prove Corollary 1.3. Its formal statement is as follows, where we also assume that each local term is exactly $k$-local.

Let the hamiltonian $H$ be such that each term $h_w$ has locality equal to $k$. Let $N \overset{\mathrm{def}}{=} |C|$ be the number of spins. Given the product state $\rho = \prod_{s \in C} \rho_s$ with average energy $\langle H\rangle_\rho$, consider a real number $\varepsilon \ge 8eg^3kN$. It holds that $\mathrm{Tr}(\rho\Pi_{\ge \langle H\rangle_\rho + \varepsilon}) \le e$