Determining system Hamiltonian from eigenstate measurements without correlation functions

Local Hamiltonians arise naturally in physical systems. Despite their seemingly `simple' local structure, exotic features such as nonlocal correlations and topological order are exhibited in the eigenstates of these systems. Previous approaches to recovering a local Hamiltonian from measurements on an eigenstate $|\psi\rangle$ require knowledge of nonlocal correlation functions. In this work, we develop an algorithm that determines a local Hamiltonian from only local measurements on $|\psi\rangle$, by reformulating the task as an unconstrained optimization problem over a target function of the Hamiltonian parameters, with only a polynomial number of parameters in the system size. We also develop a machine-learning-based method to compute the first-order gradients used in the algorithm. Our method is tested numerically on randomly generated local Hamiltonians and returns reconstructions of the desired accuracy. Our result sheds light on the fundamental question of how a single eigenstate can encode the full system Hamiltonian, indicating the somewhat surprising answer that, for generic cases, local measurements alone suffice without additional assumptions.

In the past few years, rapidly developing machine learning techniques have allowed us to study these topics in a new manner [15][16][17][18][19][20]. Empowered by traditional optimization methods and contemporary machine learning techniques, we map the task of revealing the information encoded in a single eigenstate of a local Hamiltonian to an optimization problem.
For a local Hamiltonian $H = \sum_i c_i A_i$ with the $A_i$ being local operators, it is known that a single (non-degenerate) eigenstate $|\psi\rangle$ can encode the full Hamiltonian $H$ in certain cases [21][22][23], for instance when the expectation values of the $A_i$ on $|\psi\rangle$, $a_i = \langle\psi|A_i|\psi\rangle$, are given and further assumptions are satisfied. A simple case is when $|\psi\rangle$ is the unique ground state of $H$; the corresponding density matrix of $|\psi\rangle$ can then be represented in the thermal form

$$|\psi\rangle\langle\psi| = \frac{e^{-\beta H}}{\operatorname{tr}(e^{-\beta H})} \qquad (1)$$

for sufficiently large $\beta$. This implies that the Hamiltonian $H$ is directly related to $|\psi\rangle$, and hence can be determined from the measurement results $a_i$, using algorithms developed in the literature [24,25] for practical cases. Because the $A_i$ are local operators, the number of parameters of $H$ (i.e., the number of $c_i$) is only polynomial in the system size. We remark that the problem of finding $H$ is closely related to the problem of determining quantum states from local measurements [15,16,26-33], and it also has a natural connection to the study of the quantum marginal problem [34-37], as well as to its bosonic/fermionic version, known as the N-representability problem [34,38-44]. (*Corresponding author: zengb@ust.hk)
For a wavefunction $|\psi\rangle$ that is an eigenstate (i.e., not necessarily a ground state), one interesting situation is related to the eigenstate thermalization hypothesis (ETH) [45][46][47][48][49]. When the ETH is satisfied, the reduced density matrix on a small subsystem of a pure, finite-energy-density eigenstate asymptotically equals the corresponding thermal reduced density matrix [21]. In other words, Eq. (1) will hold for some eigenstate $|\psi\rangle$ of the system in this case, and one can use a similar algorithm [24,25] to find $H$ from the $a_i$, as in the case of ground states. Another situation previously discussed is that if the two-point correlation functions $\langle\psi|A_iA_j|\psi\rangle$ are known, one can reproduce $H$ without ETH being satisfied [22,23]. Once the two-point correlation functions are known, one can again use an algorithm to recover $H$ from the correlation functions, as in the case of ground states. However, in practice, the nonlocal correlation functions $\langle\psi|A_iA_j|\psi\rangle$ are not easy to obtain.
In this paper, we answer a simple but significant question: can we determine a local Hamiltonian (the $c_i$) from only the local information (the $a_i$) of any one of its eigenstates $|\psi\rangle$, without further assumptions? Surprisingly, we show that knowledge of the $a_i$ alone suffices to determine $H$. Our proposed algorithm is based on a key observation: the eigenstate $|\psi\rangle$ of $H$ is the unique ground state of the Hamiltonian $\tilde{H}^2 = (H - \lambda I)^2$, where $I$ is the identity operator and $\lambda$ is the eigenvalue associated with $|\psi\rangle$. Recovering $H$ from the set of expectation values $a_i$ thus turns into finding $\tilde{H}^2$ from the expectation values $a_i$ of its ground state. In addition, the ground state of $\tilde{H}^2$ can be written in thermal form for sufficiently large $\beta$. Hence, our problem can be reformulated as an unconstrained optimization problem: minimizing an objective function $f(\vec{x})$ of the Hamiltonian parameters, where $\vec{x}$ is the estimate of $\vec{c}$, and $\vec{x} = \vec{c}$ when $f(\vec{x})$ is minimized. We implement our algorithm in a way that combines the power of traditional optimization methods and modern machine learning techniques. Knowing the $A_i$ and $a_i$, our algorithm begins with a random guess of $\vec{c}$. We then update $\vec{x}$ using the gradient-based Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [50][51][52][53] in each iteration. Let $\vec{x}^{(k)}$ denote the estimate of $\vec{c}$ after the $k$-th iteration, with $\vec{x}^{(0)}$ the initial random guess. In each iteration, we update $\vec{x}^{(k)}$ according to the gradient. The parameter $d_i^{(k)}$ controls the direction and scale of the iteration (in machine learning, it is often called the learning rate). It can be calculated from the Hessian matrix, which is extremely hard to obtain in our problem. To resolve this, we developed a new technique to obtain the first-order derivatives (the gradients $\partial f/\partial x_i^{(k)}$) based on a machine learning gradient technique: automatic differentiation [54]. The procedure of our approach is depicted in Fig. 1. (arXiv:1903.06569v2 [quant-ph], 5 Sep 2019)
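The key observation can be checked numerically in a few lines. The sketch below (illustrative names, assuming only NumPy, not the authors' code) builds a random Hermitian $H$, picks an excited eigenstate, and verifies that it is the ground state, with eigenvalue zero, of $(H - \lambda I)^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                          # a 3-qubit Hilbert space
M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
H = (M + M.conj().T) / 2                       # random Hermitian "Hamiltonian"

evals, evecs = np.linalg.eigh(H)
lam, psi = evals[3], evecs[:, 3]               # an *excited* eigenstate and its eigenvalue

Ht2 = (H - lam * np.eye(d)) @ (H - lam * np.eye(d))   # tilde-H^2 = (H - lam*I)^2
w, v = np.linalg.eigh(Ht2)
assert np.isclose(w[0], 0.0, atol=1e-9)               # ground energy of tilde-H^2 is 0
assert np.isclose(abs(np.vdot(v[:, 0], psi)), 1.0)    # its ground state is |psi>, up to phase
```

Since $(E_j - \lambda)^2 > 0$ for every other eigenvalue $E_j$ of a generic (non-degenerate) $H$, the ground state of $(H-\lambda I)^2$ is unique.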
We test the algorithm in two scenarios: randomly generated generic operators $A_i$, and random local operators $A_i$. Our algorithm almost perfectly reproduces the $c_i$; the average fidelities in both cases are close to 1. Based on the reconstructed Hamiltonian $H$ (i.e., the $c_i$), one can also recover the eigenvalue of $|\psi\rangle$ from the $c_i$ and $a_i$, and the wavefunction itself from the eigenvectors of $H$. In the case where the $A_i$ are local operators, our method finds the $c_i$ from only local measurement results, shedding light on the correlation structures of eigenstates of local Hamiltonians.

II. ALGORITHM
We discuss our method in a general setting, where the Hamiltonian $H$ can be expressed in terms of a set of known Hermitian operators $(A_1, A_2, \ldots, A_m)$: $H = \sum_i c_i A_i$ with $\vec{c} = (c_1, c_2, \ldots, c_m)$. For an eigenstate $|\psi\rangle$ of $H$ with unknown eigenvalue $\lambda$, satisfying $H|\psi\rangle = \lambda|\psi\rangle$, we denote the measurement results by $a_i = \langle\psi|A_i|\psi\rangle$. Knowing only the measurement results $a_i$, our goal is to find the coefficients $\vec{c}$ and thereby determine $H$.
We observe that, even if $|\psi\rangle$ is not the ground state of $H$, it is the ground state of another Hamiltonian $\tilde{H}^2$, with $\tilde{H}$ given by $\tilde{H} = \sum_i c_i (A_i - a_i I)$. Then the density matrix of $|\psi\rangle$ (which is in fact of rank 1) can be written in the form of a thermal state $\rho(\vec{c}) = e^{-\beta\tilde{H}^2}/\operatorname{tr}(e^{-\beta\tilde{H}^2})$ for sufficiently large $\beta$, satisfying $\operatorname{tr}(A_i\rho(\vec{c})) = a_i$ and $\operatorname{tr}(\tilde{H}^2\rho(\vec{c})) = 0$.
With these conditions in mind, we are ready to reformulate our task as an optimization problem with the objective Eq. (11). For the convenience of numerical implementation, we let $\tilde{H}_\beta = \sqrt{\beta}\,\tilde{H}$; the thermal state $\rho(\vec{c})$ then becomes $e^{-\tilde{H}_\beta^2}/\operatorname{tr}(e^{-\tilde{H}_\beta^2})$. Consequently, the objective function Eq. (11) can be rewritten in the equivalent form

$$f(\vec{c}) = \sum_i \left(\operatorname{tr}(A_i\rho(\vec{c})) - a_i\right)^2 + \operatorname{tr}(\tilde{H}_\beta^2\,\rho(\vec{c})).$$

We aim to reach $f(\vec{c}) = 0$ by minimizing $f(\vec{c})$. In practice, we terminate the iterations when $f(\vec{c})$ is smaller than a fixed small value $\epsilon$, and denote the corresponding optimization result by $\vec{c}_{\mathrm{opt}}$. Since we reformulated the objective in terms of $\tilde{H}_\beta$, the result $\vec{c}_{\mathrm{opt}}$ achieved by our optimization is actually $\sqrt{\beta}\cdot\vec{c}$. Theoretically, $\beta$ should tend to infinity; numerically, when our algorithm achieves high fidelity, the resulting $\beta$ is still an adequately large finite constant. A more detailed discussion can be found in Section A.
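As a concrete sketch (an illustration under our own naming conventions, not the authors' code), the objective with $\sqrt{\beta}$ absorbed into the parameters can be written with NumPy and SciPy as:

```python
import numpy as np
from scipy.linalg import expm

def objective(x, A_ops, a_vals):
    """f(x) = sum_i (tr(A_i rho) - a_i)^2 + tr(Ht_beta^2 rho),
    where rho = exp(-Ht_beta^2)/tr(...) and sqrt(beta) is absorbed into x."""
    d = A_ops[0].shape[0]
    Ht = sum(xi * (Ai - ai * np.eye(d)) for xi, Ai, ai in zip(x, A_ops, a_vals))
    Ht2 = Ht @ Ht                                  # Ht_beta^2
    rho = expm(-Ht2)
    rho /= np.trace(rho).real                      # thermal state at beta = 1
    penalty = sum((np.trace(Ai @ rho).real - ai) ** 2
                  for Ai, ai in zip(A_ops, a_vals))
    return penalty + np.trace(Ht2 @ rho).real
```

At the true coefficients scaled by a large $\sqrt{\beta}$, this objective is close to zero; at a generic point it is not.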
Since all the constraints are absorbed into Eq. (9), minimizing $f(\vec{c})$ is an unconstrained minimization problem, for which plenty of standard algorithms exist. In our setting, computing the second-order derivative information of $f(\vec{c})$ is quite complicated and expensive. Therefore, instead of Newton's method, which requires the Hessian matrix of second derivatives of the objective function, we choose the quasi-Newton method with the BFGS formula to approximate the Hessian [50][51][52][53]. The MATLAB function FMINUNC, which implements the quasi-Newton algorithm, realizes this starting from an initial random guess of the $c_i$. When the quasi-Newton algorithm fails (converges to a local minimum), we restart with a different set of random initial values $c_i$ and optimize again until we obtain a sufficiently good solution.
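In Python, the same quasi-Newton loop with random restarts might look as follows (a sketch using SciPy's BFGS in place of MATLAB's FMINUNC; the objective, tolerance, and restart count are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def objective(x, A_ops, a_vals):
    # f(x) = sum_i (tr(A_i rho) - a_i)^2 + tr(Ht^2 rho), rho = exp(-Ht^2)/tr(...)
    d = A_ops[0].shape[0]
    Ht = sum(xi * (Ai - ai * np.eye(d)) for xi, Ai, ai in zip(x, A_ops, a_vals))
    rho = expm(-Ht @ Ht)
    rho /= np.trace(rho).real
    return (sum((np.trace(Ai @ rho).real - ai) ** 2 for Ai, ai in zip(A_ops, a_vals))
            + np.trace(Ht @ Ht @ rho).real)

def recover(A_ops, a_vals, eps=1e-6, max_restarts=20, seed=0):
    """Quasi-Newton (BFGS) with random restarts when stuck in a local minimum."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(max_restarts):
        x0 = rng.normal(size=len(A_ops))          # random initial guess of c
        res = minimize(objective, x0, args=(A_ops, a_vals), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
        if best.fun < eps:                        # good enough solution found
            break
    return best
```

SciPy's BFGS estimates gradients by finite differences when none are supplied; the paper replaces that step with its own automatic-differentiation scheme.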
Figure 1. Start with the $a_i$ and $A_i$. We update $\vec{x}^{(k)}$ using the gradients; the learning rates $d_i$ are approximated using the BFGS algorithm. We repeatedly update along the gradients until $\|\nabla f(\vec{x}^{(k)})\|_2$ falls below a small threshold, then proceed to check $f(\vec{x}^{(k)})$. If $f(\vec{x}^{(k)})$ is smaller than $\epsilon$, the solution is found, i.e., $\vec{c} = \vec{x}^{(k)}$. Otherwise, we repeat the procedure with another random guess of the $c_i$.
The BFGS algorithm is a typical optimization algorithm that requires gradients of the objective function with respect to the $c_i$. The form of the objective function $f(\vec{c})$ is so complicated that computing the gradient is a difficult task. To resolve this, we borrow the methods of computational graphs and automatic differentiation from machine learning. The computational graph is shown in Fig. 2, where we show the intermediate functions and variables. Mathematically, the final gradients can be calculated via the chain rule. However, since some of the intermediate variables are complex-valued matrices, automatic differentiation toolboxes, which handle real functions of scalar variables, cannot be applied directly. To obtain the gradients, we must be very careful when differentiating the intermediate functions, especially those with matrix variables.
In order to compute the first-order derivatives, we developed our own machine-learning-based technique, whose computational graph is depicted in Fig. 2. In the computational graph, each node represents an intermediate function. In principle, the chain rule can be applied, so it may seem that we can compute the gradient using existing deep learning frameworks. However, the automatic differentiation tools embedded in frameworks such as PyTorch [55] and TensorFlow [56] cannot handle our problem, for two reasons: first, most of these tools only deal with real numbers, while in quantum physics we often deal with complex numbers; second, and most importantly, some of the intermediate functions in the computational graph take matrices (in physics, operators) as their variables, whereas in neural networks the variables are plain real numbers. When differentiating with respect to a matrix variable, one has to be careful with the chain rule, since a matrix generally does not commute with its derivative. For example, even for the simple node $v_3 = \tilde{H}^2 = v_2^2$, the derivative is $\partial v_3/\partial c_k = (\partial v_2/\partial c_k)\, v_2 + v_2\, (\partial v_2/\partial c_k)$, which cannot be collapsed to $2 v_2\, \partial v_2/\partial c_k$ as in the scalar case. Therefore, based on the ideas of automatic differentiation, we developed our own way to obtain the gradients. Details of the gradient calculation, as well as the calculation of derivatives of matrix exponential functions, are given in Section B.
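The non-commutativity issue can be seen concretely in a standalone numerical check (not the paper's code): for a matrix-valued node $v_3 = v_2^2$, the correct product-rule derivative differs from the naive scalar-style answer $2 v_2 v_2'$:

```python
import numpy as np

rng = np.random.default_rng(1)
V = rng.normal(size=(4, 4))     # v2, a matrix-valued intermediate
dV = rng.normal(size=(4, 4))    # its derivative w.r.t. some parameter c_k

correct = dV @ V + V @ dV       # product rule: d(v2^2) = (dv2) v2 + v2 (dv2)
naive = 2 * V @ dV              # scalar-style chain rule, wrong for matrices
assert not np.allclose(correct, naive)   # generic matrices do not commute

# Finite-difference check of the product rule on t -> (V + t*dV)^2
t = 1e-6
fd = ((V + t * dV) @ (V + t * dV) - V @ V) / t
assert np.allclose(fd, correct, atol=1e-4)
```

The finite-difference probe agrees with the product-rule expression but not with the naive one, which is why the matrix nodes in the graph need special handling.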

III. RESULTS
We remark that a non-degenerate eigenstate $|\psi\rangle$ of $H$ is in fact the unique ground state, with eigenvalue zero, of another Hamiltonian $\tilde{H}^2$, where $\tilde{H} = \sum_i c_i (A_i - a_i I)$. We then show that this is enough to reconstruct $H$ using only the information of the $a_i$.
Notice that since $|\psi\rangle$ is the unique ground state of $\tilde{H}^2$, one can represent $|\psi\rangle$ in terms of the $c_i$ via its thermal form for sufficiently large $\beta$. This reformulates the task of finding the Hamiltonian $H$ (i.e., finding $\vec{c} = (c_1, c_2, \ldots)$) as an unconstrained optimization problem for the target function

$$f(\vec{c}) = \sum_i \left(\operatorname{tr}(A_i\rho(\vec{c})) - a_i\right)^2 + \operatorname{tr}(\tilde{H}^2\rho(\vec{c})).$$
The minimum of $f(\vec{c})$ is attained at the $\vec{c}$ for which $\tilde{H}^2$ has $|\psi\rangle$ as its unique ground state. Our method applies in the general case whenever the $A_i$ are known, finding the $c_i$ from the $a_i$ alone. The problem is now an unconstrained optimization problem, for which a gradient-based algorithm is usually employed. For better convergence, such algorithms benefit from the Hessian matrix of second-order derivatives; obtaining first- and second-order derivatives of an objective is also a fundamental task in the training stage of machine learning. However, the functional form of $f(\vec{c})$ is rather complicated, so its derivatives are hard to obtain. One commonly used algorithm for unconstrained optimization is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm [50][51][52][53], which approximates the Hessian matrix without requiring the second-order derivatives. We still, however, have to calculate the first-order derivatives (i.e., the gradients).
The recent rapid progress of machine learning has not only proved useful in fields such as image recognition and natural language processing but has also provided new methods and tools for optimization. To obtain gradients automatically and easily for a huge variety of neural networks and cost functions, a technique called automatic differentiation was developed. This technique applies to many problems that require gradient calculations and is already embedded in platforms such as TensorFlow and PyTorch. For our problem, due to the complexity of our objective function, these tools cannot be applied directly. We therefore developed our own method for calculating the gradients based on this technique.
Gradient-based algorithms share a common obstacle: the results may fall into local minima. We therefore try different initial values in an attempt to find the global minimum. The diagram of our approach is shown in Fig. 1; a detailed description is given in the Methods section (Section II) and the references therein.

A. General settings
We test our algorithm in three steps. First, we randomly generate several $A_i$ and $c_i$, hence the Hamiltonian $H_{rd}$ and its eigenstates. Second, for each Hamiltonian, we randomly choose one eigenstate $|\psi\rangle$, giving $a_i = \langle\psi|A_i|\psi\rangle$ and $\rho = |\psi\rangle\langle\psi|$. We then run our algorithm to find a Hamiltonian $H_{al}$. Comparing $H_{al}$ and $H_{rd}$ tells us whether the approach works. The algorithm has been tested for two cases: the $A_i$ being generic operators, and being local operators.
To compare $H_{al}$ and $H_{rd}$, we need a measure of the distance between the two Hamiltonians; we use the definition of fidelity from [57]. Notice that if $|\psi\rangle$ is an eigenstate of $H$, it is also an eigenstate of $bH$ for any $b \in \mathbb{R}$. Our definition (Eq. (12)) gives the same result when all $c_i$ are multiplied by the same constant.

B. Results with general operators
When implementing our method on generic operators, we randomly generate three Hermitian operators $A_i$ and fix them. We also randomly generate several real constants $c_i$, then assemble the $A_i$ and $c_i$ into Hamiltonians. After diagonalizing $H$, we choose one eigenvector $|\psi\rangle$ and calculate the expectation values $a_i = \langle\psi|A_i|\psi\rangle$. For $n$-qubit systems, the dimension of $H$ is $d = 2^n$. We tested the cases $n = 3, 4, 5$ and $6$. For each $n$, we generated 200 data points. Each test set is $\{c_i, A_i, a_i \mid i = 1, 2, 3\}$, where $c_i \in (0, 1)$ for $i = 1, 2, 3$ and the $A_i$ are randomly generated Hermitian matrices of dimension $d = 2^n$ [58]. With these data, $H_{rd} = \sum_i c_i A_i$ is obtained. Diagonalizing $H_{rd}$ and randomly choosing one eigenvector $|\psi\rangle$, we obtain $\{a_1, a_2, a_3\}$. The results of applying our algorithm to all cases are shown in Fig. 3. The final fidelities are larger than 99.8% for all tested cases. Although the fidelity decreases slightly with the number of qubits, the lowest fidelity (the largest tested case) is still higher than 99.8%. Because the eigenvector $|\psi\rangle$ is randomly chosen, our method does not depend on the energy level of the eigenstate.
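A test set of this kind can be generated in a few lines (a sketch with illustrative names, assuming only NumPy):

```python
import numpy as np

def random_instance(n_qubits, m=3, seed=0):
    """Generate one test set {c_i, A_i, a_i}: random Hermitian A_i,
    c_i in (0,1), and a_i measured on a randomly chosen eigenvector."""
    rng = np.random.default_rng(seed)
    d = 2 ** n_qubits
    A = []
    for _ in range(m):
        M = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
        A.append((M + M.conj().T) / 2)             # random Hermitian operator
    c = rng.uniform(0.0, 1.0, size=m)
    H = sum(ci * Ai for ci, Ai in zip(c, A))
    evals, evecs = np.linalg.eigh(H)
    k = rng.integers(d)                            # random energy level
    psi = evecs[:, k]
    a = [np.vdot(psi, Ai @ psi).real for Ai in A]  # a_i = <psi|A_i|psi>
    return c, A, a, evals[k], psi
```

A quick sanity check is that $\sum_i c_i a_i = \langle\psi|H|\psi\rangle = \lambda$, the eigenvalue of the chosen eigenvector.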

C. Results with local operators
In this section, we report our results for systems with a 2-local interaction structure. We tested two different structures, shown in Fig. 4(a) and Fig. 4(d). Each circle represents a qubit on a lattice, and each line represents an interaction between the two connected qubits.
The fully-connected 4-qubit system is shown in Fig. 4(a). The Hamiltonian can be written as $H = \sum_{i<j} c_{ij} A_{ij}$, where the $c_{ij}$ are real parameters and the $A_{ij}$ are randomly generated 2-local operators. One eigenstate out of the 16 is randomly chosen as the state $|\psi\rangle$. Our algorithm has been tested on 800 such Hamiltonians.
We also test the 7-qubit chain model shown in Fig. 4(d). Similarly, the Hamiltonian can be written as $H = \sum_i c_{i,i+1} A_{i,i+1}$, where the $c_{i,i+1}$ and $A_{i,i+1}$ are the parameters and the 2-local interactions. We randomly generated 40 such Hamiltonians and applied our algorithm to them. The results for these two 2-local models are shown in Fig. 4. Our algorithm recovered the Hamiltonians with high fidelity in both cases: the average fidelity for the 4-qubit (7-qubit) system is 99.99% (99.73%). As the dimension of the system increases, the fidelity between $H_{rd}$ and $H_{al}$ decreases slightly. The histogram of fidelities shows that, for most data points, the fidelity is very close to 1. Our algorithm almost perfectly recovers these 2-local Hamiltonians from measurement outcomes on a randomly chosen eigenvector.
We also examine the effectiveness of our algorithm for different eigenvectors of the same Hamiltonians. Fig. 5 shows the average fidelities between $H_{rd}$ and $H_{al}$ in the 4-qubit case for each choice of eigenvector. These average fidelities are higher than 99.9% for all 16 eigenvectors. Hence, the effectiveness of our algorithm is independent of the energy level.

IV. DISCUSSION
In this work, we discuss the problem of reconstructing a system Hamiltonian $H = \sum_i c_i A_i$ using only the measurement data $a_i = \langle\psi|A_i|\psi\rangle$ of one eigenstate $|\psi\rangle$ of $H$, by reformulating the task as an unconstrained optimization problem for a target function of the $c_i$. We numerically tested our method both for randomly generated $A_i$ and for the case where the $A_i$ are random local operators. In both cases, we obtain good fidelities for the reconstructed $H_{al}$. Our results are somewhat surprising: local measurements on one eigenstate are enough to determine the system Hamiltonian for generic cases, with no further assumptions needed.

Figure 5. The average fidelities obtained by applying our algorithm to eigenstates at different energy levels in the 4-qubit case. The average fidelities for all energy levels are higher than 99.97%. As expected, our algorithm is effective for any eigenstate.
We also remark that, since our method almost perfectly recovers the Hamiltonian $H$, the information encoded in $H$, such as the eigenstate $|\psi\rangle$ itself (though described by exponentially many parameters in the system size), can in principle also be revealed. This builds a bridge from our study to quantum tomography and other topics related to local Hamiltonians. Empowered by traditional optimization methods and machine learning techniques, our algorithm could be applied to various quantum physics problems, such as quantum simulation, quantum lattice models, and adiabatic quantum computation.
Our discussion also raises many new questions. A straightforward one is whether other methods can be used for this reconstruction problem, and how their efficiency and stability compare with the method used in this work. One may also wonder how the information of "being an eigenstate" helps to determine a quantum state locally, which is generically the case only for ground states, and how this information could help quantum state tomography in a more general setting.

ACKNOWLEDGEMENTS

We thank Feihao Zhang, Lei Wang, Shilin Huang, and Dawei Lu for helpful discussions. S.-Y.H. is supported by the National Natural Science Foundation of China under Grant No. 11847154. N.C. and B.Z. are supported by NSERC and CIFAR.
It is easy to observe that $\tilde{H}^2$ and $\tilde{H}_\beta^2$ have the same eigenstate structure; the constant $\beta$ contributes only a constant factor to the magnitude of the eigenvalues, i.e., the eigenenergies of the given system. Theoretically, a thermal state tends to the ground state of the given Hamiltonian if and only if the temperature is zero (or, for a numerical algorithm, close to zero), which means $\beta = 1/kT$ goes to infinity. Numerically, $\beta$ only needs to be a sufficiently large positive number.
Let us denote the $i$-th eigenvalue of $\tilde{H}_\beta^2$ by $E_i$ and the energy gaps by $\Delta_i = E_{i+1} - E_i$. From the definition of $\tilde{H}_\beta^2$, the ground energy $E_g = E_1$ is always 0. In our experience, during the optimization process the eigenenergies, as well as the energy gaps, grow larger. From Fig. 6(a), we can see the energy gaps grow as the optimization proceeds; the corresponding $\beta$ is a finite, sufficiently large number.
On the other hand, we can also examine the ground-state probability of $\tilde{H}_\beta^2$ as the iteration proceeds. This probability can be expressed as $P(E = E_g) = e^{-E_g}/\sum_i e^{-E_i}$. Since the ground state of $\tilde{H}_\beta^2$ always has zero energy, the probability is $P(E = E_g) = 1/\sum_i e^{-E_i}$. The change of $P(E = E_g)$ with iterations is shown in Fig. 6(b): the probability finally goes to 1, which means the thermal state is (almost) the ground state when $\beta$ is a relatively large positive constant.
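The two expressions for the ground-state probability, $\langle g|\rho|g\rangle$ and $1/\sum_i e^{-\beta E_i}$, can be checked numerically. The sketch below uses an arbitrary positive-semidefinite stand-in for $\tilde{H}^2$ (not data from the paper) and varies $\beta$ directly:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(7)
M = rng.normal(size=(6, 6))
H = (M + M.T) / 2
lam0 = np.linalg.eigvalsh(H)[0]
Ht2 = (H - lam0 * np.eye(6)) @ (H - lam0 * np.eye(6))   # PSD, ground energy exactly 0

E, V = np.linalg.eigh(Ht2)
g = V[:, 0]                                             # ground state of Ht2
probs = []
for beta in [1.0, 10.0, 100.0, 1000.0]:
    rho = expm(-beta * Ht2)
    rho /= np.trace(rho)
    p_direct = float(np.real(g @ rho @ g))              # <g| rho |g>
    p_formula = 1.0 / np.sum(np.exp(-beta * E))         # 1 / sum_i e^{-beta E_i}
    assert np.isclose(p_direct, p_formula, atol=1e-8)   # the two expressions agree
    probs.append(p_direct)

assert probs == sorted(probs)   # ground-state weight grows monotonically with beta
assert probs[-1] > 0.9          # thermal state ≈ ground state at large *finite* beta
```

This mirrors the behavior in Fig. 6(b): the ground-state probability approaches 1 while $\beta$ remains a finite constant.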

General method
The derivatives in the computational graph shown in Fig. 2 are listed in Table I. For the reasons mentioned in the main text, we use forward propagation to calculate the gradients. In Table I, the derivatives fall into the following categories:
1. The variable(s) of the function are numbers. This is the simplest case; the derivatives are obtained simply via the chain rule.
2. The variables are matrices, but the function is the trace function tr. The derivatives are obtained by applying $\frac{d\,\operatorname{tr}(A)}{dx} = \operatorname{tr}\!\left(\frac{dA}{dx}\right)$.
3. Some functions take matrices as their variables but are still simple, e.g., $v_3 = v_2^2$. In this case, one must be careful with commutators.
4. $v_4 = e^{-\tilde{H}^2}$, whose gradients are very hard to calculate; this is discussed in detail below.
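Category 2 can be verified in a forward-mode fashion (a toy check with an illustrative linear parametrization, not one of the paper's graph nodes):

```python
import numpy as np

# Check d tr(A(x))/dx = tr(dA/dx) for A(x) = B + x*C, so dA/dx = C.
rng = np.random.default_rng(4)
B = rng.normal(size=(3, 3))
C = rng.normal(size=(3, 3))
x, h = 0.7, 1e-7

fd = (np.trace(B + (x + h) * C) - np.trace(B + x * C)) / h  # finite difference
assert np.isclose(fd, np.trace(C), atol=1e-5)               # equals tr(dA/dx)
```

Because the trace is linear, this rule propagates cleanly through the forward pass, unlike the matrix-power and matrix-exponential nodes of categories 3 and 4.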

Derivative of matrix exponentials
The derivative of the function $f(X) = \exp(X(c_k))$ with respect to $c_k$ can be written as

$$\frac{\partial e^{X(c_k)}}{\partial c_k} = \int_0^1 e^{\alpha X(c_k)}\,\frac{\partial X(c_k)}{\partial c_k}\,e^{(1-\alpha)X(c_k)}\,d\alpha, \qquad \text{(B1)}$$

where, in our case, $X(c_k) = -\tilde{H}(c_k)^2 = -v_3$. In general, the commutator $[X, \partial X/\partial c_k] \neq 0$, so the integral in Eq. (B1) cannot be computed naively. In Ref. [25], the authors approximate the integral by the values of the integrand at the upper and lower limits, which, in our experience, does not work. Here, we introduce a new way to calculate it. Let $B(\alpha)$ and $C(\alpha)$ be auxiliary matrix functions, where $B(0)$ is the matrix with all entries 0 and $C(0)$ is the identity matrix. With $B(1)$, we can obtain $A(1)$, and hence the integral in Eq. (B1).
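Whatever integrator one uses, Eq. (B1) can be validated numerically. The sketch below uses the standard block-matrix identity (a common alternative to the $B/C$ construction, not necessarily the paper's method): the upper-right block of $\exp\begin{pmatrix}X & X'\\ 0 & X\end{pmatrix}$ equals exactly the integral in Eq. (B1).

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
n = 4
X = 0.5 * rng.normal(size=(n, n))    # stand-in for X = -Htilde^2 (any square matrix works)
dX = 0.5 * rng.normal(size=(n, n))   # stand-in for dX/dc_k

# Block-matrix identity: expm([[X, dX], [0, X]]) = [[e^X, D], [0, e^X]],
# where D = \int_0^1 e^{aX} dX e^{(1-a)X} da is the Eq. (B1) integral.
blk = np.block([[X, dX], [np.zeros((n, n)), X]])
D = expm(blk)[:n, n:]

# Cross-check against a finite difference of t -> expm(X + t*dX)
t = 1e-6
fd = (expm(X + t * dX) - expm(X)) / t
assert np.allclose(D, fd, atol=1e-4)
```

Note that the naive endpoint approximation criticized above would replace $D$ by an average of $e^X\,dX$ and $dX\,e^X$, which fails precisely because $X$ and $dX$ do not commute.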