Method to solve quantum few-body problems with artificial neural networks

A machine learning technique to obtain the ground states of quantum few-body systems using artificial neural networks is developed. Bosons in continuous space are considered and a neural network is optimized in such a way that when particle positions are input into the network, the ground-state wave function is output from the network. The method is applied to the Calogero-Sutherland model in one-dimensional space and Efimov bound states in three-dimensional space.


Introduction
Quantum many-body problems are difficult because the size of the Hilbert space increases exponentially with the number of degrees of freedom. For example, in quantum spin problems, each spin is represented by two basis states, i.e., the spin-up and spin-down states, and hence the total number of basis states needed to represent the quantum state of N spins is 2^N. In numerical calculations, therefore, the amount of memory required to store the quantum states and the computational time required to handle them increase exponentially with N. This hinders precise numerical analysis of quantum many-body systems with large N. To circumvent this problem, various methods have been developed, such as the quantum Monte Carlo method,1) the density matrix renormalization group,2) and tensor networks.3) Recently, a method to solve quantum many-body problems using artificial neural networks was proposed.4) In this work, it was shown that quantum many-body states can be efficiently stored in an artificial neural network[5][6][7][8] called a restricted Boltzmann machine, where the number of parameters used to represent the neural network is much smaller than the size of the Hilbert space. Using this neural-network representation, it was demonstrated that the ground state and time evolution of quantum spin systems can be obtained.4,9) This method has subsequently been applied to the Bose-Hubbard model10) using feedforward neural networks and to the Fermi-Hubbard model.11) Deep neural networks with multiple hidden layers can efficiently represent quantum many-body states.12,13) Quantum many-body states in neural networks have also been studied from the perspectives of tensor networks[14][15][16][17] and quantum entanglement.18,19) All of the above-mentioned studies, as well as other studies on applications of neural networks to physics,[20][21][22][23] have considered spatially discrete systems, i.e., spins or particles on lattices.
In the present paper, by contrast, quantum many-body problems in continuous space are solved using artificial neural networks. We consider several interacting bosons; the positions of the particles x_1, x_2, ..., x_N, or appropriate functions of them, are input into the neural network, which is optimized to output the ground-state wave function ψ(x_1, x_2, ..., x_N). This contrasts with previous studies, in which discrete variables, such as spin states and particle numbers on lattices, were input into neural networks. It is well known that any continuous function can be represented by a neural network,24) and indeed it was demonstrated that a single-particle Schrödinger equation can be solved using a neural network.25) However, in continuous space as well, the size of the Hilbert space increases exponentially with the number of particles, and it is not obvious that such wave functions with many degrees of freedom can be efficiently represented by neural networks.
In the present paper, we will show that the ground-state wave functions of many-body problems in continuous space can be obtained by a method using artificial neural networks. The positions of particles are transformed appropriately and input into the neural network, which is optimized to output the ground-state wave function. We apply our method to the Calogero-Sutherland model in one-dimensional space and the obtained wave functions are shown to be close to the exact solution of the ground state. The method is also applied to bosons with resonant interaction in three-dimensional space, and the Efimov bound states are obtained.
The remainder of the paper is organized as follows. Section 2 describes a method to obtain the wave function of the many-body ground state using neural networks. Section 3 presents numerical results for one-dimensional and three-dimensional problems. Finally, Section 4 offers conclusions.

Method
We use a feedforward artificial neural network, as illustrated in Fig. 1. The input units u^in_1, u^in_2, ..., u^in_{N_in} are set to real numbers, where N_in is the number of input units. The value of each hidden unit is generated from all the input units through weights as

u^hid_j = Σ_{i=1}^{N_in} W^(1)_{ji} u^in_i + b^(1)_j, (1)

where the weights W^(1)_{ji} and biases b^(1)_j are real numbers. In our feedforward network, there is only a single output, whose value is given by

u^out = Σ_{i=1}^{N_hid} W^(2)_i f(u^hid_i), (2)

where N_hid is the number of hidden units, the weights W^(2)_i are real numbers, and f is the activation function. In this paper, we adopt the hyperbolic tangent, f(x) = tanh(x), as the activation function. We thus use a fully connected network with a single hidden layer. The number of parameters W^(1)_{ji}, b^(1)_j, and W^(2)_i in this neural network is (N_in + 2) N_hid.

We consider a system of N identical bosons. Particle positions are written as X = (x_1, x_2, ..., x_N), where each x_i is a scalar in one dimension and a vector in higher dimensions. We input ξ(X) = (ξ_1(X), ξ_2(X), ..., ξ_{N_in}(X)) into the neural network, where the ξ_i(X) are functions of the particle positions. These functions are chosen to facilitate the representation of many-body states in the neural network; they depend on the problem and will be specified later. The many-body wave function is given by the output unit as

ψ(X) = exp(u^out). (3)

This wave function is not normalized. Our aim is to optimize the network parameters in such a way that when ξ(X) is input into the neural network, the output ψ(X) = exp(u^out) provides an approximate ground-state wave function. In other words, we use the neural network as a variational wave function. Since the ground-state wave function of bosons can be taken to be positive everywhere, we use a single output unit, which gives a positive wave function ψ(X) = exp(u^out). If we use a network with two output units, complex-valued wave functions can be represented.
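As a concrete illustration, the single-hidden-layer network described above can be sketched in a few lines of NumPy. This is a minimal sketch; the variable names, sizes, and initialization scale below are our own illustrative choices, not values from the paper.

```python
import numpy as np

# Minimal sketch of the fully connected network with one hidden layer,
# tanh activation, and a single output u_out, so that psi(X) = exp(u_out).
# Sizes and initialization are assumptions for illustration only.

rng = np.random.default_rng(0)

N_in, N_hid = 3, 20
W1 = 0.1 * rng.standard_normal((N_hid, N_in))   # weights W^(1)_ji
b1 = np.zeros(N_hid)                            # biases b^(1)_j
W2 = 0.1 * rng.standard_normal(N_hid)           # weights W^(2)_i

def log_psi(xi):
    """u_out = sum_i W2_i * tanh(sum_j W1_ij * xi_j + b1_i)."""
    return W2 @ np.tanh(W1 @ xi + b1)

def psi(xi):
    """Unnormalized, everywhere-positive wave function psi = exp(u_out)."""
    return np.exp(log_psi(xi))

# the parameter count matches (N_in + 2) * N_hid
n_params = W1.size + b1.size + W2.size
assert n_params == (N_in + 2) * N_hid
assert psi(np.zeros(N_in)) > 0.0
```

The exponential output guarantees positivity of the wave function, as required for a bosonic ground state.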
Such two-output networks have been used previously.10,12) The expectation value of a quantity Â is calculated as follows. To deal with the large Hilbert space, the integration over X is performed using the Monte Carlo method. Through Metropolis sampling, we obtain a series of samples (X_1, X_2, ..., X_{N_sample}), where the probability P(X) of X being sampled is proportional to ψ²(X). Using these samples, the expectation value is calculated as

⟨Â⟩ = ∫ ψ(X) Â ψ(X) dX / ∫ ψ²(X) dX ≃ (1/N_sample) Σ_{n=1}^{N_sample} [Â ψ](X_n) / ψ(X_n). (4)

To minimize the energy E = ⟨Ĥ⟩ of the system, we need the derivative of the energy with respect to the network parameters,

∂E/∂W = 2( ⟨Ĥ O_W⟩ − ⟨Ĥ⟩⟨O_W⟩ ), (5)

where W is one of the network parameters (W^(1)_{ji}, b^(1)_j, W^(2)_i), Ĥ is the Hamiltonian of the system, O_W = ψ^{-1} ∂ψ/∂W, and the expectation values are evaluated as in Eq. (4). Using the gradient in Eq. (5), the network parameters are updated using the Adam scheme,27) with which the energy converges faster than with the steepest-descent method.12) Typically, 10^4-10^5 updates are needed for energy convergence, with N_sample ~ 10^4 samples taken in each update.
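The stochastic gradient in Eq. (5) can be checked on a toy problem. Below is a minimal sketch (not the paper's code): a one-parameter Gaussian trial wave function for the 1D harmonic oscillator, sampled by Metropolis, with the covariance-form gradient compared against the analytic derivative. The trial function, step size, seed, and sample counts are our own assumptions.

```python
import numpy as np

# Toy check of dE/dW = 2( <E_loc O_W> - <E_loc><O_W> ) for the trial
# wave function psi_a(x) = exp(-a x^2 / 2) in H = -1/2 d^2/dx^2 + 1/2 x^2.
# Here E_loc = (H psi)/psi = a/2 + (1 - a^2) x^2 / 2 and
# O_a = d log(psi)/da = -x^2/2.

rng = np.random.default_rng(1)
a = 2.0

def log_psi(x):
    return -0.5 * a * x**2

def e_loc(x):
    return 0.5 * a + 0.5 * (1.0 - a**2) * x**2

# Metropolis sampling with probability proportional to psi^2
x, samples = 0.0, []
for step in range(210_000):
    x_new = x + 0.5 * rng.standard_normal()
    if rng.random() < np.exp(2.0 * (log_psi(x_new) - log_psi(x))):
        x = x_new
    if step >= 10_000:           # discard burn-in
        samples.append(x)
xs = np.array(samples)

el, ow = e_loc(xs), -0.5 * xs**2
grad = 2.0 * (np.mean(el * ow) - np.mean(el) * np.mean(ow))

# analytic: E(a) = a/4 + 1/(4a), so dE/da = 1/4 - 1/(4 a^2) = 0.1875 at a = 2
assert abs(grad - 0.1875) < 0.05
```

The same estimator applies unchanged when the variational parameter is a network weight rather than a Gaussian width.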
The choice of initial values of the network parameters is important: if we simply use random numbers for this purpose and start the update procedure, the network state fluctuates severely and the calculation breaks down in most cases. This differs from previous studies on lattice systems,10,12) in which random initial network parameters worked well. To prepare appropriate initial network parameters, we train the network so that it represents a wave function Ψ_train. Specifically, we train the network to maximize the overlap integral K given by

K = ∫ Ψ_train ψ dX / [ ∫ Ψ_train² dX ∫ ψ² dX ]^{1/2} ≃ ⟨A⟩ / ⟨A²⟩^{1/2}, (6)

where A = Ψ_train/ψ and ⟨·⟩ denotes the Monte Carlo average over samples distributed as ψ². This is accomplished by using the gradient of K with respect to the network parameters,

∂K/∂W = ( ⟨A O_W⟩ − ⟨A⟩⟨O_W⟩ ) / ⟨A²⟩^{1/2}. (7)

Using the Adam scheme with Eq. (7), K is maximized and ψ converges toward Ψ_train. The initial values of the network parameters in this training process can be random numbers with an appropriate magnitude.26) For example, Ψ_train is chosen to be a noninteracting ground state, and the interaction is then introduced adiabatically. We demonstrate that the foregoing procedure increases K, and hence a wave function ψ close to Ψ_train can be prepared. We take the target wave function as

Ψ_train(X) = exp(−Σ_{i=1}^N x_i²/2), (8)

which is the ground state of N bosons in a one-dimensional harmonic potential. The neural network inputs are simply the positions of the particles,

ξ(X) = (x_1, x_2, ..., x_N), (9)

and therefore the number of input units is N_in = N.
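The overlap estimator K = ⟨A⟩/⟨A²⟩^{1/2} can be illustrated on a toy pair of Gaussians for which the overlap is known in closed form. This is a sketch under our own assumptions: ψ = exp(−a x²/2) with a = 2 as the network stand-in and Ψ_train = exp(−x²/2) as the target; since ψ² is Gaussian here, it is sampled directly instead of by Metropolis.

```python
import numpy as np

# Sketch of the overlap estimator: K = <A> / sqrt(<A^2>), A = Psi_train/psi,
# averaged over samples drawn from psi^2.  For psi = exp(-a x^2/2) and
# Psi_train = exp(-x^2/2), the exact overlap is K = sqrt(2/(a+1)) * a^(1/4).

rng = np.random.default_rng(2)
a = 2.0

# psi^2 = exp(-a x^2) is Gaussian with standard deviation 1/sqrt(2a)
xs = rng.normal(scale=1.0 / np.sqrt(2.0 * a), size=200_000)

A = np.exp(-0.5 * xs**2) / np.exp(-0.5 * a * xs**2)   # Psi_train / psi
K = np.mean(A) / np.sqrt(np.mean(A**2))

K_exact = np.sqrt(2.0 / (a + 1.0)) * a**0.25          # about 0.971
assert abs(K - K_exact) < 0.01
```

By the Cauchy-Schwarz inequality the estimator never exceeds 1, and maximizing it drives ψ toward Ψ_train up to normalization.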

Interacting bosons in one dimension
First, we apply our method to one-dimensional problems that have exact solutions, in order to evaluate the precision of the proposed method. We consider the Calogero-Sutherland model,28,29) in which bosons are confined in a harmonic potential and interact with each other through an inverse-square potential. The Hamiltonian for the system is given by

Ĥ = Σ_{i=1}^N ( −(1/2) ∂²/∂x_i² + (1/2) x_i² ) + Σ_{i<j} β(β−1)/(x_i − x_j)², (10)

where length and energy are normalized and β is an interaction parameter. The exact ground-state wave function for this Hamiltonian is28)

Ψ_exact(X) = Π_{i<j} |x_i − x_j|^β exp(−Σ_i x_i²/2), (11)

with the ground-state energy

E_exact = N/2 + β N(N−1)/2. (12)

The wave function in Eq. (11) satisfies the permutation symmetry of bosons. We prepare the initial network parameters using the method in Sec. 2 with Ψ_train given in Eq. (8), which is the noninteracting ground state. The interaction potential is then gradually introduced as

Ĥ(n) = Σ_{i=1}^N ( −(1/2) ∂²/∂x_i² + (1/2) x_i² ) + min(an, 1) Σ_{i<j} β(β−1)/(x_i − x_j)², (13)

where n is the update step and a determines the ramp speed of the potential. Initially (n = 0), the interaction potential vanishes; as the update step n increases, the peak of the interaction potential gradually rises. This prescription prevents the calculation from breaking down due to the sudden introduction of a divergent potential.
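The exact solution in Eqs. (11) and (12) can be verified numerically: the local energy ψ^{-1}Ĥψ of an eigenstate is constant and equal to the eigenvalue. The sketch below checks this with a finite-difference Laplacian at one arbitrarily chosen configuration (β, N, the test point, and the step size h are our own choices).

```python
import numpy as np

# Numerical check that Prod_{i<j}|x_i - x_j|^beta * exp(-sum_i x_i^2/2)
# has constant local energy E = N/2 + beta*N*(N-1)/2 for the
# Calogero-Sutherland Hamiltonian.  Laplacian via central differences.

beta = 2.0

def psi(x):
    out = np.exp(-0.5 * np.sum(x**2))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            out *= abs(x[i] - x[j]) ** beta
    return out

def potential(x):
    v = 0.5 * np.sum(x**2)
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            v += beta * (beta - 1.0) / (x[i] - x[j]) ** 2
    return v

def local_energy(x, h=1e-3):
    lap = 0.0
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += h
        xm[i] -= h
        lap += (psi(xp) - 2.0 * psi(x) + psi(xm)) / h**2
    return -0.5 * lap / psi(x) + potential(x)

x = np.array([-1.0, 0.3, 1.2])          # arbitrary distinct positions
E_exact = len(x) / 2.0 + beta * len(x) * (len(x) - 1) / 2.0   # 7.5 for N=3
assert abs(local_energy(x) - E_exact) < 1e-3
```

A constant local energy over all configurations is exactly the condition that a Monte Carlo energy estimate has zero variance, which is why the exact solution is a stringent benchmark for the variational network.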
We first explore the simple input of particle positions as in Eq. (9); that is, the positions of the particles are directly input into the neural network. After a sufficient number of network update steps, we obtain the wave function shown in the middle panel of Fig. 3(a), which plots ψ(x_1, x_2, 0) for N = 3. The permutation symmetry x_1 ↔ x_2 is not satisfied by this wave function. This is because, in the Calogero-Sutherland model, many degenerate states exist near the ground state, such as states that satisfy only part of the permutation symmetry; a linear combination of these states breaks the permutation symmetry, as shown in the middle panel of Fig. 3(a). To resolve this problem, we use the input function

ξ(X) = (x_{i_1}, x_{i_2}, ..., x_{i_N}), (14)

where x_{i_1} ≤ x_{i_2} ≤ ... ≤ x_{i_N}; that is, we input the sorted positions into the network. With this input method, bosonic permutation symmetry is maintained by construction. The result of this input is shown in the leftmost panel of Fig. 3(a); it maintains the permutation symmetry and accords with the exact solution shown in the rightmost panel of Fig. 3(a). Figure 3(b) shows the evolution of the energy with respect to the update step.30) Because of the gradual increase of the interaction potential in Eq. (13), the energy grows from E < E_exact and stochastically converges to the final value. Figure 3(c) shows the energy obtained after 40000 updates. The energy improves as the number of hidden units N_h is increased from N_h = 10 to 20, since the representation ability of the network increases with N_h. However, the improvement saturates for N_h ≳ 20. Such saturation is also seen in Ref. 12, because network optimization becomes increasingly difficult for larger N_h. Figure 3(d) illustrates the fidelity F of the wave function, defined as

F = ∫ Ψ_exact ψ dX / [ ∫ Ψ_exact² dX ∫ ψ² dX ]^{1/2} ≃ ⟨B⟩ / ⟨B²⟩^{1/2}, (15)

where B = Ψ_exact/ψ. The fidelity is larger than 0.99 for N = 5 and larger than 0.95 for N = 8. The fidelity improvement also saturates for N_h ≳ 20. A periodic boundary condition can also be treated in our method.
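Sorting the positions before feeding them to the network makes the output permutation invariant by construction, since every ordering of the same particle set produces the same sorted input. A minimal sketch with a randomly initialized toy network (sizes and seed are our own choices):

```python
import numpy as np

# Sorting the inputs makes any network function symmetric under particle
# exchange: permuted configurations map to the identical input vector.

rng = np.random.default_rng(3)
N, N_hid = 5, 10
W1 = rng.standard_normal((N_hid, N))
b1 = rng.standard_normal(N_hid)
W2 = rng.standard_normal(N_hid)

def log_psi(x):
    xi = np.sort(x)                      # sorted positions as network input
    return W2 @ np.tanh(W1 @ xi + b1)

x = rng.standard_normal(N)
x_perm = x[rng.permutation(N)]           # exchange particles arbitrarily
assert np.isclose(log_psi(x), log_psi(x_perm))
```

This holds for arbitrary (even untrained) weights, so the symmetry is a property of the input map, not of the optimization.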
Specifically, we consider the Calogero-Sutherland model in free space with a periodic boundary condition. The Hamiltonian for the system is given by

Ĥ = −(1/2) Σ_{i=1}^N ∂²/∂θ_i² + Σ_{i<j} β(β−1) / [ 4 sin²((θ_i − θ_j)/2) ], (16)

where 0 ≤ θ_i < 2π. The ground-state wave function and energy are obtained as29)

Ψ_exact = Π_{i<j} |sin((θ_i − θ_j)/2)|^β (17)

and

E_exact = β² N(N² − 1)/24. (18)

To incorporate the periodic boundary condition into the wave function produced by the neural network, we use the following input method. We first sort the positions as θ_{i_1} ≤ θ_{i_2} ≤ ... ≤ θ_{i_N}. If we directly input these sorted positions into the network, a discontinuity arises at θ = 0, because the network does not know that θ = 0 and θ = 2π are identical. To avoid this problem, we take the distances between adjacent positions, d_j = θ_{i_{j+1}} − θ_{i_j} for j = 1, ..., N − 1 and d_N = θ_{i_1} + 2π − θ_{i_N}, and also take their cyclic permutations. The input into the network is thus

ξ(X) = (d_p, d_{p+1}, ..., d_N, d_1, ..., d_{p−1}), (19)

where p is a random integer with 1 ≤ p ≤ N. Figure 4(a) shows the change in energy with respect to the number of updates. As in the previous case, we gradually increase the interaction using Eq. (13), and the energy E grows from E < E_exact and approaches E_exact. Figure 4(b) shows the N_h dependence of the energy and fidelity. In this case as well, the results improve with increasing N_h for N_h ≲ 20 and saturate for N_h ≳ 20. The fidelity is larger than 0.99 for N = 5 and larger than 0.95 for N = 8.
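The periodic input described above (sorted angles, adjacent distances closing the loop through θ = 0 = 2π, plus a cyclic shift) can be sketched as follows; the helper name and test angles are our own. Two properties are checked: the distances sum to 2π, and a global rotation of all angles only cyclically permutes the distance list, so the input set is rotation invariant.

```python
import numpy as np

# Sketch of the periodic input: sort the angles, take distances between
# adjacent particles around the circle, then apply a cyclic shift p.

def periodic_input(theta, p=0):
    t = np.sort(theta % (2.0 * np.pi))
    d = np.append(np.diff(t), t[0] + 2.0 * np.pi - t[-1])  # close the loop
    return np.roll(d, -p)                                  # cyclic shift

theta = np.array([0.1, 5.9, 2.3, 3.1])
d = periodic_input(theta)
assert np.isclose(d.sum(), 2.0 * np.pi)      # distances close the circle

# a global rotation only cyclically permutes the distance list
d_rot = periodic_input((theta + 1.234) % (2.0 * np.pi))
assert any(np.allclose(np.roll(d, k), d_rot) for k in range(len(d)))
```

Because only relative angles enter, the discontinuity at θ = 0 disappears from the network input.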

Few-body bound states in three dimensions
We consider bosons in three-dimensional space with an attractive interaction between particles. As an interesting demonstration of such quantum many-body problems, we aim to produce the Efimov bound state, a three-body bound state near the resonant interaction.31,32) The Hamiltonian for the system is assumed to be

Ĥ = Σ_{i=1}^N ( −(1/2) ∇_i² + (1/2) α x_i² ) − V_0 Σ_{i<j} exp(−|x_i − x_j|²/r_0²), (20)

where x_i is the three-dimensional position vector of the ith particle, α is the strength of the harmonic confinement, and V_0 and r_0 are the strength and range of the Gaussian interaction, respectively. The parameter α is controlled to promote bound-state formation. We set the interaction strength to the first resonance, V_0 ≃ 2.684, at which the first two-body bound state emerges. The Jacobi coordinates are convenient for removing the translational degree of freedom. We define the Jacobi coordinates r_i as

r_{j+1} = [j/(j+1)]^{1/2} ( (1/j) Σ_{i=1}^j x_i − x_{j+1} ) (21)

for j = 1, 2, ..., N − 1, and

r_1 = (1/√N) Σ_{i=1}^N x_i. (22)

Using these Jacobi coordinates, the Hamiltonian can be divided into Ĥ = Ĥ_1 + Ĥ′ with

Ĥ_1 = −(1/2) ∇_{r_1}² + (1/2) α r_1², (23)

Ĥ′ = Σ_{i=2}^N ( −(1/2) ∇_{r_i}² + (1/2) α r_i² ) − V_0 Σ_{i<j} exp(−|x_i − x_j|²/r_0²). (24)

Since the Gaussian interaction term in Ĥ′ does not include r_1, the ground-state wave function can be written as

ψ(X) = exp(−α^{1/2} r_1²/2) ψ′(r_2, ..., r_N), (25)

where ψ′ is the ground state of Ĥ′. Thus, we seek the wave function ψ′ that minimizes E′ = ⟨Ĥ′⟩. Metropolis sampling is performed in the coordinates X; i.e., we sample x_1, x_2, ..., x_N (not r_2, ..., r_N) in each sampling step, with a probability proportional to ψ² = e^{−α^{1/2} r_1²} ψ′² as in Eq. (25), where the network gives ψ′ = exp(u^out).
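A standard choice of normalized Jacobi coordinates with the required properties can be sketched as follows (the exact normalization used in the paper is not shown in the text, so the one below is an assumption: r_1 is the scaled center of mass, and r_{j+1} separates particle j+1 from the center of mass of the first j particles). Two properties matter for the separation Ĥ = Ĥ_1 + Ĥ′: the map is orthogonal, so kinetic and harmonic terms keep their form, and a global translation changes only r_1.

```python
import numpy as np

# Sketch of normalized Jacobi coordinates (an assumed standard choice).

def jacobi(X):
    """X: (N, 3) particle positions -> (N, 3) Jacobi coordinates."""
    N = len(X)
    r = np.empty_like(X)
    r[0] = X.sum(axis=0) / np.sqrt(N)            # center-of-mass coordinate
    for j in range(1, N):
        r[j] = np.sqrt(j / (j + 1.0)) * (X[:j].mean(axis=0) - X[j])
    return r

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 3))
r = jacobi(X)

# orthogonality: sum_i |x_i|^2 == sum_i |r_i|^2
assert np.isclose((X**2).sum(), (r**2).sum())

# translating every particle changes only the center-of-mass coordinate r_1
r_shift = jacobi(X + np.array([1.0, -2.0, 0.5]))
assert np.allclose(r[1:], r_shift[1:])
```

Orthogonality is what allows the confinement Σ_i α x_i²/2 to be rewritten as Σ_i α r_i²/2, so that only Ĥ_1 contains r_1.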
We prepare the initial state of the network parameters using

Ψ_train = exp(−α^{1/2} Σ_{i=1}^N x_i²/2), (26)

which is the noninteracting ground state. The interaction is then gradually introduced as

V_0 = 2.684 min(n/n_0, 1), (27)

where n is the number of update steps and we take n_0 = 5000. Increasing the interaction gradually avoids a sudden change in the network parameters, which could cause the calculation to break down. First, we consider the case of N = 3. The simplest input into the network is

ξ(X) = (r_2^(x), r_2^(y), r_2^(z), r_3^(x), r_3^(y), r_3^(z)), (28)

where r_i^(x), r_i^(y), and r_i^(z) are the x, y, and z components of the Jacobi vectors. However, the input function in Eq. (28) was found to be inappropriate: the calculation is unstable in many cases, and even when stability is maintained, convergence is very slow. This is partly because rotational degrees of freedom remain in the coordinates in Eq. (28). We therefore examine the input function

ξ(X) = (∆_12, ∆_13, ∆_23), (29)

where

∆_jk = |x_j − x_k|. (30)

In this input method, the distances between particles are input into the neural network; this takes into account not only the translational symmetry but also the rotational symmetry. Figure 5(a) shows the change in energy with respect to the update steps. The energy drops to E ≃ −12, which indicates that a bound state is formed. This energy is close to the value −11.9 obtained in Ref. 33. To promote formation of the bound state, we set α = 1.5 for the first 10000 updates and α = 1 subsequently. The inset in Fig. 5(a) shows the hyperradial distribution D(R) of the state after convergence, where the hyperradius R is defined by R² = (∆_12² + ∆_13² + ∆_23²)/3. The particles are localized in R ≪ 1, and hence the harmonic potential in Eq. (24) is almost irrelevant to the binding. In fact, a very similar result is obtained when the harmonic potential is removed (α = 0) after the bound state is formed. It was also confirmed that the bound state is "Borromean"; i.e., the bound state is formed for values of V_0 < 2.684 for which the two-body bound state does not exist.
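The advantage of the distance input in Eqs. (29) and (32) is that pairwise distances are invariant under global translations and rotations, so the network never has to learn these symmetries. A minimal sketch (helper name, rotation angle, and test configuration are our own):

```python
import numpy as np

# Pairwise distances Delta_jk = |x_j - x_k| as network inputs: there are
# N(N-1)/2 of them, and they are unchanged by rigid translations/rotations.

def pair_distances(X):
    N = len(X)
    return np.array([np.linalg.norm(X[j] - X[k])
                     for j in range(N) for k in range(j + 1, N)])

rng = np.random.default_rng(5)
X = rng.standard_normal((3, 3))          # N = 3 particles in 3D
d = pair_distances(X)
assert len(d) == 3 * 2 // 2              # N(N-1)/2 input units

# rotate about the z axis and translate: the distances are unchanged
c, s = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
X2 = X @ R.T + np.array([0.3, -1.0, 2.0])
assert np.allclose(pair_distances(X2), d)
```

Removing these redundant degrees of freedom from the input is what stabilizes and accelerates the optimization compared with the Cartesian input of Eq. (28).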
In our method with the input function in Eq. (29), the bosonic permutation symmetry of the particles is not imposed explicitly. To confirm that the obtained wave function exhibits permutation symmetry, we introduce the overlap integral given by

P = [ 2/(N(N−1)) ] Σ_{j<k} ⟨A_jk⟩, (31)

where P̂_jk exchanges the positions of the jth and kth particles, A_jk = ψ(P̂_jk X)/ψ(X), and ⟨·⟩ is the Monte Carlo average over samples distributed as ψ². If the wave function exhibits permutation symmetry, P becomes unity. From Fig. 5(a), we find that the obtained wave function indeed acquires permutation symmetry. Figure 5(b) shows the energies for N ≥ 3. The input function is

ξ(X) = (∆_12, ∆_13, ..., ∆_1N, ∆_23, ∆_24, ..., ∆_{N−1,N}), (32)

and the number of input units is N(N−1)/2. For comparison, the energies E_ref obtained in Ref. 33 using the path-integral Monte Carlo method are shown. The energies obtained by the present method are in reasonable agreement with those obtained by the pre-existing method. At present, our method cannot surpass the precision of the path-integral Monte Carlo method. However, our method yields the wave functions directly, which facilitates the study of quantum few-body problems.
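The permutation diagnostic based on A_jk = ψ(P̂_jk X)/ψ(X) can be sketched as follows. The symmetric toy wave function and the i.i.d. stand-in for Metropolis samples are our own simplifications; for an exactly symmetric ψ, every ratio A_jk equals 1 and hence P = 1.

```python
import numpy as np

# Sketch of the permutation-symmetry diagnostic: average the exchange
# ratios A_jk = psi(P_jk X)/psi(X) over all pairs (j, k) and over samples.

def perm_overlap(log_psi, samples):
    N = samples.shape[1]
    vals = []
    for X in samples:
        for j in range(N):
            for k in range(j + 1, N):
                Xp = X.copy()
                Xp[[j, k]] = Xp[[k, j]]          # exchange particles j and k
                vals.append(np.exp(log_psi(Xp) - log_psi(X)))
    return np.mean(vals)

def log_psi_sym(X):
    # symmetric toy wave function (1D positions as a stand-in)
    return -0.5 * np.sum(X**2) - 0.1 * np.sum(X)**2

rng = np.random.default_rng(6)
samples = rng.standard_normal((100, 3))   # stand-in for Metropolis samples
P_val = perm_overlap(log_psi_sym, samples)
assert np.isclose(P_val, 1.0)
```

For a network wave function that only approximately respects the symmetry, P deviates from unity, which makes it a convenient scalar to monitor during training.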

Conclusions
We have developed a method to obtain approximate ground-state wave functions of quantum few-body problems using artificial neural networks. Bosons in continuous space were considered, and the particle positions or functions of them were input into neural networks, which were optimized to output the desired wave functions. The method was applied to the Calogero-Sutherland model in one-dimensional space, and the results were compared with exact solutions. Using the appropriate input functions in Eqs. (14) and (19), the bosonic permutation symmetry of the wave functions was taken into account, and we successfully obtained approximate ground-state wave functions. The method was also applied to the three-dimensional problem of bound states with resonant interaction. The Efimov trimer was obtained, and its energy was close to that calculated in a previous study. In this case as well, the appropriate input function in Eq. (29) was important for efficiently reaching the ground state. Bound states for N > 3 bosons were also obtained.
In the present paper, we have restricted our attention to a fully connected neural network with a single hidden layer. Deep neural networks with multiple hidden layers have greater representational power, and one may expect that the use of such networks would improve the results. However, network optimization becomes more difficult as the complexity of the network structure increases. In fact, we observed that the precision of the results saturates as the number of hidden units N_h increases, even for a single hidden layer. It is therefore important to find more efficient network structures and input methods that facilitate optimization of the network parameters. We also need to find an appropriate way to deal with fermions.