Machine Learning Quantum States – Extensions to Fermion-Boson Coupled Systems and Excited-State Calculations

To analyze quantum many-body Hamiltonians, recently, machine learning techniques have been shown to be quite useful and powerful. However, the applicability of such machine learning solvers is still limited. Here, we propose schemes that make it possible to apply machine learning techniques to analyze fermion-boson coupled Hamiltonians and to calculate excited states. As for the extension to fermion-boson coupled systems, we study the Holstein model as a representative of the fermion-boson coupled Hamiltonians. We show that the machine-learning solver achieves highly accurate ground-state energy, improving the accuracy substantially compared to that obtained by the variational Monte Carlo method. As for the calculations of excited states, we propose a di ﬀ erent approach than that proposed in K. Choo et al. , Phys. Rev. Lett. 121 (2018) 167204. We discuss the di ﬀ erence in detail and compare the accuracy of two methods using the one-dimensional S = 1 / 2 Heisenberg chain. We also show the benchmark for the frustrated two-dimensional S = 1 / 2 J 1 - J 2 Heisenberg model and show an excellent agreement with the results obtained by the exact diagonalization. The extensions shown here open a way to analyze general quantum many-body problems using machine learning techniques.


Introduction
Solving quantum many-body Hamiltonians with high accuracy is a great challenge in condensed-matter physics. The behaviors of particles in such quantum systems are governed by many-body wave functions; once exact many-body wave functions are given, we can predict the properties of manybody systems. Therefore, a goal is to obtain exact many-body wave functions. However, in general, many-body wave functions are vectors with huge dimensions growing exponentially with the number of particles, which makes exact diagonalization intractable for large systems. Given this situation, it is imperative to represent many-body wave functions accurately with a computationally tractable number of parameters.
For this problem, machine learning techniques can play a role. Machine learning is powerful in extracting essential features from big data. Therefore, it would also be useful in extracting the essential patterns of the many-body wave functions and obtaining compact representations.
Indeed, Carleo and Troyer 1) have introduced variational ansatz for many-body ground states based on the restricted Boltzmann machine (RBM). The RBM is a kind of artificial neural network 2) consisting of visible and hidden units and is used to approximate probability distribution over the visible unit configurations. If we interpret the values of many-body wave functions as a generalized probability (allowing complex numbers) and make a mapping between physical and visible unit configurations, we can represent many-body wave functions in terms of RBM. The RBM wave functions were first applied to spin systems and have been shown to be able to represent ground states of spin Hamiltonians in high accuracy. 1) One of the advantages of using the RBM is its flexible representative power. The RBM can represent quantum states showing volume law entanglement entropy. 3,4) If an arbitrarily large number of hidden units are introduced, it can rep- * yusuke.nomura@riken.jp resent any bounded continuous function to arbitrary accuracy (universal approximation). 5,6) Another advantage is that the RBM allows us to approximate quantum many-body states in an unbiased way. This is in contrast to the conventional wave function methods where most calculations assume a specific form of the wave function, i.e., the calculations are biased.
This paper is organized as follows. Secs. 2 and 3 are devoted to the extension to fermion-boson coupled systems and calculations of excited states, respectively. In both sections, we provide the details of the methods and demonstrate the accuracy of the machine learning methods. We show the summary and future perspective in Sec. 4.

Extension to electron-phonon coupled systems
The electron-phonon coupling is a fundamental interaction in solids. In the conventional superconductors, the electronphonon coupling gives a "glue" for creating Cooper pairs. Even in the strongly-correlated materials, the coupling to the phonons plays an important role. For example, in transition metal oxides, the metal-insulator transitions often accompany structure transition. 35) In the alkali-doped fullerides, 36,37) unusual cooperation between strong correlation and phonons induces unconventional s-wave superconductivity next to the Mott insulating phase. 38,39) The coupling to the optical phonons of SrTiO 3 substrate has been suggested to be the origin of enhanced superconductivity in FeSe thin film on the SrTiO 3 substrate. 40) Therefore, in order to get a deeper understanding of electron-phonon physics, it is essential to develop powerful numerical methods to analyze electron-phonon coupled Hamiltonians. Here, we discuss an extension of the machinelearning wave function methods to the electron-phonon coupled systems.

Model
Hereafter, as a representative example, we restrict ourselves to the case of the one-dimensional spinless Holstein model and discuss how we construct machine-learning wave functions. The Hamiltonian is given by whereĉ † i (ĉ i ) is the creation (annihilation) operator for the electron at site i andb † i (b i ) creates (annihilates) phonon at site i. t is a hopping amplitude, g is the strength of the electronphonon coupling (g > 0), and ω is the phonon frequency. n i =ĉ † iĉ i is the number operator. The dimensionless displacement operator of the phonons is given byx i = 1 √ 2 (b † i +b i ). For simplicity, we take the mass of nuclei to be 1. The methods presented here can be easily extended to more general electron-phonon coupled systems and also even more general fermion-boson coupled systems.
For the latter use, it is convenient to define occupation and dimensionless displacement operators (n s i andx s i , respectively) in a staggered way: Physically,n s i is proportional to the electron and hole occupations (with constant shift) for even and odd sites, respectively. Introduction ofx s i corresponds to the change of positive direction of phonon displacements depending on whether the site index is even or odd. With this definition, the instability for the charge-density wave with the momentum π can be interpreted as the ferroic order in the n s i occupation. The Hamiltonian is rewritten as As for the basis, we use the occupation number basis of electrons and phonons in real space; it is a natural choice because the Hamiltonian in Eq. (4) is defined in real space. The electron occupations are specified by n s i = (−1) i (2n i − 1). The phonon basis set {|m i } with the phonon occupation number

Methods 2.2.1 Recent variational wave functions
Recently, there have been several proposals for the variational wave functions for electron-phonon coupled systems. 41,42) In those studies, the form of the variational wave functions reads Here, |Ψ el and |Ψ ph are the electron and phonon wave functions, respectively. P el-ph is the electron-phonon correlation factor. In this form, the correlations between electrons and phonons are taken into account only by P el-ph ; without P el-ph , electron and phonon degrees of freedom are decoupled.

Machine-learning variational wave function
We improve the accuracy of the variational wave function by i) preparing electron-phonon-entangled wave function |Ψ el-ph even without electron-phonon correlation factor and ii) improving electron-phonon correlation factor with RBM (denoted as P RBM el-ph ). Then, the form of the variational wave function is given by In the following, we discuss these two points in more detail.
i) Electron-phonon-entangled wave function |Ψ el-ph . As can be seen in Eq. (4), when the n s i occupation is positive n s i = 1 (negative n s i = −1), the positive (negative) x s i displacement is energetically favored. As we will show in the following, even without the electron-phonon correlation factor, one can take into account this primitive correlation between electrons and phonons.
Defining real-space electron and phonon configurations as ν = (n s 1 , n s 2 , ..., n s N site ) and µ = (m 1 , m 2 , ..., m N site ), respectively, the electron-phonon-entangled wave function reads where the phonon wave function Ψ ph for the phonon configuration µ does depend on the electron configuration ν. Note that, in the previous studies, 41,42) the phonon wave function Ψ ph is independent of the electron configuration ν. The phonon part, which depends on the electron configuration ν, is given by Here, m max is the maximum phonon occupation number and the coefficient c m i (n s i ) is a variational parameter. In this case, the coefficients for the phonon states depend on local electron configuration, which allows us to take into account the above described primitive local correlation between electrons and phonons.
In the actual calculations, we impose translational symmetry, and the number is reduced to 2(m max +1). In the particle-hole symmetric case without symmetry breaking, we can further reduce the number to m max +1: with some non-negative integer k. As for the electron wave function Ψ el , we employ the pair-product (geminal) wave function. The pair-product wave function has flexible representability: it can describe, for example, Fermi-sea, antiferromagnetic, charge-ordered, and superconducting states. The pair-product wave function is given by with the number of electrons N el and the variational parameter f i j . Here, we write the electron spin degrees of freedom explicitly to make it possible to apply our wave functions to the Hamiltonian with spin degrees of freedom. In the case of the present spinless case, we just use one spin component out of up and down components. When the boundary condition of the lattice is periodic or anti-periodic, we can take ( f i+pN u.c. , j+pN u.c. = f i j for some integer p) to reduce the number of variational parameters. With this setting, we can study the ordered state whose period is smaller than or equal to N u.c. .
The number of f i j is reduced from N 2 site to N u.c. N site . ii) RBM electron-phonon correlation factor P RBM el-ph . In Refs. 41 and 42, the form of the electron-phonon correlation factor reads and respectively (α i j and v i j are variational parameters). Both of them are constructed based on physical insight. In the present study, we replace them with the RBM correlation factor, which is more flexible and unbiased. Indeed, as is shown in Refs. 18 and 25, the two-body correlation factors such as Eqs. (10) and (11) can be analytically expressed by the RBM correlation factor. On top of such two-body correlations, the RBM can describe many-body correlations such as three-body and four-body correlations simultaneously in an unbiased way. Furthermore, as is discussed in the introduction, it is ensured that any correlations can be expressed exactly by the RBM in the limit of an infinite number of hidden units ("universal approximation"). The structure of the RBM is shown in Fig. 2. By identifying visible units configuration {σ l } and that of physical degrees of freedom (ν, µ) (see the following), the RBM correlation factor is given by Here, σ l = ±1 and h k = ±1 denote the state of visible and hidden units respectively. σ = (σ 1 , σ 2 , ..., σ N visible ) is the spin configuration of the visible units. {a l , W lk , b k } are variational parameters. We neglect irrelevant one-body a l terms and op-

Electron configurations
Phonon configurations Because the maximum phonon occupation number m max scales exponentially with M, we can essentially simulate infinite m max . This definition gives one to one correspondence between σ spin configuration and electron and phonon configuration (ν, µ).
The correlation factor P RBM el-ph (σ) in Eq. (12) is combined with the electron-phonon-entangled wave function |Ψ el-ph [Eq. (6)]. In addition to the primitive correlation between electrons and phonons taken into account by |Ψ el-ph , the RBM correlation factor P RBM el-ph takes into account more sophisticated correlations. Note that the RBM correlation factor takes into account not only electron-phonon correlations but also electron-electron and phonon-phonon correlations. Therefore, we do not need to introduce additional correlation factor for electron-electron and phonon-phonon correlations, whereas the wave functions employed in the previous studies 41,42) use electron-electron correlation factor separately with electronphonon correlation factor in Eqs. (10) and (11).
The accuracy of the wave function is controlled by the number of hidden units: The more hidden units we introduce, the more accurate the wave function becomes. We define the control parameter of accuracy α to be α = N hidden /N visible with the number of hidden unit N hidden . We call α hidden variable density. By taking α to be integer and imposing translational symmetry in W ik and b k parameters, 1) the number of W ik and b k parameters becomes αN visible and α, respectively.

Optimization
We employ the stochastic reconfiguration (SR) method 43) to optimize the variational parameters. 44) The optimization is done to minimize the energy expectation value E = Ψ|H|Ψ / Ψ|Ψ , where the expectation value is calculated us-ing the Monte Carlo method over the real-space configurations (µ, ν) with the weight |Ψ(µ, ν)| 2 . The SR method realizes imaginary time Hamiltonian evolution within the representative power of the variational wave function 45) and hence enables stable optimization. The variational parameters to be optimized are {c m i (n s i ), f i j , W lk , b k } in Eqs. (8), (9), and (12). The numbers of independent parameters for the translationally invariant systems are 2(m max +1) for c m i (n s i ), N u.c. N site for f i j , αN visible for W lk , and α for b k parameters. If the system has particle-hole symmetry, the number for c m i (n s i ) parameters can be reduced to m max +1. See the previous sections for more detail on the number of independent parameters.

Total energy
To show the accuracy of our wave function, we first show the results for the total energy. We consider the onedimensional spinless Holstein model whose Hamiltonian is given by Eq. (4). Following the previous study, we use periodic (anti-periodic) boundary condition when the number of electrons is odd (even). We simulate 6-, 8-, and 16-site systems with ω/t = 1, g/t = 1.5. The filling is set to be half-filling. In this setting, the ground state is described by Tomonaga-Luttinger liquid (TLL). Because the system has particle-hole symmetry, the number of independent phonon coefficients {c m i (n i )} becomes m max +1 (see Methods section). We set maximum phonon occupation number m max at each site to be 15, i.e., the number for visible units for phonon configuration is M = 4 per site. Then the total number of visible units are N visible = 5N site . We confirm the convergence of the results with respect to m max . As for the f i j parameters, we take 6-site unit-cell structure for the 6-site system and 4-site unitcell structure for the 8-site and 16-site systems. Figure 4 shows the results for the energy per site for different hidden variable densities α = 0, 2, 4, 8, and 32. In each panel, the vertical axis is the energy per site and the horizontal axis is the energy variance ∆ var = ( H 2 − H 2 )/ H 2 . The energy variance becomes zero if we obtain exact groundstate wave function (in more general, for any eigenstates of the Hamiltonian). Therefore, the energy variance tells us how close the variational wave function is to the ground state. As we can see, the accuracy becomes better with increasing α, and in all the cases, the energy for α = 8, 16, 32 agrees with those obtained by numerically exact Green's function Monte Carlo method 46) within one standard error. In the case of the 16-site system, extrapolation to zero energy variance limit seems to overshoot the ground state energy estimated by the Monte Carlo method. However, the deviation is still within two standard errors. Note that our wave function method is variational; therefore, the energy never becomes lower than that of the ground state.
Compared to the result of the previous study (blue lines), 41) our wave function improves the accuracy substantially. The improvement can be ascribed to the synergetic effect of the introduction of electron-phonon entangled wave function |Ψ el-ph in Eq. (7) and the usage of the flexible RBM correlation factor in Eq. (12). In Ref. 41, the results for the wave function without the electron-phonon correlation factor |Ψ el ⊗|Ψ ph , in which the electron and phonon degrees of freedom are decoupled, are also shown; the obtained energies are  Fig. 4). The RBM correlation factor P RBM el-ph (σ) further improves the energy, achieving the agreement with the Monte Carlo results within one standard error.

Charge structure factor
When the electron-phonon coupling becomes large, the ground state of the one-dimensional half-filled Holstein model [Eq. (4)] changes from TLL to charge density wave (CDW) state. Here, we show the results for the charge structure factor for (ω/t = 0.1, g 2 /ω 2 = 2) and (ω/t = 0.1, g 2 /ω 2 = 20). The former gives the TLL ground state, and the latter gives rise to the CDW instability. In the present Hamiltonian, the charge-charge correlation function has a peak at the momentum q = π, and the charge structure factor is given by Before showing the results for S c (π), we show the detail of the parameter setting. In the TLL phase with the particle-hole symmetry, we can impose the relationship in the {c m i (n s i )} coefficients as described in the previous section: c 2k (n s i = 1) = c 2k (n s i = −1) and c 2k+1 (n s i = 1) = −c 2k+1 (n s i = −1) with some non-negative integer k. The number of independent phonon coefficients {c m i (n s i )} is m max + 1. On the other hand, in the CDW phase, we have charge-rich and charge-poor sites, and the particle-hole symmetry at each site is broken. The actual ground state for the finite-size system is made of superposition of (rich, poor, rich, poor, ...) and (poor, rich, poor, rich, ...) CDW patterns, and the symmetry is recovered. However, in the present study, we assume one of two patterns. In this case, in the notation of n s i , the CDW corresponds to the ferroic order favoring either (1, 1, 1, 1   Correspondingly, we prepare independent c m i (n s i ) parameters for majority and minority n s i occupations. Thus, c m i (n s i = 1) and c m i (n s i = −1) become independent and the number of independent phonon coefficients {c m i (n s i )} is 2(m max +1). In the TLL (CDW) phase for ω/t = 0.1 and g 2 /ω 2 = 2 (ω/t = 0.1 and g 2 /ω 2 = 20), we set maximum phonon occupation number m max at each site to be 7 (31), and the number for visible units for phonon configuration becomes M = 3 (M = 5) per site. The total number of visible units is given by N visible = N site (M + 1). We confirm the convergence of results with respect to m max .
As for the f i j parameters, we take 4-site unit-cell structure in common, which is large enough to study both TLL and commensurate CDW phases. The hidden variable density α is set to be α = 8. Figure 5 shows the calculated charge structure factor S c (π) for (ω/t = 0.1, g 2 /ω 2 = 2) and (ω/t = 0.1, g 2 /ω 2 = 20).
In the former case, S c (π) vanishes as N site → ∞, showing the absence of CDW order. In the latter case, CDW order is confirmed by nonzero S c (π) with N site → ∞. We also compare our results with those 47) obtained by density-matrix renormalization group (DMRG). 48) They show good agreements, confirming the accuracy of our methods.

Discussion
In the present study, all the variational parameters are taken to be real. In this case, the RBM correlation factor P RBM el-ph (σ) always gives positive weight, i.e., it just controls the amplitude of the wave function and does not change the node of the wave function. The sign of the wave function is determined by electron-phonon entangled wave function |Ψ el-ph . Therefore, we have paid close attention to the initial sign of the phonon coefficients {c m i (n s i )} as described in the previous sections. In principle, if we introduce a complex RBM correlation factor, the node of the wave function can change. Even in that case, we believe that a good initial guess for the sign of phonon coefficients will make the optimization easier. Therefore, although it is straightforward to apply the machine learning wave functions to more general fermion-boson coupled Hamiltonians once the mapping between fermion-boson and visible-unit configurations is defined, it will be helpful to guess the sign structure of the exact wave functions (at least at primitive level) to obtain good accuracy.

Calculations of excited states
So far, almost all the variational studies using machine learning techniques have focused on approximating the ground state wave function. However, the calculations of excited states also give important information such as the ground state degeneracy, the size of the excitation gap, and low-lying dispersion of excitations.
The first attempt to obtain excited states with machinelearning wave function has been performed in Ref. 29. In this study, we present another approach to obtain excited states. We apply the method to the one-dimensional Heisenberg model and show that our method gives better accuracy than that obtained in Ref. 29 even though the present neural network is much more compact than that employed in Ref. 29. We also apply the method to the two-dimensional Heisenberg model with frustration and demonstrate good accuracy of the method.

Method 3.1.1 General idea
For finite size systems, eigenstates of many-body Hamiltonians with several symmetries are labelled by quantum numbers. Therefore, we can use quantum number projection to obtain excited states: We find the lowest energy eigenstates for different quantum number sectors; then we obtain excited states characterized by different quantum numbers than that of the ground state. This approach is taken commonly in the present study and Ref. 29. 49) The difference in the schemes comes from the way of enforcing quantum numbers to the many-body wave function.
In the present study, we focus on translationally invariant systems, in which the total momentum K is a good quantum number. By applying the momentum projection, we can discuss the dispersion of the excited states. With a transla-tion operator T R shifting all the particles by the amount R, the many-body wave function with the total momentum K is transformed as where r denotes the real-space configuration of the the particles r = (r 1 , r 2 , ..., r N particles ). In the following, we discuss, in detail, how to make the wave function satisfy the symmetry in Eq. (14).

Quantum number projection
Here, we discuss how to apply the momentum number projection to the wave function. For simplicity, let us give an explanation using the one-dimensional S = 1 2 spin models with the periodic boundary condition. The spin configuration is specified by σ = (σ 1 , σ 2 29. For a spin configuration, we can generate shifted configurations by applying the translation operators. Among the generated spin configurations including the original configuration, we choose a canonical configuration σ canonical . In Ref. 29, the canonical configuration σ canonical is chosen to be the lexicographically smallest one. By introducing an operator T shifting spin configurations by one site, the amplitude of the wave function for a configuration T n σ canonical (0 ≤ n < N site ) is given by Here, instead of calculating the amplitude for T n σ canonical directly, the amplitude is given by referring to that of the canonical configuration σ canonical . With this, the wave function on the left-hand side satisfies the proper symmetry in Eq. (14) even when we do not impose any translation symmetry constraint on the wave function on the right-hand side Ψ(σ canonical ). In Ref. 29, Ψ(σ canonical ) is prepared by the RBM or three-layer feedforward neural networks (FFNN). Present Scheme. In the present study, we employ the scheme in Refs. 50 and 51. The wave function projected onto the total momentum K sector is given by Is it easy to show that the wave function on the left-hand side is transformed according to Eq. (14). Note again that the wave function on the right-hand side does not necessarily satisfy the symmetry in Eq. (14). To represent Ψ(σ), we employ the RBM wave function with N visible = N site , which reads b k is the bias on the hidden units, and W ik is the interaction between the visible and hidden units. As in Sec. 2, we neglect the bias term for the visible units. In order to make it possible to represent the sign change of the wave function, we take the b k and W ik parameters to be complex variables.
Comparison between the present scheme and that in Ref. 29. Let us show the difference in the above two schemes using the simple one-dimensional four-site spin model. In this case, for example, the following four spin configurations are related with each other by the translation operator: Let us take σ canonical to be σ 0 . Then, the wave function in Ref. 29 is given by On the other hand, the wave function in the present scheme reads When we define the hidden variable density α as α = N hidden /N visible (= N hidden /N site ) as in Sec. 2, the computational cost of the scheme in Ref. 29 scales as O(αN 2 site ). On the other hand, the present scheme scales as O(αN 3 site ) because we need to compute the summation over n in Eq. (16), which gives an additional factor of N site . Therefore, when the hidden variable density α is the same, the scheme in Ref. 29 is computationally cheaper. However, as we show in the next section, the present scheme is much more accurate with the same α. In the scheme in Ref. 29, if the value of α giving the same accuracy as that in the present scheme becomes comparable to N site , the computational cost becomes comparable between the two methods.
Furthermore, when the number of variational parameters becomes large, numerical optimization becomes difficult. Therefore, in general, it helps to save the number of variational parameters for achieving stable optimizations. In this sense, the present scheme has an advantage because we can achieve much better accuracy with smaller α and hence a smaller number of variational parameters.

One-dimensional antiferromagnetic Heisenberg
model First, we show the result of S = 1 2 antiferromagnetic Heisenberg model on the one-dimensional spin chain. The Hamiltonian is given by Here S i is the spin-1/2 operator at site i and J is the Heisenberg exchange interaction. We take J as energy unit, i.e., we set J = 1. As in Ref. 29, we take 36 site systems (N site = 36) and impose periodic boundary condition. The relation between the visible units and physical spins is given by σ i = 2S z i , as we described above. The number of visible units is N visible = N site . We do not impose any symmetry in b k or W ik , and the numbers of independent b k and W ik parameters are αN visible and αN 2 visible , respectively. The total number of variational parameters amounts to 2αN visible (N visible + 1), where the factor of 2 comes from the fact that b k and W ik parameters have both real and imaginary parts. The optimization of the variational parameters is done in the very same way as in Sec. 2, i.e., we use the SR method.
We optimize the RBM wave function with the hidden variable density α = 1 for each momentum K sector and compare the energy with the exact results (Fig. 6). We find that the relative error of the RBM energy is at most 2 × 10 −5 , showing an excellent agreement with the exact energy. Indeed, the present RBM energy is much more accurate than that obtained in Ref. 29: In Ref. 29, the results obtained by the RBM with α = 3 show the relative error of about 7 × 10 −3 at the momentum in which the accuracy is the worst (around K = 2 3 π). The best accuracy is achieved at K = 0 in both cases, the relative error is about 4 × 10 −7 (the present study) and 2 × 10 −5 (Ref. 29).
As we already described, the way of imposing total momentum is different in the two methods, which leads to a significant difference in the accuracy. The present RBM wave function gives better accuracy even though the number of hidden units is smaller. To improve the accuracy, in Ref. 29, the three-layer FFNN is introduced. The size of the FFNN is characterized by the first hidden-layer variable density α 1 = 2 and the second hidden-layer variable density α 2 = 0.5. However, the accuracy of the FFNN is still worse than that obtained by the present RBM: the best accuracy is achieved at K = 0 and the relative error is about 3 × 10 −5 , whereas the worst accurate momentum is at K = π/2 and the relative error is about

Two-dimensional antiferromagnetic J 1 -J 2 Heisenberg model
We also apply the method to the S = 1 2 antiferromagnetic J 1 -J 2 Heisenberg model on the square lattice. The Hamiltonian is given by where S i is the spin-1/2 operator at site i, and J 1 (J 2 ) is the nearest neighbor (next nearest neighbor) exchange interaction. We take J 1 as energy unit, i.e., we set J 1 = 1. i, j and i, j denote pairs of nearest neighbor and next nearestneighbor sites, respectively. In the S = 1 2 antiferromagnetic J 1 -J 2 Heisenberg model on the square lattice, the next nearestneighbor J 2 interaction gives frustration and the model may host spin liquid ground state around J 2 /J 1 = 0.5. However, it is still an open problem whether the spin liquid ground state is realized or not. For the frustrated systems, whereas the quantum Monte Carlo methods suffer from the negative sign problem, the RBM scheme is free from the sign problem and can be applied to this challenging problem.

Discussion
We have shown that the complex RBM can well describe the excited states of the quantum spin Hamiltonians. The method seems to work well even in the frustrated spin systems, in which the node structure of the wave function becomes crucial. Therefore, it opens a way to investigate, for example, the existence of the spin gap in the frustrated systems.
For large system sizes, the number of variational parameters would become large to achieve high accuracy. Because it is difficult to perform stable optimization of a large number of parameters, it is helpful to reduce the number of parameters. As is discussed in Ref. 25, by combining the RBM with other powerful wave functions, the same accuracy can be achieved by smaller number of variational parameters. It is an interesting future issue to implement such a combination to study excited states.

Summary
We have discussed two extensions of the machine learning variational method. The first extension is the application to the fermion-boson coupled Hamiltonians. We have studied the Holstein model as a representative and showed that the present RBM-based variational wave function outperforms the previous variational wave function. The improvement is achieved by preparing electron-phonon-entangled wave function and improving the electron-phonon correlation factor using the RBM.
The second extension is the calculation of excited states. This is achieved by imposing quantum numbers to the wave functions. The difference between the present method and that in Ref. 29 lies in the way of imposing quantum numbers. By applying it to the one-dimensional S = 1/2 Heisenberg model, we have shown that the present scheme gives better accuracy than that achieved in Ref. 29, even though the present neural network is more compact. We have also performed the benchmark using the two-dimensional S = 1/2 J 1 -J 2 Heisenberg model, and showed an excellent agreement with the exact results.
Finally, we briefly discuss several future directions. As for the first part, an application to more general fermion boson systems would be of great interest. It will open a way to study real materials, which sometimes have complicated forms of interactions. As for the second part, the information of excited states is crucial to reveal the nature of quantum spin liquids in the frustrated spin systems, if the spin liquid exists as a stable phase. For example, the two-dimensional S = 1/2 J 1 -J 2 Heisenberg model, which we have studied in this paper, is one of the candidate Hamiltonians to host quantum spin liquid as a ground state. It is intriguing to perform calculations for larger system sizes and investigate the nature of the quantum spin liquid. We finally note that the present method to calculate excited states can be easily generalized to, e.g., fermion and fermion-boson coupled systems, which is also left as an important future issue.