A random-sampling method as an efficient alternative to variational Monte Carlo for solving Gutzwiller wavefunctions

We present a random-sampling (RS) method for evaluating expectation values of physical quantities using the variational approach. We demonstrate that the RS method is computationally more efficient than the variational Monte Carlo method using the Gutzwiller wavefunctions applied on single-band Hubbard models as an example. Non-local constraints can also been easily implemented in the current scheme that capture the essential physics in the limit of strong on-site repulsion. In addition, we extend the RS method to study the antiferromagnetic states with multiple variational parameters for 1D and 2D Hubbard models.


Introduction
Predicting materials stability and functionality requires a robust and reliable treatment of the fundamental interaction in the system, which is described by a many-body Schrödinger equation. To solve this equation rigorously requires the diagonalization of the many-body Hamiltonian in a Hilbert space that grows exponentially with the system size, a task that may never be fulfilled for realistic materials [1]. To simplify the problem, various methods have been developed to break down the Schrödinger equation into subsequent oneparticle equations in a mean-field framework. A typical example is the Hartree-Fock (HF) approach which tries to find an optimal solution in the subspace containing only wavefunctions in the form of a single Slater determinant. However, the true ground state can be far away from a single determinant, giving rise to the electron correlation effects [2].
Decades of extensive studies have been devoted to accurately describing the electron correlation [3,4]. The popular density-functional theory (DFT) [5][6][7] uses a universal functional to describe the electron exchange and correlation, and can be efficiently formulated within the framework of mean-field single determinant with Kohn-Sham equations [6]. However, the approximate functional often leads to qualitatively incorrect results for strongly-correlated systems. Other methods explicitly consider the multi-configurational nature of the ground state, and adopt different strategies to deal with the exponential expansion of the Hilbert space. For instance, in the coupled cluster method [8,9], the configurational space is truncated up to two-electron or three-electron excitations from an initial single-determinant state; quantum Monte Carlo (QMC) [10][11][12][13][14][15][16] uses stochastic sampling rather than explicit integration over the entire configurational space. In another group of methods, variational wavefunctions are constructed by acting a parametrized projector on an initial trial wavefunction |Ψ 0 〉: For fermionic systems, |Ψ 0 〉 is often chosen to be an antisymmetrized product of single-particle wavefunctions, such as the HF state. The correlation effects are introduced by the variational parameters  q, which are determined by minimizing the Rayleigh-Ritz quotient , where H is the Hamiltonian of the system [17].
Gutzwiller introduced a variational wavefunction, called the 'Gutzwiller wavefunction' (GWF), that uses a single parameter to control the double occupancy, based on the fact that the non-interacting initial trial wavefunction underestimates the onsite repulsion among electrons [18]. In general, the GWF has the form Y ñ = Y ñ |ˆ| P G U 0 , where |Ψ 0 〉 is the non-interacting wavefunction, and the projectorP U is determined by the on- Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. site interaction term in the Hamiltonian. Despite its apparent simplicity, exact evaluation of expectation values of physical quantities with respect to the GWF is still a very difficult many-body problem, and is only possible in a few special cases, such as the one-dimensional Hubbard model [19][20][21][22]. Gutzwiller himself proposed a so called Gutzwiller Approximation (GA) by effectively neglecting the spatial correlations between spins of the electrons. This treatment, while being efficient, delivers incorrect physics in many situations. For instance, it predicts a Brinkman-Rice type Mott metal-insulator transition at a finite U c for the single-band Hubbard model at half-filling, that is, E = 0 when U > U c [23]. This contradicts with the well-known fact that in the large U limit, the single-band Hubbard model behaves like an antiferromagnetic Heisenberg model with E scales as − t 2 /U. As pointed out by Hetényi et al. [24], the GA fails to capture the details of the 'exchange hole' in the non-interacting wavefunction, defined as a pair correlation function of two electrons with the same spin g(r ij ). In the GA, g(r ij ) = 0 if r ij = 0 due to the Pauli exclusion principle. However, the two electrons become completely uncorrelated if r ij ≠ 0, which is only true at infinite dimensions. It was demonstrated that the error of GA can be largely corrected by implementing the exact exchange hole while maintaining the general theoretical framework of GA [24]. As the strong-coupling counterpart of the GWF, the Baeriswyl wavefunction (BWF) [25,26] was proposed to describe the insulating states in the limit of large interactions. In the BWF, a projector solely dependent on the kinetic energy term is applied onto the wavefunction in the large U limit: . |Ψ ∞ 〉 can be chosen as the ground state of the Heisenberg Hamiltonian or as |Ψ G 〉 obtained in the large U limit [26,27]. Approximate combinatorial solutions similar to the GA were also developed for the BWF and the hybrid Baeriswyl-Gutzwiller wavefunction [27]. A comparative study demonstrated that the BWF performs better qualitatively than the GWF in describing the insulating states at large U [28].
The variation Monte Carlo (VMC) method has been established as a reliable approach for solving variational wavefunctions for strongly correlated problems [29][30][31][32][33][34][35][36][37]. Here, we propose an alternative random-sampling (RS) method that is easy to implement based on a simple idea illustrated as follows using the GWF as an example. In the GWF, the variational projector is determined by the number of doubly-occupied sites. Therefore, one can naturally group the electron configurations based on the number of doubly-occupied sites, such that all the configurations in the same group have the same projector applied on them. The expectation value of an observable can be expressed in terms of summations of physical quantities over all configurations in each group. Such sum can be calculated as m´ , where μ is the mean value that can be estimated by randomly sampling configurations in the group, and  is the total number of configurations in the group that is readily available based on combinatorics. This method has been preliminarily applied to calculate the double occupancy in Hubbard models [38]. Here, we generalize it to arbitrary observables. Moreover, we demonstrate that it performs well in the entire parameter range, without the low acceptance rate problem that VMC suffers from in the limit of strong on-site repulsion.

General formalism
We use a 1D single-band Hubbard model, whose exact solution is available for comparison, to demonstrate our approach. The Hamiltonian is defined as  (1) is the kinetic energy describing the hopping of electrons to a NN site, and the second term is the on-site Coulomb repulsion between electrons with different spins.
The GWF is constructed by applying a Gutzwiller parameter g (0 g 1) onto doubly-occupied lattice sites: whereD i is a local operator counting the double occupancy on site i: =  ˆD n n i i i , and |Ψ 0 〉 is a suitable starting wavefunction. Here, |Ψ 0 〉 is chosen to be the single Slater determinant HF solution. That is, where k are occupied wave vectors in the Brillouin zone, σ denote the spin, and AEñ | is the vacuum state. We choose the local orbitals |iσ〉 as the Wannier functions defined by where N is the total number of sites and R i are lattice vectors. |Ψ 0 〉 can be projected onto local orbitals as . |Γ〉 can also be regrouped as |Γ 1 , Γ 2 , L ,Γ N 〉, in which Γ i = 0, ↑ , ↓ or ↑ ↓ , denoting the occupation at site i. Assuming there are n d doubly-occupied sites in the configuration Γ, that is, The expectation value of an operatorÔ can be expressed as We first rewrite the denominator of equation (9) by grouping the configurations based on n d : where N e is the total number of electrons, and p  is the group of configurations containing p doubly-occupied sites. ) as an example, the number of configurations . The mean value μ p (|λ Γ | 2 ) is obtained by randomly sampling configurations containing p doubly-occupied sites. The numerator in equation (9) can be rewritten as In each simulation, we still sample configurations Γ with a fixed number of doubly-occupied sites p. For each sampled configuration, we rigorously evaluate the sum given in the parenthesis in equation (11), which usually contains only a few non-zero terms For instance, there is only one term in this sum ifÔ is a single hopping operator. The configurations G¢ are also grouped according to the number of doubly-occupied sites ¢ p , which can be different from p. Assuming ¢ p is in the range of (δ = 1 for a one-electron hopping operator, δ = 2 for a two-electron hopping operator, etc.), we then generate 2δ + 1 sub-groups, one for each possible ¢ p value, so that the Gutzwiller modifier + ¢ g p p remains constant for each sub-group. The quantity to be sampled for each subgroup is then l l , which shares the same structure as equation (10) and sub-grouping will be unnecessary.

Extension to antiferromagnetic (AFM) states
We also investigated the AFM state of the Hubbard model using the GWF. To generate the initial trial wavefunction, we seek an AFM solution to the mean-field HF Hamiltonian [39] å å = -+ á ñ + á ñ -á ñá ñ with m ≠ 0. m is called the staggered magnetization that serves as an order parameter describing the AFM ordering in the system [40]. For any finite U, equation (12) exists an AFM solution formed by single-particle wavefunctions with the dispersion relation where ò k is the dispersion relation for the non-interacting system. The AFM state is an insulator with the band gap characterized by the parameter Δ, which is related to m by Δ = Um/2. Δ and m are solved self-consistently. It is easy to verify that the Néel state | ↑ , ↓ , ↑ , ↓ , L 〉 with perfect AFM ordering (m = 1) is not an eigenstate of the exact Hamiltonian equation (1) for any U/t. Even in the large U limit, m is significantly reduced by quantum fluctuations (QF) causing imperfect AFM ordering. However, the QF is underestimated by the HF solution, which converges to the Néel state as U → ∞ . To promote the QF in the system especially at large U, we introduce a new variational parameter to control the deviation from the Néel state. The new parameter depends on the number (n f ) of 'frustrated' sites, defined as a single spin-down electron on the sublattice with even sites or a single-up electron on the sublattice with odd sites. Consequently, the Gutzwiller projector applied on a configuration |Γ〉 will is changed to Gñ = Gñ| . During the RS runs, the configurations are grouped according to the two parameters n d and n f : The equations (9)-(11) can be easily generalized to the case with multiple variational parameters.

VMC method
For comparison purpose, we also performed VMC calculations, which is summarized as follows. We define the weight of the configuration Γ in the GWF as Then, it is easy to show that equation (9) can be written as According to equation (14), if one can establish an ensemble of configurations with the probability of each configuration proportional to w Γ , then the expection value ofÔ is simply the ensemble average of the quantity Q. For diagonalÔ, Q Γ is reduced to the diagonal elements O ΓΓ . To evaluate the ensemble average of Q using the Monte Carlo method, we start with a random configuration. During each trial, a randomly selected electron is attempted to hop to a new site, also randomly selected from the pool of sites that are not currently occupied by an electron of the same spin. The acceptance probability is calculated according to = G¢ G ( w w min 1,  ), where Γ and G¢ are the configurations before and after the attempted move, respectively.

Results and discussions
In our calculations, we chose the number of sites of the periodic Hubbard chain N = 34, and the filling fraction n = N e /N = 1. The number of doubly-occupied sites 0 n d 17. A separate sampling was performed for each n d . With the data collected, the expectation value at arbitrary g can be calculated. 2 million randomly generated configurations were sampled in each run, which is sufficient to reach convergence [38].
An immediate check of the RS method is to see whether the normalization condition of |Ψ 0 〉 can be recovered from the sampled group weights p  . That is, . Indeed, the calculated value for å p p  is 0.993, with an error of less than 1%. The kinetic energy in the Hamiltonian (equation (1) figure 1, we show the double occupancy á ñ D i and the kinetic energy E kin as a function of g, calculated by both RS and VMC. The exact solution for the GWF obtained by a Feynman diagrammatic approach [19][20][21][22] is also presented for comparison. 2 million MC trials were performed for each VMC run, so that the computational costs for VMC and RS are comparably. It can be seen that both the double occupancy and the kinetic energy calculated by the RS method match excellently with the exact GWF solutions for the entire range of g values. On the other hand, the convergence of VMC deteriorates as g decreases. VMC even gives a nonphysical drop in E kin as g approaches 0. The poor convergence at small g values in VMC is caused by the low acceptance rate in MC trails [30,31,38]. This problem is fixable by introducing electron-exchange processes in addition to the electron hopping, and averaging over thousands of independently collected samples [30]. However, this results in significantly more algorithmic complexity and computational cost than the RS method.
The total energy E tot = E kin + E pot can also be expressed as a function of g for any specific U. By minimizing this function with respect to g, one obtains a variational solution for the corresponding U value. We show this process in figure 2 using U/t = 5 as an example. An energy minimum of E tot = − 0.404 can be found at g=0.38.
The total energy as a function of U can be obtained by repeating this process for various U values, and is plotted in figure 3. Also shown for comparison are the total energies of the exact GWF [20] and the initial PM HF wavefunctions. The PM HF energy is linear to U for the single-band Hubbard model, which does not provide an acceptable approximation even for moderate U. For the GWF, the results from the RS method generally agree well with the exact Gutzwiller solutions. However, the seemingly overlapping curves between RS and exact solutions in figure 1 still produce noticeable errors for large U values. This is because the tiny disagreements in the double occupancy for small g are magnified by a factor of U in E tot .

PM state with non-local constraint
One of the main deficiencies of the GWF is that the Gutzwiller projector is local in nature, and thus it does not fully describe the spatial correlations of the electron occupation at different sites. When the on-site repulsion is strong, NN pairing of an empty site and a doubly-occupied site is favored, because the empty site provides a hopping channel for the electrons to escape from double occupancy so that the repulsive energy can be reduced. Therefore, in addition to the original Gutzwiller projector as described above, we also considered the constraint to force every doubly-occupied site to be paired with a NN empty site [42]. This is straightforward to implement in RS by only constructing configurations satisfying this constraint. For a non-magnetic system with n = 1, the  total number of such configurations in the group with p doubly-occupied sites is reduced In figure 4, we show the total energy calculated by the RS method with and without the constraint. Also shown is the true ground-state energy for the 1D single-band Hubbard model obtained by Lieb and Wu based on the Bethe ansatz [43]. It can be seen that the GWF significantly overestimates the total energy for U/t > 3, which is well-documented in the literature [19,20,41,42]. However, with the constraint applied, the total energy of the GWF agrees extremely well with the true ground state for large U/t > 15. showing that the constraint captures the essences of physics in the strong repulsion limit. On the other hand, in the low U limit, the constraint overpromotes the correlation between doubly-occupied and empty sites, resulting in unreasonable total energies. Instead, the results without the constraint matches excellently with the exact values in this regime. In a recent work [44], we expressed the probability of site occupations as a weighted sum of the two limiting cases with and without the pairing constraint, which resulted in total energies and spin correlations that compare favorably with benchmark data for the entire range of U [44,45].

AFM states
We first applied the RS method to calculate the AFM states of the periodic 1D Hubbard chain containing N = 34 sites.
(the lattice constant is set to 1) in equation (13). The AFM HF wavefunctions with Δ/t=1, 2, 3, 4 and 5 were used as the initial states for constructing the GWFs. The corresponding values of U/t are 3.11, 4.79, 6.59, 8.46 and 10.38, respectively. As mentioned earlier, the configurations are grouped based on n d and n f values. The weight on each group, l = å GÎ G | | 2    , is calculated using the RS technique. The sum of these weights for U/t=4.79 as an example is 1.0001, well recovering the normalization condition of |Ψ HF 〉. The accuracy is further improved compared with the PM states for two reasons. Firstly, the number of groups is significantly increased due to the 2-parameter grouping. More importantly, due to the strong AFM ordering in |Ψ HF 〉, the disordered configurations, which are the most difficult to sample, make significantly less contributions. Indeed, in figure 5(a) we plot   for each n d and n f pair for |Ψ HF 〉, where one can see the weights are strongly localized in the lower left corner with relatively small n d and n f . After the Gutzwiller projector is applied, the total energy varies as a function of g 1 and g 2 , as shown in figure 6. A minimum can be located at g 1 =1.04 and g 2 =2.36, with the energy E = 0.472t, which is much reduced from the HF value of − 0.399t. In figure 5 (b), we plot the normalized weight for each group, å g g g g n n n n 1 2 1 2  , for the optimized GWF. Compared with figure 5 (a), the weights clearly spread to groups with larger n f , showing increased QF. We summarize the AFM results in figure 7. In figure 7 (a), it can be seen that the GWF effectively improves the total energy of the HF initial states for various U values. On the other hand, the GWF energy remains higher than the exact ground-state energy. This is consistent with the fact that the ground state of the 1D Hubbard model is PM for any U values. Figure 7 (b) shows that staggered magnetization is also significantly reduced by the GWF due to the increased QF. However, the double occupancy of GWF is similar to that of the AFM HF wavefunction, as shown in figure 7 (c). The results given in figure 7 (b) and (c) are also reflected in figure 7 (d), where one can see that the g 1 parameter controlling the double occupancy only slightly deviates from 1 for all the U values studied, while the g 2 parameter controlling the 'frustration' in the system shows a much stronger dependence on U, causing a significant reduction in m.
The above calculations can be extended to 2D square lattices containing N = L × L sites. Again, we seek an AFM solution by using the AFM HF wavefunction as the starting point. Similar to the 1D case, two variational parameters g 1 and g 2 were included in the calculations, controlling the number of doubly-occupied and frustrated sites, respectively. Only NN hopping is in the Hamiltonian, which leads to a non-interacting dispersion relation = -+ ( ) t k k 2 cos cos k x y  in equation (13). We considered the half-filling case for L=4, 6, 8, 10. U/t was fixed at 10, corresponding to Δ/t = 4.69. For the smallest size L = 4, we evaluated equation (9) exactly by enumerating all possible configurations Γ. Since the number of configurations scales exponentially as 4 L×L , the exact solution is infeasible for other values of L, and the RS method was used instead. The Gutzwiller parameters, total energy, staggered magnetization, and double occupancy for all the sizes are shown in table 1. As one can see, the calculated   quantities depend weakly on the system size, indicating that the sizes studied are sufficient to converge these quantities. The good convergence of the calculations with respect to the system size, together with the fact that the results for L = 4 are exact, demonstrate that the RS method is also highly accurate for the 2D system.

Conclusion
We report a random-sampling method for evaluating expectation values with respect to a variational manybody wavefunction. Based on proper grouping of configurations in the Hilbert space, group summations can be efficiently and accurately estimated by randomly sampling configurations in the group. We use this method to calculate the total energy for the single-band 1D Hubbard model at half filling, based on the Gutzwiller wavefunctions. The results compare favorably with those derived from analytic solutions. On the other hand, the variational Monte Carlo method suffers from a low acceptance-rate problem in the large U limit that is computationally challenging to solve. Furthermore, non-local constraints on real-space configurations can be conveniently implemented in the current scheme to further improve the variational wavefunction. For instance, we apply the constraint that forces nearest-neighbor paring of doubly-occupied and empty sites. The constraint leads to much improved ground-state energy with strong on-site repulsion. Moreover, we extend the method to treat problems containing more than one variational parameters, using the antiferromagnetic states of the 1D and 2D Hubbard models as examples. As a general method for evaluating partial sums, this method can find applications in various many-body problems, and help address the exponential expansion of the Hilbert space with respect to system size or number of bands. Table 1. Gutzwiller parameters (g 1 and g 2 ), total energy (E tot ), staggered magnetization (m), and double occupancy (d) for 2D Hubbard models with varying size (L).